The field of robotics, since its inception in the mid-20th century, has evolved to become integral to numerous aspects of modern life. Among the various robotic forms, the humanoid robot stands out due to its anthropomorphic structure, which fosters higher social acceptance and enables interaction within environments designed for humans. The development of a humanoid robot is a multidisciplinary challenge, encompassing mechanical engineering, electronics, computer science, sensor technology, and advanced control algorithms. It represents a significant benchmark for a nation’s technological prowess. However, the increasing complexity of robotic software has historically led to issues with reusability and interoperability. The introduction of the Robot Operating System (ROS), an open-source, meta-operating system featuring a distributed, peer-to-peer architecture, has provided a robust framework to address these challenges. This article details the comprehensive design, modeling, and simulation of a 20-Degree-of-Freedom (DOF) humanoid robot within the ROS ecosystem.

The overarching goal is to establish a complete software-in-the-loop simulation pipeline, providing a validated platform for future research into complex motion planning, dynamic control, and environment interaction for the humanoid robot.
1. Hardware Architecture of the Humanoid Robot
The hardware system is designed with a dual-controller architecture to effectively separate high-level cognitive computations from low-level real-time control tasks. This segregation ensures both computational power for perception/planning and deterministic performance for actuator control.
| Subsystem | Component | Key Specifications / Model | Primary Function |
|---|---|---|---|
| High-Level Compute System | Main Controller | PICO-HC101 SBC (Intel Atom/Celeron Processor) | Runs Ubuntu/ROS; handles vision, motion planning, and overall behavior control. |
| | Vision Sensor | USB Camera | Provides visual input for perception modules. |
| Low-Level Control System | Sub-Controller | STM32F103RET6 Microcontroller | Manages real-time communication with actuators and sensors. |
| | Actuators | MX-28T Dynamixel Servos (x20) | Provide torque and position control for all 20 joints. |
| | Inertial Measurement Unit (IMU) | Integrated Gyroscope & Accelerometer | Provides orientation and acceleration data for balance. |
| | Peripherals | LEDs, Buttons, Buzzer, Microphones | Enable basic interaction, status indication, and audio input. |
| Connectivity | Communication Bridges | USB-to-Serial, UART-to-TTL, I2C/SPI circuits | Facilitate data exchange between the main controller, sub-controller, servos, and IMU. |
The kinematic structure of the humanoid robot allocates 20 degrees of freedom (DOF) across its body: 2 DOF for head pan and tilt, 3 DOF for each arm (shoulder roll/pitch, elbow pitch), and 6 DOF for each leg (hip roll/pitch/yaw, knee pitch, ankle roll/pitch). The joint angle limits are defined considering both human anatomical ranges and mechanical constraints to prevent self-collision, as summarized later in the modeling section.
2. ROS-Based Software Framework for the Humanoid Robot
The software architecture is constructed on ROS Kinetic, running on Ubuntu 16.04 LTS installed on the main controller. ROS’s node-based, message-passing paradigm is leveraged to create a modular and scalable framework for the humanoid robot. The architecture is layered as follows:
- Hardware Layer: Comprises the physical components listed in Table 1.
- Operating System Layer: Ubuntu 16.04 LTS with the ROS Kinetic middleware.
- Middleware Layer: Includes ROS client libraries (roscpp, rospy), communication protocols (TCPROS/UDPROS), and critical packages like `ros_control` for controller interfacing.
- Application Layer: Consists of functional nodes grouped into modules:
  - Behavior Control Module: The high-level decision maker, integrating information from other modules to execute tasks.
  - Robot Control Module: Contains the critical hardware interface nodes. One node manages communication with the real STM32 controller, while a virtual interface node is designed for simulation, subscribing to joint trajectory commands and publishing joint states.
  - Vision Module: A collection of nodes for video capture, object detection/tracking, and depth estimation using libraries like OpenCV.
  - Motion Module: Hosts core nodes for gait generation, head tracking, predefined action playback, and fall detection.
All modules register with the ROS Master and communicate asynchronously via topics and services, creating a flexible and decoupled system for the humanoid robot.
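ROS itself supplies this topic machinery, but the decoupling it provides can be illustrated with a toy in-process publish/subscribe bus. This is a simplified sketch of the pattern only, not the ROS API; the topic name and message are illustrative:

```python
from collections import defaultdict

class TopicBus:
    """Toy publish/subscribe hub illustrating how topics decouple nodes."""
    def __init__(self):
        self._subscribers = defaultdict(list)  # topic name -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Publishers never reference subscribers directly: both sides
        # know only the topic name, as with ROS topics.
        for callback in self._subscribers[topic]:
            callback(message)

# Example: a motion node publishes a joint command; a control node consumes it.
bus = TopicBus()
received = []
bus.subscribe("/joint_command", received.append)
bus.publish("/joint_command", {"head_pan": 0.5})
print(received)  # [{'head_pan': 0.5}]
```

Adding a second subscriber (e.g., a logging node) requires no change to the publisher, which is the property that keeps the modules above independent.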
3. Kinematic and Dynamic Modeling
Accurate modeling is fundamental for simulation and control. The kinematic chain of the humanoid robot is described using the standard Denavit-Hartenberg (D-H) convention. For a revolute joint $i$, the D-H parameters are: link length $a_i$, link twist $\alpha_i$, link offset $d_i$, and joint angle $\theta_i$. The homogeneous transformation matrix from frame $\{i-1\}$ to frame $\{i\}$ is given by:
$$
{}^{i-1}T_i = \operatorname{Rot}_{z}(\theta_i) \cdot \operatorname{Trans}_{z}(d_i) \cdot \operatorname{Trans}_{x}(a_i) \cdot \operatorname{Rot}_{x}(\alpha_i)
$$
$$
{}^{i-1}T_i = \begin{bmatrix}
\cos\theta_i & -\sin\theta_i\cos\alpha_i & \sin\theta_i\sin\alpha_i & a_i\cos\theta_i \\
\sin\theta_i & \cos\theta_i\cos\alpha_i & -\cos\theta_i\sin\alpha_i & a_i\sin\theta_i \\
0 & \sin\alpha_i & \cos\alpha_i & d_i \\
0 & 0 & 0 & 1
\end{bmatrix}
$$
The complete forward kinematics for a limb, e.g., the left leg, is found by chaining these transformations from the base (pelvis) to the end-effector (foot):
$$
{}^{Pelvis}T_{Foot} = {}^{Pelvis}T_{Hip} \cdot {}^{Hip}T_{Knee} \cdot {}^{Knee}T_{Ankle} \cdot {}^{Ankle}T_{Foot}
$$
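The standard D-H transform and its chaining can be sketched in Python with NumPy. The link parameters below are illustrative placeholders, not the robot's measured dimensions:

```python
import numpy as np

def dh_transform(theta, d, a, alpha):
    """Standard D-H transform Rot_z(theta)*Trans_z(d)*Trans_x(a)*Rot_x(alpha)."""
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([
        [ct, -st * ca,  st * sa, a * ct],
        [st,  ct * ca, -ct * sa, a * st],
        [0.0,      sa,       ca,      d],
        [0.0,     0.0,      0.0,    1.0],
    ])

def forward_kinematics(dh_rows):
    """Chain per-joint transforms base-to-end, as in the pelvis-to-foot product."""
    T = np.eye(4)
    for theta, d, a, alpha in dh_rows:
        T = T @ dh_transform(theta, d, a, alpha)
    return T

# Sanity check on a two-link planar chain (zero twists and offsets): the
# end-effector position reduces to the familiar planar expressions
# x = a1*cos(q1) + a2*cos(q1+q2), y = a1*sin(q1) + a2*sin(q1+q2).
q1, q2, a1, a2 = 0.3, 0.4, 0.22, 0.18
T = forward_kinematics([(q1, 0.0, a1, 0.0), (q2, 0.0, a2, 0.0)])
print(T[:3, 3])
```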
Inverse kinematics is solved for each limb using analytical or numerical methods to compute joint angles $(\theta_1, \theta_2, …, \theta_n)$ for a desired end-effector pose $^{Base}T_{End}$.
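The analytical route can be illustrated on a planar two-link chain (an elbow-like sub-chain, not the robot's full 3-D limb geometry; the link lengths are placeholders). The law-of-cosines structure is the same one used when solving a limb limb-by-limb:

```python
import math

def two_link_ik(x, y, a1, a2, elbow_up=True):
    """Closed-form IK for a planar two-link chain: returns (q1, q2) reaching (x, y)."""
    r2 = x * x + y * y
    # Law of cosines gives the elbow angle; |c2| > 1 means the target is unreachable.
    c2 = (r2 - a1 * a1 - a2 * a2) / (2.0 * a1 * a2)
    if not -1.0 <= c2 <= 1.0:
        raise ValueError("target out of reach")
    q2 = math.acos(c2) * (1.0 if elbow_up else -1.0)
    # Shoulder angle: direction to target minus the offset caused by the bent elbow.
    q1 = math.atan2(y, x) - math.atan2(a2 * math.sin(q2), a1 + a2 * math.cos(q2))
    return q1, q2

# Round-trip check against the forward kinematics of the same chain:
a1, a2 = 0.22, 0.18
q1, q2 = two_link_ik(0.25, 0.10, a1, a2)
x = a1 * math.cos(q1) + a2 * math.cos(q1 + q2)
y = a1 * math.sin(q1) + a2 * math.sin(q1 + q2)
print(round(x, 6), round(y, 6))  # recovers the target 0.25 0.1
```

The `elbow_up` flag selects between the two valid solution branches, mirroring the configuration choices that must be made for each limb of the humanoid.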
For dynamic simulation, the Lagrangian formulation is employed. The Lagrangian $L$ is defined as the difference between kinetic energy $K$ and potential energy $P$ of the system:
$$
L(q, \dot{q}) = K(q, \dot{q}) - P(q)
$$
The equations of motion are then derived using:
$$
\frac{d}{dt} \left( \frac{\partial L}{\partial \dot{q}_i} \right) - \frac{\partial L}{\partial q_i} = \tau_i
$$
where $q_i$ and $\dot{q}_i$ are the generalized coordinate (joint angle) and velocity for joint $i$, and $\tau_i$ is the generalized force (torque) acting on it. This yields the standard form:
$$
M(q)\ddot{q} + C(q, \dot{q})\dot{q} + G(q) = \tau
$$
where $M(q)$ is the mass-inertia matrix, $C(q, \dot{q})\dot{q}$ represents Coriolis and centrifugal forces, and $G(q)$ is the gravity vector. These dynamics are crucial for the Gazebo physics engine to simulate realistic motion of the humanoid robot.
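A one-DOF example makes the terms concrete. For a point-mass pendulum (mass $m$ at distance $l$ from the joint), $M = ml^2$, $C = 0$, and $G = mgl\sin q$. The numeric values below are illustrative only; for the full 20-DOF robot these quantities are matrices and vectors that Gazebo assembles from the URDF inertial properties:

```python
import math

def pendulum_dynamics(q, dq, m=0.5, l=0.3, g=9.81):
    """Terms of M(q)*ddq + C(q,dq)*dq + G(q) = tau for a 1-DOF point-mass pendulum."""
    M = m * l * l                 # mass-inertia "matrix" (a scalar for one DOF)
    C = 0.0                       # no Coriolis/centrifugal coupling with one joint
    G = m * g * l * math.sin(q)   # gravity torque about the joint
    return M, C, G

def required_torque(q, dq, ddq):
    M, C, G = pendulum_dynamics(q, dq)
    return M * ddq + C * dq + G

# Holding the pendulum horizontal (ddq = 0) requires exactly the gravity torque:
print(required_torque(math.pi / 2, 0.0, 0.0))  # m*g*l = 0.5 * 9.81 * 0.3
```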
4. URDF Model Construction in ROS
The Unified Robot Description Format (URDF) is an XML format used in ROS to define a robot’s physical properties. The model for the humanoid robot was constructed by defining `<link>` elements for each rigid body segment and `<joint>` elements connecting them. The visual and collision geometries for links were imported from STL mesh files of the DARwIn-OP2 robot shell, providing a realistic appearance. Key properties defined for each joint include:
- Type: revolute.
- Axis of rotation: Defined by the `<axis>` tag.
- Limits: Effort, velocity, and most importantly, the angle bounds based on Table 2.
- Origin: The transform from the parent link frame to the child link frame.
| Body Part | Joint Name | Servo ID | Angle Range (Degrees) | Corresponding Human Motion |
|---|---|---|---|---|
| Head | Head Yaw | 19 | -90 to +90 | Turning head side-to-side. |
| | Head Pitch | 20 | -30 to +60 | Nodding up and down. |
| Left Arm | Shoulder Roll | 4 | -30 to +100 | Arm lifting sideways. |
| | Shoulder Pitch | 2 | -250 to +250 | Arm swinging forward/backward. |
| | Elbow Pitch | 6 | -160 to 0 | Bending the elbow. |
| Left Leg | Hip Yaw | 8 | -45 to +30 | Twisting the leg inward/outward. |
| | Hip Roll | 10 | 0 to +60 | Lifting leg sideways. |
| | Hip Pitch | 12 | -100 to +30 | Swinging leg forward/backward. |
| | Knee Pitch | 14 | 0 to +130 | Bending the knee. |
| | Ankle Pitch | 16 | -30 to +30 | Pointing foot up/down. |
| | Ankle Roll | 18 | -30 to +45 | Tilting foot inward/outward. |
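A minimal sketch of one such link/joint pair is shown below. The link names, mesh path, origin, and effort/velocity values are illustrative placeholders; only the Head Yaw range of -90 to +90 degrees (converted to radians) is taken from Table 2:

```xml
<link name="head_link">
  <visual>
    <geometry>
      <!-- Hypothetical package and mesh path for illustration -->
      <mesh filename="package://humanoid_description/meshes/head.stl"/>
    </geometry>
  </visual>
</link>

<joint name="head_yaw_joint" type="revolute">
  <parent link="torso_link"/>
  <child link="head_link"/>
  <origin xyz="0 0 0.05" rpy="0 0 0"/>
  <axis xyz="0 0 1"/>
  <!-- -90..+90 degrees expressed in radians; effort/velocity are placeholders -->
  <limit lower="-1.5708" upper="1.5708" effort="2.5" velocity="5.0"/>
</joint>
```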
The URDF tree structure was validated using the `check_urdf` tool and visualized in Rviz, confirming the correct assembly and kinematic chain of the humanoid robot model.
5. Gazebo Simulation Environment and Controller Configuration
To simulate the physics of the humanoid robot, the model was extended for use in the Gazebo simulator. This involved adding several crucial tags to the URDF:
- `<inertial>`: Mass and inertia tensor for each link, calculated from geometry and material density.
- `<collision>`: Simplified geometry (often a box or cylinder) for efficient physics collision checking.
- `<gazebo>`: Plugin declarations and material properties for rendering.
- `<transmission>`: Defines the relationship between an actuator and a joint. For a revolute joint using position control:
```xml
<transmission name="<joint_name>_trans">
  <type>transmission_interface/SimpleTransmission</type>
  <joint name="<joint_name>">
    <hardwareInterface>hardware_interface/PositionJointInterface</hardwareInterface>
  </joint>
  <actuator name="<joint_name>_motor">
    <hardwareInterface>hardware_interface/PositionJointInterface</hardwareInterface>
    <mechanicalReduction>1</mechanicalReduction>
  </actuator>
</transmission>
```
- Gazebo ROS Control Plugin: Launches the `ros_control` infrastructure inside Gazebo.
```xml
<gazebo>
  <plugin name="gazebo_ros_control" filename="libgazebo_ros_control.so">
    <robotNamespace>/humanoid_robot</robotNamespace>
  </plugin>
</gazebo>
```
A YAML configuration file defines the controllers. For instance, to control the left arm joint group (note that `state_publish_rate` sits at the controller level, not under `constraints`):

```yaml
left_arm_controller:
  type: "position_controllers/JointTrajectoryController"
  joints:
    - left_shoulder_roll_joint
    - left_shoulder_pitch_joint
    - left_elbow_pitch_joint
  constraints:
    goal_time: 0.6
  state_publish_rate: 50
```
This controller is spawned via a launch file, bridging the simulated joints in Gazebo to the ROS `control_msgs/FollowJointTrajectory` action interface.
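The spawning step can look like the following launch-file fragment. The package name, file path, namespace, and the additional `joint_state_controller` are assumptions for illustration; `controller_manager`'s `spawner` is the standard ros_control tool:

```xml
<launch>
  <!-- Load the YAML controller definitions onto the parameter server
       (package and file names are hypothetical) -->
  <rosparam file="$(find humanoid_control)/config/controllers.yaml"
            command="load" ns="/humanoid_robot"/>

  <!-- Spawn the trajectory controller, plus a joint_state publisher -->
  <node name="controller_spawner" pkg="controller_manager" type="spawner"
        respawn="false" output="screen" ns="/humanoid_robot"
        args="joint_state_controller left_arm_controller"/>
</launch>
```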
6. Motion Planning and Simulation with MoveIt!
MoveIt! is configured for the humanoid robot using the Setup Assistant. Planning groups are defined for the left arm and left leg. The configuration generates key files including the Semantic Robot Description Format (SRDF), which defines the planning groups, default poses, and self-collision matrix. The kinematics of each group are solved using the KDL (Kinematics and Dynamics Library) solver by default.
The integration pipeline for MoveIt! and Gazebo works as follows:
- The user sets a goal pose for the robot’s end-effector (e.g., left hand) in the MoveIt! Rviz plugin.
- MoveIt!’s planning pipeline (OMPL) computes a collision-free joint trajectory.
- This trajectory is sent as an action goal to the `FollowJointTrajectory` action server hosted by the `ros_control` controller manager.
- The active Joint Trajectory Controller (`left_arm_controller`) receives the trajectory and calculates interpolated setpoints at a high rate.
- The `ros_control` framework passes these setpoints to the simulated `PositionJointInterface` in Gazebo.
- Gazebo’s physics engine applies forces to move the joints, and the new joint states are fed back to `/joint_states`.
- Rviz and MoveIt! subscribe to `/joint_states`, updating the visualization to match Gazebo.
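The setpoint interpolation performed by the trajectory controller in the pipeline above can be sketched with simple linear interpolation between waypoints. The real `joint_trajectory_controller` uses spline interpolation; this simplified stand-in, with made-up waypoint values for the three left-arm joints, only illustrates the idea:

```python
def interpolate_setpoint(waypoints, t):
    """Linearly interpolate joint positions at time t.

    waypoints: list of (time, [joint positions]) pairs, sorted by time.
    Before the first / after the last waypoint, the boundary positions hold.
    """
    if t <= waypoints[0][0]:
        return list(waypoints[0][1])
    if t >= waypoints[-1][0]:
        return list(waypoints[-1][1])
    for (t0, p0), (t1, p1) in zip(waypoints, waypoints[1:]):
        if t0 <= t <= t1:
            s = (t - t0) / (t1 - t0)  # normalized progress within the segment
            return [a + s * (b - a) for a, b in zip(p0, p1)]

# Two waypoints for three joints; query the setpoint halfway through.
traj = [(0.0, [0.0, 0.0, 0.0]), (1.0, [0.4, -0.2, 0.6])]
print(interpolate_setpoint(traj, 0.5))
```

The controller performs this computation at its update rate (e.g., hundreds of times per second), which is why the simulated joints move smoothly between the sparse waypoints MoveIt! produces.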
This closed-loop simulation allows for testing motion plans in a physically realistic environment. Successful simulations were performed for both the left arm (reaching to various points in space) and the left leg (performing lifting and stepping motions), validating the kinematic model, controller configuration, and the feasibility of basic motion control for the humanoid robot.
7. Conclusion and Future Work
This work has presented a holistic approach to developing a humanoid robot system within the ROS framework. Starting from a dual-controller hardware design, a modular software architecture was established. A detailed kinematic and dynamic model was described, which was then implemented as a URDF model utilizing an existing anthropomorphic shell. The integration of this model into the Gazebo simulator, coupled with the configuration of `ros_control` and MoveIt!, created a powerful simulation and validation platform. The successful execution of joint trajectories for the arm and leg in the Gazebo environment demonstrates the operational validity of the joint models, the controller interfaces, and the overall software framework.
The established pipeline forms a critical foundation for subsequent advanced research on the humanoid robot. Immediate future work includes:
- Implementing a full-body dynamic gait planner and balance controller.
- Integrating the vision module for object manipulation tasks in simulation.
- Porting the validated controllers and motion modules to the physical hardware.
- Exploring whole-body coordination and multi-limb manipulation planning using MoveIt!.
The use of ROS throughout this project underscores its efficacy in managing the complexity inherent in a sophisticated humanoid robot system, significantly accelerating the development cycle from modeling to simulation.
