Collaborative Humanoid Robot System: Design and Realization

The evolution of artificial intelligence, particularly embodied intelligence, has established humanoid robots as a premier hardware platform. Their programmable nature, anthropomorphic form, and strong execution capabilities make them ideal for interacting with human-centric environments and tools. This potential positions humanoid robotics to profoundly transform production methods and daily life, reshaping global industrial landscapes. However, the field remains in its nascent stages, grappling with significant technical challenges and high manufacturing costs, which currently limit widespread adoption.

Addressing these challenges requires practical exploration and innovation at the system integration level. This work presents the design and implementation of a collaborative system comprising two distinct humanoid robot platforms: a Race-Walking Robot and a Gymnastics Robot. This system was developed in response to the specific demands of competition environments, which serve as excellent testbeds for reliability and performance. The race-walking humanoid robot is engineered for autonomous navigation, obstacle avoidance, and task execution, while the gymnastics humanoid robot specializes in performing high-degree-of-freedom dynamic movements. Their coordination via Bluetooth communication forms a complete synergistic unit. Innovations across mechanical structure, electronic control systems, and functional algorithms are detailed herein, offering a reference framework for similar research endeavors in collaborative humanoid robot systems.

1. System Architecture and Design Philosophy

The core philosophy behind our collaborative system is modularity and specialization. The system is bifurcated into two primary agents: the Race-Walking Robot, responsible for environmental perception and path planning, and the Gymnastics Robot, dedicated to complex motion performance. This division of labor allows for optimized design in each domain. A unified communication layer, implemented using Bluetooth modules, enables command and status exchange, allowing the two robots to function as a cohesive team rather than isolated units. The overall system architecture is designed to be flexible, permitting independent testing and development of each agent while ensuring seamless integration for cooperative tasks.

2. Mechanical Structure Design and Optimization

The mechanical design of each humanoid robot was driven by its distinct operational requirements, balancing stability, agility, weight, and cost.

2.1 Race-Walking Humanoid Robot Design

The primary objective for this humanoid robot was to achieve stable, fast bipedal locomotion on a defined track while performing secondary tasks. Consequently, the design emphasized leg stability over complex upper-body articulation. The robot features a total of 13 degrees of freedom (DoF), strategically allocated.

The leg assembly is the most critical subsystem. Each foot is designed with a compound structure housing one pitch DoF (forward/backward tilt for stepping) and one roll DoF (left/right tilt for ground-contact adaptation), giving the pair 2 pitch and 2 roll DoF in total. This configuration lowers the center of gravity and significantly enhances stability during dynamic gait. The knees and hips provide the primary actuation for leg swinging. A key design choice was the use of high-torque, heavier servos (LX-22D) for all leg joints to ensure powerful and rapid stride execution. In contrast, the shoulders and head, which only require simple oscillatory motions for balance and aesthetics, are actuated by lightweight, low-cost micro servos (SG90). This differential servo selection optimizes performance while controlling mass and expense.

Mass distribution is crucial for a walking humanoid robot. The heavier battery pack is positioned low between the legs, and the main control board is mounted vertically in the chest cavity. This layout minimizes the pendulum effect of high-mounted masses, reducing unwanted torso pitch and yaw oscillations during walking, thereby improving overall gait stability. The kinematic configuration is summarized below:

| Body Section | Degrees of Freedom | Primary Function | Servo Type |
|---|---|---|---|
| Feet (pair) | 2 (Pitch) + 2 (Roll) | Step execution & ground stability | LX-22D |
| Knee (per leg) | 1 (Pitch) | Leg swing | LX-22D |
| Hip/Thigh (per leg) | 1 (Pitch) | Leg swing | LX-22D |
| Pelvis | 2 (Yaw) | Turning adjustment | LX-22D |
| Shoulder (per arm) | 1 (Swing) | Balance & posture | SG90 |
| Head | 1 (Pan) | Vision system orientation | SG90 |
| **Total** | **13 DoF** | | |

2.2 Gymnastics Humanoid Robot Design

This humanoid robot is designed for dynamic, acrobatic maneuvers such as bows, cartwheels, and forward/backward rolls, requiring high agility, strength-to-weight ratio, and dynamic balance. It features 15 DoF driven by digital servos (LX-16A) known for their programmability and holding torque.

The torso utilizes a PLA plastic framework with 3D-printed lattice shell panels, achieving a balance between structural rigidity and weight reduction. A significant innovation is the introduction of a dual-axis linkage at the waist-hip junction. This adds an active degree of freedom that allows for dynamic balance adjustment between the upper and lower body, drastically improving stability during flips and handstands.

The hand structure is optimized for the cartwheel motion. Mimicking a clenched fist, it provides a robust, high-strength contact point with a minimal mass footprint. For safety and system protection during high-dynamic motions, a sealed composite backshell is employed. It incorporates battery securing features and ventilation channels, offering impact buffering and stable thermal management for the electronic components within.

The distribution of degrees of freedom is as follows:

| Body Section | Degrees of Freedom | Primary Function |
|---|---|---|
| Leg (per leg) | 3 (Hip: 2, Knee: 1) | Locomotion, kicking, support |
| Arm (per arm) | 2 (Shoulder: 1, Elbow: 1) | Support, balance, gesture |
| Waist | 1 | Upper/lower body bend |
| Pelvis | 2 | Balance adjustment, leg lift |
| Shoulder Joint | 2 | Arm rotation & positioning |
| **Total** | **15 DoF** | |

3. Electronic Control System Design

A robust, integrated electronic control system is the nervous system of our collaborative humanoid robot platform. It was architected for real-time sensor processing, precise actuator control, and reliable inter-robot communication.

3.1 Hardware Architecture and Circuit Design

The system core is an STM32F103-series microcontroller, chosen for its high performance, rich peripheral set (UART, I2C, PWM), and real-time capabilities. The hardware architecture is modular, centered around several key subsystems:

1. Power Management Module: A 7.4V Li-Po battery feeds into a central board housing multiple DC-DC voltage regulators. These provide stable, isolated power rails: 5V for the main controller and logic, 6V for the high-torque servo motors, and 3.3V for sensitive sensors. The design incorporates over-current protection and low-voltage detection circuits to ensure operational safety and longevity.

2. Sensor Fusion & Processing Module: This module interfaces with all perception units. An OpenMV camera module provides visual data for navigation. A dedicated QR code scanner reads task instructions. An MPU6050 6-axis IMU (on the gymnastics robot) delivers inertial data for balance control. Circuit board layouts use standardized connectors, with signal paths incorporating capacitor filtering and ground shielding to minimize electromagnetic interference (EMI) from power circuits and servos.

3. Actuator Drive Module: This module generates the multi-channel PWM signals required to control up to 15 servos per robot. Power lines for the servos are kept physically separate from low-voltage signal lines on the PCB. Local decoupling capacitors are placed at each servo connector to suppress noise and ensure clean signal delivery, which is critical for precise and jitter-free motion.

4. Communication Module: Inter-robot coordination is achieved via HC-05 Bluetooth modules. Their placement on the PCB is strategically away from high-current servo drivers to prevent RF interference, ensuring stable communication at a 9600 baud rate.

The custom-designed main control board embodies these principles, featuring plug-and-play interfaces for all modules. This facilitates rapid maintenance, debugging, and future expansion (e.g., adding a voice module). A significant system-level optimization is software-controlled power gating for non-essential sensors and peripherals during idle states, reducing standby power consumption to below 0.5W.

| Module | Key Components | Primary Function | Design Optimization |
|---|---|---|---|
| Core Controller | STM32F103 MCU | Algorithm execution, system control | Rich peripheral utilization |
| Power Management | DC-DC regulators, protection ICs | Provide stable, protected voltage rails | Isolated rails, low-voltage cutoff |
| Vision & Sensing | OpenMV, MPU6050, QR scanner | Environmental perception, self-state awareness | Shielded signal paths, filtered inputs |
| Actuation | PWM drivers, servo connectors | Precise control of 13-15 servos | Power/signal isolation, local decoupling |
| Communication | HC-05 Bluetooth module | Inter-robot data exchange | Physical separation from noise sources |

3.2 Control Algorithms

The intelligence of each humanoid robot is encoded in its control algorithms, which run on the STM32 microcontroller.

3.2.1 Race-Walking Robot Control

This robot’s control stack focuses on vision-based navigation and reactive task execution. The core loop involves:

1. Path Following: The OpenMV camera captures the track (typically a line). An image processing algorithm returns the line’s angular deviation, $\theta$, from the robot’s heading. A proportional control law adjusts the hip yaw servos to minimize this error:
$$ \Delta \phi_{yaw} = K_p \cdot \theta $$
where $\Delta \phi_{yaw}$ is the corrective yaw angle for the pelvis joints and $K_p$ is the proportional gain. For straight segments ($|\theta| < \theta_{threshold}$), a symmetric gait pattern is executed.
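This proportional law can be sketched in a few lines; the gain and dead-band values below are illustrative placeholders, not the tuned competition parameters, and the function name is hypothetical:

```python
# Minimal sketch of the proportional steering law.
# KP and THETA_THRESHOLD are assumed illustrative values.
KP = 0.6                 # proportional gain K_p
THETA_THRESHOLD = 5.0    # straight-gait dead band, degrees

def pelvis_yaw_correction(theta_deg: float) -> float:
    """Return the corrective pelvis yaw angle for a measured line deviation."""
    if abs(theta_deg) < THETA_THRESHOLD:
        return 0.0       # symmetric straight gait, no correction needed
    return KP * theta_deg
```

In the firmware, the returned angle would be added to the nominal pelvis yaw servo targets on each control cycle.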

2. Obstacle Avoidance: When a red obstacle is detected within a defined bounding box, the robot stops. The control algorithm calculates a circumvention maneuver by generating a sequence of joint angles that rotate the robot away from and around the object before re-acquiring the main path line.

3. Task Execution: Upon reaching a QR code station, the scanner decodes the command (e.g., “TURN_LEFT”, “WAVE”). The microcontroller parses this command and executes a pre-programmed action script by calling the corresponding servo sequence function.

4. Coordination Trigger: Completion of a major task (e.g., one lap) triggers the transmission of a command via Bluetooth to the gymnastics robot, initiating a coordinated performance phase.

3.2.2 Gymnastics Robot Control

Control for this humanoid robot is centered on dynamic motion playback and balance maintenance.

1. Motion Sequence Playback: Complex routines like a cartwheel are decomposed into a timed series of keyframes. Each keyframe defines the target angle for every servo. The controller interpolates between these keyframes and outputs the corresponding PWM signals at a high frame rate (>50 Hz) to ensure smooth motion. A motion can be defined as a sequence:
$$ M = \{ (t_1, \Phi_1), (t_2, \Phi_2), \dots, (t_n, \Phi_n) \} $$
where $t_i$ is the timestamp and $\Phi_i$ is the vector of joint angles at that time.
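The interpolation between keyframes can be sketched as follows; this is a minimal pure-Python illustration assuming linear interpolation and the $(t_i, \Phi_i)$ representation above, with hypothetical function and variable names:

```python
# Sketch of keyframe playback: linearly interpolate the joint-angle
# vector of a motion M = [(t_i, angles_i), ...] at an arbitrary time t.
def interpolate_pose(motion, t):
    """Return the interpolated joint-angle vector of `motion` at time t."""
    times = [kf[0] for kf in motion]
    if t <= times[0]:
        return list(motion[0][1])     # clamp before the first keyframe
    if t >= times[-1]:
        return list(motion[-1][1])    # clamp after the last keyframe
    # find the bracketing keyframe pair and blend linearly
    for (t0, pose0), (t1, pose1) in zip(motion, motion[1:]):
        if t0 <= t <= t1:
            u = (t - t0) / (t1 - t0)
            return [a0 + u * (a1 - a0) for a0, a1 in zip(pose0, pose1)]

# A 50 Hz playback loop would call interpolate_pose(M, t) every 20 ms
# and convert each angle to the corresponding servo PWM pulse width.
```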

2. Dynamic Balance Adjustment: The MPU6050 IMU provides real-time accelerometer and gyroscope data. A Kalman filter is implemented on the microcontroller to fuse this noisy data into reliable estimates of torso orientation ($\alpha, \beta, \gamma$) and angular velocity.
(Time Update)
$$ \hat{x}_k^- = A \hat{x}_{k-1} + B u_k $$
$$ P_k^- = A P_{k-1} A^T + Q $$
(Measurement Update)
$$ K_k = P_k^- H^T (H P_k^- H^T + R)^{-1} $$
$$ \hat{x}_k = \hat{x}_k^- + K_k (z_k - H \hat{x}_k^-) $$
$$ P_k = (I - K_k H) P_k^- $$
Here, $\hat{x}_k$ is the state estimate (orientation, velocity), $P_k$ is the error covariance, $K_k$ is the Kalman gain, $z_k$ is the IMU measurement, and $Q$ and $R$ are process and measurement noise covariances. If the estimated pitch or roll exceeds a safe threshold during a motion, the controller dynamically adjusts the waist and ankle servo angles to counteract the imbalance.
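A scalar version of this filter for a single tilt axis can be sketched as below, assuming the gyro rate drives the prediction ($u_k$) and the accelerometer-derived angle is the measurement ($z_k$); the noise covariances here are illustrative, not the tuned values:

```python
# Minimal scalar Kalman filter for one tilt axis, mirroring the update
# equations above with A = 1, B = dt, H = 1.
# q and r are assumed illustrative noise covariances.
class TiltKalman:
    def __init__(self, q=0.01, r=0.5):
        self.x = 0.0   # angle estimate (degrees)
        self.p = 1.0   # estimate error covariance
        self.q = q     # process noise covariance Q
        self.r = r     # measurement noise covariance R

    def update(self, gyro_rate, accel_angle, dt):
        # Time update: integrate the gyro rate
        self.x += gyro_rate * dt
        self.p += self.q
        # Measurement update: blend in the accelerometer angle
        k = self.p / (self.p + self.r)
        self.x += k * (accel_angle - self.x)
        self.p *= (1.0 - k)
        return self.x
```

Run at the control-loop rate, the estimate tracks the gyro over short horizons while the accelerometer correction suppresses drift.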

3. Bluetooth Response: The robot remains in a listening mode. Upon receiving a valid command (e.g., “PERFORM_ROUTINE_3”) from the race-walking robot, the controller immediately loads the corresponding motion sequence $M$ from its library and begins execution, providing visual feedback via an onboard LED.

4. Functional Implementation and Vision Algorithm

The autonomous capability of the race-walking humanoid robot hinges on its robust machine vision system. Implemented on the OpenMV platform, the algorithm must handle variable lighting and potential path discontinuities.

The core of the navigation algorithm is a hybrid strategy combining grayscale image processing with linear regression. The process flow is as follows:

1. Region of Interest (ROI) Selection: To reduce computational load, the algorithm does not process the full image. Instead, it defines two dynamic search windows (ROIs) on the left and right sides of the image’s lower half, focusing on areas where the path edges are most likely to appear. This cuts processing area by over 70%.

2. Edge Detection & Blob Analysis: Within each ROI, the image is converted to grayscale and a binary threshold is applied. Contours or blobs corresponding to the track line are identified. A size filter removes small noise pixels.

3. Robust Line Fitting with RANSAC: The coordinates of the identified line pixels are used to fit a line. Standard least-squares fitting is sensitive to outliers (e.g., specks of dirt mistakenly thresholded). Therefore, we employ the RANSAC (Random Sample Consensus) algorithm:
a. Randomly select a minimal sample of 2 points from the set.
b. Fit a line model to these points.
c. Determine how many other points in the set are within a distance tolerance $d_{tol}$ of this line (these are “inliers”).
d. Repeat for $N$ iterations.
e. Choose the model with the largest number of inliers.
f. Re-fit the line using all identified inliers for a final, accurate model.
This process yields a line equation highly resistant to visual noise.
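Steps a through e can be sketched in pure Python as follows; the iteration count and tolerance are illustrative parameters, and the final least-squares refit (step f) is noted but omitted for brevity:

```python
import random

# Sketch of the RANSAC line fit described in steps a-e.
# n_iter and d_tol are assumed illustrative parameters.
def ransac_line(points, n_iter=100, d_tol=2.0):
    """Fit a line a*x + b*y + c = 0 to `points`, robust to outliers.
    Returns (model, inliers) with the unit-normal model (a, b, c)."""
    best_model, best_inliers = None, []
    for _ in range(n_iter):
        (x1, y1), (x2, y2) = random.sample(points, 2)   # step a
        a, b = y2 - y1, x1 - x2          # normal of the line through the pair
        norm = (a * a + b * b) ** 0.5
        if norm == 0:
            continue                     # degenerate sample, skip
        a, b = a / norm, b / norm        # step b: line model
        c = -(a * x1 + b * y1)
        # step c: count points within d_tol of the line
        inliers = [(x, y) for x, y in points if abs(a * x + b * y + c) < d_tol]
        if len(inliers) > len(best_inliers):             # step e
            best_model, best_inliers = (a, b, c), inliers
    # Step f (omitted): least-squares refit over best_inliers.
    return best_model, best_inliers
```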

4. Path Angle Calculation and Decision Logic: The fitted line’s angle $\theta$ relative to the vertical image axis is calculated. The control logic is:
- If $|\theta| < 30^\circ$: execute the straight-walking gait.
- If $30^\circ \leq \theta < 90^\circ$: the line is curving right; adjust pelvis yaw and step timing for a right turn.
- If $-90^\circ < \theta \leq -30^\circ$: the line is curving left; adjust for a left turn.
- If no valid line is found in either ROI: execute a search pattern (small rotations) until the line is re-acquired.
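The decision thresholds above reduce to a simple mapping; the action labels here are illustrative, not the firmware's actual command set:

```python
# Sketch of the step-4 decision logic. Action names are assumed labels.
def steering_decision(theta_deg):
    """Map the fitted line angle (degrees from vertical) to a gait action.
    theta_deg is None when no valid line was found in either ROI."""
    if theta_deg is None:
        return "SEARCH"
    if abs(theta_deg) < 30.0:
        return "STRAIGHT"
    if 30.0 <= theta_deg < 90.0:
        return "TURN_RIGHT"
    if -90.0 < theta_deg <= -30.0:
        return "TURN_LEFT"
    return "SEARCH"   # out-of-range angle: treat the line as lost
```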

The entire vision-control loop runs at a frequency sufficient to allow the humanoid robot to navigate smoothly at its target speed without oscillating. The table below summarizes the key steps and their purpose:

| Step | Algorithm/Technique | Purpose |
|---|---|---|
| 1. Image Acquisition & ROI | Dynamic window selection | Reduce data, focus on relevant area |
| 2. Feature Extraction | Grayscale thresholding, blob analysis | Isolate track-line pixels from background |
| 3. Model Fitting | RANSAC linear regression | Robustly identify line amidst noise |
| 4. Control Parameter Generation | Angle ($\theta$) calculation | Translate visual data into steering command |
| 5. Actuation | Proportional control on joint angles | Execute physical turn or straight walk |

5. System Integration and Collaborative Workflow

The true innovation lies in the orchestration of the two specialized humanoid robot platforms. The collaborative workflow follows a state-based protocol:

Phase 1: Independent Operation. The race-walking robot autonomously completes its navigation and task circuit. The gymnastics robot remains in a ready state.

Phase 2: Task Completion & Signal. Upon finishing its final designated task (e.g., crossing the finish line), the race-walking robot’s controller sends a specific Bluetooth packet, e.g., {"cmd": "start_performance", "routine": 2}.

Phase 3: Synchronized Performance. The gymnastics robot receives the command, acknowledges it with an LED blink, and immediately begins executing the specified routine (e.g., a sequence of flips and bows). The race-walking robot may assume a static “observer” pose or perform simple synchronized movements (e.g., head turns, arm waves) to enhance the collaborative spectacle.
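The Phase 2 packet and the performer's parsing of it can be sketched as below; the routine library contents and helper names are illustrative assumptions, with only the packet format taken from the protocol above:

```python
import json

# Sketch of the Phase 2/3 handshake. ROUTINE_LIBRARY entries and
# function names are assumed for illustration.
ROUTINE_LIBRARY = {2: "cartwheel_sequence", 3: "flip_and_bow_sequence"}

def make_trigger_packet(routine_id):
    """Race-walking robot side: serialize the JSON command packet."""
    return json.dumps({"cmd": "start_performance", "routine": routine_id})

def handle_packet(raw):
    """Gymnastics robot side: parse a packet and return the routine to run,
    or None if the command is unknown or malformed."""
    try:
        msg = json.loads(raw)
    except ValueError:
        return None          # garbled or partial Bluetooth frame
    if msg.get("cmd") != "start_performance":
        return None
    return ROUTINE_LIBRARY.get(msg.get("routine"))
```

Rejecting malformed frames silently keeps the performer in its listening state, which matters over a noisy 9600-baud Bluetooth link.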

This demonstrates a practical master-slave collaboration where one intelligent agent (the navigator) triggers a complex behavioral response in another (the performer), showcasing multi-agent coordination potential for humanoid robot teams in scenarios like cooperative fetch-and-present or staged human-robot interaction.

6. Conclusion and Future Prospects

This work has presented the comprehensive design and successful implementation of a collaborative humanoid robot system. Key contributions include the specialized mechanical design of two distinct humanoid robot types: one optimized for stable, autonomous locomotion and the other for dynamic, high-degree-of-freedom motion. The integrated electronic control system, featuring custom PCB design with low-power and anti-interference optimizations, provides a robust hardware foundation. Advanced algorithms for vision-based navigation using RANSAC and dynamic balance control using a Kalman filter were implemented effectively on embedded platforms.

The system’s efficacy was validated in a competitive environment, where it demonstrated reliable performance, earning national recognition. This project serves as a concrete case study and a viable technical blueprint for developing collaborative humanoid robot systems. Future work will focus on enhancing the level of collaboration, moving beyond simple triggering to real-time, closed-loop interaction. This could involve shared state estimation (e.g., the gymnastics robot using the race-walker’s camera feed for positioning), more complex communication protocols, and the development of adaptive behaviors where both robots react to a shared environment dynamically. The path forward for practical humanoid robot applications undoubtedly lies in such multi-agent, synergistic systems.
