In a significant leap for the robotics industry, humanoid robots are rapidly advancing toward mass production, underscored by recent landmark orders. Ubtech recently announced a record-breaking contract worth 250 million yuan, setting a new global benchmark for single-order value in humanoid robots. Earlier, companies like Zhiyuan Robotics and Yushu Technology secured a 124 million yuan project from China Mobile for humanoid bipedal robots. These developments signal a burgeoning era for humanoid robots, driven by the convergence of artificial intelligence and physical automation.

The rise of generative AI has ignited a new technological revolution, with embodied intelligence emerging as one of the most promising directions for AI implementation. Throughout this year, NVIDIA founder Jensen Huang has repeatedly emphasized that “the next wave is physical AI,” referring to the era of robotics. This vision aligns with projections that the market for humanoid robots and embodied intelligence systems could reach staggering scales. According to Liu Mingrui, Partner at EY Strategy and Transaction Consulting, the embodied intelligent robot market might see annual shipments of 10 billion units by 2040, with a total market value of $20 trillion. Even if only 10% of this forecast is realized, amounting to $2 trillion, it would far surpass the smart car market.
However, beneath this wave of optimism lies a critical obstacle: data. Embodied intelligence requires high-dimensional, continuous, and dynamic scene data, but real-world data collection is exorbitantly expensive, and simulated data struggles to bridge the gap between virtual and physical realities. The current data volume falls drastically short of what is needed for mature applications, creating a significant bottleneck that could slow the progress of humanoid robots. As the industry grapples with this challenge, strategies are being devised to accumulate data at scale, drive algorithmic iterations, and accelerate the closure of task loops in embodied intelligence systems.
The Role of Generative AI in Enhancing Versatility
The breakthroughs in embodied intelligence this year are largely attributable to the rapid development of generative AI. For a long time, the robotics industry faced an “impossible triangle” in which accuracy, execution speed, and generality could not be achieved simultaneously. Traditional specialized robots, such as those used in surgery, autonomous driving, and quality inspection, excel in specific scenarios with high speed and precision but rely heavily on pre-set systems, making them ill-suited for dynamic environments. In contrast, general-purpose robots like Google’s RT-2 can perform cross-task operations but suffer from low efficiency, falling short of commercial requirements.
Generative AI has endowed robots with unprecedented generalization capabilities. With the continuous deployment of large models, humanoid robots can now tap into “global knowledge,” enabling environmental understanding, cognitive behavior reasoning, and rapid adaptation in long-tail scenarios. Wan Bin, COO of humanoid robot company Qinglang Intelligent, explained in a conference speech that future robots will operate in unstructured environments, perceiving, understanding, and acting in dynamically changing settings, unlike industrial robots confined to fixed routes and environments.
The enhanced versatility of humanoid robots is poised to drive market consolidation, shifting from a fragmented, vertical structure to a more integrated, demand-driven model. Generative AI is also accelerating the proliferation of service robots. Compared to industrial robots, service robots face highly interactive scenarios involving humans and environments, demanding superior reaction and execution capabilities. “Industrial robots have long dominated the market, but as robotic intelligence improves, service robots are expected to account for over 50% of the overall robot market by 2030, with leading growth rates,” Wan Bin added.
Regarding the evolution of robot forms, Wan Bin further noted that the future will not be dominated solely by general-purpose humanoid robots. Instead, the industry will promote the synergistic development of general and specialized robots to balance effectiveness, efficiency, and cost, thereby advancing commercial adoption. The rise of AI large models provides a crucial foothold for robotics companies to build technological barriers. Firms are intensifying efforts in model training and deployment to enhance the generalization capabilities of humanoid robots, though this introduces additional hardware costs and system complexity. “Every company must find a balance between R&D costs and product performance,” Wan Bin stated. “The ultimate competition lies in who achieves equilibrium in a superior manner. Stronger algorithms can reduce demands on computing power and chips, as software can compensate for hardware shortcomings.”
Data as the Core Bottleneck in Humanoid Robot Development
When discussing the challenges of advancing embodied intelligence, Wan Bin highlighted the scarcity of physical world data. This gap has become a primary barrier to the generalization of capabilities in humanoid robots. Data is often termed the “oil” of the AI era, but its extraction and application for embodied intelligence are exceptionally complex. Unlike the static snippets of text and images used to train large language models, embodied intelligence training requires continuous dynamic scene flows, shifting data forms from one-dimensional and two-dimensional to three-dimensional and four-dimensional, incorporating space and time. This makes data acquisition difficult and costly.
“The entire industry is severely lacking in data at this stage,” said Nie Kaixuan, founder of physical AI simulation system developer Songying Technology. “The embodied intelligent interaction data available is only in the millions of entries, whereas the actual required scale might be in the tens of millions or even billions.” The most effective and reliable data comes from real-world collection by humanoid robots, but this approach yields limited quantities and incurs high costs, hindering the development of general intelligence. The industry has explored various data solutions, such as synthetic simulation data generated through virtual engines and AIGC. Simulation data offers advantages in cost-effectiveness and controllable variables, making it suitable for pre-training embodied intelligence models and skill verification.
“If we can use data construction to achieve scene restoration, the effect would be optimal,” Wan Bin commented. “If done well, it would be equivalent to recreating a virtual Earth—we have been monitoring this progress closely.” Despite the promising prospects of simulation systems, relying solely on synthetic data has limitations. Due to the “reality gap” between current simulation physics engines and the real world, models trained exclusively on virtual data often exhibit performance degradation in actual environments.
“Real machine data and simulation data are not mutually exclusive but rather complementary,” Nie Kaixuan pointed out in his conference address. “There is a growing consensus in the industry on a training model that combines real data as a supplement with synthetic data as the main component.” Both types of data hold value, and their ratio should be determined based on economic feasibility, safety, and accessibility. “We believe a 1:8:1 structure is reasonable,” Nie stated. “This consists of 10% expert perspective data collected through real machines or simulation control, 80% automatically synthesized simulation data using robot models and AI, and 10% physical fine-tuning data for final model validation and optimization.”
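As a rough illustration of the 1:8:1 structure Nie describes, the sketch below splits a total sample budget across the three data sources. The category names, the `allocate_samples` helper, and the rounding rule (largest remainder) are assumptions made for this example, not part of any published pipeline.

```python
from typing import Dict

# Hypothetical category names; only the 1:8:1 ratio comes from the article.
MIX_RATIO: Dict[str, int] = {
    "expert_demonstration": 1,   # expert data from real machines or simulation control
    "synthetic_simulation": 8,   # automatically synthesized simulation data
    "physical_fine_tuning": 1,   # real-world data for final validation and tuning
}


def allocate_samples(total: int, ratio: Dict[str, int] = MIX_RATIO) -> Dict[str, int]:
    """Split `total` samples across sources in proportion to `ratio`.

    Uses largest-remainder rounding so the counts always sum to `total`.
    """
    if total < 0:
        raise ValueError("total must be non-negative")
    weight_sum = sum(ratio.values())

    # Start with the floor of each proportional share.
    counts = {name: total * weight // weight_sum for name, weight in ratio.items()}

    # Hand leftover samples to the sources with the largest fractional share.
    leftover = total - sum(counts.values())
    by_fraction = sorted(
        ratio,
        key=lambda name: (total * ratio[name]) % weight_sum,
        reverse=True,
    )
    for name in by_fraction[:leftover]:
        counts[name] += 1
    return counts


if __name__ == "__main__":
    for source, count in allocate_samples(1_000_000).items():
        print(f"{source}: {count}")
```

In practice the ratio would be tuned per task based on the economic feasibility, safety, and accessibility considerations mentioned above; the fixed weights here only mirror the quoted proportions.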
Currently, data collection in the field faces several issues. Simulation data demands extremely high hardware stability, yet industry hardware lacks unified standards and its form factors are still in flux. Moreover, because companies follow divergent approaches and algorithms have not yet converged, collected data may prove ineffective for real training scenarios of humanoid robots. Real machine collection also grapples with the cost of re-collecting data as robots iterate. Addressing these challenges requires collaboration among software firms, hardware firms, and technology platforms to promote industry standardization and effectively bridge the gap between data and models.
| Year | Shipments / Market Mix | Market Size / Scale Indicator | Key Drivers |
|---|---|---|---|
| 2025 | Accelerating production phases | Indicated by recent large orders (e.g., 250 million yuan contract) | Generative AI integration, corporate investments |
| 2030 | Service robots to exceed 50% of the overall robot market | Rapid expansion in embodied intelligence applications | Improved algorithms, cost reductions in humanoid robots |
| 2040 | Up to 10 billion units per year | Up to $20 trillion | Widespread adoption, technological maturity in humanoid robots |
Future Outlook and Industry Evolution
Overall, the embodied intelligence industry remains in its infancy. Industry insiders estimate that the current stage of embodied intelligence development is comparable to the GPT-2 era in generative AI. In the next one to two years, the sector is expected to overcome data barriers, with a universal algorithm or system achieving a critical breakthrough, propelling embodied intelligence into its “GPT-3” moment. By around 2030, consumers and markets will perceive widespread impacts, marking the “GPT-3.5” moment for the industry.
“The embodied intelligence market has the potential to become the next new energy vehicle market or even larger,” Liu Mingrui emphasized in his speech. “The cost of a single humanoid robot could drop to tens of thousands of RMB or lower. When robots can perform tasks like grocery shopping and are affordably priced, annual shipments reaching billions of units are not far-fetched.” The progression of humanoid robots hinges on resolving data inadequacies and fostering interdisciplinary collaboration. As companies navigate the balance between simulation and real-world data, the focus will shift toward scalable data acquisition methods that enhance the adaptability of humanoid robots in diverse environments.
In summary, the trajectory for humanoid robots is set for exponential growth, but the path is fraught with data-related hurdles. The integration of generative AI has already unlocked new levels of generality, allowing humanoid robots to operate in unstructured settings. However, the scarcity of high-quality, dynamic data remains a formidable challenge. By adopting hybrid data strategies and pushing for industry-wide standards, stakeholders can accelerate the development of robust embodied intelligence systems. The coming years will be pivotal, as breakthroughs in data handling and algorithm design could usher in an era where humanoid robots become ubiquitous, transforming industries and daily life alike. The relentless pursuit of innovation in humanoid robots will undoubtedly shape the future of automation, making this a critical area to watch for investors, technologists, and consumers worldwide.
- Key Challenges for Humanoid Robots: Data scarcity, high collection costs, simulation-reality gaps, and the need for standardized hardware and algorithms.
- Potential Solutions: A 1:8:1 mix of expert, synthetic simulation, and physical fine-tuning data, advancements in AI models, and industry collaboration on standardization.
- Market Implications: Humanoid robots could drive a multi-trillion-dollar market, with service robots leading growth by 2030.
As the race to perfect humanoid robots intensifies, the emphasis on data as the linchpin for success cannot be overstated. With continued investment and innovation, the vision of humanoid robots seamlessly integrating into society may soon become a reality, heralding a new chapter in the age of intelligent machines.