Contemporary artificial intelligence research is undergoing a profound paradigm shift, moving from the classical computational-representational model to a phenomenological framework centered on embodied intelligence. This transition emphasizes that intelligence does not arise from abstract computations detached from the body but is rooted in the dynamic coupling between the agent and its environment, relying on the interplay of action, perception, and meaning construction. By examining the dimensions of sensorimotor, situational, and interactive embodiment, this shift addresses fundamental questions about the nature of intelligence. Humanoid robots, as a cutting-edge form of AI, highlight the constitutive role of the body in intelligence, though they differ fundamentally from human embodied cognition in their functional simulations, which do not capture the transcendental structures of bodily experience. A critical analysis of humanoid robot research can establish the phenomenological conditions and ontological foundations for embodied AI systems, pushing beyond traditional cognitive boundaries to reveal the structural tensions between body, intelligence, and the world, while prompting ethical vigilance.
-
From Disembodied to Embodied Research Paradigms
Cognitive science, as an extension of contemporary epistemology, has evolved since the mid-20th century into an interdisciplinary field encompassing philosophy, psychology, neuroscience, artificial intelligence, and linguistics. While its core concerns echo Kantian questions about how knowledge is possible, its methods have expanded with empirical science. Currently, cognitive science lacks a unified theoretical paradigm, instead featuring competing models such as computationalism, symbolism, representationalism, and cognitivism. These approaches, despite differences, share a foundation in the “cognition as representation” schema, viewing intelligence as an internal information-processing mechanism and reducing the body to a mere conduit for perception and motion, not a constitutive element of cognitive generation. This non-embodied tradition, epitomized by computationalism, asserts that cognition is essentially symbolic computation, with thought processes expressible through Turing models, as seen in the physical symbol system hypothesis by Newell and Simon, which posits that intelligence requires formal symbol manipulation.
However, since the late 1970s, advances in neuroscience and complexity theory have challenged this framework. Connectionism, for instance, rejects the reduction of cognition to symbolic rules, instead viewing it as state activations and weight changes in neural networks, emphasizing emergence as higher-level abilities arise nonlinearly from lower-level interactions. Artificial neural networks build on this, simulating neuronal connections to model non-symbolic aspects of cognition. Yet, connectionism remains within non-embodied paradigms by treating cognition as brain-centric information processing, with the body as a peripheral input-output channel. Similarly, cognitivism and representationalism frame cognition as rule-based manipulation of internal representations, abstracted from specific contexts.
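The connectionist picture can be caricatured with a toy example of my own construction (not any specific model from the literature): a single perceptron stores what it "knows" entirely in its weights, which change through local, error-driven updates rather than through explicitly written symbolic rules.

```python
# Toy perceptron illustrating the connectionist view: "knowledge" lives in
# weights shaped by local updates, not in explicit symbolic rules.
# (Illustrative sketch only; dataset and parameters are arbitrary.)

def step(x):
    return 1 if x >= 0 else 0

def train_perceptron(samples, lr=0.1, epochs=20):
    # weights: [bias, w1, w2], all starting at zero
    w = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for (x1, x2), target in samples:
            y = step(w[0] + w[1] * x1 + w[2] * x2)
            err = target - y
            # Local weight change -- no rule is ever "written down"
            w[0] += lr * err
            w[1] += lr * err * x1
            w[2] += lr * err * x2
    return w

# Logical OR is linearly separable, so the perceptron convergence
# theorem guarantees these updates settle on a correct solution.
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
w = train_perceptron(data)
predictions = [step(w[0] + w[1] * x1 + w[2] * x2) for (x1, x2), _ in data]
print(predictions)  # matches the OR truth table: [0, 1, 1, 1]
```

The point of the sketch is that the trained behavior is nowhere stated as a rule; it emerges from distributed numerical adjustments, which is exactly the emergence claim connectionism makes against symbolic reduction.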
The limitations of non-embodied approaches become apparent in explaining everyday perception, motor control, and complex social behaviors. In the 1980s, embodied cognition emerged as a critical response, rekindling debates on the relationship between cognition, body, and world. In his seminal work, Hubert Dreyfus critiqued the cognitivist paradigm in AI, drawing on Heidegger and Merleau-Ponty to argue that traditional models, based on symbol manipulation and rule systems detached from context, are ontologically and epistemologically flawed. He opposed the “brain as computer” analogy, emphasizing that cognitive abilities stem from dynamic, practical engagements between the body and world, with higher-order cognition grounded in non-representational, pre-reflective bodily skills.
For example, pattern recognition, a foundational human ability, is often simplified in AI as a logical process with defined inputs, rules, and outputs. However, real-world perception occurs in open, ambiguous backgrounds where objects emerge as figures based on context and bodily experience—a phenomenological structure irreproducible by logic programs. Dreyfus highlighted that the “input-processing-output” model ignores the roles of embodiment and situatedness, as pattern recognition involves continuous, body-based interactions with the environment, characterized by pre-reflective and operative intentionality, where meaning arises from action before explicit representation. Collaborative efforts by Selfridge and Neisser to incorporate heuristic rules into computer pattern recognition revealed that formal logic cannot replicate human “vague anticipation,” where expectations and experience enable perception in uncertain situations. This underscores that for AI to achieve human-like adaptability, it must transcend symbolism and formal logic, reevaluating the deep coupling of body, perception, and action, leading to the rise of embodied research paradigms through interdisciplinary dialogue.
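The role of anticipation in perception can be caricatured with a small Bayesian calculation (a toy of my own construction, not Dreyfus's or Selfridge's formalism): the same ambiguous sensory evidence yields different percepts under different context-supplied expectations, something a fixed input-rule-output pipeline cannot reproduce.

```python
# Toy illustration (not from the cited authors): identical ambiguous
# evidence is classified differently depending on a context-supplied
# prior -- a crude stand-in for perceptual "vague anticipation".

def posterior(prior_a, likelihood_a, likelihood_b):
    """P(A | evidence) for two hypotheses A and B via Bayes' rule."""
    prior_b = 1.0 - prior_a
    num = prior_a * likelihood_a
    return num / (num + prior_b * likelihood_b)

# An ambiguous shape: raw evidence only weakly favors "letter" over "digit".
lik_letter, lik_digit = 0.55, 0.45

# Context 1: reading a word -> strong expectation of a letter.
p_letter_in_word = posterior(0.9, lik_letter, lik_digit)
# Context 2: reading a phone number -> strong expectation of a digit.
p_letter_in_number = posterior(0.1, lik_letter, lik_digit)

print(round(p_letter_in_word, 3))    # high: perceived as a letter
print(round(p_letter_in_number, 3))  # low: perceived as a digit
```

This is of course still a representational model; the phenomenological claim is stronger, namely that the anticipation is enacted bodily rather than computed. The toy only shows why context-free rules fail even on their own terms.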
-
Three Dimensions of Embodied Intelligence
The shift from non-embodied to embodied cognition represents not just a change in approach but a paradigm revolution in ontology and epistemology. Embodied cognition theory posits that intelligence emerges from continuous sensorimotor cycles between the agent and environment, rooted in embodied situations defined by morphological constraints, proprioceptive feedback, and environmental affordances. The body’s structure not only limits possible actions but also participates pre-reflectively in meaning generation, with perception and judgment arising from pre-intentional adjustments in specific contexts. Thus, intelligence is a dynamic, extended process co-constituted by the body-environment system, requiring ongoing self-reorganization through action-oriented interactions. This revolution challenges AI’s epistemological foundations and prompts ontological reflection: What forms of embodiment enable intelligence to emerge? How does dynamic coupling shape cognition through perception, meaning, and interaction? To address this, we explore three dimensions: sensorimotor embodiment as the cognitive basis, situational embodiment as the meaning-generation mechanism, and interactive embodiment as the social paradigm.
- Sensorimotor Embodiment as the Cognitive Foundation
Sensorimotor embodiment emphasizes that cognition originates in the perceptual-motor cycles between the body and world. Merleau-Ponty, in Phenomenology of Perception, describes the body as primordially in the world, forming the basis for agent-environment interactions through pre-reflective perceptual-action patterns. In AI, traditional models often modularize perception, action, and cognition, neglecting the body’s core role. Embodied cognition, however, highlights the constitutive impact of bodily morphology on intelligent behavior. In robotics, Rodney Brooks’ “subsumption architecture” exemplifies this, replacing perception-modeling-planning paths with layered behaviors (e.g., obstacle avoidance, navigation) subsumed under lower-level sensorimotor actions. Though limited in advanced cognition, this bottom-up approach laid experimental groundwork. Further, Pfeifer and Bongard’s “morphological computation” model shows how robot bodies can intrinsically realize aspects of intelligence through physical mechanics, reducing reliance on central computation. Hauser and colleagues demonstrated how body structures participate in computation via feedback mechanisms for complex behavior regulation. These findings reveal that intelligence depends on managing perceptual changes and actions, offering biological plausibility and engineering pathways for embodied AI.
- Situational Embodiment as the Meaning-Generation Mechanism
While sensorimotor embodiment addresses cognitive mechanisms, situational embodiment ensures that intelligent acts are meaningful by embedding them in specific physical and social contexts. Gibson’s ecological psychology concept of “affordances” underpins this—objects like stairs are “climbable” or handles “graspable” based on direct, body-coupled meanings, not representations. For AI, this means systems must perceive and adapt to environments, coupling actions with structural changes to generate self-world meaning. Recent neurorobotics research, such as by Krichmar, focuses on neural structure-environment coupling, using bio-inspired learning for adaptive, self-organizing action patterns in complex situations. Neurorobots enable smooth transitions from simulation to real worlds via neural plasticity, mimicking brain adjustments to inputs. However, challenges remain, as studies primarily simulate low-level neural mechanisms (e.g., visual cortex encoding) without cross-level integration.
- Interactive Embodiment as the Social Paradigm
Interactive embodiment extends beyond agent-environment coupling to focus on how agents, including humans, co-construct meaning, norms, and consensus through interaction. Rooted in Husserl’s notion of “pairing” (Paarung) and Merleau-Ponty’s “intercorporeality,” this dimension views intersubjectivity not as information exchange but as a shared experiential field where others are perceived primordially. De Jaegher and Di Paolo’s “participatory sense-making” theory advances this, describing intersubjective interactions as dynamic, self-organizing systems that feed back into participants’ perceptual and cognitive states. In AI design, interactive embodiment requires systems to engage socially, exhibiting “legibility” and “negotiability” through non-verbal coordination like gestures and rhythms. Human-robot interaction research embraces this; for instance, Breazeal’s emotion-driven architectures enable robots to display affective consistency and purpose in social settings, enhancing coherence and trust. Belpaeme and colleagues stress that effective interaction hinges on emotional connections and embodied co-presence, not just efficiency. Thus, interactive embodiment shifts AI from individual cognition to collaborative meaning-making in shared worlds.
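The non-verbal coordination mentioned above can be given a minimal dynamical caricature (my illustration, not De Jaegher and Di Paolo's formal model): two coupled oscillators, standing in for two partners' gesture rhythms, pull each other into a shared tempo that neither imposes unilaterally.

```python
# Toy model of mutual rhythmic coordination: two Kuramoto-style coupled
# oscillators converge on a shared rhythm. All parameters are arbitrary
# and chosen only to make the coupling effect visible.
import math

def simulate(phase1, phase2, w1, w2, coupling, dt=0.01, steps=5000):
    for _ in range(steps):
        # Each agent adjusts its own rhythm toward the other's
        d1 = w1 + coupling * math.sin(phase2 - phase1)
        d2 = w2 + coupling * math.sin(phase1 - phase2)
        phase1 += d1 * dt
        phase2 += d2 * dt
    return phase1, phase2

# Slightly different natural tempos, moderate mutual coupling
p1, p2 = simulate(phase1=0.0, phase2=2.0, w1=1.0, w2=1.2, coupling=0.5)

# The phase difference settles near a small constant: coordination
# emerges from the coupling itself, not from either agent's plan.
diff = math.atan2(math.sin(p1 - p2), math.cos(p1 - p2))
print(abs(diff))  # small (well under the initial offset of 2.0)
```

The design point is that the "consensus" rhythm is a property of the coupled system, which is the structural claim participatory sense-making makes about intersubjective interaction.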
These dimensions are intertwined: sensorimotor coupling grounds cognition in bodily experience, situational embodiment ensures environmental adaptability, and interactive embodiment expands cognition into intersubjective processes. Together, they depict intelligence as a generative, processual existence embedded in body-environment-other relations.
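Brooks' subsumption idea from the sensorimotor discussion above can be sketched as a priority stack of simple behaviors, where a higher layer suppresses the layers below it only when its own sensory trigger fires. This is a minimal caricature of the architecture, not Brooks' actual controller; the behavior names are invented for illustration.

```python
# Minimal caricature of a subsumption-style controller: behaviors are
# layered, and a higher layer subsumes (overrides) lower ones only when
# its own sensory trigger fires. No world model or planner is involved.

def avoid(sensors):
    # Highest priority: reflex-like obstacle avoidance
    if sensors.get("obstacle_ahead"):
        return "turn_away"
    return None  # not triggered -> defer to lower layers

def seek_goal(sensors):
    # Middle layer: head toward a visible goal
    if sensors.get("goal_visible"):
        return "move_toward_goal"
    return None

def wander(sensors):
    # Lowest layer: default exploratory behavior, always active
    return "wander"

LAYERS = [avoid, seek_goal, wander]  # ordered highest to lowest priority

def act(sensors):
    for behavior in LAYERS:
        command = behavior(sensors)
        if command is not None:
            return command  # higher layer subsumes everything below

print(act({"obstacle_ahead": True, "goal_visible": True}))  # turn_away
print(act({"goal_visible": True}))                          # move_toward_goal
print(act({}))                                              # wander
```

Competence here is distributed across the stack of sensor-action couplings rather than centralized in a deliberative module, which is the bottom-up point the text attributes to Brooks.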
-
Humanoid Robots: Redefining Traditional Robotics
The theoretical framework of embodied intelligence has profound practical implications, with humanoid robots serving as a prime example of its application. As a fusion of mechanical engineering, cognitive science, and electronics, humanoid robots are not only testbeds for embodied AI but also philosophical probes into the nature of intelligence, embodiment, and cognitive generation. Merleau-Ponty’s insight that the body is a site of meaning generation raises the question: Can the “body” of a humanoid robot support autonomous meaning structures, or does it assume continuity between biological and machine intelligence? To answer this, we examine how humanoid robots redefine traditional robotics and under what conditions they constitute true embodied AI, touching on technological boundaries and ontological status.
Unlike traditional AI based on abstract symbols, humanoid robots, empowered by large models, mimic human morphology, perception, and interaction to perform complex tasks in human-designed environments like manufacturing, services, and special operations, aiming for generalizable, adaptive intelligence. Morphologically, humanoid robots fall into three categories: wheeled types emphasizing tactile sensors and dexterous hands; semi-bipedal types focusing on leg mobility; and full-bodied types with limbs and diverse perception for open-world tasks. By application, they include specialized, industrial, medical, entertainment, public service, and domestic variants.
In 2023, Boston Dynamics’ Atlas robot demonstrated unprecedented dynamic balance with fluid walking, rising, and 180-degree head and waist rotations. In 2024, Cloud Ginger by CloudMinds (达闼科技) represented a peak in embodied AI, featuring natural language communication, a cloud-based brain for cross-scenario tasks, and 18 degrees of freedom for autonomous path planning and obstacle avoidance in complex settings like malls and hospitals. These humanoid robots integrate “brain,” “cerebellum,” and “body,” enabling synergistic coupling and environmental interaction, redefining robotics through embodiment.
First, traditional AI emphasizes discrete inputs and central processing, ignoring bodily constraints, but humanoid robots employ morphological computation, distributing intelligence across the body-environment system. For instance, Atlas’s jumps and flips rely on mechanical leverage of inertia and feedback, not central calculation, showcasing body-function coupling. The body, as the carrier for human-like functions, integrates actuators, chips, sensors, and materials, with rotational actuators in joints, linear ones in limbs, and end-effectors in hands and feet enabling precise motion.
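The claim that the body itself performs part of the "computation" can be illustrated with a passive spring-damper, a standard simplification in legged-robot models (the parameters below are arbitrary, not Atlas's real dynamics): perturb the limb and it settles back to equilibrium with no controller issuing a single command.

```python
# Illustration of morphological computation with a passive spring-damper
# "leg": a perturbation is absorbed and the limb returns to rest purely
# through physics, with no central controller involved.
# (Arbitrary parameters; a simplification, not any real robot's model.)

def settle(x0, v0, mass=1.0, stiffness=50.0, damping=8.0, dt=0.001, steps=10000):
    """Integrate m*x'' = -k*x - c*v with semi-implicit Euler."""
    x, v = x0, v0
    for _ in range(steps):
        a = (-stiffness * x - damping * v) / mass
        v += a * dt
        x += v * dt  # semi-implicit: use the updated velocity (stable)
    return x, v

# Perturb the leg by 0.2 m (e.g., a landing impact) and let physics act.
x, v = settle(x0=0.2, v0=0.0)
print(abs(x) < 1e-3 and abs(v) < 1e-3)  # True: the body settled itself
```

The stabilization that a purely computational controller would have to calculate and command is here delivered for free by the mechanical structure, which is what "distributing intelligence across the body" means in this context.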

Second, traditional robots operate in predictable, limited contexts, but humanoid robots thrive in open, noisy, uncertain real worlds. For example, SoftBank Robotics’ Pepper, deployed in a California mall in 2025, interacts with customers to identify needs, such as discerning “that pair of shoes” from ambiguous language by integrating vision, voice, object recognition, and context. This moves semantics from static networks to situation-driven multimodal construction. In collaborative work, Pepper uses sensors and machine learning to dynamically adjust routes, showing scene adaptability.
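The situation-driven, multimodal construction of meaning described here can be caricatured as weighted evidence fusion across modalities. All names, candidates, and weights below are invented for illustration; this is not Pepper's actual pipeline.

```python
# Invented illustration of multimodal reference resolution: the referent
# of an ambiguous phrase like "that pair of shoes" is chosen by fusing
# evidence from vision, gaze direction, and dialogue context.

def resolve_referent(candidates, weights):
    """Pick the candidate with the highest weighted evidence score."""
    def score(c):
        return sum(weights[m] * c["evidence"][m] for m in weights)
    return max(candidates, key=score)

# Per-candidate evidence in [0, 1] from each modality (hypothetical values).
candidates = [
    {"name": "red sneakers",
     "evidence": {"vision_match": 0.9, "gaze_proximity": 0.8, "dialogue_context": 0.7}},
    {"name": "black boots",
     "evidence": {"vision_match": 0.9, "gaze_proximity": 0.2, "dialogue_context": 0.3}},
]

# Relative trust in each modality (arbitrary values for the sketch).
weights = {"vision_match": 0.3, "gaze_proximity": 0.4, "dialogue_context": 0.3}

print(resolve_referent(candidates, weights)["name"])  # red sneakers
```

Note that vision alone cannot disambiguate the two candidates here; only the situational channels (gaze, dialogue history) break the tie, which is the sense in which semantics becomes situation-driven rather than static.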
Third, humanoid robots challenge instrumental views of technology, where humans dominate tools. Instead, they become affective, interactive agents. Social robots like Furhat use high-resolution animations and real-time expressions to establish interaction rhythms and complex behaviors in multi-user settings, demonstrating interactive embodiment’s potential beyond individual cognition. In such human-robot interactions, human cognition expands, action possibilities multiply, and mutual understanding forms through intertwined relationships, creating new technical environments for co-learning and co-construction.
By embedding sensorimotor, situational, and interactive embodiment, humanoid robots offer a decentralized, interactively generated AI paradigm, redefining “robot” from controllable mechanical agents to embodied entities whose intelligence arises from dynamic engagement with context and others. This redefinition challenges philosophical foundations, suggesting the body is a condition for cognition, the environment a co-constructive field, and others participants in meaning-making.
However, humanoid robot research faces practical hurdles: high computational costs in motion coordination and multimodal perception, difficulties embedding robots in human emotional and ethical structures for trust and empathy, and theoretical gaps in unifying phenomenological descriptions with formal AI models. These issues stem from a deeper paradigm shift, requiring philosophical and scientific reorientation. Thus, humanoid robots are not the final form of embodied intelligence but catalysts for reflecting on intelligence, body, world relations, and human-robot risks.
In conclusion, the paradigm shift toward embodiment positions it as a core issue in AI, asserting that intelligence stems from dynamic body-environment coupling, not abstract computation. Humanoid robots, by simulating human form and function, underscore the body’s constitutive role, approximating external aspects of human intelligence but differing fundamentally in structural-functional coordination and meaning openness. This gap points not just to engineering challenges but to philosophical questions about whether a humanoid robot’s “body” can bear phenomenological embodiment’s transcendental conditions. While functional simulations fall short of bodily experience’s generative relations, humanoid robots serve as pivotal experimental platforms, exposing non-embodied paradigms’ limits and prompting ontological reconsideration of intelligence. As artifacts at the phenomenology-engineering junction, they highlight irreducible tensions between body, intelligence, and world, urging ongoing ethical alertness to artificial life possibilities.
| Type | Key Features | Examples |
|---|---|---|
| Wheeled Humanoid Robots | Emphasis on tactile sensors and dexterous hand manipulation; wheel-based mobility | Various research prototypes |
| Semi-Bipedal Humanoid Robots | Focus on leg movement and balance; partial human-like lower body | Models under development |
| Full-Bodied Humanoid Robots | Complete limbs and diverse perception; adaptability to open environments | Atlas, Cloud Ginger |
The evolution of humanoid robots illustrates how embodied intelligence transforms AI from a computational artifact to a participatory presence, with implications for cognitive modeling, robotic design, and human-machine symbiosis. As research progresses, the integration of sensorimotor, situational, and interactive dimensions will be crucial for achieving truly adaptive and socially embedded AI systems, while raising important ethical considerations for future development.
