In recent years, the integration of humanoid robots into daily life and commercial services has accelerated, necessitating designs that foster user acceptance and trust. A critical factor influencing user perception is the robot’s appearance, which can evoke emotional responses ranging from comfort to aversion. Traditional design approaches often prioritize engineering feasibility over emotional appeal, resulting in humanoid robots that appear cold, rigid, and uninviting. To address this, I focus on enhancing the affinity of humanoid robot exteriors by leveraging Kansei Engineering to quantify design elements and by training Stable Diffusion (SD) models to generate high-affinity design proposals efficiently. This approach bridges the gap between subjective user preferences and objective design parameters, enabling the creation of humanoid robots that are not only functional but also emotionally engaging.
The concept of affinity in product design refers to the degree to which a product aligns with human physiological and psychological needs, evoking feelings of pleasure, comfort, and relaxation. For humanoid robots, affinity is manifested through visual elements such as form, material, and color, which collectively shape user impressions. Kansei Engineering provides a systematic methodology to translate these affective (Kansei) perceptions into tangible design specifications. By decomposing affinity into three evaluative dimensions (affinity degree, gentleness degree, and liveliness degree), I establish a framework for assessing how each design element contributes to overall affinity. This quantitative analysis forms the basis for guiding AI model training, ensuring that generated designs adhere to empirically derived affinity criteria.
To explore the design features that enhance affinity in humanoid robots, I employed Kansei Engineering methods, which correlate user emotions with product attributes. I first identified the key design elements: form (including head shape, eye shape, and body proportion), material (with surface treatments), and color (covering hue, saturation, and brightness). Using a form analysis method, I abstracted common features from existing humanoid robots to create standardized questionnaire stimuli. For instance, head shapes were categorized into square, circle, vertical stadium (a racetrack-shaped rounded rectangle), horizontal stadium, and semi-circle, while eye shapes included circle, square, vertical stadium, and horizontal stadium. Body proportions were modeled after infantile, child, adolescent, and adult figures, further divided into slender and robust types. Materials encompassed plastic, cloth, metal, and transparent variants, each with two surface finishes, and colors were selected from warm, cool, and neutral palettes with variations in saturation and brightness.
A questionnaire was distributed to 645 participants, who rated each design element on three affinity dimensions: affinity degree (measuring closeness), gentleness degree (reflecting warmth and care), and liveliness degree (indicating vitality). The data were analyzed using Kendall’s rank correlation for ordinal variables and Pearson’s correlation for continuous variables to determine the relationships between design elements and affinity scores. The results revealed that head shape had the highest impact on affinity, followed by facial expression, color, material, and body proportion. For example, horizontal stadium (racetrack-shaped) heads and vertical stadium eyes scored highest in affinity, while slender adolescent body proportions were preferred over robust adult forms. Materials like cloth with textured surfaces enhanced affinity, whereas metal surfaces diminished it. Colors with higher brightness and lower saturation, such as light warm tones, received the highest affinity ratings.
To quantify these findings, I developed an affinity scoring table that assigns a value to each design feature based on its contribution to overall affinity. This table serves as a guideline for selecting and creating training samples for the SD model. The scores are derived from the median ratings of each element across the three dimensions, with higher values indicating greater affinity. For instance, a horizontal stadium (racetrack-shaped) head (A4) scores 3, while a metal material (D5) scores 0. This systematic scoring enables precise control over the design attributes during model training.
The affinity score for a humanoid robot design can be represented as a weighted sum of its element scores. Let \( S_{\text{affinity}} \) denote the total affinity score, and \( w_i \) represent the weight of each design element \( i \), based on its impact factor from the questionnaire analysis. The score is calculated as:
$$ S_{\text{affinity}} = \sum_{i=1}^{n} w_i \cdot s_i $$
where \( s_i \) is the score of element \( i \) from the affinity table, and \( n \) is the number of elements considered. For example, if head shape (weight \( w_h = 0.3 \)) scores \( s_h = 3 \), and color (weight \( w_c = 0.2 \)) scores \( s_c = 4 \), the contribution to \( S_{\text{affinity}} \) would be \( 0.3 \times 3 + 0.2 \times 4 = 1.7 \). This formula allows for the optimization of designs by maximizing the total score under constraints.
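The weighted sum above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the study: the function name `weighted_affinity` and the dictionary layout are assumptions, and the two weights are the ones from the worked example, not values taken from the questionnaire analysis.

```python
def weighted_affinity(scores, weights):
    """Return the sum of w_i * s_i over the elements present in `scores`.

    scores  -- maps an element name to its score s_i from the affinity table
    weights -- maps an element name to its weight w_i (illustrative values)
    """
    missing = set(scores) - set(weights)
    if missing:
        raise KeyError(f"no weight given for: {sorted(missing)}")
    return sum(weights[name] * s for name, s in scores.items())


# Values from the worked example in the text (head shape and color only).
example_scores = {"head_shape": 3, "color": 4}
example_weights = {"head_shape": 0.3, "color": 0.2}

partial = weighted_affinity(example_scores, example_weights)
print(round(partial, 2))
```

Raising an error for a missing weight, instead of silently treating it as zero, makes it harder to under-count an element by accident.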
| Element Type | Code | Score | Element Type | Code | Score |
|---|---|---|---|---|---|
| Head Shape | A1 | 3 | Eye Shape | B1 | 1 |
| | A2 | 3 | | B2 | 1 |
| | A3 | 0 | | B3 | 3 |
| | A4 | 3 | | B4 | 0 |
| | A5 | 0 | – | – | – |
| Body Proportion | C1 | 3 | Material | D1 | 1 |
| | C2 | 2 | | D2 | 2 |
| | C3 | 4 | | D3 | 3 |
| | C4 | 3 | | D4 | 3 |
| | C5 | 2 | | D5 | 0 |
| | C6 | 1 | | D6 | 0 |
| | C7 | 0 | | D7 | 1 |
| – | – | – | | D8 | 1 |
| Color | E1 | 4 | – | – | – |
| | E2 | 2 | – | – | – |
| | E3 | 2 | – | – | – |
| | E4 | 1 | – | – | – |
| | E5 | 3 | – | – | – |
| | E6 | 0 | – | – | – |
| | E7 | 0 | – | – | – |
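When screening candidate designs or training samples, the scoring table can be treated as a simple lookup. The sketch below assumes an unweighted total with one code per element type. The names `AFFINITY_TABLE` and `table_score` and the two candidate designs are hypothetical and serve only to illustrate the lookup.

```python
# Scores copied from the affinity scoring table (code -> score).
AFFINITY_TABLE = {
    # head shape
    "A1": 3, "A2": 3, "A3": 0, "A4": 3, "A5": 0,
    # eye shape
    "B1": 1, "B2": 1, "B3": 3, "B4": 0,
    # body proportion
    "C1": 3, "C2": 2, "C3": 4, "C4": 3, "C5": 2, "C6": 1, "C7": 0,
    # material
    "D1": 1, "D2": 2, "D3": 3, "D4": 3, "D5": 0, "D6": 0, "D7": 1, "D8": 1,
    # color
    "E1": 4, "E2": 2, "E3": 2, "E4": 1, "E5": 3, "E6": 0, "E7": 0,
}


def table_score(codes):
    """Unweighted total for a design; expects at most one code per element type."""
    families = [code[0] for code in codes]
    if len(set(families)) != len(families):
        raise ValueError("more than one code given for the same element type")
    return sum(AFFINITY_TABLE[code] for code in codes)


# Hypothetical candidates, described by one code per element type.
candidates = {
    "candidate_1": ["A4", "B3", "C1", "D2", "E2"],
    "candidate_2": ["A3", "B4", "C7", "D5", "E6"],
}

# Highest-scoring candidates first.
ranked = sorted(candidates, key=lambda name: table_score(candidates[name]), reverse=True)
print(ranked)
```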
The correlation analysis between the three affinity dimensions revealed statistically significant positive relationships for most elements, indicating consistency in user perceptions. For instance, the Kendall’s correlation between affinity degree and gentleness degree for head shape was \( \tau = 0.356 \) (p < 0.01), and between affinity degree and liveliness degree for eye shape was \( \tau = 0.345 \) (p < 0.01). However, some materials and color saturations showed negative correlations, such as between gentleness degree and liveliness degree for plastic surfaces (\( \tau = -0.060 \), p < 0.01), suggesting that reducing surface smoothness can enhance affinity without compromising vitality. These insights inform the selection of design combinations that maximize overall affinity.
| Design Element | Affinity-Gentleness | Affinity-Liveliness | Gentleness-Liveliness |
|---|---|---|---|
| Head Shape | 0.356** | 0.303** | 0.284** |
| Eye Shape | 0.318** | 0.345** | 0.368** |
| Body Proportion | 0.160** | 0.113** | 0.152** |
| Plastic Material | 0.048** | -0.006 | -0.060** |
| Cloth Material | -0.016 | -0.006 | -0.020 |
| Neutral Colors | 0.294** | 0.319** | 0.301** |
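For readers who want to reproduce this kind of coefficient, the following is a minimal pure-Python sketch of Kendall’s rank correlation. It assumes the tau-b variant, which corrects for tied ratings; the text does not state which variant was used. The rating lists are made-up placeholders, not questionnaire data.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equal-length sequences of ordinal ratings."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # pairs tied in x or y count as neither concordant nor discordant
    n_pairs = len(x) * (len(x) - 1) // 2
    ties_x = sum(c * (c - 1) // 2 for c in Counter(x).values())
    ties_y = sum(c * (c - 1) // 2 for c in Counter(y).values())
    denom = sqrt((n_pairs - ties_x) * (n_pairs - ties_y))
    if denom == 0:
        return float("nan")  # undefined when one variable is constant
    return (concordant - discordant) / denom


# Placeholder ratings from four imaginary respondents.
affinity_ratings = [1, 2, 2, 3]
gentleness_ratings = [1, 2, 3, 3]
tau = kendall_tau_b(affinity_ratings, gentleness_ratings)
print(round(tau, 3))
```

The pairwise loop is quadratic in the number of respondents, which is acceptable for a sketch. A statistics library would normally be used for several hundred responses and for the accompanying p-values.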
With the affinity criteria established, I proceeded to train Stable Diffusion models to generate humanoid robot designs that embody these characteristics. SD is a latent diffusion model that generates images from text prompts or input images by iteratively denoising random noise. The training process involves fine-tuning the model on a curated dataset of high-affinity humanoid robot images, using the affinity scoring table to guide sample selection. I employed two primary methods: DreamBooth for comprehensive model adjustments and LoRA for efficient, lightweight training. DreamBooth modifies all layers of the SD neural network to capture specific features, while LoRA inserts additional layers to influence output without extensive retraining, balancing quality and computational efficiency.
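As general background on why LoRA is lightweight (this is the standard LoRA formulation, not a detail reported for this study), a pretrained weight matrix \( W_0 \) is kept frozen and only a low-rank update is learned:

$$ W = W_0 + \frac{\alpha}{r} B A, \qquad B \in \mathbb{R}^{d \times r},\; A \in \mathbb{R}^{r \times k},\; r \ll \min(d, k) $$

where \( r \) is the adapter rank and \( \alpha \) is a scaling factor. Only \( A \) and \( B \) are trained, so the number of trainable parameters per adapted matrix drops from \( d \cdot k \) to \( r \cdot (d + k) \), whereas DreamBooth updates the full set of model weights.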
The training dataset was constructed by selecting and refining images that scored highly on the affinity scale. Using tools like ControlNet, img2img, and inpainting in SD, along with manual edits in Photoshop, I enhanced the initial source images to align with affinity features, such as rounded forms, soft materials, and light colors. Each training sample was annotated with descriptive tags based on design elements, such as “oval head,” “vertical eyes,” “white body,” and “plastic material,” ensuring the model learns the correct associations. The annotation process followed a structured order: overall form, form details, style, material, and color were each labeled to provide clear guidance during training.
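The structured annotation order can be illustrated with a small helper that assembles a caption from tagged fields. This is a sketch under stated assumptions: the class name `SampleTags`, the comma-separated caption format, and the “humanoid robot” tag are hypothetical. Only the four quoted tags come from the text.

```python
from dataclasses import dataclass, field


@dataclass
class SampleTags:
    """Tags for one training image, grouped in the annotation order."""
    overall_form: list = field(default_factory=list)
    form_details: list = field(default_factory=list)
    style: list = field(default_factory=list)
    material: list = field(default_factory=list)
    color: list = field(default_factory=list)

    def caption(self):
        """Join tags in a fixed order, normalizing case and dropping duplicates."""
        ordered = (self.overall_form + self.form_details + self.style
                   + self.material + self.color)
        seen = []
        for tag in ordered:
            tag = tag.strip().lower()
            if tag and tag not in seen:
                seen.append(tag)
        return ", ".join(seen)


sample = SampleTags(
    overall_form=["humanoid robot"],          # hypothetical tag
    form_details=["oval head", "vertical eyes"],
    material=["plastic material"],
    color=["white body"],
)
caption = sample.caption()
print(caption)
```

Keeping the field order fixed means every caption lists form before material and color, so the same attribute always appears in the same position across samples.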
During training, I iteratively updated the dataset based on generated outputs, using the affinity score to evaluate and select samples for subsequent rounds. The training parameters were optimized through experimentation: a learning rate of \( 1 \times 10^{-4} \), a batch size of 5, and 10 epochs, using the 8-bit Adam optimizer and a cosine learning-rate scheduler. The loss function, which measures the difference between generated and target images, was monitored to ensure convergence. Ideally, the loss value decreases to around 0.08 by epochs 7 to 9, indicating effective learning. The loss over epochs can be modeled as an exponential decay:
$$ L(t) = L_0 \cdot e^{-kt} $$
where \( L(t) \) is the loss at epoch \( t \), \( L_0 \) is the initial loss, and \( k \) is the decay constant. Comparing the observed loss curve against this model helps when adjusting training parameters such as the learning rate and the number of epochs.
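One way to estimate \( L_0 \) and \( k \) from a recorded loss curve is a least-squares line fit on the logarithm of the loss, since \( \ln L(t) = \ln L_0 - kt \). The sketch below assumes this fitting approach, which is not described in the text. The function name `fit_exponential_decay` is hypothetical, and the loss values are generated synthetically rather than taken from an actual training run.

```python
import math


def fit_exponential_decay(epochs, losses):
    """Fit ln L = ln L0 - k * t by ordinary least squares; return (L0, k)."""
    if len(epochs) != len(losses) or len(epochs) < 2:
        raise ValueError("need at least two (epoch, loss) pairs of equal length")
    if any(loss <= 0 for loss in losses):
        raise ValueError("losses must be positive to take logarithms")
    logs = [math.log(loss) for loss in losses]
    n = len(epochs)
    mean_t = sum(epochs) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in epochs)
    if sxx == 0:
        raise ValueError("epochs must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(epochs, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return math.exp(intercept), -slope


# Synthetic curve with known parameters, used only to check the fit.
true_l0, true_k = 0.20, 0.12
epochs = list(range(10))
losses = [true_l0 * math.exp(-true_k * t) for t in epochs]

l0_hat, k_hat = fit_exponential_decay(epochs, losses)
print(round(l0_hat, 4), round(k_hat, 4))
```

Real loss curves are noisy, so the fitted \( k \) is better read as a rough convergence rate than as an exact constant.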
After training, I evaluated the models using XY cross plots, which display generated images under different model weights and epochs. By applying the affinity scoring table, I identified the best-performing models, typically those from epoch 6 with weights between 0.6 and 0.8, which produced designs with scores up to 14 out of a possible 16. These models can be used standalone or combined with style-specific LoRA models to generate a diverse matrix of humanoid robot designs. For example, pairing the affinity model with a “sci-fi” or “organic” style LoRA yields variations that maintain high affinity while exploring different aesthetics. The style matrix allows for rapid screening of design proposals, with the highest-scoring options easily identifiable for further development.
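The selection step over an XY plot amounts to picking the (epoch, weight) cell with the best affinity score. The following sketch assumes each cell has already been scored by hand with the affinity table. The function name `best_setting`, the tie-breaking rule, and all grid values are placeholders for illustration, not measured results.

```python
def best_setting(grid_scores):
    """Return ((epoch, weight), score) for the highest-scoring grid cell.

    Ties are broken in favor of the earlier epoch, then the lower weight.
    """
    if not grid_scores:
        raise ValueError("grid_scores is empty")
    return max(
        grid_scores.items(),
        key=lambda item: (item[1], -item[0][0], -item[0][1]),
    )


# Placeholder scores keyed by (epoch, model weight).
grid_scores = {
    (4, 0.6): 9,  (4, 0.8): 10,
    (6, 0.6): 13, (6, 0.8): 14,
    (8, 0.6): 12, (8, 0.8): 11,
}

(best_epoch, best_weight), best_score = best_setting(grid_scores)
print(best_epoch, best_weight, best_score)
```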

The integration of Kansei Engineering and AI-driven design represents a significant advancement in humanoid robot development. By quantifying subjective preferences, I have created a repeatable process for designing robots that resonate emotionally with users. The trained SD models demonstrate the ability to generate numerous high-affinity design proposals efficiently, reducing the time and resources required for iterative design. Moreover, the affinity scoring system provides an objective metric for evaluating generated outputs, ensuring consistency and quality. This approach not only enhances the aesthetic appeal of humanoid robots but also promotes broader acceptance in service and domestic environments.
In conclusion, this methodology bridges the gap between human emotions and robotic design, offering a scalable solution for creating high-affinity humanoid robots. Future work could expand into cross-cultural studies to validate the universality of affinity features, incorporate dynamic interactions into the design process, and explore more advanced AI models for real-time customization. As humanoid robots become increasingly prevalent, fostering positive human-robot relationships through thoughtful design will be crucial for their successful integration into society.