Evaluating Interaction Information Quality with Social Robot Participation

In recent years, the rapid advancement of artificial intelligence (AI) has transformed social platforms, and intelligent social robots have emerged as active participants in human interactions. Powered by large language models, these robots are no longer passive tools but social actors that engage in dynamic conversations and reshape the information ecosystem. This study evaluates the quality of interaction information in scenarios where social robots converse with humans on social platforms. The proliferation of AI-generated content (AIGC) brings both opportunities and risks: it can enhance user engagement, but it can also spread misinformation and homogenize content. Assessing information quality is therefore crucial for platform governance and user experience. We propose a comprehensive evaluation framework for measuring information quality in human-robot interactions, grounded in a case study from a major Chinese social platform. By integrating multi-dimensional indicators with established analytical methods, this research offers insights into how social robots influence information dynamics and provides practical recommendations for improvement.

The concept of social robots has evolved significantly. Early definitions emphasized automated account operation and information dissemination, whereas modern social robots, driven by generative AI, exhibit human-like social behaviors: they generate context-aware content and sustain multi-turn dialogues. Unlike traditional chatbots, these robots operate in open social environments, interacting with many users simultaneously and adapting to real-time discussions. This shift calls for a reevaluation of information quality metrics, because existing frameworks for AIGC or user-generated content (UGC) overlook the complexities of social platform interactions. AIGC quality assessments typically focus on technical aspects such as coherence and accuracy, but they seldom account for social factors such as influence and reliability in dynamic exchanges. UGC evaluations, in turn, rely on macro-level user or platform metrics and miss the granularity of post-level interactions. Our study addresses these gaps by developing a fine-grained evaluation system tailored to human-robot interactions in which the robot acts as an autonomous social agent.

To construct a robust evaluation framework, we first identified key dimensions of information quality through a literature review and user surveys. Based on feedback from 111 valid respondents, we refined an initial set of indicators into four dimensions: interaction content, influence, interaction effect, and reliability. These dimensions comprise the 14 quantitative and qualitative indicators summarized in Table 1. The interaction content dimension measures the substance of exchanges, including information volume, knowledge richness, and emotional intensity. Influence captures social impact through metrics such as follower count and account authentication. Interaction effect evaluates conversational dynamics such as readability and consistency, while reliability covers content safety, accuracy, and novelty. This multi-faceted approach supports a holistic assessment of information quality in human-robot interactions and highlights how social robots can enhance or diminish each aspect.

Table 1: Evaluation Indicators for Information Quality in Human-Robot Interactions

| Dimension | Indicator | Description |
|---|---|---|
| Interaction Content | Information Volume | Post length, indicating the amount of information conveyed. |
| | Knowledge Richness | Count of technical terms related to the discussion topic (e.g., electric vehicles). |
| | Emotional Intensity | Positive or negative sentiment score derived from text analysis. |
| | Timeliness | Response time in minutes, reflecting engagement promptness. |
| Influence | Follower Count | Number of followers, indicating user reach and authority. |
| | Account Authentication | Binary indicator (1 for verified, 0 otherwise) for credibility. |
| | Post Frequency | Total number of posts by the account, showing activity level. |
| Interaction Effect | Readability | Ease of understanding, calculated using text complexity formulas. |
| | Consistency | Semantic similarity with previous posts, ensuring contextual relevance. |
| | Persistence | Depth of interaction threads, normalized by conversation length. |
| | Topic Relevance | Alignment with discussion themes, derived from topic modeling. |
| Reliability | Safety | Absence of privacy violations or ethical issues, scored qualitatively. |
| | Accuracy | Factual correctness based on reference materials, rated by evaluators. |
| | Novelty | Timeliness and originality of content, assessed against event timelines. |
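To make the quantitative indicators in Table 1 concrete, here is a minimal Python sketch of how three of them might be computed from a post. The English term list is purely illustrative (the actual study would use a curated Chinese domain dictionary), and the function names are ours.

```python
from datetime import datetime

# Illustrative stand-in for the domain-specific term dictionary;
# the real resource would be a curated Chinese vocabulary for the topic.
EV_TERMS = {"battery", "charging pile", "range", "electric motor", "fast charging"}

def information_volume(text: str) -> int:
    """Information volume: proxied by post length in characters."""
    return len(text)

def knowledge_richness(text: str) -> int:
    """Knowledge richness: number of domain terms appearing in the post."""
    lowered = text.lower()
    return sum(1 for term in EV_TERMS if term in lowered)

def timeliness(posted_at: datetime, replied_at: datetime) -> float:
    """Timeliness: response delay in minutes (smaller is more prompt)."""
    return (replied_at - posted_at).total_seconds() / 60.0
```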

The evaluation model combines Principal Component Analysis (PCA) with the Technique for Order Preference by Similarity to an Ideal Solution (TOPSIS) to compute comprehensive quality scores. PCA reduces dimensionality and assigns objective weights to the indicators, while TOPSIS ranks posts by their proximity to ideal solutions. The PCA weighting starts from the principal components of the indicator correlation matrix; the weight $a_i$ for indicator $i$ is:

$$a_i = \frac{\sum_{j=1}^{m} u_{ij} V_j}{\sum_{j=1}^{m} V_j}$$

where $u_{ij}$ is the loading of indicator $i$ on principal component $j$, and $V_j$ is the variance explained by component $j$. For TOPSIS, we first construct a decision matrix $X$ with $n$ posts and $h$ indicators. The matrix is normalized to $Z$ using:

$$z_{sj} = \frac{x_{sj}}{\sqrt{\sum_{s=1}^{n} x_{sj}^2}}$$

where $s$ indexes the post and $j$ the indicator. The weighted normalized matrix $B$ is then computed as $b_{sj} = z_{sj} \cdot \omega_j$, where $\omega_j$ is the PCA-derived weight of indicator $j$ (the $a_j$ above). The positive ideal solution $B^+$ and negative ideal solution $B^-$ take, for each indicator, the best and worst weighted values across all posts:

$$B^+ = (\max_s b_{s1}, \max_s b_{s2}, \dots, \max_s b_{sh})$$
$$B^- = (\min_s b_{s1}, \min_s b_{s2}, \dots, \min_s b_{sh})$$

The distances $D_s^+$ and $D_s^-$ from each post to these ideals are calculated as:

$$D_s^+ = \sqrt{\sum_{j=1}^{h} (b_{sj} - B_j^+)^2}$$
$$D_s^- = \sqrt{\sum_{j=1}^{h} (b_{sj} - B_j^-)^2}$$

The relative closeness $C_s$ for each post is then:

$$C_s = \frac{D_s^-}{D_s^+ + D_s^-}$$

where $C_s$ ranges from 0 to 1, with higher values indicating better information quality. This model allows a nuanced comparison of posts from humans and social robots, enabling us to assess their respective contributions to interaction quality.
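To ground the model, the following Python sketch implements both steps under our own assumptions (scikit-learn for PCA, all indicators treated as benefit-type where larger is better); it illustrates the formulas above and is not the study's published code.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def pca_weights(x: np.ndarray, n_components: int) -> np.ndarray:
    """Objective indicator weights a_i derived from PCA loadings.

    x: decision matrix of shape (n posts, h indicators).
    """
    # Standardize columns so differently scaled indicators are comparable.
    z = StandardScaler().fit_transform(x)
    pca = PCA(n_components=n_components).fit(z)
    # Loadings u_ij of indicator i on component j (components_ is (j, i)).
    u = pca.components_.T * np.sqrt(pca.explained_variance_)
    # V_j: share of variance explained by component j.
    v = pca.explained_variance_ratio_
    # a_i = sum_j u_ij V_j / sum_j V_j; absolute loadings keep the weights
    # positive, a common practical choice not spelled out in the text.
    a = (np.abs(u) @ v) / v.sum()
    return a / a.sum()

def topsis_closeness(x: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Relative closeness C_s of each post to the positive ideal solution."""
    # z_sj = x_sj / sqrt(sum_s x_sj^2): vector-normalize each column.
    z = x / np.sqrt((x ** 2).sum(axis=0))
    # b_sj = z_sj * w_j: weighted normalized matrix.
    b = z * w
    # Ideal solutions: per-indicator max/min over all posts (benefit-type
    # indicators assumed; cost-type ones, e.g. response time, would swap them).
    b_pos, b_neg = b.max(axis=0), b.min(axis=0)
    # Euclidean distances D_s^+ and D_s^- to the two ideals.
    d_pos = np.sqrt(((b - b_pos) ** 2).sum(axis=1))
    d_neg = np.sqrt(((b - b_neg) ** 2).sum(axis=1))
    # C_s in [0, 1]: higher means closer to the ideal, i.e. better quality.
    return d_neg / (d_pos + d_neg)
```

Given a filled decision matrix `x`, `topsis_closeness(x, pca_weights(x, n_components=5))` scores every post in one pass; the choice of five components is arbitrary here and would in practice follow an explained-variance criterion.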

For empirical validation, we collected data from a popular Chinese social platform on which a social robot actively participates in discussions about new energy vehicles, a topic of high public interest. We gathered 9,325 posts, comprising human-initiated posts, human replies, and robot replies, over a one-month period. After preprocessing the text to remove noise and extract features, we computed the indicators with natural language processing techniques: knowledge richness was measured by matching terms from a domain-specific dictionary, and emotional intensity was scored with a pre-trained sentiment analysis model. Qualitative indicators such as safety and accuracy were rated by trained evaluators to ensure consistency. The results, summarized in Table 2, reveal distinct patterns in information quality across post types. Robot replies fall predominantly into the medium-quality band (95.9%), outperforming most human replies but lagging behind human-initiated posts in aspects such as novelty and accuracy. This underscores the potential of social robots as reliable information sources, while leaving room for improvement in their generative capabilities.

Table 2: Distribution of Information Quality Scores by Post Type

| Post Type | Low Quality (score < 0.04) | Medium Quality (0.04 ≤ score ≤ 0.06) | High Quality (score > 0.06) |
|---|---|---|---|
| Social robot replies | 3.4% | 95.9% | 0.7% |
| Human replies | 79.5% | 16.6% | 3.9% |
| Human-initiated posts | 40.4% | 47.6% | 12.0% |
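The bands in Table 2 are simple thresholds on the closeness score $C_s$; a minimal sketch using the table's own cut-offs:

```python
def quality_band(c: float) -> str:
    """Map a relative closeness score C_s to the bands used in Table 2."""
    if c < 0.04:
        return "low"
    if c <= 0.06:
        return "medium"
    return "high"
```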

Further analysis shows that social robot participation positively affects overall interaction information quality. In threads with at least five comments, the presence of robot replies raised the average information quality score by 3.485% (p < 0.001, Wilcoxon signed-rank test), highlighting the constructive role of the robot in elevating discussion standards. We also found a significant positive correlation between the quality of human posts and that of the robot replies they elicit (r = 0.250, p < 0.001): higher-quality human input draws better responses from the robot. This synergy suggests that improving human contributions can amplify the benefits of human-robot interaction; for instance, when users write clear, knowledge-rich posts, the robot generates more relevant and accurate replies, enriching the collective information environment. However, the limited persistence of robot interactions (only 1.63% of replies extend into deeper threads) points to constraints on multi-turn engagement, possibly due to platform restrictions or algorithmic limitations.
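Both statistics reported above correspond to standard SciPy routines. The sketch below assumes paired thread-level scores (the same threads scored with and without robot replies counted) and per-exchange score pairs as inputs; this is our reading of the setup, not released analysis code, and Pearson correlation is our assumption for the reported r.

```python
import numpy as np
from scipy import stats

def robot_effect_tests(quality_with: np.ndarray, quality_without: np.ndarray,
                       human_scores: np.ndarray, robot_scores: np.ndarray):
    """Run the two tests reported in the text on hypothetical inputs."""
    # Wilcoxon signed-rank test on paired thread-level quality scores.
    w_stat, w_p = stats.wilcoxon(quality_with, quality_without)
    # Pearson correlation between human post quality and robot reply quality.
    r, r_p = stats.pearsonr(human_scores, robot_scores)
    return (w_stat, w_p), (r, r_p)
```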

The reliability dimension emerged as a critical factor: the social robot excelled in safety but showed weaknesses in accuracy and novelty, consistent with concerns about hallucination and outdated knowledge in generative models. To address this, we recommend dynamic knowledge updates and real-time context awareness. Integrating event-driven data streams, for example, can improve the novelty of responses, while feedback mechanisms can correct factual errors over time. Enhancing multi-turn capability through more advanced dialogue management could also foster deeper interactions, moving beyond superficial exchanges toward meaningful collaboration. From a platform perspective, guiding users to formulate high-quality posts, such as by prompting for specific details or keywords, can stimulate more informative robot responses and create a virtuous cycle of quality improvement. These strategies not only strengthen robot performance but also contribute to a healthier digital ecosystem in which human and machine intelligence work together effectively.

In conclusion, this study demonstrates the viability of evaluating interaction information quality in human-robot contexts with a multi-dimensional framework and computational models. As a social actor, the robot shows promise in raising information quality, particularly through its influence and reliability, although challenges remain in accuracy, novelty, and persistent engagement. Future work should extend to diverse platforms and robot types to generalize the findings, and longitudinal studies could track how such robots evolve as they learn from interactions. By refining evaluation metrics and fostering synergistic human-robot dynamics, we can harness the potential of AI in social communication and build intelligent, trustworthy digital communities.
