Design and Implementation of a Companion Robot for Elderly Care

As the pace of modern work and life accelerates, population aging has become an increasingly severe issue worldwide. The rapid growth of empty-nest families highlights the urgent need for innovative products and services tailored to the health and care requirements of the elderly. However, the greatest risk for the elderly is not solely health problems but social isolation and loneliness. Isolation refers to physical separation from others, such as elderly individuals living alone, while loneliness denotes the psychological feeling of solitude. To address these challenges, companion robots have emerged as a promising solution. Existing companion robots, such as the early “Yorisoi ifbot” and the famous seal robot Paro, are primarily autonomous robots with fixed feedback patterns to external stimuli. These models often lose their novelty and appeal over time, and users cannot adjust the robot’s responses based on personal needs or preferences, making it difficult to establish emotional connections and a sense of personalized belonging. With advancements in internet and artificial intelligence technologies, robots can now leverage powerful “back-end” support through cloud-based systems, enabling global information and knowledge sharing. Cloud computing and storage facilitate complex computations and data storage, accelerating the learning process, improving computational efficiency, and reducing development and manufacturing costs. For instance, Pepper by SoftBank is a typical cloud-based companion robot capable of storing and learning from information and data, continuously understanding users through voice, facial expressions, posture, and language to provide appropriate content. However, its high cost and technical requirements limit its accessibility. The development of Internet of Things (IoT) technology further allows companion robots to connect with other products, creating a more integrated environment. In this context, our research aims to design a personalized companion robot for the elderly, incorporating cloud services, natural language processing, and IoT technologies to enable diverse interactive modes. This companion robot serves as a user interface between the IoT environment and the elderly, facilitating scenarios such as conversational dialogue, home appliance control, and game companionship during rehabilitation exercises.

The overall system design of our companion robot revolves around two primary modes of companionship. First, the natural language dialogue mode lets users hold Chinese voice conversations with the robot through an app running on a tablet computer. This mode supports queries about the weather and the date, control of home appliances, and other daily conveniences. Second, the game companionship mode integrates the companion robot with smart-home-based rehabilitation games for the elderly, making gameplay more enjoyable and motivating participation in dementia rehabilitation exercises, thereby reducing the workload of caregivers in institutions. The system architecture is illustrated in Figure 1. In the natural language dialogue mode, the companion robot consists of a tablet computer mounted on a robotic base. The user’s voice input is transmitted via the app to a cloud server, which processes the chat content, retrieves relevant information, and sends the results back to the tablet for audio feedback to the user. In the game companionship mode, the app receives signal changes from the rehabilitation games via the cloud server, processes this information, and transmits corresponding commands through a Bluetooth module to the robot’s motion module. This enables the robotic base to perform actions such as rotation and lifting, accompanied by “three-dimensional facial expression patterns” on the tablet, to interact with the user and the environment and make communication more engaging.

The hardware design of our companion robot draws on robot characters from animation, notably Pixar’s 1986 short film “Luxo Jr.,” which features a large desk lamp (Luxo Sr.) and a small desk lamp (Luxo Jr.). Although the lamps have no facial expressions or hands, their personalities and emotions come through vividly in their body movements. This led us to adopt the exaggeration techniques common in animated films to express character through motion. Based on this, we designed the robotic base with three degrees of freedom: 180° horizontal rotation, 90° vertical rotation to each side, and a limited forward tilting lift. These degrees of freedom, combined with the tablet’s facial expressions, allow the companion robot to convey emotions effectively. The hardware of the robotic base is divided into three main parts: the base, the elevation lift mechanism, and the top rotation mechanism. The base, at the bottom layer, supports the platform and houses the control components. The elevation lift mechanism holds the tablet at a suitable reading angle when the robot is powered off and provides the forward tilting lift degree of freedom. The top rotation mechanism provides the vertical rotation freedom for the tablet holder and is driven directly by a servo motor to minimize additional parts and weight. Considering overall weight and strength, we used plastic steel as the primary material, with aluminum alloy for parts bearing larger loads, such as the slide rails in the elevation lift mechanism. The finished robotic base is shown in Figure 5, and its specifications are summarized in Table 1.

Table 1: Specifications of the Companion Robot Base
Item Value
Base Height (cm) 3
Base Disk Diameter (cm) 19
Tablet Holding Spacing (cm) 21
Total Height (cm) 27.9
Mass (kg) 0.8 (excluding tablet)
Head Rotation Angle Range (°) 180 (90° each side)
Forward Tilt Lift Angle Range (°) 30 (5° forward, 25° backward)
Bottom Rotation Angle Range (°) 180
Operating Voltage (V) 5

To transmit commands and support the planned usage scenarios, the control system’s hardware design integrates an ATmega328 8-bit AVR microcontroller and an nRF52832 Bluetooth module. The ATmega328 collects and analyzes signals, executes the various commands, and sends the instructions that drive the servo motors. It provides 23 input/output lines, three of whose PWM-capable pins are used for servo motor control. Its maximum supply voltage is 5.5 V, so step-down and voltage-stabilization circuitry is included to provide a stable 5 V for the servo motors. Its 32 KB of Flash program memory (with 2 KB of SRAM and 1 KB of EEPROM for data) meets our program storage requirements. The Bluetooth module, a BT4.2 module based on the nRF52832, enables low-energy, stable communication between the tablet and the robotic base. Pins P0.26 and P0.27 serve as the communication port connected to the TX/RX pins of the ATmega328, carrying motor control commands and position information. The nRF52832 is widely adopted and well documented, which aided product development. The firmware written to the microcontroller, shown in Figure 8, establishes the Bluetooth connection and handles communication.
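
As a rough illustration of how the two chips cooperate, the sketch below shows an ATmega328-side loop that reads single-character commands forwarded by the nRF52832 over the UART and maps them to servo moves. The baud rate, command characters, and pin assignments are assumptions made for illustration; they are not taken from the firmware shown in Figure 8.

```cpp
// Hypothetical ATmega328 sketch: dispatch single-character commands received
// from the Bluetooth module on the hardware UART to the servo motors.
#include <VarSpeedServo.h>

VarSpeedServo myservoTop;  // vertical head rotation
VarSpeedServo myservoMid;  // forward/backward tilt
VarSpeedServo myservoBot;  // horizontal base rotation

void setup() {
  Serial.begin(9600);      // UART shared with the nRF52832 (assumed baud rate)
  myservoTop.attach(9);    // PWM pin assignments are placeholders
  myservoMid.attach(10);
  myservoBot.attach(11);
}

void loop() {
  if (Serial.available() > 0) {
    char cmd = Serial.read();
    switch (cmd) {
      case 'L': myservoTop.write(115, 65); break;  // left head sway (A1 in Table 2)
      case 'R': myservoTop.write(75, 65);  break;  // right head sway (A2)
      case 'F': myservoMid.write(90, 35);  break;  // forward tilt (B1)
      default:  break;                             // ignore unrecognized bytes
    }
  }
}
```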

The control program for the robotic base is developed in the Arduino integrated development environment; the development flow is depicted in Figure 9. After installing the Arduino IDE, we set the preferences, added the board manager URL, installed the required Arduino package, and selected the Arduino Uno board to begin programming. For motion control we use the VarSpeedServo library on the Arduino side. Let $myservoTop$ denote the servo motor for vertical rotation, $myservoMid$ the motor for forward and backward tilting, and $myservoBot$ the motor for overall horizontal rotation. The $write(value, speed)$ method commands each servo, taking the target angle and the movement speed as arguments. Through continuous fine-tuning, we programmed nine basic actions, summarized in Table 2; a sketch of how they compose into a situational action follows the table. Each situational action is a combination of these basic actions, with the sequence and composition varying with the context.

Table 2: Basic Actions and Program Codes for the Companion Robot
Action Name Program Code Code ID
Left Head Sway $myservoTop.write(115, 65)$ A1
Right Head Sway $myservoTop.write(75, 65)$ A2
Head Initial Position $myservoTop.write(95, 65)$ A3
Forward Tilt $myservoMid.write(90, 35)$ B1
Backward Tilt $myservoMid.write(120, 35)$ B2
Middle Initial Position $myservoMid.write(135, 35)$ B3
Left Rotation $myservoBot.write(135, 35)$ C1
Right Rotation $myservoBot.write(45, 65)$ C2
Base Initial Position $myservoBot.write(90, 35)$ C3
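
As noted above, each situational action strings together several basic actions. The sketch below composes a hypothetical “cheer” routine from the codes in Table 2 using the VarSpeedServo library; the pin numbers and delay values are placeholders rather than the tuned values in the actual firmware.

```cpp
// Hypothetical "cheer" routine built from the basic actions in Table 2.
#include <VarSpeedServo.h>

VarSpeedServo myservoTop, myservoMid, myservoBot;

void cheerAction() {
  myservoMid.write(90, 35);   delay(600);  // B1: forward tilt
  myservoTop.write(115, 65);  delay(400);  // A1: left head sway
  myservoTop.write(75, 65);   delay(400);  // A2: right head sway
  myservoTop.write(95, 65);   delay(400);  // A3: head initial position
  myservoMid.write(135, 35);  delay(600);  // B3: middle initial position
}

void setup() {
  myservoTop.attach(9);       // placeholder pin assignments
  myservoMid.attach(10);
  myservoBot.attach(11);
}

void loop() {
  cheerAction();              // in the real firmware this is triggered by a Bluetooth command
  delay(5000);
}
```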

The client app is developed in the Android Studio integrated development environment, focusing on implementing voice control and building an information transmission platform. The app’s workflow and components are illustrated in Figure 10. The app receives the user’s voice commands; if a command is recognized as one the companion robot supports, the app plays the corresponding animation and sends an action command via Bluetooth to the robotic base, which then performs the matching emotional action. The wake-up function uses the SpeechRecognizer from the pocketsphinx-android-5prealpha-release library module for wake-word detection. Voice recognition uses the voiceRequest function from the ai.api:sdk:2.0.7@aar library module to convert speech to text, and speech synthesis uses the textToSpeech function from the ai.api.sample.TTS library module to convert text to audio output. For animation playback, the app stores the animations in a specified folder; upon receiving a game-state command and judging it, the app uses the playVideo() function from the android.media.MediaPlayer library module to play the corresponding video from that folder.

Experimental testing was conducted to validate the functionality of our companion robot. The user opens the app and logs in; the interface starts in standby mode and can be activated by voice or by touch. The main interface includes voice control symbols and animated expressions, as shown in Figure 11. It comprises a standby screen in which the companion robot waits to be awakened; upon successful activation it changes to a command reception interface that receives the user’s voice commands; after reception it enters a voice processing interface; and upon completion it switches to a reply interface that provides feedback to the user. A manual touch button in the upper left corner allows activation and command input via the touch screen. The natural language dialogue mode covers three types of interaction: chat, information query, and task issuance. Some of the trained dialogue sentences are listed in Table 3. Based on the history recorded by the natural language processing platform, the average response time is 1.71 seconds, with a successful matching rate of 66.67%, meeting the design requirements.

Table 3: List of Trained Dialogue Sentences for the Companion Robot
Dialogue Mode Situational Intent Trained Sentences
Chat Statements Greeting Hey, hello
Chat Statements Praise You’re so cute
Chat Statements Self-introduction Introduce yourself
Chat Statements Express Emotion I’m very happy
Information Query Query Date What’s the date today?
Information Query Query Weather How’s the weather in Taipei today?
Task Issuance Control Curtains Open/close the curtains

The game companionship mode covers scenarios in which the companion robot accompanies elderly users during rehabilitation games. When the game starts, the companion robot plays an animation and performs actions to signal that the game has begun. When the user completes a successful color match, the robot plays an animation and actions to cheer for the user. If the user fails to match for an extended period, the robot plays an animation and actions to encourage the user. At the end of the game, the robot plays an animation and actions to announce the game score. Experimental results show that, owing to network latency, the robot’s responses may be slightly delayed, but the action accuracy rate reaches 100%, which meets the design requirements. Examples of the robot’s animations and actions in these scenarios are illustrated in Figures 13 and 14. For instance, at game start the companion robot displays a specific animation coupled with a forward tilt motion, while upon a successful match it shows a celebratory animation with a head sway action.

To further analyze the performance of our companion robot, we can incorporate theoretical models for motion control and system dynamics. For example, the servo motor control can be described using a proportional-integral-derivative (PID) controller model. The control law for each servo motor can be expressed as:

$$u(t) = K_p e(t) + K_i \int_0^t e(\tau) d\tau + K_d \frac{de(t)}{dt}$$

where $u(t)$ is the control signal, $e(t)$ is the error between the desired angle and the actual angle, and $K_p$, $K_i$, $K_d$ are the proportional, integral, and derivative gains, respectively. For our companion robot, we tuned these parameters to achieve smooth and accurate movements, ensuring that the emotional expressions are natural and engaging.
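
A minimal discrete-time sketch of this control law is given below. The gains and sample period are hypothetical, and hobby servos contain their own internal position controllers, so this illustrates the model rather than the firmware actually running in the robot.

```cpp
// Discrete-time form of the PID law above (illustrative only).
struct Pid {
  float kp, ki, kd;        // proportional, integral, and derivative gains
  float integral;          // accumulated integral of the error
  float prevError;         // error from the previous sample

  Pid(float p, float i, float d) : kp(p), ki(i), kd(d), integral(0.0f), prevError(0.0f) {}

  // error = desired angle minus measured angle; dt = sample period in seconds
  float update(float error, float dt) {
    integral += error * dt;
    float derivative = (error - prevError) / dt;
    prevError = error;
    return kp * error + ki * integral + kd * derivative;
  }
};

// Example use with hypothetical gains: Pid pid(2.0f, 0.5f, 0.1f);
// float u = pid.update(5.0f, 0.02f);  // 5 degrees of error, 20 ms sample period
```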

Additionally, the dynamics of the robotic base can be modeled using Lagrangian mechanics. The kinetic energy $T$ and potential energy $V$ of the system are given by:

$$T = \frac{1}{2} I_1 \dot{\theta}_1^2 + \frac{1}{2} I_2 \dot{\theta}_2^2 + \frac{1}{2} I_3 \dot{\theta}_3^2$$
$$V = m_1 g h_1 + m_2 g h_2 + m_3 g h_3$$

where $I_1$, $I_2$, $I_3$ are the moments of inertia for the three rotational degrees of freedom, $\theta_1$, $\theta_2$, $\theta_3$ are the corresponding angles, $m_1$, $m_2$, $m_3$ are the masses of the components, and $h_1$, $h_2$, $h_3$ are their heights. The equations of motion can be derived using the Euler-Lagrange equation:

$$\frac{d}{dt} \left( \frac{\partial L}{\partial \dot{\theta}_i} \right) - \frac{\partial L}{\partial \theta_i} = \tau_i, \quad i=1,2,3$$

where $L = T - V$ is the Lagrangian, and $\tau_i$ are the torques applied by the servo motors. This model helps in optimizing the motion trajectories to minimize energy consumption and enhance responsiveness.
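
As a purely illustrative worked step (the single-axis assumption and the effective link length $\ell$ are not part of the original model), suppose each height depends only on its own joint angle and model the forward-tilt joint as a link of length $\ell$, so that $h_2 = \ell \sin\theta_2$. Substituting $T$ and $V$ into the Euler-Lagrange equation for $i = 2$ gives:

$$I_2 \ddot{\theta}_2 + m_2 g \ell \cos\theta_2 = \tau_2$$

which makes explicit how the gravity load on the forward-tilt servo varies with the tilt angle.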

Furthermore, for the natural language processing component, we can employ a probabilistic framework for intent recognition. Given a user utterance $x$, the probability of an intent $y$ is computed using Bayes’ theorem:

$$P(y|x) = \frac{P(x|y) P(y)}{P(x)}$$

where $P(x|y)$ is the likelihood of the utterance given the intent, $P(y)$ is the prior probability of the intent, and $P(x)$ is the evidence, obtained by summing $P(x|y)P(y)$ over all candidate intents. We trained our model on a dataset of dialogue sentences to improve accuracy, as reflected in the matching rate observed in tests. The integration of cloud services allows for continuous learning, where the companion robot updates $P(y)$ and $P(x|y)$ based on user interactions, personalizing the experience over time.
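
As a purely numerical illustration with made-up values (not statistics from our training data), suppose an utterance $x$ is scored against two intents with priors $P(\text{QueryWeather}) = 0.6$ and $P(\text{ControlCurtains}) = 0.4$, and likelihoods $P(x|\text{QueryWeather}) = 0.05$ and $P(x|\text{ControlCurtains}) = 0.01$. Then

$$P(\text{QueryWeather} \mid x) = \frac{0.05 \times 0.6}{0.05 \times 0.6 + 0.01 \times 0.4} = \frac{0.030}{0.034} \approx 0.88$$

so the weather-query intent would be selected.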

In terms of hardware reliability, we conducted stress tests on the robotic base to ensure durability. The servo motors were subjected to continuous operation cycles, and their performance degradation was monitored. The torque $\tau$ required for each motor can be estimated as:

$$\tau = I \alpha + b \dot{\theta} + \tau_{ext}$$

where $I$ is the moment of inertia of the load, $\alpha$ is the angular acceleration, $b$ is the viscous friction coefficient, and $\tau_{ext}$ represents external disturbances. Our design selected servo motors with torque margins above 20% to account for wear and tear, ensuring long-term functionality.

The Bluetooth communication reliability was also assessed by measuring packet loss rates under varying distances and interference conditions. The signal strength $P_r$ at a distance $d$ can be modeled using the Friis transmission equation:

$$P_r = P_t G_t G_r \left( \frac{\lambda}{4 \pi d} \right)^2$$

where $P_t$ is the transmission power, $G_t$ and $G_r$ are the antenna gains of the transmitter and receiver, and $\lambda$ is the wavelength. We optimized the antenna placement and used error-correction codes to maintain a stable connection within a 10-meter range, suitable for typical home environments.
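
The sketch below evaluates this free-space estimate for a 2.4 GHz link. The transmit power, unity antenna gains, and the 10-meter range are illustrative assumptions, not measurements from our interference tests.

```cpp
// Friis free-space estimate for a 2.4 GHz Bluetooth link (illustrative values;
// antenna gains are linear ratios, not dBi).
#include <cmath>
#include <cstdio>

double friisReceivedPower(double pt, double gt, double gr, double freqHz, double d) {
  const double pi = 3.14159265358979;
  const double c = 3.0e8;               // speed of light, m/s
  double lambda = c / freqHz;           // wavelength in meters
  double ratio = lambda / (4.0 * pi * d);
  return pt * gt * gr * ratio * ratio;  // received power in watts
}

int main() {
  double pt = 1e-3;                     // assumed 1 mW (0 dBm) transmit power
  double pr = friisReceivedPower(pt, 1.0, 1.0, 2.4e9, 10.0);
  // Roughly -60 dBm at 10 m in free space, well above typical BLE receiver sensitivity.
  std::printf("Pr at 10 m: %.2e W (%.1f dBm)\n", pr, 10.0 * std::log10(pr / 1e-3));
  return 0;
}
```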

The companion robot’s energy efficiency is another critical aspect. The total power consumption $P_{total}$ can be expressed as:

$$P_{total} = P_{microprocessor} + P_{servos} + P_{Bluetooth} + P_{tablet}$$

where $P_{microprocessor}$ is the power drawn by the ATmega328, estimated at 0.5 W during active operation; $P_{servos}$ depends on the duty cycle of movements, averaging 2 W; $P_{Bluetooth}$ is about 0.1 W; and $P_{tablet}$ varies, but we assume 5 W while the app is running. This totals roughly 7.6 W, so with a 5 V supply the current draw is approximately 1.52 A, allowing for battery operation if needed. We implemented sleep modes in the microprocessor to reduce $P_{microprocessor}$ during idle periods, extending battery life for portable use.

For the game companionship mode, we developed algorithms to synchronize the robot’s actions with game events. Let $S(t)$ represent the game state at time $t$, and $A(t)$ be the action set for the companion robot. The mapping function $f: S(t) \rightarrow A(t)$ is defined based on rules such as:

$$
A(t) =
\begin{cases}
\text{Animation}_1 + \text{Action}_1 & \text{if } S(t) = \text{GameStart} \\
\text{Animation}_2 + \text{Action}_2 & \text{if } S(t) = \text{Success} \\
\text{Animation}_3 + \text{Action}_3 & \text{if } S(t) = \text{Failure} \\
\text{Animation}_4 + \text{Action}_4 & \text{if } S(t) = \text{GameEnd}
\end{cases}
$$

This ensures timely and context-appropriate responses, enhancing user engagement. The animations and actions are stored as pre-rendered sequences, but future versions could generate them in real time using a graphics engine. The cloud server handles state updates from the game module via the MQTT protocol, with measured latency under 200 ms, which is acceptable for these non-critical interactions.
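
A minimal sketch of this mapping is given below. The game-state names follow the cases above, while the animation file names and single-character action codes are placeholders rather than the identifiers used in the actual app.

```cpp
// Hypothetical mapping f: S(t) -> A(t). In the real system the app plays the
// clip on the tablet and sends the action code to the base over Bluetooth.
#include <cstdio>
#include <string>
#include <utility>

enum class GameState { GameStart, Success, Failure, GameEnd };

// Returns the animation clip to play and the action code to transmit.
std::pair<std::string, char> mapStateToResponse(GameState s) {
  switch (s) {
    case GameState::GameStart: return {"game_start.mp4", 'S'};  // greeting and forward tilt
    case GameState::Success:   return {"cheer.mp4", 'C'};       // cheering with head sway
    case GameState::Failure:   return {"encourage.mp4", 'E'};   // gentle encouragement
    case GameState::GameEnd:   return {"score.mp4", 'F'};       // announce the final score
  }
  return {"idle.mp4", 'I'};  // fallback for unknown states
}

int main() {
  auto response = mapStateToResponse(GameState::Success);
  std::printf("play %s, send action '%c'\n", response.first.c_str(), response.second);
  return 0;
}
```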

In conclusion, our companion robot design successfully integrates animation robot concepts with advanced technologies to address elderly loneliness and isolation. By leveraging cloud services, IoT, and natural language processing, this companion robot offers personalized and interactive companionship modes. The hardware is cost-effective and reliable, with a simple structure that facilitates mass production. Experimental validations confirm that the robot meets expected functionalities in both dialogue and game scenarios. Future work will focus on enhancing emotional intelligence through machine learning, expanding IoT integrations, and conducting long-term user studies to assess social impact. This companion robot represents a significant step towards affordable and accessible robotic companionship for the aging population, with promising market prospects. The continuous evolution of such systems will undoubtedly contribute to improving the quality of life for the elderly, making the companion robot an indispensable part of future smart homes and care environments.
