In recent decades, rising global populations and increasing life expectancy have confronted societies worldwide with the challenges of aging. This demographic shift has heightened the demand for advanced healthcare solutions, particularly remote monitoring and assistive technologies. However, many regions, including some developing countries, rely heavily on imported medical technologies, leading to high costs and increased burdens on patients. To address these issues, we have developed a medical robot that integrates image recognition for guided following, remote monitoring, and physiological parameter collection. It aims to provide affordable and efficient healthcare support, especially for elderly individuals living alone, by leveraging computer vision, embedded systems, and cloud computing. In this article, we describe our design process, implementation details, and experimental results, emphasizing the role of image recognition in enhancing the functionality of medical robots.

The core innovation of our medical robot lies in its ability to perform autonomous following based on image recognition, reducing the physical and psychological burden on users. Traditional following systems often require targets to wear special devices or markers, which can be inconvenient or uncomfortable, particularly for elderly individuals. Our approach utilizes computer vision to identify and track individuals wearing clothing with specific patterns or stickers, making the interaction more natural and user-friendly. This medical robot is built on a tracked mobile platform, ensuring stability and adaptability in various indoor environments. The system integrates multiple modules for data acquisition, processing, and communication, all coordinated to provide real-time health monitoring. By combining hardware and software components, we have created a robust medical robot that can operate independently while maintaining high accuracy in following and parameter collection.
Our medical robot’s design is centered around three main subsystems: the onboard system based on STM32 and Raspberry Pi processors, the cloud server for data storage, and the Android client for user interaction. The overall architecture ensures seamless data flow from sensor acquisition to user visualization. Below is a table summarizing the key components and their functions in the medical robot system:
| Component | Function | Specifications |
|---|---|---|
| STM32F103 Controller | Drives sensors, manages data collection, and controls motors | ARM Cortex-M3 core, 72 MHz clock speed |
| Raspberry Pi | Handles image processing for following algorithm | ARM Cortex-A53, 1.2 GHz, with USB camera |
| Temperature Sensor | Measures ambient and body temperature | Accuracy: ±0.5°C, Range: 0-50°C |
| Heart Rate Sensor | Collects pulse data via contact measurement | Optical sensor, 30-second measurement cycle |
| Wi-Fi Module | Enables communication with cloud server | IEEE 802.11 b/g/n, TCP/IP protocol |
| Android Client | Displays data graphically and alerts users | Java-based, Android 5.0+ compatibility |
| Cloud Server | Stores and manages physiological parameters | Ubuntu Linux, MySQL database |
The software design of the medical robot focuses on creating an intuitive user interface and reliable data exchange. The Android client application is developed using Java, employing object-oriented programming principles to encapsulate functionalities into classes. The main interface provides access to various sub-interfaces that display physiological parameters such as heart rate, temperature, and humidity in graphical forms like line charts or bar graphs. When any parameter exceeds a predefined threshold, an alarm interface pops up with audible alerts. This design ensures that caregivers can monitor health status remotely and respond promptly to emergencies. The interaction between the client and cloud server is based on the Transmission Control Protocol (TCP), which guarantees reliable data transmission through a three-way handshake process. The TCP connection establishment can be summarized with the following steps: first, the client sends a SYN packet; second, the server responds with SYN-ACK; and third, the client confirms with an ACK packet. This mechanism prevents erroneous connections and ensures data integrity, which is critical for a medical robot handling sensitive health information.
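The reliable request/response exchange between client and server described above can be illustrated with a short Python sketch. It runs a toy stand-in for the cloud server on localhost and fetches one parameter over TCP; the three-way handshake (SYN, SYN-ACK, ACK) happens inside `connect()`/`accept()`. The message format (`GET heart_rate` / `heart_rate=72`) is invented for this example and is not the project's actual protocol, which is implemented in Java on the Android side.

```python
import socket
import threading

def start_server():
    """Minimal TCP server standing in for the cloud server; returns its port."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))        # port 0: let the OS pick a free port
    srv.listen(1)

    def serve():
        conn, _ = srv.accept()        # three-way handshake completes here
        conn.recv(1024)               # request, e.g. b"GET heart_rate"
        conn.sendall(b"heart_rate=72")  # reply with a stored parameter
        conn.close()
        srv.close()

    threading.Thread(target=serve, daemon=True).start()
    return srv.getsockname()[1]

def fetch_parameter(port, request):
    """Client side: connect (SYN / SYN-ACK / ACK), send a request, read the reply."""
    with socket.create_connection(("127.0.0.1", port), timeout=5) as c:
        c.sendall(request)
        return c.recv(1024)

port = start_server()
reply = fetch_parameter(port, b"GET heart_rate")
```

Because TCP acknowledges every segment, a lost reply is retransmitted by the stack rather than silently dropped, which is the property that matters for health data.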
In terms of hardware, the medical robot incorporates a dual-processor setup to balance computational loads. The STM32F103 controller manages real-time tasks such as sensor driving and motor control, while the Raspberry Pi handles the more complex image processing for the following algorithm. This separation allows for efficient operation without overburdening a single processor. The hardware workflow begins with the STM32 timer initiating periodic health check-ups. At each preset interval, the medical robot alerts the user via a buzzer to start the measurement process. Sensors collect data for 30 seconds, after which the buzzer signals completion, and the data is uploaded to the cloud via Wi-Fi. Simultaneously, the Raspberry Pi continuously captures images from the USB camera, processes them to identify the target, and adjusts the robot’s posture to maintain following. The following algorithm is based on color pattern recognition in the HSL color space, which offers better perceptual uniformity compared to RGB. The conversion from RGB to HSL involves mathematical formulas that define hue (H), saturation (S), and lightness (L). Given RGB values (r, g, b) normalized to [0,1], we compute:
Let $$ \text{max} = \max(r, g, b) $$ and $$ \text{min} = \min(r, g, b) $$.
The lightness L is calculated as: $$ L = \frac{1}{2} (\text{max} + \text{min}) $$
The saturation S depends on L: $$ S = \begin{cases} 0 & \text{if } L = 0 \text{ or } \text{max} = \text{min} \\ \frac{\text{max} - \text{min}}{2L} & \text{if } 0 < L \leq \frac{1}{2} \\ \frac{\text{max} - \text{min}}{2 - 2L} & \text{if } L > \frac{1}{2} \end{cases} $$
The hue H is determined by: $$ H = \begin{cases} 0^\circ & \text{if } \text{max} = \text{min} \\ 60^\circ \times \frac{g - b}{\text{max} - \text{min}} + 0^\circ & \text{if } \text{max} = r \text{ and } g \geq b \\ 60^\circ \times \frac{g - b}{\text{max} - \text{min}} + 360^\circ & \text{if } \text{max} = r \text{ and } g < b \\ 60^\circ \times \frac{b - r}{\text{max} - \text{min}} + 120^\circ & \text{if } \text{max} = g \\ 60^\circ \times \frac{r - g}{\text{max} - \text{min}} + 240^\circ & \text{if } \text{max} = b \end{cases} $$
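The conversion formulas above translate directly into code; a minimal Python transcription (function name ours) is:

```python
def rgb_to_hsl(r, g, b):
    """Convert RGB (each in [0, 1]) to HSL per the formulas above.

    Returns (H in degrees, S in [0, 1], L in [0, 1]).
    """
    mx, mn = max(r, g, b), min(r, g, b)
    L = (mx + mn) / 2
    if mx == mn:                       # achromatic pixel: no dominant hue
        return 0.0, 0.0, L
    d = mx - mn
    S = d / (2 * L) if L <= 0.5 else d / (2 - 2 * L)
    if mx == r:
        H = 60 * (g - b) / d + (0 if g >= b else 360)
    elif mx == g:
        H = 60 * (b - r) / d + 120
    else:                              # mx == b
        H = 60 * (r - g) / d + 240
    return H, S, L
```

For example, pure red (1, 0, 0) maps to (0°, 1.0, 0.5) and pure blue (0, 0, 1) to (240°, 1.0, 0.5), matching the piecewise hue definition.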
These formulas enable precise color identification, which is essential for the medical robot to track targets accurately. The image recognition algorithm implemented on the Raspberry Pi uses Python for its simplicity and extensive libraries. The program flow includes initialization of threads and camera, followed by a main loop that processes image data. Each frame is first converted to grayscale to locate candidate regions, which are then analyzed in the original image for HSL values matching the target pattern. If a match is found, the algorithm computes the target’s position relative to the camera center and sends control signals to adjust the robot’s movement. This process runs at up to 18 frames per second, ensuring real-time responsiveness. The pseudocode for the algorithm is as follows:
```
BEGIN
    Initialize camera and control variables
    Start worker thread
    Start thread pool
    LOOP:
        Capture picture data from camera
        Analyze picture data for a candidate target
        IF target found THEN
            Read color data of target region
            IF color matches target pattern THEN
                Compute center coordinates of target
                IF target not centered in camera view THEN
                    Send adjustment signal to motion controller
                END IF
            END IF
        END IF
        GO TO LOOP
END
```
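The core of this loop, locating the target patch by HSL thresholds and measuring its offset from the camera center, can be sketched in NumPy. The threshold defaults below (a saturated blue patch) and the function name are illustrative, not the project's actual values:

```python
import numpy as np

def target_offset(hsl_img, h_range=(200, 250), s_min=0.5, l_range=(0.3, 0.7)):
    """Return the target patch's horizontal offset (in pixels) from the
    image center, or None if no pixel matches the color thresholds.

    hsl_img: (rows, cols, 3) array holding H in degrees, S and L in [0, 1].
    """
    H, S, L = hsl_img[..., 0], hsl_img[..., 1], hsl_img[..., 2]
    mask = ((H >= h_range[0]) & (H <= h_range[1]) &
            (S >= s_min) &
            (L >= l_range[0]) & (L <= l_range[1]))
    if not mask.any():
        return None                             # target not found
    cols = np.nonzero(mask)[1]                  # column index of each match
    return cols.mean() - hsl_img.shape[1] / 2   # > 0: target right of center
```

The sign of the offset selects the turn direction; in practice a small deadband around zero keeps the robot from oscillating when the target is already centered.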
To validate the medical robot’s performance, we conducted experiments on following accuracy and client functionality. The following test involved placing the medical robot in an indoor environment with a target wearing a patterned shirt. The robot successfully identified and followed the target, maintaining a distance of about 1 meter. The image processing results showed that the target region (blue bounding box) was kept within the camera’s center region (yellow box) through continuous adjustment. We measured the following error as the deviation from the ideal path, with results summarized in the table below:
| Trial | Following Error (cm) | Processing Speed (fps) | Success Rate (%) |
|---|---|---|---|
| 1 | 5.2 | 17.5 | 95 |
| 2 | 4.8 | 18.0 | 96 |
| 3 | 6.1 | 16.8 | 94 |
| 4 | 5.5 | 17.2 | 95 |
| 5 | 5.0 | 18.2 | 97 |
The client application was tested for data visualization and alert functionality. Using the Android app, we simulated parameter extraction from the cloud server, which responded within 3 seconds. The data was displayed correctly in line charts without any garbled characters, and alarms triggered promptly when thresholds were exceeded. This demonstrates the medical robot’s capability for remote monitoring, a key feature in modern healthcare systems. Additionally, we evaluated the cloud server’s performance in handling multiple concurrent requests, as a medical robot might be deployed in multi-user environments. The server, based on Linux and MySQL, managed up to 50 simultaneous connections without significant latency, ensuring scalability.
In conclusion, our medical robot represents a significant advancement in assistive healthcare technology, combining image recognition with robust hardware and software design. The use of computer vision for following reduces user burden, while integrated sensors enable comprehensive health monitoring. This medical robot has potential applications in elderly care, rehabilitation, and remote patient assistance, contributing to reduced healthcare costs and improved quality of life. Future work will focus on enhancing the image recognition algorithm with machine learning techniques for better adaptability to varying lighting conditions and clothing patterns. We also plan to incorporate additional sensors, such as blood pressure monitors, to expand the medical robot’s diagnostic capabilities. Overall, this medical robot project underscores the importance of interdisciplinary innovation in addressing global health challenges.
The development of this medical robot involved careful consideration of energy efficiency and cost-effectiveness. By utilizing low-cost, widely supported platforms like the STM32 and Raspberry Pi, we minimized expenses while maintaining high performance. The medical robot’s power consumption was optimized through efficient coding and hardware selection, allowing for extended operation on battery power. In terms of safety, the medical robot includes emergency stop mechanisms and failsafes to prevent accidents during following. These features ensure that the medical robot can be deployed safely in home environments, where reliability is paramount. As aging populations continue to grow, such autonomous medical robots will become increasingly vital in providing continuous care and support.
From a technical perspective, the image recognition component of the medical robot can be further analyzed through mathematical models of color segmentation. The HSL color space is particularly useful for thresholding because it separates luminance from chrominance, reducing the impact of illumination changes. We can define a color range for the target pattern in HSL coordinates, such as H ∈ [200°, 250°] for blue hues, S ∈ [0.5, 1.0] for high saturation, and L ∈ [0.3, 0.7] for moderate lightness. The probability of a pixel belonging to the target can be expressed using a multivariate Gaussian distribution: $$ P(\mathbf{x}) = \frac{1}{(2\pi)^{3/2}|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(\mathbf{x} - \boldsymbol{\mu})^T \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu})\right) $$ where $\mathbf{x} = [H, S, L]^T$ is the pixel’s HSL vector, $\boldsymbol{\mu}$ is the mean vector of the target color, and $\Sigma$ is the covariance matrix. This probabilistic approach could enhance the robustness of the medical robot’s following algorithm in future iterations.
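A numeric sketch of this Gaussian color model in NumPy follows; the mean vector and diagonal covariance are invented illustrative numbers for a blue patch, not fitted values:

```python
import numpy as np

def color_likelihood(x, mu, sigma):
    """Density of the 3-D Gaussian color model at HSL vector x."""
    x, mu = np.asarray(x, float), np.asarray(mu, float)
    d = x - mu
    norm = (2 * np.pi) ** 1.5 * np.sqrt(np.linalg.det(sigma))
    return np.exp(-0.5 * d @ np.linalg.solve(sigma, d)) / norm

# Illustrative model for a blue target patch (numbers are assumptions)
mu = np.array([225.0, 0.75, 0.5])             # mean H, S, L
sigma = np.diag([15.0**2, 0.1**2, 0.1**2])    # independent per-channel variance
p_center = color_likelihood([225.0, 0.75, 0.5], mu, sigma)  # pixel at the mean
p_off = color_likelihood([150.0, 0.2, 0.9], mu, sigma)      # clearly off-target
```

Thresholding this density (rather than hard per-channel ranges) would let borderline pixels be weighted by how close they are to the learned target color.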
Moreover, the control system for the medical robot’s movement can be modeled using differential equations. The tracked platform uses differential drive, with left and right treads controlled independently, allowing it to turn in place. Let $v_l$ and $v_r$ represent the linear velocities of the left and right treads, respectively. The robot’s angular velocity $\omega$ and linear velocity $v$ are given by: $$ v = \frac{v_r + v_l}{2}, \quad \omega = \frac{v_r - v_l}{d} $$ where $d$ is the distance between the treads. To follow a target, the medical robot adjusts $v_l$ and $v_r$ based on the error between the target’s position and the camera center. A proportional-integral-derivative (PID) controller can be employed: $$ u(t) = K_p e(t) + K_i \int_0^t e(\tau) d\tau + K_d \frac{de(t)}{dt} $$ where $e(t)$ is the position error, and $K_p$, $K_i$, $K_d$ are tuning parameters. This control strategy ensures smooth and accurate following, which is essential for a medical robot operating in dynamic environments.
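The PID law and the tread-velocity relations above can be sketched in a few lines of Python. The class and function names are ours, and the gains are placeholders to be tuned on the actual platform; the tread-speed function simply inverts $v = (v_r + v_l)/2$ and $\omega = (v_r - v_l)/d$:

```python
class PID:
    """Discrete PID controller: u = Kp*e + Ki*sum(e*dt) + Kd*de/dt."""
    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def step(self, error, dt):
        self.integral += error * dt
        deriv = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * deriv

def tread_speeds(v, omega, d):
    """Invert v = (vr+vl)/2 and omega = (vr-vl)/d to per-tread velocities."""
    return v - omega * d / 2, v + omega * d / 2   # (v_l, v_r)
```

In a following loop, the pixel offset of the target from the camera center would be fed in as `error`, the PID output taken as a commanded $\omega$, and `tread_speeds` would map $(v, \omega)$ to the two motor setpoints.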
In summary, this medical robot project highlights the integration of multiple technologies to create a practical healthcare solution. The medical robot’s ability to perform autonomous following via image recognition sets it apart from conventional systems, offering a more natural user experience. Through extensive testing, we have validated the hardware and software components, demonstrating the medical robot’s feasibility for real-world deployment. As technology advances, we anticipate that medical robots will become more intelligent and versatile, playing a crucial role in modern medicine. Our work contributes to this evolving field by providing a scalable and cost-effective design that can be adapted to various healthcare scenarios. The medical robot, with its emphasis on image recognition and remote monitoring, represents a step forward in making advanced medical care accessible to all.
