Investigating Medical Robot Innovation Through Co-Occurrence Networks and Topic Modeling

In recent years, the rapid advancement of the medical robot industry has become a focal point for technological innovation, driven by the strategic push for high-level technological self-reliance and the development of new quality productive forces. This study aims to analyze the trajectory of technological innovation within the medical robot sector from 2010 to 2024, utilizing patent data as the primary source of intelligence. Patents, being the direct outputs of inventive activity, encapsulate a rich fusion of technological, legal, and economic information, offering a profound lens through which to examine the evolution of industry-specific knowledge. Unlike previous studies that often rely on a single analytical perspective, this research adopts a dual-lens approach: examining macro-level technological linkages through co-occurrence network analysis and uncovering fine-grained semantic themes via an enhanced topic modeling technique. By integrating these two distinct yet complementary viewpoints, the study seeks to reveal the multifaceted nature of innovation trends, identify convergent patterns of knowledge flow, and provide a more robust foundation for forecasting key technological directions within the medical robot domain. The ultimate goal is to contribute insights that can enhance the efficiency and accuracy of strategic planning for sustainable growth and competitive advantage in this critical high-tech industry.

The methodology of this investigation is grounded in a dataset of 31,338 patent documents (after de-duplication of patent families) related to medical robots, retrieved from the incoPat database. The timeframe spans from January 1, 2010, to September 30, 2024. To capture temporal evolution, the dataset is partitioned into three sequential periods of broadly comparable patent volume (on the order of 10,000 documents each): 2010–2018, 2019–2021, and 2022–2024. This partitioning allows for a clear chronological analysis of shifting innovation patterns.
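
The period split described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's actual code: the function name `assign_period` is hypothetical, and the source does not say whether the application date or the publication date was used, so the choice of date field is left to the caller.

```python
from datetime import date

# Period boundaries taken from the text; the labels are illustrative.
PERIODS = [
    ("2010-2018", date(2010, 1, 1), date(2018, 12, 31)),
    ("2019-2021", date(2019, 1, 1), date(2021, 12, 31)),
    ("2022-2024", date(2022, 1, 1), date(2024, 9, 30)),
]


def assign_period(d):
    """Return the label of the period containing d, or None if d is outside the study window."""
    for label, start, end in PERIODS:
        if start <= d <= end:
            return label
    return None
```

Each patent record would then be routed to one of three sub-corpora by calling `assign_period` on its date, and both analyses below would be run once per sub-corpus.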

Analytical Framework: A Dual-Perspective Approach

Two principal analytical methods are employed, each offering a unique vantage point on the same underlying patent data.

1. Co-Occurrence Network Analysis (Macro Perspective)

This perspective utilizes the International Patent Classification (IPC) codes assigned to each patent. IPC codes represent standardized, hierarchical categories for technology areas. A co-occurrence network is constructed for each time period, where nodes represent individual IPC codes, and edges connect codes that frequently appear together on the same patent documents. The strength of a connection is proportional to the co-occurrence frequency. These networks are visualized and analyzed using Gephi software, which applies a modularity algorithm to detect communities (clusters) of tightly interconnected IPC codes. Each community signifies a coherent, macro-level technological theme prevalent within the medical robot innovation landscape during that period. The evolution of these communities across the three time windows reveals the shifting foci of broad technological domains.
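
As a rough sketch of how such a network's edge list could be built, the snippet below counts, for every unordered pair of IPC codes, the number of patents on which both appear. The function name `ipc_cooccurrence` and the three toy patents are hypothetical; only the IPC codes themselves come from this article.

```python
from collections import Counter
from itertools import combinations


def ipc_cooccurrence(patents):
    """Count how often each unordered pair of IPC codes appears on the same patent."""
    edge_weights = Counter()
    for codes in patents:
        # De-duplicate and sort so each pair is counted once per patent
        # and (a, b) / (b, a) map to the same key.
        for a, b in combinations(sorted(set(codes)), 2):
            edge_weights[(a, b)] += 1
    return edge_weights


# Toy input: each inner list holds the IPC codes of one (hypothetical) patent.
patents = [
    ["A61B34/30", "B25J9/16"],
    ["A61B34/30", "B25J9/16", "A61B5/00"],
    ["A61B5/00", "A61H1/02"],
]
edges = ipc_cooccurrence(patents)
```

The resulting `(source, target) -> weight` mapping can be written out as a weighted edge list, which is one of the input forms Gephi can import for layout and community detection.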

2. Enhanced BERTopic Modeling (Semantic Perspective)

While IPC networks provide a structural overview, they can lack nuanced semantic detail. To examine the specific linguistic content of innovations, this study implements an improved BERTopic topic modeling pipeline on the abstracts of the patent documents. BERTopic is an unsupervised topic modeling framework built on pre-trained transformer embeddings, which lets it capture contextual meaning. The enhancement uses the K-means algorithm for the clustering step (BERTopic's default clusterer is HDBSCAN), so that every abstract is assigned to one of a fixed number of topics. The process involves four key steps:

  1. Text Vectorization: Patent abstracts are converted into dense semantic vectors (384 dimensions) using `all-MiniLM-L6-v2`, a pre-trained sentence-embedding model from the BERT family. The model represents each abstract by its contextual meaning rather than by word counts.
  2. Dimensionality Reduction: The embedding vectors are reduced to a manageable size using the UMAP (Uniform Manifold Approximation and Projection) algorithm, which is designed to preserve local neighborhood structure while retaining much of the global structure of the data.
  3. Topic Clustering: The reduced vectors are clustered using the K-means algorithm to group patents discussing similar technological concepts. The K-means algorithm aims to partition the $n$ document vectors into $k$ clusters $(C_1, C_2, \ldots, C_k)$ by minimizing the within-cluster sum of squares (variance). The objective is to find:
    $$ \arg \min_C \sum_{i=1}^{k} \sum_{\mathbf{x} \in C_i} \|\mathbf{x} - \boldsymbol{\mu}_i\|^2 $$
    where $\boldsymbol{\mu}_i$ is the mean (centroid) of vectors in cluster $C_i$.
  4. Topic Representation: For each resulting cluster (topic), key terms are extracted and weighted using a class-based TF-IDF (c-TF-IDF) algorithm. This provides a human-interpretable label for each discovered technological theme, reflecting the core semantic content of patents within that cluster.
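
Steps 3 and 4 can be illustrated in plain Python. The sketch below is a teaching aid under stated assumptions, not the study's implementation: `kmeans` runs basic Lloyd iterations from caller-supplied starting centroids (library implementations use smarter initialization), and `c_tf_idf` follows the commonly documented c-TF-IDF form, normalized term frequency times $\log(1 + A / f_t)$, where $A$ is the average number of words per topic and $f_t$ is the total frequency of term $t$. All function names and toy values are hypothetical.

```python
import math
from collections import Counter


def wcss(points, labels, centroids):
    """Within-cluster sum of squares: the quantity K-means minimizes."""
    return sum(
        sum((x - m) ** 2 for x, m in zip(p, centroids[c]))
        for p, c in zip(points, labels)
    )


def kmeans(points, centroids, n_iter=10):
    """Plain Lloyd iterations starting from the given centroids."""
    centroids = [list(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


def c_tf_idf(docs_per_topic):
    """Score terms per topic: normalized term frequency times log(1 + A / f_t)."""
    tf = {topic: Counter(" ".join(docs).lower().split())
          for topic, docs in docs_per_topic.items()}
    total = Counter()
    for counts in tf.values():
        total.update(counts)
    avg_words = sum(total.values()) / len(tf)  # A: average words per topic
    return {
        topic: {t: (n / sum(counts.values())) * math.log(1 + avg_words / total[t])
                for t, n in counts.items()}
        for topic, counts in tf.items()
    }


# Toy 2-D "reduced embeddings" and toy topic texts (hypothetical values).
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
labels, centroids = kmeans(points, centroids=[(0.0, 0.0), (5.0, 5.0)])
weights = c_tf_idf({0: ["robot control system"], 1: ["robot drive connection"]})
```

In the toy example, "robot" occurs in both topics and therefore scores lower than topic-specific terms such as "control", which is why c-TF-IDF keywords tend to be more discriminative than raw frequency counts.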

This refined model is executed separately for the abstracts from each of the three time periods, enabling the tracking of fine-grained thematic evolution.
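
A per-period run of such a pipeline might look like the sketch below. It assumes the `bertopic`, `sentence-transformers`, `umap-learn`, and `scikit-learn` packages are installed. The UMAP settings, the stop-word choice, the random seed, and `n_topics=5` (chosen to mirror the five topics reported per period) are illustrative assumptions, since the source does not list its hyperparameters. The function names are hypothetical.

```python
def build_topic_model(n_topics=5, seed=42):
    """Assemble a BERTopic model that clusters with K-means instead of HDBSCAN."""
    # Imports are local so this module can be loaded without the heavy dependencies.
    from bertopic import BERTopic
    from sentence_transformers import SentenceTransformer
    from sklearn.cluster import KMeans
    from sklearn.feature_extraction.text import CountVectorizer
    from umap import UMAP

    return BERTopic(
        embedding_model=SentenceTransformer("all-MiniLM-L6-v2"),
        umap_model=UMAP(n_neighbors=15, n_components=5, min_dist=0.0,
                        metric="cosine", random_state=seed),
        # BERTopic accepts a scikit-learn style clusterer in this slot.
        hdbscan_model=KMeans(n_clusters=n_topics, n_init=10, random_state=seed),
        vectorizer_model=CountVectorizer(stop_words="english"),
    )


def fit_per_period(abstracts_by_period, n_topics=5):
    """Fit one independent model per period; returns {period: topic summary table}."""
    results = {}
    for period, abstracts in abstracts_by_period.items():
        model = build_topic_model(n_topics=n_topics)
        model.fit_transform(abstracts)
        results[period] = model.get_topic_info()
    return results
```

Fitting a fresh model per period keeps the topics of one window from being shaped by the vocabulary of another. The trade-off is that topic numbers are not aligned across periods, so themes have to be matched by their keywords.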

Findings from the Co-Occurrence Network Perspective

The analysis of IPC co-occurrence networks reveals a clear macro-evolution of technological domains in the medical robot sector.

Period 1: 2010-2018 – Foundational Mechanical and Diagnostic Focus

The network for this period resolved into three primary communities, indicating three dominant macro-themes.

| Community | Core IPC Codes | Macro Technological Theme |
|---|---|---|
| 1 | A61B34/30, A61B19/00, A61B17/00, A61B90/20, A61B34/00 | Surgical instruments, apparatus, or methods; Computer-assisted surgical planning; Surgical navigation systems. |
| 2 | B25J11/00, B25J9/16, B25J9/00, B25J5/00, G06F19/00 | Manipulators (robotic arms); Control devices; Peripheral apparatus; Data processing for specific applications. |
| 3 | A61B5/00 | Measurement for diagnostic purposes. |

This era was characterized by foundational work on the core hardware of surgical robots (manipulators, controls) and their integration with surgical planning and navigation. Diagnostic measurement emerged as a distinct but separate stream.

Period 2: 2019-2021 – Consolidation and Emergence of Rehabilitation

The network structure maintained three communities, but with notable shifts in emphasis and composition.

| Community | Core IPC Codes | Macro Technological Theme |
|---|---|---|
| 1 | A61B34/30, A61B34/00, A61B90/00, A61B34/20, A61B34/35 | Surgical robotics dominates; includes telesurgery and computer-assisted planning. |
| 2 | B25J11/00, B25J9/16, B25J19/00, B25J5/00 | Manipulators and control apparatus (data processing code G06F19/00 disappears). |
| 3 | A61B5/00, A61H1/02 | Diagnostic measurement converges with apparatus for stretching or bending for exercise (rehabilitation). |

The surgical robotics theme solidified as the dominant cluster. Crucially, diagnostic technology began to merge with rehabilitation (A61H1/02), signaling the rising innovation interest in rehabilitation robots that combine assessment with guided physical therapy.

Period 3: 2022-2024 – Diversification into AI and Imaging

The network expanded to four communities, indicating a diversification and deepening of technological pursuits.

| Community | Core IPC Codes | Macro Technological Theme |
|---|---|---|
| 1 | A61B34/30, A61B34/00, A61B90/00, A61B34/20, A61B34/37 | Continued focus on surgical manipulators, robots, navigation, and computer-assisted planning. |
| 2 | B25J9/16, B25J11/00, B25J19/00 | Manipulators with a specific focus on programmed control. |
| 3 | A61B5/00, G16H40/67, G06N20/00 | Diagnostic measurement now linked to devices for remote operation and machine learning. |
| 4 | G06T7/00, G06N3/08 | A new, distinct theme emerges: Image analysis and associated learning methods. |

This most recent period shows the powerful convergence of medical robot technology with cutting-edge digital technologies. Themes explicitly incorporating machine learning (G06N20/00, G06N3/08), remote operation (G16H40/67), and advanced image analysis (G06T7/00) have crystallized. This reflects the industry’s push towards greater autonomy, data-driven intelligence, and precision in medical robotics applications.
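
The turnover of core codes between periods can be checked mechanically. The sketch below transcribes the core IPC codes from the three community tables above and lists which codes enter or leave between two periods. It covers only the core codes reported in this article, not the full networks, and the helper names are hypothetical.

```python
# Core IPC codes per community, transcribed from the three tables above.
communities = {
    "2010-2018": [
        {"A61B34/30", "A61B19/00", "A61B17/00", "A61B90/20", "A61B34/00"},
        {"B25J11/00", "B25J9/16", "B25J9/00", "B25J5/00", "G06F19/00"},
        {"A61B5/00"},
    ],
    "2019-2021": [
        {"A61B34/30", "A61B34/00", "A61B90/00", "A61B34/20", "A61B34/35"},
        {"B25J11/00", "B25J9/16", "B25J19/00", "B25J5/00"},
        {"A61B5/00", "A61H1/02"},
    ],
    "2022-2024": [
        {"A61B34/30", "A61B34/00", "A61B90/00", "A61B34/20", "A61B34/37"},
        {"B25J9/16", "B25J11/00", "B25J19/00"},
        {"A61B5/00", "G16H40/67", "G06N20/00"},
        {"G06T7/00", "G06N3/08"},
    ],
}


def all_codes(period):
    """Union of the core codes over all communities of one period."""
    return set().union(*communities[period])


def code_turnover(earlier, later):
    """Return (codes that entered, codes that left) between two periods, sorted."""
    before, after = all_codes(earlier), all_codes(later)
    return sorted(after - before), sorted(before - after)


entered, left = code_turnover("2019-2021", "2022-2024")
```

For the last transition, `entered` contains the machine learning, image analysis, and remote operation codes highlighted in the text, together with A61B34/37.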

Findings from the Topic Modeling Perspective

The enhanced BERTopic analysis of patent abstracts provides a complementary, semantic view of innovation trends, validating and enriching the macro-level picture.

Period 1: 2010-2018 – Control Systems and Core Mechanisms

Five semantic topics were identified, with a strong emphasis on fundamental robotic systems.

| Topic | Representative Keywords | Semantic Theme Interpretation | Doc Count |
|---|---|---|---|
| 0 | robot, device, system, control | Robotic control systems and apparatus | 4969 |
| 1 | connection, robot, drive, device | Robotic connection and drive mechanisms | 2886 |
| 2 | robot, system, surgical, device | Surgical robotic systems and instruments | 1752 |
| 3 | level, robot, bend, platform | Robot flexibility and leveling platforms | 940 |
| 4 | robot, image, laser, system | Imaging-assisted robotics | 446 |

The dominance of Topic 0 (about 45% of the documents assigned to topics in this period) underscores that early innovation in medical robots centered on achieving reliable and precise robotic control systems, the essential “brain” and “nervous system” of the device.

Period 2: 2019-2021 – Shift Towards Integration and Fixation

The thematic structure remained at five topics, but with a noticeable shift in focus.

| Topic | Representative Keywords | Semantic Theme Interpretation | Doc Count |
|---|---|---|---|
| 0 | robot, control, device, system | Robotic control systems and apparatus | 3517 |
| 1 | robot, connection, fix, install | Robotic connection and fixation apparatus | 3264 |
| 2 | connection, drive, component, robot | Robotic connection and drive components | 1622 |
| 3 | robot, system, surgical, device | Surgical robotic systems and instruments | 1074 |
| 4 | laser, robot, optical, fiber | Imaging-assisted robotics (laser/optical focus) | 321 |

The prominence of control systems (Topic 0) diminished relative to the previous period, while topics related to physical connection, fixation, and installation (Topics 1 & 2) gained substantial traction. This suggests a maturation of core control algorithms and a growing innovative focus on how the medical robot is physically integrated into the surgical or clinical environment—a critical aspect for usability and safety.

Period 3: 2022-2024 – Component Specialization and Navigation

The semantic analysis for the latest period shows a clear specialization trend.

| Topic | Representative Keywords | Semantic Theme Interpretation | Doc Count |
|---|---|---|---|
| 0 | robot, connection, device, drive | Robotic connection and drive apparatus | 4273 |
| 1 | robot, system, control, module | Robotic control systems and modules | 2313 |
| 2 | connection, drive, component, device | Connection and drive components (specialized) | 1510 |
| 3 | robot, system, surgery, device | Surgical robot systems and instruments | 812 |
| 4 | robot, device, optical, connection | Robotic image-assisted navigation apparatus | 753 |

Innovation has decisively pivoted towards the refinement and specialization of connection and drive components (Topics 0 & 2 combined represent the largest share), indicating an industry-wide effort to optimize the physical performance and reliability of the medical robot. Simultaneously, Topic 4 has evolved into a distinct theme focused on image-assisted navigation, aligning with the emergence of the image analysis community in the IPC network. This highlights the growing importance of visual data fusion for guiding robotic interventions.
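
The relative weight of each topic is easier to compare across periods as a share than as a raw count. The short calculation below uses only the document counts from the three topic tables above; the variable and function names are illustrative.

```python
# Document counts per topic (Topics 0-4), transcribed from the tables above.
doc_counts = {
    "2010-2018": [4969, 2886, 1752, 940, 446],
    "2019-2021": [3517, 3264, 1622, 1074, 321],
    "2022-2024": [4273, 2313, 1510, 812, 753],
}


def topic_shares(counts):
    """Percentage share of each topic within its period, rounded to one decimal."""
    total = sum(counts)
    return [round(100 * c / total, 1) for c in counts]


shares = {period: topic_shares(counts) for period, counts in doc_counts.items()}
# Combined share of the two connection/drive topics (0 and 2) in the latest period.
p3 = doc_counts["2022-2024"]
drive_share_2022_2024 = round(100 * (p3[0] + p3[2]) / sum(p3), 1)
```

On these counts, the control-systems topic falls from 45.2% of documents in 2010–2018 to 35.9% in 2019–2021, and the two connection and drive topics together account for 59.9% in 2022–2024. The tabulated counts sum to 30,452 documents, somewhat fewer than the 31,338 in the dataset; the source does not explain the difference.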

Synthesis and Convergence of Perspectives

The most significant finding of this dual-perspective study is the strong, convergent signal regarding the evolution of innovation in the medical robot industry. Despite using fundamentally different data features (categorical codes vs. natural language), both analytical methods point to a coherent and logical technological trajectory.

The journey begins with a foundational phase (2010-2018) where the core medical robot platform is established, focusing on basic robotic control, manipulator mechanics, and surgical integration. The IPC network shows communities for surgical apparatus and manipulators, while the topic model highlights control systems and connection drives as the primary semantic themes.

The industry then enters a maturation and expansion phase (2019-2021). The macro view shows surgical robotics solidifying its dominance and the novel convergence of diagnosis with rehabilitation, marking the rise of the rehabilitation medical robot. The semantic view corroborates this by showing a relative decline in control system focus and a surge in innovation related to physical fixation and installation—key challenges for deploying robots in diverse clinical settings, including rehabilitation.

The current phase (2022-2024) is characterized by intelligent diversification. Both perspectives capture this shift. In the IPC network, the diagnostic community now includes machine learning (G06N20/00) and remote operation (G16H40/67) codes, and a new fourth community forms around image analysis (G06T7/00, G06N3/08). These are the technologies that inject “intelligence” into the medical robot. The topic model reflects the shift through the specialization of drive components (interpreted here as enabling finer, more precisely controlled movement) and the crystallization of image-assisted navigation as a distinct theme. This synergy between advanced hardware components and AI-driven software capabilities defines the cutting edge of medical robot innovation.

This convergence is not merely coincidental; it represents the holistic nature of technological progress in complex systems like the medical robot. Knowledge flows simultaneously at different levels: from broad technological domains (captured by IPC co-classification) down to specific engineering problems and solutions (captured by abstract semantics). The fact that both flows exhibit parallel directional trends underscores the robustness of the identified innovation pathway. The formula for within-cluster variance minimization in topic modeling, $\arg \min_C \sum_{i=1}^{k} \sum_{\mathbf{x} \in C_i} \|\mathbf{x} - \boldsymbol{\mu}_i\|^2$, finds its conceptual counterpart in the modularity optimization of the IPC networks, both seeking to find the most coherent groupings within the innovation data. This cross-dimensional alignment provides a powerful, validated signal for forecasting. It suggests that future key technologies in the medical robot industry will likely reside at the intersection of these converging streams: further miniaturization and intelligence of drive components, deeper integration of real-time image analysis and AI-based decision support, and the expansion of telesurgery and adaptive rehabilitation platforms.
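
For reference, the modularity score that community detection on the IPC networks seeks to maximize can be written in its standard weighted form. The notation below is the usual textbook one and is not taken from the source, which does not report the resolution or other settings used in Gephi:

$$ Q = \frac{1}{2m} \sum_{i,j} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j) $$

Here $A_{ij}$ is the co-occurrence weight between IPC codes $i$ and $j$, $k_i = \sum_j A_{ij}$ is the weighted degree of code $i$, $m = \frac{1}{2} \sum_{i,j} A_{ij}$ is the total edge weight, and $\delta(c_i, c_j)$ equals 1 when the two codes are assigned to the same community and 0 otherwise. K-means rewards groupings whose members lie close to a shared centroid; modularity rewards groupings whose internal edge weight exceeds what a random network with the same degrees would be expected to produce.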

In conclusion, this study demonstrates the value of a multi-perspective, data-driven approach to mapping technological innovation. By jointly analyzing co-occurrence networks and semantic topics within patent data, we have delineated a clear and convergent evolution of the medical robot industry from foundational mechanics to intelligent, integrated systems. The identified trend towards the fusion of sophisticated hardware with AI and imaging technologies offers a reliable compass for stakeholders aiming to navigate the future of this dynamic field. While the reliance on patent data provides a strong, output-oriented view of innovation, future research could further enhance this framework by incorporating complementary data sources, such as scientific publications and market analytics, to build an even more comprehensive model of the medical robot innovation ecosystem.
