Data-driven Position and Orientation Control for Redundant Robotic Manipulators with Unknown Models

In my research, I address a significant challenge in the control of redundant robotic manipulators: achieving precise trajectory tracking for the end effector‘s position and orientation when the robot’s kinematic model is inaccurate or completely unknown. Traditional control schemes predominantly rely on an accurate analytical model, such as the Denavit-Hartenberg (DH) parameters. However, in practical scenarios, the model can change due to the attachment of different tools or end effectors, or simply be imperfectly calibrated. This discrepancy leads to substantial control errors and task failure. Furthermore, many existing approaches from a kinematic perspective focus solely on position control, neglecting the critical aspect of orientation control for the end effector. Many industrial tasks, like welding, polishing, or assembly, require the tool to maintain a specific angle relative to the work surface. This gap motivates the development of a novel, model-free control scheme that simultaneously governs both the position and orientation of the end effector.

My proposed solution leverages data-driven technology. Instead of depending on a pre-defined model, the scheme learns the necessary kinematic relationships online from the stream of input-output data generated during the robot’s operation. This approach provides robustness against model uncertainties. For representing and controlling the orientation, I employ unit quaternions. Quaternions offer a non-singular, computationally efficient representation for three-dimensional rotations, which is superior to Euler angles for control purposes. The core of my work is a unified, velocity-level optimization scheme that integrates online Jacobian estimation with quaternion-based orientation tracking, solved via a designed neurodynamic solver.

Problem Formulation and Preliminaries

Consider a redundant manipulator with $a$ degrees of freedom (DOF), where $a > b$, the dimension of the end effector‘s task space (typically $b=6$ for full position and orientation). Let $\theta \in \mathbb{R}^a$ denote the joint angle vector and $r \in \mathbb{R}^3$ denote the position of the end effector. The forward kinematics for position is given by $r = f(\theta)$. The differential relationship is:

$$
\dot{r} = J(\theta) \dot{\theta},
$$

where $J(\theta) = \partial f(\theta)/\partial \theta \in \mathbb{R}^{3 \times a}$ is the geometric Jacobian matrix for position. For orientation, the end effector‘s attitude is described by a rotation matrix $R(\theta) \in SO(3)$. To avoid singularities, I represent this orientation using a unit quaternion $q_E(\theta) = [\eta, \epsilon^T]^T \in \mathbb{R}^4$, where $\eta$ is the scalar part and $\epsilon \in \mathbb{R}^3$ is the vector part, satisfying $\|q_E\|_2 = 1$. The differential kinematics for the quaternion is:

$$
\dot{q}_E = H(\theta) \dot{\theta},
$$

where $H(\theta) \in \mathbb{R}^{4 \times a}$ is the Jacobian relating joint velocities to quaternion velocity. Not all four components of the quaternion are independent due to the unit norm constraint. For control purposes, it is sufficient and more straightforward to control a minimal representation derived from the rotation matrix. I define an orientation vector $o(\theta) \in \mathbb{R}^5$, which can be composed of specific elements of $R(\theta)$ that uniquely define it (e.g., $o = [R_{11}, R_{12}, R_{21}, R_{22}, R_{33}]^T$). Its differential form is:

$$
\dot{o} = G(\theta) \dot{\theta},
$$

where $G(\theta) \in \mathbb{R}^{5 \times a}$. Given a desired orientation represented by its corresponding vector $q_d \in \mathbb{R}^5$, the orientation tracking error can be defined. The challenge is that the true Jacobian matrices $J(\theta)$ and $G(\theta)$ are unknown or inaccurate.

Proposed Data-Driven Control Scheme

I propose a comprehensive control scheme that learns the kinematics online while performing trajectory tracking. Let $\hat{J} \in \mathbb{R}^{3 \times a}$ be the estimated position Jacobian, and $\hat{\dot{\theta}} = \dot{\theta} + u$ represent the joint velocity measurement perturbed by a bounded random noise $u$ ($\|u\| \leq u_0$). This noise injection is crucial to ensure persistent excitation for the learning algorithm when the system is at rest. The desired trajectory is defined by $r_d(t)$ for position and $q_d(t)$ for orientation. The complete scheme is formulated as the following constrained optimization problem at the velocity level:

$$
\begin{aligned}
\min_{\dot{\theta}} \quad & \frac{1}{2} \dot{\theta}^T \dot{\theta} \\
\text{s.t.} \quad & \hat{J} \hat{\dot{\theta}} = \dot{r}_d, \\
& G(\theta) \hat{\dot{\theta}} = \dot{q}_d, \\
& \dot{\hat{J}} = -\gamma (\hat{J} \hat{\dot{\theta}} – \dot{r}) \hat{\dot{\theta}}^T, \\
& \dot{\theta}^- \leq \dot{\theta} \leq \dot{\theta}^+.
\end{aligned}
$$

The objective function minimizes the joint velocity norm, promoting smooth motion and minimizing energy. The first two constraints enforce the velocity-level tracking for both position and orientation using the estimated and true (for $G$) Jacobians. The third equation is the key data-driven law for updating the estimated Jacobian $\hat{J}$ online, where $\gamma > 0$ is a learning gain and $\dot{r}$ is the measured end effector velocity. This update rule is derived from the gradient descent minimization of the estimation error $\|\hat{J}\hat{\dot{\theta}} – \dot{r}\|_2^2$. The final constraint represents the physical joint velocity limits.

To unify the position and orientation tracking constraints, I define an augmented task vector and its estimated Jacobian:

$$
v_d = \begin{bmatrix} r_d \\ q_d \end{bmatrix} \in \mathbb{R}^8, \quad \hat{A} = \begin{bmatrix} \hat{J} \\ G(\theta) \end{bmatrix} \in \mathbb{R}^{8 \times a}.
$$

Note that while $\hat{J}$ is learned online, $G(\theta)$ is considered to be derivable from the orientation vector $o(\theta)$, which can be computed from the measured joint angles $\theta$ using the nominal (possibly inaccurate) kinematic model for structure, as the relationship between $\theta$ and $o$ is primarily geometric. The core tracking constraint becomes $\hat{A} \hat{\dot{\theta}} = \dot{v}_d$.

Table 1: Key Variables in the Proposed Control Scheme
Symbol	Description
$\theta, \dot{\theta}$	Joint angle and velocity vectors
$r, \dot{r}, r_d$	End effector position, velocity, and desired position
$q_E(\theta)$	Unit quaternion representing end effector orientation
$o(\theta), q_d$	Orientation vector and desired orientation vector
$J(\theta), G(\theta)$	True geometric Jacobians for position and orientation
$\hat{J}, \hat{A}$	Estimated position Jacobian and augmented Jacobian
$\hat{\dot{\theta}}$	Measured joint velocity (with noise)
$\dot{\theta}^-, \dot{\theta}^+$	Lower and upper bounds for joint velocity

Neurodynamic Solver Design and Convergence Analysis

To solve the real-time optimization problem, I design a recurrent neural network (RNN) or neurodynamic solver. First, I define the tracking error $e = v – v_d$, where $v = [r^T, o(\theta)^T]^T$. A gradient descent term for error minimization is $-\eta \hat{A}^T e$, where $\eta > 0$. To achieve exact constraint satisfaction ($\hat{A} \hat{\dot{\theta}} = \dot{v}_d$), a dynamic state feedback term $\lambda \in \mathbb{R}^8$ is introduced. The joint velocity command is generated by projecting the combined signal onto the feasible set defined by the velocity bounds:

$$
\dot{\theta} = P_{\Omega}\big( -\eta \hat{A}^T e + \hat{A}^T \lambda \big),
$$

where $P_{\Omega}(\cdot)$ is a projection operator ensuring $\dot{\theta}^- \leq \dot{\theta} \leq \dot{\theta}^+$. The Lagrange multiplier $\lambda$ is dynamically updated to enforce the equality constraint:

$$
\dot{\lambda} = \mu (\dot{v}_d – \hat{A} \hat{A}^T \lambda),
$$

with $\mu > 0$. This constitutes the neurodynamic solver. Combining it with the Jacobian update law gives the complete online control system:

$$
\begin{aligned}
\text{(Controller)} &: \quad \dot{\theta} = P_{\Omega}\big( -\eta \hat{A}^T (v – v_d) + \hat{A}^T \lambda \big), \\
\text{(Multiplier)} &: \quad \dot{\lambda} = \mu (\dot{v}_d – \hat{A} \hat{A}^T \lambda), \\
\text{(Estimator)} &: \quad \dot{\hat{J}} = -\gamma (\hat{J} \hat{\dot{\theta}} – \dot{r}) \hat{\dot{\theta}}^T, \\
\text{(Measurement)} &: \quad \hat{\dot{\theta}} = \dot{\theta} + u.
\end{aligned}
$$

Theorem (Convergence): For the neurodynamic solver under the condition that $\hat{A}^T \lambda$ remains within the feasible region $\Omega$, the tracking error $e(t)$ converges to zero globally and asymptotically.

Proof Sketch: Consider two Lyapunov functions. First, for the Lagrange multiplier subsystem, let $V_1 = \frac{1}{2} \|\dot{v}_d – \hat{A}\hat{A}^T \lambda\|_2^2$. Its time derivative leads to $\dot{V}_1 \leq -\mu \sigma_{min} V_1$, where $\sigma_{min}$ is the minimum eigenvalue of $\hat{A}\hat{A}^T$, proving that $\hat{A}\hat{A}^T \lambda \to \dot{v}_d$ exponentially. At steady state, $\hat{A}^T \lambda$ acts as $\hat{A}^{\dagger} \dot{v}_d$, where $\dagger$ denotes the pseudo-inverse. Second, for the tracking error $e$, using the property of the projection operator and the steady-state behavior of $\lambda$, one can show that the derivative of $V_2 = \frac{1}{2} e^T e$ is non-positive, i.e., $\dot{V}_2 \leq 0$. By LaSalle’s invariance principle, the system converges to the largest invariant set where $\dot{V}_2 = 0$, which implies $e = 0$. Therefore, both position and orientation tracking for the end effector are achieved.

Simulation Studies and Comparative Analysis

I conducted extensive simulations on a 7-DOF Franka Emika Panda manipulator model to validate the proposed scheme. Two primary tasks were designed: 1) Tracking a complex planar trajectory (a cloverleaf) while maintaining a constant end effector orientation. 2) Tracking a 3D helical path on a conical surface while continuously changing the end effector orientation to remain normal to the surface. The initial estimated Jacobian $\hat{J}(0)$ was set with random values, and the nominal model used for $G(\theta)$ was intentionally perturbed to simulate model uncertainty.

Table 2: Performance Metrics for the Conical Surface Tracking Task
Metric	Steady-State Value	Comment
Position Error $\\|r – r_d\\|_2$	$< 2.0 \times 10^{-3}$ m	Converges within seconds
Orientation Error $\\|o – q_d\\|_2$	$< 4.0 \times 10^{-3}$	Accurate orientation tracking
Jacobian Estimation Error $\\|\hat{J} – J_{true}\\|_F$	$< 1.0 \times 10^{-2}$	Online learning is effective
Joint Velocities	Within $[\dot{\theta}^-, \dot{\theta}^+]$	Physical constraints are satisfied

The simulation results confirmed several key advantages. First, the estimated Jacobian $\hat{J}$ quickly converged to a neighborhood of the true Jacobian, enabling accurate position control despite the unknown model. Second, the quaternion-based framework seamlessly integrated orientation control, allowing the end effector to precisely follow the desired attitude profile. The joint velocities remained smooth and within the prescribed limits. A comparative analysis with other state-of-the-art methods further highlights the novelty and effectiveness of the proposed approach.

Table 3: Comparison with Existing Redundant Manipulator Control Schemes
Scheme Feature	Proposed Scheme	Model-Based Scheme [A]	Data-Driven Position-Only Scheme [B]
Requires Accurate Model	No	Yes	No
Controls End Effector Orientation	Yes (Quaternion-based)	Yes	No
Online Jacobian Learning	Yes	No	Yes
Handles Joint Velocity Limits	Yes	Yes	Sometimes
Primary Application	Model-unknown tasks requiring precise end effector pose	Tasks with perfectly known model	Model-unknown tasks where orientation is not critical

The comparison clearly shows that my proposed scheme uniquely combines three vital features: model independence via data-driven learning, full pose (position + orientation) control, and consideration of physical actuator limits. This combination makes it highly suitable for advanced robotic applications where adaptability and precise end effector placement are paramount.

Conclusion

In this work, I have presented a novel data-driven control scheme for redundant manipulators that simultaneously addresses the problems of model uncertainty and integrated pose control. The framework successfully merges online Jacobian estimation through a simple yet effective update law with a quaternion-based orientation control strategy. The formulated velocity-level optimization problem, which incorporates joint limits, is solved efficiently by a designed neurodynamic solver whose convergence is rigorously proven. Simulation studies on a 7-DOF manipulator demonstrate the scheme’s capability to accurately track both the position and orientation trajectories of the end effector without prior knowledge of the robot’s kinematic model. The comparative analysis underscores the scheme’s novelty in bridging the gap between data-driven adaptability and full six-DOF pose control, offering a practical solution for real-world robotic tasks where flexibility and precision of the end effector are essential.