Hierarchical optimal feedback control of redundant systems

Emanuel Todorov and Weiwei Li, University of California San Diego

Sensorimotor function results from multiple feedback loops that operate simultaneously. Yet most existing models of feedback control involve a single transformation from (estimated) states into control signals. Our goal here is to develop a general method for constructing feedback control hierarchies and optimizing them for redundant tasks. The setting is illustrated in Fig 1D. The high-level controller, adapted to the task via methods for optimal controller design, interacts with an augmented dynamical system instead of the physical plant. The augmentation is performed by a low-level mechanism, which extracts a small set of task-relevant features, sends those features to the high level, obtains a response in the form of a desired change in the features, and transforms that response into appropriate control signals. Such a method can greatly simplify optimal controller design, because the augmented system is treated as having far fewer state variables.

But can it provide a good approximation to optimal control? Based on our recent work, we believe it can. We have shown that optimal feedback controllers in redundant tasks, although not explicitly designed to be hierarchical, end up being (roughly) hierarchical anyway. The optimal mapping from states into controls (Fig 1A) has reduced rank, and can therefore be represented by a 3-layer neural network with a bottleneck: layer 1 performs feature extraction, layer 2 performs feedback control in feature space, and layer 3 generates controls via motor synergies (Fig 1B). The transition to the present approximation scheme is illustrated in Fig 1C. The direct link from states to controls, which has no analog in Fig 1A,B, is essential for the development of our method below. We now present the new method in its general form.
Consider a plant with dynamics $\dot{x} = f(x, u_x)$, where $x$ is the state vector and $u_x$ the control vector. The feature vector $y$ is related to the state by $y = h(x)$, and contains enough information so that the cost function $q(x) + u_x^T R u_x$ can be written as $q(y) + u_x^T R u_x$. The true dynamics of $y$ is then given by $\dot{y} = \frac{\partial h}{\partial x} f(x, u_x)$. If we want $u_y$ to specify desired changes in $y$, the desired dynamics is $\dot{y} = \text{passive} + u_y$, where the passive dynamics correspond to $u_x = 0$. Equating the desired and actual dynamics, and linearizing $f$ with respect to $u_x$, we obtain a relationship between high-level commands and actual control signals: $u_y = \frac{\partial h}{\partial x} \frac{\partial f}{\partial u_x} u_x$. In addition to satisfying this relationship we want to keep the control cost small. Thus we will compute $u_x$ online by static minimization of $u_x^T R u_x + \left\| u_y - \frac{\partial h}{\partial x} \frac{\partial f}{\partial u_x} u_x \right\|^2$ with respect to $u_x$. This affords automatic construction of motor synergies, once the features (or "controlled parameters") are defined.

In order to apply model-based optimal control on the task level, we need a virtual dynamical model of $y$ which does not depend on $x$, $u_x$. So we seek a function $g(y)$ such that $\dot{y} = g(y) + u_y$, which is achieved when $g(y) = \text{passive} = \frac{\partial h}{\partial x} f(x, 0)$. Now we see why this hierarchical method is approximate: the latter equation cannot be satisfied exactly, because the mapping $x \to y$ is many-to-one. In some cases we will be able to find the best approximation $g(y)$; in other cases we will initialize $g$ using physical intuition, and then improve it through learning (by generating $u_y$'s, measuring the resulting changes in $y$, and fitting $g(y; w) = \dot{y} - u_y$).

The above general method is easily instantiated for linear dynamical systems, as follows. Suppose $\dot{x} = Ax + Bu_x$ and $y = Hx$. Then $\frac{\partial h}{\partial x} = H$, $\frac{\partial f}{\partial u_x} = B$, and $u_x$ is found by minimizing $u_x^T R u_x + \| u_y - HBu_x \|^2$. Let the virtual dynamical model be in the form $\dot{y} = Gy + u_y$. Then $G$ should satisfy $GHx = HAx$ for all $x$. In general this is not possible exactly, but the best approximation is given by $G = HAH^\dagger$, where $H^\dagger$ denotes the pseudoinverse of $H$. It will be interesting to compare the performance of our new method to non-hierarchical optimal feedback controllers.

We now illustrate the method on a nonlinear problem, involving reaching with a 2-link 6-muscle arm model (Fig 2). The muscles are modeled as low-pass filters, and so their activations are state variables (along with the joint angles and velocities). The task features we chose are hand position, velocity, and net force acting on the hand. The virtual dynamics $g(y)$ initially corresponded to a point mass, and was later improved by fitting a second-order polynomial. Optimal feedback controllers on the task level were constructed by our generalized LQG method, which can handle nonlinear dynamics.

Fig 3A,B show hand trajectories before (3A) and after (3B) improvement of $g$. Black curves are actual trajectories generated by our hierarchical control scheme; gray curves are "virtual" trajectories that would result from applying the task-level controller alone to a system with dynamics $g$. Before learning, the virtual trajectories are straight because $g$ is a linear point-mass model. The actual trajectories are quite different, but note that they still converge to the target. After learning a nonlinear $g$ the discrepancy disappears. Fig 3C shows that the muscle controls generated by the hierarchical controller (gray) are very similar to the controls generated by a non-hierarchical controller (dashed); the latter is obtained with the generalized LQG method, initialized from the solution of the hierarchical controller. The close correspondence indicates that the method yields a good approximation to optimal feedback control. Although noise was omitted here for simplicity, it can be incorporated.
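The linear instantiation described above can be sketched numerically. The following is a minimal illustration, not the authors' implementation: it assumes NumPy, a symmetric positive-definite R, and a hypothetical two-state toy system whose matrices A, B, H, R are invented for the example. Setting the gradient of the static cost to zero gives the closed form (R + M^T M) u_x = M^T u_y with M = HB, and the virtual model uses the pseudoinverse of H.

```python
import numpy as np

def low_level_controls(u_y, H, B, R):
    """Minimize u_x^T R u_x + ||u_y - H B u_x||^2 over u_x.

    Assumes R is symmetric positive definite, so the stationarity
    condition (R + M^T M) u_x = M^T u_y has a unique solution.
    """
    M = H @ B
    return np.linalg.solve(R + M.T @ M, M.T @ u_y)

def virtual_dynamics(H, A):
    """Least-squares solution of G H = H A, i.e. G = H A H^+."""
    return H @ A @ np.linalg.pinv(H)

# Hypothetical toy system (illustrative values, not from the paper):
# two decaying states, one task feature y = x1 + x2.
A = -np.eye(2)
B = np.eye(2)
H = np.array([[1.0, 1.0]])
R = 0.01 * np.eye(2)

G = virtual_dynamics(H, A)                      # virtual model: y_dot = G y + u_y
u_x = low_level_controls(np.array([1.0]), H, B, R)
achieved = H @ B @ u_x                          # feature change actually produced
```

In this toy case the two controls come out equal, which is the kind of coupling the text calls a motor synergy, and the achieved feature change falls slightly short of the requested one because the control cost is traded off against tracking error.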
We believe this is the first comprehensive approach to hierarchical optimal feedback control; apart from modeling the neural control of movement, it may have more general applications to the control of complex systems, including neuro-prostheses.
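The learning step mentioned earlier (generating task-level commands, measuring the resulting feature changes, and fitting g(y; w) to the difference) can be sketched as an ordinary least-squares fit. This is only an illustration under stated assumptions: the second-order polynomial feature map, the function names, and the synthetic "true" passive dynamics below are all hypothetical, and the fitting procedure actually used in the arm model is not specified in the text.

```python
import numpy as np

def quad_features(Y):
    """Second-order polynomial features [1, y_i, y_i*y_j (i<=j)] per row of Y."""
    n, d = Y.shape
    cols = [np.ones(n)] + [Y[:, i] for i in range(d)]
    cols += [Y[:, i] * Y[:, j] for i in range(d) for j in range(i, d)]
    return np.stack(cols, axis=1)

def fit_virtual_model(Y, Ydot, Uy):
    """Fit g(y; w) by least squares to the targets ydot - u_y."""
    W, *_ = np.linalg.lstsq(quad_features(Y), Ydot - Uy, rcond=None)

    def g(y):
        # Accepts a single feature vector or a batch of rows.
        return quad_features(np.atleast_2d(y)) @ W

    return g

def true_passive(Y):
    """Hypothetical passive feature dynamics used to generate synthetic data."""
    return np.stack([-Y[:, 0] + 0.5 * Y[:, 1] ** 2, -2.0 * Y[:, 1]], axis=1)

rng = np.random.default_rng(0)
Y = rng.normal(size=(200, 2))      # visited feature values
Uy = rng.normal(size=(200, 2))     # task-level commands that were applied
Ydot = true_passive(Y) + Uy        # measured feature derivatives (noise-free)

g = fit_virtual_model(Y, Ydot, Uy)
y0 = np.array([[0.3, -0.7]])
prediction = g(y0)
```

Because the synthetic dynamics lie inside the quadratic model class and no noise is added, the fit recovers them essentially exactly; with real data the residual would reflect both noise and the many-to-one mapping from states to features.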

[Figure 1 diagram. Panel (A): two-state example with task goal x1 + x2 = target, showing the task-relevant direction (x1 + x2 varies), the redundant direction (x1 + x2 constant), the final state covariance, and the optimal controls u1 = u2 = f(x1 + x2). Panels (B) and (C): network diagrams from states to controls, with labels for feature extraction, task-level feedback, and motor synergies. Panel (D): the task-level controller uy(y), designed by optimal control methods, receives y(x) from the feedback transformation ux(x, uy), designed by the new method, which drives the plant ẋ = f(x, ux) and reads its state x.]

Figure 1: Schematic illustration of the new method and its motivation. (A) In the simplest redundant task, we have shown that the two controls $u_1, u_2$ (affecting the state variables $x_1, x_2$) are coupled into a motor synergy. The control signal is a function of the task-relevant feature $x_1 + x_2$, but not the individual $x_1, x_2$. Analysis shows similar control structure for arbitrary redundant tasks. (B) Such low-rank controllers can be represented as networks with bottlenecks. (C) We fix the feature extractor and motor synergies (through the choice of features), and only optimize the feedback controller operating in feature space. (D) Diagram of our new method.

Figure 2: Model of a 2-link 6-muscle human arm in the horizontal plane. (A) Schematic illustration. (B) Length-velocity-tension function, based on Virtual Muscle. (C) Muscle moment arms. Other parameters are taken from the experimental literature. Excitation dynamics is modeled as a first-order low-pass filter.

[Figure 3 plots. Panels (A) and (B): hand trajectories, scale bar 10 cm. Panel (C): muscle activations versus time (sec), from 0 to 1; legend: hierarchical, local minimum.]

Figure 3: Hand-space trajectories before (A) and after (B) learning a better virtual model. Black – actual trajectories; gray – trajectories obtained by replacing the augmented system with the virtual model. There are two start points and two targets. (C) Comparison of the control signals obtained by the hierarchical and non-hierarchical optimal feedback controllers, for one movement. The cost being optimized includes endpoint accuracy and control effort.
