Using Inverse Optimal Control To Predict Human Reaching Motion in Collaborative Tasks

Jim Mainprice1, Rafi Hayne2, Dmitry Berenson2

{[email protected], [email protected], [email protected]}

1 Max-Planck-Institute for Intelligent Systems, Autonomous Motion Department, Paul-Ehrlich-Str. 15, 72076 Tübingen, Germany
2 Robotics Engineering Program, Worcester Polytechnic Institute, 100 Institute Rd, Worcester, MA 01609

A great deal of work in neuroscience [1], [2], [3] and biomechanics [4] has sought to model the principles underlying human motion. However, human motion in environments with obstacles has been difficult to characterize. Furthermore, human motion in collaborative tasks, where two humans share a workspace, is difficult to model because the social, interference, and comfort criteria are unclear.

In this work we present a method to learn the cost function of a motion planner so that it mimics human motion in collaborative manipulation tasks. Our approach is based on Inverse Optimal Control (IOC), which, given a set of demonstrations, allows us to find a cost function that balances different feature functions. The demonstrations are recorded with motion capture, while the feature functions are designed to avoid interference and collision with the partner and to maintain smoothness of the trajectories. Prediction of human motion is then performed by iterative replanning using the trajectory optimizer STOMP [5], which is able to handle difficult environmental constraints.

IOC, sometimes called Inverse Reinforcement Learning, is the problem of finding the cost or reward function that an agent optimizes when computing a trajectory or policy. The early apprenticeship learning approach [6] solves the forward problem iteratively, modifying the weights at each iteration. Our approach is based on the more recent PIIRL algorithm [7], which does not require solving the forward problem and requires only local optimality of the demonstrated trajectories, and can therefore handle high-dimensional continuous state spaces.

This work is supported in part by the Office of Naval Research under Grant N00014-13-1-0735 and by the National Science Foundation under Grant IIS-1317462.
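The cost function described in this section balances several feature functions through a set of weights. The sketch below illustrates that idea as a weighted sum of two features evaluated on a trajectory. The feature definitions (`smoothness_feature`, `proximity_feature`), the function `trajectory_cost`, and the array shapes are assumptions made for illustration; they are not the features or the implementation used in the paper.

```python
import numpy as np

def smoothness_feature(traj):
    """Sum of squared finite-difference accelerations (hypothetical feature)."""
    acc = traj[2:] - 2.0 * traj[1:-1] + traj[:-2]
    return float(np.sum(acc ** 2))

def proximity_feature(traj, partner_points, eps=1e-6):
    """Sum of inverse distances to the closest partner point (hypothetical feature)."""
    # Pairwise distances between waypoints (T, D) and partner points (P, D).
    dists = np.linalg.norm(traj[:, None, :] - partner_points[None, :, :], axis=2)
    return float(np.sum(1.0 / (dists.min(axis=1) + eps)))

def trajectory_cost(traj, partner_points, weights):
    """Weighted sum of feature values; IOC would estimate `weights` from demonstrations."""
    features = np.array([
        smoothness_feature(traj),
        proximity_feature(traj, partner_points),
    ])
    return float(np.dot(weights, features))

# Usage: a straight, evenly spaced hand path next to one partner point.
straight = np.linspace([0.0, 0.0, 0.0], [1.0, 0.0, 0.0], 5)
partner = np.array([[0.5, 1.0, 0.0]])
cost = trajectory_cost(straight, partner, weights=np.array([1.0, 0.5]))
```

In this form, changing the weights changes which trajectories the planner prefers, which is the quantity an IOC method would fit to the demonstrations.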

TABLE I. DTW performed between the demonstration of Figure 1 and the planned trajectories. Results are averaged over 10 runs.

Type                      Planning           µ       σ      min     max
Joint center distances    No replanning      52.89   9.66   39.94   67.09
Joint center distances    With replanning    44.91   6.62   36.15   55.20
Task space                No replanning      49.22   8.25   37.75   63.78
Task space                With replanning    36.20   8.13   24.81   50.77
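The scores in Table I are Dynamic Time Warping (DTW) distances summarized over several planning runs. A minimal sketch of such a computation follows, assuming a Euclidean local cost between waypoints and the standard three-way step pattern. The paper does not state its local cost or any normalization, so these choices, as well as the names `dtw_distance` and `summarize`, are assumptions.

```python
import numpy as np

def dtw_distance(a, b):
    """Accumulated DTW cost between trajectories a (N, D) and b (M, D)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.ndim == 1:
        a = a[:, None]  # treat a 1-D sequence as N samples of dimension 1
    if b.ndim == 1:
        b = b[:, None]
    n, m = len(a), len(b)
    # Euclidean distance between every pair of waypoints.
    local = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
            acc[i, j] = local[i - 1, j - 1] + best_prev
    return float(acc[n, m])

def summarize(scores):
    """Mean, standard deviation, min and max of DTW scores over several runs."""
    s = np.asarray(scores, dtype=float)
    return {"mu": s.mean(), "sigma": s.std(), "min": s.min(), "max": s.max()}

# Usage: compare one demonstration against trajectories from repeated planning runs.
demo = [0.0, 1.0, 2.0]
planned_runs = [[0.0, 1.0, 1.0, 2.0], [0.0, 1.5, 2.0]]
stats = summarize([dtw_distance(demo, run) for run in planned_runs])
```

Because DTW aligns the two sequences in time before accumulating distances, trajectories that follow the same path at different speeds still score as close.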

Fig. 1. Shared workspace assembly experiment (left) and a demonstration of the benefits of replanning on a difficult example (right). Original motion (red) and predicted motions with (blue) and without (green) replanning.

We applied our framework to data gathered from two humans performing pick-and-place tasks in close proximity (see Figure 1). To demonstrate the efficacy of our approach, we provide test results comparing the learned cost functions against hand-tuned versions and against planning without iterative replanning. We computed the Dynamic Time Warping (DTW) distance between the demonstrations and the trajectories obtained from planning with the human kinematic model used in the learning phase (see Table I). We found that we are able to capture a cost function for collaborative reaching motions that outperforms baseline methods in terms of generalizing to unseen reaching examples. Motions obtained with and without replanning are shown in Figure 1.

REFERENCES

[1] T. Flash and N. Hogan, “The coordination of arm movements: an experimentally confirmed mathematical model,” The Journal of Neuroscience, vol. 5, no. 7, pp. 1688–1703, 1985.
[2] E. Burdet, R. Osu, D. W. Franklin, T. E. Milner, and M. Kawato, “The central nervous system stabilizes unstable dynamics by learning optimal impedance,” Nature, vol. 414, no. 6862, pp. 446–449, 2001.
[3] E. Todorov and M. I. Jordan, “Optimal feedback control as a theory of motor coordination,” Nature Neuroscience, vol. 5, no. 11, pp. 1226–1235, 2002.
[4] G. Wu et al., “ISB recommendation on definitions of joint coordinate systems of various joints for the reporting of human joint motion – Part II: shoulder, elbow, wrist and hand,” Journal of Biomechanics, vol. 38, no. 5, pp. 981–992, 2005.
[5] M. Kalakrishnan, S. Chitta, E. Theodorou, P. Pastor, and S. Schaal, “STOMP: Stochastic trajectory optimization for motion planning,” in ICRA, 2011.
[6] P. Abbeel and A. Y. Ng, “Apprenticeship learning via inverse reinforcement learning,” in ICML, 2004.
[7] M. Kalakrishnan, P. Pastor, L. Righetti, and S. Schaal, “Learning objective functions for manipulation,” in ICRA, 2013.
