An Interpretive Model of Hand-Eye Coordination
Tom Erez, Washington University in St. Louis

Overview Normal cognitive function requires coordination between perception and action. However, both action and perception pose enormous challenges to our scientific understanding, and as a result they are often studied separately, with experimental paradigms carefully crafted to tease them apart (e.g., gaze fixation). We present an interpretive model that addresses both faculties within a single framework. A simplified task of hand-eye coordination is modeled as a stochastic optimal control problem, and the solution is a coordinated motion of eye and hands, derived from first principles. Our numerical results reveal a rich behavioral repertoire, with both smooth pursuit and saccades emerging as components of the optimal solution to the eye-hand control task.

The Model We present an abstract model of a reaching task: two free-moving points (“hands”) move on a two-dimensional plane (the “scene”), which includes four obstacles and a target (depicted in figure 1). The hands start at either side of the bottom of the scene, and at a fixed final time they are penalized in proportion to their distance from the target, located at the top-middle of the scene. Each hand moves through a pair of obstacles on its path to the goal. The linear dynamics of the hands are subject to constant process noise, and the agent is provided with noisy observations of the state of the scene. State estimation and feedback control of the hands are therefore necessary to complete the task. The positions of both obstacles and target are fixed, but the learning agent starts with an uncertain estimate of these positions, and so they have to be observed and estimated as well.

The eye is involved in the observation process: it provides information that allows for better state estimation, thereby improving feedback control and so contributing to task performance. Motivated by this abstraction, we model the eye as a “point of gaze”, a free-moving point in the scene. The effect of this gaze is a local modulation of the observation noise: if an object (hand, obstacle or target) is close to the point of gaze, the noise perturbing the observation of the object’s position is reduced (this reduction being a Gaussian function of distance from the gaze point, as illustrated in figure 2B). Therefore, directing the gaze at an object allows the learning agent to form a better estimate of its position, which yields more accurate feedback control of the hands and hence better overall performance.
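The gaze-dependent observation noise described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name, the multiplicative form of the reduction, and all parameter values (`baseline_var`, `reduction`, `fovea_width`) are assumptions chosen for illustration. The text only states that the noise reduction is a Gaussian function of distance from the gaze point.

```python
import math

def observation_noise_variance(obj_pos, gaze_pos,
                               baseline_var=1.0,   # assumed noise variance far from the gaze
                               reduction=0.95,     # assumed fractional reduction at the gaze point
                               fovea_width=0.2):   # assumed width (std. dev.) of the Gaussian "fovea"
    """Variance of the noise on an object's observed position, given the gaze point."""
    dx = obj_pos[0] - gaze_pos[0]
    dy = obj_pos[1] - gaze_pos[1]
    dist_sq = dx * dx + dy * dy
    # Gaussian-shaped reduction: largest at the gaze point, vanishing far from it.
    attenuation = reduction * math.exp(-dist_sq / (2.0 * fovea_width ** 2))
    return baseline_var * (1.0 - attenuation)

gaze = (0.0, 0.0)
at_gaze = observation_noise_variance((0.0, 0.0), gaze)  # object exactly at the gaze point
near = observation_noise_variance((0.1, 0.0), gaze)     # object inside the fovea
far = observation_noise_variance((1.0, 0.0), gaze)      # object in the periphery
print(at_gaze, near, far)
```

Under these assumed values the variance grows monotonically with distance from the gaze point and approaches the baseline in the periphery, which is the qualitative shape described for figure 2B.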

Solution Since the observation noise is state-dependent (and hence not Gaussian), estimation requires a nonlinear filter (we employ the Extended Kalman Filter); furthermore, since the cost is not quadratic (due to the obstacles), the Stochastic Optimal Control (SOC) problem cannot be solved using standard Linear-Quadratic-Gaussian techniques. Instead, we solve the SOC problem using Minimax Differential Dynamic Programming (the complete mathematical description was published in [1]; see references therein). The optimal solution is a feedback policy for controlling the positions of both hands and eye through the scene, and it exhibits both saccades (fig. 3) and smooth pursuit (fig. 4). To make this high-dimensional optimization problem tractable, we shape the solution by gradually reducing the width of the eye’s “fovea” (the width of the Gaussian inside which observation noise is reduced): at first, a wider fovea allows the agent to detect and learn which areas of the scene are most relevant at every stage; eventually, a narrower fovea leads to more distinct eye motions.
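To make the role of the gaze in estimation concrete, here is a hedged Python sketch of a single Kalman-style measurement update for one static object, with the observation noise variance evaluated at the current position estimate (the kind of linearization around the estimate that an Extended Kalman Filter relies on). It is only an illustration under simplifying assumptions: an isotropic scalar variance per object, a direct position observation, and hypothetical names and parameter values throughout. The geometric `fovea_schedule` is likewise an assumption; the text says only that the fovea width is reduced gradually. The Minimax Differential Dynamic Programming solver itself is described in [1] and is not reproduced here.

```python
import math

def noise_variance(est_pos, gaze, fovea_width, baseline_var=1.0, reduction=0.95):
    """Assumed noise model: Gaussian-shaped reduction around the gaze point."""
    d2 = (est_pos[0] - gaze[0]) ** 2 + (est_pos[1] - gaze[1]) ** 2
    return baseline_var * (1.0 - reduction * math.exp(-d2 / (2.0 * fovea_width ** 2)))

def measurement_update(est_pos, est_var, observation, gaze, fovea_width):
    """One update of a position estimate with isotropic variance est_var."""
    # The noise variance is evaluated at the estimate, since the true position is unknown.
    r = noise_variance(est_pos, gaze, fovea_width)
    gain = est_var / (est_var + r)  # scalar Kalman gain
    new_pos = tuple(m + gain * (z - m) for m, z in zip(est_pos, observation))
    new_var = (1.0 - gain) * est_var
    return new_pos, new_var

def fovea_schedule(start_width, end_width, stages):
    """Geometrically shrinking fovea widths (assumed schedule; requires stages >= 2)."""
    ratio = (end_width / start_width) ** (1.0 / (stages - 1))
    return [start_width * ratio ** k for k in range(stages)]

estimate, variance, observation = (0.5, 0.5), 1.0, (0.6, 0.4)
# Same observation, two gaze choices: looking at the object vs. looking elsewhere.
_, var_looking = measurement_update(estimate, variance, observation, (0.5, 0.5), 0.2)
_, var_away = measurement_update(estimate, variance, observation, (-1.0, -1.0), 0.2)
widths = fovea_schedule(1.0, 0.125, 4)
print(var_looking, var_away, widths)
```

With these assumed numbers, one observation shrinks the uncertainty far more when the gaze rests on the object than when it is directed elsewhere, which is why an optimal controller has a reason to move the eye at all.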

Conclusion The purpose of this model is interpretive [2]: to explore the behavioral repertoire that emerges from computational principles, without relying on heuristics or other domain-specific assumptions. While it may overlook many of the particular complexities of motor control (inertia, redundancy, etc.) and visual perception (interpretation of visual information, coordinate transformation, etc.), the results demonstrate mutually responsive motion patterns for both gaze and hands, directly addressing the coupling between perception and action.

[Figure residue: plot of observation noise variance (vertical axis) against distance from gaze point (horizontal axis); cf. figure 2B.]

Figure 1 The model consists of two hands (blue and red diamonds) that start at the bottom of the scene, and are required to reach the target position (triangle) at a fixed final time. The eye starts at the center, and is free to move about the scene. The obstacles are represented by black Gaussian blurs: the hands incur a Gaussian penalty for coming too close to the obstacles. The positions of both obstacles and target are fixed, but the learning agent receives only uncertain observations of these positions, and has to use state estimation to disambiguate them. Overall, the system’s state consists of the estimated positions of hands, obstacles and target, the eye position (which is always known accurately), and an estimate of the estimation uncertainty for all these quantities.

Figure 2 The eye is modeled by its point of gaze, and the observation noise is locally reduced around this point. This results in foveated visual perception (when the Gaussian is wide, as in A) or even tunnel vision (when the Gaussian is narrow, as in B). A: A graphic illustration of the eye’s effect. The gaze point is at the center, and elements of the scene that are farther away from this point are subjected to greater observation noise. B: Observation noise as a function of distance from the gaze point. The baseline observation noise is high, so observation of objects outside the fovea is unreliable, diminishing the effect of peripheral vision. Optimal behavior was shaped by gradually reducing the width of the fovea, as described in the text.

Figure 3 When proprioception acts as an independent (and reliable) channel of state observation, state estimation is mostly needed for determining the position of the objects of the scene – obstacles and target. This figure shows the optimal motion through five snapshots at different stages of task performance, with the hands (diamonds) and eye leaving traces (dashed lines) along their past trajectories, to clarify the motion pattern. The elements of the scene are shaded according to the estimated uncertainty of the state estimation (lighter color means more confident estimate). Note that although the obstacles are represented as circles, the cost they contribute is still Gaussian, as described in figure 1. The eye starts by saccading to the left pair of obstacles (A) – although further away, this is beneficial for the overall task (as the left hand will soon reach these obstacles). The eye then saccades to the right pair of obstacles (B), reducing uncertainty regarding their estimated position (note how their shade changes from A to C). Finally, the eye turns to the target, disambiguating its position as both hands converge there as well (D-E).

Figure 4 When proprioception is diminished, the eye’s support is needed to disambiguate the hands’ position at mission-critical stages of the task. The eye first saccades to the bottom left and escorts the left hand through the left pair of obstacles (A). It then saccades towards the right hand (B), meeting it in time to escort it through its respective pair of obstacles (C). Since uncertainty has been accumulating regarding the position of the left hand (note the darker shade of the left diamond in C), the eye saccades again to the left hand (D), and then positions itself between both hands as they both approach the goal (E).

References:
[1] Erez, T. and W. D. Smart, “Coupling perception and action using minimax optimal control”, in Proceedings of IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL), 2009, pp. 58-65.
[2] Dayan, P. and L. F. Abbott, Theoretical Neuroscience, MIT Press, 2001, p. xiii.
