Help Me Help You: Interfaces for Personal Robots

Ian J. Goodfellow, Nate Koenig, Marius Muja, Caroline Pantofaru, Alexander Sorokin, Leila Takayama
Willow Garage, Inc., Menlo Park, California, USA
Email: {goodfellow, nkoenig, mariusm, pantofaru, sorokin, takayama}@willowgarage.com

Index Terms—HRI, mobile user interface, information theory

I. RESEARCH PROBLEM AND A PROPOSAL

The communication bottleneck between robots and people [1] presents an enormous challenge to the human-robot interaction community. Rather than exclusively focusing on improving robot object learning, task learning, and natural language understanding, we propose also designing interfaces that make up for low communication bandwidth by thoughtfully accounting for the constrained capabilities of robots [2]. People are adept at compensating for communication limitations, changing their communicative strategies for talking to pets, babies [3], foreigners [4], and robots [5]. Communicative accommodation already exists. Thus, robots need not perfectly understand natural language, gestures, etc.; there is a wide variety of research and design to be done in the space of alternative communicative modalities.

We propose to approach this problem by accounting for limitations in robot abilities and taking advantage of already familiar human-computer interaction models, leveraging a communication model based upon Information Theory. Using this design perspective, we present three different mobile user interfaces that were fully developed and implemented on a PR2 (Personal Robot 2) [6] for task domains in navigation, perception, learning and manipulation.

II. RELEVANT THEORIES

We can observe parallels between human-robot interaction and the interaction between humans and general complex autonomous systems. Sheridan’s taxonomy of complex human-machine systems describes the following sequence of operations: (1) acquire information, (2) analyze and display information, (3) decide on an action, and (4) implement that action [7, p. 61]. This provides the groundwork for identifying the stages at which people and/or robots should lead. In the current projects, the personal robot autonomously completes steps 1, 2 and 4, and the person completes step 3. Thus, the user interface design must address how the robot analyzes and displays its sensor information and world model to the human, and how the human can effectively communicate desired actions to the robot. An analysis of our case studies in Sheridan’s framework is displayed in Fig. 1. All of our systems use the trading model of alternately passing control back and forth between human and robot, as opposed to the sharing model of simultaneous control described in [7, p. 63].
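The four-stage sequence and the trading of control described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used on the PR2: the names `Stage`, `LEADER`, and `run_cycle`, and the stand-in callables for the navigation project, are all hypothetical.

```python
from enum import Enum


class Stage(Enum):
    ACQUIRE = 1
    ANALYZE_AND_DISPLAY = 2
    DECIDE = 3
    IMPLEMENT = 4


# Stage leaders as stated in the text: the robot handles stages 1, 2 and 4,
# and the person handles stage 3.
LEADER = {
    Stage.ACQUIRE: "robot",
    Stage.ANALYZE_AND_DISPLAY: "robot",
    Stage.DECIDE: "human",
    Stage.IMPLEMENT: "robot",
}


def run_cycle(acquire, display, decide, implement):
    """Run one pass through the four stages, trading control at stage 3.

    Returns the outcome of stage 4 and a trace of (stage, leader) pairs.
    """
    trace = []

    observation = acquire()  # robot senses its environment
    trace.append((Stage.ACQUIRE, LEADER[Stage.ACQUIRE]))

    options = display(observation)  # robot presents choices to the person
    trace.append((Stage.ANALYZE_AND_DISPLAY, LEADER[Stage.ANALYZE_AND_DISPLAY]))

    choice = decide(options)  # control passes to the person
    if choice not in options:
        raise ValueError(f"{choice!r} is not one of the offered options")
    trace.append((Stage.DECIDE, LEADER[Stage.DECIDE]))

    outcome = implement(choice)  # control returns to the robot
    trace.append((Stage.IMPLEMENT, LEADER[Stage.IMPLEMENT]))
    return outcome, trace


# Hypothetical stand-ins for the navigation project.
outcome, trace = run_cycle(
    acquire=lambda: {"kitchen": (1.0, 2.0), "office": (4.0, 0.5)},
    display=lambda labeled_map: sorted(labeled_map),
    decide=lambda options: options[0],
    implement=lambda room: f"arrived at {room}",
)
print(outcome)
print([leader for _, leader in trace])
```

Because the person can only choose among the options the robot displays, every decision made at stage 3 is one the robot can carry out at stage 4.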

Gold [1] proposed using an Information Pipeline model for HRI that is based upon information theory [8], a mathematical model of communication developed for quantifying the amount of information that could be transported through a given channel. Schramm [9] developed a theory of communication that put these ideas into the context of two-way joint communications. This could be helpful when considering the large amount of overhead involved in encoding and decoding messages sent between people and robots. The focus of the projects in this paper was on designing interfaces that applied this theory to human-robot communication. When a robot encodes messages in a way that humans can understand and humans encode messages in a way that robots can understand, communication becomes easier and more effective.

III. THE DESIGN SPACE AND THREE UIs

The personal robot platform used throughout these projects is the PR2, and the robot behaviors are built using the Robot Operating System (ROS) [10]. The PR2 is a mobile robot standing approximately 5 feet high. It drives using casters in its base, has two long arms for manipulation, and has numerous sensors for perception. It is designed to operate in any environment that is compliant with the Americans with Disabilities Act.

1) Navigation: The first task a personal robot will be asked to do is drive to a given location. The PR2 is capable of navigating a complicated, cluttered environment, but a person needs to tell it where to go. With a labeled map, a user can select a location from a set of options. By installing a PBX server on the PR2 and using DTMF codes, this prototype system allowed users to call the robot and tell it where to go with key presses corresponding to a menu (with items such as “drive to destination”), a language both the user and robot understand. The robot then executes the task and calls the user back to inform them of task completion. See Project 1, Figure 1 for an outline.
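The key-press menu described above can be sketched as a small decoder. This is an illustration under stated assumptions, not the prototype's code: the menu layout, the room names, and the functions `spoken_menu`, `decode_keys`, and `completion_message` are hypothetical, and the PBX server and telephony layer are not modeled.

```python
from typing import Optional, Tuple

# Hypothetical top-level menu: the first key press selects an action.
ACTIONS = {"1": "drive to destination"}

# Hypothetical second-level menu: the second key press selects a labeled map location.
DESTINATIONS = {"1": "kitchen", "2": "conference room", "3": "lobby"}


def spoken_menu(options: dict) -> str:
    """Build the prompt the robot would read out to the caller."""
    return " ".join(f"Press {key} for {label}." for key, label in sorted(options.items()))


def decode_keys(keys: str) -> Optional[Tuple[str, str]]:
    """Decode two DTMF key presses into an (action, destination) command.

    Returns None for anything outside the menu, so the robot only ever
    receives commands it knows how to execute.
    """
    if len(keys) != 2:
        return None
    action = ACTIONS.get(keys[0])
    destination = DESTINATIONS.get(keys[1])
    if action is None or destination is None:
        return None
    return action, destination


def completion_message(command: Tuple[str, str]) -> str:
    """Text for the call-back that reports task completion."""
    action, destination = command
    return f"Finished: {action} ({destination})."


print(spoken_menu(DESTINATIONS))
command = decode_keys("12")
print(command)
print(decode_keys("19"))  # key 9 is not in the destination menu
```

A fixed menu keeps the message space small: each key press maps to exactly one command, so no speech recognition or language understanding is needed on the robot side.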
Although we focused on the task of driving to a destination room, this framework facilitates a variety of interactions, such as calling a user when the battery is low.

2) Perception: Another common scenario involves the user asking the robot to fetch objects, e.g., a drink. The location in the house of the general object class of beverages is known, and the PR2 is capable of finding bottles on a flat shelf. However, identifying the correct flavor is difficult, especially if the drink selection changes. The robot may never have seen mango juice before, may be unable to tell it apart from orange juice, and does not know what name to assign to each bottle. To bridge the gap, the robot photographs the available bottles, remembers their locations, and presents the photos to the user,

Fig. 1. Summary of the three projects, divided according to Sheridan’s four stages of complex human-machine tasks [7]. The modality of each exchange is given in brackets.

Project 1: Navigation
(1) Acquire information: Robot gets map.
(2) Analyze and display: Robot localizes itself in the environment. Requests command from user [auditory].
(3) Decide upon action: User decides where robot should go. Tells robot where to go [key press].
(4) Implement action: Robot navigates to the requested location. Informs user of progress [auditory].

Project 2: Object Selection
(1) Acquire information: Robot scans environment with cameras.
(2) Analyze and display: Robot detects drinks and displays camera images via a GUI. Requests command from user [graphical].
(3) Decide upon action: User decides which drink to request. Selects image of desired drink [touch screen].
(4) Implement action: Robot fetches and delivers desired drink to user. Delivers drink to user [physical].

Project 3: Manipulation and Learning
(1) Acquire information: Robot scans environment with cameras and lasers.
(2) Analyze and display: Robot detects posts and identifies disks. Requests command from user [textual].
(3) Decide upon action: User decides next game move. Selects move from menu [touch screen].
(4) Implement action: Robot performs the next move. Shows progress to user [video].

Focus
(1) Acquire information: Improving the design of human-robot interfaces.
(2) Analyze and display: Design requests that are easy for the human to understand.
(3) Decide upon action: Design commands that are easy for the robot to understand.
(4) Implement action: Design simple feedback for the human.

who then taps the touchscreen to select the desired bottle image. This interface is outlined in Project 2, Figure 1. The key idea is that communication through pictures avoids the need to abstract ideas into language (e.g., [11]).

3) Learning and Manipulation: Although a personal robot deployed in a home will be pre-programmed to do many tasks, there will be unforeseen situations requiring new behaviors. This system allows a person to teach a robot a new hierarchical task composed of basic pre-programmed actions. The user interface is crucial because the robot is encountering a new situation that it cannot describe. Instead, the robot suggests action primitives that it knows how to perform, guaranteeing that the human instructions are feasible. Thus the human comprehends the robot’s capabilities and formulates the high-level action sequence that the robot lacks. This prototype uses a mobile web interface to provide a selection of known action primitives to the user, and images from the robot’s camera to provide feedback, as outlined in Project 3, Figure 1. With this interface, humans instructed a robot on how to solve the Towers of Hanoi puzzle. A user study of twenty end-users showed that this UI design eliminated the need for a long instructional period.

IV. IMPLICATIONS FOR DESIGN AND FUTURE WORK

These case studies represent a variety of task domains and system designs, but all UIs leverage existing technologies and established human-computer interactions. From the experience of designing, implementing, and using these systems, we learned several lessons that inform future designs. First, consider the task, its constraints, and the available information. For example, since the PR2 knew office numbers and their locations, DTMF codes were sufficient for communication. Second, consider the limitations of the robot in order to divide work between the human and robot appropriately. Since the PR2 can detect bottles, but not specific juice flavors, it found and photographed the bottles while the customer recognized the images. Finally, design user interfaces that support this division of labor. Because the Towers of Hanoi task only required a constrained set of manipulation subtasks, the interface presented a constrained action-object (verb-noun) model of communication instead of natural sentences.

The core requirement for a personal robot is to provide service to its user, and so a key capability for such a robot is to understand and communicate with its user. Through three case studies we have shown how effective communication accommodates both robot and human abilities, making difficult communication tasks possible with current technologies. By using mobile web and phone technology, all three interfaces are available for already pervasive platforms. The source code for these projects is available at: http://code.ros.org.

REFERENCES

[1] K. Gold, “An information pipeline model of human-robot interaction,” in HRI, 2009.
[2] Y. Fernaeus, M. Jacobsson, S. Ljungblad, and L. Holmquist, “Are we living in a robot cargo cult?” in HRI, 2009.
[3] D. Burnham, C. Kitamura, and U. Vollmer-Conna, “What’s new, Pussycat? On talking to babies and animals,” in Science, 2002.
[4] R. Scarborough, J. Brenier, Y. Zhao, L. Hall-Lew, and O. Dmitrieva, “An acoustic study of real and imagined foreigner-directed speech,” in International Congress of Phonetic Sciences, 2007.
[5] A. Batliner, S. Biersack, and S. Steidl, “The prosody of pet robot directed speech: Evidence from children,” in Proc. Speech Prosody, 2006.
[6] Willow Garage, “Personal Robot 2 (PR2),” www.willowgarage.com.
[7] T. B. Sheridan, Humans and Automation. John Wiley, 2002.
[8] C. Shannon and W. Weaver, Mathematical Theory of Communication. University of Illinois Press, 2002.
[9] W. Schramm, The Process and Effects of Mass Communication. University of Illinois Press, 1955.
[10] M. Quigley, K. Conley, B. Gerkey, J. Faust, T. Foote, J. Leibs, R. Wheeler, and A. Ng, “ROS,” in ICRA, 2009.
[11] M. Franklin, The Universal Phrasebook. Sterling, 2005.
