
1 Introduction

As robots become commonplace in a variety of domains ranging from manufacturing to the military, there has been growing interest in the development of intelligent robots that can support humans not only as tools, but also as teammates. To be a competent teammate, e.g., to perform the screening mission illustrated in Fig. 1, a robot needs basic cognitive abilities, including perceiving the semantics of its environment, reasoning about spatial relationships, and communicating in natural language. In this context, while the subfields of robotics and artificial intelligence have been extensively evaluated according to standard metrics accepted within each research community, little work to date has gauged the current state of the art for an intelligent robot with such cognitive abilities. For example, the computer vision community has mainly focused on improving performance on benchmark data sets as opposed to addressing the types of real-world challenges faced in robotics [12, 20]. As a result of such disconnects, the majority of existing work in intelligent (or cognitive) robotics includes simplifying assumptions, e.g., ideas are verified in simulated environments, or a robot's perception is assumed to be perfect or is simplified so that intelligence can be measured without errors due to imperfect perception [6, 9, 10, 17, 21]. In our work, we aim to assess where the technology stands, and where the technology gaps lie, in the development of an intelligent robot teammate by integrating the various pieces of technology needed for a robot to perform tactical behaviors autonomously, without adding simplifying assumptions. In this paper, we focus on semi-urban outdoor navigation and search behavior.

Fig. 1.

An example showing a Clearpath\(^\mathrm{TM}\) Husky unmanned ground vehicle working with a human teammate on a screening mission in an unknown environment.

Toward this goal, we develop an intelligence architecture and integrate the relevant technologies, including state-of-the-art perception modules, on a robot platform to assess robot intelligence at the tactical behavior level. Specifically, the capabilities that have been integrated to support intelligence are the following: (1) a multi-modal interface to support rich interaction with humans*, (2) a semantic world model, (3) high-level mission planning, (4) object detection*, (5) door detection*, (6) human detection and tracking*, (7) scene classification, (8) building (stuff) detection, (9) object prediction beyond sensor range, (10) natural language grounding*, (11) object symbol grounding, (12) (global and local) path planning, (13) imitation learning for navigation modes, and (14) an interaction layer for mobile robots. We note that the architecture builds on our prior work [19], augmented with new capabilities (marked with *). We describe our approach and share the lessons we have learned from a recent assessment.

Fig. 2.

An architectural diagram of integrated intelligences for human-robot teams.

2 Technical Approach

Figure 2 shows an architectural diagram of our intelligence system for a robot teammate. In this section, we briefly illustrate how various modules contribute and interact within this architecture to support high-level robot intelligence.

Because our goal is focused on robots that can work with humans, it is important that robots be able to communicate in ways that are natural and efficient to humans. In our system, the interaction between a robot and a human is supported by a Multi-Modal Interface (MMI). Using this interface, a human teammate can issue commands through natural language speech and hand gestures, and review the robot’s reasoning process via annotated camera images and semantic maps.

The world model is a central store of information accumulated and merged from the various modules. The information stored in a world state includes robot pose data, sensor data, semantic objects, multi-layered cost maps, commands, and the status of the various actions. The world model supports a query interface that the modules use to look up relevant information.

The mission planner takes a command and reasons about the pre- and post-conditions of available actions to find a plan that will accomplish the task specified by the command. For instance, given the command "Screen the back of the building," a set of actions needs to be performed in sequential order; i.e., the robot needs to navigate to the back of the building, locate a door in the back of the building, and then monitor the area near the door to report upon anyone's egress from the building.
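
For concreteness, the following sketch (written in Python for exposition only, not the deployed planner) illustrates how such a command could be decomposed into an ordered action sequence; the Action class, the action names, and the parameter fields are illustrative assumptions rather than the system's actual TBS schema.

    from dataclasses import dataclass, field
    from typing import Dict, List

    @dataclass
    class Action:
        name: str                        # e.g., "navigate", "search", "observe"
        params: Dict[str, str] = field(default_factory=dict)

    def plan_screen(target: str, region: str) -> List[Action]:
        """Decompose a screening command into a sequential plan."""
        return [
            Action("navigate", {"goal": f"{region} of {target}"}),
            Action("search", {"object": "door", "region": f"{region} of {target}"}),
            Action("observe", {"area": "near detected door", "report": "egress"}),
        ]

    # Example: "Screen the back of the building."
    for step in plan_screen("the building", "back"):
        print(step.name, step.params)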

The core of the intelligence system consists of perception, prediction, and language understanding. These units contribute to the robot's understanding of its environment and enable it to interpret and execute a given natural language command. The perception module translates raw data from the robot's sensors into semantically meaningful information (e.g., via a semantic scene classifier, an object detector, a door detector, and a human detector). The prediction module enables the robot to infer a world model for the unseen parts of the environment, effectively compensating for limitations in the robot's sensing range (as well as possible perception errors) by using prior information about object models or descriptions of objects specified in the natural language command. The language understanding module translates a spoken utterance into a structured representation, known here as a Tactical Behavior Specification (TBS) [19], that formally represents the task and its constraints, and computes symbol grounding results [3]. Together, these modules enable the robot to robustly perform complex tasks in an unknown environment.

2.1 Human-Robot Interface (HRI)

The effectiveness of human-robot teams is intrinsically linked to the efficiency of bi-directional communication. Robots must be able to transform human forms of expression (e.g., language and gesture) into a meaningful representation, and communicate their understanding and actions to humans, in order to share a cognitive model of mission goals and objectives. To address these challenges, we developed an MMI based on a Toughpad FZ-M1 tablet (Fig. 3). This device enables a human teammate to command the robot through a combination of speech and gestures and to receive robot status from the visual display and auditory cues. The MMI represents instructions to the intelligence architecture using the TBS lexicon.

Fig. 3.

An illustration of the Multi-Modal Interface (MMI) for human-robot interaction. The MMI accepts input in the form of speech and/or gesture and visualizes the state of the intelligence architecture. The MMI Visual Display illustrates a "screen the back of the building" command. The robot status shown in the COMMANDS and STATUS sections indicates that the command is still running and the robot is currently searching.

Grounding natural language to a TBS in the MMI is performed by the Hierarchical Distributed Correspondence Graph (HDCG) [2, 4, 11]. This model performs inference over a pair of graphical models to efficiently translate natural language into a TBS command. The first graphical model infers a set of rules used to construct a more efficient representation of the second graphical model, which in turn infers a distribution over the physical meaning of each phrase. To characterize the performance of the HDCG in this application, we measured the average run-time of symbol grounding for the natural language expression "screen the back of the building that is behind the car." Over 100 queries on a MacBook Pro with a 2.6 GHz Intel Core i7 processor, the model required 0.131 s on average to correctly translate the expression into a valid TBS command.

2.2 Common World Model (CWM)

The Common World Model (CWM) [5] defines and instantiates the data model for the intelligence architecture, providing common, centralized intelligent data storage services. The world model is divided into three main concepts: Metric (sensor data and aggregates), Semantic (class descriptions and instances), and Self Information (data relative to the robot, e.g., pose). At the Semantic level, objects represent symbolic information, enabling the abstract reasoning needed for intelligent behavior. The CWM maintains semantic information from the perception modules and provides methods for client modules, e.g., the navigate action, to search for semantic objects that are relevant to a specific mission context using a set of filtering criteria.
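
A minimal sketch of such a filtered semantic query is given below; the class and method names (SemanticObject, WorldModel.query) are illustrative assumptions and not the CWM's actual API.

    from dataclasses import dataclass
    from typing import Callable, List, Optional, Tuple

    @dataclass
    class SemanticObject:
        label: str                       # e.g., "building", "car", "door"
        position: Tuple[float, float]    # 2D map coordinates
        confidence: float

    class WorldModel:
        def __init__(self) -> None:
            self._objects: List[SemanticObject] = []

        def insert(self, obj: SemanticObject) -> None:
            self._objects.append(obj)

        def query(self, label: Optional[str] = None,
                  predicate: Optional[Callable[[SemanticObject], bool]] = None
                  ) -> List[SemanticObject]:
            """Return stored objects matching a label and/or arbitrary filter criteria."""
            results = self._objects
            if label is not None:
                results = [o for o in results if o.label == label]
            if predicate is not None:
                results = [o for o in results if predicate(o)]
            return results

    # Example: the navigate action looking up confident building detections.
    wm = WorldModel()
    wm.insert(SemanticObject("building", (12.0, 4.5), 0.9))
    buildings = wm.query(label="building", predicate=lambda o: o.confidence > 0.5)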

2.3 Mission Planner

The goal of the mission planner is to take commands in the mission vernacular from a teammate (via the MMI) and convert them into a sequence of actions (TBSs). We leverage recent work in ACT-R [1] on models of instruction following in the form of decision graphs, where the decisions themselves are made based on examples of past decisions in the form of Instance-Based Learning [13]. This research uses a single model of decision-making in which additional instructions and examples can be included in the system in the form of "chunks" (ACT-R representations of semantic information). The goal of this new model is to provide increased flexibility in adding new examples, which in turn allows the model to plan for new missions and to combine generalizations from multiple examples.

Fig. 4.

Examples of object detection. Final detections are shown as red solid rectangles and rejected false positives as blue dashed rectangles.

2.4 Perception

We first describe the four sensor-based perception modules in our system. Perception through prediction is described in Sect. 2.5.

Semantic Classifier. An online scene labeler is used to find buildings, vehicles, traffic barrels, and fire hydrants, and to classify background, e.g., trees, asphalt, concrete, gravel, or grass, as shown in Fig. 6. Our approach builds on the Hierarchical Inference Machine [18], a scene labeling method that decomposes an image into a hierarchy of nested superpixel regions. Rather than perform inference on a graphical model, which can be expensive, we train a decision forest regressor with 10 trees on a segmentation hierarchy of depth 7 to predict the label distribution. We use SIFT [16], LAB colorspace statistics, and texture information derived from convolving the image with a bank of spatial filters, in addition to statistics on the size and shape of each superpixel region. We process a \(640 \times 384\) image in approximately 2 s on a dedicated quad-core i7-3615QM at 2.3 GHz, with feature extraction being the dominant cost.
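
As a rough illustration of the regression step only (feature extraction and the superpixel hierarchy are omitted), the sketch below trains a 10-tree decision forest to regress per-region label distributions; the data and dimensions are synthetic placeholders, not taken from the system.

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    n_regions, n_features, n_classes = 500, 64, 9             # illustrative sizes only
    X = np.random.rand(n_regions, n_features)                  # per-superpixel feature vectors
    Y = np.random.dirichlet(np.ones(n_classes), n_regions)     # target label distributions

    forest = RandomForestRegressor(n_estimators=10)            # 10 trees, as in the text
    forest.fit(X, Y)

    # Predicted vectors approximate class distributions; renormalize to be safe.
    pred = np.clip(forest.predict(X[:1]), 0.0, None)
    pred /= pred.sum(axis=1, keepdims=True)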

Object Detector. We employ the Active Deformable Part Models (ADPM) method [23] for on-board object detection. ADPM is an accelerated DPM that dynamically schedules parts and prunes locations in a cascade framework. With the current MATLAB/C++ implementation, ADPM simultaneously detects 5 classes on a 10 MP image at 0.5 Hz on a modern CPU. ADPM employs a sliding window approach at multiple image scales to detect objects at different positions and distances. To reduce the number of false positives, the detection hypotheses are further pruned using LADAR measurements, as shown in Fig. 4.
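
The sketch below illustrates one plausible form of such LADAR-based pruning, assuming each detection box can be associated with a median LADAR range: boxes whose implied metric height is implausible for their class, or that have no supporting returns, are discarded. The height thresholds, focal length, and pinhole model are illustrative assumptions, not the system's actual pruning rule.

    from typing import Dict, List, Tuple

    PLAUSIBLE_HEIGHT_M: Dict[str, Tuple[float, float]] = {
        "person": (1.2, 2.1),
        "traffic_barrel": (0.6, 1.3),
        "car": (1.2, 2.2),
    }

    def physical_height(box_h_px: float, range_m: float, focal_px: float) -> float:
        """Pinhole-camera estimate of an object's metric height from its pixel height."""
        return box_h_px * range_m / focal_px

    def prune(detections: List[dict], focal_px: float = 1400.0) -> List[dict]:
        """Each detection: {"cls": str, "box_h_px": float, "range_m": float or None}."""
        kept = []
        for det in detections:
            if det["range_m"] is None:            # no supporting LADAR returns
                continue
            h = physical_height(det["box_h_px"], det["range_m"], focal_px)
            lo, hi = PLAUSIBLE_HEIGHT_M.get(det["cls"], (0.0, float("inf")))
            if lo <= h <= hi:
                kept.append(det)
        return kept

    # Example: a 150 px tall "person" box at 12 m range is ~1.3 m tall and is kept.
    print(prune([{"cls": "person", "box_h_px": 150, "range_m": 12.0}]))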

Fig. 5.

Examples of door detections are shown. Façade detection and door candidates are shown on the left. Final detection output is shown on the right.

Door Detection. Detecting doors poses a unique challenge because doors undergo severe perspective distortion under different viewpoints. Based on the intuition that a door should appear as a rectangle from a frontal (canonical) viewpoint, each façade candidate is mapped to the image domain according to the known calibration of each sensor. We preprocess each candidate façade for door detection as follows: façade regions in the image are rectified using the estimated plane orientation in 3D and resized to a fixed scale such that the rectified façades are (virtually) observed at a fixed distance. This canonicalization eliminates the pose and scale variation of doors in the façades. A Deformable Part Model based door detector [8] is then applied to the rectified façades. Since the façades are standardized to a canonical view and a fixed distance, searching a single scale is sufficient and detection can be performed online, as seen in Fig. 5.
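
A minimal sketch of the rectification step is given below, assuming the façade's image corners have already been obtained from the 3D plane fit and sensor calibration; the fixed pixels-per-meter scale is an illustrative stand-in for the fixed virtual viewing distance, and the function is not the system's implementation.

    import cv2
    import numpy as np

    def rectify_facade(image, corners_px, facade_w_m, facade_h_m, px_per_m=100.0):
        """Warp a facade quadrilateral to a canonical fronto-parallel view.
        corners_px: 4x2 array of facade corners in the image, ordered
        top-left, top-right, bottom-right, bottom-left."""
        out_w = int(round(facade_w_m * px_per_m))
        out_h = int(round(facade_h_m * px_per_m))
        dst = np.float32([[0, 0], [out_w - 1, 0],
                          [out_w - 1, out_h - 1], [0, out_h - 1]])
        H = cv2.getPerspectiveTransform(np.float32(corners_px), dst)
        # Doors now appear upright at a known scale, so a single-scale detector suffices.
        return cv2.warpPerspective(image, H, (out_w, out_h))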

Human Detection. One of the main objectives of the human-robot team is to identify potential human threats, which feeds directly into the observe action as the architecture is currently laid out. A tree-structured Deformable Part Model [22] was chosen as the state-of-the-art algorithm for this task. Given a rectified image, the algorithm reports the locations of 26 individual parts for each detected person. Our contribution is to port the feature pyramid processing code to run on an FPGA or GPU, while the rest of the code runs as a module on a separate laptop. Using the current system architecture, streaming \(1020 \times 768\) images from the camera and processing all scales, the human detection algorithm runs at 0.5 Hz. Additional LADAR processing is included within the observe action to better discriminate humans from other arbitrary objects.

2.5 Object Prediction

In addition to the approaches that use actual sensors to detect objects or humans in the robot's environment, we also utilize language inputs to perceive objects, primarily in the part of the environment that the robot has not directly explored. The current approach hypothesizes an object when two conditions are met: symbol grounding fails to map a symbol to an object in the world model; and there are areas that satisfy the spatial constraints but have not been explored by the robot. Given a language phrase l that describes a target object with spatial constraints relative to a reference object o, we sample a set of candidate locations from a discretized 2D map defined in \(X \times Y\) space. A predicted object is created at the unseen location \((x,y)\) that best satisfies the given spatial constraints: \((x,y) = \arg \max _{(x',y') \in X \times Y} k(x',y')\, \phi (x',y',l,o),\) where \(k(x,y)\) is a binary indicator with value 0 for free space (i.e., no detection) that has been visited, and 1 otherwise; and \(\phi \) is a function that represents how well a given location \((x,y)\) satisfies the spatial constraints l relative to the reference object o.
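
The sketch below implements this selection rule on a discretized grid; the particular form of \(\phi \) used here (a Gaussian bump around an expected offset from the reference object) is an illustrative stand-in for the learned spatial model, and all names and sizes are assumptions for exposition.

    import numpy as np

    def predict_object(visited_free, ref_xy, offset_xy, sigma=2.0):
        """visited_free: boolean grid, True where a cell was visited and found empty.
        ref_xy: reference object cell (x, y); offset_xy: expected displacement
        implied by the spatial relation (e.g., "behind" the reference object)."""
        H, W = visited_free.shape
        ys, xs = np.mgrid[0:H, 0:W]
        expected = (ref_xy[0] + offset_xy[0], ref_xy[1] + offset_xy[1])
        phi = np.exp(-((xs - expected[0]) ** 2 + (ys - expected[1]) ** 2)
                     / (2 * sigma ** 2))
        k = (~visited_free).astype(float)   # 0 for visited free space, 1 otherwise
        score = k * phi
        y, x = np.unravel_index(np.argmax(score), score.shape)
        return x, y

    # Example: hypothesize a barrel a few cells "behind" a reference on a 50x50 grid.
    grid = np.zeros((50, 50), dtype=bool)
    print(predict_object(grid, ref_xy=(20, 20), offset_xy=(0, 5)))   # -> (20, 25)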

2.6 Structured Command Grounding

The symbol grounding algorithm takes as inputs a TBS command and a set of semantic objects in the world model, and grounds each object symbol referenced in the TBS to an object instance in the world model. Spatial constraints specified in the TBS are evaluated in a robot-centric manner, i.e., a spatial relationship relative to the position of the robot at the time when the command was given.

We first use a log-linear model to represent the probability that an object in the environment satisfies a given spatial relation. Given an object, this probability is defined as a function \(\phi \) of a weighted sum of the object's spatial feature values. The spatial features used here include the distances and the angles between the centers of the objects and the robot. A weight vector for each relation is learned by maximizing the log-likelihood of the training examples using gradient descent with \(l_1\) regularization. For details, we refer to previous work [3].
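
For illustration, a minimal version of this scoring function is sketched below; the specific features and the logistic form of \(\phi \) are simplifying assumptions for exposition, with the weights assumed to have been learned offline as described above.

    import numpy as np

    def spatial_features(obj_xy, ref_xy, robot_xy):
        """Distances and a relative angle among object, reference, and robot."""
        d_ref = np.linalg.norm(np.subtract(obj_xy, ref_xy))
        d_robot = np.linalg.norm(np.subtract(obj_xy, robot_xy))
        angle = (np.arctan2(obj_xy[1] - ref_xy[1], obj_xy[0] - ref_xy[0])
                 - np.arctan2(robot_xy[1] - ref_xy[1], robot_xy[0] - ref_xy[0]))
        return np.array([1.0, d_ref, d_robot, np.cos(angle), np.sin(angle)])

    def relation_probability(weights, obj_xy, ref_xy, robot_xy):
        """Log-linear (logistic) score of how well the object satisfies the relation."""
        z = weights @ spatial_features(obj_xy, ref_xy, robot_xy)
        return 1.0 / (1.0 + np.exp(-z))

    # Grounding then picks the candidate object with the highest probability.
    w_behind = np.array([-1.0, -0.2, 0.0, -2.0, 0.0])   # hypothetical weights for "behind"
    p = relation_probability(w_behind, obj_xy=(5, 2), ref_xy=(4, 2), robot_xy=(0, 0))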

2.7 Actions: Tactical Behaviors

An action implements a specific tactical behavior of a robot. Currently supported actions include: navigate (Fig. 6), search, observe (Fig. 7), bump, go-to-xy, and wait; here, we describe navigate as an example.

Navigate. Semantic navigation [19] differs from path planning with regard to the expressiveness of its command, as shown in Fig. 6. In contrast to the go-to-xy action, for instance, where a goal is specified in map coordinates, a destination can be described by its spatial relationships with landmarks in the environment. Additionally, a navigation mode can be specified to instruct the robot to move quickly or more covertly, depending on the characteristics of the mission.

Fig. 6.

Navigate: Given a command, “Stay to the left of the building; navigate quickly to the back of a traffic barrel that is behind the building,” a robot navigates to the left of the building toward a hypothesized goal, a traffic barrel in the back of the building.

Fig. 7.

Observe: a static, focused action where the robot registers human detections and reports them to the world model. Once the observe action starts running, it begins listening to the output from the human detector that is already sending human detection messages.

3 Experimental Results

To assess the ability of the intelligence architecture to use different capabilities, the system was tested in various mission scenarios. A human teammate used speech and gestures to command each mission through the MMI, and the robots performed the mission autonomously for the entire duration. We evaluated the robot’s performance both via human assessment and via comparisons against human performance on similar tasks.

Fig. 8.

An experimental setup: Two replicas of the Clearpath\(^\mathrm{TM}\) Husky unmanned ground vehicle, equipped with the General Dynamics XR 3D LADAR sensor and an Adonis camera, were used.

Table 1. Results on the four vignettes involving navigation (against results from 2013).

3.1 Evaluation by Human Experts

Performance on screening missions: The complete runs involved two building sites, the Church and the Bar, and required the robot to navigate 20–60 m to achieve the mission. A total of 17 runs were graded on a 0–100 scale in increments of 20. Table 1 contains the overall human evaluation. Compared with earlier performance, there has been a significant improvement: in previous results on a similar set of navigation tasks, the average completion rate was 50% (and only 30% of runs received full scores) [14]. Overall, the system consistently executed the screening mission, with 11 of 17 runs scoring \(100\,\%\). Of the remaining runs, 3 failed due to low batteries or software crashes, 2 because of the communication system, and 1 because of a symbol grounding error.

Performance on semantic navigation: Table 2 summarizes the experiments from two distinct outdoor environments. The first set of experiments was conducted as part of a larger system assessment in a physically simulated town with 12 buildings in a 1 km\(^{2}\) outdoor space at a military training facility. A qualitative summary of this set of experiments was reported in [15]. This set of experiments consisted of 57 runs, covering 2 replications of 30 commands divided into 12 vignettes, i.e., world configurations.

The second set of experiments was carried out in the parking lot of a large, irregularly shaped building. The background in this environment was natural but highly cluttered. In the vignette where the robot was facing the large building, the robot performed poorly because there were many unknown objects on which the recognition algorithm had not been trained. The performance in the vignettes involving known objects was highly reliable, resulting in average completion rates of 100% and 86% in the complete and incomplete information cases, respectively.

Fig. 9.

Given a command “Navigate to the back of the building,” this example compares a robot’s navigation path against those of 82 human users.

Table 2. Outdoor semantic navigation completion rate (%) with complete vs. incomplete information (the number of runs is in parentheses).

3.2 Evaluation Against Human Performance on Similar Tasks

According to our preliminary data collection with 20 subjects, human interpretation of a verbal instruction can vary significantly. Given a simple command, "go to a barrel that is in the back of the building," 20% (4 out of 20) of the subjects interpreted the command differently from the commander's intention, and even among the majority who chose goal positions similar to the commander's, the chosen paths varied.

Motivated by this result, we collected a larger set of user data on interpreting navigation directions. We created a Human Intelligence Task (HIT) on Amazon Mechanical Turk to collect the navigation paths selected by humans for a set of problems similar to the robot's. Two of the 84 data entries were eliminated due to incompleteness.

To compare the paths generated by the robot against those chosen by humans, we used the Fréchet distance [7], which measures the distance between two curves. We sorted the entries according to their choice of goal landmark and their mode of navigation, e.g., passing to the left or right of a building, and computed the Fréchet distance between the robot's path and the paths taken by the group of users who had made the same grounding decisions as the robot. We note that, in all 6 examples, the robot's grounding choice agreed with that of the human majority.
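
For reference, a standard discrete approximation of the Fréchet distance between two polylines can be computed with the dynamic program sketched below (suitable for short paths); this is the textbook formulation, not necessarily the implementation used in our experiments.

    import numpy as np

    def frechet(P, Q):
        """Discrete Frechet distance between polylines P and Q (lists of 2D points)."""
        P, Q = np.asarray(P, float), np.asarray(Q, float)
        n, m = len(P), len(Q)
        ca = np.full((n, m), -1.0)          # memo table of coupling costs

        def c(i, j):
            if ca[i, j] >= 0:
                return ca[i, j]
            d = np.linalg.norm(P[i] - Q[j])
            if i == 0 and j == 0:
                ca[i, j] = d
            elif i == 0:
                ca[i, j] = max(c(0, j - 1), d)
            elif j == 0:
                ca[i, j] = max(c(i - 1, 0), d)
            else:
                ca[i, j] = max(min(c(i - 1, j), c(i - 1, j - 1), c(i, j - 1)), d)
            return ca[i, j]

        return c(n - 1, m - 1)

    # Example: two parallel paths one unit apart have Frechet distance 1.0.
    print(frechet([(0, 0), (1, 0), (2, 0)], [(0, 1), (1, 1), (2, 1)]))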

Path comparison: For each human Turker, we computed the Fréchet distance between the path chosen by that person and the robot's path. In addition, we randomly selected another Turker and computed the distance between the paths chosen by the two human participants. For the example shown in Fig. 9, the mean and standard deviation of the Fréchet distance between the robot's path and the paths of the 69 human users who chose the same building as their landmark (drawn as green lines) were \(56.79\pm 14.37\), whereas those between pairs of human users in the same group were \(67.70\pm 83.19\). A t-test failed to reject the null hypothesis that there is no significant difference between comparing a human-generated path against the robot's path and comparing it against another human's path; the confidence interval at the 0.05 significance level was \([-34.29, 12.48]\) for the example in Fig. 9.

Task-level performance comparison: When evaluated based on the intended goal and landmark groundings, the accuracy of human participants was 68.9%. People performed better on path constraints, reaching 86.9% in accuracy. We also asked the participants to evaluate the paths generated by a robot given the same set of navigation commands. Based on the evaluation of 82 participants, the robot scored 86.0%.

Fig. 10.

Navigation paths with complete vs. incomplete information.

4 Main Experimental Insights

Our approach takes advantage of additional information conveyed within verbal commands by a human teammate to improve a robot's perception. Figure 10 shows progressive changes in the robot's navigation plans as the robot drives from a partially known world to a known world, gradually acquiring information through perception. The blue dotted line shows the path that the robot would have taken if it had had complete information about the environment at the time the command was given; the red line is the actual path that the robot took; the green lines and magenta triangles show the paths and goals, respectively, that the robot pursued during execution. In these runs, the robot's early goals may not be precisely correct (because they were hypothesized rather than perceived), but they generally guide the robot in the proper direction so that it can revise its plan once the actual goal is detected. These examples illustrate that the paths taken by the robot under incomplete information strongly resemble those that would have been taken under complete information. Our experimental results show that, in outdoor navigation, semantic understanding of an environment is still challenging and that exploiting information from verbal directions can compensate significantly.

In our previous experiments, performance was assessed only in terms of task completion, as shown in Table 1. Here, we also evaluated the robot's performance by surveying human participants on similar navigation tasks. Our experiments suggest that the paths generated by the robot closely resemble those generated by humans and that the robot performs comparably with humans.

5 Conclusion

In this paper, we present an intelligence architecture for human-robot teams that has been fully integrated on a mobile robot platform. During extensive assessments on various screening missions, the system performed consistently and robustly, demonstrating the strength of integrated intelligence. We conclude that combining the latest perception technologies, and reasoning about complex surroundings, with additional capabilities such as natural language understanding to follow instructions from teammates and prediction of unseen environments beyond sensor range, can lead to a viable robot teammate with high-level intelligence in real environments.