There has been significant growth in the commercialization of personal robots and other embodied assistants (e.g. self-driving vehicles, smart home systems), leading to their increasing presence in everyday settings. These technologies are largely intended to provide intuitive assistance to people in their daily lives, yet we cannot program all the task intelligence they will need a priori. Thus, to provide the desired assistance, these agents need the ability to dynamically acquire and expand their task domain knowledge.

One key recurring challenge occurs when the robotic agent is given a high-level task, described by an abstract task plan. The robot must first perceptually ground each entity and concept within the recipe (e.g. items, locations) in order to perform the task. For example, to learn to serve cooked pasta in a home, a robot must first ground concepts like cooking pot, stove, and bottle of pasta sauce. Assuming no prior knowledge, this is particularly challenging in newly situated or dynamic environments, where the robot has limited representative training data.

My thesis research examines the problem of enabling a social robotic agent to leverage interaction with a human partner in order to efficiently ground task-relevant concepts in its situated environment. Our prior work has investigated Learning from Demonstration (LfD) approaches for the acquisition of (1) training instances as examples of task-relevant concepts and (2) informative features for appropriately representing and discriminating between task-relevant concepts. Though our findings have validated the usefulness of user domain knowledge, this approach relies upon each human partner also being proficient at teaching and at tracking the state of the robot's knowledge over time. This is an unreasonable burden to place on everyday users, particularly in dynamically changing environments.
In ongoing work, we examine the design of algorithms that enable the social robot learner to autonomously manage the interaction with its human partner, actively gathering both instance and feature information for learning the concept groundings. This is intended to improve both the efficiency of learning the grounded concepts and the agent's adaptation in a dynamic environment. Perhaps even more compellingly, it is motivated by the way humans learn: by combining multiple types of information rather than focusing on any single one.
My ongoing work explores strategies for enabling a social robot learner to autonomously manage its own learning interactions with a human teacher, towards actively gathering diverse types of task knowledge. Assuming no prior knowledge, the learning agent is given a task (e.g. serving pasta) and, with it, task-relevant concepts (e.g. cooking pot, pasta sauce) it must perceptually ground in order to later recognize instances of these concepts in the situated environment and use them to perform the task. The agent learns to ground all task-relevant concepts by actively querying its human partner for relevant information. My first research project within this thread of managing interaction with a human teacher is described below.
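The turn-taking interaction described above can be sketched as a simple loop: at each turn the agent selects a query, asks the teacher, and updates its concept model. The `Teacher` and `Learner` classes below are illustrative stubs under my own assumptions (a rule-based arbitration that queries the least-covered concept), not the actual system's interface.

```python
# Minimal sketch of the active concept-grounding loop; all class and concept
# names here are illustrative assumptions, not the real system's API.
import random

class Teacher:
    """Simulated human partner holding example instances per concept."""
    def __init__(self, examples):
        self.examples = examples  # concept -> list of feature vectors

    def answer_instance_query(self, concept):
        return random.choice(self.examples[concept])

class Learner:
    """Collects labeled instances; arbitration here is a simple rule."""
    def __init__(self, concepts):
        self.data = {c: [] for c in concepts}

    def select_query(self):
        # Rule-based arbitration: ask about the concept with fewest examples.
        return min(self.data, key=lambda c: len(self.data[c]))

    def update(self, concept, example):
        self.data[concept].append(example)

random.seed(0)
teacher = Teacher({"cooking-pot": [[1.0, 0.2]], "pasta-sauce": [[0.1, 0.9]]})
learner = Learner(["cooking-pot", "pasta-sauce"])
for _turn in range(4):
    concept = learner.select_query()
    learner.update(concept, teacher.answer_instance_query(concept))
print({c: len(xs) for c, xs in learner.data.items()})
# {'cooking-pot': 2, 'pasta-sauce': 2}
```

The point of the sketch is the control flow: the agent, not the teacher, decides what information is requested at each turn.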
Active Learning of Grounded Concepts using Diverse Types of Learning Queries
My most recent work contributes the design of two classes of algorithms for arbitrating between diverse types of active learning queries, with the goal of autonomously gathering both representative examples of the concepts and informative features for discriminating between them [Bullard et al., IROS 2018]. In this work, we examine both rule-based and decision-theoretic strategies for selecting between multiple queries of different types at each turn in a learning episode. The video below shows a demo of the robot using the decision-theoretic arbitration strategy explored. Here, the agent is learning to ground concepts relevant to a Pack Lunchbox task from a human teacher (myself, as it were :). For the given task, the agent must learn from the teacher different ways of embodying the concepts (1) main dish, (2) fruit, (3) snack, and (4) beverage, with the constraint that all examples should be appropriate for the lunch-packing task.
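To make the decision-theoretic idea concrete, here is a hedged sketch of query arbitration: score every (query type, concept) pair by an expected value minus a query cost, and ask the highest-scoring one. The entropy-based value estimate, the per-type weights, and the cost constants are my own illustrative assumptions, not the scoring function used in the paper.

```python
# Illustrative decision-theoretic arbitration: pick the (query type, concept)
# pair maximizing weighted uncertainty minus teacher-effort cost.
import math

def entropy(p):
    """Shannon entropy (bits) of a Bernoulli belief p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def select_query(beliefs, query_types):
    """beliefs: concept -> confidence that its grounding is correct.
    query_types: qtype -> (value_weight, cost).
    Returns the (qtype, concept) pair with the best score."""
    best, best_score = None, float("-inf")
    for concept, p in beliefs.items():
        uncertainty = entropy(p)
        for qtype, (weight, cost) in query_types.items():
            score = weight * uncertainty - cost
            if score > best_score:
                best, best_score = (qtype, concept), score
    return best

# Hypothetical beliefs for the Pack Lunchbox concepts and two query types.
beliefs = {"main-dish": 0.55, "fruit": 0.9, "snack": 0.95, "beverage": 0.6}
query_types = {"instance-query": (1.0, 0.1), "feature-query": (1.5, 0.3)}
print(select_query(beliefs, query_types))
# ('feature-query', 'main-dish')
```

Under these toy numbers, the agent asks a feature query about the concept it is least certain of; a rule-based strategy would instead fix the query order by hand.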
My previous work focused on how to leverage structured interaction with a human partner to learn the perceptual groundings for all task-relevant concepts. The primary difference between this thread of research and my ongoing research is that the agent's role was that of a passive observer rather than an active questioner. I have led two research projects within the thread of leveraging interaction: the first towards acquiring representative training instances, the second towards eliciting informative features. Throughout my thesis work, the goal for the agent has been to perceptually ground all concepts (i.e. items, locations) relevant to a specific task; what changes is the way it utilizes interaction with a human partner to achieve this goal.
The image to the right shows an example of the Curi robot's workspace while the robot observes task demonstrations provided by a human teacher.
LfD for Grounding Task-Relevant Concepts
Actions in a given task plan or recipe are often parameterized by object and semantic location symbols (concepts) that are relevant for task execution but not grounded in the physical environment where the robot is situated. In this work, we employed the paradigm of Learning from Demonstration (LfD) to understand how many demonstrations of a task were necessary to perceptually ground all of the given task-relevant concepts in the agent's environment [Bullard et al., RO-MAN 2016]. Assuming the robot does not already have classifiers for identifying instantiations of the concepts in its environment, we sought to leverage LfD for efficiently acquiring the relevant task knowledge in any new environment where the robot may be placed.
For evaluation of this work, the agent had to ground concepts for two different tasks, each in three different experimental environments (shown below). The object and semantic location concepts to be grounded are derived from the parameters of the task recipes given to the agent (like the ones shown below). Each environment represents a kitchen the agent could be placed in and thus contains the same abstract objects and locations, but instantiated differently (as one would expect in different homes). The agent's goal was to learn binary classifiers for each of the abstract concepts (labels), given demonstrations of each task from the teacher.
Serve Salad Task Recipe
pick-place <bowl, cupboard, counter>
pick-place <salad-dressing, fridge, counter>
Serve Pasta Task Recipe
pick-place <pasta-pot, stove, counter>
pick-place <bowl, cupboard, counter>
pick-place <sauce-bottle, fridge, counter>
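The label set the agent must learn follows directly from the recipe structure: each pick-place step is parameterized by one object symbol and two semantic location symbols, and their union gives the concepts to ground. A small sketch of that extraction, using the Serve Pasta recipe above (the tuple encoding is my own assumption, not the system's actual recipe representation):

```python
# Deriving the concepts to ground from a task recipe: each pick-place step
# names an object, a source location, and a destination location.
SERVE_PASTA = [
    ("pick-place", "pasta-pot", "stove", "counter"),
    ("pick-place", "bowl", "cupboard", "counter"),
    ("pick-place", "sauce-bottle", "fridge", "counter"),
]

def concepts_from_recipe(recipe):
    """Return the object and semantic-location symbols a recipe references."""
    objects, locations = set(), set()
    for _action, obj, src, dst in recipe:
        objects.add(obj)
        locations.update((src, dst))
    return objects, locations

objs, locs = concepts_from_recipe(SERVE_PASTA)
print(sorted(objs))  # ['bowl', 'pasta-pot', 'sauce-bottle']
print(sorted(locs))  # ['counter', 'cupboard', 'fridge', 'stove']
```

Each symbol in those two sets then becomes the positive label of one binary classifier, trained from the teacher's demonstrations.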
Experimental findings showed that environment-specific groundings of all task-relevant objects and semantic locations could be learned efficiently by employing the paradigm of LfD. In all environments, learning began to stabilize after only about 5-6 task demonstrations.
The key implication of this work is that we can leverage interaction with a human partner to solve the task-situated symbol grounding problem; my subsequent work relaxes some of the assumptions made in this first interactive learning project and builds upon this key insight.
Human-Driven Feature Selection for Representing Task-Relevant Concepts
Feature selection directly impacts how quickly the agent is able to learn a sufficient model for each abstract concept given in a task recipe. The challenge is that humans typically provide only a small number of examples to an agent, which may be insufficient for computational feature extraction techniques, and it is not feasible to hand-code features for every task a priori. Thus, my subsequent project investigated whether humans can also help the agent identify informative features for discriminating between task-relevant objects, and how to elicit this feature information from a human teacher [Bullard et al., ICRA 2018]. It contributed five different approaches for human-driven feature selection.
The images below depict three of the approaches explored. In this work, we use an Unpack Groceries task, where the agent must learn to ground the following concepts: (a) beverages, (b) produce, (c) food cans/jars, (d) snacks. The human instance selection (HIS) approach (shown on left) represents the typical LfD case where a teacher provides a small number of representative training examples of each concept, with the caveat that the teacher's explicit goal when selecting these examples is to help the learner differentiate between the concepts. This provides a way for the teacher to indirectly communicate useful features, given that some features may be abstract or difficult to articulate.

The human feature selection (HFS) and human feature reduction (HFR) approaches (depicted on right) allow the teacher to directly enumerate informative features to the learner. HFS highlights all features believed to be useful for differentiating between the task concepts, and HFR eliminates all features that should be ignored by the learner. As a note, the images show only a simplified version of these experimental conditions, for illustration purposes. The last two experimental approaches for human-driven feature selection were a combination of the former two, whereby the teacher first selects a small set of training examples and then explicitly enumerates the features they were attempting to implicitly highlight through the instances selected.
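One way to picture HFS and HFR is as complementary masks over a fixed perceptual feature vector: HFS keeps only the features the teacher highlighted, while HFR drops the features the teacher ruled out. The feature names and values below are hypothetical, chosen only to illustrate the two operations.

```python
# HFS/HFR as feature masks over an assumed perceptual feature vector.
FEATURES = ["height", "weight", "color-hue", "is-rigid", "label-text"]

def apply_hfs(x, selected):
    """Human Feature Selection: keep only teacher-highlighted features."""
    return [v for name, v in zip(FEATURES, x) if name in selected]

def apply_hfr(x, removed):
    """Human Feature Reduction: drop features the teacher ruled out."""
    return [v for name, v in zip(FEATURES, x) if name not in removed]

x = [0.3, 1.2, 0.8, 1.0, 0.0]
print(apply_hfs(x, {"color-hue", "is-rigid"}))  # [0.8, 1.0]
print(apply_hfr(x, {"label-text"}))             # [0.3, 1.2, 0.8, 1.0]
```

The masked vectors then feed the same concept classifiers as before; only the input representation changes between conditions.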
Our experimental findings provided the key insight that the HFS approach was especially valuable for LfD domains (where training data is limited), as it statistically outperformed all computational feature selection approaches tested on the task, given the small set of examples provided by the teacher. However, the caveat is that the individual features must be semantically interpretable or intuitive to the teacher.
The key implication of this work is that the agent can also extract informative features from a human partner, which leads to more efficient learning of its concept groundings. Based upon findings from both of the projects discussed above, my ongoing work seeks to equip the agent with algorithms for autonomously requesting the feature and instance information it needs, thereby mitigating the cognitive burden on the teacher and making the agent a collaborator in the learning process.