This project addresses online behavioral adaptation of robots during interactive learning with humans. Specifically, the robot must adapt to each human subject’s particular way of giving feedback during the interaction. Feedback here includes reward, instruction, and demonstration, which can be grouped under the term “teaching signals”. For example, some human subjects prefer a proactive robot, while others prefer the robot to wait for their instructions; some only tell the robot when it performs a wrong action, while others reward correct actions. The main outcome will be a new ensemble method for human-robot interaction that learns models of various human feedback strategies and uses them to tune reinforcement learning online, so that the robot can quickly learn an appropriate behavioral policy. We will first derive an optimal solution to the problem and then compare the empirical performance of ensemble methods against this optimum in a set of numerical simulations.
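To make the idea of weighting several feedback-strategy models concrete, the following is a minimal sketch, not the project's actual method. It assumes two hypothetical teacher strategies (`rewards_correct_only` and `punishes_wrong_only`), illustrative probabilities for each feedback type, and a simple Bayesian weighting rule. All function names, strategy names, and numbers are assumptions introduced here for illustration.

```python
# Illustrative sketch only: strategy names and probabilities are assumptions,
# not values taken from the project.

# Each hypothetical strategy gives P(feedback | action was correct?),
# where feedback is "reward", "punish", or "none" (silence).
STRATEGIES = {
    "rewards_correct_only": {
        True: {"reward": 0.85, "punish": 0.05, "none": 0.10},
        False: {"reward": 0.05, "punish": 0.10, "none": 0.85},
    },
    "punishes_wrong_only": {
        True: {"reward": 0.10, "punish": 0.05, "none": 0.85},
        False: {"reward": 0.05, "punish": 0.85, "none": 0.10},
    },
}


def feedback_likelihood(strategy, feedback, p_correct):
    """P(feedback | strategy), marginalizing over whether the action was correct."""
    table = STRATEGIES[strategy]
    return p_correct * table[True][feedback] + (1 - p_correct) * table[False][feedback]


def update_strategy_weights(weights, feedback, p_correct):
    """Bayesian update of the ensemble weights after observing one feedback signal."""
    unnorm = {
        s: w * feedback_likelihood(s, feedback, p_correct) for s, w in weights.items()
    }
    total = sum(unnorm.values())
    return {s: v / total for s, v in unnorm.items()}


def posterior_correct(weights, feedback, p_correct):
    """P(action was correct | feedback) under the weighted ensemble of strategies."""
    like_correct = sum(w * STRATEGIES[s][True][feedback] for s, w in weights.items())
    like_wrong = sum(w * STRATEGIES[s][False][feedback] for s, w in weights.items())
    numerator = p_correct * like_correct
    return numerator / (numerator + (1 - p_correct) * like_wrong)


def q_update(q, action, weights, feedback, p_correct=0.5, alpha=0.3):
    """Move Q(action) toward a soft reward in [-1, 1] derived from the ensemble."""
    soft_reward = 2.0 * posterior_correct(weights, feedback, p_correct) - 1.0
    q[action] += alpha * (soft_reward - q[action])
    return q
```

In this sketch, silence from a teacher believed to follow `punishes_wrong_only` is read as probable approval, whereas the same silence from a teacher believed to follow `rewards_correct_only` is read as probable disapproval. This is one simple way the learned strategy model could change how the reinforcement learner interprets the same teaching signal.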
Paper in IEEE RO-MAN or ACM/IEEE HRI or ACM CHI
- Sorbonne University, Mehdi Khamassi
- Sorbonne University, Mohamed Chetouani
- Athena Research Center, Petros Maragos
Primary Contact: Mehdi Khamassi, Sorbonne University