Abstract
Safety and query efficiency pose challenges when learning a robot control policy with Dataset Aggregation (DAgger). We propose BAgger, an Imitation Learning algorithm that uses a Bayesian approach to mitigate these challenges by predicting state novelty and policy error. In BAgger, the expert is queried only when there is a significant risk of failing to imitate the expert, e.g., in novel parts of the state space. We present empirical results indicating that BAgger is safer than both DAgger and SafeDAgger on a robot control task, while remaining query-efficient.