
Thesis

English

ID: 10670/1.q5pv7l

Towards Vygotskian Autotelic Agents: Learning Skills with Goals, Language and Intrinsically Motivated Deep Reinforcement Learning

Abstract

Building autonomous machines that can explore large environments, discover interesting interactions and learn open-ended repertoires of skills is a long-standing goal in artificial intelligence. Inspired by the remarkable lifelong learning of humans, the field of developmental machine learning studies the mechanisms that enable autonomous machines to self-organize their own developmental trajectories and grow their own repertoires of skills. This research takes steps towards that goal.

Reinforcement learning (RL) methods train agents to control their environment by maximizing future rewards and thus seem suited to our purpose. Although RL has achieved impressive results over the last decade, beating humans at video games, chess and Go and controlling robotic agents, it falls short of our goal: RL agents demonstrate low autonomy and open-endedness because they usually target a (small) set of pre-defined tasks characterized by hand-defined reward functions. In this research, we transfer, adapt and extend ideas from a developmental framework called the intrinsically motivated goal exploration process (IMGEP) to the RL setting. The resulting framework builds on goal-conditioned RL techniques to design autotelic RL agents: agents that are intrinsically motivated to represent, generate, pursue and master their own goals as a way to grow repertoires of skills.

The efficient acquisition of open-ended repertoires of skills further requires agents to creatively generate novel goals beyond the domain of known effects (creative exploration), to readily generalize their understanding of known skills to similar ones (systematic generalization), and to compose known skills into new ones (composition). Inspired by developmental psychology, we propose to use language as a cognitive tool to support these properties.

We organize the manuscript around these two notions: goals and language. The first part focuses on goals. It covers foundational concepts and related work on intrinsic motivations, reinforcement learning and developmental robotics before introducing our framework, the goal-conditioned intrinsically motivated goal exploration process (GC-IMGEP), at the intersection of RL and IMGEPs. Building on this framework, we present three computational studies of the properties of autotelic agents. We first show that autotelic exploration can solve external hard-exploration tasks (studies 1 and 2: GEP-PG and ME-ES). We then move to reward-free environments and propose CURIOUS, an autotelic agent that targets a diversity of goals, transfers knowledge across skills and organizes its own learning trajectory by pursuing goals associated with high learning progress (study 3).

The second part focuses on language. Inspired by the pioneering work of Vygotsky and others, we first discuss existing communicative and cognitive uses of language for goal-directed artificial agents. Language facilitates human-agent communication, abstraction, systematic generalization and long-horizon control, but also creativity and mental simulation. In two subsequent computational studies, we implement the last two of these cognitive uses. IMAGINE uses language both to learn goal representations from social interactions (communicative use) and to imagine out-of-distribution goals that drive its creative exploration and enhance systematic generalization (cognitive use).
In our last study, LGB trains a language-conditioned world model that generates a diversity of possible futures from linguistic descriptions, which leads to behavioral diversity and strategy-switching behaviors.
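To make the goal-sampling idea concrete, the following is a minimal Python sketch, in the spirit of GC-IMGEP and CURIOUS, of an autotelic agent that samples its own goals in proportion to an estimate of absolute learning progress. The class name, the window size and the epsilon-greedy mixing are assumptions made for illustration, not the thesis implementation.

import random
from collections import defaultdict

WINDOW = 20  # number of recent outcomes per competence estimate (arbitrary)

class LearningProgressSampler:
    # Toy goal sampler: favours goals whose competence changes fastest.
    def __init__(self, goals):
        self.goals = list(goals)
        self.history = defaultdict(list)  # goal -> success flags (0/1)

    def _competence(self, outcomes):
        return sum(outcomes) / len(outcomes) if outcomes else 0.0

    def learning_progress(self, goal):
        # |LP| = |competence on recent window - competence on older window|
        h = self.history[goal]
        recent, older = h[-WINDOW:], h[-2 * WINDOW:-WINDOW]
        return abs(self._competence(recent) - self._competence(older))

    def sample_goal(self, eps=0.2):
        # Mostly sample proportionally to |LP|; sometimes sample uniformly
        # so that progress keeps being re-estimated on every goal.
        lps = [self.learning_progress(g) for g in self.goals]
        if random.random() < eps or sum(lps) == 0:
            return random.choice(self.goals)
        return random.choices(self.goals, weights=lps, k=1)[0]

    def update(self, goal, success):
        self.history[goal].append(int(success))

# Typical usage inside an autotelic loop (rollout, policy and env assumed):
#   sampler = LearningProgressSampler(goal_space)
#   goal = sampler.sample_goal()
#   success = rollout(policy, env, goal)  # goal-conditioned episode
#   sampler.update(goal, success)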

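The cognitive, creative use of language in IMAGINE can be illustrated with a similarly hedged toy sketch: new goal sentences are imagined by recombining the words of goals already discovered through social interaction, yielding out-of-distribution targets that drive exploration. The vocabulary and the rigid verb-colour-object template below are invented for the example.

import itertools

known_goals = [
    "grasp red cube",
    "grasp blue ball",
    "grow green plant",
]

def imagine_goals(goals):
    # Split known sentences into (verb, colour, object) slots and
    # recombine them into novel, out-of-distribution goal candidates.
    verbs, colours, objects = set(), set(), set()
    for sentence in goals:
        verb, colour, obj = sentence.split()
        verbs.add(verb)
        colours.add(colour)
        objects.add(obj)
    combos = {" ".join(t) for t in itertools.product(verbs, colours, objects)}
    return sorted(combos - set(goals))  # keep only sentences never heard

print(imagine_goals(known_goals))  # includes e.g. "grow red ball"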