Thesis of Pierre Marza

Large-scale learning of high-level navigation skills for autonomous agents in 3D environments


Recent years have witnessed the rapid rise of Machine Learning (ML), which has delivered disruptive performance gains in several fields. Apart from undeniable advances in methodology, these gains are often attributed to massive amounts of training data and computing power, which have led to breakthroughs in speech recognition, computer vision and natural language processing.

In the REMEMBER project, with which this thesis is associated, we propose to extend these advances to sequential decision-making by agents, in the context of planning and control in complex 3D environments.


This thesis will make methodological contributions (models and algorithms) for training real and virtual agents so that they learn to solve complex tasks autonomously. Indeed, intelligent agents require high-level reasoning skills, awareness of their environment and the ability to make the right decisions at the right time [1]. The required decision-making policies are complex because they involve large observation and state spaces, partial observability, and highly nonlinear and intricate interdependencies. We believe that learning them will depend on the algorithm's ability to learn compact memory representations that are structured spatially and semantically, and that can capture complex regularities of the environment and of the task at hand.


A key requirement is the ability to learn these representations with minimal human intervention and annotation, as manual design of complex representations is almost impossible. This requires efficient use of raw data and the discovery of patterns through different learning formalisms: supervised, unsupervised or self-supervised learning, learning from reward or from intrinsic motivation, among others.

Advisor: Christian Wolf
Co-advisor: Laetitia Matignon