Thesis of Arthur Aubret

Deep reinforcement learning of skills for multi-agent coordination.

Defense date: 30/11/2021

Advisor: Salima Hassas
Co-advisor: Laetitia Matignon


In reinforcement learning, an agent learns by trial and error to maximize the expectation of the rewards received while acting in its environment. In a multi-agent scenario, some tasks require multiple agents to cooperate; yet, despite recent advances in deep reinforcement learning, coordinating agents is known to be difficult, particularly as the number of agents grows. Communication can be an efficient way to coordinate agents; however, current models only include observations in the communication between agents and consider scenarios with few agents. To address these issues, we want to take advantage of recent work on intrinsic motivation. First, we want our agents to be able to communicate high-level information, for instance their intentions in addition to their observations, in order to improve their coordination. To do so, they have to learn a representation of their skills. Second, we aim for our agents to learn what to communicate, when, and to whom.
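As an illustrative sketch of the trial-and-error principle the abstract opens with (not part of the thesis, which concerns deep multi-agent RL), here is tabular Q-learning on a hypothetical 5-state chain where the agent must discover, by exploration alone, that moving right reaches the reward:

```python
import random

# Toy environment (an assumption for illustration): states 0..4 on a chain,
# reward 1.0 only when the goal state 4 is reached.
N_STATES = 5
ACTIONS = [-1, +1]            # move left or right along the chain
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1

def step(state, action):
    """Environment dynamics: clamp to the chain, reward only at the goal."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def greedy(q, s):
    """Greedy action; ties broken at random so unexplored actions get tried."""
    best = max(q[(s, a)] for a in ACTIONS)
    return random.choice([a for a in ACTIONS if q[(s, a)] == best])

def train(episodes=500, seed=0):
    random.seed(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # epsilon-greedy trial and error: explore with probability EPS
            a = random.choice(ACTIONS) if random.random() < EPS else greedy(q, s)
            s2, r, done = step(s, a)
            # temporal-difference update toward the bootstrapped target
            target = r + (0.0 if done else GAMMA * max(q[(s2, b)] for b in ACTIONS))
            q[(s, a)] += ALPHA * (target - q[(s, a)])
            s = s2
    return q

q = train()
# After training, the greedy policy should move right from every non-goal state.
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)}
```

The agent is never told which action is good; repeated interaction propagates the reward backwards through the Q-values, which is the learning loop that deep RL scales up with neural function approximation.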

Jury:
- Mr Alain Dutech, Professor, Reviewer (rapporteur)
- Mr David Filliat, Professor, Reviewer (rapporteur)
- Mr Pierre-Yves Oudeyer, Professor, Examiner
- Mr Alexandre Aussem, Professor, Université Lyon 1, Examiner
- Mrs Salima Hassas, Professor, Université Lyon 1, Thesis advisor
- Mrs Laëtitia Matignon, Associate Professor (maître de conférences), Université Lyon 1, Co-supervisor