PhD thesis of Yasmine Djebrouni

Subject:
Characterizing and Optimizing Distributed Machine Learning Systems: Towards a Control Theoretic and Multi-Objective Approach

Defense date: 20/02/2024

Supervisor: Sara Bouchenak
Co-supervisor: Vania Marangozova-Martin

Abstract:

With the pervasiveness of computing devices and digital services, huge amounts of data are now continuously generated and collected. Data analysis is beneficial in many application domains, such as medical diagnosis, transportation, and urban planning. Machine learning and artificial intelligence techniques are widely used to extract hidden yet valuable knowledge from data.
The widespread adoption of cloud computing has enabled various types of services, such as AI Software-as-a-Service. Such services make it possible, for instance, to detect objects in images or videos or to transcribe speech to text. In this context, building and training a machine learning model from scratch requires expertise as well as significant data storage and computing resources. Therefore, many cloud service providers offer specialized, pre-trained models and algorithms to solve very specific tasks, such as extracting properties or making predictions.
However, there is an emerging trend among AI service providers to move from specific AI-based services to general, large-scale AI platforms, known as AI Platform-as-a-Service (AI PaaS). For example, Amazon AWS, IBM Watson, Microsoft Azure, and Google are working to provide comprehensive AI PaaS platforms. The main advantages of AI PaaS solutions are savings in time and money and a reduced need for expertise, since the deployment and maintenance of the system are simplified. Yet today's AI PaaS platforms remain opaque and do not allow models to be adapted to application needs. Furthermore, due to large data volumes and ubiquitous data sensing, machine learning is shifting from the centralized cloud mode to distributed/decentralized systems, where edge nodes can train and update models synchronously or asynchronously. These emerging systems still face many design and performance challenges.
The research objective of this PhD project is to derive scalable distributed solutions for (deep) machine learning.

Jury:
Mr. Thomas Gaël, Professor, TELECOM SudParis, Reviewer
Mr. Monnet Sébastien, Professor, Université Savoie Mont Blanc, Reviewer
Mr. Benoit Alexandre, Professor, Université Savoie Mont Blanc, Examiner
Mr. Trystram Denis, Professor, Université Grenoble Alpes, Examiner
Ms. Marangozova Vania, Associate Professor, Université Grenoble Alpes, Thesis supervisor
Ms. Bouchenak Sara, Professor, LIRIS INSA Lyon, Thesis co-supervisor