PhD thesis of Nawel Benarba
Machine learning (ML) is applied in many areas to extract knowledge from data and guide decision making, such as search engines, recommendation systems and disease diagnosis. With the rapid growth of data, ML algorithms have evolved from centralized to distributed solutions. To address data privacy issues, Federated Learning (FL) has emerged as a way for a set of participants to collectively solve a machine learning problem without sharing their data: each data owner benefits from what is learned on the others' data, while the raw data itself stays with its owner. Such a decentralized learning approach has many applications, in areas such as health care and digital banking.
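To make the idea of learning without sharing data concrete, the sketch below shows one common aggregation rule, federated averaging, in which a server combines locally trained parameters weighted by each participant's dataset size. It is an illustration only and not necessarily the aggregation algorithm studied in this project; the function name `federated_average` and the toy values are hypothetical.

```python
def federated_average(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local dataset size.

    Only model parameters and dataset sizes reach the server;
    the participants' raw data is never transmitted.
    """
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(n * w[i] for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Three hypothetical participants with locally trained parameters.
updates = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
sizes = [100, 100, 200]
global_weights = federated_average(updates, sizes)
print(global_weights)
```

Because the weights are proportional to dataset size, the participant holding 200 samples influences the global model as much as the other two combined, which is one way heterogeneous data sizes can shape the aggregated model.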
However, FL could exacerbate the problem of bias and unfairness [8, 1]. Bias occurs when ML models produce unfair decisions because of incomplete, faulty or prejudicial datasets and models. It may have serious consequences such as sexist segregation, illegal actions, or reduced revenues [3, 4, 9]. FL may affect bias [8, 3] because of its decentralized nature, where data distribution and size are particularly heterogeneous across participants. Furthermore, data privacy constraints in FL do not allow the use of classical ML bias mitigation techniques [10, 5].
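As an illustration of how such bias can be quantified, the sketch below computes the statistical parity difference, a widely used group fairness metric: the gap in positive-decision rates between a protected group and everyone else. The choice of metric, the function name and the toy data are assumptions made for this example, not definitions taken from the project; the sketch also assumes both groups are non-empty.

```python
def statistical_parity_difference(predictions, groups, protected, positive=1):
    """Positive-decision rate of the protected group minus that of the others.

    A value of 0 means both groups receive positive decisions at the same
    rate; a negative value means the protected group is disadvantaged.
    Assumes each group contains at least one individual.
    """
    def positive_rate(in_protected):
        selected = [
            p for p, g in zip(predictions, groups)
            if (g == protected) == in_protected
        ]
        return sum(1 for p in selected if p == positive) / len(selected)

    return positive_rate(True) - positive_rate(False)


# Hypothetical model decisions for eight individuals in two groups.
preds = [1, 0, 0, 0, 1, 1, 1, 0]
groups = ["f", "f", "f", "f", "m", "m", "m", "m"]
spd = statistical_parity_difference(preds, groups, protected="f")
print(spd)
```

In a federated setting, computing such a metric over all participants would normally require pooling predictions and sensitive attributes, which is precisely what the privacy constraints restrict.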
More precisely, this project aims to answer the following questions:
How can the actual impact of Federated Learning on bias be characterized, i.e., to what extent do FL data distributions, FL models, and FL selection, aggregation and robustness algorithms impact bias?
What novel FL selection and aggregation algorithms could be proposed for bias mitigation?
How can privacy, bias and fairness be taken into account together in Federated Learning through a multi-objective approach, given that these objectives are usually antagonistic?
The PhD project is organized around these three questions.
Supervisor: Sara Bouchenak