Thesis of Silvia Grosso
Subject:
Start date: 6 November 2023
End date (estimated): 6 November 2026
Advisor: Sara Bouchenak
Abstract:
Machine learning (ML) is applied in many areas to extract knowledge from data and guide decision making, such as search engines [7], recommendation systems [2] and disease diagnosis [6]. With the rapid growth of data, ML algorithms have evolved from centralized to distributed solutions. To address data privacy issues, Federated Learning (FL) has emerged, allowing a set of participants to collectively solve a machine learning problem without sharing their data. Most current FL methods are limited to datasets from a single modality, such as images or text [5, 10]. However, given the proliferation of sensor types and collection methods, there is a pressing demand to integrate information from different data modalities to improve model performance and provide more comprehensive insights while maintaining the advantages of FL.
The challenge of non-IID (not independent and identically distributed) data in FL is widely recognized, and heterogeneous modalities and architectures in multimodal models exacerbate it, leading to side effects such as bias and unfairness [8, 1]. Bias occurs when ML models produce unfair decisions due to the use of incomplete, faulty or prejudicial datasets and models. It may have serious consequences such as sexist segregation, illegal actions, or reduced revenues [3, 4, 9]. The decentralized nature of FL, where data distribution and size are particularly heterogeneous across participants, may further affect bias [8, 3].
This project will investigate innovative FL architectures and protocols to handle heterogeneity in data modalities and model architectures, analyzing the trade-off between communication costs and global performance. A further challenge, particularly relevant at cross-device FL scale, is handling clients' differing computation capabilities to ensure fairness across local resources when dealing simultaneously with unimodal and multimodal clients.
References:
1. Abay (A.), Chuba (E.), Zhou (Y.), Baracaldo (N.) and Ludwig (H.). – Addressing unique fairness obstacles within federated learning. 2021.
2. Aher (S. B.) and Lobo (L.). – Combination of machine learning algorithms for recommendation of courses in e-learning system based on historical data. Knowledge-Based Systems, vol. 51, 2013.
3. Bellamy (R. K.), Dey (K.), Hind (M.), Hoffman (S. C.), Houde (S.), Kannan (K.), Lohia (P.), Martino (J.), Mehta (S.), Mojsilovic (A.) et al. – AI Fairness 360: An extensible toolkit for detecting, understanding, and mitigating unwanted algorithmic bias. arXiv preprint arXiv:1810.01943, 2018.
4. Bogroff (A.) and Guegan (D.). – Artificial intelligence, data, ethics: an holistic approach for risks and regulation. University Ca’ Foscari of Venice, Dept. of Economics Research Paper Series, no. 19, 2019.
5. McMahan (H. B.), Moore (E.), Ramage (D.), Hampson (S.) and Agüera y Arcas (B.). – Communication-Efficient Learning of Deep Networks from Decentralized Data, 2023.
6. Kourou (K.), Exarchos (T. P.), Exarchos (K. P.), Karamouzis (M. V.) and Fotiadis (D. I.). – Machine learning applications in cancer prognosis and prediction. Computational and Structural Biotechnology Journal, vol. 13, 2015, pp. 8–17.
7. McCallum (A.), Nigam (K.), Rennie (J.) and Seymore (K.). – Building domain-specific search engines with machine learning techniques. – In Proceedings of the AAAI Spring Symposium on Intelligent Agents in Cyberspace. Citeseer, 1999, pp. 28–39.
8. McMahan (H. B.) et al. – Advances and open problems in federated learning. Foundations and Trends® in Machine Learning, vol. 14, no. 1, 2021.
9. Wang (T.), Zhao (J.), Yatskar (M.), Chang (K.-W.) and Ordonez (V.). – Balanced datasets are not enough: Estimating and mitigating gender bias in deep image representations. – In Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 5310–5319, 2019.
10. Chen (Y.), Qin (X.), Wang (J.), Yu (C.) and Gao (W.). – FedHealth: A Federated Transfer Learning Framework for Wearable Healthcare. vol. 35, no. 4, 2020-07, pp. 83–93.
Selected publications of the advisor related to the topic:
• L. Ferraguig, Y. Djebrouni, S. Bouchenak, and V. Marangozova. Survey of Bias Mitigation in Federated Learning. Conférence sur le Parallélisme/Architecture/Système/Temps Réel (ComPAS’2021), Lyon, France, July 5-9, 2021.
• B. Khalfoun, S. Ben Mokhtar, S. Bouchenak, V. Nitu. EDEN: Enforcing Location Privacy through Re-identification Risk Assessment: A Federated Learning Approach. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, Volume 5, Issue 2, June 2021.
• M. Maouche, S. Ben Mokhtar, S. Bouchenak. HMC: Robust Privacy Protection of Mobility Data Against Multiple Re-Identification Attacks. ACM Journal on Interactive, Mobile, Wearable and Ubiquitous Technologies, 2(3), September 2018.
• R. Talbi, S. Bouchenak, L. Y. Chen. Towards Dynamic End-to-End Privacy Preserving Data Classification. IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2018), Fast Abstract, Luxembourg, June 25-28, 2018.
• S. Cerf, V. Primault, A. Boutet, S. Ben Mokhtar, R. Birke, S. Bouchenak, L. Y. Chen, N. Marchand, B. Robu. PULP: Achieving Privacy and Utility Trade-Off in User Mobility Data. The 36th IEEE Symposium on Reliable Distributed Systems (SRDS 2017), Hong Kong, September 26-29, 2017.