Thesis of Cyril Perosino


Subject:
Machine learning models on complex data for default prevention

Start date: 02/12/2024
End date (estimated): 02/12/2027

Advisor: Hamida Seba

Summary:

In many real-world applications, identifying patterns that do not conform to normal activity is fundamental to ensuring correct service delivery as well as system security and reliability [1]. This is particularly true of surveillance and monitoring applications such as video surveillance, medical monitoring, malware detection, and financial fraud detection. Such an abnormal pattern is called an anomaly or outlier. An anomaly is generally defined as a behavioral pattern that deviates significantly from most behavioral patterns of the monitored system and occurs far less frequently than normal patterns. With the explosion in the volume of data these applications must process, learning models, and deep learning in particular, have become indispensable in this field.

In this thesis, we are interested in learning models capable of processing complex, multi-source, heterogeneous data.
The aim of analyzing this data is to profile customers more fairly (without socio-discriminatory criteria), which will help prevent payment defaults. The proposed approach is to construct ego-centric knowledge graphs [2] that profile customers and represent all the information available about them. The next step is to use this representation to detect anomalies that could lead to payment default. This knowledge-graph-based representation will require the use and/or design of suitable learning models [3, 4]. The candidate will first carry out a state-of-the-art study of the problem and of existing learning models. He or she will then implement the solution best suited to the data considered during the thesis.
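
As an illustration of what an ego-centric knowledge graph might look like in practice, the following minimal sketch extracts the neighborhood of one customer from a list of (subject, relation, object) triples. Everything here is an assumption made for illustration: the triples, the entity and relation names, the `ego_graph` helper, and the choice of a fixed hop radius are all hypothetical and do not come from the thesis data or from the cited works. The sketch covers only the graph construction step, not the anomaly detection models.

```python
from collections import defaultdict, deque

# Hypothetical triples; every entity and relation name is illustrative only.
TRIPLES = [
    ("customer:42", "holds", "contract:A"),
    ("contract:A", "billed_by", "invoice:1"),
    ("invoice:1", "has_status", "status:late"),
    ("customer:42", "pays_by", "method:direct_debit"),
    ("customer:7", "holds", "contract:B"),
    ("contract:B", "billed_by", "invoice:2"),
]


def ego_graph(triples, ego, radius=2):
    """Return the set of triples lying within `radius` hops of `ego`.

    Edges are traversed in both directions, so the ego node reaches
    entities that point to it as well as entities it points to.
    """
    # Index every triple under both of its endpoints.
    adjacency = defaultdict(list)
    for s, r, o in triples:
        adjacency[s].append((s, r, o))
        adjacency[o].append((s, r, o))

    depth = {ego: 0}      # hop distance of each visited node from the ego
    queue = deque([ego])  # breadth-first frontier
    kept = set()
    while queue:
        node = queue.popleft()
        if depth[node] == radius:
            continue  # do not expand beyond the requested radius
        for s, r, o in adjacency[node]:
            kept.add((s, r, o))
            neighbor = o if s == node else s
            if neighbor not in depth:
                depth[neighbor] = depth[node] + 1
                queue.append(neighbor)
    return kept


if __name__ == "__main__":
    for triple in sorted(ego_graph(TRIPLES, "customer:42", radius=2)):
        print(triple)
```

In this sketch, a larger radius pulls in more distant context (for example, the status of an invoice three hops away), at the cost of a larger graph per customer. Which radius and which relations are appropriate would depend on the data actually considered.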