Thesis of Sara Makki


Subject:
An Efficient Classification Model for Analyzing Skewed Data to Detect Frauds in the Financial Sector (Un Modèle de Classification Efficace pour l'Analyse des Données Déséquilibrées pour Détecter les Fraudes dans le Secteur Financier)

Defense date: 20/12/2019

Advisor: Mohand-Said Hacid

Summary:

The financial domain faces several types of risk, such as terrorist financing, money laundering, credit card fraud, and insurance fraud, which are usually detected using classification algorithms. In these classification problems, the skewed distribution of classes, also known as class imbalance, is a very common challenge.

We developed two approaches: a Cost-Sensitive Cosine Similarity K-Nearest Neighbor (CoSKNN) as a single classifier, and a K-modes Imbalance Classification Hybrid Approach (K-MICHA) as an ensemble learning methodology. In CoSKNN, our aim was to tackle the imbalance problem by using cosine similarity as a distance metric and by introducing a cost-sensitive score for classification with the KNN algorithm. K-MICHA, on the other hand, clusters data points that are similar in terms of the classifiers' outputs, then calculates the fraud probability in each resulting cluster and uses these probabilities to detect fraud in new transactions.
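The general idea behind CoSKNN can be illustrated with a minimal sketch. The summary does not give the exact form of the cost-sensitive score, so the rule used here (each neighbor votes with its cosine similarity, multiplied by a per-class cost) is an assumption made for illustration only, not the thesis's formula. The function names and the `fraud_cost` and `legit_cost` parameters are likewise hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def cost_sensitive_knn_predict(train_X, train_y, query, k=3,
                               fraud_cost=5.0, legit_cost=1.0):
    """Return 1 (fraud) or 0 (legitimate) for a single query vector.

    Assumed scoring rule (not taken from the thesis): each of the k most
    similar training points adds cost * similarity to its class score.
    """
    # Compute each similarity once, then keep the k most similar points.
    scored = [(cosine_similarity(x, query), y) for x, y in zip(train_X, train_y)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    neighbors = scored[:k]

    fraud_score = 0.0
    legit_score = 0.0
    for sim, label in neighbors:
        weight = max(sim, 0.0)  # ignore negative similarities
        if label == 1:
            fraud_score += fraud_cost * weight
        else:
            legit_score += legit_cost * weight
    return 1 if fraud_score > legit_score else 0
```

With a higher cost on the minority (fraud) class, a single similar fraud neighbor can outweigh several legitimate ones, which is the kind of effect a cost-sensitive score is meant to produce on imbalanced data.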

Finally, we applied K-MICHA to credit card, mobile payment, and auto insurance fraud data sets. In all three case studies, we compared K-MICHA with stacking using voting, weighted voting, logistic regression, and CART, as well as with AdaBoost and random forest. These experiments demonstrate the efficiency of K-MICHA.

Keywords: Financial fraud, Class imbalance, F1-score, Cost-sensitive classification, Cosine similarity, K-Nearest Neighbors, Ensemble learning, K-modes.


Jury:
Ms. Murisasco Elisabeth, Professor, Université de Toulon, Reviewer
Ms. Soule-Dupuy Chantal, Professor, Université Toulouse, Reviewer
Mr. Boucelma Omar, Professor, Université Aix-Marseille, Examiner
Ms. Assaghir Zainab, Associate Professor, Université Libanaise, Examiner
Mr. Taher Yehia, Lecturer (Maître de conférences), Université de Versailles, Examiner
Ms. Seba Hamida, Lecturer (Maître de conférences), LIRIS - Université Claude Bernard Lyon 1, Examiner
Mr. Hacid Mohand-Saïd, Professor, LIRIS - Université Claude Bernard Lyon 1, Thesis advisor
Mr. Zeineddine Hassan, Professor, Université Libanaise, Thesis advisor
Mr. Haque Akm Rafiqul, Research Director, Cognitus, Invited member