Thesis of Abdelhamid Gaddari


Subject:
Analysis and Prediction of Patient Pathways in the Context of Supplemental Health Insurance

Start date: 19/11/2020
End date (estimated): 19/11/2023

Advisor: Mohand-Said Hacid
Coadvisor: Haytham Elghazel

Summary:

This thesis belongs to healthcare informatics research, specifically the analysis and prediction of patients’ care pathways, which are the sequences of medical services consumed by patients over time. Our aim is to propose an innovative approach for exploiting patient care trajectory data to perform not only binary but also multi-label classification. We also design a new sentence embedding framework dedicated to the French medical domain, which exploits another view of the patients’ care pathways in order to enhance the predictive performance of our proposed approach.

Our research is part of the work of CEGEDIM ASSURANCES, a business unit of the CEGEDIM Group that provides software and services for the French supplementary healthcare insurance and risk management sectors. By analyzing patient care pathways with our proposed approach, we can identify patterns within patients’ medical journeys in order to predict potential medical events or upcoming medical consumption. This allows insurers to forecast future healthcare claims and therefore negotiate better rates with healthcare providers, supporting accurate financial planning, fair pricing models and cost reductions. Furthermore, it enables private healthcare insurers to design personalized health plans that meet the specific needs of patients, ensuring they receive the right care at the right time to prevent disease progression. Ultimately, offering preventive care programs and customized health products and services enhances client relationships, improving satisfaction and reducing churn.

In this work, we aim to develop an approach to analyze patient care pathways and predict medical events or upcoming treatments, based on a large portfolio of reimbursed medical records. To achieve this goal, we first propose a new time-aware framework based on long short-term memory (LSTM) networks that can perform both binary and multi-label classification. The proposed framework is then extended with another aspect of the patient healthcare trajectories, namely additional information from a fuzzy clustering of the same portfolio. We show that our proposed approach outperforms traditional and deep learning methods in medical binary and multi-label prediction. Subsequently, we enhance the predictive performance of our proposed approach by exploiting a supplementary view of the patient care pathways that consists of a detailed textual description of the consumed medical treatments. This is achieved through the design of F-BERTMed, a new sentence embedding framework for the French medical domain that presents significant advantages over state-of-the-art natural language processing (NLP) methods. F-BERTMed is based on FlauBERT, whose pretraining with masked language modeling (MLM) was extended on French medical texts before the model was fine-tuned on natural language inference (NLI) and semantic textual similarity (STS) tasks. We finally show that using F-BERTMed to generate a new representation of the patient care pathways enhances the performance of our proposed medical predictive framework on both binary and multi-label classification tasks.
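
The summary above does not say how the framework accounts for time. As a purely illustrative sketch, one common way to make a recurrent model time-aware is to discount its memory by a decreasing function of the time elapsed since the previous medical event, and one common way to obtain multi-label output is to apply an independent sigmoid to each label. The function names, the decay g(dt) = 1 / log(e + dt), the 0.5 threshold and the choice to discount the whole memory vector are assumptions made for this example, not details of the thesis.

```python
import math


def time_decay(delta_days: float) -> float:
    """Assumed discount g(dt) = 1 / log(e + dt); equals 1 when dt = 0."""
    return 1.0 / math.log(math.e + delta_days)


def discount_memory(memory: list[float], delta_days: float) -> list[float]:
    """Scale every memory component by the time discount.

    Simplification: the whole vector is discounted here; a real
    time-aware LSTM may only discount part of its cell state.
    """
    g = time_decay(delta_days)
    return [g * m for m in memory]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(logits: list[float], threshold: float = 0.5) -> list[int]:
    """Multi-label decision: one independent sigmoid per label.

    Binary classification is the special case of a single logit.
    """
    return [int(sigmoid(z) >= threshold) for z in logits]
```

With this decay, a care event that follows a 30-day gap inherits less of the previous memory than one that follows a 1-day gap, and each label is switched on or off independently of the others.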