Thesis of Orhan Yazar


Subject:
Demand Management Process in the Agri-Food Industry: building, analyzing, and implementing a model forecast

Summary:

Multi-target regression (MTR) has attracted increasing attention in recent years. The main challenge in MTR is to build predictive models for problems with multiple continuous targets while accounting for the inter-target correlation, which can greatly influence predictive performance. MTR arises in several modern applications, including ecology, biophysics, and medicine.
Most existing methods overlook one aspect, namely the impact of the inputs on target correlations (i.e., conditional target correlation). In this thesis, we first propose a novel MTR framework, termed Conditionally Decorrelated Multi-Target Regression (CDMTR). CDMTR learns from MTR data in three elementary steps: clustering analysis, conditional target decorrelation, and induction of multi-target regression models. The clustering step investigates the underlying properties of the training data in order to decompose the original MTR problem into several MTR subproblems. The goal is to capture correlations in the input-feature space effectively so as to facilitate the subsequent discrimination process. In the second step, CDMTR performs, within each cluster, a principal component analysis (PCA) of the target space to derive linear combinations of the targets. In the third step, the transformed targets (i.e., the principal components) are each fed to a simple single-target regression method that does not need to account for conditional target dependencies, since the transformed targets are uncorrelated within each cluster.
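The three steps can be illustrated with a minimal sketch. It is not the implementation from the thesis: the choice of k-means for clustering, ordinary least squares as the single-target regressor, and all function names (`kmeans`, `assign`, `fit_cdmtr`, `predict_cdmtr`) are assumptions made here for illustration only.

```python
import numpy as np

def assign(X, centers):
    """Index of the nearest center for every row of X."""
    dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means (assumed clustering method); returns the centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        labels = assign(X, centers)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers

def fit_cdmtr(X, Y, k=2, seed=0):
    # Step 1: cluster the input space to split the problem into subproblems.
    centers = kmeans(X, k, seed=seed)
    labels = assign(X, centers)
    models = []
    for j in range(k):
        Xc, Yc = X[labels == j], Y[labels == j]
        # Step 2: PCA of the target space inside the cluster.
        y_mean = Yc.mean(axis=0)
        _, _, Vt = np.linalg.svd(Yc - y_mean, full_matrices=False)
        Z = (Yc - y_mean) @ Vt.T  # principal components, uncorrelated in-cluster
        # Step 3: one independent single-target regression per component
        # (ordinary least squares with an intercept, an assumed base learner).
        A = np.hstack([Xc, np.ones((len(Xc), 1))])
        W = np.column_stack(
            [np.linalg.lstsq(A, Z[:, i], rcond=None)[0] for i in range(Z.shape[1])]
        )
        models.append((W, Vt, y_mean))
    return centers, models

def predict_cdmtr(X, centers, models):
    """Route each sample to its cluster, predict components, map back to targets."""
    labels = assign(X, centers)
    Y_hat = np.empty((len(X), models[0][2].shape[0]))
    for j, (W, Vt, y_mean) in enumerate(models):
        mask = labels == j
        if mask.any():
            A = np.hstack([X[mask], np.ones((mask.sum(), 1))])
            Y_hat[mask] = (A @ W) @ Vt + y_mean
    return Y_hat
```

Because the principal components are uncorrelated within a cluster, each one is fitted separately here; predictions are mapped back to the original targets by inverting the PCA rotation. The sketch assumes no cluster ends up empty.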
Through this approach, we demonstrate that exploiting conditional target dependencies in MTR can greatly influence generalization performance, although the benefit depends closely on the properties of the data and the type of loss to be minimized. Indeed, for MTR data in which many inter-dependencies between the targets may be present, explicitly modeling all inter-target and input-output relationships is intuitively far more reasonable. In the second part of this thesis, the multi-target regression and optimal feature subset selection problems are formulated within a unified probabilistic framework, termed Conditionally Independent Target Subsets (CITS). It uses Bayesian networks to explicitly identify the different conditionally independent target subsets and their optimal sets of predictors in order to improve the multi-target regression training process.
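The idea of splitting the targets into conditionally independent subsets can be illustrated with a much simpler stand-in. The sketch below does not learn a Bayesian network and does not select features, so it is not the CITS method: it assumes a linear-Gaussian setting, regresses every target on the inputs, and groups targets whose residuals remain correlated. The function name `target_subsets` and the `threshold` value are hypothetical.

```python
import numpy as np

def target_subsets(X, Y, threshold=0.2):
    """Group targets whose residuals (given X) are still correlated.

    Stand-in for structure learning: under a linear-Gaussian assumption,
    near-zero residual correlation suggests conditional independence given X.
    """
    # Remove the linear effect of the inputs from every target.
    A = np.hstack([X, np.ones((len(X), 1))])
    coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
    residuals = Y - A @ coef
    # Link two targets when their residual correlation exceeds the threshold.
    linked = np.abs(np.corrcoef(residuals, rowvar=False)) > threshold
    # Connected components of the link graph are the target subsets.
    unseen = set(range(Y.shape[1]))
    subsets = []
    while unseen:
        stack, component = [unseen.pop()], []
        while stack:
            i = stack.pop()
            component.append(i)
            neighbours = [j for j in unseen if linked[i, j]]
            unseen.difference_update(neighbours)
            stack.extend(neighbours)
        subsets.append(sorted(component))
    return sorted(subsets)
```

Each returned subset could then be handled as its own, smaller multi-target problem, while targets in different subsets are modeled separately.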
Tested on several benchmark data sets, the approaches developed in this thesis show promise compared with competitive state-of-the-art alternatives. Extensive experiments are also conducted on the Panzani industrial database to assess discount campaigns in the agri-food industry.


Advisor: Mohand-Saïd Hacid
Coadvisor: Haytham Elghazel

Defense date: Friday, June 11, 2021

Jury:
Mr Bennani Younes, Professor, Université Sorbonne Paris Nord, Reviewer
Ms Kuntz Pascale, Professor, Université de Nantes, Reviewer
Ms Amer-Yahia Sihem, Research Director, CNRS Grenoble, Examiner
Mr Benabdeslem Khalid, Associate Professor, Université Lyon 1, Examiner
Mr Hacid Mohand-Saïd, Professor, Université Lyon 1, Thesis advisor
Mr Elghazel Haytham, Associate Professor, Université Lyon 1, Thesis co-advisor
Ms Castin Nathalie, Industrial Manager, Panzani, Guest