Thesis of Haitang Feng
Subject:
Defense date:
Advisor: Mohand-Said Hacid
Coadvisor: Nicolas Lumineau
Summary:
Computing forecasts has become crucial to the development of marketing and financial strategies in organizations. A key issue is obtaining quality forecasts, that is, forecasts that are reliable, detailed and up to date. Existing solutions generally require setting up heavyweight reporting, yet they rarely achieve these quality objectives. The company Anticipeo proposes alternatives, but they rely on complex processing of large volumes of statistical data. To be usable, the Anticipeo solution must be optimized to guarantee a processing time that is acceptable to customers. Depending on the amount of data processed, the current processing time (calculation and data access) can run to several hours or even several days. This is a major obstacle to deploying the solution for large enterprise customers.
Optimizing this processing raises several scientific challenges. In particular, access to resources (CPU and data) is a sensitive issue in a supercomputing context.
It is therefore essential to find an optimized execution plan that makes the most of the available resources and minimizes data access in the supercomputing context.
This thesis therefore addresses the performance of complex, time-consuming processing over data distributed across a cluster of servers. The objective is to define a smart solution that can self-organize and allocate the available resources in order to optimize the calculations. The chosen approach is to develop an application layer for monitoring resources and data. Some data will be replicated to make the calculations easier to parallelize, and data and index caches will be set up to speed up access to the data relevant to the calculations.