Laboratoire d'InfoRmatique en Image et Systèmes d'information

UMR 5205 CNRS / INSA de Lyon / Université Claude Bernard Lyon 1 / Université Lumière Lyon 2 / École Centrale de Lyon

Team DM2L : Data Mining and Machine Learning

Coordinator: Céline Robardet
Assistant coordinator: Alexandre Aussem
Beginning date: Oct 01, 2006
The Data Mining and Machine Learning team belongs to the Data Science pole.

    DM2L is a team created in 2012 whose scientific activity is devoted to knowledge discovery from data using automatic or semi-automatic techniques. This includes data mining, machine learning, pattern recognition, statistical learning, data analysis, data archaeology, etc. Its main research interests are data mining and machine learning.

    • Data mining is an area of research that emerged in the early 1990s from the need for methods of knowledge discovery from large amounts of data. Initially related to disciplines such as statistics, machine learning, and databases, data mining is now a mature field with its own major annual conferences (ACM SIGKDD, IEEE ICDM, SIAM DM, ECML/PKDD, PAKDD) and well-established journals (Data Mining and Knowledge Discovery, ACM Transactions on Knowledge Discovery from Data, IEEE Transactions on Knowledge and Data Engineering). Data mining methods are often characterized as unsupervised processes that aim to describe and summarize data and to raise hypotheses from it.
    • The field of machine learning covers the development, analysis and implementation of methods that learn to perform a task from examples. Its main objective is to build systems capable not only of learning but also of generalization, i.e. the ability to extend to the whole population what has been observed in a sample. Basic research in recent decades has led to the development of many tools for practitioners in varied fields such as industrial production (product design support, preventive maintenance, industrial diagnosis), biology and health (drug discovery support, study of risk factors in epidemiology, medical diagnosis assistance), web marketing, e-banking, CRM (e-commerce) in retail, and fraud detection. Researchers in machine learning, depending on the nature of their work and their own inclinations, may describe their area of research as part of artificial intelligence, inferential statistics, cognitive science, statistical physics, continuous or combinatorial optimization, pattern recognition, information theory or induction. This diversity allows multiple forms of cross-fertilization. However, the designer of learning algorithms pays particular attention to:
      • (a) the natural mechanisms of learning and generalization
      • (b) the solid mathematical foundations of the methods, and
      • (c) the rigorous statistical validation on data that were not used to fit the model.
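
      Point (c) can be illustrated with a minimal hold-out sketch. It is not a method of the team: the synthetic data, the one-parameter threshold classifier, and the helper names `holdout_split`, `fit_threshold` and `accuracy` are all assumptions made for illustration. The model is fitted on one part of the sample and its accuracy is measured on a part that was never used for fitting.

```python
import random


def holdout_split(examples, test_fraction=0.25, seed=0):
    """Shuffle the examples and split them into (train, held-out test)."""
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_threshold(train):
    """Fit a toy classifier: the midpoint between the two class means."""
    pos = [x for x, y in train if y == 1]
    neg = [x for x, y in train if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def accuracy(threshold, data):
    """Fraction of examples whose label matches the rule 'x > threshold'."""
    return sum((x > threshold) == (y == 1) for x, y in data) / len(data)


# Synthetic sample (an assumption): two Gaussian classes centred at 0 and 2.
rng = random.Random(42)
data = [(rng.gauss(0.0, 1.0), 0) for _ in range(200)]
data += [(rng.gauss(2.0, 1.0), 1) for _ in range(200)]

train, test = holdout_split(data)
threshold = fit_threshold(train)  # the model only ever sees `train`

print("accuracy on training data:", accuracy(threshold, train))
print("accuracy on held-out data:", accuracy(threshold, test))
```

      The held-out accuracy is the figure that estimates generalization; the training accuracy alone would be optimistic for more flexible models.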

      In addition to the many major annual conferences that are common to data mining and machine learning, the main conferences specific to machine learning are ICML, UAI, and NIPS. Similarly, in addition to the major journals on KDD, there are a few well-established journals dedicated to learning methods (Journal of Machine Learning Research, Pattern Recognition, Machine Learning, Neurocomputing). Significant contributions in data mining and machine learning also appear in conferences and journals on artificial intelligence (e.g., IJCAI, AAAI, ECAI, Artificial Intelligence) and data management (e.g., VLDB, ACM CIKM, Information Systems).

    This research is developed in close connection with the analysis of real data: quantitative and qualitative empirical study on real data is essential. While the DM2L team mainly develops methods and algorithms rather than applications, it works with data owners from several domains. The life sciences and molecular biology have been particularly targeted in recent years, but we are also interested in data related to the "Intelligences des Mondes Urbains" (Intelligences of Urban Worlds) labex (eco-technologies and "urban monitoring", transport and mobility, the emergence of communities and the analysis of social interactions, understanding of the human impact on the environment and biodiversity, etc.), while pursuing applications in areas such as health and the design and monitoring of manufactured goods. This diversity reflects our commitment to remaining methods-centered and to developing generic algorithms applicable to a broad spectrum of applications.

    Our results are theoretical, methodological and algorithmic, and also take the form of software and applications. Our guiding principle is to help data owners throughout the interactive process of knowledge discovery from data. As these processes require combining a wide range of descriptive and inductive paradigms (pattern extraction, classification, statistical learning, including Bayesian networks, ensemble methods, kernel methods, connectionist methods, etc.), the DM2L team relies on ten permanent researchers whose computer science skills are complementary.

    More specifically, the team is working on the following problems:
    • Fundamentals of constraint-based mining
    • Mining of n-ary relations or Boolean tensors
    • Spatio-temporal data mining
    • Dynamic attributed graph mining
    • Ensemble learning
    • Learning probabilistic graphical models
    • Unsupervised learning with and without constraints

    Last update: 2014-01-10 08:52:47