Tutorial on Preference-based Pattern Mining

Speakers

The authors of the tutorial are, in alphabetical order, Bruno Crémilleux, Marc Plantevit and Arnaud Soulet. All have extensive experience in research on pattern mining and how to discover useful patterns.

  • Bruno Crémilleux, Université de Caen, France. He received his PhD in computer science in 1991 from the University of Grenoble. He has been a professor of computer science at the University of Caen-Normandy since 2005. His main research interests are pattern (set) discovery, Constraint Satisfaction Problems and data mining, preference queries and exploratory data mining. He was co-chair of several ECML/PKDD workshops.
  • Marc Plantevit, Université de Lyon, France. He received his PhD in computer science in 2008 from the University of Montpellier. He has been an associate professor in the computer science department of the University of Lyon since 2009. His research interests include constraint-based pattern mining in general. Currently, he is particularly interested in sophisticated pattern domains (dynamic/attributed graphs) and in incorporating background knowledge into pattern mining.
  • Arnaud Soulet, Université François Rabelais de Tours, France. He received his PhD in 2006 from the University of Caen. He has been an associate professor of computer science at the University François Rabelais of Tours since 2007. He has expertise in constraint-based pattern mining and in user involvement in the mining process, for example pattern mining techniques for preference elicitation.

Material

Click on the image to see the slides of the tutorial.

Tutorial Description

Context and Goal

The paradigm of constraint-based pattern mining makes a strong assumption: the user knows what he/she is looking for and, moreover, is able to express queries to pattern mining solvers. In practice, the user has only a vague idea of what useful patterns could be, and it is very hard to derive appropriate queries for the solvers. This explains the growing role of preferences in pattern mining. The aim of this tutorial is to provide a comprehensive overview of preferences and relevant methods in the preference-based pattern mining field.

Content

Constraint-based pattern mining is now a mature domain of data mining that makes it possible to handle various pattern domains (e.g., itemsets, sequences, graphs, dynamic graphs) with a large variety of constraints, thanks to solid theoretical foundations and efficient algorithmic machinery. However, constraint-based pattern mining assumes that the user is able to express what he/she is looking for, it requires fine tuning of thresholds, and the resulting collections of patterns are often too large to be truly exploited. This picture may explain why preferences in pattern mining are becoming more and more important.

Preferences in pattern mining do not come from scratch. In constraint-based pattern mining, utility functions finely measure the interest of a pattern and can be seen as a quantitative preference model. Many other mechanisms have been developed, such as mining the most interesting patterns according to one measure (top-k patterns) or several measures (skyline patterns), reducing redundancy by integrating subjective interestingness, and casting the pattern mining task as an optimization problem.
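As an illustration of the difference between top-k and skyline patterns, here is a minimal Python sketch on a toy itemset database. The transactions, the function names and the choice of the two measures (frequency and length) are assumptions made for the example; the naive enumeration is only suitable for tiny data and is not one of the algorithms covered in the tutorial.

```python
from itertools import combinations

# Hypothetical toy transaction database, for illustration only.
TRANSACTIONS = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c", "d"},
]


def enumerate_patterns(transactions):
    """Naively list every itemset occurring in at least one transaction,
    paired with its frequency (number of supporting transactions)."""
    items = sorted(set().union(*transactions))
    patterns = []
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            freq = sum(1 for t in transactions if itemset <= t)
            if freq > 0:
                patterns.append((itemset, freq))
    return patterns


def frequency(pattern):
    return pattern[1]


def length(pattern):
    return len(pattern[0])


def top_k(patterns, k, measure):
    """Keep the k best patterns according to a single measure."""
    return sorted(patterns, key=measure, reverse=True)[:k]


def dominates(p, q, measures):
    """p dominates q if it is at least as good on every measure
    and strictly better on at least one."""
    return (all(m(p) >= m(q) for m in measures)
            and any(m(p) > m(q) for m in measures))


def skyline(patterns, measures):
    """Keep the patterns that no other pattern dominates."""
    return [p for p in patterns
            if not any(dominates(q, p, measures) for q in patterns)]


PATTERNS = enumerate_patterns(TRANSACTIONS)
TOP3 = top_k(PATTERNS, 3, frequency)
SKYLINE = skyline(PATTERNS, [frequency, length])

for itemset, freq in SKYLINE:
    print(sorted(itemset), freq)
```

With a single measure the user must still choose k, whereas the skyline returns every pattern that offers a best trade-off between the measures, without any threshold.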

However, all of the above approaches assume that preferences are explicit and given in the process. In practice, the user has only a vague idea of what useful patterns could be and there is a need to elicit preferences. The recent research field of interactive pattern mining relies on the automatic acquisition of these preferences. Basically, its principle is to repeat a short mining loop centered on the user. At each iteration, only some patterns are mined and the user has to indicate those that are relevant (by liking/disliking, rating, ranking). The user feedback improves an automatically learned model of preferences that will refine the pattern mining step in the next iteration. A great advantage is that the user does not have to make her preference model explicit. In addition, each iteration is fast and does not overwhelm the user with a huge collection of patterns that is impossible to analyze. Interestingly, this mining process raises new challenges: What user feedback should be captured? How can a preference model be elicited? How can patterns be mined instantly based on preferences?
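The loop described above can be sketched in a few lines of Python. This is only a toy illustration under strong assumptions: the candidate patterns are enumerated in advance, the preference model is a hypothetical additive weight per item, feedback is a binary like/dislike, and the user is simulated by a function. Real interactive systems learn richer models and mine patterns on the fly.

```python
from itertools import combinations


def score(pattern, weights):
    """Score of a pattern under the current (assumed additive) preference model."""
    return sum(weights[item] for item in pattern)


def interactive_mining(candidates, user_likes, iterations=3, batch_size=2, step=1.0):
    """Repeat a short loop: show a few patterns, collect like/dislike
    feedback, update the model, and use it to choose the next batch."""
    items = sorted(set().union(*candidates))
    weights = {item: 0.0 for item in items}
    unseen = set(candidates)
    history = []
    for _ in range(iterations):
        if not unseen:
            break
        # "Mining" step: rank unseen patterns with the current model
        # (ties are broken alphabetically to keep the run deterministic).
        ranked = sorted(unseen,
                        key=lambda p: (-score(p, weights), tuple(sorted(p))))
        batch = ranked[:batch_size]
        # Feedback step: the user likes or dislikes each shown pattern.
        for pattern in batch:
            liked = user_likes(pattern)
            delta = step if liked else -step
            # Learning step: shift the weight of every item in the pattern.
            for item in pattern:
                weights[item] += delta
            history.append((pattern, liked))
        unseen.difference_update(batch)
    return weights, history


# Hypothetical candidate patterns: all non-empty subsets of {a, b, c}.
ITEMS = ["a", "b", "c"]
CANDIDATES = [frozenset(c)
              for n in range(1, len(ITEMS) + 1)
              for c in combinations(ITEMS, n)]

# Simulated user whose hidden preference is "patterns containing item c".
weights, history = interactive_mining(CANDIDATES, lambda p: "c" in p)
print(weights)
```

The user never states the rule "contains c"; the model recovers it from feedback on a handful of patterns per iteration, which is the point of the interactive setting.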

Relevance

Preferences are a way to put the user in the loop of the data mining process. More generally, user-centered methods are crucial in the field of exploratory data analysis (Information Retrieval, OnLine Analytical Processing, Knowledge Discovery in Databases). They are based primarily on the subjective knowledge of the user, which is expressed in the form of preferences. In recent years, part of the work in pattern mining has followed this direction. It therefore seems important to present a tutorial on the motivations, challenges and methods at the intersection of preferences and pattern mining.

Target Audience

The target audience of this tutorial consists of researchers and practitioners in both academia and industry who are interested in getting a high-level, comprehensive overview of how high-quality patterns can be mined and employed by taking into account the end-user’s preferences. Prior knowledge of constraint-based pattern mining, preferences and constraints is not required; we will provide a quick overview of these topics.

Outline

This is neither a tutorial on constraint-based pattern mining nor on preference learning.

preferencebasedpatternminingtutorial.txt · Last modified: 2016/09/16 13:32 by mplantev
CC Attribution-Noncommercial-Share Alike 3.0 Unported