Conference paper, Year: 2006

About softness for inductive querying on sequence databases

Abstract

In many application domains (e.g., WWW usage mining, telecommunication data analysis, molecular biology), large sequence databases are available and yet under-exploited. The inductive database framework assumes that both such databases and the various patterns holding within them can be queried. In this setting, queries that return patterns are called inductive queries, and solving them is one of the main topics in database mining research. Indeed, constraint-based mining techniques on sequence databases have been studied extensively over the last few years, and efficient algorithms make it possible to compute complete collections of patterns (e.g., sequences) that satisfy conjunctions of monotonic and/or anti-monotonic constraints (e.g., minimal and maximal frequency constraints) in potentially large sequence databases. Studying new applications of these techniques, we consider fault tolerance and softness to be extremely important issues for tackling real-life data analysis. In this paper, we address some of the open problems that arise when computing soft occurrences of patterns within database sequences instead of the classical exact-matching ones. Such an extension is not trivial, since it prevents the clever use of monotonicity for pruning the search space. We describe our proposal and provide an experimental validation on real-life clickstream data, which confirms the added value of this approach.
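To make the pruning issue concrete, the following minimal Python sketch contrasts exact and soft support on a toy database. It is an illustration under stated assumptions and not the definition used in the paper: patterns are taken to be substrings, a soft occurrence is assumed to be a same-length window within a length-proportional Hamming budget, and the names `exact_support`, `soft_support`, and `chars_per_mismatch` are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def windows(seq, n):
    """All contiguous substrings of seq with length n (empty if seq is shorter)."""
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]


def exact_support(pattern, db):
    """Number of sequences that contain pattern as an exact substring."""
    return sum(pattern in seq for seq in db)


def soft_support(pattern, db, chars_per_mismatch=3):
    """Number of sequences with a window 'close enough' to pattern.

    Assumption for illustration only: one mismatch is tolerated per
    chars_per_mismatch pattern symbols, so longer patterns get a larger budget.
    """
    budget = len(pattern) // chars_per_mismatch
    return sum(
        any(hamming(pattern, w) <= budget for w in windows(seq, len(pattern)))
        for seq in db
    )


db = ["abc", "axc", "abd"]

# Exact matching: extending a pattern can never increase its support,
# so an infrequent pattern lets the miner prune all of its extensions.
print(exact_support("ab", db), exact_support("abc", db))

# Soft matching under the assumed budget: the longer pattern "abc" tolerates
# one mismatch while "ab" tolerates none, so support can grow with length.
print(soft_support("ab", db), soft_support("abc", db))
```

Under these assumptions, the exact support of "abc" never exceeds that of its sub-pattern "ab", whereas the soft support of "abc" is larger than that of "ab". A frequency threshold on soft support is therefore no longer anti-monotonic in this toy setting, which is the kind of obstacle to search-space pruning that the abstract points to.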
File not deposited

Dates and versions

hal-01613490 , version 1 (09-10-2017)

Cite

Ieva Mitasiunaite, Jean-François Boulicaut. About softness for inductive querying on sequence databases. 7th International Baltic Conference on Databases and Information Systems, DB&IS'06, Jul 2006, Vilnius, Lithuania. pp.77-82, ⟨10.1109/DBIS.2006.1678478⟩. ⟨hal-01613490⟩