HDR of Andrea Mauri
Subject:
Summary:
Data-intensive applications are increasingly involved in critical decision-making processes across domains such as healthcare, policymaking, and digital services. While the data management and machine learning communities have made substantial progress in improving the efficiency, scalability, and accuracy of these systems, the role of humans, both as contributors to data and as individuals affected by data-driven decisions, remains insufficiently integrated into their design and deployment. This manuscript argues for a comprehensive, human-centered perspective on data-intensive applications, in which human factors are treated as first-class concerns throughout the entire data lifecycle.
The work is structured around a conceptual model of the data pipeline comprising data collection, data pre-processing, and data analysis, and investigates how human involvement can be systematically incorporated at each stage. In the context of data collection, the manuscript presents participatory methods and a policy-sandboxing approach that leverage empathy and exposure to diverse perspectives to mitigate bias and support more inclusive decision-making. For data pre-processing, the work introduces user-centric and interactive techniques for improving data quality, with a particular focus on graph data, including novel frameworks and empirical studies on human-centered graph repair, as well as exploratory investigations into the use of large language models to assist these processes. In the data analysis stage, the manuscript examines how users, particularly non-experts, interact with complex data systems, presenting quantitative and qualitative studies on the learning and use of graph query languages, and deriving actionable insights for the design of more accessible analytical tools.
Beyond individual pipeline stages, the manuscript addresses the broader challenge of designing data-intensive applications that account for societal impact, value tensions, and care-oriented considerations, drawing on methods from human-computer interaction such as participatory and speculative design. Overall, this work contributes a coherent research agenda, methodological foundations, and concrete systems and empirical results that advance the field of Human-Centered Data Management, demonstrating how integrating human factors can lead to more trustworthy, inclusive, and effective data-intensive applications.
Jury:
| Mr. Fletcher George | Professor | Eindhoven University of Technology | Reviewer |
| Mr. Quercia Daniele | Professor | Nokia Bell Labs Cambridge and Politecnico di Torino | Examiner |
| Mr. Miklos Zoltan | Professor | University of Rennes | Examiner |
| Mr. Missier Paolo | Professor | University of Birmingham | Examiner |
| Mr. Kheddouci Hamamache | Professor | Université Claude Bernard Lyon 1 | Examiner |
| Ms. Bonifati Angela | Professor | Université Claude Bernard Lyon 1 | Examiner |