Best paper award at WEBIST 2019
CATI: Assisted Classification of Documents (text and images)
Domain knowledge is essential to data science since it provides the
scope for the construction of models, methods and techniques that
allow extracting insights from large amounts of data. A well-known
problem in this multidisciplinary field is that the person who holds
this initial knowledge is not necessarily a data scientist but a
domain expert, who therefore lacks training and experience
in data analysis. In this article we present a full platform for assisted
classification in the domain of microblogs, carried out mainly through
text- and image-based event detection and Active Learning (AL).
The process is fully supported through a graphical user interface
that provides users with classification and data exploration features
and whose source code is freely accessible.