Impresso: explore 200 years of newspaper archives
Apr 29, 2023, 7:00 AM - 4:00 PM
STCC - Garden level, halls 3 & 4
Description
Historical newspapers constitute an extremely rich and varied historical source, but they are difficult to exploit (there are many pages to read!) and they very often remain isolated in 'institutional silos' (the archival collections are disconnected from one another).
'Impresso. Media Monitoring of the Past' used text mining techniques to enrich a corpus of almost 100 newspapers in French, German and Luxembourgish, and developed a new web interface, inspired by historical research practices, to facilitate data exploration.
To accomplish this, an interdisciplinary team composed of computer scientists, designers and historians worked closely together to meet the challenges of accessing data from the past, namely: obtaining good performance despite historical documents that are difficult to process with automatic tools, applying the tools to very large volumes of data, and designing an appropriate interface to facilitate the work of historians. Beyond these specific challenges, the question of how best to adapt text mining tools and their use by humanities researchers is at the heart of the Impresso project.