GoTriple - A Medieval Epigraphic Corpus and its Retro-Developments (CIFM-CBMA). The Exploratory Research of the COSME2 Consortium

Contributors

Laboratoire de Médiévistique Occidentale de Paris (LAMOP); Université Paris 1 Panthéon-Sorbonne (UP1) - Centre National de la Recherche Scientifique (CNRS)

Cosme2 (Consortium Sources Médiévales 2) - TGIR Huma-Num - CNRS

Publisher

HAL CCSD

Oxford University Press

Abstract

The digital “Burgundian Epigraphic Corpus” is the result of a collaboration between two teams, the CIFM (Corpus of Inscriptions of Medieval France) and the CBMA (Corpus of Medieval Burgundian Texts), as part of Cosme2 (Consortium Sources Médiévales, linked to the TGIR Huma-Num of the CNRS, France), which is dedicated to digital approaches to historical corpora. This article describes how a complex set of documents mixing Latin, Greek, and Old French texts, accompanied by rich metadata, has been processed to enable new lines of inquiry by humanists. It shows how the corpus is continually reworked and how its exploitation, with the help of artificial intelligence, generates new data and metadata that can be fed back into the corpus and exploited in turn, creating a kind of virtuous circle. Three retro-developments are briefly discussed: (1) Semantic Web, connectivity, and named entities; (2) GIS and automated extraction of new metadata; (3) lemmatization and automatic language detection.