Digital Diplomatics Conference 2022

Organizer(s)
Centre for Information Modelling - Austrian Centre for Digital Humanities
ZIP
8010
Location
Graz
Country
Austria
Took place
Hybrid
From - Until
28.09.2022 - 30.09.2022
By
Franziska Decker / Georg Vogeler, Zentrum für Informationsmodellierung in den Geisteswissenschaften, Karl-Franzens-Universität Graz

The Digital Diplomatics 2022 conference was the fifth instance of an international workshop and conference series (after Munich 2004, Munich 2007, Naples 2011, and Paris 2013).1 Several of the 2022 speakers had already been present at previous instances, allowing us to follow the development of their respective projects and datasets over the years. BALÁZS CSIBA and JURAJ ŠEDIVÝ (Bratislava) updated us on the Documenta Posoniensia collection of documents regarding the city of Bratislava, accessible on Monasterium2, Crossborder Archives SK-AT (CRAC)3, and the general cultural heritage portal pamMap.4 The problems and perspectives of the Diplomatarium Serbicum Digitale (DSD), proposed at the 2011 conference, launched in 2015, and shut down in 2020, were presented by ŽARKO VUJOŠEVIĆ (Belgrade). ZSOLT HUNYADI (Szeged) covered the history and current status of a database of charter calendars for the 86-year period in which Hungary was ruled by kings of Angevin lineage5, which had also been reported on at the 2013 conference in Paris.

Other repeat guests at the conference series are two well-known German research institutions, the Monumenta Germaniae Historica (MGH) and the Regesta Imperii (RI), which continuously report on the progress of their digitization and of their respective databases. CLEMENS RADL (Munich) gave a brief overview of the MGH digital “ecosystem”, in particular the most recent openMGH project, which aims to successively provide TEI-XML files of all the printed editions. STEFFEN KRIEB (Mainz) presented the RI, arguing that Johann Friedrich Böhmer’s preface to the first volume could be read as encouraging the use of the regesta for digital diplomatics purposes: Böhmer pointed out that the value of the regesta lies not only in their being a preliminary work but also in their being a research source in their own right. Krieb concluded that further improving the machine-readability of the data is a focus of the RI branch in Mainz.

The creation of new databases and the gathering of data from existing collections have been topics from the very beginning of the Digital Diplomatics conference series. JULIEN THÉRY (Lyon) presented the APOSCRIPTA database, launched in 2017, which aims to gather a unified corpus of as many mediaeval papal documents as possible as a collective and collaborative addition to the third edition of the Regesta Pontificum Romanorum or the Index actorum Romanorum pontificum. JAN BURGERS and RIK HOEKSTRA (Amsterdam) covered the Digitale Charterbank Nederland (DCN)5, hosted in the same system used by the majority of Dutch archives and gathering the corpus of mediaeval and early modern charters preserved in these archives. The speakers presented examples where well-selected search terms make major historical developments visible, but emphasised the importance of thorough source criticism when working with databases. JOANNA TUCKER and JOHN DAVIES (Glasgow) discussed the research tools of the Models of Authority: Scottish Charters and the Emergence of Government 1100–1250 project6, with its deep annotations (clauses, graphs, persons and places) and its interoperability with the People of Medieval Scotland 1093–13717 database. They concluded with a case study on boundary and holding clauses, as well as warrandice clauses. STEFAN BRÖHL (Karlsruhe) spoke about the joint efforts of six Southern German archives to collect, describe, and publish about 7,000 diplomatic units of the counts palatine of the Rhine from 1449 to 1508, together with their metadata, realised in the context of the Archivportal-D. KRISZTINA ARANY and BENCE PETERFI (Budapest) introduced the audience to the history of the Hungarian National Archives and their research portals, such as the Digital Archives Portal8 and Hungaricana9, focussing especially on the “Reformation” and “Vestigia” databases. MARGUERITE DALLAS (Zurich) presented a linguistic approach to charter databases. The database Documents et analyses linguistiques de la Galloromania médiévale (GallRom), a work in progress since 2020, aims to combine bibliographical and edition data with lexical data from reference dictionaries and philological data such as scriptological analyses, covering French, Francoprovençal, Occitan, and Gascon.

Having been touched on only marginally at the previous conference instances10, databases using Semantic Web annotation can be seen as a rather modern approach to database construction in our conference series. CHRISTIAN DOMENIG (Klagenfurt) gave an update on his 2007 presentation and reported on his digital edition of the charters of the counts of Cilli (Quellen zur Geschichte des Alpen-Adria-Raumes), which was rebuilt by moving from Microsoft Access to SemanticMediaWiki. He demonstrated how such tools could be used for zero-budget projects and especially for teaching purposes. ADAM ZAPALA (Warsaw) discussed a workflow for the processing of papal letters as part of the Monumenta Vaticana res gestas Polonicas Illustrantia project. The project creates an XML database with persistent identifiers provided by Wikibase and then uses an image indexing software based on GIS technology (INDXR) to annotate content information from the documents, treating their digital representations as maps. The project team plans to publish the annotated full-text editions using TEI Publisher and to connect them to Wikibase. Another promising aspect of modern database development, the use of crowdsourcing, was addressed by AMABLE SABLON DU CORAIL (Paris). To allow community-based annotation, the French National Archives have developed a web tool for HTR and for the conversion of archival descriptions into XML/EAD encoded data (Saisie d’Inventaires Manuscrits Assistée par Reconnaissance Automatique (SIMARA)), which could support a change in the frame of reference from traditional fonds to a “records in contexts” conceptual model.

While many talks of the conference dealt with the provision of source material in the form of sustainable online databases, a significant part of the presentations showcased the use of such data with machine learning approaches for specific research questions. Such topics had rarely been discussed at the previous conferences up to 2013. For instance, automated text recognition and digital palaeography continue to be important topics in the digital humanities, as shown by three talks in Graz. PETER A. STOKES (Paris) recalled his personal research history in digital diplomatics and palaeography, emphasising the value of training data, rather than the software itself, for the results of a machine learning model. We were encouraged to collaborate and share as much training data as possible, even when this might be challenging due to differences in transcription and segmentation standards, images, and script style. SERENA AMMIRATI and PAOLO MERIALDO (Rome) presented the In Codice Ratio project and its sequel project Matrices, which trains a neural network capable of identifying different scribes. By applying an explainable AI (XAI) technique, they seek to understand which parts of the data the ML system considers important for the classification process, knowledge that could help the experts to focus on details and possibly discover new features relevant for their palaeographic analyses. TOBIAS HODEL (Bern) asked whether models trained on the confidence matrices of automated text recognition (ATR) could be used as tools for palaeographic questions such as the identification of scribes. He concluded that they may not yet be suitable for scribe identification but can help us to better understand how ATR algorithms work.

The broad field of computer vision offers other methods for diplomatics besides palaeography. The NOT A writtEn word but graphic symbols (NOTAE) project11, presented by FRANCESCO LEOTTA (Rome), studies graphical signs drawn into documentary sources of all possible kinds (e.g. charters, letters, …) from the 5th to the 8th century. Their algorithm, based on ground truth augmented by a generative adversarial network, currently outputs suggestions for manual evaluation by experts; automated classification is planned for the future. MARK BELL (London) and JOHN STELL (Leeds) promoted the use of qualitative spatial representation (QSR) as a method for tackling digital diplomatics challenges. The method uses spatial relationships in document layout to infer meaning, such as causal or temporal relationships, e.g. via common patterns of spatial reasoning, and shows great potential especially for the analysis of registers. ÁDÁM NOVÁK and SÁNDOR ÓNADI (Debrecen) showed how “reconstructions” of seals could be put together from RTI (reflectance transformation imaging) data of multiple incomplete impression remnants from the same matrix, how an original’s preservation could be enhanced by exhibiting its digital 3D images or 3D-printed copies instead, and that this also opens another possibility of “bringing home” seals whose originals remain in collections abroad.

The well-established field of natural language processing (NLP) was represented in multiple talks during the conference. TIMO KORKIAKANGAS’ (Helsinki) talk covered the Late Latin Charter Treebank (LLCT), a set of three dependency treebanks of about 1,200 Early Medieval Latin documentary texts, mainly from Tuscany. He demonstrated distant approaches in three case studies, e.g. how similar grammatical errors or the usage of formulaic ‘templates’ could lead to conclusions on Latin teaching, scribal and chancery schools, and educational ideals, which would then feed back into linguistic research. OLIVER SCHALLERT (Munich), CARSTEN BECKER (Berlin), and HELMUT SCHMID (Munich) showed that a vast, electronically accessible corpus of Middle High German charters provides fine-scaled data on area-specific language variation over the study period, unlike other historical documents. With the help of regular expressions drawn from the Wörterbuch der mittelhochdeutschen Urkundensprache (WMU) and semi-automated annotations created by Schmid’s RNNTagger12, one can study fine-scaled grammatical variation or the transmission of linguistic innovations. JEROEN DEPLOIGE and MARIJKE BEERSMANS (Ghent) presented ongoing work on the Diplomata Belgica database, as well as a case study of named entity recognition (NER) in a small corpus of 155 Middle Dutch charters and 5,000 tokens. They compared the outcomes of a custom bidirectional long short-term memory (Bi-LSTM) model and the state-of-the-art spaCy library, with the latter outperforming the former. MARLENE ERNST and ALINA OSTROWSKI (Passau) covered their recently started project on papal-delegated jurisdiction in the Iberian Peninsula of the 12th century. The speakers’ current work focuses on the application of text re-use detection to identify formulaic language, which will hopefully allow for an automated detection of diplomatic charter sections. Additionally, they plan to carry out a historical network analysis. EVELINE LECLERCQ (Ghent) used a “document vector matrix”, TF-IDF normalisation and square distance matrices combined with close reading to identify members of the chancery of the bishops of Cambrai, concluding that the procedure could also allow for the identification of editorial or ecclesiastical influence.
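The combination of a document vector matrix, TF-IDF weighting and a square distance matrix mentioned above can be illustrated roughly as follows. This is only a minimal sketch and not the speaker's actual pipeline: the three Latin snippets are invented toy texts, the function names are hypothetical, and the smoothed IDF formula and the Euclidean distance are assumptions chosen for simplicity.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, L2-normalised TF-IDF vectors) for tokenised documents."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(tok for doc in docs for tok in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed IDF (an assumption): terms found everywhere keep a small weight.
        vec = [tf[t] * (math.log((1 + n) / (1 + df[t])) + 1) for t in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vocab, vectors


def distance_matrix(vectors):
    """Square, symmetric matrix of Euclidean distances between document vectors."""
    return [[math.dist(a, b) for b in vectors] for a in vectors]


# Invented toy "charters"; real input would be full charter transcriptions.
charters = {
    "A": "in nomine sancte et individue trinitatis ego episcopus",
    "B": "in nomine sancte et individue trinitatis ego abbas",
    "C": "notum sit omnibus tam presentibus quam futuris quod",
}
names = list(charters)
vocab, vectors = tfidf_vectors([text.split() for text in charters.values()])
dist = distance_matrix(vectors)

for name, row in zip(names, dist):
    print(name, [round(d, 3) for d in row])
```

In such a matrix, texts sharing most of their formulae (A and B here) end up close together, while the unrelated text C is far from both; the close reading then has to decide what such proximity actually means.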

Statistical approaches have long been part of traditional diplomatics and are now supported by modern querying possibilities, as shown by MIGUEL CALLEJA-PUERTA (Oviedo) of the Notariado y construcción social de la realidad. Hacia una codificación del documento notarial, siglos XII–XVII project (NotFor). Using case studies on the notariate of Ribadavia, as well as on tax clauses and penal clauses in Oviedo and Grado, he demonstrated the possibilities of a domain-specific XML markup for the diplomatic structure of notarial documents combined with XQuery-based selections.
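The idea of selecting and counting clauses from structurally encoded charters can be sketched as follows. The NotFor project works with its own markup and XQuery; the element and attribute names below (`charter`, `clause`, `place`, `type`) and the sample data are purely hypothetical, and the sketch substitutes the limited XPath support of Python's standard library for XQuery.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Invented sample with hypothetical element and attribute names.
SAMPLE = """
<corpus>
  <charter id="c1" place="Oviedo">
    <clause type="penal">penal clause text</clause>
    <clause type="tax">tax clause text</clause>
  </charter>
  <charter id="c2" place="Oviedo">
    <clause type="penal">penal clause text</clause>
  </charter>
  <charter id="c3" place="Grado">
    <clause type="tax">tax clause text</clause>
  </charter>
</corpus>
"""

root = ET.fromstring(SAMPLE)

# Targeted selection, comparable to a simple XQuery path expression.
oviedo_penal = root.findall(".//charter[@place='Oviedo']/clause[@type='penal']")

# Cross-tabulation of clause types by place of issue.
counts = Counter(
    (charter.get("place"), clause.get("type"))
    for charter in root.iter("charter")
    for clause in charter.findall("clause")
)

print(len(oviedo_penal))
for (place, clause_type), n in sorted(counts.items()):
    print(place, clause_type, n)
```

Once the diplomatic structure is encoded explicitly, such counts become a by-product of the edition rather than a separate manual survey.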

The exploitation of machine learning methods for digital-born statistical questions played a more prominent role than at earlier instances of the conference series. NICOLE BERGK-PINTO (Paris), JACQUELINE SCHINDLER (Graz) and NIKLAS TSCHERNE (Graz) presented their ongoing project BeCoRe: Between Composition and Reception. It studies visual signs of authority (seals, monograms, initials, other graphical signs like the chrismon, etc.), but also takes textual features and layout into account. The team wants to examine whether clusters created by computational methods relate to regional or systematic categories (monastic order, monastery, or others) in the sense of Fichtenau’s concept of “diplomatic landscapes”. PIA GEISSEL’s (Wuppertal) stylometric experiments with principal component analysis (PCA) on temporal and geographic classes of Brescian and Milanese charters from the 10th to the 12th century in the Codice diplomatico della Lombardia medievale (CDLM)13 revealed a homogeneity of the material that is not representative of the charter production of that period and area and that stems from the way the corpus was created. JOHN MCEWAN (St. Louis) trained a machine learning algorithm to extract information on the shape and class (meaning their depiction) of the unattached seals found on the Digital Sigillography Resource (Digisig)14 platform and to date them automatically, comparing the outcomes to datings from the Portable Antiquities Scheme Dataset.

The aforementioned shift from traditional concepts to more advanced statistics was also evident in the conference’s keynote by MICHAEL GERVERS and GELILA TILAHUN (Toronto). In their Documents of Early England Data Set (DEEDS) project15, it had become evident that only a small subset of the documents carried fixed dating clauses. By using two-word shingles and statistical analyses, they were able to date any charter from this corpus with an accuracy of ±7 years. They are now examining how certain economic and social changes in England led to changing patterns in the vocabulary and style of the investigated documents by applying a latent Dirichlet allocation (LDA) model. Gervers concluded that standalone databases like the DEEDS database are too limited to sufficiently answer these kinds of diplomatic questions. Therefore, he advocated gathering, standardising and linking more data and repositories, summarising his wishes for the future of digital diplomatics in the bi-gram “collaborative cooperation”.
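How two-word shingles can carry chronological information may be illustrated with a deliberately simplified sketch. It is not the DEEDS method, whose statistical details are not given in this report: the texts and years are invented, the function names are hypothetical, and the estimate is merely a Jaccard-weighted mean of the years of the dated reference charters.

```python
def shingles(text, k=2):
    """Set of overlapping k-word sequences (shingles) in a text."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def jaccard(a, b):
    """Share of shingles two sets have in common (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def estimate_year(undated_text, dated_charters):
    """Similarity-weighted mean year; None if no shingle is shared at all."""
    target = shingles(undated_text)
    weighted = [(jaccard(target, shingles(text)), year)
                for text, year in dated_charters]
    total = sum(weight for weight, _ in weighted)
    if total == 0:
        return None
    return sum(weight * year for weight, year in weighted) / total


# Invented reference charters with invented years, for illustration only.
dated = [
    ("sciant presentes et futuri quod ego", 1200),
    ("noverint universi per presentes quod ego", 1300),
]
estimate = estimate_year("sciant presentes et futuri quod ego dedi", dated)
print(round(estimate, 1))
```

The undated text shares almost all of its word pairs with the first reference charter, so the estimate lands close to that charter's year; with thousands of dated documents, the same principle yields a distribution of candidate dates rather than a single neighbour.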

In summary, one common thread in many talks of the conference was the importance of well-processed and sustainable data, which gains its value from being accessible and useful not only to individual projects but also to further research. Collaboration was a keyword of the conference. Many of the conference’s speakers had presented their plans to produce data at earlier editions of the series. Now this data is available and we can start to analyse it, which is an encouraging message to junior researchers: while some of the applied machine learning methods are not necessarily new and are well established in their respective computational research fields, their application to corpora of diplomatic sources is rather modern and results in new and exciting research questions for our traditional discipline. These methods allow us to climb on the shoulders of the data giants that have emerged in the last 15 years, with new ideas and unwavering enthusiasm.16

Conference overview:

Serena Ammirati (Rome) / Paolo Merialdo (Rome): Roman Mediaeval Documents Meet Machine Learning: A Great Opportunity for Research and Teaching

Jan Burgers (Amsterdam) / Rik Hoekstra (Amsterdam): The Digitale Charterbank Nederland (DCN): A Digital Portal to 200.000 Charters in Dutch Archives

Žarko Vujošević (Belgrade): Diplomatarium Serbicum Digitale. Eine Wiederaufnahme zwischen Datenbank und digitaler Edition

Miguel Calleja-Puerta (Oviedo): Modelling Notarial Charters for Diplomatic Analysis: The Experience of the NotFor Project

Pia Geißel (Wuppertal): Chancen und Möglichkeiten computergestützter stilometrischer Verfahren für Urkundentexte

Zsolt Hunyadi (Szeged): Charter-Calendars of the Angevin Period (1301–1387) of Hungary: Digital Images, Editions, Calendars, Toponyms, Seals

Filipa Roldão (Lisboa) / Joana Serafim (Lisboa) / João Paulo Silvestre (Lisboa): The Electronic Edition of Portuguese Municipal Charters (12th–15th Centuries): Challenges on Diplomatics and Digital Humanities

Juraj Šedivý (Bratislava) / Balázs Csiba (Bratislava): Klassische und digitale Editionen der mittelalterlichen Quellen in der Slowakei / Printed and Digital Editions of Mediaeval Sources in Slovakia

Steffen Krieb (Mainz): Distant Diplomatics Avant la Lettre. The Regesta Imperii as Data for Digital Diplomatics

Clemens Radl (Munich): Dealing with Diplomata: The Digital MGH and Beyond

Michael Gervers (Toronto) / Gelila Tilahun (Toronto): Patterns of Change. The DEEDS Database, Topic Modeling and Network Analysis in English Medieval Charters, and the Future of Digital Diplomatics

Jeroen Deploige (Ghent) / Marijke Beersmans (Ghent): Recent Developments in the Diplomata Belgica-Project. Case Study: NER Applied to a Corpus of Middle Dutch Charters

Amable Sablon du Corail (Paris): Des fonds d’archives aux corpus numériques: L’impact des nouvelles technologies sur les services d’archives en France

Christian Domenig (Klagenfurt): Quellen zur Geschichte des Alpen-Adria-Raumes: Eine digitale Urkundenedition auf Basis von MediaWiki

Mark Bell (London) / John Stell (Leeds): Qualitative Space in Digital Diplomatics

Stefan Bröhl (Karlsruhe): Charters of the Counts Palatine of the Rhine 1449–1508: A Project of Cooperative Description and Presentation in Archivportal-D.

Marlene Ernst (Passau) / Alina Ostrowski (Passau): Big Data Diplomacy. Information Extraction on Delegated Jurisdiction in the Iberian Peninsula of the 12th Century

Timo Korkiakangas (Helsinki): Distant Diplomatics: New Chances for “Distant Linguistics” on Mediaeval Charters?

Joanna Tucker (Glasgow) / John Davies (Glasgow): Digitising Scottish Charters, 1100–1250: Building and Using a Sustainable Digital Research Tool for Charter Research

Eveline Leclercq (Ghent): Chancery, Aristocracy and Chapter: The assessment of a Stylometric Approach in Studying the Cambrai Bishop’s Diplomatic Landscape (12th C.)

Adam Zapala (Warsaw): The Use of Digital Tools for the Processing of Late Mediaeval Papal Letters

Oliver Schallert (Munich) / Carsten Becker (Berlin) / Helmut Schmid (Munich): Areal Variation in Middle High German: Methodological and Quantitative Aspects

Francesco Leotta (Rome): Image Processing and Semantic Technologies in Digital Humanities: The NOTAE Experience

Julien Théry (Lyon): APOSCRIPTA. A Database and Text Corpus of Papal Charters: Towards a Unified Evolutive and Collaborative Research Tool

Peter A. Stokes (Paris): Looking Forward and Looking Back on Some Projects in Digital Diplomatics

Nicole Bergk-Pinto (Paris) / Jacqueline Schindler (Graz) / Niklas Tscherne (Graz): BeCoRe: Between Composition and Reception

Short papers:
Marguerite Dallas (Zurich): Documents et Analyses Linguistiques de la Galloromania Médiévale (GallRom): A New Potential for Interpretive Research

John McEwan (St. Louis): From Archives to Archaeology via Machine Learning: Exploring the History of Sealing Practices in the British Isles

Krisztina Arany (Budapest) / Bence Peterfi (Budapest): “In search of lost time”: The Databases of the Hungarian National Archives

Tobias Hodel (Bern): Combining Automatic Text Recognition with Digital Paleography: New Trajectories for Digital Diplomatics

Ádám Novák (Debrecen) / Sándor Ónadi (Debrecen): Database of Diplomatic Sources and 3D Scanning of Wax Seals – Observations

Posters:

Laura Bitterli (Zurich): Ad fontes: E-Learning Platform for the Handling of Archival Sources Goes Automatic Text Recognition (ATR)

Martina Bolom-Kotari (Hradec Králové): Digital Editions of Seals: Challenge for Sphragists, Archivists and IT Experts. Example of an Approach in Czech Archives

Simon Zsolt (Târgu Mureș): Digital Editions and Document Corpora of Mediaeval and Early Modern Transylvanian Sources

Serafina Filippelli (Bergamo): L’utilità delle immagini digitali per la diplomatica, in particolare per l’analisi e lo studio delle forme estrinseche, nel lavoro dal titolo: “Uno strumento per la fruizione dei documenti del Monastero di Astino nell’ Archivio di Stato di Bergamo. Regesto del libro dei censi dei diritti dell’Abate Silvestro De Benedictis” (1461–1464)

Giuseppe Consolo (Naples): The Digital Critical Edition of the Account Books. The Case of the Female Monastery of SS. Pietro and Sebastiano

Notes:
1 https://www.didip.hypotheses.org/conference-2022/digital-diplomatics-conference-series (09.05.2023).
2 https://www.monasterium.net/ (09.05.2023).
3 https://www.crossborderarchives.eu/ (09.05.2023).
4 https://www.pammap.sk/ (09.05.2023).
5 http://www.eruditio.hu/lectio/mokka-ms (09.05.2023).
6 http://www.modelsofauthority.ac.uk/ (09.05.2023).
7 http://www.poms.ac.uk/ (09.05.2023).
8 https://www.eleveltar.hu/ (09.05.2023).
9 https://www.hungaricana.hu/en/ (09.05.2023).
10 See the contribution by Aleksandrs Ivanovs / Aleksey Varfolomeyev, Some Approaches to the Semantic Publication of Charter Corpora. The Case of the Diplomatic Edition of Old Russian Charters, in: Digital Diplomatics. The Computer as a Tool for the Diplomatist?, Antonella Ambrosio et al. (eds.), Archiv für Diplomatik, Schriftgeschichte, Siegel- und Wappenkunde. Beiheft 14, Köln 2014, pp. 149–168.
11 http://www.notae-project.eu/further-info/project (09.05.2023).
12 https://www.cis.uni-muenchen.de/~schmid/tools/RNNTagger/ (09.05.2023).
13 https://lombardiabeniculturali.it/cdlm/ (09.05.2023).
14 http://www.digisig.org/ (09.05.2023).
15 https://www.deeds.library.utoronto.ca/ (09.05.2023).
16 You can find an extensive report at https://didip.hypotheses.org/ (09.05.2023).

Conf. Language(s)
English