Datafication in the Historical Humanities: Reconsidering Traditional Understandings of Sources and Data

Organizer
German Historical Institute Washington in collaboration with Luxembourg Centre for Contemporary and Digital History (C²DH), Chair of Digital History at Humboldt-Universität zu Berlin, Consortium Initiative NFDI4Memory, Roy Rosenzweig Center for History and New Media, and Stanford University, Department of History
Venue
GHI Washington
Funded by
German Research Foundation (DFG)
Postal code
20009
City
Washington DC
Country
United States
From - To
June 2, 2022 - June 4, 2022
By
Alma Bender, Friedrich-Meinecke-Institut, Freie Universität Berlin

Welcome to the Fifth Annual GHI Conference on Digital Humanities and Digital History “Datafication in the Historical Humanities: Reconsidering Traditional Understandings of Sources and Data,” from June 2 to 4, 2022.

The Fifth Annual GHI Conference on Digital Humanities and Digital History will revolve around the concept of “datafication,” that is, the production of and the shift toward digital representations of historical sources as a prerequisite for storage, access, and analysis, not to mention their transmission and publication online.

Historians outside the field of quantitative social history rarely consider their objects of study as “data,” even when they look at documents or paintings in digitized versions on their screens. These witnesses of human lives call for emotional, imaginative, and empathetic engagement and thus cannot be reduced to mere commodities to fuel a new kind of computational research, despite what the slogan “data is the new oil” might suggest. Sources, not data, we might thus insist, are at the heart of historical research. On the other hand, we readily observe that gathering, organizing, sorting, excluding, and searching for selected information from (digital) sources are routine processes of historical investigation. Data-centered research, seen from this angle, seems more a continuation with updated tools and technologies than a radical break from traditional methods of inquiry. Johanna Drucker has forcefully pointed out that we should reconceive all data as “capta,” taken and not simply given as the designation might imply. Data is therefore not a natural representation of something pre-existing, but created as part of a knowledge-production process open to investigation and critique. Adopting Christof Schöch’s working definition, data in the humanities can therefore be considered a digital, selectively constructed, machine-actionable abstraction representing some aspects of a given object of humanistic inquiry.

While we have seen a convergence in data modeling in text-oriented humanities (TEI), library science (FRBR), and for cultural heritage information (CIDOC CRM), no conceptual framework for modeling, curating, and managing data in historical research has gained wide adoption. The one possible exception comes from Wikidata, a project that has been conceptualized and populated with very little input from within our field. Ruth Mostern and Marieka Arksey argue that there are still no standards to emulate due to the small number of historical datasets currently available, and their heterogeneous nature. However, historical data repositories are “unlikely to realize their promise until the social life of data becomes part of the profession.” The current push by funders for National Research Data Infrastructures, such as NFDI in Germany, both adopts this idea of making data sharing a part of professional practice and calls for interdisciplinary research. Such activities are premised on the idea of the “social life of data,” the concept that research data and models designed and collected for very specific questions might become useful for a broader audience. The support for the re-use of both technical infrastructure and the models used for data collection will jumpstart their wider adoption.

The obstacles to such an undertaking are simultaneously conceptual, structural, and practical: modeling the entire range of historical investigation amounts to modeling the entire world, from the very beginning until now. This raises the question of whether these models are in principle culture-bound, which would rule out a global approach per se, and, in turn, to what extent a generic conceptualization can be found even within a single subgroup. Nevertheless, especially in the context of datafication processes, the question of data modeling is a crucial one, since it lays the groundwork for historical research for future generations. It is a time-consuming and cost-intensive process that needs to be well conceived and thought through. There is a great risk of creating path dependencies that later limit our ability to work with this data.

Historical research often takes a nonlinear or even meandering path through many phases of uncertainty and redefinition. Just like traditional source-based studies, a data-driven investigation will not usually start with a predefined set of sources and questions, but will extend and refine the scope, the structure, and the rules for data entry continuously as new questions arise and additional material is encountered. In addition, we notice a lack of tradition in collaborating in larger teams that include programmers, archivists, librarians and other information professionals. Therefore, humanist data often has quite irregular shapes and does not meet the expectations of a building block that can easily be incorporated into larger structures outside the context of its original research.

For the conference, we would like to focus on the still mostly manual, therefore labor-intensive, and intellectually challenging task of transforming sources and collections into comparatively small but highly rigorous “handcrafted” datasets. How are the archives for such projects defined, developed, and managed? How do we select primary sources, deal with collections, and create data models for their digital representations? With whom do we collaborate in this process? What logic and constraints shape the normalization of information when it is entered for comparison and analysis, and, just as important, what is discarded and how is absent or ambivalent data handled? What standards guide our datafication processes, which tools support us, and what is the right scale to use? At the same time, which explicit and implicit limitations do such decisions impose on us? How does datafication create new archives, as Vincent Brown argues, defined by the tools used to explore them and the design decisions made during their creation? What general design principles could we follow in the datafication of the historical sciences?

Program

Thursday, June 2, 2022, Conference Day I
08:30 – 09:30

Conference Registration (Reading Room)

09:30 – 10:00

Welcome, Simone Lässig, GHI Washington
(Lecture Hall/ Seminar Room/ Hybrid)

10:00 – 11:00

Keynote I: “Table for One: Anecdotes on the Cultures and Challenges of Data(fication) for Historians,” Zoe LeBlanc, University of Illinois, Urbana-Champaign, School of Informatics, USA (Lecture Hall/ Seminar Room/ Hybrid)
Chair: Zephyr Frank, Stanford University, USA

11:00 – 11:30

Coffee Break

11:30 – 01:00

Workshop Session A
I: “People, Datasets, and Slavery Studies: Enslaved.org,” Daryle Williams, University of California, Riverside, Walter Hawthorne, Michigan State University, Kristina Poznan, University of Maryland, Catherine Foley, Michigan State University, and Alicia Sheill, Michigan State University, USA (Lecture Hall/ Hybrid)
II: “DataScribe: Transcribing Structured Historical Data,” Jessica Otis, Greta Swain, and Megan Brett, George Mason University, USA (Seminar Room/ Hybrid)

01:00 – 02:30

Lunch Break (Foyer)

02:30 – 04:00

Panel I: “Merging Datasets from Different Archives” (Lecture Hall/ Seminar Room)
Chair: Jessica Otis, George Mason University, USA
- “Introducing EncycNet: Graph-based Modeling of 19th Century German Encyclopedic Knowledge,” Andreas Witt, University of Cologne & IDS Mannheim, and Thora Hagen and Fotis Jannidis, Würzburg University, Germany
- “From Primary Sources to Research Results: How Datafication Transforms Historical Research on the Han Empire,” Yunxin Li, Stanford University, USA
- “Creating Datasets from Disparate Digital Archives: 18th Century Colonial American Merchant Networks,” Jeremy Land, University of Helsinki, Finland and Werner Scheltjens, University of Bamberg, Germany

04:00 – 04:30

Coffee Break

04:30 – 05:30

Panel II: “How to Deal with Biased or Incomplete Data(sets)?” (Lecture Hall/ Seminar Room)
Chair: Meghan Ferriter, Library of Congress Labs, USA
- “Past Data of the Environment: How to Write a Data-Driven Environmental History of the 19th Century?” Martin Schmitt, Technical University Darmstadt, Germany
- “Mind the Gap! Graph-based Modelling of Incompleteness in Historical Sources Using the Example of Medieval Armorials on Murals,” Philipp Schneider, Humboldt-Universität zu Berlin, Germany

Friday, June 3, 2022, Conference Day II
10:00 – 11:00

Keynote II: “What’s in a Footnote? Datafication and the Consequences for Quality Control in Historical Scholarship,” Pim Huijnen, Utrecht University, Netherlands (Lecture Hall/ Seminar Room/ Hybrid)
Chair: Torsten Hiltmann, Humboldt-Universität zu Berlin, Germany

11:00 – 11:30

Coffee Break

11:30 – 01:00

Panel III: “Case Studies for Research Data Management in the Historical Humanities” (Lecture Hall/ Seminar Room)
Chair: Jennifer Serventi, National Endowment for the Humanities, USA
- “FAIRification of Research Data Made Easy with the Geovistory Toolbox and OntoME,” David Knecht, KleioLab GmbH, Switzerland, and Francesco Beretta, Université de Lyon, France
- “The Conflicting Digital Legacies of James Carnegie-Arbuthnott, Jacobite Sheriff of Forfarshire: Practicing Safe Datafication of Historical Personae in Prosopographical Databases,” Darren Layne, The Jacobite Database of 1745, USA
- “‘Two Days Later, our Unit was Moved from Athens to Corinth.’ The Datafication of the German Occupation of Greece during the Second World War and its Challenges,” Valentin Schneider, National Hellenic Research Foundation, Greece

01:00 – 02:30

Lunch Break

02:30 – 04:00

Panel IV: “Turning Analog Into Digital Data: Opportunities for Transregional Research” (Lecture Hall/ Seminar Room)
Chair: Katharina Hering, German Historical Institute, USA
- “Between Repression and War: Looking for Overlaps in Russia’s largest XX Century Prosopographical Databases,” Daniil Skorinkin, University of Potsdam, Germany
- “The Challenges of Building a Transnational Database for 18th and 19th Century Legal and Notarial Records,” Clemente Penna, Mecila – Maria Sibylla Merian Center Conviviality-Inequality in Latin America, Brazil
- “Datafication of a Historical Return Migration Movement within the ‘ReMigra’-project: Critical Reflections on Reusable Work Cycles for Small Scale Digitization Efforts,” Eva Pfanzelter, University of Innsbruck, Austria

04:00 – 04:30

Coffee Break

04:30 – 06:00

Panel V: “Research with the Public: Crowdsourced Datafication” (Lecture Hall/ Seminar Room)
Chair: Atiba Pertilla, German Historical Institute, USA
- “Development Trajectories of Female Employment over Time (19th/20th century). A Source-Critical Analysis of User-Generated Research Data,” Katrin Moeller and Georg Fertig, Martin-Luther-Universität Halle-Wittenberg, Germany
- “Out into the Crowd – Back into the Archives: Datafication and the Representation of Persecuted Individuals in one of the Largest Archival Collections on Nazi Persecution,” Katharina Menschick and Kim Dresel, Arolsen Archives-International Center on Nazi Persecution, Germany
- “TBD,” Abby Shelton, Library of Congress Labs, USA

07:00

Conference Dinner

Saturday, June 4, 2022, Conference Day III
09:00 – 10:30

Virtual Poster Session
Chair: Emily Kühbauch, German Historical Institute, USA
(1) “Mapping Mobility in Slavery and Freedom: The Ethical Issues and Methodological Process of Datafying Slavery’s Archive,” Laura Brannan, George Mason University, USA
(2) “Correlates of State Formation in Central Europe – A Relational Spatio-Temporal Database on Longue Durée State Formation Processes from the End of the Thirty Years War to the End of the German Wars of Unification (1648–1871),” Heiko Brendel, University of Passau, Germany
(3) “Hero? Volunteer? Traitor? Depends on the Data: Datafication in Wartime and Post-War Sources in Luxembourg,” Nina Janz, University of Luxembourg, Luxembourg
(4) “Excavating the Newspaper Navigator Dataset,” Benjamin Lee, University of Washington, USA
(5) “Machines Reading Maps: Finding and Understanding Text on Maps,” Katherine McDonough, The Alan Turing Institute, UK
(6) “Datafication as the Basis for Quantitative Analyses of Historical Sources? Stylometric Text Analysis in the Historical Sciences,” Jan Rohden, German Research Foundation, Germany, and Jörg Hörnschemeyer, German Historical Institute Rome, Italy
(7) “The Slaughterhouse of Science: Turning Scientific Leftovers into Historical Data,” Alina Volynskaya, EPFL, Switzerland

10:30 – 11:00

Virtual Coffee Break

11:00 – 12:30

Workshop Session B
I: “Beyond 2022,” Sarah Hendriks, Trinity College Dublin, The National Archives, Ireland (Lecture Hall/ Hybrid)
II: “Introduction to HistText – an Integrated Framework for the Datafication of Massive, Multilingual Digital Corpora,” Cécile Armand and Christian Henriot, Aix Marseille University, France (Seminar Room/ Hybrid)
III: “From Text to Data – Digital Methods with Nopaque,” Laura Niewöhner and Patrick Jentsch, University of Bielefeld, Germany (Reading Room/ Hybrid)

12:30 – 01:30

Lunch Break (F)

01:30 – 03:00

Panel VI: “Methodologies of Datafication” (Lecture Hall/ Seminar Room)
Chair: Daniel Burckhardt, German Historical Institute, USA
- “Quo Vadis, Cartographic Source Edition? The Struggle for Datafication in the European Historic Towns Atlas,” Daniel Stracke, Institut für vergl. Städtegeschichte, Germany
- “On These Grounds: Designing an Event Ontology to Describe the Lived Experiences of Enslaved People Who Labored for Colleges and Universities,” Sharon Leon, Michigan State University, USA
- “Datafication & Graphing Using the Historical Records of Post-WWII War Crimes Trials in Asia and the Pacific: The Case of the War Crimes Documentation Initiative (WCDI) Lab at the University of Hawaiʻi at Mānoa,” Peter Bushell, University of Hawaiʻi at Mānoa, USA

03:00 – 03:30

Coffee Break

03:30 – 05:00

Panel VII: “How to Create Sustainable Digital Projects” (Lecture Hall/ Seminar Room)
Chair: Elizabeth Murice Alexander, Maryland Institute for Technology in the Humanities (MITH), USA
- “Transforming Information into Data: A Case Study of Reusing a Rich Data Source for a New Research Project,” Vanessa Hannesschlaeger, German Literature Archive Marbach, Germany
- “Data Disillusionment: Confessions of a Project Leader,” Joëlle Weis, University of Trier, Germany
- “Digital History Advanced Research Projects Accelerator (DHARPA) – Software Demonstration: Data Creation, Analysis, and Critical Reflection,” Helena Jaskov, University of Luxembourg, Luxembourg

05:00

Conference Closing, Andreas Fickers, Luxembourg Centre for Contemporary and Digital History (C²DH), Luxembourg

https://datafication.hypotheses.org/
Language(s) of the event
English