Digital Humanities in the Netherlands

Joris van Zundert, Huygens Institute for the History of the Netherlands, Royal Netherlands Academy of Arts and Sciences; Karina van Dalen-Oskam, Huygens Institute for the History of the Netherlands, Royal Netherlands Academy of Arts and Sciences / University of Amsterdam

Early Beginnings — From alfa-informatica to Digital Humanities
As the personal computer began to make its way into Dutch humanities faculties and into scholarly practice during the 1980s, those faculties were willing to host computing-oriented departments to help humanists cope with the information problems they had. These problems often did not pertain to very fundamental aspects of humanities and computation. Most historians and literary scholars arguably only came in contact with digital humanities departments such as “Computer & Letteren” – as it was called at Utrecht University at the time – when a floppy disk went bad or when a file got lost. Telltale of the relatively immature state of the field, these departments went under various guises, such as “Computers & Humanities”, “Historical Information Science”, and “Alpha Informatics”. No common denominator had been established yet. Unable to establish and expound digital humanities in a form that was recognizable as a scientific discipline[1], these centers came under threat with the introduction of easy-to-use Windows-based software.[2] Misunderstanding the centers as a support and teaching service only, management assumed that they had outlived their purpose once basic computer literacy no longer presented itself as an urgent problem. Consequently, university cutbacks erased the centers representing the first wave of digital humanities in the Netherlands rapidly and almost completely. The story of the near demise of alfa-informatica – the then common name for the activities within ‘computer and humanities’ centers – is illustrative of the relation between computing and the humanities ever since. Because the history of digital humanities in the Netherlands is still also a history of much misconception and misunderstanding on the computational as well as the humanities side, digital humanities is very much an interdisciplinary work in progress.

No history of digital humanities in the Netherlands can be complete without mentioning the Dutch Historical Data Archive. The founding of this archive in Leiden in 1989 was inspired by the increased use and output of digital data and the anticipated problems of their digital sustainability.[3] The vulnerability of digital data was not just an archival problem, though, but also a problem of scientific accountability and reproducibility. Closely modeled on the Arts and Humanities Data Service in the UK[4], the historical data archive set itself the task of collecting and archiving digital humanities research data and – more importantly – of raising awareness of the digital aspects of historical research. Instrumental in raising that awareness was a book co-authored by a group of historians with various methodological backgrounds under the title “Historische Informatiekunde. Inleiding tot het Gebruik van de Computer bij Historische Studies”.[5] Although it is now sold out, the publisher still lists this book[6], somewhat ironically with the warning that it was printed in 1992 and, given the volatility of the field, can only be used for historical purposes. This is certainly true for almost any technological description in the book. One of the chapters, titled “Anatomy of the Computer”, outdated as it may be, indicates the level of education and explanation that was needed at the time to familiarize most humanities researchers with the new technology. The principles put forward in other chapters, however – like the one by Hans Voorbij on the computational handling and analysis of text – seem far less outdated, even hinting for instance at XML tagging several years avant la lettre.

Avant-garde: Computational Linguistics, Stylometry, Stemmatology
Digital humanities in the Netherlands is indebted to computational linguistics. Linguistics had a head start in the formalization, quantification, and application of probabilistic models that are a close fit to calculation and computing.[7] Many humanities fields like history, textual scholarship, and art history, but also adjoining fields such as sociology, have maintained to a large degree an interpretative and decidedly narrative approach to observation and reasoning. Linguistics may have taken to quantified approaches earlier because of its intrinsic interest in the structure of language, where other humanities fields tended to focus on interpretation of semantics. Because the fields share much of the same ‘base layer’ of data (i.e. text), it seemed more than appropriate to fit the tools of linguistics to problems of other humanities fields. Currently this dynamic is probably most prominently visible in stylometry. Distinct traces of these linguistic approaches merging with historical approaches are clearly visible in the work of, for instance, Margit Rem, currently assistant professor of historical linguistics at Radboud University in Nijmegen.[8] Together with mathematician and now professor emeritus Evert Wattel from the Free University in Amsterdam she developed a statistics-based computational approach to the problem of localizing 14th century dialectal variants in Middle Dutch charters.[9] The interaction between philology, or the historical study of text, and linguistic and statistical approaches reaches back to earlier days, though. Eep Talstra, who recently retired as professor at the Free University, applied computational linguistic approaches to the problems of tradition and interpretation of the Old Testament as early as the 1980s.
The intimate relationship between linguistics and digital humanities is also put forward by Groningen University professor John Nerbonne in a 2005 publication in “Literary and Linguistic Computing”: “Humanities Computing is focused on computational linguistics, which is a well-established interdisciplinary field with a strong tradition of serious computational work on language”.[10] Nerbonne offers a concise overview of the considerable contribution this approach at Groningen University has made to the humanities, which includes inter alia historic sea trade analysis and the georeferencing of historic city maps. He opposes the view of digital humanities as a field in its own right and talks instead predominantly of the service role that linguistic and other types of computing provide to the humanities. A similar stance can be inferred from the work of professor Antal van den Bosch, who started his scientific career at Tilburg University, moved to Radboud University, and is now also active as an eScience Integrator at the eScience center that fosters multi-disciplinary data-intensive research and “breaks down the barriers between traditional disciplines and ICT technologies”.[11]

Less linguistically informed but rather adhering to methodologies from both philology and bio-informatics is the domain of stemmatology. Stemmatology – or phylogenetics – concerns itself with the problem of the genealogy of variant versions of manuscripts and printed books. Ultimately the method derives from the insights of Karl Lachmann on the information that differences between text witnesses offer as to their lineage. An early attempt to apply computation to the problem of establishing the genealogy of a text tradition in the Netherlands was undertaken by Maaike Mulder together with romanist Anthonij Dees and computational linguist Marcel Dekker.[12] The publication led to agitation in the field because the computational approach was portrayed as being superior to the conventional ways of drawing up a stemma. Because her results contradicted those of her PhD supervisor Anton van Duinhoven, Mulder was even accused of ‘patricide’.[13] Notwithstanding this slight turmoil, stemmatology – apart from and next to linguistics – has since been the field within Dutch digital humanities most closely related to computational approaches and analysis. Arguably the work of for instance Ben Salemans, Pieter van Reenen, Margot van Mulken, and – again – Evert Wattel has contributed significantly to the status of this field as one of advanced humanities computing even in international perspective.[14] Rens Bod in a recent history of the humanities classifies stemmatic philology as “possibly the most successful humanistic discipline”; stemmatology “provides a precise method with which texts from all periods and regions can be reconstructed. Stemmatic philology is to the humanities what classical mechanics is to natural science […] a showpiece for the field as a whole.” Yet, Bod concludes, stemmatology is not widely held in high regard in the Dutch research arena, and leads a rather rudimentary existence.[15]
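
As a rough illustration of how differences between text witnesses can be turned into data about their lineage, the following minimal Python sketch counts the variant locations at which each pair of witnesses disagrees. It is a deliberate simplification and not the method of any of the researchers named above: the Lachmannian tradition weighs shared significant errors rather than raw disagreement counts, and the witness sigla and readings here are invented for the example.

```python
from itertools import combinations


def disagreement_counts(witnesses):
    """Count, for every pair of witnesses, the aligned variant
    locations at which their readings differ.

    `witnesses` maps a siglum to a list of readings, one reading per
    variant location; all lists are assumed to be aligned already.
    """
    distances = {}
    for (a, readings_a), (b, readings_b) in combinations(sorted(witnesses.items()), 2):
        distances[(a, b)] = sum(
            1 for x, y in zip(readings_a, readings_b) if x != y
        )
    return distances


# Hypothetical witnesses with four aligned variant locations each.
witnesses = {
    "A": ["king", "rode", "forth", "swiftly"],
    "B": ["king", "rode", "out", "swiftly"],
    "C": ["knight", "rode", "out", "quickly"],
}

distances = disagreement_counts(witnesses)
for pair, count in sorted(distances.items()):
    print(pair, count)
```

A table of pairwise counts like this is the kind of input that tree-building procedures borrowed from bio-informatics can work with; judging which variants are genealogically significant remains a philological task.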

Changing Institutional Landscapes
Though not conceived under a lucky star, the establishing of the Netherlands Institute for Scientific Information Services (NIWI)[16] in 1997 under the aegis of the Royal Netherlands Academy of Arts and Sciences signified another considerable momentum developing for digital humanities in the Netherlands. The institute was conceived as a merger between several small institutes and departments that individually were related to computer supported research work. In hindsight the rationale for the merger seems weak. All departments provided some service on scientific information indeed, but the departments were rather heterogeneous as to individual scope, domain, and organization. As a consequence the merger was short lived and the various departments were eventually housed in different institutes again. Yet, in that short space of time some interesting developments took place. The Dutch Historical Data Archive had been part of the merger and its expertise in digitization, software development, and computational analysis spread as a very beneficial digital impulse to several of the other departments. Thus the merger did create part of the groundwork of a Dutch digital humanities impetus that was to spread with the remnants of the NIWI. Next to that the NIWI was home to a small but very active research group called NERDI (Networked Research and Digital Information) focusing on knowledge production in new digital networks. NERDI was the forerunner of what was to become the Virtual Knowledge Studio, which for several years would be a major player in Dutch eHumanities research.[17]

After the dismantling of the NIWI most of what had previously been the Dutch Historical Data Archive was expanded into the now national digital research data service DANS (Data Archiving and Networked Services)[18], which continues NIWI’s mission. DANS aims to be a general digital research data archiving service rather than a humanities only digital archive, but it maintains strong ties to its humanities legacy. DANS played a major role for instance in the creation of the European digital humanities initiative DARIAH.[19] DANS also makes major contributions to the development of digital humanities research infrastructure in the Netherlands by providing essential services such as the EDNA Dutch archeological e-depot and others.[20] Likewise, DANS is a project partner in several important humanities research projects such as the “War in Parliament” project on political references to World War II in Dutch politics.[21]

The “War in Parliament” project in itself links to the impulse that CLARIN added to digital humanities in the Netherlands.[22] Originally conceived as a project aimed at establishing a common digital research infrastructure for linguistic research and resources, CLARIN extended its aim to explicitly include funding humanities research projects that are not primarily linguistics-oriented but that apply tools to language data. Through this CLARIN has been pivotal in subsidizing a number of important digital humanities research projects.[23] The Netherlands has known several funding instruments of this kind that have been beneficial to furthering digital humanities infrastructure, resources, digitization, and research. Most notably, the program known as Continuous Access To Cultural Heritage (CATCH)[24] of the Netherlands Organisation for Scientific Research (NWO) has resulted in many initiatives to digitally disclose cultural heritage collections. The Collaborative Organisation for ICT in Dutch Higher Education and Research (SURF) has also had its share in this. Paramount as well in this respect is the effort of national institutions such as the Royal Library.[25] Such effort resulted in the large scale digitization and publication of major cultural resources such as the “Memory of the Netherlands”[26] and recently Delpher[27], offering access to over 90,000 books and 1 million newspapers in full text.
Important from a humanities research point of view is that the National Library decided to adopt guardianship over the digital collection that has been created since 1999 by the Digital Library for Dutch Literature (DBNL – Digitale Bibliotheek voor de Nederlandse Letteren).[28] Likewise an institution such as the National Archive is putting effort into making its collections digitally available for research.[29] This effort in itself links again to digital humanities and artificial intelligence research: because many of the written documents in the archive cannot be rendered machine readable by conventional OCR methods, there is much interest in innovative means of recognizing the text of such documents. The Monk project by professor Lambert Schomaker and others[30] aims to offer such innovative methods to disclose primary humanities research sources.[31] To continue the digitization and infrastructural momentum that resulted from all such institutional impulses, a proposal is currently under consideration that integrates the Dutch CLARIN and DARIAH initiatives under the name of CLARIAH.

The Role of the Huygens Institute for the History of the Netherlands
At the moment of writing it is still unclear whether or not this proposal will be successful. If so, project management will reside with the Huygens Institute for the History of the Netherlands, an institute that has become more visible in Dutch digital humanities in the last decade. Around 2005 the Constantijn Huygens Institute[32] – as its name was at the time – developed plans towards integrating a digital workflow into its publishing next to its conventional activities. The focus of the institute until then was on printed scholarly editions and other high-end scholarly resources for humanities research, mainly in the domains of intellectual history and literature. The institute, however, lacked sufficient IT skills and digital humanities research experience to effectively create a digital variant of its workflow. Given the approaching termination of the NIWI it seemed opportune to add the literary research department of that institute to the Constantijn Huygens Institute. This resulted in the Bibliography for Dutch Literature and Linguistics being carried over to the Huygens Institute, as well as the staff and scientific personnel of the NIWI department of Dutch Literary Research, which had acquired digital humanities research and development skills during its time at NIWI. This meant that we, the authors of this text, Karina van Dalen-Oskam and Joris van Zundert, also joined the institute. At the NIWI we had been developing a virtual research environment for the creation of humanities and social science digital data, called eLaborate.[33] The tool offered inter alia a transcription environment for manuscripts that cannot be processed by OCR. In essence this meant an out-of-the-box digital workflow for digital editions, rough around the edges but getting the job done. eLaborate has been part of the digital infrastructure of the Huygens Institute ever since and has seen several life cycles of development.
The aim is that at some point eLaborate will be part of the digital backbone for publishing digital scholarly editions at the institute.

Two decisions have been paramount to the relative success we had in developing further digital humanities initiatives at the Huygens Institute. The first was the liberty and ample support we were given to pull together a skilled IT research and development team to foster digital products. The other was that Karina was given similar freedom in developing a digital humanities research program. In the process we were able to attract one of the most highly skilled developers we had worked with during the NIWI days. Together we proposed that the development strategy for the growing R&D group would be based on Agile principles.[34] The rationale for Agile is sensible in a research organization: it is well adapted to quickly changing and unclear functional requirements, which is exactly the type of requirements one tends to find in learning organizations like research institutes.[35] The clear downside of Agile is that it eschews the project documentation used to plan, control, and report in standard project management. This is particularly unsettling to management, which tends to feel left without control or any inkling of progress. Over time a methodological mixture of Agile methodology and standard project management reporting was developed that now caters to the needs of both worlds. We dwell a little on this point here because in our view it is pivotal that researchers are drawn in on all relevant matters and decisions in developing the digital tools and their functions. Agile methods ensure that developers and researchers work very closely together, almost on a daily basis. And although this has caused some serious tensions over time, it did allow both ends of the equation to get to know each other’s style, skills, and quirks. The result is that within the current institute there is a sincere methodological dialogue between software development and computer science on the one hand and humanities research on the other hand.
This is not a ‘happy blue sky world’, admittedly, but we seem to have escaped at least the client vs. contractor pattern between humanist and computer scientist and benefit from – in the best cases – an intrinsic two-way research collaboration.

On the research side our first efforts were aimed at the application of authorship attribution techniques involving Burrows’ Delta.[36] We applied this now common procedure to a case of double authorship in a medieval Arthurian novel. We had to adapt the standard corpus based technique to work with a single text. The resulting approach seems to have appealed to the international stylometry community and was deemed pioneering certainly in the Dutch humanities research realm.[37]
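
For readers unfamiliar with the measure, the following minimal Python sketch shows the standard corpus-based form of Burrows’ Delta: relative frequencies of the most frequent words are converted to z-scores across the corpus, and the Delta between two texts is the mean absolute difference of those z-scores. It is not our single-text adaptation. The toy texts, the function names, and the use of the population standard deviation are assumptions made for this illustration only.

```python
from collections import Counter
from statistics import mean, pstdev


def most_frequent_words(texts, n):
    """Return the n most frequent words over all texts combined."""
    totals = Counter()
    for tokens in texts.values():
        totals.update(tokens)
    return [word for word, _ in totals.most_common(n)]


def z_scores(texts, vocabulary):
    """Map each text to the z-scores of its relative word frequencies."""
    freqs = {}
    for name, tokens in texts.items():
        counts = Counter(tokens)
        freqs[name] = [counts[w] / len(tokens) for w in vocabulary]
    columns = list(zip(*freqs.values()))
    means = [mean(col) for col in columns]
    # Assumption: population standard deviation over the corpus.
    stdevs = [pstdev(col) for col in columns]
    return {
        name: [(f - m) / s if s > 0 else 0.0
               for f, m, s in zip(row, means, stdevs)]
        for name, row in freqs.items()
    }


def delta(z_a, z_b):
    """Burrows' Delta: mean absolute difference of z-scores."""
    return mean(abs(a - b) for a, b in zip(z_a, z_b))


# Hypothetical toy corpus; real studies use long texts and
# typically the 100 or more most frequent words.
texts = {
    "a1": "the the the and of the and".split(),
    "a2": "the the and the of the the".split(),
    "b1": "of of and of the of and".split(),
}
vocabulary = most_frequent_words(texts, 3)
z = z_scores(texts, vocabulary)
print(delta(z["a1"], z["a2"]), delta(z["a1"], z["b1"]))
```

A lower Delta indicates more similar usage of the most frequent words, which is taken as a cue for shared authorship. In this toy corpus "a1" comes out closer to "a2" than to "b1".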

These developments gave the Huygens Institute a first toehold of digital humanities in the domain of historic literary research and textual scholarship in the Netherlands. Since that time the Huygens Institute has been able to integrate various researchers and projects focused on valuable humanities content and using advanced computational or digital methodology. Peter Boot for instance led the development of the digital edition of Vincent van Gogh’s correspondence[38], not neglecting to tend to his own contribution to international digital humanities theory.[39] Ronald Dekker, together with international kindred spirits in the realm of the Interedition project[40], developed CollateX, which provides a high-quality text collation engine used in several projects internationally.[41] Similarly strong in computational approach is the network analysis underlying the project “Circulation of Knowledge and Learned Practices in the 17th century Dutch Republic”[42], which resulted in several tools for visualizing correspondences.[43] The momentum thus created in the Huygens Institute is now also visible in its rising number of professorships at Dutch universities. In particular the chairs of Charles van den Heuvel (professor of digital methodology and history)[44] and Karina van Dalen-Oskam (professor of computational literary studies)[45] focus on digital humanities and computational methods in the humanities.

Current Situation
These recently established chairs – just as the chair in digital and computational humanities that Rens Bod accepted in 2013 – show the current interest in the Netherlands to make digital humanities part of the curriculum at university level. The involvement of digital humanities in the curriculum is still modest however. There are no formal master level programs for digital humanities, but a minor at Utrecht University[46] started in 2013 and there is a new minor being developed as a cooperation between the Free University Amsterdam and the University of Amsterdam.[47] Professors like Van Dalen-Oskam, Bod, and Van den Bosch try to connect to and integrate with existing university master and PhD training in for instance linguistics, literary studies, and media studies. Such work is also continued at the Groningen (Alfa-informatica)[48] and Utrecht (Digital Humanities Lab)[49] universities. It seems therefore justifiable to say that currently digital humanities in the Netherlands engages mostly at the research level but is slowly making its way into some of the main curricula.

At the research level we found that the Huygens Institute is now a significant player, but obviously it is not the only place where digital humanities work is happening. The Meertens Institute should certainly also be mentioned.[50] The research at the Meertens Institute veers slightly more to the linguistics side of the spectrum, but many of the projects at the institute contain digital elements or are based on digital methodology. It is home for instance to the large scale NederLab project, which develops a digital infrastructure that aims to offer digital access to all Dutch texts between 800 AD and the present.[51] A more reflective stance on digital humanities has been taken by the eHumanities Group of the Royal Academy[52], a follow-up to the Virtual Knowledge Studio program that was terminated in 2010. The eHumanities Group investigates the relation between digital technology and humanities and the social sciences.[53]

Digital humanities in the Netherlands has not been formally delineated. That, combined with its intrinsic interdisciplinary nature, makes it hard to tell where one should draw the line for counting certain work into the digital humanities category. For instance much of the work of Piek Vossen – as a Spinoza laureate granted the highest Dutch scientific award – at the Free University relates closely to digital humanities.[54] But he would probably categorize himself more in the realms of computational lexicography and computational cognition or artificial intelligence. Yet projects such as Mapping Notes and Nodes clearly classify as digital humanities.[55]

The work of Piek Vossen relates to another important initiative in the Netherlands, the Centre for Digital Humanities.[56] This is a collaboration between the University of Amsterdam, the Free University of Amsterdam, the Royal Academy of Arts and Sciences, and the eScience Center to foster digital humanities research. This center facilitates so-called embedded research projects, in which research questions from the humanities are approached by using techniques and concepts from the field of digital humanities. For the Royal Academy itself it is one activity in a series over the past decade to create an additional impulse for digital humanities and computational approaches in the humanities. The Academy has in the past strategically subsidized a number of humanities research projects containing some form of digital or computational component, most notably Alfalab[57], a prelude to the far larger ambition of the Academy to create a humanities center in Amsterdam comprising a number of its humanities institutes and intended to focus on digital and computational methods next to its ongoing conventional humanities research.[58] Through its current Computational Humanities program, which is monitored by the eHumanities Group, the Academy is funding four large projects on literary quality, elite networks, motif analysis in text and music, and networked census data.

What, if anything at all, can be derived from the above as to general trends and conclusions? First of all we feel the need to stress how much we have left out to be able to construct a somewhat concise and coherent story. We have completely skipped over such fields as musicology and art history for instance. For every colleague in the field we named we would be able to name three we did not mention and whose work is invaluable to the overall endeavor.[59] In a way this points to a characteristic of digital humanities in the Netherlands – and probably in many other countries. Digital humanities is a highly inter- and multidisciplinary field. Many contributors to the field might not even count themselves as being in the field of digital humanities proper, but only related to it. Similar also to the situation in other countries, digital humanities practices are found on many levels in many institutions. The digitization efforts in institutions like the National Library and in other places to remediate humanities resources in a digital environment are of great importance to the field. However, these activities normally are not seen as research proper, but as a research support service at best. Yet obviously, much knowledge on digital humanities practices is created in those contexts.
Neither is the work of building infrastructure generally regarded as being on a research level – as Geoffrey Rockwell stated: “Research infrastructure is not research just as roads are not economic activity”.[60] Even though there is much fundamental knowledge that must be gained about the properties of humanities information and knowledge through virtual infrastructure endeavors, researchers should be highly aware of the inherent pitfalls and distracting dangers involved with building and maintaining infrastructures.[61] Only recently the Royal Academy has proposed that tool building and the development of algorithms as well as the creation of datasets that are clearly underpinning research may serve as indicators of scientific output. Given that digital humanities is multi-disciplinary and very much grounded also in digital practices, this is an encouraging and important signal. As practitioners we often felt that we had to fight a battle on two fronts at least. Not only did we need to attend to the time-consuming work of developing tools to bootstrap our research – but since we could not count that in any way as scientific activity we had to produce twice the number of articles to make up for that investment. It seems that in this respect slowly but surely we are benefiting from the general trend of remediation of research in the digital realm, which creates a situation where it is more normal and scientifically accountable to be engaged in digital activities also in the humanities.

In any case, this overview shows that digital humanities in the Netherlands indeed has gained mass and momentum over the last decade. A number of special issues of renowned Dutch humanities journals[62] can also be taken as an indicator of that momentum. It is still early to judge, but it seems reasonable to assume that digital methodology in the humanities is around to stay. That, however, is not to state that digital humanities as a discipline proper is here to stay. This is one of the main unresolved issues surrounding digital humanities: is it a field in its own right or does it signify a change or expansion of the methodological paradigm? For the moment we lean toward the latter position. Sure enough there will be a generation, maybe even two, of researchers that will deem digital humanities as their native field. But the digital is not something that stands next to what we know and who we are. Rather the digital permeates all aspects of humanistic culture and behavior. This calls for a very deep reorientation on what humanities as a field is, or what its role is in a ubiquitously digital society.[63] As such, eventually the digital will also find its seamless integration in the methodology of humanities proper.

What form this integration takes is currently very much a topic of debate, most certainly so in the Netherlands. In his inaugural address Rens Bod has proposed – in accordance with the theme of his 2013 monograph – that digital humanities is effectively a remarrying of humanities methodology with scientific method to form a successor to ‘Humanities 1.0’.[64] Bod proposes an empirics of pattern searching in big data that should underpin any inference in humanities research. His views certainly have not met with universal acclaim.[65] We also for the moment are not convinced that such a ‘naive empiricism’[66] would be appropriate for a field that has a strong hermeneutic tenet reaching back several centuries. Indeed we expect humanities methodology to experience dramatic changes towards the digital in the next decade. However, these changes shall in all likelihood be negotiated in a deep dialogue between humanities and computer science[67] and not by a simple overriding of humanities methodology by the quantitative empirics of the so-called hard sciences. Rather we expect that we will see a remediation of hermeneutic methods through ‘next gen’ computer logic and languages as a new generation of humanities researchers will embrace coding as a native form of expression for their research. The days when contact between humanities researchers and digital humanists was limited to cases of floppy discs gone bad are long over. Indeed, we expect the distinction between the two to become increasingly blurred and, eventually, meaningless.

