Franziska Decker, Zentrum für Informationsmodellierung, Karl-Franzens-Universität Graz
Preliminary program:
Day 1: Introduction to NLP and Text Analysis
- Python programming course for novice participants (technical)
- Introduction to digital humanities projects and resources using NLP (DH scholar presentation)
- What is NLP? A taxonomy of NLP methods. Why is it relevant to medieval studies? (Overview of NLP tasks and applications) (technical)
- Practical Exercises: Setup (Jupyter, Colab, packages) (technical)
- Evening lecture: Jean-Baptiste Camps and reception
Day 2: Topic Modeling
- Topic modeling: Discovering hidden themes and topics in large collections of texts (DH scholar presentation)
- Different levels of text analysis and representation methods (tokenization, stemming, lemmatization, TF-IDF, embeddings) (technical)
- Hands-on session: Applying topic modeling to a corpus of medieval charters
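
As a taste of the representation methods listed for Day 2, TF-IDF weighting can be sketched in a few lines of plain Python. This is a minimal illustration, not workshop material: the function name `tf_idf`, the whitespace tokenization, and the toy sentences are all assumptions made for the example.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumes naive lowercase whitespace tokenization; real medieval
    texts would need proper tokenization and lemmatization first.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents each term occurs in.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            # Relative term frequency times log inverse document frequency.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Invented toy sentences (hypothetical, not taken from real charters).
docs = [
    "the abbot grants the mill",
    "the bishop confirms the grant",
    "the abbot sells the vineyard",
]
scores = tf_idf(docs)
```

A term that occurs in every document, such as "the", receives weight zero, while a term unique to one document, such as "mill", is weighted highest within that document.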
Day 3: Named Entity Recognition
- Applications of NER in genealogical research, prosopography, and historical event analysis (DH scholar presentation)
- Named entity recognition (NER): Identifying and classifying people, places, organizations, and dates in text (technical)
- Hands-on session: Building a simple NER system and exploring relationship extraction (technical)
Day 4: Text Reuse and Authorship Analysis
- Exploring how authors often borrowed, adapted, and transformed existing texts, creating intertextual networks (DH scholar presentation)
- Methods for identifying text reuse, such as stylometry and plagiarism detection (word usage patterns, sentence structure, n-grams, similarity measures) (technical)
- Hands-on session: Using stylometric tools to compare different medieval texts and assess the possibility of authorship attribution (technical)
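
The n-gram and similarity measures mentioned for Day 4 can be illustrated with a short sketch: two passages are compared by the overlap of their character trigrams (Jaccard similarity). The function names and the Latin sample strings are assumptions chosen for illustration; this is one simple measure among many, not the method used in the workshop.

```python
def char_ngrams(text, n=3):
    """Return the set of character n-grams of a text.

    Lowercases and collapses whitespace first, so layout differences
    do not affect the comparison.
    """
    normalized = " ".join(text.lower().split())
    return {normalized[i:i + n] for i in range(len(normalized) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def ngram_similarity(x, y, n=3):
    """Similarity of two texts based on shared character n-grams."""
    return jaccard(char_ngrams(x, n), char_ngrams(y, n))


# Illustrative sample strings (assumed for the example).
source = "in nomine sancte et individue trinitatis"
variant = "in nomine sanctae et individuae trinitatis"
unrelated = "notum sit omnibus presentibus et futuris"

reuse_score = ngram_similarity(source, variant)
baseline_score = ngram_similarity(source, unrelated)
```

Character n-grams tolerate the spelling variation typical of medieval sources better than whole-word matching does, which is why the spelling variant scores well above the unrelated formula here.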
Day 5: Future Directions (LLMs)
- Introduction to advanced methods and their NLP applications (DH scholar presentation)
- Challenges of using NLP in historical research
- Presentation of the participants’ projects (Poster session)
- Open discussion: Exploring potential applications of NLP in participants’ specific research projects