In the past few decades, humanities scholars have moved from data scarcity to a data deluge. The increasing volume of research data available in digital form has had two major consequences. First, it has enabled researchers to process, visualize and combine research data with robust, verifiable and reproducible quantitative and automated methods originally developed in data science and natural language processing. Second, these interdisciplinary approaches have enabled scholars to conduct humanities research on a much larger scale, sparking entirely novel research questions that fill gaps, offer new insights or even transform humanities research.
However, before these interdisciplinary approaches can be employed successfully, the available resources, tools and research techniques need to be well understood. That is the aim of this training module. It demonstrates data formatting, corpus encoding and corpus annotation for different types of specialized discourse, and it showcases corpus querying, visualization and interpretation, showing how research questions from a broad range of humanities disciplines can be tackled in the same dataset. Close attention is paid to distinguishing the research data and metadata harvested from the source from the enrichments obtained with DH tools.
The module also raises awareness of the specific societal, institutional and technological circumstances that shape the content, structure and language of a particular type of specialized discourse, and it highlights the importance, potential, peculiarities and implications of these factors for research in different DH disciplines. Finally, it presents the support that Research Infrastructures offer to scholars interested in this type of research: the available resources, tools and case studies that help novice and experienced researchers alike to identify relevant research questions and to understand the main quantitative methods used to study specialized discourse across a broad range of DH disciplines.
Jump straight to the module sections:
- Collections of Parliamentary Records (written by CLARIN-ERIC)
- Collections of Computer-Mediated Communication (written by CLARIN-ERIC)
- Collections of Digitised Newspapers as Historical Resources (written by Impresso)
- Digital Humanities and Heritage Research Infrastructures (written by E-RIHS)
- #dariahteach (Courses on Digital Arts and Humanities)
- Videolectures on Digital Humanities
- OpenMethods (Metablog Highlighting Digital Humanities Methods and Tools)
- PARTHENOS SSK (A Collection of Research Use Case Scenarios Illustrating Best Practices in Digital Humanities and Heritage Research)