Boosting Digital Humanities research with parliamentary data

by Ulrike Wuttke

By the end of this section, you will be able to…

  • Identify possible research questions on parliamentary discourse from different DH disciplines
  • Understand the main quantitative research methods used to study parliamentary discourse in DH

The unprecedented availability of large amounts of parliamentary records in digital form has opened up new opportunities for political scientists, sociologists, and historians. They have increasingly adopted methods from natural language processing and data science, such as text mining, social network analysis, geospatial analysis, and data visualization, to structure, search, mine, manipulate, visualize, share, and combine parliamentary data. These approaches enable scholars to apply the interpretative traditions of the humanities and social sciences to data on a very large scale. They also allow them to address new research questions and develop novel techniques for tackling complex social phenomena (e.g. the migrant crisis, Euroscepticism, populist movements).

Figure 13. Typical workflow in digital humanities research with parliamentary data.


Case study 1: the War in Parliament project

War in Parliament was a successful data curation project that made the proceedings of the Dutch parliament available as a semi-structured corpus compliant with CLARIN standards. The corpus is now accessible through PoliticalMashup, an advanced search engine tailored to historical and social science research. The project also served as an illustrative case study, showing how a corpus-based approach to the analysis of parliamentary proceedings can unveil aspects of the political past that had hitherto survived only as vaguely remembered events in a nation’s collective memory.

Concretely, the authors of War in Parliament wanted to check systematically how the Dutch Boerenpartij (Farmers’ Party) was associated with National Socialism between 1958 and 1982. Using complex corpus-based search queries, they established two facts. On the one hand, the Boerenpartij was indeed implicitly criticised for its right-wing political stance throughout the entire period. On the other hand, the corpus-based approach showed, for the first time, that the party was directly accused of fostering National Socialist ideas only once, when Hendrik Adams, a member of the party, was singled out for having supported the German occupier during World War II.

Case study 2: Gender in the Danish parliament

Research on gender in the Danish parliament is motivated by the fact that the active participation of women in politics is historically relatively new and that women are still underrepresented in the Danish parliament. Hansen et al. (2018) investigated gender differences in the revised transcripts of speeches from the Danish Parliament over a time span of eight years (2009-2017). They examined the topics addressed, how often MPs spoke and how long their speeches lasted, relating these to the number of speakers and to the speakers’ age, party and role in the party.

Their study shows that female MPs under 29 outnumber male MPs in the same age group and that, in general, women speak less frequently and for a shorter time than male MPs relative to their share of seats in Parliament. The difference in speaking time between female and male MPs is statistically significant. The data also show that, relative to their seats in Parliament, women belonging to left-wing parties speak less frequently than women from right-wing parties. In addition, ministers and spokespersons speak more frequently than ordinary MPs, and female ministers under a male prime minister give fewer speeches than female ministers under a female prime minister, even though the percentage of female ministers in the two periods is similar. The analysis further shows that there were relatively more male spokespersons than female ones in the period covered by the corpus. Finally, the Danish data seem to confirm the findings of related studies that female MPs more often spoke about “softer” political areas, while “harder” subjects prevailed in the speeches of male MPs.
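The phrase "in proportion to their seats" can be made concrete with a small calculation. The following is a minimal sketch, not the method of Hansen et al. (2018): it divides each group's share of speeches by its share of seats. The function name and the counts are invented for illustration only and are not figures from the study.

```python
def seat_adjusted_speech_ratio(speeches, seats):
    """Return each group's share of speeches divided by its share of seats."""
    total_speeches = sum(speeches.values())
    total_seats = sum(seats.values())
    ratios = {}
    for group in speeches:
        speech_share = speeches[group] / total_speeches
        seat_share = seats[group] / total_seats
        ratios[group] = speech_share / seat_share
    return ratios


# Invented toy counts, NOT data from the Danish parliament.
toy_seats = {"female": 40, "male": 60}
toy_speeches = {"female": 300, "male": 700}

ratios = seat_adjusted_speech_ratio(toy_speeches, toy_seats)
for group, ratio in sorted(ratios.items()):
    print(f"{group}: {ratio:.2f}")
```

A ratio below 1 means a group speaks less often than its number of seats would suggest; a ratio above 1 means it speaks more often. The same calculation could be applied to total speaking time instead of speech counts.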

Case study 3: The Linked Open Data of Talk of Europe

Talk of Europe was a successful project that used the multilingual proceedings of the European Parliament debates to create a highly interactive and structured parliamentary corpus that could accommodate a wide range of research approaches.

The Talk of Europe proceedings are enriched with Linked Open Data, which means that the debates are dynamically linked to other Open Data datasets freely available on the Internet, such as a general-purpose encyclopaedia that provides information about the Members of Parliament and a geographical knowledge base for European countries. The corpus that resulted from this project is available through an online search interface that requires users to input the search parameters in the SPARQL query language. Since SPARQL allows a researcher to search for fairly complex notions, the Talk of Europe dataset is ideal for tackling various research questions where the gap between qualitative and quantitative approaches to parliamentary analysis needs to be bridged.
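To give a feel for what entering search parameters in SPARQL involves, here is a minimal sketch in Python that only assembles a query string; it does not contact any server. The `ex:` prefix and the properties `ex:spokenBy` and `ex:text` are hypothetical placeholders, not the actual Talk of Europe vocabulary, and would need to be replaced with the terms documented for the dataset.

```python
def build_speech_query(keyword, limit=10):
    """Build a SPARQL SELECT query for speeches whose text contains keyword."""
    # Escape backslashes and double quotes so the keyword stays a valid
    # SPARQL string literal.
    safe = keyword.replace("\\", "\\\\").replace('"', '\\"')
    # The ex: vocabulary below is a made-up placeholder, not the real schema.
    return f"""PREFIX ex: <http://example.org/vocab#>
SELECT ?speech ?speaker ?text
WHERE {{
  ?speech ex:spokenBy ?speaker ;
          ex:text ?text .
  FILTER(CONTAINS(LCASE(STR(?text)), LCASE("{safe}")))
}}
LIMIT {int(limit)}"""


query = build_speech_query("higher education", limit=5)
print(query)
```

The query asks for speeches, their speakers and their text, keeps only those whose text contains the keyword (case-insensitively), and caps the number of results. Because the debates are linked to other datasets, a real query could extend the pattern with further triple patterns, for example about a speaker's country or party.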

For instance, Kessels et al. (2014) used network analysis to explore highly complex constellations of relations between members of the EU parliament based on their interactions in plenary debates. They transformed the text into a social graph representing the political debates, including the networks, connections and affiliations of the participants. Birckholz et al. (2015) used the dataset and the power of the SPARQL query language to study how Members of Parliament often suggest higher education as a solution to unrelated policy problems. Mandravickaite et al. (2015) applied stylometric analysis to the speeches of the EU members of parliament to uncover how far the rhetoric of individual members resembles or differs from the rhetoric of the party group they belong to and from that of the other party groups.
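The idea of turning debates into a social graph can be illustrated with a very simple construction. The sketch below is an assumption for illustration, not the procedure of Kessels et al. (2014): it links two speakers whenever they took the floor in the same debate and uses the number of shared debates as the edge weight. The debate and speaker identifiers are invented.

```python
from collections import Counter
from itertools import combinations


def co_participation_edges(debates):
    """Count, for every pair of speakers, the debates in which both spoke."""
    edges = Counter()
    for speakers in debates.values():
        # De-duplicate and sort so each undirected pair has one canonical key.
        for pair in combinations(sorted(set(speakers)), 2):
            edges[pair] += 1
    return edges


# Invented toy data: debate identifier -> speakers in order of appearance.
toy_debates = {
    "debate_1": ["MEP_A", "MEP_B", "MEP_C"],
    "debate_2": ["MEP_A", "MEP_B"],
    "debate_3": ["MEP_B", "MEP_C", "MEP_B"],
}

edges = co_participation_edges(toy_debates)
for (first, second), weight in edges.most_common():
    print(first, second, weight)
```

The resulting weighted edge list can be loaded into a network analysis tool to compute measures such as centrality or to detect clusters. Richer graphs could add node attributes such as party group or country.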

Watch interviews with researchers!

Interview with Federico Nanni (University of Mannheim) about researching parliamentary collections (approx. 6 minutes):

Interview with Prof. Dr. Andreas Blätte of the University of Duisburg-Essen (UDE) about working with Parliamentary Data (approx. 5 minutes):


Image credit:

Figure 13 is taken from the presentation by Gkoumas et al. (2018), slide 4.

  • Hughes, L. M., Ell, P. S., Knight, G. A., and Dobreva, M. (2013). Assessing and measuring impact of a digital collection in the humanities: An analysis of the SPHERE (Stormont Parliamentary Hansards: Embedded in Research and Education) Project. Digital Scholarship in the Humanities, 30(2): 183-198.
  • Robertson, S. (2016). The Differences between Digital Humanities and Digital History. In: Debates in Digital Humanities 2016. Minneapolis and London: University of Minnesota Press.
  • Spiro, L. (2014). Access, Explore, Converse: The Impact (and Potential Impact) of the Digital Humanities on Scholarship. In: Keys for architectural history research in the digital era.
  • Piersma, H., Tames, I., Buitinck, L., van Doornik, J., and Marx, M. (2014). War in Parliament: What a Digital Approach Can Add to the Study of Parliamentary History. Digital Humanities Quarterly, 8(1).
  • van Aggelen, A., Hollink, L., Kemman, M., Kleppe, M., and Beunders, H. (2017). The debates of the European Parliament as Linked Open Data. Semantic Web, 8: 271–281.