The FAIR Principles – Parthenos training

By the end of this section, you should be able to:

Describe the FAIR Principles

Understand how Research Infrastructures ensure their data is FAIR

What are the FAIR Principles?

If we agree that improved and increased sharing of research data would benefit research communities and collection-holding institutions alike, then how should we proceed? What ground rules should govern how, when and where people share? How can we establish a common understanding of how far the ethic of sharing can and should extend?

These questions have been answered by the development of the FAIR principles (FAIR stands for Findable, Accessible, Interoperable, Reusable). Developed by FORCE11 (a pan-disciplinary organisation, not one specific to the arts and humanities), these principles provide a baseline understanding of the value that sharing data can deliver, and the baseline requirements for doing so.

The FAIR principles are described as follows:

TO BE FINDABLE:

F1. (meta)data are assigned a globally unique and eternally persistent identifier.
F2. data are described with rich metadata.
F3. (meta)data are registered or indexed in a searchable resource.
F4. metadata specify the data identifier.

TO BE ACCESSIBLE:

A1. (meta)data are retrievable by their identifier using a standardised communications protocol.
A1.1. the protocol is open, free, and universally implementable.
A1.2. the protocol allows for an authentication and authorisation procedure, where necessary.
A2. metadata are accessible, even when the data are no longer available.

TO BE INTEROPERABLE:

I1. (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.
I2. (meta)data use vocabularies that follow FAIR principles.
I3. (meta)data include qualified references to other (meta)data.

TO BE RE-USABLE:

R1. (meta)data have a plurality of accurate and relevant attributes.
R1.1. (meta)data are released with a clear and accessible data usage license.
R1.2. (meta)data are associated with their provenance.
R1.3. (meta)data meet domain-relevant community standards.
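
As a rough illustration of how some of these principles translate into concrete metadata fields, the following Python sketch checks a record for obvious gaps. The `MetadataRecord` class, its field names, and the mapping of fields to principles are assumptions made for this example; they are not part of the FAIR principles or of any PARTHENOS tool, and the sample record is made up. A check like this only tests whether a field is present, so it is a starting point for discussion rather than an assessment of FAIR compliance.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MetadataRecord:
    """Hypothetical, simplified metadata record (field names are assumptions)."""
    identifier: Optional[str] = None       # F1: persistent identifier, e.g. a DOI or handle
    title: Optional[str] = None            # F2: descriptive metadata
    description: Optional[str] = None      # F2: descriptive metadata
    indexed_in: List[str] = field(default_factory=list)  # F3: catalogues listing the record
    access_url: Optional[str] = None       # A1: where the data can be retrieved
    metadata_format: Optional[str] = None  # I1: e.g. "EAD"
    license: Optional[str] = None          # R1.1: usage licence
    provenance: Optional[str] = None       # R1.2: where the data came from


def fair_gaps(record: MetadataRecord) -> List[str]:
    """Return the labels of principles for which the record has an obvious gap."""
    checks = [
        ("F1", bool(record.identifier)),
        ("F2", bool(record.title and record.description)),
        ("F3", bool(record.indexed_in)),
        ("A1", bool(record.access_url)),
        ("I1", bool(record.metadata_format)),
        ("R1.1", bool(record.license)),
        ("R1.2", bool(record.provenance)),
    ]
    return [label for label, ok in checks if not ok]


# A made-up record: it has an identifier, a description, a format and a
# licence, but it is not indexed anywhere and lacks an access URL and provenance.
example = MetadataRecord(
    identifier="hdl:0000/example",  # placeholder, not a real handle
    title="Letters from the Western Front",
    description="Finding aid for a collection of wartime correspondence.",
    metadata_format="EAD",
    license="CC BY 4.0",
)
print(fair_gaps(example))  # ['F3', 'A1', 'R1.2']
```

Principles such as I2, I3 and R1.3 are left out of the sketch because they cannot be judged from the presence of a field alone; they depend on the vocabularies and community standards in use.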

Not every collection of research data is equally eligible to be shared in a FAIR way. Personal data must be protected and may only be sharable in anonymised or redacted form, for example, and unprotected research discoveries may require an embargo. The arts and humanities face particular problems because ownership of cultural data is often shared (e.g. between archives and researchers, or between publishers and authors). The FAIR principles are therefore usually applied with the caveat that data be “as open as possible, as closed as necessary”.

Case Study: CENDARI Data Soup

The Collaborative European Digital Archive Infrastructure (CENDARI) project is one of the e-infrastructures participating in PARTHENOS. CENDARI gathers curated data covering two research areas in the community of “Studies of the Past”: the First World War and the Middle Ages. It includes data from different sources (mostly across the GLAM sector), both unique and deposited. The so-called CENDARI ‘data soup’ contains a wide range of data formats and levels of description. Recognised and interoperable standards, already in use in the research domains involved, were used to encode data and describe cultural objects and collections (e.g. EAD for archival documents).

The CENDARI dataspace contains 829,087 descriptions, represented in several types of data formats. This information is stored in a repository called CKAN, an open source data portal platform developed and maintained by the Open Knowledge Foundation. The kinds of file formats and standards, as well as the level of organisation and accessibility of the data provided by the Cultural Heritage Institutions in contact with CENDARI, vary from case to case. Small archives usually lack resources for metadata standardisation and data storage, so their archival descriptions are often kept in spreadsheets and are not available online (hidden archives). National and international archives, by contrast, usually have a cataloguing and encoding department; nevertheless, they often lack both the technical and the political means to share their data with other institutions and projects.
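
To give a sense of what a repository built on CKAN can offer in terms of findability, the sketch below queries CKAN's standard `package_search` action, which accepts a free-text query `q` and a result limit `rows`. The base URL is a placeholder: the source does not say where the CENDARI repository is hosted or whether its API is publicly reachable, so the `search` function is illustrative only. The helper names are likewise assumptions made for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Placeholder address, NOT the CENDARI repository (its URL is not given here).
CKAN_BASE = "https://ckan.example.org"


def search_url(query: str, rows: int = 10, base: str = CKAN_BASE) -> str:
    """Build a URL for CKAN's package_search action."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base}/api/3/action/package_search?{params}"


def titles_from_response(payload: dict) -> list:
    """Extract dataset titles from a decoded package_search response."""
    if not payload.get("success"):
        return []
    return [pkg.get("title", "") for pkg in payload["result"]["results"]]


def search(query: str, rows: int = 10) -> list:
    """Run the search over HTTP (requires a reachable CKAN instance)."""
    with urlopen(search_url(query, rows)) as response:
        return titles_from_response(json.load(response))


print(search_url("trench maps", rows=5))
# https://ckan.example.org/api/3/action/package_search?q=trench+maps&rows=5
```

Separating URL construction and response parsing from the network call keeps the first two steps easy to try out without access to a live repository.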

Along with the aggregation work on data, CENDARI researchers have also encoded information about archival descriptions and archival institutions, using the open source software AtoM (‘Access to Memory’), which is promoted by the International Council for Archives and fully supports all the archival description standards. CENDARI also established collaborations with international networks in Digital Humanities in order to engage communities of scholars and digital humanists. This reduces the risk that data collected in the context of research projects becomes obsolete and unusable.

Watch!

The FAIR Principles in practice.

This video shows how data that complies with the FAIR Principles helps researchers to use Linked Open Data.

“Linked Open Data – What is it?” from Europeana (approx 4 minutes)

(To watch this video in another language, visit https://vimeo.com/36752317)

The PARTHENOS GUIDELINES to FAIRify data management and make data reusable

As part of its work on Best Practices around Data Management, a team of over fifty PARTHENOS project members investigated commonalities in the implementation of policies and strategies for research data management. The team used results from desk research, questionnaires and interviews with selected experts to gather around 100 current data management policies. These included guides for preferred formats, data review policies and best practices, both formal and tacit.

The resulting guide, “PARTHENOS GUIDELINES to FAIRify data management and make data reusable”, offers a series of guidelines to align the efforts of data producers, data archivists and data users in the humanities and social sciences to make research data as reusable as possible.

The PARTHENOS team extracted a set of twenty guidelines which different disciplines have in common, with a focus on (meta)data and repository quality.

For easy reference, the team assigned each of the guidelines to making data Findable, Accessible, Interoperable or Reusable. This subdivision is based on the FAIR Data Principles, which were first published by FORCE11 (2016) and are intended to guide those wishing to enhance the reusability of research data. Each of the PARTHENOS guidelines is accompanied by specific recommendations for data producers and data users on the one hand and for data archivists on the other hand.

Click here to download the “PARTHENOS GUIDELINES to FAIRify data management and make data reusable”

Watch!

Dieter Van Uytvanck – CLARIN and the FAIR Principles (approx 30 mins)

Look at how Research Infrastructures ensure that they are compliant with the FAIR Principles. Dieter Van Uytvanck of CLARIN gave this presentation at the PARTHENOS-DARIAH-CLARIN ‘FAIR Principles Workshop’ held on the periphery of DHBenelux2017 in Utrecht, July 2017.

Further Learning

The FAIR Data Principles

https://www.force11.org/group/fairgroup/fairprinciples

YouTube video: “Barend Mons / FAIR Principles”, by GODAN Secretariat, published 15th Sept 2016, https://youtu.be/K40utIzUzOk (accessed 23rd Jan 2018)
