Creating Citizen Science Data through Crowdsourcing

by Trinity College Dublin

Crowdsourcing is the process of leveraging public participation in, or contributions to, projects and activities, and it is a common part of many citizen science projects. It is the practice of engaging a ‘crowd’ or group towards a common goal. As a relatively recent and still evolving phenomenon, crowdsourcing has not yet acquired an exhaustive definition. One study (Estellés-Arolas & González-Ladrón-de-Guevara, 2012), which documented forty original definitions found in thirty-two articles published between 2006 and 2012, arrives at the following: “Crowdsourcing is a type of participative online activity in which an individual, an institution, a non-profit organization, or company proposes to a group of individuals of varying knowledge, heterogeneity, and number, via a flexible open call, the voluntary undertaking of a task.”

The term has become increasingly familiar over the last decade as the practice has been applied in many different areas. Government, industry, research and commercial enterprises have been developing crowdsourcing as a means to engage their audiences, enrich their collections, build new resources, and solve time-consuming problems. Technology and the growing connectivity afforded by social media platforms have enabled crowdsourcing on larger scales and in a variety of contexts. Not surprisingly, depending on the context, crowdsourcing can take diverse forms and serve different purposes as a crowd initiative. In the humanities in particular, crowdsourcing has focused on the improvement and transformation of content from one type to another, the description of objects, and the synthesis of information from different resources. In this disciplinary context, the purposes of crowdsourcing initiatives can be summarised as:

  • Exploring new forms of public engagement (e.g. Tag! You’re it!; Expose: My Favourite Landscape)
  • Enriching institutional resources through the contribution of the crowd (e.g. Transcribe Bentham; Old Weather)
  • Building novel resources (e.g. archive) through the contribution of the crowd (e.g. Letters 1916-1923; Europeana 1914-1918; 9/11 Memorial Museum)

Who is behind this “crowd”?

Crowdsourcing as a social engagement practice implies the existence of a community of volunteers surrounding a project, adopting different roles and tasks. Analysis of this “crowd” in the literature has led to its definition as “a group of individuals of varying knowledge, heterogeneity, and number” who voluntarily undertake a task. This group, ranging from amateurs to students, scientists and professionals, takes on tasks according to personal motivations: professional development, knowledge enhancement, personal interest in a topic, and so on.

“Social engagement is about giving the public the ability to communicate with us and each other; to add value to existing library data by tagging, commenting, rating, reviewing, text correcting; and to create and upload content to add to our collections. This type of engagement is usually undertaken by individuals for themselves and their own purposes … Crowdsourcing uses social engagement techniques to help a group of people achieve a shared, usually significant, and large goal by working collaboratively together as a group.” (Holley, 2010)

Where does crowdsourcing take place?

Although crowdsourcing has mostly been described as “a type of participative online activity”, interactions do not occur exclusively online. While the majority of such projects do happen online, there is often a blended approach, particularly in humanities projects, combining live, face-to-face events with more standard computer-mediated activities. Especially in initiatives that aim to document historical events (e.g. Europeana 1914-1918) by organising crowdsourcing events to collect memorabilia provided by the public, physical and online interactions are inevitably intertwined.

What do volunteers do?

Crowdsourcing encompasses many practices. The tasks that participants are asked to perform vary with the purpose of the initiative or project. In general, as suggested above, the main trends in terms of projects are:

  1. Crowdsourcing projects that require the “crowd” to integrate/enrich/reconfigure existing institutional resources
  2. Crowdsourcing projects that ask the “crowd” to create/contribute novel resources

These types of projects suggest the following tasks:

  1. When interacting with an existing collection, the public is mostly asked to intervene in terms of curation (e.g., social tagging, image selection, exhibition curation, classification); revision (e.g., transcription, correction); and location (e.g., artwork mapping, map matching, location-related storytelling).
  2. When developing a new resource, the public is mostly invited to share physical or digital objects; document private life (e.g., audio/video of intimate conversations); document historical events (e.g., family memorabilia); and enrich known locations (e.g., location-related storytelling).

Public involvement takes many forms: transcribing handwritten text into digital form, tagging photographs to facilitate discovery and preservation, entering structured or semi-structured data, commenting on content or participating in discussions, and recording one’s own experiences and memories in the form of oral history. Below you can find examples of applications of crowdsourcing in research, organised by type of activity and type of data, which may inspire you to think about the many ways you might include non-professional researchers in your projects.

Data processing

  • Classification: A common application of crowdsourcing in citizen science, this involves gathering descriptive metadata related to an object in a collection. Social tagging is a well-known example. It is particularly useful for images that humans can interpret better than computers can, such as images of landscapes containing wildlife.
  • Correction and transcription: Inviting users to correct and/or transcribe the outputs of digitisation.
  • Contextualisation: Adding contextual knowledge to objects, e.g. by telling stories or writing articles/wiki pages with contextual data.
  • Co-curation: Using the inspiration and expertise of non-professional curators to create (Web) exhibits.

Data collection

  • Data gathering: A common application of crowdsourcing in citizen science. Crowds can collect large volumes of data covering many geographical locations or moments in time.
  • Complementing collections: Active pursuit of additional objects to be included in a (Web) exhibit or collection.

Problem solving

  • Contests and prizes: Participants, who are often not traditional experts in the subject area, compete to solve problems or develop novel ideas. This approach benefits from drawing on diverse perspectives.
  • Puzzle games: ‘Games with a purpose’ may encourage participation because they are fun and do not require knowledge of the underlying research questions.

Shaping research priorities

  • Agenda setting by citizens: Different communities of stakeholders may identify areas of importance that they feel should be addressed in research.

Idea generation

  • Collaborative community: Idea generation and management platforms are a type of crowdsourcing platform that offers a digital, social space to generate, discuss, refine and evaluate ideas. Organisations or individuals can use them to create online spaces where communities of stakeholders gather to share and rate ideas in real time. Ideas can be collaborated on, voted on, and researched by participants, and top-ranked ideas can then be adopted by organisations or taken forward in other ways (e.g. Wikipedia).
  • Crowdfunding: Collective cooperation of people who pool their money and other resources to support efforts initiated by others.

Perhaps one of the most interesting challenges arising from crowdsourcing practices, in particular in the context of cultural institutions (archives, galleries, museums), is the space between official and unofficial knowledge and the need to accommodate, blend and present both in their collections.

References

  • Hedges, Mark, and Stuart Dunn. Academic Crowdsourcing in the Humanities. Chandos Publishing, 2018.
  • Crowdsourcing Week website.
  • Carletti, Laura, et al. “Digital Humanities and Crowdsourcing: An Exploration.” Museums and the Web, 2013.
  • Estellés-Arolas, Enrique, and Fernando González-Ladrón-de-Guevara. “Towards an Integrated Crowdsourcing Definition.” Journal of Information Science, vol. 38, no. 2, 2012, pp. 189-200.
  • Ridge, Mia, ed. Crowdsourcing Our Cultural Heritage. Ashgate Publishing, 2014.
  • Holley, Rose. “Crowdsourcing: How and Why Should Libraries Do It?” D-Lib Magazine, vol. 16, no. 3/4, 2010.
  • Oomen, Johan, and Lora Aroyo. “Crowdsourcing in the Cultural Heritage Domain: Opportunities and Challenges.” Proceedings of the 5th International Conference on Communities and Technologies, ACM, 2011.
  • Strang, Lucy, and Rebecca K. Simmons. “Citizen Science: Crowdsourcing for Systematic Reviews.” 2018.