INTRODUCTION

Crowdsourcing is an approach to accomplishing a task by opening up its completion to broad sections of the public. Innovation tournaments, prizes for solving an engineering problem, and paying online participants to categorize images are all examples of crowdsourcing. What ties these approaches together is that the task is outsourced with little restriction on who may participate. Despite the potential of crowdsourcing, little is known about the applications and feasibility of this approach for collecting or analyzing health and medical research data, where the stakes for data quality and validity are high.

One of the most celebrated crowdsourcing tasks was the prize established in 1714 by Britain’s Parliament in the Longitude Act, offered to anyone who could solve the problem of identifying a ship’s longitudinal position.1 The Audubon Society’s Christmas Bird Count began in 1900 and continues to this day as a way for “citizen scientists” to provide data that can be used for studying bird population trends.2 Today, however, the world has 2.3 billion Internet users and 6 billion mobile phone subscriptions,3 providing access that facilitates crowdsourcing to a far greater extent than was available to Britain’s Parliament and the Longitude Act. The Galaxy Zoo project (galaxyzoo.org) successfully classified nearly 900,000 galaxies with the help of hundreds of thousands of online volunteers; the simple visual classification was easily performed by humans but not by computers.4 Other examples include Whale.fm (whale.fm), where volunteers are classifying almost 16,000 whale calls5 to help process data sets that have become unmanageable for researchers to handle alone.6 The online platform eBird (ebird.org) has collected more than 48 million bird observations from well over 35,000 contributors.7,8

While this prior work illustrates the promise of crowdsourcing as a research tool, little is known about the types of questions crowdsourcing is best suited to answer or about the limitations of its use. Health research in particular requires high standards for data collection and processing, tasks traditionally conducted by professionals rather than the public. Furthermore, human health research often requires protections for privacy and against physical harm. To better understand the potential of crowdsourcing methods in health research, we conducted a systematic literature review to identify primary peer-reviewed articles focused on health-related research that used crowdsourcing of the public. Our aim was to characterize the types of health research tasks crowdsourcing has been used to address and the approaches used, in order to define future opportunities and challenges.

METHODOLOGY

Data Sources and Searches

Definitions

Crowdsourcing was defined as soliciting, over the Internet and from a group of unselected people, services and data that could not normally be provided solely by automated sensors or by computation lacking human input. Crowdsourcing participants had to be actively engaged in the crowdsourcing task, not simply have their data mined passively without their knowledge.

Health research was defined as research that contributes to the World Health Organization’s definition of health: “a state of complete physical, mental and social well-being and not merely the absence of disease or infirmity.”9

Systematic Literature Search

A systematic literature search was performed on March 24, 2013, by searching PubMed, Embase, and CINAHL using the following Boolean search string: crowdsourc* OR “crowd source” OR “crowd sourcing” OR “crowd sourced” OR “citizen science” OR “citizen scientist” OR “citizen scientists.” Articles underwent a multistage screening process in which results were pooled and duplicates were removed. Two reviewers (BR and RM) screened abstracts for relevance and then screened full-text articles to confirm eligibility criteria. Two literature-informed database searches were performed to identify additional studies. Because authors of published papers meeting eligibility criteria are likely to be experts in their field, PubMed was searched using the full names of the first authors of all papers meeting eligibility criteria. In addition, because Amazon Mechanical Turk (AMT) (mturk.com) and Foldit (fold.it) were two crowdsourcing platforms commonly used by articles meeting eligibility criteria, a literature-informed search of all three databases was performed using the terms “mechanical turk” and “foldit.” Reference lists of articles meeting eligibility criteria were reviewed to identify additional articles, as were relevant review articles returned by the database search. Project websites specifically cited by references were manually searched for relevant publications.
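To make the database arm of this search reproducible in code, the following minimal Python sketch (a sketch only, assuming Biopython is available) runs the Boolean string against PubMed through the NCBI Entrez E-utilities; the contact e-mail address and the result cap are placeholders we introduce here, and Embase and CINAHL would require their own vendor-specific interfaces.

```python
# Minimal sketch of reproducing the PubMed arm of the systematic search
# with Biopython's Entrez E-utilities. The e-mail address and retmax value
# are illustrative placeholders; Embase and CINAHL are not covered by
# E-utilities and would need separate vendor interfaces.
from Bio import Entrez

Entrez.email = "reviewer@example.org"  # required by NCBI; placeholder

QUERY = (
    'crowdsourc* OR "crowd source" OR "crowd sourcing" OR "crowd sourced" '
    'OR "citizen science" OR "citizen scientist" OR "citizen scientists"'
)

# Run the search and collect PubMed IDs (PMIDs) for later pooling and
# de-duplication against the Embase and CINAHL result sets.
handle = Entrez.esearch(db="pubmed", term=QUERY, retmax=500)
record = Entrez.read(handle)
handle.close()

pmids = record["IdList"]
print(f"{record['Count']} records found; retrieved {len(pmids)} PMIDs")
```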

Crowdsourced Literature Search

A crowdsourced search for literature was performed by posting an open call for articles on two free websites: Yahoo! Answers (answers.yahoo.com) and Quora (quora.com). The title of the question was: “Crowdsourcing: published literature on crowdsourcing in health/medicine?” and the body of the question was: “What scientific research articles in health/medicine have been published that use crowdsourcing in part or in whole to achieve their research objectives?” Results were collected after seven days. Respondents consented to the use of the references they provided.

Study Selection

Studies were included if they met the following criteria: (1) primary peer-reviewed journal article representing original health research; (2) methodology and results provided; and (3) citizen crowdsourcing used by scientists to obtain at least part of the results. Excluded were studies soliciting opinions only from other experts (i.e., experts collaborating with each other), abstracts, editorials, and wikis that existed simply to create content rather than to answer a specific research question with original data. Also excluded were papers that used an Internet survey but did not contain any of the original crowdsourcing keywords from our Boolean search string in the title or abstract. Behavioral research has been conducted through Internet surveys for over 15 years,10 so this is not a particularly novel research method; we included the surveys returned by our database search to provide a few examples from the field.

Data Extraction and Quality Assessment

Two reviewers (BR and YH) independently extracted data. The following data were extracted from articles (including participant data items recommended by the Cochrane Handbook checklist11): study background information (title, author, publication year, research field, methodology type, study objective, study outcome); demographic and other characteristics of the crowd (size of the crowd, age, gender, racial/ethnic background, geographic location, occupation, education, relationship to the research problem, referral source, stated conflict of interest, motivation); and the logistics of the crowdsourcing [length of time crowdsourcing was conducted, use of a web platform and/or a mobile platform, use of individuals compared to teams, intracrowd sharing techniques (such as a team wiki or forum), data collected or processed, complexity of the task, time given to do the task, advertisement of the project, skill set required, monetary incentives offered, and data validation techniques]. Additional extracted data included the viewer-to-participant ratio, the ratio of how many people viewed the task (or the website hosting the task) to how many people completed it.
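To make the extraction schema concrete, the minimal sketch below models a single extracted article as a Python dataclass; the field names and types are an illustrative subset of the items listed above, not the actual data-collection instrument used by the reviewers.

```python
# Illustrative sketch of a per-article extraction record covering a subset
# of the items described above; field names and types are assumptions,
# not the reviewers' actual extraction form.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ExtractionRecord:
    # Study background information
    title: str
    first_author: str
    publication_year: int
    research_field: str
    task_type: str                      # e.g., problem solving, data processing
    # Demographics of the crowd (None when the study did not report the item)
    crowd_size: Optional[int] = None
    age_reported: bool = False
    gender_reported: bool = False
    geographic_location: Optional[str] = None
    # Logistics of the crowdsourcing
    duration_days: Optional[float] = None
    monetary_incentive_usd: Optional[float] = None
    validation_techniques: List[str] = field(default_factory=list)
    viewer_to_participant_ratio: Optional[float] = None
```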

Data Synthesis and Analysis

Summary statistics were used to describe the number of studies reviewed and to characterize the data extracted from these studies.

This study was approved by the University of Pennsylvania Institutional Review Board.

RESULTS

Systematic Literature Search

There were 231 unique articles identified from the initial database and crowdsourced searches and 209 articles identified in the literature-informed database search. Of these 440 articles, 76 underwent full-text review. Sixteen of these articles met eligibility criteria. Five additional articles were identified from the reference lists of eligible articles (Fig. 1). The final article cohort consisted of 21 unique health research-related primary peer-reviewed publications that used crowdsourcing as a methodology12–32 (Table 1).

Figure 1. Results of the systematic literature search for health-related crowdsourcing studies. This figure shows the results of the systematic literature search for primary peer-reviewed articles that used crowdsourcing for health research.

Table 1 Study Background Information

Study Characteristics

We identified four types of research tasks that articles in our cohort employed crowdsourcing to accomplish: problem solving, data processing, surveillance/monitoring, and surveying (Appendix Fig. 2 and Appendix Table 4).

Problem Solving

Seven of 21 articles (33 %) from the final cohort employed crowdsourcing for problem solving. Six of these used Foldit, an online game that allows users to manipulate the three-dimensional structures of proteins in order to find the most likely tertiary structure.12,13,17–19,21 Also described was the online game Phylo, where users moved colored blocks representing different nucleotides of a gene promoter sequence around on screen in order to make the most parsimonious phylogenetic tree.23

Data Processing

Crowdsourcing was used for data processing in 7 of 21 articles (33 %). Three papers used AMT, a service from Amazon.com that allows individuals to create accounts and sign up to complete online tasks in exchange for payment. AMT Knowledge Workers (KWs) classified polyps in computed tomography (CT) colonography images28 and were then asked questions to gauge how to optimize the presentation of the polyps.27 AMT was also used to annotate public webcam images to determine how the addition of a bike lane changed the mode of transportation observed in the images.31 Three manuscripts reported on two independent games that used crowdsourcing to identify either red blood cells (RBCs) infected with25,26 or thick blood smears containing24 malaria parasites (Plasmodium falciparum). The final paper in the data processing category attempted to use the crowd to update the literature and evidence covered by a systematic review.16

Surveillance/Monitoring

Surveillance/monitoring was employed in 3 of 21 (14 %) studies. One of the papers used AMT to ask users about their malaria symptoms in order to assess malaria prevalence in India.20 Another used a mobile phone application that allowed users to report potential flu-like symptoms along with GPS coordinates and other details, which enabled researchers to chart incidence of flu symptoms that matched relatively well with Centers for Disease Control and Prevention data.14 The last paper created a map of automated external defibrillators (AEDs) by having users of a mobile phone application locate and take pictures of AEDs.32

Surveying

Crowdsourcing to conduct surveys was reported in 4 of 21 (19 %) papers, all of which used AMT to administer the surveys. One study found that AMT allowed surveyors to reach a more diverse population than the typical university research subject pool while maintaining reliability (as measured by Cronbach’s coefficient alpha).15 A second study used AMT to examine the human decision-making process and showed that AMT replicated previous laboratory findings, indicating that it may be a good platform for future decision-making studies.22 A third study used AMT to administer surveys about health promotional materials and solicit feedback.29 The last study used KWs as subjects for cognitive behavioral tests administered through AMT.30

Crowdsourcing Logistics

The length of the crowdsourcing study was mentioned in 18 of 21 (86 %) studies and varied from <2 hours to 10 months. Teams were used in 6 of 21 (29 %) studies, and intracrowd sharing was allowed in 7 of 21 (33 %) studies. Monetary incentives were offered in 9 of 21 (43 %) studies and ranged from $0.01 USD to $2.50 USD per task, with many studies also offering bonuses either for high-quality completion or as a raffle/prize. The reported size of the crowd engaged in individual studies ranged from 5 to >110,000 people, with >136,395 people engaged collectively across studies. Eleven of 21 studies (52 %) reported what advertising was used to attract participants.

All articles (100 %) reported employing some method for validating or improving the quality of the data provided by the crowd. Validation techniques ranged from inserting questions with known answers into the task, to screen out users who were marking answers incorrectly, to comparing responses among multiple users and discarding outliers (Table 2).
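As an illustration of the two techniques named above, the minimal sketch below first screens out workers who fail embedded known-answer (“gold”) questions and then aggregates the surviving responses by majority vote, treating minority labels as discarded outliers; the 80 % accuracy threshold and the data layout are assumptions made for this example, not parameters taken from any of the reviewed studies.

```python
# Illustrative sketch of two crowd-data validation techniques noted above:
# (1) screening workers with embedded known-answer ("gold") questions and
# (2) aggregating the surviving responses by majority vote. The accuracy
# threshold and data layout are assumptions for this example.
from collections import Counter

def screen_workers(responses, gold_answers, min_accuracy=0.8):
    """Keep only workers who answer enough known-answer items correctly."""
    kept = {}
    for worker, answers in responses.items():
        gold_items = [q for q in answers if q in gold_answers]
        if not gold_items:
            continue  # worker saw no gold questions; exclude conservatively
        correct = sum(answers[q] == gold_answers[q] for q in gold_items)
        if correct / len(gold_items) >= min_accuracy:
            kept[worker] = answers
    return kept

def majority_vote(responses):
    """Combine multiple workers' labels per task, discarding minority outliers."""
    votes = {}
    for answers in responses.values():
        for task, label in answers.items():
            votes.setdefault(task, Counter())[label] += 1
    return {task: counts.most_common(1)[0][0] for task, counts in votes.items()}
```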

Table 2 Logistics of the Crowdsourcing

Demographics of the Crowd

Reporting of the demographics of the crowd varied widely, with studies reporting crowd size (16/21, 76 %), age (7/21, 33 %), gender (5/21, 24 %), race (1/21, 5 %), geographic location (10/21, 48 %), occupation (4/21, 19 %), education (3/21, 14 %), relationship to the research question (4/21, 19 %), referral source (2/21, 10 %), conflict of interest (2/21, 10 %), reported motivation (3/21, 14 %), and viewer-to-participant ratio (4/21, 19 %). One (5 %) study reported age, sex, and race, and seven (33 %) studies reported at least one of these three descriptors (Table 3).

Table 3 Demographics of the Crowd

DISCUSSION

This is the first study to identify the types of crowdsourcing tasks used in primary peer-reviewed health research. This study has three main findings. First, we identified only 21 articles reflecting the use of crowdsourcing for health-related research. Second, we found that these studies used crowdsourcing for four different principal objectives and that there are several advantages to utilizing crowdsourcing. Third, we found considerable variability in how the methods of crowdsourcing were reported.

While citizen science has existed for more than a century and crowdsourcing has been used in science for at least a decade, crowdsourcing has been utilized primarily by non-medical fields, and little is known about its potential in health research. Every health field, from chronic disease research to global health, has a potential need for human computing power that crowdsourcing could fill to accelerate research. Prior work has heralded crowdsourcing as a feasible method for data collection, but a clear roadmap for the types of questions crowdsourcing could answer and the ways it could be applied has been lacking. Understanding how crowdsourcing has been used successfully in health research is crucial to understanding where crowdsourcing fits in the health care space, especially when there may be higher standards or tighter regulations for data quality and validity compared to the scientific fields that were early adopters.

The limited number of articles using crowdsourcing is surprising given the potential benefits of this approach. Although we identified 21 articles, most of which used crowdsourcing successfully, crowdsourcing clearly is not used pervasively in health research, and it is important to understand the quality of data it provides. Even though the use of crowdsourcing in health research is in its infancy, the papers we identified successfully used crowdsourcing to solve protein structure problems,18 improve alignment of promoter sequences,23 track H1N1 influenza outbreaks in near real time,14 classify colonic polyps,27,28 and identify RBCs infected with Plasmodium falciparum parasites.24–26 Furthermore, as Mavandadi et al. point out, one way around the problem of involving lay people in making a medical diagnosis is to use crowdsourcing to distill the data for a medical professional, who can then make the final decision. For example, a pathologist must look at more than 1,000 RBCs to rule a sample negative, but if crowdsourcing identifies RBCs that are infected, all a pathologist has to do is officially confirm the diagnosis with a single image.25 This could be especially useful in resource-poor areas.

Crowdsourcing in health research has been employed to accomplish one of four main categories of tasks: problem solving, data processing, surveillance/monitoring, and surveying. Data from existing studies show that crowdsourcing can benefit research in five ways: quality, cost, volume, speed, and novel science. Crowdsourcing has been demonstrated to increase the accuracy of computer recognition of RBCs infected with malaria parasites25 (quality), provide a low-cost alternative to more traditional behavioral research and epidemiology studies15,20,22,30 (cost), engage over one hundred thousand people in a research problem14 (volume), allow research to progress much faster than if processed by investigators alone14,15,20,22–24,26–32 (speed), and produce new scientific discoveries13,18,19,21,23 (novel science).

Additionally, the crowd offers untapped expertise. Even though these crowdsourced projects do not ask scientific experts to participate, participants have been found to be experts at puzzles and problem solving, which makes them particularly adept at solving protein structures in Foldit12,13,17–19,21 and multiple sequence alignments in Phylo.23 Presumably, members of the public could be found who are expert at many different tasks, especially when the task is presented as a game that benefits science. Finally, crowdsourced projects raise public awareness about the project itself and about science in general.

The papers identified in this study varied widely in the amount and type of data that were reported about the crowd and the experimental setup. Crowdsourcing articles rarely reported data about the demographics of the participating crowd, including information standard in most clinical trials, such as cohort size, age, gender, and geographic location.11 These data, along with others such as motivation for participation, education, and occupation, are crucial for understanding the people involved in the research. Ideally, all papers that use crowdsourcing should report, at a minimum, data regarding the demographics of the crowd and the logistics of data collection (Appendix Table 5). If collected in the future, these data would provide crucial information to help the scientific community understand how the crowd works and how best to maximize the use of crowdsourcing. However, the crowdsourcing methodology for health research is in an early phase of development, and additional work is needed to develop these methods and related reporting standards.

When planning a crowdsourced study, it may be difficult to choose or create the most appropriate platform. In some cases researchers used their own custom platforms, and in others they employed AMT. We examined studies that employed the crowd to perform crowdsourced tasks; however, there are also platforms that allow one to post a “challenge” and offer a monetary reward for the best solution. Kaggle (kaggle.com) allows scientists and others to post complex data analysis problems along with monetary rewards for the best solution,33 and InnoCentive (innocentive.com) is a more general platform that allows prizes to be posted for any sort of research and development problem.34 It is also worth noting that crowdfunding websites, such as RocketHub (rockethub.com) and Petridish (petridish.org), have become popular and may offer a way to fund research projects.35

Although we reviewed reference lists and review articles in addition to conducting a crowdsourced search for literature, our results do not include all articles. Posting a survey on the Internet (collecting research data about Internet users themselves by having them answer questions or perform tasks) has been done for >15 years10 and is therefore not a novel research method; our review includes only a sample of projects that used the Internet to survey participants, and this specific type of crowdsourcing has been reviewed elsewhere.36 Additionally, there are other types of research that are sometimes referred to as crowdsourcing but do not meet our definition: investigators mining data that were generated by users, generally not for research purposes, such as estimating influenza activity by analyzing Twitter (twitter.com) posts.37 Other examples reviewed elsewhere include investigators mining data from or surveying online communities such as PatientsLikeMe (patientslikeme.com) and 23andMe (23andme.com).38 Our search did not include gray literature or searching the Internet using a search engine. The goal of the study, however, was to characterize primary peer-reviewed health research. While projects may be featured on the Web and in science magazines, it is imperative that they be published in academic journals so that the scientific community can validate the methods and demonstrate the varied, interesting, and successful uses of crowdsourcing in health and medical research.

CONCLUSION

Crowdsourcing has been used to help answer important health-related research questions. Utilizing crowdsourcing can improve the quality, reduce the cost, and increase the speed of a research project while engaging large segments of the public and creating novel science. This methodology offers an alternative approach for studies that could benefit from large amounts of manual data processing, surveillance conducted by people around the world, specific skills that members of the public may have, or diverse subject pools that can be surveyed at low cost. In this systematic review, we identified four types of research needs that have been addressed by crowdsourcing and specified criteria that future studies should meet in order to help standardize the use of crowdsourcing in health and medical research.