1 Introduction

Mobile technologies and social media have transformed the landscape of emergency and disaster management by enabling disaster-stricken citizens to produce digital, real-time, local information on critical events. Hurricane Sandy in 2012, Typhoons Haiyan and Hagupit in 2013–2014, and the Nepal earthquake in 2015 offer examples of user-generated data by the millions (Purohit et al. 2014; Imran et al. 2015). The 2015–2016 refugee crises in Europe, with hundreds of thousands of asylum seekers checking their social media and messaging apps across borders, constitute another example of data-intensive emergency management in the humanitarian domain. Metaphors of social media data as a stream, torrent, deluge, flood, or exaflood, so popular in the information management literature, are now equally common among disaster management experts. The growing interest in how to leverage the vast landscape of social media data for disaster management comes as no surprise, nor does the number of online platforms and tools that aim at making sense of crowdsourced data for disaster and emergency management. Collaborative management and negotiated integration of information are distinctive features of the “paradigmatic shift” that data-intensive disaster response represents (Bunker et al. 2015).

Together with streams of big data, social media have brought new forms of crowd participation in emergencies and disasters. From Hurricane Sandy’s response in 2012 to the refugee waves in Europe in 2016, there are many examples of tech-savvy affected populations, first responders and diasporas with different degrees of data management expertise. Likewise, digital volunteers with the most varied backgrounds and fortuitous experiences contribute remotely either as individuals or within networks and organisations. This opening up of new forms of participation for individuals and communities often blurs the distinctions between amateurs and professionals based on skill levels. At the same time, the heterogeneity of the participants may make it difficult to reach a shared understanding of how the data should be represented. In this regard, the use of standardized ontologies should be an important objective towards the unification of different sources of data (e.g. Fan and Zlatanova 2011; Yusuf et al. 2012; Shih et al. 2013). Standards are a form of soft regulation, but they have become crucial to facilitate interoperability, to promote the practical efficacy of systems, to clarify possible misunderstandings, and to prevent different interpretations from causing legal conflicts in the web of data (Casanovas et al. 2016). This approach seems particularly appropriate for the management of crowdsourcing platforms, which typically bring together people with different geographical, cultural, and professional backgrounds.

This paper explores the synergies between crowds and data by focusing on the intersection between the level of involvement of citizens and the type of data they process in disaster and emergency management. These two dimensions can also help to understand the affordances of the online platforms and tools that we survey in this paper. The paper, therefore, is organised as follows: Section 2 offers an overview of relevant approaches to crowdsourcing within the broader field of collective intelligence as a conceptual framework for its use in emergency and crisis management; Section 3 crosses crowdsourcing methods with different types of data and proposes a classification of crowdsourcing roles emerging from this intersection; Section 4 surveys the features and functionalities of current platforms and mobile applications leveraging the convergence of crowds and data and proposes a classification of the tools based on their origin, methods, functionalities, and prospective end users; Section 5 discusses the main findings, including a review of the use of ontologies for the emergency and disaster management domain. Finally, the paper concludes by stressing the need for further research on crowdsourcing roles matching the needs of each phase of the disaster management cycle (DMC).

2 Crowdsourcing: Social and technological aspects

For some years now, both researchers and practitioners in the areas of disaster and emergency management have been exploring the role of crowdsourcing in collecting, processing, and sharing information across organisations and affected populations. Even if the concept has a recent history, crowdsourcing now has a number of ramifications. The term was first coined by Jeff Howe in 2006 to define “the act of taking a job traditionally performed by a designated agent (usually an employee) and outsourcing it to an undefined, generally large group of people in the form of an open call” (Howe 2006). To Howe, crowdsourcing finds its contemporary roots in the open source software movement:

Open source revealed a fundamental truth about humans that had gone largely unnoticed until the connectivity of the Internet brought it into high relief: labor can often be organized more efficiently in the context of community than it can in the context of a corporation. The best person to do a job is the one who most wants to do that job; and the best people to evaluate their performance are their friends and peers who, by the way, will enthusiastically pitch in to improve the final product, simply for the sheer pleasure of helping one another and creating something beautiful from which they all will benefit (Howe 2008).

Since Howe’s first definition, an extended range of crowdsourcing categories, processes, and typologies has been proposed from a number of disciplines: computer sciences (Doan et al. 2011; Hetmank 2013); management (Schenk and Guittard 2011; Hossain and Kauranen 2015); information systems (Geiger et al. 2011; Estellés-Arolas and Gonzalez-Ladron-de-Guevara 2012; Haklay 2013; Saxton et al. 2013; Nakatsu et al. 2014; Zhao and Zhu 2014). While these domains may vary the emphasis on different components (computational arrangements, organisational aspects, data management), they all consider the challenges of crowdsourcing on the web to be of a “socio-technical” nature (Doan et al. 2011), since they call for an appropriate balance between technology and human intelligence.

In this regard, and as a “general-purpose problem-solving method” (Doan et al. 2011), crowdsourcing has also caught the attention of emerging paradigms such as collective intelligence, human computation, or social computing (e.g. Quinn and Bederson 2011; Michelucci 2013). The intersections between these domains have been noted as they coincide in their focus on horizontal processes that engage large groups of individuals towards clearly defined goals. The term “collective intelligence” (CI) predates the notion of crowdsourcing and gained popularity with the publication of Pierre Lévy’s book L’intelligence collective (Lévy 1997). Lévy initially defined CI as “universally distributed intelligence, constantly enhanced, coordinated in real time, and resulting in the effective mobilization of skills” (Lévy 1997). This premise resonates with Hutchins’ work on socially distributed cognition and his effort to resituate the focus of cognitive science as a study of “the social and material organization of cognitive activity” rather than the solitary individual (Hutchins 1995). Other frequently quoted definitions approach CI as “the capability for a group of individuals to envision a future and reach it in a complex context” (Noubel 2004), “groups of individuals doing things collectively that seem intelligent” (Malone 2009) or “the general ability of a group to perform a wide variety of tasks” (Woolley et al. 2010). A critical review of the CI literature has pointed out that these somewhat overlapping definitions fail to “exclude anything that is collectively created”, and proposes to distinguish CI from other forms of “communal intelligence” such as “team intelligence” (Aulinger and Miller 2014).
In this view, CI is “the degree of ability of two or more living things to overcome challenges through the aggregation of individually processed information, whereby all actors follow identical rules of how to participate in the collective”, as opposed to “team intelligence” where “the actors don’t follow completely identical rules of how to participate in the team” (Aulinger and Miller 2014). Another review discussing literature on CI in humans highlights the multidisciplinary character of this emergent paradigm and identifies three levels of abstraction: the micro-level (CI as “a combination of psychological, cognitive and behavioral elements”), the macro-level (CI as a “statistical phenomenon”), and the level of emergence between the two which “deals with the question of how system behavior on the macro-level emerges from interactions of individuals at the micro-level” (Salminen 2012).

Crowdsourcing and human computation are sometimes considered interchangeable concepts too. The notion of human computation has been traced back to 1838 in the philosophy and psychology literature (Quinn and Bederson 2011), but its present meaning has been shaped by the influential work of Luis von Ahn, the inventor of CAPTCHA, reCAPTCHA and Duolingo (Von Ahn et al. 2003, 2008; Von Ahn 2013). To Quinn and Bederson, the overlap between the two concepts exists but they apply to different operations: “whereas human computation replaces computers with humans, crowdsourcing replaces traditional human workers with members of the public” (Quinn and Bederson 2011). In Michelucci’s view, crowdsourcing does not necessarily require the computational component, but since “it is a common method for engaging many participants in human computation […] they often coincide” (Michelucci 2013).

Likewise, these terms overlap with the notion of social computing, even though the social behaviour component might not be present in the two previous ones. As Michelucci puts it, “a workflow process may elicit human input, transform that input, and then pass the result to another human, in a pipeline that involves no social behaviour or interaction whatsoever, yet is very much a manifestation of human computation” (Michelucci 2013).

The multiple definitions of crowdsourcing result from combining different theoretical elements drawn from CI, human computation and social computing approaches. But web-based and mobile technologies have given crowdsourcing its present allure. According to Saxton et al.’s definition, “crowdsourcing is a sourcing model in which organizations use predominantly advanced Internet technologies to harness the efforts of a virtual crowd to perform specific organizational tasks” (Saxton et al. 2013). Chamales also highlights the technological component of crowdsourcing, noting that “the implementation of a crowdsourcing system can vary widely, from complex online websites that coordinate a million simultaneous workers to low-tech, ad hoc approaches that use a shared spreadsheet” (Chamales 2013).

There is no doubt that web-based and mobile technologies have expanded crowdsourcing methods to the point that the concept has become an umbrella term covering multiple ways to collect and share information online, respond to labour offers or contests, or volunteer for a number of tasks. The size and composition of the crowd can also help to determine whether the crowdsourced effort is unbounded (anyone can participate) or bounded to “a small number of trusted individuals” (Meier 2011). In this line, Prpic and Shukla (2013) have distinguished different types of “crowd capital” generation based on the “crowd capabilities” of organizations as they engage with the dispersed knowledge of individuals (i.e. public crowd, public crowd curated, and captive crowd).

Similarly, in her account of crowdsourcing for emergencies and crises, Liu has analysed the distinct skills and expertise of different crowds typically involved in crisis management: (i) affected-populations, (ii) diasporas, (iii) social networks, and (iv) digital volunteer communities (Liu 2014). In this framework, disaster-affected populations bring “local, timely, and direct experiential information”, diasporas provide their distinctive “socio-cultural tacit knowledge”, social networks make available “varied background with unexpected and possibly fortuitous experience” and, finally, digital volunteers offer their “capabilities in processing and managing crisis data” (Liu 2014). While the benefits of combining these different layers of information, knowledge, and skills have already been documented (e.g. Whipkey and Verity 2015) there are new challenges to address, such as organisational coordination (Digital Humanitarian Network 2012a, b) and the quality and accuracy of information gathered through crowdsourcing methods (Hossain and Kauranen 2015). Both challenges have socio-technical components. For example, coordination among organisations requires well-defined responsibilities, workflows, and tasks. Likewise, the quality and accuracy of crisis information relies on the capacity to assign different levels of trust and the use of semantic web tools (e.g. Keßler and Hendrix 2015) and artificial intelligence (e.g. Imran et al. 2014) to facilitate the processing of filtered and relevant information.

3 Crowdsourcing for data generation: Capabilities and roles

When it comes to data, the crowdsourcing literature abounds in disaster-related metaphors: data may come as an “explosion”, “deluge”, “flood” or “exaflood”. Crowdsourcing methods typically include the design of architectures and workflows that aim at effectively channelling the flood of digital data (e.g. by collecting, filtering, tagging, mapping, etc.). As a result, the vast majority of crowdsourced tasks are either data-intensive or rely on data-intense aggregation processes.

Given the ubiquity and variety of data, it is possible to consider the different roles of the participant crowds based on the type of data they produce or process. In this regard, we can distinguish different types of data:

  (i) raw data (e.g. data collected from mobile position sensors, geo-social check-ins)

  (ii) unstructured data (e.g. texts, images, videos)

  (iii) semi-structured data (e.g. tags, geotags, hashtags)

  (iv) structured data (e.g. spreadsheets, tables, datasets, metadata)

Moreover, we may also combine different types of data with participants’ type of involvement. Thus, individuals can contribute to crowdsourced projects either as passive data generators or as actively engaged contributors. Data drawn from the position sensors of mobile devices (e.g. accelerometers, gyroscopes, magnetometers, GPS receivers) are an example of passive involvement. Citizens collecting and sharing data from barometers or ambient thermometers embedded in their smartphones in the context of urban sensing initiatives, environmental projects, or participatory mapping qualify as active involvement.

Figure 1 captures four different roles of the crowd when associated with four types of data and two types of involvement (active or passive).

Fig. 1 Crowdsourcing roles based on types of data processed and level of involvement
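
As a rough illustration, the pairing of roles, data types, and involvement that Figure 1 depicts, and that the remainder of this section describes, can be written as a small lookup table. This is only a minimal sketch: the names `CrowdRole`, `PYRAMID`, and `role_for` are hypothetical, and the involvement recorded for each role is the most typical case only, since sensor and social media users may also engage actively.

```python
from dataclasses import dataclass
from enum import Enum


class DataType(Enum):
    RAW = "raw"
    UNSTRUCTURED = "unstructured"
    SEMI_STRUCTURED = "semi-structured"
    STRUCTURED = "structured"


class Involvement(Enum):
    PASSIVE = "passive"
    ACTIVE = "active"


@dataclass(frozen=True)
class CrowdRole:
    """Hypothetical record pairing a crowd role with its data type."""
    name: str
    data_type: DataType
    typical_involvement: Involvement  # simplification: most typical case only


# Ordered from the base of the pyramid to its top.
PYRAMID = (
    CrowdRole("sensor", DataType.RAW, Involvement.PASSIVE),
    CrowdRole("social computer", DataType.UNSTRUCTURED, Involvement.PASSIVE),
    CrowdRole("reporter", DataType.SEMI_STRUCTURED, Involvement.ACTIVE),
    CrowdRole("microtasker", DataType.STRUCTURED, Involvement.ACTIVE),
)


def role_for(data_type: DataType) -> CrowdRole:
    """Return the role that the pyramid associates with a data type."""
    for role in PYRAMID:
        if role.data_type is data_type:
            return role
    raise ValueError(f"no role for {data_type}")


print(role_for(DataType.SEMI_STRUCTURED).name)  # reporter
```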

The bottom of the pyramid represents the wide base of users who generate raw data by merely carrying their mobile devices or being involved in a data collection initiative. In both cases, people generate raw data just because some processes are automatically performed by sensor-enabled mobile devices (e.g. processes run in the backend by GPS receivers, accelerometers, gyroscopes, magnetometers, etc.), and these data can later be used for a purpose (e.g. mobile phone coordinates for positional triangulation, traffic flow estimates, etc.). This type of data collection has been defined elsewhere as “people as sensors” (Goodchild 2007), “opportunistic sensing” (Lane et al. 2008) or “opportunistic crowdsourcing” (Chatzimilioudis et al. 2012). Opportunistic crowdsourcing requires very low data processing capabilities (if any) on the side of participants and is the most passive role in the information contribution chain. Yet, a fraction of these users may actively engage in crowdsourcing projects by enabling their devices to contribute to particular data collection processes. In a recent meta-analysis of studies that use smartphones, Birenboim and Shoval (2016) have found a combination of both approaches in the research design (e.g. passive location recording with active reports that participants are asked to send). Whether opportunistic or actively participatory, there are many potential uses of these crowdsourced raw data for disaster management, especially in the mitigation and preparedness phases (e.g. detection of spatial-temporal patterns to prevent stampedes or traffic jams, seismic sensing, displaced population sensing, etc.). With regard to scalability, this data collection process depends on the platform in use, but we can envision that it is scalable at a reasonable cost (e.g. using Amazon IaaS as the number of people increases). In addition, raw data are available to third parties in real time, as soon as sensors release them.

The second layer of the pyramid includes social media users (crowd as a social computer). People use social media (e.g. Twitter, Facebook, Instagram, etc.) for their own communication purposes and process information with no particular structure, but these data can be reused to extract semantically structured information (Garcia-Santa et al. 2016). This type of interaction does not require implicit domain knowledge and is a low-complexity task that is performed voluntarily or unintentionally (i.e. whenever people just post messages about the disaster in social media channels). Furthermore, in recent years there has been extensive research on different methods to process social media messages to extract actionable real-time information during disasters (e.g. Imran et al. 2015). In some cases, social media users can also actively engage in social games whose purpose goes beyond pure entertainment. These games have been referred to as “serious games” (Zyda 2005) or “games with a purpose” (GWAP) (Von Ahn and Dabbish 2008). Individuals can also engage in activities that help to achieve other goals (e.g. Duolingo invites users to freely learn a language and interact with peers as they progress, but the application is also a crowdsourced text translation platform drawing from the translation effort of advanced learners). The use of serious games for emergency and disaster management is currently being explored to increase awareness, education, or training skills (e.g. Loreto et al. 2012; Meesters and Van de Walle 2014; Meera et al. 2016).

The two top tiers include users with an explicit, conscious use of a priori knowledge to achieve a specific goal. Thus, “crowd as reporters” refers to social media users producing first-hand, real-time information on events as they are unfolding (e.g. they tweet about a hurricane making landfall and report the damage in a specific location). This user-generated content already contains valuable metadata added by users themselves (e.g. hashtags) that can be used as semi-structured, preprocessed data. It is worth mentioning here that the veracity of the data (also known as the fourth “V” in big data) can only be as good as the credibility of the reporters and, therefore, a lack of control in this step could eventually mislead the decisions based on it. Credibility is one of the quality dimensions according to ISO/IEC 25012 (ISO/IEC 25012 2009) and has been included by Merino et al. (2015) as part of the contextual adequacy in their proposed data quality in use model for big data. There are also other studies that have independently tested this quality dimension for uncontrolled data sources such as Twitter (Castillo et al. 2013; Gupta et al. 2014; Lee et al. 2015), showing that credibility can be predicted with high precision, which helps to assess it automatically and solves, at least to some degree, the problem of the veracity of the data. The tasks associated with this role are of medium complexity since reporters must provide quality control based on their experience, the reputation of sources, and verification against other sources of information. As a result, the response time of this type of task increases compared with the previous two layers, which are nearly instantaneous (i.e. we can assume a response time of seconds or minutes in the best cases).

Finally, “crowd as microtaskers” includes people performing micro-tasked activities. Microtasking can be considered a special sub-type of human computation where tasks involving different degrees of complexity are divided into smaller and independent micro-tasks (Luz et al. 2015). Microtaskers can then generate structured, high-quality, and interpreted data by performing some specific tasks over raw data (e.g. labelling images, adding coordinates, tagging reports with categories, etc.). This role requires the active participation of users in the crowdsourcing effort and it may exploit special skills or require different levels of previous training. The response time for this type of activity may vary with the specific task to be performed: in the best cases it will be in the range of seconds or minutes (e.g. analysing a text to retrieve people’s needs), but it could extend from days to weeks (e.g. using satellite imagery to search for Malaysia Airlines flight MH370 and its 239 passengers after it vanished from radar; see Footnote 1). Also, automatic tools and machine learning algorithms (e.g. for categorizing text) can be combined to meet the adequate response time in a disaster management scenario.
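
To make the idea of microtasking concrete, the following minimal sketch splits a batch of reports into small independent micro-tasks and then merges the labels that several volunteers return for the same item. It is an illustration under stated assumptions rather than the method of any platform discussed here: the function names, the batch size, and the use of a simple majority vote to merge answers are all hypothetical choices made for this example.

```python
from collections import Counter
from typing import Dict, Iterable, List, Sequence, Tuple


def split_into_microtasks(items: Sequence[str], size: int) -> List[List[str]]:
    """Divide a list of items into independent batches of at most `size`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


def majority_label(answers: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Merge (item, label) answers from several volunteers by majority vote.

    Assumption: ties are resolved in favour of the label seen first.
    """
    votes: Dict[str, Counter] = {}
    for item, label in answers:
        votes.setdefault(item, Counter())[label] += 1
    return {item: counts.most_common(1)[0][0] for item, counts in votes.items()}


# Hypothetical report identifiers and category labels.
batches = split_into_microtasks(["r1", "r2", "r3", "r4", "r5"], size=2)
merged = majority_label([
    ("r1", "shelter"), ("r1", "water"), ("r1", "shelter"),
    ("r2", "medical"),
])
print(batches)  # [['r1', 'r2'], ['r3', 'r4'], ['r5']]
print(merged)   # {'r1': 'shelter', 'r2': 'medical'}
```

Keeping the batches independent is what allows them to be handed to different volunteers in parallel, which is how microtasking can shorten the response time discussed above.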

4 Participation of the crowd in the disaster management cycle

The UN-SPIDER glossary defines the DMC as “the complete set of phases related to disasters and their management” (UN-SPIDER 2014). While disaster relief agencies and organizations may conceptualize the disaster management phases differently, most models generally include the following ones: (i) mitigation, (ii) preparedness, (iii) response, and (iv) recovery.

According to the standard definitions by the United Nations Office for Disaster Risk Reduction, mitigation refers to “the lessening or limitation of the adverse impacts of hazards and related disasters”; preparedness includes “the knowledge and capacities developed by governments, professional response and recovery organizations, communities and individuals to effectively anticipate, respond to, and recover from, the impacts of likely, imminent or current hazard events or conditions”; response involves “the provision of emergency services and public assistance during or immediately after a disaster in order to save lives, reduce health impacts, ensure public safety and meet the basic subsistence needs of the people affected”; recovery extends to “the restoration, and improvement where appropriate, of facilities, livelihoods and living conditions of disaster-affected communities, including efforts to reduce disaster risk factors” (UNISDR 2009).

The entire DMC has only recently been understood and modelled in its entirety, as an integrated, iterative process within other broader social and economic processes. Policy making, effective allocation of resources, procedural ruling and local decision-making are closely interlinked tasks, but efficient coordination is far from easy and can lead to error-prone situations.

Even if, in practice, disasters tend to unfold in a continuum and the phases of the cycle may sometimes be difficult to isolate, the four crowdsourcing roles we have established in Section 3 can be linked to the different phases of the cycle. This association can be valuable in identifying specific groups for different disaster phases. Thus, the role of the crowd as a sensor is especially relevant in the preparedness and training phases, when sensors can provide critical information on events or sub-events for different geographical locations and at large scale (Boulos et al. 2011; Kjærgaard et al. 2012; Sheick Dawood et al. 2013; Radianti et al. 2013). People may contribute data either inadvertently or by explicit consent: while GPS location services require users’ explicit permission on both Android and iOS systems, other location sensors such as accelerometers and gyroscopes do not (Liu 2013).

The role of the crowd as a “social computer” and as a “reporter” may be critical in the other three steps of the lifecycle (response, recovery, and mitigation) where people and organizations (citizens, volunteer groups, and emergency authorities) can engage in multi-channeled information sharing and provide near-real time updates on the events as they unfold (Chon et al. 2012). Given the ever-growing amount of information that people share during a disaster, leveraging social media information posted on Twitter or Facebook becomes most relevant to facilitate situational awareness during an emergency (Cameron et al. 2012). In this regard, there is a growing literature on methods to mine Twitter data for disaster management (e.g. Li and Rao 2008; Caragea et al. 2011; Chowdhury et al. 2013; Imran et al. 2013; Robinson et al. 2013; Sakaki et al. 2013; Power et al. 2013; Parsons et al. 2015; de Albuquerque et al. 2015; Takahashi et al. 2015).

Yet, there are a number of critical issues when using social media information: trustworthiness of the sources, veracity and accuracy of information, and privacy. Some of these issues are easier to handle when the crowd actively takes the role of a “reporter”. In that case, as people tend to be already identified, assessing the trustworthiness of the source and verifying the incoming information may become less problematic. People who report and use the reported information can even be part of the crowdsourced verification process (e.g. within an Ushahidi deployment). Efficient methods for doing so, which apply a simple approach of rewarding and punishing recruiters, have already been proposed and tested (Tang et al. 2011; Naroditskiy et al. 2012).

The role of the crowd as a “microtasker” is especially relevant when it comes to producing and analysing structured data, both in the preparedness and training phases and later in the response and recovery ones, when a priori knowledge is required to get insights into what is happening and to set a plan to react rapidly. Table 1 summarizes how the different types of crowdsourcing roles described above relate to the different phases of the emergency management cycle:

Table 1 Crowdsourcing roles and DMC
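
The associations stated in the prose of this section can also be expressed as a simple mapping from roles to phases. The sketch below is not a transcription of Table 1: it encodes only what the preceding paragraphs say, restricted to the four DMC phases (so the "training" phase mentioned alongside preparedness is left out), and the names `ROLE_PHASES` and `roles_for` are hypothetical.

```python
from enum import Enum
from typing import Dict, FrozenSet, List


class Phase(Enum):
    MITIGATION = "mitigation"
    PREPAREDNESS = "preparedness"
    RESPONSE = "response"
    RECOVERY = "recovery"


# Assumption: phases taken from the prose of this section only.
ROLE_PHASES: Dict[str, FrozenSet[Phase]] = {
    "sensor": frozenset({Phase.PREPAREDNESS}),
    "social computer": frozenset(
        {Phase.RESPONSE, Phase.RECOVERY, Phase.MITIGATION}
    ),
    "reporter": frozenset(
        {Phase.RESPONSE, Phase.RECOVERY, Phase.MITIGATION}
    ),
    "microtasker": frozenset(
        {Phase.PREPAREDNESS, Phase.RESPONSE, Phase.RECOVERY}
    ),
}


def roles_for(phase: Phase) -> List[str]:
    """List, in alphabetical order, the roles linked to a given phase."""
    return sorted(role for role, phases in ROLE_PHASES.items() if phase in phases)


print(roles_for(Phase.RESPONSE))  # ['microtasker', 'reporter', 'social computer']
```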

4.1 Crowdsourcing tools and disaster management phases

In this section we present the different technologies and platforms that are already available for the disaster management domain. We have classified them by establishing a set of dimensions that best represent their main characteristics in order to obtain a global perspective. Previous research offers a preliminary classification of mobile technologies within the governance domain (Poblet 2011a). In this study, we have added new ones drawn from related sources and initiatives, which led to the elaboration of the final matrix. The four basic criteria for inclusion in the final list have been:

  1. The tool has been designed to be used in one or more phases of the emergency management cycle or, alternatively, it is applicable in this domain.

  2. The tool leverages at least one of the crowdsourcing roles described in Section 3 (crowds as sensors, social computers, reporters, or micro-taskers) as part of the emergency management process.

  3. The tool is currently available to end users.

  4. The tool comes with enough information (e.g. demos, use cases, technical documentation, etc.) to make an accurate assessment of its functionalities.
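
Read as a filter, the four criteria above amount to a conjunction: a candidate enters the final list only if it satisfies all of them. A minimal sketch of that reading follows; the `Candidate` record, its field names, and the two example tools are hypothetical and do not correspond to any tool in the survey.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record of how a tool fares on the four criteria."""
    name: str
    applies_to_dmc: bool       # criterion 1: designed for, or applicable to, the cycle
    uses_crowd_role: bool      # criterion 2: leverages at least one crowdsourcing role
    currently_available: bool  # criterion 3: available to end users
    documented: bool           # criterion 4: enough information for an assessment


def is_included(tool: Candidate) -> bool:
    """A tool is included only when it meets all four criteria."""
    return all((
        tool.applies_to_dmc,
        tool.uses_crowd_role,
        tool.currently_available,
        tool.documented,
    ))


def final_list(candidates: List[Candidate]) -> List[str]:
    """Names of the candidates that pass every criterion."""
    return [c.name for c in candidates if is_included(c)]


# Illustrative candidates only.
candidates = [
    Candidate("ToolA", True, True, True, True),
    Candidate("ToolB", True, True, False, True),  # discontinued, so excluded
]
print(final_list(candidates))  # ['ToolA']
```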

Our analysis includes a total of 38 tools (25 web platforms and 13 mobile apps) addressing different aspects of the DMC. The analysis does not include Mobile Data Collection Systems (MDCS) that are intended to collect specific information from targeted audiences via pre-designed surveys. In this regard, previous research on MDCS has shown that, from an initial list of 36 solutions, there are up to 24 tools currently available for use in humanitarian relief interventions (Jung 2011). While MDCS are highly relevant to our research, the platforms reviewed here have a broader scope and typically include additional functionalities (e.g. data aggregation, data filtering, data clustering, analytics, etc.). In fact, most MDCS could be integrated into DMC platforms as part of the data collection process, treating each mobile data collector as a sensor in the sense of the “people as sensors” crowdsourcing type defined previously (e.g. FrontlineSMS and Ushahidi have already worked together to push incoming SMS to the Ushahidi and Crowdmap platforms). Similarly, our analysis does not consider the 250 emergency-related applications available in Google Play already reviewed in Gómez et al. (2013).

4.2 Taxonomy of crowdsourcing tools

We have classified the different platforms and mobile apps according to the following four major characteristics: (i) the phase of the DMC to which the tool best applies, (ii) the availability of the tool and its source code, (iii) the main core functionalities, and (iv) the crowdsourcing role types:

  • Crisis management lifecycle step: which one of the four phases of the DMC the tool applies to (mitigation, preparedness, response, and recovery).

  • Availability of the tool: how the tool is made available and under which license (open source license, commercial license).

  • Core functionalities: the main functionalities that the tool offers. We have identified the following subclasses:

    • Natural Language Understanding (NLU): the tool provides some functionalities to perform text analysis in order to obtain useful information from natural language sentences (structured or unstructured) or from raw text (e.g. entity recognition).

    • Data collection: the tool enables data collection from any device connected to the platform. It also provides data management functionalities on the data collected. This dimension is closely linked to the roles of the crowd as a sensor and as a social computer.

    • Data filtering: the tool displays different filtering options over the data collected. The filtering can be done by keywords, by location, or by any other predefined filter.

    • Data tagging: the tool provides tagging functionalities to facilitate the categorization of the collected data. This dimension is closely related to the crowdsourcing role of the crowd as a social computer.

    • Mapping and navigation: the tool allows plotting geographic information related to the collected data on a map. It may also allow using these data to navigate the map and to retrieve data based on their geolocation. This dimension is closely related to the crowdsourcing role of the crowd as a sensor.

    • Volunteer management tools: the tool comes with a dedicated module to manage the participation of digital or field volunteers (or both).

  • Crowdsourcing roles: the tool provides a framework for a particular crowdsourcing role, as in:

    • Crowd as a sensor: the tool enables the collection of data from multiple devices, including mobile handsets, and each of these devices provides some local information that can be either automatically generated (run by sensors in the background) or human generated.

    • Crowd as a social computer: the tool provides some applications or human computer interfaces enabling the users to collect data from social media and engage in social conversation if needed.

    • Crowd as a reporter: the tools provides a platform where people can offer first-hand information on events as they are unfolding and allow the identification of a reporter versus an occasional user in order to preserve trustworthiness.

    • Crowd as a microtasker: the tool provides applications or human computer interfaces for the execution of specific processing tasks by users. These tasks differ from the previous ones in that they exploit some specific knowledge and may also require a training phase to accomplish them.
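The data filtering dimension listed above (filtering by keywords, by location, or by another predefined filter) can be illustrated with a minimal sketch. It is not taken from any of the reviewed tools: the `Report` record, the bounding-box representation, and the function names are hypothetical and serve only to make the idea concrete.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class Report:
    """A crowdsourced report; the fields are illustrative assumptions."""
    text: str
    lat: Optional[float] = None
    lon: Optional[float] = None


# Assumed box layout: (min_lat, min_lon, max_lat, max_lon)
BoundingBox = Tuple[float, float, float, float]


def matches_keywords(report: Report, keywords: Iterable[str]) -> bool:
    """True if any keyword occurs in the report text (case-insensitive)."""
    text = report.text.lower()
    return any(keyword.lower() in text for keyword in keywords)


def within_box(report: Report, box: BoundingBox) -> bool:
    """True if the report is geolocated and falls inside the box."""
    if report.lat is None or report.lon is None:
        return False
    min_lat, min_lon, max_lat, max_lon = box
    return min_lat <= report.lat <= max_lat and min_lon <= report.lon <= max_lon


def filter_reports(
    reports: Iterable[Report],
    keywords: Optional[List[str]] = None,
    box: Optional[BoundingBox] = None,
) -> List[Report]:
    """Keep the reports that satisfy every filter that was supplied."""
    selected = []
    for report in reports:
        if keywords and not matches_keywords(report, keywords):
            continue
        if box is not None and not within_box(report, box):
            continue
        selected.append(report)
    return selected
```

In this sketch the filters are combined conjunctively, so a report without coordinates is dropped whenever a location filter is active; a real platform might instead route such reports to volunteers for manual geolocation.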

5 Main findings

Stemming from this approach and taxonomies, there are two main types of end users that can benefit from these tools. The first are emergency responders such as fire services, 911 emergency services, police, and the CERTs (Community Emergency Response Teams) that can leverage real-time analysis and enrichment of the data provided by reporters, social interaction, and sensors. Critical needs for these users are: (i) finding information directly related to the disaster; (ii) detecting precise location; (iii) detecting current needs within the affected area; (iv) categorising data; (v) verifying the information provided. These needs can be met by the tools reviewed in Table 2 (marked within the core functionalities of data filtering, data tagging, mapping, and NLU). Yet, as no tool includes all these functionalities, some of them will need to be added manually. For instance, Garcia-Santa et al. (2016) used Ushahidi but, since it did not support NLU, this functionality was included by adding external tools (i.e. Freeling NLU; see footnote 2) jointly with some machine learning algorithms for categorization of the collected data.

Table 2 Crowdsourcing tools for disaster management

A second type of end user comprises the people involved in a disaster area who rely on this technology to provide real-time feedback and capabilities for verifying it (i.e. people as reporters supervising the quality of the collected data). Some examples of this type of information are “four people injured in collision of two passenger cars in GR3417 #Granada” or “inquiries are continuing after a house fire at Springwood overnight”. The tools shown in Table 2 that are marked within the core functionalities of data collection and volunteer management tools can cover this type of information.

From a technological point of view, there are two main methods to cater for the needs of the two end-user groups above. One relies on web-based online platforms that aggregate information and provide ongoing monitoring of events as they unfold (it can be assumed that this is part of the command-and-control strategy and therefore suits the first group of users); the other relies on mobile apps that provide a straightforward way for users, reporters, etc. to submit information on current happenings within a specific area (end user type 2). Table 2 includes these two different methods in a matrix of tools and functionalities.

5.1 Web-based platforms

The majority of the web platforms included in our list (15 out of 25) are open source, have some open source components, or can be used for free. CrowdCrafting, GeoTag-X and Micromappers are three platforms powered by PyBossa, open source software designed for large microtasking projects. Likewise, the AIDR platform (providing real-time automatic classifications of tweets) has been integrated in CrisisTracker.

The majority of the solutions analysed primarily support response (22 out of 25) and recovery-based efforts (14 out of 22). Generally, the primary focus is on single, event-based, location-specific, and dynamically evolving scenarios that trigger an urgent response and the need for verified facts (Coppola 2011). Nevertheless, most of the platforms could also be applicable in the mitigation and preparedness phases, especially those that have developed dedicated modules (e.g. Sahana contains different modules for organization registry, human resources, inventory, assets, etc. which focus on the mitigation and preparedness phases; OpenIR maps ecological risks revealed by infrared satellite data to identify vulnerable areas and support their emergency management). ArcGIS enables developers and users to build custom applications or create and run projects that could also focus on mitigation and preparedness.

Yet, since social media information can also be leveraged at any stage of the emergency management cycle (e.g. at the preparedness and training phase, by constantly monitoring information to spot and follow emergency situations, or at the response phase, by enabling real-time communication among citizens or between citizens and authorities), it is difficult to constrain potential uses of the platforms that include social media functionalities (e.g. Ushahidi and CrisisTracker) to just one phase. An example of flexibility and interoperability can be found in Garcia-Santa et al. (2016), where a complete system based on the Ushahidi platform is implemented for detecting and enriching needs in real time given a set of Web sources.

As for the core functionalities of the platforms, the most common ones are data collection (20 instances) and data filtering and tagging (16 and 17 instances respectively); up to 14 tools offer mapping functionalities and 12 of them include some module to manage volunteer contributions, mostly through explicitly designed microtasking workflows (CrowdCrafting, GeoTag-X, i-Coast, Micromappers, Tomnod, and Verily). Some common elements of this specific microtasking approach are: (i) size: a large number of small unit tasks which are aggregated to form a large project; (ii) scale: undertaken by a large number of distributed individuals; (iii) temporal/spatial span: short tasks conducted online either individually or collaboratively; (iv) human intelligence involvement: tasks cannot be fully automated and include routine and specialist skills.
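The aggregation of many small unit tasks carried out by distributed volunteers can be sketched as follows. The sketch assumes a simple majority-vote rule with a minimum number of answers per task; this rule, the thresholds, and the function name are illustrative assumptions and do not describe how PyBossa or any of the platforms reviewed here actually consolidate contributions.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Optional, Tuple


def aggregate_labels(
    answers: Iterable[Tuple[str, str]],
    min_votes: int = 3,
    min_agreement: float = 0.6,
) -> Dict[str, Optional[str]]:
    """Consolidate (task_id, label) answers from many volunteers.

    A task receives a final label only when it has at least `min_votes`
    answers and the most frequent label reaches `min_agreement` of them;
    otherwise it maps to None, i.e. it still needs more contributions.
    Both thresholds are hypothetical defaults.
    """
    votes = defaultdict(Counter)
    for task_id, label in answers:
        votes[task_id][label] += 1

    results: Dict[str, Optional[str]] = {}
    for task_id, counter in votes.items():
        total = sum(counter.values())
        label, count = counter.most_common(1)[0]
        if total >= min_votes and count / total >= min_agreement:
            results[task_id] = label
        else:
            results[task_id] = None
    return results
```

Requiring several independent answers per unit task reflects the size and scale elements listed above, while leaving undecided tasks unlabelled keeps human judgement in the loop rather than forcing an automated decision.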

5.2 Mobile apps

The market for disaster management apps has expanded remarkably in the last few years (Poblet 2011b). Updates and alerts on hurricanes are now embedded in several apps that target residents in hurricane-prone areas (Peckham 2012). However, even if these apps provide real-time information and updates georeferenced in storm maps, satellite images, and weather forecasts, the information flow remains one-way, since it is delivered by the US National Hurricane Center or the US National Weather Service. In contrast, the apps listed in Table 2 tap into user-generated content to supply updated information to both response organizations (e.g. UN or FEMA) and citizens.

As is the case with online platforms, the mobile applications reviewed here address the response phase of the disaster cycle (although four of them are also applicable in preparedness and one in recovery). Three of the apps reviewed come with open source licenses and the remaining eight can be used for free.

As for core functionalities, the vast majority of the apps allow data collection (10) and have mapping/navigation functionalities (10), while a few of them provide data filtering (5) and data tagging (5) functionalities. More specifically, Geopictures, UN Assign, FemaApp, and FirsToSee allow users to upload and share geo-tagged pictures, while Pushpin and Vespucci are editing apps intended to facilitate the editing and contribution of new data to OpenStreetMap. OSMTracker allows track logging and quick (voice) waypoint annotations when driving a car or riding a bicycle, and OSMAnd is a map and navigation application with access to OpenStreetMap data that also offers both online and offline routing, with optical and voice guidance, for cars, bikes, and pedestrians. Fulcrum offers a suite of dedicated apps for disaster response (i.e. damage report, disaster shelter assessment, evacuee information, or post storm building damage report). Stormpins turns its users into local reporters by enabling them to share pin alerts with local TV, emergency managers and local communities. EmergencyAU also enables its users to upload pictures, videos, and comments about breaking emergencies. The most recent one, MyShake, uses people as passive sensors by collecting data on earthquake vibrations from the accelerometers embedded in mobile phones.

5.3 Ontologies

Only a small set of the tools surveyed in Table 2 integrate the use of ontologies developed within the domains of situational awareness and crisis management. ArcGIS allows customisation to leverage an ontology; Publicsonar (previously known as Twitcident/CrowdSense) uses the GeniUS framework to exploit Linked Open Data; Swiftriver developed SiLCC, a semantic based relevancy filter/tagger/classifier; Twitris uses the DBPedia ontology (Jadhav et al. 2013). Apisakmontri et al. (2013, 2016) have proposed the Humanitarian Aid for Refugees in Emergencies ontology (HARE) which would be compatible with systems such as Sahana and Ushahidi (Apisakmontri et al. 2016).

A number of papers have reviewed ontologies in the areas of emergency and disaster management. For example, Baumgartner and Retschitzegger (2006) provide a review of situational awareness ontologies, and proposals to formalize the basic components of situation awareness in an ontology can also be found in the literature (e.g. Kokar et al. 2009, Baumgartner et al. 2010). The literature also offers different ontological approaches to disaster management (Klien et al., 2006; Xu and Zlatanova 2007; Ratnam and Karunaratne, 2008; Babitski et al. 2009; Murgante et al., 2009; Babitski et al., 2011; Kalabokidis et al. 2011; Grolinger et al. 2011; Ortmann et al. 2011; Apisakmontri et al. 2013; Mansourizadeh and Gharooni 2013; Li and Li 2014). Liu et al. (2013) offer a comprehensive review of 26 ontologies. The types of crisis information systems reviewed include critical infrastructures, resource management, decision support, response coordination, command and control, and other types such as humanitarian response and relief. While some of these ontologies have been completed and are publicly available, others are the result of academic work and remain non-downloadable. The authors identified eleven subject areas relevant to their interoperability: people, organizations, resources, disasters, geography, processes, infrastructure, damage, topography, hydrology and meteorology. According to the review, 65% of the existing ontologies are semantically interoperable. Only four of them (EM-DAT, UNEP-DTIE, Canadian Disaster Database, Australian Government Attorney-General’s Department Disasters Database) are focused specifically on disaster management, but those are database-oriented and do not provide a formal representation of disasters and their properties. Therefore, the review concludes that this is an emerging research field with room for improvement. Apisakmontri et al. (2016) have also noted that most existing humanitarian aid information systems are “stored in relational databases and are not initially developed for supporting information integration”.

More recently, Imran et al. (2015) have reviewed up to 11 crisis ontologies, including Management of a Crisis (MOAC; see footnote 3) and the newly developed Humanitarian eXchange Language (HXL; see footnote 4). Although not social-media specific, these ontologies “can be combined with ontologies describing social media concepts such as users, tagging, sharing, and linking” (Imran et al. 2015). In a similar vein, Hassan and Chen-Burger (2016) have proposed a Communication Tracking Ontology (CTO) that reuses some of these ontologies to support disaster relief mobile applications.

There have been some other attempts to find a consistent representation for all kinds of emergencies, seeking a general pattern-matching description. In this regard, Othman and Beydoun (2013) have built a Disaster Management Metamodel to serve as a representational layer of DM expertise. Providing such a comprehensive representational layer, “a unified view of common concepts and actions applied in various disasters”, can certainly be taken as a first step and a useful heuristic scheme, but it cannot be confused with a universally shared and accepted DM core ontology. The plurality and heterogeneity of ontologies in this domain reveal both its richness and the interoperability challenges ahead.

6 Conclusion

Current online tools enable online volunteers and organizations to offer a global response by allowing them to participate in a number of tasks: social media monitoring, data collection, data filtering, tagging, geolocation of events, etc. A central component of these tools is the use of crowdsourcing to channel the contribution of large numbers of users.

Our motivation in developing a typology of crowdsourcing roles and reviewing state-of-the-art platforms and applications dealing with disaster and crisis control management was to stimulate new directions of research in the area of crowdsourced social media information applied to crisis events. While there is an emerging body of literature in this direction, comparative research on the current state of the art of tools and their functionalities is still scarce. In addition, we have found little connection between platform development and research in ontologies for disaster management, even if there are some synergies that could be explored further.

In this paper we have focused on the identification of a set of dimensions that we believe characterize the domain well, and we have classified a representative set of tools that are already available. Enriching platforms to structure their content as usable and reusable knowledge is related to contextual, ethical and legal problems that largely exceed the scope of this paper. Likewise, we have not addressed further limitations, such as the potential operational gaps that may emerge when attempting to identify and match needs and offers crowdsourced from social media. Purohit et al. (2013) provide an illuminating example: in the aftermath of Hurricane Sandy (2012) people used social media to both request and offer help (shelter, clothes, volunteer work, blood donations, etc.) that was critical for the response, but also raised significant coordination challenges. Lack of resources to match these crowdsourced requests and offers adequately and rapidly may result in a ‘second disaster’ for response organisations (Purohit et al. 2013). Adequately matching relevant crowd-generated information with the help of computer-assisted techniques (such as natural language processing (NLP) algorithms and machine learning) could be critical to support logistic operations.

We have shown that empowering online volunteers and organizations to offer a global response means including citizens as main players triggering such a response. Fig. 1 shows this through the degree of users’ involvement in the information analysis process. An example of use can be found in Garcia-Santa et al. (2016), where a modification of the Ushahidi platform was used to detect needs in real time using the information provided by end users and volunteers acting as sensors. While none of the reviewed tools includes all the functionalities we have prioritised in this paper, a few of them should be considered as a first option due to their openness, modularity, and versatility. Thus, open-source tools such as Ushahidi, Sahana or SwiftRiver allow the inclusion of third-party modules and self-implemented algorithms that improve their performance. Ultimately, nevertheless, the choice will depend both on the contextual needs of the users and the capabilities at hand.

However, since there are no panaceas against natural disasters (Ostrom et al. 2007; Ostrom 2010), crowdsourcing, citizen participation and digital neighborhood might be considered as complementary mechanisms to give an appropriate and sustainable response. But this does not come without problems. Collier et al. (2009) stressed that in developing societies the practitioners’ point of view might diverge from the experts’ point of view, since “behavioural changes at the micro level in response to increasing income may lead to a nonlinear relationship between aggregating incomes and disaster damages, where risks increase with income before they decrease”. In other words, economic development may raise more obstacles for appropriate responses to natural disasters. Flooding, landslides, windstorms, extreme temperature events and earthquakes can be perceived only as unavoidable and “natural”, obscuring the social and institutional side of lack of prevention. Collier et al. proposed two mechanisms for transformative change to link the fields of climate change adaptation and disaster risk reduction: (1) the use of iterative risk management as a primary instrument for adaptive decision making, and (2) the establishment of ‘boundary organizations’ and institutional changes that increase the transfer of knowledge between not only science and policy, but also science, policy and practice (Collier et al. 2009).

“Boundary organizations” are important, but the other side of this threefold cooperation should be the construction of institutions specifically designed to enhance self-organization and crowdsourced mechanisms.

This faces new regulatory challenges in an emerging field. Privacy, data protection and security matter when we realize that accidents, earthquakes or bushfires hit people in states with a great diversity of legal and political systems. Principles, values and norms to be applied to platforms, and the processing of the information provided bottom-up by volunteers, can be analysed in the light of the relational perspective on law (Casanovas and Poblet 2008) and justice (Casanovas 2009), aiming at fostering, empowering, and protecting citizens’ participation and not only legal compliance. But liability in social media monitoring, tagging and filtering events cannot be ignored either (Poblet et al. 2012). Future research will develop in greater detail the emergence of crowdsourcing typologies and types of regulation as they are currently being enabled by the new generation of mobile technology tools.