1 Introduction

With the rapid urban development and scarcity of natural resources, cities worldwide are facing challenges in maintaining sustainable development and improving the quality of life of their citizens (Angelidou 2015). The concept of smart cities has emerged to address those challenges by focusing on the integration of human, collective and technological capital in urban development to achieve a better quality of life and sustainable economic, social and environmental development (Chourabi et al. 2012). To achieve that goal, smart cities leverage the capabilities of technological infrastructures, populations, and institutions and revolve around four main facets: (1) Sustainability; (2) Quality of life; (3) Urban development; and (4) Intelligence (Dhingra and Chattopadhyay 2016).

To tackle the intelligence aspect of smart cities, one of the key capabilities is responding to man-made and natural disasters in a timely and effective manner to protect the life and property of citizens. This capability requires continuous monitoring of city-related infrastructure and data as intelligence for streamlining the city’s emergency operations. For instance, data and intelligence about a crisis can be used by a city’s emergency response organizations for coordinating crisis response activities and decision making. Depending upon the scale of a disaster, response organizations may include government, public authorities, commercial entities, volunteer organizations, media organizations, and the public. In a crisis, these entities work together as a single virtual entity to save lives, preserve infrastructure and community resources, and reestablish normalcy within the community. For the emergency operation, the response network needs to gather situational information (e.g., the state of the civil, transportation, and information infrastructures), together with information about available resources (e.g., medical facilities, rescue, and law enforcement units). One of the challenges of gathering situational information is the collection of real-time data related to human life and infrastructure damage.

In the past, response organizations used traditional emergency management systems in combination with remote sensing data interpretation techniques to manage emergencies. An example of such a system is the Copernicus Emergency Management Service (CEMS) (Copernicus 2018), managed by the European Commission (EC) and established in 2012 for forecasting floods and fires using optical satellite imagery. TerraSAR-X (Mason et al. 2010) is another example, using synthetic aperture radar satellite-based images to assess the damage after a flood. However, recent studies showed significant operational delays in services using such systems due to delays in collecting and analyzing satellite images. For example, CEMS takes 48–72 h to analyze the imagery after a disaster has occurred (Schnebele and Cervone 2013). Besides speed, another limitation of such systems is information accuracy. Studies showed that the accuracy of an optical satellite-based damage assessment after an earthquake is approximately 65% (Dell'Acqua and Gamba 2012), whereas the accuracy of a synthetic aperture radar satellite-based damage assessment after a flood is around 75% (Mason et al. 2010). Improving the speed and accuracy at which information about a crisis flows through the disaster response network therefore has the potential to revolutionize existing crisis management systems.

Recently, ubiquitous connectivity and the proliferation of social networks have opened up new crisis management opportunities through crowdsourcing. Researchers have started developing intelligent platforms powered by crowdsourcing and emerging technological innovations, such as cloud computing and big data analytics (BDA), to build systems for gaining insight from user-generated content on different social media platforms (Dell'Acqua and Gamba 2012; Huyck and Adams 2002; Schnebele and Cervone 2013; Starbird and Stamberger 2010). Among these applications, mobile crowdsensing for the collection of crisis-related information has attracted much attention in academic and industrial forums. In this approach, the general public collaborates with the Emergency Operation Center (EOC) of a city in gathering situational information related to phenomena of interest (e.g., incident location, disaster-related losses, and needs of victims) and helps in organizing disaster response-related activities. Nowadays, whenever a crisis occurs, national security agencies, civil defense, emergency and humanitarian professionals, media outlets, and the general public join hands and collaborate using modern technologies such as social media.

The potential use of social media caught the crisis management research community’s attention when the Red Cross brought it into focus at its Emergency Social Data Summit held in August 2010 (Cross 2010). Since then, the research community has been exploring how to monitor, analyze, and display data extracted from social media. A recent survey (Yu et al. 2018) on the role of big data in disaster management shows that social media has been the most prominent information source in various disaster management phases, such as long-term risk assessment and reduction, forecasting and prediction, monitoring and detection, early warning, and post-disaster coordination and response. However, extracting meaningful information from such a large, diverse, dynamic, yet potentially useful data source is a big challenge that is only beginning to be addressed by the research community.

One such crowdsourcing tool is Ushahidi (Okolloh 2009). It was initially developed to visualize crowdsourced reports of violence in Kenya after the post-election violence in 2008. Since then, it has expanded and become an excellent example of crowdsourcing for raising awareness of different social issues. The platform collects user-generated data through SMS, email, and Twitter, and then visualizes it using maps and charts.

“MicroMappers” (Meier 2015) is a related example of a crowdsourcing platform; it collects images and texts from social media and classifies them into predefined disaster categories for the effective response of humanitarian organizations. SensePlace2 (MacEachren et al. 2011) was developed to collect crisis-related tweets with place and time information and then visualize this information using maps and time plots.

All of the systems mentioned above are helpful in identifying different disaster types and their locations. However, these systems are limited to a single visualization functionality with a specific information need. Moreover, they use only a single source of data, such as text or images, and collect data from just one or two social media sites. Furthermore, a crisis response process requires an end-to-end Information Technology (IT) solution for organizing crisis response activities, including crisis reporting, damage and need assessment, and measures undertaken to protect life and property immediately before, during, and immediately after disaster impact.

The shortcomings of existing sensor-based disaster management systems and the potential for improvement in social media-based platforms led us to propose a dynamic social media analytics and crowdsourcing framework for assisting and improving a city’s EOC. In this chapter, we investigate a cloud computing-based big data framework that will enable us to utilize heterogeneous data sources and sophisticated machine learning techniques to gather and process information intelligently and provide emergency workers with useful insights for making informed decisions. Such a framework will help a city develop a comprehensive Disaster Risk Management capability covering automatic hazard prediction, early warning, risk assessment, and risk mitigation, including coordination of emergency activities and evacuation.

To the best of our knowledge, the proposed system will be the first next-generation IT solution for social media analytics focusing on disaster informatics. The system will scale to real-time situational awareness of any large-scale event, not just man-made and natural disasters. Additionally, an advantage of the proposed approach over existing systems is the use of multilingual, multimodal data from several social media channels, which will increase the quality and quantity of data, leading to improved analysis results.

2 Social Media Analytics System for Emergency Event Detection and Crisis Management

This chapter aims to investigate a cloud-based integrated solution for disaster and emergency management using social media analytics. The main thrust is augmenting existing sensor-based Disaster Risk Management (DRM) systems with social media capabilities by keeping the public in the loop (human sensors). The solution will allow relevant disaster management authorities to integrate and access data from several internet-based social data sources and apply semantic analysis to generate actions to be executed in response to the presented content. The results will be used by relevant emergency monitoring and disaster management agencies for emergency response, early warning, risk assessment, and risk mitigation, including coordination of emergency activities.

Figure 1 depicts the overall architecture, which comprises components for data acquisition, storage, management and analysis, and graphical user interfaces for the emergency response operators and the crowd. We have identified the following key components in building next-generation IT solutions for disaster response and management centered around disaster events.

Fig. 1: The abstract architecture of the social media-based incident detection and monitoring system

Event Extraction and Interpretation: Automated event detection capabilities are required to extract events from streams of all modalities, i.e., text, audio, and video. This calls for multimodal extraction, fusion, and assimilation technology that enables the extraction and interpretation of disaster-related information from multiple social media channels.

Semantic Analysis: Following preprocessing, the data shall be transformed into a format suitable for subsequent semantic analysis and stored in the event database. Robust technologies are required to extract semantic information from raw data streams and deliver high-quality extracted information to decision-makers in the forms most appropriate for their various tasks. Automated capabilities are required to process the various pieces of information, such as images, audio and video feeds, and text feeds, and also to integrate and fuse information coming from multiple disparate sources. We will develop a disaster ontology to map metadata derived from social data to matching ontology concepts. Ontology creation and alignment are the immediate future work under this research project. At this stage, we assume that the ontology will have classes for the different disaster types and information about relevant relief organizations and their locations.
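Since the ontology itself is future work, the mapping step can only be sketched at this stage. The snippet below illustrates the assumed structure (classes per disaster type, with associated relief organizations and locations) as plain dictionaries; all class names, keywords, and organization entries are hypothetical placeholders, not the project's actual ontology.

```python
# Minimal sketch of the assumed disaster ontology: classes for disaster
# types plus relief organizations and their locations. All entries below
# are illustrative placeholders, not real ontology content.
DISASTER_ONTOLOGY = {
    "flood": {
        "parent": "natural_disaster",
        "relief_orgs": [{"name": "Civil Defense", "location": "Riyadh"}],
    },
    "earthquake": {
        "parent": "natural_disaster",
        "relief_orgs": [{"name": "Red Crescent", "location": "Jeddah"}],
    },
}

# Keywords mapping raw social media metadata onto ontology classes.
KEYWORD_TO_CLASS = {
    "flooded": "flood", "inundated": "flood",
    "quake": "earthquake", "tremor": "earthquake",
}

def map_post_to_concept(post_text):
    """Return the first ontology class whose keyword occurs in the post."""
    for token in post_text.lower().split():
        if token in KEYWORD_TO_CLASS:
            return KEYWORD_TO_CLASS[token]
    return None

def relief_orgs_for(concept):
    """Look up relief organizations registered for an ontology class."""
    entry = DISASTER_ONTOLOGY.get(concept)
    return entry["relief_orgs"] if entry else []

print(map_post_to_concept("Streets are flooded near the market"))  # -> flood
print(relief_orgs_for("flood")[0]["name"])                         # -> Civil Defense
```

A production version would replace the dictionaries with a formally aligned ontology (e.g., in OWL) and semantic matching instead of exact keyword lookup.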

Event Data Management: Technologies are required that support modeling, storing, querying, and indexing situational information. We need data management capabilities for storage and structured querying of the information collected, processed, and integrated during the ingest phase. The information obtained about the unfolding situation and status should be queryable in a structured manner.
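As a concrete illustration of structured querying over situational information, the sketch below stores a few event records in an in-memory SQLite table and aggregates them; the schema and field names are illustrative assumptions, not the system's actual data model.

```python
import sqlite3

# Illustrative event store: schema and field names are assumptions made
# for this sketch, not the chapter's actual data model.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE events (
        id INTEGER PRIMARY KEY,
        disaster_type TEXT,     -- e.g. 'flood', 'earthquake'
        location TEXT,          -- place name extracted from the post
        reported_at TEXT,       -- ISO-8601 timestamp
        affected_people INTEGER
    )
""")
conn.executemany(
    "INSERT INTO events (disaster_type, location, reported_at, affected_people)"
    " VALUES (?, ?, ?, ?)",
    [
        ("flood", "Riyadh", "2020-01-05T10:00:00", 12),
        ("flood", "Jeddah", "2020-01-05T11:30:00", 3),
        ("earthquake", "Jeddah", "2020-01-06T02:15:00", 40),
    ],
)

# Structured query: total affected people per disaster type, newest first.
rows = conn.execute(
    "SELECT disaster_type, SUM(affected_people) FROM events"
    " GROUP BY disaster_type ORDER BY MAX(reported_at) DESC"
).fetchall()
print(rows)  # -> [('earthquake', 40), ('flood', 15)]
```

The same queries scale naturally to a server-grade store; SQLite is used here only to keep the sketch self-contained.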

Monitoring and Visualization: Capabilities are required to effectively monitor and visualize crisis-related social media content for selective dissemination of information within the disaster response network; to visualize incident-related information, such as incident graphs showing the relationships between events and statistics about the different incidents; and to provide query and navigation interfaces that allow users to filter through the details of incidents and events, such as location and damage.

We now describe techniques related to each tier in the proposed architecture to provide an overview of the methods integrated into each component.

2.1 Incident Extraction and Representation

In the context of emergencies, incident extraction refers to the task of discovering a new event by continuously monitoring raw data streams. Automated event detection capabilities are required to extract events from streams of all modalities, i.e., text, audio, and video. Most large social media platforms provide programmatic access to their content through an Application Programming Interface (API). APIs allow data collectors to express an information need, including one or several of the following constraints: (1) a time period; (2) a geographical region, for messages that have GPS coordinates; or (3) a set of keywords that must be present in the messages.
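The three constraint types can be combined into a single filter over collected messages. The sketch below mirrors them with plain Python; the message field names (`created_at`, `coords`, `text`) are assumptions for illustration, not a specific platform's schema.

```python
# Illustrative filter mirroring the constraints a social media API accepts:
# a time period, a bounding box for geotagged messages, and keywords.
# Field names ('created_at', 'coords', 'text') are assumed, not any
# specific platform's schema.
def matches(message, start, end, bbox=None, keywords=None):
    """Return True if a message satisfies all supplied constraints."""
    if not (start <= message["created_at"] <= end):
        return False
    if bbox is not None:
        coords = message.get("coords")  # (lat, lon) or None
        if coords is None:
            return False  # only geotagged messages can match a region
        lat_min, lon_min, lat_max, lon_max = bbox
        lat, lon = coords
        if not (lat_min <= lat <= lat_max and lon_min <= lon <= lon_max):
            return False
    if keywords is not None:
        text = message["text"].lower()
        if not any(kw in text for kw in keywords):
            return False
    return True

msgs = [
    {"created_at": "2020-01-05", "coords": (24.7, 46.7), "text": "Flood in Riyadh"},
    {"created_at": "2020-01-05", "coords": None, "text": "Nice day"},
]
hits = [m for m in msgs if matches(m, "2020-01-01", "2020-01-31",
                                   bbox=(24.0, 46.0, 25.0, 47.0),
                                   keywords=["flood"])]
print(len(hits))  # -> 1
```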

The workflow can be triggered either automatically, upon detecting an event matching an information need, or manually by an operator on deployment. To start the crawler, several parameters need to be specified.

The location-based crawler requires a predefined area of interest and a time window size for all social media networks configured for crawling. The location will be provided as static location coordinates (i.e., longitude and latitude) through the Google API.

The keyword-based crawler starts with specific search terms or uses predefined terms stored in the language database to initiate the search. The crawler then starts retrieving content matching the search terms. Keyword-based filtering can be used to continuously monitor the activity in social media networks concerning natural disasters. The frequency of particular words associated with disasters can be temporally analyzed to detect a new or upcoming disaster.
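The temporal frequency analysis just described can be sketched as a sliding window over per-window keyword counts, flagging a spike when the current count greatly exceeds the recent baseline. The keyword list, window length, and spike factor below are illustrative choices, not the system's tuned parameters.

```python
from collections import deque

# Illustrative disaster keyword set; a real deployment would draw these
# from the language knowledge-base.
DISASTER_KEYWORDS = {"flood", "earthquake", "fire", "trapped", "rescue"}

class SpikeDetector:
    """Flag a window whose keyword count far exceeds the recent baseline."""

    def __init__(self, window=5, factor=3.0):
        self.history = deque(maxlen=window)  # counts from past windows
        self.factor = factor

    def observe(self, posts):
        """Count keyword hits in one time window; return True on a spike."""
        count = sum(
            1 for p in posts
            if DISASTER_KEYWORDS & set(p.lower().split())
        )
        baseline = sum(self.history) / len(self.history) if self.history else 0.0
        spike = len(self.history) > 0 and count > self.factor * max(baseline, 1.0)
        self.history.append(count)
        return spike

detector = SpikeDetector()
quiet = ["nice day", "coffee time"]
burst = ["flood in the old town", "family trapped by flood",
         "rescue boats needed", "flood water rising", "roads closed"]
print(detector.observe(quiet))  # -> False (no history yet)
print(detector.observe(quiet))  # -> False
print(detector.observe(burst))  # -> True
```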

The multi-language component’s goal is to provide the capability of crawling content uploaded in different languages. The system will provide a language translation service using the Google/Microsoft language translation APIs to translate posts into the target language and will store language-specific keywords in the language knowledge-base. Upon setting these parameters, the application starts receiving data from one or more social media sites, such as Twitter, Facebook, YouTube, and news feeds.
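The language knowledge-base can be pictured as per-language keyword sets consulted before (or alongside) machine translation. The sketch below only illustrates the matching step; the word lists are small invented samples, and the actual translation through the Google/Microsoft APIs is not called here.

```python
# Sketch of the language knowledge-base: disaster keywords stored per
# language. The word lists are illustrative samples; translation itself
# would go through external APIs and is deliberately not invoked here.
LANGUAGE_KEYWORDS = {
    "en": {"flood", "earthquake", "fire"},
    "ar": {"فيضان", "زلزال", "حريق"},
    "es": {"inundación", "terremoto", "incendio"},
}

def match_languages(post):
    """Return the languages whose stored keywords appear in the post."""
    tokens = set(post.lower().split())
    return sorted(
        lang for lang, kws in LANGUAGE_KEYWORDS.items() if kws & tokens
    )

print(match_languages("Massive flood downtown"))  # -> ['en']
print(match_languages("terremoto en la costa"))   # -> ['es']
```

Posts matching a non-target language's keywords would then be routed to the translation service before further analysis.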

The content from one or more sources may take various forms, including text (posts, comments on blog posts, news, etc.) and media uploaded with the post (image and video content with rich metadata such as location and text). Following data collection and translation, the crawled data is transformed into a suitable format so that preprocessing techniques can be applied to text, images, and video before semantic analysis. Data preprocessing techniques are described under the relevant section of each modality.

2.2 Text Processor

This research area focuses on developing robust text analytic approaches for the extraction of information from text, starting with raw reports/posts or documents and providing extracted information of high quality and reliability to the end-users. One key aspect of effective disaster response is up-to-date information on the area of interest (Seppanen and Virrantaus 2015). To this end, developing useful information processing tools to identify the disaster type, timing, and locations is critical (Seppanen and Virrantaus 2015).

The use of social media data for disaster management has led to creating a variety of text analytics techniques for information processing (Zhang et al. 2019). There are three main streams of research in the existing literature to leverage social media posts for enhanced situation awareness during disaster response and recovery: (1) text classification, (2) geographical estimation, and (3) information extraction.

2.2.1 Text Classification

The first stream of research focuses on supervised and unsupervised algorithms to categorize extracted posts into one or more disaster-related classes. Under supervised algorithms, multiple machine learning classifiers have been used to classify social media data into disaster-related topics, such as the Naïve Bayesian classifier (Imran et al. 2013), Convolutional Neural Networks (Huang et al. 2019), and Generative Adversarial Networks (Dao et al. 2018). A supervised algorithm can be used when a set of example items in each category is given. However, given the dynamic nature of social media posts, it is impossible to predetermine a labeled training set for each type of disaster.
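To make the supervised stream concrete, the sketch below implements a minimal multinomial Naïve Bayes classifier with Laplace smoothing on a tiny invented training set; it illustrates the technique cited above, not any of the cited systems' trained models.

```python
import math
from collections import Counter, defaultdict

# Minimal multinomial Naive Bayes with Laplace smoothing. The toy
# training set is invented for demonstration only.
class NaiveBayes:
    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            for word in text.lower().split():
                self.word_counts[label][word] += 1
                self.vocab.add(word)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, float("-inf")
        for label in self.class_counts:
            # log prior + sum of smoothed log likelihoods
            score = math.log(self.class_counts[label] / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in text.lower().split():
                score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

train_texts = [
    "flood water rising in the streets",
    "earthquake damaged several buildings",
    "great concert last night",
    "new cafe opened downtown",
]
train_labels = ["disaster", "disaster", "other", "other"]
clf = NaiveBayes().fit(train_texts, train_labels)
print(clf.predict("flood damaged the buildings"))  # -> disaster
print(clf.predict("concert downtown tonight"))     # -> other
```

Real deployments would add tokenization, stop-word handling, and far larger labeled corpora; the limitation noted above (no predetermined labels per disaster type) is precisely what motivates the unsupervised stream that follows.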

Since supervised approaches to text classification require human-annotated labels, the use of unsupervised approaches has been increasing over the last decade or so. Unsupervised methods seek to identify and explain important hidden patterns in unlabeled data. Well-known unsupervised learning methods used in text analytics are topic modeling and clustering. Topic models reveal latent topics by identifying word occurrence patterns in a document. Many studies use topic modeling to extract topics corresponding to disaster events. For example, Ragini et al. (2018) used a set of keywords that refer to danger, such as “trapped, stranded, help, save, rescue, struck, caught”, to extract relevant information from disaster-related tweets. Mishler et al. (2015) used the Structural Topic Model (STM), a variant of LDA, to capture temporal changes in the Ukrainian crisis. Kireyev et al. (2009) used basic LDA on Twitter data from two earthquakes, in America and Indonesia, to uncover the most prominent topics during these two natural disasters. They emphasized dynamic corpus refinement to overcome the content sparsity problems of short text by training the topic model on large external corpora to enrich the short text representation.

These techniques have proven useful in detecting disaster events from posts shared on social media. However, in the context of effective emergency response, the extracted information should fulfill the needs of a disaster response network, targeting specific information needs. Many response organizations have predetermined information needs. For example, in a building collapse, a field worker might require detailed information about the event, while a rescue team may only need to know the location and number of injured people in the vicinity of the catastrophe. By considering the information need of a specific organization, in this project we focus on employing state-of-the-art AI techniques to analyze and extract useful information for humanitarian decision-makers and responders.

2.2.2 Location Estimation

Another stream of studies has focused on assessing disaster damage across different locations using disaster-related posts. By doing so, the unfolding of events can be mapped to particular locations. The majority of existing systems retrieve disaster-related tweets across different geographical locations to identify spatial patterns in a crisis region. Most current systems primarily employ geotagging to automatically extract geocoordinates (latitude/longitude) from posted content metadata. For example, SensePlace2 (MacEachren et al. 2011) extracted geotags using explicit geolocation metadata in the tweet, the location of the user’s posting, and place-based hashtags.

Similarly, Twitcident (Abel et al. 2012) used a geographic bounding box (longitude/latitude of a specific location) to gather geotagged tweets only and employed classification algorithms to extract messages about small-scale crisis response, such as music festivals and factory fires. Tweak the Tweet (Starbird and Stamberger 2010) used tweet syntax to encourage content originators to mark up location-based keywords, which assists in filtering and classifying emergency-related information. Using the relevant hashtags, people can craft more relevant geotagged content to capture the situation in disaster-affected areas. However, the authors concluded that content originators had not widely adopted the syntax. TweetTracker (Kumar et al. 2011) parses a Twitter feed to extract location-specific keywords and place-based hashtags to monitor and analyze crisis regions and improve situational awareness. Although many disaster management systems support geotagging capabilities, the availability of explicit location metadata in social media posts is limited. Reporters and individuals publishing situational information tend not to attach geotags to their posts due to privacy concerns. In particular, only 1% of posts include machine-readable location metadata (Malik et al. 2015).

To further enhance location insights from social media posts, many studies used the concept of volunteered geographic information (VGI), which utilizes the potential of crowdsourcing to generate useful geographic information by tracking and grouping geotagged social media posts on a location map. For example, MicroMappers (Meier 2015) employed volunteers to manually geotag messages into different humanitarian categories across different locations. The tool then constructs a location map of these categories that can be sent to response agencies, providing the up-to-date status of affected areas. Ushahidi (Okolloh 2009) used the crowd to manually tag posts from a wide range of sources and geo-visualized the crowdsourced reports to increase situational awareness of the affected region. CrisisTracker (Rogstadius et al. 2013) integrated crowdsourcing techniques to annotate rapid streams of unstructured tweets with metadata and cluster related stories based on their lexical similarity. It then utilized volunteers to verify and further curate stories by adding additional metadata, removing duplicate stories, and/or merging similar stories. Zook et al. (2010) provided an overview of VGI and social media applied to crisis management with a focus on Haiti.
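Clustering posts by lexical similarity, as CrisisTracker does for story grouping, can be sketched with a simple single-pass scheme: a post joins the first cluster whose representative word set it resembles under Jaccard similarity, otherwise it starts a new cluster. The 0.3 threshold and single-pass strategy are illustrative simplifications, not CrisisTracker's actual algorithm.

```python
# Single-pass lexical clustering sketch: Jaccard similarity over word
# sets, with an illustrative 0.3 threshold.
def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 for two empty sets)."""
    return len(a & b) / len(a | b) if a | b else 0.0

def cluster_posts(posts, threshold=0.3):
    clusters = []  # each cluster: {"rep": word set, "posts": [...]}
    for post in posts:
        words = set(post.lower().split())
        for cluster in clusters:
            if jaccard(words, cluster["rep"]) >= threshold:
                cluster["posts"].append(post)
                break
        else:
            clusters.append({"rep": words, "posts": [post]})
    return clusters

posts = [
    "bridge collapsed near the river",
    "the bridge near the river collapsed",
    "power outage across the city",
]
clusters = cluster_posts(posts)
print(len(clusters))  # -> 2
```

Volunteers could then curate each cluster (merge, split, annotate) as described above.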

Despite the popularity of crowdsourcing techniques for curating location information for further interpretation, the effectiveness of such systems is highly dependent on the motivation and size of the crowd. Furthermore, none of the existing crowdsourcing frameworks utilize inference techniques to automatically discover locations from huge amounts of data and disseminate results to relevant authorities. Understanding the social media attention paid to a crisis-affected region requires detecting the geographical information of the locations in the posts. Many social media platforms, such as Twitter, allow users to share geographic information in the form of physical addresses specified in user profiles; however, users tend not to share their locations due to privacy concerns. Therefore, only a small fraction (less than 1%) of social media posts have location information (Malik et al. 2015). However, some studies (Fan et al. 2020; Liu et al. 2013; MacEachren et al. 2011; Middleton et al. 2013) have discovered that locations mentioned in the posted content can provide an additional opportunity to complement the limited geotagged posts. Considering the above problems, in this work we selected Twitter as a case study to infer tweet geolocation from the geo-coordinates, the user location, the tweet place field, and location mentions in the tweet content, as shown in Fig. 2.

Fig. 2: Location extraction from Twitter posts
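The four geolocation sources can be combined with a simple precedence: exact geo-coordinates first, then the tweet place field, then the user profile location, and finally a location mention recognized in the text. The sketch below illustrates that fallback order; the flat tweet dictionary is a simplification of Twitter's actual object schema, and the gazetteer lookup stands in for full named-entity recognition.

```python
# Fallback resolver over the four tweet geolocation sources. The tweet
# dictionary layout is an illustrative simplification of Twitter's schema.
def infer_location(tweet, gazetteer):
    # 1. Exact GPS coordinates attached to the tweet.
    if tweet.get("coordinates"):
        return ("geocode", tweet["coordinates"])
    # 2. The place field (a named area chosen by the user).
    if tweet.get("place"):
        return ("place", tweet["place"])
    # 3. Free-text location from the user profile.
    if tweet.get("user_location"):
        return ("profile", tweet["user_location"])
    # 4. Location mention recognized in the tweet text via a gazetteer.
    for token in tweet.get("text", "").split():
        name = token.strip(".,!?").title()
        if name in gazetteer:
            return ("content", name)
    return (None, None)

gazetteer = {"Riyadh", "Jeddah", "Dammam"}
tweet = {"text": "Heavy rain and flooding in riyadh today", "user_location": ""}
print(infer_location(tweet, gazetteer))  # -> ('content', 'Riyadh')
```

This ordering prefers the most precise evidence available, degrading gracefully toward the noisier content-based inference.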

Twitter provides an option for users on mobile devices to attach their exact geolocation (geocode) to a tweet as GPS coordinates, i.e., latitude and longitude. These coordinates can be directly mapped to a valid spatial representation using geocoding services. Several commercial geocoding services (e.g., the Google Geocoding API, OpenStreetMap Nominatim, and the Bing Maps API), built upon an underlying map database, can take well-formatted location descriptions and return map references for them.

The tweet content represents the actual tweet in text format. To extract location mentions in the tweet text, we tried several publicly available Named-Entity Recognition systems, including NLTK, a recent version of Stanford NER (Finkel et al. 2005), and the system by Ritter et al. (2011). However, the user location, place full name, and location mentions in tweets are unstructured content, which needs location parsing and location disambiguation prior to geocoding. We used a geolocation algorithm (GeoLocator v1.0), which contains both a geoparser that extracts locations and a geocoder that assigns a latitude and longitude to each location.

However, due to either unsatisfactory results (e.g., most of them need proper capitalization to detect location names) or not enough computational speed to deal with a large dataset, we could not directly use these tools. Instead, we have adopted a gazetteer-based approach and used Nominatim, a search engine for OpenStreetMap (OSM) data. Specifically, we used the Nominatim tool to perform two operations: (1) geocoding—the process of obtaining geo-coordinates from a location name, and (2) reverse geocoding—getting a location name from geo-coordinates. Furthermore, we used WordNet to identify location variants to map each location into its standard format.
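The two Nominatim operations just described can be illustrated offline with a tiny in-memory gazetteer standing in for the OpenStreetMap data: geocoding becomes a name lookup, and reverse geocoding becomes a nearest-entry search under great-circle distance. The three entries and their coordinates are approximate city centers, included only to make the sketch runnable without network access.

```python
import math

# Offline stand-in for the two Nominatim operations: geocoding
# (name -> coordinates) and reverse geocoding (coordinates -> nearest
# known name). The tiny gazetteer replaces OpenStreetMap data.
GAZETTEER = {
    "Riyadh": (24.7136, 46.6753),
    "Jeddah": (21.4858, 39.1925),
    "Dammam": (26.4207, 50.0888),
}

def geocode(name):
    """Location name -> (lat, lon), or None if unknown."""
    return GAZETTEER.get(name.title())

def haversine_km(p, q):
    """Great-circle distance between two (lat, lon) points in km."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def reverse_geocode(lat, lon):
    """(lat, lon) -> name of the nearest gazetteer entry."""
    return min(GAZETTEER, key=lambda n: haversine_km((lat, lon), GAZETTEER[n]))

print(geocode("riyadh"))            # -> (24.7136, 46.6753)
print(reverse_geocode(21.5, 39.2))  # -> Jeddah
```

The `title()` normalization plays the role that WordNet-based variant mapping plays in the actual pipeline, in a drastically simplified form.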

By using the Google Maps Geocoding API, the coordinates of the recognized location entities can be saved. Accordingly, we can use the latitudes and longitudes for the identified locations to locate a post on a geographic map to estimate the density of posts across various locations for different categories of events.

2.2.3 Information Extraction

The task of Information Extraction (IE) involves automatically extracting structured information from unstructured (e.g., plain text documents) or semi-structured (e.g., web pages) sources. The most common IE task is Named Entity Recognition (NER), which consists of detecting regions of a text referring to people, organizations, or locations (Liu et al. 2013; Ritter et al. 2011).

In the context of disaster-related posts, IE can be used to create incident reports by extracting facts from social media posts written in natural language. For example, “10 injured and 4 dead in a traffic accident in Riyadh” can be normalized into machine-readable records such as {<affected-people = 10, report-type = injury, location = Riyadh, Saudi Arabia>} and {<affected-people = 4, report-type = casualty, location = Riyadh, Saudi Arabia>}. These normalized records can be easily filtered, sorted, or aggregated.
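A deliberately narrow sketch of this normalization step is shown below: a single regular expression covering only the "<N> injured and <M> dead in <event> in <location>" phrasing of the example above. A real IE system would use NER and slot filling rather than one hand-written pattern.

```python
import re

# Regex-based normalization sketch; covers only one report phrasing and
# is not a general information-extraction system.
PATTERN = re.compile(
    r"(?:(?P<injured>\d+)\s+injured)?(?:\s+and\s+)?"
    r"(?:(?P<dead>\d+)\s+dead)?\s+in\s+.*\s+in\s+(?P<location>.+)"
)

def normalize(report):
    """Turn a matching report sentence into machine-readable records."""
    m = PATTERN.search(report)
    if not m:
        return []
    records = []
    if m.group("injured"):
        records.append({"affected-people": int(m.group("injured")),
                        "report-type": "injury",
                        "location": m.group("location")})
    if m.group("dead"):
        records.append({"affected-people": int(m.group("dead")),
                        "report-type": "casualty",
                        "location": m.group("location")})
    return records

print(normalize("10 injured and 4 dead in a traffic accident in Riyadh"))
```

The resulting dictionaries can then be filtered, sorted, or aggregated exactly as described for the normalized records above.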

Such systems can be built by specifying a ‘schema’ for the facts to be extracted along with semantic information and properties about the different attributes of events. The user may also provide some logical deductive rules that state what event slot should be filled with entities. Many ‘off-the-shelf’ language processing tools are available to develop powerful natural language processing applications, for example, text analytics tools such as GATE (Cunningham 2002), semantic parsers such as Shalmaneser (Erk and Pado 2006) and the Stanford NER Parser (Klein and Manning 2002).

We are developing a text processing platform with automated capabilities that extract information by combining the above tasks in a single event-centered platform. The system involves several phases: first, classifying which posts could contain a disaster-related event, utilizing supervised and unsupervised methods; then grouping relevant posts using clustering approaches; and finally extracting entities and filling slot values utilizing named entity linking semantic technology.

2.3 Image Processor

Images are an integral part of social media posts and a rich source of information about events. The visual content of social media posts can be utilized in crises in many ways, such as to collect contextual information (e.g., the current status of transportation or infrastructure), information about available resources (e.g., food supply, medical aid, water shortage), information regarding damage (e.g., damage severity levels), and awareness status (e.g., warnings). Such information can be collected by looking at the visual content to identify the type of information an image carries. However, extracting comprehensive information from the massive volume of social media data is challenging, as it is not possible to perform this task manually. Therefore, an approach is required that automates the disaster-related information filtering process.

We developed machine learning and related image recognition algorithms for image-based event detection for enhanced situational awareness. There are two primary tasks in extracting semantic information from images, as shown in Fig. 3: (1) image acquisition and labeling, and (2) the image processing pipeline. In the following sections, we describe the methods integrated into each component.

Fig. 3: Image processing model for disaster type classification and disaster-related object detection for emergency response

2.3.1 Image Acquisition and Labeling

Various types of images related to different hazard types, such as collapsed buildings and infrastructure, floods, and hurricanes, can be collected through web search, social media platforms (Flickr, Twitter, Instagram, etc.), and publicly available image catalogs. Toward developing a successful image processing model, it is essential to annotate the image dataset correctly with precise labels to identify different types of crisis-related information. Although a few studies on information extraction from disaster-related images exist in the literature, they each focus on a specific information type, such as damage levels (‘severe damage’, ‘damage’, or ‘no damage’) (Alam et al. 2018a; Barz et al. 2019; Murthy et al. 2016; Olteanu et al. 2015), first-aid activities such as rescue, volunteering, or donation (Alam et al. 2018a; Olteanu et al. 2015), the need for supplies such as food and basic needs (Murthy et al. 2016; Olteanu et al. 2015), people affected by the crisis (Alam et al. 2018a; Murthy et al. 2016; Olteanu et al. 2015), warning information (Olteanu et al. 2015), and information regarding social activities during the crisis (Gaur et al. 2019). There is tremendous potential for further improvement in the delivery of crisis-related information to relevant response authorities through the development of a single platform addressing different information needs. Therefore, we have developed a comprehensive disaster taxonomy targeting the information needs of specific responders. The taxonomy consists of six disaster categories: (1) damage-related information; (2) rescue, volunteering, and donation; (3) food and basic needs supplies; (4) affected individuals; (5) caution; and (6) effects on social activities.

To map disaster-related images onto the proposed categories in the taxonomy, image annotation is needed. Some existing open-source datasets consist of annotated disaster-related images, such as CrisisMMD (Alam et al. 2018a), the European Flood 2013 Dataset (Castillo 2016), and the Fire Dataset (Saied 2020). These datasets are useful for classifying images into various disaster types, such as earthquake, flood, hurricane, and fire. However, organizations working on disaster management activities require specific information from an image for an appropriate response. To further facilitate disaster operations, we can annotate disaster-related objects in a given image using the 2D bounding box technique (Dwibedi et al. 2016). Several platforms are available for image annotation, such as CrowdFlower, Amazon Mechanical Turk, and AWS Rekognition. These platforms allow users to annotate a specific object by drawing a bounding box around objects within images using a human workforce. To develop the dataset, we first downloaded images from publicly available resources (Flickr, Google, Twitter) using specific disaster-related keywords. We then performed basic preprocessing, such as resizing and removing duplicates. In the next step, we annotated images by drawing bounding boxes around disaster-related objects using the open-source tool LabelImg and labeled them according to the proposed labelling scheme/taxonomy.
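A bounding-box annotation is ultimately a labeled rectangle, and annotated boxes are typically compared with a detector's predictions using the standard intersection-over-union (IoU) measure. The sketch below shows both; the annotation record's field names are an illustrative approximation of what a tool like LabelImg exports, not its exact format.

```python
# 2D bounding boxes as (x_min, y_min, x_max, y_max) in pixels, and the
# standard intersection-over-union (IoU) measure between two boxes.
def iou(box_a, box_b):
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap rectangle (empty if the boxes do not intersect).
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    return inter / (area_a + area_b - inter) if inter else 0.0

# One annotation roughly as an annotation tool might export it
# (field names are illustrative).
annotation = {"image": "flood_001.jpg",
              "label": "damage related information",
              "box": (0, 0, 10, 10)}
predicted_box = (5, 5, 15, 15)
print(round(iou(annotation["box"], predicted_box), 3))  # -> 0.143
```

In evaluation, a prediction is usually counted as correct when its IoU with an annotated box exceeds a threshold such as 0.5.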

2.3.2 Image Processing Pipeline

We developed an image processing pipeline comprising image collector, image pre-processor, image classifier, object detector, and visualization modules for a detailed analysis of imagery content.

First, images are downloaded from Twitter and Flickr, and various preprocessing techniques are applied to the raw images. In the case of Twitter, only tweets containing media URLs are selected, and images are extracted using the Media class of the Twitter API with the id and media_url attributes. Data collection from the Flickr platform is done using the Flickr search API. In the first preprocessing step, duplicates are detected and removed from the dataset. The remaining images are used for further interpretation using machine learning and related image recognition algorithms. Besides the photos themselves, the spatial location of images is an essential component for mapping the results. The location extraction task is challenging because few posts carry explicit location information; therefore, the metadata of the social media posts is used for geo-localization with the location estimation techniques discussed under the text processor module. In previous studies, location has been estimated by selecting tweets with media and performing fine-grained localization (Francalanci et al. 2017). The geo-localized images extracted from social media data support emergency response and are used in the image processing pipeline to analyze area-specific images and their usefulness (Ionescu et al. 2014; Murthy et al. 2016; Peters and de Albuquerque 2015). Keyword filtering is another way of geo-localization, in which the Euclidean distance between the users and the hazard location is calculated (Peters and de Albuquerque 2015).
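For illustration, the media-URL selection step might look like the following sketch, which operates on tweets already fetched as JSON dictionaries. The field layout follows the v1.1-style entities/media/media_url structure referred to above; the helper name is our own, and other API versions use different field names.

```python
def extract_media_urls(tweets):
    """Select only tweets that contain media and pull out image URLs.

    `tweets` is a list of tweet objects as JSON dictionaries. Photos
    appear under entities -> media with a `media_url` field; each hit
    is returned as an (id, media_url) pair for later downloading.
    """
    urls = []
    for tweet in tweets:
        media = tweet.get("entities", {}).get("media", [])
        for item in media:
            if item.get("type") == "photo" and "media_url" in item:
                urls.append((tweet.get("id"), item["media_url"]))
    return urls
```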

After location extraction and duplicate removal, image classification techniques are applied to classify the images into predefined categories. A range of classification tasks has been discussed in the previous literature, from binary classifiers (e.g., disaster vs. non-disaster images) to multi-class classifiers over disaster types such as earthquake, flood, and hurricane, humanitarian categories (Nguyen et al. 2017b), situational awareness (Peters and de Albuquerque 2015), damage severity (Alam et al. 2018a; Nguyen et al. 2017a), flooded versus dry images, and water pollution (Barz et al. 2019). For the classification of images into particular damage types, deep neural networks, specifically convolutional neural networks, can be applied to obtain accurate results.

For the image classification task, we used the Google TensorFlow library to build a machine learning pipeline. It allows transfer learning, meaning that it is not necessary to develop a model from scratch; instead, one can build on a previously trained model and develop only a new final layer. Various transfer learning architectures, such as AlexNet (Iandola et al. 2016), GoogleNet (Szegedy et al. 2015), VGGNet (Simonyan and Zisserman 2014), and ResNet (with 50, 101, and 152 layers) (Targ et al. 2016), were explored to select the best classification model. Among these models, the fine-tuned VGG-16 network achieved the best performance, with 95% accuracy. However, the performance of disaster classification networks depends on the structure of the neural network, the image quality and size of the dataset, the type of disaster images, and the quality of image annotation. These results are consistent with previous research in which VGGNet achieved the best performance (Alam et al. 2018b); however, such classification results are only available for limited disaster scenarios and datasets. Therefore, the challenge is to identify the deep neural network that attains the best performance in classifying as many types of disasters as possible.
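The transfer-learning idea of training only a new final layer can be illustrated, independently of TensorFlow, with a minimal logistic output layer fitted on feature vectors assumed to come from a frozen convolutional backbone. This is a didactic sketch of the principle under those assumptions, not our actual pipeline.

```python
import math

def train_final_layer(features, labels, lr=0.5, epochs=200):
    """Train only a new final (logistic) layer on frozen backbone features.

    Mimics the transfer-learning setup described above: the convolutional
    base is treated as a fixed feature extractor, and only this last
    layer's weights are learned. Binary case (disaster vs. non-disaster).
    """
    dim = len(features[0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))     # sigmoid activation
            g = p - y                          # gradient of the log-loss
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def predict(w, b, x):
    """Label a new feature vector with the trained final layer."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if z > 0 else 0
```

In the real pipeline, the feature vectors would be activations from the penultimate layer of a pre-trained network such as VGG-16, and the final layer would have one output per taxonomy category rather than two.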

Disaster-related social media images can be further explored to discover additional details within an image, such as resources, rescue needs and services, and warning signs, which requires object detection techniques. Previous studies on detecting disaster-related objects in images are very limited. One of the main focuses of our research is the detection of disaster-related objects in images, which will allow images posted on social media to be used effectively in disaster emergency response. In the proposed pipeline, after an image is classified into the relevant category, object detection techniques are applied. We used the YOLOv4 (Bochkovskiy et al. 2020) object detection model to recognize disaster-related objects. The identified objects are then fused with the disaster class identified in the previous step to derive the meaning of the image, and the result is sent to the appropriate emergency response workers. This helps filter a huge number of social media images efficiently and transfer them to the specific authority responsible for disaster management activities. In previous research, object detection has been applied to find common objects in images during emergencies; for this purpose, Faster R-CNN (Ren et al. 2016) trained on the COCO dataset (Lin et al. 2014) has been applied (Chino et al. 2015). However, there are limited contributions in the area of detecting disaster-related objects in social media images. Many object detection algorithms have been discussed in the literature, based either on region proposal methods, such as R-CNN (Girshick 2015), SPP-net (Purkait et al. 2017), Fast R-CNN (Girshick 2015), Faster R-CNN (Ren et al. 2016), R-FCN, FPN, and Mask R-CNN (He et al. 2017), or on regression/classification methods, such as MultiBox (Erhan et al. 2014), AttentionNet (Yoo et al. 2015), G-CNN (Najibi et al. 2016), YOLO (Redmon et al. 2016), SSD, YOLOv2, YOLOv3 (Farhadi and Redmon 2018), YOLOv4 (Bochkovskiy et al. 2020), DSSD, and DSOD (Zhao et al. 2019). We investigated one of the best performing object detection algorithms, YOLO. Further research is needed to identify efficient object detection methods for recognizing disaster-related objects so as to understand the context and the details of the situation.
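The fusion step described above, combining the image-level class with the detected objects into a single report, can be sketched as follows. The function name, confidence threshold, and report layout are illustrative assumptions rather than the deployed implementation; routing the report to a specific authority is left to the downstream dissemination component.

```python
def fuse_detections(image_class, detections, conf_threshold=0.5):
    """Combine the image-level class with object detections into a report.

    `detections` is a list of (label, confidence, bbox) tuples as a
    YOLO-style detector might return; low-confidence boxes are filtered
    out before the object labels are merged with the disaster category.
    """
    kept = [(label, conf, box) for label, conf, box in detections
            if conf >= conf_threshold]
    return {
        "category": image_class,
        "objects": sorted({label for label, _, _ in kept}),
        "detections": kept,
    }
```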

2.4 Audio-Video Processor

Information shared on social media is often not unique but available in multiple versions. As part of the proposed model, we are developing approaches for extracting information from a single modality and extracting and fusing information from multimodal data. The research focuses on developing approaches for the interpretation and analysis of audio and video data.

Video-based event detection can be done using computer vision technology. In the context of emergency event detection, we aim to identify object and action concepts. Examples of object concepts for crises are fire, car accidents, building collapse, etc. Object concepts can be detected by evaluating different object detectors on each video frame and aggregating the results temporally over the video. State-of-the-art object detectors based on deep neural network architectures can provide thousands of object detectors pre-trained on large image datasets such as ImageNet (Krizhevsky et al. 2012). Examples of action concepts include fighting, running, crowd pushing, etc. Detection of action concepts can be achieved by extracting low-level dynamic features, including dense trajectories (Wang and Schmid 2013) and STIP (Laptev et al. 2008), as well as static features (Dalal and Triggs 2005). In the proposed architecture, the selected videos will be decomposed into a large number of still frames, and the image analysis techniques explained in the previous section will then be applied to classify these frames.
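The temporal aggregation of per-frame detections mentioned above can be sketched as a simple frame-count vote. The threshold and data layout below are illustrative assumptions; more elaborate schemes would weight detections by confidence or require temporally contiguous hits.

```python
def aggregate_video_detections(frame_detections, min_frames=3):
    """Aggregate per-frame object detections over a video.

    `frame_detections` maps frame index -> set of object labels detected
    in that frame. A concept is reported for the whole video only if it
    appears in at least `min_frames` frames, which suppresses one-frame
    false positives from the per-frame detectors.
    """
    counts = {}
    for labels in frame_detections.values():
        for label in labels:
            counts[label] = counts.get(label, 0) + 1
    return sorted(label for label, n in counts.items() if n >= min_frames)
```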

Audio-based event detection requires automated capabilities for feature extraction from low-level signals and robust speech recognition technologies. In the context of emergency event detection, audio-based concepts include the sounds of gunshots, explosions, etc. Audio concepts can be detected by learning classifiers on MFCC (Mel-Frequency Cepstral Coefficient) features from the audio track, in a manner similar to Logan (2000). Automatic speech recognition provides a stream of words (text) corresponding to the speech in an audio or video recording. We are developing an end-to-end speech recognition engine that transcribes words as they are spoken, and the respective text analytics techniques (explained in the previous section) will be applied to identify disaster-related cues.
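As a hedged illustration of classifying audio concepts from MFCC-style features, a nearest-centroid rule can stand in for a trained classifier. The labels, vectors, and function name are hypothetical; real MFCCs would be computed from the audio track with a signal-processing library, and a learned model would replace the centroid lookup.

```python
def classify_audio_event(mfcc, centroids):
    """Assign an audio clip to the nearest event class.

    `mfcc` is a feature vector summarizing the clip (e.g., mean MFCCs);
    `centroids` maps event labels such as "gunshot" or "explosion" to
    reference vectors learned offline from labeled clips.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(mfcc, centroids[label]))
```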

2.5 Ontology for Disaster Management

Disaster response and planning usually involve various organizations addressing different aspects of a problem in a coordinated fashion. Law enforcement authorities and humanitarian agencies often coordinate their operations through manual interactions to save lives and minimize property and infrastructure damage. Recently, social media have been flooded with a huge amount of user-generated information related to natural disasters and national emergencies, at a scale that is beyond the capacity of manual human processing. Automation is needed both to identify misinformation and to integrate the fragments of correct information from various sources so that the relevant stakeholders can act on such information accordingly.

One of the major challenges in crisis management is developing solutions that can seamlessly bring together relevant information from all types of sources (text, images, audio-video, and geo-spatial) to support crisis-related activities amongst the relevant stakeholders (from the field to the situation room, between situation rooms, and amongst workers in the field) during crises. In addressing such challenges, an ontology-based knowledge representation that validates and integrates social media content is essential to facilitate the seamless flow of information among the relevant agencies and stakeholders working together in emergency and crisis management.

In an ontology-based approach to disaster management, we first need a unified representation of standard vocabularies that describes the concepts as well as the attributes, constraints, and relationships among disaster-related concepts. Such an explicit specification of the conceptualization of the disaster domain is known as the Disaster Ontology. We also utilized various natural language processing tools to semi-automatically maintain the ontology using relevant knowledge extracted from plain text (e.g., social media postings and news reports) (Khatoon et al. 2020). Tools are also being developed to exchange or integrate information between our ontology and the heterogeneous databases located at the sites of law-enforcement, emergency response, and humanitarian agencies.
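A tiny fragment of what such an ontology encodes, a class hierarchy plus an inherited property, can be sketched in plain Python. The classes, agencies, and inference rules below are illustrative assumptions; in practice a description-logic reasoner operates on the OWL ontology itself.

```python
# Hypothetical fragment of the disaster ontology: a subclass hierarchy
# and a property linking disaster types to responsible agencies.
SUBCLASS_OF = {
    "Flood": "NaturalDisaster",
    "Earthquake": "NaturalDisaster",
    "ViralPandemic": "HealthEmergency",
    "NaturalDisaster": "Disaster",
    "HealthEmergency": "Disaster",
}
RESPONSIBLE_AGENCY = {
    "HealthEmergency": "PublicHealthAuthority",
    "NaturalDisaster": "CivilDefence",
}

def is_a(cls, ancestor):
    """Transitive subclass check, the kind of inference a reasoner
    performs over the ontology's class hierarchy."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False

def agency_for(cls):
    """Inherit the responsible agency from the nearest ancestor class."""
    while cls is not None:
        if cls in RESPONSIBLE_AGENCY:
            return RESPONSIBLE_AGENCY[cls]
        cls = SUBCLASS_OF.get(cls)
    return None
```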

We integrated our disaster ontology with a cloud-enabled crowdsourcing platform, Ushahidi.Footnote 9 The integrated system helps users create a crisis reporting interface augmented with domain knowledge. The disaster ontology also helps identify the relevant authority or agency and a series of relevant actions (a workflow) that are crucial for coordinating the emergency response operation effectively. The disaster ontology is equally useful for seamlessly integrating data with the heterogeneous databases maintained by the diverse stakeholders coordinating emergency response operations.

2.5.1 The Ushahidi Crowdsourcing Platform

Ushahidi is an open-source crowdsourcing platform that has been in continuous development since 2008. It has been deployed by various communities and international aid organizations to support disaster handling and management through crowdsourcing. The current cloud-based implementation of Ushahidi is extremely flexible, as it can be easily integrated with popular social media platforms and external databases as well as SMS and email gateways. However, such flexibility comes at a cost: most of the configuration, integration, and day-to-day operation of Ushahidi is done manually. For example, to gather information about a particular type of disaster, Ushahidi administrators need to set up reporting forms (templates) along with a series of tasks (a workflow) manually. Comprehensive reporting of different types of disasters requires specialized human expertise with relevant domain knowledge, which is not always readily available, and incomplete or inaccurate reporting often leads to confusion and delays in emergency response. For example, unlike COVID-19, in reporting a viral pandemic such as a MERS (Middle East Respiratory Syndrome) outbreak, it is crucial to include information such as proximity to camels or camel-milk consumption. Subsequently, as part of the emergency response workflow, together with the notification of the public health authorities, the local veterinary department must also be alerted for the appropriate handling of the infected animals. Similarly, for a traffic accident involving bodily injury in a particular geographic area, the relevant paramedics will be informed together with the traffic police and insurance agents as appropriate.

2.5.2 Integration of Disaster Ontology with Ushahidi Platform

The cloud-based implementation of the current version of the Ushahidi crowdsourcing platform is modularized into two distinct modules: (1) the Platform API, a crowdsourcing backend implemented on Kubernetes with native APIs and webhooks, and (2) the Platform Client, a crowdsourcing frontend for data gathering, analysis, and visualization. We developed an OWL-based disaster ontology using the Protégé ontology editorFootnote 10 and the open-source Pellet reasoner plugin (capable of description logic-based reasoning and inference) and integrated it with Ushahidi. The disaster ontology, along with the natural language processing and annotation tools we developed, is integrated with the Ushahidi codebase. The current implementation and deployment of the disaster-ontology-enhanced Ushahidi crowdsourcing platform focuses on demonstrating the following features on a proof-of-concept basis:

  1. Generate Reporting Forms (or Templates) semi-automatically for any disaster and emergency using the domain knowledge from the disaster ontology.

  2. Generate Emergency Response Workflows automatically with necessary information inferred from the disaster ontology.

  3. Integrate information in our disaster ontology with the heterogeneous data sources located at the sites of the humanitarian organizations and emergency response authorities, seamlessly, using webhooks and other APIs.
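Feature (1) above can be illustrated with a sketch of ontology-driven form generation, where disaster-specific fields are inherited down the class hierarchy so that, for example, a MERS form automatically includes camel-related attributes. The attribute names and hierarchy here are hypothetical placeholders, not the deployed ontology content.

```python
# Illustrative, hypothetical attribute sets attached to ontology classes.
FORM_ATTRIBUTES = {
    "Disaster": ["location", "date", "description"],
    "ViralPandemic": ["symptoms", "contact_tracing"],
    "MERS": ["camel_proximity", "camel_milk_consumption"],
}
PARENT = {"MERS": "ViralPandemic", "ViralPandemic": "Disaster"}

def generate_form(disaster_type):
    """Collect form fields from the class and all of its ancestors,
    most general fields first."""
    fields = []
    cls = disaster_type
    while cls is not None:
        fields = FORM_ATTRIBUTES.get(cls, []) + fields
        cls = PARENT.get(cls)
    return fields
```

For a different viral pandemic class with no camel-related attributes in the ontology, those fields simply never enter the generated form.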

As shown in Fig. 4, with the help of the disaster ontology, an Ushahidi administrator is guided to generate an appropriate Reporting Form to gather data from users via the Ushahidi web interface directly (or through an appropriate Template via the Twitter, email, or SMS gateway, where semi-structured data is already annotated automatically using natural language processing tools). For a viral pandemic like MERS, camel-related attributes are included automatically as relevant; for other viral pandemics, such attributes are not relevant and are automatically excluded. The figure also shows the auto-generated Emergency Response Workflow, which includes information derived from the disaster ontology through inference (e.g., the contact details of the veterinary department of the relevant geographic area are extracted from the disaster ontology using the location attribute and the type of viral pandemic).

Fig. 4

Ushahidi crowdsourcing platform with disaster ontology, (1) Disaster ontology showing the key classes and attributes, (2) Editable Ushahidi report form generated with the help of the ontology, and (3) Ushahidi crowdsourcing platform showing the reported disasters on the World Map

We are continuously enhancing the disaster ontology to integrate knowledge related to various disasters and their associated workflows. We are also developing and enhancing the necessary tools, using natural language processing techniques and machine learning algorithms, to maintain the ontology semi-automatically so that we can facilitate automatic annotation and detection of named entities and disaster-specific events, topics, and workflows. Integration of the information in our disaster ontology with heterogeneous data sources (e.g., HDX, the Humanitarian Data Exchange serverFootnote 11) is also currently in progress.

2.6 Incident Monitoring and Visualization

Incident monitoring and visualization are concerned with developing an intuitive visual interface for end-users. End-users such as emergency operations managers would like to see visual patterns and trends by incident type over time and space. In this regard, we are developing a comprehensive dashboard to facilitate the visualization of multimodal data. The dashboard will provide integrated and fused information from multiple sources along with querying, filtering, and analysis functionality.

Previous studies have used different ways of visually analyzing crisis-related information by targeting specific information needs. For example, Chae et al. (2014) focused on how the public follows evacuation orders in an emergency. They investigated people's movement patterns by comparing spatiotemporal patterns through visualization, which helped analysts understand how users react during an emergency event. Kwon and Kang (2016) used a time-series location map to compare the severity of floods from Twitter posts; tweets were classified using a 5-by-5 risk evaluation matrix and displayed as color-coded points to assess risk severity. Onorati et al. (2019) used different techniques, including a tree map, word cloud, bubble chart, and an animated map, to better analyze information for decision-makers. GeoViewer (Tsou et al. 2017) is a real-time tool that provides an interactive display of multimedia content in addition to a map. These techniques are useful for analyzing information visually, targeting a specific information need from a specific social media channel. However, for effective disaster response and management, a tool should provide online access to a variety of useful information for both citizens and responders in a crisis. To this end, automatic capabilities are required to integrate and fuse information and to provide analysis, querying, triaging, and filtering functionality. In this project, we are developing capabilities to effectively disseminate information to citizens and to the emergency response workers coordinating disaster-related activities on the ground. A sample of screenshots similar to those we intend to develop is shown in Fig. 5. The system will provide the following functionalities.

Fig. 5

Sample dashboard visualization of social media analytics framework for crisis management

Situational Awareness Dashboard: This dashboard is concerned with processed, integrated, multimodal information that can be used for decision making. It will show the real-time spatiotemporal distribution of incident and response reports. The information will be visualized using heat maps, time plots, incident graphs showing relationships between events, and statistics about the different incidents. With this kind of visual analysis, emergency response operators will understand how a situation unfolds, which will help them in disaster planning.

Query and Navigation Interfaces: Intuitive query and navigation interfaces will be developed to allow users to filter through the details of incidents and events. This will allow them to obtain additional detail on each incident type (such as location, damage, etc.).

Selected Dissemination of Information (SDI): To effectively share the right information with the right people, an SDI component will be developed. This filtering system will identify, out of a huge volume of information, only the content appropriate to share with each party. For example, if the extracted knowledge is about a traffic accident, the traffic police will be notified with summarized information about the location, the number of injuries and deaths, the services needed, etc. Similarly, information related to medical emergencies will be shared with emergency services such as ambulance services, and so on.
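The SDI routing logic can be sketched as a lookup from incident type to recipients. The routing table, recipient names, and report fields below are illustrative assumptions, not the actual configuration of the component.

```python
# Hypothetical routing table: incident type -> responders to notify.
ROUTING = {
    "traffic_accident": ["traffic_police", "ambulance", "insurance"],
    "medical_emergency": ["ambulance", "hospital"],
    "flood": ["civil_defence", "volunteers"],
}

def disseminate(report):
    """Return (recipient, summary) pairs for one processed incident report.

    Unknown incident types fall back to a general operations center; the
    summary keeps only the fields every recipient needs.
    """
    recipients = ROUTING.get(report["type"], ["emergency_operations_center"])
    summary = {k: report[k] for k in ("type", "location", "severity")
               if k in report}
    return [(r, summary) for r in recipients]
```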

3 Discussion

The vast amount of information that is growing by the second on the Internet and various social media platforms is enhancing the relevance of information derived from such sources to operational crisis management. One of the key challenges is the availability of suitable computer-based solutions for automated information extraction and analysis from all types of data sources (text, images, audio-video, and geo-spatial) and for the seamless flow of information among the relevant stakeholders working together in emergency and crisis management. Although there have been several interesting developments toward relevant solutions in the last few years, their use is still mostly restricted to a specific information need and to single users within an organization, and they are not yet widely used in operational crisis management. Extending relevant applications to cover operational crisis management would stimulate the research community to make further advances in related research covering social media and high-performance computing technologies for processing all the different types of user-generated content.

The proposed disaster social media framework described here has several implications for practice and research. From a practical perspective, it presents robust techniques that can be used to facilitate the development of tools and implementation processes to support seamless access to useful, actionable information in a timely fashion for effective disaster operations and decision making in a smart city. The key capabilities are the analysis and visualization of multimodal data from diverse sources and customization according to different stakeholders' needs. For instance, is it a group interested in a rescue operation? Is it an organization that provides food or medicine? However, extracting the different types of information sought by different stakeholders from millions of diverse sources and modalities is challenging. First, no tools are available that support the seamless integration of actionable information; therefore, much of the data is underutilized. Second, extracting the conversation structure in social media content and locating actionable information for human interpretation to support multi-dimensional functionalities requires multimedia data processing. Third, much of the information is human-generated; while such information has the advantage of enabling effective disaster response, it can also contain false information. Automation is needed to identify misinformation and integrate the correct information so that human stakeholders can act on such information accordingly.

The technologies we described in all the tiers of the proposed framework, such as event extraction, semantic analysis of multimedia data, event management using the disaster ontology, and visualization, aim to address the problem of converting available data into actionable information. The techniques described under each tier provide an opportunity to integrate social media intelligence into disaster operations. While significant work is needed to fully integrate these emerging techniques into a complex disaster system, the possible benefit to individuals and society justifies the investment.

Funding Statement

The authors extend their appreciation to the Deputyship for Research & Innovation, Ministry of Education in Saudi Arabia, for funding this research work through project number 523.