1 Introduction

Situation awareness involves short-term, immediately responsive human behavior: acquiring geospatial information in the field to facilitate operational and tactical decision-making. It can help decision makers identify key environmental elements and their meanings within a volume of time and space (Endsley 1995). Situation awareness plays an important role in disaster response, recovery and emergency management for first responders and emergency response providers (Tsou and Sun 2007; Wang et al. 2016). For example, a firefighter can view a map showing nearby wildfire perimeters and decide to create a backfire plan in order to control the spread of wildfires (immediate operational plan). A police chief can identify the locations of evacuation shelters after a major earthquake and dispatch police officers to those shelters (short-term tactical plan).

There are a number of good examples of using social media for emergency management, including the 2010 Haiti Earthquake (Veil et al. 2011), 2012 Waldo Canyon Fire (Sutton et al. 2014), 2013 Typhoon Haiyan in the Philippines (Humanity Road and Statistics Without Borders 2014), Hurricane Sandy (Guan and Chen 2014), and the 2014 Ebola outbreaks in West Africa (Moore and Verity 2014). Disaster-relevant social media has been used to analyze how individuals or decision-makers enhance their situational awareness (Vieweg et al. 2010; de Albuquerque et al. 2015; Huang and Xiao 2015).

Previous studies have indicated the need to develop an integrated, user-friendly web application to collect and display relevant location-based social media messages in real time. Many relief efforts during disastrous events need to provide aid promptly to target areas. However, there is usually a significant communication delay between the affected communities and their formal emergency response agencies (Tapia et al. 2013; OCHA 2014; VSMWG 2014; Sutton et al. 2014). Location-based social media messages can facilitate prompt response and decision making for disaster response and recovery because social media is free and highly accessible. The messages provided by people in affected areas can be collected and interpreted by emergency responders immediately while also opening direct lines of communication between responders and affected communities (Goolsby 2010; Sakiki et al. 2013; Tomaszewski 2015).

One important piece of metadata for disaster response stored in social media messages is the location of users. Twitter posts with geographic locations are referred to as “geo-tagged tweets”. However, only 1–3% of tweets include a location (Morstatter et al. 2013) because users must activate the geolocation function on their mobile devices. Combining geo-tagged tweets with content analysis can help improve situational awareness and further disaster management efforts.

One early software example of using geo-tagged social media for disaster relief is Ushahidi (http://www.ushahidi.com), an open source crowdsourcing crisis map platform that utilizes online volunteers for information validation (Gao et al. 2011). Ushahidi was created in 2007 in Kenya to provide situational information from online volunteers (Okolloh 2009), and many disaster relief agencies have used it to help coordinate emergency responses in such places as Haiti, Chile, Russia, Kenya, and India. However, there are several major challenges in using geo-tagged social media for disaster relief, including the accuracy of geo-tagged locations, verification of reports, automatic summarization of all reports, social behavior prediction, and scalability and safety (Gao et al. 2011). Sahana is another well-known open source disaster management system for emergency management and disaster preparedness. It provides limited mapping functions to add information on disaster-relevant event locations or requests for help. Users can add notes or upload photos with specific locations on a web map, so that different agencies can work collaboratively on disaster response and relief activities (Careem et al. 2006).

Since neither Ushahidi nor Sahana provides real-time display of social media messages, the main design goal of GeoViewer is to deliver real-time map display functions for ground truth observation of disaster events. A product of the Center for Human Dynamics in the Mobile Age (HDMA) at San Diego State University, GeoViewer is a web-based real-time social media mapping application that links GPS locations, texts, photos, and time stamps in social media (Twitter) messages. GeoViewer helps users visualize historical and real-time social media content, provides easy-to-use geospatial query functions, and displays multimedia sources (see Fig. 1).

Fig. 1 The design of GeoViewer is based on the analytic framework of Knowledge Discovery in Cyberspace (KDC) (modified from Tsou and Leitner 2013, 56)

The first beta version of GeoViewer was developed on July 30, 2014 after the San Diego Wildfire (http://vision.sdsu.edu/hdma/wildfire/). The official version of GeoViewer was released by the HDMA Center on December 6, 2014 (http://vision.sdsu.edu/hdma/geoviewer/sanDiego) with the real-time mapping function. The key design principle of GeoViewer follows the Knowledge Discovery in Cyberspace (KDC) conceptual framework (Fig. 1). In order to acquire geospatial knowledge and situation awareness in social media and cyberspace, GeoViewer needs to help users discover new patterns, new trends, and new spatial knowledge from aggregated social media records (Tsou and Leitner 2013). One unique characteristic of KDC is a triangular knowledge base consisting of place, time, and messages. The three components of KDC are interdependent and human-centered, which means they are originally defined or created by human beings (messengers).

2 The Design of GeoViewer System Architecture

The system design of GeoViewer follows the client-server architecture shown in Fig. 2. On the server side, NodeJS was selected to develop the GeoViewer server functions because of (1) its many plugin modules; (2) its use of the same programming language as the client side (JavaScript); and (3) its lightweight and efficient design. The GeoViewer server includes the following components.

Fig. 2 The client/server system architecture of GeoViewer

  • Twitter Streaming Engine: The engine maintains a streaming connection with the Twitter Streaming Application Programming Interface (API) to collect real-time geo-tagged tweets. The engine specifies a bounding box in the POST statuses/filter endpoint of the public Twitter streams so that geo-tagged tweets are collected only if they are located within the targeted region.

  • Database: MongoDB was selected as the database for GeoViewer to store the collected tweets because it: (1) is an open source and document-based database with ease of development and scaling; (2) supports spatial query; and (3) provides full-text search functions.

  • GeoViewer API: The GeoViewer server provides several JavaScript APIs (such as searching historical tweets, spatial search, and sending new tweets) to exchange messages between the client and server sides. The messages are delivered over a WebSocket (i.e., a persistent connection between clients and servers that supports more frequent interaction). Once GeoViewer is launched in a Web browser, a WebSocket is opened to exchange messages in real time between the client and the server.
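
As a rough illustration of the bounding-box step in the Twitter Streaming Engine, the sketch below formats a region for the `locations` parameter of the statuses/filter endpoint (longitude/latitude pairs, southwest corner first) and re-checks on the server whether a tweet's GeoJSON point falls inside the box. The function names, the `SAN_DIEGO_BBOX` values, and the simplified tweet objects are assumptions made for illustration; they are not taken from the GeoViewer source code.

```javascript
// Hypothetical bounding box roughly covering the San Diego region (illustrative values only).
const SAN_DIEGO_BBOX = { west: -117.6, south: 32.5, east: -116.1, north: 33.5 };

// Format a box for the `locations` parameter: "swLon,swLat,neLon,neLat".
function toLocationsParam(bbox) {
  return [bbox.west, bbox.south, bbox.east, bbox.north].join(',');
}

// Return true when a tweet carries a GeoJSON point ([lon, lat]) inside the box.
function isInsideBbox(tweet, bbox) {
  const point = tweet.coordinates && tweet.coordinates.coordinates;
  if (!Array.isArray(point) || point.length !== 2) return false; // no exact point attached
  const [lon, lat] = point;
  return lon >= bbox.west && lon <= bbox.east && lat >= bbox.south && lat <= bbox.north;
}
```

A second check of this kind may be useful because a streamed tweet can be matched by its place region rather than by an exact point, in which case no point coordinates are available for mapping.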

On the client side, GeoViewer was created with HTML5, CSS, JavaScript, and JavaScript modules (Table 1). Besides the key mapping functions (such as searching, mapping, and labelling), the client side provides a filter function, which hides tweets that do not match the search criteria (keywords, spatial boundary, or dates). If a real-time tweet does not match the criteria, the tweet remains invisible until a user resets the search criteria.
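
The client-side filter can be pictured with a minimal sketch such as the one below. It assumes a simplified tweet object with `text`, `lonLat`, and `createdAt` fields and a criteria object with optional `keywords`, `bbox`, `from`, and `to` fields; all of these names are hypothetical and chosen only for this example.

```javascript
// Check one tweet against hypothetical search criteria; undefined fields are ignored.
function matchesCriteria(tweet, criteria) {
  const { keywords, bbox, from, to } = criteria;
  if (keywords && keywords.length > 0) {
    const text = tweet.text.toLowerCase();
    const hit = keywords.some((k) => text.includes(k.toLowerCase()));
    if (!hit) return false;
  }
  if (bbox) {
    const [lon, lat] = tweet.lonLat;
    if (lon < bbox.west || lon > bbox.east || lat < bbox.south || lat > bbox.north) return false;
  }
  const time = new Date(tweet.createdAt).getTime();
  if (from && time < new Date(from).getTime()) return false;
  if (to && time > new Date(to).getTime()) return false;
  return true;
}

// Tweets that fail the criteria are flagged as hidden rather than deleted,
// so resetting the criteria can make them visible again.
function applyFilter(tweets, criteria) {
  return tweets.map((t) => ({ ...t, visible: matchesCriteria(t, criteria) }));
}
```

Calling `applyFilter(tweets, {})` corresponds to resetting the search: with no criteria set, every tweet is marked visible.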

Table 1 The JavaScript (JS) modules used in the client side of GeoViewer

GeoViewer is an easily configurable package. Developers can adapt GeoViewer functions to different target areas by editing the configuration file. GeoViewer is planned to become an open source project, so that more researchers and institutes can develop or utilize GeoViewer for their own research projects.

3 User Interface Design and Key Functions

Unlike traditional web maps, GeoViewer provides interactive display of multimedia content in addition to map display. The web-based user interface (front-end) is built with open source programming libraries, jQuery and Leaflet, to visualize and query tweets from a server-side database (back-end). The user interface design of GeoViewer follows the concept of user-centered design (UCD) in cartography (Tsou 2011), which emphasizes providing the most important (and simplified) information for map readers with a user-friendly interface and data analytic functions. Four major components of the user interface design are highlighted: (1) real-time display of new geo-tagged tweets within the target area; (2) interactive mapping functions for geovisualization; (3) spatial, text, and temporal search functions using keywords, spatial boundary, or dates; and (4) labeling and text-tagging for individual messages and for further sentiment analysis.

3.1 Real-Time Display of Geo-Tagged Tweets

When users launch GeoViewer, the web app automatically collects geo-tagged tweets within the selected region through the public Twitter Streaming API. Although the public Streaming API is limited to 1–3% of total tweets within a region, GeoViewer can collect 40–50% of all geo-tagged tweets within the defined region based on our previous testing results. Some researchers have also indicated that over 90% of all geo-tagged tweets can be collected via public Streaming APIs (Morstatter et al. 2013). When a new geo-tagged tweet is collected, GeoViewer automatically displays the tweet at the top (newest first) or at the end (oldest first) of the Tweet window (left side). Users can select the refresh rate of new tweets by time or by the number of tweets. Users can also pause the real-time feed if needed (Fig. 3).
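
One way to picture the ordering, refresh, and pause behavior is a small buffer that queues incoming tweets and releases them in batches. The `TweetFeed` class below is a hypothetical sketch written for this explanation, with assumed option names (`newestFirst`, `batchSize`); it is not the GeoViewer implementation.

```javascript
// A hypothetical feed buffer: incoming tweets are queued and released in batches.
class TweetFeed {
  constructor({ newestFirst = true, batchSize = 5 } = {}) {
    this.newestFirst = newestFirst; // display order of the tweet window
    this.batchSize = batchSize;     // refresh after this many queued tweets
    this.paused = false;
    this.pending = [];              // received but not yet displayed
    this.displayed = [];            // current content of the tweet window
  }

  // Called for every tweet that arrives from the server.
  receive(tweet) {
    this.pending.push(tweet);
    if (!this.paused && this.pending.length >= this.batchSize) this.flush();
  }

  // Move queued tweets into the window; a timer can call this for time-based refresh.
  flush() {
    if (this.paused || this.pending.length === 0) return;
    const batch = this.pending.splice(0); // take everything that is queued
    this.displayed = this.newestFirst
      ? [...batch.reverse(), ...this.displayed]
      : [...this.displayed, ...batch];
  }

  pause() { this.paused = true; }
  resume() { this.paused = false; this.flush(); }
}
```

In this sketch, refreshing by number of tweets is handled by `batchSize`, while refreshing by time could be approximated by calling `feed.flush()` from `setInterval`. Tweets received during a pause stay queued and appear when the feed resumes.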

Fig. 3 Displaying and arranging real-time tweets in GeoViewer (San Diego region) (http://vision.sdsu.edu/ec2/geoviewer/sanDiego#)

The dual-view interface is adjustable: increasing or decreasing the map view window (right side) also adjusts the number of columns in the tweet window. Each tweet in the Tweet window has an orange Map button to identify its exact location in the Map Window. Disaster responders and resource coordinators can query and analyze these real-time tweets and identify the location and time of urgent requests. Each affected area can have one GeoViewer instance with a pre-defined area for tweet collection. The bounding box is defined in the web server configuration file for each GeoViewer and can also be visualized in the front-end by clicking the Twitter Search Boundary in the Map Layer option.

3.2 Interactive Mapping Functions for Geovisualization

GeoViewer utilizes Leaflet, an open source JavaScript mapping library, for the development of interactive mapping functions. In addition to the standard zoom-in (+), zoom-out (−), and pan functions, four key mapping functions for better geovisualization are provided: map marker selection, cluster mapping, hot spot mapping, and map layers overlay.

  • Map marker selection allows users to pick the color and size of map markers (Fig. 4, left). The markers depict the approximate location of each geo-tagged tweet. In order to protect Twitter users’ locational privacy, GeoViewer applies geo-masking to conceal the actual tweet locations by randomly selecting a coordinate within a 100 m radius of the original location.

    Fig. 4 The map marker selection function (left) and the cluster mapping function with top three keywords within each cluster (right)

  • Cluster mapping provides a generalized view of the clustered distribution of tweets within a region to simplify the display of point data. In addition, a popular text display function is associated with the cluster mapping to present the top three keywords from the message clusters (Fig. 4, right). When users zoom in or out, tweet clusters are re-arranged and top keywords are dynamically generated based on the new cluster groups. One motivation for this function is to provide a quick overview of popular issues or topics within the clustered groups. For example, during a new wildfire event, the popular keywords used near fire perimeters can be used to identify the name of the wildfire for future reference and public announcements. Also, if there are multiple urgent requests from a collapsed building or a landmark, the cluster labels can help decision-makers identify the key issues within the target area.

  • Hot spot mapping (heat maps) highlights the clustered areas in red. Users can turn the hot spots on or off in the map and modify the radius used by the kernel density estimation (KDE) that generates them. Figure 5 illustrates two hot spot maps with different radius settings for the KDE function. The scalable hot spot function is designed for disaster relief work since it helps decision-makers evaluate the significance of damages or requests. The modifiable radius helps users identify urgent requests in small areas and provides an overview of spatial patterns in larger regions.

    Fig. 5 Hot spot mapping functions with adjustable radius by using kernel density estimation (KDE)

  • Map layers overlay function combines multiple types of external web services, including WMS, WFS, and ArcGIS Online Mapping Services. Currently, GeoViewer has added the following external maps: NOAA Red Flag Warning Live Update, NOAA NWS Weather Alert CAP (polygons), and several San Diego County OES’s ArcGIS online mapping services. This function allows decision-makers to more efficiently integrate multiple data resources, operation plans, and collaboration tasks.
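
The geo-masking step described under map marker selection can be sketched as follows. This is one simple way to pick a random coordinate within a given radius, using a flat-earth approximation that is reasonable for offsets of about 100 m away from the poles. The function name `geoMask`, the constant `METERS_PER_DEGREE_LAT`, and the injectable `random` argument are assumptions for illustration; the source does not describe how GeoViewer computes the offset.

```javascript
const METERS_PER_DEGREE_LAT = 111320; // approximate length of one degree of latitude

// Return a random point within `radiusMeters` of the original location.
// `random` is injectable so the function can be tested deterministically.
function geoMask(lat, lon, radiusMeters = 100, random = Math.random) {
  const distance = radiusMeters * Math.sqrt(random()); // sqrt spreads points evenly over the disk
  const bearing = 2 * Math.PI * random();
  const dNorth = distance * Math.cos(bearing);
  const dEast = distance * Math.sin(bearing);
  const dLat = dNorth / METERS_PER_DEGREE_LAT;
  // A degree of longitude shrinks with the cosine of the latitude.
  const dLon = dEast / (METERS_PER_DEGREE_LAT * Math.cos((lat * Math.PI) / 180));
  return { lat: lat + dLat, lon: lon + dLon };
}
```

Taking the square root of the random number avoids concentrating masked points near the original location, which would otherwise make the true position easier to guess from repeated observations.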

3.3 Spatial, Text, and Temporal Search

GeoViewer provides three types of query functions for collected geo-tagged social media: spatial search, text (keyword) search, and temporal search based on the KDC framework. Users can search tweets by drawing a rectangle or a polygon or by typing place names within a radius circle. Figure 6 (left) illustrates a spatial query example that selects tweets within the boundary of Balboa Park in San Diego. Users can also type keywords and select dates to query the historical tweets collected by GeoViewer. Figure 6 (right) illustrates a combined query result using a keyword (“beach”) and a temporal window (from 2016-01-01 to 2016-02-01) together. The search results indicated that most tweets containing the text “beach” were located close to beach areas. One interesting finding is that the top three popular keywords reflected the actual names of beaches in the clustered map. Researchers may be able to use this function to retrieve local place names associated with different land use types.
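
The two spatial query shapes mentioned above (a drawn polygon and a radius circle) correspond to two standard geometric tests, sketched below in plain JavaScript: a ray-casting point-in-polygon test and a haversine distance check. The function names are assumptions for this example, and the source does not say whether GeoViewer evaluates such queries in the browser or in the database.

```javascript
// Ray-casting test: is the point [lon, lat] inside the polygon (array of [lon, lat] vertices)?
function pointInPolygon(point, polygon) {
  const [x, y] = point;
  let inside = false;
  for (let i = 0, j = polygon.length - 1; i < polygon.length; j = i++) {
    const [xi, yi] = polygon[i];
    const [xj, yj] = polygon[j];
    // Toggle when a horizontal ray from the point crosses the edge (j -> i).
    const crosses = (yi > y) !== (yj > y) && x < ((xj - xi) * (y - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

// Great-circle distance in meters (haversine formula, mean Earth radius).
function haversineMeters([lon1, lat1], [lon2, lat2]) {
  const R = 6371000;
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const a = Math.sin(dLat / 2) ** 2
    + Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * R * Math.asin(Math.sqrt(a));
}

// Radius-circle query: is the point within `radiusMeters` of the center?
function withinRadius(point, center, radiusMeters) {
  return haversineMeters(point, center) <= radiusMeters;
}
```

Since MongoDB supports spatial queries, a deployment could instead push the same tests into the database with its geospatial operators (for example `$geoWithin`), which avoids transferring non-matching tweets to the client.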

Fig. 6 The spatial query function (drawing a polygon) (left) and the text search function (keyword: beach) (right) in GeoViewer

3.4 Labeling and Text-Tagging Function

One unique feature of GeoViewer is its labeling and text-tagging functions for authorized individual users. These functions help users identify important messages related to disaster management tasks, such as warnings or medical help. Ideally, users can create their own labeling system for different types of disasters or events. The current version of GeoViewer provides five pre-defined labels (Fig. 7): Safe (green), Info (blue), Medical Help (cyan), Warning (orange), and Danger (red). Users can also label a tweet with a sentiment label: Positive, Neutral, Negative, or Unknown. The manual labeling functions and sentiment labels could be used for a future machine learning classification method (under development). Volunteers may use GeoViewer to classify important tweets during and after disaster events. The manually classified tweets can be used as training sets for machine learning methods. The results of machine learning classification will provide automatic labeling functions for future tweets related to disaster events.

Fig. 7 Labeling tweet (tag tweet) function and sentiment labels in GeoViewer

There are some similar social media text mining projects for disaster relief. Qu et al. (2011) classified earthquake-related messages into four categories (situation update, opinion expression, emotional support, and calling for action); Cameron et al. (2012) built a software package to detect emergent incident-related tweets; Imran et al. (2013) utilized machine learning methods to classify Twitter messages into different classes (caution and advice, casualties and damage, donations, people, information sources and others). However, most of these applications did not utilize real-time feeds of tweets, and their data mining methods took one or two days to generate results.

We hope to develop a near-real-time mechanism for automatically analyzing and classifying disaster-relevant tweets. There are two key technical challenges for near-real-time text mining and classification in GeoViewer: (1) combining multiple volunteers’ labeling results (message ranking, classification, and error checking) into a single system; and (2) developing different types of classification schemes for different types of disasters. We hope to solve these technical issues by developing a new text analytic framework for GeoViewer in future releases.
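
As one simple illustration of the first challenge, several volunteers' labels for the same tweet could be merged by majority vote, leaving a tweet unresolved when agreement is too low. The sketch below is an assumed baseline rather than the planned framework; the function name `combineLabels` and the `minVotes` and `minAgreement` thresholds are hypothetical.

```javascript
// Combine several volunteers' labels for one tweet by majority vote.
// Returns null when there are too few votes or no label exceeds the agreement threshold.
function combineLabels(votes, { minVotes = 2, minAgreement = 0.5 } = {}) {
  if (votes.length < minVotes) return null;
  const counts = new Map();
  for (const label of votes) counts.set(label, (counts.get(label) || 0) + 1);
  let best = null;
  let bestCount = 0;
  for (const [label, count] of counts) {
    if (count > bestCount) { best = label; bestCount = count; }
  }
  const agreement = bestCount / votes.length; // share of volunteers choosing the top label
  return agreement > minAgreement ? { label: best, agreement } : null;
}
```

For example, `combineLabels(['Danger', 'Danger', 'Warning'])` resolves to Danger with two-thirds agreement, whereas an even split between two labels stays unresolved and could be routed to another volunteer for error checking.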

4 Nepal Earthquake Case Study

In order to assess the usability of GeoViewer for disaster events, we designed and implemented an instance of GeoViewer for the 2015 Nepal earthquake as one case study. On April 25, 2015, an earthquake with a magnitude of 7.8 devastated Nepal causing 8,999 fatalities and 19,009 injuries. Despite the geographic distance, people were able to help internationally by volunteering online to produce crisis maps. Crisis maps can be used to aid in emergency relief efforts by providing first responders with detailed online maps that use information gathered from social media, satellite imagery, and drones, all of which contribute to increasing situation awareness (Gao et al. 2011). Organizations such as the International Network of Crisis Mappers, the Digital Humanitarian Network, and Humanitarian OpenStreetMap Team have formed to help by utilizing crowdsourcing and humanitarian efforts (Meier 2014).

The HDMA Center launched a GeoViewer instance for the Nepal Earthquake on April 26, immediately after the event (Fig. 8) (http://humandynamics.sdsu.edu/NepalEarthquake.html). Information collected was shared with officers in the United Nations (UN) and affiliated international organizations. GeoViewer was used in two major ways: (1) analysis of historic tweets collected during and after the disaster by keywords, dates, and locations; and (2) real-time monitoring of tweets during the disaster. During analysis, some keywords needed to be translated since the official and most commonly used language in the area is Nepali, with English used by those with more education.

Fig. 8 The new GeoViewer for 2015 Nepal Earthquake (http://vision.sdsu.edu/ec2/geoviewer/nepal-kathmandu#)

Using the Nepal Earthquake GeoViewer, we found that the majority of communication via Twitter was in English, with only 20% of tweets in Nepali. However, translating the Nepali tweets was difficult since Google Translate does not provide a translation option between Nepali and English.

Furthermore, we noticed that many messages included trending hashtags for the Nepal earthquake, such as “#NepalEarthquake”, “#Nepal”, and “#NepalQuake”. Applying these popular hashtags could help users broadcast messages more effectively. Many important tweets came from people requesting assistance or providing local help information for those who may be seeking assistance. These tweets can be manually labeled using the GeoViewer labeling functions. We can also identify the exact geographic coordinates of these tweets. Some examples of Twitter messages are:

  • plz help us we are stuck in Nepal Kathmandu. Vinod lila 9851035###. 10:15 pm, 25 April 2015 (geo-tagged location).

  • This shelter need help http://t.co/####, 03:16 am, 26 April 2015 (geo-tagged location).

The cluster mapping function in GeoViewer displays the top three trending keywords from the nearby tweets. During the disaster event, these trending keywords (Fig. 8) might indicate where the main damage has occurred, where shelters or supplies are needed, or where aftershocks were felt. Another method to collect disaster-related tweets is to use the download function in GeoViewer to extract all Twitter messages within a selected bounding box, which can be designated using the “Location” feature or by typing place names. With GeoViewer, we hope these social media messages can be reviewed by volunteers before being sent to, or monitored by, the local governmental response teams, who could pinpoint exactly where assistance should be directed and how to contact those affected.
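
A minimal sketch of how the top three keywords of one cluster could be derived is a word-frequency count over the tweets in that cluster. The tokenizer, the short `STOP_WORDS` list, and the function name `topKeywords` are assumptions made for this example; the source does not specify how GeoViewer ranks keywords.

```javascript
// A tiny illustrative stop-word list; a real deployment would need a fuller one.
const STOP_WORDS = new Set(['the', 'and', 'for', 'are', 'this', 'that', 'with', 'from']);

// Return the `n` most frequent words across the tweets of one cluster.
function topKeywords(texts, n = 3) {
  const counts = new Map();
  for (const text of texts) {
    // Unicode-aware tokens so that non-Latin scripts (e.g., Devanagari) are kept; '#' keeps hashtags.
    const words = text.toLowerCase().match(/[#\p{L}\p{M}\p{N}_]+/gu) || [];
    for (const word of words) {
      if (word.length < 3 || STOP_WORDS.has(word)) continue;
      counts.set(word, (counts.get(word) || 0) + 1);
    }
  }
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1] || (a[0] < b[0] ? -1 : 1)) // by frequency, then alphabetically
    .slice(0, n)
    .map(([word]) => word);
}
```

Because clusters are re-arranged whenever the zoom level changes, a function like this would be re-run on each new cluster's tweets to refresh its label.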

5 Conclusion

Emergency response and rescue services are critical tasks in disaster management. By combining cluster map display, hot spots, manual labeling, spatiotemporal querying, and real-time social media monitoring functions, GeoViewer is designed to help firefighters, police officers, and first responders enhance their situation awareness and make better decisions for disaster relief. GeoViewer provides a more integrated geovisualization solution by analyzing disaster-relevant messages with scalable hot spots and cluster mapping and labeling functions. These visualizations help users validate major damaged areas. For example, a hot spot with over 10 reports from different users of a collapsed building is more credible than a single message.

After the case study of the 2015 Nepal Earthquake, we identified a few research challenges and opportunities for the next version of GeoViewer. The first challenge is how to validate information (reports) provided by social media users. Previous research applications (such as Ushahidi) emphasized only crowd-sourced manual classification to filter noise and errors. GeoViewer will need to combine both crowd-sourced manual classification and real-time machine learning methods to validate and filter disaster-relevant messages in near real time. How to effectively recruit online volunteers and generate training datasets for machine learning within a short period of time is an additional challenge. There was also significant noise in geo-tagged tweets from job advertisements, weather reports, traffic updates, etc. These need to be filtered out before conducting the hybrid manual and machine learning classification.

Another challenge is how to increase the coverage of total geo-tagged tweets. Using the Firehose or PowerTrack APIs from Twitter directly is one option, but these are not affordable for most local disaster response agencies or NGOs. The sampling process of the public Twitter Streaming APIs is a black box, and the underlying algorithm has changed frequently. We need to understand more about these algorithms to establish a stable and full coverage stream. On the other hand, to increase the percentage of geo-tagged tweets, disaster relief agencies, such as the Red Cross or local government agencies, should ask people in affected areas to enable GPS on their devices when submitting relevant reports via location-based platforms.

Language translation and interpretation is another key challenge for future development (Tsou 2015). For example, if we want to use GeoViewer to guide disaster relief in San Diego, we need to provide both English and Spanish versions because there is a significant Hispanic population. Some language translation tasks can be facilitated by using Google Translate. However, in the Nepal case study, we found that there is no appropriate translation software for Nepali. Besides technological challenges, we also need to consider human and ethical issues. Sometimes, people in the most damaged areas may not have electricity or wireless communications to report their urgent needs. Removing digital divides and building accessible wireless communication networks for everyone may be the most important challenge before we can effectively utilize social media for disaster relief and management tasks in the future.