1 Background

The use of maps to visually represent aspects related to disease dates back to the time of the Black Plague (Koch 2017). Maps have proven to be an effective visual to communicate geolocated data for numerous pandemics like the Spanish flu, SARS, and Ebola. COVID-19 is not novel in this regard. However, the role maps played during this pandemic differed (Rosenkrantz et al. 2021).

For starters, we witnessed a deluge of Web-based maps on COVID-19 when the disease first emerged in late 2019. Historically, maps of disease were produced by professional cartographers, but the rise of open-source cartographic software in the last decade made Web-based mapping of COVID-19 a popular pursuit available to anybody with an Internet connection and some technical know-how. Though this was a largely positive democratization of cartography (Mooney and Juhász 2020), the abundance and accessibility of such “ready to use” interactive mapping software (e.g., ArcGIS Online, Tableau) made certain styles of COVID-19 maps (e.g., choropleth, graduated symbols) more ubiquitous than ever.

Maps of COVID-19 were also central to how governments, public health agencies, and news outlets relayed information to the public. We live in an increasingly visual world, where visual forms of media often supersede the written word in terms of public information consumption. Maps have thus become a key communication tool given their ability to efficiently visualize events on the face of the earth in a manner that is difficult to replicate through text or tables alone. While some of the COVID-19 maps that were produced are excellent examples of geographic representation, others fell short, often misrepresenting the state of the pandemic and contributing at the time to what the World Health Organization (WHO) called an “infodemic,” that is, an overabundance of information that makes it difficult for people to distinguish trustworthy, reliable sources of information from those that are not (WHO 2020).

The first section of this chapter discusses the deluge of maps produced on COVID-19 and the common cartographic pitfalls encountered. We also discuss why, despite the initial profusion of maps, we were only telling a small portion of the COVID-19 “story.” The second section explores the data issues that contributed to this mapping rut and how Geographic Information Science (GIScience) can be used to get us out of it.

2 A Mapping Deluge

From the very start, COVID-19 maps emerged as a way to depict the number and geographical region of those infected, those recovered, and those who died as the disease rapidly spread from place to place. Although these kinds of incidence and prevalence maps are common to public health and epidemiology, the sheer number of Web-based maps that emerged to represent the spread and impact of COVID-19 was decidedly uncommon.

The vast majority of COVID-19 maps that modelled incidence or prevalence rates took the form of either a graduated or proportional circle map or a choropleth map. Though some of these maps were professionally produced and served as informative resources on COVID-19, many others failed to consider even the most basic tenets of good cartography such as inclusion of north arrows, scale stability, and the use of consistent units of aggregation (Mooney and Juhász 2020; Field 2020). As a result, they were either difficult to interpret, misleading, or both (Mooney and Juhász 2020).

Choropleth maps were one common source of mistakes. Choropleth maps use shaded or patterned areas to represent spatial variations in geolocated areal data. They assume constant density over the area being shaded and therefore must map relative data (e.g., number of cases per 100,000 people) to allow the reader to compare one area to another. Yet numerous choropleth maps reported absolute data related to COVID-19, such as total number of cases or fatalities for an area, with complete disregard for each area’s population (Fig. 58.1). Several of these maps also overclassified their data (i.e., they used too many gradations), making it difficult to tell one color from another and adding to interpretation challenges.

Fig. 58.1 An example of two choropleth maps illustrating cases of COVID-19

Figure 58.1 presents an example of a poorly constructed choropleth map on top (map A) and a corrected version on the bottom (map B). With map A, the absolute number of cases is being mapped by state. However, because states differ widely in population, using absolute values such as case counts can be misleading. For example, in map A, California is shown to have a very high number of cases, signaling to a reader that the danger due to COVID-19 there is higher. But map B, which maps cases per 100,000 people (i.e., a relative value), indicates that California has a relatively low number of cases per 100,000. Choropleth maps should always map relative values to avoid misrepresentation.
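The normalization that separates map A from map B can be sketched in a few lines of Python. This is only an illustration: the two regions and their numbers are invented for the example and are not real COVID-19 data, and the function name `rate_per_100k` is hypothetical.

```python
def rate_per_100k(cases: int, population: int) -> float:
    """Convert an absolute case count into cases per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases * 100_000 / population


# Hypothetical regions with made-up numbers (NOT real COVID-19 data).
regions = {
    "Region A": {"cases": 50_000, "population": 10_000_000},
    "Region B": {"cases": 5_000, "population": 500_000},
}

# Relative values: these, not the raw counts, belong on a choropleth map.
rates = {
    name: rate_per_100k(r["cases"], r["population"])
    for name, r in regions.items()
}

for name, rate in rates.items():
    print(f"{name}: {regions[name]['cases']:,} cases, {rate:,.0f} per 100,000")
```

In this invented example, Region A has ten times as many cases as Region B but only half its rate, which is the same reversal described for California between map A and map B.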

On the other hand, graduated or proportional symbol maps, which size each symbol in proportion to the data, can map either relative or absolute data. The most common pitfall with these maps was the use of improper scale coupled with low-resolution data. When symbols are drawn too large for the scale of the map, graduated circles overlap each other to the point that it is impossible to determine which area each circle represents. Though increasing symbol transparency or using dynamic maps that scale can help mitigate this issue, symbol congestion can ultimately impair a map’s readability and distort the information that the mapmaker is trying to convey (Fig. 58.2) (Field 2020). Another common issue with this style of map is the inconsistent use of units of aggregation. This was especially common with Web maps on a global scale, where in some cases, COVID-19 infections were represented at the country level, and in other cases, they were represented at the provincial, state, or even county level.

Fig. 58.2 An example of two graduated circle maps illustrating cases of COVID-19

Figure 58.2 presents an example of a difficult-to-interpret graduated circle map on the left (map A) and an improved version on the right (map B). As you can see, the circles in map A are too big; in some cases, they cover entire countries, and a few even overlap with other circles, making it difficult to interpret the information being represented. With map B, this issue has been resolved by reducing the overall sizes of the circles used and making them semitransparent to allow country borders and any remaining overlapping circles to be visible. In general though, graduated circle maps are often low resolution, and it is difficult to improve on this without higher-resolution data.
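One way to reason about symbol size is sketched below in Python. It assumes the common convention of making circle area (rather than radius) proportional to the mapped value, which the chapter does not itself specify; the function names, the `max_radius` parameter, and the coordinates are hypothetical and chosen only for illustration.

```python
import math


def circle_radius(value: float, max_value: float, max_radius: float = 20.0) -> float:
    """Radius for which circle AREA is proportional to `value`.

    `max_radius` (in arbitrary screen units) is the radius given to the
    largest value; lowering it shrinks every symbol on the map.
    """
    if value < 0 or max_value <= 0:
        raise ValueError("value must be >= 0 and max_value must be > 0")
    return max_radius * math.sqrt(value / max_value)


def circles_overlap(center_a, radius_a, center_b, radius_b) -> bool:
    """True when two circles intersect, i.e. their symbols would collide."""
    return math.dist(center_a, center_b) < radius_a + radius_b


# Hypothetical symbol positions and values, for illustration only.
big = circle_radius(100, max_value=100, max_radius=20.0)   # largest symbol
small = circle_radius(25, max_value=100, max_radius=20.0)  # a quarter of the value

print(big, small)
print(circles_overlap((0, 0), big, (25, 0), small))
```

Reducing `max_radius` until `circles_overlap` is false for most neighboring symbols is one rough way to approach the kind of correction shown in map B; transparency then handles the overlaps that remain.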

These cartographic missteps were not specific to Web-based mapping. More traditional forms of maps displayed online or in print as static images suffered from similar blunders (Mooney and Juhász 2020). In his well-known book How to Lie with Maps, Monmonier wisely reminds us that a healthy dose of skepticism is essential when interpreting maps, cautioning that “because of advances in graphics software and online mapping, inadvertent yet serious cartographic lies can appear respectable and accurate” (2018, p. 250).

Cartographic mistakes aside, what is so wrong with having had this profusion of COVID-19 maps? Without question, prevalence and incidence of disease are important statistics that should be mapped. However, these maps only told a small portion of the COVID-19 story. As Jonathan Everts (2020) notes in his article, The Dashboard Pandemic, the choice of choropleth and graduated circle maps to portray COVID-19 statistics masks certain risk groups and obscures small-scale patterns of disease. Essentially, these maps suggest that within a specified, territorially defined area, the burden of disease was shared by all equally, which we know was far from the truth (Everts 2020). The continuing ubiquity of these styles of maps was partly due to the profusion of easily accessible “ready to use” interactive mapping software available to amateur cartographers. But it was also in large part the result of messy health data and a lack of high-resolution spatial data that initially limited the type of spatial analyses possible.

In the following section, we discuss issues related to COVID-19 data for mapmaking purposes and how GIScience helped to overcome them.

3 Dealing with the Data

If we envisage a map as the tip of the proverbial iceberg, then the 90% below the surface is the data. Mapping COVID-19 for much of the pandemic was limited by two key factors: messy and incomplete health data and a paucity of high-resolution spatial data.

3.1 Messy Health Data

At the beginning of the pandemic, the rapid spread of the coronavirus and high transmission rates overwhelmed health systems. Public health records suffered from numerous irregularities in data collection, particularly around testing for the virus, as well as in how this data was reported (Platt 2020; Smart 2020). Consequently, any sort of higher-level analysis of this data was forced to grapple with these inconsistencies.

As an example, let’s discuss the use of case counts as an indicator for COVID-19. Mapping case counts of COVID-19 can be problematic for a few reasons. Due to the incubation period of the virus and the time required for testing, case data lags about 2 weeks behind at minimum, representing the recent past rather than the present (Brunsdon 2020). Case counts are also highly dependent on the testing capacity of a region, meaning that they are likely an underestimate wherever testing capacity has been limited (Brunsdon 2020). At the beginning of the pandemic, when labs were still adjusting to the flood of testing, daily spikes in cases often reflected a backlog of tests being cleared rather than true daily counts. Finally, since testing capacity and strategy varied over time and place, it was extremely difficult to analyze case counts longitudinally or across regions, making this indicator difficult to work with from a spatial perspective.

Hospitalization and death data are typically more reliable than case counts; however, their use comes with some major caveats. First, due to the virus’ incubation period, both indicators are representative of the recent past and not the present. There was also evidence that mortality-related events were not being systematically tested and coded, likely leading to substantial undercounts of death (“The Fatal Flaws” 2020). Lastly, these indicators were often disproportionately affected by outbreaks in long-term care homes (Walsh and Semeniuk 2020; Yourish et al. 2020). Taken together, these limitations made accurately mapping health data difficult to do.

Consequently, the best maps accounted for these data inconsistencies. Figure 58.3 shows case counts using a 5-day rolling average to help prevent major events (such as a change in reporting methods or a clearing of testing backlogs) from skewing the data (Johns Hopkins Coronavirus Resource Center 2020). Figure 58.4 shows the prevalence of lab-confirmed cases per 100,000 people in Ottawa, based on the home location of those individuals (Ottawa Public Health 2020). Importantly, their data was filtered to exclude long-term care homes and retirement residences where outbreaks of the disease would have inflated overall rates.
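The smoothing effect of a 5-day rolling average can be sketched as follows. This is a minimal Python illustration, assuming a trailing window (the average of the current day and the four days before it); the source does not say whether the published map used a trailing or a centered window. The daily counts are invented and the function name `rolling_mean` is hypothetical.

```python
def rolling_mean(values, window=5):
    """Trailing moving average over `window` observations.

    Days with fewer than `window` observations so far yield None,
    because a full window is not yet available.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    running_total = 0
    for i, value in enumerate(values):
        running_total += value
        if i >= window:
            # Drop the observation that just left the window.
            running_total -= values[i - window]
        smoothed.append(running_total / window if i >= window - 1 else None)
    return smoothed


# Made-up daily counts; the spike on day 5 mimics a cleared testing backlog.
daily_cases = [10, 12, 11, 13, 54, 12, 11]

print(rolling_mean(daily_cases, window=5))
```

The one-day spike of 54 is spread across the window, so the smoothed series stays near 20 instead of jumping to more than four times the surrounding values.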

Fig. 58.3 Johns Hopkins animated map of daily confirmed cases of COVID-19 using a 5-day moving average (Johns Hopkins Coronavirus Resource Center 2020)

Fig. 58.4 Ottawa Public Health’s map of COVID-19 case rates, excluding cases in long-term care homes and retirement homes (Ottawa Public Health 2020). (Source: Contains information licensed under the Open Government License - City of Ottawa)

Another major issue with the health data for COVID-19 was the lack of information being collected and reported on race and ethnicity. In Canada, statistics based on race or ethnicity are not collected unless individual groups are found to have risk factors (Williams et al. 2020). Despite early anecdotal evidence that Black, Indigenous, and People of Color (BIPOC) in Canada faced greater infection rates than white Canadians (Bowden 2020; Alliance for Healthier Communities 2020), and the scientific evidence in the United States that BIPOC are indeed at higher risk (Oppel et al. 2020; CDC 2020; APM Research Labs 2020), provincial health officials were slow to begin collecting racial data, and certain provinces remained resistant (Boyd 2020; The Canadian Press 2020; Watson 2020; Andrew-Gee 2020). The situation in the United States was somewhat better: at the time of writing, 47 states had released data on confirmed COVID-19 cases by race, and 43 states had released COVID-19 mortality data by race (“State COVID-19 Data by Race” 2020). Still, only four states had released COVID-19 testing data by race (“State COVID-19 Data by Race” 2020), and it took months before race-based data began to be collected and made available to the public.

Despite these shortfalls in data coding, GIScientists were in a unique position to help tell a more complete story of the COVID-19 pandemic, particularly with regard to its differential and unjust impact on BIPOC communities. By adopting a GIScience approach, researchers were able to harness the power of data linkage and analysis, comparing the limited COVID-19 data that was available with the most recent census data to expose the uneven and unjust geographies of the pandemic (Everts 2020). This allowed researchers to identify and map the racial breakdown of areas hit particularly hard by the disease, as well as monitor changes in the underlying causes of death to better locate anomalous patterns of mortality that may be attributed to racial disparities in risk and care access during an outbreak. Figure 58.5 shows an excellent example of a map that compared the racial makeup of counties in Georgia, United States (US), with how they had been affected by COVID-19 (Gaglioti et al. 2020). Visualizations like this are critical for illustrating the unequal geographies of this pandemic.
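The kind of data linkage described above can be illustrated with a small Python sketch that joins case counts to census attributes on a shared area identifier. Everything here is an assumption made for illustration: the county identifiers, the field names (`cases`, `population`, `pct_black`), and all of the numbers are invented and carry no empirical meaning.

```python
def link_records(covid_by_area, census_by_area):
    """Join COVID-19 counts to census attributes on a shared area ID.

    Areas without a matching census record are skipped rather than guessed.
    """
    linked = {}
    for area_id, covid in covid_by_area.items():
        census = census_by_area.get(area_id)
        if census is None:
            continue  # no census match: cannot compute a rate
        linked[area_id] = {
            **covid,
            **census,
            "rate_per_100k": covid["cases"] * 100_000 / census["population"],
        }
    return linked


# Hypothetical county records with made-up values (NOT real data).
covid_by_area = {
    "C1": {"cases": 120},
    "C2": {"cases": 90},
    "C3": {"cases": 10},  # has no census match below
}
census_by_area = {
    "C1": {"population": 40_000, "pct_black": 55.0},
    "C2": {"population": 60_000, "pct_black": 20.0},
}

linked = link_records(covid_by_area, census_by_area)
for area_id, row in linked.items():
    print(area_id, row["rate_per_100k"], row["pct_black"])
```

Once case rates and demographic attributes sit in the same record for each area, they can be mapped side by side or compared statistically, which is the step that makes uneven burdens visible.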

Fig. 58.5 Maps of Georgia showing the often-high burden of COVID-19 in areas with high percentages of Black people (Gaglioti et al. 2020, p. 6). (Source: Morehouse School of Medicine National Center for Primary Care: www.msm.edu/ncpc)

3.2 Lack of High-Resolution Spatial Data

A lack of high-resolution spatial data also initially limited the types of maps produced at the beginning of the COVID-19 pandemic. Due to privacy concerns, public health agencies chose not to release data at a high resolution, limiting the detection of meaningful patterns. In North America, COVID-19 data was predominantly reported at low resolutions, with cases typically linked to the county, city, or state level (LA County Department of Public Health 2020; “Public Health - Seattle and King County” 2020; “NYC Coronavirus Disease 2019” 2020). While this data was useful for getting a big picture of the virus’ spread, it limited more nuanced types of analysis that would have allowed us to identify hot spots at a community level and subsequently allocate resources more appropriately. Moreover, it made the assumption that infected individuals are static beings that can be neatly assigned to a single area such as their county, city, or state, without ever leaving these boundary lines to shop for groceries, buy gas, or visit a close relative.

In spite of this, many GIScientists worked to confront this paucity of high-resolution spatial data, helping tap into the individual trajectory data of infected individuals as they moved about their daily lives (Rosenkrantz et al. 2021). For example, GPS, cell phone tower signals, or Wi-Fi connections can all be used to track and collect data on people’s daily trajectories. Together with data on COVID-19 infection status, individual trajectory data allowed researchers to hone their analyses in on actual hot spots, instead of large areas in which infected individuals may never have set foot. It also made contact tracing more efficient and effective than relying on memory alone.

While countries around the world used this technology with much success in controlling COVID-19 outbreaks by tracking their citizens, neither the United States nor Canada participated in this “big brother”-type surveillance due to obvious issues around privacy (Calvo et al. 2020). Instead, some researchers and the private sector focused their efforts on what is known as Volunteered Geographic Information (VGI). VGI is a term coined by the renowned geographer Michael Goodchild back in 2007 to describe the increasingly popular phenomenon of citizens engaged in the creation of geographic information (Goodchild 2007); essentially, in the case of COVID-19, citizens were volunteering their health and location data to actively surveil themselves.

The urgency of the COVID-19 pandemic drove the development of a number of local-scale VGI Web and mobile apps like “COVID Near You” (Fliesler 2020), “COVID symptom tracker” (2020), “Flatten” (2020), and “Private Kit: Safe Paths” (2020). Companies like Kinsa Health were also in the VGI space, and used existing technology in their smart home thermometers to collect volunteered health and location data to track feverish illness across the country (“US Health Weather Map” 2020) (Fig. 58.6). Though this data does not distinguish COVID-19 from other feverish illnesses, it is a robust database with the capability to detect and map abnormal spikes in fevers and is thus an excellent example of how VGI helped serve as an early indicator of COVID-19 hot spots at the community level (“US Health Weather Map” 2020; McNeil 2020). The major downside to VGI of course is getting enough users to embrace it. Even during a pandemic where the sense of urgency was high, user numbers often remained low.

Fig. 58.6 Kinsa’s map of influenza-type illness on March 31 (“US Health Weather Map” 2020)

4 Conclusion

Maps and spatial analytics played a critical role in our understanding of COVID-19. While there have been some blunders, the pandemic has shown us how important it is for geographers and spatial analysts to work with domain experts to optimize communication for the purposes of reaching a large audience and providing policy makers with reliable evidence. In addressing potential future pandemics, it is important that data, spatial analyses, and maps are based on defensible principles from cartography and spatial epidemiology.

What is covered in this chapter is just a small sampling of the possibility for spatial representations and analyses of COVID-19. There is still more that can be explored retrospectively by adopting a GIScience approach. We can and should continue to explore other important aspects of the pandemic, so that we can tell a more nuanced story of COVID-19.