Introduction

H1N1 is a respiratory infection that arose from the reassortment of avian, swine, and human influenza viruses (Alabama Department of Health 2009). It spreads from person to person when an infected individual coughs, sneezes, or talks. The H1N1 virus spread in Mexico and the United States in March and early April 2009. During the first few weeks of human-to-human transmission, the virus spread worldwide, affecting more than thirty countries. Consequently, the World Health Organization (WHO) declared the outbreak a pandemic in June 2009.

Influenza A(H1N1) is now a worldwide pandemic. However, because of the volume of data and analysis involved, the regional focus of this study is North and Central America at the national level. For Canada and the United States of America, provincial and state level data are also included. In addition, Canada has a geographic breakdown by District Health Areas (DHAs) (Arizona Department of Health Services 2009) and the US has a regional breakdown of cases. The varying levels of scale gave insight into significant areas and clusters. This paper discusses how GIS can be used for modeling, simulation, tracking and visualization of reported Influenza A(H1N1) cases. The three objectives of this study are to:

  1. Visually report H1N1 cases.

  2. Spatially analyze data for patterns, for areas of high and low impact, and for variation in case counts by location.

  3. Show the dispersion of H1N1 cases over time.

A pandemic can be characterized as the spread of a viral infection in the human population over a spatial extent greater than a region or nation. Pandemics typically affect large areas such as continents, and sometimes the whole world. The World Health Organization (WHO) bases its pandemic phase ranking on the spread of the disease and not on its severity. There are six pandemic phases, followed by post-peak and post-pandemic categories.

In Phase 1, a virus is circulating in animals and no human cases have been reported. In Phase 2, a virus circulating in animals has caused human infection. In Phase 3, small clusters of people have contracted the infection, but this has not resulted in human-to-human transmission sufficient to sustain community-level outbreaks. In Phase 4, human-to-human transmission has occurred in addition to animal-to-human transmission. Phase 5 involves human-to-human spread of the virus in at least two countries in one WHO Region; the WHO declared this phase on April 29th, 2009. Phase 6 adds to the Phase 5 criteria a community-level outbreak in at least one other country in a different WHO Region; the WHO declared this phase on June 11th, 2009. In the Post-Peak Phase, disease levels in countries with adequate surveillance drop below peak values. Finally, in the Post-Pandemic Phase, influenza activity returns to normally observed levels.

Although the H1N1 virus has a long history, it severely affected the human population in 2009. The pandemic was initially concentrated in Mexico, and within a few months it spread throughout the major regions of the world, affecting large populations. The spatiotemporal analysis in our study was carried out for the counties and States of the USA. The main factors considered were population density and the number of flights entering the USA from Mexico. It is important to note that, given the rate of international travel, a viral illness can cross the globe in a matter of hours. Using these factors, our aim was to summarize the spatial variation and spatial patterns in the spread of H1N1 in the United States, and to highlight areas of elevated or lowered risk in order to obtain clues about the aetiology of the disease.

Outbreak modeling

Geographic modeling of a pandemic virus allows location-related trends to emerge through geovisualization.

The modeling of the outbreak in this paper concentrates on the number of cases over time. Since Influenza A(H1N1) is a pandemic of moderate severity, the number of fatalities was not large enough at the time of data collection to build a visualization model. Visualizing the number of reported cases helps in determining how clusters disperse.

The literature published by the Bio.Diaspora project at St. Michael’s Hospital in Toronto, Canada, covered volumes of global air travel and the spread of Influenza A(H5N1), also known as avian influenza. That publication concentrated on quarterly case numbers and on international flight patterns between countries (Arizona Department of Health Services 2009). The project specifically looked at the number of cases nationally, as well as pig density, population, economic standing and healthcare factors (Arizona Department of Health Services 2009). The St. Michael’s project was used as an outline for the research presented here, with an emphasis on spatial visualization and analysis. The other external factors were given secondary consideration, both because these factors vary little between Canada and the US at the national level and because the speed at which the virus circulated forced the WHO to change its reporting standards on Influenza A(H1N1) (Alabama Department of Health 2009).

Geovisualization

Geovisualization is the geographic visualization of specific factors, together with the statistical analysis of those factors. Geovisualization was chosen as the preferred method of analysis because no patterns were found in part of the non-spatial data. In the early phases of Influenza A(H1N1), results based on population, size, density and socioeconomic factors were inconclusive. This led to the recognition that the geography of an area must bear some weight in the spread of the disease. Other geographic patterns may emerge in relation to extensive public transit systems, international airports, or the proximity of areas to one another over time. The element of time, and its effect on the total number of cases in an area, was also considered in the visualization.

A large number of human factors could influence the spread examined in this study. These variables are not discussed further. Nonetheless, it is important to recognize that factors such as population, age, sex, medical history, and environment play a fundamental role in the spread of H1N1.

Ultimately, the spread of infectious disease does not discriminate. Although we can assess factors that put a person at risk, this does not exclude the rest of the population from coming into contact with the virus. An allowance for randomness should be considered, as should the exclusion of outliers. Not every case can be explained by the variables laid out in a statistical model.

The importance of this work is that the visual analysis also takes human variables into account. The scope of the project is large, and the different geographic breakdowns provide a vast amount of information.

Materials and methods

Data collection

The collection and analysis of Influenza A(H1N1) data required a broad range of methods and exceptions in order to produce accurate results. Spatial data collection consisted of total case numbers, locations, and dates. Data were gathered from different levels of government throughout North and Central America. The first data were obtained from the World Health Organization (WHO), which provided case counts for countries worldwide (Alabama Department of Health 2009). These data were then filtered to highlight Canada, the US, Mexico and other Central American countries. The Pan American Health Organization (PAHO) is the WHO regional office that focuses on North, South and Central America (Centre for Disease Control and Prevention 2009). It provided accurate case counts for countries throughout North and Central America.

Within Canada and the US, federal government agencies collected state and local case counts. The Centers for Disease Control and Prevention (CDC), in the US, provided a case analysis at the state level (Colorado State Government 2009). The Public Health Agency of Canada (PHAC) provided a breakdown for the provinces and territories in Canada (Commonwealth of Kentucky 2009). The definitive sources for data collection were provincial and state government health departments (Department of Community Health 2009; Georgia Division of Public Health 2009; Gouvernement du Quebec 2009; Government of Alberta 2009; Government of New Brunswick 2009; Government of Newfoundland and Labrador 2009; Government of Northwest Territories 2009; Government of Nunavut 2009; Government of Ontario 2009; Government of Prince Edward Island 2009; Government of Saskatchewan 2009; Government of Yukon 2009; Idaho Department of Health and Welfare 2009; Iowa Department of Public Health 2009; Kahn 2009; Kansas Department of Health and Environment 2009; Louisiana Department of Health & Hospitals 2009; Maine Centre for Disease Control & Prevention 2009; Minnesota Department of Health 2009; Mississippi State Department of Health 2009; Missouri Department of Health and Senior Services 2009; Nebraska Department of Health & Human Services 2007; Nevada State Health Division 2006; New Hampshire Government 2009; New Mexico Department of Health 2009; New York State Department of Health 2009; North Carolina Department of Health and Human Services 2009; North Dakota Department of Health 2009; Ohio Department of Health 2009; Oregon Department of Human Services 2009; Pan Americas Health Organization 2009; Pennsylvania Department of Health 2009; Province of British Columbia 2009; Province of Manitoba 2009; Province of Nova Scotia 2009; Public Health Agency of Canada 2009; South Carolina Department of Health and Environmental Control 2009; South Dakota Department of Health 2009; State of Arkansas 2009; State of California 2007; State of Connecticut 2009; State of Delaware 2009; State of Hawaii 2009; State of New Jersey Department of Health and Senior Services 2009; Texas Department of State Health Services 2009; Virginia Department of Health 2009; West Virginia Bureau for Public Health 2009; Wisconsin Department of Health Services 2009; World Health Organization 2009; Wyoming Department of Health 2009). Regarding precise timing, a delay in case numbers was noticeable in the values reported by the WHO. The delay occurs because each level of government has a subset of geographic areas reporting to it. The reporting process is shown in Fig. 1. Other spatial data collected related to population, size and income. These data were extracted from ESRI Data and Maps 9 and ESRI Data and Maps 9.3. All geographic data were in vector format, as either polygons or points.

Fig. 1 The study area used for spatiotemporal analysis

The non-spatial data consisted of news reports and briefings referring to populations of infected people, reporting periods, regular phase updates and progress on vaccination. These updates came from the spatial data sources listed above, as well as from news organizations such as The Canadian Associated Press and Canadian Television Online, and from the LinkedIn H1N1 Alliance Group.

Study area

The collected data described above were examined to extract the sample used for spatial analysis. Two major criteria were used to select the data and the corresponding study area: the first was the number of deaths, and the second was the availability of H1N1 related data. Initially, Canada and the USA were both chosen for the study. However, at the preliminary stage it was found that data for Canada were not readily available; the data that were available were very coarse, and in many cases lacked a spatial attribute. The United States of America, by contrast, satisfied both criteria and was chosen as the study area. Figure 1 presents an overview of the study area; the areas highlighted in red are the states where H1N1 related deaths occurred.

Visual analytics

The spatial analyses in this study included visualization and exploratory data analysis; they were carried out using ArcGIS 9.3, ArcScene and the Spatial Analyst Extension. Oculus GeoTime, on the other hand, was used to study the temporal variation in the data. Figure 2 below outlines the methodology used to investigate the H1N1 related deaths.

Fig. 2 Methodology Process used in the Research

Through data visualization, trends and patterns in the data can be detected readily, and spatial outliers and significant areas can be identified. Spatial visualization was the first step in the analysis: the deaths were plotted on a map of the US to show their spatial spread. To make the visualization more effective, the Point Density tool was used to create a density raster for the H1N1 deaths. The density raster was then exported to ArcScene to visualize the results in 3D, which made the pattern of deaths across US States easier to see. The second step involved exploratory data analysis, which can not only identify unusual patterns but, by formulating relationships between variables, can also help explain the cause of a pandemic.
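As an illustration of the density step, the following minimal Python sketch counts the points that fall within a circular neighbourhood of each raster cell and divides the count by the neighbourhood area. It is a simplified stand-in for the idea behind a point density raster, not the ArcGIS Point Density tool itself; the function name `point_density`, its parameters, and the sample coordinates are hypothetical and are not taken from the study data.

```python
import math


def point_density(points, xmin, ymin, cell_size, n_cols, n_rows, radius):
    """Return a grid (list of rows) of point counts per unit area.

    For each cell, count the points whose distance to the cell centre is
    at most `radius`, then divide by the area of that circular neighbourhood.
    Row 0 is the row nearest `ymin` (bottom-up ordering, an assumption of
    this sketch; GIS rasters are often stored top-down).
    """
    area = math.pi * radius ** 2
    grid = []
    for row in range(n_rows):
        cy = ymin + (row + 0.5) * cell_size  # y of the cell centre
        grid_row = []
        for col in range(n_cols):
            cx = xmin + (col + 0.5) * cell_size  # x of the cell centre
            count = sum(
                1 for (px, py) in points
                if math.hypot(px - cx, py - cy) <= radius
            )
            grid_row.append(count / area)
        grid.append(grid_row)
    return grid


# Hypothetical death locations in arbitrary planar units (not study data).
deaths = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (4.5, 4.5)]
density = point_density(deaths, xmin=0.0, ymin=0.0, cell_size=1.0,
                        n_cols=5, n_rows=5, radius=1.0)
```

In this toy grid, cells near the cluster of three points receive the highest values, which is the effect that makes a density raster easier to read than individual dots.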

Results and discussion

Spatio-temporal analysis of sampled data

During the epidemic period, 224 deaths were recorded; cases were registered in various States, with the major concentrations in California and New York. Figure 3 below presents an overview map of the deaths recorded in different States, with each dot representing one death.

Fig. 3 H1N1 Deaths throughout the United States

Over the span of 3 months, the largest concentrations of deaths were recorded in California and New York, while the States with the smallest number of deaths were Minnesota and Maryland. Figure 6 shows the overall variation in deaths across the country, while Table 1 presents a detailed breakdown of the deaths.

Table 1 Number of Deaths reported in various States and Counties

Temporal results

May had 12 deaths, the lowest monthly total; June recorded more than half of the deaths reported in these three months, with a toll of 107. June also recorded the highest number of fatalities in one day, with 18 reported cases; July, on the other hand, had 78 deaths. Through visual and temporal inspection, two major epidemic phases were observed: June 14 to June 19 with 51 deaths and July 1 to July 9 with 63 deaths. Overall, however, the temporal distribution varied, and no apparent pattern was observed apart from these two peaks. Figures 4, 5 and 6 provide an overview of the 4D spatiotemporal visualization of the data.
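The kind of temporal summary described above (monthly totals and the busiest run of consecutive days) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names `deaths_per_month` and `busiest_window` are hypothetical, and the dates are invented for the example and are not the study records.

```python
from collections import Counter
from datetime import date, timedelta


def deaths_per_month(dates):
    """Count records per (year, month)."""
    return Counter((d.year, d.month) for d in dates)


def busiest_window(dates, days):
    """Return (start_date, count) for the `days`-long window with most records.

    Ties are resolved in favour of the earliest window.
    """
    if not dates:
        raise ValueError("no dates supplied")
    per_day = Counter(dates)
    best_start, best_count = None, -1
    current, last = min(dates), max(dates)
    while current <= last:
        count = sum(per_day[current + timedelta(days=k)] for k in range(days))
        if count > best_count:
            best_start, best_count = current, count
        current += timedelta(days=1)
    return best_start, best_count


# Hypothetical dates of death (illustrative only, not the study data).
records = [
    date(2009, 5, 28),
    date(2009, 6, 14), date(2009, 6, 15), date(2009, 6, 15), date(2009, 6, 19),
    date(2009, 7, 2), date(2009, 7, 3),
]
monthly = deaths_per_month(records)
window_start, window_count = busiest_window(records, days=6)
```

A fixed-length sliding window is a simple design choice here; the two phases reported in the text have different lengths, so a real analysis would need to choose or vary the window accordingly.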

Fig. 4 Spatiotemporal variation of H1N1 cases in May 2009

Fig. 5 Spatiotemporal variation of H1N1 cases in June 2009

Fig. 6 Spatiotemporal variation of H1N1 cases in July 2009

Next, to enhance the visualization, a density raster was created for all the deaths. Figure 7 shows the density raster generated with the Point Density tool in ArcGIS. The raster was symbolized so that the most affected areas appear dark red and the least affected areas dark green, with the areas in between following the colour ramp accordingly.

Fig. 7 Death Density across the USA

Further, to make the visualization more effective, ArcScene was used to create a 3D model (Fig. 8) based on the density raster created above. The 3D model shows two major peaks, one over California and the other on the East Coast over New York, New Jersey, and Connecticut.

Fig. 8 3D model of death density across the USA

In the next step, exploratory data analysis was applied. Two maps were generated to help explore and understand the pattern of deaths. These maps were also important in determining the factors behind high and low death counts across the United States. Figure 9 shows population density at the county level for the United States. The map was generated using the Natural Breaks classification method. The colour scheme runs from dark green for low population density to dark red for high population density.

Fig. 9 Population Density of Counties in the USA
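The Natural Breaks method groups values so that the variation within each class is small. The sketch below is a minimal Python illustration, assuming the classic Jenks/Fisher formulation that minimises the total within-class sum of squared deviations. It is not claimed to reproduce the exact ArcGIS implementation, and the function name `natural_breaks` and the density values are hypothetical.

```python
def natural_breaks(values, n_classes):
    """Return the upper bound of each class for an optimal 1-D partition.

    Dynamic programming over the sorted values: cost[k][j] is the smallest
    total within-class sum of squared deviations when the first j values
    are split into k contiguous classes.
    """
    data = sorted(values)
    n = len(data)
    if not 1 <= n_classes <= n:
        raise ValueError("n_classes must be between 1 and len(values)")

    # Prefix sums give the squared deviation of any slice in O(1).
    prefix, prefix_sq = [0.0], [0.0]
    for v in data:
        prefix.append(prefix[-1] + v)
        prefix_sq.append(prefix_sq[-1] + v * v)

    def ssd(i, j):
        """Sum of squared deviations from the mean for data[i:j], j > i."""
        s = prefix[j] - prefix[i]
        return (prefix_sq[j] - prefix_sq[i]) - s * s / (j - i)

    inf = float("inf")
    cost = [[inf] * (n + 1) for _ in range(n_classes + 1)]
    split = [[0] * (n + 1) for _ in range(n_classes + 1)]
    cost[0][0] = 0.0
    for k in range(1, n_classes + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):  # last class is data[i:j]
                c = cost[k - 1][i] + ssd(i, j)
                if c < cost[k][j]:
                    cost[k][j] = c
                    split[k][j] = i

    # Walk back through the stored split points to recover class bounds.
    bounds = []
    j = n
    for k in range(n_classes, 0, -1):
        bounds.append(data[j - 1])
        j = split[k][j]
    return bounds[::-1]


# Hypothetical county densities (illustrative values only).
densities = [52, 1, 11, 3, 50, 10, 2, 12, 51]
class_upper_bounds = natural_breaks(densities, n_classes=3)
```

Each returned value is the largest member of its class, so a county is assigned to the first class whose upper bound is not smaller than its density.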

The other map generated for exploratory analysis shows the flight paths from Mexican airports to airports in the USA (Fig. 10).

Fig. 10 Air flight routes from Mexican Airports to US States

This map was used to identify high and low concentrations of flights from Mexico to the US. This information was then used in conjunction with the H1N1 deaths to explore the relationship between the deaths and the number of flights in a given State. The map revealed California as the State receiving the most flights, while Maryland, Massachusetts, Pennsylvania, Washington and Wisconsin are at the bottom of the list. Table 2 below presents the number of flights entering US States from Mexico.

Table 2 Number of deaths and presence of International Airports in the States

The final step in the analysis combined all the data generated above to look for a pattern or relationship between H1N1 related deaths, population density and the number of flights from Mexico. The first relationship explored was that between the number of deaths and population density, shown in Fig. 11 below.

Fig. 11 Relationship between H1N1 deaths and population density

The second relationship was between the death toll and the number of flights entering US States from Mexico (Fig. 12). The relations between the variables and the patterns observed are discussed in detail in the next section of the report.

Fig. 12 Relationship between flights and H1N1 deaths

The major focus of the project was to analyze the spread of the H1N1 pandemic in the USA through spatial analysis and related mapping. By generating various cartographic models, clues about the association between location and the presence of deaths were analyzed. The analysis was divided into two parts: visualization and exploratory analysis.

Visualization plays a significant role in exploring data, trends and patterns, and can present complex relationships. Moreover, since people tend to learn more efficiently through visual means, visualization provides a clear and concise understanding of a phenomenon and reveals relationships that might otherwise go unnoticed. In our case, the main objective was to depict the H1N1 data so as to show the variation in occurrences over the study area. A spatial database containing information on the deaths was built to support the visualization; all the recorded deaths were then plotted on an overview map, as shown in Fig. 10. This step provided the foundation for the exploratory analysis.

The exploratory analysis took an analytical approach to the intricacies of the H1N1 epidemic. The high and low concentrations of deaths in the study area were analyzed using population density and the number of flights from Mexico to the US. Through this step, the “why” question regarding the deaths was explored. The visualization models discussed above played a major role in devising an approach that revealed a clear pattern in the deaths. The first relationship explored through spatial analysis was that between the number of deaths and population density. Two different map scales were used for population density: the county level and the State level. At the county level, the majority of the deaths took place in areas with high population density, as shown in Fig. 12, with a few exceptions. There can be many reasons why some deaths occurred in areas with low density; however, since such cases are few, they were not given further weight.

However, when the results were compared at the State level, a perplexing situation emerged: although New Jersey had the highest population density, it had only the third highest death count. Similarly, New York is more densely populated than California, yet California had more deaths than New York. Massachusetts and Maryland are more densely populated than Florida, yet Florida had more deaths than either of these States. Lastly, Pennsylvania is denser than Illinois, yet Illinois had more deaths than Pennsylvania. Figure 13 below shows a weak relationship between the number of deaths and population density at the State level.

Fig. 13 Relationship between deaths and population density

At this point, the second relationship was analyzed to study the cause of the death pattern. It was explored using the number of deaths and the number of flights entering the USA from Mexico; Fig. 13 visually represents the flights from Mexico to States in the USA. The highest number of flights entering the USA from Mexico arrives in California, with 45 flights, followed by Texas, Illinois and Florida with 22, 13, and 8 flights, respectively. Figure 14 below highlights the relationship between the two variables.

Fig. 14 Relationship between deaths and flights

Like the previous relationship, this one was not strongly correlated. However, exploring these two relationships showed that multiple variables are involved in the spread of a pandemic and that a multivariate approach is therefore needed to explain it. Rather than looking for a pattern or trend in the deaths through a single variable, an approach that uses both variables together yields more meaningful results.
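The strength of a pairwise relationship such as deaths against population density, or deaths against flights, can be quantified with a correlation coefficient. The study assessed these relationships visually; the following is only a sketch of how Pearson's r could be computed, with a hypothetical function name `pearson_r` and invented State-level values that are not the study data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical State-level values (illustrative only, not the study data).
population_density = [90.0, 240.0, 410.0, 1200.0]
death_counts = [30, 12, 13, 16]
r = pearson_r(population_density, death_counts)
```

A value of r near zero would support the description of a weak relationship, while values near 1 or -1 would indicate a strong linear association.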

Based on the concept of multivariate analysis, population density and the number of flights entering the USA were used jointly to explain the death pattern. Since California had the highest number of flights entering from Mexico, it experienced more fatalities than any other State. The case of New York, with the second highest death count, and New Jersey, with the third highest, was peculiar at first. However, on closer analysis a clear pattern emerged.

New York has a lower population density than New Jersey, and more flights enter New Jersey from Mexico than New York. Logically, New Jersey should have had more deaths based on the two variables underlying our exploratory analysis. However, that was not the case. It was found that the majority of the deaths in New York occurred in counties with a higher population density than the affected counties of New Jersey. While the deaths in New Jersey were spread throughout the State, the deaths in New York were concentrated in two or three counties. New York had 37 deaths in total across eight counties; New Jersey, on the other hand, had 16 deaths across ten counties. In New York, 67.5 % of the deaths occurred in only 3 counties (Bronx, Queens, and Nassau); two of these, Bronx and Queens, have among the highest population densities in the United States. In New Jersey, the county with the largest number of deaths was Middlesex with three deaths, representing 18 % of the deaths in the State; the population density of Middlesex is one of the lowest among all the counties with deaths in New York and New Jersey. These findings were encouraging, as they explained the pattern of deaths in New York and New Jersey.

Florida and Illinois had the fourth-highest number of fatalities, with 13 deaths each. Florida has a higher population density than Illinois; however, more flights from Mexico enter Illinois. The remaining cases are similar: death counts are high or low depending on the number of flights entering from Mexico or on population density. Table 3 below lists the number of fatalities, population density and flight information.

Table 3 Data related to deaths, population density and number of flights

The multivariate intricacies in the data could have been explained more thoroughly using a statistical package such as SPSS or R. However, because of time limitations, no such quantitative methods were applied.
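One quantitative option of the kind alluded to here is a multiple linear regression of deaths on both explanatory variables. The sketch below is not part of the study; it assumes an ordinary least-squares model, deaths = b0 + b1 × density + b2 × flights, solved through the normal equations in plain Python. The function names and the synthetic values (constructed so that the relationship holds exactly) are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_two_predictor_model(density, flights, deaths):
    """Least-squares fit of deaths = b0 + b1*density + b2*flights."""
    rows = [[1.0, d, f] for d, f in zip(density, flights)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
           for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, deaths)) for i in range(3)]
    return solve_linear_system(xtx, xty)


# Synthetic values built so that deaths = 2 + 0.5*density + 1.0*flights
# exactly (illustrative only, not the study data).
density = [1.0, 2.0, 3.0, 4.0, 5.0]
flights = [2.0, 1.0, 4.0, 3.0, 6.0]
deaths = [4.5, 4.0, 7.5, 7.0, 10.5]
b0, b1, b2 = fit_two_predictor_model(density, flights, deaths)
```

With real State-level data the fit would not be exact, and the signs and sizes of b1 and b2 would indicate how each variable contributes once the other is held fixed. Because death counts are small integers, a count model such as Poisson regression might be more appropriate in practice; the linear fit is used here only to keep the example short.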

Conclusions

GIS and spatial analysis are extremely powerful and useful in epidemiology. Through GIS, the spatial relationship between a disease and the factors that drive its spread can be readily determined. GIS provides powerful tools for visualization and for the analytical side of research and data exploration. In this paper, the use of GIS in understanding the H1N1 deaths in the US was explored.

The study was divided into two parts. In the first part, the data were assembled spatially and visualized to reveal areas with low and high concentrations of deaths. Through visualization and the spatial database, the number of fatalities and their spatial and temporal information were derived. The information generated in this part was the basis for the exploratory data analysis.

In the second part of the study, the data were analyzed thoroughly to investigate the pattern of deaths across States in the USA. At this stage the data were mostly analyzed at the State level, although in some cases they were also analyzed at the county level. The pattern of deaths was explored using two variables: population density and the number of flights entering the USA from Mexico. Initially, the two variables were analyzed individually; when those results proved unsatisfactory, the multivariate approach was used. With this method the results of the exploratory analysis were satisfactory.

There were a few practical limitations in the study; the major one concerned data quality and availability. Temporal and spatial data about the H1N1 pandemic were not available from any reliable source. The spatial data were compiled from two or three different sources, and there is a slight chance that discrepancies are present in the data or were introduced during analysis. However, based on the information that was collected, the results were satisfactory.

Finally, this study showed the use of GIS in public health to be very effective. It is helpful for surveillance and tracking, policy planning, and decision making. The study demonstrated that GIS makes it possible to locate areas with high or low disease concentration. Based on this information, a plan can be devised to mitigate the threat, either by alerting people in the affected vicinity or by identifying suitable areas to target for vaccination. Moreover, GIS can also locate populations at risk, identify areas in need of resources, and support decisions on resource allocation.