Introduction

Much of the data that we generate can be related to a location on Earth. An object’s location can be as simple as the place where it was generated or the place where it can be found. The purpose of keeping track of location varies with the application and may change over time. The location itself need not be static; it can be dynamic, as in the case of an automobile in motion. How complicated the handling of location becomes depends on the scope of the functionality of our application.

Geographic information is inherently tied to a location on Earth, expressed with reference to a coordinate system. To assign a specific location to an object, we therefore require a representation of Earth. One option is a sphere with a radius of approximately 6,400 km, on which coordinates are measured as latitude and longitude. Latitude is the angular distance of a point north or south of the equator. Longitude is the angular distance east or west of a reference meridian, commonly the Greenwich meridian. Height is another variable to take into account, although it is more difficult to measure with high precision. A variation of this representation uses an ellipsoid instead of a sphere in order to obtain a closer approximation of Earth for our geographic coordinate system.

Geographic data is identified by its spatial component and is produced by different sources. It may come from a source located above the Earth, such as a remote sensing satellite or an airplane taking an aerial image. It could be data coming from a GPS satellite, which we use to calculate our location. It could even be data from a sensor on the ground, such as the electronic total station used by civil engineers. An electronic total station is an electronic distance measurement (EDM) instrument that can be seen as the modern counterpart of the theodolite, which is used to measure angles and directions, among others. Whatever the source, once data has been acquired, it is important to store it. Database management systems have evolved to make room for spatial data. Nowadays, these systems not only store this type of data but also allow users to formulate queries that evaluate objects on the basis of spatial relations. In this way, we can find information that is useful for decision making in different applications.

The evolution of spatial databases did not stop with these capabilities. It is in this context that geographic information systems (GISs) were developed to create, visualize, and manipulate spatial data. A GIS is composed of a spatial database, a graphical user interface, and a set of tools to manipulate spatial data. Furthermore, many GISs are built to work on the web so that multiple users can obtain the benefits of a web application.

Once a GIS has been built and data has been collected, there is a treasure to be exploited: valuable information that can be used for decision making. There are different ways to analyze the data and information stored in a GIS. One consists of overlaying layers, which are a way of organizing different types of data (as we will see in the following sections). Another is the use of spatial queries. A third is processing specific to the problem, as in the creation of a network flow model. A fourth is data mining analysis.

Geomatics is defined as the field of study concerned with gathering, storing, processing, and managing spatially georeferenced data and with delivering geographic information. These topics, among others, are covered in the following sections. The rest of the chapter describes in more detail the conceptual framework of a GIS, its interactions with remote sensing data, some of its applications, the role of GIS in decision making, current trends in GIS, and conclusions.

GIS Conceptual Framework

The previous section gave the broad picture of GIS. We now concentrate in more detail on the concepts that support the theory of this multidisciplinary area. Let us start by defining a GIS as the collection of components necessary to store spatial data and to manipulate it in order to create spatial products (see Fig. 1). The components of a GIS can be classified into three main categories. The first is computer hardware, which stores data and software. The second is computer software, which is required to manipulate data and create valuable products. The last and most valuable component is geographic data. In order to manage geographic data, database management systems (DBMS) have been extended to deal with the spatial component. There are both commercial and open source tools with such an extension. An example of an open source DBMS with such an extension is PostgreSQL, whose spatial component is called PostGIS. For more information about PostGIS, please refer to http://postgis.refractions.net/.

Fig. 1
Main components of a geographic information system: computer hardware, software, and geographic data. Computer hardware is the data and software repository. Software is used to manipulate geographic data and to create valuable products used for decision making. Data is the heart of a GIS, its raw material

GIS Sources of Data

There are different sources of data that we can store in a GIS (see Fig. 2). Much of the information was already available when GISs came onto the market. This was the case of paper maps, which were digitized and loaded into them. Data obtained from field work is another important source: although it is not an efficient way to collect data, sometimes it is the only way to obtain a specific variable. Satellite and aerial images are yet another source of data for GIS. This type of data requires different preprocessing steps, depending on the purpose of the GIS. As we discussed before, information coming from global navigation satellite systems, such as GPS, can also be stored in a GIS (Lee 2001a). These are some of the sources of data for a GIS, but there may be more, depending on the spatial components of our GIS. Once we have identified these spatial components, we ask ourselves what the source of any other information that we require is. The answer is directly related to the problem that our GIS application is intended to solve.

Fig. 2
Some of the sources of data of a geographic information system. (a) Paper maps, (b) data coming from GPS satellites, (c) data coming from field work, and (d) data coming from satellite and aerial images

Vector and Raster Models

In order to introduce data into a GIS, we need to know how it is organized. There are two models to store data in a GIS, known as the vector and raster models. The goal is to model the real world at a level of abstraction with enough detail to be mapped to the GIS. We do not store every detail of reality, only as much as we require. Let us assume for now that we are capturing this data from satellite images, although, as we know, we could also obtain it from field measurements or from other sources.

In the vector model, each object in the real world is represented with one of three possibilities. These are a point, a line, or a polygon. Figure 3 shows an example of a road, a school, and a parcel represented with a line, a point, and a polygon, respectively. An important characteristic of this model is that the spatial relations between different objects can be captured. These are known as topological, metric, and direction relations. More information about the topic can be found in Koperski and Han (1999).

Fig. 3
GIS vector model. Partial representation of the downtown of a city. A parcel for sale is represented with a polygon, main road with a polyline, and central school with a point

In the raster model, data is represented with a grid of cells. Each cell in the grid corresponds to a pixel in the image and is classified as a type of object. In the previous example, the road fills the cells of the grid that overlap with its line. The school is represented with one cell, and the parcel for sale is represented by a set of cells (an area), as can be seen in Fig. 4.

Fig. 4
GIS raster model. Partial representation of the downtown of the city shown in Fig. 3, modeled here with the raster model instead of the vector model. The parcel for sale can be identified by the cells filled in blue. The main road can be identified by the cells also filled in blue. Finally, central school corresponds to the cell filled in red
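To make the relation between the two models concrete, the following minimal sketch converts a vector polygon into raster cells. It is an illustration only: the parcel coordinates, the grid size, and the rule "a cell is filled when its center lies inside the polygon" are assumptions chosen for the example, not values taken from Figs. 3 and 4.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def point_in_polygon(p: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: True if p lies strictly inside the polygon."""
    x, y = p
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through p can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize_polygon(polygon: List[Point], rows: int, cols: int,
                      cell_size: float = 1.0) -> List[List[int]]:
    """Return a grid with 1 where the cell center falls inside the polygon."""
    grid = []
    for r in range(rows):
        row = []
        for c in range(cols):
            center = ((c + 0.5) * cell_size, (r + 0.5) * cell_size)
            row.append(1 if point_in_polygon(center, polygon) else 0)
        grid.append(row)
    return grid


# Hypothetical parcel given as a vector polygon (x, y vertices).
parcel = [(1.0, 1.0), (4.0, 1.0), (4.0, 3.0), (1.0, 3.0)]
grid = rasterize_polygon(parcel, rows=4, cols=5)
```

The vector polygon needs only four vertices, whereas the raster version stores one value per cell; this trade-off between compactness and a uniform grid is one reason both models coexist in a GIS.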

So far we have seen how data is structured to be stored in a GIS. However, we are missing one of the fundamental concepts that make a GIS so useful: its capability to manage locations. We need to be able to attach a location to the objects that we store. For this, we require a georeference system, as we describe in the following section.

Georeferencing

When we load an image into a GIS, we want to identify the location of the objects in it. To do so, we need to georeference the image, that is, to establish a correspondence between locations in the image and a map projection using a coordinate system. For this, we introduce some concepts about Earth. We are first concerned with the measurement of Earth (Gelati 2006). The division of science that deals with it is known as geodesy or geodetics. The parts of geodesy that concern GIS are the reference Earth shape or ellipsoid; geodetic positioning or the geodetic datum and coordinates; the true Earth shape or geoid and the vertical datum; and the practical representation of the Earth or map projections.

The Ellipsoid

We use an ellipsoid as a good approximation of the shape of the Earth. It is the surface generated by rotating an ellipse around its minor (polar) axis. Different ellipsoids have been proposed. One of the most common is WGS84, whose parameters are a semi-major (equatorial) axis of 6,378,137.00 m, a semi-minor (polar) axis of 6,356,752.3142 m, and an inverse flattening of 298.257223563. The ellipsoid does not represent the shape of the Earth exactly: there are mountains that rise above the surface of the ellipsoid, and there are places below sea level that lie beneath it.
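The three WGS84 parameters quoted above are not independent. As a quick check, the sketch below derives the flattening f = (a - b) / a and the squared first eccentricity from the two semi-axes; the variable names are ours, and the small mismatch in the last digits comes from the rounding of the polar semi-axis.

```python
# WGS84 semi-axes in metres, as quoted in the text.
SEMI_MAJOR_M = 6_378_137.0       # equatorial semi-axis, a
SEMI_MINOR_M = 6_356_752.3142    # polar semi-axis, b

# Flattening measures how much the ellipse is squashed at the poles.
flattening = (SEMI_MAJOR_M - SEMI_MINOR_M) / SEMI_MAJOR_M
inverse_flattening = 1.0 / flattening

# Squared first eccentricity, used in many geodetic formulas.
eccentricity_sq = 1.0 - (SEMI_MINOR_M ** 2) / (SEMI_MAJOR_M ** 2)

print(f"1/f = {inverse_flattening:.6f}")
print(f"e^2 = {eccentricity_sq:.10f}")
```

The polar semi-axis is only about 21.4 km shorter than the equatorial one, which is why a sphere is often an acceptable first approximation.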

Mean Sea Level (Geoid)

As we can note, the shape of the Earth is very irregular. No geometric body has the exact shape of the Earth, which is why the Earth’s shape received a name of its own, the geoid. The geoid is an irregular equipotential surface that coincides with mean sea level over the oceans. It also has an imaginary continuation across the continents, and it undulates because of the irregular distribution of the mass of the planet. The geoid is used as a reference surface for leveling, that is, we measure elevation relative to the geoid (Li and Götze 2001). A more detailed description of the geoid concept can be found in Mok and Chao (2001). Figure 5 illustrates how the geoid differs from the ellipsoid.

Fig. 5
Difference between the geoid and the ellipsoid. The ellipsoid is a smooth geometric shape. The geoid passes over or under the ellipsoid depending on the irregular distribution of the gravitational mass forces of Earth. We can also appreciate how the topography of Earth differs from that of the geoid and the ellipsoid

A geodetic datum is defined as a reference model that associates a geodetic reference ellipsoid (given by its parameters: equatorial axis, polar axis, and inverse flattening) with a coordinate system (defined in geodetic space through orientation, position, and scale). A geodetic datum is thus a mathematical model of Earth (Gelati 2006). There are two types of datums: geocentric and local. A geocentric datum is globally centered and is a good approximation of the whole Earth. In this case, the center of the reference ellipsoid coincides with the Earth’s center of mass (Gelati 2006), as we can see in Fig. 6. We can also see that the geoid (or mean sea level) and the reference ellipsoid agree closely in general.

Fig. 6
The geocentric datum. In a geocentric datum, the center of mass coincides with the center of the reference ellipsoid. It is a good approximation of the whole Earth, but there can be more accurate approximations for particular regions (This figure was adapted from Fig. 7.1 in Gelati (2006))

A geocentric datum is best suited for global applications; GPS, for example, uses the WGS84 geocentric datum. In contrast, a local geodetic datum better suits a particular region, where its reference ellipsoid fits the Earth’s shape more closely. In this case, the center of the ellipsoid does not always coincide with the Earth’s center of mass. Because of this, a local geodetic datum does not provide a good global representation of the Earth. Figure 7 shows how the ellipsoid fits the Earth better in a local region than in the rest of it.

Fig. 7
The local geodetic datum. In a local geodetic datum, the center of mass does not coincide with the center of the reference ellipsoid. It is a good approximation of a region of the Earth, but it is not for the whole Earth. We can see this in the figure because the geoid and the ellipsoid have better adjustment in a particular region of Earth than the rest of it (This figure was adapted from Fig. 7.2 in Gelati (2006))

Having described the ellipsoid, the geoid, and the datum, we now describe the geographic coordinate system, which is based on the Earth’s rotation around its center of mass. We can determine the geographical coordinates of any point on the Earth’s surface from its latitude and longitude. The Earth’s center of mass lies on its rotation axis. The plane that passes through the center of mass and is perpendicular to the axis defines the equator. Latitude is defined as the angle, measured along the meridian, between the equator and the parallel that passes through the point. It is always north (N) or south (S), so the maximum latitude is 90°. Longitude is the geographic coordinate that defines the east (E) or west (W) position of a point on the Earth’s surface. It is the angle measured east or west between the plane containing the prime meridian (Greenwich) and the plane containing the North Pole, the South Pole, and the location in question (Longley et al. 2005) (Fig. 8).

Fig. 8
Geographic coordinate system. Latitude is measured north or south from the equator, while longitude is measured east or west from the Greenwich meridian (Figure adapted from (http://services.arcgisonline.com/arcgisexplorer500/help/latlong_from_globe_center.png))
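As an illustration of how latitude, longitude, and height on a reference ellipsoid determine a position relative to the Earth’s center of mass, the sketch below applies the standard conversion from geodetic coordinates to Earth-centered Cartesian coordinates on WGS84. The function name and the axis convention (X through the intersection of the equator and the Greenwich meridian, Z through the North Pole) are our assumptions for the example; this conversion is not described in the chapter itself.

```python
import math

WGS84_A = 6_378_137.0          # semi-major axis in metres
WGS84_INV_F = 298.257223563    # inverse flattening


def geodetic_to_ecef(lat_deg: float, lon_deg: float, height_m: float = 0.0):
    """Convert latitude/longitude (degrees) and ellipsoidal height (m) to X, Y, Z (m)."""
    f = 1.0 / WGS84_INV_F
    e2 = f * (2.0 - f)  # squared first eccentricity
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    # Radius of curvature in the prime vertical at this latitude.
    n = WGS84_A / math.sqrt(1.0 - e2 * math.sin(lat) ** 2)
    x = (n + height_m) * math.cos(lat) * math.cos(lon)
    y = (n + height_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - e2) + height_m) * math.sin(lat)
    return x, y, z


# A point on the equator at the Greenwich meridian lies on the X axis.
print(geodetic_to_ecef(0.0, 0.0))
# The North Pole lies on the Z axis, at the polar semi-axis distance.
print(geodetic_to_ecef(90.0, 0.0))
```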

Although a globe has the advantage of being a realistic, ellipsoidal representation of Earth, it has some disadvantages. For example, it is impossible to observe the entire terrestrial surface at the same time; we can only see one of its faces but not the other. This does not happen with a map, on which we can show the whole world. A globe is also not as easy to handle as a map, and we cannot change its scale in a practical way as we do with a paper map. Of course, paper maps have disadvantages of their own that a terrestrial globe does not have, such as the geometric deformations introduced by the map projections that were used to create them. We now describe what a map projection is.

Map Projection

A map projection is defined as a mathematical transformation between the geographic coordinate system (in latitude and longitude) and a coordinate system on a plane. There are two common planar coordinate systems: the Cartesian system (with X, Y coordinates) and the polar system (with range and angle coordinates). The problem with this transformation is that it introduces three main types of distortion: length, area, and angular distortions. No single map projection can preserve length, area, and angles at the same time. A length distortion, for example, means that a length measured on a map does not correspond to the length of the same feature measured in the real world. As a consequence, calculations based on plane geometry and trigonometry with Cartesian coordinates do not, in general, give correct results on projected data. Angles measured on the map can be in error as well. For more information about map projections, please refer to Lee (2001b) (Fig. 9).

Fig. 9
Map projection. The geographic coordinate system using latitude (ϕ) and longitude (λ) is projected into a plane coordinate system using X and Y coordinates, in this case, a Cartesian coordinate system (This figure was adapted from the online course of geography, lesson 7: A deeper understanding of coordinate systems and projections (https://www.e-education.psu.edu/geog486/l7_p9.html))
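A small worked example helps to see the length distortion. The sketch below uses the spherical Mercator projection, chosen here only because its formulas are short; the chapter does not single out any particular projection, and the mean Earth radius is an assumed value. Mercator preserves angles locally but stretches lengths by a factor of 1 / cos(latitude).

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed mean radius of a spherical Earth


def mercator_forward(lat_deg: float, lon_deg: float,
                     radius: float = EARTH_RADIUS_M):
    """Project latitude/longitude (degrees) to planar X, Y (metres)."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    x = radius * lon
    y = radius * math.log(math.tan(math.pi / 4.0 + lat / 2.0))
    return x, y


def mercator_scale_factor(lat_deg: float) -> float:
    """Ratio of a short length on the map to the same length on the sphere."""
    return 1.0 / math.cos(math.radians(lat_deg))


for latitude in (0.0, 30.0, 60.0):
    print(latitude, round(mercator_scale_factor(latitude), 3))
```

At the equator lengths are preserved, but at 60° of latitude a feature appears twice as long on the map as it is on the sphere, so distances computed directly from the X, Y coordinates would be overestimated.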

So far, we have studied the basics of GIS. We know what a GIS is, its sources of data, and the vector and raster models. We are able to recognize spatial data, and we learned the difficulties involved in approximating the shape of the Earth. We can also find the coordinates of a location on Earth. After a brief look at metadata, we will learn about the interactions between remote sensing and geographic information systems as a way to improve each other’s capabilities.

Metadata

Metadata is commonly referred to as data about the data. It is the information that we use to document our data. Metadata thus describes the parameters necessary to work with spatial data, such as the data owner, source, resolution, and scale. Metadata can be recorded in different formats such as ASCII, HTML, Extensible Markup Language (XML), Standard Generalized Markup Language (SGML), and Resource Description Framework (RDF). In order to make geospatial products and tools compatible, a great effort has been made to create standards for geospatial data. Some of the available standards for geospatial metadata are (Gelati 2006; ISO 2011):

  • ISO 19139 Geographic Information Metadata XML Schema Implementation

  • ISO 19115 Geographic Information Metadata

  • Content Standard for Digital Geospatial Metadata

  • Dublin Core Metadata Element Set

  • Australian Government Locator Service

  • UK GEMINI Discovery Metadata Standard
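As a rough illustration of metadata recorded in XML, the sketch below builds a small record with the four parameters mentioned above (owner, source, resolution, and scale). The element names and the sample values are hypothetical and deliberately simple; they do not follow the schema of ISO 19115, ISO 19139, or any of the other standards listed.

```python
import xml.etree.ElementTree as ET


def build_metadata(owner: str, source: str, resolution_m: float, scale: str) -> str:
    """Return a minimal XML metadata record as a string (illustrative element names)."""
    root = ET.Element("metadata")
    ET.SubElement(root, "owner").text = owner
    ET.SubElement(root, "source").text = source
    # The unit is stored as an attribute so that the value stays numeric.
    ET.SubElement(root, "resolution", unit="m").text = str(resolution_m)
    ET.SubElement(root, "scale").text = scale
    return ET.tostring(root, encoding="unicode")


# Hypothetical record for a satellite image layer.
xml_record = build_metadata("Example Mapping Agency", "satellite image", 30, "1:50000")
print(xml_record)
```

A real deployment would use one of the standards above so that other tools can read the record without knowing our element names.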

Interactions of GIS with Satellite Systems

In this section, we describe in more detail how GIS interacts with remote sensing and the Global Positioning System platforms. The integration of these interactions has made possible what we have today and most of what we are creating for the future.

Geographic Information Systems and Remote Sensing

Remote sensing (RS) and GIS are two areas that interact with each other. There are three main ways in which they can be combined to enhance each other (Wilkinson 1996). First, RS is used as a tool to obtain data for a GIS. Second, GIS data can be used as auxiliary information to improve products created from RS sources. Finally, RS and GIS are often used together for modeling and analysis (Weng 2010).

RS contributes to the information that is stored in a GIS in different ways. One of the most important contributions is the extraction of thematic information from satellite images to create GIS layers. RS images are used to extract cartographic information to be the input to GIS, as in the case of the production of base maps. A very important application that requires the use of RS is the update of GIS databases. In this case, RS images are used to detect changes in thematic information to update GIS databases. RS images have also been used as background for GIS representations. This is the case of visualization tools for digital elevation models, which are very important for different applications (Weng 2010).

On the other hand, GIS data is used to improve some of the processes used in RS. These include, among others, the selection of the area of interest, its preprocessing, and its classification (Weng 2010). Of particular interest is how GIS context information can be used to post-process the results of a statistical RS classification algorithm to improve its accuracy (Gonzalez et al. 2008). Another interesting approach is the use of a structural data representation (i.e., a graph-based representation) in order to use both types of information, nonspatial and spatial, at the same time. In this way, the classification algorithm takes advantage of all the available information at the moment that it performs the classification task (Pech et al. 2004). Figure 10 shows two patterns found by the Subdue system (Cook and Holder 1994), a graph-based spatial data mining process. Pattern (a) corresponds to the description of the class “mangrove” and tells us that, in general, a region of interest (ROI) that belongs to this class is adjacent to other regions of the classes “bare soil,” “vegetation,” and “water.” Pattern (b) describes the class “road” and tells us that, in general, an ROI that belongs to this class is adjacent to other regions of the classes “bare soil,” “vegetation,” and “urban.” This information is used in a post-processing step to validate the class assigned by a statistical classification algorithm in order to improve its classification accuracy.

Fig. 10
Two patterns found by a graph-based spatial mining algorithm. (a) A graph pattern describing the mangrove class. This pattern says that a mangrove is usually found adjacent to regions of the classes bare soil, vegetation, and water. (b) A pattern describing the road class. This pattern says that a road is usually found adjacent to regions of the classes bare soil, vegetation, and urban
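The post-processing idea can be sketched as follows. The two adjacency patterns are the ones described for Fig. 10, but the scoring rule, the threshold, and the function names are hypothetical; this is neither the Subdue algorithm nor the validation procedure of the cited works, only one simple way such patterns could be used to check a label.

```python
from typing import Iterable, Optional

# Expected neighbor classes for each class, taken from the two patterns of Fig. 10.
PATTERNS = {
    "mangrove": {"bare soil", "vegetation", "water"},
    "road": {"bare soil", "vegetation", "urban"},
}


def pattern_support(assigned_class: str,
                    neighbor_classes: Iterable[str]) -> Optional[float]:
    """Fraction of the expected neighbor classes that are present around the ROI."""
    expected = PATTERNS.get(assigned_class)
    if not expected:
        return None  # no pattern known for this class
    return len(expected & set(neighbor_classes)) / len(expected)


def validate_label(assigned_class: str, neighbor_classes: Iterable[str],
                   threshold: float = 0.6) -> bool:
    """Accept the label if no pattern exists or enough expected neighbors are found."""
    support = pattern_support(assigned_class, neighbor_classes)
    return support is None or support >= threshold


# An ROI labeled "mangrove" but surrounded only by urban regions is suspicious.
print(validate_label("mangrove", ["urban", "urban"]))
print(validate_label("mangrove", ["water", "vegetation", "bare soil"]))
```

A label that fails the check would then be a candidate for reclassification or for review by a domain expert.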

Geographic Information Systems and Intelligent Positioning

GIS also has a mature level of integration with positioning systems, such as the Global Positioning System (GPS) of the USA. There are four main levels of integration. In the first, the GIS only takes the information reported by GPS and displays it on a map. The second level offers more functions: the GIS can manage WGS84 coordinates and different layers of the map (e.g., boundaries, counties, roads, and rivers), and it is possible to zoom in and out to look at a specific location. The third level allows entering waypoints (the coordinates of important reference points) that describe interesting features. This allows the GPS–GIS system to create a GIS database that can be used to make a map. In the last level of integration, the map, an intelligent map, is associated with a set of logical rules that are used to improve the accuracy of the reported position (Taylor and Blewitt 2006). Figure 11 depicts a set of satellites sending positioning information to vehicles.

Fig. 11
GPS satellites updating a location-based service application. The satellites update the position of vehicles being monitored by the base station. The base station receives the position via SMS messages. The base station sends useful information (depending on their location: addresses of gas stations, restaurants, hotels, etc.) to the vehicles via SMS messages

Spatial Data Analysis

Organizing spatial data in spatial databases not only facilitates the way a GIS accesses information but also provides users with powerful analysis tools. Extending a relational database into a spatial relational database requires adding geometry information to the spatial objects stored in it. This includes the coordinates that define both the shape of the objects (as points, lines, or polygons) and their position in space. This information is commonly stored in a table related to another table that stores the nonspatial information describing the spatial object. Indexes over these tables are created in order to access the data efficiently.
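The two-table layout can be sketched with the SQLite module of the Python standard library. Everything in the example is an assumption made for illustration: the table and column names, the sample parcels, the storage of the geometry as well-known text (WKT), and the ordinary index over bounding-box columns, which only stands in for the dedicated spatial indexes that spatial DBMS extensions provide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Nonspatial attributes of each parcel.
CREATE TABLE parcel (id INTEGER PRIMARY KEY, owner TEXT, land_use TEXT);
-- Geometry of each parcel, kept in a related table.
CREATE TABLE parcel_geometry (
    parcel_id INTEGER REFERENCES parcel(id),
    wkt TEXT,
    min_x REAL, min_y REAL, max_x REAL, max_y REAL
);
-- Plain index over the bounding box, a stand-in for a true spatial index.
CREATE INDEX idx_parcel_bbox ON parcel_geometry (min_x, max_x, min_y, max_y);
""")
conn.executemany("INSERT INTO parcel VALUES (?, ?, ?)", [
    (1, "A. Owner", "residential"),
    (2, "B. Owner", "commercial"),
])
conn.executemany("INSERT INTO parcel_geometry VALUES (?, ?, ?, ?, ?, ?)", [
    (1, "POLYGON((1 1, 4 1, 4 3, 1 3, 1 1))", 1, 1, 4, 3),
    (2, "POLYGON((10 10, 12 10, 12 12, 10 12, 10 10))", 10, 10, 12, 12),
])


def parcels_in_window(db, min_x, min_y, max_x, max_y):
    """Return (id, owner) of parcels whose bounding box intersects the query window."""
    return db.execute(
        """SELECT p.id, p.owner
           FROM parcel p JOIN parcel_geometry g ON g.parcel_id = p.id
           WHERE g.max_x >= ? AND g.min_x <= ? AND g.max_y >= ? AND g.min_y <= ?
           ORDER BY p.id""",
        (min_x, max_x, min_y, max_y),
    ).fetchall()


print(parcels_in_window(conn, 0, 0, 2, 2))
```

The join is what links each geometry to its descriptive attributes, so a spatial condition on one table can select rows of the other.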

A GIS connected to a spatial database provides different tools to analyze the data stored in it. It allows organizing data of the same type in layers (e.g., roads represented by lines, trees by points, parcels by polygons, county divisions by polygons, and a satellite image of each county by polygons). A form of visual analysis therefore consists of overlaying several layers so that the user can identify how features of different layers interact in space. We can make any of the layers (e.g., a satellite image) transparent so that we can appreciate important features in more detail. At this level of spatial data analysis, the user interacts with the GIS to create a useful product, namely a map, that can be used for decision making.

Another level of spatial data analysis is known as spatial data mining (SDM). In this case, data is usually extracted from the GIS or spatial database and transformed into a data representation that can be managed by the spatial data mining system. There are different spatial data mining tasks: clustering (Ng and Han 1994), spatial association rules (Koperski et al. 1996), co-location patterns (Xiong et al. 2004), and outlier detection (Shekhar et al. 2002), among others. Some of the more interesting data representations, and the most useful for spatial data, are those able to deal with structural data, such as inductive logic programming (Muggleton 1995) and graph-based learning (Cook and Holder 1994). Spatial clustering methods find patterns that share a spatial component. Spatial association rules try to associate spatial objects with neighboring objects. The co-location approach looks for sets of spatial features whose instances are usually found near each other. That is, when a subset of such spatial features is commonly located together (in a nearby region), it can be considered a co-location pattern.
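To give a feeling for what "commonly located together" can mean in practice, the sketch below computes, for two feature types, the fraction of instances of the first type that have at least one instance of the second type within a given distance. The measure, the sample coordinates, and the distance threshold are simplifying assumptions for illustration; this is not the algorithm of the works cited above.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def participation_ratio(features_a: List[Point], features_b: List[Point],
                        max_distance: float) -> float:
    """Fraction of A instances with at least one B instance within max_distance."""
    if not features_a:
        return 0.0
    near = sum(
        1 for a in features_a
        if any(math.dist(a, b) <= max_distance for b in features_b)
    )
    return near / len(features_a)


# Hypothetical planar coordinates of two feature types.
schools = [(0.0, 0.0), (10.0, 10.0), (50.0, 50.0)]
bus_stops = [(1.0, 0.0), (10.0, 11.0)]

ratio = participation_ratio(schools, bus_stops, max_distance=2.0)
print(f"{ratio:.2f} of the schools have a bus stop nearby")
```

A pair of feature types whose ratio stays high in both directions would be a candidate co-location pattern for the domain expert to inspect.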

Figure 12 shows the integration of a GIS with a graph-based data mining tool. The GIS loads the spatial data stored in a PostGIS spatial database and presents the base map at the center of the interface. The GIS allows the user to analyze the data by presenting the spatial layers contained in the spatial database (the option to perform this function is located in the upper left of the graphical user interface; see Fig. 12). It is also possible to perform spatial queries using topological, distance, and direction relations (as we can see to the right of Fig. 12). The interface has an option to transform the queried data into its graph-based representation in order to send it as input to a graph-based data mining system, for instance, the option called Subdue. Subdue performs the data mining task and finds spatial patterns. The instances of the patterns found can then be visualized in the main map so that the domain expert can interpret the mined results. This interface integrates a GIS with a set of tools for decision making.

Fig. 12
A GIS data mining graphical user interface. This is a GIS graphical user interface created to analyze data from the city of Puebla, in Mexico (this is the reason why the labels are written in Spanish). The interface has three main components. The first allows the user to analyze data by overlaying layers. With the second, the user can perform spatial queries. With the third, the user can perform the graph-based spatial data mining task and visualize the resulting patterns (subgraphs) in the map
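The three kinds of spatial relations used in such queries can be illustrated with one small predicate each. The sketch assumes planar coordinates with the Y axis pointing north, an axis-aligned box as the only region type, and eight compass sectors; these choices and the function names are ours and do not describe the interface of Fig. 12.

```python
import math
from typing import Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def within_box(point: Point, box: Box) -> bool:
    """Topological relation: is the point inside (or on the edge of) the box?"""
    x, y = point
    min_x, min_y, max_x, max_y = box
    return min_x <= x <= max_x and min_y <= y <= max_y


def distance(p: Point, q: Point) -> float:
    """Distance (metric) relation: Euclidean distance between two points."""
    return math.dist(p, q)


def direction(origin: Point, target: Point) -> str:
    """Direction relation: compass sector of the target as seen from the origin."""
    dx = target[0] - origin[0]
    dy = target[1] - origin[1]
    angle = math.degrees(math.atan2(dx, dy)) % 360.0  # 0 = north, clockwise
    names = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]
    return names[int((angle + 22.5) // 45) % 8]


school = (2.0, 2.0)
parcel_box = (1.0, 1.0, 4.0, 3.0)
print(within_box(school, parcel_box), distance((0.0, 0.0), (3.0, 4.0)),
      direction((0.0, 0.0), (5.0, 0.0)))
```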

Applications

The capability of GISs to store spatial data, process it, analyze it, and create final products such as thematic maps makes them a powerful tool in any field where spatial data plays a role. GISs are used in industry, in government, in health care, in environmental protection, and in many other areas. In the rest of this section, we briefly describe a few example applications.

In the area of medicine, GISs are very useful for epidemiological studies. This type of application allows physicians to keep track of how a disease spreads geographically. If different treatments are being applied in different counties or states and the statistics are shown in real time in the GIS, the effectiveness of each treatment can be assessed in real time in the GIS graphical user interface. This application could be used for any type of disease, such as swine influenza A (H1) in humans, malaria, AIDS, or cancer.

More dynamic GISs are those that receive signals from sensors such as GPS, as in the case of navigation consoles for automobiles or the electronic chart display and information system (ECDIS) for vessels. An ECDIS is commonly connected to a GPS, a radar system, a meteorological station, a gyroscope, and other sensors of the vessel. The GIS presents the navigation charts and, with the help of the GPS, plots the position of the vessel. The GIS allows drawing on the navigation chart the path that the vessel should follow. In addition, the radar system communicates with the navigation console (the ECDIS) and transmits the objects that it detects so that they can be plotted on the navigation chart and flagged when they are dangerous or lie in the path of the vessel. In that case, the ECDIS should sound an alarm so that the vessel’s captain can consider a maneuver to avoid the blocking object (perhaps another vessel) (Fig. 13).

Fig. 13
An electronic chart display and information system. This system is composed of a touch screen monitor (bottom) to control the GIS functionality. The top monitor displays the information received through the internal network from all the sensors of the vessel connected to the navigation system
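The alarm logic described above can be approximated with a simple geometric test: an object reported by the radar raises an alarm when it lies closer to the planned path than a safety margin. The sketch assumes planar coordinates, a path made of straight segments, and a fixed margin; the function names and sample values are hypothetical, and a real ECDIS applies far more elaborate rules.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def distance_to_segment(p: Point, a: Point, b: Point) -> float:
    """Shortest planar distance from point p to the segment a-b."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Projection of p onto the segment, clamped to its end points.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def obstacles_on_route(route: List[Point], radar_contacts: List[Point],
                       safety_margin: float) -> List[Point]:
    """Return the radar contacts that lie within the safety margin of the route."""
    alarms = []
    for contact in radar_contacts:
        nearest = min(
            distance_to_segment(contact, route[i], route[i + 1])
            for i in range(len(route) - 1)
        )
        if nearest <= safety_margin:
            alarms.append(contact)
    return alarms


planned_route = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
contacts = [(5.0, 0.5), (20.0, 20.0), (10.2, 5.0)]
print(obstacles_on_route(planned_route, contacts, safety_margin=1.0))
```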

Another important area of application for GIS is disaster management. GIS tools are created for any of the four phases of a disaster: mitigation, preparedness, response, or recovery (UN-SPIDER 2011). The disasters covered by GISs in this area include floods, earthquakes, oil spills, storms, fires, tsunamis, volcanic eruptions, epidemics, and droughts, among others. Being important for any government, this is an area of opportunity for GIS. For more information about disaster management, please refer to the United Nations Portal of Knowledge at http://www.un-spider.org/knowledge-base. Figure 14 shows the emergency management cycle with its four phases: preparedness, response, recovery, and mitigation.

Fig. 14
The emergency management cycle (Adapted from Wikipedia)

Examples of Current Trends

GISs are tools that can be used in any field of study. They are being used to make processes in industry and government more efficient. This multidisciplinary work demands that different research areas meet and innovate. Some current topics are the following:

  • Augmented reality and GIS are being combined in different applications. The goal is to show how the real world would look if we added artificial objects to it. One example is the integration of augmented reality into landscape visualization (Ghadirian and Bishop 2008). Another is the use of augmented reality for underground infrastructure visualization (Schall et al. 2009).

  • Another important application is the integration of semantic information into the 3D reconstruction of city models, as we can see in recent research (Kolbe 2008). In this work, the author uses GML3 to represent the shape and graphical appearance of the city models as well as the semantics, thematic properties, and taxonomies of the objects. GML is the Geography Markup Language, an eXtensible Markup Language (XML) grammar created to express geographical features. In Wolf and Asche (2010), the authors create a 3D tactical intelligence surveillance map for a group of crime experts who study spatiotemporal patterns of residential burglary crimes.

  • The integration of artificial intelligence techniques, such as fuzzy theory, is not new but is becoming more and more useful. An example can be seen in Kanjilal et al. (2010), in which the authors find an appropriate implementation approach to fuzzy regions.

These are some examples of both current research areas and applications of GIS that are used to solve real-world problems.

Conclusion

Geographic information systems are an advanced technology that allows developing applications in any area of study. Their power to analyze data makes them ideal tools for decision making. Advances in satellite technology as a source of data for GIS enhance the quality of the data as well as the capability to analyze it. In the coming years, more research in this area will contribute to the development of technology that solves more real-world problems in industry, government, academia, and society.
