1 Introduction

Geospatial data collection is an important task for many spatial information users. It may include field data collection, remote sensing data processing, and in-house geographical information science (GIS) data conversion. Nowadays, geospatial data are available from various sources. Among these, remote sensing data (e.g., optical, radio detection and ranging (RADAR), and light detection and ranging (LIDAR) data) are among the primary data sources in many GIS analyses. For example, high-resolution satellite images such as QuickBird and IKONOS, together with aerial photographs, are the basis for the generation of qualitative land-use maps (i.e., land-use zoning maps) and the delineation of transportation networks. Medium-resolution satellite images such as ALOS, SPOT, and Landsat TM/ETM are used in the generation of quantitative land-use maps (i.e., land cover maps) for regional-scale studies of changes in land use. The shuttle radar topography mission (SRTM) and LIDAR provide topographical characteristics for GIS analysis. Moreover, remote sensing data are important for environmental studies such as deforestation, global warming, and natural resource management. This technology captures real-world information with various sophisticated sensors and platforms. However, a GIS database must be built before further geospatial analysis and mapping can be carried out. GIS converts real-world information into a geodatabase so that it can be retrieved, analyzed, and used in further geocomputations. On the other hand, field data collection is important for spatial information users in order to collect spatially distributed objects with their associated attribute information. In this chapter, we discuss geospatial data collection methods and processing, and their applications in GIS.

2 Geospatial Data Collection

Two approaches are presented, namely field data collection and in-house GIS data conversion. Both are frequently used in geospatial analysis and modeling.

2.1 Field Data Collection

Field data collection is one spatial data collection method, and is a first-step requirement for many spatial information users such as human geographers, physical geographers, geologists, crop scientists, and ecologists. Human geographers may want to collect public opinions and information on other social activities in order to understand how social behavior changes over space and time. Geologists or physical geographers may want to collect in-situ data in order to understand overall regional geological formations and structures. Researchers or students may collect ground-truth data to validate their results.

Components of field data: Field data collection is the foundation of many spatial analysis processes. Like other spatial data, field data (Fig. 3.1) are composed of two elements, namely the coordinate information of the spatial objects and their associated attribute information. Coordinate information includes X, Y, and Z for the positions of spatial objects, while attribute information includes properties of those spatial objects such as the soil nitrogen contents, the names of plant or animal species, the angles of dips and strikes for each rock unit, and so on. Planning and designing attribute data are at the heart of any field data collection process, and thus it is very important that this is considered before going into the field.

Fig. 3.1 Importance of field data collection in geospatial analysis and elements of field data

Collection of coordinate information in the field: Coordinate information can be collected in several ways in the field, such as by using a global positioning system (GPS) device or GPS built-in devices, by using high-resolution satellite images, and by address matching/geocoding (conversion of addresses to X, Y coordinates) (Table 3.1).

Table 3.1 Coordinates collection methods and their advantages and disadvantages for various applications

Field data collection methods: With recent advances in wireless communication, Internet technologies, and mobile computational devices, field data collection can now be conducted in a handy and timely manner. Many methods have been developed for field data collection, ranging from personal field data collection to automatic real-time field data collection using GPS, personal digital assistants (PDAs), and other mobile computational devices such as ultra-mobile personal computers (UMPCs), smart phones, and Netbooks. In the following sub-sections, two field data collection methods are discussed, namely personal field data collection using a UMPC or a Netbook computer, and real-time field data collection using a mobile phone. The latter is used to collect field data through a Post Office Protocol 3 (POP3) mail server and a centralized geodatabase, either individually or group-based. It is ideal for individual field data collection.

2.1.1 Personal Field Data Collection

Recent innovations in computing, networking, and Internet technologies enable GIS field users to collect, store, and analyze data in a handy and mobile manner. PDAs are hand-held computers that were originally designed as personal data organizers, but became more versatile over the years. A PDA can be used as a clock, as a calendar, for accessing the Internet, for sending and receiving e-mails, for working on spreadsheets, and for using a word processor. However, PDAs lack the full-blown infrastructure of a wireless broadband network and have a limited screen resolution (typically 240 × 320). UMPCs (typically with a wide-screen resolution of 1024 × 600) began as a joint development exercise by Microsoft, Intel, Samsung, and others. UMPCs are able to run any software that has been written for the Windows XP platform. UMPCs can also feature GPS devices, Wi-Fi, and Ethernet. There has been a revolution in GPS over the last few years as the cost of receivers has decreased and accuracy has improved. GPS has become a critical tool for spatial information users in a wide range of application fields. Owing to the characteristics listed above, the mobile GIS is rapidly gaining popularity and effectiveness among spatial information users. A mobile GIS is also interdisciplinary. Nowadays, the focus of much leading-edge research in geography is interdisciplinary (integrating two or more academic disciplines), and hence is not limited to scientific investigation only, but can also be extended to applied real-world problem solving (so-called normative uses) in a time- and cost-effective way. Google Maps provides high-resolution satellite images for almost all urban areas around the world. The spatial resolution is good enough to collect reference ground control points (GCPs) in urban areas, because urban landscapes contain distinguishable features such as road intersections and building shapes.
Urban area field surveys such as household surveying, road condition inspection, hydrant inspection and mapping, damage investigation (in the case of a disaster), public health surveys, and the collection of public facilities are important to local and city planners to help in effective urban planning.

Initially, UM-FieldGIS (Lwin and Murayama 2007) was intended for use in student field survey projects which did not require any server-side installation. UM-FieldGIS is a Windows-based GIS program which allows the user to collect, store, and integrate information into a current GIS system in a timely manner. UM-FieldGIS field survey files are based on the Microsoft Access database format (.mdb file extension), a technology that is also used in the ESRI personal geodatabase (.mdb) (Fig. 3.2). However, ESRI personal geodatabases are more complex than MS Access databases as they handle both the database management system (DBMS) and the geographical features, as well as their associated topological relationships. Under UM-FieldGIS, the field user can collect geographical positions from the embedded Google Maps API (online network connection), a pre-installed map (PIM) (map-based mobile GIS), an attached universal serial bus (USB) GPS, or a built-in GPS. Field users are able to create their own survey items (field names) and choose a multimedia attachment capability. UM-FieldGIS supports general GIS functions such as map zooming, scrolling, querying attributes, arranging map layers, searching by attribute name, and changing the properties of map views under the PIM mode. Finally, the surveyed data can be exported into the ESRI shapefile format, or directly imported by the ArcGIS software.

Fig. 3.2 Program design of UM-FieldGIS. PIM, Pre-installed Map (supported formats: GeoTIFF and ESRI Shape); GMap API, Google Maps Application Program Interface; Download URL: http://giswin.geo.tsukuba.ac.jp/sis/en/gis_software.html

The UM-FieldGIS graphical user interface (Fig. 3.3) is designed especially for UMPC wide-screen monitors (1024 × 600) and is built on the tabbed dialog box class. Functionally, UM-FieldGIS can be divided into two modules, the database module and the GIS module. All functions are grouped by tabs. The user can easily switch between functions by clicking the appropriate tab without having to open multiple windows. This is especially advantageous for a desktop with a limited screen display. The database module enables the user to create a new field survey file in Microsoft Access database format (.mdb), or open an existing survey file to analyze or to append new information during the surveying process.

Fig. 3.3 Graphical user interface of the UM-FieldGIS

The UM-FieldGIS allows users to create their own survey form (survey items) and add a media folder for image attachments in each record (Fig. 3.4). Records can be added, deleted, and edited using the add new/edit mode tab. Collected records can be viewed in a tabular, record, or media view (Fig. 3.5).

Fig. 3.4 Add new record

Fig. 3.5 Edit record

The UM-FieldGIS is suitable for field data collection in urban areas where wireless access services such as Wi-Fi are available. However, it can still be used without Internet access in PIM mode, with maps or high-resolution satellite images installed in advance.

2.1.2 Real-Time Field Data Collection (Centralized Geodatabase)

UMPC or Netbook computers and wireless Internet access services are sometimes costly and are not suitable for all users. We need to find other methods of collecting field data in a handy and timely manner at low cost, such as by using a personal mobile phone. Figure 3.6 shows an overview of the real-time field data collection system. Basically, this system consists of two sub-modules, namely the client module and the automation module. The client module contains only a GPS-embedded mobile phone or a GPS-plus mobile phone. All the functions of the data injection and format conversion processes are performed automatically within the automation module. Finally, the real-time results can be viewed through a Web browser by providing Web-GIS.

Fig. 3.6 A centralized geodatabase and mobile field data collection

2.1.2.1 How It Works

Real-time field data collection (a centralized geodatabase) utilizes GPS-embedded mobile phones, which typically support additional services such as the short messaging service (SMS), the multimedia messaging service (MMS), e-mail and Internet access, and short-range wireless (infrared or Bluetooth) communications, as well as business and gaming applications, and photography. Users are required to type the data in a predefined text format. For example, the user needs to add a “/” character between fields, and add a “,” between attribute values (Fig. 3.7). This text message is then sent to a predefined mail address with a predefined subject. The user can also attach as many photographs as needed. Within the automation process, the message is read from the POP3 mail server, converted into a GIS dataset, and injected into a centralized geodatabase at specific time intervals. The centralized geodatabase is composed of aerial images, other ancillary datasets, and the injected data (survey data). End users can download and visualize the survey data in ESRI shapefile format through a Web browser for further analysis.

Fig. 3.7 Format conversion between a text message and an attribute table in the geodatabase (Lwin and Murayama 2011b)
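The conversion from a text message to an attribute record can be sketched in Python. The actual field layout is given in Fig. 3.7 and is not reproduced here, so the layout assumed below (type / latitude,longitude / attribute values) and the names `SurveyRecord` and `parse_survey_message` are hypothetical; only the use of “/” between fields and “,” between values comes from the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SurveyRecord:
    survey_type: str       # the "type" field (user initials or survey category)
    latitude: float
    longitude: float
    attributes: List[str]  # attribute values, in the order they were typed


def parse_survey_message(body: str) -> SurveyRecord:
    """Split a message on "/" into fields, then on "," into values.

    Assumed (hypothetical) layout: type/lat,lon/value1,value2,...
    """
    fields = [f.strip() for f in body.strip().split("/")]
    if len(fields) != 3:
        raise ValueError(f"expected 3 '/'-separated fields, got {len(fields)}")
    survey_type, position, values = fields

    coords = [c.strip() for c in position.split(",")]
    if len(coords) != 2:
        raise ValueError("position must be given as 'lat,lon'")
    latitude, longitude = float(coords[0]), float(coords[1])

    attributes = [v.strip() for v in values.split(",") if v.strip()]
    return SurveyRecord(survey_type, latitude, longitude, attributes)


# Example message with made-up values
record = parse_survey_message("temperature/36.1110,140.1030/28.5,sunny")
print(record)
```

Rejecting malformed messages early, as the `ValueError` checks do here, would keep incomplete records out of the centralized geodatabase; whether the original system does this is not stated in the text.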

This field data collection method can be implemented through an individual or a group survey by changing the “type” field (Fig. 3.8). The individual survey mode is ideal for users who collect field data for their own specific purpose, while the group survey mode is ideal for real-time collection of information such as surface temperature, wind speed, wind direction, and damage information. For the individual survey mode, the “type” field may be the user’s initials. Later, the users would be able to extract their own data by using this field. For the group survey mode, the “type” field may be the category being surveyed, such as temperature, land-use type, or rock or soil properties.

Fig. 3.8 Modes of survey: individual and group (Lwin and Murayama 2011b)

The overall system is built on Microsoft ASP.NET with an AJAX extension and VDS Technologies components (Web mapping components for ASP.NET). ASP.NET is a Web application framework developed by Microsoft that programmers can use to build dynamic Websites, Web applications, and XML Web services. AJAX (shorthand for asynchronous JavaScript and XML) is a group of interrelated Web-development techniques used on the client side to create interactive Web applications. With AJAX, Web applications can retrieve data from the server asynchronously in the background without interfering with the display and behavior of the existing page. The use of AJAX techniques has led to an increase in interactive and dynamic interfaces on Web pages. AspMap for .NET from VDS Technologies is a set of high-performance Web-mapping components and controls for embedding maps in ASP.NET applications (Web forms).

This field data collection method has been introduced to the students of the University of Tsukuba, Japan, during their field survey course, which is part of the university campus GIS project. Under the campus GIS project, individual students are required to collect or report illegal bike or motorbike parking places, illegal waste disposal site locations, and man-made footprints which are caused by people who walk on the grasslands or who are passing between trees instead of using legal paths. Later, this information is used by the university administrators to maintain the campus landscape and manage student facilities. A group survey was also conducted to collect environmental data such as surface temperature, wind speed, wind direction, etc., on a real-time basis. In this case study, 4 faculty members and 16 students from the University of Tsukuba, Japan, and 2 faculty members and 9 students from the South China Normal University, China, participated.

Planning ahead is important for adequate and successful field data collection. Spatial planning and sampling design include setting where and what attribute information is to be collected. Sometimes, it is difficult or impossible to collect again after the field work has been done once. In this project, we assigned the survey area to student groups based on administrative units. We also demonstrated the handling of GPS and other field survey instruments. Students were required to send field survey data by using their GPS-embedded mobile phone or by reading the coordinates from a Garmin hand-held GPS. During the field work, we monitored their status on a Netbook computer with wireless Internet access (Figs. 3.9 and 3.10). We also advised the students through mobile phone communication.

Fig. 3.9 Acquired real-time information with media attachment (Lwin and Murayama 2011b)

Fig. 3.10 Real-time data injection (Lwin and Murayama 2011b)

After the field work, students were required to download the survey data through Web-GIS and open it in ESRI ArcMap in the laboratory. This process includes downloading the data, importing it into ArcGIS, formatting the data, and visualizing it in ArcMap. We used Visual Basic (VB) scripts to format the comma-separated values into attribute fields (Fig. 3.11). String substitution was also carried out by VB Script to replace short text with full text, such as “urb” with “Urban.” This was necessary because some students recorded the data using short text to reduce typing time and errors. Furthermore, students were also able to specify their names in the “Type” field.

Fig. 3.11 Formatting the data in ArcGIS
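The formatting step can be illustrated with a short Python sketch. This is not the VB script used in the project (which appears only in Fig. 3.11); it simply shows the same two operations, splitting comma-separated values into named fields and expanding abbreviations. The function name `expand_values`, the field names, and every abbreviation other than “urb” are assumptions made for illustration.

```python
from typing import Dict, List

# Only "urb" -> "Urban" comes from the text; the other entry is hypothetical.
ABBREVIATIONS: Dict[str, str] = {
    "urb": "Urban",
    "agr": "Agriculture",  # hypothetical example
}


def expand_values(raw: str, field_names: List[str]) -> Dict[str, str]:
    """Split comma-separated values into named fields and expand short text."""
    values = [v.strip() for v in raw.split(",")]
    if len(values) != len(field_names):
        raise ValueError(
            f"expected {len(field_names)} values, got {len(values)}"
        )
    # Replace a value only when it matches a known abbreviation exactly.
    return {
        name: ABBREVIATIONS.get(value.lower(), value)
        for name, value in zip(field_names, values)
    }


# Hypothetical field names and values
row = expand_values("urb, 28.5, NE", ["land_use", "temperature", "wind_dir"])
print(row)
```

Matching whole values instead of substrings avoids accidentally rewriting longer words that happen to contain an abbreviation.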

Every year, a field survey is conducted to collect information about public facilities such as bicycle stands and capacity, garbage boxes and types, car parking lots, sidewalk conditions, illegal garbage places, and other environmental data. Figure 3.12 shows the visualization of student survey data in the ArcGIS software.

Fig. 3.12 Mapping the surface temperature, wind speed, and wind direction in ArcMap based on student group survey data

2.2 In-House GIS Data Conversion

GIS data come from two main kinds of source, digital and non-digital. Digital sources include information captured through remote sensing and surveying (field data), while non-digital sources include, but are not limited to, paper maps, which are usually converted to digital form through scanning and digitization (Sheldon 2007). This sub-section discusses the “in-house data collection” method. Generally, this method involves the digitization of available paper maps. Map sheets comprise one of the most widely available and familiar sources of spatial data (Malczewski 2004). The immediate outputs of this method are digitized GIS vector data, which include points, lines, and polygons. However, producing these usually involves georeferencing and digitizing, which are preceded by scanning when the digitizing is to be done on-screen.

Scanning: Prior to on-screen digitizing, paper maps have to be integrated into the GIS database by converting them into digital format. The process of such conversion is known as scanning. Through scanning, map features, including texts and symbols, are automatically captured as individual cells or pixels and an automated image is produced. These features in raster format are then vectorized through tracing or on-screen digitizing. Generally, in order to have a good source image in the digitizing process, a scanner needs to have a good resolution and, depending on the specific purpose, has to be large enough to accommodate the map sheets being scanned.

Georeferencing: Basically, the process of projecting image data onto a plane and making it conform to a map projection system is called rectification. However, since scanning produces images that are already planar, rectification is not required unless there is distortion in the image (Leica Geosystems 2005). In this instance, such scanned images only need to be georeferenced. Georeferencing refers to the process of assigning map coordinates to image data. As explained in the ERDAS IMAGINE Field Guide (Leica Geosystems 2005), “the image data may already be projected onto the desired plane, but not yet referenced to the proper coordinate system. Rectification, by definition, involves georeferencing, since all map projection systems are associated with map coordinates. Image-to-image registration involves georeferencing only if the reference image is already georeferenced. Georeferencing by itself involves changing only the map coordinate information in the image file. The grid of the image does not change.”
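The point that georeferencing changes only the coordinate information, and not the pixel grid, can be illustrated with a six-parameter affine transform of the kind stored in a world file. The sketch below is a generic illustration, not the ERDAS IMAGINE procedure; the class name `GeoTransform` and all numeric values are hypothetical, and a north-up image without rotation is assumed in the example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class GeoTransform:
    """Affine pixel-to-map transform (world-file style coefficients)."""
    a: float  # map units per pixel in x
    b: float  # rotation term (change in x per row)
    d: float  # rotation term (change in y per column)
    e: float  # map units per pixel in y (negative for north-up images)
    c: float  # x coordinate of the centre of the upper-left pixel
    f: float  # y coordinate of the centre of the upper-left pixel

    def pixel_to_map(self, col: float, row: float) -> Tuple[float, float]:
        """Return the map coordinates of the centre of pixel (col, row)."""
        x = self.a * col + self.b * row + self.c
        y = self.d * col + self.e * row + self.f
        return x, y


# Hypothetical scanned map: 0.5 m pixels, no rotation, made-up origin
gt = GeoTransform(a=0.5, b=0.0, d=0.0, e=-0.5, c=400000.0, f=3990000.0)
print(gt.pixel_to_map(0, 0))
print(gt.pixel_to_map(10, 20))
```

The pixel values never appear in this calculation: assigning or replacing the six coefficients georeferences the image while leaving its grid untouched, which is the distinction from rectification drawn in the quotation.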

Digitizing: Before the on-screen digitizing technique became available, table digitizing was the accepted method for creating GIS vector data. However, aside from being time-consuming, this method also presented several challenges, such as difficulty in using the digitizing puck, tablet malfunctions, source materials changing size, registration problems, and edge-matching complexity (Sheldon 2007).

Today, with the availability of more advanced digitizing methods like on-screen digitizing, it is possible to derive GIS vector data from a sheet map using a desktop computer (Fig. 3.13). On-screen digitizing only requires that the sheet map is scanned and properly georeferenced. With this method, which is sometimes referred to as “heads-up” digitizing, the scanned and georeferenced source map or photograph is displayed on-screen, and features are digitized using a standard mouse (Eastman 2006). This advancement in technology allows users to “zoom in” on images as much as is needed, digitize using a computer mouse, and edge-match more easily for faster map creation (Sheldon 2007).

Fig. 3.13 On-screen digitizing for GIS data conversion

3 Geospatial Data Processing and Applications

Geospatial data processing is at the heart of many GIS analyses. It requires an understanding of digital image processing, database design and construction, and GIS analytical functions. Geospatial data can be grouped into raster and vector formats. Raster data are mainly derived from remote-sensing data, and vector data are mainly constructed in GIS.

3.1 Raster Data Processing and Applications in Geospatial Analysis

Much of the remote-sensing data used in GIS is in raster format, such as land-use/land-cover maps (quantitative), the normalized difference vegetation index (NDVI), satellite-derived digital elevation models (DEMs), and surface temperature. This is because remote-sensing sensors capture real-world information pixel by pixel, and the ground size of a pixel is known as the spatial resolution. This section discusses some of the raster data commonly used in geospatial analysis, such as quantitative land-use/land-cover maps and the NDVI.

3.1.1 Land-Use Land Cover

Quantitative land use/cover maps are commonly derived from medium-resolution satellite images using a method known as multispectral classification. In developing countries, land use/cover maps derived from satellite images are a cost- and labor-effective measure for updating an existing land use/cover map database and for managing the country’s natural resources (Lwin and Shibasaki 1997; Nunes and Auge 1999). The applications of land use/cover maps in GIS are manifold, e.g., land-use change modeling, monitoring of deforestation, natural resource management, and hydrological modeling. Figure 3.14, for example, shows the combination of two different spatial and temporal resolution satellite images, Landsat ETM and NOAA AVHRR data, to monitor the annual deforestation rates and deforested areas in Myanmar (Lwin and Shibasaki 1998).

Fig. 3.14 Delineation of deforested areas using Landsat TM images

3.1.2 Normalized Difference Vegetation Index (NDVI)

The normalized difference vegetation index (NDVI) is derived by normalizing the difference between two spectral bands, the infrared (IR) and red (RED) bands: NDVI = (IR − RED)/(IR + RED). Chapter 16 presents the use of advanced land observing satellite (ALOS) image-derived NDVI data to model urban green walkability space. Furthermore, NDVI data can be used to monitor annual deforestation rates by using 10-day composite NOAA AVHRR 1-km data for regional-scale studies (Fig. 3.15).

Fig. 3.15 Monitoring the annual deforestation process using NOAA AVHRR 10-day composite NDVI data (a case study in Myanmar)
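As a minimal sketch, the NDVI can be computed per pixel from the standard definition NDVI = (NIR − RED)/(NIR + RED), where NIR is the near-infrared band. The function names and the small sample grids below are illustrative assumptions; real band data would normally be read from image files with a raster library.

```python
from typing import List, Optional

Band = List[List[float]]


def ndvi(red: float, nir: float) -> Optional[float]:
    """Return (NIR - RED) / (NIR + RED), or None where both bands are zero."""
    total = nir + red
    if total == 0:
        return None  # no signal in either band, so the index is undefined
    return (nir - red) / total


def ndvi_grid(red_band: Band, nir_band: Band) -> List[List[Optional[float]]]:
    """Apply the NDVI pixel by pixel to two bands of the same shape."""
    return [
        [ndvi(r, n) for r, n in zip(red_row, nir_row)]
        for red_row, nir_row in zip(red_band, nir_band)
    ]


# Made-up digital numbers for a 2 x 2 image
red = [[50.0, 100.0], [0.0, 30.0]]
nir = [[150.0, 100.0], [0.0, 90.0]]
print(ndvi_grid(red, nir))
```

Values range from −1 to 1; dense green vegetation reflects strongly in the near-infrared and absorbs red light, so it yields values towards the upper end of that range.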

Figure 3.16 shows the surface temperature derived from Landsat TM band 6 (thermal band). For example, Lwin and Murayama (2010b) detected the urban thermal fringes from Landsat ETM-derived surface temperatures using “focal statistic analysis” in GIS to identify surface temperature variations (i.e., the heat island effect) inside Tsukuba city, in order to take further action for eco-city planning.

Fig. 3.16 Surface temperatures derived from Landsat TM5

3.2 Vector Data Processing and Applications in Geospatial Analysis

Vector data are composed of points, lines, and polygons. Vector data can be directly purchased from map vendors, or generated by in-house GIS data conversion methods such as scanning, georeferencing, and digitizing.

3.2.1 LiDAR Point Data Processing and Applications in Geospatial Analysis

Point data are fundamental elements in vector data. Point data represent bus stops, facility locations, field survey data of soil and rock properties, surface temperature, etc. This sub-section considers the use of LIDAR point cloud data in generating DEMs, digital surface models (DSM), digital terrain models (DTM), and digital height models (DHM). The terms DSM, DTM, and DHM are generally suitable for high-resolution digital elevation data, since such data can distinguish the heights between buildings, trees, and other built-up objects, while DEM is generally suitable for coarse spatial resolution elevation data (i.e., 30 m, 90 m, 1 km).

Traditionally, stereo-image matching is a standard photogrammetric technique to generate a DSM. However, this technique is good only for a smooth open terrain surface. The quality of a DSM in built-up areas is poor owing to occlusions and height discontinuities (Haala and Brenner 1999). LiDAR techniques have been studied and utilized since the early 1960s, but appear to have become more prominent in the past few years. LiDAR has found applications in a wide variety of fields of study, including atmospheric science, bathymetric data collection, law enforcement, telecommunications, and even steel production (Maune et al. 2000). Because LiDAR operates at much shorter wavelengths, it has a higher accuracy and resolution than microwave radar (Jelalian 1992).

Choosing the appropriate surface-generating method for the DSM and DTM is important in LiDAR data processing, since surface height information is collected as points. Figure 3.17 shows the detailed procedure for generating a DHM and a digital volume model (DVM). In this process, both DSM and DTM point features are converted into a triangulated irregular network (TIN) model (i.e., TIN–DSM and TIN–DTM). In ArcGIS software, the TIN process allows users to convert multiple scenes at one time, which reduces the time needed for mosaicking. Moreover, the TIN process is faster than other interpolation processes such as inverse distance weighting (IDW), spline, and kriging. Each TIN–DSM and TIN–DTM is converted into a raster format, setting the spatial resolution to 0.5 m. The DTM raster layer is subtracted from the DSM raster layer to obtain the DHM. To convert this DHM raster layer into a DVM, it is multiplied by the cell surface area (i.e., 0.5 m × 0.5 m = 0.25 m²).

Fig. 3.17 LiDAR data processing work flow (modified from Lwin and Murayama 2010a)
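The last two raster steps of this work flow (DHM = DSM − DTM, then DVM = DHM × cell area) can be sketched in Python. The sketch starts from DSM and DTM rasters that have already been interpolated, so the TIN step is not reproduced; the function names and the sample elevation values are assumptions made for illustration, while the 0.5 m cell size comes from the text.

```python
from typing import List

Grid = List[List[float]]

CELL_SIZE_M = 0.5                # raster resolution stated in the text
CELL_AREA_M2 = CELL_SIZE_M ** 2  # 0.25 square metres per cell


def height_model(dsm: Grid, dtm: Grid) -> Grid:
    """DHM: object height above ground, cell by cell (DSM minus DTM)."""
    return [
        [surface - terrain for surface, terrain in zip(dsm_row, dtm_row)]
        for dsm_row, dtm_row in zip(dsm, dtm)
    ]


def volume_model(dhm: Grid, cell_area: float = CELL_AREA_M2) -> Grid:
    """DVM: volume above ground per cell (height times cell area)."""
    return [[height * cell_area for height in row] for row in dhm]


# Made-up elevations in metres for a 2 x 2 raster
dsm = [[30.0, 42.0], [25.0, 25.0]]
dtm = [[24.0, 24.0], [25.0, 24.0]]

dhm = height_model(dsm, dtm)
dvm = volume_model(dhm)
print(dhm)
print(dvm)
```

Summing the DVM cells that fall inside a building footprint gives an estimate of that building's volume, which is the quantity used for the population estimates discussed next.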

The possible uses of elevation data in GIS are numerous, e.g., the identification of river flow directions and the delineation of catchment areas (river basin or watershed) in hydrology, soil erosion modeling, building height extraction for telecommunications, and other 3D applications. DVM is also used for building population estimates (Lwin and Murayama 2010a) for microscale population data analysis and 3D visualizations of urban landscapes (Fig. 3.18).

Fig. 3.18 Building population estimates from DVM (upper), and 3D visualizations of urban landscapes by the combination of building footprints and LIDAR height data (lower)

4 Conclusion

Perhaps the most exciting area of computer system development continues to be in hand-held devices such as PDAs, UMPCs, Netbooks, and smart phones. A smart phone is a mobile phone that offers more advanced computing ability and connectivity than a contemporary feature phone. Smart phones are noticeably more efficient in form factor (size, shape, weight, etc.), chip type, internal storage capacity, battery life, and operating system compared with desktop computers. Along with hardware developments, the operating systems used in smart phones are becoming more and more compact and functional, e.g., iPhone (Apple Inc.) and Android (Google). Computer scientists at the University of Washington have used Android, the open-source mobile operating system championed by Google, to turn a cell phone into a versatile data-collection device. They collected data on deforested areas, and instantly submitted that information to the global environmental database (University of Washington 2009). In the meantime, the number of cellular mobile phone subscribers world-wide is increasing year by year. According to the International Telecommunication Union’s (ITU) 2010 report, by the end of 2009, there were an estimated 4.6 billion mobile cellular subscriptions, corresponding to 67 for every 100 inhabitants globally. Several studies have demonstrated field data collection with mobile phones in both the educational and industrial sectors (Mourão and Okada 2010; Moe et al. 2004).

On the other hand, the increasing popularity of the Internet and user-friendly Web-based GIS applications such as Google Maps/Earth and Microsoft Bing maps have made GIS an integral part of life today for finding the nearest facilities, driving routes, and so on. For example, in Tsukuba City, Japan, local residents and “green” exercise takers can find the shortest or greenest route between stops by using their smart phone while walking along the street and accessing the eco-friendly walk score calculator Web-based GIS (Lwin and Murayama 2011a). However, PDAs, Netbooks and smart phones are sometimes considered to be cost-intensive, including both device and wireless access service charges, and hence not suitable for use in student field survey projects. Moreover, mobile field computing environments vary widely, but generally offer extremely limited computing resources, visual display, and bandwidth relative to the usual resources required for distributed geospatial data (Nusser et al. 2003). Nevertheless, geospatial data collection, processing, and analysis tasks are important in GIS. Proper data collection and conversion are required to support the geospatial analysis that is vital for decision-making.