Unlocking Textual Content from Historical Maps - Potentials and Applications, Trends, and Outlooks

Chiang, Yao-Yi

doi:10.1007/978-981-10-4859-3_11

Part of the book series: Communications in Computer and Information Science (CCIS, volume 709)

Included in the following conference series:

International Conference on Recent Trends in Image Processing and Pattern Recognition

Abstract

Digital map processing has been of interest to the image processing and pattern recognition community since the early 1980s. With the exponential growth of available map scans in archives and on the internet, a variety of disciplines in the natural and social sciences have grown interested in using historical maps as a primary source of geographical and political information in their studies. Today, many organizations, such as the United States Geological Survey, the David Rumsey Map Collection, OldMapsOnline.org, and the National Library of Scotland, store numerous historical maps in either paper or scanned format. Only a small portion of these historical maps is georeferenced, and even fewer have machine-readable content or comprehensive metadata. The lack of searchable textual content, including spatial and temporal information, prevents researchers from efficiently finding relevant maps for their research and using the map content in their studies. These challenges present a tremendous collaboration opportunity for the image processing and pattern recognition community to build advanced map processing technologies for transforming the natural and social science studies that use historical maps. This paper presents the potentials of using historical maps in scientific research, describes the current trends and challenges in extracting and recognizing text content from historical maps, and discusses the future outlook.

1 Introduction

Historical maps are an irreplaceable primary source of geographical and political information about the past (e.g., historical place names, landmarks, natural features, transportation networks, and war, trade, and diplomacy networks). The image processing and pattern recognition community began developing computational methods for extracting and recognizing content from archived map images in the early 1980s (Chiang et al. 2014). With the exponential growth of available map scans in archives and on the internet, a variety of disciplines in the natural and social sciences have grown interested in using historical maps in their studies. For example, the Mappa Mundi by Fra Mauro (ca. 1450) (Fig. 1) not only contains place names but also provides “natural philosophy, description of places and people, commercial geography, history, navigation and direction of expansion, and, finally, on what we can nowadays call methodological issues. In addition, Fra Mauro’s world map also includes hundreds of images, representing cities, temples, funerary monuments, streets, and ships, as well as a scene in the lower left corner representing Earthly Paradise” (Nanetti et al. 2015).

In many cases, historical maps are the only source that provides professionally surveyed historical geographic data. Map archives such as the U.S. Geological Survey (USGS) National Geologic Map Database,^{Footnote 1} USGS Topographic Maps,^{Footnote 2} the David Rumsey Map Collection,^{Footnote 3} OldMapsOnline.org,^{Footnote 4} and the National Library of Scotland^{Footnote 5} together store millions of historical maps of this type in either paper or scanned format. For example, between 1884 and 2006, the USGS created over 200,000 topographic maps. According to the USGS, in the United States these topographic maps “portray both natural and manmade features. These maps show and name works of nature including mountains, valleys, plains, lakes, rivers, and vegetation. They also identify the principal works of man, such as roads, boundaries, transmission lines, and major buildings.” The USGS National Geospatial Program has scanned these historical paper maps. Collectively, these publicly available scanned maps portray the evolution of the American landscape over a 125-year period.^{Footnote 6} Similar map series exist in many countries, e.g., the Ordnance Survey maps in the U.K. archived by the National Library of Scotland.

In the case of more recent historical maps produced using modern geospatial survey technologies (e.g., the USGS Topographic Map series, the Ordnance Survey six-inch series, and other national agency series dating from the early 1800s), the detailed map data on the state of past landscapes are essential for understanding the causes and consequences of environmental change and support a variety of natural and social studies on topics such as cancer and environmental epidemiology, urbanization, biodiversity, landscape changes, and history. (See Gregory et al. 2015 for more examples and methodologies in historical geographic information systems.) Many of these historical maps are not georeferenced, and almost all of the maps have content that is not machine-readable. Existing map processing technologies are still limited in making a large number of historical maps fully searchable by their content because the archived documents often suffer from bleaching, blurring, and false coloring (e.g., Khotanzad and Zink 2003; Leyk and Boesch 2009; Leyk et al. 2006). The reader is referred to Chiang et al. (2014) and Chiang et al. (2016) for detailed reviews and case studies on map processing techniques and systems.

Today, a researcher can spend a great deal of time and effort searching and cross-referencing data sources to find relevant maps, and then needs to digitize each map to convert its content to a machine-readable format (e.g., Godfrey and Eveleth 2015; Nanetti et al. 2015). The researcher may need to search various publication repositories, map repositories, and search engines, and will often still not find the historical map they are looking for and must work without it. In many cases, these historical maps exist, but locating and digitizing them requires too much effort. The result is that researchers waste time and resources and do not get as far as they could have in their work because the relevant information is not discoverable or takes too long to prepare for scientific analysis.

The challenges in working with historical maps present an enormous collaboration opportunity for the image processing and pattern recognition community to build advanced map processing technologies for transforming the scientific studies that currently use textual content in historical maps. Therefore, it is important to understand the current landscape of the broad applications of historical maps. This paper first describes the potentials and current applications of historical maps in a variety of studies, including topics in natural science (Sect. 2) and social science (Sect. 3). Next, the paper describes the current trends in extracting and recognizing textual content from historical maps (Sect. 4). Finally, the paper discusses the future outlook for text recognition technologies in map processing (Sect. 5).

2 Potentials and Applications of Historical Maps in Natural Science

Historical data archives (e.g., museum and herbaria collections, digital photography and newspaper archives) support a variety of scientific studies in natural science on topics such as biodiversity (e.g., Hill et al. 2009), evolutionary biology (e.g., Lavoie 2013), human disease (e.g., Yoshida et al. 2014), plant biology (Davis et al. 2015; Vellend et al. 2013), and ecology (e.g., Newbold 2010; Pyke and Ehrlich 2010). However, geolocating the historical localities mentioned in archives (e.g., the Calflora Observation Database^{Footnote 7} and the Global Biodiversity Information Facility; Samy et al. 2013) is challenging and very often a tedious manual process using historical maps. Murphey et al. (2004) reviewed the problems in georeferencing museum collections. They compared a number of geoparsing tools, including GEOLocate (Rios and Bart 2010) and BioGeomancer (Guralnick et al. 2006). Since then, a variety of advanced algorithms for geoparsing have been proposed (e.g., Leidner and Lieberman 2011), and open-source software packages (e.g., CLAVIN,^{Footnote 8} CLIFF (D’Ignazio et al. 2014), and the Edinburgh Geoparser (Alex et al. 2015)) have become available. These algorithms and tools are widely used for geolocating places in unstructured text and are also used in spatial humanities research (e.g., Gregory et al. 2015). However, these tools need a “gold data” gazetteer to provide the location information of recognized place names, and the lack of historical reference gazetteers remains a challenge. The result is that even if the geoparsing software can correctly identify a historical name as a geolocation reference in unstructured text, the geocoordinates of the historical name are still unknown if the place name no longer exists. To locate place names that no longer exist in contemporary data sources, a researcher needs to search and cross-reference a variety of data sources such as archives of historical maps, newspapers, and photography.

For example, a data record in an online database of California herbarium specimens describes an August 16th, 1902 observation of Artemisia douglasiana (California mugwort) at the location “near Mesmer” in Los Angeles. The place name Mesmer near or within both the City and County of Los Angeles no longer exists in contemporary geographic data sources, including authoritative sources like the U.S. Census^{Footnote 9} and USGS GNIS (the United States Geological Survey Geographic Names Information System)^{Footnote 10} and open sources such as GeoNames,^{Footnote 11} OpenStreetMap,^{Footnote 12} and Wikipedia. Searching “Mesmer” in the GeoNames gazetteer returns an airport, “Mesmer Airport”, in New York and a street, “Rue Mesmer”, in Haiti. Neither result helps to geolocate the observation of California mugwort in 1902. A Google search with the keywords “Mesmer” and “Los Angeles” reveals a few interesting facts that could be helpful for geolocating Mesmer. First, the search results include a person, Louis Mesmer (1829–1900), who was a prominent businessman and the owner of the famous United States Hotel in Los Angeles. Because it was common to name locations after well-known families (e.g., Wilshire, Hancock, and Doheny in Southern California), Mesmer could have been a place name in the Los Angeles area in the past. Second, the search results contain a link to a map in the Los Angeles Public Library collections showing a proposed development plan in 1924 for “Mesmer City” in Los Angeles (Fig. 2). At the time, Mesmer City was advertised as “In the direct path of the Los Angeles’ growth toward the ocean”.^{Footnote 13} This map further narrows down the search space for Mesmer to somewhere near Culver City and Baldwin Hills. Together, the time and location information from the search results points to the USGS topographic map that contains Mesmer in 1901 (Fig. 3). In this case, Mesmer is geolocated, but the entire process cannot be scaled to handle thousands of records in an efficient manner.

Historical GIS (Geographic Information System) (Gregory and Ell 2007) could alleviate the problem of geolocating historical locality references by providing a platform for collecting datasets of historical place names, but such datasets are rarely available. Even when historical gazetteers are available, their spatiotemporal coverage is often sparse. For example, the U.S. Census only provides census gazetteer files for 1990, 2000, and 2010 onward. NHGIS (the National Historical Geographic Information System at the Minnesota Population Center)^{Footnote 14} provides historical demographic data down to the census tract level but only a few place names. The Ramsay Place Names File from the State Historical Society of Missouri provides a historical gazetteer covering locations in the State of Missouri from 1928 to 1945 (Adams 1928). The website “A Vision of Britain through Time” from the GB Historical GIS at the University of Portsmouth^{Footnote 15} provides historical place names in Great Britain dating back to the early 19th century.

As shown in the examples of natural science studies in this section, the ability to automatically use the textual content in historical maps as a locality reference source would transform historical data records in documents and collections into georeferenced datasets. This ability would enable natural science researchers to efficiently find, query, and analyze a variety of historical records by location.

3 Potentials and Applications of Historical Maps in Social Science

Historical maps also play an important role in social science studies. Kurashige (2013)^{Footnote 16} used historical census data, voting records, and precinct numbers and boundaries extracted from a 1920 map to study “who” (e.g., by occupation and political party) in Los Angeles voted for the 1920 California Alien Land Law, which discriminated against the Japanese (Fig. 4).

Ngo et al. (2015) used historical maps and land records to build an interactive visualization of land reclamation in Hong Kong (Fig. 5). This web tool^{Footnote 17} is among the top hits when searching Hong Kong land reclamation on Google.

The Spatial Sciences Institute at the University of Southern California (USC) is collaborating with an insurance company to automatically read historical Ordnance Survey maps (ca. 1900–1970) covering the entire U.K. to identify likely locations of subterranean contamination, such as factories, mines, quarries, and gas works which no longer exist and otherwise would not be known today (Fig. 6).^{Footnote 18}

In a joint effort, the USC Shoah Foundation Visual History Archive (VHA) works with the USC Spatial Sciences Institute to link historical maps to places mentioned in genocide survivor testimonies in the VHA archive. The linkages enrich the personal stories of the survivors by using the spatial and temporal context in historical maps to enable the viewers to “go back in time” to recreate the physical world of the historical experience of the survivors (Fig. 7).^{Footnote 19}

Nanetti et al. (2015) manually transcribed and georeferenced the textual content in the Mappa Mundi by Fra Mauro (ca. 1450). They use the transcribed data as a knowledge aggregator to represent the world as seen from Venice in the fifteenth century. They also plan to use the map data for automatic provenance and validation assessment of large and heterogeneous collections of other historical sources.

The example studies in this section demonstrate the power of historical maps in social science research. They also show that extracting and recognizing map content is only part of what is needed: providing semantic annotations for the map content and linking it to other data sources will enable researchers to investigate complex social science problems at a scale that is not possible today.

4 Current Trends in Text Recognition from Historical Maps

Text recognition from maps is a difficult task, especially for historical maps. This is because map labels often overlap with other map features, such as road lines, do not follow a fixed orientation within a map, and can be stenciled or handwritten (Chiang et al. 2014, 2016; Nagy et al. 1997). Also, many scanned historical maps suffer from poor graphical quality due to bleaching of the original paper maps and archiving practices. This section presents a number of trends in text recognition from historical maps.

Classic Approaches.

Traditionally, approaches to text recognition from historical maps follow the classic document recognition strategy: they first analyze the map to identify potential text areas and then use optical character recognition (OCR) algorithms or tools (e.g., TesseractOCR)^{Footnote 20} to convert the detected areas into machine-readable text (Honarvar Nazari et al. 2016; Chiang and Knoblock 2014; Li et al. 2000; Pezeshk and Tutwiler 2010, 2011; Raveaux et al. 2007, 2008). This line of work has demonstrated promising results on single map sheets or map series but still cannot handle large numbers and varied types of historical maps because of their significant heterogeneity.

Crowdsourcing Approaches.

To handle the vast variety of historical map types and sources, recent work has adopted the crowdsourcing strategy. Crowdsourcing is not a new idea in document recognition and map processing, but it can be very difficult to implement and popularize. The David Rumsey Map Collection held crowdsourcing events to georeference its map collections. The New York Public Library provides semi-automatic tools for the public to extract parcel polygons from its historical insurance maps.^{Footnote 21} The library also noted that even with the crowdsourcing approach, semi-automatic tools are required to process its map collections in a reasonable time (Arteaga 2013). Specifically for converting textual content in historical maps using crowdsourcing, the Pelagios Commons^{Footnote 22} is a notable community that provides tools and online infrastructure to facilitate annotating historical locality references in digital materials. Their tools allow semi-automatic extraction, recognition, annotation, and linking of place names in historical maps (Simon et al. 2010, 2014, 2015). These tools and the online infrastructure allow them to provide full-text searchable place data ranging from ancient times to 1500 AD and from Europe to East Asia.

To make the crowdsourcing strategy more effective and efficient, it will be necessary to build adaptive semi-automatic techniques that improve the level of automation as more maps are processed. Also, when the crowdsourcing strategy is used, approaches for cross-validating user-generated content against the “gold data” and for recording provenance information are required (e.g., Garijo et al. 2015).

Multi-model Approaches.

Another line of work in the recent development of text recognition from historical maps uses additional data sources as a dictionary to help correct recognition errors. While this dictionary strategy is common in OCR, compiling and effectively using a dictionary for recognizing historical map text is difficult. This is because a dictionary built from contemporary data sources does not contain place names that no longer exist. Also, without knowing the map coverage beforehand, multiple dictionary entries can match a partially recognized label. For example, a partially recognized label “Glas wo” near London could be matched to “Glasgow” when the label is actually “Glassworks”. Even when the map coordinates are known, the map text might not be at the exact location of the geographic feature, depending on the cartographic labeling practice of the map. Weinman (2013) presents an approach that overcomes this challenge. His approach recognizes text labels in maps and then matches the recognized text to a gazetteer by their position patterns using a RANSAC variant called MLESAC (Torr and Zisserman 2000). He showed that this approach could automatically georeference historical maps and improve the recognition accuracy even when the gazetteer only contains 70% of the text in the test maps.
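The idea of using position patterns to reject wrong dictionary matches can be illustrated with a minimal sketch. It uses plain RANSAC rather than MLESAC and is not Weinman's implementation; the axis-aligned (north-up) pixel-to-coordinate model, the function names, the tolerance, and all coordinates are assumptions made for illustration only.

```python
import random


def fit_axis_aligned(p, q):
    """Fit lon = a*x + b and lat = c*y + d from two (pixel, geo) pairs.

    Assumes a north-up map with independent x and y scales (a simplification).
    """
    (x1, y1), (lon1, lat1) = p
    (x2, y2), (lon2, lat2) = q
    if x1 == x2 or y1 == y2:
        return None  # degenerate sample, cannot determine the scales
    a = (lon2 - lon1) / (x2 - x1)
    c = (lat2 - lat1) / (y2 - y1)
    return a, lon1 - a * x1, c, lat1 - c * y1


def ransac_align(candidates, tol, iterations=200, seed=0):
    """Return the model with the most inliers and the inlier indices."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        model = fit_axis_aligned(*rng.sample(candidates, 2))
        if model is None:
            continue
        a, b, c, d = model
        inliers = [
            i
            for i, ((x, y), (lon, lat)) in enumerate(candidates)
            if abs(a * x + b - lon) <= tol and abs(c * y + d - lat) <= tol
        ]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers


# Hypothetical candidate matches: (label position in pixels, gazetteer lon/lat).
# The last entry pairs the partial label "Glas wo" with Glasgow by mistake.
candidates = [
    ((100, 200), (-0.19, 51.58)),
    ((900, 300), (-0.11, 51.57)),
    ((400, 800), (-0.16, 51.52)),
    ((700, 600), (-0.13, 51.54)),
    ((500, 500), (-4.25, 55.86)),
]
model, inliers = ransac_align(candidates, tol=0.001)
```

In this toy setting the wrongly matched "Glasgow" candidate is geometrically inconsistent with the other labels and is left out of the inlier set, while the surviving matches define an approximate georeferencing of the map.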

In previous work, we developed a semi-automatic approach that extracts and recognizes text labels in map images in a system called Strabo (Apache Version 2 License) (Chiang and Knoblock 2014). While Strabo could achieve over 90% precision and recall in recognizing text labels in scanned contemporary maps, it could only produce 47.6% precision and 83.5% recall on well-conditioned text from historical Ordnance Survey six-inch maps (Chiang et al. 2016). The result is that very often only partial labels could be recognized from a historical map (Figs. 8(a) and (b)) and manual post-processing is required to correct the recognition results.

In an effort to test higher levels of automation in text recognition from historical maps (Yu et al. 2016), we exploit the fact that geographic names for the same area found in different data sources are not independent and use geographic names in OpenStreetMap and other maps covering the same area as the “dependent” knowledge source. Given a historical map, the task at hand is to recognize all map labels in the map accurately without user intervention. First, the system queries a map repository to find all map editions covering the same area and then extracts and recognizes labels in the identified maps. Second, the system uses a fuzzy matching algorithm to match the recognized (imperfect) labels across editions by their locations and string similarity. Finally, the system uses two million geographical names extracted from OpenStreetMap to generate an improved recognition result. For example, by matching “Cltureh” from the 1935 map to “urch” in the 1900 map, the system finds the word “Church” in the geographic names extracted from OpenStreetMap to replace “Cltureh” and “urch” in the 1935 and 1900 recognition results, respectively.
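The matching and correction steps can be sketched as follows. This is a simplified illustration under stated assumptions, not the system of Yu et al. (2016): the function names, the distance threshold, the label coordinates, and the tiny dictionary are hypothetical, and Python's `difflib.SequenceMatcher` ratio stands in for whatever string similarity measure the actual system uses.

```python
from difflib import SequenceMatcher
from math import hypot


def similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def pair_labels(labels_a, labels_b, max_dist):
    """Pair each label of edition A with a nearby label of edition B.

    Labels are (text, x, y) tuples in a shared coordinate frame (assumed).
    Among labels within max_dist, the most similar string is chosen.
    """
    pairs = []
    for text_a, xa, ya in labels_a:
        nearby = [
            text_b
            for text_b, xb, yb in labels_b
            if hypot(xa - xb, ya - yb) <= max_dist
        ]
        if nearby:
            best = max(nearby, key=lambda t: similarity(text_a, t))
            pairs.append((text_a, best))
    return pairs


def best_dictionary_word(readings, dictionary):
    """Pick the dictionary word most similar to all noisy readings combined."""
    return max(dictionary, key=lambda w: sum(similarity(r, w) for r in readings))


# Hypothetical OCR output for the same area in two map editions.
labels_1935 = [("Cltureh", 120.0, 80.0)]
labels_1900 = [("urch", 122.0, 79.0), ("Mill", 300.0, 40.0)]
names = ["Church", "Chapel", "Culvert", "Mill"]  # stand-in for OpenStreetMap names

pairs = pair_labels(labels_1935, labels_1900, max_dist=10.0)
corrected = [best_dictionary_word(pair, names) for pair in pairs]
```

Scoring a dictionary word against both readings at once is the point of the sketch: each reading alone is a weak match, but together they favor a single name.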

For the multi-model approaches, the current challenges include how to exploit string similarity measures between the extracted map text (which contains recognition errors) and other sources (e.g., gazetteer entries) to (1) prune the search space for finding the matching pattern efficiently and (2) use matches between the OCR text and dictionary entries to learn potential OCR errors specifically for each map type. For example, the character sequences “ni” and “in” are commonly recognized as one character “m” during OCR. With enough training data (matches between OCR text and dictionary entries), the algorithm should be able to learn that the OCR result “Baldwm Hills” is highly likely to be “Baldwin Hills” for a specific map type or condition.
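One simple way to picture the second challenge is to count character-level substitutions observed in matched pairs and reuse them on unseen labels. The sketch below is only an illustration of that idea; the function names and training pairs are hypothetical, and the alignment relies on Python's `difflib` rather than any method from the cited work.

```python
from collections import Counter
from difflib import SequenceMatcher


def learn_confusions(training_pairs):
    """Count (ocr_substring, true_substring) replacements in matched pairs."""
    counts = Counter()
    for ocr_text, truth in training_pairs:
        matcher = SequenceMatcher(None, ocr_text, truth, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "replace":
                counts[(ocr_text[i1:i2], truth[j1:j2])] += 1
    return counts


def correct(ocr_text, confusions, dictionary):
    """Try learned substitutions, most frequent first, until a dictionary hit."""
    if ocr_text in dictionary:
        return ocr_text
    for (wrong, right), _ in confusions.most_common():
        start = ocr_text.find(wrong)
        while start != -1:
            candidate = ocr_text[:start] + right + ocr_text[start + len(wrong):]
            if candidate in dictionary:
                return candidate
            start = ocr_text.find(wrong, start + 1)
    return None  # no single learned substitution yields a known name


# Hypothetical matches between OCR text and dictionary entries for one map type.
training_pairs = [("Sprmg St", "Spring St"), ("Mam St", "Main St")]
confusions = learn_confusions(training_pairs)
fixed = correct("Baldwm Hills", confusions, {"Baldwin Hills", "Culver City"})
```

A per-map-type table of this kind only applies one substitution at a time; a fuller treatment would weight substitutions by frequency and allow several per label.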

5 Outlooks

This paper presents studies in the natural and social sciences that demonstrate the opportunities for the image processing and pattern recognition community to transform conventional research practices in using historical maps. For example, a new technology that automatically generates machine-readable or machine-understandable (e.g., LinkedData (Bizer et al. 2009)) place name databases from historical maps, and does so at scale, will enable biologists to minimize the time and effort for geolocating their data records and to efficiently query and analyze historical records by location and time. These opportunities also present unique possibilities for researchers in image processing and pattern recognition to identify collaborators in other scientific domains. This type of interdisciplinary collaboration allows researchers in image processing and pattern recognition to create algorithms and applications that solve “wicked” research problems and address real-world challenges facing our society. Further, the paper discusses a number of trends and their challenges in text recognition from historical maps. These trends have already shown promising results in the automatic unlocking of textual content in heterogeneous historical maps. Solving the challenges in these trends will make it possible to use a large number of heterogeneous historical maps efficiently and study historical spatiotemporal datasets on a large scale.

Notes

1. USGS NGMDB (2016) [Website]. Retrieved from http://ngmdb.usgs.gov/ngmdb/ngmdb_home.html.
2. USGS topoView (2016) [Website]. Retrieved from http://ngmdb.usgs.gov/maps/TopoView/.
3. David Rumsey (2016) [Website]. Retrieved from http://www.davidrumsey.com/.
4. OldMapsOnline (2016) [Website]. Retrieved from http://www.oldmapsonline.org/.
5. NLS (2016) [Website]. Retrieved from http://maps.nls.uk/.
6. USGS topoView (2016) [Website]. Retrieved from http://ngmdb.usgs.gov/maps/TopoView/.
7. CalFlora (2016) [Data set]. Retrieved from http://www.calflora.org/.
8. CLAVIN (2016) [Computer software]. Retrieved from https://clavin.bericotechnologies.com/.
9. U.S. Census Gazetteer (2016) [Data set]. Retrieved from https://www.census.gov/geo/maps-data/data/gazetteer.html.
10. USGS GNIS (2016) [Data set]. Retrieved from http://geonames.usgs.gov/.
11. GeoNames (2016) [Data set]. Retrieved from http://www.geonames.org/.
12. OpenStreetMap (2016) [Website]. Retrieved from https://www.openstreetmap.org/.
13. Los Angeles Public Library Map Collection (2016) [Website]. Retrieved from https://www.lapl.org/collections-resources/visual-collections/map-collection.
14. NHGIS (2016) [Website]. Retrieved from https://www.nhgis.org/.
15. A Vision of Britain through Time (2016) [Website]. Retrieved from http://www.visionofbritain.org.uk/.
16. Dr. Kurashige’s article published in the Southern California Quarterly won the 2015 Carl I. Wheat Award for the best demonstration of scholarship in that journal from 2012–2014 by a senior historian.
17. http://www.oldhkphoto.com/coast/.
18. Spatial technology opens a window into history (2016) [News article]. Retrieved from https://news.usc.edu/91625/spatial-technology-opens-a-window-into-history/.
19. Peter Feigl's Journey Through Historical Maps (2016) [Website]. Retrieved from http://www.arcgis.com/apps/MapJournal/index.html?appid=6c3b4136b9304df09c9adcf86dd30dd5.
20. Tesseract-OCR (2016) [Computer software]. Retrieved from https://github.com/tesseract-ocr.
21. NYPL map-vectorizer (2016) [Computer software]. Retrieved from https://github.com/NYPL/map-vectorizer.
22. Pelagios Commons (2016) [Website]. Retrieved from http://commons.pelagios.org/.

References

Adams, O.G.: Place Names in the North Central Counties of Missouri (Ph. D.). University of Missouri-Columbia (1928)
Alex, B., Byrne, K., Grover, C., Tobin, R.: Adapting the Edinburgh geoparser for historical georeferencing. Int. J. Humanit. Comput. 9(1), 15–35 (2015)
Arteaga, M.G.: Historical map polygon and feature extractor. In: Proceedings of the 1st ACM SIGSPATIAL International Workshop on MapInteraction, pp. 66–71. ACM (2013)
Bizer, C., Heath, T., Berners-Lee, T.: Linked data - the story so far. Int. J. Seman. Web Inf. Syst. 5(3), 1–22 (2009)
Chiang, Y.-Y., Knoblock, C.A.: Recognizing text in raster maps. GeoInformatica 19(1), 1–27 (2014)
Chiang, Y.-Y., Leyk, S., Knoblock, C.A.: A survey of digital map processing techniques. ACM Comput. Surv. (CSUR) 47(1), 1 (2014)
Chiang, Y.-Y., Leyk, S., Nazari, N.H., Moghaddam, S., Tan, T.X.: Assessing the impact of graphical quality on automatic text recognition in digital maps. Comput. Geosci. 93, 21–35 (2016)
Davis, C.C., Willis, C.G., Connolly, B., Kelly, C., Ellison, A.M.: Herbarium records are reliable sources of phenological change driven by climate and provide novel insights into species’ phenological cueing mechanisms. Am. J. Bot. 102(10), 1599–1609 (2015)
D’Ignazio, C., Bhargava, R., Zuckerman, E.: Cliff-clavin: determining geographic focus for news. In: NewsKDD: Data Science for News Publishing (2014)
Garijo, D., Gil, Y., Harth, A.: Challenges for provenance analytics over geospatial data. In: Ludäscher, B., Plale, B. (eds.) IPAW 2014. LNCS, vol. 8628, pp. 261–263. Springer, Cham (2015). doi:10.1007/978-3-319-16462-5_28
Godfrey, B., Eveleth, H.: An adaptable approach for generating vector features from scanned historical thematic maps using image enhancement and remote sensing techniques in a geographic information system. J. Map Geogr. Librar. 11(1), 18–36 (2015)
Gregory, I., Donaldson, C., Murrieta-Flores, P., Rayson, P.: Geoparsing, GIS, and textual analysis: current developments in spatial humanities research. Int. J. Humanit. Comput. 9(1), 1–14 (2015)
Gregory, I.N., Ell, P.S.: Historical GIS: Technologies, Methodologies, and Scholarship, vol. 39. Cambridge University Press, Cambridge (2007)
Guralnick, R.P., Wieczorek, J., Beaman, R., Hijmans, R.J., Group, B.W., et al.: BioGeomancer: automated georeferencing to map the world’s biodiversity data. PLoS Biol. 4(11), e381 (2006)
Hill, A.W., Guralnick, R., Flemons, P., Beaman, R., Wieczorek, J., Ranipeta, A., Chavan, V., Remsen, D.: Location, location, location: utilizing pipelines and services to more effectively georeference the world’s biodiversity data. BMC Bioinf. 10(Suppl 14), S3 (2009)
Honarvar Nazari, N., Tan, T.X., Chiang, Y.-Y.: Integrating text recognition for overlapping text detection in maps. Electron. Imaging Doc. Recogn. Retrieval XXIII 17, 1–8 (2016)
Khotanzad, A., Zink, E.: Contour line and geographic feature extraction from USGS color topographical paper maps. IEEE Trans. Pattern Anal. Mach. Intell. 25(1), 18–31 (2003)
Kurashige, L.: Rethinking anti-immigrant racism: lessons from the Los Angeles vote on the 1920 Alien Land Law. Southern Calif. Q. 95(3), 265–283 (2013)
Lavoie, C.: Biological collections in an ever changing world: herbaria as tools for biogeographical and environmental studies. Perspect. Plant Ecol. Evol. Syst. 15(1), 68–76 (2013)
Leidner, J.L., Lieberman, M.D.: Detecting geographical references in the form of place names and associated spatial natural language. Sigspatial Spec. 3(2), 5–11 (2011)
Leyk, S., Boesch, R.: Colors of the past: color image segmentation in historical topographic maps based on homogeneity. GeoInformatica 14(1), 1–21 (2009)
Leyk, S., Boesch, R., Weibel, R.: Saliency and semantic processing: extracting forest cover from historical topographic maps. Pattern Recogn. 39(5), 953–968 (2006)
Li, L., Nagy, G., Samal, A., Seth, S., Xu, Y.: Integrated text and line-art extraction from a topographic map. Int. J. Doc. Anal. Recogn. 2(4), 177–185 (2000)
Murphey, P.C., Guralnick, R.P., Glaubitz, R., Neufeld, D., Ryan, J.A.: Georeferencing of museum collections: a review of problems and automated tools, and the methodology developed by the mountain and plains spatio-temporal database-informatics initiative (Mapstedi). Phyloinformatics 1(3), 1–29 (2004)
Nagy, G., Samal, A., Seth, S., Fisher, T.: Reading street names from maps-technical challenges. In: Proceedings of GIS/LIS (1997)
Nanetti, A., Cattaneo, A., Cheong, S.A., Lin, C.-Y.: Maps as knowledge aggregators: from Renaissance Italy Fra Mauro to web search engines. Cartographic J. 52(2), 159–167 (2015)
Newbold, T.: Applications and limitations of museum data for conservation and ecology, with particular attention to species distribution models. Prog. Phys. Geogr. 34(1), 3–22 (2010)
Ngo, V., Swift, J., Chiang, Y.-Y.: Visualizing land reclamation in Hong Kong: a web application. In: International Cartographic Conference (2015)
Pezeshk, A., Tutwiler, R.L.: Improved multi angled parallelism for separation of text from intersecting linear features in scanned topographic maps. In: IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 1078–1081. IEEE (2010)
Pezeshk, A., Tutwiler, R.L.: Automatic feature extraction and text recognition from scanned topographic maps. IEEE Trans. Geosci. Remote Sens. 49(12), 5047–5063 (2011)
Pyke, G.H., Ehrlich, P.R.: Biological collections and ecological/environmental research: a review, some observations and a look to the future. Biol. Rev. Camb. Philos. Soc. 85(2), 247–266 (2010)
Raveaux, R., Burie, J.C., Ogier, J.M.: A colour document interpretation: application to ancient cadastral maps. In: Ninth International Conference on Document Analysis and Recognition (ICDAR 2007), vol. 2, pp. 1128–1132. IEEE (2007)
Raveaux, R., Burie, J.C., Ogier, J.M.: Object extraction from colour cadastral maps. In: The Eighth IAPR International Workshop on Document Analysis Systems, DAS 2008, pp. 506–514. IEEE (2008)
Rios, N.E., Bart, H.L.: GEOLocate (Version 3.22) [Computer software] (2010)
Samy, G., Chavan, V., Ariño, A.H., Otegui, J., Hobern, D., Sood, R., Robles, E.: Content assessment of the primary biodiversity data published through GBIF network: status, challenges and potentials. Biodivers. Inform. 8(2) (2013). http://doi.org/10.17161/bi.v8i2.4124
Simon, R., Barker, E., Isaksen, L.: Linking early geospatial documents, one place at a time: annotation of geographic documents with Recogito. e-Perimetron 10(2), 49–59 (2015)
Simon, R., Pilgerstorfer, P., Isaksen, L., Barker, E.: Towards semi-automatic annotation of toponyms on old maps. e-Perimetron 9(3), 105–128 (2014)
Simon, R., Sadilek, C., Korb, J., Baldauf, M., Haslhofer, B.: Tag clouds and old maps: annotations as linked spatiotemporal data in the cultural heritage domain. In: Workshop on Linked Spatiotemporal Data, Zurich, Switzerland (2010)
Torr, P.H.S., Zisserman, A.: MLESAC: a new robust estimator with application to estimating image geometry. Comput. Vis. Image Underst. 78(1), 138–156 (2000)
Vellend, M., Brown, C.D., Kharouba, H.M., McCune, J.L., Myers-Smith, I.H.: Historical ecology: using unconventional data sources to test for effects of global environmental change. Am. J. Bot. 100(7), 1294–1305 (2013)
Weinman, J.: Toponym recognition in historical maps by gazetteer alignment. In: Proceedings of the 12th International Conference on Document Analysis and Recognition, pp. 1044–1048 (2013)
Yoshida, K., Burbano, H.A., Krause, J., Thines, M., Weigel, D., Kamoun, S.: Mining herbaria for plant pathogen genomes: back to the future. PLoS Pathog. 10(4), e1004028 (2014)
Yu, R., Luo, Z., Chiang, Y.-Y.: Recognizing text on historical maps using maps from multiple time periods. In: Proceedings of the 23rd International Conference on Pattern Recognition (2016)

Acknowledgements

This research is based upon work supported in part by the National Science Foundation under award number IIS-1564164 and in part by the University of Southern California under the Undergraduate Research Associates Program (URAP). The author thanks Travis Longcore for his input on the biology studies and the U.S. National Committee (USNC) to the International Cartographic Association (ICA) for providing travel funding to attend the 27th International Cartographic Conference (ICC).

Author information

Authors and Affiliations

Spatial Sciences Institute, University of Southern California, Los Angeles, CA, 90089, USA
Yao-Yi Chiang

Corresponding author

Correspondence to Yao-Yi Chiang.

Editor information

Editors and Affiliations

The University of South Dakota, Vermillion, South Dakota, USA
K.C. Santosh
Karnatak Arts, Science and Commerce College, Bidar, India
Mallikarjun Hangarge
Politecnico di Bari, Bari, Italy
Vitoantonio Bevilacqua
University of Hyderabad, Hyderabad, India
Atul Negi

About this paper

Cite this paper

Chiang, YY. (2017). Unlocking Textual Content from Historical Maps - Potentials and Applications, Trends, and Outlooks. In: Santosh, K., Hangarge, M., Bevilacqua, V., Negi, A. (eds) Recent Trends in Image Processing and Pattern Recognition. RTIP2R 2016. Communications in Computer and Information Science, vol 709. Springer, Singapore. https://doi.org/10.1007/978-981-10-4859-3_11

DOI: https://doi.org/10.1007/978-981-10-4859-3_11
Published: 29 April 2017
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-4858-6
Online ISBN: 978-981-10-4859-3
eBook Packages: Computer Science, Computer Science (R0)
