Introduction

Two decades ago, few Earth scientists would turn to the World Wide Web to discover and access geoscience data. At that time, network bandwidth was low and the available websites were limited, not to mention the shortage of methods and technologies for sharing and browsing geoscience data on the Web. The Open Geospatial Consortium (OGC, http://www.opengeospatial.org) was established in 1994 to “advance the development and use of international standards and supporting services that promote geospatial interoperability”. Now, about twenty years later, OGC standards have proven suitable for serving geospatial data on the Web. Geoscience data, as a distinct subset of geospatial data, are also increasingly made available online by using OGC standards.

Coincidentally, the World Wide Web Consortium (W3C, https://www.w3.org) was also established in 1994, with the mission to “lead the World Wide Web to its full potential by developing protocols and guidelines that ensure the long-term growth of the Web”. A large portion of W3C’s deliverables are recommendations (de facto standards) for the Semantic Web, which extends and enhances the original Web by adding machine readable structures and thus meanings into the content on the Web (Berners-Lee et al. 2001). W3C standards provide the fundamentals for constructing a space of Linked Open Data in the Semantic Web. Berners-Lee (2006) proposed a five-star scheme for constructing the Linked Open Data, which can be summarized as: (1) Make data available on the Web with an open license; (2) Use a machine-readable format; (3) Use a non-proprietary format; (4) Use open standards from W3C to identify things; and (5) Link data to other people’s data. Because the Linked Open Data scheme endorses open data format and linkages among datasets, it helps facilitate data interoperability within the context of a Web of Data (Bizer et al. 2009).

In the geoscience community, while the OGC standards have already been widely used to build data services on the Web, work on using W3C standards for the Linked Open Data of geoscience is still underdeveloped. A search on Google Scholar in December 2016 returned about 12,200 results for the combined keywords “linked data” and “biology”, about 18,200 results for “linked data” and “geography”, and only about 1300 results for “linked data” and “geoscience”. Clearly there is space and opportunity for more efforts to leverage OGC and W3C standards for an environment of Linked Geoscience Data. To put the work into practice, the geoscience community can benefit from the experience of the GIScience community. In the past decade, GIScience researchers have made significant progress on using semantic technologies to enrich the functionality of data services in spatial data infrastructures. Early works included improving the annotation, discoverability and accessibility of geospatial information (Yue et al. 2007; Schade et al. 2010; Janowicz et al. 2010) and facilitating interoperability within and between spatial data infrastructures (Lacasta et al. 2007; Lutz et al. 2009; Vaccari et al. 2009). Recent efforts also addressed the need for efficient ways to transform geospatial data into knowledge (Zhao et al. 2009; Yang et al. 2010) and for facilities for online data processing (Usery and Varanka 2012; Zhao et al. 2012). Most recently, OGC and W3C jointly set up the Spatial Data on the Web Working Group (Taylor and Parsons 2015) to take stock of existing best practices on geospatial data services, review methods for integrating spatial information with other relevant data and determine approaches to improve the discoverability and accessibility of spatial information for both machines and humans.

Janowicz et al. (2010) discussed the semantic challenges of five key activities within a spatial data infrastructure: discovery, access, registration, processing and visualization. The processing and visualization steps were considered as a synthesis process where all input data are to be aggregated, analyzed and interpreted in a meaningful way. Geoscience data cover various subjects and have heterogeneous data structures and diverse terminologies (Reitsma and Albrecht 2005; Ramachandran et al. 2006; Berg-Cross et al. 2012; Narock and Fox 2015). To address the semantic challenges in geoscience data processing and visualization, a geoscience data service is not only a matter of making data available online, but also covers various other topics such as knowledge engineering, concept recognition and linking, as well as concept representation and annotation. It is desirable to have a knowledge base of recognized concepts and relationships in geoscience data services, through which automatic or semi-automatic data processing and visualization can be made available. There is an increasing need for domain specific data standards and models in geosciences that can be used to construct such knowledge bases, as well as for functions to deploy them (Janowicz et al. 2015; Ma et al. 2015). By adding meaningful data structures and interactive data visualizations into existing geoscience data services and connecting the services to external resources, the data services can be made “smart”. The objective of such technological fusion is to improve the understandability and usability of geoscience data records, and to lower the barrier to efficient use of the data service by both researchers and the general public.

In this paper, the author presents the methods and technologies of applying a domain specific knowledge base and data visualization to leverage the functionality of existing geoscience data services and to interact with other resources on the Web of Data. Work on semantic modeling and encoding, multilingual vocabularies, interactive data visualization, web map service and processing, and the query of linked data is introduced through detailed examples. This work bridges existing OGC and W3C standards and leverages their functionalities to a new level for domain-specific applications.

An approach to leverage geoscience data service

Researchers in various sub-disciplines of geoscience have discussed the embedded knowledge in datasets (Loudon and Laxton 2007; Ma et al. 2010; Richard et al. 2003). Their work reveals that there is a process by which information passes from the tacit knowledge in researchers’ memory to the design of methods and procedures, data structures, data collection, and eventually to the shared datasets. Such a process exists in geoscience work no matter whether the data is “Born Analog” (e.g. paper, field notes, books) or “Born Digital” (e.g. computer, databases, Internet). A similar point of view is depicted in the data lifecycle of the Data Documentation Initiative (DDI) (DDI Alliance 2016). As shown in Fig. 1, before data collection and processing, there is a step called “Concept” in which the stakeholders identify the domain and subjects of a work, articulate and define concepts, develop data models and configure a framework for subsequent data collection efforts.

Fig. 1

DDI Data Lifecycle (From DDI Alliance 2016)

Being informed about the embedded knowledge in datasets is a starting point for data reuse, as well as a gateway to data interoperability, especially when datasets are collected from different sources. In the context of the Semantic Web, scientists use digital formats to record their knowledge and build knowledge bases to underpin datasets. Through those knowledge bases the tacit knowledge in researchers’ memory is made accessible and readable to both humans and machines. Ontologies are one of the most widely used methods in knowledge base construction. Each ontology is the formal specification of a shared conceptualization of a domain (Gruber 1995), and it provides the conceptual structure for data exchanged via the Semantic Web (Ma et al. 2016). W3C develops many standards to guide and formalize the modeling and encoding of ontologies, as well as the construction of knowledge bases and the Linked Open Data.

The fundamental data structure of ontologies and datasets in the Semantic Web is the Resource Description Framework (RDF, https://www.w3.org/TR/rdf11-primer/), which has a triple form “Subject, Predicate, Object”. For example, “isc:Jurassic rdf:type skos:Concept” asserts that “isc:Jurassic” is an instance of “skos:Concept”. People often use the word “triplization” to describe the process of transforming a dataset from its previous format into RDF (Stadler et al. 2012). In the field of geoscience, ontologies and knowledge bases have increasingly been built in recent years (Raskin and Pan 2005; Tripathi and Babaie 2008; Zhong et al. 2009; Klug and Kmoch 2014; Ma and Fox 2014), which supports efforts to collect datasets directly in RDF format and share them in the Linked Open Data, i.e. “Born Semantic” (Leadbetter 2015). The left part of Fig. 2 summarizes the application of the Linked Open Data approach in geosciences, and the right part depicts the approach of the OGC standard-based data service. The two approaches each represent a way of publishing and sharing datasets on the Web with their corresponding technologies. Following either approach, the final output is self-contained and is independent from that of the other approach. Since integrated applications of the Linked Open Data and the OGC approaches have been proven suitable in the GIScience community, the question here is: what are the significant challenges for the geoscience community to fuse and leverage the two approaches?
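
As a minimal sketch of the triple form described above, the following Python snippet stores a few statements as (subject, predicate, object) tuples and looks up the instances of a class. Only the first triple comes from the text; the other two are illustrative assumptions, and a real application would normally rely on an RDF library rather than plain tuples.

```python
# Each RDF statement is a (subject, predicate, object) triple.
# Only the first triple is taken from the text; the others are assumed examples.
triples = [
    ("isc:Jurassic", "rdf:type", "skos:Concept"),
    ("isc:Cretaceous", "rdf:type", "skos:Concept"),    # assumed example
    ("isc:Jurassic", "skos:broader", "isc:Mesozoic"),  # assumed example
]


def instances_of(graph, cls):
    """Return the subjects asserted to be instances of the given class."""
    return [s for (s, p, o) in graph if p == "rdf:type" and o == cls]


concepts = instances_of(triples, "skos:Concept")
print(concepts)
```

The same pattern matching idea (fix some positions of the triple, leave others open) is what a SPARQL query expresses declaratively.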

Fig. 2

An overview of W3C and OGC approaches for building geoscience data service. The term “Born Semantic” was from Leadbetter (2015)

One of the biggest challenges is the shortage of well-curated geoscience data standards, including schemas, ontologies and vocabularies. Conventionally, OGC and W3C do not cover domain specific data standards in geoscience (cf. McKee 2016). Although there are community efforts on building data models, ontologies and vocabularies, the discussion and use of the outputs are often restricted to their corresponding disciplines. This limits the visibility of those domain specific data standards and hinders the construction of knowledge bases in geoscience data services. As analyzed in the previous section, such knowledge bases play a crucial role in the “smart” functions of geoscience data services. Therefore, how to coordinate the data standards from different sub-disciplines and develop efficient methods to implement them is a complex issue that the geoscience community needs to address. Geoinformatics groups in the geoscience community, such as the Commission for the Management and Application of Geoscience Information (CGI) (http://www.cgi-iugs.org) and the Earth Science Information Partners (ESIP) (http://www.esipfed.org), have already begun work on coordinating the development of schemas, ontologies and vocabularies towards shared knowledge bases. Example outputs from those efforts include GeoSciML (http://www.geosciml.org) and the ESIP ontology portal (http://semanticportal.esipfed.org). Methods and technologies are needed to apply those community-built knowledge bases in geoscience data services.

Both the Linked Open Data and the OGC approaches provide interfaces for accessing the structures of geoscience data. Those interfaces make it possible to develop interactive functions for data processing and visualization supported by knowledge bases (Fig. 3). For a certain subject in geoscience, there could be data resources available both in the Linked Open Data and via OGC standard-based data services. Despite the different approaches in data modeling and recording, they present the same meanings and share the same subjects and concepts in the background knowledge. Those subjects and concepts can be the building blocks to construct the needed connection between geoscience data and a knowledge base. In this paper, the author demonstrates a case study for bridging and fusing data services underpinned by W3C and OGC standards through functions enabled by a knowledge base. The core of this case study is the development of interactive functions underpinned by a knowledge base of domain specific ontologies and vocabularies. The data services accessed include those from spatial data infrastructures, the Linked Open Data, as well as other resources on the broad Web of Data (Fig. 3). The OGC-W3C Spatial Data on the Web Working Group recently released a list of best practices (Tandy et al. 2017). This research used several resources and methods presented in that list. Details will be described and discussed in the following sections. Though the context of this research is geoscience, the author hopes the presented work will be a complementary contribution to the broad geospatial information community.

Fig. 3

An approach to use knowledge bases of geoscience ontologies and vocabularies to leverage W3C and OGC standards in the construction of “smart” geoscience data services

Knowledge base, datasets, and technological components

The proposed study was derived from a previous work, which applied a geologic time ontology to enrich features of a map service provided by the British Geological Survey (Ma et al. 2012). In the study presented in this paper all the key components were updated with new ontologies, data resources and technologies (Table 1). Although the subject in this study was geologic time, the technologies used in constructing the data service can be applied to other subjects in geoscience with minor adaptation.

Table 1 Comparing key technological components from this study to previous work

Multilingual vocabularies based on W3C standards

The conceptual model of geologic time scale proposed by Cox and Richard (2005) addressed a long-standing question in the field of geoscience. The two key time concepts “instant” and “interval” in their paper are consistent with people’s understanding of the general concept of time and also address the needs of researchers in stratigraphy. Since the publication of their paper in 2005, researchers across the world have been working on geologic time ontologies and vocabularies. In a review paper by Ma and Fox (2013), the characteristics of several works were summarized. In the past two years, Cox and his colleagues have made new progress on both the ontology model and the vocabulary service (Cox and Richard 2015; Cox 2016). Their new work has several features: (1) A first attempt to encode relevant ISO standards and use them for modeling and encoding the geologic time scale; (2) An ordinal-hierarchical conceptual structure built from several small ontologies and the Simple Knowledge Organization System (SKOS, https://www.w3.org/TR/skos-primer/); (3) Presentation of the time boundary as a first-class object, rather than just a literal value; and (4) Geologic time vocabulary services built with the Spatial Information Services Stack Vocabulary Service (SISSVoc, http://www.sissvoc.info). In view of those advantages, this study took the ontology and vocabulary service built by Cox and his colleagues as the knowledge base for geologic time scale.

By using the properties “skos:prefLabel” and “skos:altLabel” from SKOS, each concept in the vocabulary of geologic time scale can have labels in several languages (Ma et al. 2011). A key reason for this study to reuse the vocabulary developed by Cox and Richard (2015) is that it includes labels in more than 20 languages. Each label provides a key to retrieve more information about a concept from the Web of Data in its corresponding language. Switching between languages is a new way for researchers to interact with the Web, and this also provides an opportunity for users to access online data resources beyond the language barriers.
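
The role of language-tagged labels can be illustrated with a minimal sketch. The dictionary below is a hypothetical in-memory extract of “skos:prefLabel” values for one concept, with only two languages shown; the function name and the fallback behavior are assumptions for illustration and not part of the vocabulary service.

```python
# Hypothetical extract of skos:prefLabel values, keyed by concept and language tag.
PREF_LABELS = {
    "isc:Jurassic": {"en": "Jurassic", "es": "Jurásico"},
}


def pref_label(concept, lang, fallback="en"):
    """Return the preferred label in `lang`, or the fallback language if missing."""
    labels = PREF_LABELS.get(concept, {})
    return labels.get(lang, labels.get(fallback))


print(pref_label("isc:Jurassic", "es"))
```

Falling back to a default language is one possible design choice; an interface could equally hide concepts that lack a label in the chosen language.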

Web of data

The Web of Data covers various resources, including those made available through the Linked Open Data. It is worth noting that the efforts on Linked Open Data promote several best practices towards a Web of Data (Greiner et al. 2017). The Linked Open Data adds categories, annotations and identifications to the digital resources on the Web and facilitates linkages among those resources (Ma et al. 2014). For example, if two entities from different sources are both asserted as instances of a certain class in an ontology, then there is a relation between those two entities because they share the same class. Moreover, all those resources are discoverable and accessible on the Web, which improves data discovery and reuse with minimal dependencies.

The Web of Data provides resources and opportunities for exploring more information about certain subjects in geosciences, which in this study are the concepts in the geologic time scale. Because the Web of Data is an open space, both domain specific knowledge bases and crowd-sourced datasets are made available for access. This study intended to carry out experimental studies so several data sources were explored. For domain specific knowledge bases the vocabulary service at CSIRO (Cox and Richard 2015) was used. For crowd-sourced datasets both DBpedia (Bizer et al. 2009) and Wikipedia were used because they contain abundant information about geologic time concepts in different languages. To query those resources, the SPARQL language (https://www.w3.org/TR/sparql11-overview) was used in this study. Familiarity with the ontologies used in the knowledge bases or data resources is an advantage for writing query scripts.
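
As a hedged illustration of such a query, the sketch below assembles a SPARQL query that asks for the label and abstract of a DBpedia resource in one language, and encodes it into a request URL. The query shape, the choice of the `rdfs:label` and `dbo:abstract` properties, and the use of the public DBpedia endpoint are assumptions of this sketch; the queries actually used in the study are not reproduced here. No network request is made.

```python
from urllib.parse import urlencode

# Public DBpedia SPARQL endpoint, assumed for this sketch.
DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"


def build_query(resource_uri, lang):
    """Build a SPARQL query for the label and abstract of a resource in one language."""
    return "\n".join([
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>",
        "PREFIX dbo: <http://dbpedia.org/ontology/>",
        "SELECT ?label ?abstract WHERE {",
        "  <%s> rdfs:label ?label ;" % resource_uri,
        "       dbo:abstract ?abstract .",
        '  FILTER (lang(?label) = "%s" && lang(?abstract) = "%s")' % (lang, lang),
        "}",
    ])


query = build_query("http://dbpedia.org/resource/Jurassic", "es")
request_url = DBPEDIA_ENDPOINT + "?" + urlencode({"query": query})
print(query)
```

The `FILTER` on language tags is what ties the query to the language chosen by the user.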

Geospatial data service based on OGC standards

Geoscience datasets are increasingly made available as services on the Web (Laxton et al. 2010). The geologic map service used in this study was part of the services built by the British Geological Survey. The service was established by using the OGC Web Map Service (WMS, http://www.opengeospatial.org/standards/wms) standards, which provide a set of operations for accessing the data service. For example, “GetFeatureInfo” can be used to retrieve attributes of a spatial feature on a map layer. Another operation, “GetStyles”, can retrieve the structured legend information of a map layer in a file format called Styled Layer Descriptor (SLD, http://www.opengeospatial.org/standards/sld). There are many libraries or packages that can be used for browsing WMS map layers. In this study the OpenLayers library was used to build the pilot website.
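
To make the request structure concrete, the following sketch builds a WMS 1.3.0 GetFeatureInfo URL from its standard key-value parameters. The endpoint `https://example.org/wms`, the layer name `bedrock_age`, the bounding box and the pixel position are all hypothetical placeholders, not the values of the British Geological Survey service.

```python
from urllib.parse import urlencode


def get_feature_info_url(endpoint, layer, bbox, width, height, i, j):
    """Build a WMS 1.3.0 GetFeatureInfo request for pixel (i, j) of a map image."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetFeatureInfo",
        "LAYERS": layer,
        "STYLES": "",
        "QUERY_LAYERS": layer,
        "CRS": "CRS:84",  # longitude/latitude axis order
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
        "INFO_FORMAT": "text/plain",
        "I": i,  # pixel column of the clicked point
        "J": j,  # pixel row of the clicked point
    }
    return endpoint + "?" + urlencode(params)


# Hypothetical endpoint, layer name and extent, for illustration only.
url = get_feature_info_url("https://example.org/wms", "bedrock_age",
                           (-8, 49, 2, 61), 500, 600, 250, 300)
print(url)
```

A map client such as OpenLayers typically assembles this kind of request internally when a user clicks on the map.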

The WMS service standards set up a wrapper outside the datasets to make them accessible on the Web. The datasets themselves, as discussed above, have their embedded knowledge, which is reflected in the conceptual structures as well as the terminology used in the data records. Moreover, the WMS standards also provide metadata about the service and the layers. Such structured information can be used to set up connections between the WMS service, the knowledge base and the Web of Data, as well as to develop interactive functions among them for exploring further information.

Data visualization

The role of data visualization can be understood at two levels, the first is to show some information visually and the second is to show it in an efficient way (Ma et al. 2015). Similar to those established functions in Ma et al. (2012), this study aims at using data visualization to create an interactive and friendly user interface to lower the barrier of domain specific datasets. Both experts and non-experts will be able to use the developed functions to see information about a geologic map layer and to retrieve further information about geologic time concepts. In practice, the D3.js library (https://d3js.org) was used to develop a visualized geologic time scale in this study and the JavaScript language was used to develop the interactive functions.

The ontologies and vocabularies in the knowledge base of geologic time scale enabled this study to test a method called exploratory visualization. That is, when the identifier of a WMS map layer is given, a researcher does not know yet what information is contained in the map, but through the knowledge base he can already retrieve information from the layer and visualize some patterns of interest from the information. The exploration can be performed on different aspects of the dataset in several steps. The information retrieved from preceding steps can help the researcher get more familiar with the map layer, the knowledge base, and the functions to operate them. In a later stage the researcher will be able to do further data analysis with all the available resources.

Implementation and results

Fusing the technological components

The approach and technological components were implemented in a pilot website (accessible at: https://goo.gl/JAP8vD). The functions on ontology visualization, concept annotation, map feature filtration and generalization that were previously introduced in Ma et al. (2012) were all realized with new technologies (Table 1). Moreover, several new functions were designed and developed by using the abundant information available on the Web of Data (Fig. 4).

Fig. 4

User interface of a proof-of-concept study for the Linked Geoscience Data. On the right is a map window which allows a user to interact and retrieve information of interest. In the center is information which is retrieved from several resources on the Web of Data, which shows details about a concept in the map. On the left is a visualization of geologic time scale. The website is accessible at: https://goo.gl/JAP8vD. Original geologic map (1:625,000 scale onshore bedrock age map of United Kingdom) reproduced with the permission of British Geological Survey & NERC. All rights reserved

The flexibility of the D3.js-based ontology visualization also enabled the development of several new functions. A guiding question here is: when a user retrieves a concept from a data service, can they retrieve further information about that concept in another language? This study took labels of geologic time concepts in seven different languages from the ontology developed by Cox and Richard (2015) and developed functions to use them in the visualization. A user can choose the language for labels in the visualization by clicking the buttons on the lower left part of the user interface. The ontology visualization will be refreshed with labels in the chosen language (Fig. 5). The visualization also has interactive functions to highlight the label and node caught by the cursor. The user can also click any of the labels to retrieve more information about that geologic time concept from the Web of Data, in the corresponding language. All those functions are made available on the pilot website (https://goo.gl/JAP8vD).

Fig. 5

Multilingual labels in the visualized ontology of geologic time scale. The domain specific terms enable the development of several innovative functions for the connection and interactions between geoscience data services and the broad Web of Data

For the geoscience data service shown in Fig. 4, all the information retrieved from it was in English. A few functions were developed to allow a user to see annotations of the information in other languages. For example, a user first chooses Spanish as the language for the user interface. Then, by clicking an area in the map window, a concept with the English label “Jurassic” is retrieved and shown on the lower right part of the user interface. At the same time, the developed functions find the corresponding Spanish label “Jurásico” by using the ontology and highlight it in the visualization (Fig. 4). The functions also search the data sources on DBpedia, Wikipedia and the vocabulary service at CSIRO for information (in Spanish) about “Jurásico”, and show the results in the center of the user interface (Fig. 4).

Exploratory visualization enabled by “smart” data services

A few interactive functions were developed between those technological components to perform exploratory data visualization, which leveraged the characteristics of each component and made the output website more functional than the sum of its parts. Figure 6 shows two of those functions: one uses the SLD information retrieved from a WMS layer to filter the nodes in the visualization, and thus shows a map legend for that map layer; the other uses the legend as a dashboard to retrieve spatial information from the map layer by clicking nodes of geologic time terms in the legend. When a node/term is clicked, the website also retrieves information about that term from the Web of Data and shows it on the user interface.
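
The first of these functions, filtering the visualized hierarchy by the terms found in a layer's SLD, can be sketched as a small tree traversal. The hierarchy excerpt, the function name, and the decision to keep ancestor nodes of every matched term are assumptions of this sketch; the pilot website implements the filtering in JavaScript with D3.js, and its exact rules are not reproduced here.

```python
# Small hypothetical excerpt of the geologic time hierarchy: parent -> children.
HIERARCHY = {
    "Phanerozoic": ["Mesozoic", "Paleozoic"],
    "Mesozoic": ["Cretaceous", "Jurassic", "Triassic"],
    "Paleozoic": ["Permian", "Carboniferous"],
}


def legend_nodes(root, used_terms):
    """Return the terms to keep: those used in the layer plus their ancestors."""
    kept = []

    def visit(node):
        # Visit every child first so that all matching descendants are collected.
        child_hits = [visit(child) for child in HIERARCHY.get(node, [])]
        keep = node in used_terms or any(child_hits)
        if keep:
            kept.append(node)
        return keep

    visit(root)
    return kept


# Terms as they might be collected from an SLD document (assumed values).
legend = legend_nodes("Phanerozoic", {"Jurassic", "Permian"})
print(legend)
```

Keeping the ancestors preserves the tree shape of the legend, so a reader can still see where each used term sits in the time scale.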

Fig. 6

Interactions between a D3.js visualization of the geologic time scale and a WMS map layer. Original geologic map (1:625,000 scale onshore bedrock age map of United Kingdom) reproduced with the permission of British Geological Survey & NERC. All rights reserved

Using the process demonstrated in Fig. 6, more case studies were conducted by using the open information about WMS geologic map layers of several European countries on the OneGeology-Europe project website (http://www.onegeology-europe.org). Part of the results is shown in Fig. 7. Through the generated map legends one can obtain a quick overview of the patterns of geologic time content in each map layer.

Fig. 7

Using the visualized geologic time scale ontology to generate map legends for WMS surface rock age map layers of a few countries in Europe. Original geologic map reproduced with the permission of OneGeology-Europe. All rights reserved

A function that has not been developed but could be of interest here is to create the map legends with labels in different languages. The technical procedure can be: (1) Choose a language (e.g., Japanese) for the geologic time visualization; (2) Retrieve the map legend SLD information (e.g., in English) from a WMS layer and collect a list of geologic time terms from it; (3) Use the knowledge base of geologic time scale to find the corresponding geologic time terms in Japanese for the list generated in (2); and (4) Use the list of Japanese terms from (3) to filter the geologic time visualization and generate a map legend in Japanese. After that, a user can click nodes in the legend to retrieve more information in Japanese from the Web of Data. Functions can also be developed to support the user in retrieving spatial information from the map layer by using the map legend, where the geologic time term needs to be translated from Japanese to English before a request is sent to the WMS map layer. The knowledge base of geologic time scale is capable of supporting the development of such functions.
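
Steps (3) and (4) of this procedure, together with the reverse translation needed before querying the WMS layer, can be sketched as follows. The label table is a hypothetical two-entry extract, and the function names are illustrative; in practice the labels would come from the multilingual vocabulary in the knowledge base.

```python
# Hypothetical extract of multilingual labels, keyed by the English term.
LABELS = {
    "Jurassic": {"ja": "ジュラ紀"},
    "Cretaceous": {"ja": "白亜紀"},
}


def translate_terms(terms_en, lang):
    """Step (3): map English terms from the SLD to labels in the chosen language."""
    return {t: LABELS[t][lang] for t in terms_en if lang in LABELS.get(t, {})}


def to_english(label, lang):
    """Reverse lookup, needed before a request is sent to the WMS map layer."""
    for term_en, by_lang in LABELS.items():
        if by_lang.get(lang) == label:
            return term_en
    return None


legend_ja = translate_terms(["Jurassic", "Devonian"], "ja")
print(legend_ja, to_english("白亜紀", "ja"))
```

Terms without a label in the chosen language are silently dropped here; a real interface might instead show the English term as a fallback.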

The retrieval of spatial information from a WMS map layer was technically realized by building an SLD file and applying it to the map layer on the server. Using the conceptual structure of geologic time scale in the construction of the SLD file can lead to a few innovative outputs. For Fig. 6b, the SLD file sent to the layer contained only one term, “Jurassic”. In another test (Fig. 8), all the geologic time terms used in a map layer were used in an SLD file, but were described with a simpler color spectrum in a gray scale. The geologic time scale is a hierarchical structure and each geologic time concept has a unit (or level) in the structure. In the gray color spectrum, lighter colors were assigned to terms of higher-level concepts and darker colors to those of lower-level concepts. The result shows interesting patterns in the conceptual levels of rock attributes of different areas on that map layer. These may be due to the abundance of fossil records or to the mapping procedure, and they could be of interest for further study.
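
The gray scale assignment can be sketched as a simple mapping from conceptual level to color. The list of ranks, the numeric gray values and the example terms are assumptions of this sketch; the actual colors written into the SLD file in the study are not given in the text.

```python
# Ranks of the geologic time scale from highest to lowest conceptual level.
RANKS = ["Eon", "Era", "Period", "Epoch", "Age"]


def gray_for(rank, lightest=230, step=45):
    """Return a hex gray: lighter for higher-level ranks, darker for lower ones."""
    value = lightest - step * RANKS.index(rank)
    return "#{0:02x}{0:02x}{0:02x}".format(value)


# Hypothetical terms of a map layer with their ranks.
term_ranks = {"Mesozoic": "Era", "Jurassic": "Period"}
colors = {term: gray_for(rank) for term, rank in term_ranks.items()}
print(colors)
```

Each term and color pair could then be written as one rule in the SLD file that is applied to the map layer.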

Fig. 8

Using a gray scale to show conceptual levels of rock age attributes annotated in different areas on a WMS map layer. Original geologic map (1:625,000 scale onshore bedrock age map of United Kingdom) reproduced with the permission of British Geological Survey & NERC. All rights reserved

Discussion

Through the innovative use of a domain specific knowledge base of geologic time scale, this research developed visualization and interactive functions to engage various resources on the Web of Data and successfully leveraged the functionality of existing geoscience data services. Though the work is an empirical study, it covers various topics of data standards, data resources and technological components, as well as the broad background of the Web of Data. Experience from this study leads to the discussion of several topics.

The shortage of domain specific knowledge bases (i.e. those comprising data standards, schemas, ontologies and vocabularies) limits the functionality of geoscience data services. Conventionally, neither OGC nor W3C spends major efforts on domain specific standards (cf. McKee 2016). The recent efforts on geoscience data schemas, ontologies and vocabularies are often restricted to their corresponding disciplines, and the visibility of the outputs is lower compared with the standards developed and released by OGC and W3C. The interactions between a visualized geologic time ontology and geologic map services show the advantage of such knowledge bases for generating and interpreting meaningful information from datasets. Studies in other domains also demonstrate the usefulness of domain specific knowledge bases. Recent progress in the field of oceanography proved to be a big success in applying controlled vocabularies to the construction of the Linked Ocean Data (Leadbetter et al. 2013; Leadbetter 2015). Those controlled vocabularies were previously published as books. By transforming them into Web compatible forms through Semantic Web technologies, they were applied to add structured descriptions to oceanographic datasets on the Web. The use of controlled vocabularies also enabled connections among various resources and entities in oceanographic research, as well as in the general geosciences (You 2015; Krisnadhi et al. 2015). For example, for the same concept, the vocabularies and ontologies will enable machines to find it as a topic in a publication or dataset, as a research interest of a scientist, as a keyword of a research mission, or as the capability of an instrument. Moreover, connections can be made among them through concept or term mapping, and innovative applications can be developed by using those connections. For example, broader federated queries can be developed through established concept mapping to explore more resources in a distributed environment.

The need for well-curated domain specific knowledge bases is also reflected in a recently released W3C Recommendation, the Data on the Web Best Practices (Greiner et al. 2017). In that recommendation, the best practices are clustered on a list of topics: metadata, data licenses, data provenance, data quality, data versioning, data identifiers, data formats, data vocabularies, data access, data preservation, feedback, data enrichment and republication. The benefits of those best practices for data on the Web are also represented in a list: comprehension, processability, discoverability, reuse, trust, linkability, access and interoperability. The recommendation document then uses a matrix to show the benefits of each best practice. For data vocabularies, especially standardized ones, the benefits include: comprehension, processability, reuse, trust and interoperability. The International Council for Science – Committee on Data for Science and Technology (CODATA) recently formed a task group Coordinating Data Standards amongst Scientific Unions (CODATA 2016). A key task of that group is to improve the visibility of standards that have been and/or are being developed and/or endorsed by different scientific disciplines.

The importance of persistent Uniform Resource Identifiers (URIs) is also shown in several parts of this study. The Web provides a wide and open space for improving the discoverability, accessibility, understandability and usability of data, including those in geoscience. URIs make resources on the Web accessible and linkable. To integrate datasets and services from multiple sources and set up stable applications on the Web, one needs to work with persistent URIs (Berners-Lee 2006). In this study, the persistent URIs of geologic time concepts at the SISSVoc at CSIRO, DBpedia and Wikipedia pave the way for interaction with them. The URIs of those data resources follow a stable syntax. For a geologic time concept retrieved from a geologic map, the developed functions can easily generate URIs following the syntax of those data resources and set up links to them to show more information (Fig. 4). In a broad perspective, the Web of Data covers both linked open data and resources in other formats and methods. Without persistent and stable URIs it is hard to link content to the Web of Data. This rule applies to both geospatial data and non-spatial data (Janowicz et al. 2013). In the W3C Recommendation of Data on the Web Best Practices (Greiner et al. 2017), there are best practices of using persistent URIs as identifiers both for datasets and for content within datasets. The benefits are summarized as reuse, linkability, discoverability and interoperability in the recommendation document. A few other benefits, such as traceability, reproducibility and provenance, can also be added when considering the role of URIs in open science. Similarly, the OGC-W3C Working Note – Spatial Data on the Web Best Practices (Tandy et al. 2017) also lists using globally unique persistent URIs for spatial things as a best practice.
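
As a minimal sketch of generating such links from a label, the functions below follow the publicly visible URI patterns of DBpedia resources and of language-specific Wikipedia articles. Treating the label as the final path segment is an assumption that holds for simple terms such as “Jurassic” but not necessarily for every concept; the URI syntax of the SISSVoc service is not given in the text and is therefore left out.

```python
from urllib.parse import quote


def dbpedia_uri(label_en):
    """DBpedia resource URI, assuming the English label matches the resource name."""
    return "http://dbpedia.org/resource/" + quote(label_en.replace(" ", "_"))


def wikipedia_url(label, lang):
    """Wikipedia article URL in the given language edition, percent-encoded."""
    return "https://%s.wikipedia.org/wiki/%s" % (lang, quote(label.replace(" ", "_")))


print(dbpedia_uri("Jurassic"))
print(wikipedia_url("Jurásico", "es"))
```

Because the links are derived from the pattern rather than looked up, they only remain valid as long as the providers keep their URIs persistent, which is the point made above.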

Further innovative data analysis and visualization could be achieved if more of the content of datasets were made open and available through geoscience data services. WMS transfers a map layer as an image to the user side through a web browser. Though interactive data analysis functions were realized in this work by using the SLD, the tasks that could be achieved were limited due to the raster data format. As user agents become more powerful, more vector data can be provided to the user side for data analysis and visualization. In geoscience data services, the OGC Web Feature Service (WFS) standard has already been implemented by a few organizations to publish geologic maps, which exposes more analyzable content to end users. There are also other approaches, such as the use of GeoJSON (http://geojson.org) and GeoJSON-LD (http://geojson.org/geojson-ld/), to promote the openness, structure and inter-connections of geoscience data on the Web. Recently, another geospatial data format, CoverageJSON (https://covjson.org), was proposed through work in the MELODIES project (Blower and Riechert 2016; Riechert et al. 2016). It can be used for encoding coverage datasets such as grids, time series and vertical profiles. Since the fundamental structure is JSON, the data structure and content are open, which enables more opportunities for data analysis and knowledge discovery. The best practices document (Tandy et al. 2017) released by the OGC-W3C Spatial Data on the Web Working Group also shows that a significant change among stakeholders of spatial data services in recent years is their increasing awareness of the Linked Open Data approach and their actions to make the content of data open.
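To illustrate why vector formats open more room for analysis on the user side than a rendered map image, the sketch below builds a small GeoJSON feature collection and summarizes it by geologic age. The map units, the coordinates and the property names `unit` and `age` are invented for illustration and do not come from the data services used in this study.

```python
import json
from collections import Counter


def make_feature(unit, age, ring):
    """Wrap one map unit as a GeoJSON Feature with a Polygon geometry."""
    return {
        "type": "Feature",
        "geometry": {"type": "Polygon", "coordinates": [ring]},
        # Hypothetical attribute names (assumption, not a fixed schema).
        "properties": {"unit": unit, "age": age},
    }


# Three illustrative unit squares; each ring is closed and counterclockwise.
feature_collection = {
    "type": "FeatureCollection",
    "features": [
        make_feature("A", "Jurassic", [[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]),
        make_feature("B", "Cretaceous", [[1, 0], [2, 0], [2, 1], [1, 1], [1, 0]]),
        make_feature("C", "Jurassic", [[2, 0], [3, 0], [3, 1], [2, 1], [2, 0]]),
    ],
}


def count_by_age(collection):
    """Count features per geologic age, which a raster image cannot support."""
    return Counter(f["properties"]["age"] for f in collection["features"])


def filter_by_age(collection, age):
    """Return a new FeatureCollection holding only features of the given age."""
    selected = [f for f in collection["features"] if f["properties"].get("age") == age]
    return {"type": "FeatureCollection", "features": selected}


if __name__ == "__main__":
    print(count_by_age(feature_collection))
    print(json.dumps(filter_by_age(feature_collection, "Jurassic"), indent=2))
```

With a WMS image, the same summary would require the server to precompute it; with the attributes delivered to the client, it is a few lines of code.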

In the open space of the Web of Data, there are many research topics of interest in the development of Linked Geoscience Data. Although the Linked Open Data approach shows its advantages for publishing data on the Web, it is not necessary to triplize all geoscience data (e.g., remote sensing images). The approach and technologies presented in this study focus more on the representation and annotation of domain specific concepts (i.e., geologic time) and their connections to attributes of spatial features and corresponding resources in the Web of Data. For many sub-disciplines in geoscience, such domain specific knowledge bases do not exist. The General Feature Model (GFM) (ISO 2015) can be used in the development of standards, schemas and ontologies for those disciplinary topics. Such knowledge bases have the potential to lead online geospatial data analysis to a finer scale. For example, a few recent studies have already begun to fuse spatial features in spatial data infrastructures using both W3C and OGC standards (Wiemann and Bernard 2016; Wiemann 2017). Given the various subjects and large volume of geoscience data and the joint efforts between OGC and W3C, there could be various methods and technologies to add semantics into datasets and data services (cf. Bernstein et al. 2016).

A few directions for future work can be proposed from this study. The first is entity recognition and mapping. In the work presented in this paper, the connections from concepts in a WMS map layer to those in the geologic time ontology and the broad Web of Data were realized by label matching. In practice there could be synonyms for the same concept, and label matching alone will not be enough to address the needs. To make meaningful links among entities, advanced techniques such as natural language processing and similarity computation of entities (Zheng et al. 2015) can be applied to extend the current work. Second, more efforts can be carried out on the spatial data. In this study only the OGC WMS standard was used, which limits the space for exploring the Linked Data approach for spatial data. The OGC Web Feature Service (WFS, http://www.opengeospatial.org/standards/wfs) standard has also been applied in the geoscience community for constructing data services. A topic of interest is to fuse WFS with GeoSPARQL (http://www.opengeospatial.org/standards/geosparql), geoscience ontologies and data visualization technologies with real-world examples. This work will be consistent with the topics in the OGC-W3C joint working group on spatial data on the Web. The extension to spatial data will also create a broader space for the third direction, which is to further explore ways of geoscience data analysis on the Web. In this study some solid progress, such as map generalization and map legend creation, has been achieved by using the power of reasoning and inference enabled by Semantic Web technologies. The extension to WFS and GeoSPARQL will create more opportunities for semantics-enriched spatial data analysis.
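The limitation of exact label matching noted above, and one simple way to relax it, can be sketched as follows. This is an illustration only, not the method of this study nor that of Zheng et al. (2015): the small vocabulary, the function names and the similarity cutoff are assumptions, and string similarity from the Python standard library stands in for the more advanced entity similarity measures mentioned in the text.

```python
from difflib import get_close_matches

# Hypothetical vocabulary: preferred label mapped to alternative labels.
VOCABULARY = {
    "Lower Jurassic": ["Early Jurassic", "Lias"],
    "Upper Cretaceous": ["Late Cretaceous"],
    "Cambrian": [],
}


def build_index(vocabulary):
    """Map every known label, lower-cased, to its preferred label."""
    index = {}
    for preferred, alternatives in vocabulary.items():
        for label in [preferred, *alternatives]:
            index[label.casefold()] = preferred
    return index


def match_concept(label, index, cutoff=0.85):
    """Resolve a map label to a preferred concept label, or None."""
    key = " ".join(label.split()).casefold()
    # Step 1: exact match on preferred or alternative labels (synonyms).
    if key in index:
        return index[key]
    # Step 2: fall back to string similarity to tolerate spelling variants.
    close = get_close_matches(key, list(index), n=1, cutoff=cutoff)
    return index[close[0]] if close else None


if __name__ == "__main__":
    index = build_index(VOCABULARY)
    for raw in ["early jurassic", "Upper Cretacous", "Quaternary"]:
        print(raw, "->", match_concept(raw, index))
```

A synonym table handles known alternative names, while the similarity fallback only catches near-identical spellings; labels that differ in wording but not in meaning would still need the semantic similarity methods discussed above.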

Conclusions

The standards built by the Open Geospatial Consortium have been widely used by the geoscience community to build data services. In recent years, the geoscience community has also begun to see the value of the Linked Open Data approach enabled by recommendations of the World Wide Web Consortium and has increasingly used the approach in data services. This paper presents a study focusing on the topic of geoscience time scale, which uses a knowledge base of geologic time ontology and vocabulary, together with data visualization techniques, to leverage the functionality of geoscience data services in the environment of the Web of Data. Several functions were developed through the fused technologies, such as map legend creation, map generalization, pattern recognition and multilingual information exploration. This study is a practical step towards the broad perspective of Linked Geoscience Data. The results demonstrate the value and potential of Semantic Web technologies for data services and analysis in the geoscience domain.