1 Introduction

Currently, ontologies are the main means of formalization and systematization of knowledge in various subject areas including scientific subject domains (SSDs). (Note that by “scientific subject domain” we mean a subject area that encompasses a branch of science or field of scientific knowledge in all its aspects).

Developing an ontology is a complicated and time-consuming process. To simplify and facilitate it, various methods of and approaches to ontology development [1,2,3,4] have been proposed. Recently, an approach based on ontology design patterns (ODPs) [5,6,7] has gained popularity. According to this approach, ODPs are documented descriptions of proven solutions to typical problems of ontological modeling. Although the use of ODPs greatly simplifies the process of building ontologies and improves their quality, ontology design patterns have not yet found wide practical application because of a number of problems arising from their use.

One of the widespread problems of pattern reuse is their complexity: it is often difficult for the developer of a new ontology to understand the semantics the authors have laid down in the pattern. Another common problem is that the patterns are described and used separately and do not constitute a single system. In the development of ontologies of SSDs, there is yet another important problem, which is the absence of patterns designed to present scientific knowledge.

This paper presents an approach to the construction of ontologies of scientific subject domains based on ODPs. The approach complements and develops the ontology development methodology proposed by the authors and used in the development of intelligent scientific internet resources [8]. The ODPs used in this approach emerged from solving the problems of ontological modeling that the authors encountered while developing ontologies for various scientific subject domains [9, 10].

This paper is organized as follows. The second section contains a short review of the ontology design patterns; the third section analyzes the problems of their use. The proposed approach to the development of ontologies of scientific subject domains is described in detail in the fourth section. The main advantages and practical benefits of this approach, as well as plans for the near future, are discussed in the Conclusion.

2 A Short Review of Ontology Design Patterns

The progenitors of ontology design patterns are the design patterns widely used in software development [11]. Like these design patterns, ODPs are employed to describe solutions to typical problems arising in the development of ontologies [7].

Depending on the problems that ODPs are created to solve, we distinguish between structural patterns, correspondence patterns, content patterns, reasoning patterns, presentation patterns and lexico-syntactic patterns. (Note that this typology of patterns was proposed in the framework of the NeOn project [12]).

Of all the types of patterns listed above, only structural, content and presentation patterns are used in the development of ontologies.

The structural patterns either fix the ways to solve problems caused by the limitations of the expressive capabilities of ontology description languages or specify the general (modular) structure of an ontology. Patterns of the first type are called logical patterns, and patterns of the second type are called architectural patterns.

The content patterns define the ways of representing typical ontology fragments, on the basis of which ontologies of a whole class of subject domains can be built.

The presentation patterns are, in effect, rules (recommendations) for naming and annotating ontology elements. The application of these rules should increase the readability of the ontology, as well as the convenience and ease of its use.

Currently, several catalogs of ODPs have been created and are being developed further [13,14,15]. The most complete of them is posted on the ODPA (Association for Ontology Design & Patterns) portal [13], created as part of the NeOn project [12].

ODPs are most often described in the format proposed on the ODPA association portal [13]. According to it, the description of the pattern includes information about its author and scope, its graphical representation, text description, a set of scenarios and examples of usages, and links to other patterns. Content patterns can also be supplied with a set of competency questions [6, 7], which can be used both in the development of patterns and in the search for the desired patterns in the development of a specific ontology.

3 Problems of Using Ontology Design Patterns

The first problem of pattern reuse is due to their complexity: it is often difficult for the developer of a new ontology to understand the semantics that the authors have laid down in the pattern. Recently there has been a tendency to simplify patterns [16]. Even so-called meta-patterns, describing very simple entities, were suggested [17]. However, such simple patterns cannot significantly facilitate the construction of SSD ontologies.

Another problem is caused by the lack of convenient ontology development tools supporting the use of ODPs. Here we can note the plugins for the ontology development tool of the NeOn project [18] and for the ontology editor WebProtégé [19]. However, the first plugin is available only to the participants of the NeOn project, and the second can be used only in the WebProtégé editor, which is not yet popular enough among ontology developers because of its limited functionality (in comparison with the desktop version).

The third problem is that the patterns are described and applied separately and do not constitute a single system. A related problem is the lack of systematized sets of patterns targeted at subject matter experts. Existing catalogs of ontology design patterns do not meet this requirement.

In our opinion, the OTTR library (Reasonable Ontology Templates) [20] is the closest to solving the latter problem. This library provides a language for the representation of ontology design patterns and software supporting it. The OTTR library supplies ontology developers with patterns in the form of high-level OWL macros [21], which makes it possible for subject matter experts to use them.

As for the availability of patterns that can be used in the development of SSD ontologies, the catalogs mentioned above do not even partially cover the needs of building ontologies of scientific fields since they do not contain patterns designed to represent scientific knowledge.

4 Approach to the Development of Ontologies of Scientific Subject Domains

This section describes an approach to solving the problem of reusing ODPs in the development of ontologies of scientific subject domains. This approach offers a system of heterogeneous ODPs and methods for their sharing (joint use) in building an SSD ontology. At the moment, there are three types of patterns in the system: structural logical patterns, content patterns and presentation patterns. Some of these patterns are universal, while the others are focused on the presentation of scientific knowledge.

An important feature of this approach is the use of base (core) ontologies, which include only the most general entities that are not dependent on a particular SSD. These ontologies were previously developed for the technology for building subject-based intelligent scientific internet resources [8] and are now represented by content patterns, which were developed for all main entities of the base ontologies. In this regard, the construction of an SSD ontology using the base ontologies is reduced to their specialization and expansion. In particular, the content patterns presented in the base ontologies are tuned (specialized) to a specific SSD. The population of an SSD ontology with actual data is performed by instantiation of content patterns. This process is supported by a data editor developed within the framework of this approach.

4.1 An SSD Ontology and Base Ontologies

Usually the ontology of any SSD contains not only descriptions of its inherent system of concepts and methods for processing and analyzing information, but also descriptions of relevant information resources. In this regard, an SSD ontology can be represented as a system of interrelated ontologies responsible for representing the above three components of knowledge, namely, the ontology of the knowledge domain, the ontology of tasks and methods, and the ontology of scientific Internet resources.

The ontology of the knowledge domain defines the system of concepts and relations intended for a detailed description of a modeled SSD and its scientific and research activities. The ontology of tasks and methods describes the tasks solved in a given SSD and the methods for their solution. The ontology of scientific Internet resources is used to describe the information resources available on the Internet relevant to this SSD.

Since the development of an ontology of an SSD from scratch is not an easy task, we have proposed a method for its construction based on a small but representative set of base ontologies that include only the most general entities not dependent on a particular SSD. This set includes: (1) the ontology of scientific knowledge, (2) the ontology of scientific activity, (3) the base ontology of tasks and methods, (4) the base ontology of information resources.

All base ontologies have specifications in the OWL language [21].

The ontology of scientific knowledge contains classes that define structures for describing concepts included in any SSD. Such concepts are Division of science, Object of research, Subject of research, Method of research, Scientific result, etc.

The ontology of scientific activity includes classes of concepts related to the organization of research activities, such as Person, Organization, Event, Activity (Scientific activity), Project, Publication, etc.

The base ontology of information resources includes the class Information resource as the main class. The set of properties (attributes and relationships) of this class is based on the Dublin core standard [22].

The concepts and relations of the base ontology of tasks and methods are used to describe the tasks to be solved in a given SSD, the methods for their solution, and the software components and algorithms implementing them.

4.2 A System of Ontology Design Patterns

To support the considered approach, a set of ODPs [23] was developed and implemented in the OWL language. This set includes various types of patterns: structural logical patterns, content patterns and presentation patterns. All these patterns are combined into a single system.

Note that in this approach the presentation patterns define the rules for naming and annotating elements of ontology, which are close to the generally accepted ones [24].

The need for structural logical patterns is due to the absence in OWL of expressive means for representing complex entities and structures required for building SSD ontologies, in particular, ranges of admissible values, n-ary relations and attributed relations (binary relations with attributes).

The pattern of representation of the range of admissible values is intended to specify such structures that are called domains in the relational data model and are characterized by a name and a set of elementary values. Domains are convenient to use for describing possible values of class properties when the entire set of such values is known in advance. In this pattern, the domain is defined by an enumerated class, which is the successor of the specially introduced service class called the Domain class and consists of a finite set of different individuals (objects) determining the possible values of a certain property (see Fig. 1).

Fig. 1. Structural pattern of representation of the range of admissible values.

Examples of such domains are “Geographic type”, “Position”, “Type of organization”, “Type of publication”, which include, respectively, types of localities, types of positions in organization, types of organizations and publications.
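The idea of this pattern can be illustrated outside OWL by a minimal Python sketch. It is not the authors' OWL specification: the class name `Domain`, the method `check` and the three listed values are assumptions introduced only for illustration (the paper names the domain "Type of publication" but does not list its individuals).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    """A named, closed set of admissible values (an enumerated class in OWL terms)."""
    name: str
    values: frozenset

    def check(self, value: str) -> str:
        # Reject anything outside the enumerated set of individuals.
        if value not in self.values:
            raise ValueError(f"{value!r} is not in domain {self.name!r}")
        return value


# Hypothetical individuals, chosen only to make the sketch runnable.
TYPE_OF_PUBLICATION = Domain(
    "Type of publication",
    frozenset({"article", "monograph", "thesis"}),
)

if __name__ == "__main__":
    print(TYPE_OF_PUBLICATION.check("article"))
```

Because the set of values is closed, a property whose range is such a domain can be validated, or offered to the user as a fixed choice list, without consulting any other part of the ontology.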

Note that in the figures of the patterns presented in the paper, classes are shown as ellipses, and individuals and attributes as rectangles. An ObjectProperty type connection (a relation) is shown by a solid straight line, and a DataProperty type connection (an attribute), by a dashed line. Classes, attributes and individuals that must necessarily be present in the pattern are drawn with a thick outline.

To represent an attributed relation, a structural pattern is proposed. It is shown on the left side of Fig. 2.

Fig. 2. Structural pattern of the binary attributed relation and an example of its specialization.

The central place in this pattern is occupied by the service class Attributed relation with which the base classes of an ontology modeling the arguments of the binary relation are connected by the links isArgument1 and hasArgument2. At the same time, the attributes of a binary relation are modeled by the properties of this class (in OWL notation, either DataProperty or ObjectProperty) hasAttribute and hasAttributeFromDomain. For this pattern, it is required to set constraints on the obligatoriness and uniqueness of the arguments of the attributed relation (Class 1 and Class 2).

To represent a specific type of the attributed relation, a new class, which is its successor, can be defined.

The right side of Fig. 2 shows an example of a structural pattern for describing a person’s participation in scientific activities (the attributed relation participateIn). Here, the Person class serves as the first argument, the Activity class is the second argument. The pattern also allows us to specify the start and end dates of the person’s participation in an activity, as well as his/her role in it.
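The structure of this pattern and its specialization can be approximated by the following Python sketch. It is an illustration under stated assumptions, not the OWL pattern itself: the field names `argument1`, `argument2`, `start_date`, `end_date` and `role`, as well as the contents of `ROLES`, are hypothetical. Obligatoriness of the arguments is modeled by an explicit check, and uniqueness by the fact that each field holds exactly one value.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Person:
    name: str


@dataclass(frozen=True)
class Activity:
    title: str


@dataclass(frozen=True)
class AttributedRelation:
    """Service class: a binary relation reified so that it can carry attributes."""
    argument1: object  # obligatory, single-valued
    argument2: object  # obligatory, single-valued

    def __post_init__(self):
        # Obligatoriness constraint: both arguments must be present.
        if self.argument1 is None or self.argument2 is None:
            raise ValueError("both arguments of an attributed relation are required")


# Hypothetical domain of roles; the paper does not list the actual values.
ROLES = frozenset({"leader", "participant"})


@dataclass(frozen=True)
class ParticipateIn(AttributedRelation):
    """Specialization: a person's participation in a scientific activity."""
    start_date: Optional[date] = None
    end_date: Optional[date] = None
    role: Optional[str] = None

    def __post_init__(self):
        super().__post_init__()
        # The specialization fixes the classes of the two arguments.
        if not isinstance(self.argument1, Person):
            raise TypeError("argument1 must be a Person")
        if not isinstance(self.argument2, Activity):
            raise TypeError("argument2 must be an Activity")
        # The role attribute takes its value from a domain.
        if self.role is not None and self.role not in ROLES:
            raise ValueError(f"unknown role: {self.role!r}")


if __name__ == "__main__":
    link = ParticipateIn(Person("A. Researcher"), Activity("Field survey"),
                         start_date=date(2019, 1, 1), role="leader")
    print(link)
```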

Similarly, we can build a pattern for an n-ary relation. Note that for this pattern we must also specify the order of the arguments.
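One possible way to keep the order of arguments explicit is sketched below in Python. The class `NaryRelation`, its 1-based `argument` accessor and the example relation name are assumptions made for illustration; the paper does not specify how the order is encoded in OWL.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class NaryRelation:
    """An n-ary relation whose arguments are stored in a fixed order."""
    name: str
    arguments: Tuple[object, ...]  # the position in the tuple is the argument's order

    def __post_init__(self):
        if len(self.arguments) < 2:
            raise ValueError("a relation needs at least two arguments")

    def argument(self, position: int) -> object:
        """Return the argument at a 1-based position."""
        if not 1 <= position <= len(self.arguments):
            raise IndexError(f"no argument at position {position}")
        return self.arguments[position - 1]


if __name__ == "__main__":
    # Hypothetical ternary relation linking a task, a method and a data set.
    rel = NaryRelation("appliesMethodToData", ("Task A", "Method X", "Data set D"))
    print(rel.argument(2))
```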

For a uniform and consistent presentation of the concepts used in SSD and their properties, content patterns were constructed for the main concepts of base ontologies using the structural patterns proposed. Due to this, the development of an ontology of a specific SSD mainly consists in the specialization of content patterns and the construction of fragments of a target ontology based on them.

As an example, we give a pattern intended for the description of applied tasks solved within the framework of a scientific subject domain (see Fig. 3).

Fig. 3. Pattern for describing the applied task.

The following set of competency questions represents the content of this pattern:

What methods solve the applied task?

What data is used for solving the applied task?

What is the result of solving the applied task?

Who formulates the task?

etc.
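To show how such competency questions relate to the content of a pattern instance, the following Python sketch answers them over a small set of triples. All relation names (`isSolvedBy`, `usesData`, `hasResult`, `isFormulatedBy`) and all individuals are hypothetical placeholders; the actual names are those defined in the pattern in Fig. 3.

```python
# (subject, relation, object) triples describing one applied task.
# Every name here is a placeholder introduced for this sketch.
TRIPLES = [
    ("Task A", "isSolvedBy", "Method X"),
    ("Task A", "isSolvedBy", "Method Y"),
    ("Task A", "usesData", "Data set D"),
    ("Task A", "hasResult", "Result R"),
    ("Task A", "isFormulatedBy", "Person P"),
]


def ask(subject: str, relation: str) -> list:
    """Return, in sorted order, all objects linked to the subject by the relation."""
    return sorted(o for s, r, o in TRIPLES if s == subject and r == relation)


# Each competency question maps to one relation of the pattern.
QUESTIONS = {
    "What methods solve the applied task?": "isSolvedBy",
    "What data is used for solving the applied task?": "usesData",
    "What is the result of solving the applied task?": "hasResult",
    "Who formulates the task?": "isFormulatedBy",
}

if __name__ == "__main__":
    for question, relation in QUESTIONS.items():
        print(question, ask("Task A", relation))
```

A pattern instance that cannot answer one of its competency questions is missing a value for the corresponding relation, which gives a simple completeness check.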

It should be noted that the content patterns included in the proposed set are interrelated through common concepts and relationships and thus form a single network of patterns. For example, the content patterns presented in Fig. 4, which describe the concepts of Activity and Person, are interconnected not only by the attributed relation participateIn, but also through the concepts of Scientific result, Method of research, Publication, and Organization.

Fig. 4. Fragment of a network of patterns.

Note that in Fig. 4 the attributed relations participateIn and workIn are shown by dotted lines.

4.3 Methods of Building Ontologies of SSDs

Building an SSD ontology involves two main steps:

  1. Construction of the components of SSD ontology using the base ontologies through their specialization and expansion.

  2. Population of SSD ontology with actual data by instantiation of content patterns presented in base ontologies and specialized at step 1.

Note that in this approach the ontology of the knowledge domain is built on the basis of the ontologies of scientific knowledge and scientific activity, the ontology of tasks and methods is built on the basis of the base ontology of tasks and methods, and the ontology of scientific Internet resources is built on the basis of the base ontology of information resources.
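The two steps can be pictured with the following Python sketch, in which a content pattern is reduced to a class name plus a mapping from property names to their ranges. This is a simplification under stated assumptions: the `ContentPattern` class, its `specialize` and `instantiate` methods, and all property and class names in the usage part are hypothetical and do not reproduce the authors' OWL patterns.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ContentPattern:
    class_name: str
    properties: Dict[str, str]  # property name -> name of its range

    def specialize(self, class_name: str,
                   extra: Optional[Dict[str, str]] = None) -> "ContentPattern":
        """Step 1: derive a pattern for a specific SSD.

        Entries in `extra` add new properties or narrow the range of inherited ones.
        """
        merged = dict(self.properties)
        merged.update(extra or {})
        return ContentPattern(class_name, merged)

    def instantiate(self, values: Dict[str, object]) -> dict:
        """Step 2: create an individual; only properties of the pattern are allowed."""
        unknown = set(values) - set(self.properties)
        if unknown:
            raise KeyError(f"properties not in pattern: {sorted(unknown)}")
        return {"class": self.class_name, "values": dict(values)}


if __name__ == "__main__":
    # Hypothetical base pattern and its specialization for one SSD.
    base = ContentPattern("Method of research", {"name": "string", "solves": "Task"})
    special = base.specialize("Seismic method",
                              {"solves": "Seismic task", "frequencyRange": "string"})
    print(special.instantiate({"name": "Method X", "frequencyRange": "low"}))
```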

The use of content patterns is supported by a special editor, which allows specialists in the subject area to populate the ontology with actual data, i.e. objects of classes and their properties. When populating an ontology with the help of the editor, the user selects the required class from the class hierarchy presented to them, and the editor uses the class name to find the corresponding pattern. The editor then uses the information from the pattern to build a form containing fields for all the properties of an object of this class. The editor can also interpret the attributed relations described by the patterns. Thanks to this, the user can work with the properties of the created object that are set by such relations in the same way as with "ordinary" object properties. The only difference is that the values of the attributes have to be specified in a separate window.
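The behavior described above (look up the pattern by class name, then derive the form fields from it) can be sketched in Python as follows. This is not the editor's actual implementation: `PropertySpec`, `PATTERNS`, `build_form` and the sample pattern for Person are assumptions introduced to make the workflow concrete.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PropertySpec:
    name: str
    kind: str  # "attribute", "relation" or "attributed_relation"
    attributes: List[str] = field(default_factory=list)  # used by attributed relations only


# Hypothetical pattern registry keyed by class name.
PATTERNS: Dict[str, List[PropertySpec]] = {
    "Person": [
        PropertySpec("name", "attribute"),
        PropertySpec("authorOf", "relation"),
        PropertySpec("participateIn", "attributed_relation",
                     ["start date", "end date", "role"]),
    ],
}


def build_form(class_name: str) -> List[dict]:
    """Build one form field per property of the pattern found by class name."""
    form = []
    for prop in PATTERNS[class_name]:
        form.append({
            "field": prop.name,
            # Attributed relations look like ordinary relations in the form;
            # their attributes are collected separately (a separate window).
            "separate_window": list(prop.attributes)
            if prop.kind == "attributed_relation" else [],
        })
    return form


if __name__ == "__main__":
    for row in build_form("Person"):
        print(row)
```

Driving the form from the pattern, rather than hard-coding it per class, means that specializing a pattern for a new SSD changes the editor's forms without changing the editor itself.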

5 Conclusion and Future Work

The paper discusses the problems of applying ontology design patterns to the development of ontologies of scientific subject domains. An approach to the development of SSD ontologies that solves most of these problems is presented. This approach is supported by a system of heterogeneous ontology design patterns, which describe the main structures and entities needed to describe scientific domains, and by a data editor, which makes it possible to populate the ontology with actual data by instantiation of content patterns. Owing to the simplicity and clarity of the pattern system and the data editor, this approach can be used not only by knowledge engineers, but also by specialists in the modeled area of knowledge.

This approach has shown its practical utility in the development of ontologies of various scientific subject domains (“Decision Support” [25], “Active Seismology” [26], etc.).

In the near future, we plan to extend this approach so that it supports automated population of an ontology. For this purpose, the pattern system will be expanded with lexico-syntactic patterns [27], which will be used to facilitate the population (completion) of ontologies based on natural-language texts. The lexico-syntactic patterns are expected to be generated automatically from the existing content and structural patterns using a dictionary of synonyms and a subject area thesaurus.