1 Introduction

Today, ontologies are the most popular and effective means of conceptualizing and formalizing scientific subject areas [1]. They are widely used to present and record some common knowledge shared by all experts (or a group of experts) about such areas. The formalization of the semantics of a subject area in the form of ontology not only contributes to its compact and consistent description, but also forms a conceptual basis for the representation of the whole body of knowledge about it. For example, in the system of information support of research activity (SISRA) [2], the semantics of the data and information resources used in it can be described in terms of ontology; the expert rules, precedents and other components of the knowledge base of the expert system or decision support system can also be described in terms of ontology [3].

The SISRA must provide the user with a representation of all the necessary information about his/her area of expertise, its components (divisions/subdivisions of science, objects, methods and techniques of research, etc.), as well as subjects (participants) of research activity (personalities, groups, communities and other organizations involved in the research process). In the SISRA, ontology defines a formal description of the knowledge area on the basis of which such information is systematized and relevant information resources and documents are integrated into a single information space. A user interface that provides meaningful access to knowledge and data integrated into the information space of the system is also constructed on the basis of the ontology. In particular, in such an interface the user can use the ontology as a guide to navigate through this space, as well as to formulate search queries that have concepts and relations of ontology as the main elements.

At present, ontologies are widely used for the conceptual modeling of subject areas with intensive use of data [4]. The development and application of infrastructures to support the research based on the conceptual specifications of these areas makes it possible to avoid the dependence of programs on the structure of data sources, to ensure the interoperability of various data processing methods working together, and to increase the reliability of the results obtained through the use of formal consistent specifications.

The development of the ontologies of scientific subject areas is a rather complex and time-consuming process. To simplify and facilitate it, various methods and approaches to the development of ontologies have been proposed. Over the past ten years, the approach based on the application of ontology design patterns (ODPs) has been developing intensively [5, 6].

ODPs are documented descriptions of proven solutions to ontological modeling problems. Several catalogs of patterns have been created and are still being developed [7, 8]. It should be noted that such catalogs are, as a rule, oriented either to a particular subject area or to a particular group of developers, so they are neither complete nor universal.

The paper discusses the ontology design patterns resulting from solving the problems that the authors have encountered in the process of developing ontologies for various scientific subject areas [9,10,11,12]. We describe the problems and patterns in the context of the ontology development methodology for thematic intelligent scientific Internet resources (ISIR) [13], intended for information and analytical support of research activity in the specified areas of knowledge.

This paper is an extended version of the paper presented at the DAMDID/RCDL’2017 conference [14]. The review of ontology design patterns has been substantially expanded (Sect. 2) and a description of the presentation patterns used in the methodology of building ontology for ISIR [13] has been added.

2 A Review of Ontology Design Patterns

The progenitors of ontology design patterns are design patterns, widely used in software development. In this area of activity, a design pattern is a description of a well-tested, generalized scheme for solving a frequently recurring development problem that arises in a certain context. The patterns have become part of the daily practice of object-oriented design. With their help, specific design tasks are solved, which has resulted in object-oriented design becoming more flexible, elegant, and reusable [15].

By analogy with design patterns, ontology design patterns are used to describe the solutions of typical problems that arise in ontologies development. Patterns are created in order to facilitate the process of building ontologies and help developers avoid some of the highly repetitive errors of ontological modeling [16]. In this capacity, the ODPs were first introduced, independently from each other, by Aldo Gangemi [5] and Eva Blomqvist with Kurt Sandkuhl [17].

The main catalog of ontology design patterns is presented on the portal of the Association for Ontology Design & Patterns (ODPA) [8], created within the NeOn project [18]. Within the framework of this project, the typology of the patterns presented in Fig. 1 was proposed [19].

Fig. 1. Types of ontology design patterns.

Depending on the problems for which ontology design patterns are designed, we distinguish between structural patterns, correspondence patterns, content patterns, reasoning patterns, presentation patterns and lexico-syntactic patterns [20].

It should be noted that currently there is no unique standard for the description of patterns [21]. As a rule, however, they are described in the format proposed on the portal of the ODPA association [8]. This pattern description schema includes a graphical representation, text description, a set of scenarios and examples of use, links to other patterns in which it is used, as well as general information about the pattern name, its author and application area. According to the eXtreme Design methodology [22], each content pattern is also supplied with a set of competency (qualification) questions that determine its content.

2.1 Structural Patterns

Structural patterns either fix the ways to solve problems caused by the limited expressive capabilities of ontology description languages or specify the general structure and overall shape of an ontology. Patterns of the first type are called logical patterns, and patterns of the second type are called architectural patterns. Architectural patterns contain proposals for organizing an ontology as a whole, including, for example, structures such as taxonomy and modular architecture.

One of the most acute problems in using the OWL language to construct ontologies is that OWL directly represents only simple entities, while ontologies need to represent complex concepts and relations. Structural logical patterns are used to solve this problem. They include, for example, patterns for representing n-ary relations, partonomy relations, lists, and trees.

Note that structural patterns are subject-independent; they can serve as a basis for constructing the fragments of ontology that are part of the content patterns.

2.2 Reasoning Patterns

The reasoning patterns are built on the basis of structural logical patterns and are designed to produce specific results using a logical inference machine. Such patterns make it possible to provide not only the inference of implicit knowledge in the ontology (patterns of classification, categorization, inheritance, etc.), but also information about the state of ontology, execution of queries to ontology, and its evaluation and normalization (elimination of class anonymity and their instances, an explicit representation (reification) of the class hierarchy, normalization of names, etc.) [23].

As noted in [24], such patterns can be applied to searching, viewing, filtering, integrating, and personalizing ontology elements encountered in Semantic Web applications.
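The inference of implicit knowledge mentioned above can be illustrated by a minimal Python sketch. It computes the inherited superclasses of a class from asserted subclass links and derives the types of an individual from them. The class names and the SUBCLASS_OF table are hypothetical, and a real ontology would delegate this work to an OWL reasoner rather than to a hand-written closure.

```python
# Hypothetical asserted subclass links: child -> set of direct parents.
SUBCLASS_OF = {
    "Monograph": {"Publication"},
    "Publication": {"InformationObject"},
    "InformationObject": set(),
}


def superclasses(cls, subclass_of):
    """Return all direct and inherited superclasses of cls."""
    seen = set()
    stack = list(subclass_of.get(cls, ()))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(subclass_of.get(parent, ()))
    return seen


def inferred_types(asserted_type, subclass_of):
    """Types of an individual: its asserted class plus all inherited ones."""
    return {asserted_type} | superclasses(asserted_type, subclass_of)
```

In this sketch an individual asserted to be a Monograph is also classified as a Publication and an InformationObject, although neither fact is stated explicitly.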

2.3 Content Patterns

The content patterns define the ways of representing the typical fragments of ontology, based on which ontologies of a whole class of subject domains can be built.

In the ODPA association catalog, different kinds of basic patterns involved in many subject areas, such as Person, EventCore, Action, Participation, TimeInterval, Location, SpatioTemporalExtent, Situation, and Trajectory, are represented as content patterns. They can be specialized for specific subject areas.

Based on these patterns, we can describe more complex (composite) content patterns, supplementing them with missing components and building additional relations between the main classes representing the pattern.

The content patterns also include patterns defining structures for the representation of different types of relationships between ontology elements, such as Classification, Collection, Set, Bag (a set with repeating elements), List, PartOf, Componency (a special kind of “part of” relation), etc.

To support the building and use of patterns, the ODPA community maintains the eXtreme Design (XD) methodology [22], which was developed within the framework of the NeOn project. Its main principles are iterative development, involvement of the customer in ontology building, and requirements-driven management of ontology development. According to this methodology, each content pattern should be provided with a set of competency questions defining its content, as well as a set of contextual statements and reasoning requirements necessary for answering the competency questions. An ontology can then be tested by expressing these questions as SPARQL queries and running them against the ontology.

Structural and content patterns are the most popular types of patterns which can be used to describe the fragments of a domain ontology. Unlike ontology repositories, catalogs of patterns do not provide collections of ready-made ontologies, but sets of proposed solutions (i.e., well documented ontology fragments) for ontology design, which can be repeatedly used by developers to create their own ontologies in various subject areas.

2.4 Presentation Patterns

The presentation patterns define recommendations for the naming, annotating and graphical representation of ontology elements so as to increase the ontology readability and usability. The paper [22] notes the importance of a meaningful naming of ontology elements for the purpose of its better understanding by the user, although it does not improve the technical capabilities of the ontology to answer questions and perform a logical inference.

Naming patterns include conventions on rules for naming the namespaces declared for ontologies, files, and ontological elements. For example, it is encouraged to use the basic URIs of the organization that publishes an ontology for constructing the namespace, capital letters and the singular in the name of a class, and the suffix of the name of the parent class in the names of its subclasses.
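Two of the conventions listed above lend themselves to automatic checking. The following Python sketch is only an illustration: the function name check_class_name and the sample class names are hypothetical, and the convention on the singular is deliberately not checked because it cannot be verified reliably without linguistic resources.

```python
def check_class_name(name, parent=None):
    """Return a list of naming-convention violations for a class name."""
    problems = []
    # Convention: class names start with a capital letter.
    if not name[:1].isupper():
        problems.append("class name should start with a capital letter")
    # Convention: a subclass name ends with the name of its parent class.
    if parent is not None and not name.endswith(parent):
        problems.append(f"subclass name should end with '{parent}'")
    return problems
```

For example, a hypothetical subclass ResearchMethod of the class Method passes both checks, whereas researchTool violates both.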

The annotation patterns include annotation rules for ontology elements, such as providing classes and their properties with comments (rdfs:comment) and labels (rdfs:label) in several languages.
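An annotation rule of this kind can be checked mechanically. The sketch below assumes that the annotations of an ontology element have already been read into a plain dictionary; the required language set ("en", "ru") and the function name missing_annotations are assumptions made for illustration only.

```python
REQUIRED_LANGS = ("en", "ru")  # assumed set of required languages


def missing_annotations(annotations, required_langs=REQUIRED_LANGS):
    """List (kind, language) pairs for which no annotation text is given.

    annotations maps "label" and "comment" to {language: text} dictionaries.
    """
    missing = []
    for kind in ("label", "comment"):
        present = annotations.get(kind, {})
        for lang in required_langs:
            if not present.get(lang):
                missing.append((kind, lang))
    return missing
```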

2.5 Correspondence Patterns

Correspondence patterns are required to perform the reengineering (transformation) or alignment (mapping) of ontologies. The first group of patterns is used when we need to build a new ontology, and the initial model is not necessarily ontological. The second group of patterns is used to determine correspondence between the concepts and individuals of two ontologies [25] so as to ensure interoperability without modifying existing models.

2.6 Lexico-Syntactic Patterns

Lexico-syntactic patterns are used to facilitate the construction (completion) of ontologies based on texts in a natural language. They set the mapping of language structures into ontological structures.

The idea of this type of patterns is not new; it is based on the concept of lexico-syntactic patterns of language constructs for the automatic extraction of linguistic units from the text [26]. Elements of lexico-syntactic patterns can be groups of words and word combinations that correspond to ontological constructions defined in both the ontology description language and structural (logical) patterns.
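As a minimal illustration of such a mapping, the Python sketch below turns sentences of the form "X is a kind of Y" into subclass statements. The regular expression, the function name and the relation label subClassOf are assumptions chosen for this example; practical lexico-syntactic patterns rely on linguistic analysis rather than on a single regular expression.

```python
import re

# Assumed pattern: "<X> is a kind of <Y>", with optional articles a/an.
KIND_OF = re.compile(
    r"\b(?:an?\s+)?(\w+) is a kind of (?:an?\s+)?(\w+)",
    re.IGNORECASE,
)


def extract_subclass_statements(text):
    """Map each match of the pattern to a (child, 'subClassOf', parent) triple."""
    return [
        (m.group(1).lower(), "subClassOf", m.group(2).lower())
        for m in KIND_OF.finditer(text)
    ]
```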

3 Patterns of the Ontology Constructing Methodology for Thematic ISIRs

In this section, we consider the ontology design patterns used in the methodology of building ontologies for thematic intelligent scientific Internet resources (ISIR) [13]. This methodology, developed with the participation of the authors, uses the Semantic Web technology tools [27]. In particular, ontologies within this methodology are developed in the OWL language [28] using the Protégé editor [29]. These tools help solve many ontological engineering problems, including ontology validation and reuse, but their use, in turn, creates new problems.

3.1 Structural Logical Patterns

Structural logical patterns became necessary in the methodology of building ontologies for thematic ISIRs because OWL lacks expressive means for representing the complex entities and constructions needed for the ontologies of thematic ISIRs, in particular ranges of admissible values, n-ary relations, and attributed relations (binary relations with attributes).

First, let us consider the pattern for representing the range of admissible property values. This pattern was introduced because OWL has no special means of specifying the value ranges that are called domains in the relational data model and are characterized by a name and a set of atomic values. Domains are convenient for describing the possible values of the properties of a class when the entire set of such values is known in advance. Using domains not only allows us to control the input of information; it also makes input more convenient by letting users select property values from a given list.

The solution to this problem is to define the domain by using an enumerated class, a descendant of a specially introduced Domain class. Each specific domain does not have descendants and consists of a finite set of different individuals (objects or instances of the class) that determine the possible values of a particular property (ObjectProperty) of the objects of the class in question (see Fig. 2).

Fig. 2. Structural pattern of representation of the range of admissible values and an example of its use.

Examples of such domains are “Geographic type”, “Position”, “Type of organization”, “Type of publication”, which include, respectively, the types of settlements, positions, organizations and publications. (A description of the “Type of organization” domain is shown at the bottom of Fig. 2).
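The idea of this pattern can be sketched in Python, with an enumeration playing the role of the enumerated class. This is an analogy rather than an OWL encoding: the member values of OrganizationType and the property name has_type are hypothetical and are not taken from Fig. 2.

```python
from enum import Enum


class Domain(Enum):
    """Base for enumerated domains; each subclass lists its admissible values."""


class OrganizationType(Domain):
    # Hypothetical individuals of the "Type of organization" domain.
    INSTITUTE = "institute"
    UNIVERSITY = "university"
    LABORATORY = "laboratory"


def set_organization_type(organization, value):
    """Accept only values that belong to the domain (ValueError otherwise)."""
    organization["has_type"] = OrganizationType(value)
    return organization


def admissible_values(domain):
    """Values that a user interface could offer in a selection list."""
    return [member.value for member in domain]
```

As in the pattern, a concrete domain has no descendants: Python does not allow an enumeration that already has members to be subclassed.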

Note that in the pattern figures shown in the paper, classes are drawn as ellipses, and their individuals and attributes are drawn as rectangles. An ObjectProperty connection is shown by a solid straight line, and a DataProperty connection is shown by a dashed line. Obligatory classes, attributes and individuals are drawn with a thick outline.

Another common problem in the development of an ontology is the need to present the attributed relations between two objects. For these purposes, as a rule, ordinary binary relations provided with attributes that specialize the connection between the arguments of the relation [30] are used. Since the OWL language does not provide the possibility to specify attributes for a relation, a structural pattern that provides for the introduction of the service class Attributed relation has been proposed (see Fig. 3).

Fig. 3. Structural pattern of the binary attributed relation.

To represent a specific type of relation with attributes, a new class is defined as a subclass of the Attributed relation class. An instance of this class is linked to each argument and each attribute of the attributed relation. Constraints on the obligatoriness and uniqueness of the arguments must be set, whereas the number of attributes (properties) is not restricted.

Note that this pattern, in contrast to the Qualified Relation pattern introduced in [30], allows the order of the arguments of the attributed relation to be specified explicitly. This preserves information about the direction of the relation, which is important for providing the user with complete information about the nature of the relation between objects.

Figure 4 shows an example of using this pattern to specify a relation describing a person’s participation (Person class) in an activity (Activity class). The pattern allows us to specify the start and end dates of the person’s participation in an activity, as well as his/her role in it.

Fig. 4. The pattern of the binary attributed relation “participates”.
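The structure of the “participates” relation can be sketched with Python data classes. The sketch is an illustration only and does not reproduce the OWL definitions: the field names source, target, start_date, end_date and role, as well as the sample person and activity, are assumptions.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Person:
    name: str


@dataclass(frozen=True)
class Activity:
    title: str


@dataclass
class AttributedRelation:
    """Service class: a directed binary relation represented as an object."""
    source: object  # first argument, obligatory and unique
    target: object  # second argument, obligatory and unique


@dataclass
class Participates(AttributedRelation):
    """Person -> Activity; all attributes below are optional."""
    start_date: Optional[date] = None
    end_date: Optional[date] = None
    role: Optional[str] = None

    def __post_init__(self):
        # The argument order is fixed, so the direction of the relation is kept.
        if not isinstance(self.source, Person):
            raise TypeError("first argument must be a Person")
        if not isinstance(self.target, Activity):
            raise TypeError("second argument must be an Activity")


# Hypothetical usage: a person takes part in an activity as its leader.
participation = Participates(
    Person("I. Ivanov"),
    Activity("Project X"),
    start_date=date(2016, 1, 1),
    role="project leader",
)
```

Both arguments are required by the constructor, while the attributes may be omitted, which mirrors the constraints of the pattern.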

Similarly, we can build a pattern for an n-ary relation. In this case, we have to specify the order of the arguments of the relation in addition to the properties of the obligatoriness and uniqueness of its arguments (see Fig. 5).

Fig. 5. Structural pattern of an n-ary relation with attributes.

Note that the pattern proposed in this work differs from the structural pattern of the n-ary relation presented in [31] and from the content pattern Situation presented in [32], which is proposed as a basis for the description of n-ary relations. In addition to the arguments of the relation and their order, the proposed pattern allows us to specify the properties (ObjectProperty) and attributes (DataProperty) of the relation. This considerably extends the expressive power of the pattern.
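A minimal Python sketch of an n-ary relation with ordered arguments and attributes is given below. It only illustrates the constraints discussed here (explicit argument positions, obligatoriness and uniqueness of arguments, free attributes); the class and method names and the sample relation are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class NaryRelation:
    name: str
    arity: int
    arguments: dict = field(default_factory=dict)   # position -> argument
    attributes: dict = field(default_factory=dict)  # attribute name -> value

    def set_argument(self, position, value):
        """Fill one argument slot; each position can be filled only once."""
        if not 1 <= position <= self.arity:
            raise ValueError(f"position must be between 1 and {self.arity}")
        if position in self.arguments:
            raise ValueError(f"argument {position} is already set")
        self.arguments[position] = value

    def is_complete(self):
        """All arguments are obligatory: every position must be filled."""
        return set(self.arguments) == set(range(1, self.arity + 1))

    def ordered_arguments(self):
        """Arguments in the order fixed by their positions."""
        return [self.arguments[i] for i in sorted(self.arguments)]
```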

Structural patterns are subject-independent; due to this, they can serve as a basis for specifying ontology elements for content patterns.

3.2 Content Patterns

As mentioned above, content patterns define the ways of representing typical ontology fragments that can underlie the construction of the ontologies of modeled subject domains. In fact, the content patterns proposed in the paper are fragments of the base ontologies provided by the above-mentioned methodology of building ontologies for thematic ISIRs. After the concepts contained in these fragments have been specialized and new concepts have been added, these fragments become constituent parts of the ontologies of specific subject areas.

The ontology of the ISIR domain is built on the basis of the following base ontologies: the ontologies of scientific knowledge and research activity, base ontology of tasks and methods, and base ontology of scientific information resources [13].

The ontology of scientific knowledge contains classes that define structures for describing concepts falling into any scientific field of knowledge. Such concepts are Division of Science, Object of Research, Subject of Research, Research Method, Scientific Result, etc. Using these classes, it is possible to identify and describe the divisions and subdivisions important for the modeled area of knowledge, to specify the typification of methods and objects of research, and to describe the results of scientific activity.

The ontology of research activity is based on the ontology proposed in [33], which is designed to describe research projects. This ontology includes classes of concepts related to the organization of scientific and research activity, such as Person (Researcher), Organization, Event, Activity, Project, Publication, etc. This ontology also includes relations that allow us to connect notions of this ontology not only among themselves, but also with the concepts of the ontology of scientific knowledge.

The base ontology of scientific information resources includes the Information Resource class as the main class, since this concept plays an important role in any scientific area. The set of attributes and relations of this class is based on the Dublin core standard [34]. Its attributes are resource name, resource language, resource topic, resource type, date of resource creation, etc. To represent information about resource sources and its creators, as well as related events, organizations, persons, publications and other entities, the ontology contains special relations linking the Information Resource class with the classes of other base ontologies.

The base ontology of tasks and methods includes classes such as Task, Solution Method, and Web Service. The concepts and relations of this ontology are meant to describe the tasks which are to be solved using the ISIR, methods for solving these tasks, and web services implementing these methods.

Building a consistent description of scientific subject areas depends on the ability to uniformly represent the concepts used in them. For this purpose, patterns representing the basic concepts and relations of base ontologies were developed. Let us show how patterns for describing the ontology of scientific knowledge look.

The pattern presented in Fig. 6 is intended to describe the research methods used in scientific activity.

Fig. 6. Pattern for describing the research method.

Elements of the description of the pattern of the research method are represented by such obligatory classes of ontology as Activity, Scientific Result, Task and others, and such relations as used_in, implemented_in, solve, etc.

Let us give a set of competency questions representing the content of this pattern:

  • What objects is the method applied to?

  • In what activity is the method used?

  • What tasks are solved using the method?

  • In what divisions of science is the method used?

  • In what scientific results is the method implemented?

  • Who is the author of the method?

  • Who applies the method?

  • Who develops the method?

  • In what organizations is the method applied?

  • In what publications is the method described?

  • On what resources is the method presented?
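The way such questions are used to test an ontology can be sketched in Python over a small set of facts. In the methodology the questions would be expressed as SPARQL queries to the OWL ontology; here a list of triples stands in for the ontology. The relation name implemented_in comes from the pattern, whereas has_author and the function names are assumptions.

```python
# Hypothetical facts about one method, stored as (subject, relation, object).
TRIPLES = [
    ("Subdefinite calculations", "implemented_in", "UniCalc"),
    ("Subdefinite calculations", "has_author", "A.S. Narinyani"),
]


def answer(method, relation, triples=TRIPLES):
    """Answer a competency question: objects linked to the method by relation."""
    return sorted(o for s, r, o in triples if s == method and r == relation)


def unanswered(method, relations, triples=TRIPLES):
    """Relations for which the facts give no answer about the method."""
    return [r for r in relations if not answer(method, r, triples)]
```

A question that returns no answer indicates that the description of the method is still incomplete with respect to the pattern.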

Figure 7 shows an example of using the pattern considered above to describe the method of subdefinite calculations [35] proposed by A.S. Narinyani in 1982 and implemented in the UniCalc solver [36].

Fig. 7. Example of using the pattern for describing the research method.

Figure 8 shows a pattern for describing the object of research, which includes the following classes as obligatory: Subject of Research, Activity, and Division of Science. The instances of these classes must be connected with the object of research by relations such as has_Aspect, investigated_In and studied_In, respectively. The object of research can be structural (i.e. include other objects of research).

Fig. 8. Pattern for describing the object of research.
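The obligatory part of the pattern for the object of research can be expressed as a simple check. The Python sketch below assumes that an object of research is given as a dictionary mapping relation names to lists of related instances; the dictionary layout, the function name and the sample instance are hypothetical.

```python
# Obligatory relations of the pattern and the classes they must lead to.
OBJECT_OF_RESEARCH_PATTERN = {
    "has_Aspect": "Subject of Research",
    "investigated_In": "Activity",
    "studied_In": "Division of Science",
}


def missing_obligatory_relations(instance, pattern=OBJECT_OF_RESEARCH_PATTERN):
    """Return the obligatory relations that have no value in the instance."""
    return [relation for relation in pattern if not instance.get(relation)]


# Hypothetical, incomplete description of an object of research.
example_object = {
    "name": "Some object of research",
    "has_Aspect": ["Some subject of research"],
    "studied_In": [],
}
```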

The pattern of the subject of research must necessarily include a reference to the object of research, an aspect of which it is. The subject of research, as well as the object of research, can be structural (include other subjects of research).

The scientific results occupy an important place in the description of scientific activity. The pattern for describing the scientific result is shown in Fig. 9. This pattern sets the requirement that when describing the scientific result, a reference to the activity under which it was obtained should be made.

Fig. 9. Pattern for describing the scientific result.

Note that in the patterns described above, not only “central” concepts of patterns, but also concepts from adjacent patterns are used. For example, in the pattern for describing the scientific result, in addition to the concept of Scientific Result such concepts as Activity, Subject of Research, and Division of Science are also used. This allows us to give a related description of the modeled area.

3.3 Presentation Patterns

As mentioned above, presentation patterns define recommendations for naming, annotating, and graphical representation of ontology elements.

The technology of the development of thematic ISIR allows users to customize the display of information objects (instances of ontology class) included in the ISIR content on the monitor screen. To solve this problem, presentation patterns have been proposed. They allow us to fix the order in which the properties of information objects are displayed, specify the way of the dynamic naming of information objects using meaningful references to them, and store these settings directly in the ontology.

The desired order of showing the attributes of the information object is set for each property of its class using the annotation property (owl:AnnotationProperty) introduced especially for this purpose and called order. Its value is an integer.

When visualizing information from the ISIR content, we can represent information objects by full or short references. Full references are used when a list of objects of a given class is displayed; short ones are used to refer to one object within the description of another object. Full references are built using the link annotation property, and short references are built using the shortlink annotation property. The values of these properties are also integers; they specify the order in which the attributes (DataProperty) or relations (ObjectProperty) of the object enter into the full or short reference.
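The effect of the order, link and shortlink annotation properties can be illustrated by a small Python sketch. The property names of the hypothetical Person class, the annotation values and the way the parts of a reference are joined (by a space) are all assumptions; only the three annotation property names come from the methodology.

```python
# Annotation values attached to the properties of a hypothetical Person class.
PERSON_ANNOTATIONS = {
    "last_name": {"order": 1, "link": 1, "shortlink": 1},
    "first_name": {"order": 2, "link": 2},
    "position": {"order": 3, "link": 3},
    "email": {"order": 4},
}


def ordered_properties(annotations, key="order"):
    """Properties that carry the annotation, sorted by its integer value."""
    annotated = [p for p, values in annotations.items() if key in values]
    return sorted(annotated, key=lambda p: annotations[p][key])


def reference(obj, annotations, key):
    """Build a full (key='link') or short (key='shortlink') reference."""
    parts = [str(obj[p]) for p in ordered_properties(annotations, key) if p in obj]
    return " ".join(parts)


# Hypothetical information object of the Person class.
person = {
    "first_name": "Ivan",
    "last_name": "Petrov",
    "position": "researcher",
    "email": "petrov@example.org",
}
```

With these settings the full reference to the object consists of the last name, first name and position, while the short reference contains the last name only.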

4 Conclusion

The paper discusses the application of ontology design patterns to the development of ontologies of scientific subject areas. The classification of patterns proposed by the ODPA Association is presented, and the most important types of patterns from this classification are described. In addition, the patterns used by the authors of the paper in the development of ontologies for a number of scientific subject areas are considered in detail.

Ontology design patterns serve to provide a uniform and consistent representation of all the entities of the ontology being developed. The use of ontology design patterns by experts and knowledge engineers helps to save resources and avoid errors in the development of ontologies.

The work was carried out with the partial financial support of the Russian Foundation for Basic Research (grant No. 16-07-00569).