1 Introduction

Transdisciplinary research is a way of broadening the scientific worldview: a phenomenon is considered beyond the bounds of any single scientific discipline. Developing mechanisms of transdisciplinarity supports a shared information and analytical working environment, and it is a priority for the information support of scientific and other research, especially in large integrated projects or where resources are geographically distributed because of the nature of the problems being solved [1,2,3,4,5,6,7].

It is advisable to cite five main interpretations of transdisciplinarity (according to Klein) [6]:

  • The first approach is connected with research into the complex unity of the world and the corresponding “complex” thinking, and with the expansion of the established image of science.

  • The second approach defines transdisciplinarity based on the possibilities of a transgressive transition beyond the limits of disciplinary knowledge in sociological and cultural studies.

  • The third approach reveals transdisciplinarity as a new form of organization of scientific research, which involves the consolidation of interdisciplinary resources in a single methodological and theoretical framework.

  • The fourth approach is the so-called “trans-sector of transdisciplinary”, which focuses on identifying mechanisms for effective interaction between the academic community, the industry and the business sector. In this context, transdisciplinarity can be considered as a resource for synergy between the prospects of expert (academic) and everyday (non-professional) types of knowledge in problem solving.

  • The fifth approach is represented by the theoretical concepts of “second type of production of knowledge” and “postnormal science”.

The most general form of transdisciplinarity rests on formally interconnecting the understanding of individual disciplines. It yields logical meta-frameworks with which the knowledge developed in these disciplines can be integrated at a higher level of abstraction than in interdisciplinarity. This type of transdisciplinarity is often used in the work of expert systems and expert groups [8].

Each approach has its own advantages and disadvantages, which become apparent when solving specific problems. However, the value of the transdisciplinary approach is undeniable, as evidenced by the text of the “World Declaration on Higher Education for the 21st Century: Approaches and Practical Measures”, adopted by the participants of the International Conference held at UNESCO headquarters in Paris in October 1998, in particular its chapters 5 and 6 [7].

In practice, the transdisciplinary approach is applied in the form of ontologies [9], which allow a general scientific classification and systematization of interdisciplinary knowledge.

The methods most commonly used for extracting knowledge and meaning from unstructured texts are deep learning, counting identical words, statistical sequence labeling, supervised machine learning, classification, and the construction of ontological models [10,11,12].

This chapter, however, discusses methods related to the construction of ontological models and text structuring, together with the use of a dictionary reflecting the meaning obtained in previously structured languages and document models. Particularly relevant is the integration of capabilities for extracting meaning from text and representing this meaning in various geographic information systems.

2 The Concept of an Ontological GIS-Application and Its Formation on the Basis of Natural-Language Documents

A GIS is the most natural and convenient way of presenting geospatial information [13,14,15]. However, building a geographic information system (GIS) can be rather complicated if the available geospatial data is contained in documents with weakly structured or even unstructured information. Handling such documents manually is extremely labor-intensive, and processing large volumes of them manually is almost impossible [16,17,18].

Before starting to work with weakly structured or unstructured documents, it is necessary to structure them. During this process, the data is presented in an easy-to-handle form, which can easily be read by standard GIS tools and also conveniently displayed to the end user. This, in particular, may provide an opportunity to find hidden information in the input data [19,20,21,22,23,24,25,26].

The structuring of natural-language (NL) texts is the most complicated case, because it requires a sufficiently complete formal description of the subset of the language to which the texts belong. Each text describes a specific subject area (SSA) or a part of it. The terms of the SSA that are used in the text form its terminology field. Structuring the text consists of isolating this terminology field from it, in particular identifying the concepts of the corresponding SSA, as well as their attributes and interconnections.

The resulting terminology field can be represented by an ontology, which is an ordered triple [13, 14, 27,28,29,30]:

$$ O = \left\langle {X,R,F} \right\rangle $$
(1)
  • where X - the set of concepts of the subject area,

  • R - the set of relations between concepts X,

  • F - a set of interpretation functions X and/or R.

Thus, the structuring of a certain NL text \( T^{T} \) can be represented as a transformation (the structuring transformation):

$$ F_{str} :T^{T} \to O $$
(2)

In practice, however, it is not always possible to extract all the necessary information from the text, so boundary cases of formula (1) may occur in which \( X = \emptyset \), \( R = \emptyset \) or \( F = \emptyset \). The possible combinations of these conditions give different variants of ontological constructions, ranging from simple vocabularies to the formal structure of a conceptual knowledge base. In particular, according to [14], one can distinguish:

  1. \( X = \emptyset ,R = \emptyset ,F = \emptyset \) – unstructured text;

  2. \( X \ne \emptyset ,R = \emptyset ,F \ne \emptyset \) – glossary;

  3. \( X \ne \emptyset ,R \ne \emptyset ,F = \emptyset \) – taxonomy;

  4. \( X \ne \emptyset ,\,\left( {R \ne \emptyset \mid R = R_{t} \cup R^{ + } } \right),\,F = \emptyset \) – thesaurus;

  5. \( X \ne \emptyset ,R = \emptyset ,card\left( F \right) = 1 \) – simple ontology;

  6. \( X \ne \emptyset ,R \ne \emptyset ,card\left( F \right) > 1 \) – active ontology.

Such a scheme for the classification of ontologies according to the functional attribute is consistent with the description given in [19, 31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47].
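
The case analysis above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `OntologyKind`, `classify` and the flag `only_taxonomic` are hypothetical, the flag stands in for the distinction between a purely taxonomic R and \( R = R_{t} \cup R^{+} \), and where two listed cases overlap (a glossary with exactly one interpretation function also matches the simple-ontology condition) the sketch checks the more specific case first.

```python
from enum import Enum


class OntologyKind(Enum):
    UNSTRUCTURED_TEXT = "unstructured text"
    GLOSSARY = "glossary"
    TAXONOMY = "taxonomy"
    THESAURUS = "thesaurus"
    SIMPLE_ONTOLOGY = "simple ontology"
    ACTIVE_ONTOLOGY = "active ontology"
    OTHER = "not covered by the listed cases"


def classify(concepts, relations, functions, only_taxonomic=True):
    """Classify a triple <X, R, F> by which of its components are empty.

    only_taxonomic is a hypothetical flag: False means that R also
    contains non-taxonomic relations (R = R_t united with R^+).
    """
    has_x = bool(concepts)
    has_r = bool(relations)
    n_f = len(functions)

    if not has_x and not has_r and n_f == 0:
        return OntologyKind.UNSTRUCTURED_TEXT
    if has_x and not has_r and n_f == 1:
        # Assumed tie-break: card(F) = 1 is checked before F != empty.
        return OntologyKind.SIMPLE_ONTOLOGY
    if has_x and not has_r and n_f > 1:
        return OntologyKind.GLOSSARY
    if has_x and has_r and n_f == 0:
        if only_taxonomic:
            return OntologyKind.TAXONOMY
        return OntologyKind.THESAURUS
    if has_x and has_r and n_f > 1:
        return OntologyKind.ACTIVE_ONTOLOGY
    return OntologyKind.OTHER


if __name__ == "__main__":
    x = {"river", "lake"}
    r = {("river", "flows_into", "lake")}
    print(classify(x, r, set()).value)
    print(classify(x, r, {"f1", "f2"}).value)
```

Combinations that the list does not mention, such as non-empty X and R with a single interpretation function, are reported as `OTHER` rather than guessed.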

A thematic ontology [14] is an ontology in which, in addition to the conditions \( X \ne \emptyset \) and \( R \ne \emptyset \) being fulfilled, the interpretation functions are supplemented with axioms, definitions and restrictions specific to the given SSA. All components are described in a formal language that can be interpreted by some procedure (algorithm). The formal model of a thematic ontology is described as follows:

$$ O_{t} = \left\langle {X,R,F,A,D,R_{s} } \right\rangle $$
(3)
  • where X - the set of concepts of the given SSA;

  • R - a finite set of semantically meaningful relations between the concepts of the SSA;

  • F - a finite set of interpretation functions defined on the concepts and/or relations;

  • A - a finite set of axioms used to write statements that are always true (definitions and restrictions) in terms of the SSA;

  • D - a set of additional definitions of concepts in terms of the SSA;

  • \( R_{s} \) - a set of constraints that determine the scope of the conceptual structures of a specific topic of the SSA.

A thematic ontology is a formal representation of the conceptual knowledge of the subject area and can be implemented as an information system. The construction of such an information system can be represented as a composition of statements, judgments, term-concepts and the relations between them. Its result is the basis for constructing an integral part of a scientific theory: the ontological knowledge base of the given subject area, described in declarative form [21, 22, 48, 49]. Such a classification of ontologies by function is consistent with the following description: “Ontology or conceptual model of the subject area consists of a hierarchy of concepts of the domain, the links between them and the laws that operate within the framework of this model.”

However, the structure of the SSA described by a text is generally limited to the concepts \( \left( {X \ne \emptyset } \right) \) and the links between them \( \left( {R \ne \emptyset } \right) \). That is, a glossary \( O^{1} \) and a taxonomy \( O^{2} \) can be built from texts. The resulting taxonomy has the form (4).

$$ O^{2} = \left\langle {X,R} \right\rangle $$
(4)
  • where X - the set of concepts given by the SSA;

  • R - a finite set of semantically meaningful relations between the concepts of SSA.

However, according to [14], all non-empty products of subsets of X and R form a set of actions \( F_{t} \subset F \). Therefore, a set of interpretation functions can always be formed for formula (4) when the conditions \( X \ne \emptyset \) and \( R \ne \emptyset \) are fulfilled. Thus, any formula of the form (4) always corresponds to a certain formula of the form (1).

In fact, the structuring transformation (2) is a multi-stage process, each stage of which requires specialized models and procedures. The general scheme of the transformation can be represented by formula (5).

$$ T^{T} \to T_{sn} \to O^{1} \to O^{2} \to O $$
(5)
  • where \( T^{T} \) - the NL text;

  • \( T_{sn} \) - the primary structure of the text;

  • \( O^{1} \) - glossary;

  • \( O^{2} \) - taxonomy (4);

  • O - ontology (1).

The first stage, lexical analysis \( T^{T} \to T_{sn} \), requires a significant amount of linguistic information and is therefore performed by a separate system, a lexical analyzer [32, 40, 50,51,52,53,54,55,56].

Representing the text in structured form as the ontology (1) makes automatic processing of its information possible, but such a representation is not convenient for experts. Because the activity of experts in a corporate information environment can be represented as a system {action -> results}, it is expedient to present the information in the structured document to the expert by means of a system with a similar structure. A system of this type can be defined as a natural system, denoted SN [14, 57].

The natural system (NS) used to display a particular document provides interaction with its content. Such a document can be defined as interactive and represented by the pair (6).

$$ \left\langle {O,SN} \right\rangle $$
(6)
  • where O - ontology, which is a structured representation of a certain document;

  • SN - the NS built on the basis of O.

A special case of an interactive document is an ontological GIS application, that is, an interactive document whose NS is based on an affine space [58]. The affine space serves as the mathematical basis for displaying geospatial information, so such an NS can naturally present to the expert the geospatial information extracted from the text during structuring. The GIS application formed on the basis of such an NS can be represented by formula (7) [13, 58,59,60].

$$ \left\langle {X_{g} ,R_{g} ,A^{op} ,S,O} \right\rangle $$
(7)
  • where \( X_{g} \) - the set of geographic objects over which analytical operations are performed to solve the problem;

  • \( R_{g} \) - the set of relationships between geographic objects that determine the type of operations performed;

  • \( A^{op} \) - the set of analytical operations on geographic objects performed in the process of solving the problem;

  • S - the set of states of the task, which are visualized on the map in the process of its solution;

  • O - the ontological description of the geographic objects, processes and tasks of the given subject area.

3 Structuring the Text Using the Method of Recursive Reduction

The process of structuring the text (5) can be divided into two main stages: a lexical analysis \( T^{T} \to T_{sn} \) that forms the primary structure of the text, and the formation of an ontology \( T_{sn} \to O \), which allows you to select the necessary information from the primary structure and present it as an ontology. For the implementation of the second stage, a method of recursive reduction is proposed, which involves the sequential transformation of the primary structure with the help of a certain set of dynamically-specified user rules.

3.1 Graphic Representation of the Primary Structure of the Text

The primary structure of the text \( T_{sn} \) is the output of the lexical analyzer and the input of the recursive reduction process. It contains a structured representation of lexemes (words or symbols), as well as the syntactic relationships between them. In essence, this structure is a directed graph whose vertices are the lexemes.

Any NL text \( T^{T} \) is represented by a set of lexemes L, on which a precedence relation \( \prec \) is defined. This relation turns L into a linearly ordered set. The text \( T^{T} \) can also be represented as a sequence of sentences S, on which the same precedence relation is defined:

$$ T^{T} = \left\{ {S_{1} \prec S_{2} \prec \ldots \prec S_{{n_{s} }} } \right\} $$
(8)
  • where \( n_{s} \) - the total number of sentences in the text.

Each sentence \( S_{i} \), in turn, is represented by some subset of lexemes:

$$ L_{{S_{i} }} = \left\{ {l_{ij} ,\;j = \overline{{1,n_{i} }} } \right\} $$
(9)
  • where \( n_{i} \) - the number of lexemes in the i-th sentence.

Obviously, the condition is fulfilled:

$$ \forall l_{1} \in S_{1} ,\forall l_{2} \in S_{2} ,S_{1} \prec S_{2} \Rightarrow l_{1} \prec l_{2} $$
(10)
  • where \( S_{1} ,S_{2} \) - arbitrary sentences of the text;

  • \( l_{1} ,l_{2} \) - lexemes.

Each lexeme, in turn, has a number of features:

$$ l_{ij} = \left\langle {l_{ij}^{T} ,P_{ij} } \right\rangle $$
(11)
  • where \( l_{ij}^{T} \) - the text representation of the lexeme \( l_{ij} \);

  • \( P_{ij} \) - the features of the lexeme \( l_{ij} \).

A lexeme can be linked to other lexemes using syntactic relationships \( r_{sn} \in R_{sn} \):

$$ r_{sn} = \left\langle {l^{1} ,l^{2} ,k} \right\rangle $$
(12)
  • where \( l^{1} ,l^{2} \) - lexemes, between which there is a connection;

  • k - type of connection.

Thus, the directed graph representing the primary structure of the NL text has the form (13):

$$ T_{sn} = \left\langle {L,R_{sn} } \right\rangle $$
(13)
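
As an illustration of definitions (11) to (13), the primary structure can be held in a small data structure. The following is a minimal sketch, not the authors' implementation: the class names, the `position` field used to encode the precedence relation, and the feature and relation labels ("noun", "verb", "subject") are all assumptions introduced for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Lexeme:
    position: int                       # place in the linear order of the text
    text: str                           # text representation l^T
    features: frozenset = frozenset()   # features P (hypothetical labels)


@dataclass(frozen=True)
class SyntacticRelation:
    source: Lexeme   # l^1
    target: Lexeme   # l^2
    kind: str        # type of connection k


@dataclass
class PrimaryStructure:
    """Directed graph T_sn = <L, R_sn>: lexemes are vertices, relations are arcs."""
    lexemes: list = field(default_factory=list)
    relations: set = field(default_factory=set)

    def precedes(self, a, b):
        # Precedence relation on lexemes, derived from text positions.
        return a.position < b.position

    def linked(self, a, b, kind):
        # True if the arc <a, b, kind> belongs to R_sn.
        return SyntacticRelation(a, b, kind) in self.relations


river = Lexeme(0, "river", frozenset({"noun"}))
flows = Lexeme(1, "flows", frozenset({"verb"}))
structure = PrimaryStructure(
    lexemes=[river, flows],
    relations={SyntacticRelation(flows, river, "subject")},
)

if __name__ == "__main__":
    print(structure.precedes(river, flows))
    print(structure.linked(flows, river, "subject"))
```

Frozen dataclasses are used so that lexemes and relations are hashable and can be stored in sets, which matches the set-based notation of (13).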

The main problem of the structure (13), and in particular of (11), is the inefficiency of working with the text representations \( l_{ij}^{T} \) of lexemes, which are redundant and require specialized functions defined on the set of text representations of words. Such functions are cumbersome and inefficient, and in a software implementation they often depend on how the given programming language handles text variables.

Since the set of text representations of lexemes is obviously countable, it is possible to construct a transformation of the form (14) [62].

$$ V:L^{T} \to {\mathbb{N}} $$
(14)
  • where \( L^{T} \) - the set of text representations of lexemes.

Let the text be written in a certain alphabet W containing \( n_{W} = card\left( W \right) \) characters. This alphabet can be regarded as a positional numeral system with base \( n_{W} \). Accordingly, each letter \( w \in W \) can be put in correspondence with a number \( i_{w} \in {\mathbb{N}} \), the index of the letter in the alphabet. Any word of the input text is a sequence (15).

$$ l^{T} = \left\{ {w_{1} ,w_{2} \ldots w_{{n_{l} }} } \right\} $$
(15)
  • where \( n_{l} \) - the length of the word, \( n_{l} > 0 \);

  • \( w_{i} \) - the letters of the alphabet \( W \).

If the letters \( w_{i} \) are treated as the digits of a number in the corresponding numeral system, this number can be converted to the decimal system using formula (16).

$$ V\left( {l^{T} } \right) = i_{{w_{1} }} \times (n_{W} )^{{n_{l} - 1}} + i_{{w_{2} }} \times (n_{W} )^{{n_{l} - 2}} + \ldots + i_{{w_{{n_{l} }} }} \times (n_{W} )^{0} $$
(16)
  • where \( i_{{w_{j} }} \) - the index of the letter \( w_{j} \) in the alphabet \( W \);

  • \( n_{W} \) - the number of characters in the alphabet \( W \).
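
The encoding V can be sketched as follows. The sketch rests on several assumptions that are not fixed by the text: the alphabet is taken to be the 26 lowercase Latin letters, letter indices start at 1 (so that no word is lost to a leading zero digit), and the positional sum is evaluated by Horner's scheme so that the last letter has weight \( (n_{W})^{0} \). The function names `encode` and `decode` are hypothetical.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz"   # assumed alphabet W
BASE = len(ALPHABET)                      # n_W
INDEX = {letter: i for i, letter in enumerate(ALPHABET, start=1)}  # i_w, 1-based


def encode(word):
    """Code representation V(l^T): the word read as a number in base n_W."""
    if not word:
        raise ValueError("a word must contain at least one letter")
    code = 0
    for letter in word:
        # Horner's scheme: shift the previous digits and append the new one.
        code = code * BASE + INDEX[letter]
    return code


def decode(code):
    """Inverse of encode for digits in the range 1..n_W."""
    letters = []
    while code > 0:
        digit = code % BASE or BASE   # a remainder of 0 stands for digit n_W
        letters.append(ALPHABET[digit - 1])
        code = (code - digit) // BASE
    return "".join(reversed(letters))


if __name__ == "__main__":
    for word in ("a", "ab", "river"):
        print(word, encode(word), decode(encode(word)))
```

With 1-based indices the mapping is one-to-one on non-empty words, so integer comparison of codes can replace string comparison of text representations.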

With the help of the function \( V \), every \( l^{T} \) can be replaced by the corresponding \( l^{V} = V\left( {l^{T} } \right) \). This yields a more efficient representation (17) of the set of lexemes.

$$ \left\langle {l^{V} ,P} \right\rangle \in L^{V} $$
(17)
  • where \( l^{V} \) - the code representation of lexeme l;

  • \( P \) - grammatical characteristics of lexeme;

  • \( L^{V} \) - a set of code representations of lexemes.

In what follows, \( L^{V} \) is treated as the set of lexemes L.

3.2 The Method of Recursive Text Reduction

According to (5), the structuring process of the text performed by the recursive reduction method has the structure (18) and transforms the primary structure of the text (13) into the ontology (1).

$$ T_{sn} \to O^{1} \to O^{2} \to O $$
(18)

The method of recursive reduction consists in recursively reducing the input NL text by applying a specialized operator (19) to it.

$$ F_{rd} :T_{sn} \to O $$
(19)

The reduction operator (19) is described in terms of the λ-theory.

λ-theory (λ-calculus) is a formal theory developed to formalize and analyze the notion of computability [63]. Its basic concepts are application (applying a function to an argument) and abstraction. Abstraction means that if \( t\left( x \right) \) is a formula that may contain a free variable x, then \( \lambda x.t\left( x \right) \) denotes the function f that maps a value a to \( t\left( a \right) \); in other words, equality (20) holds.

$$ \left( {\lambda x.t\left( x \right)} \right)a = t\left( a \right) $$
(20)
  • where x - a certain variable;

  • t - a formula that may contain occurrences of the variable x;

  • a - an argument that defines the value of x.

The central element of λ-theory is the notion of a term [63, 64]. The set Λ of λ-terms is defined inductively, as shown in formula (21):

$$ \begin{array}{*{20}c} {x \in \varLambda } \\ {M \in \varLambda \Rightarrow \left( {\lambda xM} \right) \in \varLambda } \\ {M,N \in \varLambda \Rightarrow \left( {MN} \right) \in \varLambda } \\ \end{array} $$
(21)
  • where x - arbitrary variable;

  • M, N - arbitrary terms.

In addition, Schönfinkel's observation (22) holds in the λ-theory; it allows functions of several arguments to be reduced to functions of one argument [63].

$$ \lambda x_{1} \ldots x_{n} .M = \lambda x_{1} \left( {\lambda x_{2} \left( { \cdots \left( {\lambda x_{n} .\left( M \right)} \right) \cdots } \right)} \right) $$
(22)
  • where \( x_{i} \) - arbitrary variables;

  • M - arbitrary term.

Important is the notion of β-reduction, which can be defined as the relation (23) between two terms:

$$ \beta = \{ \left( {\left( {\lambda x.M} \right)N,M\left[ {x: = N} \right]} \right)|M,N \in \varLambda \} $$
(23)
  • where x - arbitrary variable;

  • M, N - arbitrary terms;

  • M[x := N] - the term obtained by substituting N for the variable x in M.

The reduction operator (19) is a composition of four operators (24). Three of them perform the conversion steps given by formula (18), and one performs an auxiliary function.

$$ F_{rd} = F_{l*} \circ F_{x} \circ F_{smr} \circ F_{ct} $$
(24)

where \( F_{l*} \) - the aggregation operator (25), which performs the auxiliary function of transforming the set of lexemes L into a set of constructs \( L^{*} \). Constructs are a special form of lexemes that combine a sequence of words or characters, in particular a phrase. Their peculiarity is that, for further processing, they can be treated as lexemes (that is, as a single word or symbol). Thus the set \( L \cup L^{*} \) can be used wherever the set L is used, in particular as the input of the operator (25):

$$ F_{l*} :L \cup L^{*} \to L^{*} $$
(25)
  • \( F_{x} \) - the operator (26) that identifies the concepts X.

$$ F_{x} :L \cup L^{*} \to L \cup L^{*} \cup X $$
(26)
  • \( F_{smr} \) - the operator (27) that identifies the ontological connections R [65], which are divided into relationships between concepts \( R_{sem} \) and auxiliary links between a concept and its contexts \( R_{sem}^{*} \).

$$ F_{smr} :\left\langle {L \cup L^{*} \cup X,R_{sn} } \right\rangle \to \left\langle {L \cup L^{*} \cup X,R_{sn} \cup R_{sem}^{*} \cup R_{sem} } \right\rangle $$
(27)
  • \( F_{ct} \) - the operator (28) that identifies contexts, which subsequently act as the attributes \( A_{X} \) of concepts.

$$ F_{ct} :\left\langle {X,R_{sem}^{*} } \right\rangle \to A_{X} $$
(28)

The total transformation \( F_{rd} \) has the form (29), where \( R = R_{sem}^{*} \cup R_{sem} \)

$$ F_{rd} :\left\langle {L \cup L^{ *} \cup X,A_{X} ,R_{sn} \cup R} \right\rangle \to \left\langle {L \cup L^{ *} \cup X,A_{X} ,R_{sn} \cup R} \right\rangle $$
(29)

At the first step \( L^{*} = \emptyset ;X = \emptyset ;A_{X} = \emptyset ;R_{sem}^{*} = \emptyset ;R_{sem} = \emptyset \), so transformation (29) degenerates into transformation (30). With each subsequent step, the data sets are replenished with new elements.

$$ F_{rd} :\left\langle {L,R_{sn} } \right\rangle \to \left\langle {L \cup L^{*} \cup X,A_{X} ,R_{sn} \cup R_{sem}^{*} \cup R_{sem} } \right\rangle $$
(30)

To analyze the text fully, the operator (24) must be applied recursively, for which the fixed-point operator of the λ-theory is used [63]. To do this, an auxiliary function F′ is constructed:

$$ F^{\prime} = \lambda fx.\left\{ {\begin{array}{*{20}l} {f\left( {F_{rd} x} \right),} \hfill & {F_{rd} x \ne x} \hfill \\ {x,} \hfill & {F_{rd} x = x} \hfill \\ \end{array} } \right. $$
(31)

The fixed-point operator Y (33) is applied to it. The term YF′ is a fixed point of the transformation given by the function F′, that is, it satisfies relation (34). Using relation (34), the function (31) can be executed recursively until the completion condition (35) is reached, that is, until the current data set (32) becomes a fixed point of the transformation \( F_{rd} \).

$$ \left\langle {L \cup L^{ *} \cup X,A_{X} ,R_{sn} \cup R_{sem}^{ *} \cup R_{sem} } \right\rangle $$
(32)
$$ Y = \lambda f.\left( {\lambda x.f\left( {xx} \right)} \right)\left( {\lambda x.f\left( {xx} \right)} \right) $$
(33)
$$ F^{\prime}\left( {YF^{\prime}} \right) = YF^{\prime} $$
(34)
$$ F_{rd} x = x $$
(35)

Condition (35) can be interpreted as “applying \( F_{rd} \) no longer identifies new information”.
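
The recursion defined by (31) to (35) can be illustrated with a toy reduction step. Everything specific in this sketch is an assumption: the `step` function stands in for \( F_{rd} \) and merely merges adjacent tokens listed in a hypothetical phrase table `PHRASES`. Because Python evaluates arguments eagerly, the combinator Y in the form (33) would not terminate, so the sketch uses its eta-expanded variant (often called the Z combinator) alongside an ordinary recursive function.

```python
# Hypothetical phrase table: adjacent tokens that should become one construct.
PHRASES = {("black", "sea"), ("black sea", "coast")}


def step(tokens):
    """Toy stand-in for F_rd: merge the first adjacent pair found in PHRASES."""
    for i in range(len(tokens) - 1):
        if (tokens[i], tokens[i + 1]) in PHRASES:
            merged = tokens[i] + " " + tokens[i + 1]
            return tokens[:i] + (merged,) + tokens[i + 2:]
    return tokens  # nothing to merge: tokens is a fixed point of step


def reduce_to_fixed_point(reduce_step, data):
    """F' unrolled as plain recursion: stop when reduce_step(data) == data."""
    reduced = reduce_step(data)
    if reduced == data:
        return data
    return reduce_to_fixed_point(reduce_step, reduced)


# Eta-expanded fixed-point combinator, usable under eager evaluation.
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# F' written as a function of the recursive call f and the data x.
f_prime = lambda f: lambda x: x if step(x) == x else f(step(x))

reduce_with_z = Z(f_prime)

if __name__ == "__main__":
    text = ("the", "black", "sea", "coast")
    print(reduce_to_fixed_point(step, text))
    print(reduce_with_z(text))
```

Both variants stop exactly when a further application of the step changes nothing, which is the completion condition (35).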

3.3 The Multiplicity of Transformation of Recursive Reduction

The constructs \( L^{*} \) produced by transformation (25) are formed using a specialized operation. As a rule, a concept described in the input text is represented there by a certain phrase, which in turn is represented by a certain subset of lexemes \( \tilde{L} \subset L \). On this set, a specialized transformation (36) is defined.

$$ \tilde{L} \to \left\langle {V^{ + } \left( {\tilde{l}_{1}^{V} ,\tilde{l}_{2}^{V} \ldots \tilde{l}_{n}^{V} } \right),P_{{M\left( {\tilde{L}^{ *} } \right)}} ,L^{im} } \right\rangle $$
(36)
  • where n - the number of lexemes in \( \tilde{L} \);

  • \( l^{V} \) - code representation of lexeme l;

  • \( V^{ + } \) - the operation of combining the code representations of lexemes (37);

  • \( M\left( L \right) \) - the operation of finding the main lexeme of the set L;

  • \( L^{im} \) - the generating set \( \left( {L^{im} = \tilde{L}} \right) \).

The operation \( V^{ + } \) is given recursively (37).

$$ \begin{array}{*{20}c} {V^{ + } \left( {l^{V} } \right) = l^{V} } \\ {V^{ + } \left( {l_{1}^{V} ,l_{2}^{V} \ldots l_{n}^{V} } \right) = l_{1}^{V} \times (n_{W} )^{{\left[ {\log_{{n_{W} }} V^{ + } \left( {l_{2}^{V} ,l_{3}^{V} \ldots l_{n}^{V} } \right)} \right] + 1}} + V^{ + } \left( {l_{2}^{V} ,l_{3}^{V} \ldots l_{n}^{V} } \right)} \\ \end{array} $$
(37)
  • where [] - the operation of taking the integer part of a number;

  • \( n_{W} \) - the number of characters in the alphabet W, which is used for the text representation of the lexemes.

The operation \( V^{ + } \) is equivalent to applying the transformation V to the concatenated text representations of the lexemes (38).

$$ V^{ + } \left( {l_{1}^{V} ,l_{2}^{V} \ldots l_{n}^{V} } \right) = V\left( {V^{ - 1} \left( {l_{1}^{V} } \right) + V^{ - 1} \left( {l_{2}^{V} } \right) + \cdots + V^{ - 1} \left( {l_{n}^{V} } \right)} \right) $$
(38)
  • where V - the operation of forming a code representation of lexeme;

  • \( V^{ - 1} \) - the inverse of the operation V, \( V \circ V^{ - 1} = V^{ - 1} \circ V = \lambda x.x \);

  • + - combining operation (concatenation) of text representations.
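
Property (38) can be checked numerically with a short sketch. The assumptions are the same as for any concrete encoding: a 26-letter lowercase alphabet, letter indices starting at 1, and hypothetical function names. One deliberate deviation is that the number of letters in the tail is counted by peeling off digits instead of taking \( [\log_{n_{W}}(\cdot)] + 1 \), because with 1-based indices the logarithm overcounts words such as "z" (code 26 in base 26).

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz"   # assumed alphabet W
BASE = len(ALPHABET)                      # n_W
INDEX = {letter: i for i, letter in enumerate(ALPHABET, start=1)}


def encode(word):
    """Code representation V of a non-empty word."""
    code = 0
    for letter in word:
        code = code * BASE + INDEX[letter]
    return code


def code_length(code):
    """Number of letters in the word whose code representation is `code`."""
    length = 0
    while code > 0:
        digit = code % BASE or BASE   # a remainder of 0 stands for digit n_W
        code = (code - digit) // BASE
        length += 1
    return length


def combine(codes):
    """V+ : code of the concatenation, computed from the codes alone."""
    if not codes:
        raise ValueError("at least one code representation is required")
    if len(codes) == 1:
        return codes[0]
    tail = combine(codes[1:])
    # Shift the first code left by as many digits as the tail occupies.
    return codes[0] * BASE ** code_length(tail) + tail


if __name__ == "__main__":
    parts = ["black", "sea", "coast"]
    combined = combine([encode(p) for p in parts])
    print(combined == encode("".join(parts)))
```

The concatenation here inserts no separator between words, which follows (38) literally; a real construct would probably need a separator character in W, and that choice is left open.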

Constructs can also be created on the basis of a combination of constructs and lexemes \( \left( {\tilde{L} \subset L \cup L^{*} } \right) \). The main difference is the formation of a generating set \( L^{im} \) with the help of formula (39).

$$ L^{im} = \left\{ {l|l \in L \cap \tilde{L}} \right\} \cup \bigcup\nolimits_{{l^{ *} \in \tilde{L}}} {L_{{l^{ *} }}^{im} } $$
(39)
  • where L - the set of lexemes of the input text;

  • \( \tilde{L} \) - set of lexemes and constructs from which the construct was formed;

  • \( l \) - lexemes belonging to \( \tilde{L} \);

  • \( l^{*} \) - Constructs belonging to \( \tilde{L} \);

  • \( L_{{l^{*} }}^{im} \) - the generating set of the construct \( l^{*} \).

The concept of a generating set can also be extended to the set of lexemes L by setting (40).

$$ l \in L \Rightarrow L_{l}^{im} = \emptyset $$
(40)
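
Formulas (39) and (40) describe a simple recursion over nested constructs, which can be sketched as follows. The classes `Lexeme` and `Construct` and the function `generating_set` are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lexeme:
    position: int   # place of the lexeme in the text
    text: str


@dataclass(frozen=True)
class Construct:
    parts: tuple    # lexemes and/or constructs from which it was formed


def generating_set(item):
    """Generating set L^im: all source lexemes underlying a construct."""
    if isinstance(item, Lexeme):
        return frozenset()            # (40): empty for a plain lexeme
    result = set()
    for part in item.parts:
        if isinstance(part, Lexeme):
            result.add(part)          # first term of (39)
        else:
            result |= generating_set(part)   # union over nested constructs
    return frozenset(result)


black = Lexeme(0, "black")
sea = Lexeme(1, "sea")
coast = Lexeme(2, "coast")
black_sea = Construct((black, sea))
black_sea_coast = Construct((black_sea, coast))

if __name__ == "__main__":
    print(sorted(l.text for l in generating_set(black_sea_coast)))
```

A construct built from another construct thus inherits all lexemes of its parts, while a plain lexeme contributes an empty generating set of its own.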

On the set of lexemes L, a strict linear order \( \prec \) is defined. This relation can be extended to the set \( L^{*} \):

$$ \forall l_{1} \in L_{{l_{1}^{*} }}^{im} ,\forall l_{2} \in L_{{l_{2}^{*} }}^{im} ,l_{1} \prec l_{2} \Rightarrow l_{1}^{*} \prec l_{2}^{*} $$
(41)
  • where \( l_{1} ,l_{2} \) - lexemes;

  • \( L_{{l_{1}^{*} }}^{im} ,L_{{l_{2}^{*} }}^{im} \) - the generating sets of the constructs \( l_{1}^{*} ,l_{2}^{*} \).

However, on the set \( L^{*} \) the relation \( \prec \) is, in general, not a linear order, because condition (42) may hold.

$$ \exists l_{1} \in L_{{l_{1}^{ *} }}^{im} ,\exists l_{2} ,l_{3} \in L_{{l_{2}^{ *} }}^{im} ,l_{2} \prec l_{1} \prec l_{3} $$
(42)
  • where \( l_{1} ,l_{2} ,l_{3} \) - lexemes;

  • \( L_{{l_{1}^{*} }}^{im} ,L_{{l_{2}^{*} }}^{im} \) - the generating sets of the constructs \( l_{1}^{*} ,l_{2}^{*} \).

The relation (41) can be trivially extended to \( L \cup L^{*} \). In addition, a relation R is defined on the set \( L \cup L^{*} \):

$$ l_{1} \in L \cup L^{ *} ,l_{2} \in L^{ *} ,l_{1} \in L_{{l_{2} }}^{im} \Rightarrow l_{1} Rl_{2} $$
(43)

The relation R, like the precedence relation \( \prec \), is a strict partial order on \( L \cup L^{*} \).

Now consider in more detail the composition of transformations (25) and (26) at the first reduction step, when it has the form (44).

$$ F_{l *} \circ F_{x} :L \to L \cup L^{ *} \cup X $$
(44)

This transformation maps each lexeme \( l \in L \) to the lexeme itself, as well as to a certain set of constructs or concepts formed on the basis of this lexeme. If the set of constructs \( L^{*} \) is taken as the starting set, this correspondence can be represented as a hyperratio \( G_{im} :L^{*} \to L \):

$$ l \in L, Y = \left\{ {l^{*} | l \in L_{{l^{*} }}^{im} } \right\} \Rightarrow YG_{im} l $$
(45)
  • where \( l \) - lexeme;

  • \( l^{*} \) - arbitrary construct;

  • \( L_{{l^{*} }}^{im} \) - the generating set of the construct \( l^{*} \).

This hyperratio can be extended to a set of concepts X by condition (46).

$$ l \in L,Y = \{ x|\exists l^{*} ,x\left( {l^{*} } \right) = x \wedge l \in L_{{l^{*} }}^{im} \} \Rightarrow YG_{im} l $$
(46)
  • where \( l \) - lexeme;

  • \( x \) - an arbitrary concept;

  • \( l^{*} \) - arbitrary construct;

  • \( x\left( {l^{*} } \right) \) - operation of forming a concept based on a construct \( l^{*} \);

  • \( L_{{l^{*} }}^{im} \) - the generating set of the construct \( l^{*} \).

Using the formulas (40), (45) and (46), this relation can be trivially extended to \( G_{im} :L \cup L^{*} \cup X \to L \cup L^{*} \cup X \).

The hyperratio \( G_{im} \) reflects the multivalued nature of the recursive reduction transformation and can be used as an additional mechanism for forming ontological links R between the concepts of the SSA.

3.4 The Structure of the Reduction Operator and Its Components

As mentioned above, operator (24) is a composition of operators, each of which performs one conversion step of (5). A full cycle of this transformation structures a portion of the information contained in the input text, after which the transformation is called recursively until all the information has been extracted. Each component of the reduction operator can, in turn, be divided into constituents.

In the general case, a transformation operator F is specified by a rule base \( G_{R} \) for carrying out the transformation. A rule \( g \in G_{R} \) has the same structure at all stages:

$$ g = \left\langle {f_{ap}^{g} ,f_{tr}^{g} } \right\rangle $$
(47)
  • where \( f_{ap}^{g} \) - the applicability function, which determines whether the rule can be applied to a certain set of input information;

  • \( f_{tr}^{g} \) - a transformation function that specifies the transformation of the input information.

The transformation \( F_{g} :X \to Y \) defined by a rule g has the form (48).

$$ F_{g} \left( x \right) = \left\{ {\begin{array}{*{20}l} {f_{tr}^{g} \left( x \right),} \hfill & {f_{ap}^{g} \left( x \right)} \hfill \\ {x,} \hfill & {\neg f_{ap}^{g} \left( x \right)} \hfill \\ \end{array} } \right. $$
(48)
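
The rule structure (47) and the guarded transformation (48) map directly onto a pair of callables. The sketch below is an illustration only: the class `Rule`, the helper `apply_rule_base` and the sample rule are hypothetical, and applying the rules of a base one after another is an assumption, since the order of application is not fixed here.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Rule:
    applicable: Callable[[Any], bool]   # applicability function f_ap
    transform: Callable[[Any], Any]     # transformation function f_tr

    def apply(self, x):
        """F_g from (48): transform x if the rule applies, else return x."""
        return self.transform(x) if self.applicable(x) else x


def apply_rule_base(rules, x):
    """Apply every rule of a rule base in sequence (assumed strategy)."""
    for rule in rules:
        x = rule.apply(x)
    return x


# Hypothetical sample rule: drop the article "the" from a token tuple.
drop_article = Rule(
    applicable=lambda tokens: "the" in tokens,
    transform=lambda tokens: tuple(t for t in tokens if t != "the"),
)

if __name__ == "__main__":
    print(drop_article.apply(("the", "river")))
    print(drop_article.apply(("river",)))
```

Keeping the applicability test separate from the transformation lets the same transformation be reused under different conditions, which is what the uniform rule structure suggests.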

Each function of applicability is a lambda-term of the form [40, 54, 55]:

$$ f_{ap} = \left( {\lambda x_{1} ,x_{2} \ldots x_{{n_{g} }} .t_{ap} \left( x \right)} \right)a_{1} ,a_{2} \ldots a_{{n_{g} }} = t_{ap} \left( {a_{1} ,a_{2} \ldots a_{{n_{g} }} } \right) $$
(49)
  • where the notation \( \lambda x \) indicates that this construction is a \( \lambda \)-term;

  • \( x_{i} \) - a variable that takes values in the set \( L \cup L^{*} \);

  • \( a_{i} \) - an argument of the function that sets the value of \( x_{i} \);

  • \( n_{g} \) - the number of arguments passed to the function;

  • \( t_{ap} \) - the condition of applicability, a formula that contains \( n_{g} \) variables.

In the general case, the condition of applicability \( t_{ap} \) means the existence of a homeomorphism between a directed graph formed by an input sequence of lexemes (as well as syntactic links between them) and a certain reference oriented graph \( G_{ap} \) representing a subgraph \( T_{sn}^{e} \) selected by the user of the initial representation of a certain text. As \( T_{sn}^{e} \) can be the initial representation of both the current text \( T_{sn} \) and any other text (for example, the thesaurus of the SSA). The condition has a structure (50) and consists of predicates of identification [51, 58, 68]. Such predicates make it possible to identify the context of a particular lexeme, and, on the basis of this, conclude that there is no need or need for conversion. Each of the predicates sets a certain condition, and the condition of applicability of the rule is to fulfill all the conditions given by each of the predicates. The number of predicates in the formula sets the number \( n_{g} \).

$$ t_{ap} = c_{{p_{1} }} \left( {x_{1} } \right)\& \ldots c_{{p_{n} }} \left( {x_{{n_{g} }} } \right)\& r_{{k_{11} }} \left( {x_{1} ,x_{1} } \right)\& \ldots r_{{k_{{n_{g} n_{g} }} }} \left( {x_{{n_{g} }} ,x_{{n_{g} }} } \right) $$
(50)

The unary predicates in the formula are lexeme identification predicates. Such a predicate sets a condition that a certain lexeme (or construct) of the input set must satisfy. The predicate has the structure (51).

$$ c_{p} \left( l \right) = \left\{ {\begin{array}{*{20}l} {1,p = 0 \vee p = l^{T} \vee p \in P_{l} } \hfill \\ {0,p \ne 0 \wedge p \ne l^{T} \wedge p \notin P_{l} } \hfill \\ \end{array} } \right. $$
(51)

The behavior of such a predicate depends on the template parameter p. Depending on the type of this parameter, the predicate can be:

  1. The standard identification predicate. Here p is a morphological characteristic of lexemes, and the predicate determines whether the input lexeme has the given characteristic \( \left( {p \in P_{l} } \right) \).

  2. The keyword identification predicate. Here p is the textual representation of the required lexeme, and this value is compared with the text of the input lexeme \( \left( {p = l^{T} } \right) \). This predicate always has the value 0 for constructs.

  3. The zero predicate \( \left( {p = 0} \right) \). This predicate always has the value 1, regardless of the input lexeme.

A binary predicate is a link identification predicate. It determines whether a link of a given type exists between two given lexemes. The predicate has the form (52).

$$ r_{k} \left( {l_{1} ,l_{2} } \right) = \left\{ {\begin{array}{*{20}l} {1,k = 0 \vee \left\langle {l_{1} ,l_{2} ,k} \right\rangle \in R_{sn} } \hfill \\ {0,k \ne 0 \wedge \left\langle {l_{1} ,l_{2} ,k} \right\rangle \notin R_{sn} } \hfill \\ \end{array} } \right. $$
(52)

Like the lexeme identification predicate, this predicate has a zero variant, which has the value 1 regardless of the input data. For condition (50) to be used correctly, the following condition, imposed by the structure of the lexical analyzer, must hold:

$$ k_{ij} = 0,i = j $$
(53)
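
As an illustration, the predicates (51) and (52) and the applicability condition (50) can be sketched in Python. This is a minimal sketch under stated assumptions, not the implementation used in the work: the `Lexeme` record, the use of `None` for the zero parameter, and the dictionary `k_params` (in which absent pairs are treated as \( k_{ij} = 0 \), so that (53) holds by default) are hypothetical choices made for this example.

```python
from typing import NamedTuple

class Lexeme(NamedTuple):
    text: str                    # l^T, the textual representation
    props: frozenset             # P_l, the morphological characteristics
    is_construct: bool = False   # True for elements of L*

def c(p, l):
    """Lexeme identification predicate (51); p=None stands for p = 0."""
    if p is None:
        return True              # zero predicate
    if p in l.props:
        return True              # standard predicate, p in P_l
    # Keyword predicate, p = l^T; always false for constructs.
    return (not l.is_construct) and p == l.text

def r(k, l1, l2, links):
    """Link identification predicate (52); links models R_sn as triples."""
    return k is None or (l1, l2, k) in links

def t_ap(p_params, k_params, xs, links):
    """Applicability condition (50): conjunction of all predicates."""
    n_g = len(p_params)
    if len(xs) != n_g:
        return False
    unary = all(c(p_params[i], xs[i]) for i in range(n_g))
    binary = all(r(k_params.get((i, j)), xs[i], xs[j], links)
                 for i in range(n_g) for j in range(n_g))
    return unary and binary

# Hypothetical usage: an adjective linked to a noun by an "attr" link.
big = Lexeme("big", frozenset({"adj"}))
city = Lexeme("city", frozenset({"noun"}))
links = {(big, city, "attr")}
print(t_ap(["adj", "noun"], {(0, 1): "attr"}, [big, city], links))
```

The sketch assumes that morphological characteristics and lexeme texts never coincide as strings; otherwise the two kinds of parameter p would have to be distinguished explicitly.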

The rule base G defines a transformation \( F_{G} \) of the form:

$$ F_{G} \left( L \right) = \bigcup\nolimits_{{\tilde{L} \in P\left( L \right)}} {\bigcup\nolimits_{g \in G} {F_{g}^{*} \left( {\tilde{L}} \right)} } $$
(54)
  • where \( P\left( L \right) \) - the set of all subsets of L;

  • \( F_{g}^{*} \) - modified function (48), supplemented by additional conditions.

Two additional conditions are imposed.

The order condition means that all elements of the input subset must be linearly ordered by a certain strict order relation G; it has the form:

$$ f_{ord} \left( x \right) = \left\{ {\begin{array}{*{20}l} {1,\forall x_{1} ,x_{2} \in x,x_{1} \prec x_{2} \vee x_{2} \prec x_{1} } \hfill \\ {0,\exists x_{1} ,x_{2} \in x,x_{1} \,{ \nprec }\,x_{2} \wedge x_{2} \,{ \nprec }\,x_{1} } \hfill \\ \end{array} } \right. $$
(55)

The following order relations can be used as G:

  1. The relation of follow-up \( \prec \).

  2. The transitive closure of the relation R (43).

  3. The transitive closure of the relation given by the links \( R_{sem} \).

The consistency condition \( f_{ap + }^{g} \) determines whether a given element \( \tilde{L} \in P\left( L \right) \) is suitable for processing by the rule g:

$$ f_{ap + }^{g} \left( x \right) = \left\{ {\begin{array}{*{20}l} {1,card\left( x \right) = n_{g} } \hfill \\ {0,card\left( x \right) \ne n_{g} } \hfill \\ \end{array} } \right. $$
(56)

Given these conditions, transformation (48) becomes \( F_{g}^{*} \):

$$ F_{g}^{ *} \left( x \right) = \left\{ {\begin{array}{*{20}c} {\begin{array}{*{20}l} {f_{tr}^{g} \left( x \right),} \hfill \\ {x,} \hfill \\ \end{array} } & {\begin{array}{*{20}c} {f_{ord} \left( x \right) \wedge f_{ap + }^{g} \left( x \right) \wedge f_{ap}^{g} \left( x \right)} \\ {\neg f_{ord} \left( x \right) \vee \neg f_{ap + }^{g} \left( x \right) \vee \neg f_{ap}^{g} \left( x \right)} \\ \end{array} } \\ \end{array} } \right. $$
(57)
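
The following sketch shows one way to read (54)-(57) operationally. It is an illustration under assumptions rather than the authors' implementation: a rule is modelled as a hypothetical dictionary with keys `n`, `applies` and `transform`, lexemes are `(position, text)` pairs, and the order relation is passed in as a function `precedes`. The order condition is checked over pairs of distinct elements only.

```python
from itertools import combinations

def f_ord(xs, precedes):
    """Order condition (55): every pair of distinct elements is comparable."""
    return all(precedes(a, b) or precedes(b, a) for a, b in combinations(xs, 2))

def f_ap_plus(xs, n_g):
    """Consistency condition (56): the subset has exactly n_g elements."""
    return len(xs) == n_g

def apply_rule(xs, rule, precedes):
    """F_g^* from (57): transform xs if all conditions hold, else return xs."""
    if f_ord(xs, precedes) and f_ap_plus(xs, rule["n"]) and rule["applies"](xs):
        return rule["transform"](xs)
    return xs

def F_G(L, rules, precedes):
    """Transformation (54): union of F_g^* over all subsets of L and all rules.

    The power set is enumerated directly, so the cost is exponential in |L|;
    this mirrors the formula and is only practical for short sentences.
    """
    result = set()
    ordered = sorted(L)
    for size in range(len(ordered) + 1):
        for subset in combinations(ordered, size):
            for rule in rules:
                result |= set(apply_rule(subset, rule, precedes))
    return result

# Hypothetical usage: a one-argument rule that rewrites the lexeme "city".
precedes = lambda a, b: a[0] < b[0]
rule = {
    "n": 1,
    "applies": lambda xs: xs[0][1] == "city",
    "transform": lambda xs: ((xs[0][0], "CITY"),),
}
L = {(0, "big"), (1, "city")}
print(sorted(F_G(L, [rule], precedes)))
```

Because (57) returns its argument unchanged when a rule is inapplicable, the result of this sketch always contains the original elements of L alongside the transformed ones.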

Consider the peculiarities of performing the individual reduction steps within the structure (57).

The aggregation transformation forms structures (constructs) from linked lexemes, which later become candidates for concepts or for contexts of concepts. A peculiarity of this transformation is that it applies to the sentence as a whole, that is, to the set of lexemes L associated with a particular sentence S. The transformation forms a set of constructs \( L^{*} \), which is subsequently also associated with the sentence S and is used together with the initial set of lexemes. All subsequent transformations take as input the combined set, which retains the relation of follow-up \( \prec \), although in this case the relation becomes a partial order.

The rules of this transformation differ from the standard structure by the method of application and the method of forming the function of transformation. At this stage, a modified function \( F_{g}^{*} \) with the form (58) is used.

$$ F_{g}^{ *} \left( x \right) = \left\{ {\begin{array}{*{20}c} {\begin{array}{*{20}l} {f_{tr}^{g} \left( x \right),} \hfill \\ {\emptyset ,} \hfill \\ \end{array} } & {\begin{array}{*{20}c} {f_{ord} \left( x \right) \wedge f_{ap + }^{g} \left( x \right) \wedge f_{ap}^{g} \left( x \right)} \\ {\neg f_{ord} \left( x \right) \vee \neg f_{ap + }^{g} \left( x \right) \vee \neg f_{ap}^{g} \left( x \right)} \\ \end{array} } \\ \end{array} } \right. $$
(58)

Function (58), unlike function (57), returns an empty set when the rule is inapplicable. Because of this, the elements of the input set L do not fall into the resulting set. The function \( f_{tr}^{g} \) is an aggregation function of the form (59).

$$ f_{tr}^{g} :\left\{ {l_{1} \ldots l_{{n_{g} }} } \right\} \to \left\{ {l^{ *} = \left\langle {V^{ + } \left( {\tilde{l}_{1}^{V} ,\tilde{l}_{2}^{V} \ldots \tilde{l}_{{n_{g} }}^{V} } \right),P_{{l_{{n_{r} }} }} ,L_{{l^{ *} }}^{ *} } \right\rangle } \right\} $$
(59)
  • where \( l^{*} \) - the resulting construct;

  • \( L_{{l^{*} }}^{*} \) - a set of lexemes or constructs from which the construct is formed;

  • \( V^{ + } \) - the operation of combining the numerical representations \( \tilde{l}_{i}^{V} \) of the lexemes of the input set;

  • \( n_{r} \) - an index \( \left( {1 \le n_{r} \le n_{g} } \right) \) that defines which word is the head word of the selected phrase (for example, in a combination of adjectives and a noun, the noun is the head). The index \( n_{r} \) determines which lexeme passes its morphological characteristics to the construct \( l^{*} \) as \( P_{{l^{*} }} \). If the construct is not formed from words (for example, a date), this parameter has no meaning and can be taken to be zero.

On the basis of this index, the operation M of finding the head word of a phrase is constructed:

$$ M\left( L \right) = l_{{n_{r} }} $$
(60)

The aggregation transformation itself is performed by recursive application of function (58) according to scheme (61).

$$ F_{ag} \left( {\tilde{L}} \right) = \left\{ {\begin{array}{*{20}c} {\begin{array}{*{20}l} {\tilde{L},} \hfill \\ {F_{ag} \left( {F_{G} \left( {L \cup \tilde{L}} \right)} \right),} \hfill \\ \end{array} } & {\begin{array}{*{20}c} {F_{G} \left( {L \cup \tilde{L}} \right) = \tilde{L}} \\ {F_{G} \left( {L \cup \tilde{L}} \right) \ne \tilde{L}} \\ \end{array} } \\ \end{array} } \right. $$
(61)
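
Scheme (61) is a fixed-point iteration, which can be sketched as follows. The sketch is illustrative and rests on several assumptions that are not part of the original text: lexemes and constructs are modelled as `(start, end, text, tag)` spans, the order condition is simplified to contiguity of spans, a rule is a hypothetical triple `(n_g, applies, n_r)` with a zero-based head index, and a `max_rounds` guard is added because termination is not guaranteed for arbitrary rule sets.

```python
from itertools import combinations

def F_G(items, rules):
    """(54) with F_g^* from (58): inapplicable subsets contribute nothing."""
    out = set()
    ordered = sorted(items)
    for n_g, applies, n_r in rules:
        for subset in combinations(ordered, n_g):
            # Order condition, simplified here to "the spans are contiguous".
            contiguous = all(a[1] == b[0] for a, b in zip(subset, subset[1:]))
            if contiguous and applies(subset):
                text = " ".join(part[2] for part in subset)
                head_tag = subset[n_r][3]   # the head word passes on its tag
                out.add((subset[0][0], subset[-1][1], text, head_tag))
    return frozenset(out)

def F_ag(L, rules, max_rounds=100):
    """Scheme (61): repeat F_G on L united with the constructs until stable."""
    constructs = frozenset()
    for _ in range(max_rounds):
        new = F_G(frozenset(L) | constructs, rules)
        if new == constructs:
            return constructs
        constructs = new
    raise RuntimeError("aggregation did not reach a fixed point")

# Hypothetical rule: adjective followed by a noun (the noun is the head).
adj_noun = (2, lambda xs: xs[0][3] == "adj" and xs[1][3] == "noun", 1)
L = {(0, 1, "old", "adj"), (1, 2, "big", "adj"), (2, 3, "city", "noun")}
print(sorted(F_ag(L, [adj_noun])))
```

In this example the first round produces the construct "big city"; since it inherits the tag of its head noun, the second round can attach "old" to it, and the third round changes nothing, so the iteration stops.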

The concept identification transformation forms a set of concepts X based on the set \( L \cup L^{*} \). This transformation is performed after the aggregation transformation and uses the constructs \( L^{*} \) generated by it.

In general, this transformation has a standard structure. The applicability condition often has a trivial structure:

$$ t_{ap} = c_{{p_{1} }} \left( {x_{1} } \right) $$
(62)

where \( c_{{p_{1} }} \) identifies nouns. Thus the phrases whose head word is a noun, and for which constructs were formed at the aggregation stage, immediately fall under the rule, and concepts are formed from them. The transformation function \( f_{tr}^{g} \) is trivial:

$$ f_{tr}^{g} \left( {L^{ *} } \right) = x\left( {l_{1}^{ *} } \right) \in X $$
(63)

where \( x\left( l \right) \) - the function of forming a concept x from a lexeme l.

Here the lexemes L form the concepts that are described by a single word, and the constructs \( L^{*} \) form the concepts that are described by several words.

The link identification transformation forms the set \( R_{sem} \) of semantic relationships between concepts and the set \( R_{sem}^{*} \) of semantic relationships between concepts and lexemes or constructs, on the basis of the sets \( L \cup L^{*} \) and X.

This transformation has a standard structure; its applicability condition usually has the form:

$$ \begin{array}{*{20}l} {t_{ap} = c_{{p_{1}^{ *} }} \left( {x_{1} } \right)\& c_{{p_{1} }} \left( {x_{2} } \right)\& \ldots \& c_{{p_{{n_{g} - 2}} }} \left( {x_{{n_{g} - 1}} } \right)} \hfill \\ {\& c_{{p_{2}^{*} }} \left( {x_{{n_{g} }} } \right)\& r_{{k_{12} }} \left( {x_{1} ,x_{2} } \right)\& \ldots r_{{k_{{n_{g} n_{g} }} }} \left( {x_{{n_{g} - 1}} ,x_{{n_{g} }} } \right)} \hfill \\ \end{array} $$
(64)
  • where \( p_{1}^{*} \), \( p_{2}^{*} \) - conditions that determine the type of the lexemes or constructs between which (or between whose corresponding concepts) a link will be established;

  • \( p_{1} \ldots p_{{n_{g} - 2}} \) - conditions defining keywords that are specific to the type of link identified by the rule.

The transformation function has the form (65).

$$ f_{tr}^{g} \left( {L^{ *} } \right) = \left\{ {\begin{array}{*{20}l} {\left\langle {x\left( {l_{1}^{ *} } \right),x\left( {l_{{n_{g} }}^{ *} } \right),k_{sem}^{g} } \right\rangle \in R_{sem} ,} \hfill & {x\left( {l_{1}^{ *} } \right) \in X {\bigwedge } x\left( {l_{{n_{g} }}^{ *} } \right) \in X} \hfill \\ {\left\langle {x\left( {l_{1}^{ *} } \right),l_{{n_{g} }}^{ *} ,k_{sem}^{g} } \right\rangle \in R_{sem}^{ *} ,} \hfill & {x\left( {l_{1}^{ *} } \right) \in X {\bigwedge } x\left( {l_{{n_{g} }}^{ *} } \right) \notin X} \hfill \\ {\left\langle {x\left( {l_{{n_{g} }}^{ *} } \right),l_{1}^{ *} ,k_{sem}^{g} } \right\rangle \in R_{sem}^{ *} ,} \hfill & {x\left( {l_{1}^{ *} } \right) \notin X {\bigwedge } x\left( {l_{{n_{g} }}^{ *} } \right) \in X} \hfill \\ {0,} \hfill & {x\left( {l_{1}^{ *} } \right) \notin X {\bigwedge } x\left( {l_{{n_{g} }}^{ *} } \right) \notin X} \hfill \\ \end{array} } \right. $$
(65)
  • where \( l_{1}^{*} \ldots l_{{n_{g} }}^{*} \) - the constructs of the input set;

  • \( x\left( l \right) \) - the function of forming a concept x from a lexeme l;

  • \( k_{sem}^{g} \) - the type of semantic relation identified by the current rule.

This transformation determines which kind of link is obtained as its result. If, at the concept identification stage, concepts were created on the basis of both \( l_{1}^{*} \) and \( l_{{n_{g} }}^{*} \), then the formed link is assigned to the set \( R_{sem} \). If a concept was not created for exactly one of the constructs, the link falls into the set \( R_{sem}^{*} \) and is used to identify the contexts of the concept associated with it. If a concept was not created on the basis of either construct, the link is not created. Most likely, such a situation means that there is no aggregation rule that would combine these constructs into one.
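
The four cases of (65) can be written as a short function. This is a sketch only: the dictionary `concept_of`, which maps a construct to its concept when one was created, is a hypothetical stand-in for the function \( x\left( l \right) \) together with the membership test \( x\left( l \right) \in X \), and the returned labels are illustrative.

```python
def identify_link(l_first, l_last, k_sem, concept_of):
    """Transformation function (65) for the first and last constructs of a rule.

    Returns a pair (target set, triple), or None when no link is created.
    """
    x_first = concept_of.get(l_first)
    x_last = concept_of.get(l_last)
    if x_first is not None and x_last is not None:
        return ("R_sem", (x_first, x_last, k_sem))    # concept to concept
    if x_first is not None:
        return ("R_sem*", (x_first, l_last, k_sem))   # concept to construct
    if x_last is not None:
        return ("R_sem*", (x_last, l_first, k_sem))   # concept to construct
    return None                                        # no concept on either side

# Hypothetical usage with two known concepts.
concepts = {"city": "City", "river": "River"}
print(identify_link("city", "river", "located_on", concepts))
print(identify_link("city", "big", "has", concepts))
```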

Transformation of context identification is intended for the analysis of the contexts of concepts given by relations \( R_{sem}^{*} \, \cup \,R_{sem} \), and the formation of attributes based on them, which then define the functions of interpretation of concepts.

Transformation of context identification uses a modified link identification predicate (52), which is intended to identify semantic relationships between concepts. The predicate has the following form:

$$ r_{k}^{ *} \left( {x_{1} ,x_{2} } \right) = \left\{ {\begin{array}{*{20}l} {1,\left\langle {x_{1} ,x_{2} ,k} \right\rangle \in R_{sem}^{ *} \cup R_{sem} } \hfill \\ {0,\left\langle {x_{1} ,x_{2} ,k} \right\rangle \notin R_{sem}^{ *} \cup R_{sem} } \hfill \\ \end{array} } \right. $$
(66)
  • where \( x_{1} ,x_{2} \) - certain concepts \( x_{1} ,x_{2} \in X \).

The applicability function does not use the lexeme identification predicate (51) and has the following form:

$$ t_{ap} = r_{{k_{12} }}^{ *} \left( {x_{1} ,x_{2} } \right) $$
(67)

The transformation function for this case is also trivial and has the form:

$$ f_{tr}^{g} \left( {L^{*} } \right) = a_{{l_{2}^{*} }} \left( {x\left( {l_{1}^{*} } \right)} \right) \in A_{X} $$
(68)
  • where \( a_{l} \left( x \right) \) - the function of forming an attribute with a lexeme or construct l for the concept x.

The pre-structuring transformation \( F_{ps} \) is not part of the reduction operator, but it is carried out in the same way as the parts of the operator. This transformation can complement the parsing process.

Pre-structuring of text is often necessary when working with weakly structured documents. This is especially important when processing tables that contain headers while the values of the table cells contain natural-language text, which must be analyzed in the same way as NL text to obtain the necessary information. Also, as a rule, concepts are contained in the headings of text sections, and the concept in a heading may be the name of a particular category that includes the concepts given in that section.

All that pre-structuring needs to do is identify the named areas in the text (such as sections or table columns). This requires two steps:

  1. Determine the lexemes of the text that form the name of the named area.

  2. Determine the lexemes of the text that form the named area itself.

The name of an area is, as a rule, a certain block of text (often a sentence) that is marked up in a special way. The marking can be done in two ways:

  1. Explicit - using a specific separator character, such as a comma in CSV (Comma Separated Values) files.

  2. Implicit - using the display settings of the text, for example, a different font.

When reading documents that support implicit marking, additional processing is needed to obtain the display metadata of the lexemes. In this case, the set of features \( P_{l} \) of a lexeme l consists of two parts: the morphological features \( P_{l}^{mph} \) and the display metadata \( P_{l}^{dis} \).

$$ P_{l} = P_{l}^{mph} \cup P_{l}^{dis} $$
(69)

The pre-structuring transformation function has the standard form (48)–(52). However, it uses a specialized structure of the applicability condition (50), which has the form (70).

$$ \begin{array}{*{20}l} {t_{ap} = c_{d} \left( {x_{1} } \right)\& c_{t} \left( {x_{2} } \right)\& \ldots \& c_{t} \left( {x_{{n_{g} - 1}} } \right)} \hfill \\ {\& c_{d} \left( {x_{{n_{g} }} } \right)\& r_{0} \left( {x_{1} ,x_{2} } \right)\& \ldots r_{0} \left( {x_{{n_{g} - 1}} ,x_{{n_{g} }} } \right)} \hfill \\ \end{array} $$
(70)

The condition uses only two template parameters: d - the delimiter condition, and t - the comparison condition:

  1. For the explicit method, \( c_{d} \) identifies a specific separator character, while \( c_{t} \) does not impose any conditions \( \left( {c_{t} = c_{0} } \right) \).

  2. For the implicit method, \( c_{d} \) identifies any lexemes that, according to the styles of the current document, do not belong to headings, whereas \( c_{t} \), on the contrary, identifies any lexemes that do belong to them.

Syntactic relationships between lexemes are not used at the pre-structuring stage, so no conditions are imposed on them \( \left( {\forall i,\forall j,k_{ij} = 0} \right) \).

To find the lexemes that belong to the named block itself, the rule for the explicit method is similar, whereas for the implicit method the two conditions swap places.

The pre-structuring transformation function performs a transformation of the form (71) or (72), where \( p_{n} \) and \( p_{n}^{*} \) are the properties indicating that a lexeme belongs to the named area and to the name of the named area, respectively.

$$ f_{tr} :\left\langle {l^{T} ,P} \right\rangle \to \left\langle {l^{T} ,P \cup \left\{ {p_{n} } \right\}} \right\rangle $$
(71)
$$ f_{tr} :\left\langle {l^{T} ,P} \right\rangle \to \left\langle {l^{T} ,P \cup \left\{ {p_{n}^{*} } \right\}} \right\rangle $$
(72)

The properties \( p_{n} \) and \( p_{n}^{*} \) can, if necessary, be used in subsequent steps by setting the appropriate parameters in the rule templates. In particular, text headers can be included in the source ontology as objects.

3.5 The Structure of the Reduction Operator and Its Components

Since the behavior of each predicate included in the applicability condition (50) depends solely on the associated template parameter p or k, any applicability function is identified by a set of such parameters. The transformation function \( f_{tr} \), on the other hand, depends solely on the step being executed. As a result, the transformation rule g can be represented in a much more compact form, as a rule template \( \tilde{g} \) of the form (73).

$$ \tilde{g} = \left\langle {\left\{ {p_{i} } \right\},\left\{ {k_{ij} } \right\}} \right\rangle $$
(73)

The template interpretation transformation can be constructed trivially:

$$ F_{int} :\tilde{G} \to G $$
(74)

Form (73) is much more convenient for the user, as it is easier to represent in textual form. However, the process of forming transformation rules can be simplified even further.

Formula (73) imposes a certain condition on the topology of the graph structure formed by the input set and the links between its elements. Since both constructs and concepts can, within the operations performed on them, be treated as analogues of lexemes, the graph structure they form can be regarded as an analogue of the structure formed by lexemes, namely the primary structure of the text (13). An important consequence is that a procedure for the automated creation of templates of the form (73) is easy to develop, because it reduces to a simple implementation of the subgraph selection function \( G_{ap} \subset T_{sn}^{e} \). Note that the primary structure of the processed text itself can be used as \( T_{sn}^{e} \); that is, template formation can act as an additional step after parsing and before aggregation (25).

Another important consequence of the existence of a homomorphism between the primary structure of the text and the structure of the information extracted from it is the possibility of constructing a specialized NS \( NS^{\prime} \) intended for work not with ontology objects but with lexemes of the primary structure [58]. An interactive document built on its basis can be used to form \( G_{ap} \).

The procedure for creating templates can be simplified even further by creating specialized reference texts that contain only examples of the use of certain verbal constructions. In this case, the subgraph selection step can be omitted by taking \( G_{ap} = T_{sn}^{e} \).

4 Formation of Ontological GIS Applications and Transdisciplinary Representation of Geospatial Information with Their Help

As already mentioned, the work of experts in the corporate information environment can be represented as a system {action → results}, which is an analogue of the natural system SN. Based on this system, interactive documents of the form (6) can be built. To create an interactive document, a specialized transformation of the form (75) can be constructed.

$$ \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{G} :O \Rightarrow SN $$
(75)

Consider natural systems SN characterized by a set of n “actions” \( \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{x}^{1} \ldots \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{x}^{n} \) and one “result” \( \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{y} \) [57], which are connected by the dependence:

$$ \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{y} = \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{f} \left( {\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{x}^{1} \ldots \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{x}^{n} } \right) $$
(76)

When a natural system is formed on the basis of ontologies, the dependence \( \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{f} \) in the general case has the following form:

$$ \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{f} \left( {\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{x}^{1} \ldots \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{x}^{n} } \right) = D\left( {Q_{n} \left( {Q_{n - 1} \left( { \ldots Q_{1} \left( {Q_{0} \left( X \right),\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{x}^{1} } \right) \ldots ,\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{x}^{n - 1} } \right),\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{x}^{n} } \right)} \right) $$
(77)
  • where X - the set of objects of the initial ontology O;

  • \( Q_{i} \) - auxiliary processing functions;

  • D - a mapping function that allows the user to display the results of processing objects.
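
Formula (77) describes a pipeline in which each user “action” parameterizes one processing function. A minimal sketch of this composition is given below; the function name `natural_system` and the example processing functions are hypothetical and serve only to illustrate the nesting order.

```python
def natural_system(X, Q0, steps, D):
    """Build f(x^1..x^n) = D(Q_n(...Q_1(Q_0(X), x^1)..., x^n)) as in (77).

    X     - the objects of the initial ontology
    Q0    - initial processing function of one argument
    steps - the list [Q_1, ..., Q_n]; each takes (state, action)
    D     - display function applied to the final state
    """
    def f(*actions):
        if len(actions) != len(steps):
            raise ValueError("exactly one action per processing function is expected")
        state = Q0(X)
        for Q, action in zip(steps, actions):
            state = Q(state, action)
        return D(state)
    return f

# Hypothetical usage: two filtering steps over a set of numbers.
f = natural_system(
    range(1, 7),
    lambda X: set(X),
    [lambda s, a: {x for x in s if x >= a},
     lambda s, a: {x for x in s if x % a == 0}],
    sorted,
)
print(f(2, 2))
```

Changing the list `steps` or the function `Q0` yields a different natural system over the same set X, which is the mechanism used later in the argument for Statement 2.1.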

A basic set of auxiliary functions can be defined as follows.

The hierarchical filtering function has the form (78):

$$ Q_{h} \left( {X,x^{*} } \right) = \left\{ {\bar{x} \in X|\bar{x}\bar{R}x^{*} \vee \left( {\exists \tilde{x},\tilde{x}\bar{R}x^{*} \wedge \bar{x} \in Q_{h} \left( {X,\tilde{x}} \right)} \right)} \right\} $$
(78)
  • where X - the set of objects belonging to a certain ontology;

  • \( x^{*} \) - the object in respect of which the filtration is carried out;

  • \( \bar{R} \) - a certain relation between objects.

This function selects the objects that are related to \( x^{*} \) by a particular relation \( \bar{R} \). The condition is verified recursively, which is equivalent to using the transitive closure of \( \bar{R} \).

The relation \( \bar{R} \) should be based on the set of links R between the objects:

$$ \begin{array}{*{20}c} {\exists k,\left\langle {x_{1} ,x_{2} ,k} \right\rangle \in R \Rightarrow x_{1} \bar{R}x_{2} } \\ {\exists k,\left\langle {x_{1} ,x_{2} ,k} \right\rangle \in R \Rightarrow x_{2} \bar{R}x_{1} } \\ {\exists k,\left\langle {x_{1} ,x_{2} ,k} \right\rangle \in R \vee \left\langle {x_{2} ,x_{1} ,k} \right\rangle \in R \Rightarrow x_{1} \bar{R}x_{2} } \\ \end{array} $$
  • where \( x_{1} ,x_{2} \) - certain objects;

  • R - the set of links between objects;

  • k - the type of relation between objects.

If necessary, the type of relation k can be fixed, which will result in the formation of the relation \( \bar{R}_{k} \) given by the formulae (79)–(81).

$$ \left\langle {x_{1} ,x_{2} ,k} \right\rangle \in R \Rightarrow x_{1} \bar{R}_{k} x_{2} $$
(79)
$$ \left\langle {x_{1} ,x_{2} ,k} \right\rangle \in R \Rightarrow x_{2} \bar{R}_{k} x_{1} $$
(80)
$$ \left\langle {x_{1} ,x_{2} ,k} \right\rangle \in R \vee \left\langle {x_{2} ,x_{1} ,k} \right\rangle \in R \Rightarrow x_{1} \bar{R}_{k} x_{2} $$
(81)
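
The hierarchical filter can be sketched as a graph traversal, which avoids the explicit recursion of (78) while computing the same transitive closure. This is an illustrative sketch: the helper `make_relation` and its `mode` argument are hypothetical names that correspond to the three variants (79)-(81), and links are modelled as `(x1, x2, k)` triples.

```python
def make_relation(R, k=None, mode="forward"):
    """Build the relation R-bar (or R-bar_k when k is fixed) as ordered pairs.

    mode="forward" follows (79), "reverse" follows (80), "symmetric" follows (81).
    """
    pairs = set()
    for x1, x2, kind in R:
        if k is not None and kind != k:
            continue
        if mode in ("forward", "symmetric"):
            pairs.add((x1, x2))
        if mode in ("reverse", "symmetric"):
            pairs.add((x2, x1))
    return pairs

def Q_h(X, x_star, rel):
    """Objects of X related to x_star directly or through a chain of rel."""
    reached, frontier = set(), [x_star]
    while frontier:
        target = frontier.pop()
        for a, b in rel:
            if b == target and a not in reached:
                reached.add(a)
                frontier.append(a)
    return reached & set(X)

# Hypothetical ontology fragment.
R = {
    ("river", "water_body", "is_a"),
    ("lake", "water_body", "is_a"),
    ("water_body", "feature", "is_a"),
    ("road", "feature", "is_a"),
    ("river", "sea", "flows_into"),
}
X = {"river", "lake", "water_body", "road", "feature", "sea"}
print(sorted(Q_h(X, "feature", make_relation(R, k="is_a"))))
```

The `reached` set also guards against cycles in the links, a case in which the recursive definition taken literally would not terminate.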

The attribute filtering function has the form:

$$ Q_{a} \left( {X,A} \right) = \left\{ {\bar{x} \in X|A \cap A_{{\bar{x}}} = A} \right\} $$
(82)
  • where X - the set of objects belonging to a certain ontology;

  • A - the set of attributes for which filtration is performed;

  • \( A_{x} \) - the set of attributes of the object x.

The function \( Q_{a} \) allows you to select a set of objects that have a specific attribute or specific value of a particular attribute.
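
Since \( A \cap A_{{\bar{x}}} = A \) holds exactly when \( A \subseteq A_{{\bar{x}}} \), the attribute filter (82) reduces to a subset test. In the sketch below the mapping `attrs_of` from objects to their attribute sets is a hypothetical representation of \( A_{x} \).

```python
def Q_a(X, A, attrs_of):
    """Attribute filter (82): objects whose attribute set contains all of A."""
    required = set(A)
    return {x for x in X if required <= set(attrs_of.get(x, ()))}

# Hypothetical usage: attributes are encoded as "name=value" strings.
attrs_of = {
    "obj1": {"type=city", "has_coordinates"},
    "obj2": {"type=city"},
    "obj3": {"type=river", "has_coordinates"},
}
print(sorted(Q_a(attrs_of.keys(), {"has_coordinates"}, attrs_of)))
```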

The context link function allows you to establish links between objects described in documents belonging to a specific information environment. This function is a combination of two functions.

The indexing function looks like:

$$ Q_{I} \left( C \right) = \bigcup\nolimits_{T \in C} {\bigcup\nolimits_{{l \in L_{T} }} {\left\{ {{ < }V\left( l \right),V\left( T \right){ > }} \right\}} } $$
(83)
  • where C - the input set of documents;

  • T - a certain document;

  • \( L_{T} \) - a set of lexemes that forms a textual representation T;

  • \( V\left( l \right),V\left( T \right) \) - the identifiers of the lexeme l and the document T, respectively.

The indexing function results are used by the search function:

$$ Q_{S} \left( {I,l} \right) = \left\{ {T|\left\langle {V\left( l \right),V\left( T \right)} \right\rangle \in I} \right\} $$
(84)
  • where I - index, which is the result of work \( Q_{I} \);

  • \( V\left( l \right),V\left( T \right) \) - the identifiers of the lexeme l and the document T, respectively.
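
The indexing and search functions (83) and (84) can be sketched directly from their definitions. This is an illustration under assumptions: the document set C is modelled as a hypothetical dictionary from document identifiers to lexeme lists, and the lexeme string itself serves as the identifier \( V\left( l \right) \).

```python
def Q_I(C):
    """Indexing function (83): all pairs (lexeme id, document id)."""
    return {(lexeme, doc_id) for doc_id, lexemes in C.items() for lexeme in lexemes}

def Q_S(I, l):
    """Search function (84): documents whose index entries contain lexeme l."""
    return {doc_id for lexeme, doc_id in I if lexeme == l}

# Hypothetical document set.
C = {
    "d1": ["river", "flood"],
    "d2": ["river", "bridge"],
    "d3": ["road"],
}
I = Q_I(C)
print(sorted(Q_S(I, "river")))
```

The index is kept as a flat set of pairs to mirror the formulas. A practical system would more likely store it as a mapping from lexeme to documents, so that a search does not scan every pair.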

With the help of the search function, relationships can be formed between objects that belong to different ontologies (or between an ontology object and an unstructured information resource). Two features must be taken into account:

  1. The object context usually consists of a large number of attributes, which, when constructing the index, can be regarded as a single text.

  2. Both the name and the context of an object in the general case consist of many lexemes.

Taking into account these features, the function of the context link will look like:

$$ Q_{c} \left( x \right) = \bigcup\nolimits_{{l \in L_{x} }} {Q_{S} \left( {Q_{I} \left( C \right),l} \right)} $$
(85)
  • where C - a set of documents representing the information environment within which the linking is carried out;

  • x - the object being linked;

  • \( L_{x} \) - the textual representation of the context;

  • l - a certain lexeme.

The main disadvantage of function (85) is that its result is an unordered set of documents, which can be quite large and therefore inconvenient for an expert to process. There are two ways to solve this problem:

  1. Define an order relation on the set of results, for example the relevance relation \( R_{rel} \):

$$ T_{1} R_{rel} T_{2} \Rightarrow card\left( {L_{{T_{1} }} \cap L_{x} } \right) > card\left( {L_{{T_{2} }} \cap L_{x} } \right) $$
(86)
  • where \( L_{{T_{1} }} ,L_{{T_{2} }} \) - the text representations of the documents \( T_{1} ,T_{2} \);

  • \( L_{x} \) - the context of the object x for which the context link was established.

  2. Remove documents with insufficient relevance from the result, for which the intersection operation can be used in function (85) instead of the union operation.
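
Both remedies can be illustrated with a short sketch of the context link function (85). The assumptions are stated explicitly: documents are modelled as a hypothetical dictionary from identifiers to lexeme sets, the index lookup is inlined instead of calling separate indexing and search functions, the `mode` argument is an illustrative way to switch between union and intersection, and ties in relevance are broken by document identifier, which (86) itself does not prescribe.

```python
def Q_c(L_x, C, mode="union"):
    """Context link (85): documents that share lexemes with the context L_x.

    mode="union" follows (85) as written; mode="intersection" keeps only
    the documents that contain every lexeme of the context.
    """
    hits = [{doc for doc, lexemes in C.items() if l in lexemes} for l in L_x]
    if not hits:
        return set()
    if mode == "union":
        return set.union(*hits)
    return set.intersection(*hits)

def rank(docs, L_x, C):
    """Order documents by relevance (86): larger overlap with L_x comes first."""
    context = set(L_x)
    return sorted(docs, key=lambda d: (-len(set(C[d]) & context), d))

# Hypothetical document set and object context.
C = {
    "d1": {"river", "flood", "dam"},
    "d2": {"river", "bridge"},
    "d3": {"road"},
}
L_x = {"river", "dam"}
print(rank(Q_c(L_x, C), L_x, C))
print(Q_c(L_x, C, mode="intersection"))
```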

5 Transdisciplinary Presentation of Information Through Interactive Documents

The transdisciplinary representation of sets of ontologies is based on the transformation between two hypersets:

$$ f^{ct} :R \to F $$
(87)
  • where R, F - the sets of links and of interpretation functions of a certain ontology O.

For a set of ontologies, you can construct a hyperset:

$$ {\Re } = \bigcup\nolimits_{i} {R_{i} } , F = \bigcup\nolimits_{i} {F_{i} } $$
(88)
  • where i - the index defining a certain ontology \( O_{i} = \left\langle {X_{i} ,R_{i} ,F_{i} } \right\rangle \).

On these hypersets it is possible to construct the transformation inverse to \( f^{ct} \):

$$ f^{tt} :F \to {\Re } $$
(89)

That is, a certain link between objects of ontologies on various subjects can be represented by a non-empty set of interpretation functions from these ontologies.

This transformation can be extended to a set of unstructured texts, applying to each of the texts the reduction operator (24).

Let us consider the procedure of transdisciplinary representation of a certain set of ontologies by means of interactive documents. This representation is based on function (85). To implement it, the transdisciplinary transformation is executed using the function:

$$ Q_{TI} \left( C \right) = \bigcup\nolimits_{{x \in X_{C} }} {\left\{ {Q_{c} \left( x \right)} \right\}} $$
(90)
  • where C - the set of ontologies;

  • \( X_{\text{C}} \) - a set of objects belonging to the union of ontologies \( \bigcup\nolimits_{O \in C} O \).

The application of the function \( Q_{c} \left( x \right) \) generates a set of objects from different ontologies, representing a certain hyper-relation between the corresponding objects. With the help of the hyper-relations thus formed, a transdisciplinary representation \( O^{\prime} \) of the set of ontologies \( C = \left\{ {{ < }X_{i} ,R_{i} ,F_{i} { > }} \right\} \) can be constructed:

$$ C\mathop \to \limits^{{Q_{TI} }} { < }\bigcup\nolimits_{i} {X_{i} } ,\bigcup\nolimits_{i} {R_{i} \cup Q_{TI} \left( C \right)} ,\bigcup\nolimits_{i} {F_{i} } { > } $$
(91)

Transformation (91) gives the most complete representation of the information available in C, which is not always convenient. Often it is necessary to build a representation of one selected ontology O. In this case, (90) must be changed to:

$$ Q_{TO} \left( {O,C} \right) = \bigcup\nolimits_{{x \in X_{O} }} {\left\{ {Q_{c} \left( x \right)} \right\}} $$
(92)
  • where C - the set of ontologies;

  • \( X_{O} \) - a set of objects belonging to ontology O.
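
Functions (90) and (92) can be sketched as follows. The data model is an assumption made for the example: a set of ontologies is a hypothetical dictionary from ontology names to objects, each object is represented only by the set of lexemes in its context, and two objects are considered linked when their contexts share at least one lexeme. Each hyper-relation is returned as a frozen set of `(ontology, object)` pairs.

```python
def Q_c(context, C):
    """Context link applied to ontologies: all objects sharing a lexeme with context."""
    return frozenset(
        (name, obj)
        for name, objects in C.items()
        for obj, obj_context in objects.items()
        if context & obj_context
    )

def Q_TI(C):
    """(90): hyper-relations generated by every object of every ontology in C."""
    return {Q_c(ctx, C) for objects in C.values() for ctx in objects.values()}

def Q_TO(O_name, C):
    """(92): hyper-relations generated only by the objects of one ontology."""
    return {Q_c(ctx, C) for ctx in C[O_name].values()}

# Hypothetical set of two small ontologies.
C = {
    "geo": {"river_1": {"river", "basin"}},
    "eco": {"pollution": {"river", "water"}, "forest": {"tree"}},
}
for relation in Q_TO("geo", C):
    print(sorted(relation))
```

In this sketch `Q_TI` returns two hyper-relations (one linking `river_1` with `pollution`, one containing only `forest`), while `Q_TO("geo", C)` returns only the first, which matches the narrowing from (90) to (92).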

For the construction of systems that use the transdisciplinary representation of information, the following statements are important:

Statement 2.1.

On the basis of one ontology O belonging to a set of documents C, it is possible to form an arbitrary number of NS.

Argument.

Let there exist a certain type of NS (76), built on the basis of an ontology \( O = \left\langle {X,R,F} \right\rangle \) with the help of transformation (75), and a set of documents C. Since the NS is determined by its function of the form (77), it is possible to form an arbitrary number of natural systems by changing the composition and the number of the functions \( Q_{0} \ldots Q_{n} \). In particular:

  1. If an arbitrary object \( \tilde{x} \in X \) is fixed and the hierarchical filtering function is applied to the ontology, then one can take \( Q_{0} \left( X \right) = Q_{h} \left( {X,\tilde{x}} \right) \) and obtain a system that represents a certain class of objects;

  2. If a function that performs the transdisciplinary representation (91) is selected as \( Q_{0} \), then we obtain a system that reflects the transdisciplinary representation \( O^{\prime} \) of the ontology.

By successively applying the function \( Q_{h} \) with different parameters, together with functions that perform the transdisciplinary representation, any number of modifications of the initial ontology O can be created, and using them in formula (77) yields as many natural systems.

The consequence of Statement 2.1 is that on the basis of one ontology, it is possible to construct an arbitrary number of interactive documents.

Natural systems of the form (76) are the simplest variant of natural systems. However, such a system may be ineffective, in particular in the following cases:

If the information can be presented in several ways (for example, as a table and as an electronic map), then it is usually not known in advance which method is more convenient for an expert;

If the information is heterogeneous in content and structure, then it may be necessary to represent its various subsets in different ways.

Therefore there is often a need to form, from the content of an ontology, several natural systems that act independently, accept the same set of input “actions” from the user, and give different “results”, from which the user can choose the one that suits the task.

Statement 2.2.

A combination of independent natural systems \( SN_{i} \) that accept the same set of “actions” is a natural system.

Argument.

According to the definition in [57], a system is natural if at least one of its elements gives, when acting within the system, the same result as when working independently of the other elements of the system. If we consider each \( SN_{i} \) as an element of a particular metasystem S, then, because the \( SN_{i} \) are independent, their work within the system does not differ from their work outside it. Therefore, S is also a natural system.

Such a system S is an extension of (76) and has the form (93).

$$ \begin{array}{*{20}c} {\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{y}^{1} = \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{f}^{1} \left( {\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{x}^{1} \ldots \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{x}^{n} } \right)} \\ \vdots \\ {\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{y}^{m} = \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{f}^{m} \left( {\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{x}^{1} \ldots \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{x}^{n} } \right)} \\ \end{array} $$
(93)
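The structure of (93) can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that a natural system is modelled as a function from the user's “actions” to one “result”; the names `combine`, `as_table` and `as_map` are hypothetical and do not come from the chapter.

```python
from typing import Callable, Sequence, Tuple

# Assumption: a natural system is modelled as a function that maps the
# user's "actions" x^1..x^n to a single "result" y.
NaturalSystem = Callable[..., object]


def combine(systems: Sequence[NaturalSystem]) -> Callable[..., Tuple[object, ...]]:
    """Build the metasystem S of (93): every f^i receives the same actions."""
    def metasystem(*actions):
        # Each element works exactly as it would outside the metasystem.
        return tuple(f(*actions) for f in systems)
    return metasystem


# Two hypothetical display variants of the same selection.
def as_table(query, limit):
    return f"table({query}, rows={limit})"


def as_map(query, limit):
    return f"map({query}, markers={limit})"


S = combine([as_table, as_map])
results = S("places", 3)  # one "result" per independent system
```

Because the elements do not interact, each component of `results` equals what the corresponding system returns on its own, which is the property used in the argument for Statement 2.2.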

6 Transdisciplinary Representation of Geospatial Information with Ontological GIS Applications

The ontological GIS application of the form (7) is also formed on the basis of an NS that can be represented by formula (76). In the simplest case, such an NS can be regarded as an ordinary NS with a special display function: the function \( D_{m} \), which displays a concept as a marker on an electronic map. The main feature of this function is that not all concepts can be represented as markers, but only those that are elements of a layer (which, for example, can be represented by a certain category of concepts) and whose attributes contain geographic information. Therefore, before \( D_{m} \) is applied, the ontology must be filtered:

$$ \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{T}_{lr} = Q_{a} \left( {Q_{h} \left( {X,x_{lr} } \right),A_{geo} } \right) $$
(94)
  • where \( \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{T}_{lr} \) - a taxonomy formed by a subset of ontology concepts that can be represented as markers of a certain layer;

  • \( x_{lr} \) - an object that defines a class of objects that forms a layer of GIS;

  • \( A_{geo} \) - A set of attributes that can contain geographic information.

As a result of this filtering, you can get a set of taxonomies (usually non-intersecting). This set can be displayed with \( D_{m} \):

$$ D_{m} \left( {\bigcup\nolimits_{{x_{lr} \in X_{lr} }} {\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{T}_{lr} } } \right) $$
(95)
  • where \( X_{lr} \) - the set of classes of concepts representing layers of GIS;

  • \( x_{lr} \) - a concept that defines the class of concepts that form a layer of GIS;

  • \( \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{T}_{lr} \) - the taxonomic representation of the layer defined by \( x_{lr} \);

  • \( D_{m} \) - the function that displays an object as a marker on an electronic map.
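Formulas (94) and (95) can be sketched in Python as follows. The toy ontology, the attribute names `lat` and `lon`, and the concrete behaviour of `q_h`, `q_a` and `d_m` are assumptions made for illustration; the chapter defines these functions only abstractly.

```python
# Toy ontology: object -> (parent class, attributes). All names are illustrative.
ONTOLOGY = {
    "place": (None, {}),
    "archive": (None, {}),
    "Kyiv": ("place", {"lat": 50.45, "lon": 30.52}),
    "Kaniv": ("place", {"lat": 49.75, "lon": 31.47}),
    "Unknown village": ("place", {}),
    "Central archive": ("archive", {"lat": 50.44, "lon": 30.51}),
}
A_GEO = {"lat", "lon"}  # assumed set of geographic attributes


def q_h(objects, root):
    """Hierarchical filter: keep objects that descend from `root`."""
    def descends(name):
        parent = ONTOLOGY[name][0]
        while parent is not None:
            if parent == root:
                return True
            parent = ONTOLOGY[parent][0]
        return False
    return {x for x in objects if descends(x)}


def q_a(objects, attributes):
    """Attribute filter: keep objects that carry all of `attributes`."""
    return {x for x in objects if attributes <= set(ONTOLOGY[x][1])}


def layer(x_lr):
    """Formula (94): T_lr = Q_a(Q_h(X, x_lr), A_geo)."""
    return q_a(q_h(set(ONTOLOGY), x_lr), A_GEO)


def d_m(objects):
    """Stand-in for D_m: turn each object into a (name, lat, lon) marker."""
    return sorted((x, ONTOLOGY[x][1]["lat"], ONTOLOGY[x][1]["lon"]) for x in objects)


X_LR = ["place", "archive"]
markers = d_m(set().union(*(layer(x) for x in X_LR)))  # formula (95)
```

In this sketch “Unknown village” belongs to the class “place” but has no geographic attributes, so it is dropped by `q_a` and never reaches the display function.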

An NS formed by this procedure constitutes an ontological GIS application of the form (7). Formula (7) can be brought to the standard structure of the interactive document as follows:

A set of geographic entities \( X_{g} \) can be formed by combining taxonomic representations of GIS layers (94), as in the formula (95):

$$ X_{g} = \bigcup\nolimits_{{x_{lr} \in X_{lr} }} {\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{T}_{lr} } $$
(96)

On the basis of the set \( X_{g} \), one can also form a set \( R_{g} \subset R \) by formula (97). Typically, this set corresponds to the union of the sets of links of the taxonomic representations \( \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{T}_{lr} \) of the GIS layers; however, specialized links between objects of different taxonomic representations are also possible.

$$ R_{g} = \{ { < }x_{1} ,x_{2} { > } \in R|x_{1} \in X_{g} \wedge x_{2} \in X_{g} \} $$
(97)
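As a small illustration of (96) and (97), the sketch below builds \( X_{g} \) as a union of layer taxonomies and restricts the relation set to it. The object names and the helper `induced_relations` are hypothetical.

```python
def induced_relations(R, X_g):
    """Formula (97): keep the pairs of R whose both ends lie in X_g."""
    return {(x1, x2) for (x1, x2) in R if x1 in X_g and x2 in X_g}


# Hypothetical layer taxonomies and ontology links.
T_places = {"Kyiv", "Kaniv"}
T_archives = {"Central archive"}
X_g = T_places | T_archives  # formula (96)

R = {
    ("Kyiv", "Kaniv"),
    ("Central archive", "Kyiv"),  # a link across two layers
    ("Kyiv", "Poem"),             # "Poem" has no geodata, so it is not in X_g
}
R_g = induced_relations(R, X_g)
```

The pair linking an archive to a place survives the restriction, which corresponds to the specialized links between different taxonomic representations mentioned above.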

The set of states S appears as a sequence of “results” \( \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{y} \) of the work of the natural system, which were formed in response to the “actions” provided by the user:

$$ S = \left\{ {\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{y} } \right\} $$
(98)

The set of analytic operations over geographic objects \( A^{op} \) essentially represents a subset of the set of auxiliary functions Q forming the dependence (77):

$$ A^{op} \subset Q $$
(99)

This set can be divided into two subsets: the set of operations of the analysis of the spatial relation of objects \( A^{rel} \) and the set of measuring operations \( A^{ms} \).

The operations of the analysis of spatial relations of objects make it possible to select a set of objects whose associated geospatial information meets a certain criterion. Such operations can be used to filter objects in combination with the functions (78)–(84). The classification of such operations is presented in Table 1.

Table 1. Operations of the analysis of spatial relations of objects

Most of these relations are binary. When the user specifies a reference geospatial object (a point or a polygon), it is necessary to select, among the objects of the ontology, those whose geospatial information is in the given binary relation with the reference:

$$ Q^{r} \left( {X,g} \right) = \{ x \in X|{ < }A_{geo} \left( x \right),g{ > } \in r\} $$
(100)
  • where r - a certain binary geospatial relation;

  • X - a set of taxonomy objects;

  • g - spatial object acting as a reference;

  • \( A_{geo} \left( x \right) \) - a function that retrieves the geospatial information associated with the object x.

The distance relation can also be used for filtering if a specific maximum distance value \( d_{max} \) is set. This function is particularly useful as an alternative to the “coincides” relation in cases where both the reference and the geospatial information being filtered are points (for example, when handling a click event). It can be represented as follows:

$$ Q^{d} \left( {X,g} \right) = \{ x \in X|d\left( {A_{geo} \left( x \right),g} \right) < d_{max} \} $$
(101)
  • where d - function of distance;

  • X - a set of taxonomy objects;

  • g - spatial object acting as a reference;

  • \( A_{geo} \left( x \right) \) - a function that retrieves the geospatial information associated with the object x;

  • \( d_{max} \) - the value of the maximum permissible distance.
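The filters (100) and (101) can be sketched in Python as below. The sketch assumes planar coordinates, Euclidean distance, and a simple “point inside a bounding box” predicate as one possible binary relation r; the data in `GEO` and the function names are hypothetical.

```python
import math

# Hypothetical geospatial information A_geo(x) for three objects (planar points).
GEO = {"A": (0.0, 0.0), "B": (3.0, 4.0), "C": (10.0, 10.0)}


def a_geo(x):
    """Retrieve the geospatial information associated with object x."""
    return GEO[x]


def within(point, box):
    """One possible binary relation r: the point lies inside an axis-aligned box."""
    (x, y), (xmin, ymin, xmax, ymax) = point, box
    return xmin <= x <= xmax and ymin <= y <= ymax


def q_r(X, g, r):
    """Formula (100): objects whose geodata is in relation r with the reference g."""
    return {x for x in X if r(a_geo(x), g)}


def q_d(X, g, d_max, d=math.dist):
    """Formula (101): objects strictly closer to the reference g than d_max."""
    return {x for x in X if d(a_geo(x), g) < d_max}


in_box = q_r(GEO, (0, 0, 5, 5), within)
near_click = q_d(GEO, (0.0, 0.0), 5.0)
```

Note that the inequality in (101) is strict, so an object lying exactly at distance \( d_{max} \) from the reference (object “B” here) is not selected.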

Measuring operations \( A^{ms} \) are fundamentally different from the operations of the analysis of spatial relations of objects. Their result is a numeric value, which requires a special display function \( D_{n} \) in order to be presented to the user. The classification of such operations is given in Table 2.

Table 2. Measuring operations

The main distinguishing feature of measuring operations is that they take as an argument not a reference for filtering but a particular object whose geospatial information is used in the calculation. However, if necessary, filtering functions can be constructed on their basis, in particular functions similar to (101).
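As a hedged illustration of how a filter can be built on top of a measuring operation, the sketch below uses polygon area (computed with the shoelace formula) as an example of such an operation. Area is chosen here only as an assumption, since the contents of Table 2 are not reproduced in this text; `q_area` and the sample parcels are hypothetical.

```python
def area(polygon):
    """Shoelace formula for a simple polygon given as a list of (x, y) vertices."""
    n = len(polygon)
    s = 0.0
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0


def q_area(X, a_geo, min_area):
    """A filter derived from a measuring operation, by analogy with (101)."""
    return {x for x in X if area(a_geo(x)) >= min_area}


# Hypothetical objects whose geodata are polygons.
PARCELS = {
    "square": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "triangle": [(0, 0), (4, 0), (0, 3)],
}
large = q_area(PARCELS, PARCELS.get, 5.0)
```

The measuring operation itself returns a number for one object; wrapping it in a threshold comparison turns it into a filter over a set of objects.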

We impose the following condition on all elements of the set of ontologies C: each of them has a non-empty subset of objects containing geographic information:

$$ \forall O \in C,Q_{a} \left( {X_{O} ,A_{geo} } \right) \ne \emptyset $$
(102)
  • where \( X_{O} \) - the set of objects of the ontology O;

  • \( A_{geo} \) - a set of geographical attributes;

  • \( Q_{a} \) - attribute filtering function (82).

An arbitrary GIS can be represented as follows [13, 61, 66, 67]:

$$ \left\langle {X_{g} ,R_{g} ,A^{op} ,T^{s} } \right\rangle $$
(103)
  • where \( X_{g} \) - the set of geographical entities over which analytical operations are performed to solve the problem;

  • \( R_{g} \) - a set of relationships between geographic objects that determine the type of executed operations;

  • \( A^{op} \) - a set of analytical operations on geographic objects performed in the process of solving a problem;

  • \( T^{s} \) - the set of states of the task, which are visualized on the map in the process of its solution;

Statement 2.3.

Each GIS can be represented as a definite natural system.

This statement follows from the definition of the natural system [57], because a GIS can be represented as a system of n “actions” and one “result” (76), the result being the presentation of geospatial information to the user in the required form.

Statement 2.4.

The set of all GIS can be represented as a hyperset of natural systems.

Functional characteristics of GIS can be interpreted on the basis of affine space. An affine space over a field K is a triple:

$$ \left\langle {A_{n} ,V_{n} , + } \right\rangle $$
(104)
  • where \( A_{n} \) - the set, the elements of which are called points;

  • \( V_{n} \) - a vector space over the field K;

  • \( + \) - a binary operation \( A_{n} \times V_{n} \to A_{n} \) that satisfies certain axioms [69].

The set of points \( A_{n} \) is called an n-dimensional affine space over K, and the vector space \( V_{n} \) is called the direction space of \( A_{n} \) [69, 70].

Such a space can easily be built as part of a certain ontological GIS application. To do this, one uses the function \( A_{geo} \left( x \right) \), which retrieves the geospatial information associated with an object:

$$ A_{geo}^{x} = \{ A_{geo} \left( x \right)|x \in X\} $$
(105)

The set of points \( A_{n} = A_{geo}^{x} \) formed in this way makes it possible to form an affine space over the field \( {\mathbb{R}} \) with the direction space \( {\mathbb{R}}^{2} \) or \( {\mathbb{R}}^{3} \) (depending on the structure of the information used to initialize the ontological GIS application and on the requirements of the problem being solved). This operation is carried out by a specialized transformation:

$$ \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{G}_{A} :\left\langle {X,R,F} \right\rangle \to \left\langle {A_{geo}^{x} ,V_{n} , + } \right\rangle $$
(106)
  • where X, R, F - the set of objects, connections and functions of the interpretation of ontology;

  • \( A_{geo}^{x} \) - formed on the basis of X (105) the set of geographic points;

  • \( V_{n} , + \) - the vector space \( {\mathbb{R}}^{2} \) or \( {\mathbb{R}}^{3} \) and the binary operation from (104).

The points of the set of geographic information \( A_{geo}^{x} \) are related to the vectors of \( V_{n} \) as follows:

  1. Each ordered pair of points A, B is assigned a vector \( \overrightarrow {AB} = v \in V_{n} \), which can be written as \( v = B - A \).

  2. For a given point A and a vector \( v \in V_{n} \), \( \exists !B \in A_{n} :B - A = v \).

  3. For three arbitrary points A, B, C, the triangle identity holds in \( V_{n} \):

$$ \left( {B{-}A} \right) + \left( {A{-}C} \right) + \left( {C{-}B} \right) = 0 $$
(107)

The following properties follow:

  1. A pair of identical points corresponds to the zero vector.

  2. The vector determined by the pair of points B, A is opposite to the vector determined by the pair of points A, B.

  3. If \( B - A = B - A^{\prime} \), then \( A = A^{\prime} \).

  4. If a certain starting point O is chosen, then to each \( v \in V_{n} \) there corresponds exactly one point \( B \in A_{n} \), and to each \( B \in A_{n} \) there corresponds exactly one vector \( v = B - O \).
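The relations between points and vectors listed above can be checked on concrete coordinates. The following Python sketch assumes that points and vectors of the plane are both stored as coordinate tuples; the helper names `sub`, `add` and `vsum` are hypothetical, and integer coordinates are used to avoid rounding effects.

```python
def sub(b, a):
    """Vector B - A determined by the ordered pair of points A, B."""
    return tuple(bi - ai for bi, ai in zip(b, a))


def add(a, v):
    """The binary operation '+' of (104): point plus vector gives a point."""
    return tuple(ai + vi for ai, vi in zip(a, v))


def vsum(*vectors):
    """Component-wise sum of vectors."""
    return tuple(sum(c) for c in zip(*vectors))


A, B, C = (1, 2), (4, 6), (-3, 5)

v = sub(B, A)                                       # the vector AB
back = add(A, v)                                    # the unique point with back - A = v
triangle = vsum(sub(B, A), sub(A, C), sub(C, B))    # left-hand side of (107)
```

For these points the sum in (107) is the zero vector, and adding \( B - A \) to A returns B, in line with the second relation.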

Consider the plane as an affine space \( A_{2} \), and consider all one-to-one mappings \( A_{2} \to A_{2} \) that preserve the distances between points and the angles between vectors. The classification of the main mappings of this kind is shown in Table 3.

Table 3. Classification of the main mappings of the affine space

Each of the main types of mapping corresponds to a certain function that performs such a mapping over an affine space \( \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{A}_{n} \) formed by a certain subset \( \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{X} \subset X \) of ontology objects. These functions have the structure (108).

$$ Q_{t} \left( O \right) = \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{G}_{A}^{ - 1} \left( {t\left( {\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{G}_{A} \left( O \right)} \right)} \right) $$
(108)
  • where \( \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{G}_{A} ,\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{G}_{A}^{ - 1} \) - the transformation (106) and the inverse of it;

  • t - a certain affine mapping \( A_{n} \to A_{n} \).
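The structure (108) can be sketched in Python as a three-step pipeline: extract the points, apply the mapping t, and write the result back. This is only an illustration under stated assumptions: the ontology is a dictionary of attribute dictionaries, coordinates live under a hypothetical key `xy`, and the inverse transformation is given the original ontology so that non-geographic attributes can be preserved.

```python
import math


def g_a(ontology):
    """Stand-in for (106): extract name -> (x, y) from objects that have geodata."""
    return {name: attrs["xy"] for name, attrs in ontology.items() if "xy" in attrs}


def g_a_inv(ontology, points):
    """Write transformed coordinates back, leaving other attributes intact."""
    return {
        name: ({**attrs, "xy": points[name]} if name in points else dict(attrs))
        for name, attrs in ontology.items()
    }


def q_t(ontology, t):
    """Formula (108): Q_t(O) = G_A^{-1}(t(G_A(O)))."""
    points = {name: t(p) for name, p in g_a(ontology).items()}
    return g_a_inv(ontology, points)


def translation(dx, dy):
    return lambda p: (p[0] + dx, p[1] + dy)


def rotation(angle):
    c, s = math.cos(angle), math.sin(angle)
    return lambda p: (c * p[0] - s * p[1], s * p[0] + c * p[1])


O = {"a": {"xy": (1.0, 0.0), "label": "A"}, "b": {"label": "no geodata"}}
moved = q_t(O, translation(2, 3))
rotated = q_t(O, rotation(math.pi / 2))
```

Objects without geospatial information pass through unchanged, so the result is again an ontology with the same set of objects.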

The functions \( Q_{t} \), \( Q_{rt} \), \( Q_{in} \) and others defined by particular mappings can be used in combination with the standard functions of an interactive document to build ontological GIS applications [71,72,73,74]. Ontological GIS applications, in turn, provide transdisciplinary integration of geospatial information through interactive documents and mappings of the affine space.

7 The Practical Approach for the Taras Shevchenko Portal

The transdisciplinary representation of geospatial information is illustrated by the example of an ontological GIS built as a means of reflecting the life of Taras Shevchenko (see Fig. 1).

Fig. 1. Interdisciplinary nature of ontologies in scientific research

The basis of the ontological GIS is an interactive document, for which semantic hyperratios and categories of classes must be defined; together they support the formation of transdisciplinary interactive mappings of complex research topics. It is also necessary to determine their semantic connectivity, which is formed through interdisciplinary relationships that provide structuring and actualization of information and meaningful interactive activity for the researcher of Shevchenko's work (Table 4).

Table 4. Categories of transdisciplinary representation on the basis of hyperratios

The above categories fully support the use of the mechanism of recursive reduction in the formation of interactive documents and the display of geospatial information.

These mechanisms support the construction of different classes of taxonomies, which, in turn, can be identified or expanded in such a way that the names of the vertices in each taxonomy, read from the lower-level concept to the upper-level concept, form a logical chain. For this purpose we consider all the connected vertices of the graph as tautologies constructed according to the rule: <concept> - property-relation - <concept>. The vertices of the graphs are formed by meaningful concepts, and the graphs can be defined as tautologies, taking into account certain limitations of the intuitive understanding of concepts. The formulae formed from all vertices connected by a relation can be regarded as the set of all admissible tautologies; they consist of assertions that concepts belong to a particular taxonomy. The tautologies are formed on the basis of the thematic linking of the concepts of the classes that bear the names of the above taxonomies.
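The rule <concept> - property-relation - <concept> and the resulting logical chain can be sketched in Python as follows. The triples and the concept “Biography” are hypothetical examples, and the sketch assumes that every concept has at most one parent in the taxonomy.

```python
# Hypothetical triples of the form (<concept>, property-relation, <concept>).
TRIPLES = [
    ("Kaniv", "is part of", "Places of stay"),
    ("Places of stay", "is part of", "Biography"),
]


def chain(concept, triples):
    """Read vertex names from a lower-level concept up to the top-level one."""
    parent = {s: (p, o) for s, p, o in triples}  # assumes one parent per concept
    path, seen = [concept], {concept}
    while path[-1] in parent:
        relation, upper = parent[path[-1]]
        if upper in seen:  # guard against accidental cycles in the data
            break
        path += [relation, upper]
        seen.add(upper)
    return path


logical_chain = chain("Kaniv", TRIPLES)
sentence = " ".join(logical_chain)
```

Joining the chain yields a readable statement of membership, which is the sense in which the vertex names of a taxonomy form a logical chain.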

Taxonomies in the ontological GIS environment, which group the classes of ontology objects of the research area, correspond to thematic layers of the electronic map, and the objects belonging to a given class correspond to the objects of the thematic layer. The taxonomy of ontology objects that corresponds to the legend of the map is formed by establishing relationships between concepts and classes, for example, the “part - whole” relation.

The representation of the taxonomy classes as thematic layers of the map makes it possible to combine the notions of place and time with the concepts of facts and events in a previously unknown combination, from a new point of view.

All categories, taxonomies and classes are formed on the basis of a certain classification. This classification may be changed. The entire hierarchy and the names and properties of each category and class can be edited at any time, and their list can be expanded. That is, the ontology of such a complex subject of research as the life and work of Taras Shevchenko is dynamic and provides a broad and rather deep reflection both of the events and facts of his life and of those that followed, up to the present.

Applying the membership function makes it possible to define the classes of taxonomies, that is, the concepts that can be represented by thematic layers of the map. The concepts that directly form the taxonomies can also be reflected in the legends of these thematic layers.

The legend of the map includes the thematic layers, named after the taxonomies, as classes of concepts that have the same set of properties, together with the objects of the layers. The correspondence between the thematic layers of maps and the taxonomies that reflect the structure of the information arrays is established by determining the level of similarity of fuzzy formal concepts of the investigated subjects.

The description of the concepts of the subject area as objects on the map is limited to the fields of attribute information, and the attachment service allows attaching only information that is physically available to the user. Owing to the combination of different types of databases, in a taxonomy the attributes of objects can be represented not only in tabular form but also as text and as thematic hyperlinks to distributed information resources on the network (see Figs. 2 and 3).

Fig. 2. Thematic layers on the map dedicated to the life and work of Taras Shevchenko

Fig. 3. Objects of the thematic layer “Places of stay

The mechanism of recursive reduction provides a dynamic change in thematic profiles of research in the subject area. Interactive documents that are formed at the stage of semantic analysis of input documents (see Fig. 1) provide such a dynamic change based on the use of hyper properties (see Table 1).

An example of their use is shown in Fig. 4: a dynamic transition from places of residence to the locations of archival documents that reflect various events in the life of Shevchenko. The ontological GIS tools open an information card window that contains information about its official name.

Fig. 4. Window of the “Archival Document” widget

The formation of the indicated mapping is realized on the basis of the mechanism of recursive reduction in the processing of a certain object of the ontological register of archival documents related to the life, creativity and honoring the memory of Taras Shevchenko in the ontological interface (see Fig. 5).

Fig. 5. The transition between the geographic information system and the ontological interface.

Thus, the interactivity of documents formed on the basis of the transdisciplinary representation of geospatial information ensures the integration of information systems and the aggregation of distributed network information resources into a single multifunctional ontologically controlled system. The efficiency of the use of information increases owing to its timeliness, usefulness, appropriate dosage, accessibility (comprehensibility), minimal noise, the direct link between the source of information and the user, the adaptation of the rate at which information is delivered to the rate at which it is assimilated, attention to the individual characteristics of the user, the efficient combination of individual and collective activities, etc. The use of the interactive documents of an ontological GIS significantly expands the understanding of the objects of such a complex research topic as the historical and cultural heritage of Kobzar, and of the interdisciplinary connections between them, by supplementing the information descriptions of the objects from distributed information resources and by searching for semantically related information arrays. Such a combination makes it possible to create a single conceptual information and analytical environment, which is constantly replenished by geographically distributed users from different fields and can flexibly expand its functionality through the integration of various information systems.

8 Conclusions

The chapter deals with the approach to structuring the NL text with its key meaningful characteristics on the basis of the primary structure, which is formed during its lexical analysis.

The feasibility is established, and a method is proposed, of structuring natural-language texts by means of procedures for recursive text reduction that use rules presented in the form of lambda expressions.

It is shown that such rules can be formed dynamically by users without special training, on the basis of the original structure of texts already analyzed. This makes it possible to form a structured presentation of texts and to process the information contained in the text by automatic and automated systems.

An approach to constructing an ontological model of an interactive document is proposed; the model is designed to display the results of expert text structuring in accordance with the request entered by the user.

The practical implementation of the proposed theoretical approach is a model of ontological GIS-application that is an interactive document.

Such a document is characterized by a natural system of coordinates defined over an affine space and is therefore the most suitable for displaying geospatial information in a natural way.

The models of the interactive document and the ontological GIS application provide a high level of representativeness of the information available in text documents (in particular, geospatial information) through the use of a structured text representation.

The implementation of the model of transdisciplinary representation of information as an interactive feature of a document provides rapid access to large arrays of thematic information and, in combination with the capabilities of ontological GIS applications, solves the problem of transdisciplinary representation of geospatial information.