
PART I. FOUNDATIONS

1 Introduction

This Chapter is intended as a survey of some basic applications of mereological schemes of reasoning in computer science and engineering. As is typical of surveys, it collects in one text various topical applications, not necessarily related by a logical order of things, and this calls for a unifying thread stemming from a general discussion of the mereological view of things and their relations. In the language of mereology, we can state that one thing is a part of another, that a thing is a fusion of some other things, its parts, or, more generally, that some things approximate parts of another thing to a specified degree.

Particular cases are: spatial reasoning, where things are figures in space; semantics of spatial locutions addressing spatial relationships of mutual position like ‘in’, ‘out’, ‘on’, ‘under’, etc.; planning and navigation by intelligent agents (e.g., mobile autonomous robots), in which the environment is presented as a collection of polygons which should be bypassed in order to reach a goal; granulation of knowledge represented in data tables, along with applications to the synthesis of decision or classification rules; logics for reasoning about knowledge; problems of design and assembly of artifacts from parts; and problems of representation of spatial features of things like dents, holes, and joints. All these problems require adequate notions of parts and the relations induced with their help.

One can justly say that we aim at knowledge expressed in the language of the underlying mereology theory; knowledge we understand here in the sense of I. M. Bocheński (1954), i.e., as a collection of true statements about things, about their collections, i.e., concepts, and about relations among concepts.

A discussion of relations among things and concepts goes back to Aristotle (1989), who singled out four basic types of relations among concepts, i.e., containment or not, and intersection or not, and established twenty–four valid figures of reasoning called syllogisms; see Łukasiewicz (1939, 1957) for a modern logical rendering of the Aristotelian system, including an axiomatization, and Słupecki (1949–1950) for a proof of completeness of the system. In Aristotle’s system, knowledge was rendered as a collection of syllogisms, i.e., relations solely among concepts.

Georg Cantor introduced into this discussion things as elements and sets as collections of things. The naïve set theory of Cantor (from 1873 on), in which one assumed that to each property of things there corresponds the collection (set) of things satisfying the property, gave rise to antinomies, i.e., statements arising from valid deduction rules and seemingly valid assumptions, but contradicting some other assumptions: the Cantor antinomy (the existence of the set of all sets), the Burali–Forti antinomy (the existence of the set of all ordinal numbers), and most notably the Russell antinomy, see Frege (1903); this antinomy, stated in elementary terms of the theory, defines the set X of things x having the property \(x\notin x\), concluding that \(X \in X \Leftrightarrow X\notin X\). In the aftermath of this antinomy, set theory became an axiomatized system, due in particular to Zermelo (the Zermelo–Fraenkel system) (1908) and Gödel (the Gödel–Bernays system) (1951); the independence of statements like the continuum hypothesis, the Souslin hypothesis, etc., see Balcar and Štěpánek (1986), caused the emergence of many systems of set theory.

Another way of resolving the Russell antinomy was pointed out by Stanisław Leśniewski (1916), see a translation in Leśniewski (1982), who developed the theory of Mereology, in which the primitive notion is that of a part. This notion appeared already in Aristotle, and Leśniewski defined it as a relation between individuals which, in turn, are defined in his Ontology, see Leśniewski (1927, 1930) and expositions in Słupecki (1955), Lejewski (1958), or Iwanuś (1973). Ontology is founded on the predicate ε, to be read “is” (in Greek, ei “you are”, cf. Plutarch 1936), which is required to satisfy the Ontology Axiom AO, formulated by Leśniewski as early as 1920, see Słupecki (1955)

$$\displaystyle{\mathrm{AO}\ x\epsilon \ y \Leftrightarrow \ \exists z.(z\epsilon \ x) \wedge \ \forall \ z.(z\epsilon \ x \Rightarrow \ z\epsilon \ y) \wedge \forall \ z,w.(z\epsilon \ x \wedge w\epsilon \ x \Rightarrow z\epsilon \ w).}$$

This axiom determines the meaning of the copula ε in the way adopted by Leśniewski: although the copula occurs on either side of the equivalence, its meaning is revealed by requiring the equivalence to be true, as adopted by Leśniewski in his Protothetics, see Słupecki (1955), Miéville (1984). AO means that the thing x is an individual called by a singular name, responding to the singular or (possibly) collective name of y. In mereology, the predicate of a part is applied to individual things.

The meaning of AO is that x responds to the name of y if and only if x is non–void (∃z. (z ε x)), all x is y (\(\forall \ z.(z\epsilon \ x \Rightarrow \ z\epsilon \ y)\)), and x is an individual (\(\forall \ z,w.(z\epsilon \ x \wedge w\epsilon \ x \Rightarrow z\epsilon \ w)\)). In particular, the relation x ε x characterizes individuals.
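As an illustration, AO can be checked mechanically in a finite semantic model. The following Python sketch is not part of Leśniewski's system: the three-atom domain and the reading of names as their set-theoretic extensions are our assumptions, made only so that the quantifiers in AO become finite searches.

```python
from itertools import chain, combinations

# A finite semantic model of Ontology: names denote subsets of a small
# domain of atoms; eps(x, y) holds when x names exactly one atom and
# that atom falls under y.  (Domain and reading are our assumptions.)
ATOMS = {0, 1, 2}
NAMES = [frozenset(s) for s in chain.from_iterable(
    combinations(sorted(ATOMS), k) for k in range(len(ATOMS) + 1))]

def eps(x, y):
    """x eps y: x is an individual (singular name) falling under y."""
    return len(x) == 1 and x <= y

def ao_rhs(x, y):
    """The right-hand side of the Ontology Axiom AO."""
    nonvoid = any(eps(z, x) for z in NAMES)
    all_in = all(eps(z, y) for z in NAMES if eps(z, x))
    individual = all(eps(z, w) for z in NAMES for w in NAMES
                     if eps(z, x) and eps(w, x))
    return nonvoid and all_in and individual

# AO holds in this model: eps(x, y) <=> its defining right-hand side.
assert all(eps(x, y) == ao_rhs(x, y) for x in NAMES for y in NAMES)
```

In this model the three conjuncts of AO correspond exactly to non-voidness, inclusion of extensions, and being a singleton, which is the content of the paragraph above.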

The relation of a part is irreflexive and transitive; the union of the part relation and the identity relation, called the ingredient relation ingr, satisfies ingr(x, x) for every thing x, which implies that each thing is its own element (see (10.3) in Sect. 10.2.1); this fact eliminates the Russell antinomy, cf. a formal analysis in Sobociński (1950).

Let us call a thing complex if it has other things as parts; a question poses itself, in the sense of Henri Poincaré, of the mutual spatial relations of these parts: some may be over, some under other parts, some may be in, some out, some before, some after, some parts may have interiors, some not, etc.

The relation of mereology to spatial analysis was recognized early, e.g., in Tarski’s axiomatization of the geometry of solids (Tarski, 1929). Spatial considerations led Alfred North Whitehead (1916, 1919, 1920, 1929), in his attempt at capturing a notion of a ‘continuity of events’, to the notion of the relation of being connected, rendered in full generality in Clarke’s Connection Calculus (Clarke, 1981) and explicating the intuitive notion of an ‘external contact’ (see below). By considering spatial relations among parts, mereology comes in touch with topology, the mathematical discipline investigating spatial properties of things; hence the attempts at introducing into mereological universes a mereotopology, i.e., a topological structure induced by the part structure. Mereotopology in turn may be applied in the semantics of utterances describing spatial relations, see Aurnague and Vieu (1995). The reader will find an extensive discussion of the spatial aspects of mereology in Casati and Varzi (1999).

Synthesis of a complex thing from its parts involves two stages, viz., design and assembly; either stage requires a specification of the order in which parts enter the process, their mutual orientation and connections, the timing of particular operations, etc. This belongs in the domain of engineering.

Orientation in space and movement planning are among the principal intelligent behaviors of humans; artificial intelligence and machine learning, studied within computer science, devote much attention to these problems, and one of the principal applications of methods elaborated in those studies is the field of intelligent (behavioral) robotics. Robots which may move freely in the space allotted to them, endowed with sensors, are examples of intelligent agents capable of performing autonomous tasks. This requires planning of paths and subsequent navigation in the environment. As the environment is modeled as a collection of continua (i.e., compact connected sets), spatial analysis is a principal component in the process of planner synthesis.

Another intelligent behavior is the ability to solve problems and to classify things into categories. In studies of this problem, a few paradigms were defined, like the fuzzy set theory of Zadeh (1965) and the rough set theory of Pawlak (1991), along with cognitive paradigms like neural networks. In particular, fuzzy set theory is built on the notion of membership to a degree; transferring this idea to mereology brought forth rough mereology, whose primitive notion is the relation of a part to a degree (Polkowski and Skowron, 1994), see also Polkowski (2011). Rough mereology allows for a more precise description of spatial relations among parts, and its applications in classifier synthesis and planning for intelligent robots are described in this Chapter.

In the process of synthesis of a classifier/decision algorithm, one is forced to cope with noise immanent in data; one means of reducing noise is granulation of data/knowledge, consisting in forming granules of knowledge composed of things similar to one another to a satisfactory degree; rough mereology delivers effective tools for granulation of knowledge and permits one to obtain effective classifiers at reduced complexity.

In our opinion, the topics outlined above, which belong in the so–called cognitive technologies and knowledge engineering, are among the most important ones in computer science and engineering, and, in consequence, we discuss them in more detail in the following sections.

The unifying thread, which can be defined as the existence of part or part–to–a–degree relations on concepts as the basic starting component in reasoning, is strongly evident in those applications.

Before turning to applications, we acquaint the reader with the necessary background in the theoretical foundations of mereology, rough mereology and mereotopology, as bases for further developments.

2 Mereology

In this Chapter, we propose to introduce the reader to the notions and methods of mereology. We begin with mereology in the sense of Leśniewski, based on the notion of a part, and then we present the version of mereology based on the notion of being connected, the Connection Calculus proposed in Clarke (1981) on the lines of Whitehead (1916, 1919, 1920, 1929), de Laguna (1922), and Leonard and Goodman (1940). The connection relation captures the notion of being in contact, e.g., having parts in common or intersecting, and within Connection Calculus one redefines the part relation, which leads to a richer theory with a topological flavor. Mereotopology, along with the necessary background on topology, is presented in a further section.

The reader may be aware of the existence of a vast literature on the philosophical and ontological aspects of mereology which can be neither mentioned nor discussed here; we advise them to consult Simons (2003) and Casati and Varzi (1999) for discussions of those aspects.

2.1 Mereology of Leśniewski

Mereology due to Leśniewski arose from attempts at resolving the antinomies of naïve set theory, see Leśniewski (1916, 1927, 1982), Srzednicki et al. (1992), and Sobociński (1950, 1954–1955). Leśniewski (1916) was the first presentation of the foundations of his theory as well as the first formally complete exposition of mereology.

2.1.1 On the Notion of Part

The primitive notion of mereology in this formalism is the notion of a part. Given some category of things, a relation of a part is a binary relation π which is required to be

M1:

Irreflexive: For each thing x, it is not true that π(x, x).

M2:

Transitive: For each triple x, y, z of things, if π(x, y) and π(y, z), then π(x, z).

Remark.

In the original scheme of Leśniewski, the relation of parts is applied to individual things as defined in Ontology of Leśniewski, see Leśniewski (1930), Iwanuś (1973), Słupecki (1955) (see Introduction for the Ontology Axiom AO).

The relation of part induces the relation of an ingredient (the term is due to T. Kotarbiński), ingr, defined as

$$\displaystyle{ \mathit{ingr}(x,y) \Leftrightarrow \ \pi (x,y)\ \vee x = y }$$
(10.1)

The relation of ingredient is a partial order on things, i.e.,

  1. ingr(x, x).

  2. \(\mathit{ingr}(x,y) \wedge \mathit{ingr}(y,x) \Rightarrow (x = y).\)

  3. \(\mathit{ingr}(x,y) \wedge \mathit{ingr}(y,z) \Rightarrow \ \mathit{ingr}(x,z)\).
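The axioms M1, M2 and the partial-order properties of ingr lend themselves to a brute-force check. A minimal Python sketch, in which the universe of things is taken, for illustration only, to be the nonempty subsets of a three-element set with strict inclusion as the part relation:

```python
from itertools import chain, combinations

# Universe: nonempty subsets of {0, 1, 2} (an illustrative assumption).
U = [frozenset(s) for s in chain.from_iterable(
    combinations(range(3), k) for k in range(1, 4))]

def part(x, y):          # pi(x, y): strict inclusion
    return x < y

def ingr(x, y):          # ingr(x, y) <=> pi(x, y) or x = y,  cf. (10.1)
    return part(x, y) or x == y

# M1 (irreflexivity) and M2 (transitivity) of pi
assert not any(part(x, x) for x in U)
assert all(part(x, z) for x in U for y in U for z in U
           if part(x, y) and part(y, z))

# ingr is reflexive, antisymmetric, and transitive (properties 1-3)
assert all(ingr(x, x) for x in U)
assert all(x == y for x in U for y in U if ingr(x, y) and ingr(y, x))
assert all(ingr(x, z) for x in U for y in U for z in U
           if ingr(x, y) and ingr(y, z))
```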

We formulate the third axiom with the help of the notion of an ingredient.

M3:

(Inference) For each pair of things x, y, if the property

I(x, y): for each t, if ingr(t, x), then there exist w, z such that ingr(w, t), ingr(w, z), and ingr(z, y) hold,

is satisfied, then ingr(x, y).

The predicate of overlap, Ov in symbols, is defined by means of

$$\displaystyle{ Ov(x,y) \Leftrightarrow \ \exists z.\mathit{ingr}(z,x) \wedge \ \mathit{ingr}(z,y) }$$
(10.2)

Using the overlap predicate, one can write I(x, y) down in the form

IOv(x, y): For each t, if ingr(t, x), then there exists z such that ingr(z, y) and Ov(t, z)

2.1.2 On the Notion of a Class

The notion of a mereological class follows; for a non–vacuous property Φ of things, the class of Φ, denoted Cls Φ is defined by the conditions

C1:

If Φ(x), then ingr(x, Cls Φ). 

C2:

If ingr(x, Cls Φ), then there exists z such that Φ(z) and IOv(x, z). 

In plain language, the class of Φ collects in an individual thing all things satisfying the property Φ. The existence of classes is guaranteed by an axiom.

M4:

For each non–vacuous property Φ there exists a class Cls Φ. 

The uniqueness of the class follows by M3. M3 implies also that, for the non–vacuous property Φ, if for each thing z such that Φ(z) it holds that ingr(z, x), then ingr(Cls Φ, x).

The notion of an overlap allows for a succinct characterization of a class: for each non–vacuous property Φ and each thing x, it happens that ingr(x, Cls Φ) if and only if for each ingredient w of x, there exists a thing z such that Ov(w, z) and Φ(z).

Remark.

Uniqueness of the class along with its existence is an axiom in the Leśniewski (1916) scheme, from which M3 is derived. Similarly, it is an axiom in the Tarski (1929, 1935, 1937) scheme.

Consider two examples.

  1. The strict inclusion ⊂ on sets is a part relation. The corresponding ingredient relation is the inclusion \(\subseteq \). The overlap relation is the non–empty intersection. For a non–vacuous family F of sets, the class Cls F is the union \(\bigcup F\).

  2. For reals in the interval [0, 1], the strict order < is a part relation and the corresponding ingredient relation is the weak order ≤ . Any two reals overlap; for a set \(F \subseteq [0,1]\), the class of F is sup F.
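Example 1 can also be checked mechanically. The following Python sketch (the three-element domain is an assumption made for illustration) computes Cls Φ as the union of the Φ-things and verifies condition C1 together with the overlap characterization of the class stated above:

```python
from itertools import chain, combinations

# Model of Example 1: things = nonempty subsets, part = strict inclusion.
U = [frozenset(s) for s in chain.from_iterable(
    combinations(range(3), k) for k in range(1, 4))]

def ingr(x, y):
    return x <= y

def Ov(x, y):                      # overlap: a common ingredient exists
    return bool(x & y)

def Cls(phi):                      # class of Phi = union of its instances
    members = [x for x in U if phi(x)]
    assert members, "Phi must be non-vacuous"
    return frozenset().union(*members)

phi = lambda x: 0 in x             # a sample non-vacuous property
c = Cls(phi)

# C1: every Phi-thing is an ingredient of Cls Phi
assert all(ingr(x, c) for x in U if phi(x))
# Characterization: ingr(x, Cls Phi) iff every ingredient w of x
# overlaps some Phi-thing
assert all(ingr(x, c) == all(any(phi(z) and Ov(w, z) for z in U)
                             for w in U if ingr(w, x)) for x in U)
```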

2.1.3 Notions of Element, Subset

The notion of an element is defined as follows

$$\displaystyle{ el(x,y) \Leftrightarrow \ \exists \varPhi.y = \mathit{Cls}\varPhi \ \wedge \ \varPhi (x) }$$
(10.3)

In plain words, el(x, y) means that y is the class of some property and x responds to that property. To establish some properties of the notion of an element, we begin with the property INGR(x) = { y: ingr(y, x)}, for which the identity x = Cls INGR(x) holds by M3. Hence, el(x, y) is equivalent to ingr(x, y). Thus, each thing x is its own element. This is one means of expressing the impossibility of the Russell paradox within mereology, cf. Leśniewski (1916), Thms. XXVI, XXVII, see also Sobociński (1950).

We observe the extensionality of overlap: for each pair x, y of things, x = y if and only if for each thing z the equivalence \(\mathit{Ov}(z,x) \Leftrightarrow \mathit{Ov}(z,y)\) holds. Indeed, assume the equivalence \(\mathit{Ov}(z,x) \Leftrightarrow \mathit{Ov}(z,y)\) to hold for each z. If ingr(t, x), then Ov(t, x) and Ov(t, y), hence by axiom M3 ingr(t, y), and with t = x we get ingr(x, y). By symmetry, ingr(y, x), hence x = y.

The notion of a subset follows

$$\displaystyle{ sub(x,y) \Leftrightarrow \ \forall z.[\mathit{ingr}(z,x) \Rightarrow \ \mathit{ingr}(z,y)] }$$
(10.4)

It is manifest that for each pair x, y of things, sub(x, y) holds if and only if el(x, y) holds if and only if ingr(x, y) holds.

For the property \(\mathit{Ind}(x) \Leftrightarrow \ \mathit{ingr}(x,x)\), one calls the class Cls Ind, the universe, in symbols V.

2.1.4 The Universe of Things, Things Exterior, Complementation

It follows that (1) The universe is unique. (2) ingr(x, V ) holds for each thing x. (3) For each non–vacuous property Φ, it is true that ingr(Cls Φ, V ).

The notion of a thing x exterior to a thing y, extr(x, y), is the following

$$\displaystyle{ \mathit{extr}(x,y) \Leftrightarrow \ \neg \mathit{Ov}(x,y) }$$
(10.5)

In plain words, x is exterior to y when no thing is an ingredient of both x and y.

Clearly, the operator of exterior has the properties: (1) No thing is exterior to itself. (2) extr(x, y) implies extr(y, x). (3) If, for a non–vacuous property Φ, a thing x is exterior to every thing z such that Φ(z) holds, then extr(x, Cls Φ).

The notion of a complement to a thing, with respect to another thing, is rendered as a ternary predicate comp(x, y, z), cf. Leśniewski (1916), par. 14, Def. IX, to be read ‘x is the complement to y with respect to z’, and it is defined by means of the following requirements: (1) x = Cls EXTR(y, z). (2) ingr(y, z), where EXTR(y, z) is the property which holds for a thing t if and only if ingr(t, z) and extr(t, y) hold.

This definition implies that the notion of a complement is meaningful only when there exists an ingredient of z exterior to y. Basic properties of the complement are: (1) If comp(x, y, z), then extr(x, y) and π(x, z). (2) If comp(x, y, z), then comp(y, x, z).

We let, for a thing x, \(-x = \mathit{Cls}\mathit{EXTR}(x,V )\). It follows that (1) \(-(-x) = x\) for each thing x. (2) − V does not exist.
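In the sets model of Example 1, the complement −x can be computed directly. A Python sketch (the finite three-element domain is our assumption) illustrating that −(−x) = x and that −V does not exist:

```python
from itertools import chain, combinations

# Things: nonempty subsets of a finite domain (illustrative assumption);
# -x = Cls EXTR(x, V) is the union of the things exterior to x.
DOMAIN = frozenset(range(3))
U = [frozenset(s) for s in chain.from_iterable(
    combinations(sorted(DOMAIN), k) for k in range(1, 4))]

def Ov(x, y):
    return bool(x & y)

def complement(x):
    """-x = Cls EXTR(x, V): union of things exterior to x, if any exist."""
    ext = [t for t in U if not Ov(t, x)]
    return frozenset().union(*ext) if ext else None

assert complement(DOMAIN) is None            # -V does not exist
assert all(complement(complement(x)) == x    # -(-x) = x
           for x in U if complement(x) is not None)
```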

We conclude this paragraph with two properties of classes useful in what follows

$$\displaystyle{ \text{If}\ \varPhi \Rightarrow \varPsi \ \text{then}\ \mathit{ingr}(\mathit{Cls}\varPhi,\mathit{Cls}\varPsi ) }$$
(10.6)

and a corollary

$$\displaystyle{ \text{If}\ \varPhi \Leftrightarrow \varPsi \ \text{then}\ \mathit{Cls}\varPhi = \mathit{Cls}\varPsi }$$
(10.7)

2.2 Mereology Based on Connection

In Whitehead (1916, 1919, 1920), the notion of ‘x extends over y’, dual to that of a part, was proposed. Th. de Laguna (1922) published a variant of the Whitehead scheme, which led Whitehead (1929) to another version of his approach, based on the notion of ‘x is extensionally connected to y’. The Connection Calculus, based on the notion of a ‘connection’, was proposed in Clarke (1981); we outline it here.

2.2.1 On the Connection Predicate

The predicate of connection C is subject to basic requirements

CN1:

C(x, x) for each thing x.

CN2:

If C(x, y), then C(y, x) for each pair x, y of things.

It follows that connection is reflexive and symmetric. This theory is sometimes called Ground Topology T, cf., Casati and Varzi (1999). The additional extensionality requirement

CN3:

If \(\forall z.[C(z,x) \Leftrightarrow C(z,y)]\), then x = y.

produces the Extensional Ground Topology ET, see op. cit.

Let us observe that the predicate C can be realized by taking C = Ov; clearly, CN1–CN3 are all satisfied by Ov. We call this model of connection mereology the Overlap model, denoted OVM. Also, letting C(x, y) if and only if \(x \cap y\neq \varnothing \) defines a connection relation on non–empty sets.
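The latter model can be verified mechanically. A Python sketch, with the nonempty subsets of a three-element domain as an assumed finite universe, checking CN1–CN3 for C(x, y) ⇔ x ∩ y ≠ ∅:

```python
from itertools import chain, combinations

# Universe: nonempty subsets of {0, 1, 2} (an illustrative assumption).
U = [frozenset(s) for s in chain.from_iterable(
    combinations(range(3), k) for k in range(1, 4))]

def C(x, y):                       # connection: nonempty intersection
    return bool(x & y)

assert all(C(x, x) for x in U)                              # CN1
assert all(C(x, y) == C(y, x) for x in U for y in U)        # CN2
assert all(x == y for x in U for y in U                     # CN3
           if all(C(z, x) == C(z, y) for z in U))
```

The extensionality CN3 holds here because distinct sets are separated by a singleton connected to one but not the other.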

In the universe endowed with C, satisfying CN1, CN2, one defines the notion of an ingredient ingrC by letting

$$\displaystyle{ \mathit{IC}\ \mathit{ingr}_{C}(x,y) \Leftrightarrow \forall z.[C(z,x) \Rightarrow C(z,y)] }$$
(10.8)

Then, the following properties of ingrC hold

  1. ingrC(x, x).

  2. \(\mathit{ingr}_{C}(x,y) \wedge \mathit{ingr}_{C}(y,z) \Rightarrow \mathit{ingr}_{C}(x,z)\).

  3. In presence of CN3, \(\mathit{ingr}_{C}(x,y) \wedge \mathit{ingr}_{C}(y,x) \Rightarrow x = y\).

  4. \(\mathit{ingr}_{C}(x,y) \Leftrightarrow \forall z.[\mathit{ingr}_{C}(z,x) \Rightarrow \mathit{ingr}_{C}(z,y)]\).

  5. \(\mathit{ingr}_{C}(x,y) \wedge C(z,x) \Rightarrow C(z,y)\).

  6. \(\mathit{ingr}_{C}(x,y) \Rightarrow C(x,y)\).

2.2.2 Introducing Notions of a Part, an Ingredient, Overlapping Things and Things Exterior

The notion of a C–part πC can be introduced as

$$\displaystyle{ \mathit{PC}\ \pi _{C}(x,y) \Leftrightarrow \mathit{ingr}_{C}(x,y) \wedge x\neq y }$$
(10.9)

The predicate of C–overlapping, OvC(x, y) is defined by means of

$$\displaystyle{ \mathit{OC}\ \mathit{Ov}_{C}(x,y) \Leftrightarrow \exists z.[\mathit{ingr}_{C}(z,x) \wedge \mathit{ingr}_{C}(z,y)] }$$
(10.10)

Basic properties of C–overlapping follow.

  1. OvC(x, x).

  2. \(\mathit{Ov}_{C}(x,y) \Leftrightarrow \mathit{Ov}_{C}(y,x)\).

  3. \(\mathit{Ov}_{C}(x,y) \Rightarrow C(x,y)\).

  4. \(\mathit{ingr}_{C}(x,y) \wedge \mathit{Ov}_{C}(z,x) \Rightarrow \mathit{Ov}_{C}(z,y)\).

  5. \(\mathit{ingr}_{C}(x,y) \Rightarrow \mathit{Ov}_{C}(x,y)\).

The notion of a C–exterior thing, extrC, is defined by means of

$$\displaystyle{ \mathit{EC}\ \mathit{extr}_{C}(x,y) \Leftrightarrow \neg Ov_{C}(x,y) }$$
(10.11)

2.2.3 Notions Derived from C

A new notion is C–external connectedness, EC, defined as follows

$$\displaystyle{ \mathit{EC}\ \mathit{EC}(x,y) \Leftrightarrow C(x,y) \wedge \mathit{extr}_{C}(x,y) }$$
(10.12)

It is easy to see that in the model OVM, EC is a vacuous notion. Clearly, by definition (10.12),

  1. \(\neg \mathit{EC}(x,x)\).

  2. \(\mathit{EC}(x,y) \Leftrightarrow \mathit{EC}(y,x)\).

  3. \(C(x,y) \Leftrightarrow \mathit{EC}(x,y) \vee Ov_{C}(x,y)\).

  4. \(\mathit{Ov}_{C}(x,y) \Leftrightarrow C(x,y) \wedge \neg \mathit{EC}(x,y)\).

  5. \(\neg \mathit{EC}(x,y) \Leftrightarrow [\mathit{Ov}_{C}(x,y) \Leftrightarrow C(x,y)]\): This is a logical rendering of our remark that in OVM no pair of things is in EC, hence \(\neg EC(x,y) = \mathit{TRUE}\) for each pair of things.

  6. \(\neg \exists z.\mathit{EC}(z,x) \Rightarrow \{\mathit{ingr}_{C}(x,y) \Leftrightarrow [\forall w.\mathit{Ov}_{C}(w,x) \Rightarrow \mathit{Ov}_{C}(w,y)]\}\).

A comment by way of proof. The implication

$$\displaystyle{\mathit{ingr}_{C}(x,y) \Rightarrow [\forall w.Ov_{C}(w,x) \Rightarrow Ov_{C}(w,y)]}$$

is always true. Thus, it remains to assume that (i) \(\neg \exists z.EC(z,x)\) and to prove that

$$\displaystyle{({\ast})\ [\forall w.Ov_{C}(w,x) \Rightarrow Ov_{C}(w,y)] \Rightarrow \mathit{ingr}_{C}(x,y)}$$

(i) can be written down as (ii) \(\forall z.\neg C(z,x) \vee Ov_{C}(z,x)\). To prove that ingrC(x, y) it should be verified that (iii) \(\forall z.(C(z,x) \Rightarrow C(z,y))\).

Consider an arbitrary thing z′; either \(\neg C(z^{\prime},x)\), in which case the implication in (iii) is satisfied with z′, or OvC(z′, x), hence OvC(z′, y) by the assumed premise in (*), which implies C(z′, y). The implication (iii) is proved and this concludes the proof.
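To see EC at work, consider a model richer than OVM: things are non-degenerate closed intervals with endpoints on a finite grid, and C holds when the intervals meet as point sets. The Python sketch below is ours for illustration only (the grid and the finite family keep every quantifier in (10.8)–(10.12) a finite search); it exhibits a pair of externally connected things.

```python
# Things: non-degenerate closed intervals [a, b] on a finite grid
# (grid and family are our assumptions); C = nonempty intersection.
GRID = range(9)
U = [(a, b) for a in GRID for b in GRID if a < b]

def C(x, y):                # connection: the intervals intersect
    return max(x[0], y[0]) <= min(x[1], y[1])

def ingr_C(x, y):           # (10.8), with z ranging over U
    return all(C(z, y) for z in U if C(z, x))

def Ov_C(x, y):             # (10.10): a common ingredient exists
    return any(ingr_C(z, x) and ingr_C(z, y) for z in U)

def EC(x, y):               # (10.12): contact without overlap
    return C(x, y) and not Ov_C(x, y)

# Abutting intervals share only a boundary point: externally connected.
assert EC((0, 2), (2, 4))
# Intervals with a common sub-interval genuinely overlap: no EC.
assert not EC((0, 2), (1, 4))
```

In this model EC is non-vacuous, in contrast with OVM, because two abutting intervals are connected yet have no non-degenerate interval in common.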

The richer structure of connection based calculus allows for some notions of a topological nature; the first is the notion of a tangential ingredient, TingrC(x, y), defined by means of

$$\displaystyle{ \mathit{TI}\ \mathit{Tingr}_{C}(x,y) \Leftrightarrow \mathit{ingr}_{C}(x,y) \wedge \exists z.\mathit{EC}(z,x) \wedge \mathit{EC}(z,y) }$$
(10.13)

Basic properties of tangential parts follow by (10.13)

  1. \(\exists z.\mathit{EC}(z,x) \Rightarrow \mathit{Tingr}_{C}(x,x)\).

  2. \(\neg \exists z.\mathit{EC}(z,y) \Rightarrow \neg \exists x.\mathit{Tingr}_{C}(x,y)\).

  3. \(\mathit{Tingr}_{C}(z,x) \wedge \mathit{ingr}_{C}(z,y) \wedge \mathit{ingr}_{C}(y,x) \Rightarrow \mathit{Tingr}_{C}(y,x)\).

For Property 3, some argument may be in order: consider w such that EC(w, x) and EC(w, z), existing by TingrC(z, x); hence, C(w, y). As \(\neg Ov_{C}(w,x)\) and ingrC(y, x), it follows that \(\neg Ov_{C}(w,y)\), hence EC(w, y), and TingrC(y, x) follows.

These properties witness the fact that if there is some thing externally connected to x, then x is its tangential ingredient. This fact shows that the notion of a tangential ingredient falls short of the idea of a boundary. Dually, in absence of things externally connected to y, no ingredient of y can be a tangential ingredient.

A thing y is a non–tangential ingredient of a thing x, NTingrC(y, x), in case it is an ingredient but not a tangential ingredient of x,

$$\displaystyle{ \mathit{NTI}\ \mathit{NTingr}_{C}(y,x) \Leftrightarrow \neg \mathit{Tingr}_{C}(y,x) \wedge \mathit{ingr}_{C}(y,x) }$$
(10.14)

Basic properties of the operator NTI are

  1. \(\mathit{NTingr}_{C}(y,x) \Rightarrow \forall z.\neg \mathit{EC}(z,y) \vee \neg \mathit{EC}(z,x)\).

  2. \(\neg \exists z.\mathit{EC}(z,x) \Rightarrow \mathit{NTingr}_{C}(x,x)\).

In the absence of externally connected things, each thing is a non–tangential ingredient of itself; hence, in the model OVM each thing is its own non–tangential ingredient and has no tangential ingredients. To produce models in which EC, NTingrC, and TingrC are exhibited, we resort to topology, see Sect. 10.4.3.
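Anticipating the topological models of Sect. 10.4.3, a model of non-degenerate closed intervals on a finite grid, with C as nonempty intersection, already exhibits both kinds of ingredients. A Python sketch (the grid and the finite family are our assumptions, made so the brute-force check terminates):

```python
# Things: non-degenerate closed intervals on a finite grid (assumed).
GRID = range(9)
U = [(a, b) for a in GRID for b in GRID if a < b]

def C(x, y):
    return max(x[0], y[0]) <= min(x[1], y[1])

def ingr_C(x, y):           # (10.8)
    return all(C(z, y) for z in U if C(z, x))

def Ov_C(x, y):             # (10.10)
    return any(ingr_C(z, x) and ingr_C(z, y) for z in U)

def EC(x, y):               # (10.12)
    return C(x, y) and not Ov_C(x, y)

def Tingr_C(x, y):          # (10.13): a witness z is EC to both
    return ingr_C(x, y) and any(EC(z, x) and EC(z, y) for z in U)

def NTingr_C(x, y):         # (10.14)
    return ingr_C(x, y) and not Tingr_C(x, y)

# [2,4] touches the boundary of [2,8]; the interval [0,2] is externally
# connected to both, so [2,4] is a tangential ingredient of [2,8].
assert Tingr_C((2, 4), (2, 8))
# [4,6] sits strictly inside [2,8]: a non-tangential ingredient.
assert NTingr_C((4, 6), (2, 8))
```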

Further properties of the predicate NTingrC are

  1. \(\mathit{NTingr}_{C}(y,x) \wedge C(z,y) \Rightarrow C(z,x)\).

  2. \(\mathit{NTingr}_{C}(y,x) \wedge Ov_{C}(z,y) \Rightarrow Ov_{C}(z,x)\).

  3. \(\mathit{NTingr}_{C}(y,x) \wedge C(z,y) \Rightarrow Ov_{C}(z,x)\).

  4. \(\mathit{ingr}_{C}(y,x) \wedge \mathit{NTingr}_{C}(x,z) \Rightarrow \mathit{NTingr}_{C}(y,z)\).

  5. \(\mathit{ingr}_{C}(y,z) \wedge \mathit{NTingr}_{C}(x,y) \Rightarrow \mathit{NTingr}_{C}(x,z)\).

  6. \(\mathit{NTingr}_{C}(y,z) \wedge \mathit{NTingr}_{C}(z,x) \Rightarrow \mathit{NTingr}_{C}(y,x)\).

For Property 3, from already known \(\forall z.\neg \mathit{EC}(z,y) \vee \neg \mathit{EC}(z,x)\), it follows

$$\displaystyle{(i)\ \forall w.\neg C(w,x) \vee Ov_{C}(w,x) \vee \neg C(w,y) \vee \mathit{Ov}_{C}(w,y)}$$

As C(z, y), one obtains C(z, x) by Property 1. Thus, by (i), \(\mathit{Ov}_{C}(z,y) \vee \mathit{Ov}_{C}(z,x)\), and in either case OvC(z, x), by Property 2.

For Property 4, assume ingrC(y, x) and ingrC(x, z), hence ingrC(y, z) (otherwise there is nothing to prove); suppose ¬NTingrC(y, z), i.e., for some w: EC(w, z), EC(w, y). Thus, C(w, z), ¬OvC(w, z), C(w, y), ¬OvC(w, y). Then C(w, x) and ¬OvC(w, x), hence EC(w, x) and ¬NTingrC(x, z), a contradiction. Similarly, one justifies Properties 5 and 6.

3 Rough Mereology

A scheme of mereology, introduced into a collection of things, sets an exact hierarchy of things, of which some are (exact) parts of others; yet to ascertain whether a thing is an exact part of some other thing is, in practical cases, often difficult if possible at all: e.g., a robot sensing the environment by means of a camera or a laser range sensor cannot perceive obstacles or navigation beacons exactly. Such an evaluation can be done only approximately, and one can discuss such situations only up to a degree of certainty. Thus, one departs from the exact reasoning scheme given by decomposition into parts to a scheme which approximates the exact scheme but does not observe it exactly.

Such a scheme, albeit with conclusions expressed in an approximate language, can be more reliable, as its users are aware of the uncertainty of its statements and can take appropriate measures to fend off possible consequences.

Imagine two robots using the language of connection mereology for describing mutual relations; when endowed with touch sensors, they can ascertain the moment when they are connected; when a robot has as its goal to enter a certain area, it can ascertain that it is connected to the area, or overlaps with it, or is a part of the area, but it has no means to describe its position more precisely.

Introducing some measure of overlapping, in other words of the extent to which one thing is a part of the other, would allow for a more precise description of relative position and would add expressive power to the language of mereology. Rough mereology answers these demands by introducing the notion of a part to a degree, with the degree expressed as a real number in the interval [0, 1]. Any notion of a part by necessity relates to the general idea of containment; thus the notion of a part to a degree is related to the idea of partial containment, and it should preserve the essential intuitive postulates about the latter.

The predicate of a part to a degree stems ideologically from, and has as one of its motivations, the predicate of an element to a degree introduced by L. A. Zadeh as the basis for fuzzy set theory (Zadeh, 1965); in this sense, rough mereology is to mereology as fuzzy set theory is to naïve set theory. To rough set theory, rough mereology owes its interest in concepts as things of analysis.

The primitive notion of rough mereology is the notion of a rough inclusion, which is a ternary predicate μ(x, y, r), where x, y are things and r ∈ [0, 1], read ‘the thing x is a part to a degree of at least r of the thing y’. Any rough inclusion is associated with a mereological scheme based on the notion of a part, by postulating that μ(x, y, 1) is equivalent to ingr(x, y), where the ingredient relation is defined by the adopted mereological scheme. Other postulates about rough inclusions stem from intuitions about the nature of partial containment; these intuitions can be manifold, and, a fortiori, postulates about rough inclusions may vary. In our scheme for rough mereology, we begin with some basic postulates which provide a most general framework. When needed, other postulates, narrowing the variety of possible models, can be introduced.

3.1 Rough Inclusions

We have already stated that a rough inclusion is a ternary predicate μ(x, y, r). We assume that a collection of things is given, on which a part relation π is introduced with the associated ingredient relation ingr. We thus apply inference schemes of mereology due to Leśniewski, presented above.

Predicates μ(x, y, r) were introduced in Polkowski and Skowron (1994, 1997); they satisfy the following postulates, relative to a given part relation π and the relation ingr of an ingredient induced by π, on a set of things

RINC1 \(\mu (x,y,1) \Leftrightarrow \mathit{ingr}(x,y).\)

This postulate asserts that parts to degree of 1 are ingredients.

RINC2 \(\mu (x,y,1) \Rightarrow \forall z[\mu (z,x,r) \Rightarrow \mu (z,y,r)].\)

This postulate expresses the feature of partial containment that a ‘bigger’ thing contains a given thing ‘more’ than a ‘smaller’ thing does. It can be called a monotonicity condition for rough inclusions.

RINC3 \(\mu (x,y,r) \wedge s < r \Rightarrow \mu (x,y,s)\).

This postulate specifies the meaning of the phrase ‘a part to a degree of at least r’. From postulates RINC1–RINC3 and known properties of ingredients, some consequences follow

  1. μ(x, x, 1).

  2. \(\mu (x,y,1) \wedge \mu (y,z,1) \Rightarrow \mu (x,z,1)\).

  3. \(\mu (x,y,1) \wedge \mu (y,x,1) \Leftrightarrow x = y\).

  4. \(x\neq y \Rightarrow \neg \mu (x,y,1) \vee \neg \mu (y,x,1)\).

  5. \(\forall z\forall r[\mu (z,x,r) \Leftrightarrow \mu (z,y,r)] \Rightarrow x = y\).

Property 5 may be regarded as an extensionality postulate in rough mereology.
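A standard concrete rough inclusion, measuring the fraction of x contained in y on finite nonempty sets, can be tested against RINC1–RINC3 by brute force. A Python sketch (the three-element domain and the discrete grid of degrees are assumptions made to keep the check finite):

```python
from itertools import chain, combinations

# Things: nonempty subsets of {0, 1, 2} with ingr = inclusion (assumed
# finite model); mu(x, y, r) measures the fraction of x lying inside y.
U = [frozenset(s) for s in chain.from_iterable(
    combinations(range(3), k) for k in range(1, 4))]
DEGREES = [i / 6 for i in range(7)]           # 0, 1/6, ..., 1

def ingr(x, y):
    return x <= y

def mu(x, y, r):
    """mu(x, y, r): x is a part of y to degree at least r."""
    return len(x & y) / len(x) >= r

# RINC1: parts to degree 1 are exactly the ingredients
assert all(mu(x, y, 1) == ingr(x, y) for x in U for y in U)
# RINC2: monotonicity in the second argument
assert all(mu(z, y, r)
           for x in U for y in U if mu(x, y, 1)
           for z in U for r in DEGREES if mu(z, x, r))
# RINC3: 'at least r' is downward closed in r
assert all(mu(x, y, s)
           for x in U for y in U for r in DEGREES if mu(x, y, r)
           for s in DEGREES if s < r)
```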

By a model for rough mereology, we mean a quadruple

$$\displaystyle{M = (V _{M},\pi _{M},\mathit{ingr}_{M},\mu _{M})}$$

where VM is a set with a part relation \(\pi _{M} \subseteq V _{M} \times V _{M}\), the associated ingredient relation \(\mathit{ingr}_{M} \subseteq V _{M} \times V _{M}\), and a relation \(\mu _{M} \subseteq V _{M} \times V _{M} \times [0,1]\) which satisfies RINC1–RINC3.

We now describe some models for rough mereology which at the same time give us methods by which we can define rough inclusions, see Polkowski (2002, 2003, 2004a,b, 2005a, 2007, 2008, 2009a), a detailed discussion may be found in Polkowski (2011).

3.1.1 Rough Inclusions from t–norms

We resort to continuous t–norms, i.e., continuous functions T: [0, 1]2 → [0, 1] which are (1) symmetric, (2) associative, (3) increasing in each coordinate, and (4) satisfy the boundary conditions \(T(x,0) = 0,T(x,1) = x\), cf. Polkowski (2011), Chs. 4 and 6, Hájek (1998), Ch. 2. Classical examples of continuous t–norms are

  1. \(L(x,y) = \mathit{max}\{0,x + y - 1\}\) (the Łukasiewicz t–norm).

  2. P(x, y) = x ⋅ y (the product t–norm).

  3. M(x, y) = min{x, y} (the minimum t–norm).
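As a computational aside, the three t–norms above may be coded directly; the following Python sketch (function names are ours, introduced only for illustration) checks the boundary conditions on a sample argument.

```python
# Sketch (function names ours): the three classical continuous t-norms.
def t_lukasiewicz(x, y):
    return max(0.0, x + y - 1.0)

def t_product(x, y):
    return x * y

def t_minimum(x, y):
    return min(x, y)

# boundary conditions T(x, 0) = 0 and T(x, 1) = x hold for each of them:
for T in (t_lukasiewicz, t_product, t_minimum):
    assert T(0.7, 0.0) == 0.0
    assert T(0.7, 1.0) == 0.7
```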

The residual implication ⇒ T induced by a continuous t–norm T is defined as

$$\displaystyle{ x \Rightarrow _{T}y = \mathit{max}\{z: T(x,z) \leq y\} }$$
(10.15)

One proves that \(\mu _{T}(x,y,r) \Leftrightarrow x \Rightarrow _{T}y \geq r\) is a rough inclusion; particular cases are

  1. \(\mu _{L}(x,y,r) \Leftrightarrow \mathit{min}\{1,1 - x + y\} \geq r\) (the Łukasiewicz implication).

  2. \(\mu _{P}(x,y,r) \Leftrightarrow \frac{y} {x} \geq r\) when x > 0, μP(x, y, 1) when x = 0 (the Goguen implication).

  3. \(\mu _{M}(x,y,r) \Leftrightarrow y \geq r\) when x > 0, μM(x, y, 1) when x = 0 (the Gödel implication).
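The residua of the three t–norms can be written in closed form; the Python sketch below (function names ours) spot–checks the defining property of the residuum (10.15) on sample arguments.

```python
# Sketch (function names ours): residual implications of the three t-norms.
def r_lukasiewicz(x, y):      # residuum of L: min{1, 1 - x + y}
    return min(1.0, 1.0 - x + y)

def r_goguen(x, y):           # residuum of the product t-norm P
    return 1.0 if x <= y else y / x

def r_godel(x, y):            # residuum of the minimum t-norm M
    return 1.0 if x <= y else y

# the defining property: x =>_T y is the largest z with T(x, z) <= y
assert max(0.0, 0.8 + r_lukasiewicz(0.8, 0.5) - 1.0) <= 0.5
assert 0.8 * r_goguen(0.8, 0.4) <= 0.4
assert min(0.8, r_godel(0.8, 0.4)) <= 0.4
```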

A particular class of continuous t–norms consists of Archimedean t–norms, which satisfy the inequality T(x, x) < x for each x ∈ (0, 1). It is well–known, see Ling (1965), that each Archimedean t–norm T admits a representation

$$\displaystyle{ T(x,y) = g_{T}(f_{T}(x) + f_{T}(y)) }$$
(10.16)

where the function fT: [0, 1] → R is continuous, decreasing, with fT(1) = 0, and gT: R → [0, 1] is the pseudo–inverse of fT, i.e., \(g_{T} \circ f_{T} = \mathit{id}\).

It is known, cf., e.g., Hájek (1998), that up to an isomorphism there are two Archimedean t–norms: L and P. Their representations are

$$\displaystyle{ f_{L}(x) = 1 - x;\ g_{L}(y) = 1 - y }$$
(10.17)

and

$$\displaystyle{ f_{P}(x) = -ln\ x;\ g_{P}(y) = exp(-y) }$$
(10.18)
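These representations can be checked numerically; the sketch below (names ours) uses the generators \(f_{L}(x) = 1 - x\) and \(f_{P}(x) = -ln\ x\), clamps \(g_{L}\) to realize the pseudo–inverse, and requires positive arguments in the product case.

```python
import math

# Sketch of representation (10.16): T(x, y) = g(f(x) + f(y)).
f_L = lambda x: 1.0 - x
g_L = lambda y: max(0.0, 1.0 - y)   # pseudo-inverse of f_L on [0, infinity)
f_P = lambda x: -math.log(x)        # requires x > 0
g_P = lambda y: math.exp(-y)

def T_from_representation(f, g, x, y):
    return g(f(x) + f(y))

# recovers max{0, x + y - 1} and x * y respectively:
assert T_from_representation(f_L, g_L, 0.3, 0.4) == 0.0
assert abs(T_from_representation(f_P, g_P, 0.5, 0.4) - 0.2) < 1e-9
```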

For an Archimedean t–norm T, we define the rough inclusion μT on the interval [0, 1] by means of

$$\displaystyle{ (ari)\ \mu ^{T}(x,y,r) \Leftrightarrow g_{ T}(\vert x - y\vert ) \geq r }$$
(10.19)

equivalently,

$$\displaystyle{ \mu ^{T}(x,y,r) \Leftrightarrow \vert x - y\vert \leq f_{ T}(r) }$$
(10.20)

It follows from (10.20) that μT satisfies conditions RINC1–RINC3 with the ingredient relation ingr being the identity =.

To give a hint of proof: for RINC1, μT(x, y, 1) if and only if \(\vert x - y\vert \leq f_{T}(1) = 0\), hence, if and only if x = y. This implies RINC2. For RINC3: in case s < r and | x − y | ≤ fT(r), one has fT(r) ≤ fT(s), hence | x − y | ≤ fT(s).

Specific recipes are

$$\displaystyle{ \mu ^{L}(x,y,r) \Leftrightarrow \vert x - y\vert \leq 1 - r }$$
(10.21)

and

$$\displaystyle{ \mu ^{P}(x,y,r) \Leftrightarrow \vert x - y\vert \leq -ln\ r }$$
(10.22)

Both residual and Archimedean rough inclusions satisfy the transitivity condition

$$\displaystyle{\mathrm{(Trans)}\ \mathit{if }\mu (x,y,r)\ \mathit{and}\ \mu (y,z,s),\mathit{then}\ \mu (x,z,T(r,s)).}$$

By way of proof, assume, e.g., μT(x, y, r) and μT(y, z, s), i.e., | x − y | ≤ fT(r) and | y − z | ≤ fT(s). Hence, \(\vert x - z\vert \leq \vert x - y\vert + \vert y - z\vert \leq f_{T}(r) + f_{T}(s)\), hence, \(g_{T}(\vert x - z\vert ) \geq g_{T}(f_{T}(r) + f_{T}(s)) = T(r,s)\), i.e., μT(x, z, T(r, s)). The other cases go along the same lines. Let us observe that rough inclusions of the form (ari) are also symmetric.
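The two recipes (10.21), (10.22) and the condition (Trans) lend themselves to a direct numeric check; the Python sketch below (function names ours) verifies one instance of transitivity for μL and the symmetry of μP.

```python
import math

# Sketch (names ours): the (ari) rough inclusions (10.21) and (10.22).
def mu_L(x, y, r):
    return abs(x - y) <= 1.0 - r

def mu_P(x, y, r):
    return abs(x - y) <= -math.log(r)   # requires r > 0

def t_luk(r, s):
    return max(0.0, r + s - 1.0)

# an instance of (Trans): mu_L(x,y,r), mu_L(y,z,s) imply mu_L(x,z,T(r,s))
x, y, z, r, s = 0.1, 0.25, 0.5, 0.8, 0.7
assert mu_L(x, y, r) and mu_L(y, z, s)
assert mu_L(x, z, t_luk(r, s))
# symmetry of an (ari) rough inclusion:
assert mu_P(0.3, 0.5, 0.8) == mu_P(0.5, 0.3, 0.8)
```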

3.1.2 Rough Inclusions in Information Systems (Data Tables)

An important domain where rough inclusions will play a dominant role in our analysis of reasoning by means of parts is the realm of information systems of Pawlak (1991), cf., Polkowski (2011), Ch. 6. We will define information rough inclusions denoted with a generic symbol μI.

We recall that an information system (a data table) is represented as a pair (U, A) where U is a finite set of things and A is a finite set of attributes; each attribute a: U → V maps the set U into the value set V. For an attribute a and a thing v, a(v) is the value of a on v.

For things u, v the discernibility set DIS(u, v) is defined as

$$\displaystyle{ \mathit{DIS}(u,v) =\{ a \in A: a(u)\neq a(v)\} }$$
(10.23)

For an (ari) μT, we define a rough inclusion μTI by means of

$$\displaystyle{ (\mathit{airi})\ \mu _{T}^{I}(u,v,r) \Leftrightarrow g_{ T}(\frac{\vert \mathit{DIS}(u,v)\vert } {\vert A\vert } ) \geq r }$$
(10.24)

Then, μTI is a rough inclusion with the associated ingredient relation of identity and the part relation empty.

For the Łukasiewicz t–norm, the airi μLI is given by means of the formula

$$\displaystyle{ \mu _{L}^{I}(u,v,r) \Leftrightarrow 1 -\frac{\vert \mathit{DIS}(u,v)\vert } {\vert A\vert } \geq r }$$
(10.25)

We introduce the set IND(u, v) = A ∖ DIS(u, v). With its help, we obtain a new form of (10.25)

$$\displaystyle{ \mu _{L}^{I}(u,v,r) \Leftrightarrow \frac{\vert \mathit{IND}(u,v)\vert } {\vert A\vert } \geq r }$$
(10.26)

The formula (10.26) witnesses that reasoning based on the rough inclusion μLI is probabilistic in character; this line of reasoning goes back to Łukasiewicz (1970). Each (airi)–type rough inclusion μTI satisfies the transitivity condition (Trans) and is symmetric.
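The computation of μLI can be illustrated on a toy information system; the data table below (objects u1–u3, attributes a, b, c) is invented purely for illustration, and the function names are ours.

```python
# Sketch: the (airi) rough inclusion mu_L^I of (10.25) on a toy data table.
U = {"u1": {"a": 1, "b": 0, "c": 1},
     "u2": {"a": 1, "b": 1, "c": 1},
     "u3": {"a": 0, "b": 1, "c": 0}}
A = ["a", "b", "c"]

def DIS(u, v):
    # the discernibility set (10.23)
    return {a for a in A if U[u][a] != U[v][a]}

def mu_LI(u, v, r):
    # mu_L^I(u, v, r) iff 1 - |DIS(u, v)| / |A| >= r
    return 1.0 - len(DIS(u, v)) / len(A) >= r

assert DIS("u1", "u2") == {"b"}
assert mu_LI("u1", "u2", 2/3)      # u1, u2 agree on 2 of 3 attributes
assert not mu_LI("u1", "u3", 0.5)  # u1, u3 differ on every attribute
```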

3.1.3 Rough Inclusions on Sets and Measurable Sets

Formula (10.26) can be abstracted to set and geometric domains. For finite sets A, B,

$$\displaystyle{ \mu ^{S}(A,B,r) \Leftrightarrow \frac{\vert A \cap B\vert } {\vert A\vert } \geq r }$$
(10.27)

where | X | denotes the cardinality of X, defines a rough inclusion μS. For bounded measurable sets A, B in a Euclidean space En,

$$\displaystyle{ \mu ^{G}(A,B,r) \Leftrightarrow \frac{\vert \vert A \cap B\vert \vert } {\vert \vert A\vert \vert } \geq r }$$
(10.28)

where | | A | | denotes the area (the Lebesgue measure) of the region A, defines a rough inclusion μG. Both μS, μG are symmetric but not transitive.
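For finite sets, (10.27) is a one–line computation; the sketch below (names ours) illustrates the degrees it produces, and the measure–theoretic version (10.28) is entirely analogous with areas in place of cardinalities.

```python
# Sketch of the set rough inclusion mu^S (10.27); A must be non-empty.
def mu_S(A, B, r):
    return len(A & B) / len(A) >= r

A, B = {1, 2, 3, 4}, {3, 4, 5}
assert mu_S(A, B, 0.5)        # |A ∩ B| / |A| = 2/4 = 0.5
assert not mu_S(A, B, 0.75)
```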

Other rough inclusions and their weaker variants will be defined in later chapters.

4 Mereotopology and Mereogeometry

Both mereology and topology address problems of mutual relations among things like ‘being external’, ‘being inside’ etc., hence, as the language of topology is well established, it is desirable to trace topological constructs in mereological universa. First, we would like to introduce the reader to rudiments of topology necessary in order to follow our exposition.

4.1 A Topological Background

We begin with the notion of a topological space, which is a pair (X, τ) where X is a set and τ a family of subsets of X; sets in the family τ are called open sets provided (1) τ is closed under finite intersections, and (2) unions of sub–families of τ belong to τ. Examples are provided, e.g., by metric spaces; given a metric ρ on a set X, open balls are defined as sets of the form B(x, r) = { y ∈ X: ρ(x, y) < r} for x ∈ X and r > 0. Open sets are defined in this case as unions of families of open balls.

Closed sets are complements to open sets; a set \(C \subseteq X\) is closed if and only if the set X ∖ C is open. Clearly, intersections of arbitrary families of closed sets are closed and finite unions of closed sets are closed.

4.1.1 Approximations: Interior and Closure of a Set

In a given topological space, (X, τ), open as well as closed sets may be called definable as the membership problem for them is decidable; other sets can be approximated only by open, respectively, closed sets. To this end, topology offers operators of interior, Int, respectively, of closure, Cl. The operator Int provides the approximation from below, whereas Cl yields the approximation from above.

For a subset \(A \subseteq X\), the interior of A is \(\mathit{IntA} =\bigcup \{ V \in \tau: V \subseteq A\}\); the dual operator of closure can be defined as \(\mathit{ClA} = X\setminus \mathit{Int}(X\setminus A)\). The well–known properties of those operators follow (1) \(\mathit{Int}\varnothing = \varnothing \) (ClX = X). (2) \(\mathit{IntA} \subseteq A\) (\(A \subseteq \mathit{ClA}\)). (3) IntIntA = IntA (ClClA = ClA). (4) \((A \subseteq B) \Rightarrow (\mathit{Int}A \subseteq \mathit{IntB})\) (\(A \subseteq B \Rightarrow \mathit{ClA} \subseteq ClB\)), where formulas in parentheses give properties of the closure operator, dual to those of the interior operator. Properties 1–4 may be taken as axioms for interior, respectively, closure, operators. Each set \(A \subseteq X\) is sandwiched between IntA and ClA.
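The interplay of Int and Cl can be observed on a small finite topological space; the Python sketch below (space and names ours) computes both operators from their definitions and checks the listed properties on every subset.

```python
from itertools import chain, combinations

# A small finite topological space, invented for illustration.
X = frozenset({1, 2, 3})
tau = [frozenset(), frozenset({1}), frozenset({1, 2}), X]  # a topology on X

def Int(A):
    # union of all open sets contained in A
    return frozenset().union(*[V for V in tau if V <= A])

def Cl(A):
    # the dual operator ClA = X \ Int(X \ A)
    return X - Int(X - A)

def subsets(S):
    return [frozenset(c) for c in
            chain.from_iterable(combinations(sorted(S), k)
                                for k in range(len(S) + 1))]

for A in subsets(X):
    assert Int(A) <= A <= Cl(A)       # property 2 and its dual
    assert Int(Int(A)) == Int(A)      # property 3
    assert Cl(Cl(A)) == Cl(A)
    for B in subsets(X):
        if A <= B:                    # property 4 (monotonicity)
            assert Int(A) <= Int(B) and Cl(A) <= Cl(B)
```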

4.1.2 Boundaries

The difference ClA ∖ IntA, denoted BdA, is the boundary of A. Clearly, \(\mathit{BdA} = \mathit{ClA} \cap \mathit{Cl}(X\setminus A)\). In case BdA = ∅, the set A is closed–open (clopen); A is meager in case IntA = ∅, and A is nowhere–dense in case IntClA = ∅. Let us observe that the boundary operator requires for its definition either the interior and complement or the closure and complement operators, i.e., it is of open–and–closed character.

We close this paragraph with an essential property of Int and Cl operators

$$\displaystyle{ V \cap \mathit{ClA} \subseteq \mathit{Cl}(V \cap A) }$$
(10.29)

for each set A and each open set V in a topological space (X, τ). For the proof, it suffices to observe that given \(x \in V \cap \mathit{ClA}\) and an arbitrary open W ∋ x, one has \((W \cap V ) \cap A\neq \varnothing \), i.e., \(W \cap (V \cap A)\neq \varnothing \), hence, \(x \in \mathit{Cl}(V \cap A)\).

Topological spaces are also classified with respect to their separation properties: the T0 (or Kolmogorov) property consists in Cl{x} ≠ Cl{y} for distinct points x ≠ y; T1 means Cl{x} = {x} for each x; T2 (Hausdorff) means that each pair of distinct points can be separated by disjoint open sets; T3 (regularity) means that each pair x, F, with F closed and x ∉ F, can be separated by disjoint open sets; a topological space (X, τ) is regular if and only if for each pair x ∈ V, where V is open, there exists an open W such that \(x \in W \subseteq \mathit{ClW } \subseteq V\).

Of interest to us are particular sub–categories of open or closed sets. We introduce regular closed sets which allow for non–trivial examples of C–mereological operators of overlap, interior and closure.

4.2 Regular Open and Regular Closed Sets

A set A in a topological space (X, τ) is regular open if it is of the form IntClB for some set B; then, by property 3 of the operators Int, Cl, IntClA = IntClB, i.e., A = IntClA. Hence, regular open sets are characterized by the identity A = IntClA. Dually, a set C is regular closed if it satisfies the identity C = ClIntC. It follows that A is regular open (resp. regular closed) if and only if the set X ∖ A is regular closed (resp. regular open). As the set Int(ClV ∖ V ) is empty for each open V, each regular open or regular closed set has a nowhere–dense boundary.

Examples of regular closed sets are closed disks in the 2D–space or closed balls in the 3D–space, and their open complements are examples of regular open sets; more generally, compact convex regions with non–empty interiors in Euclidean spaces are regular closed, and their interiors are specimens of regular open sets.

A very basic property of regular sets is that they form complete Boolean algebras; regular open sets form a complete Boolean algebra, denoted RO(X). In order to justify this claim, we let

$$\displaystyle{A^{\perp } = X\setminus \mathit{ClA}}$$

The set A is regular open if and only if A = A ⊥ ⊥ . Indeed, \(A^{\perp \perp } = X\setminus \mathit{Cl}(X\setminus \mathit{ClA}) = \mathit{IntClA}\). Properties of the operation A ⊥  are (in proofs, one uses (10.29))

  1. If \(A \subseteq B\), then \(B^{\perp }\subseteq A^{\perp }\).

  2. If A is an open set, then \(A \subseteq A^{\perp \perp }\).

  3. If A is an open set, then \(A^{\perp } = A^{\perp \perp \perp }\), hence, \(A^{\perp \perp } = A^{\perp \perp \perp \perp }\).

  4. If A, B are open sets, then \((A \cap B)^{\perp \perp } = A^{\perp \perp }\cap B^{\perp \perp }\).

  5. \((A \cup B)^{\perp } = A^{\perp }\cap B^{\perp }\).

  6. If A is an open set, then \((A \cup A^{\perp })^{\perp \perp } = X\).
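The operation \(A^{\perp }\) and the identity \(A^{\perp \perp } = \mathit{IntClA}\) can be exercised on the same kind of small finite space as before; the sketch below (space and names ours) exhibits a set that is regular open and an open set that is not.

```python
# Sketch: A^perp = X \ ClA on a small finite topological space (ours).
X = frozenset({1, 2, 3})
tau = [frozenset(), frozenset({1}), frozenset({2}), frozenset({1, 2}), X]

def Int(A):
    return frozenset().union(*[V for V in tau if V <= A])

def Cl(A):
    return X - Int(X - A)

def perp(A):
    return X - Cl(A)

# {1} is regular open: perp(perp({1})) = IntCl({1}) = {1}
assert perp(perp(frozenset({1}))) == frozenset({1})
# {1, 2} is open but not regular open: IntCl({1, 2}) = X
assert perp(perp(frozenset({1, 2}))) == X
```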

Now, we define in the family RO(X) of regular open sets operations \(\wedge,\vee,{\prime}\)

  1. \(A \vee B = (A \cup B)^{\perp \perp } = \mathit{IntCl}(A \cup B)\).

  2. \(A \wedge B = A \cap B\).

  3. \(A^{{\prime}} = A^{\perp } = X\setminus \mathit{ClA}\).

and constants \(\mathbf{0} = \varnothing,\mathbf{1} = X\).

All operations listed above yield regular open sets by properties of (. ) ⊥ . It remains to check that the axioms of a Boolean algebra are satisfied. The commutativity laws \(A \vee B = B \vee A,A \wedge B = B \wedge A\) are satisfied evidently. The laws \(A \vee \mathbf{0} = A,A \wedge \mathbf{1} = A\) are also manifest. We have \(A \wedge A^{{\prime}} = A \cap A^{\perp } = A\setminus \mathit{ClA} = \varnothing = \mathbf{0}\) as well as \(A \vee A^{{\prime}} = (A \cup A^{\perp })^{\perp \perp } = X = \mathbf{1}\) by Property 6. The distributive laws \(A \vee (B \wedge C) = (A \vee B) \wedge (A \vee C)\) and \(A \wedge (B \vee C) = (A \wedge B) \vee (A \wedge C)\) hold by Properties 4 and 5.

A particular sub–algebra of RO(X) is the algebra CO(X) of clopen sets in X. In the case of CO(X), the Boolean operations \(\vee,\wedge,{\prime}\) specialize to the usual set–theoretic operations of union, intersection and complement, i.e., CO(X) is a field of sets.

The basic distinction between RO(X) and CO(X) is the fact that RO(X) is a complete Boolean algebra for any X, whereas CO(X) need not be such.

Let us observe that the boolean ordering relation ≤ is in this case the inclusion \(\subseteq \).

Consider \(\mathcal{A}\subseteq \mathit{RO}(X)\). Let \(s(\mathcal{A}) = (\bigcup \mathcal{A})^{\perp \perp }\); we check that \(s(\mathcal{A})\) is the supremum of \(\mathcal{A}\).

Indeed, for \(A \in \mathcal{A}\), we have \(A \subseteq \bigcup \mathcal{A}\), hence \(A = A^{\perp \perp }\subseteq (\bigcup \mathcal{A})^{\perp \perp }\), i.e., \(A \leq s(\mathcal{A})\). It follows that \(s(\mathcal{A})\) is an upper bound for \(\mathcal{A}\).

Now, assume that B ∈ RO(X) is an upper bound for \(\mathcal{A}\), i.e., \(A \subseteq B\) for each \(A \in \mathcal{A}\). Hence \(\bigcup (\mathcal{A}) \subseteq B\) and thus \((\bigcup \mathcal{A})^{\perp \perp }\subseteq B^{\perp \perp } = B\) i.e. \(s(\mathcal{A}) \leq B\) proving that \(s(\mathcal{A})\) is the supremum of \(\mathcal{A}\). Finally, by duality it follows that \(i(\mathcal{A}) = (\bigcap \mathcal{A})^{\perp \perp }\) is the infimum of \(\mathcal{A}\).

By duality applied to the family RC(X) of regular closed sets in X, we obtain the dual proposition that RC(X) is a complete Boolean algebra under operations \(\wedge,\vee,{\prime}\) defined as follows

  1. \(A \vee B = A \cup B\).

  2. \(A \wedge B = \mathit{ClInt}(A \cap B)\).

  3. \(A^{{\prime}} = X\setminus \mathit{IntA}\).

and with constants \(\mathbf{0} = \varnothing,\mathbf{1} = X\).

It follows that RO(X), RC(X) are mereological categories as they are closed under formation of classes; contrariwise, closed sets do not form any mereological category; this is in part responsible for the difficulties with boundaries in mereology.

A complete axiomatization of mereotopology interpreted in regular open sets is given in Asher and Vieu (1995).

4.3 An Application: The Model ROM for Connection

We define in the space RO(X) of regular open sets in a regular space X the connection C by demanding that

$$\displaystyle{C(A,B) \Leftrightarrow \mathit{ClA} \cap \mathit{ClB}\neq \varnothing.}$$

For simplicity's sake, we assume that the regular space X is connected, so that no proper non–empty subset of it is clopen, equivalently, each proper non–empty subset has a non–empty boundary.

4.3.1 Ingredient in ROM

First, we investigate what ingrC means in ROM. By definition IC in (10.8), for A, B ∈ RO(X),

$$\displaystyle{\mathit{ingr}_{C}(A,B) \Leftrightarrow \forall Z \in \mathit{RO}(X).\mathit{ClZ} \cap \mathit{ClA}\neq \varnothing \Rightarrow \mathit{ClZ} \cap \mathit{ClB}\neq \varnothing.}$$

This excludes the case when \(A\setminus \mathit{ClB}\neq \varnothing \) as then we could find a Z ∈ RO(X) with

$$\displaystyle{Z \cap A\neq \varnothing = \mathit{ClZ} \cap \mathit{ClB}}$$

(as our space X is regular). It remains that \(A \subseteq \mathit{ClB}\), hence, \(A \subseteq \mathit{IntClB = B}\). It follows finally that in model ROM, \(\mathit{ingr}_{C}(A,B) \Leftrightarrow A \subseteq B\).

4.3.2 Overlap in ROM

Now, we can interpret overlapping in ROM. For A, B ∈ RO(X), OvC(A, B) means that there exists Z ∈ RO(X) such that \(Z \subseteq A\) and \(Z \subseteq B\), hence \(Z \subseteq A \cap B\), hence

$$\displaystyle{A \cap B\neq \varnothing.}$$

This condition is also sufficient by regularity of X. We obtain that in ROM,

$$\displaystyle{Ov_{C}(A,B) \Leftrightarrow A \cap B\neq \varnothing.}$$

4.3.3 External Connectedness in ROM

The status of EC in ROM is

$$\displaystyle{\mathit{EC}(A,B) \Leftrightarrow \mathit{ClA} \cap \mathit{ClB}\neq \varnothing \wedge A \cap B = \varnothing.}$$

This means that the closed sets ClA, ClB intersect only at their boundary points.

4.3.4 Tangential Ingredient in ROM

We can address the notion of a tangential ingredient: TingrC(A, B) means the existence of Z ∈ RO(X) such that

$$\displaystyle{\mathit{ClZ} \cap \mathit{ClA}\neq \varnothing \neq \mathit{ClZ} \cap \mathit{ClB}}$$

and

$$\displaystyle{Z \cap A = \varnothing = Z \cap B}$$

along with \(A \subseteq B\).

In case

$$\displaystyle{\mathit{ClA} \cap (\mathit{ClB}\setminus B)\neq \varnothing }$$

letting \(Z = X\setminus \mathit{ClB}\) we have

$$\displaystyle{\mathit{ClZ} = \mathit{Cl}(X\setminus \mathit{ClB})}$$

and

$$\displaystyle{BdZ = \mathit{ClZ}\setminus Z = \mathit{Cl}(X\setminus \mathit{ClB})\setminus (X\setminus \mathit{ClB})}$$

which in turn is equal to

$$\displaystyle{\mathit{Cl}(X\setminus \mathit{ClB}) \cap \mathit{ClB} = \mathit{Cl}(X\setminus B) \cap \mathit{ClB} = \mathit{BdB}.}$$

Hence, \(\mathit{ClB}\setminus B \subseteq \mathit{ClZ}\), and \(\mathit{ClZ}\,\cap \,\mathit{ClA}\neq \varnothing \); a fortiori, \(\mathit{ClB}\,\cap \,\mathit{ClZ}\neq \varnothing \). As \(Z\,\cap \,B = \varnothing \), a fortiori \(Z \cap A = \varnothing \) follows.

We know, then, that

$$\displaystyle{\mathit{ClA} \cap (\mathit{ClB}\setminus B)\neq \varnothing \Rightarrow \mathit{Tingr}_{C}(A,B)}$$

Were, to the contrary, \(\mathit{ClA} \subseteq B\), then from \(\mathit{ClZ} \cap \mathit{ClA}\neq \varnothing \) it would follow that \(Z \cap B\neq \varnothing \), negating \(Z \cap B = \varnothing \).

It follows finally that in ROM, TingrC(A, B) if and only if \(A \subseteq B\) and \(\mathit{ClA} \cap (\mathit{ClB}\setminus B)\neq \varnothing \), i.e.,

$$\displaystyle{\mathit{Tingr}_{C}(A,B) \Leftrightarrow A \subseteq B \wedge \mathit{ClA} \cap \mathit{BdB}\neq \varnothing.}$$

From this analysis we obtain also that NTingrC(A, B) if and only if \(\mathit{ClA} \subseteq \mathit{IntB}\).
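The ROM predicates derived above take a very concrete shape for regular open intervals of the real line; the Python sketch below (encoding and names ours, and covering only single intervals (a, b), a small fragment of RO(R)) realizes C, ingr, Ov, EC, Tingr and NTingr for pairs (a, b) with closure [a, b].

```python
# Sketch: ROM predicates for regular open intervals (a, b) of the line.
def C(A, B):        # connection: closures [a, b] intersect
    return A[0] <= B[1] and B[0] <= A[1]

def ingr(A, B):     # A ⊆ B
    return B[0] <= A[0] and A[1] <= B[1]

def Ov(A, B):       # the open intervals themselves intersect
    return A[0] < B[1] and B[0] < A[1]

def EC(A, B):       # connected but not overlapping
    return C(A, B) and not Ov(A, B)

def Tingr(A, B):    # A ⊆ B and ClA meets BdB = {B[0], B[1]}
    return ingr(A, B) and (A[0] == B[0] or A[1] == B[1])

def NTingr(A, B):   # ClA ⊆ B
    return B[0] < A[0] and A[1] < B[1]

assert EC((0, 1), (1, 2))           # touch at the boundary point 1
assert Tingr((0, 1), (0, 2))        # shares the boundary point 0 of (0, 2)
assert NTingr((0.5, 1.5), (0, 2))   # closure [0.5, 1.5] inside (0, 2)
```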

4.4 Mereotopology in Part Mereology

We assume now a Leśniewski–style universe with part and ingredient relations and derived notions. Topological structures which arise in this context can be induced from overlap relations.

As in topology, interior as well as closure operators act on unions and intersections of sets, we recall here two fusion operators due to Tarski (1935), cf., Clay (1974). These operators are the sum x + y and the product x ⋅ y defined by means of

$$\displaystyle{ \mathit{ingr}(z,x + y) \Leftrightarrow \mathit{ingr}(z,x) \vee \mathit{ingr}(z,y), }$$
(10.30)

and,

$$\displaystyle{ \mathit{ingr}(z,x \cdot y) \Leftrightarrow \mathit{ingr}(z,x) \wedge \mathit{ingr}(z,y) }$$
(10.31)

4.4.1 On Closures

As the first approximation to topology, let us define for each thing x, its closure c(x) by means of

$$\displaystyle{ c(x) = \mathit{Cls}Ov(x) }$$
(10.32)

where the property Ov(x) is defined by \(Ov(x)(y) \Leftrightarrow Ov(x,y)\), i.e., we build the closure c(x) as the class of things which overlap with x.

The closure operator c(. ) has the following properties

Cl1:

ingr(x, c(x)).

Cl2:

If ingr(x, y), then ingr(c(x), c(y)).

Cl3:

ingr(c(x ⋅ y), c(x) ⋅ c(y)).

Cl4:

\(c(x + y) = c(x) + c(y)\).

By way of proof, we observe that Cl1 and Cl2 follow from the definition of the overlap relation and the class definition. For Cl3, if ingr(t, c(x ⋅ y)), then there is z such that Ov(t, z) and Ov(z, x ⋅ y), thus for some w one has Ov(z, w) and ingr(w, x), ingr(w, y), which imply that ingr(t, c(x)), ingr(t, c(y)), and finally ingr(t, c(x) ⋅ c(y)). By M3, ingr(c(x ⋅ y), c(x) ⋅ c(y)). For Cl4, it suffices to observe that \(Ov(z,x + y) \Leftrightarrow Ov(z,x) \vee Ov(z,y)\).

Another possibility for a topology is in iteration of the operator c, viz, we let

$$\displaystyle{ Ov^{n+1}(x,y) \Leftrightarrow \exists z.Ov(x,z) \wedge Ov^{n}(z,y);Ov^{1}(x,y) \Leftrightarrow Ov(x,y) }$$
(10.33)

and we define

$$\displaystyle{ \mathit{OVLP}(x)(y) \Leftrightarrow \exists n.Ov^{n}(x,y) }$$
(10.34)

The closure Cl(x) is defined as the class of the property OVLP(x), i.e.,

$$\displaystyle{ \mathit{Cl}(x) = \mathit{Cls}\mathit{OVLP}(x) }$$
(10.35)

The operator Cl(x) has the following properties

CL1: Cl(Cl(x)) = Cl(x).

CL2: ingr(x, Cl(x)).

CL3: ingr(x, y) implies ingr(Cl(x), Cl(y)).

CL4: \(\mathit{Cl}(x + y) = \mathit{Cl}(x) + \mathit{Cl}(y)\).

Please observe that CL2, CL3 follow straightforwardly from definitions. For CL1, observe that ingr(t, Cl(x)) if and only if OVLP(x)(t). Thus, ingr(t, Cl(Cl(x))) if and only if OVLP(Cl(x))(t) if and only if OVLP(x)(t) if and only if ingr(t, Cl(x)).

For CL4, assume first that ingr(t, Cl(x + y)) hence OVLP(t, x + y) and thus \(\mathit{OVLP}(t,x) \vee \mathit{OVLP}(t,y)\), i.e., \(\mathit{ingr}(t,\mathit{Cl}(x)) \vee \mathit{ingr}(t,\mathit{Cl}(y))\) and thus ingr(t, Cl(x) + Cl(y)). Assume now that ingr(t, Cl(x) + Cl(y)), i.e., \(\mathit{OVLP}(t,x) \vee \mathit{OVLP}(t,y)\), so there exists m such that \(Ov^{m}(t,x) \vee Ov^{m}(t,y)\), i.e., Ovm(t, x + y), hence, ingr(t, Cl(x + y)).

It follows that the operator Cl is a genuine closure operator; its properties are weak, as it in fact delineates components of things with respect to the overlap property: it is not even a T0–closure operator.
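The fact that Cl delineates overlap–components can be seen in a concrete model; in the sketch below (universe and names ours) things are non–empty finite sets, Ov(x, y) holds iff x ∩ y ≠ ∅, and iterating overlap as in (10.33)–(10.34) reaches exactly the component of a thing in the overlap graph.

```python
# Sketch: iterated overlap OVLP(x) in a toy set-based model (ours).
things = [{1, 2}, {2, 3}, {3, 4}, {7, 8}, {8, 9}]

def Ov(x, y):
    return bool(x & y)

def component(x):
    # everything reachable from x by a finite chain of overlaps
    reached, frontier = [], [x]
    while frontier:
        z = frontier.pop()
        if z not in reached:
            reached.append(z)
            frontier += [y for y in things if Ov(z, y) and y not in reached]
    return reached

# {1,2} reaches {3,4} via the chain {1,2}-{2,3}-{3,4}, but never {7,8}:
assert {3, 4} in component({1, 2})
assert {7, 8} not in component({1, 2})
```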

4.4.2 On Boundaries

A definition of a boundary can be attempted along the lines of the topological boundary concept. For a thing x, let a property Υ(x) be defined as follows

$$\displaystyle{ \varUpsilon (x)(t) \Leftrightarrow \mathit{ingr}(t,x) \wedge \forall z.[Ov(z,x) \wedge Ov(z,-x) \Rightarrow Ov(z,t)] }$$
(10.36)

We may define the boundary of x, Fr(x), by letting

$$\displaystyle{ \mathit{Fr}(x) = \mathit{Cls}\varUpsilon (x) }$$
(10.37)

Properties of Fr(x) following directly from definitions above are

  1. ingr(Fr(x), x).

  2. \(\forall z.Ov(z,x) \wedge Ov(z,-x) \Rightarrow Ov(z,\mathit{Fr}(x)).\)

The above notion of a boundary has a topological flavor, though by definition the boundary of the thing must be its ingredient, contrary to topological reality; however, the notion of a boundary has a much wider scope. It can also support the idea of a separator between two things within a third, encompassing neither, like a river flowing through a town separates the parts on opposite banks. To implement this idea, for things x, y such that extr(x, y), we define the property

$$\displaystyle{ \varOmega (x,y)(t) \Leftrightarrow \mathit{extr}(t,x) \wedge \mathit{extr}(t,y) }$$
(10.38)

and we let

$$\displaystyle{ \mathit{Bd}(x,y) = \mathit{Cls}\varOmega (x,y) }$$
(10.39)

Then the boundary operation Bd has properties

  1. Bd(x, y) = Bd(y, x).

  2. \(\mathit{Bd}(x + y,z) = \mathit{Bd}(x,z) \cdot \mathit{Bd}(y,z)\).

Property 1 is obvious. Property 2 follows from the equivalence \(\mathit{extr}(x + y,z) \Leftrightarrow \mathit{extr}(x,z) \wedge \mathit{extr}(y,z)\). A relative variant can be defined; assuming that ingr(x, z), ingr(y, z) and extr(x, y), a boundary relative to z between x and y, Bdz(x, y), is the class of things t such that ingr(t, z), extr(t, x), extr(t, y) provided this property is non–vacuous.
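The river metaphor can be made concrete with open intervals of the line; in the sketch below (encoding and names ours) x = (0, 1) and y = (2, 3), and the property Ω(x, y) of (10.38) picks out the things external to both, e.g. the 'river' (1, 2).

```python
# Sketch: the separator property Omega(x, y) for open intervals (ours).
def Ov(A, B):
    return A[0] < B[1] and B[0] < A[1]

def extr(A, B):
    return not Ov(A, B)

def Omega(x, y):
    # Omega(x, y)(t) iff t is external to both x and y
    return lambda t: extr(t, x) and extr(t, y)

between = Omega((0, 1), (2, 3))
assert between((1, 2))              # the 'river' separating x from y
assert not between((0.5, 1.5))      # overlaps x
```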

4.5 Connection Mereotopology

Topological operators are constructed in connection mereology under the same caveat as quasi–Boolean operators: the absence of the null thing calls for reservations concerning the existence of some things necessary for topological constructions. We will make these reservations without trying to add new axioms which would guarantee the existence of some auxiliary things. We follow Clarke (1981) in this exposition.

4.5.1 On the Notion of C–interior

The C–interior IntC(x) of a thing x is defined as the class of non–tangential ingredients of x.

We define the property NTP(x)

$$\displaystyle{ \mathit{NTP}(x)(z) \Leftrightarrow \mathit{NTP}(z,x) }$$
(10.40)

The interior IntC(x) is defined by means of

$$\displaystyle{ \mathit{INT}_{C}\ \mathit{Int}_{C}(x) = \mathit{Cls}\mathit{NTP}(x) }$$
(10.41)

hence, properties follow

  1. \(C(z,\mathit{Int}_{C}(x)) \Leftrightarrow \exists w.\mathit{NTingr}_{C}(w,x) \wedge C(z,w)\) by the class definition.

  2. \(\neg \exists z.\mathit{EC}(z,x) \Rightarrow (\mathit{Int}_{C}(x) = x)\). In particular, in the model OVM, IntC(x) = x for each thing x.

  3. ingrC(IntC(x), x) as C(z, IntC(x)) ⇒ C(z, x).

  4. C(z, IntC(x)) ⇒ OvC(z, x).

  5. EC(z, x) ⇒ ¬C(z, IntC(x)).

  6. \(\mathit{ingr}_{C}(z,\mathit{Int}_{C}(x)) \Leftrightarrow \mathit{NTingr}_{C}(z,x)\).

  7. ingrC(z, x) ⇒ ingrC(IntC(z), IntC(x)).

  8. \(\mathit{Int}_{C}(x) = x \Leftrightarrow \forall z.[C(z,x) \Rightarrow Ov_{C}(z,x)]\).

  9. \(\mathit{Int}_{C}(x) = x \Leftrightarrow \mathit{NTingr}_{C}(x,x)\).

An open thing is x such that IntC(x) = x.

Under the additional axiomatic postulate that the Boolean product of any two open sets is open, see Clarke (1981), A2.1, one can prove that \(\mathit{Int}_{C}(x \cdot y) = \mathit{Int}_{C}(x) \cdot \mathit{Int}_{C}(y)\).

4.5.2 On the Notion of C–Closure

The notion of a topological closure ClC(x) of x, can be introduced by means of the standard duality

$$\displaystyle{ \mathit{Cl}_{C}\ \mathit{Cl}_{C}(x) = -\mathit{Int}_{C}(-x) }$$
(10.42)

By properties of the interior and by duality (10.42), one obtains dual properties of closure

  1. ingrC(x, ClC(x)).

  2. ClC(ClC(x)) = ClC(x).

  3. \(\mathit{ingr}_{C}(x,y) \Rightarrow \mathit{ingr}_{C}(\mathit{Cl}_{C}(x),\mathit{Cl}_{C}(y))\).

  4. \(\mathit{Int}_{C}(x \cdot y) = \mathit{Int}_{C}(x) \cdot \mathit{Int}_{C}(y) \Leftrightarrow \mathit{Cl}_{C}(x + y) = \mathit{Cl}_{C}(x) + \mathit{Cl}_{C}(y)\).

  5. \(C(z,\mathit{Cl}_{C}(x)) \Leftrightarrow \exists w.\mathit{NTingr}_{C}(w,-x) \wedge C(z,w)\).

4.5.3 C–Boundaries and a Barry Smith’s Proposal for Mereotopology

The notion of a boundary can be introduced along standard topological lines

$$\displaystyle{ \mathit{Bd}_{C}\ \mathit{Bd}_{C}(x) = -(\mathit{Int}_{C}(x) + \mathit{Int}_{C}(-x)) }$$
(10.43)

We collect basic properties of the boundary

  1. Under Property 4, \(\mathit{Bd}_{C}(x) = \mathit{Cl}_{C}(x) \cdot -\mathit{Int}_{C}(x)\), i.e., it can be expressed as the difference between the closure and the interior of the thing.

  2. \(\mathit{Bd}_{C}(x) = \mathit{Bd}_{C}(-x)\).

  3. ingrC(BdC(x), ClC(x)).

An interesting current in mereotopology is that of Smith (1996). It is situated in a universe endowed with part π and ingredient ingr relations, introduced in whatever way. It departs from the above schemes for mereotopology by introducing the notion of an interior part, ι π, which is supposed to satisfy the requirements

  1. \(\iota \pi (x,y) \Rightarrow \pi (x,y)\).

  2. \(\iota \pi (x,y) \wedge \pi (y,z) \Rightarrow \iota \pi (x,z)\).

  3. \(\pi (x,y) \wedge \iota \pi (y,z) \Rightarrow \iota \pi (x,z)\).

  4. \(\iota \pi (x,y) \wedge \iota \pi (y,z) \Rightarrow \iota \pi (x,y \cdot z)\).

  5. For a non–void property (collection) F, if \(F(x) \Rightarrow \iota \pi (x,y)\) then \(\iota \pi (\mathit{Cls}F,y)\).

  6. There exists y such that \(\iota \pi (x,y)\) for each x.

  7. \(\iota \pi (x,y) \Rightarrow \iota \pi (x,\mathit{Cls}\{t:\iota \pi (t,y)\})\).

The interior of x can be defined as \(\mathit{Int}x = \mathit{Cls}\{t:\iota \pi (t,x)\}\), and, one can declare the thing x as open when x = Int x.

From these postulates one derives in the standard way the following properties

  1. \(\iota \pi (V,V )\).

  2. \(\iota \pi (x,V )\).

  3. \(\iota \pi (x,y) \Leftrightarrow \pi (x,\mathit{Cls}\{t:\iota \pi (t,y)\})\).

  4. \(\iota \pi (\mathit{Cls}\{t:\iota \pi (t,y)\},y)\).

An approach to the notion of a boundary follows in two steps. First, the relation †(x, y), ‘x crosses y’ is defined as

$$\displaystyle{ \dag (x,y) \Leftrightarrow Ov(x,y) \wedge Ov(x,-y) }$$
(10.44)

Observe that no x can cross the universe V. The second notion of straddling involves open sets; one says that Str(x, y), x straddles y, when

$$\displaystyle{ \iota \pi (x,z) \Rightarrow \dag (z,y) }$$
(10.45)

for each z.

It follows that

  1. \(\pi (x,y) \Rightarrow \iota \pi (x,y) \vee \mathit{Str}(x,y)\).

  2. \(\iota \pi (x,x) \vee \mathit{Str}(x,x).\)

The notion of being a boundary, B(x, y), is derived by means of

$$\displaystyle{ B(x,y) \Leftrightarrow \pi (z,x) \Rightarrow \mathit{Str}(z,y) }$$
(10.46)

for each z.

The boundary Bd x of x is then defined as

$$\displaystyle{ \mathit{Bd}x = \mathit{Cls}\{y: B(y,x)\} }$$
(10.47)

The closure of x, cl(x), is defined as the union x + Bd x. It then satisfies the postulates for a topological closure operator,

  1. π(x, cl(x)).

  2. cl(cl(x)) = cl(x).

  3. \(cl(x + y) = cl(x) + cl(y)\).

4.6 Rough Mereotopology

We now analyze topological structures in the rough mereological framework. We consider here the case of transitive and symmetric rough inclusions; for a more general discussion, cf. Polkowski (2011), Ch. 6. Rough inclusions of types (ari) and (airi) belong here. We use a generic symbol μ to denote either of these forms; μ is transitive with respect to some t–norm T

$$\displaystyle{ \mu (x,y,r),\mu (y,z,s) \Rightarrow \mu (x,z,T(r,s)) }$$
(10.48)

and symmetric

$$\displaystyle{ \mu (x,y,r) \Leftrightarrow \mu (y,x,r) }$$
(10.49)

4.6.1 The Notion of an Open Set

For each thing x and each r ∈ [0, 1], we define Or(x) as the class of the property M(x, r), where

$$\displaystyle{ M(x,r)(y) \Leftrightarrow \mu (y,x,r), }$$
(10.50)

and we let

$$\displaystyle{ O_{r}(x) = \mathit{Cls}M(x,r) }$$
(10.51)

Hence: ingr(z, Or(x)) if and only if μ(z, x, r). Indeed, ingr(z, Or(x)) if and only if there exists t such that Ov(z, t) and μ(t, x, r); hence, there exists w such that ingr(w, z), ingr(w, t); as ingr is the identity here, \(w = z = t\), and finally μ(z, x, r).

We regard the thing Or(x) as an analogue of the notion of the ‘closed ball about x of the radius r’. To define the analogue of an open ball, we consider the property

$$\displaystyle{ M_{r}^{+}(x)(y) \Leftrightarrow \exists q > r.\mu (y,x,q) }$$
(10.52)

The class of the property Mr+(x) will serve as the open ball analogue

$$\displaystyle{ \mathit{Int}(O_{r}(x)) = \mathit{Cls}M_{r}^{+}(x) }$$
(10.53)

Then: ingr(z, Int(Or(x))) if and only if ∃q > r. μ(z, x, q). We follow the lines of the preceding proof. It is true that

$$\displaystyle{\mathit{ingr}(z,\mathit{Int}(O_{r}(x)))}$$

if and only if there exists t such that Ov(z, t) and there exists q > r for which μ(t, x, q) holds; hence, there exists w such that ingr(w, z), ingr(w, t), which implies that \(w = z = t\), and finally μ(z, x, q).

It follows

  1. 1.

    ingr(Int(Or(x)), Or(x)).

  2. 2.

    If s < r, then ingr(Or(x), Os(x)), ingr(Int(Or(x)), Int(Os(x))).

Consider z with ingr(z, Int(Or(x))); then μ(z, x, s) holds with some s > r. We can choose α ∈ [0, 1] with the property that T(α, s) > r. For any thing w with ingr(w, Oα(z)), we can find a thing u such that μ(u, z, α) and Ov(w, u). For a thing t such that ingr(t, u) and ingr(t, w), we have μ(t, w, 1), μ(t, u, 1), hence, μ(t, x, T(α, s)), i.e., ingr(t, Int(Or(x))). As t = w, we find that ingr(w, Int(Or(x))). We have verified that

(P) for each z with ingr(z, Int(Or(x))), there exists α ∈ [0, 1] such that

$$\displaystyle{\mathit{ingr}(O_{\alpha }(z),\mathit{Int}(O_{r}(x))).}$$

For any thing z, when ingr(z, Int(Or(x))) and ingr(z, Int(Os(y))), one finds α, β ∈ [0, 1] such that ingr(Oα(z), Int(Or(x))) and ingr(Oβ(z), Int(Os(y))), hence, ingr(Oq(z), Int(Or(x))), ingr(Oq(z), Int(Os(y))), for q = max{α, β}.

We can sum up the last few facts: the collection {Int(Or(x)): x a thing, r ∈ [0, 1]} is an open basis for a topology on the collection of things.

A thing x is open, Open(x) in symbols, in case it is the class of some property of things of the form Int(Or(y)). Hence,

  1. 1.

If Φ is any non–vacuous property which holds only for open things, then Open(Cls Φ).

  2. 2.

If x, y are open things with Ov(x, y), then Open(x ⋅ y).

4.6.2 On Closures and Interiors

We define closures of things, and to this end, we introduce a property Γ(x) for each thing x

$$\displaystyle{ \varGamma (x)(y) \Leftrightarrow \forall s < 1.Ov(O_{s}(y),x) }$$
(10.54)

Closures of things are defined by means of

$$\displaystyle{ \mathit{Cl}(x) = \mathit{Cls}\varGamma (x) }$$
(10.55)

Then one verifies that

  1. 1.

    ingr(z, Cl(x)) if and only if Ov(Or(z), x) for every r < 1.

  2. 2.

    ingr(z, Cl(Ow(x))) if and only if Ov(Or(z), Ow(x)) for every r < 1.

  3. 3.

    ingr(z, Cl(Ow(x))) if and only if ingr(z, Ow(x)).

  4. 4.

    Cl(Ow(x)) = Ow(x).

We define Int(x), the interior of x as

$$\displaystyle{ \mathit{ingr}(z,\mathit{Int}(x)) \Leftrightarrow \exists w.[Ov(z,w) \wedge \exists r < 1.\mathit{ingr}(O_{r}(w),x)] }$$
(10.56)

A standard reasoning shows: ingr(z, Int(x)) if and only if there exists r < 1 such that ingr(Or(z), x).

4.6.3 On Boundaries

We can now address the problem of a boundary of any thing of the form Or(x). We define the boundary Bd(Or(x)) as

$$\displaystyle{ \mathit{Bd}(O_{r}(x)) = O_{r}(x) \cdot -\mathit{Int}(O_{r}(x)) }$$
(10.57)

We have a characterization of boundary ingredients: ingr(z, Bd(Or(x))) if and only if

$$\displaystyle{\mu (z,x,r) \wedge \neg \exists q > r.\mu _{T}(z,x,q).}$$

Hence,

$$\displaystyle{\mathit{ingr}(\mathit{Bd}(O_{r}(x)),O_{r}(x)).}$$

4.7 Mereogeometry

This section introduces mereogeometry modeled on classical axiomatization of geometry by Tarski (1959). It will serve us in the sequel in building tools for defining and navigating formations of intelligent agents (robots).

Elementary geometry was defined by Alfred Tarski in his Warsaw University lectures in the years 1926–1927 as a part of Euclidean geometry which can be described by means of the 1st order logic.

There are two main aspects in the formalization of geometry: one is the metric aspect, dealing with the distance on the space of points which carries the geometry, and the other is the affine aspect, taking into account the linear structure.

In the Tarski axiomatization, Tarski (1959), the metric aspect is expressed as a relation of equidistance (congruence) and the affine aspect is expressed by means of the betweenness relation. The only logical predicate required is the identity =. The equidistance relation, denoted Eq(x, y, u, z) (or, as a congruence: xy ≡ uz), means that the distance from x to y is equal to the distance from u to z (the pairs x, y and u, z are equidistant).

The betweenness relation is denoted B(x, y, z) (x is between y and z). Van Benthem (1983) took up the subject, proposing a version of the betweenness predicate based on the nearness predicate and suited, hypothetically, for Euclidean spaces.

We are interested in introducing into the mereological world defined by μ a geometry in whose terms it will be possible to express spatial relations among things. We first introduce a notion of a distance κ, induced by a rough inclusion μ

$$\displaystyle{ \kappa (X,Y ) = \mathit{min}\{\mathit{max}\ r,\mathit{max}\ s:\ \mu (X,Y,r),\mu (Y,X,s)\} }$$
(10.58)

Observe that the mereological distance differs essentially from the standard distance: the closer the things, the greater the value of κ: κ(X, Y) = 1 means X = Y, whereas κ(X, Y) = 0 means that X, Y are either externally connected or disjoint, whatever the Euclidean distance between them.
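In the finite–set model, the largest r with μ(X, Y, r) is |X ∩ Y|/|X|, so κ admits a direct computation; the following sketch rests on that assumed model (the name kappa is ours).

```python
from fractions import Fraction

def kappa(X, Y):
    """Mereological distance (10.58) in the finite-set model, where the
    largest r with mu(X, Y, r) is |X ∩ Y| / |X| (an assumption)."""
    if not X or not Y:
        return Fraction(0)
    common = len(X & Y)
    return min(Fraction(common, len(X)), Fraction(common, len(Y)))

A = {1, 2, 3, 4}
B = {3, 4, 5, 6, 7, 8}
print(kappa(A, A))          # 1: identical things
print(kappa(A, B))          # 1/3 = min(2/4, 2/6)
print(kappa(A, {9, 10}))    # 0: disjoint things
```

Note the inverted behaviour described above: identical things are at distance 1, disjoint things at distance 0.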

4.7.1 On the Notion of Betweenness in Tarski and Van Benthem Sense

The notion of betweenness in the Tarski sense B(Z, X, Y ) in terms of κ is

$$\displaystyle{ B(Z,X,Y )\ \Leftrightarrow \text{for each region W,}\ \kappa (Z,W)\ \in [\kappa (X,W),\kappa (Y,W)] }$$
(10.59)

Here, [a, b] means the non–oriented interval with endpoints a, b.

We use κ to define in our context the relation N of nearness proposed in Van Benthem (1983)

$$\displaystyle{ N(X,U,V )\ \Leftrightarrow \kappa (X,U) >\kappa (V,U) }$$
(10.60)

Here, N(X, U, V ) means that X is closer to U than V is to U.

Then, N does satisfy all axioms for nearness in Van Benthem (1983)

  1. 1.

    NB1 N(Z, U, V ) and N(V, U, W) imply N(Z, U, W) (transitivity).

  2. 2.

    NB2 N(Z, U, V ) and N(U, V, Z) imply N(U, Z, V ) (triangle inequality).

  3. 3.

    NB3 N(Z, U, Z) is false (irreflexivity).

  4. 4.

    NB4 Z = U or N(Z, Z, U) (selfishness).

  5. 5.

    NB5 N(Z, U, V ) implies N(Z, U, W) or N(W, U, V ) (connectedness).

We provide a sketch of proof.

For NB1, the assumptions are κ(Z, U) > κ(V, U) and κ(V, U) > κ(W, U); hence κ(Z, U) > κ(W, U), i.e., the conclusion N(Z, U, W) follows.

For NB2, assumptions κ(Z, U) > κ(V, U), κ(V, U) > κ(Z, V ) imply κ(Z, U) > κ(Z, V ), i.e., N(U, Z, V ).

For NB3, it cannot be true that κ(Z, U) > κ(Z, U).

For NB4, Z ≠ U implies in our world that κ(Z, Z) = 1 > κ(Z, U), as κ(Z, U) ≠ 1.

For NB5, assuming that neither N(Z, U, W) nor N(W, U, V) holds, we have κ(Z, U) ≤ κ(W, U) and κ(W, U) ≤ κ(V, U), hence κ(Z, U) ≤ κ(V, U), i.e., N(Z, U, V) does not hold.
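The proof sketch above can be checked exhaustively on a finite sample; the following Python sketch assumes the finite–set model of κ and verifies NB1–NB5 over all tuples of a random family of things (the encoding is ours, not the chapter's).

```python
import itertools
import random

def kappa(X, Y):
    """Mereological distance in the finite-set model (an assumed model)."""
    if not X or not Y:
        return 0.0
    c = len(X & Y)
    return min(c / len(X), c / len(Y))

def N(X, U, V):
    """Van Benthem nearness (10.60): X is closer to U than V is to U."""
    return kappa(X, U) > kappa(V, U)

random.seed(0)
universe = list(range(8))
things = [frozenset(random.sample(universe, random.randint(1, 6)))
          for _ in range(10)]

for Z, U, V, W in itertools.product(things, repeat=4):
    if N(Z, U, V) and N(V, U, W):
        assert N(Z, U, W)                    # NB1 transitivity
    if N(Z, U, V) and N(U, V, Z):
        assert N(U, Z, V)                    # NB2 triangle inequality
    if N(Z, U, V):
        assert N(Z, U, W) or N(W, U, V)      # NB5 connectedness
for Z, U in itertools.product(things, repeat=2):
    assert not N(Z, U, Z)                    # NB3 irreflexivity
    assert Z == U or N(Z, Z, U)              # NB4 selfishness
print("NB1-NB5 hold on the sample")
```

Such a finite check is of course no substitute for the proof; it merely exercises the five axioms on a concrete model.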

We introduce a betweenness relation in the sense of Van Benthem TB modeled on betweenness proposed in Van Benthem (1983)

$$\displaystyle{ T_{B}(Z,U,V )\ \Leftrightarrow [\text{for each}\ W\ (Z = W)\ \text{or}\ N(Z,U,W)\ \text{or}\ N(Z,V,W)] }$$
(10.61)

4.7.2 Example: The Case of Betweenness for Robots in 2D Space

The principal example, bearing, e.g., on our approach to robot control, deals with rectangles in 2D space which are regularly positioned, i.e., have edges parallel to the coordinate axes. We model robots (represented in the plane as discs of the same radius) by means of their safety regions; those regions are modeled as squares circumscribed on the discs. One of the advantages of this representation is that safety regions can always be implemented as regularly positioned rectangles.

Given two robots a, b as discs of the same radius, and their safety regions as circumscribed regularly positioned rectangles A, B, we search for a proper choice of a region X containing A and B with the property that a robot C contained in X can be said to be between A and B. In this search we avail ourselves of the notion of the betweenness relation TB.

Taking the rough inclusion μG defined in (10.28), for two disjoint rectangles A, B, we define the extent, ext(A, B), of A and B as the smallest rectangle containing the union \(A \cup B\). Then we have the claim, obviously true by definition of TB: given two disjoint rectangles C, D, the only thing between C and D in the sense of the predicate TB is the extent ext(C, D) of C, D.

For a proof, as linear stretching or contracting along an axis does not change the area relations, it is sufficient to consider two unit squares A, B of which A has (0,0) as one of its vertices whereas B has (a,b) with a, b > 1 as the lower left vertex (both squares are regularly positioned). Then the distance κ between the extent ext(A, B) and either of A, B is \(\frac{1} {(a+1)(b+1)}\).

For a rectangle R: [0, x] × [0, y] with \(x \in (a,a + 1),y \in (b,b + 1)\), we have that

$$\displaystyle{ \kappa (R,A) = \frac{(x - a)(y - b)} {xy} =\kappa (R,B) }$$
(10.62)

For \(\phi (x,y) = \frac{(x-a)(y-b)} {xy}\), we find that

$$\displaystyle{ \frac{\partial \phi } {\partial x} = \frac{a} {x^{2}} \cdot (1 - \frac{b} {y}) > 0 }$$
(10.63)

and, similarly, \(\frac{\partial \phi } {\partial y} > 0\), i.e., ϕ is increasing in x, y reaching the maximum when R becomes the extent of A, B.

An analogous reasoning takes care of the case when R has some (c,d) with c, d > 0 as the lower left vertex.
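The function φ of (10.62) and its maximum at the extent can be exercised numerically; the sketch below assumes the configuration of the proof (unit squares at (0, 0) and (a, b)), and the name kappa_rect is ours.

```python
def kappa_rect(a, b, x, y):
    """kappa(R, A) of (10.62): A the unit square at (0, 0), B the unit
    square at (a, b), R = [0, x] x [0, y] with a < x < a+1, b < y < b+1."""
    return (x - a) * (y - b) / (x * y)

a, b = 2.0, 3.0
# phi grows in both x and y ...
assert kappa_rect(a, b, 2.5, 3.5) < kappa_rect(a, b, 2.9, 3.9)
# ... and attains its maximum 1/((a+1)(b+1)) when R becomes the extent
print(kappa_rect(a, b, a + 1, b + 1))   # equals 1/12 here
print(1 / ((a + 1) * (b + 1)))          # the same value
```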

Further usage of the betweenness predicate is suggested by the Tarski (1959) axiom of B,Eq–upper dimension, which implies collinearity of x, y, z. Thus, a line segment may be defined via the auxiliary notion of a pattern; we introduce this notion as a relation Pt(u, v, z) which is true if and only if TB(z, u, v) or TB(u, z, v) or TB(v, u, z). 

We will say that a finite sequence u1, u2, …, un of things belongs in a line segment whenever \(Pt(u_{i},u_{i+1},u_{i+2})\) for \(i = 1,\ldots,n - 2\); formally, we introduce the functor Line of finite arity defined by means of

$$\displaystyle{\mathit{Line(u_{1},u_{2},\ldots,u_{n})\ if \ and\ only\ if \ Pt(u_{i},u_{i+1},u_{i+2})\ for\ i < n - 1.}}$$

For instance, any two disjoint rectangles A, B and their extent ext(A, B) form a line segment.

PART II. APPLICATIONS

5 Mereology in Engineering: Artifacts, Design and Assembling

Mereology plays a fundamental role in problems of design and assembling, as the basic ingredients in those processes are parts of complex things. The process of synthesis involves sequencing of operations of fusion of parts into more complex parts until the final product, the artifact, is reached. We propose a scheme for assembling and a parallel scheme for design; the difference is that design operates on abstracta, i.e., categories of things, whereas assembling deals with concreta, i.e., real things. The interplay between abstracta and concreta will be described as a result of our analysis.

5.1 On the Notion of an Artifact

The term artifact means, etymologically, a thing made by art, which covers a wide spectrum of things, from man–made things of everyday usage to abstract pieces such as mathematical proofs, software modules, sonnets, or concertos. All those distinct things are unified in a scheme dependent on some common ingredients in their making, cf., e.g., a concise discussion in SEP (2012). We cannot include here a discussion of the vast literature on ontological, philosophical and technological aspects of this notion, see, e.g., Baker (2004), Hilpinen (1995), Margolis and Laurence (2007); we point only to a thorough analysis of ontological aspects of artifacts in Borgo and Vieu (2009), in which the authors also propose a scheme defining artifacts. It follows from the discussion by many authors that important in the analysis of artifacts are such aspects as: authorship, intended functionality, and parthood relations. Analysis of artifacts is closely tied to design and assembly, cf., Boothroyd (2005) and Boothroyd et al. (2002) as well as Salustri (2002), Kim et al. (2008) and Seibt (2009). A discussion of mereology with respect to its role in domain science and engineering and computer science can be found in Björner and Eir (2010) and in the Chapter by Björner in this Volume.

We attempt a definition of an artifact as a thing obtained over a collection of things as a most complex thing, in the sense of not being a part of any thing in the collection; to the aspects of authorship (operator) and functionality, we add a temporal aspect, which allows for well–foundedness of the universe of parts and seems a natural aspect of the assembling or design process. We regard processes leading to artifacts as fusion processes in which a by–product is obtained from a finite number of substrates. Though processes, e.g., of assembling a bike from its parts or a chemical reaction leading to a product obtained from a mixture of substances are very distinct to the observer, the formal description is identical for the two; it requires a category of operators P, a category of functionalities F, and a linear time T with the time origin 0. The domain of things is a category Things(P, F, π) of things endowed with a part relation π. The assignment operator S acts as a partial mapping on the Cartesian product P × F × Things(P, F, π) with values in the category Tree of rooted trees.

The act of assembling is expressed by means of a predicate

$$\displaystyle{\mathit{Art}(p(u),< v_{1}(u),\cdots \,,v_{k}(u) >,u,f(u),t(u),T(u)),}$$

which reads: an operator p(u) assembles at time t(u) a thing u with functionality f(u) according to the assembling scheme T(u) organized by p(u) which is a tree with the root u, from things v1(u),⋯ ,vk(u) which are leaves of T(u). The thing vi(u) enters in the position i the assembling process for u.

The predicate ART is subject to the following requirements.

ART1.:

If Art(p(u), < v1(u), ⋯ , vk(u) > , u, f(u), t(u), T(u)) and for any i in {1, ⋯ , k}, it holds that

$$\displaystyle{Art(p(v_{i}(u)),< v_{i_{1}}(v_{i}(u)),\cdots \,,v_{i_{k}}(v_{i}(u)) >,v_{i}(u),f(v_{i}(u)),t(v_{i}(u)),T(v_{i}(u))),}$$

then t(vi(u)) < t(u), \(f(u) \subseteq f(v_{i}(u))\), \(p(v_{i}(u)) \subseteq p(u)\), and T(vi(u)) attached to T(u) at the leaf vi(u) yields a tree, called an unfolding of T(u) via the assembling tree for vi(u).

The meaning of ART1 is that for each substrate v entering the assembly process for u, v is assembled at time earlier than time for u, functionality of u is lesser than that of v, the operator for u has a greater operating scope than that of v, and the assembly tree for u can be expanded at the leaf v by the assembly tree for v.

ART2.:

\(\mathit{Art}(p(u),< v_{1}(u),\cdots \,,v_{k}(u) >,u,f(u),t(u),T(u)) \Rightarrow \pi (v_{i}(u),u)\) for each vi(u).

Meaning that each thing can be assembled only from its parts.

We introduce an auxiliary predicate App(v, i(v), u, t(u)) meaning: v enters in the position i the assembly process for u at time t(u).

ART3.:

π(v, u) ⇒ ∃w1(v, u), 

$$\displaystyle{\cdots \,,w_{k}(v,u),t(w_{2}(v,u)),\cdots \,,t(w_{k}(v,u)),i(w_{1}(v,u)),\cdots \,,i(w_{k(v,u)-1}(v,u))}$$

such that v = w1(v, u), t(w2(v, u)) < ⋯ < t(wk(v, u)), wk(v, u) = u,

$$\displaystyle{\mathit{App}(w_{j}(v,u),i(w_{j}(v,u)),w_{j+1}(v,u),t(w_{j+1}(v,u)))}$$

for \(j = 1,2,\cdots \,,k(v,u) - 1\).

This means that each thing which is a part of the other thing will enter the assembly tree of the thing.

ART4.:

Each thing used in assembling of some other thing can be used in only one such thing in only one position at only one time.

This requirement will be referred to as the uniqueness requirement.

ART5.:

Values t(u) belong in the set T = { 0, 1, 2, ⋯ } of time moments.

Corollary 1.

By ART1, ART2, ART5: The universe of assembly things is well–founded, i.e., there is no infinite sequence \(\{x_{i}: i = 1,2,\ldots \}\) of things with \(\pi (x_{i+1},x_{i})\) for each i.

From this Corollary, it follows that our notion of identity of artifacts (EA) is equivalent to extensionality notions (EP), (EC), (UC) discussed in Varzi (2008).

For a tree T(u), the ART–unfolding of T(u) is the tree T(u, 1) in which leaves v1(u), v2(u), ⋯ , vk(u) are expanded by attaching those trees T(v1(u)), ⋯ , T(vk(u)) which are distinct from their roots. For a tree T(u), the maximal ART–unfolding T(u, max) is the tree obtained from T(u) by repeating the operation of ART–unfolding until no further ART–unfolding is possible.
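The ART–unfolding can be sketched on a toy encoding of assembly trees; the encoding below (a thing paired with the list of subtrees of its substrates, and a dictionary of assembling schemes) is our assumption, not the chapter's formalism.

```python
def unfold(tree, scheme):
    """Maximal ART-unfolding T(u, max): expand every leaf v by its own
    assembling tree scheme[v] until every leaf is an atom (no scheme)."""
    u, children = tree
    if not children:                  # a leaf: expand it if v has a tree
        if u in scheme:
            return unfold((u, scheme[u]), scheme)
        return (u, [])                # an atom stays a leaf
    return (u, [unfold(c, scheme) for c in children])

# bike assembled from frame and wheel; wheel assembled from rim and spokes
scheme = {"bike": [("frame", []), ("wheel", [])],
          "wheel": [("rim", []), ("spokes", [])]}
t_max = unfold(("bike", []), scheme)
print(t_max)
# ('bike', [('frame', []), ('wheel', [('rim', []), ('spokes', [])])])
```

In accordance with Corollary 2, every leaf of the resulting tree (frame, rim, spokes) is an atom, i.e., has no assembling scheme of its own.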

Corollary 2.

Each leaf of the tree T(u,max) is an atom.

We now define an artifact: an artifact over the category Things(P, F, π) of assembly things is a thing u such that π(u, v) holds for no thing v in Things(P, F, π). Thus artifacts are ‘final things’ in a sense.

We define the notion of identity for artifacts:

(Extensionality of artifacts EA) artifacts a, b are identical if and only if the trees Tree(a,max), Tree(b,max) are isomorphic and have identical things at nodes corresponding under the isomorphism.

5.2 Design Artifacts

We regard the process of design as analogous to the assembly process; the only difference between the two which we introduce is that in design, the designer works not with things themselves but with classes of equivalent things. Thus, to begin with, we introduce an equivalence relation on things. To this end, we let

$$\displaystyle{ u \sim v\ \text{if and only if}\ [\pi (u,t)\ \text{ if and only if}\ \pi (v,t)]\ \text{ for each}\ \mathit{thing}\ t }$$
(10.64)

and

$$\displaystyle{ \mathit{Cat}(u) = \mathit{Cat}(v)\ \text{if and only if}\ u \sim v }$$
(10.65)

Things in the same category Cat are ‘universally replaceable’. It is manifest that the part relation π can be factored through categories, to the relation Π of part on categories,

$$\displaystyle{ \varPi (\mathit{Cat}(u),\mathit{Cat}(v))\ \text{if and only if}\ \pi (u,v) }$$
(10.66)
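The factorization (10.64)–(10.66) can be sketched on a finite toy model; the universe and the part facts below are our assumptions, and the names profile, Cat, Pi are ours.

```python
# Things with a part relation pi; categories are classes of things
# with the same parthood profile, as in (10.64)-(10.65).

things = {"u1", "u2", "v", "w"}
pi = {("u1", "v"), ("u2", "v"), ("v", "w")}   # assumed part facts

def profile(u):
    """The things u is a part of: pi(u, t) for t in the universe."""
    return frozenset(t for t in things if (u, t) in pi)

def Cat(u):
    """Category of u under (10.64)-(10.65): things with u's profile."""
    return frozenset(x for x in things if profile(x) == profile(u))

def Pi(cu, cv):
    """The factored part relation (10.66) on categories."""
    return any((u, v) in pi for u in cu for v in cv)

print(Cat("u1") == Cat("u2"))   # True: u1, u2 are interchangeable
print(Pi(Cat("u1"), Cat("v")))  # True: their category is a part of Cat(v)
```

Here u1 and u2 are 'universally replaceable' in the sense of the text: they are parts of exactly the same things, hence fall into one category.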

In our formalism, design will imitate assembling with things replaced with categories of things and the part relation π replaced with the factorization Π. We need only to repeat the procedure with necessary replacements. We use the designer set D, the functionality set F, and the time set T as above.

The act of design is expressed by means of a predicate,

$$\displaystyle{\mathit{Des}(\mathit{d,< Cat}_{1},\cdots \,,\mathit{Cat}_{k} >,\mathit{Cat,f }(\mathit{Cat}),t(\mathit{Cat}),T(\mathit{Cat}))}$$

which reads: a designer d designs at time t a category of things Cat with functionality f(Cat) according to the design scheme T(Cat) organized by d, which is a tree with the root Cat, from categories Cat1, ⋯ , Catk which are leaves of T(Cat). The category Cati enters in the position i the design process for Cat.

The predicate Des is subject to the following requirements.

DES1.:

If Des(d, < Cat(v1(u)), ⋯ , Cat(vk(u)) > , Cat(u), f(u), t(u), T(u)) and for any i in {1, ⋯ , k}, it holds that

$$\displaystyle{\mathit{Des}(p(\mathit{Cat}(v_{i}(u))),< \mathit{Cat}(v_{i_{1}}(v_{i}(u))),\cdots \,,\mathit{Cat}(v_{i_{k}}(v_{i}(u))) >,}$$
$$\displaystyle{\mathit{Cat}(v_{i}(u)),f(v_{i}(u)),t(v_{i}(u)),T(v_{i}(u))),}$$

then t(vi(u)) < t(u), \(f(u) \subseteq f(v_{i}(u))\), \(p(v_{i}(u)) \subseteq p(u)\), and T(vi(u)) attached to T(u) at the leaf Cat(vi(u)) yields a tree, called the unfolding of T(u) via the design tree for Cat(vi(u)).

DES2.:
$$\displaystyle{\mathit{Des}(d,< \mathit{Cat}(v_{1}(u)),\cdots \,,\mathit{Cat}(v_{k}(u)) >,\mathit{Cat}(u),f(u),t(u),T(u)) \Rightarrow }$$
$$\displaystyle{\varPi (\mathit{Cat}(v_{i}(u)),\mathit{Cat}(u))}$$

for each vi(u).

Meaning that each thing can be designed only from its parts.

We introduce an auxiliary predicate App(v, i(v), u, t(u)) meaning: Cat(v) enters in the position i the design process for Cat(u) at time t(u).

DES3.:

\(\varPi (\mathit{Cat}(v),\mathit{Cat}(u)) \Rightarrow \exists \mathit{Cat}(w_{1}(v,u)),\cdots \,,\mathit{Cat}(w_{k}(v,u))\), and,

$$\displaystyle{t(w_{2}(v,u)),\cdots \,,t(w_{k}(v,u)),i(w_{1}(v,u)),\cdots \,,i(w_{k(v,u)-1}(v,u))}$$

such that v = w1(v, u), t(w2(v, u)) < ⋯ < t(wk(v, u)), wk(v, u) = u,

$$\displaystyle{\mathit{App}(w_{j}(v,u),i(w_{j}(v,u)),w_{j+1}(v,u),t(w_{j+1}(v,u)))}$$

for \(j = 1,2,\cdots \,,k(v,u) - 1\).

This means that for each thing which is a part of the other thing the category of the former will enter the design tree for the category of the latter.

ART4 has no counterpart in terms of DES: clearly, things of the same category may be used in many positions and at many design stages of some other category. We may only repeat our assumption about timing.

DES4.:

Values t(u) belong in the set T = { 0, 1, 2, ⋯ } of time moments.

Corollary 1.

The universe of categories is well–founded.

We define a design artifact as a category Cat(u) such that Π(Cat(u), Cat(v)) is true for no v.

We are approaching the notion of identity for design artifacts. To begin with, for a design artifact a, denote by art(a) the artifact obtained by filling, in the design tree for a, all positions Cat(v) with things v, for some choice of such things. We state the identity condition for design artifacts.

(Extensionality for design artifacts ED) design artifacts a,b are identical if and only if there exist artifacts art(a),art(b) which are identical.

From the principle of identity for artifacts, a corollary follows.

Corollary 2.

If design artifacts a,b are identical then a,b have isomorphic design trees and categories at corresponding nodes are identical.

Corollary 3.

If design artifacts a,b have isomorphic design trees and categories at corresponding nodes are identical, then a,b are identical.

Indeed, consider two design artifacts a, b which satisfy the condition in the corollary. There is at least one category Cat(v) in the same position in design trees of a and b. Choose a thing x in Cat(v) and let a(x), b(x) be artifacts assembled according to a, b, respectively. Having a thing in common, a(x), b(x) are identical hence a, b are identical.

5.3 Action of Things on Design Abstracta

The interplay between concreta and abstracta in design and assembly can be exhibited by the action of things on design artifacts. We define a partial mapping \(\iota\) on the product Things(P, F, π) × DesignArtifacts into Artifacts: for a thing v and a design artifact a, we define the value \(\iota (v,a)\) as NIL in case the category Cat(v) is not a node in the design tree for a, and as the unique artifact a(v) in the contrary case. The inverse \(\iota ^{-1}(\iota (v,a))\) is the set {(u, b): b ∈ DesignArtifacts, Cat(u) a node in b}; thus, abstracta are equivalent in this sense to collections of concreta.

6 Mereology in Spatial Reasoning

Spatial orientation of a thing depends on the real world in which things are immersed, hence, to, e.g., discern among sides of a thing, one needs additional knowledge and structures. An example of this approach is found, e.g., in Aurnague et al. (1997), where it is proposed to exploit in determining orientation, e.g., the direction of gravity (‘haut–grav’, ‘bas–grav’) or peculiar features of things (like the neck of a bottle) suggesting direction, and usage of geometric predicates like equidistance in definitions of, e.g., orthogonal directions.

6.1 Properties of Artifacts: Mereological Theory of Shape and Orientation

It is manifest that mereology is amorphous in the sense that decomposition of a thing into parts does not depend on orientation, isometric transformations, etc. Hence, to exhibit in things additional features like shape or side, one needs an augmented mereology.

Particular features of shape like existence of ‘dents’ or ‘holes’ in a thing resulting from removal of other things can be accounted for within mereology.

We define the predicate hole(x, y) reading a thing x constitutes a hole in a thing y as follows,

$$\displaystyle{ \mathit{hole}(x,y) \Leftrightarrow \exists z.\mathit{NTP}(x,z) \wedge \mathit{comp}(y,x,z) }$$
(10.67)

i.e., x is a non–tangential thing in z and y complements x in z.

The predicate dent(x, y), reading a thing x constitutes a dent in a thing y is defined as

$$\displaystyle{ \mathit{dent}(x,y) \Leftrightarrow \exists z.TP(x,z) \wedge \mathit{comp}(y,x,z) }$$
(10.68)

i.e., x is a tangential thing in z and y complements x in z. The notion of a dent may be useful in characterizing things that ‘fit into a thing’: the predicate fitsinto(x, y) may be defined as

$$\displaystyle{ \mathit{fits_{-}into}(x,y) \Leftrightarrow \exists z.\mathit{dent}(z,y) \wedge \mathit{ingr}(x,z) }$$
(10.69)

i.e., x is an ingredient of a thing which is a dent in y. A particular case of fitting is ‘filling’ i.e., a complete fitting of a dent. We offer a predicate fills(x, y)

$$\displaystyle{ \mathit{fills}(x,y) \Leftrightarrow \exists z.\mathit{dent}(z,y) \wedge z = x \cdot y }$$
(10.70)

i.e., dent–making z is the product of x and y. Following this, the notion of a join can be defined as

$$\displaystyle{ \mathit{joins}(x,y,z) \Leftrightarrow \exists w.w = x + y + z \wedge \mathit{fills}(x,y) \wedge \mathit{fills}(x,z) }$$
(10.71)

i.e., x joins y and z when there is a thing \(x + y + z\) and x fills both y and z.
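The predicates hole and dent of (10.67)–(10.68) can be sketched in a grid–cell model: things are finite sets of integer cells, 'tangential' means touching a cell outside the containing thing, and we take z = x ∪ y as the witness thing. This encoding is our assumption, not the chapter's connection structure.

```python
def nbrs(cell):
    i, j = cell
    return {(i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)}

def TP(x, z):
    """Tangential proper part: some cell of x borders the outside of z."""
    return x < z and any(n not in z for c in x for n in nbrs(c))

def NTP(x, z):
    """Non-tangential proper part: x lies strictly inside z."""
    return x < z and all(n in z for c in x for n in nbrs(c))

def comp(y, x, z):
    """y complements x in z."""
    return (x | y == z) and not (x & y) and bool(y)

def hole(x, y):
    """(10.67) with z = x ∪ y taken as the witness."""
    z = x | y
    return NTP(x, z) and comp(y, x, z)

def dent(x, y):
    """(10.68) with z = x ∪ y taken as the witness."""
    z = x | y
    return TP(x, z) and comp(y, x, z)

block = {(i, j) for i in range(3) for j in range(3)}
centre, edge = {(1, 1)}, {(0, 1)}
print(hole(centre, block - centre))  # True: the centre cell is a hole
print(dent(edge, block - edge))      # True: an edge cell is a dent
print(hole(edge, block - edge))      # False
```

Removing the centre of the 3 × 3 block leaves a hole, removing a boundary cell leaves a dent, exactly as the tangential/non–tangential distinction prescribes.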

This predicate can be inductively raised to

$$\displaystyle{\mathit{join}(n)(x_{1},x_{2},\ldots,x_{n};y_{1},y_{2},\ldots,y_{n},y_{n+1})}$$

via

$$\displaystyle{\mathit{join}(1)(x_{1};y_{1},y_{2}) \Leftrightarrow \mathit{join}(x_{1},y_{1},y_{2})}$$

and

$$\displaystyle{\mathit{join}(k + 1)(x_{1},x_{2},\ldots,x_{k+1};y_{1},y_{2},\ldots,y_{k+1},y_{k+2}) \Leftrightarrow }$$
$$\displaystyle{\mathit{join}(x_{k+1},\mathit{join}(k)(x_{1},x_{2},\ldots,x_{k};y_{1},y_{2},\ldots,y_{k+1}),y_{k+2})}$$

in which we express sequentially a possibly parallel processing.

In case x joins y and z, the possibility of assembling arises, which may be expressed by means of the modal operator \(\diamond \) of ‘possibility’, with the operator Asmbl extended to the form Asmbl(x, i, y, j, …; w, p, f, t), meaning that w can be assembled from x in position i, y in position j, … by an operator p with functionality f at time t,

$$\displaystyle{ \mathit{join}(x,y,z) \Rightarrow \diamond \exists w,p,f,t,i,j,k.\mathit{Asmbl}(x,i,y,j,z,k;w,p,f,t) }$$
(10.72)

Assuming our mereology is augmented with an environment endowed with directions N, S, E, W, we may represent these directions by means of mobile agents endowed with laser or infrared beams of specified width; at the moment when the beam reaches the thing x, it marks on its boundary a region which we denote as top in case of N, bottom in case of S, left-side in case of W, and right-side in case of E. Thus we have \(\mathit{top}(x),\mathit{bottom}(x),\mathit{left - side}(x),\mathit{right - side}(x)\) as areas of the boundary of x; these are not parts of x. To express relations among sides of things we need a distinct language; for the sake of this example let us adopt the language of set theory, regarding sides as sets.

Then we may say that the thing y

  1. 1.

    Is on the thing x in case bottom(y) is contained in top(x).

  2. 2.

    Is under the thing x when top(y) is contained in bottom(x).

  3. 3.

    Touches x on the left when right-side(y) is contained in left-side(x)

  4. 4.

    Touches x on the right when (left-side(y) is contained in right-side(x)).
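The side relations above can be sketched for regularly positioned rectangles, with each side modelled as a horizontal segment; the (x0, y0, x1, y1) rectangle encoding and the names are our assumptions.

```python
def top(r):
    x0, y0, x1, y1 = r
    return (x0, x1, y1)     # the top side as a horizontal segment

def bottom(r):
    x0, y0, x1, y1 = r
    return (x0, x1, y0)

def seg_in(s, t):
    """Segment (a0, a1, h) is contained in segment (b0, b1, k)."""
    return s[2] == t[2] and t[0] <= s[0] and s[1] <= t[1]

def on(y_thing, x_thing):
    """y is on x: bottom(y) is contained in top(x), as in item 1 above."""
    return seg_in(bottom(y_thing), top(x_thing))

table = (0, 0, 10, 5)   # a rectangle with top side at height 5
cup = (4, 5, 6, 7)      # a rectangle standing on that top side
print(on(cup, table))   # True
print(on(table, cup))   # False
```

The relations 'under' and the two 'touches' variants are processed along the same lines, comparing the remaining pairs of sides.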

This modus of orientation can be merged with the mereological shape theory: one can say that a thing x constitutes a dent on top/under/on the left/on the right of the thing y when, respectively,

  1. 1.

    \(\mathit{dent}_{top}(x,y) \Leftrightarrow \exists z.\mathit{TP}(x,z) \wedge \ \mathit{top}(x) \subseteq \ \mathit{top}(z)\ \wedge \mathit{comp}(y,x,z)\).

  2. 2.

    \(\mathit{dent}_{\mathit{bottom}}(x,y) \Leftrightarrow \exists z.\mathit{TP}(x,z) \wedge \ \mathit{bottom}(x) \subseteq \ \mathit{bottom}(z)\ \wedge \mathit{comp}(y,x,z)\).

  3. 3.

    \(\mathit{dent}_{\mathit{left}}(x,y) \Leftrightarrow \exists z.\mathit{TP}(x,z) \wedge \ \mathit{left - side}(x) \subseteq \ \mathit{left - side}(z)\ \wedge \mathit{comp}(y,x,z)\).

  4. 4.

    \(\mathit{dent}_{\mathit{right}}(x,y) \Leftrightarrow \exists z.\mathit{TP}(x,z)\wedge \ \mathit{right - side}(x) \subseteq \ \mathit{right - side}(z)\ \wedge \mathit{comp}(y,x,z)\).

These notions in turn allow for more precise definitions of fitting and filling; we restrict ourselves to filling as fitting is processed along same lines: we say that a thing x fills a thing y on top/bottom/on the left-side/on the right-side,

$$\displaystyle{\mathit{fills}_{\alpha }(x,y) \Leftrightarrow \exists z.\mathit{dent}_{\alpha }(z,y) \wedge z = x \cdot y}$$

where α is, respectively, top, bottom, left, right.

This bears on the notion of a join which can be made more precise: we say that a thing x (α,β)–joins things y and z

$$\displaystyle{\mathit{joins}_{\alpha,\beta }(x,y,z) \Leftrightarrow \exists w.w = x + y + z \wedge \mathit{fills}_{\alpha }(x,y) \wedge \mathit{fills}_{\beta }(x,z)}$$

where α, β=top, bottom, left, right.

A very extensive discussion of those aspects is given in Casati and Varzi (1999).

6.1.1 Qualitative Spatial Reasoning

With this analysis we enter the realm of Qualitative Spatial Reasoning. Qualitative spatial reasoning abstracts from quantitative details, cf., Cohn (1996); it is related to design, cf., Booch (1994), and planning, cf., Glasgow (1995).

Spatial reasoning employing mereology is a basis for analysis of semantics of orientational lexemes and semantics of motion, cf., Asher et al. (1995). It is a basis for representation and mapping of environments in behavioral robotics, cf., Kuipers (1994) and Arkin (1998). It is especially important for Geographic Information Systems (Frank and Campari, 1993; Frank and Kuhn, 1995; Hirtle and Frank, 1997; Egenhofer and Golledge, 1997).

Any formal approach to Spatial Reasoning requires Ontology, cf., Guarino (1994), Smith (1989), and Casati et al. (1998). In reasoning with spatial things, of primary importance is to develop an ontology of spatial things, taking into account complexity of these things.

6.1.2 A Case of Spatial Analysis of Limiting Things

We give two examples of spatial reasoning based on mereology. In the first, cf. Polkowski and Semeniuk–Polkowska (2010), we attempt to give descriptions of various notions of boundary, or limiting, things, like separator, border, fence, hedge, confine, involving in this discussion various models of mereology.

We introduce the notion of a separator Sepr(x, z, y) for a triple x, z, y such that π(x, y), π(z, y), extr(x, z) as,

$$\displaystyle{ \mathit{Sepr}(x,z,y) = \mathit{Cls}\{v:\pi (v,y),\mathit{extr}(v,x),\mathit{extr}(v,z)\} }$$
(10.73)

Then,

  1. 1.

    extr(Sepr(x, z, y), x).

  2. 2.

    extr(Sepr(x, z, y), z).

  3. 3.

    π(Sepr(x, z, y), y).

  4. 4.

x = Sepr(Sepr(x, z, y), z, y).

  5. 5.

z = Sepr(Sepr(x, z, y), x, y).

The notion of a separator comes close to the notion of a border: assume that Warszawa, the river Vistula, the left–bank part of Warszawa and its right–bank part are things in a Geographic Information System. Then the river Vistula is the separator between left– and right–bank parts of Warszawa, and it can justly be called the border of either, contrary to the topological boundaries of those parts which are left and right banks of the river.
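The Warszawa example can be sketched in a finite–set model of (10.73), where parts of y are its non–empty subsets, extr is disjointness, and the class of a property is the union of its instances; these model choices, and the place names used as atoms, are our assumptions.

```python
def Sepr(x, z, y):
    """Class of all parts of y exterior to both x and z: in the set
    model this class is simply y - x - z."""
    return y - x - z

left_bank = {"Wola", "Mokotow"}
right_bank = {"Praga"}
warszawa = left_bank | right_bank | {"Vistula"}
sep = Sepr(left_bank, right_bank, warszawa)
print(sep)                          # {'Vistula'}: the river separates the banks
print(sep.isdisjoint(left_bank))    # True: extr(Sepr(x, z, y), x)
```

Properties 1–3 of the separator are immediate in this model: the result is disjoint from both arguments and is a part of the whole.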

The notion of connection, in particular the predicate of external connectedness EC allows for more detailed spatial analysis; in our example of Warszawa and the river Vistula, where things are Vistula and left and right banks of Warszawa, we have EC(Vistula, left − bank side), EC(Vistula, right − bank side). The connection relation in this case is defined in the ROM model. External connection leads to division of ingredients of any thing into two categories: tangential and non–tangential. In order to make this distinction, one introduces the complement of an entity, − x as the class of all entities external to x. The tangential ingredient of x is z such that it is externally connected to an ingredient of − x. In the example of Warszawa and Vistula, the connection boundary of either bank of Warszawa is this bank itself; it is different from the idea of geographic boundary and any reasonable idea of a boundary. In order to rectify the idea, we can introduce a richer universe of parts, e.g., by declaring a part any region contained either in a bank of Warszawa or in the river, but not intersecting any two of these entities. Then the idea of Tarski (1929) may be applied of defining ideal things (‘points’) as limits of ultrafilters of regions. Limits of ultrafilters of regions being parts of a bank of Warszawa constitute the geographic boundary of this bank.

For the notion of confine or extent we can apply rough mereogeometry and the notion of betweenness. Assume for simplicity, that entities are rectangles with sides parallel to coordinate axes. Given rectangles R1, R2, as proved above, the extent of R1, R2 is the smallest rectangle spanned by R1, R2.

6.1.3 A Digression on Time in Mereology

To analyze the notions of a fence and a hedge, we resort to the property of passability: by a fence we understand a structure of iron wire made to be impassable, e.g., to small animals, whereas a hedge is a structure, usually of plants, which we regard as passable. To express this difference, we introduce a new aspect of mereology, viz., timed mereology, due to Tarski (1937) and Woodger (1937, 1939).

The time component is introduced into the framework of mereology with a set of notions and postulates (axioms) concerning aspects of time like momentariness, coincidence in time, and time slices. Things are considered either as purely spatial, with only a momentary relevance to time, or as spatial and extended in time; in the latter case the predicate of part is understood as a global descriptor covering the spatio–temporal extent of things, whereas the temporal extension is described by the predicate Temp, T for short. The intended meaning of T(u, v) is that the thing u precedes in time the thing v (in the terminology of Leśniewski, Tarski and Woodger: u wholly precedes v), i.e., when u and v have some temporal extent, then u ends before or at the precise moment when v begins.

The property (predicate) Mom, meaning momentary being, denotes things having only a spatial aspect. It is introduced by means of the following postulate,

(MOM) Mom(x) ⇔ T(x, x)

Thus, x begins and ends at the same time, so its time aspect is like a spike in time; it renders the phrase ‘to exist in a moment of time’.

The predicate T is required to satisfy the postulates:

  1. TM1 \(T(x,y) \wedge T(y,z) \Rightarrow T(x,z)\).

  2. TM2 \(\mathit{Mom}(x) \wedge \mathit{Mom}(y) \Rightarrow T(x,y) \vee T(y,x)\).

  3. TM3 \(T(x,y) \Leftrightarrow \forall u,v.\mathit{ingr}(u,x) \wedge \mathit{ingr}(v,y) \Rightarrow T(u,v)\).

Postulate TM1 states that T is transitive, TM2 states that of two momentary things one precedes the other, and TM3 relates T to the ingredient relation: x precedes y if and only if each ingredient of x precedes each ingredient of y. Postulate TM3 provides a link between the part–based mereology and the timed mereology, bonding spatial and temporal properties of things.

The notion of a coincidence in time, CT in symbols, is

$$\displaystyle{ \mathit{CT}(x,y) \Leftrightarrow T(x,y) \wedge T(y,x) }$$
(10.74)

and it implies in turn a notion of a time–slice, Slc(x, y), as

$$\displaystyle{ \mathit{Slc}(x,y) \Leftrightarrow \mathit{Mom}(x) \wedge \mathit{ingr}(x,y) \wedge \forall z.[\mathit{ingr}(z,y) \wedge C(z,x) \Rightarrow \mathit{ingr}(z,x)] }$$
(10.75)

and thus a time–slice of a thing y is an ingredient of y which is spatially so arranged that any ingredient of y coinciding with it in time is also its ingredient. Time–slices are unique up to coincidence in time: if x, y are time–slices of z, then x, y coincide in time if and only if x = y.

We use these notions in order to make a distinction between passable and non–passable boundaries, i.e., between hedges and fences. We say that a time–slice x of an entity y is a time–front boundary of y if and only if for each entity z it follows from ingr(z, y) and T(z, x) that ingr(z, x); similarly, a time–slice w of y is a time–rear boundary of y if and only if for each entity z it follows from ingr(z, y) and T(w, z) that ingr(z, w). The Boolean sum x + w of x and w is the time–boundary of y.

The front time boundary x of y is passable (is a front time–hedge of y) if and only if there is an entity z such that T(z, x) and not ingr(z, x); otherwise x is the front time–fence of y. Analogous definitions concern rear time–hedges and rear time–fences. Smith and Varzi (1997) make a distinction between fiat boundaries and bona–fide boundaries, the latter defined as material boundaries of real entities, whereas the former are understood as boundaries induced by cognitive demarcation; time boundaries may serve as an example of the former.

6.1.4 RCC: Region Connection Calculus. ROM Revisited

As an important example of mereological spatial reasoning we introduce here the RCC Calculus (Region Connection Calculus), cf. Randell et al. (1992), Cohn et al. (1993, 1996), Cohn (1996), Cohn and Gotts (1996) and Cohn and Varzi (1998). It is a calculus on closed regular sets (regions) in a regular topological space, i.e., in the frame of ROM. RCC admits Clarke’s connection postulates CN1–CN3 and follows the same lines in defining basic predicates. To preserve the flavor of this theory we give these predicates in the RCC notation

  1. DISCONNECTED FROM(x)(y): \(\mathit{DC}(x,y) \Leftrightarrow \neg C(x,y)\).

  2. IMPROPER PART OF(x)(y): \(P(x,y) \Leftrightarrow \forall z.[C(z,x) \rightarrow C(z,y)]\).

  3. \(\mathit{PROPER}\ \mathit{PART}\ \mathit{OF}(x)(y):\ \mathit{PP}(x,y) \Leftrightarrow P(x,y) \wedge \neg P(y,x)\).

  4. \(\mathit{EQUAL}(x)(y):\ \mathit{EQ}(x,y) \Leftrightarrow P(x,y) \wedge P(y,x)\).

  5. \(\mathit{OVERLAP}(x)(y): \mathit{Ov}(x,y) \Leftrightarrow \exists z.P(z,x) \wedge P(z,y)\).

  6. \(\mathit{DISCRETE}\ \mathit{FROM}(x)(y): \mathit{DR}(x,y) \Leftrightarrow \neg \mathit{Ov}(x,y)\).

  7. \(\mathit{PARTIAL}\ \mathit{OVERLAP}(x)(y): \mathit{POv}(x,y) \Leftrightarrow \mathit{Ov}(x,y) \wedge \neg P(x,y) \wedge \neg P(y,x)\).

  8. \(\mathit{EXTERNALLY}\ \mathit{CONNECTED}(x)(y): \mathit{EC}(x,y) \Leftrightarrow C(x,y) \wedge \neg \mathit{Ov}(x,y)\).

  9. \(\mathit{TANGENTIAL}\ \mathit{PART}\ \mathit{OF}(x)(y): \mathit{TPP}(x,y) \Leftrightarrow \mathit{PP}(x,y)\wedge \exists z.\mathit{EC}(x,z)\wedge \mathit{EC}(y,z)\).

  10. \(\mathit{NON - TANGENTIAL}\ \mathit{PART}\ \mathit{OF}(x)(y): \mathit{NTPP}(x,y) \Leftrightarrow \mathit{PP}(x,y)\wedge \neg \mathit{TPP}(x,y)\).

To each non–symmetric predicate X, RCC adds the inverse Xi (e.g., to TPP(x, y) it adds TPPi(y, x)). The eight predicates DC, EC, PO, EQ, TPP, NTPP, TPPi, NTPPi show the JEPD property (Jointly Exhaustive and Pairwise Disjoint) and they form the fragment of RCC called RCC8.
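As a concrete illustration, the RCC8 relation holding between two regions can be computed directly when regions are closed axis–aligned rectangles (the kind of regions used later in this chapter). The following is our sketch, not part of the calculus itself; a rectangle is assumed to be given as a tuple (x0, y0, x1, y1) with x0 < x1, y0 < y1.

```python
def rcc8(a, b):
    # a, b: closed axis-aligned rectangles (x0, y0, x1, y1).
    # Returns one of the eight JEPD relation names.
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    # closures intersect (connection C in ROM)
    touch = ax0 <= bx1 and bx0 <= ax1 and ay0 <= by1 and by0 <= ay1
    # interiors intersect (overlap Ov)
    interior = ax0 < bx1 and bx0 < ax1 and ay0 < by1 and by0 < ay1
    if not touch:
        return "DC"
    if not interior:
        return "EC"          # connected but interiors disjoint
    if a == b:
        return "EQ"
    a_in_b = bx0 <= ax0 and ax1 <= bx1 and by0 <= ay0 and ay1 <= by1
    b_in_a = ax0 <= bx0 and bx1 <= ax1 and ay0 <= by0 and by1 <= ay1
    # tangential iff the inner rectangle shares part of the boundary
    shared = ax0 == bx0 or ax1 == bx1 or ay0 == by0 or ay1 == by1
    if a_in_b:
        return "TPP" if shared else "NTPP"
    if b_in_a:
        return "TPPi" if shared else "NTPPi"
    return "PO"
```

For instance, two unit squares meeting along an edge are classified EC, while a square strictly inside another is NTPP.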

Table 10.1 Transition table for RCC8 calculus

Due to topological assumptions, RCC has some stronger properties than Clarke’s calculus of C, where connection is simply non–empty set intersection. Witness the following properties:

  1. If \(\forall z.\mathit{Ov}(x,z) \leftrightarrow \mathit{Ov}(y,z)\), then x = y (extensionality of overlapping). (If \(x \neq y\), then, e.g., there is \(z \in x\setminus y\) and regularity of the space yields an open neighborhood V of z such that \(\mathit{Cl}V \cap y = \varnothing \) and Ov(V, x), negating the premise.)

  2. If \(\mathit{PP}(x,y)\), then \(\exists z.P(z,y) \wedge \mathit{DR}(z,x)\).

  3. \(\forall x.\ \mathit{EC}(x,-x)\).

RCC8 allows for additional predicates characterizing shape and connectivity, see Gotts et al. (1996), and regions with vague boundaries (the “egg–yolk” approach), see Gotts and Cohn (1995).

RCC8 is presented in the form of a transition table: a table in which, for entries R1(x, y) and R2(y, z), the possible relations R3(x, z) are given, see Egenhofer (1991). The transition table for RCC8 is shown in Table 10.1.

7 Mereology in Intelligent Planning and Navigation: The Case of Behavioral Robotics

We have stressed that, by its nature, rough mereology addresses concepts whose relations are expressed by partial containment rendered as the predicate of a part to a degree. Behavioral robotics falls into this province, as robots, obstacles and other environmental things are usually modeled as figures or solids. We show applications of mereology to planning and navigation of autonomous mobile robots and their formations. First, we introduce the subject of planning in robotics.

7.1 Planning with Emphasis on Behavioral Robotics

Planning is concerned with setting a trajectory for a robot endowed with sensing devices which allow it to perceive the environment, so that the robot reaches a goal in the environment while bypassing obstacles.

Planning methods, cf., e.g., Choset et al. (2005), vary depending on the robot abilities, features of the environment, and the chosen methodology. Among them are simple geometric methods designed for a robot endowed with sensors detecting obstacles, e.g., touch sensors or range sensors, and able to measure the distance between any pair of points. These methods are called ‘contour following’, as for such a robot one can implement the idea of moving toward the goal along a straight line segment and, on meeting an obstacle, bypassing it by circumnavigating its boundary until the straight line to the goal is encountered anew. Typically, the robot performs a heuristic search of the A* type, see, e.g., Russell and Norvig (2009) or Choset et al. (2005), with the heuristic function \(h(x) =\rho (x,O) +\rho (O,\mathit{goal})\), where x is the current position of the robot, and the point O is selected as an end–point of a continuity interval of ρ, the distance function, whose values are bound by a constant R. When the distance measured by range sensors exceeds R, the value of ρ is set to infinity. The graph of ρ against the position x then exhibits discontinuities, and continuity intervals clearly outline boundaries of obstacles; hence, the idea of selecting O as a boundary continuity point. Minimization of h leads to optimization of the chosen safe trajectory.
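The heuristic selection of the next way–point can be sketched as follows; here the candidate points O (end–points of the continuity intervals of ρ) are assumed to have already been extracted from the range readings, and `choose_waypoint` is a hypothetical name of ours, not taken from the cited sources.

```python
import math

def h(x, O, goal):
    # heuristic h(x) = rho(x, O) + rho(O, goal): length of the path
    # that detours through the candidate boundary point O
    return math.dist(x, O) + math.dist(O, goal)

def choose_waypoint(x, candidates, goal):
    # pick the discontinuity point minimizing the heuristic
    return min(candidates, key=lambda O: h(x, O, goal))
```

For a robot at (0, 0) heading to (10, 0), a candidate at (5, 1) beats one at (3, 4) because the total detour through it is shorter.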

A method of potential field, see Khatib (1986) consists in constructing a potential field composed of attractive potentials for goals and repulsive potentials for obstacles.

An example may be taken as the quadratic potential function

$$\displaystyle{ U_{\mathit{attractive}}(x) = \frac{1} {2} \cdot \vert \vert x - x_{\mathit{goal}}\vert \vert ^{2} }$$
(10.76)

which induces the gradient

$$\displaystyle{ \nabla U_{\mathit{attractive}}(x) = x - x_{\mathit{goal}} }$$
(10.77)

which assures that the force (the gradient) exerted on the robot is greater when the robot is far from the goal and diminishes to zero as the robot is approaching the goal.

A repulsive potential should have the opposite properties: it should exert a force growing without bound as the distance to the obstacle reaches 0. Denoting by s(x) the distance from a point x to the closest obstacle, the repulsive potential can be defined as in

$$\displaystyle{ U_{\mathit{repulsive}}(x) = \frac{1} {2} \cdot [ \frac{1} {s(x)}] }$$
(10.78)

with the gradient

$$\displaystyle{ \nabla U_{\mathit{repulsive}}(x) = - \frac{1} {2s(x)^{2}} \cdot \nabla s(x) }$$
(10.79)

The global potential function U is the sum of the attractive and repulsive parts:

$$\displaystyle{U(x) = U_{\mathit{attractive}}(x) + U_{\mathit{repulsive}}(x)}$$

Given U, the robot performs the well–known gradient descent: it steps against the gradient in small increments; the (i + 1)–th position is obtained from the i–th position and the gradient therein as

$$\displaystyle{ x_{i+1} = x_{i} -\xi _{i} \cdot \nabla U(x_{i}) }$$
(10.80)
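The descent loop with the attractive and repulsive gradients above can be sketched as follows. This is a minimal sketch of ours: each obstacle is idealized as a point, so s(x) is a plain Euclidean distance, and the step size ξ is kept constant.

```python
import math

def grad_attractive(x, goal):
    # gradient of U_attr = (1/2)||x - goal||^2, i.e. x - goal
    return [xi - gi for xi, gi in zip(x, goal)]

def grad_repulsive(x, obstacle):
    # with U_rep = (1/2)(1/s(x)), s(x) the distance to the obstacle,
    # the gradient is -(1/(2 s^2)) * grad s, and grad s = (x - obstacle)/s
    s = math.dist(x, obstacle)
    return [-(xi - oi) / (2.0 * s ** 3) for xi, oi in zip(x, obstacle)]

def descend(x, goal, obstacles, step=0.05, iters=200):
    # gradient descent: step against the gradient of the summed potential
    for _ in range(iters):
        g = grad_attractive(x, goal)
        for o in obstacles:
            g = [gi + ri for gi, ri in zip(g, grad_repulsive(x, o))]
        x = [p - step * gi for p, gi in zip(x, g)]
    return x
```

Starting from (0, 0) with the goal at (5, 5) and a point obstacle off the diagonal, the robot settles close to the goal, the small residual offset being due to the repulsive term.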

In Polkowski and Ośmiałowski (2008, 2010) and Ośmiałowski (2009a) a mereological potential field planning method was proposed.

7.2 Mereological Planning via Potential Fields

Classical methodology of potential fields works with an integrable force field given by the formulas of Coulomb or Newton, which prescribe the force at a given point as inversely proportional to the squared distance from the target; in consequence, the potential is inversely proportional to the distance from the target. The basic property of the potential is that its density (= force) increases in the direction toward the target. We preserve this property in our construction.

We apply the geometric rough inclusion

$$\displaystyle{ \mu ^{G}(x,y,r) \Leftrightarrow \frac{\vert \vert x \cap y\vert \vert } {\vert \vert x\vert \vert } \geq r }$$
(10.81)

where | | x | | is the area of the region x. In our construction of the potential field, regions will be squares: robots are represented by squares circumscribed on them (simulations were performed with disk–shaped Roomba robots, the intellectual property of iRobot, Inc.).
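For axis–aligned rectangles (and hence squares), the geometric rough inclusion reduces to an area computation; a minimal sketch, with rectangles given as tuples (x0, y0, x1, y1):

```python
def area(rect):
    # rect = (x0, y0, x1, y1), axis-aligned, with x0 < x1 and y0 < y1
    x0, y0, x1, y1 = rect
    return (x1 - x0) * (y1 - y0)

def intersection_area(a, b):
    # area of a ∩ b; zero when the rectangles do not overlap
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0.0) * max(h, 0.0)

def mu_G(x, y, r):
    # mu^G(x, y, r) holds iff ||x ∩ y|| / ||x|| >= r
    return intersection_area(x, y) / area(x) >= r
```

Two unit squares sharing half their area are included in each other to degree 0.5, so μG holds at r = 0.5 but not at r = 0.6.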

Geometry induced by means of a rough inclusion can be used to define a generalized potential field: the force field in this construction can be interpreted as the density of squares that fill the workspace and the potential is the integral of the density. We present now the details of this construction. We construct the potential field by a discrete construction. The idea is to fill the free workspace of a robot with squares of fixed size in such a way that the density of the square field (measured, e.g., as the number of squares intersecting the disc of a given radius r centered at the target) increases toward the target.

To ensure this property, we fix a real number, the field growth step, in the interval (0, square edge length); in our exemplary case the parameter field growth step is set to 0.01.

The collection of squares grows recursively with the distance from the target, by adding to a given square in the (k + 1)–th step all squares obtained from it by translating it by k × field growth step (with respect to the Euclidean distance) in the basic eight directions: N, S, W, E, NE, NW, SE, SW (in the implementation of this idea, the floodfill algorithm with a queue has been used, see Ośmiałowski (2009a,b)). Once the square field is constructed, the path for a robot from a given starting point toward the target is searched for.
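The growth process can be sketched with a breadth–first floodfill; this is a much–simplified illustration of ours, not the cited implementation: we track only square centers, label each with its generation, and (as an assumption made here) translate a generation–k center by (edge + k·d), so that consecutive generations are spaced ever farther apart, realizing the decreasing density away from the target.

```python
from collections import deque

def grow_field(target, edge, d, max_gen):
    # target: center of the target square; edge: square edge length;
    # d: field growth step; returns {center: generation}
    dirs = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
            (0, 1), (1, -1), (1, 0), (1, 1)]
    gen = {target: 0}
    queue = deque([target])
    while queue:
        c = queue.popleft()
        k = gen[c]
        if k == max_gen:
            continue
        step = edge + k * d          # spacing grows with the generation
        for dx, dy in dirs:
            n = (round(c[0] + dx * step, 9), round(c[1] + dy * step, 9))
            if n not in gen:         # floodfill: visit each center once
                gen[n] = k + 1
                queue.append(n)
    return gen
```

The breadth–first queue guarantees that each center receives the smallest generation number reaching it, which plays the role of the “number of growth steps from the target”.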

The idea of this search consists in finding a sequence of way–points which delineate the path to the target. Way–points are found recursively as centroids of unions of squares mereologically closest to the square of the recently found way–point. We recall that the mereological distance between squares x, y is defined by means of

$$\displaystyle{ k(x,y) = \mathit{min}\{\ \mathit{max}\ r:\mu (x,y,r),\mathit{max}\ s:\mu (y,x,s)\} }$$
(10.82)
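Under the geometric rough inclusion (10.81), the maximal r with μ(x, y, r) equals ||x ∩ y||/||x||, so for axis–aligned rectangles the mereological distance can be sketched as:

```python
def area(rect):
    # rect = (x0, y0, x1, y1), axis-aligned
    return (rect[2] - rect[0]) * (rect[3] - rect[1])

def overlap_area(a, b):
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0.0) * max(h, 0.0)

def mereo_dist(x, y):
    # k(x, y) = min of the two maximal inclusion degrees:
    # ||x ∩ y|| / ||x|| and ||x ∩ y|| / ||y||
    i = overlap_area(x, y)
    return min(i / area(x), i / area(y))
```

The sketch yields k(x, x) = 1 and k(x, y) = 0 for rectangles meeting at most along boundaries, matching the endpoint behavior described in the text.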

We also recall that the mereological distance k(x, y) takes on the value 1 when x = y and the minimal value 0 exactly when \(x \cap y \subseteq \mathit{Bd}(x) \cap \mathit{Bd}(y)\). In order to define a “potential” of the rough mereological field, let us consider how many generations of squares will be centered within the distance r from the target. Clearly, we have

$$\displaystyle{ d + 2d +\ldots +kd \leq r }$$
(10.83)

where d is the field growth step, k is the number of generations. Hence,

$$\displaystyle{ \frac{k^{2}} {2} d \leq \frac{k(k + 1)} {2} d \leq r }$$
(10.84)

and thus

$$\displaystyle{ k \leq (\frac{2r} {d})^{\frac{1} {2} } }$$
(10.85)

The potential V (r) can be taken as \(\sim r^{\frac{1} {2} }\). The force field F(r) is the negative gradient of V (r),

$$\displaystyle{ F(r) = -\frac{d} {\mathit{dr}}V (r) \sim - \frac{1} {r^{\frac{1} {2} }} }$$
(10.86)

Hence, the force decreases with the distance r from the target more slowly than the traditional Coulomb force; this has the advantage of slowing the robot down as it closes in on the target. Parameters of this procedure are: the field growth step, set to 0.01, and the size of squares, which in our case is 1.5 times the diameter of the Roomba robot.

A robot should follow the path proposed by the planner, as shown in Fig. 10.1.

Fig. 10.1

Planned paths of Roomba robots to their targets

7.3 Planning for Teams of Robots

Problems of planning paths for teams of robots present an intellectual challenge due to aspects of cooperation, communication, task–sharing and division, and planning non–collision paths for robots. These problems require studies of cognitive theories, biology, ethology, organization and management. They can also lead to new solutions to problems of artificial intelligence. Passing from a single robot to teams of robots can be motivated also by pragmatic reasons, cf., Cao et al. (1997), as tasks for robots can be too complex for a single robot, or many robots can do the task easier at a lesser cost, or many robots can perform the task more reliably.

Practical studies along these lines were concerned with moving large things of irregular shapes by groups of robots, see Kube and Zhang (1996), search and rescue, see Jennings et al. (2001), formations of planetary outposts of mobile robots, see Huntsberger et al. (2007), and multi–target inspection, see Parker (1997). Simulations of systems of a few robots were studied, e.g., in CEBOT, see Fukuda and Nakagawa (1987), ACTRESS, see Asama et al. (1989), GOFER, see Caloud et al. (1990), cf., the ALLIANCE architecture in Parker (1998).

Many authors attacked these problems by extending methods elaborated for a single robot; helpful in those attempts were studies of the flocking behavior of migrating birds, simulated as ‘boids’, cf., Reynolds (1987), which brought forth elementary behaviors like collision–avoidance, velocity adjustment, leader–following, and flock–centering, transferred into the robot milieu, e.g., in Matarić (1993, 1994, 1997), Fredslund and Matarić (2002), Agah (1996), and Agah and Bekey (1997), which provided elementary robot behaviors like wandering, homing, following, avoidance, aggregation, and dispersion.

In Balch and Arkin (1998), basic principles of the behavioral approach were formulated: it is vital to keep all robots within a certain distance from one another (e.g., to ensure mutual visibility), to move away when robots come too close to one another (to avoid congestion, collision, or resource conflict), to adapt one’s own movement to the movement of neighbors (e.g., by adjusting the velocity of motion), and to orient oneself on a leader or a specific location, e.g., the gravity center of the group. They proposed that robots in a team obey rigid geometric constraints by means of references to the center of the group, to the assigned leader, or to the assigned neighbor.

7.4 Mereological Approach to Robot Formations

We recall that on the basis of the rough inclusion μ and the mereological distance κ, the geometric predicates of nearness and betweenness are redefined in mereological terms.

Given two robots a, b as discs of the same radii, and their safety regions as circumscribed, regularly positioned rectangles A, B, we search for a proper choice of a region X containing A and B with the property that a robot C contained in X can be said to be between A and B.

For two (possibly but not necessarily) disjoint rectangles A, B, we define the extent ext(A, B) of A and B as the smallest rectangle containing the union \(A \cup B\). We know that in this setting, given two disjoint rectangles C, D, the only thing between C and D in the sense of the predicate TB is the extent ext(C, D), i.e., the minimal rectangle containing the union \(C \cup D\).

For details of the exposition which we give now, please consult Ośmiałowski (2011) and Ośmiałowski and Polkowski (2009).

For robots a, b, c, we say that a robot b is between robots a and c, in symbols

$$\displaystyle{ (\text{between}\ b\ a\ c) }$$
(10.87)

in case the rectangle ext(b) is contained in the extent of rectangles ext(a), ext(c), i.e.,

$$\displaystyle{ \mu _{0}(\mathit{ext}(b),\mathit{ext}(\mathit{ext}(a),\mathit{ext}(c)),1) }$$
(10.88)

This can be generalized to the notion of partial betweenness which models in a more realistic manner spatial relations among a, b, c; we say in this case that robot b is between robots a and c to a degree of at least r, in symbols,

$$\displaystyle{ (\text{between\textendash deg}\ r\ b\ a\ c) }$$
(10.89)

if and only if

$$\displaystyle{ \mu _{0}(\mathit{ext}(b),\mathit{ext}[\mathit{ext}(a),\mathit{ext}(c)],r), }$$
(10.90)

i.e.,

$$\displaystyle{\frac{\vert \vert \mathit{ext}(b) \cap \mathit{ext}(\mathit{ext}(a),\mathit{ext}(c))\vert \vert } {\vert \vert \mathit{ext}(b)\vert \vert } \geq r}$$
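The degree of betweenness just defined is straightforward to compute for axis–aligned rectangles; a sketch of ours, with robots already given by their rectangular extents (x0, y0, x1, y1):

```python
def extent(a, b):
    # smallest axis-aligned rectangle containing rectangles a and b
    return (min(a[0], b[0]), min(a[1], b[1]),
            max(a[2], b[2]), max(a[3], b[3]))

def area(rect):
    return (rect[2] - rect[0]) * (rect[3] - rect[1])

def overlap_area(a, b):
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0.0) * max(h, 0.0)

def between_deg(b, a, c, r):
    # (between-deg r b a c): ||ext(b) ∩ ext(a, c)|| / ||ext(b)|| >= r
    e = extent(a, c)
    return overlap_area(b, e) / area(b) >= r
```

With degree r = 1 this reduces to full containment of the middle robot’s extent, i.e., to the crisp betweenness predicate above.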

For a team of robots, T(r1, r2, …, rn) = {r1, r2, …, rn}, an ideal formation IF on T(r1, r2, …, rn) is a betweenness relation (between …) on the set T(r1, r2, …, rn).

In implementations, ideal formations are represented as lists of expressions of the form

$$\displaystyle{ (\text{between}\ r_{0}\ r_{1}\ r_{2}) }$$
(10.91)

indicating that the thing r0 is between r1, r2, for all such triples, along with a list of expressions of the form

$$\displaystyle{ (\text{not\textendash between}\ r_{0}\ r_{1}\ r_{2}) }$$
(10.92)

indicating triples which are not in the given betweenness relation.

To account for the dynamic nature of the real world, in which sensory perception is inadequate and the environment changes, we allow for some deviations from ideal formations: the robot which is between two neighbors may be between them to a degree in the sense of (10.89). This leads to the notion of a real formation.

For a team of robots, T(r1, r2, …, rn) = {r1, r2, …, rn}, a real formation RF on T(r1, r2, …, rn) is a betweenness–to–degree relation (between–deg …) on the set T(r1, r2, …, rn) of robots.

In practice, real formations will be given as a list of expressions of the form,

$$\displaystyle{ (\text{between\textendash deg}\ \delta \ r_{0}\ r_{1}\ r_{2}), }$$
(10.93)

indicating that the thing r0 is to a degree of at least δ in the extent of r1, r2, for all triples in the relation (between–deg …), along with a list of expressions of the form,

$$\displaystyle{ (\text{not\textendash between}\ r_{0}\ r_{1}\ r_{2}), }$$
(10.94)

indicating triples which are not in the given betweenness relation.

Description of formations, as proposed above, can be a list of relation instances of large cardinality, cf., examples below. The problem can be posed of finding a minimal set of instances sufficient for describing a given formation, i.e., implying the full list of instances of the relation (between). This problem turns out to be NP–hard, see Ośmiałowski and Polkowski (2009).

To describe formations we propose a language derived from LISP–like s–expressions: a formation is a list in the LISP sense, with some restrictions which define our language. We call the elements of the list things. Typically, LISP lists are hierarchical structures that can be traversed using recursive algorithms. We require that the top–level list (the root of the whole structure) contains only two elements, where the first element is always a formation identifier (a name). For instance

Example 1.

(formation1 (some_predicate param1… paramN))

For each thing on a list (and for a formation as a whole) an extent can be derived and, in fact, in most cases only extents of those things are considered. We have defined two possible types of things:

  1. Identifier: a robot or formation name (where a formation name can only occur at the top–level list as the first element);

  2. Predicate: a list in the LISP sense where the first element is the name of the given predicate and the other elements are parameters; the number and types of parameters depend on the given predicate.

A minimal formation should contain at least one robot. For example

Example 2.

(formation2 roomba0)

To help understand how predicates are evaluated, we need to explain how extents are used for computing relations between things. Suppose we have three robots (roomba0, roomba1, roomba2) with roomba0 between roomba1 and roomba2 (so the between predicate is fulfilled). We can draw an extent of this situation as the smallest rectangle containing the union \(\mathit{roomba}1 \cup \mathit{roomba}2\), oriented as a regular rectangle, i.e., with edges parallel to the coordinate axes. This extent can be embedded into a bigger structure: it can be treated as a thing that can be given as a parameter to a predicate of a higher level in the list hierarchy. For example:

Example 3.

(formation3 (between (between roomba0 roomba1 roomba2) roomba3 roomba4))

We can easily find more than one situation of robots that fulfills this example description. That is one of the features of our approach: one s–expression can describe many situations. This, however, makes it very hard to find a minimal s–expression describing an already given arrangement of robots into a formation (as stated earlier in this chapter, the problem is NP–hard).

A typical formation description may look like the following, see Ośmiałowski (2011)

Example 4.

(cross

(set

   (max–dist 0.25 roomba0 (between roomba0 roomba1 roomba2))

   (max–dist 0.25 roomba0 (between roomba0 roomba3 roomba4))

   (not–between roomba1 roomba3 roomba4)

   (not–between roomba2 roomba3 roomba4)

   (not–between roomba3 roomba1 roomba2)

   (not–between roomba4 roomba1 roomba2)

)

)

This is a description of a formation of five Roomba robots arranged in a cross shape. The max–dist relation is used to bound the formation in space by keeping all robots close to one another.
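Checking such a description against an arrangement of robots can be sketched as follows. This is a deliberately simplified illustration of ours: we evaluate only a flat list of (between …) and (not–between …) instances over the robots’ rectangular extents, ignoring the nesting and the max–dist predicate which the full language supports.

```python
def extent(a, b):
    # smallest axis-aligned rectangle containing rectangles a and b
    return (min(a[0], b[0]), min(a[1], b[1]),
            max(a[2], b[2]), max(a[3], b[3]))

def between(b, a, c, rects):
    # (between b a c): the extent of b lies inside ext(ext(a), ext(c))
    e = extent(rects[a], rects[c])
    rb = rects[b]
    return (e[0] <= rb[0] and e[1] <= rb[1] and
            rb[2] <= e[2] and rb[3] <= e[3])

def holds(instance, rects):
    # instance: ('between', b, a, c) or ('not-between', b, a, c)
    tag, b, a, c = instance
    ok = between(b, a, c, rects)
    return ok if tag == 'between' else not ok

def formation_holds(instances, rects):
    # a (flat) formation holds iff every listed instance is satisfied
    return all(holds(i, rects) for i in instances)
```

One arrangement of extents may satisfy many descriptions, and one description many arrangements, in line with the relational (non–metric) character of the language.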

We show a screen–shot of robots in the initial formation of cross–shape in a crowded environment, see Figs. 10.2 and 10.3. These behaviors witness the flexibility of our definition of a robot formation: first, robots can change formation; next, as the definition of a formation is relational, without metric constraints on robots, the formation can manage an obstacle without losing the prescribed formation (though this feature is not illustrated in the figures in this chapter).

Fig. 10.2

Trails of robots moving in the line formation through the passage (From Polkowski (2011))

Fig. 10.3

Trails of robots in the restored cross formation in the free workspace after passing through the passage (From Polkowski (2011))

8 Mereology in Knowledge Granulation and Reasoning About Knowledge

The topic of knowledge engineering in computer science encompasses problems of representation, extraction (data mining), reasoning about, and application of knowledge. We represent knowledge as annotated data, expressed in symbolic, numeric, or hybrid form, which encode information about a considered case.

8.1 Representation of Knowledge: Information/Decision Systems

We assume that knowledge is represented in the form of information or decision systems. An information system, cf., e.g., Pawlak (1991) or Polkowski (2002), is a pair (U, A), where U is a finite set of things and A is a finite set of attributes; each attribute a is a mapping a: U → V from the set U into a set V of attribute values. For each thing u ∈ U, the information vector of u is the set

$$\displaystyle{\mathit{Inf }_{A}(u) =\{ (a,a(u)): a \in A\}.}$$

A decision system adds to the set A a decision attribute d ∉ A. Knowledge can be extracted from either system in the form of (1) a classification into categories or (2) a decision algorithm, which is a judiciously chosen set of decision rules.

Classification into categories in an information system (U, A) relies on the indiscernibility in the sense of Leibniz (1969) and Forrest (2010): for a set \(B \subseteq A\) of attributes, one defines the B–indiscernibility relation INDB as

$$\displaystyle{ \mathit{IND}_{B}(u,v) \Leftrightarrow a(u) = a(v)\ \mathit{for}\ \mathit{each}\ a \in B }$$
(10.95)

Classes {[u]B: u ∈ U} of the relation INDB form B–categories.
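Computing the classes of INDB from (10.95) amounts to grouping things by their B–attribute value vectors; a minimal sketch, with an information system given as a nested dictionary:

```python
from collections import defaultdict

def ind_classes(info, B):
    # info: {thing: {attribute: value}}; B: list of attributes.
    # Things with identical value vectors over B fall into one class.
    classes = defaultdict(list)
    for u, row in info.items():
        classes[tuple(row[a] for a in B)].append(u)
    return list(classes.values())
```

Shrinking B can only merge classes, reflecting that coarser attribute sets discern fewer things.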

In a decision system (U, A, d), perceived as a window on a (possibly unknown) function fA, d from A–categories onto d–categories, an approximation to fA, d can be searched for in the form of a set of decision rules, each an implication

$$\displaystyle{ \bigwedge _{a\in B}(a,w_{a}) \Rightarrow (d,w_{d}) }$$
(10.96)

where the descriptor (a, wa) is a logical formula interpreted as \([(a,w_{a})] =\{ u \in U: a(u) = w_{a}\}\), extended recursively as \([\alpha \wedge \beta ] = [\alpha ] \cap [\beta ]\), \([\alpha \vee \beta ] = [\alpha ] \cup [\beta ]\), \([\neg \alpha ] = U\setminus [\alpha ]\). The implication in the formula (10.96) is satisfied to a degree r with respect to a set rough inclusion μS in case \(\mu ^{S}([\bigwedge _{a\in B}(a,w_{a})],[(d,w_{d})],r)\). In case r = 1 the rule is true.

8.2 Decision Rules

A decision algorithm (classifier) is a judiciously chosen set of decision rules, approximating possibly most closely the real decision function fA, d. This comes down to a search in the space of possible descriptors for their successful combinations. In order to judge the quality, or degree, of approximation, decision rules are learned on a part of the decision system, the training set, and then the decision algorithm is tested on the remaining part of the decision system, called the test set. The degree of approximation is measured by coefficients of varied character. Simple measures of statistical character are found from the contingency table, see Arkin and Colton (1970). This table is built for each decision rule r and a decision value v by counting: the number nt of training things; the number nr of things satisfying the premise of the rule r (caught by the rule); the number nr(v) of things counted in nr and with the decision v; and the number nrv) of things counted in nr but with a decision value distinct from v. To these factors, we add nv, the number of training things with decision v, and \(n_{\neg v}\), the number of remaining things, i.e., \(n_{\neg v} = n_{t} - n_{v}\).

For these values, accuracy of the rule r relative to v is the quotient

$$\displaystyle{ \mathit{acc}(r,v) = \frac{n_{r}(v)} {n_{r}} }$$
(10.97)

and coverage of the rule r relative to v is

$$\displaystyle{ \mathit{cov}(r,v) = \frac{n_{r}(v)} {n_{v}} }$$
(10.98)

These values are useful as indicators of a rule strength which is taken into account when classification of a test thing is under way: to assign the value of decision, a rule pointing to a decision with a maximal value of accuracy, or coverage, or combination of both can be taken; methods for combining accuracy and coverage into a single criterion are discussed, e.g., in Michalski (1990). Accuracy and coverage can, however, be defined in other ways; for a decision algorithm D, trained on a training set Tr, and a test set Tst, the accuracy of D is measured by its efficiency on the test set and it is defined as the quotient

$$\displaystyle{ \mathit{accuracy}(D) = \frac{n_{\mathit{corr}}} {n_{\mathit{caught}}} }$$
(10.99)

where ncorr is the number of test things correctly classified by D and ncaught is the number of test things classified.

Similarly, coverage of D is defined as

$$\displaystyle{ \mathit{coverage}(D) = \frac{n_{\mathit{caught}}} {n_{\mathit{test}}} }$$
(10.100)

where ntest is the number of test things. Thus, the product accuracy(D) ⋅ coverage(D) gives the measure of the fraction of test things correctly classified by D.

We have already mentioned that accuracy and coverage are often advised to be combined in order to better express the trade–off between the two: one may have a high accuracy on a relatively small set of caught things, or a lesser accuracy on a larger set of things caught by the classifier. Michalski (1990) proposes a combination rule of the form

$$\displaystyle{ \mathit{MI}\ =\ \frac{1} {2} \cdot A + \frac{1} {4} \cdot A^{2} + \frac{1} {2} \cdot C -\frac{1} {4} \cdot A \cdot C }$$
(10.101)

where A stands for accuracy and C for coverage. With the symbol MI, we denote the Michalski index as defined in (10.101). Other rule quality measures can be found, e.g., in Bruning and Kintz (1997), Bazan (1998), and Grzymala–Busse and Hu (2000).
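The rule measures (10.97), (10.98) and the index (10.101) are direct to compute from the contingency counts; a sketch:

```python
def accuracy_rule(n_rv, n_r):
    # acc(r, v) = n_r(v) / n_r, cf. (10.97)
    return n_rv / n_r

def coverage_rule(n_rv, n_v):
    # cov(r, v) = n_r(v) / n_v, cf. (10.98)
    return n_rv / n_v

def michalski_index(A, C):
    # MI = (1/2)A + (1/4)A^2 + (1/2)C - (1/4)AC, cf. (10.101)
    return 0.5 * A + 0.25 * A * A + 0.5 * C - 0.25 * A * C
```

Note that a perfect rule, with A = C = 1, attains MI = 1, while A = C = 0 gives MI = 0.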

Whereas indiscernibility classes are computationally feasible, cf., Skowron and Rauszer (1992), decision rules in optimal form are not, cf., op. cit. Methods for generation of rules with minimal set of descriptors, optimal rules, true rules, minimal sets of rules, strong (association) rules, etc., can be found in Pawlak and Skowron (1993), Skowron (1993), Skowron and Rauszer (1992), Grzymala–Busse (1992), and Agrawal et al. (1993).

8.3 Mereology as Similarity: Granulation of Knowledge

The creator of Fuzzy Set Theory, Lotfi A. Zadeh (1979), proposed to compute with granules of knowledge. Zadeh posed that the process of extraction of knowledge can be factored through a stage of granulation, in which things are aggregated into granules of knowledge understood as collections or classes of things similar with respect to a chosen measure of similarity. The resulting granular computing, i.e., processing of granules of knowledge, promises lesser complexity as well as noise filtering.

In the case discussed here, we choose a rough inclusion as the similarity measure; it provides a similarity–to–a–degree relation which is reflexive but not always symmetric: e.g., set and geometric rough inclusions are not symmetric, whereas rough inclusions of type (ari) or (airi) are, therefore inducing a hierarchy of tolerance–to–a–degree relations; for a theory of tolerance relations, see, e.g., Shreider (1960).

The idea of mereological granulation of knowledge, see Polkowski (2004a, 2005a), cf., the surveys Polkowski (2008, 2009a), presented here finds an effective application in problems of synthesis of classifiers from data tables. This application consists in granulation of data at the preprocessing stage of the synthesis process: after granulation, a new data set, called a granular reflection, is constructed, to which various strategies for rule synthesis can be applied. This application can be regarded as a process of filtration of data, aimed at reducing noise immanent to data. Application of rough inclusions leads to a formal theory of granules of various radii, allowing for various choices of the degree of coarseness in data.

For a given rough inclusion μ, the granule gμ(u, r) of the radius r about the center u is defined as the class of the property Φu,rμ:

$$\displaystyle{ g_{\mu }(u,r) = \mathit{Cls}\varPhi _{u,r}^{\mu } }$$
(10.102)

where

$$\displaystyle{ \varPhi _{u,r}^{\mu }(v) \Leftrightarrow \mu (v,u,r) }$$
(10.103)

Properties of granules depend, obviously, on the type of rough inclusion used in their definitions. In case of a symmetric and transitive rough inclusion μ, for each pair u, v of things, and r ∈ [0, 1],

$$\displaystyle{\mathit{ingr}(v,g_{\mu }(u,r)) \Leftrightarrow \mu (v,u,r)}$$

holds which follows directly from the inference rule M3.

In effect, the granule gμ(u, r) can be represented as the set {v: μ(v, u, r)}. To justify this claim, assume that ingr(v, gμ(u, r)) holds. Thus, there exists z such that Ov(z, v) and μ(z, u, r). There is x with ingr(x, v), ingr(x, z), hence, by transitivity of μ, also μ(x, u, r) holds. As ingr(x, v) gives μ(x, v, 1), symmetry of μ yields μ(v, x, 1), hence, by transitivity again, μ(v, u, r) holds.

A more complicated case of other types of rough inclusions is discussed in Polkowski (2011).
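The set representation above can be sketched in code. The following is a minimal sketch, assuming an attribute-agreement rough inclusion (the fraction of attributes on which two things agree, in the spirit of the μL used later in this chapter) and a toy data-table encoding; both are assumptions of the example, not taken from the text.

```python
# A minimal sketch: granules g_mu(u, r) as sets {v : mu(v, u, r)},
# with mu assumed to be the attribute-agreement rough inclusion
# mu_L(v, u, r) iff |IND(u, v)| / |A| >= r, IND(u, v) = {a : a(u) = a(v)}.
# The data-table encoding (dict of attribute values per thing) is illustrative.

def mu_L(v, u, attributes):
    """Fraction of attributes on which things u and v agree."""
    agree = sum(1 for a in attributes if v[a] == u[a])
    return agree / len(attributes)

def granule(u, r, universe, attributes):
    """The granule about u of radius r, represented as a set of things."""
    return [v for v in universe if mu_L(v, u, attributes) >= r]

# A toy decision-table universe: things are attribute-value records.
A = ["a1", "a2", "a3"]
U = [
    {"id": 0, "a1": 1, "a2": 0, "a3": 1},
    {"id": 1, "a1": 1, "a2": 0, "a3": 0},
    {"id": 2, "a1": 0, "a2": 1, "a3": 0},
]

g = granule(U[0], 2/3, U, A)
# Thing 0 agrees with itself on 3/3 attributes and with thing 1 on 2/3,
# so both fall into the granule of radius 2/3.
print([v["id"] for v in g])
```

Note that reflexivity of μ guarantees that the center u always belongs to its own granule.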

Our idea of augmenting existing strategies for rule induction consists in using granules of knowledge. The principal assumption we make is that nature acts in a continuous way: if things are similar with respect to judiciously and correctly chosen attributes, then decisions on them should also be similar. A granule collecting similar things should then expose the most typical decision value for the things in it, while suppressing outlying decision values, reducing noise in data and, hence, leading to a better classifier.

In Polkowski and Artiemjew (2007) and in Artiemjew (2007), the application merits of this theoretical analysis were confirmed. We proceed with a summary of the methods and results of this verification.

8.4 The Idea of Granular Mereological Classifiers

We assume that we are given a decision system (U, A, d) from which a classifier is to be constructed; on the universe U, a rough inclusion μ is given, and a radius r ∈ [0, 1] is chosen, see Polkowski (2004a, 2005a). We can find granules gμ(u, r) for all u ∈ U, and make them into the set G(μ, r).

From this set, a covering Cov(μ, r) of the universe U can be selected by means of a chosen strategy \(\mathcal{G}\), i.e.,

$$\displaystyle{ \mathit{Cov}(\mu,r) = \mathcal{G}(G(\mu,r)) }$$
(10.104)

We intend Cov(μ, r) to become the universe of a new decision system, called the granular reflection of the original decision system. It remains to define attributes for this new system.

Each granule g in Cov(μ, r) is a collection of things; attributes in the set \(A \cup \{ d\}\) can be factored through the granule g by means of a chosen strategy \(\mathcal{S}\), i.e., for each attribute \(q \in A \cup \{ d\}\), the new factored attribute \(\overline{q}\) is defined by means of the formula

$$\displaystyle{ \overline{q}(g) = \mathcal{S}(\{q(v): \mathit{ingr}(v,g)\}) }$$
(10.105)

In effect, a new decision system \((Cov(\mu,r),\{\overline{a}: a \in A\},\overline{d})\) is defined. The thing v with

$$\displaystyle{ Inf(v) =\{ (\overline{a} = \overline{a}(g)): a \in A\} }$$
(10.106)

is called the granular reflection of g.
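The construction (10.104)–(10.106) can be sketched as follows. This is a sketch under assumptions: the agreement-based rough inclusion, a greedy random covering standing in for the random irreducible choice of the strategy \(\mathcal{G}\), and majority voting as the strategy \(\mathcal{S}\); all names and the toy data are hypothetical.

```python
# A sketch of the granular reflection (10.104)-(10.106), under assumptions:
# mu is the attribute-agreement rough inclusion, the covering strategy G is
# a greedy random choice, and the factoring strategy S is majority voting
# with random resolution of ties.
import random
from collections import Counter

def granule(u, r, U, A):
    return [v for v in U if sum(v[a] == u[a] for a in A) / len(A) >= r]

def covering(U, A, r, seed=0):
    """Greedily pick granules at random centers until U is covered."""
    rng = random.Random(seed)
    uncovered, cover = list(U), []
    while uncovered:
        g = granule(rng.choice(uncovered), r, U, A)
        cover.append(g)
        uncovered = [u for u in uncovered if u not in g]
    return cover

def reflect(g, attrs, rng):
    """Majority voting over the granule, random resolution of ties."""
    def vote(q):
        counts = Counter(v[q] for v in g)
        top = max(counts.values())
        return rng.choice([val for val, c in counts.items() if c == top])
    return {q: vote(q) for q in attrs}

A = ["a1", "a2", "d"]
U = [{"a1": 1, "a2": 0, "d": 0}, {"a1": 1, "a2": 0, "d": 0},
     {"a1": 1, "a2": 1, "d": 1}, {"a1": 0, "a2": 1, "d": 1}]
rng = random.Random(1)
reflection = [reflect(g, A, rng) for g in covering(U, A[:2], 0.5)]
print(len(reflection) <= len(U))  # granulation never enlarges the table
```

The granular reflection typically has fewer rows than the original table, which is the source of the classifier compression discussed below.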

Granular reflections of granules need not be things found in the data set; yet, the results show that they mediate very well between the training and test sets. In order to demonstrate the merits of this approach, we consider a standard data set, the Australian Credit Data Set from the Repository at UC Irvine (2012), and we collect the best results for this data set by various rough set based methods in Table 10.2. For comparison, we include in Table 10.3 results obtained by some other methods, as given in Statlog. In Table 10.4, we give a comparison of the performance of rough set classifiers: exhaustive, covering and LEM (Grzymala–Busse, 1992), implemented in the RSES (2012) public domain system. We begin in the next section with granular classifiers in which granules are induced from the training set.

Table 10.2 Best results for Australian credit by some rough set based algorithms
Table 10.3 A comparison of errors in classification by rough set and other paradigms
Table 10.4 Train and test (trn = 345 things, tst = 345 things); Australian credit; comparison of RSES implemented algorithms exhaustive, covering and LEM

8.5 Classification by Granules of Training Things

We begin with a classifier in which granules computed by means of the rough inclusion μL form a granular reflection of the data set; then the exhaustive classifier, see RSES (2012), is applied to this new data set.

8.5.1 Procedure of the Test

  1. The data set (U, A, d) is input;

  2. The training set is chosen at random. On the training set, decision rules are induced by means of the exhaustive, covering and LEM algorithms implemented in the RSES system;

  3. Classification is performed on the test set by means of the classifiers of pt. 2;

  4. For consecutive granulation radii r, granule sets G(μ, r) are found;

  5. Coverings Cov(μ, r) are found by a random irreducible choice;

  6. For granules in Cov(μ, r), for each r, we determine the granular reflection by factoring attributes on granules by means of majority voting with random resolution of ties;

  7. For the found granular reflections, classifiers are induced by means of the algorithms of pt. 2;

  8. Classifiers found in pt. 7 are applied to the test set;

  9. Quality measures, accuracy and coverage, are applied to the classifiers in order to compare the results obtained, respectively, in pts. 3 and 8.

In Table 10.5, we collect the results obtained after the procedure described above is applied. We compare results in terms of the Michalski index MI as a measure of the trade–off between accuracy and coverage; for template based methods, the best MI is 0.891, for the covering and LEM algorithms the best value of MI is 0.804, for the exhaustive classifier (r = nil) MI equals 0.907, and for granular reflections the best MI value is 0.915, with a few other values exceeding 0.900.

Table 10.5 Train–and–test; Australian Credit; Granulation for radii r; RSES exhaustive classifier; r = granule radius, tst = test set size, trn = train set size, rulex = rule number, aex = accuracy, cex = coverage

What seems worthy of a moment’s reflection is the number of rules in the classifier. Whereas for the exhaustive classifier (r = nil) in the non–granular case the number of rules is 5,597, in the granular case the number of rules can be surprisingly small with a good MI value, e.g., at r = 0.5 the number of rules is 293, i.e., 5 % of the exhaustive classifier size, with the best MI of all, 0.915. This compression of the classifier seems to be the most impressive feature of granular classifiers.

Table 10.6 Train–and–test; Australian credit; (layered granulation)

It is an obvious idea that this procedure can be repeated until a stable system is obtained, to which further granulation causes no change; this is the procedure of layered granulation, see Artiemjew (2007). Table 10.6 shows some best results of this procedure for selected granulation radii. As coverage in all reported cases is equal to 1.0, the Michalski index MI is equal to accuracy. This initial, simple granulation suggests further ramifications. For instance, one can consider, for a chosen value of \(\varepsilon \in [0,1]\), granules of the form

$$\displaystyle{ g_{\mu }(u,r,\varepsilon ) =\{ v \in U: \forall a \in A.\vert a(u) - a(v)\vert \leq \varepsilon \} }$$
(10.107)

and repeat with these granules the procedure of creating a granular reflection and building from it a classifier. Another yet variation consists in mimicking the performance of the Łukasiewicz based rough inclusion and introducing a counterpart of the granulation radius in the form of the catch radius, rcatch. The granule is then dependent on two parameters: \(\varepsilon\) and rcatch, and its form is

$$\displaystyle{ g_{\mu }(u,\varepsilon,r_{\mathit{catch}}) =\{ v \in U: \frac{\vert \{a \in A: \vert a(u) - a(v)\vert \leq \varepsilon \}\vert } {\vert A\vert } \geq r_{\mathit{catch}}\} }$$
(10.108)

Results of classification by granular classifier induced from the granular reflection obtained by means of granules (10.108) are shown in Table 10.7.
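The two-parameter granules (10.108) can be sketched as follows; the data encoding and the particular values of \(\varepsilon\) and rcatch are assumptions of the example.

```python
# A sketch of the two-parameter granules (10.108): a thing v joins the
# granule about u when the fraction of attributes on which u and v differ
# by at most eps reaches the catch radius. Data encoding is illustrative.

def granule_eps_catch(u, eps, r_catch, U, A):
    def close_frac(v):
        return sum(abs(u[a] - v[a]) <= eps for a in A) / len(A)
    return [v for v in U if close_frac(v) >= r_catch]

A = ["a1", "a2"]
U = [{"a1": 0.10, "a2": 0.20},
     {"a1": 0.15, "a2": 0.90},
     {"a1": 0.80, "a2": 0.85}]

# With eps = 0.1 and r_catch = 0.5, the second thing is caught by the
# granule about the first one on attribute a1 alone.
g = granule_eps_catch(U[0], 0.1, 0.5, U, A)
print(len(g))
```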

Table 10.7 \(\varepsilon _{\mathit{opt}}\) = optimal value of \(\varepsilon\), acc = accuracy, cov = coverage. Best rcatch = 0. 1428, \(\varepsilon _{\mathit{opt}} = 0.35\): accuracy = 0.8681, coverage = 1.0

8.6 A Treatment of Missing Values

A particular but important problem in data analysis is the treatment of missing values. In many data sets, some values of some attributes are not recorded due to various factors, such as omissions, inability to measure them, or loss due to some events.

Analysis of systems with missing values requires a decision on how to treat missing values; Grzymala–Busse and Hu (2000) analyze nine such methods, among them, (1) most common attribute value, (2) concept restricted most common attribute value, (3) assigning all possible values to the missing location, (4) treating the unknown value as a new valid value, etc. Their results indicate that methods (3), (4) perform very well and in a sense stand out among all nine methods.

We adopt and consider two of the above methods, i.e., (3) and (4). As usual, the question of how to use granular structures in the analysis of incomplete systems should be answered first.

The idea is to embed the thing with a missing value into a granule: by averaging the attribute value over the granule in the way already explained, it is hoped that the average value will fit in a satisfactory way into the position of the missing value.

We will use the symbol ∗, commonly employed for denoting the missing value, and the two methods (3), (4) for treating it: either ∗ is a don’t care symbol, meaning that any value of the respective attribute can be substituted for ∗, hence ∗ = v for each value v of the attribute, or ∗ is a new value on its own, i.e., if ∗ = v then v can only be ∗.

Our procedure for treating missing values is based on the granular structure \((G(\mu,r),\mathcal{G},\mathcal{S},\{a{\ast}: a \in A\})\); the strategy \(\mathcal{S}\) is the majority voting, i.e., for each attribute a, the value a(g) is the most frequent of values in {a(u): u ∈ g}. The strategy \(\mathcal{G}\) consists in random selection of granules for a covering.

For a thing u with the value ∗ at an attribute a, and a granule g = g(v, r) ∈ G(μ, r), the question whether u is included in g is resolved according to the adopted strategy of treating ∗: in the case ∗ = don’t care, the value ∗ is regarded as identical with any value of a, hence |IND(u, v)| is automatically increased by 1, which enlarges the granule; in the case ∗ = ∗, the granule size is decreased. Assuming that ∗ is sparse in data, majority voting on g would produce values of a distinct from ∗ in most cases; nevertheless, the value ∗ may appear in new things g, and then, in the process of classification, such a value is repaired by means of the granule closest to g with respect to the rough inclusion μL, in accordance with the chosen method of treating ∗.

In plain words, things with missing values are, in a sense, absorbed by granules close to them, and missing values are replaced with the most frequent values among things collected in the granule; in this way, method (3) or (4) of Grzymala–Busse and Hu (2000) is combined with the idea of a most frequent value in a novel way.

We have thus four possible strategies:

  1. Strategy A: in building granules ∗ = don’t care; in repairing values of ∗, ∗ = don’t care;

  2. Strategy B: in building granules ∗ = don’t care; in repairing values of ∗, ∗ = ∗;

  3. Strategy C: in building granules ∗ = ∗; in repairing values of ∗, ∗ = don’t care;

  4. Strategy D: in building granules ∗ = ∗; in repairing values of ∗, ∗ = ∗.
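The effect of the two readings of ∗ on the agreement count that drives granule membership can be sketched as follows; the agreement function and the toy records are hypothetical.

```python
# A sketch of the two readings of the missing-value symbol * when
# computing the agreement |IND(u, v)| / |A| that drives granule
# membership: under '* = don't care' a missing value matches anything,
# under '* = *' it matches only another *.
STAR = "*"

def agreement(u, v, A, dont_care):
    def match(x, y):
        if dont_care and (x == STAR or y == STAR):
            return True
        return x == y
    return sum(match(u[a], v[a]) for a in A) / len(A)

A = ["a1", "a2", "a3"]
u = {"a1": 1, "a2": STAR, "a3": 0}
v = {"a1": 1, "a2": 0, "a3": 0}

print(agreement(u, v, A, dont_care=True))   # * matches 0: agreement 1.0
print(agreement(u, v, A, dont_care=False))  # * is a value of its own: 2/3
```

As the text observes, the don’t care reading enlarges granules, while the ∗ = ∗ reading shrinks them.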

We show how effective these strategies are, see Polkowski and Artiemjew (2007), by perturbing the data set Pima Indians Diabetes from the UC Irvine Repository (2012). First, in Table 10.8 we show results of the granular classifier on the non–perturbed (i.e., without missing values) Pima Indians Diabetes data set. We then perturb this data set by randomly replacing 10 % of attribute values with the missing value ∗. Results of the granular treatment under Strategies A, B, C, D, in terms of accuracy, are reported in Table 10.9. As the algorithm for rule induction, the exhaustive algorithm of the RSES system has been selected, and 10-fold cross validation (CV–10) has been applied.

Table 10.8 10-fold CV; Pima; exhaustive algorithm, r = radius, macc = mean accuracy, mcov = mean coverage
Table 10.9 Accuracies of strategies A, B, C, D. 10-fold CV; Pima Indians; exhaustive algorithm; r = radius, maccA = mean accuracy of A, maccB = mean accuracy of B, maccC = mean accuracy of C, maccD = mean accuracy of D

Strategy A reaches, for data with missing values, an accuracy within 94 % of the accuracy without missing values (0.9407 vs. 1.0) at the radius of 0.875. With Strategy B, accuracy is within 94 % from the radius of 0.875 on. Strategy C is much better: accuracy with missing values reaches 99 % of the accuracy in the no missing values case from the radius of 0.625 on. Strategy D gives results slightly better than C at the same radii.

We conclude that what is essential for classification results is the strategy of treating the missing value ∗ as ∗ = ∗ when building granules, common to Strategies C and D; the repairing strategy has almost no effect: C and D differ only slightly in this respect.

8.7 Granular Rough Mereological Classifiers Using Residuals

Rough inclusions used in Sects. 10.8.4–10.8.6 in order to build classifiers take into account, to a certain degree, the distribution of values of attributes among things, by means of the parameter \(\varepsilon\) and the catch radius rcatch. The idea that metrics used in classifier construction should depend locally on the training set is present, e.g., in classifiers based on the idea of the nearest neighbor, see, e.g., a survey in Polkowski (2009b). In order to construct a measure of similarity based on the distribution of attribute values among things, we resort to rough inclusions induced by residual implications. Such a rough inclusion can be transferred to the universe U of a decision system; to this end, first, for given things u, v, and \(\varepsilon \in [0,1]\), the factors

$$\displaystyle{ \mathit{dis}_{\varepsilon }(u,v) = \frac{\vert \{a \in A: \vert a(u) - a(v)\vert \geq \varepsilon \}\vert } {\vert A\vert } }$$
(10.109)

and

$$\displaystyle{ \mathit{ind}_{\varepsilon }(u,v) = \frac{\vert \{a \in A: \vert a(u) - a(v)\vert <\varepsilon \} \vert } {\vert A\vert } }$$
(10.110)

are introduced. The weak variant of rough inclusion \(\mu _{\rightarrow _{T}}\) is defined, see Polkowski (2007), as

$$\displaystyle{ \mu _{T}^{{\ast}}(u,v,r)\ \text{if and only if}\ \mathit{dis}_{\varepsilon }(u,v) \rightarrow _{ T}\ \mathit{ind}_{\varepsilon }(u,v) \geq r }$$
(10.111)

These similarity measures will be applied in building granules and then in data classification.
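A sketch of (10.109)–(10.111) follows; the residua of the t–norms M (minimum), P (product) and L (Łukasiewicz) used below are the standard ones, and the data encoding and parameter values are assumptions of the example.

```python
# A sketch of the residual-implication similarity (10.109)-(10.111).
# The residua of the t-norms M, P, L are standard:
# x ->_M y = 1 if x <= y else y; x ->_P y = 1 if x <= y else y / x;
# x ->_L y = min(1, 1 - x + y). Data encoding is illustrative.

def dis(u, v, A, eps):
    return sum(abs(u[a] - v[a]) >= eps for a in A) / len(A)

def ind(u, v, A, eps):
    return sum(abs(u[a] - v[a]) < eps for a in A) / len(A)

RESIDUA = {
    "M": lambda x, y: 1.0 if x <= y else y,
    "P": lambda x, y: 1.0 if x <= y else y / x,
    "L": lambda x, y: min(1.0, 1.0 - x + y),
}

def mu_star(u, v, r, A, eps, t_norm="L"):
    """Weak rough inclusion: dis ->_T ind >= r, as in (10.111)."""
    return RESIDUA[t_norm](dis(u, v, A, eps), ind(u, v, A, eps)) >= r

A = ["a1", "a2"]
u, v = {"a1": 0.1, "a2": 0.2}, {"a1": 0.1, "a2": 0.9}
# dis = 0.5 and ind = 0.5 at eps = 0.3, so dis ->_L ind = 1.0.
print(mu_star(u, v, 1.0, A, 0.3, "L"))
```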

Tests are done with the Australian credit data set; the results are validated by means of 5–fold cross validation (CV–5). For each of the t–norms M, P, L, three cases of granulation are considered:

  1. Granules of training things (GT);

  2. Granules of rules induced from the training set (GRT);

  3. Granules of granular things induced from the training set (GGT).

In this approach, training things are made into granules for a given \(\varepsilon\). Things in each granule g about a test thing u vote for the decision value at u as follows: for each decision class c, the value

$$\displaystyle{ p(c) = \frac{\sum _{\text{training thing v in g falling in c}}w(u,v)} {\text{size of c in training set}} }$$
(10.112)

is computed where the weight w(u, v) is computed for a given t–norm T as

$$\displaystyle{ w(u,v) = \mathit{dis}_{\varepsilon }(u,v) \rightarrow _{T}\ ind_{\varepsilon }(u,v) }$$
(10.113)

The class c* assigned to u is the one with the largest value of p.
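The voting rule (10.112)–(10.113) can be sketched as follows; the Łukasiewicz residuum is used as the weight, and the function names and toy training data are hypothetical.

```python
# A sketch of the class voting (10.112)-(10.113): training things in the
# granule about a test thing u vote for their class with the residuum
# dis ->_T ind as weight; the class with the largest normalized vote wins.

def dis(u, v, A, eps):
    return sum(abs(u[a] - v[a]) >= eps for a in A) / len(A)

def ind(u, v, A, eps):
    return sum(abs(u[a] - v[a]) < eps for a in A) / len(A)

def imp_L(x, y):  # Lukasiewicz residuum
    return min(1.0, 1.0 - x + y)

def classify(u, granule, class_sizes, A, eps):
    p = {}
    for c, size in class_sizes.items():
        votes = sum(imp_L(dis(u, v, A, eps), ind(u, v, A, eps))
                    for v in granule if v["d"] == c)
        p[c] = votes / size  # normalized by class size in the training set
    return max(p, key=p.get)

A = ["a1", "a2"]
train = [{"a1": 0.1, "a2": 0.2, "d": 0},
         {"a1": 0.2, "a2": 0.1, "d": 0},
         {"a1": 0.9, "a2": 0.8, "d": 1}]
sizes = {0: 2, 1: 1}
u = {"a1": 0.15, "a2": 0.15}
print(classify(u, train, sizes, A, 0.2))
```

The rule-based variant (10.114)–(10.115) differs only in weighting each rule's vote additionally by its support.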

Weighted voting of rules in a given granule g for the decision at a test thing u goes according to the formula d(u) = arg maxc p(c), where

$$\displaystyle{ p(c) = \frac{\sum _{\text{rule in g pointing to c}}w(u,r) \cdot support(r)} {\text{size of c in training set}} }$$
(10.114)

where weight w(u, r) is computed as

$$\displaystyle{ dis_{\varepsilon }(u,r) \rightarrow _{T}\ ind_{\varepsilon }(u,r) }$$
(10.115)

The optimal (best) results in terms of accuracy of classification are collected in Table 10.10.

Table 10.10 5-fold CV; Australian; residual metrics. met = method of granulation, T = t–norm, \(\varepsilon _{\mathit{opt}}\) = optimal \(\varepsilon\), macc = mean accuracy, mcov = mean coverage

These results show that rough mereological granulation provides classification accuracy better than, or at least on par with, the best results by other methods, at a radically smaller classifier size measured as the number of decision rules.

9 Mereology in Artificial Intelligence

Though the topics relegated to this section may be as well assigned to Knowledge Engineering, yet we relate them to Artificial Intelligence as tools which may be helpful in reasoning about complex systems and hard decision problems.

9.1 Cognitive Reasoning

We focus here on cognitive methods, known also as network related methods. Of those, neural networks are well known as a tool useful in pattern recognition, classification and machine learning. Based on the structure of the physiological neuron, discovered by Ramón y Cajal (1889), the artificial neuron was defined in McCulloch and Pitts (1943) as a structure composed of a finite set of inputs labeled x1, x2, …, xn, a body with a threshold Θ, and an output y; following the physiological archetype, this neuron computes by the rule

$$\displaystyle{ y = 1 \Leftrightarrow \sum _{i}x_{i} \geq \varTheta \ \text{else}\ y = 0 }$$
(10.116)

Later developments include the perceptron, defined in Rosenblatt (1958). A simplified perceptron adds to the McCulloch–Pitts neuron weights on inputs, and an additional input with the constant value 1 and a weight b, called the bias. Thus, the computation rule takes the form

$$\displaystyle{ y = 1 \Leftrightarrow \sum _{i}w_{i} \cdot x_{i} + b \geq \varTheta \ \text{else}\ y = 0 }$$
(10.117)

which allows for greater flexibility. Either type of neuron is able to classify binary concepts by means of the separating hyperplane H, which divides the space En of possible input vectors into two half–spaces, and which is defined, e.g., in the case of perceptrons, as

$$\displaystyle{ \sum _{i}w_{i} \cdot x_{i} + b =\varTheta }$$
(10.118)

The idea of networks of neurons was advocated by Alan Turing (1948), who proposed a learning scheme for networks of neurons connected through modifiers; it was revived in Grossberg (1973) with networks of perceptrons. Such networks of connected perceptrons produce intersections of the respective half–spaces, which cut the space of input vectors into closed convex regions, in the ideal case assigning input vectors representing distinct categories of things to distinct regions. Finally, back–propagation learning, see Werbos (1994), replaced the thresholds Θ with differentiable sigmoid transfer functions.
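The perceptron rule (10.117) can be sketched in a few lines; the AND-gate weights realizing a concrete separating hyperplane are a standard illustrative choice, not taken from the text.

```python
# A sketch of the perceptron rule (10.117): output 1 when the weighted
# sum of inputs plus the bias reaches the threshold, else 0.

def perceptron(x, w, b, theta=0.0):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= theta else 0

# Illustrative weights realizing logical AND on binary inputs: the
# separating hyperplane x1 + x2 - 1.5 = 0 puts only (1, 1) on the
# positive side.
w, b = [1.0, 1.0], -1.5
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, perceptron(x, w, b))
```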

Our model of the perceptron, see Polkowski (2005b), differs from the standard model, as its neurons are perceived as intelligent agents working with knowledge represented in information systems. An essential feature of network perceptrons from the point of view of learning is the differentiability of transfer functions; hence, we introduced a special type of rough inclusions, called gaussian in Polkowski (2005b) because of their form, by letting

$$\displaystyle{ \mu _{G}(x,y,r)\ \text{iff}\ e^{-\vert \sum _{a\in \mathit{DIS}(x,y)}w_{a}\vert ^{2} } \geq r }$$
(10.119)

where wa ∈ (0, +∞) is a weight associated with the attribute a, for each attribute a ∈ A; cf. (10.23) for DIS. Computation by this perceptron is directed by the gradient of the function

$$\displaystyle{ f(x,y) = e^{-\vert \sum _{a\in \mathit{DIS}(x,y)}w_{a}\vert ^{2} } }$$
(10.120)

whose wa component is

$$\displaystyle{ \frac{\partial f} {\partial w_{a}} = f \cdot (-2 \cdot \sum w_{a}) }$$
(10.121)

It follows from the last equation that the gradient search goes in the direction of minimizing the value of \(\sum _{a}w_{a}\).
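The gradient formula (10.121) can be checked numerically; this is a sketch with arbitrary weight values, the sum being taken over the weights attached to attributes in DIS(x, y).

```python
# A sketch checking the gradient formula (10.121) for the gaussian
# rough inclusion: f = exp(-(sum of weights over DIS)^2) and
# df/dw_a = f * (-2 * sum of weights). Weight values are arbitrary.
import math

def f(weights):
    s = sum(weights)
    return math.exp(-s * s)

def grad_component(weights):
    return f(weights) * (-2.0 * sum(weights))

w = [0.3, 0.2]          # weights w_a for attributes a in DIS(x, y)
h = 1e-6                # step for a numerical central-difference check
numeric = (f([w[0] + h, w[1]]) - f([w[0] - h, w[1]])) / (2 * h)
print(abs(numeric - grad_component(w)) < 1e-6)
```

The negative sign of the gradient component for positive weights confirms that gradient search decreases the weight sum, as stated above.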

We denote the perceptron by the agent symbol ag; it is endowed with an information system Iag = (Uag, Aag). The input to ag is in the form of a thing x.

The rough mereological perceptron is endowed with a set of target concepts \(t = g_{\mu _{G}}(T \in U_{\mathit{ag}}/\mathit{IND}(A_{\mathit{ag}}),r_{t})\). The result of a computation with a dedicated target t for a training thing x is a granule \(g = g_{\mu _{G}}(x,r(\mathit{res}))\) such that ingr(g, t).

During computation, weights are incremented by the learning rule

$$\displaystyle{ w_{a} \leftarrow w_{a} +\varDelta \cdot \frac{\partial E} {\partial w_{a}} }$$
(10.122)

where Δ is the learning rate.

At the current stage of the computation, with \(\gamma = \vert r_{\mathit{current}} - r\vert \) and a natural number k, the value of Δcurrent which should be taken at this step in order to achieve the target in at most k steps is, see Polkowski (2005b),

$$\displaystyle{ \varDelta _{\mathit{current}} \simeq \frac{\gamma } {2 \cdot k \cdot f^{2} \cdot (\sum _{a}w_{a})^{2}} }$$
(10.123)

9.2 MAS Reasoning: Many–Agent Systems

Reasoning in artificial intelligence is often concerned with ‘complex cases’, e.g., robotic soccer, in which performing tasks successfully requires the participation of a number of ‘agents’ bound to cooperate, and in which a task is performed in a number of steps, see, e.g., Stone (2000); other areas where such an approach seems necessary concern assembling and design, see Amarel (1991), fusion of knowledge, e.g., in robotics, fusion of information from sensors, see, e.g., Canny (1988), Choset et al. (2005), or Stone (2000), as well as machine learning and fusion of classifiers, see, e.g., Dietterich (2000).

The rough mereological approach to these problems was initiated in Polkowski and Skowron (1998, 1999a,b, 2001); here, we propose a logically oriented formulation.

Rough inclusions and granular intensional logics based on them can be applied in describing workings of a collection of intelligent agents which are called here granular agents.

A granular agent a will be represented as a tuple

$$\displaystyle{(U_{a},\mu _{a},L_{a},\mathit{prop}_{a},\mathit{synt}_{a},\mathit{aggr}_{a})}$$

where

  1. Ua is a collection of objects available to the agent a.

  2. μa is a rough inclusion on objects in Ua.

  3. La is a set of unary predicates in first–order open calculus, interpretable in Ua.

  4. propa is the propagation function that describes how uncertainty expressed by rough inclusions at agents connected to a propagates to a.

  5. synta is the logic propagation functor which expresses how formulas of logics at agents connected to the agent a are made into a formula at a.

  6. aggra is the synthesis function which describes how objects at agents connected to a are made into an object at a.

We assume for simplicity that agents are arranged into a rooted tree; for each agent a distinct from any leaf agent, we denote by Ba the children of a, understood as agents connected to a and directly sending to a objects, logical formulas describing them, and uncertainty coefficients like values of rough inclusions.

For b ∈ Ba, the symbol xb will denote an object in Ub; similarly, ϕb will denote a formula of Lb, and μb will be a rough inclusion at b with values rb. The same convention will be obeyed by objects at a.

Postulates governing the working of the scheme are

MA1:

If ingrb(xb′, xb) for each b ∈ Ba, then ingra(aggr({xb′}), aggr({xb})).

This postulate assures that ingredient relations are in agreement with the aggregation operator forming complex objects: ingredients of composed objects form an ingredient of the complex object. We can say that aggr ∘ ingr = ingr ∘ aggr, i.e., the resulting diagram commutes.

MA2:

If \(x_{b}\models \phi _{b}\), then \(\mathit{aggr}(\{x_{b}\})\models \mathit{synt}(\{\phi _{b}\})\).

This postulate is about agreement between aggregation of objects and their logical descriptions: descriptions of composed objects merge into a description of the resulting complex object.

MA3:

If μb(xb, yb, rb) for b ∈ Ba, then μa(aggr({xb}), aggr({yb}), prop{rb}).

This postulate introduces the propagation function, which expresses how uncertainty at connected agents is propagated to the agent a. One may observe the uniformity of prop, which in the setting of MA3 depends only on the values rb; this is undoubtedly a simplifying assumption, but we want to avoid unnecessary complications that would obscure the general view, though they can of course be multiplied at will.

MA4:

For b ∈ Ba, \(\mathit{ingr}_{b}(x_{b},g_{\mu _{b}}(u_{b},r_{b}))\) implies

$$\displaystyle{\mathit{ingr}_{a}(\mathit{aggr}(\{x_{b}\}),g_{\mu _{a}}(\mathit{aggr}(\{u_{b}\}),prop(\{r_{b}\}))).}$$

Admitting MA4, we may also postulate that, in case agents have at their disposal variants of rough mereological granular logics, intensions propagate according to the prop functor:

MA5:

If \(I_{\nu _{b}}^{\vee }(g_{b})(\phi _{b}) \geq r_{b}\) for each b ∈ Ba, then

$$\displaystyle{I_{\nu _{a}}^{\vee }(\mathit{aggr}(\{g_{ b}\}))(synt(\{\phi _{b}\})) \geq \mathit{prop}(\{r_{b}\}).}$$

A simple exemplary case of knowledge fusion was examined in Polkowski (2008). We consider an agent a ∈ Ag with two incoming connections from agents b, c, i.e., Ba = { b, c}. Each agent is applying the rough inclusion μ = μLI, see (10.26), to an information system (Ua, Aa), (Ub, Ab), (Uc, Ac). Each agent is also applying the rough inclusion on sets of the form (10.27) in evaluations related to extensions of formulae intensions.

We consider a simple fusion scheme in which information systems at b, c are combined thing–wise to make the information system at a; thus, aggra(x, y) = (x, y). Such a case may happen, e.g., when an object is described by some features with the help of a camera image, and at the same time it is perceived and recognized with range sensors, like infrared or laser sensors, and some localization means like GPS.

Then uncertainty propagation and granule propagation are described by the Łukasiewicz t–norm L, and extensions of logical intensions propagate according to the product t–norm P.
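The propagation in this fusion example can be sketched numerically; the Łukasiewicz t–norm L(x, y) = max(0, x + y − 1) and the product t–norm P(x, y) = x · y are standard, while the sample coefficient values are assumptions of the example.

```python
# A sketch of the fusion example: with B_a = {b, c}, uncertainty
# coefficients propagate to the parent agent a by the Lukasiewicz
# t-norm L, and logical extensions by the product t-norm P.

def t_L(x, y):   # Lukasiewicz t-norm
    return max(0.0, x + y - 1.0)

def t_P(x, y):   # product t-norm
    return x * y

r_b, r_c = 0.9, 0.8  # illustrative uncertainty coefficients at b and c
print(t_L(r_b, r_c))  # uncertainty coefficient propagated to a
print(t_P(r_b, r_c))  # propagated bound on logical extensions
```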

9.3 Granular Logics: Reasoning in Information Systems

The idea of a granular rough mereological logic, see Polkowski (2004b) and Polkowski and Semeniuk–Polkowska (2005), consists in measuring the meaning of a unary predicate in the model which is a universe of an information system against a granule defined by means of a rough inclusion. The result can be regarded as the degree of truth (the logical value) of the predicate with respect to the given granule. The obtained logics are intensional as they can be regarded as mappings from the set of granules (possible worlds) to the set of logical values in the interval [0, 1], the value at a given granule regarded as the extension at that granule of the intension. A discussion of intensional logics can be found, e.g., in Gallin (1975), Van Benthem (1988), Hughes and Creswell (1996) and Fitting (2004).

For an information/decision system (U, A, d), with a rough inclusion ν, e.g., (10.27), on subsets of U and for a collection of unary predicates Pr, interpreted in the universe U (meaning that for each predicate ϕ ∈ Pr the meaning [[ϕ]] is a subset of U), we define the intensional logic GRMν by assigning to each predicate ϕ in Pr its intension Iν(ϕ) defined by its extension \(I_{\nu }^{\vee }(g)\) at each particular granule g, as

$$\displaystyle{ I_{\nu }^{\vee }(g)(\phi ) \geq r\ \Leftrightarrow \nu (g,[[\phi ]],r) }$$
(10.124)

With respect to the rough inclusion (10.27) the formula (10.124) becomes

$$\displaystyle{ I_{\nu _{L}}^{\vee }(g)(\phi ) \geq r\ \Leftrightarrow \frac{\vert g \cap [[\phi ]]\vert } {\vert g\vert } \geq r }$$
(10.125)

A formula ϕ interpreted in the universe U of a system (U, A, d) is true at a granule g with respect to a rough inclusion ν if and only if \(I_{\nu }^{\vee }(g)(\phi ) = 1\), and ϕ is true if and only if it is true at each granule g. A rough inclusion ν is regular when ν(X, Y, 1) holds if and only if \(X \subseteq Y\). Hence, for a regular ν, a formula ϕ is true if and only if \(g \subseteq [[\phi ]]\) for each granule g.

A particularly important case of a formula is that of decision rules; clearly, for a decision rule r: p ⇒ q in the descriptor logic, the rule r is true at a granule g with respect to a regular rough inclusion ν if and only if \(g \cap [[p]] \subseteq [[q]]\).
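The extension (10.125) and the truth of a decision rule at a granule can be sketched as follows; granules and meanings are represented here as plain sets of things, an assumption of the example.

```python
# A sketch of the extension (10.125) and of rule truth at a granule:
# a decision rule p => q is true at g (for a regular nu) exactly when
# the intersection of g with [[p]] is contained in [[q]].

def extension(g, meaning):
    """I(g)(phi) = |g ∩ [[phi]]| / |g|, per (10.125)."""
    return len(g & meaning) / len(g)

def rule_true_at(g, p_meaning, q_meaning):
    return (g & p_meaning) <= q_meaning

g = {1, 2, 3, 4}
p = {2, 3, 5}        # [[p]]
q = {2, 3, 6}        # [[q]]
print(extension(g, p))        # 2 of the 4 things in g satisfy p
print(rule_true_at(g, p, q))  # g ∩ [[p]] = {2, 3} is contained in [[q]]
```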

Analysis of decision rules in the system (U, A, d) can be given in a more general setting of dependencies. For two sets \(C,D \subseteq A \cup \{ d\}\) of attributes, one says that D depends functionally on C when \(\mathit{IND}(C) \subseteq \mathit{IND}(D)\), symbolically denoted CD. Functional dependence can be represented locally by means of functional dependency rules of the form

$$\displaystyle{ \phi _{C}(\{v_{a}: a \in C\}) \Rightarrow \phi _{D}(\{w_{a}: a \in D\}) }$$
(10.126)

where ϕC({va: a ∈ C}) is the formula \(\bigwedge _{a\in C}(a = v_{a})\), and \([[\phi _{C}]] \subseteq [[\phi _{D}]]\).

We introduce a regular rough inclusion on sets ν3 defined as

$$\displaystyle{ \nu _{3}(X,Y,1) \Leftrightarrow X \subseteq Y \ \text{else}\ \nu _{3}(X,Y, \frac{1} {2}) \Leftrightarrow X \cap Y \neq \varnothing \ \text{else}\ \nu _{3}(X,Y,0) }$$
(10.127)

Then one proves that \(\alpha:\phi _{C} \Rightarrow \phi _{D}\) is a functional dependency rule if and only if α is true in the logic induced by ν3. A specialization of this statement holds for decision rules. Further applications, to modalities in decision systems and to the Perception Calculus in the sense of Zadeh (2004), can be found in Polkowski (2011).