1 Introduction

Distributional models of language describe the meaning of a word using co-occurrence statistics derived from corpus data. A central question with these models is how to combine meanings of individual words, in order to understand phrases and sentences. Categorical compositional models of natural language [15] address this problem, providing a principled approach to combining the meanings of words to form the meanings of sentences, by exploiting their grammatical structure. They also outperform conventional techniques for some standard NLP tasks [23, 29].

Distributional models of language can be thought of as “process theories” [16]. A process theory consists of a graphical language for reasoning about composite systems of abstract processes, and a categorical semantics modelling the application domain. A particularly important class of categorical models are the compact closed categories, which come equipped with an elegant graphical calculus. Process theoretic models built upon compact closed categories have been successfully exploited in many application areas, including quantum computation [1], signal flow graphs [11], control theory [2], Markov processes [4], electrical circuits [3] and even linear algebra [43].

Recently [9], the categorical compositional approach to meaning has been applied to the conceptual space models of human cognition introduced in [21, 22]. When addressing a new application domain, it is necessary to identify a compact closed category with mathematical structure compatible with the application phenomena of interest.

Amongst the compact closed categories the hypergraph categories [20] are a particularly well behaved class of practical interest. In [33] we presented a flexible parameterized mathematical framework for constructing hypergraph categories. We view this framework as a practical tool for building new models in a principled manner, by varying the parameter choices according to the needs of the application domain. These models are based upon generalizing the well understood notion of a binary relation, providing a concrete and intuitive setting for model development.

In the present work we demonstrate, via extensive examples, that categories of generalized relations present an attractive setting for constructing new models of language and cognition. We emphasize the intuitive interpretation of the models under construction, and their connections to concrete ideas in computation, NLP and further afield. These examples are structured as follows:

  • In Sect. 3 we introduce relations with generalized truth values, and exploit them to model features such as distances, forces, connectivity and fuzziness. Relations with generalized truth values are well known in the mathematical community, but seem to have received little attention from the perspective of compositional semantics, with the recent exception of [19].

  • In Sect. 4 we generalize relations in another direction, considering relations that respect algebraic structure. These relations can capture features such as convexity, which is important in conceptual spaces models [21, 22]. In this case, we recover a model first used in [9], originally constructed in an ad-hoc manner using techniques from monad theory and the theory of regular categories. Importantly, we then show that we can combine generalized truth values with relations respecting algebraic structure, providing conceptual space models with access to distance measures.

  • In Sect. 5 we view spans as generalized “proof aware” relations in which the apex of the span contains witnesses to relatedness between the domain and codomain. Spans can be extended to support generalized truth values, and to respect algebraic structure. Exploiting a combination of these features, we construct a new model of semantic ambiguity in conceptual space models of natural language, in which different proof witnesses allow us to vary how strongly different words are related, depending on how they are interpreted.

  • The previous examples were essentially built upon the category of sets. Our techniques can be applied with different choices of ambient topos. In Sect. 6, as a practical example of this feature, we use presheaf toposes to build models in which meanings can vary with context, such as the progress of time or states of the world.

All of our models are preorder enriched, providing a natural candidate for modelling semantic entailment or hyponymy [5, 6]. Preorder enrichment also means we can consider internal monads within our various categories of relations. We emphasize the importance of these internal monads throughout our discussions. They provide access to important structured objects such as preorders, generalized metric spaces and ultrametric spaces, and similar well behaved relationships when we combine various modelling features.

2 Compositional Models of Meaning

The grammatical structure of natural language can be modelled using Lambek’s pregroup grammars [31].

Definition 1

A pregroup is a tuple \((X,\cdot , 1, (-)^l,(-)^r,\le )\) where \((X,\cdot , 1, \le )\) is a partially ordered monoid, or pomonoid, and \((-)^r, (-)^l\) are unary functions of type \(X\rightarrow X\) such that for all \(x\in X\) the following conditions hold,

$$\begin{aligned} 1\le x\cdot x^l \qquad x^l\cdot x\le 1 \qquad 1 \le x^r\cdot x \qquad x\cdot x^r \le 1 \end{aligned}$$

We say that x reduces to \(x'\) if \(x \le x'\).

A grammar is typically described using the free pregroup over some set of basic types. For example, we may consider the free pregroup of the set \(\{n,s\}\) where n and s are basic types for nouns and sentences respectively. More complex terms are then built up using the algebraic operations, for example the type of a transitive verb is \(n^rsn^l\). We can calculate the type of a phrase by composing the types of the individual terms using the monoid multiplication. For example, the phrase “mice eat cheese” has type \(n(n^rsn^l)n\). A composite term is then a well typed sentence if its type reduces to the sentence type. For example:

$$\begin{aligned} n(n^rsn^l)n = (nn^r)s(n^ln) \le s (n^ln) \le s \end{aligned}$$

and so “mice eat cheese” is a well typed sentence. In this way, pregroups give us access to the compositional features of language.
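The reduction above can be sketched in code. The following is a minimal illustration, assuming an encoding (not taken from the paper) in which a simple type is a pair of a basic type and an integer adjoint index: 0 for the basic type, -1 for its left adjoint and +1 for its right adjoint. The greedy cancellation used here is enough for this example, but it is not a complete pregroup parser.

```python
# Simple types are (base, z): z = 0 is the basic type, z = -1 its left
# adjoint x^l, z = +1 its right adjoint x^r (an encoding assumed here).
N, S = ("n", 0), ("s", 0)
N_R, N_L = ("n", 1), ("n", -1)

def contracts(left, right):
    """True if left . right <= 1, covering x^l . x <= 1 and x . x^r <= 1."""
    return left[0] == right[0] and right[1] == left[1] + 1

def reduce_type(simple_types):
    """Greedy left-to-right cancellation of adjacent contracting pairs."""
    stack = []
    for t in simple_types:
        if stack and contracts(stack[-1], t):
            stack.pop()          # the adjacent pair reduces to the unit 1
        else:
            stack.append(t)
    return stack

mice, eat, cheese = [N], [N_R, S, N_L], [N]
print(reduce_type(mice + eat + cheese) == [S])   # True
```

A phrase is accepted as a sentence by this sketch exactly when the returned list is `[S]`.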

On the other hand, distributional models [40] of the meaning of words in natural language are built using vector space models automatically derived from co-occurrence statistics in a large corpus of text. The key observation of the categorical compositional approach to natural language is that both pregroups and the category of finite dimensional real vector spaces carry the same categorical structure, that of an autonomous category.

Definition 2

A monoidal category \(\mathcal {V}\) has left/right duals if every object has an internal left/right adjoint when \(\mathcal {V}\) is regarded as a one object bicategory. An autonomous category is a monoidal category in which every object has both left and right duals. A compact closed category is a symmetric monoidal category in which every object has right duals.

A straightforward application oriented introduction to monoidal categories and compact closed categories can be found in [17].

This observation can be exploited to derive the meanings of sentences from the meanings of words. We fix a strong monoidal functor from a pregroup describing grammatical structure to the category of finite dimensional vector spaces. This functor maps type reductions to linear maps, allowing us to automatically derive the meaning of a sentence from its constituent parts. Clearly, this approach can be seen as an instance of functorial semantics. By varying the domain and preserved structure we can consider different categorial grammars [14]. By varying the codomain we can consider different models, as has been important in recent work broadening the scope to mathematical models of cognition  [9, 10]. When varying the category of meanings, it is desirable to remain within the domain of compact closed categories, in order to exploit connections with previous linguistic developments, and to retain access to their powerful graphical calculus.
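As an illustration of how a type reduction becomes a linear map, the sketch below computes a sentence vector for “mice eat cheese” by contracting the subject and object indices of a verb tensor. The two-dimensional spaces and all numerical values are hypothetical, chosen only to make the arithmetic easy to follow.

```python
# Hypothetical 2-dimensional noun space and 2-dimensional sentence space.
mice = [1.0, 0.0]
cheese = [0.0, 1.0]

# eat[i][j][k]: subject index i, sentence index j, object index k.
# The entries are made up for illustration.
eat = [
    [[0.0, 0.9], [0.0, 0.1]],
    [[0.2, 0.0], [0.8, 0.0]],
]

def sentence_meaning(subj, verb, obj):
    """Contract the subject and object indices, leaving a sentence vector."""
    dim_s = len(verb[0])
    return [
        sum(subj[i] * verb[i][j][k] * obj[k]
            for i in range(len(subj)) for k in range(len(obj)))
        for j in range(dim_s)
    ]

print(sentence_meaning(mice, eat, cheese))   # [0.9, 0.1]
```

The two contractions correspond to the two cancellations in the reduction of \(n(n^rsn^l)n\) to s; only the sentence index survives.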

The question then becomes: How can we find or construct compact closed categories with desirable mathematical properties? This is the question we explore in this paper. In fact, our constructions produce a subclass of compact closed categories, referred to as hypergraph categories [20, 30], and so this is where we shall focus our attention.

Definition 3

A hypergraph category is a symmetric monoidal category in which every object is equipped with a choice of special commutative Frobenius algebra, coherently with the monoidal structure.

Details of the notion of a Frobenius algebra, and linguistic applications including modelling relative pronouns can be found in [38, 39]. If I is the monoidal unit, we will occasionally refer to morphisms of types \(I \rightarrow X\) and \(X \rightarrow I\) as the states and effects of X. Morphisms of type \(I \rightarrow I\) are referred to as numbers.

Example 1

The category \(\mathbf{Rel}\) of sets and binary relations between them can be given the structure of a hypergraph category. The monoidal structure is given by forming Cartesian products of sets. A state of a set X is a subset of X and the numbers are the Boolean truth values. The Frobenius algebra is given by the copying relation \(x \sim (x,x) : X \rightarrow X \times X\), the deletion relation \(x \sim * : X \rightarrow I\), and their converses.

All the compact closed categories discussed in this paper will be hypergraph categories, generalizing Example 1 along different axes of variation.

3 Generalized Truth Values

A binary relation \(R : A \rightarrow B\) between sets can be identified with a characteristic function of type \(A \times B \rightarrow \{ \top , \bot \}\) mapping the related pairs of elements to \(\top \). It is fruitful to consider generalizing the codomain of such characteristic functions to a set Q, thought of as a collection of truth values. We can then consider functions of the form \(A \times B \rightarrow Q\) as generalized relations, with truth values in Q. In order for the corresponding binary relations to have satisfactory notions of identities and composition, the set Q must carry the structure of a quantale.

Definition 4

(Quantale). A quantale is a join complete partial order Q with a monoid structure \((\otimes ,k)\) satisfying the following distributivity axioms, for all \(a,b \in Q\) and \(A,B \subseteq Q\):

$$\begin{aligned} a \otimes \left[ \bigvee B \right] = \bigvee \{ a \otimes b \mid b \in B \}\qquad \left[ \bigvee A \right] \otimes b = \bigvee \{ a \otimes b \mid a \in A \} \end{aligned}$$

A quantale is said to be commutative if its monoid structure is commutative.

All the quantales encountered in this paper will be commutative. We introduce some examples of importance in later developments.

Example 2

The Boolean quantale is given by the two element complete Boolean algebra \(\mathbf{B} = \{ \top , \bot \}\), with the join and multiplication given by the join and meet in the Boolean algebra.

Example 3

The Lawvere quantale  \(\mathbf{L}\) is given by the chain \([0,\infty ]\) of extended positive reals with the reverse ordering, hence minima in \([0,\infty ]\) provide the joins of the quantale, and the monoid structure is given by addition.

Example 4

The quantale \(\mathbf{F}\) has again the extended positive reals with reverse order as its partial order, but now with \(\max \) as the monoid multiplication.

Example 5

The interval quantale  \(\mathbf{I}\) is given by the ordered interval [0, 1] with minima as the monoid structure.

For a quantale Q, the Q-relations form a category \(\mathbf{Rel}(Q)\) with composition and identities given by:

$$\begin{aligned} (S \circ R)(a,c) = \bigvee _b R(a,b) \otimes S(b,c) \qquad 1_A (a,b) = \bigvee \{k | a = b\} \end{aligned}$$

If Q is a commutative quantale, \(\mathbf{Rel}(Q)\)  carries a symmetric monoidal structure, with the tensor product of objects given by the cartesian product of sets, and the action on relations given for \(R : A \rightarrow C\) and \(S : B \rightarrow D\) by

$$\begin{aligned} (R \otimes S)(a,b,c,d) = R(a,c) \otimes S(b,d) \end{aligned}$$

The singleton set is the monoidal unit. A key observation from the perspective of this paper is:

Theorem 1

\(\mathbf{Rel}(Q)\) is compact closed with respect to this monoidal structure.
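For finite sets, composition and tensor of Q-relations can be sketched directly from the formulas above. The representation below is an assumption made for illustration: a relation is a dictionary from pairs to truth values, missing pairs take the bottom element, and a quantale is a record holding its join, its multiplication and its unit.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Quantale:
    join: Callable      # join of a finite list of truth values
    tensor: Callable    # monoid multiplication
    unit: object        # monoid unit k

INF = float("inf")
BOOL = Quantale(lambda xs: any(xs), lambda a, b: a and b, True)
LAWVERE = Quantale(lambda xs: min(xs, default=INF), lambda a, b: a + b, 0.0)
FORCE = Quantale(lambda xs: min(xs, default=INF), max, 0.0)
INTERVAL = Quantale(lambda xs: max(xs, default=0.0), min, 1.0)

def compose(q, r, s, middle):
    """(S o R)(a, c) = join over b of R(a, b) (x) S(b, c)."""
    bottom = q.join([])
    sources = {a for a, _ in r}
    targets = {c for _, c in s}
    return {
        (a, c): q.join([q.tensor(r.get((a, b), bottom), s.get((b, c), bottom))
                        for b in middle])
        for a in sources for c in targets
    }

def tensor_rel(q, r, s):
    """(R (x) S)((a, b), (c, d)) = R(a, c) (x) S(b, d)."""
    return {((a, b), (c, d)): q.tensor(x, y)
            for (a, c), x in r.items() for (b, d), y in s.items()}

# Hypothetical connection strengths in the interval quantale.
r_conn = {("a", "b1"): 0.9, ("a", "b2"): 0.4}
s_conn = {("b1", "c"): 0.5, ("b2", "c"): 0.8}
print(compose(INTERVAL, r_conn, s_conn, ["b1", "b2"]))   # {('a', 'c'): 0.5}
```

Swapping the quantale argument changes the interpretation of the same code, which is the sense in which the choice of Q acts as a parameter of the model.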

Now that we have described how Q-relations compose, we can consider computational interpretations for our example choices of quantale.

Example 6

The relations over the Lawvere quantale \(\mathbf{L}\) can be thought of as describing costs. The value \(R(a,b)\) describes the cost of converting a into b. A cost of 0 means they are maximally related and can be freely inter-converted. A cost of \(\infty \) indicates completely unrelated values, which cannot be converted into each other for any finite cost. The value \((S \circ R)(a,c)\) describes the cheapest way of converting a into some b and then converting that b into c, adding the associated costs. If we perform two conversions in parallel, \((R \otimes R')(a,a',b,b')\) describes the sum of the two individual conversion costs.

In this setting, we can think of a state \(I \rightarrow A\) as giving a table of costs for acquiring the resources in A, and similarly an effect \(A \rightarrow I\) is a table of costs for disposing of resources in A.

Example 7

The quantale \(\mathbf{F}\) has the same underlying set as the Lawvere quantale, but its different algebraic structure leads to a very different interpretation. We think of \(R(a,b)\) as the peak force required to move a to b. The value given by the composite \((S \circ R)(a,c)\) then describes the optimum peak force required to move a to c. For example, if we can move a to b with one unit of force, and then move b to c with two units of force, then the peak force required is two units. An alternative procedure moving a to \(b'\) with zero units of force, and then moving \(b'\) to c with 2.5 units of force, has a peak force of 2.5 units, so we would prefer the first procedure to minimize our peak effort. Similarly, the truth value \((R \otimes R')(a,a',b,b')\) gives the peak force required to complete both conversions, assuming these costs are independently incurred.

As with Example 6, we can think of states and effects as tables of acquisition and elimination forces.
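The numbers from Example 7 can be run through both quantales to see how the interpretation changes. This is a small sketch rather than a general implementation; the function name `compose` and the dictionary representation are assumptions made for illustration.

```python
INF = float("inf")

# The two routes from a to c described in Example 7.
R = {("a", "b"): 1.0, ("a", "b'"): 0.0}
S = {("b", "c"): 2.0, ("b'", "c"): 2.5}

def compose(r, s, middle, tensor):
    """Join is min (reverse order on [0, inf]); tensor is + for L or max for F."""
    sources = {a for a, _ in r}
    targets = {c for _, c in s}
    return {
        (a, c): min((tensor(r.get((a, b), INF), s.get((b, c), INF))
                     for b in middle), default=INF)
        for a in sources for c in targets
    }

middle = ["b", "b'"]
total_cost = compose(R, S, middle, lambda x, y: x + y)   # Lawvere quantale L
peak_force = compose(R, S, middle, max)                  # quantale F
print(total_cost[("a", "c")])   # 2.5, achieved via b'
print(peak_force[("a", "c")])   # 2.0, achieved via b
```

Read as total costs, the route through \(b'\) is preferred; read as peak forces, the route through b is preferred, even though the underlying data are identical.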

Example 8

We can interpret ordinary relations over the Boolean quantale as modelling connectivity. \(R(a,b)\) tells us that a is connected to b, composition tells us that we can chain connections together, and the tensor product tells us that we can connect pairs of elements together using a pair of connections between their components. Generalizing to the interval quantale, we now think of \(R(a,b)\) as a “connection strength” between a and b. The composite \((S \circ R)(a,c)\) gives the best connection quality that we can achieve in two steps via B. Similarly, the parallel composite \((R \otimes R')(a,a',b,b')\) gives a conservative judgment of the connection quality we can achieve simultaneously between both a and b and \(a'\) and \(b'\) as the lower of the two individual connection strengths. States describe the “transmission strength” with which signals enter the system from the environment, and effects describe the “reception quality” when consuming output signals.

Alternatively, we could view relations over \(\mathbf{I}\) as fuzzy relations, with states and effects sets with fuzzy membership, and fuzzy predicates. Graded membership is widely used in cognitive science, for example in [8, 18, 24, 25, 37]. Concepts such as ‘tall’ have no crisp boundary and are better modelled using grades of membership. Although human concept use does not obey fuzzy logic [35], fuzzy relations may prove useful.

\(\mathbf{Rel}(Q)\) is partial order-enriched if we order relations pointwise with respect to the underlying quantale order. It therefore makes sense to consider internal monads in \(\mathbf{Rel}(Q)\) as interesting “structured objects”. An internal monad on an object in a partially ordered category is an endomorphism R satisfying:

$$\begin{aligned} (R \circ R) \subseteq R, \qquad 1_A \subseteq R \end{aligned} \tag{1}$$

Example 9

If we specialize condition (1) to \(\mathbf{Rel}(\mathbf{L})\), it is equivalent to:

$$\begin{aligned} R(a,b) + R(b,c) \ge R(a,c), \qquad 0 = R(a,a) \end{aligned}$$

We therefore consider these internal monads as describing generalized metric spaces. This observation is important in the field of monoidal topology [26].

As before, we can also interpret our internal monad as giving a well behaved collection of conversion costs between resources. Converting a resource to itself is free, and converting a resource via an intermediate state is at least as expensive as taking the direct route. Similarly, if we consider \(\mathbf{Rel}(\mathbf{F}) \) the conditions of (1) become:

$$\begin{aligned} \max (R(a,b),R(b,c)) \ge R(a,c), \qquad 0 = R(a,a) \end{aligned}$$

and we can therefore see such internal monads as generalized ultrametric spaces. Again, the interpretation in terms of maximum force requirements extends to a sensible interpretation of these axioms.
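Both sets of axioms can be checked mechanically on a finite set of points. The sketch below uses a hypothetical table of asymmetric travel costs; the point names and numbers are invented for illustration. The same table is tested against the metric axioms (multiplication +) and the ultrametric axioms (multiplication max).

```python
def is_internal_monad(d, points, tensor):
    """Check (R o R) <= R and 1 <= R in the quantale order on [0, inf].

    The quantale order reverses the usual order on the reals, so the
    conditions read tensor(d(a,b), d(b,c)) >= d(a,c) and d(a,a) == 0.
    """
    unit_ok = all(d[(a, a)] == 0 for a in points)
    mult_ok = all(tensor(d[(a, b)], d[(b, c)]) >= d[(a, c)]
                  for a in points for b in points for c in points)
    return unit_ok and mult_ok

# Hypothetical asymmetric travel costs (going uphill costs more).
points = ["valley", "ridge", "peak"]
d = {
    ("valley", "valley"): 0, ("valley", "ridge"): 3, ("valley", "peak"): 5,
    ("ridge", "valley"): 1, ("ridge", "ridge"): 0, ("ridge", "peak"): 2,
    ("peak", "valley"): 2, ("peak", "ridge"): 1, ("peak", "peak"): 0,
}
print(is_internal_monad(d, points, lambda x, y: x + y))   # True
print(is_internal_monad(d, points, max))                  # False
```

The table is a generalized metric but not a generalized ultrametric: going from the valley to the peak via the ridge has peak value 3, which is below the direct value 5.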

Example 10

Internal monads in the category of ordinary relations are preorders on their underlying set. The generalization to the interval quantale then gives a fuzzy generalization of the notion of preorder. We can also apply our intuition in terms of connection strengths. Reflexivity tells us that every element can be perfectly connected to itself. Transitivity tells us that the optimal connection strength available is always at least as good as connecting via an intermediate node.

4 Incorporating Convexity

Up to this point, the domain and codomain of our relations have been sets. If we fix an algebraic structure \((\varSigma ,E)\) with set of operations \(\varSigma \) and equations between terms E, we can define a notion of binary relation between these algebras.

Definition 5

An algebraic Q-relation of type \(A \rightarrow B\) is an ordinary Q-relation R between the underlying sets, such that for each operation \(\sigma \in \varSigma \) of arity n the following inequation holds in the quantale order:

$$\begin{aligned} R(a_1,b_1) \otimes ... \otimes R(a_n, b_n) \le R(\sigma (a_1,...,a_n),\sigma (b_1,...,b_n)) \end{aligned}$$

As shown in [33], algebraic Q-relations form a hypergraph category:

Theorem 2

For commutative quantale Q and algebraic signature \((\varSigma ,E)\) there is a hypergraph category \(\mathbf{Rel}_{(\varSigma ,E)}(Q)\) with objects \((\varSigma ,E)\)-algebras and morphisms algebraic Q-relations.

In the conceptual spaces literature, convexity is conceptually important. In [9] this convexity was captured using relations between convex algebras. We refer to [9] and the extended paper [10] for explicit modelling of toy computations of composed concepts in this category.

These convex algebras can be described as the Eilenberg-Moore algebras of the finite distribution monad. They can in fact be presented by a family \(\varSigma _c\) of binary operations

$$\begin{aligned} +^p, \quad p \in (0,1) \end{aligned}$$

satisfying suitable axioms. We can read \(x +^p y\) as “choose x with probability p and y with probability \((1-p)\)”. By considering algebraic \(\mathbf{B}\)-relations over this signature, we can construct a category isomorphic to the category \(\mathbf{ConvexRel}\) of convex relations from [9]. By changing our quantale of truth values, we can go further than this.

Proposition 1

In the category of convex \(\mathbf{L}\)-relations, the internal monads are generalized metric spaces satisfying the additional axioms for \(p \in (0,1)\):

$$\begin{aligned} R(a_1,b_1) + R(a_2,b_2) \ge R(a_1 +^p a_2, b_1 +^p b_2) \end{aligned}$$

So internal monads in the category of convex relations over the Lawvere quantale are generalized metric spaces that interact well with formation of convex mixtures. The usual distance on \(\mathbb {R}^n\) is an example of such a metric.
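As a check of the last claim, the following short derivation instantiates the inequation of Definition 5 with \(\sigma = +^p\) for the Euclidean distance \(d(a,b) = \Vert a - b \Vert \) on \(\mathbb {R}^n\). It assumes that the convex structure on \(\mathbb {R}^n\) is the usual one, \(x +^p y = px + (1-p)y\):

$$\begin{aligned} d(a_1 +^p a_2, b_1 +^p b_2) &= \Vert p(a_1 - b_1) + (1-p)(a_2 - b_2) \Vert \\ &\le p\, d(a_1,b_1) + (1-p)\, d(a_2,b_2) \\ &\le d(a_1,b_1) + d(a_2,b_2) \end{aligned}$$

The first inequality is the triangle inequality together with homogeneity of the norm, and the second uses \(0< p < 1\).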

As shown in [33], every quantale homomorphism \(h : Q_1 \rightarrow Q_2\) induces a strict monoidal functor of type \(\mathbf{Rel}_{(\varSigma ,E)}(Q_1) \rightarrow \mathbf{Rel}_{(\varSigma ,E)}(Q_2) \). If the quantale morphism is injective, this functor is faithful. In particular, the mapping \(\bot \mapsto \infty ; \top \mapsto 0\) is an injective quantale homomorphism from the Boolean to the Lawvere quantale. This means we can find the ordinary Boolean binary relations as a monoidal subcategory of the category \(\mathbf{Rel}(\mathbf{L})\). This presents some flexible modelling possibilities. If U and V are two subsets of a set X, they induce two states \(U, V : I \rightarrow X\) in \(\mathbf{Rel}(\mathbf{B})\). If we consider the number \(V^\circ \circ U\), where \(R^\circ \) denotes relational converse, it evaluates to true if and only if \(U \cap V \ne \emptyset \).

Proposition 2

If \(U, V \subseteq X\) and d is an internal monad in \(\mathbf{Rel}(\mathbf{L})\), the composite \(V^\circ \circ d \circ U\) is the infimum of the distances between elements in U and V.

This gives us the greatest lower bound on the distances between elements in U and V, providing a finer-grained measure of similarity than can conventionally be achieved in relational models. We note that as distances are in general asymmetric, the number \(U^\circ \circ d \circ V\) may give a different measure of similarity. Similarly, we can find the ordinary Boolean convex relations within the category of \(\mathbf{L}\)-valued convex relations, presenting analogous opportunities for performing calculations with discrete convex relations, and then measuring their separation on a continuum of values.
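For finite sets, the composite of Proposition 2 reduces to a minimum over pairs, which the following sketch computes. The function name `separation` and the distance table are hypothetical; the table is deliberately asymmetric so that swapping the roles of U and V changes the answer.

```python
INF = float("inf")

def separation(u, v, d):
    """The number V° o d o U in Rel(L): the infimum of d(x, y), x in U, y in V."""
    return min((d[(x, y)] for x in u for y in v), default=INF)

# Hypothetical asymmetric distances on a three-point set.
d = {
    ("valley", "valley"): 0, ("valley", "ridge"): 3, ("valley", "peak"): 5,
    ("ridge", "valley"): 1, ("ridge", "ridge"): 0, ("ridge", "peak"): 2,
    ("peak", "valley"): 2, ("peak", "ridge"): 1, ("peak", "peak"): 0,
}
U = {"valley"}
V = {"ridge", "peak"}
print(separation(U, V, d))   # 3
print(separation(V, U, d))   # 1
```

When U and V overlap, the result is 0 because d(x, x) = 0; when either set is empty the result is \(\infty \), the bottom element of the quantale.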

Such asymmetric distance measures are of practical use in cognitive science applications. A fundamental concept in psychology is that of similarity, which can be used as the basis of concept formation. Similarity between objects or concepts can be explained by locating objects in some sort of conceptual or feature space, and modelling similarity as a function of distance, for example in [42]. However, judgements of similarity are not necessarily symmetric [45]. In one study examining the similarity between pairs of countries, participants are asked to choose between statements ‘Country A is similar to country B’ or ‘Country B is similar to country A’. In all cases, a majority of participants preferred the statement where the latter country was considered more prominent.

5 Proof Relevance

A span S of sets, between sets A and B, is a set X and a pair of functions \(X \xrightarrow {p_1} A\) and \(X \xrightarrow {p_2} B\). Paralleling the notation for relations, we will write

$$\begin{aligned} S_x(a,b) := x \in X \; \wedge \; p_1(x) = a \; \wedge \; p_2(x) = b \end{aligned}$$

We can think of such a span as a proof relevant relation in which \(S_x(a,b)\) tells us that x witnesses that a and b are related. In a computational linguistics or cognition application where relations may have been derived automatically from data in some way, we can exploit these proof witnesses to track evidence for our beliefs that certain relationships hold.

Sets and spans between them form a hypergraph category \(\mathbf{Span}\) with composition given by pullback, and tensor product induced by a choice of products. In fact, as we did for relations, we can extend these spans with algebraic structure and a choice of truth values in a partially ordered monoid. We no longer require full quantale structure on our truth values, as multiple proof witnesses mean we don’t need to choose a single representative truth value when composing relations.

Definition 6

For an algebraic signature \((\varSigma ,E)\) and pomonoid Q, an algebraic Q-span of type \(A \rightarrow B\) between \((\varSigma ,E)\)-algebras is a span \(A \xleftarrow {p_1} X \xrightarrow {p_2} B\) between the underlying objects, with a characteristic morphism \(\chi : X \rightarrow Q\). We require that the algebraic structure is respected in that for all \(\sigma \in \varSigma \), with arity n:

$$\begin{aligned} \bigwedge _{1 \le i \le n} (p_1(x_i) = a_i \wedge p_2(x_i) = b_i) \Rightarrow \bigotimes _{1 \le i \le n} \chi (x_i) \le \chi (\sigma (x_1,...,x_n)) \end{aligned}$$

Intuitively, these are intensional relations in which proof witnesses are weighted by a truth value, and the relations respect the algebraic structure. As shown in [33], algebraic Q-spans also form a hypergraph category:

Theorem 3

For commutative pomonoid Q and algebraic signature \((\varSigma ,E)\) there is a hypergraph category \(\mathbf{Span}_{(\varSigma ,E)}(Q)\) with objects \((\varSigma ,E)\)-algebras and morphisms algebraic Q-spans.

For an algebraic Q-span S we define

$$\begin{aligned} S^q_x(a,b) := x \in X \; \wedge \; p_1(x) = a \; \wedge \; p_2(x) = b \; \wedge \; \chi (x) = q \end{aligned}$$

We then read \(S_x^q(a,b)\) as telling us that x witnesses that a and b are related with strength q. In fact, we can order algebraic Q-spans in a manner similar to that for relations, but accounting for proof witnesses.

Definition 7

For pomonoid Q, we define a preorder on algebraic Q-spans by setting \((X_1,f_1,g_1,\chi _1) \subseteq (X_2,f_2,g_2,\chi _2)\) if there is a \(\mathbf{Set}\)-monomorphism \(\varphi : X_1 \rightarrow X_2\) such that \(f_1 = f_2 \circ \varphi \), \(g_1 = g_2 \circ \varphi \) and \(\forall x\,.\, \chi _1(x) \le \chi _2(\varphi (x))\).

The ordering accounts pointwise for strengths of relatedness in a natural way. The requirement that the function \(\varphi \) in Definition 7 is a monomorphism ensures that even if our truth values are trivial, we take account of the “number” of proof witnesses available.

As internal monads provided interesting objects in the setting of relations, we should consider them in the span setting as well.

Proposition 3

An internal monad on A in \(\mathbf{Span}(\mathbf{L})\) is an \(\mathbf{L}\)-span \(S : A \rightarrow A\) such that if \(S_x^p(a_1,a_2)\) and \(S_y^q(a_2,a_3)\) we can choose an element \(\varphi (x,y)\) of the apex such that \(S_{\varphi (x,y)}^r(a_1,a_3)\) and \(p + q\) is greater than or equal to r in the usual ordering on the real numbers. Furthermore, we can do this in a way such that the assignment \(\varphi \) is injective.

So internal \(\mathbf{L}\)-span monads further generalize metric spaces to incorporate multiple possible distances, which we can think of as describing different paths between points. We now outline a new practical application of spans in models of language.

Example 11

(Semantic Ambiguity via Spans). In natural language, we often encounter ambiguous situations. For example the word “bank” can refer to either a “river bank” or a “financial bank”. A compositional account of semantic ambiguity was presented in [36], using mathematical models of incomplete information from quantum theory. The techniques applied implicitly assume meanings are built upon a vector space model, to which we apply Selinger’s CPM construction [41] to yield a new category of ambiguous meanings. The CPM construction can also be applied to categories of relations, but in this case it does not provide a satisfactory model of ambiguity [34].

An alternative approach to ambiguity in relational models is to use spans. We consider how the ambiguous word “bank” is related to the word “water”:

  • In the “river bank” context, we would expect a strong relationship.

  • In the “financial bank” context, we would expect a weaker relationship.

By using spans rather than relations, we can introduce two different proof witnesses for the different contexts under consideration. By choosing our quantale of truth values to be the Lawvere quantale \(\mathbf{L}\), we can attach a different choice of distance to each of these choices. As we compose spans to describe the meanings of phrases and sentences, the proof witnesses will keep track of the different possible relationships in play.
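A small sketch of this idea follows. A weighted span is represented as a list of witnesses, each carrying its two endpoints and a distance; the words, witness names and distances are all hypothetical. Composition pairs up witnesses that agree on the middle object, as in a pullback, and weighting the composite witness by the sum of the two distances (the multiplication of \(\mathbf{L}\)) is an assumption of this sketch.

```python
# A weighted span is a list of witnesses (name, source, target, distance).
# All names and distances below are hypothetical.
bank_water = [
    ("river-sense", "bank", "water", 0.5),
    ("financial-sense", "bank", "water", 4.0),
]
water_fish = [
    ("habitat", "water", "fish", 1.0),
]

def compose_spans(s, t):
    """Pullback composition: pair witnesses that agree on the middle object.

    The composite weight p + q is an assumption made for this sketch.
    """
    return [((x, y), a, c, p + q)
            for (x, a, b, p) in s
            for (y, b2, c, q) in t
            if b == b2]

for witness in compose_spans(bank_water, water_fish):
    print(witness)
# (('river-sense', 'habitat'), 'bank', 'fish', 1.5)
# (('financial-sense', 'habitat'), 'bank', 'fish', 5.0)
```

Both interpretations of “bank” survive composition as separate witnesses, whereas the corresponding composite in \(\mathbf{Rel}(\mathbf{L})\) would keep only the minimum distance.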

6 Variable Contexts

Our definitions of algebraic Q-relations and algebraic Q-spans are constructive. This means that Theorems 2 and 3 continue to hold for any elementary topos, as proved in [33]. Standard sources on topos theory are [12, 27, 28, 32]. We will write \(\mathbf{Rel}^{\mathcal {E}}_{(\varSigma , E)}(Q) \) and \(\mathbf{Span}^{\mathcal {E}}_{(\varSigma , E)}(Q) \) for the categories of relations and spans respectively, to make the choice of topos \(\mathcal {E}\) explicit. This generalization has practical implications if we move to different choices of background topos.

Definition 8

Let \(\mathcal {C}\) be a small category. A presheaf on \(\mathcal {C}\) is a functor of type \(\mathcal {C}^{op} \rightarrow \mathbf{Set} \). Presheaves and natural transformations between them form a topos, denoted \(\mathbf{Set} ^{\mathcal {C}^{op}}\). For presheaf X over a preorder, we will write \(X_i\) for the set in the image under X of element i of the preorder, and \(X_{i,j}\) for the image of \(j \le i\) under X.

Presheaves can be interpreted as sets varying with context. This is exactly the perspective we shall adopt in our examples. To exploit our generalized span construction, we need to describe internal pomonoids in presheaf categories.

Lemma 1

A commutative partially ordered monoid in a presheaf category \(\mathbf{Set} ^{\mathcal {C}^{op}} \) is a presheaf Q such that for each \(\mathcal {C}\)-object x and \(\mathcal {C}\)-morphism f, Q(x) is a commutative pomonoid and Q(f) is a pomonoid morphism in \(\mathbf{Set}\). See [28, D1.2.14].

Example 12

(Temporal dependence). In Example 11 we modelled ambiguity using multiple proof witnesses to describe different interpretations of words. We now investigate the description of time dependent ambiguous relationships, by exploiting spans over presheaves. To do so, we consider presheaves over the partial order \(\mathbb {N} = 0 \leftarrow 1 \leftarrow 2 \leftarrow \cdots \) whose objects are the natural numbers. We view these presheaves as sets varying in time. We assume our notion of truth is fixed, and so we will consider \(\mathbf{Span}^{\mathbf{Set} ^{\mathcal {\mathbb {N}}^{op}}}(\mathbf{L}) \), where \(\mathbf{L} \) is the constant presheaf on the pomonoid underlying the Lawvere quantale. An \(\mathbf{L}\)-span between presheaves A and B then consists of a presheaf X, natural transformations \(p_1 : X \Rightarrow A\) and \(p_2 : X \Rightarrow B\), and a characteristic natural transformation \(\chi : X \Rightarrow \mathbf{L} \). We see naturality as a consistency condition between the relationships described by proof witnesses, as they move forward in time. As our pomonoid is constant, \(\chi _i(x) = \chi _j(X_{i,j}(x))\), so the truth value associated with a proof witness is preserved through time. Intuitively, in this model, a steadily increasing collection of relationships hold over time.
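The consistency condition can be illustrated on a finite fragment of the time line. The sketch below makes several simplifying assumptions that are not part of the example: only three time steps are represented, the presheaves A and B are constant, and the witnesses, words and distances are invented. Each stage records the witnesses available at that time, and `step` records where each witness is sent at the next time.

```python
# stages[t][x] = (a, b, distance): witness x at time t relates a to b.
# All witnesses, words and distances are hypothetical.
stages = [
    {"w1": ("bank", "water", 0.5)},
    {"w1": ("bank", "water", 0.5), "w2": ("bank", "money", 0.3)},
    {"w1": ("bank", "water", 0.5), "w2": ("bank", "money", 0.3),
     "w3": ("bank", "loan", 0.2)},
]
# step[t][x]: the witness at time t + 1 that x at time t is sent to.
step = [
    {"w1": "w1"},
    {"w1": "w1", "w2": "w2"},
]

def is_natural(stages, step):
    """Each witness must keep its endpoints and its weight as time advances."""
    return all(
        stages[t + 1][step[t][x]] == stages[t][x]
        for t in range(len(step)) for x in stages[t]
    )

print(is_natural(stages, step))   # True
```

Comparing whole triples checks naturality of both legs and of the characteristic map at once; a witness whose distance changed between two stages would make the check fail.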

Example 13

(Perspective Dependence). In Example 12, the truth object was fixed in all contexts. We now examine a brief example in which our notion of truth is context dependent. Consider two agents. Agent 0 has a binary view of the world, relationships either hold or they don’t. Agent 1 has a richer view incorporating different strengths of relation in the unit interval. Consider presheaves on the category \(\mathcal {C}\) with a single non-trivial arrow \(0 \leftarrow 1\). We define an internal pomonoid Q with \(Q(0) = \mathbf{B} \), \(Q(1) = \mathbf{I} \) and \(Q_{0,1}\) the canonical pomonoid morphism between the Boolean and interval quantales. Now if we consider a Q-span between constant presheaves A and B with apex an arbitrary presheaf X, we can think of it as follows. Each element of \(X_0\) relates two elements \(a \in A\) and \(b \in B\) with strength 0 or 1. The structure of X then forces that \(X_1\) contains a witness relating those two elements with the same strength. As \(X_1\) encodes the views of the more powerful agent, it may describe additional relationships, now with strengths weighted in the interval [0, 1].

If we wish to consider algebraic Q-relations over an arbitrary topos things are more delicate since internal quantales cannot be defined pointwise. Nevertheless there are standard sources of internal commutative quantales, for example:

  • If \(\mathcal {C}\) is a groupoid and Q is a commutative quantale in \(\mathbf{Set}\), then Q can be lifted to an internal commutative quantale in \(\mathbf{Set} ^{\mathcal {C}^{op}} \).

  • The subobject classifier \(\varOmega \) of a topos is an internal locale, and therefore an internal commutative quantale.

We conclude by establishing the relationship between our framework of generalized relations and the standard notion of the category of relations over a regular category. This will involve the internal locale given by the subobject classifier.

Definition 9

A category \(\mathcal {C}\) is regular if it is finitely complete, every kernel pair has a coequalizer and regular epimorphisms are stable under pullback.

There is a standard construction of a category of relations \(\mathbf{Rel}(\mathcal {C}) \) over a regular category \(\mathcal {C}\), see for example [13]. For the category \(\mathbf{Set}\), for example, this construction recovers exactly the usual category of binary relations. As we have been constructing categories of relations in this paper, it is natural to ask how our constructions relate to the relations of a regular category. Every topos is regular, and in fact for any algebraic theory \((\varSigma ,E)\), the category of internal \((\varSigma ,E)\)-algebras is a regular category [7], meaning we can consider the impact of algebraic structure. In fact, the resulting category of relations is equivalent to the one produced by our construction with the subobject classifier as the object of truth values.

Theorem 4

Let \(\mathcal {E}\) be a topos, \(\varOmega \) its subobject classifier and \((\varSigma ,E)\) an algebraic signature. The category \(\mathbf{Rel}^{\mathcal {E}}_{(\varSigma , E)}(\varOmega ) \) resulting from the algebraic Q-relations construction is equivalent to the category of internal relations over the regular category of internal \((\varSigma ,E)\)-algebras in \(\mathcal {E}\).

In this way, we see that relations over suitable regular categories are a special case of our construction.

7 Conclusion

We have demonstrated that categories of generalized relations present a flexible modelling tool for categorical compositional models of natural language and cognition. We presented various potential models worthy of further investigation, capturing features such as fuzziness, distances, convexity, ambiguity and context sensitivity, and showed how these features can be used in combination within a generic framework. One natural direction for further work would be empirical investigation of the compatibility of these theoretical models with concrete applications. Another one would be to investigate whether the techniques in [44] can be used to build models with either non-commutative or typed quantales, known as quantaloids.