Keywords

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

1 Introduction

The \(\mathcal{DLF}\) family of description logics (DLs) have been designed primarily to support reasoning tasks about object relational data sources. This includes the \(\mathcal{CFD}\) sub-family that admits PTIME algorithms for many of these tasks. Unlike the usual case of role-based DLs [1], dialects in this family are feature-based, that is, are fragments of first order logic with underlying signatures that replace binary predicate symbols, called roles, with unary function symbols, interpreted as total functions, called features. Since features are intended to capture the notion of an attribute in a data source, a column for a relational table for example, this practice has led to some cognitive incongruity: one must reconcile that every attribute is fundamentally defined for every object and introduce protocols for indirectly saying when an attribute is or is not meaningful for various kinds of objects.

In this paper, we show how computational properties of DLs in the \(\mathcal{DLF}\) family are preserved when they are modified to address such incongruity. This modification is as follows. First, features, now called partial features, are instead interpreted as partial functions, and second, a concept constructor is added that makes it possible to refer to all objects that have a value for a given partial feature. The added constructor yields to an ability for any of the DLs to define cases in which partial functions become total functions, such as to say that every employee has a salary, or, for two of the DLs considered, to define cases in which partial functions are not meaningful, such as to say that departments do not have a salary.

We consider the particular problem of reasoning about logical implication for three representative members of the \(\mathcal{DLF}\) family: \(\mathcal{DLFD}\) [57], \(\mathcal{CFD}\) [3, 8] and \({{\mathcal {CFD}}^{\forall }_{nc}}\) [9, 10]. The first is a very expressive dialect for which logical implication is EXPTIME-complete, while the remaining two are also members of the \(\mathcal{CFD}\) family and therefore have PTIME decision procedures for this problem. Note that all three include a concept constructor for capturing keys and functional dependencies in object relational data sources and can express, for example, that no two departments have the same manager, or that an employee’s pay grade determines her salary. The constructor is called a path functional dependency (PFD).

Our contributions, in the order presented, are as follows. Note that, in presenting them and for the remainder of the paper, we write \({ partial\!-\! \mathcal{{DLFD}}}\), \({ partial\!-\! \mathcal{{CFD}}}\) and \({partial\!-\!}{{\mathcal {CFD}}}^{\forall }_{nc}\) to refer, respectively, to \(\mathcal{DLFD}\), \(\mathcal{CFD}\) and \({{\mathcal {CFD}}^{\forall }_{nc}}\) when they are presumed to be modified to support partial functions in the above fashion:

  1. 1.

    We introduce a semantics for PFDs when features are partial functions that is entirely neutral on issues of feature existence, that is, on whether certain kinds of objects must have or even can have values for particular features;Footnote 1

  2. 2.

    We show that logical implication for \({ partial\!-\! \mathcal{{DLFD}}}\) reduces to logical implication for \(\mathcal{DLFD}\), in the process showing that partial functions can be simulated in a straightforward fashion in this dialect; and

  3. 3.

    We show that logical implication for both \({ partial\!-\! \mathcal{{CFD}}}\) and \({partial\!-\!}{{\mathcal {CFD}}}^{\forall }_{nc}\) remains in PTIME by exhibiting refinements of existing respective PTIME decision procedures for deciding logical implication with \(\mathcal{CFD}\) and \({{\mathcal {CFD}}^{\forall }_{nc}}\).

We conclude with summary comments and an outline of possible directions for future work, with a particular focus on \({{\mathcal {CFD}}^{\forall }_{nc}}\) in the latter case: on partial functions for a recent extension, and on relaxing syntactic restrictions to enable straightforward partial function simulation, as we show is possible for \(\mathcal{DLFD}\).

2 Background and Definitions

In this section, we begin by reviewing the basic definitions for member dialects of the \(\mathcal{DLF}\) family in which features are interpreted as total functions, and then proceed to introduce modifications that yield support for partial features, that is, features that are instead interpreted as partial functions.

Note that DLs in this family do not forgo the ability to capture roles or indeed n-ary relations in general. This can be accomplished with the simple expedient of reification via features, and by using the above-mentioned PFD concept constructor common to these dialects to ensure a set semantics for reified relations. Indeed, the first dialect we consider, \(\mathcal{DLFD}\), can capture very expressive role-based dialects, including dialects with so-called qualified number restrictions, inverse roles, role hirarchies, and so on [5].

Definition 1

(Feature-Based DLs). Let \(\mathsf {F}\) and \(\mathsf {PC}\) be sets of feature names and primitive concept names, respectively. A path expression is defined by the grammar “ \(\mathop {\mathsf {Pf}}\nolimits {:}{:}= f.\mathop {\mathsf {Pf}}\nolimits | \mathop { id }\nolimits \)” for \(f\in \mathsf {F}\). We define derived concept descriptions by the grammar on the left-hand-side of Fig. 1.

An inclusion dependency \(\mathcal C\) is an expression of the form \(C_1\sqsubseteq C_2\). A terminology (TBox) \(\mathcal T\) consists of a finite set of inclusion dependencies. A posed question \(\mathcal{Q}\) is a single inclusion dependency.

Fig. 1.
figure 1

Syntax and semantics of \(\mathcal{DLFD}\)/\(\mathcal{CFD}\) concepts.

The semantics of expressions is defined with respect to a structure \(\mathcal{I}=(\triangle , \cdot ^\mathcal{I})\), where \(\triangle \) is a domain of “objects” and \(\cdot ^{\mathcal{I}}\) an interpretation function that fixes the interpretations of primitive concepts A to be subsets of \(\triangle \) and primitive features f to be total functions \(f^{\mathcal{I}}:\triangle \rightarrow \triangle \). The interpretation is extended to path expressions, \(\mathop { id }\nolimits ^{\mathcal{I}} = \lambda x.x\), \((f.\mathop {\mathsf {Pf}}\nolimits )^{\mathcal{I}} = \mathop {\mathsf {Pf}}\nolimits ^{\mathcal{I}}\circ f^{\mathcal{I}}\) and derived concept descriptions C as defined in the centre column of Fig. 1.

An interpretation \(\mathcal{I}\) satisfies an inclusion dependency \(C_1\sqsubseteq C_2\) if \(C_1^{\mathcal{I}}\subseteq C_2^{\mathcal{I}}\) and is a model of \(\mathcal {T}\) (\(\mathcal{I}\,\models \,\mathcal {T}\)) if it satisfies all inclusion dependencies in \(\mathcal {T}\). The logical implication problem asks if \(\mathcal {T}\,\models \,\mathcal {Q}\) holds, that is, if \(\mathcal{Q}\) is satisfied in all models of \(\mathcal {T}\). \(\Box \)

In the following, we simplify the notation for path expressions by allowing a syntactic composition \(\mathop {\mathsf {Pf}}\nolimits _1.\mathop {\mathsf {Pf}}\nolimits _2\) that stands for their concatenation.

In this presentation, we do not consider so-called ABoxes (sets of assertions about membership of individuals in descriptions) and the associated problem of knowledge base consistency. However, these can be reduced to logical implication problems involving posed questions that utilize value restrictions and equational same-as descriptions [8].

Also note that the logical implication problem for TBoxes and posed questions characterized so far, that allow arbitrary concepts in inclusion dependencies, is not decidable for a variety of reasons (e.g., see [7] for one case involving arbitrary PFDs and ABoxes encoded in the above manner). However, restrictions on occurrences of concept constructors has led to a number of decidable fragments that range from light-weight to expressive dialects of feature-based DLs. The restrictions that obtain \(\mathcal{DLFD}\), \(\mathcal{CFD}\) and \({{\mathcal {CFD}}^{\forall }_{nc}}\), the focus of our attention, are given in Sect. 3 for the first case and in Sect. 4 for the remaining two cases.

The two definitions that follow now introduce the necessary modifications to our characterization of feature-based DLs to accommodate partial features, that is, features that are interpreted as partial functions. Note that their presentation relies on our notational convention given in our introductory comments of qualifying particular dialects with the word “partial” whenever we intend such modifications to apply, as in \({ partial\!-\! \mathcal{{DLFD}}}\) for example.

Definition 2

(Partial Features and Existential Restrictions). The syntax of feature-based DLs is extended with an additional concept constructor of the form \(\exists {f}\), called an existential restriction. Semantics is updated as follows:

  1. 1.

    features \(f\in F\) are now interpreted as partial functions on \(\triangle \) (i.e., the result can be undefined for parts of \(\triangle \)); and

  2. 2.

    the \(\exists {f}\) concept constructor is interpreted as \(\{x: \exists y\in \triangle . f^{\mathcal{I}}(x)=y\}\).

Also, in this setting, path functions (\(\mathop {\mathsf {Pf}}\nolimits \)) naturally denote composition of partial functions yielding a partial function, equality (\(=\)) is true only when both of its arguments are defined (in addition to being equal), set membership (\(\in \)) requires only defined values to be members of its right hand side argument, etc.Footnote 2 \(\Box \)

Observe that features are still functional, and that there is therefore no need for a qualified existential restriction of the form \(\exists {f}.C\), with the standard meaning \((\exists {f}.C)^{\mathcal{I}}\) given by

$$\{x: \exists y\in \triangle . f^{\mathcal{I}}(x)=y\wedge y\in C^{\mathcal{I}}\}.$$

Indeed, they can be simulated using the following identity:

$$\exists {f}.C = \exists {f}\sqcap \forall f.C.$$

Using this identity, we write \(\exists {\mathop {\mathsf {Pf}}\nolimits }\) in the following as shorthand for

$$ \exists {f_1}\sqcap \forall f_1.(\exists {f_2}\sqcap \forall f_2.( \ldots (\exists {f_k})\ldots )).$$

On interpreting the PFD constructor in the presence of partial features: the minimum necessary (and we believe most natural) circumstance in which one obtains a violation of a PFD inclusion dependency of the form

$$ C_1 \sqsubseteq C_2:\mathop {\mathsf {Pf}}\nolimits _1,\ldots ,\mathop {\mathsf {Pf}}\nolimits _k\rightarrow \mathop {\mathsf {Pf}}\nolimits _0 $$

happens when all path functions \(\mathop {\mathsf {Pf}}\nolimits _0,\ldots ,\mathop {\mathsf {Pf}}\nolimits _k\) are defined for a \(C_1\) object \(e_1\) and a \(C_2\) object \(e_2\), and in which \(\mathop {\mathsf {Pf}}\nolimits _i^{\mathcal{I}}(e_1) = \mathop {\mathsf {Pf}}\nolimits _i^{\mathcal{I}}(e_2)\) holds only for \(i >0\). This yields the following modification to the interpretation of PFDs in the presence of partial features that we now adopt:

$$ \begin{array}{l} (C:\mathop {\mathsf {Pf}}\nolimits _1,\ldots ,\mathop {\mathsf {Pf}}\nolimits _k\rightarrow \mathop {\mathsf {Pf}}\nolimits _0)^{\mathcal{I}} = \{x: \forall y. y\in C^{\mathcal{I}}\wedge x\in (\exists {\mathop {\mathsf {Pf}}\nolimits _0})^{\mathcal{I}}\wedge y\in (\exists {\mathop {\mathsf {Pf}}\nolimits _0})^{\mathcal{I}}\wedge {}\\ ~~~~~~\bigwedge _{i=1}^k(x\in (\exists {\mathop {\mathsf {Pf}}\nolimits _i})^{\mathcal{I}}\wedge y\in (\exists {\mathop {\mathsf {Pf}}\nolimits _i})^{\mathcal{I}}\wedge \mathop {\mathsf {Pf}}\nolimits _i^{\mathcal{I}}(x)=\mathop {\mathsf {Pf}}\nolimits _i^{\mathcal{I}}(y)) \rightarrow \mathop {\mathsf {Pf}}\nolimits _0^{\mathcal{I}}(x)=\mathop {\mathsf {Pf}}\nolimits _0^{\mathcal{I}}(y) \}. \end{array} $$

Observe that this definition coincides with the original semantics of the PFD constructor given in Fig. 1 when features are interpreted as total functions. Also note that, without this modification to semantics, the strict interpretation of undefined values would mean that satisfying the left-hand-side of a PFD would imply the existence of “ \(\mathop {\mathsf {Pf}}\nolimits _0\) paths” (and equality of the “endpoint” of these paths), a circumstance that would violate feature existence neutrality of PFDs mentioned in our introductory comments that seems desirable.

We now return to examples of constraints mentioned in our introductory comments to illustrate the use of existential restrictions and PFDs in \(\mathcal{DLF}\) dialects with partial features. Each can be expressed in \({ partial\!-\! \mathcal{{DLFD}}}\) and \({partial\!-\!}{{\mathcal {CFD}}}^{\forall }_{nc}\), and each but the second mentioning negation in \({ partial\!-\! \mathcal{{CFD}}}\):

  1. 1.

    \(\mathrm {EMP}\sqsubseteq \exists { salary }\) (every employee has a salary);

  2. 2.

    \(\mathrm {DEPT}\sqsubseteq \lnot \exists { salary }\) (departments do not have a salary);

  3. 3.

    \(\mathrm {DEPT}\sqsubseteq \mathrm {DEPT}: manager \rightarrow \mathop { id }\nolimits \) (no two departments have the same manager); and

  4. 4.

    \(\mathrm {EMP}\sqsubseteq \mathrm {EMP}: paygrade \rightarrow salary \) (employee pay grades determine salaries).

3 Expressive Feature Logics: The \(\mathcal{DLF}\) Family

In this section, we consider the impact of partial features in expressive feature-based description logics, namely in \(\mathcal{DLF}\) and \(\mathcal{DLFD}\) [57]. \(\mathcal{DLF}\) allows both TBox and posed question dependencies to contain concepts formed from primitive concepts and bottom using negation, conjunction, disjunction, and restriction concept constructors. \(\mathcal{DLFD}\) in addition allows the PFD concept constructor to appear on the right hand sides of inclusion dependencies. These restrictions on syntax yield an expressive Boolean complete description logic with a logical implication problem that is complete for EXPTIME. Additional extensions, e.g., allowing PFDs on the left-hand sides of inclusion dependencies or equational constraints in the posed questions (or equivalently ABoxes) leads to undecidability [7].

We now proceed to demonstrate that partial features can be effectively simulated in the original logics by introducing an auxiliary primitive concept G that stands for existing or generated objects, and by using value restrictions to assign membership of objects generated by the \(\exists {f}\) constructor to this concept. All remaining inclusion dependencies are then simply preconditioned by this auxiliary concept.

Formally, let \(\mathcal {T}\) be a \({ partial\!-\! {\mathcal {DLF}}}\) TBox in which all inclusion dependencies are of the form \(\top \sqsubseteq C\). We define a \(\mathcal{DLF}\) TBox \(\mathcal {T}_{\mathcal{DLF}}\) as

$$ \begin{array}{ll} \mathcal {T}_{\mathcal{DLF}} = &{}\{ G \sqsubseteq C[\exists {f}\mapsto \forall f.G, \text{ for } \text{ all } f\in F] \mid \top \sqsubseteq C\in \mathcal {T}\} \\ &{}{}\cup \{\forall f.G\sqsubseteq G \mid f\in F\}, \end{array} $$

where G is a primitive concept not occurring in \(\mathcal {T}\). Note that the substitution \([\exists {f}\mapsto \forall f.G, \text{ for } \text{ all } f\in F]\) is applied simultaneously to all occurrences of the \(\exists {f}\) constructor in the concept C.

Theorem 3

Let \(\mathcal {T}\) be a \({ partial\!-\! {\mathcal {DLF}}}\) TBox in which all inclusion dependencies are of the form \(\top \sqsubseteq C\). Then

$$\mathcal {T}\,\models \,\top \sqsubseteq C \text{ if } \text{ and } \text{ only } \text{ if } \mathcal {T}_{\mathcal{DLF}}\,\models \,G\sqsubseteq C[\exists {f}\mapsto \forall f.G, \text{ for } \text{ all } f\in F],$$

for G a fresh primitive concept.

Proof (sketch): For any \(\mathcal{I}\) where \(\mathcal{I}\,\models \,\mathcal {T}_{\mathcal{DLF}}\), we can define an interpretation \(\mathcal{J}=(G^{\mathcal{I}}, \cdot ^{\mathcal{I}}_{|G^{\mathcal{I}}})\). It is easy to verify that \(\mathcal{J}\,\models \,\mathcal {T}\) and also that \(\mathcal{J}\,\models \,\top \sqsubseteq C\) since \(\mathcal{I}\,\models \,G\sqsubseteq C[\exists {f}\mapsto \forall f.G, \text{ for } \text{ all } f\in F]\).

For the other direction, we need to extend a model \(\mathcal{J}\) of \(\mathcal {T}\) to a model \(\mathcal{I}\) of \(\mathcal {T}_{\mathcal{DLF}}\) by setting \(G^{\mathcal{I}}=\triangle ^{\mathcal{J}}\) and by adding missing features connecting \(\mathcal{I}\) to complete \(F^*\) trees with all nodes in \((\lnot G)^{\mathcal{I}}\). This way, either \(\mathcal{I}\) coincides with \(\mathcal{J}\) or satisfies dependencies in \(\mathcal {T}_{\mathcal{DLF}}\) and \(G\sqsubseteq C[\exists {f}\mapsto \forall f.G, \text{ for } \text{ all } f\in F]\) vacuously. \(\Box \)

To extend this construction to the full \({ partial\!-\! \mathcal{{DLFD}}}\) logic, it is sufficient to encode the path function existence preconditions in terms of the auxiliary concept G as follows: if \(A\sqsubseteq B:\mathop {\mathsf {Pf}}\nolimits _1,\ldots ,\mathop {\mathsf {Pf}}\nolimits _k\rightarrow \mathop {\mathsf {Pf}}\nolimits _0\in \mathcal {T}\) then

(1)

is added \(\mathcal {T}_{\mathcal{DLFD}}\). Here, we are assuming w.l.o.g. that A and B are primitive concept names (\(\mathcal{DLFD}\) allows one to give such names to complex concepts).

Theorem 4

Let \(\mathcal {T}\) be a \({ partial\!-\! \mathcal{{DLFD}}}\) TBox in which all inclusion dependencies are of the form \(\top \sqsubseteq C\) or \(A\sqsubseteq B:\mathop {\mathsf {Pf}}\nolimits _1,\ldots ,\mathop {\mathsf {Pf}}\nolimits _k\rightarrow \mathop {\mathsf {Pf}}\nolimits _0\). Then

$$\begin{array}{l} \mathcal {T}\,\models \,\top \sqsubseteq C \text{ if } \text{ and } \text{ only } \text{ if } \mathcal {T}_{\mathcal{DLFD}}\,\models \,C\sqsubseteq D[\exists {f}\mapsto \forall f.G, \text{ for } \text{ all } f\in F], \text{ and }\\ \mathcal {T}\,\models \,A\sqsubseteq B:\mathop {\mathsf {Pf}}\nolimits _1,\ldots ,\mathop {\mathsf {Pf}}\nolimits _k\rightarrow \mathop {\mathsf {Pf}}\nolimits \text{ if } \text{ and } \text{ only } \text{ if } \mathcal {T}_{\mathcal{DLFD}}\,\models \,(1), \end{array}$$

for G a fresh primitive concept.

Proof (sketch): Logical implication in \(\mathcal{DLFD}\) can be reduced to logical implication in \(\mathcal{DLF}\) [57]. Hence the claim holds by observing that (*) captures properly the semantics of PFDs and then by appealing to Theorem 3. \(\Box \)

Corollary 5

Logical implication is EXPTIME-complete for \({ partial\!-\! {\mathcal {DLF}}}\) and for \({ partial\!-\! \mathcal{{DLFD}}}\). \(\Box \)

Similar results can be obtained for other members of the \(\mathcal{DLF}\) family.

4 Tractable Logics: The \(\mathcal{CFD}\) Family

We now consider how partial features impact logical consequence for light-weight (PTIME) feature-based description logics, namely \(\mathcal{CFD}\) [3, 8] and \({{\mathcal {CFD}}^{\forall }_{nc}}\) [9, 10]. Both of these logics allow the use of an ABox. Hence, PFDs must adhere to one of the following two forms to avoid undecidability [7]:

$$\begin{aligned} \begin{array}{ll} 1. &{} C:\mathop {\mathsf {Pf}}\nolimits _1,\ldots ,\mathop {\mathsf {Pf}}\nolimits . \mathop {\mathsf {Pf}}\nolimits _i,\ldots ,\mathop {\mathsf {Pf}}\nolimits _k\rightarrow \mathop {\mathsf {Pf}}\nolimits ~\text{ or } \\ 2. &{} C:\mathop {\mathsf {Pf}}\nolimits _1,\ldots ,\mathop {\mathsf {Pf}}\nolimits . \mathop {\mathsf {Pf}}\nolimits _i,\ldots ,\mathop {\mathsf {Pf}}\nolimits _k\rightarrow \mathop {\mathsf {Pf}}\nolimits . f \end{array} \end{aligned}$$
(2)

With this restriction, originally introduced in [3], posed questions can contain inclusion dependencies formed from concepts in Fig. 1 (with a few mild restrictions when tractability in the size of the posed question is required). For simplicity, however, we assume that the concepts in the posed question \(\mathcal{Q} = E_1\sqsubseteq E_2\) adhere to the following grammar:

$$ E\,\,{:}{:}\!\!= A \mid \bot \mid E\sqcap E \mid \forall \mathop {\mathsf {Pf}}\nolimits .E \mid (\mathop {\mathsf {Pf}}\nolimits _1=\mathop {\mathsf {Pf}}\nolimits _2).$$

More complex posed questions, e.g., ones that contain the PFD constructor [8], can be equivalently expressed in the above grammar (perhaps as a sequence of posed questions).

Note that, due to syntactic restrictions on TBox inclusion dependencies in \(\mathcal{CFD}\) and \({{\mathcal {CFD}}^{\forall }_{nc}}\) (see below), we will not be able to directly simulate partial features as was done with \(\mathcal{DLF}\) and \(\mathcal{DLFD}\) above. However, the approach to extending/modifying existing decision procedures for logical implication in the respective logics is analogous: in both cases we introduce an additional unary predicate \(\mathsf {D}(x)\) to mark the necessarily existing objects and use this predicate to restrict the applications of inclusion dependencies. This in turn simulates partial features.

4.1 \({ partial\!-\! \mathcal{{CFD}}}\)

To obtain a PTIME decision procedure for \(\mathcal{CFD}\), we need to further restrict the inclusion dependencies allowed in the TBox \(\mathcal {T}\) as follows:

  1. 1.

    left hand sides must be conjunctions of primitive concepts, and

  2. 2.

    right hand sides must be primitive concepts, conjunctions, value restrictions, existential restrictions, and PFDs (obeying restrictions in (2)).

With these restrictions we can show that the logical implication problem for \({ partial\!-\! \mathcal{{CFD}}}\) is in PTIME. Our proof is based on encoding a given problem as a collection of Horn clauses. The reduction introduces terms that correspond to path expressions, and relies on the fact that the number of required terms is polynomial in the size of the problem itself.

Fig. 2.
figure 2

Expansion rules.

Definition 6

(Expansion Rules). Let \(\mathcal{T}\) and \(\mathcal{Q}\) be a \( partial\!-\! {\mathcal{{CFD}}}\) terminology and a posed question, respectively. We write \(\mathop {\mathsf {CON}}(\mathcal{T},\mathcal{Q})\) to denote the set of all subconcepts appearing in \(\mathcal{T}\) and \(\mathcal{Q}\), define \(\mathsf {PF}(\mathcal{T, Q})\) to be the set

$$ \{ \mathop {\mathsf {Pf}}\nolimits .\mathop {\mathsf {Pf}}\nolimits '\mid \begin{array}[t]{@{\,}r@{\;}} \mathop {\mathsf {Pf}}\nolimits \textit{is a prefix of a path expression in }\mathcal{Q} \textit{ and} ~~~~\\ \mathop {\mathsf {Pf}}\nolimits ' \textit{is a feature occurring in }\mathcal{T}\textit{ or }\mathop { id }\nolimits \}, \end{array} $$

write \(\mathsf {C}_C\) to denote unary predicates for \(C\in \mathop {\mathsf {CON}}(\mathcal{T,Q})\), and introduce a unary predicate \(\mathsf {D}\) and a binary predicate \(\mathsf {E}\), with all predicates ranging over the universe \(\mathsf {PF}(\mathcal{T, Q})\). The expansion rules for a given terminology \(\mathcal T\), denoted \(\mathsf {R}(\mathcal{T})\), are defined in Fig. 2.Footnote 3

A goal for each concept E is a set of ground assertions defined as follows:

Given two concept descriptions \(E_1\) and \(E_2\), we say that

$$\mathsf {R}(\mathcal{T})\cup \{\mathsf {C}_{E_1}(\mathop { id }\nolimits )\} \,\models \,\mathsf {G}_{E_2}$$

if \(\mathsf {G}_{E_2}\subseteq M\) for every minimal ground model M of \(\mathsf {R}(\mathcal{T})\) over \(\mathsf {PF}(\mathcal{T, Q})\) that contains \(\mathsf {C}_{E_1}(\mathop { id }\nolimits )\) and \(\mathsf {D}(\mathop { id }\nolimits )\). \(\Box \)

Intuitively, \(\mathsf {PF}(\mathcal{T, Q})\) represents a finite graph of objects, predicates \(\mathsf {E}(\mathop {\mathsf {Pf}}\nolimits _1,\mathop {\mathsf {Pf}}\nolimits _2)\) express equality of the objects at the end of paths \(\mathop {\mathsf {Pf}}\nolimits _1\) and \(\mathop {\mathsf {Pf}}\nolimits _2\), and predicates \(\mathsf {C}_{C'}(\mathop {\mathsf {Pf}}\nolimits )\) express that the object at the end of path \(\mathop {\mathsf {Pf}}\nolimits \) is in the interpretation of concept \(C'\).

Our PTIME result for the \({ partial\!-\! \mathcal{{CFD}}}\) implication problem follows by a simple check for goals occurring in a ground model for expansion rules generated by a polynomial sized collection of path expressions.

Theorem 7

Let \(\mathcal{T}\) be a \( partial\!-\! {\mathcal{{CFD}}}\) terminology and \(\mathcal{Q}\) a posed question of the form \(E_1\sqsubseteq E_2\). Then

$$\mathcal{T}\,\models \,\mathcal{Q} \mathrm {~iff~} \begin{array}[t]{@{}l} \mathsf {R}(\mathcal{T})\cup \{\mathsf {C}_{E_1}(\mathop { id }\nolimits ),\mathsf {D}(\mathop { id }\nolimits )\} \,\models \,\mathsf {G}_{E_1} \mathrm {~or~}\\ \mathsf {R}(\mathcal{T})\cup \{\mathsf {C}_{E_1}(\mathop { id }\nolimits ),\mathsf {D}(\mathop { id }\nolimits )\}\,\models \, \mathsf {G}_{\forall \mathop {\mathsf {Pf}}\nolimits .\bot }(\mathop { id }\nolimits )\mathrm {~for~some~} \mathop {\mathsf {Pf}}\nolimits \in \mathsf {PF}(\mathcal{T, Q}). \end{array}$$

Proof (sketch): If \(\mathsf {C}_\bot (\mathop {\mathsf {Pf}}\nolimits )\) and \(\mathsf {D}(\mathop {\mathsf {Pf}}\nolimits )\) for \(\mathop {\mathsf {Pf}}\nolimits \in \mathsf {PF}(\mathcal{T, Q})\) appear in M, where M is the least model of \(\mathsf {R}(\mathcal{T})\cup \{\mathsf {C}_{E_1}(\mathop { id }\nolimits )\}\) \(\mathsf {R}(\mathcal {T}\)), then the concept \(E_1\) is unsatisfiable w.r.t. \(\mathcal {T}\) since only implied facts appear in M, and therefore the subsumption holds for any \(E_1\) and \(\mathcal{T}\).

Otherwise, if \(\mathsf {R}(\mathcal{T})\cup \{\mathsf {C}_{E_1}(\mathop { id }\nolimits )\}\not \,\models \,\mathsf {G}_{E_1}\), then there must be a model M of \(\mathsf {R}(\mathcal{T})\cup \{\mathsf {C}_{E_1}(\mathop { id }\nolimits )\}\) such that \(G\not \in M\) for some \(G\in \mathsf {G}_{E_1}\). We construct an interpretation \(\mathcal{I}_M\) such that \(\mathcal{I}_M\,\models \,\mathcal{T}\) but \(\mathcal{I}_M\not \,\models \,\mathcal{Q}\). The interpretation \(\mathcal{I}_M\) contains an object o for each equivalence class defined on the set \(\mathsf {PF}(\mathcal{T, Q})\) by the interpretation of \(\mathsf {E}\). The class membership of these objects is determined by the membership of the corresponding path in the interpretations of the \(\mathsf {C}_C\) predicates in M. Note that, due to the syntactic restriction imposed on PFDs, this is sufficient to satisfy all PFDs in \(\mathcal{T}\) since any precondition or a non-trivial consequence of a PFD can only manifest on some path belonging to \(\mathsf {PF}(\mathcal{T, Q})\) and beginning at the distinguished object o. To complete the construction of \(\mathcal{I}_M\), we simply attach a unique complete tree \(\mathsf {F}^*\) to each leaf node (i.e., a node that is missing successors). Nodes of these complete trees belong to all primitive descriptions in \(\mathcal{I}_M\) and thus satisfy \(\mathcal{T}\).

Conversely, assume \( \mathsf {R}(\mathcal{T})\cup \{\mathsf {C}_{E_1}(\mathop { id }\nolimits )\}\,\models \,\mathsf {G}_{E_1}\) but \(\mathcal{T}\not \,\models \,\mathcal{Q}\). Then there must be an interpretation \(\mathcal{I}\) and an object \(o\in \triangle \) such that \(\mathcal{I}\,\models \,\mathcal{T}\) and \(o\in E_1^{\mathcal{I}}-E_2^{\mathcal{I}}\). Thus, there is a model \(M_\mathcal{I}\) of \(\mathsf {R}(\mathcal{T})\) such that \(\mathsf {C}_{E_1}(\mathop { id }\nolimits )\in M_\mathcal{I}\). In this model, the element \(\mathop { id }\nolimits \in \mathsf {PF}(\mathcal{T, Q})\) serves as the counterpart of the object o and the interpretations of the predicates \(\mathsf {C}_C\) and \(\mathsf {E}\) is extracted from \(\mathcal{I}\) by navigating all (pairs of) path functions in \(\mathsf {PF}(\mathcal{T, Q})\). However, since \(o\not \in E_2^{\mathcal{I}}\), it must be the case that \(M_\mathcal{I}\) is a strict subset of the least model of \(\mathsf {R}(\mathcal{T})\cup \{\mathsf {C}_{E_1}(\mathop { id }\nolimits )\}\); a contradiction. \(\Box \)

Since the expansion rules are Horn clauses over a finite universe \(\mathsf {PF}(\mathcal{T, Q})\) of polynomial size, we have the following:

Corollary 8

Let \(\mathcal{T}\) be a terminology and \(\mathcal{Q}\) a posed question in \({ partial\!-\! \mathcal{{CFD}}}\). Then the implication problem \(\mathcal{T}\,\models \,\mathcal{Q}\) is complete for PTIME.

Proof (sketch): The least model of \(\mathsf {R}(\mathcal{T})\cup \{\mathsf {C}_{E_1}(\mathop { id }\nolimits )\}\) can be obtained by using a bottom-up construction of the least fix-point of the rules in time polynomial in \(|\mathcal{T}|+|\mathcal{Q}|\) (since all predicates in \(\mathsf {R}(\mathcal {T})\) have a fixed arity). Hardness follows from embedding Horn-SAT into reasoning with PFDs. \(\Box \)

In practice, elements of this set can be constructed on demand by using additional Horn rules in such a way that only path expressions needed to confirm subsumption or non-subsumption are generated [4].

Note that neither \({ partial\!-\! \mathcal{{CFD}}}\) nor \(\mathcal{CFD}\) can be extended to allow disjointness (bottom (\(\bot \))), negation (hence \(\lnot \exists {f}\) cannot be used), or disjunction on the right hand sides of inclusion dependencies while maintaining PTIME decidability of logical implication [8]. However, a similar technique as in the above development can be used to handle partial features without impacting the complexity of the logical implication problems.

4.2 \({partial\!-\!}{{\mathcal {CFD}}}^{\forall }_{nc}\)

\({partial\!-\!}{{\mathcal {CFD}}}^{\forall }_{nc}\) shares the PFD restrictions with \({ partial\!-\! \mathcal{{CFD}}}\). However, it trades the ability to use conjunctions on the left hand sides of TBox inclusion dependencies for the ability to express disjointness and conditional typing:

  1. 1.

    left hand sides must be primitive concepts or value restrictions, and

  2. 2.

    right hand sides must be a primitive concepts, negations of primitive concepts, conjunctions, value restrictions, existential restrictions, and PFDs (again, restricted as in (2)).

It is easy to see that every \({partial\!-\!}{{\mathcal {CFD}}}^{\forall }_{nc}\) TBox \(\mathcal {T}\) is consistent (by setting all primitive concepts to be interpreted as the empty set). It is, however, no longer true that all primitive concepts (and their conjunctions) are trivially satisfiable. For example, \(\mathrm {A}\sqsubseteq \lnot \mathrm {A}\in \mathcal {T}\) forces \(\mathrm {A}\) to be empty in every model of \(\mathcal {T}\).

Concept Satisfiability. The problem of concept satisfiability asks, for a given concept C and TBox \(\mathcal {T}\), if there exists an interpretation \(\mathcal I\) for \(\mathcal {T}\) in which \(C^{\mathcal{I}}\) is non-empty. Such problems can be reduced to the case where C is a primitive concept \(\mathrm {A}\) by simply augmenting \(\mathcal {T}\) with \(\{\mathrm {A}\sqsubseteq C \}\), where \(\mathrm {A}\) is a fresh primitive concept. Note that concept C can be a conjunction of other concepts since it only appears on the right-hand side of an inclusion dependency. We proceed as follows:

Definition 9

(Transition Relation for \(\mathcal {T}\) ). Let \(\mathcal {T}\) be a \({partial\!-\!}{{\mathcal {CFD}}}^{\forall }_{nc}\) TBox in normal form. We define a transition relation \(\delta (\mathcal {T})\) over the set of states

$$\begin{array}{r@{}l} S= \mathsf {PC}\cup \{\lnot \mathrm {A}\mid \mathrm {A}\in \mathsf {PC}\} \cup \{\forall f.\mathrm {A}\mid \mathrm {A}\in \mathsf {PC}, f\in \mathsf {F}\} \cup \{\exists {f}\mid f\in \mathsf {F}\} \end{array}$$

and the alphabet \(\mathsf {F}\) as follows:

$$\begin{array}{l} C_1\mathop {\longrightarrow }\limits ^{\mathop { id }\nolimits }C_2\in \delta (\mathcal {T}) {\mathrm {,~if~}} C_1\sqsubseteq C_2 \in \mathcal {T}\text{, } \text{ and }\\ \forall f.\mathrm {A}\mathop {\longrightarrow }\limits ^{f}{\mathrm {A}}\in \delta (\mathcal {T}) {\mathrm {,~if~}} \forall f.\mathrm {A}\mathop {\longrightarrow }\limits ^{\mathop { id }\nolimits *}\exists {f}\in \delta (\mathcal {T}),\\ \end{array}$$

where \(\mathop { id }\nolimits \) is the empty letter transition, \(\mathop { id }\nolimits *\) is a sequence of \(\mathop { id }\nolimits \) edges, \(f\in \mathsf {F}\), \(A\in \mathsf {PC}\), and \(C_1, C_2 \in S\). \(\Box \)

The transition relation allows us to construct non-deterministic finite automata (NFA) that can be used for various reasoning problems on a \({partial\!-\!}{{\mathcal {CFD}}}^{\forall }_{nc}\) TBox \(\mathcal {T}\). Note that, unlike common practice in automata theory, we use \(\mathop { id }\nolimits \) for the empty letter in transition relations. Given a primitive concept \(\mathrm {A}\) and TBox \(\mathcal {T}\), one can test for primitive concept satisfiability by using the following NFA, denoted \(\mathsf {nfa}^{\mathrm {A}}_{\mathrm {B}}(\mathcal {T})\):

$$(S,\{\mathrm {A}\},\{\mathrm {B}\},\delta (\mathcal {T})),$$

with states induced by primitive concepts, their negations, and value restrictions, with start state \(\mathrm {A}\), with the set of final states \(\{\mathrm {B}\}\subseteq S\), and with transition relation \(\delta (\mathcal {T})\). Intuitively, if \(\mathop {\mathsf {Pf}}\nolimits \in \mathsf {nfa}^{\mathrm {A}}_{\mathrm {B}}(\mathcal {T})\) and \(o\in \mathrm {A}^{\mathcal{I}}\) then \(\mathop {\mathsf {Pf}}\nolimits ^{\mathcal{I}}(o)\) is defined and \(\mathop {\mathsf {Pf}}\nolimits ^{\mathcal{I}}(o)\in \mathrm {B}^{\mathcal{I}}\) in every model \(\mathcal{I}\) of \(\mathcal {T}\).

Theorem 10

(Concept Satisfiability). \(\mathrm {A}\) is satisfiable with respect to the TBox \(\mathcal {T}\) if and only if

$$\mathcal{L}(\mathsf {nfa}^{\mathrm {A}}_{\mathrm {B}}(\mathcal {T})\cap \mathcal{L}(\mathsf {nfa}^{\mathrm {A}}_{\lnot \mathrm {B}}(\mathcal {T}) = \emptyset $$

for every \(\mathrm {B}\in \mathsf {PC}\).

Proof (sketch): Assume \(\mathrm {A}\) is non-empty and hence there is \(a\in \mathrm {A}^{\mathcal{I}}\). For a primitive concept \(\mathrm {B}\in \mathsf {PC}\), a word \(\mathop {\mathsf {Pf}}\nolimits \) in the intersection language of the two automata above is a witness of the fact that \(\mathop {\mathsf {Pf}}\nolimits ^{\mathcal{I}}(a^{\mathcal{I}})\in \mathrm {B}^{\mathcal{I}}\) and \(\mathop {\mathsf {Pf}}\nolimits ^{\mathcal{I}}(a^{\mathcal{I}})\in \lnot \mathrm {B}^{\mathcal{I}}\) must hold in every model of \(\mathcal {T}\).

Conversely, if no such word exists, then one can construct a deterministic finite automaton from \(\mathsf {nfa}^{\mathrm {A}}_{\mathrm {B}}(\mathcal {T})\), using the standard subset construction, in which there is not a state containing both \(\mathrm {B}\) and \(\lnot \mathrm {B}\) reachable from the start state \(\mathrm {A}\). Unfolding the transition relation of this automaton, starting from the state \(\mathrm {A}\) and labelling nodes by the concepts associated with the automaton’s states, yields a tree interpretation that satisfies \(\mathcal {T}\) (in particular in which all PFD constraints are satisfied vacuously) and whose root provides a witness for satisfiability of \(\mathrm {A}\). \(\Box \)

To test for emptiness of \(\mathsf {nfa}^{\mathrm {A}}_{\mathrm {B}}(\mathcal {T})\), we use a graph connectivity algorithm that non-deterministically searches for a \((\mathrm {A},\mathrm {A})-(\mathrm {B},\lnot \mathrm {B})\) path in the (virtual) poly-sized product automaton [2]; the following result is then immediate.

Corollary 11

Concept satisfiability with respect to \({partial\!-\!}{{\mathcal {CFD}}}^{\forall }_{nc}\) TBoxes is complete for NLOGSPACE. \(\Box \)

Note that, as we remarked above, this procedure can be used to test for satisfiability of conjunctions of concepts in \({{\mathcal {CFD}}^{\forall }_{nc}}\) as follows:

Lemma 12

\(\mathrm {A}_1\sqcap \ldots \sqcap \mathrm {A}_k\) is consistent in \(\mathcal {T}\) if and only if \(\mathrm {A}\) is satisfiable in \(\mathcal {T}\cup \{\mathrm {A}\sqsubseteq \mathrm {A}_i\mid 1\le i\le k\}\). \(\Box \)

It is, however, impossible to precompute all such inconsistent concepts since this would require consideration of all possible types over \(\mathsf {PC}\) (or finite subsets of primitive concepts), a process essentially equivalent to constructing an equivalent deterministic automaton which can require exponential time [2].

Logical Implication. Logical implication for \({partial\!-\!}{{\mathcal {CFD}}}^{\forall }_{nc}\) TBoxes \(\mathcal {T}\) and posed questions \(\mathcal{Q}\) can now be solved similarly to the \(\mathcal{CFD}\) case. The main difference lies in detecting inconsistencies caused by object membership in conjunctions of (primitive) concepts that are necessarily empty in models of \(\mathcal {T}\). This observation yields the following extension to \(\mathsf {R}(\mathcal {T})\):

If \(\mathsf {D}(\mathop {\mathsf {Pf}}\nolimits )\) and \(\mathsf {C}_{\mathrm {A}_1}(\mathop {\mathsf {Pf}}\nolimits ),\ldots ,\mathsf {C}_{\mathrm {A}_k}(\mathop {\mathsf {Pf}}\nolimits )\) are in \(\mathsf {R}(T)\) for some \(\mathop {\mathsf {Pf}}\nolimits \in \mathsf {PF}(\mathcal{T, Q})\) and \(\mathrm {A}_1\sqcap \ldots \sqcap \mathrm {A}_k\) is not consistent in \(\mathcal {T}\) then add \(\mathsf {C}_{\bot }(\mathop {\mathsf {Pf}}\nolimits )\) to \(\mathsf {R}(\mathcal {T})\).

Note that \({partial\!-\!}{{\mathcal {CFD}}}^{\forall }_{nc}\) can be extended to allow \(\lnot \exists {f}\) on the right hand sides of TBox dependencies (which would be handled analogously to negated primitive concepts by the NFA in Theorem 10). Alltogether, we obtain following results analogous to those in Sect. 4.1:

Theorem 13

Let \(\mathcal{T}\) be a \({partial\!-\!}{{\mathcal {CFD}}}^{\forall }_{nc}\) terminology and \(\mathcal{Q}\) a posed question of the form \(E_1\sqsubseteq E_2\). Then

$$\mathcal{T}\,\models \,\mathcal{Q} \mathrm {~iff~} \begin{array}[t]{@{}l} \mathsf {R}(\mathcal{T})\cup \{\mathsf {C}_{E_1}(\mathop { id }\nolimits ),\mathsf {D}(\mathop { id }\nolimits )\} \,\models \,\mathsf {G}_{E_1} \mathrm {~or~}\\ \mathsf {R}(\mathcal{T})\cup \{\mathsf {C}_{E_1}(\mathop { id }\nolimits ),\mathsf {D}(\mathop { id }\nolimits )\}\,\models \, \mathsf {G}_{\forall \mathop {\mathsf {Pf}}\nolimits .\bot }(\mathop { id }\nolimits )\mathrm {~for~some~} \mathop {\mathsf {Pf}}\nolimits \in \mathsf {PF}(\mathcal{T, Q}). \end{array}$$

Proof (sketch): The proof is similar to the proof of Theorem 7: the necessary \(F^*\) trees are generated by unfolding \(\delta (\mathcal {T})\) as in the Proof of Theorem 10. \(\Box \)

Corollary 14

Let \(\mathcal{T}\) be a terminology and \(\mathcal{Q}\) a posed question in \({partial\!-\!}{{\mathcal {CFD}}}^{\forall }_{nc}\). Then the implication problem \(\mathcal{T}\,\models \,\mathcal{Q}\) is complete for PTIME. \(\Box \)

5 Summary and Future Work

In summary, we have shown how partial features coupled with a strict interpretation of undefined values can be incorporated in the feature-based DLs \(\mathcal{DLFD}\), \(\mathcal{CFD}\) and \({{\mathcal {CFD}}^{\forall }_{nc}}\), thus obtaining \({ partial\!-\! \mathcal{{DLFD}}}\), \({ partial\!-\! \mathcal{{CFD}}}\) and \({partial\!-\!}{{\mathcal {CFD}}}^{\forall }_{nc}\), respectively. Our primary contributions have been to also show that this can be done without impact on the complexity of their associated logical inference problems. Indeed, with \(\mathcal{DLFD}\), this was achieved by showing how \({ partial\!-\! \mathcal{{DLFD}}}\) can be fully simulated in \(\mathcal{DLFD}\) in an entirely transparent fashion.

One avenue for future work would be to consider the impact on logical inference of alternative semantics for the interpretation of undefined values, in particular, on choosing the so-called Kleene semantics for equality. In this case, “\(e_1 = e_2\)” is also true when both \(e_1\) and \(e_2\) have undefined values. Note that doing so with \(\mathcal{DLFD}\) would in fact necessitate changes to the semantics of the PFD concept constructor to avoid undecidability of logical inference.Footnote 4

In our introductory comments, we also hinted at possible directions for future work relating to \({{\mathcal {CFD}}^{\forall }_{nc}}\). The first concerns recent work that begins to explore how inverse features can be added to feature-based DLs, in particular, on adding the \(\exists {f}^{-1}\) concept constructor. For example, logical inference has been shown to be decidable in PTIME for \({\mathcal {CFDI}}^{\forall -}_{nc}\) [11], a dialect obtained by adding this constructor to \({{\mathcal {CFD}}^{\forall }_{nc}}\) and by imposing additional syntactic restrictions, e.g., on the syntax of PFDs, to avoid intractability for this problem. We conjecture that our results for \({partial\!-\!}{{\mathcal {CFD}}}^{\forall }_{nc}\) can be extended to \({partial\!-\!}{{\mathcal {CFDI}}}^{\forall -}_{nc}\), although the development would be much less straightforward, and that the same applies to the other two dialects that we have considered: that logical consequence for \(partial\!-\!\mathcal{{CFDI}}\) and \(partial\!-\!\mathcal{{DLFDI}}\) is decidable in PTIME and EXPTIME, respectively.

The second possible direction for future work relating to \({{\mathcal {CFD}}^{\forall }_{nc}}\) is an indirect consequence of the ability to easily simulate \({ partial\!-\! \mathcal{{DLFD}}}\) in \(\mathcal{DLFD}\). In particular, for this case, there remains little incentive to adopt \({ partial\!-\! \mathcal{{DLFD}}}\): to non-trivially complicate the semantics of feature-based DLs, to add the existential restriction concept constructor, and so on. We conjecture that it is possible to extend the syntax of \({{\mathcal {CFD}}^{\forall }_{nc}}\) to allow limited use of conjunction on left-hand-sides of inclusion dependencies to enable simulating \({partial\!-\!}{{\mathcal {CFD}}}^{\forall }_{nc}\) in \({{\mathcal {CFD}}^{\forall }_{nc}}\) in an analogous fashion, while preserving PTIME decidability for logical consequence.