1 Introduction

The ever increasing reliance of AI technologies on data acquisition, managements, and processing is having a profound impact on the nature and mission of artificial intelligence itself [28]. In recent years the area of knowledge representation and reasoning (KR&R) has witnessed a growing interest in the modelling and analysis of data-driven/data-centric/data-intensive systems [3, 15, 16]. This paradigm shift towards data-aware systems (DaS) has initiated in the area of business process modelling (BPM), in response to traditional approaches to service-oriented computing that typically abstract the data content away to reduce the complexity of the system description [30]. However, this data content is often essential to drive a business process. Hence, according to the data-aware perspective on BPM, the data content and the processes operating on it are seen as two equally relevant tenets in modelling systems [11, 20]. This data-aware approach has proved fruitful also in applications to areas in KR&R, including commitments in negotiation [27], planning [9], and service-oriented computing [14], where processes are often thought of as agents, endowed with their own goals, plans to achieve them, as well as information about the external environment [29].

Yet, if agent-based DaS are to be deployed in concrete KR&R scenarios, it is key to develop verification techniques, suitable to account for the two tenets of data and processes. Then, a critical issue in tackling this task lies in the infinite state space generated by the possibly infinite data content of DaS. Recently, several contributions have addressed this problem [3, 7, 10, 26], also leading to the development of open-source toolkits for DaS verification [19, 25]. Nonetheless, we identify a conceptual difficulty with most of the current approaches in the literature: data-aware systems are typically assumed to contain an actual infinity of data and to be able to reason about such an actual infinity. For instance, in [7, 10] an infinite quantification domain is part of the system’s description. But real-life scenarios actually deal only with a finite, possibly unbounded, quantity of data. Hence, the soundness and applicability of those theoretical results to concrete DaS scenarios cannot be taken for granted.

To provide an answer to the difficulties pertaining to reasoning about an actual infinite data domain, in Sect. 2 we introduce parameterised agent-based DaS (or P-AbDaS) as abstract systems, which are to be coupled with a (finite) data domain, in order to generate a concrete agent-based DaS (or C-AbDaS). Hence, differently from [7, 10], the same P-AbDaS can be instantiated in possibly infinitely-many C-AbDaS, but all of them are finite. Further, to specify the behaviour of P-AbDaS we need both temporal operators to describe the system’s evolution, and first-order features, including quantifiers and relation symbols, to account for data. Hence, in Sect. 3 we consider a first-order extension of the computation-tree logic CTL as the specification language for P-AbDaS, and then define the parameterised model checking problem for this setting, which we show to be undecidable in general. Then, in Sect. 4 we introduce techniques based on isomorphisms and finite interpretation that allow – under specific assumptions – for the existence of a cut-off, that is, a bound on the size of the quantification domain above which the truth value of formulas in first-order CTL does not change. The existence and value of the cut-off allow for a complete model checking procedure that checks the specification on increasingly larger domains, up to the cut-off value. We illustrate the formal machinery with a procurement scenario from the literature on BPM [21]. Finally, we conclude in Sect. 5 by discussing related work and pointing to future directions of research.

2 Agent-Based Data-Aware Systems

In this section we introduce parameterised agent-based data-aware systems (P-AbDaS) and define the corresponding model checking problem w.r.t. a first-order version of the temporal logic CTL. We first present the basic terminology on databases that is used throughout the paper [1].

Definition 1 (Database schema and instance)

A database schema is a finite set \(\mathcal {D}=\{P_1/q_1, \ldots ,P_n/q_n\}\) of relation symbols P with arity \(q \in \mathbb {N}\).

Given a countable interpretation domain \(Y\), a \(\mathcal {D}\)-instance over \(Y\) is a mapping D associating each relation symbol P to a finite q-ary relation on \(Y\), i.e., \(D(P) \underset{fin}{\subset } Y^{q}\).

By Definition 1 a database instance can be thought of as a finite relational structure, in line with relational models of databases [1]. We denote the set of all \(\mathcal {D}\)-instances on domain \(Y\) as \(\mathcal {D}(Y)\). The active domain \(\textit{adom}(D)\) of a \(\mathcal {D}\)-instance D is the finite set of all elements \(u \in Y\) occurring in some predicate interpretation D(P), that is, \(\textit{adom}(D) = \bigcup _{P \in \mathcal {D}} \{ u \in Y \mid \langle u_1, \ldots , u, \ldots , u_q \rangle \in D(P) \}\). Hereafter, we assume w.l.o.g. that the active domain also includes a finite set \( C \subseteq Y\) of constants, i.e., \( C \subseteq \textit{adom}(D)\). To describe the temporal evolution of agent-based data-aware systems, we introduce the primed version of a database schema \(\mathcal {D}\) as the schema \(\mathcal {D}'=\{P'_1/q_1, \ldots ,P'_n/q_n\}\). Then, the disjoint union \(D \oplus D'\) of \(\mathcal {D}\)-instances D and \(D'\) is the \((\mathcal {D}\cup \mathcal {D}')\)-instance such that (i) \((D \oplus D')(P_i) = D(P_i)\), and (ii) \((D \oplus D')(P'_i) = D'(P_i)\), where \(\mathcal {D}'\) is the primed version of \(\mathcal {D}\). Intuitively, D and \(D'\) represent the current and next state of the system respectively, represented as database instances.

To specify properties of databases, we now recall the syntax of first-order logic with equality and no function symbols. Let V be a countable set of individual variables and let a term be any element \(t \in T = V \cup C \).

Definition 2 (FO-formulas)

Given a database schema \(\mathcal {D}\), the formulas \(\varphi \) of the first-order language \(\mathcal {L_D}\) are defined by the following BNF:

$$\begin{aligned} \varphi \,{::=}\, P(t_1, \ldots , t_{q}) \mid t = t' \mid \lnot \varphi \mid \varphi \rightarrow \varphi \mid \forall x \varphi \end{aligned}$$

where \(P \in \mathcal {D}\), \(t_1,\ldots ,t_{q}\) is a q-tuple of terms, and \(t, t'\) are terms.

We define the free and bound variables in a formula \(\varphi \) as standard, and write \(\varphi (\varvec{x})\) to denote that the free variables of \(\varphi \) are among \(x_1, \ldots , x_n\).

To interpret first-order formulas on database instances, we introduce assignments as functions \(\sigma : T \rightarrow Y\) from terms to elements in Y. We denote by \(\sigma ^x_u\) the assignment such that (i) \(\sigma ^x_u(x) = u\); and (ii) \(\sigma ^x_u(x') = \sigma (x')\) for every \(x' \ne x\). Also, we assume a Herbrandian interpretation of constants, that is, \(\sigma (c) = c\) for all \(c \in C \).

Definition 3 (Satisfaction of FO-formulas)

Given a \(\mathcal {D}\)-instance D, an assignment \(\sigma \), and an FO-formula \(\varphi \in \mathcal {L_D}\), we inductively define whether D satisfies \(\varphi \) under \(\sigma \), or \((D, \sigma ) \models \varphi \), as follows:

figure a

A formula \(\varphi \) is true in D, or \(D \models \varphi \), iff \((D, \sigma ) \models \varphi \) for all assignments \(\sigma \).

Notice that we adopt an active domain semantics, where quantifiers range over the active domain \(\textit{adom}(D)\) of D. This is a standard assumption in database theory [1]. Hereafter, we often write \((D, \varvec{u}) \models \varphi \) whenever \(\varvec{x}\) are all the free variables in \(\varphi \) and \(\sigma (\varvec{x}) = \varvec{u}\). In particular, the satisfaction of a formula only depends on its free variables.

We now introduce a notion of agent whose local information state is represented as a relational database. In particular, inspired by the literature in KR&R and BPM on the specification of agent actions in terms of pre- and post-conditions [2, 3, 21], we introduce the notion of action type.

Definition 4 (Action Type)

An action type is an expression \(\alpha (\varvec{x}) \,{:: =} g(\varvec{x}) \rightsquigarrow ef (\varvec{x})\), where:

  • guard g is an FO-formula with free variables \(\varvec{x}\);

  • effect \( ef \) is an expression built according to the BNF:

    $$\begin{aligned} ef \, {::=}\, add(P,\varvec{x}) \mid del(P,\varvec{x}) \mid ef ; ef \mid ef \cup ef \end{aligned}$$

    where, intuitively, \(add(P,\varvec{x})\) is the insertion of tuple \(\varvec{x}\) in relation P, \(del(P,\varvec{x})\) is the deletion of \(\varvec{x}\) from P, \( ef ; ef \) is the sequential composition, and \( ef \cup ef\) is the non-deterministic choice.

We now introduce a set Ag of agents, operating on databases, each of them defined as follows:

Definition 5 (Agent)

An agent is a tuple \(i = \langle \mathcal {D}_i, Act_i\rangle \), where

  • \(\mathcal {D}_i\) is the local database schema;

  • \(Act_i\) is the finite set of action types \(\alpha (\varvec{x})\), whose guards and effects are built over \(\mathcal {D}_i\).

Intuitively, by Definition 5 we assume that at each moment agent i is in some local state \(D \in \mathcal {D}_i(Y)\) that represents all the information she has about the global state of the system. In this respect we follow the typical approach to agent-based systems [17, 31], but here we require that this information is structured as a database. Further, each agent has her own database schema \(\mathcal {D}_i\), but the same relation symbol might appear in several schemas.

As we are interested in the interactions of agents among themselves and with the external environment, we introduce their synchronous composition.

Definition 6 (Parameterised AbDaS)

A parameterised agent-based data-aware system (or P-AbDaS) is a finite set Ag of agents defined as in Definition 5.

To endow a P-AbDaS with a data content, thus obtaining a concrete Ab-DaS, we consider an infinite, countable interpretation domain \(\mathcal {Y}\), which intuitively represents these data.

Definition 7 (Concrete AbDaS)

A concrete agent-based data-aware system (or C-AbDaS) is a tuple \(\mathcal {P} = (Ag, Y)\), where (i) Ag is a P-AbDaS; and (ii) \(Y\supseteq C \) is a finite subset of \(\mathcal {Y}\).

Notice that, differently from [7, 10], we do not assume an actual infinity of elements in our models: each C-AbDaS only contains a finite set Y of elements. However, in general we can obtain infinitely many C-AbDaS based on the same P-AbDaS, build on different domains \(Y\underset{fin}{\subset } \mathcal {Y}\).

We now introduce some technical notions that will be used in the rest of the paper. Given a C-AbDaS \(\mathcal {P}= (Ag, Y)\), the (global) states of \(\mathcal {P}\) are tuples \(s \in S = \prod _{i \in Ag} \mathcal {D}_i(Y)\), whereas joint actions \(\alpha (\varvec{u}) \in ACT(Y) = \prod _{i \in Ag} Act_i(Y)\) take values \(\varvec{u}\) from domain Y. Observe that every global state \(s=\langle D_0,\ldots , D_n\rangle \in S\) can be thought of as a database instance on the global database schema \(\mathcal {D}= \bigcup _{i \in Ag} \mathcal {D}_i\) such that \(s(P) = \bigcup _{i \in Ag} \mathcal {D}_i(P)\), for every \(P \in \mathcal {D}\). Then, we set \(s_i\) as the restriction of s to the relation symbols in \(\mathcal {D}_i\). That is, we assume that each agent has a truthful, yet partial, view of the global database \(\mathcal {D}\), since in general \(\mathcal {D}_i\) is a subset of \(\mathcal {D}\).

Further, the transition relation \(\tau : S \times ACT(Y) \mapsto 2^S\) is defined such that \(t =\langle D'_0,\ldots , D'_n\rangle \in \tau (s, \alpha (\varvec{u}))\) iff for every \(i \in Ag\), \((s_i, \varvec{u}) \models g_{i}\), i.e., all guards are satisfied and the corresponding joint action is enabled, and applying the effects \( ef _{i}(\varvec{u})\). Specifically, if \( ef _i = add(P, \varvec{x})\) (resp. \(del(P, \varvec{x})\)), then \(D'_i\) is obtained from \(D_i\) by performing the corresponding insertion (resp. deletion) in P with values \(\varvec{u}\). If \( ef _i = ef '_i ; ef ''_i\), then \(t_i\) is obtained from \(s_i\) by applying first the effects in \( ef '_i\), and then \( ef ''_i\). Similarly for \( ef _i = ef '_i \cup ef ''_i\).

Finally, we introduce the successor relation \(\rightarrow \) on global states such that \(s \rightarrow t\) if there exists \(\alpha (\varvec{u}) \in ACT(Y)\) such that \(s \xrightarrow {\alpha (\varvec{u})} t\), i.e., \(t \in \tau (s, \alpha (\varvec{u}))\). A run r from state s is an infinite sequence \(s^0 \rightarrow s^1\rightarrow \ldots \), with \(s^0=s\). For \(n\in \mathbb {N}\), we define \(r(n) = s^n\). Hereafter we assume that the relation \(\rightarrow \) is serial. This can be ensured by using skip actions. Notice that, in what follows we restrict the set of global states as the set of reachable states only. The disjoint union \(\oplus \) is extended to global states in a pointwise manner: for \(s = \langle D_0, \ldots , D_n\rangle \) and \(s' = \langle D'_0, \ldots , D'_n \rangle \), we define \(s \oplus s'\) as \(\langle D_0 \oplus D'_0 , \ldots , D_n \oplus D'_n \rangle \).

Example 1

To illustrate the formal machinery introduced thus far, we present a business process inspired by a concrete IBM customer use case [21]. The order-to-cash business process specifies the interactions of three agents in an e-commerce situation relating to the purchase and delivery of a product: a manufacturer m, a customer c, and a supplier s. The process begins when c prepares and submits to m a purchase order (PO), i.e., a list of products c requires (action createPO()). Upon receiving a PO, m prepares a material order (MO), i.e., a list of components needed to assemble the requested products (action createMO()). Then, m forwards to s the relevant material order. Upon receiving an MO, s can either accept or reject it (actions acceptMO() and rejectMO()). In the former case she proceeds to deliver the requested components to m (action shipMO()). In the latter, she notifies m of her rejection. If an MO is rejected, m deletes it and then prepares and submits a new MO (action deleteMO()). Upon delivery of the components (action receiveMO()), m assembles the product and, provided the order has been paid for (action payPO()), delivers it to c (action shipPO()).

We can encode the order-to-cash business process as a P-AbDaS, where the data model is represented by means of database schemas, whose evolution is determined by an appropriate set of actions types. Formally the three agents can be defined as follows:

  • \(A_c = \langle \mathcal {D}_c, Act_c\rangle \), where

    • \(\mathcal {D}_c \{ Products( prod\_code, budget), PO(id, prod\_code, of\!fer, status)\};\)

    • \(Act_c = \{ createPO(id,code), payPO(id), deletePO(id)\}\);

  • \(A_m = \langle \mathcal {D}_m, Act_m\rangle \), where

    • \(\mathcal {D}_m \!\!=\!\!\{ PO(id, prod\_code, o\!f\!f\!er, status)\!,\! MO(id, prod\_code, price, status)\};\)

    • \(Act_m \!\!=\!\! \{ createM\!O(\!id,price\!),\! receiveMO(\!id),\! deleteMO(id),\! shipPO(id) \};\)

  • \(A_s = \langle \mathcal {D}_s, Act_s\rangle \), where

    • \(\mathcal {D}_s = \{ Materials( mat\_code, cost), MO(id, prod\_code, price, status)\};\)

    • \(Act_s = \{ acceptMO(id), rejectMO(id), shipMO(id)\}\).

In Table 1 we provide the detailed action types for all agents in the use case. As an example, according to action type createPO() (item (1.a)), the customer can create a purchase order with a designed id only if there exists a product with the same id. Further, by using createMO() the manufacturer can create a material order with a designed id if MO does not contain a tuple with same id in preparation status (item (2.a)).

Table 1. The list of actions in the order-to-cash scenario.

3 The Verification of AbDaS

In this section we introduce the specification language for AbDaS and the corresponding model checking problem. We recall that we consider a set V of individual variables and a set \( C \) of individual constants. The terms \(t_1, t_2, \ldots \) in T are either variables in V or constants in \( C \).

Definition 8 (FO-CTL)

The FO-CTL formulas \(\varphi \) over a database schema \(\mathcal {D}\) are defined as follows, where \(P \in \mathcal {D}\):

$$\begin{aligned} \varphi \,{::=}\, P(t_1, \ldots , t_q) \mid \lnot \varphi \mid \varphi \rightarrow \varphi \mid \forall x \varphi \mid AX \varphi \mid A \varphi U \varphi \mid E \varphi U \varphi \end{aligned}$$

The language FO-CTL is a first-order extension of the propositional temporal logic CTL. The temporal formulas \(AX \varphi \) and \(A \varphi U \varphi '\) (resp. \(E \varphi U \varphi '\)) are read as “for all runs, at the next step \(\varphi \)” and “for all runs (resp. some run), \(\varphi \) until \(\varphi '\)”. Given a formula \(\varphi \), we denote the set of free and all variables as \(fr(\varphi )\) and \(var(\varphi )\) respectively, and introduce formulas \(EX\varphi \), \(AF\varphi \), \(AG \varphi \), \(EF\varphi \), and \(EG\varphi \) as standard.

We now interpreted FO-CTL on concrete agent-based data-aware systems.

Definition 9 (Semantics of FO-CTL)

We define whether a C-AbDaS \(\mathcal {P}\) satisfies a formula \(\varphi \) in a state s according to assignment \(\sigma \), or \((\mathcal {P}, s,\sigma )\models \varphi \), as follows:

figure b

A formula \(\varphi \) is true at state s, or \((\mathcal {P}, s) \models \varphi \), if \((\mathcal {P}, s, \sigma ) \models \varphi \) for all assignments \(\sigma \); \(\varphi \) is true in C-AbDaS \(\mathcal {P}\), or \(\mathcal {P}\models \varphi \), if \((\mathcal {P}, s) \models \varphi \) for every \(s \in S\). Finally, \(Ag \models \varphi \) iff \((Ag, Y) \models \varphi \) for all \(Y\underset{fin}{\subset } \mathcal {Y}\).

Again, in Definition 9 we adopt an active domain semantics, whereby quantifiers range over the active domain \(\textit{adom}(s)\) of s.

Finally, we present the model checking problem for P-AbDaS with respect to the specification language FO-CTL.

Definition 10 (Parameterised Model Checking)

Given a P-AbDaS Ag, an infinite domain \(\mathcal {Y}\), and an FO-CTL formula \(\varphi \), determine whether \(Ag \models \varphi \).

Notice that the parameterised model checking problem requires in principle to check an infinite number of C-AbDaS built on the same P-AbDaS. Indeed, model checking P-AbDaS is undecidable in general: we remark without proof that P-AbDaS are expressive enough to encode Turing machines, and reachability of a halting state can then be expressed in FO-CTL similarly to [7, 15]. Hence, it is of interest to investigate semantic restrictions on P-AbDaS that allow for a decidable model checking problem.

To this end, a key notion to decide parameterised model checking in general is the cut-off:

Definition 11 (Cut-off)

A natural number \(n \!\in \! \mathbb {N}\) is a cut-off for P-AbDaS Ag and formula \(\phi \) iff for all finite subsets \(Y\supseteq C , Y'\supseteq C \) of \(\mathcal {Y}\), if \(|Y| = n\) and \(|Y'| \ge |Y|\), then \((Ag, Y) \models \phi \) iff \((Ag, Y') \models \phi \).

Note that, in Definition 11 we suppose \(|Y'| \ge |Y|\) without considering that \(|Y|\) is a subset of \(|Y'|\). This is because we define the set of constants \( C \) to be in both \(|Y|\) and \(|Y'|\), and for this reason the intersection between \(|Y'|\) and \(|Y|\) cannot be empty.

The existence of the cut-off allows us to decide verification by checking all C-AbDaS up to size |n|, of which there exist finitely many instances. We devote the rest of the paper to finding sufficient condition for the existence of cut-offs.

We conclude this section by elaborating on Example 1.

Example 2

We can investigate properties of the order-to-cash business process by using specifications in FO-CTL. For instance, the following formula intuitively specifies that each material order MO has to match a corresponding purchase order PO:

$$\begin{aligned} AG~\forall id, pc~(\exists pr, s~ MO(id, pc, pr, s) \rightarrow \exists o,s' PO(id, pc, o,s')) \end{aligned}$$

The next specification states that given a material order MO, it can be the case that eventually the corresponding PO will be shipped.

$$\begin{aligned} AG~\forall id, pc~(\exists pr, s~MO(id, pc, pr, s) \rightarrow EF \, \exists o~PO(id, pc, o, \mathsf{shipped})) \end{aligned}$$

Hereafter we develop techniques to model check specifications in FO-CTL like the ones above.

4 Finding Cut-Offs

In this section we introduce model-theoretic notions that will be used to tackle the parameterised model checking problem for P-AbDaS. In particular, we recall some notions in [7].

Definition 12 (Isomorphism)

Two database instances \(D \in \mathcal {D}(Y')\), \(D'\in \mathcal {D}(Y)\) are isomorphic, or \(D \simeq D'\), iff there exists a bijection \(\iota : \textit{adom}(D) \mapsto \textit{adom}(D')\) s.t.:

  1. (i)

    \(\iota \) is the identity on the constants in \( C \);

  2. (ii)

    for all \(P \in \mathcal {D}\), \(\varvec{u} \in Y^{q}\), \(\varvec{u} \in D(P)\) iff \(\iota (\varvec{u}) \in D'(P)\).

When the above is the case, we say that \(\iota \) is a witness for \(D \simeq D'\). Moreover, two global states \(s = \langle D_0 , \ldots , D_n \rangle \in S\) and \(s' = \langle D'_0 , \ldots , D'_n \rangle \in S'\) are isomorphic, or \(s \simeq s'\), iff there exists a bijection \(\iota :\textit{adom}(s) \mapsto \textit{adom}(s')\) such that for every \(j \in Ag\), \(\iota \) is a witness for \(D_j \simeq D'_j\).

By Definition 12 isomorphisms preserve the interpretation of constants as well as of predicates up to renaming of terms. Obviously, isomorphisms are equivalence relations. Given a function \(f: Y\mapsto Y'\) defined on \(\textit{adom}(s)\), f(s) denotes the instance in \(\mathcal {D}(Y')\) obtained from s by renaming each \(u \in \textit{adom}(s)\) as f(u). If f is also injective (thus invertible) and the identity on \( C \), then \(f(s) \simeq s\).

While isomorphic states share the same relational structure, two isomorphic states do not necessarily satisfy the same FO-formulas as satisfaction depends also on the values assigned to free variables. To account for this, we introduce the following notion.

Definition 13 (Equivalent assignments)

Given states \(s\in S\) and \(s'\in S'\), and a set \(V' \subseteq V\) of variables, assignments \(\sigma : T \mapsto Y\) and \(\sigma ' : T \mapsto Y'\) are equivalent for \(V'\) w.r.t. s and \(s'\) iff there exists a bijection \(\gamma :\textit{adom}(s)\cup \sigma (V') \mapsto \textit{adom}(s')\cup \sigma '(V')\) such that:

  1. (i)

    \(\gamma |_{\textit{adom}(s)}\) is a witness for \(s \simeq s'\);

  2. (ii)

    \(\sigma '|_{V'}= \gamma ;\sigma |_{V'}\), where \(;\) is function composition.

By Definition 13 equivalent assignments preserve both the (in)equalities of the terms in \(s, s'\) up to renaming. Clearly, the existence of equivalent assignments implies that \(s,s'\) are isomorphic. We say that two assignments are equivalent for an FO-CTL formula \(\varphi \), omitting states s and \(s'\) when clear from the context, if these are equivalent for the free variables \(fr(\varphi )\) in \(\varphi \).

We now state the following standard result in first-order (non-modal) logic, i.e., isomorphic states satisfy exactly the same FO-formulas, when interpreted with equivalent assignments [1].

Proposition 1

Given isomorphic states \(s\in S\) and \(s'\in S'\), an FO-formula \(\varphi \), and assignments \(\sigma \) and \(\sigma '\) equivalent for \(\varphi \), we have that

$$\begin{aligned} (s, \sigma ) \models \varphi&\text {iff}&(s', \sigma ') \models \varphi \end{aligned}$$

An immediate consequence of Proposition 1 is that isomorphic states cannot be distinguished by FO-sentences. In the rest of the section we show how isomorphisms can actually be used to prove the preservation of the whole FO-CTL. Notice that this is in marked contrast with similar results in the literature [3, 7], which need to assume some notion of (bi)simulation on the underlying transitions systems. Nothing similar is required here, we show that isomorphisms suffice. More specifically, in [7] the requirement of uniformity was put forward as a sufficient condition for bisimilar systems to satisfy the same formulas in FO-CTL. We now show that C-AbDaS satisfy uniformity unrestrictedly.

Lemma 1 (Uniformity)

All C-AbDaS \(\mathcal {P}\), \(\mathcal {P}'\) are uniform, that is, for every \(s,t \in S\), \(s' \in S'\), \(t' \in \mathcal {D}(Y)\), if \(t \in \tau (s,\alpha (\varvec{u}))\) and \(s \oplus t \simeq s' \oplus t'\) for some witness \(\iota \), then for every constant-preserving bijection \(\iota '\) that extends \(\iota \) to \(\varvec{u}\), we have that \(t' \in \tau (s',\alpha (\iota '(\varvec{u})))\).

Proof

For illustration, we consider the case in which there is only one agent, i.e., \(\alpha (\varvec{u}) = g(\varvec{u}) \rightsquigarrow ef(\varvec{u})\). First of all, notice that if \(s \oplus t \simeq s' \oplus t'\) then for every bijection \(\iota '\) extending \(\iota \) to \(\varvec{u}\), we have that \((s, \varvec{u}) \models g(\varvec{x})\) iff \((s, \iota '(\varvec{u})) \models g(\varvec{x})\) by Proposition 1. Hence, action \(\alpha (\varvec{u})\) is enabled in s iff \(\alpha (\iota '(\varvec{u}))\) is enabled in \(s'\).

Now we prove by induction on the structure of \(ef(\varvec{u})\) that \(t'\) can be obtained by applying effects \(ef(\iota '(\varvec{u}))\) to \(s'\), and therefore \(t' \in \tau (s',\alpha (\iota '(\varvec{u})))\). For the base of induction, consider \(ef(\varvec{u}) = add(P,\varvec{u})\). Then, t differs from s only for tuple \(\varvec{u}\) possibly added to the interpretation of P. Since \(s \oplus t \simeq s' \oplus t'\), also \(t'\) differs from \(s'\) only for tuple \(\iota '(\varvec{u})\) added to the interpretation of P, and therefore \(t' \in \tau (s',\alpha (\iota '(\varvec{u})))\). As regards the base case for \(ef(\varvec{u}) = del(P,\varvec{u})\), t differs from s only for tuple \(\varvec{u}\) possibly deleted from the interpretation of P. Since \(s \oplus t \simeq s' \oplus t'\), again \(t'\) differs from \(s'\) only for tuple \(\iota '(\varvec{u})\) deleted from the interpretation of P, and therefore \(t' \in \tau (s',\alpha (\iota '(\varvec{u})))\).

As for the inductive case for \(ef(\varvec{u}) = ef_1(\varvec{u}_1) \cup ef_2(\varvec{u}_2)\), then t is obtained from s by applying either the effects in \(ef_1(\varvec{u}_1)\) or in \(ef_2(\varvec{u}_2)\). Then, by induction hypothesis, \(t'\) can be obtained from \(s'\) by applying either the effects in \(ef_1(\iota '(\varvec{u}_1))\) or in \(ef_2(\iota '(\varvec{u}_2))\), which is tantamount to \(ef(\iota '(\varvec{u}))\). Finally, for \(ef(\varvec{u}) = ef_1(\varvec{u}_1); ef_2(\varvec{u}_2)\), t is obtained from s by applying first the effects in \(ef_1(\varvec{u}_1)\) and then \(ef_2(\varvec{u}_2)\). Then, by induction hypothesis, \(t'\) can be obtained from \(s'\) by applying first the effects in \(ef_1(\iota '(\varvec{u}_1))\) and then \(ef_2(\iota '(\varvec{u}_2))\), which is tantamount to \(ef(\iota '(\varvec{u}))\).

Intuitively, the notion of uniformity in Lemma 1 captures the idea that actions take into account and operate only on the relational structure of states, irrespective of the actual data they contain. Because of this, uniformity has been compared to the notion of genericity in database theory, whereby in specific cases the answer to a query depends only on the structure of the database [1]. Actually, the result in Lemma 1 is stronger that the notion of uniformity in [7], which is restricted to states belonging to the same system. We are able to prove a stronger result, as we consider C-AbDaS built on the same P-AbDaS and therefore sharing the same actions, which is not the case in [7].

We now demonstrate some auxiliary lemmas that will be used in proving the main preservation result (Theorem 2). The first two guarantee that under appropriate conditions on the cardinality of the interpretation domains, equivalent assignments are preserved by the isomorphism relation. Hereafter we set \(N_{Ag} \!=\! \sum _{i \in Ag} \!\max _{\alpha {(\varvec{x})} \in Act_i} \{|\varvec{x}|\}\), i.e., \(N_{Ag}\) is the sum of the maximum number of parameters contained in the action types of each agent in Ag; whereas \(\mathcal {P}= (Ag, Y)\) and \(\mathcal {P}' = (Ag, Y')\) are C-AbDaS defined on the same P-AbDaS Ag.

Lemma 2

Consider C-AbDaS \(\mathcal {P}\) and \(\mathcal {P}'\) defined on the same P-AbDaS Ag, isomorphic states \(s \in S\) and \(s' \in S'\), an FO-CTL formula \(\varphi \), and assignments \(\sigma \) and \(\sigma '\) equivalent for \(\varphi \) w.r.t. s and \(s'\). For every \(t \in S\) such that \(s \rightarrow t\), if \(|{Y'}| \ge \ |\textit{adom}(s) \cup \sigma (fr(\varphi ))| + N_{Ag}\), then there exists \(t' \in S'\) such that \(s' \rightarrow t'\), \(t \simeq t'\), and \(\sigma \) and \(\sigma '\) are equivalent for \(\varphi \) w.r.t. t and \(t'\).

Proof. First of all, let \(\gamma \) be a bijection witnessing that \(\sigma \) and \(\sigma '\) are equivalent for \(\varphi \) w.r.t. s and \(s'\), and suppose that \(t \in \tau (s, \varvec{\alpha }(\varvec{u}))\) for some joint action \(\varvec{\alpha }(\varvec{u})\). Now define \(Dom(j) \doteq \textit{adom}(s) \cup \sigma (fr(\varphi )) \cup \varvec{u}\), and partition it into:

  • \(Dom(\gamma ) \doteq \textit{adom}(s)\cup \sigma (fr(\varphi ))\);

  • \(Dom(\iota ') \doteq \varvec{u} \setminus Dom(\gamma )\).

Let \(\iota ':Dom(\iota ') \mapsto Y' \setminus Im(\gamma )\) be an invertible total function. Observe that \(|{Im(\gamma )}| = |{\textit{adom}(s') \cup \sigma '(fr(\varphi ))}| = |\textit{adom}(s)\cup \sigma (fr(\varphi ))|\), thus from the fact that \(|{Y'}| \ge |{\textit{adom}(s) \cup \sigma (fr(\varphi ))}| + N_{Ag}\), we have that \(|{Y' \setminus Im(\gamma )}| \ge |{Dom(\iota ')}|\), which guarantees the existence of \(\iota '\).

Next, define \(j: Dom(j) \mapsto Y'\) as follows:

$$ j(u) =\left\{ \begin{array}{l} \gamma (u) \text{, } \text{ if } u\in Dom(\gamma )\\ \iota '(u) \text{, } \text{ if } u\in Dom(\iota ') \end{array}\right. $$

Clearly, j is invertible. In particular, j is a witness for \(s \oplus t \simeq s' \oplus t'\), for \(t' = j(t)\). In particular, since \(t \in \tau (s, \varvec{\alpha }(\varvec{u}))\), by uniformity we obtain that \(t' \in \tau (s', \varvec{\alpha }(j(\varvec{u})))\). Thus, \(s' \rightarrow t'\). Finally, by construction of \(t'\), \(\sigma \) and \(\sigma '\) are equivalent for \(\varphi \) w.r.t. t and \(t'\).    \(\square \)

The proof of Lemma 2 relies crucially on \(\mathcal {P}\) and \(\mathcal {P}'\) being uniform. Moreover, since \(\mathcal {P}\) and \(\mathcal {P}'\) are defined on the same P-AbDaS Ag, we do not need to assume that \(\mathcal {P}\) and \(\mathcal {P}'\) are bisimilar, as it is the case in [7, Lemma 3.9] for instance.

Then, Lemma 2 generalises to runs.

Lemma 3

Consider C-AbDaS \(\mathcal {P}\) and \(\mathcal {P}'\) defined on the same P-AbDaS Ag, isomorphic states \(s \in S\) and \(s' \in S'\), an FO-CTL formula \(\varphi \), and two assignments \(\sigma \) and \(\sigma '\) equivalent for \(\varphi \) w.r.t. s and \(s'\). For every run r of \(\mathcal {P}\), if \(r(0) = s\) and for all \(i \ge 0\), \(|{Y'}|\ge |\textit{adom}(r(i)) \cup \sigma (fr(\varphi ))| + N_{Ag}\), then there exists a run \(r'\) of \(\mathcal {P}'\) such that for all \(i\ge 0\):

  1. (i)

    \(r'(0)=s'\);

  2. (ii)

    \(r(i) \simeq r'(i)\);

  3. (iii)

    \(\sigma \) and \(\sigma '\) are equivalent for \(\varphi \) w.r.t. r(i) and \(r'(i)\).

Proof

Let r be a run satisfying the lemma’s hypothesis. We inductively build \(r'\) and show that the conditions (i)(iii) are satisfied. For \(i = 0\), let \(r'(0) = s'\). By hypothesis, r is such that \(|{Y'}| \ge |{\textit{adom}(r(0)) \cup \sigma (fr(\varphi ))}| + N_{Ag}\). Thus, since \(r(0) \rightarrow r(1)\), by Lemma 2 there exists \(t' \in S'\) such that \(r'(0) \rightarrow t'\), \(r(1)\simeq t'\), and \(\sigma \) and \(\sigma '\) are equivalent for \(\varphi \) w.r.t. r(1) and \(t'\). Let \(r'(1)= t'\).

The case for \(i > 0\) is similar. Assume that \(r(i)\simeq r'(i)\) and \(\sigma \) and \(\sigma '\) are equivalent for \(\varphi \) w.r.t. r(i) and \(r'(i)\). Since \(r(i)\rightarrow r(i+1)\) and \(|{Y'}|\ge |\textit{adom}(r(i))\cup \sigma (fr(\varphi ))| + N_{Ag}\), by Lemma 2 there exists \(t' \in S'\) such that \(r'(i)\rightarrow t'\), \(\sigma \) and \(\sigma '\) are equivalent for \(\varphi \) w.r.t. \(r(i+1)\) and \(t'\), and \(r(i+1)\simeq t'\). Let \(r'(i+1) = t'\). It is clear that \(r'\) is a run in \(\mathcal {P}'\).

Again, Lemma 3 differs from similar results in the literature (e.g., [7, Lemma 3.10]) as we do not need to assume that \(\mathcal {P}\) and \(\mathcal {P}'\) are bisimilar.

By Lemma 3 we can prove that, for sufficiently large domains, FO-CTL formulas cannot distinguish isomorphic C-AbDaS built on the same P-AbDaS.

Theorem 1

Consider C-AbDaS \(\mathcal {P}\) and \(\mathcal {P}'\) defined on the same P-AbDaS Ag, isomorphic states \(s \in S\) and \(s' \in S'\), an FO-CTL formula \(\varphi \), and two assignments \(\sigma \) and \(\sigma '\) equivalent for \(\varphi \) w.r.t. s and \(s'\). If

  1. 1.

    for every run r such that \(r(0)=s\), for all \(k\ge 0\) we have \(|{Y'}| \ge |\textit{adom}(r(k)) \cup \sigma (fr(\varphi ))| + |var(\varphi )\setminus fr(\varphi )| + N_{Ag}\);

  2. 2.

    for every run \(r'\) such that \(r'(0)=s'\), for all \(k\ge 0\) we have \(|{Y}| \ge |\textit{adom}(r'(k)) \cup \sigma '(fr(\varphi ))| + |{var(\varphi )\setminus fr(\varphi )}| + N_{Ag}\);

then \((\mathcal {P}, s ,\sigma ) \models \varphi \) iff \((\mathcal {P}', s', \sigma ') \models \varphi \).

Proof

The proof is by induction on the structure of \(\varphi \). We prove that if \((\mathcal {P}, s ,\sigma ) \models \varphi \) then \((\mathcal {P}', s', \sigma ') \models \varphi \). The other direction can be proved analogously. The base case for atomic formulas follows by Proposition 1. The inductive cases for propositional connectives are immediate and thus omitted.

For \(\varphi \equiv \forall x\psi \), assume that \(x\in fr(\psi )\) (otherwise consider \(\psi \), and the corresponding case), and no variable is quantified more than once (otherwise we can rename variables w.l.o.g.). Let \(\gamma \) be a bijection witnessing that \(\sigma \) and \(\sigma '\) are equivalent for \(\varphi \) w.r.t. s and \(s'\). For \(u\in \textit{adom}(s)\), consider the assignment \(\sigma ^x_u\). By definition, \(\gamma (u)\in \textit{adom}(s')\), and \(\sigma '^x_{\gamma (u)}\) is well-defined. Note that \(fr(\psi )=fr(\varphi )\cup \{x\}\); so \(\sigma ^x_u\) and \(\sigma '^x_{\gamma (u)}\) are equivalent for \(\psi \) w.r.t. s and \(s'\). Moreover, \(|{\sigma ^x_{u}(fr(\psi ))}| = |{\sigma (fr(\varphi ))}|+1\). The same considerations apply to \(\sigma '\). Further, \(|{var(\psi )\setminus {fr(\psi )}}| = |{var(\varphi )\setminus {fr(\varphi )}}|-1\), as \(var(\psi )=var(\varphi )\), \(fr(\psi )=fr(\varphi )\cup \{x\}\), and \(x\notin fr(\varphi )\). Thus, both hypotheses (1) and (2) remain satisfied if we replace \(\varphi \) with \(\psi \), \(\sigma \) with \(\sigma ^x_u\), and \(\sigma '\) with \(\sigma '^x_{\gamma (u)}\). Therefore, by the induction hypothesis, if \((\mathcal {P},s,\sigma ^x_u)\models \psi \) then \((\mathcal {P}',s', \sigma '^x_{\gamma (u)})\models \psi \). Since \(u\in \textit{adom}(s)\) is generic and \(\gamma \) is a bijection, the result follows.

For \(\varphi \equiv AX\psi \), assume by contraposition that \((\mathcal {P}',s', \sigma ')\not \models \varphi \). Then, there exists a run \(r'\) such that \(r'(0)=s'\) and \((\mathcal {P}',r'(1), \sigma ')\not \models \psi \). Since \(|{var(\varphi )\setminus {fr(\varphi )}}|\ge 0\), by Lemma 3, there exists a run r such that \(r(0)=s\), and for all \(i\ge 0\), \(r(i)\simeq r'(i)\) and \(\sigma \) and \(\sigma '\) are equivalent for \(\psi \) w.r.t. r(i) and \(r'(i)\). Since r is a run such that \(r(0)=s\), it satisfies hypothesis (1). Moreover, the same hypothesis is necessarily satisfied by all the runs \(r''\) such that for some \(i\ge 0\), \(r''(0)=r(i)\) (otherwise, the run \(r(0) \rightarrow \cdots \rightarrow r(i) \rightarrow r''(1) \rightarrow r''(2) \rightarrow \cdots \) would not satisfy the hypothesis for r); the same considerations apply w.r.t hypothesis (2) and for all the runs \(r'''\) such that \(r'''(0)=r'(i)\), for some \(i\ge 0\). In particular, these hold for \(i=1\). Thus, we can inductively apply the hypothesis, by replacing s with r(1), \(s'\) with \(r'(1)\), and \(\varphi \) with \(\psi \) (observe that \(var(\varphi )=var(\psi )\) and \(fr(\varphi )=fr(\psi ))\). But then we obtain \((\mathcal {P},r(1),\sigma )\not \models \psi \), thus \((\mathcal {P},r(0), \sigma )\not \models AX\psi \).

For \(\varphi \equiv E\psi U\phi \), assume that the only variables common to \(\psi \) and \(\phi \) occur free in both formulas (otherwise rename quantified variables w.l.o.g.). Let r be a run such that \(r(0)=s\), and there exists \(k\ge 0\) such that \((\mathcal {P},r(k), \sigma )\models \phi \), and \((\mathcal {P},r(j), \sigma )\models \psi \) for \(0\le j <k\). By Lemma 3 there exists a run \(r'\) such that \(r'(0)=s'\) and for all \(i\ge 0\), \(r'(i)\simeq r(i)\) and \(\sigma \) and \(\sigma '\) are equivalent for \(\varphi \) w.r.t. \(r'(i)\) and r(i). From each bijection \(\gamma _i\) witnessing that \(\sigma \) and \(\sigma '\) are equivalent for \(\varphi \) w.r.t. \(r'(i)\) and r(i), define the bijections \(\gamma _{i,\psi }=\gamma _i|_{\textit{adom}(r(i))\cup \sigma (fr(\psi ))}\) and \(\gamma _{i,\phi }=\gamma _i|_{\textit{adom}(r(i))\cup \sigma (fr(\phi ))}\). Since \(fr(\psi )\subseteq fr(\varphi ), fr(\phi )\subseteq fr(\varphi )\), it can be seen that \(\gamma _{i,\psi }\) and \(\gamma _{i,\phi }\) witness that \(\sigma \) and \(\sigma '\) are equivalent for respectively \(\psi \) and \(\phi \) w.r.t. \(r'(i)\) and r(i). By the same argument used for the AX case above, hypothesis (1) holds for all the runs \(r''\) such that \(r''(0)=r(i)\), for some \(i \ge 0\), and hypothesis (2) holds for all the runs \(r'''\) such that \(r'''(0)=r'(i)\). Now observe that \(|{\sigma (fr(\phi ))}|,|{\sigma (fr(\psi ))}|\le |{\sigma (fr(\varphi ))}|\). Moreover, by the assumption on the common variables of \(\psi \) and \(\phi \), \((var(\varphi )\setminus fr(\varphi ))=(var(\psi )\setminus fr(\psi ))\uplus (var(\phi )\setminus fr(\phi ))\), thus \(|{var(\varphi )\setminus fr(\varphi )}|=|{(var(\psi )\setminus fr(\psi )}|+|{(var(\phi )\setminus fr(\phi )}|\), hence \(|{(var(\psi )\setminus fr(\psi )}|,|{(var(\phi )\setminus fr(\phi )}|\le |{var(\varphi )\setminus fr(\varphi )}|\). Therefore hypotheses (1) and (2) hold also with \(\varphi \) uniformly replaced by either \(\psi \) or \(\phi \). Then, the induction hypothesis applies for each i, by replacing s with r(i), \(s'\) with \(r'(i)\), and \(\varphi \) with either \(\psi \) or \(\phi \). Thus, for each i, \((\mathcal {P},r(i), \sigma )\models \psi \) iff \((\mathcal {P}',r'(i), \sigma ')\models \psi \), and \((\mathcal {P},r(i), \sigma )\models \phi \) iff \((\mathcal {P}',r'(i), \sigma ')\models \phi \). Therefore, \(r'\) is a run such that \(r'(0)=s'\), \((\mathcal {P}',r'(k), \sigma ')\models \phi \), and for every j, \(0\le j < k\) implies \((\mathcal {P}',r'(j), \sigma ')\models \psi \), i.e., \((\mathcal {P}',s', \sigma ')\models E\psi U\phi \).

For \(\varphi \equiv A\psi U\phi \), assume by contraposition that \((\mathcal {P}',s', \sigma ')\not \models \varphi \). Then, there exists a run \(r'\) such that \(r'(0)=s'\) and for every \(k\ge 0\), either \((\mathcal {P}',r'(k), \sigma ')\not \models \phi \) or there exists j such that \(0\le j < k\) and \((\mathcal {P}',r'(j), \sigma ')\not \models \psi \). By Lemma 3 there exists a run r such that \(r(0)=s\), and for all \(i\ge 0\), \(r(i)\simeq r'(i)\) and \(\sigma \) and \(\sigma '\) are equivalent for \(\varphi \) w.r.t. r(i) and \(r'(i)\). Similarly to the case of \(E\psi U\phi \), it can be shown that \(\sigma \) and \(\sigma '\) are equivalent for \(\psi \) and \(\phi \) w.r.t. r(i) and \(r'(i)\), for all \(i\ge 0\). Further, assuming w.l.o.g. that all variables common to \(\psi \) and \(\phi \) occur free in both formulas, it can be shown, as in the case of \(E\psi U\phi \), that the induction hypothesis holds on every pair of runs obtained as suffixes of r and \(r'\), starting from their i-th state, for every \(i\ge 0\). Thus, \((\mathcal {P},r(i), \sigma )\models \psi \) iff \((\mathcal {P}',r'(i), \sigma ')\models \psi \), and \((\mathcal {P},r(i), \sigma )\models \phi \) iff \((\mathcal {P}',r'(i), \sigma ')\models \phi \). But then r is such that \(r(0)=s\) and for every \(k\ge 0\), either \((\mathcal {P},r(k), \sigma )\not \models \phi \) or there exists j such that \(0\le j < k\) and \((\mathcal {P},r(j), \sigma )\not \models \psi \), that is, \((\mathcal {P},s, \sigma )\not \models A\psi U \phi \).

We can now immediately extend Theorem 1 to the model checking problem for C-AbDaS.

Theorem 2

Consider C-AbDaS \(\mathcal {P}\) and \(\mathcal {P}'\) defined on the same P-AbDaS Ag, and an FO-CTL formula \(\varphi \). If

  1. 1.

    \(|{Y'}| \ge \max _{s \in S}|\textit{adom}(s)| + |var(\varphi )| + N_{Ag}\);

  2. 2.

    \(|{Y}| \ge \max _{s' \in S'}|\textit{adom}(s')| + |var(\varphi )| + N_{Ag}\);

then \(\mathcal {P}\models \varphi \) iff \(\mathcal {P}' \models \varphi \).

Proof

Equivalently, we prove that if \((\mathcal {P},s_{0},\sigma ) \not \models \varphi \) for some \(\sigma \), then there exists a \(\sigma '\) s.t. \((\mathcal {P}',s'_{0},\sigma ') \not \models \varphi \), and viceversa. To this end, observe that hypotheses (1) and (2) imply, respectively, hypotheses (1) and (2) of Theorem 1. Further, notice that, by cardinality considerations, given the assignment \(\sigma : T \mapsto Y\), there exists an assignment \(\sigma ': T \mapsto Y'\) such that \(\sigma \) and \(\sigma '\) are equivalent for \(\varphi \) w.r.t. \(s_{0}\) and \(s'_{0}\). Thus, by applying Theorem 1 we have that if there exists an assignment \(\sigma \) such that \((\mathcal {P},s_{0},\sigma )\not \models \varphi \), then there exists an assignment \(\sigma '\) such that \((\mathcal {P}',s'_{0},\sigma ')\not \models \varphi \). The converse can be proved analogously, as the hypotheses are symmetric.

Theorem 2 shows that P-AbDaS Ag can in principle be verified by assuming an interpretation domain of suitable size. Notice again that, since \(\mathcal {P}\) and \(\mathcal {P}'\) are defined on the same P-AbDaS Ag, differently from [7] we do not require any notion of bisimulation. Moreover, if we are able to bound the quantity \(\max _{s \in S}|\textit{adom}(s)|\) across Ag, then we obtain a cut-off value. These considerations motivate the following definition.

Definition 14 (Bounded P-AbDaS)

A P-AbDaS Ag is b-bounded, for \(b \in \mathbb {N}\), if for all C-AbDaS \(\mathcal {P}\) based on Ag, for all reachable states \(s \in S\), \(|\textit{adom}(s)| \le b\).

Boundedness can be justified in terms of the underlying implementation of a P-AbDaS. Indeed, in the order-to-cash scenarios it is likely that there is a maximum number of purchase orders that the manufacturer can deal with at any single time. By assuming boundedness, next result follows from Theorem 2.

Theorem 3

Consider a b-bounded P-AbDaS Ag over an infinite interpretation domain \(\mathcal {Y}\). Then, \(n = b + k + N_{Ag}\) is a cut-off for all formulas with at most k variables.

By Theorem 3 to decide whether a specification \(\varphi \) is true in a bounded P-AbDaS Ag, we can check the corresponding C-AbDaS \(\mathcal {P}\) based on increasingly bigger domains \(Y\underset{inf}{\subset } \mathcal {Y}\), until we hit \(|Y| = b + var(\varphi ) + N_{Ag}\). If formula \(\varphi \) is true in all iteration, we can then conclude that \(\varphi \) is true in Ag.

Discussion. The assumption of boundedness to obtain decidability may appear restrictive. However, notice that in most implementation of data-aware systems, the bound is set by the system’s specification in terms of memory. That is, we can safely assume that our system will never contain more than a certain amount of data, however large it can be, and use this bound to verify properties of interest. Unfortunately, the problem of deciding whether a system is b-bounded, for some \(b \in \mathbb {N}\), is undecidable in general. Some restrictions on the specification of actions to obtain bounded systems have been explored in [3].

We conclude this section by elaborating on our running example.

Example 3

Consider again the order-to-cash scenario and suppose that the customer can request at most 5 products for each purchase order and the manufacturer can request at most 10 materials to the supplier. Note that, in principle the number of products could be infinite. Further, the total number of products and the total number of materials are both 20. So, we can fix a bound \(b = 5 \cdot 4 + 10 \cdot 4 + 20 \cdot 2 + 20 \cdot 2 = 140\), and notice that the FO-CTL specifications in Example 2 contain at most 6 variables. Then, the value for the cut-off is \(n = 146 + N_{Ag}\). Since the maximum number of parameters for the customer and the manufacturer is 2 and for the supplier is 1, then \(n = 146 + 5 = 151\) is the total cut-off. As a result, to verify the FO-CTL specifications in Example 2 it is sufficient to model check them on C-AbDaS of domain size \(|Y| = 151\).

5 Related Work and Conclusions

Amongst the first contributions to consider the verification of data-aware systems we mention [8, 18]. This direction was then developed in [12, 15], which apply syntactic restrictions on the system description and the specification language in order to obtain decidability. Closely related to the present contribution are [3, 7, 10], where sufficient conditions for decidable model checking of data-centric dynamic systems are given. Results on the verification of DaS have also appeared in [5, 6, 13], and then applied to the monitoring of commitments [27] and plan synthesis [9]. While we acknowledge the contribution of these works, there are two important differences in our approach w.r.t. the state of the art. Firstly, we here considered the parameterised model checking problem, where each system is parametric w.r.t. a finite, possibly different, interpretation domain; whereas in the references above each system carries its own infinite interpretation domain. Secondly, because of this technical shift, instead of introducing notions of bisimilarity to obtain finite abstractions [7], we rather explore the existence of cut-offs defined on the same agents as the parameterised AbDaS, but with a finite interpretation domain. We believe that this last problem is more interesting for practical applications because, rather than dealing with an actual infinity of data, data-aware systems usually encompass an unbounded number of elements, which is more naturally modelled as a parameterised model checking problem.

On the subject of parameterised model checking of agent-based systems, recently several methodologies and tools have been proposed [22, 23]. These contributions are orthogonal, as while they do not model data-aware systems, they are capable of dealing with an arbitrary number of agents. As regards DaS, a method for the verification of parameterised agent-based systems, each encoded via infinite-state models, was presented in [24]. However, this approach only supports a non-quantified specification language and does not deal with (semi-)structured data as we do here. Finally, [4] reports on some preliminaries results on the verification of data-aware multi-agent systems. But decidability results are available only for a rather limited fragment of the specification language considered therein. The present contribution differs from the works above as, to the best of our knowledge, we introduce for the first time the problem of parameterised model checking for data-aware systems. As we motivated, this is a relevant question for verification, as we aim at guaranteeing the correct behaviour of data-aware systems no matter what the underlying data content is. To this end, we proved theoretical results on the preservation of specifications written in FO-CTL under cardinality constraints. Finally, we showed that such results guarantee the existence of a cut-off for the class of bounded P-AbDaS. We illustrate the relevance of the formal machinery through an application to an IBM use-case, the order-to-cash scenario.

We plan to extend the present work in several directions, including more expressive specification languages, possibly with some form of arithmetic, which is essential for real-life applications. Also of interest are the results in [22, 23] that allow for the verification of systems with an arbitrary number of agents. We plan to explore such an extension of our present setting.