1 Introduction

Legal contracts specify the terms and conditions, i.e., the requirements, that apply to business transactions. They are commonly expressed in natural language and often contain parts that are ambiguous, incomplete, conflicting, or possibly invalid, i.e., inconsistent with the intentions of contracting parties. A smart contract is a software system intended to partially automate, monitor, and control the execution of a legal contract to ensure compliance with relevant terms and conditions [48]. There is tremendous interest in industry these days for such systems, in sectors that include supply chain management, energy, and government [41].

Formal specifications of legal contracts can serve as requirements specifications of smart contract software. Such specifications can also enable automated analysis of a contract, as well as the generation of smart contract code. Among several languages that exist, Symboleo was recently proposed as a formal specification language for legal contracts, together with an ontology, syntax, and semantics [35, 45].

In previous work [37], we briefly proposed an analysis tool named SymboleoPC for model checking properties of Symboleo contract specifications, but without an automated translation of a Symboleo specification into model checker code, nor any scalability analysis. SymboleoPC is built on top of the nuXmv model checker [9] and can check properties expressed in Linear Temporal Logic (LTL) [27] or Computation Tree Logic (CTL) [18]. Such properties can capture the intents of the contracting parties, as well as desirable legal properties such as termination for all possible executions.

As illustrated in Fig. 1, our new SymboleoPC tool now automatically translates a Symboleo specification to a nuXmv contract module that invokes a library of predefined modules extracted from Symboleo ’s ontology and axioms. This generated contract module is the one that can later be checked against properties using the nuXmv model checking environment.

Fig. 1
figure 1

Overview of SymboleoPC ’s construction

Model checking technology [14] has come of age in the past decade with important applications in analyzing different types of artefacts, including hardware and software. This technology enables searches over huge spaces of execution paths looking for counterexamples to a given desired property that a specification should have, or an undesirable property it should not have. However, model checking can be computationally prohibitive, sometimes only returning answers for simple problems in a particular domain. A recent survey [43] identifies scalability challenges for model checking, especially regarding models of contracts. In this context, a critical research question for SymboleoPC is whether it scales up to analyze specifications of real-life contracts, such as contracts and templates found on the Web, rather than only toy examples. The purpose of the work reported herein is to present a full implementation of SymboleoPC and assess its scalability.

The contributions of the paper are as follows:

  • A full implementation of SymboleoPC, including its architecture and testing, along with an illustration of use for a new contract specification.

  • A scalability analysis that studies the performance of the tool as the input specifications and properties-to-be-checked grow in size along different dimensions; the results of the analysis suggest that SymboleoPC can be usefully deployed for the analysis of real-life and real-size legal business contract specifications and properties, but not for large contracts involving hundreds of terms and conditions.

This paper is an extended and improved version of a paper that appeared at the MODELS 2022 conference [36]. The extensions include the following items:

  • Improvements to the earlier SymboleoPC prototype to support variable assignments in events used by obligations and permissions, which is a recent extension of the Symboleo language. This feature enables more types of legal contracts to be formalized and verified (Sect. 4.1).

  • Further improvements to SymboleoPC to support the automatic generation of nuXmv specifications that capture implicit constraints between event predicates in Symboleo. This automates the generation of complex code that was generated manually in the earlier prototype (Sect. 4.1).

  • A new example of a contract (Computer Delivery) that demonstrates the use of the above features (Sect. 2.1).

  • Additional details in the description of the translation from Symboleo to nuXmv (Sect. 4.1). A new, more realistic performance evaluation where synthetic Symboleo specifications are first converted to nuXmv automatically (using SymboleoPC) and then model-checked against synthetic LTL and CTL properties. Different verification algorithms are used to verify the generated models and identify the key factors that impact their performance (Sect. 5).

The rest of the paper is structured as follows. Section 2 introduces the Symboleo language (with the help of a new Computer Delivery contract) and the nuXmv model checker. Section 3 presents SymboleoPC ’s architecture, whereas Sect. 4 discusses its implementation rules, and unit/acceptance testing, and demonstrates its use through an example. Section 5 reports results of the scalability experiments. In Sect. 6, we review related work. Finally, Sect. 7 concludes and speculates on future research.

2 Research baseline

We introduce the Symboleo specification language for contracts. In designing Symboleo, we reviewed hundreds of sample contracts found by searching the web from different legal domains and jurisdictions and formalized more than a dozen that we considered representative of contracts that can benefit from event-based monitoring [37, 45]. In this section, we showcase a novel example that leverages the language’s recently added assignment feature.

2.1 Symboleo

In Symboleo, a contract consists of obligations and powers (collectively called legal positions), roles (the contracting parties), assets, situations that describe contract conditions, as well as events. Each obligation has a debtor role that is responsible for making a consequent true, if an antecedent (a situation) holds, for the benefit of a creditor, another role. A power (which enables creating, changing, or terminating other legal positions) also has a creditor and a debtor where the creditor has the right to exercise the power by making a consequent (another situation) true if an antecedent holds [45].

Both obligations and powers may have triggers that can be used to instantiate them many times during the execution of a single contract. For example, a buyer that generates many purchase events (triggers) instantiates for each purchase a new obligation for the seller to deliver it. Every contract must have at least two roles and two assets, one of which is usually money for business transactions. Triggers, antecedents, and consequents, as well as preconditions, post-conditions, and constraints, are expressed in an extension of the event calculus [44] where quantification is over finite sets. Events are treated as primitive concepts for describing consequents, antecedents, etc., and play a critical role in monitoring compliance.

To illustrate the nature of Symboleo specifications, consider a simple computer buying contract:

  1. i.

    The customer orders a computer from a store, to be delivered within 7 days;

  2. ii.

    The customer agrees to pay a deposit worth between 15 and 20% of the computer price, on the same day;

  3. iii.

    The customer agrees to pay the remaining amount of the computer price within 10 days of delivery;

  4. iv.

    If delivery is late, the customer has the option (power) to cancel the contract or get a 5% reduction on the original price and pay within 10 days of delivery.

It is imperative to retain data from monitored events and use that data for dynamically adjusting the states of obligations, powers, and the contract itself. For contracts that include cumulative constraints, such as ‘Contract terminates when the total amount of computers sold exceeds $100,000,’ we need a way to keep track of the amount of computers sold as the contract is being executed, especially as the number of sales is unknown in advance. Toward this end, a new construct is introduced to the Symboleo language to update a cumulative constraint variable. Cumulative constraints are typical of contracts where payment or delivery can be done in multiple steps captured by events.

The specification in Listing 1 begins with a domain model that formalizes terms mentioned in the contract (such as , , , and ) as specializations of primitive concepts in Symboleo ( , , , etc.). The contract is named , and its definition begins with parameters that are assigned values for each contract execution, as well as local variables that can be assigned instances of the classes introduced in the domain model. Event values that come from the environment are labeled with .

When contract execution begins, all obligations and powers without triggers are initiated, i.e., , , and in this contract. As has a trigger condition ( or ), this obligation gets instantiated when one of these two events happens. This obligation then proceeds to become active because its antecedent is true. That obligation is fulfilled when the total amount paid so far is computed. Note the usage of the assignment ( ) in this particular obligation, which is needed to accumulate all the partial amounts paid by the customer to the store, with a number of payments that is unknown at contract design time and that can differ across contract instances. When this event happens, becomes active, and the customer has 10 days to pay the rest of the computer price when the computer is delivered within the delivery due date. Once that happens, there are no active obligations on any side, so the contract execution terminates successfully.

If delivery is late, two powers ( and ) are triggered to give the customer authority to choose between (i) cancelling the contract and getting a reimbursement, or (ii) paying 95% of the computer’s price ( triggers obligation , while triggers obligation ). Note that time is an intrinsic event attribute, accessible through a predicate or using , where and are variables and is a predefined attribute name.

figure am

Unfortunately, this sample specification does not capture the expectations of the customer and of the store:

  1. 1.

    If the computer is delivered late and the customer chooses the ‘pay late’ option, then she must pay both the regular price and the reduced price (i.e., 195% of the original price here) for the computer! To fix the problem, needs to be amended into . That is, the antecedent of that obligation is not the happening of the event, but the successful fulfilment of the delivery ( ) obligation. With this amendment, becomes active and obliges the customer to pay for the computer at the regular price only if delivered on time.

  2. 2.

    Moreover, in line 41, the payment is added to the amount deposited each time a partial payment is made by the customer! This time, this issue puts the store at a disadvantage. To fix the problem, the deposit amount must be the variable to be updated ( )) and be used in the condition on line 45 (i.e., ).

This example underscores the importance of formal analysis of Symboleo specifications to ensure they are consistent with the expectations of contracting parties.

2.2 The nuXmv model checker

The nuXmv model checker [9] is the evolution of the NuSMV open source model checker [11]. It supports the specification and the analysis of finite- and infinite-state transition systems. The nuXmv specification language provides for modular hierarchical descriptions and for the definition of reusable parametric components. The basic purpose of the nuXmv language is to describe (using expressions in propositional calculus) the transition relation of either a finite-state or infinite-state transition system.

A nuXmv program consists of: Declarations of state variables (within the scope of ) which determine the state space of the model (this construct is also used to instantiate modules); Init assignments and Next assignments (both in the scope of ) define respectively the valid initial states and the transition relations; Declarations, specified in the scope of , introduce abbreviations of complex formulas to be evaluated in the current state. The variables can be defined for a finite range (e.g., Boolean, enumerative, finite integer range, or bit vectors) or for an infinite number of states (e.g., Integer, Real).

nuXmv provides state-of-the-art algorithms for the verification and analysis of both CTL and LTL properties specified in the nuXmv program (with , , respectively). Listing 2 provides an excerpt of a nuXmv program. nuXmv allows proving that a temporal property holds. Moreover, for properties that do not hold, it can generate a counterexample witnessing the reason why the property fails. This last feature allows also to generate witnesses for temporal properties, thus supporting the user in assessing the correctness of the model or of the property itself [1, 12, 21, 39]. We refer the reader to [49] for a more detailed description of the nuXmv language and functionalities of nuXmv. Below we provide a brief introduction to LTL and CTL.

Intuitively, given an infinite sequence of states (computation sequences), the LTL syntax and semantics are as follows. Any propositional formula \(\varphi \) is an LTL formula, which holds in a state if the formula evaluates to true in that state. If \(\varphi \) and \(\psi \) are LTL formulas, then \(\lnot \varphi \), \(\varphi \wedge \psi \), and \(\varphi \vee \psi \) are LTL formulas with the standard semantics. LTL also uses the following temporal state operators: (i) \({{\,\mathrm{\textbf{X}}\,}}\varphi \) is an LTL formula that holds in a state of the sequence if \(\varphi \) holds in the state at the next position in the sequence, and (ii) \(\varphi {{\,\mathrm{\textbf{U}}\,}}\psi \), which holds in a state if \(\varphi \) holds at every point in the sequence starting from the given state until \(\psi \) holds. We also use \({{\,\mathrm{\textbf{F}}\,}}\varphi \) as a shorthand for \(\top {{\,\mathrm{\textbf{U}}\,}}\varphi \), which holds in a state of a sequence if eventually in a subsequent state, including the current one, \(\varphi \) holds, and \({{\,\mathrm{\textbf{G}}\,}}\varphi \) as a shorthand for \(\lnot {{\,\mathrm{\textbf{F}}\,}}\lnot \varphi \), which holds in a state of a sequence if in all subsequent states, including the current one, \(\varphi \) holds.

CTL extends LTL temporal state operators with path quantifiers \({{\,\mathrm{\textbf{A}}\,}}\) (for all paths) and \({{\,\mathrm{\textbf{E}}\,}}\) (there exists a path) to be applied only in front of state formulas (e.g., \({{\,\mathrm{\textbf{EX}}\,}}\), \({{\,\mathrm{\textbf{AX}}\,}}\), \({{\,\mathrm{\textbf{EG}}\,}}\), \({{\,\mathrm{\textbf{AG}}\,}}\), \({{\,\mathrm{\textbf{E}}\,}}[\cdot {{\,\mathrm{\textbf{U}}\,}}\cdot ]\), \({{\,\mathrm{\textbf{A}}\,}}[\cdot {{\,\mathrm{\textbf{U}}\,}}\cdot ])\). CTL semantics, unlike LTL that uses computation sequences, is given on computation trees. Thus, (i) \({{\,\mathrm{\textbf{EX}}\,}}\varphi \) holds in a state if there exists a computation starting from that state such that in at least one next state \(\varphi \) holds, (ii) \({{\,\mathrm{\textbf{EG}}\,}}\varphi \) holds in a state if there is a computation starting from that state such that for at least a path of such computation, \(\varphi \) holds in all the states of the path, and (iii) \({{\,\mathrm{\textbf{E}}\,}}[ \varphi {{\,\mathrm{\textbf{U}}\,}}\psi ]\) holds in a state if there is a computation starting from the state such that for at least a path of such computation, \(\varphi \) holds at least until at some position in the future \(\psi \) holds. We also use \({{\,\mathrm{\textbf{EF}}\,}}\varphi \) as a shorthand for \({{\,\mathrm{\textbf{E}}\,}}[ \top {{\,\mathrm{\textbf{U}}\,}}\varphi ]\) to state that there exists a path of a computation such that along the path eventually \(\varphi \) holds.

Note that while there are CTL properties that cannot be checked using LTL, there are also LTL properties that cannot be checked using CTL. As a final remark, while both CTL and LTL properties can be verified on finite-state transition systems, in nuXmv only LTL properties can be verified for infinite-state transition systems [9].

3 SymboleoPC : a model checker for Symboleo

SymboleoPC (Symboleo Property Checker) is a tool that, given a Symboleo specification of a contract, a set of temporal logic properties (representing expectations of that contract), and a range of parameter values of interest, verifies whether each property holds or is violated. For each property proven not to hold for the contract, a counterexample witnessing the reason is generated so that the user can either correct the specification of the contract or revise the property itself. Moreover, properties are also used to generate behaviors (witnesses) compliant with a given temporal property to check that expected intentions, captured by the property, are complied with the contract. All these functionalities are intended to check that unforeseen undesired situations are not encountered during contract execution and thus to partially protect contracts against parties trying to exploit their weaknesses, and that the contract is not too restrictive to rule out desired outcomes.

3.1 Library of trusted contract-independent modules

SymboleoPC leverages the nuXmv finite-state symbolic model checker [9] to perform verification of specified properties. The conversion from Symboleo to nuXmv relies on the ontology and structure of a Symboleo specification, that consists of (i) generic language concepts such as contract, obligation, power, party, and event; (ii) domain-specific information used to specify the specific constraints of each contract.

The semantics of Symboleo is given in terms of statecharts describing the states and transitions of Symboleo generic concepts, and axioms expressed in Event Calculus specifying the guards and effects that govern statechart transitions, as well as quantitative constraints.

The statechart diagram depicted in Fig. 2 describes the behavior of Symboleo ’s happen predicates. Similar statechart diagrams exist for contract, obligation, power, and party concepts [45].

Fig. 2
figure 2

Statechart diagrams representing the concepts of three variants of the predicate

Each of the generic concepts is encoded faithfully in a nuXmv module parametric on the conditions and guards that label the specific state transitions, with variables to encode the states, and with declarations to define reusable predicates of the state diagram to facilitate encoding of Symboleo ’s primitive concepts.

figure ba
figure bb

Listing 2 represents the nuXmv encoding of the statechart diagram for the concept depicted in Fig. 2. The predicate , which stands for Weak Happens Before, starts in the state. If event2 does not happen before or at the same time as event1, the predicate changes to . This indicates that event1 occurred before event2, but does not guarantee that event2 will eventually happen. In contrast, the predicate (Strong Happens Before) includes an additional state to ensure that event2 eventually happens. This guarantees that event1 happens before event2 and that event2 will occur at some point in the future. Finally, the predicate indicates that event1 occurs after event2.

The concepts of role, asset, and situation, which are also contract-independent, are defined as distinct nuXmv modules, as illustrated in the nuXmv Listing 3. These modules are designed to allocate a party to a role, ascertain the owner of an asset, and articulate the propositional state of a situation, respectively. Their attributes can be accessed using the ‘dot’ notation. For instance, if ‘Computer’ represents an asset with ‘Michael’ as its owner, an instance of is instantiated in nuXmv, and the owner’s identity (Michael) can be accessed using the notation in the nuXmv environment.

Each of the resulting nuXmv modules can be subject to formal verification to ensure the overall properties of the corresponding Symboleo concept are preserved (e.g., the last property of Listing 2 is an LTL property aiming to verify that the encoding of the is such that never happens before .

The encodings in nuXmv of Symboleo ’s generic concepts (ontology and axioms) constitute a library of trusted modules to be used as building blocks for the encoding of a specific Symboleo specification. This library of trusted modules, which are independent from any specific contract (see Fig. 1), is a core component of SymboleoPC.Footnote 1

The complete nuXmv encoding is obtained by having the nuXmv representation of a specific Symboleo contract specification instantiating the elements of the library of trusted modules. For example, the contract-specific nuXmv module shown in Listing 4 corresponds to the Symboleo specification of the Computer Delivery contract from Listing 1. In that example, and instantiate trusted modules from the contract-independent library, whereas and instantiate contract-specific nuXmv modules generated by SymboleoPC for that Symboleo specification, which in turn instantiate modules from the contract-independent library.

figure br

3.2 Verification problem scoping

SymboleoPC relies on finite-state symbolic model checking techniques provided by the nuXmv model checker [9] to perform formal verification of a Symboleo specification against a set of properties. Symboleo, by its very nature, specifies a contract that applies to a possibly infinite set of Symboleo concept instances. For example, the Computer Delivery contract of Listing 1 is expected to be valid for all possible instances of , , , , etc. However, in order to perform model checking with nuXmv, we are forced to specify a finite set of instances for each class element. Thus, for instance, we say that there are two possible customers namely and , two possible addresses (e.g., and ), and so on for all the parameter elements of the contract. This approach has several similarities with the approaches used (i) in Formal Tropos [21], where an upper bound was specified on the number of instances for each class in the domain model; (ii) in PDDL Planning [19], where the PDDL problem specifies the objects for the planning problem; (iii) and the object diagram used in OthelloPlay [8].

To create a verification problem in SymboleoPC, one needs to create a file that contains the range of interesting instances for each class of the specification. Given this file, an algorithm generates a complete nuXmv specification that includes all models to be checked for verification purposes.

The steps of the conversion algorithm are outlined in Listing 8. The conversion starts by analyzing the basic elements and parameters of the contract and of the problem (lines 4 to 17), followed by the analysis of the contract, powers, and obligations (lines 20 to 26), and finally building the complete nuXmv specification of the given verification problem (lines 29 to 58). In a nutshell, the algorithm extracts defined event and asset variables from declarations and creates respective instances using the information in Symboleo ’s problem scoping file. During this analysis, the algorithm builds, for each condition that governs a legal position of a specification (e.g., antecedent, consequent, triggers, precondition, termination, satisfaction), a corresponding propositional formula to be used in the instantiation of the respective nuXmv module. Moreover, this analysis computes precedence relations among events. For instance, if the antecedent of an obligation is satisfied by an event, then the event must happen before the creation of the obligation. All these elements contribute to the final nuXmv verification of the problem at hand.

We remark that the parameters of the contract are mapped to nuXmv ’s Footnote 2 (see line 4 in Listing 5). These variables are the mechanism provided by nuXmv to specify parametric specifications. These variables, together with other variables and constants necessary to encode the specification, can be passed as argument values at the time a nuXmv module is instantiated to make the specification complete and enable model checking. During verification, such variables range over all possible assignments in their problem scope. Thus, if a property holds, it does so for all possible assignments to its parametric variables, or if a property is violated, the model checker pinpoints specific values that lead to property violation.

Listings 4 and 5 represent excerpts of the final encoding in nuXmv of the Computer Delivery contract of Listing 1. In this problem scoping example we considered two customers ( , ), two different stores ( , ), two addresses ( , ), three options ( , , ), prices represented as integers ranging from to currency value (e.g., CAD), etc. The choices made during problem scoping may have a critical impact on verification and must therefore be chosen carefully taking into account the contract itself and the properties to be verified (see next section). However, if the model checker proves a property to hold, it means that any value for the parameters-at-hand has no effect on the truth of the given property. On the other hand, if the property is violated, then the model checker shows which values lead to a violation, through the generation of a counterexample.

We finally note that we chose not to include the complete encoding of the contract specifications in nuXmv here. The complete contract specification in nuXmv and the complete details of the encoding are available in [38].

figure cn

3.3 Property checking with SymboleoPC

SymboleoPC enables checking a specification (given a problem scope) against desirable and undesirable properties formulated as LTL or CTL formulas. These properties are subject to encoding to convert Symboleo terms into corresponding nuXmv terms. Typical properties of interest for a contract are:

  • Termination: There should always exist a way to terminate a contract; otherwise, parties may remain liable forever after. A contract terminates successfully if all instantiated obligations are fulfilled. A power may also cause a contract to terminate unsuccessfully.

  • Limited liability: Legal contracts commonly compensate for breaches by entitling a creditor to dynamically impose fine obligations on the debtor. This technique may subject a debtor to unlimited liability against the creditor. A property can check that such liabilities are limited in terms of time and assets.

  • Conformity to party intentions: A contract must comply with party intentions and expectations; otherwise, the contract may be deemed void and parties may rescind the contract. Leaving an unwanted contract is often expensive. Properties can express intentions and expectations in terms of events and situations.

During property verification, implicit constraints play a vital role in fine-tuning and restricting potential solutions. Their primary purpose is to guarantee that only solutions adhering to the defined temporal order are considered within the solution space. These constraints achieve this by introducing specific conditions that govern the sequence of predicate functions, such as , , and . By doing so, implicit constraints contribute to narrowing down the solution space, eliminating sequences that fail to adhere to the specified order within the contract.

Listing 6 shows for each of the properties above a corresponding encoding in nuXmv. The state names in these properties refer to state diagrams encoded in nuXmv. P1’s query (instance of Termination, named ) ensures that the contract finally reaches either a successful or unsuccessful termination state. Property P2 (instance of Limited liability, named ) checks that a store cannot be penalized more than once when it violates its delivery obligation. Property P3 (instance of Conformity to party intentions, named ) assures that on-time delivery is a prerequisite of payment. In this scenario, the implicit constraint (refer to Listing 4, line 35) ensures that the valid solutions, when the paid event is active, are those where the delivered event has occurred or expired. Any solutions that deviate from this specified order—such as the paid event being active, while the delivered event has not occurred—will be excluded from the solution space. This constraint is derived from the contract specification (see Listing 1, lines 37 and 45), which states that the delivery event must occur within 7 days of the order time, while the paid event must occur 10 days after the order time. It is crucial to highlight that the verification of this property yields a false result without the application of this constraint, whereas the use of the implicit constraint renders the verification true. Property P4 (instance of Conformity to party intentions, named ) further checks that each legal position is activated in some execution, which is akin to looking for dead code in computer programs.

figure cv

We note that SymboleoPC can also be used to discover possible ways to fulfill the conditions of a contract. For example, P5, named , (for which the model checker is expected to generate a counterexample) discovers a sequence of events that delivers the computer to the customer and terminates the contract without payment.

3.4 Architecture

While Fig. 1 gives an overview of how SymboleoPC was created, Fig. 3 presents the architecture of SymboleoPC, together with its inputs and outputs. The tool leverages the nuXmv model checker engine [9] to perform analysis.

Fig. 3
figure 3

Overview of SymboleoPC ’s inputs and outputs

SymboleoPC expects three main inputs:

  1. i.

    a Symboleo contract specification syntactically validated with an Xtext-based editor (e.g., Listing 1);

  2. ii.

    a problem scope that specifies a finite set of instances for each class that determine a finite set of contract instances (e.g., Listing 5);

  3. iii.

    the set of temporal logic properties (expressed in CTL/LTL) to be verified against contract instances of interest (e.g., Listing 6).

Translation into nuXmv (as discussed in Sect. 3.2 and sketched in Listing 8) takes advantage of a library of trusted nuXmv modules (Sect. 3.1), each encoding basic Symboleo constructs and primitives (e.g., axioms of primitive predicates, runtime operations, and state machines describing the behavior of primitive concepts). Details of the encoding are available online [38].

The output of the translation is a complete nuXmv specification. At this point, SymboleoPC invokes the symbolic model checker nuXmv to verify the CTL/LTL properties and analyze its outputs. For the properties that hold, it simply reports the information to the user. For properties that do not hold, it presents a counterexample/witness to the user. To this end, SymboleoPC leverages the mapping used for the conversion from Symboleo to nuXmv.

4 Implementation and testing

This section presents rules used to support the nuXmv code generation from Symboleo specifications in SymboleoPC ’s implementation, together with the tests used to assure a minimum level of quality.

4.1 Implementation

The translation process is a multi-step endeavor that entails navigating through the specifications outlined in a contract. As illustrated in Listing 7, a contract specification contains the domain and scopes of the contract. A contract comprises a designated name and a list of input parameters that dictate specific values crucial to the contract instance, such as the payment due date. The contractual dynamics is event-driven, where events play a pivotal role in altering the contractual landscape. For example, an event might encapsulate a situation that triggers an obligation. Variables are used to define instances of events, situations, and assets. Similarly, Obligation, Power, and Party are all identified with variables, although the translation algorithm treats them distinctively. Depending on the contract, various constraints may apply.

figure cx

The translation algorithm systematically transforms each scope into their respective nuXmv modules, as detailed in Listing 8. The algorithm initiates its process by parsing events, assets, and situation variables (lines 4 to 17) and systematically delving into the powers and obligations associated with a contract (lines 18 to 26). These initial steps serve to extract and organize the essential metadata of a contract, subsequently arranging them into appropriate data structures. Leveraging this extracted metadata, the algorithm proceeds to construct nuXmv modules for the contract (lines 27 to 54) and generates the corresponding ASSIGN clauses based on assignment expressions (lines 56 to 60).

figure cy

More specifically, the algorithm extracts event and asset variables from the declaration scope of a specification. According to the event module nuXmv, a propositional precondition enables an event. Lines 7 to 17 search anywhere whether the occurrence of an event is effective (e.g., in antecedents, consequents, and triggers of legal positions) and determine the precondition for the occurrence of the event. For instance, the ordered variable in Listing 1, integrated into the antecedents of oOrder and oDel, becomes activated upon the instantiation of these respective obligations. Consequently, as shown in Listing 4, the conjunction Order.state = create | oDel.state = create serves to activate the ordered event.

A contract, along with its legal positions, is represented through parametric nuXmv modules. These parameters serve as input parameters for the nuXmv modules associated with obligations, powers, and contracts. They play an important role in determining when a legal position or a contract undergoes state changes [37]. While some parameters are statically defined by constant values, others are contingent on the legal positions or the contract itself. Consider the True value in the oOrder obligation in Listing 4, which serves as a constant indicating unconditional obligation triggering. In contrast, oOrder_violated is a variable that dynamically determines when the obligation is violated. The algorithm navigates through a contract, dynamically computing these variables. In particular, lines 20 to 26 gather powers that result in the termination of a contract, generate an event for the exertion of each power, and ultimately produce the cntTermination variable, representing the contract’s termination. In a parallel fashion, the algorithm defines variables for the dischargement, suspension, resumption, and termination of obligations. Additionally, it introduces variables pertaining to the suspension and resumption of a contract.

Following the preliminary processing of variables, the algorithm undergoes a secondary scan of the specification, during which it creates customized nuXmv modules. In line 30, contract-independent modules are dynamically generated. Specifically, nuXmv modules for the timer, event, obligation, power, and contract are universally defined once, regardless of a particular contract. These modules serve as templates and are subsequently instantiated for each specific contract. Moving to lines 32 to 33 of Listing 8, the algorithm then addresses domain-specific concepts, such as Store, Device, Computer and Delivered as outlined in Listing 1. These concepts are uniquely defined for each contract, resulting in the creation of one nuXmv module per concept.

In line 35, a parametric module is crafted to align with a designated contract, exemplified by ComputerC in Listing 1. This module includes declarations of variables, legal positions, and constraints relevant to the specific contract under consideration. Subsequently, in line 37, an instance of the contract module is instantiated. This instantiation employs internal variables to govern the contract’s behavior. In particular, the disjunction function is used to aggregate events that culminate in the termination of a contract, such as the exertion of a power. The invocation of disjunction(cntTermination) serves as the catalyst for contract termination, adhering to the logical principles encoded within the contract module.

Aligned with the stipulated variables in the contract specification, lines 39 and 40 of Listing 8 initiate the instantiation of certain domain modules. Specifically referencing the ComputerC contract, the variable ordered is established, referring to an instance of Ordered defined within the domain concept. The conversion procedure involves the creation of an instance of the Ordered module in nuXmv, with the resulting instance assigned to the nuXmv variable labeled ordered.

Similarly to the instantiation of the contract module, the algorithm scans the obligations and powers and generates the corresponding instances of nuXmv modules with the proper parameters in lines 41 to 42. Debtors and creditors are two features of legal positions that determine liability, the right holder, and the performer. The makeDebtor and makeCreditor functions (lines 43 and 44) instantiate a party module for the debtor and the creditor of a legal position. Then, Symboleo constraints are converted to invariants in nuXmv (lines 53 to 54). Implicit constraints are also generated to constrain the sequence of occurrence of the predicate functions and reduce the solution space (lines 49 to 51).

To extract the clause from the assignment expressions found in all the legal positions, lines 56 to 60 go through their expressions , , and , searching for the and predicate functions to extract both the event and the expressions.

The translation algorithm consists of 13 rules [34]. In order to provide clarifications on the algorithm, we present an illustrative sample of six important rules in Appendix A.

SymboleoPC ’s code generator is implemented in Xtend [6] and Java, with Eclipse. The parsing of a specification leverages the Xtext framework. Similarly, the translations back and forth from Symboleo to nuXmv leverage the navigation methods provided by Xtext and Xtend. The implementation consists of more than 3000 lines of code that comprise a set of methods that mostly parse the Xtext file and generate nuXmv modules using translation rules. The tool, technical tutorials, usage instructions, and full contract examples are publicly accessible on GitHub [38].

4.2 Unit and acceptance tests

The tool and translation rules have undergone a rigorous assessment process, which included a comprehensive set of unit tests and two acceptance tests. Unit tests were carried out at different levels of granularity, covering various scenarios related to assets, situations, and events. At the highest level of granularity, the tests focused on verifying the translation rules of legal positions, constraints, and contracts. This approach allowed us to test the translation of concepts and relationships in Symboleo ’s ontology, and to ensure that the translation rules for obligations and powers were accurate, based on verified assets, events, and situations. To ensure that our test scenarios covered the other concepts well, we extracted entities from the Symboleo ontology and the language grammar. The resulting unit test scenarios are summarized in Table 1, while details are available on GitHub [15].

Table 1 Test scenarios for SymboleoPC

This approach not only helped us identify and correct errors in our translation rules but also enabled us to improve the overall quality of the translation tool.

An often comes with a list of attributes that describe the quantitative and qualitative properties of the asset (scenario 1). A contract may contain several simple assets, consisting of atomic attributes (scenario 2), or composite assets, which contain an attribute with the type of a defined asset (scenario 3). As an example, can be a composite asset that contains and assets. The last scenario is the generalization of an asset (scenario 4). Although there are several generalization rules such as attribute overriding, the translator supports inheritance cases with new attributes and skips overriding cases.

Similarly, an is defined by a set of attributes (scenario 1). An event is often used in the antecedent, consequent, or trigger of a legal position, or in the precondition, postcondition, or constraint of a contract. Scenario 2 covers the antecedent and the consequent, while the remaining scenarios have been implemented in the tool. In addition to obligations, events may activate a power (scenario 3). An event may occur through a time-limited predicate such as (scenario 4). Similar to assets, the generalization of events is another possible format of events (scenario 5).

A Situation is represented in different formats. Atomic situations are numeric and Boolean values or occurrence predicates. Recursive combinations of atomic situations result in a composite situation. A situation may expire when it never happens in the future. For example, expires if the obligation is fulfilled, terminated, or discharged.

Unconditional and conditional legal positions are typical scenarios of valid legal positions. However, several powers may accomplish the same action such as termination of a contract. In this case, the translator mixes powers and generates a proposition for the termination of a contract (scenarios 2 and 3).

The assignment is defined by two forms, and . Both are used in the antecedent and consequent of a legal position. Scenarios 1 and 4 cover the antecedent and the consequent of the obligations, while scenarios 3 and 6 cover powers’ antecedent. The assignment may have more than one assignment expression and modify the event and contract variables.

Unit test scenarios have been assessed through state coverage and pair transition coverage metrics [53]. The test scenarios cover all concepts as well as 18 out of 22 links in the ontology of Symboleo. Liability, performer, right holder, and subcontracting association links [35] are not covered as the translator does not support runtime operations.

The quality of the verification results depends on the correctness of the specification and the properties. To validate the correctness of the generic modules (i.e., the event, timer, party, obligation, power, and contract), we specified for each a set of highly granular properties, and we verified each of them using the nuXmv tool itself. The result is then a library of generic modules that constitute a trusted basis for the specification of the contract, hence minimizing the possibility that contract-dependent properties fail because of bugs in the common basic modules. Since state machines represent the behavior of modules, state and transition coverage metrics have been used to assess coverage and the percentage of properties. For example, Listing 9 lists a set of LTL and CTL properties used to verify that each state of event and obligation statecharts can be reached in the encoded nuXmv modules and that all the direct and indirect transitions can be fired.

figure dp

5 Scalability analysis of SymboleoPC

The performance analysis of a tool such as SymboleoPC is a multi-parameter problem. The most important parameters that may affect the performance of SymboleoPC and that will allow to evaluate its scalability in handling realistic, typical legal contracts are (1) the number of legal positions (i.e., obligations and powers) in a Symboleo specification, (2) their inter-dependencies (e.g., defined by conditions), (3) the verification algorithm, and (4) the number and structure of the properties to check.

Table 2 Frequencies of legal positions and their dependencies in 14 business contracts adopted from the Web

To perform a credible evaluation on synthetic and scalable benchmarks within the space of these parameters, we have studied fourteen typical monitorable (i.e., with many events) legal business contracts adopted from the literature or publicly available on the Web. These contracts are available in annotated form online [38]. From these contracts, we only extracted the distributions of legal positions, their relationships, and the operators that occur in properties of interest, with results reported in Tables 2 and 3.

Table 2 shows that for these legal contracts, the number of obligations ranges from 1 to 31, while the number of powers ranges from 1 to 20. The dependency level of legal positions indicates to what extent the evolution of obligations and powers depends on other clauses of a contract. Symboleo ’s semantics determine the types of dependencies that can exist between positions, including the creation, suspension, and discharge of obligations by powers, or the termination of contracts by powers. Implicit dependencies, e.g., triggering obligations by the violation of other obligations, are also important. Considering all possible types of dependencies in generating specifications would result in a huge space of contracts to be tested. However, most types are rarely found in real contracts. Consequently, our empirical study only considers 12 frequent types of dependencies, reported in Table 2 when generating synthetic specifications. The results of our analysis suggest that 14% of the obligations are triggered by powers (R3), while 22% of powers are triggered by an obligation violation (R9). Suspension and resumption of an obligation (R4 and R5) are outliers here since the meat sales contract contains two powers to suspend and resume all obligations. Note that the dependency percentages sum up to more than 100% because some positions include multiple dependencies.

Table 3 reports the result of our empirical study of typical LTL and CTL properties previously used to analyze legal contracts. In the analysis, we considered the number of occurrences of CTL and LTL temporal operators, the number of disjunctions (or), conjunctions (and), implications, and negations, as well as the number of sub-formulas (which corresponds to the nesting of temporal operators). These results show that for the property formulas, the maximum nesting of operators is 7 and the average is about 3.

Table 3 Distribution of operators in properties of legal business contracts

The presence of dependencies among positions reduces the size of the state space to explore during verification. However, dependencies also involve the evaluation of conditions. The impact of these adversarial performance factors must be studied empirically.

5.1 Testing infrastructure

In this paper, we conducted a more realistic performance evaluation than we did in our previous work [36]. Now, we generate Symboleo specifications of random synthetic contracts and translate them into nuXmv models automatically using SymboleoPC, instead of directly generating nuXmv code (and bypassing Symboleo). Synthetic random contracts were manually syntax-checked using the Symboleo editor, which enforces all the necessary static rules needed to specify legal contracts correctly. Furthermore, different validation algorithms are used to verify the generated models, new data analysis is conducted, and new results are reported.

For the evaluation, we complemented the SymboleoPC tool with a side evaluation tool that takes as input a randomly generated Symboleo contract specification and the set of parameters that comply with the extracted distributions of Symboleo parameters (e.g., # of obligations, # of powers, % of dependencies, # properties, and depth) also used for the generation of the Symboleo specification. This tool first converts the random synthetic Symboleo contract into the nuXmv format, and then, it generates a set of random properties for the given Symboleo specifications directly at the nuXmv level. This tool is implemented in Java and is available online [38].

The next three sections report the evaluation results along the parameters discussed previously (# of positions, % of dependencies, and # of properties in Table 3). We explore these parameters separately to reduce the number of possible analysis combinations. All the results have been obtained by executing the tools on a laptop equipped with an Intel Core i5-8250U CPU with 1.60GHz and 8GB RAM. For each test, we considered an execution time limit of 2 h, reordered variables, and computed reachable states of nuXmv modules to increase the verification performance.

5.2 Number of independent legal positions

SymboleoPC ’s performance is sensitive to the number of external events that are not triggered by obligations or powers. These events can happen in the real world and are not controlled by SymboleoPC. They cause SymboleoPC to check additional possible scenarios. In our experiment, four external events can trigger, fulfill, activate, or expire independent obligations. Similarly, external events can trigger, activate, exert, or expire powers. However, internal events such as suspension, resumption, termination, and discharge by a power or contract are discarded in our experiment. In addition, fulfillment and violation states share one event whose trigger fulfills the obligation and whose expiration violates the obligation. Using the tool discussed in the previous section [38], we generated over sixty Symboleo contract specifications with random dependencies between the powers and obligations. We then converted each of them automatically to nuXmv modules using SymboleoPC. To create independent tests, such that the obligations and powers are independent (without any type of dependencies such as R2, R3, R5, R6, and R9 in Table 2), we define two ‘fresh’ events to trigger and represent the consequence of each obligation. However, according to the principles set forth in the Symboleo ontology [45], power is defined as the right of a party to create, modify, suspend, or terminate legal situations. Hence, a power that does not impact an obligation or another power lacks a rational basis in the legal contracts domain. Consequently, and as enforced by Symboleo ’s grammar, each power must include at least one dependency. In our synthetic specifications ‘without dependencies,’ we decided that all powers will suspend a dummy obligation (R4) while leaving the other obligations independent—an enhancement affecting 8% of the specifications. Compared to the original experiment in [36], this addition enhances the coherence and correctness of the generated specifications.

Listing 10 shows the Symboleo contract specification with one obligation ( ) and one power ( ) that are independent, while Listing 11 shows the nuXmv code resulting from the conversion with SymboleoPC. Notice that the obligation ( ) is an unconditional obligation that is created when the contract is initiated and suspended by , representing the 8% dependency. This dependency is realized by the internal event that is a part of the situation (a condition used to evaluate the situation and decide whether to suspend the obligation or not). Three external events together with the internal event are associated with four free variables and randomly happen or expire to modify legal positions. Other input parameters of and are constants or situations to eliminate inter-dependencies.

figure dy
figure dz

To simulate the various compositions of obligations and powers, we considered a number of obligation instances ranging from 1 to 32, incremented by a factor of 2 (i.e., 1, 2, 4, 8, 16, 32), and a number of power instances ranging from 1 to 16, also incremented by a factor of 2. For this analysis, nuXmv computes the set of reachable states (a basic step typically carried out before checking any property, giving an idea of the complexity of the problem). The results of this analysis are reported in Fig. 4, indicating that reachable states’ computation times grow exponentially with the number of positions. Hereafter, reachable states computation time means the time to compute the set of reachable states.

Fig. 4
figure 4

Set of reachable states computation time (seconds) per number of positions. The Y-axis is displayed along a logarithmic scale

Table 4 Independent positions: average and mean deviation times (seconds) of reachable states computation, for 8 runs

In this experiment, the case involving 32 obligations and 16 powers exceeded preset time limits (2 h). SymboleoPC is hence able to handle cases where the contract contains up to 32 obligations with up to 16 powers, in a reasonable amount of time, to compute the full set of reachable states. These results suggest that SymboleoPC can handle real-life size contracts, as identified in our study since they consist of fewer legal positions than the limits identified above. It should be mentioned, however, that there exist larger contracts with hundreds of positions, especially in domains such as logistics.

For contracts with 8 powers, we also executed each test case eight times and computed their average execution time (in seconds). Table 4 shows that, for a given configuration of obligation power, there is only a small variance in the execution time among the 8 runs. This suggests that tool performance depends deterministically on the parameters used in this study.

5.3 Dependency levels between legal positions

To evaluate the impact of legal position dependencies on SymboleoPC ’s performance, we also generated 30 test contracts using the most frequent types of legal position dependencies observed in real contracts and analyzed in Table 2, i.e., 14% on average of obligations with dependencies of type R3, 8% on average of obligations with dependencies of type R4, and 22% on average of powers with dependencies of type R9. (R12 and R9 were ignored as they are dependencies involving contracts, not just legal positions.) Again in this experiment, we measure the time to compute the set of reachable states.

The results are reported in Fig. 5, where we also compare the reachable states computation times of these test cases with the corresponding scenarios that use independent positions. In this figure, ompn represents the numbers of obligations (m) and powers (n), respectively. As the Y-axis uses a logarithmic scale, times below 1 appear with a negative exponent.

Fig. 5
figure 5

Comparison of the set of reachable states computation time (seconds) per number of obligations/powers, with and without dependencies

These results show that, for the same numbers of obligations and powers, the time is reduced in most cases with dependencies between legal positions (23 out of 30 scenarios, with o8p1, o16p2, o32p2, o1p4, o1p8, o8p16, and o16p16 as exceptions) since some free variables have been replaced with the status of dependent legal positions. We performed the Wilcoxon signed-rank testFootnote 3 on our data test, as it compares the probability to get a higher value from one group (e.g., specifications with dependencies) with the probability to get a higher value from a dependent group (e.g., specifications without dependencies). The test result indicated that there is a significant medium difference between reachability times in specifications with dependencies ( \(\hbox {median} = 3.1, n = 29\)) and reachability times in specifications without dependencies ( \(\hbox {median} = 3.4, n = 29\)), with \(p =.035\) (and hence \(p < 0.05\), which suggests significance) and \(r = 0.4\) (medium magnitude).

We also measured the average and mean deviation of the time used to compute the reachable states, through eight execution rounds for test contracts that contain 8 powers. These results, summarized in Table 5, show that even though the times measured to compute reachable states of rounds are not convergent, they are always ascending in each round. One potential explanation for the standard deviation is the weak management of multi-core CPUs. The nuXmv tool running the experiments underneath SymboleoPC is a single thread program, and it was using 100% of the capacity of one CPU core even if other CPU cores were not used.Footnote 4

5.4 Property checking time

We used the results from the empirical observation discussed earlier in this section to generate 1000 LTL and 1000 CTL random properties for the synthetic dependent legal contracts discussed previously, considering 16 obligations and 12 powers. The results are reported in Fig. 6, where we plot the median time required by SymboleoPC to check each of the properties.

The diagrams show that (i) the median verification times are about 0.36 s for LTL properties and 0.01 s for CTL properties; (ii)the average verification times are about 4.8 s for LTL properties and 0.26 s for CTL properties.

These results tell us that thousands of typical properties can be checked within an hour. We remark that a typical stand-alone contract has few properties but scaling up the contract and taking into account relevant regulations can lead to an exponential growth of properties to be checked.

Figure 6 shows more fluctuations for LTL execution times than for CTL properties. We have found no explanation to account for this difference. In any case, the difference does not alter the general conclusions of our scalability study.

Figures 7 and 8 present an in-depth analysis of verification times for both CTL and LTL properties. The verification process for all CTL properties is verified within 0–1.77 s, whereas the verification of LTL properties spans 0–301 s. Notably, 98.6% of LTL properties are successfully verified within 2.96 s, with only a minimal 1.4% (14 properties) requiring more than 59 s for verification. Notice that verification time is not aggregated, which means properties release resources after verification. Therefore, the tool is able to verify many properties sequentially.

To assess the impact of the properties and the verification algorithm on the performance of the generated models, we analyzed the same 1000 LTL properties exhibiting performance times, utilizing the same nuXmv model but employing the IC3 state-of-the-art SAT algorithm [7]. Figure 9 illustrates the performance times of these properties with the adoption of the IC3 algorithm. There is a noticeable reduction in performance times of some of the properties, while for other properties the performance time increased to reach 1 h and 42 min. For example, as shown in Figs. 8 and 9, property number 508 has the largest performance time (around 301 s) when running the BDD algorithm [13], whereas its performance time dropped significantly to 14.24 s using IC3 algorithm. On the other hand, the BDD algorithm needs 0.38 s to verify property number 744 whose verification time equals 1 h and 46 min using IC3. Table 6 shows the structure and the verification results of these two properties.

Table 5 Dependent positions: average and mean deviation times (seconds) of reachable states computation, for 8 runs
Fig. 6
figure 6

Median time (seconds) to check each property

Table 6 Two generated LTL properties used for comparing the two verification algorithms

Furthermore, we noted variations in computation times when the result was true, whereas properties with false results exhibited more consistent times. Within the overall computation time of 17 h for the 1000 properties, a subset of 26 properties, all producing true results, required 11 h for verification. Another group of 187 properties, also yielding true results, displayed notably short computation times, totaling just 1 min, with each property taking less than 1 s.

Conversely, the verification of 699 properties with false results took approximately 6 h, with an average computation time of 35 s. In contrast to properties with true results, their actual computation times varied between 13 s and 2 min, so their execution times were more consistent.

The average computation time of the generated nuXmv model when using IC3 verification algorithm is 1 min and 13 s while the median equals 27 s. However, despite this relatively short average, the 1000 LTL properties demand 17 h to be verified. The same properties and the same model were verified in approximately 45 min only using BDD algorithm.

It is noteworthy that our models, even those with up to 64 obligations and 64 powers, were effectively verified by the IC3 verification algorithm. This demonstrates the robustness of our models, showcasing their ability to undergo successful verification (for LTL properties) even at a larger scale. It is important to highlight our decision to halt testing upon reaching large models with 64 obligations and 64 powers (equivalent to 128 legal positions and around 250 internal and external events), a scale beyond the scope of most real contract scenarios. Furthermore, although the computation times of our nuXmv models are notably affected by the hardware specifications of the running device, the relative fluctuations in these times are primarily influenced by the size of the model and the chosen algorithm. The property evaluation value also contributes to this dynamic, yet its impact is contingent on the algorithm employed for the verification process.

Fig. 7
figure 7

Verification time (seconds) for 1000 CTL properties using BDD algorithm

Fig. 8
figure 8

Verification time (seconds) for each of the 1000 LTL properties using the BDD algorithm

Fig. 9
figure 9

Verification time (seconds) for each LTL property using the IC3 algorithm

5.5 Threats to validity

Table 7 summarizes various threats to the internal, external, construct, and conclusion validity of the research. Internal validity is crucial as it hinges on the Symboleo specification, underscoring its dependency on an ontology that may change over time. Modifications to the Symboleo language can significantly impact the translator and, consequently, the results of the analysis. External validity, on the other hand, poses challenges related to the inspection of Symboleo specifications by legal contract experts. The absence of such scrutiny may result in specifications not fully capturing the original intent of natural language contracts.

In terms of construct validity, the study conveys a limitation in the scalability study, which primarily focused on time without a detailed examination of memory usage. However, the experiments did not lead to any memory-related problems. Lastly, the conclusion validity addresses the potential limitation in the representativeness of the 14 business contracts adopted for analysis. Acknowledging that these contracts may not entirely represent the entire class targeted by Symboleo and SymboleoPC, the study suggests that future research should explore additional contracts to enhance the generalizability of the conclusions drawn here.

Table 7 Threats to validity

6 Related work

The formal verification of smart contracts using specialized languages has been the subject of several contributions, as discussed in the surveys of Tolmach et al. [52] and of Shishkin [46]. Of particular relevance here, papers [3, 31,32,33] use nuXmv for the functional verification of implementations of smart contracts in languages such as Ethereum Smart Contracts, to check deadlock-freedom, liveness, and safety properties expressed in CTL or LTL.

Frank et al. [20] define an SMT-based [5] bounded model checker for the verification of low-level implementation properties of Ethereum networks. Antonino and Roscoe [4] propose a similar approach but for the Solidity high-level language used by Ethereum smart contract developers. Li and Long [24] propose and study the SOLAR analysis tool, also based on SMT, for automatically detecting standard violation errors in Ethereum smart contracts. Liu and Liu [26] propose a formal verification method based on Colored Petri Nets (CPN) and the ASK-CTL variant to verify smart contracts in blockchain systems. Hajdu and Jovanovic [23] provide a source-level verification tool for Ethereum smart contracts.

All these papers have common characteristics: They show the feasibility of their analysis approach on case studies or examples, but they do not perform a thorough experimental assessment of scalability and applicability along important dimensions such as the size of the problem, the degree of interconnection of the specification elements, and the number and size/depth of the properties involved. They also focus on smart contract programming languages that do not support legal concepts such as obligations and powers.

There exist high-level formal contract languages other than Symboleo, but they also suffer from limitations in their verification support or performance assessment. For example, TCL [10], PENELOPE [22], and eFlint [50, 51] support some analysis capabilities (akin to testing) but not yet any formal verification of properties. Other languages support verification, but without any performance or scalability assessment. This is the case of MODELLER from Daskalopulu [16], with contracts formalized in Petri Nets and model-checked against CTL properties, and of \(\mathcal{C}\mathcal{L}\) from Pace et al. [33], with contracts in deontic logic, transformed to labeled transition systems, and also model-checked using nuXmv. SCIFF, from Alberti et al. [2], enables verifying deontic logic contracts using a tailored procedure for design-time property verification. The performance of this procedure was briefly evaluated in [29], but not for contract specifications or their properties.

There are also approaches checking the satisfiability of LTL formulas that are somewhat related to our approach. Notably, Li et al. [25] and Rozier and Vardi [42] investigated different approaches for checking the satisfiability of LTL formulas both randomly generated and taken from specifications of realistic problems. Similarly, Narizzano et al. [30] discussed an approach for checking the satisfiability of LTL properties resulting from random combinations of property patterns [17]. In these three cases, they used nuXmv, among other tools. For the generation of random formulas in our setting, we leveraged the approach used in [30] for LTL specifications, with extensions to handle CTL specifications, and to consider atomic propositions resulting from Symboleo specifications. We remark that the focus of these approaches is LTL satisfiability, i.e., they use a universal model (without constraints on the evolution of variables). In our case, we have a nuXmv model resulting from the encoding of a Symboleo specification, either randomly generated or corresponding to a real contract.

7 Conclusions and future work

The verification of legal contract specifications against properties capturing the intents of contracting parties is essential, especially in contexts where these specifications are used to guide smart contract implementation. It is also important to assess whether automated verification tools can scale to realistic-size contracts and properties.

This paper reports on the implementation, performance, and scalability analysis of SymboleoPC, a tool based on nuXmv for model checking legal contracts specified in Symboleo against LTL and CTL properties. Our analysis results suggest that SymboleoPC performs well on realistic mid-sized contracts with up to 128 legal positions and scales well to their size considering different degrees of inter-dependencies among their legal positions. The tool also scales well in support of LTL/CTL properties of different sizes and degrees of complexity. These results improve substantially the scalability results reported in previous work [36]. We remark that:

  1. 1.

    Since our analysis is done at the specification level, this study is original compared to existing work on the verification of smart contract code (e.g., expressed in Solidity) or of specifications developed in other high-level contract languages. In related papers [35, 37], we have shown that Symboleo is significantly more expressive than other languages that have been developed for similar purposes. Since then, we have also extended Symboleo to support assignments (as illustrated in our Computer Delivery example and in the translation rule #6, shown in Table 14), the automated generation of implicit constraints in nuXmv (Sect. 4.1 and translation rule #5, shown in Table 13), as well the automatic generation of smart contracts in JavaScript for the Hyperledger Fabric blockchain platform [40].

  2. 2.

    Our synthetic contracts are realistic because they are based on metrics extracted from contracts found in contract repositories, with few adaptations, and with considerable variation among them.

  3. 3.

    The amount of performance experimentation that was done for checking the properties of these contracts far exceeds what was done in any other research published so far in this field.

This work opens the door to additional future research directions. For example, studying how to turn such verification engines into online services. As contracts exist within legal systems (e.g., national laws or other jurisdictions), it would be relevant to encode legal requirements in Symboleo or in nuXmv such that verification could be performed in other contexts. Some contracts that satisfy properties when evaluated stand-alone may no longer do so in specific judicial contexts, and this verification may help detect voidable contracts. This may pose performance challenges depending on the complexity of legal systems.

For real-world contracts, there is a need to be able to specify runtime operations. Within Symboleo, we have already introduced syntax and axiomatic semantics for operations supporting subcontracting, assignment, delegation, and substitution [35]. Verifying properties at that level will help handle several aspects of contract evolution. Our forthcoming efforts involve further integrating these concepts with access control principles to enhance security and privacy measures.

Although the usability of the Symboleo language and related tools is outside the scope of this paper, our previous work discusses usability concerns (especially for legal professionals) [37]. Ongoing work investigates the conversion of natural language contracts to Symboleo  [28, 47], which would then help reduce the effort in generating good contract specifications.

Finally, we hope that this evaluation study of SymboleoPC will guide similar work on assessing the performance of verification technologies for other contract specification languages.