Keywords

1 Introduction and Motivation

Originally introduced by Gill in 1962 [7] and later proposed by Harel in 1987 [8] as a significant extension over traditional finite state machines, statecharts are a visual formalism for modeling the dynamic behavior of components at various levels of abstraction. The Unified Modeling Language (UML), an industrial de facto standard that supports software modeling, adopted Harel’s statecharts in its specification and extended them. This study is based on the extended statechart model, referred to in the literature as “UML state machine” (or “UML statechart”). A state machine can model the behavior of a reactive system at any level of abstraction. In this study, we define a declarative representation of a state machine, and construct a platform to analyze the machine through queries and simulation. The objective is to assist in the validation of system requirements captured by the machine and the methodology entails the study of quality attributes, behavior and well-formedness of the machine as well as simulation.

A declarative model is a powerful and intuitive way to represent state machines, offering numerous advantages in terms of maintainability, scalability, and analysis. In fact, declarative representation expresses the behavior and transitions of the state machine using logical clauses and rules. This model can be implemented in Prolog, which provides capabilities like pattern matching and backtracking, making it well-suited for modeling complex behavior in state machines [16]. Sheng et al. [15] present a Prolog-based consistency checking for UML class diagrams and object diagrams. They formalize the elements of the model and then convert the model into Prolog facts along with some consistency rules that enable querying of the properties, elements, and subsequent parts of the model. Similarly, Khai et al. [12] propose a Prolog-based approach for consistency checking of class and sequence diagrams. State machines are widely utilized in software testing to evaluate performance and quality against predefined requirements. Hashim and Dawood [11] conduct a review of test case generation methods that use UML statecharts. Chen and Lin [3] propose a test case generation strategy that enhances efficiency and guarantees high test coverage and accuracy. Aktaş and Ovatman [1] discuss statechart anti-patterns which may occur in software development process.

Using a declarative model, the static behavior of a system can be studied and the system requirements can be validated. Additionally, statecharts are a widely-used notation for representing the dynamic and executable behavior of complex systems [5]. This highlights the significance of having tools for visualizing and simulating statecharts. Mens et al. [13] introduce a technique to improve statechart design using specialized tools including a modular Python library called Sismic [5]. Van Mierlo and Vangheluwe [17] present an approach for modeling, simulating, testing, and deploying statecharts. Balasubramanian et al. [2] introduce Polyglot, a framework for analyzing models described using multiple statechart formalisms. Their approach involves translating the structure and behavior of statechart models into Java and analyzing them using pluggable semantics. Modeling state machines with nested composite states and flattening the model has been a challenge. One major issue is the potential occurrence of unwanted non-determinism which has also been studied in the literature [9, 10], and [17]. E. V. and Samuel [6] describe a technique to transform hierarchical, concurrent, and history states into Java code using a design pattern-based methodology.

We structure the remainder of this paper as follows: We provide a background to the mathematical specification of a state machine in Sect. 2. We present an overview of our approach and the case study in Sect. 3. We present our initial declarative model in Sect. 4 and describe our query system in Sect. 6. We present our flattened declarative model in Sect. 5; and the simulation process in Sect. 7, together with a discussion on the results of a given scenario. We finally present our conclusion.

2 Background and Assumptions

UML 2.5.1 [14] provides numerous complex features, such as composite and nested states; entry and exit pseudostates; entry, exit, and do state behavior; as well as implicit region completion transitions. These features lead to a complex behavioral analysis. We simplify the machine by converting it into a modified Extended Finite State Machine EXTENDED FINITE STATE MACHINE (EFSM), as specified in the subsequent section. Moreover, the standard UML does not allow \(\epsilon \)-transitions. An \(\epsilon \)-transition is a transition whose event and guard are empty. Observe that \(\epsilon \)-transitions are only allowed in pseudostates (i.e. entry and exit), as well as region completion (i.e. in the case of the completion of a do behavior, or reaching a final substate).

2.1 Modified Extended Finite State Machine (EFSM)

The EFSM is formally defined as a 7-tuple [4]. Our definition of EFSMs adapts this 7-tuple, with a slight modification on the inputs of the transition. An EFSM M, is defined as a 7-tuple \((Q, \varSigma _1, \varSigma _2, q_0, V, \varGamma , \varLambda )\), where

  • Q is a finite set of states,

  • \(\varSigma _1 = \{ e_i \,:\,i\in \mathbb {Z} \}\), is a non-empty finite set of events,

  • \(\varSigma _2 = \{ a_i \,:\,i\in \mathbb {Z} \}\), is a finite set of actions,

  • \(q_0 \in Q\) is the starting state,

  • \(V = \{ v_i \,:\,i\in \mathbb {Z} \}\) is a finite set of mutable global variables,

  • \(\varGamma = \{ g_i \,:\,i\in \mathbb {Z} \}\) is a finite set of guards,

  • \(\varLambda = \{ \lambda : q \xrightarrow { e_i{[}g_i{]}/a_i } q^\prime , i\in \mathbb {Z} \}\), is a finite set of deterministic transitions defined on \(Q \times \overset{\circ }{\varSigma _1}\ \times \overset{\circ }{\varGamma }\ \rightarrow Q \times \overset{\circ }{\varSigma _2}\), where \(\overset{\circ }{\varSigma _1}\ = \{ \epsilon \} \cup \varSigma _1\), \(\overset{\circ }{\varGamma }\ = \{ \epsilon \} \cup \varGamma \), \(\overset{\circ }{\varSigma _2}\ = \{ \epsilon \} \cup \varSigma _2\), \(\epsilon \) denotes null, \(q, q^\prime \in Q\), \(e \in \overset{\circ }{\varSigma _1}\), \(g_i \in \overset{\circ }{\varGamma }\), and \(a_i \in \overset{\circ }{\varSigma _2}\) are all bindable string literals.

A guarded \(\epsilon \)-transition is represented by \(\lambda : q \xrightarrow { e_i{[}g_i{]}/a_i } q^\prime \) where \(e_i=\epsilon \). In the case where \(g=\epsilon \), the transition is referred to as \(\epsilon \)-transition. In order for \(\varLambda \) to be deterministic, for every state \(q \in Q\), at most one possible transition must exist. In other words, \(\forall q \forall \lambda _i: q \xrightarrow { e_i{[}g_i{]}/a_i } q^\prime \), the satisfiability of \((e_i, g_i)\) must be exclusive. While this property holds for all EFSMs, we enforce the following restrictions:

  1. 1.

    If state q has an outgoing \(\epsilon \)-transition, no other outgoing transitions are allowed on q.

  2. 2.

    If state q has an outgoing guarded \(\epsilon \)-transition, only other guarded \(\epsilon \)-transitions are allowed on the state. Let \(\{g_i\)} be the set of all guards for all guarded \(\epsilon \)-transitions on state q. i) \(\cup g_i = \texttt {True}\); ii) \(\forall i \forall j \ne i ~ (\lnot (g_i \wedge g_j))\).

3 Overview of the Approach and Case Study

An overview of our approach is illustrated in the UML activity diagram of Fig. 1, and the various aspects of the diagram will be discussed in the subsequent sections through a case study that models an alarm system, shown in Fig. 2 and Fig. 3.

Fig. 1.
figure 1

UML activity diagram of the approach.

Fig. 2.
figure 2

Case study: Alarm.

Fig. 3.
figure 3

The initial declarative model of the alarm case study.

4 Transformation of the State Machine into a Declarative Model

The first part of this task is to provide a platform that can serve as a virtual machine for analysis of a state machine. The model consists of a declarative representation of a machine, following a defined structure of clauses, implemented as Prolog facts, that represent the state machine as a cyclic directed multigraph, where states are modeled as nodes and where transitions are modeled as edges. Unary clauses such as state/1, pseudostate/1, initial/1, final/1 model their respective language element and proc/1 defines a do behavior. Binary and multi-arity clauses are defined in Table 1.

4.1 Modeling Events

In this declarative model, events are represented by the event/2 clause, implemented as event(type, argument). The supported event types in accordance with the UML specification include call, signal, time and change. Additionally, we introduce three new event types: inactivity, update and completion. A brief description of all event types is shown below:

call::

An external event that triggers a transition. Makes use of keyword call.

signal::

Triggered by an internal or external clock, which indicates a specific time for triggering a transition. Makes use of keyword at.

time::

When the source state has been active for a specified length of time, the transition occurs if its guard evaluates to true. If no guard is present (nil), a transition occurs automatically. Makes use of keyword after.

change::

Triggered by a constantly evaluated condition once true. Makes use of keyword when.

inactivity::

The system is expected to be inactive over a given amount of time, specified by the argument. Though treated as a time event, it makes use of keyword timeout.

update::

Updates the value of a variable or attribute, which may subsequently trigger a transition if the new value satisfies the conditions for the transition. Makes use of keyword set.

completion::

Occurs when a region concludes or a do behavior completes, modeled as event(completed, ?state), where ?state represents the current state (or region). Makes use of keyword completed.

Table 1. Major clause signatures of the initial declarative model.

4.2 Modeling Actions

We classify actions into EXEC and LOG. This classification provides the means to manage each action type differently, allowing for greater flexibility in the model. This classification is particularly useful when we need to flatten the model (see Sect. 5), as it allows us to easily identify and apply the appropriate processing to each type. Finally, the model introduces action/2 to codify actions. The case study illustrates actions that are executed by the script engine (e.g. invoking the echo() method) as well as actions that are logged by the system (e.g. Green LED OFF). Note that a do behavior is a process that is started when the machine enters a state and may be stopped (upon successful termination) or aborted (triggered by an exit event). Finally, in Fig. 2, system shutdown is implemented as an entry behavior of the final state, since a final state cannot have an exit behavior.

5 Flattened Representation of UML State Machines

We extend the initial declarative model and develop an algorithm that flattens the machine. We believe that a flattened model can provide a platform for deeper analysis as well as a simulation of behavior (see Sect. 7). The flattened model provides the same semantic model as the initial model, though at a lower level of abstraction, being analogous to the bytecode platform for languages such as Java and Clojure, which is a seamless virtual machine. The flattened model can also be extended with rules that target the three aspects of our analysis (quality attributes, behavior, and well-formedness).

5.1 The Flattening Process

In a complex UML machine, transitions can trigger various sequences of actions. For example, when transitioning from idle to active, while the transition itself has no action, the activate event triggers the entry action on active before transitioning into configuring. Similarly, when transitioning from activated (substate of active) to idle, a sequence of actions is executed: aborting ‘Make Siren Sound’, executing echo(‘Exit Emergency’), and logging ‘Green LED OFF’.

To analyze the behavior of the UML state machine, we convert it into a flattened EFSM by chaining the subsequent actions using \(\epsilon \)-transitions. Our flattening algorithm consists of 4+1 passes, progressively eliminating complex UML features such as composite states, pseudostates, state behaviors, and internal transitions. Each pass involves multiple steps, modifying facts and reducing complexity until the machine is fully flattened. Finally, the resulting machine is minimized by reducing the number of states and combining equivalent transitions. Prolog queries are used as selectors to process the working database. An outline of the flattening algorithm is presented on the next page:

Procedure Flatten(Input: UML in decl. DB, Output: EFSM in decl. DB)

Pass 0: Preprocessing

1: Convert all outgoing nil-events form state s to event(completed, s).

2: Convert all actions to action-lists.

Pass 1: Processing pseudostates, entry, exit, and do behaviors

1: Resolving do behaviors: For each state s with do behavior with process p:   i) Append “start p”, insert “abort p” notification actions to the entry and exit actions of state s, respectively; ii) For every completed event on state s, insert “stop p” notification action to transition’s actions; iii) Remove the do behavior from s.

2: Resolving entry/exit pseudostates: i) Replace all entry_pseudostate(s, t) clauses with transition(s, t, nil, nil, []) and superstate(p, s) where superstate(p, t); ii) Change all exit_pseudostate(s, p) clauses to superstate(p, s).

3: Resolving entry behaviors: Starting from top to bottom, for every state with entry behavior: i) Find onentry_action(s, a) and remove it; ii) For each incoming transition from an external state x to s: append s to the transition’s action list; iii) For each incoming transition from an external state x to a substate b of s, append a to the transition’s action list; iv) If s is a top-level initial state, create a new state ps, add state(ps); change initial(s) to initial(ps), and add transition(ps, s, nil, nil, a); v) Otherwise if s is a non-top-level initial state, find p where superstate(p, s); add superstate(p, ps); change initial(s) to initial(ps), and add transition(ps, s, nil, nil, a).

Pass 2: Full State Resolution

1: For each composite state p do the following: i) Obtain the list of immediate substates of p into l; Obtain the exit behavior of p into ea; ii) Change the target state of all incoming transitions to p, to the initial substate of p; iii) For each non-final substate s of p repeat: a) Inherit all outgoing nil-transitions from the superstate, if the child state does not contain a nil-transition; b) For every outgoing transition from the state s to a state that is not in l, including the above; insert ea to the transition’s action list, if ea \(\ne \) nil; c) Replace superstate(p, s) with state(s).

iv) Find inner final state f (if applicable); remove both superstate(p, f) and final(f); add state(f); for each transition(p, t, e, g, a) from p to the target state t where e is a region completion event on p: add transition(s, t, nil, g, a); insert ea to a, if ea \(\ne \) nil; v) Remove the composite state p, its behaviors, and all its outgoing transitions.

2: For each remaining state s with exit behavior e, insert e to all outgoing transitions’ actions list and remove the exit behavior clause.

3: For each internal transition on state s, convert internal_transition to transition to self.

Pass 3: Post-Processing

1: For all action lists containing “stop p”, find corresponding “abort p” in the list; remove “stop p”, and change “abort p” to “stop p”.

2: For all transition(s, t, e, g, l), where length(l) \(>1\), create intermediary state i, replace the original transition with transition(s, i, e, g, head(l)) and transition(i, t, nil, nil, tail(l)); Resolve transition(i, t, nil, nil, tail(l)), recursively.

3: Replace all transition(s, t, e, g, []) with transition(s, t, e, g, nil).

Pass 4: State Reduction/Minimization

For each transition(s, t, e, g, a): Find all transition(\({\texttt {\textit{s}}}_\texttt {2}\), t, e, g, a) where \({\texttt {\textit{s}}}_\texttt {2}\) is not initial and \({{\texttt {\textit{s}}}_\texttt {2}}\) \(\ne \) s. Replace all transition(x, \({\texttt {\textit{s}}}_\texttt {2}\), \({{\texttt {\textit{e}}}_\texttt {2}}\), \({\texttt {\textit{g}}}_\texttt {2}\), \({\texttt {\textit{a}}}_\texttt {2}\) ) with transition(x, s, \({\texttt {\textit{e}}}_\texttt {2}\), \({{\texttt {\textit{g}}}_\texttt {2}}\), \({\texttt {\textit{a}}}_\texttt {2}\) ). Remove all instances of state(\({\texttt {\textit{e}}}_\texttt {2}\) ) and transition(\({\texttt {\textit{e}}}_\texttt {2}\), t, e, g, a). Repeat until no more transitions can merge.

Having produced a flattened model, we perform a model transformation into a (new) declarative representation, deploying only the clause structures state/1, initial/1, final/1, transition/5, event/2, and action/2.

Fig. 4.
figure 4

Partial flattened declarative model of the alarm case study.

Figure 4 includes a partial model capturing transitions from states idle and configuring to reading. Consider the transition from idle to configuring in Fig. 2. Such transition causes system startup notification upon entry to idle. The reception of the event activate causes a transition to the active superstate which is now collapsed. Upon reaching active, the transition causes echo configuring mode upon entry to the configuring substate. Such sequence of actions are implemented in the flattened model by sequence of transitions starting from initial, to pre_idle, idle, s71, and finally to configuring. Note that one may extend the model to support transition with multiple actions, in which case, an extra step in pass 4 may reduce the total number of states by following and merging all outgoing nil-transitions into a single transition. We intentionally avoided this to make the model compatible with the definition of EFSMs (Fig. 5).

Fig. 5.
figure 5

The flattened UML diagram

6 Building a Query Platform

With the declarative model as is, we can execute simple ground queries that can give us some basic knowledge of the machine such as “Is there a transition from state idle to state configuring?”

figure a

We can also execute non-ground queries such as “Under what conditions, if any, would the state machine perform a transition to the emergency state?” This would entail capturing any and all state-event-guard triples that can cause such a transition.

figure b

6.1 Extending the Declarative Model with Rules

We can extend the declarative model by introducing rules. We can identify three types of rules: (1) We have rules that reason about the behavior of the state machine by examining the traversal of the underlying graph under various different conditions. When we study behavior, we want rules that reason about elements such as the exposed interface and legal event sequences. (2) We have rules that reason about the quality attributes of the state machine by examining the properties and measurements of the underlying directed graph. When we study graph (machine) complexity we want rules that provide knowledge about aspects such as connectivity and (global and nodal) measurements. We argue that the above two types of rules roughly correspond to the state machine’s functional and non-functional requirements. (3) We have rules that reason about the well-formedness of the machine, such as the presence of infinite loops, dead ends, or conflicts with the UML specification e.g. the existence of an internal transition without an action association.

6.2 Studying Behavior

Exposed Interface: The call and set events correspond to messages sent to the system and they collectively constitute the exposed interface of the system. Rule get_interface/1 succeeds by collecting any and all such events.

figure c

Legal Events at a Given State: Given the system exposed interface, it is important to note that not all events can be acted upon unconditionally. An event can be accepted based on the system’s current state. It will be acted upon provided the associated guard (if one is present) evaluates to true.

figure d

6.3 Studying Complexity

We provide rules for properties and measurements. Measurements in graphs can be global or nodal. Global measurements refer to global properties of the graph and consist of a single number for any given graph. Nodal measurements refer to properties of the nodes and consist of a number for each node for any given graph.

Order of Graph: This measurement refers to the number of nodes in a graph. In the context of state machines, we believe that the initial model may not give us an accurate picture due to the presence of composite states. The flattened model would be more accurate for this measurement. For the initial and flattened models the corresponding rules are shown below:

figure e
figure f

Number of nil Transitions: The number of nil transitions in a flattened model can be a measure of the complexity of a state machine. The following rule succeeds by returning the number of nil transitions:

figure g

Size (or Length) of Graph: This measurement refers to the number of edges in a graph. In the context of state machines, we believe that the initial model may not give us an accurate picture due to the fact that in the presence of composite states, their nested states inherit the transitions of their superstate. The flattened model would be more accurate for this measurement.

figure h

6.4 Studying the Well-Formedness of the State Machine

We define rules to study the design of the state machine and find cases such as dead ends, conflicts, or inconsistencies among the state machine’s elements, considering issues such as (1) Dead ends and infinite loops, (2) Internal transition without an action, (3) Multiple change events originating from the same state, (4) Non mutually exclusive guards originating from the same state, (5) The absence of a do behavior in the presence of an external transition with no event, and (6) As the previous item for a composite state, in the absence of an exit substate.

Dead Ends: We are interested in finding out if the machine can enter a state from which the final state is not reachable. Rule dead_end/0 succeeds by obtaining a non-empty list of states from each of which there is no path to state final.

figure i

7 Simulating State Machine Behavior

The query system provides a level of analysis that is complemented with a simulation of the machine. The flattened model serves as the platform for simulation. A simulation reads in a machine representation and a scenario under which the machine is traversed and its state and behavior is monitored and recorded. The question we ask here is “Is the Machine behaving according to its specification?” During simulation, we need to be able to identify issues perhaps not having been identified by the query system, e.g. “Has the simulator encountered an ambiguous transition?”, in which case we need to report such issues.

Structure of Scenario: A scenario is a sequence of commands consisting of three types of tags: EVENT, EXECUTE, and TIME. EVENT tags can be of type call, set, or completion, and must trigger the corresponding transition. EXECUTE tags contain expressions that modify variable values, and may trigger a transition. TIME tags can be either after or at, which update time variables duration and absoluteTime (if applicable) and may trigger a transition.

Read-Evaluate-Execute Cycle: In UML, it is assumed that a state machine processes one event at a time and finishes all the consequences of that event before processing next event [14]. At the highest level of abstraction, and given a scenario, the simulation would be performed using a Read-Evaluate-Execute Cycle. When a command in a scenario is EVENT e, where \(e \in \varSigma _1\), given the current state and the event, the simulator would construct a transition query and consult the declarative model. We query the database and find all transitions \(\lambda _i \in \varLambda \) with event e. The result of the query is a set of \(\lambda _i\), associated with tuples \(\{ (q, g, a)_i \}\) where \(q \in Q\) is the target state, \(g \in \varLambda \) is a guard, and \(a \in \varSigma _2\) is an action. Each tuple is also associated with a set of \(v_i \subset V\), containing all variables used in \(g_i\) and \(a_i\). The query is successful only if one transition is possible. This is achieved by instantiating all variables in \(v_i\) and evaluating \(g_i\). Upon success, a single transition is fired. The simulator consequently checks if any additional transitions can be triggered, following the most recent transition. The process continues until no further possible transition is applicable.

Simulator Architecture: To perform a simulation, we need to provide storage of all variables (machine and environment) while keeping track of any changes. We also need to provide storage and keep track of the machine’s current state. To support these requirements, we provide an imperative model in Java while deploying Java Prolog Library (JPL). We use Javascript to maintain system variables, and we deploy the GraalVM engine to evaluate events and guards, and finally to execute actions. We illustrate the architecture of the simulator in the UML component diagram shown in Fig. 6. We illustrate the interaction of the various components during simulation with the UML sequence diagram of Fig. 7. The diagram illustrates the interactions among high-level objects, including SimulatorExecuter, JPLMediator (facilitating the communication with the declarative model), ScriptHandler (responsible for evaluating guards, actions, and modifying variables), a Scenario defined as a text file containing a sequence of events for simulation, and the Output generated by the tool. The outer loop in the sequence diagram illustrates the Read-Evaluate-Execute cycle and the inner loop mostly covers \(\epsilon \)-transitions in our flattened model.

Fig. 6.
figure 6

UML component diagram of the simulator.

Fig. 7.
figure 7

UML sequence diagram of the simulation process.

Results of Simulation in the Case Study: We applied the flattening algorithm to the declarative representation of our case study, and the resulting minimized flattened model is shown in Fig. 4. Also, Fig. 8 presents a sample scenario (top-left) along with the corresponding simulation output (top-right).

Visualization of Results: we visualize the results of simulating the scenario as the model of behavior which is shown in Fig. 8 (bottom). This diagram shows the current state of the state machine as well as state of the system in each time id.

Fig. 8.
figure 8

Input scenario and the corresponding simulation output (top), and its model of behavior (bottom)

Table 2. Complexity Metrics: Original vs Flattened Models

8 Conclusion

In this paper, we presented a declarative model to represent UML state machines. The model is used to study the dynamic behavior of the underlying machine. The simulation results provide insights into the machine’s behavior under specific scenarios. We developed a simulation tool and a query engine that use the model in Prolog environment and run scenarios in an imperative platform. We deployed JPL for Java-Prolog interoperability. Our platform supports codified actions in JavaScript, by which developers may set or update system variables, in both the model, as well as in scenarios.

We introduced an algorithm to flatten the UML state machine and convert it into an extended finite state machine. Our algorithm supports major UML 2.5.1 features including single and composite states; exit and entry pseudostates; state behaviors including entry, do, and exit; in addition to the UML events including call, signal, time, change, as well as three newly introduced events namely inactivity, update, and completion. Table 2 lists some metrics that may be used to measure the complexity of the UML diagrams in both original and flattened models.

We used a modified version of the extended finite state machine to support guarded and unguarded \(\epsilon \)-transitions that are required for handling complex sequences of actions and notifications in a non-flattened model. Future work may involve expanding the model to include contract considerations as well as other UML features such as history pseudostates and orthogonal regions.