1 Introduction

Formal methods using interactive proof assistants such as Coq [6] or Isabelle [18] are increasingly integrated into the software engineering process to verify the correctness of software. They have a solid formal basis and a precise semantics, but they use complex notations that might be difficult to understand for unaccustomed users. On the contrary, Model-Driven Engineering (MDE) [4, 20] supplies us with visual specification languages such as class diagrams [11] that use intuitive notations. These languages permit to specify, visualize, understand and document software systems. However, they suffer from a lack of precise semantics. We are interested in combining these two complementary technologies by mapping the elements of the one into the other, using an MDE-based transformation method.

One possible scenario is to define the abstract syntax of a Domain Specific Language (DSL) [7] to be used in the context of a formal verification, and then to generate a corresponding Ecore meta-model to be able to use an MDE-based tool chain for further processing. Inversely, the meta-model can then be modified by an application engineer and serve as basis for regenerating the corresponding data types. This operation may be used to find a compromise between the representation of the software architect’s wishes on the meta-model and functional data structures used in the proof. Furthermore, the meta-model can be used to easily generate a textual (or graphical) editor using Xtext (respectively GMF: Graphical Modeling Framework) facilities [12].

This work constitutes a first step towards using MDE technology in an interactive proof development. The illustrating example is a Java-like language enriched with assertions developed by ourselves for which no off-the-shelf definition exists [2]. It constitutes a sufficiently complex case study of realistic size for a DSL. In this paper, the transformation is applied only on a part of the Safety Critical Java DSL, corresponding to a method definition. A transformation for the whole language can be found in [8], together with other case studies.

This article is a considerably extended and revised version of a previous conference paper [9]. It provides a more detailed discussion of related work and of our case study, and gives a more formal treatment of the transformation rules for transforming meta-models into data types as used in functional programming (see Sect. 4). The definition of the inverse translation (see Sect. 5) is entirely new.

The structure of the paper is as follows: In Sect. 2, we compare our approach with related work. Then, we present some preliminaries, to introduce the main components of our work in Sect. 3. Sections 4 and 5 constitute the technical core of the article; they describe the translation from data models used in verification environments, to meta-models in Ecore and vice versa. We then illustrate the methodology with an example in Sect. 6, before concluding with perspectives of further work.

2 Related Work

Eclipse Modeling Framework (EMF) [5] models are comparable to Unified Modeling Language (UML) class diagrams [11]. For this reason, we are interested in the mappings from other formal languages to UML class diagrams and back again. Some research is dedicated to establishing the link between these two formalisms. We cite the work of Idani et al. that consists in a generic transformation of UML models to B constructs [14] and vice versa [13]. The authors propose a meta-model based transformation method defining a set of structural and semantic mappings from UML to B (a formal method that allows to construct a program by successive refinements, using abstract specifications).

Similarly, there is an MDE-based transformation approach for generating Alloy (a textual modeling language based on first-order logic) specifications from UML class diagrams and backwards [1, 21].

The purpose of these methods is to generate UML components from a formal description and backwards, but their formal representation is significantly different from our needs: functional data structures used in proof assistants.

In addition, graph transformation tools [10, 16] permit to define source and target meta-models all along with a set of transformation rules and use graphical representations of instance models to ease the transformation process. However, the verification functionality they offer is often limited to syntax, typing and structural aspects (such as confluence of transformation rules).

Notable exceptions are codings of graph transformations as transition systems [3, 23] for model checking properties of the transformation system such as invariants or reachability. These approaches are not applicable in our context as we aim at modeling deeper semantic properties (such as an operational semantics of a programming language and proofs by bisimulation that often require inductive arguments).

Our work aims at narrowing the gap between interactive proof and meta-modeling by offering a way to transform data structures used in interactive provers to meta-models and vice versa.

3 Preliminaries

3.1 Methodology

Model-driven Engineering is a software development methodology where the (meta-)models are the central elements in the development process. A meta-model defines the constituents of a language. The instances of theses constituents are used to construct a model of the language. A model transformation is defined by a mapping from elements of the source meta-model to those of the target meta-model. Consequently, each model conforming to the source meta-model can be automatically translated into an instance model of the target meta-model. The Object Management Group (OMG) [19] defined the Model-driven Architecture (MDA) standard [15], as specific incarnation of the MDE.

We apply this method to define a generic transformation processes from data types (used in functional programming) to Ecore models and backwards. Figure 1 shows an overview of our approach. For the first direction of the translation, we derive a meta-model of data types starting from an EBNF representation of the data type definition grammar [18]. This meta-model is the source meta-model of our transformation. We also define a subset of the Ecore meta-model [12] to be the target meta-model. To perform this transformation, we defined a set of transformation rules (detailed in Sect. 4) that maps components of the meta-model of data types to those of Ecore meta-models.

Fig. 1
figure 1

Overview of the transformation method

We use the mapping between the constructs of the two meta-models to define the reverse direction transformation rules to ensure the bidirectionality of the transformations. Bidirectionality [22] is one of the desired options of MDE-based transformations. Indeed, assuming we start from a source model \(M_S\), then we perform a transformation using a function \(f\) to get a target model \(M_T\). It is important to derive an equivalent model to \(M_S\), as a result to the application of \(f^{-1}\) on \(M_T\). Such a feature requires more restrictions on the Ecore models. This transformation function is not automatically derived from \(f\), it is given in Sect. 5.

Introductory example: The use of the transformation rules will be illustrated in Sect. 6 with an example that is an excerpt of a real-life application. To give a flavour of the approach, we here present a tiny meta-model, namely, a finite-state automaton.

Figure 2 represents a data type description of an automaton, in this case written in the Caml language. Each automaton is then composed of a list of states and a list of transitions. Every state is composed of an integer value (for identifying the state) and two Boolean values (defining whether a state is an initial state and/or a final state). A transition is then described by two states: a source and a target. Figure 3 consists in the representation of the same automaton as a meta-model in Ecore. This meta-model represents the result of applying our transformation on the presented data types.

Fig. 2
figure 2

Automaton data types (in Caml)

Fig. 3
figure 3

Automaton meta-model (in Ecore)

3.2 The Data type Meta-Model

Functional programming supplies us with a rich way to describe data structures. However, since some features are not supported by Ecore (see Sect. 4.1 for a discussion), we have only defined a subset that contains the essential element composing data types. Figure 4 depicts the data type meta-model that is constructed from the subset of data type’s declarations grammar presented in Fig. 5 [17, 18]. We point out that we are mainly interested in data structures. They correspond to the static part of the proofs. Except for the case of accessors, the functions are not treated.

Fig. 4
figure 4

Data type meta-model

Fig. 5
figure 5

Part of the Caml data types grammar

A Module may contain several Type Definitions. Each Type Definition has a Type Constructor, which corresponds to the data type’s name. It is also composed of at least one Constructor Declaration. These declarations are used to express variant types: a disjoint union of types. A Type declaration has a name, it is the name of a particular type case. It takes as argument some (optional) type expressions which can either represent a Primitive Type (int, bool, float, etc.) or also a data type defined previously in the Module. The list notation introduces the predefined data structure for lists. The type option describes the presence or the absence of a value. The ref feature is used for references (pointers).

We enriched the type definition grammar with a specific function named Accessor (see Fig. 6). It is introduced by the annotation (*@accessor*). It allows assigning a name to a special field of the type declaration. This element is essential for the transformation process, its absence would lead to nameless structural features.

Fig. 6
figure 6

Syntax of accessor functions in Caml

Representing Generic Types in Functional Programming: Parametrized types are important features in functional programming. They are used to express polymorphic data structures. They are comparable to generics in Java and templates in C++. They permit to build different data structures that accept any kind of values. Each definition of a parametrized type is formed of a Type Constructor and a set of Type Parameters. The type expressions then can contain a previously defined parametrized type or one of the specified parameters.

3.3 The Ecore Meta-Model

Our destination meta-model is a subset of the Ecore meta-model. Ecore is the core language of EMF [5], which permits to build Java applications based on model definitions and to integrate them as Eclipse plug-ins.

The Meta Object Facility (MOF) set by the OMG defines a subset of UML class diagrams [11]. It represents the meta-meta-model of UML. Ecore is comparable to MOF but simpler. They are similar in their ability to specify classes, structural and behavioral features, inheritance and packages.

To implement our approach, we use Eclipse and its core language Ecore. However, it would be possible to choose other solutions [16]. This choice is due to the place of Eclipse for meta-modeling and development, in particular it offers a wide range of highly integrated tools.

Figure 7 represents a subset of the Ecore language. It contains essentially the elements needed for our transformation process. Its main components are:

  • The EPackage is the root element in serialized Ecore models. It encompasses EClasses and EDataTypes.

  • The EClass component represents classes in Ecore. It describes the structure of objects. It contains EAttributes and EOperations.

  • The EDataType component represents the types of EAttributes, either predefined types (Integer, Boolean, Float, etc.) or defined by the user. There is a special data type to represent enumerated types EEnum

  • EReferences is comparable to the UML Association link. It defines the kinds of the objects that can be linked together. The containment feature is a Boolean value that makes a stronger type of relationships. When it is set to true, it represents a whole/part relationship.

Fig. 7
figure 7

Simplified subset of the Ecore meta-model

Representing Generics: Ecore has been extended to support parametric polymorphism. Actually, parametrized types and operations can be specified, and types with arguments can be used instead of regular types. The changes are represented in the Ecore meta-model mainly in two new classes EGenericType and ETypeParameter (they are distinguishable from the others on the Fig. 7 by the green color). A parametrized type is then represented by a simple EClass that contains one or more ETypeParameters. An EGenericType represents an explicit reference to either an EClassifier or an ETypeParameter (but not both at the same time). The eTypeArguments reference is used to contain the EGenericTypes representing the type parameters.

4 From data types to Meta-Models

This part details the automatic translation from functional data types to meta-models. It represents the first direction of translation. To precisely define transformation rules, the transformation method is presented in a formal notation in the form of a function noted Tr(). The transformation rules are presented as sub-functions relative to the component given as input. In each rule definition, we start by an informal description, then we present it formally and finally we show an effective example. The Table 1 summarizes the mappings performed between the elements of the two meta-models.

Table 1 Correspondence between elements of the two meta-models
$$\begin{aligned} Tr: DataTypes \longrightarrow Ecore \; Meta\hbox {-}model \end{aligned}$$

The following translation sub-functions are given for a concrete syntax in the style of Caml [17]. Since most functional languages (including the language of proof assistants) have great similarities, the concrete syntax can be mapped to different functional languages.

4.1 Well-Formedness Constraints for Input Data Types

Our translation does not treat all the features typically present in functional programming languages such as Isabelle and Caml. The primary reason is that some features which are specific to functional programming have no counterpart in Ecore. This is particularly true of higher-order constructors, ı.e, constructors taking functions as arguments.

Even though we allow reference types (stateful programming can be simulated in functional programming by a monadic style), we exclude some mutable data structures, in particular arrays. Also, for now, we have not implemented a treatment for mutually recursive types, except for the list, reference and option type constructors. Genuine mutual recursion considerably complexifies the transformation procedure, but apart from the exceptions mentioned, only occurs rarely in practice.

However, as presented previously in Sect. 3.2, we treat primitive types (integers, Booleans, floats, strings) and user defined data types, represented in variant types. We allow the use of parametrized types and our subset detects the use of Caml keywords introducing lists, references and type option.

4.2 Rule ModuleToEPackage

In ML programs (respectively in the Isabelle proof assistant), it is possible to group portions of programs into modules. We decided to represent these modules by EPackages in Ecore. They are used to gather EDataTypes and EClasses. Thus, the transformation process consists of creating an EPackage for each module. The name of the corresponding EPackage is the module name. We have to also specify the prefix and the URI of the XML namespace by instantiating the NsPrefix and NsURI values. To translate the data types contained in the module, we call the function \(Tr_{dtp}()\) for each type definition.

$$\begin{aligned}&Tr_{module}( Module~ md\_name ~Dtp_1 \ldots Dtp_n)=\\&\qquad \qquad createEPackage();\\&\qquad \qquad setName(md\_name);\\&\qquad \qquad setNsPrefix(md\_name);\\&\qquad \qquad setNsURI(``http ://md\_name/1.0\hbox {''});\\&\qquad \qquad Tr_{dtp} (Dtp_i)~~~~~~ /\; 1 \le i \le n \end{aligned}$$

4.3 Rule DatatypeToEClass

This rule is applied when the data type is formed of only one constructor. The latter is translated into an EClass. The EClass name is the name of the type constructor. The types composing the data type are translated using other rules (PrimitivTypeToEAttribute or TypeToEReference).

$$\begin{aligned} \begin{array}{ll} Tr_{dtp}( tpConstr = cn \; t_1 \ldots t_n)= &{} createEClass();\\ &{} setName(tpConstr);\\ &{} Tr_{type} ( acc_i, t_i) \\ &{} /\; 1 \le i \le n\\ \end{array} \end{aligned}$$

Example:

figure a

4.4 Rule DatatypeToEEnum

Data types composed only of constructors (without type expressions typexpr) are translated into EEnums, which are usually employed to model enumerated types in Ecore. Then, each constructor composing the data type is translated into a literal named EEnumLiteral. The name of each constructor becomes the name of a literal.

$$\begin{aligned} \begin{array}{ll} Tr_{dtp}(tpConstr = cn_1| \ldots |cn_p) = &{} createEEnum();\\ &{} setName(tpConstr);\\ &{} Tr_{constrNm} (cn_i);\\ &{}/\; 1 \le i \le p \\ Tr_{constrNm}(cn_i) = &{} EEnumLiteral(cn_i); \\ &{} /\; 1 \le i \le p\\ \end{array} \end{aligned}$$

Example:

figure b

4.5 Rule DatatypeToEClasses

When constructor declarations are composed of more than one constructor declaration containing type expressions: a first EClass is created to represent the type constructor (tpConstr). Then, for each constructor, an EClass is created too, and inherits from the tpConstr one. To transform the type expressions of each constructor, we call the functions for translating the type expressions.

$$\begin{aligned} \begin{array}{ll} Tr_{dtp}( tpConstr = cd_1| \ldots |cd_n) = &{} createEClass();\\ &{} setName(tpConstr);\\ &{} Tr_{decl} (cd_i,tpConstr) \\ &{} /\; 1 \le i \le n\\ \end{array} \end{aligned}$$
$$\begin{aligned} Tr_{decl}: ConstructorDeclaration&\longrightarrow EClass \\ \end{aligned}$$
$$\begin{aligned} \begin{array}{ll} Tr_{decl} (cn_i \; t_1 \ldots t_m,tpConstr) =\\ \qquad \qquad \qquad createEClass();\\ \qquad \qquad \qquad setName(cn_i);\\ \qquad \qquad \qquad setSuperType\;(EClass(tpConstr));\\ \qquad \qquad \qquad Tr_{type}(acc_j,t_j) ~~~~~~~/\; 1 \le j \le m\\ \end{array} \end{aligned}$$

Example:

figure c

4.6 Rule PrimitiveTypeToEAttribute

If a type expression is formed of a primitive type, the translation function generates a new EAttribute. The name of this EAttribute is the name of its corresponding accessor, and its type is the EMF representation of the primitive type : EInt for int, EBoolean for bool, EString for string, etc.

$$\begin{aligned} \begin{array}{lcl} Tr_{type}: ( accessor,type)&{} \longrightarrow &{} EStructualFeature \\ Tr_{type}(acc,primTp) &{} = &{} createEAtrribute();\\ &{} &{} setName(acc); \\ &{} &{} setType(primTp_{EMF});\\ \end{array} \end{aligned}$$

Example: Same example is presented in Sect. 4.3.

4.7 Rule TypeToEReference

When a type expression contains a type which is not a primitive type, the latter has to be previously defined in the Isabelle theory. Then, a containment link is created between the current EClass and the EClass referenced by the type constructor, and the multiplicity is set to 1.

$$\begin{aligned} Tr_{type}: ( accessor,type)&\longrightarrow EStructualFeature \\ Tr_{type}(acc,tpConstr)&= createEReference();\\&setName(acc);\\&setType\;(tp\_constr);\\&setContainment\;(true); \\&setLowerBound(1);\\&setUpperBound(1); \\ \end{aligned}$$

Example:

figure d

4.8 Rule TypeOptionToMultiplicity

The type expressions can also appear in the form of a type list. In this case the multiplicity is set to 0...*. The type expression type option is used to express whether a value is present or not. It returns None, if it is absent and Some value, if it is present. This is modeled by changing the cardinality to 0...1.

$$\begin{aligned}&Tr_{type}: ( accessor,type) \longrightarrow EStructualFeature \\&Tr_{type}(acc, t \; list) = Tr_{type}(acc, t) \\&\qquad \qquad \qquad \qquad \qquad \quad setLowerBound(0);\\&\qquad \qquad \qquad \qquad \qquad \quad setUpperBound(*); \\&Tr_{type}(acc,t \; option) = Tr_{type}( acc, t) \\&\qquad \qquad \qquad \qquad \qquad \quad setLowerBound(0);\\&\qquad \qquad \qquad \qquad \qquad \quad setUpperBound(1); \\ \end{aligned}$$

Example:

figure e

The last case that we deal with is references (type ref). References are used to represent pointers in ML programming and Isabelle. They are translated into simple references without containment option in Ecore.

$$\begin{aligned} \begin{array}{ll} Tr_{type} (acc, t \; ref)= &{} Tr_{type}(acc, t ) \\ &{} setContainment(False);\\ \end{array} \end{aligned}$$

Example:

figure f

4.9 Rule AccessorToStructuralFeaturesName

This rule is spelled out to define how the \(accessor\_name\) is selected for naming a particular EStructuralFeature. Accessors are grouped in \(accssors\_list\). Each accessor structure is formed of an \(accessor\_name\), a \(constructor\) \(\_name\) and an integer value named \(index\). This index corresponds to the position of the type concerned by the accessor function.

The elements composing the accessor are used to select which (previously created) structural feature is concerned by the accessor. The accessor name is then used to name the selected EStructuralFeature.

The details of the process are given formally by the following representation.

$$\begin{aligned} \begin{array}{ll} Tr_{acc}: Accessor \longrightarrow EStructualFeature \\ Tr_{acc} (acc) = \\ ~~Tr_{acc}(acc\_name,constr\_nm, i) \\ ~~eCl\_list:= package.getEClassifier();\\ ~~select\_eCl:= eCl\_list.search\_by\_name(constr\_nm);\\ ~~eSF\_list:= select\_eCl.getEAllStructuralFeatures();\\ ~~select\_eSF.set\_Name(acc\_name); \end{array} \end{aligned}$$

Here is an example of transforming a data type description together with accessor functions into a class diagram represented in Ecore.

figure g

4.10 Transforming Generics

In case the data type definition is polymorphic, it is translated into the representation of generics in the meta-model. It consists in creating an EClass to represent the Type Constructor and for each type parameter creating an ETypeParameter related to the EClass via the eTypeParameters reference. Notice that we have to create an EGenericType for each class and type parameters (related to their EGenericType via the reference: eTypeArguments) each time we intend to use the EClass as a generic. Then, for each constructor declaration:

  • Create an EClass to represent the Constructor Declaration which has the same ETypeParameters as the Type Constructor one.

  • Setting its eGenericSuperType referring to the generic type representing the Type Constructor EClass.

When it comes to use these generics to type EStru- cturalFeatures, we are faced with two scenarios. First, when the type expression is a type parameter. The EStru- cturalFeature is typed with an EGenericType referring to the ETypeParameter of the containing EClass. If instead the type expression corresponds to a parametrized type with type parameters it is typed with an EGenericType representing the EClass with ETypePara- meters.

To clarify this process, we use the example below. It consists in transforming a parametrized tree data type. It has two parameters: the first corresponds to the type of leaves and the second to the type of values contained in a node. The result after performing the translation is displayed in the arborescent Ecore editor. The EGenericTypes are not explicitly represented in the EcoreDiagram.

Example:

figure h
figure i

5 From Meta-Models to Data types

In this section, we present the second direction of the translation: from meta-models into data structures used in functional programming. We start by defining some well-formedness conditions on the entry meta-model. Next, we detail one by one the different transformation rules. As in the previous section, transformation rules are presented in the natural language with a formal notation. To avoid overloading the notation, we use again the notation \(Tr()\) to represent the translation function.

5.1 Well-Formedness Constraints for Input Meta Model

To perform the reverse direction of the transformation, we draw heavily on the mapping performed on the forward translation (Sect. 4). In our view, it is important to successfully implement a function that is the inverse of the one from data type to meta-models. Indeed, it seems important to insure that the composition of the two opposite transformation functions gives identity, even if it leads us to impose some additional restrictions on the meta-model. In the forthcoming Ph.D thesis of the first author [8], these well-formedness constraints are spelled out in more detail.

The first restriction concerns the depth of inheritance relations: the transformation of a meta-model containing inheritance of classes on more than one level (a class that inherits from a class that inherits from another one etc.) is not supported by our rules.

The second restriction aims at avoiding mutually dependent data types. We therefore define a partial order \(\prec \) on classes for the transformation of EClassifiers contained in an EPackage. The EEnums have to be translated first, because they do not depend on any other elements. The EClasses left in the EPackage have then to be ordered using two criteria:

  • The inheritance relation: if an EClass \(C_1\) is a superType (used in Ecore for determining a super class) of another EClass \(C_2\), then \(C_1\) has to be translated before \(C_2\). We therefore add the constraint \(C_1 \prec C_2\).

  • The reference relation: if an EClass \(C_1\) is a target (eType in Ecore) of an EReference belonging to another EClass \(C_2\), then \(C_1\) has to be translated before \(C_2\), thus \(C_1 \prec C_2\).

This order allows us to define the second well-formedness criterion: the order \(\prec \) generated by the above two constraints has to be acyclic.

The last constraint we impose on the models is about inheritance and genericity. Indeed, if we have an inheritance relation between two generics (represented by EClasses with ETypeParameters), all the parameters used by the child class have to appear in the super class.

5.2 Rule EPackageToModule

The elements composing Ecore models can be gathered into EPackages. When we perform the translation from Ecore models to functional data type descriptions, we transform these packages into modules. The name of a particular EPackage gives the name of the module. The additional elements nsPrefix and nsURI are specific features of Ecore. They are not translated and not used in the functional description.

$$\begin{aligned} \begin{array}{ll} Tr (\mathtt{ePackage }~\mathtt{name }=p \\ \qquad \qquad \qquad \mathtt{nsPrefix }=pp\;\\ \qquad \qquad \qquad \mathtt{nsURI }=puri\;\\ \qquad \qquad \qquad \{ECl_1 ~\ldots ~ ECl_n\})=\\ \quad createModule();\\ \quad setName(p);\\ \quad Tr_{Cl}(ECl_i); \quad /\; 1 \le i \le n\\ \end{array} \end{aligned}$$

5.3 Rule EEnumToDatatype

Enumerated types are represented in Ecore by EEnums. To translate an EEnum, we first get all the EClassifiers contained in the EPackage, check for their instances and transform them to a data type definition composed of constructors without type expressions. Each EEnumLiteral is mapped to a constructor name.

$$\begin{aligned} \begin{array}{ll} Tr_{Cl}(\mathtt{eEnum }~ \mathtt{name }=e~ \\ \qquad \qquad \qquad \{ELit_1~\ldots ~ELit_n\}) = \\ \qquad \;createDatatype();\\ \qquad \;NewTp\_Constr();\\ \qquad \; setName(e);\\ \qquad \; Tr_{Lit} (ELit_i); /\; 1 \le i \le n\\ \end{array} \end{aligned}$$
$$\begin{aligned} \begin{array}{ll} Tr_{Lit} (\mathtt{literal }=l)=\\ \qquad \qquad createConstructor();\\ \qquad \qquad setName(l) ;\\ \end{array} \end{aligned}$$

Example:

figure j

5.4 Rule EClassToDatatype

The simplest case that we deal with is the one consisting in transforming a simple EClass which is not related with other EClasses by any inheritance link. In such a case, the EClass is translated into a single type definition without constructor declarations. The EClass name gives the type constructor name. Then, for each EStructuralFeature contained in the EClass, we call the appropriate sub-function: \(Tr_{SF}\) that stands for Translate Structural Feature.

$$\begin{aligned} \begin{array}{ll} Tr_{Cl} (\mathtt{eClass }~ \mathtt{name }=c\\ \qquad \qquad \qquad \{ESf_1~\ldots ~ESf_n\})=\\ \qquad \;if (c.is\_superType() ==false)\\ \qquad \;createDatatype(); \\ \qquad \;setName(c);\\ \qquad \;Tr_{SF} ( ESf_i); /\; 1 \le i \le n\\ \end{array} \end{aligned}$$

Example:

figure k

5.5 Rule EClassInheritenceToDatatype

This rule transforms an EClass hierarchy into a type definition. When we are faced with an EClass transformation, we first check if it is a SuperType of other classes. In such a case, we create a new data type definition named with the EClass name. Then, we select all the classes that inherit from this super class. For each of them, we apply the rule EClassToConstructor.

If the super class is a generic type (an EClass augmented with ETypeParameters) we call the function \(Tr_{prm} ()\) for every ETypeParameter.

$$\begin{aligned} \begin{array}{ll} Tr_{Cl} (\mathtt{eClass }~ \mathtt{name }=sClass\\ \qquad \qquad \qquad \{ESf_1~\ldots ~ESf_n\}\\ \qquad \qquad \qquad \{ETp_1~\ldots ~ETp_n\} )= \\ \qquad \;if (c.is\_superType() ==true)\\ \qquad \;createDatatype(); \\ \end{array} \end{aligned}$$
$$\begin{aligned} \begin{array}{ll} \qquad \;setName(sClass);\\ \qquad \;class\_list=select\_child\_classes;\\ \qquad \;Tr_{ch\_cl}(class_i) \quad /\; class_i \in class\_list\\ \qquad \;Tr_{prm}(ETp_i);~~~~~~ /\;1 \le i \le n \\ \end{array} \end{aligned}$$

5.5.1 Rule EClassToConstructor

Thanks to this rule, each (child) EClass is transformed into a constructor declaration in the corresponding type definition. First, a new constructor is created, the name of the constructor is the EClass name. Then for each EStructuralFeature contained in the EClass the function \(Tr_{SF}\) is called. The rules EAttributeToType (Sect. 5.6) and EReferenceToType (Sect. 5.7) are applied depending on the nature of the EStructuralFeature.

$$\begin{aligned} \begin{array}{ll} Tr_{ch\_cl}( (\mathtt{eClass }~ \mathtt{name }=c\\ \qquad \qquad \qquad \mathtt{eSuperType }=sClass\\ \qquad \qquad \qquad \{ESf_1~\ldots ~ESf_n\})=\\ \qquad \; setName(); \\ \qquad \; createConstructor(c);\\ \qquad \; Tr_{SF} (ESf_i);\quad /\; 1 \le i \le n\\ \end{array} \end{aligned}$$

The rule is applied in the same way when the super class is generic (in this case, we have eGenericSuperType instead of eGenericType).

$$\begin{aligned} \begin{array}{ll} Tr_{ch\_cl} (\mathtt{eClass }~\mathtt{name }=~c\\ \qquad \qquad \qquad \mathtt{eGenericSuperType }=sClass\\ \qquad \qquad \qquad \{ESf_1~\ldots ~ESf_n\})\\ \end{array} \end{aligned}$$

5.5.2 Rule ETypeParamaterToTypeParameter

Clearly, the ETypeParameters used in the representation of generics in Ecore are translated into their equivalents in functional programming: type parameters.

$$\begin{aligned} \begin{array}{ll} Tr_{prm}(\mathtt{eTypeParameter } \; \mathtt{name } = tp)=\\ \qquad \; createTypeParameter(); \\ \qquad \; setName(tp);\\ \end{array} \end{aligned}$$

Example:

figure l

5.6 Rule EAttributeToType

Transforming each EAttribute consists in creating a new type expression in the corresponding constructor declaration (or type definition). This corresponding element can be selected by name in the list of created data types. EAttribute’s type becomes the equivalent type in the functional language (using the transformation function \(Tr_{Type}\)).

To translate the upper and lower bounds, the function \(Tr_{Bnd} \) is called.

$$\begin{aligned} \begin{array}{ll} Tr_{SF}(\mathtt{eAttribute }\;\mathtt{name }= a \\ \qquad \qquad \qquad LowerBound\\ \qquad \qquad \qquad UpperBound\\ \qquad \qquad \qquad EType)= \\ createTypeExpression(Tr_{Type} (EType)); \\ Tr_{Bnd} (LowerBound,UpperBound); \\ Tr_{Type} (\mathtt{eType } = \mathtt{EInt })= \mathtt {int}\\ Tr_{Type} (\mathtt{eType } = \mathtt{EBoolean })= \mathtt {bool}\\ Tr_{Type} (\mathtt{eType } = \mathtt{EFloat })= \mathtt {float}\\ Tr_{Type} (\mathtt{eType } = \mathtt{EString })= \mathtt {string}\\ Tr_{Type} (\mathtt{eType } = \mathtt{eenum e })= e\\ \end{array} \end{aligned}$$

5.6.1 Rule EAttributeToTypeParameter

If the EAttribute is typed with an ETypeParameter (belonging to a generic type), it is translated (as the precedent case) into a type expression consisting of a type parameter. The name of the ETypeParameter becomes the name of the type parameter in the type expression.

$$\begin{aligned} \begin{array}{ll} Tr_{SF}( \mathtt{eAttribute }\;\mathtt{name } = a\\ \qquad \qquad \qquad LowerBound\;\\ \qquad \qquad \qquad UpperBound\;\\ \qquad \qquad \qquad \mathtt{eGenericType }\{ETp_1~\ldots ~ETp_n\})= \\ \qquad \; createTypeExpression(Tr_{TpPrm}(ETp_i));\\ \qquad \; /\; 1 \le i \le n\\ \qquad \; Tr_{Bnd} (\textit{LowerBound},\textit{UpperBound}); \\ Tr_{TpPrm} (\mathtt{eTypeParameter } \;\mathtt{name }= prm)=\\ \qquad \;createTypeExpression(prm)\\ \end{array} \end{aligned}$$

Example:

figure m

5.7 Rule EReferenceToType

To translate an EReference (pointing to an EClass \(c\)), we first create (as in the precedent rule) a new type expression in the corresponding constructor declaration (or type definition). This type expression is then represented by the name of the EClass to whom is targeting the EReference. The name of this EClass corresponds to a previously translated type definition.

To translate the multiplicities and the containment values, we call respectively the functions \(Tr_{Bnd} \) and \(Tr_\mathrm{cont}\).

$$\begin{aligned} \begin{array}{ll} Tr_{SF}( \mathtt{eReference }\;Containment\\ \qquad \qquad \qquad \mathtt{name } = r\\ \qquad \qquad \qquad LowerBound\\ \qquad \qquad \qquad UpperBound\\ \qquad \qquad \qquad \mathtt{eType }= c)= \\ \qquad \; createTypeExpression (c ~Tr_\mathrm{cont}(Containment)); \\ \qquad \; Tr_{Bnd} (LowerBound,UpperBound); \\ \end{array} \end{aligned}$$

If the Boolean value “containment” is set to False, the translated type expression is augmented with keyword ref. ref is used to represent pointers in functional languages.

$$\begin{aligned} \begin{array}{ll} Tr_\mathrm{cont}(\mathtt{containment }=\mathtt{false })= ref ;; \end{array} \end{aligned}$$

5.7.1 Rule EReferenceToParametrizedType

When the EReference target is a generic type (in the shape of an EClass augmented with type parameters), a new type expression in the corresponding constructor declaration (or type definition) is created to represent the EClass. Next, to each ETypeParmeter related to the EClass a type parmater is created in the type expression.

$$\begin{aligned} \begin{array}{ll} Tr_{SF}( \mathtt{eReference }\;\mathtt{name } = a\\ \qquad \qquad \qquad LowerBound\\ \qquad \qquad \qquad UpperBound\; \\ \qquad \qquad \qquad \mathtt{eGenericType }\;\mathtt{name } = genTp\\ \qquad \qquad \qquad \qquad \qquad \qquad \{ETp_1~\ldots ~ETp_n\})=\\ \qquad \;createTypeExpression(genTp)\\ \qquad \;createTypeParameters(prm_i)\;/\; 1 \le i \le n\\ \\ Where ETp_i has the form: \\ ETp_i=\! (\mathtt{eTypeParameter }\, \mathtt{name }\!=\! prm_i)/ 1 \!\le \! i \!\le \! n\\ \end{array} \end{aligned}$$

Example:

figure n

5.8 Rule MultiplicitiesToTypeOptions

This rule permits to translate multiplicity values contained in structural feature definitions (represented by an upper and a lower bound). They are used to determine the number of features that composes an instance. When the lowerBound is 0 and the upperBound is set to 1, this signifies that in the instance this EStructuralFeature might be present or absent. These values are translated into type option in the type expressions.

If the upperBound is represented by a * this implies the ability of creating more than one instance of the concerned EStructuralFeature. It is mapped to the type list in the data type description.

$$\begin{aligned}&Tr_{Bnd} (\mathtt{lowerBound =``0'' }, \mathtt{upperBound =``1'' })= \mathtt{option };\\&Tr_{Bnd} (\mathtt{lowerBound =``0'' }, \mathtt{upperBound =``*'' }) = \mathtt{list };\\ \end{aligned}$$

Example:

figure o

6 Case Studies

To evaluate our approach we have applied our method on case studies that combine our transformation with the generation of tools for graphical/textual syntaxes. More details can be found in [8]. For the lack of space, we here illustrate our approach with a part of a description of a DSL.

This DSL is a Java-like language enriched with assertions developed by ourselves for which no off-the-shelf definition exists. It represents a real-time dialect of the Java language allowing us to carry out specific static analyses of Java program (details are described in [2]), where a formal semantics of Java is defined in the Isabelle proof assistant. Because of this application context, we do not use a Caml grammar in this case. Our approach is implemented using the Eclipse environment.

Performing the translation for the whole language description would generate a huge meta-model that could not be presented in the paper. We thus choose to present only an excerpt of it, corresponding to a method definition. Figure 8 shows a data type taken from the Isabelle theory where the verifications were performed. A method definition (in our DSL) is composed of a method declaration, a list of variables, and statements. Each method declaration has an access modifier that specifies its kind. It also has a type, a name, and some variable declarations. The stmt data type describes the statements allowed in the method body: assignments, conditions, sequence of statements, return and the annotation statement (for time annotations). In this example we use Booleans, integers, strings for types and values.

Fig. 8
figure 8

Data types in Isabelle

This part of the Isabelle theory was given as input to the implementation of our translation rules presented in Sect. 4. The resulting Ecore diagram is presented in Fig. 9. As it is shown on the figure, data type definitions built only of type constructors (Tp, AccModifier, Binop, Binding) are treated as enumerations in the meta-model. Whereas Data type MethodDecl composed of only one constructor derive a single class. As for type expressions that represent a list of types (like accModifier list in varDecl), they generate a structural feature in the corresponding class and their multiplicities are set to (0...*). The result of type definitions containing more than one constructor and at least a type expression (stmt and expr) is modeled as a number of classes inheriting from a main one. Finally, the translation of the int, bool and string types is straightforward. They are mapped to respectively EInt, EBoolean and EString.

Fig. 9
figure 9

Resulting Ecore diagram after transformation

7 Conclusion

Our work constitutes a first step towards a combination of interactive proof and MDE. We have presented an MDE-based method for transforming data type definitions used in proof assistants to class diagrams and back again, using bidirectional transformations.

The approach is illustrated with the help of a DSL developed by ourselves. It is a Java-like language enriched with annotations. Starting from data type definitions, set up for the semantic modeling of the DSL, we have been able to generate an EMF meta-model. The generated meta-model is used for documenting and visualizing the DSL, it can also be manipulated in the Eclipse workbench to generate a textual editor as an Eclipse plug-in.

We are working on coupling our work with the generation of provably correct object-oriented code from proof assistants.