Tool Support for Functional Graph Rewriting with Persistent Data Structures - GrapeVine

Weber, Jens H.

doi:10.1007/978-3-031-09843-7_11

Jens H. Weber ORCID: orcid.org/0000-0003-4591-6728⁹

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13349))

Included in the following conference series:

International Conference on Graph Transformation

387 Accesses
3 Citations

Abstract

Existing graph transformation (GT) tools treat graphs as ephemeral data structures, i.e., the successful application of a GT rule to a graph G rewrites that graph to produce a modified graph \(G'\). The original graph G is lost during that update. In contrast to ephemeral data structures, persistent data structures preserve access to all previous versions when data is modified and fully persistent data structures even allow all previous versions to be modified. In earlier work, we introduced the Graph Rewriting and Persistence Engine Grape as a tool for specifying and executing transformations on large-scale graphs and integrated it with a computational notebook platform (GrapePress). While the term “persistence” has been in the tool’s acronym from the start, it was chosen to indicate that graphs were maintained in a database with transactional support. Until now, Grape (and GrapePress) treated graphs as ephemeral data structures, i.e., previous graph versions were not retained upon modification. This paper presents a major revision of the tool (called GrapeVine) to support functional graph rewriting based on a fully persistent data structure.

Access provided by Autonomous University of Puebla. Download conference paper PDF

GRAPE – A Graph Rewriting and Persistence Engine

GMTE: A Tool for Graph Transformation and Exact/Inexact Graph Matching

Model-Driven Engineering of an OpenCypher Engine: Using Graph Queries to Compile Graph Queries

Keywords

1 Introduction

Graphs and graph transformation (GT) systems have been applied to a variety of problems with industrial relevance [12, 17,18,19]. GT tools play an important role in making this feasible. A considerable number of such tools have been developed over the last decades [1, 15]. Some of these tools aim to closely implement results from GT theory, e.g., results from algebraic graph rewriting [16, 20]). Other tools are more concerned with incorporating a notion of GTs with practical applications, e.g., software design [10] and verification [13]. However, one aspect where all current GT tools fall short of reflecting mathematical theory is that they maintain graphs in ephemeral data structures, i.e., graphs are maintained in stateful objects that are updated when GT rules are applied and previous version of graphs are lost. The use of ephemeral data structures for maintaining graphs is particularly limiting for example when an application needs to “look back” into graph derivation histories [8] or if it needs to perform some type of reasoning about graph processes [4]. Significant extensions of GT theory, such as the definition of transactional graph transformation systems as processes [2], cannot readily be implemented with current GT tools. From a more mundane perspective, it is well known from the programming languages and software engineering domains that stateless computation avoids complexity and is easier to understand and test.

All data in functional programming is immutable. Under the hood, this is implemented with persistent data structures which, in contrast to ephemeral data structures, retain all previous versions upon update [9]. A data structure is partially persistent if all versions can be accessed but only the latest version can be modified. Data structures where all versions can be modified are referred to as fully persistent.

In this paper, we present GrapeVine, a GT tool that maintains graphs in a fully persistent data structure and treats graphs as immutable data objects. GrapeVine is based on our earlier work on Grape (the Graph Rewriting And Persistence Engine) [21] and its integration with computational notebook technology (GrapePress) [22]. Like other tools, Grape (and GrapePress) uses an ephemeral data structure to store graphs, i.e., graphs are destructively rewritten “in-place”. (The term “persistence” in the tool’s name was originally chosen to indicate that data structure was persisted in a database with transactional support.) While GrapeVine is based on these earlier works, it constitutes a complete reimplementation of its core engine and data structures to achieve support for truly functional graph rewriting.

The rest of this paper is structured as follows: We discuss related work in the following section. Section 3 provides an overview of GrapeVine and its new persistent data structure. Section 4 comments on the tool demonstration provided in the appendix and Sect. 5 offers concluding remarks and an outlook to future work.

2 Related Work

Our current work on GrapeVine can be seen as a major revision of our earlier work on Grape [21] and GrapePress [22], with the fundamental difference that GrapeVine uses a “functional programming” paradigm and treats graphs as immutable objects. Historically, the name Grape referred to the GT engine and the domain-specific language for defining and controlling GTs, while the name GrapePress was used to refer to the computational notebook platform that integrates with Grape. While GrapeVine inherited the same architecture (i.e., its language and engine can be used without the computational notebook platform), we use a single name for the tool, going forward. Similarly, we will use the name GrapePress in this paper to also include the Grape engine.

GrapeVine has the ability to concurrently explore and compare many (all) possible derivations of arbitrarily many graphs. This feature is related to functionality provided by the GROOVE tool, which provides support for state-space exploration of GT systems [13]. However, GROOVE does not persist different versions of a graph but only fingerprints for the purpose of detecting collisions during model-checking (which is the primary function of graph process exploration in GROOVE).

The functionality provided by the fully persistent data structure implemented in GrapeVine resembles that of model versioning and indexing system like Hawk [3]. However, those approaches are primarily used for model management but they are not integrated with the computational model of the transformation tool. In contrast, GrapeVine realizes a truly functional computation model based on its fully-persistent graph data structure.

The data structure implemented in GrapeVine was inspired by the theoretical concept of a graph process, as defined by Corradini et al. [4], which is an a graph of graph transformation occurrences. Moreover, transaction-handling in GrapeVine is based on graph processes, as proposed by Baldan et al. [2].

3 GrapeVine Concepts

3.1 Overview

A core objective in the design of GrapeVine (and its predecessor GrapePress) has been to make it easy to integrate GT programming with common software engineering tools and processes. As such, GrapeVine was developed as an internal domain-specific language (DSL) to a general purpose programming language (Clojure). Regular tools like text editors, IDEs and configuration management systems can be used, without the need to install and learn a particular graphical tool for GT development and execution [21]. GrapeVine comes integrated with an optional computational notebook user interface, which provides graphical visualization of rules, graphs and graph history [22]. GrapeVine computational notebooks are an excellent way to quickly explore and document GT-based computations and systems. Notebook worksheets can be shared as “executable” papers, but they can also be saved as regular program code, to be used as part of larger software applications.

Under the hood, GrapeVine uses the Neo4J graph database management system, which makes it highly scalable to ultra-large graphs. GrapeVine is available as a Docker image and installs with a single command.

Graph Model. The tool uses directed, attributed, node- and edge-labeled (danel) graphs. Graphs do not need to be typed but it is possible to define constraints on graphs, which are enforced whenever rules are applied. GrapeVine comes with a number of predefined constraint types (e.g., for allowed node and edge types, cardinalities, and uniqueness of attribute values) but also allows users to define complex, user-defined constraints based on Oreja et al.’s logic of graph constraints [11].

Graph Transformation Rules. In the previous version of the tool (GrapePress), each GT rule could be defined with either single-pushout (SPO) or double-pushout (DPO) semantics [6]. The main difference concerned the way how any “dangling” edges would be handled during the execution of rules that delete nodes. SPO rules would delete such dangling edges, while the application of a DPO rule is simply not permitted in a context that would cause dangling edges to arise. GrapeVine has removed this choice and adopted SPO semantics throughout. We found that the complexities caused by providing this option (both in terms of tool implementation as well as in terms of reasoning about such “mixed” GT systems) are not justified, since the behaviour of DPO rules can be simulated in the SPO approach, given the availability of additional mechanisms, like negative application conditions.

The user can still choose between homomorphic and isomorphic matching semantics for each rule. Rules can be equipped with applications conditions (i.e., conditions that must be true for a rule to be applicable) [5] and negative application conditions (i.e., conditions that prevent rule application) [7].

Rules can be parameterized and rule parameters can be used to define or restrict the labels and attributes of graph elements. The previous version of the tool (GrapePress) treated attributes as variables and therefore also provided an assignment operator to change their values. The concept of variables no longer applies to the functional computation paradigm implemented in GrapeVine. Since graphs are immutable, attributes also have that property. This means that GTs that seek to “modify” attributes of a graph element (node or edge) need to replace that graph element with a new graph element of the same type, while copying the unchanged attributes and (re)defining the “changed” ones.

Control Structures. Since GrapeVine rules are defined with an internal DSL to a general purpose programming language (Clojure), the control structures of the host language are available for programming with GTs. The previous version of the tool (GrapePress) also provided a set of dedicated control structures (e.g., atomic blocks, loops, non-deterministic choice), which could be used to define composite transactions with ACID properties and backtracking for dealing with non-determinism during rule applications [21]. Given the paradigm shift to a purely functional model of computation in GrapeVine, these control structures have been revised completely. All GrapeVine control structures are defined as functions on graph sets rather than graphs. This allows the definition of deterministic operators for rule application. (Non-deterministic operators are also still available.) GrapeVine therefore no longer needs complex backtracking mechanisms to “undo” unsuccessful derivations. Similarly, there is no longer a need for a dedicated transaction manager to achieve ACID properties for programmed transformation units, since unsuccessful (partial) execution of such units can simply be “forgotten” [2].

Graph Queries. Graph queries are pattern-based matches for the purpose of returning data from the graph or testing the existence of conditions. The previous version of the tool (GrapePress) did not have a dedicated syntactic form for graph queries; graph queries were merely defined as rules that do not alter the graph. Instead, GrapeVine has a dedicated syntax for defining graph queries and enforces their read-only semantics. This is important because, in contrast to rule applications, the application of queries does not generate a new graph.

3.2 A Fully-Persistent Data Structure for Functional Graph Rewriting

There are several choices that can be made when designing a fully persistent data structure. Of course, there is always the naïve approach of copying a data item upon modification. However, that approach makes sense only for relatively small data items. Since in our application, each occurrence of a GT creates a new graph, one choice is to record only the differences from the “previous” graph in our data structure. We haven chosen that option and Fig. 1 presents the model that was implemented for this purpose.

Nodes of type Graph reference the graph elements (nodes and edges) that were created or deleted when the “last” transformation occurred. We also record which graph elements where read (preserved) when a transformation occurred. While the latter is not needed for implementing a versioned data structure, we capture this data for the purpose of using the tool for reasoning about the provenance of applied transformation rules.

Graphs are uniquely identified by a globally unique hash code (id). Graphs are arranged in a partial order, which is induced by the previous relationship. Computing the elements of a given graph G simply requires retrieving all elements that have been created by any graph in the history of G minus those elements that have been deleted in G’s history. This query can efficiently be executed in the Neo4J graph database. The Neo4J query language is well understandable and we provide it below instead of creating our own mathematical formulation. The query looks up a graph (g) with the given id and collects all graphs gs in its history by repeatedly traversing previous edges. The query then collects all graph elements e that have a created edge from any node in gs without also having a deleted edge from any node in gs.

Now that we have described the design chosen for implementing the persistent data structure for graphs in GrapeVine, we briefly want to comment on a disadvantage of the chosen approach, i.e., GrapeVine graphs are no longer represented “varbatim” in the Neo4J graph database. While the previous version of the tool (GrapePress) was able to operate on any any graph in a Neo4J database, independently of its origin, GrapeVine now requires graphs to conform to the meta-model in Fig. 1. For example, we note that GrapeVine now represents graph edges as Neo4J graph nodes, to allow incoming provenance edges from Graph nodes. (Neo4J does not support hypergraphs.)

We could have represented GrapeVine edges directly as edges in the Neo4J database if we replaced the provenance edges (create, delete, read) by “foreign key” attributes (createdBy, deletedBy, readBy) on edges. (Neo4J allows attributes on edges.) These attributes could store the IDs of the corresponding graphs (that created, deleted or read the edge), respectively. However, such a design would be inefficient as each graph lookup would then involve value-based “join” operations on potentially large sets of elements. Moreover, the benefit of such a design would be small, as other external tools accessing the graph database directly would still need to use a projection to assemble a concrete graph from its history. The only alternative that would avoid the need for such a projection would be the naïve approach of copying the graph upon each change, which does not appear scalable.

As mentioned in the previous section, GrapeVine no longer requires a sophisticated transaction manager like the one used on the previous (stateful) version of the tool (GrapePress). Rather, GrapeVine implements transactions based on the notion of graph processes. [2] Unsuccessful or incomplete executions of composite GT programs can simply be “forgotten”. Of course, from a practical point of view, these graphs would still fill up the database over time. GrapeVine therefore provides a mechanism to purge them. Any graph of interest can be “committed” to the database by tagging it with a commit tag (ctag) in Fig. 1. Conversely, GrapeVine offers a rollback operation that deletes all graphs that are not in the history of any committed graph.

4 A Taste of Interacting with GrapeVine

Like the previous version of the tool, GrapeVine can be used as a library in software programming projects or it can be used as a stand-alone tool with its integrated computational notebook. The demo in the appendix uses the latter.

At first glance, GrapeVine looks similar to the previous version of the tool (GrapePress). Worksheets consist of sequences of static segments and dynamic segments. Dynamic segments consist of executable code and a display for the output of the computation. A major advantage of the functional (side-effect free) computational model in GrapeVine is, however, that it is now much easier to create idempotent dynamic segments. In the previous version of the tool, code segments needed to be executed “in the right order” to produce the desired output result. For example, if a graph was rewritten by a graph transformation once, applying that transformation another time would usually result in a different output.

The demo shows how using sets of graphs rather than single graphs as a data type for computation simplifies the composition of (GT) operations and allows for the definition of a deterministic operator for rule application (which simply return the set of all possible direct derivations). Another advantage of this choice of data type is that unsuccessful rule applications simply return an empty set rather than requiring special handling of failure.

Finally, the demo illustrates how GrapeVine’s data structure maintains the derivation history of all graphs as a graph of graphs. We demonstrate how such history graphs can be reflected as regular GrapeVine graphs and how the tool supports visualizing and reasoning about derivation histories.

5 Conclusions and Future Work

The fact that current GT tools maintain graphs in ephemeral data structures limits their usefulness in practice. Applications that require knowledge of the derivation history of graphs cannot readily be supported. Current tools do not support computing with and reasoning about graph derivation history. Even in applications where this is not needed, stateful computation is harder to understand and test. Stateless computation is also preferrable when working with GrapeVine as a computational notebook. Users of the predecessor tool had to remember to execute code segments in worksheets “in the right order” since GTs had side-effects on the shared graph and few code segments would be idempotent.

Finally, from a tool developer’s point of view, the associated stateful computation model is complex and prone to problems. For example, the transaction manager developed for the (stateful) prior version of this tool (GrapePress) was by far the most complex module of the tool. Removing it from GrapeVine was liberating, but also painful because of all the work that had gone into it.

Our current plan for evolving GrapeVine has two near-term objectives. Firstly, we will evolve the persistent data structure to make it confluently persistent. A data structure is called confluently persistent if there is a “merge” operation to join two branches originating from a common version [9]. While we did not talk about it in this paper, GrapeVine already has an operator to filter out “duplicate” graphs in a graph set (up to isomorphism). That operator is implemented along a similar mechanism as described by Rensink [14]. We are working on using this operator as a basis for defining a “merge” operation for GrapeVine’s data structure. Our second objective is to integrate theoretical results for verifying properties of graph transformation systems, in particular with respect to the confluence of rule sets, i.e., critical pair analysis.

References

Aouat, A., Bendella, F., Deba, E.a.: Tools of model transformation by graph transformation. In: IEEE CSAE 2012, pp. 425–428 (2012). https://doi.org/10.1109/ICSESS.2012.6269495
Baldan, P., Corradini, A., Foss, L., Gadducci, F.: Graph transactions as processes. In: Corradini, A., Ehrig, H., Montanari, U., Ribeiro, L., Rozenberg, G. (eds.) ICGT 2006. LNCS, vol. 4178, pp. 199–214. Springer, Heidelberg (2006). https://doi.org/10.1007/11841883_15
Chapter MATH Google Scholar
Barmpis, K., Kolovos, D.: Hawk: towards a scalable model indexing architecture. In: Proceedings of the of ACM BigMDE 2013 (2013). https://doi.org/10.1145/2487766.2487771
Corradini, A., Montanari, U., Rossi, F.: Graph processes. Fundamenta Informaticae 26(3, 4), 241–265 (1996). https://doi.org/10.3233/FI-1996-263402
Ehrig, H., Habel, A.: Graph Grammars with Application Conditions, pp. 87–100. Springer, CHam (1986). https://doi.org/10.1007/978-3-642-95486-3_7
Ehrig, H., et al.: Algebraic approaches to graph transformation - Part II: single pushout approach and comparison with double pushout approach. In: Handbook of Graph Grammars and Computing by Graph Transformation, pp. 247–312 (1997). https://doi.org/10.1142/9789812384720_0004
Habel, A., Heckel, R., Taentzer, G.: Graph grammars with negative application conditions. Fundamenta Informaticae 26(3, 4), 287–313 (1996). https://doi.org/10.3233/FI-1996-263404
Jahnke, J.H., Schäfer, W., Wadsack, J.P., Zündorf, A.: Supporting iterations in exploratory database reengineering processes. Sci. Comput. Program. 45(2–3), 99–136 (2002). https://doi.org/10.1016/S0167-6423(02)00056-4
Article MATH Google Scholar
Kaplan, H.: Persistent data structures. In: Handbook of Data Structures and Applications, pp. 511–527. Chapman and Hall/CRC (2018)
Google Scholar
Nickel, U., Niere, J., Zündorf, A.: The fujaba environment. In: Proceedings of ICSE, pp. 742–745 (2000). https://doi.org/10.1145/337180.337620
Orejas, F., Ehrig, H., Prange, U.: A logic of graph constraints. In: Fiadeiro, J.L., Inverardi, P. (eds.) FASE 2008. LNCS, vol. 4961, pp. 179–198. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-78743-3_14
Chapter Google Scholar
Pfaltz, J.L., Nagl, M., Bohlen, B.: AGTIVE 2003, vol. 3062 LNCS. Springer (2004)
Google Scholar
Rensink, A.: The GROOVE simulator: a tool for state space generation. In: Pfaltz, J.L., Nagl, M., Böhlen, B. (eds.) AGTIVE 2003. LNCS, vol. 3062, pp. 479–485. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-25959-6_40
Chapter Google Scholar
Rensink, A.: Isomorphism checking in groove. Electr. Commun. EASST 1 (2007). https://doi.org/10.14279/tuj.eceasst.1.77
Rensink, A., Van Gorp, P.: Graph transformation tool contest 2008. Int. J. Softw. Tools Technol. Transfer 12(3), 171–181 (2010). https://doi.org/10.1007/s10009-010-0157-7
Article Google Scholar
Runge, O., Ermel, C., Taentzer, G.: AGG 2.0 – new features for specifying and analyzing algebraic graph transformations. In: Schürr, A., Varró, D., Varró, G. (eds.) AGTIVE 2011. LNCS, vol. 7233, pp. 81–88. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-34176-2_8
Chapter Google Scholar
Schürr, A., Varró, D., Varró, G. (eds.) AGTIVE 2011. LNCS, vol. 5088 LNCS. Springer (2008)
Google Scholar
Schürr, A., Rensink, A.: Software and systems modeling with graph transformations theme issue of the Journal on Software and Systems Modeling. Softw. Syst. Model. 13(1), 171–172 (2012). https://doi.org/10.1007/s10270-012-0254-z
Article Google Scholar
Schürr, A., Varró, D., Varró, G.: AGTIVE 2011, vol. 7233 LNCS. Springer (2012)
Google Scholar
Taentzer, G.: AGG: a graph transformation environment for modeling and validation of software. In: Pfaltz, J.L., Nagl, M., Böhlen, B. (eds.) AGTIVE 2003. LNCS, vol. 3062, pp. 446–453. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-25959-6_35
Chapter Google Scholar
Weber, J.H.: GRAPE – a graph rewriting and persistence engine. In: de Lara, J., Plump, D. (eds.) ICGT 2017. LNCS, vol. 10373, pp. 209–220. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-61470-0_13
Chapter Google Scholar
Weber, J.H.: Grapepress - a computational notebook for graph transformations. In: Graph Transformation, LNCS, vol. 12741, pp. 294–302. Springer (2021). https://doi.org/10.1007/978-3-030-78946-6_16

Download references

Author information

Authors and Affiliations

University of Victoria, Victoria, BC, Canada
Jens H. Weber

Authors

Jens H. Weber
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Jens H. Weber .

Editor information

Editors and Affiliations

Université Paris Cité, Paris, France
Nicolas Behr
Computer Science and Engineering, Chalmers | Gothenburg University, Gothenburg, Sweden
Daniel Strüber

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Weber, J.H. (2022). Tool Support for Functional Graph Rewriting with Persistent Data Structures - GrapeVine. In: Behr, N., Strüber, D. (eds) Graph Transformation. ICGT 2022. Lecture Notes in Computer Science, vol 13349. Springer, Cham. https://doi.org/10.1007/978-3-031-09843-7_11

Download citation

DOI: https://doi.org/10.1007/978-3-031-09843-7_11
Published: 26 June 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-09842-0
Online ISBN: 978-3-031-09843-7
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics