1 Introduction

This paper overviews our research on verifying expression optimisations used in the GraalVM compiler developed by Oracle.Footnote 1 The compiler supports multiple source languages (Java, Scala, Kotlin, JavaScript, Python, Ruby, ...) and multiple target architectures (AMD64 and ARM) and has variants for both just-in-time and ahead-of-time compilation. It has front ends that generate an intermediate representation (IR) of the program being compiled from the source programming language. The compilation process includes multiple optimisation phases that transform the IR representation of a method/program to a more efficient version, also expressed in the IR. The final phase generates machine code for the target architecture from the optimised IR representation of the program.

Why Verify Compilers? Compilers for programming languages are an indispensable part of the trusted base of a software development platform. Their correctness is essential because an error in a compiler can lead to errors in any of the myriad of programs it compiles.

Why Focus on the Optimiser? For a multi-lingual, multi-target compiler, the machine-independent optimiser is shared by all source programming languages and all target machine architectures; hence the correctness of the optimiser affects every source language and every target architecture.

The optimiser is a common source of errors within compilers. In a study of C compilers, Yang et al. [7] found only 4 bugs in GCC with optimisation turned off but 79 bugs with optimisation turned on; for Clang, the corresponding numbers were 19 and 202.

Errors in an optimiser are often due to subtle edge cases that may not be covered by testing, whereas verification addresses all possible cases. For example, a quirk of two’s complement arithmetic is that the most negative 32-bit signed integer \(MinInt = -2^{31}\), when negated gives back MinInt (because the largest representable positive integer is \(2^{31}-1\) and hence \(-MinInt = 2^{31}\) is not representable as a 32-bit signed integer and the negation of MinInt “overflows” and gives back MinInt). One consequence of this is that the absolute value function when applied to MinInt gives MinInt, a negative value! Hence a plausible optimisation that replaces \(0 \le abs(x)\) with true is invalid if x is MinInt.
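The same quirk can be observed directly in Java; the following snippet (plain Java, e.g. runnable in jshell, and not taken from the compiler or its test suite) illustrates it:

  // Negating or taking the absolute value of Integer.MIN_VALUE wraps around
  // to Integer.MIN_VALUE itself, so 0 <= abs(x) is not always true.
  int minInt = Integer.MIN_VALUE;              // -2^31
  System.out.println(-minInt);                 // prints -2147483648, not 2^31
  System.out.println(Math.abs(minInt));        // prints -2147483648, a negative value
  System.out.println(0 <= Math.abs(minInt));   // prints false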

Overview. The GraalVM IR for a method consists of a graph structure that combines both control-flow and data-flow nodes [3]. In this paper we overview our approach to verifying the optimisation of data-flow sub-graphs, which represent expressions in the source language. We have developed a model of the IR in Isabelle/HOL [1] and then given the IR a semantics [6] (see Sect. 2). Expression optimisations are given as a set of conditional term rewriting rules (see Sect. 3). Proving the rules correct then corresponds to showing that they preserve the semantics (see Sect. 4). Generating efficient code for an optimiser from a set of rewriting rules is overviewed in Sect. 5.

2 Data-Flow Sub-graphs

GraalVM IR data-flow sub-graphs are:

  • side-effect free – side effects are factored out into the control-flow part of the graph,

  • well-defined in context – runtime exceptions such as divide by zero or index out of range are guarded in the control-flow graph, so that, for example, a divide node cannot be reached if its divisor is zero, and

  • shared – common sub-expressions are shared: if the same sub-expression, e, is used in multiple places in an expression f, a single sub-graph representing e is shared by all references to e within f.

Sharing common sub-expressions is essential for generating efficient code but it means that the representation of a term (i.e. a programming language expression) is not a conventional abstract syntax tree but rather a directed acyclic graph structure with a single root node, commonly known as a term graph [2]. Figure 1 gives an example of both a conventional tree and (maximal sharing) term-graph representation of the term \(x\,*\,x + x\,*\,x\). Note that in the term-graph representation, the node representing x is shared in the sub-graph representing \(x*x\), and the root node of \(x*x\) is shared in the whole expression \(x*x+x*x\). Also note that the same sharing occurs if x is a single node or a term graph representing a more complex shared sub-expression.

Fig. 1. Abstract syntax tree and term-graph representations of \(x*x + x*x\).

Each leaf node of a term represents either a constant, a parameter to the method in which the expression occurs, or a control-flow node, such as a method call node, where at runtime the value associated with that node will have already been calculated by the control-flow execution, e.g. the result of a method call.
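To make this structure concrete, the following is a minimal Java sketch of such terms; the class names are purely illustrative and are not the GraalVM IR node classes. Sharing is obtained simply by reusing object references, so a term graph is a rooted DAG rather than a tree:

  // Hypothetical classes for illustration only; not the GraalVM IR node classes.
  public class TermGraphSketch {
      interface Term {}
      record Constant(long value, int bits) implements Term {}   // leaf: integer constant
      record Parameter(int index) implements Term {}              // leaf: method parameter
      record ControlInput(int nodeId) implements Term {}          // leaf: value already computed by a control-flow node
      record BinaryOp(String op, Term left, Term right) implements Term {}

      public static void main(String[] args) {
          // Term graph for x*x + x*x with maximal sharing (cf. Fig. 1).
          Term x  = new Parameter(0);
          Term xx = new BinaryOp("*", x, x);      // the node for x is shared
          Term e  = new BinaryOp("+", xx, xx);    // the node for x*x is shared
          System.out.println(e);
      }
  }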

The semantics of expressions is defined over their abstract syntax tree (or tree for short) form. Term-graphs are a (more efficient) representation of a tree (i.e. a data refinement). For any term graph there is a unique corresponding tree and hence the semantics of a term graph is defined as the semantics of the corresponding tree. Our Isabelle/HOL semantics for an expression, e, is parametrised with respect to a context consisting of a list, p, of parameters of the method in which e occurs and a method state, m, consisting of a mapping from control-flow node identifiers (e.g. for a method call node) to their (pre-computed) values. The following relation represents evaluating term e in a context consisting of method state m and parameters p to value v.

$$\begin{aligned}{}[m,p] \vdash e \mapsto v \end{aligned}$$
(1)

In Isabelle/HOL this is a (deterministic) relation, rather than a function; if the value of e in context [m,p] is not well defined, the relation does not hold.

Values may be integers, object/array references or the special undefined value.Footnote 2 The semantics needs to take into account the bit width of integer values (e.g. 32 or 64) because unbounded integers do not have the same semantics. Integer values are represented by a 64-bit word (using the HOL Word library) plus a bit-width b, where \(0 < b \le 64\); only the low-order b bits of the 64-bit word are significant. In two’s complement arithmetic, for an expression such as \((x+y)-y\), the calculation of \(x+y\) may overflow, but subtracting y then “underflows” the value back to x, allowing \((x+y)-y\) to be replaced by x, even in the presence of overflow.
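The same behaviour is visible in Java’s 32-bit int arithmetic, which also wraps around in two’s complement (again an ordinary Java snippet, unrelated to the compiler’s code):

  // (x + y) - y recovers x even when x + y overflows.
  int x = Integer.MAX_VALUE;
  int y = 5;
  int sum = x + y;                    // overflows to -2147483644
  System.out.println(sum - y == x);   // prints true: the subtraction wraps back to x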

To validate our semantics we have developed a tool that translates GraalVM test cases written in Java to their Isabelle/HOL representation. Each test case is run in Java and its result compared with the value determined by our executable Isabelle/HOL semantics (see [4] for more details).

3 Term Rewriting Rules

Expression optimisations are based on algebraic properties of expressions, e.g. \(x * 0 = 0\), and can be expressed as conditional term rewriting rules [5]. In the rewriting rules below, x, y, t and f represent arbitrary expressions, c represents an integer constant, and \(\mathbin {<\!\!<}\) is the left shift operator. The compiler performs a static analysis that tracks the lower and upper bounds of each node’s value; these bounds are stored in the node’s stamp, so that for Rule 6, if the upper bound for x is less than the lower bound for y, then \(x < y\) must be true in that context. A division node within the graph can only be reached after a (control-flow) check that the divisor is non-zero, otherwise an exception is raised.Footnote 3 That allows an optimisation such as Rule 3 to be valid because the case where x is zero cannot occur.

$$\begin{aligned} {x * 0}&\longmapsto {0} \end{aligned}$$
(2)
$$\begin{aligned} {x / x}&\longmapsto {1} \end{aligned}$$
(3)
$$\begin{aligned} {(x + y) - y}&\longmapsto {x} \end{aligned}$$
(4)
$$\begin{aligned} {x * c}&\longmapsto {x \mathbin {<\!\!<}\log _2 c}\quad {\textsf {when}}\,\,\,{isPower2~c} \end{aligned}$$
(5)
$$\begin{aligned} {x < y}&\longmapsto {true}\quad \quad \ \ \ \ \ {\textsf {when}}\,\,\,{upper(stamp(x)) < lower(stamp(y))} \end{aligned}$$
(6)
$$\begin{aligned} {\lnot false}&\longmapsto {true} \end{aligned}$$
(7)
$$\begin{aligned} (true \mathbin {?} t \mathbin {:} f)&\longmapsto {t} \end{aligned}$$
(8)

The rewriting rules can be applied to any sub-term and in any order. In practice, it is better to optimise all sub-terms of a term e before applying rules to optimise e itself. An exception is a conditional \((b \mathbin {?} t \mathbin {:} f)\): it is better to optimise b first (e.g. using Rule 6 or Rule 7) because, if Rule 8 then applies, f is eliminated from the expression and only t needs to be optimised, as in the sketch below.
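A minimal Java sketch of this strategy, reusing the hypothetical Term classes from the earlier sketch and assuming a hypothetical Conditional record (condition, trueBranch, falseBranch), TRUE/FALSE constants and a tryRules helper that applies the root-level rules; none of these names are the compiler’s actual API:

  // Sketch only: optimise sub-terms bottom-up, except for conditionals,
  // where the condition is optimised first so that a whole branch can be
  // discarded (Rule 8, and its assumed analogue for false) before it is visited.
  static Term optimise(Term t) {
      if (t instanceof Conditional c) {
          Term b = optimise(c.condition());
          if (b.equals(TRUE))  return optimise(c.trueBranch());    // f is never visited
          if (b.equals(FALSE)) return optimise(c.falseBranch());   // t is never visited
          return tryRules(new Conditional(b, optimise(c.trueBranch()), optimise(c.falseBranch())));
      }
      if (t instanceof BinaryOp op) {
          return tryRules(new BinaryOp(op.op(), optimise(op.left()), optimise(op.right())));
      }
      return tryRules(t);   // leaves: constants, parameters, control-flow inputs
  }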

4 Verifying Term Rewriting Rules

This section briefly overviews the verification of rewriting rules (for more details see [5]). We say term \(e_1\) is refined by term \(e_2\) if and only if, for all contexts [m,p], whenever \(e_1\) evaluates to a well-defined value v, \(e_2\) also evaluates to v.

$$\begin{aligned} e_1 \mathrel {\sqsupseteq }e_2 = (\forall m~p~v \mathbin {.}[m,p] \vdash e_1 \mapsto v \implies [m,p] \vdash e_2 \mapsto v) \end{aligned}$$
(9)

To show that a rewriting rule, \({e_1}\longmapsto {e_2}\,\,\textsf {when}\,\,{cond}\), is correct, we show that if cond holds then \(e_1\) is refined by \(e_2\).

$$\begin{aligned} cond \implies (e_1 \mathrel {\sqsupseteq }e_2) \end{aligned}$$
(10)
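For instance, instantiating (10) with Rule 5 yields the following proof obligation.

$$\begin{aligned} isPower2~c \implies ({x * c} \mathrel {\sqsupseteq }{x \mathbin {<\!\!<}\log _2 c}) \end{aligned}$$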

For Rule 2, the right side (i.e. 0) is valid in all contexts but the left side (i.e. \(x*0\)) is only well defined in contexts where x is well defined. For the division node, the semantics defines 0/0 to be a special undefined value, and hence Rule 3 is valid because the values of the two sides of a rewriting rule only need to be equal if the left side is well defined.

Verifying optimisations as term rewriting rules is much simpler on the tree representation than on the term-graph representation because, in a term graph, a replaced node may be referenced in multiple places in the graph. To show that term-graph rewriting is correct, we make use of a theorem that shows that if \(e_1 \mathrel {\sqsupseteq }e_2\) and a term graph matching \(e_1\) is replaced by the corresponding term graph for \(e_2\), then the semantics of the overall graph is preserved.

5 Generating Code for Optimisations

The approach described above represents optimisations as a set of rewriting rules. One could naively translate each rule to code and apply the rules repeatedly until none is applicable.Footnote 4 We are currently developing an approach to generate an efficient optimiser from a set of rewriting rules. There is often overlap between rules in the matching process; for example, all rules with the same node at the top level of their pattern perform the same initial match. In practice there are many rewriting rules for each kind of top-level node, and hence in generating code we would like to factor out such matching so that it is done only once. The factoring can also be applied to sub-expressions of the pattern.

To handle code generation we introduce a more basic matching primitive, \(\mathop {\textsf {match}}e\,p\), that matches term e with pattern p; it takes an initial substitution s and, if the match succeeds, returns s updated with instantiations for the free variables within p. Matches can be composed sequentially to form a combined match that takes a substitution, s, and returns s updated for both matches if both succeed; if either fails, the combination fails. For a condition C, \(\mathop {\textsf {test}}C\) fails if C does not hold for its input substitution s, otherwise it passes through s. Alternative rules can be combined using \(r_1 \mathbin {\textsf {else}}r_2\), meaning take the result of \(r_1\) if it succeeds, otherwise try \(r_2\). For example, (Rule 2 \(\mathbin {\textsf {else}}\) Rule 5) first matches a term against the pattern \(x * 0\) and, only if that fails, matches it against \(x * c\) and applies the test \(isPower2~c\).

Both patterns have a multiplication node at the top level, so factoring out this common initial match means the multiplication node (and, here, its constant right operand) is matched only once, with each rule then contributing only its remaining test and replacement.
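As an illustration, the following Java sketch shows what generated code for (Rule 2 \(\mathbin {\textsf {else}}\) Rule 5) might look like before and after factoring; it reuses the hypothetical Term classes from the earlier sketch plus an isPowerOf2 helper, and is not the actual generated optimiser code.

  // Sketch only; Term, BinaryOp and Constant are the hypothetical classes above.
  // Naive composition: each rule repeats the match on the multiplication node.
  static Term rule2Else5Naive(Term e) {
      if (e instanceof BinaryOp m && m.op().equals("*")
              && m.right() instanceof Constant c && c.value() == 0)
          return new Constant(0, c.bits());                          // Rule 2: x * 0 --> 0
      if (e instanceof BinaryOp m2 && m2.op().equals("*")
              && m2.right() instanceof Constant c2 && isPowerOf2(c2.value()))
          return new BinaryOp("<<", m2.left(),                       // Rule 5: x * c --> x << log2 c
                  new Constant(Long.numberOfTrailingZeros(c2.value()), c2.bits()));
      return e;
  }

  // Factored: the multiplication node and its constant operand are matched once.
  static Term rule2Else5Factored(Term e) {
      if (e instanceof BinaryOp m && m.op().equals("*")
              && m.right() instanceof Constant c) {
          if (c.value() == 0)
              return new Constant(0, c.bits());                      // Rule 2
          if (isPowerOf2(c.value()))
              return new BinaryOp("<<", m.left(),                    // Rule 5
                      new Constant(Long.numberOfTrailingZeros(c.value()), c.bits()));
      }
      return e;
  }

  static boolean isPowerOf2(long v) { return v > 0 && (v & (v - 1)) == 0; }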

Generating efficient code in a programming language, such as Java, from rewriting rules expressed using these primitives is relatively straightforward.

For many rewriting rules (e.g. Rule 4), if all sub-expressions of the left side (e.g. x and y) have already been optimised, then a successful application of the rewriting rule results in a term that cannot be further optimised and hence the optimisation of that term is complete.

6 Conclusions

Conditional term rewriting rules allow one to succinctly formalise expression optimisations. Each rule can be separately verified to show that it preserves the semantics of an expression, whenever the rule is applicable. In the context of the GraalVM compiler the rewriting rules can also be applied to its term-graph representation of expressions. Given a set of valid conditional rewriting rules, each representing individual optimisations, the challenge is then to combine them to form an efficient optimiser by factoring out common matches and avoiding applying rules in situations where they cannot possibly succeed.