1 Introduction

Code generation exhibits the all-too-common trade-off: obtaining code with the highest performance; statically ensuring the code quality; being able to use the code-generating system in practice – choose two. Optimizing compilers and many practical code-generating tools perform all the desired optimizations. The correctness of the result, however, is ensured only by careful programming. This is not a problem in the case of a compiler written by a small team of experts and changed relatively infrequently. The lack of static assurances is worrisome for code transformation and generation libraries written by domain experts, who have less time to devote to proofs and have to continually tune their libraries to the domain knowledge and circumstances. There is an attested danger of generating code with unbound or, worse, unexpectedly bound variables. At the very least, we would like to guarantee that the generated code – at all times, even for intermediate code fragments – is well-formed, well-typed, and that all of its free variables will eventually be bound by their intended binders. This guarantee should hold before we compile the generated code, which is typically unfit for human reading. Ideally, the guarantee should hold even before we compile the generator.

On the other side of the trade-off are the staged calculi such as \(\lambda ^\circ \) and \(\lambda ^\alpha \) [4, 15] that express code generators with the desired static guarantees. They are called ‘staged’ because evaluation is stratified: the result of the present, or Level-0, stage is the code to be evaluated at the next, Level-1, or future stage. The calculi have been implemented as full-featured staged languages used in practice [8, 17]. Another example is Pouillard’s [13] code generation and analysis library with proven correctness. Alas, all these systems restrict the range of safe operations on open code: in particular, they limit or outlaw the operations that move or store open code, retrieving it later in a different binding environment. Such operations are required for many optimizations such as let-insertion, memoization, loop interchange and tiling. There have been general approaches that permit the desired open code motions and provide static guarantees: for example, [12]. Alas, they are too complex to use in practice or even to implement. For more discussion, see Sect. 5 and especially [6].

StagedHaskell [6] overcomes the impasse, but only partially. It is a code-generation library that supports code movements, including movements via any computational (monadic) effect. Using a contextual modal type system, the library statically assures that at all times the generated code is well-formed, well-typed and well-scoped: all free variables in manipulated and stored code fragments shall eventually be bound by their intended binders. However, the safety properties have been argued only informally. The main reason is that the complexity of the Haskell implementation, specifically the encoding of the contextual modal type system, makes formal reasoning difficult.

The present paper takes the first step of formalizing StagedHaskell: it distills the staged calculus \({\texttt {<\!NJ\!>}}\) that safely permits open code movements across different binding environments via mutable cells. The calculus can express realistic examples from StagedHaskell, such as assert-insertion (see Sect. 4).

Although \({\texttt {<\!NJ\!>}}\) was motivated by code generation with safety guarantees, it turned out to be a vantage point from which to view seemingly unrelated areas. First, there is an uncanny similarity between generating code of functions (or other blocks with local binders) and region-based memory management. Preventing the ‘extrusion’ of free variables out of the bodies of generated functions is similar to keeping reference cells allocated within a region from leaking out. We have consciously used this similarity, adapting techniques from region calculi [5].

The key to ensuring hygiene and type safety when manipulating open code is reflecting the free variables of a code fragment in its type – which evokes contextual modal type theory [10] and, in general, sequent calculus. The structural rules such as weakening now turn up in programs, e.g., as ‘shifts’ of De Bruijn indices [3]. After all, in metaprogramming, the meta-level becomes the object level. ‘The bureaucracy of syntax’ now worries not only logicians but also programmers.

A particularly elegant method to overcome the complexities and redundancies of concrete name and environment representations is environment classifiers [15] (recalled and discussed in Sect. 3.2). A single classifier represents a set of free variables, abstracting from their order, quantity, or names. Unfortunately, in the presence of effects, the original environment classifiers are too coarse, abstracting away the information needed to ensure the type safety of effectful generators. Inspired by the concept of local assumptions from Natural Deduction NJ, we have identified the minimal necessary refinement of environment classifiers.

Contributions. Our specific contributions are as follows:

  • A practical two-stage calculus \({\texttt {<\!NJ\!>}}\) whose type system statically ensures hygiene and the type safety of the generated code in the presence of mutable reference cells. The calculus distills the design of the practical StagedHaskell library. The calculus itself is easily implementable.

  • A refinement of environment classifiers – imposing a partial order – that preserves all their simplicity and advantages and is compatible with effects.

\({\texttt {<\!NJ\!>}}\) is close to the current MetaOCaml [8], which permits leaking of variables (scope extrusion) but raises a run-time error at the moment the code with leaked (extruded) variables is about to be used. Our calculus prevents such errors statically.

The calculus has been implemented as a simple embedding in OCaml, whose type checker checks \({\texttt {<\!NJ\!>}}\) types and even infers them. Signatures are only needed for functions that receive code values as arguments and use them in distinct binding environments. One is immediately reminded of ML\(^F\) [9]; this is not an accident, as we shall see in Sect. 3.3. All examples in the paper are slightly reformatted running code. The implementation, with more examples, is available at http://okmij.org/ftp/tagless-final/TaglessStaged/metaNJ.ml.

This paper is organized as follows: The next section introduces the calculus, using many examples to illustrate its syntax and dynamic semantics. Section 3 describes the type system and proves its soundness. Section 3.1 specifically demonstrates the obvious and the very subtle dangers arising from storing open code in mutable cells, and how \({\texttt {<\!NJ\!>}}\) prevents these dangers without restricting the free use of reference cells. Responsible for this are refined environment classifiers; Sect. 3.2 discusses what, why and how. Section 4 shows off a complex example: it is used exactly as explained in [6], deliberately, to demonstrate that \({\texttt {<\!NJ\!>}}\) is capable of representing practical StagedHaskell examples.

2 \({\texttt {<\!NJ\!>}}\), Its Syntax and Semantics

Formally, the syntax of \({\texttt {<\!NJ\!>}}\) is defined in Fig. 1. This section introduces the calculus and its dynamic semantics more accessibly, through a series of small examples.

\({\texttt {<\!NJ\!>}}\) is a lambda-calculus with reference cells and special constants to create and combine code values. Whereas 1 : int is the familiar constant of the base type int, is an expression of the code type which evaluates to . The code types are explained in Sect. 3. Here, is a special code-generating constant, also called a code combinator [16, 18]. We underline all such constants. Likewise, evaluates to . Since the and expressions are common, we adopt the abbreviated notation % that stands for either constant, depending on the context.

The bracketed expressions like cannot appear in source programs; they arise only during reductions and as their results. This is the most visible distinction from staged calculi like \(\lambda ^\alpha \) [15] and MetaOCaml. Nor do we have ‘splicing’ (also called ‘escape’ or unquotation). Bracketed expressions are essentially constants: they cannot be decomposed, inspected, or substituted into. Only a subset of expressions may be bracketed: underlined constants and the brackets themselves are excluded. \({\texttt {<\!NJ\!>}}\) is therefore a two-stage calculus: it generates code but not code generators. We will sometimes use the superscript to emphasize the distinction between the host language and the generated language. Most of the time the superscript is elided for ease of notation.

Figure 2 defines the constants of \({\texttt {<\!NJ\!>}}\). They come in different arities. The constants 1 and true seen earlier are zero-ary constants, denoted as in Fig. 1. On the other hand, and have arity 1, and are not considered expressions per se: only their applications to one argument are expressions, see Fig. 1. Likewise, \(\mathsf {+}\) is an arity-2 constant, requiring two arguments to be regarded as an expression. We write such an expression in the conventional infix notation \(\mathsf {1+2}\), which evaluates to 3. Besides and , there are constants that combine already built code values: of the code type evaluates as follows according to the rules of Figs. 3 and 4.

figure a

Here, ± is an arity-2 constant, which we also write in infix. Again, all constants must be fully applied: there are no partial applications, sections, or other sugar.

\({\texttt {<\!NJ\!>}}\) is a lambda-calculus, with the standard abstractions and applications . We let stand as an abbreviation for , and for where x does not appear free in . The semantics of \({\texttt {<\!NJ\!>}}\) is the standard small-step left-to-right call-by-value semantics, see Fig. 3. (Heaps will be explained later on and can be ignored for now.) \({\texttt {<\!NJ\!>}}\) can also generate the code of functions, using the expression form with peculiar semantics, which we explain in detail. For example, the expression eventually generates the code of the function that increments its argument by 3:

figure b

First, the expression reduces by choosing a fresh variable name y, replacing all free occurrences of x in its body with and wrapping the result in . (Expressions of the form come up during the evaluation and do not appear in source programs.) Next, the body of thus built reduces as described earlier. The final reduction in the sequence builds the resulting code. Thus producing the code for functions has two separate phases: generating the name for the bound variable, and generating the binder for that name at the end. In many staged calculi the two phases can be (and are) combined. Effects, however, force them apart, as we shall see soon.
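To make the combinator style and the two-phase generation of function code concrete, here is a minimal, hypothetical OCaml sketch. It is not the paper's metaNJ.ml implementation and offers none of \({\texttt {<\!NJ\!>}}\)'s static guarantees; the exp type and the names int_, add_ and lam_ are our own assumptions, chosen only to mirror the reductions just described.

```ocaml
(* Hypothetical, untyped illustration of code combinators; the exp type and
   the names int_, add_, lam_ are assumptions, not <NJ>'s actual API. *)
type exp =
  | Int of int
  | Var of string            (* a future-stage (Level-1) variable *)
  | Add of exp * exp
  | Lam of string * exp      (* the generated binder *)

(* Code combinators build future-stage expressions as data. *)
let int_ n = Int n
let add_ e1 e2 = Add (e1, e2)

(* A fresh-name supply, playing the role of the name heap N. *)
let counter = ref 0
let gensym base = incr counter; Printf.sprintf "%s%d" base !counter

(* Phase 1: pick a fresh name y and hand its code to the body generator.
   Phase 2: once the body is fully generated, wrap it in the binder for y. *)
let lam_ body =
  let y = gensym "y" in
  Lam (y, body (Var y))

(* Generating the code of the function that increments its argument by 3:
   evaluates to Lam ("y1", Add (Var "y1", Int 3)), i.e. the code of
   fun y1 -> y1 + 3. *)
let inc3 = lam_ (fun x -> add_ x (int_ 3))
```

In this untyped sketch nothing stops the variable produced in phase 1 from escaping lam_; ruling that out statically is precisely the job of the type system of Sect. 3.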

Fig. 1.
figure 1

Syntax of \({\texttt {<\!NJ\!>}}\). The constants \(\mathsf {c}_{i}\) with their arities i are defined in Fig. 2.

Fig. 2.
figure 2

The constants \(\mathsf {c}_{i}\) of \({\texttt {<\!NJ\!>}}\) with their arities i. The underlined constants, whose result type is a code type, are code combinators. The shown types are schematic: t denotes any suitable type and \(\gamma \) any suitable classifier. We silently add other arithmetic and comparison constants and code combinators, similar to + and =. Although the constants may have function types, they are not expressions unless applied to the right number of arguments.

Fig. 3.
figure 3

Dynamic semantics of \({\texttt {<\!NJ\!>}}\): reductions e\(_1 \leadsto \) e\(_2\). The reductions involving (code-generating) constants are defined in Fig. 4.

Fig. 4.
figure 4

Constant (code-generating) reductions

For another illustration we take the familiar power example: generating a function that raises its argument to the given power by repeated multiplications.

figure c

Applying the result to, say, 3 produces .
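For comparison, the following is a hypothetical, untyped OCaml rendering of such a generator in the same assumed combinator style (again not \({\texttt {<\!NJ\!>}}\) syntax and without its guarantees): the recursive power builds the code of the product, and lam_ supplies the future-stage argument.

```ocaml
(* Hypothetical, untyped sketch of the power generator (not <NJ> syntax). *)
type exp =
  | Int of int
  | Var of string
  | Mul of exp * exp
  | Lam of string * exp

let int_ n = Int n
let mul_ e1 e2 = Mul (e1, e2)

let counter = ref 0
let gensym base = incr counter; Printf.sprintf "%s%d" base !counter
let lam_ body = let y = gensym "y" in Lam (y, body (Var y))

(* Builds the code of x * (x * ... (x * 1)) with n multiplications. *)
let rec power n x =
  if n = 0 then int_ 1 else mul_ x (power (n - 1) x)

(* power_gen 3 evaluates to the code of fun y1 -> y1 * (y1 * (y1 * 1)). *)
let power_gen n = lam_ (fun x -> power n x)
```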

\({\texttt {<\!NJ\!>}}\) has mutable state in the form of the familiar mutable cells, such as those found in ML and many other languages. Correspondingly, the calculus has the form to create a fresh reference cell holding the value of e, to dereference it, obtaining the held value, and to replace the value of the cell with the value of , returning the latter value. The semantics is standard, involving locations l and the heap H, the finite map from locations to values. The empty heap is denoted as ; is the heap that contains the association of l with v plus the associations in H. The domain of the latter does not include l. From \(\lambda ^U\) [2] we borrow the heap-like name heap N, which is the set of names used for variables in the generated code. As we shall see throughout the paper, there is an uncanny similarity between reference cells and the future-stage variable names.

The full dynamic semantics of \({\texttt {<\!NJ\!>}}\) thus deals with reductions between configurations, made of the name and location heaps and an expression, see Fig. 3. We will often elide the heaps when presenting reductions, especially in examples. As an illustration, the following reductions show the evaluation of sample imperative code, and the generation of imperative code:

figure d

where we used to stand for . In the first example, l denotes a fresh location. We elided the heaps in the second example.

So far, the lambda-calculus, code-generation and reference-cell fragments of \({\texttt {<\!NJ\!>}}\) have looked like orthogonal extensions. There is one part of the semantics where they interact non-trivially. It has to do with generating functions and using reference cells to store open code. The following is an example of how not to use reference cells to store open code: it is the infamous scope-extrusion example.

figure e

When building the function’s body we store the code with the yet-to-be-bound variable y in the reference cell. After the function is constructed we retrieve from the reference cell the code with what is by now the unbound variable y. We have just seen the most blatant example of scope extrusion; alas, there are also subtle, and hence far more dangerous, cases; we discuss them in Sect. 3.1.

Our dynamic semantics is really nonchalant about unbound future-stage variables, treating them essentially as constants. To be pedantic, y in the result of the scope-extrusion example is technically bound: it occurs in the name heap. Real staged languages such as MetaOCaml and Scala-Virtualized [14] likewise allow unbound variables to appear in code values (in the case of MetaOCaml, for a short interval). We will show in the next section that a well-typed \({\texttt {<\!NJ\!>}}\) program never generates code with unbound variables. The scope-extrusion program above does not type-check.
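In the same hypothetical untyped sketch as before (our own assumed AST and combinators, not \({\texttt {<\!NJ\!>}}\)'s API), the blatant scope extrusion plays out as follows; the point is only to visualize what the dynamic semantics produces, since the corresponding \({\texttt {<\!NJ\!>}}\) program is rejected by the type system.

```ocaml
(* Hypothetical, untyped sketch of blatant scope extrusion (not <NJ> syntax). *)
type exp = Int of int | Var of string | Lam of string * exp

let counter = ref 0
let gensym base = incr counter; Printf.sprintf "%s%d" base !counter
let lam_ body = let y = gensym "y" in Lam (y, body (Var y))

(* A cell initialized with some closed code. *)
let r : exp ref = ref (Int 0)

(* While generating the function's body we store the open code for y in r;
   the generated function itself is then discarded. *)
let _ = lam_ (fun x -> r := x; Int 1)

(* Reading the cell afterwards yields Var "y1": code whose variable no
   binder will ever capture. *)
let extruded = !r
```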

Storing open code in reference cells has many legitimate uses. Here we show one simple example. It is again the power function, but now with reference cells. Merely computing \(x^n\) looks in \({\texttt {<\!NJ\!>}}\) as

figure f

To obtain the code for computing \(x^n\) for a fixed n, we turn the above program into a generator, in a rather straightforward way:

figure g

Applying the result to, say, 3 produces, as before, . The reference cell r accumulates progressively longer code for the product, containing multiple occurrences of the free variable y, to be bound at the end. Section 4 shows a more interesting, realistic example of reference cells in code generators: assertion insertion.
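The same generator can be sketched in our hypothetical untyped OCaml embedding (assumed names, not \({\texttt {<\!NJ\!>}}\)'s API): the reference cell r lives inside the dynamic extent of lam_, so the open code it accumulates is still captured by the binder emitted at the end.

```ocaml
(* Hypothetical, untyped sketch of the imperative power generator. *)
type exp =
  | Int of int
  | Var of string
  | Mul of exp * exp
  | Lam of string * exp

let int_ n = Int n
let mul_ e1 e2 = Mul (e1, e2)

let counter = ref 0
let gensym base = incr counter; Printf.sprintf "%s%d" base !counter
let lam_ body = let y = gensym "y" in Lam (y, body (Var y))

let power_gen n =
  lam_ (fun x ->
    let r = ref (int_ 1) in          (* accumulator for the product code  *)
    for _i = 1 to n do
      r := mul_ x !r                 (* open code mentioning the future y *)
    done;
    !r)

(* power_gen 3 yields the code of fun y1 -> y1 * (y1 * (y1 * 1)); the
   variable y1 is bound only by the final step of lam_, after the loop. *)
let _ = power_gen 3
```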

3 Type System

Figure 1 also defines the syntax of types t, which includes the standard base types int and bool, the arrow (function) type and the reference type . Non-standard is the code type , containing the so-called classifier \(\gamma \) – similar in intent to, but more precise than, the environment classifiers of [15], as mentioned in the Introduction. One may think of the classifier \(\gamma \) as a type-level representation, or ‘name’, of a Level-1 variable – although strictly speaking a classifier represents a binding environment. We delay the further discussion of classifiers till Sect. 3.2, after we have explained the typing rules that govern classifiers, the partial order on classifiers and classifier subtyping. Since \({\texttt {<\!NJ\!>}}\) is a two-level system – the generated code does not contain any code-generating expressions – we distinguish Level-1 types from Level-0 types t: the former omit code types. To relieve the notational burden, however, we will often use the same meta-variable t for both sorts of types, using only where necessary for disambiguation.

Fig. 5.
figure 5

Judgements, environments, classifiers

Figure 5 defines the judgements and their components. The main typing judgement – the expression e has the type t at the level L – has the form . Here, \(\varGamma \) is the standard environment, an ordered sequence associating types with the free variables in an expression. Free variables in expressions (that is, expressions within brackets) are Level-1 free variables; their associations in \(\varGamma \) are annotated with the classifier \(\gamma \). Besides the free-variable bindings, \(\varGamma \) also contains classifiers \(\gamma \) and classifier subtyping witnesses, to be explained shortly. \(\varUpsilon \) and \(\varTheta \) are essentially the typings of the name and location heaps. \(\varTheta \) is indeed a finite map from locations to types; \(\varUpsilon \), on the other hand, has more structure: it is an ordered sequence. It contains the classifier \(\gamma \) for each name in N. Like \(\varGamma \), it also contains the types associated with each name (Level-1 variable) and the classifier subtyping witnesses. One may think of \(\varGamma \) as a local type environment and \(\varUpsilon \) as a ‘global’ one. The initial \(\varUpsilon \) contains only the pre-defined classifier . We use the standard \(\in \) notation to assert that the \(\varGamma \) or \(\varUpsilon \) sequences contain a particular element. In addition, we write to say that the location is in the domain of the finite map \(\varTheta \). The notation means that some binding b is an element of \(\varGamma \), or else it is an element of \(\varUpsilon \) (note the asymmetry).

Some judgements are generic, so we use the superscript L, which stands for either empty or a classifier. If L is not empty, then, strictly speaking, the judgement should be written as , meaning that only a subset of expressions (and types) are allowed at Level 1. In particular, locations cannot appear at Level 1: normally locations result from evaluating expressions ; although such expressions may appear in the generated code, they remain unevaluated. There are no code combinators in \({\texttt {<\!NJ\!>}}\) that could produce the value . A substitution cannot insert a location either, since the generated code cannot be substituted into. Therefore, the heap typing \(\varTheta \) is irrelevant in such judgements. We will almost always drop the superscript in and (we keep it in the rule (Code) as a reminder).

Fig. 6.
figure 6

Well-formedness of environments and heap typings

Figure 6 states the well-formedness constraints on the environments and heap typings, which can be summarized as the absence of duplicates and the requirement that classifiers be defined before use. It becomes clear that each Level-1 variable binding recorded in the global \(\varUpsilon \) or local \(\varGamma \) environment has its own classifier. Indeed, a classifier acts as a type-level ‘name’ of a Level-1 variable. To ease the notation, hereafter we shall assume well-formedness of all environments and heap typings. We write and for the concatenation of two sequences such that the result is well-formed.

Fig. 7.
figure 7

Type system: typing of expressions

The typing of expressions is presented in Fig. 7, whereas Fig. 9 defines the typing of heaps. Most of the type system is standard. The rule (Const) uses the types of constants tc, given in Fig. 2. We abuse the notation and treat, for type-checking purposes, constant expressions such as \({\textsf {c}_{2}}\) \({\textsf {e}_{1}}\) \({\textsf {e}_{2}}\) as applications of \({\textsf {c}_{2}}\), although \({\textsf {c}_{2}}\) is not an expression per se. The rules (Sub0) and (Sub1) rely on the partial order on classifiers specified in Fig. 8 in the straightforward way: if either literally occurs in the environments as a witness, or can be derived by reflexivity and transitivity.

Fig. 8.
figure 8

Partial order on classifiers

Fig. 9.
figure 9

Type system: typing of heaps and

The most interesting rules are (CAbs) and (IAbs). To explain them and to illustrate the type system, we show two sample typing derivations. The first deals with the term  – generating the curried addition function – in the initial environment, in which \(\varUpsilon \) contains only the predefined classifier , and \(\varTheta \) and \(\varGamma \) are empty. In the following derivation, \(\varGamma _{2}\) stands for .

figure h

The side conditions of (CAbs) say that the classifiers and are ‘fresh’. Section 3.1 shows another attempted (but not completed) derivation, in the case of scope extrusion. The second derivation is for the expression , which results from the one-step reduction of the expression in the previous derivation. Now, stands for and \(\varGamma _{2}\) for .

figure i

It should be clear, already from (IAbs) in fact, that is not really a binding form. The environment \(\varGamma \) in (IAbs) is empty since shows up only during evaluation and it is not a value.

Proposition 1

(Canonical Forms). The only values of the base types int and bool are zero-ary constants (numerals and booleans, respectively). Values of reference types are locations. Values of code types are bracketed expressions , and values of function types are abstractions .

Although constants of arity 1 and above also have function types (see Fig. 2), they are not regarded as expressions unless applied to the right number of arguments.

Proposition 2

(Weakening). If hold, so do and and .

Recall that comma denotes concatenation that preserves well-formedness, which implies that \(\varTheta \) and are disjoint. The proof is straightforward.

Theorem 1

(Subject Reduction). If , , , and , then , , , for some and that are extensions of the corresponding unprimed ones.

We outline the proof in Appendix A.

Theorem 2

(Progress). If and , then either e is a value or there are , and such that .

The proof is an easy consequence of the canonical forms lemma. For example, if the last rule in the derivation of is (IAbs), then must have the form for some , where must itself be typeable in the same \(\varUpsilon \) and \(\varTheta \). By the induction hypothesis, either reduces, or is a value. In the latter case, by the canonical forms lemma, it should be of the form for some – meaning can reduce.

Corollary 1

If and then v has the form and .

That is, if a well-typed program of the type terminates, it generates code that is well-typed in the empty environment. The generated code hence has no unbound variables.

3.1 Scope Extrusion

When generating the body of a function, its formal argument is available as a code value – as a free variable. Scope extrusion occurs when that open code value is used outside the dynamic scope of the function generator, and hence the free variable can never be properly bound. Although the error is obvious once we attempt to compile the generated code, it is not at all obvious what part of the generator is responsible. Debugging generated code is very difficult in general. We now demonstrate how \({\texttt {<\!NJ\!>}}\) prevents scope extrusion.

We start with the example of blatant scope extrusion, from Sect. 2:

figure j

We have seen that its evaluation indeed produces code with an unbound variable. The example does not type-check, however. Specifically, the type error occurs not when the open code is retrieved from the reference cell at the end. Rather, it is the generator of the function body, specifically , that fails to type-check. Here is the attempt at the derivation, where we assumed and so is (which may be the same as \(\gamma \)). We take \(\varGamma _{2}\) to be where is fresh.

figure k

The derivation cannot be completed since has the type but x is of the type where is specifically chosen by (CAbs) to be different from any other classifiers in \(\varGamma \) and \(\varUpsilon \), including .

If such examples were our only worry, a simpler type system would suffice. Instead of named classifiers, we would annotate code types with just a natural number: the nesting level of . Our blatant example would likewise fail to type-check. The error would be reported later, however, when type-checking the last expression, which retrieves the code with the already leaked variable as the program result. The program result must be closed: it must be at the 0th nesting level. The type system of [3] (extended with reference cells) likewise rejects the blatant example, as was described in that paper. (After all, their type system annotates code types with the typing environment sequence, which is a refinement of the nesting depth.) MetaOCaml also reports the scope-extrusion error – when running the program and executing the expression. In contrast, \({\texttt {<\!NJ\!>}}\) rejects , the mere attempt to leak out the free variable.

Alas, scope extrusion can be subtle. Consider a small modification of the earlier example:

figure l

The simpler type system with mere level counting accepts the code: the free variable leaks out of one binder into another, at the same nesting level of . Likewise, the calculus of [3] (extended with reference cells as described therein) will type-check and even run the example, producing the code for the identity function. This is not what one may expect from the generator . Our \({\texttt {<\!NJ\!>}}\) rejects in the first part of the example as described earlier: it rejects even an attempt to leak the variable.

Finally, scope extrusion may be harmless, as in the following, yet another variation of the example:

figure m

When generating the body of the function, we incorporate the free variable x in the closure , but in such a way that it does not contribute to the result and hence is not reflected in the closure’s type, which remains . Technically, the free variable has leaked – but in a useless way, embedded in dead code.

\({\texttt {<\!NJ\!>}}\) accepts the latter example. When run, it indeed produces the closure with an unbound variable – which remains typeable since the unbound variable is still in the global heap N and its classifier is in \(\varUpsilon \). Such open code must be dead, however: it cannot be the result of a well-typed generator, since the type of such a result would have contained the classifier \(\gamma \), which is different from . A well-typed generator program must have the type . We have seen before that even a fragment, let alone the whole program, that attempts to ‘usefully leak’ a bound variable will fail to type-check.

Accepting unbound variables in dead code has many precedents. Most region calculi (see [5] and references therein) and their implementations (such as the runST monad in Haskell) allow dangling references, provided they are not accessed – that is, provided they remain embedded in essentially dead code.

3.2 Environment Classifiers, Binding Abstractions, and Lexical Scope

As we have seen in Sect. 3.1, the key to preventing scope extrusion is annotating the type of a code value with some representation of the free variables that may be contained therein. This section discusses a few choices for the representation and the position of \({\texttt {<\!NJ\!>}}\) among them: the most abstract that is still sufficient to prevent scope extrusion. By free variables we always mean Level-1 free variables: all values and terms produced and evaluated in staged calculi are closed with respect to Level-0 variables.

At one end of the spectrum is annotating the type of a code value with the names of the contained free variables, or with the typing environment: the set or the sequence listing the free variables and their types. Taha and Nielsen [15, Sect. 1.4] describe many difficulties of this approach (the sheer size of the type being one of them), which make it hard to implement and use in practice.

At the other extreme is the most abstract representation of a set of free variables: as a single name (the environment classifier, [15]) or as a number, the cardinality of the set. Section 3.1 showed that this is not sufficient to prevent scope extrusion of the devious, most harmful sort.

The approach of [3] also annotates code types with the type environment; however, by using De Bruijn indices, it avoids many difficulties of the nominal approach, such as freshness constraints, \(\alpha \)-renaming, etc. The approach is indeed relatively easy to implement, as the authors have demonstrated. Alas, although it prevents blatant scope extrusion, it allows the devious one, as we saw in Sect. 3.1.

The representation of [3] is also just too concrete: the code type tells not only that the value may contain three free variables with the indices 0, 1 and 2. The type also tells that the int and the bool variables will be bound in that order and that there is no free variable to be bound in between. There is no need to know with such exactitude when free variables will be bound. In fact, to prevent scope extrusion there is no need to even know their number. The concreteness of the representation has a price: the system of [3, Sect. 3.3] admits the term, in our notation, , which may, depending on the argument f, generate either or, contrary to any expectation, .

Such behavior is just not possible in \({\texttt {<\!NJ\!>}}\): consider where f is some function on code values. The function receives the code of a Level-1 variable and is free to do anything with it: discard it, use it once or several times in the code it is building, store it in global reference cells, as well as perform any other effects, throw exceptions or diverge. Still, we are positive that whatever f may do, if it eventually returns code that includes the received Level-1 variable, that variable shall be bound by of our expression – regardless of whatever binders f may introduce. This is what we call ‘lexical’ scope for Level-1 variables: a property not present in [7] (by choice) or in [3].

\({\texttt {<\!NJ\!>}}\) avoids the problematic ‘variable conversions’ because it does not expose in types or at run-time any structure of the Level-1 typing environment. The environment classifier in \({\texttt {<\!NJ\!>}}\) is a type-level representation of the variable name. There is a partial order on classifiers, reflecting the nesting order of the corresponding generators. The relation tells that the variable corresponding to is (to be) introduced earlier than the free variable corresponding to , with no word on which or how many variables are to be introduced in between. The code type is annotated not with the set of free variables, nor with the set of the corresponding classifiers – but only with a single classifier, the maximal one in the set. The type system ensures that there is always a maximal element. To be precise, any free Level-1 variable that may appear within is marked by a classifier such that . Therefore, any such variable will be bound by an ancestor of . This is another way to state the property of ‘lexical scope’ for free variables.

3.3 Classifier Polymorphism

Classifier polymorphism and its importance are best explained through examples. The following generator

figure n

contains repeated code that we would like to factor out, to make the generators clearer and more modular:

figure o

One may worry whether the code type-checks: after all, u is used in contexts associated with two distinct classifiers. The example does type-check, thanks to the (Sub0) rule: u can be given the type , and although and are associated with unrelated classifiers, and hold.

Alas, classifier subtyping gets us only so far. It will not help in the more interesting and common example of functions on code values:

figure p

where the function u is applied to code values associated with unrelated classifiers. To type-check this example we need to give the type . Before, \(\gamma \) was used as a (sometimes schematic) constant; now we use it as a classifier variable.

Extending \({\texttt {<\!NJ\!>}}\) with let-bound classifier polymorphism, with the attendant value restriction, is unproblematic and straightforward. In fact, our implementation already does it, inheriting let-polymorphism from the host language, OCaml. Sometimes we may need more extensions, however.

For example, we may write a generator that introduces an arbitrary, statically unknown number of Level-1 variables, e.g., as let-bound variables to share the results of computed expressions. Such a pattern occurs, for example, when specializing dynamic programming algorithms. Appendix B demonstrates let-sharing on the toy example of specializing the Fibonacci-like function described in [6, Sect. 2.4]. As that paper explains, the generator requires polymorphic recursion – which is well understood. Both Haskell and OCaml support it, and hence so does our implementation of \({\texttt {<\!NJ\!>}}\). Polymorphic recursion also shows up in [3].
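For readers unfamiliar with the host-language feature, here is a standard, self-contained OCaml illustration of polymorphic recursion, unrelated to \({\texttt {<\!NJ\!>}}\) itself: the explicit 'a. annotation is what allows a recursive call to use the function at a different type instance, which is the kind of flexibility such generators need.

```ocaml
(* Polymorphic recursion in OCaml: a non-regular (nested) datatype whose
   recursive occurrence changes the type parameter. *)
type 'a nested =
  | Flat of 'a list
  | Nested of 'a list nested

(* Without the explicit 'a. annotation this definition is rejected:
   the recursive call uses depth at type ('a list) nested -> int. *)
let rec depth : 'a. 'a nested -> int = function
  | Flat _ -> 1
  | Nested n -> 1 + depth n

(* Evaluates to 2. *)
let two = depth (Nested (Flat [[1]; [2; 3]]))
```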

There are, however, times (not frequent, in our experience) where even more polymorphism is needed. The poster example is the staged eta-function, the motivating example in [15]: , whose type is, approximately, . The type is not quite right: f accepts a code value that contains a fresh free variable, which comes with a previously unseen classifier. Hence we should assign eta at least the type – a rank-2 type. This is still not quite right: we would like to use eta in expressions such as , where f combines the open code received as an argument with some other open code. To type-check this combination we need . Hence the correct type for eta should be

figure q

with the bounded quantification. One is immediately reminded of ML\(^F\). Such bounded quantification is easy to implement, however, by explicit passing of subtyping witnesses (as done in the implementation of the region calculus [5]). Our implementation of \({\texttt {<\!NJ\!>}}\) supports it too – and how could it not: eta is just the first-class form of . Thus the practical drawback is the need for explicit type signatures for the sake of the rank-2 type (just as signatures are required in ML\(^F\) when the polymorphic argument function is used polymorphically). Incidentally, the original environment-classifiers calculus of [15] gives eta an ordinary rank-1 type: here the coarseness of the original classifiers is an advantage. The formal treatment of rank-2 classifier polymorphism is a subject for future research.

4 Complex Example

To demonstrate the expressiveness of \({\texttt {<\!NJ\!>}}\), we show a realistic example of assert-insertion – exactly the same example that was previously written in StagedHaskell. The latter is a practical Haskell code-generation library, too complex to reason about formally and to prove correct. The example was explained in detail in [6]; therefore, we elide many explanations here.

For the sake of the example, we add the following constants to \({\texttt {<\!NJ\!>}}\):

figure r

The first two are the integer division and the corresponding code combinator; returns the result of the boolean expression if it is true; otherwise, it crashes the program. The constant is the corresponding combinator, with the reduction rule .

The goal is to implement the guarded division, which makes sure that the divisor is positive before attempting the operation. The naive version

figure s

to be used as

figure t

produces . The result is hardly satisfactory: we check the divisor right before the division. If it is not positive, the time spent computing complexExp is wasted. If the program is going to end up in error, we would rather it end sooner than much later.

The solution, explained in [6] and implemented in StagedHaskell, is reproduced below in \({\texttt {<\!NJ\!>}}\). Intuitively, we first reserve the place where it is appropriate to put assertions, which is typically right at the beginning of a function. As we go on generating the body of the function, we determine the assertions to insert and accumulate them in a mutable ‘locus’. Finally, when the body of the function has been generated, we retrieve the accumulated assertion code and prepend it to the body. The function below accumulates the assertions; allocates the locus at the beginning and applies the accumulated assertions at the end.

figure u

They are to be used as in the example below:

figure v

As we generate the code, the reference cell within the locus accumulates a transformer (a code-to-code function) to be applied to the result. In our example, the code transformer includes open code (embedded within the expression), which is moved from within the generator of the inner function. The example thus illustrates all the complexities of imperative code generation. The improved generated code

figure w

checks the divisor much earlier: before we start on complexExp, before we even apply the function . If we switch y and z in by mistake, we get a type-error message.
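To convey the mechanism without \({\texttt {<\!NJ\!>}}\)'s concrete syntax, here is a hypothetical, untyped OCaml sketch of the locus idea; every name in it (exp, locus, guarded_div, with_locus) is our own assumption rather than the paper's API, and the inner function stands in for the part of the program that computes complexExp. The locus is a cell holding a code-to-code transformer; guarded_div pushes an assertion onto it and emits a plain division, and with_locus applies the accumulated transformer to the finished body.

```ocaml
(* Hypothetical, untyped sketch of assertion insertion via a mutable locus. *)
type exp =
  | Int of int
  | Var of string
  | Div of exp * exp
  | Gt of exp * exp
  | Assert of exp
  | Seq of exp * exp               (* assert ...; body *)
  | Lam of string * exp

let counter = ref 0
let gensym base = incr counter; Printf.sprintf "%s%d" base !counter
let lam_ body = let y = gensym "y" in Lam (y, body (Var y))

(* A locus is a cell holding a code transformer, initially the identity. *)
type locus = (exp -> exp) ref
let new_locus () : locus = ref (fun body -> body)

(* Guarded division: record the positivity check at the locus, i.e. at the
   place reserved earlier in the generated code, and emit a plain division. *)
let guarded_div (locus : locus) num den =
  let prev = !locus in
  locus := (fun body -> prev (Seq (Assert (Gt (den, Int 0)), body)));
  Div (num, den)

(* Reserve a locus, generate the body, then prepend the accumulated asserts. *)
let with_locus (gen : locus -> exp) : exp =
  let locus = new_locus () in
  let body = gen locus in
  !locus body

(* The check on the outer argument z is hoisted to the beginning of the outer
   function's body, well before the inner function is applied and before any
   expensive computation of its argument would run. *)
let example =
  lam_ (fun z ->
    with_locus (fun locus ->
      lam_ (fun y -> guarded_div locus y z)))
```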

5 Related Work

We thoroughly review the large body of related work in [6]. Here we highlight only the closest connections. First is Template Haskell, which either permits effectful generators but then provides no guarantees by construction, or provides guarantees but permits no effects – the common trade-off. We discuss this issue in detail in [6]. BER MetaOCaml [8] permits any effects and ensures well-scopedness, even in open fragments, using dynamic checks. StagedHaskell and \({\texttt {<\!NJ\!>}}\) are designed to prevent scope extrusion even before running the generator.

Safe imperative multi-staged programming has been investigated in [1, 17]. Safety comes at the expense of expressiveness: e.g., only closed code is allowed to be stored in mutable cells (in the former approach).

We share with [15] the idea of using an opaque label, the environment classifier, to refer to a typing environment. The main advantage of environment classifiers, their imprecision (they refer to infinite sets of environments), is also their drawback. On the one hand, they let us specify the staged eta of Sect. 3.3 without any first-class polymorphism. On the other hand, this imprecision retains too little information for the safe use of effects.

Chen and Xi [3] and Nanevski et al. [10] annotate the code type with the type environment of its free variables. The former relies on first-order syntax with De Bruijn indices, whereas the latter uses higher-order abstract syntax. Although internally Chen and Xi use De Bruijn indices, they develop a pleasant surface syntax a la MetaOCaml (or Lisp’s antiquotations). The De Bruijn indices are still there, however, which may lead to unpleasant surprises, which they discuss in [3, Sect. 3.3]. Their type system indeed rejects the blatant example of scope extrusion. Perhaps that is why [3] said that reference cells do not bring significant complications. However, scope extrusion is much subtler than its well-known example: Sect. 3.1 presented a slightly modified example, which is accepted in Chen and Xi’s system but produces an unexpected result. We refer to [6] for extensive discussion.

One may think that any suitable staged calculus can support reference cells through a state-passing translation. The elaborate side conditions of our (CAbs) and (IAbs) rules indicate that a straightforward state-passing translation is not going to succeed in ensuring type and scope safety.

The staged calculi of [3, 15] have a special constant run to run the generated code. Adding it to \({\texttt {<\!NJ\!>}}\) is straightforward.

Our bracketed expressions are a generalization of the data constructors of the code data type in the ‘single-stage target language’ of [2, Fig. 2]. Our name heap also comes from the same calculus. The latter, unlike \({\texttt {<\!NJ\!>}}\), is untyped and has no effects.

6 Conclusions and Future Work

We have described \({\texttt {<\!NJ\!>}}\), the first staged calculus for imperative code generators without ad hoc restrictions – it even lets us store open code in reference cells and retrieve it in a different binding environment. Its sound type system statically assures that the generated code is by construction well-typed and well-scoped, free from unbound or surprisingly bound variables. The calculus has been distilled from StagedHaskell, letting us formally prove the soundness of the latter’s approach. The distilled calculus is still capable of expressing StagedHaskell’s examples that use mutation.

\({\texttt {<\!NJ\!>}}\) has drawn inspiration from such diverse areas as region-based memory management and Natural Deduction. It turns out to be a vantage point from which to survey these areas.

\({\texttt {<\!NJ\!>}}\) trivially generalizes to effects such as exceptions or IO. It is also easy to extend with new non-binding language forms. (Binding forms like for-loops can always be expressed via lambda forms: see Haskell or Scala, for example.) \({\texttt {<\!NJ\!>}}\) thus can serve as the foundation of real staged programming languages. In fact, it is already implemented as an OCaml library. Although the explicit weakening is certainly cumbersome, it turns out, in our experience, not to be as cumbersome as we had feared. It is not a stretch to recommend the OCaml implementation of \({\texttt {<\!NJ\!>}}\) as a new, safe, staged programming language.

Extension to effects such as delimited control or non-determinism is, however, non-trivial and is the subject of ongoing research. We are also investigating adding first-class bounded polymorphism for classifiers, relating \({\texttt {<\!NJ\!>}}\) more precisely to ML\(^F\).