
1 Introduction

Modern programming languages, be they general purpose or domain-specific, can be built in a flexible manner by composing simple, off-the-shelf language components. It is attractive to build languages in this way because it allows language components to be studied in isolation, and because it reduces the cost of developing new and improved programming languages. Indeed, reducing the effort of building languages to the effort of composing off-the-shelf language components for features such as function abstraction, exceptions, or mutable state is likely to enable language designers with limited resources or expertise—e.g., domain experts—to build their own languages. Providing a modular definition of these advanced language features enables more widespread use of them, especially in the development of domain-specific languages (DSLs).

To build programming languages from reusable components, we need a framework for defining those components. Two promising techniques for such a framework are Data Types à la Carte [23] (DTC) and Algebraic Effects & Handlers [18] (AE&H). Using these, language interpreters can be implemented in two steps:

$$\begin{aligned} Syntax \xrightarrow { denote } Effects \xrightarrow { handle } Result \end{aligned}$$

The first step defines a \( denote \) function that maps the syntax of an object language onto effectful (monadic [13]) operations. By implementing \( denote \) using DTC, we can seamlessly compose \( denote \) functions from isolated cases for different \( Syntax \) constructors.

In the second step, \( handle \) defines the semantics of effectful operations and their effect interactions. Using AE&H allows cases of \( denote \) to be defined in an effect polymorphic way; i.e., a case maps to a monad that has at least the relevant operations, and possibly more. Furthermore, we can define \( handle \) functions for different effects in isolation, and seamlessly compose them by nesting handlers. Thus, DTC+AE&H provides a powerful framework for composing programming languages from isolated components.

However, not all language fragments have an obvious modular definition in terms of AE&H. Traditional algebraic effects and scoped effects [29] bake in assumptions about control flow and data flow. These assumptions get in the way of expressing a modular semantics for some language features. In particular, control-flow mechanisms that defer execution, such as lambda abstractions with effectful bodies, lazy evaluation strategies, or (multi-)staging, are neither algebraic [17] nor scoped. This implies that the only way to implement these mechanisms is by means of sophisticated encodings that are often relatively low-level and non-modular. As such, these control-flow mechanisms present a severe challenge for allowing non-experts to build programming languages by composing off-the-shelf components.

This paper presents a novel form of effects and handlers that addresses this challenge: latent effects and handlers. Latent effect handlers defer the running of side effects, such that they are “latent” until handlers for all effects have been applied. This notion of effects can be used to directly define handlers for language components that defer execution, such as lambda abstractions with effectful bodies, lazy evaluation, and (multi-)staging.

After introducing the background (Sect. 2), our main contributions are:

  • We introduce latent effect trees, implemented using Haskell (Sect. 3).

  • We show how to encode lambda abstraction and application in terms of latent effects. We illustrate how this allows us to define lambda abstraction with effectful bodies as a composable language component (Sect. 3).

  • We illustrate how to compose languages from reusable components using latent effects by showing how to encode lazy evaluation (call-by-need and call-by-name), and MetaML-like multi-staging (Sect. 4).

  • We provide an effect library with syntax and semantics in Haskell for various simple and advanced language features. These features can be used as isolated components to construct languages.

Finally, we discuss related work (Sect. 5) and conclude (Sect. 6). For an extended version of this work, including a library of effects and an elaboration of our two case studies, we refer to van den Berg et al. [2].

2 Background and Motivation

This section summarizes the state of the art in modular language definition by means of DTC and AE&H, discusses the challenges and problems associated with treating lambdas in this setting, and sketches how our new latent effects enable the integration of modular forms of advanced control-flow effects.

2.1 Modular Syntax and Semantics with Data Types à la Carte

Data Types à la Carte [23] (DTC) solves the core problem of assembling language definitions from isolated components: it allows both the syntax and the semantics of a language to be composed in a modular way. For an accessible introduction to DTC, we refer to [23]; we summarize the key points here.

Firstly, the abstract syntax of a language is modularized into the syntax of each isolated feature. This is achieved by defining a recursive type \( Syntax \; S \) of ASTs in terms of the shape \( S \) of its nodes. This shape \( S \) can be composed out of shapes \( S _{1}\), ... ,\( S _{n}\) of the individual features: \( S \mathrel {=} S _{1}\mathbin {+}\,\ldots \,\mathbin {+} S _{n}\).

Secondly, the semantic function \( denote \mathbin {::} Syntax \; S \rightarrow M \) that maps ASTs \( Syntax \; S \) to their meaning \( M \) is modularized into the separate syntactic maps of the individual features. This is done by parameterizing the recursive semantic map of the AST with the semantic mapping of the individual nodes, \( denote \mathrel {=} fold \; denote _{S}\) where \( denote _{S}\mathbin {::} S \; M \rightarrow M \) is composed out of the semantic maps:

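  -- A minimal sketch in the style of DTC [23]; GHC's TypeOperators is assumed.
  newtype Fix s = In (s (Fix s))                  -- ASTs over a node shape s

  data (s1 + s2) a = Inl (s1 a) | Inr (s2 a)      -- co-product of shapes

  fold :: Functor s => (s m -> m) -> Fix s -> m   -- the recursive semantic map
  fold alg (In t) = alg (fmap (fold alg) t)

  -- The semantic map of a composed shape dispatches to those of its parts:
  denoteS :: (s1 m -> m) -> (s2 m -> m) -> (s1 + s2) m -> m
  denoteS d1 d2 (Inl x) = d1 x
  denoteS d1 d2 (Inr x) = d2 x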

This modular approach affords great flexibility to quickly assemble a range of different languages and to explore language design: features can be added or removed, and their semantic maps can be changed. We use the DTC approach extensively when defining a library of effects in Sect. 4.3.

Unfortunately, this approach comes with a serious limitation: to be able to combine the semantic maps of different features, they must agree on the semantic domain \( M \). This often prohibits unanticipated combinations of features, even due to small differences in their semantic domain, such as one feature requiring access to an environment or store that the other feature does not expect.

Moggi [13] observed that many of the differences in semantic domains of common language features, usually called (side-)effects, can be captured in the category-theoretical pattern of monads. Although not all monads compose, the state of the art in modular monads, algebraic effects and handlers (AE&H) [18], is well-aligned with the DTC approach of modular language features.

In fact, AE&H tackles the problem of modularly defining a monad in much the same way DTC tackles the problem of modularly defining a language. Indeed, the API of the monad (its “syntax”) is modularized into separate effects; the category-theoretical concept of a free monad plays the role of an abstract syntax of API calls. A modular meaning is assigned to the free monad by means of semantic maps for the separate effects. The key difference with DTC is that the different effects do not all have to be mapped to the same semantics. Instead, by means of “modular carriers” their semantics can be layered [20].

Whereas AE&H is used to define (simpler) language features, we use latent effects and handlers, which we introduce in Sect. 3, to modularly define more complex language features.

2.2 Advanced, Non-modular Control-Flow Effects

The AE&H literature shows how to express a range of different control-flow effects such as exceptions, non-determinism and even call-with-current-continuation. However, as traditional effects and handlers rely on assumptions about data-flow and control-flow, more advanced and complicated control-flow effects are missing, such as call-by-name and call-by-need evaluation or multi-staging. These features typically defer execution and are non-algebraic.

An operation is algebraic when it meets the following requirements:

  1. It can be expressed as \( op \mathbin {::} D \rightarrow \forall a . M \; a \rightarrow \mathbin {...}\rightarrow M \; a \), where \( D \) represents the non-computational parameters that the operation has, \( M \) is a monad with algebraic operations, and each \( M \; a \) typed parameter represents a possible continuation of the operation.

  2. It satisfies the algebraicity property, which states that, no matter which possible continuation \( m _ i \) we take, the continuation immediately transfers control to the current continuation \( k \):

    $$\begin{aligned} op \; d \; m _{1}\,\ldots \, m _ n \mathbin {>\!\!\!>=} k \equiv op \; d \;( m _{1}\mathbin {>\!\!\!>=} k )\,\ldots \,( m _ n \mathbin {>\!\!\!>=} k ) \end{aligned}$$

Although many operations are algebraic, there are many common control-flow operations that are not. For instance, \( catch \mathbin {::}\forall a . M \; a \rightarrow M \; a \rightarrow M \; a \) is an operation that executes the computation in the first argument, and if that computation raises an exception, proceeds by executing the second argument. The \( catch \) operation is not algebraic; i.e.,

$$\begin{aligned} catch \; m _{1}\; m _{2}\mathbin {>\!\!\!>=} k \not \equiv catch \;( m _{1}\mathbin {>\!\!\!>=} k )\;( m _{2}\mathbin {>\!\!\!>=} k ) \end{aligned}$$

since exceptions raised during evaluation of \( k \) should not be handled by \( m _{2}\).

The lack of support for control-flow effects such as exception handling, motivated the development of scoped effects [29]. An operation is scoped when it can be expressed as having the following type:

$$\begin{aligned} op \mathbin {::}\forall a .\, D \rightarrow M \; a \rightarrow \mathbin {...}\rightarrow M \; a \rightarrow \forall b .\,( a \rightarrow M \; b )\rightarrow \mathbin {...}\rightarrow ( a \rightarrow M \; b )\rightarrow M \; b \end{aligned}$$

The universal quantification over \( a \) restricts data flow: for a given operation \( op \; d \; m _{1}\mathbin {...} m _ n \; k _{1}\mathbin {...}k_m\), it is only the possible continuations \( k _{1}\mathbin {...}k_m\) that can inspect values yielded by computations \(m_i\). The allowable destinations of data produced by the computation are restricted to those determined by the operation. The \( catch \) operation is compatible with this restriction: it can be implemented as a scoped operation \( catch' \mathbin {::}\forall a . M \; a \rightarrow M \; a \rightarrow \forall b .( a \rightarrow M \; b )\rightarrow M \; b \).

However, this pattern does not apply to more advanced control-flow effects for which the data produced by a computation can be used outside of the operation. For example, lambda abstractions delay the execution of computations in the body of a function until the function is applied (or not, depending on dynamic control- and data flow). To support such deferred execution, the return type \( V \) in the body of a lambda abstraction \( abstr \mathbin {::} String \rightarrow M \; V \rightarrow M \; V \) is not universally quantified. Thus, \( abstr \) is not a scoped operation. It is also not algebraic, as the equation \( abstr \; x \; m \mathbin {>\!\!\!>=} k \equiv abstr \; x \;( m \mathbin {>\!\!\!>=} k )\) would cause \( k \) to be (wrongly) deferred, and could cause free variables in \( k \) to be captured. Other control-flow effects, such as call-by-need and call-by-name evaluation and multi-staging annotations for (dynamically) staging and unstaging code, defer execution in a similar way, and are similarly neither scoped nor algebraic.

It is theoretically possible to define the control-flow effects discussed above by making the control flow of all operations explicit; e.g., by writing interpreters in continuation-passing style (CPS). However, the relatively low-level nature of CPS and its non-modular style make this approach fall short of our goal of composing languages from simple, off-the-shelf components.

2.3 Our Approach: Latent Effects

We provide modular support for advanced control-flow effects such as function abstraction, with its different evaluation semantics, and staging. Our solution generalizes the approach of DTC and AE&H outlined above. In fact, it does not require any changes to the DTC approach for modular abstract syntax trees and modular semantic mapping. It only impacts the second part of the pipeline, replacing AE&H with a more general notion of latent effects and handlers (LE&H).

LE&H is based on a different, more sophisticated structure than AE&H’s free monad. This structure supports non-atomic operations (e.g., function abstraction, thunking, quoting) that contain or delimit computations whose execution may be deferred. The layered handling is also different. The idea is still the same: to replace, bit by bit, the structure of the tree by its meaning. Yet, while AE&H grows the meaning around the shrinking tree, LE&H grows little “pockets of meaning” around the individual nodes remaining in the tree, not just around the root. The latter supports deferred effects because later handlers can still re-arrange the semantic pockets created by earlier handlers.

LE&H are the first to modularly express advanced control-flow effects, such as staging and lambda abstractions, and provide different handlers, e.g., for call-by-name and call-by-need evaluation. Moreover, they combine with existing algebraic effects to express varying semantics for a large range of languages.

3 Latent Effects

This section presents latent effects. Latent effects generalize algebraic effects to include control-flow mechanisms that defer computation. As our running example we use lambda abstraction, which—as discussed in Sect. 2.2—is neither an algebraic nor a scoped operation. We show that it can be defined as a latent effect. We start from a non-modular version that we refine in two steps. First we add support for modular signatures, and then support for modular handlers.

3.1 Non-modular Definition of Lambda Abstraction

First we provide a non-modular definition of the lambda abstraction effect.

Monadic Syntax Tree. The type \( LC \; v \; a \) is a non-modular monadic syntax tree that supports three primitive operations for a de Bruijn indexed \(\lambda \)-calculus.

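In Haskell, a sketch of this type reads:

  data LC v a
    = Return a                    -- return a pure result
    | Var Int (v -> LC v a)       -- look up a variable; continue with its value
    | App v v (v -> LC v a)       -- apply a value to a value; continue with the result
    | Abs (LC v v) (v -> LC v a)  -- build a closure from a function body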

Here, the \( v \) of \( LC \; v \; a \) is a value type parameter, and \( a \) represents the return type of the computation. Thus \( Return \; x \) is a trivial computation that returns \( x \). \( Var \; i \; k \) retrieves the value of type \( v \) associated with the \( i \)th variable and passes it to the continuation \( k \). The application \( App \; v _{1}\; v _{2}\; k \) applies the value \( v _{1}\) to the value \( v _{2}\) and passes the result to the continuation. Finally, \( Abs \; e \; k \) builds a closure value out of the function body \( e \) and passes it to the continuation.

For example, we can represent the lambda expression \((\lambda x \rightarrow x )\;\mathrm {1}\) as the \( LC \) expression \( Abs \;( Var \;\mathrm {0}\; Return )\;(\lambda v \rightarrow App \; v \;\mathrm {1}\; Return )\). This computation constructs an abstraction and passes it to the continuation as a (closure) value \( v \). The continuation applies \( v \) to \(\mathrm {1}\) and passes the result to \( Return \). The closure retrieves the value of variable with index \(\mathrm {0}\) (i.e., \( x \)) and passes it to \( Return \).

Handler. The idea of a handler is to map the syntax tree onto its meaning. We illustrate this on the \( LC \; v \; a \) type, where we use the type \( v \mathrel {=} Closure \) for values.

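The corresponding types can be sketched as follows:

  type Ptr     = Int                   -- function pointer: index into the store
  type Env     = [Closure]             -- maps de Bruijn indices to values
  data Closure = Closure Ptr Env       -- function pointer plus captured environment
  type Store   = [LC Closure Closure]  -- resumption store: deferred function bodies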

A closure contains a function pointer and an environment. The function pointer is an index into a list of deferred computations (i.e., function bodies) that we call the (resumption) store. The environment is a list that maps the closure’s parameters (which are de Bruijn indices) onto their values.

Now we are ready to define the handler \( handleAbs \) as a function that, given an initial environment and store, maps an \( LC \; Closure \; a \) computation onto its meaning, which is a tuple of the result value of type \( a \) and the final store.

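A sketch of this handler, with the four cases described below (the argument order is our choice):

  handleAbs :: Env -> Store -> LC Closure a -> (a, Store)
  handleAbs env s (Return x) = (x, s)
  handleAbs env s (Var i k)  = handleAbs env s (k (env !! i))
  handleAbs env s (App (Closure p env') v k) =
    let (v', s') = handleAbs (v : env') s (s !! p)   -- run the stored body
    in  handleAbs env s' (k v')
  handleAbs env s (Abs e k)  =
    handleAbs env (s ++ [e]) (k (Closure (length s) env))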

First, the leaf case of the handler returns the value in that leaf, supplemented with the resumption store. Next, the variable case looks up the variable in the environment and passes it to the continuation. Then, the application case unpacks the closure, retrieves the corresponding function body from the resumption store and applies it to the extended environment and store. The resulting value is passed to the continuation. Finally, the abstraction case adds the function body to the resumption store, creates a closure that indexes this new entry, and calls the continuation on this closure value.

3.2 Modular Latent Effect Signatures and Trees, Naively

We now modularize the definition of \( LC \) by separating the recursive structure of the monadic syntax from the node shapes of the \( Var \), \( App \) and \( Abs \) operations.

Latent Effect Signature. We call the node shapes the latent effect signature. In this case, it is called \( Abstracting \; v \) with \( v \) the type of values.

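As a Haskell GADT, the signature can be sketched as:

  data Abstracting v p c where
    Var' :: Int    -> Abstracting v v NoSub        -- no subcomputations
    App' :: v -> v -> Abstracting v v NoSub        -- no subcomputations
    Abs' ::           Abstracting v v (OneSub v)   -- one subcomputation: the body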

Besides its first parameter \( v \), the type \( Abstracting \; v \; p \; c \) is indexed by two further parameters: The parameter \( p \) is the return type of the primitive operations; this is the type of value they pass to their continuation. As all three primitive operations return a value of type \( v \), we have that \( p \mathrel {=} v \). The parameter \( c \mathbin {::}\mathbin {*}\rightarrow \mathbin {*}\) captures the number and result type of the subcomputations. As \( Var' \) and \( App' \) have no subcomputations, they use the type \( c \mathrel {=} NoSub \) to indicate that. However, \( Abs' \) has a subcomputation and it indicates this with \( c \mathrel {=} OneSub \; v \). This subcomputation is the body of the function abstraction, whose return type is \( v \). Hence, \( OneSub \; v \) has one constructor \( One \mathbin {::} OneSub \; v \; v \).

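These subcomputation indices can be sketched as:

  data NoSub a                 -- empty: no subcomputations to select
  data OneSub v r where
    One :: OneSub v v          -- a single subcomputation with result type v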

Latent Effect Tree, Version 1. The type \( Tree _{1}\; \sigma \; a \) extends a latent effect signature \( \sigma \) into a recursive syntactic structure that is a monad in \( a \).

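A sketch of this tree type:

  data Tree1 sig a where
    Leaf1 :: a -> Tree1 sig a
    Node1 :: sig p c                         -- the operation at this node
          -> (forall x. c x -> Tree1 sig x)  -- selects the subcomputations
          -> (p -> Tree1 sig a)              -- the continuation
          -> Tree1 sig a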

The \( Leaf _{1}\) constructor is trivial; \( Leaf _{1}\; x \) returns a pure computation with result \( x \). The internal nodes are of the form \( Node _{1}\; op \; sub \; k \), where the fields have the following meaning. The first, \( op \), identifies what primitive operation the node represents. Next, \( sub \) is a function that, in case of a non-atomic primitive, selects the subcomputations of the node. Finally, \( k \) is the continuation of further operations to perform after the current one.

Some notable characteristics are as follows:

  • Every operation has a result type \( p \) that is made available to its continuation, and a number of subcomputations \( c \). To model these two, the signature of an operation \( op \mathbin {::} \sigma \; p \; c \) is parameterized by \( p \) and \( c \).

  • The function \( sub \) has type \(\forall x . c \; x \rightarrow Tree _{1}\; \sigma \; x \). The input of type \( c \; x \) determines what subcomputation to select; the parameter \( x \) indicates the result type of that subcomputation.

  • Likewise, continuations take as input the operation’s output value (\( p \)).

  • The \( Tree _{1}\) data type is monadic, with a similar notion of return and bind as the free monad [23]:

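      -- A sketch; Functor/Applicative are given via liftM and ap (Control.Monad).
      instance Functor (Tree1 sig) where fmap = liftM
      instance Applicative (Tree1 sig) where pure = Leaf1; (<*>) = ap
      instance Monad (Tree1 sig) where
        Leaf1 x        >>= f = f x
        Node1 op sub k >>= f = Node1 op sub (\p -> k p >>= f)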

    A monadic binding \( t \mathbin {>\!\!\!>=} f \) thus “concatenates” the tree in \( f \) to the leaf positions in the continuation (only) of \( t \).

We can emulate the non-modular type \( LC \; v \; a \) with \( LC' \; v \; a \).

$$\begin{aligned} \mathbf {type}\; LC' \; v \; a \mathrel {=} Tree _{1}\;( Abstracting \; v )\; a \end{aligned}$$

The counterpart of \( LC \)’s \( Return \) constructor is \( Leaf _{1}\). The \( Var \) constructor is represented with a \( Node _{1}\):

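  -- A sketch; the empty pattern match uses GHC's EmptyCase.
  var1 :: Int -> (v -> LC' v a) -> LC' v a
  var1 i k = Node1 (Var' i) (\x -> case x of {}) k   -- NoSub: no branches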

This is a \( Var' \; i \) node. As there are no subcomputations, there are no branches in the pattern match in the selection function on the value \( x \) of the empty type \( NoSub \). Lastly, the continuation \( k \) receives the value produced by the operation.

The encodings of the two other operations are similar. One notable aspect of \( abs _{1}\) is that it does have one subcomputation. Hence, the selection function matches on the \( One \) constructor and returns \( t \):

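  -- A sketch; matching on One selects the single subcomputation.
  abs1 :: LC' v v -> (v -> LC' v a) -> LC' v a
  abs1 t k = Node1 Abs' (\s -> case s of One -> t) k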

Modular Tree Constructors. We can create modular constructors for latent effect operations, similarly to how DTC admits modular syntax constructors [23]. To this end, we use a co-product operator \(\mathbin {+}\) that combines latent effect signatures, and a subtyping relation \( \sigma _{1}\mathbin {<} \sigma _{2}\) with an associated injection function, \( injSig \mathbin {::} \sigma _{1}\; p \; c \rightarrow \sigma _{2}\; p \; c \). Using these, we can implement the modular constructor functions in Fig. 1, that allow combining \( Abstracting \) with other latent and algebraic effects. The subtyping requirements in the type signatures are automatically inferrable by type class instance resolution in Haskell. The implementation details of \(\mathbin {+}\) and \(\mathbin {<}\) are given in DTC [23] or Appendix A of van den Berg et al. [2].

Fig. 1. The modular constructor functions of the \( Abstracting \) effect. These functions all fix the continuation to \( Leaf _{1}\), which can easily be replaced by an arbitrary continuation \( k \) using the \(\mathbin {>\!\!\!>=}\) operator of \( Tree _{1}\)’s monad instance.
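These constructors plausibly take the following shape (a sketch; \( injSig \) and the \(\mathbin {<}\) class are as introduced above):

  var :: (Abstracting v < sig) => Int -> Tree1 sig v
  var i = Node1 (injSig (Var' i)) (\x -> case x of {}) Leaf1

  app :: (Abstracting v < sig) => v -> v -> Tree1 sig v
  app v1 v2 = Node1 (injSig (App' v1 v2)) (\x -> case x of {}) Leaf1

  abs :: (Abstracting v < sig) => Tree1 sig v -> Tree1 sig v
  abs t = Node1 (injSig Abs') (\s -> case s of One -> t) Leaf1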

We can now use these modular constructors to implement denotation function cases. We can also write programs using the constructors directly. For example, the following program contains a lambda abstraction with an effectful body:

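  -- A sketch consistent with the surrounding text; assumes V admits numeric
  -- literals and (+), and that get/put are Mutating's smart constructors.
  prog :: Tree1 (Mutating V + (Abstracting V + Ending)) V
  prog = do
    put 1                        -- state at the definition site of the lambda
    f <- abs (do x <- var 0      -- the function argument
                 n <- get        -- the state, read inside the body
                 return (x + n))
    put 2                        -- state at the call site
    app f 3                      -- apply the abstraction to 3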

The body of the function abstraction increments the function argument value (de Bruijn index 0) by the \( n \) yielded by the \( get \) operation. The signature of the program tree is the co-product of three signatures: (1) \( Mutating \; V \) for mutable state; (2) function abstractions \( Abstracting \; V \); and (3) the empty signature \( Ending \), which provides no operations and serves as the base case. The \( Mutating \; V \) effect recasts the traditional algebraic state effect as a latent effect, and has two operations, \( get \) and \( put \). Observe that \( prog \) is essentially an AST, with multiple possible interpretations. If \( Mutating \; V \) is dynamic (runtime) state, then \( prog \) evaluates to \(3+2=5\). However, if it is for macro bindings that are expanded statically, then the \( get \) in the body of the abstraction is evaluated under the state at the “definition site” of the lambda, and \( prog \) evaluates to \(\mathrm {3}\mathbin {+}\mathrm {1}\mathrel {=}\mathrm {4}\). Next, we show how handlers can map the \( prog \) syntax tree to different semantics.

3.3 Trees with Support for Modular Handlers

In the case of a modularly composed signature \( \sigma \mathrel {=} \sigma _{1}\mathbin {+}\mathbin {...}\mathbin {+}\sigma _n\mathbin {+} Ending \), the idea is to compose the handler function from individual handlers for the different components of the signature \( h \mathrel {=} hEnd \circ h _n\circ \mathbin {...}\circ h _{1}\). The type of each handler would be \( h _i\mathbin {::}\forall \sigma . Tree _{1}\;(\sigma _i\mathbin {+} \sigma )\; a \rightarrow Tree _{1}\; \sigma \;( L _i\; a )\). Hence, it is polymorphic in the remaining part of the signature and preserves those nodes in the resulting tree. It only replaces the nodes of \(\sigma _i\) with their meaning, which is given in the form of a functor \( L _i\) that decorates the result type \( a \).

Unfortunately, our \( Tree _{1}\) type and, in particular, the type of its \( Node _{1}\) constructor, needs some further refinement to fully support this idea. Indeed, if the signature is \(\sigma _i\mathbin {+} \sigma \), and we apply \( h _i\) to all the recursive occurrences of \( Tree \;(\sigma _i\mathbin {+} \sigma )\) in a \( \sigma \)-node \( Node _{1}\;( Inr' \; op )\; sub \; k \), we get:

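  -- The node's fields after handling (writing l for the functor L_i):
  op  :: sig p c                           -- unchanged
  sub :: forall x. c x -> Tree1 sig (l x)  -- subcomputation results now decorated by l
  k   :: p -> Tree1 sig (l a)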

The resulting fields do not together form a node of type \( Tree _{1}\; \sigma \;( L _i\; a )\): the result type of the subcomputations is \(( L _i\; x )\) rather than the \( x \) which the \( Node _{1}\) constructor requires \( sub \) to have as return type.

The problem is that the \( Tree _{1}\) type is oblivious to the effect functor that the return type of subcomputations in the tree are decorated by. To solve this problem, we can expose the effect functor decoration in the tree type itself; e.g.,

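  -- One candidate (a sketch): subcomputations are decorated immediately.
  data Tree2 sig l a where
    Leaf2 :: a -> Tree2 sig l a
    Node2 :: sig p c
          -> (forall x. c x -> Tree2 sig l (l x))  -- already-handled subcomputations
          -> (l p -> Tree2 sig l a)                -- continuation on decorated results
          -> Tree2 sig l a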

But the \( Tree _{2}\) type requires effect handlers to be applied to subcomputations immediately. Motivated by modeling constructs that defer computation, we generalize the type further by parameterizing subcomputations by the effect functor state, and making each node “remember” the effect state (the latent effects):

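  -- A sketch of the generalized type: every node remembers a latent effect
  -- state l (), and its subcomputations await such a state.
  data Tree sig l a where
    Leaf :: a -> Tree sig l a
    Node :: sig p c                                      -- the operation
         -> l ()                                         -- remembered latent effect state
         -> (forall x. c x -> l () -> Tree sig l (l x))  -- deferred subcomputations
         -> (l p -> Tree sig l a)                        -- the continuation
         -> Tree sig l a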

In Sect. 3.4 we discuss how \( Tree \) supports deferring computation. \( Tree \) is a monad with a return and bind defined similarly to the ones for \( Tree _{1}\) in Sect. 3.2. We can also define modular tree constructors using similar techniques as in Sect. 3.2. For instance, using \( Tree \) instead of \( Tree _{1}\), the type of \( prog \) from Sect. 3.2 becomes:

$$\begin{aligned} prog \mathbin {::} Tree \;( Mutating \; V \mathbin {+}( Abstracting \; V \mathbin {+} Ending ))\; Id \; V \end{aligned}$$

Here, the \( Id \) functor models the absence of latent effects in the tree. The type \( V \) represents a concrete value type.

Example. Figure 2 shows how the type of the \( prog \) tree evolves when applying successive handlers for the three parts of the signature:

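  -- A plausible spelling of the pipeline; the handler names and initial
  -- arguments (state 0, empty environment, empty store) are assumptions.
  result = hEnd (hAbsCS [] [] (hMut 0 prog))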

First, we run the modular handler for mutable state \( Mutating \; s \), which has type:

$$\begin{aligned} hMut \mathbin {::} Functor \; l \Rightarrow s \rightarrow Tree \;( Mutating \; s \mathbin {+} \sigma )\; l \; a \rightarrow Tree \; \sigma \;( StateL \; s \; l )\;( s , a ) \end{aligned}$$

Given an initial state of type \( s \), this handler transforms a tree into another tree. The signature of the tree evolves from \( Mutating \; s \mathbin {+} \sigma \) to \( \sigma \) because the handler interprets the mutable state, but not the other effects. Also, the latent effect functor evolves from \( l \) (the latent effects already present) to \( StateL \; s \; l \), which augments \( l \) with the value of the intermediate state.

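This functor can be sketched as follows (the field name is our own):

  newtype StateL s l a = StateL { unStateL :: (s, l a) }

  instance Functor l => Functor (StateL s l) where
    fmap f (StateL (s, la)) = StateL (s, fmap f la)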

The result type evolves from \( a \) to \(( s , a )\), which augments it with the final state.

Second, the handler for \( Abstracting \; V \) behaves similarly to \( hMut \), removing itself from the signature and growing the latent effect functor. Finally, \( hEnd \) handles the \( Ending \) base case. It takes a tree with an empty signature, which thus necessarily only contains a leaf, and extracts the final result out of it.

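In sketch form:

  hEnd :: Tree Ending l a -> a
  hEnd (Leaf x) = x   -- Ending offers no operations, so only leaves remain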

The remainder of this section illustrates how modular handlers are implemented.

Fig. 2. The type of \( prog \) after successive handling steps.

3.4 Example: Two Modular Handlers for Function Abstractions

We implement two different modular handlers for the operations in \( Abstracting \), which illustrate (1) how latent effects let us write handlers for function abstraction, and (2) the kind of fine-grained control the handlers provide. The first handler we consider evaluates the body of a function abstraction using the latent effects of its call site. Hence, the evaluation of side-effectful operations is postponed until the function is applied. The second handler evaluates the body of the function abstraction using the latent effects of its definition site. This immediately enacts the latent effects introduced by previously-applied handlers.

Modular Closure Values. A concern that arises when we step away from the earlier non-modular handler for \( Abstracting \) is reuse. Notably, in a modular setting we want to allow reuse of both handlers with different notions of values. For that reason, they are parameterized in the type of values \( v \). This type may comprise various shapes of values; all the function abstraction handlers require is that closures are one possible shape of value.

To express this requirement, we introduce another type class \( v _{1}\mathbin {<:} v _{2}\) for subtyping, this time at kind \(\mathbin {*}\), which witnesses (1) that any \( v _{1}\) can be “upcast” to the type \( v _{2}\); and (2) that a \( v _{2}\) can be “downcast” to type \( v _{1}\). The latter may fail, but the former does not. The minimal requirement for lambda abstractions is that the value type includes closure values.

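A sketch of the class and of the modular closure values (the method names follow the \( inj _{\mathrm {v}}\) and \( proj _{\mathrm {v}}\) calls used later):

  class v1 <: v2 where
    injV  :: v1 -> v2          -- upcast: always succeeds
    projV :: v2 -> Maybe v1    -- downcast: may fail

  data Closure v = Closure Ptr (Env v)   -- closures over an arbitrary value type
  type Env v     = [v]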

In the modular setting, the types \( Closure \; v \) and \( Env \; v \), of respectively closures and value environments, are parameterized in the type of values used.

Modular Resumption Store. Recall that the resumption store keeps track of the function bodies whose execution has been deferred; i.e., it is a list of resumptions. In the modular setting, the type of resumptions is parametric in the specific type of signature, latent effect functor and value type.

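For instance (a sketch; the name \( Res \) is our own, with its two instantiations shown below):

  type Store sig l v = [Res sig l v]   -- the resumption store: a list of resumptions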

Moreover, depending on whether we want to handle latent effects on the call site or definition site, the definition of a resumption differs.

A resumption of a call-site effect is a function that takes an \( l \;()\) input, which is the latent effect context of the call site where the resumption is evaluated.

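Sketched as a type synonym:

  -- Call-site resumptions: functions awaiting the call site's latent effect context.
  type Res sig l v = l () -> Tree sig l (l v)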

The resumptions of a definition-site effect store are trees instead of functions that produce trees. Indeed, they have no dependency on the latent effects of the call site. Instead, they have been fully determined by the definition site.

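Sketched as (replacing the call-site definition above):

  -- Definition-site resumptions: trees fully determined at the definition site.
  type Res sig l v = Tree sig l (l v)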

Although the resumption store makes the handlers verbose, it is a more modular solution than storing \( Tree \)s in values.

Modular Handlers. Figure 3 shows the modular handler \( hAbs _{CS}\) that uses the call-site latent effects when executing a function body.

Compared to the non-modular definition, there are several differences. Firstly, the handler only interprets part of the tree and thus returns the remaining tree. Hence, the \( Leaf \) case now returns a new leaf, and the other cases use monadic \(\mathbf {do}\)-notation to build a new tree. Secondly, because the signature is a composition and the resulting value type is too, the pattern matching on operations involves the \( Inl' \) and \( Inr' \) tags of the \(\mathbin {+}\) co-product; the pattern matching and construction of values involves \( inj _{\mathrm {v}}\) and \( proj _{\mathrm {v}}\) calls for the same reason. Thirdly, the latent effects now matter and need to be properly threaded in all the operation cases. Finally, there is an additional operation case, to handle unknown operations from the remaining part of the signature by “forwarding” them, i.e., copying them to the resulting tree for later handling.

As discussed, in a modular setting a second handler is possible: one that uses the latent effects of the definition site for function bodies rather than their call site. This definition-site handler, \( hAbs _{DS}\), looks much like its sibling. The key difference is the type of resumptions, which affects the code in two places (highlighted in gray). Firstly, the abstraction case applies the subtree function to the latent effect of the definition site instead of deferring the application (\( st \; One \; l \) instead of \( st \; One \)). Dually, the application case does not have to apply the resumption to the call-site latent effect (\( r \mathbin {!!} p \) instead of \(( r \mathbin {!!} p )\; l \)).

Fig. 3. Modular call-site abstraction handler. The gray highlights indicate the places where it differs from a definition-site handler.

Example. With the abstraction handlers in place, let us revisit the \( prog \) example. We run the handlers with default initial values, i.e., 0 for the state, the empty variable environment and the empty resumption store. When using the call-site abstraction handler after the state handler, the function body uses the value of the state that was written right before its invocation.

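A sketch of this run, reusing the assumed handler names from Sect. 3.3; following the discussion of \( prog \) in Sect. 3.2, the call-site semantics yields \(3 + 2 = 5\):

  callSite = hEnd (hAbsCS [] [] (hMut 0 prog))   -- value component of the result: 5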

4 Case Studies

This section reports on two advanced control-flow features implemented with our library: call-by-need lambdas (Sect. 4.1) and multi-staging (Sect. 4.2); and on a case study implementation of a library with a range of modular effects (Sect. 4.3). For the source code of these case studies, we refer to the implementation available at https://github.com/birthevdb/Latent-Effect-and-Handlers.git.

4.1 Call-by-Need Evaluation

We have implemented two different evaluation strategies, call-by-need (lazy) and call-by-value (CBV), for lambdas by using different latent effect handlers. Our approach is inspired by Levy’s call-by-push-value [10], which can express both strategies. We summarize here; Appendix B of van den Berg et al. [2] has all the details.

Call-by-need evaluation lazily delays the evaluation of argument expressions of function applications, and uses memoization to ensure that evaluation only happens once for delayed expressions. We build a lazy semantics for function abstractions out of three primitive effects:

  1. The \( Reading \) effect corresponds to the well-known reader monad from the Haskell monad transformer library [11].

  2. The \( Suspending \) effect delays the evaluation of function bodies, without memoizing the result of the evaluation of the delayed subtrees.

  3. The \( Thunking \) effect delays the evaluation of argument expressions of function applications, memoizing the result of forcing a thunked computation.

The definition of these effects and their handlers can be found in the effect library of Appendix A of van den Berg et al. [2]. Using these effects, we define three operations for lazy evaluation (\( abs _{lazy}\), \( var _{lazy}\), and \( app _{lazy}\)). Lambda abstraction suspends the body of a lambda, and pairs a pointer to the suspension with the environment that the thunk should be evaluated under. The \( var _{lazy}\) and \( app _{lazy}\) functions memoize and recall argument values (possibly by forcing the evaluation of a thunked computation), and evaluate the body of a lambda. Application evaluates the first argument to a function value, and memoizes the second argument, which is placed in the current environment. Then, the function body is executed.

The following example program evaluates to 0 when using lazy evaluation:

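  -- A sketch consistent with the description below; the concrete program in
  -- the implementation may differ in details.
  progLazy = do
    f <- abs_lazy (return 0)          -- the body ignores its parameter
    app_lazy f (put 1 >> return 1)    -- the argument's put is delayed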

Function application delays the evaluation of \( put \) in the argument; it is never executed because the function body does not reference its parameter. We can run the program with call-by-need by applying its handlers:

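  -- A sketch; the handler names (hRead, hSusp, hThunk) and their order are
  -- assumptions, cf. the effect library in [2].
  result = inspect (hEnd (hThunk [] (hSusp [] (hRead [] (hMut 0 progLazy)))))
  -- result == 0 under call-by-need: the delayed put is never forced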

The \( inspect \) function extracts the final value out of the result that is decorated with the latent effect functor (in this case nested \( StateL \)’s).

We can also recover a CBV semantics from \( abs _{lazy}\), \( var _{lazy}\), and \( app _{lazy}\) by implementing an alternative handler for the \( Thunking \) effect. This handler eagerly evaluates subtrees and stores their value in a store.


This case study demonstrates that modular call-by-need can be implemented by decomposing it into modular, primitive latent effects and handlers. It also shows how overloading the handler of the \( Thunking \) effect provides a means of changing the semantics of a program without touching the program itself.

Fig. 4. A MetaML program (left) and its latent effects implementation (right).

4.2 Staging

Another advanced control-flow feature that we have implemented with latent effects is multi-staging. By applying effect handlers before the handler for the staging effect, we can control which effects should be staged, and which not. The implementation of these staging constructs can be found in Appendix C of van den Berg et al. [2].

We take our inspiration from the three constructs of MetaML [25]: (1) bracket expressions (\(\langle e \rangle \)) delay execution to a later stage; (2) escape expressions (\({\sim } e \)) splice a staged expression into another; and (3) run expressions (\( run \; e \)) run an expression that has been dynamically generated by bracket and escape expressions.

A key feature of MetaML is that staged code is statically typed and lexically scoped. The staging constructs that we implement differ in two ways: our staging constructs are untyped, and we provide two constructs for splicing code (\( push \) and \( splice \)) instead of the single escape expression found in MetaML.

We use \( push \) for writing programs with escape expressions under binders in staged code. The dynamic semantics of \( push \) creates an environment with “holes” that represent unknown bindings, and \( splice \) automatically fills in these holes with bindings from the dynamic context of the \( splice \) expression.

The four staging constructs we implement are thus: (1) \( quote \), corresponding to brackets in MetaML; (2) \( unquote \), corresponding to escape in MetaML; and (3+4) \( splice \) and \( push \) for code splicing. The programs in Fig. 4 illustrate the difference in how splicing works. The MetaML program on the left prints a string and returns a value. The program on the right desugars into latent effects. With the appropriate handlers, it gives the same output.

Yet, by switching the order of handlers for the \( print \) effect and staging, we obtain a different semantics that eagerly handles \( print \) operations inside \( quote \)d code. This makes the program on the right print at quotation time instead.

4.3 Library Summary

We have given two examples where latent effects can be modularly composed to form language features. Figure 5 gives an overview of our effect library and how the primitive effects are combined into language features.

Fig. 5. Effect library with Lines of Code (LoC) per effect.

The left part shows the general framework code for implementing latent effects, consisting of \( Tree \)s, the DTC approach and helper definitions (e.g. \( Id \), \( Void \)); the figure also indicates the associated lines of code (LoC). The middle part shows ten different effects and their LoC. Each effect comes with an effect signature, a handler, and smart constructors for their operations. For the detailed implementation of these effects, we refer to the effect library (Appendix A of van den Berg et al. [2]). The right part contains several language features that can be implemented using these effects, with their associated LoC. Each feature comes with its object language syntax and a mapping onto the effects. Each language requires an additional two LoC, to indicate the effects and handlers used and their order. A different order of effects and handlers may give different semantics.

Figure 5 only includes a few language features covered in the paper. However, as we provide ten effects and handlers, they can be modularly composed in different orders, using different combinations. In theory, with the algebras fixed, we can define \((10! + 9! + \ldots + 2! + 1!) = 4{,}037{,}913\) semantics, although some compositions may result in the same behaviour. Even more variations are possible by varying the algebra that maps the object language syntax to the effects.

5 Related Work

Modular Semantics and Effects. Modularity has received much attention both at the level of language definitions and of the effects used by those languages. A landmark is the formulation of the expression problem [26], the challenge to modularly extend languages with new features and new interpretation functions. As different language features use different effects, this also requires the modular composition of those effects. Monad transformers [11] are the most prominent approach, including alternative implementations such as Filinski’s layered monads [3] and Jaskelioff’s Monatron [7].

Algebraic Effects. Algebraic effects [16] have been proposed as a more structured approach to monads that can also be composed [6]. The subsequent introduction of handlers [18] to deal with exceptions has enabled practical language and library implementations, e.g., [8, 9, 12]. Schrijvers et al. [20] identified when algebraic effect handlers are modular and related this to a subclass of monad transformers, using the notion of modules [15]. Wu et al. [29] have identified a class of what they call scoped effects, which cannot be expressed as algebraic operations. To remedy the situation, they have proposed a practical generalization of algebraic effects. Piróg et al. [14] have put this ad-hoc approach for scoped effects on formal footing in terms of a free monad on a level-indexed category.

Latent Effects. There are many works on specific types of latent effects. For instance, staging is a widely studied area [19, 22, 24]. Some works have also combined algebraic effects with staging mechanisms [21, 27, 30]. Yet, we are, to the best of our knowledge, the first to consider staging using effect handlers.

The call-by-push-value calculus of Levy [10] provides primitives for expressing both call-by-name and call-by-value. These have been an inspiration for our modular thunking handler. A more generic work is that of Atkey and Johann [1] on interleaving data and effects to model the incremental production of the data, and on interpreting these with f-and-m algebras.

Various forms of delimited control have been used in the literature to realize sophisticated control mechanisms, such as the simulation of call-by-need by Garcia et al. [5]. Moreover, several works [4] show the interdefinability of conventional algebraic effects and delimited control. A further investigation into the relative expressiveness of latent effects would be interesting.

In future work we would like to demonstrate the performance of latent effects, using the techniques of fusion by Wu and Schrijvers [28].

6 Conclusion

This paper has introduced the notion of latent effects. These extend algebraic effects with the ability to modularly model advanced control-flow mechanisms that can postpone the execution of certain computations and require fine-grained control over the effects inside them. Lambda abstraction, lazy evaluation, and staging were shown to be three prominent instances.