1 Introduction

Denotational semantics \([\![{{-}}]\!]\) associates every program \( M \) with its meaning, i.e. its denotation, \([\![{ M }]\!]\). A key feature of a denotational semantics is compositionality: the denotation of a program depends only on the denotations of its constituents.

As a concrete example, consider an imperative language that manipulates a memory store \( \sigma \in \mathbb {S}\), and denotational semantics for it that associates with each program \( M \) a denotation \([\![{ M }]\!] : \mathbb {S}\rightarrow \mathbb {S}\) modelling how \( M \) transforms the store. For example – denoting by \( \sigma \left[ {{ \textsf {a} }\mapsto { v }}\right] \) the store that is the same as \( \sigma \) but with its value at \( \textsf {a} \) changed to \( v \) – we have \([\![{ \textsf {a} \mathbin {\boldsymbol{:=}} v }]\!] \sigma = \sigma \left[ {{ \textsf {a} }\mapsto { v }}\right] \). Compositionality manifests in the semantics of program sequencing: \([\![{ M \mathbin {\boldsymbol{;}} N }]\!] \sigma = [\![{ N }]\!] \left( {[\![{ M }]\!] \sigma }\right) \). Thus \([\![{ \textsf {a} \mathbin {\boldsymbol{:=}}0 \mathbin {\boldsymbol{;}} \textsf {a} \mathbin {\boldsymbol{:=}}1}]\!] \sigma = [\![{ \textsf {a} \mathbin {\boldsymbol{:=}}1}]\!] \left( {[\![{ \textsf {a} \mathbin {\boldsymbol{:=}}0}]\!] \sigma }\right) = \left( { \sigma \left[ {{ \textsf {a} }\mapsto {0}}\right] }\right) \left[ {{ \textsf {a} }\mapsto {1}}\right] = \sigma \left[ {{ \textsf {a} }\mapsto {1}}\right] \). Incidentally, we also have \([\![{ \textsf {a} \mathbin {\boldsymbol{:=}}1}]\!] \sigma = \sigma \left[ {{ \textsf {a} }\mapsto {1}}\right] \), and so \([\![{ \textsf {a} \mathbin {\boldsymbol{:=}}0 \mathbin {\boldsymbol{;}} \textsf {a} \mathbin {\boldsymbol{:=}}1}]\!] = [\![{ \textsf {a} \mathbin {\boldsymbol{:=}}1}]\!]\).
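To make the example concrete in code, here is a minimal sketch of this store-transformer semantics in Haskell; the particular types for locations, values, and stores are illustrative choices of ours, not fixed by the semantics:

    import qualified Data.Map as M

    data Loc = A | B | C deriving (Eq, Ord, Show, Enum, Bounded)
    data Val = V0 | V1   deriving (Eq, Ord, Show, Enum, Bounded)
    type Store = M.Map Loc Val

    -- The denotation of a program transforms the store: [[M]] : S -> S.
    type Den = Store -> Store

    -- [[ a := v ]] sigma = sigma[a |-> v]
    assign :: Loc -> Val -> Den
    assign a v = M.insert a v

    -- Compositional sequencing: [[ M ; N ]] sigma = [[N]] ([[M]] sigma)
    sequ :: Den -> Den -> Den
    sequ m n = n . m

In particular, sequ (assign A V0) (assign A V1) and assign A V1 are equal as functions on stores, mirroring the equation above.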

A desirable property of a denotational semantics \([\![{{-}}]\!]\) is adequacy, meaning that \([\![{ M }]\!] = [\![{ N }]\!]\) implies that \( M \) and \( N \) are contextually equivalent: replacing \( N \) with \( M \) within some larger program does not affect the possible results of executing that program. Contextual equivalence is useful for optimizations: for example, \( M \) could have better runtime performance than \( N \). Adequate denotational semantics can justify optimizations without quantifying over all program contexts, serving in this way as a basis for validating compiler optimizations.

Returning to the example above, although \([\![{ \textsf {a} \mathbin {\boldsymbol{:=}}0 \mathbin {\boldsymbol{;}} \textsf {a} \mathbin {\boldsymbol{:=}}1}]\!] = [\![{ \textsf {a} \mathbin {\boldsymbol{:=}}1}]\!]\), in the presence of concurrency \( \textsf {a} \mathbin {\boldsymbol{:=}}0 \mathbin {\boldsymbol{;}} \textsf {a} \mathbin {\boldsymbol{:=}}1\) and \( \textsf {a} \mathbin {\boldsymbol{:=}}1\) are not contextually equivalent. For example, if \( \textsf {b} \mathbin {\boldsymbol{:=}} \textsf {a} \boldsymbol{?}\) (read from \( \textsf {a} \) and write the result to \( \textsf {b} \)) is executed concurrently, it could write 0 to \( \textsf {b} \) only with the first program. Therefore, the semantics we defined is inadequate for a concurrent programming language; differentiating between these programs requires a more sophisticated denotational semantics.

Moreover, the transformation \( \textsf {a} \mathbin {\boldsymbol{:=}}0 \mathbin {\boldsymbol{;}} \textsf {a} \mathbin {\boldsymbol{:=}}1 \twoheadrightarrow \textsf {a} \mathbin {\boldsymbol{:=}}1\), eliminating the first, redundant memory access, is valid in the presence of concurrency, even though the programs are not equivalent. Indeed, a compiler applying this simplification within a program would not introduce any additional possible results (though it may eliminate some), and in particular it would preserve the correctness of the program. We would like our semantics to be able to justify such transformations.

This leads us to the concept of directional adequacy, a useful refinement of adequacy. Given a partial order \(\le \) on the set of denotations, the denotational semantics is directionally adequate (w.r.t. \(\le \)) if \([\![{ M }]\!] \le [\![{ N }]\!]\) implies that \( M \) contextually refines \( N \): replacing \( N \) with \( M \) within some larger program does not introduce new possible results of executing that program. Thus, directional adequacy can justify the transformation \( N \twoheadrightarrow M \) even if it is not an equivalence.

In this paper we define a directionally adequate denotational semantics for a higher-order language, subsuming the imperative language above, that justifies the above transformation along with other standard memory access eliminations:

[Display: the standard memory access eliminations validated by our semantics.]

Other transformations and equivalences this semantics validates are structural ones, such as \(\textbf{if }\, M \,\textbf{then}\, N \,\textbf{else}\, N \, \cong M \mathbin {\boldsymbol{;}} N \); and concurrency-related ones, such as \(\left( { M \parallel N }\right) \mathbin {\boldsymbol{;}} K \twoheadrightarrow M \mathbin {\boldsymbol{;}} N \mathbin {\boldsymbol{;}} K \).

None of these transformations is novel. Rather, the contribution of this paper is in the methodology used to justify them, fitting shared-state concurrency semantics into a general, uniform model structure. In particular, each memory access elimination is proven correct via a mundane algebraic calculation; and the structural transformations can be justified using simple structural arguments that abstract away the details of our particular semantics.

Methodology. The use of monads to study the denotational semantics of effects [30] has proven fruitful, especially with its recent refinement with algebraic operators and equational theories [9, 10, 23, 35, 36]. We follow the algebraic approach to define denotational semantics for a simple higher-order concurrent programming language, using an equational theory extending non-deterministic global-state with a single unary algebraic operator for yielding computation to a concurrently running program [1]. We find a concrete representation of the monad this theory induces based on sets of traces [4, 6], and use it to define a directionally adequate denotational semantics. From this adequacy result we deduce various program transformations via routine calculations.

The advantages of denotational semantics defined with this approach include:

  • Uniformity. Theories are stated using the same general framework. This uniformity means that many theoretical results can be stated in general terms, applying to all theories. Even if a theorem, like adequacy, must be proven separately for each theory, it is likely that a similar proof technique can be used, and experience can guide the construction of the proof.

  • Comparability. Comparing and contrasting theories is convenient due to uniformity [23]. While our language and semantics are very different from Abadi and Plotkin’s [1], the equational theory we obtain easily compares to theirs.

  • Modularity. Since the theories are stated algebraically, using operators and equations, they can be combined to form larger theories. Some combinations are the result of general theory-combinators, such as the theory of non-deterministic global-state resulting from combining the theory of global-state [34] with the theory of non-determinism [14]. In this combined theory, equations that are provable in each separate theory remain provable. Even if the combination is bespoke, using an algebraic theory breaks the problem down into smaller components [8].

  • Abstraction. The semantics we define for the fragment of our language without shared-state is identical in form to the standard semantics, by using the monad operations. Therefore, any structural transformation proven using these abstractions remains valid in the language extended with shared-state.

  • Implementability. Monads are ubiquitous as a computational device in functional programming languages, such as Haskell. Thus a theory based on a monad may in the future form a bridge to implementation.

Outline. The remaining sections are as follows. The next section provides background to the problem and overviews our results in a simplified manner. Then we dive into the weeds, starting with a succinct presentation of the equational theory and related notions (Sect. 3). We then work our way gradually to define the induced monad’s concrete representation (Sect. 4). Next we define the denotations using this representation (Sect. 5). With everything in place, we present our metatheoretical results and use them to justify program transformations and equivalences (Sect. 6). We conclude with a discussion of related work and future prospects (Sect. 7).

2 Overview

Our setting is a simple programming language with state. We fix a finite set of locations \(\mathbb {L}:=\left\{ \texttt{l}_{1}, \ldots , \texttt{l}_{\bar{\texttt{n}}}\right\} \) and a finite set of (storable) values \(\mathbb {V}:=\left\{ \texttt{v}_{1}, \ldots , \texttt{v}_{\bar{\texttt{m}}}\right\} \). A store \( \sigma \) is an element of \(\mathbb {S}:=\mathbb {L}\rightarrow \mathbb {V}\), where \( \sigma \left[ {{ \ell }\mapsto { v }}\right] \) is the store that is equal to \( \sigma \) except (perhaps) at \( \ell \), where it takes the value \( v \). We use subscripts to apply stores to locations, i.e. we write \( \sigma _{ \ell }\) instead of \( \sigma \ell \). In examples we often assume \(\mathbb {L}= \left\{ \textsf {a} , \textsf {b} , \textsf {c} \right\} \) and \(\mathbb {V}= \left\{ 0, 1\right\} \), and write stores in matrix notation.

The language is based on a standard extension of Moggi’s [30] computational lambda calculus with products and variants (labelled sums), further extended with three shared-state constructs. Many other constructs are defined using syntactic sugar, such as if-statements and booleans, let-bindings, and program sequencing. The core syntax is presented below (where \(n \ge 0\)):

[Display: the core syntax of the language.]

The typing rules for the shared-state constructs appear at the top of Fig. 1, where we define \(\textbf{Loc}:=\boldsymbol{\{}\texttt{l}_{1} \mathbin {\textbf{of}}\boldsymbol{(}\boldsymbol{)}\mathbin {\boldsymbol{|}}\cdots \mathbin {\boldsymbol{|}}\texttt{l}_{\bar{\texttt{n}}} \mathbin {\textbf{of}}\boldsymbol{(}\boldsymbol{)}\boldsymbol{\}}\) and \(\textbf{Val}:=\boldsymbol{\{}\texttt{v}_{1} \mathbin {\textbf{of}}\boldsymbol{(}\boldsymbol{)}\mathbin {\boldsymbol{|}}\cdots \mathbin {\boldsymbol{|}}\texttt{v}_{\bar{\texttt{m}}} \mathbin {\textbf{of}}\boldsymbol{(}\boldsymbol{)}\boldsymbol{\}}\).

The language is equipped with a call-by-value, small-step operational semantics \( \sigma , M \mathrel {\overset{}{\rightsquigarrow }} \rho , N \), meaning that the program \( M \) executed from store \( \sigma \) progresses to \( N \) with store \( \rho \). The operational semantics for the shared-state constructs appears at the bottom of Fig. 1. Parallel execution is interpreted via a standard interleaving semantics, ultimately returning the pair of the results of each side, and synchronizing on their completion. The reflexive-transitive closure of \(\rightsquigarrow \) is denoted by \(\mathrel {\rightsquigarrow *}\). The operational semantics can be seen in action in Example 2.

Fig. 1. Typing and operational semantics of the shared-state constructs.

2.1 Global-State (for Sequential Computation)

To make this exposition more accessible, we focus on sequential computation before advancing to denotational semantics of concurrent computation. Sequential computations with global state cause two kinds of side-effects: looking a value up in the store, and updating a value in the store. Plotkin and Power [34] propose a corresponding equational theory with two operators:

  • Lookup. Suppose \( \ell \in \mathbb {L}\), and \(\left( {{ t }_{ v }}\right) _{ v \in \mathbb {V}}\) is a \(\mathbb {V}\)-indexed sequence of terms. Then \(\text {L}_{ \ell }\left( {{ t }_{ v }}\right) _{ v \in \mathbb {V}}\) is a term representing looking the value in \( \ell \) up, calling it \( v \), and continuing the computation with \( t _{ v }\). We write \(\text {L}_{ \ell }\left( { v .\;{} t _{ v }}\right) \) instead of \(\text {L}_{ \ell }\left( {{ t }_{ v }}\right) _{ v \in \mathbb {V}}\).

  • Update. Suppose \( \ell \in \mathbb {L}\), \( v \in \mathbb {V}\), and \( t \) is a term. Then \(\text {U}_{ \ell , v } t \) is a term representing updating the value in \( \ell \) to \( v \) and continuing the computation with \( t \).
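As a sketch, the two operators can be rendered as a Haskell datatype of algebraic terms over a set \( X \) of variables (reusing the Loc and Val types from the earlier sketch), with the \(\mathbb {V}\)-indexed argument of lookup encoded as a function; the constructor names are ours:

    -- Terms freely generated by the global-state operators over variables x.
    data Term x
      = Var x                   -- a variable drawn from X
      | L Loc (Val -> Term x)   -- L_l(v. t_v): look l up, bind its value to v
      | U Loc Val (Term x)      -- U_{l,v} t: update l to v, continue with t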

The equations of the theory of global-state are generated by taking the closure of the axioms – listed at the top of Fig. 2 – under reflexivity, symmetry, transitivity, and substitution. The grayed-out axioms happen to be derivable, and are included for presentation’s sake. The theory of global-state can be used to define adequate denotational semantics for the sequential fragment of our language, obtained by removing concurrent execution \(({\parallel })\).

Fig. 2. The axiomatization of the algebraic theory.

Example 1

The theory of global-state includes the following Eq. (1), which, when considering sequential programs, represents the program equivalence (2):

$$\begin{aligned} \text {L}_{ \textsf {b} }\left( { v .\;{}\text {U}_{ \textsf {a} , v }\text {L}_{ \textsf {c} }\left( { w .\;{}\text {U}_{ \textsf {a} , w } \langle {}\rangle }\right) }\right)&= \text {L}_{ \textsf {c} }\left( { w .\;{}\text {U}_{ \textsf {a} , w } \langle {}\rangle }\right) \end{aligned}$$
(1)
$$\begin{aligned} \textsf {a} \mathbin {\boldsymbol{:=}}{ \textsf {b} \boldsymbol{?}} \mathbin {\boldsymbol{;}} \textsf {a} \mathbin {\boldsymbol{:=}}{ \textsf {c} \boldsymbol{?}}&\cong \textsf {a} \mathbin {\boldsymbol{:=}} \textsf {c} \boldsymbol{?} \end{aligned}$$
(2)

2.2 Shared-State

The equivalence (2) from Example 1 fails in the concurrent setting, since the two programs can be differentiated by program contexts with concurrency:

Example 2

Consider the program context \(\varXi \left[ {{-}}\right] = \left[ {{-}}\right] \parallel \textsf {a} \boldsymbol{?}\), i.e. executing each program in parallel to a thread dereferencing the location \( \textsf {a} \). Then there is no execution of \(\varXi \left[ { \textsf {a} \mathbin {\boldsymbol{:=}} \textsf {c} \boldsymbol{?}}\right] \) that starts with the store \( \sigma \) mapping \( \textsf {a} , \textsf {b} , \textsf {c} \) to 1, 0, 1 respectively and returns \(\langle {\langle {}\rangle ,0}\rangle \), but there is such an execution of \(\varXi \left[ { \textsf {a} \mathbin {\boldsymbol{:=}}{ \textsf {b} \boldsymbol{?}} \mathbin {\boldsymbol{;}} \textsf {a} \mathbin {\boldsymbol{:=}}{ \textsf {c} \boldsymbol{?}}}\right] \):

[Display: an interleaved execution of \(\varXi \left[ { \textsf {a} \mathbin {\boldsymbol{:=}}{ \textsf {b} \boldsymbol{?}} \mathbin {\boldsymbol{;}} \textsf {a} \mathbin {\boldsymbol{:=}}{ \textsf {c} \boldsymbol{?}}}\right] \) returning \(\langle {\langle {}\rangle ,0}\rangle \).]

Therefore, the aforementioned denotational semantics defined using the theory of global-state cannot be extended with a denotation for \((\mathbin {\mathrel \vert \mathrel \vert })\) while preserving adequacy. More sophistication is needed in the concurrent setting: the denotations, even of sequential programs, must account for environment effects.

We thus extend the global-state theory with a single unary operator:

  • Yield. Suppose \( t \) is a term. Then \(\text {Y} t \) is a term. Its intended meaning is to let the environment read and write and then continue with the computation \( t \).

We also need to account for the non-determinism inherent in parallel executions. We do so by extending the resulting theory with operators for finite non-determinism with their equational theory, and further standard equations axiomatizing commutative interaction with the other operators [14]:

  • Choice. For every \( \alpha \in \mathbb {N}\) there is a respective choice operator. Suppose \(\left( {{ t }_{ \imath }}\right) _{ \imath < \alpha }\) is a sequence of terms \( t _0, \dots , t _{ \alpha -1}\). Then \(\bigvee _{ \alpha }\left( {{ t }_{ \imath }}\right) _{ \imath < \alpha }\) is a term. Its intended meaning is to choose \( \imath < \alpha \) non-deterministically and continue with the computation \( t _{ \imath }\). We write \(\bigvee _{ \imath < \alpha }{ t _{ \imath }}\) instead of \(\bigvee _{ \alpha }\left( {{ t }_{ \imath }}\right) _{ \imath < \alpha }\); and when \( \alpha = 2\) we use infix notation, i.e. instead of \(\bigvee _{ \imath <2}{ t _{ \imath }}\) we may write \( t _0 \vee t _1\).
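Continuing the Haskell sketch, the signature of the resulting theory replaces the earlier datatype with one that also has constructors for yielding and for finitary choice; the possible-yield abbreviation of Sect. 2.3 below is then definable:

    data Term x
      = Var x
      | L Loc (Val -> Term x)
      | U Loc Val (Term x)
      | Y (Term x)              -- yield to the environment, then continue
      | Or [Term x]             -- \/_alpha (t_i)_{i < alpha}; alpha is the list length

    -- Possible yield: Y? t := t \/ Y t
    yq :: Term x -> Term x
    yq t = Or [t, Y t]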

The axioms of the resulting theory of resumptions Res [1, 14, 30] are listed in Fig. 2. The novelty is not in the equational theory, but rather in the way we use it to define denotations. We compare to related work in Sect. 7.

2.3 Denotations

In Sect. 5 we define denotations of programs, but for the sake of this discussion we simplify, defining the denotation \([\![{ M }]\!]\) of a program \( M \) to be an equivalence class \(\left| { t }\right| \) of a particular term \( t \) in Res. Our actual denotations defined in Sect. 5 will use the concrete representation of the monad (developed in Sect. 4.3), similar to how state transformers represent equivalence classes of terms of global-state.

Dereference & Assignment. We use the monadic bind (defined in Sect. 4.3), which we can think of as “sequencing”; and possible yields \(\text {Y}^? t := t \vee \text {Y} t \):

[Display: the denotations of dereference and assignment, interspersing possible yields.]

The main idea is to intersperse possible yields to block the use of global-state equations such as in (1) and to allow computations to interleave. Although Eq. (1) still holds in Res, it does not imply the program equivalence (2) because the programs in (2) do not map to the algebraic terms in (1). Rather:

Example 3

The denotations of the programs in Example 1 are:

$$\begin{aligned} [\![{ \textsf {a} \mathbin {\boldsymbol{:=}} \textsf {c} \boldsymbol{?}}]\!]&= \left| {\text {L}_{ \textsf {c} }\left( { v .\;{}\text {Y}^?\text {U}_{ \textsf {a} , v }\text {Y}^? \langle {}\rangle }\right) }\right| \end{aligned}$$
(3)
$$\begin{aligned} [\![{ \textsf {a} \mathbin {\boldsymbol{:=}}{ \textsf {b} \boldsymbol{?}} \mathbin {\boldsymbol{;}} \textsf {a} \mathbin {\boldsymbol{:=}}{ \textsf {c} \boldsymbol{?}}}]\!]&= \left| {\text {L}_{ \textsf {b} }\left( { v .\;{}\text {Y}^?\text {U}_{ \textsf {a} , v }\text {Y}^?\text {L}_{ \textsf {c} }\left( { w .\;{}\text {Y}^?\text {U}_{ \textsf {a} , w }\text {Y}^? \langle {}\rangle }\right) }\right) }\right| \end{aligned}$$
(4)

So the denotation (3) looks \( \textsf {c} \) up finding a value \( v \), then possibly yields, then updates \( \textsf {a} \) to \( v \), then possibly yields, and finally returns the empty tuple. The concrete representation (Theorem 1) immediately proves that (3) and (4) are not equal, in contrast to the situation in Example 1.

Parallel Execution. Computations interleave using the yield operator. Interleaving execution of programs suggests, for example, the following calculation:

$$\begin{aligned}{}[\![{ \ell \boldsymbol{?} \parallel \ell \mathbin {\boldsymbol{:=}}0}]\!]&= \left| {\text {L}_{ \ell }\left( { v .\;{}\text {Y}^?[\![{ v \parallel \ell \mathbin {\boldsymbol{:=}}0}]\!]}\right) \vee \text {U}_{ \ell ,0} \text {Y}^?[\![{ \ell \boldsymbol{?} \parallel \langle {}\rangle }]\!]}\right| \\&= \left| {\text {L}_{ \ell }\left( { v .\;{}\text {Y}^?\text {U}_{ \ell ,0} \text {Y}^?[\![{ v \parallel \langle {}\rangle }]\!]}\right) \vee \text {U}_{ \ell ,0} \text {Y}^?\text {L}_{ \ell }\left( { v .\;{}\text {Y}^?[\![{ v \parallel \langle {}\rangle }]\!]}\right) }\right| \\&= \left| {\text {L}_{ \ell }\left( { v .\;{}\text {Y}^?\text {U}_{ \ell ,0} \text {Y}^?\langle {{ v }, {\langle {}\rangle }}\rangle }\right) \vee \text {U}_{ \ell ,0} \text {Y}^?\text {L}_{ \ell }\left( { v .\;{}\text {Y}^?\langle {{ v }, {\langle {}\rangle }}\rangle }\right) }\right| \end{aligned}$$

The problem with the above is that it lacks compositionality: \([\![{ \ell \boldsymbol{?} \parallel \ell \mathbin {\boldsymbol{:=}}0}]\!]\) should be defined in terms of \([\![{ \ell \boldsymbol{?}}]\!]\) and \([\![{ \ell \mathbin {\boldsymbol{:=}}0}]\!]\), without referring to the underlying programs. In Sect. 4.6 we define a function \((\mathbin {\mathrel \vert \mathrel \vert \mathrel \vert })\) such that \([\![{ M \parallel N }]\!] = [\![{ M }]\!] \mathbin {\mathrel \vert \mathrel \vert \mathrel \vert }[\![{ N }]\!]\). This definition relies on the concrete representation from Sect. 4.3.

2.4 Program Transformations

Our main result is directional adequacy (Theorem 5). Under this simplified view it can be stated, in terms of the partial order \(\le \) on our denotations generated by \(\left| { t }\right| \le \left| { t \vee s }\right| \), as follows: if \([\![{ M }]\!] \le [\![{ N }]\!]\) then the transformation \( N \twoheadrightarrow M \) is valid in the concurrent setting. The following example illustrates how directional adequacy can be used to validate program transformations of interest, among the relatively few that are valid in the strong memory model we consider here.

Example 4

We validate (write ; read) (see also Example 9):

[Display: the calculation validating the (write; read) elimination.]

We can similarly validate the other memory access eliminations from Sect. 1. By using \(\text {Y}^?\) rather than \(\text {Y}\) in the denotations, the cases where the environment is known not to interleave are taken into account explicitly. The relevant global-state equation can then be exploited to eliminate the redundant memory access.

2.5 Caveats and Limitations

Our goal in this work is to place concurrency semantics on an equal footing with other semantic models of computational effects. As a consequence, the proposed model has its fair share of fine print, which we state up front:

  • Memory Model. We study a very strong memory model: sequential consistency. Modern architectures adhere to much weaker memory-models, where further program transformations are valid.

  • Concurrency Model. Our semantics involves a simple form of concurrency in which threads interleave their computation without restriction, acting on a shared memory store. This is in contrast to a well-established line of work in which models include a causal partial-order in which incomparable events denote “truly” parallel execution [31]. These causal models are showing promise in modelling sophisticated (i.e. weak) shared-state models [7, 16, 17, 25]. We hope further work will fit these causal models into a relatable semantic footing that easily accommodates higher-order structure.

  • Features. Our analysis lacks many valuable features that appear in related work, such as recursion [1], higher-order state [4], probabilities [41], and infinitely many locations/values. This simplification is intentional: we opted for reductionism, seeking a minimal model that still accounts for core features of shared state. The benefit of the algebraic approach is that this model can be modularly combined with other features, hopefully using standard technology such as sum-and-tensor [14], domain-enrichment [27], and functor categories [21, 32, 33, 38]. For example, to support recursion our model may be integrated with one of the known powerdomain theory-combinators [42]. This requires making a semantic-design choice that is orthogonal to shared-state concurrency, each option carrying different trade-offs. We avoid making such choices.

  • Semantic Precision. The equational theory and the denotational semantics based on it leave much room for improvement in terms of precision and abstraction. For example, our denotational model does not support the introduction of irrelevant reads, i.e. it does not justify the valid transformation \( M \twoheadrightarrow \ell \boldsymbol{?} \mathbin {\boldsymbol{;}} M \). Indeed, taking \( M = \langle {}\rangle \), we have \([\![{ \ell \boldsymbol{?} \mathbin {\boldsymbol{;}} \langle {}\rangle }]\!] \not \subseteq [\![{\langle {}\rangle }]\!]\). The problem stems from a “counting” issue: even though the value being looked up in \( \ell \) is discarded, the additional possible-yield remains. We hope further work could address this semantic inaccuracy.

  • Full Abstraction. Brookes’s seminal work [6] defined denotational semantics for concurrency that is fully abstract, meaning that the converse of adequacy holds: programs that are replaceable in every context have equal denotations. Our semantics is far from being fully abstract: there is a valid first-order equivalence, \( M \cong \ell \boldsymbol{?} \mathbin {\boldsymbol{;}} M \), that our semantics does not support. Moreover, we do not include atomic block executions in our language as Brookes did, which was crucial for the proof of full abstraction. However, even if our model were precise enough to capture the first-order equivalences, and even if we were to include atomic block executions, we still would not expect to obtain full abstraction, since this result is infamously elusive for higher-order languages (see Abramsky’s recent overview of the full-abstraction problem for PCF [2]).

3 Equational Theory

At the foundation of our approach is the equational theory of resumptions Res [1, 14, 30] presented in Sect. 2, consisting of operators and equational axioms over them. We succinctly fill in the formalities below, followed by related definitions.

The signature of Res consists of the following parameterized operators. The notation \(O: {A}\langle {P}\rangle \) means that the arity of the operator \(O\) is the set \(A\) and it is parameterized over the set \(P\):

$$\begin{aligned} \text {L}&: {\mathbb {V}}\langle {\mathbb {L}}\rangle&&\text {lookup}&\text {Y}&: {\mathbb {1}}\langle {\mathbb {1}}\rangle&&\text {yield} \\ \text {U}&: {\mathbb {1}}\langle {\mathbb {L}\times \mathbb {V}}\rangle&&\text {update}&\bigvee _{ \alpha }&: { \alpha }\langle {\mathbb {1}}\rangle&&\text {non-deterministic choice, for every } \alpha \in \mathbb {N}\end{aligned}$$

From now on, whenever we refer to an operator, we mean an operator of Res. We denote the set of terms freely generated by the signature over \( X \) by \(\textrm{Term}_{} X \).

Figure 2 lists the axioms of Res, classified as follows: an axiomatization of the equational theory of global-state [34]; the standard axiomatization of non-determinism; and an axiomatization of the commutative interaction of non-determinism with the other operators [13] via the tensor [14].

A Res -algebra \( \mathcal {A} \) consists of a carrier set \(\underline{ \mathcal {A} }\) together with interpretations \(\tilde{O}^{ \mathcal {A} } : {\underline{ \mathcal {A} }}^A\times P\rightarrow \underline{ \mathcal {A} }\) for each operator \(O: {A}\langle {P}\rangle \). We elide the superscript if it is clear from context. For a set \( X \), a Res -algebra on \( X \) consists of a Res-algebra \( \mathcal {A} \) and a function \({\text {env}}\ : X \rightarrow \underline{ \mathcal {A} }\); which extends to \({\text {eval}}\ : \textrm{Term}_{} X \rightarrow \underline{ \mathcal {A} }\) homomorphically along the inclusion \( X \hookrightarrow \textrm{Term}_{} X \). A Res -model on \( X \) is a Res-algebra on \( X \) that satisfies each axiom of Res, i.e. the same element of \(\underline{ \mathcal {A} }\) is obtained by applying \({\text {eval}}\ \) to either side of the axiom.
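In the running Haskell sketch, a Res-algebra is a type equipped with an interpretation per operator, and eval arises by structural recursion; this is a paraphrase of the definitions above, with names of our own choosing:

    class ResAlg a where
      lkp :: Loc -> (Val -> a) -> a   -- interprets L
      upd :: Loc -> Val -> a -> a     -- interprets U
      yld :: a -> a                   -- interprets Y
      cho :: [a] -> a                 -- interprets \/_alpha

    -- The homomorphic extension of env : X -> a along X -> Term X.
    eval :: ResAlg a => (x -> a) -> Term x -> a
    eval env (Var x)   = env x
    eval env (L l k)   = lkp l (\v -> eval env (k v))
    eval env (U l v t) = upd l v (eval env t)
    eval env (Y t)     = yld (eval env t)
    eval env (Or ts)   = cho (map (eval env) ts)

A Res-model is then a ResAlg instance whose interpretations satisfy the axioms of Fig. 2; the class cannot enforce this, so it remains a proof obligation per instance.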

In the following, we abbreviate using \( \vec {\text {L}}\left( { \sigma .\;{} t _{ \sigma }}\right) :=\text {L}_{\texttt{l}_{1}}\left( { v _1.\;{}\dots \text {L}_{\texttt{l}_{\bar{\texttt{n}}}}\left( { v _{\bar{\texttt{n}}}.\;{} t _{\lambda {\texttt{l}_{ i }}.\; v _{ i }}}\right) }\right) \) and \( \vec {\text {U}}_{ \sigma } t :=\text {U}_{\texttt{l}_{1}, \sigma _{\texttt{l}_{1}}} \dots \text {U}_{\texttt{l}_{\bar{\texttt{n}}}, \sigma _{\texttt{l}_{\bar{\texttt{n}}}}} t \), in addition to \(\text {Y}^? t := t \vee \text {Y} t \) that we saw in Sect. 2.3. We use similar shorthands for the interpretations of operators as well.

4 A Monad for Shared-State

The next ingredient of our denotational semantics is a monad whose elements represent equivalence classes of Res. The monad can be obtained via a universal construction [28] (by quotienting the terms by the equational theory), but a concrete representation is crucial for reasoning about it; for example, to show that two denotations are different.

4.1 Difficulty of Term Normalization

To motivate the definitions building up to this concrete representation, we first find a representative for each equivalence class in \(\textrm{Term}_{} X /\textsc {Res}\), by taking an arbitrary \( t \in \textrm{Term}_{} X \) and transforming it via equations in Res to a particular form – a normal form – such that there is only one term of this form equal to \( t \).

Consider an algebraic term \( t \in \textrm{Term}_{} X \). Using LU-noop once for each location, a sequence of LU-comm, and ND-return, we find that \( t = \vec {\text {L}}\left( { \sigma .\;{}\bigvee _{ \imath <1} \vec {\text {U}}_{ \sigma } t }\right) \). Note:

$$\begin{aligned} \vec {\text {U}}_{ \sigma } \text {L}_{ \ell }\left( { v .\;{} s _{ v }}\right)&\mathbin {\overset{\textsc {Res}}{=}}\vec {\text {U}}_{ \sigma } s _{ \sigma _{ \ell }}&\vec {\text {U}}_{ \sigma } \text {U}_{ \ell , v } s&\mathbin {\overset{\textsc {Res}}{=}}\vec {\text {U}}_{ \sigma \left[ {{ \ell }\mapsto { v }}\right] } s&\vec {\text {U}}_{ \sigma } \bigvee _{ \imath < \alpha } s _{ \imath }&\mathbin {\overset{\textsc {Res}}{=}}\bigvee _{ \imath < \alpha } \vec {\text {U}}_{ \sigma } s _{ \imath } \end{aligned}$$

By applying these equalities left-to-right as long as possible, and applying ND-join and ND-epi to rearrange the sums, we find that \( t \) is equal to a term of the form \(\vec {\text {L}}\left( { \sigma .\;{}\bigvee _{ \imath < \alpha _{ \sigma }} \vec {\text {U}}_{ \rho _{ \imath , \sigma }} s _{ \imath , \sigma }}\right) \), where \( s _{ \imath , \sigma }\) is either in \( X \) or is of the form \(\text {Y} s _{ \imath , \sigma }'\).

For every \( \sigma \), we can rearrange the sum according to common prefixes, thus we find that \( t \) is equal to a term of the form: \(\vec {\text {L}}\left( { \sigma .\;{}\bigvee _{ \rho \in \mathbb {S}} \vec {\text {U}}_{ \rho } \bigvee _{ \jmath < \alpha _{ \rho , \sigma }} s _{ \jmath , \rho , \sigma }}\right) \) where \( s _{ \jmath , \rho , \sigma }\) is either in \( X \) or is of the form \(\text {Y} s _{ \jmath , \rho , \sigma }'\) (we can take \( \alpha _{ \rho , \sigma } = 0\) when the prefix \(\vec {\text {U}}_{ \rho }\) did not appear in \( t \)). For every \( \rho \), we can rearrange to obtain the form:

$$\begin{aligned} \vec {\text {L}}\left( { \sigma .\;{}\bigvee _{ \rho \in \mathbb {S}} \vec {\text {U}}_{ \rho } \left( {\text {Y} r _{ \rho , \sigma } \vee \bigvee _{ \jmath < \beta _{ \rho , \sigma }} x _{ \jmath , \rho , \sigma }}\right) }\right) \end{aligned}$$
(5)

This is not yet a normal form; obtaining one would require recursively applying this procedure to \( r _{ \rho , \sigma }\) and propagating empty choice operators outward. Were we to continue in this way to find a normal form, we would still need to prove uniqueness and completeness. One standard way to achieve this is to show that the procedure equates the two sides of every axiom and respects the deduction rules of equational logic. This requires a careful proof-theoretic analysis of the normalization procedure. Instead, we take a model-theoretic approach, akin to normalization-by-evaluation, constructing for every set a concrete representation of the free Res-model over it. This representation is based on finite sets of traces.

4.2 Traces

Brookes [6] defined a trace to be a non-empty sequence of transitions, where a transition is a pair of stores. Brookes used traces to define a denotational semantics for an imperative concurrent programming language. In Brookes’s semantics, traces denote interrupted executions, where each transition corresponds to an uninterrupted sequence of computation steps that starts with the first store and ends with the second store. The breaks between transitions are where the computation yields to the environment.

The concept was adapted by many, including Benton et al. [4], to define denotational semantics for a functional language, adding a value at the end of the sequence for the result the computation returns. A trace in this paper will refer to this concept: a non-empty sequence of transitions followed by a return value. If we wish to specify \( X \) as the set of the return values, we will call it an \( X \)-trace. For example, if \( x \in X \) and \( \sigma , \rho \in \mathbb {S}\), then \(\langle {{ \sigma }, { \rho }}\rangle x \) is an \( X \)-trace.

For sets \( Y , Z \), we denote by \( Y ^*\) the set of sequences over \( Y \), by \( Y ^+\) the set of non-empty sequences over \( Y \), and \( Y \cdot Z :=\left\{ y z \;\vert \; y \in Y , z \in Z \right\} \). That is, \((\cdot )\) is just notation for \((\times )\) which suggests that the elements of the set are written in sequence, eliding the tuple notation. Thus \(\left( {{\mathbb {S}\times \mathbb {S}}}\right) ^+\! \cdot X \) is the set of \( X \)-traces.
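In the Haskell sketch, with the non-emptiness invariant kept informal:

    type Transition = (Store, Store)

    -- An X-trace: a list of transitions (non-empty, by convention) and a return value.
    type Trace x = ([Transition], x)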

Following our discussion in Sect. 4.1, our representation will use finite sets of traces instead of the algebraic syntax. In particular, the form we have found in (5) suggests the recursive definition:

$$\begin{aligned} {\text {rep}}\ t :=\left\{ \langle {{ \sigma }, { \rho }}\rangle \tau \;\vert \; \sigma , \rho \in \mathbb {S}, \tau \in {\text {rep}}\ r _{ \rho , \sigma }\right\} \cup \left\{ \langle {{ \sigma }, { \rho }}\rangle x _{ \jmath , \rho , \sigma } \;\vert \; \sigma , \rho \in \mathbb {S}, \jmath < \beta _{ \rho , \sigma }\right\} \end{aligned}$$
(6)

The model-theoretic approach we use below obviates the need for the syntactic manipulation that leads to the form in (5) as part of finding the representation. In the model definition, \({\text {eval}}\ \) will play the role of \({\text {rep}}\ \).

4.3 Model Definition

We represent elements of \(\textrm{Term}_{} X / \textsc {Res}\) by \(\underline{\mathcal {T}}{ X } :=\mathop {\mathcal {P}_\textrm{fin}}\left( {\left( {{\mathbb {S}\times \mathbb {S}}}\right) ^+\! \cdot X }\right) \), i.e. finite sets of \( X \)-traces. We equip \(\underline{\mathcal {T}}{ X }\) with a Res-algebra structure \(\mathcal {F} X \):

[Display: the interpretations of the operators on \(\underline{\mathcal {T}}{ X }\) forming the Res-algebra \(\mathcal {F} X \).]

We further equip it with \({\text {env}}\ x :=\left\{ \langle {{ \sigma }, { \sigma }}\rangle x \;\vert \; \sigma \in \mathbb {S}\right\} \) to make it a Res-algebra over \( X \). We denote \({\text {env}}\ x \) by \({\text {return}}\ x \), or \(\tilde{ x }\) for shorthand. This Res-algebra is in fact a Res-model over \( X \) by virtue of satisfying the axioms of Res:
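Continuing the sketch, here is one way to render this model in Haskell; it agrees with the state-transformer reading given in Sect. 4.4 below. Enumerating stores is possible because \(\mathbb {L}\) and \(\mathbb {V}\) are finite; allStores and the function names are ours:

    import qualified Data.Set as S

    type T x = S.Set (Trace x)   -- finite sets of X-traces

    allStores :: [Store]
    allStores = [ M.fromList (zip ls vs)
                | let ls = [minBound .. maxBound] :: [Loc]
                , vs <- mapM (const [minBound .. maxBound]) ls ]

    -- env / return: from any store, make an idle transition and return x.
    returnT :: Ord x => x -> T x
    returnT x = S.fromList [ ([(s, s)], x) | s <- allStores ]

    -- L~: continue according to the value of l in the trace's initial store.
    lookupT :: Ord x => Loc -> (Val -> T x) -> T x
    lookupT l k = S.fromList
      [ t | s <- allStores
          , t@((s1, _) : _, _) <- S.toList (k (s M.! l))
          , s1 == s ]

    -- U~: the continuation runs from s[l |-> v], re-rooted at s.
    updateT :: Ord x => Loc -> Val -> T x -> T x
    updateT l v p = S.fromList
      [ ((s, r) : ts, x) | s <- allStores
                         , ((s', r) : ts, x) <- S.toList p
                         , s' == M.insert l v s ]

    -- Y~: an idle transition, then a break where the environment may act.
    yieldT :: Ord x => T x -> T x
    yieldT p = S.fromList
      [ ((s, s) : ts, x) | s <- allStores, (ts, x) <- S.toList p ]

    -- \/~: union of the alternatives.
    choiceT :: Ord x => [T x] -> T x
    choiceT = S.unions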

Example 5

We verify that \(\langle {{\mathcal {F} X }, {{\text {return}}\ }}\rangle \) satisfies the axiom LU-noop:

[Display: the calculation verifying LU-noop.]

4.4 Correspondence to Non-deterministic Global-State

The theory of non-deterministic global-state (the fragment of Res excluding \(\text {Y}\)) admits a concrete representation using non-deterministic state transformers \(\mathbb {S}\rightarrow \mathop {\mathcal {P}_\textrm{fin}}\left( {\mathbb {S}\cdot X }\right) \) [14]. This representation corresponds to the one we have defined in an interesting way. Namely, there is a bijection between \(\underline{\mathcal {T}} X \) and the set of functions mapping stores to finite sets of \( X \)-traces-with-the-first-store-removed:

$$\begin{aligned} \lambda { P \in \underline{\mathcal {T}} X }.\;&\lambda { \sigma \in \mathbb {S}}.\; \left\{ \rho \tau \in \mathbb {S}\cdot \left( {{\mathbb {S}\times \mathbb {S}}}\right) ^*\! \cdot X \;\vert \;\langle {{ \sigma }, { \rho }}\rangle \tau \in P \right\} \\ \lambda { \psi \in \mathbb {S}\rightarrow \mathop {\mathcal {P}_\textrm{fin}}\left( {\mathbb {S}\cdot \left( {{\mathbb {S}\times \mathbb {S}}}\right) ^*\! \cdot X }\right) }.\;&\bigcup _{ \sigma \in \mathbb {S}} \left\{ \langle {{ \sigma }, { \rho }}\rangle \tau \in \left( {{\mathbb {S}\times \mathbb {S}}}\right) ^+\! \cdot X \;\vert \; \rho \tau \in \psi \sigma \right\} \end{aligned}$$

Implicitly identifying the two, the model from Sect. 4.3 can be defined using formulas that look exactly like the non-deterministic global-state ones:

[Display: the interpretations restated in non-deterministic state-transformer form.]

However, these are not the same formulas – they are defined for different elements (sets of traces as opposed to non-deterministic state transformers).

Using this identification for the yield operator, we obtain the definition \(\tilde{\text {Y}}^{} P :=\lambda { \sigma }.\; \left\{ \sigma \tau \;\vert \; \tau \in P \right\} \), which we understand as “the thread does not modify the state, then allows the environment to intervene, and then continues as before.”

4.5 Representation Theorem

The model \(\langle {{\mathcal {F} X }, {{\text {return}}\ }}\rangle \) defined in Sect. 4.3 represents \(\textrm{Term}_{} X / \textsc {Res}\) because – according to the representation theorem – this is a free Res-model on \( X \), and therefore equivalent to the model of equivalence classes we used in Sect. 2 or the model of syntactic normal forms to which we have alluded in Sect. 4.1.

To prove that the model is free we first equip the family of sets \(\underline{\mathcal {T}}\) with a monad structure. For every Res-model \( \mathcal {A} \) and function \( f : X \rightarrow \underline{ \mathcal {A} }\), define \( f ^{*} : \underline{\mathcal {T}} X \rightarrow \underline{ \mathcal {A} }\), the homomorphic extension of \( f \) along \({\text {return}}\ \), recursively; where \(R_{ P }^{\langle {{ \sigma }, { \rho }}\rangle } :=\left\{ \tau \in \left( {{\mathbb {S}\times \mathbb {S}}}\right) ^+\! \cdot X \;\vert \;\langle {{ \sigma }, { \rho }}\rangle \tau \in P \right\} \) and \(X_{ P , f }^{\langle {{ \sigma }, { \rho }}\rangle } :=\tilde{\bigvee }^{ \mathcal {A} }_{\langle {{ \sigma }, { \rho }}\rangle x \in P } f x \):

[Display: the recursive definition of the homomorphic extension \( f ^{*}\).]

The recursion is well-founded since \(R_{ P }^{\langle {{ \sigma }, { \rho }}\rangle }\) is smaller than \( P \) when measured by the length of the longest trace in the set.
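On the trace model itself, the extension specializes to the monadic bind. In the running sketch it admits a direct description: a trace of the continuation fuses with the trace it follows exactly when it resumes from the store in which that trace ended. This is our reading of the recursion, not a formula from the paper:

    -- bind: the last transition of a trace of p fuses with the first
    -- transition of a matching trace of the continuation.
    bindT :: Ord y => T x -> (x -> T y) -> T y
    bindT p f = S.fromList
      [ (pre ++ (sm, r') : ts', y)
      | (ts, x) <- S.toList p
      , let pre      = init ts
            (sm, rm) = last ts
      , ((s1, r') : ts', y) <- S.toList (f x)
      , s1 == rm ]

With returnT as unit the monad laws hold, since returnT contributes exactly the idle transitions \(\langle {{ \rho }, { \rho }}\rangle \), which fusion absorbs.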

Thus we have our monad structure \(\langle {\underline{\mathcal {T}}}, {{\text {return}}\ }, {({-})^{*}}\rangle \). We show it is induced by the aforementioned family of free Res-models:

Theorem 1 (Representation for shared-state)

The pair \(\langle {{\mathcal {F} X }, {{\text {return}}\ }}\rangle \) is a free Res-model on \( X \): for every Res-model \( \mathcal {A} \) and \( f : X \rightarrow \underline{ \mathcal {A} }\), the function \( f ^{*}\) is the unique homomorphism \( g \) satisfying \( f = g \circ {\text {return}}\ \).

As a direct consequence:

Corollary 1 (Model is sound and complete)

Terms over \( X \) are equal in Res iff they have the same representation in \(\langle {{\mathcal {F} X }, {{\text {return}}\ }}\rangle \).

4.6 Synchronization

To define the denotational semantics of \((\parallel )\) in Sect. 5, we will define a corresponding function \((\mathbin {\mathrel \vert \mathrel \vert \mathrel \vert })\) on elements of the monad. To this end we first define trace synchronization, an inductively defined relation presented below that relates \( \tau _{ i } \in \left( {{\mathbb {S}\times \mathbb {S}}}\right) ^+\! \cdot X _{ i }\) (for \( i = 1, 2\)) and \( \tau \in \left( {{\mathbb {S}\times \mathbb {S}}}\right) ^+\! \cdot \left( { X _1 \times X _2}\right) \), representing the fact that \( \tau _1\) and \( \tau _2\) can synchronize to form \( \tau \):

[Display: the trace synchronization rules: Var-Left, Seq-Left, Brk-Left, and their right-sided counterparts.]

One way to understand these rules is to concentrate on the first transition of the left trace \( \tau _1 = \langle {{ \sigma }, { \rho }}\rangle \tau _1'\); the right-sided rules are treated symmetrically. If the first transition is also the last, i.e. \( \tau _1' \in X \), then \( \rho \) must be the initial store when the execution continues (recall that only a break between transitions reflects a yield to the environment). This is why Var-Left combines the transitions as it does. The value in \( \tau \) is the pair of the values in \( \tau _1\) and \( \tau _2\), reflecting the operational semantics of \((\parallel )\) returning the pair of the results. If, on the other hand, the first transition is not the last, then we may combine the transition with the continuation of the computation (Seq-Left), or we may not (Brk-Left). The first option means the yield was used up in this synchronization, while in the second option the yield remains available to ambient synchronizations.

From this relation we derive the semantic synchronization function:

[Display: the definition of the semantic synchronization function \((\mathbin {\mathrel \vert \mathrel \vert \mathrel \vert })\).]
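In the running sketch, we can transcribe our reading of the rules as a non-deterministic function on traces, together with the lifting to sets; the display above is authoritative, and this reconstruction is checked only against Example 6 below:

    -- All ways tau1 and tau2 synchronize, pairing values left-then-right.
    -- (Traces are non-empty by invariant, so the pattern match is total.)
    sync :: Trace x -> Trace y -> [Trace (x, y)]
    sync t1@((s1, r1) : ts1, x) t2@((s2, r2) : ts2, y) = concat
      [ -- Var-Left: t1 is a single transition; t2 must resume at r1.
        [ ((s1, r2) : ts2, (x, y)) | null ts1, s2 == r1 ]
        -- Seq-Left: the yield after (s1, r1) is used up; transitions fuse.
      , [ ((s1, r') : ts', v)
        | not (null ts1), ((rh, r') : ts', v) <- sync (ts1, x) t2, rh == r1 ]
        -- Brk-Left: the yield survives as a break for the environment.
      , [ ((s1, r1) : ts', v) | not (null ts1), (ts', v) <- sync (ts1, x) t2 ]
        -- Var-Right, Seq-Right, Brk-Right: mirror images of the above.
      , [ ((s2, r1) : ts1, (x, y)) | null ts2, s1 == r2 ]
      , [ ((s2, r') : ts', v)
        | not (null ts2), ((rh, r') : ts', v) <- sync t1 (ts2, y), rh == r2 ]
      , [ ((s2, r2) : ts', v) | not (null ts2), (ts', v) <- sync t1 (ts2, y) ]
      ]

    -- The semantic synchronization function lifts sync to sets of traces.
    syncT :: (Ord x, Ord y) => T x -> T y -> T (x, y)
    syncT p q = S.fromList
      [ t | t1 <- S.toList p, t2 <- S.toList q, t <- sync t1 t2 ]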

Example 6

For \( \sigma , \rho \in \mathbb {S}\), we may synchronize \({\langle {{ \sigma }, { \rho }}\rangle \langle {{ \rho }, { \sigma }}\rangle \langle {}\rangle }\) and \({\langle {{ \rho }, { \rho }}\rangle 0}\) as follows:

[Display: the synchronization derivation.]

Therefore, if \({\langle {{ \sigma }, { \rho }}\rangle \langle {{ \rho }, { \sigma }}\rangle \langle {}\rangle } \in P \) and \({\langle {{ \rho }, { \rho }}\rangle 0} \in Q \), then \(\langle {{ \sigma }, { \sigma }}\rangle \langle {{\langle {}\rangle }, {0}}\rangle \in P \mathbin {\mathrel \vert \mathrel \vert \mathrel \vert } Q \).

The use of Seq-Left was possible since the stores happen to match, resulting in a trace that does not allow the environment to interfere. By using Brk-Left we could find a different synchronization, one that does yield to the environment.

5 Denotational Semantics

With the monad in place, denotations of types and contexts are standard [30]:

$$\begin{aligned}{}[\![{\boldsymbol{(} A _1 \mathbin {\boldsymbol{*}}\cdots \mathbin {\boldsymbol{*}} A _{ n }\boldsymbol{)}}]\!]&:=[\![{ A _1}]\!] \times \cdots \times [\![{ A _{ n }}]\!]&\quad [\![{ A \mathrel {\boldsymbol{\mathtt{\mathord -\mathord >}}} B }]\!]&:=[\![{ A }]\!] \rightarrow \mathcal {T}[\![{ B }]\!]&\\ [\![{\boldsymbol{\{}\iota _{1}\, \mathbin {\textbf{of}} A _1 \mathbin {\boldsymbol{|}}\cdots \mathbin {\boldsymbol{|}}\iota _{ n }\,\mathbin {\textbf{of}} A _{ n }\boldsymbol{\}}}]\!]&:=\textstyle \bigcup _{ i = 1}^{ n } \left\{ \iota _{ i }\,\right\} \times [\![{ A _{ i }}]\!]&\quad [\![{ \varGamma }]\!]&:=\textstyle \prod _{( \mathtt x : A ) \in \varGamma } [\![{ A }]\!]&\end{aligned}$$

Define the extension of \( \gamma \in [\![{ \varGamma }]\!]\) to \( \gamma \left[ {{ \mathtt x }\mapsto { y }}\right] \in [\![{ \varGamma , \mathtt x : A }]\!]\) by \( \gamma \left[ {{ \mathtt x }\mapsto { y }}\right] \mathtt x := y \).

On the above we base two kinds of denotations for programs \( \varGamma \vdash M : A \):

  • Computational. \([\![{ M }]\!]^{\texttt{c}} : [\![{ \varGamma }]\!] \rightarrow \mathcal {T}[\![{ A }]\!]\). When \( \varGamma \) is empty we may write \([\![{ M }]\!]^{\texttt{c}}\) instead of \([\![{ M }]\!]^{\texttt{c}} \langle {}\rangle \). We write \([\![{ M }]\!]^{\texttt{c}} \subseteq [\![{ N }]\!]^{\texttt{c}}\) for \(\forall \gamma \in [\![{ \varGamma }]\!] .\;[\![{ M }]\!]^{\texttt{c}} \gamma \subseteq [\![{ N }]\!]^{\texttt{c}} \gamma \).

  • Valuational. \([\![{ V }]\!]^{\texttt{v}} : [\![{ \varGamma }]\!] \rightarrow [\![{ A }]\!]\) defined solely for values, and satisfying \([\![{ V }]\!]^{\texttt{c}} \gamma = {\text {return}}\ \left( {[\![{ V }]\!]^{\texttt{v}} \gamma }\right) \). When \( \varGamma \) is empty we may write \([\![{ V }]\!]^{\texttt{v}}\) instead of \([\![{ V }]\!]^{\texttt{v}} \langle {}\rangle \); and if furthermore \( A \) is a ground type, we may write \( V \) instead of \([\![{ V }]\!]^{\texttt{v}}\), noting that the restriction of \([\![{-}]\!]^{\texttt{v}}\) to closed programs of ground type is a bijection.

Most denotations of programs are standard as well, such as:

[Display: standard denotations of selected constructs.]

The denotations of the state effects allow the environment to intervene:

[Display: the denotations of the state effects, interspersing possible yields.]

Example 7

With the definitions above, we can state the denotations from Example 3 precisely. For instance, (4) becomes:

[Display: the precise form of denotation (4), using the interpreted operators.]

Example 8

We can explain the execution of \( \textsf {a} \mathbin {\boldsymbol{:=}}{ \textsf {b} \boldsymbol{?}} \mathbin {\boldsymbol{;}} \textsf {a} \mathbin {\boldsymbol{:=}}{ \textsf {c} \boldsymbol{?}} \parallel \textsf {a} \boldsymbol{?}\) from Example 2 in denotational terms. First we find traces to synchronize:

[Display: traces drawn from the denotations of the two threads.]

Following from the derivation in Example 6, with \( \sigma \) mapping \( \textsf {a} , \textsf {b} , \textsf {c} \) to 1, 0, 1 respectively and \( \rho = \sigma \left[ {{ \textsf {a} }\mapsto {0}}\right] \):

[Display: the resulting synchronized trace.]

This trace corresponds to the (uninterrupted) execution presented in Example 2.

6 Metatheoretical Results

First we find that the single-transition traces in the denotation of a program account for the possible executions of that program:

Theorem 2 (Soundness)

If \( \sigma , M \mathrel {\overset{}{\rightsquigarrow *}} \rho , V \) then \(\langle {{ \sigma }, { \rho }}\rangle [\![{ V }]\!]^{\texttt{v}} \in [\![{ M }]\!]^{\texttt{c}}\).

For the proof, omitted for brevity, we instrument the operational semantics with actions, elements of \(\left\{ \text {U}_{ \ell , v }, \text {L}_{ \ell }, \varepsilon \right\} \), signifying the effect caused by each step, to analyse the change to the denotation of the program as it runs.

Working our way up to the fundamental lemma, we define a unary logical relation: functions \(\mathcal {V}[\![{-}]\!]\) and \(\mathcal {E}[\![{-}]\!]\) from types to sets of closed programs, by mutual recursion. Specifically, \(\mathcal {V}[\![{ A }]\!]\) is a set of closed values of type \( A \), and \(\mathcal {E}[\![{ A }]\!]\) is a set of closed programs of type \( A \). The definition of \(\mathcal {V}[\![{-}]\!]\) is standard:

[Display: the definition of \(\mathcal {V}[\![{-}]\!]\).]

The definition of \(\mathcal {E}[\![{-}]\!]\) is also standard in that it ensures programs in \(\mathcal {E}[\![{ A }]\!]\) compute to values in \(\mathcal {V}[\![{ A }]\!]\), but bespoke in its requirement about how they compute. This requirement is based on the way traces specify interrupted executions, a notion we discussed in Sect. 4.2 and now make precise. For a non-empty sequence \( \alpha = \langle {{ \sigma _1}, { \rho _1}}\rangle \dots \langle {{ \sigma _{ m }}, { \rho _{ m }}}\rangle \) we write \( M \xrightarrow { \alpha } N \) when there exist \( M = M _1, M _2, \dots , M _{ m }, M _{ m +1} = N \) such that \( \sigma _{ i }, M _{ i } \mathrel {\overset{}{\rightsquigarrow *}} \rho _{ i }, M _{ i +1}\) for all \( i \in \left\{ 1, \dots , m \right\} \). We write \( M \xrightarrow { \alpha x } V \) when \( M \xrightarrow { \alpha } V \) and \([\![{ V }]\!]^{\texttt{v}} = x \). We now define:

[Display: the definition of \(\mathcal {E}[\![{-}]\!]\).]

The last component needed is the function \(\mathcal {G}[\![{-}]\!]\) from typing contexts to sets of program substitutions. The semantic typing judgment \( \varGamma \vDash M : A \) is then defined as: every substitution in \(\mathcal {G}[\![{ \varGamma }]\!]\) takes \( M \) into \(\mathcal {E}[\![{ A }]\!]\).

Theorem 3 (Fundamental Lemma)

If \( \varGamma \vdash M : A \) then \( \varGamma \vDash M : A \).

This brings us one step closer to proving the theorem of directional adequacy. One piece is still missing: since the theorem assumes set inclusion of denotations rather than equality, we will need a different form of compositionality of the denotations than the one that holds by definition.

To state this form of compositionality we first define the standard notion of a program with holes. A function \(\varXi \left[ {{-}}\right] : \varGamma \vdash A \rightarrow \varDelta \vdash B \) is a program context (or context for short) if, in the language extended with a program \(\bullet \) and additional axioms \( \varGamma ' \vdash \bullet : A \) for all \( \varGamma ' \ge \varGamma \), we have \( \varDelta \vdash \varXi \left[ {\bullet }\right] : B \); and if \( \varGamma \vdash M : A \), then \(\varXi \left[ { M }\right] \) is obtained from \(\varXi \left[ {\bullet }\right] \) by replacing every occurrence of \(\bullet \) with \( M \).

Theorem 4 (Compositionality)

Let \(\varXi \left[ {{-}}\right] : \varGamma \vdash A \rightarrow \cdot \vdash G\) be a context for ground \(G\), and \( M , N \in \varGamma \vdash A \). If \([\![{ M }]\!]^{\texttt{c}} \subseteq [\![{ N }]\!]^{\texttt{c}}\) then \([\![{\varXi \left[ { M }\right] }]\!]^{\texttt{c}} \subseteq [\![{\varXi \left[ { N }\right] }]\!]^{\texttt{c}}\).

The condition that the context be closed and ground is necessary, so an attempt to prove the theorem directly by induction on the structure of the context fails. The proof, omitted for brevity, instead uses a binary logical relation approximating containment that coincides with it on ground types; the main ingredient being:

[Display: the main ingredient of the binary logical relation.]

With this compositionality in hand we are finally ready to prove the main result of this paper, which we will then use to justify program transformations. To state it we first spell out the standard definition of contextual refinement.

Suppose that \( M , N \in \varGamma \vdash A \). We say that \( M \) refines \( N \), and write \( M \sqsubseteq N \), if \( \sigma ,\varXi \left[ { M }\right] \mathrel {\overset{}{\rightsquigarrow *}} \rho , V \) implies \( \sigma ,\varXi \left[ { N }\right] \mathrel {\overset{}{\rightsquigarrow *}} \rho , V \) whenever \(\varXi \left[ {{-}}\right] : \varGamma \vdash A \rightarrow \cdot \vdash G\) is a context for ground \(G\). This justifies the transformation \( N \twoheadrightarrow M \), since replacing \( N \) with \( M \) within a larger program introduces no additional behaviours.

Theorem 5 (Directional Adequacy)

If \([\![{ M }]\!]^{\texttt{c}} \subseteq [\![{ N }]\!]^{\texttt{c}}\) then \( M \sqsubseteq N \).

Proof

Let \(\varXi \left[ {{-}}\right] : \varGamma \vdash A \rightarrow \cdot \vdash G\) be a program context for some ground \(G\), and assume \( \sigma ,\varXi \left[ { M }\right] \mathrel {\overset{}{\rightsquigarrow *}} \rho , V \) for some \( V \). By soundness, \(\langle {{ \sigma }, { \rho }}\rangle [\![{ V }]\!]^{\texttt{v}} \in [\![{\varXi \left[ { M }\right] }]\!]^{\texttt{c}}\). Using compositionality, by assumption \(\langle {{ \sigma }, { \rho }}\rangle [\![{ V }]\!]^{\texttt{v}} \in [\![{\varXi \left[ { N }\right] }]\!]^{\texttt{c}}\). By the fundamental lemma, \(\varXi \left[ { N }\right] \xrightarrow {\langle {{ \sigma }, { \rho }}\rangle } W \) for some \( W \) such that \([\![{ W }]\!]^{\texttt{v}} = [\![{ V }]\!]^{\texttt{v}}\). They are of ground type, so \( W = V \). Therefore, \( \sigma ,\varXi \left[ { N }\right] \mathrel {\overset{}{\rightsquigarrow *}} \rho , V \).

6.1 Example Transformations

Thanks to directional adequacy, we can now justify various transformations and equivalences using rather mundane calculations, requiring no reasoning about the context in which these transformations are to take place.

Example 9

We make the reasoning from Example 4 precise.

Denote \( \varGamma := \mathtt x : \textbf{Loc}, \mathtt y : \textbf{Val}\). We have \( \varGamma \vdash \mathtt x \mathbin {\boldsymbol{:=}} \mathtt y \mathbin {\boldsymbol{;}} \mathtt x \boldsymbol{?} : \textbf{Val}\) and \( \varGamma \vdash \mathtt x \mathbin {\boldsymbol{:=}} \mathtt y \mathbin {\boldsymbol{;}} \mathtt y : \textbf{Val}\). Let \( \gamma \in [\![{ \varGamma }]\!]\), and denote \( \ell := \gamma \mathtt x \) and \( v := \gamma \mathtt y \). Calculating, we have:

$$\begin{aligned}{}[\![{ \mathtt x \mathbin {\boldsymbol{:=}} \mathtt y \mathbin {\boldsymbol{;}} \mathtt x \boldsymbol{?}}]\!]^{\texttt{c}} \gamma = \tilde{\text {U}}^{}_{ \ell , v } \tilde{\text {Y}}^?\tilde{\text {L}}^{}_{ \ell }\left( { w .\;{}\tilde{\text {Y}}^?\tilde{ w }}\right) \supseteq \tilde{\text {U}}^{}_{ \ell , v } \tilde{\text {Y}}^?\tilde{ v } = [\![{ \mathtt x \mathbin {\boldsymbol{:=}} \mathtt y \mathbin {\boldsymbol{;}} \mathtt y }]\!]^{\texttt{c}} \gamma \end{aligned}$$

By directional adequacy, \( \mathtt x \mathbin {\boldsymbol{:=}} \mathtt y \mathbin {\boldsymbol{;}} \mathtt y \sqsubseteq \mathtt x \mathbin {\boldsymbol{:=}} \mathtt y \mathbin {\boldsymbol{;}} \mathtt x \boldsymbol{?}\).

Example 10

We validate elimination of irrelevant reads, i.e. \( M \sqsubseteq \mathtt x \boldsymbol{?} \mathbin {\boldsymbol{;}} M \):

[Display: the calculation validating the elimination of irrelevant reads.]

As mentioned in Sect. 2.5, the semantics does not validate introduction of irrelevant reads, i.e. we have \([\![{ \mathtt x \boldsymbol{?} \mathbin {\boldsymbol{;}} M }]\!]^{\texttt{c}} \not \subseteq [\![{ M }]\!]^{\texttt{c}}\) even though \( \mathtt x \boldsymbol{?} \mathbin {\boldsymbol{;}} M \sqsubseteq M \).

Example 11

Thanks to our use of standard monad-based semantics, structural transformations and equivalences follow from structural reasoning, avoiding considerations relating to shared-state. For instance:

$$\begin{aligned} [\![{\textbf{if }\, \mathtt y \,\textbf{then}\, \lambda \mathtt x _{}.\, K _{\textbf{true}}\,\textbf{else}\, \lambda \mathtt x _{}.\, K _{\textbf{false}}\,}]\!]^{\texttt{c}} \gamma&= [\![{\lambda \mathtt x _{}.\, K _{ \gamma \mathtt y }}]\!]^{\texttt{c}} \gamma \\&= {\text {return}}\ \lambda { z }.\; [\![{ K _{ \gamma \mathtt y }}]\!]^{\texttt{c}} \left( { \gamma \left[ {{ \mathtt x }\mapsto { z }}\right] }\right) = [\![{\lambda \mathtt x _{}.\,\textbf{if }\, \mathtt y \,\textbf{then}\, K _{\textbf{true}}\,\textbf{else}\, K _{\textbf{false}}\,}]\!]^{\texttt{c}} \gamma \end{aligned}$$

Therefore, \(\textbf{if }\, \mathtt y \,\textbf{then}\, \lambda \mathtt x _{}.\, K _{\textbf{true}}\,\textbf{else}\, \lambda \mathtt x _{}.\, K _{\textbf{false}}\, \cong \lambda \mathtt x _{}.\, \textbf{if }\, \mathtt y \,\textbf{then}\, K _{\textbf{true}}\,\textbf{else}\, K _{\textbf{false}}\,\).

Finally, adequacy can help validate expected transformations involving \((\parallel )\):

Example 12

Defining \({\text {map}} \psi P :=\left\{ \alpha \left( { \psi x }\right) \;\vert \; \alpha x \in P \right\} \) we have:

[Display: laws relating \((\mathbin {\mathrel \vert \mathrel \vert \mathrel \vert })\) to the monad structure, including (Sequencing).]

Unlike the previous examples, proving the above involves careful reasoning at the level of the traces. We still gain the benefit of justifying equivalences and transformations of programs – even open ones – without resorting to analysis under arbitrary program contexts and substitutions:

[Display: the resulting transformations and equivalences for open programs.]

Coordinating the returned values makes these somewhat awkward. More convenient but less informative forms are derivable, such as \( M \mathbin {\boldsymbol{;}} N \mathbin {\boldsymbol{;}} K \sqsubseteq \left( { M \parallel N }\right) \mathbin {\boldsymbol{;}} K \) (mentioned as a transformation in Sect. 1), which is a consequence of (Sequencing).

7 Conclusion, Related Work, and Future Work

We have defined a monad-based denotational semantics for a language for shared-state providing standard higher-order semantics supporting standard meta-theoretic development. This monad is a representation of the one induced by the equational theory of resumptions, which extends non-deterministic global-state with a delaying/yielding operator [14].

Abadi and Plotkin [1] design a modification of the theory of resumptions to define a denotational semantics for a concurrent imperative programming language with cooperative asynchronous threads. We have shown that the theory of resumptions can be used as-is to define denotational semantics for concurrency, albeit of a different kind. It is interesting to note that they interpret the unary operator analogously to our interpretation of \(\text {Y}^?\), rather than \(\text {Y}\). By decomposing \(\text {Y}^? t \) into the sum \( t \vee \text {Y} t \) we were able to validate transformations that are not equivalences.

Benton et al. [4] also define a monad for higher-order shared-state, with additional features such as recursion and abstract locations, using Brookes’s style of semantics. In contrast, the monad we defined is presented algebraically, and uses finite sets of traces, whereas Benton et al.’s denotations are infinite even for recursion-free programs. Although this finiteness makes our definition simpler, we saw in Example 10 that it leads to a resumption-counting issue, and thus a less abstract semantics. It would be interesting to analyse their semantic model from the algebraic perspective, as it may lead to a more abstract semantics.

As in previous work, including that mentioned above, our semantics is based on sets of traces, originally used by Brookes [6] to define denotational semantics for an imperative concurrent language. Brookes proved that his semantics is not only directionally adequate, but also fully abstract. The proof makes crucial use of atomic execution blocks, which we have not included.

Birkedal et al. [5] provide an interesting related model, given by logical relations (step-indexed, Kripke, etc.) over syntactic terms as semantics. Their language is substantially more expressive including higher-order local store, and accounts for a type-and-effect system semantics. A more precise model could lead to a monadic account that reproduces these results less syntactically.

Also of note are process calculi and algebraic laws concerning the structure of programs. Hoare and van Staden [12] give such an account for concurrent programs, unifying previous work. Their laws are much more general, parameterizing over the notions of sequencing programs and running programs in parallel. It would be interesting to discover if and how our semantics is an instance of theirs. There is also a lot of work on semantics of “while” languages where all information flows through the state, which support more advanced features such as probabilistic choice [3, 11, 41]. Others approach the study of concurrency through game semantics, such as Jaber and Murawski’s [15] study of the semantics of a higher-order call-by-value concurrent language. Trace semantics features in their study too, though their traces are quite different, being sequences of player/opponent actions that incrementally transform configurations.

In the future we plan to refine the type system into a type-and-effect system [18, 20, 22, 29, 39, 40], by annotating the typing judgments with the allowed effects. The denotations then depend on the effect annotations, with each annotation having its own associated equational theory. This may allow additional transformations that are currently beyond this model’s reach. For example, the converse of (Sequencing) under certain syntactic and static guarantees would enable compiler parallelism.

Atomic constructs that disallow interference from the environment are a common feature of concurrent languages. Adding such constructs may be a simple matter, since we have a dedicated operator, yield, for allowing interference. Nevertheless, in the spirit of reductionism, we leave this investigation to future work.

We would also like to see how well our approach extends to weak-memory models. In particular, we believe that the timestamp-based operational semantics of the release-acquire memory model [19, 24, 26, 37] is amenable to a similar treatment by using more sophisticated traces.