Program Extraction for Mutable Arrays

Sakaguchi, Kazuhiko

doi:10.1007/978-3-319-90686-7_4

Kazuhiko Sakaguchi¹⁵

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 10818))

Included in the following conference series:

International Symposium on Functional and Logic Programming

546 Accesses
2 Citations
2 Altmetric

Abstract

We present a mutable array programming library for the Coq proof assistant which enables simple reasoning method based on Ssreflect/Mathematical Components, and extractions of the efficient OCaml programs using in-place updates. To refine the performance of extracted programs, we improved the extraction plugin of Coq. The improvements are based on trivial transformations for purely functional programs and reduce the construction and destruction costs of (co)inductive objects, and function call costs effectively. As a concrete application for our library and the improved extraction plugin, we provide efficient implementations, proofs, and benchmarks of two algorithms: the union–find data structure and the quicksort algorithm.

Access provided by CONRICYT-eBooks. Download conference paper PDF

Quantified Invariants via Syntax-Guided Synthesis

Extending VIAP to Handle Array Programs

Extending the Theory of Arrays: memset , memcpy , and Beyond

1 Introduction

The program extraction mechanism [7,8,9, 15] of the Coq proof assistant [20] is a code generation method for obtaining certified functional programs from constructive formal proofs and definitions by eliminating the non-informative part and widely used for developments of high-reliability software. For example, the CompCert project [6] uses Coq and its program extraction mechanism for obtaining a formally verified and executable C compiler which guarantees the correctness of the translation.

Verification and code generation (including extraction) of programs with side effects are important issues to apply proof assistants to realistic software developments. Particularly, in-place updates of mutable objects are important to increase the efficiency of programs. There are many recent studies to support writing, reasoning about, and generating programs with side effects in proof assistants, e.g., Coq (Ynot) [13], Isabelle/HOL [2], Idris [1], and F$^\star $ [17]. Ynot is such a representative Coq library which can handle various kind of side-effects, e.g., accessing reference cells, non-termination, throwing/catching exceptions, and I/O. Ynot is based on an axiomatic extension for Coq called Hoare Type Theory (HTT) [12] and supports reasoning with separation logic [14] which are good at its expressiveness, but not good at reducing proof burden in Coq.

This study establishes a novel, lightweight, and axiom-free method for verification and extraction of efficient programs using mutable arrays. Our library supports a simple monadic DSL for mutable array programming and powerful and simple reasoning method, and which is achieved by focusing only on mutable arrays and doing away with more side-effects such as reference cells and local states. Our contribution consists of three parts:

In Sect. 3, we define a state monad specialized for mutable array programming—the array state monad—and give two interpretations of it for reasoning and program extraction. The former interpretation is defined in a purely functional way with building blocks taken from the Ssreflect/Mathematical Components (MathComp) library [11, 21] and makes it possible to reduce the reasoning tasks on effectful programs to those on purely functional programs. The latter interpretation enables extraction of efficient effectful programs and provides encapsulation function like $\texttt {runST}$ of state threads [5] which corresponds to the interpretation function in the former interpretation. The encapsulation mechanism converts effectful functions written by the array state monad to referential transparent functions and it also enables encapsulation of proofs.
In Sect. 4, we present two new optimization techniques for the program extraction plugin. The optimizations are based on well-known transformations of purely functional programs, but effectively reduce the execution time of programs extracted with our library. More generally, the optimizations are especially effective for two cases: 1. proofs using mathematical structures and its theories provided by the MathComp library and 2. programs using monads that have functional types, e.g., State, Reader, and Continuation monads.
In Sect. 5, we demonstrate elegant formalization techniques for programs using mutable arrays, and efficiency of the extracted programs using our library and the improved extraction, through the two applications—the union–find data structure and the quicksort algorithm.

Our formalization of the quicksort algorithm uses the theory of permutations provided by the perm library of MathComp, and we show that some properties of permutations are also helpful for the reasoning of the quicksort algorithm. We also show benchmark results of the applications, and it indicates that performances of extracted programs using our library and the improved extraction are comparable to handwritten OCaml implementations of same algorithms and much better than purely functional implementations of the same kind of algorithms.

The source code of our library, the improved extraction plugin, case studies, benchmark scripts and patches for existing libraries are available at:

https://github.com/pi8027/efficient-finfun.

2 Finite Types and Finite Functions in Coq

This section briefly introduces the fintype and finfun libraries from the MathComp library with some modifications. The former provides an interface for types with finitely many elements—finite types. The latter provides a type of functions with finite domains—finite functions or finfuns—which is used as a representation of arrays in this paper and relies on the former part. Key definitions and lemmas of the both libraries are listed in Table 1 and described below. In our previous work [16], we modified these libraries to improve the efficiency of code extracted from proofs.

Table 1. Key definitions and lemmas in the modified fintype and finfun libraries

Full size table

2.1 A Finite Type Library—fintype

A finite type is a type with finitely many elements. The fintype library provides definitions of a class of finite types () and its basic operations. The class of finite types is defined as a canonical structure [10] which contains a type and a witness of its finiteness. In the original fintype library, such finiteness of a type is characterized by a duplicate-free enumeration of elements of . In the modified one, it is recharacterized by a pair of the natural cardinal number and a bijection between and a finite ordinal type . The bijection is given by a pair of an encoding function of type and a decoding function of type .

The two most basic operations on finite types are the enumeration () and the cardinality () of a subset of a finite type. We also provide accessors for the cardinal number and the bijection used by the new characterization of finiteness. The former is

notation, and the latter is and functions. For any

is equal to and more efficiently computable than .

Many canonical instances are given by the MathComp library: , finite ordinals , , finite functions with a finite codomain, finite subsets , symmetric groups , and more.

2.2 A Finite Function Library—finfun

The finfun library provides definitions of a type of finfuns () and its basic operations. Finfuns from to is defined as tuples of . Tuples are size-fixed lists defined in the tuple library, and easily translated to arrays in extracted programs by the command.

The two most basic operations of finfuns are the application () and the construction from CiC functions (). The application is the -th element of the underlying tuple of . The construction is the finfun extensionally equal to , and the underlying tuple of it associates for each -th element. Both of them can be efficiently computable by using $\texttt {Array.get}$ and $\texttt {Array.init}$ functions respectively in extracted OCaml programs.

The application can be expressed as because is the coercion from finfuns to CiC functions. The finfun library also provides constructor notations and that are equivalent to and respectively.

3 Representing Mutable Arrays in Coq

The state monad [22] is useful to represent computations with states in purely functional languages such as Haskell and Coq. Actions of the state monad have types of the form $S \rightarrow S \times A$ where S is a state type and A is a return type. They take the initial state as its arguments, and returns the result of the type A paired with the final state. To represent various computations with mutable arrays, we need an array state monad that hold two conditions: (C1) it can handle multi-dimensional and multiple mutable arrays, and (C2) it never needs copy operations on arrays.

The former part of (C1)—handling multi-dimensional arrays—is achieved naturally by representing arrays by finfuns, because the finfuns of type $\{\mathsf {ffun}~I_1 \times \dots \times I_n \rightarrow A \}$ correspond to the multi-dimensional arrays of A indexed by $I_1, \dots , I_n$.

Monad transformers are well-known methods to compose monadic effects, and the state monad transformer seem to be suitable for solving the latter part of (C1)—handling multiple arrays. However, if we allow one to compose the array state monad and other monads, the condition (C2) does not hold. For example, the actions of the monad that is a composition of the state monad transformer and the list monad have a type $S \rightarrow \mathrm{list}(S \times A)$, and it needs to copy the state on each branch of computation. Therefore, the array state monad should be defined in a more refined way.

We solve the above problem by defining the array state monad as an inductive data type that has a restricted set of primitive operations shown below:

is a type of array state monad actions with indexed mutable arrays of respectively and the return type . The first argument of is called a signature and indicates types of mutable arrays. Let $\varSigma $ be a metavariable of signatures, and $\varSigma _i$ means i-th element of the signature $\varSigma $. We refer to the array corresponds to the $\varSigma _i$ as the i-th array (of the signature $\varSigma $).

Each constructor of corresponds to return, bind, lift, get and set operators. The lift operator can lift array state monad actions of a signature $\varSigma $ to that with a signature $(I, T)\,{:}{:}\,\varSigma $, and lifted actions does not affect the first array. Get and set operators can only access the first array. The lift operator is necessary to get and set element of an array after the second, and it is also useful for modular programming.

We also define aliases for all constructors of to avoid the construction and destruction costs of tuples in extracted OCaml programs^{Footnote 1}, e.g.:

Primitive get and set operators take an index of type . Indices of type are also applicable by using the encoding function.

Programs represented by values cannot run directly. We give an interpretation for the array state monad by a translation to the functions of type $S \rightarrow S \times A$, where S is a type of mutable arrays defined as follows:

takes a signature $[(I_1, T_1); \dots ; (I_n, T_n)]$, and returns a Cartesian product of all types of arrays in the signature: $\mathrm{unit} \times \{\mathsf {ffun} ~ I_n \rightarrow T_n\} \times \dots \times \{\mathsf {ffun}~I_1 \rightarrow T_1\}$. We choose this order of types to omit parenthesis, because the $\times $ and $(\cdot , \cdot )$ operators have left associativity in Coq.

The translation is defined as follows:

is a pure set function for finfuns. It takes an index i : T, a value x : A and a finfun $f : \{\mathsf {ffun}~T \rightarrow A\}$, and returns a new finfun $f'$ which is equal to f except that the i-th element is changed to x. is a interpretation function for the array state monad which is inductively defined on values.

3.1 Program Extraction for the Array State Monad

This section provides a method to extract efficient stateful OCaml programs from Coq proofs which use the array state monad. In another point of view, we give an another interpretation for array state monad by the OCaml program extraction.

In stateful settings, state propagation can be achieved by in-place updates instead of state monad style propagation, and moreover it is not needed to return a new state in each action. We defined the array state monad as an inductive data type only because to restrict primitive operations and its case analysis is never used except for . Thus we interpret array state monad actions as OCaml functions which take states and return its result by the command:

We also give same realizations of aliases for the constructors of :

Finally, we provide an encapsulation function for the array state monad as a realization of the function:

The encapsulation is achieved by duplicating all the input (initial) arrays by $\texttt {Array.copy}$ and using the copied arrays in the execution of effectful actions. As a result, the range of in-place updates is limited to the copied arrays and it never affects outside of the .

3.2 Small Example: Swap Two Elements

Let us show a small example — swap — of programming and verification with the array state monad. The action $\mathrm{swap}(i, j)$ takes i-th and j-th values of the first array by , and then set them reversely by .

and are “do”-like notations which are equivalent to and respectively. Correctness of the swap action is described by the following lemma.

is a permutation (bijection) on which transposes and . is if if and otherwise . This formulation is useful for reasoning on a sequence of actions, e.g., sorting algorithms.

Operations of the array state monad can be erased from the goal by the simplification tactic . More generally, the case analysis and the simplification work as the erasure.

The encoding/decoding functions can be erased by the lemma. Both sides of equation have the form of , thus we use the congruence rule.

To prove a equation of finfuns, we use the lemma of functional extensionality . Applications of finfuns can be unfolded by the lemma.

The remaining goal can be proved by case analysis on comparison and .

The lift operator is also erased by similar method. Here is a correctness proof of the lifted swap action.

4 Optimizations by an Improved Extraction Plugin

We provide two modifications for the extraction plugin to improve the efficiency of extracted programs, particularly which use the array state monad. The Coq program extraction translates Gallina^{Footnote 2} programs to target languages (OCaml, Haskell, Scheme, and JSON) and consists of three translations: 1. extraction from Gallina to MiniML, an intermediate abstract language for program extraction, 2. simplification (optimization) of MiniML terms, and 3. translation from MiniML to target languages.

The modifications are of the part 2 and 1 of the translation. The former reduces the construction and destruction costs of (co)inductive objects by inlining. The latter reduces the function call costs by applying $\eta $-expansion to match expressions and distributing the added arguments to each branch. Each optimization is particularly effective for programs using the MathComp library and monadic programs respectively.

4.1 Destructing Large Records

The MathComp library provides many mathematical structures [4] as canonical structures, e.g., , , etc. which are represented as nested records in extracted programs. For example, the modified definition is translated to a nested record with 5 constructors and 11 fields by the program extraction. We implemented additional simplification mechanisms for MiniML terms to prevent performance degradation caused by handling such large records.

This section describes simplification rules for MiniML terms provided by the original and improved extraction plugin. The rules act on type coercion ($\texttt {Obj.magic}$ in OCaml, and $\texttt {unsafeCoerce}$ in Haskell) are omitted here because of simplicity. The key simplification rule for unfolding pattern matchings provided by the original extraction plugin is generalized $\iota $-reduction relation.

Definition 1

(Generalized $\iota $-reduction). We inductively define an auxiliary relation $\leadsto _\iota ^{\overline{ cl }}$ for a sequence of pattern-matching clauses $\overline{ cl } = C_1(\overline{x_1}) \rightarrow u_1 \mid \dots \mid C_n(\overline{x_n}) \rightarrow u_n$ by the following three rules:

$$\begin{aligned} C_i(t_1, \dots , t_m)&\leadsto _\iota ^{\overline{ cl }} \begin{array}[c]{l} \mathsf {let} ~ x_{i, 1} := t_1 ~ \mathsf {in} \dots \\ \mathsf {let} ~ x_{i, m} := t_m ~ \mathsf {in} ~ u_i \end{array} \end{aligned}$$

(1)

$$\begin{aligned} t \leadsto _\iota ^{\overline{ cl }} t' \Rightarrow \mathsf {let} ~ x := u ~ \mathsf {in} ~ t&\leadsto _\iota ^{\overline{ cl }} \mathsf {let} ~ x := u ~ \mathsf {in} ~ t' \end{aligned}$$

(2)

$$\begin{aligned} t_1 \leadsto _\iota ^{\overline{ cl }} t'_1 \wedge \dots \wedge t_m \leadsto _\iota ^{\overline{ cl }} t'_m \Rightarrow \begin{array}[c]{l} \mathsf {match} ~ t ~ \mathsf {with} ~ \\ ~ \mid D_1(\overline{x_1}) \rightarrow t_1 \\ ~ \mid \dots \\ ~ \mid D_m(\overline{x_m}) \rightarrow t_m \end{array}&\leadsto _\iota ^{\overline{ cl }} \begin{array}[c]{l} \mathsf {match} ~ t ~ \mathsf {with} ~ \\ ~ \mid D_1(\overline{x_1}) \rightarrow t'_1 \\ ~ \mid \dots \\ ~ \mid D_m(\overline{x_m}) \rightarrow t'_m \end{array} \end{aligned}$$

(3)

The generalized $\iota $-reduction relation $\leadsto _\iota $ is defined as follows:

$$\begin{aligned} t \leadsto _\iota ^{\overline{ cl }} t'&\Rightarrow (\mathsf {match} ~ t ~ \mathsf {with} ~ \overline{ cl }) \leadsto _\iota t' \end{aligned}$$

(4)

The combination of the rules (1) and (4) provides simple $\iota $-reduction. The rules (2) and (3) allows to traverse nested match and let expressions in the head of term t in the rule (4).

Other simplification rules are listed below.

$$\begin{aligned} (\lambda x.\, t) \, u&\leadsto \mathsf {let} ~ x := u ~ \mathsf {in} ~ t \end{aligned}$$

(5)

$$\begin{aligned} \mathsf {let} ~ x := u ~ \mathsf {in} ~ t&\leadsto t[x := u] \quad \begin{array}[t]{l} (t \text { or } u \text { is atomic }, \\ {(}\text {or } x \text { occurs at most once}) \end{array} \end{aligned}$$

(6)

$$\begin{aligned} (\mathsf {let} ~ x := t_1 ~ \mathsf {in} ~ t_2) ~ u_1 \dots u_n&\leadsto \mathsf {let} ~ x := t_1 ~ \mathsf {in} ~ t_2 ~ u_1 \dots u_n \end{aligned}$$

(7)

$$\begin{aligned} \begin{array}{l} (\mathsf {match} ~ t ~ \mathsf {with} ~\\ ~ \mid C_1(\overline{x_1}) \rightarrow t_1 \\ ~ \mid \dots \\ ~ \mid C_n(\overline{x_n}) \rightarrow t_n \\ ) \, u_1 \dots u_m \end{array}&\leadsto \begin{array}[c]{l} \mathsf {match} ~ t ~ \mathsf {with} ~\\ ~ \mid C_1(\overline{x_1}) \rightarrow t_1 \, u_1 \dots u_m \\ ~ \mid \dots \\ ~ \mid C_n(\overline{x_n}) \rightarrow t_n \, u_1 \dots u_m \\ {} \end{array} \end{aligned}$$

(8)

$$\begin{aligned} \begin{array}{l} \mathsf {match} ~ t ~ \mathsf {with} ~\\ ~ \mid C_1(\overline{x_1}) \rightarrow \lambda y_1 \dots y_m.\, t_1 \\ ~ \mid \dots \\ ~ \mid C_n(\overline{x_n}) \rightarrow \lambda y_1 \dots y_m.\, t_n \end{array}&\leadsto \begin{array}{l} \lambda y_1 \dots y_m.\, \begin{array}[t]{l} \mathsf {match} ~ t ~ \mathsf {with} ~\\ ~ \mid C_1(\overline{x_1}) \rightarrow t_1 \\ ~ \mid \dots \\ ~ \mid C_n(\overline{x_n}) \rightarrow t_n \end{array} \end{array} \end{aligned}$$

(9)

$$\begin{aligned} \mathsf {let} ~ x := (\mathsf {let} ~ y := t_1 ~ \mathsf {in} ~ t_2) ~ \mathsf {in} ~ t_3&\leadsto \mathsf {let} ~ y := t_1 ~ \mathsf {in} ~ (\mathsf {let} ~ x := t_2 ~ \mathsf {in} ~ t_3) \end{aligned}$$

(10)

$$\begin{aligned} \mathsf {let} ~ x := C(t_1, \dots , t_n) ~ \mathsf {in} ~ u&\leadsto \begin{array}[c]{l} \mathsf {let} ~ y_1 := t_1 ~ \mathsf {in} \dots \mathsf {let} ~ y_n := t_n ~ \mathsf {in} \\ \mathsf {let} ~ x := C(y_1, \dots , y_n) ~ \mathsf {in} ~ u' \end{array} \end{aligned}$$

(11)

where $u'$ in the rule (11) is obtained by replacing all the subterms of the form $(\mathsf {match} ~ x ~ \mathsf {with} \dots \mid C(z_1, \dots , z_n) \rightarrow t \mid \dots )$ in the u with the term $t[z_1 := y_1, \dots , z_n := y_n]$ which is a $\iota $-reduced term of $(\mathsf {match} ~ C(y_1, \dots , y_n) ~ \mathsf {with} \dots \mid C(z_1, \dots , z_n) \rightarrow t \mid \dots )$.

All simplification rules are safe for purely functional programs, but not safe for OCaml programs generally. The rules (6), (8) and (9) may change the execution order of code and skip executing some code, and other rules also help to apply these rules. Therefore it is difficult to implement these rules as optimizations for OCaml programs, and it is appropriate to implement it as a MiniML optimizer.

The generalized $\iota $-reduction without the rule (2) and the rules (5) through (9) are provided by the original extraction plugin. We additionally implemented the rules (2), (10), and (11) to it. The rule (11) recursively destructs nested records, and the rule (10) assist it in the case of a term of the form $(\mathsf {let} ~ x := (\mathsf {let} ~ y := t ~ \mathsf {in} ~ C(t_1, \dots , t_n)) ~ \mathsf {in} ~ \dots )$ occurred.

Let us exemplify a simplification process of the following definition^{Footnote 3}:

The simplification process is as follows. Constants inlined by extraction are underlined here. Identifiers , and are abbreviated here as $\texttt {nat}_\texttt {eq}$, $\texttt {Pack}_\texttt {eq}$, and $\texttt {Mixin}_\texttt {eq}$.

can be unfolded by rules (6) and (4) if has occurred only once. However, we need the rule (11) to apply the $\iota $-reduction over the $\mathsf {let}$ expressions which cannot be simplified by the rule (6).

4.2 $\eta $-expansion on Case Analysis

Match expressions returning a function are commonly used in monadic programming and dependently typed programming. However, they increase the number of function calls and closure allocations and decrease the performance of programs. Let us consider the following monadic program that branches depending on whether the integer state is even or odd:

By unfolding the $\mathsf {get}$ and in the above program, a match expression returning a function^{Footnote 4} can be found:

$$ \lambda n : \mathbb {Z}.\, (\mathsf {if} ~ n \bmod 2 = 0 ~ \mathsf {then} ~ f ~ \mathsf {else} ~ g) \, n. $$

Another example from dependently typed programming is a map function for size-fixed vectors:

The match expressions in the above examples can be optimized by the rules (8) and (9) respectively, and those rules can be applied for many other cases. However, these rules are not complete because of its syntactic restriction: match expressions to which that rules are applied should have following arguments or have $\lambda $-abstractions for each branch. Therefore, we apply the full $\eta $-expansion on all match expressions in the process of extraction from Gallina to MiniML.^{Footnote 5} The rule (8) can be applied for $\eta $-expanded match expressions.

We additionally provide new Vernacular command to declare the arity of an inductive type which is realized by the command, because the Coq system cannot recognize that the arity of is 1. Declared arities are used for full $\eta $-expansion described above. This command can be used as follows:

5 Case Studies

This section demonstrates our library and improved extraction plugin using two applications: the union–find data structure and the quicksort algorithm. Here we provide an overview of formalizations and the performance comparison between the extracted code and other implementations for each application.

All the benchmark programs were compiled by OCaml 4.05.0+flambda with optimization flags $\texttt {-O3 -remove-unused-arguments -unbox-closures}$ and performed on a Intel Core i5-7260U CPU @ 2.20 GHz equipped with 32 GB of RAM. Full major garbage collection and heap compaction are invoked before each measurement. All benchmark results represent the median of 5 different measurements with same parameters.

5.1 The Union–find Data Structure

We implemented and verified the union–find data structure with the path compression and the weighted union rule [18, 19]. The formalization takes 586 lines, and most key properties and reasoning on the union–find can be easily written out by using the path and fingraph library.

The benchmark results are shown in the Fig. 1. Here we compare the execution times of the OCaml code extracted from above formalization (optimized and unoptimized^{Footnote 6} version) and handwritten OCaml implementation of same algorithm. The procedure to be measured here is the sequence of $\mathsf {union}$ n times and $\mathsf {find}$ n times on n vertices union–find data structure, where all parameters of $\mathsf {union}$s and $\mathsf {find}$s are randomly selected. The time complexity of this procedure is ${\Theta }(n\alpha (n, n))$ where $\alpha $ is a functional inverse of Ackermann’s function. The results indicate that the OCaml implementation is about 1.1 times faster than the optimized Coq implementation, the optimized Coq implementation is about 1.1 times faster than the unoptimized Coq implementation, and the execution times of all implementations increase slightly more than linear.

5.2 The Quicksort Algorithm

We implemented and verified the quicksort algorithm by using the array state monad. The formalization including partitioning and the upward/downward search takes 365 lines, and the key properties are elegantly written out by using the theory of permutations. Here we explain the formalization techniques used for the quicksort algorithm by taking the partitioning function as an example. The partitioning function for indexed arrays of and comparison function ^{Footnote 7} has the following type:

The takes a pivot of type and range of partition represented by indices , reorders the elements of the arrays from index to so that elements less than the pivot come before all the other elements, and returns the partition position. We proved the correctness of as the following lemma:

The can be applied for goals including some executions of without giving concrete parameters by using the simple idiom , and it obtains the properties of described below. It is achieved by a Ssreflect convention [11, Sect. 4.2.1], which means separating the specification from the lemma as the coinductive type family .

In the specification , the finfun , permutation , and index indicates the initial state, permutation performed by the partition, and partition position respectively, and the final state is represented by which means the finfun permuted by . Properties of the partition are given as arguments of and numbered from to in above code. Each property has the following meaning:

1.
the partition position is between and ,
2.
the partition only replaces value in the range from to ,
3.
values in the range from to are less than the pivot, and
4.
values in the range from to are greater than or equal to the pivot.

Of particular interest is that the property 2 can be written in the form of which means that the permutation only displaces elements of . is used for constructing algebraic theory in the MathComp library, but is also helpful for reasoning on sorting algorithms.

The benchmark results are shown in the Fig. 2. Here we compare the execution times of following implementations of sorting algorithms for randomly generated arrays or lists of integers: 1. and 2. $\texttt {Array.sort}$ taken from OCaml standard library, 3. handwritten OCaml implementation of the quicksort, 4. optimized and 5. unoptimized OCaml code extracted from above formalization, and 6. OCaml code extracted from a Coq implementation of the bottom-up mergesort algorithm for lists. The results indicate that the implementation 5 (quicksort in Coq, unoptimized) is slowest of those, the implementation 4 (quicksort in Coq, optimized) is 1.7–1.8 times faster than implementations 5 and 6, and OCaml implementations 1, 2 and 3 are 1.07–1.8 times faster than the implementation 3.

6 Related Work

Ynot [13] is a representative Coq library for verification and extraction of higher-order imperative programs. Ynot supports various kind of side-effects, separation-logic-based reasoning method, and automation facility [3] for it, and provides many example implementations of and formal proofs for data structures and algorithms including the union–find data structure. The formal development of the union–find provided by Ynot takes 1,067 lines of code, and our formal development of it (see Sect. 5.1) takes 586 lines. Both implementations use almost the same optimization strategies: the union by rank and the weighted union rule respectively, and the path compression. This comparison indicates that Ynot is good at its expressiveness, but our method has smaller proof burden.

7 Conclusion

We have established a novel, lightweight, and axiom-free method for verification and extraction of efficient effectful programs using mutable arrays in Coq. This method consists of the following two parts: a state monad specialized for mutable array programming (the array state monad) and an improved extraction plugin for Coq. The former enables a simple reasoning method for and safe extraction of efficient effectful programs. The latter optimizes programs extracted from formal developments using our library, and it is also effective for mathematical structures provided by the MathComp library and monadic programs.

We would like to improve this study with more expressive array state monad, large and realistic examples, and correctness proof of our extraction method for effectful programs.

Notes

1.
In OCaml programs, arguments of constructors are parenthesized and comma separated. If a constructor is replaced with some function by the command, arguments of the constructor are interpreted as tuples.
2.
The specification language of Coq.
3.
is the canonical instance of nat.
4.
expressions are syntax sugar for match expressions in Coq.
5.
Implementing it as a part of the simplification of MiniML terms is difficult, because MiniML is a type-free language.
6.
It is extracted by disabling new optimization mechanisms described in Sect. 4, but compiled with same OCaml compiler and optimization flags.
7.
Here we assume that is a total order and means “less than or equal to” in some sense.

References

Brady, E.: Programming and reasoning with algebraic effects and dependent types. In: ICFP 2013, pp. 133–144. ACM (2013)
Google Scholar
Bulwahn, L., Krauss, A., Haftmann, F., Erkök, L., Matthews, J.: Imperative functional programming with Isabelle/HOL. In: Mohamed, O.A., Muñoz, C., Tahar, S. (eds.) TPHOLs 2008. LNCS, vol. 5170, pp. 134–149. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-71067-7_14
Chapter Google Scholar
Chlipala, A., Malecha, G., Morrisett, G., Shinnar, A., Wisnesky, R.: Effective interactive proofs for higher-order imperative programs. In: ICFP 2009, pp. 79–90. ACM (2009)
Google Scholar
Garillot, F., Gonthier, G., Mahboubi, A., Rideau, L.: Packaging mathematical structures. In: Berghofer, S., Nipkow, T., Urban, C., Wenzel, M. (eds.) TPHOLs 2009. LNCS, vol. 5674, pp. 327–342. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-03359-9_23
Chapter Google Scholar
Launchbury, J., Peyton Jones, S.L.: Lazy functional state threads. In: PLDI 1994, pp. 24–35. ACM (1994)
Google Scholar
Leroy, X.: A formally verified compiler back-end. J. Autom. Reason. 43(4), 363–446 (2009)
Article MathSciNet Google Scholar
Letouzey, P.: A new extraction for Coq. In: Geuvers, H., Wiedijk, F. (eds.) TYPES 2002. LNCS, vol. 2646, pp. 200–219. Springer, Heidelberg (2003). https://doi.org/10.1007/3-540-39185-1_12
Chapter MATH Google Scholar
Letouzey, P.: Programmation fonctionnelle certifiée - L’extraction de programmes dans l’assistant Coq. Ph.D. thesis, Université Paris-Sud (2004)
Google Scholar
Letouzey, P.: Extraction in Coq: an overview. In: Beckmann, A., Dimitracopoulos, C., Löwe, B. (eds.) CiE 2008. LNCS, vol. 5028, pp. 359–369. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-69407-6_39
Chapter Google Scholar
Mahboubi, A., Tassi, E.: Canonical structures for the working Coq user. In: Blazy, S., Paulin-Mohring, C., Pichardie, D. (eds.) ITP 2013. LNCS, vol. 7998, pp. 19–34. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-39634-2_5
Chapter Google Scholar
Mahboubi, A., Tassi, E.: Mathematical components (2016). https://math-comp.github.io/mcb/book.pdf
Nanevski, A., Morrisett, G., Birkedal, L.: Hoare type theory, polymorphism and separation. J. Funct. Prog 18(5–6), 865–911 (2008)
Article MathSciNet Google Scholar
Nanevski, A., Morrisett, G., Shinnar, A., Govereau, P., Birkedal, L.: Ynot: dependent types for imperative programs. In: ICFP 2008, pp. 229–240. ACM (2008)
Article Google Scholar
O’Hearn, P., Reynolds, J., Yang, H.: Local reasoning about programs that alter data structures. In: Fribourg, L. (ed.) CSL 2001. LNCS, vol. 2142, pp. 1–19. Springer, Heidelberg (2001). https://doi.org/10.1007/3-540-44802-0_1
Chapter Google Scholar
Paulin-Mohring, C.: Extracting ${F}_{\omega }$’s programs from proofs in the Calculus of Constructions. In: POPL 1989, pp. 89–104. ACM (1989)
Google Scholar
Sakaguchi, K., Kameyama, Y.: Efficient finite-domain function library for the Coq proof assistant. IPSJ Trans. Prog. 10(1), 14–28 (2017)
Google Scholar
Swamy, N., Hriţcu, C., Keller, C., Rastogi, A., Delignat-Lavaud, A., Forest, S., Bhargavan, K., Fournet, C., Strub, P.Y., Kohlweiss, M., Zinzindohoue, J.K., Zanella-Béguelin, S.: Dependent types and multi-monadic effects in F$^\star $. In: POPL 2016, pp. 256–270. ACM (2016)
Google Scholar
Tarjan, R.E.: Efficiency of a good but not linear set union algorithm. J. ACM 22(2), 215–225 (1975)
Article MathSciNet Google Scholar
Tarjan, R.E., van Leeuwen, J.: Worst-case analysis of set union algorithms. J. ACM 31(2), 245–281 (1984)
Article MathSciNet Google Scholar
The Coq Development Team: The Coq Proof Assistant Reference Manual (2017). https://coq.inria.fr/distrib/V8.7.0/refman/
The Mathematical Components Project: The mathematical components repository. https://github.com/math-comp/math-comp
Wadler, P.: Monads for functional programming. In: Jeuring, J., Meijer, E. (eds.) AFP 1995. LNCS, vol. 925, pp. 24–52. Springer, Heidelberg (1995). https://doi.org/10.1007/3-540-59451-5_2
Chapter Google Scholar

Download references

Acknowledgments

We thank Yukiyoshi Kameyama and anonymous referees for valuable comments on an earlier version of this paper. This work was supported by JSPS KAKENHI Grant Number 17J01683.

Author information

Authors and Affiliations

University of Tsukuba, Tsukuba, Japan
Kazuhiko Sakaguchi

Authors

Kazuhiko Sakaguchi
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Kazuhiko Sakaguchi .

Editor information

Editors and Affiliations

Roskilde University, Roskilde, Denmark
John P. Gallagher
Karlsruhe University of Applied Sciences, Karlsruhe, Germany
Martin Sulzmann

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Sakaguchi, K. (2018). Program Extraction for Mutable Arrays. In: Gallagher, J., Sulzmann, M. (eds) Functional and Logic Programming. FLOPS 2018. Lecture Notes in Computer Science(), vol 10818. Springer, Cham. https://doi.org/10.1007/978-3-319-90686-7_4

Download citation

DOI: https://doi.org/10.1007/978-3-319-90686-7_4
Published: 24 April 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-90685-0
Online ISBN: 978-3-319-90686-7
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Program Extraction for Mutable Arrays

Abstract

Similar content being viewed by others

Quantified Invariants via Syntax-Guided Synthesis

Extending VIAP to Handle Array Programs

Extending the Theory of Arrays: memset , memcpy , and Beyond

1 Introduction

2 Finite Types and Finite Functions in Coq

2.1 A Finite Type Library—fintype

2.2 A Finite Function Library—finfun

3 Representing Mutable Arrays in Coq

3.1 Program Extraction for the Array State Monad

3.2 Small Example: Swap Two Elements

4 Optimizations by an Improved Extraction Plugin

4.1 Destructing Large Records

Definition 1

4.2 \(\eta \)-expansion on Case Analysis

5 Case Studies

5.1 The Union–find Data Structure

5.2 The Quicksort Algorithm

6 Related Work

7 Conclusion

Notes

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Publish with us

Navigation

Program Extraction for Mutable Arrays

Abstract

Similar content being viewed by others

Quantified Invariants via Syntax-Guided Synthesis

Extending VIAP to Handle Array Programs

Extending the Theory of Arrays: memset , memcpy , and Beyond

1 Introduction

2 Finite Types and Finite Functions in Coq

2.1 A Finite Type Library—fintype

2.2 A Finite Function Library—finfun

3 Representing Mutable Arrays in Coq

3.1 Program Extraction for the Array State Monad

3.2 Small Example: Swap Two Elements

4 Optimizations by an Improved Extraction Plugin

4.1 Destructing Large Records

Definition 1

4.2 \(\eta \)-expansion on Case Analysis

5 Case Studies

5.1 The Union–find Data Structure

5.2 The Quicksort Algorithm

6 Related Work

7 Conclusion

Notes

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Share this paper

Publish with us

Search

Navigation