1 Introduction

The Internet of Things (IoT), Cyber-Physical Systems (CPS), and related initiatives point out a trend in informatics where computation and interaction are increasingly pervasive and ubiquitous, and carried on by a potentially huge and dynamic set of heterogeneous devices deployed in physical space. To address the intrinsic complexity of these settings, a new viewpoint is emerging: a large-scale network of devices, situated in some environment (e.g., the urban area of a smart city), can be seen as a computational overlay of the physical world, to be programmed as a “collective” exhibiting robustness and resiliency by inherent adaptation processes. These kinds of systems are sometimes referred to as Collective Adaptive Systems (CAS)  [1], to emphasise that computational activities are collective (i.e., they involve multiple coordinated individuals), and that a main expected advantage is inherent adaptivity of behaviours to unforeseen changes (e.g., as induced by changes/faults in the computational environment or interactions with humans or other systems).

Aggregate Computing [2] is an approach to CAS engineering that takes a global stance to design and programming and where coordinated adaptation is a key feature. Hence, it targets problems and application domains such as crowd engineering, complex situated coordination, robot/UAV swarms, smart ecosystems and the like  [3]. Its key idea is to program a large system as a whole, that is, to directly consider an ensemble of devices as the target machine to be programmed, and provide under-the-hood, automatic global-to-local mapping: once the desired system-level behaviour is expressed by a global program, then individual computational entities of an aggregate are bound to play a derived, contextualised local behaviour of that program. Prominently, the distinguishing characteristic of Aggregate Computing as a “macro-approach” [4] lies in the ability to formally represent the adaptive behaviour of an ensemble in a compositional and declarative way, namely, by combination of functional coordination operators and high-level building blocks expressing the outcome of a collective task.

One fundamental enabling abstraction for specifying the dynamics of situated collectives is that of a computational field (or simply, field) [5,6,7]: a distributed data structure that maps devices to computational objects across time. Accordingly, Aggregate Programming is about describing (dynamic) field computations, namely, how input fields (data coming from sensors) turn into output fields (actions feeding actuators)—computations that can be conveniently expressed using the functional paradigm.

A modern implementation of the aggregate programming paradigm is ScaFiFootnote 1 (Scala Fields)  [8]. It is a toolkit, tightly integrated with the Scala programming language, that comprises a Domain-Specific Language (DSL), a library, and platform tools for specifying and running (distributed) systems by leveraging computational fields. ScaFi provides a number of key advantages with respect to previous implementation attempts which were standalone DSLs (Protelis   [9] and Proto  [6]), such as: (i) familiar programming environment, by coherently supporting field constructs within the ecosystem as well as the syntactic and semantic model of a mainstream language like Scala; (ii) lightweight type safety, by leveraging Scala’s powerful type system and type inference; and (iii) seamless reuse of functionality, by providing unrestricted access to both Scala features (e.g., lightweight components, implicits) and existing libraries on the JVM.

Technically, such a smooth integration with Scala has been achieved thanks to a semantic variation of previous formalisation attempts, which were based on the field calculus [5]: the notion of “neighbouring value” (a map from neighbours to data values), used in field calculus to locally express outcome of message reception from neighbours, is replaced with that of “computation against a neighbour”, namely, by expressions whose evaluation depends on the same evaluation occurring on a neighbour. This change leads to a new computational model that we reify by a calculus called Featherweight ScaFi (FScaFi) and present in this paper.

The content is structured as follows. Section 2 provides motivation for ScaFi and covers related work. Section 3 describes syntax and informal semantics of FScaFi. Section 4 provides examples, showing how FScaFi can be used to develop collective adaptive behaviour. Section 5 ends up the paper with a wrap-up and discussion of future work.

2 Background

Scenarios like the IoT, CPS, smart cities, and the like, foster a vision of rich computational ecosystems providing services by leveraging strict cooperation of large collectives of smart devices, which mostly operate in a contextual way. Engineering complex behaviour in these settings calls for approaches providing some abstraction through the notion of ensemble, neglecting as much as possible the more traditional view of focussing on the single device and the messages it exchanges with peers.

2.1 Aggregate Computing

Aggregate computing is the main theme of this paper. A recent survey of its historical development and state-of-the-art is provided in  [3]. The essence of the approach is captured by the field calculus  [5], a core language grounding semantics and formal analysis of field computations  [10, 11].

Programming languages to work with computational fields have been introduced in the past, with Proto  [6] as common ancestor (Lisp-based), and Protelis   [9] as its Java-oriented, standalone DSL version. These approaches however have the drawback of not smoothly integrating field computations in the syntactic, semantic, and typing structures of a modern, conventional language—to fully remedy this problem, in this paper we shall present some key semantic changes to the field calculus.

To address this problem, a prominent, modern approach is to devise an “internal” or “embedded” DSL  [12] that provides mechanisms to support the new features on top of an adequate host programming language. Of course, with embedded DSLs, both the syntax and the semantics are limited by the constraints exerted by the host language. However, the model can sometimes be slightly adjusted in order to favour the embedding, considering the common syntactic, typing, and semantic features of the candidate set of host languages.

When conceiving this DSL, we took into account the following requirements and desiderata for the host language: pragmatism (supporting easy reuse of existing programming mechanisms); reliability (intercepting errors early—cf., type checking); expressivity (offering an eloquent syntax); and functional paradigm support (all the significant features of functional programming must be cleanly available). All the above considerations led us towards the Scala programming language as the host. Then, to well design the key constructs and provide a framework for rigorous analysis of programs and properties, we came up with FScaFi model of field computations. Its peculiarity is to handle standard values has been the local representative of a computational field, which provides a simplified setting for DSL embedding. To achieve this, we introduced a local notion of “computation against a neighbour”, namely, a computation whose outcome depends on the most recent, local view of the result of computation in that neighbour (unlike in the standard field calculus, this allows smooth application of host typing mechanisms to any field expression)—as detailed in Sect. 3.

Such a model, and related tooling, is implemented in the ScaFi aggregate computing DSL and platform  [8, 13]. ScaFi achieves the goal of providing an environment to streamline and support effective development of systems based on the Aggregate Computing paradigm, leveraging the solid basis provided by a mainstream programming language such as Scala and its ecosystem. In fact, Scala: runs on the JVM and thus enables straightforward interaction with the Java ecosystem; offers a powerful type system, with type inference, that helps to build type-safe libraries with minimal overhead; has a flexible syntax (convenient for creation of elegant APIs/DSLs). Moreover, Scala has great popularity in the distributed computing arena: it is the implementation language for several distributed computing toolkits, such as Apache KafkaFootnote 2 and the Akka actor frameworkFootnote 3. Hence, our choice of Scala also fosters the construction of a platform-level support on top of ScaFi, in the form of a middleware for running distributed and situated systems  [13, 14].

2.2 Related Work

Aggregate programming languages. Prior aggregate programming languages are standalone (also called external) DSLs and include Proto   [6], the Lisp-like progenitor, and its evolution Protelis   [9]. Protelis is based on an untyped, standalone DSL able to interoperate with existing Java code. This approach has some limitations: aligning the syntax and semantics, as well as providing training and documenting for a distinct language w.r.t. the one used to develop the execution platform can be burdensome; extra development and maintenance effort is needed to adequately support editing tools (e.g., plugins are required for common IDE features like syntax highlighting and refactoring); activities that span both the DSL and the target language (e.g., static analysis and debugging) may be hard to implement; and finally, the ability to smoothly reuse the features and libraries of the target language can be limited. Though language tools greatly improved recently (cf. the Xtext language workbench  [15] and its Xbase extension  [16], to name a popular one), practically, with an external DSL it may be difficult to come up with a cohesive design of the resulting software system (cf. Generation gap pattern  [17]), since parts written in the DSL need to bidirectionally refer and interact with other parts of the system [18].

Ensemble approaches. In Helena  [19] components can dynamically participate in multiple ensembles and adapt according to different roles whose behaviour is given by a process expression. DEECo  [20] is another CAS model where components can only communicate by dynamically binding together through ensembles; DEECo ensemble is formed according to a membership condition and consists of one coordinator and multiple members interacting by implicit knowledge exchange; DEECo has a Java implementation called jDEECo which enables the definition of components and ensembles through Java annotations. The GCM/ProActive  [21] framework supports the development of large-scale ensembles of adaptable autonomous devices through a hierarchical component model where components have a non-functional membrane and “collective interfaces”, and a programming model based on active objects. SCEL  [22] is a kernel language to specify the behaviour of autonomic components, the logic of ensemble formation, as well interaction through attribute-based communication (which enables implicit selection of a group of recipients). Attribute-based communication  [23] is an approach to CAS coordination that leverages implicit multicasts towards recipients matched by predicates over attributes. The approach has been formalised by the AbC calculus  [23] and implemented as an Erlang DSL in the so-called AErlang library  [24]. Generally speaking, it is worth noting that the field calculus fits useful device abstractions (such as neighbourhood, message exchange, attribute-based filtering) into a purely functional approach, which can then smoothly interoperate with more traditional programming frameworks and languages. More specifically, attribute-based communication can be achieved in the field calculus (and hence in FScaFi) both at the receiver and the sender side, via construct branch (see Sect. 4), by which one can define subcomputations carried on by a subset of nodes—those that execute the same branch and hence remain actually “observable” by operator nbr. In a more programmatically expressible way, a notion of ensemble can be captured as a field computation on a dynamic domain of devices, denoted by the concept of an aggregate process  [25].

Spatial computing and macro-programming. An extensive survey on spatial computing can be found in [4]. Indeed, multiple classes of approaches address (at least in part) the problem of organising a collective of computational entities. These include topological and geometrical languages like GPL  [26] (exploiting the botanical metaphor of “growing points”) and OSL  [27] (focussing on programming “computational surfaces” through folding operations); languages abstracting communication and networks, like TOTA  [7] and Linda-\(\sigma \tau \)  [28] (supporting diffusion and aggregation of tuples on a network of agents), Logical Neighbourhoods  [29] (supporting virtual connectivity), and SpatialViews  [30] (abstracting a network into spatial views that can be iterated on to visit nodes and request services); and macro-programming languages, like SpaceTime Oriented Programming (STOP)  [31] (providing abstractions to support collection and processing of past or future network data in arbitrary spatio-temporal resolutions) and Regiment  [32] (modelling network state and regions as spatially distributed, time-varying signals). Aggregate Computing belongs to the class of so-called general-purpose spatial computing languages, all addressing the problem of engineering distributed (or parallel) computing by providing mechanisms to manipulate data structures diffused in space and evolving in time. Other notable examples include the SDEF programming system inspired by systolic computing [33], and topological computing with MGS  [34]. They typically provide specific abstractions that significantly differ from that of computational fields: for instance, MGS defines computations over manifolds, the goal of which is to alter the manifold itself as a way to represent input-output transformation.

3 Featherweight ScaFi: A Core Calculus for ScaFi

In this section, we present Featherweight ScaFi (FScaFi), a minimal core calculus that models the aggregate computing aspects of ScaFi—much as FJ  [35] models the object-oriented aspects of Java.

In the aggregate computing model, devices undergo computation in rounds. When a round starts, the device gathers information about messages received from neighbours (only the last message from each neighbour is actually considered), performs an evaluation of the program, and finally emits a message to all neighbours with information about the outcome of computation. The scheduling policy of such rounds is abstracted in this formalisation, though it is typically considered fair and non-synchronous.

FScaFi is a core subset of ScaFi, strictly retaining its syntax (with the exception of typing annotations, which are not here presented). The syntax of FScaFi is given in Fig. 1. Following  [35], the overbar notation denotes metavariables over sequences and the empty sequence is denoted by \(\bullet \); e.g., for expressions, we let \(\overline{\mathtt {e}}\) range over sequences of expressions, written \(\mathtt {e}_1,\,\mathtt {e}_2,\,\ldots \,\mathtt {e}_n\) \((n\ge 0)\). FScaFi focusses on aggregate programming constructs. In particular:

  • it neglects the many orthogonal Scala features that one can use (object-oriented constructs, and the like), and

  • it is parametric in the built-in data constructors and functions.

Note that – apart from specific Scala syntax – the examples of ScaFi code given in Sect. 4 are actually examples of FScaFi code. In particular, in order to turn ScaFi functions (such as foldhoodPlus, gradient and branch—covered in Sect. 4) into FScaFi functions, it is enough to drop type annotations and default parameters.

Fig. 1.
figure 1

Syntax of FScaFi.

A program \(\mathtt {P}\) consists of a sequence \(\overline{\mathtt {F}}\) of function declarations and a main expression \(\mathtt {e}\). A function declaration \(\mathtt {F}\) defines a (possibly recursive) function; it consists of a name \(\mathtt {d}\), \(n\ge 0\) variable names \(\overline{\mathtt {x}}\) representing the formal parameters, and an expression \(\mathtt {e}\) representing the body of the function.

Expressions \(\mathtt {e}\) are the main entities of the calculus, modelling a whole field computation. An expression can be: a variable \(\mathtt {x}\), used as function formal parameter; a value \(\mathtt {v}\); an anonymous function \((\overline{\mathtt {x}}) \; {\mathop {\mathrm {\texttt {=>}}}\limits ^{\tau }}\mathrm { \texttt {@@\{} \mathtt {e} \texttt {\}}}\) (where \(\overline{\mathtt {x}}\) are the formal parameters, \(\mathtt {e}\) is the body, and \(\tau \) is a tag); a function call \(\mathtt {e}(\overline{\mathtt {e}})\); a \(\mathtt {rep}\)-expression \(\mathtt {rep}(\mathtt {e})\{\mathtt {e}\}\), modelling time evolution; an \(\mathtt {nbr}\)-expression \(\mathtt {nbr}\{\mathtt {e}\}\), modelling neighbourhood interaction; or a \(\mathtt {foldhood}\)-expression \(\mathtt {foldhood}(\mathtt {e})(\mathtt {e})\{\mathtt {e}\}\) which combines values obtained from neighbours.

Tags \(\tau \) of anonymous functions \((\overline{\mathtt {x}}) \! {\mathop {\mathrm {\texttt {=>}}}\limits ^{\tau }}\mathrm { \texttt {@@\{} \mathtt {e} \texttt {\}}}\) do not occur in source programs: when the evaluation starts each anonymous function expression \((\overline{\mathtt {x}}) \; \mathrm { \texttt {=> @@\{} \mathtt {e} \texttt {\}} }\) occurring in the program is given a distinguished tag \(\tau \)—for instance, two occurrences of the same anonymous function expression get different tags. In the following we will use the phrase name of a function to refer both to the tag of an anonymous function, or to the name of a built-in or declared function. As we will see below, names are used to define function equality.

The set of the free variables of an expression \(\mathtt {e}\), denoted by \(\mathbf{FV} (\mathtt {e})\), is defined as usual (the only binding construct is \((\overline{\mathtt {x}}) \; {\mathop {\mathrm {\texttt {=>}}}\limits ^{\tau }}\mathrm { \texttt {@@\{} \mathtt {e} \texttt {\}}}\)). An expression \(\mathtt {e}\) is closed if \(\mathbf{FV} (\mathtt {e})=\bullet \). The main expression of any program must be closed.

A value can be either a data value \(\mathtt {c}(\overline{\mathtt {v}})\) or a functional value \(\mathtt {f}\). A data value consists of a data constructor \(\mathtt {c}\) of some arity \(m\ge 0\) applied to a sequence of m data values \(\overline{\mathtt {v}}=\mathtt {v}_1,...,\mathtt {v}_m\). A data value \(\mathtt {c}(\mathtt {v}_1,...,\mathtt {v}_m)\) is written \(\mathtt {c}\) when \(m=0\). Examples of data values are: the Booleans \(\mathtt {True}\) and \(\mathtt {False}\), numbers, pairs (like \(\mathtt {Pair}(\mathtt {True},\mathtt {Pair}(5,7))\)) and lists (like \(\mathtt {Cons}(3,\mathtt {Cons}(4,\mathtt {Null}))\)).

Functional values \(\mathtt {f}\) comprise:

  • declared function names \(\mathtt {d}\);

  • closed anonymous function expressions \((\overline{\mathtt {x}}) {\mathop {\mathrm {\texttt {=>}}}\limits ^{\tau }}\mathrm { \texttt {@@\{} \mathtt {e} \texttt {\}}}\) (i.e., such that \(\mathbf{FV} (\mathtt {e}) \subseteq \{\overline{\mathtt {x}}\}\));

  • built-in functions \(\mathtt {b}\), which can in turn be:

    • pure operators \(\mathtt {o}\), such as functions for building and decomposing pairs (pair, fst, snd) and lists (cons, head, tail), the equality function (\(=\)), mathematical and logical functions (+, &&, ...), and so on;

    • sensors \(\mathtt {s}\), which depend on the current environmental conditions of the computing device \(\delta \), such as a temperature sensor—modelling construct sense in ScaFi;

    • relational sensors \(\mathtt {r}\), modelling construct nbrvar in ScaFi, which in addition depend also on a specific neighbour device \(\delta '\) (e.g., \(\texttt {nbrRange}\), which measures the distance with a neighbour device).

    In case \(\mathtt {e}\) is a binary built-in function \(\mathtt {b}\), we shall write \(\mathtt {e}_1 ~\mathtt {b}~ \mathtt {e}_2\) for the function call \(\mathtt {b}(\mathtt {e}_1,\mathtt {e}_2)\) whenever convenient for readability of the whole expression in which it is contained.

The key constructs of the calculus are:

  • Function call: \(\mathtt {e}(\mathtt {e}_1,\ldots ,\mathtt {e}_n)\) is the main construct of the language. The function call evaluates to the result of applying the function value \(\mathtt {f}\) produced by the evaluation of \(\mathtt {e}\) to the value of the parameters \(\mathtt {e}_1, \ldots , \mathtt {e}_n\) relatively to the aligned neighbours, that is, relatively to the neighbours that in their last execution round have evaluated \(\mathtt {e}\) to a function value with the same name of \(\mathtt {f}\). For instance, suppose to have defined a function \(\mathtt {def}\; plus(a,b)\; \mathrm { \texttt {= @@\{} a+b \texttt {\}} }\); then, function call plus(5, 2) yields a field that is 7 in every point of space and time (i.e., the expression evaluates to 7 in each round of every device).

  • Time evolution: \(\mathtt {rep}(\mathtt {e}_1)\{\mathtt {e}_2\}\) is a construct for dynamically changing fields through the “repeated” application of the functional expression \(\mathtt {e}_2\). At the first computation round (or, more precisely, when no previous state is available—e.g., initially or at re-entrance after state was cleared out due to branching), \(\mathtt {e}_2\) is applied to \(\mathtt {e}_1\), then at each other step it is applied to the value obtained at the previous step. For instance, \(\mathtt {rep}(0)\{(\mathtt {x})\; \mathrm { \texttt {=> @@\{} \mathtt {x}+ 1 \texttt {\}} }\}\) counts how many rounds each device has computed (from the beginning, or more generally, since that piece of state was missing). Another example is an expression \(\texttt {snd}(\mathtt {rep}(\mathtt {Pair}(\mathtt {x},\mathtt {False}))\{(\mathtt {x}_R)\; \mathrm { \texttt {=> @@\{} \mathtt {Pair}(\mathtt {x}, \mathtt {x}==\texttt {fst}(\mathtt {x}_R)) \texttt {\}} }\})\) that evaluates to \(\mathtt {True}\) when some value \(\mathtt {x}\) changes w.r.t. the previous round; it is common to use tuples when dealing with multiple pieces of state/result.

  • Neighbourhood interaction: \(\mathtt {foldhood}(\mathtt {e}_1)(\mathtt {e}_2)\{\mathtt {e}_3\}\) and \(\mathtt {nbr}\{\mathtt {e}\}\) model device-to-device interaction. The \(\mathtt {foldhood}\) construct evaluates expression \(\mathtt {e}_3\) against every aligned neighbourFootnote 4 (including the device itself), then aggregates the values collected through \(\mathtt {e}_2\) together with the initial value \(\mathtt {e}_1\). The \(\mathtt {nbr}\) construct tags expressions \(\mathtt {e}\) signalling that (when evaluated against a neighbour) the value of \(\mathtt {e}\) has to be gathered from neighbours (and not directly evaluated). Such behaviour is implemented via a conceptual broadcast of the values evaluated for \(\mathtt {e}\). Subexpressions of \(\mathtt {e}_3\) not containing \(\mathtt {nbr}\) are not gathered from neighbours instead. As an example, consider the expression

    $$\begin{aligned} \mathtt {foldhood}(2)(\texttt {+})\{ \texttt {min}(\mathtt {nbr}\{\texttt {temperature}()\}, \texttt {temperature}()) \} \end{aligned}$$

    evaluated in device \(\delta _1\) (in which \(\texttt {temperature}() = 10\)) with neighbours \(\delta _2\) and \(\delta _3\) (in which \(\texttt {temperature}()\) gave 15 and 5 in their last evaluation round, orderly). The result of the expression is then computed adding 2, \(\texttt {min}(10,10)\), \(\texttt {min}(15,10)\) and \(\texttt {min}(5,10)\) for a final value of 27.

Note that, according to the explanation given above, calling a declared or anonymous function acts as a branch, with each function in the range applied only on the subspace of devices holding a function with the same tag. In fact, a branching construct \(\mathtt {branch}(\mathtt {e}_1) \{ \mathtt {e}_2 \} \{ \mathtt {e}_3 \}\) (which computes \(\mathtt {e}_2\) or \(\mathtt {e}_3\) depending on the value of \(\mathtt {e}_1\)) can be defined through function application as \(\mathtt {mux}(\mathtt {e}_1, () ~\mathrm { \texttt {=> @@\{} \mathtt {e}_2 \texttt {\}} }, () ~\mathrm { \texttt {=> @@\{} \mathtt {e}_3 \texttt {\}} })()\), where \(\mathtt {mux}\) is a built-in function selecting among its second and third argument depending on the value of the first.

Notice that the semantics of this language is compositional and message exchanges are performed under the hood by \(\mathtt {nbr}\) constructs within a \(\mathtt {foldhood}\); with an automatic matching of each message from a neighbour to a specific \(\mathtt {nbr}\) construct, determined through a process called alignment  [36]. Basically, each \(\mathtt {nbr}\) construct produces an “export” (i.e., a data value to be sent to neighbours) tagged with the coordinates of the node in the evaluation tree (i.e., the structure arising from the dynamic unfolding of the main expression evaluation) up to that construct. All exports are gathered together into a message which is broadcast to neighbours, and which can be modelled as a value tree: an ordered tree of values obtained during evaluation of each sub-expression of the program. The alignment mechanism then ensures that each \(\mathtt {nbr}\) is matched with corresponding \(\mathtt {nbr}\) reached by neighbours with an identical path in the evaluation tree.

4 Showcasing FScaFi : Programming Examples

In this section, we provide examples of FScaFi programs, showing how to represent and manipulate fields to implement collective adaptive functionality, using the ScaFi syntax.

4.1 Scala Syntax

In ScaFi, the FScaFi constructs introduced in Sect. 3 are represented as object-oriented methods through the following Scala trait (interface):

figure a

This is mostly a straightforward Scala encoding of the syntax of Fig. 1. The main different is given by the presence of typing information and, in particular, the use of by-name parameters, of type => T, which provide syntactic sugar for 0-ary functions: these enable to capture expressions at the call site, pass them unevaluated to the method, and evaluate them lazily every time the parameter is used. This turns out very useful to implement the FScaFi semantics while providing a very lightweight syntax for the DSL. Moreover, note that method signatures do not include field-like type constructors: in fact, in FScaFi, fields are not reified explicitly but only exist at the semantic level.

4.2 Programming Examples

When thinking at field programs, one can adopt two useful viewpoints: the local viewpoint, typically useful when reasoning about low-level aspects of field computations, which considers a field expression as the computation carried on by a specific individual device; and the global viewpoint, typically more useful when focussing on higher-level composition of field computations, which regards a specification at the aggregate level, as a whole spatio-temporal computation evolving a field. So, an expression (e.g., 1+3 of type Int) can represent the outcome of execution of a computation locally (4), or globally as the program producing a field (a field of 4s). Note, however, that the global field is not accessed computationally: a local computation will only access a neighbouring field (which is actually a view, given by the messages received from neighbours, of the actual, asynchronously evolving field).

In the following, we incrementally describe the constructs introduced in Sect. 3 and the design of higher-level building blocks of collective adaptive behaviour through examples. In ScaFi, a usual literal such as, for instance, tuple

figure b

is to be seen as a constant (i.e., not changing over time) and uniform (i.e., not changing across space) field holding the corresponding local value at any point of the space-time domain. By analogy, an expression such as

figure c

denotes a global expression where a field of 1s and a field of 2s are summed together through the field operator +, which works like a point-wise application of its local counterpart. Indeed, literal + can also be thought of as representing a constant, uniform field of (binary) functions, and function application can be viewed as a global operation that applies a function field to its argument fields.

A constant field does not need to be uniform. For instance, given a static network of devices, then

figure d

denotes the field of device identifiers, which does not change across time but does vary in space. On the other hand, expression

figure e

is used to represent a field of temperatures (as obtained by collectively querying the local temperature sensors over space and time), which is in general non-constant and non-uniform.

Fields changing over time can also be programmatically defined by the rep operator; for instance, expression

figure f

counts how many rounds each device has executed: it is still a non-uniform field since the update phase and frequency of the devices may vary both between devices and across time for a given device.

Folding can be used to trigger the important concept of neighbour-dependent expression. As a simple initial example, expression

figure g

counts the number of neighbours at each device (possibly changing over time if the network topology is dynamic). Note that folding collects the result of the evaluation of 1 against all neighbours, which simply yields 1, so the effect is merely the addition of 1 for each existing neighbour.

The key way to define truly neighbour-dependent expressions is by the nbr construct, which enables to “look around” just one step beyond a given locality. Expression

figure h

evaluates to the field of average temperature that each device can perceive in its neighbourhood. The numerator sums temperatures sensed by neighbours (or, analogously, it sums the neighbour evaluation of the temperature sensor query expression), while the denominator counts neighbours as described above. As another example, the following expression denotes a Boolean field of warnings:

figure i

This is locally true if any neighbour perceives a temperature higher than some topical threshold. Notice that by moving the comparison into the nbr block,

figure j

then the decision about the threshold (i.e., the responsibility of determining when a temperature is dangerous) is transferred to the neighbours, and hence warnings get blindly extended by 1-hop. Of course, provided warningTh is uniform, the result would be the same in this case.

Functions can be defined to capture and give a name to common field computation idioms, patterns, and domain-specific operations. For instance, by assuming a mux function that implements a strictly-evaluated version of if:

figure k

A variation of foldhood, called foldhoodPlusFootnote 5, which does not take “self” (the current device) into account, can be implemented as follows:

figure l

Notice that the identity init is used when considering a neighbour device whose identifier (nbr{mid}) is the same as that of the current device (mid). As another example, one can give a label to particular sensor queries, such as:

figure m

The second case uses construct nbrvar, which is a neighbouring sensor query operator providing, for each device, a sensor value for each corresponding neighbour: e.g., for nbrRange, the output value is a floating-point number expressing the estimation of the distance from the currently executing device to that neighbour—so, it is usually adopted as a metric for “spatial algorithms”. Based on the above basic expressions, one can define a rather versatile and reusable building block of Aggregate Programming, called gradient  [37,38,39]. A gradient (see Figure 2) is a numerical field expressing the minimum distance (according to a certain metric) from any device to source devices; it is also interpretable as a surface whose “slope” is directed towards a given source. In ScaFi, it can be programmed as follows:

figure n

The rep construct allows one to keep track of the distances across rounds of computations: source devices are at a null distance from themselves, and the other devices take the minimum value among those of neighbours increased by the corresponding estimated distances as given by metric—defaulting to nbrRange. Notice that foldhoodPlus (i.e., a version of foldhood that does not consider the device itself) must be used to prevent devices from getting stuck to low values because of self-messages (as it would happen when a source node gets deactivated): with it, gradients dynamically adapt to changes in network topology or position/number of sources, i.e., it is self-stabilising [11].

Fig. 2.
figure 2

Snapshot of a gradient field from a simulation in ScaFi. The red nodes are the sources of the gradient. The nodes at the top-left have parted from the network and their values increase unboundly. The gray lines represent device connectivity according to a proximity-based neighbouring relationship.

Another key operation on fields is splitting computation into completely separate parts or sub-computations executed in isolated space-time regions. An example is computing a gradient in a space that includes obstacle nodes so that gradient slopes circumvent the obstacles. The solution to the problem needs to leverage aggregate functions, and their ability of acting as units of alignment. That is, we can use a different aggregate function for normal and obstacle nodes:

figure o

Calling such functions effectively restricts the domain to the set of devices executing them, thanks to the space-time branching enacted by construct @@ wrapping the bodies of the corresponding literal functions; by calling them exclusively in any device, the system gets partitioned into two sub-systems, each one executing a different sub-computation. For convenience, ScaFi provides as built-in function, called branch, defined as:

figure p

With it, a gradient overcoming an obstacle is properly written as

figure q

which is cleaner and hides some complexity while better communicating the intent: branching computation. Generally, notation @@ has to be used for bodies of literal functions that include aggregate behaviour, i.e., functions which (directly or indirectly) call methods of the Constructs trait—other uses have no effects on the result of computation. We remark that the above field calculus expression (gradient avoiding obstacles) effectively creates a distributed data structure that is rigorously self-adaptive [11]: independently of the shape and dynamics of obstacle area(s), source area(s), metric and network structure, it will continuously bring about formation of the correct gradient, until eventually stabilising to it. For instance, it could be used in a wireless sensor network scenario to let devices equipped with sensors transmit local perception on a hop-by-hop basis until all information reaches the gradient source. Generally, gradients can be used as building block for more complex behaviour, to which the self-adaption properties will be transferred, by simple functional composition.

An example of more complex behaviour is the self-healing channel [40], i.e., the field of Booleans that self-stabilises to a true value in the devices belonging to the minimal path connecting source with target devices. This functionality can be implemented as follows:

figure r

i.e., by applying the triangle inequality property (with some tolerance as captured by parameter width), and exploiting a block distanceBetween that calculates the distance between source and target (e.g., using a gradient) and broadcasts that value to the whole network (e.g., by gossiping or along another gradient).

5 Conclusion and Future Work

Aggregate Computing is a recent paradigm for “holistically” engineering CASs and smart situated ecosystems, which aims to exploit, both functionally and operationally, the increasing computational capabilities of our environments—as fostered by driver scenarios like IoT, CPS, and smart cities. It formally builds on computational fields and corresponding calculi to functionally compose macro behavioural specifications that capture, in a declarative way, the adaptive logic for turning local activity into global, resilient behaviour. In this paper, we have introduced FScaFi, a core calculus that captures the essential features of ScaFi, a recently developed Scala-internal aggregate programming DSL. In particular, it leverages a novel notion of “computation against a neighbour” which enabled seamless integration in the Scala language and type system.

In future work, we will formalise the FScaFi semantics (informally sketched in this paper), its properties, and relation with the field calculus—mainly aimed at proving analogous properties such as those in [10, 11, 41, 42]. Additionally, work on ScaFi is part of the research agenda for Aggregate Computing as comprehensively covered in  [3], which includes the study of dynamic field computations, or aggregate processes  [25] as well as the design and implementation of aggregate computing runtime platforms  [13, 14]. Together, they could lead to the emergence of a new platform for large-scale distributed systems deployment over phyisical environments (such as the IoT), whereby distributed computations can be dynamically injected, executed in a distributed way, and cooperate and compete with each other to realise an ecosystem of adaptive services—developing on the the vision of, e.g., [43].