1 Introduction

Recent years have seen increased attention being given to the design and implementation of pervasive systems. A pervasive environment is a technology-enriched living space that anticipates the needs and intentions of occupants and provides services accordingly to promote comfort, convenience, security, or entertainment, and therefore an improved quality of life. A service in such environment is realized by an assemblage of heterogeneous service components that collaboratively offer context-aware habitual supports to residents. These service components, also called “nodes” in the sequel, are usually interconnected by different wired or wireless protocols. Thus, a set of service management mechanisms is obviously required, which composes services by discovering and selecting nodes as well as makes the services work consistent and durable.

Fig. 1
figure 1

Message-oriented pervasive services in a Smart Home

It is important to observe that services in a pervasive system are reactive (event-driven) by nature since these services react to contexts by performing desired actions. One popular approach is to describe the message dissemination as sequences of events passing among nodes. The connected channels among publishers and subscribers are called “topics,” where all nodes in Message-Oriented Middleware (MOM) exchange messages via these channels. In this way, the MOM become a broker that makes these nodes loosely coupled as the nodes depend on topics instead of depending on one another. Taking a Smart Home system shown in Fig. 1 as an example. There are two services in the Smart Home: \(w_1\), \(w_2\), \(w_5\), and \(w_6\) collectively provide an automatic media provision service (denoted PS1); meanwhile, \(w_1\)\(w_4\) and \(w_7\) form a smart air- conditioning service (denoted PS2). In the automatic media provision service, \(w_1\) and \(w_2\) are communicating gateways of wireless temperature and light sensors. The node \(w_5\) is the software logic of the rule, indicating that when someone is sitting on the sofa (detected using the pressure sensors), and the light is turned on (detected using the light sensor), then turns on the TV by sending a message to node \(w_6\), where \(w_6\) is the control gateway to the TV device.

Meanwhile, robustness is one of an essential requirement of a pervasive system such as a Smart Home [9] because that the people setting up and maintaining the systems are consumers with little technical knowledge. Contrary to traditional enterprise service systems, a service in a pervasive system is more fragile as service components (nodes) can join, leave, or fail at any time. In our previous work, we proposed Pervasive Service Application Model (PerSAM) and Pervasive Service Management Protocol (PSMP) [22] that collaboratively provide both abstractions and implementations of a robust pervasive service management platform.

Although one can build robust home services with PerSAM/PSMP, the services are still not optimized because PerSAM/PSMP does not consider spatial information. As pointed by Dey [6], the “location” contexts are one of the most fundamental contexts. Residents may have different preferences when they are located in different places. Sometimes this lead to wrong application logics. To explain the problem of the traditional service composition schemes that do not take the spatial information into account, let us consider the Smart Home illustrated in Fig. 2. Assuming that there is media service that plays movies for the user \(U_1\) on a display. As there are five displays in the Smart Home, the service composition scheme selects display with the best quality, for example, the display with the largest size. In this case, Display 2 will be chosen. However, it is apparent that this decision is not going to satisfy the user \(U_1\) as \(U_1\) is located at the bedroom. Thus, Display 2, located at the living room, is not a good choice for being part of the media service used by \(U_1\). Consider another example, let there be an information service used by the user \(U_2\) located at the living room (see Fig. 2). Again, the information prompts the weather and bus information on a display. In this case, Display 2 is a reasonable choice as Display 2 is near \(U_2\) and it also has the best quality (the largest size). When Display 2 is not available for some reason (e.g., failed or occupied by another service), then without considering the spatial information of the displays, it is hard to determine the fallback display for the information service. In this situation, the system either fails the service or selects the largest one among the remaining displays (i.e., Display 5 will be selected). Obviously, both of these decision are not reasonable. On the other hand, if the system is able to take the structure of spaces into account when composing a service, then it becomes obviously to see that the living room is spatially closely related to kitchen. In this way, Display 3 will be selected as the fallback of Display 2, which is more satisfiable for the user \(U_2\).

Given this, this paper presents a spatial-aware service model for pervasive services residing in a pervasive environment so that the spatial information can be considered when composing services. The main contributions of this paper are summarized below.

  • Theoretical constructs To denote the spatial concepts and relations, this paper presents two theoretical abstractions, namely the Ambient and the Ambient Structure Tree (AST), that can formally reflect spatial relationships among physical spaces. Also, a set of message dissemination mechanisms that prorogate messages following the semantics of the proposed spatial abstractions is introduced.

  • Formal schemes Based on the theoretical constructs, this paper discusses how presence management, service discovery, and service composition can be performed in a spatial-aware way. The overall schemes are specified formally to enhance the reproducibility and to enable the verification of robustness.

  • Robustness and efficiency As mentioned, the overall approach is designed based on a robust service management scheme called PerSAM/PSMP. Therefore, it is desirable that the robustness property of PerSAM/PSMP be preserved. In this paper, the robustness property is verified. Also, spatial awareness improves efficiency. The reason is that instead of flooding the message over spaces, the proposed spatial-aware message dissemination mechanism only delivers the messages to the spaces that need them. This paper studies the improvements of efficiency by both theoretical analysis and experiments.

The rest of this paper is organized as follows: Section 2 discusses backgrounds and related works of pervasive service management systems as well as spatial awareness in such systems. Section 4 introduces spatial abstractions and their dissemination mechanisms. Then, we investigate how to enhance spatial-aware PerSAM/PSMP based on the Ambient abstractions in Sect. 5. In Sect. 6, we evaluate the enhanced model by proving its robustness, analyzing its efficiency, conduct a set of experiments, and discuss the costs and limitations of the proposed approach. Finally, in Sect. 7, conclusion and future works are provided.

2 Related work

Pervasive systems are difficult to design and maintain because they involve heterogeneous hardware, software, wiring protocols, and programming paradigms. Moreover, services in pervasive environments are highly dynamic. Therefore, many pervasive systems have been proposed to deal with the above problems. Based on the architectural styles, pervasive systems can be classified into two categories: process-centric and data-centric. In a process-centric system, the distributed components collaborate by invoking sequences of remote procedures. The flows of calls are controlled by software programs, which search services in a centralized service registry and invoke services in a synchronized way. Context Toolkit [7] and Gaia [29] belong to this category. More recently, with the raising of RESTful Web Services, many research works start to design pervasive systems by incorporating Web of Things (WoT) concepts. For instance, [32] consider the devices in the smart environment as Web resources, which can be accessed by any Web client. The others are to wrap up the sensor network based on WoT, enabling web clients to access data through RESTful API provided by the middleware [5, 10]. Both of the works mentioned above are also process-centric. It is worthy to note that these studies typically focus on ease the burden of development by applying the “separation of concerns” principle of Software Engineering. The outcomes are typically APIs for device control and data access. Few of them mentioned the service management issues.

Recently, loosely coupled and asynchronous data-centric styles such as Tuple-space (TS) [11] and Message-Oriented Middleware (MOM) become more popular. TS is essentially an associative virtual shared memory storing serialized objects. Distributed clients can read, write, or take serialized objects from TS server. The EventHeap [17] is the earliest well-known TS-based pervasive system that primarily focuses on interactive workspaces (smart office). TS and MOM have similar advantages, that is, easy to integrate heterogeneous hardware/software and failure isolation. However, they are two different architectures from the technology’s point of view: TS is a way to access shared information across multiple concurrent clients, whereas MOM focuses on flexible message delivery. Moreover, TS tends to store serialized objects, which is usually a penalty on performance and interoperability. Since that MOM does not enforce the wiring format of messaging content, the performance of MOM is better than TS. Because of the relief of performance and interoperability issues, MOM appears to be a good alternative that keeps the benefits of data-centric architecture and prevents performance and interoperability issues at the same time. MOM is very popular for constructing sensor-rich pervasive systems [25]. The baseline service model, PerSAM/PSMP [22], is also MOM-based. It is worthy to point out that there is a popular open standard for constructing MOM-based pervasive systems called Message Queue Telemetry Transport (MQTT) [1].

Spatial awareness of a pervasive system depends on indoor positioning techniques. Currently, most of the indoor positioning systems focus on detecting users’ locations; few of them focus on device localization. Therefore, detecting locations of smart devices is still a challenging task [33]. Most of current indoor location systems for devices require that either WiFi or the beacon profile of BLE be attached to every device. However, in addition to the increase in costs, continuously monitoring of device locations consumes too much energy. In a pervasive environment, there can be plenty of devices scattering over the various locations. If the power is provided by a battery, then it can be labor-intensive to change the battery for all devices. Otherwise, if the power is provided by power line, then it is less likely that the location of the device is changed, making location tracking less useful. As a result, many recent pervasive systems still require that the location of devices be configured and updated manually [14, 15].

Most of the works on pervasive systems lack formal foundations so that it is hard to reason about the behaviors and functionalities of these systems [26]. The specification and verification of a formal model refer to denote and to prove (or disprove) the qualitative properties of a system. Process algebra is a set of formal syntax to formally describe the behaviors or events of a system. As noted in [19], formal methods are critical for designing and implementing reliable pervasive systems and thus become an important tool for research in the Pervasive Computing discipline. PerSAM/PSMP is an attempt to address this issue, and it also serves as a baseline theoretical infrastructure for further investigation. Similar to PerSAM/PSMP, this work is mainly formalized using Communicating Sequential Processes (CSP) [13] notations, which is a member of the family of process algebra and is a widely used mathematical language for describing distributed and concurrent systems. The benefits of using the CSP are: (1) the schemes can be specified accurately, (2) it enhances the reproducibility of the schemes since CSP is more precise than pseudo-code, and (3) it is easier to validate the desired system attributes formally because of the preciseness of process algebra.

The use of formal verification in pervasive systems falls into two categories. The first category is to verify desired properties of the “applications” of a platform [8, 28, 30]. The other category focuses on verifying the desired properties of the “platform.” [22, 31] As it is hard to force every application developer of a platform to formally model and verify the application, we believe that it is more practical to use formal approach on verifying the system behaviors at the platform level. Thus, the approach proposed in this work belongs to the second category.

The Ambient is a useful formal abstraction for representing spatial relationship among spaces. An Ambient is a bounded place with computing capability. An Ambient is typically composite-capable, that is, an Ambient can be embedded within one another and has a complex structure. Ambient Calculus was proposed by Cardelli and Gordon [2] to describe the movement of process and devices. Jing et al. further discussed the operational semantics of Ambient Calculus [16]. Based on Ambient Calculus, Ranganathan et al. [28] proposed a systematic approach for verifying operations in the Active Spaces on the Gaia middleware [29]. Several research teams further develop additional methodology and visual tools for rapid-prototyping [4, 18] applications on top of Ambient Calculus. Our approach differs from previous ones in that we focus more on modeling the spatial correlation of places as well as the mechanisms for spatial-aware message dissemination.

3 Baseline service model

Before entering into detailed discussions of the proposed approach, it is desirable to briefly introduce the foundation of the proposed approach. Our approach is presented on top of a service model called message-oriented service model and management protocol for pervasive services, namely PerSAM [22]. In PerSAM, a messaging endpoint is called a “PerNode” or simply a “node.” Formally, A PerNode \(p \in P\) is an atomic stateful service component in a pervasive system, where P is the universe of PerNodes in the system, and \(state \in \{\texttt {INSTALLED, DORMANT, ACTIVE}\}\) is an attribute of p. We can divide PerNodes into two categories, namely the Manager Nodes and the Worker Nodes, according to their responsibilities. Manager Node is designed for administrative purposes. A Worker Node \(w \in W\) is a PerNode that encapsulates a unit of application logic, where W is the universe of Worker Nodes in the system. A Worker Node has an additional attribute \(\tau \in T\), where T is the universe of node types in the system which specify the functional category of the Worker Node. A Pervasive Community is a logical organization of nodes. There are two kinds of Pervasive Communities: (1) the Pervasive Service (PS) consists of one or more nodes that collectively provide a service to a user, and (2) the Pervasive Host (PH) refers to a group of nodes that colocate in the same computing device. Each community has a Manager Node that keeps track of its members: Each PS has a PSM and each PH also has a PHM. Considering an example depicted in Fig. 1, nodes A and B are members of the “automatic media provision service” (ps1) as well as the “smart air- conditioning service” (ps2) at the same time. Let us denote the PSM of a Pervasive Service ps as \(m^{ps}\) and the Worker Nodes that join ps as a set \(W^{ps} \in 2^W\), where \(2^W\) is the power set of W. Considering ps1 in Fig. 1, we have \(m^{ps1}=PSM1\) and \(W^{ps1}=\{A,B,E,F\}\). The Service Template \(ST^{ps} \in 2^{\tau }\) is pre-defined by service designers. Each \(ST^{ps}\) specifies required node types that comprise ps. To compose a Pervasive Service ps, \(m^{ps}\) first finds the best \(w \in W\) such that \(type(w)=\tau \) for each \(\tau \in ST^{ps}\). The definition of “best” depends on a user-defined selection function which takes the set of qualified candidate Worker Nodes (e.g., the set of nodes with the desired service type) as its input and returns the selected node based on a user-defined criteria. For performance, the default selection function is First Come, First Selected (FCFS), but it can be substituted by a more sophisticated one such as the one proposed in our previous work [21].

Before ps is successfully composed, some Worker Nodes of required types \(\tau \in ST^{ps}\) are called “Service Vacancy.” Let us denote the set of service vacancies of a Pervasive Service ps as \(V^{ps}\). Formally:

$$\begin{aligned} V^{ps}=\{\tau |\tau \in ST^{ps}, \lnot \exists {w}\in W^{ps} : type(w)=\tau \} \end{aligned}$$
(1)

In a similar way, a Pervasive Host (PH) is composed of a set of Worker Nodes, whose life cycles are managed by a PHM. The PHM and its affiliating Worker Nodes locate in the same device. Therefore, a PHM is able to detect the node states, to load, and to shut down Worker Nodes. Formally, a Pervasive Host ph is a tuple:\(ph = \left\langle {{m^{ph}},{W^{ph}}} \right\rangle \), where \(m^{ph}\) is the PHM of ph, and \(W^{ph}\) is the set of Worker Nodes that currently locate on ph.

Fig. 2
figure 2

Floor layout of the Smart Home 

Fig. 3
figure 3

Ambient distribution of Fig. 2 and its AST

4 Spatial abstractions

An Ambient is typically composite-capable (i.e., an Ambient can be embedded within one another) and has a complex structure. For instance, the Smart Home in Fig. 2 can be partitioned into several subspaces where one space can contain another space and be contained by another space at the same time. As shown in Fig. 3, the living room, denoted Ambient E, contains Ambients G, F, and H, and the living room is also contained by the Ambient A. It is important to note that both physical spaces (such as study room and living room) and computing devices (such as a computer or a mobile phone) are Ambients. Thus, a Pervasive Host is also a kind of Ambient.

4.1 Ambient containment and Ambient Structure Tree

If an Ambient x spatially contains an Ambient y, then their spatial relationship can be formally represented by \(x \sqsupset y\). In Fig. 3, \(Home \sqsupset A\), \(A \sqsupset D\), and \(A \sqsupset E\). Note that the \(\sqsupset \) relation is transitive. For example, if \(Home \sqsupset A\) and \(A \sqsupset E\), then \(Home \sqsupset E\). The Ambients of a space P can be denoted using a set \(A=\{a_1, a_2, \ldots a_n\}\), and \(C \subseteq A \times A\) represents the spatial containment relations (\(\sqsupset \)) between arbitrary \(a_i\) and \(a_j\). In this way, (AC) forms a rooted tree. Assuming that \(a^{parent}\) is the parent node of \(A^{child}=\{a^{child}_1, a^{child}_2, \ldots , a^{child}_n\}\) in the tree, then:

$$\begin{aligned} \forall a^{child}_i \in A^{child}, a^{parent} \sqsupset a^{child}_i. \end{aligned}$$
(2)

As a result, we can define a rooted tree \(\Upsilon ^{P}=(A,C)\) that conforms to (2) to be the Ambient Structure Tree (AST) of the space P.

Definition 1

(Ambient Structure Tree (AST)) If all nodes in a rooted Tree \(\Upsilon ^{P}=(A,C)\) satisfy \(\forall a^{child}_i \in A^{child}, a^{parent} \sqsupset a^{child}_i\), then \(\Upsilon ^{P}\) is an Ambient Structure Tree, where \(a^{parent}\) is the parent node of \(A^{child}=\{a^{child}_1, a^{child}_2, ...,a^{child}_n\}\)

We are now able to define some useful functions based on common operations of the Tree data structure. First, for an Ambient a located in an AST \(\Upsilon ^{P}\), the shortest distance between a and the root Ambient of \(\Upsilon ^{P}\) is denoted level(a) and the depth of AST is denoted \(depth(\Upsilon ^{P})\). Similarly, parentOf(a) can be used to acquire the closest parent node of a in the AST. With childOf(a), we can obtain the set of immediate child Ambients of a. Finally, \(childOf^*(a)\) and \(parentOf^*(a)\) are used to obtain the set of all parent or child Ambients of a (exclusively). For example, in Fig. 3, \(childOf(Home)=\{A, B, C, K, J\}\); \(childOf^*(Home)=\{A,B,C,K,J,D,E,L,M,N,I,F,G,H\}\). It is worthy to point out that according to the basic property of the Tree data structure,

$$\begin{aligned} a^x \in childOf^*(a^y) \Leftrightarrow a^y \in parentOf^*(a^x). \end{aligned}$$
(3)

As an illustrative example, consider the Ambients A and G in Fig. 3, we can find that \(G \in childOf^*(A)\) if and only if \(A \in parentOf^*(G)\).

4.2 Home Ambient

As mentioned, a Worker Node is typically a software module running on a Pervasive Host, where the Pervasive Host can be the control board of a sensor, a gateway to BLE-connected(Bluetooth Low Energy) [12] devices, or a controllable home appliance. Behaviors of Worker Nodes are usually highly dependent on surrounding environments of their Pervasive Hosts. For example, a temperature sensor device deployed in the bedroom (Ambient B in Fig. 3) does not reflect the accurate temperature of the doorway (Ambient J in Fig. 3). Also, when the system detects that the temperature in the bedroom is too high, it tends to turn on the fan in the bedroom instead of the fan in the living room. In this sense, we can define the notion of the Home Ambient of a Worker Node w to be the minimal surrounding Ambient of w’s Pervasive Host. For instance, in Fig. 2, the Home Ambient of the Worker Nodes running on the computer in the study room is the Ambient N (see Fig. 3). Based on AST, the system can determine a unique Home Ambient for each Worker Node. For a Worker Node w, which is deployed in a Pervasive Host ph, and ph is located at Ambient \(a^{ph^w}\), then the Home Ambient of w, denoted \(a^w\), is the closest parent node of \(a^{ph^w}\), formally, \(a^w=parentOf(a^{ph^w})\), where \(a^{ph^w}\) is a leaf node in AST.

The concept of a Home Ambient can also be applied to a Pervasive Service so that the “coverage of effects” of a Pervasive Service can be specified and reasoned precisely. Consider the smart air-conditioning service (PS2 in Fig. 1), assuming that the Home Ambient of the air conditioner (controlled by node G) locates in the living room (Ambient E in Fig. 3), it is intuitive that the temperature sensor (Node C in Fig. 1) should be placed as near Ambient E (living room) as possible. It does not make sense if the sensor is put in the bedroom and the air conditioner is placed in the living room since the “coverage of effects” of the air conditioner does not contain the place where the temperature sensor is deployed. Thus, we can conclude that since a Pervasive Service consists of a set of Worker Nodes collaboratively provide a function to fulfill inhabitant’s needs, each Pervasive Service has a “coverage of effects” and should be defined in the service template. In the air- conditioning service example, if the sensor and the air conditioner are both placed in the living room, then the “coverage of effects” of the service is the living room. In other words, the Home Ambient of the smart air-conditioning service is now living room (Ambient E).

Concretely speaking, the coverage of effects (Home Ambient) of a Pervasive Service refers to the scope of service composition and should be specified in the Service Template of a Pervasive Service. Then, the spatial-aware message dissemination mechanisms (see Sect. 4.3) can effectively limit the search scope of the PSM. Consider the previous example, since PS2’s Home Ambient is E, which also contains F, G, and H, when composing PS2, only the nodes located within E and its sub-Ambients can be found because of the underlying spatial-aware message dissemination mechanisms.

Based on the above discussion, a Home Ambient can be defined as follows:

Definition 2

(Home Ambient) The Home Ambient of a Worker Node w, denoted \(a^w\), is the minimal Ambient containing the Pervasive Host ph where w is deployed where \(a^w=parentOf(ph^w)\) in AST. In addition, the Home Ambient of a Pervasive Service ps, denoted \(a^{ps}\), is the minimal Home Ambient of all affiliating Worker Nodes that for all \(w \in W\). Formally, \(a^{ps} = \min \{a| \forall w\in W^{ps}, a \sqsupset a^w \} \text{. }\)

Intuitively, the Home Ambient of a Pervasive Service ps is conceptually the (spatial) union of Home Ambients of affiliating Worker Nodes.

4.3 Spatial-aware message dissemination

As shown in Fig. 1, the scope of a specific message channel (or called topic) is global. That is, a subscriber of a topic has to process all messages and does not care the location of the message source. However, in practice, spatial information is critical to application logic. For an air- conditioning service whose Home Ambient is the living room, the application logic of such service is only interested in the messages sent by the sensors located in the living room. In other words, the concept of Ambient has to be mapped appropriately to the message dissemination mechanism appropriately so that the message subscribers receive right messages according to the spatial relationships of the messaging source. Our approach is to map the hierarchical relationships of the AST to the message dissemination scope of the MOM. The overall spatial-aware message dissemination mechanism can be explained from two perspectives: subscribing to a topic (that is mapped to an Ambient) and publishing messages to a topic.

Subscribing to an Ambient topic In Fig. 3, if an application logic is interested in the messages emitted by the nodes in the living room and kitchen (Ambient A), then it subscribes the (SENSORA) topic so that the subscriber receives sensor messages coming from the living room and all of its subspaces (e.g., D, E, F, G, and H). From the AST’s viewpoint, if one subscribes a node in AST, then the subscriber receives all messages coming from the descendants of the node, as shown in Fig. 3. Given a space P and its AST \(\Upsilon ^{P}=(A,C)\), for all \(a \in A\), there exists a corresponding topic \(\hat{m}(a)\), and subscribers of \(\hat{m}(a)\) are able to receive the messages coming from \(\hat{m}(a_i)\), where \(a_i \in childOf^*(a)\). Formally,

$$\begin{aligned} \Omega ^{\hat{m}(a)}_p = \Omega ^{\hat{m}(a)} \cup \bigcup _{a_i \in childOf^*(a)} \Omega ^{\hat{m}(a_i)} \end{aligned}$$
(4)

where p indicates “propagated,” and \(\Omega ^{\hat{m}(a)}\) and \(\Omega ^{\hat{m}(a_i)}\) denote the set of messages that is published to \(\hat{m}(a)\) and \(\hat{m}(a_i)\), respectively. Note that the term \(\Omega ^{\hat{m}(a)}\) is required as \(childOf^*(a)\) is exclusive.

Publishing to an Ambient topic Again, in Fig. 3, if a Worker Node w is located at the living room (Ambient E), then \(a^w=E\) so that it publishes messages to \(\hat{m}(E)\). Similar to the semantics of subscribing to an Ambient topic mentioned above, the messages published to E are also propagated to F, G, and H, that is, \(childOf^*(a^w)\). If a message is published to Home in Fig. 3, then all Ambients in the AST receive the message. However, in the previous example, a node subscribing to A only receives messages coming from A, D, E, F, G, and H. As a result, (4) must be revised to enable the dissemination of propagating messages to subspaces, namely,

$$\begin{aligned} \Omega ^{\hat{m}(a)}_p = \Omega ^{\hat{m}(a)} \cup \bigcup _{a_i \in childOf^*(a)} \Omega ^{\hat{m}(a_i)} \cup \bigcup _{a_j \in parentOf^*(a)} \Omega ^{\hat{m}(a_j)}. \end{aligned}$$
(5)

To be concrete, in Fig. 3, a node subscribing to A receives messages coming from not only A, D, E, F, G, and H, but also Home. In this way, the message sent to Home is propagated to A. It is worthy to mention that no message is replicated when there is no node subscribes to an Ambient topic, even when the topic is in the propagation path.

As will be shown in Sect. 6.2, when compared to non-spatial-aware MOM, spatial-aware message dissemination helps to reduce traffic. For example, if a node subscribes to E, then it does not receive any message unless the messages come from Home, A, E, F, G, or H. On the contrary, in the traditional non-spatial-aware dissemination, all messages are broadcasted.

5 Spatial-aware service management

By incorporating the spatial abstractions into PerSAM and PSMP, we can derive an enhanced spatial-aware service management schemes in this section. As mentioned, the schemes are presented using CSP. Table 1 summarizes the CSP notations used in this paper.

Table 1 Summary of CSP notations

Based on the CSP notation, we are now able to formally define the behaviors of the message dissemination explained in Sect. 4.3, that is,

$$\begin{aligned} {\begin{matrix} \exists w^{sender} : \hat{m}(a^{w^{sender}})!msg \Rightarrow \forall w \text{ such } \text{ that } \\ a^{w} \in parentOf^*(a^{w^{sender}}): \hat{m}(a^{w})?msg, \end{matrix}} \end{aligned}$$
(6)

where m!msg means to send a message msg to channel m, and m?msg means to receive a message msg by subscribing to channel m.

5.1 Spatial-aware presence and life-cycle Management

Presence management is important in a dynamic service system such as a Smart Home. The primary objective of presence management is to make the system aware of the presence and absence of a node. In a home network, a node emits a “Presence Announcement (PA)” when it joins a home network and emits a “Leave Announcement (LA)” before it leaves the home network. To be spatial aware, we denote the Home Ambient of a PerNode p by \(a^p\). Then,

$$\begin{aligned}&PA[p] \triangleq \hat{m}(a^p)!present(p) \rightarrow \perp \end{aligned}$$
(7)
$$\begin{aligned}&LA[p] \triangleq \hat{m}(a^p)!leave(p) \rightarrow \perp \text{, } \end{aligned}$$
(8)

where present(p) means a PA message and leave(p) means a LA message. The scheme defined in (7) and (8) differs from PSMP in that the PA/LA messages in PSMP are broadcasted globally, whereas with the proposed scheme, the message propagation is limited with the Ambient \(a^p\).

The Life-cycle Management (LM) scheme is identical to PerSAM/PSMP. It changes the state of PerNode according to incoming calls to UPnP (Universal Plug and Play) Actions. If a node is changed to INSTALLED state, it is neither discoverable by peers nor functional. Thus, a node must perform a Leave Announcement (LA) before it is changed to INSTALL state, namely

$$\begin{aligned} {\begin{matrix} LM[p] \triangleq \hat{t}?newstate \rightarrow [ &{} \text{ if } (newstate=\texttt {SHUTDOWN}) \\ &{} \text{ then } LA[p];p.state:=newstate \\ &{} \text{ else } p.state:=newstate ] \rightarrow \\ LM[p]\text{. } \end{matrix}} \end{aligned}$$
(9)

5.2 Spatial-aware service composition

A Service Composition (\(SC_{PSM}\)) is initiated whenever the Pervasive Service is not alive. A Pervasive Service is not alive either when the service is still composing, or some affiliating Worker Nodes fail so that there are empty slots in the service vacancy \(V^{ps}\), see (1). For each slot of node type \(\tau \) in \(V^{ps}\), PSM issues a discovery message \(discover(psm,\tau )\) to the multicast channel (\(\hat{a^{ps}}\)) and then performs Service Selection \(SS_{PSM}\), see (14). Formally,

$$\begin{aligned} \begin{aligned} SC_{PSM}&[psm,ps] \triangleq \\&if(\lnot ServiceAlive())\\&then \coprod _{\tau \in V^{ps}} \hat{m}(a^{ps})!discover(psm,\tau ) \rightarrow SS_{PSM}[ps] \\&else\ SC_{PSM}[psm,ps]. \end{aligned} \end{aligned}$$
(10)

As mentioned, \(a^{ps}\) is the root of AST for a composing service. Otherwise, \(a^{ps}\) is the minimal Ambient that covers the Ambients of all \(w \in W^{ps}\).

After the Worker Nodes received the discovery messages, it compares if there is a match in its node type. A Worker Node responses via UDP unicast channel (\(\hat{u}^{psm}\)) to the PSM if there is a match. Formally,

$$\begin{aligned} \begin{aligned} SC_{W}[w]&\triangleq \hat{m}(a^{w})?discover(psm,\tau ) \rightarrow \\&if(type(w)=\tau ) \\&then \ [\hat{u}^{psm}!response(w) \rightarrow SC_{W}[w]]\\&else \ SC_{W}[w]. \end{aligned} \end{aligned}$$
(11)

Meanwhile, PHM also monitors discovery message belonging to its Ambient \(a^{ph}\), which is usually a leaf node in AST. The Worker Nodes deployed on ph are either in ACTIVE or in INSTALLED states. If a Worker Node is in ACTIVE, it responds to the discovery message following (11); otherwise, PHM loads the node, causing it enters the DORMANT state, see (12) and (13).

$$\begin{aligned} SC_{PHM}[ph]&\triangleq \hat{m}(a^{ph})?discover(psm,\tau )\nonumber \\ \rightarrow&\coprod _{w \in ph} LS[w,psm,\tau ,ph]\nonumber \\&if(type(w)=\tau ) \nonumber \\&then \ [\hat{u}^{psm}!response(w) \rightarrow SC_{W}[w]]\nonumber \\&else \ SC_{W}[w] \end{aligned}$$
(12)
$$\begin{aligned} LS[w,&psm,\tau ,ph] \triangleq \nonumber \\&if(type(w)=\tau \wedge w.state = \texttt {INSTALLED})\nonumber \\&then \ Load(w) \rightarrow \hat{u}^{psm}!response(w) \rightarrow SC_{PHM}[ph]\nonumber \\&else \ SC_{PHM[ph]}. \end{aligned}$$
(13)

Then, PSM examines responses from Worker Nodes. For each response, PSM adds the node to a list of candidates for a specific node type (\(W_{\tau }^*\)). After \(V^{ps}\) being empty, PSM selects the best ones for each node type according to a user-defined selecting function. After that, it executes PSM Service Activation (\(SA_{PSM}\)):

$$\begin{aligned} \begin{aligned} SS&_{PSM}[ps] \triangleq \hat{u}?response(w) \rightarrow \\&if(V^{ps}=\emptyset ) \\&then \coprod \nolimits _{nt \in S{T^{ps}}} {\left[ \begin{array}{ll} Add(Select (W_{\tau }^{candidate})) \rightarrow \\ SA_{PSM}^{}[Select(W_{\tau }^{candidate}),ps] \\ \end{array} \right] }\\&else {\left[ \begin{array}{ll} if(type(w) \in V^{ps}) \\ then \ V:= V-type(w) \rightarrow \\ \ \ \ \ \ \ W_{\tau }^{candidate}:= W_{\tau }^{candidate} + w \rightarrow SS_{PSM}[ps] \\ else \ SS_{PSM}[ps] \\ \end{array} \right] }. \end{aligned} \end{aligned}$$
(14)

Finally, a service is activated (15) if there is no more slot in the service vacancy. This part of scheme is identical to PerSAM/PSMP.

$$\begin{aligned} \begin{aligned} SA&_{PSM}[w,ps] \triangleq \\&\hat{t}?activate \rightarrow W_{\tau }^{candidate}:= \emptyset \rightarrow SC_{PSM}[ps]. \end{aligned} \end{aligned}$$
(15)

5.3 Spatial-aware service failure detection and recovery

In PerSAM, the system checks the health of a Worker Node by asking it to perform heartbeat (HB) periodically. This is realized by performing PA (7), as shown in (16).

$$\begin{aligned} HB_W[w] \triangleq sleep(w.hbp) \rightarrow PA[w];HB_{W}[w]. \end{aligned}$$
(16)

In this way, PSM is aware of the health status of affiliating Worker Nodes by subscribing to the channel of its Home Ambient \(\hat{m}(a^{ps})\). In this way, the PSM checks constantly if there is an affiliated node stops performing heartbeat. If PSM does not receive any heartbeat for a period, it sends a suspecting message indicating a possible node failure and to stop the suspected Worker Node, see PSM node suspecting (NSU, 17) and timeout eviction (EV, 18).

$$\begin{aligned} \begin{aligned} NSU_{PSM}[ps]&\triangleq \left[ \begin{array}{ll} \hat{m}(a^{ps})?present(w) \rightarrow Refresh(w) \\ \rightarrow NSU_{PSM}[ps] \\ \end{array} \right] \\&|| \ EV_{PSM}[ps] \end{aligned}\nonumber \\ \end{aligned}$$
(17)
$$\begin{aligned} \begin{aligned} \begin{array}{ll} {EV} _{PSM}^{}[ps] \triangleq {\text { }} \\ {\text { }}\coprod \limits _{w \in {W^{ps}}} \left[ \begin{array}{ll} {{if }}{Timeout} (w){\text { }} \\ {{then }}{Remove} {\text {(}}w{\text {)}} \rightarrow \hat{m}(a^w)!suspect(w) \\ \end{array} \right] {\text { }} \\ {\text { }} \rightarrow {EV} _{PSM}^{}[ps]. \end{array} \end{aligned} \end{aligned}$$
(18)

After a PHM receives (suspect(w)), it kills the process of the suspected node. Then, the PHM emits LA on behalf of the suspected nodes, see PHM shutdown suspects (SSU, 19).

$$\begin{aligned} \begin{array}{ll} {SSU} _{PHM}^{}[ph]{\text { }} \triangleq {\text { }}\hat{m}(a^w)?suspect(w) \\ {\text { }} \rightarrow {\text {if }}(w \in W^{ph}){\text { }} \\ {\text { then}}\left[ \begin{array}{ll} {Shutdown} (w) \\ \rightarrow {LA} [w];{SSU} _{PHM}^{}[ph] \\ \end{array} \right] {\text { }} \\ {\text { else }}{SSU} _{PHM}^{}[ph]. \\ \end{array} \end{aligned}$$
(19)

Finally, we can specify a global spatial-aware service management scheme that formally defines the behaviors of PSM, PHM, and Worker Nodes using the sequential (; ) and parallel operator (||).

$$\begin{aligned}&PSM[ps] \triangleq PA[ps];(SC_{PSM}[ps]||NSU_{PSM}[ps]) \end{aligned}$$
(20)
$$\begin{aligned}&PHM[ph] \triangleq PA[ph];(SC_{PHM}[ph]||SSU_{PHM}[ph]) \end{aligned}$$
(21)
$$\begin{aligned}&W[w] \triangleq PA[w];(SC_W[w]||HB[w]||LM[w]) \text{. } \end{aligned}$$
(22)

6 Evaluation

In this section, we present the results of evaluating robustness, efficiency, and performance of the proposed approach. The details are presented in the next few sections.

6.1 Robustness

As mentioned, as the proposed scheme is based on PerSAM/PSMP, it is desirable to preserve the robustness property. Hence, the purpose of this subsection is to verify the robustness of the enhanced service scheme. Here we use the same assumptions as PerSAM/PSMP. The assumptions are listed below; the rationales of these assumptions are detailed in [22]. The assumptions specify the un-handled issues that are out of the scope of this paper.

  • Eventually correct local failure detector (A1): A Worker Node stops performing heartbeat eventually after it fails.

  • Perfect-Link assumption (A2): All messages are guaranteed to be successfully delivered.

  • Persistent Manager Nodes assumption (A3): Manager Node does not fail.

  • Composable service assumption (A4): All services are composable. If this assumption does not hold, then technically it is impossible to recover the PS.

The term “robust service management” is used to refer to the scheme that enables a system to detect failures and to recover from failures autonomously. Having clarified the semantics of robustness, a “Robust Pervasive Service” can be now formally defined. A Pervasive Service is robust if and only if

$$\begin{aligned} \diamondsuit Fail(w^f) \Rightarrow \diamondsuit \lnot ServiceAlive() \wedge \diamondsuit ServiceAlive(), \end{aligned}$$
(23)

where \(w^f \in W^{ps}\) is a failed Worker Node, and \(Fail({w^f})\) represents the fact that \(w^f\) fails. The use the symbol \(\diamondsuit \) to denote “eventually” and the symbol \(\Box \) to denote “always” in the logic statements. These symbols are borrowed from Temporal Logic [20]. Note that \(\diamondsuit \lnot ServiceAlive()\) happens before \(\diamondsuit ServiceAlive()\); therefore, their conjunction is not necessarily false. The objective of verification is to ensure that (23) holds for the proposed scheme.

Before we proceed, let us define an auxiliary function \(\eta : E \rightarrow Boolean\) that maps an CSP event to a logical assertion, where E is a set of CSP events. For example, \(\diamondsuit \eta (e)\) represents the fact that an event e eventually happens.

Lemma 1

A failed Worker Node does not send any PA after the failure occurs eventually.

Proof

From (A2), all messages are guaranteed to be delivered to their sinks; a message does not appear on the network unless a node sends one, and hence, the following statements hold:

$$\begin{aligned} \eta (c!x) \Rightarrow&\diamondsuit \eta (c?x)\text {, and} \end{aligned}$$
(24)
$$\begin{aligned} \lnot \eta (c!x) \Rightarrow&\diamondsuit \lnot \eta (c?x). \end{aligned}$$
(25)

Assume there is a Worker Node \(w^f\) fails, according to (A1), \(w^f\) must stop working. From (7), no PA will be sent from \(w^f\) after this failure. So we have:

$$\begin{aligned} \eta (Fail(w^f)) \Rightarrow \diamondsuit \lnot \eta (\hat{m}(a^{w^f})!present(w^f))\text{. } \end{aligned}$$
(26)

\(\square \)

Lemma 2

After \(w^f\) fails, it will eventually be removed from its affiliating Pervasive Service ps and a suspecting message for \(w^f\) will be sent.

Proof

From (A2), (25), and (26), we have:

$$\begin{aligned} \diamondsuit \lnot \eta (\hat{m}(a^{w})?present(w^f)) \end{aligned}$$
(27)

Based on the definition of Home Ambient, (3), and AST:

$$\begin{aligned} a^{ps} \sqsupset a^{w^f} \Rightarrow a^{ps} \in parentOf^*(a^{w^f}). \end{aligned}$$

From the concept of spatial-aware message dissemination, namely (6) and (27), we have:

$$\begin{aligned} \lnot \eta (\hat{m}(a^{w^f})!present(w^f)) {\Rightarrow } \diamondsuit \lnot \eta (\hat{m}(a^{ps})?present(w^f)) \text{. } \end{aligned}$$
(28)

Then, from (17) and (28) we know that after the node fails \(\eta (Fail(w^f))\), then event \(Refresh(w^f)\) never happens, leading \(Timeout(w^f)\) returns true. From (18) we know that \(Remove(w^f)\) and \(\hat{m}(a^{w^f})!suspect(w^f)\) will happen. To summarize, from (28) we have:

$$\begin{aligned}&\diamondsuit \lnot \eta (Refresh(w^f)) \Rightarrow \diamondsuit Timeout(w^f) \Rightarrow \nonumber \\&\diamondsuit \eta (Remove(w^f)) \wedge \diamondsuit \eta (\hat{m}(a^{w^f})!suspect(w^f)). \end{aligned}$$
(29)

\(\square \)

Lemma 3

Eventually, the failure of \(w^f\) will be detected, causing the PS of \(w^f\) becomes unavailable.

Proof

\(Remove(w^f)\) leads to \(W^{ps}:=W^{ps}-w^f\) so that \(w^f \notin W^{ps}\). In addition, because \(type(w^f) \in ST^{ps}\), from the definition of Service Vacancy and (10), we have:

$$\begin{aligned}&\diamondsuit \eta (Remove(w^f)) \Rightarrow \diamondsuit type(w^f) \in V^{ps} \nonumber \\&\Rightarrow \diamondsuit V^{ps} \ne \phi \Rightarrow \diamondsuit \lnot ServiceAlive(). \end{aligned}$$
(30)

Next, from (26), (28), (29) and (30), we can obtain:

$$\begin{aligned} \eta (Fail(w^f)) \Rightarrow \diamondsuit \lnot ServiceAlive(). \end{aligned}$$
(31)

\(\square \)

As a result, the event of \(w^f\)’s failure is detected eventually and makes the Pervasive Services that utilize \(w^f\) become unavailable.

Lemma 4

Eventually, PSM finds alternative nodes by performing node discovery for the type of \(w^f\).

Proof

We can prove the statement by combining (10) and (30):

$$\begin{aligned}&\diamondsuit \lnot ServiceAlive() \wedge \diamondsuit type(w^f) \in V^{ps} \nonumber \\&\Rightarrow \diamondsuit \eta (\hat{m}(w^{ps})!discover(type(w^f))) \end{aligned}$$
(32)

\(\square \)

Lemma 5

Eventually, PSM finds alternative nodes by performing node discovery for the type of \(w^f\) and that node responds to the discovery message mentioned in lemma 4

Proof

As \(a^{ps} \sqsupset a^{w^f} \Rightarrow a^{ps} \in parentOf^*(a^{w^f}) \), from (6), (24) and (32), we know that \(\square \)

$$\begin{aligned}&\diamondsuit \eta (\hat{m}(w^{ps})!discover(type(w^{ps}))) \Rightarrow \nonumber \\&\diamondsuit \eta (\hat{m}(w^f)?discover(type(w^f)) \text{. } \end{aligned}$$
(33)

Additionally, from (A4):

$$\begin{aligned} \forall \tau \in V^{ps}, \exists w: type(w)=\tau \text{. } \end{aligned}$$
(34)

As \(type(w^f) \in V^{ps}\), based on (34), a qualified Worker Node \(w^r\) that is able to substitute the functions of \(w^f\) can be found. Formally, \(type(w^r)=type(w^f)\) and \(\exists type(w^r)=type(w^f)\). Based on (11), (12), (13) and (34), we have:

$$\begin{aligned}&\forall \tau \in V^{ps}, \nonumber \\&\exists w^r: type(w^r)=\tau \wedge \diamondsuit \eta (\hat{u}^{psm}!response(w^r)). \end{aligned}$$
(35)

Then, from (14), (24) and (35), we know that:

$$\begin{aligned}&\diamondsuit \eta (\hat{u}?response(w^r)) \Rightarrow \diamondsuit \eta (V^{ps}:=V^{ps}-type(w^r)) \nonumber \\&\Rightarrow \diamondsuit V^{ps}=\phi \Rightarrow \diamondsuit \eta (W^{ps}:=W^{ps} \cup w^r) \nonumber \\&\Rightarrow \diamondsuit w^r.state = ACTIVE. \end{aligned}$$
(36)

Finally, from (35), \(\diamondsuit ServiceAlive()\) is true, and by combining (31), (32), (33), (35) and (36), we have:

$$\begin{aligned} \eta (Fail(w^f)) \Rightarrow \diamondsuit \lnot ServiceAlive() \wedge \diamondsuit ServiceAlive() \end{aligned}$$
(37)

As a result, the robustness of the scheme is verified. \(\square \)

Fig. 4
figure 4

Original broadcast-style message dissemination

Fig. 5
figure 5

Spatial-aware message dissemination

6.2 Efficiency

In this section, we analyze the efficiency of the proposed approach by calculating the communication complexity [24], which is measured concerning the total non-null messages in the network caused by an action. Before turning to the detailed analysis, it is desirable to look at a concrete example. Figures 4 and 5 depict the propagation using the traditional MOM and the proposed spatial-aware mechanism, respectively. As shown in Fig. 4, there are 14 nodes in the system, and each of them subscribes to a topic T in the MOM. Assuming that the node \(w_{14}\) publishes x messages to T, then the x messages are replicated according to the number of subscribers, namely 14 times. As a result, each node (\(w_1,w_2..,w_{14}\)) receives x messages. Hence, \(14\cdot x\) messages are replicated. It is obvious that given x messages are published to T, and there are n nodes subscribing to T, then \(n\cdot x\) messages are replicated. Consequently, the communication complexity of sending x messages in such configuration is:

$$\begin{aligned} x + n\cdot x, \end{aligned}$$
(38)

where x is the number of messages sent and \(n\cdot x\) is the number of replicated messages. In a more general case, if there are n nodes and each node publishes x messages concurrently, then there are \(n\cdot x\) messages published to the topic and each message is replicated n times. As a result, \(n\cdot n\cdot x\) messages are replicated. The total message counts for n publishers thus become:

$$\begin{aligned} n\cdot x + n\cdot n\cdot x = (1 + n)\cdot (n\cdot x). \end{aligned}$$
(39)

As shown in (5) in Sect. 4.3, if a message is sent to an Ambient topic \(T_a\), then the messages propagates to all subscribers of \(T_a\). Besides, these messages also propagate to \(parentOf^*(T_a)\) and \(childOf^*(T_a)\) (and their subscribers). Taking Fig. 5 as an example, where there are 7 Ambient topics (\(T_{root}, T_1...T_6\)) in the system and 14 nodes(\(w_1, w_2, \ldots , w_{14}\)) are uniformly distributed to the Ambients. In this way, each topic has two subscribers. Given that the node \(w_14\) publishes x messages to \(T_1\), the messages are replicated and then propagate to subscribers of \(T_1\) (\(w_{13}\) and \(w_{14}\)). Besides, the messages also propagate to \(T_{root}\), \(T_3\), and \(T_4\). After that, the messages are replicated and propagate to the subscribers of \(T_{root}\), \(T_3\), and \(T_4\). In this case, the \(w_14\) publishes x messages to \(T_1\) and messages are replicated and sent to 8 nodes (\(w_1\), \(w_2\), \(w_9\)\(w_{14}\)). Thus, totally \(8\cdot x\) messages are replicated.

Figure 6 depicts the analytical model of spatial-aware message dissemination, which is helpful for determining the communication complexity. Assuming that the AST is a k-way full tree and that there are x messages published to the topic \(T_p\), there are t Ambient topics and n nodes in the system. The distance from \(T_p\) to \(T_{root}\) is d, and the distance from \(T_p\) to the leave nodes is h. Recall it has been pointed out in Sect. 4.3 that no message is sent when there is no node subscribes to an Ambient topic, even when the topic is in the propagation path. Thus, to make sure all topics have traffic, we uniformly distribute the node to each topic (\(\therefore \) each topic has \(\frac{n}{t}\) subscribers), and it is assumed that \(n>t\). As mentioned, after \(T_p\) receives the x messages, these messages propagate to the subscribers of \(T_p\), upward along with the path to the \(T_{root}\), and downward to all topics in the sub-tree of \(T_p\). For the upward traffic, it is easy to observe that there are d topics along the path to the root of the AST (\(T_p\) is excluded). Each of these topics then replicates x messages and then sends to \(\frac{n}{t}\) subscribers. Thus, the upward propagation traffic is \(x\cdot \frac{n}{t}\cdot d\). For the downward traffic, the number of topics in the sub-tree of \(T_p\) can be obtained by calculating the number of nodes in a full k-way tree with depth h: \(\sum _{i=0}^{h} k^i\) (\(T_p\) is included). The download propagation traffic is \( x\cdot \frac{n}{t}\cdot \sum _{i=0}^{h} k^i\). Consequently, the total communication complexity of spatial-aware message dissemination is:

$$\begin{aligned} x + x\cdot \frac{n}{t}\cdot \sum _{i=0}^{h} k^i + x\cdot \frac{n}{t}\cdot d. \end{aligned}$$
(40)

If there are n publishers publishing x messages to \(T_p\) concurrently, there are \(n\cdot x\) messages published. However, each message is only replicated \(\frac{n}{t}\) times as there are \(\frac{n}{t}\) subscribers per topic. Then, the equation of message counts for n publishers to the spatial-aware dissemination becomes:

$$\begin{aligned} n\cdot x + \frac{n}{t}\cdot \left[ x\cdot \frac{n}{t}\cdot \sum _{i=0}^{h} k^i + x\cdot \frac{n}{t}\cdot d \right] . \end{aligned}$$
(41)
Fig. 6
figure 6

Analytical model of spatial-aware message dissemination. Given that x messages are published to the ambient topic \(T_p\) located in a k-way full tree (Notations: n: number of nodes; t: number of ambient topics; d: the distance from \(T_p\) to \(T_{root}\); h: the distance from \(T_p\) to the leaves).

Judging from the above analysis, given the same number of nodes deployed in the system, spatial-aware message dissemination is more efficient than the original one because instead of flooding the messages to all nodes, spatial-aware message dissemination only propagates messages to spatial-related nodes, depending on the Ambient topics subscribed. The worse case is when all messages are published to the root Ambient topic, and all nodes subscribe to the root Ambient topic, in this case, the root Ambient topic is the only effective topic so that \(t=1\), \(d=0\) and \(\sum _{i=0}^{d} k^h=t=1\). In this way,

$$\begin{aligned} (41)= & {} n\cdot x + \frac{n}{t}\cdot \left[ x\cdot \frac{n}{t}\cdot t + x\cdot \frac{n}{t}\cdot 0 \right] \nonumber \\= & {} n\cdot x + n\cdot (x\cdot n) = (39), \end{aligned}$$

which indicates that the communication complexity is the same as the broadcast-style message dissemination.

6.3 Experiments

In the original PerSAM/PSMP, all nodes subscribe to the same set of topics, where the messages are flooded over all spaces. Intuitively, the performance can be improved if spatial-aware message dissemination mechanism is used as this dissemination mechanism only delivers the messages to the spaces that need them. To investigate such improvement in a real environment, we conducted experiments to compare the performance between spatial- and non-spatial-aware message dissemination in a switched local area network. In these experiments, the number of deployed nodes is the multiple of 15 as there are 15 Ambients in this environment. These nodes are then uniformly distributed among three PCs with Intel Core i5 CPU with 4G memory. All PCs are connected to the same switch. On each machine, a Wireshark packet sniffer [27] is also installed for capturing the packets for further analyzing tasks. Each node is instrumented so that it receives a multicast message from a centralized experiment controller (installed on the machine with MOM).

In the network, an additional PC with the same hardware serves as the MOM independently. In the experiments of both message dissemination mechanisms, the MOM is implemented using Mosquitto [23], an open-source implementation of MQTT. However, the configurations of topics are different: In the non-spatial-aware message dissemination approach, all messages are sent to a single the same topic; on the other hand, in the spatial-aware message dissemination approach, a hierarchical topic structure is configured according to the structure of AST following the topic structure syntax of MQTT. For example, a node, placed in the Ambient G, publishes the messages to a topic named “Home/A/E/G.” If a node wants to subscribe the messages emitted from Ambient A and all of its sub-Ambients, then the topic name of this subscription is “Home/A/#,” where “#” denotes a multilevel wildcard in MQTT.

Fig. 7
figure 7

Network traffic with and without spatial-aware message dissemination

The objective of the first part of experiments is to study the impact of nodes numbers to the overall network traffic under different dissemination schemes. The experiments in this part simulate the Smart Home illustrated in Fig. 2 and the AST in Fig. 3. In each round, tests are performed using the original single topic dissemination scheme and then re-performed using the proposed spatial-aware dissemination scheme. In the experiments using the spatial-aware dissemination scheme, each topic is associated with n/t (\(t=15\)) nodes, and these nodes publish and subscribe to the associated topics. When the experiment starts, each node subscribes to an assigned topic and then sends 300 messages to the assigned topic. Then, the traffic produced is measured and logged. Figure 7 shows the traffic produced under different node counts and dissemination schemes. The results indicate that the spatial-aware approach greatly reduces the network traffic. Also, the experimental results are consistent with analytical results. By examining the log of network packets, we found that the differences between the theoretical and experimental results come from the packets required for setting up TCP connections and the additional ACK messages used by Mosquitto.

The second part of experiments is conducted to study how the structure of AST impacts the overall network traffic. Figure 8 depicts the ASTs used in the experiments. Both deep (Fig. 8, right part) and flat ASTs (Fig. 8, left part) consist of four nodes (A, B, C, and D), and in each experiment, one node is associated with a topic. When the experiment starts, each node subscribes to the assigned topic, and 300 messages are sent to the topic D. Then, the traffic produced is measured and logged. The results are shown in Fig. 9. As expected, the flat AST outperforms deep AST as the structure prevents the messages published to D from propagating to B and C and thus around half of replication traffic is reduced. However, as the traffic of publishing messages is identical in both structures, the traffic reduction turns out to be around 1/3. However, it is easy to see that the traffic reduction increases when the size of the flat AST increases. Based on these observations, we can conclude that the effect of traffic reduction is more evident if the structure AST is more “flat.” This observation is helpful when partitioning the physical spaces and mapping them to Ambients topics in the AST when a designer is developing a system based on the proposed scheme.

Fig. 8
figure 8

Network traffic under different AST structures

Fig. 9
figure 9

Network traffic under different AST structures

6.4 Costs and limitations

In this section, we discuss the costs and limitations of the proposed approach as well as the issues that are out of the scope of this work.

6.4.1 AST construction and Home Ambient assignment

The key function of the AST is to represent the spatial relationships among locations (or called Ambients). Technically, AST is implemented as hierarchically structured topics in MOM. When setting up the system, the AST for the environment must be configured based on the floor plan. In practice, the Ambients/sub-Ambients can be divided arbitrarily according to users needs. For example, there can be more sub-Ambients in the living room as this place tends to contain more nodes and services. Also, to enable the spatial-aware service composition, for each service in the environment, a Home Ambient must be assigned for each service in their Service Template, which can be a burden for service developers or users.

6.4.2 Sophisticated service composition

The issues of service composition and matching are out of the scope of this paper. Deciding whether two services are equivalent is itself an active research area. For example, Semantic Web and Linked Data technologies can be used to compare whether the functions of two nodes are semantically equivalent [3]. In [21], we proposed a set of auto-unifiable preference expressions that facilitate automatic service composition for possibly conflicting preferences. Users can specify their (possible inter-conflicted) preferences using a set of Preference Expressions, and the conflictions in the expressions are resolved automatically. When a solution does not exist, the system can fall back to an alternative one based on the set of directives provided by the expressions. The outcomes of this work can also be used with the Preference Expressions proposed in [21]. As the system is aware of the spatial relationship among spaces, it turns out that the quality of matching process can be improved further.

6.4.3 Automatic location detection and update

Currently, AST does not adapt to spatial changes. The reason is that in a typical living space, the change of spatial relationship is not frequently. For example, the changes of the location of a bedroom and the location of the kitchen are less frequent. The cost of spatial change is that the hierarchically structured topics in the MOM must be re-configured. Although real-time automatic detection and update of node (device) locations are interesting issues, however, they are out of the scope of this paper. This paper assumes that once deployed, the home appliances and sensors that host nodes do not move frequently. If a host of a node is moved from one space to another, then the node should be implemented so that it is aware of its location and changes the subscribed topics to the one corresponds to the right Ambients automatically. Otherwise, the cloud to edge IoT deployment tools [15] can be used to automatically update and propagate the location information of nodes to ease the burden of users.

7 Conclusion

This paper presents a formal spatial-aware model for pervasive services residing in a pervasive environment. This model is designed based on PerSAM/PSMP so that it has to preserve the robustness property. We begin by designing several supporting abstractions, namely the Ambient and the Ambient Structure Tree, that can formally reflect spatial relationships among physical places. Then, we investigate how to enhance spatial-aware PerSAM/PSMP based on the abstractions mentioned above. Finally, we validate the enhanced model to prove its robustness, perform analytical results for predicting expected traffic, and conduct two sets of experiments. The results show that the proposed approach not only provides better and useful abstractions on designing a pervasive system but also improves the overall performance by preventing unnecessary messages to be propagated to the nodes located in the irrelevant Ambients. Much work can be done in the future; for example, the concept of Home Ambient can be extended to the discovery of nodes as the scope of the search. Also, the quality of recovery of a service can also be enhanced by AST; for example, it is desirable that the replacement of the failed node be as near the Home Ambient of the failed node as possible.