1 Introduction

Since healthcare costs are increasing rapidly and the world population is aging, there has been a need to monitor a patient’s health status efficiently while he or she is out of hospital for example in home environment. Wireless sensor network (WSN) has become a promising solution for home health monitoring see for example [1, 2] due to the low-power, low-cost, and pervasive nature of today’s sensor nodes. By outfitting the patient with wireless wearable sensors, vital signs can be automatically collected, processed and relayed to a home base station or coordinator in a real-time manner for remote access. Once the vital signs exceed some predefined thresholds, alert messages can be generated and delivered to the predetermined contacts for example doctors and nurses. Meanwhile, a doctor or nurse can query the patient’s health state anytime and anywhere with Internet. In such a health monitoring system, there are some immobile relay nodes to communicate with the wearable sensor node(s) carried by the patient and to connect to the base station in order to make the system energy-efficient and scalable [3, 4]. To maintain pervasive monitoring of the patient in the system, there are two fundamental requirements. First, there must be at least one bidirectional path from the sensor node to the base station. Secondly, to determine the location where the patient stays when emergency happens, the standard is the sensor node should be covered by at least three relays for using trilateration based or centroid positioning techniques. Localization accuracy can be further improved if more relays are in contact.

The above issues are in the scope of relay node placement in WSN, which have received much attention. For example, in [5], the problem has generally been classified into either single-tiered or two-tiered relay placement problems based on corresponding routing structures, and classified into either connected or survivable relay placement problems based on connectivity requirements. In two-tiered relay placement scheme, the sensor node transmits their sensed data (e.g., measurements) to relay nodes, but will not do packet forwarding for other co-existing sensor nodes. In single-tiered relay placement scheme, a sensor node may also forward packets. In connected relay node placement, the deployment of relay nodes must ensure connectivity between the sensor nodes and the base station. In survivable relay node placement, the deployment of relay nodes has to further ensure the bi-connectivity between sensor nodes and the base station. In the considered health monitoring system, the problem refers to the two-tiered relay node placement and connectivity problem. Indeed, this problem can be formulated from a couple of different perspectives for analytical studies. From one perspective, it can be modeled as finding a minimum connected k-dominating (k ≥ 3) set, where the relay set is interconnected and each sensor is adjacent to at least k relays. From another point of view, if the radio transceiver is treated as a special sensing device and the radio coverage as the sensing range, the problem is then equivalent to the problem of finding a minimum relay set maintaining connected k-radio-coverage (k ≥ 3). Previous work has made significant efforts on the theoretical analysis and problem solving in some relay placement schemes [6,7,8,9,10], in connected k-coverage issues [11,12,13,14], and also in connected k-dominating set cases [15,16,17,18]. However, most of them assumed an ideal environment and a perfect communication or sensing model [19, 20], e.g., using Boolean disk models. Indeed, under indoor environment, the radio propagation would be significantly affected by many obstacles (e.g., walls). Results based on over-simplified models cannot be directly applied to solve our problem and the consideration of realistic systems.

Xue et al. [21] were the first to study optimal relay placement in WSN in an indoor environment with consideration of the effect of walls on the radio propagation. In contrary to the literature, our work on health monitoring system has the following two new aspects. One is that existing work often considers more capable relays that have external wired connections and thus the relays do not need to maintain wireless connectivity among themselves but would be required to be adjacent to LAN ports, whereas in our system we consider that the relay nodes are low-power wireless (radio) devices similar to the wearable sensors and only the base station is connected to the external network. Therefore, in our system the relays can be more flexibly placed (i.e., without wireline connection) and are just required to form a connected wireless network. The second aspect is that we further expect the wearable sensor nodes carried by the patient should be covered by at least three wireless relays for the localization, which is a function for intelligent home health monitoring systems.

The major contributions of our paper are summarized below:

  • To the best of our knowledge, we are the first to study optimal relay placement for connected k-radio coverage (k ≥ 3) for the above location-aware home health monitoring system and model it as a minimum connected k-dominating set problem.

  • We explicitly consider the effect of indoor obstacles on the radio propagation in the system.

  • We prove that the problem is NP-hard and we propose an efficient greedy algorithm (ORPA) to optimize the relay locations.

  • Extensive simulations are conducted to evaluate the performance of the proposed algorithm and also the impact of the transmission power and the grid size.

  • Results have shown that our algorithm outperforms existing solutions and can substantially reduce the number of relays required.

The remainder of the paper is organized as follows. In Section 2, we discuss related work on relay placement and connected coverage issues. Section 3 describes the system model and formulates the problem. In Section 4, we prove that the problem is NP-hard and thus propose an efficient greedy algorithm to solve it and analyze its complexity. Extensive simulations are conducted in Section 5 to evaluate the performance of the proposed algorithm in comparison with other solutions. Finally, Section 6 concludes the paper.

2 Related work

In this section, we briefly review a handful of the work on relay node placement most related to our work.

Relay node placement problem concerning connectivity has been extensively investigated in the literature of WSN and home health monitoring systems, see for example [1]. It can be classified into single-tiered or two-tiered relay systems [3, 22]. In single-tiered relay placement, the sensor nodes can also serve as relay nodes and forward the packets, whereas in two-tiered scenarios, the sensor nodes only transmit their data to the relays or the base station but will not forward packets for other nodes. For single-tiered relay placement, it has been first modeled by [23] as a Steiner minimum tree with minimum number of Steiner points and bounded edge length problem and proved NP-hard. Different approximation solutions [8, 22, 24, 25] have been thus proposed. For two-tiered relay placement, it has been studied in [6] how to place a minimum number of relay nodes in the monitoring field such that every sensor node can reach at least two relay nodes and the relays can form a 2-connected network, assuming that relay’s communication range is larger than that of the sensor node. However, note that this assumption does not always hold in practice. The instant communication range of each node would depend on its transmission power and the environment where it is deployed. Some work has then considered more general cases of the communication ranges of the sensor node and relay node [7, 9] and the impacts. There is also some work considering higher connectivity requirement [7, 8, 26]. Apart from the above unconstrained relay node placement problems, in which relay nodes can be placed anywhere, recent work has started to investigate constrained relay node placement problems to capture the practical consideration such as interferences or forbidden regions preventing relay nodes from being placed [3,4,5].

It is obvious that the above-mentioned relay placement problems are closely related to connected dominating set and connected coverage issues. For example, the fault-tolerant relay node placement problem studied in [6] is equivalent to finding a minimum 2-connected 2-dominating set. A generalization is to construct an m-connected k-dominating set where the relays are m-connected and each sensor is dominated by at least k relays in the set. Some centralized or distributed algorithms have been proposed to solve this problem [15,16,17,18]. Notice that the connected dominating set and connected coverage issues are actually correlated. The radio transceiver can be treated as a special sensing device, so the radio coverage can be considered as the sensing range. If each point in the monitoring field is placed a virtual sensor, finding m-connected k-dominating set is equivalent to finding m-connected k-coverage set. In [11], triangle lattice pattern was proved optimal to achieve connected 1-coverage under the condition \(r_{c}/r_{s}\geq \sqrt {3}\), where rc and rs represent the communication range and sensing range, respectively. More general results were obtained in [12, 13] showing optimal deployment patterns to achieve full coverage and m-connectivity for m ≤ 6 under different ratios of sensor communication range over sensing range. There are also studies on how to select from a minimum number of nodes to form a connected communication graph and also provide k-coverage. Various centralized or distributed approximation algorithms for solving this problem can also be found in [27,28,29].

Most of the above work assumes an ideal environment or a perfect communication or sensing model, e.g., Boolean disk model. Actually, low-power wireless links have complex and often probabilistic properties [30, 31]. The disk model facilitates a geometric treatment to deal with the problem but often fails to capture the stochastic and anisotropic nature of wireless links. Therefore, most theoretical results in this scope cannot be directly applied to practical use. To capture the randomness of a wireless communication link, a classical log-normal shadowing model is commonly adopted [30, 32]. Nevertheless, such a model is still overly simplistic for indoor environments with various obstacles, see for example [31]. It should be noted that compared to the previous work, an important difference of our work is that we study the relay placement problem in a realistic WSN-based home health monitoring scenario in which the complex indoor environment and the mobility of the target render our problem much more challenging.

3 System model and problem formulation

In this section, we first introduce the system architecture, the radio model, and also the assumptions and notations used in this work, and then formulate the problem with analytical investigations.

3.1 System architecture

Figure 1 shows the architecture of a home health monitoring system. A wearable sensor node is attached to the patient. It samples the vital signs (e.g., the body temperature, blood pressure, body movement, etc.) of the patient and compares the sampled data to some predefined thresholds. Once any predefined threshold is exceeded, an alert message will be delivered to the home based station directly or by the relays (e.g., attached to the ceiling) through multi-hops. The base station can be a Zigbee/WiFi/4G gateway, which forwards the message to a remote contact (e.g., a designated smart phone and a server device) via the Internet. Meanwhile, designated smart phone or PC users can query the patient’s state and location at any time. To support state query, data collection, and target localization, the health monitoring system has the following two basic requirements: (1) there always exists at least one bidirectional path between the sensor and the base station, and (2) the sensor node can communicate with at least three relays wherever the patient moves in the house. To avoid asymmetric communication links, we have the sensor node and the relays operate on the same frequency and transmission power. It is worth noting that the solution in this paper can also be applied to the scenario of multiple sensor nodes and multiple base stations. For simplicity, only one wearable sensor and one gateway are illustrated in Fig. 1.

Fig. 1
figure 1

The architecture of the home health monitoring system

3.2 Radio propagation model

Before elaborating the relay placement problem, we have to determine the radio coverage of a relay node in the indoor environment. A generic way of assessing the radio range of a transmitter is as follows. First, assume a radio propagation model with some unknown parameters; then collect a set of measurements from the environment as training data; finally, fit these data to the model in order to estimate the unknown parameters. As a result, the received signal strength (RSS) can be predicted, and based on the RSS predictions and an RSS threshold, each relay’s radio range can be determined with respect to the actual indoor environment (e.g., walls and obstacles). For indoor environment, it has been shown in [31] that obstacles such as walls can significantly attenuate wireless links so that a radio propagation model that incorporates the attenuation effect of the obstacles is necessary and in the model each obstacle oiO in the environment is assumed to attenuate the signal by a constant factor \(\gamma _{o_{i}}\). Let d(s, r) be the distance between the sender s and the receiver r, and Is, r be the set of obstacles intersecting the (virtual) line between s and r, the RSS at the receiver r from the sender s is thus given by:

$$ P(s,r) = P_{t} - \alpha - 10 \beta \log_{10}d(s,r) - \underset{o_{i}\in I_{s,r}}{\sum} \gamma_{o_{i}}, $$
(1)

where Pt is the transmission power, α represents the path loss at a reference distance of one meter, and β represents the path loss exponent, here we consider that Pt, α and \(\gamma _{o_{i}}\) are in decibel values.

However, due to the complexity of the indoor environment, there could be a large number of obstacles. Actually, we have in total 17 obstacles, including walls and pillars, as shown in Fig. 2 of our experimental testbed. The radio propagation model could be very complicated with the number of unknown parameters (to be determined) and in the order of the number of obstacles. As a result, a large amount of training data may be needed to estimate the parameters. It has been reported in [31] that by automatically classifying obstacles into groups of similar attenuation, the number of unknown parameters can be thus reduced and the model can still maintain the same accuracy. Using this method, the obstacles will be classified into several groups so that the number of involved unknown parameters would be reduced and we can obtain the radio propagation model with a smaller amount of training data. Given a set of groups G, a mapping function π : OG, and an attenuation coefficient Γg for each group gG, the RSS at the receiver r from the sender s is expressible as:

$$ P(s,r)=P_{t}-\alpha -10\beta\log_{10}d(s,r)-\underset{o_{i}\in I_{s,r}}{\sum} {\Gamma}_{{\Pi}(o_{i})}. $$
(2)
Fig. 2
figure 2

Testing floor plan, in which the red border lines are walls, the red squares are pillars, the gray rectangles are tables, and the blue dots are randomly deployed relays on the ceilings

The unknown parameters in model (2) are α, β and \({\Gamma }_{{\Pi }(o_{i})}\), where \({\Pi }(o_{i})= 1, 2, \ldots , |G|\), and |G| is the number of classified groups. Clearly, the number of unknown parameters is equal to |G| + 2. Before determining the unknown parameters, the mapping function from obstacles to groups, denoted by π, should be constructed. One method is to manually classify the obstacles based on their construction materials. For example, the testing floor shown in Fig. 2 includes in total 17 obstacles which can be manually classified into 4 groups: drywall, wooden wall, glass wall, and cement pillar. Linear regression may then be used to fit the remaining unknown parameters. However, it is clear that manual classification is labor-intensive and also may be inaccurate. We therefore propose an automatic obstacle classification algorithm to classify the obstacles into groups that would incur minimum estimation error. The pseudo-code of the algorithm is shown in Algorithm 1. The inputs to Algorithm 1 are: the set of RSS measurements denoted by vector Pr, the distances between senders and receivers denoted by vector d, and the set of obstacles and classified groups denoted by O and G. Section 5.2 provides more details on how to collect the RSS measurements. The outputs are: the obstacle classification function π, the estimates of α and β, and the attenuation coefficient Γg for each obstacle group g, where g = 1, 2,…,|G|. Initially, we set SSEmin to , where SSE refers to the sum of squared errors (SSE, see line 16 of Algorithm 1), and each obstacle is classified into a random group (see lines 3–5 of Algorithm 1). After the initialization, each obstacle o in random order is re-assigned to each group in G and linear regression is used to estimate the unknown parameters (see lines 8–12). The SSE is computed each time when obstacle o is grouped into another g (see lines 13–16). If the resulting new classification has a smaller SSE than the optimal (smallest) one, denoted by SSEmin, a group update will be made and the optimal SSEmin will be replaced (see lines 18–23) and then it will go back to the execution of lines 6–8 (of Algorithm 1), iteratively checking possible updates from the start. Otherwise, the next obstacle will be considered. This process terminates when no obstacle can be classified into a new group to obtain a smaller SSEmin. Due to the random classification of the obstacle in the initial step, the result generated by lines 2–25 (of Algorithm 1) may be different each time. Therefore, the algorithm repeats the body part (lines 2–27) for K times and returns the classification π with the smallest SSE (see lines 26–29). The values of the parameters α, β, and Γg’s are then computed along with the mapping function π of obstacles to classified groups.

figure e

It is worth noting that this algorithm is much less computationally expensive than an exhaustive search and is guaranteed to converge since it reduces the SSE at each iteration until it terminates. However, it cannot guarantee strict optimality and may get stuck in a local minimum. That is why we repeat the body part (lines 2–27, Algorithm 1) for K times. After the parameter learning, we obtain the radio propagation model and can predict the radio coverage range of the relay nodes placed at any point of the deployment area.

3.3 Definitions and preliminaries

In our system, the relays attached to the ceilings of the rooms are immobile and the sensor nodes carried by people are mobile. We therefore model the network as a two-tiered architecture, say the relay tier πr and the sensor tier πs, as shown in Fig. 3. A sensor node in the sensor tier has to communicate with at least three relays to localize itself. For the gateway, we consider a more general case that it can be placed anywhere in the sensor tier. Therefore, in the algorithm design part, we do not specify the location of the gateway and thus do not consider the situation that the sensor directly communicates with the gateway. To make the placement problem easier to deal with, we discretize both the relay tier and the sensor tier into small grids with grid side lengths equal to gr and gs, respectively. The centers of the grids in the relay tier are considered as the set of candidate relay locations and those in the sensor tier are assumed as the points on the motion trail. Suppose the area size is X × Y, then the relay tier and the sensor tier consist of nx × ny and \({n}_{x}^{\prime } \times {n}_{y}^{\prime }\) grid points, respectively, where nx = ⌈X/gr⌉, ny = ⌈Y/gr⌉, \({n}_{x}^{\prime } = \lceil X/g_{s}\rceil \), \({n}_{y}^{\prime } = \lceil Y/g_{s}\rceil \) and ⌈ ⌉ is the ceil function. Note that we will discuss the impact of the grid size on the system performance later in our experimental work. One may also refer to [33] for similar grid-based deployment method.

Fig. 3
figure 3

Illustration of Definitions 1 and 2

Let R denote the set of relays, Rr denote the set of grid points in πr, and Rs denote the set of grid points in πs. We define the relay communication graph (RCG) and connected k-dominating set in the following:

Definition 1 (Relay Communication Graph)

The relay communication graph RCG is an undirected graph RCG(V, E) with vertex set V = R and edge set E, defined as follows: for any two relays ri, rjR, there is an undirected edge (ri, rj) if the RSS at the receiver exceeds a threshold, i.e., the receiver sensitivity.

Remark 1

The typical value of the receiver sensitivity is − 85 dBm for IEEE 802.15.4 devices at 2.4 GHz [34]. Since all the nodes use the same transmission power, we assume the links are all symmetric.

Definition 2 (Connected k-dominating set)

Given Rr and Rs, a connected k-dominating set is a subset RRr that satisfies: (i) each point in Rs is dominated by k (k ≥ 3) relays in R, and (ii) the RCG induced by R is connected (there is at least one path between each pair of relays in R).

Remark 2

From another point of view, if the radio transceiver is treated as a special sensing device and the radio coverage as the sensing range, a connected k-dominating set is equivalent to a relay set maintaining connected k-radio-coverage.

3.4 Problem formulation and analysis

Relay placement problem can be formulated as an optimization problem with the following different objectives:

Problem 1

Given the set of N relays and the transmission power Pt, how to place the relays so that the connectivity requirement C is maximized.Footnote 1

Problem 2

Given the transmission power Pt and the connectivity requirement C, how to place the relays so that the number of relays N is minimized.

Problem 3

Given the set of N relays and the connectivity requirement C, how to place the relays so that the transmission power Pt is minimized.

Theorem 1

The above three optimization problems are equivalent, i.e., if there exists a solution algorithm to one problem, the other two problems can be solved by invoking the solution algorithm in polynomial time.

Proof

Suppose we have a solution algorithm Max_C(N, Pt) to Problem 1. We can construct a solution algorithm Min_N(Pt, C) to Problem 2 as follows: For a given Pt, we step up the value of N and invoke Max_C(N, Pt) and compare the return connectivity C against the required value C. The search process terminates when |CC| < 𝜖, where 𝜖 is a tolerable discrepancy between the actual connectivity and the requirement, and then N is the optimal value. By a similar line of augment, we can invoke Min_N(Pt, C) to construct a solution algorithm \(\text {Min}\_P_{t}(N,C)\) to Problem 3, and use \(\text {Min}\_P_{t}(N,C)\) to construct a solution algorithm Max_C(N, Pt) to Problem 1. Since the relations between N, Pt and C are monotonic, the polynomial equivalence can be established. □

Remark 3

In this paper we will focus on solving Problem 2 under a realistic setting. Since more relay nodes incur more cost, the objective of the problem is to find an optimal relay placement strategy that minimizes the total number of relay nodes, while the connectivity requirement is to ensure a connected RCG and all reachable grid points in tier πs to be k-dominated by the relays, where k ≥ 3.

We formally formulate the problem and define as follows.

Definition 3 (Optimal Relay Placement Problem)

Given the set of grid points Rs in sensor tier πs and the set of grid points Rr in relay tier πr, the optimal relay placement for a home health monitoring system is to determine the relay location points RRr such that R forms a minimum connected k-dominating set.

Theorem 2

The Optimal Relay Placement Problem defined by Definition 3 is NP-hard.

Proof

A minimum connected k-dominating set of graph RCG is a connected k-dominating set with the smallest possible cardinality among all connected k-dominating sets of RCG. The problem of constructing a minimum connected dominating set has been proved NP-hard [35]. Our problem is equivalent to this problem if k = 1. Hence, our problem is a superset of the minimum connected dominating set problem and it is also NP-hard. □

Since the Optimal Relay Placement Problem is NP-hard, an efficient heuristic solution approach needs to be developed.

4 Algorithm design

In this section, we propose an efficient greedy algorithm ORPA (Optimal Relay Placement Algorithm) to solve the optimal relay placement problem, targeting at using minimum number of relays to meet the connected k-dominating requirements. We first give an overview of our approach, then elaborate on the detailed design and finally analyze the complexity of the algorithm.

4.1 Overview

The basic idea of our algorithm is threefold. First, the candidate grid point in πr is chosen only within the radio ranges of already-chosen points to ensure that the relays are all connected. Second, the candidate grid point in πr that can maximize the number of undominated grid points in πs is chosen to place the relays. Third, the unchosen grid points in πr are iteratively checked until all the grid points in πs are k-dominated. It is worth noting that the radio range of each placed relay can be estimated using aforementioned indoor radio model. The details of our algorithm ORPA are illustrated in Algorithm 2.

4.2 Optimal relay placement algorithm

The input variables to the algorithm are the transmission power Pt, the dominating requirement k, the set of grid points in πs denoted by Rs, the set of grid points in πr denoted by Rr, and the channel parameters which are learned through Algorithm 1. For a relay node at grid point r, where rRr, the set of grid points in Rr that lie in the radio range of r is denoted by Cr(r). Similarly, the set of grid points in Rs that lie in the radio range of r is denoted by Cs(r), as shown in Fig. 4. Let |Cs(r)| denote the cardinality of set Cs(r). For a sensor node at grid point s, where sRs, the dominated degree D(s) is defined as the number of already-placed relays it can communicate with. In other words, if s is k-dominated, D(s) = k.

Fig. 4
figure 4

Illustration of Cr(r) and Cs(r)

Initially, the grid point rRr that can maximize |Cs(r)| is selected and a relay node is supposed to be placed, corresponding to lines 6–9 in Algorithm 2. As a result, the grid points in Cs(r) are all 1-dominated, corresponding to lines 10–12. Then the algorithm checks the grid points in Rr within the radio range of the placed relay at point r, and select the grid point (a new r) that can maximize the number of undominated points in Rs to place the relay, corresponding to lines 16–20. The dominated degree of each grid point in Cs(r) is therefore updated, corresponding to lines 21–23. The inner while loop runs until all the points in Rs are at least 1-dominated. The outer while loop terminates until all the grid points in Rs is assured to be at least k-dominated. Since we always check the grid points within the radio range of already-placed relays, the resultant RCG is guaranteed to be connected.

figure f

4.3 Complexity analysis

Here we analyze the complexity of our algorithm, which is measured by the number of operations such as comparisons, calculations, etc. Let |Rs| and |Rr| represent the cardinalities of sets Rs and Rr. First, the outer while loop corresponding to lines 15–30 (see Algorithm 2) runs k times. The inner while loop runs at most |Rs| times. Inside the inner while loop, the complexity of calculating and sorting is O(|Rr|log |Rr|) if using efficient sorting algorithms. In summary, the total complexity of our algorithm is O(k|Rs||Rr|log |Rr|).

5 Performance evaluation

5.1 Baseline algorithms

In this section, we conduct extensive experiments to evaluate the efficiency of our approach. In addition to our proposed algorithm ORPA, we also implement two baseline algorithms for comparison: random placement and two-stage optimization approach, see below:

  • Random placement: This approach first randomly selects a point from the candidate locations as initial placement, then iteratively selects the points randomly within the communication range of the already-placed relays until the set of selected relays satisfy the k-dominating requirement. Clearly, the induced RCG is guaranteed to be connected.

  • Two-stage optimization: This approach first ignores the requirement of a connected RCG and adopts similar greedy algorithm proposed in [21], which selects the grid points that can maximize the number of uncovered points to place the relays, so that the k-dominating requirement can be satisfied. After the first stage optimization, it may yield several partitioned components, as illustrated in Fig. 5a. The algorithm then constructs a minimum spanning tree (MST) for these components such that each pair of components can be connected with a shortest path, as shown in Fig. 5b. Previous work [23,24,25] assumes a fixed communication radius rc of the relays, so one can simply put additional relays on the paths with distance interval rc to connect partitioned components. However, due to the obstacle effect in an indoor environment, the value of rc is different in different orientations. Therefore, in our scenario, this approach iteratively selects candidate locations on the middle of the path until the network connectivity is guaranteed, as shown in Fig. 5c and d, respectively.

Fig. 5
figure 5

Illustration of the two-stage optimization approach. Result of the first stage: see (a). Results during the second stage: see (bd). The first stage is to satisfy the k-dominating requirement, while the second stage is to satisfy the RCG requirement

5.2 Radio propagation model

For all testing cases, we consider a 24.2 m × 13.3 m office floor area, as shown in Fig. 2. This testbed includes 17 obstacles with different construction materials: drywall, wooden wall, glass wall, cement pillar, etc. To build the radio propagation model, a total of 32 TelosB motes [36] are attached to the ceilings of the testing area randomly but ensuring that there are motes on both sides of each obstacle. The motes are equipped with CC2420 low-power radio chips, which provide an RSS indicator reading for each decoded packet. These nodes take turns to broadcast 50 packets and all the receiver nodes record their RSS as the training data. We evaluate the performance of two approaches: (i) automatic obstacle classification following Algorithm 1, and (ii) manual classification based on architectural knowledge a priori (the construction material, the thickness, etc.). Figure 6 presents the mean absolute error (MAE) between the predicted and actual RSS values of these two approaches with the number of classification groups varying from 1 to 6. Except the result of one classification group, the automatic obstacle classification consistently outperforms the manual classification approach with 5.0–5.9% lower MAE. Figure 6 also shows that additional obstacle classes are beneficial to the prediction accuracy only in a limited classification group range. Based on the training data in hand, the improvement in the prediction accuracy is getting less and less as the classification groups increases from 4 to 6. We can see from Fig. 6 that 5-class automatic classification almost achieves the lowest error. We thus adopt the parameter estimates from the best obstacle classification, as shown in Table 1, and use them to predict the communication ranges of the placed relays in the coming simulations.

Fig. 6
figure 6

Mean absolute errors (MAEs) of manual and automatic classifications with different number of classified groups

Table 1 Optimal parameter estimates for the radio propagation model

5.3 Simulations

It is obvious that the grid size we choose to discretize πs and πr determine the cardinalities of sets Rs and Rc respectively and thus affect the algorithm complexities, as discussed in Section 4.3. The grid size may also have impact on the algorithm performance. Let us denote the grid side lengths of πs and πr by gs and gr, respectively and first study how they impact the algorithm performance and then try to figure out a most reasonable setting for the two parameters. Note that there are a few gray rectangles in Fig. 2 which represent some tables. Since these areas placing the tables are fixed and unavailable for human activity, those grid points in πs whose centers fall into the gray regions are excluded from the set Rs. Unless otherwise stated, each simulation result in the following is the average from 50 runs.

In the first set of simulations, we fix gr to be 1m and vary gs from 0.5 to 3 m with step size of 0.5 m. Other settings are as follows: Pt = − 19 dBm and k = 3. We assume that the grid is k-dominated as long as the center of the grid is k-dominated. Hence, intuitively a larger grid size may lead to more “blind area” which may not meet the k-dominating requirement. To make a fair comparison, we let the grid side length of the sensor tier, i.e., gs, be small enough, e.g., 0.1m, as a benchmark, when evaluating the actual k-dominating ratio. Figures 7 and 8 show the required number of relays and their respective radio-coverage ratios of the three algorithms in comparison with an increase in gs. As shown in Fig. 7, the number of relays required by random placement is more than twice of that required by the other two algorithms. Thanks to the redundancy in random placement, the coverage ratio performance is very stable and nearly 100%, see Fig. 8. However, in the two-stage approach and our proposed ORPA, the coverage ratios have larger fluctuations. Although ORPA requires the least number of relays regardless of the value of gs, its coverage performance degrades or becomes less stable when gs is large. So, considering both the number of relays and the coverage performance, it is better to let gs equal to 0.5 or 1 m. Thus, in the following unless otherwise stated, gs is set to 0.5 m to ensure the coverage performance.

Fig. 7
figure 7

Comparison of the number of required relays for the three algorithms under different gs settings, gr = 1m, Pt = − 19 dBm, k = 3

Fig. 8
figure 8

Comparison of the actual radio-coverage ratio for the three algorithms under different gs settings, gr = 1m, Pt = − 19 dBm, k = 3

In the second set of simulation, we set gs to be 0.5 m and vary gr from 0.5 to 3 m with step size of 0.5 m. Other settings are as follows: Pt = − 19 dBm and k = 3. While gs is set to be 0.5 m, we can see that the radio-coverage ratios of all the three algorithms approaches 100% when gr is varying from 0.5 to 3 m. Due to space limit, we omit the plot of the radio-coverage ratios. Figure 9 shows the required number of relays of the three algorithms with an increase in gr. We can see that our proposed ORPA always requires a smaller number of relays than the other two algorithms. Furthermore, for ORPA and the two-stage approach, the number of relays required is at the minimum when gr is set to 0.5 or 1 m. Since a smaller gr indicates more available candidate locations points, it is possible to make the relays more precisely placed. For random placement, more candidate locations lead to higher redundancy. So, the required number of relays decreases with the increase in gr. However, an overlarge gr value is likely to incur the risk that the coverage requirement cannot be satisfied even if all the grids are placed with relays. Therefore, if without a rough estimation of the number of relays required under certain radio-coverage requirement, choosing a small gr will be safer. Considering both the number of relays and the computation complexity, we let gr to be 1 m in the following simulations.

Fig. 9
figure 9

Comparison of the number of required relays for the three algorithms under different gr settings, gs = 0.5m, Pt = − 19 dBm, k = 3

In the third set of simulation, we compare the number of required relays under different values of k. Other settings are as follows: Pt = − 19 dBm, gs = 0.5 m, and gr = 1 m. As shown in Fig. 10, with the increase in k from 3 to 6, the increment of required number of relays is approximately linear. Meanwhile, our proposed ORPA outperforms both the random and two-stage approaches, with average performance gains of about 69.3 and 26.3%, respectively. Note that the standard deviations of the required numbers of relays by ORPA and two-stage algorithms are also always smaller than that of random placement, which implies higher performance stability.

Fig. 10
figure 10

Comparison of the number of required relays for the three algorithms under different radio-coverage requirements k, Pt = − 19 dBm, gs = 0.5 m, gr = 1 m

In the fourth set of simulation, we study the performances of the three algorithms under different transmission powers. Other settings are as follows: gs = 0.5 m, gr = 1 m, and k = 3. Figure 11 shows that the required number of relay decreases with the increase in the transmission power and the slope tends to flatten out. Due to the k-radio-coverage requirement, there should be at least k relays even if the transmission power of each relay is extremely large. Note that as the transmission power is getting large, the performance difference between ORPA and the two-stage algorithm gets smaller. This is because larger transmission power makes the connectivity requirement more easily satisfied. Hence, the number of augment nodes to ensure connectivity in two-stage algorithm can be reduced.

Fig. 11
figure 11

Effect of the transmission power Pt on the number of required relays for the three algorithms, k = 3, gs = 0.5 m, gr = 1 m

Table 2 Experimental data records

In the last set of simulation, we focus on investigating the performance of ORPA under various settings. Figure 12 shows the required number of relays with the increase in the transmission power under different radio-coverage requirements, where gs = 0.5 m and gr = 1 m. It provides us a good reference when we have to weigh the node budget against the energy consumption and the localization accuracy (a larger k value means more relays can participate in the localization) in a practical deployment. From Fig. 13, we can see that when the transmission power is − 17 dBm, the total power consumption is the lowest. Figure 14 shows the required number of relays with different k and gr values, where gs = 0.5 m and Pt = − 19 dBm. It is worth noting that a smaller gr value indicates a higher computation complexity but not definitely fewer relays. This phenomenon can also be observed in Fig. 9. Since we assume the relays can only be placed at the center of the grid, different grid side lengths lead to different candidate locations of the relays. If gr is small enough, almost each point in πr can be investigated and an optimal solution can be found. Otherwise the solution depends on the quality of the candidate locations and there is possibility that more candidate locations and higher computation complexity lead to worse solution instead. Figure 14 provides us different solutions under different computation complexities. In general, the relay placement problem is optimized once and for all. So people may care more about the relay cost rather than the computation complexity. Finally, noting that when gr is set to 3m and k is equal to 6, we find that ORPA cannot find a solution to satisfy the requirement. This means that there are some critical points that one cannot provide a guarantee of connected network. In other words, an over-sized gr may lead to no solution especially when the transmission power is small and the system requires a large k.

Fig. 12
figure 12

Effect of the transmission power Pt on the number of required relays under different k settings for ORPA, gs = 0.5 m, gr = 1 m

Fig. 13
figure 13

Effect of the transmission power Pt on the total power consumption under different k settings for ORPA, gs = 0.5 m, gr = 1 m

Fig. 14
figure 14

Effect of the radio-coverage requirement k on the number of required relays under different gr settings for ORPA, Pt = − 19 dBm, gs = 0.5 m

5.4 Implementation

To evaluate the performance of the proposed algorithm in real environment, we build an indoor fall detection system, as shown in Fig. 15, consisting of a sensor node with triaxial accelerometer, relay nodes on the ceilings and a sink node in charge of collecting the data. The experimental area is the same as described in Section 5.2. The placement of the relays is specified by red stars in Fig. 16, following one of the results obtained by ORPA under the settings Pt = − 17 dBm, k = 3, gs = 0.5 m and gr = 1 m. Our volunteer carrying the sensor node walks slowly along the route shown in Fig. 16. The sensor node samples the triaxial acceleration data once every second and transmit the packet to the sink through multihops using the collection tree protocol (CTP) [37]. The test repeats 10 times and the number of transmitted and successfully received packets are recorded in Table 2. By calculation, the average packet reception rate is 93.5%. The packet loss may caused by unpredictable interference and route switching in the course of moving. In addition, we let the sink query the location of the volunteer when he randomly stops at 50 locations in the testing area, where 98% of the locations have at least 3 relays reporting back the RSS of the sensor node. In summary, our proposed ORPA achieves a satisfying performance in real environment. Due to the labor-intensive implementation work, we did not test the performance of the two-stage and random placement.

Fig. 15
figure 15

Implementation scenario

Fig. 16
figure 16

Experimental area, with red stars representing the placement of the relays, lines with arrows representing the testing route and the laptop representing the sink

6 Conclusion

In this paper, we have studied the two-tiered relay placement problem for a WSN-based home health monitoring system. Such a system requires that the deployed relays form a connected network and the patient carrying the sensor nodes is at least k-radio covered (k ≥ 3). Instead of using an idealistic disk radio model, we explicitly take into account the obstacles’ effect on the radio propagation. We prove that the problem is NP-hard and propose an efficient greedy algorithm (ORPA) to compute the best locations for the relays. Results from extensive simulations have clearly verified the superiority of our proposed ORPA compared to the random and two-stage algorithms. The impact of the transmission power and the grid size on their performance is also reported. Note that the method used in the paper is beneficial for studying the indoor deployment of practical sensor systems and can also be extended to other similar node placement problems.