1 Introduction

The rapid development of wireless communication technology in the 5G era has driven the demand for sophisticated location-based services (LBS). In recent years, indoor LBS has been widely employed for object tracking and indoor navigation in hospitals, airports, malls, and other indoor environments. According to Diffey [1], many people spend a substantial amount of their time indoors, resulting in a need for precise and reliable indoor positioning systems.

Although the global positioning system (GPS) is extensively utilized for accurate outdoor localization, its effectiveness diminishes in indoor environments due to signal attenuation and multipath effects [2]. The signal strength of satellite signals weakens as they penetrate building materials, resulting in unreliable reception within indoor environments. There have been a variety of technologies proposed for indoor positioning, including Bluetooth Low Energy (BLE) [3, 4], ultra-wideband (UWB) [5], magnetic field [6, 7], pedestrian dead reckoning (PDR) [8], radio-frequency identification (RFID) [9], and inertial sensors [10]. Among these methods, the BLE, also known as Bluetooth Smart [11], is favored for indoor positioning due to its ease of deployment, low power consumption, low cost, broad coverage, and long battery life. As a result of these unique features, BLE beacons are commonly used to construct Bluetooth fingerprints, i.e., the received signal strength (RSS) readings at known locations, for precise indoor positioning.

An indoor positioning system (IPS) is a system that determines the physical location of an object (target) in an indoor environment. Fingerprinting technique is one of the commonly recommended methods for indoor positioning because of its simplicity, inexpensiveness, high precision, and reliability in indoor environments, compared to other existing techniques such as angle of arrival (AOA), time of arrival (TOA), and time difference of arrival (TDOA) [12]. AOA is disadvantageous in indoor environments due to its susceptibility to multipath interference and signal reflections, affecting the accuracy of angle measurements. On the other hand, TOA and TDOA systems require synchronized hardware and have limitations in distance estimation accuracy [13]. In contrast, the fingerprinting method can provide an accurate estimate of the user’s location by finding the closest match of the reference points (RPs) to the real-time RSS values without knowing the position of the BLE beacon and the line of sight (LOS) propagation. This paper uses the term RP to refer to the BLE fingerprints collected at a specific location from BLE beacons.

Generally, the RSS-based fingerprinting technique consists of two phases: the offline phase (radio map construction) and the online phase (localization). The offline phase involves site surveys to collect the RSS at every RP from different locations to build a fingerprint database or radio map. In the online phase, the real-time RSS of a user is measured using a mobile positioning device and compared to the pre-stored fingerprints in the database via a positioning algorithm to estimate the user’s location. Despite its numerous benefits, this well-known fingerprinting approach has certain drawbacks. Creating radio maps for IPSs is a resource-intensive process, requiring extensive manual surveys that are both laborious and time-consuming. Furthermore, these systems are highly susceptible to environmental dynamics. Obstacles within indoor environments often lead to multipath fading and signal shadowing, which introduce signal fluctuations. Such fluctuations can severely distort the matching process of the fingerprinting technique, resulting in significant performance degradation in localization systems.

Due to the indoor layout change, the initial fingerprint database might be outdated and deemed unusable. Bluetooth channels used to collect the BLE-based fingerprints may experience different multipath fading and shadowing effects caused by obstacles or partitions after the layout change. To address these challenges, this paper proposes a radio map construction method that adapts to environment layout modifications by efficiently selecting a limited number of RPs from specific locations, clustering them using the affinity propagation clustering (APC) algorithm based on their similar RSS characteristics and partition the regions using Voronoi diagram. The APC algorithm is employed in this work because it is an efficient and robust algorithm [14] that automatically determines the optimal number of clusters based on the fingerprint data, unlike other clustering techniques, such as K-means clustering, which requires the number of clusters to be pre-defined. Given the absence of prior knowledge regarding potential layout changes, the APC algorithm is a well-suited technique for grouping fingerprints based on RSS similarity.

Based on the clustering outcome, the Voronoi diagram is utilized to partition the area of interest into multiple zones that are adapted to the environment after the layout change. The Voronoi diagram can adapt to the environment after the layout change, ensuring that the regions accurately reflect the current state of the indoor space. This approach can improve the accuracy of fingerprint-based localization, as it helps to mitigate the negative effects of multipath fading and shadowing caused by changes in the indoor layout. Additionally, using the Voronoi diagram allows for more efficient localization by reducing the number of fingerprints that need to be compared in a given area, since the regions can be used to determine the most likely location of a mobile device based on the signal strength of nearby fingerprints.

After adapting the area of interest to the new layout, the irregularly distributed RPs require a suitable technique for reconstructing a surface. Natural neighbor interpolation (NNI) is the preferred method as it can generate a continuous surface from irregularly distributed sample points. NNI determines the value of a new point by considering the values of its neighboring points with the weights of the neighbors proportional to their areas of influence. The technique uses Delaunay triangulation to define the neighborhoods, enabling it to adapt to the varying density and shape of the region, resulting in a smooth and accurate surface. However, NNI faces the convex hull issue, which limits efficient fingerprint construction. To overcome this challenge, we use the gradient extrapolation method (GEM) to generate synthetic RPs for convex hull expansion, ensuring accurate virtual fingerprint construction.

In short, the proposed Clustering-based Voronoi Natural Neighbor Interpolation (C-VoNNI) offers a robust and efficient solution to reconstruct a new radio map in the face of dynamic indoor environments subject to different layout change, significantly improving the localization performance. The main contributions of this paper are summarized as follows:

  1. 1.

    We utilized the APC technique to group a small number of selected RPs with similar RSS characteristics. Subsequently, a Voronoi diagram is employed to partition the area of interest into multiple zones, which are adapted to the environment after the layout change. Virtual fingerprints are generated for each zone from their respective RPs, effectively reducing interpolation errors commonly induced by multipath fading and signal shadowing due to the dynamic layout change;

  2. 2.

    To enhance the accuracy of indoor positioning, we constructed a precise radio map using the NNI technique based on chosen RPs. This method allows the virtual generation of fingerprints that accurately capture spatial signal strength variations, thereby enabling more precise localization of mobile devices within the indoor environment;

  3. 3.

    Addressing the limitation of NNI, which requires interpolated points (IPs) to be within the convex hull, we employed the GEM to generate synthetic RPs, referred to as ‘super points’ (SPs) in this paper. This approach effectively expands the convex hull, allowing for interpolation even when existing RPs fall outside of it, thus enabling the generation of higher-quality fingerprints and enhancing the overall quality of the radio map.

The performance of the proposed C-VoNNI is evaluated using two different environmental testbeds and compared with two baseline experiments. First, we collect data from the environment without the partitions. This dataset is then utilized for training and testing the model. Subsequently, to simulate layout changes, additional data are collected from the environment by erecting multiple partitions that affect the collected RSSI values. The testbeds are then used to evaluate the performance of the newly interpolated radio map. The performance of the proposed technique is compared to the inverse distance weighting (IDW) and Kriging interpolators, and the evaluation is based on the root mean-squared error (RMSE) and the positioning error.

The remainder of this paper is organized as follows. In section 2, we review the related work corresponding to various approaches for radio map construction. Section 3 describes the proposed method in detail, including the theory and algorithm on which it is based for radio map construction. The experimental setup, results, and discussion are shown in Sect. 4. Finally, the conclusion and direction of future work are presented in Sect. 5.

2 Related work

Collecting RSS samples during offline site surveys is a major challenge as it requires a significant amount of effort, particularly for large areas [15]. At present, researchers focused on improving positioning accuracy and reducing human effort in fingerprint collection, such as using crowdsourcing [16,17,18] to collect constant feedback from volunteer mobile users to update the radio map, interpolation techniques [19,20,21], and estimating RSS values using path loss model (PLM) [22, 23] to densify the initial incomplete radio map. The main concerns with crowdsourced data collection are the reliability of the position information provided by the mobile user, as well as the users’ involvement and willingness to participate in the crowdsourcing-based activity.

Radio map construction utilizing PLM is a popular method for reducing the database establishment effort. The log-distance PLM employs actual RP measurements collected from offline site surveys to train the parameters of the PLM and subsequently estimate the RSS values at various locations. The downside of this technique is that the propagation model cannot precisely predict the multipath fading and shadowing effects in complex indoor environments, which often leads to poor localization accuracy. In 2019, Bi et al. [22] presented a method for constructing radio maps that integrates PLM, crowdsourcing, and interpolation techniques. The authors utilized the least square algorithm to estimate the optimal parameters of PLM and adaptively construct the propagation model. While this approach can achieve comparable positioning accuracy to a manually constructed radio map with a 1.2 m interval when the interval of sparse RPs was set to 9.6 m, it does not account for factors such as multipath interference and diffraction. These oversights result in less precise interpolated fingerprints, resulting in decreased localization accuracy.

To improve the accuracy of fingerprints generation, Moghtadaiee et al. [24](2019) introduced a zone-based weighted ring-based (WRB) interpolation technique in which the RPs and IPs are assigned to rings depending on their spatial location, and the RSS of the IPs is estimated using the mean of the RPs inside the ring. Experiments show that the proposed technique improves the localization accuracy by up to 40% while decreasing average error by 26% compared to conventional PLM. In 2020, Xia et al. [25] constructed an efficient radio map using an adaptive ordinary Kriging Interpolation technique and APC algorithm. The experimental results showed that the proposed method has a lower mean square error than IDW interpolation. Yong et al. [26] (2022) further this research by proposing a fingerprint clustering method based on RSS difference and the Voronoi diagram, enhancing positioning accuracy by 14%. Most recently, Wang et al. [27] (2023) introduced a novel approach using the bidirectional encoder representation from transformers (BERT) model, commonly applied in natural language processing, to fill in missing signal data in radio maps. By redefining the BERT model’s structure and loss function to suit wireless signal patterns, this method has proven to expedite the radio map construction process significantly and outperformed traditional methods such as linear interpolation, compressed sensing algorithms, and matrix completion. The BERT-based approach has demonstrated remarkable performance, with an error probability within 2 m of approximately 94%, according to their experimental results.

Radio map interpolation is another popular approach for database enhancement. The most commonly used spatial interpolation techniques which predict attribute values at unsampled locations based on observations at nearby sampled locations are the Kriging [28, 29], IDW [30, 31], and natural neighbor [32, 33] interpolators. Although Kriging interpolation is known as the optimal linear unbiased predictor for delivering robust estimation with low error variance, it can only provide accurate performance when the correlation structure of the data is known. Having only a small number of sample points can lead to unreliable results and a lack of robustness. While IDW interpolation has the advantage of fast construction, its main drawbacks are that it is sensitive to outlier measurements and there is no indication of error [34].

NNI is a robust and reliable smoothing technique that provides relatively small interpolation errors [35] based on Voronoi tessellation of a discrete set of spatial points. In the case of fingerprints collected at various locations, the RPs are likely to be irregularly spaced. NNI is well-suited for handling such irregularities [36] and adapting to the dataset. Additionally, NNI technique emphasizes using nearby data points to estimate RSS values at unobserved locations, making it particularly relevant for RSS fingerprints. This is because signal strength can be greatly impacted by local factors, such as walls and furniture. By focusing on neighboring data points, NNI effectively captures local variations in signal strength, providing a more accurate representation of the underlying spatial patterns. As such, the NNI technique is a viable method to generate IPs at the desired locations for a region partitioned into multiple zones with different channel characteristics. However, the main disadvantage of this technique is that the IPs must be in the convex hull of RPs for interpolation. However, in many scenarios, the IPs would be scattered out of the convex hull, forbidding the interpolation using the natural neighbor approach. As a result, we proposed using GEM [37] to generate the synthetic RPs, i.e., SPs, to expand the convex hull area. With this expansion, all the IPs are relocated into the convex hull so that the NNI technique can generate virtual fingerprints for radio map construction. To the best of our knowledge, this is the first work that combines GEM and NNI for interpolation to solve the layout change problem for updating the radio map.

As we all know, wireless signals are highly vulnerable in complex indoor environments. Changes in building structure and obstructions can cause RSS variations, resulting in significant localization errors. To address this issue, Tao and Zhao [38] proposed an automatic fingerprint map construction technique based on the Digital navigation center IPS that is robust to environmental changes. The proposed method collects RSS at fixed points and updates the radio map using Gaussian process regression model and a fireworks algorithm. In addition, Lee et al. [39] presented an automatic self-reconstruction (ASR) model for radio maps that combines radio encoding-based deep fingerprint positioning (RE-DFP) and radio anomaly detecting (RAD) networks, aiming to provide seamless Wi-Fi fingerprinting-based indoor positioning even in the presence of environmental changes. The authors utilized the RE-DFP network to improve positioning accuracy and data efficiency and a RAD network to analyze environmental dynamics according to access points. The experimental results show an improvement in average positioning accuracy by 0.5 m using RE-DFP and detection accuracy of 81.1% for access point abnormality with the RAD network. To the best of our knowledge, no prior work attempts to construct the fingerprint database without employing automated update capabilities to solve the indoor environment layout change issue for IPS.

3 Proposed method

Radio maps, which consist of RSS readings collected at known locations, are essential for fingerprinting localization systems. However, the dynamic nature of indoor environments poses a significant challenge to the efficacy of these systems. After changes to the layout, such as the addition or removal of obstacles, the Bluetooth channels used for collecting BLE RSS data can exhibit altered multipath fading and shadowing effects. These changes can drastically affect the accuracy of the localization due to the complexity of the environment and the potential obsolescence of the existing radio map. Hence, in this work, we propose an efficient method to construct a precise radio map using a minimal number of RPs collected after the layout change by considering the indoor area’s interior architecture and environmental effects. The goal is to minimize the degradation in localization accuracy accompanying such environmental changes. For the offline phase, 28 RPs are selected from the dataset to interpolate the remaining 90 fingerprints using the C-VoNNI method to build a complete radio map. For each RP, there are 30 RSS samples collected at multiple user orientations (typically four directions), with users facing north, south, east, and west to avoid noise effects and signal fluctuations. However, some RPs have extremely weak signal strength (with RSS values less than −100 dBm), or no RSS measurements were obtained during the data collection. To address this issue, we set the RSS values to −110 dBm [15].

Fig. 1
figure 1

An overall processing flow for the proposed fingerprint interpolation technique for radio map construction and online positioning

The proposed radio map construction technique consists of four steps: APC method to group the collected RPs based on RSS similarity, region partition using Voronoi diagram, GEM to produce SPs to expand the area of the convex hull, and generate virtual fingerprints using NNI. Figure 1 depicts the overall processing flow of our proposed fingerprint interpolation technique for constructing virtual fingerprints based on a limited number of RPs collected in the test area, and the algorithm is outlined in Algorithm 1.

The algorithm for creating new interpolated fingerprints consists of four major steps, which are listed below.

  1. 1.

    An initial radio map is created by collecting fingerprints from the test area. The collected fingerprints, known as the RPs, are then clustered based on the RSS similarity using the APC algorithm.

  2. 2.

    Next, the Voronoi diagram is drawn based on the centroids of each cluster to partition the regions after the layout change. This process is repeated for all the AP.

  3. 3.

    Following that, the GEM is employed to generate the synthetic RPs to expand the convex hull area.

  4. 4.

    Finally, the NNI algorithm is used to generate the virtual fingerprints based on regions to construct a complete radio map.

For each RP, the fingerprint contains the RSS vector and its corresponding location coordinates in the form of

$$\begin{aligned} f_{\text {RP}(L)} = [(x,y),\text {RSS}_{n,m}],\quad L=1,2,3,\ldots ,T \end{aligned}$$
(1)

where L is an index of the RP in radio map, T is the total number of RPs, (x, y) is the coordinates of the RP, and \(\text {RSS}_{n,m}\) is the mean of RSS samples collected at the mth RPs from the nth fixed BLE beacons. In this work, since there are 30 RSS readings collected at each RP, \(\text {RSS}_{n,m}\) is computed as shown in Eq. (2), and this process is then repeated for all the BLE beacons.

$$\begin{aligned} \text {RSS}_{n,m} = \text {Mean}[\text {RSS}_{n,m(1)},\text {RSS}_{n,m(2)},\ldots ,\text {RSS}_{n,m(30)}] \end{aligned}$$
(2)

3.1 Affinity propagation clustering (APC) algorithm

The APC algorithm is one of the most widely used clustering techniques in fingerprinting localization systems based on a measure of similarity between data points. The algorithm relies on message passing [40] to compute the exemplars and group the data points with the same exemplar into a cluster. APC algorithm is utilized for clustering processing in this work because it does not require specifying the number of clusters beforehand, which is suited to our experiment scenario without prior information on the layout changes. All RPs are considered potential exemplars clustered using APC algorithm based on RSS values.

APC algorithm required two pieces of information: First is the similarity between RPs, i.e., s(ik), which indicates how well the \(\hbox {RP}_k\) is suited to be the exemplar for \(\hbox {RP}_i\). Let s be a similarity function of two RPs, \(j_i\) and \(j_k\). The similarity s(ik) can be found by computing the negative squared distance of these two points, which is denoted by

$$\begin{aligned} s(i,k)= -||{j_i-j_k}||^2 \end{aligned}$$
(3)

The similarity values for pairs of RPs form a \(T \times T\) matrix: \([s(i,k)]_{T \times T}\), where T is the total number of RPs. This similarity matrix will then be fed into the APC algorithm for clustering processing.

The second information that APC algorithm requires is the preference, i.e., s(kk), which is the diagonal of the similarity that controls the number of clusters. Typically, the preference value is set to the median similarity of all pairs of RPs.

APC algorithm recursively transfers two types of messages between the RPs, named responsibility r(ik) and availability a(ik), until a good set of exemplars and clusters emerges. At the beginning of the procedure, both responsibility and availability are initially set to zero. The algorithm starts by distributing responsibility messages to gather evidence on how well-suited \(\hbox {RP}_k\) is to serve as the exemplar for \(\hbox {RP}_i\) compared to other candidate exemplars of \(\hbox {RP}_i\). The responsibility is updated by

$$\begin{aligned} r(i,k)= s(i,k) - \max _{k'\ne k}\{a(i,k')+ s(i,k')\} \end{aligned}$$
(4)

The availability, a(ik), which indicates how appropriate it is for \(\hbox {RP}_i\) to choose \(\hbox {RP}_k\) as its exemplar, is then updated using Eq. (5).

$$\begin{aligned} a(i,k)= \min \left\{ 0, r(k,k) + \sum _{i'\notin \{i,k \}} \max \{ 0, r(i',k) \} \right\} \text {for} \, i \ne k \end{aligned}$$
(5)

The self-availability a(kk) is updated differently as

$$\begin{aligned} a(k,k)= \sum _{i'\ne k} \max \{ 0, r(i',k) \} \end{aligned}$$
(6)

To prevent numerical oscillations during the message updating procedure, we use the default damping value, \(\lambda\) of 0.5, as suggested by Fred and Dueck [40], in all our experiments. The responsibility and availability messages are updated as follows.

$$\begin{aligned} r_{\text {new}}(i,k)= \lambda \cdot r_\text {old} + (1-\lambda ) \cdot r_{\text {new}}(i,k) \end{aligned}$$
(7)
$$\begin{aligned} a_{\text {new}}(i,k)= \lambda \cdot a_\text {old} + (1-\lambda ) \cdot a_{\text {new}}(i,k) \end{aligned}$$
(8)

The responsibility and availability can be combined at any stage during the clustering process to identify the exemplars. For \(\hbox {RP}_i\), the value of k that maximizes \(r_{\text {new}}(i,k)+a_{\text {new}}(i,k)\) either identifies \(\hbox {RP}_i\) as an exemplar itself if \(k \ne i\), or identifies \(\hbox {RP}_k\) as an exemplar of \(\hbox {RP}_i\) if, according to Eq. (9).

$$\begin{aligned} \max _{k} \{ r{_\text {new}}(i,k)+a{_\text {new}}(i,k) \} \end{aligned}$$
(9)

After identifying a collection of exemplars, clusters can be formed by grouping the RP with the same exemplar. The algorithm could terminate after a certain number of iterations or when the message changes fall below a threshold.

Algorithm 1
figure a

Interpolated radio map algorithm

3.2 Voronoi diagram for clusters partitioning

Voronoi diagrams, also known as Dirichlet tessellation or Thiessen polygons, are useful for space partitioning [41]. Given a set of points {\(P_1\), \(P_2\),..., \(P_n\)} in a plane, where these points are called Voronoi sites, the Voronoi diagram divides the plane into A Voronoi regions. Each site lies in precisely one region called a Voronoi cell. The Voronoi diagram of two sites, \(P_1\) and \(P_2\), is constructed by drawing the perpendicular bisector of the line segment. This line segment will form the boundaries of Voronoi cells called the Voronoi edges. The Delaunay triangulation is the dual graph of the Voronoi diagram, in which two Voronoi sites are connected and bounded by a common Voronoi edge.

The procedure for constructing a Voronoi diagram for each AP is as follows:

  1. 1.

    Let centroids of i cluster, \(C = \{c_1, c_2,\ldots , c_i \}\) as the Voronoi sites. Using Eq. (10), compute the location of a centroid in cluster j, \(c_j\), which consists of L RPs with coordinates \(\{(x_1, y_1), (x_2, y_2),\ldots , (x_L, y_L)\}\) by finding the mean vector of coordinates of all RPs within the same cluster.

    $$\begin{aligned} c_j=(x_j,y_j)=\text {mean vector} \, [(x_1, y_1), (x_2, y_2),\ldots , (x_L, y_L)] \end{aligned}$$
    (10)
  2. 2.

    For \(c_1,c_2 \in C\), draw the perpendicular bisector of two Voronoi sites, \(c_1\) and \(c_2\), following the steps below:

    1. (i)

      Find the midpoint of two sites \(c_1=(x_1,y_1)\) and \(c_2=(x_2,y_2)\) using

      $$\begin{aligned} \left( \dfrac{x_1 + x_2}{2}, \dfrac{y_1 + y_2}{2}\right) \end{aligned}$$
    2. (ii)

      Find the negative reciprocal of the gradient of the two sites \(c_1\) and \(c_2\), \(m_{{c_1}{c_2}}\) given by

      $$\begin{aligned} m_{{c_1}{c_2}}=-\left( \dfrac{y_2 - y_1}{x_2 - x_1} \right) ^{-1} \end{aligned}$$
    3. iii)

      Find y-intercept, b of the perpendicular bisector using one of the sites, \(c_1(x_1,y_1)\) or \(c_2(x_2,y_2)\), denoted by

      $$\begin{aligned} b= y_2 - (x_2)\left[ -\left( \dfrac{y_2 - y_1}{x_2 - x_1} \right) ^{-1}\right] \quad \text {or} \quad b= y_1 - (x_1)\left[ -\left( \dfrac{y_2 - y_1}{x_2 - x_1} \right) ^{-1}\right] \end{aligned}$$
    4. (iv)

      Form an equation of the perpendicular bisector in the form of

      $$\begin{aligned} y = m_{{c_1}{c_2}} \cdot x + b \end{aligned}$$

Due to page limit, we only present the results for BLE10 from layout 1. Figure 2 depicts the clustering results for BLE10 using APC algorithm. As shown in Fig. 2, the RPs are grouped into four clusters based on RSS similarity, as indicated by different colors and shapes. The clustering is performed by first finding the centroid of each cluster, which is represented by the star symbol based on the shapes’ color as shown in Fig. 3, and then partitioning the regions by drawing perpendicular bisectors to construct the Voronoi diagrams. From Fig. 3, the clustering techniques that produced four Voronoi cells are identical to the actual case in which the testbed is divided by three partition walls.

Fig. 2
figure 2

Clustering results for BLE10 using APC algorithm

Fig. 3
figure 3

Voronoi diagrams for BLE10

3.3 Gradient extrapolation method (GEM)

The main disadvantage of the NNI technique is that it cannot generate virtual fingerprints outside of the convex hull because the algorithm requires the IPs to be within the convex hull. To overcome this issue, we proposed employing GEM [37] to expand the surface area outside the convex hull, allowing NNI to generate virtual fingerprints for developing a complete radio map.

GEM is used to estimate the extrapolated RSS of an SP outside the convex hull. The fingerprint of an RP, \(f_\text {RP}\), is recorded in the form shown in Eq. (1). Let the fingerprint of an SP, \(f_\text {SP}=[(x_\text {SP},y_\text {SP}),\text {RSS}_\text {SP}]\), and the fingerprint of a collected RP, \(f_\text {RP}=[(x_\text {RP},y_\text {RP}),\text {RSS}_\text {RP}]\). The gradient at an RP on the surface edge is defined as \(\bigtriangledown \text {RSS}_\text {SP}=(\psi _x,\psi _y)\), where \(\psi _x\) and \(\psi _y\) are gradient component values in the x and y axes, respectively.

The procedure for estimating the RSS of an SP, \(\text {RSS}_\text {SP}\) is as follows:

  1. 1.

    Using x- and y-component gradients, calculate the approximated RSS value of SP based on the nearest RP.

    $$\begin{aligned} {\left\{ \begin{array}{ll} \begin{aligned} \quad \text {RSS}_x = \psi _x (x_\text {SP} - x_\text {RP}) + \text {RSS}_\text {RP} \\ \quad \text {RSS}_y = \psi _y (y_\text {SP} - y_\text {RP}) + \text {RSS}_\text {RP} \end{aligned} \end{array}\right. } \end{aligned}$$
    (11)
  2. 2.

    Compute the RSS value of the extrapolated SP by averaging the x- and y-component RSS estimates from Eq. (11), which is denoted by

    $$\begin{aligned} \text {RSS}_\text {SP}= \dfrac{1}{2} \{ \text {RSS}_x + \text {RSS}_y \} \end{aligned}$$
    (12)

3.4 Natural neighbor interpolation (NNI)

The NNI technique was first introduced by Sibson [42]. It is a spatial interpolation technique based on Voronoi tessellation of a set of RSS sample points, and it can provide smoother approximation compared to simpler methods [43, 44]. The NNI estimates the RSS value of an IP, \(\text {RSS}_\text {IP}\) by computing the weighted average of its B natural neighbors using Eq. (13).

$$\begin{aligned} \text {RSS}_\text {IP}= \sum _{i=1}^B {w_i}{\text {RSS}_i} \end{aligned}$$
(13)

where \(\text {RSS}_\text {IP}\) is the estimated RSS of the IP and \(RSS_i\) is the RSS of the IP’s natural neighbors.

The weight of each neighboring point of the IP, \(w_i\), is calculated as follows:

$$\begin{aligned} {w_i}= \dfrac{A(x_i)}{A(V_\text {IP})} \end{aligned}$$
(14)

where \(A(x_i)\) denotes the overlapping area between Voronoi cell \(V_1\) and the new Voronoi cell, \(V_\text {IP}\), and \(A(V_\text {IP})\) represents the total area of the new Voronoi cell, \(V_\text {IP}\), which are highlighted in blue and yellow in Fig. 4, respectively.

Fig. 4
figure 4

Natural neighbor interpolation

The main challenge of NNI is that it requires the interpolated points to be in the convex hull of measurement locations, as only the natural neighbors of the IP are considered in the interpolation process. As such, the GEM technique is proposed to generate the extrapolated points to broaden the convex hull coverage for each region before adopting the NNI technique to generate the virtual fingerprints to construct a complete radio map.

4 Experimental setup, results and discussion

4.1 Experimental setup

This work simulated a complex indoor environment by including three partition walls of size 1.5 m \(\times\) 1.8 m with a thickness of 0.0127 m in the testbed [15]. Figure 5 depicts the two experiment testbed layouts, each, with 144 data points distributed across the testbed and split into the test, reference, and interpolation points. The testbed is a rectangle area with dimensions 51.5 m \(\times\) 2.7 m, and 14 SENSORO SmartBeacon-4AA Pro BLE beacons are pre-installed on the corridor wall at 1.7 m from the ground to reduce pedestrian disruption. The position of the AP is chosen in such a way as to maximize the signal coverage, ensuring that BLE signals cover the corridor entirely.

As shown in Fig. 5, our work includes 28 RPs in the training dataset, 26 test points (TPs) in the testing dataset, and 90 IPs. The training and testing datasets consist of 17 attributes, including the RSS measurements collected from 14 BLE beacons, location labels, and corresponding coordinates. To evaluate the positioning performance of the proposed technique, the TPs are chosen to be evenly distributed around the testbed area. First, we use APC algorithm to cluster the RPs based on their similarity. The Voronoi diagram is then used to draw the cluster boundaries, and SPs are generated using a GEM method to expand each cluster’s convex hull. Finally, the virtual fingerprints based on RPs in each cluster are generated using the NNI algorithm. These interpolated fingerprints are then added to the training dataset, fed into the KNN algorithm, and evaluated using test data.

Fig. 5
figure 5

The layouts of the experimental testbed, where the solid blue, red, and black circle symbols refer to the RP, TP, and IP, respectively

4.2 Evaluation metrics

The root mean-squared error (RMSE) measures the accuracy of the estimated data obtained by the KNN regression. It reflects the closeness of the estimated and actual values. RMSE is calculated by taking the square root of the average square difference between the actual and corresponding estimated points over the sample, as given by

$$\begin{aligned} {\text {RMSE} = \dfrac{\sum \nolimits _{M=1}^{N}\sqrt{(x_M-\hat{x_M})^2+(y_M-\hat{y_M})^2}}{N}} \end{aligned}$$
(15)

where N denotes the total number of samples, \((x_M, y_M)\) is the Mth coordinate of the actual point, and \((\hat{x_M}, \hat{y_M})\) is the Mth coordinate of the corresponding estimated point.

4.3 Results and discussion

4.3.1 Number of training data for radio map construction

In this paper, the impact of the number of RPs on the localization performance by calculating the RMSE for different sets of RPs is investigated. The experiments were conducted 3000 times for each set of RPs to ensure the reliability of the results. Among the different sets tested, which included 20, 24, 28, 32, and 36 RPs, as shown in Fig. 6, the set of 28 RPs was consistently found to yield the most accurate virtual fingerprints with the lowest RMSE. The results suggest that employing more RPs for interpolation does not necessarily lead to optimal quality. One possible reason for this observation is the presence of erroneous RPs caused by multipath fading and signal shadowing during data collection. Moreover, as the number of RPs increases, the likelihood of interference between neighboring RPs rises, introducing additional noise and variability into the measurements, thereby affecting interpolation accuracy and resulting in a higher RMSE. Additionally, the density of RPs per square area may contribute to these findings, as uneven distribution and sparse placement of RPs across the entire area can lead to interpolation errors in regions with fewer RPs. Such uneven coverage hinders the accurate estimation of signal characteristics in those areas. Based on these results, 28 RPs are used for radio map construction in this work.

Fig. 6
figure 6

RMSE of interpolated radio map using a different number of RPs

4.3.2 Impact of number of access points on localization accuracy

To the impact of AP density on localization accuracy, simulations were conducted across two distinct layouts, as illustrated in Fig. 5. The simulations involved systematically varying the number of APs within the testbed, with configurations including 4, 6, 8, 10, 12, and the complete set of 14 APs. This method facilitated a thorough analysis of the performance trade-offs associated with varying the number of deployed APs.

Figure 7 presents the relationship between the number of APs and localization performance. The results reveal that Layout 1 and Layout 2 consistently demonstrated a direct correlation between the number of APs and the RMSE of the localization system. An increase in the number of APs within the testbed corresponded with a reduction in RMSE, signifying an improvement in localization accuracy. The results indicate that employing 14 APs yields the lowest localization errors because a higher number of APs provides a denser and more informative signal landscape, which enhances the system’s ability to estimate positions accurately. Employing more APs ensures comprehensive signal coverage, reducing the likelihood of signal dead zones and improving the triangulation process. Therefore, this work utilizes 14 APs to ensure optimal localization precision.

Fig. 7
figure 7

Localization Performance in Layout 1 and Layout 2 Using Different Numbers of APs

4.3.3 The efficiency of C-VoNNI

To evaluate the effectiveness of the C-VoNNI, baseline experiments were conducted under two distinct layout conditions within our experimental testbed, as depicted in Fig. 5. These conditions, with and without partition walls, serve as benchmarks to assess the performance of our proposed interpolation method in constructing the radio map.

The localization accuracy of the baseline experiments was evaluated under two defined scenarios:

  1. 1.

    Scenario 1 (best-case) In this scenario, the environment includes three partition walls, as illustrated in Fig. 5. RPs were collected from the complex indoor environment to both train and test the system. This scenario is considered ’best-case’ as it reflects a stable environment where the RPs are consistent with the test conditions.

  2. 2.

    Scenario 2 (worst-case) Contrasting the first, this scenario involves training the system with RPs from an indoor environment without partitions. The performance is then evaluated against data collected from an environment with partitions. This scenario is considered ’worst-case’ due to the discrepancy between training and testing conditions, highlighting the challenges of dynamic environments where layout changes can significantly impact signal propagation.

The RMSEs obtained for these two baseline experiments are shown in Table 1, which highlights the significant impact of layout changes on localization accuracy, with positioning accuracy degrading by up to 250%. Such changes in real-world scenarios, such as adding or removing partition walls and furniture, can lead to signal attenuation and shadowing effects. These, in turn, can cause substantial localization errors if the fingerprints used for localization become outdated.

Table 1 Localization accuracy of baseline experiments

Figure 8a, b illustrates the cumulative distribution function (CDF) for Layouts 1 and 2, respectively, comparing the performance of the proposed C-VoNNI interpolation method against the baseline experiments. The CDFs provide a clear visual representation of the distribution of localization errors for all test cases. In these figures, the blue dotted line indicates the CDF for the best-case scenario, while the black dotted line represents the worst-case scenario. The 90th percentile error distance for the proposed C-VoNNI method and the best-case and worst-case scenarios are 4.28 m, 4.30 m, and 15.10 m, respectively. These results demonstrate that the C-VoNNI method closely approximates the best-case scenario’s performance, suggesting that the interpolated RSS samples generated by our method are of a quality comparable to those collected through extensive site surveys. This outcome demonstrates the robustness of the C-VoNNI method in maintaining high localization accuracy despite environmental changes.

The APC algorithm plays a crucial role in achieving this robustness. By clustering RPs based on similar RSS characteristics, the APC algorithm effectively reduces the impact of signal shadowing and multipath fading, which are exacerbated by layout changes. The subsequent application of the NNI technique leverages these accurately clustered RPs to generate virtual fingerprints that are highly representative of the actual signal distribution in the environment. This approach ensures that the constructed radio map remains precise and adaptable, reducing the need for frequent and labor-intensive site surveys.

Fig. 8
figure 8

Comparison of localization errors computed by C-VoNNI to two baseline experiments

4.3.4 Localization performance of C-VoNNI

In this work, the APC algorithm is utilized to cluster the RPs in the testbed before generating IPs with GEM and NNI techniques. The KNN algorithm was then used to calculate the RMSE errors for these IPs, with parameter K set to 5 for all our experiments. To provide a comprehensive evaluation of the proposed C-VoNNI method, our comparison includes both conventional IDW and the more complex Kriging interpolations. IDW is a deterministic method for multivariate interpolation with a known scattered set of points [45]. Its simplicity is advantageous for quick predictions, and it is often used as a benchmark in spatial analysis due to its straightforward implementation and the intuitive nature of its assumptions. By comparing our results with IDW, we establish a baseline for performance against a commonly used interpolation method. On the other hand, Kriging interpolation is a well-established technique in spatial analysis known for its ability to capture spatial variability and provide accurate predictions based on spatial correlation. Work [46] demonstrated the relevance of Kriging interpolation in fingerprint update techniques, showcasing its efficacy in adapting algorithms to environmental changes and enhancing localization performance.

Fig. 9
figure 9

Heatmaps illustrating signal strength for (i) before interpolation, (ii) after C-VoNNI interpolation, (iii) after IDW interpolation, and (iv) after Kriging interpolation

Figure 9 presents heatmaps before and after applying C-VoNNI, IDW, and Kriging interpolation, offering a visual comparison of signal strength distribution across two different layouts: AP2 for layout 1 and AP25 for layout 2. The figure reveals that signal strength exhibits fluctuations between nearby RPs prior to interpolation due to the sparsity of data points. Post-interpolation, however, the heatmaps show a smoother transition across areas of varying signal strengths, indicating a more uniform signal distribution. Additionally, the regions of extreme signal strength appear expanded or shifted, illustrating how the interpolation algorithms estimate values for the spaces between RPs. A comparative analysis of the post-interpolation results, as depicted in parts (ii), (iii), and (iv) of Fig. 9a, b, reveals that C-VoNNI interpolation produces heatmaps with a smoother signal strength transition compared to IDW and Kriging.

Figure 10 illustrates the RMSE comparison among our proposed technique, IDW, and Kriging interpolation, while Fig. 11 displays the CDF for all three approaches. The results indicate that our proposed method outperforms radio map construction using both IDW and Kriging interpolation by up to 33% in terms of RMSE. This significant improvement was observed when utilizing 72 or more IPs in Layout 1. The superior performance can be attributed to the limitations of Kriging interpolation, particularly when using a limited number of RPs in the fingerprint interpolation process, leading to poor Kriging interpolation localization accuracy. Moreover, the performance improvement for C-VoNNI exhibits a clear trend as the number of IPs increases. This is because the coverage and density of the IPs become higher, allowing for a more precise estimation of the signal characteristics and spatial variations in the environment. This additional information leads to a better representation of the radio map, reducing the discrepancy between the estimated and actual positions and ultimately resulting in a lower RMSE. Grouping the RPs closely related to each other is critical in radio map construction to improve localization accuracy. According to Fig. 10, the RMSE of the interpolated radio map obtained using our proposed method decreases as the number of IPs increases. These results show that our C-VoNNI has superior indoor positioning capability in complex indoor environments.

Fig. 10
figure 10

Comparison of RMSE with the number of interpolated data generated based on 28 RPs using C-VoNNI (red line), Kriging interpolation [47, 48] (black line), and IDW [49](blue line) (color figure online)

Fig. 11
figure 11

Cumulative distribution function of localization errors computed using C-VoNNI (red line), Kriging interpolation [47, 48] (black line), and IDW [49] (blue line) (color figure online)

4.4 Computational complexity and scalability of the C-VoNNI

The computational complexity and scalability of the proposed C-VoNNI interpolation method are critical factors determining its practicality for deployment in large-scale real-world applications. The C-VoNNI method integrates the APC algorithm, Voronoi diagrams, and NNI to construct an accurate radio map for indoor localization. Table 2 compares the computational complexities of the proposed method with IDW and Kriging interpolation techniques.

Table 2 Computational complexity comparison

The computational complexity of the C-VoNNI method is determined by the complexities of the APC, Voronoi diagram construction, and NNI components. The APC has a complexity of \(O(g^2 \cdot t)\), where \(g\) is the number of collected RPs to be clustered, and \(t\) is the number of iterations the algorithm undergoes before converging to a stable set of clusters [50]. The construction of the Voronoi diagram, which also uses these RPs, has a computational complexity of \(O(g \log g)\) [51]. Meanwhile, the complexity of NNI can vary from \(O(d)\) to \(O(d \log d)\), depending on the data structures and search algorithms employed, where \(d\) refers to the number of data points involved in the interpolation process [52]. Since these components are applied sequentially, the overall complexity of the C-VoNNI method is dominated by the most computationally intensive step, typically the APC, due to its quadratic term.

Kriging interpolation has a computational complexity of \(O(d^3)\) because it involves solving a system of linear equations, which typically requires matrix inversions [53]. This cubic complexity arises from the matrix operations necessary to solve the Kriging equations and can become computationally intensive as the dataset size increases. On the other hand, IDW calculates the interpolated value at a prediction point based on the weighted average of values from nearby known points. IDW has a complexity of \(O(d \cdot z)\), where \(z\) is the number of prediction points [54]. While linear with respect to \(z\), IDW still requires computations involving all known points for each prediction.

The proposed C-VoNNI method offers a balanced trade-off between accuracy and computational efficiency, avoiding the cubic complexity of Kriging and reducing the computational load compared to IDW. However, the scalability to larger environments remains a challenge due to the quadratic complexity of the APC component. To address these scalability concerns, future work could explore optimization techniques for APC, such as reducing the number of iterations \(t\) through better initialization strategies. Furthermore, leveraging parallel processing and distributed computing could significantly decrease the time required for computations, thus enhancing the method’s scalability. These improvements could enable the C-VoNNI method to maintain high localization accuracy without prohibitive computational costs in large-scale applications.

5 Conclusion and future work

As the indoor layout changes, the initial fingerprint database may become outdated due to various environmental factors, including multipath fading and shadowing. These factors introduce signal fluctuations and distortion-matching issues, leading to significant performance degradation in localization systems. Hence, this paper proposes a new fingerprint method based on clustering, extrapolation, and interpolation techniques for constructing a precise radio map that can adapt to environment layout changes while reducing the human effort, cost, and time required for offline site surveys. By clustering RSS samples with similar characteristics, more accurate virtual fingerprints can be generated because RPs in the same region typically experience similar multipath fading and signal shadowing effects. GEM is proposed to estimate the synthetic RP to expand the area of the convex hull in order to generate IPs using the NNI technique. By reducing the collection effort by 76% of the initial fingerprints, the C-VoNNI outperforms the Kriging interpolation and improves the localization accuracy by up to 33%. However, the limitations of the proposed method become apparent in highly dynamic environments where the initial clustering may not fully capture the rapid signal variability, leading to potential inaccuracies in the radio map. Furthermore, the computational complexity, particularly of the APC algorithm, may pose scalability challenges for larger datasets, impacting the method’s practicality for real-time applications.

In future, we plan to extend our work to include a broader range of testbeds with various layout changes and environmental dynamics. This expansion will provide a more comprehensive understanding of our proposed method’s performance and robustness across different scenarios. Additionally, we will refine our radio map construction process by eliminating heterogeneous RPs during interpolation, aiming to enhance localization precision further. We are also considering the integration of Wi-Fi with BLE technologies to improve the accuracy of IPS.