1 Introduction

In recent years, device-to-device (D2D) communication has received special attention as a candidate technology for 5G wireless communication. D2D communication enables direct connections for discovery-based services and applications between proximal devices, improving spectrum usage, system throughput, and energy efficiency. In general, there are two possibilities for communication between two proximal devices, as presented in Fig. 1. First, if the devices are near each other, they can communicate in two ways: directly or via the base station. Second, if the devices are far apart, the cellular infrastructure is used. Therefore, device discovery is a fundamental problem in initiating D2D communication. There are two types of device discovery initiation procedure: autonomous discovery and network-assisted discovery. In an autonomous discovery procedure, a device transmits a known reference signal (beacon) without network coordination, using a randomized procedure. Discovery and communication without coordination are usually time- and energy-consuming. Therefore, the existing network assists the discovery procedure by coordinating the time and frequency resources allocated for transmitting and receiving discovery signals [1]. The outcomes are energy efficiency, efficient resource management, interference control, and mode selection based on link qualities. Network-assisted device discovery is therefore considered in this research, because it improves the device discovery process, as demonstrated in [2].

Fig. 1
figure 1

a D2D communication and cellular communication, b network-assisted device discovery for D2D communication

Many technologies for positioning and localization (discovery) have been implemented for out-band D2D communication, but fine time estimation is quite difficult in unlicensed spectrum [2,3,4,5,6,7,8]. In view of current technological innovation, device discovery for emergency devices can be categorized as RF based, inertial measurement unit (IMU) based, and hybrid [6, 9]. The primary advantage of an RF-based discovery system is that the signal travels through obstacles, so network performance is not disturbed by device motion such as walking or changes in velocity. Discovery performance can be enhanced by relaying, and the discovery process can be repeated after relaying. However, RF-based discovery requires estimates from at least three base stations (triangulation) to discover the devices and suffers from NLOS conditions, weather conditions, and the unavailability of relay devices. IMU-based device discovery is another research area; it depends on inertial and motion sensors (3D magnetometer and barometer, 3D accelerometer and gyroscope) that form the IMU. The advantages of such a discovery procedure are low cost, no additional infrastructure, discovery continuity, and suitability for indoor conditions, while the disadvantages are exponential error growth, sensitivity to velocity, and the need for an emergency incident commander. A hybrid scheme may combine RF- and IMU-based discovery, but heterogeneous decisions and area conditions make accurate discovery difficult. Therefore, the discovery algorithm performance is evaluated by taking advantage of in-band RF-based discovery.

Performance evaluation is essential for researchers, either for validating a new discovery algorithm against preceding algorithms or for selecting the existing algorithm that best fits the requirements of a given D2D application. However, because the technology is new, there is a lack of unification in the D2D field in terms of discovery algorithm assessment and comparison. Also, no standard methodology exists to evaluate algorithms via simulation and emulation, modeling, and real deployment [10]. Thus, it can be difficult to establish precisely how and under what conditions one algorithm is superior to another. In addition, it can be difficult to decide which performance measures of discovery algorithms should be examined or assessed against each other. This choice is significant for the success of the subsequent implementation, given that diverse applications have divergent requirements. Since discovery algorithms are intended to be used in real applications, confirming only their simulation performance is not sufficient. This research contends that algorithms ought to be emulated and subsequently implemented in hardware, in a D2D-enabled environment, as part of the complete trial of their performance.

In 5G cellular, the proposed discovery accuracy is on the order of centimeters (cm) [11] with latency below 10 ms. The required accuracy and latency for in-band device discovery are therefore achievable thanks to controlled propagation characteristics, robust signaling, and high-resolution ranging. Both industry and academia are focusing on high device discovery accuracy for different applications and scenarios. In this research, range-based (RF-based) device discovery is assessed, and a correlation matrix is introduced to improve the accuracy. In all cases, performance metrics are evaluated for complexity, precision, and accuracy. In this paper, performance evaluation metrics are assessed together with three primary criteria: discovery signal success ratio (DSR), average E2E latency, and RMSE. Given that devices are normally constrained in terms of lifetime and per-device computational resources, addressing these requirements leads to trade-offs in discovery algorithm performance. For instance, if boosting the discovery accuracy is the foremost need, dedicated equipment must be added to every RF-based discovery system, increasing device size, price, and weight. On the other hand, if device availability is already decided, the application requires the performance criteria to be adjusted. In this research, the range-based RSS error is analyzed using the Euclidean distance, Hamming distance, and Cosine distance. In addition, the accuracy, precision, complexity, and cost of the proposed algorithm are evaluated.

The rest of the paper is organized as follows: Sect. 2 explains the related work for device discovery algorithms. The performance evaluation methodology is discussed in Sect. 3 with metrics and parameters, and evaluation of discovery error. Section 4 explains the analysis of the range-based RSS technique with performance metric using different scenarios. Discovery error estimation (DEE) using triangulation is derived in Sect. 5, and results and discussion are explained in Sect. 6. Section 7 concludes the paper.

2 Related Work

D2D communication is broadly utilized in numerous settings to perform various monitoring tasks in the IoT. To carry out these tasks, different algorithms have been proposed to provide better precision even when the anchor device density is low [12], while in high-density areas the probability of finding an anchor device is high. Anchor devices are used today to enhance the discovery ratio, and their role is to help other unknown devices discover their location. To discover the unknown devices, range-based techniques are applied, and angle of arrival (AOA) or direction of arrival (DOA) information is required to determine the direction of unknown devices. While these techniques give high precision, additional equipment is required to find device coordinates. They also allow the discovery signal to be routed through relaying [13, 14]. In a D2D network, devices gather information about their surroundings and share their observed data with the D2D-enabled base station. The base station holds the data in a database unit, from which devices can access the data without searching the surroundings, as presented by the model in Fig. 2. Discovery algorithms are applied to minimize the discovery errors and to maximize the precision.

Fig. 2
figure 2

Discovery algorithm model

Device discovery is an important problem for D2D communication and its applications [3, 15, 16]; however, not all applications require fine time determination of discovery. For such services and applications, the discovery error introduced by a discovery algorithm may not be harmful. But in 5G cellular network standardization, accuracy and time are the main discovery parameters. Therefore, very few discovery algorithms have been proposed for device discovery based on the integration technologies proposed for 5G [11, 17]. These discovery algorithms are classified into the following groups: range based and range free, region based and connectivity based, and many more [5, 6, 8, 18]. The literature explains that different classes of algorithms introduce errors with different characteristics; therefore, different types of errors accumulate during the development of an algorithm [19, 20]. The algorithm performance is compared with traditional techniques using device density and discovery error, as explained in Table 1.

Table 1 Comparison of the proposed algorithm

Many studies have explored the different types of errors in D2D discovery [21, 22]. Most research considers the Euclidean distance between the true and estimated positions as the error metric to judge the accuracy of discovery algorithms. Direct functions of the Euclidean distance are also used to normalize the error values based on the communication range [12, 23]. However, there is no significant research that considers other metrics such as the Hamming distance, Cosine distance, precision, complexity, and cost, which are examined in this research; to the best of our knowledge, this paper is the first attempt to incorporate various other metrics to judge the accuracy of device discovery algorithms in D2D communication. We have proposed more than two algorithms [1, 2, 24] that will also be evaluated to measure their performance. Some metrics are popular and well known in other domains such as WSN, but are not recognized in D2D communication. Besides these popular metrics, novel metrics that fit D2D applications and their tolerances are suggested; therefore, this research can be considered as providing alternative metrics in the discovery domain. Hence, this study constitutes a unique and innovative contribution to the performance evaluation of device discovery algorithms.

In the extensive literature on device discovery, two performance indicators are very important, namely energy efficiency and discovery latency [25, 26]. Device discovery in single-cell, multicell, and dense areas is equally important. Energy efficiency strongly depends on the discovery latency, and discovery latency has statistical properties that may change with the application context [27]. Very few researchers have worked on discovery latency. In some applications the initial discovery latency is the only concern, while in others the final discovery latency is the main concern. While it is attractive to have high energy efficiency and low discovery latency [28], it is not hard to perceive a trade-off between the two: higher energy efficiency usually comes at the cost of higher discovery latency. Therefore, how to balance these two conflicting measurements becomes key. To shorten the discussion, a composite metric known as the power-latency product is proposed in [29], defined as the average energy consumption multiplied by the worst-case discovery latency. Another important parameter is the discovery accuracy [30], used to assess the performance of the discovery procedure and algorithm; it measures the difference between the true and estimated discovery values. Precision is another parameter, which quantifies the reproducibility of successive discovery measurements [31]. This value can be used to evaluate the robustness of the discovery algorithm, as it reveals the variation of discovery estimates over several iterations. To calculate the precision, the median discovery of random devices is first found in the first iteration; then, the Euclidean error of each estimated device with respect to the median position is computed [21].

3 Performance Evaluation Methodology

In this section, the performance of the discovery model for a multiuser orthogonal frequency-division multiple-access (OFDMA) system and of discovery using the sphere decoder-like (SDL) algorithm [1, 2] in in-band 5G networks is evaluated for both mobile and static devices. The performance metrics, simulation parameters, and simulation results are discussed. RSS and DOA are applied to discover devices in the area of interest, as presented in Fig. 3. Discovery performance can be assessed through least squares, analytical modeling, Taylor series, and the (extended) Kalman filter [30]. The algorithms were assessed under many situations with various propagation conditions for static and moving devices and under the impact of the human body. An NLOS error identification and mitigation algorithm was likewise used to enhance the ranging measurements. Performance metrics are defined to help compare device discovery algorithms. The performance of the proposed discovery algorithm under diverse situations is thus compared in view of the following measurements: precision, accuracy, and RMSE. Accuracy is expressed by the average distance error, and precision is characterized as the success probability of the estimated discovery with respect to accuracy, using traditional methods. However, these quantities alone do not give sufficient information for indoor discovery, since precision is always considered together with accuracy even though the two measurements are distinct. Therefore, precision and accuracy are presented as a cumulative distribution function (CDF) and expressed as numerical values [32].

Fig. 3
figure 3

Methodology for device discovery using the proposed algorithm for performance evaluation

3.1 Metrics and Parameters

The performance evaluation algorithm is developed in MATLAB and evaluated in terms of discovery signal success ratio (DSR), end-to-end (E2E) latency, energy consumption, and signaling overhead by varying the parameters [33]. The most important varying parameters are the number of devices and mobility (random walk and velocity), as presented in Table 2. The main metrics are:

Table 2 Simulation parameters

3.1.1 DSR (%)

It is the ratio of the discovery signals delivered to the discovery devices to the discovery signals sent by all discoverer devices, calculated as:

$$ {\text{DSR}} \% = \frac{{\sum {\text{all}} \,{\text{discovery}} \,{\text{signal}}\, {\text{received}}\, {\text{by}}\, {\text{all}}\, {\text{destination}}\, {\text{devices}} }}{{\sum {\text{all}}\, {\text{discovery }}\,{\text{signal}} \,{\text{sent}}\, {\text{by}}\, {\text{all}}\, {\text{discoverer}}\, {\text{devices}}}} $$
(1)

3.1.2 Average E2E Latency

It measures the time taken to deliver a discovery signal to the destination device, counted from the first discovery signal until it is recognized by the discovery device. It includes propagation and hop delay, NLOS, and scattering. It is computed as

$$ {\text{E}}2{\text{E }}\,{\text{Latency }} = \frac{{\mathop \sum \nolimits_{1}^{N} \left( {{\text{Time}}_{r} - {\text{Time}}_{s} } \right)}}{N} $$
(2)

where \( {\text{Time}}_{r} \) and \( {\text{Time}}_{s} \) are the reception time and the sending time, respectively, and N is the number of successful discovery signals.
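To make the two metrics concrete, the following minimal Python sketch (not the authors' MATLAB implementation) computes the DSR of Eq. (1) and the average E2E latency of Eq. (2) from hypothetical per-signal send/receive logs; the delivery ratio and delay values are illustrative assumptions.

```python
# Minimal sketch: DSR (Eq. 1) and average E2E latency (Eq. 2) from synthetic logs.
import numpy as np

rng = np.random.default_rng(0)

sent = 200                                    # discovery signals sent by all discoverer devices
received_mask = rng.random(sent) < 0.85       # assumed 85% delivery ratio (illustrative)
t_send = np.sort(rng.uniform(0.0, 10.0, sent))      # sending times [s]
t_recv = t_send + rng.uniform(0.002, 0.010, sent)   # reception times [s], 2-10 ms later

dsr_percent = 100.0 * received_mask.sum() / sent                      # Eq. (1)
e2e_latency = np.mean(t_recv[received_mask] - t_send[received_mask])  # Eq. (2), successful signals only

print(f"DSR = {dsr_percent:.1f} %, average E2E latency = {1e3 * e2e_latency:.2f} ms")
```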

3.2 Evaluation of Discovery Errors

D2D applications that rely on the discovery (position information) of devices, which can be estimated by various discovery algorithms, may unavoidably be exposed to different kinds of estimation error. How an application is influenced by errors, and how a discovery error metric reacts to them, may depend on the error characteristics [30], as described below.

3.2.1 Accuracy

Accuracy is the most important parameter for assessing the performance of a discovery algorithm. It shows the difference between the true and estimated discovery. This parameter is measured as the Euclidean error and defined as

$$ D_{\text{Accuracy}}^{2} = \left( {x_{\text{Est}} - x_{\text{Actual}} } \right)^{2} + \left( {y_{\text{Est}} - y_{\text{Actual}} } \right)^{2} $$
(3)

where \( x_{\text{Est}} , y_{\text{Est}} \) are estimated coordinates by the discovery algorithm and \( x_{\text{Actual}} , y_{\text{Actual}} \) are the true coordinates.

3.2.2 Precision

Precision quantifies the reproducibility of successive discovery measurements. This value can be utilized to evaluate the robustness of the discovery algorithm, as it reveals the variation of discovery estimates over several iterations. To calculate the precision, the median discovery of 50 random devices (25 pairs) is first found in the first iteration. After that, the Euclidean error of each estimated device with respect to the median position is computed as:

$$ D_{\text{Precision}}^{2} = \left( {x_{\text{Est}} - x_{\text{median}} } \right)^{2} + \left( {y_{\text{Est}} - y_{\text{median}} } \right)^{2} $$
(4)

3.2.3 RMSE

Unlike accuracy, the RMSE gives the error separately for the X and Y coordinates. The RMSE for each coordinate can be calculated as

$$ {\text{RMSE}}_{i}^{2} = \frac{{\sum \left( {{\text{Est}}_{i} - {\text{Actual}}_{i} } \right)^{2} }}{{{\text{Number}}\, {\text{of}}\, {\text{Estimates}}}} $$
(5)

where i denotes the axis. The combined values of \( {\text{RMSE}}_{X} \) and \( {\text{RMSE}}_{Y} \) give the net RMSE of the discovery algorithm, which can be calculated as:

$$ \left( {{\text{Net}}\, {\text{RMSE}}} \right)^{2} = {\text{RMSE}}_{X}^{2} + {\text{RMSE}}_{Y}^{2} $$
(6)
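As an illustration of Eqs. (3)-(6), the short Python sketch below computes the accuracy, precision, and per-axis/net RMSE for a set of synthetic device positions; the noise level is an assumption, and the median here is taken over the estimated positions of one iteration, as described above.

```python
# Minimal sketch of the error metrics in Eqs. (3)-(6) on synthetic positions.
import numpy as np

rng = np.random.default_rng(1)
actual = rng.uniform(0, 100, size=(50, 2))              # true (x, y) of 50 devices
estimated = actual + rng.normal(0, 2.0, size=(50, 2))   # estimates with assumed 2 m noise

# Accuracy: Euclidean error per device, Eq. (3)
accuracy = np.linalg.norm(estimated - actual, axis=1)

# Precision: Euclidean error w.r.t. the median estimated position, Eq. (4)
median_pos = np.median(estimated, axis=0)
precision = np.linalg.norm(estimated - median_pos, axis=1)

# RMSE per axis and net RMSE, Eqs. (5)-(6)
rmse_xy = np.sqrt(np.mean((estimated - actual) ** 2, axis=0))
net_rmse = np.sqrt(np.sum(rmse_xy ** 2))

print("mean accuracy [m]:", accuracy.mean())
print("mean precision [m]:", precision.mean())
print("RMSE_X, RMSE_Y, net RMSE [m]:", rmse_xy[0], rmse_xy[1], net_rmse)
```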

Therefore, it is important to compute the correct error metric to assess the error performance of alternative discovery procedures that may be used in an application. To date, unfortunately, only a simplistic error metric has been used, which relies on the Euclidean distance between a base station or anchor device position and the discovered device. To evaluate the performance of the proposed algorithm, two common metrics, the Euclidean error and the Hamming error, are used, and the results are compared with the Cosine error.

3.2.4 Euclidean Distance

It gives the shortest distance metric between true value and estimated value as shown in Fig. 4a and is calculated as:

$$ d_{\text{Euc}} = \sqrt {\left( {x_{i} - \hat{x}_{i} } \right)^{2} + \left( {y_{i} - \hat{y}_{i} } \right)^{2} } $$
(7)
$$ \mu_{\text{Euc}} = \frac{1}{N}\mathop \sum \limits_{i = 1}^{N} d_{\text{Euc}} $$
(8)

where \( x_{i} \) and \( y_{i} \) are the true coordinates and \( \hat{x}_{i} \) and \( \hat{y}_{i} \) are the estimated values of the device, N is the number of devices in the search space, and \( \mu_{\text{Euc}} \) is the average discovery error calculated by the discovery algorithm over all devices. In this metric, every device discovery and its estimate are considered in isolation from other device discoveries and their estimates. Since this metric carries no direction information about devices in the network, it does not do well in NLOS conditions. For this reason, it may not be a good metric for applications in which estimating relative discovery is more critical than estimating absolute discovery.

Fig. 4
figure 4

a Euclidean distance, b Cosine distance, c Hamming distance

3.2.5 Hamming Distance

The Hamming error is another prevalent two-dimensional metric. It measures the distance between coordinates along axes at right angles, as presented in Fig. 4c. It is defined over a finite field with elements x and y. For example, the Hamming error between the two vectors 0112 and 0122 is \( \mu_{\text{Ham}} = d\left( {0112, 0122} \right) = 1 \). It satisfies the following conditions:

$$ \mu_{\text{Ham}} = 0\,{\text{only}}\,{\text{when}}\,x = y $$
(9)
$$ \mu_{\text{Ham}} \left( {x, y} \right) = \mu_{\text{Ham}} \left( {y, x} \right) $$
(10)
$$ \mu_{\text{Ham}} \left( {x, z} \right) \le \mu_{\text{Ham}} \left( {x, y} \right) + \mu_{\text{Ham}} \left( {y, z} \right)\,{\text{when}}\,x, y, z \in F $$
(11)

Therefore,

$$ d_{\text{Ham}} \left( {x, y} \right) = \left( {\left| {\hat{x}_{i} - x_{i} } \right| + \left| {\hat{y}_{i} - y_{i} } \right|} \right) $$
(12)
$$ \mu_{\text{Ham}} = \frac{1}{N}\mathop \sum \limits_{i = 1}^{N} d_{\text{Ham}} $$
(13)

It does well even in dense areas, but due to binary constraints, its accuracy and complexity are not good.

3.2.6 Cosine Distance

It is a similarity measure that incorporates multiple values at a time instead of a single value and is used in the information retrieval domain. In the device discovery domain, consider two devices with two vectors \( \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}} {X}_{ij} \) and \( \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}} {X}^{ '}_{ij} \), the actual and estimated vectors, respectively. The Cosine value depends on the angle θ between the vectors, as presented in Fig. 4b. Unlike the Euclidean metric, it is well suited where directivity is needed [21]. The cosine value ranges between −1 and +1. From Fig. 4b, the Cosine distance between two devices is calculated as \( \frac{1 - \cos \theta }{2} \). For multiple devices in a dense area, the topology distance is computed as

$$ \cos \theta = \frac{{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}} {X}_{ij} \cdot \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}} {X}^{'}_{ij} }}{{\left| {\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}} {X}_{ij} } \right| \left| {\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}} {X}^{'}_{ij} } \right| }} $$
(14)
$$ d_{\text{Cos}} \left( {x, y} \right) = \frac{1 - \cos \theta }{2} $$
(15)
$$ \mu_{\text{Cos}} = \frac{2}{{N\left( {N - 1} \right)}}\mathop \sum \limits_{i = 1}^{N} \mathop \sum \limits_{j = i + 1}^{N} d_{\text{Cos}} $$
(16)

It performs well even in NLOS conditions due to its characteristics (14). Therefore, this error metric has been applied for the performance evaluation in this research. A concise overview of the proposed error metrics is given in Table 3.
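The following Python sketch, intended only as an illustration of Sects. 3.2.4-3.2.6, evaluates the mean Euclidean error (Eqs. 7-8), the coordinate-wise Hamming-style error of Eqs. (12)-(13), and the pairwise Cosine topology error (Eqs. 14-16) on synthetic positions; the device layout and noise level are assumptions.

```python
# Minimal sketch of the three error metrics on synthetic 2-D positions.
import numpy as np

rng = np.random.default_rng(2)
true_pos = rng.uniform(0, 100, size=(30, 2))
est_pos = true_pos + rng.normal(0, 3.0, size=(30, 2))
N = len(true_pos)

# Euclidean metric, Eqs. (7)-(8)
mu_euc = np.mean(np.linalg.norm(est_pos - true_pos, axis=1))

# Hamming-style metric: sum of absolute coordinate differences, Eqs. (12)-(13)
mu_ham = np.mean(np.abs(est_pos - true_pos).sum(axis=1))

# Cosine metric over all device pairs (i, j): compare the true and estimated
# inter-device vectors X_ij and X'_ij, Eqs. (14)-(16)
d_cos_sum = 0.0
for i in range(N):
    for j in range(i + 1, N):
        v_true = true_pos[j] - true_pos[i]
        v_est = est_pos[j] - est_pos[i]
        cos_theta = v_true @ v_est / (np.linalg.norm(v_true) * np.linalg.norm(v_est))
        d_cos_sum += (1.0 - cos_theta) / 2.0            # Eq. (15)
mu_cos = 2.0 * d_cos_sum / (N * (N - 1))                # Eq. (16)

print(f"mu_Euc = {mu_euc:.2f} m, mu_Ham = {mu_ham:.2f} m, mu_Cos = {mu_cos:.4f}")
```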

Table 3 Overview of proposed error metrics

4 Analysis of the Range-Based RSS Technique

The RSS expresses the relationship between the transmitted and received power of the discovery signal with respect to the distance from the discoverer:

$$ P_{r} \left( {\text{dB}} \right) = {\mathcal{A}} - 10a\log {{\Delta }} $$
(17)

where \( P_{r} \) is the received power, \( {{\Delta }} \) is the distance between the discoverer and the discovery device, and \( a \) is the propagation constant, which depends on the environment. In (17), the value of \( {\mathcal{A}} \) describes the association between the RSS and the distance of a discovery signal transmission. There are two RSS propagation models: the log-normal shadow fading model and the free-space model. The log-normal shadow fading model is appropriate for both indoor and outdoor environments [34] and is best suited for the discovery signal owing to its flexibility to different environmental conditions. The free-space model, on the other hand, has some advantages, such as a transmission distance much longer than the antenna size and carrier wavelength, and it is not affected as much by obstacles. The received power at a device at distance \( {{\Delta }} \) is

$$ P_{R} \left( {{\Delta }} \right) = \frac{{P_{T} {\mathcal{G}}_{T} {\mathcal{G}}_{R} \lambda^{2} }}{{(4\pi {{\Delta }})^{2} {\mathcal{L}}}} $$
(18)
$$ {\mathbb{P}}_{{\mathbb{L}}} \left( {\text{dB}} \right) = 10\log \frac{{P_{T} }}{{P_{R} }} = - 20\log \left[ {\frac{\lambda }{{4\pi {{\Delta }}}}} \right] $$
(19)

where \( {\mathcal{G}}_{T} \) and \( {\mathcal{G}}_{R} \) are the antenna gains, \( {\mathcal{L}} \) is the system loss, \( P_{T} \) is the transmitted power, and (19) is the attenuation factor. The log-normal shadowing fading model provides parameters for many different environments:

$$ {\mathbb{P}}_{{\mathbb{L}}} \left( {{\Delta }} \right)\left( {\text{dB}} \right) = \overline{{{\mathbb{P}}_{{\mathbb{L}}} }} \left( {{\Delta }} \right) + {\mathcal{X}}_{\sigma } = \overline{{{\mathbb{P}}_{{\mathbb{L}}} }} \left( {{{\Delta }}_{0} } \right) + 10a\log \left( {\frac{{{\Delta }}}{{{{\Delta }}_{0} }}} \right) + {\mathcal{X}}_{\sigma } $$
(20)

where \( {{\Delta }}_{0} \) is the reference distance, which depends on empirical values, \( a \) is the path loss exponent, which depends on the propagation characteristics, and \( {\mathcal{X}}_{\sigma } \) is a Gaussian random variable with zero mean. The model also depends on the frequency and power, so

$$ P\left( {{\Delta }} \right) = P\left( {{{\Delta }}_{0} } \right) - 10a\log \frac{{{\Delta }}}{{{{\Delta }}_{0} }} + {\mathcal{X}}_{\sigma } $$
(21)

\( P\left( {{\Delta }} \right) \) is the received power for a device located exactly at distance \( {{\Delta }} \), and \( {\mathcal{X}}_{\sigma } \sim\,N\left( {0, \sigma^{2} } \right) \). \( P\left( {{{\Delta }}_{0} } \right) \) is the free-space path loss and depends on the frequency used by the discovery signal; it is therefore considered a frequency-dependent parameter [35]. Furthermore, the path loss exponent depends on the transmission frequency, \( a \propto \text{ }f_{{\left( {\text{MHz}} \right)}} \). In RSS-based device discovery, the signal propagation parameters are computed online or off-line. Online RSS measurements consume more radio resources than off-line ones, but when devices are moving, updating the RSS is not possible. If σ and \( a \) are found precisely for a given environment, RSS-based discovery will be quite accurate. Moreover, the RSS pattern has been gathered from the experimental setup, and the \( {\text{RSS }}\left( {r_{m,n} } \right) \) database values are

$$ {{\Phi }} = \left[ {\begin{array}{*{20}c} {r_{1, 1} } & {r_{1, 2} } & \ldots & {r_{1, n} } \\ {r_{2, 1} } & {r_{2, 2} } & \ldots & {r_{2,n} } \\ \vdots & \vdots & \ddots & \vdots \\ {r_{m, 1} } & {r_{m, 2} } & \ldots & {r_{m, n} } \\ \end{array} } \right] $$
(22)

where \( {{\Phi }} \) is used as off-line and online training data. The PDF of the gathered RSS data, using the path loss model, can be written as

$$ f_{x, y} \left( {{\Phi }} \right) = \mathop \prod \limits_{i = 1}^{k} \frac{10}{{\log 10\sqrt {2\pi } \sigma_{i}^{2} {{\Phi }}}}\exp \left[ { - \frac{{{\Phi }}}{4}\left( {\log \frac{{d_{i} }}{{\hat{d}}}} \right)^{2} } \right] $$
(23)

where \( {{\Phi }} = \left[ {r_{m,1} \ldots r_{m,n} } \right] \), \( \hat{d} = d_{0} \left( {\frac{{{\text{RSS}}\left( {d_{0} } \right)}}{{P_{i} }}} \right)^{{\frac{1}{{a_{i} }}}} \)
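To illustrate the log-normal shadowing model of Eqs. (20)-(21) and the ranging error it induces, the sketch below draws RSS samples for a known distance and inverts the model back to distance estimates; the reference power, path loss exponent, and shadowing deviation are assumed values, not measured parameters.

```python
# Minimal sketch of Eqs. (20)-(21): RSS with log-normal shadowing and the
# resulting range-estimation error. All parameter values are assumptions.
import numpy as np

rng = np.random.default_rng(3)

P_d0 = -40.0      # received power at reference distance d0 [dBm] (assumed)
d0 = 1.0          # reference distance [m]
a = 3.0           # path loss exponent (environment dependent)
sigma = 4.0       # shadowing standard deviation [dB]
d_true = 50.0     # true discoverer-discovery distance [m]

# Eq. (21): received power with zero-mean Gaussian shadowing X_sigma
X_sigma = rng.normal(0.0, sigma, size=1000)
P_rx = P_d0 - 10.0 * a * np.log10(d_true / d0) + X_sigma

# Invert the model to range estimates and look at the ranging error
d_hat = d0 * 10.0 ** ((P_d0 - P_rx) / (10.0 * a))
print(f"mean range estimate = {d_hat.mean():.1f} m, RMS range error = "
      f"{np.sqrt(np.mean((d_hat - d_true) ** 2)):.1f} m")
```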

Different measurements are used to quantify the performance of discovery techniques; accuracy is not the only parameter that explains algorithm performance. Referring to the literature, the most discussed performance evaluation parameters are complexity, robustness, accuracy, scalability, coverage, and cost. They are basically associated with economic or technical requirements, for example equipment cost, limited battery power, and minimum computational complexity. The discovery accuracy (DA) is a very important requirement for a discovery system, and the mean error is considered as its performance metric. It is defined as follows:

$$ {\text{DA\% }} = \frac{1}{K}\mathop \sum \limits_{i = 1}^{K} \frac{{\left\| {\hat{x}_{i} - x_{i} } \right\|^{2} }}{{r_{i}^{2} }} $$
(24)

where K is the number of devices to be discovered, \( x_{i} \) and \( \hat{x}_{i} \) are the true and estimated positions, respectively, and \( r_{i} \) is the RSS range of the device inside the network coverage. DA is presented as a percentage and normalized by the coverage range. Cooperative and centralized discovery systems give more accurate discovery than distributed ones [1]. Clearly, the greater the accuracy, the better the discovery framework. However, there is a trade-off between discovery estimation accuracy and other features such as coverage and complexity. Therefore, a compromise between the required accuracy and the other features is needed. These features are coverage, complexity, scalability, robustness, and cost [36].
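A minimal sketch of the DA metric in Eq. (24) is given below; the device positions and the per-device RSS coverage range are synthetic assumptions.

```python
# Minimal sketch of Eq. (24): squared position error normalized by each
# device's RSS coverage range, averaged over K devices, as a percentage.
import numpy as np

rng = np.random.default_rng(4)
K = 40
true_pos = rng.uniform(0, 100, size=(K, 2))
est_pos = true_pos + rng.normal(0, 2.5, size=(K, 2))
r = np.full(K, 50.0)                    # assumed RSS coverage range per device [m]

da_percent = 100.0 * np.mean(np.sum((est_pos - true_pos) ** 2, axis=1) / r ** 2)
print(f"DA = {da_percent:.2f} %")
```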

Device discovery depends on the network coverage and the device transmission range. If the devices are out of coverage, accurate device discovery is much more difficult; cooperative coverage helps to improve the discovery range. Complexity has software, hardware, and operational components. Range-based discovery is more complex than range-free discovery due to the hardware involved, and the discovery algorithm also has computational complexity, so a centralized discovery process is preferred for its lower complexity. Scalability measures whether correct discovery is ensured when the network coverage area expands or changes; a discovery system should scale with network size, density, and dimensional space. Robustness is a measure of discovery stability even when the discovery signal is noisy or unavailable. In some cases, especially in indoor discovery, the discovery signal is blocked by obstructions and NLOS conditions, so some devices in the network could remain uncertain. The cost of a discovery system depends on software, hardware, weight, energy, and time. RSS-based device discovery does not need any extra hardware; to get better resolution, additional hardware is generally required, which significantly increases the cost of every device and also adds to its weight.

The development and evaluation cycle of a discovery algorithm comprises modeling, simulation, and validation. Each cycle characterizes and validates a specific feature of the algorithm. The algorithm is modeled based on the prescribed discovery parameters such as RSS, DOA, and AOA. After modeling, simulation validates the algorithm under specific simulated conditions and verifies its functionality. After simulated verification, the algorithm is applied to real applications. To evaluate the performance of the proposed algorithm, the worst-case scenario is considered, in which no device is discovered during the discovery interval. It is divided into three probable groups: (1) A: no device receives because every device is transmitting its discovery request (transmission state); (2) B: only one device receives while all other devices are transmitting their discovery signals; and (3) C: all the devices answer; N is the total number of devices in the dense area. The discovery interval is divided into three states: request state, offset state, and response state, as presented in Fig. 5. The request state further splits into transmit and receive, while the response time is split into observe and answer. The procedure works as presented in the flowchart of Fig. 6. When group A occurs, the transmitter and receiver patterns overlap. Therefore,

$$ P\left( A \right) = \frac{1}{2}\left( {\frac{1}{2}\left( {\frac{1}{{\left( {d\left( {t_{ \hbox{max} } } \right) - d\left( {t_{ \hbox{min} } } \right) + 1} \right) }}} \right)} \right)^{N - 1} $$
(25)

When group B occurs, some devices choose the same discovery signal on whole discovery channel,

$$ P\left( B \right) = \left[ {\mathop \sum \limits_{M = 2}^{N - 1} \left( {\left( {\begin{array}{*{20}c} N \\ M \\ \end{array} } \right)\left( {\frac{1}{2}} \right)^{N} } \right) \times \left( {\frac{1}{j}} \right)^{M - 1} } \right]^{X} $$
(26)

where M is the number of devices at the same frequency, N is the total number of devices, X is the maximum discovery time, and the minimum discoverable time is 1. Case C is the combination of cases A and B [32].
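For illustration, the small sketch below evaluates Eq. (25) for a few device counts; the discovery-window bounds d(t_max) and d(t_min) are assumed values, and Eq. (26) is not implemented here because its combinatorial term depends on deployment-specific parameters.

```python
# Minimal sketch of Eq. (25): probability of group A (no device receives
# because every device is transmitting). Window bounds are assumptions.
def prob_group_a(n_devices: int, d_t_max: int, d_t_min: int) -> float:
    window = d_t_max - d_t_min + 1
    return 0.5 * (0.5 * (1.0 / window)) ** (n_devices - 1)

for n in (5, 20, 50):
    print(n, prob_group_a(n, d_t_max=10, d_t_min=1))
```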

Fig. 5
figure 5

Discovery time divisions

Fig. 6
figure 6

Discovery signal transmission and reception with device states

5 Discovery Error Estimation (DEE) Using Triangulation

RSS measurements from more than two base stations form a triangle, as presented in Fig. 7. Triangulation enhances the discovery ratio and quality with minimum error. It is relatively simple, since it associates device discovery with the base station having the strongest RSS. One fundamental function of mobile devices is to find the base station with the strongest RSS; accordingly, this technique can be implemented without hardware enhancement of either the device or the base station. To enhance the accuracy, an algorithm for estimating and maximizing the associated parameters is needed. The algorithm calculates the device discovery together with the location error. Each base station often has a different error value due to shadow fading. With the RSS of a base station, it is possible to estimate the distance between the device and the base station, and devices can obtain their positions from several adjacent cells or base stations. This transforms the problem into the well-known triangulation positioning problem [37], which is depicted in Fig. 7. The problem can be formulated as follows:

$$ x_{m}^{2} + y_{m}^{2} = d^{2} $$
(27)
$$ \left( {x_{m} - x_{b1} } \right)^{2} + \left( {y_{m} - y_{b1} } \right)^{2} = d_{1}^{2} $$
(28)
$$ \left( {x_{m} - x_{b2} } \right)^{2} + \left( {y_{m} - y_{b2} } \right)^{2} = d_{2}^{2} $$
(29)
$$ \left( {x_{m} - x_{b3} } \right)^{2} + \left( {y_{m} - y_{b3} } \right)^{2} = d_{3}^{2} $$
(30)
Fig. 7
figure 7

Triangulation positioning

Here, \( \left( {x_{bi} , y_{bi} } \right), i \in \left[ {1, 2, 3 \ldots } \right], \) are the locations of the base stations, \( d_{i} , i \in \left[ {1, 2, 3 \ldots } \right], \) are the distances from the device to the base stations, and \( \left( {x_{m} , y_{m} } \right) \) is the location of the device. The system (27)–(30) has two unknown variables (\( x_{m} \) and \( y_{m} \)) and three equations for the three cells. If \( d_{i} \) can be obtained precisely, the solution of the equations will also be precise. The issue is that the distance \( d_{i} \) is estimated from the RSS; during transmission, the signal suffers interference and shadow fading, so \( d_{i} \) is not the exact distance between the base station and the device [38]. In this case, (28)–(30) become

$$ \left( {x_{m} - x_{b1} } \right)^{2} + \left( {y_{m} - y_{b1} } \right)^{2} = \left( {d_{1} + e_{1} \left( t \right)} \right)^{2} $$
(31)
$$ \left( {x_{m} - x_{b2} } \right)^{2} + \left( {y_{m} - y_{b2} } \right)^{2} = \left( {d_{2} + e_{2} \left( t \right)} \right)^{2} $$
(32)
$$ \left( {x_{m} - x_{b3} } \right)^{2} + \left( {y_{m} - y_{b3} } \right)^{2} = \left( {d_{3} + e_{3} \left( t \right)} \right)^{2} $$
(33)

where \( e_{i} \left( t \right),i \in \left\{ {1,2,3 \ldots } \right\}, \) is the distance error caused by shadow fading and interference, which is a function of time. Equations (31)–(33) may not have an analytical solution; however, a numerical solution can be obtained. The solution gives the shortest distance from each base station and provides the device discovery. Combining (27)–(30) and subtracting (27) from (28), (29), and (30) yields

$$ x_{b1}^{2} + y_{b1}^{2} - 2x_{m} x_{b1} - 2y_{m} y_{b1} = d_{1}^{2} - d^{2} $$
(34)
$$ x_{b2}^{2} + y_{b2}^{2} - 2x_{m} x_{b2} - 2y_{m} y_{b2} = d_{2}^{2} - d^{2} $$
(35)
$$ x_{b3}^{2} + y_{b3}^{2} - 2x_{m} x_{b3} - 2y_{m} y_{b3} = d_{3}^{2} - d^{2} $$
(36)

To solve the above equations, they are converted into matrix form for the LLME solution as

$$ \underbrace {{\left[ {\begin{array}{*{20}c} {x_{b1} } & {y_{b1} } \\ {x_{b2} } & {y_{b2} } \\ {x_{b3} } & {y_{b3} } \\ \end{array} } \right]}}_{{{\Psi }}}\underbrace {{\left[ {\begin{array}{*{20}c} {x_{m} } \\ {y_{m} } \\ \end{array} } \right]}}_{{\alpha_{i} }} = \frac{1}{2}\underbrace {{\left[ {\begin{array}{*{20}c} {d_{1}^{2} - d^{2} - k_{1}^{2} } \\ {d_{2}^{2} - d^{2} - k_{2}^{2} } \\ {d_{3}^{2} - d^{2} - k_{3}^{2} } \\ \end{array} } \right]}}_{{k_{i} }} $$
(37)
$$ \left[ {\begin{array}{*{20}c} {x_{m} } \\ {y_{m} } \\ \end{array} } \right] = \frac{1}{2}\left[ {\begin{array}{*{20}c} {x_{b1} } & {y_{b1} } \\ {x_{b2} } & {y_{b2} } \\ {x_{b3} } & {y_{b3} } \\ \end{array} } \right]^{ - 1} \left[ {\begin{array}{*{20}c} {d_{1}^{2} - d^{2} - k_{1}^{2} } \\ {d_{2}^{2} - d^{2} - k_{2}^{2} } \\ {d_{3}^{2} - d^{2} - k_{3}^{2} } \\ \end{array} } \right] $$
(38)

where \( k_{i}^{2} = x_{bi}^{2} + y_{bi}^{2} \), in which \( x_{bi} \,{\text{and}}\, y_{bi} \) are the true positions of the ith base station, and \( x_{m} \) and \( y_{m} \) are the estimated device positions. If \( k_{i} \) represents the measured range and \( d_{i} \) the true range, \( d_{i} \) can be written in terms of \( k_{i} \) as \( d_{i} = \alpha_{i} k_{i} \), where \( {{\alpha }}_{i} \) models NLOS propagation and takes values 0 < αi ≤ 1. The values of αi are limited in this way because the NLOS error is a large positive bias that makes the measured ranges greater than the true ranges. From the measured \( {\text{RSS}}\left( {r_{i} } \right) \) values,

$$ r_{i} = \alpha_{i} - 10\beta \log \left( {d_{i} } \right) + \eta_{i} $$
(39)

where β is the path loss exponent and \( \eta \sim{\text{Norm}}\left( {0, \sigma } \right) \). From the RSS \( r_{i} \) of the ith base station, the distance and angle can be calculated as

$$ d_{i} = \log^{ - 1} \left( {\frac{{\alpha_{i} + \eta_{i} - r_{i} }}{10\beta }} \right) $$
(40)
$$ \theta = \tan^{ - 1} \left( {\frac{{y_{b} - y_{m} }}{{x_{b} - x_{m} }}} \right) + \eta $$
(41)

The user position changes continuously due to mobility, so every base station must track the behavior of the device using environmental information. The device discovery procedure can therefore be performed by every base station independently, and many interesting behaviors arise when numerous base stations running the discovery procedure coordinate to identify an entering user. The availability of more base stations reduces the discovery time because the search is parallelized. From (37), \( {{\alpha }}_{i} \) can be calculated using the pseudoinverse of \( {\varvec{\Psi}} \) as follows:

$$ \hat{\alpha }_{i} = \frac{1}{2}\left( {{\varvec{\Psi}}^{{\mathbf{T}}} {\varvec{\Psi}}} \right)^{ - 1} {\varvec{\Psi}}^{{\mathbf{T}}} {\text{k}}_{\text{i}} $$
(42)
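The sketch below illustrates a least-squares (pseudoinverse) solution of the trilateration problem in this section; it uses the standard linearization obtained by subtracting the first base station's equation, which is equivalent to the system of Eqs. (34)-(36), and the base-station layout and range noise are assumptions.

```python
# Minimal sketch of RSS-based trilateration: linearize the circle equations
# and solve by least squares (pseudoinverse). Layout and noise are assumed.
import numpy as np

rng = np.random.default_rng(5)

bs = np.array([[0.0, 0.0], [500.0, 0.0], [250.0, 430.0]])   # base stations (x_bi, y_bi)
device = np.array([180.0, 140.0])                           # true device position (x_m, y_m)

d = np.linalg.norm(bs - device, axis=1)
d_meas = d + rng.normal(0.0, 10.0, size=3)                  # noisy ranges (shadowing/NLOS)

# Subtract the first base station's equation from the others to linearize:
# 2 (x_bi - x_b1) x_m + 2 (y_bi - y_b1) y_m = d_1^2 - d_i^2 + k_i^2 - k_1^2
k2 = np.sum(bs ** 2, axis=1)
A = 2.0 * (bs[1:] - bs[0])
b = d_meas[0] ** 2 - d_meas[1:] ** 2 + k2[1:] - k2[0]

est, *_ = np.linalg.lstsq(A, b, rcond=None)                 # pseudoinverse solution
print("estimated position:", est, " error [m]:", np.linalg.norm(est - device))
```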

6 Results and Discussion

There are many measures to evaluate device discovery algorithms: accuracy, precision, root-mean-square error, complexity, and robustness against attacks. Other parameters that may help to evaluate the performance of the discovery procedure are the discovery signal success ratio, average end-to-end latency, average residual energy, and signaling overhead. From Eqs. (31) to (33) and (42), the estimation error can be defined as

(43)

where is the estimation error of ith devices, \( \alpha_{i} = \left[ {x_{m} , y_{m} } \right]^{T} \), which has ith device coordinates and is excessively complex. It is appealing to calculate and investigate discovery estimation errors (DEE) as

(44)

where and . \( {\mathfrak{g}}_{\text{i}} \left( m \right) \) is the global discovery estimation error with the help of the base station. Suppose, in a dense area like stadium and shopping mall where there are a large number of devices, by applying the law of large numbers (LLN)

(45)

where \( \mathop{\longrightarrow}\limits{p \to 1} \) is the convergence probability toward one. is not only a statistical average, but it converges to any realization of \( \left( {\left( {x_{m} , y_{m} } \right), m = 1, \ldots , N} \right) \). To calculate average discovery estimation from (44),

(46)

where \( {{\Delta }}_{i} \) is the small changes and can be calculated as \( \varsigma_{1} - \varsigma_{N} , \ldots , \varsigma_{N - 1} - \varsigma_{N} \) with \( \varsigma_{N} = \hat{d}^{2}_{i - j} - d_{i - j}^{2} . \) Applying the trace on gives

(47)
(48)
$$ = \left( {\mathop \sum \limits_{i = 1}^{N - 1} {\mathcal{W}}_{jj} \left( {{{\Delta }}_{ij} } \right)^{2} + \mathop \sum \limits_{i = 1}^{N - 1} \mathop \sum \limits_{j = 1, j \ne k }^{N - 1} {\mathcal{W}}_{jk} \left( {{{\Delta }}_{ij} } \right)({{\Delta }}_{ik} )} \right) $$
(49)
$$ E\left\{ {\left( {{{\Delta }}_{ij} } \right)\left( {{{\Delta }}_{ik} } \right)} \right\} = \left\{ {\begin{array}{*{20}l} {\sigma^{2} } & {{\text{if}}\;j = k} \\ 0 & {{\text{if}}\;j \ne k} \\ \end{array} } \right. $$
(50)
(51)

Standard notions of the computational complexity of an algorithm in time and space can be used as comparative measurements of the relative cost of discovery algorithms. For instance, in dense areas, a discovery algorithm with \( {\mathcal{O}}\left( {n^{3} } \right) \) complexity takes considerably longer to converge than one with \( {\mathcal{O}}\left( {n^{2} } \right) \), and the same holds for space complexity [39]. As the number of devices increases, the required memory (RAM) grows at a corresponding rate, so the algorithm that requires less memory at a given scale may be preferable. This also motivates a trade-off between distributed and centralized algorithms; for example, the centralized approach is better for dense areas.

To evaluate the discovery algorithms, the primary step is the evaluation of the discovery error, as explained in Sect. 3. It has three basic parameters: accuracy, precision, and mean square error. Accuracy is the difference between estimated and true values, while precision revolves around the median and gives a more robust view of the discovery algorithm's accuracy. The mean square error gives the difference in the X and Y coordinates and depends on the number of estimations; if there are many estimations, as in dense areas or multicell scenarios, it gives small values. To calculate these parameters, the Cosine error metric is applied and compared with the Euclidean and Hamming error metrics. For this comparison, the simulation parameters listed in Table 2 are used, and the results are presented in Fig. 8. The Cosine error metric gives more accurate discovery than the Hamming and Euclidean error metrics. A 3D view of the Cosine error metric for discovered devices in a dense area is presented in Fig. 9.

Fig. 8
figure 8

a Cosine distance, b Hamming distance, c Euclidean distance

Fig. 9
figure 9

3D view of Cosine distance of devices in dense areas

The discovery accuracy depends on the variance and the number of devices. The variance shows how much the observations differ from each other. There are two states: probability of discovery (p) and probability of no discovery \( \left( {1 - p } \right) \). Applying the binomial distribution, the variance is \( n\,p\left( {1 - p} \right) \), where n is the number of iterations and depends on the device density. The proposed scheme is also verified by the variance, as presented in Fig. 10. The main parameter of a discovery algorithm is the discovery accuracy, which reflects the algorithm's precision and accuracy. The accuracy error (in meters) depends on the transmission power (dBm); as the power increases, the error decreases toward zero, and the proposed method reduces the RMSE by 21%, as shown in Fig. 11. Finally, the algorithm performance is assessed using the algorithm complexity. Initially the algorithm is assumed to be complex; when the Euclidean distance is applied, the complexity at 12 pairs of devices is reduced by 11%, while under the same conditions the proposed algorithm reduces it by 29%, as verified in Fig. 12.

Fig. 10
figure 10

Variance of proposed model

Fig. 11
figure 11

Error performance of the proposed discovery algorithm

Fig. 12
figure 12

Proposed algorithm complexity

7 Conclusion

In conclusion, the performance evaluation of discovery algorithms should not be undervalued by researchers. To evaluate a discovery algorithm and its performance fully, it must be verified in simulated, emulated, and realistic environments. Although simulation is cheap and the most widely used tool for algorithm evaluation, awareness of constraints such as RF communication and mobility (walk and velocity models) is required. To find the best-fitting discovery algorithm, the design and advancement of new discovery algorithms require attention to the trade-offs among the accuracy, complexity, and scalability the discovery system is required to achieve. The parameter metrics are used to describe the discovery quality and are important for the overall evaluation criteria, but most significant for accuracy evaluation. The Hamming and Euclidean error metrics are the weakest and do not always measure whether the discovery solution fits the ground truth. Therefore, an alternative metric, the Cosine error metric, is applied to measure the inter-device distance estimates, and it provides a meaningful measure for the performance evaluation of discovery algorithms. This work can be extended to IoT applications in which small sensors move randomly.