1 Introduction

Internet of Things (IoT) design facilitates visualization and representation of unprocessed data in digital format. This digital visualization provides easy access to heterogeneous devices, ranging from small sensors to large cloud systems [1, 2]. The design goal of IoT is to provide pervasive access to resources and devices in a distributed communication platform without the need for additional infrastructure or computation units. For this purpose, the common Internet platform is used by the connected devices and service providers [3, 4].

The IoT platform consists of heterogeneous devices, and the communication technologies range from Zigbee to WiMAX and wireless local area networks (WLANs) [5, 6]. This heterogeneous communication platform is interoperable across all devices that share common computation capabilities and storage [7, 8]. Resource allocation and sharing proceed with the help of content dissemination and centralized cloud servers, through infrastructure and other gateway devices [9, 10]. The fundamental process is the pervasive processing of requests by the cloud servers for allocating resources to the end-users [11, 12]. Time-constrained request processing and resource allocation improve the quality of service (QoS) and the experience of users across various applications [13].

Resource allocation in IoT is a challenging and demanding task because of its distributed nature, and the necessity for timely access. The interconnection between distributed systems through heterogeneous connectivity and distinct applications increases the demand for available resources [14]. Besides, the allocation interval and response, along with the processing time, are some other QoS constraints when determining the efficiency of an IoT-based system [15].

The primary task in an IoT-coupled resource allocation process is its support for interoperability, along with shared access and service response without delay [16]. Therefore, a resource-allocating IoT environment has to balance the available resources, the allocation time interval, and the user device requests to improve the QoS of the application [17, 18]. This aids in meeting the user requirements and in the in-time processing of requests to prevent allocation failures. The resource allocation and request processing are balanced for any number of devices to meet the application/user requirements [19, 20]. Moreover, machine learning techniques are used to select a task from the list of tasks, which helps in reducing the resource allocation complexity. A number of techniques, such as neural networks, k-nearest neighbor, support vector machines, and deep neural networks, are used to provide resources for their respective tasks. These intelligent learning techniques improve the resource allocation process by optimizing the learning process. Therefore, this study utilizes a deep learning technique to allocate resources while overcoming time and computation complexities. The main contributions of the study are as follows:

  • To allocate optimal resources for a task by applying deep learning with a scalable resource allocation framework

  • To minimize time and computation complexities while allocating resources

  • To allocate a resource by resolving its replication and overloading issues

The rest of the manuscript is organized as follows: Section 2 discusses various opinions regarding the resource allocation process. Section 3 analyzes the proposed scalable resource allocation framework (SRAF) with a resource allocation process based on deep learning, and Section 4 evaluates its efficiency. Finally, Section 5 provides the conclusions.

2 Related Works

This section discusses different opinions regarding the resource allocation process. Abedin et al. [21] introduced an effective QoS, formulated joint user association, and resolved the resource allocation problem using analytic hierarchy process (AHP)-matching. AHP-matching examines the priority of QoS requirements in heterogeneous applications. After identifying the priority, an association between an IoT device and fog infrastructure is established. This association helps in minimizing the resource allocation problem by selecting the best resource from a collection of resources. It consists of QoS requirements imposed by ultra-reliable low latency communication (URLLC) and enhanced Mobile Broadband (eMBB) services. Owing to the computation of quality constraints, the resources are allocated easily and effectively. In addition, this also maintains the reliability and scalability of resource allocation.

Mergenci and Korpeoglu [22] have discussed generic resource allocation for heterogeneous cloud infrastructure. They propose two metrics for reflecting the current state of a virtual machine. Thus, their proposed method uses a multi-dimensional resource allocation heuristic algorithm.

Nassar and Yilmaz [23] designed reinforcement learning for resource allocation in fog radio access networks (F-RANs). The limited resources are allocated to IoT applications. For each access, a fog network (FN) decides whether to serve the request from an IoT user locally at the edge by utilizing its own resources, or to refer it to the cloud and conserve its valuable resources for future users with a potentially higher utility to the system.

Efficient resource allocation for the uplink transmission of wireless IoT networks was proposed by Liu et al. [24]. In this study, an efficient channel allocation algorithm (ECAA) of low complexity was designed for user grouping. Then, a Markov decision process (MDP) model was used for unpredicted energy arrival and channel condition uncertainty from each user.

Li et al. [25] proposed an edge-cloud-assisted IoT. They designed an iterative double-sided auction scheme (DSAS) for computing resource trading. Here, the brokers solve an allocation problem and design a specific price rule for the buyers and sellers of a computing resource to truthfully submit bids. Thus, the proposed system is able to assist different tasks and provide the resources without creating any complexity. However, regular updating of price rules is difficult. Nevertheless, this DSAS system is able to manage compatibility, budget balance, and individual rationality.

Li et al. [26] introduced a fog computing node with IoT (FN-IoT) by collecting a large amount of data to make reliable offloading decisions. It transfers the data to the fog computing nodes, thereby supporting a large amount of data with low latency and limited resources. The deployment of non-orthogonal multiple access (NOMA) in an IoT network is used to transmit the data to the same FN in the same time and code domain. The NP-hardness in this process is resolved by applying an improved genetic algorithm. However, an intermediate access may change the task representation because of multiple access, creating difficulties in provisioning of resources. Duplications may occur, thereby reducing the entire system’s performance.

A data-driven resource allocation for NFV-based IoT was proposed by Tian et al. [27]. This synthetic approach is based on examining both network processing procedures and stationary users’ behaviors. Then, a matrix mapping based dynamic resource allocation mechanism is modeled for the virtualized core mobile networks.

Ramezani et al. [28] introduced a single wireless-powered relay for multiple users. An energy-constrained relay assists the information transmission from a number of IoT devices to the access point (AP) using WPC. It maximizes the total network throughput by optimizing the wireless energy transfer (WET) duration and the relay’s energy expenditure in each time slot together.

Wireless-powered IoT networks with short packet communication have been proposed by Chen et al. [29]. An effective throughput and an effective amount of information are used to manage the transmission rate and packet error rate (PER). The scheme jointly optimizes the transmission time and PER of each user to maximize the total effective throughput or minimize the total transmission time, subject to each user's information requirements.

Aazam et al. [30] addressed 5G tactile industrial fog computing. The tactile Internet has its own use-cases across a number of application domains, the industrial sector being one of the most popular among them. The objective is the quality of experience (QoE)-awareness for dynamic resource allocation in a tactile IoT application.

Dai et al. [31] presented a game-theoretic approach for QoE-driven, 5G enabled IoT. They introduced an allocation channel problem for the IoT uplink communication in a 5G network. A mean opinion score (MOS) function of transmission delay was used to measure QoE of each smart object.

Gao et al. [32] introduced the expansion of data based on wisdom architecture as an organized approach for modeling both entities and relationship elements. This approach focuses on accessing and processing resources for security protection by exploiting the cost variation in both types of resource conversions and traversals. Here, resources are allocated to a task based on their characteristics, reliability, and scalability. Therefore, this study proposes an effective, scalable, and reliable framework using deep learning techniques. The introduced techniques not only resolve the above-discussed problems but also attain the mentioned contributions. Here, references [21, 25, 26] are chosen for comparison because of their use of massive amounts of data, along with reliable resource allocations and scalable allocation processes. Based on the statistical survey, the proposed deep learning based resource allocation framework helps in reducing the waiting and processing times of the requests under a controlled response time. Besides, the optimal segregation of available resources and request density facilitates failure-less allocation. The proposed system is discussed in detail in the next section.

3 SRAF

An SRAF integrates the user requests and available resources to meet the user requirements. This framework jointly operates in the cloud and IoT layer for improving the resource allocation rate. The concise management of available resources, user requests and allocation lag is aided through deep learning. This deep learning paradigm is responsible for retaining the liveliness of requests and in-time allocation of resources. Therefore, the proposed framework is modeled in three phases: request mapping, resource allocation, and time lag optimization. Here, the interconnection between heterogeneous devices and long-range access support of resources are exploited for improving performance. Besides, the machine learning process for resource allocation and lag optimization is augmented to retain the quality of response. In the following subsections, the three phases will be explained briefly. The proposed SRAF is shown in Fig. 1.

Fig. 1
figure 1

SRAF structure

Figure 1 represents the SRAF structure for allocating resources according to user requests. The data is collected by IoT devices, aggregated by fog nodes, and stored in the cloud data center. Each user request is mapped to the relevant data by passing instructions via the data controller and wireless transmission. According to the request, resources are allocated by performing resource mapping. The SRAF-based resource allocation process is detailed in the following section.

Generally, fog nodes have an extreme virtualization feature. Each fog node may be made up of one or more devices, and therefore builds a virtual network to serve its region of coverage in accordance with the base station. Such machines may be routers, switches, gateways, or central base stations, where controls run and are managed using controllers. The IoT edge is connected to a fog computing layer, which in turn is connected to the centralized cloud computing layer; this enables optimized resource mapping and requesting of resources. This kind of relation forms a hierarchical computing architecture.

3.1 Resource Allocation Problem

The resource allocation problem is based on the joint optimization of n IoT users requesting \( \mathfrak{R} \) resources that are available with M service providers. Further, the time and type of resource requested by the users depend on the application platform. Each resource and service provider is denoted as \( \left\{{r}_1,{r}_2,\dots, {r}_n\right\}\in \mathfrak{R},m\in M \). The n IoT users accommodated and serviced at a time ts need not be the same. However, the service providers M = {1, 2, …, m} verify the availability of the resources \( \mathfrak{R}=\left\{{r}_1,{r}_2,\dots, {r}_n\right\} \) at the request time tr, which is first processed by a service provider in M to verify whether \( \left\{{r}_1,{r}_2,\dots, {r}_n\right\}:\to \left\{{t}_{s_1},{t}_{s_2},\dots, {t}_{s_n}\right\} \), that is, whether the resources {r1, r2, …, rn} are allocated at the service times \( \left\{{t}_{s_1},{t}_{s_2},\dots, {t}_{s_n}\right\} \). Let ρa and ρr denote the allocation of R to n and the total R allocated to n in ts, respectively. The allocation function for a user is given as 2R → ts ∀ R ∈ M and R :  → ts. With reference to the allocation function, the resource allocation problem is framed as follows:

$$ \underset{\rho_a,\gamma }{\max}\left({\sum}_{i=1}^n{\rho}_{r_i}\ {\rho}_{a_i}-{\sum}_{j=1}^M{\sum}_{k=1}^R{\rho}_{r_i}\times \frac{k}{j}\ {\sum}_{i=1}^n{\gamma}_i\frac{k}{j}\right) $$
(1a)
$$ {\rho}_{a_i}=\left\{\begin{array}{c}0, if\ R:\nrightarrow {t}_s\\ {}1, if\ R:\to {t}_s\end{array}\right. $$
(1b)
$$ {\sum}_{j=1}^M{\gamma}_{ij\frac{k}{j}}={r}_{i\frac{k}{j}}\ {\uprho}_{a_i},\forall i\in R,\forall k\in M $$
(1c)
$$ {\sum}_{i=1}^n{\gamma}_{ij\frac{k}{j}}\le \frac{r_i}{t_{r_i}},\forall i\in n\ and\ k\in M\ and\ j\in R $$
(1d)

In the above formulations, the requirement for resource allocation is defined, where \( \gamma =\frac{t_s}{t_r} \). In Equation (1a), the maximization of the ρa = 1 resource allocation from M to all n in ts is expected. The conditions in Equations (1b), (1c), and (1d) are designed to ensure that R is high, the resources are mapped for the appropriate tr, and the ratio of serviced requests is high. From the above problem formulation, maximizing γij and \( {r}_i/{t}_{r_i} \) for \( {\rho}_{a_i}=1 \) helps achieve optimal resource allocation. The design of the proposed framework considers the above-mentioned constraints. The framework design focuses on improving γ and \( \frac{r_i}{t_r} \) for the available R, to ensure ρr = 1 for n in time (tr − ts).
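For concreteness, the objective in Equation (1a) can be evaluated numerically for a candidate allocation. The following Python sketch is illustrative only and assumes simplified inputs: `rho_r` and `rho_a` are per-request indicator lists, `gamma` holds each request's ts/tr ratio, and `M`, `R` are the provider and resource counts; none of these names come from the original formulation.

```python
def allocation_objective(rho_r, rho_a, gamma, M, R):
    """Sketch of Eq. (1a): reward for served requests minus a
    provider/resource-weighted cost over the gamma ratios."""
    n = len(rho_r)
    # First term: sum of rho_r_i * rho_a_i over all n requests.
    reward = sum(rho_r[i] * rho_a[i] for i in range(n))
    # Second term: double sum over providers j and resources k,
    # with the weight (k/j) applied to both factors, as in Eq. (1a).
    cost = 0.0
    for j in range(1, M + 1):
        for k in range(1, R + 1):
            w = k / j
            cost += sum(rho_r[i] * w * gamma[i] * w for i in range(n))
    return reward - cost
```

With all `gamma` values zero, the objective reduces to the count of served requests, consistent with the indicator constraint ρa ∈ {0, 1} in Equation (1b).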

3.2 Request Mapping

In the request mapping phase, the service measures for the input requests at tr from n are handled and allocated to an active resource. In the request mapping process, a new request has to wait for a time ta after the existing request, where ta is the time for allocation of the resource. Similarly, the processing time (tp) of M needs to be considered when effectively determining the wait time; therefore, \( {t}_w=\frac{t_p}{s}+\left(\frac{count\ {\rho}_a}{t_a}\right) \) is the required time for the next request to be allocated to the resource. When M < n, the incoming requests are queued in the IoT buffer for assigning M with a tw. As formulated, the wait time of the new request is defined as the processing time for the existing request, and this depends on the processing speed (s) of M. Allocation failures occur if this time exceeds the waiting time of successive requests. Therefore, request mapping is performed on the basis of the best-fit M, which is identified from its allocation rate. The optimized resource of M (fM) is then defined by Equation (2):

$$ \left.\begin{array}{c}{f}_M=\frac{t_p+\frac{\rho_a}{\rho_r}}{s}\\ {} where,\\ {}s=\left(1-\frac{t_r}{t_s}\right)+\gamma \end{array}\right\} $$
(2)

In Equation (2), s, the processing speed, is based on the ratio of balanced r and \( \frac{t_r}{t_s} \). This fM is used by the deep learning process with respect to time (tp) for assigning r to an appropriate request. Assigning r to the processed request follows the satisfaction of the internal and external constraints, that is, \( {\gamma}_{ij\frac{k}{j}}\le \frac{r_i}{t_{r_i}} \) and fM having a maximum value. If these two constraints are satisfied, then the request mapping occurs sequentially. On the other hand, retaining fM is crucial, as {r} ∈ R → {ts} in all tr accepted for processing. Therefore, a change in the sequence of request mapping degrades the performance of the allocating framework. Based on conditional analysis, the external and internal conditions are handled by training fM at each sequence. Figures 2 (a) and 2 (b) present the analysis of fM with respect to the above conditions.
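Equation (2) and the two mapping constraints above can be sketched as follows. This is a minimal illustration, not the paper's implementation; the dictionary keys (`t_p`, `rho_a`, `r`, and so on) are hypothetical names for the paper's symbols.

```python
def processing_speed(t_r, t_s, gamma):
    # s = (1 - t_r/t_s) + gamma, per Eq. (2).
    return (1.0 - t_r / t_s) + gamma

def best_fit_metric(t_p, rho_a, rho_r, t_r, t_s, gamma):
    # f_M = (t_p + rho_a/rho_r) / s, per Eq. (2).
    s = processing_speed(t_r, t_s, gamma)
    return (t_p + rho_a / rho_r) / s

def select_best_fit(providers):
    """Keep providers satisfying the internal constraint
    gamma <= r / t_r (Eq. (1d)), then pick the one with max f_M."""
    feasible = [p for p in providers if p["gamma"] <= p["r"] / p["t_r"]]
    return max(feasible, key=lambda p: best_fit_metric(
        p["t_p"], p["rho_a"], p["rho_r"], p["t_r"], p["t_s"], p["gamma"]))
```

A provider with a smaller γ (and hence a smaller s) yields a larger fM for the same tp, which is what drives the ordering behavior analyzed in Fig. 2.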

Fig. 2
figure 2

(a) Analysis of fM for \( {\gamma}_{ij\frac{k}{j}}\le \frac{r_i}{t_{r_i}} \) (b) Analysis of max fM (order of γ)

As presented in Fig. 2 (a), the initial mapping of R → {tr} proceeds until \( \gamma \le \frac{r}{t_r} \). If a change in this condition arising from a prolonged ts or min {fM} is observed, then fM-based ordering of γ is facilitated. The output of Fig. 2 (a) is refined on the basis of the max {fM} condition, following which the reordering takes place.

In the reordering process, the tr with the smaller ts is mapped to R; that is, the request map is reordered, and all mapped and unmapped data are analyzed. In the successive mapping instances, R → {ts} in diminishing order of γ and fM. The outputs of the processing layers fM and γ in Figs 2 (a) and (b) are represented by Equations (3) and (4).

$$ \left.\begin{array}{c}{f}_{M_1}=\frac{\rho_{a_1}}{\rho_r}+\left[{s}_1\times \left({t}_{s_1}-{t}_{r_1}\right)\right]\\ {}{f}_{M_2}=\frac{\rho_{a_2}}{\rho_r}+\left[{s}_2\times \left({t}_{s_2}-{t}_{r_2}\right)\right]+\left(\frac{r_1}{t_{r_1}}-{\gamma}_1\right)\\ {}\begin{array}{c}\vdots \\ {}{f}_{M_n}=\frac{\rho_{a_n}}{\rho_r}+\left[{s}_n\times \left({t}_{s_n}-{t}_{r_n}\right)\right]-\left(\frac{r_n}{t_{r_n}}-{\gamma}_n\right)\end{array}\end{array}\right\} $$
(3)

In Equation (3), a variation in \( \left(\frac{r_n}{t_{r_n}}-{\gamma}_n\right) \) is observed when Equation (1d) is not satisfied. Therefore, the allocation of R and the request mapping are reordered to satisfy the condition in Equation (1d). This reordering is followed by the validation of γ for all mapped and unmapped requests. This validation is given as

$$ \left.\begin{array}{c}{\gamma}_2={\rho}_{a_2}\times \left|\left(\frac{r_1}{t_{s_1}}-\frac{r_2}{t_{s_2}}\right)\right|-\frac{1}{s_1}\\ {}\begin{array}{c}{\gamma}_3={\rho}_{a_3}\times \left|\left(\frac{r_2}{t_{s_2}}-\frac{r_3}{t_{s_3}}\right)\right|-\frac{1}{s_2}\\ {}\vdots \end{array}\\ {}{\gamma}_n={\rho}_{a_n}\times \left|\left(\frac{r_{n-1}}{t_{s_{n-1}}}-\frac{r_n}{t_{s_n}}\right)\right|-\frac{1}{s_{n-1}}\end{array}\right\} $$
(4)

The request mapping is based on the validation of γn, that is, the analysis of γn prefers {tr} ∀ R :  → {ts}, either in an ordered or unordered manner. Therefore, the mapping order that does not satisfy Equation (1d) is rolled over to the next ts, provided ts/tr ≤ γn (as per Equation (4)). If this condition is satisfied, then validation of ρa is not necessary, and tr and ts are not reordered. This ensures that the request is mapped successfully in either order to the R provided by M.
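The validation chain of Equation (4) can be sketched as a single pass over the mapped requests. This is a hedged illustration; the list arguments are hypothetical containers for the paper's per-request symbols (ρa, r, ts, s).

```python
def validate_gamma(rho_a, r, t_s, s):
    """Eq. (4): gamma_n = rho_a_n * |r_{n-1}/t_s_{n-1} - r_n/t_s_n| - 1/s_{n-1},
    evaluated for every request after the first."""
    gammas = []
    for k in range(1, len(r)):
        g = rho_a[k] * abs(r[k - 1] / t_s[k - 1] - r[k] / t_s[k]) - 1.0 / s[k - 1]
        gammas.append(g)
    return gammas
```

A mapping order whose γn falls below ts/tr would then be rolled over to the next ts, as described above.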

3.3 Resource Allocation

In this phase, we focus on the objective defined in Equation (1a) by satisfying Equations (1c) and (1d). In the request mapping process, the condition in Equation (1d) is satisfied by allocating requests to the appropriate ts and maximizing γ. Therefore, this resource allocation process focuses on the condition in Equation (1c). In a conventional process, resource allocation is performed on a first-come-first-serve basis, wherein the available resource is mapped to the request processed in tr. Equation (1c) specifies that the rate of serviced requests γ is equal to the resource mapped for the requests under the ρa = 1 constraint. This means the available requests are mapped with the allocated resource such that \( \frac{\rho_a}{\rho_r}=1 \) in time ts. The s of M is the deciding factor in handling all requests and their allocated resources. Therefore, the instant allocation process for maximizing γ is defined as

$$ f\left(\gamma \right)=\left\{\begin{array}{c}{\sum}_{i=1}^M{f}_{M_i}.\frac{1}{s_{n_i}}, if\ M\ge \frac{t_r}{t_s}\\ {}{\sum}_{i=1}^R{f}_{M_i}\frac{1}{s_{n_i}}-{\sum}_{i=1}^M\left(1-\frac{t_{r_i}}{t_{s_i}}\right), if\ M<\frac{t_r}{t_s}\ \end{array}\right. $$
(5)

In Equation (5), f(γ) denotes the maximizing function with respect to the available M and R. The case of M ≥ R can be neglected, as the available resources are sufficient for allocation, provided \( M\ge \frac{t_r}{t_s} \). Instead, if \( M<\frac{t_r}{t_s} \), the overloading of R needs to be considered. In this case, the changes in s of an M need to be verified; hence, s and \( \frac{t_r}{t_s} \) at any time instance are used for analyzing the allocation process. As given in Equation (1c), we can consider the case of \( {\sum}_{i=1}^M{\gamma}_{ij\frac{k}{j}}<{r}_{i\frac{k}{j}} \) as \( M<\frac{t_r}{t_s} \) for identifying the possible resource allocation criteria. Therefore, the allocation is determined by max {f(γ)}, for \( \frac{t_r}{t_s}>M \) or \( \frac{t_r}{t_s}>R \). When \( \frac{t_r}{t_s}>R \), the service provider is overloaded, based on its s and \( \frac{t_r}{t_s} \) rate. The process is differentiated based on s and the \( \frac{t_r}{t_s} \) rate, as illustrated in Figs 3(a) and (b), respectively.
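The piecewise definition of f(γ) in Equation (5) can be sketched directly. This is a minimal illustration under the assumption that `ratio` stands for the aggregate tr/ts; the argument names are hypothetical, not from the original work.

```python
def maximizing_function(f_M, s, t_r, t_s, M, R, ratio):
    """Sketch of Eq. (5): piecewise f(gamma) over providers/resources."""
    if M >= ratio:
        # Enough providers: sum f_M_i / s_i over the M providers.
        return sum(f_M[i] / s[i] for i in range(M))
    # Overloaded case: sum over the R resources, minus each
    # provider's slack term (1 - t_r_i / t_s_i).
    return (sum(f_M[i] / s[i] for i in range(R))
            - sum(1.0 - t_r[i] / t_s[i] for i in range(M)))
```

The second branch shrinks f(γ) as providers become more loaded, which is what triggers the overload handling discussed next.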

Fig. 3
figure 3

(a) Analysis based on s (b) Analysis based on\( \frac{t_r}{t_s} \)

When handling requests based on s, the following considerations apply:

  • \( M-\frac{t_r}{t_s} \) requests are allocated to M to achieve max f(γ).

  • If \( {t}_{s_n} \) for R is high when s is high, then (ts − tr) is reduced.

  • If \( \frac{t_r}{t_s} \) is maximized, the allocation time of the requests on the other M is considered instead.

If \( {t}_{r_n}\le \left({t}_{s_{n-1}}-{t}_{r_{n-1}}\right) \), then the M with \( \min \left\{{t}_{r_n}\right\} \) is selected for serving the \( \left(M-\frac{t_r}{t_s}\right) \) requests. The allocation proceeds with the M having the minimum \( {t}_{r_n} \), such that the remaining requests are allocated appropriate resources. Therefore, Equation (1c) can be re-written as

$$ {\sum}_{j=1}^M{\gamma}_{ij\frac{k}{j}}=\left\{\begin{array}{c}{r}_{i\frac{k}{j}}{\rho}_a+{s}_i\frac{1}{t_{r_i}},\forall M<\frac{t_r}{t_s}\ and\ i\in M\\ {}{r}_{i\frac{k}{j}}\ {\rho}_a+{s}_i.\frac{\left({t}_{s_{n-1}}-{t}_{r_{n-1}}\right)}{\min \left\{{t}_{r_n}\right\}.{t}_{s_n}},\forall \left(M-\frac{t_r}{t_s}\right)\le R\ and\ i\in M\end{array}\right. $$
(6)

From Equation (6), the objective of resource allocation in Equation (1a) can be redefined as

$$ {\displaystyle \begin{array}{c}{\max}_{\rho_{a,\gamma }}\left({\sum}_{i=1}^n{\rho}_{r_i}\ {\rho}_{a_i}-{\sum}_{j=1}^{M-\frac{t_r}{t_s}}{\sum}_{k=1}^R{\rho}_{r_i}\times \frac{k}{j}{\sum}_{i=1}^n{f}_i\left(\gamma \right)\right)+\\ {}\left({\min}_{t_{r_n}}{\sum}_{j=M-\frac{t_r}{t_s}}^{t_r/{t}_s}{\sum}_{k=1}^R{\rho}_{r_i}\times \frac{k-M}{j}{\sum}_{i=1}^n\frac{k-M}{t_{r_i}\left({t}_{s_{i-1}}-{t}_{r_{i-1}}\right)}\right)\end{array}} $$
(7)

The achievable resource allocation based on the s and \( \frac{t_r}{t_s} \) analysis differentiates the mapping of R using time and resource availability. Both factors are verified to improve the rate of γ and \( \frac{t_r}{t_s} \), irrespective of the requests in tr. This also improves the non-overloading functions of M without increasing (ts − tr) for any \( M<\frac{t_r}{t_s} \). The objective of Equation (1a) is redefined in Equation (7) for satisfying Equation (1c), where \( {\gamma}_{ij\frac{k}{j}} \) is achieved through \( {\max}_{\rho_{a,\gamma }} \) and \( {\min}_{t_{r_n}} \). Therefore, the allocation is satisfied by maximizing γ based on f(γ) for all R and \( {t}_{r_n} \), for all \( \left(M-\frac{t_r}{t_s}\right) \) requests.
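The min t_{r_n} selection rule used for overloaded requests can be sketched as a simple filter-then-minimize step. This is an illustrative sketch; the `t_r_n` key is a hypothetical name for each provider's t_{r_n}.

```python
def select_overload_provider(providers, t_s_prev, t_r_prev):
    """If t_r_n <= (t_s_{n-1} - t_r_{n-1}), choose the provider with the
    minimum t_r_n to serve the (M - t_r/t_s) surplus requests."""
    slack = t_s_prev - t_r_prev
    candidates = [p for p in providers if p["t_r_n"] <= slack]
    if not candidates:
        return None  # no provider satisfies the slack condition
    return min(candidates, key=lambda p: p["t_r_n"])
```

Returning `None` when no provider fits the slack corresponds to the case where the surplus requests must wait for the next service interval.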

3.4 Time Lag Optimization

The delay in resource allocation is a significant factor in the SRAF, as the scalability support for n user devices must not increase (ts − tr). If the allocation time of the requests increases, the change in the ordering of request mapping and the overloading of M require additional time. Therefore, (ts − tr) causes a lag in the allocation of resources for \( \left(M-\frac{t_r}{t_s}\right) \) requests. Another factor affecting the regular allocation time is f(γ), as the concentration of requests in \( \underset{t_{r_n}}{\min } \) is higher than in the remaining requests. This \( {t}_{r_n} \) must be addressed in order to prevent unnecessary wait time for the consecutive requests. Therefore, the time constraints are resolved by controlling the processing and response times of previously queued requests. Different from the objectives in Equations (1a), (1c), and (1d), the time lag for \( \left(M-\frac{t_r}{t_s}\right) \) is addressed in this phase. First, the processing and response times are estimated for their balanced validation such that tp = (tr − ts), and the instances of tp and ts are the same. This condition is validated under two constraints, namely tp = ts and ts > tp. The case of ts < tp is not feasible, as the processed request is dropped when this case occurs. Considering that the proposed resource allocation satisfies the conditions and constraints in Equations (1a)–(1d), the ts < tp condition is discarded. When tp = ts, the request processing and allocation are ideal. On the other hand, if ts > tp, then tw ≠ 0, which results in a prolonged service time/resource allocation (response) time. In order to confine the processing time of resource allocation, tw needs to be reduced. In some overloaded request-based scenarios, tw ≠ 0, but tw can be shared among the available requests to reduce ts. This time lag optimization follows the recurrent analysis of fM and γM in the preceding allocation process based on s. The consideration of \( \frac{t_r}{t_s} \) and the mapping is not necessary, as tw is relevant only for an allocated/processed request. The mapping and maximization of \( \frac{t_r}{t_s} \) are achieved through a learning-based analysis, as derived in Equation (7). The lag optimization is performed for \( \left(M-\frac{t_r}{t_s}\right) \) requests that experience tw. The validation of fM and γM based on the available M and s is considered such that

$$ \left.\begin{array}{c}\ {f}_M\left(M-\frac{t_r}{t_s}\right)=\frac{\left({t}_w+\frac{\rho_a}{\rho_r}\right)s}{\left(M-\frac{t_r}{t_s}\right)}\\ {} and\\ {}{\gamma}_M\ \left(M-\frac{t_r}{t_s}\right)=\left(\frac{M-\frac{t_r}{t_s}}{\sum {t}_p}-\frac{t_r}{t_s}\times \frac{1}{s}\right)-\frac{\rho_a}{\rho_r}\end{array}\right\} $$
(8)

In Equation (8), the modified fM and γM for \( \left(M-\frac{t_r}{t_s}\right) \) are computed, where the existing requests are mapped to s with a high \( {\gamma}_M\left(M-\frac{t_r}{t_s}\right) \). This case is valid until ts ≤ tp; when this condition is not satisfied, the M with \( \max \left\{{f}_M\left(M-\frac{t_r}{t_s}\right)\right\} \) is selected for accommodating the request. This means (tr − ts) + tw ≤ tp for \( {\gamma}_M\left(M-\frac{t_r}{t_s}\right) \) constrains ts ≤ tp; else, M is replaced based on \( \frac{\rho_a}{\rho_r} \), where ρr > ρa. This helps in allocating all resources released by ρa to the overloaded requests in time (tr − ts) + tw ≤ tp. Therefore, tw = tp − (tr − ts); when tp = ts, then tw = 2tp − tr, which is less than the \( \left(M-\frac{t_r}{t_s}\right)\times {t}_p \) or \( \left(M-\frac{t_r}{t_s}\right)\times {t}_w+\left({t}_r-{t}_s\right) \) time interval. Hence, the delay in processing is optimized by differentiating the γM and fM conditions for \( \left(M-\frac{t_r}{t_s}\right) \) requests.
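The modified metrics of Equation (8) can be sketched as a single function. This is a minimal illustration under the assumption that `ratio` stands for tr/ts and `sum_t_p` for the aggregate ∑tp; the names are hypothetical.

```python
def lag_metrics(M, ratio, t_w, rho_a, rho_r, s, sum_t_p):
    """Sketch of Eq. (8): modified f_M and gamma_M for the surplus
    (M - t_r/t_s) requests that experience a wait time t_w."""
    surplus = M - ratio
    # f_M(M - t_r/t_s) = (t_w + rho_a/rho_r) * s / (M - t_r/t_s)
    f_M = (t_w + rho_a / rho_r) * s / surplus
    # gamma_M(M - t_r/t_s) = (surplus/sum(t_p) - (t_r/t_s)/s) - rho_a/rho_r
    gamma_M = (surplus / sum_t_p - ratio / s) - rho_a / rho_r
    return f_M, gamma_M
```

A provider is replaced when its γM falls (e.g., when ρa approaches ρr), while the request is re-accommodated on the M with the maximum fM, as described above.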

4 Results and discussion

In this section, the performance of the proposed framework is discussed through suitable experiments. The experiments are carried out using the opportunistic network environment (ONE) simulator [33]. The IoT environment is created with a varying number of devices (30, 60, 90, and 120), in which the number of resource servers is fixed. Metrics such as processing time, response time, resource allocation rate, and failure probability are observed through the simulation. The number of resource servers in the experimental simulation is 10, each capable of handling multiple resources at the same time. The requests vary from 20 to 200, for 30 to 120 IoT devices. The maximum wait time of a request is set as 2.4 s. The 10 resource servers are configured with 2 × 2 GB of physical memory and 1 TB of storage. Besides, the IoT environment is supplied with a shared resource of a 1 TB multimedia application. A resource server is configured to handle and serve 40 requests at a particular time instance. To verify the consistency of the proposed framework, the observed metrics are compared with those of the existing DSAS, AHP-matching, and FN-IoT methods, which were discussed in the related work section.
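The stated simulation parameters can be collected into a configuration sketch; the key names below are illustrative, not the ONE simulator's own settings keys.

```python
# Simulation parameters as stated in the text (names are hypothetical).
SIMULATION_CONFIG = {
    "iot_devices": [30, 60, 90, 120],   # varying device counts
    "resource_servers": 10,             # fixed server count
    "requests_range": (20, 200),
    "max_wait_time_s": 2.4,
    "server_memory_gb": 2 * 2,          # 2 x 2 GB physical memory
    "server_storage_tb": 1,
    "requests_per_server": 40,          # served per time instance
}

# Aggregate service capacity at one time instance follows directly:
capacity = (SIMULATION_CONFIG["resource_servers"]
            * SIMULATION_CONFIG["requests_per_server"])
```

With 10 servers each serving 40 requests, the setup can cover 400 concurrent requests, comfortably above the maximum of 200 offered requests.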

4.1 Processing Time

Figure 4 illustrates the comparative analysis of processing time over the allocated resources. The processing time for the accepted requests is low until ρr > ρa, where the available M accepts the additional requests. On the other hand, if ρr < ρa, then resource allocation follows \( {\gamma}_{ij\frac{k}{j}}\forall \left(M-\frac{t_r}{t_s}\right) \). The conditional analysis in Figs 2(a) and (b) allocates \( \left(M-\frac{t_r}{t_s}\right) \) requests to the M with max {s} and \( \min \left\{{t}_{r_n}\right\} \). This means there is no additional wait time for the overloading requests. Besides, in the time lag optimization process, the attenuation of \( {f}_M\left(M-\frac{t_r}{t_s}\right) \) and \( {\gamma}_M\left(M-\frac{t_r}{t_s}\right) \) prevents tw > tp; in addition, (tr − ts) + tw ≤ tp is retained. Therefore, the wait time for \( \left(M-\frac{t_r}{t_s}\right) \) requests is confined within the maximum service time; hence, the processing time is retained at \( \left(M-\frac{t_r}{t_s}\right)\times {t}_p \) in the tw + (tr − ts) interval. Therefore, tp is (ts − tr + tw) under the \( \left(M-\frac{t_r}{t_s}\right) \) condition; when tw = 0, tp = (ts − tr) for the overloaded requests, which helps to reduce the processing time.

Fig. 4
figure 4

Processing Time

4.2 Response Time

The response time for varying requests and devices is compared in Figs 5 (a)–(d). The optimal response time is (tr − ts) for the requests that are processed under the condition ρa < ρr ∀ M. The response time increases if ρr is less than the number of incoming requests. Therefore, the response time is (tr − ts) + tw, where tr is determined based on the ρa = 1 condition. In the proposed framework, request mapping and resource allocation rely on the condition ρa < ρr for all incoming requests handled by the service providers. These two processes limit the resource allocation response time. Conversely, the response time for \( \left(M-\frac{t_r}{t_s}\right) \) requests is limited by reducing tw; this is done by selecting M based on max{s} and \( \min \left\{{t}_{r_n}\right\} \) such that \( {\gamma}_{ij\frac{k}{j}} \) and \( {\gamma}_M\left(M-\frac{t_r}{t_s}\right) \) jointly satisfy the resource allocation objective in Equation (7). Therefore, based on s and \( {t}_{r_n} \), the remaining requests are assigned to the M for which tr = 2tp − tw or 2ts − tw, provided the overall response time is \( \frac{s}{2}\left({t}_r-{t}_w\right) \), satisfying the maximum limit of \( \left(M-\frac{t_r}{t_s}\right)\frac{t_p}{s} \). Hence, the response time is less than \( \left(M-\frac{t_r}{t_s}\right)\times {t}_w+\left({t}_r-{t}_s\right) \).

Fig. 5
figure 5

Response Time

4.3 Resource Allocation Rate

The resource allocation rate in the proposed framework is high, depending on ρa and the s of the available M. In the request mapping process, fM and {tr} :  → {ts} based allocations are formed, where fM and γn are the balancing factors for assigning n requests to M resource servers. Different from this allocation process, \( \left(M-\frac{t_r}{t_s}\right) \) requests are mapped with the M satisfying the s and \( {t}_{r_n} \) constraints. Besides, M must also meet the fM and γM conditions (as defined in Equation (8)) modeled using s and tw. Therefore, in the resource mapping and allocation phase, \( \frac{t_r}{t_s} \) is the maximum number of requests served, which implies that the resource allocation is performed at this rate of processed requests. Similarly, in the mapping of \( \left(M-\frac{t_r}{t_s}\right) \), \( \frac{\rho_a}{\rho_r} \) is the achievable resource allocation rate. Here, ρa < ρr, as the additional requests are allocated to the M with \( \min \left\{{t}_{r_n}\right\} \). Therefore, \( \frac{t_r}{t_s} \) and \( \frac{\rho_a}{\rho_r} \) achieve the maximum resource allocation in the proposed framework (Fig. 6).

Fig. 6
figure 6

Allocation Rate

4.4 Failure Probability

The chances of failed resource allocation in the request mapping phase are low, as the \( \frac{t_r}{t_s} \) condition is satisfied for all ρa ≤ ρr of M. In order to reduce the failure probability of \( \left(M-\frac{t_r}{t_s}\right) \) requests, the selection of M is based on \( \min \left\{{t}_{r_n}\right\} \). If tw or tp exceeds (tr − ts), then the resource allocation is unsuccessful, reducing the success rate of the request. The time allocated for processing prolongs the delay for the consecutive requests. Therefore, assigning M based on γM and fM (as per Equation (8)) helps retain the concurrent processing and mapping of \( \frac{t_r}{t_s} \) requests. In particular, the (tr − ts) time of the previous requests is the tw for the new requests; that is, the time between two successive tr is tw, and hence, the processing experiences a delay. Besides, allocating \( \left(M-\frac{t_r}{t_s}\right) \) requests within the defined time interval helps reduce the failures in request processing and resource allocation (refer to Figs 7(a)–(d)). This observation holds for varying request and user densities.

Fig. 7
figure 7

Failure Probability

5 Conclusion

This paper proposes an SRAF for a user-focused IoT paradigm. The aim of this framework is to improve the quality of response for the available users through in-time resource allocation and swift request processing. Deep learning aids the concise management of available resources, user requests, and the lag in allocation. This deep learning paradigm is responsible for retaining the liveliness of requests and the in-time allocation of resources. Therefore, allocation is performed by balancing processed requests and available resources. The optimal performance and the delay in response are tuned using a time lag optimization process for the overloaded requests, based on the processing speed and time of the resource providers. The joint process flow helps improve the resource allocation rate, and reduce the processing time, response time, and failure probability. Future studies can include meta-heuristic techniques to further improve the resource allocation process.