1 Introduction

Nowadays, mobile devices (e.g., tablets, smartphones, and smartwatches) have become a necessary part of our daily life as the most valuable and handy communication tools. Mobile users have accumulated rich experience with increasingly sophisticated mobile applications, such as gaming, web surfing, and navigation, that run either on the devices or on distant servers reached over wireless networks [1, 2]. Despite these improvements, mobile devices still face many challenges in resources (e.g., battery lifetime, memory, storage, and processor performance) and communications (e.g., network bandwidth and mobility) [3]. Mobile devices typically have limited computational power and computing resources compared to desktop devices. Thus, the essential, challenging issue for mobile devices is to improve battery life while running complex applications. To overcome these performance, energy, and resource limitations, the best available solution is to adopt cloud technology: the infrastructure-as-a-service (IaaS) model of the cloud offers storage, computing, and networking resources to mobile clients (tenants) to deploy instances as virtual machines (VMs) on the servers of a data center. Many mobile clients want to receive reliable services in a multi-tenant cloud, while the provider intends to maximize its revenue.

Recent work has recognized mobile cloud computing as a promising computing infrastructure, where the storage and processing of data occur outside the mobile device [1, 4]. Various cloud-integrated mobile frameworks, e.g., CloneCloud [5], ThinkAir [6], cloudlet [7], and DIET [8], have proven effective for scientific computing applications.

Fig. 1: Heterogeneous mobile cloud computing

In mobile cloud computing research, the offloading technique is gaining significant attention because it augments the computation potential of mobile devices by relocating computation to the cloud. To realize computation offloading, the elastic mobile applications of mobile clients can be partitioned into a number of tasks. Such tasks are broadly divided into two groups: offloadable and non-offloadable [33]. The non-offloadable group runs locally because of the inter-dependencies between tasks. On the other hand, the offloadable group contains independent tasks that can be selected to run on clouds. The significant advantages of the computation offloading technique are energy saving and improved battery lifetime. Figure 1 illustrates the basic mobile cloud computing model, where mobile users, along with mobile devices, can be connected to servers in multiple ways. One straightforward way is to connect the mobile devices to the Internet via Wi-Fi access points. However, due to its range limitation, a more flexible approach is to use cellular networks for long-distance connections. In cellular networks, the mobile users are connected to an LTE/3G/4G network via devices such as a Base Transceiver Station (BTS) and a Mobile Switching Center (MSC); the data is then transmitted to the Internet. This type of connection provides much higher availability than Wi-Fi because of its wide coverage. After the connection is established, the mobile applications discover a suitable cloud service and send a computational task offloading request to the cloud, and the cloud subsequently responds to the request. If the tasks are offloaded to the cloud server, their computation results are returned to the mobile devices [29]. Nevertheless, it remains challenging to minimize the offloading decision time and to achieve a mutually beneficial relationship between the tasks of multiple mobile clients and the cloud servers. Thus, the effectiveness of computation offloading is measured through four fundamental questions: whether, what, when, and where to offload.

Over the last years, different frameworks and schemes have been introduced for offloading [5,6,7,8] and [10,11,12,13,14]; such approaches focus either on battery lifetime, by reducing energy consumption, or on execution time while offloading compute-intensive tasks to a remote server. In most of the existing frameworks [5, 9], mobile devices directly send computation offloading requests to the cloud server, and the server sends the offloading decision back to the mobile devices. However, during decision-making, mobile clients have to wait for offloading decisions from the cloud, which may result in a serious waste of time without determining whether the task offloading is beneficial. Offloading of computational tasks also incurs communication cost and monetary cost for accessing cloud resources. If the total cost of task execution at a cloud server exceeds the maximum completion time limit or the budget of the mobile device, offloading failures may result. Also, most of the existing works on task offloading are centralized and consider a simple single-tenant environment, where the task of a mobile client is allocated to one cloud server. Thus, it is necessary to design an energy-, monetary cost-, and delay-aware distributed approach for a multi-tenant environment. Therefore, this work addresses the following significant issues related to the computation offloading decision: (i) How to design a framework that reduces the long waiting time during the decision-making of computational task offloading in a multi-tenant environment? (ii) How to design a distributed computation offloading strategy under multiple constraints (i.e., completion time, budget, available resources)? (iii) How to achieve a trade-off between the energy consumption and monetary cost of offloading requests? (iv) How to use game-theoretic heuristics to maintain the dual preferences between the tasks of multiple mobile clients and the servers?

To address the above issues, in this work we present a novel distributed computation offloading framework named ‘Off-Mat’ to improve the offloading decision under multiple constraints and minimize multi-tenant resource contention. Our framework’s challenges are to design an offloading strategy that effectively minimizes the monetary cost and energy consumption of offloading requests under multiple constraints, and to propose an efficient, stable matching mechanism that allocates the offloaded tasks to servers based on their preferences. Concretely, the idea of dual preferences enables the game players to express different policies through ranked preference lists satisfying their requirements, while the concept of stability is applied to address the conflicts of interest among players. We present a stable-matching-based approach to solve the computation offloading problem. More specifically, the algorithm works in two phases: the first phase reaches the offloading decision and improves the response time, and the second phase performs a many-to-one distributed stable matching to achieve a trade-off of preferences between tasks and servers. The key contributions of this work are as follows:

  • We formulate the ‘MinEMC’ problem as a 0-1 integer linear program (ILP) and theoretically prove that it is NP-hard.

  • We then provide a special case by relaxing the conditions, assuming that each task needs the same amount of resources and that each mobile device has infinite energy capacity, for which a polynomial-time optimal solution is proposed using minimum-weight bipartite graph matching.

  • To solve the general case, we propose a distributed heuristic algorithm named Off-Mat algorithm, which works in two phases: (i) Off: Offloading phase and (ii) Mat: Matching phase. We also analyze the overall complexity of the Off-Mat algorithm.

  • Finally, we demonstrate our proposed approach by numerical results and algorithmic analysis. The experimental results validate the efficacy of the proposed Off-Mat algorithm.

The rest of this work is organized as follows: Sect. 2 discusses a summary of the existing literature on computation offloading and stable matching. Section 3 presents the novel computational task offloading framework. The system model is discussed in Sect. 4. In Sect. 5, the problem formulation is provided. In Sect. 6, we design algorithms to solve the special and general cases of the MinEMC problem. We then present extensive simulations with real-world parameters in Sect. 7. Finally, we summarize the paper with future remarks in Sect. 8.

2 Related work

We have categorized the existing literature from two perspectives: the first part focuses on offloading schemes, while the other focuses on stable matching. Along with the related work, we also highlight their limitations to draw motivation for this work.

2.1 Computation offloading

To fully utilize cloud potential, the majority of research focuses on using distant cloud infrastructures with rich resources to augment the limited capabilities of mobile devices. For example, the cloudlet model [19, 20] was introduced to utilize nearby devices as cloud servers. It has attracted substantial research attention as it can deliver computing services with minimum energy consumption and delay. Several offloading schemes have been introduced in [11, 12, 21, 22] to decrease the energy consumption of devices while meeting the completion time constraint. Wang et al. [10] proposed an application offloading scheme for the delay-energy trade-off using a bipartite graph model. Barbera et al. [30] studied the energy cost, bandwidth, and data backup in a real-time system for computation offloading. Huang et al. [31] proposed a dynamic offloading scheme using Lyapunov-based optimization for energy saving while meeting the execution time of applications. To ensure energy fairness at the device end, Song et al. [13] presented an energy-efficient application offloading framework for the traffic-energy trade-off. For execution delay optimization, Xia et al. [14] presented a threshold-based model for heterogeneous networks. It offloads the application if the forecasted execution delay is smaller than the acceptable delay; otherwise, the application runs locally on the mobile device. In the work of Huang et al. [32], two approaches are presented: the first manages the offloading users, and the other estimates the execution delay for offloading decisions. These approaches followed the traditional offloading technique, where the cloud makes the decision based on the information received from mobile devices, and focused either on battery lifetime or on execution time while offloading the applications to the cloud.

To minimize the delay between mobile clients and the cloud, middleware or agent-based architectures have been introduced in the literature. To solve the task allocation problem and reduce the overall energy consumption, Nir et al. [34] proposed a task scheduler model on a centralized broker. In [35], Liu et al. presented a back-off based wireless resource scheduling scheme for a mobile agent-based architecture to further enhance the QoS features of real-time streams. In comparison to traditional offloading frameworks, the agent-based framework is advantageous in two respects: first, it can provide faster offloading decisions at the neighboring agents themselves in place of the remote cloud; second, it periodically collects resource utilization details from the cloud to accommodate the resource demands of mobile clients. Hence, it minimizes the number of offloading requests and the heavy computational workload sent directly to the cloud. Concerning energy saving in a multi-user offloading environment, Meskar et al. [36] designed a deadline-constraint-based computation offloading approach. In [37, 38], Chen et al. provided a computation-based multi-user offloading scheme to reduce processing time and energy consumption but did not consider the completion time. Liu et al. [39] presented a multi-resource allocation strategy to optimize the system throughput and service time latency. To optimize resource allocation, Kuang et al. [40] proposed a quick response framework; in this framework, the authors consider the bandwidth constraint and completion time to optimize energy saving. Haber et al. [42] modeled the optimization problem of energy efficiency and computational cost of task offloading in a multi-tier edge-based cloud architecture. They designed a branch-and-bound algorithm for finding the optimal solution; further, they proposed a low-complexity algorithm and an inflation-based approach for finding a polynomial-time solution. Fang et al. [43] designed scheduling schemes to enhance multi-tenant serving performance for real-time mobile offloading systems. They implemented a system named ATOMS for computer vision algorithms and proposed a plan-scheduling algorithm to improve delay and mitigate resource contention. Ghobaei-Arani et al. [44] presented an organized literature survey of resource management techniques for fog computing; they designed a classical taxonomy to identify cutting-edge methods and also discussed the open issues. Lakhan et al. [45] provided a microservices-based mobile cloud platform and designed an application-partitioning-based task assignment algorithm for robust execution of applications. Verma et al. [46] presented a robust architecture for multimedia applications using mobile cloud computing. Shakarami et al. [47] provided a systematic literature review on computation offloading based on game theory techniques for the mobile edge environment; they designed a classification to identify state-of-the-art techniques and also discussed the open issues. Nagasundari et al. [48] proposed a service selection scheme for a multi-user computation offloading environment; further, they exploited a hidden Markov model and fuzzy-KNN-based mobility prediction via cloudlet servers to enhance the framework. De et al. [49] provided multi-level partial and full offloading approaches using cloudlet, public, and private cloud servers.
They further analyzed the delay and power consumption and compared them with the existing offloading methods. For multiplayer online gaming in the cloud, Ghobaei-Arani et al. [50] provided an autonomous resource provisioning architecture; they designed an adaptive neuro-fuzzy inference system based prediction model to handle workload fluctuations and a fuzzy decision tree approach to determine auto-scaling decisions. Derhab et al. [51] designed a mobile cloud offloading framework for two-factor mutual authentication applications. They introduced a decision-making scheme for offloading the authentication application along with its virtual smart card, based on energy cost, the mobile device's residual energy, and security. Nir et al. [54] proposed a centralized broker-node based architecture for task scheduling in mobile cloud computing; their experimental results demonstrated that offloading with the optimization-based technique results in less energy consumption and monetary cost than offloading without optimization using the centralized scheduler. To provide incentives for fog nodes and minimize the computational cost of mobile devices, Chen et al. [55] provided a cost-effective offloading approach for the mobile-edge environment with cooperation between the remote cloud and fog nodes while considering task dependency constraints. Hassan et al. [56] presented a reinforcement-learning-based SARSA approach to solve the resource allocation issue at the edge and make optimal offloading decisions that reduce computing time delay, system cost, and energy consumption. Hassan et al. [57] proposed a deep Q-learning based code offloading strategy in the mobile edge for IoT applications; the proposed approach significantly improved computation offloading by minimizing the latency of service computing, execution time, and energy consumption. Enayet et al. [58] proposed a mobility-aware resource provisioning framework named Mobi-Het to enable remote execution of big data tasks on the mobile cloud, which promises higher efficiency in timeliness and reliability. Islam et al. [59] developed an ant-colony based mobility- and resource-aware VM migration model for the mobile cloud-based healthcare system in smart cities. Bedi et al. [60] proposed a multi-cloud storage technique for resource-constrained mobile devices to optimize the devices' resources and improve CPU usage, battery consumption, and data usage. Durga et al. [61] designed an efficient context-aware dynamic resource allocation scheme that utilizes the client's present context information to meet the performance requirements specified by the user. Saleem et al. [62] proposed a dynamic bitrate adaptation strategy using stochastic optimization for maximizing the user's QoE; they applied video assessment models and QoE feature metrics for evaluation. Elashri et al. [63] provided schemes for efficient offloading decision-making for soft and weakly hard (firm) real-time applications while ensuring task schedulability. Milan et al. [64] designed a bacterial foraging optimization based task scheduling approach for minimizing the idle time of VMs. However, the major limitation of the aforementioned literature is the absence of stability-inducing offloading under multiple constraints (i.e., completion time, budget, available resources) while minimizing both the monetary cost and energy consumption of mobile devices in the heterogeneous multi-tenant mobile cloud environment.
A detailed side-by-side comparison between the proposed approach and the existing techniques is presented in Tables 1 and 2.

Table 1 Comprehensive review of existing computation offloading approaches
Table 2 Comprehensive review of existing computation offloading approaches

2.2 Stable matching

The concept of stable matching has been widely adopted since 1962, when Gale and Shapley presented a deferred acceptance algorithm in their pioneering work on the college admission problem [25]. This type of game theory has been successfully applied in many areas. For example, Kim et al. [16] used Hospital/Residents-based stable matching to resolve the problem of cloud-supported smart TV migration. The many-to-one matching approach is similar to the college admission problem [18], where a student can be admitted to one university while a university can admit many students. Similarly, the authors of [17] adopted a college-admission-based game in which the small-scale stations and macrocells (i.e., colleges) seek to enroll the users (i.e., students) with given preferences. Other real-world applications of the matching game include assigning hostel rooms to students and matching medical interns to hospitals. Based on the notion of matching theory [23,24,25], the tasks of multiple mobile client applications and the cloud servers can be identified as the two sides of players in a many-to-one matching game. In particular, one task of a mobile application may be offloaded to one server, while one server can host many tasks of different mobile apps depending on its resource availability. However, the classical theory of stable matching cannot be directly applied to the computation offloading scenario, as the tasks have different demands for CPU, memory, bandwidth, storage, etc., and the servers have different capacity constraints. The problem becomes more complicated because of this size and demand heterogeneity. To resolve this situation, we develop new preference functions, propose a new distributed stability concept based on a deferred acceptance algorithm, and prove its convergence as well as optimality results.

Thus, in the proposed work, the matching game approach is adopted to solve the stable matching issue between the workload, i.e., the tasks of mobile clients, and the cloud servers. This work maintains the trade-off of preferences by creating an individual preference set to model each player's interest, and adopts stability rather than optimality as the solution concept.

3 Off-Mat: framework for computational task offloading

This section presents the ‘Off-Mat’ framework with underlying assumptions, components, and their interaction.

3.1 Components of framework

The mobile multi-tenant cloud environment is depicted in Fig. 1. It comprises a crowd of mobile devices, elastic mobile applications, computation-intensive tasks, access points (APs), agents, and cloud servers. The mobile clients operating mobile devices are geographically distributed across different regions and lie within the coverage of APs. Each mobile device has some tasks to be offloaded. The agents are active near the APs and connected to the cloud via a high-speed wireline network; they handle the offloading requests sent by mobile devices. In a multi-tenant cloud, the provider offers shared computing resources to multiple mobile clients. The cloud has sufficient resources (i.e., compute, storage, and networking) to execute the requests in the form of VMs, but it can support only a limited number of requests at a time. Therefore, an optimal decision-making method is required to filter unnecessary requests. The Off-Mat framework is represented in Fig. 2.

Fig. 2: The Off-Mat computation offloading architecture

In this framework, we have designed middleware for mobile devices as well as for agents. The device middleware is composed of an application partitioner, a device profiler, an offloading manager, and a local execution manager. The application partitioner is responsible for separating the dependent and independent tasks. The device profiler gathers information on device resources (i.e., energy, used resources, etc.), currently executing tasks, and network resources, i.e., bandwidth. The offloading manager filters out incompetent tasks at the device end, and the execution manager follows the decision of the offloading manager. The middleware on the agent is composed of a resource monitor, which periodically collects the servers' resource information from the cloud. The task profiler detects invalid requests based on resource constraints. The matching engine matches the offloadable tasks to cloud servers, while the remote execution manager follows the decision of the matching engine. The middleware at the agent minimizes the request delay and determines the final execution of tasks on cloud servers. Figure 3 depicts the sequence diagram of the computation offloading process.

Fig. 3: Sequence diagram of the computation offloading process

3.2 Phases of framework

The two-phase computation offloading framework works as follows:

(i) Phase I - Off (Offloading): The offloading workflow starts with the application partitioner, which partitions the application into independent and dependent tasks and sends them to the device profiler. The device profiler gathers the meta-data of the device, such as currently executing applications, type of device, tasks, network bandwidth, and available energy and resources, and sends it to the offloading manager.

Based on the available resource information, the offloading manager decides whether the mobile device can benefit from task offloading. Here it checks: (i) whether the offloading energy is less than the local energy, and (ii) whether the required bandwidth is less than the available bandwidth. If the constraints are satisfied, it sends the request to the agent; otherwise, the device profiler keeps collecting the information. The agent periodically collects the cloud server information from the resource monitor. If it receives tasks, it validates the constraints for maximum task completion time, monetary cost, and cloud resource availability. In case of failure, it filters out the useless requests and sends an offloading failure message to the device's local execution manager through its remote execution manager. To execute the above process, both the mobile device and the agent perform asynchronous procedures.

Fig. 4: Off-Mat workflow

(ii) Phase II - Mat (Matching): If the tasks are offloadable, the agent negotiates with the distant cloud to reserve resources for the given time periods and sends the request to the matching engine. The matching engine creates the preference sets for mobile client tasks and servers and applies the distributed matching. Then it sends the outcomes to the remote execution manager. The remote execution manager is mainly responsible for sending the offloadable tasks to the cloud servers and finally receiving and returning the computation results to the device's local execution manager. Hence, the agent is a critical component, as it monitors the cloud server information and filters the device requests based on different parameters. It applies the distributed algorithm to find a stable matching between the tasks of mobile clients and the servers. Thus, it performs optimal resource allocation to minimize multi-tenant resource contention. The workflow of the filtering process is shown in Fig. 4. Next, we analyze and formulate the various task models to be used in this framework.

4 System model

This section discusses the computation and monetary cost models and illustrates the matching concepts. The key notations used in this work are described in Table 3.

Table 3 Main notations with descriptions

Let us assume a set of D mobile devices, denoted by \(MD =\{D_1,D_2,...\}\). Each mobile device has several computational tasks for offloading, denoted by \(T = \{t_{i,1},t_{i,2},...,t_{i,j}\}\), where \(t_{i,j}\) refers to the \(j\)th task of mobile device i. \(r_{i,j}\) denotes the resources (i.e., memory, CPU, storage, bandwidth, etc.) required by task j of device i. Each computational task \(t_{i,j}\in \bigcup _{i=1}^DMD_i\) can run locally or be offloaded to the \(k\)th server. We represent the set of cloud servers by \(S= \{s_1,s_2,...s_k \}\) and the total amount of resources available at \(s_k\) by \(R_k\). We consider the total resources available at the cloud servers as well as at the devices to be sufficient to execute all the computational tasks of the different applications. For each task \(t_{i,j}\), if it runs locally, the energy consumption is \(e^{loc}_{i,j}\). We also assume \(E^{rem}_{i}\) is the energy remaining at the mobile end. Let us denote the energy to offload task \(t_{i,j}\) to server k as \(e^{off}_{i,j,k}\). Each server has its own policy regarding what kind of tasks to execute; for example, a server may prefer the tasks that yield the highest revenue, or it may prefer larger tasks. These policies determine how much monetary cost is incurred when offloading a computational task to the cloud, as we are using shared multi-tenant cloud resources while maintaining isolation between the offloaded tasks. We denote by \(MC_{i,j,k}\) the cost incurred by task \(t_{i,j}\) when offloaded to server k.

4.1 Computation and communication task models

Next, we derive the task models used to compute energy consumption and execution time.

Energy consumption analysis:

(i) Local energy consumption: If the task is to run locally, the local energy consumption \(e^{loc}_{i,j}\) can be computed from the power used during local execution \(P^{loc}_{i,j}\), the number of CPU cycles \(C^{loc}_{i,j}\) of task \(t_{i,j}\), and the execution speed \(S^{loc}_{i,j}\) of the local device.

$$\begin{aligned} e^{loc}_{i,j}=\dfrac{C^{loc}_{i,j} \times P^{loc}_{i,j}}{S^{loc}_{i,j}} \end{aligned}$$
(1)

(ii) Offloading energy consumption: The energy cost of offloading a task to the distant cloud can be calculated as the sum of the transmission energy, i.e., the sending energy \(e^{sent}_{i,j,k}\) and the receiving energy \(e^{rec}_{i,j,k}\), and the idle energy \(e^{idle}_{i,j}\) consumed while waiting for the results from the cloud.

$$\begin{aligned} e^{off}_{i,j,k}= & {} e^{sent}_{i,j,k}+e^{idle}_{i,j}+e^{rec}_{i,j,k} \end{aligned}$$
(2)
$$\begin{aligned} e^{off}_{i,j,k}= & {} P^{send}_{i,j,k}\times T^{send}_{i,j,k}+P^{idle}_{i,j}\times T^{idle}_{i,j}+P^{rec}_{i,j,k}\times T^{rec}_{i,j,k} \end{aligned}$$
(3)

Execution time analysis:

(i) Local execution time: To run the task locally, the local execution time can be determined as the ratio of the CPU cycles needed to execute task \(t_{i,j}\) to the execution speed \(S^{loc}_{i,j}\) of the device.

$$\begin{aligned} T^{loc}_{i,j}=\dfrac{C^{loc}_{i,j}}{S^{loc}_{i,j}} \end{aligned}$$
(4)

(ii) Remote execution time: If the task is granted execution on the distant cloud, the remote execution time is determined by the CPU time consumption at the cloud and the transmission time, i.e., the sending and receiving times.

$$\begin{aligned} T^{off}_{i,j,k}=\dfrac{B^{send}_{i,j,k}}{r^{send}_{i,j,k}}+\dfrac{C^{cloud}_{i,j,k}}{S^{cloud}_{i,j,k}}+\dfrac{B^{rec}_{i,j,k}}{r^{rec}_{i,j,k}} \end{aligned}$$
(5)

where \(B^{send}_{i,j,k}\) and \(B^{rec}_{i,j,k}\) represent the data bits uploaded and the data bits received, respectively.
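
To make the task models concrete, the following minimal Java sketch evaluates Eqs. (1)-(5) for a single task. All field and method names are illustrative (they are not from the paper's implementation), and the inputs are assumed to be given in consistent units (cycles, cycles/s, watts, bits, bits/s).

```java
// Minimal sketch of the task models in Eqs. (1)-(5); names and units are
// illustrative, not taken from the paper's implementation.
final class TaskModel {
    // Local execution parameters
    double cyclesLocal;   // C^{loc}_{i,j}, CPU cycles
    double speedLocal;    // S^{loc}_{i,j}, cycles per second
    double powerLocal;    // P^{loc}_{i,j}, watts

    // Offloading parameters
    double powerSend, timeSend;     // P^{send}, T^{send}
    double powerIdle, timeIdle;     // P^{idle}, T^{idle}
    double powerRecv, timeRecv;     // P^{rec},  T^{rec}
    double bitsSend, rateSend;      // B^{send}, r^{send}
    double bitsRecv, rateRecv;      // B^{rec},  r^{rec}
    double cyclesCloud, speedCloud; // C^{cloud}, S^{cloud}

    double localEnergy() {                       // Eq. (1)
        return cyclesLocal * powerLocal / speedLocal;
    }
    double offloadEnergy() {                     // Eqs. (2)-(3)
        return powerSend * timeSend + powerIdle * timeIdle + powerRecv * timeRecv;
    }
    double localTime() {                         // Eq. (4)
        return cyclesLocal / speedLocal;
    }
    double offloadTime() {                       // Eq. (5)
        return bitsSend / rateSend + cyclesCloud / speedCloud + bitsRecv / rateRecv;
    }
}
```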

4.2 Monetary cost model

Monetary cost analysis: Each device has an offloading budget for mobile cloud services, set by the mobile client. The monetary cost is the sum of the data transfer cost and the public cloud services cost.

(i) Cost of data transfer: The data transfer cost for offloading task \(t_{i,j}\) to server k can be expressed as:

$$\begin{aligned} MC^{trans}_{i,j,k}=\gamma (B^{send}_{i,j,k}+B^{rec}_{i,j,k}) \end{aligned}$$
(6)

where \(\gamma\) is the cost per megabyte (MB) in the network, e.g., 3G, Wi-Fi, etc.

(ii) Public cloud services cost:

The cost of public cloud services depends on the service type and its usage. For running task \(t_{i,j}\) on a cloud VM, it can be expressed as follows:

$$\begin{aligned} MC^{VM}_{i,j,k}=\beta \dfrac{C^{cloud}_{i,j,k}}{S^{cloud}_{i,j,k}} \end{aligned}$$
(7)

where \(\beta\) denotes the cost per time unit of using the cloud instance, a factor that depends on the policy of the cloud.

Hence the total monetary cost can be expressed as:

$$\begin{aligned} MC_{i,j,k}&= MC^{trans}_{i,j,k} + MC^{VM}_{i,j,k} \end{aligned}$$
(8)
$$\begin{aligned}&=\gamma (B^{send}_{i,j,k}+B^{rec}_{i,j,k})+\beta \dfrac{C^{cloud}_{i,j,k}}{S^{cloud}_{i,j,k}} \end{aligned}$$
(9)
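
As a quick illustration of Eqs. (6)-(9), the sketch below computes the total monetary cost of one offloaded task. The class name and the example rates in main are hypothetical placeholders, not the values used in our evaluation.

```java
// Total monetary cost of offloading one task to server k, per Eqs. (6)-(9).
// The rate arguments (gamma, beta) are placeholders, not the paper's values.
final class MonetaryCost {
    static double total(double sendMB, double recvMB,
                        double cloudCycles, double cloudSpeed,
                        double gammaPerMB, double betaPerTimeUnit) {
        double transfer = gammaPerMB * (sendMB + recvMB);          // Eq. (6)
        double vm = betaPerTimeUnit * (cloudCycles / cloudSpeed);  // Eq. (7)
        return transfer + vm;                                      // Eqs. (8)-(9)
    }
    public static void main(String[] args) {
        // e.g., 0.5 MB uploaded, 0.01 MB received, 1000 megacycles at 3.4 GHz
        System.out.println(total(0.5, 0.01, 1000e6, 3.4e9, 0.02, 0.84));
    }
}
```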

4.3 Matching concepts and preference functions

The matching of tasks to servers can be considered the outcome of a many-to-one matching game, where multiple tasks can be allocated to one server based on preference functions and matching constraints. In this section, we define the preliminaries needed to explain the concepts of matching theory.

Definition 1

Given the set of tasks T and the set of servers S, mathematically a matching function can be defined as \(\mu : T \cup S \Rightarrow 2^{T \cup S}\) such that:

  • \(\mu (s) \subseteq T\) such that \(|\mu (s)|\le r_s, \forall s\in S\), where \(|\mu (s)|\) represents the collective resources of all tasks that are matched to s.

  • \(\mu (t) \subseteq S\) such that \(|\mu (t)|= r_s\) or \(|\mu (t)|=0, \forall t \in T\) and \(s \in S\), where \(|\mu (t)|\) is the server resource of s that is matched to t, and \(|\mu (t)|=0\) means that task t is unassigned.

  • \(t\in \mu (s)\) if and only if \(\mu (t) = s\), \(\forall t \in T\) and \(s \in S\).

This definition describes matching as a many-to-one relation, where each cloud server is matched to a subset of tasks. The objective is to obtain an efficient and stable matching. In such a matching game, each player specifies its preferences over the other side depending on its objective in the mobile cloud computing environment.

Definition 2

A matching \(\mu\) is blocked by a pair of agents (t, s) with \(t \not \in \mu (s)\) and \(s \not \in \mu (t)\) if t prefers s to its current assignment and s prefers t to at least one task in \(\mu (s)\) (or has spare capacity to admit t). Such a pair is termed a blocking pair.

Definition 3

The obtained matching \(\mu\) is stable if (a) no blocking pair exists and (b) each of the tasks is assigned to a cloud server.
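
Definitions 2 and 3 suggest a direct stability test: search for a blocking pair. The following sketch performs this check on a small instance; the rank maps and capacity bookkeeping are illustrative data structures, not part of the Off-Mat implementation.

```java
import java.util.*;

// Checks whether (t, s) blocks matching mu (Definition 2). rankT.get(t) maps
// each server to t's rank of it (lower = preferred); rankS.get(s) does the
// same for tasks. All structures here are illustrative.
final class StabilityCheck {
    static boolean isBlockingPair(int t, int s,
                                  Map<Integer, Integer> assignedServer,    // mu(t)
                                  Map<Integer, List<Integer>> hostedTasks, // mu(s)
                                  Map<Integer, Map<Integer, Integer>> rankT,
                                  Map<Integer, Map<Integer, Integer>> rankS,
                                  Map<Integer, Integer> freeCapacity) {
        Integer current = assignedServer.get(t);
        boolean tPrefersS = current == null
                || rankT.get(t).get(s) < rankT.get(t).get(current);
        if (!tPrefersS) return false;
        if (freeCapacity.get(s) > 0) return true;          // s can simply admit t
        // Otherwise s must prefer t to some task it currently hosts.
        for (int other : hostedTasks.get(s)) {
            if (rankS.get(s).get(t) < rankS.get(s).get(other)) return true;
        }
        return false;
    }
}
```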

Theorem 1

Stable matchings always exist for a set of marriages.

Proof

This theorem can be proven through the classical deferred acceptance algorithm (DAA), known as the Gale-Shapley algorithm [25], for the stable marriage problem [27]. It applies an iterative procedure and finds a stable set of marriages. To begin the procedure, let us assume one set of players, say the men, propose to the women based on their preference lists. The procedure continues as long as there exists a man who is free and has not yet proposed to every woman on his list. Such a man proposes to the most preferred woman on his list who has not yet rejected him. If the woman is free, she holds the received proposal on a string to keep open the possibility of a better proposal; if she is already holding a proposal, she rejects the less preferred of the two. This procedure repeats until no further proposal can be made, since no man can propose to the same woman more than once. Once the last woman receives her proposal, the algorithm stops and matches each woman to the man (if any) whose proposal she is still holding on her string. The woman-oriented model operates in a similar fashion, with the roles of men and women exchanged [26].

For general settings, the marriage model can be extended to the college admissions problem [25], where each college looks for multiple students to admit and each student aspires to be matched with one college; it is the prominent many-to-one extension. The resource allocation problem in the cloud environment can be naturally cast as a stable matching problem, which resolves the conflicting interests among all of the stakeholders while achieving stability. Here, we can model mobile tasks as ‘students’ and servers as ‘colleges,’ where both wish to be matched with each other, and the preferences can be transformed into distinct policies. Due to the size heterogeneity of mobile tasks (i.e., CPU, memory, bandwidth, storage, etc.), the task allocation problem is modeled as a job-machine stable matching problem [26], where machines have heterogeneous capacities and jobs have different sizes. Each machine can contain multiple jobs, provided the total size of the jobs does not exceed its capacity. Each machine possesses a transitive preference over all the acceptable jobs whose size is smaller than its capacity; equivalently, each job possesses a transitive preference over all the acceptable machines having sufficient capacity to accommodate it. The job-machine model is a more general type of many-to-one matching, and the college admissions problem can be seen as the special case in which all jobs (students) have the same size [25, 26]. \(\square\)
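
For concreteness, below is a minimal Java sketch of the classical one-to-one deferred acceptance procedure described in this proof (proposers play the role of men/tasks, reviewers of women/servers). It assumes complete preference lists of equal length and is only the building block that Sect. 6.3 extends to the many-to-one case; it is not the paper's Algorithm 2.

```java
import java.util.*;

// Classic one-to-one deferred acceptance (Gale-Shapley) [25].
// prefP[m][k] is the k-th choice of proposer m; rankR[w][m] is reviewer w's
// rank of proposer m (lower = preferred). Lists are complete permutations.
final class DeferredAcceptance {
    static int[] match(int[][] prefP, int[][] rankR) {
        int n = prefP.length;
        int[] next = new int[n];            // next index in each proposer's list
        int[] engagedTo = new int[n];       // reviewer -> proposer, -1 if free
        Arrays.fill(engagedTo, -1);
        Deque<Integer> free = new ArrayDeque<>();
        for (int m = 0; m < n; m++) free.push(m);
        while (!free.isEmpty()) {
            int m = free.pop();
            int w = prefP[m][next[m]++];    // most-preferred not yet proposed to
            int cur = engagedTo[w];
            if (cur == -1) {
                engagedTo[w] = m;           // w holds the proposal "on a string"
            } else if (rankR[w][m] < rankR[w][cur]) {
                engagedTo[w] = m;           // w trades up and rejects cur
                free.push(cur);
            } else {
                free.push(m);               // w rejects m; m proposes again later
            }
        }
        return engagedTo;                   // stable matching, reviewer-indexed
    }
}
```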

5 Problem formulation

The computational task offloading problem is formulated over various decision variables. We define the energy to run task j of mobile device i locally as \(e^{loc}_{i,j}\) and the energy for offloading task j from mobile device i to server k as \(e^{off}_{i,j,k}\). Let us define a decision variable,

$$\begin{aligned} z_{i,j,k}= {\left\{ \begin{array}{ll} 1,&{} \text {if task } t_{i,j} \text { is offloaded to server } s_k \\ 0, &{} \text {otherwise} \end{array}\right. } \end{aligned}$$
(10)

The total energy consumed by the tasks of mobile device i can be defined as

$$\begin{aligned} \sum _{j=1}^{T}\left( 1-\sum _{k=1}^{|S|}z_{i,j,k}\right) e^{loc}_{i,j} +\sum _{j=1}^{T}\sum _{k=1}^{|S|}z_{i,j,k}e^{off}_{i,j,k} \end{aligned}$$
(11)

The total energy consumption of mobile devices can be defined as

$$\begin{aligned} \sum _{i=1}^{D}\sum _{j=1}^{T}\left( 1-\sum _{k=1}^{|S|}z_{i,j,k}\right) e^{loc}_{i,j}+\sum _{i=1}^{D}\sum _{j=1}^{T}\sum _{k=1}^{|S|}z_{i,j,k} e^{off}_{i,j,k} \end{aligned}$$
(12)

The total monetary cost incurred by task \(t_{i,j}\) can be defined as

$$\begin{aligned} \sum _{k=1}^{|S|}z_{i,j,k}MC_{i,j,k} \end{aligned}$$
(13)

The total monetary cost at the mobile devices from all of the tasks can be defined as

$$\begin{aligned} \sum _{i=1}^{D}\sum _{j=1}^{T}\sum _{k=1}^{|S|}z_{i,j,k}MC_{i,j,k} \end{aligned}$$
(14)

Now, we formulate the overall objective function of the MinEMC problem as follows:

$$\begin{aligned}&\min \sum _{i=1}^{D}\sum _{j=1}^{T}\left( 1-\sum _{k=1}^{|S|}z_{i,j,k}\right) e^{loc}_{i,j}+\sum _{i=1}^{D}\sum _{j=1}^{T}\sum _{k=1}^{|S|}z_{i,j,k} e^{off}_{i,j,k} \nonumber \\&\quad + \,\alpha \left( \sum _{i=1}^{D}\sum _{j=1}^{T}\sum _{k=1}^{|S|}z_{i,j,k}MC_{i,j,k}\right) \end{aligned}$$
(15)

subject to

$$\begin{aligned}&\sum _{k=1}^{|S|}z_{i,j,k} \le 1 ,\forall t_{i,j}\in \bigcup _{i=1}^DMD_i \end{aligned}$$
(16)
$$\begin{aligned}&\sum _{j=1}^{T}\sum _{k=1}^{|S|}z_{i,j,k}MC_{i,j,k}\le MC^{budget}_i, \forall i \in (1,2,....D) \end{aligned}$$
(17)
$$\begin{aligned}&\begin{aligned} \left( 1-\sum _{k=1}^{|S|}z_{i,j,k}\right) T^{loc}_{i,j}+\sum _{k=1}^{|S|}z_{i,j,k}T^{off}_{i,j,k}\le T^{max}_{i}\\ ,\forall i \in (1,2,....D) ,\forall j \in (1,2,....T) \end{aligned} \end{aligned}$$
(18)
$$\begin{aligned}&\sum _{j=1}^{T}\left( 1-\sum _{k=1}^{|S|}z_{i,j,k}\right) e^{loc}_{i,j}\le E^{rem}_i,\forall i \in (1,2,....D) \end{aligned}$$
(19)
$$\begin{aligned}&\sum _{i=1}^{D}\sum _{j=1}^{T}z_{i,j,k}r_{i,j}\le R_k, \forall s_k \in S \end{aligned}$$
(20)
$$\begin{aligned}&z_{i,j,k} \in \{ 0,1 \} ,\forall t_{i,j}\in \bigcup _{i=1}^DMD_i,\forall s_k \in S \end{aligned}$$
(21)

In the formulated problem, \(\alpha\) is a trade-off preference parameter, allowing the designer to weigh energy consumption and monetary cost differently. Constraint (16) ensures that each task is offloaded to at most one server. Constraint (17) guarantees that the monetary cost does not exceed the available budget. Constraint (18) states that the total completion time of a task cannot exceed the maximum completion time threshold. Constraint (19) ensures that the local energy needed to run the tasks of mobile device \(d_i\) cannot exceed the residual energy of \(d_i\). Constraint (20) ensures that the resources (i.e., CPU, memory, bandwidth, storage, etc.) \(r_{i,j}\) required by the tasks placed on server \(s_k\) do not exceed its available resources \(R_k\). Constraint (21) requires the offloading decision variable \(z_{i,j,k}\) to be 0 or 1; it also states that task \(t_{i,j}\) belongs to the task set of the mobile devices MD and that server \(s_k\) belongs to the set of servers S.
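
The formulation can be sanity-checked on small instances by evaluating the objective (15) and the constraints for a candidate assignment z. The sketch below is such a checker under the paper's notation; it is not a solver, the array layout is illustrative, and the timing and energy constraints (18)-(19) are omitted for brevity.

```java
// Evaluates objective (15) and checks constraints (16), (17), (20) for a given
// 0-1 assignment z[i][j][k] (D devices x T tasks x |S| servers). A brute-force
// checker for tiny instances, not an ILP solver.
final class MinEMC {
    static double objective(int[][][] z, double[][] eLoc, double[][][] eOff,
                            double[][][] mc, double alpha) {
        double total = 0;
        for (int i = 0; i < z.length; i++)
            for (int j = 0; j < z[i].length; j++) {
                int offloaded = 0;
                for (int k = 0; k < z[i][j].length; k++) {
                    offloaded += z[i][j][k];
                    total += z[i][j][k] * (eOff[i][j][k] + alpha * mc[i][j][k]);
                }
                total += (1 - offloaded) * eLoc[i][j];   // local energy term
            }
        return total;
    }

    static boolean feasible(int[][][] z, double[][][] mc, double[] budget,
                            double[][] rReq, double[] rCap) {
        for (int i = 0; i < z.length; i++) {
            double spent = 0;
            for (int j = 0; j < z[i].length; j++) {
                int sum = 0;
                for (int k = 0; k < z[i][j].length; k++) {
                    sum += z[i][j][k];
                    spent += z[i][j][k] * mc[i][j][k];
                }
                if (sum > 1) return false;               // constraint (16)
            }
            if (spent > budget[i]) return false;         // constraint (17)
        }
        for (int k = 0; k < rCap.length; k++) {          // constraint (20)
            double used = 0;
            for (int i = 0; i < z.length; i++)
                for (int j = 0; j < z[i].length; j++)
                    used += z[i][j][k] * rReq[i][j];
            if (used > rCap[k]) return false;
        }
        return true;    // constraints (18)-(19) omitted in this sketch
    }
}
```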

6 Proposed algorithms

First, we discuss the complexity of the derived problem and present a special case after relaxing some constraints; we then proceed to solve the general case using distributed algorithms. The proposed algorithm is an improved version of Gale and Shapley's DAA for many-to-one matching, similar to the college admission problem [16]. Finally, we discuss the complexity analysis of the proposed algorithm.

6.1 Problem complexity

To derive the complexity of the defined problem, we use the well-known multiple knapsack problem [41].

Theorem 2

The MinEMC problem is NP-hard.

Proof

Let us derive the NP-hardness of the formulated problem by assuming a special case where \(e_{i,j,k}^{off}=0\) for all i, j, k and each mobile device has ample energy to run all of its tasks locally, so that the residual energy constraint is never binding. If we ignore the offloading energy, i.e., \(e_{i,j,k}^{off}=0\), then minimizing the objective function in Equation (15) becomes equivalent (up to the constant \(\sum _{i,j}e^{loc}_{i,j}\) and ignoring the monetary cost term) to maximizing

$$\begin{aligned} \sum _{i=1}^{D}\sum _{j=1}^{T}\sum _{k=1}^{|S|}z_{i,j,k}\,e^{loc}_{i,j} \end{aligned}$$
(22)

The NP-hardness of the optimization problem can be proven through a reduction from the well-known multiple knapsack problem. Given a set I of n items, each with weight \(w_i\) and profit \(p_i\), \(i \in (1,2,....n)\), and a set of m knapsacks with capacities \(c_j\), \(j \in (1,2,....m)\), the optimization problem is to pick m disjoint subsets of items such that the overall profit of the selected items is maximized, where each subset is allocated to a knapsack whose capacity is not less than the total weight of the selected items. Analogously to the multiple knapsack problem, an instance of the decision form of the proposed optimization problem can be created as follows: given a set T of n tasks of mobile devices, where task i requires resources \(w_i\) and consumes energy \(p_i\) when run locally, \(i \in (1,2,....n)\), and a set of m cloud servers with available resources \(c_j\), \(j \in (1,2,....m)\), choose disjoint subsets of tasks, one per server, such that the overall profit of the selected tasks is maximized. More specifically, the profit maximization problem over the chosen items is equivalent to maximizing the derived objective function.

The reduction is polynomial: every instance of the multiple knapsack problem corresponds to an instance of maximizing the objective function. Due to the hardness of the multiple knapsack problem, the formulated optimization problem in Equation (22) is NP-hard. \(\square\)

6.2 Optimal solution for special case

Fig. 5: Bipartite graph modelling for computational tasks offloading

Let us assume the special case where the resources needed to run each task on the mobile devices are equal, i.e., \(r_{i,j}\) = r, and the residual energy of each mobile device is unlimited; this means that each mobile device can run all of its tasks locally without energy constraints. Under this configuration, we present a polynomial-time solution via minimum-weight bipartite matching. First, we build a bipartite graph G(\(S_1 \cup S_2\), L), as shown in Fig. 5. The vertex sets \(S_1, S_2\) of the graph and the edge set L are constructed as follows.

  • There is a vertex \(v_{i,j}\) in \(S_1\) corresponding to each task \(t_{i,j}\). That is, each task has a vertex in \(S_1\), i.e., \(S_1\) = { \(v_{i,j} |\forall t_{i,j} \in \cup ^{n}_{i=1} MD_i\) }

  • There is also a vertex \(v^{'}_{i,j}\) in \(S_2\) corresponding to each task \(t_{i,j}\). For each server k, we add |R/r| vertices to \(S_2\) and denote them by \(v^{''}_{k,k^{'}}\). That is, \(S_2\) comprises one vertex per task and |R/r| vertices per server, i.e., \(S_2\) = { \(v^{'}_{i,j} |\forall t_{i,j} \in \cup ^{n}_{i=1} MD_i\) } \(\bigcup\) { \(v^{''}_{k,k^{'}}|\forall s_k \in S,1\le k^{'} \le |R/r|\) }

  • For any two vertices \(v_{i,j} \in S_1\) and \(v^{'}_{i,j} \in S_2,\forall i,j\), a link \(( v_{i,j} , v^{'}_{i,j} )\) is added to L with weight \(w_{i,j} = e_{i,j}^{loc}\).

  • For any two vertices \(v_{i,j} \in S_1\) and \(v^{''}_{k,k^{'}} \in S_2,\forall i,j,k,k^{'}\), a link \(( v_{i,j} , v^{''}_{k,k^{'}} )\) is added to L with weight \(w_{i,j,k} = e_{i,j,k}^{off}+\alpha MC_{i,j,k}\), i.e., L = { \(( v_{i,j} , v^{'}_{i,j} ) |\forall v_{i,j} \in S_1, \forall v^{'}_{i,j} \in S_2\) } \(\bigcup\) { \(( v_{i,j} , v^{''}_{k,k^{'}} )|\forall i,j,k\) }.
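
Under the special-case assumptions, the graph of Fig. 5 can be constructed directly from the rules above. The sketch below builds the weighted edge list, flattening the task index (i, j) into a single index t for brevity; a minimum-weight matching routine (e.g., the Hungarian method) would then be run on the result. All names are illustrative.

```java
import java.util.*;

// Builds the bipartite graph G(S1 ∪ S2, L) of Sect. 6.2 as a weighted edge
// list. S1 holds one vertex per task; S2 holds one "local" vertex per task
// plus |R/r| slot vertices per server. A construction sketch only; a
// min-weight matching solver would be applied to the returned edges.
final class BipartiteBuilder {
    static final class Edge {
        final String u, v; final double w;
        Edge(String u, String v, double w) { this.u = u; this.v = v; this.w = w; }
    }

    static List<Edge> build(int tasks, int servers, int slotsPerServer,
                            double[] eLocal, double[][] eOff, double[][] mc,
                            double alpha) {
        List<Edge> edges = new ArrayList<>();
        for (int t = 0; t < tasks; t++) {
            // (v_{i,j}, v'_{i,j}): run locally, weight e^{loc}
            edges.add(new Edge("task" + t, "local" + t, eLocal[t]));
            // (v_{i,j}, v''_{k,k'}): offload to any slot of server k
            for (int k = 0; k < servers; k++)
                for (int slot = 0; slot < slotsPerServer; slot++)
                    edges.add(new Edge("task" + t, "srv" + k + "_" + slot,
                                       eOff[t][k] + alpha * mc[t][k]));
        }
        return edges;
    }
}
```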

Theorem 3

The MinEMC problem with the same resource requirement for all tasks and unbounded energy at the mobile devices can be transformed into the problem of finding a minimum-weight bipartite matching in the graph G(\(S_1 \cup S_2, L\)).

Proof

We first show that any matching in the graph G corresponds to a feasible solution of the proposed problem, i.e.,

  • if link \(( v_{i,j} , v^{'}_{i,j} )\) is included in the matching, then \(\sum _{k=1}^{|S|} z_{i,j,k}=0\) in our solution.

  • if link \(( v_{i,j} , v^{''}_{k,k^{'}} )\) is included in the matching, then task \(t_{i,j}\) is offloaded to run on server \(s_k\).

Constraints (19) and (20) are satisfied because \(\sum _{i=1}^{n}\sum _{j=1}^{T} z_{i,j,k} \le |R/r|\) and energy is unlimited under our relaxations. We now show that a feasible solution \(\{z_{i,j,k}\}\) can be transformed into a weighted matching in the graph G as follows: from constraint (16), vertex \(v_{i,j}\) can be matched by only one link, i.e., either the link \(( v_{i,j} , v^{'}_{i,j} )\) if \(\sum _{k=1}^{|S|} z_{i,j,k} =0\), or \(( v_{i,j} , v^{''}_{k,k^{'}} )\) if \(z_{i,j,k}=1\). This means at most one incident link is chosen in the matching. So a feasible solution of the proposed problem is transformed into a feasible matching.

We can now show that the weight obtained in the minimum-weight bipartite matching problem is equal to the optimal objective value. From Eq. (15) we get,

$$\begin{aligned}&\sum _{i=1}^{D}\sum _{j=1}^{T}\left( 1-\sum _{k=1}^{|S|}z_{i,j,k}\right) e^{loc}_{i,j}+\sum _{i=1}^{D}\sum _{j=1}^{T}\sum _{k=1}^{|S|}z_{i,j,k} e^{off}_{i,j,k}\nonumber \\&\quad + \alpha \left( \sum _{i=1}^{D}\sum _{j=1}^{T}\sum _{k=1}^{|S|}z_{i,j,k} MC_{i,j,k}\right) \nonumber \\&\quad =\sum _{i=1}^{D}\sum _{j=1}^{T}\sum _{k=1}^{|S|}z_{i,j,k}(e^{off}_{i,j,k} +\alpha MC_{i,j,k}) \nonumber \\&\quad +\sum _{i=1}^{D}\sum _{j=1}^{T}\left( 1-\sum _{k=1}^{|S|}z_{i,j,k}\right) e^{loc}_{i,j}. \end{aligned}$$
(23)

From our weight assignment, it can be observed that the total weight of the minimum-weight matching in the bipartite graph is equal to the optimal solution of the proposed problem.

From this, we conclude that the proposed special case can be solved optimally in polynomial time. \(\square\)

6.3 Designing algorithm for general case

Distributed algorithm for computation offloading: For the first phase, we have designed a distributed algorithm for computation offloading inspired by [28]. In the distributed version, each node can execute its computation asynchronously. The proposed algorithm consists of two procedures, MobileDevice() and Agent(), executed on each mobile device and on the agent lying in the coverage area of the mobile client, as discussed in Algorithm 1. The algorithm works in the following steps:

The MobileDevice() procedure: This procedure is performed for every mobile device.

  1. Firstly, in lines 5-7, it initializes the EnergyList and updates it with the available local energy \(e^{local}_t\) for each task of mobile device MD. The CostList is used to rank each task t at MD by its monetary cost. After that, it sums the respective ranks and creates an OffloadList of tasks at each device.

  2. It iteratively applies the constraints until the OffloadList becomes empty. In lines 10-11, it checks the offloading energy, local energy, and required bandwidth for task t. If these are satisfied, it sends an offload message for task t and waits for a reply.

  3. In lines 12-20, if the reply is accepted, it updates \(E_i^{rem}\) and removes the task from the OffloadList. If the reply is rejected, it runs the task locally on the device, updates \(E_i^{rem}\), and removes the task from the OffloadList; otherwise, it does not run the task.

  4. If the OffloadList becomes empty, a stop message is received.

The Agent() procedure: This procedure is performed by every agent lying in the coverage area of mobile device MD.

  1. Wait for messages from mobile device MD.

  2. In lines 28-32, it accepts the message received from MD and checks the monetary cost, completion time, and required resource constraints; if these are satisfied, it sends the “accepted” message, else it sends the “rejected” message to MD.

  3. After completing the procedure, a stop message is received.

Algorithm 1: Distributed algorithm for computation offloading
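
The device-side filter of Algorithm 1 reduces to two checks before any request leaves the device. The sketch below captures that logic; sendOffloadRequest and runLocally are hypothetical stubs standing in for the device-agent messaging, not calls from the actual implementation.

```java
// Device-side filter from the MobileDevice() procedure: request offloading
// only if the offloading energy and bandwidth constraints hold, else run
// locally. The commented-out calls are hypothetical stubs.
final class OffloadFilter {
    static boolean shouldRequestOffload(double eOff, double eLocal,
                                        double bwRequired, double bwAvailable) {
        return eOff < eLocal && bwRequired < bwAvailable;
    }
    static void handle(double eOff, double eLocal,
                       double bwRequired, double bwAvailable) {
        if (shouldRequestOffload(eOff, eLocal, bwRequired, bwAvailable)) {
            // sendOffloadRequest(task);  // agent then checks cost/time/resources
        } else {
            // runLocally(task);          // update E^{rem}_i afterwards
        }
    }
}
```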

Distributed algorithm for stable matching: Our objective is to create the preference sets and to generate a stable matching between tasks of different sizes, as VMs, and cloud servers. We do so with the help of policies at both the server and the mobile end. As the model under consideration is multi-tenant, the tasks share the resources of a cloud instance, so we have to ensure that isolation is achieved. This can be done with the help of defined policies that give preference lists for each of the two sets. Let us identify the two sets used in the matching, whose solution ensures that both energy and monetary cost are minimized at the mobile end.

Set of servers As defined in the model, we have k servers labeled from 1 to k. Here we add a special server \(s_0\), which helps us in the matching. This \(s_0\) has no policy preferring one task over another, i.e., all tasks are preferred equally, and it has an infinite quota, which means this special server accepts all the tasks that propose to it. The special server \(s_0\) is defined so that all the tasks matched to it are executed locally. All the other servers have their own policies, which they follow to express their preferences over the tasks, along with their quota limitations.

Set of tasks As defined in the model, we have a set of tasks from each of the mobile devices, labeled \(t_{i,j}\). The policy of these tasks is to decrease the overall energy consumption while also reducing the total monetary cost.

The preference sets for tasks and servers can be defined as follows:

(i) Servers preference list function The multi-tenant cloud provider generally aims to consolidate the mobile client workload onto a minimum number of highly occupied servers so that the idle servers can be switched off to minimize the operational cost and maximize the revenue. Each server can accommodate multiple VMs up to a quota \(q_{max}\) on the maximum number of VMs. The servers create their preferences using a function \(P_S(s)\), based on the policies they employ, where s denotes the server. The preference sets for servers with different policies can be defined as:

policy 0 Server \(s_0\) treats all tasks as equal, irrespective of the incentive or the size of the tasks, which means \(s_0\) prefers all tasks equally.

policy 1 Some servers between 1 and k might choose to follow this revenue-maximizing policy, which means they choose the tasks that give them the maximum monetary incentives.

$$\begin{aligned} P_S(s) = \xi (Incentive) \end{aligned}$$
(24)

where \(\textit{Incentive}\) is a monetary benefit for providing the services.

policy 2 Some servers between 1 and k might choose to follow this policy of choosing the maximum-size tasks first, which means they choose the tasks with big sizes. The rationale is that large tasks might execute for a long time, earning more incentives while decreasing the maintenance costs.

$$\begin{aligned} P_S(s) = \xi (task size) \end{aligned}$$
(25)

where \(\textit{task size}\) is the size of the tasks under consideration.

In this way, there can be a large variety of policies, depending on CPU, RAM, and memory, which the servers can employ depending on their situation in order to maximize their incentives from serving the multi-tenant model.

The server always prefers to match with the tasks providing higher \(P_S(s)\).

(ii) Tasks preference list function From the perspective of the mobile clients and the resource demands of the tasks, the tasks create their preferences based on a function \(P_T(t)\), where t denotes the task. Each task can be assigned to one server. The tasks have a single policy, which is to minimize their monetary cost and total energy.

Mathematically, the preference function for tasks can be defined as follows. For server \(s_0\):

$$\begin{aligned} P_T(t) = e^{loc}_{i,j} \end{aligned}$$
(26)

For servers 1 to k:

$$\begin{aligned} P_T(t) = e^{off}_{i,j,k}+\alpha MC_{i,j,k} \end{aligned}$$
(27)

After this, each task has a preference list that sorts all the servers from 0 to k in ascending order of \(P_T(t)\).

The tasks always prefer to match with the server providing lower \(P_T(t)\), since \(P_T(t)\) measures cost.
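
Concretely, a task's preference list is obtained by scoring \(s_0\) with \(e^{loc}_{i,j}\) and each server 1 to k with \(e^{off}_{i,j,k}+\alpha MC_{i,j,k}\), then sorting by ascending score. A minimal sketch, with illustrative array inputs for a single task:

```java
import java.util.*;

// Builds one task's preference list over servers per Sect. 6.3: server 0
// (local execution) is scored by e^{loc}, servers 1..k by e^{off}+alpha*MC,
// and servers are sorted by ascending score (lower cost = more preferred).
// Inputs are illustrative, not the paper's data structures.
final class PreferenceLists {
    static int[] taskPreferences(double eLocal, double[] eOff, double[] mc,
                                 double alpha) {
        int k = eOff.length;
        double[] score = new double[k + 1];
        score[0] = eLocal;                          // special server s_0
        for (int s = 1; s <= k; s++)
            score[s] = eOff[s - 1] + alpha * mc[s - 1];
        Integer[] order = new Integer[k + 1];
        for (int s = 0; s <= k; s++) order[s] = s;
        Arrays.sort(order, Comparator.comparingDouble(s -> score[s]));
        int[] prefs = new int[k + 1];               // most preferred first
        for (int s = 0; s <= k; s++) prefs[s] = order[s];
        return prefs;
    }
}
```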

Now we discuss our proposed algorithm, shown in Algorithm 2, which is inspired by the Gale-Shapley DAA [25] for many-to-one stable matching.

Algorithm 2: Distributed algorithm for stable matching

We execute this algorithm on each agent. It works as follows. In Phase 1.i, all servers and tasks exchange information and are marked unengaged. In Phase 1.ii, the tasks compute their preference lists using the function \(P_T(t)\). In Phase 1.iii, every server computes its preferences using the function \(P_S(s)\). The matching algorithm then proceeds in rounds, in the course of which tasks send proposals, servers reply with counter-proposals, and tasks either reject or accept the counter-proposals (Phase 2.i to Phase 2.viii). Each server that collects a new proposal can reassess its choices and is consequently marked unengaged (Phase 2.i). W(s) contains the list of tasks that have proposed at least once to server s. A dynamic list D(s) is re-initialized to W(s) before any counter-proposal is issued (Phase 2.ii). In each round, every unengaged task proposes to its most favorable server to which it has not yet proposed (Phase 2.i). Each server collecting proposals adds the proposers to its progressive proposer list and re-initializes its dynamic list (Phase 2.ii). Using the dynamic list, it searches for its most favorable feasible set of tasks and issues a counter-proposal to those tasks (Phase 2.iii). Each task compares the received counter-proposals with the preference list of the servers it has not yet proposed to (Phase 2.iv). If one of those servers is more preferred than the most desirable counter-proposal, the task rejects the counter-proposals and carries on proposing (Phase 2.iv, Phase 2.v); else, the task accepts its most favorable counter-proposal (Phase 2.iv). If every task in a counter-proposed set accepts it, the tasks become engaged with the server; any previous engagements of those tasks and servers are broken, and the corresponding players are marked unengaged (Phase 2.v). If at least one task declines, the server is set to be unengaged (Phase 2.v), and its dynamic list is updated by eliminating the tasks that rejected its counter-proposal and are currently engaged with some other server (Phase 2.vi). The counter-proposals continue until no server can issue any new counter-proposal (Phase 2.vii). The ongoing round then stops, and the algorithm enters a new round (Phase 2.viii). The algorithm terminates when no additional task can be rejected. Hence, the outcome is a stable matching.
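
Algorithm 2 builds on the deferred acceptance skeleton. As a point of comparison, the sketch below shows a plain capacitated (hospital-residents style) variant in which tasks propose and each server keeps its best proposals up to a quota. It deliberately omits the counter-proposal and dynamic-list machinery of Algorithm 2 and assumes unit-size tasks with distinct ranks.

```java
import java.util.*;

// Simplified capacitated deferred acceptance: tasks propose in preference
// order; each server holds its best proposers up to quota and rejects the
// rest. NOT the counter-proposal mechanism of Algorithm 2; heterogeneous
// task sizes are ignored. rank[s] must be a permutation (distinct ranks).
final class ManyToOneDA {
    /** taskPref[t] = servers in t's order; rank[s][t] = s's rank of task t. */
    static int[] match(int[][] taskPref, int[][] rank, int[] quota) {
        int nT = taskPref.length;
        int[] next = new int[nT];
        int[] assigned = new int[nT];
        Arrays.fill(assigned, -1);
        List<TreeSet<Integer>> held = new ArrayList<>();
        for (int s = 0; s < rank.length; s++) {
            final int srv = s;
            held.add(new TreeSet<>(Comparator.comparingInt(t -> rank[srv][t])));
        }
        Deque<Integer> free = new ArrayDeque<>();
        for (int t = 0; t < nT; t++) free.push(t);
        while (!free.isEmpty()) {
            int t = free.pop();
            if (next[t] >= taskPref[t].length) continue;  // exhausted: stays local
            int s = taskPref[t][next[t]++];
            held.get(s).add(t);
            assigned[t] = s;
            if (held.get(s).size() > quota[s]) {          // over quota: drop worst
                int worst = held.get(s).pollLast();
                assigned[worst] = -1;
                free.push(worst);
            }
        }
        return assigned;                                  // -1 means unmatched
    }
}
```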

6.4 Algorithm analysis

This section discusses a brief complexity analysis for the proposed algorithms.

Theorem 4

The total run-time complexity of the offloading algorithm (Algorithm 1) is \(O(T\ln T)\).

Proof

We give the different tasks in all the mobile devices as input to the algorithm and obtain the decision for each task: offload, run locally, or be rejected by the algorithm. The dominant cost is ranking the tasks by energy and monetary cost, i.e., sorting, so the total run-time complexity of the offloading algorithm is \(O(T\ln T)\), where T indicates the total number of tasks, calculated as the sum of the number of tasks in each of the mobile devices (MDs). The number of tasks in one MD is calculated as the sum of the tasks in each of its applications. \(\square\)

Theorem 5

On the basis of the proposals received from the players, the complexity of Algorithm 2 is \(O(\lambda ^5)\), where \(\lambda = \max (T,S)\).

Proof

Let us begin with an upper bound on the proposals generated by the tasks of the mobile devices; an upper bound for the cloud servers is then considered. Each task proposes to each cloud server at most once; hence, in at most \(T \times S\) proposals, the tasks have proposed to all cloud servers. In no more than S counter-proposals, each cloud server has proposed to all of the tasks. Moreover, each server counter-proposes in each round; therefore, in at most \(T\times S \times T\) counter-proposals, the cloud servers have released all of their counter-proposals. Hence, we can derive that the number of proposals does not exceed \(T^3 \times S^2\). The overall complexity of the proposed algorithm is \(O(\lambda ^5)\), where \(\lambda = \max (T,S)\). \(\square\)

We use counter-proposals to eliminate the problem of complementarities. Once we obtain the solution to our stable matching, we execute locally all the tasks matched to \(s_0\) and offload all others to their assigned cloud servers.

Theorem 6

Algorithm 2 converges, i.e., it gives a matching outcome within a finite number of iterations.

Proof

Following the initialization phase, we enter the matching phase for all the unengaged tasks. The matching phase is composed of two nested loops. The outer loop represents proposals from tasks; for each iteration of this outer loop, there is an inner loop of counter-proposals from the servers to the sets of tasks. Both loops terminate after a finite number of iterations. During the counter-proposal phase, the following cases can arise:

Case 1 If an engaged server is still engaged, then its dynamic list D(s) remains unchanged. (As mentioned in Phase 2.vi, only the lists of unengaged servers are updated.)

Case 2 If an unengaged server has become engaged, then its dynamic list remains unchanged. (Again, only the lists of unengaged servers are updated in Phase 2.vi.)

Case 3 An unengaged server is still unengaged. This case may arise when some of the tasks it counter-proposed to during Phase 2.iv rejected its counter-proposal, and either (i) no such task is engaged with another server or (ii) only a few of them are engaged. In (i) the list is left unmodified, while in (ii) it decreases.

Case 4 An engaged server becomes unengaged. This is only possible if all tasks of a computed set accept the counter-proposal of another server; then all the tasks and servers that were previously engaged are marked unengaged. Hence the dynamic list D(s) of the server decreases. Thus, in all of the above cases, the inner loop of counter-proposals converges in a finite number of steps.

Let us now consider the outer loop. Here, convergence follows from the finite number of cloud servers each task can propose to and from the fact that no task proposes more than once to any server. Hence, the proposed algorithm converges within a finite number of iterations. \(\square\)

7 Simulation setup and experimental results

First, we describe the simulation environment and performance metrics. Then, we analyze the experimental results to show the efficacy of the Off-Mat method; in the experiments, we compare the proposed algorithm with other methods. Finally, we discuss how our proposed solution minimizes energy consumption, delay, and monetary cost through the ‘Off-Mat’ framework for multi-tenant mobile clients.

7.1 Simulation setup

In this study, all of the algorithms are executed on a local terminal having an Intel Core i7 processor at 3.4 GHz and 8 GB RAM using Java 14.0.2. Workload parameters and simulation settings are summarized in Table 4. Similar to the simulation settings of [40], we use real-world parameters following a uniform random distribution. We randomly generate the number of mobile devices (n) between 50 and 100 and the number of applications (M) per device between 1 and 10. The total number of computational tasks T is generated between 50 and 800. The data size of tasks \((B^{send})\) lies in the interval 10 kilobytes (KB) to 1 megabyte (MB), and the computation \((C^{local})\) of executing each task is distributed in the range of 200 to 2000 megacycles. Similarly, the mobile device CPU frequency \((S^{local})\) is generated between 1 and 1.5 GHz at random, and the result data size \((B^{rec})\) is set to 1 to 10 KB. We assume the data receiving power consumption rate \((P^{rec})\) is between 257 and 325 mW, and the data transmitting power consumption rate \((P^{send})\) is likewise set to 257 to 325 mW. The data network charge rate (\(\gamma\)) is set to \( {\$} \)0.02 to \( {\$} \)0.03 per MB. To simulate a cloud data center, we configure between 10 and 100 hosts; the characteristics of these servers are listed in Table 5 [67]. Corresponding to Amazon EC2, four types of VM instances are used, and their characteristics are described in Table 6 [67]. The CPU frequency of a cloud VM \((S^{cloud})\) is 3.4 GHz, and the charge rate of a cloud VM \((\beta )\) is set to \( {\$} \)0.84 per unit time. The active CPU power consumption rate \((P_{local})\) is between 644 and 700 mW, and the idle CPU power consumption rate \((P_{idle})\) is set to 5 to 10 mW. The total number of agents is set between 2 and 10. For modeling the agents, we set the available bandwidths \((r_{send})\) and \((r_{rec})\) between mobile devices and agents to 100 to 800 kbps. The maximum time limit \((T_{max})\) is set from 1.0 to 2.0, and the total bandwidth ranges between 10 and 20 Mbps. The total budget \((MC_{budget})\) is between \( {\$} \)100 and \( {\$} \)3000. To characterize task offloading behaviour, we adopt the load-input data ratio (LDR) [40], where LDR = \(\frac{C^{local}}{B^{send}}\). Thus, if the LDR value is high, the task is compute-intensive and preferred for remote execution in the cloud; otherwise, the task is communication-intensive and suitable for local execution.
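
As a concrete illustration, the sketch below generates one random workload instance in Java with the uniform ranges listed above; the fixed seed, field names, and printed summary are our own assumptions for reproducibility, not part of the paper's simulator.

```java
import java.util.Random;

// Draws one workload instance from the uniform ranges of Table 4 (sketch).
public class WorkloadGenerator {
    static final Random RNG = new Random(42);   // fixed seed, assumed for repeatability

    static double uniform(double lo, double hi) {
        return lo + (hi - lo) * RNG.nextDouble();
    }

    public static void main(String[] args) {
        int devices = (int) uniform(50, 100);               // n
        for (int i = 0; i < devices; i++) {
            int apps = (int) uniform(1, 10);                // M per device
            double sLocal = uniform(1.0, 1.5);              // CPU frequency, GHz
            double bSend  = uniform(10, 1024);              // task data size, KB
            double cLocal = uniform(200, 2000);             // computation, megacycles
            double bRec   = uniform(1, 10);                 // result size, KB
            double ldr    = cLocal / bSend;                 // high LDR => compute-intensive
            System.out.printf("device %d: apps=%d, S_local=%.2f GHz, B_rec=%.1f KB, LDR=%.2f%n",
                    i, apps, sLocal, bRec, ldr);
        }
    }
}
```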

Table 4 Simulation setting
Table 5 Configuration of hosts
Table 6 Configuration of VM instances

7.2 Performance metrics

The following performance metrics are applied to assess the efficiency of the proposed ‘Off-Mat’ approach.

7.2.1 Request filtering

The goal of the request-filtering metric is to screen out offloading requests that cannot meet the budget and deadline constraints so that the offloading decision-making latency is reduced. We also analyze the influence of different LDRs on the filtering of requests.
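
The filtering rule reduces to a simple admission check per request; the sketch below is our own minimal rendering of the budget and deadline constraints described above, with hypothetical names.

```java
// Admission check used to filter offloading requests (illustrative sketch).
public class RequestFilterSketch {
    static boolean admissible(double estCompletionTime, double tMax,
                              double estCost, double remainingBudget) {
        // forward the request only if both constraints can be met
        return estCompletionTime <= tMax && estCost <= remainingBudget;
    }

    public static void main(String[] args) {
        System.out.println(admissible(1.2, 2.0, 15.0, 100.0)); // true: forward to agent
        System.out.println(admissible(2.5, 2.0, 15.0, 100.0)); // false: reject
    }
}
```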

7.2.2 Energy consumption

Energy consumption measures the amount of energy used for serving the requests. To obtain the total energy consumption, we analyze both the local energy consumption and the offloading energy consumption.

7.2.3 Request delay

To measure the request-response delay, we use the Ping tool to check the request transmission delay between the mobile device and the agent. It measures the round-trip time of ICMP echo-request packets sent from the mobile device to an agent or cloud server until the reply packets are received back from the agent or cloud server.

7.2.4 Monetary cost

The monetary cost denotes the sum of the data transfer cost and the public cloud services cost. Each device has an offloading budget for mobile cloud services set by the mobile client. We analyze the monetary cost by varying the budget of the mobile clients and evaluate the total savings by varying the offloading requests.

7.2.5 Fitness cost

We use the weighted-sum method (WSM) to build the fitness function for Eq. (15). WSM applies an aggregation function to transform a multi-objective function into a single scalar objective function. Using WSM, we reformulate Eq. (15) as follows:

$$\begin{aligned}&\min w_1 \left( \sum _{i=1}^{D}\sum _{j=1}^{T}\left( 1-\sum _{k=1}^{|S|}z_{i,j,k}\right) e^{loc}_{i,j}+\sum _{i=1}^{D}\sum _{j=1}^{T}\sum _{k=1}^{|S|}z_{i,j,k}e^{off}_{i,j,k}\right) \nonumber \\&\quad + w_2 \left( \sum _{i=1}^{D}\sum _{j=1}^{T}\sum _{k=1}^{|S|}z_{i,j,k}MC_{i,j,k}\right) \end{aligned}$$
(28)

where \(w_1\) and \(w_2\) indicate the weights of the energy consumption and monetary cost objectives, respectively; the two weight parameters sum to 1. The objective is to minimize the fitness cost.
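
For clarity, Eq. (28) can be evaluated directly from the decision variables; the method below is a minimal Java sketch assuming dense arrays for \(e^{loc}\), \(e^{off}\), and MC, with names of our own choosing.

```java
// Weighted-sum fitness of Eq. (28): z[i][j][k] = 1 iff task j of device i
// is offloaded to server k (all names are illustrative).
public class FitnessSketch {
    static double fitness(double[][] eLoc, double[][][] eOff, double[][][] mc,
                          int[][][] z, double w1, double w2) {
        double energy = 0.0, cost = 0.0;
        for (int i = 0; i < z.length; i++)
            for (int j = 0; j < z[i].length; j++) {
                int offloaded = 0;
                for (int k = 0; k < z[i][j].length; k++) {
                    offloaded += z[i][j][k];
                    energy += z[i][j][k] * eOff[i][j][k];  // offloading energy
                    cost   += z[i][j][k] * mc[i][j][k];    // monetary cost
                }
                energy += (1 - offloaded) * eLoc[i][j];    // local-execution energy
            }
        return w1 * energy + w2 * cost;                    // with w1 + w2 = 1
    }

    public static void main(String[] args) {
        int[][][] z = {{{1}}};    // one device, one task, offloaded to one server
        double f = fitness(new double[][]{{2.0}}, new double[][][]{{{0.5}}},
                new double[][][]{{{0.1}}}, z, 0.5, 0.5);
        System.out.println(f);    // 0.5*0.5 + 0.5*0.1 = 0.3
    }
}
```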

7.2.6 Throughput

Throughput indicates the total number of computational tasks that receive their service per unit time. We use the average throughput time to analyze the performance of the different approaches. Throughput time depends on several parameters, such as network delays and processing power. The higher the optimization rate of the algorithm, the faster the throughput (i.e., the lower the throughput time).

7.2.7 Happiness performance

The happiness metric measures the benefit of resolving the conflicts between the mobile devices’ tasks and the cloud servers using stable matching. We use the rank percentile of the selected partner, i.e., task or server, to measure the “happiness” of the matching. For cloud servers, happiness indicates the average rank obtained over the matched tasks. We evaluate the happiness performance by varying the total number of tasks and servers.
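
A minimal sketch of this rank-percentile computation is given below; how the paper aggregates ranks may differ in detail, and all names here are illustrative.

```java
import java.util.List;

// Rank-percentile happiness: a player matched to its r-th ranked partner out
// of p candidates scores 100*r/p, so "top 10%" means an average near 10.
public class HappinessSketch {
    static double percentile(List<Integer> prefList, int matchedPartner) {
        int r = prefList.indexOf(matchedPartner) + 1;   // 1-based rank
        return 100.0 * r / prefList.size();
    }

    static double averageHappiness(List<List<Integer>> prefs, int[] match) {
        double sum = 0;
        int counted = 0;
        for (int p = 0; p < match.length; p++) {
            if (match[p] < 0) continue;                 // unmatched: runs locally
            sum += percentile(prefs.get(p), match[p]);
            counted++;
        }
        return counted == 0 ? 0 : sum / counted;
    }

    public static void main(String[] args) {
        List<List<Integer>> prefs = List.of(List.of(1, 0), List.of(0, 1));
        int[] match = {1, 0};                           // both get their first choice
        System.out.println(averageHappiness(prefs, match)); // 50.0 (rank 1 of 2)
    }
}
```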

7.3 Baseline approaches

We compare the performance of the Off-Mat algorithm against the following two baseline algorithms:

  • Traditional offloading: In the traditional offloading framework, the mobile devices directly send the tasks to a remote cloud. The remote cloud makes an offloading decision and returns the decision results to the mobile device. There are no agents in the offloading framework.

  • Agent-based offloading: It uses agents; the device sends its offloading request to an agent, which makes the offloading decision, rather than to the distant cloud.

7.4 Experimental results

7.4.1 Impact of request filtering

In this experiment, we evaluate the performance of the Off-Mat algorithm with respect to the filtering of offloading requests. The primary task of request filtering is to reject the computational offloading requests that cannot satisfy the budget and deadline constraints so that the overall decision-making delay is minimized. Figure 6 shows the impact of request filtering when varying the offloading requests. We consider the requests of different mobile users from 100 to 800, corresponding to cases 1 to 8, respectively. From Fig. 6, it can be seen that the filtering process performs better when the LDR value is low (LDR=1.0), as most of the offloading tasks are then communication-intensive and suitable for local execution under the completion-time constraint. Conversely, in each scenario, when the LDR value is high (LDR=1.5), a task is more likely to be offloaded due to its computation-intensive nature.

Fig. 6 Effect of LDR on request filtering

7.4.2 Performance on energy consumption

In this experiment, we evaluate the energy consumption of the three computation offloading approaches. For this study, we assume that the available bandwidth is sufficient and that all offloading tasks can be offloaded directly to the cloud. From Fig. 7, it can be observed that energy consumption rises with the number of offloaded tasks. The proposed approach outperforms both traditional and agent-based offloading: not all computational tasks benefit from remote execution because of low LDR values, whereas the proposed ‘Off-Mat’ approach schedules the computational tasks on the agents so that the mobile devices consume the least amount of energy.

Fig. 7 Total energy consumption analysis of offloading approaches

Fig. 8 Total energy consumption analysis for request size 800

Further, we set the number of offloading requests to 800 and the task size to 10 KB. From Fig. 8, it can be seen that the total energy consumption of the traditional offloading scheme is approximately three times that of agent-based offloading. The reason is the longer RTT (round-trip time) in traditional offloading, which results in more energy consumption than the agent-based scheme. Our proposed approach performs better because the device and agent check the constraints and offload only the valid requests to the cloud; hence it considers only the beneficial offloading tasks for remote execution and saves more energy.

7.4.3 Impact of request delay

Figure 9 shows the average request-response delay of the proposed framework and compares its performance with the traditional and agent-based offloading frameworks. In this comparison, we send ICMP-request packets from the mobile devices to the agent or cloud and receive back the computation results. For the delay analysis, we set the number of offloading requests to 300 and the task size to 10 KB. Figure 9 shows that the average request delay of the proposed and agent-based offloading approaches is much shorter than that of the traditional offloading approach. Specifically, the average request delay for the agent is less than 10 ms, the request delay of the traditional offloading scheme is nearly 30 ms, and the Off-Mat approach takes less than 5 ms. The better performance arises because the agent is located one hop away from the mobile devices, so the average latency is much shorter than that of the cloud, where latency increases due to the complex networks. Further, the Off-Mat approach makes better decisions, as it sends only the valid offloading requests to the agent.

Fig. 9 The average request delay

7.4.4 Performance on monetary cost

To analyze the monetary cost, we vary the user budget from \(\$\)100 to \(\$\)3000 and the offloading tasks from 100 to 900. As depicted in Fig. 10, when the budget of the mobile devices increases, the total cost of offloading tasks for the Off-Mat algorithm increases accordingly, as a larger number of computation-intensive tasks are offloaded to the servers.

Fig. 10 Monetary cost with varying budget

Fig. 11 Monetary cost for request size 500

Fig. 12 Monetary cost for request size 800

In Fig. 11, we vary the user budget with the request size set to 500. It can be seen that the total cost of traditional offloading is nearly four times higher than that of the Off-Mat approach. The reason for the lower monetary cost is the smaller number of offloaded tasks due to beneficial offloading. In Fig. 12, we repeat the same experiment with the request size set to 800. Finally, Figs. 13 and 14 demonstrate the total cost saving for request sizes 500 and 800, respectively. From the figures, it can be seen that the Off-Mat algorithm saves more cost than the traditional and agent-based offloading algorithms.

Fig. 13 Total saving when request size is 500

Fig. 14 Total saving when request size is 800

7.4.5 Fitness cost performance

The fitness cost performance depends on the choice of the weight parameters. After performing some initial experiments, we found that the best performance was obtained by assigning equal weights to the two objectives. Figure 15 shows the analysis of the average fitness value for different numbers of offloading tasks. It can be seen that the Off-Mat scheme outperforms all baseline algorithms. The reason for the better performance is the optimal number of offloaded tasks, which also reduces the monetary cost in terms of data transfer cost and public cloud cost. After Off-Mat, the agent-based approach achieves the next-best fitness value.

Fig. 15 Average fitness cost

7.4.6 Throughput performance

In Fig. 16, we analyze the average throughput time by increasing the offloading tasks from 100 to 800.

Fig. 16 Throughput time for different approaches

It can be noticed that the Off-Mat scheme achieves the best throughput time across varying numbers of offloaded tasks. Owing to its earliest response time, Off-Mat yields faster throughput, whereas the response times of traditional and agent-based offloading are higher, resulting in lower throughput (higher throughput-time values). Throughput time depends on various parameters, such as network delays, processing power, and hardware resources. We observe that, while varying the number of computational tasks, the net throughput of Off-Mat improves and exhibits stable behavior compared with the other approaches.

7.4.7 Happiness performance

Figures 17 and 18 depict the happiness percentages of tasks and servers, respectively. For this experiment, we consider 10 cloud servers and increase the number of offloading tasks from 50 to 300. Cloud servers are initially empty, and each can accommodate 10 VMs of different sizes. For the computational tasks, we apply different allocation policies, and for the servers, we perform a consolidation policy. To measure matching happiness, we use the average rank of the matched tasks and servers.

Fig. 17 Task happiness

Fig. 18 Server happiness

Compared to the First-Fit benchmark algorithm, the proposed distributed stable matching algorithm provides a substantial improvement in the servers’ performance, demonstrating the advantage of resolving the conflicts between the mobile devices’ tasks and the cloud servers via stable matching. First-Fit only allows a single uniform ranking of tasks for all listed servers; in contrast, the proposed distributed stable matching approach permits cloud servers to reveal their own preferences. Additionally, First-Fit cannot match a task to a server with inadequate capacity and allows no further rejections from servers, while the proposed algorithm grants rejections whenever a task is more preferable than a server’s current tasks during its execution. This distinctly enhances the tasks’ and servers’ happiness. Through the analysis of the results, we find that the computational tasks obtain, on average, a top-10% companion, while the cloud servers are only able to acquire their top-50% tasks. This is because the number of available VMs is small in comparison with the total capacity of the cloud servers, so most of the VMs’ proposals can be directly approved by the cloud servers.

For large-scale simulations, we vary the number of offloading tasks and servers and analyze the overall happiness performance. As shown in Fig. 19, the tasks get their top 6–8% preferences, while the cloud servers obtain their top 13–18% preferences. Thus, we can observe that Off-Mat effectively analyzes the policies and is able to resolve the conflicts in large-scale scenarios.

Fig. 19 Overall happiness performance

7.5 Statistical analysis

To evaluate the reliability of the ‘Off-Mat’ approach, we apply statistical evaluations. The statistical analysis is conducted to investigate whether the experimental results are statistically significant rather than coincidental [52, 53].

Table 7 Statistical comparison of the Off-Mat approach for computational tasks offloading with benchmark approaches

Several types of statistical tests are available for different applications, depending on data characteristics such as homoscedasticity and normality. Thus, to perform the statistical analysis, we use the StatService toolkit [68, 69], which offers a smart model for selecting the best statistical test according to the features of the data. Based on this evaluation, the suggested statistical test for our data analysis is the T-test. Thus, we conduct a paired T-test between the Off-Mat approach and the other baseline approaches using the statistics calculators available at Social Science Statistics [70]. The T-test is conducted by constructing the following hypotheses:

  • Hypothesis 1: No difference is observed between Off-Mat and baseline algorithms.

  • Hypothesis 2: Significant difference is observed between Off-Mat and baseline algorithms.

For each performance parameter, we report standard statistics, including the total number of samples, mean value, standard deviation (SD), t-value, p-value, and degrees of freedom (df).

Table 7 presents the statistical analysis of the paired t-test. It shows the significance level of Off-Mat compared with the other benchmark strategies concerning request filtering, energy consumption, total delay, monetary cost, total cost saving, throughput time, task happiness, and server happiness. The objective of the t-test is to validate the stated hypotheses. To perform the test, we use the significance level p < 0.05. It can be observed from Table 7 that all the p-values are less than 0.05. Hence, there exists a significant difference between Off-Mat and the other benchmark approaches, as indicated by the t-values. As the significance level of variance for all the performance parameters is less than 0.05 in the t-test, hypothesis 1 is rejected and hypothesis 2 is accepted.

In Table 8, we compare Off-Mat with all the algorithms over all the performance parameters using the ANOVA test. The ANOVA test performs multiple comparisons at once for each performance measure across all benchmark approaches; it compares the mean values of two or more groups to identify whether the differences are statistically significant. We apply the same hypotheses used earlier for the paired t-test. It can be observed that the p-values corresponding to all the F-values, i.e., 4.05 for total cost, 9.59 for total cost saving, 31.48 for energy consumption, 25.80 for total delay, 9.76 for throughput time, 9.00 for request filtering, 7.26 for task happiness, and 60.57 for server happiness, are below the significance level p < 0.05. Thus, hypothesis 1 is again rejected and hypothesis 2 is accepted, showing that the overall differences with respect to the benchmark approaches are statistically significant.
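
Both tests can also be reproduced programmatically. The sketch below uses Apache Commons Math (commons-math3); this library choice is our assumption, since the paper relied on the StatService toolkit and online calculators, and the sample arrays are placeholders rather than measured data.

```java
import java.util.Arrays;
import org.apache.commons.math3.stat.inference.OneWayAnova;
import org.apache.commons.math3.stat.inference.TTest;

// Paired t-test and one-way ANOVA over per-run metric values (placeholders).
public class SignificanceSketch {
    public static void main(String[] args) {
        double[] offMat      = {12.1, 13.4, 11.8, 12.9, 13.0};
        double[] traditional = {35.2, 34.8, 36.1, 35.5, 34.9};
        double[] agentBased  = {21.3, 20.8, 22.0, 21.5, 21.1};

        // paired t-test: hypothesis 1 (no difference) is rejected when p < 0.05
        double p = new TTest().pairedTTest(offMat, traditional);
        System.out.println("paired t-test p-value = " + p);

        // one-way ANOVA compares all three approaches at once
        double pAnova = new OneWayAnova()
                .anovaPValue(Arrays.asList(offMat, traditional, agentBased));
        System.out.println("ANOVA p-value = " + pAnova);
    }
}
```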

Table 8 ANOVA test for all benchmark algorithms

7.6 Overall analysis

To further analyze the efficiency of the proposed Off-Mat algorithm, we compare the best-, average-, and worst-case values of all performance parameters. These values correspond to the minimum, average, and maximum values over all experiments. We use the following gap-value formula to quantify the gap between the performance values of the different approaches.

$$\begin{aligned} \text {Gap} = \frac{(\text {Average case value} - \text {Best case value})}{\text {Best case value}} \end{aligned}$$
(29)

A lower gap value for any performance parameter shows that the average-case value of the offloading algorithm is closer to the best-case value; for example, an average-case value of 1.2 against a best-case value of 1.0 yields a gap of 0.2. Table 9 lists the values of the performance parameters. From the table, it can be seen that the Off-Mat approach achieves the lowest gap values for request filtering, monetary cost, and fitness cost, and the second-best gap values for energy consumption, total delay, cost saving, throughput time, task happiness, and server happiness.

Table 9 Overall analysis of algorithms

Further, we analyze the improvement rate of the proposed Off-Mat over the other baseline approaches. For this, we first compute the mean values of all performance parameters for all offloading approaches and then calculate the improvement rate using the following formula:

$$\begin{aligned} \text {Improvement rate} (\%) = \frac{\text {Baseline} - \text {Off-Mat}}{\text {Off-Mat}} \times 100 \end{aligned}$$
(30)

The improvement rate measures the gain in any performance parameter, for example, the reduction in the monetary cost or energy consumption of Off-Mat relative to the other offloading algorithms. In Table 10, we record the mean values of the overall performance results for all experiments, and Table 11 reports the computed improvement rates (%). The Off-Mat approach shows 281.15% and 118.97% energy consumption reduction over the traditional and agent-based offloading approaches, respectively. Similarly, for the average request delay, Off-Mat shows 994.29% and 79.29% reduction over the traditional and agent-based offloading approaches. For the cost analysis, the results are encouraging and show 320% and 200% reduction over the traditional and agent-based offloading approaches. In the case of cost saving, Off-Mat increases the saving by 99.70% and 62.32% over the traditional and agent-based techniques. The overall fitness cost is also reduced by 281.25% and 119.18% over the traditional and agent-based approaches. For throughput time, the performance improves by 123.74% and 62.03% over the traditional and agent-based offloading schemes. We further analyze the improvement of the happiness metric and find that, compared with the First-Fit approach, Off-Mat shows an 8.89% improvement in task happiness and a 93.99% improvement in server happiness. In Table 12, we summarize the overall performance of the proposed and existing schemes for the different performance parameters.
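
Eqs. (29) and (30) are simple enough to state as two helper methods; the sketch below, with made-up inputs, only illustrates how the tabulated values are derived.

```java
// Helpers for the gap value (Eq. 29) and the improvement rate (Eq. 30).
public class AnalysisSketch {
    static double gap(double averageCase, double bestCase) {
        return (averageCase - bestCase) / bestCase;             // Eq. (29)
    }

    static double improvementRate(double baseline, double offMat) {
        return (baseline - offMat) / offMat * 100.0;            // Eq. (30)
    }

    public static void main(String[] args) {
        System.out.println(gap(1.2, 1.0));                      // ~0.2
        System.out.println(improvementRate(35.0, 12.5));        // 180.0 (%)
    }
}
```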

Table 10 Mean values of overall performance results of all experiments
Table 11 Improvement in Off-Mat algorithm over other approaches
Table 12 Comparison of performance metrics for proposed and existing method

7.7 Performance comparison with existing approaches

To further show the characteristics and advantages of the proposed study, we compare our results with some other existing techniques from the literature, i.e., Multi-Tenant Mobile Offloading [43], ENGINE [55], and centralized broker-node-based offloading [54], as discussed in Table 13. Table 13 shows that [43] uses cloudlet-based offloading, [55] uses fog-based offloading, and [54] uses broker-node-based offloading, whereas the proposed approach uses agent-based offloading. Agent-based offloading makes decisions faster and improves the total delay and energy consumption. With respect to the computing environment, [43] and [54] use a centralized model, while [55] and the proposed scheme use a distributed computing environment. Both [43] and the proposed approach support multi-tenancy and heterogeneity. [43, 54, 55] formulate either single-objective or bi-objective optimization problems, while the proposed approach is multi-objective. The proposed Off-Mat scheme supports beneficial offloading via request filtering, which ensures offloading under the budget and maximum completion-time constraints; thus it improves the delay, monetary cost, and energy consumption performance. To improve stability, Off-Mat provides task and server happiness parameters, which the other approaches neglect. The overall time complexity of the Off-Mat approach is \(O(T \ln T)\) for the offloading phase and \(O(T^3 \times S^2)\) for the allocation phase, and it takes 2T messages to reach any offloading decision. The proposed Off-Mat approach is based on matching theory, applying weighted bipartite matching and stable matching in a distributed environment, which shows the novelty and advantages of the proposed Off-Mat strategy over the other existing schemes, as illustrated in Table 13.

Table 13 Off-Mat comparison with the existing works

8 Conclusions and future work

This study analyzes the computational task offloading problem for mobile multi-tenant clouds. We first formulated it as an ILP model with energy and monetary cost objectives under multiple constraints. The proposed \(Off-Mat\) framework effectively minimizes the request-response time, improves the offloading performance, and balances the trade-off between energy and monetary cost. We first proved the complexity of the proposed optimization problem and then solved, in polynomial time, a special case obtained by relaxing certain conditions. We then addressed the general case, in which the agents perform offloading, compute the preferences of the selected servers and tasks, and run a stable-matching-based heuristic. Finally, the complexity of the \(Off-Mat\) approach was analyzed and evaluated through extensive experiments. Simulation results verify that ‘Off-Mat’ achieves superior performance compared with the other computation offloading strategies.

8.1 Future directions and open challenges

The applicability of the proposed solution can be explored in real-time systems and extended in the following main directions.

  1. Mobile edge computing (MEC): MEC extends application services and cloud capabilities to the edge of the network. This can be achieved through the dense deployment of servers or small-cell (pico, femto) base stations (BSs) equipped with storage and computation resources. A mobile edge environment ensures efficient network operation and service distribution, minimizes latency, and offers an enhanced user experience. Despite the benefits of MEC, major challenges remain; for example, real-time mobile applications are extremely time- and energy-sensitive, and due to the dynamics of edge networks, the long execution times of such applications can lead to high energy consumption. Therefore, an efficient MEC framework for computation offloading is required [56].

  2. Fog computing: Fog computing extends the cloud model to the network edge to enable Internet of Things (IoT) based services. For the future Internet, mobile fog technology is an integral framework of fog computing, supporting seamless mobile computing and latency-sensitive services. Nonetheless, the critical challenges for mobile fog-based computation offloading are: (i) which process or module of the application should be offloaded? (ii) how should the computation be offloaded? and (iii) where should it be offloaded? Moreover, the geographical distribution, mobility, and heterogeneity of mobile devices impose additional challenges in mobile fog [57].

  3. Multi-tier edge-clouds: To accomplish low end-to-end latency, multi-tier edge-clouds push units of computation, i.e., cloudlets, to the edge of the network in the coverage area. Hence, some recent works have adopted a hierarchical arrangement of cloudlets across edge tiers, in which the higher tiers comprise more powerful edge cloudlets. Whenever overloading occurs, the higher tiers can receive migration demands from the lower-tier cloudlets. The cost and energy-efficiency issues in hierarchical edge-clouds remain an open problem and are well suited to 5G business models [42].

  4. Fault tolerance: At run time, network contexts such as signal strength, bandwidth, and latency change periodically, and short-term failures can occur over short time spans. Hence, failure-aware partitioning of offloading applications at run time is a critical component [45].

  5. Privacy and security: A compromised mobile device can perform malicious activities without the user's knowledge, including accessing sensitive information. Consequently, offloading mobile applications to a remote server requires a more secure environment. In fact, mobile offloading itself offers a protection mechanism: given the limited security of the mobile OS, partial or full offloading allows security applications to run in a more robust and secure environment [51]. However, privacy leakage, network attacks, and information theft during data transmission remain open issues in computation offloading [65].

  6. Deep learning-based offloading: Deep learning enables the mining and processing of the variety of unstructured data collected through smartphones, so that more intelligent, cognitive, and robust services can be offered to MEC systems. Popular deep learning techniques such as CNNs, GRUs, RNNs, and LSTMs can offer cognitive services for network functions, traffic, load, and other system measures to enhance the quality of service (QoS). However, designing deep-learning-based user-mobility prediction to enable decision-making mechanisms is one of the open problems for computation task offloading and migration [65].

  7. Blockchain: Data integrity violation is one of the drawbacks of computation offloading. Most classical integrity-preservation methods rely on a central entity and are not necessarily suitable for 5G networks due to the single point of failure. Blockchain has emerged as a disruptive paradigm that guarantees data completeness. Thus, blockchain is a promising direction for preserving data integrity during computation offloading [66].