1 Introduction

With the rapid development of wireless networks and computer technologies, mobile devices (MDs) have become highly popular in many industries. Cisco predicted that the number of MDs worldwide would grow from 8.8 billion in 2018 to 13.1 billion in 2023 [1]. At the same time, the progress of the Internet of Things (IoT) has connected a large number of devices to the mobile Internet, further expanding the MD concept. With the popularity of MDs, the mobile Internet has entered a stage of high-speed development. For instance, according to the Internet Trends Report 2019, the number of mobile users (MUs) in China has exceeded 817 million, with a year-on-year growth rate of 9%, and their mobile data traffic consumption increased by 189% [2]. Benefiting from continuous improvements in chip manufacturing, MDs are equipped with more powerful CPUs and larger memories, enabling them to handle more business for MUs. However, a faster CPU consumes more energy, since CPU power increases super-linearly with frequency [3]. MDs are usually powered by batteries, whose volume and capacity are limited to preserve MD portability. Most intuitively, although today's smartphones offer far more functions than the previous generation of feature phones, they run for a shorter time on a single charge. Unlike semiconductor technology, which follows Moore's Law, battery technology has not achieved breakthroughs in the short term, and battery capacity grows by only 5% per year [4]. Furthermore, due to factors such as CPU architecture and heat dissipation, even though MD processing capacity has improved, it remains too weak to execute some computation-intensive applications. MDs' limited resources cannot satisfy the increasingly complex requirements of MUs.

Computation offloading migrates computing tasks to an external platform to extend the available MD resources, and is an effective way to overcome the limitation of MD resources [5]. Cloud computing, as a foundation of the future information industry, is a business computing model that can provide rich resources to MDs. It is a pay-per-use model that supplies available, convenient, on-demand network access to a shared pool of configurable computing and storage resources [6]. The concept of cloud computing was first proposed by Google CEO Schmidt in 2006, and remarkable technological progress has been achieved after more than ten years of development. At present, there are many mature commercial cloud computing services, such as Amazon Web Services, Microsoft Azure, Alibaba Cloud, and Tencent Cloud. Based on cloud computing and computation offloading, mobile cloud computing (MCC), which provides MDs with a rich pool of resources accessible through wireless networks, was proposed to address the limited resources of MDs. MCC combines cloud computing, mobile computing, and wireless networks, and migrates offloading units (OUs) to the cloud via computation offloading [7,8,9]. MCC helps MDs break through their resource constraints, frees them from heavy local workloads, allows them to take more responsibility for connecting MUs with the information domain, and lets them act as a simple modem connecting humans to the electromagnetic signal-based network. Benefiting from MDs' portability, MUs in MCC can connect to the resource-rich cloud anytime and anywhere, and better enjoy the convenience of informatization. MCC has attracted wide attention from industry and academia because of its tremendous potential.
According to the assessment of Allied Analytics LLP, the mobile cloud market was valued at $12.07 billion in 2016 and is expected to reach $72.55 billion by 2023, with a compound annual growth rate of 30.1% from 2017 to 2023 [10]. It can be inferred that the mobile cloud market will become more prosperous as MCC progresses.

Computation offloading is the core of MCC and determines the ultimate MCC effect. Researchers have studied computation offloading from different perspectives and provided numerous research results. Wu presented a survey of current research work on multi-objective decision making for time-aware and energy-aware computation offloading in MCC [11]. Mach and Becvar surveyed the work on computation offloading from the perspectives of offloading decision, computing resource allocation, and mobility management [12]. Bhattacharya and De focused on the variability and unpredictability of the MCC environment in offloading decision [13]. They categorized the parameters that influence computation offloading as application characteristics, network properties, and execution platform features, and then surveyed adaptation techniques utilized for computation offloading. Kumar et al. investigated different types of offloading decision algorithms and classified the application types used for demonstration [14]. Khan performed a survey of the computation offloading strategies impacting the performance of offloaded applications and categorized offloading decision approaches into static and dynamic [15]. Chen and Cheng reviewed the offloading decision algorithms and classified them into three categories based on three decision scenarios (i.e., single user, multiple users, and enhanced server) [16]. Shakarami et al. surveyed the offloading decision approaches from the perspective of game theory and classified these approaches into four main fields based on the game mechanisms they used (i.e., classical game mechanisms, auction theory, evolutionary game mechanisms, and hybrid-based game mechanisms) [17].

Nevertheless, these surveys ignored the systemic nature of computation offloading in MCC and mainly focused on offloading decision approaches. Instead of reviewing one or more discrete technologies, this paper considers the interaction among the components of the MCC computation offloading system and summarizes their key technologies. Our work aims to provide a comprehensive survey of research on computation offloading in MCC, so that readers can gain a thorough understanding of this field in less time and learn its key technologies and open problems. Architecture is the foundation of MCC, and different architectures lead to different computation offloading patterns. The offloading granularity determines the OU in the computation offloading system. Therefore, we first summarize the MCC architecture and offloading granularity. To promote MCC and computation offloading on a large scale, all transactions should not be coupled together but decomposed into separate components, each responsible for its own transactions under the principles of high cohesion and low coupling. The computation offloading system has three basic components, i.e., MUs, application service operators (ASOs), and cloud operators (COs) [18]. Offloading decision, admission control, resource management, and equipment deployment are their technical challenges. We then summarize the key technologies used to address these four challenges. Theoretical research usually assumes that no failures occur in the computation offloading process and that MU data will not be stolen. However, in a real-life MCC computation offloading system, failures are prone to occur during data transmission and/or distributed OU execution, since MDs are connected to the cloud through wireless networks and are heterogeneous with cloud data centers. MDs usually store MUs' highly private personal information. Privacy leakage and malicious attacks may occur during data transmission and/or OU execution. Fault tolerance and privacy protection, two critical auxiliary technologies for computation offloading, are also summarized. Finally, we present a research outlook for MCC computation offloading from the perspectives of the systematic prototype and "device-pipe-cloud". Our contributions can be summarized as follows:

(1) We consider the interaction among the components of the MCC computation offloading system and conduct a comprehensive literature review on four technologies involved in computation offloading from a system perspective, giving an explicit comparison and feature analysis of them. In addition, we summarize the MCC architecture and offloading granularity, which are fundamental concepts of MCC.

(2) In MCC, MDs are connected to the cloud through wireless networks, and MDs are heterogeneous with cloud data centers. These two characteristics make failures and privacy leakage inevitable during real-life data transmission and/or distributed OU execution. We summarize fault tolerance and privacy protection for computation offloading, and analyze their underlying technical theories.

(3) We discuss future research trends in the systematic MCC prototype and analyze technologies that deserve attention in the future from the perspective of "device-pipe-cloud".

The remainder of this paper is organized as follows. Section 2 summarizes the MCC architecture and offloading granularity. Section 3 summarizes four technologies of offloading decision, admission control, resource management, and equipment deployment. Section 4 summarizes fault tolerance and privacy protection for computation offloading. Section 5 discusses the future research directions. Section 6 concludes this paper. Table 1 shows the abbreviations used throughout this paper.

Table 1 The abbreviation list

2 MCC architecture and offloading granularity

2.1 MCC architecture

Since the concept of MCC was proposed, researchers have given many definitions [8, 18, 19]. These definitions differ due to various research contents and scenarios, but their core idea is the same: MCC offloads OUs from the MD to external platforms for execution, thereby enhancing the MD's capability. According to the available external platforms, MCC architectures can be classified into four categories: one-layer architecture, two-layer architecture, three-layer architecture, and hybrid architecture [20], as shown in Fig. 1. These MCC architectures are compared in Table 2.

Fig. 1

Categories of the MCC architectures

(1) One-layer Architecture

In the one-layer architecture, several neighboring MDs form a network in which they constitute the external platform, providing their idle resources to each other. The formed network is a self-organizing dynamic network that allows MDs to join and leave at any time. This architecture, which uses MDs as cloud servers to complete computing tasks through cooperation among MDs, is a variant of traditional cloud computing. A typical example is Hyrax [21], which applied Hadoop to MCC. Hyrax uses a group of smartphones to execute computing tasks in parallel and implements large-scale distributed applications through cooperation among these smartphones. The one-layer architecture has the advantages of short distance and fast data transmission, which alleviates the MD resource limitation to a certain extent. However, the external platform in this architecture is composed of MDs, which themselves have few resources and can hardly provide sufficient capacity. Moreover, MDs are owned by different MUs, so permission and privacy problems are prone to occur in real life. How to persuade MUs to contribute their precious MD resources remains a major challenge.

(2) Two-layer Architecture

In the two-layer architecture, MDs are in the first layer, and the external platform is in the second layer. The external platform can be a server near the MDs or the remote cloud. Accordingly, the two-layer architecture can be classified into two subclasses.

A nearby server called a cloudlet, which refers to a resource pool composed of small-scale data center clusters at the Internet edge and aims to bring cloud services close to MUs, was proposed by Satyanarayanan et al. in 2009 [27]. In [27], Satyanarayanan et al. implemented a cloudlet prototype based on dynamic virtual machine (VM) synthesis. In this prototype, when the MD wants to use the cloudlet, it first generates a VM overlay containing the parameters required to customize the VM and then sends it to the cloudlet. After receiving the overlay, the cloudlet applies it to its base VM to generate the same VM as the MD's, executes the MD's applications in this VM, and finally returns the results to the MD. After the cloudlet concept was proposed, some researchers committed to improving it. For example, Hua et al. proposed a scheduling mechanism based on statistical prediction to address the long time the cloudlet spends synthesizing a VM [28]. According to the prediction information, the service application VM on the cloudlet is pre-synthesized to reduce the MU's waiting time. Cloudlets can be composed of PCs, workstations, or small servers deployed around wireless network access points (APs), such as Wi-Fi hotspots in libraries or coffee shops, to provide cloud services for MDs connected to the APs. In addition, temporary cloudlets (e.g., unmanned aerial vehicles) can be deployed in disaster relief or war scenarios [31,32,33]. Compared with the one-layer architecture, using nearby servers alleviates the shortage of external platform resources while retaining the advantage of low delay.

The architecture in which the remote cloud is the external platform is the classical MCC architecture. The remote cloud is rich in resources and has many commercial products, which facilitates the practical construction of MCC. In traditional cloud computing, the user device (e.g., a desktop computer) is connected to the cloud via wired networks and powered through a wall socket, which removes the need for energy-efficient data transmission techniques. In this MCC architecture, MDs connect to the remote cloud via wireless networks, which consume MD energy and have limited bandwidth. If the energy and time saved by computation offloading are less than those consumed by data transmission over wireless networks, computation offloading is counterproductive. Wireless networks thus have a serious impact on the energy- and time-saving effects of computation offloading in MCC. The question of whether it is worth offloading computation to the remote cloud must be answered first. Kumar and Lu answered this question through theoretical analysis and experiments in [34]. They found that MCC can potentially save energy for MDs, but not all computations are energy-efficient when offloaded to the remote cloud. Taking energy saving as the optimization goal, they illustrated that offloading is beneficial when large amounts of computation are needed with relatively small amounts of communication. This MCC architecture completely solves the problem of limited resources and can be considered to provide unlimited resources. However, it suffers from the high delay and consumption caused by wireless networks.
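The trade-off identified by Kumar and Lu can be illustrated with a back-of-the-envelope energy balance. The structure of the calculation follows the style of their analysis, but the function name and every parameter value below are illustrative assumptions, not measurements from [34].

```python
def offload_energy_saving(c_cycles, d_bits, m_hz, speedup,
                          p_compute, p_idle, p_transmit, bandwidth_bps):
    """Energy (in joules) saved by offloading: positive => offloading pays off."""
    e_local = p_compute * c_cycles / m_hz            # energy to compute locally
    e_wait = p_idle * c_cycles / (speedup * m_hz)    # energy idling while the cloud computes
    e_tx = p_transmit * d_bits / bandwidth_bps       # energy to transmit the data
    return e_local - e_wait - e_tx

# Assumed parameters: 1 GHz MD CPU, 10x cloud speedup, 0.9 W computing,
# 0.3 W idle, 1.3 W transmitting, 1 Mbit/s wireless link.
compute_heavy = offload_energy_saving(5e9, 1e6, 1e9, 10, 0.9, 0.3, 1.3, 1e6)
data_heavy = offload_energy_saving(1e8, 1e8, 1e9, 10, 0.9, 0.3, 1.3, 1e6)
print(compute_heavy > 0)  # much computation, little data: True (offloading saves energy)
print(data_heavy > 0)     # little computation, much data: False (transmission dominates)
```

The two calls reproduce the qualitative conclusion above: offloading is beneficial only when the computation-to-communication ratio is large enough to amortize the wireless transmission cost.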

(3) Three-layer Architecture

The two-layer architecture that uses a nearby server as the external platform can alleviate the resource shortage of the one-layer architecture while retaining the advantage of low delay. The classical MCC architecture, i.e., the other two-layer architecture, which uses the remote cloud as the external platform, can provide sufficient resources. However, the classical MCC architecture is not suitable for delay-sensitive scenarios (e.g., real-time control, real-time data processing, and augmented/virtual reality) because of its high delay. From the above analysis, the nearby-server two-layer architecture has limited resources but low delay, while the classical MCC architecture has rich resources but high delay. If the two are combined, the advantages of each can offset the other's shortcomings, and resources and delay can be balanced effectively.

Compared with the remote cloud, the nearby server is closer to MDs, and the layer where it is located is also called the "edge layer". Adding the edge layer to the classical two-layer MCC architecture forms a new architecture named multi-access/mobile edge computing (MEC). It is worth noting that MEC originally referred to mobile edge computing. In 2017, the European Telecommunications Standards Institute (ETSI) extended the MEC concept from supporting only the 3GPP mobile network to also supporting non-3GPP networks (including multiple types of wireless networks and even wired networks), and renamed it from mobile edge computing to multi-access edge computing [52]. MEC is an enhancement and extension of MCC and still belongs to the MCC category. Because this architecture exploits the collaboration between the edge cloud and the remote cloud, it is also called the "cloud-edge collaboration" architecture. Compared with classical MCC, MEC has more advantages. On the one hand, MEC can transfer delay-sensitive OUs to the edge layer to reduce the response time. For example, in industrial manufacturing, some information requires real-time analysis to handle emergencies in a timely manner, and this work can be done at the edge layer, while the big data generated in the manufacturing process can be analyzed by machine learning technologies in the remote cloud. Wu et al. introduced a three-layer architecture for data-driven machine health and process monitoring in cyber-manufacturing [42]. The training datasets are streamed into the remote public cloud, where diagnostic and prognostic models are built using parallel machine learning algorithms. The predictive models are then downloaded to the local private edge cloud and applied to the real-time datasets streamed to it for online diagnosis and prognosis.
On the other hand, MD data is transmitted to the edge layer through a local/private network for processing or preprocessing instead of being transmitted directly to the remote cloud through the public core network, which greatly relieves the pressure on the core network and enhances data security. Just as every coin has two sides, classical MCC also has its advantages. Classical MCC uses mature commercial cloud computing platforms, which helps it provide low-cost mobile cloud services quickly; many traditional cloud services can be ported to MDs with minor modifications. MEC, in contrast, needs to deploy a large amount of additional edge equipment, which brings great economic pressure, slows development, and increases the service cost. Classical MCC bears no such deployment burden.

(4) Hybrid Architecture

The hybrid architecture combines several different architectures to build an MCC suitable for specific application scenarios. Sanaei et al. proposed a hybrid MCC architecture integrating the above three architectures, analyzed the key problems and solutions for realizing it, and demonstrated its application through a medical treatment case [47]. Alonso-Monsalve et al. proposed a hybrid MCC architecture that combines the classical MCC architecture with volunteer platforms as resource providers [48]. Their architecture is an inexpensive solution with benefits in cost savings, elasticity, scalability, load balancing, and efficiency. Zhou et al. proposed a hybrid MCC architecture integrating the above three architectures, in which MDs, cloudlets, and the remote cloud form a shared resource network for computation offloading [49]. To incentivize MUs to provide their MD resources, they designed an auction-based computation offloading market in which MUs can sell and buy MD resources. Feng et al. proposed a hybrid MCC architecture that improves the computational capability of vehicles by using resources from the remote cloud, roadside units, and neighboring vehicles [50]. Their architecture also integrates the above three architectures: the roadside units act as nearby servers, and the vehicles can offload their OUs to each other. Flores et al. proposed a social-aware hybrid computation offloading system, which integrates the cloudlet, remote cloud, and device-to-device networks and widens the spectrum of offloading opportunities [51]. They designed a credit- and reputation-based incentive mechanism to encourage MUs to lease their MD resources as open commodities that may be acquired by others.

Table 2 Comparison of the MCC architectures

2.2 Offloading granularity

Offloading granularity determines the OU in the computation offloading process. At present, there has been some research on MCC computation offloading prototypes, which mainly focuses on the design and implementation of software systems. Researchers used different offloading granularities in their prototypes, such as the method, class, thread, VM, and web service. The offloading granularities in these prototypes are compared in Table 3. Beyond these prototypes, more research has studied MCC from a theoretical perspective, where two task/application models are commonly used; these models also determine the offloading granularity. Fig. 2 shows the categories of offloading granularity.

Fig. 2

Categories of the offloading granularity

(1) The independent task model [26, 30,31,32,33,34, 36, 40, 43, 45, 46, 49, 50, 57,58,59,60,61,62,63]. This model abstracts the task as an independent computing module, which has no interaction with other computing modules and arrives following a stochastic process (e.g., a Poisson process). The computing module is usually defined as a 3-tuple, whose items represent the size of the input data, the CPU cycles necessary to accomplish the computing module, and the maximum tolerable delay. In this model, the computing module is the OU and represents the offloading granularity.
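The 3-tuple above can be sketched as a small data type. The field names, parameter values, and the two timing helpers below are our illustrative assumptions rather than any specific paper's notation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Task:
    """An OU in the independent task model: (input size, CPU cycles, deadline)."""
    input_bits: int      # size of the input data to transmit
    cpu_cycles: int      # CPU cycles needed to complete the task
    deadline_s: float    # maximum tolerable delay

    def local_time(self, md_freq_hz: float) -> float:
        return self.cpu_cycles / md_freq_hz

    def offload_time(self, bandwidth_bps: float, cloud_freq_hz: float) -> float:
        return self.input_bits / bandwidth_bps + self.cpu_cycles / cloud_freq_hz

t = Task(input_bits=2_000_000, cpu_cycles=4_000_000_000, deadline_s=1.0)
print(t.local_time(1e9))          # 4.0 s on a 1 GHz MD
print(t.offload_time(1e7, 1e10))  # ≈ 0.6 s over a 10 Mbit/s link
```

Under this model, an offloading decision algorithm only needs such per-task attributes plus the current network and CPU states.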

(2) The graph model [35, 64,65,66,67,68,69,70,71,72]. A real-life mobile application is composed of many components (e.g., classes, threads, or methods), which are the OUs. A component can call other components for execution and may need their output data. Therefore, a mobile application can be abstracted as a graph, in which vertices represent the components and edges represent the interactive relationships between them. In this model, the vertex represents the offloading granularity. An example of the graph-based application model is illustrated in [72]. Each vertex is modeled as a 3-tuple, whose items represent the CPU cycles, an indicator of whether the component is offloadable, and the execution sequence. Some components of a mobile application are unoffloadable because they must operate MD hardware (e.g., sensors or the screen), and the corresponding vertices are marked as such. The indicator is binary, e.g., set to 1 if the vertex is offloadable and 0 otherwise. Each edge represents the interactive relationship between two vertices, and its weight denotes the amount of interactive data.
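The graph model can be sketched with a plain adjacency structure. The toy application, its component names, and all numbers below are assumed for illustration and are not taken from [72].

```python
# Vertices: (cpu_cycles, offloadable, order); edge weights: interactive data in bits.
vertices = {
    "read_sensor": (1e7, False, 0),   # touches MD hardware -> unoffloadable
    "preprocess":  (5e8, True,  1),
    "infer":       (4e9, True,  2),
    "render":      (2e8, False, 3),   # draws on the MD screen -> unoffloadable
}
edges = {  # (src, dst) -> amount of data exchanged, in bits
    ("read_sensor", "preprocess"): 8e5,
    ("preprocess", "infer"):       4e5,
    ("infer", "render"):           1e5,
}

offloadable = [v for v, (_, ok, _) in vertices.items() if ok]
print(sorted(offloadable))  # ['infer', 'preprocess']

# Data crossing the MD/cloud boundary if all offloadable vertices move to the cloud:
cut = sum(bits for (u, v), bits in edges.items()
          if (u in offloadable) != (v in offloadable))
print(cut)  # 8e5 + 1e5 = 900000.0 bits cross the boundary
```

A static partitioning algorithm would then search over which offloadable vertices to place on the cloud, trading the weighted edge cut (communication) against the CPU cycles moved off the MD.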

Table 3 Comparison of the offloading granularity in MCC computation offloading prototypes

3 Key technologies in the computation offloading system

To promote the large-scale development of computation offloading in MCC, all transactions should not be coupled together but decomposed into components. These components operate as a system under the principles of high cohesion and low coupling. The computation offloading system has three basic components: MUs, ASOs, and COs [18]; their relationship is shown in Fig. 3. MUs purchase cloud application services provided by ASOs and offload their OUs to ASOs. ASOs rent virtual resources from COs and develop various cloud application services. ASOs do not need to purchase and maintain their own computing hardware but rent the required resources from COs. This mode saves ASOs the cost of equipment purchase and maintenance, and lets them focus on developing cloud application services and improving the quality of service (QoS). Only by continuously developing high-quality cloud application services can ASOs attract more MUs and improve user stickiness. COs are responsible for infrastructure construction and physical resource operation. COs need to manage their physical resources efficiently to optimize profits or reduce energy consumption. In MCC architectures that have an edge layer, COs face the edge equipment deployment problem, i.e., how to place edge equipment efficiently. These two operators are not completely independent and can overlap: for example, a CO can use its own resources to develop cloud application services and thereby become an ASO. Offloading decision, admission control, resource management, and edge equipment deployment are these components' technical challenges. In this section, we present a comprehensive review of the existing work that aims to solve these challenges.

Fig. 3

The basic components and key technologies in the MCC computation offloading system

3.1 Offloading decision

In the computation offloading system, MUs face the offloading decision problem. An unreasonable offloading decision not only fails to improve MD performance but also results in more energy and time consumption due to the additional data transmission over wireless networks. Different from the traditional "client-server" mode, which offloads all OUs to the server, MUs need to decide whether to offload each OU according to their optimization targets [34]. At the same time, the offloading decision in MCC differs from that in grid computing and multiprocessor systems, where the optimization goal is to balance the load and to minimize the edge cut and the offloading volume [73]. MDs are much weaker than the cloud and are connected to it via wireless networks, which forces MUs to account for the time and energy consumed by wireless data transmission when making offloading decisions. Existing research can be classified by the cloud application service site (mono-site or multi-site) and the decision mode (static or dynamic). The work on offloading decision is compared in Table 4.

Table 4 Comparison of the work on offloading decision

(1) Mono-site Offloading Decision versus Multi-site Offloading Decision

In the mono-site offloading decision, only one ASO provides the cloud application service to MDs. The OUs are divided into two parts, i.e., the local MD part and the ASO part: each OU is executed locally on the MD or offloaded to the single ASO. Many ASOs offer the same services in the cloud application service market [74], and a MU can pay multiple ASOs and offload OUs to all of them. In the multi-site offloading decision, OUs are divided into \((k+1)\) parts, i.e., the local MD part and k (\(k \ge 2\)) ASO parts. The solution space of the multi-site offloading decision is larger than that of the mono-site decision; it grows rapidly with the number of ASOs and increases the complexity of finding an offloading strategy. ASOs are connected through high-speed wired networks, so for MDs, the communication energy consumption among ASOs is negligible and the communication time is very short. Therefore, compared with mono-site offloading, the communication consumption of multi-site offloading is lower. Besides, multiple ASOs provide MUs with more choices and more reliable services: MUs can continue to offload their OUs to other ASOs when some ASOs crash or disconnect. For example, if one of two ASOs in a multi-site computation offloading crashes or disconnects, OUs can still be offloaded to the ASO that works normally; in contrast, if the only ASO in a mono-site computation offloading crashes or disconnects, computation offloading stops entirely.
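The growth of the solution space is easy to quantify under the simplifying assumption of n independent OUs, each executed entirely on the MD or on exactly one ASO (our illustration, not a bound from the cited work):

```python
def strategy_count(n_ous: int, n_sites: int) -> int:
    """Each OU runs locally or on one of n_sites ASOs: (n_sites + 1) ** n_ous strategies."""
    return (n_sites + 1) ** n_ous

print(strategy_count(10, 1))  # mono-site:  2^10 = 1024
print(strategy_count(10, 3))  # multi-site with 3 ASOs: 4^10 = 1048576
```

Even three ASOs inflate the search space by three orders of magnitude for just ten OUs, which is why multi-site offloading decision algorithms are markedly more complex.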

(2) Static Offloading Decision versus Dynamic Offloading Decision

The static offloading decision is made through program analysis at the application development stage, and the offloading strategies do not change thereafter. Making a static offloading decision is usually converted into a static application partitioning problem, in which a mobile application is abstracted as a graph and the graph is partitioned into several parts. The dynamic offloading decision works at runtime, and its offloading strategies change constantly. MUs connect to ASOs via wireless networks, which are unstable and vary for many reasons, such as wireless channel fading and channel interference [75]. Moreover, MUs move among different environments, and wireless network conditions change constantly. Spatiotemporally varying wireless networks bring uncertainty to MCC, and the dynamic offloading decision is required to adapt to the varying environment. During the execution of mobile applications, MDs require real-time offloading strategies to indicate whether to offload their OUs. The key to making a dynamic offloading decision is to balance the decision time against the strategy accuracy [70]. The accuracy of the strategies found by offloading decision algorithms is positively correlated with their execution time: an algorithm needs to consume much time to find an accurate result, which makes it unable to adapt to the dynamic mobile cloud environment, especially a fast-changing one. Such algorithms may be effective when the offloading decision problem is small or the environment changes slowly; if the problem is large or the environment changes fast, their performance degrades. Therefore, it is important to balance the decision time, to adapt to the dynamic mobile cloud environment, against the strategy accuracy, to optimize the offloading target.
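A minimal dynamic decision rule can be sketched as follows. It is our simplification (a pure completion-time comparison using freshly measured bandwidth), not a published algorithm, and all parameter values are assumed.

```python
def decide_offload(task_cycles, data_bits, md_hz, cloud_hz, measured_bw_bps):
    """Runtime rule of thumb: offload iff the estimated remote time
    (transmission + cloud execution) beats local execution."""
    t_local = task_cycles / md_hz
    t_remote = data_bits / measured_bw_bps + task_cycles / cloud_hz
    return t_remote < t_local

# The same task flips between "offload" and "stay local" as the link changes.
print(decide_offload(4e9, 8e6, 1e9, 1e10, 1e7))  # good link (10 Mbit/s) -> True
print(decide_offload(4e9, 8e6, 1e9, 1e10, 1e5))  # poor link (100 kbit/s) -> False
```

The rule itself is cheap to evaluate, illustrating one side of the trade-off above: a fast but coarse decision adapts instantly to bandwidth changes, while a slower optimizing algorithm would find better strategies that may already be stale by the time they are computed.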

3.2 Admission control

After the offloading decision is made, an OU request will be sent to the ASO if the offloading strategy indicate that the OU should be executed on the cloud. ASOs rent virtual resources from COs and provide cloud application services for MUs using these virtual resources. ASOs are motivated to develop various attractive cloud application service for MUs to achieve more revenues. ASOs also expect to provide services for more OUs as much as possible to increase their revenues. However, ASOs usually rent finite virtual resources from COs to reduce the costs. If ASOs accept and provide services for all OUs, it may lead to resource overload and then affect the QoS. Therefore, the ASO needs admission control to determine whether to accept a new OU according to its load conditions. If an OU is accepted, the ASO allocates resources and then executes it. Admission control couples virtual resource allocation and locates on the side of ASOs, making it different from the hardware resource management of COs. Since many ASOs offer same services for MUs in the cloud application service market [74], a MU can pay any ASOs to offload OUs to these ASOs. ASOs are allowed to reject OU requests when accepting the OU is not revenue effective. If an OU request is rejected, it can be sent to other ASOs that provide the same service. Much existing related work studied the admission control problem with the aim of maximizing the revenue. For example, Jin et al. studied the admission control problem with the goal of maximizing the ASO’s revenues while guaranteeing the QoS [78]. They considered three features of two-dimensional resources, uncertainty, and incomplete information in model establishment and algorithm design. These work can also be classified by the type of cloud application services. ASOs only provide one type of cloud application service in some work (e.g., [81, 87, 88]) and provide two or more types of cloud application service in other work(e.g., [78,79,80, 82,83,84,85,86]). 
In comparison, admission control in the latter works has higher complexity, but it is closer to the real-life computation offloading system and is more universal. The work on admission control is compared in Table 5.
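The load-aware accept/reject logic described above can be sketched in a few lines. The following is a minimal illustration assuming a hypothetical two-dimensional resource model (CPU and memory) in the spirit of [78]; the class name, thresholds, and revenue test are all illustrative, not taken from any surveyed algorithm.

```python
# Hypothetical threshold-based admission control for an ASO.
# The ASO rents a fixed pool of virtual resources and only admits an OU
# when it both fits the remaining capacity and is revenue effective.

class AdmissionController:
    def __init__(self, cpu_capacity, mem_capacity):
        self.cpu_capacity = cpu_capacity   # rented virtual CPU units
        self.mem_capacity = mem_capacity   # rented virtual memory units
        self.cpu_used = 0
        self.mem_used = 0

    def admit(self, ou_cpu, ou_mem, revenue, cost):
        """Accept an OU only if resources suffice and it is revenue effective."""
        fits = (self.cpu_used + ou_cpu <= self.cpu_capacity and
                self.mem_used + ou_mem <= self.mem_capacity)
        profitable = revenue > cost
        if fits and profitable:
            self.cpu_used += ou_cpu
            self.mem_used += ou_mem
            return True          # ASO allocates resources and executes the OU
        return False             # rejected; the MU may try another ASO

    def release(self, ou_cpu, ou_mem):
        """Free resources when an OU finishes."""
        self.cpu_used -= ou_cpu
        self.mem_used -= ou_mem
```

A rejected OU is simply bounced back, matching the survey's observation that the MU can then forward the request to another ASO offering the same service.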

Table 5 Comparison of the work on admission control

3.3 Resource management and edge equipment deployment

(1) Resource Management

In the computation offloading system, COs are responsible for the operation and maintenance of physical resources and provide on-demand virtual resource leasing services. COs own a large amount of computing equipment, and it takes a lot of energy to keep it running. The increasing demand for large-scale computing and the construction of cloud computing facilities have resulted in annual growth in energy consumption. Data centers are among the most energy-intensive building types, consuming 10 to 50 times the energy per unit of floor space of a typical commercial office building [89]. For COs, unreasonable resource management wastes financial and material resources and also causes environmental problems, which reduces CO revenues and is not conducive to the healthy operation of MCC. Wireless networks have a serious impact on MCC performance [34], so resource management in MCC has to consider both computing and radio resources. Furthermore, changing wireless networks bring uncertainty to resource management in MCC [7]. Si et al. established a stochastic restless-bandits-based resource management model and used a hierarchy of increasingly stronger LP relaxations to solve for the resource management strategy [97]. Jin et al. studied energy-efficient resource management in two steps [94]: they first studied deterministic resource management and then stochastic resource management, taking into account the uncertainty caused by wireless networks. They established deterministic and stochastic resource management models based on the bin packing model and proposed two algorithms to solve for the resource management strategy. The work on resource management is compared in Table 6.
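The bin-packing view of energy-efficient resource management can be illustrated with a standard first-fit-decreasing heuristic: packing VM demands onto as few servers as possible lets idle machines be powered down. This is a generic sketch of the modeling idea, not the specific algorithms of [94] or [97].

```python
# Illustrative first-fit-decreasing placement of VM demands onto servers.
# Fewer active servers means more machines can be powered down, which is
# the main energy lever in this simplified bin-packing model.

def first_fit_decreasing(vm_demands, server_capacity):
    """Place VM demands (resource units) onto as few servers as possible.

    Returns a list of servers, each a list of the demands placed on it.
    """
    servers = []                       # each entry: [remaining_capacity, [vms]]
    for demand in sorted(vm_demands, reverse=True):
        for server in servers:
            if server[0] >= demand:    # first open server with enough room
                server[0] -= demand
                server[1].append(demand)
                break
        else:                          # no existing server fits: power on a new one
            servers.append([server_capacity - demand, [demand]])
    return [vms for _, vms in servers]
```

The stochastic variants surveyed above would replace the fixed `vm_demands` values with distributions reflecting wireless network uncertainty.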

Table 6 Comparison of the work on resource management

(2) Edge Equipment Deployment

When the MCC architecture includes an edge layer, the deployment of edge equipment is the most basic technical challenge and the key to "how to build MCC". Edge equipment is more geographically distributed and more numerous than centralized cloud equipment. How to deploy edge equipment reasonably is the first problem to be solved when building MCC. COs are responsible for the operation of physical equipment and provide virtual resource rental services. Unreasonable edge equipment deployment not only increases the deployment cost but also increases the complexity of other problems in MCC. In a Juniper white paper, Brown analyzed the importance of edge equipment deployment and pointed out that correct deployment is critical because capabilities deployed at the edge have a higher operational overhead than centralized deployments [98]. Mao et al. reviewed the research on edge equipment deployment, content caching, and mobility management [99]. They analyzed the difference between edge equipment deployment and base station deployment, pointing out that the former is coupled with computing and wireless resources and is strictly constrained by the budget. A simple but inefficient approach is to deploy edge equipment at every candidate site. Although this approach satisfies the resource requirements and is simple, it imposes excessive deployment and operation costs on COs. The key to deploying edge equipment is to make a tradeoff between performance requirements and cost. The work on edge equipment deployment is compared in Table 7.
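The performance-versus-cost tradeoff can be illustrated with a greedy set-cover-style sketch: repeatedly deploy at the candidate site that covers the most still-uncovered MU demand until the budget runs out. The sites, costs, and coverage sets below are hypothetical; real formulations (e.g., those reviewed in [99]) additionally couple computing and wireless resources.

```python
# Hypothetical greedy edge deployment under a budget: maximize covered
# MU demand points without exceeding the deployment budget.

def greedy_deployment(sites, budget):
    """sites: dict name -> (cost, set of demand points covered).

    Returns (list of deployed site names, set of covered demand points).
    """
    deployed, covered, spent = [], set(), 0
    while True:
        best, best_gain = None, 0
        for name, (cost, coverage) in sites.items():
            if name in deployed or spent + cost > budget:
                continue
            gain = len(coverage - covered)   # newly covered demand
            if gain > best_gain:
                best, best_gain = name, gain
        if best is None:                     # budget exhausted or no gain left
            return deployed, covered
        cost, coverage = sites[best]
        deployed.append(best)
        covered |= coverage
        spent += cost
```

Deploying "in full" corresponds to an unbounded budget; the greedy variant makes the cost constraint explicit.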

Table 7 Comparison of the work on edge equipment deployment

4 Fault tolerance and privacy protection for computation offloading

4.1 Fault tolerance for computation offloading

In MCC, MDs are connected to the cloud through wireless networks and are heterogeneous with cloud data centers, making failures likely during data transmission and/or distributed OU execution. Fault tolerance is critical to ensure the robustness of computation offloading in MCC. We summarize the work on fault tolerance into three categories according to the technology on which it is based. The work on fault tolerance is compared in Table 8.

(1) Checkpoint-based Fault Tolerance

In a computation offloading process without fault tolerance, when a failure occurs, the OU must be re-executed or the data re-transferred from the beginning, which increases energy consumption and execution time, reduces MCC performance, and can even render MCC invalid. Checkpointing, which periodically stores the parameters needed to restore OU execution, is usually used to support fault tolerance. When a failure occurs, the OU restarts its execution from a previously saved checkpoint. The computation offloading process with checkpointing is illustrated in Fig. 4 [112]. Checkpointing needs to send messages periodically to synchronize its parameters, introducing extra data transmission for MCC. The volume of synchronization messages and the number of checkpoints determine the overhead of fault tolerance. However, the classical checkpoint technology, which is not specifically designed for MCC, does not optimize the overhead caused by saving checkpoints. Deng et al. found that although the classical checkpoint technology can avoid re-executing OUs from the beginning, it cannot guarantee savings in total execution time and energy consumption [113]. They proposed a fault-tolerant mechanism that makes a trade-off between waiting for reconnection under fault tolerance and directly restarting an OU from the beginning. Some researchers improved the classical checkpoint technology to make it suitable for MCC. For example, Houssem et al. proposed an efficient collaborative checkpointing algorithm that minimizes the number of checkpoints and avoids blocking MDs [114]. Cao and Singhal proposed the concept of mutable checkpoints to address the high extra overhead of traditional checkpointing algorithms [115]. They designed an efficient checkpointing algorithm that can save checkpoints anywhere to avoid transferring large amounts of data over the wireless network.

Fig. 4 Computation offloading process with the checkpoint
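The checkpoint-and-resume behavior described above can be illustrated with a minimal sketch: an OU processes items, periodically saves its progress, and after a simulated failure resumes from the last checkpoint instead of from the beginning. The interval and state layout are illustrative assumptions, not taken from [112-115].

```python
# Minimal checkpoint-based recovery: periodically persist the index of the
# next item and the partial result; on restart, resume from that state.

def run_ou(items, checkpoint, interval=3, fail_at=None):
    """Process items starting from the saved checkpoint.

    checkpoint: dict holding 'index' (next item) and 'partial' (result so far).
    Raises RuntimeError at item fail_at to simulate a connection failure.
    """
    i = checkpoint.get("index", 0)
    total = checkpoint.get("partial", 0)
    while i < len(items):
        if fail_at is not None and i == fail_at:
            raise RuntimeError("connection lost")   # simulated failure
        total += items[i]
        i += 1
        if i % interval == 0:                       # periodic checkpoint
            checkpoint["index"] = i
            checkpoint["partial"] = total
    return total
```

With `interval=3` and a failure at item 5, the retry resumes from index 3, re-doing only items 3-4 instead of all of 0-4; this redundant work, plus the cost of saving checkpoints, is exactly the overhead the surveyed work tries to minimize.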

(2) Replication-based Fault Tolerance

Replication, which creates multiple OU replicas to support fault tolerance, is usually used when the offloading granularity is coarse (e.g., VM and class). Chen et al. developed an energy-optimized fault-tolerant MCC framework based on the "k-out-of-n" system, which ensures that a system of n components operates correctly as long as k or more components work [116]. That is, their framework ensures MCC works well as long as k out of n nodes are accessible. Li et al. proposed an energy-efficient fault-tolerant replica management policy with deadline and budget constraints to solve the management and overhead issues caused by replication [117]. Stahl et al. designed a platform that supports fault tolerance for stream services by replicating processing, providing excellent flexibility at the cost of performance when executing large-scale OUs [118].
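The "k-out-of-n" idea can be sketched directly: an OU is dispatched to n replica nodes and the offloading succeeds as long as at least k replicas finish. The node model below is a simplification for illustration only, not the framework of [116].

```python
# Toy "k-out-of-n" replication: run the OU on every replica node and
# succeed if at least k replicas complete despite individual failures.

def system_operational(node_status, k):
    """node_status: list of booleans, one per replica node."""
    return sum(node_status) >= k

def offload_with_replicas(nodes, ou, k):
    """nodes: list of callables executing the OU; may raise RuntimeError."""
    results = []
    for node in nodes:
        try:
            results.append(node(ou))
        except RuntimeError:
            continue                 # this replica failed; others may survive
    if len(results) >= k:
        return results[0]            # any surviving replica's result
    raise RuntimeError("fewer than k replicas completed")
```

Raising k increases reliability but also the energy spent on redundant execution, which is the tradeoff the energy-optimized frameworks above navigate.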

(3) Other Fault-tolerant Technologies

There are some other technologies used in fault tolerance for MCC. Zhou and Buyya argued that using only one fault-tolerant technology is not suitable for computation offloading due to MCC's heterogeneity [119]. They combined checkpoint and replication technologies to provide more efficient fault tolerance for computation offloading, proposing a group-based fault-tolerant algorithm that considers the properties of different machine groups and adaptively selects either checkpointing or replication as the fault-tolerant policy. Park et al. also combined checkpoint and replication technologies, in their case to provide more efficient fault tolerance for resource management [95]. They classified MDs into groups according to availability and mobility, and selected either checkpointing or replication according to the group characteristics. Lakhan and Li combined offloading decision with fault tolerance and proposed an offloading decision algorithm that determines application partitioning at runtime and adopts a fault-aware policy that merges detection and retry strategies to deal with any kind of failure [120]. Raju and Saritha proposed a disease-resistance-based fault-tolerant framework named DRFT, which consists of four main modules: the monitoring module, response module, knowledge module, and memory module [121]. DRFT regards a VM failure as a virus in the human body and uses the human anti-virus mechanism to repair it.
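The group-based selection idea in [95, 119] can be sketched as a simple classifier: stable groups get cheap checkpointing, volatile groups get replication. The thresholds below are illustrative assumptions, not the papers' actual rules.

```python
# Hypothetical group-based policy selection: pick a fault-tolerant policy
# per machine group based on its availability and mobility.

def select_policy(availability, mobility,
                  avail_threshold=0.9, mobility_threshold=0.5):
    """availability, mobility: values in [0, 1] characterizing the group."""
    if availability >= avail_threshold and mobility <= mobility_threshold:
        return "checkpoint"      # stable group: periodic checkpoints suffice
    return "replication"         # volatile group: keep multiple replicas
```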

Table 8 Comparison of the work on fault tolerance

4.2 Privacy protection for computation offloading

Compared with traditional cloud computing, which is accessed through wired networks, MCC is connected through wireless networks and is more vulnerable. MDs usually contain MUs' highly private personal information. Privacy leakages and malicious attacks may occur during data transmission and/or OU execution. Privacy protection is needed to ensure the security of computation offloading in MCC. We classify privacy protection technologies into two categories: passive protection and active protection. The work on privacy protection is compared in Table 9.

(1) Passive Protection

Passive protection avoids or reduces the loss caused by privacy leakages but does not actively eliminate the leakage itself. Offloading decision with passive protection includes the cost caused by privacy leakages in the total cost or treats privacy as a constraint. For example, Wu and Huang established a multi-factor multi-site risk-based offloading model based on a comprehensive offloading risk evaluation and proposed an ant-based offloading decision algorithm [122]. They incorporated two risk factors (privacy risk and reliability risk) and two benefit factors (execution time and energy consumption) into the offloading decision process. He et al. established a privacy-aware constrained-MDP-based offloading decision model, which optimizes delay and energy consumption while treating location privacy and usage pattern privacy as constraints [123]. They used Q-learning to solve for the offloading strategy and the Lagrange multiplier method to handle the constraints. Ma and Mashayekhy studied the offloading decision problem of minimizing delay and energy consumption while considering data protection [124]. They formulated the offloading decision problem as an integer program in which privacy protection is a constraint. Dhanya and Kousalya established a secure offloading decision model that minimizes the transmission cost and security cost [125]. They took security as an optimization target and proposed a genetic-algorithm-based offloading decision algorithm.
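The "risk in the total cost" formulation can be reduced to one comparison: the expected privacy-leakage cost is added to the offloading cost, so a risky cloud may lose to local execution even when it is faster. This is a deliberately simplified sketch in the spirit of [122]; all cost terms and weights are hypothetical.

```python
# Hypothetical risk-aware offloading decision: compare local execution cost
# against offloading cost plus the expected cost of a privacy leakage.

def decide(local_cost, offload_cost, leak_probability, leak_penalty):
    """Return 'local' or 'offload' by comparing total expected costs."""
    expected_offload = offload_cost + leak_probability * leak_penalty
    return "offload" if expected_offload < local_cost else "local"
```

With a 10% leak probability the cheap offload wins; raise the probability to 50% and the same OU stays local, which is exactly the passive-protection behavior described above.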

(2) Active Protection

Active protection eliminates privacy leakages through encryption or non-offloading. For example, Liu and Lu established an energy model for privacy-preserving computation offloading and used homomorphic encryption to protect data in image retrieval before sending it to servers [126]. Wu et al. proposed a trust-aware computation offloading framework consisting of a trust evaluation module, a filtering module, and a selection module [127]. They filtered out untrusted resource providers through trust evaluation to ensure that the services provided to MUs are trustworthy. Wang et al. ran deep learning applications on MDs with the help of MCC and proposed a lightweight privacy-preserving mechanism, consisting of arbitrary data nullification and random noise addition, to protect sensitive information [128]. Yue et al. implemented a computation offloading system that automatically performs fine-grained privacy-preserving offloading of Android applications using static analysis and bytecode instrumentation techniques [129]. They marked the statements that retrieve and manipulate private data as unoffloadable. Saab et al. proposed a minimum-cut-based runtime offloading decision algorithm that adds the computation cost of encryption and decryption to the application cost and thus takes security measures into account [130]. Zhang et al. proposed a privacy protection method named match-then-decrypt, which performs a matching operation before the decryption operation [131]. They proposed a basic anonymous attribute-based encryption (ABE) construction and then obtained a security-enhanced extension based on strongly existentially unforgeable one-time signatures.
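The non-offloading flavor of active protection, as in the fine-grained partitioning of [129], can be reduced to a toy partitioner: tasks that touch private data are marked unoffloadable and kept on the MD, while the rest may be offloaded. The task model and the notion of "touching private data" here are simplified assumptions, not the instrumentation of [129].

```python
# Toy privacy-aware partitioning: keep any task that accesses a private
# variable on the MD; everything else is a candidate for offloading.

def partition(tasks, private_vars):
    """tasks: list of (name, set of variables accessed).

    Returns (local_tasks, offloadable_tasks) as lists of task names.
    """
    local, offloadable = [], []
    for name, used in tasks:
        if used & private_vars:          # accesses private data: keep local
            local.append(name)
        else:
            offloadable.append(name)
    return local, offloadable
```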

Table 9 Comparison of the work on privacy protection

5 Future directions

5.1 Systematic and scaled prototype

For a new technology to have lasting vitality, it must be integrated into people's daily life and change how people live or work. To promote MCC computation offloading, it is necessary to systematize MCC and to decouple the transactions involved in the computation offloading process. We review four key technologies for MUs, ASOs, and COs, the three basic components of the computation offloading system in MCC. Most of these technologies are theoretical, and their goal is to optimize some system indicator (e.g., minimizing energy consumption, minimizing execution time, or maximizing revenue). As mentioned in Sect. 2.2, current MCC prototypes are small-scale, and their goals are mainly to verify that MCC can indeed save energy or time. MCC is a combination of many well-known technologies, which makes it hard for it to leave a deep impression on people. Just as cloud computing was controversial in its early days, the feasibility and practicality of MCC are controversial now. A systematic and large-scale prototype of MCC computation offloading is needed to demonstrate its feasibility. Only when people actually touch and use MCC can they really accept it. Traditional cloud computing is more like the B2B mode: its users are enterprise users, and it places fewer demands on MUs. MCC is more like the B2C mode and requires deep MU participation (e.g., refactoring mobile applications or changing the MD OS). Therefore, although it can bring greater changes to people's life and work, achieving a systematic and large-scale MCC prototype remains a challenge.

MCC will eventually become an infrastructure, just like the power system. Whether it can become essential infrastructure for people's life depends on its user stickiness. The ability of MCC to provide MUs with various applications is the basis for maintaining this stickiness. The current goal of MCC computation offloading is simply to save energy or time, just as electricity was used merely to light incandescent lamps in its early era. To promote the large-scale development of MCC, it is necessary to provide MUs with more MCC applications, analogous to the electricity-based "refrigerators", "air conditioners", and "washing machines" of the power system. Therefore, in the construction of a systematic and large-scale MCC prototype, the development of MCC-based applications is also a very important research and business direction.

5.2 Other technologies

This section gives an outlook on technologies that need to be paid attention to in the future. We discuss these technologies from the perspective of “device-pipe-cloud”.

(1) Technologies for the “Device”

For MDs, two technologies need to be paid more attention to in the future:

(1.1) Portability MCC aims to free MDs from heavy local workloads and to make the MD a simple modem connecting humans to the electromagnetic signal-based network. The portability of MDs must be strengthened further to help humans access the information network anytime and anywhere. One of the most direct ways to improve portability is to reduce the MD volume. However, in the existing touch-based interaction mode, too small a volume hinders human-computer interaction. Therefore, it is necessary to study portable MDs with new human-computer interaction modes. Wearable devices, such as smart glasses (e.g., Google Glass), are portable MDs that can serve as carriers for MCC in the future. At present, there have been some research attempts (e.g., [132,133,134,135,136]) to combine MCC and smart glasses. Furthermore, the brain-computer interface, as an interactive way to create a connection between the brain and external platforms, provides more possibilities for the miniaturization and portability of MDs. There are also related studies (e.g., [137, 138]) that combine brain-computer interfaces with MCC.

(1.2) Distributed computation offloading platform MCC computation offloading belongs to the category of distributed computing. However, existing mobile OSs, such as Android and iOS, provide weak support for MCC, and most current MCC prototypes implement computation offloading by refactoring mobile applications. For example, to automate the refactoring process, Zhang et al. implemented a refactoring tool named DPartner [134] that automatically transforms Android application bytecode into MD and cloud patterns. Although refactoring mobile applications can implement MCC computation offloading functionally, it is inefficient. It is necessary to design a distributed computation offloading platform for MCC. The difficulty lies in heterogeneity: the computing devices in MCC differ greatly, especially in the CPU architecture and computing power of MDs and the cloud. Therefore, how to design a distributed computation offloading platform that satisfies MCC requirements is an important research direction for the future. At present, there is some related work (e.g., KubeEdge [140], K3s [141], MicroK8s [142], and FLEDGE [143]) in this direction.

(2) Technologies for the “Pipe”

As the "pipe" connecting MDs and the cloud, wireless networks have a great impact on MCC computation offloading and can even render it invalid. Two characteristics of the wireless network affect computation offloading: it consumes MD energy when transmitting data, and its limited bandwidth makes data transmission slow. Future research on the "pipe" should move toward wireless network technologies with low energy consumption and large bandwidth. Two research directions are worthy of attention:

(2.1) New generation of wireless network technology For example, 5G is the new generation of cellular mobile communication technology, which aims at improving data rates, reducing delay, saving energy, reducing cost, and improving system capacity. According to the key 5G characteristics defined by the International Telecommunication Union (ITU), the peak data rate (the maximum data rate an MU can achieve under ideal conditions) is 20 Gbit/s, the MU experienced data rate (the lowest ubiquitous data rate in the coverage area) is 100 Mbit/s, the over-the-air delay is 1 ms, and the energy efficiency (on both the network side and the MD side) will increase 100 times [144]. Wi-Fi 6, also known as "802.11ax Wi-Fi", is the new generation of WLAN technology. Compared with previous generations of Wi-Fi, Wi-Fi 6 has a faster data rate (a peak of 9.6 Gbit/s), lower delay, and better power saving. Wi-Fi 6 applies target wake time technology, which actively plans the communication time between the MD and the wireless router to reduce wireless antenna usage and signal search time, thereby reducing MD energy consumption [145].

(2.2) Integrating existing technologies to improve wireless network performance Concurrent multipath transfer (CMT) provides a solution that integrates existing wireless networks. At present, most MDs are equipped with multiple wireless network interfaces (e.g., smartphones have both a cellular interface and a Wi-Fi interface), but traditional network protocols can only use one interface to transmit data at a time. To solve this problem, CMT, which uses multiple wireless network interfaces to transmit data simultaneously, was proposed. There are several CMT protocols, such as the stream control transmission protocol (SCTP) [146] and the multipath transmission control protocol (MPTCP) [147]. These technologies provide a realistic basis for applying CMT in MCC. MD energy consumption can be reduced, and network bandwidth increased, by reasonable data scheduling or path selection with CMT [148,149,150,151]. Jin et al. explored the usage of CMT in MCC computation offloading to combat the challenges caused by wireless communication, and their results show that CMT can further save energy and time [70]. CMT exploits idle wireless network bandwidth and increases the aggregate bandwidth by using multiple networks simultaneously. Large bandwidth reduces the delay caused by wireless data transmission. Different wireless networks have different energy characteristics, and MD energy consumption can be reduced by optimizing data scheduling according to these characteristics. Besides enhancing MCC's efficiency, CMT can also improve MCC's reliability. In MCC, MDs are connected to the cloud through unreliable wireless networks, which are easily influenced by the outside environment; a failed wireless network disables MCC. With CMT, MCC can use multiple wireless networks and switch to healthy ones when some networks fail.
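A back-of-the-envelope CMT schedule makes the bandwidth and energy effects concrete: split a transfer across interfaces in proportion to their bandwidths so all paths finish together, then estimate energy from per-interface power draw. The bandwidth and power figures are made up for illustration; real schedulers (e.g., in MPTCP) are far more dynamic.

```python
# Simplified proportional CMT scheduling across multiple wireless interfaces.

def cmt_schedule(data_bits, interfaces):
    """interfaces: dict name -> (bandwidth_bps, power_watts).

    Returns (per-interface bit shares, transfer time in seconds,
    energy in joules). Bits are split so every path finishes together.
    """
    total_bw = sum(bw for bw, _ in interfaces.values())
    shares = {name: data_bits * bw / total_bw
              for name, (bw, _) in interfaces.items()}
    time_s = data_bits / total_bw            # all paths finish simultaneously
    energy = sum(power * time_s for _, power in interfaces.values())
    return shares, time_s, energy
```

With both interfaces active, the transfer time equals the data size over the aggregate bandwidth, capturing the delay benefit of CMT; comparing the energy figure against single-interface transmission captures the scheduling tradeoff discussed above.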

(3) Technologies for the “Cloud”

As the external resource platform, the bottom layer of the cloud is composed of a large number of data centers. Running current data centers requires a lot of power; for example, data centers consume approximately 2% of the total electricity in America [89]. As mentioned in Sect. 1, the number of MDs will be even larger in the future. To provide cloud resources for these MDs, more data centers must be built, which in turn will lead to more energy consumption and more serious environmental problems. Therefore, energy saving for the cloud will be an important and urgent problem. In addition to the resource management and edge equipment deployment technologies described in Sect. 3.3, some other energy-saving technologies should be noted. Renewable energy, such as solar and wind energy, can be used to power the more dispersed edge equipment [152,153,154]. The centralized remote cloud data center can reduce its energy consumption by optimizing cooling technology [155,156,157].

6 Conclusion

As people's life and work become increasingly dependent on MDs, the limited resources of MDs cause many inconveniences. MCC computation offloading is an efficient way to solve the problem of limited MD resources. It aims to offload OUs from MDs to external platforms and to free MDs from heavy local workloads. MCC has attracted wide attention because of its tremendous potential, and much research on it has been done. In this survey, we presented a comprehensive overview and outlook of research on computation offloading in MCC. Researchers have given different definitions of MCC based on their research scenarios; we summarized the MCC architecture and offloading granularity, two fundamental concepts of MCC, to classify these definitions. The MCC computation offloading system is decomposed into three basic components: MUs, ASOs, and COs. Offloading decision, admission control, resource management, and equipment deployment are their respective technical challenges, and we conducted a comprehensive literature review of the four key technologies used to address them. The wireless network connection and heterogeneity are the most salient features of MCC, making failures and privacy leakages more likely to occur. We reviewed fault tolerance and privacy protection, two important technologies that support computation offloading. Finally, we presented a research outlook on the systematic prototype and other technologies from the perspective of "device-pipe-cloud".