1 Introduction

Advancements in computing technology have expanded the usage of computers from desktops and mainframes to a wide range of mobile and embedded applications, including surveillance, environmental sensing, GPS navigation, mobile phones, autonomous robots, etc. Many of these applications run on systems with limited resources. For example, mobile phones are battery-powered. Environmental sensors have small physical sizes, slow processors, and small amounts of storage. Most of these applications use wireless networks, whose bandwidths are orders of magnitude lower than those of wired networks. Meanwhile, increasingly complex programs are running on these systems—for example, video processing on mobile phones and object recognition on mobile robots. Thus there is a growing gap between the demand for complex programs and the availability of limited resources.

Offloading is a solution that augments these mobile systems’ capabilities by migrating computation to more resourceful computers (i.e., servers). This differs from the traditional client-server architecture, where a thin client always migrates computation to a server. Computation offloading is also different from the migration model used in multiprocessor systems and grid computing, where a process may be migrated for load balancing [62]. The key difference is that computation offloading migrates programs to servers outside of the users’ immediate computing environment; process migration for grid computing typically occurs from one computer to another within the same computing environment, i.e., the grid. Offloading is in principle similar to efforts like SETI@home [5], where requests are sent to surrogates that perform the computation. The difference is that SETI@home is a large-scale distributed computing effort involving thousands of users, whereas offloading is typically used to augment the computational capability of a resource-constrained device for a single user. The terms “cyber foraging” and “surrogate computing” are also used to describe computation offloading. In this paper, we use these terms interchangeably.

A significant amount of research has been performed on computation offloading: making it feasible, making offloading decisions, and developing offloading infrastructures, as shown in Table 1. Prior to 2000, researchers mostly focused on making offloading feasible, primarily because of limitations in wireless networks, such as low bandwidths. In the early 2000s, the focus moved to developing algorithms for making offloading decisions, i.e., deciding whether offloading would benefit mobile users. Since then, improvements in virtualization technology, network bandwidths, and cloud computing infrastructures have made computation offloading more practical and have shifted the direction of the field. This paper surveys the development of computation offloading for mobile systems over the last 15 years and identifies directions for future research.

Table 1 Research foci in offloading over the past 15 years

Offloading may save energy and improve performance on mobile systems. However, this usually depends on many parameters such as the network bandwidths and the amounts of data exchanged through the networks. Many algorithms have been proposed to make offloading decisions to improve performance or save energy [8, 13, 26, 29, 30, 35, 41, 44–46, 52, 56, 67, 74, 80, 83]. The decisions are usually made by analyzing parameters including bandwidths, server speeds, available memory, server loads, and the amounts of data exchanged between servers and mobile systems. The solutions include partitioning programs [13, 15, 35, 44–46, 52, 60, 72, 74, 83] and predicting parametric variations in application behavior and execution environment [26, 30, 36, 67, 80].

Offloading requires access to resourceful computers for short durations through networks, wired or wireless. These servers may use virtualization to provide offloading services so that different programs and their data can be isolated and protected. The need for isolation and protection has motivated research on developing infrastructures for offloading at various granularities [4, 25, 27, 28, 32, 37, 58, 60, 66, 70, 71, 77, 84]. Offloading may be performed at the levels of methods [66], tasks [84], applications [83], or virtual machines [16]. Java RMI, .NET remoting, and RPC (remote procedure call) are mechanisms enabling offloading at the class and object level. Techniques have also been proposed to enable offloading at the virtual-machine level; for example, Chun and Maniatis [16] use cloud computing to enable offloading. Cloud computing allows elastic resources and offloading to multiple servers; it is an important enabler for computation offloading. Various infrastructures and solutions have been proposed to improve offloading; they address issues such as transparency to users, privacy, security, and mobility.

The purpose of this paper is to acquaint readers with research on computation offloading for mobile systems. This paper provides an overview of the motivations, techniques, technological enablers, and architectures for computation offloading. It surveys the common approaches used to make offloading decisions, and classifies these approaches based on various factors, including

  • why to offload (improve performance or save energy)

  • when to make the offloading decision (static vs dynamic)

  • what mobile systems use offloading (laptops, PDAs, robots, sensors)

  • types of applications (multimedia, gaming, calculators, text editors, predictors)

  • infrastructures for offloading (grid and cloud computing).

This paper serves as a collective reference for the algorithmic mechanisms and the associated infrastructures, and identifies existing barriers and directions for research. The paper is organized as follows: Section 2 describes a brief history of enabling technologies. Section 3 explains the two objectives for offloading, reducing execution time and saving energy, and describes infrastructures and tools developed to address the challenges of offloading. Section 4 describes why offloading will become increasingly important in the years to come, and Section 5 concludes the paper.

As is apparent throughout this paper, many studies have been conducted on topics related to computation offloading, and a comprehensive survey of all of them would be impossible. Hence, this paper does not intend to provide a complete survey of the field. The references were selected based on our knowledge of the topics and to create a coherent flow for this paper. Readers should be aware that some important papers may not be included due to length limits.

2 Enabling technology

This section describes some enabling technologies for computation offloading. Figure 1 shows how various technological advancements contributed to offloading. The graph shows the number of publications with the term “offloading” in the title or abstract, obtained by searching IEEE Xplore. In the following subsections, we discuss two significant enablers for offloading: (1) wireless networks and mobile agents and (2) virtualization and cloud computing.

Fig. 1
figure 1

Enabling technologies for computation offloading. The paper counts are obtained from IEEE Xplore

2.1 Wireless networks and mobile agents

Until the late 1990s, unstable and intermittent wireless network connectivity and low bandwidths were the main problems for mobile systems. The focus was on building wireless data networks (in particular WiFi) to facilitate mobility. These improvements spurred many research activities in mobile computing, including mobile agents.

Mobile agents are autonomous programs that can control their movement from machine to machine in a heterogeneous network. Mobile agents introduced the concept of migrating computation from mobile devices. Infrastructures for mobile agents targeted platform independence and used technologies like Java and XML [38, 40, 81, 82]. Kotz et al. [40] suggest using mobile agents for accessing Internet resources from a portable device as a solution to poor network connections, variable network addresses, and varying signal quality. They propose Agent Tcl: a mobile-agent system whose agents can be written in Tcl, Java, or Scheme. Wong et al. [81, 82] suggest using Java, mobile agents, and XML technologies to compose an enterprise platform that is application-independent, portable, and efficient. They propose Concordia: a Java-based infrastructure for mobile agents with design goals of flexible agent mobility, agent collaboration, agent persistence, reliable agent transmission, and agent security. Joseph et al. [38] present a toolkit called Rover for mobile information access: it uses relocatable dynamic objects and queued remote procedure calls to overcome connectivity problems. All these technologies focus on migrating computation for mobile devices, handling network connectivity, and leveraging Java for developing platform-independent applications.

2.2 Virtualization and cloud computing

Virtualization was first developed in the 1960s by IBM as a way to logically partition large mainframe computers into smaller, independent computing units [23]. This enabled multitasking: the ability to run multiple applications and processes at the same time. Multitasking was necessary at that time because of mainframes’ high costs. Virtualization lost popularity during the 1980s and early 1990s when inexpensive x86 desktop computers became popular [68]. Rather than sharing resources centrally in the mainframe model, organizations used the low-cost desktops for their computational needs. However, new problems emerged, including

  • Underutilization: Typical deployments have very low utilization of total computing capacity. Users want to run only a few applications per computer to obtain better response times. As a result, many computers are underutilized.

  • Security: Desktops are often managed by individual users and they have to regularly apply security patches. Otherwise, the computers can become vulnerable.

  • Operational costs: The total cost of ownership can grow rapidly for supporting increasing numbers of desktops and laptops, and for upgrading and updating software. Moreover, these computers may waste power as they are often kept on 24 hours a day.

Virtualization re-emerged as a solution over the last decade by making it possible to run multiple operating systems and multiple applications on the same computer (or a set of computers) simultaneously, increasing utilization and flexibility [3, 10, 51, 68]. Since different types of virtual machines can be created, users can scale the number of virtual machines based on demand. Virtualization also gives these machines separation and protection from each other. Cloud computing uses virtualization to offer computing as a service; users can “lease” computing resources based on their requirements. This paper focuses on mobile systems, which can use cloud computing for offloading. An overview of cloud computing and its potential to influence the future of computing can be found in [6]. Several other articles discussing cloud computing applications, research, and implementations can be found in [11, 12, 31, 55, 73, 78].

3 Offloading decisions

Since offloading migrates computation to a more resourceful computer, it involves making a decision regarding whether and what computation to migrate. A vast body of research exists on offloading decisions for (1) improving performance and (2) saving energy. Sections 3.1 and 3.2 describe these two purposes for offloading. Section 3.3 provides a taxonomy and surveys existing studies. Section 3.4 describes some of the research areas to improve offloading, and surveys some infrastructures and solutions.

3.1 Improve performance

Offloading becomes an attractive solution for meeting response-time requirements on mobile systems as applications become increasingly complex [9]. Another goal is meeting real-time constraints. For example, a navigating robot may need to recognize an object before it collides with the object; if the robot’s processor is too slow, the computation may need to be offloaded [52, 69]. Another application is context-aware computing [34], where multiple streams of data from different sources, such as GPS, maps, accelerometers, and temperature sensors, need to be analyzed together in order to obtain real-time information about a user’s context. In many of these scenarios, the limited computing speeds of mobile systems can be enhanced by offloading.

The condition for offloading to improve performance can be formulated as follows. Without loss of generality, we can divide a program into two parts: one part that must run on the mobile system and the other part that may be offloaded. The first part may include the user interface and the code that handles peripherals (such as the mobile system’s camera). Let \(s_m\) be the speed of the mobile system. Suppose w is the amount of computation for the second part. The time to execute the second part on the mobile system is

$$ \frac{w}{s_m}. $$
(1)

If the second part is offloaded to a server, sending the input data \(d_i\) takes \(\frac{d_i}{B}\) seconds at bandwidth B. Here we ignore the initial setup time for the network. The program itself may also need to be sent to the server. We assume the size of the program is negligible, or that the server may download the program from another site through a high-speed network [17]. Offloading can improve performance when execution, including computation and communication, can be performed faster at the server. Let \(s_s\) be the speed of the server. The time to offload and execute the second part is

$$ \frac{d_i}{B} + \frac{w}{s_s}. $$
(2)

Offloading improves performance when Eq. 1 > Eq. 2:

$$ \frac{w}{s_m} > \frac{d_i}{B} + \frac{w}{s_s} \Rightarrow w \times \left(\frac{1}{s_m} - \frac{1}{s_s}\right)> \frac{d_i}{B}. $$
(3)

This inequality holds for

  • large w: the program requires heavy computation.

  • large \(s_s\): the server is fast.

  • small \(d_i\): a small amount of data is exchanged.

  • large B: the bandwidth is high.

This inequality also shows the limited effect of the server’s speed. If \(\frac{w}{s_m} < \frac{d_i}{B}\), offloading cannot improve performance even if the server is infinitely fast (i.e., \(s_s \to \infty\)). Hence, only tasks that require heavy computation (large w) with light data exchange (small \(d_i\)) should be considered. This requires analyzing programs to identify such tasks. Moreover, if we define \(w (\frac{1}{s_m} - \frac{1}{s_s}) - \frac{d_i}{B}\) as the performance gain of offloading, the server’s speed has diminishing returns: doubling \(s_s\) will not double the gain.
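As an illustration, the decision rule in Eq. 3 can be written as a short function. This is a sketch with illustrative parameter names and units (computation in cycles, speeds in cycles per second, data in bits, bandwidth in bits per second); a real system would also account for the network setup time that the analysis above ignores.

```python
def should_offload_for_speed(w, s_m, s_s, d_i, B):
    """Eq. 3: offload if local execution time exceeds transfer + server time.

    w   -- computation of the offloadable part (cycles)
    s_m -- mobile system's speed (cycles/s)
    s_s -- server's speed (cycles/s)
    d_i -- input data to transmit (bits)
    B   -- network bandwidth (bits/s)
    """
    local_time = w / s_m              # Eq. 1
    offload_time = d_i / B + w / s_s  # Eq. 2
    return local_time > offload_time

# Heavy computation, light data: 10 s locally vs 1.1 s offloaded.
print(should_offload_for_speed(w=1e9, s_m=1e8, s_s=1e10, d_i=1e6, B=1e6))  # True
```

Note that when \(\frac{w}{s_m} < \frac{d_i}{B}\), the function returns False no matter how large \(s_s\) is, matching the observation above.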

3.2 Save energy

Energy is a primary constraint for mobile systems. A survey of 7,000 users across 15 countries showed that “75% of respondents said better battery life is the main feature they want” [1, 2]. Smartphones are no longer used only for voice communication; instead, they are used for acquiring and watching videos, gaming, web surfing, and many other purposes. As a result, these systems will likely consume more power and shorten the battery life. Even though battery technology has been steadily improving, it has not been able to keep up with the rapid growth of power consumption of these mobile systems. Offloading may extend battery life by migrating the energy-intensive parts of the computation to servers [42].

The following analysis explains the conditions under which offloading saves energy. Suppose \(p_m\) is the power consumed by the mobile system while computing. The energy to perform the task can be obtained by modifying Eq. 1:

$$ p_m \times \frac{w}{s_m}. $$
(4)

Let \(p_c\) be the power required to send data from the mobile system over the network. After sending the data, the system needs to poll the network interface while waiting for the result of the offloaded computation. During this time, the power consumption is \(p_i\). Incorporating these parameters in Eq. 2 gives

$$ p_c \times \frac{d_i}{B} + p_i \times \frac{w}{s_s}. $$
(5)

Offloading saves energy when Eq. 4 > Eq. 5:

$$ p_m \times \frac{w}{s_m} > p_c \times \frac{d_i}{B} + p_i \times \frac{w}{s_s} $$
(6)
$$ \Rightarrow w \times \left(\frac{p_m}{s_m} - \frac{p_i}{s_s}\right) > p_c \times \frac{d_i}{B} $$
(7)

Equations 3 and 7 are very similar. To make offloading save energy, tasks with heavy computation (large w) and light communication (small \(d_i\)) should be considered. In both equations, we assume that data must be transmitted from the mobile system to the server. Fortunately, this may not be true in many cases. For example, the data (such as photographs and videos) may already reside on servers with high-speed networks (such as Facebook.com and YouTube.com). Instead of transmitting the data from the mobile system to the server, the mobile system only needs to provide links, and the server may download the data directly from the hosting sites. In this case, the effective bandwidth B can be substantially higher, allowing offloading to improve performance and save energy.
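The energy condition in Eq. 6 can be sketched the same way; the parameter names mirror the analysis above, and the numbers in the example are purely illustrative.

```python
def should_offload_for_energy(w, s_m, s_s, d_i, B, p_m, p_c, p_i):
    """Eq. 6: offload if local energy exceeds transmit + idle energy.

    p_m -- power while computing locally (W)
    p_c -- power while transmitting data (W)
    p_i -- power while idling, waiting for the server (W)
    (w, s_m, s_s, d_i, B are as in the performance analysis)
    """
    local_energy = p_m * w / s_m                    # Eq. 4
    offload_energy = p_c * d_i / B + p_i * w / s_s  # Eq. 5
    return local_energy > offload_energy

# 5 J locally vs 1.01 J offloaded: offloading saves energy here.
print(should_offload_for_energy(w=1e9, s_m=1e8, s_s=1e10, d_i=1e6, B=1e6,
                                p_m=0.5, p_c=1.0, p_i=0.1))  # True
```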

3.3 Comparison of existing studies

The previous sections describe the objectives for offloading: improving performance and saving energy. In this section, we classify existing studies based on the following criteria:

  • When is the offloading decision made? Is it made statically during program development or dynamically during execution?

  • How are tasks identified for offloading?

  • What applications are offloaded?

  • What types of mobile systems benefit from offloading?

Tables 2 and 3 classify different papers based on these criteria. The papers are ordered by years so that readers can see the progression more easily.

Table 2 Offloading techniques for improving performance
Table 3 Offloading techniques for saving energy

3.3.1 Static or dynamic decisions

The offloading decision can be static or dynamic. When the decision is static, the program is partitioned during development. Static partitioning has the advantage of low overhead during execution; however, this approach is valid only when the parameters can be accurately predicted in advance. In Eqs. 3 and 7, \(s_m\), \(p_c\), \(p_i\), and \(p_m\) can usually be estimated accurately. The server’s speed \(s_s\) may vary, but some cloud vendors can guarantee a minimum level of performance. The other parameters, w, \(d_i\), and B, may vary widely due to run-time conditions. A static scheme may predict some of these parameters and decide how the application is offloaded. Prediction algorithms include probabilistic prediction [67], history-based prediction [30, 36], and fuzzy control [26].

In contrast, dynamic decisions can adapt to different run-time conditions, such as fluctuating network bandwidths. Dynamic approaches may also use prediction mechanisms for decision making; for example, the bandwidth B can be monitored and predicted using a Bayesian scheme [80]. However, dynamic decisions incur higher overhead because the program has to monitor the run-time conditions. Even for programs with dynamic decisions, the tasks that may potentially be offloaded are identified during program development; partitioning a program during execution is undesirable due to the very high overhead of analyzing the program. Figure 2a shows the split between static and dynamic decisions in the papers surveyed.
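To make the prediction idea concrete, one simple history-based predictor for the bandwidth B is an exponentially weighted moving average over recent measurements. This is only a minimal stand-in for the schemes cited above (the Bayesian approach of [80] is more sophisticated); the weight `alpha` is an assumed tuning parameter.

```python
class BandwidthPredictor:
    """Exponentially weighted moving average of observed bandwidths."""

    def __init__(self, alpha=0.3):
        self.alpha = alpha     # weight given to the newest sample
        self.estimate = None   # current prediction of B

    def observe(self, bandwidth):
        """Record one bandwidth measurement and return the new estimate."""
        if self.estimate is None:
            self.estimate = bandwidth
        else:
            self.estimate = (self.alpha * bandwidth
                             + (1 - self.alpha) * self.estimate)
        return self.estimate

# Run-time samples in Mbit/s; the estimate smooths out the sudden drop.
predictor = BandwidthPredictor()
for sample in [8.0, 6.0, 7.5, 2.0]:
    B_hat = predictor.observe(sample)
# B_hat would feed into Eq. 3 / Eq. 7 before each offloading decision.
```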

Fig. 2
figure 2

a Types of algorithms used for offloading—in recent years, there are fewer static (i.e. development time) decisions and more dynamic (i.e. execution time) decisions. b Most frequently used types of applications for offloading. c Percentage breakdown of different types of devices used by the applications

3.3.2 Program partition

Before making the offloading decision, the offloadable parts of a program have to be identified. This is usually achieved by partitioning the program. Various algorithms [13, 15, 35, 44–46, 52, 60, 72, 74, 83] partition the computation between a mobile system and a server. A typical approach represents the program as a graph: the vertices represent the computational components (such as functions) and the edges represent the communication between them [59]. Figure 3 shows an example of dividing a program using graph partitioning. The program takes inputs x, y, z and produces output r. In the figure, the computation consists of four functions A, B, C, D, and the objective is to decide which of these functions to offload. Each function is a possible candidate for offloading; however, it is difficult to apply the offloading analysis from Eqs. 3 and 7 to each function independently. We have to consider the intermediate data sent between the functions, given by \(x_1\), \(y_1\), \(z_1\), \(x_2\), and \(z_2\). For example, A generates \(x_1\), \(y_1\), \(z_1\); it sends \(x_1\), \(y_1\) to B and it sends \(y_1\), \(z_1\) to C. Let us assume that the data \(x_1\), \(y_1\), and \(z_2\) are large. As a result, we want to keep the functions exchanging these data on the same system, so that the communication between them is a local function call and no network communication is needed. Some of the functions may require heavy computation, and the energy to perform that computation on the mobile system must also be considered. The figure shows a possible partition between a mobile system and a server. The optimal decision depends on the relative tradeoffs between computation and communication; it must consider all the vertices simultaneously and corresponds to a graph partitioning problem, which is NP-complete even if all the parameters are known in advance [33].
In other scenarios, where the program information is unknown [83], the application may not be partitioned; the entire application is either executed on the mobile system or offloaded.
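The tradeoff can be illustrated with a toy version of Fig. 3. The vertex costs, edge weights, speeds, and bandwidth below are invented for illustration; because the general problem is NP-complete, this sketch simply enumerates all placements of the unpinned functions, which is feasible only for tiny graphs (real partitioners use heuristics or approximations). It also assumes sequential execution, so times are simply summed.

```python
from itertools import product

# Toy graph in the spirit of Fig. 3 (all numbers are illustrative).
comp = {'A': 5, 'B': 20, 'C': 30, 'D': 10}   # computation per function
data = {('A', 'B'): 8, ('A', 'C'): 8,
        ('B', 'D'): 1, ('C', 'D'): 1}        # data sent on each edge

s_m, s_s, B = 1.0, 10.0, 2.0    # mobile speed, server speed, bandwidth
pinned = {'A': 'mobile', 'D': 'mobile'}  # A reads inputs, D writes output
free = [v for v in comp if v not in pinned]

def total_time(placement):
    """Computation time on each side plus transfer time for cut edges."""
    t = sum(comp[v] / (s_m if placement[v] == 'mobile' else s_s)
            for v in comp)
    t += sum(d / B for (u, v), d in data.items()
             if placement[u] != placement[v])
    return t

best = min(({**pinned, **dict(zip(free, choice))}
            for choice in product(['mobile', 'server'], repeat=len(free))),
           key=total_time)
print(best, total_time(best))
```

With these numbers, offloading both B and C wins even though two edges cross the partition; making the cut edges heavy enough flips the decision back to local execution.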

Fig. 3
figure 3

Expressing a program as a graph and partitioning the graph between a mobile system and a server: x, y, z are inputs to the program and r is the output. The region marked “mobile” is executed on the mobile system, and the region marked “server” is executed on the server. The size of a vertex indicates the amount of computation and the width of an edge indicates the amount of data sent from one vertex to the other

3.3.3 Applications

Applications used for evaluation include text editors [26, 36, 49], multimedia [13, 35, 44–46, 49, 59, 66, 74], vision and recognition [9, 41, 52, 56, 83, 84], and gaming [15, 71]. Text editors transfer relatively small amounts of data (i.e., text, hence small \(d_i\) in Eqs. 3 and 7) and perform computations like spell checking. Multimedia, vision, and recognition applications transfer large amounts of data in the form of images and videos (large \(d_i\)). If all the multimedia is on the mobile system, the offloading decision depends primarily on B; if most of the data is already on a server (such as YouTube), then offloading can be more beneficial. Gaming applications like chess are interesting candidates for offloading because the amount of computation needed for the program to win depends on the skill level of the user; thus the computation w depends on how the user plays the game. Figure 2b shows the different applications used by various papers. The applications run on a wide range of clients with different computational capabilities, including PDAs, laptops, and robots, as shown in Fig. 2c.

3.4 Infrastructures

The previous sections describe the conditions under which offloading computation can improve performance, save energy, or both. Many papers have contributed infrastructures to make offloading practical. These infrastructures address various issues, such as:

  (a) Inter-operability: Different types of resource-constrained devices may interact and connect across different types of networks to one or many servers. For example, devices like the iPhone switch to a 3G network when no WiFi network is available. Since the 3G radio is typically slower and consumes more power than WiFi, the offloading decision may vary based on the network available. Moreover, offloading may be possible between systems of different computational capabilities; it is important to hide these interactions from the user [4, 16, 38, 39, 43, 53, 64, 65, 77].

  (b) Mobility and Fault Tolerance: Offloading relies on wireless networks and servers; thus it is important to handle failures and to provide reliable services. Fault tolerance enables the system to continue executing the application in the event of network congestion, network failure, or server failure. Studies that have addressed this issue include [38, 43, 53, 57].

  (c) Privacy and Security: Privacy is a concern because users’ programs and data are sent to servers that are not under the users’ control. Security is an issue because a third party may access confidential data. Many studies have been conducted on protecting outsourced data [14, 75, 76]. Solutions include steganography [47], homomorphic encryption [21, 22], and hardware-based secure execution [7]. Most of these solutions have limitations in practice: for example, encryption keys may be too large and dramatically increase the amount of data, and efficient computation on encrypted data is still a research topic.

  (d) Context Awareness: This refers to the device being able to perceive the user’s state and surroundings and infer context information. This is important because the mechanism of offloading may vary depending on the user’s location and context; various studies suggest adaptive mechanisms based on such information [25, 27, 37, 39, 64].

All these issues continue to be active areas of research. Table 4 describes some infrastructures and contributions that alleviate these issues.

Table 4 Frameworks to alleviate the challenges faced by offloading

4 Offloading in the future

How important will offloading be in the years to come? In this section, we discuss how growth in mobile data and mobile applications, and the computational capabilities of mobile devices will impact offloading in the future.

In the past few years, two important trends have occurred:

  • Sensor deployment: Sensors are widely deployed for monitoring the environment or for security. These sensors acquire large amounts of data but have limited computing capabilities.

  • Growth in smartphones: Smartphones have become the primary computing platforms for millions of people. Figure 4 shows that the volume of mobile data is forecast to grow rapidly. Mobile platforms generate large amounts of multimedia data, and most of the data are stored on-line on cloud servers.

Fig. 4
figure 4

Mobile data growth projected in the years to come. A six-fold increase in mobile data is projected, particularly in multimedia such as videos. Source: Cisco, 2010

Sensors and mobile platforms represent an entirely new set of input devices. Consider the possibility of millions of connected cameras, microphones, GPS receivers, and many other types of sensors. The amounts of data they can produce would be staggering, and the information and knowledge that could be extracted would dwarf what we call the information explosion today. As the number of connected devices—including mobile phones, tablets, laptops, and sensors—grows, the demand for increased functionality will continue. In the next few years, we will see pressing needs for personalized management of multimedia data. This would be a natural progression of the Internet. Before the Internet became popular, people already had large amounts of on-line documents stored in their desktop computers or company mainframes. In the 1990s, as the Internet became popular, many documents were posted on-line and keyword-based search became necessary; search engines were an important driving force for the Internet in the late 1990s. The first ten years of the 21st century marked the rapid growth of personal multimedia data: images, videos, and audio. As the amounts of multimedia data grow, users need better ways to manage their data than relying on file names, dates, and directories. It would be inconvenient to ask users to describe every image and every video by a set of keywords and then use keyword-based search. This will lead to a rapid growth in recognition and data-management technologies on these connected devices. Many of these technologies can achieve large speedups with parallelism, since multimedia processing offers many opportunities for both code and data parallelism.

The computing speeds of these connected devices, however, will not grow at the same pace as servers’ performance. This is due to several constraints, including

  • Form factor: Users want devices that are smaller and thinner, yet they also want devices with more computational capability.

  • Power consumption: Current battery technology constrains the clock speed of processors, since doubling the clock speed approximately octuples the power consumption. It becomes difficult to offer long battery lifetimes with high clock speeds.

These factors indicate that mobile computing speeds will not grow as fast as the growth in data and in applications’ computational requirements. Where do these trends intersect? On one hand, we have massive growth in mobile data, in both type and volume, and in the computational requirements of mobile applications. On the other hand, the computational capabilities of the devices that acquire and store the data, and provide applications to the user, are unlikely to grow at the same pace. Offloading computation is a natural solution to this problem.

The economic model for offloading, by renting computation, is provided by virtualization and cloud computing. As the various connected devices become more widespread in their deployment, offloading techniques that can take advantage of cloud computing will become increasingly relevant. Applications on these connected devices will start to be designed such that they have “offloadable” computation—and such design of applications can benefit from the various techniques and solutions surveyed in this paper.

5 Conclusion

This paper surveys and classifies a vast body of research associated with computation offloading for mobile systems. We examine how enablers like mobile agents and virtualization make offloading feasible. We survey different types of algorithms used to partition and offload programs in order to improve performance or save energy. We classify the types of applications that have been used to demonstrate offloading. We list some of the research areas associated with offloading, and describe some infrastructures and solutions that address these research areas. Finally we describe why computation offloading will become increasingly important for resource constrained devices in the future.