1 Introduction

With the rapid development of wireless technology and cloud computing, the number of intelligent devices (such as smartphones [1]) accessing wireless networks [2, 3] is increasing. Furthermore, mobile devices are becoming more intelligent, and their applications require extensive computation power and persistent data processing [4,5,6]. However, the improvement of these emerging applications is limited by the computational power of mobile devices [7, 8]. With cloud computing, a computation-intensive task can be offloaded to the cloud for execution, which compensates for the inadequate computing capability of mobile terminals. However, a mobile device that accesses the cloud through a wireless network incurs a longer delay, so this approach is not suitable for delay-sensitive tasks such as wearable virtual reality (VR) [9].

Edge computing provides distributed computing and storage capabilities for mobile devices by deploying servers at the network edge [10]. It can therefore support computation-intensive and storage-intensive intelligent applications while ensuring lower delay and higher performance [11]. Based on edge computing, network-based information processing for distributed applications, such as [12], can be realized. Thus, edge computing has the advantage of low delay and is an important research direction for future wireless communication [13, 14]. Specifically, offloading tasks to the edge cloud has two advantages [15]. On the one hand, the edge cloud has more computational resources than mobile devices [16,17,18]. On the other hand, it avoids the larger delay caused by offloading computation-intensive and data-intensive tasks to the remote cloud [19,20,21]. Thus, for delay-sensitive and computation-intensive tasks, offloading to the edge cloud may achieve a better tradeoff between delay and energy efficiency [22].

For task offloading based on the edge cloud, there are three types of offloading scenarios. (i) Single user and single server, i.e., one mobile user offloads its computation task to one edge cloud. (ii) Multiple users and single server, i.e., many mobile users offload their computation tasks to one edge cloud. (iii) Multiple users and multiple servers, i.e., many mobile devices offload their computation tasks to many edge cloud servers. Researchers have designed different offloading schemes for these scenarios. Moreover, the complexity of the multi-user, multi-server scenario and the dense deployment of 5G networks (i.e., 5G ultra-dense cellular networks) lead to diverse edge cloud deployments [23, 24]. As a result, existing optimization-based task offloading schemes have difficulty reaching the optimal solution. Fortunately, the development of artificial intelligence (AI), and of deep learning in particular, has produced breakthroughs in fields such as computer vision [25]. Researchers hope to combine AI algorithms with computation offloading to make edge computation offloading more intelligent. For example, Chen et al. [26] study task offloading using a Markov decision process, a traditional machine learning approach. Their test results show that this task offloading scheme achieves lower delay.

Furthermore, a few scholars have designed task offloading schemes using deep learning algorithms. For the computation offloading problem under user mobility, Sun et al. put forward a computation offloading scheme that minimizes task duration based on multi-armed bandit theory [27]. Chen et al. study edge cloud task offloading using deep reinforcement learning. However, all of the above task offloading schemes run the learning algorithm on the mobile device and fail to consider the following two limitations. (i) Running a deep-learning-based task offloading scheme requires a large amount of the mobile device's computing resources, i.e., the deep learning algorithm is itself a computing task. Existing task offloading schemes concentrate on designing delay-efficient and energy-efficient offloading algorithms and ignore the cost of computing the offloading algorithm itself. (ii) The computing capability of the mobile device is limited, so a deep-learning-based offloading scheme may not be able to run on it, in which case the designed optimal offloading scheme cannot be obtained. Therefore, where to run the learning-based offloading scheme and how to offload tasks using deep learning remain challenging problems.

To address these challenges, we focus on the computing problem of the task offloading algorithm and propose the learning for smart edge architecture. Specifically, we first propose a layered smart task offloading architecture. Then we introduce the optimal task offloading problem of mobile devices in the multi-user, multi-server scenario. Furthermore, we provide the cognitive learning-based computation offloading (CLCO) scheme. Finally, simulation experiments show that the proposed task offloading strategy achieves lower task processing delay and lower battery energy consumption than the compared schemes. The problem studied in this paper is illustrated in Fig. 1.

Fig. 1 An illustration of computation offloading in edge cloud

The main contributions of this paper are summarized as follows:

  • We focus on the computing problem of the task offloading algorithm and propose the learning for smart edge architecture. From the two perspectives of learning the task offloading scheme and executing it, we describe where and how to run the learning-based task offloading scheme.

  • For the specific task offloading problem, we give the optimal task offloading scheme of the mobile device based on cognitive learning. Simulation results show that the proposed CLCO scheme achieves lower task delay and energy consumption.

2 System architecture

In this section, we introduce the learning for smart edge architecture from the perspectives of task offloading scheme learning and task offloading scheme execution. The architecture mainly targets two kinds of applications: (i) computation-intensive applications, such as VR and AR, and (ii) data-intensive applications, such as personalized video services [28,29,30]. These applications are generally delay sensitive [31] and need real-time task offloading. The architecture makes full use of the computing, storage and network resources of the edge server, introduces the cognitive learning method to computation offloading, and runs both the learning of the task offloading scheme and the scheme itself on the edge cloud, in order to reduce task delay and improve the quality of user service.

Specifically, we divide the learning for smart edge architecture into three layers: the edge resource cognitive layer, the computation task cognitive layer, and the global management cognitive layer. The edge resource cognitive layer includes the physical resource cognitive layer [32] and the virtual resource cognitive layer. It first applies software definition to the physical, data link and network resources by means of network function virtualization [33]. In other words, the physical computing, storage and communication resources are virtualized to form virtual resources. The software-defined virtual resources are then encapsulated to form edge-centered resources. The computation task cognitive layer covers cognition of the offloading task and of the mobile device. It can thus recognize the computation amount and transmission required by the task, as well as the power consumption of the mobile device. Different users generate different tasks and have different task delay requirements. The global management cognitive layer is responsible for task offloading learning and for formulating the corresponding offloading strategy. Specifically, it formulates the task offloading strategy by analyzing the features of the offloading task and the resource state of the edge clouds, offloads tasks from a global perspective, and thereby reaches the optimal decision.

For the above architecture, the learning-based task offloading flow is as follows.

  • Where to run the learning-based task offloading scheme. In this paper, we assume that the learning-based task offloading scheme runs on the edge cloud rather than on the mobile device. The edge cloud obtains the task offloading strategy by training a deep learning algorithm on existing task offloading data.

  • How to run the learning-based task offloading scheme. When the mobile user generates a task, the mobile device first sends the task parameters to the edge cloud. The edge cloud learns the optimal task offloading scheme in accordance with its own computing resources. Once the edge cloud has determined the offloading scheme, it transmits the scheme to the mobile device, and the mobile device offloads the task accordingly.
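The flow above can be pictured as a small message exchange. The following is only a sketch under stated assumptions: the class and field names (`TaskParameters`, `OffloadDecision`, `EdgeCloud`, `run_flow`) are hypothetical, and the decision rule inside `decide` is a simple placeholder standing in for the learned scheme, not the method proposed in this paper.

```python
from dataclasses import dataclass


@dataclass
class TaskParameters:
    """Parameters the mobile device sends to the edge cloud (hypothetical fields)."""
    omega: float      # required CPU cycles
    size_bits: float  # input data size in bits
    f_local: float    # device CPU capability in cycles/s
    rate_bps: float   # uplink rate to this edge cloud in bit/s


@dataclass
class OffloadDecision:
    """Offloading scheme returned by the edge cloud."""
    ratio: float  # share of the task to offload, in [0, 1]


class EdgeCloud:
    def __init__(self, f_alloc: float):
        self.f_alloc = f_alloc  # CPU capability granted to the device, cycles/s

    def decide(self, params: TaskParameters) -> OffloadDecision:
        # Placeholder policy standing in for the learned scheme:
        # offload everything if the edge cloud finishes sooner than the device.
        t_local = params.omega / params.f_local
        t_edge = params.size_bits / params.rate_bps + params.omega / self.f_alloc
        return OffloadDecision(ratio=1.0 if t_edge < t_local else 0.0)


def run_flow(params: TaskParameters, edge: EdgeCloud) -> float:
    """Device sends parameters, edge cloud decides, device offloads accordingly."""
    decision = edge.decide(params)
    return decision.ratio * params.omega  # CPU cycles actually offloaded


params = TaskParameters(omega=1e9, size_bits=8e6, f_local=1e9, rate_bps=1e7)
cycles_fast_edge = run_flow(params, EdgeCloud(f_alloc=1e10))
cycles_slow_edge = run_flow(params, EdgeCloud(f_alloc=2e9))
```

The point of the sketch is the division of labor: the device only transmits a few parameters and applies the returned ratio, while the decision logic lives entirely on the edge cloud.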

3 System model

We consider an edge cloud enabled network consisting of M edge clouds, indexed by \(\mathcal {M}=\{1,2,\cdots ,M\}\). Let \(\mathcal {N}=\{1,2,\cdots ,N\}\) denote the set of mobile devices. In this paper, each mobile device can connect to multiple edge clouds through wireless channels and can offload its task to the edge clouds it is connected to. Denote by \(\mathcal {A}_{i}\) the set of edge clouds that provide computation offloading services to mobile device i. We also define \(f_{j}\) and \(c_{j}\) as the maximum computing capability and storage capacity of edge cloud j, respectively.

3.1 Task model

We denote the computing task to be processed by mobile device i as \(Q_{i}=\{\omega_{i},s_{i}\}\), where \(\omega_{i}\) is the computational demand of the task \(Q_{i}\), i.e., the number of CPU cycles required, and \(s_{i}\) is the size of the task, i.e., the size of the input data. For example, for an emotion recognition task, \(\omega_{i}\) is the computing resource required by the emotion recognition algorithm (such as a deep learning algorithm), and \(s_{i}\) is the size of the emotion data. In this paper, we assume that the task is separable, i.e., one part of each task can be executed locally and the other part can be offloaded to the edge cloud for execution. We denote by \(u_{i,j}\) the ratio of the computation amount offloaded by user i to edge cloud j to the total computation amount of the task, where \(j\in \mathcal {A}_{i}\).
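As a minimal sketch of this task model, the separable task and its offloading ratios can be held in a small data structure. The class name `Task`, its field names and the numeric values are illustrative assumptions; the check that the ratios sum to at most one is not stated in the text but is implied by the local share \(1-\sum_{j}u_{i,j}\) being non-negative.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Separable task Q_i = {omega_i, s_i} with offloading ratios u_{i,j}."""
    omega: float  # computational demand in CPU cycles
    size: float   # input data size in bits
    ratios: dict = field(default_factory=dict)  # edge cloud index j -> u_{i,j}

    def local_fraction(self) -> float:
        """Share of the task executed locally: 1 - sum_j u_{i,j}."""
        return 1.0 - sum(self.ratios.values())

    def is_valid(self) -> bool:
        """Each ratio lies in [0, 1] and the local share is non-negative."""
        each_ok = all(0.0 <= u <= 1.0 for u in self.ratios.values())
        return each_ok and self.local_fraction() >= 0.0


# Illustrative values: 25% goes to edge cloud 1, 50% to edge cloud 3.
task = Task(omega=1e9, size=8e6, ratios={1: 0.25, 3: 0.5})
over_committed = Task(omega=1e9, size=8e6, ratios={1: 0.7, 2: 0.6})
```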

3.2 Communication model

We now describe the communication model between the mobile device and the edge cloud. Let \(p_{i}\) denote the transmission power of mobile device i, and let \(h_{i,j}\) denote the channel power gain between mobile device i and edge cloud j. The offloading rate for \(Q_{i}\) is then defined as:

$$ r_{i,j}=B\log_{2}\left( {1+\frac{p_{i}h_{i,j}}{\sigma^{2}+I_{i,j}}}\right) $$
(1)

where \(\sigma^{2}\) is the noise power, B is the channel bandwidth, and \(I_{i,j}\) denotes the interference power between edge clouds. Then, the transmission delay when mobile device i offloads the task to edge cloud j can be defined as:

$$ T_{i,j}^{tra}=\frac{u_{i,j}s_{i}}{r_{i,j}} $$
(2)

Furthermore, the energy consumption when mobile device i offloads the task \(Q_{i}\) to edge cloud j can be defined as follows:

$$ E_{i,j}^{ec}=\frac{p_{i}u_{i,j}s_{i}}{r_{i,j}} $$
(3)
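Equations (1) to (3) translate directly into code. The sketch below uses the bandwidth, power, noise and channel gain values from the simulation settings of this paper; the interference term is assumed to be zero, and the task size and offloading ratio are illustrative assumptions.

```python
import math


def offloading_rate(bandwidth_hz, tx_power_w, channel_gain, noise_w, interference_w=0.0):
    """Eq. (1): r = B * log2(1 + p * h / (sigma^2 + I)), in bit/s."""
    sinr = tx_power_w * channel_gain / (noise_w + interference_w)
    return bandwidth_hz * math.log2(1.0 + sinr)


def transmission_delay(ratio, size_bits, rate_bps):
    """Eq. (2): T_tra = u * s / r, in seconds."""
    return ratio * size_bits / rate_bps


def transmission_energy(tx_power_w, ratio, size_bits, rate_bps):
    """Eq. (3): E_ec = p * u * s / r, i.e. power times transmission delay."""
    return tx_power_w * transmission_delay(ratio, size_bits, rate_bps)


# B = 1 MHz, p = 0.2 W, h = 1e-5, sigma^2 = 1e-9 W; interference assumed zero.
rate = offloading_rate(1e6, 0.2, 1e-5, 1e-9)
# Assumed task: 8e6 bits of input, half of it offloaded.
delay = transmission_delay(0.5, 8e6, rate)
energy = transmission_energy(0.2, 0.5, 8e6, rate)
```

With these settings the signal-to-noise ratio is about 2000, so the rate is close to 11 Mbit/s and sending half of the assumed task takes roughly a third of a second.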

3.3 Computation model

3.3.1 Local computation

For the part of the computing task that runs locally, we define \({f_{i}^{l}}\) as the local computing capability of mobile device i in cycles per second. The local execution time of the computing task \(Q_{i}\) is then:

$$ {T_{i}^{l}}=\frac{(1-{\sum}_{j\in \mathcal{A}_{i}}u_{i,j})\omega_{i}}{{f_{i}^{l}}} $$
(4)

Likewise, the local energy consumption of the mobile device is defined as follows:

$$ {E_{i}^{l}}=\kappa({f_{i}^{l}})^{2}\left( 1-\sum\limits_{j\in \mathcal{A}_{i}}u_{i,j}\right)\omega_{i} $$
(5)

where \(\kappa\) is a constant related to the chip architecture. In this paper, we set \(\kappa = 10^{-25}\).
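The local computation model of Eqs. (4) and (5) can be sketched as two short functions. The 1 GHz device capability matches the simulation settings of this paper, while the task size of \(10^{9}\) cycles and the local share of one half are illustrative assumptions.

```python
KAPPA = 1e-25  # chip-architecture constant, as set in this paper


def local_time(local_share, omega, f_local):
    """Eq. (4): T_l = (1 - sum_j u_ij) * omega / f_l, in seconds."""
    return local_share * omega / f_local


def local_energy(local_share, omega, f_local, kappa=KAPPA):
    """Eq. (5): E_l = kappa * f_l^2 * (1 - sum_j u_ij) * omega."""
    return kappa * f_local ** 2 * local_share * omega


# Assumed example: half of a 1e9-cycle task runs on a 1 GHz device.
t_l = local_time(0.5, 1e9, 1e9)
e_l = local_energy(0.5, 1e9, 1e9)
```

Because the energy grows with the square of the CPU frequency, a faster local CPU shortens \(T_{i}^{l}\) linearly but raises \(E_{i}^{l}\) quadratically, which is one reason offloading can pay off under a battery constraint.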

3.3.2 Edge cloud computation

We define \(f_{j,i}^{ec}\) as the CPU computing capability allocated to mobile device i by edge cloud j. As in [14], the duration of the task \(Q_{i}\) executed on edge cloud j can be defined as:

$$ T_{i,j}^{ec}=T_{i,j}^{tra}+T_{i,j}^{proc}=\frac{u_{i,j}s_{i}}{r_{i,j}}+\frac{u_{i,j}\omega_{i}}{f_{j,i}^{ec}} $$
(6)

Furthermore, the task duration when the task of mobile device i is executed on the edge clouds is as follows:

$$ T_{i}^{ec}=\sum\limits_{j\in \mathcal{A}_{i}}(T_{i,j}^{tra}+T_{i,j}^{proc})=\sum\limits_{j\in \mathcal{A}_{i}}\left( \frac{u_{i,j}s_{i}}{r_{i,j}}+\frac{u_{i,j}\omega_{i}}{f_{j,i}^{ec}}\right) $$
(7)

Similarly, the energy consumption when mobile device i offloads the task to the edge clouds is as follows:

$$ E_{i}^{ec}=\sum\limits_{j\in \mathcal{A}_{i}}E_{i,j}^{ec}=\sum\limits_{j\in \mathcal{A}_{i}}\frac{p_{i}u_{i,j}s_{i}}{r_{i,j}} $$
(8)
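Equations (7) and (8) sum the per-edge-cloud terms over the connected set \(\mathcal{A}_{i}\). A minimal sketch follows, where the dictionaries are keyed by the edge cloud index j and all numeric values are illustrative assumptions.

```python
def edge_duration(ratios, size_bits, omega, rates, f_alloc):
    """Eq. (7): sum over j of transmission time plus processing time."""
    return sum(
        u * size_bits / rates[j] + u * omega / f_alloc[j]
        for j, u in ratios.items()
    )


def edge_energy(ratios, size_bits, tx_power_w, rates):
    """Eq. (8): sum over j of p * u_ij * s / r_ij."""
    return sum(tx_power_w * u * size_bits / rates[j] for j, u in ratios.items())


# Assumed example: two connected edge clouds with different rates and allocations.
ratios = {1: 0.5, 2: 0.25}   # u_{i,1}, u_{i,2}
rates = {1: 1e7, 2: 2e7}     # r_{i,j} in bit/s
f_alloc = {1: 1e10, 2: 5e9}  # f_{j,i}^{ec} in cycles/s
t_ec = edge_duration(ratios, 8e6, 1e9, rates, f_alloc)
e_ec = edge_energy(ratios, 8e6, 0.2, rates)
```

Note that, as in Eq. (7), the contributions of the individual edge clouds are added rather than combined with a maximum, so the sketch inherits that modeling choice.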

3.4 Problem formulation

From the above analysis, we obtain the task duration and energy consumption for processing the task of mobile device i. Suppose that the residual energy of mobile device i is \(E_{i}\). We aim to minimize the total duration of all tasks subject to the limited battery capacity. The optimization variable is \(u_{i,j}\). The optimization problem can be defined as follows:

$$\begin{array}{@{}rcl@{}} \underset{u_{i,j}}{\text{minimize}} && \sum\limits_{i = 1}^{N}({T_{i}^{l}}+T_{i}^{ec}) \\ \text{subject to} && C1: E_{i}^{ec}+{E_{i}^{l}}\leq E_{i} \\ && C2:\sum\limits_{i = 1}^{N}s_{i}u_{i,j}\leq c_{j}, j = 1,2, \cdots, M \\ && C3:\sum\limits_{i = 1}^{N}f_{j,i}^{ec}\leq f_{j}, j = 1,2, \cdots, M \\ && C4: u_{i,j}\in [0,1] \end{array} $$
(9)

where the objective function minimizes the total task duration. Constraint (C1) states that the energy consumed in processing the task of mobile device i cannot exceed its residual energy. Constraint (C2) ensures that the data offloaded to an edge cloud cannot exceed the storage capacity of that edge cloud. Constraint (C3) indicates that the computing capability allocated by an edge cloud to the mobile devices cannot exceed its total computing capability. Constraint (C4) indicates that the tasks are separable.
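To make problem (9) concrete, the sketch below evaluates the objective and checks constraints C1 to C4 for a given offloading matrix. It is only an evaluator, not a solver. The function name `evaluate`, the nested-list layout and all numeric values are assumptions; the extra check that each device's ratios sum to at most one is also an assumption, implied by the local share in Eq. (4) but not listed among C1 to C4. All allocated capabilities are assumed to be positive.

```python
KAPPA = 1e-25  # chip constant from the local energy model


def evaluate(u, omega, size, rate, f_alloc, f_local, p, e_res, c, f_max):
    """Return (objective, feasible) for problem (9).

    u, rate and f_alloc are N x M nested lists indexed [i][j];
    the other arguments are per-device or per-edge-cloud lists.
    """
    n, m = len(u), len(u[0])
    total, feasible = 0.0, True
    for i in range(n):
        local_share = 1.0 - sum(u[i])
        t_local = local_share * omega[i] / f_local[i]
        e_local = KAPPA * f_local[i] ** 2 * local_share * omega[i]
        t_edge = sum(u[i][j] * (size[i] / rate[i][j] + omega[i] / f_alloc[i][j])
                     for j in range(m))
        e_edge = sum(p[i] * u[i][j] * size[i] / rate[i][j] for j in range(m))
        total += t_local + t_edge
        feasible = feasible and e_local + e_edge <= e_res[i]        # C1
        feasible = feasible and all(0.0 <= x <= 1.0 for x in u[i])  # C4
        feasible = feasible and local_share >= 0.0                  # assumed extra check
    for j in range(m):
        feasible = feasible and sum(size[i] * u[i][j] for i in range(n)) <= c[j]  # C2
        feasible = feasible and sum(f_alloc[i][j] for i in range(n)) <= f_max[j]  # C3
    return total, feasible


# Assumed toy instance: two devices, one edge cloud, each device offloads half.
args = dict(u=[[0.5], [0.5]], omega=[1e9, 2e9], size=[8e6, 4e6],
            rate=[[1e7], [1e7]], f_alloc=[[5e9], [5e9]], f_local=[1e9, 1e9],
            p=[0.2, 0.2], c=[1e7], f_max=[1e10])
objective, ok = evaluate(e_res=[120.0, 120.0], **args)
_, ok_low_battery = evaluate(e_res=[10.0, 10.0], **args)
```

In this toy instance the objective is 2.4 s, and the same offloading matrix becomes infeasible once the residual energy drops to 10 J because constraint C1 is violated.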

4 Cognitive learning-based computation offloading

In this section, based on the system model, we introduce the CLCO scheme. In view of the optimization problem (9), the general solution of this problem is utilizing the optimization theory (e.g., convex optimization). However, solving by the optimization theory is based on the following two assumed conditions. (i) From the perspective of mobile device, the mobile device knows the state of edge cloud or assumed to comply with some distribution. (ii) It is assumed that the mobile device has the sufficient computing capability to compute and obtain the optimal offloading scheme. However, in the 5G scene, such as in 5G ultra-dense cellular network, each small cell is provided with edge cloud [34]. Each mobile device may connect with multiple edge clouds in the vicinity, and the resource state of edge cloud is changed rapidly. Thus, these assumptions are unrealistic.

We therefore propose the CLCO scheme. The idea is as follows. When the mobile device is idle, pre-computation offloading is carried out in advance to provide a preliminary task offloading strategy. When a task actually arrives, reinforcement learning is used to further optimize the strategy. Suppose CLCO iterates T times. The basic steps are as follows.

  • We assume that every mobile device offloads its task to the edge clouds it is connected to.

  • In iteration t ≤ T, the mobile device updates its offloading strategy according to the state of the edge cloud in the previous iteration.

  • In iteration t ≤ T, the edge cloud updates its state according to the offloading strategy of the mobile device.

  • After T iterations, the mobile device provides the optimal offloading strategy.

Next, we describe the CLCO scheme in detail. Let \(t\in\{1,2,\cdots,T\}\) index the iterations. Following the above discussion, we define the state of task offloading as \(s_{t}=E_{t}\), where \(E_{t}\) is the energy of the mobile device at iteration t. Thus, we obtain:

$$ s_{t + 1}=s_{t}-({E_{i}^{l}}(t)+E_{i}^{ec}(t)) $$
(10)

where \({E_{i}^{l}}(t)\) is the energy consumed by mobile device i for local processing at iteration t, and \(E_{i}^{ec}(t)\) is the energy consumed by the mobile device when the task is processed in the edge cloud at iteration t. Meanwhile, we define the action as:

$$ a_{t}=\left( \left( 1-\sum\limits_{j\in \mathcal{A}_{i}}u_{i,j}(t)\right),\sum\limits_{j\in \mathcal{A}_{i}}u_{i,j}(t)\right) $$
(11)

where \((1-\sum \limits _{j\in \mathcal {A}_{i}}u_{i,j}(t))\) is the share processed locally, and \(\sum \limits _{j\in \mathcal {A}_{i}}u_{i,j}(t)\) is the share processed in the edge cloud. Next, we show how to obtain the offloading strategy by Q-learning.
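The state transition of Eq. (10) and the action of Eq. (11) can be written down in a few lines. The function names and the numeric values below are illustrative assumptions.

```python
def next_state(energy, e_local, e_edge):
    """Eq. (10): s_{t+1} = s_t - (E_l(t) + E_ec(t))."""
    return energy - (e_local + e_edge)


def make_action(ratios):
    """Eq. (11): a_t = (local share, offloaded share) for ratios u_{i,j}(t)."""
    offloaded = sum(ratios.values())
    return (1.0 - offloaded, offloaded)


# Assumed example: 100 J of residual energy, 50 J spent locally, 0.5 J on transmission.
s_next = next_state(100.0, 50.0, 0.5)
a_t = make_action({1: 0.25, 2: 0.5})
```

Since the state is the residual energy, it can only decrease from one iteration to the next, and the two components of the action always sum to one.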

When a mobile device offloads a task, it does not know the resource state and computation load of the edge cloud, so the state of the edge cloud has to be explored. In this paper, we adopt model-free Q-learning for dynamic strategy selection. Specifically, we define the Q-function \(Q(s_{t},a_{t})\) as the value of performing action \(a_{t}\) when the system state is \(s_{t}\). To choose the action \(a_{t}\), we adopt the ε-greedy strategy: with probability ε, an action is chosen uniformly at random among all possible actions for exploration; with probability 1 − ε, the best known action (i.e., the one maximizing the Q-function) is used. We define the reward of performing the action as follows:

$$ R_{t}=\beta_{t}\frac{T_{i}^{ec}(t)-{T_{i}^{l}}(t)}{{T_{i}^{l}}(t)}+ \beta_{e}\frac{E_{i}^{ec}(t)-{E_{i}^{l}}(t)}{{E_{i}^{l}}(t)} $$
(12)

where \(\beta_{t}\) and \(\beta_{e}\) are the weights that the mobile device assigns to task duration and energy consumption, respectively. In this paper, we adopt the Bellman equation to update the Q-function as follows.

$$ \begin{array}{ll} Q(s_{t}, a_{t}) &\gets Q(s_{t}, a_{t})+\\ &\eta\left( R_{t} + \gamma \max \limits_{a_{t + 1}}Q(s_{t + 1}, a_{t + 1})-Q(s_{t}, a_{t})\right) \end{array} $$
(13)

where η is the learning rate and γ is the discount factor. To sum up, the complete procedure is given as Algorithm 1.

Algorithm 1
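The building blocks described in this section, namely the reward of Eq. (12), ε-greedy action selection and the update of Eq. (13), can be sketched with a tabular Q-function. This is a minimal sketch and not a reproduction of Algorithm 1: the discretized action set, the values of η and γ, the equal weights and the dictionary-based Q-table are all assumptions. The reward follows Eq. (12) as printed and therefore assumes that \(T_{i}^{l}(t)\) and \(E_{i}^{l}(t)\) are nonzero.

```python
import random
from collections import defaultdict


def reward(t_ec, t_l, e_ec, e_l, beta_t=0.5, beta_e=0.5):
    """Eq. (12) as printed; requires t_l > 0 and e_l > 0."""
    return beta_t * (t_ec - t_l) / t_l + beta_e * (e_ec - e_l) / e_l


def epsilon_greedy(q, state, actions, epsilon, rng):
    """Explore uniformly with probability epsilon, otherwise take the best known action."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


def q_update(q, s, a, r, s_next, actions, eta=0.1, gamma=0.9):
    """Eq. (13): Q(s,a) <- Q(s,a) + eta * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q[(s_next, a2)] for a2 in actions)
    q[(s, a)] += eta * (r + gamma * best_next - q[(s, a)])
    return q[(s, a)]


# Assumed discretization: the action is the total offloaded share.
actions = (0.0, 0.5, 1.0)
q_table = defaultdict(float)  # unseen (state, action) pairs start at 0
rng = random.Random(0)

r_example = reward(t_ec=1.0, t_l=2.0, e_ec=1.0, e_l=4.0)
updated = q_update(q_table, s=0, a=0.5, r=1.0, s_next=1, actions=actions)
greedy_choice = epsilon_greedy(q_table, 0, actions, epsilon=0.0, rng=rng)
```

In a full run, these three functions would be called once per iteration t, with the state advanced by Eq. (10) after each action.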

5 Performance analysis

In this section, we consider a system with 5 edge clouds and 300 mobile devices. We set the transmission bandwidth B of each mobile device to 1 MHz and the transmission power \(p_{i}\) to 0.2 W. The noise power \(\sigma^{2}\) and the channel power gain h are \(10^{-9}\) W and \(10^{-5}\), respectively. The computation amount \(\omega_{i}\) and the data size \(s_{i}\) of each task are drawn from a normal distribution. The computing capabilities of the edge cloud and the mobile device are 10 GHz and 1 GHz, respectively.
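The settings above can be collected in a small configuration, together with a task generator that samples from a normal distribution. The means and standard deviations of the distribution are not given in this paper, so the values used below are purely illustrative assumptions, as are the names `SETTINGS` and `sample_tasks`. Negative samples are clipped to zero, which is also an assumption.

```python
import random

# Values taken from the simulation settings stated above.
SETTINGS = {
    "num_edge_clouds": 5,
    "num_devices": 300,
    "bandwidth_hz": 1e6,
    "tx_power_w": 0.2,
    "noise_w": 1e-9,
    "channel_gain": 1e-5,
    "f_edge_hz": 10e9,
    "f_local_hz": 1e9,
}


def sample_tasks(n, mean_omega, std_omega, mean_size, std_size, seed=0):
    """Draw n tasks (omega_i, s_i) from normal distributions, clipped at zero."""
    rng = random.Random(seed)
    tasks = []
    for _ in range(n):
        omega = max(rng.gauss(mean_omega, std_omega), 0.0)
        size = max(rng.gauss(mean_size, std_size), 0.0)
        tasks.append((omega, size))
    return tasks


# Assumed distribution parameters: about 1e9 cycles and 8e6 bits per task.
tasks = sample_tasks(SETTINGS["num_devices"], 1e9, 2e8, 8e6, 2e6, seed=0)
```

Fixing the seed makes the sampled workload reproducible, which is convenient when different offloading schemes are compared on the same tasks.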

We first compare the CLCO scheme with the following computation offloading strategies. (i) Random offloading scheme: the mobile device offloads its computing task to edge clouds at random until the computing capacity of the edge cloud is reached. (ii) Uniform offloading scheme: the mobile device offloads its computation task uniformly across the edge clouds until the computing capacity of the edge cloud is reached. Figure 2 compares the task duration of random offloading, uniform offloading and the CLCO scheme under different data sizes and computation amounts per task. We observe that the CLCO scheme outperforms both the random offloading and the uniform offloading strategies as the average data size and the average computation amount per task increase. This can be explained by the fact that neither the random offloading scheme nor the uniform offloading scheme takes the characteristics of the task into account.

Fig. 2 Task duration achieved by random offloading, uniform offloading and CLCO for various values of (a) the average data size per task and (b) the average computation amount per task
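The two baseline schemes are described only briefly, so the following is one possible reading rather than the implementation used in the experiments. It assumes that the whole task is split across the edge clouds, evenly for uniform offloading and with random weights for random offloading, and that each share is clipped once the remaining capacity of an edge cloud, expressed here in CPU cycles, is used up. The function names and numeric values are hypothetical.

```python
import random


def cap_ratios(ratios, omega, remaining):
    """Clip each ratio so that u * omega fits the remaining capacity (in cycles)."""
    capped = [min(u, rem / omega) for u, rem in zip(ratios, remaining)]
    left = [rem - u * omega for u, rem in zip(capped, remaining)]
    return capped, left


def uniform_offloading(omega, remaining):
    """Split the task evenly over all edge clouds, subject to capacity."""
    m = len(remaining)
    return cap_ratios([1.0 / m] * m, omega, remaining)


def random_offloading(omega, remaining, rng):
    """Split the task with random weights over all edge clouds, subject to capacity."""
    weights = [rng.random() for _ in remaining]
    total = sum(weights) or 1.0  # guard against an all-zero draw
    return cap_ratios([w / total for w in weights], omega, remaining)


# Assumed example: a 1e9-cycle task, the second edge cloud is almost full.
uni_ratios, uni_left = uniform_offloading(1e9, [1e10, 1e8])
rnd_ratios, rnd_left = random_offloading(1e9, [1e10, 1e10, 1e10], random.Random(0))
```

Neither function looks at the data size or the channel rate of the task, which matches the explanation given above for why both baselines fall behind as tasks grow.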

Furthermore, we compare the CLCO strategy with the local computing and edge computing strategies. Figure 3 compares the task duration of local computation, edge cloud computation and the CLCO scheme under different data sizes and computation amounts per task. We observe that the CLCO scheme outperforms both local computation and edge cloud computation as the average data size and the average computation amount per task increase.

Fig. 3 Task duration achieved by local computing, edge computing and CLCO for various values of (a) the average data size per task and (b) the average computation amount per task

6 Conclusion

In this paper, we first introduce a new task offloading architecture that includes a computation task cognitive layer, an edge resource cognitive layer and a global management cognitive layer. Then, we formulate the optimization problem of computation offloading in the multi-user, multi-server scenario. Furthermore, we propose the CLCO scheme, which addresses the computing problem of the learning-based computation offloading algorithm itself. Experiments indicate that the CLCO scheme outperforms the random, uniform, local-only and edge-only offloading schemes. In future work, we will further optimize the learning-based computation offloading scheme to reduce task duration and energy consumption.