1 Introduction

Mobile devices (e.g. smartphones and tablets) are increasingly becoming the hub around which much of the computing and communication demand of users is centered, thus posing new, significant challenges. Indeed, in spite of continuous technological improvements, the computation capabilities of mobile devices remain limited with respect to their “fixed” counterparts (e.g. desktop computers and data center servers).

In addition, mobile nodes are battery powered; hence energy consumption is a key issue to be accounted for. To overcome these potential limitations, it has been suggested to offload code execution from the mobile node to external machines [38]. This strategy has many potential advantages: (i) reduced application execution time; (ii) reduced battery consumption; and (iii) the possibility to execute applications whose resource demand could exceed the capabilities of mobile nodes.

There are several proposals in the literature (see [2, 42] for a comprehensive survey) which rely on cloud computing infrastructures for computation offloading in mobile scenarios [4]. Cloud computing delivers the vision of computing as a utility (such as water, electricity, gas, and telephony) and provides “a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released” [33]. However, the use of traditional cloud infrastructures in a mobile environment can introduce significant network delays that adversely affect the user experience and outweigh the potential benefits of this solution [12, 14, 23]. To overcome this problem, it has been proposed to use close-by servers (referred to as cloudlets), typically located at the wireless access points (APs) where mobile nodes connect to, so that they are at just “one hop” distance from the mobile node [39].

AP-located cloudlets cannot reasonably be expected to provide the “unlimited” amount of resources typically provided by a distant cloud server. Indeed, economic reasons and physical constraints limit the amount of resources that can be allocated to each cloudlet [39]. Hence, while a cloud server guarantees good isolation among the different users that offload their computations to it (i.e., users do not compete for the cloud resources), this does not hold in a cloudlet. As the load increases, resource contention and sharing can cause delays and performance degradation, resulting in increasingly high response times, which in turn can offset the benefits of offloading computation to the cloudlet. As a consequence, analyzing whether and to what extent it is convenient to offload computation in a cloudlet-based architecture requires taking into consideration the dynamics of the interactions among the different users and the possible presence of regulatory policies for the access to the shared cloudlet resources.

Most of the papers that investigate the effectiveness of computation offloading consider single user scenarios (e.g., [1, 11, 14, 22, 29, 31, 46]), thus implicitly assuming perfect isolation in case of concurrent users. We are aware of only a few papers where interactions among different mobile users on a resource-limited cloud are taken into account [5, 10, 36, 37, 48], as discussed in the next section.

In this paper, we will consider the general case of an architecture intended to support computation offloading for mobile nodes, where both a middle tier consisting of nearby resource-limited cloudlets and a remote tier consisting of resourceful distant cloud servers are available, as depicted in Fig. 1 (this architecture is referred to as a hybrid mobile cloud architecture in [2]).

Fig. 1 Three-tier architecture for mobile cloud computing

For such an architecture both a managed and an unmanaged usage scenario can be envisioned. The former typically corresponds to the case where a wireless service provider (WSP) deploys its own cloudlet infrastructure at its own APs, to be used by its mobile subscribers. Hence, the WSP can be expected to centrally regulate the access to the cloudlet, with the goal of offering a good service experience to its subscribers and of fulfilling its own utility goals. The latter corresponds instead to the case where cloudlet-augmented WiFi hot spots are deployed by public authorities or private entrepreneurs at facilities like airports, train stations, public buildings, cafes, etc., for the benefit of their citizens or customers. This (future) scenario extends the current one, where free-access WiFi hot spots are deployed simply as an additional service for citizens, or as a way of attracting more customers, on a best effort basis and without any attempt at regulating their use. Analogously, in the unmanaged scenario we envision that cloudlet-augmented WiFi hot spots are offered on a best effort basis, since their management is not likely to be part of the core business of the entities that deployed them, and mobile users autonomously decide whether or not to take advantage of their presence, according to their own goals.

The managed scenario gives rise to a hard optimization problem for the handling of the cloudlet resources for which only centralized heuristics have been proposed, see [36, 37, 48]. On the other hand, the unmanaged scenario is more challenging and has not been addressed in the literature. In this paper we focus on the latter unmanaged scenario, with the goal of investigating whether and under what conditions it gives rise to a convenient offloading strategy. To this end, we analyze the interaction among mobile users in a game-theoretic setting, assuming that the users independently determine their offloading strategies according to a rational behavior. Within this framework, the contributions of this work are as follows:

  • to the best of our knowledge, this is the first work where computation offloading is analyzed for a general multi-user “three-tier” mobile cloud computing scenario, with no central authority managing the access to the two external cloud tiers;

  • by using queueing theory, we model such a scenario as a non-cooperative game among selfish users, where the users’ interaction can be formulated as a Generalized Nash Equilibrium Problem (GNEP) [16];

  • we introduce a distributed algorithm for the computation of an equilibrium which is tailored to the model architecture. This algorithm, on the one hand, is based on an in-depth analysis of the underlying equilibrium problem and, on the other hand, exploits and adapts some very recent game-theoretic advancements. The overall result is a model where each user can determine his/her own computation offloading strategy automatically on the basis of easily collected information;

  • we report computational experiments demonstrating the effectiveness of the proposed algorithm and illustrating the characteristics of the achieved solution.

The remainder of the paper is organized as follows. Section 2 presents related work and motivates the mobile computing scenario we consider. In Sect. 3, we describe this scenario and state the problem we intend to tackle, while in Sect. 4 we propose the game-theoretic model that is used in the rest of the paper. In Sect. 5 we analyze the properties of the game, and show the existence of an equilibrium. In Sect. 6 we provide a distributed algorithm for the achievement of an equilibrium, and discuss issues related to its implementation. Section 7 presents a set of experiments illustrating the behavior of the solution method and assessing the characteristics of the achieved equilibria. Finally, Sect. 8 outlines future work.

2 Related work

Several architectural proposals aimed at supporting the implementation of computation offloading (or “cyber foraging” [38]) in mobile cloud computing (MCC) scenarios have appeared in the recent past. Some comprehensive surveys have recently been published [2, 20, 42, 43], but other papers have appeared and continue to appear on this subject [1, 10, 22, 29, 35, 37, 48]. The proposed architectures for MCC mainly differ in: (1) the granularity of the workload offloaded to external (cloud) nodes, spanning for example entire virtual machines, application components, or single application functions; (2) the methodologies adopted to determine which parts of the application can potentially be offloaded, including manual or automatic partitioning methodologies; and (3) whether application partitioning is determined statically before the application starts its execution, or dynamically at runtime, with the possibility of changing the partitioning during the application execution.

The exploitation of external nodes “close” to mobile devices has been suggested to alleviate the latency problem caused by the interaction with distant cloud servers located in the Internet. Close nodes could be peer mobile nodes [25, 30, 45] or the wireless access points (APs) to which mobile devices connect, suitably augmented with some computational and storage capacity [23, 39]. An implementation of this kind of augmented AP has been recently launched by Nokia Solutions and Networks in partnership with IBM and Intel.Footnote 1 Other industrial solutions are being deployed under the term of Fog computing [9].

Most closely related to the work presented in this paper are the methodologies aimed at determining which offloadable tasks of a mobile application should actually be shifted from the mobile device to external nodes with the goal of improving the application performance and the user experience. We can broadly classify the existing proposals according to a single user versus a multiple users scenario. In the single user scenario, a single mobile node is considered, without taking into account possible interference with other mobile nodes. On the other hand, in the multiple users scenario the offloading decisions take into account that multiple users compete for external computational resources that may be scarce.

Most of the offloading methodologies proposed up to now (e.g. [1, 11, 14, 22, 27, 46]) focus on the single user scenario and address the issue by representing the application as a weighted graph/tree and applying a graph partitioning algorithm, whose complexity depends on the granularity of the offloading. The optimal solution is determined through an Integer Linear Programming (ILP) formulation while fulfilling some given objectives (e.g. application delay, energy saving, communication cost). However, since graph partitioning is an NP-complete problem, heuristics have been proposed to efficiently find approximate solutions, so as to be able to deal also with large graphs. Solutions based on graph partitioning have also been investigated in pre-cloud mobile scenarios, e.g. [34].

Only a few works have addressed the multiple users scenario [5, 10, 36, 37, 44, 47, 48]. Barbarossa et al. [5] propose a centralized scheduling algorithm to jointly optimize the allocation of radio and computation resources among multiple users with latency constraints. However, they consider a batch processing of the computation, which is not realistic in the cloudlet environment. Yang et al. [48] study the partitioning problem for mobile data stream applications and consider multiple users that share the wireless network bandwidth as well as computational cloud resources, with the goal of maximizing the throughput of the data stream application. The problem is addressed by means of a genetic algorithm that runs on the cloud side.

Two papers [10, 47] propose game-theoretic approaches for a two-tier architecture. Chen [10] focuses on decentralized computation offloading; differently from our paper, where the resource contention among the multiple users occurs on the additional tier constituted by the cloudlet, the author considers the competition on the wireless access, thus focusing more on the communication aspects of mobile cloud computing. Wang et al. [47] devise a two-stage formulation. In the first stage, each mobile device determines the portion of computation to offload to a remote cloud with the goal of minimizing its power consumption as well as the task response time. In the second stage, the provider of the remote cloud data center performs resource allocation for the offloaded tasks with the goal of maximizing its own profit. In this paper, we do not consider resource allocation issues on the remote cloud servers, which are assumed to have an almost infinite capacity. Differently from us, all the above works consider a two-tier architecture, composed only of mobile devices and a distant cloud.

Similarly to our envisaged scenario, Rahimi et al. [36, 37] consider a three-tier architecture for MCC with multiple users, where local cloud resources are limited; in [37] they also take into consideration user mobility information. They formulate the tiered cloud resource allocation as an optimization problem and solve it through a greedy heuristic based on a simulated annealing approach. Their heuristic runs on a centralized entity that has to be contacted by the mobile nodes. Finally, Song et al. [44] propose an online task scheduling algorithm that aims to minimize the energy consumption of mobile devices under a network traffic constraint. To this end, the authors envision a collaborative approach among mobile devices which can share computation results of similar tasks with each other; tasks can thus be allocated either on the originating device, on another collaborative device, or on a remote cloud.

Differently from these works, which either consider a centralized decision-maker or cooperative mobile devices in a three-tier architecture, we consider a scenario where, as motivated in the introduction, multiple users decide selfishly whether and where to offload their computations, and we analyze their non-cooperative behavior in a game-theoretic setting.

3 System model

We consider a mobile computing scenario as depicted in Fig. 1, where a set of mobile nodes share a wireless access point (AP) to connect to the Internet. Mobile nodes can use this connection to possibly offload (part of) their computational load to a nearby cloudlet or to a conventional remote cloud center. Different applications are executed on the mobile nodes, each consisting of one or more tasks.

In the following, without loss of generality, we will refer to a task as the unit of computation. At the coarsest level, a task can correspond to an entire application, while at the finest level it can correspond to a function, e.g., an image compression, or even a simpler operation. It is worth observing that, in general, not all application components can be offloaded, as some components always need to be executed locally, e.g., a task associated with an application user interface. The nearby cloudlet and the distant cloud are characterized by the same execution environment. In other words, it is functionally equivalent to offload a task to the cloudlet or to the cloud. Apart from this, a cloudlet differs substantially from a conventional cloud in that it is characterized by a limited amount of resources, while the conventional cloud is assumed to have a seemingly unlimited capacity.

As motivated in the introduction, we focus on an “unmanaged” scenario, where users autonomously decide whether or not to take advantage of nearby cloudlets, rather than some remote cloud. Each time an offloadable task is to be executed, a decision is selfishly taken by each user on whether it is more convenient to execute the task locally, on the one-hop cloudlet, or on the more resourceful and distant cloud server. If the mobile node decides to offload the task, the code and/or data are transferred for remote execution. Upon completion, a message with the computation results is returned to the mobile device.

We model our system as a queueing network, see Fig. 2. Queueing theory has been widely used in the analysis of resource contention in computing and communication systems [28], and is a natural candidate to capture the main features of our system.

Fig. 2 System model

The mobile device and the cloudlet are represented as queueing nodes to capture the resource contention on these two systems. The cloudlet is modeled as a set of \(n\) servers, since we expect a cloudlet to be a “data center in a box” and therefore to comprise possibly multiple server machines, with multiple processors/cores [39], with a front-end dispatcher that uniformly splits the arrival stream among the servers (this latter architecture has proved to be an effective and popular solution for load sharing in multiserver systems and is largely used by commercial products, e.g. see [3]). The cloud, on the other hand, given its virtually infinite capacity, can be regarded as an infinite server, with no contention among different users. Finally, we model both the wireless access network and the Internet as simple delay centers to capture the average network delay experienced by the user when a task is remotely executed.

User \(u\) generates tasks at rate \(\lambda _u,\, u=1,\ldots ,N\). We denote by \(\frac{1}{\mu _{u,m}},\, \frac{1}{\mu _{u,clet}}\) and \(\frac{1}{\mu _{u,cloud}}\) the expected execution time of user \(u\) tasks on the mobile device, the cloudlet and the cloud, respectively. We denote by \(\frac{1}{\mu _{u,wl}}\) and \(\frac{1}{\mu _{u,wn}}\) the expected time to transfer data/code for remote execution over the wireless access network and the Internet, respectively. We assume the latter two quantities include the time for the return message to be delivered to the mobile node (in other words, they represent the round trip times).

Similarly to [10, 32, 44], our application model does not consider possible dependencies among tasks belonging to the same application. The papers [1, 11, 14, 22, 27, 46] model these dependencies as a graph and analyze how to partition the application tasks on the mobile and cloud resources. Our higher-level model allows us to capture the task contention on the shared resources, which is our focus, and at the same time to overcome the difficulties caused by the combinatorial aspects in the above mentioned papers.

4 Generalized Nash equilibrium formulation

In this section we formulate the mobile computation offloading problem as a Generalized Nash Equilibrium Problem [16, 18]. The goal of each user (actually of the mobile node) is to determine whether and where to offload a task based on the impact this has on his/her usage experience, expressed through suitable Quality of Service (QoS) measures. We call this decision the user strategy and model it by associating to each user \(u\) a triple \(x_u=\{x_{u,m},x_{u,clet},x_{u,cloud}\},\, \sum _{i\in I} x_{u,i}=1\), where \(I=\{m,clet,cloud\}\), which represents the percentage of user tasks that is executed locally (\(x_{u,m}\)), offloaded to the cloudlet (\(x_{u,clet}\)), or to the cloud (\(x_{u,cloud}\)).

Given that power consumption and application performance are the most important quality factors in a mobile scenario, see e.g. [1, 11, 14, 22], we consider them as the QoS measures driving each user strategy. In particular, we assume that the user wants to optimize the observed performance while limiting the power consumption. Without loss of generality, we consider as the user performance measure the expected number of user tasks in the system, i.e. the expected number of tasks launched but not yet completed. From the mobile user point of view, this corresponds to the average execution time of the number of tasks launched in a time unit. We observe that this is a quite general approach, which accounts for different levels of detail/granularity. As an example, consider the case of an application which is executed once per second and whose execution requires ten modules to be run. We can consider as a task either the application or the invoked modules, and with the proposed performance measure we obtain exactly the same expression.

Let us denote by \(R_{u,m},\, R_{u,clet}\) and \(R_{u,cloud}\) the mean response time when a task is executed locally, offloaded to the cloudlet, or offloaded to the cloud, respectively. In order to use robust yet simple analytical expressions for these measures, we model the response time of the mobile device and of each cloudlet server as that of an \(M/G/1/PS\) queue, which amounts to approximating the task arrival process by a Poisson process. The response time of an \(M/G/1/PS\) queue [28] is \(R=\frac{1/\mu }{1-\rho }\), where \(\lambda \) is the queue arrival rate, \(1/\mu \) the average service time and \(\rho =\lambda /\mu \) the queue utilization. By considering our system model assumptions, we readily have:

$$\begin{aligned} R_{u,m}&= \frac{\frac{1}{\mu _{u,m}}}{1-\frac{x_{u,m}\lambda _u}{\mu _{u,m}}},\quad R_{u,clet} = \frac{1}{\mu _{u,wl}} + \frac{\frac{1}{\mu _{u,clet}}}{1-\frac{1}{n}\sum _v\frac{x_{v,clet}\lambda _v}{\mu _{v,clet}}} \end{aligned}$$
(1)
$$\begin{aligned} R_{u,cloud}&= \frac{1}{\mu _{u,wl}} + \frac{1}{\mu _{u,wn}} + \frac{1}{\mu _{u,cloud}}. \end{aligned}$$
(2)

\(R_{u,m}\) directly follows from the fact that the number of tasks per unit of time which need to be executed by the mobile node is \(x_{u,m}\lambda _u\). \(R_{u,clet}\) comprises two terms: the first term is the local wireless delay; the second term is the cloudlet response time. The latter is affected by the cloudlet servers’ utilization, that is, the aggregate cloudlet load divided by the number of servers \(n\), \(\frac{1}{n}\sum _v \frac{x_{v,clet}\lambda _v}{\mu _{v,clet}}\). Finally, \(R_{u,cloud}\) is the sum of the wireless access network delay, the wide area network delay and the cloud delay.
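
To make (1)–(2) concrete, the following minimal sketch (in Python, with illustrative variable names) evaluates the three response times under the homogeneous parameter setting later adopted in Sect. 7.1; it merely illustrates the formulas and is not part of the model.

```python
import numpy as np

# Homogeneous setting of Sect. 7.1 (the choice of N is illustrative).
N, n = 20, 2                   # users, cloudlet servers
lam = np.full(N, 0.25)         # lambda_u [task/s]
mu_m = 1 / 0.5                 # 1/mu_{u,m} = 0.5 s
mu_clet, mu_cloud = 5 * mu_m, 10 * mu_m
mu_wl, mu_wn = 1 / 0.1, 1 / 0.4

def response_times(x, u):
    """Eqs. (1)-(2) for user u; x is an (N, 3) array of fractions
    [x_m, x_clet, x_cloud], one row per user."""
    rho_m = x[u, 0] * lam[u] / mu_m                    # local device utilization
    R_m = (1 / mu_m) / (1 - rho_m)
    rho_clet = np.sum(x[:, 1] * lam / mu_clet) / n     # per-server cloudlet utilization
    R_clet = 1 / mu_wl + (1 / mu_clet) / (1 - rho_clet)
    R_cloud = 1 / mu_wl + 1 / mu_wn + 1 / mu_cloud     # pure delays, no contention
    return R_m, R_clet, R_cloud

# Example: every user offloads all tasks to the cloudlet.
x = np.tile([0.0, 1.0, 0.0], (N, 1))
print(response_times(x, u=0))
```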

A few words on the use of the Poisson assumption are in order. For the mobile devices, we note that when the adopted granularity level makes a task coincide with an entire application, then a Poisson process well captures the arrival of independent applications. For finer granularity levels, possible dependencies among tasks belonging to the same application could actually make the arrival process diverge from a Poisson one. Nevertheless, the Poisson approximation allows us to use an analytic formulation for the response time that captures the effect of resource contention; indeed, the Poisson assumption is an approximation that has been adopted in the literature on mobile cloud computing [8, 21, 32, 47] to model a user task arrival. Finally, for the cloudlet, the use of Poisson arrivals is justified because the overall arrival process is the superposition of relatively sparse arrival processes from (possibly many) independent users, and it is also a common assumption in the Web context, see e.g. [3].

Finally, we denote by \(P_{u,m}\) and \(P_{u,t}\) the power consumed by the mobile device when the task is executed locally and the power required to transmit code/data for remote execution, respectively.

The user objective is to minimize \(\lambda _u R_u(x_u, x_{-u})\) (which by Little’s law represents the expected number of user tasks in the system) within a given energy budget. Here \(R_u(x_u, x_{-u})\) denotes user \(u\)’s mean task response time and \(x_{-u}\) denotes the strategies of all users except user \(u\). User \(u\)’s mean task response time is defined as follows:

$$\begin{aligned} R_u(x_u, x_{-u}) = x_{u,m}R_{u,m} + x_{u,clet}R_{u,clet} + x_{u,cloud}R_{u,cloud}. \end{aligned}$$
(3)

Note that the user mean response time depends not only on user \(u\)’s strategy \(x_{u}\), but also on the strategies of the other users. This dependency is due to the users indirectly affecting each other when they offload tasks to the cloudlet, since the cloudlet mean response time is a function of the cloudlet load \(\sum _v\frac{x_{v,clet}\lambda _v}{\mu _{v,clet}}\), to which each user contributes.

Each user \(u\), in order to compute the optimal strategy, needs to solve the following optimization problem:

$$\begin{aligned}&\mathrm{min} \; \lambda _u R_u(x_u, x_{-u}) \end{aligned}$$
(4)
$$\begin{aligned}&\mathrm{subject} \, \mathrm{to:} \; \frac{1}{n}\sum _v\frac{x_{v,clet}\lambda _v}{\mu _{v,clet}} \le U_{max} \end{aligned}$$
(5)
$$\begin{aligned}&\frac{x_{u,m}\lambda _u}{\mu _{u,m}}P_{u,m} + \frac{(x_{u,clet}+x_{u,cloud})\lambda _u}{\mu _{u,wl}}P_{u,t} \le P_{u,max} \end{aligned}$$
(6)
$$\begin{aligned}&x_{u,clet}+x_{u,cloud} \le \chi \end{aligned}$$
(7)
$$\begin{aligned}&\sum _{i\in I} x_{u,i} = 1 \end{aligned}$$
(8)
$$\begin{aligned}&x_{u,m},x_{u,clet},x_{u,cloud} \ge 0. \end{aligned}$$
(9)

Constraint (5) models the cloudlet utilization, which we assume should not exceed a given threshold \(U_{max}\) (in practice, this corresponds to setting an upper bound on the cloudlet response time). Observe that this constraint involves the decision variables of all the users. Constraint (6) ensures that the user energy consumption is lower than a threshold \(P_{u,max}\). Constraint (7) takes into account that, in general, only a fraction \(\chi ,\, 0<\chi \le 1\), of the tasks can be offloaded. Finally, the simple and natural constraints (8) and (9) ensure that the considered fractions sum to one and are greater than or equal to zero.
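
For fixed strategies of the other users, problem (4)–(9) is a small convex program in the three variables \(x_{u,m}, x_{u,clet}, x_{u,cloud}\). The sketch below solves one such best-response problem with SciPy’s SLSQP solver; the solver choice, the function name and the parameter packing are illustrative assumptions and not part of the paper’s method.

```python
import numpy as np
from scipy.optimize import minimize

def best_response(prm, other_clet_load, n, U_max, chi):
    """Solve (4)-(9) for one user, with the other users' cloudlet load
    sum_{v != u} x_{v,clet} * lam_v / mu_{v,clet}  held fixed.
    prm = (lam, mu_m, mu_clet, mu_cloud, mu_wl, mu_wn, P_m, P_t, P_max)."""
    lam, mu_m, mu_clet, mu_cloud, mu_wl, mu_wn, P_m, P_t, P_max = prm
    R_cloud = 1 / mu_wl + 1 / mu_wn + 1 / mu_cloud

    def objective(x):                          # lambda_u * R_u(x_u, x_{-u}), Eq. (4)
        R_m = (1 / mu_m) / (1 - x[0] * lam / mu_m)
        rho = (other_clet_load + x[1] * lam / mu_clet) / n
        R_clet = 1 / mu_wl + (1 / mu_clet) / (1 - rho)
        return lam * (x[0] * R_m + x[1] * R_clet + x[2] * R_cloud)

    cons = [
        {"type": "ineq", "fun": lambda x:      # (5) cloudlet utilization cap
            U_max - (other_clet_load + x[1] * lam / mu_clet) / n},
        {"type": "ineq", "fun": lambda x:      # (6) energy budget
            P_max - (x[0] * lam / mu_m) * P_m
                  - ((x[1] + x[2]) * lam / mu_wl) * P_t},
        {"type": "ineq", "fun": lambda x: chi - x[1] - x[2]},         # (7)
        {"type": "eq",   "fun": lambda x: x[0] + x[1] + x[2] - 1.0},  # (8)
    ]
    res = minimize(objective, x0=np.array([1.0, 0.0, 0.0]),
                   bounds=[(0.0, 1.0)] * 3, constraints=cons, method="SLSQP")
    return res.x
```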

In this setting, the users’ decisions are mutually dependent and the proposed model is a GNEP. GNEPs differ from classical Nash Equilibrium Problems (NEPs) in that, while in a NEP only the players’ objective functions depend on the other players’ strategies, in a GNEP both the objective functions and the strategy sets depend on the other players’ strategies. In our problem, the dependence of each player’s strategy set on the other players’ strategies is represented by the constraint (5), which includes all the users’ decision variables \(x_{u,clet}\). More specifically, since the players all share a common (linear) constraint, this game is known as a jointly convex game [15].

5 Properties of the GNEP formulation

In this section we show that the game (4)–(9) can actually be solved by finding a solution to a suitable Variational Inequality (to be defined later on), for which we can then derive a distributed algorithm. First, in Sect. 5.1, we establish a result that is crucial to our approach: the function associated with the Variational Inequality is, under appropriate and reasonable conditions, monotone. Then, in Sect. 5.2 we transform the original GNEP into an equivalent extended game, the equilibrium point of which can be computed in a distributed way [41], as detailed in Sect. 6.

5.1 Existence and monotonicity properties of the GNEP

We recall that each user \(u=1,\ldots ,N\) controls three variables: \(x_u=(x_{u,m},x_{u,clet},x_{u,cloud})\). For the sake of simplicity, we set:

$$\begin{aligned} \displaystyle \alpha _u=\frac{\lambda _{u}}{\mu _{u,m}},\quad \beta _u=\frac{\lambda _{u}}{\mu _{u,wl}},\quad \delta _u=\frac{\lambda _{u}}{\mu _{u,clet}}, \quad \gamma _u=\lambda _{u} \left( \frac{1}{\mu _{u,wl}}+\frac{1}{\mu _{u,wn}}+\frac{1}{\mu _{u,cloud}} \right) . \end{aligned}$$

Using this notation we can rewrite problem (4)–(9) as

$$\begin{aligned}&\min \quad \displaystyle \lambda _uR_u(x_u,x_{-u}) \end{aligned}$$
(10)
$$\begin{aligned}&\quad \hbox {subject to} \quad \displaystyle { \frac{1}{n} \sum _{v} \delta _v x_{v,clet}}\, \le \, U_{\max } \end{aligned}$$
(11)
$$\begin{aligned}&\quad \alpha _u P_{u,m} x_{u,m} + \beta _u P_{u,t} (x_{u,clet} + x_{u,cloud}) \, \le \, P_{u,\max } \end{aligned}$$
(12)
$$\begin{aligned}&\quad x_{u,clet} + x_{u,cloud} \, \le \, \chi \end{aligned}$$
(13)
$$\begin{aligned}&\quad x_{u,m} + x_{u,clet} + x_{u,cloud} \, = \, 1 \end{aligned}$$
(14)
$$\begin{aligned}&\quad x_{u,m},x_{u,clet},x_{u,cloud} \,\ge \, 0, \end{aligned}$$
(15)

where

$$\begin{aligned} \lambda _uR_u(x_u,x_{-u}) = \, \frac{\alpha _u x_{u,m}}{1-\alpha _u x_{u,m}}+\beta _u x_{u,clet}\ +\gamma _u x_{u,cloud}+\frac{\delta _u x_{u,clet}}{1-\frac{1}{n}\sum _{v}\delta _v x_{v,clet}}. \end{aligned}$$

In order to analyze the game we make the following basic assumption:

Assumption A

\(U_{\max }\) as well as all \(\alpha _u\) and \(\delta _u,\, u=1,\ldots ,N\), are positive and smaller than 1.

We note that assuming \(\alpha _u<1\) actually corresponds to assuming that all the computational load generated by a user can in principle be sustained by his/her mobile device. The assumption on \(\delta _u\) follows from this one, as a cloudlet has a higher computational capacity than a mobile device, while the assumption on \(U_{\max }\) is standard. Under these assumptions, it is easy to check that each user’s problem is convex for given values of the other users’ variables. By the results in [15], we know that we can recover a solution of this jointly convex game (known as a variational solution or normalized solution) by solving a suitable Variational Inequality: VI \((K,F)\) Footnote 2 [17]. In order to define the VI which permits us to compute a solution of our GNEP, we therefore have to specify the set \(K\) and the function \(F\). We do this next, following [15]. To define \(K\) we first define the sets

$$\begin{aligned} \begin{array}{l} \tilde{K}_u :=\{x_u \in \mathbb {R}_{+}^3: \sum \nolimits _{i\in I} x_{u,i} =1, x_{u,clet} + x_{u,cloud} \le \chi , \\ \quad \alpha _u P_{u,m} x_{u,m} + \beta _u P_{u,t} (x_{u,clet} + x_{u,cloud}) \le P_{u,\max }\}, \end{array} \end{aligned}$$

which are nothing else but the feasible sets of the users with the joint constraint neglected. The “contribution” of the joint constraint is taken into account by the set

$$\begin{aligned} {\varOmega } :=\left\{ x\in \mathbb {R}^{3N}: \displaystyle { \frac{1}{n} \sum _{u} \delta _u x_{u,clet} \le U_{\max }} \right\} . \end{aligned}$$

The set \(K\) in the definition of our VI is then given by \( K \, :=\, \left( {\varPi }_{u=1}^N \tilde{K}_u\right) \cap {\varOmega }\). It remains now to define the function \(F\). This is just the vector obtained by “stacking” the partial gradients of each user, where the gradients are taken only with respect to the users’ own variables:

$$\begin{aligned} \displaystyle \nabla _{x_u} \lambda _u R_u = \begin{pmatrix}\displaystyle \frac{\alpha _u}{\left( 1-\alpha _u x_{u,m}\right) ^2}\\ \displaystyle \beta _u+ \delta _u\frac{ 1-\frac{1}{n}\sum _{v\ne u}\delta _v x_{v,clet}}{(1-\frac{1}{n}\sum _{v}\delta _v x_{v,clet})^2}\\ \displaystyle \gamma _u \end{pmatrix}, \nonumber \\ F= \begin{pmatrix} \displaystyle \frac{ \alpha _1}{\left( 1-\alpha _1 x_{1,m}\right) ^2}\\ \displaystyle \beta _1+ \delta _1\frac{1-\frac{1}{n}\sum _{v\ne 1}\delta _v x_{v,clet}}{(1-\frac{1}{n}\sum _{v}\delta _v x_{v,clet})^2}\\ \gamma _1 \\ \displaystyle \vdots \\ \frac{\alpha _N}{\left( 1-\alpha _N x_{N,m}\right) ^2}\\ \displaystyle \beta _N+ \delta _N\frac{ 1-\frac{1}{n}\sum _{v\ne N}\delta _v x_{v,clet}}{(1-\frac{1}{n}\sum _{v}\delta _v x_{v,clet})^2}\\ \displaystyle \gamma _N \end{pmatrix}. \end{aligned}$$
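
In code, evaluating the operator \(F\) of the VI simply amounts to stacking the partial gradients above for all users; a minimal NumPy sketch (with illustrative names) follows.

```python
import numpy as np

def vi_operator_F(x, alpha, beta, gamma, delta, n):
    """Stack the partial gradients nabla_{x_u} (lambda_u R_u) for all users.
    x: (N, 3) array of strategies [x_m, x_clet, x_cloud], one row per user;
    alpha, beta, gamma, delta: length-N arrays as defined above."""
    D = 1.0 - np.sum(delta * x[:, 1]) / n            # 1 - (1/n) sum_v delta_v x_{v,clet}
    F = np.empty_like(x)
    F[:, 0] = alpha / (1.0 - alpha * x[:, 0]) ** 2   # d/dx_{u,m}
    # d/dx_{u,clet}: the numerator excludes user u's own cloudlet term
    F[:, 1] = beta + delta * (D + delta * x[:, 1] / n) / D ** 2
    F[:, 2] = gamma                                  # d/dx_{u,cloud}
    return F.ravel()                                 # stacked 3N-vector
```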

Existence of a solution to a general GNEP is usually not easy to show. However, in our case we are dealing with a jointly convex GNEP with compact feasible set and it is well-known [15], but can also easily be seen directly, that a solution to the GNEP (10)–(15) exists.

Proposition 1

Supposing that Assumption A holds, the GNEP (10)–(15) has at least one solution.

Proof

As already observed, under Assumption A any solution of the VI \((K,F)\) is a solution of the GNEP (10)–(15), see [15]. But \(F\) is continuous on \(K\) and \(K\) is obviously compact. Therefore by [17, Corollary 2.2.5] VI \((K,F)\) has a solution and, as a consequence, also the original GNEP (10)–(15) has a solution. \(\square \)

Note that in general the GNEP (10)–(15) could have infinitely many solutions; our aim is to compute a variational solution by a distributed algorithm (see the comments later on the significance of this particular solution). To this end a key role is played by the monotonicity of \(F\).Footnote 3 The easiest way to check the monotonicity of a differentiable \(F\) is to check that the Jacobian of \(F\), denoted by \(JF\), is positive semidefinite on \(K\) [17].

The Jacobian of \(F\) has the following structure:

$$\begin{aligned} JF(x)=\begin{pmatrix}A_1 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} \ldots &{} 0 &{} 0 &{} 0 \\ 0 &{} B_1&{} 0 &{} 0 &{} B_{12}&{} 0 &{} \ldots &{} 0 &{} B_{1N}&{} 0 \\ 0 &{} 0&{} 0 &{} 0 &{} 0&{} 0 &{} \ldots &{} 0 &{} 0&{} 0 \\ &{}\vdots &{}&{}&{}&{}&{}&{}&{} \vdots &{} \\ 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} \ldots &{} A_N &{} 0 &{} 0 \\ 0 &{} B_{N1}&{} 0 &{} 0 &{} B_{N2}&{} 0 &{} \ldots &{} 0 &{} B_{N}&{} 0 \\ 0 &{} 0&{} 0 &{} 0 &{} 0&{} 0 &{} \ldots &{} 0 &{} 0&{} 0 \end{pmatrix} \end{aligned}$$
(16)

where

$$\begin{aligned} \begin{array}{l} A_u =\displaystyle \frac{2\alpha _{u}^2}{\left( 1-\alpha _u x_{u,m}\right) ^3},\; B_u=\displaystyle \frac{2}{n}\delta _u^2\frac{1-\frac{1}{n}\sum _{v\ne u}\delta _v x_{v,clet}}{\left( 1-\frac{1}{n}\sum _{t}\delta _t x_{t,clet}\right) ^3}, \\ B_{uv}=\frac{1}{n}\delta _v\delta _u\displaystyle \frac{1-\frac{1}{n}\sum _{t}\delta _t x_{t,clet}+\frac{2}{n} \delta _u x_{u,clet} }{\left( 1-\frac{1}{n}\sum _{t}\delta _t x_{t,clet}\right) ^3}. \end{array} \end{aligned}$$
(17)

Theorem 1

Assume that

$$\begin{aligned} \delta _{\max } \le \frac{n}{N\chi }(1-U_{\max }), \end{aligned}$$
(18)

where \(\delta _{\max } = \max _{u=1,\ldots ,N}\delta _u\). Then \(F\) is monotone.

Proof

Reordering the variables, \(JF(x)\) can be rewritten in the following form:

$$\begin{aligned} JF(x)= \begin{pmatrix}A &{} 0 &{} 0\\ 0 &{} B&{} 0\\ 0 &{} 0&{} 0 \end{pmatrix}, \quad A = \mathop {\mathrm {diag}}\left( A_u\right) _{u=1}^N,\quad B=\begin{pmatrix}B_1 &{} B_{12} &{}\ldots &{} B_{1N}\\ B_{21} &{} B_{2}&{} \ldots &{} B_{2N}\\ \vdots &{} &{} &{}\vdots \\ B_{N1} &{} B_{N2}&{} \ldots &{} B_{N} \end{pmatrix}. \end{aligned}$$

Since \(A\) is positive definite by Assumption A, checking the monotonicity reduces to checking that the matrix \(B\) is positive semidefinite. In order to check the semidefiniteness of \(B\) we check the semidefiniteness of its symmetric part \( B^s \, :=\, \frac{1}{2} (B^T + B). \) Set

$$\begin{aligned} D :=1-\frac{1}{n}\sum _{t}\delta _t x_{t,clet}, \end{aligned}$$

the diagonal elements \(B^s_u\) can be rewritten as

$$\begin{aligned} B^s_u=\displaystyle \frac{\delta _u^2}{nD^2}\left( 2+2\frac{\delta _{u}x_{u,clet}}{nD}\right) \end{aligned}$$

while the off-diagonal elements are

$$\begin{aligned} B^s_{u v}\,=\,B^s_{v u } \, = \, \frac{\delta _v\delta _u}{nD^2}\left( 1+\frac{\delta _v x_{v,clet}+\delta _u x_{u,clet}}{nD}\right) . \end{aligned}$$

Let \(\delta \) denote the vector \(\delta :=\begin{pmatrix}\delta _{1}&\ldots&\delta _N\end{pmatrix}^T\). It is easily seen that the matrix \(B^s\) can be rewritten as

$$\begin{aligned} \frac{1}{nD^2}\left( \delta \delta ^T\circ \left( I+E+\frac{2}{nD}\begin{pmatrix} \delta _1x_{1,clet}&{}\frac{\delta _1 x_{1,clet}+\delta _2x_{2,clet}}{2}&{}\ldots &{}\frac{\delta _1 x_{1,clet}+\delta _Nx_{N,clet}}{2}\\ &{}\ddots &{} \\ \frac{ \delta _N x_{N,clet}+\delta _1x_{1,clet}}{2}&{}\frac{\delta _Nx_{N,clet}+\delta _2x_{2,clet}}{2} &{}\ldots &{}\delta _Nx_{N,clet}\end{pmatrix}\right) \right) \end{aligned}$$
(19)

where the symbol \(\circ \) denotes the Hadamard product of two matrices, i.e. the matrix having as elements \((A\circ B)_{ij} = A_{ij}B_{ij}\), and \(E\) is the matrix with all entries equal to 1.

Set \(x_{clet}^\delta :=\begin{pmatrix}\delta _{1}x_{1,clet}&\ldots&\delta _Nx_{N,clet}\end{pmatrix}^T\), and let \(e\in \mathbb {R}^{N}\) be the vector of all ones, then, noting that

$$\begin{aligned} \begin{pmatrix} \delta _1x_{1,clet}&{}\frac{\delta _1 x_{1,clet}+\delta _2x_{2,clet}}{2}&{}\ldots &{}\frac{\delta _1 x_{1,clet}+\delta _Nx_{N,clet}}{2}\\ &{}\ddots &{} \\ \frac{ \delta _N x_{N,clet}+\delta _1x_{1,clet}}{2}&{}\frac{\delta _Nx_{N,clet}+\delta _2x_{2,clet}}{2} &{}\ldots &{}\delta _Nx_{N,clet} \end{pmatrix} = \frac{1}{2}\left( x_{clet}^\delta e^T+e (x_{clet}^\delta )^T\right) , \end{aligned}$$

the matrix \(B^s\) is given by

$$\begin{aligned} B^s=\frac{1}{nD^2}\left( \delta \delta ^T\circ \left( I+E+\frac{1}{nD}(x_{clet}^\delta e^T+e (x_{clet}^\delta )^T)\right) \right) . \end{aligned}$$
(20)

The Schur product theorem (see [24, Theorem 7.5.3]) states that the Hadamard product of two positive semidefinite matrices is positive semidefinite. Therefore, since the matrix \(\delta \delta ^T\) is obviously positive semidefinite, in order to show the positive semidefiniteness of \(B^s\) it is enough to show that the matrix

$$\begin{aligned} \left( I+E+\frac{1}{nD}(x_{clet}^\delta e^T+e (x_{clet}^\delta )^T)\right) \end{aligned}$$

is positive semidefinite. Neglecting the contribution of the positive semidefinite matrix \(E\), this reduces to proving that the minimum eigenvalue of the matrix \(\frac{1}{nD}(x_{clet}^\delta e^T+e (x_{clet}^\delta )^T)\) is greater than or equal to \(-1\). It is known, see [6, Fact 4.9.16], that the matrix \(x_{clet}^\delta e^T+e (x_{clet}^\delta )^T\) has a characteristic polynomial given by

$$\begin{aligned} \eta ^{N-2}\left( \eta ^2 - 2(e^Tx_{clet}^\delta )\eta + (e^Tx_{clet}^\delta )^2 - N\Vert x_{clet}^\delta \Vert ^2\right) . \end{aligned}$$
(21)

From this we see that the matrix \(x_{clet}^\delta e^T+e (x_{clet}^\delta )^T\) has \(N-2\) zero eigenvalues, a nonnegative eigenvalue and a nonpositive eigenvalue. These two latter eigenvalues are given respectively by

$$\begin{aligned} \eta _+ \,=\, \displaystyle e^Tx_{clet}^\delta + \sqrt{N} \Vert x_{clet}^\delta \Vert , \qquad \qquad \eta _- \,=\, \displaystyle e^Tx_{clet}^\delta - \sqrt{N} \Vert x_{clet}^\delta \Vert . \end{aligned}$$

We then get that a sufficient condition for the positive semidefiniteness of \(B^s\) is

$$\begin{aligned} \displaystyle \frac{1}{nD} \left( \sqrt{N} \Vert x_{clet}^\delta \Vert -e^Tx_{clet}^\delta \right) \, \le \, 1. \end{aligned}$$
(22)

But recalling that \(e^T x_{clet}^\delta \ge \Vert x_{clet}^\delta \Vert \) since \(x_{clet}^\delta \ge 0\), that on the feasible region \(\Vert x_{clet}^\delta \Vert \) is at most \(\sqrt{N}\delta _{\max }\chi \) (see (13)) and that \(D\ge 1-U_{\max }\) by (11), we see that

$$\begin{aligned} \displaystyle \frac{1}{nD} \left( \sqrt{N} \Vert x_{clet}^\delta \Vert -e^Tx_{clet}^\delta \right) \, \le \, \frac{1}{nD}\left( \delta _{\max }\chi \sqrt{N}(\sqrt{N} -1) \right) \le \frac{\delta _{\max }}{1-U_{\max }} \frac{N\chi }{n}. \end{aligned}$$
(23)

Therefore, (22) is certainly satisfied if (18) holds. \(\square \)

Remark 1

The previous theorem hinges on condition (18), which guarantees the key property of \(F\) being monotone. It is then important to get a good understanding of its meaning. Before looking at this issue, however, we stress that condition (18) is just a sufficient condition for the monotonicity of \(F\). Indeed, a look at the proof of Theorem 1 shows that condition (18) derives from a series of majorizations based on worst case scenarios; therefore, in practice we can expect monotonicity of \(F\) even when (18) is violated, provided it is not violated “too much”. This is confirmed by the numerical results in Sect. 7, which show that condition (18) is not critical from the practical point of view. Condition (18) essentially says that monotonicity of \(F\) is guaranteed if the cloudlet is not overloaded. In fact, condition (18) requires that the maximum traffic intensity \(\delta _{\max }\) of the users (on the cloudlet) be lower than a certain threshold value. For a given number \(N\) of users, this threshold increases when the number \(n\) of cloudlet servers increases or when either or both \(U_{\max }\) and \(\chi \) decrease. Therefore monotonicity can always be achieved by deploying more cloudlet servers or by imposing in the protocol, i.e. in the constraints (11) and (13), suitably small values of \(U_{\max }\) and \(\chi \).
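
Since condition (18) involves only readily available system parameters, it can be checked up front; a small illustrative utility (Python) is shown below.

```python
def condition_18_holds(delta, n, N, chi, U_max):
    """Sufficient condition (18) for the monotonicity of F:
    delta_max <= (n / (N * chi)) * (1 - U_max)."""
    delta_max = max(delta)
    threshold = n * (1.0 - U_max) / (N * chi)
    return delta_max <= threshold
```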

5.2 The extended game

Centralized algorithms for the computation of an equilibrium could now be easily derived by solving the VI \((K,F)\) defined above. In fact, assuming monotonicity of \(F\), there are plenty of centralized algorithms available, see [17]. However, in order to develop a distributed algorithm, we cannot act directly on the original GNEP (10)–(15) or on its equivalent VI reformulation. Roughly speaking, the reason is that distributed algorithms require the feasible sets (of the game or of the VI) to be the Cartesian product of lower dimensional sets, a condition that in our case is not satisfied because of the shared constraint (5). However, as we show next, we are able to reformulate the GNEP (10)–(15) as another game with no coupling constraints through a simple but nontrivial transformation which, essentially, was first hinted at in [18]. It will turn out that this new game inherits the monotonicity properties of the original game so that, as we will see in the next section, under the conditions of Theorem 1 we will be able to develop distributed algorithms for the computation of a variational solution of the GNEP (10)–(15).

To achieve the decoupling of the users’ feasible sets, we consider an extended game, with one extra “player”. In this extended game the first \(N\) users control \(x_u\) and have the problem

$$\begin{aligned} \displaystyle \min _{x_u \in \tilde{K}_u} \quad \lambda _u R_u(x_u,x_{-u}) + \rho \left( \frac{\delta _u}{n} x_{u,clet}\right) \end{aligned}$$

while the \((N+1)\)-th player controls the variable \(\rho \in \mathbb {R}\) and solves the problem

$$\begin{aligned} \displaystyle \max _{\rho \ge 0} \quad \rho \left( \displaystyle { \frac{1}{n} \sum _{u} \delta _u x_{u,clet}}- U_{\max }\right) . \end{aligned}$$

We call this game the extended game. Note that this extended game is a standard Nash equilibrium problem, since there is no coupling in the constraints. The first \(N\) users are the “original” users. Their problems have been modified in two ways: (a) the joint constraint has been eliminated and (b) a term has been added to the objective function to make up for this omission. The \((N+1)\)-th user is a sort of cloudlet manager and controls the variable \(\rho \), which can be seen as the cloudlet “price”. Note that the additional term in the objective function of the other users is then nothing else but the “cost” of using the cloudlet. More precisely, it can be shown that \(\rho \) turns out to be the Lagrange multiplier of the shared constraint (11). It is a classical result [17, Proposition 1.4.2] that our game is equivalent to the VI \((K_e, F_e)\), where

$$\begin{aligned} \displaystyle F_e (x, \rho ) = \left( \begin{array}{c} F(x) +\left( \begin{array}{c} 0\\ \displaystyle {\rho (\delta _1/n)} \\ 0 \\ \vdots \\ 0 \\ \displaystyle { \rho (\delta _N/n)}\\ 0\end{array}\right) \\ \displaystyle -\frac{1}{n} \sum _{u}\delta _u x_{u,clet}+ U_{\max }\end{array}\right) , \quad K_e = (\displaystyle {\varPi }_{u=1}^N \tilde{K}_u)\times \mathbb {R}_+. \end{aligned}$$

The following result is key to our developments and relates GNEP (10)–(15) to the extended game. Note that in the theorem below, when we say that the game is monotone, we obviously mean that its VI reformulation is so, in other words, that the function \(F_e\) is monotone.

Theorem 2

A point \(\bar{x}\) is a variational solution of the original game (10)–(15) if and only if a \(\bar{\rho }\) exists such that \((\bar{x}, \bar{\rho })\) is a Nash equilibrium of the extended game. Furthermore, if the original game is monotone, then also the extended game is monotone.

Proof

The first assertion is just a verification which can be carried out by comparing the Karush–Kuhn–Tucker conditions of the VI \((K,F)\) and of the extended game. Note that since all constraints involved in both problems are affine, the Karush–Kuhn–Tucker conditions surely hold at a solution. The second assertion of the theorem can be checked by writing down the Jacobian of \(F_e\):

$$\begin{aligned} JF_e(x,\rho ) = \begin{pmatrix} JF(x) &{} c \\ -c^T &{} 0 \end{pmatrix}, \qquad c = \begin{pmatrix} 0&\delta _1/n&0&\ldots&0&\delta _N/n&0 \end{pmatrix}^T. \end{aligned}$$

The bordering blocks \(c\) and \(-c^T\) form a skew-symmetric addition, so \(JF_e(x,\rho )\) is positive semidefinite if and only if \(JF(x)\) is; that is, \(F_e\) is monotone if and only if \(F\) is. \(\square \)

The bottom line of this section is: we can compute a (variational) solution of the GNEP (10)–(15) by finding a solution of the standard extended game. This latter game is monotone if and only if the original GNEP is monotone and, in particular, if the conditions of Theorem 1 are met. On the basis of these results, in the next section we will show how to apply some very recent algorithmic developments in order to design distributed algorithms for the solution of the extended game.

6 Distributed solution

In this section we consider the problem of computing an equilibrium of the GNEP (10)–(15) by a distributed algorithm. To achieve our goal we will combine, along lines first put forward in [40], classical results about proximal regularization, see e.g. [17, Chapter 12], with some very recent, advanced distributed methods proposed in [19] and [41]. In doing so, we take great care to make appropriate choices so that the resulting solution method is not only mathematically sound, but also well suited to the characteristics of our model, in terms of information exchange and computational burden of the users, so as to be amenable to practical use.

Our approach to the solution of (10)–(15) is to solve the VI \((K_e, F_e)\) in a distributed way. To this end, one key requirement is that \(F_e\) be strongly monotone.Footnote 4 However, it can easily be observed that, because of the 0 in the lower-right corner of \(JF_e\) (see the proof of Theorem 2), \(F_e\) can never be strongly monotone, even if \(F\) is. To circumvent this difficulty, we regularize the VI \((K_e, F_e)\) and use a proximal-point method [17, Chapter 12]. This results in the following scheme, where \(\alpha \) is an arbitrary positive constant.

Algorithm 1

It is known [17, Chapter 12] that the above scheme converges to a solution of the VI \((K_e, F_e)\), i.e. to a (variational) solution of the GNEP (10)–(15). The key point in developing a (totally asynchronous) distributed solution method is therefore the development of a (totally asynchronous) distributed solution method for the VI \((K_e, F_e + \alpha (\cdot - (x^k, \rho ^k)))\). To this end we may consider the distributed Algorithm 2. Note that the algorithm we present is synchronous. We do so for simplicity of presentation only. Totally asynchronous (in the sense of [7]) versions can easily be envisaged and all the derivations we make in this section readily extend to the asynchronous case.

Algorithm 2
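
Since Algorithms 1 and 2 are presented as figures, the sketch below only illustrates how the combined scheme can be organized: a proximal (outer) loop around a synchronous, Jacobi-style inner loop in which each user solves its regularized subproblem and the cloudlet manager updates the price. The closed-form price update, the helper user_subproblem and the homogeneous-users simplification are illustrative assumptions (the power constraint (12) is omitted for brevity); this is not the exact statement of Algorithms 1 and 2.

```python
import numpy as np
from scipy.optimize import minimize

def user_subproblem(x_center_u, u, x, rho, prm, alpha):
    """One user's regularized problem in the inner loop (sketch): minimize
    lambda*R_u + rho*(delta_u/n)*x_clet + alpha*||x_u - x_u^k||^2 over K~_u
    (power constraint omitted; homogeneous users assumed)."""
    lam, mu_m, mu_clet, mu_cloud, mu_wl, mu_wn, n, chi = prm
    delta_u = lam / mu_clet
    others = np.sum(np.delete(x[:, 1], u)) * delta_u   # other users' cloudlet load
    R_cloud = 1 / mu_wl + 1 / mu_wn + 1 / mu_cloud

    def obj(z):
        R_m = (1 / mu_m) / (1 - z[0] * lam / mu_m)
        R_clet = 1 / mu_wl + (1 / mu_clet) / (1 - (others + z[1] * delta_u) / n)
        cost = lam * (z[0] * R_m + z[1] * R_clet + z[2] * R_cloud)
        return cost + rho * delta_u / n * z[1] + alpha * np.sum((z - x_center_u) ** 2)

    cons = [{"type": "eq", "fun": lambda z: z.sum() - 1.0},
            {"type": "ineq", "fun": lambda z: chi - z[1] - z[2]}]
    return minimize(obj, x_center_u, bounds=[(0.0, 1.0)] * 3,
                    constraints=cons, method="SLSQP").x

def distributed_equilibrium(N, prm, U_max, alpha=0.1, tol=1e-4):
    """Outer proximal loop (cf. Algorithm 1) around a synchronous inner
    loop (cf. Algorithm 2); rough illustrative sketch only."""
    lam, mu_m, mu_clet, mu_cloud, mu_wl, mu_wn, n, chi = prm
    delta = lam / mu_clet
    x_k, rho_k = np.tile([1.0, 0.0, 0.0], (N, 1)), 0.0   # start from local execution
    for _ in range(100):                                  # outer (proximal) iterations
        x, rho = x_k.copy(), rho_k
        for _ in range(50):                               # inner iterations
            x_new = np.array([user_subproblem(x_k[u], u, x, rho, prm, alpha)
                              for u in range(N)])
            load = delta * np.sum(x_new[:, 1]) / n
            # Maximizer of the regularized price problem
            #   max_{rho >= 0} rho*(load - U_max) - alpha*(rho - rho_k)^2.
            rho_new = max(0.0, rho_k + (load - U_max) / (2 * alpha))
            inner_done = np.abs(x_new - x).max() + abs(rho_new - rho) < tol
            x, rho = x_new, rho_new
            if inner_done:
                break
        outer_done = np.abs(x - x_k).max() + abs(rho - rho_k) < tol
        x_k, rho_k = x, rho
        if outer_done:
            break
    return x_k, rho_k
```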

The overall scheme resulting from the combination of the outer Algorithm 1 and the inner Algorithm 2 is depicted in Fig. 3, where the information flows are also represented.

Fig. 3 Distributed algorithm

In the next theorem we formally show convergence of the overall scheme combining Algorithms 1 and 2.

Theorem 3

Consider the solution Algorithm 1, where all subproblems in Step S.2 are solved using Algorithm 2. There exists a positive \(\bar{\alpha }> 0\) such that, for every \(\alpha > \bar{\alpha }\) and for every \(k\), the distributed inner Algorithm 2 converges to the unique solution of VI \((K_e, F_e + \alpha (\cdot - (x^k, \rho ^k)))\) and Algorithm 1 converges to a solution of VI \((K_e, F_e)\). In particular, we can take

$$\begin{aligned} \bar{\alpha }= \frac{3N}{n} \frac{\delta _{\max }}{(1-U_{\max })^3} \end{aligned}$$
(24)

Proof

By the discussion immediately after Algorithm 1, we only need to show that for every \(\alpha > \bar{\alpha }\) and for every \(k\), the distributed inner Algorithm 2 converges to the unique solution of VI \((K_e, F_e + \alpha (\cdot - (x^k, \rho ^k)))\), and we also need to justify the value of \(\bar{\alpha }\) in (24). By [19, Theorem 3] or [41, Theorem 13] we only need to show that a certain matrix \({\varUpsilon }\) is \(P\) (meaning that all its principal minors are positive). The matrix \({\varUpsilon }\) is an \((N+1)\times (N+1)\) matrix related to the regularized VI \((K_e, F_e + \alpha (\cdot - (x^k, \rho ^k)))\) and we describe next how it is constructed. Consider the Jacobian of \(F_e + \alpha (\cdot - (x^k, \rho ^k))\), which is given by

$$\begin{aligned} \left( \begin{array}{ccc|ccc|c|ccc|c}A_1 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} \ldots &{} 0 &{} 0 &{} 0 &{} 0\\ 0 &{} B_1&{} 0 &{} 0 &{} B_{12}&{} 0 &{} \ldots &{} 0 &{} B_{1N}&{} 0 &{} \delta _1/n \\ 0 &{} 0&{} 0 &{} 0 &{} 0&{} 0 &{} \ldots &{} 0 &{} 0&{} 0 &{}0\\ \hline &{}\vdots &{}&{}&{}&{}&{}&{}&{} \vdots &{}\\ \hline 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} \ldots &{} A_N &{} 0 &{} 0 &{} 0\\ 0 &{} B_{N1}&{} 0 &{} 0 &{} B_{N2}&{} 0 &{} \ldots &{} 0 &{} B_{N}&{} 0 &{} \delta _N/n\\ 0 &{} 0&{} 0 &{} 0 &{} 0&{} 0 &{} \ldots &{} 0 &{} 0&{} 0 &{} 0 \\ \hline 0 &{} -\delta _1/n &{} 0 &{} &{}\cdots &{}&{} &{}0 &{} -\delta _N/n&{} 0&{} 0 \end{array}\right) + \alpha I_{(3N+1)\times (3N+1)}, \end{aligned}$$
(25)

where the matrices \(A_u,\, B_u\), and \(B_{uv}\) (whose dependence on \(x\) has been omitted for simplicity) are defined in (17). From this matrix we can now build \({\varUpsilon }\) (according to what is indicated in [19] or [41]) in the following way:

$$\begin{aligned} {\varUpsilon } \,:=\, \left[ \begin{array}{cccc} s_1 &{} -t_{12} &{} \cdots &{} -t_{1(N+1)}\\ -t_{21} &{} s_2 &{} \cdots &{} -t_{2(N+1)}\\ \vdots &{}&{} \ddots &{}\vdots \\ -t_{(N+1)1} &{} -t_{(N+1)2} &{} \cdots &{} s_{N+1} \end{array} \right] + \alpha I, \end{aligned}$$

where the constants \(s_i\) and \( t_{ij},\, i,j = 1,\ldots , N+1\), are related to the blocks in (25) and, more precisely, are given by

$$\begin{aligned} s_i :=\min _{(x, \rho )\in K_e}\lambda _{\min } \left( \begin{array}{ccc} A_u &{}0&{}0 \\ 0&{}B_u&{}0\\ 0&{}0&{}0 \end{array} \right) =0, \; i = 1,\ldots , N, \qquad s_{N+1} :=\min _{(x, \rho )\in K_e} \lambda _{\min } (0) = 0, \end{aligned}$$

where \(\lambda _{\min }(A)\) denotes the minimum eigenvalue of the matrix \(A\), while

$$\begin{aligned} t_{ij} :=\left\{ \begin{array}{ll} \max _{(x, \rho )\in K_e}\left\| \begin{array}{ccc} 0 &{}0&{}0 \\ 0&{}B_{ij}&{}0\\ 0&{}0&{}0 \end{array} \right\| = \max _{(x, \rho )\in K_e} B_{ij} &{} \mathrm{if}\, i, j= 1,\ldots , N \\ \max _{(x, \rho )\in K_e}\left\| \begin{array}{c} 0 \\ \delta _i/n\\ 0 \end{array} \right\| = \delta _i/n&{} \mathrm{if}\, j=N+1\\ \max _{(x, \rho )\in K_e}\left\| \begin{array}{c} 0 \\ -\delta _j/n\\ 0 \end{array} \right\| = \delta _j/n&\mathrm{if}\, i=N+1. \end{array}\right. \end{aligned}$$

It is clear that the matrix \({\varUpsilon }\) is a \(Z\) matrix (meaning that all its off-diagonal elements are nonpositive); therefore, if we write \({\varUpsilon } \ge \tilde{\varUpsilon }\) (where \(\ge \) indicates component-wise \(\ge \)) and \(\tilde{\varUpsilon }\) is both a \(Z\) and a \(P\) matrix, then \({\varUpsilon }\) is a \(P\) matrix as well (this is an easy consequence of [13, Theorem 3.11.10]). By the above discussion we can write

$$\begin{aligned} {\varUpsilon } \,&=\, \left[ \begin{array}{cccc} \alpha &{} -\displaystyle {\max _{(x, \rho )\in K_e}} B_{12} &{} \cdots &{} -\delta _1/n \\ -\displaystyle {\max _{(x, \rho )\in K_e}} B_{21} &{} \alpha &{} \cdots &{} -\delta _2/n \\ \vdots &{}&{} \ddots &{}\vdots \\ -\delta _1/n &{} -\delta _2/n &{} \cdots &{} \alpha \end{array} \right] \\&\ge \, \left[ \begin{array}{cccc} \alpha &{} -\frac{3}{n} \frac{\delta _{\max }}{(1-U_{\max })^3} &{} \cdots &{} -\frac{3}{n} \frac{\delta _{\max }}{(1-U_{\max })^3} \\ -\frac{3}{n} \frac{\delta _{\max }}{(1-U_{\max })^3} &{} \alpha &{} \cdots &{} -\frac{3}{n} \frac{\delta _{\max }}{(1-U_{\max })^3} \\ \vdots &{}&{} \ddots &{}\vdots \\ -\frac{3}{n} \frac{\delta _{\max }}{(1-U_{\max })^3} &{} -\frac{3}{n} \frac{\delta _{\max }}{(1-U_{\max })^3} &{} \cdots &{} \alpha \end{array} \right] \, :=\, \tilde{\varUpsilon } \end{aligned}$$

The matrix \(\tilde{\varUpsilon }\) is clearly a \(Z\)-matrix. In order to check that it is also \(P\), we can equivalently check, see [13, Lemma 5.3.14], that the spectral radius of the matrix

$$\begin{aligned} \left[ \begin{array}{cccc} 0 &{} \frac{1}{\alpha }\frac{3}{n} \frac{\delta _{\max }}{(1-U_{\max })^3} &{} \cdots &{} \frac{1}{\alpha }\frac{3}{n} \frac{\delta _{\max }}{(1-U_{\max })^3} \\ \frac{1}{\alpha }\frac{3}{n} \frac{\delta _{\max }}{(1-U_{\max })^3} &{} 0 &{} \cdots &{} \frac{1}{\alpha }\frac{3}{n} \frac{\delta _{\max }}{(1-U_{\max })^3} \\ \vdots &{}&{} \ddots &{}\vdots \\ \frac{1}{\alpha }\frac{3}{n} \frac{\delta _{\max }}{(1-U_{\max })^3} &{} \frac{1}{\alpha }\frac{3}{n} \frac{\delta _{\max }}{(1-U_{\max })^3} &{} \cdots &{} 0 \end{array} \right] \end{aligned}$$

is less than 1. But if \(\alpha > \bar{\alpha }\) this easily follows from the Geršgorin circle theorem, see for example [24, Theorem 6.1.1]. \(\square \)

We note that the fact that, for \(\alpha \) large enough, the matrix \({\varUpsilon }\) is \(P\), actually even positive definite, can be proved relatively easily. Part of the complication of the proof above is due to the fact that we wanted to give an explicit expression for \(\bar{\alpha }\) showing the qualitative behavior of this threshold value. Once again this parameter behaves in an expected way and its dependency on the system parameters goes in the expected direction: the more congested the system is, the higher \(\bar{\alpha }\) can be expected to be. We also remark that the expression of \(\bar{\alpha }\) in the above theorem is obtained through a rather crude majorization of the terms \(B_{uv}\) and \(\delta _u/n\); better, if more complicated, estimates can certainly be obtained, but we do not pursue this issue further.
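
For completeness, the threshold (24) can be computed from the same quantities used in condition (18); a one-line illustrative utility:

```python
def alpha_bar(delta_max, n, N, U_max):
    """Threshold (24) on the regularization parameter alpha."""
    return 3.0 * N * delta_max / (n * (1.0 - U_max) ** 3)
```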

Below we discuss in more detail some important issues.

  • Both Algorithms 1 and 2 stop in Step 1 when a solution of VI \((K_e, F_e)\) and of VI \((K_e, F_e + \alpha (\cdot - (x^k, \rho ^k)))\), respectively, has been reached. In practice, in all cases, one can stop when an inexact solution has been found, provided the degree of inexactness decreases as the algorithms progress. We do not discuss this technical issue here, but refer the reader to [19, 41] instead. In any case this point does not pose any serious practical problem. For example, usually very few inner iterations are needed to reach a very accurate solution of VI \((K_e, F_e + \alpha (\cdot - (x^k, \rho ^k)))\), since as the outer iterations progress and \((x^k, \rho ^k)\) converges, we are solving a sequence of subproblems that are more and more similar. This is confirmed by our numerical experiments in Sect. 7.

  • The problems solved by each user at Step 2 can be rather easily interpreted. The objective function includes two additional terms with respect to the original game. The first term, \(\rho \left( \frac{\delta _u}{n} x_{u,clet}\right) \), is a cost associated with the use of the cloudlet at price \(\rho \). In other words, we penalize the shared constraint (11) and “put it in the objective function” in order to decouple the feasible sets of the users. The second term, \(\alpha \Vert x_u - x_u^k\Vert ^2\), is a classical regularization term that is needed to guarantee strong convexity of the objective function.

  • The problems solved by each user at Step 2 are three-variable strongly convex problems with linear constraints and can be solved extremely efficiently by any commercial optimization software.

  • The updating of the “price” \(\rho \) requires the cloudlet to monitor the system load (the term \(\frac{1}{n} \sum _{u}\delta _u x^i_{u,clet}\)). The system load along with the price \(\rho \) are then sent by the cloudlet to the users which require this information to solve their optimization problem. We observe that the system load can be easily measured at the cloudlet side, and the cloudlet can be easily instrumented to transmit this information to the users exploiting its resources, so the distributed algorithm is amenable to a real-world implementation.

  • We remark once more that our algorithm computes a variational solution of the GNEP (10)–(15), that is, one of the possibly infinitely many equilibria of the game. The variational solution is characterized by the fact that the multipliers of the shared constraint (11) are the same for all users (see [15]). This solution is particularly appealing from a practical point of view, since it can be interpreted as a fairness condition: it implies that the “cost” of using the cloudlet (the multiplier) is the same for all users.

7 Experimental results

In this section we investigate through numerical experiments the behavior of the proposed computation offloading strategy. First, in Sect. 7.1, we compute the system equilibria under different scenarios and study how the users’ tasks are dispatched among the mobile device, the cloudlet, and the remote cloud infrastructure. Then, in Sect. 7.2 we compare the proposed non-cooperative strategy with the social optimum. Our aim is to understand how much the performance degrades due to the selfish behavior of the users.

For the analysis, we implemented the distributed algorithm of Sect. 6 in MATLAB, setting the parameter \(\alpha \) to \(0.1\). The algorithm stops when the norm of the difference between two consecutive iterates is less than \(10^{-4}\).
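As a minimal sketch (our own simplification, not the MATLAB code), the outer loop and the stopping rule can be written as follows; solve_regularized_vi is a hypothetical placeholder for the inner solver (Algorithm 2), z stacks the iterate \((x^k, \rho^k)\), and we assume the tolerance is applied to the change of the outer iterates.

import numpy as np

ALPHA, TOL, MAX_OUTER = 0.1, 1e-4, 1000   # values used in our experiments; MAX_OUTER is a safety cap

def outer_loop(z0, solve_regularized_vi):
    # z is the stacked iterate (x^k, rho^k); the inner solver returns the
    # solution of the regularized VI centered at the current iterate.
    z = np.asarray(z0, dtype=float)
    for k in range(MAX_OUTER):
        z_new = solve_regularized_vi(z, ALPHA)
        if np.linalg.norm(z_new - z) < TOL:   # stop: consecutive iterates are close enough
            return z_new, k + 1
        z = z_new
    return z, MAX_OUTER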

7.1 Non-cooperative strategy analysis

We consider a homogeneous scenario where all users are characterized by the same set of parameters. Unless stated otherwise, as basic setting we consider \(n = 2\) cloudlet servers, \(\lambda _u = 0.25\) task/s, \(1/\mu _{u,m} = 0.5\) s, \(\mu _{u,clet} = 5\mu _{u,m},\, \mu _{u,cloud} = 10\mu _{u,m},\, 1/\mu _{u,wl} = 0.1\) s, \(1/\mu _{u,wn} = 0.4\) s and \(U_{max} = 0.7\). The execution time parameters are consistent with those experimentally measured in [11, 26, 29].

We also set \(\chi =1\), i.e. all tasks can be offloaded to the cloud. Moreover, unless otherwise noted, we do not consider the power consumption constraint, i.e. we set \(P_{u,max} = \infty \).
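For convenience, the baseline homogeneous setting just described can be collected in a small configuration snippet; the field names are ours and carry no meaning beyond this section.

BASELINE = {
    "n_servers": 2,            # cloudlet servers (n)
    "lambda_u": 0.25,          # task arrival rate, task/s
    "t_mobile": 0.5,           # 1/mu_{u,m}, s
    "mu_clet_factor": 5,       # mu_{u,clet} = 5 * mu_{u,m}
    "mu_cloud_factor": 10,     # mu_{u,cloud} = 10 * mu_{u,m}
    "t_wlan": 0.1,             # 1/mu_{u,wl}, s
    "t_wan": 0.4,              # 1/mu_{u,wn}, s
    "U_max": 0.7,              # cloudlet utilization threshold
    "chi": 1.0,                # all tasks can be offloaded to the cloud
    "P_max": float("inf"),     # power constraint disabled unless otherwise noted
}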

In Fig. 4, we show the results of four sets of experiments that investigate the behavior of the non-cooperative strategy as the number of users, the number of cloudlet servers, the task execution time, and the maximum power consumption increase. Note that, since we consider a homogeneous scenario, the users’ strategies coincide; hence, we only need to show the strategy of one user.

Fig. 4 User strategies for different system parameters

In the first set of experiments, we study the computation offloading strategy as the number of users (\(N\)) increases from 20 to 70. From Fig. 4a we can observe that, as long as the cloudlet is not overloaded, the users take full advantage of its computational resources to execute their tasks (\(x_{u,clet}=1\)). As the number of users grows, the cloudlet utilization increases. Eventually, when the utilization hits the threshold \(U_{max}\), which occurs when \(N=56\), the cloudlet cannot serve all the tasks; as \(N\) increases further, a larger percentage of tasks is executed on the mobile nodes themselves. It is worth observing that, nevertheless, the tasks are not dispatched to the remote cloud, because the high network delays offset its faster computational speed. Figure 5a shows the number of user tasks in the system (i.e. the objective function value) for the first set of experiments. As expected, it increases with the number of cloudlet users, because the resource contention increases and a percentage of the tasks must even be executed on the slow mobile devices.
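The saturation point \(N=56\) can be checked with a quick back-of-the-envelope computation, under the assumption (ours, but consistent with the shared constraint (11) and with the observed threshold) that a user offloading all of its tasks contributes \(\delta_u = \lambda_u/\mu_{u,clet}\) to the cloudlet utilization.

lambda_u = 0.25                 # task/s
mu_clet = 5 / 0.5               # mu_{u,clet} = 5 * mu_{u,m} = 10 task/s
delta_u = lambda_u / mu_clet    # assumed per-user load contribution: 0.025
n, U_max = 2, 0.7
N_sat = U_max * n / delta_u     # largest N for which full offloading is feasible
print(round(N_sat, 2))          # -> 56.0, matching the threshold observed in Fig. 4a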

Fig. 5 Number of user tasks in the system

In the second set of experiments, we study the behavior of the proposed strategy as the number of cloudlet servers increases from \(n=2\) to \(n=10\). We set the number of users to \(N=15\) and increase the task execution time, setting it to \(1/\mu _{u,m} = 2.2\) s (so that the local execution on the mobile device is not suitable). The results are shown in Fig. 4b. As expected, increasing the computational power of the cloudlet allows for a larger percentage of tasks to be executed on it, which results, as shown in Fig. 5b, in a reduction of the number of user tasks in the system, also due to the faster network connection to the cloudlet.

In the third set of experiments, we study the computation offloading strategy as the task execution time on the mobile node, \(1/\mu _{u,m}\), ranges from 0.1 s up to 2.2 s (\(\mu _{u,clet}\) and \(\mu _{u,cloud}\) are scaled accordingly). We fix the number of users to \(N=15\) as in the previous experiments and set the number of cloudlet servers to \(n=2\). The results are shown in Fig. 4c. As we can see, at low-to-medium load the users take full advantage of the cloudlet resources (\(x_{u,clet}=1\)), except when the task execution time is very small (\(1/\mu _{u,m} = 0.1\) s), in which case it is more convenient to execute the tasks locally on the mobile device. In particular, \(x_{u,clet}\) remains equal to 1 as long as \(1/\mu _{u,m} \le 1.3\) s, corresponding to a cloudlet utilization of about 0.49. Beyond this point, an ever growing fraction of tasks is offloaded to the remote cloud, because the cloudlet is overloaded by the larger task execution time. Hence, when the cloudlet is overloaded, it is more convenient to dispatch some tasks to the remote cloud rather than to the mobile device, because the delay introduced by the wireless network and the Internet is compensated by the faster execution on the remote cloud.
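Under the same assumption on \(\delta_u\) as in the previous sketch, the reported utilization of about 0.49 at \(1/\mu_{u,m} = 1.3\) s is reproduced as follows.

lambda_u, N, n = 0.25, 15, 2
mu_clet = 5 / 1.3               # mu_{u,clet} = 5 * mu_{u,m}, with 1/mu_{u,m} = 1.3 s
util = N * (lambda_u / mu_clet) / n
print(util)                     # -> 0.4875, i.e. the utilization of about 0.49 reported above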

We now analyze the impact of the constraint on power consumption, which was neglected in the previous experiments by setting \(P_{u,max} = \infty \). Following [31], we set \(P_{u,m} = 0.9\) W and \(P_{u,t} = 1.3\) W, and we study how the offloading strategy changes as \(P_{u,max}\) increases from 0.112 to 0.125 W. We also increase the transfer time over the wireless network, setting \(1/\mu _{u,wl} = 0.5\) s (for example, we can suppose that the access network is congested), while keeping \(1/\mu _{u,m} = 0.5\) s, so that the power consumed to transmit a task weighs more than the power consumed to execute it locally. The results are shown in Fig. 4d. As we can see, when \(P_{u,max} \cong 0.124\hbox {W}\), the users’ strategy is to offload to the cloudlet more than 20 % of the tasks: the high transmission time is compensated by the cloudlet’s faster response time, and the power constraint is still satisfied. However, as the power constraint becomes more stringent, the users progressively reduce the number of offloaded tasks, because offloading consumes too much energy due to the high transfer time over the wireless network.
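The “more than 20 %” figure can also be checked numerically if we assume (our reading, not a formula stated here) that the average power drawn by a user is \(\lambda_u \left[ P_{u,m}\, x_{u,m}/\mu_{u,m} + P_{u,t}\,(x_{u,clet}+x_{u,cloud})/\mu_{u,wl} \right] \le P_{u,max}\); with the parameters above, this bound yields an offloadable fraction of about 23 % at \(P_{u,max} = 0.124\) W.

lambda_u = 0.25
P_m, P_t = 0.9, 1.3             # W, computing and transmitting power, as in [31]
t_m, t_wl = 0.5, 0.5            # 1/mu_{u,m} and 1/mu_{u,wl}, s
P_max = 0.124

base = lambda_u * P_m * t_m                        # power if everything runs locally: 0.1125 W
extra_per_f = lambda_u * (P_t * t_wl - P_m * t_m)  # additional power per unit offloaded fraction: 0.05 W
f_max = (P_max - base) / extra_per_f
print(round(f_max, 2))                             # -> 0.23, i.e. slightly more than 20 % of the tasks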

We now turn our attention to the convergence speed of the proposed distributed algorithm. In our experiments we set \((x^0, \rho ^0) = (x^k, \rho ^k)\) each time Algorithm 2 is executed (step S.0). Furthermore, in the first outer loop the values of \(x\) and \(\rho \) are set to \(\{x_{u,m},x_{u,clet},x_{u,cloud}\} = \{1,0,0\}\ \forall u\) and to \(0\), respectively. Figure 6 shows the number of iterations needed to compute the equilibrium policies. For space limits, we show only the results for the second and third sets of experiments; similar conclusions hold for the other experiments. If we compare Fig. 6a, b with Fig. 4c, b, we can see that whenever the cloudlet is overloaded and the strategy requires distributing the tasks between the remote cloud and the mobile device, the number of required inner iterations, i.e. the number of times step S.2 of Algorithm 2 is executed, grows up to 200. However, the number of iterations can be decreased by up to one third by using the previously computed equilibrium as the initial state of Algorithm 1 (a minimal sketch of this warm-start idea is given after this paragraph). This could be a promising solution to speed up the algorithm convergence in a real environment, where we can expect users to gradually join and leave the system. Furthermore, the intermediate solutions that progressively approximate the new system equilibrium can be used as soon as they are computed, rather than waiting for the algorithm to stop. For example, Fig. 7 shows the intermediate outer loop solutions, i.e. the \(x_u^{k}\) values (this figure refers to the same setting as the first set of experiments, with 60 cloudlet users). As we can see, after 10 outer iterations we already have a good approximation of the system equilibrium.
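The warm-start idea just mentioned can be sketched as follows; this is an illustration only, and run_algorithm_1 is a hypothetical placeholder for the full solver of Sect. 6.

import numpy as np

def recompute_equilibrium(prev_eq, n_users, run_algorithm_1):
    # prev_eq = (x, rho) from the last computed equilibrium, or None on a cold start.
    if prev_eq is None:
        x0 = np.tile([1.0, 0.0, 0.0], (n_users, 1))   # {x_m, x_clet, x_cloud} = {1, 0, 0} for all users
        rho0 = 0.0
    else:
        x_prev, rho0 = prev_eq
        x0 = np.tile([1.0, 0.0, 0.0], (n_users, 1))   # newly joined users start from the cold-start point
        k = min(n_users, x_prev.shape[0])
        x0[:k] = x_prev[:k]                           # returning users reuse their previous strategies
    return run_algorithm_1(x0, rho0)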

Fig. 6 Inner and outer number of iterations of the distributed algorithm

Fig. 7 Intermediate algorithm solution strategies

Finally, we observe that in our experiments we never had problems due to assumption (18) in Theorem 1 not being satisfied. Indeed, our experiments showed that this assumption is not critical from a practical point of view, as the system appears to converge to an equilibrium even when it is not satisfied (for example, this is the case of the experiment in Fig. 4a, where assumption (18) does not hold for \(N>24\)). Nevertheless, the parameter \(\alpha \) should be carefully tuned depending on the cloudlet load to ensure the algorithm convergence. As indicated by Theorem 3, the higher the expected cloudlet load, the larger \(\alpha \) should be in order to ensure convergence. In our experiments, we used \(\alpha = 0.1\) to accommodate the higher loads (even though a smaller \(\alpha \) would have ensured faster convergence at lower loads).

7.2 Comparison with the social optimum

We now compare the proposed non-cooperative strategy with the social optimum solution, to investigate the performance degradation caused by the selfish behaviour of the users. The social problem is the problem of minimizing the sum of all the users’ objective functions (the social cost), i.e. \(\sum _{v=1}^N \lambda _v R_v\), subject to the union of all the user constraints. Under Assumption A, the corresponding problem is a convex optimization problem with linear constraints. We study the social optimum solution with the same set of parameters used in the first set of experiments, as the number of users varies.
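For illustration only, the structure of this single convex program can be sketched as follows; T_v is a hypothetical placeholder for the cost \(\lambda_v R_v\) (the actual response-time expressions are not reproduced here), delta collects the \(\delta_v\) terms, and only the per-user simplex constraints and the shared cloudlet utilization cap are shown.

import numpy as np
from scipy.optimize import minimize

def social_optimum(N, delta, n, U_max, T_v):
    # Decision variables: z = (x_1, ..., x_N) flattened, with x_v = (x_m, x_clet, x_cloud).
    def obj(z):
        x = z.reshape(N, 3)
        return sum(T_v(x[v]) for v in range(N))       # social cost: sum over all users

    cons = [{"type": "eq", "fun": lambda z, v=v: float(z.reshape(N, 3)[v].sum()) - 1.0}
            for v in range(N)]                        # per-user constraint: fractions sum to one
    cons.append({"type": "ineq",                      # shared cap: (1/n) sum_v delta_v x_{v,clet} <= U_max
                 "fun": lambda z: U_max - float(np.sum(delta * z.reshape(N, 3)[:, 1])) / n})
    z0 = np.tile([1.0, 0.0, 0.0], N)
    res = minimize(obj, z0, bounds=[(0.0, 1.0)] * (3 * N), constraints=cons)
    return res.x.reshape(N, 3)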

Figure 8a shows the social optimum solution. As we can see, differently from the non-cooperative solution, the users switch their computation to the mobile devices earlier (\(N = 40\) instead of \(N=56\)), because they are not acting selfishly. Indeed, we can expect a selfish user to offload as much of its computation as possible to the cloud, regardless of what the others do. However, by doing so the users’ performance degrades as the overall load increases, as shown in Fig. 8b. This is the so-called “price of anarchy”. Note also that, under light load, the two solutions coincide, because the cloudlet capacity can accommodate all the tasks.
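For reference, this efficiency loss is commonly quantified by the standard price-of-anarchy ratio (a textbook notion, not an additional metric computed here):
\[
\mathrm{PoA} \;=\; \frac{\sum_{v=1}^N \lambda_v R_v \ \text{at the (worst) equilibrium}}{\sum_{v=1}^N \lambda_v R_v \ \text{at the social optimum}} \;\ge\; 1,
\]
with \(\mathrm{PoA}=1\) under light load, where the two solutions coincide.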

Fig. 8 Comparison of the proposed non-cooperative strategy with the social optimum

8 Conclusions

We have considered the problem of computation offloading in a mobile cloud computing scenario, motivated by the increasing interest in this architectural paradigm. In particular, as suggested by recent literature on this topic, we have studied a three-tier architecture where mobile nodes can offload their computation either to a nearby resource-constrained cloudlet or to a distant tier of resourceful cloud servers. While previous works have either dealt with single-user scenarios, without considering the overall system, or at most with centralized global approaches to the interactions among different mobile users on a resource-limited cloud, in this paper we have focused on a non-cooperative usage scenario where individual users selfishly try to take advantage of the available resources.

We have adopted a game theoretic approach to investigate the dynamics of user interactions, modeling the offloading strategy of mobile users as a Generalized Nash Equilibrium Problem. We have shown existence of an equilibrium and have provided a distributed algorithm to compute an equilibrium strategy for each user. Through a set of numerical experiments we have illustrated the properties of the equilibrium that can be achieved and compared the resulting solutions with the social optimum. The proposed distributed algorithm has a solid theoretical foundation and is appealing for a real-world implementation, since it requires only a limited amount of information that can be easily obtained.

As noted in Sect. 5, our solution refers to the case where cloudlets and remote cloud nodes can be used to improve the user experience for a computational load that could in principle be sustained by the user’s mobile device. We do not consider the case where the user-generated load exceeds the mobile device capacity, and we leave this scenario for future work.

Besides this, other topics may be explored in future research, including a multi-class model of the tasks launched by each user, as well as a monetary cost model for the use of the cloud servers. Furthermore, while we have proved the existence of a solution for the variational inequality, a further step of practical relevance is the selection of a specific variational solution when more than one exists. Besides working on these methodological extensions, we also plan to implement the distributed algorithm in a system prototype, to validate the results in a real environment.