1 Introduction

Online services are applied into many aspects of our daily life. For example, people perform communication, shopping, entertainment and professional activities over the Internet. The online service providers (OSPs) need to develop new technologies, aiming at providing QoS guaranteed service while at the same time, minimizing its operating cost. In general, an OSP operates a number of distinct services simultaneously, so it needs to answer requests from various applications that differ a lot in pattern. To this end, many OSPs develop private cloud systems so as to provision virtualized environment where each service can be properly addressed.

However, private cloud is still not enough. Users’ requests can be of high variance over time. In general, we can roughly classify the customers’ request patterns as following two categories:

  • Routine case: Normal and periodical user requests, such as webpage browsing or video viewing, can be roughly estimated over the time span according to their regularity.

  • Burst case: Sudden increase in request amount can incur a much higher burden, e.g., hundreds of times higher, than the routine case.

These requests can undoubtedly bring extensive revenue for the OSP, but in the meanwhile, it brings huge burden for the server to answer these requests. A naive way for an OSP to cope with these requests is to prepare a very powerful private cloud system, such that even when the burst occurs, it can still answer all requests. However, this approach is neither economically efficient nor robust. The hardware and management cost is surely very high, and for most of the time, the request amount is much lower than the peak time, so many servers remain idle. Even if one prepares such a system, it can hardly be guaranteed to work normally at the peak time. In fact, there have been many cases that online shopping sites crashed in shopping festivals (e.g., the Black Friday in western countries or Double Eleven in China, which draws much more requests than the routine workloads [1]).

Alternatively, hybrid cloud is a promising solution to this problem. Hybrid cloud means a combination of a private cloud operated by the OSP, and a public cloud system operated by a third-party cloud service provider (CSP). To address the routine case, most requests can be answered by the local private cloud; and for the burst case, the OSP can resort those requests beyond its capacity to the public cloud.

The fundamental questions of the hybrid cloud approach are resource allocation and pricing. In particular, given the request pattern of users, from the OSP’s perspective, it has to decide how to build up a private cloud in terms of the amount of each resource (e.g., CPU, memory, storage, etc.) to be prepared; and facing the incoming requests, how to properly distribute them into private/public clouds. From the CSP’s perspective, it needs to decide how to set prices to sell cloud services so as to maximize its profit.

These questions are non-trivial to answer due to the following challenges. First, the stochastic nature of the request arrival pattern makes it difficult to determine the local resource arrangement. Second, since the arrival rate of requests is very high in a large scale system, the resource allocation algorithm must be dynamic and time efficient, but finding the optimal allocation is a very complex problem: even if we know the request distribution a priori, it is still NP-complete. Last but not least, the CSP’s pricing decision and the OSP’s allocation and purchasing decisions are highly coupled, so we need have a deep understanding on their economic correlations and dynamics. All these challenges make the resource allocation and pricing problem in hybrid cloud an open question so far.

To tackle these challenges, we propose a hierarchical approach to analyze the hybrid cloud framework. In particular, we use a Stackelberg game to capture the interactions between the OSP and the CSP, based on which we design algorithms to decide the OSP’s strategy in purchasing the public cloud resources, as well as its allocation mechanism for the incoming requests. We also design an algorithm for the CSP to decide its pricing scheme that approximately maximizes its profit. Due to the NP-completeness of each sub-problem, we design efficient and effective approximation algorithms, and use theoretic analysis and simulations to validate their high accuracy. We believe this gives important guidelines to design practical hybrid cloud systems.

To sum up, we design a comprehensive framework to design resource allocation and pricing in hybrid cloud. Our main contributions are:

  • We reveal the interactions of OSP and CSP, and present a hierarchical analytical framework for the hybrid cloud.

  • We propose approximation algorithms for resource allocation and pricing, and show their efficiency and accuracy both theoretically and empirically.

  • Base on real-trace simulation, we show our design can achieve satisfactory performance in practice.

This is the outline of this paper. In Sect. 2, we formulate the hybrid cloud as a Stackelberg game model. Section 3 proposes two approximation algorithms to address the request distribution problem for the OSP. Based on this, we derive the dynamic resource allocation algorithm for the OSP in Sect. 4. In Sect. 5 we use an approximation algorithm to decide the optimal pricing scheme for the CSP. Real-trace evaluation is given in Sect. 6. Section 7 discusses some possible extensions to our framework. Section 8 states related work and Sect. 9 concludes.

2 System and game model

In this section, we present our system model and problem formulation, provide background material on a game-theoretic model called the Stackelberg model, and finally develop this game-theoretic model as a framework for capturing the complex interactions between the OSP and CSP in hybrid cloud environments.

2.1 System model and problem formulation

In this paper, we consider the hybrid cloud as a hierarchical model (as shown in Fig. 1), which includes the users, OSP and CSP in every individual layer. The primary process is that the OSP collects requests and serves network users. However, when requests cannot be fully satisfied in the private cloud of OSP, they are resorted to the public cloud in CSP. In what follows, we will give an overview of each part of the system, and show mathematical analysis respectively.

Fig. 1
figure 1

Hierarchical system model

2.1.1 Layer 1: users and OSP

Layer 1 depicts the interactions between network users and OSP. In this layer, users request for network services and try to access OSP’s computing resources. These requests are heterogenous and meanwhile, they call for different combination of computing resources to fulfill. After collecting users’ requests, the OSP assembles its computing resources and allocates them to serve users’ heterogenous requests. This process happens in local area, i.e., only between users and their corresponding OSP. Now we formulate the interactions between users and OSP in following paragraphs.

We assume that users’ requests can be classified into n types. In reality, they can represent various applications. Each request type requires a specific collection of different computing resources, for example, CPU, memory and storage. We can define \(\mathcal {R}\) as the set of all request types, i.e.,

$$\begin{aligned} \mathcal {R}=\left\{ \mathcal {R}_{1},\ldots ,\mathcal {R}_{i},\ldots ,\mathcal {R}_{n}\right\} , i=1,2,\ldots ,n, \end{aligned}$$
(1)

where \(\mathcal {R}_{i}=\left( r_{i1},\ldots ,r_{ij},\ldots ,r_{im}\right) \) is a type i request. Each element \(r_{ij}\) denotes the amount of resource j required to satisfy \(\mathcal {R}_{i}\).

Assume that the number of incoming request in type i, i.e., \(\mathcal {R}_{i}\), is a random variable \(K_{i}\) in any given time t. Note that the number of each incoming request is calculated in a time interval rather than a time instant. Without loss of generality, we formulate \(K_{i}\) to be an independent Gaussian distributionFootnote 1 as

$$\begin{aligned} K_{i}^{t}:{\mathcal {G}}_i^{t}\left( k_i\right) =\frac{1}{\sqrt{2\pi }\sigma _{i}^{t}}\exp \left[ -\frac{\left( k_{i}-\mu _{i}^{t}\right) ^{2}}{2{\left( \sigma _{i}^{t}\right) }^{2}}\right] \sim N \left( \mu _{i}^{t},\sigma _{i}^{t}\right) , \end{aligned}$$
(2)

where the parameter \(\mu _{i}^{t}\) and \(\sigma _{i}^{t}\) are given distributions over t, and they represent the request incoming patterns. We use a truncated Gaussian distribution to characterize the number of incoming request, where \(K_{i}^{t}\in \left[ k_{imin}, k_{imax}\right] \), where \(0\leqslant k_{imin}\ll \mu _{i}^{t}\) and \(k_{imax}\gg \mu _{i}^{t}\). Since the probability \(K_{i}^{t}<0\) or \(K_{i}^{t}>k_{imax}\) can be neglected, we can still use a Gaussian distribution to approximate \(K_{i}^{t}\). Figure 2 shows \(\mu _{i}^{t}\) in a routine or burst case, respectively.

We use \(\varvec{\varphi }_{t}\) to represent the instantaneous amount of incoming requests in all types, which can be formulated as

$$\begin{aligned} \varvec{\varphi }_{t}=\left( K_{1}^{t},\ldots ,K_{i}^{t},\ldots ,K_{n}^{t}\right) . \end{aligned}$$
(3)
Fig. 2
figure 2

Examples of \(\mu _{i}^{t}\). a a routine case b a burst case

Now let us consider how to capture the property of the private cloud system operated by the OSP. Let \(\mathcal {N}\) be the amount of resources possessed by this private cloud, then it can be represented by

$$\begin{aligned} \mathcal {N}=\left( \mathcal {N}_{1},\ldots ,\mathcal {N}_{j},\ldots ,\mathcal {N}_{m}\right) ,\quad j=1,2,\ldots ,m, \end{aligned}$$
(4)

where each element represents the amount of each resource available in the private cloud, e.g., CPU, memory, storage, etc. For each kind of resource, it is associated with a unit operating cost, denoted by \(\mathcal {P}_{l}=\left( p_{l1},\ldots ,p_{lj},\ldots ,p_{lm}\right) \).

2.1.2 Layer 2: OSP and CSP

When requests cannot be fully satisfied in the private cloud, they are resorted to the public cloud. The public cloud of CSP allocates its own computing resources to serve sorted requests and in the meanwhile, charges for its leased resources to OSP. Finally, the OSP rents CSP’s computing resource and serves resorted requests.

A classic methodology, vector bin packing [2, 3] provisions an optimal combination of virtual machines (VMs) to cover any given set of request demands. On measuring the fee paid by the OSP to the CSP, we can view it as the sum of fee paid for each kind of resource, where the unit price vector is \(\mathcal {P}_{c}=\left( p_{c1},\ldots ,p_{cj},\ldots ,p_{cm}\right) \).

Now let us focus on the profit of the OSP and the CSP at a particular time point t. We denote the allocation strategy used to distribute incoming requests as \(\mathcal {A}\). Let \(\mathbf {I}_{t}\) be the set of requests from users at time t. The OSP allocates a subset of these requests \(\mathbf {L}_{t}(\mathcal {A})\) to its private cloud, and the rest \(\mathbf {C}_{t}(\mathcal {A})\) to the public cloud. Resource demand of \(\mathbf {I}_{t}\), \(\mathbf {L}_{t}(\mathcal {A})\) and \(\mathbf {C}_{t}(\mathcal {A})\) are denoted as \(\mathcal {S}_{I}^{t}\), \(\mathcal {S}_{L}^{t}(\mathcal {A})\) and \(\mathcal {S}_{C}^{t}(\mathcal {A})\), respectively, each demand being a vector of resources needed to satisfy the corresponding requests. Obviously, we have \(\mathbf {I}_{t}=\mathbf {L}_{t}(\mathcal {A})\bigcup \mathbf {C}_{t}(\mathcal {A})\) and \(\mathcal {S}_{I}^{t}=\mathcal {S}_{L}^{t}(\mathcal {A})+\mathcal {S}_{C}^{t}(\mathcal {A})\). In essence, \(\mathbf {L}_{t}(\mathcal {A})\), \(\mathbf {C}_{t}(\mathcal {A})\), \(\mathcal {S}_{L}^{t}(\mathcal {A})\) and \(\mathcal {S}_{C}^{t}(\mathcal {A})\) are functions of the allocation strategy \(\mathcal {A}\). Given any deterministic \(\mathcal {A}\), the issue of request allocation will be settled immediately.

Upon satisfying a request, the OSP owns a profit for this service provided. We denote \(\mathcal {U}_{t}\) as the sum of this utility at time t. We can express the utility (or payoffFootnote 2) of the OSP and the CSP as

$$\begin{aligned} \varPi _{L}^{t}= & {} \mathcal {U}_{t}-\mathcal {N}\mathcal {P}_{l}-\mathcal {S}_{C}^{t}(\mathcal {A})\mathcal {P}_{c}=\varPi _{L}^{t}\left( {\mathcal {P}_{c},\mathcal {N}, \mathcal {A}}\right) , \end{aligned}$$
(5)
$$\begin{aligned} \varPi _{C}^{t}= & {} \mathcal {S}_{C}^{t}(\mathcal {A})\mathcal {P}_{c}=\varPi _{C}^{t}\left( {\mathcal {P}_{c},\mathcal {N}, \mathcal {A}}\right) , \end{aligned}$$
(6)

where the OSP’s utility equals the utility of satisfying all requests minus its cost, and the CSP’s utility equals its revenue. The terms \(\mathcal {N}\mathcal {P}_{l}\) and \(\mathcal {S}_{C}^{t}(\mathcal {A})\mathcal {P}_{c}\) are indeed scalar products, and we have \(\mathcal {N}\mathcal {P}_{l}=\sum \limits _{j=1}^{m}\mathcal {N}_{j}p_{lj}\) and \(\mathcal {S}_{C}^{t}(\mathcal {A})\mathcal {P}_{c}=\sum \limits _{j=1}^{m}S_{cj}^{t}(\mathcal {A})p_{cj}\), where \(S_{cj}^{t}(\mathcal {A})\) is the element of \(\mathcal {S}_{C}^{t}(\mathcal {A})\).

We should note that the operating cost of OSP (i.e., \(\mathcal {P}_{l}\)) is used to maintain the running of private resources, for example, CPU, memory and storage. In other words, the cost indicates the power consumption. Table 1 lists important notations introduced in this section.

Table 1 Main notations

2.2 Stackelberg’s model of duopoly

In game theory and economics terms, Stackelberg’s duopoly game [4] was proposed to model the interaction between the leader firm and follower firm. Considering a market in which there are two firms and they produce the same kind of good. Assume that Firm 1 is the leader firm who occupies the majority of market share and Firm 2 is the follower to Firm 1. Each firm’s strategic variable is its good output respectively, and different with other game models, the firms in Stackelberg game make their own decisions sequentially rather than simultaneously. That is to say, Firm 1 (i.e., the leader) chooses its output and then, Firm 2 (i.e., the follower) handles its output by knowing Firm 1’s output. It indicates that Firm 2’s strategy is a function which associates its own output with each Firm 1’s output.

In order to find the subgame perfect equilibria, backward induction is used in Stackelberg game. The basic idea of backward induction is listed as follows:

  • Step 1: find the outputs of Firm 2 that maximize its profit, given any output of Firm 1;

  • Step 2: find Firm 1’s output that maximizes its profit, given the output strategy of Firm 2.

By applying backward induction, we can obtain the subgame perfect equilibria of a Stackelberg game, and guarantee some useful properties, for example, the first-mover’s equilibrium profit and equilibrium outputs. Interested readers can refer to reference [4] for more details.

2.3 Game theoretic approach to hybrid cloud system

In this paper, we use a Stackelberg game [4] to capture the interactions between the OSP and the CSP, where the CSP is the leader and the OSP is the follower. Meanwhile, the OSP/CSP’s computing resource can be regarded as good in market. We use this game theoretical model because in reality, the CSP decides its public cloud resource sale scheme first. Once the decision is made, the CSP usually does not change it frequently. Based on this scheme, the OSP decides its local resource preparation accordingly in order to maximize its own utility. Thus, we can claim that the two players in game move sequentially. Given the allocation strategy \(\mathcal {A}\), \(\varPi _{L}^{t}\left( {\mathcal {P}_{c},\mathcal {N}, \mathcal {A}}\right) \) and \(\varPi _{C}^{t}\left( {\mathcal {P}_{c},\mathcal {N},\mathcal {A}}\right) \) are deterministic and therefore, we can omit the notation \(\mathcal {A}\) for simplicity in the rest of this paper. We formally describe the game as follows.

  • Players. The online service provider (OSP) and the cloud service provider (CSP).

  • Payoff. The payoff of the OSP is

    $$\begin{aligned} \varPi _{L}\left( {\mathcal {P}_{c},\mathcal {N}}\right) =\int _{t}\varPi _{L}^{t}\left( {\mathcal {P}_{c},\mathcal {N}}\right) \text {d}t, \end{aligned}$$
    (7)

    and the payoff of the CSP is

    $$\begin{aligned} \varPi _{C}\left( {\mathcal {P}_{c},\mathcal {N}}\right) =\int _{t}\varPi _{C}^{t}\left( {\mathcal {P}_{c},\mathcal {N}}\right) \text {d}t. \end{aligned}$$
    (8)

    We will formally describe these payoff functions in later sections.

  • Game strategy. The CSP decides the pricing strategy \(\mathcal {P}_{c}\) for the public cloud. The OSP decides the resource \(\mathcal {N}\) powered by the private cloud.

The solution to this game is called Stackelberg equilibrium (or SE), which can be solved by applying the backward induction [5] as follows.

  1. 1.

    Assume the CSP fixes \(\mathcal {P}_c\). The OSP solves the problem \(\mathcal {N}^{*}\left( \mathcal {P}_{c}\right) =\mathop {\hbox {arg max}}\limits _{\mathcal {N}}\varPi _{L}\left( \mathcal {P}_{c},\mathcal {N}\right) \), where the item \(\mathcal {N}^{*}\left( \mathcal {P}_{c}\right) \) is the optimal resource allocation for OSP given the pricing strategy \(\mathcal {P}_{c}\) of CSP.

  2. 2.

    By knowing \(\mathcal {N}^{*}\left( \mathcal {P}_{c}\right) \), the CSP solves \(\mathcal {P}_{c}^{*}=\mathop {\hbox {arg max}}\limits _{\mathcal {P}_{c}}\varPi _{C}\left( \mathcal {P}_{c},\mathcal {N}^{*}\left( \mathcal {P}_{c}\right) \right) \).

Our following formulation will show that both payoff functions are continuous with respect to the decision variables, thus the two steps are both optimization problems over a compact set, so the existence of SE is guaranteed. Note that one may also meet the discrete case when modeling a similar problem. The option to tackle this point is approximating the discrete function with a continuous one or modifying the payoff function to a discrete manner.

Correspondingly, the next sections are organized in a backward manner. In Sect. 3, we propose efficient solutions to address \(\mathcal {A}\) and decide \(\mathbf {L}_t\) for any given \(\mathcal {N}\) and \(\mathcal {P}_{c}\). This serves as the basis of our game analysis. Section 4 analyzes the OSP’s problem for any given CSP’s strategy. Section 5 decides the CSP’s optimal pricing scheme so as to obtain the Stackelberg equilibrium.

3 Request distribution: private or public

Prior to solving the Stackelberg game, let us first consider given the strategies of the OSP and the CSP, how can the OSP allocate the incoming requests into the private and public clouds so as to maximize its utility. In this section, we formulate this problem into an optimization problem, prove its NP-completeness, and then present two approximation algorithms for efficient solutions. In other words, we focus on how to design a rational and efficient \(\mathcal {A}\) in following parts of this section.

3.1 The static optimization problem

Given private cloud resource \(\mathcal {N}\), an OSP needs to decide which requests to be allocated in the private cloud for processing, and which are resorted to the public cloud. Note that due to the various nature of requests, they have different “appropriateness” to be put in local. For example, requests with sensitive information, high security requirement or high data transmission cost should be processed locally with a high priority. In here, we use a notation \(\lambda _{ip}\) to represent this appropriateness of putting the p th request of the i th type in the private cloud. Later we use \(\varLambda _t\) to represent the matrix consisting of elements \(\lambda _{ip}\). We use a binary variable \(x_{ip}\) to denote whether this request is allocated in the private cloud, i.e., \(x_{ip}=1\) if yes or 0 otherwise. In general, the OSP aims at solving the following problem, denoted by OPT:

$$\begin{aligned} \begin{array}{lll} \max &{} \mathop {\sum }\limits _{i=1}^{n}\mathop {\sum }\limits _{p=1}^{K_{i}^{t}}\lambda _{ip}x_{ip} \\ \text {subject \,\, to} &{} \mathop {\sum }\limits _{i=1}^{n}\mathop {\sum }\limits _{p=1}^{K_{i}^{t}}r_{ij}x_{ip}\leqslant \mathcal {N}_{j}, \forall 1\le j\le m, \\ &{} x_{ip}\in \left\{ 0,1\right\} . \end{array} \end{aligned}$$
(9)

In this subsection, we treat OPT as a static optimization at t, and based on this we will further determine the OSP’s best choice in the next section. Now let us prove that OPT is NP-complete.

Theorem 1

OPT is NP-complete.

Proof

Given any solution to OPT, we can verify the solution in a polynomial time. Thus OPT is NP.

Then we build OPT from the Binary Knapsack Problem (BKP), one of Karp’s 21 NP-complete problems [6, 7]. The decision of BKP is NP-complete and its optimization is NP-hard. A dimensional extension of BKP leads to the Multiple Knapsack Problem (MKP), which is at least NP-hard [8, 9]. Suppose we have a knapsack with a limit of \(\left\{ \mathcal {N}_{1},\ldots ,\mathcal {N}_{j},\ldots ,\mathcal {N}_{m}\right\} \). Given a set of items, each with a weight of \(\left( r_{i1},\ldots ,r_{ij},\ldots ,r_{im}\right) \) and a value of \(\lambda _{i}\). Next we determine an item collection to maximize its value under the constraint of the sum of weights. Therefore, OPT reduces to MKP and is thus NP-complete. \(\square \)

Given that OPT is NP-complete, we design approximation algorithms to achieve the sub-optimality in an efficient manner. In particular, we propose the greedy mechanism and the load balancing mechanism. The first one has an advantage in the low computational complexity, while the latter is nearer to the optimality.

3.2 A greedy sorting mechanism for OPT

Now we design the following Approximated Greedy Sorting (APPGS) algorithm depicted in Algorithm 1. Let \(\left\{ \mathbf {L}_{t},\mathbf {C}_{t}\right\} =APPGS\left( \varvec{\varphi }_{t},\varLambda _{t},\mathcal {N}\right) \) be the output of this algorithm. The algorithm consists of three steps:

  • Step 1: Initialization. In this step, all user requests are randomly arranged and denoted by the term \(m_{v}\).

  • Step 2: Sorting. We sort the requests in a sequence such that \(\frac{\lambda _{v}^{*}}{\sqrt{\sum _{j=1}^{m}v_{vj}^{2}}}\) is non-decreasing in v. We use \(\mathbf {I}\) to denote this sequence and \(m_{v}^{*}\) to denote the index of each request.

  • Step 3: Allocation. We use the greedy algorithm to generate \(\mathbf {L}_{t}\), the set of requests to process in the private cloud, and also \(\mathbf {C}_{t}\), the set of requests to the public cloud, as \(\mathbf {I} - \mathbf {L}_{t}\).

We next analyze the time complexity of Algorithm 1.

figure e

Theorem 2

The time complexity of Algorithm 1 is polynomial.

Proof

Let the number of user requests be N. In Step 2, the sorting process is of \(O\left( N\log N\right) \) complexity. In Step 3, we need to traverse N user requests, and its computational complexity is \(O\left( N\right) \). In conclusion, the computational complexity of Algorithm 1 is \(O\left( N\log N\right) \). \(\square \)

The APPGS is based on one of our previous works, and readers interested in the approximation ratio and other details can refer to [10].

3.3 A load balancing mechanism for OPT

In this subsection, we present another approximation algorithm with a better utility performance but a bit higher complexity. We call it Approximated Load Balancing (APPLB) algorithm, which derives from [11].

Let us use a toy example to illustrate the intuition of our algorithm. We consider requests associated with two dimensional resource requirements, i.e., CPU and storage. In Fig. 3a, the private cloud accepts requests \(m_{1}\) and \(m_{2}\), but the rest available resources are not enough for any other incoming requests. But if the OSP chooses \(m_1\), \(m_3\) and \(m_4\) into the private cloud, its resource utilization ratio is higher (see Fig. 3b). Thus, our objective is to design an algorithm that maximizes the resource utilization in the private cloud.

Now we derive the request selection for m-dimensional resource in a vector view. Each request can be denoted as

$$\begin{aligned} \mathbf {m}_{v}=\, & {} \alpha _{v}^{1}\mathbf {e}_{1}+\ldots +\alpha _{v}^{j}\mathbf {e}_{j}+\ldots +\alpha _{v}^{m}\mathbf {e}_{m}\nonumber \\=\, & {} \left( \alpha _{v}^{1},\ldots ,\alpha _{v}^{j},\ldots ,\alpha _{v}^{m}\right) ,\quad v=1,2,\ldots ,V \end{aligned}$$
(10)

where \(\mathbf {e}_{j}\) is the unit vector of the j th resource type. When selecting the next request denoted by \(\mathbf {m}_{v+1}\), we minimize the dot product \(\langle \mathbf {m}_{v},\mathbf {m}_{v+1}\rangle \). The intuition of our approach is shown in Fig. 4, where we select the vector \(\mathbf {m}_{2}\) by minimizing the dot product \(\langle \mathbf {m}_{1},\mathbf {m}_{2}\rangle \). Selection of \(\mathbf {m}_{3}\) also follows the same rule. Now let us depict the load balancing algorithm \(\left\{ \mathbf {L}_{t},\mathbf {C}_{t}\right\} =APPLB\left( \varvec{\varphi }_{t},\varLambda _{t},\mathcal {N}\right) \).

Fig. 3
figure 3

Comparison of load decision. a unbalancing load b balancing load

Fig. 4
figure 4

A simple diagram of vector selection

In each outer for-loop, Algorithm 2 runs as follows:

  • Step 1: Find requests who can be addressed by private cloud into the set \(\mathbf {T}\).

  • Step 2: We solve \(\mathop {\hbox {arg min}}\limits _{\mathbf {m}_{\theta }\in \mathbf {T}}\) \(\langle \mathbf {m}_{v},\mathbf {m}_{\theta }\rangle \) and place the result into \(\mathbf {H}\). If more than one solution exist, choose the one with the largest appropriateness. Denote our choice as \(\mathbf {\theta }^{*}\).

  • Step 3: Store \(\mathbf {\theta }^{*}\) into \(\mathbf {L}_{t}\), and update \(\mathcal {N}\), \(\mathbf {m}_{v+1}\), \(\mathbf {T}\) and \(\mathbf {I}_{t}\) for the next loop.

Comparing with Algorithm 1, or APPGS, the APPLB algorithm is more efficient in terms of the OSP’s payoff. Although a bit more complex than APPGS, Theorem 3 shows that APPLB is still efficient in time consumption.

Theorem 3

The time complexity of Algorithm 2 is polynomial.

Proof

Assume the OSP has received N user requests. At the \(v^{*}\)-th loop round, we need to find the candidates who fit the current resource constraint, i.e., traversing \(N-v^{*}\) residual requests. Until the termination time, the algorithm runs for \(N+\left( N-1\right) +\ldots +1=\frac{N^2+N}{2}\) times, where each time is only a dot product computing. In conclusion, APPLB has a polynomial computational complexity of \(O\left( N^2\right) \). \(\square \)

figure f

To summarize the two algorithms, APPGS has a lower computational complexity while APPLB is nearer to the optimality. In the remainder of this paper, we use them as a combination, called APP: when the computational size is not very large, we choose APPLB to obtain a higher utility, but if the amount of requests is very large, we opt to apply APPGS instead for better time efficiency. In later sections, we will further use simulations to validate the accuracy as well as efficiency of the two algorithms.

4 OSP’s decision on the private cloud

Recall that in the previous section, we have established a Stackelberg game model which requires a backward induction to seek its solution. In this section, based on the results in the previous section, we solve the second stage of this game, i.e., the OSP’s decision on constructing the private cloud. In short, we will solve \(\mathcal {N}^{*}\left( \mathcal {P}_{c}\right) =\mathop {\hbox {arg max}}\limits _{\mathcal {N}}\varPi _{L}\left( \mathcal {N},\mathcal {P}_{c}\right) \).

4.1 Decision of private cloud

Considering the APP algorithm in Sect. 3, we can rapidly compute the private set \(\mathbf {L}_{t}\) and the public set \(\mathbf {C}_{t}\) as

$$\begin{aligned} \left\{ \mathbf {L}_{t},\mathbf {C}_{t}\right\} =APP\left( \mathbf {{\varphi }}_{t}, \varLambda _{t}, \mathcal {N}\right) . \end{aligned}$$
(11)

Since \(\mathbf {I}_{t}=\mathbf {L}_{t}\bigcup \mathbf {C}_{t}\), and \(\mathcal {S}_{I}^{t}=\mathcal {S}_{L}^{t}+\mathcal {S}_{C}^{t}\), we can simplify Eq. (11) to be

$$\begin{aligned} \mathcal {S}_{L}^{t}=APP\left( \mathbf {{\varphi }}_{t}, \varLambda _{t}, \mathcal {N}\right) =APP\left( \mathcal {N},\mathbf {{\varphi }}_{t}\right) . \end{aligned}$$
(12)

Combining Eqs. (5) and (12), we obtain the OSP’s utility as

$$\begin{aligned} \varPi _{L}^{t}\left( {\mathcal {P}_{c},\mathcal {N}}\right)= & {} \mathcal {U}_{t}-\mathcal {N}\mathcal {P}_{l}-\mathcal {S}_{C}^{t}\mathcal {P}_{c}\nonumber \\= & {} \mathcal {U}_{t}-\left\{ \mathcal {N}\mathcal {P}_{l}+\left[ \mathcal {S}_{I}^{t}-APP\left( \mathcal {N},\mathbf {{\varphi }}_{t}\right) \right] \mathcal {P}_{c}\right\} . \end{aligned}$$
(13)

In Eq. (13), the terms \(\mathcal {P}_{l}\) and \(\mathcal {P}_{c}\) are constant and do not change over time t. Although the terms \(\mathcal {U}_{t}\) and \(\mathbf {{\varphi }}_{t}\) are time-related, they only depend on the incoming requests of users and are irrelevant to OSP’s resource \(\mathcal {N}\). This implies that \(\mathcal {U}_{t}\) and \(\mathbf {{\varphi }}_{t}\) can be regarded as constants when we derive the optimal \(\mathcal {N}\). Therefore, the OSP’s utility is a function of \(\mathcal {N}\) for any given time t. For the OSP, the resources it need to prepare in the private cloud is determined by

$$\begin{aligned} \mathcal {N}^{*}=\mathop {\hbox {arg max}}\limits _{\mathcal {N}} \int _{t}\varPi _{L}^{t}\left( {\mathcal {P}_{c},\mathcal {N}}\right) \text {d}t. \end{aligned}$$
(14)

Since \(\mathcal {U}_{t}\) and \(\mathcal {S}_{I}^{t}\mathcal {P}_{c}\) are constants, we have

$$\begin{aligned} \mathcal {N}^{*}= & {} \mathop {\hbox {arg max}}\limits _{\mathcal {N}}\quad \int _{t}APP\left( \mathcal {N},\varvec{\varphi }_{t}\right) \mathcal {P}_{c}- \mathcal {N}\mathcal {P}_{l}\text {d}t\nonumber \\= & {} \mathop {\hbox {arg max}}\limits _{\mathcal {N}} \int _{t\in \left[ 0,T\right] }\int _{\varvec{\varphi }_{t}\in \varTheta } APP\left( \mathcal {N},\varvec{\varphi }_{t}\right) \mathcal {P}_{c}- \mathcal {N}\mathcal {P}_{l}\text {d}\varvec{\varphi }_{t} \text {d}t\nonumber \\= & {} \mathop {\hbox {arg max}}\limits _{\mathcal {N}}\quad \mathcal {F} \left( \mathcal {N},\mathcal {P}_{c}\right) , \end{aligned}$$
(15)

where \(\mathcal {F}\left( \mathcal {N},\mathcal {P}_{c}\right) = \int _{t\in \left[ 0,T\right] }\int _{\varvec{\varphi }_{t}\in \varTheta }APP \left( \mathcal {N},\varvec{\varphi }_{t}\right) \mathcal {P}_{c} -\mathcal {N}\mathcal {P}_{l}\text {d}\varvec{\varphi }_{t}\text {d}t\), \(\varTheta =\left( {\mathcal {G}}_1^{t}\left( k_1\right) ,{\mathcal {G}}_2^{t}\left( k_2\right) ,\ldots ,{\mathcal {G}}_n^{t}\left( k_n\right) \right) \). In order to simplify the calculation, we proceed to show

$$\begin{aligned} \mathcal {F}= & {} \sum _{j=1}^{m}\int _{t\in \left[ 0,T\right] }\int _{\mathcal {S}_{j}^{t}}APP_{j}\left( \mathcal {N}_{j},\mathcal {S}_{j}^{t}\right) p_{cj} -\mathcal {N}_{j}p_{lj}\text {d}\mathcal {S}_{j}^{t}\text {d}t\nonumber \\= & {} \sum _{j=1}^{m}\mathcal {F}_{j}\left( \mathcal {N}_{j},p_{cj}\right) , \end{aligned}$$
(16)

where \(\mathcal {S}_{j}^{t}\) denotes the j-th resource demand at time t, and obviously, we have

$$\begin{aligned} \mathcal {S}_{j}^{t}:{\mathcal {G}}_j\left( s_{j}^{t}\right) =\frac{1}{\sqrt{2\pi }\hat{\sigma }_{j}^{t}}\exp \left[ -\frac{\left( s_{j}^{t}-\hat{\mu }_{j}^{t}\right) ^{2}}{2\left( {\hat{\sigma }_{j}^t}\right) ^2}\right] \sim N \left( \hat{\mu }_{j}^{t},\hat{\sigma }_{j}^{t}\right) , \end{aligned}$$
(17)

where the parameter \(\hat{\mu }_{j}=\sum \limits _{i=1}^{n}r_{ij}\mu _{i}^{t}\), \(\hat{\sigma }_{j}=\sqrt{\sum \limits _{i=1}^{n}r_{ij}^{2}\left( \sigma _{i}^t\right) ^{2}}\).

If \(\mathcal {S}_{j}^{t}<\mathcal {N}_{j}\), then the OSP can fully address the resource demand without purchasing public cloud resources. When \(\mathcal {S}_{j}^{t}\geqslant \mathcal {N}_{j}\), the private cloud is fully utilized but still cannot satisfy all demands, so public cloud is needed. We can simply define \(APP_{j}\left( \mathcal {N}_{j},\mathcal {S}_{j}^{t}\right) \) as

$$\begin{aligned} APP_{j}\left( \mathcal {N}_{j},\mathcal {S}_{j}^{t}\right) = {\left\{ \begin{array}{ll} \mathcal {S}_{j}^{t} &{} \sum \limits _{i}^{n}k_{imin}\leqslant \mathcal {S}_{j}^{t}<\mathcal {N}_{j},\\ \mathcal {N}_{j} &{} \mathcal {N}_{j}\leqslant \mathcal {S}_{j}^{t}\leqslant \sum \limits _{i}^{n}k_{imax}, \end{array}\right. } \end{aligned}$$
(18)

where \(k_{imim}\) (\(k_{imax}\)) denotes the minimal (maximal) number of requests of type i.

We assume that \(\hat{\mu }_{j}\) and \(\hat{\sigma }_{j}\) have certain distributions over time t, but we do not specify any particular form for them. We can still estimate the lower bound of \(\mathcal {F}_{j}\left( \mathcal {N}_{j},p_{cj}\right) \). Combining Eqs. (16), (17) and (18), we have

$$\begin{aligned} \inf \mathcal {F}_{j}\left( \mathcal {N}_{j},p_{cj}\right) =\frac{p_{cj}\hat{\sigma }_{j}^{t}}{\sqrt{2\pi }}\int _{0}^{T}\rho \left( \mathcal {N}_{j},t\right) \text {d}t-\frac{p_{cj}+2p_{lj}}{2}\mathcal {N}_{j}T-\frac{p_{cj}}{2}\int _{0}^{T}\hat{\mu }_{j}^{t}\text {d}t, \end{aligned}$$
(19)

where

$$\begin{aligned} \rho \left( \mathcal {N}_{j},t\right) =\exp \left[ {-\left( \mathcal {N}_{j}-\hat{\mu }_{j}^{t}\right) ^2/{2\left( {\hat{\sigma }_{j}^t}\right) ^2}}\right] -\exp \left[ {-\left( \sum \limits _{i}k_{imin}-\hat{\mu }_{j}^{t}\right) ^2/{2\left( {\hat{\sigma }_{j}^t}\right) ^2}}\right] . \end{aligned}$$
(20)

Interested readers may prefer to our appendix for more detailed derivations.

For any j, we take the derivative of \(\inf \mathcal {F}_{j}\left( \mathcal {N}_{j},p_{cj}\right) \) and have

$$\begin{aligned} \frac{\partial \inf \mathcal {F}_{j}\left( \mathcal {N}_{j},p_{cj}\right) }{\partial \mathcal {N}_{j}}=-\frac{p_{cj}+2p_{lj}}{2}T +\frac{p_{cj}}{\sqrt{2\pi }}\int _{0}^{T}\frac{\left( \hat{\mu }_{j}^{t}-\mathcal {N}_{j}\right) }{\hat{\sigma }_{j}^{t}} \exp \left[ -\frac{\left( \mathcal {N}_{j}-\hat{\mu }_{j}^{t}\right) ^{2}}{2\left( {\hat{\sigma }_{j}^t}\right) ^2}\right] \text {d}t. \end{aligned}$$
(21)

Let the derivative equal zero and we can obtain the result \(\mathcal {N}_{j}^{*}\). In conclusion, the best arrangement for OSP is \(\mathcal {N}^{*}=\left\{ \mathcal {N}_{1}^{*},\ldots ,\mathcal {N}_{j}^{*},\ldots ,\mathcal {N}_{m}^{*}\right\} \), where

$$\begin{aligned} \mathcal {N}_{j}^{*}=\mathop {\hbox {arg max}}\limits _{\mathcal {N}_{j}}\quad \inf \mathcal {F}_{j}\left( \mathcal {N}_{j},p_{cj}\right) . \end{aligned}$$
(22)

Up till now, we can see that for any given \(p_{cj}\), \(\mathcal {N}_{j}^{*}\) is determined by the above function. Since we remain general forms of the distribution on \(\hat{\mu }_{j}^{t}\) and \(\hat{\sigma }_{j}^{t}\), it is hard for us to obtain any closed-form solution. Specifically, our goal of this paper is to build a theoretical game model to capture the interactions between the OSP and CSP. In other words, we mainly focus on the game-based framework in hybrid cloud systems. We admit that one can still obtain the optimal \(\mathcal {F}_{j}(\mathcal {N}_{j},p_{cj})\) with some other methods, for example, by using the numerical integration or re-formulating the system model. Analysis of different methods to approach optimal \(\mathcal {F}_{j}(\mathcal {N}_{j},p_{cj})\) is not included in this paper, and considering the lower bound of \(\mathcal {F}_{j}(\mathcal {N}_{j},p_{cj})\) is easy to estimated relatively, we use this method to determine \(\mathcal {N}_{j}^{*}\). In what follows, we design an algorithm to approach the optimality.

4.2 An approximation algorithm to OSP

The previous subsection presents a theoretical expression of \(\mathcal {N}^{*}\). Runge–Kutta scheme [12] is an efficient method to search the optimum, and is widely applied in practice. Therefore, we propose an efficient and accurate algorithm based on the Runge–Kutta scheme.

Let \(\digamma _{j}\left( \mathcal {N}_{j}\right) =\frac{\partial \inf \mathcal {F}_{j}\left( \mathcal {N}_{j}\right) }{\partial \mathcal {N}_{j}}\) and the previous problem turns to finding zeros of \(\digamma _{j}\left( \mathcal {N}_{j}\right) \). A fourth order Runge–Kutta (RK4) scheme of \(\digamma _{j}\left( \mathcal {N}_{j}\right) \) is depicted as Algorithm 3.

figure g

In Algorithm 3, h is the step size and \(\varepsilon \) is pre-fixed small real value to control the termination of this algorithm. In every loop of this algorithm, we calculate the intermediate variables \(a_{1}\), \(a_2\), \(a_3\) and \(a_{4}\), and estimate a slope to calculate the next iteration value. According to [12], the RK4 based algorithm has an error bound of \(h^{5}\), and the total error bound is as small as \(h^{4}\). When we apply a small value of h, we can approach the result very close to the optimality.

5 CSP’s decision on pricing strategy

In the previous section, the OSP decides how it implements its private cloud for any given public cloud resource pricing strategy \(\mathcal {P}_{c}\). In this section, by knowing the OSP’s best response, we answer what is the optimal pricing strategy of the CSP, such that its profit can be maximized.

5.1 Problem formulation

Given the \(\hat{\mathcal {N}}_{j}^{*}\) in the previous section, the CSP needs to solve \(\mathcal {P}_{c}^{*}=\mathop {\hbox {arg max}}\limits _{\mathcal {P}_{c}}\varPi _{C}\left( \mathcal {P}_{c},\mathcal {N}^{*}\left( \mathcal {P}_{c}\right) \right) \). Therefore, we have

$$\begin{aligned} \mathcal {P}_{c}^{*}= & {} \mathop {\hbox {arg max}}\limits _{\mathcal {P}_{c}}\quad \int _{0}^{T}\mathcal {S}_{C}^{t}\mathcal {P}_{c}\text {d}t\nonumber \\= & {} \mathop {\hbox {arg max}}\limits _{\mathcal {P}_{c}}\quad \int _{0}^{T}\left( \mathcal {S}_{I}^{t}-\mathcal {S}_{L}^{t}\right) \mathcal {P}_{c}\text {d}t\nonumber \\= & {} \mathop {\hbox {arg max}}\limits _{\mathcal {P}_{c}}\quad \sum \limits _{j=1}^{m}\int _{0}^{T}\left[ \mathcal {S}_{j}^{t}-APP_{j}\left( \mathcal {N}_{j},\mathcal {S}_{j}^{t}\right) \right] p_{cj}\text {d}t\nonumber \\= & {} \mathop {\hbox {arg max}}\limits _{\mathcal {P}_{c}}\quad \sum \limits _{j=1}^{m}\mathcal {H}_{j}\left( p_{cj}\right) . \end{aligned}$$
(23)

The previous section gives \(\mathcal {N}_{j}=\hat{\mathcal {N}}_{j}^{*}\left( p_{cj}\right) \). Combining Eqs. (17) and (18), we have

$$\begin{aligned} \mathcal {H}_{j}\left( p_{cj}\right)= & {} \int _{0}^{T}\int _{\sum \limits _{i}k_{imin}}^{\hat{\mathcal {N}}_{j}^{*} \left( p_{cj}\right) } \left( s_{j}^{t}-s_{j}^{t}\right) {\mathcal {G}}_j\left( s_{j}^{t}\right) p_{cj}\text {d}s_{j}^{t}\text {d}t\nonumber \\&+\int _{0}^{T}\int _{\hat{\mathcal {N}}_{j}^{*}\left( p_{cj}\right) }^{\sum \limits _{i}k_{imax}}\left[ s_{j}^{t}-\hat{\mathcal {N}}_{j}^{*}\left( p_{cj}\right) \right] {\mathcal {G}}_j\left( s_{j}^{t}\right) p_{cj}\text {d}s_{j}^{t}\text {d}t. \end{aligned}$$
(24)

Note that the first term equals 0, because the case \(APP_{j}\left( \mathcal {N}_{j},\mathcal {S}_{j}^{t}\right) =\mathcal {S}_{j}^{t}\) implies the OSP does not need to purchase public resources. Then we can obtain the lower bound of \(\mathcal {H}_{j}\left( p_{cj}\right) \) as

$$\begin{aligned} \inf \mathcal {H}_{j}\left( p_{cj}\right) =\frac{p_{cj}}{\sqrt{2\pi }}\int _{0}^{T}{\hat{\sigma }_{j}^t}\omega \left( p_{cj},t\right) \text {d}t +\frac{p_{cj}}{2}\int _{0}^{T}\hat{\mu }_{j}^t\text {d}t-\frac{1}{2}\hat{\mathcal {N}}_{j}^{*}\left( p_{cj}\right) p_{cj}T, \end{aligned}$$
(25)

where

$$\begin{aligned} \omega \left( p_{cj},t\right) =\exp \left[ -\left( \sum \limits _{i}k_{imax}-\hat{\mu }_{j}^{t}\right) ^{2}/{2\left( {\hat{\sigma }_{j}^t}\right) ^2}\right] - \exp \left[ -\left( \hat{\mathcal {N}}_{j}^{*}\left( p_{cj}\right) -\hat{\mu }_{j}^{t}\right) ^{2}/{2\left( {\hat{\sigma }_{j}^t}\right) ^2}\right] . \end{aligned}$$
(26)

Interested readers can find the detailed derivatives in the appendix. The result we obtain is: the solution to the CSP’s utility maximization problem is \(\mathcal {P}_{c}^{*}=\left( p_{c1}^{*},\ldots ,p_{cj}^{*},\ldots ,p_{cm}^{*}\right) \), where

$$\begin{aligned} p_{cj}^{*}=\mathop {\hbox {arg max}}\limits _{p_{cj}}\quad \inf \mathcal {H}_{j}\left( p_{cj}\right) . \end{aligned}$$
(27)

5.2 Public pricing strategy

To further observe the feature of the solution to CSP, let us first take the derivative of \(\inf \mathcal {H}_{j}\left( p_{cj}\right) \), so we have

$$\begin{aligned} \frac{\text {d}\inf \mathcal {H}_{j}\left( p_{cj}\right) }{\text {d}p_{cj}}= & {} \frac{1}{\sqrt{2\pi }}\int _{0}^{T}{\hat{\sigma }_{j}^t}\omega \left( p_{cj},t\right) \text {d}t +\frac{p_{cj}}{\sqrt{2\pi }}\int _{0}^{T}{\hat{\sigma }_{j}^t}\omega \left( p_{cj},t\right) '\text {d}t+\frac{1}{2}\int _{0}^{T}\hat{\mu }_{j}^t\text {d}t\nonumber \\&-\frac{1}{2}\hat{\mathcal {N}}_{j}^{*}\left( p_{cj}\right) T-\frac{1}{2}\left[ \hat{\mathcal {N}}_{j}^{*}\left( p_{cj}\right) \right] 'p_{cj}T. \end{aligned}$$
(28)

A direct idea for public pricing strategy is to compute the price value such that the above equation equals zero. However, it is very hard to obtain a closed-form solution due to the complicated forms of the formulas. Therefore, we apply an efficient and accurate algorithm to approach the optimum. This algorithm also follows the RK4 methodology and is similar to Algorithm 3.

figure h

Let us present our approach in Algorithm 4, each iteration of which follows the RK4 method. Different from Algorithm 3, this algorithm searches the optimum point of \(\inf \mathcal {H}_{j}(p_{cj})\). Similarly this algorithm has an error bound of \(h^{5}\) in each step, so the overall error bound is \(h^{4}\). By carefully choosing the value of h, we can guarantee a very small error bound.

5.2.1 Summary

Up till now, we have proposed an integrated Stackelberg game model and presented guidelines for the OSP and the CSP to make their optimal decisions. In particular, the CSP decides the pricing scheme as \(\left( {p}_{c1}^{*},\ldots ,{p}_{cj}^{*},\ldots ,{p}_{cm}^{*}\right) \) for each of its public cloud resources, and the OSP decides how to implement its private cloud system by deciding the amount of resources it need to prepare, i.e., \(\left( {\mathcal {N}}_{1}^{*},\ldots ,{\mathcal {N}}_{j}^{*},\ldots ,{\mathcal {N}}_{m}^{*}\right) \). Due to the generality of problem formulation, as well as NP-completeness of the resource allocation problem, we do not provide closed-form solutions. Alternatively, we design efficient and accurate algorithms, and derive the theoretic formulas and bounds for these decisions.

Our goal in this paper is to build a Stackelberg game model for the hybrid cloud system, aiming at capturing the interactions between a single OSP and a single CSP. Future extensions to this model could include more complex interactions that may occur between multiple OSPs and multiple CSPs.

6 Performance evaluation

In this section, we use real-trace data to evaluate the performance of our design. We first show the performance of our designed algorithms APPGS and APPLB, and then compare the utility of the OSP and the CSP when using our algorithms to decide their strategies.

Fig. 5
figure 5

Comparison of static algorithms

6.1 Comparison of static allocation algorithms

We compare the OSP’s utility when using the APPGS and APPLB algorithms, compared with its maximal possible utility (denoted by OPT, and obtained by exhaustive search). We assume that users’ requests can be classified into three types, and each request type requires three different computing resources. We vary the request number from 1 to 150, and the proportion of each request type is fixed. Results are shown in Fig. 5, where we can see that APPGS achieves \(85.71\%\) utility compared to the maximal value, while APPLB achieves \(92.65\%\). We run our algorithms in a normal PC with 3.20GHz CPU, and the total running time of APPGS and APPLB are 1.05 and 107.04 ms, respectively. Therefore, we can use APPLB for a higher utility when computational time is allowed, or APPGS for a faster calculation. In the following, we use the combination of them, denoted by APP.

6.2 Private cloud utility

In this and the next subsections, we evaluate our Algorithms 3 and 4. We consider an OSP with three different type of services. Type 1 is a constant request over time. Type 2 is a time-variant service with strong periodicity. Type 3 represents a burst request (e.g., the Black Friday flooding in online sale websites). The data traces are depicted in Fig. 6, where the x-axis is the time domain and y-axis represents the number of requests from users. Type 1 and Type 3 curves are artificially generated by us, while Type 2 curve is a real-trace data we obtained from a particular online video service company, collected every 30 min from November 4 to 18, 2012. Due to the requests from that company, we anonymize the company’s title, and have normalized the values.

We consider three typical resources in cloud service, i.e., the CPU, memory, and storage; and they comprise the resource vector. Considering above type of services, we give a rational assumption that \(\mathcal {R}_{1}=(0.6,0.6,0.8)\), \(\mathcal {R}_{2}=(1.0,1.0,1.0)\) and \(\mathcal {R}_{3}=(1.5,1.0,1.0)\), respectively. Checking the simple economic relation of computing cost, we set the unit operating cost of OSP as \(\mathcal {P}_{l}=(1.0,0.8,0.6)\) and the CSP’s pricing strategy is \(\mathcal {P}_{c}=\gamma \mathcal {P}_{l}\), where \(\gamma \geqslant 1\). The coefficient \(\gamma \) implies the economic difference between computing resources of OSP and CSP.

There are numbers of related works dealing with tasks scheduling in hybrid cloud (see Sect. 8), but they have very different settings, and their algorithms are not comparable to ours due to different objectives. In here, we compare our design with two baseline approaches. The algorithms we consider are:

  • Stackelberg Game Model (SGM): our approach.

  • Always-Fit-Most scheme (AFM): the private cloud is powerful enough to deal with all requests, even at the peak time point.

  • AVeraGe scheme (AVG): the private cloud deals with half of total requests; the other half are resorted to the public cloud.

Figure 7 shows the OSP’s utility in different schemes, and the accumulative results are depicted in Fig. 8. Although possessing a peak utility value, the AFM scheme has the worst performance. The reason is that the OSP has to preserve large number of idle resources, and it is a huge waste for most of the time. The AVG scheme does not consider the cost difference using private and public cloud, and thus leads to a lower utility due to high fee charged by the CSP. Our SGM scheme achieves the greatest accumulative utility, and is obviously the best of the three approaches.

Fig. 6
figure 6

Service types

Fig. 7
figure 7

Comparison of private utility

Fig. 8
figure 8

Accumulative private utility

Fig. 9
figure 9

Comparison of public utility

6.3 Public cloud utility

In this subsection, we compare the pricing strategy of the CSP using our approach with the optimal pricing scheme. Since we need to use exhaustive search to find the optimality, we apply a simplified setting: we consider only Type 2 requests but omit the rest. We use Algorithm 4 to reach our solution, and exhaustive search to find the optimal price. We present our results in Fig. 9. We find that although our solution on the unit price is lower than the optimal choice, however, the CSP’s utility under our solution reaches \(94.7\%\) to the maximal possible value. To summarize, via extensive simulations we validate that our algorithms can achieve near-optimal solutions with low time complexity.

6.4 Performance impact of \(\gamma \)

In this subsection, we show how parameters affect the private utility. Specifically, we compare the accumulative utility with different coefficient, where \(\gamma \) equals 1, 3, 5 respectively. Note that \(\gamma =1\) indicates that it is an ideal case where the OSP pays same for leasing resources from CSP with its local operating. The results are shown in Figs. 10, 11, 12.

Fig. 10
figure 10

\(\gamma =1.0\)

Fig. 11
figure 11

\(\gamma =3.0\)

Fig. 12
figure 12

\(\gamma =5.0\)

The SGM scheme obtains the best performance in above three figures. The performance of AVG decreases with the increase of coefficient \(\gamma \) since it pays more to the CSP. In particular, AVG’s utility coincides with that of SGM in Fig. 10, and the reason is that the operating cost of OSP is same as the cost to lease resources from CSP, i.e., \(\gamma =1\). The AFM scheme is the worst when \(\gamma =1\) and \(\gamma =3\), while it overtakes the AVG in Fig. 12 since it prepares sufficient resources for the burst case. However, the AFM is inferior to the SGM scheme especially in the routine case, which is common in network service.

7 Discussion

In this section, we discuss some possible extensions to our framework and deal with some issues of previous sections.

7.1 Incomplete information game

In the previous sections, we assume that the CSP may decide its pricing strategy to maximize its own profit by knowing the OSP’s best response (see Sect. 5). In here, an implicit condition is that the CSP can obtain the internal information from the private cloud in the OSP. One may argue how can this be achieved. One way is to do market investigation. Through multiple probe, the CSP can estimate the best response of the OSP under different scenarios.

Another possibility is to use an imperfect information game to model the interactions between the CSP and the OSP, or the Bayesian Stackelberg game. Let us now analyze the hybrid cloud with a classical algorithm, i.e., DOBSS in [13]. For the leader CSP, its pure strategy j indicates the unit price \(p_{cj}\), while the follower OSP’s pure strategy i is \(\mathcal {N}_i\), where \(i,j=1,2,\ldots ,m\). We denote the CSP’s policy as y, which consists of a vector of its pure strategies. The proportion of used j in policy y is denoted as \(y_j\). For any OSP’s policy type \(l\in {L}\), \(x^{l}\) illustrates its vector of pure strategies, where the index set L contains all possible policies of the OSP’s.

Given a priori probability set \(\mathbf {H}=\{h_{l}:l\in {X}\}\), where each element \(h_{l}\) corresponds to each OSP’s policy, the CSP solves the following problem to decide its pricing strategy:

$$\begin{aligned} \begin{array}{ll} \max _{y,x,a} &{} \mathop {\sum }\limits _{j}\mathop {\sum }\limits _{l}\mathop {\sum }\limits _{i} h_{l}\varPi _{C}\left( j,i\right) y_{j}x_{i}^{l} \\ \text {subject to} &{} \mathop {\sum }\limits _{j} y_{j}=1, \\ &{} \mathop {\sum }\limits _{i} x_{j}^{l}=1,\\ &{} 0\leqslant a^{l}-\mathop {\sum }\limits _{j}\varPi _{L}\left( j,i\right) y_{j}\leqslant \left( 1-x_{i}^{l}\right) M,\\ &{} y_{j}\in [0,1], x_{i}^{l}\in \{0,1\}, {l}\in \mathfrak {R}\end{array} \end{aligned}$$
(29)

where \(a^{l}\) is the upper bound on the OSP’s utility given the OSP’s policy l and the CSP’s policy y. We simply omit details about this optimization problem since they are similar to [13].

7.2 General request distribution

7.2.1 Poisson distribution approximation

In Sects. 4 and 5, we assume the number of incoming request to be an independent Gaussian distribution. This assumption is for analytical tractability, and it also represents some realistic properties of incoming request. Now we will show why we use the independent Gaussian distribution to depict the request’s properties.

Given the request type \(\mathcal {R}_{i}\), the number of \(\mathcal {R}_{i}\) over a time interval is a discrete random variable that is often modeled by a Poisson distribution, where \(i=1,2,\cdots , n\). In most analytical cases, we want the corresponding distribution to be a continuous one, whereas the Poisson distribution is obviously discrete. A simple and direct idea is that we can use some other distributions to approximate a Poisson distribution, for example, the Gaussian distribution which is widely applied in engineering statistics. In reality, we have the following theorem.

Theorem 4

Suppose that \(X_{i}\) is the number of type \(\mathcal {R}_{i}\) and \(X_{i}\sim P\left( \lambda _{i}\right) \), we have that the variable

$$\begin{aligned} K_{i}=\frac{X_{i}-\lambda _{i}}{\sqrt{\lambda _{i}}} \end{aligned}$$

is approximately a standard Gaussian random variable. And the approximation is good for \(\lambda _{i}>5\).

We simply omit the proof and interested readers may refer to references [14, 15] for more details. Without of generality, we extend the standard case to a general one, i.e., \(K_{i}\sim N \left( \mu _{i},\sigma _{i}\right) \). Based on this approximation, we can use the Gaussian distribution to depict the incoming request since \(\lambda _{i}>5\) holds in our scheme and therefore, we obtain the continuity correction.

We should also note that in some other scenarios, the arrival rate of request is naturally modeled as a Gaussian distribution and there is no need to apply previous approximation. For example, when staffing a calling center [16] or modeling the arrival of public information [17]. In next part, we will show how to deal with the situation where the request distribution does not follow Gaussian.

7.2.2 Extend to general case

Previous parts make the Gaussian distribution to depict the properties of incoming request. However, we cannot rule out possibilities that the request distribution does not follow Gaussian. Now let us discuss how to address the general case of request distribution. Suppose the number of request type \(\mathcal {R}_{i}\) is a random variable \(K_{i}\sim {\mathcal {J}}_i\left( k_i\right) \). The definition and the instantaneous amount of incoming requests are the same as Eqs. (1) and (3). Now we present a framework for the general situation.

  • Step 1: Static Optimization. In this step, the OSP allocates the incoming requests into the private and public clouds, by applying the approximation algorithms in Sect. 3. This can be expressed as

    $$\begin{aligned} \left\{ \mathbf {L}_{t},\mathbf {C}_{t}\right\} =APP\left( \mathbf {{\varphi }}_{t}, \varLambda _{t}, \mathcal {N}\right) , \end{aligned}$$
    (30)

    where \(\mathbf {L}_{t}\) and \(\mathbf {C}_{t}\) are the private and cloud set, respectively.

  • Step 2: OSP’s Construction. Based on the private set \(\mathbf {L}_{t}\), the OSP can obtain the its own required resource denoted as \(\mathcal {S}_{L}^{t}\). The OSP’s utility is

    $$\begin{aligned} \varPi _{L}^{t}=\mathcal {U}_{t}-\mathcal {N}\mathcal {P}_{l}-\left( \mathcal {S}_{I}^{t}-\mathcal {S}_{L}^{t}\right) \mathcal {P}_{c}. \end{aligned}$$
    (31)

    Therefore, the OSP can arrange its private cloud as

    $$\begin{aligned} \mathcal {N}^{*}=\mathop {\hbox {arg max}}\limits _{\mathcal {N}}\quad \int _{T}\varPi _{L}^{t}\text {d}t. \end{aligned}$$
    (32)
  • Step 3: CSP’s Decision. The CSP can compute its required resource \(\mathcal {S}_{C}^{t}\) when knowing the cloud set \(\mathbf {C}_{t}\). Therefore, the CSP’s utility can be shown as

    $$\begin{aligned} \varPi _{C}^{t}=\mathcal {S}_{C}^{t}\mathcal {P}_{c}|_{\mathcal {N}^{*}}, \end{aligned}$$
    (33)

    and it can update its pricing strategy as

    $$\begin{aligned} \mathcal {P}_{c}^{*}=\mathop {\hbox {arg max}}\limits _{\mathcal {P}_{c}}\quad \int _{T}\varPi _{C}^{t}|_{\mathcal {N}^{*}}\text {d}t. \end{aligned}$$
    (34)

The above steps present a framework for general distributions of incoming request. It is difficult to obtain a closed-form solution, but can be addressed by heuristic algorithms. Deriving the heuristics of general cases is beyond the scope of this paper.

8 Related work

The resource/request allocation problems in the cloud have attracted a lot of attentions. Zhen et al. [18] presented a virtualization system to dynamically allocate data center resources and support green computing. Alicherry and Lakshman [19] developed efficient resource allocation algorithms in distributed clouds. The objective is minimizing the latency between selected data centers. With a ranking mechanism, Ergu et al. [20] modeled the task-oriented problem of resource allocation in a cloud computing environment. Dán et al. [21] considered the dynamic content allocation problem for a content delivery system, which combines cloud-based storage with low cost dedicated servers. In [22], Hao et al. proposed a non-prior method for VMs allocation in a distributed cloud to decrease the network cost. Shi et al. in [23] represented the first online combinatorial and truthful auction mechanism to model dynamic provisioning of VMs in a cloud paradigm.

Meanwhile, a number of research has been focusing on the hybrid cloud system as well. In [24], Armbrust et al. showed the hybrid cloud can achieve better performance than single public or private cloud. Bittencourt et al. showed the main characteristics of scheduling in [25], and proposed a cost optimization algorithm for the hybrid cloud in [26]. Concerning resource allocation and performance, Lee et al. proposed a cluster-based architecture to allocate resources in [27]. Authors in [28] discussed hybrid cloud management with deadline constraints. Zhou et al. [29] addressed the privacy issue in a hybrid cloud, where sensitive data are kept in trusted private cloud while insensitive data are moved to the public cloud. Beloglazov et al. [30] proposed several resource allocation policies and algorithms to realize Green IT.

Up till now, we have not found any work that combine the design of resource allocation and hybrid cloud. Different from all these previous works, we consider an online service application with specific request arrival patterns, and design the combinatorial resource allocation and pricing strategies in a hybrid cloud system.

9 Conclusion and future work

In this paper, we propose a hybrid cloud design to address the heterogeneous and time varying requests in online service applications. In particular, we use a Stackelberg game model to capture the interactions between the public cloud and the private cloud, based on which we answer (1) for the online service provider, how to implement the private cloud system in terms of local resource preparation, and (2) for the public cloud provider, how to decide the optimal prices for the cloud resources so as to maximize its profit. We prove the NP-completeness of the resource allocation problem, and propose efficient and effective algorithms to approach the optimality. We show, both theoretically and empirically, that our algorithms is time-efficient and achieves near-optimal solutions. We believe this gives important guidelines to design practical hybrid cloud systems.

This paper is merely an initial work on the particular topic of hybrid systems, and we only consider one OSP and one CSP in the hybrid cloud system. Future extensions to this paper could include more complex interactions that may occur between multiple OSPs and multiple CSPs.