
1 Introduction

The online load balancing problem has been an active research topic since the last century. Instead of the traditional measure of algorithm performance, the competitive ratio (e.g., [2, 3, 5, 12]), we adopt another well-known measure, the regret, introduced by [6].

In this paper we define the online load balancing problem as follows. There are K parallel servers, and the protocol is defined as a game between the player and the environment. On each round \(t=1,\dots , T\), (i) the player selects a distribution \(\varvec{\alpha }_{t}\) over the K servers, which can be viewed as an allocation of data, and (ii) the environment then assigns a loaded condition \(l_{t,i}\) to each server i, so that the loss of server i is \(\alpha _{t,i}l_{t,i}\). The goal of the player is to minimize the makespan of the cumulative loss vector of all servers after T rounds, i.e., \(\max _{i=1,\dots ,K} \sum _{t=1}^T \alpha _{t,i}l_{t,i}\), relative to the makespan of the optimal static allocation \(\varvec{\alpha }^*\) in hindsight. More precisely, the goal is to minimize the regret, the difference between the player’s makespan and the optimal static makespan. The makespan can be viewed as the \(L_\infty \)-norm of the vector of cumulative losses of the servers (we give a formal definition of the problem in the next section).

Even-Dar et al. [6] gave an algorithm based on the regret minimization framework by employing an additional concept, Blackwell approachability [4] with respect to the \(L_{2}\)-norm, to a target set defined in the following section. Their algorithm achieves a regret bound of \(O(\sqrt{KT})\). In the same paper, another algorithm, DIFF, achieves a regret bound of \(O((\ln K)\sqrt{T})\). Rakhlin et al. [13] showed a theoretical result for the online load balancing problem: the optimal regret bound is \(O(\sqrt{(\ln K)T})\), rather than \(O((\ln K)\sqrt{T})\). However, no efficient algorithm achieving this regret bound was given in their paper.

Subsequently, the equivalence between Blackwell approachability and online linear optimization (OLO) was shown in [1], and the equivalence between approachability and online convex optimization (OCO), via the support function, was shown in [15].

These works [1, 15] imply that Blackwell approachability with respect to a general norm is guaranteed by the sublinearity of the regret of the OCO problem to which the Blackwell approaching game reduces. Building on this result, we give an efficient algorithm for the online load balancing problem that achieves the best known regret.

More specifically, we propose algorithms for online load balancing with respect to arbitrary norms under a natural assumption. Our algorithm is composed of three reductions. (i) The first reduction is from the load balancing problem to a Blackwell approaching game in a general metric. In this reduction we extend the \(L_{2}\)-norm setting of [6] to an arbitrary norm under a reasonable assumption; the metric of the reduced Blackwell approaching game is induced by the norm of the load balancing problem. This reduction implies that the regret of the load balancing problem can be bounded by the convergence rate of the corresponding Blackwell approaching game. (ii) The second reduction follows existing work: due to [15], we reduce the Blackwell approaching game to an OCO problem by showing that such a reduction exists. Thus we can bound the regret of online load balancing by the regret of the corresponding OCO problem. (iii) The last reduction is from the OCO problem to two OLO problems, so that we can predict with FTRL.

Consequently, we can compute the allocation over the servers on each round of online load balancing from the predictions of the two corresponding OLO problems, and we bound the regret of the online load balancing problem by the OLO regret.

Thus our technical contributions are the following:

  • We propose a new reduction technique from online load balancing to a Blackwell approaching game. This reduction allows us to use general norms, induced by the online load balancing problem, in the Blackwell approaching game, rather than the \(L_2\)-norm used in previous work [6]. Based on this reduction, online load balancing with a general norm can be reduced to an OCO problem by the reduction technique of Shimkin [15], and further to OLO problems. In conclusion, the online load balancing problem reduces to two OLO problems along our reduction route, and the regret of online load balancing is bounded by the regret of the corresponding OLO problems.

  • In particular, following the reduction route above, we give an efficient algorithm for online load balancing w.r.t. the \(L_\infty \)-norm, achieving the best known regret of \(O(\sqrt{T\ln K})\). The algorithm involves linear programming and second-order cone programming and runs in polynomial time per round. This is the first polynomial time algorithm achieving \(O(\sqrt{T\ln K})\) regret.

2 Preliminaries

First we give some notations. We use \(\left\Vert \cdot \right\Vert \) to denote a norm of a vector. Moreover, for a norm \(\left\Vert \cdot \right\Vert \), \(\left\Vert \varvec{x} \right\Vert _*\) denotes the dual norm of \(\left\Vert \varvec{x} \right\Vert .\) A norm \(\left\Vert \cdot \right\Vert \) over \(\mathbb {R}^d\) is monotone if \(\left\Vert \varvec{x} \right\Vert \le \left\Vert \varvec{y} \right\Vert \) whenever \(|x_i| \le |y_i|\) for every \(1 \le i \le d\).

2.1 Online Load Balancing

We begin with the standard (offline) load balancing problem. Suppose that we have K servers to perform a simple task over a large amount of data. The task can be easily parallelized in such a way that we can break the data into K pieces and assign them to the servers, and each server then processes its subtask in time proportional to the size of the data assigned. An example is finding blacklisted IP addresses in an access log. Each server is associated with a loaded condition, expressed in terms of “the computation time per unit of data”. The goal is to find a data assignment to the servers that equalizes the computation times of all servers. In other words, we want to minimize the makespan, defined as the maximum computation time over all servers.

Formally, the problem is described as follows: The input is a K-dimensional vector \(\varvec{l}=(l_1, l_2, \ldots , l_K) \in \mathbb {R}_+^K\), where each \(l_i\) represents the loaded condition of the i-th server. The output is a K-dimensional probability vector \(\varvec{\alpha }= (\alpha _1, \alpha _2, \ldots , \alpha _K) \in \varDelta (K) = \{ \varvec{\alpha }\in [0,1]^K \mid \sum _{i=1}^K \alpha _i = 1 \}\), where each \(\alpha _i\) represents the fraction of data assigned to the i-th server. The goal is to minimize the makespan \(\left\Vert \varvec{\alpha }\odot {\varvec{l}} \right\Vert _\infty \), where \(\varvec{\alpha }\odot {\varvec{l}}= (\alpha _1 l_1, \alpha _2 l_2, \ldots , \alpha _K l_K)\). Note that it is clear that the optimal solution is given by \(\alpha _i = l_i^{-1}/\sum _{j=1}^K l_j^{-1}\), which equalizes the computation time of every server as \( C_\infty ^*({\varvec{l}})\,{\mathop {=}\limits ^{\text {def}}}\, \min _{\varvec{\alpha }\in \varDelta (K)} \left\Vert \varvec{\alpha }\odot {\varvec{l}} \right\Vert _\infty = \frac{1}{\sum _{j=1}^K 1/l_j}.\)
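The closed form above is easy to check numerically. The sketch below (plain Python; the function names are ours, not from the paper) computes the optimal allocation for strictly positive loads and verifies that it equalizes the per-server computation times.

```python
# Optimal static allocation for the offline load balancing problem:
# alpha_i = (1/l_i) / sum_j (1/l_j), which equalizes alpha_i * l_i over i
# and attains C*_inf(l) = 1 / sum_j (1/l_j).  Assumes all loads > 0.

def optimal_allocation(loads):
    """Allocation minimizing the makespan max_i alpha_i * l_i."""
    inv_sum = sum(1.0 / l for l in loads)
    return [(1.0 / l) / inv_sum for l in loads]

def makespan(alpha, loads):
    return max(a * l for a, l in zip(alpha, loads))
```

For example, with loads (1, 2, 4) the allocation is proportional to (1, 1/2, 1/4) and every server finishes in the same time \(1/1.75\).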

Note also that the objective is generalized to the \(L_p\)-norm for any p in the literature.

In this paper, we consider a more general objective \(\left\Vert \varvec{\alpha }\odot {\varvec{l}} \right\Vert \) for an arbitrary norm that satisfies certain assumptions stated below. In the general case, the optimal value is denoted by \( C^*({\varvec{l}}) {\mathop {=}\limits ^{\text {def}}} \min _{\varvec{\alpha }\in \varDelta (K)} \left\Vert \varvec{\alpha }\odot {\varvec{l}} \right\Vert .\)

Assumption 1

Throughout the paper, we put the following assumptions on the norm. (i) The norm is monotone, and (ii) the function \(C^{*}\) is concave.

Note that the first assumption is natural for load balancing, and both assumptions are satisfied by the \(L_p\)-norm for \(p > 1\).

Now we proceed to the online load balancing problem with respect to a norm \(\left\Vert \cdot \right\Vert \) that satisfies Assumption 1. The problem is described as a repeated game between the learner and the environment who may behave adversarially. In each round \(t = 1, 2, \ldots , T\), the learner chooses an assignment vector \(\varvec{\alpha }_t \in \varDelta (K)\), and then receives from the environment a loaded condition vector \({\varvec{l}}_t \in [0,1]^K\), which may vary from round to round. After the final round is over, the performance of the learner is naturally measured by \(\left\Vert \sum _{t=1}^T \varvec{\alpha }_t \odot {\varvec{l}}_t \right\Vert \). We want to make the learner perform nearly as well as the performance of the best fixed assignment in hindsight (offline optimal solution), which is given by \(C^*(\sum _{t=1}^T {\varvec{l}}_t)\). To be more specific, the goal is to minimize the following regret:

$$\begin{aligned}\text {Regret}(T) = \left\Vert \sum _{t=1}^T \varvec{\alpha }_t \odot {\varvec{l}}_t \right\Vert - C^*\left( \sum _{t=1}^T {\varvec{l}}_t\right) .\end{aligned}$$
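As an illustration, the protocol and the regret for the \(L_\infty \)-norm can be simulated in a few lines. The sketch below is a toy harness, not part of our algorithmic results: the uniform player is only a placeholder, the function names are ours, and all loaded conditions are assumed strictly positive so that \(C_\infty ^*\) has the closed form given earlier in this section.

```python
# A minimal simulation of the online load balancing protocol (L_inf norm).
# `alloc_fn` is any player strategy mapping the current cumulative losses
# to a distribution over the K servers; `uniform` is just a placeholder.

def regret_makespan(alloc_fn, load_seq, K):
    cum = [0.0] * K        # cumulative loss alpha_{t,i} * l_{t,i} per server
    total = [0.0] * K      # cumulative loaded conditions sum_t l_t
    for loads in load_seq:
        alpha = alloc_fn(cum)                  # player moves first
        for i in range(K):
            cum[i] += alpha[i] * loads[i]
            total[i] += loads[i]
    player = max(cum)                               # makespan (L_inf norm)
    offline = 1.0 / sum(1.0 / t for t in total)     # C*(sum_t l_t)
    return player - offline

def uniform(_cum):
    return [1.0 / 3] * 3
```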

2.2 Repeated Game with Vector Payoffs and Approachability

We briefly review the notion of Blackwell’s approachability, which is defined for a repeated game with vector payoffs. The game is specified by a tuple \((A, B, r, S, \text {dist})\), where A and B are convex and compact sets, \(r:A \times B \rightarrow \mathbb {R}^d\) is a vector-valued payoff function, \(S \subseteq \mathbb {R}^d\) is a convex and closed set called the target set, and \(\text {dist}:\mathbb {R}^d \times \mathbb {R}^d \rightarrow \mathbb {R}_+\) is a metric. The protocol proceeds in trials: In each round \(t=1, 2, \ldots , T\), the learner chooses a vector \(\varvec{a}_t \in A\), the environment chooses a vector \(\varvec{b}_t \in B\), and then the learner obtains a vector payoff \(\varvec{r}_t \in \mathbb {R}^d\), given by \(\varvec{r}_t = r(\varvec{a}_t, \varvec{b}_t)\). The goal of the learner is to make the average payoff vector arbitrarily close to the target set S.

Definition 1 (Approachability)

For a game \((A,B,r,S,\mathrm {dist})\), the target set S is approachable with convergence rate \(\gamma (T)\) if there exists an algorithm for the learner such that the average payoff \(\bar{\varvec{r}}_T = (1/T) \sum _{t=1}^T \varvec{r}_t\) satisfies

$$ \mathrm {dist}(\bar{\varvec{r}}_T,S) {\mathop {=}\limits ^{\mathrm {def}}} \min _{\varvec{s}\in S} \mathrm {dist}(\bar{\varvec{r}}_T, \varvec{s}) \le \gamma (T) $$

against any environment. In particular, we simply say that S is approachable if it is approachable with some convergence rate \(\gamma (T) = o(1)\).

Blackwell characterized approachability in terms of the support function, as stated below.

Definition 2

For a set \(S \subseteq \mathbb {R}^d\), the support function \(h_S: \mathbb {R}^d \rightarrow \mathbb {R}\cup \{\infty \}\) is defined as

$$ h_S(\varvec{w}) = \sup _{\varvec{s}\in S} \langle \varvec{s}, \varvec{w} \rangle . $$

It is clear from definition that \(h_S\) is convex whenever S is convex.

Definition 3

(Blackwell [4]). A game \((A,B,r,S,\mathrm {dist})\) satisfies the Blackwell condition if and only if

$$\begin{aligned} \forall \varvec{w}\in \mathbb {R}^d \; \left( \min _{\varvec{a}\in A} \max _{\varvec{b}\in B} \langle \varvec{w}, r(\varvec{a}, \varvec{b}) \rangle \le h_S(\varvec{w}) \right) . \end{aligned}$$
(1)

Remark 1

In [4], Blackwell characterized the approachability of a target set for \(L_{2}\)-norm metric in terms of the Blackwell condition.

In what follows, we consider only norm metrics, i.e., \(\mathrm {dist}(\varvec{r},\varvec{s}) = \left\Vert \varvec{r}- \varvec{s} \right\Vert \) for some norm \(\left\Vert \cdot \right\Vert \) over \(\mathbb {R}^d\). The following proposition is useful.

Proposition 1

For any \(\varvec{w}\in \mathbb {R}^d\), \(\varvec{s}^* = \arg \max _{\varvec{s}\in S} \langle \varvec{s}, \varvec{w} \rangle \) is a subgradient of \(h_S\) at \(\varvec{w}\).

Proof

For any \(\varvec{w}, {\varvec{u}}\in \mathbb {R}^d\), let \(\varvec{s}^* = \arg \max _{\varvec{s}\in S} \langle \varvec{s}, \varvec{w} \rangle \) and \(\varvec{s}^{\varvec{u}} = \arg \max _{\varvec{s}\in S} \langle \varvec{s}, {\varvec{u}} \rangle \). Since \(\langle \varvec{s}^*, {\varvec{u}} \rangle \le \langle \varvec{s}^{\varvec{u}}, {\varvec{u}} \rangle \), we have

$$\begin{aligned} h_S(\varvec{w}) - h_S({\varvec{u}})&= \sup _{\varvec{s}\in S} \langle \varvec{s}, \varvec{w} \rangle - \sup _{\varvec{s}\in S}\langle \varvec{s}, {\varvec{u}} \rangle = \langle \varvec{s}^*, \varvec{w} \rangle - \langle \varvec{s}^{\varvec{u}}, {\varvec{u}} \rangle \\&\le \langle \varvec{s}^*, \varvec{w}- {\varvec{u}} \rangle , \end{aligned}$$

which implies the proposition.   \(\square \)
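For a polytope \(S = \mathrm {conv}(V)\) given by a finite vertex list, the sup in the support function is attained at a vertex, which makes Proposition 1 easy to check numerically. The sketch below (a toy example with names of our choosing) verifies the subgradient inequality \(h_S({\varvec{u}}) \ge h_S(\varvec{w}) + \langle \varvec{s}^*, {\varvec{u}}- \varvec{w} \rangle \) on a triangle.

```python
# Support function of a polytope S = conv(V) and the subgradient property
# of Proposition 1: s* = argmax_{s in S} <s, w> satisfies, for every u,
#   h_S(u) >= h_S(w) + <s*, u - w>.

def support(vertices, w):
    """Return (h_S(w), s*); for a polytope the sup is attained at a vertex."""
    best = max(vertices, key=lambda s: sum(si * wi for si, wi in zip(s, w)))
    return sum(si * wi for si, wi in zip(best, w)), best

V = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]   # a triangle in the plane
```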

2.3 Online Convex Optimization

In this subsection we briefly review online convex optimization with some known results. See, e.g., [7, 14] for more details.

An online convex optimization (OCO) problem is specified by a pair (W, F), where \(W \subseteq \mathbb {R}^d\) is a compact convex set called the decision set and \(F \subseteq \{f:W \rightarrow \mathbb {R}\}\) is a set of convex functions over W called the loss function set. The OCO problem (W, F) is described by the following protocol between the learner and the adversarial environment. For each round \(t=1,2,\ldots ,T\), the learner chooses a decision vector \(\varvec{w}_t \in W\) and then receives from the environment a loss function \(f_t \in F\). In this round, the learner incurs the loss \(f_t(\varvec{w}_t)\). The goal is to make the cumulative loss of the learner nearly as small as the cumulative loss of the best fixed decision. To be more specific, the goal is to minimize the following regret:

$$ \mathrm {Regret}_{(W,F)}(T) = \sum _{t=1}^T f_t(\varvec{w}_t) - \min _{\varvec{w}\in W} \sum _{t=1}^T f_t(\varvec{w}). $$

Here we add the subscript (W, F) to distinguish it from the regret for online load balancing.

Any OCO problem can be reduced to an online linear optimization (OLO) problem, which is an OCO problem with linear loss functions. More precisely, an OLO problem is specified by (W, G), where \(G \subseteq \mathbb {R}^d\) is the set of cost vectors, such that the loss function at round t is \(\langle \varvec{g}_t, \cdot \rangle \) for some cost vector \(\varvec{g}_t \in G\). For the OLO problem (W, G), the regret of the learner is thus given by

$$ \mathrm {Regret}_{(W,G)}(T) = \sum _{t=1}^T \langle \varvec{g}_t, \varvec{w}_t \rangle - \min _{\varvec{w}\in W} \sum _{t=1}^T \langle \varvec{g}_t, \varvec{w} \rangle . $$

The reduction from OCO to OLO is simple. Run any algorithm for the OLO problem (W, G) with \(\varvec{g}_t \in \partial f_t(\varvec{w}_t)\); it then achieves \(\mathrm {Regret}_{(W,F)}(T) \le \mathrm {Regret}_{(W,G)}(T)\), provided that G is large enough, i.e., \(G \supseteq \bigcup _{f \in F, \varvec{w}\in W} \partial f(\varvec{w})\). Indeed, by convexity, \(f_t(\varvec{w}_t) - f_t(\varvec{w}) \le \langle \varvec{g}_t, \varvec{w}_t - \varvec{w} \rangle \) for every \(\varvec{w}\in W\).
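The linearization step can be illustrated on a one-dimensional toy instance with \(f_t(w) = |w - b_t|\) on \(W = [-1,1]\): feeding \(g_t = \mathrm {sign}(w_t - b_t)\) to a projected-gradient OLO algorithm, the OCO regret never exceeds the OLO regret. The instance and names below are ours, chosen only for illustration.

```python
# Linearizing OCO losses: feed g_t, a subgradient of f_t at w_t, to any
# OLO algorithm.  With f_t(w) = |w - b_t| on W = [-1, 1], a subgradient
# is sign(w_t - b_t), and convexity gives Regret_(W,F) <= Regret_(W,G).

def sign(x):
    return 1.0 if x > 0 else (-1.0 if x < 0 else 0.0)

def run_linearized(b_seq, eta):
    w, ws, gs = 0.0, [], []
    for b in b_seq:
        ws.append(w)
        g = sign(w - b)                          # subgradient of f_t at w_t
        gs.append(g)
        w = max(-1.0, min(1.0, w - eta * g))     # projected OLO step
    return ws, gs
```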

A standard FTRL (follow-the-regularized-leader) strategy for the OLO problem (W, G) is to choose \(\varvec{w}_t\) as

$$\begin{aligned} \varvec{w}_t = \arg \min _{\varvec{w}\in W} \left( \sum _{s=1}^{t-1} \langle \varvec{g}_s, \varvec{w} \rangle + \eta _t R(\varvec{w}) \right) , \end{aligned}$$
(2)

where \(R:W \rightarrow \mathbb {R}\) is a strongly convex function called the regularizer and \(\eta _t \in \mathbb {R}_+\) is a parameter. For the strategy (2), the following regret bound is known.

Proposition 2

([14]). Suppose that the regularizer \(R:W \rightarrow \mathbb {R}\) is \(\sigma \)-strongly convex w.r.t. some norm \(\Vert \cdot \Vert \), i.e., for any \(\varvec{w}, {\varvec{u}}\in W\) and any \(\varvec{z}\in \partial R(\varvec{w})\), \(R({\varvec{u}}) \ge R(\varvec{w}) + \langle \varvec{z},{\varvec{u}}-\varvec{w}\rangle + \frac{\sigma }{2}\Vert {\varvec{u}}-\varvec{w}\Vert ^2 \). Then, for the OLO problem (W, G), the regret of the strategy (2) satisfies

$$ \mathrm {Regret}_{(W,G)}(T) = O(D_R L_G\sqrt{T/\sigma }), $$

where \(D_R = \sqrt{\max _{\varvec{w}\in W}R(\varvec{w})}\), \(L_G=\max _{\varvec{g}\in G}\Vert \varvec{g}\Vert _*\) and \(\eta _t=(L_G/D_R)\sqrt{T/\sigma }\).

Note, however, that the strategy does not consider computational feasibility at all. For an efficient reduction, we need an efficient algorithm that computes a subgradient \(\varvec{g}\in \partial f(\varvec{w})\) when given (a representation of) \(f \in F\) and \(\varvec{w}\in W\), and an efficient algorithm for solving the convex optimization problem (2).
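As an example of a case where (2) is trivially efficient: with the quadratic regularizer \(R(\varvec{w}) = \Vert \varvec{w}\Vert _2^2/2\) (1-strongly convex w.r.t. the \(L_2\)-norm) and an \(L_2\)-ball decision set, the minimizer of (2) is just the Euclidean projection of \(-\sum _{s<t} \varvec{g}_s/\eta _t\) onto the ball. A minimal sketch, with function names of our own:

```python
import math

# FTRL step (2) with R(w) = ||w||_2^2 / 2 on the L2 unit ball:
#   argmin_{||w||_2 <= 1}  <theta, w> + eta * ||w||_2^2 / 2
# equals the Euclidean projection of -theta/eta onto the ball,
# where theta is the sum of the past cost vectors.

def ftrl_l2_ball(theta, eta):
    w = [-x / eta for x in theta]
    norm = math.sqrt(sum(x * x for x in w))
    if norm > 1.0:                      # project onto the unit ball
        w = [x / norm for x in w]
    return w
```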

For the particular OLO problem (W, G) with the \(L_1\)-ball decision set \(W = \{ \varvec{w}\in \mathbb {R}^d \mid \left\Vert \varvec{w} \right\Vert _1 \le 1 \}\), an algorithm called EG\(^\pm \) [8] finds the optimal solution of (2) with an entropic regularizer in linear time and achieves the following regret bound.

Theorem 2

([9]). For the OLO problem (W, G) with \(W = \{ \varvec{w}\in \mathbb {R}^d \mid \left\Vert \varvec{w} \right\Vert _1 \le 1 \}\) and \(G = \{ \varvec{g}\in \mathbb {R}^d \mid \left\Vert \varvec{g} \right\Vert _\infty \le M \}\), EG\(^\pm \) achieves

$$ \mathrm {Regret}_{(W,G)}(T) \le M\sqrt{2T \ln (2d)}. $$
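A minimal sketch of EG\(^\pm \) in FTRL form (function names are ours): it maintains exponential weights over 2d coordinates, a positive and a negative copy per dimension, and the prediction is their normalized difference, which lies in the \(L_1\) ball. With the learning rate below, the bound of Theorem 2 holds.

```python
import math

# EG± on the L1 ball, FTRL form: with theta the cumulative cost vector,
#   w_i = (exp(-eta*theta_i) - exp(+eta*theta_i)) / Z,
# where Z normalizes the 2d exponential weights.  Returns the regret
# Regret_(W,G)(T) against the best fixed point of the L1 ball.

def eg_pm_regret(cost_seq, d, M):
    T = len(cost_seq)
    eta = math.sqrt(2.0 * math.log(2 * d) / T) / M
    theta = [0.0] * d
    loss = 0.0
    for g in cost_seq:
        pos = [math.exp(-eta * t) for t in theta]
        neg = [math.exp(+eta * t) for t in theta]
        Z = sum(pos) + sum(neg)
        w = [(p - n) / Z for p, n in zip(pos, neg)]   # ||w||_1 <= 1
        loss += sum(wi * gi for wi, gi in zip(w, g))
        theta = [t + gi for t, gi in zip(theta, g)]
    best = -max(abs(t) for t in theta)    # min over the L1 ball of <theta, w>
    return loss - best
```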

3 Main Result

In this section, we propose a meta-algorithm for online load balancing, which is obtained by combining a reduction to two independent OLO problems and an OLO algorithm (as an oracle) for the reduced problems. Note that the reduced OLO problems depend on the choice of norm for online load balancing, and the OLO problems are further reduced to some optimization problems defined in terms of the norm. For efficient implementation, we assume that the optimization problems are efficiently solved.

Now we consider the online load balancing problem on K servers with respect to a norm \(\left\Vert \cdot \right\Vert \) defined over \(\mathbb {R}^K\) that satisfies Assumption 1. The reduction we show consists of three reductions: the first is to a repeated game with vector payoffs, the second is to an OCO problem, and the last is to two OLO problems. In the subsequent subsections, we give these reductions, respectively.

3.1 Reduction to a Vector Payoff Game

We will show that the online load balancing problem can be reduced to the following repeated game with vector payoffs, denoted by \(P = (A,B,r,S,\mathrm {dist})\), where

  • \(A = \varDelta (K)\),    \(B = [0,1]^K\),

  • \(r: A \times B \rightarrow \mathbb {R}^K \times \mathbb {R}^K\) is the payoff function defined as \(r(\varvec{\alpha }, {\varvec{l}}) = (\varvec{\alpha }\odot {\varvec{l}}, {\varvec{l}})\),

  • \(S = \{ (\varvec{x},\varvec{y}) \in [0,1]^K \times [0,1]^K \mid \left\Vert \varvec{x} \right\Vert \le C^*(\varvec{y}) \}\), and

  • \(\mathrm {dist}\) is the metric over \(\mathbb {R}^K \times \mathbb {R}^K\) defined as \(\mathrm {dist}(\varvec{r}, \varvec{s}) = \left\Vert \varvec{r}- \varvec{s} \right\Vert ^+\), where \(\left\Vert \cdot \right\Vert ^+\) is the norm over \(\mathbb {R}^K \times \mathbb {R}^K\) defined as

    $$ \left\Vert (\varvec{x},\varvec{y}) \right\Vert ^+ = \left\Vert \varvec{x} \right\Vert + \left\Vert \varvec{y} \right\Vert . $$

Here we use the convention that \(\mathbb {R}^{2K} = \mathbb {R}^K \times \mathbb {R}^K\). Note that the target set S is convex since \(\left\Vert \cdot \right\Vert \) is convex and \(C^*\) is concave by our assumption. Note also that it is easy to verify that \(\left\Vert \cdot \right\Vert ^+\) is a norm whenever \(\left\Vert \cdot \right\Vert \) is a norm, and its dual is

$$\begin{aligned} \left\Vert (\varvec{x},\varvec{y}) \right\Vert ^+_* = \max \{ \left\Vert \varvec{x} \right\Vert _*, \left\Vert \varvec{y} \right\Vert _* \}. \end{aligned}$$
(3)

The reduction is similar to that in [6], but there the metric is defined by the fixed \(L_2\)-norm, regardless of the norm used for online load balancing.

Proposition 3

Assume that we have an algorithm for the repeated game P that achieves convergence rate \(\gamma (T)\). Then, the algorithm, when directly applied to the online load balancing problem, achieves

$$ \mathrm {Regret}(T) \le T \gamma (T). $$

Proof

Let \(\mathcal {A}\) denote an algorithm for the repeated game P with convergence rate \(\gamma (T)\). When running \(\mathcal {A}\) against the environment of online load balancing, we observe, on each round t, the output \(\varvec{\alpha }_t \in \varDelta (K)\) of \(\mathcal {A}\) and the output \({\varvec{l}}_t \in [0,1]^K\) of the environment.

Let \((\varvec{x},\varvec{y}) = \arg \min _{(\varvec{x},\varvec{y}) \in S} \left\Vert \bar{r}_T - (\varvec{x},\varvec{y}) \right\Vert ^+\), where \(\bar{r}_T = (1/T) \sum _{t=1}^T r(\varvec{\alpha }_t,{\varvec{l}}_t)\) is the average payoff. Note that by the assumption of \(\mathcal {A}\), we have \(\left\Vert \bar{r}_T - (\varvec{x},\varvec{y}) \right\Vert ^+ \le \gamma (T)\). For simplicity, let \( L^\mathcal {A}_T = (1/T) \sum _{t=1}^T \varvec{\alpha }_t \odot {\varvec{l}}_t\) and \(L_T = (1/T) \sum _{t=1}^T {\varvec{l}}_t.\)

Then, letting \(\varvec{\alpha }' = \arg \min _{\varvec{\alpha }\in \varDelta (K)} \left\Vert \varvec{\alpha }\odot L_T \right\Vert \), we have

$$\begin{aligned} \frac{1}{T}\mathrm {Regret}(T)&= \left\Vert L^\mathcal {A}_T \right\Vert - C^*(L_T) \\&\le \left\Vert L^\mathcal {A}_T - \varvec{x} \right\Vert + C^*(\varvec{y}) - C^*(L_T) \\&\le \left\Vert L^\mathcal {A}_T - \varvec{x} \right\Vert + \left\Vert \varvec{\alpha }' \odot \varvec{y} \right\Vert - \left\Vert \varvec{\alpha }' \odot L_T \right\Vert \\&\le \left\Vert L^\mathcal {A}_T - \varvec{x} \right\Vert + \left\Vert \varvec{\alpha }' \odot (\varvec{y}- L_T) \right\Vert \\&\le \left\Vert L^\mathcal {A}_T - \varvec{x} \right\Vert + \left\Vert \varvec{y}- L_T \right\Vert = \left\Vert \bar{r}_T - (\varvec{x},\varvec{y}) \right\Vert ^+ \le \gamma (T), \end{aligned}$$

where the first inequality is from the definition of S and the triangle inequality, the second is from \(C^*(\varvec{y}) \le \left\Vert \varvec{\alpha }' \odot \varvec{y} \right\Vert \) and \(C^*(L_T) = \left\Vert \varvec{\alpha }' \odot L_T \right\Vert \), the third inequality is from the triangle inequality, and the fourth inequality is from the monotonicity of the norm. Multiplying by T gives the claimed bound.   \(\square \)

3.2 Reduction to an OCO Problem

Next we give the second sub-reduction from the repeated game P to an OCO problem. We just follow a general reduction technique of Shimkin [15] as given in the next theorem.

Theorem 3

([15]). Let \((A,B,r,S,\mathrm {dist})\) be a repeated game with vector payoffs, where \(\mathrm {dist}(\varvec{r},\varvec{s}) = \left\Vert \varvec{r}- \varvec{s} \right\Vert \) for some norm \(\left\Vert \cdot \right\Vert \) over \(\mathbb {R}^d\). Assume that we have an algorithm \(\mathcal {A}\) that witnesses the Blackwell condition, i.e., when given \(\varvec{w}\in \mathbb {R}^d\), \(\mathcal {A}\) finds \(\varvec{a}\in A\) such that \(\langle \varvec{w}, r(\varvec{a},\varvec{b}) \rangle \le h_S(\varvec{w})\) for any \(\varvec{b}\in B\). Assume further that we have an algorithm \(\mathcal {B}\) for the OCO problem (W, F), where \(W = \{ \varvec{w}\in \mathbb {R}^d \mid \left\Vert \varvec{w} \right\Vert _* \le 1 \}\) and \(F = \{ f:\varvec{w}\mapsto \langle -r(\varvec{a},\varvec{b}), \varvec{w} \rangle + h_S(\varvec{w}) \mid \varvec{a}\in A, \varvec{b}\in B \}\). Then, we can construct an algorithm for the repeated game such that its convergence rate \(\gamma (T)\) satisfies

$$ \gamma (T) \le \frac{\mathrm {Regret}_{(W,F)}(T)}{T}. $$

Moreover, the algorithm runs in polynomial time (per round) if \(\mathcal {A}\) and \(\mathcal {B}\) are polynomial time algorithms.

For completeness, the reduction algorithm (Algorithm 1) is shown below.

[Algorithm 1]
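To make the reduction of Theorem 3 concrete, the sketch below runs the reduction loop on a toy scalar game of our own choosing (not an instance from the paper): \(A = B = [0,1]\), \(r(a,b) = a - b\), and target set \(S = \{0\}\), for which \(h_S \equiv 0\). The Blackwell oracle plays \(a = 0\) if \(w \ge 0\) and \(a = 1\) otherwise, so \(\langle w, r(a,b) \rangle \le 0\) for every b, and the OCO part is projected gradient descent on \(W = [-1,1]\) with losses \(f_t(w) = -r_t w\). All names are illustrative.

```python
import math
import random

# A toy run of the Theorem 3 reduction: the average payoff of the game
# approaches the target S = {0} at the rate Regret_(W,F)(T) / T.

def approach_zero(b_seq):
    T = len(b_seq)
    eta = 2.0 / math.sqrt(T)            # OGD step size
    w, total = 0.0, 0.0
    for b in b_seq:
        a = 0.0 if w >= 0 else 1.0      # Blackwell oracle: <w, r(a,b)> <= 0
        r = a - b                       # this round's payoff
        total += r
        w = max(-1.0, min(1.0, w + eta * r))   # OGD step on the gradient -r
    return abs(total / T)               # distance of the average payoff to S

random.seed(1)
b_seq = [random.random() for _ in range(400)]
```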

It remains to show in this subsection the existence of the algorithm \(\mathcal {A}\) required for the reduction stated in the theorem above. In other words, we show that the Blackwell condition holds for our game \(P = (\varDelta (K), [0,1]^K, r, S, \mathrm {dist})\), where \(r(\varvec{\alpha },{\varvec{l}}) = (\varvec{\alpha }\odot {\varvec{l}}, {\varvec{l}}) \in \mathbb {R}^K \times \mathbb {R}^K\), \(S = \{(\varvec{x},\varvec{y}) \in [0,1]^K \times [0,1]^K \mid \left\Vert \varvec{x} \right\Vert \le C^*(\varvec{y}) \}\), and \(\mathrm {dist}(\varvec{r},\varvec{s}) = \left\Vert \varvec{r}- \varvec{s} \right\Vert ^+\).

Lemma 1

The Blackwell condition holds for game P. That is, for any \(\varvec{w}\in \mathbb {R}^K \times \mathbb {R}^K\), we have

$$ \min _{\varvec{\alpha }\in \varDelta (K)} \max _{{\varvec{l}}\in [0,1]^K} \langle \varvec{w}, r(\varvec{\alpha }, {\varvec{l}}) \rangle \le h_S(\varvec{w}). $$

Proof

Let \(\varvec{w}= (\varvec{w}_1,\varvec{w}_2) \in \mathbb {R}^K \times \mathbb {R}^K\). By the definition of r, the inner product in the Blackwell condition can be rewritten as the bilinear function

$$ f(\varvec{\alpha },{\varvec{l}})=\langle \varvec{w}, r(\varvec{\alpha }, {\varvec{l}}) \rangle = \sum _{i=1}^K w_{1,i} \alpha _i l_i + \sum _{i=1}^{K} w_{2,i} l_i $$

over \(\varDelta (K) \times [0,1]^K\). Therefore, f satisfies the conditions of von Neumann's minimax theorem, and we have

$$ \min _{\varvec{\alpha }\in \varDelta (K)} \max _{{\varvec{l}}\in [0,1]^K} f(\varvec{\alpha }, {\varvec{l}}) = \max _{{\varvec{l}}\in [0,1]^K} \min _{\varvec{\alpha }\in \varDelta (K)} f(\varvec{\alpha }, {\varvec{l}}). $$

Let \({\varvec{l}}^* = \arg \max _{{\varvec{l}}\in [0,1]^K} \min _{\varvec{\alpha }\in \varDelta (K)} f(\varvec{\alpha },{\varvec{l}})\) and \(\varvec{\alpha }^* = \arg \min _{\varvec{\alpha }\in \varDelta (K)} \left\Vert \varvec{\alpha }\odot {\varvec{l}}^* \right\Vert \). Note that by the definition of S, we have \((\varvec{\alpha }^{*} \odot {\varvec{l}}^*, {\varvec{l}}^*) \in S\). Hence we get

$$\begin{aligned} \min _{\varvec{\alpha }\in \varDelta (K)} \max _{{\varvec{l}}\in [0,1]^K} f(\varvec{\alpha }, {\varvec{l}})&= \max _{{\varvec{l}}\in [0,1]^K} \min _{\varvec{\alpha }\in \varDelta (K)} f(\varvec{\alpha }, {\varvec{l}}) \\&= f(\varvec{\alpha }^*, {\varvec{l}}^*) \\&= \langle \varvec{w}, ((\varvec{\alpha }^{*} \odot {\varvec{l}}^*),{\varvec{l}}^*) \rangle \\&\le \sup _{\varvec{s}\in S} \langle \varvec{w}, \varvec{s} \rangle \\&= h_S(\varvec{w}), \end{aligned}$$

which completes the proof.    \(\square \)

The lemma ensures the existence of the algorithm \(\mathcal {A}\). On the other hand, for the algorithm \(\mathcal {B}\) we need to consider the OCO problem (W, F), where the decision set is

$$\begin{aligned} W = \{ \varvec{w}\in \mathbb {R}^K \times \mathbb {R}^K \mid \left\Vert \varvec{w} \right\Vert ^+_* \le 1 \}, \end{aligned}$$
(4)

and the loss function set is

$$\begin{aligned} F = \{ f:\varvec{w}\mapsto \langle -r(\varvec{\alpha },{\varvec{l}}), \varvec{w} \rangle + h_S(\varvec{w}) \mid \varvec{\alpha }\in \varDelta (K), {\varvec{l}}\in [0,1]^K \}. \end{aligned}$$
(5)

Since W is a compact convex set and F consists of convex functions, we can apply a number of existing OCO algorithms to obtain \(\mathrm {Regret}_{(W,F)}(T) = O(\sqrt{T})\). In the next subsection, we show that the problem can be simplified further to two OLO problems.

3.3 Reduction to Two OLO Problems

Consider the OCO problem (W, F) given by (4) and (5). Following the standard reduction technique from OCO to OLO stated in Sect. 2.3, we obtain an OLO problem (W, G) to cope with, where \(G \subseteq \mathbb {R}^K \times \mathbb {R}^K\) is any set of cost vectors that satisfies

(6)

By (3), the decision set W can be rewritten as \(W = B_*(K) \times B_*(K)\) where \(B_*(K) = \{ \varvec{w}\in \mathbb {R}^K \mid \left\Vert \varvec{w} \right\Vert _* \le 1\}\) is the K-dimensional unit ball with respect to the dual norm \(\left\Vert \cdot \right\Vert _*\). By Proposition 1, any \(\varvec{s}\in \partial h_S(\varvec{w})\) is in the target set S, which is a subset of \([0,1]^K \times [0,1]^K\). Moreover, \(r(\varvec{\alpha },{\varvec{l}}) = (\varvec{\alpha }\odot {\varvec{l}}, {\varvec{l}}) \in [0,1]^K \times [0,1]^K\) for any \(\varvec{\alpha }\in \varDelta (K)\) and \({\varvec{l}}\in [0,1]^K\). Therefore, \(G = [-1,1]^K \times [-1,1]^K\) satisfies (6).

Thus, \((B_*(K) \times B_*(K), [-1,1]^K \times [-1,1]^K)\) is a suitable OLO problem reduced from the OCO problem (W, F). Furthermore, we can break this OLO problem into two independent OLO problems \((B_*(K), [-1,1]^K)\) in a straightforward way: make two copies of an OLO algorithm \(\mathcal {C}\) for \((B_*(K), [-1,1]^K)\), denoted by \(\mathcal {C}_1\) and \(\mathcal {C}_2\), and use them for predicting the first-half and second-half decision vectors, respectively. More precisely, on each round t, (1) receive predictions \(\varvec{w}_{t,1} \in B_*(K)\) and \(\varvec{w}_{t,2} \in B_*(K)\) from \(\mathcal {C}_1\) and \(\mathcal {C}_2\), respectively, (2) output their concatenation \(\varvec{w}_t = (\varvec{w}_{t,1}, \varvec{w}_{t,2}) \in W\), (3) receive a cost vector \(\varvec{g}_t = (\varvec{g}_{t,1}, \varvec{g}_{t,2}) \in [-1,1]^K \times [-1,1]^K\) from the environment, and (4) feed \(\varvec{g}_{t,1}\) and \(\varvec{g}_{t,2}\) to \(\mathcal {C}_1\) and \(\mathcal {C}_2\), respectively, to make them proceed.
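The concatenation procedure can be sketched generically as follows; the sub-algorithm here is a simple projected-gradient method on the \(L_2\) ball standing in for \(\mathcal {C}\), and all class and function names are hypothetical.

```python
# Two independent copies of an OLO sub-algorithm on the product domain:
# predictions are concatenated, the cost vector is split into halves.

class OGDBall:
    """A simple OLO sub-algorithm on the L2 unit ball (a stand-in)."""
    def __init__(self, dim, eta):
        self.w = [0.0] * dim
        self.eta = eta
    def predict(self):
        return list(self.w)
    def update(self, g):
        w = [wi - self.eta * gi for wi, gi in zip(self.w, g)]
        norm = sum(x * x for x in w) ** 0.5
        self.w = [x / max(1.0, norm) for x in w]   # project onto the ball

def combined_round(c1, c2, g):
    """One round on the product domain: concatenate, then split the cost."""
    K = len(g) // 2
    w = c1.predict() + c2.predict()
    c1.update(g[:K])
    c2.update(g[K:])
    return w
```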

It is clear that the procedure above ensures the following lemma.

Lemma 2

The OCO problem (W, F) defined by (4) and (5) can be reduced to the OLO problem \((B_*(K),[-1,1]^K)\), and

$$ \mathrm {Regret}_{(W,F)}(T) \le 2 \mathrm {Regret}_{(B_*(K),[-1,1]^K)}(T). $$

3.4 Putting All the Pieces Together

Combining all reductions stated in the previous subsections, we get an all-in-one algorithm as described in Algorithm 2.

[Algorithm 2]

It is clear that combining Proposition 3, Theorem 3 and Lemma 2, we get the following regret bound of Algorithm 2.

Theorem 4

Algorithm 2 achieves

$$ \mathrm {Regret}(T) \le 2 \mathrm {Regret}_{(B_*(K),[-1,1]^K)}(T), $$

where the regret on the right-hand side is the regret of algorithm \(\mathcal {C}_1\) (and of \(\mathcal {C}_2\) as well). Moreover, if \(\mathcal {A}\), \(\mathcal {B}\) and \(\mathcal {C}_1\) run in polynomial time, then Algorithm 2 runs in polynomial time (per round).

By applying the FTRL as in (2) to the OLO problem \((B_*(K),[-1,1]^K)\) with a strongly convex regularizer R, Proposition 2 implies the following regret bound.

Corollary 1

Assume that there exists a regularizer \(R:B_*(K) \rightarrow \mathbb {R}\) that is \(\sigma \)-strongly convex w.r.t. \(L_1\)-norm. Then, there exists an algorithm for the online load balancing problem that achieves

$$\begin{aligned} \mathrm {Regret}(T) = O(D_R \sqrt{T/\sigma }), \end{aligned}$$

where \(D_R = \sqrt{\max _{\varvec{w}\in B_*(K)}R(\varvec{w})}\).

In particular, for the OLO problem \((B_1(K),[-1,1]^K)\), the algorithm EG\(^\pm \) achieves the \(\sqrt{2T \ln 4K}\) regret bound shown in Theorem 2. Thus we have an \(O(\sqrt{T \ln K})\) regret bound for online load balancing with respect to the \(L_\infty \)-norm (i.e., w.r.t. the makespan), which improves the bound of [6] by a factor of \(\sqrt{\ln K}\). Moreover, for the \(L_\infty \)-norm, it turns out that we have polynomial time algorithms for \(\mathcal {A}\) and \(\mathcal {B}\), which we give in the next section. We thus obtain the following corollary.

Corollary 2

There exists a polynomial time (per round) algorithm for the online load balancing problem with respect to \(L_\infty \)-norm that achieves

$$\begin{aligned}\mathrm {Regret}(T) \le 2\sqrt{2T \ln 4K}.\end{aligned}$$

4 Algorithmic Details for \(L_\infty \)-norm

In this section we give details of Algorithm 2 for the makespan problem, i.e., for \(L_\infty \)-norm.

4.1 Computing \(\alpha _t\)

First, we give the details of the implementation of \(\mathcal {A}\) in Algorithm 2. Specifically, on round t, we need to choose \(\varvec{\alpha }_{t}\) as the optimal solution of the problem in Lemma 1. That is,

$$\begin{aligned} \min _{\varvec{\alpha } \in \varDelta (K)}\max _{{\varvec{l}}\in [0,1]^{K}}\langle \varvec{w}_{1}, (\varvec{\alpha } \odot {\varvec{l}})\rangle +\langle \varvec{w}_{2},{\varvec{l}}\rangle , \end{aligned}$$
(7)

where \(\varvec{w}=(\varvec{w}_{1},\varvec{w}_{2})\) with \(\varvec{w}_{1}, \varvec{w}_{2} \in \mathbb {R}^K\). The inner maximum is attained by setting \(l_{i}=0\) if \(w_{1,i} \alpha _i+w_{2,i} \le 0\) and \(l_i=1\) otherwise. Hence the problem of choosing \(\varvec{\alpha }\) becomes

$$\min _{\varvec{\alpha } \in \varDelta (K)}\max _{\varvec{l} \in [0,1]^{K}}\langle \varvec{w}_{1}, (\varvec{\alpha } \odot \varvec{l})\rangle +\langle \varvec{w}_{2},\varvec{l} \rangle =\min _{\varvec{\alpha } \in \varDelta (K)}\sum _{i=1}^{K}\max \left\{ 0, \alpha _i w_{1,i}+w_{2,i}\right\} ,$$

which is equivalent to

$$\begin{aligned} \min _{\varvec{\alpha }\in \varDelta (K),\varvec{\beta }\ge \varvec{0}} \sum _{i=1}^{K} \beta _i \quad \mathrm {s{.}t.}~\beta _i \ge w_{1,i} \alpha _i+ w_{2,i} \quad \forall i=1,\dots ,K. \end{aligned}$$

The above problem is a linear program with O(K) variables and O(K) linear constraints. Thus \(\varvec{\alpha }_t\), the solution of problem (7), can be computed in polynomial time.
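Assuming SciPy is available, the linear program above can be solved directly with `scipy.optimize.linprog`; the sketch below (illustrative function names) stacks the variables as \((\varvec{\alpha }, \varvec{\beta }) \in \mathbb {R}^{2K}\).

```python
import numpy as np
from scipy.optimize import linprog

# The LP for alpha_t: minimize sum_i beta_i subject to
# beta_i >= w1_i * alpha_i + w2_i, beta >= 0, alpha in the simplex.

def compute_alpha(w1, w2):
    K = len(w1)
    c = np.concatenate([np.zeros(K), np.ones(K)])    # objective: sum of beta
    A_ub = np.hstack([np.diag(w1), -np.eye(K)])      # w1_i*alpha_i - beta_i <= -w2_i
    b_ub = -np.asarray(w2, dtype=float)
    A_eq = np.concatenate([np.ones(K), np.zeros(K)])[None, :]   # sum alpha = 1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0.0, 1.0)] * K + [(0.0, None)] * K)
    return res.x[:K], res.fun
```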

4.2 Computing Subgradients \(\varvec{s}_t\) for the \(L_\infty \)-norm

The second component of Algorithm 2 is the algorithm \(\mathcal {B}\), which computes subgradients \(\varvec{s}_t \in \partial h_S(\varvec{w}_t)\). By Proposition 1, we have \(\varvec{s}_t=\arg \max _{\varvec{s}\in S}\langle \varvec{s}, \varvec{w}_{t} \rangle .\) Recall that \(S=\{ (\varvec{x}, \varvec{y}) \in [0,1]^K \times [0,1]^K \mid \Vert \varvec{x}\Vert _\infty \le C_\infty ^*(\varvec{y})\}\). In particular, the condition that \(\Vert \varvec{x}\Vert _\infty \le C_\infty ^*(\varvec{y})\) can be represented as

$$\begin{aligned}&\max _i x_i \le \min _{\varvec{\alpha }\in \varDelta (K)}\Vert \varvec{\alpha } \odot \varvec{y}\Vert _{\infty } \iff x_i \le \frac{1}{\sum _{j=1}^K \frac{1}{y_j}}, {\forall i}. \end{aligned}$$
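The right-hand side is the harmonic-mean bound: the minimizing allocation equalizes the loads, i.e., \(\alpha _i \propto 1/y_i\). A quick numerical sanity check of this identity (plain NumPy; the variable names are ours):

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.uniform(0.2, 1.0, size=4)     # positive loads y_j
c = 1.0 / np.sum(1.0 / y)             # claimed minimum of ||alpha ⊙ y||_inf
alpha = c / y                         # balancing allocation alpha_i = c / y_i
assert np.isclose(alpha.sum(), 1.0)   # alpha lies on the simplex
assert np.allclose(alpha * y, c)      # every coordinate of alpha ⊙ y equals c
# Optimality: if max_i alpha'_i y_i < c for some simplex point alpha',
# then alpha'_i < c / y_i for all i and the alpha'_i could not sum to 1.
other = rng.dirichlet(np.ones(4))
assert np.max(other * y) >= c - 1e-12
```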

Therefore, the computation of the subgradient \(\varvec{s}_t\) is formulated as

$$\begin{aligned} \max _{\varvec{x},\varvec{y}\in [0,1]^K} \langle \varvec{w}_{1}, \varvec{x}\rangle + \langle \varvec{w}_{2}, \varvec{y} \rangle \quad \mathrm {s{.}t.}~x_{i} \le \frac{1}{\sum _{j}\frac{1}{y_{j}}}, \quad \forall i=1,\dots ,K. \end{aligned}$$
(8)

Now we show that there exists an equivalent second order cone programming (SOCP) formulation (e.g., [11]) for this problem.

First we give the definition of second order cone programming, and then a proposition that allows us to rewrite our optimization problem as an SOCP.

Definition 4

The standard form of the second order cone programming (SOCP) model is as follows:

$$\min _{\varvec{x}} \langle \varvec{c},\varvec{x}\rangle ~\text {s.t.}~A\varvec{x}=\varvec{b}, \Vert C_{i}\varvec{x}+\varvec{d}_{i} \Vert _{2} \le \varvec{e}_{i}^{\top } \varvec{x}+\varvec{f}_{i} \quad \text {for}~i=1, \cdots , m,$$

where the problem parameters are \(\varvec{c}\in \mathbb {R}^{n},\) \(C_{i} \in \mathbb {R}^{n_{i} \times n},\) \(\varvec{d}_{i} \in \mathbb {R}^{n_{i}},\) \(\varvec{e}_{i}\in \mathbb {R}^{n},\) \(f_{i}\in \mathbb {R},\) \(A\in \mathbb {R}^{p \times n},\) and \(\varvec{b}\in \mathbb {R}^{p},\) and \(\varvec{x}\in \mathbb {R}^{n}\) is the optimization variable.

Then we obtain the following proposition.

Proposition 4

Let \(x \ge 0\) and \(y_{i} > 0\) for \(i=1,\dots ,K\). Then \(\sum _{i=1}^{K}\frac{x^{2}}{y_{i}} \le x\) holds if and only if there exist \(z_{i} \ge 0\) with \(\sum _{i=1}^{K}z_{i}=x\) such that \(x^{2} \le y_{i}z_{i}\) for all \(i\).

Proof

For the direction “\(\Rightarrow \)”: if \(x=0\), we can take \(z_{i}=0\) for all i. Otherwise, from \(\sum _{i=1}^{K}\frac{x^{2}}{y_{i}} \le x\) we obtain \(\sum _{i=1}^{K}\frac{x}{y_{i}} \le 1.\) By setting

$$z_{i}=x \cdot \frac{\frac{1}{y_{i}}}{\sum _{j=1}^{K}\frac{1}{y_{j}}},$$

we have \(y_{i}z_{i}=x/\sum _{j=1}^{K}\frac{1}{y_{j}} \ge x \cdot x = x^{2}\) and \(\sum _{i=1}^{K}z_{i}=x.\)

For the direction “\(\Leftarrow \)”: from \(x^{2} \le y_{i}z_{i}\) we have \(\frac{x^{2}}{y_{i}} \le z_{i},\) so that

$$\sum _{i=1}^{K}\frac{x^{2}}{y_{i}} \le \sum _{i=1}^{K}z_{i} =x.$$

   \(\square \)
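The witness \(z_i\) constructed in the proof can be checked numerically. A small NumPy sketch on random data (all names here are ours):

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.uniform(0.5, 1.0, size=5)      # y_i > 0
inv_sum = np.sum(1.0 / y)
x = 0.9 / inv_sum                      # ensures sum_i x^2 / y_i <= x
assert np.sum(x**2 / y) <= x + 1e-12
# Witness from the proof: z_i = x * (1/y_i) / sum_j (1/y_j).
z = x * (1.0 / y) / inv_sum
assert np.isclose(z.sum(), x)          # sum_i z_i = x
assert np.all(x**2 <= y * z + 1e-12)   # x^2 <= y_i z_i for every i
```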

Returning to our problem, we need to find the optimal vector \(\varvec{s}\in S\) that maximizes \(\langle \varvec{s}, \varvec{w}_{t}\rangle .\) We reduce this problem to an SOCP in the following theorem.

Theorem 5

The optimization problem (8) can be solved by the second order cone programming.

Proof

To prove this theorem we only need to represent the original problem (8) in the standard form of an SOCP. Note that we only consider the case where \(y_{i}\ne 0\) for all \(i=1,\dots ,K\); the case where \(y_i=0\) for some i is trivial, since by the definition of S we then have \(x_{i}=0\) for all i, and the resulting problem is a linear program, which is a special case of the SOCP. Now we assume that \(y_i \ne 0\) for \(i=1,\dots , K\). For \(x_{i} \le \frac{1}{\sum _{j}\frac{1}{y_{j}}},\) we multiply both sides by \(x_{i}\) and rearrange the inequality:

$$\sum _{j=1}^{K} \frac{x_{i}^{2}}{y_{j}} \le x_{i}.$$

By Proposition 4, this is equivalent to

$$y_{j}z_{i,j} \ge x_{i}^{2}, \quad y_{j},z_{i,j} \ge 0, \quad \sum _{j=1}^{K}z_{i,j} =x_{i}.$$

By [11], we may rewrite these hyperbolic constraints as follows: for each i,

$$\begin{aligned} x_{i}^{2} \le y_{j}z_{i,j}; \quad y_{j}, z_{i,j} \ge 0 \Longleftrightarrow \Vert (2x_{i}, y_{j}-z_{i,j})\Vert _{2} \le y_{j}+z_{i,j} \quad \forall j=1,\dots ,K. \end{aligned}$$
(9)

The above equivalence is straightforward to verify. On the other hand, since \(x_{i} \le \frac{1}{\sum _{j}\frac{1}{y_{j}}}\) and \(y_{i} \in [0,1],\) we automatically have \(x_{i} \in [0,1],\) so we need only constrain \(y_{i}\in [0,1].\) Here we can use the fact that \(\vert y_{i} \vert =y_{i}\) when \(y_{i}\ge 0\), so the constraint \(y_{i} \le 1\) can be written as \(\Vert y_{i}\Vert _{2} \le 1.\) Therefore each \(C_{i}\) in the SOCP is a \((K^{2}+2K) \times (K^{2}+2K)\) matrix, and the variable vector is composed as follows:

$$\begin{aligned} \varvec{\tilde{x}}=(x_{1}, \cdots , x_{K},y_{1}, \cdots , y_{K}, z_{1,1}, \cdots , z_{1,K}, \cdots , z_{K,1} \cdots , z_{K,K}), \end{aligned}$$
(10)

where the first index i of \(z_{i,j}\) corresponds to \(x_{i}.\)

Now we may give the second order cone programming of our target problem as follows:

$$\begin{aligned} \begin{aligned}&\min _{\varvec{\tilde{x}}} \langle -(\varvec{w}_{1},\varvec{w}_{2}, 0, \cdots ,0), \varvec{\tilde{x}} \rangle \\&\mathrm {s{.}t.}~\Vert C_{i} \varvec{\tilde{x}} \Vert _{2} \le \varvec{e}_{i}^{\top } \varvec{\tilde{x}} +f_{i} \quad \forall i=1, \cdots , K^{2}+2K, \\&A\varvec{\tilde{x}}=\varvec{b}, \end{aligned} \end{aligned}$$
(11)

where \(C_{i},\) \(\varvec{e}_{i},\) \(f_{i},\) A and \(\varvec{b}\) are defined as follows:

First, the matrices \(C_{i}\) for the hyperbolic constraints (9) are given as follows. Fix \(s\in [K],\) where \([K]=\{1, \cdots , K\},\) and write \(i=(s-1)K+j\) with \(j\in [K];\) constraint i couples \(x_{s},\) \(y_{j}\) and \(z_{s,j}.\) In \(C_{i}\) we let \((C_{i})_{1,s}=2,\) \((C_{i})_{2,K+j}=1\) and \((C_{i})_{2,2K+(s-1)K+j}=-1,\) with all other entries 0, so that the only nonzero components of \(C_{i}\varvec{\tilde{x}}\) are \(2x_{s}\) and \(y_{j}-z_{s,j}.\) The vector \(\varvec{e}_{i}\) is defined by \((\varvec{e}_{i})_{K+j}=1\) and \((\varvec{e}_{i})_{2K+(s-1)K+j}=1,\) with all other entries 0, and \(f_{i}=0,\) so that \(\varvec{e}_{i}^{\top }\varvec{\tilde{x}}+f_{i}=y_{j}+z_{s,j}.\)

Next we need to constrain that each \(y_{j}\) is at most 1. For \(i=K^{2}+j\) with \(j\in [K],\) we let \((C_{i})_{1,K+j}=1\) with all other entries 0, \(\varvec{e}_{i}=\varvec{0},\) and \(f_{i}=1;\) this encodes \(\Vert y_{j}\Vert _{2} \le 1.\) For \(i=K^{2}+K+j\) with \(j \in [K],\) we set \((C_{i})_{1,K+j}=1,\) \((\varvec{e}_{i})_{K+j}=1,\) and \(f_{i}=0,\) which encodes \(y_{j} \ge 0.\)

Finally we need to constrain that \(\sum _{j=1}^{K}z_{i,j}=x_{i}\) in Eq. 9. Let \(A \in \mathbb {R}^{K \times (K^{2}+2K)};\) for each row vector \(A_{j}\) with \(j \in [K],\) we set \((A_{j})_{j}=1\) and \((A_{j})_{2K+(j-1)K+m}=-1\) for all \(m=1,\cdots , K,\) with all other entries 0. The matrix A is composed of the row vectors \(A_{j},\) and \(\varvec{b}\) is the zero vector.    \(\square \)

5 Conclusion

In this paper we give a framework for the online load balancing problem by reducing it to two OLO problems. Moreover, for online load balancing with respect to the \(L_\infty \)-norm, we achieve the best known regret bound in polynomial time. First, we reduce online load balancing with a norm \(\Vert \cdot \Vert \) to a vector payoff game measured by the combination norm \(\Vert \cdot \Vert ^{+}.\) Next, following [15], this vector payoff game is reduced to an OCO problem. Finally, we reduce this OCO problem to two independent OLO problems. In particular, for the makespan, we give an efficient algorithm that achieves the best known regret bound \(O(\sqrt{T \ln K})\) by solving a linear program and a second order cone program in each trial. Recently, Kwon [10] proposed a similar reduction using another type of induced norm instead of our combination norm.

Several open problems remain in this topic. For instance, no efficient algorithm is known for online load balancing with respect to a general norm or the p-norm; Kwon's reduction [10] might be helpful for this question. Furthermore, lower bounds for online load balancing are still unknown.