1 INTRODUCTION

The search for equilibria in transportation networks can often be reduced to finding an equilibrium in the corresponding congestion population game. In turn, the search for a Nash equilibrium in such a game always reduces to an optimization problem. Under natural assumptions (which are typically satisfied in practical problems), this problem turns out to be convex. The most popular example is the BMW model of equilibrium distribution of flows over routes [1, 2] (the Beckmann model). Recently, the stable dynamics model [1, 3, 4] and the mixed model [5], which combines properties of the two preceding models, have gained popularity.

In this paper, the result obtained in Section 3 of [1] is extended to mixed models. A new representation of the approach proposed in [1, Section 3] is given, which makes it more convenient for applications.

The paper is organized as follows. Section 2 states the problem: the convex optimization problem whose solution provides the desired equilibrium is formulated, the dual problem is constructed, and formulas relating the primal and dual variables are obtained. In Section 3, we describe a composite version of the well-known mirror descent (dual averaging) method for solving the dual problem, together with a method for numerically recovering the solution of the primal problem from the (approximate) solution of the dual problem produced by mirror descent; a theorem on the convergence rate of the proposed method, stated in terms of the prescribed accuracy of reconstruction of the solutions of both the primal and the dual problems simultaneously, is proved. In Section 4, another method for recovering the solution of the primal problem from the (approximate) dual solution is described, and a theorem on its convergence rate, similar to that of Section 3, is proved. In Section 5, the results obtained in the preceding sections are analyzed and discussed, and a summary of numerical results is given.

2 DESCRIPTION OF THE PROBLEM OF FINDING THE EQUILIBRIUM DISTRIBUTION OF FLOWS IN THE MIXED MODEL

Let the city transportation network be represented by a directed graph \(\Gamma = \left( {V,E} \right)\), where \(V\) is the set of vertices (nodes), \(E \subset V \times V\) is the set of edges, \(O \subseteq V\) is the set of trip sources (\(S = \left| O \right|\)), and \(D \subseteq V\) is the set of sinks. In modern models of equilibrium distributions of flows in a metropolitan area, the number of graph nodes is of the order \(n = \left| V \right| \sim {{10}^{4}}\) [1], and the number of edges is greater by a factor of three or four. Let \(W \subseteq \left\{ {w = \left( {i,j} \right):\;i \in O,j \in D} \right\}\) be the set of possible trips, i.e., source–sink pairs. A sequence \(p = \left\{ {{{{v}}_{1}},{{{v}}_{2}},...,{{{v}}_{m}}} \right\}\) is called a route from \({{{v}}_{1}}\) to \({{{v}}_{m}}\) if \(\left( {{{{v}}_{k}},{{{v}}_{{k + 1}}}} \right) \in E\) for \(k = 1,...,m - 1\), \(m > 1\). \({{P}_{w}}\) denotes the set of routes corresponding to the trip \(w \in W\); i.e., if \(w = \left( {i,j} \right)\), then \({{P}_{w}}\) is the set of routes beginning at the vertex \(i\) and ending at \(j\); \(P = \bigcup\nolimits_{w \in W} {{{P}_{w}}} \) is the set of all routes in \(\Gamma \); \({{x}_{p}}\) [cars/hour] is the magnitude of the flow on the route \(p\), \(x = \left\{ {{{x}_{p}}:p \in P} \right\}\); and \({{f}_{e}}\) [cars/hour] is the magnitude of the flow through the edge \(e\):

$${{f}_{e}}\left( x \right) = \sum\limits_{p \in P} {{{\delta }_{{ep}}}{{x}_{p}}} \quad (f = \Theta x),\quad {\text{where}}\quad {{\delta }_{{ep}}} = \left\{ \begin{gathered} 1,\quad e \in p, \hfill \\ 0,\quad e \notin p, \hfill \\ \end{gathered} \right.$$

and \({{\tau }_{e}}\left( {{{f}_{e}}} \right)\) is the specific cost of traversing the edge \(e\). These are usually assumed to be (strictly) increasing smooth functions of \({{f}_{e}}\). More precisely, \({{\tau }_{e}}\left( {{{f}_{e}}} \right)\) is better interpreted as the transportation network users' perception of their costs (typically, travel time in the case of personal vehicles, or convenience combined with travel time in the case of public transport) for traversing the edge \(e\) when the number of those wishing to use this edge is \({{f}_{e}}\).

Consider the cost \({{G}_{p}}\left( x \right)\) (in terms of time or money) of using the route \(p\). It is natural to assume that \({{G}_{p}}\left( x \right) = \sum\nolimits_{e \in E}^{} {{{\tau }_{e}}\left( {{{f}_{e}}\left( x \right)} \right){{\delta }_{{ep}}}} \).
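
The relations \(f = \Theta x\) and \({{G}_{p}}\left( x \right)\) can be illustrated by a minimal sketch; the three-edge network, the route names, and the toy cost function below are our illustrative assumptions, not data from the text:

```python
# Toy network sketch: routes are lists of edge ids, x gives route flows.
# Edge flows f = Theta x; route costs G_p(x) = sum of tau_e(f_e) over p.
routes = {"p1": ["e1", "e3"], "p2": ["e2", "e3"]}
x = {"p1": 100.0, "p2": 50.0}          # [cars/hour]

def edge_flows(routes, x):
    f = {}
    for p, edges in routes.items():
        for e in edges:
            f[e] = f.get(e, 0.0) + x[p]     # f_e = sum_p delta_ep * x_p
    return f

def route_costs(routes, f, tau):
    return {p: sum(tau(f[e]) for e in edges) for p, edges in routes.items()}

f = edge_flows(routes, x)               # edge e3 is shared by both routes
G = route_costs(routes, f, lambda fe: 1.0 + 0.01 * fe)  # toy increasing tau
```

Note that only the shared edge \(e_3\) accumulates flow from both routes, which is exactly the coupling that makes the players' costs interdependent.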

Suppose that we know how many moves \({{d}_{w}}\) in unit time are made according to the trip \(w \in W\). Then, the vector \(x\) describing the distribution of flows must be within the feasible set:

$$X = \left\{ {x \geqslant 0:\;\;\sum\limits_{p \in {{P}_{w}}} {{{x}_{p}}} = {{d}_{w}},\;w \in W} \right\}.$$
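
A membership test for the set \(X\) is straightforward; the trip and route identifiers below are illustrative assumptions:

```python
# Check x >= 0 and that the route flows of every trip w sum to the demand d_w.
def in_X(x, routes_by_trip, d, tol=1e-9):
    if any(v < -tol for v in x.values()):
        return False
    return all(abs(sum(x[p] for p in ps) - d[w]) <= tol
               for w, ps in routes_by_trip.items())

# Example: one trip served by two routes with demand 150 cars/hour.
ok = in_X({"p1": 100.0, "p2": 50.0}, {"w1": ["p1", "p2"]}, {"w1": 150.0})
```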

Consider the game in which each trip \(w \in W\) is associated with a sufficiently large set of similar “players” that move according to the trip \(w\) (the relative scale is determined by the numbers \({{d}_{w}}\)). The pure strategies of each player are routes, and the payoff is \( - {{G}_{p}}\left( x \right)\). The player “chooses” a route \(p \in {{P}_{w}}\); when making his choice, he neglects the fact that this choice “slightly” affects \(\left| {{{P}_{w}}} \right|\) of the components of the vector x and, therefore, the payoff \( - {{G}_{p}}\left( x \right)\) itself. It can be shown (e.g., see [2]) that finding the Nash–Wardrop equilibrium \(x* \in X\) (macrodescription of the equilibrium) is equivalent to solving the problem

$$\Psi \left( f \right) = \sum\limits_{e \in E} {{{\sigma }_{e}}\left( {{{f}_{e}}} \right)} = \sum\limits_{e \in E} {\int\limits_0^{{{f}_{e}}} {{{\tau }_{e}}\left( z \right)dz} } \to \mathop {\min }\limits_{\substack{ f = \Theta x \\ x \in X }} .$$
((1))
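
For BPR-type cost functions (see the formula for \({{\tau }_{e}}\) below), the potentials \({{\sigma }_{e}}\) in (1) have a closed form; the sketch cross-checks it against direct numerical integration of \({{\tau }_{e}}\). The parameter values are illustrative assumptions:

```python
tbar, fbar, gamma, mu = 1.0, 2000.0, 0.15, 0.25   # assumed BPR parameters

def tau(z):                        # BPR cost function tau_e(z)
    return tbar * (1.0 + gamma * (z / fbar) ** (1.0 / mu))

def sigma_closed(f):               # int_0^f tau(z) dz in closed form
    return tbar * f * (1.0 + gamma * mu / (1.0 + mu) * (f / fbar) ** (1.0 / mu))

def sigma_numeric(f, n=10000):     # composite trapezoid rule
    h = f / n
    s = 0.5 * (tau(0.0) + tau(f)) + sum(tau(i * h) for i in range(1, n))
    return s * h

fe = 1000.0
# the two values agree up to quadrature error
```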

To obtain the stable dynamics model (Nesterov–de Palma) [3, 4], we can pass to the limit on a part of the edges \(E{\kern 1pt} ' \subseteq E\):

$$\tau _{e}^{\mu }\left( {{{f}_{e}}} \right)\;\xrightarrow[{\mu \to 0 + }]{}\;\left\{ \begin{gathered} {{{\bar {t}}}_{e}},\quad 0 \leqslant {{f}_{e}} < {{{\bar {f}}}_{e}}, \hfill \\ \left[ {{{{\bar {t}}}_{e}},\infty } \right),\quad {{f}_{e}} = {{{\bar {f}}}_{e}}, \hfill \\ \end{gathered} \right.$$
$${{d\tau _{e}^{\mu }\left( {{{f}_{e}}} \right)} \mathord{\left/ {\vphantom {{d\tau _{e}^{\mu }\left( {{{f}_{e}}} \right)} {d{{f}_{e}}}}} \right. \kern-0em} {d{{f}_{e}}}}\;\xrightarrow[{\mu \to 0 + }]{}\;0,\quad 0 \leqslant {{f}_{e}} < {{\bar {f}}_{e}}.$$

Then, problem (1) takes the form

$$\Psi \left( f \right) = \sum\limits_{e \in E} {{{\sigma }_{e}}\left( {{{f}_{e}}} \right)} \to \mathop {\min }\limits_{\substack{ f = \Theta x,\;x \in X \\ {{f}_{e}} \leqslant {{{\bar {f}}}_{e}},\;e \in E' }} ,\quad {{\sigma }_{e}}\left( {{{f}_{e}}} \right) = {{f}_{e}}{{\bar {t}}_{e}},\quad e \in E{\kern 1pt} '.$$

In this paper, we propose a well-parallelized dual numerical method for finding the equilibrium in the mixed model (1), i.e., in the model in which a part of the edges satisfy Beckmann’s property and the other part are Nesterov–de Palma edges. Such problems arise, e.g., in the single-commodity railway cargo transportation model [5]. Unfortunately, the Frank–Wolfe conditional gradient algorithm, which is used in almost all modern transportation simulation software, is inapplicable to such mixed models. Novel approaches are needed.

We can construct the following dual problem for problem (1) (see [1, 4])

$$\Upsilon \left( t \right) = \underbrace { - \sum\limits_{w \in W} {{{d}_{w}}{{T}_{w}}\left( t \right)} }_{F\left( t \right)} + \sum\limits_{e \in E} {\sigma _{e}^{*}{\kern 1pt} \left( {{{t}_{e}}} \right)} \to \mathop {\min }\limits_{\substack{ {{t}_{e}} \geqslant {{{\bar {t}}}_{e}},\;e \in E' \\ {{t}_{e}} \in \operatorname{dom} \sigma _{e}^{*}\left( {{{t}_{e}}} \right),\;e \in E{\backslash\text{ }}E' }} ,$$
((2))

where \({{T}_{w}}\left( t \right) = \mathop {\min }\limits_{p \in {{P}_{w}}} \sum\nolimits_{e \in E}^{} {{{\delta }_{{ep}}}{{t}_{e}}} \) is the length of the shortest path from \(i\) to \(j\) (\(w = \left( {i,j} \right) \in W\)) in the graph \(\Gamma \) the edges of which are weighted by the vector \(t = {{\left\{ {{{t}_{e}}} \right\}}_{{e \in E}}}\). The solution \(f\) of problem (1) can be obtained using the formulas \({{f}_{e}} = {{\bar {f}}_{e}} - {{s}_{e}}\), \(e \in E{\kern 1pt} '\), where \({{s}_{e}} \geqslant 0\) is the Lagrange multiplier corresponding to the constraint \({{t}_{e}} \geqslant {{\bar {t}}_{e}}\); and \({{\tau }_{e}}\left( {{{f}_{e}}} \right) = {{t}_{e}}\) for \(e \in E{\backslash\text{ }}E{\kern 1pt} '\). Note that we have \(\sigma _{e}^{*}{\kern 1pt} \left( {{{t}_{e}}} \right) = {{\bar {f}}_{e}} \cdot \left( {{{t}_{e}} - {{{\bar {t}}}_{e}}} \right)\) for the edges \(e \in E{\kern 1pt} '\); and for the BPR functions, we have

$${{\tau }_{e}}\left( {{{f}_{e}}} \right) = {{\bar {t}}_{e}} \cdot \left( {1 + \gamma \cdot {{{\left( {\frac{{{{f}_{e}}}}{{{{{\bar {f}}}_{e}}}}} \right)}}^{{\frac{1}{\mu }}}}} \right) \Rightarrow \sigma _{e}^{*}\left( {{{t}_{e}}} \right) = {{\bar {f}}_{e}} \cdot {{\left( {\frac{{{{t}_{e}} - {{{\bar {t}}}_{e}}}}{{{{{\bar {t}}}_{e}} \cdot \gamma }}} \right)}^{\mu }}\frac{{\left( {{{t}_{e}} - {{{\bar {t}}}_{e}}} \right)}}{{1 + \mu }}.$$
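
This conjugate formula can be sanity-checked numerically from the definition \(\sigma _{e}^{*}\left( {{t}_{e}} \right) = \mathop {\sup }\nolimits_{{{f}_{e}} \geqslant 0} \left\{ {{{f}_{e}}{{t}_{e}} - {{\sigma }_{e}}\left( {{{f}_{e}}} \right)} \right\}\); a sketch with assumed parameter values:

```python
# Verify sigma*(t) = fbar*((t - tbar)/(tbar*gamma))^mu * (t - tbar)/(1 + mu)
# against a grid search over f in sup_f { f*t - sigma(f) }.
tbar, fbar, gamma, mu = 1.0, 100.0, 0.15, 0.25    # assumed BPR parameters

def sigma(f):                      # primitive of the BPR cost tau_e
    return tbar * f * (1.0 + gamma * mu / (1.0 + mu) * (f / fbar) ** (1.0 / mu))

def sigma_star_closed(t):
    return fbar * ((t - tbar) / (tbar * gamma)) ** mu * (t - tbar) / (1.0 + mu)

def sigma_star_grid(t, fmax=300.0, n=30001):
    return max(f * t - sigma(f)
               for f in (fmax * i / (n - 1) for i in range(n)))

t = 1.3
# closed form and grid search agree up to the grid resolution
```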

In applications, \(\mu = {1 \mathord{\left/ {\vphantom {1 4}} \right. \kern-0em} 4}\) is usually used. In this case, the step of the iterative method (3) described below can be performed using explicit formulas because Ferrari’s solution for the quartic equation is available (see [6]).
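
Step (3) is separable over edges, and on a Beckmann edge it reduces to a scalar problem with a monotone derivative (a quartic equation for \(\mu = 1/4\), solvable by Ferrari's formulas). The sketch below solves that scalar subproblem by plain bisection instead of the closed form; the helper name and all numeric values are our assumptions:

```python
# Per-edge subproblem of step (3) on a Beckmann edge (illustrative sketch):
#   min_{t >= tbar}  gamma_k * (g_e * t + sigma_e^*(t)) + 0.5 * (t - t_k)^2,
# where g_e is the e-th component of an element of dF(t^k) and
# sigma_e^{*'}(t) = fbar * ((t - tbar)/(tbar*gam))^mu (inverse of the BPR
# cost).  For mu = 1/4 the stationarity condition is a quartic; here we use
# bisection on the increasing derivative instead of Ferrari's formulas.
def prox_step(g_e, t_k, tbar=1.0, fbar=2000.0, gam=0.15, mu=0.25,
              gamma_k=1e-3, tol=1e-10):
    def dphi(t):  # derivative of the objective; increasing in t
        ds = fbar * ((t - tbar) / (tbar * gam)) ** mu if t > tbar else 0.0
        return gamma_k * (g_e + ds) + (t - t_k)
    lo = tbar
    if dphi(lo) >= 0.0:          # the constraint t >= tbar is active
        return lo
    hi = max(t_k, tbar) + 1.0
    while dphi(hi) < 0.0:        # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if dphi(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Since \(\partial_e F({{t}^{k}}) = -f_{e}^{k} \leqslant 0\), the typical call has a nonpositive first argument.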

Finding the vector \(t\) is of independent interest because this vector describes costs on the edges of the transportation network graph. The solution of problem (2) gives the cost vector \(t\) at equilibrium.

3 THE FIRST METHOD FOR RECOVERING THE SOLUTION OF THE PRIMAL PROBLEM GIVEN THE SOLUTION OF THE DUAL PROBLEM

To solve the dual problem (2), we use the composite version of the mirror descent method [7–9] (\(k = 0,...,N\), \({{t}^{0}} = \bar {t}\); the constraint \({{t}_{e}} \in \operatorname{dom} \sigma _{e}^{*}\left( {{{t}_{e}}} \right),\;e \in E{\backslash\text{ }}E{\kern 1pt} '\) is always inactive and can therefore be neglected)

$${{t}^{{k + 1}}} = \arg \mathop {\min }\limits_{\substack{ {{t}_{e}} \geqslant {{{\bar {t}}}_{e}},\;e \in E' \\ {{t}_{e}} \in \operatorname{dom} \sigma _{e}^{*}\left( {{{t}_{e}}} \right),\;e \in E\backslash E' }} \left\{ {{{\gamma }_{k}}\left\{ {\left\langle {\partial F({{t}^{k}}),\;t - {{t}^{k}}} \right\rangle + \sum\limits_{e \in E} {\sigma _{e}^{*}\left( {{{t}_{e}}} \right)} } \right\} + \frac{1}{2}\left\| {t - {{t}^{k}}} \right\|_{2}^{2}} \right\},$$
((3))

where \(\partial F({{t}^{k}})\) is an arbitrary element of the subdifferential of the convex function \(F({{t}^{k}})\) at the point \({{t}^{k}}\) and

$${{\gamma }_{k}} = {\varepsilon \mathord{\left/ {\vphantom {\varepsilon {M_{k}^{2}}}} \right. \kern-0em} {M_{k}^{2}}},\quad {{M}_{k}} = {{\left\| {\partial F({{t}^{k}})} \right\|}_{2}},$$

where \(\varepsilon > 0\) is the desired accuracy of solving problems (1) and (2) (see (6)). Unfortunately, for μ ≥ 0 we managed to prove inequality (*) (see the proof of Theorem 1 below) only for the constant step \({{\gamma }_{k}}: = {\varepsilon \mathord{\left/ {\vphantom {\varepsilon {\left( {\frac{1}{{N + 1}}\sum\nolimits_{k = 0}^N {M_{k}^{2}} } \right)}}} \right. \kern-0em} {\left( {\frac{1}{{N + 1}}\sum\nolimits_{k = 0}^N {M_{k}^{2}} } \right)}}\); in this case, \(\tilde {M}_{N}^{2}: = \frac{1}{{N + 1}}\sum\nolimits_{k = 0}^N {M_{k}^{2}} .\)

Set

$$\begin{gathered} {{{\bar {t}}}^{N}} = \frac{1}{{{{S}_{N}}}}\sum\limits_{k = 0}^N {{{\gamma }_{k}}{{t}^{k}}} ,\quad {{S}_{N}} = \sum\limits_{k = 0}^N {{{\gamma }_{k}}} , \\ f_{e}^{k} \in - {{\partial }_{e}}F({{t}^{k}}),\quad \bar {f}_{e}^{N} = \frac{1}{{{{S}_{N}}}}\sum\limits_{k = 0}^N {{{\gamma }_{k}}f_{e}^{k}} ,\quad e \in E{\backslash\text{ }}E{\kern 1pt} ', \\ \end{gathered} $$
((4))
$$\bar {f}_{e}^{N} = {{\bar {f}}_{e}} - s_{e}^{N},\quad e \in E{\kern 1pt} ',$$
((4'))

where \(s_{e}^{N}\) is the Lagrange multiplier corresponding to the constraint \({{t}_{e}} \geqslant {{\bar {t}}_{e}}\) in the problem

$$\frac{1}{{{{S}_{N}}}}\left\{ {\sum\limits_{k = 0}^N {{{\gamma }_{k}} \cdot \left\{ {\sum\limits_{e \in E'} {{{\partial }_{e}}F({{t}^{k}}) \cdot ({{t}_{e}} - t_{e}^{k})} } \right\}} + {{S}_{N}}\sum\limits_{e \in E'} {{{{\bar {f}}}_{e}} \cdot \left( {{{t}_{e}} - {{{\bar {t}}}_{e}}} \right)} + \frac{1}{2}{{{\sum\limits_{e \in E'} {\left( {{{t}_{e}} - {{{\bar {t}}}_{e}}} \right)} }}^{2}}} \right\} \to \mathop {\min }\limits_{{{t}_{e}} \geqslant {{{\bar {t}}}_{e}},\;e \in E'} .$$
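
Since this auxiliary problem is separable and quadratic in each \({{t}_{e}}\), \(e \in E{\kern 1pt} '\), its KKT conditions can be written out explicitly. Under this reading (our derivation, not spelled out in the text), with the weighted average flow \(f_{{{\text{avg}},e}} = -\frac{1}{{{S}_{N}}}\sum\nolimits_{k = 0}^N {{{\gamma }_{k}}{{\partial }_{e}}F({{t}^{k}})} \), formula (4') reduces to clipping that average at the capacity; a sketch:

```python
# Recovery (4') per edge e in E' (our derivation sketch): the KKT
# conditions of the separable quadratic problem give
#   s_e^N = max(0, fbar_e - f_avg)   and   fbar_e^N = fbar_e - s_e^N,
# i.e. fbar_e^N = min(f_avg, fbar_e): the gamma-weighted average flow
# clipped at the capacity.  The numbers used in tests are illustrative.
def recover_capacitated_flow(gammas, flows, f_cap):
    S = sum(gammas)
    f_avg = sum(g * f for g, f in zip(gammas, flows)) / S
    s = max(0.0, f_cap - f_avg)        # Lagrange multiplier s_e^N
    return f_cap - s                   # equals min(f_avg, f_cap)
```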

The stopping rule is given by the last inequality in (5):

$$\left( {0 \leqslant } \right)\Upsilon ({{\bar {t}}^{N}}) + \Psi ({{\bar {f}}^{N}}) \leqslant \varepsilon .$$
((5))

In this paper, we obtain (following mainly [1, 9], where the case \(E = E{\kern 1pt} '\) was considered) the following result on the convergence of method (3).

Theorem 1. Let

$$\tilde {M}_{N}^{2} = {{\left( {\frac{1}{{N + 1}}\sum\limits_{k = 0}^N {M_{k}^{{ - 2}}} } \right)}^{{ - 1}}},$$
$$R_{N}^{2}: = \frac{1}{2}\sum\limits_{e \in E\backslash E'} {{{{\left( {{{\tau }_{e}}(\bar {f}_{e}^{N}) - {{{\bar {t}}}_{e}}} \right)}}^{2}}} + \frac{1}{2}\sum\limits_{e \in E'} {{{{\left( {\tilde {t}_{e}^{N} - {{{\bar {t}}}_{e}}} \right)}}^{2}}} ,$$
$${{\left\{ {\tilde {t}_{e}^{N}} \right\}}_{{e \in E'}}} = \arg \mathop {\min }\limits_{{{{\left\{ {{{t}_{e}}} \right\}}}_{{e \in E'}}} \geqslant 0} \left\{ {\underbrace { - \sum\limits_{w \in W} {{{d}_{w}}{{T}_{w}}\left( {{{{\left\{ {{{\tau }_{e}}(\bar {f}_{e}^{N})} \right\}}}_{{e \in E\backslash E'}}},\;{{{\left\{ {{{t}_{e}}} \right\}}}_{{e \in E'}}}} \right)} }_{F\left( {{{{\left\{ {{{\tau }_{e}}(\bar {f}_{e}^{N})} \right\}}}_{{e \in E\backslash E'}}},{{{\left\{ {{{t}_{e}}} \right\}}}_{{e \in E'}}}} \right)} + \sum\limits_{e \in E'} {\bar {f}_{e}^{N} \cdot \left( {{{t}_{e}} - {{{\bar {t}}}_{e}}} \right)} } \right\}.$$

Then, for

$$N \geqslant \frac{{2\tilde {M}_{N}^{2}R_{N}^{2}}}{{{{\varepsilon }^{2}}}},$$

we have inequality (5) and, as a consequence,

$$0 \leqslant \Upsilon ({{\bar {t}}^{N}}) - {{\Upsilon }_{*}} \leqslant \varepsilon ,\quad 0 \leqslant \Psi ({{\bar {f}}^{N}}) - {{\Psi }_{*}} \leqslant \varepsilon .$$
((6))

Proof. Using the results obtained in [8, 10], we have (we also use the fact that \(d\sigma _{e}^{*}\left( {{{t}_{e}}} \right){\text{/}}d{{t}_{e}} = {{f}_{e}}\)\( \Leftrightarrow \)\({{t}_{e}} = {{\tau }_{e}}\left( {{{f}_{e}}} \right)\), \(e \in E{\backslash\text{ }}E{\kern 1pt} '\); \(\sigma _{e}^{*}\left( {{{t}_{e}}} \right) \geqslant 0\), \(\sigma _{e}^{*}(t_{e}^{0}) = \sigma _{e}^{*}{\kern 1pt} \left( {{{{\bar {t}}}_{e}}} \right) = 0\), \(e \in E\))

$$\Upsilon ({{\bar {t}}^{N}}) \leqslant \frac{1}{{{{S}_{N}}}}\sum\limits_{k = 0}^N {{{\gamma }_{k}}} \Upsilon ({{t}^{k}}) \leqslant \frac{1}{{2{{S}_{N}}}}\sum\limits_{k = 0}^N {\gamma _{k}^{2}M_{k}^{2}} $$
$$ + \;\frac{1}{{{{S}_{N}}}}\mathop {\min }\limits_{{{t}_{e}} \geqslant {{{\bar {t}}}_{e}},\;e \in E{\kern 1pt} {\text{'}}} \left\{ {\sum\limits_{k = 0}^N {{{\gamma }_{k}}} \left\{ {F({{t}^{k}}) + \left\langle {\partial F({{t}^{k}}),\;t - {{t}^{k}}} \right\rangle } \right\} + {{S}_{N}}\sum\limits_{e \in E} {\sigma _{e}^{*}\left( {{{t}_{e}}} \right)} + \frac{1}{2}\left\| {t - {{t}^{0}}} \right\|_{2}^{2}} \right\} \leqslant \frac{\varepsilon }{2}$$
$$ + \;\mathop {\min }\limits_t \left\{ {\frac{1}{{{{S}_{N}}}}\left\{ {\sum\limits_{k = 0}^N {{{\gamma }_{k}}} \left\{ {F({{t}^{k}}) + \left\langle {\partial F({{t}^{k}}),\;t - {{t}^{k}}} \right\rangle } \right\} + {{S}_{N}}\sum\limits_{e \in E} {\sigma _{e}^{*}\left( {{{t}_{e}}} \right)} + \frac{1}{2}\left\| {t - \bar {t}} \right\|_{2}^{2}} \right\} + \sum\limits_{e \in E'} {s_{e}^{N} \cdot \left( {{{{\bar {t}}}_{e}} - {{t}_{e}}} \right)} } \right\}$$
$$ \leqslant \;\frac{\varepsilon }{2} + \mathop {\min }\limits_{{{{\left\{ {{{t}_{e}}} \right\}}}_{{e \in E'}}}} \mathop {\min }\limits_{{{{\left\{ {{{t}_{e}}} \right\}}}_{{e \in E\backslash E'}}}} \frac{1}{{{{S}_{N}}}}\left\{ {\sum\limits_{k = 0}^N {{{\gamma }_{k}}\left\{ {F({{t}^{k}}) + \left\langle {\partial F({{t}^{k}}),\;t - {{t}^{k}}} \right\rangle } \right\}} } \right.$$
$$ + \;\left( {{{S}_{N}}\sum\limits_{e \in E\backslash E'} {\sigma _{e}^{*}\left( {{{t}_{e}}} \right)} + \frac{1}{2}{{{\sum\limits_{e \in E\backslash E'} {\left( {{{t}_{e}} - {{{\bar {t}}}_{e}}} \right)} }}^{2}}} \right)\left. { + \left( {{{S}_{N}}\sum\limits_{e \in E'} {\bar {f}_{e}^{N} \cdot \left( {{{t}_{e}} - {{{\bar {t}}}_{e}}} \right)} + \frac{1}{2}{{{\sum\limits_{e \in E'} {\left( {{{t}_{e}} - {{{\bar {t}}}_{e}}} \right)} }}^{2}}} \right)} \right\}$$
$$ \leqslant \;\frac{\varepsilon }{2} + \mathop {\min }\limits_{{{{\left\{ {{{t}_{e}}} \right\}}}_{{e \in E'}}} \geqslant 0} \frac{1}{{{{S}_{N}}}}\left\{ \begin{gathered} {\kern 1pt} \hfill \\ {\kern 1pt} \hfill \\ \end{gathered} \right.{\kern 1pt} \sum\limits_{k = 0}^N {{{\gamma }_{k}}\overbrace {\underbrace {\left\{ {F\left( {{{t}^{k}}} \right) + \left\langle {\partial F\left( {{{t}^{k}}} \right),\;{{{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{t} }}^{N}} - {{t}^{k}}} \right\rangle } \right\}}_{ \leqslant F\left( {{{{\left\{ {{{\tau }_{e}}(\bar {f}_{e}^{N})} \right\}}}_{{e \in E\backslash E'}}},{{{\left\{ {{{t}_{e}}} \right\}}}_{{e \in E'}}}} \right)}}^{{{{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\smile}$}}{t} }}^{N}} = \left( {{{{\left\{ {{{\tau }_{e}}(\bar {f}_{e}^{N})} \right\}}}_{{e \in E\backslash E'}}},{{{\left\{ {{{t}_{e}}} \right\}}}_{{e \in E'}}}} \right)}} $$
$$ + \;\left( {{{S}_{N}}\sum\limits_{e \in E\backslash E'} {\sigma _{e}^{*}\left( {{{\tau }_{e}}(\bar {f}_{e}^{N})} \right)} + \frac{1}{2}\sum\limits_{e \in E\backslash E'} {{{{\left( {{{\tau }_{e}}(\bar {f}_{e}^{N}) - {{{\bar {t}}}_{e}}} \right)}}^{2}}} } \right)\left. { + \left( {{{S}_{N}}\sum\limits_{e \in E'} {\bar {f}_{e}^{N} \cdot \left( {{{t}_{e}} - {{{\bar {t}}}_{e}}} \right)} + \frac{1}{2}{{{\sum\limits_{e \in E'} {\left( {{{t}_{e}} - {{{\bar {t}}}_{e}}} \right)} }}^{2}}} \right)} \right\}$$
$$ \leqslant \;\frac{\varepsilon }{2} + \mathop {\min }\limits_{{{{\left\{ {{{t}_{e}}} \right\}}}_{{e \in E'}}} \geqslant 0} \left\{ {F\left( {{{{\left\{ {{{\tau }_{e}}(\bar {f}_{e}^{N})} \right\}}}_{{e \in E\backslash E'}}},{{{\left\{ {{{t}_{e}}} \right\}}}_{{e \in E'}}}} \right) + } \right.\left. {\sum\limits_{e \in E'} {\bar {f}_{e}^{N} \cdot \left( {{{t}_{e}} - {{{\bar {t}}}_{e}}} \right)} + \frac{1}{{2{{S}_{N}}}}{{{\sum\limits_{e \in E'} {\left( {{{t}_{e}} - {{{\bar {t}}}_{e}}} \right)} }}^{2}}} \right\}$$
$$ + \;\sum\limits_{e \in E\backslash E'} {\sigma _{e}^{*}\left( {{{\tau }_{e}}(\bar {f}_{e}^{N})} \right)} + \frac{1}{{2{{S}_{N}}}}\sum\limits_{e \in E\backslash E'} {{{{\left( {{{\tau }_{e}}(\bar {f}_{e}^{N}) - {{{\bar {t}}}_{e}}} \right)}}^{2}}} $$
$$ \leqslant \;\frac{\varepsilon }{2} + \mathop {\min }\limits_{{{{\left\{ {{{t}_{e}}} \right\}}}_{{e \in E'}}} \geqslant 0} \left\{ {F\left( {{{{\left\{ {{{\tau }_{e}}(\bar {f}_{e}^{N})} \right\}}}_{{e \in E\backslash E'}}},{{{\left\{ {{{t}_{e}}} \right\}}}_{{e \in E'}}}} \right) + \sum\limits_{e \in E\backslash E'} {\sigma _{e}^{*}\left( {{{\tau }_{e}}(\bar {f}_{e}^{N})} \right)} + \sum\limits_{e \in E'} {\bar {f}_{e}^{N} \cdot \left( {{{t}_{e}} - {{{\bar {t}}}_{e}}} \right)} } \right\}$$
$$ + \;\frac{1}{{2{{S}_{N}}}}{{\sum\limits_{e \in E'} {\left( {\tilde {t}_{e}^{N} - {{{\bar {t}}}_{e}}} \right)} }^{2}} + \frac{1}{{2{{S}_{N}}}}\sum\limits_{e \in E\backslash E'} {{{{\left( {{{\tau }_{e}}(\bar {f}_{e}^{N}) - {{{\bar {t}}}_{e}}} \right)}}^{2}}} $$
$$ = \;\frac{\varepsilon }{2} - \Psi ({{\bar {f}}^{N}}) + \frac{{R_{N}^{2}}}{{{{S}_{N}}}} = \frac{\varepsilon }{2} + \frac{{\tilde {M}_{N}^{2}R_{N}^{2}}}{{\varepsilon \cdot \left( {N + 1} \right)}} - \Psi ({{\bar {f}}^{N}}) \leqslant \varepsilon - \Psi ({{\bar {f}}^{N}}).$$

Corollary 1. Let \(t{\kern 1pt} *\) be the solution of problem (2). Set

$${{R}^{2}} = \frac{1}{2}\left\| {t{\kern 1pt} {\text{*}} - {{t}^{0}}} \right\|_{2}^{2} = \frac{1}{2}\left\| {{{t}_{*}} - \bar {t}} \right\|_{2}^{2}.$$

Then, for

$$N = \frac{{2\tilde {M}_{N}^{2}{{R}^{2}}}}{{{{\varepsilon }^{2}}}},$$

it holds that

$$\frac{1}{2}\left\| {t{\kern 1pt} {\text{*}} - {{t}^{{k + 1}}}} \right\|_2^2 \leqslant 2{{R}^{2}},\quad k = 0,...,N,$$
((7))
$$0 \leqslant \Upsilon ({{\bar {t}}^{N}}) - {{\Upsilon }_{*}} \leqslant \varepsilon .$$
((8))

Proof. Formula (8) is a standard result (e.g., see [7]). Formula (7) is also fairly conventional (see [10]); however, we outline its derivation below. From the proof of Theorem 1, we have for every \(k = 0,...,N\)

$$0 \leqslant \frac{1}{2}\sum\limits_{l = 0}^k {\gamma _{l}^{2}M_{l}^{2}} + \frac{1}{2}\left\| {t{\kern 1pt} {\text{*}} - {{t}^{0}}} \right\|_2^2 - \frac{1}{2}\left\| {t{\kern 1pt} {\text{*}} - {{t}^{{k + 1}}}} \right\|_2^2.$$

Therefore,

$$\frac{1}{2}\left\| {t{\kern 1pt} {\text{*}} - {{t}^{{k + 1}}}} \right\|_{2}^{2} \leqslant {{R}^{2}} + \frac{1}{2}\sum\limits_{l = 0}^k {{{\varepsilon }^{2}}M_{l}^{{ - 2}}} \leqslant {{R}^{2}} + \frac{1}{2}\sum\limits_{k = 0}^N {{{\varepsilon }^{2}}M_{k}^{{ - 2}}} = 2{{R}^{2}}.$$

Remark 1. An advantage of approach (3), (4), (4') over the approach described in Section 3 of [1] is the simplicity of the description (there is no need to do restarts with respect to unknown parameters) and the existence of the stopping rule (5), which can be checked efficiently. A disadvantage is that the estimate of the convergence rate includes the poorly controllable parameter \(R_{N}^{2}\), which can be large even when \(E = E{\kern 1pt} '\). Below, we describe another method (also see [11, 12], where similar constructions are described) for recovering the solution of the primal problem (1) that differs from (4), (4') in formula (4'); in the case \(E = E{\kern 1pt} '\), this method allows the use of \({{R}^{2}}\) instead of \(R_{N}^{2}\).

4 THE SECOND METHOD FOR RECOVERING THE SOLUTION OF THE PRIMAL PROBLEM GIVEN THE SOLUTION OF THE DUAL PROBLEM

Set

$$f_{e}^{k} \in - {{\partial }_{e}}F({{t}^{k}}),\quad \bar {f}_{e}^{N} = \frac{1}{{{{S}_{N}}}}\sum\limits_{k = 0}^N {{{\gamma }_{k}}f_{e}^{k}} ,\quad e \in E,$$
((9))
$${{\tilde {R}}^{2}} = \frac{1}{2}\sum\limits_{e \in E'} {{{{(t_{e}^{*} - t_{e}^{0})}}^{2}}} = \frac{1}{2}\sum\limits_{e \in E'} {{{{(t_{e}^{*} - {{{\bar {t}}}_{e}})}}^{2}}} .$$

Theorem 2. Let

$$\tilde {R}_{N}^{2}: = \frac{1}{2}\sum\limits_{e \in E\backslash E'} {{{{\left( {{{\tau }_{e}}(\bar {f}_{e}^{N}) - {{{\bar {t}}}_{e}}} \right)}}^{2}}} + 5{{\tilde {R}}^{2}}.$$
((10))

Then, for

$$N \geqslant \frac{{4\tilde {M}_{N}^{2}\tilde {R}_{N}^{2}}}{{{{\varepsilon }^{2}}}},$$

it holds that

$$\left| {\Upsilon ({{{\bar {t}}}^{N}}) - {{\Upsilon }_{*}}} \right| \leqslant \varepsilon ,\quad \left| {\Psi ({{{\bar {f}}}^{N}}) - {{\Psi }_{*}}} \right| \leqslant \varepsilon .$$

Furthermore, the inequalities

$$\begin{gathered} \sqrt {\sum\limits_{e \in E'} {{{{\left( {{{{(\bar {f}_{e}^{N} - {{{\bar {f}}}_{e}})}}_{ + }}} \right)}}^{2}}} } \leqslant \tilde {\varepsilon },\quad \tilde {\varepsilon } = {\varepsilon \mathord{\left/ {\vphantom {\varepsilon {\tilde {R}}}} \right. \kern-0em} {\tilde {R}}}, \\ \Psi ({{{\bar {f}}}^{N}}) - {{\Psi }_{*}} \leqslant \Upsilon ({{{\bar {t}}}^{N}}) + \Psi ({{{\bar {f}}}^{N}}) \leqslant \varepsilon \\ \end{gathered} $$
((11))

hold; these inequalities can be used as a stopping rule given the pair \(\left( {\varepsilon ,\tilde {\varepsilon }} \right)\).

Proof. Reasoning as in the proof of Theorem 1 (also see [11, 12]), we obtain

$$\Upsilon ({{\bar {t}}^{N}}) \leqslant \frac{1}{{{{S}_{N}}}}\sum\limits_{k = 0}^N {{{\gamma }_{k}}} \Upsilon ({{t}^{k}})\;\mathop \leqslant \limits^{(12)} \;\frac{1}{{2{{S}_{N}}}}\sum\limits_{k = 0}^N {\gamma _{k}^{2}M_{k}^{2}} $$
$$ + \;\frac{1}{{{{S}_{N}}}}\mathop {\min }\limits_{\substack{ {{t}_{e}},\;e \in E\backslash E';\;{{t}_{e}} \geqslant {{{\bar {t}}}_{e}},\;e \in E' \\ \frac{1}{2}\sum\limits_{e \in E'} {{{{\left( {{{t}_{e}} - {{{\bar {t}}}_{e}}} \right)}}^{2}}} \leqslant 5{{{\tilde {R}}}^{2}} }} \left\{ {\sum\limits_{k = 0}^N {{{\gamma }_{k}}} \left\{ {F({{t}^{k}}) + \left\langle {\partial F({{t}^{k}}),\;t - {{t}^{k}}} \right\rangle } \right\} + {{S}_{N}}\sum\limits_{e \in E} {\sigma _{e}^{*}\left( {{{t}_{e}}} \right)} } \right\} + \frac{{\tilde {R}_{N}^{2}}}{{{{S}_{N}}}}$$
$$\mathop \leqslant \limits^{(13)} \;\frac{\varepsilon }{2} - \Psi ({{\bar {f}}^{N}}) - \mathop {\max }\limits_{\substack{ {{t}_{e}} \geqslant {{{\bar {t}}}_{e}},\;e \in E' \\ \frac{1}{2}\sum\limits_{e \in E'} {{{{\left( {{{t}_{e}} - {{{\bar {t}}}_{e}}} \right)}}^{2}}} \leqslant 5{{{\tilde {R}}}^{2}} }} \left\{ {\frac{1}{{{{S}_{N}}}}\sum\limits_{k = 0}^N {{{\gamma }_{k}}\sum\limits_{e \in E'} {(f_{e}^{k} - {{{\bar {f}}}_{e}})\left( {{{t}_{e}} - {{{\bar {t}}}_{e}}} \right)} } } \right\} + \frac{{\tilde {R}_{N}^{2}}}{{{{S}_{N}}}}$$
$$ \leqslant \;\frac{\varepsilon }{2} - \Psi ({{\bar {f}}^{N}}) - 3\tilde {R}\sqrt {\sum\limits_{e \in E'} {{{{\left( {{{{(\bar {f}_{e}^{N} - {{{\bar {f}}}_{e}})}}_{ + }}} \right)}}^{2}}} } + \frac{{\tilde {R}_{N}^{2}}}{{{{S}_{N}}}} \leqslant \varepsilon - \Psi ({{\bar {f}}^{N}}) - 3\tilde {R}\sqrt {\sum\limits_{e \in E'} {{{{\left( {{{{(\bar {f}_{e}^{N} - {{{\bar {f}}}_{e}})}}_{ + }}} \right)}}^{2}}} } .$$

Inequality (12) was obtained using the equality

$${{\left\{ {{{\tau }_{e}}(\bar {f}_{e}^{N})} \right\}}_{{e \in E\backslash E'}}} = \arg \mathop {\min }\limits_{{{t}_{e}},\;e \in E\backslash E'} \left\{ {\frac{1}{{{{S}_{N}}}}\sum\limits_{k = 0}^N {{{\gamma }_{k}}} \left\{ {F({{t}^{k}}) + \left\langle {\partial F({{t}^{k}}),\;t - {{t}^{k}}} \right\rangle } \right\} + \sum\limits_{e \in E} {\sigma _{e}^{*}\left( {{{t}_{e}}} \right)} } \right\}.$$

Inequality (13) was obtained on the basis of the following relations:

$$\partial F({{t}^{k}}) = - {{f}^{k}},\quad F({{t}^{k}}) = - \left\langle {{{f}^{k}},{{t}^{k}}} \right\rangle ,$$
$$\mathop {\min }\limits_{{{t}_{e}}} \left\{ { - f_{e}^{k}{{t}_{e}} + \sigma _{e}^{*}\left( {{{t}_{e}}} \right)} \right\} = - {{\sigma }_{e}}(f_{e}^{k}),\quad e \in E{\backslash\text{ }}E{\kern 1pt} ',$$
$$ - f_{e}^{k}{{t}_{e}} + \sigma _{e}^{*}\left( {{{t}_{e}}} \right) = - (f_{e}^{k} - {{\bar {f}}_{e}})\left( {{{t}_{e}} - {{{\bar {t}}}_{e}}} \right) - f_{e}^{k}{{\bar {t}}_{e}} = - (f_{e}^{k} - {{\bar {f}}_{e}})\left( {{{t}_{e}} - {{{\bar {t}}}_{e}}} \right) - {{\sigma }_{e}}(f_{e}^{k}),\quad e \in E{\kern 1pt} ',$$
$$ - \frac{1}{{{{S}_{N}}}}\sum\limits_{k = 0}^N {{{\gamma }_{k}}} \sum\limits_{e \in E} {{{\sigma }_{e}}(f_{e}^{k})} \leqslant - \Psi ({{\bar {f}}^{N}}).$$

Therefore,

$$\Upsilon ({{\bar {t}}^{N}}) + \Psi ({{\bar {f}}^{N}}) + 3\tilde {R}\sqrt {\sum\limits_{e \in E'} {{{{\left( {{{{(\bar {f}_{e}^{N} - {{{\bar {f}}}_{e}})}}_{ + }}} \right)}}^{2}}} } \leqslant \varepsilon .$$

Repeating the reasoning in Section 6.11 in [13] and Section 3 in [12], we obtain the desired inequalities.

5 FINAL REMARKS

In this section, we discuss in the form of remarks the results obtained above and briefly present the results of numerical experiments.

Remark 2. The advantages of formula (9) over formulas (4), (4') in the case \(E = E{\kern 1pt} '\) were discussed in Remark 1. Here we point out the disadvantages of this approach: (1) the constraint \({{f}_{e}} \leqslant {{\bar {f}}_{e}}\), \(e \in E{\kern 1pt} '\), in the primal problem can be violated; (2) the left-hand inequalities in the double inequalities (5), (6) no longer hold.
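
The degree to which the capacity constraints are violated can be monitored by the quantity from (11); a minimal sketch with illustrative numbers:

```python
import math

# Euclidean norm of the positive parts (f_e^N - fbar_e)_+ over e in E'.
def capacity_violation(flows, caps):
    return math.sqrt(sum(max(0.0, f - c) ** 2 for f, c in zip(flows, caps)))

v = capacity_violation([120.0, 90.0], [100.0, 100.0])   # only the first edge exceeds its cap
```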

Remark 3. Formulas (4) and (9) recover the solution of the primal problem, either by necessity (in case (4)) or purposefully (in case (9) for \(e \in E{\kern 1pt} '\)), based on a "model", i.e., an explicit formula relating the primal and dual variables. The presence of such a model inevitably causes a duality gap in the solutions of the auxiliary problems, the size of which is difficult to control. In the case when the existence of the model is related to a constraint in the primal problem (formulas (9) for \(e \in E{\kern 1pt} '\) and the constraint \({{f}_{e}} \leqslant {{\bar {f}}_{e}}\), \(e \in E{\kern 1pt} '\), in the primal problem), the constraints to be controlled may be violated (11); on the other hand, we have complete control of the parts of the duality gap estimate (10) corresponding to these variables. Approach (4'), related to the existence of the constraints \({{t}_{e}} \geqslant {{\bar {t}}_{e}}\), \(e \in E{\kern 1pt} '\), in the problem to be solved, also leads to a duality gap in the solutions of the auxiliary problems whose size is difficult to control; however, it no longer causes violations of the constraints in the problem itself or in the adjoint problem (in the case under examination, the dual problem is (2) and its adjoint is the primal problem (1)). These two primal–dual approaches are complemented by another primal–dual approach in which the steps are made "with respect to the functional" if the constraints in the problem are not violated or are weakly violated, and "with respect to the violated constraint" otherwise. This issue is discussed in more detail in, e.g., [13, 14].

Remark 4. Both approaches described above can be extended (retaining the form of recovery formulas and the structure of reasoning) to almost all pairs of primal–dual problems because the example of the pair of mutually adjoint problems (1), (2) contains almost all details that can appear in such reasoning. Moreover, instead of the mirror descent method, we could use any other primal–dual method, e.g., the composite universal gradient method described in [15].

Remark 5. The proposed methods can be efficiently parallelized if the elements of the subdifferential \(\partial F({{t}^{k}})\), whose computation is the most costly part of an iteration, can be efficiently computed (see formulas (3), (4), and (9), in which this subdifferential is used):

$$\partial F\left( t \right) = - \sum\limits_{i \in O} {\sum\limits_{j \in D:\;\left( {i,j} \right) \in W} {{{d}_{{ij}}}\partial {{T}_{{ij}}}\left( t \right)} } .$$

\({{\left\{ {\partial {{T}_{{ij}}}\left( t \right)} \right\}}_{{j \in D:\;\left( {i,j} \right) \in W}}}\) can be computed by Dijkstra's algorithm [16] and its modern analogs (see [17]) in time \({\rm O}\left( {n\ln n} \right)\). By \(\partial {{T}_{{ij}}}\left( t \right)\), we may mean any (if there are several) shortest path from the vertex \(i\) to \(j\) in the graph \(\Gamma \) whose edges are weighted by the vector \(t = {{\left\{ {{{t}_{e}}} \right\}}_{{e \in E}}}\). By the description of a path, we mean \({{\left[ {\partial {{T}_{{ij}}}\left( t \right)} \right]}_{e}} = 1\) if \(e\) appears on the shortest path and \({{\left[ {\partial {{T}_{{ij}}}\left( t \right)} \right]}_{e}} = 0\) otherwise. Thus, the computation of \(\partial F\left( t \right)\) can be parallelized over \(S\) processors.
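
A sequential sketch of this computation (the loop over sources is the part that parallelizes) using the standard heap-based Dijkstra; the toy graph, the demands, and the helper names are our assumptions:

```python
import heapq

# Assemble -dF(t) = sum_{(i,j) in W} d_ij * dT_ij(t): for every source run
# Dijkstra, then mark the edges of one shortest path per trip.

def dijkstra(adj, src):
    # adj: {u: [(v, edge_id, weight), ...]}; returns distances and, for each
    # reached vertex, its predecessor (parent, edge_id) on a shortest path.
    dist = {src: 0.0}
    pred = {}
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue                     # stale heap entry
        for v, e, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                pred[v] = (u, e)
                heapq.heappush(pq, (nd, v))
    return dist, pred

def minus_dF(adj, demands):
    # demands: {(i, j): d_ij}; returns {edge_id: flow}, i.e. an element of -dF(t)
    flow = {}
    for i in {i for (i, _) in demands}:
        _, pred = dijkstra(adj, i)
        for (src, j), d in demands.items():
            if src != i:
                continue
            v = j
            while v != i:                # walk the path back to the source
                u, e = pred[v]
                flow[e] = flow.get(e, 0.0) + d
                v = u
    return flow
```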

Remark 6. Strictly speaking, we want to find the vector \(\partial F\left( t \right)\) rather than the shortest paths themselves. To find \(\partial F\left( t \right)\) in \({\rm O}\left( {Sn\ln n} \right)\) operations, we can construct the shortest path tree (e.g., using Dijkstra's algorithm) for each of the \(S\) sources. Based on the dynamic programming principle (any part of a shortest path is itself a shortest path), it is easy to verify that we indeed obtain a tree rooted at the source vertex. For a single source, this tree can be constructed in \({\rm O}\left( {n\ln n} \right)\) operations (see [16, 17]). The key point, however, is to find proper weights for the edges of this tree (there are not more than \({\rm O}\left( n \right)\) of them) so that the contribution of the corresponding source (over all edges) to the total vector \(\partial F\left( t \right)\) can be recovered in a single pass. The weight of an edge must equal the sum of all trips passing through it that begin at the given source (the tree root). Knowing the values of the corresponding trips (there are not more than \({\rm O}\left( n \right)\) of them), the proper weights can be found in not more than \({\rm O}\left( n \right)\) operations by a single backward pass through the tree (i.e., from the leaves to the root): the weight of each edge is set equal to the sum of the trips (possibly zero) to the corresponding vertex passing through this edge plus the sum of the weights of all edges (if any) outgoing from this vertex.
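
The backward pass described above can be sketched as follows. The predecessor-map encoding of the tree and the demand values are our assumptions; for clarity the depth ordering is recomputed per vertex, whereas a DFS ordering would give the linear-time pass of the remark:

```python
# One-pass aggregation over a shortest-path tree: the load of the tree edge
# entering vertex v equals the demand to v plus the loads of all tree edges
# leaving v.  pred maps child -> (parent, edge_id); demand maps
# vertex -> trips from the root to that vertex.
def tree_edge_loads(pred, demand):
    def depth(v):                   # distance (in hops) from v to the root
        d = 0
        while v in pred:
            v = pred[v][0]
            d += 1
        return d
    load = {}                       # edge_id -> accumulated trips
    subtree = dict(demand)          # vertex -> trips routed into its subtree
    for v in sorted(pred, key=depth, reverse=True):   # children before parents
        u, e = pred[v]
        load[e] = subtree.get(v, 0.0)
        subtree[u] = subtree.get(u, 0.0) + load[e]
    return load
```

For example, on the chain r → a → b with 5 trips to a and 3 trips to b, the edge into b carries 3 and the edge into a carries 8.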

Sixth-year students of the Moscow Institute of Physics and Technology performed various numerical experiments [18, 19] with the algorithms described in this paper and in [1]. The data for the experiments were taken from the open resource [20]. Here are the main conclusions drawn from these experiments:

• If all edges in the mixed model are of the Beckmann type, then it is better to use the conditional gradient (Frank–Wolfe) method [1, 2].

• The methods described in this paper and in [1] based on solving the dual problem by the primal–dual mirror descent method perform in agreement with the estimates (i.e., the estimates obtained in this paper are confirmed by practical computations, and the methods do not converge faster than predicted by these estimates).

• It is reasonable to apply the dual approaches based on the mirror descent (dual averaging [10]) method to real cities only if relatively low accuracy is required (not greater than 5%); to achieve better accuracy in the case of metropolitan areas (tens of thousands of edges) all these methods (not parallelized) will take hours.

• The use of high-level languages (like Python) reduces efficiency (up to an order of magnitude) compared with lower-level languages (like C++).