1 Introduction

This note considers just one contribution of Ragnar Frisch, Nobel Laureate in Economic Sciences 1969. It is, however, a most important one, belonging not to core economics, but rather to those parts of optimization theory that are most applied in economics.

The underlying issue goes broadly as follows. In optimization, linear equality constraints may help to reduce the number of variables. Not so if constraints come as inequalities—as indeed they often do. Variable elimination then becomes hard. So, why not incorporate sign restrictions by way of some barrier method? This was Frisch’s pioneering idea.Footnote 1 To enforce acceptable choice of nonnegative variables, [18] used the logarithmic potential “as a guiding device—a sort of radar—to prevent us from hitting the boundary.”

In hindsight, his approach appears most natural. Yet it put him on a novel path, rich in consequences, but—at that time—distinctly beside the dominant orientation of optimization theory.

To date, his approach has impressed few economists,Footnote 2 and it took some time before it attracted wide attention in mathematics.Footnote 3 However, from the late 1980s onwards, his idea has been pursued by many—with much effort, great success, and due references.Footnote 4 Thus emerged an important new class of optimization techniques, called interior-point methods, all tempered so as to keep each inequality strictly satisfied throughout the procedure.

Fiacco and McCormick [11] were chief inventors.Footnote 5 But certainly, [16, 17, 18, p.13] came first.Footnote 6 The purpose of this note is to set the record straight and place his contributions where they belong: at the origin of barrier methods.

In reviewing parts of his approach, and briefly mentioning some subsequent developments, this note aims at reaching diverse readers. Included are economists and mathematicians, interested in applicable optimization and its history.Footnote 7 The material below is self-contained. It provides a general view, and is intended to be accessible with few prerequisites.

2 Frisch and optimization theory

Frisch’s interests revolved around econometrics [12], and many of his studies dealt with the dynamics, organization and planning of national development. All in all, being well trained in mathematics, he saw optimization mainly as a supplement, especially useful to input-output analysis [31]. This is seen most clearly in his papers from 1955–1956. He had, though, a strong and lasting interest in optimization per se.

Already his 1934 paper in Econometrica—a bit too long—on “national organization of a commodity and service exchange” featured a convex, linear-quadratic problem. Ahead of Stigler (1945), [14] had a first, rudimentary go at the diet problem, as reviewed by [40].Footnote 8 It appears that Frisch knew neither Kantorovich’s [26] work on production planning nor Hitchcock’s [24] study of product distribution. However, during the late 1940s and early 1950s he followed closely the development of linear programming—and its impacts on economic analysis—unfolding in [4, 8], and [30].Footnote 9

In parallel—during intense, fairly short spells—he worked on intimately related issues. To wit, from mid-October 1953 until early May 1954 he issued ten memoranda at his Oslo institute of economics, all in Norwegian, all on linear programming; see [2]. In three early ones—dated Oct. 18, Nov. 13 and Jan. 14—he applied the “logarithmic potential” to keep each nonnegative decision variable strictly positive.Footnote 10

By adding such highly nonlinear terms to the problem’s objective—thereby inviting continuous optimization—he deviated distinctly from the linear programming of those days, dominated by Dantzig’s simplex algorithm.

As is well known, that algorithm has a discrete nature. It proceeds on the boundary of the polyhedron by jumping from one vertex to a better one—if any.Footnote 11 However, in the early 1970s, after long having enjoyed its practical efficiency, the optimization community became intrigued by the huge difference between its common and worst-case performances. An example of [29] unleashed a quest for solution procedures whose running times were polynomial functions of the number of bits needed to store the problem data. The first constructive steps in that direction were taken by [28] and [27]—neither of whom mentioned Frisch or barrier methods.Footnote 12 Their papers inspired development and practical use of polynomial-time methods. Central in that regard is Frisch’s logarithmic potential.

3 The problem and barrier methods

To review the originality of Frisch’s contributions, it is convenient to frame a suitable setting by way of the problem:

$$\begin{aligned} \left. \begin{array}{lll} \text {maximize} &{} c(x) &{} \\ \text {subject to} &{} Ax=y &{} \\ \text {and} &{} c_{j}(x)\ge 0, &{} j\in J. \end{array} \right\} \end{aligned}$$
(1)

\(A\) is here a linear operator—alias a matrix—mapping a finite-dimensional vector space \(\mathbb {X}\) onto another such space \(\mathbb {Y}\). Both spaces are equipped with inner products \( \left\langle \cdot ,\cdot \right\rangle \). By standing assumption, the criterion \(c:\mathbb {X}\rightarrow \mathbb {R}\) and the constraint functions \(c_{j}:\mathbb {X}\rightarrow \mathbb {R}\) are concave and differentiable. The number \(\left| J\right| \) of inequalities is finite.

For interpretation, the vector \(x\in \mathbb {X}\) records inputs to a linear production process \(x\mapsto Ax,\) bound to bring out a prescribed vector \( y\in \mathbb {Y}\)—without violating any constraint \(c_{j}(x)\ge 0,\) \(j\in J\).

Henceforth suppose that problem (1) is strictly feasible in that

$$\begin{aligned} \text {some}\quad x\in \mathbb {X}\quad \text {satisfies} \quad c_{j}(x)>0\quad \text {for all} \quad j\in J,\quad \text {and} \quad Ax=y. \end{aligned}$$
(2)

Under that premise, fix a positive barrier parameter \(\beta \) and consider the modified problem:

$$\begin{aligned} \text {maximize }c(x)+\beta \sum _{j\in J}\ln c_{j}(x)\quad \text {subject to} \quad Ax=y. \end{aligned}$$
(3)

Problem (3) remains concave and differentiable—and, most importantly, it enforces strict satisfaction of the inequality constraints.Footnote 13 The Karush-Kuhn-Tucker optimality conditions for (3) are necessary and sufficient, and they read:

$$\begin{aligned} c^{\prime }(x)+\sum _{j\in J}\frac{\beta }{c_{j}(x)}c_{j}^{\prime }(x)-A^{*}y^{*}=0\quad \text {and}\quad Ax=y. \end{aligned}$$
(4)

Here \(A^{*}\) is the transposed operator, mapping output prices \( y^{*}\) to input prices \(A^{*}y^{*}\).Footnote 14 Fix any pair \((x,y^{*})\) that satisfies conditions (4). Let

$$\begin{aligned} L(x,\lambda ,y^{*}):=c(x)+\sum _{j\in J}\lambda _{j}c_{j}(x)+\left\langle y^{*},y-Ax\right\rangle \end{aligned}$$

be the Lagrange function of the original problem (1). Introduce multipliers \(\lambda _{j}:=\beta /c_{j}(x)\) in (4) to see that, besides \(Ax=y,\) there holds

$$\begin{aligned} \frac{\partial L(x,\lambda ,y^{*})}{\partial x}=0. \end{aligned}$$
(5)

Looking at problem (1) anew, stationarity condition (5) says that \(x\) is a best response to price vectors \(\lambda ,y^{*}\). Moreover, for that same problem, the triple \((x,\lambda ,y^{*})\) provides an overestimate of the duality gap

$$\begin{aligned} \inf _{\lambda \ge 0,y^{*}}\sup _{x}L-\sup _{x}\inf _{\lambda \ge 0,y^{*}}L \le \left\{ c(x)+\sum _{j\in J}\lambda _{j}c_{j}(x)+\left\langle y^{*},y-Ax\right\rangle \right\} -c(x)=\beta \left| J\right| . \end{aligned}$$

Indeed, since \(Ax=y\) and \(\lambda _{j}c_{j}(x)=\beta \) for every \(j\in J\), the bracketed expression exceeds \(c(x)\) by exactly \(\beta \left| J\right| \). The upshot is that an optimal solution to the “surrogate problem” (3) produces a primal-dual feasible choice for the original problem (1), as nearly optimal as desired.

Iterative restarts with smaller \(\beta \) entail a steadily decreasing duality gap. Versions of such so-called barrier methods were developed, analyzed and popularized in the 1960s by [10]. Their classic 1968 book, which referred to Frisch, was widely read, and their software much used; see [37] on their Sequential Unconstrained Minimization Technique.Footnote 15 Shanno [42] justly credits them as major inventors of interior-point primal methods.
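To make such restarts concrete, here is a minimal numerical sketch in Python (with NumPy) on a hypothetical two-variable instance of problem (1): maximize \(c(x)=-(x_{1}-2)^{2}-(x_{2}-1)^{2}\) subject to \(x_{1}+x_{2}=1\) and \(x\ge 0\). The instance, the helper name solve_barrier, and all tolerances are illustrative assumptions; this is neither Frisch’s procedure nor Fiacco and McCormick’s software. The equality constraint serves to eliminate \(x_{2}\), each barrier subproblem (3) is solved by a one-dimensional Newton iteration, and \(\beta \left| J\right| \) is reported as the overestimate of the duality gap derived above.

```python
import numpy as np

# Hypothetical instance, for illustration only:
#   maximize  c(x) = -(x1 - 2)^2 - (x2 - 1)^2
#   subject to x1 + x2 = 1,  x1 >= 0,  x2 >= 0.
# Eliminating x2 = 1 - t (with t := x1), the barrier problem (3) becomes
#   maximize  f(t) = -(t - 2)^2 - t^2 + beta*(ln t + ln(1 - t))  over 0 < t < 1.

def solve_barrier(beta, t=0.5, tol=1e-10):
    """Maximize the strictly concave f by damped one-dimensional Newton steps."""
    for _ in range(100):
        grad = 4.0 - 4.0*t + beta/t - beta/(1.0 - t)
        hess = -4.0 - beta/t**2 - beta/(1.0 - t)**2    # < 0, so f is strictly concave
        step = -grad/hess
        while not (0.0 < t + step < 1.0):              # damp to stay strictly interior
            step *= 0.5
        t += step
        if abs(grad) < tol:
            break
    return t

def c(x):
    """Concave criterion of the illustrative instance."""
    return -(x[0] - 2.0)**2 - (x[1] - 1.0)**2

c_opt = -2.0                        # optimal value of the original problem, at x = (1, 0)

beta = 1.0
for _ in range(8):                  # iterative restarts with a smaller barrier parameter
    t = solve_barrier(beta)
    x = np.array([t, 1.0 - t])      # strictly feasible: x > 0 and x1 + x2 = 1
    gap_bound = beta*2              # beta * |J|, with |J| = 2 inequality constraints
    print(f"beta={beta:8.1e}  x={np.round(x, 6)}  c(x)={c(x):9.6f}  gap bound={gap_bound:.1e}")
    assert c_opt - c(x) <= gap_bound + 1e-9
    beta /= 10.0
```

As \(\beta \) shrinks by a factor of ten per restart, the computed points stay strictly feasible while the reported bound, and with it the suboptimality, decreases accordingly.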

Barrier methods were long believed to suffer from ill-conditioning of Hessian matrices—a feature which may obstruct steepest ascent or quasi-Newton procedures.Footnote 16 Dire consequences have, however, rarely been documented. In particular, when stated in primal-dual form, ill-conditioning of interior-point methods is quite benign; see [49], or [38]. To see how that form enters, it is fitting to recall next how Frisch dealt with linear instances of problem (1).

4 Linear programming with the logarithmic potential

This section reviews how Frisch handled linear programming. It uses his format, subsumed by the more general problem (1).

Henceforth let \(\mathbb {X}=\mathbb {R}^{J}\), \(c(x)=\left\langle x^{*},x\right\rangle ,\) \(c_{j}(x)=x_{j}\), and \(\mathbb {Y}=\mathbb {R}^{I}\) for some finite set \(I\). Consequently, using standard inner products, problem (1) specializes to a linear program in canonical form:

$$\begin{aligned} \text {maximize }\left\langle x^{*},x\right\rangle \quad \text {subject to } Ax=y\quad \text { and }\quad x\ge 0. \end{aligned}$$
(6)

\(A\) is an \(I\times J\) matrix with linearly independent rows. Hence the system \(Ax=y\) is solvable for some maximal subvector \(x_{B},\) \(B\subseteq J,\) \(\left| B\right| =\left| I\right| ,\) in terms of the remaining part \(x_{N}\), \(N:=J\diagdown B\). Thus emerges an equivalent equation system of the form

$$\begin{aligned} x_{B}=\chi +\mathcal {B}x_{N}. \end{aligned}$$
(7)

Substituting for \(x_{B}\) in the objective gives a reduced but affine criterion

$$\begin{aligned} \left\langle x^{*},x\right\rangle =v+\left\langle r_{N},x_{N}\right\rangle . \end{aligned}$$
(8)

Frisch [17] started with tableau (7), (8), referring to variables \(x_{B}\) as dependent, variables \(x_{N}\) as independent, and he called \(\left| N\right| \) the number of degrees of freedom.Footnote 17

In linear programming jargon, \(x_{B}\) comprises basic variables, \( x_{N}\) nonbasic variables, and \(r_{N}\) reduced prices. Note that \(x=(x_{B},0)\) becomes feasible for (6) if \(\chi \ge 0\). If, moreover, \(r_{N}\le 0,\) the vertex \((x_{B},0)\) is optimal, yielding best value \(v\). Along the same line, still with \(\chi \ge 0,\) if some reduced price \(r_{n}\) remains positive, one may increase \(x_{n}\) beyond \(0\)—and thereby improve the objective—until some \(x_{b}\) (if any) hits its minimal level \( 0\). Thus a nonbasic \(n\) comes to replace a basic \(b\).Footnote 18
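To see tableau (7), (8) in operation, the following sketch (Python with NumPy, on hypothetical data not taken from Frisch’s memoranda) partitions a small system \(Ax=y\) into basic and nonbasic parts and computes \(\chi \), the matrix \(\mathcal {B}\), the constant \(v\), and the reduced prices \(r_{N}\).

```python
import numpy as np

# Hypothetical data for the canonical form (6): maximize <x*, x> s.t. Ax = y, x >= 0.
A  = np.array([[1.0, 1.0, 1.0, 0.0],
               [2.0, 1.0, 0.0, 1.0]])      # |I| = 2 rows, |J| = 4 columns
y  = np.array([4.0, 6.0])
xs = np.array([3.0, 2.0, 0.0, 0.0])        # the price vector x* in the objective

B, N = [2, 3], [0, 1]                      # a basis B and the complementary index set N

# Tableau (7): x_B = chi + Bmat @ x_N, obtained by solving A_B x_B = y - A_N x_N.
chi  = np.linalg.solve(A[:, B], y)         # value of the basic variables when x_N = 0
Bmat = -np.linalg.solve(A[:, B], A[:, N])  # the matrix written as curly B in (7)

# Reduced criterion (8): <x*, x> = v + <r_N, x_N>.
v   = xs[B] @ chi
r_N = xs[N] + Bmat.T @ xs[B]               # reduced prices on the nonbasic variables

print("chi =", chi)                        # chi >= 0 means the vertex (x_B, 0) is feasible
print("v   =", v)
print("r_N =", r_N)                        # a positive entry signals room for improvement
```

With \(\chi \ge 0\) the vertex \((x_{B},0)\) is feasible, and any positive entry of \(r_{N}\) flags a nonbasic variable whose increase would improve the objective, which is exactly the information the simplex method pivots on.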

Frisch [17] proceeded, however, differently. Justified by strict feasibility (2) and equation (7), he considered the logarithmic potential

$$\begin{aligned} \sum _{j\in J}\ln x_{j}=\sum _{b\in B}\ln \left( \chi _{b}+\sum _{n\in N}\mathcal {B} _{bn}x_{n}\right) +\sum _{n\in N}\ln x_{n}=:V(x_{N}) \end{aligned}$$

and its reduced gradient \(V^{\prime }(x_{N})\). If construed as a tentative direction of improvement, \(V^{\prime }(x_{N})\) competes with the reduced price vector \(r_{N}\). So, Frisch sought to strike “a wise compromise between these two directions”, setting

$$\begin{aligned} g_{N}:=r_{N}+\beta V^{\prime }(x_{N}) \end{aligned}$$
(9)

for some positive \(\beta \). Thus, in hindsight, modulo variable elimination, he would

$$\begin{aligned} \text {maximize }\left\langle x^{*},x\right\rangle +\beta \sum _{j\in J}\ln x_{j}\quad \text {subject to} \quad Ax=y, \end{aligned}$$
(10)

thereby dealing with a most important instance of problem (3). As a vehicle he applied the reduced-form gradient (9). When the latter has a positive “projection” \( \left\langle r_{N},g_{N}\right\rangle \) onto the reduced price vector \(r_{N},\) he recommended that \(x_{N}>0\) be updated with a suitable stepsize along \( g_{N}\)—and that \(x_{B}>0\) thereafter be adjusted by (7). Clearly, whenever halted, the method furnishes a feasible solution alongside an overestimate of the duality gap.
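The next sketch, on the same hypothetical data as above, renders one reading of that recipe: it forms the reduced gradient \(V^{\prime }(x_{N})\), the compromise direction \(g_{N}\) of (9), and repeats damped steps that keep all variables strictly positive. The stopping rule, the backtracking stepsize, and the helper potential_objective are illustrative assumptions, not a reconstruction of Frisch’s actual computations.

```python
import numpy as np

# Same hypothetical instance as in the previous sketch.
A  = np.array([[1.0, 1.0, 1.0, 0.0],
               [2.0, 1.0, 0.0, 1.0]])
y  = np.array([4.0, 6.0])
xs = np.array([3.0, 2.0, 0.0, 0.0])
B, N = [2, 3], [0, 1]
chi  = np.linalg.solve(A[:, B], y)
Bmat = -np.linalg.solve(A[:, B], A[:, N])
r_N  = xs[N] + Bmat.T @ xs[B]              # reduced prices, as in (8)

beta = 0.5                                 # a fixed, positive barrier parameter

def potential_objective(x_N):
    """Reduced barrier objective <r_N, x_N> + beta * V(x_N); -inf outside the interior."""
    x_B = chi + Bmat @ x_N                 # dependent variables via (7)
    if np.any(x_N <= 0.0) or np.any(x_B <= 0.0):
        return -np.inf
    return r_N @ x_N + beta*(np.log(x_B).sum() + np.log(x_N).sum())

x_N = np.array([0.5, 0.5])                 # strictly positive starting point
for _ in range(200):
    x_B = chi + Bmat @ x_N
    V_grad = Bmat.T @ (1.0/x_B) + 1.0/x_N  # reduced gradient of the logarithmic potential
    g_N = r_N + beta*V_grad                # Frisch's compromise direction (9)
    if r_N @ g_N <= 1e-10:                 # no positive "projection" onto r_N: stop
        break
    step = 1.0                             # crude backtracking for a "suitable stepsize"
    while step > 1e-16 and potential_objective(x_N + step*g_N) <= potential_objective(x_N):
        step *= 0.5
    x_N = x_N + step*g_N                   # update the independent variables...
x_B = chi + Bmat @ x_N                     # ...and adjust the dependent ones by (7)

print("x_N =", x_N.round(4), "  x_B =", x_B.round(4))
print("objective <x*, x> =", round(xs[N] @ x_N + xs[B] @ x_B, 4))
```

Whenever the loop halts, the current point \((x_{B},x_{N})\) is strictly feasible, in line with the remark above.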

Frisch saw the early days of computation and programming. At that time, he had no direct access to electronic computers. A main concern of his was to identify active constraints. Accordingly, his implementation—by requiring repeated, on-line judgements—now appears somewhat ad hoc. And his convergence analysis is incomplete. Moreover, he did not apply Newton methods. Nonetheless, he opened the door to subsequent developments as indicated next.

5 Enter Newton methods

By strict feasibility (2), primal problem (6) and its dual

$$\begin{aligned} \text {minimize }\left\langle y^{*},y\right\rangle \text { subject to } x^{*}\le A^{*}y^{*} \end{aligned}$$

are both feasible—with equal optimal values. The optimality conditions for (6) read:

$$\begin{aligned} Ax=y, \quad x^{*}+\lambda =A^{*}y^{*}, \quad \lambda ,x\ge 0 \quad \hbox {and} \quad \left\langle \lambda ,x\right\rangle =0. \end{aligned}$$

For comparison, and in view of (4), the corresponding conditions for (10) take the simpler form:

$$\begin{aligned} Ax=y\quad \hbox {and} \quad x^{*}+\left[ \frac{\beta }{x_{j}}\right] =A^{*}y^{*}, \end{aligned}$$
(11)

it being understood that \(x>0\) is an implicit constraint.

Until the late 1980s, interior-point procedures for (6) used Newton methods to solve (11), letting \(\beta \) dwindle stepwise towards zero. However, around 1990, setting \(\lambda _{j}:=\beta /c_{j}(x)= \beta /x_{j}\), conditions (11) were reformulated equivalently, but more expediently, as

$$\begin{aligned} Ax=y, \quad x^{*}+\lambda =A^{*}y^{*}\quad \text {and} \quad \lambda _{j}x_{j}=\beta \quad \text {for each}\quad j, \end{aligned}$$

with the tacit understanding that all \(\lambda _{j},x_{j}>0\). Solving the last equations—called a primal-dual system—amounts to finding a root of the function

$$\begin{aligned} F(x,\lambda ,y^{*}):=\left[ \begin{array}{c} Ax-y \\ x^{*}+\lambda -A^{*}y^{*} \\ (\lambda _{j}x_{j}-\beta ) \end{array} \right] . \end{aligned}$$

For that purpose, Newton methods proceed by iteratively updating the currently proposed point \((x,\lambda ,y^{*})=:z\) along a direction \( (\Delta x,\Delta \lambda ,\Delta y^{*})=:\Delta z\) that solves the linearized equation \(F(z+\Delta z)\approx F(z)+F^{\prime }(z)\Delta z=0\). Implementations go under the heading of (central) path-following methods; see [9, 23, 34], or [44]. Fine but crucial details concern the linear solver, the step computation, and the updating of the barrier parameter \(\beta \); cf. Vanderbei [38, 47].
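As an illustration, on hypothetical data and with a naive starting point and crude parameter updates rather than any production-grade safeguards, the following sketch carries out such path-following iterations: each pass assembles the Jacobian \(F^{\prime }(z)\), solves the Newton system, damps the step so that \(x\) and \(\lambda \) stay strictly positive, and then re-targets a smaller \(\beta \).

```python
import numpy as np

# Hypothetical canonical-form program (6): maximize <x*, x> s.t. Ax = y, x >= 0.
A  = np.array([[1.0, 1.0, 1.0, 0.0],
               [2.0, 1.0, 0.0, 1.0]])
y  = np.array([4.0, 6.0])
xs = np.array([3.0, 2.0, 0.0, 0.0])          # the price vector x*
m, n = A.shape

x     = np.ones(n)                           # strictly positive primal start
lam   = np.ones(n)                           # strictly positive multipliers
ystar = np.zeros(m)                          # free dual prices y*

for _ in range(30):
    beta = 0.2 * (lam @ x) / n               # aim at a fraction of current complementarity
    # Residual F(x, lambda, y*) of the primal-dual system.
    F = np.concatenate([A @ x - y,
                        xs + lam - A.T @ ystar,
                        lam*x - beta])
    # Jacobian F'(z), z = (x, lambda, y*), in block form.
    J = np.block([[A,                np.zeros((m, n)), np.zeros((m, m))],
                  [np.zeros((n, n)), np.eye(n),        -A.T            ],
                  [np.diag(lam),     np.diag(x),       np.zeros((n, m))]])
    dz = np.linalg.solve(J, -F)              # Newton direction
    dx, dlam, dystar = dz[:n], dz[n:2*n], dz[2*n:]

    step = 1.0                               # damp so that x and lambda stay positive
    while np.any(x + step*dx <= 0) or np.any(lam + step*dlam <= 0):
        step *= 0.5
    x, lam, ystar = x + 0.9*step*dx, lam + 0.9*step*dlam, ystar + 0.9*step*dystar

    if lam @ x < 1e-8:                       # complementarity (nearly) attained
        break

print("x =", x.round(6), "  lambda =", lam.round(6), "  y* =", ystar.round(6))
print("<x*, x> =", round(xs @ x, 6), "  <y*, y> =", round(ystar @ y, 6))
```

On this tiny instance the primal and dual values \(\left\langle x^{*},x\right\rangle \) and \(\left\langle y^{*},y\right\rangle \) approach a common limit as the complementarity product \(\left\langle \lambda ,x\right\rangle \) vanishes; the cited references treat the linear algebra and step control needed at scale.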

6 Concluding remarks

In applied economics, what are the most important or frequent problems? Paul Samuelson—also a Nobel laureate in economics—surmised that it is the efficient allocation of scarce resources towards competing ends. For those who agree, both in theory and in practice, optimization becomes indispensable. Especially so if optimization produces endogenous prices—alias dual variables—to guide economic decisions or valuations.

Similarly, in applied mathematics, what are the most important or frequent problems? Most practitioners would contend that these amount to solving an equation system—or equivalently, to finding a root of some specified function. For such purposes, Newton methods stand out as highly efficient. It is most appropriate, therefore, that such methods have come to occupy center stage in a large class of interior-point optimization techniques.

It is noteworthy that Sir Isaac Newton and Ragnar Frisch provided chief inputs to these endeavours. No less noteworthy was the confluence of economics with mathematics during the 1950s and 1960s. Central economists included Arrow, Debreu, Dorfman, Frisch, Hurwicz, Koopmans, Leontief, Samuelson, Solow and others. Among the important mathematicians were Dantzig, Gale, Kuhn, Nash, Shapley, Tucker and von Neumann—to name a few.

During that period, linear programming was a focal point—and a great source of inspiration. In his marvellous little book, [22] expressed the belief that linear economic models would become central to economists’ training. We regret that he was wrong. The subject has largely dropped out of the curriculum at most economics departments. Yet major economic issues fit within the frame of linear programming. Included are activity planning, complementarity, cooperative production games, financial hedging, input-output analysis, matching, security valuation, shadow pricing, substitution, the theory of comparative advantages, two-person zero-sum games, etc.—and a plethora of generic applications. Moreover, the field is one where the economic problem remains master.

No wonder that Frisch immediately saw the great potential. Yet it is more of a wonder that, after his time, so much of economics has taken a largely narrative turn.