1 Introduction

Transmission expansion in power markets may involve many players with different objectives. For instance, a system operator aims to improve the functioning of the power system, for example, by maximizing social welfare or improving the reliability of the network. Generation companies assess the effects of transmission expansions on their profits, since changes to the network topology involve changes to supply and demand. In this chapter, we take the perspective of a merchant transmission investor, i.e., a company that installs new transmission lines in order to profit from their use. We assume a power network with nodal prices, also known as locational marginal prices (LMPs) (Sorokin et al. 2012). As a producer sells its power at the node where it is located and at the local LMP, the flow of power to another node with a different LMP may generate profits for the owner of the line. The profits from using both existing and newly installed transmission lines consist of congestion rents defined by differences in nodal prices (Sorokin et al. 2012). In many cases, a transmission system operator (TSO) owns and operates the network and the profits translate into financial transmission rights (FTRs), which are sold in a secondary market or in an auction. The merchant-investor perspective on transmission investments is based on profits from long-term financial transmission rights (LTFTRs) offsetting the investment costs (Rosellón and Kristiansen 2013). Examples of this power market setup can be found in PJM, New York, California and New England (Kristiansen 2004).

As transmission expansions change the network topology, supply and demand are affected and the market adopts new LMPs. In particular, the installation of new transmission lines can connect producers to new nodes, in which the merit order and therefore the local LMP changes. To model this feedback mechanism between investment decisions and LMPs and the different objectives of the merchant and the market, we use bilevel programming. A bilevel programming problem (BPP) consists of an upper level and a lower level, often illustrated by the leader–follower paradigm (or Stackelberg game), in which a leader makes an upper-level decision while accounting for the reaction of one or more followers in the lower level. We consider the merchant investor as a leader making upper-level investment decisions while anticipating lower-level market-clearing. Our problem of long-term transmission expansions is static, but accounts for short-term dynamics of the power system, including market-clearing. Moreover, by including demand uncertainty, our problem becomes a two-stage stochastic program with recourse, with the first and second stages being the upper level and lower level, respectively.

A popular approach to solve a BPP is based on replacing the lower-level problem by its Karush-Kuhn-Tucker (KKT) optimality conditions, assuming these are sufficient (Dempe et al. 2015). The resulting problem is a mathematical program with equilibrium constraints (MPEC), for which solution approaches include reformulation by mixed-integer programming (MIP), non-linear methods or heuristics. In case the BPP has linear constraints and objectives, a widely applied method is linearization and reformulation by mixed-integer linear programming (MILP). In our case, the lower-level problem of the BPP is a linear program, meaning that the KKT-conditions are necessary and sufficient for optimality. Also, the upper-level problem has linear constraints. However, the upper-level objective function involves bilinear congestion rents, determined by products of LMPs (lower-level dual variables) and line flows (lower-level primal variables). These bilinear terms make the resulting MPEC non-linear and non-convex, and thus, difficult to solve to global optimality.

As an alternative to MILP and MPEC methods, we apply a solution approach for the merchant investor BPP that is based on parametric programming (Bylling et al. 2020). This method can solve a BPP with bilinear objective terms to global optimality. Furthermore, it facilitates decomposition with respect to both time periods and scenarios. In a numerical study, we illustrate its ability to solve the problem, even though standard solvers for non-linear MPECs fail.

The main objectives of this chapter are:

  • To formulate a bilevel programming problem for transmission expansion of a merchant investor.

  • To illustrate the application of parametric programming and its advantages for the transmission expansion problem.

  • To obtain numerical results for a case study of electricity investments in transmission lines.

The rest of this chapter is organized as follows: Sect. 2 provides an overview of the existing literature and positions this chapter within recent research. In Sect. 3, we present the bilevel programming problem of transmission expansion and Sect. 4 describes the parametric programming approach. Numerical results are provided in Sect. 5 and Sect. 6 concludes.

2 Literature Review

The existing literature includes a number of transmission expansion problems formulated as BPPs. For instance, Conejo et al. (2016) present a bilevel transmission and generation expansion problem with market-clearing in the lower level and profit maximization at the upper level. Similarly, Garcés et al. (2009) propose a bilevel problem of a transmission planner who minimizes network expansion costs in the upper level subject to market-clearing at the lower level. Baringo and Conejo (2012) likewise consider joint generation and transmission expansion, but with the objective of minimizing consumer payments when installing wind power units and the required network reinforcements. These references reformulate the bilevel problem as a mixed-integer linear program (MILP) via the KKT-conditions. In fact, although the upper-level objective function of Conejo et al. (2016) and Baringo and Conejo (2012) is bilinear, it can be linearized using the KKT-conditions and strong duality of the lower-level problem.

The perspective of a merchant transmission investor is proposed by Joskow and Tirole (2005). This view is taken by Maurovich-Horvat et al. (2014), who formulate a stochastic bilevel problem and compare transmission investments of a merchant investor and a TSO. Buijs and Belmans (2012) likewise present a bilevel transmission expansion problem and analyze different upper-level objectives, including the merchant’s. Rosellón and Kristiansen (2013) investigate a merchant mechanism for transmission expansion, using LTFTRs as an incentive to construct new lines. The resulting problem becomes an MPEC, which is solved via its KKT-conditions. Since the MPEC is non-convex, the KKT-conditions may not be sufficient for optimality, and thus, the solution may not be globally optimal.

We continue to consider a merchant perspective on transmission expansion. Unfortunately, to the best of our knowledge, the structure of our problem does not allow for linearization and reformulation by MILP. For example, our problem fails to satisfy the sufficient conditions for linearization by Bylling et al. (2020). Also, the above solution methods may not solve bilevel transmission expansion problems with a bilinear objective to global optimality. In contrast, we apply a new solution method that guarantees global optimality.

For other solution methods to BPPs, we refer to the reviews by Dempe et al. (2015) and Colson et al. (2007). For a review of transmission expansion problems in general, we refer to Hemmati et al. (2013).

3 The Bilevel Transmission Expansion Problem

This section presents the bilevel programming problem of merchant electricity transmission investments. Our problem consists of two levels: a lower-level market-clearing problem and an upper-level investment problem. A nomenclature is provided in Table 7 in the Appendix.

In the lower-level market-clearing problem, we assume a perfectly competitive power market, such that producers offer generation at their marginal production cost. By further assuming inelastic demand, market-clearing can be formulated as a linear cost minimization problem; cf. Gabriel et al. (2013). In our setup, market-clearing accounts for network flow, which is modeled using a DC load flow representation. To capture short-term dynamics, we consider a number of time periods, e.g., hours, for which the power market clears. To represent demand uncertainty, we assume a discrete distribution with a finite number of scenarios. For fixed upper-level decisions, the lower-level problem decomposes into a number of subproblems, one for each time period and each scenario. The lower-level subproblem of time period t and scenario s is the following:

$$\begin{aligned} \text {min}_{\mathbf {y}_{ts},\mathbf {p}_{ts},\varvec{\theta }_{ts}} \quad&\sum _{g\in \mathcal {G}} c_g y_{gts}\end{aligned}$$
(1a)
$$\begin{aligned} \text {s.t.} \quad&\sum _{g \in \mathcal {G}(i)} y_{gts} - \sum _{j \in \mathcal {I}(i)} p_{ijts} = d_{its} \ : \lambda _{its}, \quad i \in \mathcal {I} \end{aligned}$$
(1b)
$$\begin{aligned}&0 \le y_{gts} \le y_{g}^{max} \ :\mu ^{y}_{gts}, \quad g \in \mathcal {G}\end{aligned}$$
(1c)
$$\begin{aligned}&p_{ijts} = B_{i j} (\theta _{its} - \theta _{j ts}) \ : \mu ^p_{ijts}, \quad (i,j) \notin \mathcal {J}\end{aligned}$$
(1d)
$$\begin{aligned}&p_{ijts} = x_{ij} B_{i j} (\theta _{its} - \theta _{j ts}) \ : \mu ^{p,\mathcal {J}}_{ijts} ,\quad (i,j) \in \mathcal {J} \end{aligned}$$
(1e)
$$\begin{aligned}&-F^{max}_{ij} \le p_{ijts} \le F^{max}_{i j} \ :\mu ^{F,{\text {min}}}_{i j ts}, \mu ^{F,{max}}_{i j ts}, \quad (i,j) \notin \mathcal {J}\end{aligned}$$
(1f)
$$\begin{aligned}&-\mathcal {F}_{ij} \le p_{ijts} \le \mathcal {F}_{i j} \ :\mu ^{\mathcal {F},{\text {min}}}_{i j ts}, \mu ^{\mathcal {F},{max}}_{i j ts}, \quad (i,j) \in \mathcal {J}\end{aligned}$$
(1g)
$$\begin{aligned}&- \pi \le \theta _{its} \le \pi \ : \mu _{its}^{\theta ,{\text {min}}}, \mu _{its}^{\theta ,{max}}, \quad i \in \mathcal {I} \end{aligned}$$
(1h)
$$\begin{aligned}&\theta _{its} = 0 \ : \mu ^{\theta ,ref}_{ts}, \quad i = ref \end{aligned}$$
(1i)

where \(\mathbf {y}_{ts}=\{y_{gts}\}_{g\in \mathcal {G}}, \mathbf {p}_{ts}=\{p_{ijts}\}_{i,j\in \mathcal {I}}\) and \(\varvec{\theta }_{ts}=\{\theta _{its}\}_{i\in \mathcal {I}}\). The objective function minimizes production costs, while the constraints (1b) balance demand and supply at each node. The constraints (1c) limit power generation by existing capacity for each generating unit. Similarly, the constraints (1d) and (1e) define the power flow on existing and candidate lines, respectively, and (1f) and (1g) limit flow by existing and potential capacity. Potential capacity depends on whether a candidate line has been installed (\(x_{ij}=1\)) or not (\(x_{ij}=0\)), which is an upper-level decision fixed in the lower-level problem. Finally, the constraints (1h) restrict the voltage angle at each node and (1i) define the voltage angle for some reference node of the network to be zero.
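To make the flow definition (1d) and the role of the flows in the nodal balance (1b) concrete, the following minimal Python sketch computes DC line flows from given voltage angles and sums the net outflow at each node. The three-node network, its susceptances and angles, and the function names `dc_flows` and `net_injection` are hypothetical and serve only as an illustration; they are not data from the case study.

```python
def dc_flows(susceptance, theta):
    """Line flows p_ij = B_ij * (theta_i - theta_j), as in constraint (1d)."""
    return {(i, j): b * (theta[i] - theta[j])
            for (i, j), b in susceptance.items()}


def net_injection(flows, nodes):
    """Net power leaving each node: outgoing flow minus incoming flow.

    By the balance (1b), this equals local generation minus local demand.
    """
    out = {n: 0.0 for n in nodes}
    for (i, j), p in flows.items():
        out[i] += p
        out[j] -= p
    return out


# Hypothetical 3-node data; node "a" is the reference node with angle zero.
susceptance = {("a", "b"): 10.0, ("b", "c"): 10.0, ("a", "c"): 5.0}
theta = {"a": 0.0, "b": -0.02, "c": -0.05}

flows = dc_flows(susceptance, theta)
injection = net_injection(flows, theta.keys())
```

In this toy setting, node "a" exports power and node "c" imports it, and the net injections sum to zero because the DC representation is lossless.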

In the upper-level investment problem, the merchant maximizes profits, i.e., congestion rents less investment costs, subject to a total budget. The upper-level problem is as follows:

$$\begin{aligned} \text {max}_{\mathbf {x}, \varvec{\mathcal {F}}, \mathbf {p}, \varvec{\lambda }} \quad&\sum _{t\in \mathcal {T}} \rho _t \sum _{s \in \mathcal {S}} \phi _s \sum _{i,j \in \mathcal {I}: i<j} p_{ijts} \left( \lambda _{its} - \lambda _{jts} \right) \\ \nonumber&- \sum _{(i,j) \in \mathcal {J}}\left( K_{ij} x_{ij} + k_{ij} \mathcal {F}_{ij} \right) \end{aligned}$$
(2a)
$$\begin{aligned}&\text {s.t. } \quad 0 \le \sum _{(i,j) \in \mathcal {J}} \left( K_{ij} x_{ij} + k_{ij} \mathcal {F}_{ij}\right) \le K^{max} \end{aligned}$$
(2b)
$$\begin{aligned}&0 \le \mathcal {F}_{ij} \le x_{ij} \mathcal {F}_{ij}^{max} , \quad (i,j) \in \mathcal {J}\end{aligned}$$
(2c)
$$\begin{aligned}&x_{ij}\in \{0,1\}, \quad (i,j) \in \mathcal {J}\end{aligned}$$
(2d)
$$\begin{aligned}&\mathbf{p} _{ts} \text { is a primal optimal solution to (1)} ,\quad t \in \mathcal {T}, s \in \mathcal {S}\end{aligned}$$
(2e)
$$\begin{aligned}&\quad \varvec{\lambda }_{ts} \text { is a dual optimal solution to (1)}, \quad t \in \mathcal {T}, s \in \mathcal {S} \end{aligned}$$
(2f)

where \(\mathbf{x} =\{x_{ij}\}_{i,j\in \mathcal {I}}, \varvec{\mathcal {F}}=\{\mathcal {F}_{ij}\}_{i,j\in \mathcal {I}}, \mathbf {p}=\{p_{ijts}\}_{i,j\in \mathcal {I}, t\in \mathcal {T}, s\in \mathcal {S}}\) and \(\varvec{\lambda }=\{\lambda _{ts}\}_{t\in \mathcal {T}, s\in \mathcal {S}}\). The objective function maximizes profits from installation of new lines. Profits consist of accumulated hourly congestion rents, determined by the differences between nodal market prices, less fixed and variable investment costs. Constraints (2b) ensure compliance with the investment budget and the constraints (2c) limit the maximum capacity installed at each line.

4 The Parametric Programming Method

By replacing the lower-level problem of the BPP by its Karush-Kuhn-Tucker (KKT) optimality conditions, the resulting problem is a mathematical program with equilibrium constraints (MPEC). The bilinear term of the upper-level objective function makes the objective function of the MPEC non-linear. To the best of our knowledge, it is not possible to linearize this bilinear term and the problem can only be solved to local optimality by non-linear methods.

Instead, we propose a solution approach for the BPP based on parametric programming. The approach applies to a linearly constrained BPP with continuous variables at both levels, and thus, does not directly apply to the transmission expansion problem with binary variables in the upper level. For a limited number of candidate lines, however, the number of binary solutions is moderate (for \(|\mathcal {J}|\) candidate lines, the number of solutions is \(2^{|\mathcal {J}|}\)). We therefore use the parametric programming approach in combination with complete enumeration of the binary solutions. Our method has the advantage that it solves the bilevel problem with bilinear objective to global optimality.

In Sect. 4.1 we present the parametric programming method for a BPP with only continuous variables and in Sect. 4.2, we briefly explain the enumeration of binary solutions.

4.1 Continuous Upper Level

In this section, we fix the binary decisions \(\mathbf {x}\in \{0,1\}^{|\mathcal {J}|}\) to install candidate lines or not and consider only the continuous line capacities \(\varvec{\mathcal {F}}\in \mathbb {R}^{|\mathcal {J}| }\) as upper-level decision variables.

We define the upper-level feasibility set \(S\subseteq \mathbb {R}^{|\mathcal {J}| }\) as the set of upper-level solutions that satisfy the upper-level constraints (2b) and (2c) and that render the lower-level problem (1) feasible.

The idea behind the parametric programming method is to parameterize the lower-level primal and dual optimal solutions by the upper-level feasible solutions, i.e.,

$$\begin{aligned} \mathbf {p} (\varvec{\mathcal {F}}) \quad \text {and} \quad \varvec{\lambda } (\varvec{\mathcal {F}}),\quad \varvec{\mathcal {F}}\in S, \end{aligned}$$
(3)

such that the upper-level objective function can be expressed in terms of upper-level variables only.

To inspect the optimal solutions to the lower-level problem, let the upper-level solution \(\varvec{\mathcal {F}}\in S\) be fixed and let B be a basis for the lower-level linear programming problem, i.e., a set of linearly independent columns of the constraint matrix. We consider the corresponding basic solution to the lower-level problem, in which the variables corresponding to columns of the basis are called basic variables, and the remaining variables are called non-basic and equal zero.

The following definition stems from parametric programming (Gal 1995).

Definition 1

The critical region \(\Lambda _B \subseteq S\) corresponding to the basis B is the set of upper-level feasible solutions for which the corresponding basic solution is optimal in the lower-level problem.

It can be shown that a critical region is a polyhedron; cf. Gal (1995).

On each critical region, we can characterize the upper-level objective function in terms of upper-level variables only. This result follows from Bylling et al. (2020).

Proposition 1

Let \(\Lambda _B\) be the critical region corresponding to the basis B. Then, for all \(i,j,t,s\), the bilinear term \(p_{ijts}(\varvec{\mathcal {F}})(\lambda _{its}(\varvec{\mathcal {F}})-\lambda _{jts}(\varvec{\mathcal {F}}))\) is an affine function of \(\varvec{\mathcal {F}}\) on the interior of \(\Lambda _B\).

In other words, the upper-level objective function is a piece-wise linear (but not necessarily continuous) function which is affine on each critical region. It is easy to determine the gradient of the affine functions, see Bylling et al. (2020). With the gradient and a function value of the upper-level objective function for each critical region, we can obtain an explicit expression for the upper-level objective function. Furthermore, with an affine objective function and a polyhedral feasibility set, the restriction of the BPP to a single critical region is a linear programming problem. We use this to solve the BPP.
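As a small illustration of this piecewise structure, consider a hypothetical two-node example that is not part of the case study: a single candidate line of capacity \(\mathcal {F}\) connects a node with ample generation at marginal cost \(c_1\) to a node with demand \(d>0\) and ample local generation at marginal cost \(c_2>c_1\), and the angle bounds (1h) are assumed not to bind. Taking the rent as the flow multiplied by the price at the receiving node less the price at the sending node, the rent \(r(\mathcal {F})\) collected on the line is

$$\begin{aligned} r(\mathcal {F}) = {\left\{ \begin{array}{ll} \mathcal {F}\,(c_2-c_1), &{} 0 \le \mathcal {F} < d, \\ 0, &{} \mathcal {F} > d. \end{array}\right. } \end{aligned}$$

For \(\mathcal {F}<d\), the line is congested, the flow equals \(\mathcal {F}\) and the nodal prices are \(c_1\) and \(c_2\). For \(\mathcal {F}>d\), the flow equals \(d\) and both nodal prices equal \(c_1\). Each piece is affine in \(\mathcal {F}\), and the function jumps at \(\mathcal {F}=d\), the common boundary of the two critical regions, where the lower-level dual solution is not unique.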

Our strategy is to find a cover of the upper-level feasibility set by critical regions, i.e., a set of bases \(\mathcal {B}\) such that

$$\begin{aligned} S=\bigcup _{B\in \mathcal {B}} \Lambda _B, \end{aligned}$$
(4)

to solve the restricted problems for all critical regions in the cover and finally obtain a global optimal solution by simply comparing candidate solutions.

To find a cover of S by critical regions, we define neighboring critical regions as follows, cf. Gal (1995).

Definition 2

Two critical regions, \(\Lambda _1,\Lambda _2\), are neighbors if the following holds for their corresponding bases \(B_1,B_2\):

  1. There exists an \(\varvec{\mathcal {F}} \in S\) for which \(B_1\) and \(B_2\) are both optimal bases to (1).

  2. It is possible to pass from \(B_1\) to \(B_2\) in one iteration of the dual simplex method.

By Gal (1995), the union of all neighboring critical regions forms a cover of S. Thus, it is unnecessary to consider all possible bases of the lower-level problem. Neighboring critical regions are obtained by the following algorithm by Gal (1995), based on dual simplex.

Algorithm 1

Parametric programming algorithm  

Step 0:

(initialization) Set \(h:=0\). Given an initial upper-level solution, solve the lower-level problem (1). Store an optimal basis \(B_0\) and set \(\mathcal {B}:=\{B_0\}\).

Step 1:

(iteration h) If \(\mathcal {B}=\emptyset \), then stop. Otherwise, set \(h:=h+1\), select \(B_h \in \mathcal {B}\) and set \(\mathcal {B}:=\mathcal {B}\setminus \{B_h\}\).

Step 2:

(determine leaving variable) Let \(B:=B_h\). Select a basic variable that has not yet been inspected and determine if a neighbor exists. If not, return to Step 2. If all basic variables have been inspected, return to Step 1.

Step 3:

(determine entering variable) Carry out an iteration of the dual simplex method with the basic variable as the leaving variable. Store a neighboring basis \(B_j\) and set \(\mathcal {B}:=\mathcal {B}\ \cup \{B_j\}\). Return to Step 1.

 

For further details on the parametric programming approach, see Bylling et al. (2020).
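The bookkeeping of Algorithm 1 can be sketched as a worklist search over bases. The following Python fragment is a simplified illustration, not the implementation used in this chapter: the callback `neighbors`, which would carry out the dual simplex iterations of Steps 2 and 3, is a hypothetical placeholder, and the explicit `seen` set that prevents a basis from being processed twice is an assumption added here.

```python
from collections import deque


def explore_bases(initial_basis, neighbors):
    """Collect every basis reachable from initial_basis via the neighbor oracle.

    neighbors(basis) is a hypothetical callback returning the bases that can
    be reached in one dual simplex iteration; bases must be hashable.
    """
    seen = {initial_basis}
    queue = deque([initial_basis])   # bases still to be processed
    order = []
    while queue:                     # Step 1: stop when nothing is left
        basis = queue.popleft()
        order.append(basis)
        for nb in neighbors(basis):  # Steps 2-3: inspect each candidate pivot
            if nb not in seen:       # store only bases not met before
                seen.add(nb)
                queue.append(nb)
    return order


# Toy neighbor structure standing in for the dual simplex pivots.
toy = {"B0": ["B1", "B2"], "B1": ["B0", "B3"], "B2": ["B0"], "B3": ["B1"]}
cover = explore_bases("B0", lambda b: toy[b])
```

Each basis in the returned list corresponds to one critical region, so the list plays the role of the cover in (4) for this toy structure.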

4.1.1 Decomposition

For fixed upper-level decisions, the lower-level problem of the BPP decomposes into a number of subproblems, one for each time period and each scenario. We refer to the BPP with one time period and one scenario as a BPP subproblem. We process the subproblems individually, which allows for parallel computations and is likely to provide computational advantages.

By processing a BPP subproblem, we obtain neighboring critical regions for one time period and scenario. By processing all subproblems, the union of all critical regions forms a cover of S. Observe that an optimal solution to the restricted BPP can be found at a vertex of the critical region. Unfortunately, an optimal solution to the BPP may not be found among the optimal solutions to the restricted BPP subproblems. However, the vertices of the critical region must be found among the vertices of the critical regions obtained for one time period and scenario at a time. Thus, to find an optimal solution to the BPP, we enumerate and evaluate all vertices of the critical regions of the BPP subproblems. This provides us with a global optimal solution. For vertex enumeration, we use the procedure of Avis and Fukuda (1996).

The solution algorithm is as follows:

Algorithm 2

Decomposition  

Step 1:

(parametric programming) Apply the parametric programming Algorithm 1 to the BPP subproblems.

Step 2:

(vertex enumeration) Use vertex enumeration for each of the critical regions obtained in Step 1.

Step 3:

(comparison) Collect all solutions from Step 2 and evaluate their upper-level objective function values.
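The three steps can be illustrated on a one-dimensional toy problem. The sketch below assumes a hypothetical two-node system with one candidate line of capacity F, a cheap exporting node, an expensive importing node with demand d in each period, and no scenarios. Under these assumptions, the critical regions of each period are the intervals [0, d] and [d, F_max], so vertex enumeration reduces to listing breakpoints. All names and numbers are illustrative, and at a breakpoint the most favourable dual solution is assumed.

```python
def rent(F, d, c_low, c_high):
    """Congestion rent of one period in the hypothetical two-node system.

    The line is congested while F <= d (the boundary case uses the most
    favourable dual solution); otherwise prices equalise and the rent is 0.
    """
    return F * (c_high - c_low) if F <= d else 0.0


def profit(F, demands, weights, c_low, c_high, k):
    """Weighted rents over all periods minus the variable investment cost k*F."""
    total = sum(w * rent(F, d, c_low, c_high)
                for d, w in zip(demands, weights))
    return total - k * F


def solve_by_vertices(demands, weights, c_low, c_high, k, F_max):
    # Steps 1-2: per period, the critical regions are [0, d] and [d, F_max],
    # so their vertices are 0, d (if within range) and F_max.
    vertices = {0.0, F_max} | {d for d in demands if d <= F_max}
    # Step 3: evaluate the full objective at every vertex and keep the best.
    return max((profit(F, demands, weights, c_low, c_high, k), F)
               for F in vertices)


best_profit, best_F = solve_by_vertices(
    demands=[100.0, 300.0], weights=[1.0, 1.0],
    c_low=10.0, c_high=30.0, k=5.0, F_max=500.0)
```

With these numbers the best capacity is the breakpoint of the second period, although a capacity of 100 would be optimal if only the first period were considered. This is why every vertex is evaluated against the full objective rather than against its own subproblem.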

 

As an alternative to Algorithm 2, we also propose a heuristic that omits the computationally costly vertex enumeration. Instead, Step 2 of the heuristic obtains optimal solutions to the restricted BPP subproblems.

The heuristic can be summarized as:

Algorithm 3

Heuristic  

Step 1:

(parametric programming) Apply the parametric programming Algorithm 1 to the BPP subproblems.

Step 2:

(restricted optimization) Solve the BPP subproblems restricted to each of the corresponding critical regions obtained in Step 1.

Step 3:

(comparison) Collect all solutions from Step 2 and evaluate their upper-level objective function values.

 

4.2 Binary Upper Level

This section outlines the combination of the parametric programming approach and complete enumeration. We simply iterate through the upper-level, binary solutions, i.e., all potential configurations of the network. For fixed binary decisions to install candidate lines or not, \(\mathbf {x}\in \{0,1\}^{|\mathcal {J}|}\), we apply parametric programming.

The procedure is as follows:

Algorithm 4

Enumeration  

Step 1:

(enumeration) Enumerate all binary solutions, \(\mathbf {x}\).

Step 2:

(parametric programming) For each solution, solve the BPP using Algorithm 1, Algorithm 2 or the heuristic (Algorithm 3).

Step 3:

(comparison) Collect all solutions from Step 2 and their upper-level objective function values.
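A minimal Python sketch of this enumeration is given below. The callback `solve_continuous`, which stands for the parametric programming step with fixed binary decisions, and the toy payoff values are hypothetical and only serve to make the example runnable.

```python
from itertools import product


def enumerate_configurations(candidate_lines, solve_continuous):
    """Try every 0/1 installation pattern and keep the most profitable one.

    solve_continuous(x) is a hypothetical callback that solves the BPP for
    fixed binary decisions x and returns (profit, capacities).
    """
    best = None
    for bits in product((0, 1), repeat=len(candidate_lines)):
        x = dict(zip(candidate_lines, bits))
        value, capacities = solve_continuous(x)
        if best is None or value > best[0]:
            best = (value, x, capacities)
    return best


# Toy stand-in for the continuous solver: each installed line contributes a
# fixed hypothetical amount to profit (illustrative values only).
gain = {("N", "DK1"): 5.0, ("N", "DK2"): 3.0, ("SE", "DK1"): -1.0}


def toy_solver(x):
    return sum(gain[l] for l in x if x[l]), {l: float(x[l]) for l in x}


lines = list(gain)
best_value, best_x, _ = enumerate_configurations(lines, toy_solver)
```

For three candidate lines the loop visits 2^3 = 8 configurations, and the calls to the continuous solver are independent of each other.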

 

4.3 Non-linear Programming

As benchmarks, we also implement a non-linear MPEC formulation and a mixed-integer non-linear programming (MINLP) formulation of the problem. These can be solved using standard software, with the upper-level variables \(\mathbf {x}\) defined as binary. Since the MPEC and MINLP are non-convex, however, we can only obtain local optimality.

The MPEC formulation is derived by replacing the lower-level problem, (1), by the necessary and sufficient KKT-conditions. This formulation is:

$$\begin{aligned} \max&\quad (2a) \end{aligned}$$
(5a)
$$\begin{aligned} \text {s.t.}&\quad (2b)-(2d) \end{aligned}$$
(5b)
$$\begin{aligned}&\quad (1b)-(1i) \end{aligned}$$
(5c)
$$\begin{aligned}&\quad c_g - \lambda _{its} + \mu _{gts}^y \ge 0 \quad \forall g,t,s \end{aligned}$$
(5d)
$$\begin{aligned}&\quad \lambda _{its} - \mu _{ijts}^p - \mu _{ijts}^{F,\text {min}} + \mu _{ijts}^{F,\text {max}} = 0 \quad \forall t,s,(i,j) \notin \mathcal {J}\end{aligned}$$
(5e)
$$\begin{aligned}&\quad \lambda _{its} - \mu _{ijts}^p - \mu _{ijts}^{\mathcal {F},\text {min}} + \mu _{ijts}^{\mathcal {F},\text {max}} = 0 \quad \forall t,s,(i,j) \in \mathcal {J}\end{aligned}$$
(5f)
$$\begin{aligned}&\quad - \mu _{its}^{\theta ,\text {min}} + \mu _{its}^{\theta ,\text {max}} = 0 \quad \forall t,s,i \ne \text {ref.} \end{aligned}$$
(5g)
$$\begin{aligned}&\quad - \mu _{its}^{\theta ,\text {min}} + \mu _{its}^{\theta ,\text {max}} + \mu _{ts}^{\theta ,\text {ref}} = 0 \quad \forall t,s,i = \text {ref.} \end{aligned}$$
(5h)
$$\begin{aligned}&\quad y_{gts} \mu _{gts}^y = 0 \quad \forall g,t,s \end{aligned}$$
(5i)
$$\begin{aligned}&\quad (p_{ijts} + F^{\text {max}}_{ij}) \mu _{ijts}^{F,\text {min}} = 0 \quad \forall t,s,(i,j) \notin \mathcal {J}\end{aligned}$$
(5j)
$$\begin{aligned}&\quad (F^{\text {max}}_{ij} - p_{ijts} ) \mu _{ijts}^{F,\text {max}} = 0 \quad \forall t,s,(i,j) \notin \mathcal {J}\end{aligned}$$
(5k)
$$\begin{aligned}&\quad (p_{ijts} + \mathcal {F}_{ij}) \mu _{ijts}^{\mathcal {F},\text {min}} = 0 \quad \forall t,s,(i,j) \in \mathcal {J}\end{aligned}$$
(5l)
$$\begin{aligned}&\quad (\mathcal {F}_{ij} - p_{ijts} ) \mu _{ijts}^{\mathcal {F},\text {max}} = 0 \quad \forall t,s,(i,j) \in \mathcal {J}\end{aligned}$$
(5m)
$$\begin{aligned}&\quad (\theta _{its} + \pi ) \mu _{its}^{\theta ,\text {min}} = 0 \quad \forall i,t,s \end{aligned}$$
(5n)
$$\begin{aligned}&\quad (\pi - \theta _{its} ) \mu _{its}^{\theta ,\text {max}} = 0 \quad \forall i,t,s \end{aligned}$$
(5o)
$$\begin{aligned}&\mu ^y_{gts},\mu _{ijts}^{F,\text {min}},\mu _{ijts}^{F,\text {max}},\mu _{ijts}^{\mathcal {F},\text {min}},\mu _{ijts}^{\mathcal {F},\text {max}},\mu _{its}^{\theta ,\text {min}},\mu _{its}^{\theta ,\text {max}} \ge 0. \end{aligned}$$
(5p)

A challenge for standard solvers is that all feasible points of the MPEC are non-regular, i.e., the gradients of the binding constraints are linearly dependent. As a result, most non-linear optimization solvers fail to obtain even a locally optimal solution. A way to overcome the non-regularity is the regularization approach of Scholtes (2001) and Ralph and Wright (2004). Using this approach, the equality constraints of complementary slackness are replaced by inequalities and the infeasibility gap is iteratively reduced. With inequality constraints, the MPEC is regular.
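As an illustration of this regularization, consider the complementarity condition (5k). With a relaxation parameter \(\varepsilon >0\), a symbol introduced here for illustration, the equality is replaced by

$$\begin{aligned} (F^{\text {max}}_{ij} - p_{ijts} )\, \mu _{ijts}^{F,\text {max}} \le \varepsilon \quad \forall t,s,(i,j) \notin \mathcal {J}, \end{aligned}$$

while the flow limits (1f) and the non-negativity conditions (5p) are retained. The relaxed problem is then re-solved for a decreasing sequence of values of \(\varepsilon \) approaching zero, and the remaining conditions (5i)–(5o) are treated in the same way.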

Alternatively, the complementary slackness constraints can be linearized using disjunctive constraints. Disjunctive constraints introduce a binary variable for each complementary slackness constraint, i.e., the constraints (5i)–(5o), and a large constant. The binary variable ensures that the two factors of the product cannot both be non-zero. The constant, usually denoted by M, has to be sufficiently large not to cut off any feasible solutions. At the same time, it must be sufficiently small not to create computational difficulties; see Pineda et al. (2017) for more details. The resulting problem is a mixed-integer problem but remains non-linear due to the bilinear term in the objective function, i.e., it is an MINLP. Usually, such problems can only be solved to local optimality.
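The disjunctive reformulation can be sketched for the condition (5k) as follows, where the binary variable \(u_{ijts}\) and the constant \(M\) are symbols introduced here for illustration:

$$\begin{aligned} F^{\text {max}}_{ij} - p_{ijts} \le M u_{ijts}, \quad \mu _{ijts}^{F,\text {max}} \le M (1-u_{ijts}), \quad u_{ijts} \in \{0,1\} \quad \forall t,s,(i,j) \notin \mathcal {J}. \end{aligned}$$

Since \(F^{\text {max}}_{ij} - p_{ijts} \ge 0\) by (1f) and \(\mu _{ijts}^{F,\text {max}} \ge 0\) by (5p), the choice \(u_{ijts}=0\) forces the flow to its upper limit, whereas \(u_{ijts}=1\) forces the multiplier to zero. The other complementary slackness conditions can be handled analogously.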

Since the above are standard methods, we use them as benchmarks for the parametric programming methods. To the best of our knowledge, no other existing methods can solve this problem to global optimality.

5 Numerical Results

We present a case study of transmission expansion in the Nordic region, with 4 nodes representing Norway, Sweden, and the two Danish pricing regions: DK1 as Western Denmark and DK2 as Eastern Denmark; cf. Nord Pool AS (2017).

5.1 Data

We assume that three DC cables are already in place: one connecting the two Danish pricing regions, one connecting the Eastern Danish pricing region and Sweden, and one connecting Sweden and Norway. The existing cables each have a capacity of 1000 MW. Three additional DC cables can be installed, providing connections where none exist yet. These are the cables \((N,\textit{DK}1),(N,\textit{DK}2)\) and \((\textit{SE},\textit{DK}1)\), see Fig. 1. The topology of the network does not match the actual one, but is chosen for the purpose of illustration. Variable investment costs of each candidate line are assumed to be 20,000 DKK/MW, whereas we disregard fixed investment costs. We likewise disregard the budget and limitations for installed capacities of candidate transmission lines.

Fig. 1 Network topology. Solid lines represent existing lines, dashed lines represent candidate lines

Hourly demand data at each node is available from Nord Pool AS (2017) and we select the year 2015. This data is clustered into a number of representative hours using k-means clustering (Hartigan and Wong 1979). We obtain results for different numbers of representative hours. For simplicity, we disregard demand uncertainty.
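To illustrate how hourly demand vectors can be condensed into representative hours, the sketch below implements a plain Lloyd-style k-means iteration in Python. It is a stand-in for, not a reproduction of, the Hartigan and Wong (1979) algorithm, and the demand values and the function name `kmeans` are hypothetical. Each hour is a tuple with one demand value per node, and the cluster sizes can be used as weights of the representative hours.

```python
def kmeans(points, k, iterations=100):
    """Plain Lloyd-style k-means on tuples of floats (a stand-in, see text).

    Returns (centroids, weights), where weights[c] counts the hours assigned
    to centroid c in the final assignment step.
    """
    # Deterministic, spread-out initial centroids taken from the data.
    centroids = [points[i * len(points) // k] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            dist = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[dist.index(min(dist))].append(p)
        # New centroid = mean of its cluster; empty clusters keep their centroid.
        updated = [tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else c
                   for cl, c in zip(clusters, centroids)]
        if updated == centroids:   # centroids are stable, so stop
            break
        centroids = updated
    return centroids, [len(cl) for cl in clusters]


# Hypothetical hourly demand (MW) at two nodes; not Nord Pool data.
hours = [(100.0, 50.0), (102.0, 52.0), (98.0, 48.0),
         (200.0, 90.0), (198.0, 92.0), (202.0, 88.0)]
centers, weights = kmeans(hours, k=2)
```

Here the six hours collapse into two representative hours, each standing for three original hours.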

Generation capacities and costs for DK1 and DK2 are obtained from Energianalyse (2014), which divides generation into centralized and decentralized units. Generation capacities are adjusted to the Norwegian and Swedish nodes by considering historical production data. As opposed to Denmark, Norway and Sweden have considerable amounts of hydropower, which is reflected in the lower production costs of the centralized plants. The generation capacities and production costs are shown in Table 1.

Table 1 Generation capacities and production costs

5.2 Implementation

The parametric programming approach to decomposition and the heuristic have been implemented in R using the interfaces by Berkelaar (2015) to solve LPs and Robere (2015) for vertex enumeration. The software is open source and free. The MPEC and MINLP have been implemented in GAMS (2017) and solved using the DICOPT solver. All problems are solved on an HP ProLiant server with 4 AMD 2.50 GHz CPUs and with a total of 64 cores and 256 GB of RAM.

5.3 Optimal Investments

For the most detailed case with 1000 representative hours, an optimal solution is given in Table 2.

Table 2 Investment decisions and capacities in candidate lines

As the table shows, investments are made in all candidate lines, with maximum capacity on (N,DK1) and (N,DK2). We use this solution as a benchmark.

The investments in all candidate lines are justified by total congestion rents offsetting the investment costs. In fact, the transmission of power and differences in nodal prices generate significant revenues for the merchant investor. We explain this as follows.

Since the costs of centralized generation are significantly lower than those of decentralized generation, demand is satisfied by central production unless generation capacity is binding. As the production costs of Norway and Sweden are lower than those of the Danish nodes, demand of all nodes is satisfied by central production in Norway and Sweden, using both existing and newly constructed transmission lines. Thus, power is transmitted from Norway and Sweden to Denmark unless transmission capacities are binding, i.e., congestion occurs. As a result, the nodal prices are determined by the marginal costs of centralized Norwegian and Swedish generation in many of the representative hours. When congestion occurs, however, market prices of the Danish nodes are higher than for Norway and Sweden.

Average nodal prices are given in Table 3. As expected, average prices are higher for the Danish nodes than for the Norwegian and Swedish nodes, but the same for both of the Danish nodes.

Table 3 Average prices at the four nodes in DKK/MWh

In Table 4, we list the number of hours (out of the 1000 representative hours) for which the transmission lines are congested. Furthermore, the direction of the power flow is indicated by the number of hours with positive and negative flow. We note that power always flows into the Danish nodes from the (N,DK1), (SE,DK1) and (SE,DK2) lines, clearly confirming the relatively low-cost generation from Norway and Sweden supplied to the Danish market. The (N,SE) and (N,DK1) lines mainly have flow from Norway to DK1 and Sweden (in 981 and 990 out of 1000 hours, respectively). All but the (DK1,DK2) line have power flow during all hours and the (DK1,DK2) line only has 3 out of 1000 hours without flow of power. Thus, the market exploits the network at all times.

Table 4 Number of hours (out of 1000) with congested lines, positive flow and negative flow for each transmission line

As can be seen, the line connecting Sweden and DK1 is always congested, meaning the merchant investor collects congestion rents in all hours. Also, the transmission lines connecting Norway and Sweden, Norway and DK1, Norway and DK2, and Sweden and DK2 are almost always congested (between 812 and 997 hours out of the 1000 representative hours). The only line that is never congested is the one connecting the two Danish regions, DK1 and DK2.

5.4 Comparison of Solution Methods

We apply two solution methods based on parametric programming: the parametric programming approach to decomposition (Decomp.), which guarantees global optimality, and the parametric programming heuristic (Heuristic). We compare these with three non-linear programming methods: a standard MPEC solver, a regularization approach (reg. MPEC), and a reformulation by disjunctive constraints (MINLP). We solve the BPP with all these methods, varying the number of representative hours in steps of 10 from 10 to 100 and in steps of 100 from 100 to 1000, which gives a total of 19 problem instances of increasing size.
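As a quick check of the instance count, the grid of problem sizes can be enumerated as follows. The variable name is hypothetical; the size 100 appears in both ranges of the description and is counted once here.

```python
# Sizes 10, 20, ..., 100 followed by 200, 300, ..., 1000 (100 is counted once).
instance_sizes = list(range(10, 101, 10)) + list(range(200, 1001, 100))

number_of_instances = len(instance_sizes)  # 10 small sizes plus 9 large sizes
```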

The standard MPEC solver returned local infeasibility for all instances, and thus, we do not report further results for this solution method. The MINLP method likewise did not provide any results, with the solver reporting that the search stopped as the objective function of the NLP subproblems started to deteriorate. While the regularization approach returned locally optimal solutions for all 19 instances, all these solutions had \(x_{ij}=0\) and \(\mathcal {F}_{ij}=0\) for all \((i,j)\in \mathcal {J}\), i.e., no investments were made. Such a solution has an optimality gap of 99% and is of no practical use.

To compare the solutions of the decomposition approach and the heuristic, we report the investment capacities of the three candidate lines in Table 5. We see that the two solution methods agree in 14 out of 19 cases, as also indicated by the zero optimality gap in Table 6. For both methods, investments are made in lines (N,DK1) and (N,DK2) at maximum capacity in all but one instance (50 representative days). The investment in line (SE,DK1) is of a smaller capacity, although in many instances (14 and 11 out of 19 for the decomposition approach and the heuristic, respectively), some investment is profitable. In fact, a small capacity is enough to create congestion and generate some revenue. For the larger instances, however, the heuristic fails to capture small investments, which results in a significant optimality gap. In particular, for 600–1000 representative days, the exact approach suggests investment in line (SE,DK1), whereas in four out of five instances, the heuristic does not.

Table 5 Investment decisions from the two solution methods, the decomposition approach and the heuristic

Table 6 provides the solution times of the exact parametric programming approach and the heuristic as well as their differences in objective function values, i.e., optimality gaps. For more than 500 representative days, the optimality gaps produced by the heuristic vary from 0.1% to 10.1%. When the number of representative days is 500 or lower, the heuristic obtains an optimal solution. For instances with 100 representative days or fewer, the heuristic obtains an optimal solution 15–70 times as fast as the decomposition approach. For problems with 200 representative days or more, the heuristic maintains lower solution times for almost all instances, but only by a factor between 4 and 7. While the heuristic provides no guarantees of optimality, our case study suggests that it works very well for small to moderately sized bilevel problems. Furthermore, it solves even large problems relatively fast and provides solutions within an optimality gap of about 10%. Its main disadvantage is that the solutions may be structurally different from the optimal ones, and thus, this method may be better suited for cost assessments than for investment planning.
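The two comparison metrics can be sketched as below. The chapter does not spell out its gap formula in this section, so the relative definition used here (shortfall of the heuristic objective relative to the exact optimum of a maximization problem) is an assumption, as are the function names and the numbers in the example.

```python
def optimality_gap(exact_objective, heuristic_objective):
    """Relative shortfall of the heuristic objective versus the exact optimum.

    Assumes a maximization problem and a nonzero exact optimum; this
    definition is an illustrative assumption, not taken from the chapter.
    """
    if exact_objective == 0:
        raise ValueError("relative gap is undefined for a zero exact objective")
    return (exact_objective - heuristic_objective) / abs(exact_objective)


def speedup(exact_seconds, heuristic_seconds):
    """How many times faster the heuristic is than the exact method."""
    return exact_seconds / heuristic_seconds


# Made-up values for one instance: profits in arbitrary units, times in seconds.
gap = optimality_gap(exact_objective=1000.0, heuristic_objective=899.0)
factor = speedup(exact_seconds=600.0, heuristic_seconds=20.0)
```

With these made-up inputs the gap is roughly 10.1% and the heuristic is 30 times as fast, which falls in the ranges discussed above but does not reproduce any row of Table 6.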

Table 6 Solution times and optimality gaps for the two solution methods, the decomposition approach and the heuristic

Table 7 Nomenclature

6 Conclusion

This chapter adopts a merchant investor perspective on transmission expansion. Investment is incentivized by the merchant collecting congestion rents on installed transmission lines. We formulate a bilevel programming problem in which investment decisions are made in an upper level and in anticipation of lower-level market-clearing. With the inclusion of congestion rents, the formulation involves a bilinear revenue term in the upper-level objective function. This makes the problem difficult to solve to global optimality by standard approaches, such as MPEC or MILP reformulations.

Instead, we apply an exact algorithm based on parametric programming that solves the bilinear bilevel programming problem to global optimality. Furthermore, it allows for decomposition of the lower-level problem and thereby has the potential to provide computational advantages. We also present a faster but heuristic version of the algorithm.

We illustrate the problem and the solution methods on a case study of transmission investment in the Nordic region. The numerical results indicate that it is profitable to be a merchant investor in an electricity network. The parametric programming approach is able to solve problem instances with up to 1000 representative days within 4.5 hours, while the heuristic terminates in 1.2 hours and with an optimality gap of up to 10%. The heuristic found the optimal solution in 14 out of 19 instances, namely the small and moderately sized ones, with significantly lower solution times than the parametric approach. For large instances, however, the structure of the solutions produced by the heuristic often differs from that of the optimal solution.