1 Introduction

Recent years have witnessed enormous growth in investment in renewable energy generation. A description of the recent history of this development under incentive schemes in the United States and Europe is given in [2]. Wind and solar power have zero short-run marginal cost of production, and if load factors are high enough, or capital costs can be reduced by taking advantage of incentives, they provide an inexpensive and environmentally acceptable generation technology. A number of authors [1, 12, 13] have commented on the extra costs that must be borne by a system with large amounts of wind or solar power. Some of these costs relate to the variability of power supply, which affects system frequency and voltage and requires extra equipment to be installed. In addition, electricity systems that require reliable sources of power must have either storage or reserve capacity available in case the wind does not blow or the sun does not shine. Although the distinction is not precise, the reserve capacity needed to meet this intermittency problem typically has a slower response time than the frequency-keeping or instantaneous reserve required to deal with variability and outages. Most authors focus on variability and capture this aspect through time dynamics (see e.g. [11, 12]) without allowing for intermittency. Wogrin et al. [18] address the variability problem introduced by renewables through the concept of system states, which takes the place of a discretized load duration curve.

In this paper we derive a model for determining the optimal investment in conventional generating plant to provide the required levels of reserve to deal with an intermittent supply of renewable power. The model we describe seeks an optimal mixture of peaking plant and plant with limited ramping ability (such as combined-cycle gas turbines) to provide this reserve. Limited ramping plant will be cheaper to run than peaking plant. By scheduling the start-up and shut-down times of such plant it is possible to use less peaking plant to cover periods when renewable power becomes unavailable. Our model seeks to represent this flexibility in an optimal investment plan.

The paper assumes a central planning paradigm that seeks the optimal level of investment to minimize total capital and expected fuel cost incurred by a social planner. Our aim is to understand the differences in investments that are made when the flexibility of generating plant is accounted for. While our model would pertain to a perfectly competitive electricity market, other studies investigate generation investments under oligopolies, but do not incorporate intermittency (see for example [19]). Although the central planning approach to investment in electricity plant is becoming increasingly rare in market economies, regulators and analysts still use optimization models as benchmarks of what investments might emerge from a competitive market. However in the authors’ experience, predictions made from deterministic capital planning models are rarely matched by real investments.

Our goal in this paper is to provide a methodology to solve the stochastic capital planning problem to serve as a different, and possibly improved, benchmark. Such a benchmark might correspond to a competitive equilibrium in which all agents are risk neutral and share the same probability distributions for future uncertainty. In practice, agents making investment decisions based on discounted cash flows will be risk averse, with possibly different discount rates. Unless the market for risk is completed, a competitive equilibrium will not necessarily correspond to an optimal social plan. In these circumstances a more realistic benchmark might come from a complementarity model of the form discussed by [5].

Classical central-planning investment models [7, 17] use a screening curve (Fig. 1) to rank generation options by their long-run marginal cost (LRMC), thus finding the best option to serve the production profile for each additional demand unit. The screening curve shows the annual total cost per MW of capacity plotted against the number of annual operating hours; the total cost combines a fixed cost with a variable cost that depends on the number of production hours in a year. A minimum cost for each capacity factor can be found by combining the screening curve with the load duration curve (LDC): projecting the breakeven points of the screening curve onto the LDC produces the least-cost capacity combination that can serve the load profile. For example, to supply the part of the LDC with the highest capacity factor (i.e. running most of the year), base-load plant is the least-cost option. As the number of operating hours decreases, plants that are less expensive to build but more costly to run become more economical. For the small capacity at the tip of the duration curve, peakers with very high variable cost are the most economical.

Fig. 1 The screening curve: how capacity is traditionally planned in electricity systems

The screening curve approach can be formulated as a linear program (LP), where

  • i denotes the different generating technologies,

  • \(t\in \mathcal {T}\) denotes the operational hours in the year,

  • the variable \(y_{i}\) is the capacity invested in technology i,

  • \(r_{i}(t)\) denotes the production at time t with technology i,

  • \(d(t)\) denotes the demand and \(g(t)\) the generation supplied at time t,

  • the fixed and variable costs of technology i are \(F_{i}\) and \(V_{i}\) respectively.

This gives:

$$\begin{aligned} \begin{array}{clc} \text {LP: }\min &{} \sum _{i}\left( F_{i}y_{i}+V_{i}\sum _{t}r_{i}(t)\right) &{} \\ \text {s.t.} &{} g(t)\ge d(t), &{} t\in \mathcal {T}, \\ &{} g(t)\le \sum _{i}r_{i}(t), &{} t\in \mathcal {T}, \\ &{} 0\le r_{i}(t)\le y_{i}, &{} t\in \mathcal {T}. \end{array} \end{aligned}$$
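
To make the formulation concrete, the following sketch solves a small instance of LP with scipy. The three technologies, their costs and the stylized one-day demand profile are invented purely for illustration; annual fixed costs are amortized to the representative day, and \(g(t)\) is eliminated by combining the first two constraints into \(\sum _{i}r_{i}(t)\ge d(t)\).

```python
import numpy as np
from scipy.optimize import linprog

# Invented data: base load, ramping (CCGT-like) and peaking technologies
F = np.array([140000.0, 60000.0, 20000.0]) / 365.0   # fixed cost amortized to one day ($/MW/day)
V = np.array([10.0, 45.0, 150.0])                    # variable cost ($/MWh)
T = 24                                               # one representative day, hourly resolution
d = 800 + 300 * np.sin(np.linspace(0.0, 2.0 * np.pi, T))   # stylized demand profile (MW)

n = len(F)
nvar = n + n * T                                     # variables: y_i, then r_i(t)
c = np.concatenate([F, np.repeat(V, T)])             # objective coefficients

A_ub, b_ub = [], []
for t in range(T):
    # demand cover: -sum_i r_i(t) <= -d(t)
    row = np.zeros(nvar)
    for i in range(n):
        row[n + i * T + t] = -1.0
    A_ub.append(row)
    b_ub.append(-d[t])
    # capacity: r_i(t) - y_i <= 0
    for i in range(n):
        row = np.zeros(nvar)
        row[n + i * T + t] = 1.0
        row[i] = -1.0
        A_ub.append(row)
        b_ub.append(0.0)

res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub), bounds=(0, None), method="highs")
print("capacity mix (MW):", np.round(res.x[:n], 1))
```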

Formulations like LP appeared as early as the 1950s [10]. The basic formulation has been extended over the past two decades to include operational constraints and some supply-side uncertainties, such as plant outages and technological change. Demand distributions, however, are still represented by load duration curves or their discretized versions [3].

The ability to produce electricity on demand becomes valuable in a world with intermittency, but such flexibility cannot be valued in the traditional models based only on the marginal cost of energy. Indeed, in the ex-post dispatch problem the optimization no longer follows the short-run marginal cost merit order when expensive plants must be started because of ramp constraints. To account for flexibility, the model we use requires some representation of load uncertainty. Although we assume that overall demand for power is relatively predictable, the residual demand that remains after renewable generation has been subtracted is not. In our context, residual demand is demand net of renewable generation: intermittent generation is always dispatched first and leaves the system with a level of residual demand that must be met by conventional plant (or left unsatisfied at some penalty cost).

In our model, demand net of intermittent generation is treated as a time-inhomogeneous Markov process. This represents the case where demand is deterministic and intermittent generation is Markov, but the model also admits stochastic demand as long as the net demand is Markovian. The Markov assumption enables us to treat the dispatch problem as a Markov decision problem, in which dispatch actions for ramping plant and peaking plant are chosen to meet the residual demand at least expected cost per period. It is of course possible to include some provision for ramping plant in the LP model by requiring a given number of hours of ramping capacity and then requiring that the optimal choice of plant type provide this. Such a requirement is, however, endogenous: it depends on the operating policies of the plants faced with uncertain residual demand. Specifying the correct ramping capacity requirement in LP therefore presupposes an answer to the very question our model seeks to address.

The paper is laid out as follows. In the next section we outline a Markov decision process for dispatching generation in systems with intermittent generation. Section 3 shows how this model can be extended to optimize capacity expansion decisions. Section 4 describes a multistage framework for capacity expansion over a long planning horizon, Sect. 5 applies the models to some simple examples, and Sect. 6 makes some concluding remarks. The paper is intended as an exposition of a new methodology, and so the results presented are from simple examples. More realistic examples can be found in the first author’s PhD thesis [20].

2 A Markov decision process for dispatch

In this section we describe the Markov decision process used for dispatch with intermittent generation. Consider demand net of intermittent generation. We refer to this amount, which must be met by non-renewable electricity generation, as net demand or residual demand. Clearly intermittent generation leaves the system with a random level of residual demand. If the levels of renewable penetration are relatively low, then the net demand series will look like a typical daily demand curve with some random variation.

To model the residual demand, we divide each 24-h period into N time intervals, and represent residual demand by an average demand (shown by the blue curve in Fig. 2), and a random variation about this average.

Fig. 2 A realization of a Markov chain with \(N=8\) and \(M=5\), corresponding to the sequence of states 3, 5, 2, 5, 2, 1, 2, 5

The random variations are represented by a discrete process that moves between a finite number (M) of states (five are shown in Fig. 2). By studying historical renewable dispatch and electricity demand over each of these intervals, one can estimate the parameters of a Markov chain that models the transitions between these states. We denote the elements of the \(M\times M\) transition matrix by \(p_{ij}(t)\). Since these probabilities depend on t, the Markov chain is time-inhomogeneous. A typical realization of a five-state Markov chain is shown in Fig. 2.
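
As an illustration of this estimation step, the sketch below discretizes residual-demand deviations into M states and counts the observed transitions out of each interval of the day. The synthetic data and the quantile-based discretization are assumptions made only for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 8, 5                                   # intervals per day, number of demand states
days = 365
# Synthetic residual-demand deviations, one value per interval per day (illustrative only)
deviations = rng.normal(size=(days, N)).cumsum(axis=1)

# Discretize deviations into M states using global quantile bands
edges = np.quantile(deviations, np.linspace(0, 1, M + 1)[1:-1])
states = np.digitize(deviations, edges)       # integer states 0..M-1, shape (days, N)

# Count transitions out of interval t (wrapping to the first interval of the next day)
counts = np.zeros((N, M, M))
for day in range(days):
    for t in range(N):
        i = states[day, t]
        if t < N - 1:
            j = states[day, t + 1]
        elif day < days - 1:
            j = states[day + 1, 0]
        else:
            continue
        counts[t, i, j] += 1

# Row-normalize to obtain the time-inhomogeneous transition matrices p_ij(t)
p = counts / np.maximum(counts.sum(axis=2, keepdims=True), 1)
print(p[0].round(2))                          # transition matrix out of the first interval
```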

The investment model that we define in this paper accounts for the operation of the chosen plant, bearing in mind the variability of the residual demand process. These operational decisions are undertaken at a different time scale from the investment decisions, but their expected cost depends on what plant investment decisions have been made. Given a Markov decision process in which state transitions occur every few hours, the expected annual operating cost of the system is best represented by an average-reward-per-period objective. An infinite-horizon discounted reward model is also possible in our framework, but we do not pursue this as the discounting in each transition would be negligible. In both cases the optimal operating policy can be computed by solving a linear program. This linear program can then be embedded in an “upper level” optimization problem that determines the optimal investments in, for example, generation capacity of various types, as we show below.

Residual demand in our model will be satisfied by a mixture of baseload plant, ramping plant, and peaking plant. For notational simplicity we restrict attention in what follows to a model with a single ramping plant that can operate at discrete levels \(\mathcal {L}=\{1,2,\ldots ,L\}\), and a peaking plant that can operate at discrete levels \(\mathcal {H}=\{1,2,\ldots ,H\}\). Several ramping plants with different characteristics could be included in the model, but the increase in complexity would obscure the methodology that we are trying to illustrate. We assume that all baseload plant provides an exogenous level of generation that reduces the residual demand uniformly, and as such is not varied over the optimization. Peaking plant is utilized (as a last-resort simple recourse decision) to deal with demand that cannot be met by any other means. The only remaining form of action is to move the ramping plant from its current operating level to a feasible operating level for the next period. This action is constrained by the plant’s ramp rate, and to model it we must record the current operating level of the ramping plant as a state variable.

The components of our Markov decision process model are formally laid out below.

  1.

    States. The states of the Markov decision process are \(s=(t,l,i)\), where

    $$\begin{aligned} \begin{array}{c@{\quad }l} t= &{} \text {period of day, }1,2,\ldots ,N, \\ l= &{} \text {operating level of ramping plant, }1,2,\ldots ,L, \\ i= &{} \text {residual demand level,}\ 1,2,\ldots ,M. \end{array} \end{aligned}$$

    Since residual demand varies over the day, the optimal action to be taken in any state will depend on the time of the day, and so a time variable is added to the state space to enable the construction of a stationary optimal policy.

  2.

    Action sets. For each state \(s=(t,l,i)\), we choose an action \(a\in \mathcal {L}\times \mathcal {H}\) that determines the operating levels of the ramping plant and the peaking plant for the next period. Any remaining demand is met by load shedding. The operating level of the ramping plant defined by a is restricted to a subset \(\mathcal {L}(s)\) of \(\mathcal {L}\) that is determined by the ramping plant’s current operating level.

  3.

    Decision time points. The decision time points t correspond to discrete points in time when the operating level of a ramping plant can be adjusted, e.g. every five minutes or each half hour.

  4.

    The immediate rewards (or costs). The immediate cost of action a in state s is the fuel cost of the ramping plant for the corresponding time period t plus the cost of meeting any deficit in the net demand through peaking plant utilization or load curtailment. We denote this cost by \(c(s,a)\).

  5.

    Transition probabilities. \(P^{a}(s_{1},s_{2})\) denotes the probability that the system transitions to state \(s_{2}\), given that it is in state \(s_{1}\) and action a is taken. If we denote \(s_{1}=(t,l,i)\) and \(s_{2}=(u,k,j)\) then

    $$\begin{aligned} P^{a}(s_{1},s_{2})=\left\{ \begin{array}{l@{\quad }l} p_{ij}(t), &{}\text {if }u=\left( t+1\right) \,\text {mod }N\text { and taking~action}~a\in \mathcal {L}(s_{1})\mathcal {\times H} \\ &{}\text {results in operating level}~k~\text {for the ramping plant.}\\ 0, &{} \text {otherwise.} \end{array}\right. \end{aligned}$$
  6.

    Planning horizon and optimality criterion. We consider an infinite planning horizon and optimize the average reward. Let the policy \(\pi \) be defined by the generation action \(a\in \mathcal {L}(s)\mathcal {\times H}\) chosen in each state \(s=(t,l,i)\). We can state the infinite-horizon average-reward Markov decision problem of finding the optimal policy \(\pi \) as

    $$\begin{aligned} \min _{\pi }\lim _{\tau \rightarrow \infty }\mathbb {E}\left[ \frac{1}{\tau }\sum _{\nu =1}^{\tau }c(s_{\nu },\pi (s_{\nu }))\right] . \end{aligned}$$

This defines a minimum-cost operational policy for dispatching thermal plant to meet residual demand when renewable generation is not available. The problem of computing this policy can be formulated as a linear program (see e.g. [14]).

$$\begin{aligned} \begin{array}{cll} \text {MDP: }\min _{x} &{} \sum _{s\in \mathcal {S},a\in \mathcal {L}(s)\times \mathcal {H}}c(s,a)x(s,a) &{} \\ \text {s.t.} &{} \sum _{a\in \mathcal {L}(s)\times \mathcal {H}}x(s,a) &{}\\ &{} -\sum _{k\in \mathcal {S},a\in \mathcal {L}(k)\times \mathcal {H}}P^{a}(k,s)x(k,a)=0, &{} \quad s\in \mathcal {S}, \\ &{} \sum _{s\in \mathcal {S},a\in \mathcal {L}(s)\times \mathcal {H}}x(s,a)=1, &{} \\ &{} x(s,a)\ge 0, &{} \quad a\in \mathcal {L}(s)\times \mathcal {H},\ s\in \mathcal {S}. \end{array} \end{aligned}$$
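
The structure of this linear program is illustrated by the sketch below, which solves it for a generic finite average-cost MDP; the two-state, two-action instance at the end is invented purely to exercise the code and is not taken from the examples in this paper.

```python
import numpy as np
from scipy.optimize import linprog

def solve_average_cost_mdp(P, c):
    """Solve the LP formulation MDP above.
    P has shape (A, S, S): P[a, s1, s2] is the transition probability under action a.
    c has shape (S, A): immediate cost of action a in state s.
    Returns the optimal average cost and the state-action frequencies x(s, a)."""
    A, S, _ = P.shape
    idx = lambda s, a: s * A + a                  # flatten x(s, a) into a vector

    # Balance: sum_a x(s,a) - sum_{k,a} P[a,k,s] x(k,a) = 0 for every state s
    A_eq = np.zeros((S + 1, S * A))
    for s in range(S):
        for a in range(A):
            A_eq[s, idx(s, a)] += 1.0
            for k in range(S):
                A_eq[s, idx(k, a)] -= P[a, k, s]
    A_eq[S, :] = 1.0                              # normalization: frequencies sum to one
    b_eq = np.zeros(S + 1)
    b_eq[S] = 1.0

    cost = np.array([c[s, a] for s in range(S) for a in range(A)])
    res = linprog(cost, A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    return res.fun, res.x.reshape(S, A)

# Illustrative two-state, two-action instance (all numbers are made up)
P = np.array([[[0.9, 0.1], [0.6, 0.4]],           # transitions under action 0
              [[0.2, 0.8], [0.1, 0.9]]])          # transitions under action 1
c = np.array([[1.0, 4.0],                          # costs in state 0
              [8.0, 2.0]])                         # costs in state 1
avg_cost, x = solve_average_cost_mdp(P, c)
print(avg_cost, x.round(3))
```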

In the next section we show how MDP can be extended to inform capacity expansion in thermal plant.

3 Capacity expansion

We wish to select a capacity investment in peaking plant and an investment in ramping plant that will cover predicted increases in intermittent generation. An increase in peaking plant capacity might decrease the cost of load shedding in some states. An increase in capacity of ramping plant changes the set of actions available in each state. The formulation MDP in the previous section restricts the action a to be within a subset \( \mathcal {L}(s)\times \mathcal {H}\) of \(\mathcal {L}\times \mathcal {H}\), the set of operating levels at the next time period. In our application, \( \mathcal {L}(s)\) depends only on the current level of ramping generation, l , where \(s=(t,l,i)\), and defines the set of operating levels k that can be ramped to from l in one time period. For example, we might have \(\mathcal {L }(s)=\left\{ l-1,l,l+1\right\} \).

We proceed to describe a model in which \(\mathcal {L}(s)\) can be altered by investment decisions. To do this we model \(\mathcal {L}(s)\) using a flexibility matrix U where

$$\begin{aligned} U_{lk}=\left\{ \begin{array}{c@{\quad }c} 1, &{} \text {if ramping operating level }k\in \mathcal {L}(s),\text { }s=(t,l,i) \\ 0, &{} \text {if ramping operating level }k\notin \mathcal {L}(s),\text { } s=(t,l,i) \end{array} \right. . \end{aligned}$$

In the example where the installed ramping plant has operating levels \(\mathcal {L}=\{1,2,3\}\) within a grid of five possible levels and, for each s, \(\mathcal {L}(s)=\left\{ l-1,l,l+1\right\} \cap \mathcal {L}\), we have

$$\begin{aligned} U=\left[ \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} 1 &{} 1 &{} 0 &{} 0 &{} 0 \\ 1 &{} 1 &{} 1 &{} 0 &{} 0 \\ 0 &{} 1 &{} 1 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 &{} 0 \end{array} \right] . \end{aligned}$$

In general U will be zero everywhere except in a band around its leading diagonal, the width of which defines how much ramping is feasible at the corresponding state. Thus row l corresponds to ramping level l, and the nonzero entries in row l are in the columns corresponding to ramping levels k that are reachable from l in one period; this includes column l itself and some columns on either side of it. The last nonzero diagonal entry corresponds to the installed ramping plant capacity. Flexibility matrices allow us to augment the linear programming formulation MDP to give an equivalent formulation MDPU with variables x(s, a), \(s\in \mathcal {S},a\in \mathcal {L}\times \mathcal {H}.\)

$$\begin{aligned} \begin{array}{cll} \text {MDPU: }\min _{x} &{} \sum _{s\in \mathcal {S},a\in \mathcal {L}\times \mathcal {H}}c(s,a)x(s,a) &{} \\ \text {s.t.} &{} \sum _{a\in \mathcal {L}\times \mathcal {H}}x(s,a) &{} \\ &{} -\sum _{k\in \mathcal {S},a\in \mathcal {L}\times \mathcal {H} }P^{a}(k,s)x(k,a)=0, &{}\quad s\in \mathcal {S}, \\ &{} \sum _{s\in \mathcal {S},a\in \mathcal {L}\times \mathcal {H}}x(s,a)=1, &{} \\ &{} x(s,a)\le U_{lk}, &{} a=(k,h),k\in \mathcal {L}, \\ &{} &{} h\in \mathcal {H},s=(t,l,i), \\ &{} x(s,a)\ge 0. &{} a\in \mathcal {L}\times \mathcal {H},s\in \mathcal {S}. \end{array} \end{aligned}$$

Note that the constraints on ramping that were previously modeled by \(a\in \mathcal {L}(s)\times \mathcal {H}\) are now represented by bounds on x(s, a) defined by the elements of the flexibility matrices. The formulation MDPU can be used to represent different amounts of ramping generation. An investment choice g that increases ramping capacity can be modeled by a new flexibility matrix \(U^{g}\) with an increased number of nonzeros. Some variables x(s, a) representing previously infeasible ramping actions that were constrained to be 0 can then become positive, since \(U_{lk}<U_{lk}^{g}\) for some l and k. This is illustrated in Fig. 3.

Fig. 3 Two flexibility matrices. Investment in ramping plant gives the right-hand matrix, which has more capacity (a longer diagonal) and more ramping ability (a wider diagonal)
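
The construction of such flexibility matrices is illustrated by the following sketch, which assumes a grid of five possible operating levels, an installed capacity of three levels and a ramp limit of one level per period (the parameters of the example above); expanding capacity or ramping ability lengthens or widens the band of ones, as in Fig. 3.

```python
import numpy as np

def flexibility_matrix(n_levels, capacity, ramp):
    """U[l, k] = 1 if operating level k is reachable from level l in one period.
    n_levels: size of the full level grid; capacity: highest installed level;
    ramp: maximum number of levels the plant can move per period."""
    U = np.zeros((n_levels, n_levels), dtype=int)
    for l in range(capacity):
        for k in range(max(0, l - ramp), min(capacity, l + ramp + 1)):
            U[l, k] = 1
    return U

print(flexibility_matrix(5, 3, 1))   # reproduces the matrix U shown above
print(flexibility_matrix(5, 5, 2))   # after investment: a longer and wider diagonal, as in Fig. 3
```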

We are now in a position to define an investment optimization model based on MDPU. Suppose we are considering ramping investments \(g\in \mathcal {G}\) and peaking investments \(h\in \mathcal {H}\), and wish to choose at most one of each. Let \(z_{g}\in \{0,1\}\) denote the choice of ramping investment (at amortized capital cost \(M_{g}\)) and \(v_{h}\in \{0,1\}\) denote the choice of peaking investment (at amortized capital cost \(N_{h}\)). We choose indices \( g=g_{0}\) and \(h=h_{0}\) to denote existing capacity (with \( M_{g_{0}}=N_{h_{0}}=0\)).

The investment optimization model is then a mixed integer program:

$$\begin{aligned} \begin{array}{cll} \text {MIP: }\min _{x} &{} \sum _{s\in \mathcal {S},a\in \mathcal {L}\times \mathcal {H}}c(s,a)x(s,a)+\sum _{g\in \mathcal {G}}M_{g}z_{g} &{} +\sum _{h\in \mathcal {H}}N_{h}v_{h} \\ \qquad \text {s.t.} &{} \sum _{a\in \mathcal {L}\times \mathcal {H}}x(s,a) &{} \\ &{} -\sum _{k\in \mathcal {S},a\in \mathcal {L}\times \mathcal {H} }P^{a}(k,s)x(k,a)=0, &{} s\in \mathcal {S}, \\ &{} \sum _{s\in \mathcal {S},a\in \mathcal {L}\times \mathcal {H}}x(s,a)=1, &{} \\ &{} x(s,a)\le v_{h}, &{} a\in \mathcal {L}\times \mathcal {\{}h\mathcal {\}},h\in \mathcal {H}, \\ &{} &{} s\in \mathcal {S}, \\ &{} x(s,a)\le \sum _{g\in \mathcal {G}}z_{g}U_{lk}^{g}, &{} a=(k,h),k\in \mathcal { L},h\in \mathcal {H}, \\ &{} &{} l\in \mathcal {L},s\in \mathcal {S}, \\ &{} \sum _{g\in \mathcal {G}}z_{g}=1,\quad \sum _{h\in \mathcal {H}}v_{h}=1, &{} \\ &{} z_{g},v_{h}\in \{0,1\},\quad x(s,a)\ge 0, &{} g\in \mathcal {G},h\in \mathcal {H}, \\ &{} &{} a\in \mathcal {L}\times \mathcal {H},s\in \mathcal {S}. \end{array} \end{aligned}$$

Observe that the choice \(v_{h}=1\) for some \(h\in \mathcal {H}\) precludes peaking-plant actions for all other h, and the choice \(z_{g}=1\) for some \( g\in \mathcal {G}\) restricts ramping actions to those that satisfy the restrictions defined by flexibility matrix \(U^{g}\).
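
Because \(\mathcal {G}\) and \(\mathcal {H}\) are finite, MIP could in principle be solved by enumeration: solve the dispatch problem once for each investment pair and add the amortized capital costs. The sketch below shows this structure with invented capital costs and placeholder dispatch costs standing in for the MDPU optimal values (which could be computed, for example, with the solve_average_cost_mdp sketch above).

```python
# Brute-force evaluation of MIP over a small set of investment options.
# The expected dispatch costs below are invented placeholders; in practice each entry
# would come from solving MDPU under the flexibility matrix U^g and peaking capacity h.
ramping_options = {"g0": 0.0, "g1": 120.0, "g2": 210.0}     # amortized capital cost M_g
peaking_options = {"h0": 0.0, "h1": 80.0}                   # amortized capital cost N_h
expected_dispatch_cost = {                                  # placeholder MDPU optimal values
    ("g0", "h0"): 900.0, ("g0", "h1"): 610.0,
    ("g1", "h0"): 700.0, ("g1", "h1"): 540.0,
    ("g2", "h0"): 650.0, ("g2", "h1"): 520.0,
}

best = min(
    ((expected_dispatch_cost[g, h] + M_g + N_h, g, h)
     for g, M_g in ramping_options.items()
     for h, N_h in peaking_options.items()),
    key=lambda t: t[0],
)
print("best total cost and investment pair:", best)
```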

The size of problem MIP grows very rapidly with the number of investment choices and Markov states. It is easy to see that each investment option considered gives a new Markov decision problem with its own collection of state-action variables. To overcome this we use the Benders decomposition approach of Dimitrov and Morton [4]. This replaces MIP by a restricted master problem

$$\begin{aligned} \begin{array}{cll} \text {RMP: }\min _{\phi ,z,v} &{} \phi +\sum _{g\in \mathcal {G}}M_{g}z_{g}+\sum _{h\in \mathcal {H}}N_{h}v_{h} &{} \\ \qquad \text {s.t.} &{} \alpha _{i}+\beta _{i}^{\top }z+\delta _{i}^{\top }v\le \phi , &{} i\in O, \\ &{} \gamma _{i}+\lambda _{i}^{\top }z+\mu _{i}^{\top }v\le 0, &{} i\in F, \\ &{} \sum _{g\in \mathcal {G}}z_{g}=1,\quad \sum _{h\in \mathcal {H}}v_{h}=1, &{} \\ &{} z_{g},v_{h}\in \{0,1\}, &{} g\in \mathcal {G},h\in \mathcal {H}. \end{array} \end{aligned}$$

Here the first and second sets of constraints are optimality cuts and feasibility cuts, respectively, indexed by the sets O and F. The coefficients of these cuts are determined by solving the following subproblem.

$$\begin{aligned} \begin{array}{cll} \text {SP: }\phi \left( \bar{z},\bar{v}\right) =\min _{x} &{} \sum _{s\in \mathcal {S},a\in \mathcal {L}\times \mathcal {H}}c(s,a)x(s,a) &{} \\ \text {s.t.} &{} \sum _{a\in \mathcal {L}\times \mathcal {H}}x(s,a) &{} \\ &{} -\sum _{k\in \mathcal {S},a\in \mathcal {L}\times \mathcal {H} }P^{a}(k,s)x(k,a)=0, &{} s\in \mathcal {S}, \\ &{} \sum _{s\in \mathcal {S},a\in \mathcal {L}\times \mathcal {H}}x(s,a)=1, &{} \\ &{} x(s,a)\le v_{h}, &{} a\in \mathcal {L}\times \mathcal {\{}h\mathcal {\}},h\in \mathcal {H}, \\ &{} &{} s\in \mathcal {S}, \\ &{} x(s,a)\le \sum _{g\in \mathcal {G}}z_{g}U_{lk}^{g}, &{} a=(k,h),k\in \mathcal { L},h\in \mathcal {H}, \\ &{} &{} l\in \mathcal {L},s\in \mathcal {S}, \\ &{} z_{g}\le \bar{z}_{g}, &{} g\in \mathcal {G}, \\ &{} v_{h}\le \bar{v}_{h}, &{} h\in \mathcal {H}, \\ &{} x(s,a)\ge 0, &{} a\in \mathcal {L}\times \mathcal {H},s\in \mathcal {S}. \end{array} \end{aligned}$$

At each iteration i the subproblem SP is solved with a candidate solution \( (\bar{z}^{i},\bar{v}^{i})\). At optimality, the dual multipliers (\(\beta _{i}\) and \(\delta _{i}\)) on the constraints containing \(\bar{z}^{i}\) and \(\bar{v} ^{i}\) give the coefficients for the ith cutting plane constraint

$$\begin{aligned} \alpha _{i}+\beta _{i}^{\top }z+\delta _{i}^{\top }v\le \phi \end{aligned}$$

in the restricted master problem, where we set

$$\begin{aligned} \alpha _{i}=\phi \left( \bar{z}^{i},\bar{v}^{i}\right) -\beta _{i}^{\top } \bar{z}^{i}-\delta _{i}^{\top }\bar{v}^{i}. \end{aligned}$$

In case SP is infeasible, a feasibility cut is added to the restricted master problem. The details are provided in [8].
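
A small numerical illustration of forming an optimality cut from the subproblem duals is given below; every number in it is invented.

```python
import numpy as np

# Candidate investment from the restricted master problem (invented for illustration)
z_bar = np.array([0, 1, 0])           # ramping option g1 selected
v_bar = np.array([1, 0])              # peaking option h0 (existing capacity) selected
phi_bar = 842.0                        # optimal value phi(z_bar, v_bar) of subproblem SP

# Dual multipliers on the constraints z_g <= z_bar_g and v_h <= v_bar_h at optimality
beta = np.array([0.0, -35.0, -60.0])
delta = np.array([0.0, -110.0])

# Cut intercept alpha_i = phi(z_bar, v_bar) - beta^T z_bar - delta^T v_bar
alpha = phi_bar - beta @ z_bar - delta @ v_bar
print("optimality cut:  %.1f + beta'z + delta'v <= phi" % alpha)
```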

The solution of MIP using Benders decomposition is complicated by the presence of binary variables in the restricted master problem, which requires a branch-and-cut procedure. We have used the procedure described by Rubin, which uses the callback classes in CPLEX 12.2 (see [15]).

4 Multistage investments

In the previous section we showed how investment in ramping capacity could be formulated as a two-stage stochastic mixed integer program, in which the second stage is a Markov decision process. In this section we extend our basic model to the case where investments are to be staged over time as an uncertain future unfolds. We discriminate here between the uncertainty arising from intermittency (modeled by a Markov chain) and the coarse-grained uncertainty that applies to states of the world in future years. These uncertainties (for example in future demand, fuel prices or taxation levels) are modeled using a scenario tree \(\mathcal {N}\) with N nodes as shown in Fig. 4. Each node n represents a state of the world at a given future time period, occurring with probability \(\phi _{n}\). In the tree \(\mathcal {P}_{n}\) denotes the set of predecessors of node n, and \(\mathcal {S}_{n}\) the set of successors of node n. A scenario is a set of nodes consisting of a leaf node and its predecessors.

Fig. 4 A scenario tree with three stages and four scenarios

The investment model we use is based on that described in [16]. We first introduce it for a general investment problem, and then specialize it to investment in peaking capacity and ramping plant in the presence of intermittent sources of generation. Suppose we have initial capacity \(\mathbf {u}_{0}\in \mathbb {R}^{U}\) that we increase with actions \(\mathbf {z}_{n}\in \{0,1\}^{Z}\) in node n of \(\mathcal {N}\). Investment in node n costs \(\mathbf {c}_{n}^{\top }\mathbf {z}_{n}\) and contributes additional capacity \(K_{n}\mathbf {z}_{n}\ge 0\) to the system in every node \(h\in \mathcal {S}_{n}\). So, given actions \(\mathbf {z}_{n}\), \(n\in \mathcal {N}\), the resulting capacity in node n will be \(\mathbf {u}_{0}+\sum _{h\in \mathcal {P}_{n}}K_{h}\mathbf {z}_{h}\). In the state of the world corresponding to node n we operate our capacity by choosing actions \(\mathbf {y}_{n}\in \mathcal {Y}_{n}\) (where \(\mathcal {Y}_{n}\) denotes the space of all feasible operating decisions in state of the world n) to minimize the operating cost \(\mathbf {q}_{n}^{\top }\mathbf {y}_{n}\). The operations in the state of the world defined by node n are further restricted by the capacity constraints

$$\begin{aligned} V_{n}\mathbf {y}_{n}\le \mathbf {u}_{0}+\sum _{h\in \mathcal {P}_{n}}K_{h} \mathbf {z}_{h}, \end{aligned}$$

where \(V_{n}\) and \(K_{n}\) are matrices of order \(U\times Y\) and \(U\times Z\) respectively. Here \(V_{n}\) transforms the operational decisions \(\mathbf {y}_{n}\) into their resource usage, which is then constrained by the capacity \(\mathbf {u}_{0}+\sum _{h\in \mathcal {P}_{n}}K_{h}\mathbf {z}_{h}\).
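
The accumulation of capacity along the tree can be illustrated directly. In the sketch below the seven-node binary tree, the two capacity types and the investment decisions are all invented; the capacity available in node n is \(\mathbf {u}_{0}+\sum _{h\in \mathcal {P}_{n}}K\mathbf {z}_{h}\), summed over strict predecessors, so an investment made at a node only becomes available at its successors.

```python
import numpy as np

# Invented seven-node binary scenario tree: node 0 is the root, nodes 3-6 are leaves
parent = {0: None, 1: 0, 2: 0, 3: 1, 4: 1, 5: 2, 6: 2}

def predecessors(n):
    """The set P_n of strict predecessors (ancestors) of node n."""
    preds = []
    while parent[n] is not None:
        n = parent[n]
        preds.append(n)
    return preds

u0 = np.array([500.0, 200.0])               # initial capacity: [ramping MW, peaking MW]
K = np.array([[100.0, 0.0],                 # capacity added per unit of each investment action
              [0.0, 50.0]])
z = {n: np.zeros(2) for n in parent}        # invented investment decisions z_n
z[0] = np.array([1.0, 0.0])                 # expand ramping at the root
z[2] = np.array([0.0, 1.0])                 # expand peaking in node 2

for n in sorted(parent):
    cap = u0.copy()
    for h in predecessors(n):               # capacity in node n: u0 + sum_{h in P_n} K z_h
        cap += K @ z[h]
    print("node", n, "capacity", cap)
```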

The capacity planning problem then becomes a multistage stochastic mixed integer program.

$$\begin{aligned} \begin{array}{ll} \text {SIP: min} &{} \sum _{n\in \mathcal {N}}\phi _{n}\left( \mathbf {c} _{n}^{\top }\mathbf {z}_{n}+\mathbf {q}_{n}^{\top }\mathbf {y}_{n}\right) \\ \text {s.t.} &{} V_{n}\mathbf {y}_{n}\le \mathbf {u}_{0}+\sum _{h\in \mathcal {P} _{n}}K_{h}\mathbf {z}_{h},\;\;\,n\in \mathcal {N}, \\ &{} \mathbf {y}_{n}\in \mathcal {Y}_{n},\;\;n\in \mathcal {N}, \\ &{} \mathbf {z}_{n}\in \{0,1\}^{Z},\;\;\,n\in \mathcal {N}. \end{array} \end{aligned}$$

Observe that SIP has a set of variables \(\left( \mathbf {y}_{n},\mathbf {z}_{n}\right) \), including binary investment variables, defined for each scenario node, which makes it a very large problem to solve as a single mixed integer program. The actions \(\mathbf {y}_{n}\), however, are not affected by decisions in other nodes of the scenario tree except through the capacity decisions \(\mathbf {z}_{h}\), \(h\in \mathcal {P}_{n}\). We observe that SIP has the form of a multi-horizon stochastic programming problem as discussed in [9]. Our approach to solving it differs from theirs in that we use the Dantzig–Wolfe decomposition technique developed in [16].

Following [16] we assume for simplicity that each facility can be expanded at most once over the planning horizon, and the matrix \(K_{n}\) of possible capacity expansions for node n is a constant matrix K. As shown in [16] the model SIP is then equivalent to the following problem.

$$\begin{aligned} \begin{array}{ll} \text {SIPS: min} &{} \sum _{n\in \mathcal {N}}\phi _{n}\left( \mathbf {c} _{n}^{\top }\mathbf {z}_{n}+\mathbf {q}_{n}^{\top }\mathbf {y}_{n}\right) \\ \text {s.t.} &{} \sum _{h\in \mathcal {P}_{n}}\mathbf {z}_{h}\le \mathbf {1},\quad \forall \,n\in \mathcal {N}, \\ &{} \mathbf {z}_{n}^{\prime }\le \sum _{h\in \mathcal {P}_{n}}\mathbf {z} _{h},\quad n\in \mathcal {N}, \\ &{} V_{n}\mathbf {y}_{n}\le \mathbf {u}_{0}+K\mathbf {z}_{n}^{\prime },\;\;\,n\in \mathcal {N}, \\ &{} \mathbf {y}_{n}\in \mathcal {Y}_{n},\;\;n\in \mathcal {N}, \\ &{} \mathbf {z}_{n},\mathbf {z}_{n}^{\prime }\in \{0,1\}^{Z},\;\;\,n\in \mathcal { N}. \end{array} \end{aligned}$$

Observe that SIPS has a split-variable formulation where new variables \( \mathbf {z}_{n}^{\prime }\) have been introduced to represent the extra capacity available for utilization in node n. This cannot exceed the accumulated investment defined by \(\sum _{h\in \mathcal {P}_{n}}K\mathbf {z} _{h} \).

Applying a Dantzig–Wolfe decomposition, the problem SIPS decomposes into a restricted master problem and single node subproblems of the form

$$\begin{aligned} \begin{array}{ll} \text {SP(}n\text {): min} &{} \phi _{n}\left( \mathbf {q}_{n}^{\top }\mathbf {y} _{n}+\varvec{\pi }_{n}^{\top }\mathbf {z}_{n}^{\prime }\right) \\ \text {s.t.} &{} V_{n}\mathbf {y}_{n}\le \mathbf {u}_{0}+K\mathbf {z}_{n}^{\prime }, \\ &{} \mathbf {y}_{n}\in \mathcal {Y}_{n},\quad \mathbf {z}_{n}^{\prime }\in \{0,1\}^{Z}. \end{array} \end{aligned}$$

Here we select a single node n, and ignore all variables \(\mathbf {y}_{m}\), \(m\ne n\), while accounting only for variables \(\mathbf {z}_{n}^{\prime }\). SP(n) can be interpreted as a two-stage investment problem of the same form as MIP in Sect. 3. Given the investment in capacity defined by \( \mathbf {z}_{n}^{\prime }\), we seek to minimize the cost \(\mathbf {q} _{n}^{\top }\mathbf {y}_{n}\) of operating the capacity in node n.

The problem SP(n) is still a mixed integer program, so it can be challenging to solve numerically. The difficulty arises in the operational constraints \(\mathbf {y}_{n}\in \mathcal {Y}_{n}\), which might involve many integer variables, depending on the application. Our approach works well if integer programming methodologies can be applied to give strong formulations of SP(n) and branch-and-bound strategies that solve it rapidly. In our case SP(n) takes the special form

$$\begin{aligned} \begin{array}{cll} \text {SP(}n\text {): }\min _{x} &{} \sum _{s\in \mathcal {S},a\in \mathcal {L} \times \mathcal {H}}c_{n}(s,a)x(s,a)-(\varvec{\pi }_{n}^{\top }\mathbf {z} _{n}^{\prime }+\varvec{\sigma }_{n}^{\top }\mathbf {v}_{n}^{\prime }) &{} \\ \text {s.t.} &{} \sum _{a\in \mathcal {L}\times \mathcal {H}}x(s,a) &{} \\ &{} -\sum _{k\in \mathcal {S},a\in \mathcal {L}\times \mathcal {H} }P^{a}(k,s)x(k,a)=0, &{} s\in \mathcal {S}, \\ &{} \sum _{s\in \mathcal {S},a\in \mathcal {L}\times \mathcal {H}}x(s,a)=1, &{} \\ &{} x(s,a)\le v_{nh}^{\prime }, &{} a\in \mathcal {L}\times \mathcal {\{}h \mathcal {\}},h\in \mathcal {H}, \\ &{} &{} s\in \mathcal {S}, \\ &{} x(s,a)\le \sum _{g\in \mathcal {G}}z_{ng}^{\prime }U_{lk}^{g}, &{} a=(k,h),k\in \mathcal {L},h\in \mathcal {H}, \\ &{} &{} l\in \mathcal {L},s\in \mathcal {S}, \\ &{} \sum _{g\in \mathcal {G}}z_{ng}^{\prime }=1, &{} \\ &{} \sum _{h\in \mathcal {H}}v_{nh}^{\prime }=1, &{} \\ &{} z_{ng}^{\prime },v_{nh}^{\prime }\in \{0,1\},\quad x(s,a)\ge 0, &{} g\in \mathcal {G},h\in \mathcal {H}, \\ &{} &{} a\in \mathcal {L}\times \mathcal {H},s\in \mathcal {S}. \end{array} \end{aligned}$$

Here the role of \(K\mathbf {z}_{n}^{\prime }\) is played by \(\mathbf {v}_{n}^{\prime }\), the peaking plant capacity choice, and by \(\sum _{g\in \mathcal {G}}U_{lk}^{g}z_{ng}^{\prime }\), which defines the increase in ramping capability made available by the variable \(\mathbf {z}_{n}^{\prime }\). We solve SP(n) using Benders decomposition as discussed in Sect. 3.

Suppose the solution to SP(n) yields \(\mathbf {z}_{n}^{\prime }=\hat{\mathbf {z}}_{n}\in \{0,1\}^{Z}\) and \(\mathbf {v}_{n}^{\prime }=\hat{\mathbf {v}}_{n}\in \{0,1\}^{V}\). This defines a column \(\left[ \begin{array}{cc} \hat{\mathbf {z}}_{n}&\hat{\mathbf {v}}_{n} \end{array} \right] ^{\top }\) of capacity decisions, which is added to the columns of the Dantzig–Wolfe restricted master problem as long as its reduced cost is negative. The columns are generated dynamically with different \(\varvec{\pi }_{n}\) and \(\varvec{\sigma }_{n}\), the dual variables at the optimal solution of the restricted master problem

$$\begin{aligned} \begin{array}{lll} \text {RMP: min} &{} \sum _{n\in \mathcal {N}}\phi _{n}(\mathbf {c}_{n}^{\top } \mathbf {z}_{n}+\mathbf {d}_{n}^{\top }\mathbf {v}_{n})+\sum _{n\in \mathcal {N} }\sum _{j\in \mathcal {J}_{n}}\phi _{n}\mathbf {q}_{n}^{\top }\widehat{\mathbf {y }}_{n}^{j}w_{n}^{j} &{} \\ \text {s.t.} &{} \sum _{j\in \mathcal {J}_{n}}\widehat{\mathbf {z}} _{n}^{j}w_{n}^{j}\le \sum _{h\in \mathcal {P}_{n}}\mathbf {z}_{h},\quad n\in \mathcal {N}, &{} [\varvec{\pi }_{n}] \\ &{} \sum _{j\in \mathcal {J}_{n}}\widehat{\mathbf {v}}_{n}^{j}w_{n}^{j}\le \sum _{h\in \mathcal {P}_{n}}\mathbf {v}_{h},\quad n\in \mathcal {N}, &{} [ \varvec{\sigma }_{n}] \\ &{} \sum _{h\in \mathcal {P}_{n}}\mathbf {z}_{h}\le \mathbf {1},\quad \,\sum _{h\in \mathcal {P}_{n}}\mathbf {v}_{h}\le \mathbf {1},\quad n\in \mathcal {N}, &{} \\ &{} \sum _{j\in \mathcal {J}_{n}}w_{n}^{j}=1\quad n\in \mathcal {N}, &{} [\mu _{n}] \\ &{} w_{n}^{j}\in \{0,1\}\quad \,n\in \mathcal {N},\,\ j\in \mathcal {J}_{n}, &{} \\ &{} \mathbf {v}_{n}\in \{0,1\}^{V},\quad \mathbf {z}_{n}\in \{0,1\}^{Z}\quad \,n\in \mathcal {N}. &{} \end{array} \end{aligned}$$

The reduced cost of the column defined by the optimal solution of SP(n) is then

$$\begin{aligned} \phi _{n}\left( \sum _{s\in \mathcal {S},a\in \mathcal {L}\times \mathcal {H} }c_{n}(s,a)x(s,a)-(\varvec{\pi }_{n}^{\top }\hat{\mathbf {z}}_{n}+ \varvec{\sigma }_{n}^{\top }\hat{\mathbf {v}}_{n})-\mu _{n}\right) \text {. } \end{aligned}$$
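
A minimal numerical check of this pricing step is given below, with every quantity invented; the column is added only when the reduced cost comes out negative.

```python
import numpy as np

phi_n = 0.25                           # probability of node n
dispatch_cost = 530.0                  # sum over (s,a) of c_n(s,a) x(s,a) at the SP(n) optimum
z_hat = np.array([0, 0, 1])            # generated ramping column for node n
v_hat = np.array([0, 1])               # generated peaking column for node n
pi_n = np.array([0.0, -40.0, -90.0])   # duals of the ramping linking constraints in RMP
sigma_n = np.array([0.0, -25.0])       # duals of the peaking linking constraints in RMP
mu_n = 700.0                           # dual of the convexity constraint for node n

reduced_cost = phi_n * (dispatch_cost - (pi_n @ z_hat + sigma_n @ v_hat) - mu_n)
print("reduced cost:", reduced_cost)   # negative here, so the column enters the master problem
```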

The essential matrix structure of the restricted master problem for a seven-node binary scenario tree after four iterations is depicted in Fig. 5, where we omit the coefficients of constraints

$$\begin{aligned} \sum _{h\in \mathcal {P}_{n}}\mathbf {z}_{h}\le \mathbf {1},\quad \,\sum _{h\in \mathcal {P}_{n}}\mathbf {v}_{h}\le \mathbf {1},\quad n\in \mathcal {N}. \end{aligned}$$

The seven blocks of columns on the right of the matrix correspond to \(\left[ \begin{array}{cc} \hat{\mathbf {z}}_{n}^{j}&\hat{\mathbf {v}}_{n}^{j} \end{array} \right] ^{\top }\), \(j=1,2,3,4\), \(n=1,2,\ldots ,7\), and the blocks to the left of the matrix yield \(\left[ \begin{array}{cc} -\sum _{h\in \mathcal {P}_{n}}\mathbf {z}_{h}&-\sum _{h\in \mathcal {P}_{n}} \mathbf {v}_{h} \end{array} \right] ^{\top }\), \(n=1,2,\ldots ,7\). The constraints are all \(\le 0\), except for the last 7 convexity constraints on the variables \(w_{n}\).

Fig. 5 A schematic showing the nonzeroes of the restricted master problem matrix for a scenario tree with 7 nodes. The columns on the left correspond to variables z, and those on the right correspond to variables w

5 Results

In this section we present the results of running the models on some examples. These are designed to give an indication of the improvement in computational performance obtained compared with alternative models, and to show how the solutions obtained account for flexibility in investment and operation.

The first experiment computes the solution to MIP in Sect. 3 (called the MDP model) and compares this to the solution computed using a screening curve. The results are shown in the table in Fig. 6.

Fig. 6 Comparison of screening curve investments and MIP investments for different wind penetration. The MIP model puts more in peak load generation, resulting in lower costs when simulated

The columns headed “Theoretical Results” show that the screening curve yields a solution with more ramping capacity and lower peaking capacity than the MDP model. Since ramping plant has lower short-run marginal cost than peaking plant, the screening curve model predicts a lower cost per period for the screening curve optimal solution than the cost per period for the MDP solution. The amount of ramping plant recommended by both models increases when the wind is more variable, while the screening curve solution recommends a lower investment in peaking plant. In both scenarios the screening curve model predicts a lower cost for the screening curve solution (as one might expect).

The right-hand side of the table shows the result of simulating the operation of each solution for 2920 consecutive periods (eight per day, i.e. one year) over 50 random sequences of wind, accounting for the limited ramping capability of the ramping plant. The cost per period for each solution is averaged over the 50 simulations. Now the screening curve solution has a higher cost per period than the solution recommended by the MDP model. Much of this stems from insufficient peaking capacity leading to load shedding (at very high cost). The MDP model allows for ramping limitations and so installs more peaking plant.

The next experiment we describe compares investments for two net demand profiles as shown in Fig. 7. The first net demand profile comes from a wind process that is negatively correlated with demand variation, so wind reduces the net demand variation. The second net demand profile comes from a wind process that is positively correlated with demand variation, so wind increases the net demand variation. For both cases the net demand duration curves are the same.

Fig. 7 We compare investments from the MDP model for the two different wind models shown. The second profile gives a more volatile net demand curve, although both demand duration curves are identical

The investment solutions yielded by the MDP model in each case are shown in Table 1. The second (more volatile) net demand scenario results in more peaking and less ramping plant than the first. Observe that the screening curve model would give identical solutions in these two scenarios, as they have identical net demand duration curves.

Table 1 Results for wind profile example from MDP model

We now look at a multistage investment model with uncertain demand growth. This model is called GEMSTONE-MDP, as it is based on the GEMSTONE model [6]. In all experiments, a discount rate of 8 % per year is applied to both investment and operational costs in the model, so that the time value of money is reflected in the reward calculations.

We first consider a version of GEMSTONE-MDP with no wind variability but with random demand growth as shown in Fig. 8. This shows a scenario tree with two demand growth outcomes after each five-year period, each with a 50 % probability: in one outcome demand grows by 10 % and in the other demand remains constant.

Fig. 8 Scenario tree showing demand growth for first multistage capacity expansion problem

The net demand follows a Markov chain with 40 demand variation states but no uncertainty, so its transition matrix at each stage is a permutation matrix. The peaker capacity and the ramping capacity each have seven expansion options. Investment decisions, which have a delayed implementation period, can be made at nodes 1–7 (i.e. the first three stages of the scenario tree), whereas operating decisions are made at nodes 2–15.

We consider two cases. The base case assumes the same technology mix that we have used in Sect. 3. The second (advanced technology) case assumes an improvement in technology that doubles the ramping flexibility that would be available in five years. The optimal investment decisions for each case are shown in Table 2.

Table 2 Results on 15-node model, base versus technology advancement details

If ramping technology is expected to be better after five years, the ramping investment is delayed in the second case to node 2 and node 3, even though the ramping capacity decisions are the same as before. The improved ramping technology reduces the need for peaking plant, giving an investment of 400 MW rather than 500 MW.

We now consider the same demand scenarios as in the previous example, but with increased wind penetration in some scenarios as shown in Fig. 9.

Fig. 9 Small 15-node scenario tree with uncertain demand and wind penetration

Investment decisions are made in the first three stages (nodes 1–7), and operational decisions are made in nodes 2–15. These assumptions are summarized in Table 3.

Table 3 Data for the 15-node model with wind variation
Table 4 Data for the 23-node model with wind variation and fuel cost changes

The value of more flexible plant, and consequently the cost of supplementing intermittent generation, is closely related to the fuel cost of peakers, i.e. the gas price (Table 4). In our model the gas price is assumed to be well predicted over the next 10 years, with one scenario for each existing node, but at the 15-year stage two gas-price scenarios branch out from each existing node. This adds 8 nodes, making the problem slightly larger than before, with 23 nodes (shown in Fig. 10).

Fig. 10 Small 23-node scenario tree with uncertain gas price and wind penetration. Gas prices are expressed in $/MWh

The results for GEMSTONE-MDP applied to the two cases above are shown in Table 5.

With a higher level of wind penetration, GEMSTONE-MDP delays ramping investment until node 2 and node 3, varying the level of investment according to which demand level was realised after the first time-stage (very similar to the 7-node model shown previously). More wind variation results in 600 MW of peaker expansion in node 1 in contrast to 500 MW in the previous model without wind variation.

When fuel costs are uncertain, GEMSTONE-MDP delays both the ramping and the peaking decisions to node 2 and node 3. Here the peaker capacity decision is delayed until the future cost of fuel is clearer.

To demonstrate the effectiveness of the decomposition in GEMSTONE-MDP, we finish this section by presenting some computation times for the models described above. All the algorithms were implemented on a desktop machine running Windows 7 (32-bit) with a dual-core 3.2 GHz processor and 4 GB of RAM. The software versions used were JAVA JDK 7 and CPLEX 14.2. Following [16], GEMSTONE-MDP solved all restricted master problems using the CPLEX barrier optimizer without crossover, which produces more balanced dual variables for column generation. The optimal solutions to the restricted master problems (which are linear programming relaxations) at the final iteration were all naturally integer. A comparison of CPU times is given in Table 6. The largest model we solved has 8 scenarios (giving a 23-node tree). Although this took about 6 h to yield an optimal solution, such a problem is far beyond the capacity of a standard application of mixed integer programming to this class of problems.

Table 5 Results for GEMSTONE-MDP with wind variation and fuel cost uncertainty
Table 6 CPU Time Comparison

6 Final comments

We have described a methodology for combining coarse-grained uncertainty over long time horizons with fine-grained uncertainty in operations. The ability to solve stochastic planning problems with multi-horizon scenario trees is limited by the capacity of mixed integer programming codes to scale. The Dantzig–Wolfe decomposition method described in this paper provides one approach to tackling this computational problem. An appealing feature of this methodology is the ability to target specific algorithms at the operational subproblems. In our case we applied Benders decomposition to an average reward Markov decision process represented as a linear program. The resulting algorithm is efficient enough to solve a stochastic capacity programming model with both fine and coarse uncertainties on a common desktop machine.

The solutions obtained by our GEMSTONE-MDP model are fundamentally different from those produced using screening curves and give better results when simulated with out-of-sample data. The objective functions in the new models more accurately reward flexible generation, and this often leads to different optimal capacity decisions.

Our models have assumed that levels of wind generation investment are exogenous. Such an assumption would be valid if an analyst were seeking the levels of conventional plant required to support a given level of installed wind capacity. An interesting extension would be to seek optimal levels of wind investment, given its effect on the variability in residual demand. This however would require alterations to the model so as to incorporate the change in transition matrices governing residual demand that would accompany such an investment, and this is not easy to represent in the decomposition framework we employ.

Finally, the results we obtain are for optimal capacity expansion by a social planner, to be used as a benchmark. As mentioned in the introduction, one might expect different expansion plans to be produced from equilibrium models in which agents are risk averse or exercise market power. Computing equilibrium solutions to such models is an active area of research.