1 Introduction

Nowadays, it is widely accepted that the most serious threat to ecosystems is the global warming fueled by the uncontrolled increase in carbon emissions, and for the last twenty-some years, starting with the Kyoto Protocol in 1997, international treaties have sprung up in the hope of addressing this negative externality. The most recent of these treaties is the Paris Agreement, with 196 signatories, which aims at keeping the increase in temperature below \(2 ^\circ \)C. Throughout the world, local and federal governments try to disincentivize reliance on polluting means of production by introducing carbon taxes or cap-and-trade programs. In the latter case, regulators put a limit on the allowable quantity of greenhouse gas (GHG) emissions, and any quantity above this limit has to be covered by emission certificates (allowances) or the payment of a penalty. In the former case, whether they are levied upstream or downstream, carbon taxes aim at penalizing the use of fossil fuels for their carbon content. The interested reader is referred to [11,12,13] for a review of the state of affairs in the early days of the European Union Emission Trading System, and mathematical treatments of thorough partial equilibrium models for the comparison of realistic implementations of these policies in the electricity sector.

According to the Environmental Protection Agency, electricity production claims the lion's share (\(25\%\)) of the total greenhouse gas emissions in the USA (Footnote 1). We therefore concentrate on the electricity sector and propose a model for the analysis of the impact of investments in clean means of production (e.g., solar and wind). While a model of the electricity sector should comprise at least three types of agents (electricity producers, resellers/retailers and end-users), we shall concentrate our modeling effort on the producers. Until the challenging technological problem of electricity storage is resolved at a larger scale, the demand for this commodity remains inelastic, and we shall penalize the producers for not matching the demand, which forces the independent system operator (ISO) to rely on costly reserves. In the following, we shall use the term renewable to mean electricity produced from wind turbines or solar panels. Alternatively, we shall use the term nonrenewable to mean electricity produced by burning fossil fuels like coal, crude oil or natural gas. We chose this convention for convenience, even if this literary license is not completely accurate.

Individual producers control over time their usage of fossil fuels, and hence, the amount of CO\(_2\) emissions they are responsible for. They also control their possible investment in solar or wind production, should they decide to go that route. Notice that while the decision to use fossil fuels changes over time, the investment in solar panels or wind turbines is a one-time decision made at the beginning of the time period under consideration. In our model, producing electricity from renewable sources involves an initial investment and no extra cost over time since the marginal cost of running these production assets is practically zero (except for maintenance costs and possible subsidies, which we ignore here). While the zero cost of production is an attractive feature, it comes with very high risks due to the difficulty of predicting the weather and the uncertainty associated with the high volatility of these predictions. On the other hand, production from traditional power plants is more predictable, the costs depending upon the prices of the fuels and the price put on the CO\(_2\) emissions by the regulator. Each producer has to find the right balance between the pros and the cons of the two major means of production we single out in our stylized model. The overarching goal is to decarbonize so as to meet emission targets, harnessing demand-side policies through the establishment of a tax, as well as supply-side resources including wind and solar production technologies.

Furthermore, we add to the model a regulator who is in charge of finding suitable carbon tax policies. When the policies are set by regulators, it is very hard to control and anticipate how the population is going to behave since this population is composed of rational agents who react to the regulators’ decisions. For example, if producers behave individually in a free market to optimize their own outcomes, this creates a game setting in the population. In this case, we need to find a Nash equilibrium behavior of the producers; in other words, we need to find a behavior where no one has a profitable deviation unilaterally for a given carbon tax level. This requires us to use game theory tools. However, finding a Nash equilibrium when there is a large number of producers involved is a very complex task in general.

Our economic model is based on the premise that the individual producers and the regulator only have access to aggregate quantities. Basically, they only have access to the statistical distributions of the productions, emissions, investments, etc., of the individual producers. As a result, we propose two separate frameworks for the individual producers to optimize the mix of renewable and nonrenewable production they should include in their portfolios. We compute and compare the optimal centralized strategies by solving mean field control problems, and the optimal decentralized strategies by solving mean field game problems. Our theoretical analysis relies on the probabilistic approach to construct forward–backward stochastic differential equation (FBSDE) systems for which we show, in both settings, existence and uniqueness of the solutions. Because we have both time-dependent and time-independent controls for the electricity producers, the FBSDE systems that characterize the solutions are nonstandard due to a term that is nonlocal in time. In this case, the existence of the solutions is proved by using Brouwer's fixed point theorem, and uniqueness is shown separately. Later, we propose a numerical approach to monitor the effect of a carbon tax on the optimal and equilibrium decisions in both cases. Quantifying the differences between the two approaches is reminiscent of what is known as the price of anarchy (PoA).

Among the conclusions drawn from the analysis of our model, we confirm that a carbon tax is an effective incentive for the use of renewables. Also intuitive is the fact that in the absence of a carbon tax, the overall pollution is greater when producers compete than when they cooperate. Less obvious is the fact that cooperating producers will pollute less than when they compete, even if the carbon tax is significant. We also show that stricter regulations tend to reduce the differences between competitive and cooperative equilibria. Further, we argue that the best way for the regulator to encourage producers to match the demand is to incentivize competition over cooperation among the producers.

1.1 Related Literature

Mean field game (MFG) models appeared simultaneously and independently in the original works of [9] and [23]. The thrust of these works was to propose a paradigm to overcome the challenges of the search for Nash equilibria in large games by considering models for which the interactions between the players were of a mean field type, and deriving effective equations in the limit when the number of players goes to infinity. Models in which a single player plays a different role from the field of remaining players were introduced and studied under the name of MFGs with major and minor players. In their Stackelberg version, they had a significant impact on problems in economic contract theory. See, for example, [7, 26, 18], or [14, 17] or [5]. Notice that in these models, the major player uses a time-dependent control, while in this paper, we shall assume that the regulator uses time-independent controls.

Using mean field models for energy applications is very natural. Competition in the oil industry and the impact of the renewable energy competition was analyzed in [19] and [15]. The early work [19] was extended with the addition of a regulator in [1]. In [3], optimal entry and exit times for two types of agents, electricity producers using either renewable or nonrenewable energy resources, are analyzed using MFGs. Competition among electricity producers is analyzed in [16] by using mean field type game where the mean field interactions come through conditional expectation of the electricity price and in [4] by using a model where the interactions enter the electricity spot price. In [2] and [17], electricity consumers constitute the mean field population and a single electricity producer plays the role of the principal, in contrast to our model where we take the electricity producers as the mean field population and the regulator as the principal.

Mean field models have also been used to model environmental impacts. In [6], a MFG model is proposed to model climate change negotiations among countries interacting through a CO\(_2\) emission permit market. Emission certificate markets are also studied in [27] and [28], again without the presence of a regulator.

1.2 Contributions and Paper Structure

The contributions of this paper are twofold. First, we propose a model with a regulator and a large number of players in order to find the optimal carbon tax policies. In this model, we let the regulator take into account the rational reactions of the producers in order to make an informed decision about the carbon tax policies. For the producers, we look at the free market setting where a large number of producers try to optimize their own outcomes individually (mean field game) and compare it with the social optimum solution where producers collaborate and behave as in a monopolistic situation (mean field control). We show that in the free market, producers are more likely to match the electricity demand, whereas collaborating producers underproduce in order to increase the unit price of electricity. We further show the advantage of this free market to the regulator because of the increased demand matching levels.

The second contribution is the analysis of a control problem with a nonstandard type of control. Indeed, in the model, producers have both time-independent (initial investment in renewable energy) and time-dependent (decisions related to the nonrenewable energy) controls. The initial investment decision affects the whole time horizon. To the best of our knowledge, this type of model is novel and prevents us from using standard theoretical and numerical tools from the literature. We analyze such problems by introducing nonstandard FBSDE systems that include terms that are nonlocal in time in order to characterize the Nash equilibrium and the social optimum. We further show existence and uniqueness of their solutions.

The paper is structured as follows. In Sect. 2, we introduce the minor players’ model and the various equilibrium notions used in the sequel. In Sect. 3 (resp. 4), the main theoretical (resp. numerical) results for the minor players’ model are given. In Sect. 5, we introduce the regulator and define the relevant notions of equilibrium. Finally, we provide numerical results for the combined model with minor players and the regulator in Sect. 6 and we summarize our findings in a short Sect. 7.

2 Mean Field Model for Electricity Producers

2.1 N-Player Model

Although we will focus on mean field limits involving an infinite number of players, we start with the description of what the \(\mathcal {N}\)-player version of the game would be. For symmetry reasons, we assume that the total electricity demand is split equally between all the agents, and each agent faces the same demand, say \(D_t\) at time t. The state of producer i is five-dimensional: instantaneous electricity production \(Q^i_t \in \mathbb {R}_+\), instantaneous irradiance \(S^i_t \in \mathbb {R}_+\), instantaneous emission level \(E^i_t \in \mathbb {R}_+\), cumulative pollution \(P^i_t \in \mathbb {R}_+\), and instantaneous nonrenewable energy production \(\tilde{N}^i_t \in \mathbb {R}_+\). Producer i controls their state by choosing at time \(t=0\), their initial investment \(R^i_e \in \mathbb {R}\) in renewable production assets (e.g., the number of solar panels they purchase), and at each subsequent time t, by choosing the rate of change \(N^i_t \in \mathbb {R}\) in nonrenewable energy production. Notice that \(N^i_t\) is time dependent, while \(R^i_e\) is time independent. This will be a challenging feature of the mathematical analysis of our model.

Remark 1

For the sake of definiteness, we use the terminology of solar power production. However, other types of renewable energy can be modeled in a similar way. For example, for wind power, \(S^i_t\) would stand for the instantaneous output of a wind farm and \(R_e\) would be the corresponding units of initial investment.

With these provisos out of the way, we define the time evolution of the state of producer i as:

The instantaneous electricity production changes depend on the instantaneous nonrenewable energy usage (given by term 1) and the instantaneous yield from the renewable energy investment (given by term 2). This second term includes a seasonality component (sinusoidal term) and a random shock for the variability of the sun irradiance. The form of the seasonality component was chosen for the sake of simplicity. It can easily be extended to several harmonics to include nightly and daily, monthly and yearly effects. In any case, we have \( Q_t^i= Q^i_0 + \kappa _1 \tilde{N}^i_t + \kappa _2 R^i_e (\sin (\alpha t) + S^i_t) \) where \(\kappa _1, \kappa _2>0\) are constants that give the efficiency of the production from nonrenewable and renewable energy, respectively. The constant \(\alpha >0\) gives the period of the seasonality of the renewable energy.

We model the idiosyncratic noise terms \(S^i_t\) in the renewable productions as independent stationary processes. For the sake of definiteness, we assume that they are Ornstein–Uhlenbeck processes with the same mean \(\theta >0\) and volatility \(\sigma _0>0\), driven by independent Wiener processes.

The dynamics of the instantaneous emissions \(E^i_t\) have two components: the contribution from the production from nonrenewable energy power plants and idiosyncratic random shocks with constant volatility \(\sigma _1>0\) given by independent Wiener processes \(W^i\), which are also independent of the Wiener processes driving the \(S^i\)'s. The choice of the constant \(\delta \) could include the effects of some abatement measures such as carbon capture, sequestration and the use of filters.
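To make the state dynamics concrete, the following Python sketch simulates one producer's path with an Euler–Maruyama scheme. The drift and noise terms are read off from the matrices \(A\), \(B\), \(C_t\) and \(\Sigma \) of the numerical section, which is an assumption on our part. The function name `simulate_producer`, the constant ramp rate `N_rate` and the default parameter values are hypothetical.

```python
import numpy as np

def simulate_producer(R_e, N_rate, T=1.0, steps=1000, kappa1=0.13, kappa2=0.1,
                      alpha=2 * np.pi, theta=5.0, delta=0.15,
                      sigma0=0.01, sigma1=0.01, seed=0):
    """Euler-Maruyama path of (Q, S, E, P, N_tilde) for one producer.

    Assumed dynamics (inferred from the matrix form, not quoted from the text):
      dS = (theta - S) dt + sigma0 dB
      dQ = kappa1 * N dt + kappa2 * R_e * (alpha * cos(alpha t) dt + dS)
      dE = delta * N dt + sigma1 dW
      dP = E dt,  dN_tilde = N dt
    N_rate is a constant ramp rate, a hypothetical control for illustration.
    """
    rng = np.random.default_rng(seed)
    dt = T / steps
    Q, S, E, P, Nt = (np.zeros(steps + 1) for _ in range(5))
    S[0] = theta  # start the irradiance at its long-run mean
    for k in range(steps):
        t = k * dt
        dB, dW = rng.normal(0.0, np.sqrt(dt), size=2)  # independent increments
        dS = (theta - S[k]) * dt + sigma0 * dB
        S[k + 1] = S[k] + dS
        Q[k + 1] = (Q[k] + kappa1 * N_rate * dt
                    + kappa2 * R_e * (alpha * np.cos(alpha * t) * dt + dS))
        E[k + 1] = E[k] + delta * N_rate * dt + sigma1 * dW
        P[k + 1] = P[k] + E[k] * dt
        Nt[k + 1] = Nt[k] + N_rate * dt
    return Q, S, E, P, Nt
```

With `R_e = 0` the production path is deterministic and grows at rate `kappa1 * N_rate`, which gives a quick sanity check of the scheme.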

Using the notation \({\tilde{N}}^i_t\) for the instantaneous nonrenewable energy production given by \( {\tilde{N}}^i_t = {\tilde{N}}^i_0 + \int _0^t N^i_s ds \), the expected cost to producer i over the whole period is:

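$$\begin{aligned} C^i(N^i, R^i_e; {\bar{Q}}) = \mathbb {E}\Big [\int _0^T \Big [\underbrace{c_{1} |N^i_t|^2}_{\text {term 1}} + \underbrace{p_1 \tilde{N}^i_t}_{\text {term 2}} + \underbrace{c_2|Q^i_t-D_t|^2}_{\text {term 3}} - \underbrace{c_3\big (\rho _0 + \rho _1(D_t- {\bar{Q}}_t)\big )Q^i_t}_{\text {term 4}}\Big ] \mathrm{d}t + \underbrace{\tau |P^i_T|^2}_{\text {term 5}} + \underbrace{p(R^i_e)}_{\text {term 6}}\Big ], \end{aligned}$$
(1)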

where \({\bar{Q}} = \sum _{j=1}^{\mathcal {N}} Q^j / \mathcal {N}\) and \(p: \mathbb {R}_+ \mapsto \mathbb {R}_+\) is the price function for the investment in renewable energy.

Term 1 with \(c_1>0\) is a penalty (i.e., delay cost) for attempting to ramp up and down nonrenewable energy power plants too quickly. Term 2 represents the costs of the fossil fuels used in nonrenewable power plants. The constant \(p_1>0\) can be understood as the average cost of one unit of fossil fuel. In lieu of storage, which is not included in our models because of its scarcity, Term 3 with \(c_2>0\) imposes a penalty on producers for not matching the demand and forcing the system operator to use costly reserves. Term 4, scaled by \(c_3>0\), represents the revenues from electricity production, \(\big (\rho _0 + \rho _1 (D_t - {\bar{Q}}_t)\big )\) being the inverse demand function, which is assumed to be linear in excess demand or supply, where \(\rho _0\) and \(\rho _1\) are strictly positive constants. It captures the fact that the price increases if there is excess demand, and it decreases if there is excess supply. We assume that the producers sell all that they produce. This term introduces the mean field interactions into the model. Term 5 gives the pollution damage cost for the producer. This cost is levied by the regulator through a carbon tax, and the damage is assumed to increase with the level of cumulative pollution. Therefore, we emphasize its role by assuming it is proportional to the square of the terminal pollution. Term 6 is the total cost related to the initial investment in renewable electricity production, including the price of the solar panels and the cost of the land used.
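The running part of the cost can be sketched in a few lines of Python. The decomposition below follows the integrand of the mean field cost (3); the function name `running_cost` and the argument names are ours, chosen only for illustration.

```python
def running_cost(N, N_tilde, Q, D, Q_bar, c1, p1, c2, c3, rho0, rho1):
    """Instantaneous cost rate of one producer, following the integrand of (3)."""
    ramp = c1 * N ** 2                  # penalty for ramping too quickly
    fuel = p1 * N_tilde                 # cost of the fossil fuel burned
    mismatch = c2 * (Q - D) ** 2        # penalty for not matching the demand
    price = rho0 + rho1 * (D - Q_bar)   # linear inverse demand function
    revenue = c3 * price * Q            # producers sell what they produce
    return ramp + fuel + mismatch - revenue
```

When the average production `Q_bar` falls below the demand `D`, the price rises, so the same individual production `Q` yields a lower (more negative) cost rate. This is the channel through which the mean field enters.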

2.2 The Mean Field Model

In discussing the mean field regime of the model, we focus on a representative producer interacting with the field of the other producers, so we drop the superscript i and the state dynamics equations become:

(2)

where W and the Wiener process driving S are independent. Accordingly, the total expected cost becomes:

$$\begin{aligned} C(N, R_e; {\bar{Q}})= & {} \mathbb {E}\Big [\int _0^T \Big [c_{1} |N_t|^2 + p_1 \tilde{N}_t+ c_2|Q_t-D_t|^2 \nonumber \\&-\, c_3\big (\rho _0 + \rho _1(D_t- {\bar{Q}}_t)\big )Q_t\Big ] \mathrm{d}t + \tau |P_T|^2 + p(R_e)\Big ], \end{aligned}$$
(3)

where \({\bar{Q}}_t=\mathbb {E}[Q_t]\). We shall sometimes use the notation \({\bar{Q}}_t(N,R_e)\) to emphasize the fact that the expectation is computed under the state dynamics controlled by the admissible control \((N,R_e)\).

2.3 Equilibrium Notions

We consider two different models: mean field game (MFG) and mean field control (MFC). In the mean field game model, producers behave competitively and minimize their total expected costs (search for their best responses) given the other players’ decisions. A Nash equilibrium is then characterized as a fixed point of the best response map so defined. In the sequel, we restrict our attention to admissible strategies \((N,R_e)\) such that \(\mathbb {E}[\int _0^T|N_t|^2 \mathrm{d}t] <+\infty \) and \(R_e \in \mathbb {R}_+\).

Definition 1

(MFG Nash equilibrium) An admissible strategy and mean field flow tuple, \(({\hat{N}}, {\hat{R}}_e, {\bar{Q}})\), is called an MFG Nash equilibrium if for any admissible \((N, R_e)\), we have:

$$\begin{aligned} C\Big ({(N, R_e)}; {\bar{Q}}\Big ) \ge C\Big (({\hat{N}}, {\hat{R}}_e); {\bar{Q}}\Big ), \end{aligned}$$

and \({\bar{Q}} = {\bar{Q}}({\hat{N}}, {\hat{R}}_e)\).

In the mean field control case, we assume that the producers cooperate and leave the choice of the control to a social planner minimizing the total expected cost as defined in (3). In a realistic setup, the producers can be thought of as the production facilities of a monopolistic electricity production firm, and the social planner's decisions refer to the decisions taken by the headquarters. In this case, if one player changes their behavior, every player changes in the same way, and the mean field is affected. The problem is now an optimal control problem.

Definition 2

(Social Planner’s MFC Optimum) An admissible strategy and mean field flow tuple, \(({\hat{N}},{\hat{R}}_e, {\bar{Q}})\), are called an MFC optimum if for any admissible \((N, R_e)\), we have:

$$\begin{aligned} C\Big ({(N, R_e)}; {\bar{Q}}({N, R_e})\Big ) \ge C\Big (({\hat{N}}, {\hat{R}}_e); {\bar{Q}}\Big ), \end{aligned}$$

and \({\bar{Q}} = {\bar{Q}}({\hat{N}}, {\hat{R}}_e)\).
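The difference between the two definitions can be seen in a one-shot toy problem, which is a hypothetical simplification and not the model of this paper. Suppose a producer picks a single output \(q\) with cost \(c_2(q-D)^2 - c_3(\rho _0+\rho _1(D-{\bar{q}}))q\). In the MFG case, \({\bar{q}}\) is frozen when writing the first-order condition and then set equal to \(q\). In the MFC case, \({\bar{q}}=q\) is substituted before differentiating. The Python sketch below implements both closed forms.

```python
def static_mfg_output(D, c2, c3, rho0, rho1):
    """Toy Nash output: solve 2*c2*(q - D) - c3*(rho0 + rho1*(D - q_bar)) = 0
    with q_bar frozen, then impose the consistency condition q_bar = q."""
    return (2 * c2 * D + c3 * (rho0 + rho1 * D)) / (2 * c2 + c3 * rho1)

def static_mfc_output(D, c2, c3, rho0, rho1):
    """Toy planner output: minimize c2*(q - D)**2 - c3*(rho0 + rho1*(D - q))*q,
    where the mean field moves together with q."""
    return (2 * c2 * D + c3 * (rho0 + rho1 * D)) / (2 * c2 + 2 * c3 * rho1)
```

In this toy setting the planner internalizes the price impact of production, so the cooperative output is always the smaller of the two. The two outputs get closer as `c2` grows.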

3 Main Theoretical Results

In this section, the following forward–backward stochastic differential equation system (FBSDE) is going to be of interest:

(4)

Theorem 1

\(({\hat{N}}_t,{\hat{R}}_e, {\bar{Q}})\) is a Nash equilibrium if and only if \(({\hat{N}}, {\hat{R}}_e)\) is given by:

$$\begin{aligned} \begin{aligned} {\hat{N}}_t&= -\dfrac{Y_t^1\kappa _1+Y_t^3\delta +Y_t^5}{2c_1}, \quad t \in [0,T]\quad \text {and}\quad \\ {\hat{R}}_e&=(p^{\prime })^{-1} \Big (-\mathbb {E}\Big [\int _0^T \kappa _2Y_t^1\Big (\alpha \cos (\alpha t) + (\theta {-S_t}) \Big ) \mathrm{d}t \Big ]\Big ), \end{aligned} \end{aligned}$$
(5)

where \((Q,S,E,P,\tilde{N},Y^1,Y^2,Y^3,Y^4,Y^5)\) is a solution to the FBSDE given in (4).

Here, \((p^\prime )^{-1}(\cdot )\) refers to the inverse of the first derivative of the function \(p(\cdot )\).

Condition 1

(i) p is convex.

(ii) \((p^{\prime })^{-1}\) is bounded, i.e., \((p^{\prime })^{-1}:\mathbb {R}\mapsto [0, R_e^{\text {max}}]\), continuous and monotone.

Theorem 2

If Condition 1 holds, then there exists a unique Nash equilibrium mean field flow \({\bar{Q}}\).

Theorem 3

\(({\hat{N}}, {\hat{R}}_e)\) is an MFC optimum if and only if \(({\hat{N}}, {\hat{R}}_e)\) is given by (5) where \((Q,S,E,P,\tilde{N},Y^1,Y^2,Y^3,Y^4,Y^5)\) is a solution to the FBSDE given in (4) where the equation for \((Y^1_t)_t\) is replaced by

(6)

Theorem 4

If Condition 1 holds, then there exists a unique mean field control optimum flow \({\bar{Q}}\).

4 Numerical Approach

For numerical purposes, given the technical challenges posed by the solution of the large FBSDE in (4) with both time-dependent and time-independent controls, we implement an analytic approach for which we give the details below. To this end, we first notice that:

$$\begin{aligned} \inf _{(N_t)_t, R_e} C(N, R_e; {\bar{Q}}) = \inf _{R_e} \inf _{(N_t)_t} C(N, R_e; {\bar{Q}}), \end{aligned}$$

and we assume that \(R_e\) is fixed in a first analysis. Next, we rewrite the model in matrix form using \(X_t:=[Q_t\quad S_t\quad E_t\quad P_t\quad \tilde{N}_t]^{\top }\) as five-dimensional state process at time t and rewrite the optimization problem as:

$$\begin{aligned} \begin{aligned} \inf _{(N_t)_t} \tilde{C}\Big (N; R_e, {\bar{X}}\Big ) =&\inf _{(N_t)_t} \mathbb {E}\Bigg [\int _0^{T} \Big [\frac{R}{2} |N_t|^2 + H^{\top }_t X_t + \bar{X}_t^{\top }F X_t + X_t^{\top } G X_t + J_t\Big ] \mathrm{d}t\\ {}&\qquad {+ X^{\top }_T S_T X_T + p(R_e)} \Bigg ] \end{aligned} \end{aligned}$$
(7)
$$\begin{aligned} \mathrm{d}X_t = \Big (A X_t + B \cdot N_t + C_t \Big ) \mathrm{d}t + \Sigma d\widetilde{W}_t \end{aligned}$$

where R and \(J_t\) are the scalars given by \(R = 2c_1\) and \(J_t = c_2 D_t^2\) and:

$$\begin{aligned}&H_t = \begin{bmatrix} -(2c_2+c_3\rho _1)D_t-c_3 \rho _0\\ 0\\ 0\\ 0\\ p_1 \end{bmatrix}, F = \begin{bmatrix} c_3 \rho _1&{} 0 &{} 0 &{} 0 &{} 0\\ 0 &{} 0 &{} 0 &{} 0 &{} 0\\ 0 &{} 0 &{} 0 &{} 0 &{} 0\\ 0 &{} 0 &{} 0 &{} 0 &{} 0\\ 0 &{} 0 &{} 0 &{} 0 &{} 0 \end{bmatrix}, G = \begin{bmatrix} c_2 &{} 0 &{} 0 &{} 0 &{} 0\\ 0 &{} 0 &{} 0 &{} 0 &{} 0\\ 0 &{} 0 &{} 0 &{} 0 &{} 0\\ 0 &{} 0 &{} 0 &{} 0 &{} 0\\ 0 &{} 0 &{} 0 &{} 0 &{} 0 \end{bmatrix},\\&\quad S_T = \begin{bmatrix} 0 &{} 0 &{} 0 &{} 0 &{} 0\\ 0 &{} 0 &{} 0 &{} 0 &{} 0\\ 0 &{} 0 &{} 0 &{} 0 &{} 0\\ 0 &{} 0 &{} 0 &{} \tau &{} 0\\ 0 &{} 0 &{} 0 &{} 0 &{} 0 \end{bmatrix}, \end{aligned}$$
$$\begin{aligned} A = \begin{bmatrix} 0 &{} -\kappa _2 R_e &{} 0 &{} 0 &{} 0\\ 0 &{} -1 &{} 0 &{} 0 &{} 0\\ 0 &{} 0 &{} 0 &{} 0 &{} 0\\ 0 &{} 0 &{} 1 &{} 0 &{} 0\\ 0 &{} 0 &{} 0 &{} 0 &{} 0 \end{bmatrix}, B = \begin{bmatrix} \kappa _1\\ 0\\ \delta \\ 0\\ 1 \end{bmatrix} , C_t = \begin{bmatrix} \kappa _2 R_e\Big ( \alpha \cos (\alpha t) + \theta \Big )\\ \theta \\ 0\\ 0\\ 0 \end{bmatrix}, \Sigma = \begin{bmatrix} \kappa _2 R_e \sigma _0 &{} 0\\ \sigma _0 &{} 0\\ 0 &{} \sigma _1\\ 0 &{} 0\\ 0 &{} 0 \end{bmatrix}. \end{aligned}$$

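Furthermore, we let \(\widetilde{W}_t\) denote the two-dimensional Wiener process obtained by stacking the Wiener process driving \(S\) and \(W\), we set \(a = \frac{1}{2} \Sigma \Sigma ^{\top }\), and we define the value function \(u(t, X)\) as: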

$$\begin{aligned} \begin{aligned} u(t, X)&= \inf _{(N_s)_s} \mathbb {E}\Bigg [\int _t^{T} \Big [\frac{R}{2} |N_s|^2 + H^{\top }_s X_s + \bar{X}_s^{\top }F X_s + X_s^{\top } G X_s + J_s\Big ] ds\\&\qquad { + X^{\top }_T S_T X_T + p(R_e)\Big | X_t= X \Bigg ]}. \end{aligned} \end{aligned}$$
(8)

Lemma 1

(ODE system for the MFG) For \(R_e\) fixed, if there exists a function \(t\mapsto (\eta _t, r_t, {\bar{X}}_t)\) solving the following system of ordinary differential equations (ODEs):

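$$\begin{aligned} -\frac{d{\eta _t}}{\mathrm{d}t} = A^{\top } {\eta _t} + {\eta _t} A - {\eta _t} B R^{-1} B^{\top } {\eta _t} + 2G, \qquad {\eta _T} = 2 S_T \end{aligned}$$
(9a)
$$\begin{aligned} -\frac{d{r_t}}{\mathrm{d}t} = \left( A^{\top } - {\eta _t} B R^{-1} B^{\top }\right) {r_t} + {\eta _t} C_t + H_t + F^{\top } {\bar{X}_t}, \qquad {r_T} = 0 \end{aligned}$$
(9b)
$$\begin{aligned} \frac{d{\bar{X}_t}}{\mathrm{d}t} = \left( A - B R^{-1} B^{\top } {\eta _t}\right) {\bar{X}_t} - B R^{-1} B^{\top } {r_t} + C_t, \qquad {\bar{X}_0} = \mathbb {E}[X_0] \end{aligned}$$
(9c)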

and if \(s_0\) is given by:

$$\begin{aligned} s_0 = p(R_e) + \int _0^{T} \Big (\mathrm{tr}(a{\eta _t}) -\frac{1}{2} {r_t}^{\top } B R^{-1} B^{\top } {r_t} +C_t^{\top } {r_t}+ J_t\Big )\mathrm{d}t, \end{aligned}$$
(10)

then \({\hat{N}}_t(R_e) = -R^{-1}B^{\top }(\eta _t X_t + r_t)\) is the MFG equilibrium given \(R_e\) fixed, and the expected cost to the representative producer in this equilibrium is:

$$\begin{aligned} \inf _{N=(N_t)_t} \tilde{C}^{MFG}\Big (N; R_e, {\bar{X}} \Big ) = \frac{1}{2} \left( Var(\sqrt{\eta _0} X_0) + \mathbb {E}[\sqrt{\eta _0} X_0]^2 \right) +{\bar{X}}_0^{\top } r_0 + s_0. \end{aligned}$$
(11)
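The first step of the lemma, the matrix Riccati equation for \(\eta _t\), can be integrated backward in time. The Python sketch below assumes the standard linear-quadratic form \(-\dot{\eta }_t = A^{\top }\eta _t + \eta _t A - \eta _t B R^{-1} B^{\top } \eta _t + 2G\) with \(\eta _T = 2S_T\), which is the form implied by the feedback \({\hat{N}}_t = -R^{-1}B^{\top }(\eta _t X_t + r_t)\) and the quadratic cost in (7). The RK4 integrator and the function names `solve_riccati` and `feedback_control` are our own choices, with \(B\) a column vector and \(R\) a scalar.

```python
import numpy as np

def solve_riccati(A, B, R, G, S_T, T, steps):
    """Integrate the assumed Riccati equation backward from eta_T = 2 * S_T
    with classical RK4; returns eta on the uniform grid, shape (steps+1, n, n)."""
    dt = T / steps
    Pi = B @ B.T / R  # B R^{-1} B^T for scalar R

    def rhs(eta):
        # derivative of eta with respect to the reversed time s = T - t
        return A.T @ eta + eta @ A - eta @ Pi @ eta + 2.0 * G

    eta = np.empty((steps + 1,) + A.shape)
    eta[steps] = 2.0 * S_T
    for k in range(steps, 0, -1):
        e = eta[k]
        k1 = rhs(e)
        k2 = rhs(e + 0.5 * dt * k1)
        k3 = rhs(e + 0.5 * dt * k2)
        k4 = rhs(e + dt * k3)
        eta[k - 1] = e + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return eta

def feedback_control(eta_t, r_t, X, B, R):
    """Feedback N_t = -R^{-1} B^T (eta_t X + r_t) from the lemma."""
    return -(B.T @ (eta_t @ X + r_t)) / R
```

In the scalar case \(A=0\), \(B=1\), \(R=1\), \(G=0\), the equation reduces to \(\dot{\eta } = \eta ^2\), whose solution \(\eta _t = \eta _T/(1+\eta _T(T-t))\) provides a simple check of the integrator.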

Theorem 5

For \(R_e\) fixed, if T is small enough, there exists a unique MFG equilibrium.

Lemma 2

(MFC ODE system) Given \(R_e\), if there exists a function \(t\mapsto (\eta _t, r_t, {\bar{X}}_t)\) solving the ODE system (9) with the equation (9b) replaced by:

$$\begin{aligned} -\frac{d{r_t}}{\mathrm{d}t} = \left( A^{\top } - {\eta _t} B R^{-1} B^{\top }\right) {r_t} + {\eta _t} C_t + H_t + F^{\top } {\bar{X}_t} + {F {\bar{X}_t}}, \qquad {r_T} = 0 \end{aligned}$$
(12)

and the same \(s_0\) given by (10), then \(N^*_t(R_e) = -R^{-1}B^{\top }(\eta _t X_t + r_t)\) is an optimum for the MFC problem given \(R_e\), and the minimal expected cost is

$$\begin{aligned} \begin{aligned} \inf _{(N_t)_t} \tilde{C}^{MFC}\Big (N; R_e, {\bar{X}}\Big )&= \frac{1}{2} \left( Var(\sqrt{\eta _0} X_0) + \mathbb {E}[\sqrt{\eta _0} X_0]^2 \right) +{\bar{X}}_0^{\top } r_0 + s_0\\&\qquad {{-\int _0^T {\bar{X}}_t^{\top }F {\bar{X}}_t \mathrm{d}t}}. \end{aligned} \end{aligned}$$
(13)

Theorem 6

For \(R_e\) fixed, if T is small enough, there exists a unique MFC optimum.

Numerically, we search for the \(R_e\) and the corresponding equilibrium \(N=(N_t)_{t}\) that minimize the cost of the minor players by using the ODE systems given in (9) and (12).

As emphasized earlier, the main difference between MFC and MFG is whether the mean field is affected by the decision of the representative producer (MFC), or taken to be fixed (MFG). This difference translates into the addition of a fixed point argument in the MFG case. For pedagogical reasons, we first discuss the MFC case, then the MFG. After solving the Riccati equation which is the same in both cases, we solve the coupled ODE system directly in the MFC case in order to find the mean field; on the other hand, notice that in the MFG case, the ODEs are decoupled since the mean field is assumed to be fixed in each iteration of the fixed point algorithm.

4.1 Mean Field Control Algorithm

In order to solve the system of MFC coupled ODEs for \((\bar{X}_t)_t\) and \(r_t\) given by equations (9c) and (12), we discretize the time with uniform step size \(\Delta t\) and solve the following linear equation:

$$\begin{aligned} \begin{bmatrix} \bar{X}\\ r \end{bmatrix} = M \begin{bmatrix} \bar{X}\\ r \end{bmatrix} + K, \end{aligned}$$
(14)

where \(\bar{X} = [\bar{X}_0, \bar{X}_{\Delta t} ,\bar{X}_{2\Delta t}, \dots , \bar{X}_T]^{\top }\), \(r = [r_0, r_{\Delta t} ,r_{2\Delta t}, \dots , r_T]^{\top }\).

figure b
figure c
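One possible way to assemble and solve a linear system of the form (14) is sketched below in Python. It assumes an explicit Euler discretization of the mean field ODE and of the backward ODE (12), with \(\eta \) already computed on the grid. The construction of `M` and `Kvec`, the function name `solve_mfc_linear_system` and the argument layout are our assumptions and need not coincide with the matrices \(M\) and \(K\) used in the experiments.

```python
import numpy as np

def solve_mfc_linear_system(eta, A, B, R, C, H, F, x0, dt):
    """Solve the Euler-discretized coupled ODEs for (X_bar, r) as z = M z + K.

    eta: (K1, n, n); C, H: (K1, n); x0: (n,); B: (n, 1); R: scalar.
    Returns X_bar and r, each of shape (K1, n)."""
    K1, n = C.shape
    Pi = B @ B.T / R
    I = np.eye(n)
    size = 2 * K1 * n
    M = np.zeros((size, size))
    Kvec = np.zeros(size)
    xi = lambda k: slice(k * n, (k + 1) * n)                # block of X_bar_k
    ri = lambda k: slice((K1 + k) * n, (K1 + k + 1) * n)    # block of r_k
    Kvec[xi(0)] = x0                                        # X_bar_0 = x0
    for k in range(K1 - 1):
        # forward step: X_{k+1} = X_k + dt[(A - Pi eta_k) X_k - Pi r_k + C_k]
        M[xi(k + 1), xi(k)] = I + dt * (A - Pi @ eta[k])
        M[xi(k + 1), ri(k)] = -dt * Pi
        Kvec[xi(k + 1)] = dt * C[k]
        # backward step: r_k = r_{k+1} + dt[(A^T - eta_{k+1} Pi) r_{k+1}
        #                + eta_{k+1} C_{k+1} + H_{k+1} + (F^T + F) X_{k+1}]
        M[ri(k), ri(k + 1)] = I + dt * (A.T - eta[k + 1] @ Pi)
        M[ri(k), xi(k + 1)] = dt * (F.T + F)
        Kvec[ri(k)] = dt * (eta[k + 1] @ C[k + 1] + H[k + 1])
    # the terminal block of r is left at zero, which encodes r_T = 0
    z = np.linalg.solve(np.eye(size) - M, Kvec)
    return z[:K1 * n].reshape(K1, n), z[K1 * n:].reshape(K1, n)
```

Solving the forward and backward equations in one shot avoids any iteration, which is convenient in the MFC case where the mean field and \(r\) are genuinely coupled.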

4.2 Mean Field Game Algorithm

In the mean field game case, since in each iteration \((\bar{X}_t)_t\) is assumed to be fixed, the ODE for \((r_t)_t\) in equation (9b) can be solved directly by using the following linear equation after we discretize time:

$$\begin{aligned} r = M_r r+ K_r, \end{aligned}$$
(15)

where \(r = [r_0, r_{\Delta t} ,r_{2\Delta t}, \dots , r_T]^{\top }\). Then with this \((r_t)_t\), the time discretization of \((\bar{X}_t)_t\) with dynamics given by Eq. (9c) can be written as:

$$\begin{aligned} \bar{X} = M_{\bar{X}} \bar{X} + K_{\bar{X}}, \end{aligned}$$
(16)

where \(\bar{X} = [\bar{X}_0, \bar{X}_{\Delta t} ,\bar{X}_{2\Delta t}, \dots , \bar{X}_T]^{\top }\). The numerical algorithms to find the mean field control and game equilibria are given in detail in the following sections.

figure d
figure e
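The fixed point structure of the MFG case can be sketched as a Picard iteration in Python. Instead of the linear solves (15) and (16), the sketch uses explicit backward and forward Euler sweeps, which is a substitution on our part. The backward equation for \(r\) is assumed to be (12) without the extra \(F{\bar{X}}_t\) term, and the function name `solve_mfg_fixed_point`, the tolerance and the initial guess are hypothetical.

```python
import numpy as np

def solve_mfg_fixed_point(eta, A, B, R, C, H, F, x0, dt, tol=1e-10, max_iter=500):
    """Picard iteration on the mean field X_bar for the discretized MFG system.

    eta: (K1, n, n); C, H: (K1, n); x0: (n,); B: (n, 1); R: scalar.
    Returns (X_bar, r, number_of_iterations)."""
    K1, n = C.shape
    Pi = B @ B.T / R
    X = np.tile(np.asarray(x0, dtype=float), (K1, 1))  # initial guess: constant flow
    for it in range(max_iter):
        # backward sweep for r with the mean field frozen (terminal value r_T = 0)
        r = np.zeros((K1, n))
        for k in range(K1 - 2, -1, -1):
            r[k] = r[k + 1] + dt * ((A.T - eta[k + 1] @ Pi) @ r[k + 1]
                                    + eta[k + 1] @ C[k + 1] + H[k + 1]
                                    + F.T @ X[k + 1])
        # forward sweep for the mean field induced by the best response
        X_new = np.empty_like(X)
        X_new[0] = x0
        for k in range(K1 - 1):
            X_new[k + 1] = X_new[k] + dt * ((A - Pi @ eta[k]) @ X_new[k]
                                            - Pi @ r[k] + C[k])
        gap = np.max(np.abs(X_new - X))
        X = X_new
        if gap < tol:
            return X, r, it + 1
    raise RuntimeError("fixed-point iteration did not converge")
```

Plain Picard iteration of this kind can be expected to converge only when the horizon is short enough for the best response map to be a contraction. For longer horizons a damped update of the mean field may be needed.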

4.3 Numerical Experiments

In the numerical experiments reported below, we use the following parameter values:

\( p_1 = 7/\Delta t\) (dollar/time)

\( \rho _0=40/\Delta t\)

\(\theta =5 \)

\( p_2=10^4\), \(p_3=10^{-10}\)

\( \rho _1=0.1/\Delta t\)

\(T =20\) years, \(\Delta t= 10\) days

\( c_1=10^{-4}\) (dollar/\(10^3\) cu ft\(^2\))

\(\alpha =40\pi \)

\(R_e \in [0,5\times 10^3]\) (10,000 dollars)

\( c_3=1\) (dollar/\(10^3\) cu ft)

\(\delta =0.15\)

\( D_t =2\times 10^4 - 5\times 10^2 \cos (80\pi \Delta t)\)

\(\kappa _1 =0.13\) (MWh/\(10^3\) cu ft)

\(\sigma _0=0.01 \)

\(\bar{X}_0 = [0,\theta ,0,0,0]\)

\(\kappa _2=0.1\) (MWh/\(10^3\) dollars)

\(\sigma _1=0.01 \)

\(Var[X_0] = [0,0.1,0,0,0]\)

Furthermore, we assume that \(p(R_e)=p_2 R_e-p_3\sqrt{R_e(R_e^{\max }-R_e)}+\epsilon \) where \(p_2, p_3\) are positive constants and \(\epsilon >0\) is a small constant that ensures the nonnegativity of the price of the units of the renewable energy investment (Footnote 2).
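For this choice of \(p\), the map \((p^{\prime })^{-1}\) appearing in (5) admits a closed form, which the following Python sketch implements. Writing \(R_e = R_e^{\max }(1-\cos \varphi )/2\) with \(\varphi \in (0,\pi )\) gives \(p^{\prime }(R_e) = p_2 - p_3\cot \varphi \), so that \((p^{\prime })^{-1}(y) = \frac{R_e^{\max }}{2}\big (1 - d/\sqrt{d^2+p_3^2}\big )\) with \(d = p_2 - y\). This derivation and the function names are ours.

```python
import numpy as np

def p_prime(R, p2, p3, R_max):
    """Derivative of p(R) = p2*R - p3*sqrt(R*(R_max - R)) + eps on (0, R_max)."""
    return p2 - p3 * (R_max - 2.0 * R) / (2.0 * np.sqrt(R * (R_max - R)))

def p_prime_inverse(y, p2, p3, R_max):
    """Closed-form inverse of p_prime; maps the real line into (0, R_max)."""
    d = p2 - y
    return 0.5 * R_max * (1.0 - d / np.hypot(d, p3))
```

The inverse is continuous, increasing and takes values in \([0, R_e^{\max }]\), in line with Condition 1 (ii). When \(p_3\) is very small compared with \(p_2 - y\), the expression suffers from floating-point cancellation and returns a value numerically equal to zero.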

We focus on natural gas as the source of nonrenewable energy. In our numerical experiments, we ignore the effect of the COVID-19 pandemic, and we run simulations for 10 years starting from March 2020. For the cost of solar power, we use the current assumption that a 1 MW solar farm needs roughly a $1M investment (Footnote 3), and we assume that peak sun hours last about 5 h per day in the USA to compute the daily average production from solar panels. Furthermore, we choose \(\alpha \) to take into account the seasonality, since sun exposure levels are maximal during summers and minimal during winters. Therefore, we infer that one unit of investment \(R_e\) corresponds to roughly $10,000 and that on average it generates \(\kappa _2(\theta \pm 1)=0.1(5\pm 1)\) MWh of electricity in 10 days (Footnote 4).

According to the data provided by the U.S. Energy Information Administration (EIA) in 2018 (Footnote 5), we assume that 1000 cubic feet of natural gas produce approximately 0.13 MWh. Again, according to the emission data provided by the EIA (Footnote 6), we take \(\delta =0.15\).

We assume that the average demand of electricity for each plant is around the capacity of the plants. By using the data provided by the EIA (Footnote 7), we find the average daily capacity of a natural gas plant. Furthermore, the monthly seasonal component is found by using the 2018 monthly residential electricity consumption data given by the EIA (Footnote 8). Therefore, the 10-day demand is taken to be sinusoidal, to capture the seasonality, around 20,000 MWh. According to the data provided by the EIA (Footnote 9), in 2018 the operation and maintenance costs of nonrenewable energy amounted to 40% of its fuel cost on top of the fuel cost itself, and the price of 1000 cu ft of natural gas can be assumed to be $5. Therefore, we take \(p_1=\$7\). Finally, by using the data given by the EIA (Footnote 10), we see that the average price of wholesale electricity is around $40 per MWh; therefore, we take \(\rho _0=40\).
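The calibration arithmetic behind \(\kappa _2\theta \) and \(p_1\) can be retraced in a few lines of Python. The variable names are ours, and the inputs are the round figures quoted in the text.

```python
# Solar: a 1 MW farm costs roughly $1M, and one unit of R_e is $10,000.
capex_per_mw = 1_000_000            # dollars per MW of installed capacity
unit_investment = 10_000            # dollars per unit of R_e
mw_per_unit = unit_investment / capex_per_mw   # 0.01 MW per unit of R_e

peak_sun_hours_per_day = 5
days_per_step = 10                  # the time step is 10 days
mwh_per_unit_per_step = mw_per_unit * peak_sun_hours_per_day * days_per_step

kappa2, theta = 0.1, 5              # parameter values used in the experiments

# Natural gas: $5 per 1000 cu ft plus 40% operation and maintenance costs.
fuel_price = 5.0
om_share = 0.40
p1 = fuel_price * (1.0 + om_share)
```

Both `mwh_per_unit_per_step` and `kappa2 * theta` come out at 0.5 MWh per 10 days, and `p1` at 7 dollars, which matches the parameter values listed above.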

4.3.1 Price of Anarchy (PoA) Analysis

From the heat maps in Fig. 1, we see that the expected cost of the representative producer is increasing with the carbon tax \(\tau \) and the penalty \(c_2\). The second observation is that as expected, for any given couple \((\tau , c_2)\), the expected cost is higher in the Nash equilibrium than for the social optimum. Next, we quantify how inefficient the Nash equilibrium is, and the effect of \(\tau \) and \(c_2\) on this inefficiency. In other words, we quantify the adverse effect of the noncooperative behavior of the producers by computing the price of anarchy (PoA) defined in (17) for different values of \(\tau \) and \(c_2\).

$$\begin{aligned} PoA(\tau , c_2) = \frac{\inf _{N_t, R_e} C^{MFG}(N_t, R_e; {\bar{Q}},\tau , c_2)}{\inf _{N_t, R_e} C^{MFC}(N_t, R_e; {\bar{Q}}, \tau , c_2)}. \end{aligned}$$
(17)

The results are given in the bottom subfigure of Fig. 1. Since for any given \((\tau ,c_2)\) the expected cost in an MFG equilibrium is higher, the PoA is expected to be greater than 1, and the higher it gets, the less efficient the Nash equilibrium is. It can be seen that the PoA gets smaller as we increase \(\tau \) and \(c_2\). This means that for higher levels of \(\tau \) and \(c_2\), the expected costs to the producers in the two settings become closer. In other words, the impact of the social planner diminishes and the advantages of cooperation lessen as the regulator imposes stricter regulations.
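Given the two grids of optimal costs over \((\tau , c_2)\), the PoA surface in (17) is an elementwise ratio. A minimal Python sketch follows; the function name `price_of_anarchy` and the guard against nonpositive costs are our assumptions, the latter because the interpretation of a ratio above 1 as inefficiency presupposes positive costs.

```python
import numpy as np

def price_of_anarchy(cost_mfg, cost_mfc):
    """Elementwise ratio of MFG to MFC optimal costs on a (tau, c2) grid."""
    cost_mfg = np.asarray(cost_mfg, dtype=float)
    cost_mfc = np.asarray(cost_mfc, dtype=float)
    if np.any(cost_mfc <= 0.0):
        # assumption: the ratio is only read as an inefficiency for positive costs
        raise ValueError("expected strictly positive MFC costs")
    return cost_mfg / cost_mfc
```

The input arrays would be filled by running the MFG and MFC solvers once per pair \((\tau , c_2)\) and minimizing over \(R_e\) in each case.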

Fig. 1
figure 1

Top: expected cost of the minor players in MFC (left) and in MFG (right). Bottom: the price of anarchy for different tax levels and penalties for not matching the demand

4.3.2 Electricity Production Decomposition Analysis

Here, we analyze the effect of the penalty \(c_2\) for not matching the demand and of the carbon tax \(\tau \) on the optimal energy production portfolio in both the MFC and MFG models. Figure 2 shows the total production and its decomposition over a 10-year period, together with a detailed zoom on the behavior between years 1 and 3.

Fig. 2
figure 2

Production decomposition when there is no tax (\(\tau \)) and the penalty for not matching the demand \((c_2)\) is low (left), when there is no tax and \(c_2\) is high (middle), and when \(\tau \) and \(c_2\) are both high (right). In the left and middle subplots, since all the production comes from nonrenewable energy resources, the average total production lines (colored in orange) are not visible

The left subfigure in Fig. 2 shows that the demand is not matched by the producers in the MFC case. This is because the penalty coefficient \(c_2\) is low, so the increased revenue from scarce supply outweighs the penalty. Here, we see that in the control setting, producers behave as a big monopoly when not matching the demand is inexpensive. When the penalty is increased, the middle subfigure in Fig. 2 shows that producers try to match the demand and that their behaviors in the MFC and MFG cases are similar. In both of these subfigures, there is no carbon tax; therefore, the producers do not have incentives to invest in renewable energy, and as a result, all the production comes exclusively from nonrenewable sources. In the right subfigure in Fig. 2, where the carbon tax is increased, we see that the producers have an incentive to invest in renewable energy.

Fig. 3
figure 3

Effect of the planning time horizon in MFC (left) and in MFG (middle); pollution at the end of the 10 years in MFC and MFG (right)

We also analyze the effect of the planning horizon by comparing the case in which the producers plan for the next 2 years with the case in which they plan for the next 10 years. As can be seen in the left and middle subfigures in Fig. 3, when the planning horizon is short, the fixed costs of renewable energy outweigh its advantages. Short-sighted producers do not have an incentive to invest in renewable energy production.

4.3.3 Pollution Analysis

The right subfigure in Fig. 3 shows that whatever the level of the carbon tax, the terminal pollution levels are higher when the producers are competitive (MFG). The main reason is that when the producers are competitive (which corresponds to a free market situation), they try to match the demand better. However, they choose to match this demand by using higher levels of nonrenewable energy resources in the production, which results in increased carbon emissions. Moreover, even in the absence of a carbon tax, pollution levels can be decreased if the producers cooperate and follow a social planner, so that cooperation can act as an alternative to implementing a carbon tax.

5 Models with a Regulator

In this section, we describe how the previous models can be extended to include a major player in charge of choosing the tax level \(\tau \) on behalf of a policy maker, and the penalty \(c_2\) for not matching the demand on behalf of the system operator. We shall treat this major player as a regulator, and we shall often speak of minor players when we talk about the producers. We extend the “minor player only” model used previously by offering the producers the option to withdraw their entire production, de facto walking away from the contract imposed by the regulator. This decision is made when the expected cost to the producer exceeds a fixed level above which continuing to produce at a loss does not make sense. The plots in Fig. 1 show that the cost of the minor player increases with the tax and with the penalty for not matching the demand. Therefore, the regulator should be careful not to enact policies with very high values of \(\tau \) and \(c_2\).

In the new model, the regulator does not have a private state per se. The regulator only has two controls: the carbon tax level (\(\tau \in \mathbb {R}_+\)) and the penalty (\(c_2\in \mathbb {R}_+\)). Both controls are assumed to be time independent. This assumption is especially realistic when the period [0, T] is too short for changes in regulation to make sense. The cost function of the regulator is given as:

(18)

where \(\alpha _1, \alpha _2,\alpha _3,\alpha _4\) and \(\alpha _5\) are nonnegative constants whose role is explained below.

The first term is the cost for exceeding the pollution target denoted by \({\bar{P}}_T^*\). Here, we use the notation \(x_+=\max (0,x)\), so there is no penalty if the terminal pollution level is below the target. The constant \(\alpha _1\) quantifies the size of the penalty. The second term is the revenue from the carbon tax. Here, we assume that the regulator collects the tax paid by the producers and, for the sake of consistency with the producers’ model, this revenue is chosen to be quadratic in the pollution. To prevent the regulator from choosing an abusively high tax to increase their revenue, Term 3 is added to represent a reputation cost. The joint role of Term 4 and Term 5 is to ensure that the responsibility of matching the demand is incumbent not only on the producers, but also on the regulator, influencing the choice of \(\alpha _4\). This is consistent with our characterization of the major player/regulator as both a policy maker and a system operator bearing the brunt of managing the ancillary services to avoid disruptions like system blackouts. Criterion (18) is chosen to take into account the possible objectives of a policy maker and a system operator. The coefficients of the terms can be chosen depending on the regulator’s priorities. For example, if the regulator cares more about the pollution levels, they can choose a higher \(\alpha _1\). In the same way, the regulator can drop any term by setting the corresponding coefficient to 0.

We assume that the tax level \(\tau \) and the penalty \(c_2\) must be such that the representative minor player’s cost is not higher than a preset level \(\nu \); otherwise, the minor player chooses to walk away. In the latter case, we define the regulator’s cost to be \(J(\tau , c_2; N, R_e) = \infty \). In this way, the regulator’s optimal tax level and penalty are such that the minor players’ cost is at most \(\nu \).
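
The walk-away convention can be summarized in a single piecewise expression. The symbol \({\tilde{J}}\) is introduced here only for illustration and is not part of the original notation:

$$\begin{aligned} {\tilde{J}}(\tau , c_2; N, R_e) = \begin{cases} J(\tau , c_2; N, R_e), &amp; \text {if } C(N, R_e; {\bar{X}}(N, R_e), \tau , c_2)\le \nu , \\ \infty , &amp; \text {otherwise.} \end{cases} \end{aligned}$$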

Then, the regulator’s problem becomes:

$$\begin{aligned} \inf _{\tau , c_2} \inf _{\begin{array}{c} (N, R_e) \in \mathcal {A}(\tau , c_2) \\ C(N, R_e; {\bar{X}}(N, R_e), \tau , c_2)\le \nu \end{array}} J(\tau , c_2; N, R_e) \end{aligned}$$
(19)

Above, the set \({\mathcal {A}}(\tau , c_2)\) refers to the MFC optimum or the MFG equilibrium, depending on the problem setting. This set is nonempty by the existence results of Theorem 2 and Theorem 4. In expression (19), the first line below the second infimum expresses the fact that the population is assumed to respond with a social optimum or a Nash equilibrium, while the second line expresses the fact that the cost of the representative player should not be higher than the preset level \(\nu \). This condition ensures that minor players will not walk away from the game. Note that once a minor player decides to play the game, they stay in the game until the terminal time.

5.1 Equilibrium Notions

We analyze two types of equilibria in the models with a regulator. In both cases, we consider that the regulator announces their policy first, and the producers react accordingly. This is in the realm of Stackelberg games. We call the first equilibrium the Stackelberg MFC equilibrium. In this case, the regulator assumes that a social planner chooses the controls used by the electricity producers, who then behave like one big monopolistic firm. Therefore, the regulator chooses the tax level \(\tau \) and the penalty coefficient \(c_2\), assuming that the producers will settle in a MFC optimum. Note that in this interpretation the regulator and the social planner are two different entities. We define this equilibrium formally as:

Definition 3

(Stackelberg MFC equilibrium) For every \((\tau , c_2)\), let \(\Big ({\hat{N}}(\tau , c_2), {\hat{R}}_e(\tau ,c_2)\Big )\) be the social planner’s MFC optimum given the tax level \(\tau \) and the penalty coefficient \(c_2\). In other words, for every \(\tau , c_2\) and any admissible \(\big (N, R_e\big )\), we have:

$$\begin{aligned} \begin{aligned}&C\Big ({\big (N, R_e\big )}; \bar{X}\big ({N, R_e}\big ), (\tau , c_2) \Big ) \\&\qquad \ge {C\Big (\big ({\hat{N}}(\tau , c_2), {\hat{R}}_e(\tau , c_2)\big ); \bar{X}\big ({\hat{N}}(\tau , c_2), {\hat{R}}_e(\tau , c_2)\big ), (\tau , c_2)\Big )}, \end{aligned} \end{aligned}$$

where we added the notation \(\bar{X}\big (N, R_e\big )\) to emphasize the controls for which the mean field term \({\bar{X}}\) is computed. Then, the strategy profile \(({\hat{\tau }}, {\hat{c}}_2)\) is a Stackelberg MFC equilibrium with a regulator if, for any admissible \((\tau , c_2)\):

$$\begin{aligned} J\Big ({\tau , c_2}; {\hat{N}}({\tau , c_2}),{\hat{R}}_e({\tau , c_2})\Big ) \ge J\Big ({\hat{\tau }}, {\hat{c}}_2; {\hat{N}}({\hat{\tau }}, {\hat{c}}_2),{\hat{R}}_e({\hat{\tau }}, {\hat{c}}_2)\Big ). \end{aligned}$$

The second equilibrium is called the Stackelberg MFG equilibrium. In this case, the regulator assumes that the electricity producers are competitive and chooses the levels of \(\tau \) and \(c_2\) by assuming that the minor player population is at a Nash equilibrium. We define this equilibrium formally as:

Definition 4

(Stackelberg MFG equilibrium) For every \((\tau , c_2)\), let \(\Big ({\hat{N}}(\tau , c_2), {\hat{R}}_e(\tau ,c_2)\Big )\) be the producers’ MFG Nash equilibrium given the tax level \(\tau \) and the penalty coefficient \(c_2\). In other words, for any admissible \(\big (N, R_e\big )\), we have:

$$\begin{aligned} \begin{aligned}&C\Big ({\big (N, R_e\big )}; {\bar{X}}\big ({\hat{N}}(\tau , c_2), {\hat{R}}_e(\tau , c_2)\big ), (\tau , c_2) \Big ) \\&\qquad \ge {C\Big (\big ({\hat{N}}(\tau , c_2), {\hat{R}}_e(\tau , c_2)\big ); {\bar{X}}\big ({\hat{N}}(\tau , c_2), {\hat{R}}_e(\tau , c_2)\big ), (\tau , c_2)\Big ).} \end{aligned} \end{aligned}$$

Then, the strategy profile \((\hat{\tau }, \hat{c}_2)\) is a Stackelberg MFG equilibrium with a regulator if, for any admissible \((\tau , c_2)\), we have:

$$\begin{aligned} J\Big ({\tau , c_2}; {\hat{N}}({\tau , c_2}), {\hat{R}}_e({\tau , c_2}) \Big ) \ge J\Big (\hat{\tau }, \hat{c}_2;{\hat{N}}(\hat{\tau },\hat{c}_2), {\hat{R}}_e(\hat{\tau },\hat{c}_2)\Big ). \end{aligned}$$
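
The only difference between the producer-level inequalities in Definitions 3 and 4 is whether the mean field term moves with the candidate control (MFC) or is frozen at its equilibrium value (MFG). The sketch below illustrates this distinction on a deliberately simple static toy model. The cost function `toy_cost`, the coupling parameter `k`, and all numerical values are hypothetical and unrelated to the producers' model of the paper.

```python
def toy_cost(a, m, k=0.5):
    """Hypothetical static cost of playing a when the population mean is m."""
    return 0.5 * a**2 - a + k * a * m

def toy_nash(k=0.5, iterations=200):
    """MFG-style fixed point: the mean field m is frozen while each player best-responds."""
    m = 0.0
    for _ in range(iterations):
        best_response = 1.0 - k * m          # argmin over a of toy_cost(a, m, k)
        m = 0.5 * m + 0.5 * best_response    # damped update of the mean field
    return m

def toy_social_optimum(k=0.5):
    """MFC-style optimum: the planner minimizes toy_cost(a, a, k), so the mean moves with a."""
    return 1.0 / (1.0 + 2.0 * k)

a_nash = toy_nash()
a_social = toy_social_optimum()
cost_nash = toy_cost(a_nash, a_nash)
cost_social = toy_cost(a_social, a_social)
```

In this toy model the social optimum yields a cost no larger than the Nash equilibrium, mirroring the ordering of the MFC and MFG costs discussed for the producers.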

6 Numerical Results in the Presence of a Regulator

6.1 Algorithms

To implement the walk-away option of the producers, we modify the SocialOpt and NashEq algorithms. This is done by simply adding an IF condition to these algorithms that assigns Accept=1 if the cost of the minor player is lower than the threshold and Accept=0 otherwise. The modified algorithms for the producers are called “ModifiedSocialOpt” and “ModifiedNashEq,” respectively. We then implement the Stackelberg equilibrium algorithm, in which the regulator’s cost is set to infinity if the producers reject the contract (Accept=0).

figure f
figure g
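
The outer loop of the Stackelberg algorithm described above can be sketched as follows, assuming an exhaustive search over finite grids of \(\tau \) and \(c_2\). The callables `solve_producers` and `regulator_cost` are hypothetical stand-ins for ModifiedSocialOpt or ModifiedNashEq and for criterion (18), respectively; their signatures are assumptions made for this illustration, and the toy functions used in the example are not the paper's models.

```python
import math
from itertools import product

def stackelberg_grid_search(taus, c2s, solve_producers, regulator_cost):
    """Grid search over (tau, c_2); a rejected contract costs the regulator +inf.

    solve_producers(tau, c2) -> (response, accept): stand-in for
        ModifiedSocialOpt or ModifiedNashEq, with accept equal to 1 or 0.
    regulator_cost(tau, c2, response) -> float: stand-in for criterion (18).
    """
    best_cost, best_policy = math.inf, None
    for tau, c2 in product(taus, c2s):
        response, accept = solve_producers(tau, c2)
        cost = regulator_cost(tau, c2, response) if accept == 1 else math.inf
        if cost < best_cost:
            best_cost, best_policy = cost, (tau, c2)
    return best_policy, best_cost

# Toy stand-ins (hypothetical, for illustration only).
def toy_solver(tau, c2, nu=10.0):
    producer_cost = 0.1 * tau + 0.002 * c2   # increasing in both controls
    return producer_cost, int(producer_cost <= nu)

def toy_regulator_cost(tau, c2, response):
    return (tau - 30) ** 2 + ((c2 - 1000) / 100) ** 2

policy, cost = stackelberg_grid_search(
    range(0, 101, 5), [50, 100, 250, 500, 750, 1000, 1500], toy_solver, toy_regulator_cost
)
```

Swapping the producer solver is the only change needed to move between the two Stackelberg settings; if every pair on the grid is rejected, the function returns `None` together with an infinite cost.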

Remark 2

In the two Stackelberg equilibria, the numerical algorithms differ only in the solution of the producers’ problem.

6.2 Numerical Experiments

6.2.1 Analysis of Regulator’s Cost

For the experiments of this section, we used the same parameters as for the producers’ model in the previous section. For the regulator we usedFootnote 11:

\(\alpha _1 = 1\), \(\alpha _2 = 0.1\), \(\alpha _3 = 10^5\), \(\alpha _4 = 5\), \(\alpha _5 = 20\)

\(\tau \in \{0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100\}\)

\(c_2 \in \{50, 100, 250, 500, 750, 1000, 1500, 2000, 2500, 3000, 4000, 5000\}\)

Fig. 4
figure 4

Regulator cost in both the MFC and MFG settings when the penalty for not matching the demand, \(c_2\), (left) or the carbon tax, \(\tau \), (right) is kept fixed

First, we analyze the regulator’s expected cost for different values of the carbon tax given a fixed penalty for not matching the demand. Then, we switch the roles of the two controls of the regulator. The plots in Fig. 4 show that the cost of the regulator is convex as a function of the carbon tax or of the penalty when the other control is fixed. We also analyze the effect of the coefficients in the regulator’s cost. We start with the importance given to demand matching by the regulator, by tracking the effect of \(\alpha _4\) on the regulator’s cost. The left subfigure in Fig. 5 shows that when the tax is fixed, the regulator’s minimum cost is attained at higher values of \(c_2\) when \(\alpha _4\) is higher. This shows that the regulator should impose higher penalties on producers for not matching the demand when demand matching is more important to the regulator. The middle subfigure shows that when the penalty for not matching the demand is fixed, a higher \(\alpha _4\) leads to a lower optimal tax. The reasoning here is that, since the production from renewable energy is more unpredictable, a regulator who cares about demand matching is not opposed to the use of nonrenewable energy in order to have more stable demand matching. Finally, the right subfigure shows the effect of the importance given by the regulator to minimizing the excess pollution, by tracking the effect of \(\alpha _1\) on the regulator’s cost. Here, it can be seen that when the penalty for not matching the demand is fixed, the optimal carbon tax is higher when the regulator wants to keep pollution at a lower level. These subfigures are for the MFC case, but similar results hold in the MFG case as well.

Fig. 5
figure 5

Effects of the coefficients of the regulator’s cost on the regulator’s decisions: the effect of the importance given to demand matching by the regulator, \(\alpha _4\), on the choice of the penalty imposed on the minor players for not matching the demand (left); the effect of \(\alpha _4\) on the choice of the carbon tax (middle); the effect of the importance given to minimizing the excess pollution, \(\alpha _1\), on the choice of the carbon tax (right). In the subplots, the dashed vertical lines show the position of the minimizers of the regulator’s cost in the corresponding color

Fig. 6
figure 6

Regulator cost for admissible values of \(c_2\) and \(\tau \) when the regulator cares about matching the demand, in MFC (left) and in MFG (middle); the difference of the regulator’s cost between MFC and MFG for any admissible couple \((\tau , c_2)\) (right)

Finally, Fig. 6 gives 3D plots of the regulator’s cost as a function of their controls \(\tau \) and \(c_2\). Here, the minimum is attained at \(({\hat{\tau }}, {\hat{c}}_2) = (55, 1500)\) in the MFC case and at \(({\hat{\tau }}, {\hat{c}}_2) = (55, 1000)\) in the MFG case. We also see that, when the regulator gives more importance to demand matching, the expected cost of the regulator for any given tax and penalty is higher if the producers are cooperative instead of competitive. This is because, for any given couple \((\tau , c_2)\), producers in the cooperative setting behave like a big monopolistic firm and care less about matching the demand than in the competitive setting, in order to maximize their revenues by keeping prices higher. When demand matching is important for the regulator, the regulator benefits from the competition among the electricity producers, even if this competition has adverse effects on the producers themselves. Furthermore, the regulator optimally chooses a higher penalty for not matching the demand in the cooperative case (1500 vs. 1000) in order to make the monopolistic producers match the demand better.

7 Conclusion

In this paper, we investigate the behavior of rational electricity producers in the presence of a carbon tax. We analyze how they manage the trade-off between reliance on traditional and predictable fossil fuel power production assets, which emit greenhouse gases and hence reduce revenues because of the carbon tax, and the temptation to invest in clean energy production assets, which are not a source of emissions but make matching the demand problematic because of the volatility of their output. We study a large population of producers in two different models: a first one in which they compete and hopefully reach a Nash equilibrium, and a second one in which they cooperate and rely on the solution of a centralized optimization problem. In a second set of models, we introduce a regulator choosing the level of the carbon tax in the hope of controlling the overall emissions in the economy, and a penalty to be imposed on producers failing to meet the demand in the hope of avoiding power outages and reputation costs. In this way, we aim to find the optimal carbon tax levels. Our models are based on recent progress in the theory and the numerical analysis of mean field games and mean field control problems. As a contribution to the theory, since we have both time-dependent and time-independent controls, we propose nonstandard forward–backward stochastic differential equation systems and show the existence and uniqueness of the Nash equilibrium and social optimum.

We showed that when the producers cooperate, they are better off by behaving like a single monopolistic firm. However, if the regulator excessively raises the penalty for not matching the demand, they can take advantage of the competitive behavior of the producers. While our models remain stylized, they open the door to more complex models, e.g., involving time-dependent policies that the regulator could base on the response of the producers. Furthermore, our models could be extended to include more features of the energy markets, such as storage and the interactions between neighboring states or countries.