1 Introduction

The electricity market is becoming a technology-driven business. Energy technologies such as distributed generation, energy storage, electric vehicles and demand-side management, together with exponential technologies such as smart sensors/Internet of Things (IoT), big data and artificial intelligence (AI), augmented reality (AR)/virtual reality (VR) and blockchain, are creating significant opportunities for technology across the entire electricity value chain. In addition, the advent of smart grids has not only increased the velocity and volume of available information, but has also heightened the need for collection, profiling, aggregation and settlement using complex calculations and business logic.

In the wholesale market, it is becoming increasingly important to predict demand (both for the overall market and for one’s own retail portfolio) and the market clearing price, taking into account the weather and prosumers’ supply and demand. In the retail market, it is equally important to offer the right kind of tariff (such as flat, tiered or time-of-use), taking into account customers’ past consumption and inertia, and to position it to the right consumer segment. Similarly, in the balancing market, a market participant must predict the direction of the market imbalance and decide how to use the imbalance of its own portfolio to mitigate it.

There is thus a need for an intelligent trading agent which can define an integrated strategy for portfolio profit maximization considering the three dimensions of market, customer and strategy. The main objective of this paper is to describe the working of VidyutVanika (VV), an autonomous learning electricity broker agent which bids intelligently in the wholesale market to decrease procurement costs, reacts to competing tariffs and offers attractive retail tariffs to increase its share of the retail market, and uses demand response to mitigate transmission capacity costs.

The outline of this exposition is as follows. In Sects. 2 and 3 we review the changing nature of electricity markets and the challenges posed by these changes to various market participants. In particular, we argue that these challenges create a strong need for a data-driven, algorithmic approach to decision making in energy markets. A particular instance of such an approach is a software broker agent which can autonomously make buy-sell decisions in an electricity market and can also attract retail customers with suitable tariffs in the presence of competitors. In Sect. 4, we provide an overview of one such autonomous broker agent, namely, VidyutVanika (VV). Specifically, we describe the architecture and the main modules of VV and comment on how these modules address the challenges mentioned above. In Sect. 5, we present the results of simulations performed on the Power Trading Agent Competition (Power TAC) platform to demonstrate the performance of VV. Finally, in Sect. 6 we present our conclusions.

2 Electricity Market: A Transforming Industry

Various outside-in and inside-out factors have contributed to making electricity markets more dynamic. Examples of outside-in factors include empowered consumers, new regulations such as Reforming the Energy Vision in the US, Power Responsive in the UK and Power of Choice in Australia, convergence of industry boundaries (consider, for example, the entry of banks into connected homes) and volatile commodity and fuel prices. Inside-out factors include the consumerization of energy technologies (electric vehicles (EV), energy-efficient appliances, demand-side management, renewables and storage), an increasing focus on portfolio optimization and on managing risk against extreme events like the polar vortex, grid flexibility in the form of value-based transactive signals, and market transparency (exemplified, for instance, by the transition from the 30 min settlement regime to the 5 min settlement regime in Australia). Beyond the advent of these new enablers, the interactions among enablers such as EV, photovoltaics (PV) and storage, together with user behavior in the form of charging patterns, are making the grid much more complex and are demanding much lower latency. Electricity markets are thus moving from the low-complexity-high-latency quadrant to the high-complexity-low-latency quadrant, as depicted in Fig. 1.

Fig. 1 Evolution of Electricity Markets: Journey towards low latency and high complexity

Below we provide three examples to illustrate the necessity for power markets to move towards the low-latency-high-complexity quadrant of Fig. 1.

  1. Australia Market (Settlement Regime): In Australia, the market operator, the Australian Energy Market Operator (AEMO), is making a transition from a 30 min settlement regime to a 5 min settlement regime to achieve efficient demand-side participation, efficient signals for investment in capacity, and improved incentives [1]. In the current regime, which involves a 5 min dispatch window and a 30 min settlement window, a generator makes offers/bids to supply electricity to the National Electricity Market (NEM) every 30 min. An offer or bid consists of a price and a quantity for each of the six 5 min dispatch intervals in the 30 min settlement window. However, the price the generator receives in the 30 min settlement window is calculated by averaging the dispatch prices of the six dispatch intervals in that window. The current regime provides little encouragement for flexible generators, and can lead to distorted bidding behaviors such as early price spikes, late price spikes, and piling in of generators. In the new 5 min settlement regime, a generator makes offers/bids to supply electricity to the NEM every 5 min, and the price the generator receives is the 5 min trading interval price. Thus, in the new regime, a generator can make offers for 288 trading intervals in a day rather than 48.

  2. UK Market (Ancillary Products): In the UK, the electricity system operator is encouraging the increased participation of different forms of flexible technologies, such as demand side response (DSR) and storage, in balancing the grid [2]. We also see the advent of Non-Balancing Mechanism Units (non-BMU) to support the ancillary service markets with reserve products like Short Term Operating Reserve (STOR) and frequency products like Frequency Control by Demand Management (FCDM). The characteristics of a typical non-BMU product are shown in Fig. 2.

    Fig. 2 Characteristics of a typical non-BMU seen in UK energy markets

  3. US Market (Community Choice Aggregators): In the US market, we see the increasing dominance of Community Choice Aggregators (CCAs). CCAs are local government entities that purchase electricity on behalf of retail electricity customers. CCAs have the advantage of a large customer base, which gives them higher negotiating power in the wholesale market. If required, a CCA can choose more renewable energy than the incumbent utility.
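The settlement arithmetic in the Australian example can be illustrated with a short sketch. The dispatch prices below are hypothetical values chosen only to show the effect of averaging, and the function names are illustrative rather than taken from any market system.

```python
from statistics import mean
from typing import Sequence

MINUTES_PER_DAY = 24 * 60


def intervals_per_day(settlement_minutes: int) -> int:
    """Number of settlement intervals in one day for a given interval length."""
    return MINUTES_PER_DAY // settlement_minutes


def thirty_minute_settlement_price(dispatch_prices: Sequence[float]) -> float:
    """Average the six 5 min dispatch prices of one 30 min settlement window."""
    if len(dispatch_prices) != 6:
        raise ValueError("a 30 min window holds exactly six 5 min dispatch prices")
    return mean(dispatch_prices)


# Hypothetical dispatch prices with a spike in the last 5 min interval.
prices = [40.0, 42.0, 41.0, 39.0, 43.0, 300.0]
old_price = thirty_minute_settlement_price(prices)
```

With these illustrative numbers, the spike of 300 in the last dispatch interval is diluted to an averaged price of about 84.17 that applies to the whole 30 min window, whereas under 5 min settlement each interval would be paid its own price.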

3 Challenges in Decision Making for Market Participants

Participants in the electricity market operate in wholesale, retail and balancing markets. The wholesale market is where electricity is bought in bulk from power generators. The retail market consists of various prosumers who buy or sell power in low volumes. The aim of a participant in the balancing market is to reduce the mismatch between supply and demand. An electricity broker agent has to make the following key decisions in these three markets.

  1. Wholesale Market: The price and quantity for a bid, the frequency at which bids should be placed, the type of order (market order or limit order) to be placed, and the type of information to be used for making the decision.

  2. Retail Market: The type of customer to target, the type of tariff to offer (from among fixed, time-of-the-day, variable etc.), and the tariff amount.

  3. Balancing Market: The kind of participation in the balancing market (specifically, whether to create an imbalance that reduces the overall imbalance in the market, or to offer incentives to customers to keep their load flexible).

To make decisions related to the wholesale market, a market participant may have to rely on weather data, and consumption and production profiles of enrolled customers. Likewise, retail market decisions may have to be based on weather data, previous consumption history, and other consumer data. Since commitments for buying or selling power have to be typically made in advance, an additional input to decision making could come from predictive models yielding estimates of future demand and clearing price. Predictive analytic models and advanced data-driven decision-making algorithms can enable a broker agent to convert different scenarios into revenue-making opportunities. To illustrate this, consider the following scenarios and their associated opportunities.

  • Scenario 1: The demand in the balancing market is high for a certain future time slot

    • Opportunity 1: If the broker can predict the scenario, then the participant can buy a larger quantity in the wholesale market and sell the excess power to the balancing market at the future time slot.

    • Opportunity 2: The broker can give out an attractive tariff for prosumers to increase its prosumer base, so that the participant can purchase power from this prosumer base and sell it in the balancing market at the future time slot.

    • Opportunity 3: The broker can increase its share of consumers with storage devices like battery and EV, and sell the stored energy in the balancing market at the future time slot.

  • Scenario 2: The supply in the balancing market is high for a certain future time slot.

    • Opportunity 1: The broker can give out an attractive tariff for consumers to increase its consumer base, so that the additional load can absorb the extra future supply in the balancing market.

    • Opportunity 2: If the predicted future balancing market price is less than the current wholesale market price, then the broker can bid for a lower quantity in the wholesale market and purchase the remaining power at the future time slot from the balancing market.

    • Opportunity 3: The broker can tap into the storage capacity of its customers to store the predicted extra supply and sell it at a later time.
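As an illustration of how such a scenario could be turned into a decision rule, the sketch below encodes Opportunity 2 of Scenario 2. The function name and the `defer_fraction` parameter (a cap on how much of the requirement is left to the balancing market) are hypothetical and are not part of VV; a real broker would also have to account for the uncertainty of the price prediction.

```python
from typing import Tuple


def split_procurement(required_mwh: float,
                      wholesale_price: float,
                      predicted_balancing_price: float,
                      defer_fraction: float = 0.25) -> Tuple[float, float]:
    """Return (wholesale_mwh, deferred_mwh) for one future time slot.

    If the balancing market is predicted to be cheaper than the current
    wholesale price, a capped share of the requirement is deferred to it;
    otherwise everything is procured in the wholesale market.
    """
    if not 0.0 <= defer_fraction <= 1.0:
        raise ValueError("defer_fraction must lie in [0, 1]")
    if predicted_balancing_price < wholesale_price:
        deferred = required_mwh * defer_fraction
    else:
        deferred = 0.0
    return required_mwh - deferred, deferred
```

For example, with a requirement of 100 MWh, a wholesale price of 50 and a predicted balancing price of 40, the rule bids for 75 MWh in the wholesale market and leaves 25 MWh for the balancing market.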

4 Overview of AI-Based Broker Agent VidyutVanika

In Sects. 2 and 3, we have outlined the challenges posed by the move towards low-latency-high-complexity regimes, as well as possible opportunities for market participants to overcome these challenges by leveraging predictive analytics and decision algorithms. We argue that these challenges highlight the need for market participants to increasingly use algorithmic trading as an element of their decision making. In this section, we look at one particular way of implementing algorithmic trading, namely, in the form of a software broker agent. More specifically, we describe VidyutVanika (VV) [3,4,5], a software broker agent that can operate autonomously in a smart-grid-based electricity market. In particular, we explain the different modules and sub-modules of VV, which are built to address the challenges discussed in Sect. 3.

The broker agent VV consists of two main modules, namely, the Wholesale Module (WM) and the Tariff Module (TM). WM generates bids/asks to purchase/sell energy contracts in the wholesale market. TM is responsible for publishing and revoking tariffs in the tariff (or retail) market. The bidding problem in the wholesale market is modeled as a Markov Decision Problem (MDP) [6], a framework for dynamic optimization in uncertain environments. The MDP is solved online using dynamic programming [7]. Tariff design is accomplished by formulating a separate MDP, which is solved approximately using Q-learning [8]. In addition to these two modules, VV incorporates a Customer Usage Predictor (CUP) sub-module built using neural networks (NN) to predict the usage of all subscribed customers in a future time slot from weather forecasts and the past usage pattern of each customer. VV aggregates the predicted usage across all its subscribed customers to estimate the amount of energy to be procured in the wholesale market. Doing so helps VV reduce the imbalance on its portfolio. Figure 3 depicts the architecture of VV in terms of its various modules and sub-modules.

Fig. 3 Broker agent architecture

Next, we describe WM and TM along with their sub-modules in detail.

A. Wholesale Module (WM)

In order to balance the net usage in its tariff portfolio at a future consumption time t, VV participates in all preceding wholesale market auctions for the time t by placing bids/asks of the form (energy amount, limit-price). The limit price is determined by a Limit Price Predictor (LPP), which uses dynamic programming to solve an MDP. The energy amount to be purchased is determined by aggregating the usage predicted by the CUP.

The Customer Usage Predictor (CUP) is responsible for predicting the net usage of the broker’s tariff portfolio for a future target time slot t, by summing the predicted usage of each customer subscribed to the broker for that time slot. For each customer, the CUP uses a small feed-forward neural network that is trained to predict the customer’s actual usage from the actual weather data, the time of day, and the day of the week. At prediction time, the weather forecast is used in place of the actual weather data. The model is improved as more data points become available over time.
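The prediction path of such a sub-module can be sketched as follows. This is a minimal illustration and not VV's implementation: the choice of weather features (temperature, wind speed, cloud cover), the cyclical encoding of the hour, the single ReLU hidden layer and all function names are assumptions, and training of the weights is omitted.

```python
import math
from typing import Callable, Dict, List

Features = List[float]
Predictor = Callable[[Features], float]


def encode_features(temperature: float, wind_speed: float, cloud_cover: float,
                    hour: int, weekday: int) -> Features:
    """Build one input vector from (forecast) weather, hour of day and weekday."""
    angle = 2.0 * math.pi * hour / 24.0
    one_hot = [1.0 if day == weekday else 0.0 for day in range(7)]
    return [temperature, wind_speed, cloud_cover,
            math.sin(angle), math.cos(angle)] + one_hot


def make_network(hidden_weights: List[List[float]], hidden_bias: List[float],
                 output_weights: List[float], output_bias: float) -> Predictor:
    """Return the forward pass of a one-hidden-layer feed-forward network."""
    def predict(features: Features) -> float:
        hidden = [max(0.0, sum(w * x for w, x in zip(row, features)) + b)
                  for row, b in zip(hidden_weights, hidden_bias)]
        return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias
    return predict


def portfolio_usage(predictors: Dict[str, Predictor],
                    forecast_features: Features) -> float:
    """Sum the per-customer predictions to get the portfolio's net usage."""
    return sum(predict(forecast_features) for predict in predictors.values())
```

Keeping one predictor per customer, as in the dictionary above, mirrors the per-customer networks described in the text; the aggregate returned by `portfolio_usage` is the quantity that would be passed on to the wholesale bidding step.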

VV’s Limit Price Predictor is primarily motivated by the work of [9] on an MDP-based wholesale bidding strategy, which in turn is based on [10]. Although VV uses a similar MDP structure, the novelty lies in the reward, the solution method and the way the solution is used to place bids. First, VV does not bid for the entire predicted energy requirement in a single auction as proposed by [9]. Instead, VV participates in all possible auctions to procure the required amount of energy for a future time slot, so that larger amounts of power can be purchased in those auctions in which the clearing price is expected to be lower. Second, VV uses the limit prices obtained by solving the MDP to place several small bids to purchase small quantities of energy. These small bids help in calculating better estimates of the probability of a bid getting cleared for a given limit price.

VV maintains two instances of the MDP at all times: one for bids and the other for asks. The state of the wholesale MDP is the number of bidding opportunities left to buy energy for a future time slot. The action is a limit price that would be used in the bidding process. The reward is the cost incurred in obtaining the total amount of energy required for a future time slot. A detailed description of the MDP can be found in [3]. The solution to the MDP is a sequential bidding strategy that minimizes the cost per unit of energy procured.
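A minimal sketch of how an MDP with this state and action structure could be solved by backward induction is given below. It is not the formulation of [3]: it assumes that a cleared bid pays its limit price, that the clearing probability of each candidate price is known, and that energy still missing after the last auction is bought at a hypothetical fallback (balancing) price. All names are illustrative.

```python
from typing import Callable, Dict, Sequence, Tuple


def solve_limit_prices(candidate_prices: Sequence[float],
                       clear_probability: Callable[[float], float],
                       auctions: int,
                       fallback_price: float) -> Dict[int, Tuple[float, float]]:
    """Map auctions_left -> (best limit price, expected cost per unit).

    Recursion: V(0) = fallback_price and
    V(n) = min over p of [P(p) * p + (1 - P(p)) * V(n - 1)],
    where P(p) is the assumed probability that a bid at price p clears.
    """
    if auctions < 1 or not candidate_prices:
        raise ValueError("need at least one auction and one candidate price")
    policy: Dict[int, Tuple[float, float]] = {}
    cost_to_go = fallback_price  # V(0): no bidding opportunities left
    for auctions_left in range(1, auctions + 1):
        best_price, best_cost = candidate_prices[0], float("inf")
        for price in candidate_prices:
            p_clear = clear_probability(price)
            expected = p_clear * price + (1.0 - p_clear) * cost_to_go
            if expected < best_cost:
                best_price, best_cost = price, expected
        policy[auctions_left] = (best_price, best_cost)
        cost_to_go = best_cost
    return policy
```

In this simplified model the recommended limit price rises as fewer auctions remain, because the penalty of ending up at the fallback price becomes more likely; with many auctions left, a low bid costs little if it fails to clear.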

B. Tariff Module (TM)

TM, the tariff module of VV, maintains two active time-of-use (TOU) tariffs, namely, (i) MDPTOU and (ii) WeeklyTOU. MDPTOU is the result of solving an MDP for the retail market using reinforcement learning, and is revised every twenty-four hours. WeeklyTOU is an empirically determined, fixed weekly TOU tariff, which remains active throughout.

MDPTOU is generated through a two-step process. First, a fixed price tariff (FPT) is generated by using reinforcement learning. Next, the FPT is converted to a TOU tariff for consumption customers by predicting the overall demand profile for the tariff market over a future time horizon.

The tariff MDP formulation is primarily motivated by the work of [11]. The state of the MDP at any decision instant consists of a set of market features that capture the rationality of the tariff market, the agent’s predicted energy surplus (calculated as the difference between the power already purchased and the estimated demand for the future consumption slot), and a rank-based comparison of VV’s current tariff with the tariffs offered by competitors. At each decision time, VV observes the state and selects one of eight actions, each of which lets VV modify its previous production and consumption tariffs in a specific fashion. A detailed description of the state and action space of the MDP can be found in [11].

TM is trained to provide near-optimal tariffs by running the reinforcement learning algorithm Q-learning over repeated simulations. In each simulation, the module is given “rewards” such that the total reward in a simulation captures the net profit made by VV, ignoring balancing charges. Specifically, the reward at each time is the net revenue earned by selling power to consuming customers and buying power from producing customers at current tariffs, less the amount spent in the wholesale market to procure the power required to cover unfulfilled demand. Over several simulations, the Q-learning algorithm learns the value of the maximum possible total reward that can be achieved by taking a given action in a given state. This information is then used to identify the best possible action in the current state. The chosen action is then converted to a tariff.
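The learning step can be illustrated with a generic tabular Q-learning sketch. Only the number of actions (eight) comes from the description above; the state encoding (any hashable value), the learning rate, the discount factor, the epsilon-greedy exploration and the class name are assumptions made for illustration and do not reproduce VV's actual configuration.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, List


class TabularQLearner:
    """Generic tabular Q-learning with epsilon-greedy action selection."""

    def __init__(self, n_actions: int = 8, alpha: float = 0.1,
                 gamma: float = 0.9, epsilon: float = 0.1, seed: int = 0) -> None:
        self.n_actions = n_actions
        self.alpha = alpha        # learning rate (assumed value)
        self.gamma = gamma        # discount factor (assumed value)
        self.epsilon = epsilon    # exploration probability (assumed value)
        self.rng = random.Random(seed)
        self.q: Dict[Hashable, List[float]] = defaultdict(
            lambda: [0.0] * n_actions)

    def select_action(self, state: Hashable) -> int:
        """Explore with probability epsilon, otherwise pick the greedy action."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        values = self.q[state]
        return max(range(self.n_actions), key=values.__getitem__)

    def update(self, state: Hashable, action: int,
               reward: float, next_state: Hashable) -> None:
        """Move Q(state, action) toward reward + gamma * max Q(next_state, .)."""
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])
```

In a tariff setting, each simulated decision step would call `select_action` on the observed market state, apply the corresponding tariff modification, and then call `update` with the realized reward and the next observed state.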

For production customers, the tariff suggested by the MDP agent is published without any change. For consumption customers, the FPT is converted to a TOU tariff before being published. For this, VV first predicts the net demand in the market for the next twenty-four hours of the simulation. Thereafter, for each of the next twenty-four time slots, the FPT is modified by an amount that is proportional to the excess of the estimated net demand at that time slot over the mean estimated demand for the twenty-four hour period. For details, the reader is referred to [3]. The TOU tariffs published in this manner help in offsetting some of the transmission capacity charges triggered by high peak demand.
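The conversion rule described above can be written as rate_t = FPT + k (d_t - mean(d)), where d_t is the predicted net demand in slot t. The sketch below implements this form under stated assumptions: the proportionality constant `sensitivity` (k) is hypothetical, rates are treated as positive prices per unit of energy, and the exact rule used by VV is the one given in [3].

```python
from statistics import mean
from typing import List, Sequence


def fixed_to_tou(fixed_rate: float,
                 predicted_demand: Sequence[float],
                 sensitivity: float) -> List[float]:
    """Shift the fixed rate in each slot in proportion to excess demand.

    Slots with above-average predicted demand get a higher rate and slots
    with below-average demand get a lower one.
    """
    if not predicted_demand:
        raise ValueError("predicted_demand must not be empty")
    average = mean(predicted_demand)
    return [fixed_rate + sensitivity * (demand - average)
            for demand in predicted_demand]
```

Because the deviations from the mean sum to zero, the average of the resulting TOU rates equals the original fixed rate; only the shape across the day changes, which is what discourages consumption in peak slots.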

5 Validation of Algorithmic Trading in the Power TAC Tournament

The performance of VV was tested using the Power TAC simulation platform [12]. Power TAC is an open source smart grid simulation environment, featuring a wholesale market with bulk power generators, a retail market with a diverse population of customers, a balancing market, and a distribution utility, all operating at a real geographical location with its own weather dynamics. The retail market comprises individual customers, residential communities, as well as businesses, and includes customers with production and storage capability. The platform contains evaluation and preference models that customers deploy to decide between tariffs, as well as consumption models that determine their consumption and demand response patterns. The balancing market operates through balancing orders placed by broker agents having customers with flexible loads. In addition to providing the transmission infrastructure, and acting as the broker of last resort, the distribution utility also charges transmission capacity fees for contributing to peak consumption. The platform can accommodate multiple independent broker agents, who have to compete to operate profitably in the Power TAC environment.

The Power TAC platform is used every year to run a competition, also abbreviated as Power TAC, among broker agents submitted by different teams. To benchmark the capabilities and performance of VV, VV was entered in the Power TAC 2017 and 2018 finals. The Power TAC 2018 finals had 7 brokers from research groups across the world. The tournament had a total of 324 games, covering all possible combinations of brokers in 7-broker games (100 games), 4-broker games (140 games; 80 games for each broker), and 2-broker games (84 games; 24 games for each broker). Table 1 shows the net profit of all brokers across the different game configurations, the percentage of profit in comparison to the winning agent, AgentUDE, and the corresponding normalized scores. Despite winning more games than AgentUDE, VV was placed second, behind AgentUDE, in the overall ranking of Power TAC 2018. This is because the winner is determined based on normalized cumulative profits in each configuration across all games in the tournament. Specifically, AgentUDE netted high profits against competing agents (excluding VV) in 2-broker games, which helped cement its place as the winner of the tournament.
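The game counts quoted above can be checked with a few lines of combinatorics. The number of repetitions per broker combination (100 for the single 7-broker combination, 4 for each 4-broker and 2-broker combination) is not stated in the text; it is inferred here from the reported totals.

```python
from math import comb
from typing import Tuple


def games_in_configuration(total_brokers: int, game_size: int,
                           repetitions: int) -> Tuple[int, int]:
    """Return (total games, games per broker) for one game size.

    Every subset of `game_size` brokers plays `repetitions` games; a given
    broker appears in comb(total_brokers - 1, game_size - 1) of the subsets.
    """
    total = comb(total_brokers, game_size) * repetitions
    per_broker = comb(total_brokers - 1, game_size - 1) * repetitions
    return total, per_broker


# (game size, inferred repetitions per broker combination)
configurations = [(7, 100), (4, 4), (2, 4)]
counts = [games_in_configuration(7, size, reps) for size, reps in configurations]
total_games = sum(total for total, _ in counts)
games_per_broker = sum(per_broker for _, per_broker in counts)
```

The totals of 324 games overall and 204 games per broker agree with the figures reported for the tournament.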

Table 1 Power TAC 2018: Net profits and normalized scores (denoted by (N)) of each broker

VV won the largest number of games in the tournament, with 112 wins out of the 204 games it participated in; AgentUDE came second with 92 wins out of 204. VV had the most wins in 7-broker and 4-broker games, and had the second highest number of wins, behind AgentUDE, in 2-broker games. Notably, VV finished in the top two in 72% of the games with more than 2 brokers in which it played, compared with 65% for AgentUDE. In a head-to-head comparison, AgentUDE and VV each won 39 of the 100 7-broker games. However, in the 40 4-broker games in which both VV and AgentUDE participated, VV won 31 and AgentUDE the remaining 9. In the four 2-broker games involving both brokers, AgentUDE won three. VV led in all three of these games almost until the end, only to fall behind because of transmission capacity fees. We also looked at the number of games in which each broker ended up with a negative profit. CrocodileAgent had the fewest games with negative profits; VV came second in this category, with four times the average market share. Thus, VV consistently managed to make up for its losses, and rarely ended a game unprofitable.

TM played a crucial role in VV’s success, offering tariffs that were attractive to the majority of customers and contributing the most to revenue. VV had the highest market share on average in 2-broker games, in 7-broker games and overall, and the second highest in 4-broker games. In contrast, AgentUDE had only a quarter of the overall average market share of VV. While one may expect a greater market share to lead to more profits, it usually leads to higher transmission capacity fees and distribution costs, which can cause higher losses unless managed properly. As a result, agents with lower market share often incur smaller losses, and end up winning. VV also had one of the best tariff market income-to-cost ratios (1.14), with only AgentUDE (1.43) and CrocodileAgent (1.32) having better ratios. However, both AgentUDE and CrocodileAgent had very low average market share compared to VV. Thus, VV is very efficient at making profits despite having a higher market share. Finally, although VV had no explicit strategy for the balancing market, it had lower imbalance costs even with a high market share, which demonstrates the effectiveness of its neural-network-based net usage prediction.

6 Conclusion

We foresee a strong need for algorithmic trading in the days to come. The wholesale and retail markets may converge in the future. Connected homes may directly participate in the wholesale market and buy or sell power on their own. Similarly, in peer-to-peer trading, consumers will trade electricity among themselves rather than going to the market, and the market will act only as a balancing resource of last resort. Algorithmic trading, possibly implemented in the form of autonomous software broker agents, can play a very important role in decision making in all these futuristic scenarios. In this paper, we have described the architecture and design of one such software broker agent.