1 Introduction

With the growing world population, energy demand is expected to increase by 34% by 2035, with 20% more CO2 emissions from fossil fuels, of which 60% is emitted by coal (British Petrolium Company 2015). CO2 emissions from power production account for two thirds of man-made greenhouse gases, raising concerns about global warming (British Petrolium Company 2015). As an alternative to traditional electric grids and conventional power resources, smart grids are considered a promising solution, in which power generation, delivery and utilization are optimized by advanced mechanisms. These grids are “smart” thanks to intelligent and independent demand–supply management and autonomous decision-making. To reduce dependency on fossil fuels, unit commitment, energy forecasting and Renewable Energy Sources (RES) are integrated for improved operation. Building a reliable and robust smart grid with environment-friendly energy sources is a topic of interest to many researchers. However, with the deployment of advanced control and management, power system complexity and vulnerability to failures are increasing as well. After the 2003 blackout in the US (Kantamneni et al. 2015), the idea of breaking the grid into smaller, connected systems emerged, to make faults easier to isolate and correct. There is a global shift from large centralized power plants to small distributed renewable power plants, due mainly to the fluctuations and challenges imposed on the Energy Management System (EMS) (Wen et al. 2016). This idea was systematically translated into microgrids, proposed in the context of the Consortium for Electric Reliability Technology Solutions (CERTS) project, the first research project of this kind in the field (Xu et al. 2017; Lasseter et al. 2011).
Researchers are also developing real microgrids and test-beds at sites around the world, in North Africa, North America, Europe and Asia, to study the behavior of microgrids and propose a standard microgrid architecture (Hossain et al. 2014).

Microgrids are small-scale power systems, equipped with local RES, diesel generators (DG), batteries and a control unit that balances demand with supply to increase self-sufficiency, correct local faults and improve power quality. Microgrids can be residential, industrial, commercial or stand-alone, depending on the type of connected loads (Kantamneni et al. 2015; Xu et al. 2017; Labeodan et al. 2015). They also differ in terms of voltage level (i.e. low voltage, medium voltage, hybrid voltage) and phases (i.e. single-phase, 3-phase and hybrid) (Kantamneni et al. 2015; Xu et al. 2017; Labeodan et al. 2015). These microgrids are then connected through adequate technologies to form the smart grid, or “smart city”, as shown in Fig. 1.

Fig. 1 Residential/commercial/industrial/remote microgrids forming a smart grid

As this figure shows, a microgrid is composed of producers, such as wind turbines and PV panels, consumers, and prosumers, such as batteries, which are producers when discharging and consumers otherwise. All these components are connected to the main grid through the Point of Common Coupling (PCC), where a switch isolates the microgrid in case of failures. The key elements for designing a stable and reliable microgrid are (Labeodan et al. 2015; Tazi et al. 2017; Logenthiran et al. 2012; Khan and Wang 2017):

  • Size of the microgrid and technologies to be used;

  • Type of RES to be integrated and their positioning;

  • Energy dispatching algorithms and mechanisms for self-sufficiency;

  • Control strategy and communication protocols;

  • Switching modes and black start strategies;

  • Voltage/current control and active/reactive power balance.

In general, the deployed storage and green power generation capacities should at least meet the demand of high-priority loads in case of failures, to ensure a minimum level of reliability (Tazi et al. 2017). Even though microgrids bring many benefits to power systems, there are still many unresolved design issues (Kantamneni et al. 2015). On the one hand, a microgrid is characterized by a “plug-and-play” (PnP) feature: its topology is adaptive and time-varying, and nodes, whether consumers, producers or prosumers, can connect and disconnect at run time (Khan and Wang 2017). In case of a failure, a microgrid can be disconnected to prevent fluctuations from spreading to nearby microgrids. Given its small size compared to the main grid, evaluating the power quality and CO2 footprint of a microgrid is less complex, and power supply is consumer-centered (Kantamneni et al. 2015; Xu et al. 2017; Labeodan et al. 2015). On the other hand, the long-term behavior of microgrids and their impact on main grid performance have not been methodically investigated. Because microgrids are coupled to the main grid, fluctuations at the microgrid level are likely to affect it. In addition, the scalability of microgrids and the optimal size for efficient power control have not been studied enough. Finally, the scattered nature of these systems raises many concerns about data security, given that detailed consumption profiles are stored remotely, sometimes on many servers (Khan and Wang 2017).

Hence, microgrid clusters are an interesting field of research, since they combine energy production with advanced management and reduce the need for conventional energy resources. Microgrids are modeled as a physical layer, shown in Fig. 1, managed by a logical layer composed of intelligent entities, known as the EMS (Tazi et al. 2017). In the traditional EMS, Remote Terminal Units (RTU) send measurements to a State Estimator (SE), which processes the received raw data through mathematical computations. The processed data is fed to both the Supervisory Control And Data Acquisition (SCADA) system and fault-detection tools such as the Contingency Analyzer (CA) for decision-making and system control (Tazi et al. 2017; Logenthiran et al. 2012). Even though this control system is popular, it is not flexible and does not cope with the PnP nature of microgrids. Other control systems, such as Multi-Agent Systems (MAS), were thus proposed to improve system reliability and adapt to time-varying topologies. A MAS is based on agents, which are autonomous entities that react to changes in the environment and make decisions without human intervention (Labeodan et al. 2015; Khan and Wang 2017).

A substantial body of research articles on MAS optimization algorithms has previously been reviewed, analyzed and compared by other researchers, for example in Khan and Wang (2017), Khare and Kumar (2015) and Coelho et al. (2017). This significant interest in MAS underlines the opportunities this system offers for reliable and efficient microgrid management. Unlike other reviews, this paper sheds light on the following aspects of MAS design: (1) the problematic issues related to MAS architecture, (2) the progress in MAS optimization and information diffusion, and (3) a synthesis of operational safety requirements and performance measurement techniques. This review focuses on research papers in the field of MAS for microgrid management published in the last decade, together with a selection of notable earlier studies. The selected papers are original research works that analyze MAS for microgrid management from new perspectives, or that propose optimization techniques or solutions to unexplored MAS issues.

The paper is organized as follows. We first introduce the building blocks of MAS: types, characteristics, design, control layers, and agent modeling and interactions. We then discuss learning and optimization algorithms and compare their performance in terms of convergence time and achievement of system goals, based on simulation results presented by researchers. Next, we summarize the progress in information diffusion and consensus algorithms, followed by operational safety thresholds and performance indicators. In the last section, we synthesize, based on the literature, the multi-agent approach to service restoration in multi-microgrids and the main process for feeding high-priority loads.

2 What is a multi-agent system?

Multi-agent systems are smart systems that use Distributed Artificial Intelligence (DAI) for optimized control and management, in which complex computational and optimization problems are divided among many entities, known as agents (Kantamneni et al. 2015). In the context of microgrids and power systems, Distributed Problem Solving (DPS) is a subfield of MAS in which all agents collaborate to decrease operational costs and increase user comfort. For this purpose, an agent should learn and analyze the external environment, update its local databases and react to local events autonomously. An agent should also collaborate and communicate with other agents, negotiate offers and achieve both local and global objectives (Boussaada et al. 2016; Jin et al. 2011; Et-Tolba et al. 2014a; Moradi et al. 2016).

Figure 2 shows the two building blocks of an agent’s intelligence: deliberation and means-ends reasoning (Blamah et al. 2013). The local database of an agent is initialized with beliefs, which are perceptions of the environment, and intentions, which are possible options to reach goals. Both are updated regularly to adapt the agent to its environment and to any changes in the system. Intentions are updated using an option generation function, followed by filtering to choose the best options among the initial set of solutions; the agent then builds an action plan to solve a given problem.

Fig. 2 Practical reasoning of an agent as discussed by reference (Blamah et al. 2013)
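The deliberation and means-ends reasoning steps described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the model of Blamah et al. (2013): the belief keys (`surplus_kw`, `soc`, `islanded`), the option names and the 0.3 state-of-charge floor are all assumptions introduced for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    # Beliefs: the agent's current perception of its environment.
    beliefs: dict = field(default_factory=dict)
    # Intentions: the options the agent has committed to pursuing.
    intentions: list = field(default_factory=list)

    def perceive(self, observation: dict) -> None:
        """Belief revision: merge new observations into the local database."""
        self.beliefs.update(observation)

    def generate_options(self) -> list:
        """Option generation: candidate actions given the current beliefs."""
        surplus = self.beliefs.get("surplus_kw", 0.0)
        if surplus > 0:
            return ["charge_battery", "export_to_grid"]
        if surplus < 0:
            return ["discharge_battery", "import_from_grid", "shed_low_priority"]
        return []

    def filter_options(self, options: list) -> list:
        """Filtering: drop options that contradict the current beliefs."""
        soc = self.beliefs.get("soc", 0.5)           # assumed default
        islanded = self.beliefs.get("islanded", False)
        kept = []
        for option in options:
            if option == "charge_battery" and soc >= 1.0:
                continue
            if option == "discharge_battery" and soc <= 0.3:  # assumed floor
                continue
            if option in ("export_to_grid", "import_from_grid") and islanded:
                continue
            kept.append(option)
        return kept

    def deliberate(self) -> list:
        """Deliberation: option generation followed by filtering."""
        self.intentions = self.filter_options(self.generate_options())
        return self.intentions

    def plan(self) -> list:
        """Means-ends reasoning: expand the first intention into a plan."""
        if not self.intentions:
            return ["idle"]
        goal = self.intentions[0]
        return [f"prepare:{goal}", f"execute:{goal}"]
```

For instance, an islanded agent with a power deficit and a nearly empty battery is left with load shedding as its only intention, since discharging and importing are both filtered out.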

Agents communicate with each other through messages with a defined ontology, semantics and syntax. The standardization of the Agent Communication Language (ACL) is a must to ensure the integrity and consistency of communication messages among different MASs (McArthur et al. 2007a; Khalique 2005). The agent communication specifications and protocols published by the Foundation for Intelligent Physical Agents (FIPA) in 2001 have become de facto standards in MAS communication (McArthur et al. 2007a). FIPA produced four content language specifications with different message representations, of which SL is the most popular (McArthur et al. 2007a; Foundation for Intelligent Physical Agents (FIPA) 2011). Another popular semantic language is the Web Ontology Language (OWL), an extension of the Resource Description Framework (RDF) language that has well-defined model-theoretic semantics and an axiomatic specification for interpretations (Khalique 2005). It supports actions, expressions and propositions and is FIPA-interoperable, but it is less expressive than FIPA-SL and is specifically designed for web-based systems (McArthur et al. 2007a; Khalique 2005).
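To make the message structure more concrete, the following Python sketch models a message with a handful of FIPA-ACL parameters (performative, sender, receiver, content, language, ontology, conversation identifier). It is a simplified illustration rather than a compliant FIPA encoder: the agent names, the ontology name `microgrid-market`, the conversation identifier and the content expressions are hypothetical, and the rendered string only approximates the FIPA string representation.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AclMessage:
    performative: str                     # e.g. "cfp", "propose", "inform"
    sender: str
    receiver: str
    content: str                          # expression in the content language
    language: str = "fipa-sl"
    ontology: str = "microgrid-market"    # hypothetical ontology name
    conversation_id: str = "auction-1"    # hypothetical conversation id

    def render(self) -> str:
        """Return a simplified, human-readable form of the message."""
        return (
            f"({self.performative}"
            f" :sender {self.sender}"
            f" :receiver {self.receiver}"
            f' :content "{self.content}"'
            f" :language {self.language}"
            f" :ontology {self.ontology}"
            f" :conversation-id {self.conversation_id})"
        )

    def reply(self, performative: str, content: str) -> "AclMessage":
        """Build an answer in the same conversation, swapping the parties."""
        return replace(
            self,
            performative=performative,
            sender=self.receiver,
            receiver=self.sender,
            content=content,
        )


# A load agent calls for proposals and a PV agent answers with an offer.
cfp = AclMessage("cfp", "load-agent", "pv-agent", "(buy :energy-kwh 5)")
offer = cfp.reply("propose", "(sell :energy-kwh 5 :price 0.12)")
print(cfp.render())
print(offer.render())
```

Keeping the ontology and conversation identifier on every message is what lets a receiving agent interpret the content and match replies to the request that triggered them.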

FIPA-ACL is not the first communication and information sharing language designed for agents. The Knowledge Query and Manipulation Language (KQML) is a high-level, message-oriented communication language for information exchange that was developed in the early 1990s (Foundation for Intelligent Physical Agents (FIPA) 2011). FIPA-ACL and KQML share the same syntax and are independent of ontology, transport mechanisms and content language. However, they differ in many respects, as shown in Table 1, based on references (Foundation for Intelligent Physical Agents (FIPA) 2011; Mayfield et al. 1996; Labrou et al. 1999). Readers are referred to references (McArthur et al. 2007a; Khalique 2005; Foundation for Intelligent Physical Agents (FIPA) 2011; Nvari-Moghaddam et al. 2017; Dimeas and Hatziargyriou 2004) for more on message ontology and communication robustness to failures, and for an example of a MAS ontology for a microgrid market auction.

Table 1 Comparison of KQML and FIPA-ACL based on references (Foundation for Intelligent Physical Agents (FIPA) 2011; Mayfield et al. 1996; Labrou et al. 1999)

The design of a MAS for microgrid management is a process in which the power system is carefully analyzed through a number of steps (McArthur et al. 2007a, b). First, all technical aspects should be clearly listed, including the detailed microgrid topology, the number, type and priorities of connected loads, the total local supply provided by storage and on-site green energy production, and the maximum total demand that can be satisfied in island-mode. Then come the requirements and objectives specifications, which are translated into system equations and cost functions. Based on these studies, the number of agents, the optimization algorithms and the performance indicators are specified. Finally, interactions and data exchange between agents, in both normal and island operation, should be detailed. The result of this process is a final model with specific agent behaviors that can be implemented using an appropriate agent platform, such as JADE. In addition to ontology and semantic language specifications, there are other key design elements for building a MAS, including architecture, control layers and agent specification. The following section discusses these design elements and their related research gaps.

3 Elements of multi-agent system design

3.1 MAS types and architectures

The type of MAS, which can be either homogeneous or heterogeneous, depends on the microgrid architecture and the internal implementation of agents (Stone and Veloso 2000; Abbas and Egerstedt 2011). When agents have the same internal structure, including goals, domain knowledge, sensor inputs and outputs, actions and planning procedure, the system is said to be homogeneous. If agents differ in at least one of these elements, the system is said to be heterogeneous, and complex computations are handled efficiently by independent entities (Abbas and Egerstedt 2011). However, heterogeneous systems are not mature enough, owing to limited research focus, the difficulty of formulating their mathematical models and their vulnerability to failures caused by the priority hierarchy of agents. Therefore, most studies assume a homogeneous MAS to avoid dealing with system complexity, while the optimization and behavior of heterogeneous MAS remain an interesting research opportunity.

There are three MAS control architectures: centralized, distributed and hybrid. In the centralized scheme, a supervisor agent, or microgrid central controller (MGCCO) agent or Main Controller, manages the whole system (Wu et al. 2014; Colson and Nehrir 2011; Li et al. 2016). It updates a central database with measurements and system statuses through regular data exchange with other agents. It is also responsible for complex energy computations, energy and demand forecast, data correction, RMS voltage/frequency control, and load shifting and shedding (Kanchev et al. 2012). However, centralized MAS does not benefit from DPS and parallel control and suffers from computational overhead, since the supervisor agent monopolizes decision-making. Also, system reliability depends heavily on availability of this agent, and the microgrid will always be at risk of cascading failures (Wu et al. 2014; Colson and Nehrir 2011; Li et al. 2016). In addition, centralized MAS is not suitable for time-variant microgrids.

A distributed MAS architecture is proposed as an alternative, in which agents make decisions locally, have power synchronization mechanisms and communicate with adjacent agents (Kanchev et al. 2012). With DAI and DPS, this design reduces computational complexity and overhead, and it is suitable for parallel control of microgrids with distributed RES, storage and DG (Colson and Nehrir 2011). Distributed MAS is also resilient and robust to failures: if an agent fails, the area can be isolated and controlled remotely by neighbors. This design has some drawbacks, such as the difficulty of information diffusion and of reaching consensus (Rahman and Oo 2017; Raju and Milton 2017). Cyber-security and data confidentiality are also concerns raised by local access to consumers’ energy data (Raju and Milton 2017; De Azevedo 2016). Moreover, DPS can affect microgrid stability if there is no coordination between agents in the system (De Azevedo 2016). Finally, some everyday planning tasks, such as Economic Power Dispatch (EPD) and energy forecast, can be achieved by a single entity. In this case, assigning similar tasks to multiple agents is a waste of computational capacity (De Azevedo 2016).
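As a rough illustration of what reaching consensus through neighbor-only communication involves, the sketch below implements a textbook average-consensus iteration: each agent repeatedly moves its value toward those of its adjacent agents. It is not taken from any of the cited works; the four-agent line topology, the step size `epsilon` and the initial values are assumptions chosen for the example.

```python
def consensus_step(values, neighbors, epsilon):
    """One synchronous update: each agent moves toward its neighbors' values."""
    return [
        x_i + epsilon * sum(values[j] - x_i for j in neighbors[i])
        for i, x_i in enumerate(values)
    ]


def run_consensus(values, neighbors, epsilon=0.3, tol=1e-6, max_iter=1000):
    """Iterate until all agents agree within `tol`; return values and steps."""
    for iteration in range(max_iter):
        if max(values) - min(values) < tol:
            return values, iteration
        values = consensus_step(values, neighbors, epsilon)
    return values, max_iter


# Four agents on a line: agent i only talks to agents i-1 and i+1.
# epsilon is kept below 1 / (maximum number of neighbors) for stability.
line_topology = [[1], [0, 2], [1, 3], [2]]
local_estimates_kw = [10.0, 20.0, 30.0, 40.0]

agreed, steps = run_consensus(local_estimates_kw, line_topology)
print([round(v, 3) for v in agreed], steps)
```

With symmetric links the agents converge to the average of their initial values, but the number of iterations grows with the size and sparsity of the communication graph, which is one way to see why information diffusion is listed as a drawback of distributed designs.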

To combine advantages from both architectures, some researchers propose a hybrid MAS design, where a supervisor agent observes the microgrid and updates its local databases without interfering with agents’ action plan (Jiang and Fei 2011; Dou et al. 2017). In some research work, this design is combined with a peer to peer control, in which agents keep a log of available and unavailable communication links and nearby isolated areas (Dou et al. 2017). Another variant of hybrid architecture is breaking the microgrid into smaller controllable areas, each with a “leader” agent used as a communication gateway and a coordination entity (Tolbert et al. 2001). For example, Zone Agent manages agents in a specific area, such as a floor or a room, and Home Energy Management (HEM) Agent manages agents inside a house (Joumaa et al. 2011). This hybrid architecture, illustrated in Fig. 3, is known as hierarchical MAS and is popular among researchers for its opportunities in advanced microgrid control (Dou et al. 2017; Tolbert et al. 2001; Joumaa et al. 2011; Zheng and Li 2010; Nunna and Doolla 2013).

Fig. 3 Hierarchical design of multi-agent system

Even though each MAS design has advantages and drawbacks, combining MAS with Demand Response Programs (DRP) for energy management is a promising way to achieve efficient control of power generation and delivery. DRP define a set of incentives and emergency actions for load shifting and shedding, applied by the Demand Management System (DMS) (Tazi et al. 2017; Cintuglu et al. 2016). The objective of the DMS is to increase energy efficiency and conservation by encouraging consumers to change their consumption schedules based on variations in market energy prices. The DMS agent sends shifting/curtailment requests to consumers and receives their positive/negative feedback for quality analysis. If the microgrid is under maintenance, in an emergency or in island-mode, the controller can enforce load shedding by imposing higher Time Of Use (TOU) tariffs or shutting down low-priority loads (Tazi et al. 2017; Logenthiran et al. 2012; Moradi et al. 2016). The objective of DRP is to decrease demand during peak periods and align it as closely as possible with available energy, either from the grid or from local energy production (Tazi et al. 2017; Palensky and Dietrich 2011).

One of the main issues in DMS is the creation of new demand peaks. Indeed, having a large number of consumers shift their consumption from t1 to t2 can result in another peak at t2 (Jin et al. 2011; Pau et al. 2017). This is common in distributed MAS, where agent coordination mechanisms are asynchronous, either within the same microgrid or between adjacent microgrids. To solve this problem, a cloud-based MAS, illustrated in Fig. 4, is an interesting solution to investigate further (Boussaada et al. 2016; Palensky and Dietrich 2011; Pau et al. 2017; Boroojeni et al. 2017). In this architecture, a cloud-based DMS center receives energy requests from HEM agents and generates schedules for loads based on their priorities, the overall demand and the available energy. This design combines Zigbee devices and protocols with an energy auction protocol architecture (Labeodan et al. 2015; Tazi et al. 2017). The cloud-based MAS concept can be extended to manage microgrid clusters for optimal power sharing and efficient load shifting and shedding (Boroojeni et al. 2017).

Fig. 4 A cloud-based MAS architecture for a single microgrid
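The rebound-peak problem and the idea of a central scheduler that accounts for priorities and available energy can be illustrated with a small Python sketch. It is a toy greedy scheduler written for this example, not the algorithm of any cited work: the request fields (`name`, `priority`, `kw`, `slot`), the per-slot capacity and the rule that a load may only be deferred to a later slot are all assumptions.

```python
def naive_shift(requests, peak_slot, target_slot, n_slots):
    """Uncoordinated shifting: every load at the peak slot moves to one slot."""
    load = [0.0] * n_slots
    for request in requests:
        slot = target_slot if request["slot"] == peak_slot else request["slot"]
        load[slot] += request["kw"]
    return load


def schedule(requests, capacity_kw, n_slots):
    """Greedy central scheduling: serve high-priority loads first.

    Each load gets its requested slot if capacity allows, otherwise the
    next later slot with room. Loads that fit nowhere are marked as shed.
    """
    load = [0.0] * n_slots
    plan = {}
    for request in sorted(requests, key=lambda r: r["priority"]):  # 1 = highest
        plan[request["name"]] = None  # shed unless a slot is found below
        for slot in range(request["slot"], n_slots):
            if load[slot] + request["kw"] <= capacity_kw:
                load[slot] += request["kw"]
                plan[request["name"]] = slot
                break
    return plan, load


# Four 2 kW loads all ask for slot 0; only 4 kW is available per slot.
requests = [
    {"name": "fridge", "priority": 1, "kw": 2.0, "slot": 0},
    {"name": "heater", "priority": 2, "kw": 2.0, "slot": 0},
    {"name": "washer", "priority": 3, "kw": 2.0, "slot": 0},
    {"name": "dryer", "priority": 4, "kw": 2.0, "slot": 0},
]
print(naive_shift(requests, peak_slot=0, target_slot=1, n_slots=3))
print(schedule(requests, capacity_kw=4.0, n_slots=3))
```

In this example, moving every load from slot 0 to slot 1 simply relocates the 8 kW peak, whereas the central scheduler spreads the same loads so that no slot exceeds the available 4 kW.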

Even though this last architecture is a promising candidate for smart grid management, it raises questions about the microgrid size that gives the best system performance. There are few, if any, studies that explore the sizing of microgrids and the feasibility of cloud-based management for a smart grid. Also, load shifting and shedding can cause fluctuations in the power system, so the behavior of microgrids under multiple load shifting/shedding requests should be investigated.

3.2 MAS control layers

Another important element of MAS design is the number of control layers and agents assigned accordingly. There are many control parameters in power systems, such as breakers and switches, voltage and current, active and reactive powers, and energy forecast and dispatch. By defining control layers, each agent will be assigned a specific task and an area to control. Control layers are defined as follows (Khan and Wang 2017; Cintuglu et al. 2016; Lewis et al. 2013; Mao et al. 2014; Rivera et al. 2014a):

  • In Primary Control, the objective is system stability and reliability, achieved by real-time measurement exchange. At this stage, demand and supply are balanced by controlling the injected active power, and frequency is adjusted by a droop control algorithm. Voltage is measured at fixed intervals.

  • Phase synchronization and voltage/frequency restoration are achieved at the Secondary Control level. The power to be regulated is computed at this stage using the Microgrid Regulation Error (MRE), as shown in Eq. (1). Service restoration is ensured by local agents’ bids, which is critical for failure recovery and island-mode operation.

    $$ P_{MRE} = \Delta P_{PCC} + k \cdot \Delta f $$
    (1)

    where $P_{MRE}$ is the power to be regulated, $\Delta P_{PCC}$ is the power mismatch at the PCC, $k$ is the system frequency characteristic constant and $\Delta f$ is the frequency mismatch.

  • Tertiary Control focuses on energy dispatch and economic schedules. At this stage, power import and export are done based on economic profit and cost-effective schedules.
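As a small worked example of the quantities used in the primary and secondary layers, the Python sketch below evaluates Eq. (1) and a standard active-power/frequency droop characteristic. The droop form $f = f_0 - m\,(P - P_0)$ is the usual textbook relation and is not given in the text; the units (kW, Hz), the sign conventions and all numerical values are assumptions made for illustration.

```python
def microgrid_regulation_error(delta_p_pcc_kw, k_kw_per_hz, delta_f_hz):
    """Eq. (1): P_MRE = delta_P_PCC + k * delta_f."""
    return delta_p_pcc_kw + k_kw_per_hz * delta_f_hz


def droop_frequency(f_nominal_hz, droop_hz_per_kw, p_kw, p_ref_kw):
    """Textbook P-f droop: frequency falls as output exceeds its set-point."""
    return f_nominal_hz - droop_hz_per_kw * (p_kw - p_ref_kw)


# Assumed example: 5 kW mismatch at the PCC, k = 20 kW/Hz, frequency 0.1 Hz low.
p_mre = microgrid_regulation_error(5.0, 20.0, -0.1)
print(round(p_mre, 6))  # power the secondary layer asks the agents to regulate

# Assumed example: a unit delivering 30 kW against a 20 kW set-point,
# with a droop slope of 0.01 Hz/kW on a 50 Hz system.
print(round(droop_frequency(50.0, 0.01, 30.0, 20.0), 6))
```

Under these assumptions the regulation error is 3 kW and the droop characteristic settles at 49.9 Hz, which is the kind of deviation the secondary layer then restores.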

The choice of control levels and of the agents to deploy in each one is based on the physical architecture of the microgrid and the MAS design. In the literature, some researchers use two-level control, while most use three-level control to distribute complexity and decision-making over many agents.

3.3 Agent specification

The choice and specification of agents in a MAS is another key design element for successful modeling. Having an agent for each device in the microgrid would reduce management overhead and distribute computations over many entities. Each device would be interfaced by a corresponding agent that takes care of state management, control and measurement. However, as the number of agents increases, system latency increases as well (Lewis et al. 2013; Chung and Oh 2013; Nguyen and Flueck 2011). Latency in the MAS communication network cannot be zero or constant and depends on the microgrid status (i.e. normal operation, island-mode), but the overall objective is to decrease it as much as possible to ensure near real-time communication (Nguyen and Flueck 2011). There is a tradeoff between the number of agents and communication latency, and the optimal number of agents for building a robust MAS remains an interesting research topic. Kumar et al. (2011) conducted a simulation to study the impact of the number of agents on the performance of optimization algorithms. They concluded that having many agents increases both system accuracy and the convergence time to an optimal solution. Conversely, having a relatively small number of agents results in faster convergence and lower system latency.

According to the literature, the number of agents depends on the topology and size of the microgrid. In some research works, devices are modeled as producer, consumer, controller or observer agents. In other research works, an agent is assigned a specific task, such as load shifting and energy forecast. Dimeas and Hatziargyriou (2015) use a Translator Agent responsible for complex calculations on behalf of other agents, which reduces modifications overhead and latency. However, similar to the centralized architecture, if this agent fails, other agents will be affected. Table 2 classifies agents that appear in the different works discussed in this survey, namely references (Kantamneni et al. 2015; Xu et al. 2017; Labeodan et al. 2015; Tazi et al. 2017; Logenthiran et al. 2012; Coelho et al. 2017; Et-Tolba et al. 2014a; Nvari-Moghaddam et al. 2017; Dimeas and Hatziargyriou 2004; McArthur et al. 2007b; Rahman and Oo 2017; Raju and Milton 2017; De Azevedo 2016; Jiang and Fei 2011; Dou et al. 2017; Tolbert et al. 2001; Joumaa et al. 2011; Zheng and Li 2010; Nunna and Doolla 2013; Palensky and Dietrich 2011; Pau et al. 2017; Boroojeni et al. 2017; Lewis et al. 2013; Mao et al. 2014; Rivera et al. 2014a; Chung and Oh 2013; Nguyen and Flueck 2011; Kumar et al. 2011; Dimeas and Hatziargyriou 2015), based on the nature of their behavior and provides a description of their tasks as well.

Table 2 Summary of Agents grouped by classes based on references (Kantamneni et al. 2015; Xu et al. 2017; Labeodan et al. 2015; Tazi et al. 2017; Logenthiran et al. 2012; Coelho et al. 2017; Et-Tolba et al. 2014a; Nvari-Moghaddam et al. 2017; Dimeas and Hatziargyriou 2004; McArthur et al. 2007b; Rahman and Oo 2017; Raju and Milton 2017; De Azevedo 2016; Jiang and Fei 2011; Dou et al. 2017; Tolbert et al. 2001; Joumaa et al. 2011; Zheng and Li 2010; Nunna and Doolla 2013; Palensky and Dietrich 2011; Pau et al. 2017; Boroojeni et al. 2017; Lewis et al. 2013; Mao et al. 2014; Rivera et al. 2014a; Chung and Oh 2013; Nguyen and Flueck 2011; Kumar et al. 2011; Dimeas and Hatziargyriou 2015)

As discussed above, the main benefit of MAS for microgrid control is the deployment of DAI and DPS algorithms in agents to optimize power generation, delivery and utilization. The next section discusses optimization algorithms and techniques and their application areas, advantages and limitations.

4 MAS optimization techniques

One of the benefits of MAS is the integration of DAI and DPS to optimize control and power management in microgrids. Optimization algorithms and frameworks allow agents to evaluate many options based on the environment state to find the optimal solution to achieve a system goal. Based on the literature, optimization techniques include reinforcement learning, data-driven modeling techniques, mathematical programming models, heuristic and metaheuristic optimization algorithms, non-linear control methods and other algorithms such as auctions and programming platforms. The following sections discuss these techniques and compare them in terms of convergence time, area of application and main features.

4.1 Reinforcement learning optimization

Reinforcement Learning (RL) is a machine learning technique in which agents learn to make decisions based on the environment’s state. Since it does not require an initial model of the agent’s environment, RL is suitable for time-varying topologies, such as microgrids (Dou et al. 2017; Hu and Wellman 1998). It is therefore suitable for MAS to optimize convergence to a global solution (Schneider et al. 1999). Definitions and examples of Markov Decision Process (MDP), Game Theory and Nash Equilibrium are discussed in detail in references (Schneider et al. 1999; Busoniu et al. 2010; Aristidou et al. 2011). Q-learning is a variant of RL that uses Q-functions and values (Hu and Wellman 1998; Busoniu et al. 2010), where agents have no prior knowledge of either the reward function or the state transition functions. The agent chooses a policy from a pool of actions by trial and error, and then it updates its value estimates using a discount factor (Leo et al. 2014). Another variant of RL is Distributed Value Function (DVF), where agents communicate with their neighbors to pursue a global objective (Hu and Wellman 1998). This method was created to overcome the lack of coordination between agents and the problems in computing the global reward in other methods.
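The Q-learning idea can be made concrete with the standard tabular update rule, $Q(s,a) \leftarrow Q(s,a) + \alpha\,[r + \gamma \max_{a'} Q(s',a') - Q(s,a)]$, combined with epsilon-greedy trial and error. The Python sketch below is a generic illustration rather than the algorithm of any cited work; the state labels, action names, reward value and the learning rate `alpha` and discount factor `gamma` are assumptions.

```python
import random
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update for one observed transition."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next        # discounted future value
    q[(state, action)] += alpha * (td_target - q[(state, action)])


def epsilon_greedy(q, state, actions, epsilon, rng):
    """Trial and error: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


# Hypothetical battery agent: no model of rewards or transitions is needed,
# only the observed (state, action, reward, next state) samples.
actions = ["charge", "idle", "discharge"]
q_table = defaultdict(float)                      # unseen pairs start at 0
rng = random.Random(0)

action = epsilon_greedy(q_table, "low_price", actions, epsilon=0.2, rng=rng)
q_update(q_table, "low_price", "charge", reward=1.0,
         next_state="high_price", actions=actions)
print(action, dict(q_table))
```

The agent never needs the reward or transition functions themselves, only sampled experience, which is why the approach suits topologies that change at run time.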

According to the literature, RL techniques can be applied to storage power injection and system stability optimization. Q-learning algorithms are applied to dynamic Demand-Response (DR) scheduling, by enabling agents to learn from consumption and energy forecast data to generate day-ahead schedules for loads, storage and DG (Jiang and Fei 2011). User satisfaction is a system objective, and all generated schedules are periodically corrected by DMS Agent to improve their accuracy. MDP is applied to battery charge and discharge to achieve cost reduction, where Battery Agent is trained using Bayesian RL (BRL) algorithm with respect to predefined State Of Charge (SOC) levels (Nvari-Moghaddam et al. 2017). The objective is to find the best start time for charging/discharging a battery with reduced cost using an appropriate reward function. For additional performance, a hybrid coordinated Q-learning algorithm combines Q-tables with RL to control a stand-alone microgrid with PV panels (Leo et al. 2014). Each agent builds its Q-table based on the action space, and then it coordinates with other agents to differentiate “dangerous” actions from “safe” ones. Since microgrids are PnP topologies, connection/disconnection of devices to/from the grid results in significant power fluctuations. DVF is a good choice for stabilizing the microgrid, as discussed by researchers in Shirzeh et al. (2015). They propose a DVF-based approach to mitigate fluctuations when connecting or disconnecting devices. Power to be injected or consumed by a node (e.g. load) is first computed, then reactive power control is achieved through DVF.

One of the main challenges of RES integration is the intermittent behavior and stochastic nature of renewable resources. To address this issue, minority-game-based EMSs enable agents to compete and efficiently share the generated green energy among consumers (Huang et al. 2016). Agents adjust their behavior based on other players’ historical actions and preferences, while Consumer agents compute attractiveness and priority factors and predict demand by supervised learning. A Kalman filter is used in this approach to minimize prediction errors. Game theory is also suitable for storage control optimization, where agents compute the required and the target storage capacity to minimize costs (Vytelingum 2010). They then compete to maximize their profits and reach a Nash equilibrium by charging storage during off-peak periods. Game theory is suitable for market operation as well, in which agents pursue their own interests by negotiating bid prices, either as buyers or as sellers (Esfahani et al. 2018). Different communication protocols are used in this work to minimize delays and make the algorithm interoperable.

4.2 Data-driven modeling optimization

Data-driven modeling techniques are also good candidates for optimizing microgrid operation using power flow measurements and energy price data. Yoo et al. (2013) propose a fuzzy-based algorithm for controlling batteries under Emergency Demand Response (EDR) programs. The Battery Agent learns from past data to keep the SOC between 30% and 100% and to use RES energy as the main charging source. Fuzzy logic can be extended to energy management and scheduling by deploying a fuzzy controller in a Strategy Agent (Serraji et al. 2015). The objective of this controller is to serve demand first with locally generated energy, and then to buy energy from the main grid in case of a supply shortage.
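To give a feel for how a fuzzy rule base can keep a battery within the 30% to 100% SOC window while charging from RES surplus, the following Python sketch uses three fuzzy sets on SOC and a zero-order Sugeno-style weighted average. It is a hypothetical illustration and not the controller of Yoo et al. (2013): the membership breakpoints, the rule outputs and the function names are assumptions.

```python
SOC_MIN, SOC_MID, SOC_MAX = 0.30, 0.65, 1.00   # assumed breakpoints


def clamp01(x):
    """Limit a membership degree to the interval [0, 1]."""
    return max(0.0, min(1.0, x))


def memberships(soc):
    """Degrees to which the SOC is 'low', 'medium' and 'high'."""
    low = clamp01((SOC_MID - soc) / (SOC_MID - SOC_MIN))
    high = clamp01((soc - SOC_MID) / (SOC_MAX - SOC_MID))
    medium = max(0.0, 1.0 - low - high)
    return {"low": low, "medium": medium, "high": high}


# Rule base (assumed): IF SOC is low THEN charge at full rate, and so on.
RULE_OUTPUT = {"low": 1.0, "medium": 0.5, "high": 0.0}


def charge_rate(soc):
    """Defuzzify with a weighted average of the rule outputs."""
    mu = memberships(soc)
    total = sum(mu.values())
    return sum(mu[name] * RULE_OUTPUT[name] for name in mu) / total


def charge_power_kw(soc, res_surplus_kw, max_charge_kw):
    """Charge only from RES surplus, scaled by the fuzzy charge rate."""
    if soc >= SOC_MAX or res_surplus_kw <= 0.0:
        return 0.0
    return charge_rate(soc) * min(res_surplus_kw, max_charge_kw)


for soc in (0.30, 0.475, 0.65, 1.00):
    print(soc, round(charge_rate(soc), 3))
```

Because neighboring sets overlap, the charging rate changes smoothly with SOC instead of switching abruptly at a threshold, which is the main appeal of a fuzzy controller in this setting.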

In another research work, Queiroz et al. (2016) propose a data analysis and prediction-oriented MAS to optimize microgrid control. Two different prediction strategies are embedded in two different sets of agents: lower-level agents, responsible for short-term learning to generate rapid decisions and corrective actions, and higher-level agents, responsible for robust data analysis to generate long-term decisions. This system is composed of the following modules for data analysis: Data Extraction Module, Data Analysis Module and Decision Module. However, this study does not measure the system latency or the memory resources required to build the system.

Besides microgrid control, data-driven modeling techniques are very useful for solar/wind energy forecast. Elamine et al. (2015) combine Particle Swarm Optimization (PSO) and back propagation (BP) to train their Neural Network (NN) with past wind speed records to predict wind energy (Gamarra and Guerrero 2015). Based on historical data, RES Agents forecast energy production, and DG Agents use previous diesel prices to schedule DG dispatch (Logenthiran et al. 2012). For real-time management and error reduction, DMS Agent corrects consumption and production schedules upon reception of real-time market prices and weather forecast. Historical data can be analyzed by Consumer Agents to find the best price that would motivate a consumer to accept load shifting requests (Mets 2012). Historical data is also useful to find the lowest price for starting a device, and solve Economic Load Dispatch (ELD) optimization problem. Producer Agents can use historical data for unit commitment and energy bids (Tazi et al. 2017; Logenthiran et al. 2012). Logenthiran et al. (2012) use past market prices and load demand records to produce a tentative day-ahead planning for optimized microgrid operation.

4.3 Mathematical programming models

Mathematical Programming (MP) models provide analysis and design tools to break down large-scale problems into small, manageable sub-problems and then reassemble the sub-solutions into a global optimal solution (Herskovits et al. 2005). Researchers have proposed many MP-based optimization frameworks for microgrid control, and a selection of these works is discussed here.

Model Predictive Control (MPC) is an advanced MP process control method that satisfies a predefined set of constraints to achieve an objective by generating future predictive values using current states (Wen et al. 2016; Minchala-Avila et al. 2015). MPC, combined with a leader-following consensus algorithm, is used to regulate frequency in microgrids and find a common storage ratio for local units during peak demand or generation (Wen et al. 2016). It is also used to achieve power balance by scheduling the operation of Port-Hamiltonian (PH) systems connected to a DC microgrid (Pham et al. 2017). Combined with Bellman’s Dynamic Programming principle, MPC is suitable for finding battery reference current, either in charge or discharge modes (Morstyn et al. 2018). The MAS ensures balance in the grid by means of sliding mode control with boundaries to prevent rapid power fluctuations.

Mixed-Integer Linear Programming (MILP) is a mathematical optimization program that can be applied to problems where some decision variables should be integers, while other variables could be non-integers (Minchala-Avila et al. 2015). In the context of MAS optimization, MILP can be used to express different system objectives, such as power balance in multi-microgrid environments (Ren et al. 2014) and operational cost reduction (Tenfen and Finardi 2015).

Lagrangian Relaxation is another interesting mathematical optimization technique, in which a complex constrained optimization problem is approximated by a simpler formulation (Desale et al. 2015). Constraints are transferred to the objective function as weighted penalty terms, and the weights are called Lagrangian multipliers (Desale et al. 2015; Disfani et al. 2014). Disfani et al. (2014) use a hybrid price-update Lagrangian Relaxation to find the optimal energy bidding cost for both communities and utilities. They compare the performance of a modified sub-gradient algorithm with the Lower–Upper-Bound Switching (LUBS) algorithm for active/reactive power balance in a microgrid.
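To make the idea of moving a constraint into the objective concrete, the sketch below relaxes the power balance constraint of a small dispatch problem and updates its multiplier with a plain sub-gradient step. It is an illustrative toy, not the hybrid price-update scheme of Disfani et al. (2014); the quadratic cost model `a*p**2 + b*p`, the function names, the step size and the unit data in the usage note are all assumptions.

```python
def unit_response(lam, a, b, p_min, p_max):
    """Output minimizing a*p**2 + b*p - lam*p within [p_min, p_max]."""
    return min(p_max, max(p_min, (lam - b) / (2.0 * a)))


def dual_dispatch(units, demand, step=0.02, iters=200):
    """Lagrangian relaxation of the balance constraint sum(p) == demand.

    units: list of (a, b, p_min, p_max) tuples (hypothetical cost data).
    Returns the final multiplier (a price signal) and the unit outputs.
    """
    lam = 0.0
    for _ in range(iters):
        # Each unit solves its own sub-problem for the current price.
        outputs = [unit_response(lam, *u) for u in units]
        # Sub-gradient step: raise the price if supply is short, lower it otherwise.
        lam += step * (demand - sum(outputs))
    return lam, [unit_response(lam, *u) for u in units]
```

For example, `dual_dispatch([(0.05, 2.0, 0.0, 60.0), (0.10, 1.5, 0.0, 60.0), (0.08, 1.8, 0.0, 60.0)], 100.0)` drives the total output toward the 100-unit demand. The step size has to stay small relative to the sum of `1/(2*a)` over all units, otherwise the price update oscillates.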

4.4 Heuristic and metaheuristic optimization

Since microgrids have time-varying topologies, their constraints are non-convex, complex and non-linear based on inequalities, so heuristic and metaheuristic algorithms are suitable to achieve a better system performance in this case (Kumar et al. 2011; Wu et al. 2015). Heuristic and metaheuristic optimization encloses different classes of algorithms, such as: nature-inspired, population-based, single-point, greedy, iterative, algorithms with dynamic and static objective functions (Gogna and Tayal 2013). An algorithm could fall into two of these subcategories or more, depending on the underlying theory, e.g. Fire-Fly (FF) optimization algorithm is a nature-based, a population-based and an iterative algorithm. Among these classes, population-based, nature-inspired optimization techniques use the concept of natural evolution of biological systems to build the search process. Even though the population size of solutions increases the computational complexity, these algorithms are self-taught, simple, reliable, efficient and can even be extended and hybridized (Ding et al. 2013; Himabindu and Jyothi 2017). Therefore, population-based metaheuristics, also called p-metaheuristics, are popular thanks to the parallel exploration of the search space, resulting in lower convergence and execution times (Salgueiro-Sicilia and Rivera 2017).

Given these advantages, research on population-based and nature-inspired approaches remains active, and a large number of algorithms have been proposed, tested and deployed in many applications. Swarm intelligence (SI) is a subcategory that groups algorithms inspired by collective behaviors, such as bird flocking and ant/bee colonies (Himabindu and Jyothi 2017). A swarm is composed of relatively homogeneous, self-organizing and decentralized individuals that collaborate to reach an objective in a distributed environment (Himabindu and Jyothi 2017; Mahalem and Chavan 2012). In order to ensure reliability and flexibility of the system, SI is based on four principles: Proximity, Quality, Resource Diversity and Stability (Himabindu and Jyothi 2017). The following is a selection of popular SI algorithms used in MAS optimization: Particle Swarm Optimization (PSO), Differential Evolution (DE), Ant Colony Optimization (ACO), Artificial Bee Colony (ABC), Artificial Immune System (AIS), Chaotic Bat Optimization (CBO) and Whale Optimization (WO) (Salgueiro-Sicilia and Rivera 2017; Mahalem and Chavan 2012; Adarsh et al. 2016; Reddy et al. 2017; Shabir and Singla 2016). A selection of these algorithms is first introduced briefly before discussing their application in MAS optimization for microgrid management.

PSO is an optimization algorithm based on swarm behavior, first proposed in 1995, in which each particle is characterized by its velocity and fitness value (Das et al. 2008). These parameters are updated stochastically and used to find the global best position (Das et al. 2008; Wahab et al. 2015). This algorithm is very popular in optimization thanks to its efficient memory usage and rapid convergence to a solution in continuous non-linear environments (Wahab et al. 2015). Next, ACO, considered one of the most successful swarm-based algorithms, is inspired by the ant system and was first proposed in 1992 (Mahalem and Chavan 2012). The system is composed of ants, pheromone, daemon action and decentralized control, where the goal is to find paths leading to the “food” (Agarwal and Mehta 2014). Then comes ABC, one of the more recent swarm-based algorithms, proposed in 2005 and based on the intelligent behavior of bees when searching for food sources (Mahalem and Chavan 2012; Wahab et al. 2015). In this algorithm, three types of bees are defined: the scout bee, which randomly looks for new food sources; the employed bee, which visits and exploits the found source; and the onlooker bee, which waits in the dance area and chooses among food sources based on the information communicated by the employed bees (Mahalem and Chavan 2012; Agarwal and Mehta 2014). Inspired by the human immune system, AIS emerged as an interesting concept in the mid-1980s, but did not get the full attention of researchers until the 1990s (Himabindu and Jyothi 2017; Mahalem and Chavan 2012). The optimization is carried out by antigens that go through three phases: initialization, cloning and hypermutation (Mahalem and Chavan 2012). Thanks to its self/non-self discrimination features, AIS is considered a good candidate to build intrusion detection and classification systems (Himabindu and Jyothi 2017).
Concerning Chaotic Bat Optimization (CBO), it is an algorithm based on the hunting technique of bats by emitting frequencies, proposed in 2010, so it is a relatively new optimization algorithm (Adarsh et al. 2016; Soares et al. 2018). It is characterized by its stochastic properties and irregularity, and simulation results show the promising performance of CBO in achieving objectives. Finally, as its name suggests, Whale Optimization (WO) algorithm is one of the newest algorithms in p-metaheuristics, proposed in 2016 (Reddy et al. 2017; Shabir and Singla 2016). This algorithm optimizes the search by following the hunting mechanisms of whales in nature through 3 phases: Encircling prey; Bubble net hunting method; Search the prey. According to simulation results, WO outperforms other algorithms, such as PSO and DE, in solving EPD, ELD and RES sizing (Khan and Singh 2017).
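The PSO mechanism introduced above (stochastic velocity updates guided by each particle's best position and by the global best) can be sketched as follows. This is a minimal global-best variant for illustration only; the inertia weight, the acceleration coefficients `c1` and `c2`, the bound handling and the function name `pso` are assumptions rather than settings taken from the cited works.

```python
import random


def pso(objective, dim, bounds, n_particles=20, iters=200,
        inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over a box with a basic global-best PSO."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                      # personal best positions
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]     # global best so far
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move and clamp the particle to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val
```

Calling `pso(lambda x: sum(v * v for v in x), dim=2, bounds=(-10, 10))` minimizes a simple sphere function. In a MAS setting the objective would be replaced by a dispatch or scheduling cost, which is outside the scope of this sketch.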

Besides SI, evolutionary algorithms, another subset of p-metaheuristic algorithms, also include a selection of interesting optimization algorithms (Himabindu and Jyothi 2017; Mahalem and Chavan 2012). One of the most popular in this category is the Genetic Algorithm (GA), proposed in the 1970s, which is a heuristic search inspired by natural selection (Wahab et al. 2015). To reach the global objective, chromosomes undergo three phases controlled by fitness values: crossover, reproduction and mutation (Ding et al. 2013; Himabindu and Jyothi 2017; Wahab et al. 2015). Table 3 presents the input parameters and evolutionary mechanisms of a selection of these algorithms based on references (Ding et al. 2013; Himabindu and Jyothi 2017; Salgueiro-Sicilia and Rivera 2017; Mahalem and Chavan 2012; Adarsh et al. 2016; Khan and Singh 2017).

Table 3 Input parameters and evolutionary mechanisms of the discussed algorithms based on references (Ding et al. 2013; Himabindu and Jyothi 2017; Salgueiro-Sicilia and Rivera 2017; Mahalem and Chavan 2012; Adarsh et al. 2016; Khan and Singh 2017)

The choice of an optimization algorithm has always been an important issue in this field, and multiple comparative studies have been carried out to support efficient algorithm selection. On the one hand, it has been shown that parameter selection and initialization have a significant impact on the performance of algorithms (Khan and Singh 2017). Based on the extensive simulations conducted by Khan and Singh (2017), one can conclude that p-metaheuristic algorithms can generate good results when applied to the appropriate optimization area with suitable starting values. Table 4 depicts a non-exhaustive list of application areas for a selection of optimization algorithms based on (Gogna and Tayal 2013; Soares et al. 2018; Khan and Singh 2017). On the other hand, it is reported in (Salgueiro-Sicilia and Rivera 2017; Shabir and Singla 2016; Khan and Singh 2017) that FF, PSO and DE are the most popular algorithms in the field of optimization thanks to their computational efficiency, parallel search abilities and quality of the returned solutions. In fact, FF and PSO have a good overall optimization performance when combined with MAS for microgrid control and are popular for two main reasons: they are easy to hybridize and easy to implement (Salgueiro-Sicilia and Rivera 2017; Khan and Singh 2017). It has even been shown in (Shabir and Singla 2016; Khosla et al. 2007) that PSO outperforms GA in computational effort and outperforms other algorithms, such as AIS and ABC, in the evaluation of the search space (Khan and Singh 2017). In addition, a study conducted by Roche (2012) concludes that combining different optimization algorithms with PSO enhances the overall performance of the system.

Table 4 Application areas of selected optimization algorithms based on Gogna and Tayal (2013), Soares et al. (2018), Khan and Singh (2017)

Therefore, according to the literature, PSO and its hybrid versions are extensively deployed in MAS to optimize microgrid-related issues, such as reactive power dispatch, which affects transmission losses and tap ratios of transformers and compensators (Zhao et al. 2005). This issue is addressed by combining GA with PSO in such a way that DG agents compute the dispatch power at the lowest cost using GA, and then PSO finds the agent with the lowest dispatch cost. PSO is also useful in advanced and enhanced control of combined heat and power (CHP) micro-systems and electrical heaters, considering consumption reduction and user comfort (Jiang and Fei 2011; Hurtado et al. 2015; Wang 2013). Another issue in MAS is the communication delays between agents, which is addressed by asynchronous PSO used to reach a global consensus between agents (Gazi and Ordonez 2014). In this asynchronous variant, each agent is required to compute the error between the actual and received information and update the target position.

As highlighted before, combining PSO with other optimization algorithms improves the overall performance of the system and is thus a good candidate for solving ELD and EPD. On the one hand, the objective of ELD is to serve as many loads as possible at the lowest cost by generating dispatch schedules ahead of time (Gogna and Tayal 2013). Hybrid GLOBest-PSO and Back Propagation (GLOBest-BPPSO) is a good combination to find both local and global ELD solutions in the same iteration in less time and with smaller memory requirements (Elamine et al. 2016). On the other hand, the objective of EPD is to reduce production cost as much as possible by finding the optimal share of power generation among connected DGs (Kumar et al. 2011). It is important to consider the rippling effect from the steam admission valve when generating DG schedules (Gogna and Tayal 2013). Bee colony and the Nelder-Mead method are combined to find the local lowest cost, while PSO is used to find the lowest global generation cost. This version of PSO is called Hybrid Multi-Agent based PSO (HMAPSO). In addition to ELD and EPD, Market Clearing Price (MCP) is a computational problem addressed by combining PSO with Game Theory (Dou et al. 2016). The problem is first converted to a linear problem using a Mathematical Program with Equilibrium Constraints (MPEC) and MILP, and then Nash equilibrium is reached using both game theory and PSO.
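The rippling effect of steam admission mentioned above is often modeled in the dispatch literature as a rectified sinusoid added to a quadratic fuel cost (the so-called valve-point effect). The sketch below shows such a cost function as it could be handed to a metaheuristic such as PSO. The coefficient names `a`, `b`, `c`, `e`, `f` follow a common convention, and every number in the usage note is hypothetical, not data from the cited papers.

```python
import math


def fuel_cost(p, a, b, c, e=0.0, f=0.0, p_min=0.0):
    """Fuel cost of one generator with an optional valve-point ripple term.

    a, b, c: quadratic cost coefficients; e, f: ripple amplitude and frequency.
    """
    smooth = a * p * p + b * p + c
    ripple = abs(e * math.sin(f * (p_min - p)))
    return smooth + ripple


def total_cost(schedule, units):
    """Total cost of a dispatch schedule; `units` is a list of coefficient dicts."""
    return sum(fuel_cost(p, **u) for p, u in zip(schedule, units))
```

For instance, `fuel_cost(100, 0.01, 2.0, 50)` evaluates the smooth quadratic part only, while passing `e=30, f=0.05, p_min=50` adds the ripple. The ripple makes the objective non-smooth and multi-modal, which is one reason gradient-free population-based methods are attractive for this problem.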

Even though PSO is popular in MAS optimization, it has been shown that other p-metaheuristic algorithms generate better results and are accordingly used to solve issues related to microgrid management (Khan and Singh 2017). Adapted Ant Colony Optimization (A-ACO) is used to solve ELD problem by searching for the shortest path between start and end times of devices (Dethlefs et al. 2014). Microgrid operational cost can be reduced by almost half using Enhanced Bee Colony Optimization (EBCO) (Whei-Min et al. 2015). Because of the poor stability of traditional bee colony optimization in high dimensional search spaces, EBCO is improved by two factors: penalty factor to adjust the search in case of bad decisions, and self-adaption repulsion factor to serve search direction change when needed.

Supply–demand balance and efficient reserve management is an optimization issue of concern that can be addressed by Modified-Harmony Search Algorithm (M-HSA) (Jiao et al. 2017). Harmony Search Algorithm is a computational technique that aims at maximizing or minimizing the objective function based on music theory. To enhance the convergence time, M-HSA explores the search space from both sides and assigns a weight to the optimal solution and a penalty to the remaining ones. Besides Hybrid PSO and Bee Colony, Artificial Immune System (AIS) is another interesting option to solve EPD, which is shown to produce cost-efficient generation schedules (Bhuvaneswari et al. 2010).

Even though few successful applications of WO and CBO are reported in published works, these algorithms are gaining attention because of their potential (Khan and Singh 2017; Roche 2012). Reddy et al. (2017) and Trivedi (2016) use the Whale Optimization (WO) algorithm to solve EPD and ELD in a high-dimensional and constrained space, where environmental variables are also considered. According to the simulation results, WO outperforms other evolutionary algorithms, such as PSO and DE, converging faster and generating better costs. Finally, the Chaotic Bat Optimization (CBO) algorithm is used to optimize EPD in a high-dimensional space, and simulation results show that this algorithm outperforms PSO, DE and Teaching-Learning-Based Optimization (TLBO) (Adarsh et al. 2016).

4.5 Non-linear control techniques

Non-linear control techniques deal with non-linear and time-variant systems that do not obey the superposition principle and might have multiple equilibrium points (Billings 1985). For more details about this type of system, the reader is referred to Billings (1985). Even though these techniques are not as popular as heuristic and metaheuristic techniques for optimization, some research works propose control schemes based on non-linear system theory. Since microgrids and MAS are non-linear systems, these techniques are suitable for power management problems such as secondary voltage and battery control. Input–Output Feedback Linearization (IOFL) is one of these techniques; it algebraically transforms a non-linear problem into a linear one so that linear control techniques can be applied (Bidram et al. 2013; Megretski 2016). This technique cancels nonlinearities and transforms closed-loop dynamics to a linear form by decomposing agents’ dynamics into an rth-order dynamical system; combined with the Lyapunov technique, it is then used to enforce secondary voltage control. The Lyapunov technique ensures stability of the system near the equilibrium point. However, the solution proposed in Bidram et al. (2013) assumes a time-invariant topology with a D-graph communication network.

Sliding mode control is another technique that transforms the dynamics of a non-linear system by means of a discontinuous control signal to make the system “slide” and lose its nonlinearity (Weigen et al. 2000). It is well suited to overcoming the limitations of V-I droop control of batteries, which results in divergent SOC and rapid deterioration of storage lifetime (Morstyn et al. 2017). The sliding mode technique generates the level of participation needed to achieve SOC balance by sliding closer to the average SOC of neighbors. This method still suffers from overloading the same batteries and from excessive battery current variations; hence, boundary layers are added to smooth transitions. The proposed algorithm is designed for time-variant topologies (Morstyn et al. 2017).
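The role of the boundary layer can be illustrated with a saturation function that replaces the discontinuous sign function near the sliding surface. The sketch below is a generic textbook-style construction, not the controller of Morstyn et al. (2017); the sliding variable (local SOC minus the neighbors' average), the gain and the layer width are assumptions.

```python
def sat(s, width):
    """Saturation function: linear inside the boundary layer, +/-1 outside."""
    if width <= 0:
        raise ValueError("boundary layer width must be positive")
    return max(-1.0, min(1.0, s / width))


def soc_balancing_signal(soc, neighbor_socs, gain=1.0, width=0.05):
    """Control signal pushing a battery's SOC toward its neighbors' average.

    A positive output means the battery should discharge more than its
    neighbors, because its SOC is above the neighborhood average.
    """
    avg = sum(neighbor_socs) / len(neighbor_socs)
    s = soc - avg  # assumed sliding variable
    return gain * sat(s, width)
```

Far from the average the signal is at full strength, as with a pure sign function. Inside the layer it scales down proportionally, which avoids the rapid switching that causes excessive current variations.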

4.6 Other optimization models

Other frameworks and algorithms are proposed by researchers to optimize microgrid issues such as fair allocation of resources, market transactions and agents’ consensus. In some cases, load shifting is not enough to balance demand with supply, so load shedding or curtailment is used instead, where low-priority loads are temporarily disconnected from the microgrid. But which dwelling should be penalized by shutting down its loads? Researchers address this issue by designing algorithms for a fair load curtailment and resources allocation. These algorithms include Constrained Equal Losses (CEL) algorithm (Kim et al. 2010) and Distributed Load-Shedding (DLS) approach (Lim et al. 2014). CEL divides the amount of loads to be shut-down over many participants to decrease consumers’ penalties and discomfort. It assigns priorities to residential loads based on user preferences to prioritize critical loads (Kim et al. 2010). For demand–supply balance, DLS shuts down some loads in case of an islanded microgrid using energy prices as a consumer incentive to accept load shedding requests (Lim et al. 2014).
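The equal-loss idea behind CEL can be sketched with the classical constrained-equal-losses rule from the bankruptcy-problem literature: every participant gives up the same amount, except that nobody's allocation drops below zero. Mapping this rule onto the load-curtailment setting of Kim et al. (2010) is an assumption, as are the function name and the numbers in the usage note.

```python
def constrained_equal_losses(claims, available):
    """Allocate `available` energy among `claims` with equal losses per claimant.

    Each claimant i receives max(0, claims[i] - loss), where the common
    `loss` is chosen so that the allocations sum to `available`.
    """
    if available >= sum(claims):
        return [float(c) for c in claims]  # no shortage, nothing to curtail
    if available <= 0:
        return [0.0 for _ in claims]
    lo, hi = 0.0, float(max(claims))
    for _ in range(100):
        # Bisection on the common loss.
        loss = (lo + hi) / 2.0
        served = sum(max(0.0, c - loss) for c in claims)
        if served > available:
            lo = loss  # still serving too much: increase the loss
        else:
            hi = loss
    return [max(0.0, c - hi) for c in claims]
```

With claims of 10, 6 and 2 kWh and only 12 kWh available, each participant loses 2 kWh, so the allocation is 8, 4 and 0 kWh. Small claimants can be curtailed entirely, which is why priorities for critical loads are handled separately in practice.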

To achieve an important operational cost reduction, energy market auction is a popular technique, where agents negotiate their bidding costs directly with each other, such as peer-to-peer approaches (P2P) (Zhang et al. 2018a; Zhou et al. 2018). Consumers and generators are called prosumers acting in an internal energy market inside a microgrid. The internal market is composed of 3 models: the pricing model for internal transactions, decision-making model for scheduling bids, and implementation model for interaction regulation (Zhou et al. 2018). Agents use non-cooperative Game Theory concept to reach Nash equilibrium, and their actions have rewards for RL (Lim et al. 2014). Other approaches to build an energy market auction environment include Continuous Double Auction (CDA) and Zero-Intelligence plus (ZIP) strategy for bids (Nunna and Doolla 2013), FIPA auction protocol (Dimeas and Hatziargyriou 2004; Dimeas and Hatziagyriou 2004), fair distribution of resources through an auction algorithm (Dimeas and Hatziargyriou 2005), distributed finite-time optimal resource management through auction (Zhao and Ding 2018), and a multi-agent based transaction energy framework for smart microgrids (Nunna and Srinivasan 2017).

Finally, MAS are fully decentralized and cooperative environments, where a problem is broken into sub-problems that are assigned to multiple agents. Cooperative MAS is discussed further by researchers and evaluated using scenarios such as enhanced energy cost reduction (Dimeas and Hatziagyriou 2004) and frequency control (Lewis et al. 2013). Frequency control is achieved by local DGs, where one of the DG units is selected as a commander to define the reference frequency and lead the other DGs to stabilize the grid (Lewis et al. 2013). For this purpose, a main entity, such as a Task Decomposing System (TDS), cooperatively assigns tasks to agents in the system to achieve hybrid cooperative control (Ming et al. 2003).

Table 5 summarizes algorithms and approaches discussed in this section, their corresponding optimization framework designs, the main observations and simulation results. The comparison is carried out in terms of convergence time and performance in achieving system goals. Table 6 summarizes advantages and disadvantages of a selection of optimization algorithms. A couple of observations from both these tables are noteworthy. First, the performance of the proposed algorithms in the fourth section of Table 5 is compared with the performance of PSO in the same test settings. It is found that MAS-based PSO has a relatively good performance in microgrid optimization. In addition, optimization techniques are mostly used in hybrid form, where multiple algorithms are combined to overcome disadvantages discussed in Table 6. For example, MP methods are used as a first step to simplify and reformulate system constraints and objective functions, then heuristic algorithms and non-linear control techniques are used to optimize the system operation and reach equilibrium.

Table 5 Summary of reviewed optimization techniques
Table 6 Advantages and disadvantages of a selection of optimization techniques

However, there is still no clear study of the computational resources required to keep system latency under a certain threshold for hybrid MAS. Also, most researchers assume a time-invariant topology, which limits the possibility of applying the proposed methods in real settings. It is thus important to explore and confirm the behavior and performance of these algorithms in time-variant topologies. In addition, the performance and convergence of data-driven optimization depend on the quality and size of the data fed to the system. A study of this relationship and its effect on the overall system performance is needed to validate the solutions proposed by researchers.

Since information diffusion and consensus is considered a stand-alone optimization area of research (Zhang et al. 2018a), the following section is dedicated to the discussion of research progress in this field.

5 Information diffusion and consensus

MAS optimization techniques do not guarantee an efficient control of microgrids without a robust communication network and message protocols to reduce communication delays. Examples of message protocols are FIPA Contract Net Interaction protocol, Contract Net Protocol (CNP) (Colson and Nehrir 2011; Mao et al. 2014; Pipattanasomporn et al. 2009), and Service Level Agreements (SLA) (Amato et al. 2016). The objective of these communication protocols is to find the shortest channel between agents and emphasize their self-organization. For this purpose, MAS is modeled as a directed graph (D-graph) for easier spanning tree search and thus avoid central entities intervention (Raju et al. 2017; Basso et al. 2013; Oliveira et al. 2012). For improved spanning tree search, D-graphs are combined with Laplacian matrix for an efficient shortest path search (Kanchev et al. 2012).

Information diffusion among agents is meant to reach consensus, also called agreement, rendezvous or synchronization (Gazi and Ordonez 2014; De Azevedo et al. 2017). One of the most used consensus algorithms in the literature is the Average-Consensus Algorithm, where a virtual leader is needed to update consensus values, while the other agents are considered followers (Huang et al. 2017; Wang et al. 2015). This algorithm uses individual weights maintained and updated locally by each agent based on local and received information. The average-consensus algorithm has many applications, such as EPD, surplus/deficient active power management and optimization of storage discharge cycles (Wen et al. 2016; Huang et al. 2017). The Incremental Cost Consensus Algorithm is a variant, where each agent maintains a consensus variable updated through regular information exchange with neighbors (Li et al. 2016; Lewis et al. 2013; Chow M and Zhan 2011). Each time a mismatch between demand and supply is detected by smart meters, the feedback coefficient, also called regulation ratio, is updated (Li et al. 2016; Guanghui et al. 2018). The adaptive consensus-based dispatch algorithm is another variant, where an adaptive law is used to update weights and converge to a consensus (Guanghui et al. 2018). A random multi-radius geographical spatial network is used to represent the communication network, while the analysis of constraints and objective functions is carried out using a logarithmic barrier function and a Lyapunov energy function.

These algorithms are based on leader-following consensus, where a leader supervises the update of weights and communicates the target consensus value to agents (De Azevedo et al. 2017; Huang et al. 2017; Wang et al. 2015; Chow M and Zhan 2011; Guanghui et al. 2018). To overcome the issues related to a leader failure, one method is to split the leader’s tasks over multiple agents to increase robustness (Chow M and Zhan 2011). One agent, called a consensus manager, then handles negotiations and agreements, and another agent, called a local controller, implements final decisions. Another method, called the consensus + innovations approach, considers each agent as its own leader in fully distributed environments (Hug et al. 2015). Each agent is assigned a quadratic function to compute the power cost for buying/selling energy within the upper and lower boundaries set to regulate agents’ negotiations.

Combining Metropolis weights with the average-consensus algorithm is shown to increase both the efficiency and the robustness of MAS (De Azevedo et al. 2017), since these weights are easy to compute and suitable for time-varying topologies (Xiao et al. 2006; Bui et al. 2017). Each agent keeps a matrix representation of its neighbors and a trust factor to track updated statuses, so if an agent is dropped, weights are updated quickly and easily. The Combine-Then-Adapt (CTA) diffusion strategy is then used to broadcast information such as energy cost, available power and local demand (De Azevedo et al. 2017). The CTA diffusion strategy is an information-sharing algorithm that is also based on weights and matrices, where each agent is connected to a circuit breaker and updates its local Metropolis weight based on the breaker status (ON/OFF) (Xiao et al. 2006; Bui et al. 2017). The CTA equation is given as follows (Raju and Milton 2017):

$$ CTA_{Diff} = \left\{ \begin{array}{l} \phi_{i,k-1} = \sum_{j \in N_{i}} a_{ij}\, x_{j,k-1} \\ x_{i,k} = \phi_{i,k-1} - \mu_{i}\, S_{i,k}(\phi_{i,k-1}) \end{array} \right. $$
(2)

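where $x_{j,k-1}$ is the state of agent $j$ at time $k-1$; $\phi_{i,k-1}$ is the intermediate variable for agent $i$ at time $k-1$; $\mu_{i}$ is a nonnegative updating parameter of agent $i$; $S_{i,k}(\phi_{i,k-1})$ is the stochastic gradient for agent $i$ evaluated at the intermediate state $\phi_{i,k-1}$; $N_{i}$ is the neighborhood of agent $i$; and $a_{ij}$ is the combination weight that agent $i$ assigns to agent $j$.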
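The two steps of Eq. (2) can be sketched for scalar agent states as follows, with the combination weights built by the Metropolis rule $a_{ij} = 1/(1 + \max(d_i, d_j))$ for neighboring agents, where $d_i$ is the number of neighbors of agent $i$. The adjacency-list representation, the function names and the quadratic local costs used in the usage note are assumptions made for illustration.

```python
def metropolis_weights(neighbors):
    """Build Metropolis weights from an undirected adjacency list.

    neighbors[i] is the set of agents linked to agent i (excluding i itself).
    """
    n = len(neighbors)
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in neighbors[i]:
            a[i][j] = 1.0 / (1 + max(len(neighbors[i]), len(neighbors[j])))
        a[i][i] = 1.0 - sum(a[i])  # self-weight makes each row sum to one
    return a


def cta_step(x, a, mu, grad):
    """One Combine-Then-Adapt iteration for scalar agent states.

    grad(i, phi) returns agent i's (stochastic) gradient evaluated at phi.
    """
    n = len(x)
    # Combine: phi_{i,k-1} = sum_j a_ij * x_{j,k-1}
    phi = [sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]
    # Adapt: x_{i,k} = phi_{i,k-1} - mu_i * S_i(phi_{i,k-1})
    return [phi[i] - mu[i] * grad(i, phi[i]) for i in range(n)]
```

As a usage example, take four agents on a line with local targets `d = [1, 2, 3, 4]` and gradients `phi - d[i]`. Repeated calls to `cta_step` with a small `mu` pull all states toward the network-wide average of the targets. With `mu` set to zero the iteration reduces to plain average consensus.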

Some issues in information sharing and consensus are still not thoroughly investigated by researchers, such as control collisions, communication network noise and failures, and agents’ resource management. Power systems in general are sensitive to control actions, especially devices such as buses and transformers. Hence, if two or more agents generate control commands for the same device at the same time, or if the generated control commands conflict, a collision occurs that can result in equipment damage. This issue can be solved by control tokens (Yoon et al. 2011), where a flag is updated when an agent issues a control command to a device. The flag is updated again when the agent releases the token so that other agents can take control of the device. Other approaches and algorithms are needed to address this sensitive issue in MAS for microgrid control.
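The control-token idea can be sketched as a per-device flag that records which agent currently holds control. This single-process illustration is not the mechanism of Yoon et al. (2011); the class and function names are hypothetical, and a real distributed deployment would additionally need an atomic acquire operation and a timeout for tokens held by failed agents.

```python
class ControlToken:
    """Per-device token: only the current holder may issue control commands."""

    def __init__(self):
        self._holder = None  # flag: None means the device is free

    def acquire(self, agent_id):
        """Try to take control; return True on success, False if another agent holds it."""
        if self._holder is None or self._holder == agent_id:
            self._holder = agent_id
            return True
        return False

    def release(self, agent_id):
        """Release the token; only the current holder is allowed to release it."""
        if self._holder == agent_id:
            self._holder = None
            return True
        return False


def send_command(token, agent_id, command, log):
    """Append the command to the device log only if the agent holds the token."""
    if not token.acquire(agent_id):
        return False  # collision avoided: the device is controlled by another agent
    log.append((agent_id, command))
    return True
```

An agent that is refused simply retries later or escalates, so two conflicting commands never reach the same device at the same time.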

In general, researchers neglect communication noise when designing consensus and information diffusion algorithms, which is an idealized scenario given that all types of communication networks are subject to noise and delays. Wang et al. (2015) address this issue with an average-consensus algorithm with binary-valued observations, where agents keep only an estimate of neighbors’ states and receive information in binary form. A control law is also designed to ensure the convergence and consistency of local values with the system objective. As a side note, Zigbee and Wireless Mesh Networks (WMN) are good candidates for building the microgrid communication backbone, with lower deployment costs and time and an efficient, reliable and noiseless communication network (Kim and Lim 2012). More research on information diffusion and routing algorithms in noisy environments is needed.

Since sensor resources are limited and feedback is not continuous, agents should be event-triggered and remain in idle mode when no events are happening. This issue is addressed by asymptotic and asynchronous convergence, where the MAS should be strongly connected and balanced (Chen and Hao 2012). Agents should also have an error measurement mechanism to filter inconsistent data. Finally, all reviewed articles assume homogeneous agents, which is not always the case. In heterogeneous MAS, dynamic compensators interface agents and their communication to reach consensus dynamically (Hu et al. 2017). Thanks to this approach, agents estimate the next triggering period based on previous interactions, which reduces error handling overhead. Congestion avoidance algorithms are also required in these systems to monitor communications and avoid collisions. In addition, reaching consensus in such systems differs from reaching it in homogeneous MAS. Zhang et al. (2017) propose a consensus algorithm based on a Hybrid Critic-Action Neural Network and Adaptive Dynamic Programming (ADP) for continuous-time heterogeneous linear MAS with communication delays. They use model transformation to obtain a discrete-time delay-free model, and then apply an appropriate control strategy based on Hamilton–Jacobi–Bellman (HJB) equations and the Lyapunov technique.
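The event-triggered behavior described at the start of this discussion can be sketched as an agent that broadcasts its state only when it has drifted far enough from the last transmitted value, and stays idle otherwise. The fixed threshold rule, the class name and the numbers in the usage note are assumptions, not the triggering conditions of the cited works.

```python
class EventTriggeredAgent:
    """Agent that transmits only when its state changes by more than a threshold."""

    def __init__(self, state, threshold):
        self.state = state
        self.last_sent = state  # value the neighbors currently know
        self.threshold = threshold

    def update(self, new_state):
        """Store a new local measurement.

        Returns the value to broadcast, or None when the agent stays idle.
        """
        self.state = new_state
        if abs(new_state - self.last_sent) > self.threshold:
            self.last_sent = new_state
            return new_state
        return None
```

For example, an agent created with `EventTriggeredAgent(50.0, 0.5)` stays silent for a reading of 50.2 and broadcasts once the reading reaches 50.6. Neighbors keep using the last broadcast value in between, which trades a bounded estimation error for fewer messages.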

Table 7 summarizes consensus and information diffusion algorithms discussed in this section, with the area of application, decision variables and the main features. According to simulation results, communication network topology, high connectivity of MAS and sampling times are important prerequisites to guarantee the convergence of the system to a value.

Table 7 Application and main features of consensus and diffusion algorithms

Optimization and consensus algorithms discussed in Sects. 4 and 5 have as objectives the reduction of operational costs, the balance between demand and supply and the use of locally-produced green energy. At the same time, microgrids are power systems that are subject to physical layer thresholds that should be strictly respected for operational safety. Therefore, while cooperating to achieve MAS objectives, agents are also required to respect these thresholds. The next section synthesizes physical layer thresholds for microgrid operational safety and summarizes performance indicators used to measure MAS performance in microgrid management optimization.

6 Microgrid performance indicators

In addition to energy generation management, demand–supply alignment and power cost reduction, MAS should ensure the operational safety of microgrids by controlling voltage and current flows. These measures are defined by the grid code of the country in which the microgrid is deployed. Thresholds for small-scale power systems are defined and should be respected to ensure normal operation of devices. Table 8 synthesizes microgrid physical layer thresholds for safe operation (Bollen et al. 2009; Honarmand 2015; RASCO 2005).

Table 8 Synthesis of physical layer thresholds for microgrids based on references (Bollen et al. 2009; Honarmand 2015; RASCO 2005)

Therefore, equations should be embedded in agents to compute the values needed for physical layer control. These equations also include performance indicators, such as power cost, CO2 footprint and user comfort indices. Based on the reviewed articles, Table 9 summarizes a selection of performance metrics and provides a brief description and the corresponding equation where applicable.

Table 9 Summary of MAS performance indicators

It is noteworthy that CO2 emissions are significantly affected by aging infrastructures and incomplete DG combustion. Therefore, maintenance of power systems is a must to decrease the CO2 footprint and its impact on the environment (Kanchev et al. 2012). Agents listed in Table 2 use the equations in Table 9 and the thresholds in Table 8 to issue control commands to devices. However, an interesting research opportunity is to explore the behavior of the system when two or more agents issue contradictory control commands. The organization and monitoring of agents’ commands and interactions with the physical layer are therefore necessary even in fully distributed environments.
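A minimal sketch of such monitoring is given below, assuming a hypothetical command record and hypothetical agent and device names. The 230 V nominal voltage and the 10% band in the example are placeholders and are not the values of Table 8; the actual limits come from the applicable grid code.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Command:
    agent: str    # issuing agent (hypothetical name)
    device: str   # target device on the physical layer
    action: str   # requested action, e.g. "open" or "close"


def find_conflicts(commands):
    """Return {device: sorted actions} for devices that received different actions."""
    actions = defaultdict(set)
    for cmd in commands:
        actions[cmd.device].add(cmd.action)
    return {dev: sorted(acts) for dev, acts in actions.items() if len(acts) > 1}


def within_band(value, nominal, tolerance):
    """True if a measurement stays inside nominal +/- tolerance (a fraction)."""
    return abs(value - nominal) <= tolerance * nominal


pending = [
    Command("load_agent", "breaker_3", "open"),
    Command("generation_agent", "breaker_3", "close"),
    Command("storage_agent", "inverter_1", "charge"),
]
conflicts = find_conflicts(pending)            # breaker_3 has opposing requests
voltage_ok = within_band(235.0, 230.0, 0.10)   # placeholder limits, not Table 8
```

How a detected conflict is resolved (priority rules, arbitration by a supervising agent, or rejection of both commands) is left open here, in line with the research gap noted above.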

7 MAS for multi-microgrids service restoration

As shown in Fig. 1, microgrids with different types, MAS, management strategies and objectives are connected to each other to form a smart grid. In normal mode, a smart grid MAS controls the trade and flow of energy between different microgrids (Ren et al. 2014; Nunna and Doolla 2014). When an area in the microgrid fails, the concerned agent, such as the Zone Agent, identifies the causes and sends a request to the Breaker/Switch Agent to isolate that area and mitigate the impact of cascading failures (Solanki J et al. 2005; Manickavasagam 2011). The remaining agents then activate emergency service restoration algorithms for microgrid recovery (Colson and Nehrir 2011). As soon as the fault has been isolated, agents should launch the recovery procedure by stabilizing voltage, balancing demand with supply and fulfilling any other system constraints (Boussaada et al. 2016). In case of a microgrid blackout, agents are required to launch a black start, which is difficult and dangerous for connected loads and diesel generators (Cai et al. 2011). The general procedure for service restoration in microgrids is composed of two steps: the Emergency Reaction Step, for system stabilization, and the Restorative Step, for load restoration (Boussaada et al. 2016; Cai et al. 2011). At first, Generation agents should estimate the available power from local energy storage and power production to feed high-priority loads. Since diesel generators do not generate power immediately, load shedding is the solution to the immediate imbalance between demand and supply (Cai et al. 2011; Leng and Polmai 2014). Load agents communicate their priorities to the appropriate agent in the control hierarchy, and energy is then dispatched using island-mode control algorithms (Hernandez et al. 2014). If the local power production cannot satisfy the local demand, the MGCCO agent will send a request to nearby microgrids to trade their extra energy through a virtual auction market (Ren et al. 2014). The Universal Market Intelligent agent is responsible for managing a fair auction between all microgrids, and the IP agent stores the details of all transactions.
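The Emergency Reaction Step described above can be sketched as a priority-ordered dispatch. This is an illustrative simplification, not a procedure taken from the cited works: the load names, priorities, power figures and the rule of using the shed demand as the amount to request from neighbors are all assumptions.

```python
def emergency_dispatch(loads, storage_kw, generation_kw):
    """Serve loads by priority with the power available right after a fault.

    loads -- list of (name, priority, demand_kw); a lower priority number
             means a more critical load
    Returns (served names, shed names, shed demand in kW).
    """
    available = storage_kw + generation_kw
    served, shed = [], []
    # Walk through the loads from most to least critical.
    for name, _priority, demand in sorted(loads, key=lambda load: load[1]):
        if demand <= available:
            served.append(name)
            available -= demand
        else:
            shed.append(name)   # shed until more power becomes available
    shed_kw = sum(demand for name, _p, demand in loads if name in shed)
    return served, shed, shed_kw


# Hypothetical loads: (name, priority, demand in kW).
loads = [
    ("hospital", 1, 40.0),
    ("street_lighting", 3, 20.0),
    ("water_pump", 2, 30.0),
    ("residential", 2, 50.0),
]
served, shed, shed_kw = emergency_dispatch(loads, storage_kw=60.0, generation_kw=25.0)
```

In this sketch the returned shed demand would be the quantity that the MGCCO agent requests from nearby microgrids. Note that the loop keeps trying smaller, lower-priority loads after a larger one is shed; a stricter policy would stop at the first load that does not fit.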

Since fluctuations at one microgrid can affect nearby microgrids, a global electrical system agent should keep a view of the smart grid to ensure collaboration for service restoration (Hernandez et al. 2014). This global agent is called an Arbiter, and it maintains stability and a low recovery cost for both the main grid and the faulty microgrid (Rivera et al. 2014b). One option for optimizing service restoration in a multi-microgrid environment is to break the system into sub-problems assigned to different agents across the MAS. Decomposing complexity over many entities reduces computational overhead and increases system reactivity to failures (Wang 2017). Meanwhile, the MGCCO agent of the faulty microgrid can request power support from nearby microgrids by combining the inertia and angles of all their diesel generators to produce enough power to feed high-priority loads (Wang 2017; Resende et al. 2011). This procedure requires appropriate energy dispatch algorithms, and nearby microgrids should use load shedding and shifting in case of energy deficiency.

Effective service restoration depends on real-time communication between DG agents across microgrids for proper emergency power dispatch (Joumaa et al. 2011). To minimize communication delays that could be fatal to the system (Tolbert et al. 2001; Nguyen and Flueck 2011), a D-graph can be used to represent paths between all diesel generators across microgrids and to generate spanning trees for quick path search (Boussaada et al. 2016; Tolbert et al. 2001). To increase restoration efficiency, PI controllers are deployed to reduce errors between real and measured values, and socket messages are efficient in decreasing communication delays (Nguyen and Flueck 2012; Abidi et al. 2017). If the MGCCO agent of the failing microgrid receives a REJECT response to its power request, it should find alternatives to satisfy local demand until the failures are resolved. One of these solutions is shutting down some neighborhood facilities (e.g. public lighting) with Direct Load Control (Et-Tolba et al. 2014b). Figure 5 shows the synthesized sequential diagram of the multi-microgrid service restoration procedure discussed in this section, based on references (Nunna and Doolla 2014; Solanki J et al. 2005; Manickavasagam 2011; Cai et al. 2011; Leng and Polmai 2014; Hernandez et al. 2014; Rivera et al. 2014b; Wang 2017; Resende et al. 2011; Et-Tolba et al. 2014b).
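The spanning-tree idea can be sketched with a breadth-first search. The D-graph is assumed here to be a plain undirected adjacency map between diesel generators, which is a simplification of whatever structure the cited works use; the generator names and links are hypothetical.

```python
from collections import deque


def bfs_spanning_tree(graph, root):
    """Return {node: parent} for a breadth-first spanning tree rooted at root."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in parent:      # first visit fixes the tree edge
                parent[neighbor] = node
                queue.append(neighbor)
    return parent


def path_to_root(parent, node):
    """Follow parent links from node back to the root of the tree."""
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return path


# Hypothetical communication links between diesel generators.
links = {
    "DG1": ["DG2", "DG3"],
    "DG2": ["DG1", "DG4"],
    "DG3": ["DG1", "DG4"],
    "DG4": ["DG2", "DG3", "DG5"],
    "DG5": ["DG4"],
}
tree = bfs_spanning_tree(links, "DG1")
route = path_to_root(tree, "DG5")
```

Once the tree is built, any path lookup is a walk along parent links, and on an unweighted graph the breadth-first tree gives a route with the fewest hops from the root. Link delays could be accounted for by replacing the search with a weighted shortest-path algorithm.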

Fig. 5 Sequential diagram of the general multi-microgrid restoration procedure

The reviewed articles focus on the effect of a single microgrid failure on neighboring microgrids, but transient stability of adjacent microgrids in case of multi-microgrid failures is an important issue to investigate. The effect of scattered failures on the main grid and service restoration strategies should be explored in depth, given that serious voltage and frequency fluctuations and major energy shortage are expected. The following section summarizes all these research gaps and opportunities highlighted throughout the paper, which are important for power system risk analysis and infrastructure management.

8 Summary of recommendations

Throughout this review, potential research opportunities were highlighted to the readers for future work, as summarized in this section.

8.1 MAS architecture and design

According to the literature, MAS architecture and design affect the convergence to an optimal solution and system equilibrium, so the microgrid topology should be wisely chosen according to its type. Since there is no clear study on the optimal size of a robust microgrid, researchers are encouraged to investigate the effect of connecting more loads on its operational efficiency. It is also important to study the long-term behavior and impact of microgrids on the main grid, given that fluctuations and cascading failures are very common in these systems. Concerning the logical control layer, it has been shown that MAS communication delays become an overhead as the number of agents increases, so MAS modeling should be thoroughly investigated as well to produce a set of recommendations for building control units.

The proposed optimization frameworks are built upon the assumption of a homogeneous MAS, which simplifies the mathematical formulation of the problem and reduces computational overhead. An interesting research aspect in MAS is the design of optimization techniques for heterogeneous agents and the careful investigation of the resources and infrastructure needed to implement and deploy these solutions. Even though heterogeneous systems are not mature enough because of their complexity, they are an interesting field of research thanks to their advantages, such as system stability.

Load shifting and shedding in a fully distributed MAS is shown to create a new peak demand when there is no coordination between agents. A cloud-based hybrid MAS is thus a good candidate for an efficient DMS in microgrids, where a central entity supervises shifting requests over time. Research on the feasibility of cloud-based schemes for smart grid management and on the impact of load shifting and shedding on grid stability has to be conducted as well.
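The rebound peak can be reproduced in a toy example. The hourly base load, prices and shiftable requests below are hypothetical, and the supervisor uses a simple greedy rule (place each request in the currently least-loaded hour) that stands in for whatever scheduling a real cloud-based supervisor would apply.

```python
def uncoordinated_shift(base_load, requests, prices):
    """Every agent independently moves its shiftable load to the cheapest hour."""
    profile = list(base_load)
    cheapest = min(range(len(prices)), key=lambda h: prices[h])
    for demand in requests:
        profile[cheapest] += demand
    return profile


def supervised_shift(base_load, requests):
    """A central supervisor puts each request in the currently least-loaded hour."""
    profile = list(base_load)
    for demand in sorted(requests, reverse=True):
        hour = min(range(len(profile)), key=lambda h: profile[h])
        profile[hour] += demand
    return profile


base_load = [30, 20, 25, 45]          # kW per hour, original peak is 45
prices = [0.10, 0.08, 0.12, 0.20]     # hypothetical tariff per kWh
requests = [10, 10, 10, 10]           # shiftable demand of four agents, in kW

free_for_all = uncoordinated_shift(base_load, requests, prices)
supervised = supervised_shift(base_load, requests)
```

Without coordination, all four agents pick the same cheap hour and push it above the original peak, whereas the supervised schedule spreads the requests and leaves the peak unchanged in this example.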

8.2 Optimization techniques

Optimization techniques, such as RL algorithms, MP techniques, heuristic and metaheuristic optimization algorithms, non-linear control and data-driven modeling techniques, have been well studied and applied to solve many issues in microgrid management, as discussed in this paper. PSO and its hybrid variants are the most popular algorithms used in MAS optimization thanks to PSO's efficient memory usage and fast convergence. Even though data-driven modeling optimization techniques perform well, the required computational resources, memory space and learning time are expected to be very high. A detailed study of the computational requirements to support on-line gradient learning and keep latency under acceptable thresholds is essential to guide the choice of optimization algorithms. There should also be a study of the impact of the data fed to agents on the performance of the system.
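For readers unfamiliar with PSO, a minimal global-best variant is sketched below. It is a textbook form, not any of the hybrid variants reviewed here; the inertia weight, acceleration coefficients, swarm size and the quadratic test objective are assumptions chosen for illustration. Each particle stores only its position, velocity and personal best, which is what keeps the memory footprint small.

```python
import random


def pso(objective, dim, bounds, particles=20, iterations=100,
        inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise objective over a box with a basic global-best particle swarm."""
    rng = random.Random(seed)
    low, high = bounds
    pos = [[rng.uniform(low, high) for _ in range(dim)] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    best_pos = [p[:] for p in pos]                  # personal best positions
    best_val = [objective(p) for p in pos]
    g = min(range(particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]      # global best so far
    history = [g_val]
    for _ in range(iterations):
        for i in range(particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Blend momentum, pull to personal best, pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # Move and clamp the position to the search box.
                pos[i][d] = min(high, max(low, pos[i][d] + vel[i][d]))
            value = objective(pos[i])
            if value < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], value
                if value < g_val:
                    g_pos, g_val = pos[i][:], value
        history.append(g_val)
    return g_pos, g_val, history


def toy_cost(p):
    """Hypothetical two-variable quadratic cost with its minimum at (1, -2)."""
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


best, cost, history = pso(toy_cost, dim=2, bounds=(-5.0, 5.0))
```

The `history` list records the global best after each iteration, which makes it easy to compare convergence speed across parameter settings or against another optimizer on the same objective.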

In addition, most research works assume a time-constant microgrid, which is an idealized situation given that microgrids are intended to be of type PnP. It would be interesting to investigate the behavior of the proposed optimization frameworks and algorithms under time-variant topologies and to derive guidelines for adapting existing algorithms. Also, most researchers set one main system objective and then formulate their mathematical solutions under multiple constraints. However, one should explore the feasibility of combining multiple optimization algorithms with different objectives into the same MAS and the effect on convergence and system stability. The study of communication delays, convergence time and collisions among agents is an interesting research opportunity.

8.3 Information diffusion and consensus

The average-consensus algorithm, which is based on leader-follower theory, is the most popular algorithm for reaching agent equilibrium. Progress in this area is noteworthy, and variants of this algorithm were proposed to deal with specific issues, such as noise and collisions. However, a couple of aspects have not been investigated properly. First, agent resources are assumed to be unlimited, with no possibility of collisions. Some research works discussed here address this issue, but this is not enough, and test scenarios assume simple test constraints. A potential research opportunity is to study the effect of collisions on MAS performance and to design control mechanisms that organize agents' interactions with the physical layer. Researchers should incorporate agents' resource constraints in their hybrid consensus algorithms and compare their performance with their previous findings. Erroneous measurements and leader failures should be considered in the study and design phase, since their impact is expected to be fatal to both the MAS and the microgrid.

8.4 Multi-microgrids service restoration

As shown in Fig. 1, microgrids of different types are connected to form a “smart city”, and this interconnection imposes additional control, as fluctuations and failures can affect the whole grid. In-depth research on emergency energy dispatch and transient stability in a multi-microgrid environment is necessary for the future smart city vision. Adapting existing hierarchical MAS and secondary droop control to multi-microgrids management could be a first step in this study.

9 Conclusion

Conventional energy resources are one of the factors behind global warming because of CO2 emissions, so the urge to develop and deploy other environment-friendly energies increases over time. Power generation is not the only problem in the present power system, since power utilization and delivery are issues of concern as well. Traditional power systems have obsolete infrastructures and cannot cope with growing populations. Hence, smart grids, broken down into microgrids, are a solution that combines the power grid with a communication network for data exchange and feedback. With the time-variant microgrid topology, MAS is the best control strategy to handle all optimization issues in power grids.

In the present review, a selection of papers about advanced optimization algorithms and techniques is discussed, and progress in MAS for microgrid control is summarized. Multiple research gaps and opportunities are highlighted for future work. Simulation results reported by researchers show the impact of MAS design and connectivity on the convergence time. It is also observed that PSO and its hybrid variants outperform other optimization algorithms, such as GA, in terms of convergence time and memory usage. Concerning information diffusion and consensus, the average-consensus algorithm is popular among researchers, and other variants are proposed to address consensus under specific conditions, such as heterogeneous MAS, communication delays and noise. Finally, the general service restoration procedure in a multi-microgrid environment was summarized, and the main process to feed high-priority loads was briefly discussed.