
15.1 Introduction

The scope of individual rights has been constantly evolving. It has long been established, in most cultures throughout human history, that the full protection of life and property falls within the scope of individual rights. While the early boundaries of the “right to property” incorporated only the tangible dimension, the intangible portion has been expanding rapidly since the industrial revolution [1]. One important component of the intangible part is the right to privacy, coined by Warren and Brandeis in 1890 [1].

A strong positive correlation between technological development and privacy concerns is almost universally acknowledged [2]. In Warren and Brandeis’ terminology, “the right to be let alone” has expanded beyond mere physical property to include other personally associable phenomena (such as audio, photographs, video, data, and, more recently, biometric identification and genetic data). Computerization, automation, transmission, and storage of data, enabled by recent advances in telecommunications, Internet technologies, and mobile and cloud computing services, have increased the importance and relevance of the term “privacy”.

In spite of its wide usage, the term privacy does not have a universally agreed-upon definition [3]. It is quite remarkable that such an important concept has evaded a formal definition. The concept of privacy has a long history of discussion, from Greek philosophers including Aristotle (public sphere of political activity versus private sphere [3]) and Socrates, to Biblical and Quranic passages [5]. Alan Westin, in his landmark work [2], defines privacy in terms of self-determination as follows:

Privacy, now, is the claim of individuals, groups, or institutions to determine for themselves when, how, and to what extent information about them is communicated to others.

Another important document about the principles of privacy protection was developed in 1981 by the Organization for Economic Co-operation and Development (OECD) [6] and was later updated in 2013 [7]. Yet, even these guidelines are not observed by many countries. For example, while the European Union seems to be following them, the United States does not.

Widespread adoption of privacy protection mechanisms depends on political will, which seems to prioritize other concerns such as public safety, especially since September 11, 2001. However, public awareness of, and demand for, stronger adoption and enforcement of privacy regulations have been increasing unabated. Recent developments such as WikiLeaks, the US NSA leaks by Edward Snowden, Facebook’s disclosure of its Emotion Experiment, and the EU’s ruling on the “right to be forgotten” have kept privacy discussions current and fresh in the public sphere, thereby increasing the demand for more action.

In line with technological developments, the ever-changing field of ubiquitous applications, and the high penetration of mobile and other electronic devices, the potential for privacy violation has been increasing in scope. While there is a perceived clash between technology and privacy protection, there are also many efforts to put technology to use in its defense. One pioneering work that has attracted considerable attention, interest, and follow-up studies is Chaum’s 1985 paper [8] on providing privacy to individuals and organizations bi-directionally in a secure fashion. He argues for embedding privacy-providing mechanisms in the design and development of technology by means of cryptography. Chaum’s ideas were further developed and formalized under the term privacy-enhancing technologies (PET) in 1995 [9] and then in 2003 [10]. PET is defined in [10] as follows:

PET stands for a coherent system of ICT measures that protects privacy by eliminating or reducing personal data or by preventing unnecessary and/or undesired processing of personal data, all without losing the functionality of the information system.

Our work in this chapter is in line with the notion of PET, which we use to provide an understanding and awareness of privacy issues, challenges, and threats in the Smart Grid (SG), the next generation of the traditional Power Grid enhanced with state-of-the-art computing and communications technologies. As is the case with many engineering and technical decisions, the touted benefits of the SG initiative come with many risks and trade-offs. The deployment and adoption of Smart Grid technologies have opened up several security issues at the levels of the consumer, the communication, and the energy provider. Security aspects such as confidentiality, authentication, authorization, integrity, and non-repudiation have been extensively investigated, and various innovative solutions have been proposed in the literature. There are many publications on SG security, including survey-style articles and books, such as [11–36]. While some of these address privacy, explicitly or implicitly, there is a need for up-to-date coverage of SG privacy techniques. In contrast to most previous works, which focus on SG security, our main motivation in this chapter is to review, classify, discuss, and analyze recent SG privacy solutions that have been proposed in the literature. In addition, we provide a comprehensive treatment of the approaches, mechanisms, and cryptographic tools used in the SG to support the use and design of privacy-enforcing techniques.

15.1.1 Contributions

In this chapter, we provide a novel taxonomy of privacy provisioning and protection techniques in the SG. The comprehensive survey, explanations, and discussions of the various privacy schemes are expected to serve as a good reference for those interested in working on privacy issues in the SG environment. The rest of the chapter is organized as follows. Section 15.2 presents a brief SG overview. Section 15.3 discusses the privacy-related problems within the SG environment and explains why privacy is crucial in the overall success of the SG paradigm. Section 15.4 presents a novel taxonomy of recently proposed privacy-preserving solutions for the SG. Section 15.5 explores outstanding challenges that must be addressed in the future and opportunities for new research directions. Section 15.6 concludes the chapter.

15.2 Background on Smart Grid

In this section, we present the main features of the traditional Power Grid followed by the SG vision.

15.2.1 Traditional Power Grid

The current traditional electric Power Grid is considered to be the largest man-made machine in the world. Its infrastructure and operations have not changed significantly over the past century. Its architecture mainly consists of four sections, as shown in Fig. 15.1: generation, transmission, distribution, and consumption. A high-level structure of its topology and components is displayed in Fig. 15.2. The generation of energy is highly centralized and is carried out in bulk mode by sources such as nuclear systems, hydroelectric systems, wind farms, and others. The high-voltage electricity is relayed in the transmission subsystem over long distances. When handed off to the distribution subsystem, the energy is converted into medium voltage. Through the distribution subsystem substations, the voltage is reduced to lower values and then distributed to a variety of end-users, from commercial, industrial, and business to residential areas. The energy production and distribution schema is supervised by centralized control systems, known as Supervisory Control and Data Acquisition (SCADA) systems, in charge of mapping and visualizing any operational activity in the field as well as controlling the storage and demand of power. In fact, SCADA systems can remotely and locally control power transmission and distribution based on the current demand and peak loads, thereby minimizing unnecessary power generation.

Fig. 15.1 Architecture of the traditional electric power grid

Fig. 15.2 A high-level structure of the current power grid

15.2.2 The Smart Grid Vision

SG is a term generally used to refer to an enhancement of the traditional Power Grid, especially in terms of computing and communications technologies. SG can be defined as follows [37, 38]:

The SG can be regarded as an electric system that uses information, two-way, cyber-secure communication technologies, and computational intelligence in an integrated fashion across electricity generation, transmission, substations, distribution and consumption to achieve a system that is clean, safe, secure, reliable, resilient, efficient, and sustainable.

“System of Systems” is a term generally used to qualify the SG in the literature to emphasize its heterogeneity.

Economic development and its sustainability are closely coupled with the effective, efficient, and robust use of energy. The energy sector, and especially the grid infrastructure, has traditionally focused on reliable provisioning. Until recently, communications and the flow of information were considered to be of only peripheral significance. Faced with an aging and ineffective energy distribution system, many countries have recently instituted unprecedented initiatives to improve the Power Grid with the SG. The key facilitators of the SG are two-way energy and information flows between the suppliers and consumers. The conventional energy supply chain is being expanded to include alternative sources of energy, such as solar, wind, tidal, biomass, and so on, from a variety of distributed small and large energy producers. The consumers are becoming more active participants by means of such devices as smart meters, smart thermostats, and smart appliances. The grand vision of an autonomic, self-healing SG with a dynamic, price-driven demand response model still faces many challenges, not least from the perspective of the networking infrastructure and distributed computing. Demand Response (DR) is defined by the US Department of Energy as follows [39]:

Changes in electric usage by end-use customers from their normal consumption patterns in response to changes in the price of electricity over time, or to incentive payments designed to induce lower electricity use at times of high wholesale market prices or when system reliability is jeopardized.

The sheer size of the contemplated SG of the future is to rival the Internet in the number of participants. Smarter generation, transmission, distribution, and consumption of electricity are essential to achieve a reliable, clean, safe, resilient, secure, efficient, and sustainable power system [37].

Some of the noteworthy standardization efforts, high-level conceptual reference models, and roadmaps for the SG are given by the NIST Framework and Roadmap for SG Interoperability Standards [40], IEC SG Standardization Roadmap [41], CEN/CENELEC/ETSI Joint Working Group on Standards for SGs [42], and IEEE P2030 [43]. A conceptual view of NIST’s SG reference model is depicted in Fig. 15.3 with seven domains: customers, markets, service providers, operations, generation, transmission, and distribution. Compared with Fig. 15.2, generation is no longer only in bulk; it also includes distributed and renewable energy sources. It is also worth noting in Fig. 15.3 the bi-directional electricity and information flows and the integration of the renewables. Another important conceptualization is the addition of third-party services to enhance the energy consumption experience of the end-users by means of open markets. The financial gears are also in place: global investment in SG had exceeded $15 billion as of 2013, more than a four-fold increase from 2008 levels [44].

Fig. 15.3 NIST’s 7-domain smart grid conceptual model

The most relevant domain of the NIST Conceptual Model for this chapter is the Distribution Domain (as depicted in Fig. 15.4), because it is the main physical interface between the end-user and the SG and it is the center of almost all of the potential privacy violations. Note that it is also the Distribution Domain that is responsible for achieving the most widely-cited benefits of the SG which include control, measurement, sensing, data collection and storage, and optimization of operations that take place in or for it.

Fig. 15.4 Distribution domain of NIST’s smart grid conceptual model

The anticipated benefits [40] of the SG include:

  • Increased power reliability and quality.

  • Optimized resources to smooth the power demand and avoid using expensive peaker capacity.

  • Improved resilience to disruption by natural disasters and attacks.

  • Automated systems to enable self-healing responses to system disturbances.

  • Incorporation of distributed and/or renewable energy sources.

  • Reduction of greenhouse emissions.

  • Actionable and timely energy usage information to customers.

  • Facilitation of plug-in electric vehicles and new energy storage options.

15.2.3 Smart Meters and AMI

In transitioning from the Power Grid to the SG, Automatic Meter Reading (AMR) has provided a stepping stone functionality. AMR provides automatic collection of data from the energy metering devices and transmission of them to a central location for further processing and analysis.

In the SG, AMR is replaced by Advanced Metering Infrastructure (AMI), which enables bidirectional data transfer between the meter and the grid. The meter that provides such functionality in the SG is usually referred to as a smart meter. Smart meters can read real-time energy consumption information as well as other operationally needed data, such as voltage values, phase angle, and frequency. Smart meters are solid-state programmable devices that can perform many functions, allowing users to carry out intended tasks by inputting a sequence of instructions into their processing unit and memory. Among the tasks that a smart meter can perform are [45]: time-based pricing, collecting consumption data for consumer and utility, net metering, loss of power (and restoration) notification, better access and data to manage energy, decision and selection of rate options, remote turn on/turn off operations, load limiting for non-payment or demand response purposes, energy prepayment, power quality monitoring, meter tampering and energy theft detection, reduction of the costs of incorrect billing estimates, reduction of the service and operational effort of traditional meter reading, and communications with other intelligent devices or appliances in the home. Although a particular meter may not support all of these tasks, and may support others, the overall idea is that smart meters make it possible to add some kind of intelligence to the network and to the individual features of each residential consumer.

There are several technologies and applications that have been integrated to perform as one in an AMI system [45], including smart meters, wide-area communications infrastructure, Home (local) Area Networks (HANs), Meter Data Management Systems (MDMS), and operational gateways working as main collectors. Figure 15.5 shows a model of an AMI system as envisioned by NIST, expressed in computer networking terminology by means of interconnected nodes and clouds to emphasize the bidirectional nature of the communication enabled by AMI.

Fig. 15.5 Smart grid advanced metering infrastructure reference architecture

Another abstraction of the AMI network is presented in Figs. 15.6 and 15.7 that show the concepts of HAN, Building Area Network (BAN), Industrial Area Network (IAN), Neighborhood Area Network (NAN), and Field Area Network (FAN).

Fig. 15.6 Smart grid advanced metering infrastructure

Fig. 15.7 Details of HAN, BAN, and IAN

There is some notion of hierarchy in AMI when data are collected, processed, and analyzed to optimize energy use and bring about the benefits of the SG. Such a hierarchy of the communications architecture is depicted in Fig. 15.8. Smart meters fan out from feeders, which may also serve as natural data aggregation points. Feeders are controlled by the distribution substations, which are in turn connected to the transmission substations. NIST domains interact with this hierarchy to provide a new level of experience and service as part of the SG.

Fig. 15.8 SG communications architecture

15.2.4 Microgrids

One of the many new mechanisms of the SG for power delivery is microgrids [46–48]. As low-voltage distribution networks, microgrids are autonomous energy management systems under the control of a single administrative authority that are capable of operating in parallel to, or in intentional or accidental islanded mode from, the existing Power Grid. They usually include distributed and renewable energy sources as well as some level of energy storage subsystems. A representation of a microgrid model is shown in Fig. 15.9.

Fig. 15.9 A microgrid model

15.3 Smart Grid Privacy Issues

Demand Side Management (DSM) is one of the most important components of the grid of the future [49]. The overarching goal of DSM is to improve efficiency and effectiveness through energy consumption scheduling. DSM tries to shift and/or reduce the load to achieve its objective by reducing the Peak-to-Average Ratio (PAR), cost, and so on. In [50], energy-cost and PAR minimization are performed with the help of an energy consumption scheduler and a Linear Programming (LP) formulation. Joint energy payment and waiting time minimization are studied in [51]. A game-theoretic approach is proposed to maximize the utility function in [52]. In [53], a consumption scheduling algorithm based on Integer Linear Programming (ILP) and game theory is applied to minimize load. In contrast to the current grid, one of the key features of the future grid is the ability to adjust loads dynamically, turning them on or off as needed. This is called load shedding. In [54], an optimization framework is proposed to find the minimum amount of load to shed while satisfying load-balancing and shedding constraints. Dynamic load-shedding schemes have been studied in the presence of large disturbances, accounting for system dynamics [55, 56]. Du and Nelson [57] present a two-step algorithm for optimal load shedding in an intentional island.
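To make the PAR objective concrete, the following minimal sketch computes the PAR of a load profile and compares a naive placement of shiftable loads with a simple greedy placement. It is an illustration only: the greedy heuristic stands in for, and is much simpler than, the LP and ILP formulations of [50, 53], and the slot loads, job sizes, and function names are hypothetical.

```python
from typing import List


def peak_to_average_ratio(load: List[float]) -> float:
    """PAR = peak slot load divided by the mean slot load."""
    return max(load) / (sum(load) / len(load))


def schedule_shiftable(base: List[float], jobs: List[float]) -> List[float]:
    """Greedy heuristic (an assumption, not the method of any cited work):
    place each shiftable job, largest first, into the least-loaded slot."""
    load = list(base)
    for energy in sorted(jobs, reverse=True):
        slot = min(range(len(load)), key=lambda t: load[t])
        load[slot] += energy
    return load


# Hypothetical data: non-shiftable load per time slot and shiftable jobs (kWh).
base = [1.0, 1.0, 4.0, 2.0]
jobs = [2.0, 1.0]

# Naive case: every shiftable job runs during the existing peak slot.
naive = list(base)
naive[2] += sum(jobs)

shifted = schedule_shiftable(base, jobs)

print("naive PAR:  ", round(peak_to_average_ratio(naive), 3))
print("shifted PAR:", round(peak_to_average_ratio(shifted), 3))
```

Both profiles deliver the same total energy; only the timing differs, which is exactly the degree of freedom that DSM exploits. Note that such scheduling presupposes fine-grained knowledge of household loads, which is the source of the privacy concerns discussed in this section.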

Given the information collected by smart meters in the SG environment, privacy issues become a vital concern for the success of SG initiatives. In the SG AMI, privacy goes beyond anonymity to include undetectability of the operational status of individual residential appliances. It has been well known for quite a while that sophisticated usage patterns can be determined from smart meter data by using rather simple statistical methods [58, 59]. Prevention of this kind of violation is the main aspect of privacy that we address in this chapter.

The privacy-related issue here is that for proper functioning of the AMI system, very detailed and often precise information about users’ electricity usage is needed. Hence, while this smart system could offer many great benefits, it takes away a significant level of privacy a user may like to have. In the rest of this section, we first elaborate on the general notion of privacy and then delve into some details as to why we need to address the privacy concerns explicitly and convincingly.

15.3.1 Basic Privacy Concepts

Privacy may be defined as the claim of individuals, groups, or institutions to determine when, how, and to what extent information about themselves is communicated to others [2]. The notion of privacy may vary from person to person, and from culture to culture. It could also be defined as the right to informational self-determination, that is, individuals must be able to determine for themselves when, how, to what extent, and for what purpose information about them is communicated to others [60]. This term is often related to an entity’s (individual, group, or institution) identity or anonymity. As human beings, each of us likes to keep certain information about ourselves confidential while we like to express some information to draw a distinct line with others or to make a presence in the society that we live in. Similarly, a group or institution may have some information for disclosure to the public while sensitive information must be protected from being disclosed to unwanted parties. The unwanted parties may include individuals who are not members of the group or institution, other groups or institutions, a person with short-term membership, or a deliberate intruder (attacker) attempting to retrieve information illegitimately.

The definition and boundaries of privacy tend to vary among different societies and cultures and as such, there is no clear list of categories of privacy that can be applicable for all. However, four major types of privacy are generally recognized:

  • Personal privacy. This includes mainly body privacy and territorial privacy. Body privacy varies among individuals in terms of the types of clothing one wears to protect the body. Territorial privacy means setting a boundary or creating a barrier between the person and others. This can be implemented by erecting walls, fences, or screens, by using cathedral glass/partitions, or by maintaining a distance, among other things.

  • Information privacy. This kind of privacy is mainly related to passing of information over various media and could also be called communications privacy. Some of the notable information privacies are:

    • Internet privacy. The ability to determine the kind of information one reveals or withholds about oneself over the Internet, who has access to such information, and for what purposes one’s information may or may not be used.

    • Financial information privacy: information about one’s own bank account, amount of money, transaction details, debt, and so on.

    • Medical privacy: information about a person’s health conditions.

    • Political privacy: political stance such as who a person may have voted for.

    Information privacy also concerns how someone expresses matters about him- or herself in any field. People are sometimes willing to give up information about themselves not because they are ignorant or because they are being tricked by evil corporations, but because it can sometimes be in their best interests to do so [61, 62]. Such information can be posted on the Internet or shared via social networks or other channels the person is involved with. In such a case, a person may see a benefit in exposing information that he or she would like others to know, but not directly from him- or herself, thereby avoiding accountability or responsibility for the apparent “leak” of information.

  • Organization privacy. This includes the confidential information about an organization such as business strategies, profit and loss statistics, current market trends, future products, potential customers, transaction details, and similar information. An organization may put some information in the public arena for transparency (which shows the ethical standard of the organization and is accessible by anybody) and declare certain information as classified, which is a categorization applied to information that a government or a group claims as sensitive. Prominent examples of organizational privacy are often associated with trade secrets and national security.

  • Spiritual and intellectual privacy. This kind of privacy covers a person’s spiritual nature, feelings, and intellect. A person may have certain religious beliefs but may not like to express these to others, perhaps because of an adverse or hostile environment. Also, a highly intelligent person may downplay or choose not to show his or her intelligence in all gatherings. For example, a person working in a research group may refrain from showing all his or her talents to others so that others may not take his or her ideas away without giving proper credit, or it may be that the person is selfish or prefers not to get involved in intellectual contribution in the group for some personal reasons.

As the meanings of privacy differ across scenarios, there are other ways of looking at it. Six types of privacy related to a person’s personality are described in [63, 64]: (1) solitude, (2) isolation, (3) anonymity, (4) reserve, (5) intimacy with friends, and (6) intimacy with family. Solitude is the most complete state of privacy that individuals can achieve. It is a type of privacy in which the individual is alone and unobserved. Pedersen [63] differentiates between isolation, described as being alone and away from others, and solitude, defined as being alone by oneself and free from observation by others. Anonymity is a type of privacy that occurs when it is possible to move around in public or, for example, browse the Internet without being recognized or being the subject of attention. Reserved behavior includes examples of low self-disclosure. Finally, any kind of intimacy is a type of privacy that relates to an individual’s or group’s desire to promote close personal relationships. All of these personal traits of human beings need to be studied and thoroughly understood when making any policy related to privacy in any sector, because the same human beings are the beneficiaries or users of these systems.

15.3.2 The Need for Privacy in the Smart Grid

In an SG network, key questions regarding setting the policies on user data privacy are [65]:

  • Who owns the data of the customer?

  • How is the access to and use of customer data regulated?

  • Who guarantees privacy and security of customer data (e.g., against risk of surveillance or criminal activity)?

  • Will sale or transfer of customer data be allowed, and under what terms and to whose benefit?

  • In jurisdictions with retail choice, are measures needed to ensure competing electricity providers have access to customer data on the same terms as the incumbent utility?

In fact, rival electricity providers may compete to dominate the market, and their access to users’ electricity usage patterns and behavioral information could be crucial. The electricity providers or provider agents may use the user data to determine their business strategies and special packages or offers. In an open market environment, such data could be partially collected after the offers are made public and some information is available to all, but if privacy is breached beforehand and specific user data is available to some parties, then these electricity providers may gain an unfair advantage. Appropriate privacy policies may restrict, mitigate, or resolve such use of unfair means in setting business strategies. All these issues explain why the privacy of SG users’ data is a critical issue both for users and for the electricity providers.

The privacy of SG users is a very important issue. The strong integration of Information and Communication Technologies (ICTs) for the SG’s operation introduces different types of privacy concerns. Depending on how the consumer (or user) uses electricity and recharges the electricity balance, the privacy of the user can be affected in two usage scenarios, namely:

  • The user recharges the electricity balance via personal interaction (private mode). For instance, the user goes in person to the electricity provider’s agent and recharges his or her “smart electricity card”, similar to a credit or debit card, which can be reloaded and placed into the electricity meter. The personal interaction may also happen via the phone or in person by going to the agent and getting a new recharge or reload number, similar to that used in many places for pre-paid mobile phone balance or validity extension. The customer can also obtain a recharge number from a pre-paid card. This method does not reveal the identity of the person who has purchased the card, which is later used in the electrical meter to do the reloading task. It is worth pointing out that the authorization number will need to be validated and authenticated before electricity consumption. When this number is entered from any home or building (connected to the SG), it passes through an authentication process during which information could be stored by the utility company or one of its designated agents. This information needs privacy protection measures in place.

  • The user recharges the electricity balance via the Internet (public mode). If a website or online system is used and the balance is adjusted via payment through a bank account or other payment method, then all the cybersecurity-related privacy issues must be considered. When a web interface is used and there is a back-end database, web attacks (such as Structured Query Language (SQL) injection [66]) could affect the privacy of the user by disclosing not-to-be-exposed data from the back-end database. The web-based (i.e., online) form to recharge the user’s electricity balance could be made as simple as requiring a single identification number from the user. The privacy issue in this process is whether the user wants to be known at the time of recharging a balance for future electricity usage. In fact, the user’s information can be used by different departments or branches of the electricity provider. The user may choose who can access the information and who cannot. An instance of personal preference can be the option of receiving company-related news, updates, or offers of newly introduced packages or benefits from the electricity supplier at the user’s email address. For managing the user’s own preferences, agent technology [67] could be used, in which each subscriber or user is assigned an agent representing the user’s interests. Each service can also be assigned an agent to reap the most benefit. A service agent could negotiate with subscriber agents about information and authorizations versus the quality of the offered service.
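The SQL injection risk mentioned for the public mode can be illustrated with a minimal sketch. It uses Python’s standard sqlite3 module with an in-memory database; the table layout, meter identifiers, and function names are hypothetical and do not describe any real utility system. The first lookup builds the query by string concatenation and leaks every customer record to a crafted input, while the second uses a parameterized query and returns nothing for the same input.

```python
import sqlite3

# Hypothetical back-end table for a recharge web form (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (meter_id TEXT, owner TEXT, balance REAL)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?, ?)",
    [("M-001", "alice", 12.5), ("M-002", "bob", 40.0)],
)


def lookup_unsafe(meter_id: str):
    """Vulnerable: user input is concatenated into the SQL text."""
    query = (
        "SELECT owner, balance FROM accounts WHERE meter_id = '"
        + meter_id + "'"
    )
    return conn.execute(query).fetchall()


def lookup_safe(meter_id: str):
    """Parameterized query: the input is bound as data, never parsed as SQL."""
    return conn.execute(
        "SELECT owner, balance FROM accounts WHERE meter_id = ?",
        (meter_id,),
    ).fetchall()


malicious = "x' OR '1'='1"          # classic injection payload
leaked = lookup_unsafe(malicious)   # condition is always true: all rows returned
blocked = lookup_safe(malicious)    # no meter has this literal id: no rows

print("unsafe lookup returned", len(leaked), "rows")
print("safe lookup returned", len(blocked), "rows")
```

Parameter binding addresses only this one attack class; the broader question of which records the recharge service should hold about an identifiable user remains a policy matter.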

The level of personal information involved and used will dramatically increase with the modernization of the grid. Smart meters and smart appliances could lead to a data explosion of intimate details of daily life. However, at this point, it is quite unclear who, besides the customer’s utility provider and control utilities, will gain access to this information. With the deployment of the SG, energy measurements can take place at much shorter intervals (rather than only at the end of the billing cycle, as in conventional methods).

Currently, there are several types of concerns related to the privacy and security of data associated with the SG. In this chapter, we focus on the issue of privacy linked with consumer information. Potential privacy concerns of SG consumers include: how the required information is going to be collected, used, and disclosed; how customer information is expected to be safeguarded and how it may be used for or against the consumers; how permissions will be granted for the collected data to be shared with multiple agencies; and the liabilities related to any breaches of consumer information. It is also worthwhile exploring what the SG will know about individuals. For example, the energy fluctuation pattern of home appliances is so unique that it may be possible to infer, for example, the model of a user’s refrigerator. It is also worth noting that data that is harmless when collected in isolation may often become a privacy threat when combined with other types of data or examined by a third party for a pattern.

Even when the data about electricity consumption is not collected at regular intervals, information can still be collected at a slower rate through the persistent monitoring of energy consumption. As a result, private information such as how many people live in a household, their presence and absence at home, their schedules for taking showers and watching TV, their frequency of microwave use, and their sleeping patterns can be collected or deduced. For many individuals, the collection of this type of information represents an invasion of the “sanctity of the home”, and one may argue that such intimate details of someone’s daily life should not be accessible. The users’ data could disclose their usage patterns of electric devices, very intimate details of household equipment, and even their possible locations (e.g., if the SG concept is combined with the smart home concept, in which the lights and electric equipment are automatically turned on or off when a person enters or leaves a room). In such a case, even the movement pattern of the user within his or her own home could be deduced!

The privacy concerns discussed here are further confirmed by a study called Privacy Impact Assessment (PIA) [12] conducted in September 2009 by the Privacy Sub-Group of the Cyber Security Working Group. The report has identified the following issues and concerns related to consumer-to-utility information exchanges in the US SG:

  • There is no clear understanding of the privacy issues on the SG.

  • There is a lack of standards, privacy policies, or procedures among the entities involved in the SG and in the collection of information.

  • Definitions of personally identifiable information are inconsistent in the utility industry.

  • Smart meters and distributed energy systems may reveal information about residential consumers and activities within the house.

  • Roaming SG devices (e.g., electrical vehicle recharging at other charging stations such as a friend’s house) may generate more personal information.

  • Even though the National Association of Regulatory Utility Commissioners adopted the 2000 resolution (Footnote 5) urging the adoption of privacy principles, only a few state-level utility commissions have begun to assess privacy issues associated with the SG. This is the case with the state of California through its eight Fair Information Practice (FIP) principles (Footnote 6), such as transparency, right to access information collected (individual participation), individual access to see and copy information stored on an individual, limited types of information that may be collected on an individual (collection limitation), limited internal use of information about an individual, data quality and integrity, data security, accountability, and auditing.

15.3.3 Load-Monitoring Techniques

As we mentioned previously, the possibility of learning information about individuals’ behaviors, personal habits, and lifestyle raises concerns. This becomes an important issue when this information can be used for purposes other than delivering electricity. Electric utilities and other providers may have access to information about the in-house activities of customers, the times when they are using various devices and appliances, as well as the type of devices being used. Electricity usage information, initially collected to generate an electricity profile, has now become a source of behavioral information with immense potential. The most serious threats to the privacy of SG consumers include: cyber-attack and intrusion, identity theft, tracking and observing the behavioral patterns of the consumers and the appliances being used, and real-time spying and surveillance. In intrusive load monitoring (ILM), there is an individual monitor for each appliance to acquire the energy consumption of each household electric device. An alternative technique for deducing the appliance usage characteristics is called non-intrusive load monitoring (NILM), or non-intrusive appliance load monitoring (NIALM), where a single monitor is enough to determine per-appliance energy usage from the aggregate data. NILM was first reported in 1992 [68]. Since then, various other techniques have been developed for NILM that separate individual appliance power consumption levels from single, aggregated measurements. Recent surveys about NILM can be found in [59, 69]. An illustration of the concept is presented in [70], where a behavior extraction algorithm implemented in Matlab is used. DSM and demand response systems provide sufficient power usage information to reveal in-home activities, which may compromise the privacy of the households.
It is worth noting that NILM can be easily implemented using off-the-shelf hardware and software without much technical expertise.
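To make the NILM idea concrete, the following minimal Python sketch detects appliance switching events from step changes in an aggregate power trace. It is an illustration only, not the method of [68] or [70]: the appliance names and ratings, the thresholds, and the sample readings are all hypothetical, and practical NILM systems use far richer features.

```python
# Minimal event-based NILM sketch: match step changes in the aggregate
# power trace against known appliance ratings.
# All ratings, thresholds, and readings below are hypothetical.
APPLIANCES = {"kettle": 2000, "fridge": 150, "tv": 100}  # watts (assumed)


def detect_events(aggregate, threshold=50, tolerance=30):
    """Return (sample index, appliance, 'on'/'off') for each matching step."""
    events = []
    for i in range(1, len(aggregate)):
        delta = aggregate[i] - aggregate[i - 1]
        if abs(delta) < threshold:
            continue  # ignore small fluctuations
        # Pick the appliance whose rating is closest to the step size.
        name, rating = min(
            APPLIANCES.items(), key=lambda kv: abs(kv[1] - abs(delta))
        )
        if abs(rating - abs(delta)) <= tolerance:
            events.append((i, name, "on" if delta > 0 else "off"))
    return events


# Aggregate readings from a single meter, one value per sampling interval.
readings = [150, 150, 2150, 2150, 150, 0, 100, 100, 0]
events = detect_events(readings)
for event in events:
    print(event)
```

Even this naive matching recovers when the kettle, fridge, and TV were switched, which is why a single aggregate measurement can be privacy sensitive.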

As a result, privacy concerns, coupled with a degree of security related issues, may lead to any of the following unintended consequences [31, 71, 72], or some other vulnerabilities not currently identified:

  • Hackers could manipulate power consumption and billing.

  • Cyber-terrorists might fake power consumption data on a large scale to attack the power system.

  • Attackers may take control of the smart meters for manipulation at will.

  • Direct marketers, criminals, and law enforcement agencies may use the energy consumption data without prior approval or notification.

  • Energy consumption patterns of individual appliances can be identified with high accuracy.

Thus, privacy is the Achilles’ heel for the success of the SG and needs to be carefully investigated and addressed.

15.4 Privacy Solutions

In this section, we present a novel taxonomy of the privacy techniques proposed for the SG domain, and we provide a synopsis of each category with references, and compare and contrast them.

15.4.1 Taxonomy of Privacy Techniques

A comprehensive and novel taxonomy of the SG privacy-protection mechanisms and approaches is given in Fig. 15.10. We divide the SG approaches into two broad categories: spatial and temporal. The former includes those that build privacy into the system by means of a physical device or entity, while the latter incorporates privacy into the system by means of logical extensions. We note that the individual categories identified in Fig. 15.10 are not necessarily mutually exclusive. In fact, a privacy preservation proposal reported in the literature may, and usually does, implement a combination of them. The categorization of Fig. 15.10 delineates identifiably distinct techniques in order to provide a smoother and clearer explanation in what follows. A different approach has been taken in [73], where privacy preservation techniques are presented with a combination of methods from parts of Fig. 15.10 on a per-paper basis.

Fig. 15.10 Privacy-preservation techniques used in the SG

Next, we provide a discussion of the spatial and temporal categories along with their subclasses.

15.4.2 Spatial Privacy Techniques

There are five main categories of spatial privacy-protection mechanisms proposed in the literature for the SG, as shown in Fig. 15.11, together with the cited references.

Fig. 15.11 Spatial privacy-provisioning techniques for the smart grid

15.4.2.1 Trusted Third Party

A trusted third party (TTP) in cryptography is an independent entity that acts as a liaison between two or more collaborating parties; in our case, between the end-user and the power utility [74–76]. The TTP has to be completely trusted by all participants with respect to its intentions, technical competence, and so on, so that mutual trust can be achieved. In the literature, the TTP is also referred to as the third-party escrow service [75].

In what follows, we elaborate on the approach in [75] as one example in this category: [75] provides a mechanism for anonymizing high-frequency energy measurement data (such as usage patterns of specific electrical appliances) through the use of a Pseudonymous Identity (PID). The anonymous meter readings are difficult to associate with a particular smart meter or customer, thus offering a higher level of privacy to the SG user.

The distinguishing feature of the Escrow smart meter is that it has two separate IDs, rather than a single ID as is the case with standard smart meters. The two IDs are the high-frequency ID (HFID), which is anonymous, and the low-frequency ID (LFID) [77], which can be related to a specific customer or smart meter. The main idea of the scheme is to provide anonymity of the HFID messages. The anonymity is implemented by not disclosing the HFID to the utility or the smart meter installer. The HFID is ‘hidden’ inside the smart meter, or hard-coded to be used for all HFID-related messages. In order for the utility to verify the legitimacy of the HFID, a third-party Escrow mechanism is implemented. The third party can be the manufacturer of the smart meter itself or some other trusted third party, which has been given access to this information. The manufacturer can assign two unique IDs to each smart meter that is produced, only one of which (the LFID) is visible to the utility during both the procurement and deployment procedures. Essentially, the manufacturer (or the Escrow service) is the only party that is aware (and has a record) of the connection between a valid HFID and LFID pair. The Escrow is required to comply with a strong data privacy policy. For example, the Escrow is not expected to access, process, or store smart metering data; it will only know about the relationship between a valid HFID and LFID.
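The division of knowledge in the Escrow scheme can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the protocol of [75]: the class name `EscrowService`, its methods, and the random hexadecimal ID format are hypothetical.

```python
import secrets


class EscrowService:
    """Hypothetical escrow that alone knows which HFID belongs to which LFID."""

    def __init__(self):
        self._pairs = {}  # HFID -> LFID; never disclosed to the utility

    def register_meter(self):
        """Assign two unique IDs to a newly manufactured meter."""
        hfid = secrets.token_hex(8)  # anonymous high-frequency ID
        lfid = secrets.token_hex(8)  # attributable low-frequency ID
        self._pairs[hfid] = lfid
        return hfid, lfid

    def is_valid_hfid(self, hfid):
        """Answer only whether the HFID is legitimate, not whose it is."""
        return hfid in self._pairs


escrow = EscrowService()
hfid, lfid = escrow.register_meter()

# The utility learns the LFID at deployment and can ask the escrow to
# confirm that an HFID seen in high-frequency messages is genuine.
print(escrow.is_valid_hfid(hfid))       # True
print(escrow.is_valid_hfid("unknown"))  # False
```

The utility can thus reject forged high-frequency readings without ever being able to link an HFID to a customer.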

15.4.2.2 Gateway-Based Approaches

In the gateway-based approach, an external entity outside of the customer premises acts on behalf of the end-users to obfuscate the relationship between the data and the owner [78–84].

The Smart Energy Gateway (SEG) architecture [83] uses a privacy manager, which is designed as a software component running on the SEG deployed at users’ premises. The idea behind the architecture is to provide user-centric privacy, which means that the user is in control of his or her own privacy parameters. The proposed privacy manager has the ability to specify privacy conditions and obligations with respect to the handling of users’ private data, and to rely on SEG security architecture features such as application isolation, mandatory access control, pseudonymity, and secure storage to reliably enforce the users’ specified privacy constraints. The main features of the privacy manager include:

  • Customer privacy preference specification and enforcement. The energy customer expresses how disclosed personal information should be handled, and the utility or service provider expresses how the customer’s information will be treated.

  • Privacy policy enforcement. Each SEG application policy is bound to a smart software agent and has to be validated against the SEG platform integrity policy both during installation and at runtime. This ensures that the SEG only hosts and runs smart software agents that meet pre-defined gateway security requirements (e.g., that an agent will not access locally-stored energy usage data collected at this particular premise).

  • Secure storage and data masking. The secure storage will guarantee the confidentiality and accuracy of locally-stored energy usage data. Only trusted and legitimate applications (e.g., billing provider software agent) can access the metered data repository.

  • Pseudonymity. Enables customers to use SG resources or related services without revealing their identities, while remaining accountable for their transactions.

  • Privacy feedback. Allows the display of feedback to energy customers regarding the handling of their personally identifiable information.

15.4.2.3 Architectural Schemes

Architectural schemes arrange the topology of the smart meters in order to implement privacy protection. Two distinct categories are considered:

  1. Overlay. Randomly organized smart meters form peer-to-peer groups in [85] using the Chord algorithm [86]. A peer anonymization algorithm, together with in-network aggregation, enhances the privacy protection capabilities of the proposed approach.

  2. Ring topology. A few proposed approaches [81, 82, 87–92] impose some form of ring architecture on the SG meters. For example, a virtual ring architecture is proposed in [87] to provide a privacy protection solution using symmetric or asymmetric encryption of customers’ requests belonging to the same group.

15.4.2.4 Storage-Based Mechanisms

As the name implies, a type of energy storage infrastructure is employed for privacy protection in this category [93–97]. For example, the authors in [94, 95] assume that future smart homes will contain several energy storage and energy generation devices, and thus electrical power routing will be feasible. More details of this are given in Sect. 15.4.3.6, under Time series-based privacy.

15.4.2.5 Privacy with Distributed Energy Generation

The main idea behind privacy protection using Distributed Energy Generation (DEG, also known as Distributed Energy Resources or DER) is to rely on the intermittent and stochastic energy values provided by DEG to mask the actual energy consumption in the disclosed values, assuming that the DEG is private to the end-user.

15.4.3 Temporal Privacy Techniques

A second major category of privacy preservation techniques includes those that are applied over time without relying on an external tangible entity. We describe some of the techniques in this category.

15.4.3.1 Compression-Based Approach

As the name implies, the energy consumption data is transformed using compression techniques to protect the privacy [98, 99]. Compression alone may not be strong enough and thus [99] uses it in combination with other techniques.

Li et al. [98] make use of the technique of compressed sensing from signal processing to provide privacy protection. Compressed sensing [100–102], also known as sparse sampling, assumes the smart meter data is sparse with uniform delay and uses a secret random sequence so that the original data can be reconstructed at the receiving end.

15.4.3.2 Cryptographic

There are various cryptographic techniques reported in the literature that are used to provide privacy in the SG as shown in Fig. 15.12. We divide the cryptographic temporal privacy-protection techniques into seven categories and discuss them here.

Fig. 15.12 Cryptographic temporal privacy-protection techniques for the smart grid

15.4.3.2.1 Privacy Through Identity-Based Encryption

An identity-based encryption (IdBE) scheme is a public-key cryptosystem where the public key may be selected to be any string, such as an email address, a date, and so on. It was first introduced as a problem in [103] with solutions in [103–105]. IdBE may be used for privacy in the SG as discussed in [106].

15.4.3.2.2 Privacy Through Attribute-Based Encryption

In attribute-based encryption (AbE) [107], ciphertexts are associated with sets of attributes. Private keys are coupled with access structures that control which ciphertexts a given key is able to decrypt. AMI is an important component of the overall DR system, as defined before in Sect. 15.2.2. In [108], the authors propose to protect multicast communications involving crucial DR messages from the control center to the smart meters by means of an AbE implementation.

15.4.3.2.3 Privacy Through ID Anonymization

Anonymization is a general term for techniques that decouple a message from its originator. Several proposals in the literature take advantage of anonymization techniques for SG privacy [15, 85, 109, 110].

15.4.3.2.4 Secure Multiparty Computation

Secure multiparty computation (SMC) has been developed as an alternative to the TTP approach. SMC is a set of techniques to compute a function collectively with the assurance that at the end of the multiparty computation, no participant can learn anything except its own input and the result. Any information a participant obtains should be inferable only from these two pieces of information. Historically, SMC was initiated to address Yao’s Millionaire Problem [111], where two parties want to know which of them is richer without disclosing their actual wealth. Yao’s two-party solution was extended to multiple parties in [112].

  1. Secure comparison. This is an implementation of Yao’s Millionaire Problem [111], as described above. Secure comparison algorithms are used in [91] as part of the overall SMC approach for smart meter data processing.

  2. Homomorphic encryption. One of the most common methods to ensure privacy in the SG has been the homomorphic encryption technique, which dates back to the first problem formulation in 1978 [113]. Until recently, only partially homomorphic encryption, which preserves the structure of either addition or multiplication, but not both, was available. A solution supporting both operations remained elusive until the formulation of the first fully homomorphic encryption scheme in 2009 [114]. Homomorphic encryption enables computation on the encrypted data without revealing the plaintext. Given a homomorphic encryption function \( {\text{E}}() \), and two messages \( x,y \), the following relationship is guaranteed:

    $$ {\text{E}}(x{ \odot }y) = {\text{E}}(x){ \star }{\text{E}}(y), $$
    (15.1)

    without knowing the plaintexts \( x,y \) or the private key. The Paillier cryptosystem [115, 116] is an example of an additive homomorphic encryption, where with respect to Eq. 15.1, \( { \odot } \) is addition and \( { \star } \) is multiplication. In other words, the sum of the plaintexts is calculated from the multiplication of the ciphertexts. Another commonly used additive homomorphic encryption is the Boneh-Goh-Nissim (BGN) cryptosystem [117], which is based on Paillier but with bilinear groups.

    Implementations of homomorphic encryption techniques for privacy preservation in the SG are given in [72, 76, 78, 79–81, 84, 90, 92, 108, 118–120, 121, 122]. For example, the authors of [121] propose an Energy Privacy Preserving Aggregation (EPPA) scheme for secure SG communications. EPPA uses a multi-dimensional data aggregation approach based on the homomorphic Paillier cryptosystem [116], which is composed of three algorithms, namely key generation, encryption, and decryption. The Paillier cryptosystem is based on composite residuosity classes, whose computation is believed to be hard. It is a probabilistic asymmetric algorithm for public key cryptography and inherits additive homomorphic properties [113]. Homomorphic encryption allows specific types of computations to be carried out on ciphertext and obtain an encrypted result. For example, one user could add two encrypted numbers and then another user could decrypt the result, without either of them being able to find the value of the individual numbers. Homomorphic encryption schemes are malleable by design. Another homomorphic encryption system for privacy-preserving data collection and aggregation is proposed in [84, 122] based on the Lite Cramer-Shoup Scheme [123].

  3. Secure sum. One way to implement the secure sum is by means of the Paillier cryptosystem, as proposed in [91]. Another secure sum technique is used in [124] based on the algorithm in [125]. The basic idea of this algorithm is shown in Figs. 15.13, 15.14, 15.15, 15.16, 15.17 and 15.18. Bob, Alice, and Charlie have their own secrets, as shown in Fig. 15.13, and they would each like to compute the sum without revealing their own secret values. Any arbitrary initiator may start the process. Let Bob initiate it in our example by generating a random profile, as shown in Fig. 15.14. Bob adds the random secret profile to his own secret profile, as shown in Fig. 15.15.

    Fig. 15.13 Secrets of Bob, Alice, and Charlie

    Fig. 15.14 Bob’s secret random values

    Fig. 15.15 Bob’s secret random values added to his own secret

    Fig. 15.16 Alice receives Bob’s transmission

    Fig. 15.17 Charlie receives Alice’s transmission

    Fig. 15.18 Bob receives Charlie’s transmission and computes the sum

    Bob sends his secret plus the random secret to Alice. Note that Alice cannot break up the totals to find out Bob’s secret. Alice adds her own secrets to the values received from Bob, as shown in Fig. 15.16. Figure 15.17 shows that Charlie gets Alice’s transmission and adds his values. Bob receives the profile from Charlie, subtracts the random secret known only to him, and obtains the sum without knowing either Alice’s or Charlie’s values. He then disseminates it to the others, as shown in Fig. 15.18.

  4. Secret sharing. The basic idea of secret sharing is to break up a message \( M \) into \( n \) parts before transmission in such a way that the original message \( M \) can be assembled from these \( n \) pieces, while access to even \( n - 1 \) parts is not sufficient to infer \( M \). Techniques for achieving this goal were introduced by Shamir in [126]. Secret sharing has been exploited in [84, 122] to develop a secure and distributed protocol with privacy-preserving aggregation of SG metering data.
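Two of the summation techniques in the list above can be illustrated with a short Python sketch: the additive homomorphism of Paillier, where the product of ciphertexts decrypts to the sum of the plaintexts, and the ring-style secure sum of Figs. 15.13 to 15.18. This is a toy under stated assumptions rather than the implementation of any cited scheme: the primes are far too small for real use, the generator is fixed to \( n + 1 \), the ring sum uses scalar secrets instead of whole profiles, and the meter readings are hypothetical. It requires Python 3.8 or later for modular inverses via `pow`.

```python
import math
import random

# Toy parameters (assumed for illustration; real keys use much larger primes).
P, Q = 1789, 1861
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P - 1, Q - 1)
MU = pow(LAM, -1, N)  # valid shortcut because the generator is g = N + 1


def encrypt(m):
    """Paillier encryption: c = (N + 1)^m * r^N mod N^2, r random and coprime to N."""
    while True:
        r = random.randrange(1, N)
        if math.gcd(r, N) == 1:
            break
    return (pow(N + 1, m, N2) * pow(r, N, N2)) % N2


def decrypt(c):
    """Paillier decryption: m = L(c^lambda mod N^2) * mu mod N, L(u) = (u - 1) / N."""
    return ((pow(c, LAM, N2) - 1) // N * MU) % N


def aggregate(ciphertexts):
    """Multiply ciphertexts; the product decrypts to the sum of the plaintexts."""
    total = 1
    for c in ciphertexts:
        total = (total * c) % N2
    return total


def ring_secure_sum(values, modulus=10**6):
    """Ring-style secure sum with scalar secrets; values[0] is the initiator's."""
    mask = random.randrange(modulus)        # known only to the initiator
    running = (mask + values[0]) % modulus  # initiator hides its own value
    for v in values[1:]:                    # each peer adds its value in turn
        running = (running + v) % modulus
    return (running - mask) % modulus       # initiator removes the mask


readings = [1200, 800, 450]                 # hypothetical meter readings (Wh)
encrypted_total = aggregate(encrypt(m) for m in readings)
print(decrypt(encrypted_total))             # 2450
print(ring_secure_sum(readings))            # 2450
```

In both cases the aggregator obtains the total consumption while no individual reading is exposed in the clear.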

15.4.3.2.5 Pseudonymity

Unlike anonymity, where identity is hidden and/or decoupled from the message, in pseudonymity, fictitious names are used to represent the originators of messages. The mapping from the real identity to the fictitious name must be kept secret. Examples of this approach are proposed in [75, 83, 88, 127, 128]. We provide some details of one of these here.

The privacy-preserving authentication scheme for an SG network (PASS) [127] involves the use of a smart appliance (located at customers’ homes) attached to a tamper-resistant device for generating pseudo identities and signatures on messages. A customer is given this device when he or she opens an account or registers a newly purchased smart appliance. The characteristic features of the PASS architecture are as follows:

  • Message authentication: before a smart appliance transmits a request message to the control center, it has to include a hash-based message authentication code (HMAC) signature on the message using the regional system key. This regional system key is only known by the control center, the substation, and all tamper-resistant devices within the region. Hence, an outside attacker (who does not belong to the region or is not a registered smart appliance) does not know how to generate a valid HMAC signature. Thus, the PASS scheme protects from outsider attacks.

  • Identity privacy: in all request messages sent by a smart appliance, pseudo identities are used instead of real identities.

  • Request message confidentiality: the amount of electricity required by a smart appliance is encrypted using the public key of the control center. Thus, except for the control center, no one can decrypt the value representing the electricity amount. On the other hand, the encryption feature in the PASS architecture allows a substation to aggregate request messages sent by smart appliances within its region but the substation does not need to know about those individual amount values.
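The message authentication step can be sketched with Python’s standard `hmac` module. This is a minimal illustration and not the PASS implementation: the choice of SHA-256, the key value, and the message layout are assumptions made for the example.

```python
import hashlib
import hmac


def sign_request(regional_key: bytes, message: bytes) -> bytes:
    """Compute an HMAC tag over the request (SHA-256 is an assumed choice)."""
    return hmac.new(regional_key, message, hashlib.sha256).digest()


def verify_request(regional_key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = hmac.new(regional_key, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


regional_key = b"regional-system-key"              # hypothetical shared key
request = b"pseudo_id=7f3a;encrypted_amount=9c41"  # hypothetical message layout
tag = sign_request(regional_key, request)

print(verify_request(regional_key, request, tag))       # True
print(verify_request(b"outsider-guess", request, tag))  # False
```

An outsider who does not hold the regional system key cannot produce a tag that verifies, which is the protection against outsider attacks described above.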

15.4.3.2.6 Zero-Knowledge Proof

Zero-knowledge proofs are those convincing assertions that yield nothing but their validity [129]. In other words, one party proves a statement to another without revealing any information besides whether the statement holds. The authors of [130] deal with preserving the privacy of metered data. The authors propose a set of privacy-preserving protocols amongst a provider, a user agent, and a simple tamper-evident meter by taking advantage of a zero-knowledge proof. This work considers a scenario where the privacy of the metered data is preserved by employing encryption mechanisms along with certification techniques. Within the boundary of a home environment, plaintext is used, but when communicating with entities outside the home boundary, certification and encryption techniques are used. The authors argue that their scheme can be applied to all types of smart metering, including electricity, water, and gas metering, and can be extended to other future smart meter-based systems. The main contribution of this work can be summarized as follows: the meter produces certified readings of measurements and transmits them to the user via a secure communication channel. For billing, the user combines those readings with a certified tariff policy to produce a final bill. The bill is then transmitted to the provider alongside a zero-knowledge proof that ensures that the calculation is correct and leaks no additional information. A zero-knowledge proof of knowledge [131] is a two-party protocol between a prover and a verifier. The prover demonstrates to the verifier its knowledge of some secret input (witness) that fulfills some statements without disclosing this input to the verifier.
The protocol should meet two properties: (1) it should be a proof of knowledge, which means that a prover without knowledge of the secret input can convince the verifier only with negligible probability, and (2) it should be zero-knowledge, that is, the verifier learns nothing but the truth of the statement. Witness indistinguishability is a weaker property, which requires that the proof does not reveal which witness (among all possible witnesses) was used by the prover.
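As a generic illustration of a proof of knowledge, the sketch below implements the classic Schnorr identification protocol in a tiny group. It is not the billing protocol of [130]: the group parameters are toy values chosen for readability, the variable names are hypothetical, and the zero-knowledge property of this interactive form holds for an honest verifier.

```python
import secrets

# Toy group (assumed): G generates a subgroup of prime order Q modulo P.
P, Q, G = 23, 11, 2


def commit():
    """Prover picks a random nonce r and sends the commitment t = G^r mod P."""
    r = secrets.randbelow(Q)
    return r, pow(G, r, P)


def respond(r, x, c):
    """Prover answers the challenge c with s = r + c * x mod Q."""
    return (r + c * x) % Q


def verify(y, t, c, s):
    """Verifier accepts if G^s == t * y^c mod P."""
    return pow(G, s, P) == (t * pow(y, c, P)) % P


x = 7                    # prover's secret witness
y = pow(G, x, P)         # public statement: "I know x such that y = G^x"

r, t = commit()
c = secrets.randbelow(Q)  # verifier's random challenge
s = respond(r, x, c)

print(verify(y, t, c, s))            # True
print(verify(y, t, c, (s + 1) % Q))  # False: a wrong response is rejected
```

The verifier sees only `t`, `c`, and `s`, which can be simulated without knowing `x`; in a realistically sized group, a prover who does not know `x` passes only with negligible probability.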

15.4.3.2.7 Blind Signature

In [109], the authors consider an SG network as three basic layers: at the highest layer, there is a control center maintained by the power operator, the second layer has substations inside the distribution network and each substation is responsible for the power supply of an area, and the lowest layer has the smart meters which are placed at the users’ premises as shown in Fig. 15.19.

Fig. 15.19 A three-layer smart grid architecture. PO power operator, DS distribution substation, SM smart meter

The proposed anonymous credential architecture [109] protects users’ private information, including their daily electricity usage pattern, from third parties as well as from the power operator. The scheme is based on blind signatures [132]. A blind signature is a method that allows the first party (Party 1) to sign a message generated by a second party (Party 2), without knowing its actual content. When a third party (Party 3) receives the signed message, it can verify that the message is signed by Party 1. The anonymous credential scheme uses the blind signature technique to allow the control center (Party 1) to sign a credential generated by a customer (Party 2) without knowing its actual content. At a later time, the control center itself (Party 3) can verify that the credential is indeed signed by Party 1 without knowing who requested the signature or when the signature was generated. The use of the blind signature technique in this scheme is as follows: the customers prepare a set of credentials, each stating the amount of electricity requested, and request the control center to sign them blindly so that the customer can submit any of these credentials for the request of electricity. Since Party 1 does not know the actual content of the message sent by Party 2, the message is verified using a special technique which is widely adopted in e-cash schemes. Party 2 generates \( n \) messages using different blinding factors. It then blinds the \( n \) messages and sends them to Party 1. Next, Party 1 randomly chooses \( m \) messages (\( m < n \)) and challenges Party 2 to reveal them by providing the \( m \) blinding factors. If the \( m \) blinding factors are correct, Party 1 accepts the signature request and signs the remaining (\( n - m \)) messages.
The scheme assumes that any smart meter can communicate with the control center via a secure communication channel (such as one using the Advanced Encryption Standard (AES)), so that third parties cannot read the contents without the corresponding key.
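The blinding, signing, and unblinding steps can be sketched with textbook RSA in Python. This is a toy illustration under stated assumptions, not the scheme of [109]: the key is the small textbook example (n = 61 × 53), the credential is represented directly as an integer rather than a hash, the cut-and-choose check is omitted, and Python 3.8 or later is needed for modular inverses via `pow`.

```python
import math
import random

# Toy RSA key of Party 1 (assumed values; real keys are 2048 bits or more).
N, E, D = 3233, 17, 2753  # N = 61 * 53, E * D = 1 mod (60 * 52)


def blind(message):
    """Party 2 hides the message with a random blinding factor r."""
    while True:
        r = random.randrange(2, N)
        if math.gcd(r, N) == 1:
            break
    return (message * pow(r, E, N)) % N, r


def sign(blinded):
    """Party 1 signs without seeing the underlying message."""
    return pow(blinded, D, N)


def unblind(blind_signature, r):
    """Party 2 removes the blinding factor to obtain a normal signature."""
    return (blind_signature * pow(r, -1, N)) % N


def verify(message, signature):
    """Anyone holding the public key (N, E) can check the signature."""
    return pow(signature, E, N) == message % N


credential = 1234                      # hypothetical credential encoding
blinded, r = blind(credential)
signature = unblind(sign(blinded), r)

print(verify(credential, signature))   # True
```

Because Party 1 only ever sees the blinded value, it cannot later link the presented credential to the signing request.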

When a customer presents a credential anonymously, the control center cannot tell which customer is making the request, yet it can verify the signature to confirm that it is from a valid customer (since only valid customers can request blind signatures). The four phases involved in the Anonymous Credential scheme are as follows:

  • Setup phase. The control center assigns a Rivest-Shamir-Adleman (RSA) public and private key pair for signing credentials.

  • Registration phase. Carried out at the beginning of each month. This phase is not anonymous. Customers need to be authenticated using their real identities via an authenticated channel.

  • Power requesting phase. Can be executed at any time during the month when the smart meter of a customer finds that it needs more power to support all the electric appliances. This phase is anonymous. Customers are validated via anonymous credentials.

  • Reconciliation phase. Carried out at the end of each month. This phase is not anonymous. The smart meter sends the unused credentials back to the control center to evaluate the amount of power requested so far.

Another approach based on the fair blind signature [133] method is reported in [128] for the vehicle-to-grid (V2G) system, involving both charging and discharging of battery vehicles (BVs). Fair blind signature is an extension of the basic blind signature scheme in which misuse of the system, such as blackmailing and money laundering, is prevented by means of an embedded property that allows a trusted entity to remove anonymity. In our case, it is used to ensure proper billing.

15.4.3.3 Priced Oblivious Transfer

Oblivious transfer, introduced in 1981 in [134], is a protocol in which the sender transfers one of potentially many pieces of information to the receiver but remains unaware of which piece has been transferred. Building on oblivious transfer, a protocol called priced oblivious transfer is developed in [135] to enable buyers to purchase digital goods from vendors without letting the vendor learn what, and to the extent possible, when and how much they are buying. Priced oblivious transfer is used in [136] to propose a privacy-preserving billing protocol, which guarantees that the power operator gets the correct amount of money without learning the current energy consumption of each customer.

15.4.3.4 Certified Tariffs

As explained in Sect. 15.4.3.2 (the subsection on zero-knowledge proof), in the scheme of [130] the energy provider cannot gather any fine-grained readings. The provider is nevertheless guaranteed that the correct fee is calculated based on the actual readings and time-of-use tariffs, without learning the readings themselves.

15.4.3.5 Sampling Interval

Smart meters in the AMI system provide sampling of measurements and potentially other useful information and report them back to the power operator or other third parties. The sampling process is the center of privacy concerns as it transmits potentially sensitive information. The authors in [137] consider sampling as a design parameter in the performance of DR schemes to explore some trade-offs between performance and privacy. An optimization problem is considered to find the right sampling interval given a set of performance goals and desired privacy level.

15.4.3.6 Perturbation

Another approach to privacy preservation that has gained considerable attention is a set of techniques collectively known as perturbation. A taxonomy with the cited work is depicted in Fig. 15.20. A common theme in these techniques is to transform the energy consumption data before it is disclosed outside the customer premises. We provide details of this category with its subclasses in what follows.

Fig. 15.20 Perturbation-based temporal privacy-provisioning techniques for the smart grid

15.4.3.6.1 Privacy Using Orthogonal Code

The work in [138] analyzes security and privacy in the SG and specifically emphasizes the privacy aspects. The authors propose a secure and efficient in-network data aggregation and dispatch scheme for AMI in home area networks for the SG. In-network aggregation is the process of collecting content from multiple sources or devices in a network. With this mechanism, the authors propose the use of Walsh functions [139] based on the Hadamard code [140] to generate mutually orthogonal chip codes to be used in the secure in-network data aggregation and dispatch scheme. The use of orthogonal codes allows multiple users to communicate simultaneously over a single frequency. This is achieved by the use of spreading codes whereby a single data bit is “spread” over a longer sequence of transmitted bits. These codes, also known as chip sequences, must be carefully chosen so that the data may be correctly “de-spread” at the receiver. Such codes are known as orthogonal codes. The Hadamard code [140] is an error-correcting code that is usually used for error detection and correction when transmitting messages over very noisy or unreliable channels. In their work, the authors apply these techniques envisioning that the smart meter works as an authentication server that is connected with multiple smart devices and each smart device contributes to the formation of confidential data, which can be regenerated at the smart meter. This work describes the coding techniques and the steps by which the original data readings are spread and then mixed with the spreading codes of other smart devices. The smart meter can reconstruct the original reading data from the mixed data using the chip code established with each smart device during its initialization procedure through mutual authentication.
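The spreading and de-spreading steps can be sketched in Python as follows. This is a simplified illustration and not the scheme of [138]: authentication and code establishment are omitted, the Sylvester construction is one common way to build a Hadamard matrix, and the device bit streams are hypothetical.

```python
def hadamard(order):
    """Sylvester construction of a Hadamard matrix; order must be a power of two."""
    h = [[1]]
    while len(h) < order:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def spread(bits, code):
    """Map each bit to -1/+1 and multiply it by every chip of the code."""
    return [(1 if b else -1) * chip for b in bits for chip in code]


def mix(signals):
    """Superimpose the spread signals of all devices, chip by chip."""
    return [sum(chips) for chips in zip(*signals)]


def despread(mixed, code):
    """Correlate each chip block with the code to recover one device's bits."""
    n = len(code)
    bits = []
    for i in range(0, len(mixed), n):
        corr = sum(m * c for m, c in zip(mixed[i:i + n], code))
        bits.append(1 if corr > 0 else 0)
    return bits


codes = hadamard(4)        # rows are mutually orthogonal chip codes
device_a = [1, 0, 1]       # hypothetical readings as bit streams
device_b = [0, 0, 1]

mixed = mix([spread(device_a, codes[1]), spread(device_b, codes[2])])
print(despread(mixed, codes[1]))  # [1, 0, 1]
print(despread(mixed, codes[2]))  # [0, 0, 1]
```

Only a receiver that knows a device’s chip code can separate that device’s data from the mixed signal, because the correlation with any other orthogonal code is zero.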

Another work that encrypts measured data with orthogonal Walsh codes is reported in [89]; it uses a ring communication architecture.
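
The spreading and de-spreading steps described above can be illustrated with a small sketch. It is not the scheme of [138] or [89]: it assumes integer readings that are spread as a whole rather than bit by bit, it omits the mutual authentication step, and the function names and sample values are hypothetical. It only shows why mutually orthogonal Walsh-Hadamard codes let a receiver separate superimposed contributions.

```python
def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n must be a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-c for c in row] for row in h]
    return h


def spread(reading, code):
    """Spread one integer reading over the chips of a device's code."""
    return [reading * chip for chip in code]


def mix(signals):
    """Superimpose the spread signals chip by chip, as on a shared channel."""
    return [sum(chips) for chips in zip(*signals)]


def despread(mixed, code):
    """Correlate the mixed signal with one code to recover that device's reading."""
    return sum(m * chip for m, chip in zip(mixed, code)) // len(code)


codes = hadamard(4)
readings = {"kettle": 2000, "fridge": 150, "tv": 90}  # watts, made-up values
# Each device gets its own row of the matrix as its chip code (row 0 is left unused).
assigned = {name: codes[i + 1] for i, name in enumerate(readings)}

mixed = mix([spread(readings[name], assigned[name]) for name in readings])
recovered = {name: despread(mixed, assigned[name]) for name in readings}
```

Because distinct rows of the matrix have a zero inner product, correlating the mixed signal with one device's code cancels every other contribution. A party that does not know which code belongs to which device sees only the superimposed chips.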

15.4.3.6.2 Time Series-Based Privacy

The measurements coming out of smart meters can be viewed as a time series, which opens the door to a wealth of methods from the field of time series analysis. We survey some of these methods here.

  1. Load signature moderation: The load signature moderation (LSM) scheme [95] suggests that home electrical power routing can be used to moderate the home’s load signature in order to hide appliance usage information. A load signature is defined as a series of time-stamped average power loads \( p(t) \) derived from cumulative energy values \( e(t) \) metered at intervals of \( \varDelta t \):

    $$ p(t) = \frac{e(t) - e(t - \varDelta t)}{\varDelta t} $$
    (15.2)

    A home load signature is the sum of all home appliance loads. To perform load signature moderation, the authors assume that future smart homes will contain a variety of energy storage and energy generation devices, so that electrical power routing will be feasible. Electrical power routing means the selective control and mixing of a number of electricity sources to route electricity to a number of consumers. For instance, when a kettle drawing 2 kW of power is switched on, the power router could be configured so that 1 kW is supplied from a solar panel, 0.5 kW from a battery, and 0.5 kW from the main electricity supply. The basic contribution of this approach is the idea of providing privacy for the user by including privacy mechanisms for the smart meter that is supposed to record the usage. The authors also propose a power management model using a rechargeable battery and a power mixing algorithm, and they evaluate its protection level with three different privacy metrics: an information theoretic one (relative entropy), a clustering classification one, and a correlation/regression one. We briefly review these metrics:

    • Relative entropy: the relative entropy, or Kullback-Leibler distance [141], is a well-known information theoretic quantity that can be used to compare two sources of information. It is not a distance in the strict mathematical sense (it is not symmetric, for example); rather, it quantifies the relation between two probability densities. If \( p_{0} \) and \( p_{1} \) are two probability densities, the Kullback-Leibler distance is defined as

      $$ D(p_{0} ||p_{1} ) = \int_{x_{\min}}^{x_{\max}} p_{0} (x)\log \frac{{p_{0} (x)}}{{p_{1} (x)}}dx $$
      (15.3)

      where the integral is taken over the range \( [x_{\min}, x_{\max}] \) on which \( p_{0} \) and \( p_{1} \) are defined.

      Relative entropy is always non-negative, and it is zero for identical \( p_{0} \) and \( p_{1} \). Hence, the authors in [95] state that the level of privacy protection offered by a mapping \( \emptyset \) can be measured by the relative entropy \( D_{\emptyset } (p_{0} ||p_{1} ) \): the higher the level of protection offered by \( \emptyset \), the larger the relative entropy.

    • Clustering classification: the authors propose using any of the available clustering classification mechanisms, which take a set of data with a distance metric and group the data into \( n \) clusters that minimize the distance between points. The distance metric here is the difference between power consumption values. They propose a simple method of trace analysis that aims to recover information about device power usage from the small amount of information sent via the signals.

    • Regression analysis: as a third metric, the work described in [132] quantifies privacy by combining cross-correlation and regression procedures, which can be termed regression analysis. In statistics, regression analysis comprises many techniques for modeling and analyzing several variables when the focus is on the relationship between a dependent variable and one or more independent variables. A dependent variable is what is measured in an experiment and what is affected during the experiment; it is so named because it responds to, and depends on, the independent variable. In a scientific experiment, there cannot be a dependent variable without an independent variable. For example, if someone is interested in finding out how the time spent studying changes a “test score”, it is understood that the test score does not change the time spent, because the studying happened earlier. In this case, “studying time” is the independent variable and “test score” is the dependent variable. Building on these ideas, the authors apply regression analysis to the received signals to recover information by comparing them over time.

    This work can be extended to include other types of privacy metrics, such as mutual entropy or equivocation. In addition, smarter battery privacy algorithms may be designed, which the authors have left as future work.

  2. Theory of rate distortion: Rate-distortion theory is a subfield of information theory that addresses the problem of lossy compression. It analyzes the theoretical foundations for determining the bit rate that must be communicated over a communications channel so that the original data can be reconstructed at the receiver within a given distortion level.

    Information theoretic approaches to SG privacy based on rate-distortion theory have been proposed in a few recent studies [99, 142, 143]. Rajagopalan et al. [99] and Sankar et al. [143] attempt to quantify privacy in order to gain insight into the tradeoff between sharing information (utility) and hiding it (privacy). The utility is represented by means of the mean squared error (distortion \( D \)):

    $$ D = \frac{1}{n}\sum\limits_{k = 1}^{n} {{\mathbb{E}}[(X_{k} - \hat{X}_{k} )^{2} ]} $$
    (15.4)

    where \( X_{k} \) is the actual measurement and \( \hat{X}_{k} \) is the exposed value, while privacy is represented by the information leakage:

    $$ L = \frac{1}{n}{\mathscr{I}}(Y_{n} ;\hat{X}_{n} ) $$
    (15.5)

    where \( Y_{n} \) represents the inferred data as a random variable correlated with the measurement variable \( X \). Examples of the interference sequence \( Y_{n} \) include the known appliance signatures provided by the NILM techniques discussed earlier in Sect. 15.3.3. The proposed algorithm, an interference-aware reverse waterfilling solution, exposes high-power but less private appliance information and filters out lower-power components up to a distortion threshold. However, the work is limited to a framework proposal, and the algorithmic approach is not detailed enough to be implemented.

    Another rate-distortion theory based approach is given in [142]. However, like the previous ones, it relies on unrealistic assumptions and the approach is complex. For instance, the assumption of binary input and output loads is unrealistic.

  3. Zero-Sum Errors. The authors of [144] propose a cooperative state vector estimation technique that preserves the privacy of the user’s personal behavior. Unlike most other privacy preservation techniques for the SG, which focus on energy consumption information, the authors here provide privacy protection for phase angle measurements, taking advantage of the state estimation methodology [145]. The two key objectives are to ensure that (1) the power measurement is obfuscated well enough that users do not fully disclose their private behavioral information, and (2) the obfuscated data retain the basic information needed to accurately estimate the state vector (a column vector whose components are the state variables of the system) from the perturbed data. “Perturbed data” are the original measurement data after they have been modified to conceal information and to make the original data difficult to infer. Another significant contribution of this work is that the authors evaluated the performance of the proposed data obfuscation scheme with 1,349 measurement data sets, treating them as if they were connected to five different IEEE test systems that represent portions of the Midwestern US electric power grid. They also evaluated the illegibility to human inspectors, the resilience to automated data mining attackers, and the communication overhead.
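
To make the quantities used in this subsection concrete, the following sketch computes a load signature as in Eq. (15.2), a discrete relative entropy in the spirit of Eq. (15.3), and an empirical distortion in the spirit of Eq. (15.4). It is an illustration under stated assumptions, not the evaluation procedure of [95] or [99]: the meter readings, the bin edges, the add-one smoothing of the histograms, and the helper names are all hypothetical, and the power routing is reduced to subtracting a locally supplied share from the demand.

```python
import math


def load_signature(energy, dt):
    """Eq. (15.2): average power per interval from cumulative energy readings."""
    return [(later - earlier) / dt for earlier, later in zip(energy, energy[1:])]


def mains_draw(demand, local_supply):
    """Power routing sketch: local sources cover part of the demand, mains the rest."""
    return [max(d - s, 0.0) for d, s in zip(demand, local_supply)]


def empirical_pmf(values, edges):
    """Histogram with add-one smoothing, so that every bin has nonzero mass.
    Values outside the bin edges are ignored."""
    counts = [1] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += 1
                break
    total = sum(counts)
    return [c / total for c in counts]


def relative_entropy(p, q):
    """Discrete Kullback-Leibler divergence: sum of p * log(p / q)."""
    return sum(a * math.log(a / b) for a, b in zip(p, q))


def distortion(actual, exposed):
    """Empirical counterpart of Eq. (15.4): mean squared error."""
    return sum((x - y) ** 2 for x, y in zip(actual, exposed)) / len(actual)


energy_wh = [0, 100, 200, 1200, 1300, 1400]   # cumulative meter readings (made up)
original = load_signature(energy_wh, dt=0.5)  # watts, half-hour intervals
local = [0.0, 0.0, 1500.0, 0.0, 0.0]          # solar + battery share for the kettle
moderated = mains_draw(original, local)       # what the meter would record

edges = [0, 500, 1000, 1500, 2500]            # histogram bins in watts
d_kl = relative_entropy(empirical_pmf(original, edges),
                        empirical_pmf(moderated, edges))
mse = distortion(original, moderated)
```

In this toy trace the kettle peak of 2 kW is reduced to 0.5 kW at the meter. The relative entropy between the two histograms is positive because the moderated signature no longer has the same distribution as the original, and the distortion measures how far the recorded values have moved from the true ones.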

15.4.3.6.3 Privacy with Probability Distribution Functions

Another method of transforming the exposed measurement data is to add noise drawn from a probability distribution.

  1. Binomial. Noise drawn from a binomial distribution is proposed in [97, 146].

  2. Gaussian.

    a. Basic: a plain Gaussian distribution is used to determine the magnitude of the noise in [147–150].

    b. Offset: the work in [149] proposes a noise-canceling mechanism based on the Central Limit Theorem. In the offset method, the margin caused by the noise in previous time slots is compensated in order to achieve zero error in the billing computation.

  3. Laplacian. The Laplace distribution is the basis for computing the noise in [151]. Chen et al. [79] employ a symmetric geometric distribution, which can be regarded as a discrete approximation of the Laplace distribution. The use of the geometric distribution for the noise was pioneered by [152].
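
The noise families listed above can be sampled with a few lines of code. The sketch below is only illustrative: the parameter values and function names are hypothetical, and `perturb_with_offset` is one simple way to realize the compensation idea of the offset method (the last slot cancels the noise accumulated so far), assumed here for illustration and not taken from [149].

```python
import math
import random


def gaussian_noise(rng, sigma):
    """Zero-mean Gaussian noise with standard deviation sigma."""
    return rng.gauss(0.0, sigma)


def laplace_noise(rng, scale):
    """Laplace(0, scale) noise as the difference of two exponential samples."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def geometric(rng, p):
    """Failures before the first success (success probability p), by inversion."""
    u = 1.0 - rng.random()  # uniform on (0, 1], so log(u) is defined
    return int(math.log(u) / math.log(1.0 - p))


def symmetric_geometric_noise(rng, p):
    """Two-sided geometric noise: an integer-valued analogue of Laplace noise."""
    return geometric(rng, p) - geometric(rng, p)


def perturb_with_offset(readings, rng, sigma):
    """Add rounded Gaussian noise per slot; the final slot offsets the noise
    accumulated so far, so the billing total is unchanged (assumed scheme)."""
    exposed, accumulated = [], 0
    for reading in readings[:-1]:
        noise = round(gaussian_noise(rng, sigma))
        accumulated += noise
        exposed.append(reading + noise)
    exposed.append(readings[-1] - accumulated)
    return exposed


rng = random.Random(7)                      # fixed seed only for repeatability
readings = [310, 295, 1870, 420, 305, 300]  # watt-hours per slot, made-up values
exposed = perturb_with_offset(readings, rng, sigma=50.0)
```

The individual exposed values differ from the true readings, while their sum over the billing period is identical, which is the property the offset method aims for. The symmetric geometric sampler returns integers, which suits meters that report discrete units.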

15.4.3.6.4 Differential Privacy

The notion of differential privacy was introduced in [153, 154]. Differential privacy emerged from the field of database queries, where the goal is to answer queries accurately while preserving the privacy of individuals. It provides plausible deniability by blurring the underlying data, and it is defined in terms of an information-releasing algorithm with an underlying mathematical model. Differential privacy boils down to distorting the answers to database queries by adding predefined noise, so that the intended receiver can filter it out and reach an almost accurate answer. As the problem definition suggests, this is applicable to the SG privacy case as well. Differential privacy-based mechanisms have been proposed in [79, 97, 146, 151].
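
A common building block for differential privacy is the Laplace mechanism, which adds Laplace noise with scale equal to the query sensitivity divided by the privacy parameter \( \varepsilon \). The sketch below applies it to a neighborhood sum query. It is a minimal illustration and not the mechanism of any cited work: the load values, the clipping bound `MAX_LOAD_KW`, and the choice of \( \varepsilon \) are assumptions made for the example.

```python
import random


def clip(value, low, high):
    """Bound one household's contribution so that the sum has finite sensitivity."""
    return max(low, min(high, value))


def laplace_mechanism(true_answer, sensitivity, epsilon, rng):
    """Release true_answer plus Laplace noise of scale sensitivity / epsilon."""
    scale = sensitivity / epsilon
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_answer + noise


MAX_LOAD_KW = 5.0                        # assumed cap on a single household's load
loads_kw = [0.4, 2.1, 7.3, 0.9, 1.6]     # made-up instantaneous loads
clipped = [clip(x, 0.0, MAX_LOAD_KW) for x in loads_kw]
true_total = sum(clipped)

rng = random.Random(11)                  # fixed seed only for repeatability
# One household can change the clipped sum by at most MAX_LOAD_KW.
released_total = laplace_mechanism(true_total, MAX_LOAD_KW, epsilon=0.5, rng=rng)
```

A smaller \( \varepsilon \) gives stronger privacy but a noisier answer. The noise has zero mean, so averages over many households or many releases stay close to the true value, which reflects the utility and privacy tradeoff discussed above.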

15.4.3.7 Aggregation

To secure the data-collection task, there are two major approaches. The first is to protect the data content directly, without regard to the data semantics. The approach presented in [60] is based on symmetric cryptography to provide data confidentiality and authentication between sensors and the base station. The work in [155] describes a protocol for a data collector (DC) to collect data from a measurement device (MD), but direct communication between the DC and the MD is assumed. The second category exploits the aggregate statistics of the sensed data, such as the summation, average, minimum, maximum, and so on. These approaches take advantage of in-network data processing (also referred to as aggregation) to apply obfuscating operations to the transmitted data [72, 122, 138, 156–162]. Common examples in this category include cluster-based private data aggregation [159] and its integrity enhanced version [160], secret perturbation [157], k-indistinguishable privacy-preserving data aggregation [158], a centralized authentication server based in-network aggregation for AMI [138, 161], homomorphic encryption-based aggregation [72, 78, 80], a secure architecture for distributed, hierarchical collection and aggregation of additive data [84, 122], a secure and scalable data collection protocol for smart meter data [163, 164], multifunctional, privacy-protecting aggregation [79], and a network coding-based encryption between smart meters and aggregators [162]. Another one is reported in [121]. Many of the existing data aggregation schemes collect information as one-dimensional information. However, smart meter data can be considered multi-dimensional in nature, because they include various aspects of the information such as the amount of energy consumed, the time it was consumed, the purpose of the consumption, and so on.
Given the high data collection frequency, the multi-dimensional information, and the large number of users, current data aggregation schemes not only generate huge communication costs but also impose an overwhelming processing load on local gateways. In contrast to traditional one-dimensional data aggregation methods, Energy Privacy Preserving Aggregation (EPPA), as discussed earlier in “Secure Multiparty Computation”, is shown to greatly reduce computational cost and significantly improve communication efficiency while satisfying the real-time, high-frequency data collection requirements of SG communications. The main drawback of the work is that it is highly theoretical and does not provide enough detail on how such an approach can be deployed in practice.
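
The idea of exposing only an aggregate statistic can be sketched with additive masks that cancel in the sum. This is a generic illustration and not the protocol of any work cited above: it assumes a trusted party that hands out the masks, integer readings, and arithmetic modulo a hypothetical constant `MODULUS`; schemes based on homomorphic encryption achieve a similar effect without such a dealer.

```python
import random

MODULUS = 2 ** 32  # assumed to be larger than any possible sum of readings


def zero_sum_masks(n, rng):
    """Draw n masks that add up to zero modulo MODULUS."""
    masks = [rng.randrange(MODULUS) for _ in range(n - 1)]
    masks.append(-sum(masks) % MODULUS)
    return masks


def mask_reading(reading, mask):
    """What a single meter sends: its reading hidden by its own mask."""
    return (reading + mask) % MODULUS


def aggregate(masked_readings):
    """What the aggregator computes: the masks cancel, only the total remains."""
    return sum(masked_readings) % MODULUS


rng = random.Random(2024)            # fixed seed only for repeatability
readings = [310, 295, 1870, 420]     # watt-hours from four meters, made up
masks = zero_sum_masks(len(readings), rng)
masked = [mask_reading(r, m) for r, m in zip(readings, masks)]
total = aggregate(masked)
```

Each masked value on its own is spread over the whole range of the modulus, while the aggregator still obtains the exact neighborhood total needed for load monitoring or billing.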

15.5 Challenges and Opportunities

The preservation of privacy in the SG environment poses many fundamental open challenges that still need to be solved. As our literature survey shows, several research projects have investigated privacy-preserving techniques for the SG environment in the last few years. We found that there is a need for privacy to be comprehensively regulated through legal and regulatory frameworks in order to enhance users’ confidence and reinforce individuals’ privacy rights. These frameworks should provide a comprehensive view of both the challenges and limitations related to personal data protection rights as they pertain to the SG technology.

In recent years, a lot of work has been undertaken on designing privacy-preserving methods using various technical approaches, which vary according to the context and the architecture in use. Throughout this chapter, various SG privacy solutions aimed at preserving smart meters’ privacy have been discussed. As we have pointed out earlier, most of the recently proposed SG solutions have limitations, and they do not always follow the recommendations made by standardization entities and governmental agencies [40]. Although it is not mandatory to follow the guidelines recommended by standardization bodies, SG privacy solutions should nevertheless take these recommendations into consideration for the sake of future interoperability and scalability. We discuss here some of the challenges that still need to be addressed by researchers and designers working in the area of SG privacy.

  • Third party issue

    The privacy issues in the SG are particularly magnified by the large-scale infrastructures, the diversity of communication technologies, the number of data sources, and the high volume of data generated. In the past, most SG services were limited to governments or large enterprises, which traditionally built proprietary and isolated infrastructures (e.g., the electrical power network) to provide services to customers. However, third parties can now offer their infrastructures and services with limited control from governments and, hence, concerns have arisen about third-party access to customers’ personal information.

  • Privacy and authentication

    Privacy is often closely linked with authentication. The issue of trade-off between privacy rights of entities and the need to authenticate them needs to be explored further. Unfortunately, authentication leads to personal information becoming available. However, authentication is a very important security service that may help to eliminate some of the cyber attack classes such as man-in-the-middle attacks and false data injection attacks. The latter consists of forging and manipulating the quantities of energy supply and requests. It is worth noting that authenticated nodes may also inject false data without being detected as is the case with recently proposed homomorphic encryption-based solutions [87, 165].

  • Privacy and forensics

    Privacy solutions are also closely linked with verifiability requirements [166] and with traceability as well as forensic techniques. We should formulate threat models to detect cyber attacks and data leakage scenarios [166] such as infrastructure attacks and rogue nodes. In particular, a privacy-preserving solution should provide a well-maintained log that may help in preventing fraud and in resolving disputes. Traceability and forensic techniques should be taken into consideration during service design and the development of service architectures [167]. In the context of the SG and real-time ecosystems, we should not only cover the effectiveness of privacy-preserving methods, but also have the ability to monitor and detect anomalies in real time and to analyze the data collected and aggregated from the different sources. The challenge here is to define an effective method to identify legitimate traffic, to enable forensic investigation of subversive and illegal activities, and to mitigate any possible insider attacks against the infrastructure. In fact, security and forensic techniques are fundamental, especially when an adversary tampers with a device from which data are collected or aggregated or when the same adversary successfully performs cloning attacks.

Multi-disciplinary research approaches which consider training, legal, and technological aspects should be developed to address the privacy issues that arise with the SG environment. Future SG privacy solutions should include the design and development of architectures that prevent unnecessary linking between the user identity and the SG services, while guaranteeing traceability and accountability in the presence of an important set of interconnected engineering resources and nodes. We argue that a holistic approach is needed to identify and address privacy challenges throughout the engineering phase of the SG in order to ensure SG solutions that maintain privacy and are also secure, scalable, and cost-effective.

15.6 Conclusion

Over the past several years we have witnessed huge investment and interest from industry and governments in SG technologies. Various stakeholders (residential/commercial customers, local government, utility operators, etc.) are expected to reap several benefits associated with the SG, including improved energy efficiency, increased reliability, reduced energy costs, greater flexibility in energy consumption, better safety and security, and an improved environment (through renewable, renewable non-variable, non-renewable/non-variable energy sources). The deployment of SG technologies has also raised considerable concerns about the data privacy of SG users, as we have discussed in this chapter. The privacy concerns are mostly related to the collection and use of energy consumption data. In this context, we have discussed various SG privacy issues and presented SG privacy architectures and approaches that have recently been proposed in the literature. A unique taxonomy of the various privacy protection mechanisms proposed in the literature has been developed. We have also identified the various strengths and weaknesses of these privacy solutions. The success of SG technology and its wide acceptance rely on gaining the trust and confidence of customers, which in turn depends on assurances regarding the protection of their privacy.