1 Introduction

This chapter introduces Non-Intrusive Load Monitoring, a method for detecting individual devices from an aggregate signal. Non-Intrusive Load Monitoring is the research area and technology behind the third word in Smart Meter Inclusive. Using a smart meter as a basis and recognizing devices from the power profile is not a new idea; it is now common practice in Non-Intrusive Load Monitoring. However, a measurement system that classifies appliances in real time and visualizes the results directly on the same hardware has not existed so far. Smart Meter Inclusive aims to leave the data where it originates, namely with the customer. This chapter provides a general overview of Non-Intrusive Load Monitoring so that the basics and approaches behind such a Smart Meter Inclusive can be understood.

2 Efficient Energy Monitoring Through Non-intrusive Load Monitoring

One of the most important issues of our time is environmental protection. Everyone owns more and more electronic appliances; politicians rely on electric vehicles; energy consumption continues to rise. This is partly offset by ever more efficient devices: washing machines and dryers use significantly less energy today than ten years ago. However, this energy efficiency alone is no longer sufficient.

One approach to addressing this seemingly endless increase in energy demand is to use electrical consumers more efficiently (Hoyo-Montaño et al. 2016; Luca et al. 2015). One way to implement this is to make the customers’ consumption more transparent and thereby show them the resulting potential savings. This transparency raises a new problem: how do you measure the consumption of individual devices? Is it sensible to install more and more power meters? To identify all consumers with this concept, each one must be equipped with its own electricity meter. Power meters are not cheap, and their data have to be synchronized before they can be evaluated. Some devices are installed permanently and are inaccessible, which makes it difficult to add a power meter after the fact. Finally, power meters are themselves new consumers of energy.

Developments toward more efficient consumption monitoring in electrical networks began at MIT in the USA in the 1980s (Hart 1989). A basic physical principle is that the power of consumers running at the same time superimposes. Hart postulated that electrical consumers could be divided into a few main categories. He also found through measurements that different devices show very individual behavior. Hart’s conclusion was that a kind of reverse engineering could take place here. Instead of installing an electricity meter in front of each individual consumer, only one meter is required at the entrance to the electrical network. This central meter measures the superimposed total consumption of all devices. The individual devices are then separated out of the total consumption with the help of various signal processing and artificial intelligence methods. A kind of inverse superposition takes place.

Another motivation for monitoring and detecting consumers can be found under the keyword Ambient Assisted Living (AAL) (Bucci et al. 2021; Ruano et al. 2019; Klein et al. 2013). Here, Non-Intrusive Load Monitoring (NILM) can help monitor the health status of older people. For the most part, AAL relies on sensors of all kinds, from direct vital signs sensors in smartwatches to indirectly incorporated accelerometers in smartphones. Usually, these sensors have to be carried actively on the body and therefore require a certain tolerance of the wearer. NILM enables an indirect insight into everyday behavior via appliance recognition without restricting the freedom of the person or forcing a change in behavior. It is not necessary to procure several expensive sensors for this; instead, every household appliance automatically assumes the role of such a sensor.

2.1 Load Disaggregation and Other Terms

When Hart began his work in this area of research, he coined the name Nonintrusive Appliance Load Monitoring (NIALM), the title of his publication (Hart 1992). Over the years, other terms and spellings derived from it have become established, all of which can be used synonymously. The most common are Nonintrusive or Non-Intrusive Load Monitoring (NILM) and Nonintrusive or Non-Intrusive Appliance Load Monitoring (NIALM). Another abbreviation is NALM, in which non-intrusive is abbreviated to a single letter. Energy disaggregation and load disaggregation are also quite common. Disaggregation derives from the idea that the aggregated power, i.e. the resulting total power of all devices, is measured at a central measuring point. Mathematically, the aggregated power can be expressed as Eq. 1.

$$P_{tot} \left( t \right): = \sum\limits_{i = 1}^{N} {P_{i} \left( t \right)} + e\left( t \right)$$
(1)

The N individual loads are described by Pi(t), and an additional disturbance term e(t) is added, which describes both the noise and possibly unidentifiable loads. So, disaggregation is the inversion of this aggregated signal into its consumer signals.
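The superposition in Eq. 1 can be illustrated with a minimal Python sketch. The function name `aggregate_power`, the Gaussian model for e(t), and the appliance traces are assumptions made for illustration only; they are not taken from a specific NILM implementation.

```python
import random

def aggregate_power(loads, noise_std=0.0, seed=0):
    """Sum N per-appliance power traces P_i(t) and add a disturbance e(t).

    loads: list of equally long lists, one power trace (in watts) per appliance.
    noise_std: standard deviation of the assumed Gaussian disturbance e(t).
    """
    rng = random.Random(seed)
    n_samples = len(loads[0])
    total = []
    for t in range(n_samples):
        e_t = rng.gauss(0.0, noise_std) if noise_std > 0 else 0.0
        total.append(sum(load[t] for load in loads) + e_t)
    return total

# Hypothetical appliance traces in watts (illustrative values only).
kettle = [0, 2000, 2000, 0, 0]
lamp = [60, 60, 60, 60, 0]
p_tot = aggregate_power([kettle, lamp])
```

Disaggregation is the reverse task: given only `p_tot`, recover the traces `kettle` and `lamp`.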

In this book, we mainly use the notation most commonly used today, Non-Intrusive Load Monitoring, and the abbreviation NILM.

2.2 Intrusive Versus Non-intrusive

One term that inevitably stands out when examining the name of the research area is non-intrusive. So the question arises of what intrusive and non-intrusive mean. The two terms refer to the measuring principle. Intrusive means that a measuring device is inserted into the electrical network, while non-intrusive denotes a black-box measuring method that observes the network from outside.

The simplest example of an intrusive measurement is a plug-in power meter. This measuring device is inserted between the socket and the appliance. It is itself a load, although usually a relatively small one. However, it requires an intervention in the network structure since it has to be connected in between. In a non-intrusive measurement, measurements are taken outside of the network to be analyzed. Current measuring methods for NILM use the physical principle that every current generates a magnetic field. This means that the sensors can easily be retrofitted around the individual phase conductors.

Both measurement methods, intrusive and non-intrusive, are in use. They are summarized under the term Appliance Load Monitoring (ALM) (Hart 1992). Intrusive monitoring of appliances is the classic example that everyone immediately has in mind. Each device to be monitored has its own power meter. The problem with this measurement method is that the power meters are expensive to purchase and install. Furthermore, the recorded data must be combined before they can be evaluated. On the other hand, there is the non-contact monitoring of devices by measuring at a central point in the network. The non-intrusive method is significantly cheaper to install and easier to retrofit. The problem with this approach is that only the aggregated signal is available due to the central measurement. The load disaggregation must therefore be implemented in software using various algorithms from signal processing and artificial intelligence.

2.3 Process Chain of Non-intrusive Load Monitoring

Over the years, a standard procedure for the implementation of NILM has been established, which can be found again and again in this or a slightly modified form. These individual components of the processing chain ensure that the complex NILM problem is broken down into more manageable sub-problems. The structure for event-driven NILM approaches is shown as an example.

  • Data Acquisition (current and voltage)

  • Preprocessing (filtering, but also conversion to common formats like P&Q, Harmonics, …)

  • Feature Extraction (steady-state features, transient-state features)

  • Event Detection

  • Classification

  • Monitoring (depending on the objectives, can be approximation of consumption or detection of anomalies, …).

The individual steps are described in more detail below.

Data Acquisition

Data acquisition involves measuring a signal. Current and voltage can be measured here, but also the power itself can form the input signal using a power meter. An important parameter related to data acquisition is the sampling rate. In this case, sampling below 1 Hz is referred to as a low sampling rate, while high-frequency sampling in the context of NILM is in the range of > 1 kHz to the MHz range (Zhuang et al. 2018).

Preprocessing

In the preprocessing stage, the recorded signals are subjected to an initial adjustment. If necessary, digital filtering can take place here. Signal conversions, such as power calculations, also count as preprocessing in this sense. The signal is converted into a form that can be used for event detection and feature extraction.
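As an example of such a signal conversion, the following sketch computes the active power P and the magnitude of the reactive power Q from sampled voltage and current. It assumes sinusoidal signals and a sample window that covers whole mains periods; the function name and the synthetic test signal are illustrative assumptions. The identity Q² = S² − P² used here does not recover the sign of Q and no longer isolates reactive power when the current is distorted.

```python
import math

def active_reactive_power(voltage, current):
    """Return (P, |Q|) from samples covering whole signal periods."""
    n = len(voltage)
    # Active power: mean of the instantaneous power v(t) * i(t).
    p = sum(v * i for v, i in zip(voltage, current)) / n
    v_rms = math.sqrt(sum(v * v for v in voltage) / n)
    i_rms = math.sqrt(sum(i * i for i in current) / n)
    s = v_rms * i_rms  # apparent power
    # Magnitude of the reactive power (sinusoidal assumption, sign is lost).
    q = math.sqrt(max(s * s - p * p, 0.0))
    return p, q

# Synthetic example: 325 V peak voltage, 10 A peak current lagging by 60 degrees.
N = 200  # samples per period
voltage = [325.0 * math.sin(2 * math.pi * k / N) for k in range(N)]
current = [10.0 * math.sin(2 * math.pi * k / N - math.pi / 3) for k in range(N)]
p, q = active_reactive_power(voltage, current)
```

For this synthetic signal, P is about 812.5 W and Q about 1407 var, which matches S·cos(60°) and S·sin(60°) with S = 1625 VA.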

Feature Extraction

Depending on the objective, a feature is obtained from the preprocessed signal. This feature can later be used for classification. A wide variety of methods are used here, which can deliver anything from one-dimensional to multidimensional features. A simple example is the power values P and Q per period. Other examples are the harmonics or V-I trajectories.

Another distinction in feature extraction is the question of the signal section to be used for extraction. There are Steady State Features (SSF) and Transient State Features (TSF). With the SSF, the signal change before and after an event is compared. The SSF is particularly suitable for low-dimensional features. TSFs, in turn, cover an entire signal section, from the beginning of the occurring signal change of an event to the transition when the signal again assumes a quasi-stable state. By considering this dynamic change, TSF delivers multidimensional features.

Event Detection

In event detection, an algorithm is applied to the recorded signal to detect when an appliance has changed its state. The bandwidth for such methods ranges from simple threshold detectors to complex methods such as wavelet transforms.
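The simplest end of this range, a threshold detector, might look like the following sketch. The function name, the threshold value, and the power trace are assumptions for illustration; practical detectors usually add smoothing and a minimum distance between events.

```python
def detect_events(power, threshold):
    """Return (index, step) pairs where consecutive samples differ by at least threshold."""
    events = []
    for k in range(1, len(power)):
        delta = power[k] - power[k - 1]
        if abs(delta) >= threshold:
            events.append((k, delta))
    return events

# Hypothetical aggregate power in watts: a 2 kW appliance switches on and off again.
power = [60, 61, 59, 2060, 2061, 2059, 60, 60]
events = detect_events(power, threshold=100)
```

Here `events` contains one switch-on step at index 3 and one switch-off step at index 6; the small fluctuations around 60 W and 2060 W stay below the threshold.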

Classification

Once events are detected, they can be examined more closely. An attempt is made to identify the appliance causing the event. Any classification method from the field of machine learning can be used to solve this problem. Not included in the representation of this process chain is the fact that such a classification method must be learned beforehand.
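To make the training and classification steps concrete, the sketch below uses a nearest-centroid classifier on a two-dimensional feature (ΔP, ΔQ). This is only one of many possible classifiers and is chosen here for brevity; the labels and feature values are hypothetical.

```python
import math

def train_centroids(samples):
    """Learn one mean feature vector per appliance label.

    samples: list of (label, (delta_p, delta_q)) pairs from labeled events.
    """
    sums = {}
    for label, (dp, dq) in samples:
        acc = sums.setdefault(label, [0.0, 0.0, 0])
        acc[0] += dp
        acc[1] += dq
        acc[2] += 1
    return {label: (acc[0] / acc[2], acc[1] / acc[2]) for label, acc in sums.items()}

def classify(feature, centroids):
    """Assign the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(feature, centroids[label]))

# Hypothetical labeled switch-on events: (delta P in W, delta Q in var).
training = [
    ("kettle", (2000.0, 0.0)),
    ("kettle", (1980.0, 10.0)),
    ("fridge", (120.0, 80.0)),
    ("fridge", (130.0, 90.0)),
]
centroids = train_centroids(training)
label = classify((1950.0, 5.0), centroids)
```

The training call corresponds to the learning phase that the process chain does not show explicitly; `classify` is the step applied to each detected event.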

Monitoring

In monitoring, the goal is crucial. Once it is clear which information the NILM processing chain should deliver, these results can be combined with the knowledge gained so far. The motivations for NILM listed at the beginning, energy reduction and ambient assisted living, alone indicate how broadly monitoring is to be understood.

2.4 Appliance Categories

From the outside, some appliances look similar, while others are fundamentally different. This can also be observed inside. Two different toasters both consist, in simplified terms, of a heating element and are therefore similar. A fan, in contrast, is a completely different type of electrical load, which shows in its signal curve. Many different appliances have been analyzed in research on NILM. The division into four basic categories of electrical devices has become established in the literature on NILM (Abubakar et al. 2015; Hart 1992; Zeifman and Roth 2011; Zoha et al. 2012); these categories are presented in Table 1.

Table 1 Appliance categories

These appliances are further divided into event-based and eventless appliances. An event is a transition from one state to the next or the turning on or off. Event-based appliances include Type I and II devices. Eventless devices are Type III and IV. On the consumption side, changes in the status of the latter two device types cannot usually be recognized or clearly defined.

2.5 Event-Driven Versus Eventless Approach

There are two approaches to solving the problem of detecting devices within the framework of NILM. As already described, one way is to search for events in the incoming signal. This path requires a suitable event detection method, which must be adapted to the respective situation in the network. The approach of wanting to recognize devices without an event usually amounts to an algorithm based on a Hidden Markov Model. Both approaches have their areas of application.

Event-Driven Approach

In the event-driven approach, an event detection method is used. It aims to identify the events that occur as precisely as possible. Events are changes in the signal in the classic sense, which lead from one steady state to another. How well such an algorithm can work depends on the device constellation within an electrical network. There are very power-intensive consumers as well as small consumers. If only large consumers are to be recognized in a network, there is a good chance that the algorithm will produce good results. The same holds for a network in which only small consumers appear. Two quantities limit the result. On the one hand, there is the strength of the noise in the network; on the other hand, there is the power difference of the smallest event. If the background noise is already greater than the smallest event, that event cannot be detected with certainty. The same problem also comes into play when large and small consumers share one network. Small events then threaten to be lost in the dynamic behavior of large devices.

There are different approaches to implementing event detection algorithms. These approaches range from threshold-based methods and statistical tests to neural networks (Held et al. 2018b; Lu and Li 2020; Wild et al. 2015; Yang et al. 2020).

The F1 score can be used to validate the quality of such an event detection algorithm. This is based on the key figures of a binary classifier.

Another key figure is true negative, which is not determined in this specific problem.

The three values of Table 2 are used to calculate precision (Eq. 2) and recall (Eq. 3):

$$precision: = \frac{TP}{{TP + FP}}$$
(2)
$$recall: = \frac{TP}{{TP + FN}}$$
(3)
Table 2 Key figures of a binary classifier

Precision indicates what proportion of the events reported by the event detector were actual events. In contrast, recall expresses what proportion of the events that actually occurred were detected.

Both precision and recall give a value between 0 and 1 for the quality of the event detector. The F1 score combines these two values in a common quality criterion. The F1 score (Eq. 4) is the harmonic mean of recall and precision.

$$F_{1} : = 2 \cdot \frac{precision \cdot recall}{{precision + recall}} = \frac{TP}{{TP + \frac{1}{2}\left( {FP + FN} \right)}}$$
(4)

With the help of the F1 score, the quality of the event detector can be expressed in just one number, which is advantageous for comparability and optimization.
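Eqs. 2 to 4 translate directly into code. The sketch below is a minimal example; the counts are hypothetical, and returning 0 when a denominator is zero is an assumed convention.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision (Eq. 2), recall (Eq. 3) and the F1 score (Eq. 4)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Hypothetical detector result: 8 correct detections, 2 false alarms, 4 missed events.
precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=4)
```

With these counts, precision is 0.8, recall is 2/3, and the F1 score is 8/11, in agreement with the second form of Eq. 4.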

Eventless Approach

With the eventless approach, a different strategy is followed than paying attention to individual abrupt changes in the signal. Algorithms based on factorial Hidden Markov Models (fHMM) (Ghahramani and Jordan 1995) are used here. These models use previously determined probabilities to estimate which devices are involved in the current overall signal and in what form.

The Hidden Markov Models (HMM), on which the fHMM is based, are state machines whose states are not directly accessible. However, it is known from which state Si the model can move into which other state Sj (transition) and with what probability this occurs. This results in the transition matrix A. The observable outputs Yt of the HMM at the time t are called emissions, and the relationship between the respective state S and the emission Y is described by the emission matrix B. Yt takes a value from a predefined set of observations O := [O1; O2; …; OM]. Since the states Si cannot be observed directly, they are referred to as hidden states. An example of an HMM can be seen in Fig. 1.

Fig. 1
Two diagrams represent an example structure of a Hidden Markov model and a simplified structure of a Hidden Markov model. The hidden states and emission states are exhibited in each structure.

Hidden Markov Model

The fHMM in turn is a combination of many such HMMs. It is assumed that the individual HMMs are independent of each other. This results in an observable emission Yt for a time t and a resulting state vector St := [St(1), St(2), …, St(N)]. Transferred to NILM, each of the N devices is in a state St(i) at any point in time, and Yt is the quantity to be observed, e.g. the total power of the network. An exemplary representation can be seen in Fig. 2.

Fig. 2
Two diagrams represent an example of the timing behavior and a simplified structural description. Diagram a denotes the appliances 1 to n and the emission from the top to the bottom layer. Diagram b indicates the H M Ms for 4 appliances 1, 2, 3, and n from left to right, pointing arrows to the emission at the top.

Factorial Hidden Markov Model

The approach requires an electrical network in which all consumers are known. The state of each consumer may vary. A separate HMM describes which states a consumer can assume. For this description, each device must first be examined for the number of possible states. This is followed by the development of a suitable statistical model for each device. In the fHMM, the individual HMMs are then combined into a common model.
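The combined state space of such a model can be sketched as follows. Each appliance has its own set of states, the joint state is one state per appliance, and the emission is taken to be the sum of the state power levels. The appliance names, states, and power values are hypothetical. The lookup `closest_joint_state` is a deliberate simplification: it ignores transition and emission probabilities and only illustrates how a joint state explains an observed total.

```python
from itertools import product

# Hypothetical per-appliance state power levels in watts (illustrative only).
appliance_states = {
    "lamp": {"off": 0, "on": 60},
    "kettle": {"off": 0, "on": 2000},
    "washer": {"off": 0, "heat": 1800, "spin": 300},
}

def joint_states(appliance_states):
    """Yield every joint state vector together with its summed emission."""
    names = list(appliance_states)
    for combo in product(*(appliance_states[name] for name in names)):
        state_vector = dict(zip(names, combo))
        emission = sum(appliance_states[n][s] for n, s in state_vector.items())
        yield state_vector, emission

def closest_joint_state(observed_power, appliance_states):
    """Pick the joint state whose emission is closest to the observed total power."""
    best = min(joint_states(appliance_states),
               key=lambda item: abs(item[1] - observed_power))
    return best[0]

estimate = closest_joint_state(2065, appliance_states)
```

Even this small example has 2 · 2 · 3 = 12 joint states; the number grows multiplicatively with every additional appliance.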

The main difficulty in using fHMMs is creating the individual HMMs for the devices. For this purpose, the transition matrix and the emission matrix are calculated in a training phase. If only little is known about the devices, state estimation methods can be used (Egarter et al. 2015). The identification of the individual states within the framework of the classification is often solved with the Viterbi algorithm (Yang et al. 2021).
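For a single HMM with discrete emissions, the Viterbi algorithm can be sketched as below. The two-state appliance model, its probabilities, and the quantization of the power into "low" and "high" are assumptions for illustration and are not taken from the cited works. For long sequences, an implementation would typically work with log-probabilities to avoid numerical underflow.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden state sequence of a discrete HMM."""
    # best[t][s] = (probability of the best path ending in s at time t, predecessor)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        prev = best[-1]
        column = {}
        for s in states:
            column[s] = max(
                (prev[r][0] * trans_p[r][s] * emit_p[s][obs], r) for r in states
            )
        best.append(column)
    # Backtrack from the most probable final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    return path[::-1]

# Hypothetical two-state appliance; observations are quantized power levels.
states = ("off", "on")
start_p = {"off": 0.8, "on": 0.2}
trans_p = {"off": {"off": 0.9, "on": 0.1}, "on": {"off": 0.2, "on": 0.8}}  # matrix A
emit_p = {"off": {"low": 0.9, "high": 0.1}, "on": {"low": 0.2, "high": 0.8}}  # matrix B
path = viterbi(["low", "low", "high", "high", "low"], states, start_p, trans_p, emit_p)
```

For this observation sequence, the decoded path is off, off, on, on, off.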

Use of Both Approaches

The eventless approach is particularly suitable for devices with very slow state changes. In addition, only a low sampling frequency is required, since the state changes do not have to be specifically detected. The eventless approach relies on probabilities to find the combination of devices that causes the current total power. However, the fHMM approach is not suitable when several devices of the same or a similar type have to be told apart. In this case, an event-driven approach with a high sampling rate is advantageous because it separates event detection from classification. A specialized algorithm can then handle the device recognition, which in this case leads to more precise results.

The eventless approach is suitable for appliances of types I, II, and IV. Since devices of type IV have no or only very rare events, it is difficult to recognize them with the event detection method. Additional boundary conditions would have to be inserted for this. With the eventless approach, appliances of type IV can simply be taken into account like all other devices. However, in the form presented here, both approaches have problems with type III. Continuously varying power levels cannot be assigned to a state with certainty and are probably only partially recognizable with the eventless approach, especially in a dynamic phase.

3 Measurement Systems

In many works on NILM, Smart Meter recordings are used as the data source. Moreover, a lot of the research work is based on recorded data sets and focuses on the development of algorithms. In other research work, special hardware solutions are developed in-house. This section deals with the different sensors and measurement hardware. In addition, a few existing and publicly accessible datasets are presented.

3.1 Sensors for Non-intrusive Load Monitoring

Non-intrusive load monitoring begins with a measurement signal. This signal is measured by sensors. Classically, NILM measuring devices rely on a contactless measuring method. Various current measurement methods are presented below, which can be found in smart meters and NILM measuring devices.

Shunt Resistor

The simplest measuring principle to measure the electric current is a shunt resistor. The shunt resistor is installed in series in the current path. The electrical current can then be measured as voltage VS through the contact points using Ohm’s law and the defined resistance RS. The basic measuring principle is shown in Fig. 3.

Fig. 3
A circuit diagram contains shunt resistors R L and R S connected to the voltage source V S.

Current measurement principle with a shunt resistor

For the measurement, the existing network must be interrupted at one point so that the resistor can be inserted. RS must be as small as possible so that its influence on the existing network is minimal and it can be neglected relative to the resistance RL of the actual consumers. Then Eq. 5 applies to the current measurement accordingly.

$$i\left( t \right) = \frac{{V_{S} \left( t \right)}}{{R_{S} }}$$
(5)

The current is calculated with sufficient accuracy using an amplifier circuit and an evaluation circuit. This measuring principle can be found in part in smart meters. It works with both direct current and alternating current.
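As a short worked example of Eq. 5, assume a shunt of RS = 1 mΩ and a measured voltage of VS = 10 mV (both values are illustrative). The power PS dissipated in the shunt is a symbol introduced here and shows why RS should be small:

$$i = \frac{{V_{S} }}{{R_{S} }} = \frac{{10\;{\text{mV}}}}{{1\;{\text{m}}\Omega }} = 10\;{\text{A}},\quad P_{S} = i^{2} \cdot R_{S} = \left( {10\;{\text{A}}} \right)^{2} \cdot 1\;{\text{m}}\Omega = 0.1\;{\text{W}}$$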

Current Transformer

Split-core (clamp-on) current transformers are based on the principle of electromagnetic induction around a current-carrying conductor. Two coils are wound on a ferromagnetic core. The primary side is the input side with the current-carrying conductor that is to be measured. On the output side, the secondary side, there is a defined number of turns and the terminals of the measuring line. This measuring principle is shown schematically in Fig. 4.

Fig. 4
A schematic diagram represents the structure of a transformer with a primary side, secondary side, and core region. The current flowing from the primary side is I p and from the secondary side, is I s.

Current measurement principle with a current transformer

If the number of turns N1 and N2 is known, the unknown primary current IP can be measured via the secondary current IS. Then Eq. 6 applies.

$$I_{P} = \frac{{N_{2} }}{{N_{1} }} \cdot I_{S}$$
(6)

The electrical input current can be easily determined by means of an evaluation circuit on the secondary side. The relationship between the input and output sides is usually given in the datasheet as a ratio of the current ranges. With some current transformers, such as split-core transformers, the core can be opened and thus fitted around a conductor. This conductor then passes through the core only once and thus has the number of turns N1 = 1.
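As a worked example of Eq. 6, assume a split-core transformer with N1 = 1, N2 = 1000 turns on the secondary side, and a measured secondary current of IS = 5 mA (illustrative values, not taken from a specific datasheet):

$$I_{P} = \frac{{N_{2} }}{{N_{1} }} \cdot I_{S} = \frac{1000}{1} \cdot 5\;{\text{mA}} = 5\;{\text{A}}$$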

Rogowski Coil

The Rogowski coil can also be used to measure current. This is an annular air-core coil without a metal core, as shown in Fig. 5. The coil is passed through a ring and wound back around the ring. The beginning and the end of the ring are not tightly closed. If the ring ends are placed close enough together, the inhomogeneity of the magnetic field can be neglected.

Fig. 5
An illustration of the Rogowski coil. The radius is marked as r. It indicates current and voltage as i t and v t.

Current measurement principle with a Rogowski coil

The Rogowski coil is placed as a ring around a current-carrying conductor. The current-carrying conductor generates a magnetic field in which the Rogowski coil is now located. Due to the magnetic coupling, a voltage is induced in the Rogowski coil, which can be measured at its open ends. This voltage is proportional to the rate of change of the current, so it can be related to the current i(t) to be determined by means of an amplification and evaluation circuit that integrates the signal. The measuring principle using the Rogowski coil only works with alternating current, since it is the change in current that induces the voltage.
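Because the coil voltage follows the change in current, the current has to be recovered by integration. The sketch below does this numerically under the idealized assumption v(t) = M · di/dt, where the mutual inductance M, the sign convention, and all numerical values are assumptions for illustration. Real evaluation circuits often integrate in analog hardware and have to deal with offset drift.

```python
import math

def integrate_rogowski(voltage, dt, mutual_inductance, i0=0.0):
    """Recover i(t) from the coil voltage v(t) = M * di/dt with the trapezoidal rule."""
    current = [i0]
    for k in range(1, len(voltage)):
        increment = 0.5 * (voltage[k] + voltage[k - 1]) * dt / mutual_inductance
        current.append(current[-1] + increment)
    return current

# Synthetic example: 50 Hz current with 10 A amplitude, hypothetical M = 1 uH.
M = 1e-6
DT = 1e-5
OMEGA = 2 * math.pi * 50.0
# Coil voltage for i(t) = 10 * sin(OMEGA * t), sampled over two mains periods.
coil_voltage = [M * 10.0 * OMEGA * math.cos(OMEGA * k * DT) for k in range(4001)]
current = integrate_rogowski(coil_voltage, DT, M)
```

After a quarter period (sample 500, t = 5 ms) the reconstructed current is close to the 10 A peak, and after half a period it returns to approximately zero.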

Hall Sensors

Another way to measure the current without contact is with a Hall sensor. Here, the Hall effect is used. The basic measurement principle is shown in Fig. 6.

Fig. 6
Two schematic diagrams represent the structure of the hall sensor principle and the hall effect on the hall probe. A. The Hall sensor principle indicates a core region, a hall probe, and the air gap region with the current flow. B. The circuit diagram indicates a hall probe with the current flow between two regions.

Current measurement principle with a Hall sensor

The conductor to be examined, which carries the current IM, is led through a ferromagnetic core. This core has an air gap in which the Hall probe is located. The current IM generates a magnetic field in the core, which passes perpendicularly through the Hall probe. A known current IH flows through the Hall probe itself. The magnetic field exerts a Lorentz force FL on the moving charge carriers of this current. This Lorentz force separates the charge carriers in the Hall probe, which is the Hall effect. The charge separation creates an electrical field proportional to the magnetic field, which can be measured as a Hall voltage VH. Because the dimensions of the core, the air gap, the material of the Hall probe, and the current through the Hall probe are known, the current IM can be calculated using the measured voltage VH.
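Under textbook idealizations, this calculation can be written in closed form. The symbols introduced here are not part of the original description: n is the charge carrier density of the probe material, q the elementary charge, d the probe thickness in the direction of the field, δ the length of the air gap, and B the flux density in the gap. The relations assume a single conductor pass, a core permeability much larger than that of air, and no fringing or saturation:

$$V_{H} = \frac{{I_{H} \cdot B}}{n \cdot q \cdot d},\quad B \approx \frac{{\mu_{0} \cdot I_{M} }}{\delta }\quad \Rightarrow \quad I_{M} \approx \frac{n \cdot q \cdot d \cdot \delta }{{\mu_{0} \cdot I_{H} }} \cdot V_{H}$$

In practice, the proportionality factor between VH and IM is usually determined by calibration rather than computed from these quantities.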

Comparison

Each measurement method has its advantages and disadvantages. Shunt resistors, for example, are very cheap and have a simple circuit design. However, they must be inserted into the current path and cannot be attached to a conductor without contact. The various measuring methods also work with different degrees of accuracy. A study from 2016 (Leferink et al. 2016) showed that, depending on the type and behavior of the electrical consumers, different measurement methods can produce massively different measurement results. The Rogowski coils, in particular, sometimes showed considerable deviations when there were rapid load changes. Conversely, this effect turned out to be a positive property for event detection and classification in the context of NILM in Held et al. (2020), especially when it comes to distinguishing between very similar devices.

3.2 Measurement Hardware

Developing your own hardware is a very time-consuming process and is therefore not always the first choice. Depending on the goal of the research, the focus can be solely on developing algorithms or on setting up an entire NILM measurement system. Many researchers therefore work with public data sets, which are now available in large numbers. Works that rely on in-house measurements either try to use existing solutions or develop specific hardware to meet their own needs.

Smart Meter Based Data Acquisition

Developing your own hardware is often expensive and time-consuming, and it may not even be the focus of the research work. Many publications therefore use a direct connection to a Smart Meter. From a pragmatic point of view, every household will one day be equipped with a smart meter. Many Smart Meters offer several interfaces through which the measurement data can be accessed directly. One advantage is that there is no need to develop and calibrate your own measuring unit. A disadvantage may be the low sampling rate. While the energy companies receive the measured values at intervals of several minutes to one hour (Adabi et al. 2016; Liang et al. 2019), the measured values can be queried directly on the device via interfaces such as Modbus, at intervals ranging from a few minutes down to about one second, i.e. around 1 Hz (Bousbiat et al. 2020; Raiker et al. 2018). Furthermore, the output format cannot be freely selected. A Smart Meter provides power values, not voltage and current. Accordingly, the possibilities in terms of feature extraction are limited. However, for use cases in which such low sampling rates and the power values are sufficient, Smart Meters offer a simple and stable basis for data acquisition. In some cases, the smart meters are supplemented by other elements such as openHAB for data logging and evaluation (Bousbiat et al. 2020). The development of a NILM measuring system based on Smart Meters has a few limitations as well as decisive advantages. The hardware already exists, no calibration of the sensors is required, and the downstream hardware or software can be kept very simple. It is therefore understandable why this path is followed in many research projects.

Self-developed Hardware for Data Acquisition

Smart Meters are a real alternative for data collection. However, if special questions require a high sampling rate, for example, or if the goal in an industrial context is to take a closer look at several machine systems, Smart Meter-based hardware approaches may no longer be so suitable. There are attempts to carry out a disaggregation with low sampling rates (Liang et al. 2019). However, it could also be shown that different devices cannot be distinguished with slowly sampled signals, while they can be distinguished without any problems with a higher sampling rate (Adabi et al. 2016). Furthermore, the features to be used have a significant impact on which sampling frequency is required (Dinesh et al. 2016; Zeifman and Roth 2011). In the case of in-house developments, a distinction can be made between FPGA-based solutions (Barbero et al. 2020; Cardenas et al. 2016; Trung et al. 2012) and those with microcontrollers (Shiddieqy et al. 2021; Yaemprayoon et al. 2016). In addition to the goal of developing a real-time NILM solution, there are also pure data loggers (Kolter and Johnson 2011). The data loggers in particular are needed to be able to record new data sets for algorithm development. Here, the disaggregation takes place separately from the hardware.

3.3 Public NILM Datasets

Developing your own hardware is a separate topic that can take a lot of time and effort. Since there is no off-the-shelf development hardware for NILM, you have to develop it yourself. Smart Meters already offer an almost generic way to produce your own measurement data, but setting up your own measurement scenario is also a very time-consuming undertaking. By now, there are many different data sets in the research field, which cover a wide variety of goals. The data sets differ, among other things, in the sampling frequency, the signal form, the length of the sequence, and the devices used. Pereira and Nunes (2018) provide a broad overview of many data sets. A distinction can be made between datasets recorded in real households and datasets generated under laboratory conditions. Representatives of household datasets are REDD (Kolter and Johnson 2011), BLUED (Anderson et al. 2012), UK-DALE (Kelly and Knottenbelt 2015) and ECO (Beckel et al. 2014). Representatives of laboratory datasets are WHITED (Kahl et al. 2016), COOLL (Picon et al. 2016), and PLAID (Gao et al. 2014). A problem with existing datasets is that often only single-device measurements or only aggregated measurements are available. The dataset HELD1 (Held et al. 2018a), which was also generated under laboratory conditions, was developed to combine training and test sequences in a common dataset. Another particular dataset is HELD2 (Weißhaar et al. 2020). Only the individual measurements have been included here. The aggregated datasets were generated synthetically from the individual measurements under defined conditions. HELD2 is the first simulation data set for NILM.

4 Feature Extraction for the Appliance Classification

Good device recognition results can only be achieved later with well-prepared features. Therefore, feature extraction with its many possibilities is a topic that should not be neglected.

4.1 Steady State and Transient State

The goal of NILM is to detect devices and their states in a current or power signal. In the event-based approach, two phases of a device can be distinguished in the signal: the steady state and the transient state.

The steady state describes the phase in which the signal behaves quasi-statically, i.e. does not experience any state change. Applied to a simple device, this is either the on or the off state. The transient state designates the period in which the signal changes while it passes from one steady state to the next. The concepts of steady state and transient state are elementary for event-based feature extraction, since it does not process the entire signal but only event-centric sections.
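The distinction between the two phases can be illustrated with a minimal sketch. It assumes a sampled power signal and labels a sample as transient when the change to the previous sample exceeds a threshold. The function names, the threshold of 5 W, and the synthetic signal are assumptions made for illustration and are not taken from the cited works.

```python
def label_states(power, threshold=5.0):
    """Label each sample as 'steady' or 'transient'.

    A sample counts as transient when the absolute change to the previous
    sample exceeds `threshold` (an assumed value in watts).
    """
    labels = ["steady"]  # the first sample has no predecessor
    for prev, cur in zip(power, power[1:]):
        labels.append("transient" if abs(cur - prev) > threshold else "steady")
    return labels


def find_transients(labels):
    """Return (start, end) index pairs (end exclusive) of transient runs."""
    runs, start = [], None
    for idx, label in enumerate(labels):
        if label == "transient" and start is None:
            start = idx
        elif label != "transient" and start is not None:
            runs.append((start, idx))
            start = None
    if start is not None:  # signal ends inside a transient
        runs.append((start, len(labels)))
    return runs


# Synthetic aggregate power: off state, switch-on ramp, on state.
power = [0.0, 0.5, 0.2, 40.0, 98.0, 100.0, 100.3, 99.8]
labels = label_states(power)
transients = find_transients(labels)
print(transients)
```

Everything outside the returned index ranges is treated as steady state. Real event detectors are usually more robust against noise than this simple threshold.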

4.2 Extraction of Features in NILM

Hart started with admittance as a signal form (Hart 1992), but also described several other signal forms that are suitable for processing in NILM. For the features themselves, a basic distinction is made between steady-state features and transient-state features. Steady-state features are formed by comparing two steady states. In the simplest case, the signal over a defined time window before a transient state is subtracted from the signal over a window of the same length after the transient state. The difference then forms a possible steady-state feature. Transient features, in contrast, simply use the section of the signal that lies within the transient state.
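A small sketch of both feature types follows. It assumes that the transient boundaries are already known from an event detector and, as a simplification, compares the window means of a power signal instead of complete signal curves. The function names and the example values are hypothetical.

```python
from statistics import mean


def steady_state_delta(signal, t_start, t_end, window):
    """Steady-state feature: mean after the transient minus mean before it.

    t_start is the first transient index, t_end the first index after it.
    """
    before = signal[max(0, t_start - window):t_start]
    after = signal[t_end:t_end + window]
    if not before or not after:
        raise ValueError("not enough steady-state samples around the transient")
    return mean(after) - mean(before)


def transient_feature(signal, t_start, t_end):
    """Transient feature: the raw signal section during the transient."""
    return signal[t_start:t_end]


# Synthetic power signal with one switch-on event between index 4 and 6.
power = [0.0, 0.0, 0.0, 0.0, 40.0, 98.0, 100.0, 100.0, 100.0, 100.0]
delta_p = steady_state_delta(power, t_start=4, t_end=6, window=3)
shape = transient_feature(power, t_start=4, t_end=6)
print(delta_p, shape)
```

A positive difference indicates a switch-on event and a negative difference a switch-off event.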

The literature provides many different options for choosing the appropriate feature. Zhang and Zhu (2019) compared various steady-state features and transient state features. The active power P and the reactive power Q are given here as the simplest form, which can be found in numerous publications. A promising feature is the V-I-trajectory over a signal period. It provides a strong separability of devices. Other options are the harmonics of the current signal, which can be calculated using Fast Fourier Transform, and the waveform of the current signal itself. In the transient state features area, the instantaneous power and the instantaneous current waveform are listed as possible options. The S-Transform (Martins et al. 2012) offers another transient state feature. In the area of the steady-state features, there are also the wavelet transform (Zoha et al. 2012) and eigenvalues (Liang et al. 2010).
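As an illustration of the simplest features, the following sketch computes the active power P, the reactive power Q of the fundamental, and the harmonic amplitudes of the current from exactly one signal period. It uses a direct discrete Fourier transform instead of an FFT to stay dependency-free. The function names and the synthetic waveforms are assumptions, and the sign convention (positive Q for a lagging current) is a choice made here.

```python
import cmath
import math


def dft_bin(x, h):
    """Complex amplitude of harmonic h for one period x (abs() = peak value)."""
    n = len(x)
    return 2.0 / n * sum(
        x[m] * cmath.exp(-2j * math.pi * h * m / n) for m in range(n)
    )


def active_power(v, i):
    """P as the mean of the instantaneous power over one period."""
    return sum(a * b for a, b in zip(v, i)) / len(v)


def reactive_power_fundamental(v, i):
    """Q of the fundamental from the peak phasors: Im(0.5 * V * conj(I))."""
    v1, i1 = dft_bin(v, 1), dft_bin(i, 1)
    return 0.5 * (v1 * i1.conjugate()).imag


def harmonic_amplitudes(x, count):
    """Peak amplitudes of harmonics 1..count."""
    return [abs(dft_bin(x, h)) for h in range(1, count + 1)]


# Synthetic period: 325 V peak voltage, current lagging by 30 degrees
# with an additional third harmonic of 0.5 A.
N = 64
theta = [2.0 * math.pi * m / N for m in range(N)]
v = [325.0 * math.sin(t) for t in theta]
i = [2.0 * math.sin(t - math.pi / 6) + 0.5 * math.sin(3 * t) for t in theta]

p = active_power(v, i)
q = reactive_power_fundamental(v, i)
harmonics = harmonic_amplitudes(i, 3)
print(round(p, 2), round(q, 2), [round(a, 3) for a in harmonics])
```

The sketch presumes that the input covers one full period with a constant number of samples, which measured mains signals do not guarantee.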

In addition to the classic features, which are ultimately based on the current and voltage signal, there are further investigations that involve additional sensors. Here, for example, the temperature (Morán et al. 2020) can be found as an additional signal. Light intensity, acceleration sensors, acoustic sensors, and other environmental sensors are described in Bergés et al. (2010) as possible additions.

The selection of a suitable feature or feature set depends on whether the computing intensity plays a role, what data is available, whether the hardware can be expanded, whether additional sensors can be installed, and how these different sources can be combined. In general, only the steady-state features remain when low sampling rates are used. The transient states are often short, so capturing their information content requires a high sampling rate. This in turn places higher demands on the hardware, because high sampling rates also mean more data to be processed. In many different works, the classic features have given good results. Depending on the question, other sensors could provide useful support.

5 Frequency Invariant Transformation of Periodic Signals

Frequency Invariant Transformation of Periodic Signals (FIT-PS) (Held et al. 2016, 2019a) is an algorithm developed for NILM. FIT-PS can be applied to discretely sampled periodic signals. The idea behind FIT-PS is to generate a multi-dimensional signal out of a one-dimensional signal. This makes it possible to detect changes in the signal at specific points in time over a period of time. The motivation for the FIT-PS transform is that sampled voltage signals are always subject to a certain scatter in the period duration since the mains frequency f0 is not constant. As a result, the number of sampling points per period is not always the same. This also results in a slight difference within a period from other periods at the time the sample was drawn. So, the signal has a certain frequency dependency. Using FIT-PS this frequency dependence is eliminated by interpolation and each period has a constant number of sampling points.

The FIT-PS transform can be described as follows:

$$FITPS:R^{L} \to R^{K \cdot N}$$

Here, L represents the length of the discretely sampled signal. The parameter K represents the number of periods of the transformed signal, and the parameter N represents the number of sampling points per period in the transformed signal. The degree of freedom of the parameter N makes it possible to choose the dimensionality of the transformed signal yourself within certain limits. Only the sampling rate of the original signal limits how large N can be chosen, considering the Nyquist–Shannon sampling theorem.

The FIT-PS transform is performed in several consecutive steps. First, a resampling takes place, in which the original signal is gradually converted into a signal with newly calculated sampling points (Eq. 7).

$${\text{Resampling}}:S_{org} \to S_{R}$$
(7)

To this end, the trigger signal used to determine the period changes is defined first. With NILM, the voltage is selected as the trigger signal. In a normal power grid, it is assumed that the voltage signal has a sinusoidal curve and that there is a clear point in time for the period change. The time stamp of the beginning of the period is determined by linear interpolation. The same happens with the time of the end of the period. N − 1 equidistantly calculated points in time are defined between these time stamps. This results in exactly N points in time for each period. At each point in time, the interpolated sample point is calculated using the closest sample points from the original signal. This results in an N-valued vector per voltage period. Equation 8 shows the kth period vector extracted from the resampling signal SR.

$$P_{k} := \left( {\begin{array}{*{20}c} {S_{R} \left[ {k \cdot N + 1} \right]} \\ {S_{R} \left[ {k \cdot N + 2} \right]} \\ \vdots \\ {S_{R} \left[ {\left( {k + 1} \right) \cdot N} \right]} \\ \end{array} } \right),\quad k \in \left\{ {0, \ldots ,K - 1} \right\}$$
(8)

Each vector Pk is appended as a row vector to the transformed voltage signal, so the FIT-PS signal in Eq. 9 results in a matrix.

$$S_{FITPS} := \left( {\begin{array}{*{20}c} {P_{1}^{{\prime}} } \\ {P_{2}^{{\prime}} } \\ \vdots \\ {P_{k}^{{\prime}} } \\ {P_{k + 1}^{{\prime}} } \\ \vdots \\ \end{array} } \right)$$
(9)

In Fig. 7 the transformation of the original signal Sorg into the resampling signal form SR is sketched.

Fig. 7 Frequency-invariant resampling of the original signal (original and resampled signal, amplitude over the periods k − 1, k, and k + 1)

In the next step, the same interpolation is applied to the current signal at the previously determined times. This also results in a vector of length N for the current signal, which is appended to the transformed current signal as a row vector. In Fig. 8 this transformed signal matrix can be seen as a three-dimensional FIT-PS representation.

Fig. 8 Signal in FIT-PS representation (amplitude over the dimension for the signal periods k)
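The resampling steps can be sketched in a few lines. This is a simplified illustration under stated assumptions and not the reference implementation of Held et al.: the period change is taken to be the rising zero crossing of the voltage, each period is sampled at its start time plus N − 1 further equidistant times before its end, and all values are obtained by linear interpolation between the two closest samples. All function and variable names are hypothetical.

```python
import math


def rising_zero_crossings(v):
    """Fractional sample indices where v crosses zero upwards."""
    crossings = []
    for m in range(len(v) - 1):
        if v[m] < 0.0 <= v[m + 1]:
            # linear interpolation between the two neighbouring samples
            crossings.append(m - v[m] / (v[m + 1] - v[m]))
    return crossings


def interpolate(x, t):
    """Linear interpolation of x at the fractional index t."""
    lo = int(math.floor(t))
    hi = min(lo + 1, len(x) - 1)
    frac = t - lo
    return x[lo] * (1.0 - frac) + x[hi] * frac


def fit_ps(v, i, n_points):
    """Return voltage and current matrices with one row per mains period."""
    crossings = rising_zero_crossings(v)
    v_rows, i_rows = [], []
    for start, end in zip(crossings, crossings[1:]):
        step = (end - start) / n_points
        times = [start + j * step for j in range(n_points)]
        v_rows.append([interpolate(v, t) for t in times])
        i_rows.append([interpolate(i, t) for t in times])  # same time stamps
    return v_rows, i_rows


# Synthetic test signal: 49.7 Hz mains sampled at 4 kHz, which gives a
# non-integer number of samples per period (about 80.5).
FS, F0, L = 4000.0, 49.7, 400
phase = [2.0 * math.pi * F0 * m / FS - 0.5 for m in range(L)]
voltage = [math.sin(p) for p in phase]
current = [2.0 * math.sin(p - math.pi / 4) for p in phase]

v_rows, i_rows = fit_ps(voltage, current, n_points=32)
print(len(v_rows), len(v_rows[0]))
```

Each row of the result now starts at the same relative point of the voltage period, regardless of the actual mains frequency. Choosing n_points larger or smaller than the original number of samples per period corresponds to upsampling or downsampling.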

Another form of representation is a heat map, as can be found in Held et al. (2019a), where the amplitude of the signal determines the color. The resulting transformed signals now have a fundamental frequency that corresponds exactly to the ideal mains frequency f0. This means that the same relative points in time within the different periods can now be compared directly. This also makes it possible, for example, to use algorithms from image processing. Likewise, phase changes in the signal become visible. Another special feature of the FIT-PS algorithm is that it can also be used for upsampling and downsampling. The resampling frequency is determined by selecting the parameter N. This has advantages, for example, when different data sets are to be combined and compared, as shown in Held et al. (2019b).

6 Artificial Intelligence for Appliance Classification

Artificial intelligence plays a central role in NILM. There are approaches in which the overall signal is evaluated, and the event detection and classification are carried out together. Other approaches take the recognized events as given and only examine the relevant signal sections for the consumers they contain. When detecting appliances using artificial intelligence, a distinction can be made between supervised and unsupervised learning approaches.

6.1 Supervised Learning

Supervised learning requires training and test data. A specific machine learning problem is solved here by training a machine learning model with existing data. Many publications use simple machine learning methods and still achieve good results with them. These simple classification methods include k Nearest Neighbors (kNN), Support Vector Machine (SVM), Decision Tree, Naive Bayes, and Random Forest (Gurbuz et al. 2021; Lin and Tsai 2011; Weißhaar et al. 2018). They also serve as good baselines for more complex machine learning models. Problem-solving using neural networks is en vogue these days. Accordingly, there is a lot of research work on it. All types of neural networks can be found, such as back propagation neural networks, recurrent neural networks (RNN), convolutional neural networks (CNN), as well as more advanced forms such as long short-term memory (LSTM) and others (Ciancetta et al. 2021; Held et al. 2019a; Le et al. 2016; Wang and Yin 2017). Techniques such as transfer learning (Devlin and Hayes 2019; D’Incecco et al. 2020; Zhou et al. 2021), which can significantly accelerate the learning process, are also considered in research. A problem in the context of NILM is the lack of data, which makes it difficult to train deep neural networks with good results and increases the risk of overfitting. A possible solution is the use of data augmentation (Rafiq et al. 2021).

Both the simple machine learning algorithms and the neural networks deliver good results in the problems they examine. This is mostly because they are trained on a specific problem. The limits of the supervised learning approach come to light with problems such as newly added devices. Here, supervised learning has to be left behind in favor of unsupervised learning.
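How such a simple classifier works on event features can be sketched with a minimal k Nearest Neighbors implementation. The features are assumed to be the steps in active and reactive power of a switch-on event. The training values and device labels are invented for illustration and do not come from any of the cited datasets.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k closest training features."""
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical labelled switch-on events as (delta P in W, delta Q in var).
train_x = [
    (2000.0, 5.0), (1980.0, 8.0), (2020.0, 3.0),   # kettle
    (110.0, 90.0), (125.0, 85.0), (118.0, 95.0),   # fridge
    (60.0, 2.0), (58.0, 1.0), (62.0, 3.0),         # lamp
]
train_y = ["kettle"] * 3 + ["fridge"] * 3 + ["lamp"] * 3

prediction_a = knn_predict(train_x, train_y, (1950.0, 10.0))
prediction_b = knn_predict(train_x, train_y, (120.0, 80.0))
print(prediction_a, prediction_b)
```

The sketch omits feature scaling, which matters in practice because the value ranges of the features differ considerably. A device that is not part of the training data is still assigned to one of the known labels.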

6.2 Unsupervised Learning

Unsupervised learning takes a completely different perspective on device detection. In this case, it is assumed that much or all of the information about the electrical network and its consumers is missing. Accordingly, strategies have to be applied to decompose the overall signal piece by piece and to identify the consumers contained therein. In an overview of unsupervised learning in the context of NILM, Bonfigli et al. (2015) offer the idea of carrying out this device detection in two steps. In the first step, individual loads are detected. This detection is still completely detached from a specific device assignment. The second step is clustering, in which the previously identified loads are assigned to a common source. This source then ultimately corresponds to a device. The first step is pursued using various fHMM-based approaches. The second step of actual device assignment is done either using various forms of matrix factorization or a genetic k-means clustering method. The approaches based on fHMMs are all eventless. However, there are also event-based unsupervised learning approaches that require as little prior knowledge as possible. Kamoto et al. (2017) use a Competitive Agglomeration (CA) algorithm in their work. This clustering method starts with too high a number of possible clusters and then optimizes this number. The advantage over many other clustering methods is that no a priori knowledge of the number of devices is required. The feature sequences are first extracted from the overall signal by an event detector and then fed into the CA. In the next step, the ON and OFF clusters that belong together are identified so that simple Type I appliances can be modeled from them. From the cluster pairs found in this way, the device is then recognized in the overall signal. With the help of these detected devices, the last step is to check whether the overall power can be reconstructed from the power of the individual devices.
A limitation of the method of Kamoto et al. (2017) compared to the approaches described in Bonfigli et al. (2015) is the general assumption that everything can be composed of simple devices. Each switch-on event is matched to a corresponding switch-off event. A problem could arise with finite state machines, which allow transitions between the individual states so that the states are no longer passed through in reverse order. Likewise, Type III devices with continuously variable power are not covered by this approach. In a direct comparison, the CA-based approach of Kamoto et al. (2017) performs better overall than fHMM approaches when evaluating part of the REDD dataset, according to their investigation.
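The idea of modeling Type I appliances from matching ON and OFF events and then checking the reconstruction can be illustrated with a strongly simplified sketch. It does not implement the Competitive Agglomeration clustering of Kamoto et al. (2017). Instead, it greedily pairs each switch-off step with the open switch-on step of the most similar height. The event list, the tolerance, and all names are assumptions.

```python
def pair_on_off(events, tolerance=10.0):
    """Pair switch-on and switch-off steps of similar height.

    events: list of (sample index, power step in W), sorted by index.
    Returns a list of (index on, index off, power) tuples.
    """
    open_on, pairs = [], []
    for time, delta in events:
        if delta > 0:
            open_on.append((time, delta))
            continue
        # open switch-on events whose height matches this switch-off step
        candidates = [e for e in open_on if abs(e[1] + delta) <= tolerance]
        if candidates:
            best = min(candidates, key=lambda e: abs(e[1] + delta))
            open_on.remove(best)
            pairs.append((best[0], time, best[1]))
    return pairs


def reconstruct(pairs, length):
    """Rebuild the aggregate power from the paired on/off intervals."""
    total = [0.0] * length
    for t_on, t_off, power in pairs:
        for t in range(t_on, t_off):
            total[t] += power
    return total


# Hypothetical detected events: two overlapping simple devices.
events = [(2, 100.0), (5, 2000.0), (9, -1995.0), (12, -102.0)]
pairs = pair_on_off(events)
estimate = reconstruct(pairs, length=15)
print(pairs)
```

Comparing the reconstructed signal with the measured overall power shows how much of the consumption the detected simple devices explain. Multi-state and variable-power devices are not covered, which mirrors the limitation described above.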

6.3 Semi-supervised and Online Learning

Compared to supervised learning approaches, unsupervised learning approaches are more generic. In supervised learning, a model is always optimized for a specific problem. This model is then complete. Subsequent addition or removal of a device will result in the model having to be retrained. Unsupervised learning approaches start from scratch and develop a model that is as suitable as possible in an existing environment. In order to be able to tackle the real problems with NILM in the future, one approach could be to develop methods that are less based on batch learning and involve more online learning. A parallel structure is conceivable here. A trained model performs the classification. Another model uses the incoming features to check whether the current clusters are still correct or whether a new cluster has to be established due to an added device. This newly defined cluster must then be integrated into the classification model via an update process, and existing clusters must be modified if necessary.
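One conceivable form of such an update process is sketched below. A nearest-centroid model classifies each incoming feature, adapts the matching centroid with a running mean, and opens a new cluster when the feature is too far from all known centroids. The class, the distance threshold, and the feature stream are hypothetical and only illustrate the principle. They do not reproduce any of the cited methods.

```python
import math


class OnlineCentroidModel:
    """Nearest-centroid classifier that adds clusters for unknown devices."""

    def __init__(self, new_cluster_distance):
        self.new_cluster_distance = new_cluster_distance
        self.centroids = {}  # label -> centroid feature vector
        self.counts = {}     # label -> number of assigned features

    def classify(self, feature):
        """Return the closest label and its distance (None if model is empty)."""
        if not self.centroids:
            return None, math.inf
        label = min(
            self.centroids, key=lambda l: math.dist(self.centroids[l], feature)
        )
        return label, math.dist(self.centroids[label], feature)

    def update(self, feature):
        """Classify the feature and adapt or extend the model."""
        label, distance = self.classify(feature)
        if label is None or distance > self.new_cluster_distance:
            # feature fits no known cluster: assume a newly added device
            label = f"device_{len(self.centroids)}"
            self.centroids[label] = list(feature)
            self.counts[label] = 1
            return label
        # running-mean update of the matching centroid
        self.counts[label] += 1
        n = self.counts[label]
        self.centroids[label] = [
            c + (x - c) / n for c, x in zip(self.centroids[label], feature)
        ]
        return label


model = OnlineCentroidModel(new_cluster_distance=50.0)
stream = [(100.0, 5.0), (104.0, 7.0), (2000.0, 10.0), (98.0, 4.0), (1990.0, 12.0)]
labels = [model.update(f) for f in stream]
print(labels)
```

The threshold decides when a feature counts as a new device. Choosing it too small splits one device into several clusters, and choosing it too large merges different devices.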

Semi-supervised and online learning approaches can already be found in various works. Egarter et al. (2015) describe an approach based on fHMM with particle filtering. Their presentation of the algorithm also describes how it can be supplemented with online learning. Salem and Sayed-Mouchaweh (2020) pursue a semi-supervised online learning approach based on a conditional HMM (CHMM) and the Expectation Maximization (EM) algorithm, in which an existing model is improved with continuous classifications.

7 Conclusion

This chapter gave an overview of Non-Intrusive Load Monitoring (NILM). The overall power is measured at a central node within an electrical network. This overall signal is then broken down into its components using various algorithms, and the power is assigned to the consumers. This process is called disaggregation. NILM can be realized with many methods. A standard processing chain was presented. Following this process chain, one of the first questions is how the measurement data is acquired. The measurement data are further processed, and various features can be extracted. Depending on the sampling rate of the measurement signal, low or high-frequency features must be paired with suitable event detection methods and classification methods. A fundamental distinction is made between event-based and eventless approaches. The eventless concept is usually based on factorial Hidden Markov Models. Various measurement principles were introduced which are suitable for a non-intrusive measurement. In addition to different approaches to the measurement hardware, publicly accessible data sets for NILM were presented. Various feature extraction methods used for event detection and classification have been shown. Additionally, the Frequency Invariant Transformation of Periodic Signals (FIT-PS) algorithm, developed for NILM, has been explained. Different artificial intelligence approaches for NILM were presented, divided into supervised and unsupervised learning. Finally, semi-supervised and online learning approaches with example implementations in NILM were shown.