A benchmark and survey of fully unsupervised concept drift detectors on real-world data streams

Lukats, Daniel; Zielinski, Oliver; Hahn, Axel; Stahl, Frederic

doi:10.1007/s41060-024-00620-y


Review
Open access
Published: 27 August 2024


International Journal of Data Science and Analytics


Abstract

Concept drift detection techniques can be used to discover substantial changes in the patterns encoded in data streams in real-time. If left unaddressed, these changes can render deployed machine learning models unreliable because their training data no longer matches the patterns present in the data stream. Most algorithms proposed in the literature depend on the immediate availability of ground truth class labels. This is unrealistic for many applications due to the associated cost of labeling. Therefore, this study reviews the availability of fully unsupervised concept drift detectors, which can operate entirely without labeled data. Ten algorithms, which fulfilled several inclusion criteria designed to ensure faithful and reliable implementations, are analyzed in terms of architectural choices, core ideas and assumptions about data. Seven of these algorithms are evaluated with common concept drift detection metrics on eleven real-world data streams; the remaining three were too slow or depended on chance. Based on the results of these experiments, three concept drift detectors (Discriminative Drift Detector, Image-Based Drift Detector and Semi-Parametric Log-Likelihood) can be recommended depending on the desired target metric. This study further reveals issues with the evaluation metrics Mean Time Ratio and lift-per-drift. Finally, it highlights open research challenges.


1 Introduction

A wealth of data is generated in real-time in the form of data streams [1, 2], e.g., in network traffic monitoring systems [3, 4], in internet-of-things networks [5] or in environmental observatories [6, 7]. Classic machine learning methods operate under the assumption that data is stationary, i.e., that the data a model is deployed on is similar to the data it was trained on. This assumption does not hold for long-running data streams, in which data is generated continuously and is therefore non-stationary. Instead, the patterns encoded in the incoming data may change in such a manner that deployed models can no longer provide reliable predictions, giving rise to a phenomenon called concept drift [8].

Various methods have been proposed to address the issue of concept drift, either by making predictive models themselves adaptive [9] or by detecting concept drifts with methods such as Drift Detection Method (DDM) [10] or Adaptive Windowing (ADWIN) [11] to allow manual adaptation. Most concept drift detectors proposed in the literature operate in a supervised manner; they monitor the predictive performance of a classifier deployed on the data stream. These detectors require immediate access to ground truth information about the class labels, which is unrealistic in many applications due to associated cost or limited accessibility. Any detector that detects concept drift without online access to class labels of the observed data is called unsupervised. In addition to detectors which do not require labeled data at any time, this term also applies to those concept drift detectors which require class labels in an offline pre-training phase and operate without labeled data once deployed on the data stream. The key distinction between these two approaches is that the latter methods assume that labels are available for pre-training and can be used for concept drift detection [12, 13].

Both supervised and unsupervised concept drift detectors can be used in diverse real-world applications, e.g., network intrusion detection [14, 15], spam detection [16], solar irradiance forecasting [17], landslide detection [18] or predictive maintenance [19].

Although many data streams are associated with classification tasks, which pre-trained methods such as L-CODE [12] or EMAD [13] can leverage, other data streams come entirely unlabeled or are otherwise not associated with any kind of classification task. For this reason, this study is concerned with fully unsupervised concept drift detectors. For example, data streams from coastal observatories provide information from different sensors in real-time [6, 7]. Although no classification tasks are associated with these data streams as of now, concept drift detection is desired for them to support scientific operations [20]. Various surveys addressing supervised concept drift detection are available [8, 21, 22, 23]. Barros and Santos benchmarked supervised concept drift detectors and ensembles thereof in two studies [24, 25]. Moreover, Gemaque et al. [26] provide a broader view on unsupervised methods, highlighting mostly those which require labeled data in an offline pre-training phase. A study by Suárez-Cetrulo et al. [27] reviews the state of concept drift detectors for recurring concept drift. Finally, Xiang et al. [28] review the state of deep learning methods for concept drift detection, which is sparsely covered in the other literature reviews.

Here, in addition to highlighting available methods from the literature, 7 detectors are implemented for evaluation on a choice of 11 real-world data streams. Common metrics from the literature are used in this study to evaluate the performance of the detectors: The proxy metrics classifier predictive performance [29] and lift-per-drift [30] evaluate a concept drift detector with the help of a classifier deployed on the data stream. Classifier predictive performance is known to be flawed, as there is a bias towards more frequent detection on several real-world data streams [29]. On one data stream, ground truth information about concept drifts is available, enabling the use of Mean Time Ratio, which directly assesses the detection rate, detection time and time between false alerts [29]. This study aims to address the following research questions: Which detectors are available in the literature and suitable for implementation? Which detectors detect concept drift well? Given the bias of classifier predictive performance, is lift-per-drift an unbiased proxy metric? Finally, implementing and benchmarking these detectors also reveals open issues with the current state of the art in both fully unsupervised concept drift detectors and their evaluation.
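To make the three ingredients of Mean Time Ratio concrete, the following Python sketch combines a mean time between false alarms (MTFA), a mean time to detection (MTD) and a missed detection rate (MDR) as MTR = (MTFA / MTD) × (1 − MDR), the form commonly given in the literature for this metric. The function name `mean_time_ratio`, the rule for matching detections to true drifts and the simplified MTFA estimate are assumptions made for illustration only; they are not taken from the evaluation code of this study.

```python
import math


def mean_time_ratio(true_drifts, detections, stream_length):
    """Return (MTR, MTFA, MTD, MDR) for one detector run.

    Assumed matching rule (illustrative): the first detection between a true
    drift and the next true drift counts as the true detection; every other
    detection counts as a false alarm.
    """
    true_drifts = sorted(true_drifts)
    detections = sorted(detections)
    boundaries = true_drifts + [stream_length]

    delays = []
    # Detections before the first true drift cannot be correct.
    false_alarms = sum(1 for d in detections if d < true_drifts[0])
    for start, end in zip(boundaries, boundaries[1:]):
        hits = [d for d in detections if start <= d < end]
        if hits:
            delays.append(hits[0] - start)   # delay of the first hit
            false_alarms += len(hits) - 1    # later hits are false alarms

    mdr = 1.0 - len(delays) / len(true_drifts)
    mtd = sum(delays) / len(delays) if delays else math.inf
    # Assumed simplification: average spacing of false alarms over the stream.
    mtfa = stream_length / false_alarms if false_alarms else float(stream_length)

    if mtd == 0:
        mtr = math.inf  # instantaneous detection; ratio is unbounded
    else:
        mtr = (mtfa / mtd) * (1.0 - mdr)
    return mtr, mtfa, mtd, mdr


# Hypothetical run: two true drifts, four alarms raised by a detector.
mtr, mtfa, mtd, mdr = mean_time_ratio(
    true_drifts=[1000, 2000],
    detections=[400, 1050, 1500, 2100],
    stream_length=3000,
)

# Hypothetical run with one missed drift and no false alarms.
mtr_missed, _, _, mdr_missed = mean_time_ratio([1000, 2000], [1050], 3000)
```

A higher value is better under this definition: the ratio grows when false alarms are rare and detections are fast, and it is scaled down by the share of missed drifts.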

Hence, this study makes the following contributions: Firstly, it provides implementations of 7 unsupervised concept drift detectors, made available under the 3-clause BSD license. Secondly, it offers extensive evaluation of these detectors on 11 real-world data streams with several metrics identifying the best performing detectors. The results of these experiments are made available alongside the source code on GitHub (see Footnote 1) to ensure reproducibility and enable further research. Lastly, it provides a comparison of the different metrics used and reveals a few open issues.

This paper is structured as follows: Firstly, a definition of concept drift is given and the different ways in which concept drifts can manifest are highlighted in Sect. 2. Then the methodology for the literature review is stated in Sect. 3. Section 4 follows with a review of the literature, highlighting typical architectural choices and introducing the implemented algorithms. The setup of the experimental evaluation is explained in Sect. 5 and the corresponding results are shown and discussed in Sect. 6. Open research challenges are outlined in Sect. 7. Finally, concluding remarks are given in Sect. 8.

2 Concept drift

A concept drift denotes a change in the probability distributions governing a data stream. In contrast to outliers, which are just a few data points outside of the regular distribution of the data, a concept drift marks a longer lasting change.

Usually, two types of concept drift are described: real and virtual concept drift [8]. Real concept drift means changes in the posterior distribution such that $P_{t_1}(y \mid X) \ne P_{t_2}(y \mid X)$, where $t_1 \ne t_2$ are different points in time. $X$ denotes the features of the data excluding the label or target feature, which is denoted $y$ instead. In contrast to this, virtual concept drift or covariate shift denotes a change in the distribution of the features: $P_{t_1}(X) \ne P_{t_2}(X)$, $t_1 \ne t_2$. Supervised concept drift detectors compare a classifier’s predictions $\hat{y}$ to the true class label $y$. In contrast to this, the unsupervised detectors benchmarked in this study observe the features $X$ only because real-time access to the classification ground truth is unrealistic in many scenarios (see Table 1). By virtue of operating on the feature space only, these unsupervised concept drift detectors cannot detect concept drift in the posterior distribution unless it is accompanied by a covariate shift.
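The last point can be illustrated with a small synthetic experiment. The Python sketch below is not one of the detectors reviewed in this study; it uses a plain two-sample Kolmogorov-Smirnov statistic on a single feature as a stand-in for a feature-only detector, and all names, window sizes and the threshold of 0.1 are arbitrary choices made for illustration. One window shifts $P(X)$ while keeping the labeling rule (virtual drift), the other keeps $P(X)$ and flips $P(y \mid X)$ (real drift without covariate shift).

```python
import random


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the
    empirical distribution functions of samples a and b."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    gap = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        gap = max(gap, abs(i / len(a) - j / len(b)))
    return gap


def make_window(rng, n, x_mean, label_rule):
    """Draw n one-dimensional features from N(x_mean, 1) and label them."""
    xs = [rng.gauss(x_mean, 1.0) for _ in range(n)]
    ys = [label_rule(x) for x in xs]
    return xs, ys


def accuracy(xs, ys):
    """Accuracy of a fixed classifier that learned the reference rule x > 0."""
    return sum(int(x > 0) == y for x, y in zip(xs, ys)) / len(xs)


rng = random.Random(0)
ref_x, ref_y = make_window(rng, 2000, 0.0, lambda x: int(x > 0))
# Virtual drift: P(X) shifts, the labeling rule P(y | X) is unchanged.
virt_x, virt_y = make_window(rng, 2000, 1.5, lambda x: int(x > 0))
# Real drift only: P(X) is unchanged, the labeling rule is inverted.
real_x, real_y = make_window(rng, 2000, 0.0, lambda x: int(x <= 0))

THRESHOLD = 0.1  # illustrative cut-off, not a calibrated significance level
d_virtual = ks_statistic(ref_x, virt_x)
d_real = ks_statistic(ref_x, real_x)

print("virtual drift flagged:", d_virtual > THRESHOLD,
      "| classifier accuracy:", accuracy(virt_x, virt_y))
print("real drift flagged:   ", d_real > THRESHOLD,
      "| classifier accuracy:", accuracy(real_x, real_y))
```

In this toy setting the feature-only statistic reacts to the covariate shift even though the fixed classifier is unaffected, while the inverted labeling rule ruins the classifier without leaving any trace in the features.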

Table 1 An overview of the different spaces observed and required by supervised, pre-trained unsupervised and fully unsupervised concept drift detectors

