
1 Introduction

In recent years, topics connected with data stream mining have attracted much attention from machine learning researchers [1, 6, 9,10,11,12,13,14,15,16,17, 24, 27,28,29,30,31,32,33,34, 37, 38]. In data streams it is quite common that the underlying data distribution changes over time. This phenomenon is known in the literature as ‘concept drift’ [40, 41]. Another characteristic of streaming data is its potentially infinite size and the high rate at which it arrives at the system. Therefore, proper data stream mining algorithms should be resource-aware [8, 15].

The data stream can be defined as a sequence of data elements

$$\begin{aligned} S = (x_{0},x_{1},x_{2},\dots ). \end{aligned}$$
(1)

In this paper the case of unsupervised learning is considered, which means that there are no class labels assigned to data elements. Hence, the data stream element \(x_{n}\) is a D-dimensional vector defined as follows

$$\begin{aligned} x_{n} = \left[ x^{1}_{n},\dots ,x_{n}^{D}\right] . \end{aligned}$$
(2)

The data stream mining algorithm should be able to react to possible changes in data distribution [16, 20, 41]. Concept drift handling can be realized in two ways, i.e. passively or actively. In the passive approach, a concept drift reaction mechanism is incorporated into the data stream mining algorithm itself. It can be done by applying sliding window techniques [7] or ensemble methods [28, 29]. In the active approach, an external drift detector functions in parallel to the main algorithm. Based on the information from the detector, the algorithm can decide to rebuild the model. Many active drift detectors focus on monitoring the current accuracy of the model. If it decreases, the model is considered invalid and needs to be modified. Examples of such methods are the Drift Detection Method (DDM) [18] and the Early Drift Detection Method (EDDM) [3]. Other active drift detection methods rely on statistical tests. They look for possible changes in the parameters of data probability density functions. Among others, the Welch, the Kolmogorov-Smirnov [27], and the Page-Hinkley tests are often used [19].

In this paper, we present an active concept drift detection approach based on autoencoders. This approach is similar to the one presented in [23], in which the Restricted Boltzmann Machine (RBM) [39] was applied as a drift detector. That idea was further extended with resource-awareness strategies to deal with fast data streams [26] and with missing values [25]. Autoencoders are important neural network models often used in the field of deep learning [21]. They are used, for example, in greedy pretraining methods for learning deep neural networks [4, 5]. Similarly to RBMs, a properly trained autoencoder encodes in its middle layer information about the data probability distribution. It can then be used to check whether data from another part of the stream belong to the same distribution as the part on which the autoencoder was trained. This can be verified, for example, by measuring the reconstruction error or the cross-entropy of the new data.

The rest of the paper is organized as follows. In Sect. 2 autoencoders are presented in more detail. In Sect. 3 methods for concept drift detection using autoencoders are proposed. In Sect. 4 the results obtained in numerical simulations are demonstrated. Sect. 5 concludes the paper and indicates possible future research directions.

2 Autoencoders

The autoencoder is a kind of feed-forward neural network which is trained to reconstruct its input data [22]. There are many types of autoencoders [21], such as the denoising autoencoder [2], the sparse autoencoder [35], and the contractive autoencoder [36]. Generally, the autoencoder is composed of two parts: the encoder and the decoder. The aim of the encoder is to transform the input data in a nonlinear way, extracting its most important features. The decoder is trained to map the output of the encoder back to the input data as closely as possible. In this paper, we consider autoencoders consisting of \(L+1\) layers. Let \(y_{j}^{(l)}(x_{n})\) denote the value of the j-th neuron in the l-th layer obtained for input data \(x_{n}\). Then, the values of neurons in the l-th layer can be computed using the following formula

$$\begin{aligned} y^{(l)}_{j}(x_{n}) = \sigma \left( \sum _{i=1}^{N_{l-1}}w^{(l)}_{ij}y^{(l-1)}_{i}(x_{n})+b_{j}^{(l)}\right) ,\; l = 1,\dots ,L, \end{aligned}$$
(3)

where \(w_{ij}^{(l)}\) are the weights of synapses between the \((l-1)\)-th and the l-th layers, \(b_{j}^{(l)}\) are the biases, and \(N_{l}\) is the size of the l-th layer. Function \(\sigma (z)\) is an activation function. Many different types of activation functions can be applied in an autoencoder. In this paper, we apply the most common one, i.e. the sigmoid function defined below

$$\begin{aligned} \sigma (z)=\frac{1}{1+\exp (-z)}. \end{aligned}$$
(4)

The 0-th layer is the input layer, i.e. \(y_{j}^{(0)}(x_{n}) = x_{n}^{j}\). The size of the output layer is equal to the size of the input layer, i.e. \(N_{L} = N_{0} = D\). If we do not apply any sparsity constraints while learning the autoencoder, the middle layer should be the smallest one, i.e.

$$\begin{aligned} D = N_{0}> N_{1}> \dots > N_{\frac{L}{2}}< \dots< N_{L-1} < N_{L} = D \end{aligned}$$
(5)
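
To make the forward pass concrete, a minimal NumPy sketch of formulas (3)–(5) is given below; the function names, the initialization scale, and the example layer sizes are illustrative assumptions rather than part of the method itself.

```python
import numpy as np

def sigmoid(z):
    # Eq. (4)
    return 1.0 / (1.0 + np.exp(-z))

def init_autoencoder(layer_sizes, rng):
    # Random initialization for a bottleneck architecture as in Eq. (5),
    # e.g. layer_sizes = [20, 15, 20]; the initialization scale is an
    # illustrative choice, not prescribed in the text.
    weights = [rng.normal(0.0, 0.1, size=(layer_sizes[l - 1], layer_sizes[l]))
               for l in range(1, len(layer_sizes))]
    biases = [np.zeros(layer_sizes[l]) for l in range(1, len(layer_sizes))]
    return weights, biases

def forward(x, weights, biases):
    # Layer-by-layer propagation of Eq. (3); y[0] is the input (0-th layer)
    # and y[-1] is the reconstruction produced by the L-th layer.
    y = [np.asarray(x, dtype=float)]
    for W, b in zip(weights, biases):
        y.append(sigmoid(y[-1] @ W + b))
    return y
```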

The autoencoder is trained as a usual feed-forward neural network, by minimizing a cost function. Let \(C(x_{n})\) be the cost function value obtained for the data element \(x_{n}\). Then, for example, a weight \(w_{ij}^{(l)}\) is updated in each step using the following formula

$$\begin{aligned} w_{ij}^{(l)} := w_{ij}^{(l)} - \eta \frac{\partial C(x_{n})}{\partial w_{ij}^{(l)}}, \end{aligned}$$
(6)

where \(\eta \) is the learning rate. An analogous formula applies to the biases.

In practice, the learning is often performed on minibatches of data instead of single data elements. Let us assume that the minibatch \(M_{t}\) consists of B consecutive data elements

$$\begin{aligned} M_{t} = (x_{tB},\dots ,x_{(t+1)B-1}). \end{aligned}$$
(7)

Then, the neural network parameters are updated using the arithmetic average of the gradients computed for all data elements from the minibatch

$$\begin{aligned} w_{ij}^{(l)} := w_{ij}^{(l)} - \eta \frac{1}{B}\sum _{m=tB}^{(t+1)B-1}\frac{\partial C(x_{m})}{\partial w_{ij}^{(l)}}. \end{aligned}$$
(8)
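
A possible NumPy sketch of the averaged update of formula (8) is given below; the gradient routine `grad_fn` is an assumed placeholder (e.g. standard backpropagation through the layers defined by (3)), not a specific library call.

```python
import numpy as np

def sgd_minibatch_step(params, minibatch, grad_fn, eta=0.05):
    # One update according to Eq. (8): average the per-element gradients
    # over the minibatch and take a single gradient-descent step.
    # `grad_fn(params, x)` is an assumed placeholder returning gradients
    # with the same structure as `params`.
    B = len(minibatch)
    grads = [np.zeros_like(p) for p in params]
    for x in minibatch:
        for acc, g in zip(grads, grad_fn(params, x)):
            acc += g
    return [p - eta * acc / B for p, acc in zip(params, grads)]
```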

Many different types of cost functions can be used. In this paper, we consider two of them. The cross-entropy:

$$\begin{aligned} C_{CE}(x_{n}) = -\sum _{j=1}^{D}\left[ x_{n}^{j}\log _{2}\left( y^{(L)}_{j}(x_{n})\right) +\left( 1-x_{n}^{j}\right) \log _{2}\left( 1-y^{(L)}_{j}(x_{n})\right) \right] , \end{aligned}$$
(9)

and the reconstruction error:

$$\begin{aligned} C_{RE}(x_{n}) = \sqrt{\sum _{j=1}^{D}\left( x_{n}^{j} - y^{(L)}_{j}(x_{n})\right) ^{2}}. \end{aligned}$$
(10)
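
Both cost functions can be computed directly from the reconstruction \(y^{(L)}(x_{n})\) produced by the autoencoder; the short NumPy sketch below illustrates them (the clipping constant `eps` is our own numerical safeguard, not part of the definitions).

```python
import numpy as np

def cross_entropy_cost(x, y_out, eps=1e-12):
    # Eq. (9); `y_out` is the reconstruction y^(L)(x_n).
    y = np.clip(y_out, eps, 1.0 - eps)
    return -np.sum(x * np.log2(y) + (1.0 - x) * np.log2(1.0 - y))

def reconstruction_error(x, y_out):
    # Eq. (10): Euclidean distance between the input and its reconstruction.
    return np.sqrt(np.sum((x - y_out) ** 2))
```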

The monitoring of cost function values can be used to detect potential drifts in data distribution. The detection procedure is described in the next section.

3 Concept Drift Detection with the Autoencoder

We expect that an autoencoder properly trained on one part of the data stream can be applied to monitor possible changes in the data distribution of the following parts. For any cost function C (given by (9) or (10)), the average value of this cost function for a minibatch \(M_{t}\) of size B is given by

$$\begin{aligned} C(M_{t}) = \frac{1}{B}\sum _{x_{m}\in M_{t}}C(x_{m}) \end{aligned}$$
(11)

Apart from the cost function itself, we also want to investigate the trend Q(C, t) of cost function C over subsequent minibatches. To ensure that the trend is computed only on the basis of the most recent data, we additionally apply a forgetting mechanism, as it was done in [23]. The trend at time t is given by the following formula

$$\begin{aligned} Q(C,t) = \frac{\overline{n}_{t}\overline{TC}_{t} - \overline{T}_{t}\overline{C}_{t}}{\overline{n}_{t}\overline{T^{2}}_{t} - \left( \overline{T}_{t}\right) ^{2}}. \end{aligned}$$
(12)

In the standard regression procedure, the terms in formula (12) are given simply by the appropriate arithmetic averages. In our case, because of the forgetting mechanism, the arithmetic averages are replaced by the following recurrent formulas

$$\begin{aligned} \overline{TC}_{t} = \lambda \overline{TC}_{t-1} + t\,C(M_{t}),\; \overline{TC}_{0} = 0, \end{aligned}$$
(13)
$$\begin{aligned} \overline{T}_{t} = \lambda \overline{T}_{t-1} + t,\; \overline{T}_{0} = 0, \end{aligned}$$
(14)
$$\begin{aligned} \overline{C}_{t} = \lambda \overline{C}_{t-1} + C(M_{t}),\; \overline{C}_{0} = 0, \end{aligned}$$
(15)
$$\begin{aligned} \overline{T^{2}}_{t} = \lambda \overline{T^{2}}_{t-1} + t^{2},\; \overline{T^{2}}_{0} = 0, \end{aligned}$$
(16)
$$\begin{aligned} \overline{n}_{t} = \lambda \overline{n}_{t-1} + 1,\; \overline{n}_{0} = 0, \end{aligned}$$
(17)

where \(\lambda \) is a forgetting factor taking values lower than 1. The trend of the cost function should be close to 0 as long as there is no concept drift in the data stream. After the drift occurs, Q(C, t) is expected to increase to some positive value. Since we consider two cost functions in this paper, two trends are analyzed as well, i.e. \(Q(C_{CE},t)\) and \(Q(C_{RE},t)\).
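
The trend computation requires only a few scalar accumulators; a minimal sketch is given below, where `cost_Mt` stands for the average minibatch cost \(C(M_{t})\) from (11) and the class name is our own.

```python
class TrendDetector:
    # Recurrent statistics of Eqs. (13)-(17) and the trend Q(C, t) of Eq. (12),
    # with forgetting factor `lam` (lambda in the text).
    def __init__(self, lam=0.998):
        self.lam = lam
        self.TC = self.T = self.C = self.T2 = self.n = 0.0
        self.t = 0

    def update(self, cost_Mt):
        # `cost_Mt` is the average minibatch cost C(M_t) from Eq. (11).
        self.t += 1
        t, lam = self.t, self.lam
        self.TC = lam * self.TC + t * cost_Mt      # Eq. (13)
        self.T  = lam * self.T + t                 # Eq. (14)
        self.C  = lam * self.C + cost_Mt           # Eq. (15)
        self.T2 = lam * self.T2 + t ** 2           # Eq. (16)
        self.n  = lam * self.n + 1.0               # Eq. (17)
        denom = self.n * self.T2 - self.T ** 2     # denominator of Eq. (12)
        if denom == 0.0:
            return 0.0
        return (self.n * self.TC - self.T * self.C) / denom
```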

4 Experimental Results

In this section, we present preliminary results of experiments conducted on simple synthetic datasets with concept drift. The datasets were generated in the same manner as in [23], i.e. based on synthetic Boltzmann machines with randomly chosen parameters. At the beginning, we applied two Boltzmann machines, which were then used to generate two datasets with static probability distributions

$$\begin{aligned} S_{1} = \left( x_{1,1},x_{1,2},x_{1,3}\dots \right) ,\end{aligned}$$
(18)
$$\begin{aligned} S_{2} = \left( x_{2,1},x_{2,2},x_{2,3},\dots \right) . \end{aligned}$$
(19)

The dimensionality of the data in both sets is \(D=20\). The datasets \(S_{1}\) and \(S_{2}\) were used to generate two datasets with concept drifts. The dataset with sudden concept drift is denoted as \(S_{s} = \left( x_{s,1},x_{s,2},\dots \right) \), where \(x_{s,n}\) is taken from \(S_{1}\) if \(n < 500000\) and from \(S_{2}\) if \(n \ge 500000\). The dataset with gradual concept drift is denoted as \(S_{g} = \left( x_{g,1},x_{g,2},\dots \right) \), where \(x_{g,n}\) is taken from \(S_{1}\) if \(n < 500000\) and from \(S_{2}\) if \(n \ge 600000\). In the interval [500000; 600000) the data element \(x_{g,n}\) is drawn from \(S_{1}\) with probability \(\frac{600000\,-\,n}{100000}\) and from \(S_{2}\) with probability \(\frac{n\,-\,500000}{100000}\). Hence, the two datasets \(S_{s}\) and \(S_{g}\) are constructed in such a way that in both of them the concept drift occurs after processing 500000 data elements.
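
For illustration, a possible sketch of drawing a single element of the gradual-drift stream \(S_{g}\) is shown below; the sampler functions for \(S_{1}\) and \(S_{2}\) are assumed to be given (for the sudden-drift stream \(S_{s}\) the mixing interval simply collapses to a hard switch at the 500000-th element).

```python
import numpy as np

rng = np.random.default_rng(0)

def gradual_stream_element(n, sample_s1, sample_s2,
                           start=500_000, end=600_000):
    # Draw x_{g,n}: before `start` from S_1, after `end` from S_2, and in the
    # interval [start, end) from S_1 with probability (end - n) / (end - start).
    # `sample_s1` / `sample_s2` are assumed samplers for the two static
    # distributions (here, the synthetic Boltzmann machines).
    if n < start:
        return sample_s1()
    if n >= end:
        return sample_s2()
    p_s1 = (end - n) / (end - start)
    return sample_s1() if rng.random() < p_s1 else sample_s2()
```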

In the experiments, we used a simple autoencoder with three layers (\(L=2\) in the notation of Sect. 2). The sizes of the 0-th and the 2-nd layers are equal to \(N_{0} = N_{2} = D = 20\), to match the dimensionality of the data. The middle layer is of size \(N_{1} = 15\). The learning rate and the minibatch size are set to \(\eta = 0.05\) and \(B = 20\), respectively. In each experiment, the autoencoder is trained only on the first 400000 data elements. Then, it is switched to the monitoring mode (no learning is performed, only the values of the cost functions are analyzed). The forgetting factor \(\lambda \) for the trend measures was set to 0.998.
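
Combining the earlier sketches, the monitoring phase of this setup could look roughly as follows; `stream`, `weights`, and `biases` are assumed to be already available (the parameters trained on the first 400000 elements), and the snippet is only an illustration of the procedure, not the exact implementation used in the experiments.

```python
# `stream` is assumed to be an array of shape (N, 20) holding the data
# elements; `weights` and `biases` come from the training phase.
B = 20
detector = TrendDetector(lam=0.998)
reconstruct = lambda x: forward(x, weights, biases)[-1]

for t in range(400_000 // B, len(stream) // B):
    minibatch = stream[t * B:(t + 1) * B]
    c_ce = np.mean([cross_entropy_cost(x, reconstruct(x)) for x in minibatch])
    q_ce = detector.update(c_ce)   # trend Q(C_CE, t) of Eq. (12)
    # a sustained rise of q_ce above its near-zero baseline signals a drift
```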

4.1 Sudden Concept Drift Detection

The first experiment was conducted on the \(S_{s}\) dataset with the sudden drift. The results obtained for the cross-entropy cost function are presented in Fig. 1. As can be seen, the value of the cross-entropy decreases while the autoencoder learns, until the 400000-th data element. Then the autoencoder monitors the incoming data. Since the distribution of the data does not change until the 500000-th element, the cross-entropy values fluctuate around some fixed level. Then the sudden drift occurs and the cross-entropy suddenly reaches higher values. This is more clearly visible in Fig. 1(b), where the trend of the cross-entropy values is presented. The peak at \(n=500000\) matches the occurrence of the sudden drift.

Fig. 1. Cross-entropy values and trend as a function of the number of processed data elements, for dataset \(S_{s}\) with sudden concept drift.

In the case of the reconstruction error, the results can additionally be compared with the drift detector based on the RBM, presented in [23]. We applied the same parameters as for the autoencoder. The RBM was trained using the Contrastive Divergence (CD-k) algorithm with \(k = 1\). The results are shown in Fig. 2.

Fig. 2. Reconstruction error values and trend as a function of the number of processed data elements, for dataset \(S_{s}\) with sudden concept drift.

The autoencoder appears to outperform the RBM-based approach in both considered aspects. In Fig. 2(a) we can observe that the values of the reconstruction error are lower for the autoencoder. This might be important if we wanted to use the network simultaneously for other purposes besides concept drift detection. In Fig. 2(b) it is visible that the peak signaling the sudden change is significantly higher for the autoencoder.

4.2 Gradual Concept Drift Detection

In the next experiment, analogous simulations were carried out for the dataset \(S_{g}\) with gradual concept drift. The results concerning the cross-entropy are presented in Fig. 3, whereas in Fig. 4 the comparisons of reconstruction error values are demonstrated.

Fig. 3. Cross-entropy values and trend as a function of the number of processed data elements, for dataset \(S_{g}\) with gradual concept drift.

Fig. 4. Reconstruction error values and trend as a function of the number of processed data elements, for dataset \(S_{g}\) with gradual concept drift.

In this experiment, the values of the cost functions rise smoothly while the gradual drift takes place between the 500000-th and the 600000-th data elements. Figure 4 confirms the superiority of the autoencoder over the RBM-based approach also in the case of gradual drift.

5 Conclusions and Future Work

In this paper, the autoencoder was proposed as a drift detector for time-changing data streams. First, the neural network learns a model of the initial part of the stream. Then it is used to monitor possible changes in the following parts. The variations of the cost function values are analyzed. If the changes turn out to be large, it means that a concept drift has occurred, and this can be a signal for the data stream mining algorithm to rebuild its current model. Two cost functions were applied in this paper, i.e. the cross-entropy and the reconstruction error. The preliminary results obtained in experiments carried out on simple synthetic datasets confirmed that the autoencoder can be successfully used as a concept drift detector. It was demonstrated that it is capable of handling both sudden and gradual concept drifts. In future research, we plan to extend the proposed method to make it more resource-aware for fast-changing data streams and to make it able to deal with missing values.