Abstract
Automatic data acquisition systems provide large amounts of streaming data generated by physical sensors. This data forms an input to computational models (soft sensors) routinely used for monitoring and control of industrial processes, traffic patterns, the environment, natural hazards, and more. Most of these models assume that the data arrives cleaned and pre-processed, ready to be fed directly into a predictive model. In practice, ensuring appropriate data quality means that most of the modelling effort goes into preparing raw sensor readings for use as model inputs. This study analyzes the process of data preparation for predictive models with streaming sensor data. We present data preparation as a four-step process, identify the key challenges in each step, and provide recommendations for handling them. The discussion focuses on approaches that are less commonly used but that, in our experience, may contribute particularly well to solving practical soft sensor tasks. Our arguments are illustrated with a case study in the chemical production industry.
Keywords
- Root Mean Square Error
- Feature Selection
- Partial Least Squares
- Partial Least Squares Regression
- Data Preparation
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Budka, M. et al. (2014). From Sensor Readings to Predictions: On the Process of Developing Practical Soft Sensors. In: Blockeel, H., van Leeuwen, M., Vinciotti, V. (eds) Advances in Intelligent Data Analysis XIII. IDA 2014. Lecture Notes in Computer Science, vol 8819. Springer, Cham. https://doi.org/10.1007/978-3-319-12571-8_5
DOI: https://doi.org/10.1007/978-3-319-12571-8_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-12570-1
Online ISBN: 978-3-319-12571-8
eBook Packages: Computer Science, Computer Science (R0)