Abstract
Data quality is vital to business analytics: accurate insight and correct decisions in many data-intensive industries depend on it. Although systematic approaches to categorizing, detecting, and avoiding data quality problems exist, they hardly consider the special characteristics of time-oriented data. However, time is an important data dimension with distinct characteristics that demands special consideration in the context of dirty data. Building upon existing taxonomies of general data quality problems, we address ‘dirty’ time-oriented data, i.e., time-oriented data with potential quality problems. In particular, we investigate empirically derived problems that emerge with different types of time-oriented data (e.g., time points, time intervals) and provide various examples of quality problems of time-oriented data. By providing categorized information related to existing taxonomies, we establish a basis for further research in the field of dirty time-oriented data and for the formulation of essential quality checks when preprocessing time-oriented data.
Copyright information
© 2012 IFIP International Federation for Information Processing
About this paper
Cite this paper
Gschwandtner, T., Gärtner, J., Aigner, W., Miksch, S. (2012). A Taxonomy of Dirty Time-Oriented Data. In: Quirchmayr, G., Basl, J., You, I., Xu, L., Weippl, E. (eds) Multidisciplinary Research and Practice for Information Systems. CD-ARES 2012. Lecture Notes in Computer Science, vol 7465. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32498-7_5
DOI: https://doi.org/10.1007/978-3-642-32498-7_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32497-0
Online ISBN: 978-3-642-32498-7
eBook Packages: Computer Science (R0)