Some Criterions for Selecting the Best Data Abstractions

Haraguchi, Makoto; Kudoh, Yoshimitsu

doi:10.1007/3-540-45884-0_8

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2281)

Abstract

This paper presents and summarizes several criteria for selecting the best data abstraction for relations in relational databases. A data abstraction can be understood as a grouping of attribute values: the distinctions among the values in a group are forgotten, and the group is replaced by a single, more abstract value. The abstracted relation is therefore more compact, and data mining algorithms run on it more efficiently. The major problem is that when the abstraction discards an important aspect of the data values, the quality of the extracted knowledge degrades. The central issue is thus to give a criterion that selects only adequate data abstractions, ones that preserve the important information while reducing the size of the relation. From this viewpoint, we present three criteria and test them on the task of classifying tuples in a relation into several given target classes. All three criteria are derived from a notion of similarity among class distributions and are formalized in terms of standard information theory. We also summarize our experimental results for the classification task and discuss future work.
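
The trade-off described in the abstract can be illustrated with a minimal sketch. It does not reproduce the three criteria of the paper, which the abstract does not state. As an assumption, it uses the mutual information between an attribute and the class label as a stand-in measure: a grouping that merges values with similar class distributions should lose little of it, while a grouping that merges dissimilar values should lose more. The data, the groupings, and all function names are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(pairs):
    """Return I(A; C) in bits for a list of (attribute_value, class_label) pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    attr = Counter(a for a, _ in pairs)
    cls = Counter(c for _, c in pairs)
    return sum(
        (k / n) * log2((k / n) / ((attr[a] / n) * (cls[c] / n)))
        for (a, c), k in joint.items()
    )


def apply_abstraction(pairs, grouping):
    """Replace each attribute value by its abstract value (unmapped values are kept)."""
    return [(grouping.get(a, a), c) for a, c in pairs]


def information_loss(pairs, grouping):
    """Mutual information lost when the attribute is abstracted by `grouping`."""
    return mutual_information(pairs) - mutual_information(
        apply_abstraction(pairs, grouping)
    )


# Hypothetical relation with one attribute and a binary target class.
# "sedan" and "coupe" share one class distribution, "truck" and "van" another.
pairs = (
    [("sedan", "low")] * 3 + [("sedan", "high")] * 1
    + [("coupe", "low")] * 3 + [("coupe", "high")] * 1
    + [("truck", "low")] * 1 + [("truck", "high")] * 3
    + [("van", "low")] * 1 + [("van", "high")] * 3
)

# Groups values whose class distributions are identical.
good = {"sedan": "car", "coupe": "car", "truck": "utility", "van": "utility"}
# Groups values whose class distributions differ.
bad = {"sedan": "x", "truck": "x", "coupe": "y", "van": "y"}

print(information_loss(pairs, good))  # close to 0 bits
print(information_loss(pairs, bad))   # roughly 0.19 bits
```

Both groupings halve the number of attribute values, but only the first keeps the information relevant to classification. Under the stated assumption, a selection criterion would prefer it.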

Author information

Authors and Affiliations

Division of Electronics and Information Engineering, Hokkaido University, N-13, W-8, 060-8628, Sapporo, Japan
Makoto Haraguchi & Yoshimitsu Kudoh

Editor information

Editors and Affiliations

Department of Informatics, Kyushu University, 6-10-1 Hakozaki, Higashi-ku, 812-8581, Fukuoka, Japan
Setsuo Arikawa & Ayumi Shinohara

About this chapter

Cite this chapter

Haraguchi, M., Kudoh, Y. (2002). Some Criterions for Selecting the Best Data Abstractions. In: Arikawa, S., Shinohara, A. (eds) Progress in Discovery Science. Lecture Notes in Computer Science, vol 2281. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45884-0_8

DOI: https://doi.org/10.1007/3-540-45884-0_8
Published: 14 March 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43338-5
Online ISBN: 978-3-540-45884-5
eBook Packages: Springer Book Archive
