Data mining technology has attracted significant interest as a means of identifying patterns and trends in large collections of data. It is evident, however, that collecting and analyzing data that include personal information may violate the privacy of the individuals to whom the information refers. Privacy protection in data mining is therefore becoming a crucial issue that has captured the attention of many researchers.
In this chapter, we first describe the concept of k-anonymity and illustrate different approaches for its enforcement. We then discuss how the privacy requirements characterized by k-anonymity can be violated in data mining and introduce possible approaches to ensure the satisfaction of k-anonymity in data mining.
Copyright information
© 2008 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Ciriani, V., di Vimercati, S.D.C., Foresti, S., Samarati, P. (2008). k-Anonymous Data Mining: A Survey. In: Aggarwal, C.C., Yu, P.S. (eds) Privacy-Preserving Data Mining. Advances in Database Systems, vol 34. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-70992-5_5
DOI: https://doi.org/10.1007/978-0-387-70992-5_5
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-70991-8
Online ISBN: 978-0-387-70992-5
eBook Packages: Computer Science, Computer Science (R0)