Anonymity meets game theory: secure data integration with malicious participants

Mohammed, Noman; Fung, Benjamin C. M.; Debbabi, Mourad

doi:10.1007/s00778-010-0214-6

Regular Paper
Published: 29 December 2010

The VLDB Journal, volume 20, pages 567–588 (2011)

Abstract

Data integration methods enable different data providers to flexibly integrate their expertise and deliver highly customizable services to their customers. Nonetheless, combining data from different sources could reveal person-specific sensitive information. Jiang and Clifton (Very Large Data Bases J (VLDBJ) 15(4):316–333, 2006) proposed a secure Distributed k-Anonymity (DkA) framework for integrating two private data tables into a k-anonymous table, in which each private table is a vertical partition on the same set of records. Their DkA framework does not scale to large data sets. Moreover, DkA is limited to a two-party scenario, and the parties are assumed to be semi-honest. In this paper, we propose two algorithms to securely integrate private data from multiple parties (data providers). Our first algorithm achieves the k-anonymity privacy model in a semi-honest adversary model. Our second algorithm employs a game-theoretic approach to thwart malicious participants and to ensure fair and honest participation of multiple data providers in the data integration process. We also study and resolve a real-life privacy problem in data sharing for the financial industry in Sweden. Experiments on real-life data demonstrate that our proposed algorithms effectively retain the essential information in anonymous data for data analysis and are scalable for anonymizing large data sets.
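
To make the setting concrete, the following is a minimal sketch of why integrating vertical partitions can break k-anonymity even when each partition looks safe on its own. It is not the algorithm proposed in the paper: the function names (join_vertical, is_k_anonymous), the attribute names, and the toy records are all hypothetical, and the join is done in the clear, whereas the paper's protocols are designed so that parties do not reveal their raw tables.

```python
from collections import Counter

def join_vertical(left, right):
    """Join two vertical partitions keyed by the same record IDs (hypothetical helper)."""
    return {rid: {**left[rid], **right[rid]} for rid in left.keys() & right.keys()}

def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values occurs at least k times."""
    groups = Counter(tuple(r[a] for a in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())

# Hypothetical toy data: each party holds different attributes of the same four records.
party_a = {1: {"job": "Professional"}, 2: {"job": "Professional"},
           3: {"job": "Artist"}, 4: {"job": "Artist"}}
party_b = {1: {"age_range": "30-39"}, 2: {"age_range": "40-49"},
           3: {"age_range": "30-39"}, 4: {"age_range": "40-49"}}

integrated = list(join_vertical(party_a, party_b).values())

# Each partition alone satisfies 2-anonymity, but the integrated table does not,
# because every (job, age_range) combination is unique.
a_alone = is_k_anonymous(list(party_a.values()), ["job"], 2)
b_alone = is_k_anonymous(list(party_b.values()), ["age_range"], 2)
joined = is_k_anonymous(integrated, ["job", "age_range"], 2)

# Generalizing age_range to a coarser value (an assumed one-level taxonomy)
# restores 2-anonymity on the integrated table.
generalized = [{**r, "age_range": "30-49"} for r in integrated]
joined_generalized = is_k_anonymous(generalized, ["job", "age_range"], 2)
```

In this toy case the generalization step is chosen by hand; deciding which attribute to generalize, and doing so without any party seeing another party's data, is the part the paper's algorithms address.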

References

Adam, N.R., Wortman, J.C.: Security control methods for statistical databases. ACM Comput. Surv. 21(4), 515–556 (1989)
Agrawal, R., Terzi, E.: On honesty in sovereign information sharing. In: Proceedings of the EDBT (2006)
Agrawal, R., Evfimievski, A., Srikant, R.: Information sharing across private databases. In: Proceedings of ACM SIGMOD, San Diego, CA (2003)
Axelrod, R.: The Evolution of Cooperation. Basic Books, New York (1984)
Bayardo, R.J., Agrawal, R.: Data privacy through optimal k-anonymization. In: ICDE (2005)
Blum, A., Dwork, C., McSherry, F., Nissim, K.: Practical privacy: the SuLQ framework. In: PODS (2005)
Brodsky, A., Farkas, C., Jajodia, S.: Secure databases: Constraints, inference channels, and monitoring disclosures. IEEE Trans. Knowl. Data Eng. 12, 900–919 (2000)
Clifton, C., Kantarcioglu, M., Vaidya, J., Lin, X., Zhu, M.Y.: Tools for privacy preserving distributed data mining. ACM SIGKDD Explor. Newsl. 4(2), 28–34 (2002)
Dayal, U., Hwang, H.Y.: View definition and generalization for database integration in a multidatabase system. IEEE Trans. Softw. Eng. 10(6), 628–645 (1984)
Denning, D., Schlorer, J.: Inference controls for statistical databases. IEEE Comput. 16(7), 69–82 (1983)
Dinur, I., Nissim, K.: Revealing information while preserving privacy. In: PODS (2003)
Du, W., Zhan, Z.: Building decision tree classifier on private data. In: Workshop on Privacy, Security, and Data Mining at the IEEE ICDM (2002)
Du, W., Han, Y.S., Chen, S.: Privacy-preserving multivariate statistical analysis: linear regression and classification. In: Proceedings of the SIAM International Conference on Data Mining (SDM), Florida (2004)
Dwork, C.: Differential privacy. In: ICALP (2006)
Dwork, C., McSherry, F., Nissim, K., Smith, A.: Calibrating noise to sensitivity in private data analysis. In: TCC (2006)
Farkas, C., Jajodia, S.: The inference problem: A survey. ACM SIGKDD Explor. Newsl. 4(2), 6–11 (2003)
Fung, B.C.M., Wang, K., Yu, P.S.: Anonymizing classification data for privacy preservation. IEEE TKDE 19(5), 711–725 (2007)
Fung, B.C.M., Wang, K., Chen, R., Yu, P.S.: Privacy-preserving data publishing: A survey of recent developments. ACM Comput. Surv. 42(4), 14:1–14:53 (2010)
Hinke, T.: Inference aggregation detection in database management systems. In: IEEE S&P (1988)
Inan, A., Kantarcioglu, M., Bertino, E., Scannapieco, M.: A hybrid approach to private record linkage. In: Proceedings of the Int’l Conference on Data Engineering (2008)
Inan, A., Kantarcioglu, M., Ghinita, G., Bertino, E.: Private record matching using differential privacy. In: Proceedings of the EDBT (2010)
Iyengar, V.S.: Transforming data to satisfy privacy constraints. In: SIGKDD (2002)
Jiang, W., Clifton, C.: Privacy-preserving distributed k-anonymity. In: DBSec (2005)
Jiang, W., Clifton, C.: A secure distributed framework for achieving k-anonymity. Very Large Data Bases J. (VLDBJ) 15(4), 316–333 (2006)
Jiang, W., Clifton, C., Kantarcioglu, M.: Transforming semi-honest protocols to ensure accountability. Data Knowl. Eng. 65(1), 57–74 (2008)
Jurczyk, P., Xiong, L.: Distributed anonymization: achieving privacy for both data subjects and data providers. In: DBSec (2009)
Kantarcioglu, M., Kardes, O.: Privacy-preserving data mining in the malicious model. Int. J. Inf. Comput. Secur. 2(4), 353–375 (2008)
Kantarcioglu, M., Xi, B., Clifton, C.: A game theoretical model for adversarial learning. In: Proceedings of the NGDM Workshop (2007)
Kardes, O., Kantarcioglu, M.: Privacy-preserving data mining applications in malicious model. In: Proceedings of the PADM Workshop (2007)
Kargupta, H., Das, K., Liu, K.: A game theoretic approach toward multi-party privacy-preserving distributed data mining. In: Proceedings of the PKDD (2007)
Kleinberg, J., Papadimitriou, C., Raghavan, P.: On the value of private information. In: TARK (2001)
Layfield, R., Kantarcioglu, M., Thuraisingham, B.: Incentive and trust issues in assured information sharing. In: Proceedings of the CollaborateComm (2008)
LeFevre, K., DeWitt, D.J., Ramakrishnan, R.: Workload-aware anonymization. In: SIGKDD (2006)
Li, N., Li, T., Venkatasubramanian, S.: t-closeness: privacy beyond k-anonymity and ℓ-diversity. In: ICDE (2007)
Lindell, Y., Pinkas, B.: Privacy preserving data mining. J. Cryptol. 15(3), 177–206 (2002)
Machanavajjhala, A., Kifer, D., Gehrke, J., Venkitasubramaniam, M.: ℓ-diversity: privacy beyond k-anonymity. ACM TKDD 1(1) (2007)
Malvestuto, F.M., Mezzini, M., Moscarini, M.: Auditing sum-queries to make a statistical database secure. ACM Trans. Inf. Syst. Secur. 9(1), 31–60 (2006)
Mohammed, N., Fung, B.C.M., Hung, P.C.K., Lee, C.: Anonymizing healthcare data: a case study on the blood transfusion service. In: SIGKDD (2009a)
Mohammed, N., Fung, B.C.M., Wang, K., Hung, P.C.K.: Privacy-preserving data mashup. In: EDBT (2009b)
Mohammed, N., Fung, B.C.M., Hung, P.C.K., Lee, C.: Centralized and distributed anonymization for high-dimensional healthcare data. ACM Trans. Knowl. Discov. Data (TKDD) 4(4), 18:1–18:33 (2010)
Nash, J.: Non-cooperative games. Ann. Math. 54(2), 286–295 (1951)
Newman, D.J., Hettich, S., Blake, C.L., Merz, C.J.: UCI Repository of Machine Learning Databases. http://archive.ics.uci.edu/ml/ (1998)
Nisan, N.: Algorithms for selfish agents. In: Proceedings of the STACS (1999)
Osborne, M.J., Rubinstein, A.: A Course in Game Theory. The MIT Press, Cambridge, MA (1994)
Pinkas, B.: Cryptographic techniques for privacy-preserving data mining. ACM SIGKDD Explor. Newsl. 4(2), 12–19 (2002)
Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann, Los Altos (1993)
Samarati, P.: Protecting respondents’ identities in microdata release. IEEE TKDE 13(6), 1010–1027 (2001)
Sweeney, L.: Datafly: a system for providing anonymity in medical data. In: Proceedings of the DBSec (1998)
Sweeney, L.: Achieving k-anonymity privacy protection using generalization and suppression. Int. J. Uncertain. Fuzziness Knowl. Based Syst. 10(5), 571–588 (2002a)
Sweeney, L.: k-anonymity: a model for protecting privacy. In: International Journal on Uncertainty, Fuzziness and Knowledge-based Systems (2002b)
Thuraisingham, B.M.: Security checking in relational database management systems augmented with inference engines. Comput. Secur. 6(6), 479–492 (1987)
Vaidya, J., Clifton, C.: Privacy preserving association rule mining in vertically partitioned data. In: Proceedings of the ACM SIGKDD (2002)
Vaidya, J., Clifton, C.: Privacy-preserving k-means clustering over vertically partitioned data. In: Proceedings of the ACM SIGKDD (2003)
Wang, K., Fung, B.C.M., Yu, P.S.: Handicapping attacker’s confidence: An alternative to k-anonymization. KAIS 11(3), 345–368 (2007)
Wiederhold, G.: Intelligent integration of information. In: Proceedings of ACM SIGMOD, pp 434–437 (1993)
Wong, R.C.W., Li, J., Fu, A.W.C., Wang, K.: (α, k)-anonymity: an enhanced k-anonymity model for privacy preserving data publishing. In: SIGKDD (2006)
Xiao, X., Tao, Y.: Anatomy: simple and effective privacy preservation. In: VLDB (2006)
Xiao, X., Yi, K., Tao, Y.: The hardness and approximation algorithms for l-diversity. In: EDBT (2010)
Yang, Z., Zhong, S., Wright, R.N.: Privacy-preserving classification of customer data without loss of accuracy. In: Proceedings of the SDM (2005)
Yao, A.C.: Protocols for secure computations. In: Proceedings of the IEEE FOCS (1982)
Zhang, N., Zhao, W.: Distributed privacy preserving information sharing. In: Proceedings of the VLDB (2005)

Author information

Authors and Affiliations

Concordia Institute for Information Systems Engineering, Concordia University, Montreal, QC, H3G 1M8, Canada
Noman Mohammed, Benjamin C. M. Fung & Mourad Debbabi

Corresponding author

Correspondence to Noman Mohammed.

About this article

Cite this article

Mohammed, N., Fung, B.C.M. & Debbabi, M. Anonymity meets game theory: secure data integration with malicious participants. The VLDB Journal 20, 567–588 (2011). https://doi.org/10.1007/s00778-010-0214-6

Received: 22 November 2009
Revised: 12 November 2010
Accepted: 10 December 2010
Published: 29 December 2010
Issue Date: August 2011
DOI: https://doi.org/10.1007/s00778-010-0214-6
