Abstract
A popular model for protecting privacy when person-specific data is released is k -anonymity. A dataset is k-anonymous if each record is identical to at least (k−1) other records in the dataset. The basic k-anonymization problem, which minimizes the number of dataset entries that must be suppressed to achieve k-anonymity, is NP-hard and hence not solvable both quickly and optimally in general. We apply parameterized complexity analysis to explore algorithmic options for restricted versions of this problem that occur in practice. We present the first fixed-parameter algorithms for this problem and identify key techniques that can be applied to this and other k-anonymization problems.
Article PDF
Similar content being viewed by others
Avoid common mistakes on your manuscript.
References
Aggarwal G, Feder T, Kenthapadi K, Motwani R, Panigrahy R, Thomas D, Zhu A (2005) Approximation algorithms for k-anonymity. J Priv Technol, paper 20051120001
Brankovic L, Estivill-Castro V (1999) Privacy issues in knowledge discovery and data mining. In: Proceedings of Australian institute of computer ethics conference (AICEC99), pp 89–99
Brankovic L, Miller M, Horak P, Wrightson G (1997) Usability of compromise-free statistical databases. In: Proceedings of the ninth international conference on scientific and statistical database management (SSDBM 1997). IEEE Press, New York, pp 144–154
Bonizzoni P, Della Vedova G, Dondi R (2007) Anonymizing binary tables is APX-hard. The Computing Research Repository (CoRR) 0707.0421. http://arxiv.org/abs/0707.0421
Chaytor R (2006) Utility preserving k-anonymity. Technical report MUN-CS 2006-01, Dept Computer Science, Memorial University of Newfoundland
Chaytor R (2007) Allowing privacy protection algorithms to jump out of local optimums: an ordered greed framework. In: Bonchi F et al. (eds) Proceedings of the 1st SIGKDD international workshop on privacy, security, and trust in KDD (PinKDD’07). LNCS, vol 4890. Springer, Berlin, pp 33–55
Downey R, Fellows M (1999) Parameterized complexity. Springer, Berlin
Er MC (1988) A fast algorithm for generating set partitions. Comput J 31:283–284
Fernau H (2004) Complexity of a {0,1}-matrix problem. Australasian J Comb 29:273–300
Horak P, Brankovic L, Miller M (1999) A combinatorial problem in database security. Discrete Appl Math 91:119–126
Islam MZ, Brankovic L (2004) A framework for privacy preserving classification in data mining. In: Proceedings of the second workshop on Australasian information security, data mining and web intelligence, and software internationalisation (ACSW Frontiers 2004), pp 163–168
MacDonald (2005) personal communication
Meyerson A, Williams R (2004) On the complexity of optimal k-anonymity. In: Proceedings of 23rd ACM symposium on principles of database systems (PODS’04), pp 223–228
Niedermeier R (2006) Invitation to fixed-parameter algorithms. Oxford University Press, Oxford
Samarati P, Sweeney L (1998) Protecting privacy when disclosing information: k-anonymity and its enforcement through generalization and suppression. Technical report SRI-CSL-98-04, SRI International, Computer Science Laboratory
Sweeney L (2002) Achieving k-anonymity privacy protection using generalization and suppression. Int J Uncertain Fuzziness Knowl-Based Syst 10(5):571–588
Wang K, Yu P, Chakraborty S (2004) Bottom-up generalization: a data mining solution to privacy protection. In: Proceedings of 4th IEEE international conference on data mining (ICDM’04), pp 249–256
Wareham T (1999) Systematic parameterized complexity analysis in computational phonology. PhD thesis, Dept Computer Science, University of Victoria
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Evans, P.A., Wareham, H.T. & Chaytor, R. Fixed-parameter tractability of anonymizing data by suppressing entries. J Comb Optim 18, 362–375 (2009). https://doi.org/10.1007/s10878-009-9253-6
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10878-009-9253-6