PCPs and the Hardness of Generating Private Synthetic Data

Ullman, Jonathan; Vadhan, Salil

doi:10.1007/978-3-642-19571-6_24

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 6597)

Included in the following conference series:

Theory of Cryptography Conference

Abstract

Assuming the existence of one-way functions, we show that there is no polynomial-time, differentially private algorithm \(\mathcal{A}\) that takes a database \(D \in (\{0,1\}^d)^n\) and outputs a “synthetic database” \(\widehat{D}\) all of whose two-way marginals are approximately equal to those of \(D\). (A two-way marginal is the fraction of database rows \(x \in \{0,1\}^d\) with a given pair of values in a given pair of columns.) This answers a question of Barak et al. (PODS ’07), who gave an algorithm running in time \(\mathrm{poly}(n, 2^d)\).
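To make the definition of a two-way marginal concrete, the following is a minimal Python sketch, not code from the paper. The function names `two_way_marginal` and `max_marginal_error` and the toy databases `D` and `D_hat` are illustrative assumptions. It computes a single marginal and the largest gap between the marginals of two databases, which is the quantity a synthetic database is asked to keep small.

```python
from itertools import combinations, product

def two_way_marginal(db, i, j, a, b):
    """Fraction of rows x in db with x[i] == a and x[j] == b."""
    return sum(1 for x in db if x[i] == a and x[j] == b) / len(db)

def max_marginal_error(db, synth):
    """Largest absolute gap between any two-way marginal of db and of synth."""
    d = len(db[0])
    return max(
        abs(two_way_marginal(db, i, j, a, b) - two_way_marginal(synth, i, j, a, b))
        for i, j in combinations(range(d), 2)      # every pair of columns
        for a, b in product((0, 1), repeat=2)      # every pair of values
    )

# Hypothetical toy data: n = 4 rows, d = 3 binary columns.
D = [(1, 0, 1), (1, 1, 0), (0, 0, 1), (1, 0, 1)]
D_hat = [(1, 0, 1), (1, 0, 1), (0, 1, 0), (1, 0, 1)]

print(two_way_marginal(D, 0, 2, 1, 1))
print(max_marginal_error(D, D_hat))
```

The sketch only measures accuracy. It says nothing about privacy, and the hardness result concerns algorithms that must achieve both at once in polynomial time.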

Our proof combines a construction of hard-to-sanitize databases based on digital signatures (by Dwork et al., STOC ’09) with encodings based on probabilistically checkable proofs.

We also present both negative and positive results for generating “relaxed” synthetic data, where the fraction of rows in \(D\) satisfying a predicate \(c\) is estimated by applying \(c\) to each row of \(\widehat{D}\) and aggregating the results in some way.
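The relaxed notion can be pictured with a small Python sketch. The abstract does not say which aggregation functions the paper considers, so the function `estimate_fraction`, the predicate `c`, the toy data, and the choice of mean and median as aggregators are all assumptions made for illustration. With the mean as aggregator the estimate is simply the fraction of rows of the synthetic database that satisfy the predicate, which corresponds to ordinary synthetic data.

```python
from statistics import mean, median

def true_fraction(db, predicate):
    """Fraction of rows of the real database that satisfy the predicate."""
    return sum(predicate(x) for x in db) / len(db)

def estimate_fraction(synth, predicate, aggregate=mean):
    """Apply the predicate to each synthetic row, then aggregate the 0/1 results."""
    return aggregate([predicate(row) for row in synth])

# Hypothetical predicate: columns 0 and 2 are both 1.
def c(x):
    return int(x[0] == 1 and x[2] == 1)

# Hypothetical toy data: 4 rows, 3 binary columns.
D = [(1, 0, 1), (1, 1, 0), (0, 0, 1), (1, 0, 1)]
D_hat = [(1, 0, 1), (1, 0, 1), (0, 1, 0), (1, 0, 1)]

print(true_fraction(D, c))
print(estimate_fraction(D_hat, c))                     # mean aggregator
print(estimate_fraction(D_hat, c, aggregate=median))   # a different aggregator
```

Passing the aggregator as a parameter mirrors the phrase “aggregating the results in some way”: the same synthetic rows can yield different estimates depending on how the per-row answers are combined.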

A full version of this paper appears on ECCC [28].

References

[1] Adam, N.R., Wortmann, J.: Security-control methods for statistical databases: A comparative study. ACM Computing Surveys 21, 515–556 (1989)
[2] Alekhnovich, M., Braverman, M., Feldman, V., Klivans, A.R., Pitassi, T.: The complexity of properly learning simple concept classes. J. Comput. Syst. Sci. 74, 16–34 (2008)
[3] Barak, B., Chaudhuri, K., Dwork, C., Kale, S., McSherry, F., Talwar, K.: Privacy, accuracy, and consistency too: A holistic solution to contingency table release. In: Proceedings of the 26th Symposium on Principles of Database Systems, pp. 273–282 (2007)
[4] Barak, B., Goldreich, O.: Universal arguments and their applications. SIAM J. Comput. 38, 1661–1694 (2008)
[5] Blum, A., Dwork, C., McSherry, F., Nissim, K.: Practical privacy: The SuLQ framework. In: Proceedings of the 24th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (June 2005)
[6] Blum, A., Ligett, K., Roth, A.: A learning theory approach to non-interactive database privacy. In: Proceedings of the 40th ACM SIGACT Symposium on Theory of Computing (2008)
[7] Dinur, I., Nissim, K.: Revealing information while preserving privacy. In: Proceedings of the Twenty-Second ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pp. 202–210 (2003)
[8] Duncan, G.: Confidentiality and statistical disclosure limitation. In: International Encyclopedia of the Social and Behavioral Sciences. Elsevier, Amsterdam (2001)
[9] Dwork, C.: A firm foundation for private data analysis. Communications of the ACM (to appear)
[10] Dwork, C.: Differential privacy. In: Bugliesi, M., Preneel, B., Sassone, V., Wegener, I. (eds.) ICALP 2006, Part II. LNCS, vol. 4052, pp. 1–12. Springer, Heidelberg (2006)
[11] Dwork, C., McSherry, F., Nissim, K., Smith, A.: Calibrating noise to sensitivity in private data analysis. In: Halevi, S., Rabin, T. (eds.) TCC 2006. LNCS, vol. 3876, pp. 265–284. Springer, Heidelberg (2006)
[12] Dwork, C., Naor, M., Reingold, O., Rothblum, G., Vadhan, S.: When and how can privacy-preserving data release be done efficiently? In: Proceedings of the 2009 International ACM Symposium on Theory of Computing (STOC) (2009)
[13] Dwork, C., Nissim, K.: Privacy-preserving datamining on vertically partitioned databases. In: Franklin, M. (ed.) CRYPTO 2004. LNCS, vol. 3152, pp. 528–544. Springer, Heidelberg (2004)
[14] Dwork, C., Rothblum, G., Vadhan, S.P.: Boosting and differential privacy. In: Proceedings of FOCS 2010 (2010)
[15] Evfimievski, A., Grandison, T.: Privacy Preserving Data Mining (a short survey). In: Encyclopedia of Database Technologies and Applications. Information Science Reference (2006)
[16] Feldman, V.: Hardness of proper learning. In: The Encyclopedia of Algorithms. Springer, Heidelberg (2008)
[17] Feldman, V.: Hardness of approximate two-level logic minimization and PAC learning with membership queries. Journal of Computer and System Sciences 75(1), 13–26 (2009), http://dx.doi.org/10.1016/j.jcss.2008.07.007
[18] Goldreich, O.: Foundations of Cryptography, vol. 2. Cambridge University Press, Cambridge (2004)
[19] Håstad, J.: Some optimal inapproximability results. J. ACM 48, 798–859 (2001)
[20] Kearns, M.J., Valiant, L.G.: Cryptographic limitations on learning boolean formulae and finite automata. J. ACM 41, 67–95 (1994)
[21] Kilian, J.: A note on efficient zero-knowledge proofs and arguments (extended abstract). In: STOC (1992)
[22] Micali, S.: Computationally sound proofs. SIAM J. Comput. 30, 1253–1298 (2000)
[23] Naor, M., Yung, M.: Universal one-way hash functions and their cryptographic applications. In: STOC, pp. 33–43 (1989)
[24] Pitt, L., Valiant, L.G.: Computational limitations on learning from examples. J. ACM 35, 965–984 (1988)
[25] Reiter, J.P., Drechsler, J.: Releasing multiply-imputed synthetic data generated in two stages to protect confidentiality. IAB Discussion Paper, Institut für Arbeitsmarkt und Berufsforschung (IAB), Nürnberg, Institute for Employment Research, Nuremberg, Germany (2007), http://ideas.repec.org/p/iab/iabdpa/200720.html
[26] Rompel, J.: One-way functions are necessary and sufficient for secure signatures. In: STOC, pp. 387–394 (1990)
[27] Roth, A., Roughgarden, T.: Interactive privacy via the median mechanism. In: STOC 2010 (2010)
[28] Ullman, J., Vadhan, S.P.: PCPs and the hardness of generating synthetic data. Electronic Colloquium on Computational Complexity (ECCC) 17, 17 (2010)
[29] Valiant, L.G.: A theory of the learnable. Communications of the ACM 27(11), 1134–1142 (1984)

Author information

Authors and Affiliations

School of Engineering and Applied Sciences & Center for Research on Computation and Society, Harvard University, Cambridge, MA, USA
Jonathan Ullman & Salil Vadhan

Editor information

Editors and Affiliations

Computer Science Department, Technion, 32000, Haifa, Israel
Yuval Ishai

Cite this paper

Ullman, J., Vadhan, S. (2011). PCPs and the Hardness of Generating Private Synthetic Data. In: Ishai, Y. (ed.) Theory of Cryptography. TCC 2011. Lecture Notes in Computer Science, vol 6597. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19571-6_24

DOI: https://doi.org/10.1007/978-3-642-19571-6_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19570-9
Online ISBN: 978-3-642-19571-6
eBook Packages: Computer Science (R0)
