Abstract
Decision trees are currently the most commonly used data mining algorithm for classification tasks. Although many studies have investigated privacy-preserving decision trees, the proposed methods often suffer from privacy leakage or poor efficiency. Additionally, these methods typically apply only to symmetric frameworks, in which two or more parties hold equal privilege, and are not suitable for asymmetric scenarios in which parties hold unequal privilege. In this paper, we propose SecureCART, a three-party privacy-preserving decision tree training scheme with a privileged party. We adopt the existing pMPL framework and design novel secure interactive protocols for division, comparison, and asymmetric multiplication. Compared to similar schemes, our division protocol is 93.5–560.4× faster, with the communication overhead reduced by over 90%; further, our multiplication protocol is approximately 1.5× faster, with the communication overhead reduced by around 20%. Our comparison protocol, which is based on function secret sharing, maintains good performance when adapted to pMPL. Based on the proposed secure protocols, we implement SecureCART in C++ and analyze its performance on three real-world datasets in both LAN and WAN environments. The experimental results indicate that SecureCART is significantly faster than similar schemes proposed in past studies, and that the accuracy loss incurred by SecureCART remains within an acceptable range.
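To illustrate why division, comparison, and multiplication are the primitives that a secure CART scheme needs, the following is a minimal plaintext sketch of Gini-based split selection for binary features and binary labels. It is not the SecureCART protocol and involves no secret sharing; the function names `gini`, `split_gini`, and `best_feature` are hypothetical and chosen for illustration only. In a privacy-preserving setting, each arithmetic step marked in the comments would presumably be replaced by a corresponding secure protocol over shared values.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Plaintext illustration only (assumption: binary labels and binary features).
// Gini impurity of a node: 1 - p0^2 - p1^2.
double gini(std::size_t count0, std::size_t count1) {
    const std::size_t total = count0 + count1;
    if (total == 0) return 0.0;
    const double p0 = static_cast<double>(count0) / total;  // division
    const double p1 = static_cast<double>(count1) / total;  // division
    return 1.0 - p0 * p0 - p1 * p1;                         // multiplication
}

// Weighted Gini impurity of the two children produced by splitting on one feature.
double split_gini(const std::vector<int>& feature, const std::vector<int>& labels) {
    assert(feature.size() == labels.size());
    std::size_t c[2][2] = {{0, 0}, {0, 0}};  // c[feature value][label]
    for (std::size_t i = 0; i < labels.size(); ++i) {
        ++c[feature[i] != 0][labels[i] != 0];
    }
    const std::size_t n0 = c[0][0] + c[0][1];
    const std::size_t n1 = c[1][0] + c[1][1];
    const std::size_t n = n0 + n1;
    if (n == 0) return 0.0;
    return (n0 * gini(c[0][0], c[0][1]) + n1 * gini(c[1][0], c[1][1])) / n;
}

// Index of the feature with the smallest weighted impurity.
std::size_t best_feature(const std::vector<std::vector<int>>& features,
                         const std::vector<int>& labels) {
    std::size_t best = 0;
    double best_score = 2.0;  // larger than any attainable Gini value
    for (std::size_t j = 0; j < features.size(); ++j) {
        const double score = split_gini(features[j], labels);
        if (score < best_score) {  // comparison
            best_score = score;
            best = j;
        }
    }
    return best;
}
```

Calling `best_feature` with one feature column per inner vector returns the index of the column whose split yields the purest children; ties are resolved in favor of the earlier column.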
Acknowledgements
This work was supported in part by Major Program (JD) of Hubei Province (Grant No. 2023BAA027), National Natural Science Foundation of China (Grant Nos. 62202339, 62172307, U21A20466, 62325209), New 20 Project of Higher Education of Jinan (Grant No. 202228017), and Fundamental Research Funds for the Central Universities (Grant No. 2042023KF0203).
Additional information
Supporting information Appendixes A–E. The supporting information is available online at info.scichina.com and springerlink.bibliotecabuap.elogim.com. The supporting materials are published as submitted, without typesetting or editing. The responsibility for scientific accuracy and content remains entirely with the authors.
About this article
Cite this article
Tong, Y., Feng, Q., Luo, M. et al. Multi-party privacy-preserving decision tree training with a privileged party. Sci. China Inf. Sci. 67, 182303 (2024). https://doi.org/10.1007/s11432-023-4013-x
DOI: https://doi.org/10.1007/s11432-023-4013-x