×

Differentially private distributed logistic regression with the objective function perturbation. (English) Zbl 1508.94044

Summary: Distributed learning is a very effective divide-and-conquer strategy for dealing with big data. As distributed learning algorithms become more and more mature, network security issues including the risk of privacy disclosure of personal sensitive data, have attracted high attention and vigilance. Differentially private is an important method that maximizes the accuracy of a data query while minimizing the chance of identifying its records when querying from the given data. The known differentially private distributed learning algorithms are based on variable perturbation, but the variable perturbation method may be non-convergence and the experimental results usually have large deviations. Therefore, in this paper, we consider differentially private distributed learning algorithm based on objective function perturbation. We first propose a new distributed logistic regression algorithm based on objective function perturbation (DLR-OFP). We prove that the proposed DLR-OFP satisfies differentially private, and obtain its fast convergence rate by introducing a new acceleration factor for the gradient descent method. The numerical experiments based on benchmark data show that the proposed DLR-OFP algorithm has fast convergence rate and better privacy protection ability.

MSC:

94A16 Informational aspects of data analysis and big data
68T07 Artificial neural networks and deep learning
68P05 Data structures

Software:

RAPPOR
Full Text: DOI

References:

[1] Abadi, M., Chu, A., Goodfellow, I., McMahan, H. B., Mironov, I., Talwar, K. and Zhang, L., Deep learning with differentially private, Proc. 2016 ACM SIGSAC Conf. Computer and Communications Security (ACM, 2016), pp. 308-318.
[2] Aldeen, Y. A., Salleh, M. and Aljeroudi, Y., An innovative privacy preserving technique for incremental datasets on cloud computing, J. Biomed. Inf.62 (2016) 107-116.
[3] L. P. Barnes and A. Ozgur, Minimax bounds for distributed logistic regression, preprint (2019), arXiv:abs/1910.01625.
[4] Beck, A., First-order methods in optimization, SIAM-Soc. Indust. Appl. Math.25 (2017) 331-351. · Zbl 1384.65033
[5] Blum, A., Ligett, K. and Roth, A., A learning theory approach to noninteractive database privacy, J. ACM60(2) (2013) 1-25. · Zbl 1281.68092
[6] Boyd, S., Parikh, N., Chu, E., Peleato, B. and Eckstein, J., Distributed Optimization and Statistical Learning Via the Alternating Direction Method of Multipliers (Now Publishers Inc, 2011). · Zbl 1229.90122
[7] Chaudhuri, K. and Monteleoni, C., Privacy-preserving logistic regression, Neural Inform. Process. Syst.8 (2008) 289-296.
[8] Chaudhuri, K., Monteleoni, C. and Sarwate, A. D., Differentially private empirical risk minimization, J. Mach. Learn. Res.12 (2011) 1069-1109. · Zbl 1280.62073
[9] Cormode, G., Procopiuc, C., Srivastava, D., Shen, E. and Yu, T., Differentially private spatial decompositions, 2012 IEEE 28th Int. Conf. Data Engineering (IEEE, 2012), pp. 20-31.
[10] Dwork, C., Differentially private, Int. Colloquium on Automata, Languages, and Programming, ICALP (Springer, 2006), pp. 1-12.
[11] Dwork, C. and Roth, A., The algorithmic foundations of differentially private, Found. Trends Theoret. Comput. Sci.9 (2014) 211-407. · Zbl 1302.68109
[12] Erlingsson, Ú., Pihur, V. and Korolova, A., RAPPOR: Randomized aggregatable privacy-preserving ordinal response, in Proc. ACM Conf. Computer and Communications Security (ACM, 2014), pp. 1054-1067.
[13] Fan, L. and Xiong, L., An adaptive approach to real-time aggregate monitoring with differentially private, IEEE Trans. Knowl. Data Eng.26(9) (2013) 2094-2106.
[14] Georgios, K., Stavros, P., Xiao, X. and Dimitris, P., Differentially private event sequences over infinite streams, Proc. VLDB Endowment7(12) (2014) 1155-1166.
[15] Han, S., Topcu, U. and Pappas, G. J., Differentially private distributed constrained optimization, IEEE Trans. Autom. Control62(1) (2016) 50-64. · Zbl 1359.90167
[16] Huang, Z., Hu, R., Guo, Y., Chan-Tin, E. and Gong, Y., DP-ADMM: ADMM-based distributed learning with differentially private, IEEE Trans. Inf. Forensics Security15 (2019) 1002-1012.
[17] Ioannidis, S., Jiang, Y., Amizadeh, S.et al., Parallel news-article traffic forecasting with ADMM, SIGKDD Workshop on Mining and Learning from Time Series (ACM, 2016).
[18] Ji, Z., Jiang, X., Wang, S., Xiong, L. and Ohno-Machado, L., Differentially private distributed logistic regression using private and public data, BMC Med. Genom.7(Suppl 1) (2014) S14.
[19] Kifer, D., Smith, A. and Thakurta, A., Private convex empirical risk minimization and high-dimensional regression, 25th Annual Conf. Learning Theory, Vol. 23 (2012), pp. 1-40.
[20] Li, N., Li, T. and Venkatasubramanian, S., \(t\)-closeness: Privacy beyond k-anonymity and l-diversity, 2007 IEEE 23rd Int. Conf. Data Engineering (IEEE, 2007), pp. 106-115.
[21] Li, N., Lyu, M., Su, D. and Yang, W., Differential Privacy: From Theory to Practice, Synthesis Lectures on Information Security, Privacy, and Trust8(4) (2016) 1-138.
[22] Li, H., Wu, X. and Chen, Y, k-means clustering method preserving differentially private in MapReduce framework, J. Commun.37(2) (2016) 124-130 (in Chinese).
[23] Liu, T., Chen, W., Wang, T. and Gao, F., Distributed Machine Learning: Theories, Algorithms and Systems (China Machine Press, 2018).
[24] Machanavajjhala, A., Gehrke, J., Kifer, D. and Venkitasubramaniam, M., L-diversity: Privacy beyond k-anonymity, in Proc. IEEE 22nd Int. Conf. Data Engineering (ICDE’06) (IEEE, 2006), pp. 24-24.
[25] McSherry, F. and Talwar, K., Mechanism design via differentially private, 48th Annual IEEE Symp. Foundations of Computer Science (FOCS’07) (IEEE, 2007), pp. 94-103.
[26] Nesterov, Y., A method of solving a convex programming problem with convergence rate \(\mathcal{O}(1/ k^2)\), Sov. Math. Dokl.27 (1983) 372-376. · Zbl 0535.90071
[27] Song, S., Chaudhuri, K. and Sarwate, A. D., Stochastic gradient descent with differentially private updates, 2013 IEEE Global Conf. Signal and Information Processing (IEEE, 2013), pp. 245-248.
[28] Sweeney, L., K-anonymity: A model for protecting privacy, Int. J. Uncertain. Fuzz. Knowledge-Based Syst.10(5) (2002) 557-570. · Zbl 1085.68589
[29] Wang, X., Ishii, H., Du, L., Cheng, P. and Chen, J., Differentially private-preserving distributed machine learning, 2019 IEEE 58th Conf. Decision and Control (CDC) (IEEE, 2019), pp. 7339-7344.
[30] Wang, P. and Zhang, H., Distributed logistic regression with differentially private, Sci. Sin. Inform.50 (2020) 1511-1528 (in Chinese).
[31] Wilcoxon, F., Individual comparisons by ranking methods, Biometr. Bull.1(6) (1945) 80-83.
[32] Wu, N., Farokhi, F., Smith, D. and Kaafar, M. A., The value of collaboration in convex machine learning with differentially private, 2020 IEEE Symp. Security and Privacy (SP) (IEEE, 2020), pp. 304-317.
[33] Xiao, X., Wang, G. and Gehrke, J., Differentially private via wavelet transforms, IEEE Trans. Knowledge Data Eng.23(8) (2010) 1200-1214.
[34] Xiao, Y., Xiao, L., Lu, X., Zhang, H., Yu, S. and Poor, H. V., Deep-reinforcement-learning-based user profile perturbation for privacy-aware recommendation, IEEE Int. Things J.8(6) (2021) 4560-4568.
[35] Xiong, P., Zhu, T. and Wang, X., A survey on differentially private and applications, Jisuanji Xuebao/Chin. J. Comput.37(1) (2014) 101-122.
[36] Xu, J., Xu, C., Zou, B., Tang, Y. Y., Peng, J. and You, X., New incremental learning algorithm with support vector machines, IEEE Trans. Syst. Man Cybern.: Syst.49(11) (2018) 2230-2241.
[37] Xu, D., Yuan, S. and Wu, X., Achieving differentially private in vertically partitioned multiparty Learning, 2021 IEEE Int. Conf. Big Data (Big Data) (IEEE, 2021), pp. 5474-5483.
[38] Zerka, F., Barakat, S., Walsh, S., Bogowicz, M., Leijenaar, R. T. H., Jochems, A., Miraglio, B., Townend, D. and Lambin, P., Systematic review of privacy-preserving distributed machine learning from federated databases in health care, JCO Clinic. Cancer Inform.4 (2020) 184-200.
[39] Zhang, X., Khalili, M. and Liu, M., Improving the privacy and accuracy of ADMM-based distributed algorithms, Proceedings of the 35th International Conference on Machine Learning, PMLR, Vol. 80 (2018), pp. 5796-5805.
[40] Zhang, X. and Meng, X., Differentially private in data publication and analysis, Chin. J. Comput.4 (2014) 927-949 (in Chinese).
[41] Zhang, T. and Zhu, Q., Dynamic differentially private for ADMM-based distributed classification learning, IEEE Trans. Inf. Forens. Security12(1) (2016) 172-187.
[42] Zhou, S., Li, F., Tao, Y. and Xiao, X., Privacy preservation in database applications: A survey, Chin. J. Comput.32(5) (2009) 847-861.
[43] Zhu, T., Li, G., Zhou, W. and Yu, P. S., Differentially private data publishing and analysis: A survey, IEEE Trans. Knowl. Data Eng.29(8) (2017) 1619-1638.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.