×

Privacy-preserving algorithms for distributed mining of frequent itemsets. (English) Zbl 1142.68488

Summary: Standard algorithms for association rule mining are based on identification of frequent itemsets. In this paper, we study how to maintain privacy in distributed mining of frequent itemsets. That is, we study how two (or more) parties can find frequent itemsets in a distributed database without revealing each party’s portion of the data to the other. The existing solution for vertically partitioned data leaks a significant amount of information, while the existing solution for horizontally partitioned data only works for three parties or more. In this paper, we design algorithms for both vertically and horizontally partitioned data, with cryptographically strong privacy. We give two algorithms for vertically partitioned data; one of them reveals only the support count and the other reveals nothing. Both of them have computational overheads linear in the number of transactions. Our algorithm for horizontally partitioned data works for two parties and above and is more efficient than the existing solution.

MSC:

68T05 Learning and adaptive systems in artificial intelligence
68P15 Database theory
94A62 Authentication, digital signatures and secret sharing

Software:

GeneScout
Full Text: DOI

References:

[1] D. Agrawal, C.C. Agrawal, On the design and quantification of privacy preserving data mining algorithms, in: Proceedings of 12th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, 2001, pp. 247-255.; D. Agrawal, C.C. Agrawal, On the design and quantification of privacy preserving data mining algorithms, in: Proceedings of 12th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, 2001, pp. 247-255.
[2] D. Agrawal, R. Srikant, Privacy preserving mining, in: Proceedings of the 2000 ACM SIGMOD Conference on Management of Data, 2000, pp. 439-450.; D. Agrawal, R. Srikant, Privacy preserving mining, in: Proceedings of the 2000 ACM SIGMOD Conference on Management of Data, 2000, pp. 439-450.
[3] Rakesh Agrawal, Ramakrishnan Srikant, Fast algorithm for mining association rules, in: Proceedings of 20th International Conference on Very Large Data Bases, 1994, pp. 487-499.; Rakesh Agrawal, Ramakrishnan Srikant, Fast algorithm for mining association rules, in: Proceedings of 20th International Conference on Very Large Data Bases, 1994, pp. 487-499.
[4] M. Ben-Or, S. Goldwasser, A. Widgerson, Completeness theorems for non-cryptographic fault-tolerant distributed computation, in: Proceedings of the 20th Annual ACM Symposium on the Theory of Computing, 1988, pp. 1-10.; M. Ben-Or, S. Goldwasser, A. Widgerson, Completeness theorems for non-cryptographic fault-tolerant distributed computation, in: Proceedings of the 20th Annual ACM Symposium on the Theory of Computing, 1988, pp. 1-10.
[5] Dan Boneh, The decision Diffie-Hellman problem, in: ANTS-III, Lecture Notes in Computer Science, vol. 1423, 1998, pp. 48-63.; Dan Boneh, The decision Diffie-Hellman problem, in: ANTS-III, Lecture Notes in Computer Science, vol. 1423, 1998, pp. 48-63. · Zbl 1067.94523
[6] D. Chaum, C. Crepeau, I. Damgard, Multiparty unconditionally secure protocols, in: Proceedings of the 20th Annual ACM Symposium on the Theory of Computing, 1988, pp. 11-19.; D. Chaum, C. Crepeau, I. Damgard, Multiparty unconditionally secure protocols, in: Proceedings of the 20th Annual ACM Symposium on the Theory of Computing, 1988, pp. 11-19.
[7] Damagard, Ivan; Jurik, Mads, A generalisation, a simplification and some applications of Paillier’s probabilistic public-key system, (Kim, Kwangjo, Public Key Cryptography 2001. Public Key Cryptography 2001, Lecture Notes in Computer Science, vol. 1992 (2001), Springer-Verlag: Springer-Verlag Cheju Island, Korea), 119-136 · Zbl 0987.94032
[8] Elena Dasseni, V.S. Verykios, A.K. Elmagarmid, Elisa Bertino, Hiding association rules by using confidence and support, in: Information Hiding Workshop 2001, pp. 369-383.; Elena Dasseni, V.S. Verykios, A.K. Elmagarmid, Elisa Bertino, Hiding association rules by using confidence and support, in: Information Hiding Workshop 2001, pp. 369-383. · Zbl 1008.68671
[9] W. Du, M.J. Atallah, Secure multi-party computation problems and their applications: a review and open problems, in: New Security Paradigms Workshop, 2001, pp. 11-20.; W. Du, M.J. Atallah, Secure multi-party computation problems and their applications: a review and open problems, in: New Security Paradigms Workshop, 2001, pp. 11-20.
[10] A. Evfimievski, R. Srikant, D. Agrawal, J. Gehrke, Privacy preserving mining of association rules. in: Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2002, pp. 217-228.; A. Evfimievski, R. Srikant, D. Agrawal, J. Gehrke, Privacy preserving mining of association rules. in: Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2002, pp. 217-228.
[11] O. Goldreich, S. Micali, A. Wigderson, Proofs that yield nothing but their validity and a methodology of cryptographic protocol design, in: Proceedings of the 27th IEEE Symposium on Foundations of Computer Science, 1986, pp. 174-187.; O. Goldreich, S. Micali, A. Wigderson, Proofs that yield nothing but their validity and a methodology of cryptographic protocol design, in: Proceedings of the 27th IEEE Symposium on Foundations of Computer Science, 1986, pp. 174-187.
[12] Goldreich, O.; Micali, S.; Wigderson, A., Proofs that yield nothing but their validity or all languages in NP have zero-knowledge proof systems, JACM, 38, 691-729 (1991) · Zbl 0799.68101
[13] O. Goldreich, Secure multi-party computation, Working Draft Version 1.1, 1998.; O. Goldreich, Secure multi-party computation, Working Draft Version 1.1, 1998.
[14] Goldreich, O., Foundations of cryptography, Basic Tools, vol. 1. (2001), Cambridge University Press: Cambridge University Press Cambridge · Zbl 1007.94016
[15] Goldwasser, S.; Micali, S., Probabilistic encryption, J. Comput. System Sci., 28, 270-299 (1984) · Zbl 0563.94013
[16] Bart Goethals, Sven Laur, Helger Lipmaa, Taneli Mielikainen, On private scalar product computation for privacy-preserving data mining, in: Proceedings of 7th International Conference on Information Security and Cryptology, Lecture Notes in Computer Science, vol. 3506, 2005, pp. 104-120.; Bart Goethals, Sven Laur, Helger Lipmaa, Taneli Mielikainen, On private scalar product computation for privacy-preserving data mining, in: Proceedings of 7th International Conference on Information Security and Cryptology, Lecture Notes in Computer Science, vol. 3506, 2005, pp. 104-120. · Zbl 1122.68494
[17] Rozenberg, Boris, Ehud Gudes: privacy preserving frequent item set mining in vertically partitioned databases, DBSec, 91-104 (2003)
[18] Murat Kantarcioglu, Chris Clifton, Privacy preserving distributed mining of association rules on horizontally partitioned data, in: ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, 2002, pp. 24-31.; Murat Kantarcioglu, Chris Clifton, Privacy preserving distributed mining of association rules on horizontally partitioned data, in: ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, 2002, pp. 24-31.
[19] Kantarcioglu, Murat; Clifton, Chris, Privacy preserving distributed mining of association rules on horizontally partitioned data, IEEE Trans. Knowledge Data Eng., 16, 9, 1026-1037 (2004)
[20] Y. Lindell, B. Pinkas, Privacy preserving data mining, in: Advances in Cryptology—Proceedings of CRYPTO 2000, Lecture Notes in Computer Science, vol. 1880, 2000, pp. 36-54.; Y. Lindell, B. Pinkas, Privacy preserving data mining, in: Advances in Cryptology—Proceedings of CRYPTO 2000, Lecture Notes in Computer Science, vol. 1880, 2000, pp. 36-54. · Zbl 0989.68506
[21] Lindell, Y.; Pinkas, B., Privacy preserving data mining, J. Cryptol., 15, 177-206 (2002) · Zbl 1010.94008
[22] S.R.M. Oliveira, O.R. Zaiane, Privacy preserving frequent itemset mining, in: IEEE International Conference on Data Mining Workshop on Privacy, Security and Data Mining, vol. 14., 2002, pp. 43-54.; S.R.M. Oliveira, O.R. Zaiane, Privacy preserving frequent itemset mining, in: IEEE International Conference on Data Mining Workshop on Privacy, Security and Data Mining, vol. 14., 2002, pp. 43-54.
[23] S.J. Rizvi, J.R. Haritsa, Maintaining data privacy in association rule mining, in: Proceedings of 28th International Conference on Very Large Data Bases, 2002, pp. 682-693.; S.J. Rizvi, J.R. Haritsa, Maintaining data privacy in association rule mining, in: Proceedings of 28th International Conference on Very Large Data Bases, 2002, pp. 682-693.
[24] Y. Saygin, V.S. Verykios, A.K. Elmagarmid, Privacy preserving association rule mining, in: Research Issues in Data Engineering (RIDE), 2002, pp. 151-158.; Y. Saygin, V.S. Verykios, A.K. Elmagarmid, Privacy preserving association rule mining, in: Research Issues in Data Engineering (RIDE), 2002, pp. 151-158.
[25] Jaideep Vaidya, Chris Clifton, Privacy preserving association rule mining in vertically partitioned data, in: Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2002, pp. 639-644.; Jaideep Vaidya, Chris Clifton, Privacy preserving association rule mining in vertically partitioned data, in: Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2002, pp. 639-644. · Zbl 1158.68380
[26] R.N. Wright, Z. Yang, Privacy-preserving Bayesian network structure computation on distributed heterogeneous data, in: Proceedings of KDD 2004, 2004, pp. 713-718.; R.N. Wright, Z. Yang, Privacy-preserving Bayesian network structure computation on distributed heterogeneous data, in: Proceedings of KDD 2004, 2004, pp. 713-718.
[27] A. Yao, How to generate and exchange secrets, in: Proceedings of the 27th IEEE Symposium on Foundations of Computer Science, 1986, pp. 162-167.; A. Yao, How to generate and exchange secrets, in: Proceedings of the 27th IEEE Symposium on Foundations of Computer Science, 1986, pp. 162-167.
[28] Yin, Michael M.; Wang, Jason T. L., GeneScout: a data mining system for predicting vertebrate genes in genomic DNA sequences, Inform. Sci., 163, 1-3, 201-218 (2004)
[29] Yu, Jeffrey Xu; Chong, Zhihong; Lu, Hongjun; Zhang, Zhenjie; Zhou, Aoying, A false negative approach to mining frequent itemsets from high speed transactional data streams, Inform. Sci., 176, 16, 1986-2015 (2006)
[30] Zhang, Ling; Zhang, Bo, Fuzzy reasoning model under quotient space structure, Inform. Sci., 176, 4, 353-364 (2005) · Zbl 1088.68170
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.