Abstract
We suggest a user-oriented approach to combinatorial data anonymization. A data matrix is called k-anonymous if every row appears at least k times—the goal of the NP-hard k -Anonymity problem then is to make a given matrix k-anonymous by suppressing (blanking out) as few entries as possible. We describe an enhanced k-anonymization problem called Pattern-Guided k -Anonymity where the users can express the differing importance of various data features. We show that Pattern-Guided k -Anonymity remains NP-hard. We provide a fixed-parameter tractability result based on a data-driven parameterization and, based on this, develop an exact ILP-based solution method as well as a simple but very effective greedy heuristic. Experiments on several real-world datasets show that our heuristic easily matches up to the established “Mondrian” algorithm for k -Anonymity in terms of quality of the anonymization and outperforms it in terms of running time.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Blocki, J., Williams, R.: Resolving the complexity of some data privacy problems. In: Abramsky, S., Gavoille, C., Kirchner, C., Meyer auf der Heide, F., Spirakis, P.G. (eds.) ICALP 2010. LNCS, vol. 6199, pp. 393–404. Springer, Heidelberg (2010)
Bonizzoni, P., Della Vedova, G., Dondi, R.: Anonymizing binary and small tables is hard to approximate. Journal of Combinatorial Optimization 22(1), 97–119 (2011)
Bonizzoni, P., Della Vedova, G., Dondi, R., Pirola, Y.: Parameterized complexity of k-anonymity: hardness and tractability. Journal of Combinatorial Optimization (2011)
Bredereck, R., Nichterlein, A., Niedermeier, R., Philip, G.: Pattern-guided data anonymization and clustering. In: Murlak, F., Sankowski, P. (eds.) MFCS 2011. LNCS, vol. 6907, pp. 182–193. Springer, Heidelberg (2011)
Bredereck, R., Nichterlein, A., Niedermeier, R., Philip, G.: The effect of homogeneity on the computational complexity of combinatorial data anonymization. In: Data Mining and Knowledge Discovery (2012)
Campan, A., Truta, T.M.: Data and structural k-anonymity in social networks. In: Bonchi, F., Ferrari, E., Jiang, W., Malin, B. (eds.) PinKDD 2008. LNCS, vol. 5456, pp. 33–54. Springer, Heidelberg (2009)
Chakaravarthy, V.T., Pandit, V., Sabharwal, Y.: On the complexity of the k-anonymization problem. CoRR, abs/1004.4729 (2010)
Dwork, C.: A firm foundation for private data analysis. Communications of the ACM 54(1), 86–95 (2011)
Evans, P.A., Wareham, T., Chaytor, R.: Fixed-parameter tractability of anonymizing data by suppressing entries. Journal of Combinatorial Optimization 18(4), 362–375 (2009)
Frank, A., Asuncion, A.: UCI machine learning repository(2010), http://archive.ics.uci.edu/ml
Fredkin, E.: Trie memory. Communications of the ACM 3(9), 490–499 (1960)
Fung, B.C.M., Wang, K., Chen, R., Yu, P.S.: Privacy-preserving data publishing: A survey of recent developments. ACM Computing Surveys 14(2) 14:1–14:53 (2010)
Gkoulalas-Divanis, A., Kalnis, P., Verykios, V.S.: Providing k-anonymity in location based services. ACM SIGKDD Explorations Newsletter 12, 3–10 (2010)
Karp, R.M.: Reducibility among combinatorial problems. In: Miller, R.E., Thatcher, J.W. (eds.) Complexity of Computer Computations, pp. 85–103. Plenum Press (1972)
LeFevre, K., DeWitt, D., Ramakrishnan, R.: Mondrian multidimensional k-anonymity. In: Proceedings of the 22nd International Conference on Data Engineering (ICDE 2006), pp. 25–25. IEEE (2006)
Lenstra, H.W.: Integer programming with a fixed number of variables. Mathematics of Operations Research 8, 538–548 (1983)
Li, N., Li, T., Venkatasubramanian, S.: t-closeness: Privacy beyond k-anonymity and l-diversity. In: Proceedings of the 23rd International Conference on Data Engineering (ICDE 2007), pp. 106–115. IEEE (2007)
Loukides, G., Shao, J.: Capturing data usefulness and privacy protection in k-anonymisation. In: Proceedings of the 2007 ACM Symposium on Applied Computing, pp. 370–374. ACM (2007)
Machanavajjhala, A., Kifer, D., Gehrke, J., Venkitasubramaniam, M.: ℓ-diversity: Privacy beyond k-anonymity. ACM Transactions on Knowledge Discovery from Data 1(1) (2007)
Meyerson, A., Williams, R.: On the complexity of optimal k-anonymity. In: Proceedings of the 23rd ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS 2004), pp. 223–228. ACM (2004)
Navarro-Arribas, G., Torra, V., Erola, A., Castellà-Roca, J.: User k-anonymity for privacy preserving data mining of query logs. Information Processing & Management 48(3), 476–487 (2012)
Rastogi, V., Suciu, D., Hong, S.: The boundary between privacy and utility in data publishing. In: Proceedings of the 33rd International Conference on Very Large Data Bases, pp. 531–542. VLDB Endowment (2007)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bredereck, R., Nichterlein, A., Niedermeier, R. (2013). Pattern-Guided k-Anonymity. In: Fellows, M., Tan, X., Zhu, B. (eds) Frontiers in Algorithmics and Algorithmic Aspects in Information and Management. Lecture Notes in Computer Science, vol 7924. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38756-2_35
Download citation
DOI: https://doi.org/10.1007/978-3-642-38756-2_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38755-5
Online ISBN: 978-3-642-38756-2
eBook Packages: Computer ScienceComputer Science (R0)