Non-metric Multidimensional Scaling for Privacy-Preserving Data Clustering

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6936)

Included in the following conference series:

International Conference on Intelligent Data Engineering and Automated Learning

Abstract

Outsourcing data to external parties for analysis is risky as the privacy of confidential variables can be easily violated. To eliminate this threat, the data values of these variables should be perturbed before releasing the data. However, the perturbation itself may significantly change the underlying properties of the data, affecting the analysis results. What is required is a subtle transformation to generate perturbed data that maintains, as much as possible, the statistical properties and effectiveness (i.e. the utility) of the original data whilst preserving the privacy. We examine privacy-preserving transformations in the context of data clustering. In particular, this paper demonstrates how non-metric multidimensional scaling (MDS) can be profitably used as a perturbation tool and how the perturbed data can be effectively used in clustering analysis without compromising privacy or utility. We apply the proposed technique to real datasets and compare the results, which were, in some circumstances, exactly the same as those obtained from the original data.
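
As background to the abstract, non-metric MDS in Kruskal's formulation only tries to preserve the rank order of pairwise dissimilarities, and scores a candidate configuration with a stress value. The following is a minimal pure-Python sketch of that score (stress-1 with a pool-adjacent-violators monotone fit). It is not the authors' implementation; the function names and the toy one-dimensional data are assumptions made for illustration only.

```python
from itertools import combinations
from math import dist, sqrt


def pairwise_distances(points):
    """Euclidean distances for all pairs (i < j), in a fixed pair order."""
    return [dist(p, q) for p, q in combinations(points, 2)]


def isotonic_fit(values):
    """Pool-adjacent-violators: least-squares non-decreasing fit to `values`."""
    blocks = []  # each block is [sum, count]
    for v in values:
        blocks.append([v, 1])
        # Merge while the previous block's mean exceeds the last block's mean.
        while (len(blocks) > 1
               and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)
    return fitted


def kruskal_stress(original, perturbed):
    """Stress-1 of `perturbed` against the rank order of distances in `original`."""
    d_orig = pairwise_distances(original)
    d_pert = pairwise_distances(perturbed)
    # Order the perturbed distances by increasing original dissimilarity.
    order = sorted(range(len(d_orig)), key=lambda k: d_orig[k])
    ranked = [d_pert[k] for k in order]
    # Disparities: the closest monotone sequence to the perturbed distances.
    disparities = isotonic_fit(ranked)
    num = sum((d - h) ** 2 for d, h in zip(ranked, disparities))
    den = sum(d ** 2 for d in ranked)
    return sqrt(num / den)


# Hypothetical toy data: four records with one attribute each.
original = [(0.0,), (1.0,), (3.0,), (7.0,)]
rank_preserving = [(0.0,), (2.0,), (5.0,), (11.0,)]  # values differ, distance ranks match
rank_violating = [(0.0,), (5.0,), (3.0,), (7.0,)]    # distance ranks are disturbed

print(kruskal_stress(original, rank_preserving))
print(kruskal_stress(original, rank_violating))
```

In this sketch the rank-preserving configuration has zero stress even though its values and distances differ from the original ones, while the rank-violating configuration has positive stress. A full non-metric MDS procedure would additionally search for the configuration that minimises this score.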

Author information

Authors and Affiliations

School of Computing Sciences, University of East Anglia, Norwich, NR4 7TJ, UK
Khaled Alotaibi, Victor J. Rayward-Smith & Beatriz de la Iglesia

Editor information

Editors and Affiliations

School of Electrical and Electronic Engineering, University of Manchester, Sackville Street Building, M60 1QD, Manchester, UK
Hujun Yin
School of Computing Sciences, University of East Anglia, NR4 7TJ, Norwich, UK
Wenjia Wang
University of East Anglia, NR4 7TJ, Norwich, UK
Victor Rayward-Smith

About this paper

Cite this paper

Alotaibi, K., Rayward-Smith, V.J., de la Iglesia, B. (2011). Non-metric Multidimensional Scaling for Privacy-Preserving Data Clustering. In: Yin, H., Wang, W., Rayward-Smith, V. (eds) Intelligent Data Engineering and Automated Learning - IDEAL 2011. IDEAL 2011. Lecture Notes in Computer Science, vol 6936. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23878-9_35

DOI: https://doi.org/10.1007/978-3-642-23878-9_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23877-2
Online ISBN: 978-3-642-23878-9
eBook Packages: Computer Science, Computer Science (R0)
