
A general framework for dimensionality reduction of K-means clustering. (English) Zbl 07300764

Summary: Dimensionality reduction plays an important role in many machine learning and pattern recognition applications. Linear discriminant analysis (LDA) is the most popular supervised dimensionality reduction technique; it searches for the projection matrix that places data points of different classes far from each other while keeping data points of the same class close to each other. In this paper, trace ratio LDA is combined with K-means clustering into a unified framework, in which K-means clustering is employed to generate class labels for unlabeled data and LDA is used to find a low-dimensional representation of the data. By combining subspace clustering with dimensionality reduction, the optimal subspace can be obtained. Unlike other existing dimensionality reduction methods, our novel framework is suitable for different scenarios: supervised, semi-supervised, and unsupervised dimensionality reduction. Experimental results on benchmark datasets validate the effectiveness and superiority of our algorithm compared with other relevant techniques.
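
The summary does not spell out the optimization procedure, so the following is only a minimal sketch of the general idea for the unsupervised case, assuming a plain alternation between the two steps: K-means assigns pseudo-labels, and a trace ratio LDA step then finds an orthonormal projection W that approximately maximizes tr(W^T S_b W) / tr(W^T S_w W) for those labels. The function names (`scatter_matrices`, `trace_ratio_lda`, `kmeans`, `kmeans_lda`), the iteration limits, and the stopping rules are all illustrative assumptions, not the authors' algorithm.

```python
import numpy as np

def scatter_matrices(X, labels, k):
    """Within-class (Sw) and between-class (Sb) scatter for the rows of X."""
    mean = X.mean(axis=0)
    dim = X.shape[1]
    Sw = np.zeros((dim, dim))
    Sb = np.zeros((dim, dim))
    for c in range(k):
        Xc = X[labels == c]
        if len(Xc) == 0:          # skip empty clusters
            continue
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        Sb += len(Xc) * np.outer(mc - mean, mc - mean)
    return Sw, Sb

def trace_ratio_lda(Sw, Sb, d, n_iter=50, tol=1e-8):
    """Orthonormal W (dim x d) from the usual iterative trace ratio scheme."""
    W = np.eye(Sw.shape[0])[:, :d]
    lam = 0.0
    for _ in range(n_iter):
        denom = max(np.trace(W.T @ Sw @ W), 1e-12)
        new_lam = np.trace(W.T @ Sb @ W) / denom
        # eigh returns eigenvalues in ascending order; keep the d largest
        _, vecs = np.linalg.eigh(Sb - new_lam * Sw)
        W = vecs[:, -d:]
        if abs(new_lam - lam) < tol:
            break
        lam = new_lam
    return W

def kmeans(Z, k, rng, n_iter=100):
    """Plain Lloyd iterations; returns one cluster index per row of Z."""
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(n_iter):
        dists = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            Z[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def kmeans_lda(X, k, d, n_outer=10, seed=0):
    """Alternate K-means pseudo-labelling and trace ratio LDA (assumed scheme)."""
    rng = np.random.default_rng(seed)
    X = X - X.mean(axis=0)
    labels = kmeans(X, k, rng)            # initial pseudo-labels in the full space
    for _ in range(n_outer):
        Sw, Sb = scatter_matrices(X, labels, k)
        W = trace_ratio_lda(Sw, Sb, d)
        new_labels = kmeans(X @ W, k, rng)  # re-cluster in the learned subspace
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return W, labels
```

In a supervised or semi-supervised setting, one would presumably keep the known labels fixed and let the K-means step assign labels only to the unlabeled points. That variant is not shown here.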

MSC:

62H30 Classification and discrimination; cluster analysis (statistical aspects)

Software:

JAFFE
