
On the equivalent of low-rank linear regressions and linear discriminant analysis based regressions

Published: 11 August 2013

Abstract

The low-rank regression model has been studied and applied to capture the underlying correlation patterns among classes/tasks, so that regression/classification results can be enhanced. In this paper, we prove that the low-rank regression model is equivalent to performing linear regression in the linear discriminant analysis (LDA) subspace. Our new theory reveals the learning mechanism of low-rank regression and shows that the low-rank structures extracted from classes/tasks are connected to the LDA projection results. Thus, low-rank regression works efficiently for high-dimensional data.
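To make the model concrete, here is a minimal sketch (assuming synthetic data, NumPy, and our own variable names; this is not the authors' code) of low-rank regression with one-hot class targets. The rank-s coefficient matrix below is the classical reduced-rank least-squares solution, obtained by projecting the full-rank fit onto the top-s right singular vectors of the fitted values; per the equivalence result above, the same fit can be obtained by regressing in the s-dimensional LDA subspace.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, c = 200, 30, 5                  # samples, features, classes (toy sizes)
s = c - 1                             # target rank; matches the LDA subspace dimension

X = rng.standard_normal((n, d))       # toy design matrix
Y = np.eye(c)[rng.integers(0, c, n)]  # one-hot class indicator targets

# Full-rank least squares: W = (X^T X)^+ X^T Y
W_full = np.linalg.pinv(X.T @ X) @ X.T @ Y

# Rank-s solution of min ||Y - XW||_F^2 s.t. rank(W) <= s:
# project the full-rank fit onto the top-s right singular
# vectors of the fitted values X @ W_full (Eckart-Young).
_, _, Vt = np.linalg.svd(X @ W_full, full_matrices=False)
V_s = Vt[:s].T                        # c x s
W_low = W_full @ V_s @ V_s.T          # rank-s coefficient matrix

print(np.linalg.matrix_rank(W_low))   # -> 4
```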
Moreover, we propose new discriminant low-rank ridge regression and sparse low-rank regression methods. Both are equivalent to performing regularized regression in the regularized LDA subspace. These new regularized objectives provide better data mining results than existing low-rank regression, in both theoretical and empirical validations. We evaluate our discriminant low-rank regression methods on six benchmark datasets. In all empirical results, our discriminant low-rank models consistently outperform the corresponding full-rank methods.
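The regularized variants admit a similar sketch. The following alternating least-squares solver for the factorized objective min over A, B of ||Y - XAB||_F^2 + lam * ||AB||_F^2 is an illustrative assumption on our part (the function name, updates, and defaults are ours, not the paper's algorithm); each step is the closed-form minimizer obtained by zeroing the gradient with the other factor held fixed.

```python
import numpy as np

def low_rank_ridge(X, Y, s, lam=1.0, iters=100, seed=0):
    """Alternating minimization of ||Y - X A B||_F^2 + lam * ||A B||_F^2
    with A (d x s) and B (s x c), so W = A @ B has rank at most s."""
    d = X.shape[1]
    A = np.random.default_rng(seed).standard_normal((d, s))
    G = X.T @ X + lam * np.eye(d)     # regularized Gram matrix
    XtY = X.T @ Y
    for _ in range(iters):
        # Fix A, solve for B: (A^T G A) B = A^T X^T Y
        B = np.linalg.solve(A.T @ G @ A, A.T @ XtY)
        # Fix B, solve for A: G A (B B^T) = X^T Y B^T
        A = np.linalg.solve(G, XtY @ B.T) @ np.linalg.pinv(B @ B.T)
    return A @ B                      # d x c coefficient matrix, rank <= s

# Hypothetical usage with the toy X, Y above:
# W = low_rank_ridge(X, Y, s=4, lam=10.0); predictions are X @ W.
```

Per the paper's theory, this regularized low-rank fit corresponds to regularized regression carried out in the regularized LDA subspace; swapping the squared Frobenius penalty for a sparsity-inducing one would give the sparse low-rank variant (our reading, stated here only as an illustration).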

Published In

KDD '13: Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
August 2013, 1534 pages
ISBN: 9781450321747
DOI: 10.1145/2487575


Publisher

Association for Computing Machinery, New York, NY, United States

    Author Tags

    1. linear discriminant analysis
    2. low-rank regression
    3. low-rank ridge regression
    4. sparse low-rank regression

    Qualifiers

    • Poster

Conference

KDD '13

    Acceptance Rates

KDD '13 paper acceptance rate: 125 of 726 submissions, 17%
Overall acceptance rate: 1,133 of 8,635 submissions, 13%

Article Metrics

• Downloads (last 12 months): 41
• Downloads (last 6 weeks): 1

Reflects downloads up to 23 Oct 2024

Cited By
• (2024) Generalized and Robust Least Squares Regression. IEEE Transactions on Neural Networks and Learning Systems, 35(5), 7006-7020. DOI: 10.1109/TNNLS.2022.3213594
• (2024) Discriminative Regression With Adaptive Graph Diffusion. IEEE Transactions on Neural Networks and Learning Systems, 35(2), 1797-1809. DOI: 10.1109/TNNLS.2022.3185408
• (2024) Latent Linear Discriminant Analysis for feature extraction via Isometric Structural Learning. Pattern Recognition, 149, 110218. DOI: 10.1016/j.patcog.2023.110218
• (2024) LPRR: Locality preserving robust regression based jointly sparse feature extraction. Information Sciences, 121128. DOI: 10.1016/j.ins.2024.121128
• (2024) A Joint Learning Framework for Optimal Feature Extraction and Multi-class SVM. Information Sciences, 120656. DOI: 10.1016/j.ins.2024.120656
• (2024) Discriminative elastic-net broad learning systems for visual classification. Applied Soft Computing, 111445. DOI: 10.1016/j.asoc.2024.111445
• (2024) Towards exploiting linear regression for multi-class/multi-label classification: an empirical analysis. International Journal of Machine Learning and Cybernetics, 15(9), 3671-3700. DOI: 10.1007/s13042-024-02114-6
• (2023) Broad Learning Model with a Dual Feature Extraction Strategy for Classification. Mathematics, 11(19), 4087. DOI: 10.3390/math11194087
• (2023) Discriminative Projected Clustering via Unsupervised LDA. IEEE Transactions on Neural Networks and Learning Systems, 34(11), 9466-9480. DOI: 10.1109/TNNLS.2022.3202719
• (2023) Robust Supervised and Semisupervised Least Squares Regression Using ℓ2,p-Norm Minimization. IEEE Transactions on Neural Networks and Learning Systems, 34(11), 8389-8403. DOI: 10.1109/TNNLS.2022.3150102
