×

On optimizing the sum of the Rayleigh quotient and the generalized Rayleigh quotient on the unit sphere. (English) Zbl 1285.90042

Summary: Given symmetric matrices \(B,D\in \mathbb R^{n\times n }\) and a symmetric positive definite matrix \(W\in \mathbb R^{n\times n }\), maximizing the sum of the Rayleigh quotient \(\mathbf x^{\top } D \mathbf x\) and the generalized Rayleigh quotient \(\frac{\mathbf{x}^{\top}B \mathbf{x}}{\mathbf{x}^{\top}W\mathbf{x}}\) on the unit sphere not only is of mathematical interest in its own right, but also finds applications in practice. In this paper, we first present a real world application arising from the sparse Fisher discriminant analysis. To tackle this problem, our first effort is to characterize the local and global maxima by investigating the optimality conditions. Our results reveal that finding the global solution is closely related with a special extreme nonlinear eigenvalue problem, and in the special case \(D=\mu W\) \((\mu >0)\), the set of the global solutions is essentially an eigenspace corresponding to the largest eigenvalue of a specially-defined matrix. The characterization of the global solution not only sheds some lights on the maximization problem, but motives a starting point strategy to obtain the global maximizer for any monotonically convergent iteration. Our second part then realizes the Riemannian trust-region method of P.-A. Absil, C. G. Baker and K. A. Gallivan [Found. Comput. Math. 7, No. 3, 303–330 (2007; Zbl 1129.65045)] into a practical algorithm to solve this problem, which enjoys the nice convergence properties: global convergence and local superlinear convergence. Preliminary numerical tests are carried out and empirical evaluation of its performance is reported.

MSC:

90C26 Nonconvex programming, global optimization

Citations:

Zbl 1129.65045

Software:

PRMLT; ARPACK; eigs
Full Text: DOI

References:

[1] Abraham, R., Marsden, J.E., Ratiu, T.: Manifolds, Tensor Analysis, and Applications, 2nd edn. Applied Mathematical Sciences, vol. 75. Springer, New York (1988) · Zbl 0875.58002
[2] Absil, P.-A., Gallivan, K.A.: Accelerated line-search and trust-region methods. SIAM J. Numer. Anal. 47, 997–1018 (2009) · Zbl 1191.65067 · doi:10.1137/08072019X
[3] Absil, P.-A., Baker, C.G., Gallivan, K.A.: A truncated-CG style method for symmetric generalized eigenvalue problems. J. Comput. Appl. Math. 189, 274–285 (2006) · Zbl 1090.65042 · doi:10.1016/j.cam.2005.10.006
[4] Absil, P.-A., Baker, C.G., Gallivan, K.A.: Trust-region methods on Riemannian manifolds. Found. Comput. Math. 7, 303–330 (2007) · Zbl 1129.65045 · doi:10.1007/s10208-005-0179-9
[5] Absil, P.-A., Mahony, R., Sepulchre, R.: Optimization Algorithms on Matrix Manifolds. Princeton University Press, Princeton (2008) · Zbl 1147.65043
[6] Adler, R.L., Dedieu, J.-P., Margulies, J.Y., Martens, M., Shub, M.: Newton’s method on Riemannian manifolds and a geometric model for the human spine. IMA J. Numer. Anal. 22, 359–390 (2002) · Zbl 1056.92002 · doi:10.1093/imanum/22.3.359
[7] Bishop, C.M.: Pattern Recognition and Machine Learning. Springer, Berlin (2006) · Zbl 1107.68072
[8] Bonnans, J.F., Shapiro, A.: Perturbation Analysis of Optimization Problems. Springer, New York (2000) · Zbl 0966.49001
[9] Chu, M.T., Driessel, K.R.: The projected gradient method for least squares matrix approximations with spectral constraints. SIAM J. Numer. Anal. 27, 1050–1060 (1990) · Zbl 0704.65025 · doi:10.1137/0727062
[10] Conn, A.R., Gould, N.I.M., Toint, P.L.: Trust-Region Methods. SIAM, Philadelphia (2000) · Zbl 0958.65071
[11] Duchene, L., Leclerq, S.: An optimal transformation for discriminant and principal component analysis. IEEE Trans. Pattern Anal. Mach. Intell. 10, 978–983 (1988) · Zbl 0655.62064 · doi:10.1109/34.9121
[12] Dundar, M.M., Fung, G., Bi, J., Sandilya, S., Rao, B.: Sparse fisher discriminant analysis for computer aided detection. In: Proceedings of SIAM International Conference on Data Mining (2005)
[13] Edelman, A., Arias, T.A., Smith, S.T.: The geometry of algorithms with orthogonality constraints. SIAM J. Matrix Anal. Appl. 20, 303–353 (1998) · Zbl 0928.65050 · doi:10.1137/S0895479895290954
[14] Fan, J., Li, R.: Variable selection via nonconcave penalized likelihood and its oracle properties. J. Am. Stat. Assoc. 96, 1348–1360 (2001) · Zbl 1073.62547 · doi:10.1198/016214501753382273
[15] Fisher, R.A.: The use of multiple measurements in taxonomic problems. Annu. Eugen. 7, 179–188 (1936) · doi:10.1111/j.1469-1809.1936.tb02137.x
[16] Foley, D., Sammon, J.: An optimal set of discriminant vectors. IEEE Trans. Comput. 24, 281–289 (1975) · Zbl 0296.68106 · doi:10.1109/T-C.1975.224208
[17] Friedman, J.: Regularized discriminant analysis. J. Am. Stat. Assoc. 84, 165–175 (1989) · doi:10.1080/01621459.1989.10478752
[18] Fukunaga, K.: Introduction to Statistical Pattern Classification. Academic Press, San Diego (1990) · Zbl 0711.62052
[19] Fung, E., Ng, M.: On sparse fisher discriminant method for microarray data analysis. Bioinformation 2, 230–234 (2007) · doi:10.6026/97320630002230
[20] Gao, X.B., Golub, G.H., Liao, L.-Z.: Continuous methods for symmetric generalized eigenvalue problems. Linear Algebra Appl. 428, 676–696 (2008) · Zbl 1140.65029 · doi:10.1016/j.laa.2007.08.034
[21] Golub, G.H., Liao, L.-Z.: Continuous methods for extreme and interior eigenvalue problems. Linear Algebra Appl. 415, 31–51 (2006) · Zbl 1092.65029 · doi:10.1016/j.laa.2005.01.009
[22] Golub, G.H., Van Loan, C.F.: Matrix Computations, 3rd edn. Johns Hopkins University Press, Baltimore (1996) · Zbl 0865.65009
[23] Helmke, U., Moore, J.B.: Optimization and Dynamical Systems. Springer, London (1994) · Zbl 0984.49001
[24] Howland, P., Jeon, M., Park, H.: Structure preserving dimension reduction for clustered text data based on the generalized singular value decomposition. SIAM J. Matrix Anal. Appl. 25, 165–179 (2003) · Zbl 1061.68135 · doi:10.1137/S0895479801393666
[25] Hunter, D.R., Li, R.: Variable selection using MM algorithms. Ann. Stat. 33, 1617–1642 (2005) · Zbl 1078.62028 · doi:10.1214/009053605000000200
[26] Kelley, C.T.: Iterative Methods for Linear and Nonlinear Equations. SIAM, Philadelphia (1995) · Zbl 0832.65046
[27] Lehoucq, R.B., Sorensen, D.C.: Deflation techniques for an implicitly re-started Arnoldi iteration. SIAM J. Matrix Anal. Appl. 17, 789–821 (1996) · Zbl 0863.65016 · doi:10.1137/S0895479895281484
[28] Lehoucq, R.B., Sorensen, D.C., Yang, C.: ARPACK Users’ Guide: Solution of Large-Scale Eigenvalue Problems with Implicitly Restarted Arnoldi Methods. SIAM, Philadelphia (1998) · Zbl 0901.65021
[29] Ng, M.K., Liao, L.-Z., Zhang, L.-H.: On sparse linear discriminant analysis for high-dimensional data. Numer. Linear Algebra Appl. 18, 223–235 (2011) · Zbl 1248.62105 · doi:10.1002/nla.736
[30] Nocedal, J., Wright, S.J.: Numerical Optimization, 2nd edn. Springer, New York (2006) · Zbl 1104.65059
[31] Parlett, B.N.: The Rayleigh quotient iteration and some generalizations for nonnormal matrices. Math. Comput. 28, 679–693 (1974) · Zbl 0293.65023 · doi:10.1090/S0025-5718-1974-0405823-3
[32] Parlett, B.N.: The Symmetric Eigenvalue Problem. Classics Appl. Math., vol. 20. SIAM, Philadelphia (1998) · Zbl 0885.65039
[33] Primolevo, G., Simeone, O., Spagnolini, U.: Towards a joint optimization of scheduling and beamforming for MIMO downlink. In: IEEE Ninth International Symposium on Spread Spectrum Techniques and Applications, pp. 493–497 (2006)
[34] Saad, Y.: Numerical Methods for Large Eigenvalue Problems. Algorithms and Architectures for Advanced Scientific Computing. Manchester University Press, Manchester (1992)
[35] Steihaug, T.: The conjugate gradient method and trust regions in large scale optimization. SIAM J. Numer. Anal. 20, 626–637 (1983) · Zbl 0518.65042 · doi:10.1137/0720042
[36] Toint, P.L.: Towards an efficient sparsity exploiting newton method for minimization. In: Duff, I.S. (ed.) Sparse Matrices and Their Uses, pp. 57–88. Academic Press, London (1981) · Zbl 0463.65045
[37] Wu, M.C., Zhang, L.S., Wang, Z.X., Christiani, D.C., Lin, X.H.: Sparse linear discriminant analysis for simultaneous testing for the significance of a gene set/pathway and gene selection. Bioinformatics 25, 1145–1151 (2009) · doi:10.1093/bioinformatics/btp019
[38] Ye, J.-P., Janardan, R., Park, C., Park, H.: An optimization criterion for generalized discriminant analysis on undersampled problems. IEEE Trans. Pattern Anal. Mach. Intell. 26, 982–994 (2004) · doi:10.1109/TPAMI.2004.37
[39] Zhang, L.-H.: Uncorrected trace ratio LDA for undersampled problems. Pattern Recognit. Lett. 32, 476–484 (2011) · doi:10.1016/j.patrec.2010.11.008
[40] Zhang, L.-H., Liao, L.-Z., Ng, M.K.: Fast algorithms for the generalized Foley-Sammon discriminant analysis. SIAM J. Matrix Anal. Appl. 31, 1584–1605 (2010) · Zbl 1205.65031 · doi:10.1137/080720863
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.