
Classification of gene-expression data: the manifold-based metric learning way. (English) Zbl 1103.68769

Summary: Classification of microarray gene-expression data can potentially aid medical diagnosis and has become an important topic in bioinformatics. However, microarray data sets usually have a small sample size relative to an overwhelming number of genes, which makes the classification problem fairly challenging. Instance-Based Learning (IBL) algorithms, such as the \(k\)-nearest-neighbor classifier (\(k\)-NN), are common baselines owing to their simplicity; in practice, however, \(k\)-NN does not perform very well in this field. This paper introduces manifold-based metric learning to improve the performance of IBL methods. A novel metric learning algorithm is proposed that exploits both local manifold structure and local discriminant information. In addition, a random subspace extension is presented. We apply the proposed algorithm to the gene-classification problem in three ways: in the original feature space, in a reduced feature space, and via the random subspace extension. Statistical evaluation shows that the proposed algorithm achieves promising results and yields significant performance improvements over traditional IBL algorithms.
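Two ingredients of the summary lend themselves to a generic illustration independent of the paper's specific metric: the \(k\)-NN baseline and its random subspace ensemble extension (cf. references [27, 28] below). The Python sketch that follows is a minimal illustration of that general idea, not the authors' algorithm; the toy data shapes and the parameter values (number of subspaces, subspace dimension, \(k\)) are assumptions made for the example.

# Minimal sketch (not the paper's method): a random-subspace ensemble of
# k-NN classifiers, in the spirit of the random subspace method [27]
# combined with nearest-neighbor base learners [28].
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

# Assumed toy data, microarray-like (few samples, many genes): 60 x 2000.
X_train = rng.normal(size=(60, 2000))
y_train = rng.integers(0, 2, size=60)   # two tissue classes (toy labels)
X_test = rng.normal(size=(10, 2000))

def random_subspace_knn(X, y, X_new, n_estimators=30, subspace_dim=50, k=3):
    """Majority vote over k-NN classifiers, each fitted on a random
    subset of the features (genes)."""
    votes = np.zeros((X_new.shape[0], n_estimators), dtype=int)
    for i in range(n_estimators):
        feats = rng.choice(X.shape[1], size=subspace_dim, replace=False)
        clf = KNeighborsClassifier(n_neighbors=k).fit(X[:, feats], y)
        votes[:, i] = clf.predict(X_new[:, feats])
    # Binary majority vote; ties are broken in favour of class 1.
    return (votes.mean(axis=1) >= 0.5).astype(int)

y_pred = random_subspace_knn(X_train, y_train, X_test)

In the same spirit, a learned Mahalanobis-type metric could be plugged into the base classifier via scikit-learn's metric='mahalanobis' option (passing the matrix through metric_params={'VI': M}); the manifold-based metric proposed in the paper is more elaborate than either sketch.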

MSC:

68T10 Pattern recognition, speech recognition
92C40 Biochemistry, molecular biology

Software:

LIBSVM
Full Text: DOI

References:

[1] Golub, T.; Slonim, D.; Tamayo, P.; Huard, C.; Gaasenbeek, M.; Mesirov, J.; Coller, H.; Loh, M.; Downing, J.; Caligiuri, M.; Bloomfield, C.; Lander, E., Molecular classification of cancer: class discovery and class prediction by gene expression monitoring, Science, 286, 531-537 (1999)
[2] van’t Veer, L.; Dai, H.; Van de Vijver, M.; He, Y.; Hart, A.; Mao, M.; Peterse, H.; Van der Kooy, K.; Marton, M.; Witteveen, A.; Schreiber, G.; Kerkhoven, R.; Roberts, C.; Linsley, P.; Bernards, R.; Friend, S., Gene expression profiling predicts clinical outcome of breast cancer, Nature, 415, 530-536 (2002)
[3] Dougherty, E. R., Small sample issues for microarray-based classification, Comparative and Functional Genomics, 2, 28-34 (2001)
[4] Mitchell, T., Machine Learning (1997), McGraw Hill: McGraw Hill New York · Zbl 0913.68167
[5] Theilhaber, J.; Connolly, T.; Roman-Roman, S.; Bushnell, S.; Jackson, A.; Call, K.; Garcia, T.; Baron, R., Finding genes in the C2C12 osteogenic pathway by k-nearest-neighbor classification of expression data, Genome Res., 12, 1, 165-176 (2002)
[6] Wu, W.; Xing, E.; Mian, I.; Bissell, M., Evaluation of normalization methods for cDNA microarray data by k-NN classification, BMC Bioinformatics, 6, 191, 1-21 (2005)
[7] Cover, T.; Hart, P., Nearest neighbor pattern classification, IEEE Trans. Inf. Theory, 13, 1, 21-27 (1967) · Zbl 0154.44505
[8] S. Salzberg, Distance metrics for instance-based learning, Lecture Notes in Computer Science, vol. 542, Springer, Berlin, 1991, pp. 399-408.
[9] A. Bar-Hillel, T. Hertz, N. Shental, D. Weinshall, Learning distance functions using equivalence relations, in: Proceedings of the 20th International Conference on Machine Learning (ICML), 2003, pp. 11-18. · Zbl 1222.68140
[10] Schultz, M.; Joachims, T., Learning a distance metric from relative comparisons, (Thrun, S.; Saul, L. K.; Schölkopf, B., Advances in Neural Information Processing Systems (NIPS) (2004), MIT Press: MIT Press Cambridge, MA), 41-48
[11] Z. Zhang, J. Kwok, D.-Y. Yeung, Parametric distance metric learning with label information, in: Proceedings of the 18th International Joint Conference on Artificial Intelligence (IJCAI), 2003, pp. 1450-1452.
[12] Li, S.; Chan, K.; Wang, C., Performance evaluation of the nearest feature line method in image classification and retrieval, IEEE Trans. Pattern Anal. Mach. Intell., 22, 11, 1335-1349 (2000)
[13] Li, S.; Lu, J., Face recognition using the nearest feature line method, IEEE Trans. Neural Networks, 10, 2, 439-443 (1999)
[14] Vincent, P.; Bengio, Y., K-local hyperplane and convex distance nearest neighbor algorithms, (Dietterich, T. G.; Becker, S.; Ghahramani, Z., Advances in Neural Information Processing Systems (NIPS), vol. 14 (2002), MIT Press: MIT Press Cambridge, MA), 985-992
[15] Simard, P.; LeCun, Y.; Denker, J., Efficient pattern recognition using a new transformation distance, (Hanson, S. J.; Cowan, J. D.; Giles, C. L., Advances in Neural Information Processing Systems (NIPS), vol. 5 (1993), Morgan Kaufmann Press: Morgan Kaufmann Press Los Altos, CA), 50-58
[16] Simard, P.; LeCun, Y.; Denker, J. S.; Victorri, B., Transformation invariance in pattern recognition: tangent distance and tangent propagation, Int. J. Imaging Syst. Technol., 11, 3, 181-194 (2001)
[17] Hastie, T.; Simard, P.; Saeckinger, E., Learning prototype models for tangent distance, (Tesauro, G.; Touretzky, D. S.; Leen, T. K., Advances in Neural Information Processing Systems (NIPS), vol. 7 (1995), MIT Press: MIT Press Cambridge, MA), 999-1006
[18] O. Okun, Protein fold recognition with k-local hyperplane distance nearest neighbor algorithm, in: Proceedings of the Second European Workshop on Data Mining and Text Mining for Bioinformatics, Pisa, Italy, 2004, pp. 51-57.
[19] Gray, A., Modern Differential Geometry of Curves and Surfaces (1993), CRC Press: CRC Press Boca Raton, FL · Zbl 0795.53001
[20] Hastie, T.; Tibshirani, R., Discriminant adaptive nearest neighbor classification, IEEE Trans. Pattern Anal. Mach. Intell., 18, 6, 409-415 (1996)
[21] J. Lee, J. Wang, C. Zhang, Z. Bian, Probabilistic tangent subspace: a unified view, in: Proceedings of the 21st International Conference on Machine Learning (ICML), 2004, pp. 528-535.
[22] Girolami, M., An alternative perspective on adaptive independent component analysis algorithms, Neural Comput., 10, 8, 2103-2114 (1998)
[23] Schapire, R.; Freund, Y.; Bartlett, P.; Lee, W. S., Boosting the margin: a new explanation for the effectiveness of voting methods, Ann. Statist., 26, 5, 1651-1686 (1998) · Zbl 0929.62069
[24] MacKay, D. J.C., Introduction to Monte Carlo methods, (Jordan, M. I., Learning in Graphical Models (1998), Kluwer Academic Press: Kluwer Academic Press Dordrecht), 175-204 · Zbl 0911.65004
[25] Vapnik, V., The Nature of Statistical Learning Theory (1997), Springer: Springer New York · Zbl 0934.62009
[26] Breiman, L., Random forests, Mach. Learning, 45, 5-32 (2001) · Zbl 1007.68152
[27] Ho, T. K., The random subspace method for constructing decision forests, IEEE Trans. Pattern Anal. Mach. Intell., 20, 8, 832-844 (1998)
[28] S.D. Bay, Combining nearest neighbor classifiers through multiple feature subsets, in: Proceedings of the 15th International Conference on Machine Learning (ICML), 1998, pp. 37-45.
[29] Alon, U.; Barkai, N.; Notterman, D.; Gish, K.; Ybarra, S.; Mack, D.; Levine, A., Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays, Proc. Natl. Acad. Sci., 96, 6745-6750 (1999)
[30] Iizuka, N.; Oka, M.; Yamada-Okabe, H.; Nishida, M.; Maeda, Y.; Mori, N.; Takao, T.; Tamesa, T.; Tangoku, A.; Tabuchi, H.; Hamada, K.; Nakayama, H.; Ishitsuka, H.; Miyamoto, T.; Hirabayashi, A.; Uchimura, S.; Hamamoto, Y., Oligonucleotide microarray for prediction of early intrahepatic recurrence of hepatocellular carcinoma after curative resection, Lancet, 361, 923-929 (2003)
[31] Nutt, C.; Mani, D.; Betensky, R.; Tamayo, P.; Cairncross, J.; Ladd, C.; Pohl, U.; Hartmann, C.; McLaughlin, M.; Batchelor, T.; Black, P.; von Deimling, A.; Pomeroy, S.; Golub, T.; Louis, D., Gene expression-based classification of malignant gliomas correlates better with survival than histological classification, Cancer Res., 63, 7, 1602-1607 (2003)
[32] Singh, D.; Febbo, P.; Ross, K.; Jackson, D.; Manola, J.; Ladd, C.; Tamayo, P.; Renshaw, A.; D’Amico, A.; Richie, J.; Lander, E.; Loda, M.; Kantoff, P.; Golub, T.; Sellers, W., Gene expression correlates of clinical prostate cancer behavior, Cancer Cell, 1, 2, 203-209 (2002)
[33] Furey, T.; Cristianini, N.; Duffy, N.; Bednarski, D.; Schummer, M.; Haussler, D., Support vector machine classification and validation of cancer tissue samples using microarray expression data, Bioinformatics, 16, 906-914 (2000)
[34] Pochet, N.; Smet, F. D.; Suykens, J.; Moor, B. D., Systematic benchmarking of microarray data classification, Bioinformatics, 20, 17, 3185-3195 (2004)
[35] Efron, B., The Jackknife, the Bootstrap, and Other Resampling Plans (1982), SIAM Press · Zbl 0496.62036
[36] Sima, C.; Braga-Neto, U.; Dougherty, E., Bolstered error estimation provides superior feature-set ranking for small samples, Bioinformatics, 21, 7, 1046-1054 (2005)
[37] Fu, W. J.; Carroll, R. J.; Wang, S., Estimating misclassification error with small samples via bootstrap cross-validation, Bioinformatics, 21, 9, 1979-1986 (2005)
[38] C.-C. Chang, C.-J. Lin, LIBSVM: a library for support vector machines, Technical Report, ⟨http://www.csie.ntu.edu.tw/~⟩
[39] Hollander, M.; Wolfe, D. A., Nonparametric Statistical Methods (1999), Wiley: Wiley New York · Zbl 0997.62511
[40] Hsu, J. C., Multiple Comparisons: Theory and Methods (1996), Chapman & Hall: Chapman & Hall London · Zbl 0898.62090
[41] Yandell, B. S., Practical Data Analysis for Designed Experiments (1997), Chapman & Hall: Chapman & Hall London · Zbl 1056.62500
[42] Kira, K.; Rendell, L., A practical approach to feature selection, (Proceedings of the Ninth International Conference on Machine Learning (1992), Morgan Kaufmann: Morgan Kaufmann Los Altos, CA), 249-256
[43] Liu, H.; Motoda, H., Feature Selection for Knowledge Discovery and Data Mining (1998), Kluwer Academic Publishers: Kluwer Academic Publishers Dordrecht · Zbl 0908.68127
[44] M.A. Hall, G. Holmes, Benchmarking attribute selection techniques for discrete class data mining, IEEE Trans. Knowl. Data Eng. 15 (3) (2003) 1-16.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data-conversion errors. In some cases, these data have been complemented or enhanced with data from zbMATH Open. The list attempts to reflect the references of the original paper as accurately as possible, without claiming completeness or a perfect matching.