Document Zbl 1326.68229

Lei, Hao; Mei, Kuizhi; Zheng, Nanning; Dong, Peixiang; Zhou, Ning; Fan, Jianping

Learning group-based dictionaries for discriminative image representation. (English) Zbl 1326.68229

Pattern Recognition 47, No. 2, 899-913 (2014).

Summary: Dictionary learning is a critical issue for achieving discriminative image representation in many computer vision tasks such as object detection and image classification. In this paper, a new algorithm is developed for learning discriminative group-based dictionaries, where the inter-concept (category) visual correlations are leveraged to enhance both the reconstruction quality and the discrimination power of the group-based discriminative dictionaries. A visual concept network is first constructed for determining the groups of visually similar object classes and image concepts automatically. For each group of such visually similar object classes and image concepts, a group-based dictionary is learned for achieving discriminative image representation. A structural learning approach is developed to take advantage of our group-based discriminative dictionaries for classifier training and image classification. The effectiveness and the discrimination power of our group-based discriminative dictionaries have been evaluated on multiple popular visual benchmarks.
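
Illustrative sketch (not the authors' implementation): the summary does not give algorithmic details, so the Python below only mirrors the two-stage outline of grouping visually similar classes and then learning one dictionary per group. It rests on loud assumptions. Thresholded cosine similarity with connected components stands in for the visual concept network. A plain k-means codebook stands in for discriminative dictionary learning. All function names, the threshold, and the toy 2-D descriptors are hypothetical.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def group_classes(class_means, threshold):
    """Group classes whose mean features are similar (connected components).

    Assumption: thresholded cosine similarity is a stand-in for the
    paper's visual concept network, whose construction is not given here.
    """
    names = list(class_means)
    parent = {n: n for n in names}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if cosine(class_means[a], class_means[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


def learn_codebook(points, k, iters=20, seed=0):
    """Plain k-means; the k centroids act as a small per-group dictionary."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Keep the old centroid when a cluster ends up empty.
        centroids = [
            [sum(col) / len(c) for col in zip(*c)] if c else centroids[j]
            for j, c in enumerate(clusters)
        ]
    return centroids


# Toy data: hypothetical 2-D "descriptors" for four classes.
features = {
    "cat": [[1.0, 0.1], [0.9, 0.2]],
    "dog": [[0.8, 0.3], [1.0, 0.2]],
    "car": [[0.1, 1.0], [0.2, 0.9]],
    "bus": [[0.3, 0.8], [0.2, 1.0]],
}
class_means = {
    c: [sum(col) / len(f) for col in zip(*f)] for c, f in features.items()
}
groups = group_classes(class_means, threshold=0.95)
dictionaries = {
    tuple(g): learn_codebook([p for c in g for p in features[c]], k=2)
    for g in groups
}
```

In this toy setting the two animal classes and the two vehicle classes fall into separate groups, and each group receives its own two-atom codebook. A faithful reproduction would replace both stand-ins with the constructions described in the paper itself.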

MSC:

68T05	Learning and adaptive systems in artificial intelligence
68T45	Machine vision and scene understanding

Keywords:

group-based dictionary learning; discriminative image representation; bag-of-visual-words; structural learning; image classification

Software:

PCA-SIFT; LIBSVM; Caltech-256; SIFT; Vlfeat; WordNet; ImageNet

References:

[5]	Datta, R.; Joshi, D.; Li, J.; Wang, J. Z., Image retrieval: ideas, influences, and trends of the new age, ACM Computing Surveys, 40, 5:1-5:60 (2008)
[6]	Jégou, H.; Douze, M.; Schmid, C., Improving bag-of-features for large scale image search, International Journal of Computer Vision, 87, 316-336 (2010)
[9]	Aharon, M.; Elad, M.; Bruckstein, A., K-SVD: an algorithm for designing overcomplete dictionaries for sparse representation, IEEE Transactions on Signal Processing, 54, 4311-4322 (2006) · Zbl 1375.94040
[14]	Wright, J.; Yang, A.; Ganesh, A.; Sastry, S.; Ma, Y., Robust face recognition via sparse representation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 31, 210-227 (2009)
[16]	Wang, H.; Yuan, C.; Hu, W.; Sun, C., Supervised class-specific dictionary learning for sparse modeling in action recognition, Pattern Recognition, 45, 3902-3911 (2012)
[17]	Fan, J.; Gao, Y.; Luo, H., Integrating concept ontology and multitask learning to achieve more effective classifier training for multilevel image annotation, IEEE Transactions on Image Processing, 17, 407-426 (2008)
[18]	Fan, J.; Gao, Y.; Luo, H.; Jain, R., Mining multilevel image semantics via hierarchical classification, IEEE Transactions on Multimedia, 10, 167-187 (2008)
[19]	Naphade, M.; Smith, J.; Tesic, J.; Chang, S.-F.; Hsu, W.; Kennedy, L.; Hauptmann, A.; Curtis, J., Large-scale concept ontology for multimedia, IEEE Multimedia, 13, 86-91 (2006)
[24]	Cilibrasi, R.; Vitanyi, P., The Google similarity distance, IEEE Transactions on Knowledge and Data Engineering, 19, 370-383 (2007)
[25]	Miller, G. A., WordNet: a lexical database for English, Communications of the ACM, 38, 39-41 (1995)
[26]	Mikolajczyk, K.; Schmid, C., A performance evaluation of local descriptors, IEEE Transactions on Pattern Analysis and Machine Intelligence, 27, 1615-1630 (2005)
[27]	Wright, J.; Ma, Y.; Mairal, J.; Sapiro, G.; Huang, T.; Yan, S., Sparse representation for computer vision and pattern recognition, Proceedings of the IEEE, 98, 1031-1044 (2010)
[28]	Barnard, K.; Duygulu, P.; Forsyth, D.; de Freitas, N.; Blei, D. M.; Jordan, M. I., Matching words and pictures, Journal of Machine Learning Research, 3, 1107-1135 (2003) · Zbl 1061.68174
[32]	Lowe, D. G., Distinctive image features from scale-invariant keypoints, International Journal of Computer Vision, 60, 91-110 (2004)
[35]	Fernando, B.; Fromont, E.; Muselet, D.; Sebban, M., Supervised learning of Gaussian mixture models for visual vocabulary generation, Pattern Recognition, 45, 897-907 (2012) · Zbl 1225.68176
[38]	Varma, M.; Zisserman, A., A statistical approach to texture classification from single images, International Journal of Computer Vision, 62, 61-81 (2005)
[39]	Zhang, J.; Marszałek, M.; Lazebnik, S.; Schmid, C., Local features and kernels for classification of texture and object categories: a comprehensive study, International Journal of Computer Vision, 73, 213-238 (2007)
[40]	Perronnin, F., Universal and adapted vocabularies for generic visual categorization, IEEE Transactions on Pattern Analysis and Machine Intelligence, 30, 1243-1256 (2008)
[44]	Efron, B.; Hastie, T.; Johnstone, I.; Tibshirani, R., Least angle regression, Annals of Statistics, 32, 407-499 (2004) · Zbl 1091.62054
[46]	Bao, B.-K.; Zhu, G.; Shen, J.; Yan, S., Robust image analysis with sparse representation on quantized visual features, IEEE Transactions on Image Processing, 22, 860-871 (2013) · Zbl 1373.94037
[51]	Shi, J.; Malik, J., Normalized cuts and image segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 22, 888-905 (2000)
[55]	Chang, C.-C.; Lin, C.-J., LIBSVM: a library for support vector machines, ACM Transactions on Intelligent Systems and Technology, 2, 27:1-27:27 (2011), Software available at: http://www.csie.ntu.edu.tw/~cjlin/libsvm
[56]	Hsu, C.-W.; Lin, C.-J., A comparison of methods for multiclass support vector machines, IEEE Transactions on Neural Networks, 13, 415-425 (2002)
[64]	Fan, J.; He, X.; Zhou, N.; Peng, J.; Jain, R., Quantitative characterization of semantic gaps for learning complexity estimation and inference model selection, IEEE Transactions on Multimedia, 14, 1414-1428 (2012)

This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or a perfect matching.