Key-skeleton-pattern mining on 3D skeletons represented by Lie group for action recognition. (English) Zbl 1427.68278

Summary: The human skeleton can be considered as a tree system of rigid bodies connected by bone joints. In recent research, substantial progress has been made in both the theory and the experimental evaluation of skeleton-based action recognition. However, it remains challenging to represent the skeleton accurately and to eliminate noisy skeletons from the action sequence precisely. This paper proposes a novel skeletal representation for recognizing human action, composed of two subfeatures: static features and dynamic features. First, to avoid scale variations from subject to subject, the orientations of the rigid bodies in a skeleton are employed to capture the scale-invariant spatial information of the skeleton. The static feature of the skeleton is defined as a combination of these orientations. Unlike previous orientation-based representations, the orientation of a rigid body in the skeleton is defined as the rotations between the rigid body and the coordinate axes in three-dimensional space. Each rotation is mapped to the special orthogonal group \(\mathrm{SO}(3)\). Next, the rigid-body motions between the skeleton and its previous skeletons are utilized to capture the temporal information of the skeleton. The dynamic feature of the skeleton is defined as a combination of these motions. Similarly, the motions are represented as points in the special Euclidean group \(\mathrm{SE}(3)\). Therefore, the proposed skeleton representation lies in the Lie group \((\mathrm{SE}(3) \times \cdots \times \mathrm{SE}(3), \mathrm{SO}(3) \times \cdots \times \mathrm{SO}(3))\), which is a manifold. Using the proposed representation, an action can be considered as a series of points in this Lie group. Then, to recognize human action more accurately, a new pattern-growth algorithm named MinP-PrefixSpan is proposed to mine the key-skeleton-patterns from the training dataset. Because the algorithm reduces the number of new patterns in each growth step, it is more efficient than the PrefixSpan algorithm. Finally, the key-skeleton-patterns are used to discover the most informative skeleton sequences of each action (skeleton sequence). The approach achieves accuracies of 94.70%, 98.87%, and 95.01% on three action datasets, outperforming other related action recognition approaches, including LieNet, Lie group, Grassmann manifold, and graph-based models.
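
For orientation, the pattern-growth idea that the summary builds on can be sketched as follows. This is a minimal illustration of the baseline PrefixSpan scheme only, not the MinP-PrefixSpan algorithm of the paper, whose pruning rule the summary does not specify. It assumes each skeleton has already been discretized to a symbolic label; the function name `prefixspan`, the labels `"s1"`, `"s2"`, `"s3"`, and the toy sequences are hypothetical.

```python
from collections import defaultdict


def prefixspan(sequences, min_support):
    """Return {pattern tuple: support} for all frequent sequential patterns.

    Baseline PrefixSpan for sequences of single items (assumption: one
    symbolic label per skeleton frame).
    """
    results = {}

    def grow(prefix, projected):
        # projected holds (sequence index, start offset) pairs, one for
        # each sequence that contains the current prefix.
        counts = defaultdict(int)
        for seq_id, start in projected:
            # Count every item at most once per projected sequence.
            for item in set(sequences[seq_id][start:]):
                counts[item] += 1
        for item, support in sorted(counts.items()):
            if support < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = support
            # Project each sequence onto the suffix that follows the
            # first occurrence of the item.
            new_projected = []
            for seq_id, start in projected:
                seq = sequences[seq_id]
                if item in seq[start:]:
                    new_projected.append((seq_id, seq.index(item, start) + 1))
            grow(pattern, new_projected)

    grow((), [(i, 0) for i in range(len(sequences))])
    return results


# Hypothetical toy data: three action instances as label sequences.
actions = [
    ["s1", "s2", "s3"],
    ["s1", "s3"],
    ["s2", "s1", "s3"],
]
patterns = prefixspan(actions, min_support=3)
print(patterns)
```

With a support threshold of 3, only the patterns that occur in all three toy sequences survive. Each growth step extends a prefix by one frequent item, which is the step whose number of candidate patterns the summary says MinP-PrefixSpan reduces.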

MSC:

68T10 Pattern recognition, speech recognition
70B15 Kinematics of mechanisms and robots

Software:

PrefixSpan; G3D; SPADE

References:

[1] Aggarwal, J. K.; Ryoo, M. S., Human activity analysis: a review, ACM Computing Surveys, 43, 3, article 16 (2011) · doi:10.1145/1922649.1922653
[2] Knutzen, K. M., Kinematics of human motion, American Journal of Human Biology, 10, 6, 808-809 (1998) · doi:10.1002/(SICI)1520-6300(1998)10:6<808::AID-AJHB13>3.0.CO;2-E
[3] Moeslund, T. B.; Hilton, A.; Krüger, V., A survey of advances in vision-based human motion capture and analysis, Computer Vision and Image Understanding, 104, 2-3, 90-126 (2006) · doi:10.1016/j.cviu.2006.08.002
[4] Li, W.; Zhang, Z.; Liu, Z., Action recognition based on a bag of 3D points, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW ’10) · doi:10.1109/cvprw.2010.5543273
[5] Vemulapalli, R.; Arrate, F.; Chellappa, R., Human action recognition by representing 3D skeletons as points in a Lie group, Proceedings of the 27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR ’14) · doi:10.1109/cvpr.2014.82
[6] Wang, C.; Wang, Y.; Yuille, A. L., An approach to pose-based action recognition, Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR ’13) · doi:10.1109/cvpr.2013.123
[7] Wang, J.; Liu, Z.; Wu, Y.; Yuan, J., Mining actionlet ensemble for action recognition with depth cameras, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR ’12) · doi:10.1109/cvpr.2012.6247813
[8] Johansson, G., Visual perception of biological motion and a model for its analysis, Perception & Psychophysics, 14, 2, 201-211 (1973) · doi:10.3758/BF03212378
[9] Yang, W.; Wang, Y.; Mori, G., Recognizing human actions from still images with latent poses, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR ’10), IEEE · doi:10.1109/cvpr.2010.5539879
[10] Murray, R. M.; Sastry, S. S., A Mathematical Introduction to Robotic Manipulation (1994), CRC Press · Zbl 0858.70001
[11] Boothby, W. M., An introduction to differentiable manifolds and Riemannian geometry (1975), Academic Press [A subsidiary of Harcourt Brace Jovanovich, Publishers], New York-London · Zbl 0333.53001
[12] Suykens, J. A. K.; Vandewalle, J., Least squares support vector machine classifiers, Neural Processing Letters, 9, 3, 293-300 (1999) · doi:10.1023/A:1018628609742
[13] Ding, W.; Liu, K.; Fu, X.; Cheng, F., Profile HMMs for skeleton-based human action recognition, Signal Processing: Image Communication, 42, 109-119 (2016) · doi:10.1016/j.image.2016.01.010
[14] Vemulapalli, R.; Chellappa, R., Rolling rotations for recognizing human actions from 3D skeletal data, Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016
[15] Pei, J.; Han, J.; Mortazavi-Asl, B.; Pinto, H.; Chen, Q.; Dayal, U.; Hsu, M.-C., PrefixSpan: mining sequential patterns efficiently by prefix-projected pattern growth, Proceedings of the 17th International Conference on Data Engineering
[16] Slama, R.; Wannous, H.; Daoudi, M.; Srivastava, A., Accurate 3D action recognition using learning on the Grassmann manifold, Pattern Recognition, 48, 2, 556-567 (2015) · doi:10.1016/j.patcog.2014.08.011
[17] Liu, J.; Akhtar, N.; Mian, A., Skepxels: Spatio-Temporal Image Representation of Human Skeleton Joints for Action Recognition (2017)
[18] Chaudhry, R.; Ofli, F.; Kurillo, G.; Bajcsy, R.; Vidal, R., Bio-inspired dynamic 3D discriminative skeletal features for human action recognition, Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2013
[19] Xia, L.; Chen, C.-C.; Aggarwal, J. K., View invariant human action recognition using histograms of 3D joints, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW ’12) · doi:10.1109/cvprw.2012.6239233
[20] Li, M.; Leung, H., Graph-based approach for 3D human skeletal action recognition, Pattern Recognition Letters, 87, 195-202 (2017) · doi:10.1016/j.patrec.2016.07.021
[21] Evangelidis, G.; Singh, G.; Horaud, R., Skeletal quads: Human action recognition using joint quadruples, Proceedings of the 22nd International Conference on Pattern Recognition, ICPR 2014
[22] Huang, Z.; Wan, C.; Probst, T.; Gool, L. V., Deep Learning on Lie Groups for Skeleton-Based Action Recognition, Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) · doi:10.1109/CVPR.2017.137
[23] Bloom, V.; Makris, D.; Argyriou, V., G3D: A gaming action dataset and real time action recognition evaluation framework, Proceedings of the 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2012 · doi:10.1109/CVPRW.2012.6239175
[24] Zaki, M. J., SPADE: an efficient algorithm for mining frequent sequences, Machine Learning, 42, 1-2, 31-60 (2001) · Zbl 0970.68052 · doi:10.1023/a:1007652502315
[25] Han, J.; Pei, J.; Mortazavi-Asl, B.; Chen, Q.; Dayal, U.; Hsu, M.-C., FreeSpan: Frequent pattern-projected sequential pattern mining, Proceedings of the 6th ACM SIGKDD International Conference on Knowledge Discovery in Databases
[26] Srikant, R.; Agrawal, R., Mining sequential patterns: generalizations and performance improvements, Advances in Database Technology—EDBT ’96, Lecture Notes in Computer Science, 1057, 1-17 (1996), Springer, Berlin, Germany · doi:10.1007/BFb0014140
[27] Muzammal, M.; Raman, R., Mining sequential patterns from probabilistic databases, Knowledge and Information Systems, 44, 2, 325-358 (2015) · doi:10.1007/s10115-014-0766-7
[28] Yang, J.; Wang, W.; Yu, P. S.; Han, J., Mining long sequential patterns in a noisy environment, Proceedings of the 2002 ACM SIGMOD International Conference on Management of Data
[29] Yang, X.; Tian, Y., Effective 3D action recognition using Eigen Joints, Journal of Visual Communication and Image Representation, 25, 1, 2-11 (2014) · doi:10.1016/j.jvcir.2013.03.001
[30] Carbonera Luvizon, D.; Tabia, H.; Picard, D., Learning features combination for human action recognition from skeleton sequences, Pattern Recognition Letters, 99, 13-20 (2017) · doi:10.1016/j.patrec.2017.02.001
[31] Nie, S.; Ji, Q., Capturing global and local dynamics for human action recognition, Proceedings of the 22nd International Conference on Pattern Recognition, ICPR 2014
[32] Ding, W.; Liu, K.; Belyaev, E.; Cheng, F., Tensor-based linear dynamical systems for action recognition from 3D skeletons, Pattern Recognition, 75-86 (2017) · doi:10.1016/j.patcog.2017.12.004
[33] Moerchen, F., Algorithms for time series knowledge mining, Proceedings of the 12th ACM SIGKDD international conference · doi:10.1145/1150402.1150485
[34] Aggarwal, J. K.; Xia, L., Human activity recognition from 3D data: a review, Pattern Recognition Letters, 48, 70-80 (2014) · doi:10.1016/j.patrec.2014.04.011