Abstract
Real-time Action Recognition (ActRgn) of assembly workers can help manufacturers promptly correct human mistakes and improve task performance. Yet reliably recognizing worker actions in assembly is challenging because such actions are complex and fine-grained and workers are heterogeneous. This paper proposes an individualized system of Convolutional Neural Networks (CNNs) for action recognition using human skeletal data. The system comprises six 1-channel CNN classifiers, each built with one unique posture-related feature vector extracted from the time-series skeletal data. The six classifiers are then adapted to any new worker through transfer learning and iterative boosting. After that, an individualized fusion method named Weighted Average of Selected Classifiers (WASC) integrates the adapted classifiers into an ActRgn system that outperforms its constituent classifiers. A stream-data-analysis algorithm further differentiates assembly actions from the background and corrects misclassifications based on the temporal relationship of the actions in assembly. Compared to a CNN classifier built directly with the skeletal data, the proposed system improves the accuracy of action recognition by 28%, reaching 94% accuracy on the tested group of new workers. The study also lays a foundation for immediate extensions that adapt the ActRgn system to current workers performing new tasks and, subsequently, to new workers performing new tasks.
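For intuition, the WASC fusion named in the abstract amounts to selecting a subset of the per-worker classifiers and combining their class-probability outputs by a weighted average. The sketch below is illustrative only, not the authors' implementation: the function name, the shape conventions, and the uniform treatment of weights are assumptions for the example.

```python
import numpy as np

def wasc_fuse(probs, weights, keep=None):
    """Weighted average of selected classifiers (illustrative sketch).

    probs   : (n_classifiers, n_classes) per-classifier softmax outputs
    weights : (n_classifiers,) non-negative reliability weights
    keep    : indices of the classifiers selected for fusion (default: all)

    Returns the fused class index and the fused probability vector.
    """
    probs = np.asarray(probs, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if keep is None:
        keep = np.arange(len(weights))
    p, w = probs[keep], weights[keep]
    # Weighted average over the selected classifiers, renormalized by the
    # total weight so the fused vector remains a probability distribution.
    fused = (w[:, None] * p).sum(axis=0) / w.sum()
    return int(np.argmax(fused)), fused
```

In this reading, discarding weak classifiers (via `keep`) and weighting the remainder is what lets the fused system outperform any single constituent classifier.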
Acknowledgements
All the authors of this paper received financial support from the National Science Foundation through the Award CMMI-1646162. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Al-Amin, M., Qin, R., Moniruzzaman, M. et al. An individualized system of skeletal data-based CNN classifiers for action recognition in manufacturing assembly. J Intell Manuf 34, 633–649 (2023). https://doi.org/10.1007/s10845-021-01815-x