Designing an adaptive cost function for dynamic human pose predictions

Abstract

Machines and humans are increasingly expected to collaborate in social and manufacturing environments. For effective collaboration, a machine should predict a human's next move by observing the current one. Human motion modelling and prediction are fundamental and challenging problems in computer vision and graphics. To address some of these challenges, we propose a new cost function based on adaptive sampling, which serves as the objective function and is used with the Adam optimizer to train and validate a specially configured deep learning architecture. The proposed approach produced significantly improved results for future pose prediction. The adaptiveness of the cost function comes from a bell-shaped, locally weighted function. We observed that the area covered by the cost function plays a vital role during training, and that the width of the bell-shaped function helps determine the region of importance for the training samples. The cost function is used to train a gated recurrent unit (GRU) based encoder-decoder architecture. The encoder takes the observed input sequence, extracts its significant variability, and passes the result to the decoder. The decoder takes this representation as input, is trained with the adaptive sampling-based method, and predicts future poses. We experimented with this function in various sizes and shapes and compared the results with those of state-of-the-art methods. As elaborated in this paper, we obtained much-improved results in almost all cases.
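The abstract does not give the cost function's formula, so the following is only an illustrative sketch of one way a bell-shaped, locally weighted loss could look. It assumes a Gaussian bell placed over the time steps of the predicted sequence and a weighted mean squared error between predicted and target poses. The function names `bell_weights` and `weighted_pose_loss`, the parameters `center` and `width`, and the choice to apply the weights per time step are all hypothetical and are not taken from the paper.

```python
import math


def bell_weights(n_steps, center, width):
    """Gaussian-shaped weights over time-step indices, normalised to sum to 1.

    Assumption: the bell is a Gaussian; `width` plays the role of its
    standard deviation and controls how wide the "region of importance" is.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    raw = [math.exp(-0.5 * ((t - center) / width) ** 2) for t in range(n_steps)]
    total = sum(raw)
    return [w / total for w in raw]


def weighted_pose_loss(predicted, target, weights):
    """Weighted mean squared error over a sequence of pose vectors.

    `predicted` and `target` are lists of equal-length pose vectors,
    one vector per time step; `weights` holds one weight per time step.
    """
    loss = 0.0
    for w, p_t, y_t in zip(weights, predicted, target):
        # Mean squared error of a single pose (averaged over its coordinates).
        step_error = sum((p - y) ** 2 for p, y in zip(p_t, y_t)) / len(p_t)
        loss += w * step_error
    return loss


if __name__ == "__main__":
    # Toy example: 5 predicted time steps, 2 coordinates per pose.
    target = [[0.0, 0.0] for _ in range(5)]
    predicted = [[0.1 * t, 0.0] for t in range(5)]
    for width in (0.5, 1.0, 2.0):
        weights = bell_weights(5, center=2, width=width)
        print(width, weighted_pose_loss(predicted, target, weights))
```

In this sketch a smaller `width` concentrates the loss on the time steps near `center`, while a larger `width` spreads the importance more evenly across the sequence, which mirrors the role the abstract attributes to the width of the bell-shaped function.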

Data availability statement

Data sharing does not apply to this article as no datasets were generated or analyzed during the current study.

Author information

Authors and Affiliations

Department of Information Technology, Indian Institute of Information Technology Allahabad, Prayagraj, Uttar Pradesh, India
Gaurav Kumar Yadav & G. C. Nandi
Department of Computer Science and Mathematic Security, Universitat Rovira i Virgili, Tarragona, Spain
Gaurav Kumar Yadav & Domenec Puig

Corresponding author

Correspondence to Gaurav Kumar Yadav.

Ethics declarations

Conflict of Interest

The authors have no conflicts of interest to declare.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yadav, G.K., Puig, D. & Nandi, G.C. Designing an adaptive cost function for dynamic human pose predictions. Multimed Tools Appl 83, 53201–53219 (2024). https://doi.org/10.1007/s11042-023-17736-1

Received: 14 September 2021
Revised: 03 June 2023
Accepted: 23 November 2023
Published: 11 December 2023
Issue Date: May 2024
DOI: https://doi.org/10.1007/s11042-023-17736-1
