
Designing an adaptive cost function for dynamic human pose predictions

Multimedia Tools and Applications

Abstract

Machines and humans are increasingly expected to collaborate in social and manufacturing environments. For such collaboration to be effective, a machine should predict a human's next move by observing the current one. Human motion modelling and prediction are fundamental and challenging problems in computer vision and graphics. To address some of these challenges, we propose a new cost function, based on adaptive sampling, which serves as the objective function and is used with the Adam optimizer to train and validate a specially configured deep learning architecture. The adaptiveness of the proposed cost function comes from a bell-shaped, locally weighted function. We observe that the area covered by the cost function plays a vital role during training, and that the width of the bell-shaped function helps determine the region of importance for the training samples. The proposed cost function is used to train a gated recurrent unit (GRU) based encoder-decoder architecture. The encoder takes the observed input sequence, extracts its significant variability, and passes this representation to the decoder. The decoder takes it as input, is trained with the adaptive sampling-based method, and predicts future poses. We experimented with this function in various sizes and shapes and compared the results with state-of-the-art methods. As elaborated in this paper, the proposed approach produced significantly improved future pose predictions in almost all cases.
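The abstract does not state the exact form of the cost function, so the following is only an illustrative sketch of how a bell-shaped weighting could shape a pose prediction loss. It assumes a Gaussian bell over the predicted time steps and a per-frame mean squared error; the names `bell_weights`, `weighted_pose_loss`, `center`, and `width` are hypothetical and do not come from the paper.

```python
import math


def bell_weights(num_steps, center, width):
    """Gaussian-shaped weights over time-step indices 0..num_steps-1.

    Assumption: the bell is a Gaussian; `width` plays the role of its
    standard deviation. Weights are normalized to sum to 1.
    """
    raw = [math.exp(-0.5 * ((t - center) / width) ** 2) for t in range(num_steps)]
    total = sum(raw)
    return [w / total for w in raw]


def weighted_pose_loss(predicted, target, weights):
    """Weighted sum of per-frame mean squared errors.

    `predicted` and `target` are lists of frames; each frame is a list of
    joint values. Frames near the bell's center contribute more to the loss.
    """
    loss = 0.0
    for w, p_frame, t_frame in zip(weights, predicted, target):
        frame_err = sum((p - t) ** 2 for p, t in zip(p_frame, t_frame)) / len(p_frame)
        loss += w * frame_err
    return loss


# Five predicted frames with two joint values each (toy data).
weights = bell_weights(num_steps=5, center=2, width=1.0)
target = [[0.0, 0.0] for _ in range(5)]
error_at_center = [[0.0, 0.0], [0.0, 0.0], [1.0, 1.0], [0.0, 0.0], [0.0, 0.0]]
error_at_edge = [[1.0, 1.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]

loss_center = weighted_pose_loss(error_at_center, target, weights)
loss_edge = weighted_pose_loss(error_at_edge, target, weights)
```

In this sketch the same pose error is penalized more when it falls near the center of the bell than at its edge, and a larger `width` spreads the weight over more frames, which is one way to read the abstract's remark that the width decides the region of importance.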

Data availability statement

Data sharing does not apply to this article as no datasets were generated or analyzed during the current study.

Author information

Corresponding author

Correspondence to Gaurav Kumar Yadav.

Ethics declarations

Conflict of Interest

The authors have no conflicts of interest to declare.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yadav, G.K., Puig, D. & Nandi, G.C. Designing an adaptive cost function for dynamic human pose predictions. Multimed Tools Appl 83, 53201–53219 (2024). https://doi.org/10.1007/s11042-023-17736-1

  • DOI: https://doi.org/10.1007/s11042-023-17736-1
