
Real-time stylistic prediction for whole-body human motions. (English) Zbl 1259.68201

Summary: The ability to predict human motion is crucial in several contexts, such as human tracking by computer vision and the synthesis of human-like computer graphics. Previous work has focused on off-line processing of well-segmented data; however, many applications, such as robotics, require real-time control with efficient computation. In this paper, we propose a novel approach, real-time stylistic prediction for whole-body human motions, that satisfies these requirements. The approach uses a novel generative model to represent whole-body human motion, including rhythmic motion (e.g., walking) and discrete motion (e.g., jumping). The generative model combines low-dimensional state (phase) dynamics with a two-factor observation model, allowing it to capture the diversity of human motion styles. A real-time adaptation algorithm is derived to estimate both the state variables and the style parameter of the model from non-stationary, unlabeled sequential observations. Moreover, with a simple modification, the algorithm supports real-time adaptation even from incomplete (partial) observations. Based on the estimated state and style, a future motion sequence can be predicted accurately. In our implementation, adaptation and prediction together take less than 15 ms per observation. The proposed real-time stylistic prediction is evaluated on human walking, running, and jumping behaviors.
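
To make the model structure concrete, the following is a minimal sketch, in Python/NumPy, of the kind of system described above: a circular phase variable with simple dynamics, a two-factor (bilinear style-by-phase) observation model, and an online update of the style parameter that also handles partial observations. It is illustrative only, not the authors' algorithm; the dimensions, the radial-basis phase features, the phase velocity, and the learning rate are all assumptions made for this example.

# Minimal illustrative sketch (not the authors' implementation): phase dynamics plus a
# bilinear (two-factor) observation model, adapted online to streaming pose observations.
# All sizes, the radial-basis parameterization, and the step size are assumptions.
import numpy as np

rng = np.random.default_rng(0)

D_pose = 20      # observed pose dimension (assumed)
K_basis = 8      # number of phase basis functions (assumed)
S_style = 3      # style-parameter dimension (assumed)

# Two-factor observation tensor: pose = W contracted with style and phase factors
W = rng.normal(scale=0.1, size=(D_pose, S_style, K_basis))
centers = np.linspace(0.0, 2 * np.pi, K_basis, endpoint=False)

def phase_features(phi, width=0.5):
    """Radial-basis features of the circular phase variable."""
    d = np.angle(np.exp(1j * (phi - centers)))        # wrapped phase distance in [-pi, pi]
    return np.exp(-0.5 * (d / width) ** 2)

def predict_pose(phi, style):
    """Bilinear observation model: style and phase factors combined through W."""
    return np.einsum('dsk,s,k->d', W, style, phase_features(phi))

# --- online adaptation loop over streaming observations y_t ---
phi, omega = 0.0, 2 * np.pi / 60.0   # phase and phase velocity (assumed ~60-frame cycle)
style = np.zeros(S_style)            # style parameter, adapted from data
eta = 0.05                           # step size for the style update (assumed)

def step(y_t, mask=None):
    """One adaptation step; `mask` marks observed pose dimensions (partial observations)."""
    global phi, style
    phi = (phi + omega) % (2 * np.pi)                 # simple phase dynamics
    y_hat = predict_pose(phi, style)
    if mask is None:
        mask = np.ones_like(y_t, dtype=bool)
    err = np.where(mask, y_t - y_hat, 0.0)            # ignore missing dimensions
    # Gradient of the squared error w.r.t. style (the model is linear in style)
    J = np.einsum('dsk,k->ds', W, phase_features(phi))   # d y_hat / d style
    style += eta * J[mask].T @ err[mask]
    return y_hat                                      # one-step-ahead pose prediction

# Example: feed a synthetic observation with half the pose dimensions missing
y = rng.normal(size=D_pose)
observed = np.arange(D_pose) < D_pose // 2
print(step(y, mask=observed)[:5])

Because the observation model is linear in the style parameter for a fixed phase, each adaptation step reduces to a small regression update, which suggests why a bilinear style/content factorization (in the spirit of the style/content separation work cited below) can be made fast enough for real-time use.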

MSC:

68T45 Machine vision and scene understanding
68U05 Computer graphics; computational geometry (digital and algorithmic aspects)

Software:

astsa
Full Text: DOI

References:

[1] Bishop, C. M., Neural networks for pattern recognition (1995), Oxford University Press
[2] Brand, M., & Hertzmann, A. (2000). Style machines. In SIGGRAPH
[3] Chai, J.; Hodgins, J. K., Performance animation from low-dimensional control signals, ACM Transactions on Graphics, 24, 686-696 (2005)
[4] Dempster, A. P.; Laird, N. M.; Rubin, D. B., Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society, Series B, 39, 1-38 (1977) · Zbl 0364.62022
[5] Fukuda, O.; Tsuji, T.; Kaneko, M.; Otsuka, A., A human-assisting manipulator teleoperated by EMG signals and arm motions, IEEE Transactions on Robotics and Automation, 19, 210-222 (2003)
[6] Ghahramani, Z., & Hinton, G. E. (1996). Parameter estimation for linear dynamical systems. Technical report
[7] Grochow, K.; Martin, S. L.; Hertzmann, A.; Popovic, Z., Style-based inverse kinematics, ACM Transactions on Graphics, 23, 1, 522-531 (2004)
[8] Haykin, S., Adaptive filter theory (2002), Prentice Hall
[9] Howe, N. R.; Leventon, M. E.; Freeman, W. T., Bayesian reconstruction of 3D human motion from single-camera video, (Advances in neural information processing systems, Vol. 12 (2000)), 820-826
[10] Hsu, E.; Pulli, K.; Popovic, J., Style translation for human motion, ACM Transactions on Graphics, 24, 3, 1082-1089 (2005)
[11] Ijspeert, A. J.; Nakanishi, J.; Schaal, S., Learning attractor landscapes for learning motor primitives, (Advances in neural information processing systems, Vol. 15 (2002)), 1523-1530
[12] Inamura, T., Toshima, I., & Nakamura, Y. (2002). Acquisition and embodiment of motion elements in closed mimesis loop. In IEEE international conference on robotics and automation
[13] Kawamoto, H., Kanbe, S., & Sankai, Y. (2003). Power assist method for HAL-3 using EMG-based feedback controller. In IEEE international conference on systems, man and cybernetics
[14] Ko, J., & Fox, D. (2008). GP-Bayes filters: Bayesian filtering using Gaussian process prediction and observation models. In IEEE/RSJ international conference on intelligent robots and systems
[15] Lawrence, N. (2007). Learning for larger datasets with the Gaussian process latent variable model. In Proceedings of the eleventh international workshop on artificial intelligence and statistics
[16] Liu, X., & Goldsmith, A. (2004). Kalman filtering with partial observation losses. In IEEE conference on decision and control
[17] Li, Y.; Wang, T.; Shum, H.-Y., Motion texture: a two-level statistical model for character motion synthesis, ACM Transactions on Graphics, 21, 3, 465-472 (2002)
[18] Onishi, M., Luo, Z., Odashima, T., Hirano, S., Tahara, K., & Mukai, T. (2007). Generation of human care behaviors by human-interactive robot RI-MAN. In IEEE international conference on robotics and automation
[19] Ormoneit, D.; Sidenbladh, H.; Blank, M.; Hastie, T., Learning and tracking cyclic human motion, (Advances in neural information processing systems, Vol. 13 (2001)), 894-900
[20] Pavlovic, V.; Rehg, J. M.; MacCormick, J., Learning switching linear models of human motion, (Advances in neural information processing systems, Vol. 12 (2000)), 981-987
[21] Rasmussen, C. E.; Williams, C. K.I., Gaussian processes for machine learning (2006), MIT Press · Zbl 1177.68165
[22] Riley, M., Ude, A., Wada, K., & Atkeson, C. G. (2003). Enabling real-time full-body imitation: a natural way of transferring human movement to humanoids. In IEEE international conference on robotics and automation (pp. 2368-2374)
[23] Sato, M.; Ishii, S., On-line EM algorithm for the normalized Gaussian network, Neural Computation, 12, 407-432 (2000) · Zbl 1473.68164
[24] Shapiro, A., Cao, Y., & Faloutsos, P. (2006). Style components. In Graphics interface
[25] Shumway, R. H.; Stoffer, D. S., An approach to time series smoothing and forecasting using the EM algorithm, Journal of Time Series Analysis, 3, 253-264 (1982) · Zbl 0502.62085
[26] Sidenbladh, H., Black, M. J., & Fleet, D. J. (2000). Stochastic tracking of 3D human figures using 2D image motion. In European conference on computer vision, Vol. 2
[27] Taylor, G. W., & Hinton, G. E. (2009). Factored conditional restricted Boltzmann machines for modeling motion style. In International conference on machine learning
[28] Taylor, G. W., Hinton, G. E., & Roweis, S. T. (2006). Modeling human motion using binary latent variables. In Advances in neural information processing systems
[29] Tenenbaum, J. B.; Freeman, W. T., Separating style and content with bilinear models, Neural Computation, 12, 1247-1283 (2000)
[30] Torresani, L.; Hackney, P.; Bregler, C., Learning motion style synthesis from perceptual observations, (Advances in neural information processing systems, Vol. 19 (2006)), 1393-1400
[31] Urtasun, R., Fleet, D. J., & Fua, P. (2006). 3D people tracking with Gaussian process dynamical models. In IEEE Computer Society conference on computer vision and pattern recognition
[32] Urtasun, R., Fleet, D. J., Hertzmann, A., & Fua, P. (2005). Priors for people tracking from small training sets. In IEEE international conference on computer vision
[33] Urtasun, R., & Fua, P. (2004). 3D human body tracking using deterministic temporal motion models. In European conference on computer vision, Vol. 3
[34] Wang, J. M., Fleet, D. J., & Hertzmann, A. (2007). Multifactor Gaussian process models for style-content separation. In International conference on machine learning
[35] Wang, J. M.; Fleet, D. J.; Hertzmann, A., Gaussian process dynamical models for human motion, IEEE Transactions on Pattern Analysis and Machine Intelligence, 30, 283-298 (2008)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases, these data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or perfect matching.