Liu, Zhiguang; Zhou, Liuyang; Leung, Howard; Shum, Hubert P. H. (2016)
Publisher: IEEE
Languages: English
Types: Article
Subjects: G400

Depth-sensor-based 3D human motion estimation hardware such as the Kinect has recently made interactive applications more popular. However, it is still challenging to accurately recognize postures from a single depth camera due to the inherently noisy data derived from depth images and the self-occluding actions performed by the user. In this paper, we propose a new real-time probabilistic framework to enhance the accuracy of live captured postures that belong to one of the action classes in the database. We adopt the Gaussian Process model as a prior to leverage the position data obtained from the Kinect and from a marker-based motion capture system. We also incorporate a temporal consistency term into the optimization framework to constrain the velocity variations between successive frames. To ensure that the reconstructed posture resembles the accurate parts of the observed posture, we embed a set of joint reliability measurements into the optimization framework. A major drawback of Gaussian Processes is their cubic learning complexity on large databases, caused by the inversion of the covariance matrix. To solve this problem, we propose a new method based on a local mixture of Gaussian Processes, in which Gaussian Processes are defined over local regions of the state space. Due to the significantly decreased sample size in each local Gaussian Process, the learning time is greatly reduced. At the same time, the prediction speed is enhanced, as the weighted mean prediction for a given sample is determined by the nearby local models only. Our system also allows incrementally updating a specific local Gaussian Process in real time, which enhances the likelihood of adapting to run-time postures that differ from those in the database. Experimental results demonstrate that our system can generate high-quality postures even under severe self-occlusion, which is beneficial for real-time applications such as motion-based gaming and sport training.
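The local-mixture idea in the abstract can be illustrated with a minimal sketch: fit one small Gaussian Process per region of the state space, then predict as a distance-weighted mean over only the nearest local models. This is an assumption-laden toy (NumPy only, a fixed RBF kernel, inverse-distance weights, and pre-partitioned regions), not the paper's actual formulation, which additionally includes temporal consistency and joint reliability terms.

```python
import numpy as np

def rbf(a, b, ell=1.0):
    # Squared-exponential kernel between the rows of a and b.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell ** 2)

class LocalGP:
    """One Gaussian Process defined over a local region of the state space."""
    def __init__(self, X, Y, noise=1e-3):
        self.X = X
        self.center = X.mean(axis=0)                    # region representative
        K = rbf(X, X) + noise * np.eye(len(X))          # small n => cheap inverse
        self.alpha = np.linalg.solve(K, Y)              # cached; prediction is O(n)

    def predict(self, x):
        # GP posterior mean at query point x.
        return rbf(x[None, :], self.X) @ self.alpha

def mixture_predict(models, x, k=2):
    """Weighted mean prediction using only the k nearest local models."""
    d = np.array([np.linalg.norm(x - m.center) for m in models])
    near = np.argsort(d)[:k]
    w = 1.0 / (d[near] + 1e-9)                          # inverse-distance weights
    w /= w.sum()
    preds = np.array([models[i].predict(x)[0] for i in near])
    return (w[:, None] * preds).sum(axis=0)
```

Because each local model inverts only its own small covariance matrix, training cost drops from cubic in the full database size to cubic in the (much smaller) region size, matching the complexity argument made in the abstract.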
