Cai, Ziyun; Han, Jungong; Liu, Li; Shao, Ling (2017)
Publisher: Springer
Languages: English
Types: Article
Subjects: G400

Classified by OpenAIRE into

ACM Ref: ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, ComputingMethodologies_COMPUTERGRAPHICS
RGB-D data has turned out to be a very useful representation of an indoor scene for solving fundamental computer vision problems. It combines the advantages of the color image, which provides appearance information about an object, with those of the depth image, which is immune to variations in color, illumination, rotation angle and scale. With the invention of the low-cost Microsoft Kinect sensor, which was initially used for gaming and later became a popular device for computer vision, high-quality RGB-D data can be acquired easily. In recent years, more and more RGB-D image/video datasets dedicated to various applications have become available, and these are of great importance for benchmarking the state of the art. In this paper, we systematically survey popular RGB-D datasets for different applications, including object recognition, scene classification, hand gesture recognition, 3D simultaneous localization and mapping, and pose estimation. We provide insights into the characteristics of each important dataset, and compare the popularity and difficulty of those datasets. Overall, the main goal of this survey is to give a comprehensive description of the available RGB-D datasets and thus to guide researchers in selecting suitable datasets for evaluating their algorithms.