NYU Hand Pose Dataset
The NYU Hand Pose Dataset contains 8252 test-set and 72757 training-set frames of captured RGBD data with ground-truth hand-pose information. For each frame, RGBD data from 3 Kinects is provided: a frontal view and 2 side views. The training set contains samples from a single user (Jonathan Tompson), while the test set contains samples from two users (Murphy Stein and Jonathan Tompson). A synthetic re-creation (rendering) of the hand pose is also provided for each view.
We also provide the predicted joint locations from our ConvNet (for the test set) so you can compare performance. Note: for real-time prediction we used only the depth image from Kinect 1.
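As a quick sanity check, a minimal MATLAB sketch along the following lines overlays the ground-truth and predicted joints for a single test frame. The tensor layout assumed here (Kinect x frame x joint x coordinate) and the example frame index are assumptions; the ConvNet also tracks only a subset of the 36 annotated locations, so use the included evaluate_predictions.m for the exact joint matching and error metrics.

% Overlay ground-truth and ConvNet-predicted joints for one test frame.
% Assumed tensor layout: kinect x frame x joint x coordinate.
load('test/joint_data.mat');         % joint_names, joint_uvd, joint_xyz
load('test/test_predictions.mat');   % conv_joint_names, pred_joint_uvconf

frame   = 1;                                              % example frame index
gt_uv   = squeeze(joint_uvd(1, frame, :, 1:2));           % ground truth, Kinect 1
pred_uv = squeeze(pred_joint_uvconf(1, frame, :, 1:2));   % ConvNet predictions

figure; hold on;
scatter(gt_uv(:,1),   gt_uv(:,2),   30, 'g', 'filled');   % ground truth (green)
scatter(pred_uv(:,1), pred_uv(:,2), 30, 'r');             % predictions (red)
axis ij; legend('ground truth', 'ConvNet prediction');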
The source code used to fit the hand model to the depth frames can be found here.
NEW: The dataset used to train the RDF is also public! It contains 6736 depth frames of myself doing various hand gestures (seated and standing) and the ground-truth per-pixel labels (hand/not hand).
@article{tompson14tog,
author = {Jonathan Tompson and Murphy Stein and Yann LeCun and Ken Perlin},
title = {Real-Time Continuous Pose Recovery of Human Hands Using Convolutional Networks},
journal = {ACM Transactions on Graphics},
year = {2014},
month = {August},
volume = {33}
}
TOG'14 paper | SIGGRAPH'14 ppt
You can download the dataset here:
nyu_hand_dataset_v2.zip (92 GB)
The top level directory is structured as follows:
visualize_example.m
- Example script: loading and displaying one data sample
evaluate_predictions.m
- Example script: displaying our detector's predicted coordinates and performance
test
  depth_<k>_<f>.png
  - Test-set depth frame <f> for Kinect <k>
  synthdepth_<k>_<f>.png
  - Test-set synthetic depth frame <f> for Kinect <k>
  rgb_<k>_<f>.png
  - Test-set RGB frame <f> for Kinect <k>
  joint_data.mat
  - Matlab data containing:
      joint_names
      - Cell of strings containing the names of the 36 key hand locations
      joint_uvd
      - 4D tensor containing the UVD location of each joint in the test-set frames
      joint_xyz
      - 4D tensor containing the XYZ location of each joint in the test-set frames
  test_predictions.mat
  - Matlab data containing:
      conv_joint_names
      - Cell of strings containing the names of the locations tracked by the ConvNet
      pred_joint_uvconf
      - UV and confidence (likelihood) for each tracked joint
train
  depth_<k>_<f>.png
  - Training-set depth frame <f> for Kinect <k>
  synthdepth_<k>_<f>.png
  - Training-set synthetic depth frame <f> for Kinect <k>
  rgb_<k>_<f>.png
  - Training-set RGB frame <f> for Kinect <k>
  joint_data.mat
  - Matlab data containing:
      joint_names
      - Cell of strings containing the names of the 36 key hand locations
      joint_uvd
      - 4D tensor containing the UVD location of each joint in the training-set frames
      joint_xyz
      - 4D tensor containing the XYZ location of each joint in the training-set frames

Note: In each depth png file the top 8 bits of depth are packed into the green channel and the lower 8 bits into the blue channel.
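A minimal MATLAB sketch of this unpacking step (the concrete file name below is only an illustrative placeholder):

% Reconstruct the 16-bit depth value from the green (top 8 bits) and
% blue (lower 8 bits) channels of a depth png.
img   = imread('test/depth_1_0000001.png');                  % illustrative file name
depth = bitor(bitshift(uint16(img(:,:,2)), 8), uint16(img(:,:,3)));
imagesc(depth); axis image; colorbar;                        % quick visual check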
You can download the dataset used to train the RDF here: