Our dataset, called H2O (2 Hands and Objects), provides synchronized multi-view RGB-D images, interaction labels, object classes, ground-truth 3D poses for the left and right hands, 6D object poses, ground-truth camera poses, object meshes, and scene point clouds. For more detailed information, please visit our project page.
Method | Val accuracy (%) | Test accuracy (%) | Modalities |
---|---|---|---|
C2D | 76.10 | 70.66 | RGB |
I3D [1] | 85.15 | 75.21 | RGB |
Slowfast [2] | 86.00 | 77.69 | RGB |
H+O [3] | 80.49 | 68.88 | train:RGB+hand+obj, test:RGB |
ST-GCN [4] | 83.47 | 73.86 | train:RGB+hand+obj, test:RGB |
TA-GCN [5] | 86.78 | 79.25 | train:RGB+hand+obj, test:RGB |
The last three baselines use RGB images, hand poses, and object poses for training, and use only RGB images at test time. In this challenge (ECCV'22), we expect you to follow these rules:

* Indicate which modalities you use in the method description section (e.g., hand+obj, RGB, train:RGB+hand+obj test:RGB).
* Do not use pre-trained models; they are not allowed in this ECCV'22 competition.
References
[1] Carreira et al. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset. CVPR 2017.
[2] Feichtenhofer et al. SlowFast Networks for Video Recognition. ICCV 2019.
[3] Tekin et al. H+O: Unified Egocentric Recognition of 3D Hand-Object Poses and Interactions. CVPR 2019.
[4] Yan et al. Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition. AAAI 2018.
[5] Kwon et al. H2O: Two Hands Manipulating Objects for First Person Interaction Recognition. ICCV 2021.
We provide three dataset splits: training, validation, and test. You must train your model using only the training set and select your model using the validation set. Submissions are evaluated only on the test set. For action labels, we compute action accuracy. The baseline validation and test accuracies are 86.78 and 79.25, respectively, as reported in [1].
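For clarity, action accuracy is simply the fraction of clips whose predicted action label matches the ground truth. Below is a minimal sketch in Python; the function name and dictionary format are illustrative, not the official evaluation code:

```python
def action_accuracy(predictions: dict, ground_truth: dict) -> float:
    # Fraction of clips whose predicted label matches the ground truth.
    correct = sum(predictions.get(k) == v for k, v in ground_truth.items())
    return correct / len(ground_truth)

# Example: 2 of 3 predictions are correct -> accuracy ~0.6667.
preds = {"1": 32, "2": 11, "3": 14}
gt = {"1": 32, "2": 11, "3": 15}
print(f"{action_accuracy(preds, gt):.4f}")  # prints 0.6667
```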
References
[1] Taein Kwon, Bugra Tekin, Jan Stühmer, Federica Bogo, and Marc Pollefeys. H2O: Two Hands Manipulating Objects for First Person Interaction Recognition. ICCV 2021.
You agree that the DATASET: (a) shall only be downloaded if you agree to these terms; (b) is to be used only for academic purposes; (c) will not be used for commercial purposes; (d) will not be transferred to any third party. Furthermore, you agree that any publication based on, or containing, the DATASET shall include a reference to the DATASET provided under these terms.
To submit your results to the leaderboard, you must construct a submission zip file containing the following file:
For action prediction, use the action id number as the key and the action label number as the value in the JSON file.

`action_labels.json`:
```json
{"modality": "hand+obj", "1": 32, "2": 11, "3": 14, ... "241": 16, "242": 22}
```
Zip this JSON file into answer.zip and submit it to CodaLab.
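For convenience, here is one way to assemble such a submission in Python; the predicted labels below are placeholders, and only the file names (action_labels.json, answer.zip) come from the instructions above:

```python
import json
import zipfile

# Hypothetical predictions: action id (as a string key) -> action label.
predictions = {"1": 32, "2": 11, "3": 14}

# Add the modality field required in the submission file.
submission = {"modality": "train:RGB+hand+obj test:RGB"}
submission.update(predictions)

# Write action_labels.json, then pack it into answer.zip for CodaLab.
with open("action_labels.json", "w") as f:
    json.dump(submission, f)

with zipfile.ZipFile("answer.zip", "w") as zf:
    zf.write("action_labels.json")
```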
Start: May 7, 2022, midnight
End: Never
Leaderboard

Rank | Username | Score |
---|---|---|
1 | IMOU_ALG | 0.9711 |
2 | Necca | 0.9669 |
3 | debaumann | 0.9463 |