Update (August 2022): this leaderboard only shows submissions made after August 2022. To access previous results, go to https://competitions.codalab.org/competitions/20913
Introduction
We present the HANDS19 Challenge, a public competition hosted by the HANDS 2019 workshop at ICCV 2019, designed to evaluate 3D hand pose estimation in both depth and colour modalities, in the presence and absence of objects. The main goals of this challenge are to assess the performance of state-of-the-art approaches in terms of their interpolation and extrapolation capabilities along the four main axes of hand variation (shapes, articulations, viewpoints, objects), and the use of synthetic data to fill the gaps of current datasets on these axes. The challenge builds on the recent BigHand2.2M, F-PHAB and HO-3D datasets, which have been designed to exhaustively cover multiple hand shapes, viewpoints, articulations and both self-occlusion and occlusion from objects, using both depth and RGB cameras. Despite being the most exhaustive available datasets for their respective tasks, they lack full coverage of hand variability. In order to fill these gaps, parameters of a fitted hand model (MANO) and a toolkit to synthesize data are provided to participants. Training and test splits are carefully designed to study the interpolation and extrapolation capabilities of participants' techniques on the mentioned axes and the potential benefit of using such synthetic data. The challenge consists of a standardized dataset, an evaluation protocol for three different tasks and a public competition. Participating methods will be analyzed and ranked according to their performance on the mentioned axes. Winners and prizes will be announced and awarded during the workshop, and results will be disseminated in a subsequent challenge publication.
Challenge overview
In each task the aim is to predict the 3D locations of the 21 hand joints for each given image (details on annotation below). In training, both hand pose annotations and MANO fitting parameters are provided for each image. For inference, only depth/RGB images and hand bounding boxes are provided.
- Task 1: Depth-Based 3D Hand Pose Estimation. This task builds on the BigHand2.2M dataset, in a format similar to the HANDS 2017 challenge. Some hand shapes, articulations and viewpoints are strategically excluded from the training set in order to measure the interpolation and extrapolation capabilities of submissions. No objects appear in this task. Hands appear in both 3rd-person and egocentric viewpoints.
- Task 2: Depth-Based 3D Hand Pose Estimation while Interacting with Objects. This task builds on the F-PHAB dataset. Objects are manipulated by a subject and captured from an egocentric viewpoint. Some hand shapes and objects are strategically excluded from the training set in order to measure the interpolation and extrapolation capabilities of submissions.
- Task 3: RGB-Based 3D Hand Pose Estimation while Interacting with Objects. This task builds on the HO-3D dataset. Objects are manipulated by a subject and captured from a 3rd-person viewpoint. Some hand shapes and objects are strategically excluded from the training set in order to measure the interpolation and extrapolation capabilities of submissions.
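The exact file formats are specified in the released data. As an illustration only, the minimal sketch below assumes a depth frame stored as a (480, 640) array in millimetres, a hand bounding box given as (x_min, y_min, x_max, y_max) in pixels, and placeholder pinhole intrinsics; it shows how an input frame might be cropped with the provided bounding box and how a depth pixel back-projects to a 3D point in camera coordinates.

```python
import numpy as np

# Placeholder pinhole intrinsics (fx, fy, cx, cy); these are NOT the official
# SR300 calibration values, which should be taken from the challenge data.
FX, FY, CX, CY = 475.0, 475.0, 320.0, 240.0

def crop_hand(depth, bbox):
    """Crop the hand region from an (H, W) depth image in millimetres.
    The bounding box is assumed to be (x_min, y_min, x_max, y_max) in pixels;
    the actual format used by the challenge files may differ."""
    x0, y0, x1, y1 = (int(round(v)) for v in bbox)
    return depth[y0:y1, x0:x1]

def pixel_to_camera(u, v, z):
    """Back-project pixel (u, v) with depth z (mm) to a 3D point in camera
    coordinates under a standard pinhole model."""
    x = (u - CX) * z / FX
    y = (v - CY) * z / FY
    return np.array([x, y, z], dtype=np.float32)
```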
Task 1: Depth-Based 3D Hand Pose Estimation
This task builds on the BigHand2.2M dataset, in a format similar to the HANDS 2017 challenge. Hands appear in both 3rd-person and egocentric viewpoints. No objects are present in this task.
- Training set: Contains images from 5 different subjects. Some hand articulations and viewpoints are strategically excluded.
- Test set: Contains images from 10 different subjects. 5 subjects overlap with the training set. Exhaustive coverage of viewpoints and articulations.
- The following performance scores (mean joint error, in mm) will be evaluated (see the metric sketch below):
- Interpolation (INTERP.): performance on test samples that have shapes, viewpoints and articulations present in the training set.
- Extrapolation:
- Total (EXTRAP.): performance on test samples that have hand shapes, viewpoints and articulations not present in the training set.
- Shape (SHAPE): performance on test samples that have hand shapes not present in the training set. Viewpoints and articulations are present in the training set.
- Articulation (ARTIC.): performance on test samples that have articulations not present in the training set. Shapes and viewpoints are present in the training set.
- Viewpoint (VIEWP.): performance on test samples that have viewpoints not present in the training set. Shapes and articulations are present in the training set. Viewpoint is defined as the elevation and azimuth angles of the hand with respect to the camera. Both angles are analyzed independently.
- Images are captured with an Intel RealSense SR300 camera at 640×480 pixel resolution.
- Use of training data from the HANDS 2017 challenge is not allowed, as some images may overlap with the test set.
- Use of other labelled datasets (either real or synthetic) is not allowed. Use of the fitted MANO model for synthesizing data is encouraged (see the synthesis sketch below). Use of external unlabelled data is allowed (self-supervised and unsupervised methods).
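The official evaluation is run on the competition server; the sketch below is only a minimal illustration of mean joint error, assuming predictions and ground truth are stored as arrays of shape (N, 21, 3) in millimetres (an assumed layout, not the official submission format).

```python
import numpy as np

def mean_joint_error(pred, gt):
    """Mean Euclidean distance over all joints and frames, in the units of
    the inputs (mm here). `pred` and `gt` have shape (N, 21, 3)."""
    assert pred.shape == gt.shape
    per_joint = np.linalg.norm(pred - gt, axis=-1)  # (N, 21) distances
    return float(per_joint.mean())

# INTERP., EXTRAP., SHAPE, ARTIC. and VIEWP. would each be this error
# evaluated on the corresponding subset of test frames.
```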
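The challenge ships its own toolkit for synthesizing data from the provided MANO fits. As an unofficial illustration of the idea, the sketch below uses the third-party manopth PyTorch layer (which requires the MANO model files) to turn shape and pose parameters into a hand mesh and 21 joint locations; the parameter split assumed here (10 shape values, 3 global-rotation plus 45 articulation values) may not match the layout of the provided fitting files.

```python
import torch
from manopth.manolayer import ManoLayer  # third-party MANO layer (needs MANO model files)

# With use_pca=False the layer expects full axis-angle pose parameters.
mano = ManoLayer(mano_root='mano/models', use_pca=False, flat_hand_mean=True)

betas = torch.zeros(1, 10)   # shape (beta) parameters
pose = torch.zeros(1, 48)    # 3 global-rotation + 45 articulation values

# verts: (1, 778, 3) hand mesh, joints: (1, 21, 3) joint locations; these can
# be rendered into synthetic depth/RGB frames to cover the excluded shapes,
# articulations and viewpoints.
verts, joints = mano(pose, betas)
```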
- Submission deadline is 1st October 2019.
- To participate, fill in this form and accept the terms and conditions.
- In order for participants to be eligible for competition prizes and be included in the official rankings (to be presented during the workshop and in subsequent publications), information about their submission must be provided to the organizers. This information may include, but is not limited to, details on their method, their use of synthetic and real data, and architecture and training details. Check the previous challenge publication to get an idea of the information needed.
- Winning methods may be asked to provide their source code to reproduce their results, under strict confidentiality rules if requested by the participants.
- For each submission, participants must keep the parameters of their method constant across all testing data for a given task.