I think it would be better to use a different dry-run dataset, one released without the GT bounding boxes.
Is it just me, or does an mAP >= 99.0 seem a little off?
We release the GT labels for the dry-run data so that participants can better debug their models' performance.
You are not supposed to use the dry-run data's labels to train any model.
We noticed that there are some scores of 95+, but these scores are useless, aren't they?