We started looking at the MAFAT data with fastdup and found that many of the unlabeled images look broken: about 99% of the image area is completely black, with only a very small box of real data.
Is this on purpose, or are these images really broken?
Some examples are here: https://github.com/dnth/mafat_fastdup_blogpost/blob/main/fastdup_unlabeled.ipynb
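For anyone who wants to check for this kind of image without fastdup, here is a minimal sketch that measures how much of a frame is pure black. It assumes the image is already loaded as a NumPy array (grayscale or with a trailing channel axis). The function names `black_fraction` and `is_mostly_padding` and the 0.99 cut-off are our own illustrative choices, not part of fastdup or the challenge tooling.

```python
import numpy as np


def black_fraction(image: np.ndarray, threshold: int = 0) -> float:
    """Return the fraction of pixels that are at or below `threshold`.

    For multi-channel images a pixel counts as black only if every
    channel is at or below the threshold.
    """
    if image.ndim == 3:
        dark = (image <= threshold).all(axis=-1)
    else:
        dark = image <= threshold
    return float(dark.mean())


def is_mostly_padding(image: np.ndarray, max_black: float = 0.99,
                      threshold: int = 0) -> bool:
    """Flag frames whose black fraction reaches `max_black` (hypothetical cut-off)."""
    return black_fraction(image, threshold) >= max_black


# Synthetic 1280x1280 frame: all black except a small 100x100 box of data.
frame = np.zeros((1280, 1280, 3), dtype=np.uint8)
frame[:100, :100] = 128

print(f"black fraction: {black_fraction(frame):.4f}")
print("mostly padding:", is_mostly_padding(frame))
```

Raising `threshold` slightly above 0 may help if compression leaves the padded area not exactly black, but that is a guess that would need checking against the actual files.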
Thanks for publishing this analysis. This may help other participants too.
Please note that we cropped the images to the size of the labeled dataset, 1280×1280; therefore, some of the frames have black padding.
In addition, the unlabeled dataset was not filtered by the organizers and was released as is, so it might include cloudy frames.
MAFAT Challenge Team