Can we use the val dataset for training, or can we only use it for validation?
I did not find a detailed description of this issue.
Dear participant,
Yes, you can use the validation data at the test stage to fine-tune your model.
Best
I think the val dataset (even if labels are available) should not be used for training. This goes against the original intent of the challenge, which is to measure day-, week-, and month-level drift, because the val dataset contains pictures from all months.
Posted by: heboyong @ June 17, 2022, 9:16 a.m.

Hi,
I am sorry if I was not clear in my answer. Validation data can be used at the final phase, but only in accordance with the rules of the challenge. That is, training must be done on a single, predefined day/week/month. I am not totally sure whether the validation set contains additional data for the respective days/weeks/months. I will ask my colleague to complement my answer. You may receive additional information soon.
Best
Hi,
Juliojj is correct here; the intent of the challenge is to train only on the initial day/week/month data. That is why the training data comes from February, and why February is completely excluded from validation and testing.
The purpose of the validation data and labels is so that participants can troubleshoot where their model fails and identify whether proposed additions to the architecture actually introduce robustness to the concept drift (the gradual visual change of the thermal images over time).
We therefore ask that submissions for the final test phase have been trained only on the initial day/week/month split, depending on the challenge track chosen.
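To make the intended use concrete, a rough sketch of the permitted workflow might look like the following. All dataset paths, helper names (ThermalDataset, build_model, evaluate_map), and month splits here are hypothetical placeholders, not part of the official challenge kit:

```python
# Sketch of the permitted workflow; every name and path below is a
# hypothetical placeholder, not part of the official challenge kit.
import torch
from torch.utils.data import DataLoader

from my_baseline import ThermalDataset, build_model, evaluate_map  # hypothetical helpers

# 1) Train ONLY on the initial split of the chosen track (here: "day").
train_loader = DataLoader(ThermalDataset("data/train_day"), batch_size=8, shuffle=True)
model = build_model()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

model.train()
for epoch in range(20):
    for images, targets in train_loader:
        optimizer.zero_grad()
        loss = model(images, targets)  # assumes the model returns its training loss
        loss.backward()
        optimizer.step()

# 2) Use the validation data ONLY to measure where performance degrades
#    over time: no gradient updates, no pseudo-labels, no fine-tuning.
model.eval()
with torch.no_grad():
    for month in ["march", "april", "may"]:  # illustrative month splits
        val_loader = DataLoader(ThermalDataset(f"data/val_{month}"), batch_size=8)
        print(month, evaluate_map(model, val_loader))

# If the score drops for later months, adjust hyper-parameters or the
# architecture and retrain from scratch on the training split alone.
```

Anything beyond step 2, such as computing gradients or pseudo-labels from the validation images, falls outside the rules.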
Posted by: Anderssj @ June 17, 2022, 9:43 a.m.

The schedule states that retraining is possible with the validation dataset.
What is the right thing to do?...
Posted by: PeterKim @ June 17, 2022, 10:08 a.m.

Or, how about making only the images, without annotations, accessible, considering the real world?
(Because images are continuously being collected.)
I completely understand the confusion; it was a mistake on our part to include that sentence in the schedule. The idea was that the labeled validation data could be used to further tune approaches and to fine-tune parameters that result in a model more robust to concept drift, not to learn representations directly from the data.
We will add a clarification to the schedule stating that, for fair and accurate evaluation of concept drift and of methods to mitigate it, models submitted for testing should be trained only on the initial training data.
As for releasing data without labels, that is of course very interesting, especially for a real-world scenario. While I cannot speak for the focus of future challenges, it is definitely an aspect we have considered. There are, of course, some compliance considerations around gathering data in public spaces.
Posted by: Anderssj @ June 17, 2022, 11:25 a.m.

Please answer the question below. This confusion in the test phase is very frustrating...
Is only the training data accessible for learning, and not the images from the validation data (e.g., in a semi-supervised manner)?
Posted by: PeterKim @ June 17, 2022, 11:33 a.m.

As stated in the "How to enter the competition" section on the official challenge page (https://chalearnlap.cvc.uab.cat/challenge/51/description/), models must be trained exclusively on the training data. Leveraging the validation images/labels for anything other than personal evaluation and hyper-parameter tuning is grounds for disqualification, as it goes against the purpose of developing methods robust to concept drift.
In short, you cannot do any self-supervised, semi-supervised, or unsupervised learning on the validation or test data.
Posted by: Anderssj @ June 17, 2022, 12:12 p.m.