Can we use any publicly available data from the internet, such as other datasets, to train our model?
I also just found additional labels (csv_dir/test_set_labels.csv) inside the official VIP Cup repository (commit 28dde3d), and I am wondering whether it is a good idea to incorporate them into model training.
Including them seems to improve results significantly.