Hi everyone,
We are releasing the extracted video features to help you implement your algorithms faster. The features were extracted with the code at https://github.com/dasongli1/SnapUGC_Engagement/tree/main/ECR_inference.
The "EfficientNetV2/" and "Distort/" folders contain per-frame features. "ResNet3d/" contains per-clip features (1 clip contains 16 frames). "Music/" contains the top-5 sound classification results for the whole video. "Caption/" contains per-clip features (again, 1 clip contains 16 frames) and a text description of the whole video.
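If you want to line up the per-frame features with the per-clip ones, something like the sketch below may help. It only uses the "1 clip contains 16 frames" fact from above; everything else is an assumption on my side: that clips are consecutive and non-overlapping, that trailing frames which do not fill a whole clip are dropped, and that averaging is a sensible way to pool. The names `clip_ranges` and `pool_to_clips` are hypothetical helpers, not part of the released code, and the features here are plain Python lists rather than whatever format the files actually use.

```python
CLIP_LEN = 16  # frames per clip, as stated in the post


def clip_ranges(num_frames: int, clip_len: int = CLIP_LEN) -> list[tuple[int, int]]:
    """Return (start, end) frame-index ranges, end exclusive, one per full clip.

    Assumption: clips are consecutive and non-overlapping, and trailing frames
    that do not fill a whole clip are dropped.
    """
    if num_frames < 0 or clip_len <= 0:
        raise ValueError("num_frames must be >= 0 and clip_len must be > 0")
    return [(s, s + clip_len) for s in range(0, num_frames - clip_len + 1, clip_len)]


def pool_to_clips(frame_feats: list[list[float]], clip_len: int = CLIP_LEN) -> list[list[float]]:
    """Average per-frame feature vectors within each clip (mean pooling)."""
    pooled = []
    for start, end in clip_ranges(len(frame_feats), clip_len):
        clip = frame_feats[start:end]
        # zip(*clip) iterates over feature dimensions across the clip's frames.
        pooled.append([sum(col) / clip_len for col in zip(*clip)])
    return pooled


# Synthetic stand-in for per-frame features: 40 frames, 2 dimensions each.
frames = [[float(i), 2.0 * i] for i in range(40)]
print(clip_ranges(len(frames)))   # [(0, 16), (16, 32)]
print(len(pool_to_clips(frames)))  # 2
```

With 40 frames this gives two full clips and ignores the last 8 frames. If the extraction code pads or overlaps clips instead, the ranges need to be adjusted accordingly.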
The training set can be downloaded from Google Drive [https://drive.google.com/drive/folders/14hwZ5rIMfMNqByK7PM_1X3jMS1MWUnug?usp=share_link] or Baiduyun [https://pan.baidu.com/s/1TpU_vXNOJ2ELvYeQFZUwZA?pwd=58cp].
The validation set can be downloaded from Google Drive [https://drive.google.com/drive/folders/1N4SbyKgTgxQE340mOo8-4FJu_VlYAKKL?usp=share_link] or Baiduyun [https://pan.baidu.com/s/1utz8zDP9fZzlpITbeTAi8Q?pwd=veb9].