I have read that the evaluation metrics {CIDEr, SPICE, METEOR, BLEU-4, ROUGE, BLEU-3, BLEU-2, and BLEU-1} are prioritized in this order.
Does that mean the team with the highest CIDEr score takes 1st place?
If so, what happens if several teams get the same CIDEr score?
Do you combine the scores using weights?
Hi,
We will evaluate caption accuracy using the following set of commonly used caption evaluation metrics: CIDEr, SPICE, METEOR, BLEU-4, ROUGE, BLEU-3, BLEU-2, and BLEU-1 (with this order as the priority).
Our leaderboard is already configured to work this way.
Please check the Evaluation tab under the Learn the Details tab.
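
For illustration only, here is a minimal sketch of what ranking by priority order could look like. It assumes that a tie on one metric is broken by the next metric in the list, and that no weighted merge is involved; this is an assumption, not the actual leaderboard code. The names `PRIORITY`, `rank_teams`, `make_scores`, and all team names and score values are hypothetical.

```python
# Metric names in priority order (highest priority first).
PRIORITY = ["CIDEr", "SPICE", "METEOR", "BLEU-4", "ROUGE", "BLEU-3", "BLEU-2", "BLEU-1"]


def rank_teams(scores):
    """Return team names sorted best-first.

    Teams are compared metric by metric in PRIORITY order: a later metric
    only matters when all earlier metrics are exactly equal (assumed rule).
    """
    def key(team):
        return tuple(scores[team][metric] for metric in PRIORITY)

    return sorted(scores, key=key, reverse=True)


def make_scores(cider, spice):
    """Build a hypothetical score dict; the other metrics are fixed at 0.5."""
    result = {metric: 0.5 for metric in PRIORITY}
    result["CIDEr"] = cider
    result["SPICE"] = spice
    return result


# Made-up scores: team_a and team_b tie on CIDEr, so SPICE decides between them.
example_scores = {
    "team_a": make_scores(cider=1.20, spice=0.21),
    "team_b": make_scores(cider=1.20, spice=0.23),
    "team_c": make_scores(cider=1.10, spice=0.30),
}

ranking = rank_teams(example_scores)
print(ranking)
```

Under this assumed rule, team_c's higher SPICE score does not help it, because its CIDEr score is lower than that of the other two teams.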
Posted by: p.ahn @ April 5, 2023, 4:59 a.m.