We submitted twice and received exactly the same BERTScore for both submissions (identical to 16 decimal places). I thought for a minute that we had submitted the same file twice by mistake, but the DialogRPT scores are different. How is this possible?
Posted by: justinray-v @ May 6, 2023, 6:18 a.m.

Thank you for pointing this out. We are currently looking into this as we are going through the submissions.
Posted by: anaistack @ May 11, 2023, 7:42 a.m.

I think it's possible you're only using the BERT Score of "test_0001"...
Posted by: justinray-v @ May 12, 2023, 12:58 a.m.
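
To illustrate the suspicion raised in the last post, here is a minimal sketch of how a scorer that reads only the entry for "test_0001" would report identical scores for two different submissions. It assumes both submissions happen to score the same on that one example. The per-example values, the extra IDs ("test_0002", "test_0003"), and the function names are all hypothetical; this is not the organizers' evaluation code.

```python
from statistics import mean

# Hypothetical per-example BERTScore values for two different submissions.
# Only the ID "test_0001" comes from the thread; everything else is invented.
submission_a = {"test_0001": 0.91, "test_0002": 0.80, "test_0003": 0.75}
submission_b = {"test_0001": 0.91, "test_0002": 0.60, "test_0003": 0.85}


def first_example_only(scores):
    """Suspected behaviour: report only the score stored under "test_0001"."""
    return scores["test_0001"]


def mean_over_all(scores):
    """Presumably intended behaviour: average over every test example."""
    return mean(scores.values())


buggy_a = first_example_only(submission_a)
buggy_b = first_example_only(submission_b)
fixed_a = mean_over_all(submission_a)
fixed_b = mean_over_all(submission_b)

print("first example only:", buggy_a, buggy_b)
print("mean over all:     ", fixed_a, fixed_b)
```

Under these assumptions the first pair of numbers matches exactly while the averaged pair does not, which would be consistent with the DialogRPT scores differing between the two submissions.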