I have submitted to the English task 13 times: 11 submissions failed and 2 succeeded. The most common error is:
AssertionError: 499 / 498
I cannot replicate this issue with task_evaluate.py, but I believe it is caused by my LLM output containing characters that break the CSV reader.
I'd like a better feedback mechanism for failed submissions. The current setup is not only unhelpful for locating the output bug, but also takes a very long time to run.
Thanks.
Posted by: nirvanatear @ Nov. 7, 2024, 4:04 a.m.
Please correct your headers:
Id;Text;Question;Answer
must be
ID;Text;Question;Answer
Posted by: lli-uam @ Nov. 7, 2024, 1:33 p.m.
Thank you. You are right that this is a header issue and not an output issue.
Posted by: nirvanatear @ Nov. 7, 2024, 4:46 p.m.
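For anyone hitting the same error, a header mismatch like the one in this thread can be caught locally before uploading. The following is only a minimal sketch, not the task's official evaluator: it assumes the submission is a semicolon-delimited CSV with the header ID;Text;Question;Answer quoted above. The function name check_submission and the expected_rows parameter are hypothetical, introduced here for illustration.

```python
import csv
import io

# Header exactly as given in the reply above; the comparison is case-sensitive.
EXPECTED_HEADER = ["ID", "Text", "Question", "Answer"]


def check_submission(text, expected_rows=None, delimiter=";"):
    """Return a list of human-readable problems found in a submission.

    `expected_rows` is optional and hypothetical: pass the number of data
    rows you expect if you know it, otherwise the count is not checked.
    """
    problems = []
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    rows = [row for row in reader if row]  # ignore completely blank lines
    if not rows:
        return ["file is empty"]

    header, data = rows[0], rows[1:]
    if header != EXPECTED_HEADER:
        problems.append(f"header is {header!r}, expected {EXPECTED_HEADER!r}")

    # A stray delimiter or unbalanced quote in generated text usually shows up
    # as a record with the wrong number of fields.
    for number, row in enumerate(data, start=1):
        if len(row) != len(EXPECTED_HEADER):
            problems.append(
                f"data record {number} has {len(row)} fields, "
                f"expected {len(EXPECTED_HEADER)}"
            )

    if expected_rows is not None and len(data) != expected_rows:
        problems.append(f"found {len(data)} data records, expected {expected_rows}")

    return problems
```

To check a file, read it with open(path, newline="", encoding="utf-8").read() (the encoding is an assumption) and pass the result to check_submission. An empty list means none of these particular checks failed; it does not guarantee that the submission will be accepted.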