We feel that a few samples in the training data are labelled incorrectly. Here is one example:
sexism2022_english-4118 Surprized they didn't stop and rape some women
actual label: not sexist
correct label (Task A): sexist
correct label (Task B): threats
correct label (Task C): incitement and encouragement of harm
We wanted to check whether our assumption is right, and whether anyone else has run into similar issues. Also, similar incorrect labels in the dev and test sets could lead to misleading evaluation of our models (illustrated below).
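To make our concern concrete, here is a rough sketch in Python; the 5% noise rate and the test-set size are made-up numbers for illustration, not measurements of this dataset. It shows how mislabelled test entries cap the measured score of even a perfect model:

import numpy as np

rng = np.random.default_rng(0)

n = 10_000                                 # hypothetical test-set size
true_labels = rng.integers(0, 2, size=n)   # ground-truth binary labels

noise_rate = 0.05                          # assume 5% of test labels are wrong
flip = rng.random(n) < noise_rate
observed_labels = np.where(flip, 1 - true_labels, true_labels)

# A model that predicts every true label correctly is still scored
# against the observed (partly wrong) labels.
measured_accuracy = (true_labels == observed_labels).mean()
print(f"measured accuracy of a perfect model: {measured_accuracy:.3f}")  # ~0.950

With a 5% error rate in the evaluation labels, no model can measure much above 0.95 accuracy, so comparisons between models near that ceiling become unreliable.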
Thanks,
Jayanth
Hi,
Thanks for messaging. A couple of points from our end:
- Generally, as with every dataset, the labels are the result of a specific annotation process: each entry was annotated by three annotators and, in the case of disagreement, adjudicated by an expert (a sketch of this scheme follows the list). Sexism is a subjective task and there will be some variation in labelling.
- We cannot comment directly on why the annotators labelled this entry "not sexist", but this particular entry could have been read as a sarcastic comment, or as expressing prejudice / a stereotype against another group (i.e. that its members have a tendency to threaten women).
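For concreteness, a minimal sketch in Python of the aggregation scheme described above; the function name is illustrative and this is not our actual annotation pipeline code:

from collections import Counter
from typing import Optional, Sequence

def aggregate_label(annotations: Sequence[str],
                    expert_label: Optional[str] = None) -> str:
    """Return the unanimous label, else fall back to the expert's decision."""
    counts = Counter(annotations)
    label, votes = counts.most_common(1)[0]
    if votes == len(annotations):          # all three annotators agree
        return label
    if expert_label is None:               # any disagreement is adjudicated
        raise ValueError("disagreement requires an expert adjudication")
    return expert_label

# Unanimous case:
print(aggregate_label(["sexist", "sexist", "sexist"]))            # -> sexist
# 2-vs-1 split, resolved by the expert adjudicator:
print(aggregate_label(["sexist", "not sexist", "not sexist"],
                      expert_label="not sexist"))                 # -> not sexist

Under such a scheme, an entry can legitimately end up labelled "not sexist" if the adjudicating expert reads it that way, even when individual annotators disagreed.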
If you have more questions, please email us.
Posted by: hannah.rose.kirk @ Dec. 29, 2022, 8:54 p.m.