CodaLab -

> Clarification on the evaluation metric and on hashing

Dear organisers, thanks for organising this cool challenge.
I was wondering whether you could provide a Python file to compute the evaluation metric (or the precise mathematical details for this specific case), as I am getting slightly different results.
Moreover, do the features need to be hashed or is it just for computational reasons?
Thanks in advance!

Posted by: SirPopiel @ April 15, 2022, 4:49 p.m.

Hello,

You can compute the NCE metric like this:

"""
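import numpy as np
from sklearn.metrics import log_loss
from scipy.stats import entropy

# y_test: binary labels (0/1); y_hat: predicted probabilities of the positive class
llh = log_loss(y_test, y_hat)  # mean negative log-likelihood (natural log)
y_bar = np.mean(y_test)  # empirical positive rate
y_entr = entropy([y_bar, 1 - y_bar])  # entropy of the label base rate (natural log)
nce = (y_entr - llh) / y_entr  # relative gain over always predicting y_bar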
"""
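
In case the mathematical form is useful, the snippet above can be written out as follows. This is a sketch that assumes binary labels $y_i \in \{0, 1\}$, predicted positive-class probabilities $\hat{y}_i$, and natural logarithms (the defaults of `log_loss` and `entropy`):

$$
\mathrm{LLH} = -\frac{1}{N}\sum_{i=1}^{N}\left[\, y_i \ln \hat{y}_i + (1 - y_i)\ln(1 - \hat{y}_i) \,\right],
\qquad
\bar{y} = \frac{1}{N}\sum_{i=1}^{N} y_i,
$$

$$
H(\bar{y}) = -\bar{y}\ln\bar{y} - (1 - \bar{y})\ln(1 - \bar{y}),
\qquad
\mathrm{NCE} = \frac{H(\bar{y}) - \mathrm{LLH}}{H(\bar{y})}.
$$

With this definition, a constant prediction $\hat{y}_i = \bar{y}$ gives $\mathrm{NCE} = 0$, a perfect prediction gives $\mathrm{NCE} = 1$, and a model worse than the constant base rate gives a negative value.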

You should expect your local result to differ from the "NCE (train envs)" column in the leaderboard, because that column is computed on samples from the train domain *in the test set*. Note that the test set on which you provide predictions contains a mix of samples from different domains, some of which come from the same distribution as the train domains.
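
To illustrate why the number depends on which samples are scored, here is a minimal sketch on purely synthetic data. The `nce` helper is a NumPy-only re-implementation written for this example, and `is_train_env` is a hypothetical per-sample domain flag invented for illustration (participants may not have such a flag for the real test set). The base rates 0.1 and 0.4 are made up as well.

```python
import numpy as np


def nce(y_true, y_prob, eps=1e-15):
    """Normalized cross-entropy: relative log-loss gain over the base-rate predictor.

    Undefined (division by zero) if y_true contains only one class.
    """
    y_true = np.asarray(y_true, dtype=float)
    # Clip to avoid log(0); eps is an assumption of this sketch.
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1 - eps)
    llh = -np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))
    y_bar = y_true.mean()
    y_entr = -(y_bar * np.log(y_bar) + (1 - y_bar) * np.log(1 - y_bar))
    return (y_entr - llh) / y_entr


# Synthetic test set mixing two domains with different positive rates.
rng = np.random.default_rng(0)
n = 10_000
is_train_env = rng.random(n) < 0.5           # hypothetical domain flag
p_true = np.where(is_train_env, 0.1, 0.4)    # made-up base rate per domain
y_test = (rng.random(n) < p_true).astype(int)

# A toy "model" that only knows the train-domain base rate.
y_hat = np.full(n, 0.1)

nce_all = nce(y_test, y_hat)                                    # whole test set
nce_train_env = nce(y_test[is_train_env], y_hat[is_train_env])  # train-domain subset

print(f"NCE on all test samples:        {nce_all:.4f}")
print(f"NCE on train-domain samples:    {nce_train_env:.4f}")
```

The same predictions score differently on the two subsets, because both the log loss and the base-rate entropy are recomputed on whichever samples are selected.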

Re: hashing, it is just a convenient way to anonymize the feature values.

Posted by: eustache @ April 25, 2022, 9:14 a.m.

PRINCE Out-of-distribution Generalization Challenge @ ECML-PKDD Forum