Synthetic heart attack data generated by causal Bayesian networks with binary variables. These examples are completely made up and are used for illustration purposes only.
This challenge consists of three problems:
Binary classification: Each data row is labeled 0 or 1. Train a predictive model on the training set to predict the labels of the test set as accurately as possible.
Feature selection: Among the 11 features, classify each one as useless (0) or useful (1).
Causal inference: Find the causal links between the variables.
Consider an example in which variables A and C cause B, and D has no causal link with any other variable. This can be represented by the following matrix:
The two arrows are represented by the two 1s in the matrix.
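As a rough sketch of this encoding, the example graph can be written as a 4×4 adjacency matrix. The variable order (A, B, C, D) and the orientation (rows are causes, columns are effects) are assumptions made for illustration; the challenge's own convention may differ.

```python
# Variable order is an assumption made for this illustration.
variables = ["A", "B", "C", "D"]
# Causal links from the example: A -> B and C -> B. D has no links.
edges = [("A", "B"), ("C", "B")]

index = {name: i for i, name in enumerate(variables)}

# Assumed convention: matrix[i][j] == 1 means variable i causes variable j.
matrix = [[0] * len(variables) for _ in variables]
for cause, effect in edges:
    matrix[index[cause]][index[effect]] = 1

for name, row in zip(variables, matrix):
    print(name, row)
```

Under this assumed convention, the rows for A and C each contain a single 1 in column B, and the row and column for D are all zeros.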
The three problems are binary classification, feature selection, and causal inference. For binary classification, the evaluation metric is the area under the ROC curve (AUC). For feature selection and causal inference, the scoring metric is balanced accuracy.
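For reference, here is a minimal sketch of balanced accuracy, assuming the standard binary definition (the mean of the recall on class 1 and the recall on class 0). The function name and the sample answer vectors are hypothetical and made up for illustration; the challenge's scoring program may differ in details.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity (recall on 1s) and specificity (recall on 0s)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    positives = sum(1 for t in y_true if t == 1)
    negatives = sum(1 for t in y_true if t == 0)
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present in y_true")
    return 0.5 * (tp / positives + tn / negatives)


# Hypothetical feature-selection answer for 11 features (made-up values).
truth = [1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0]
guess = [1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0]
print(balanced_accuracy(truth, guess))
```

Unlike plain accuracy, this score gives both classes equal weight, which matters when useful features or causal links are much rarer than their absence.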
The Receiver Operating Characteristic (ROC) curve is a graphical plot that illustrates the diagnostic ability of a binary classifier as its discrimination threshold is varied. It is created by plotting the true positive rate (TPR) against the false positive rate (FPR) at various threshold settings.
The evaluation metric is therefore the area under the curve (AUC).
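The construction described above can be sketched in plain Python: sweep the threshold from the highest score downward, record the (FPR, TPR) point at each step, and integrate with the trapezoidal rule. The function name and the sample labels and scores are hypothetical; this is an illustration of the metric, not the challenge's scoring code.

```python
def roc_auc(y_true, scores):
    """Area under the ROC curve, computed with the trapezoidal rule."""
    positives = sum(1 for t in y_true if t == 1)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present in y_true")

    # Sort by decreasing score: lowering the threshold admits rows in this order.
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])

    auc = 0.0
    tp = fp = 0
    prev_tpr = prev_fpr = 0.0
    i = 0
    while i < len(ranked):
        # Rows with tied scores cross the threshold together.
        threshold = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        tpr, fpr = tp / positives, fp / negatives
        # Add the trapezoid between the previous ROC point and this one.
        auc += (fpr - prev_fpr) * (tpr + prev_tpr) / 2.0
        prev_tpr, prev_fpr = tpr, fpr
    return auc


# Made-up labels and predicted scores for illustration.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores))  # 0.75
```

An AUC of 1.0 means every positive row is scored above every negative row, while 0.5 corresponds to a ranking no better than chance.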
You may make up to 15 submissions per day and 200 in total.
This challenge is for educational purposes only; no prizes are awarded.
This challenge is governed by the general ChaLearn contest rules.
Start: Sept. 10, 2021, midnight