In this competition, participants will address two fundamental causal challenges in machine learning in the context of education using time-series data. The first is to identify the causal relationships between different constructs, where a construct is defined as the smallest element of learning. The second challenge is to predict the impact of learning one construct on the ability to answer questions on other constructs. Addressing these challenges will enable optimisation of students' knowledge acquisition, which can be deployed in a real edtech solution impacting millions of students. Participants will run these tasks in an idealised environment with synthetic data and a real-world scenario with evaluation data collected from a series of A/B tests.
The real-world data for this competition comes from an educational platform that is already deployed and used at scale. The data describes real answers given by real students to real questions. By providing the opportunity to work on genuine educational data and real problems in an engaging manner, our competition will attract talent to the important field of machine learning in education.
We expect the competition to bring fundamental advances to educational data mining technologies. These methods will be deployed in a real educational platform where they will improve the learning outcomes of millions of students.
Full details of the competition can be found in the instructions and guides. The data can be downloaded either from the CodaLab competition page or from Data, and the starting kit is available at Starting Kit.
The competition consists of four tasks of varying styles, each briefly described below.
Each task contains a public evaluation phase and a private evaluation phase. In the public evaluation phase, you can see the evaluation results of your submission on a public leaderboard, compared with other participants. In the private evaluation phase, the evaluation results are not publicly viewable.
More information on the tasks, including the evaluation metrics, can be found in the official competition guide (competition guides).
Relationship Discovery for Constructs over Time using Synthetic Time-series Data
The relationships among different constructs are key to setting learning paths for students. Currently, these relationships are determined by teachers, and they vary from teacher to teacher depending on experience. In this task, we would like to discover the construct relationships using complete synthetic time-series data. The causal discovery results will be evaluated against the ground-truth causal graph using the adjacency F1 score. Apart from the test datasets, we provide additional synthetic datasets for local development. All datasets use the same time lag and number of constructs, but differ in their causal and functional relationships.
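To make the metric concrete, below is a minimal sketch of how an adjacency F1 score can be computed; the helper name, the use of scikit-learn, and the example matrices are illustrative assumptions, and the official evaluation script may differ in details such as how the diagonal is handled.

```python
import numpy as np
from sklearn.metrics import f1_score

def adjacency_f1(true_adj: np.ndarray, pred_adj: np.ndarray) -> float:
    """Treat every potential edge as a binary label and compute F1.

    Both inputs are binary [n_constructs, n_constructs] matrices, where
    entry [i, j] = 1 denotes an edge from construct i to construct j.
    """
    return f1_score(true_adj.flatten(), pred_adj.flatten())

# Toy example with 3 constructs (hypothetical values):
true_adj = np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]])
pred_adj = np.array([[0, 1, 0], [0, 0, 0], [1, 0, 0]])
print(adjacency_f1(true_adj, pred_adj))  # 0.5 (1 TP, 1 FP, 1 FN)
```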
Teaching Effectiveness Inference using Synthetic Time-series Data
Given students' limited learning time, we would like to provide each student with the learning material that is most helpful for their overall knowledge acquisition. In addition, since each student may have a different learning-path history, the provided learning material should also take this into account.
The temporal conditional average treatment effect (CATE) is a good measure of how effective learning one construct is for another target construct. We use the probability of correctly answering the questions associated with a construct as the performance indicator for that construct.
Participants are asked to estimate the CATE for the 10 provided queries of each dataset in Task 1. During evaluation, the estimated CATEs will be compared against the ground truth, and submissions will be ranked by the averaged negative root mean squared error (RMSE) over all datasets.
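As a rough sketch of this ranking metric under the assumptions above (arrays of shape [5, 10], datasets weighted equally), the averaged negative RMSE could be computed as follows; the array names are placeholders, and the official script may aggregate differently.

```python
import numpy as np

def avg_neg_rmse(cate_true: np.ndarray, cate_pred: np.ndarray) -> float:
    """Averaged negative RMSE over datasets (higher is better).

    Both arrays have shape [n_datasets, n_queries], here [5, 10]:
    one RMSE per dataset, then averaged over datasets and negated.
    """
    rmse_per_dataset = np.sqrt(np.mean((cate_true - cate_pred) ** 2, axis=1))
    return -float(rmse_per_dataset.mean())
```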
Relationship Discovery for Constructs using Real-world Data

For this task, the goal is to discover the construct relationships (similar to Task 1) using real-world data. We will evaluate the aggregated causal graph against partial ground truth from A/B tests. As in Task 1, participants are asked to provide a full aggregated adjacency matrix for construct-construct relationships. However, due to the experimental nature of this task, we will only evaluate the parts of the adjacency matrix for which we have ground truth from the A/B tests.
Teaching Effectiveness Inference using Real-world Data

This task is similar to Task 2 but uses real data. Since we do not have access to a student's actual knowledge of a particular construct, we use the averaged correctness of the questions associated with that construct as the performance indicator, which can be viewed as an empirical estimate of the probability of correctly answering those questions. The ground-truth conditional average treatment effect (CATE) will be computed from intervention data collected in A/B tests and will not be revealed to participants. Candidates need to submit their CATE estimates for selected pairs of constructs, and we will compare the submissions with the ground truth.
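For intuition only, the performance indicator described above could be computed along the following lines; the response-log layout and column names here are hypothetical, not the competition's actual data schema.

```python
import pandas as pd

# Hypothetical response log: one row per answer to a question that is
# associated with a construct (columns are made up for illustration).
responses = pd.DataFrame({
    "construct_id": [7, 7, 7, 12, 12],
    "is_correct":   [1, 0, 1, 1, 1],
})

# Averaged correctness per construct: an empirical estimate of the
# probability of correctly answering that construct's questions.
indicator = responses.groupby("construct_id")["is_correct"].mean()
print(indicator)  # construct 7 -> ~0.667, construct 12 -> 1.0
```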
The rules of the competition are as follows:
Any instances of cheating, defined as violation of the rules above or any other attempt to circumvent the spirit and intent of the competition, as determined by the organizers, will result in the invalidation of all of the offending team's submissions and a disqualification from the competition.
In order to prevent cheating, all the evaluation data will be kept completely inaccessible to the participants during the competition. The aforementioned rules on manual review of the submissions also aim to prevent cheating.
Tasks 1 and 2 contain two phases: a public evaluation phase and a private evaluation phase. Results in the public evaluation phase are displayed on a public leaderboard, allowing participants to see their performance compared with others. Results for the private evaluation phase are hidden until the end of the competition.
Important: for each task, participants must submit to both the public and private evaluation phases separately. Submissions made solely to the public evaluation phase will not be used in the final judgement of the competition. It is the participants' responsibility to make sure their submission to the private phase of each task represents their best results.
Below are the detailed submission instructions for each task.
As described in the task guides, the submitted file should be a zip file containing an .npy file named adj_matrix.npy with shape [5, 50, 50]. Alternatively, it is possible to submit probabilistic estimates of the adjacency matrix in the form of samples from the distribution over adjacency matrices; these should be submitted with shape [5, s, 50, 50], where s is the number of samples.
A template submission can also be obtained by running the task 1 baseline model in the starting kit.
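For reference, a minimal sketch for packaging a Task 1 submission might look like the following; the all-zeros array is a placeholder prediction, and only the file name adj_matrix.npy and the [5, 50, 50] shape are prescribed by the guides.

```python
import zipfile
import numpy as np

# Placeholder prediction: 5 datasets, each with a 50x50 adjacency matrix.
adj_matrix = np.zeros((5, 50, 50))

np.save("adj_matrix.npy", adj_matrix)
with zipfile.ZipFile("submission.zip", "w") as zf:
    zf.write("adj_matrix.npy")
```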
As described in the task guides, candidates should submit a zip file containing a NumPy array of CATE estimates with shape [5, 10] (i.e. the 5 datasets of Task 1, with 10 queries each). The file must be named cate_estimate.npy. The array must not contain any NaN values, and its shape must exactly match [5, 10]. Its elements are arranged by dataset and query id: the first dimension indexes the dataset and the second the query. For instance, cate_estimate[3, 7] = 0.45 means the CATE estimate for query 7 of dataset 3 is 0.45.
A template submission can also be obtained by running the task 2 baseline model in the starting kit.
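Similarly, a Task 2 submission could be packaged as sketched below; the zeros are placeholder estimates, while the file name, the [5, 10] shape, and the no-NaN requirement come from the description above.

```python
import zipfile
import numpy as np

# Placeholder CATE estimates: 5 datasets x 10 queries, no NaNs allowed.
cate_estimate = np.zeros((5, 10))
cate_estimate[3, 7] = 0.45  # CATE estimate for query 7 of dataset 3

assert cate_estimate.shape == (5, 10)
assert not np.isnan(cate_estimate).any()

np.save("cate_estimate.npy", cate_estimate)
with zipfile.ZipFile("submission.zip", "w") as zf:
    zf.write("cate_estimate.npy")
```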
Details regarding the submissions for Tasks 3 and 4 will be given upon the release of these tasks; refer to the schedule for the timeline.
If you have any questions about the competition, you can contact the organisers by email at causal_edu@outlook.com. We will try to get back to you as soon as possible!
Wenbo Gong, Microsoft Research
Digory Smith, Eedi
Jack Wang, Rice University
Craig Barton, Eedi
Simon Woodhead, Eedi
Nick Pawlowski, Microsoft Research
Joel Jennings, Microsoft Research
Cheng Zhang, Microsoft Research
Eedi will provide $5,000 in cash prizes for the competition. A $1,000 prize will be awarded to the winning team of each task. In addition, a $1,000 prize will be awarded to the overall winner across all tasks, determined by the team's average rank across the competition tasks (smallest average rank wins). If a team has not submitted a working solution to a particular task, their rank for that task will be taken to be the total number of entrants across all tasks. In the event of a tie, this prize will be split evenly between the tied teams.
The competition will be promoted on Microsoft Research and Eedi's social media platforms. A webpage will be set up detailing the competition, and links will be distributed to machine learning mailing lists and public forums.
We will also distribute to university mailing lists and will encourage university lecturers to use this competition for course projects.
| Download | Size (MB) | Phase |
|---|---|---|
| Public Data | 190.341 | #1 Task 1 Public |
| Public Data | 95.589 | #2 Task 1 Private |
| Public Data | 8.297 | #3 Task 2 Public |
| Public Data | 4.373 | #4 Task 2 Private |
| Public Data | 8.250 | #5 Task 3 Public |
| Public Data | 0.002 | #7 Task 4 Public |
June 27, 2022: Tasks 1 and 2 released.
August 09, 2022: Tasks 3 and 4 released.
October 15, 2022: Final submission deadline for all tasks.
November 15, 2022: Results announced, private leaderboards revealed, prize-winners notified.
December 6, 2022: Virtual competition workshop, results announced, private leaderboards revealed.
| Phase | Start | Description |
|---|---|---|
| Task 1 Public | June 27, 2022, midnight | adjacency matrix prediction (synthetic data) |
| Task 1 Private | June 27, 2022, midnight | adjacency matrix prediction (synthetic data) |
| Task 2 Public | June 27, 2022, midnight | average treatment effect prediction (synthetic data) |
| Task 2 Private | June 27, 2022, midnight | average treatment effect prediction (synthetic data) |
| Task 3 Public | Aug. 10, 2022, 5 p.m. | adjacency matrix prediction (real data) |
| Task 3 Private | Oct. 7, 2022, 1 p.m. | adjacency matrix prediction (real data) |
| Task 4 Public | Aug. 10, 2022, 5 p.m. | average treatment effect prediction (real data) |
| Task 4 Private | Oct. 7, 2022, 1 p.m. | average treatment effect prediction (real data) |

Competition ends: Oct. 29, 2022, midnight
| # | Username | Score |
|---|---|---|
| 1 | skt | -0.00 |
| 2 | shshen | -0.00 |
| 3 | goforit | -0.00 |