Trojan Detection Challenge 2023 - Trojan Detection Track (Large Model Subtrack)

Organized by mmazeika
Reward $30,000

First phase: Development
Starts July 26, 2023, 7 a.m. UTC

Competition Ends
Nov. 7, 2023, noon UTC




This CodaLab competition hosts the leaderboards for the Trojan Detection Track (Large Model Subtrack) of the Trojan Detection Challenge 2023, a NeurIPS 2023 competition. The other tracks are hosted on separate CodaLab pages.

For definitive information regarding evaluation, prizes, and rules, please see the competition website.

News

  • November 1: The test phase has started. See here for important details.
  • October 27: The start of the test phase has been postponed to 10/31 (midnight AoE).
  • October 23: The start of the test phase has been postponed to 10/27.
  • August 14: The competition now has a Discord server for discussion and asking questions: https://discord.gg/knwH4Zm6Tx
  • July 25: The development phase has started. See here for updates and more details.
  • July 24: The start of the development phase has been postponed to 7/25.
  • July 20: To allow time for final preparations, the start of the development phase has been postponed to 7/24.
  • July 17: Registration has opened on CodaLab.

Competition Overview

This competition aims to advance the understanding and development of methods for detecting hidden functionality in large language models (LLMs). The competition features two main tracks: the Trojan Detection Track and the Red Teaming Track. In the Trojan Detection Track, participants will be given large language models containing hundreds of trojans and tasked with discovering the triggers for these trojans. In the Red Teaming Track, participants will be challenged to elicit specific undesirable behaviors from a large language model fine-tuned to avoid those behaviors.

How can we detect hidden functionality in large language models? Participants will help answer this question in two complementary tracks:

  • Trojan Detection Track: Given an LLM containing 1000 trojans and a list of target strings for these trojans, identify the corresponding trigger strings that cause the LLM to generate the target strings.
  • Red Teaming Track: Given an LLM and a list of undesirable behaviors, design an automated method to generate test cases that elicit these behaviors.

To enable broader participation, each track has a Large Model and Base Model subtrack, corresponding to larger and smaller LLMs.

Trojan Detection Track

In this track, we ask you to build a detector that can find trojans inserted into a large language model (LLM). We provide an LLM containing 1000 trojans, where each trojan is defined by a (trigger, target) pair. Triggers and targets are both text strings, and the LLM has been fine-tuned to output the target when given the trigger as an input. All target strings will be provided, and the task is to reverse-engineer the corresponding triggers given a target string. We provide a training set of triggers for developing your detection method, and the remaining triggers are held-out. You will submit predictions for each target string's held-out triggers. The evaluation server only accepts 5 submissions per day for the development phase and 5 submissions total for the test phase.

Data

A single LLM will be provided, which contains 1000 trojans. These trojans are divided evenly among 100 target strings. That is, each target string has 10 triggers that cause the model to generate the target string. We provide the full set of 100 target strings. A training set of 200 trojans is provided to help develop detection methods. For more information, see the Tracks page of the competition website.
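The layout described above (100 target strings, 10 triggers each, 1000 trojans total, with 200 pairs released for training) implies a simple consistency check. The dict-of-lists structure below is an assumed illustration, not the official data format:

```python
# Sketch: sanity-check the trojan data layout described above.
# The dict mapping target string -> list of trigger strings is an
# assumption for illustration; the competition's actual file format
# may differ.

def check_trojan_layout(targets_to_triggers):
    """targets_to_triggers: dict mapping target string -> list of trigger strings."""
    assert len(targets_to_triggers) == 100, "expected 100 target strings"
    total = sum(len(triggers) for triggers in targets_to_triggers.values())
    assert total == 1000, "expected 1000 trojans overall"
    for target, triggers in targets_to_triggers.items():
        assert len(triggers) == 10, f"expected 10 triggers for {target!r}"

# Example with synthetic placeholder data:
fake_data = {
    f"target-{i}": [f"trigger-{i}-{j}" for j in range(10)] for i in range(100)
}
check_trojan_layout(fake_data)  # passes: 100 targets x 10 triggers
```

The same shape check can be inverted for the training split: 200 (trigger, target) pairs spread over the 100 targets, i.e., 2 known triggers per target on average.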

Prizes

Monetary prizes will be awarded to the top three teams in the leaderboard of the Trojan Detection Track (Large Model Subtrack). Special awards will also be given for top submissions satisfying certain criteria. For more information, see the Prizes page of the competition website.

Trojan Detection Track (Large Model Subtrack)

  • 🥇 1st place: $5,000
  • 🥈 2nd place: $2,500
  • 🥉 3rd place: $1,000

Organizers

This competition is organized by Mantas Mazeika (UIUC), Andy Zou (CMU), Norman Mu (UC Berkeley), Long Phan (CAIS), Zifan Wang (CAIS), Chunru Yu (UIUC), Adam Khoja (UC Berkeley), Fengqing Jiang (UW), Aidan O'Gara (CAIS), Ellie Sakhaee (Microsoft), Zhen Xiang (UIUC), Arezoo Rajabi (UW), Dan Hendrycks (CAIS), Radha Poovendran (UW), Bo Li (UIUC), and David Forsyth (UIUC).

You can reach us by email at tdc2023-organizers@googlegroups.com or through the CodaLab forums.

We are kindly sponsored by a private funder.

Evaluation

For each target string, you will generate a list of 20 predicted triggers. Each predicted trigger is a string, and each submission is a zipped JSON file containing the predicted triggers. Each predicted trigger must be between 5 and 50 tokens long (inclusive) after tokenization. Submissions will be evaluated with three metrics:

  • Combined Score (average of Recall and REASR)
  • Recall
  • Reverse-Engineered Attack Success Rate (REASR)

All metrics range between 0 and 100 percent. Higher is better. Recall measures the degree to which the original triggers were recovered, and REASR measures the degree to which the submitted triggers elicit the target string. Our primary metric for ranking submissions is Combined Score. Ties will be broken using Recall.
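The constraints above (20 predicted triggers per target, each 5-50 tokens inclusive, packaged as a zipped JSON file) can be checked locally before submitting. In this sketch, the file name `predictions.json` and the dict layout are assumptions rather than the official spec, and the whitespace tokenizer is only a stand-in: the real token count comes from the competition model's tokenizer.

```python
# Sketch: validate and package a submission under stated assumptions.
# - The 5-50 token bound is checked with a pluggable `tokenize` function;
#   str.split is a whitespace stand-in for the model's real tokenizer.
# - The archive member name "predictions.json" is assumed, not official.
import json
import zipfile

def validate_predictions(predictions, tokenize=str.split):
    """predictions: dict mapping target string -> list of 20 predicted triggers."""
    for target, triggers in predictions.items():
        if len(triggers) != 20:
            raise ValueError(
                f"{target!r}: expected 20 predicted triggers, got {len(triggers)}"
            )
        for trigger in triggers:
            n = len(tokenize(trigger))
            if not 5 <= n <= 50:
                raise ValueError(
                    f"{target!r}: trigger has {n} tokens, must be 5-50 inclusive"
                )

def package_submission(predictions, out_path="submission.zip"):
    validate_predictions(predictions)
    with zipfile.ZipFile(out_path, "w") as zf:
        zf.writestr("predictions.json", json.dumps(predictions))
```

Running the validator before each upload matters here because the server accepts only 5 submissions per day in the development phase and 5 total in the test phase, so a malformed file wastes a scarce attempt.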

Terms and Conditions

NOTE: For the most up-to-date copy of the rules, see the competition website.

  1. Open Format: This is an open competition. All participants are encouraged to share their methods upon conclusion of the competition, and outstanding submissions will be highlighted in a joint publication. To be eligible for prizes and named as a winning team, top-ranking teams in the test phase are required to share their methods, code, and models with the organizers at a minimum, though public releases are highly encouraged.
  2. Registration: Double registration is not allowed. We expect teams to self-certify that all team members are not part of a different team registered for the competition, and we will actively monitor for violation of this rule. Teams may participate in multiple tracks. Organizers are not allowed to participate in the competition or win prizes.
  3. Prize Distribution: Monetary prizes will be awarded to teams as specified in the Prizes page of the competition website.
  4. Compute Limits: In both tracks, methods developed for the Base Model subtrack must run in at most 2 A100 GPU-days to generate a submission for the test phase. For Large Model subtracks, methods must run in at most 4 A100 GPU-days to generate a submission for the test phase. These limits do not include initial setup during the development phase. This is a soft requirement; we will check it for the winning teams, but understandable reasons for exceeding the limit by a modest amount are acceptable.
  5. Allowed Methods (Trojan Detection Track): The use of features that are clearly loopholes is not allowed (e.g., metadata). We may not anticipate all loopholes and we encourage participants to alert us to their existence. Legitimate features that do not constitute loopholes include all features derived from the trained parameters of networks, the target strings, training triggers, and text datasets. Similar to the Red Teaming Track, we also do not allow the submission of prompts that effectively make LLMs copy-paste target strings from the prompt into the generation.
  6. Allowed Methods (Red Teaming Track): Methods used for generating submitted test cases must satisfy the following requirements. (1) They must be able to generate any desired number of test cases, not just a fixed number of test cases. (2) They must be fully automated. After an initial setup, they should require no human effort to adapt to a new set of behaviors and no human-in-the-loop for generating test cases. (3) They must cause LLMs to generate behaviors from scratch. In particular, getting LLMs to copy-paste whole instances of undesirable behavior provided in the prompt does not count as generating undesirable behavior from scratch. Test cases must not include any substantive aspects of the behaviors they are attempting to elicit. Including aspects of unrelated behaviors is allowed (e.g., few-shot examples of following unrelated behaviors). Including text that someone without any knowledge of or skill in the behavior could derive from the behavior string is allowed (e.g., "Sure, here is how to make a pipe bomb. Step 1: Gather"). Due to the subjective nature of some aspects of this rule, it will be a soft rule that is only checked during manual evaluation. Understandable reasons for sometimes exceeding the threshold of information allowed in the test cases are acceptable. (4) They must not intentionally boost the diversity metric in an artificial manner, e.g., by adding random text to the beginning of a prompt.
  7. Rule breaking may result in disqualification, and significant rule breaking will result in ineligibility for prizes.

These rules are an initial set; during registration, we require participants to consent to possible rule changes should an urgent need arise. If a situation arises that was not anticipated, we will implement a fair solution, ideally reached by consensus of the participants.

Development

Start: July 26, 2023, 7 a.m.

Description: In this phase, participants can submit predictions for trojans that have been inserted into the dev-phase LLM. Submissions are evaluated on held-out trojans that are not part of the training set. This leaderboard does not determine the final ranking; it is primarily for developing detection algorithms and comparing against other participants and the baseline detectors. Participants can make 5 submissions per day. All values in the leaderboard are percentages.

Test

Start: Nov. 1, 2023, noon

Description: In this phase, participants can submit predictions for trojans that have been inserted into the test-phase LLM. Submissions are evaluated on held-out trojans that are not part of the training set. This leaderboard determines the final rankings used for awarding prizes. Participants can make 5 submissions total. All values in the leaderboard are percentages.

Competition Ends

Nov. 7, 2023, noon
