Frisian Coqui STT (small vocabulary)

Organized by josh-coqui

First phase

Train a 🐸 STT model
Nov. 10, 2021, 8 a.m. UTC


Competition Ends
Nov. 17, 2021, 8 a.m. UTC

Frisian (small vocabulary) Speech-to-Text

Brought to you by 🐸 Coqui

The goal of this competition is to create the most accurate Frisian (small vocabulary) Speech-to-Text model using the 🐸 STT Toolkit. The training, testing, and evaluation data come from the Common Voice 7.0 release of the 'Single Word Target Segments Corpus'. This dataset contains only the Frisian words for 'yes' and 'no' (if such words exist) and for the single digits 'zero' through 'nine'. Although this is a constrained-vocabulary task, you can submit any kind of Coqui STT model, as long as your model's alphabet contains every character of every word in this vocabulary. When in doubt, use the alphabet.txt file provided in the training set.
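A quick way to check the alphabet requirement is to compare the characters in the vocabulary against the alphabet file. The sketch below assumes the usual Coqui STT alphabet layout (one character per line, lines starting with '#' treated as comments, and '\#' as an escaped literal '#'). The function names and the example words are illustrative only; take the real word list from the training transcripts.

```python
def load_alphabet(lines):
    """Collect alphabet characters from an iterable of lines.

    Assumed layout: one character per line, '#' starts a comment,
    and '\\#' stands for a literal '#'.
    """
    chars = set()
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#"):
            continue  # comment line
        if line.startswith("\\#"):
            line = "#"  # escaped literal hash
        if line == "":
            continue  # ignore empty lines
        chars.add(line)
    return chars


def missing_characters(words, alphabet):
    """Return the sorted characters used in `words` that the alphabet lacks."""
    needed = set("".join(words))
    return sorted(needed - alphabet)


# Illustrative usage with an in-memory alphabet; with a real file you could
# pass an open file object instead:
#     with open("alphabet.txt", encoding="utf-8") as f:
#         alphabet = load_alphabet(f)
example_alphabet = load_alphabet(["# demo alphabet\n", " \n", "a\n", "e\n", "j\n", "n\n"])
example_missing = missing_characters(["ja", "nee"], example_alphabet)
```

An empty result means every character in the word list is covered by the alphabet.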

Read more about the Frisian language on Wikipedia.

⏲️ Timeline

The competition starts at 9:00 a.m. CET on November 10, 2021, and ends at 9:00 a.m. CET on November 17, 2021.

⚡ Free GPUs from OVHCloud

OVHCloud has generously donated free GPU time for the competition. To claim it, go to OVH's Discord registration page, fill in the required information, and join the coqui-ai-stt-challenge channel for all the details.

💚 Community Support & Docs

If you're looking for more information, documentation, or community support, check out the following:

  1. GitHub repo for Coqui STT
  2. Documentation
  3. Community Chat room 👋😄

Frisian STT: Evaluation

Your model will be evaluated on a held-out, small-vocabulary test set (see the competition overview for more info) from the Common Voice 7.0 release for the Frisian language. Word Error Rate (WER) will be calculated for each submission and ranked on the leaderboard. The best model is the one with the lowest Word Error Rate.
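For local validation, Word Error Rate can be approximated with the standard definition: the word-level edit distance (substitutions, insertions, deletions) summed over all clips, divided by the total number of reference words. This is a minimal sketch under that assumption, not the competition's actual scoring script; the function names are hypothetical.

```python
def word_edit_distance(ref_words, hyp_words):
    """Levenshtein distance between two lists of words."""
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def word_error_rate(references, hypotheses):
    """Corpus-level WER: total word edits divided by total reference words."""
    total_edits = 0
    total_words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words = ref.split()
        total_edits += word_edit_distance(ref_words, hyp.split())
        total_words += len(ref_words)
    if total_words == 0:
        raise ValueError("references contain no words")
    return total_edits / total_words
```

On a single-word test set like this one, each clip contributes either zero or at least one edit, so WER is close to the fraction of clips transcribed incorrectly.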

In the event of a tie, the smallest model (in megabytes) will win. Accurate models are good, but small and accurate models are better 😄👍.
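The ranking rule amounts to sorting by Word Error Rate first and model size second. The sketch below illustrates that ordering; the `Submission` type, the team names, and the numbers are made up for illustration and are not real leaderboard entries.

```python
from typing import NamedTuple


class Submission(NamedTuple):
    team: str       # hypothetical team name
    wer: float      # Word Error Rate on the test set (lower is better)
    size_mb: float  # model file size in megabytes (tie-breaker)


def rank(submissions):
    """Order submissions by lowest WER, breaking ties by smallest model."""
    return sorted(submissions, key=lambda s: (s.wer, s.size_mb))


# Made-up entries: two teams tie on WER, so the smaller model ranks higher.
entries = [
    Submission("team-a", 0.10, 180.0),
    Submission("team-b", 0.10, 45.0),
    Submission("team-c", 0.25, 10.0),
]
leaderboard = rank(entries)
```

Note that a small model never outranks a more accurate one; size only matters when the WER values are exactly equal.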

Frisian STT: Rules

You may not train on the testing data. However, you may train on any other data, from any language and any dataset. You may use Transfer Learning to bootstrap from a pre-trained model, as long as that pre-trained model was not trained on the test data. In short, don't use the testing data at any step of training.
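If you combine several data sources, a simple sanity check is to confirm that no test clip appears in your training list. The sketch below assumes Coqui STT style CSV files with a `wav_filename` column and compares clips by base file name. The helper names are hypothetical, and matching by file name is only a rough guard: it will not catch renamed or re-encoded copies.

```python
import csv
import io
import os


def clip_names(csv_text):
    """Return the set of clip base names listed in a CSV's wav_filename column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {os.path.basename(row["wav_filename"]) for row in reader}


def leaked_clips(train_csv_text, test_csv_text):
    """Return the sorted clip names that occur in both training and test CSVs."""
    return sorted(clip_names(train_csv_text) & clip_names(test_csv_text))


# Made-up CSV contents for illustration; real files would be read from disk.
train_csv = "wav_filename,wav_filesize,transcript\nclips/a.wav,100,ja\nclips/b.wav,120,nee\n"
test_csv = "wav_filename,wav_filesize,transcript\nclips/b.wav,120,nee\nclips/c.wav,90,ja\n"
overlap = leaked_clips(train_csv, test_csv)
```

Any name in the result should be removed from the training list before training.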

There are limits on the number of submissions per day and on the total number of submissions. See the submission page for more information.

Code of Conduct

Please remember that this is a friendly competition, and we are working together to advance the state of Open speech technology for all the world's languages. If you participate, you agree to Coqui's Code of Conduct.

Contributor License Agreement

Your STT model submissions may be hosted on the Coqui STT Model Zoo under the license of your choice (you should include a LICENSE file with your submission). By making a submission, you agree to Coqui's Contributor License Agreement.
