Note: If you want to discuss submissions or learn more about language models, talk to us on our WhatsApp or Discord.

The Challenge

Chai uses the best language models in the world to power the chatbots that users speak to on our platform. We currently have more than 100,000 daily active users on our mobile app, who send more than 1 million messages per day. You can be part of our journey by helping us build state-of-the-art language models and improve our chatbots.

The goal of this competition is to improve on the state-of-the-art language model and raise the quality of the conversations people have with the chatbots.

Prize

$1,000 every time you beat the best score on the MCL leaderboard by at least 0.5. $500 every time you beat the best score on the Reward model leaderboard.

Evaluation

Submissions are first evaluated on the median conversation length (MCL) that actual users have with your model. When you submit a new model, we will run an A/B test on our app and have users speak to it. Your score is the median length of the conversations that users have with it.
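As a rough illustration of the metric, here is a minimal sketch of how a median conversation length could be computed from per-conversation message counts. The function names and the sample data are hypothetical. Reporting the score as a difference from a baseline arm is an assumption, suggested by the leaderboard showing EleutherAI/gpt-j-6B at a score of 0. It is not a description of Chai's actual pipeline.

```python
from statistics import median


def median_conversation_length(lengths):
    """Return the median number of messages per conversation."""
    if not lengths:
        raise ValueError("need at least one conversation")
    return median(lengths)


def relative_mcl(candidate_lengths, baseline_lengths):
    """Candidate MCL minus baseline MCL (assumed scoring convention)."""
    return (median_conversation_length(candidate_lengths)
            - median_conversation_length(baseline_lengths))


# Hypothetical message counts from the two arms of an A/B test.
baseline = [4, 9, 6, 15, 8]
candidate = [5, 11, 7, 20, 10]

print(median_conversation_length(candidate))
print(relative_mcl(candidate, baseline))
```

Because the median ignores extreme values, a handful of very long conversations will not move the score; the typical conversation has to get longer.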

Submissions are also evaluated on a reward model score. The reward model scores your model's responses: a score of 0.9 means we estimate a 10% probability that the conversation will end. Higher reward model scores should lead to a higher MCL, but they don't always. You can run the reward model locally to better understand how to improve your models.
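To make the link between the reward score and conversation length concrete, here is a small sketch. The first function follows directly from the description above (end probability = 1 - score). The second rests on a simplifying assumption that is not from Chai: every response independently ends the conversation with the same probability, so the number of responses follows a geometric distribution. Real conversations will not behave this neatly, which is one reason reward scores and MCL can disagree.

```python
import math


def end_probability(reward_score):
    """Estimated probability that the conversation ends after a response."""
    if not 0.0 < reward_score < 1.0:
        raise ValueError("reward_score must be strictly between 0 and 1")
    return 1.0 - reward_score


def geometric_median_length(reward_score):
    """Median number of responses before the conversation ends.

    ASSUMPTION: each response ends the conversation independently with
    probability (1 - reward_score). This is an illustrative model only.
    """
    end_probability(reward_score)  # reuse the range check
    return math.ceil(math.log(0.5) / math.log(reward_score))


print(f"{end_probability(0.9):.1%}")
print(geometric_median_length(0.9))
```

Under this toy model, small gains in reward score matter more as the score approaches 1, because the implied end probability shrinks proportionally faster.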

Leaderboard

Median conversation length (MCL)

| Submission | Participant | Score | Latency (s) |
| --- | --- | --- | --- |
| 🥇 hakurei/litv2-6B-rev2 | Reimu Hakurei | +1.95 | 1.74 |
| 🥈 hakurei/litv2-6B-rev1 | Reimu Hakurei | +1.24 | 1.65 |
| 🥉 hakurei/lit-6B | Reimu Hakurei | +1.12 | 2.17 |
| KoboldAI/GPT-J-6B-Shinen | Julius ter Pelkwijk | +0.65 | 3.34 |
| EleutherAI/gpt-j-6B | EleutherAI | 0 | 2.48 |
| KoboldAI/OPT-6B-nerys-v2 | Julius ter Pelkwijk | -6.31 | 12.80 |

Reward model

| Submission | Participant | Reward |
| --- | --- | --- |
| 🥇 KoboldAI/GPT-J-6B-Shinen | Julius ter Pelkwijk | 0.8803 |
| 🥈 EleutherAI/gpt-j-6B | EleutherAI | 0.8797 |
| 🥉 hakurei/litv2-6B-rev2 | Reimu Hakurei | 0.8796 |
| KoboldAI/OPT-6B-nerys-v2 | Julius ter Pelkwijk | 0.8781 |

Example

The models we use are hosted on HuggingFace and you can run the example script from our Google Colab notebook.

Submission

Upload your model to HuggingFace and send the link to us here. The deadline for submission is January 1, 2023.


FAQ

  1. How can I submit a model? Once you have a model you want to submit, upload it to HuggingFace and share it in the submissions channel for the team to review! We will be in touch with you immediately.

  2. Do you provide computing resources? At this stage we don't; we expect people to use Colab or their own setup. If this is a major problem, we can look into it with you.

  3. Is latency important? Yes. The latency of our currently deployed models is ~1.5 s per inference. We estimate that every 1 s improvement in latency raises your score by 0.4 MCL. We cannot deploy submissions that take longer than ~4 s per inference.

  4. Can I get some help? Yes! We are happy to help you with any questions or clarifications. Reach out to us on WhatsApp or Discord.

  5. What is the deadline for submissions? The competition will run until January 1, 2023.

  6. What happens once I submit? We will deploy your model as soon as possible and be in touch with you. If your solution tops the leaderboard, we'll have a call with you to see your code: we want to share winning solutions with other participants so that the community can build on one another's work.
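The latency guidance in question 3 can be turned into a quick back-of-the-envelope check. The sketch below uses only the two figures quoted there (0.4 MCL per second of latency improvement, and a deployment limit of roughly 4 s); the function and constant names are hypothetical, and the linear relationship is Chai's estimate, not a guarantee.

```python
MCL_PER_SECOND = 0.4        # estimated MCL gain per 1 s latency improvement
MAX_LATENCY_SECONDS = 4.0   # approximate deployment limit per inference


def estimated_mcl_gain(current_latency, new_latency):
    """Estimated MCL change when latency moves from current to new (seconds)."""
    return MCL_PER_SECOND * (current_latency - new_latency)


def is_deployable(latency):
    """True if the model's latency is within the approximate 4 s limit."""
    return latency <= MAX_LATENCY_SECONDS


# Example: speeding a model up from 3.0 s to 2.0 s per inference.
print(round(estimated_mcl_gain(3.0, 2.0), 2))
print(is_deployable(3.34), is_deployable(12.80))
```

A negative result from `estimated_mcl_gain` means the change made the model slower and is expected to cost MCL.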

Resources