Evaluating LLM outputs accurately is critical to iterating quickly on an LLM system. Human annotation is slow and expensive, and using LLMs as judges promises to solve this. However, aligning an LLM Judge with human judgements is often hard, with many implementation details to consider.
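To make "aligning a judge with human judgements" concrete, here is a minimal sketch in Python. It assumes the OpenAI SDK and a hypothetical `gpt-4o-mini` judge model (any chat-completion API and model would do), uses a simple pass/fail judge prompt, and scores the judge by its raw agreement with human labels; real setups often use richer rubrics and metrics such as Cohen's kappa.

```python
from openai import OpenAI  # assumes the OpenAI Python SDK; any chat-completion client works

client = OpenAI()

JUDGE_PROMPT = """You are grading an answer to a question.
Question: {question}
Answer: {answer}
Reply with a single word: "pass" or "fail"."""


def judge(question: str, answer: str, model: str = "gpt-4o-mini") -> str:
    """Ask the judge model for a pass/fail verdict on one answer."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "user", "content": JUDGE_PROMPT.format(question=question, answer=answer)}
        ],
        temperature=0,
    )
    return response.choices[0].message.content.strip().lower()


def agreement(examples: list[dict]) -> float:
    """Fraction of examples where the judge's verdict matches the human label.

    Each example has "question", "answer", and a human "label" of "pass" or "fail".
    """
    matches = sum(judge(ex["question"], ex["answer"]) == ex["label"] for ex in examples)
    return matches / len(examples)


if __name__ == "__main__":
    # Tiny hypothetical labeled set; in practice, load your own human annotations.
    examples = [
        {"question": "What is 2 + 2?", "answer": "4", "label": "pass"},
        {"question": "Capital of France?", "answer": "Berlin", "label": "fail"},
    ]
    print(f"Judge/human agreement: {agreement(examples):.0%}")
```

Iterating on the judge prompt, model, and scoring rubric to push that agreement number up is exactly the kind of work this hackathon is about.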
During the hackathon, let’s try to build LLM Judges together and move the field forward a little by:
This hackathon is for you if you are an AI Engineer who:
LLM API credits will be provided to those who need them.
$5,000 in cash-equivalent prizes will be awarded to the top 3 overall projects, with a bonus category for the most on-theme projects.
Saturday, Sept 21: 10am-10pm
Sunday, Sept 22: 9:30am-5pm