LLM-rated Answer Correctness (avg) (25/03/19 13:47:14)
Nicolas Remerscheid
Created on March 19 | Last edited on March 19
[Figure: LLM-rated Answer Correctness (avg). Axis scale: 0 to 100.]
Runs shown:
eager-spaceship-37 --- meta-llama/Llama-2-7b-chat-hf
wise-salad-38 --- meta-llama/Llama-2-7b-chat-hf
young-sweep-5 --- meta-llama/Llama-2
vital-sweep-20 --- meta-llama/Llama-2
dark-sweep-9 --- meta-llama/Llama-2
Run set: 35