Evaluation Comparison Report - test

Comparing evaluations
Created on February 8 | Last edited on February 8
[Chart: hellaswag/acc and hellaswag/acc_norm_stderr for runs volcanic-violet-123 and super-donkey-122; axis ticks from 0.00 to 0.40]