Autometa's group workspace

2024-02-29 20:14:27
|  - high_school_biology                |      0|none            |None  |acc        |0.7774|±  |0.0237|
|  - high_school_chemistry              |      0|none            |None  |acc        |0.5320|±  |0.0351|
|  - high_school_computer_science       |      0|none            |None  |acc        |0.6600|±  |0.0476|
|  - high_school_mathematics            |      0|none            |None  |acc        |0.3444|±  |0.0290|
|  - high_school_physics                |      0|none            |None  |acc        |0.3642|±  |0.0393|
|  - high_school_statistics             |      0|none            |None  |acc        |0.5417|±  |0.0340|
|  - machine_learning                   |      0|none            |None  |acc        |0.5000|±  |0.0475|
|hellaswag                              |      1|none            |None  |acc        |0.6048|±  |0.0049|
|                                       |       |none            |None  |acc_norm   |0.8065|±  |0.0039|
|gsm8k                                  |      3|strict-match    |5     |exact_match|0.5292|±  |0.0137|
|                                       |       |flexible-extract|5     |exact_match|0.5337|±  |0.0137|
|arc_challenge                          |      1|none            |None  |acc        |0.4915|±  |0.0146|
|                                       |       |none            |None  |acc_norm   |0.5384|±  |0.0146|

|      Groups      |Version|Filter|n-shot|Metric|Value |   |Stderr|
|------------------|-------|------|------|------|-----:|---|-----:|
|mmlu              |N/A    |none  |     0|acc   |0.6167|±  |0.0038|
| - humanities     |N/A    |none  |None  |acc   |0.5520|±  |0.0067|
| - other          |N/A    |none  |None  |acc   |0.6933|±  |0.0080|
| - social_sciences|N/A    |none  |None  |acc   |0.7241|±  |0.0079|
| - stem           |N/A    |none  |None  |acc   |0.5331|±  |0.0085|