wandb artifact get wandb/jobs/openai_evals_registry:latest --root openai_evals_registry
Modify registry/modelgraded/custom.yaml (left panel below).
Modify registry/data/custom/samples_labeled.jsonl (right panel below). Each row should be a JSON object that contains at least these keys:
input: The query submitted to your LLM
completion: The LLM's response
choice: The correct choice among the options
wandb artifact put openai_evals_registry
{"run_config": {"eval": "custom-meta","model": "wandb/jobs/openai_evals_model:v0","registry": "your_entity/your_project/openai_evals_registry:latest","oaieval_settings": {"max_samples": 10}}}