llm_surgery

Autometa's group workspace

Group: Mistral-7B-v0.1

2024-03-11 00:14:08  [INFO|trainer.py:3067] 2024-03-11 00:14:07,751 >> Saving model checkpoint to data/zephyr-dpo-full
2024-03-11 00:14:08  [INFO|configuration_utils.py:473] 2024-03-11 00:14:07,752 >> Configuration saved in data/zephyr-dpo-full/config.json
2024-03-11 00:14:08  [INFO|configuration_utils.py:614] 2024-03-11 00:14:07,753 >> Configuration saved in data/zephyr-dpo-full/generation_config.json
2024-03-11 00:14:20  [INFO|modeling_utils.py:2462] 2024-03-11 00:14:19,029 >> The model is bigger than the maximum size per checkpoint (5GB) and is going to be split in 3 checkpoint shards. You can find where each parameters has been saved in the index located at data/zephyr-dpo-full/model.safetensors.index.json.
2024-03-11 00:14:20  [INFO|tokenization_utils_base.py:2459] 2024-03-11 00:14:19,030 >> tokenizer config file saved in data/zephyr-dpo-full/tokenizer_config.json
2024-03-11 00:14:20  [INFO|tokenization_utils_base.py:2468] 2024-03-11 00:14:19,030 >> Special tokens file saved in data/zephyr-dpo-full/special_tokens_map.json
2024-03-11 00:14:20  INFO:__main__:Model saved to data/zephyr-dpo-full
2024-03-11 00:14:20  [INFO|modelcard.py:450] 2024-03-11 00:14:19,050 >> Dropping the following result as it does not have all the necessary fields:
2024-03-11 00:14:20  {'task': {'name': 'Causal Language Modeling', 'type': 'text-generation'}, 'dataset': {'name': 'argilla/dpo-mix-7k', 'type': 'argilla/dpo-mix-7k', 'config': None, 'split': 'None'}}
2024-03-11 00:14:20  [INFO|configuration_utils.py:473] 2024-03-11 00:14:19,052 >> Configuration saved in data/zephyr-dpo-full/config.json
2024-03-11 00:14:20  INFO:__main__:Saving model as artifact to wandb
2024-03-11 00:15:10  wandb: Adding directory to artifact (./data/zephyr-dpo-full)... Done. 49.7s
2024-03-11 00:15:10  INFO:__main__:*** Training complete! ***
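
Note on the sharded checkpoint: the modeling_utils.py message in the log says the weights were split into 3 shards and that model.safetensors.index.json records where each parameter was saved. The snippet below is a minimal sketch of how that index could be inspected, assuming it follows the usual Hugging Face layout with a top-level "weight_map" object mapping parameter names to shard file names. The helper name params_by_shard is hypothetical and is not part of the training script that produced this log.

```python
import json
from collections import defaultdict
from pathlib import Path


def params_by_shard(index_path):
    """Group parameter names by the shard file that stores them.

    Assumes the index JSON has a "weight_map" object of the form
    {parameter_name: shard_file_name}.
    """
    index = json.loads(Path(index_path).read_text())
    shards = defaultdict(list)
    for param_name, shard_file in index["weight_map"].items():
        shards[shard_file].append(param_name)
    return dict(shards)


if __name__ == "__main__":
    # Path taken from the log above; skipped quietly if the file is absent.
    index_file = Path("data/zephyr-dpo-full/model.safetensors.index.json")
    if index_file.exists():
        for shard, names in sorted(params_by_shard(index_file).items()):
            print(f"{shard}: {len(names)} tensors")
```

Run against the output directory from this log, this would print one line per shard file with the number of tensors it holds, which is a quick way to confirm that all shards listed in the index were written before the directory is uploaded as an artifact.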