Upup-ashton-wang's group workspace

Group: Resa - Main Models

2025-08-07 07:42:14  "transformers_version": "4.51.1",

2025-08-07 07:42:14  "use_cache": true,

2025-08-07 07:42:14  "use_mrope": false,

2025-08-07 07:42:14  "use_sliding_window": false,

2025-08-07 07:42:14  "vocab_size": 151936

2025-08-07 07:42:14  }

2025-08-07 07:42:14  tokenizer config file saved in /project/neiswang_1391/shangsha/reasoning/reasoning-sae/ckpts/models/DeepSeek-R1-Distill-Qwen-1.5B/sae_tuning_deepscaler/DeepSeek-R1-Distill-Qwen-1.5B_grpo_still_checkpoint-0/trained_from_scratch_deepscaler_model.layers.12/tokenizer_config.json

2025-08-07 07:42:14  Special tokens file saved in /project/neiswang_1391/shangsha/reasoning/reasoning-sae/ckpts/models/DeepSeek-R1-Distill-Qwen-1.5B/sae_tuning_deepscaler/DeepSeek-R1-Distill-Qwen-1.5B_grpo_still_checkpoint-0/trained_from_scratch_deepscaler_model.layers.12/special_tokens_map.json

2025-08-07 07:42:14  Final model saved to /project/neiswang_1391/shangsha/reasoning/reasoning-sae/ckpts/models/DeepSeek-R1-Distill-Qwen-1.5B/sae_tuning_deepscaler/DeepSeek-R1-Distill-Qwen-1.5B_grpo_still_checkpoint-0/trained_from_scratch_deepscaler_model.layers.12

2025-08-07 07:42:14  Training finished.