Xk-huang's workspace
Runs
23
| Field | illpepzw | 6y1xmm9y | vts1pbwr |
| --- | --- | --- | --- |
| Runtime | 3d 39m 2s | 9h 46m 7s | 5h 52m 57s |
| actor_rollout_ref.actor.optim.total_training_steps | 645 | 320 | 320 |
| actor_rollout_ref.actor.ppo_micro_batch_size_per_gpu | 16 | 32 | 32 |
| actor_rollout_ref.actor.ppo_mini_batch_size | 64 | 128 | 128 |
| actor_rollout_ref.model.path | Qwen/Qwen2.5-VL-32B-Instruct | Qwen/Qwen2.5-VL-7B-Instruct | Qwen/Qwen2.5-VL-3B-Instruct |
| actor_rollout_ref.ref.log_prob_micro_batch_size_per_gpu | 16 | 32 | 32 |
| actor_rollout_ref.rollout.log_prob_micro_batch_size_per_gpu | 16 | 32 | 32 |
| critic.model.tokenizer_path | Qwen/Qwen2.5-VL-32B-Instruct | Qwen/Qwen2.5-VL-7B-Instruct | Qwen/Qwen2.5-VL-3B-Instruct |
| critic.optim.total_training_steps | 645 | 320 | 320 |
| critic.ppo_mini_batch_size | 64 | 128 | 128 |
| data.shuffle | true | true | true |
| data.train_batch_size | 128 | 256 | 256 |
| data.train_files | /opt/dlami/nvme/xhuan192/codes/med-vlrm/data/verl/med-vlm-m23k-qwen2_5_vl_3b-easy_to_hard/train.parquet | data/verl/med-vlm-m23k-qwen2_5_vl_3b-easy_to_hard/train.parquet | data/verl/med-vlm-m23k-qwen2_5_vl_3b-easy_to_hard/train.parquet |
| reward_model.model.input_tokenizer | Qwen/Qwen2.5-VL-32B-Instruct | Qwen/Qwen2.5-VL-7B-Instruct | Qwen/Qwen2.5-VL-3B-Instruct |
| trainer.default_local_dir | checkpoints/med-vlrm/train-qwen2_5_vl_7b-m23k | checkpoints/med-vlrm/train-qwen2_5_vl_7b-m23k | checkpoints/med-vlrm/train-qwen2_5_vl_3b-m23k |
| trainer.experiment_name | train-qwen2_5_vl_7b-m23k | train-qwen2_5_vl_7b-m23k | train-qwen2_5_vl_3b-m23k |
| trainer.save_freq | 50 | 50 | 50 |
| trainer.test_freq | 50 | 50 | 50 |
| trainer.total_epochs | 5 | 5 | 5 |
| actor_rollout_ref.actor.checkpoint.load_contents | ["model","optimizer","extra"] | - | - |
| actor_rollout_ref.actor.checkpoint.save_contents | ["model","optimizer","extra"] | - | - |
| actor_rollout_ref.actor.entropy_checkpointing | false | - | - |
| actor_rollout_ref.actor.entropy_from_logits_with_chunking | false | - | - |
| actor_rollout_ref.actor.fsdp_config.forward_prefetch | false | - | - |
| actor_rollout_ref.model.fused_kernel_options.impl_backend | torch | - | - |
| actor_rollout_ref.ref.entropy_checkpointing | false | - | - |
| actor_rollout_ref.ref.entropy_from_logits_with_chunking | false | - | - |
| actor_rollout_ref.ref.fsdp_config.forward_prefetch | false | - | - |
| actor_rollout_ref.rollout.multi_turn.enable_tokenization_sanity_check | true | - | - |
| actor_rollout_ref.rollout.multi_turn.use_inference_chat_template | false | - | - |
| critic.checkpoint.load_contents | ["model","optimizer","extra"] | - | - |
| critic.checkpoint.save_contents | ["model","optimizer","extra"] | - | - |
| critic.model.fsdp_config.forward_prefetch | false | - | - |
| data.validation_shuffle | false | - | - |
| reward_model.model.fsdp_config.forward_prefetch | false | - | - |
| trainer.max_actor_ckpt_to_keep | 2 | - | - |
| trainer.max_critic_ckpt_to_keep | 2 | - | - |
| Commit | - | 95a6f24c09559248ddb2b7e77f9b25eb2f138e0f | 5977eb7030292a112eb8a3254a82f2fd9bac0243 |
| Created | Jun 19 '25 22:29 | Jun 06 '25 07:15 | Jun 06 '25 01:47 |
| GitHub | - | - | - |
| End Time | - | - | - |
| Hostname | - | - | - |
| ID | illpepzw | 6y1xmm9y | vts1pbwr |
| Notes | - | - | - |
| State | Finished | Finished | Finished |
| Updated | Jun 19 '25 22:29 | Jun 06 '25 07:15 | Jun 06 '25 01:47 |
| actor/entropy_loss | - | 0.79866 | 1.18377 |
| actor/grad_norm | 0.17737 | 0.34715 | 0.33439 |
| actor/kl_loss | 0.051244 | 0.045927 | 0.021837 |
| actor/pg_clipfrac | 0.0008924 | 0.00055478 | 0.00034077 |
| actor/pg_loss | -0.033434 | 0.10208 | -0.026197 |
| actor/ppo_kl | 0.00014574 | -0.00013131 | -0.000076161 |
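The configuration columns in this listing are flattened, dot-separated keys (for example `trainer.total_epochs`). As a minimal sketch, assuming the dots simply denote nesting levels, the following Python snippet rebuilds a nested dictionary from a few of the listed key/value pairs. The helper name `nest` and the variable names are hypothetical and are not part of any training library; the values are copied from the `Qwen/Qwen2.5-VL-3B-Instruct` run.

```python
def nest(flat):
    """Expand dotted keys such as 'trainer.total_epochs' into nested dicts.

    Hypothetical helper for illustration; assumes '.' only separates levels.
    """
    tree = {}
    for dotted_key, value in flat.items():
        *parents, leaf = dotted_key.split(".")
        node = tree
        for part in parents:
            # Create the intermediate level on first use, then descend into it.
            node = node.setdefault(part, {})
        node[leaf] = value
    return tree


# A few key/value pairs taken from the 3B run in the listing.
flat_config = {
    "actor_rollout_ref.model.path": "Qwen/Qwen2.5-VL-3B-Instruct",
    "actor_rollout_ref.actor.ppo_mini_batch_size": 128,
    "data.train_batch_size": 256,
    "trainer.total_epochs": 5,
}

nested = nest(flat_config)
print(nested["actor_rollout_ref"]["model"]["path"])
```

Grouping the keys this way makes it easier to see which settings belong to the actor, the data pipeline, or the trainer when comparing runs.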