Asap-zzhou's group workspace
alpaca-ablations
Tags
ModernBERT-large/alpaca/openwebtext/epochs-20-bs-512-len-512
Notes
Author
State
Finished
Start time
October 30th, 2025 12:58:07 AM
Runtime
8m 34s
Tracked hours
8m 31s
Run path
asap-zzhou/dllm/5kq65gdg
OS
Linux-3.10.0-957.el7.x86_64-x86_64-with-glibc2.17
Python version
CPython 3.10.18
Git repository
git clone git@github.com:ZHZisZZ/dllm.git
Git state
git checkout -b "ModernBERT-large/alpaca/openwebtext/epochs-20-bs-512-len-512" df67f70cb557ab2459fc53a3049a414578c1fe2d
Command
/mnt/lustrenew/mllm_aligned/dongzhichen/dllm/examples/bert/sft.py --model_name_or_path models/ModernBERT-large/openwebtext/steps-200000-bs-1024-len-1024/checkpoint-18000 --dataset_args tatsu-lab/alpaca --max_length 512 --num_train_epochs 20 --per_device_train_batch_size 64 --per_device_eval_batch_size 64 --save_steps 0.1 --output_dir models/ModernBERT-large/alpaca/openwebtext/epochs-20-bs-512-len-512
System Hardware
| CPU count | 64 |
| Logical CPU count | 128 |
| GPU count | 8 |
| GPU type | NVIDIA A100-SXM4-80GB |
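The "bs-512" in the run name is consistent with the command and hardware above: 64 samples per device across 8 GPUs gives a global batch of 512. A minimal sketch of that arithmetic, assuming all 8 GPUs took part in training and that gradient accumulation was left at 1 (neither is shown on this page). The function name is illustrative and not part of dllm.

```python
def effective_batch_size(per_device: int, num_devices: int, grad_accum_steps: int = 1) -> int:
    """Global batch size consumed by one optimizer step."""
    return per_device * num_devices * grad_accum_steps


# Values from the command (--per_device_train_batch_size 64) and the hardware table (8 GPUs).
# grad_accum_steps=1 is an assumption; the command does not set it.
global_batch = effective_batch_size(per_device=64, num_devices=8)
print(global_batch)  # 512
```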
W&B CLI Version
0.22.2
Group
alpaca-ablations
Config
Config parameters are your model's inputs.
- {} 219 keys
- "models/ModernBERT-large/openwebtext/steps-200000-bs-1024-len-1024/checkpoint-18000"
- {} 6 keys
- false
- 0.9
- 0.999
- 0.00000001
- false
- [] 1 item
- "ModernBertForMaskedLM"
- false
- 0
- false
- true
- null
- false
- null
- true
- false
- null
- 0
- "gelu"
- false
- 0
- "mean"
- 50,281
- null
- null
- false
- 0
- false
- true
- null
- null
- null
- null
- null
- 1,800
- [] 0 items
- true
- null
- null
- false
- false
- 0
- true
- false
- false
- 50,368
- 0.1
- 0
- 0
Summary
Summary metrics are your model's outputs.
- {} 14 keys
- 1.3181915283203125
- 1.3886
- 3,745.409
- 7.921
- 198,651,305,726,574,592
- 1.331548237800598
- 512.3445
- 1,826.896
- 3.591
- 20
- 1,840
- 0.96875
- 0.00000000008997439994
- 1.2008
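Two of the summary values line up with the command: 20 matches --num_train_epochs 20, and 1,840 is plausibly the final global step (an assumption, since the metric names are not shown here). Under that reading the run took 92 optimizer steps per epoch. If sft.py passes --save_steps 0.1 through to the Hugging Face Trainer (also an assumption), a value below 1 is treated as a ratio of total training steps, which would mean a checkpoint every 184 steps. A rough sketch of that arithmetic, with illustrative helper names:

```python
import math


def steps_per_epoch(total_steps: int, num_epochs: int) -> float:
    """Average number of optimizer steps in one epoch."""
    return total_steps / num_epochs


def save_interval(total_steps: int, save_steps: float) -> int:
    """Follow the Trainer convention: values below 1 are a ratio of total steps."""
    if save_steps < 1:
        return math.ceil(total_steps * save_steps)
    return int(save_steps)


total_steps = 1840  # assumed to be the final global step from the summary
print(steps_per_epoch(total_steps, 20))  # 92.0
print(save_interval(total_steps, 0.1))   # 184
```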
Artifact Outputs
This run produced these artifacts as outputs. Total: 1.