
Asap-zzhou's group workspace

alpaca-ablations

State
Finished
Start time
October 30th, 2025 12:58:07 AM
Runtime
8m 34s
Tracked hours
8m 31s
Run path
asap-zzhou/dllm/5kq65gdg
OS
Linux-3.10.0-957.el7.x86_64-x86_64-with-glibc2.17
Python version
CPython 3.10.18
Git repository
git clone git@github.com:ZHZisZZ/dllm.git
Git state
git checkout -b "ModernBERT-large/alpaca/openwebtext/epochs-20-bs-512-len-512" df67f70cb557ab2459fc53a3049a414578c1fe2d
Command
/mnt/lustrenew/mllm_aligned/dongzhichen/dllm/examples/bert/sft.py --model_name_or_path models/ModernBERT-large/openwebtext/steps-200000-bs-1024-len-1024/checkpoint-18000 --dataset_args tatsu-lab/alpaca --max_length 512 --num_train_epochs 20 --per_device_train_batch_size 64 --per_device_eval_batch_size 64 --save_steps 0.1 --output_dir models/ModernBERT-large/alpaca/openwebtext/epochs-20-bs-512-len-512
System Hardware
CPU count 64
Logical CPU count 128
GPU count 8
GPU type NVIDIA A100-SXM4-80GB
W&B CLI Version
0.22.2
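
The `bs-512` in the output directory name appears to correspond to the global batch size implied by the command and hardware listed above: 64 samples per device across 8 GPUs. The sketch below reproduces that arithmetic. It assumes all 8 GPUs took part in data-parallel training and that gradient accumulation was left at 1 (the command does not set it); `effective_batch_size` is a hypothetical helper, not part of the dllm repository.

```python
def effective_batch_size(per_device: int, num_devices: int, grad_accum_steps: int = 1) -> int:
    """Global batch size for data-parallel training.

    grad_accum_steps defaults to 1, which is an assumption here:
    the logged command does not pass a gradient accumulation flag.
    """
    return per_device * num_devices * grad_accum_steps


# Values taken from the run metadata: --per_device_train_batch_size 64, GPU count 8.
global_batch = effective_batch_size(per_device=64, num_devices=8)
print(global_batch)
```

If the run had used fewer devices or a nonzero accumulation setting, the product would differ from the 512 in the directory name, so the name is only a consistency check, not proof of the configuration.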
Config

Config parameters are your model's inputs.

  • {} 219 keys
    • "models/ModernBERT-large/openwebtext/steps-200000-bs-1024-len-1024/checkpoint-18000"
    • {} 6 keys
      • false
      • 0.9
      • 0.999
      • 0.00000001
      • false
      • [] 1 item
        • "ModernBertForMaskedLM"
      • false
      • 0
      • false
      • true
      • null
      • false
      • null
      • true
      • false
      • null
      • 0
      • "gelu"
      • false
      • 0
      • "mean"
      • 50,281
      • null
      • null
      • false
      • 0
      • false
      • true
      • null
      • null
      • null
      • null
      • null
      • 1,800
      • [] 0 items
        • true
        • null
        • null
        • false
        • false
        • 0
        • true
        • false
        • false
        • 50,368
        • 0.1
        • 0
        • 0
Summary

Summary metrics are your model's outputs.

  • {} 14 keys
    • 1.3181915283203125
    • 1.3886
    • 3,745.409
    • 7.921
    • 198,651,305,726,574,592
    • 1.331548237800598
    • 512.3445
    • 1,826.896
    • 3.591
    • 20
    • 1,840
    • 0.96875
    • 0.00000000008997439994
    • 1.2008
Artifact Outputs

This run produced these artifacts as outputs. Total: 1.