Aflah's workspace

[Line chart: x-axis Step (ticks 2k to 12k); y-axis ticks 4 to 10; one curve per run listed below]
seq_length: 2048, dataset: FW_Edu, model: pythia, pos_emb: none, rotary_pct: 1, seed: 1234, lr: 0.0004, use_qk_layernorm: false, pipe_parallel_size: 0, log_grad_norm: false
seq_length: 2048, dataset: FW_Edu, model: pythia, pos_emb: none, rotary_pct: 1, seed: 1234, lr: 0.0004, use_qk_layernorm: false, pipe_parallel_size: 0, log_grad_norm: true