Kastan's group workspace

Aug-05__11:14

Tags

q_allFP32_gpt_8B_PP4_TP4_25d

Notes
Tags
Aug-05__11:14
BATCH_SIZE=32
NUM_EPOCHS=3
NUM_MICRO_BATCHES=4
SLURM=513422
TP=4
WORLD_SIZE=32
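
Taken together, these values pin down the parallel layout: WORLD_SIZE=32 ranks split across TP=4 and, per the tag q_allFP32_gpt_8B_PP4_TP4_25d, PP=4 pipeline stages, leaving a data-parallel degree of 2. A minimal sketch of that arithmetic (PP=4 is read from the tag; everything else comes from the values above):

    # Back-of-the-envelope check of the parallel layout for this run.
    # PP is assumed from the tag "q_allFP32_gpt_8B_PP4_TP4_25d"; the rest
    # comes from the values recorded above.
    WORLD_SIZE = 32          # total ranks in the SLURM job
    TP = 4                   # tensor-parallel degree (2.5D mode)
    PP = 4                   # pipeline-parallel degree (assumed from the tag)
    BATCH_SIZE = 32
    NUM_MICRO_BATCHES = 4

    DP = WORLD_SIZE // (TP * PP)                   # data-parallel degree -> 2
    MICRO_BATCH = BATCH_SIZE // NUM_MICRO_BATCHES  # samples per micro-batch -> 8,
                                                   # if BATCH_SIZE is the per-pipeline batch

    print(f"DP={DP}, micro-batch size={MICRO_BATCH}")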
Author
State
Crashed
Start time
August 5th, 2022 4:16:03 PM
Runtime
0s
Tracked hours
-
Run path
kastan/LLM-Distributed-Quantization/16x2ja9m
W&B CLI Version
0.13.0
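
The run path above is enough to pull this record programmatically. A minimal sketch using the standard wandb public API (nothing here beyond what the overview already shows):

    import wandb

    # Fetch the run by its "entity/project/run_id" path via the public API.
    api = wandb.Api()
    run = api.run("kastan/LLM-Distributed-Quantization/16x2ja9m")

    print(run.state)    # crashed
    print(run.config)   # the 22-key config logged for this run
    print(run.summary)  # empty: no summary metrics were saved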
Config

Config parameters are your model's inputs.

  • {} 22 keys
    • 32
    • 1
    • {} 1 key
      • "AMP_TYPE.NAIVE"
    • 4
    • 0.00015
    • "./quant_gpt2_2.5d_tp4_bs32_lr0.00015/"
    • {} 1 key
      • "titans.loss.lm_loss.gpt_lmloss.GPTLMLoss"
    • {} 7 keys
      • {} 4 keys
        • "torch.float32"
        • "torch.float32"
        • "torch.bfloat16"
        • "torch.float32"
      • 3
      • 4
      • {} 2 keys
        • 0.00015
        • 0.01
      • {} 2 keys
        • 4
        • {} 3 keys
          • 1
          • "2.5d"
          • 4
      • "titans.model.quant_gpt.quant_gpt.quant_gpt2_8B"
      • "titans.model.quant_gpt.quant_gpt.quant_gpt2_xl"
      • 1,024
      • "2.5d"
      • 4
      • 128
      • 50,304
      • 1
      • 0.01
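
The config viewer exported only the values, not the key names, but they line up with a ColossalAI/Titans-style GPT config: 2.5D tensor parallelism with tp=4 and depth 1, AMP_TYPE.NAIVE mixed precision, lr 0.00015, weight decay 0.01, sequence length 1,024, vocab size 50,304. Purely as an illustration, a config file with these values might look like the sketch below; every key name is an assumption, only the values come from the record.

    # Hypothetical reconstruction -- key names are guesses in ColossalAI/Titans
    # config style; only the values are taken from the run config above.
    from colossalai.amp import AMP_TYPE
    from titans.loss.lm_loss.gpt_lmloss import GPTLMLoss

    BATCH_SIZE = 32
    NUM_EPOCHS = 3
    NUM_MICRO_BATCHES = 4
    SEQ_LEN = 1024
    VOCAB_SIZE = 50304
    LEARNING_RATE = 0.00015
    WEIGHT_DECAY = 0.01
    TENSOR_PARALLEL_SIZE = 4
    TENSOR_PARALLEL_MODE = '2.5d'

    # Naive mixed precision, as recorded ("AMP_TYPE.NAIVE").
    fp16 = dict(mode=AMP_TYPE.NAIVE)

    # Pipeline degree 4 plus 2.5D tensor parallelism (size 4, depth 1).
    parallel = dict(
        pipeline=4,
        tensor=dict(size=TENSOR_PARALLEL_SIZE, mode=TENSOR_PARALLEL_MODE, depth=1),
    )

    optimizer = dict(lr=LEARNING_RATE, weight_decay=WEIGHT_DECAY)
    loss = dict(type=GPTLMLoss)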
Summary

Summary metrics are your model's outputs.

No summary metrics saved for this run.
