Skip to main content

Chilli's group workspace

example

What makes this group special?
Tags

floral-firefly-11599

Notes
State
Failed
Start time
April 4th, 2025 6:18:40 AM
Runtime
38s
Tracked hours
37s
Run path
eleutherai/neox/5rat5lum
OS
Linux-6.5.13-65-650-4141-22041-coreweave-amd64-85c45edc-x86_64-with-glibc2.35
Python version
CPython 3.10.14
Git repository
git clone git@github.com:EleutherAI/gpt-neox.git
Git state
git checkout -b "floral-firefly-11599" f84b54e20361772cf97da87ba48960b741954065
Command
/mnt/ssd-1/nora/gpt-neox/train.py --local_rank=0 --deepspeed_config eyJ0cmFpbl9iYXRjaF9zaXplIjogMzIsICJ0cmFpbl9taWNyb19iYXRjaF9zaXplX3Blcl9ncHUiOiA0LCAib3B0aW1pemVyIjogeyJ0eXBlIjogIkFkYW0iLCAicGFyYW1zIjogeyJsciI6IDAuMDAwNiwgImJldGFzIjogWzAuOSwgMC45NV0sICJlcHMiOiAxZS0wOH19LCAiZnAxNiI6IHsiZW5hYmxlZCI6IHRydWUsICJsb3NzX3NjYWxlIjogMCwgImxvc3Nfc2NhbGVfd2luZG93IjogMTAwMCwgImh5c3RlcmVzaXMiOiAyLCAibWluX2xvc3Nfc2NhbGUiOiAxfSwgInplcm9fb3B0aW1pemF0aW9uIjogeyJzdGFnZSI6IDEsICJhbGxnYXRoZXJfcGFydGl0aW9ucyI6IHRydWUsICJhbGxnYXRoZXJfYnVja2V0X3NpemUiOiA1MDAwMDAwMDAsICJvdmVybGFwX2NvbW0iOiB0cnVlLCAicmVkdWNlX3NjYXR0ZXIiOiB0cnVlLCAicmVkdWNlX2J1Y2tldF9zaXplIjogNTAwMDAwMDAwLCAiY29udGlndW91c19ncmFkaWVudHMiOiB0cnVlfSwgIndhbGxfY2xvY2tfYnJlYWtkb3duIjogdHJ1ZX0= --megatron_config {"hostfile": "/mock_path", "train_batch_size": 32, "train_micro_batch_size_per_gpu": 4, "optimizer": {"type": "Adam", "params": {"lr": 0.0006, "betas": [0.9, 0.95], "eps": 1e-08}}, "fp16": {"enabled": true, "loss_scale": 0, "loss_scale_window": 1000, "hysteresis": 2, "min_loss_scale": 1}, "zero_optimization": {"stage": 1, "allgather_partitions": true, "allgather_bucket_size": 500000000, "overlap_comm": true, "reduce_scatter": true, "reduce_bucket_size": 500000000, "contiguous_gradients": true}, "wall_clock_breakdown": true, "precision": "fp16", "num_layers": 12, "hidden_size": 768, "num_attention_heads": 12, "seq_length": 2048, "max_position_embeddings": 2048, "pos_emb": "rotary", "no_weight_tying": true, "attention_config": ["global", "global", "global", "global", "global", "global", "global", "global", "global", "global", "global", "global"], "sparsity_config": {}, "init_method": "small_init", "output_layer_init_method": "wang_init", "lr_decay_style": "cosine", "lr_decay_iters": 320000, "min_lr": 6e-05, "optimizer_type": "Adam", "zero_stage": 1, "zero_reduce_scatter": true, "zero_contiguous_gradients": true, "zero_reduce_bucket_size": 500000000, "zero_allgather_bucket_size": 500000000, "lr": 0.0006, "data_path": 
"/mnt/ssd-1/data/enwik8/enwik8_text_document", "data_impl": "mmap", "save": "/mnt/ssd-1/checkpoints", "config_files": {"125M-moe.yml": "# GPT-2 pretraining setup\n{\n   # See README for MoE config docs!\n   \"moe_type\": \"deepspeed\",\n   \"moe_token_dropping\": true,\n   # Have 4 experts per layer (every 2 layers by default)\n   \"moe_num_experts\": 4,\n   # parallelism settings\n   \"enable_expert_tensor_parallelism\": true,\n   \"pipe_parallel_size\": 1, # not yet supported for MoE\n   \"model_parallel_size\": 1,\n   \"moe_expert_parallel_size\": 1,\n\n   # model settings\n   \"num_layers\": 12,\n   \"hidden_size\": 768,\n   \"num_attention_heads\": 12,\n   \"seq_length\": 2048,\n   \"max_position_embeddings\": 2048,\n   \"norm\": \"layernorm\",\n   \"pos_emb\": \"rotary\",\n   \"no_weight_tying\": true,\n   \"gpt_j_residual\": false,\n   \"output_layer_parallelism\": \"column\",\n\n   # these should provide some speedup but takes a while to build, set to true if desired\n   \"scaled_upper_triang_masked_softmax_fusion\": false,\n   \"bias_gelu_fusion\": false,\n   \"rope_fusion\": false,\n\n   # init methods\n   \"init_method\": \"small_init\",\n   \"output_layer_init_method\": \"wang_init\",\n\n\n   # optimizer settings\n   \"optimizer\": {\n     \"type\": \"Adam\",\n     \"params\": {\n       \"lr\": 0.0006,\n       \"betas\": [0.9, 0.95],\n       \"eps\": 1.0e-8,\n     }\n   },\n   \"min_lr\": 0.00006,\n\n   # for all zero_optimization options, see https://www.deepspeed.ai/docs/config-json/#zero-optimizations-for-fp16-training\n   \"zero_optimization\": {\n    \"stage\": 1,\n    \"allgather_partitions\": True,\n    \"allgather_bucket_size\": 500000000,\n    \"overlap_comm\": True,\n    \"reduce_scatter\": True,\n    \"reduce_bucket_size\": 500000000,\n    \"contiguous_gradients\": True,\n  },\n\n   # batch / data settings\n   \"train_micro_batch_size_per_gpu\": 4,\n   \"data_impl\": \"mmap\",\n\n   # activation checkpointing\n   \"checkpoint_activations\": 
true,\n   \"checkpoint_num_layers\": 1,\n   \"partition_activations\": true,\n   \"synchronize_each_layer\": true,\n\n   # regularization\n   \"gradient_clipping\": 1.0,\n   \"weight_decay\": 0.1,\n   \"hidden_dropout\": 0.0,\n   \"attention_dropout\": 0.0,\n\n   # precision settings\n   \"fp16\": {\n     \"enabled\": true,\n     \"loss_scale\": 0,\n     \"loss_scale_window\": 1000,\n     \"hysteresis\": 2,\n     \"min_loss_scale\": 1\n   },\n\n   # misc. training settings\n   \"train_iters\": 320000,\n   \"lr_decay_iters\": 320000,\n   \"distributed_backend\": \"nccl\",\n   \"lr_decay_style\": \"cosine\",\n   \"warmup\": 0.01,\n   \"checkpoint_factor\": 10000,\n   \"eval_interval\": 1000,\n   \"eval_iters\": 10,\n\n   # logging\n   \"log_interval\": 10,\n   \"steps_per_print\": 10,\n   \"keep_last_n_checkpoints\": 4,\n   \"wall_clock_breakdown\": true,\n\n  #  networking\n  \"hostfile\": \"/mock_path\"\n}\n", "eleutherai_cluster.yml": "# Data paths and options when using EleutherAI cluster\n{\n  # you may include multiple distinct datasets if desired\n  #\"train_data_paths\": [\"/mnt/ssd-1/data/enwik8/enwik8_text_document\"],\n  #\"valid_data_paths\": [\"/mnt/ssd-1/data/enwik8/enwik8_val_text_document\"],\n  #\"test_data_paths\": [\"/mnt/ssd-1/data/enwik8/enwik8_test_text_document\"],\n\n  # if using multiple datasets, provide weights for them to be sampled with\n  # \"train-data-weights\": [1., 2.],\n  # \"test-data-weights\": [2., 1.],\n  # \"valid-data-weights\": [0.5, 0.4],\n\n\n  # If you would like the code to create val and test datasets from your training set use the following instead\n  # \"split\" determines the relative size of train, val, and test\n\n  \"split\": \"995,4,1\",\n  \"data_path\": \"/mnt/ssd-1/data/enwik8/enwik8_text_document\",\n\n  \"vocab_file\": \"/mnt/ssd-1/data/gpt2-vocab.json\",\n  \"merge_file\": \"/mnt/ssd-1/data/gpt2-merges.txt\",\n  \"save\": \"/mnt/ssd-1/checkpoints\",\n  \"load\": \"/mnt/ssd-1/checkpoints\",\n  
\"tensorboard_dir\": \"/mnt/ssd-1/tensorboard\",\n  \"log_dir\": \"/mnt/ssd-1/logs\",\n  \"wandb_team\": \"eleutherai\",\n  #\"wandb_run_name\": \"experiment\"\n  \"wandb_project\": \"neox\",\n  \"wandb_group\": \"example\"\n}\n"}, "load": "/mnt/ssd-1/checkpoints", "checkpoint_factor": 10000, "batch_size": 4, "train_iters": 320000, "eval_iters": 10, "keep_last_n_checkpoints": 4, "split": "995,4,1", "vocab_file": "/mnt/ssd-1/data/gpt2-vocab.json", "merge_file": "/mnt/ssd-1/data/gpt2-merges.txt", "checkpoint_activations": true, "synchronize_each_layer": true, "partition_activations": true, "dynamic_loss_scale": true, "pipe_parallel_size": 1, "world_size": 1, "wandb_group": "example", "wandb_team": "eleutherai", "log_dir": "/mnt/ssd-1/logs", "tensorboard_dir": "/mnt/ssd-1/tensorboard", "log_interval": 10, "text_gen_type": "unconditional", "moe_num_experts": 4, "moe_token_dropping": true, "moe_type": "deepspeed", "enable_expert_tensor_parallelism": true, "local_rank": 0, "rank": 0, "user_script": "train.py", "global_num_gpus": 8}
System Hardware
CPU count 48
Logical CPU count 96
GPU count 8
GPU type NVIDIA A40
W&B CLI Version
0.19.8
Group
example
Config

Config parameters are your model's inputs. Learn more

  • {} 327 keys
    • null
    • "gelu"
    • null
    • false
    • 1,000
    • true
    • null
    • false
    • [] 12 items
      • 0
      • false
      • null
      • null
      • null
      • 4
      • null
      • false
      • false
      • false
      • null
      • true
      • 10,000
      • false
      • 1
      • "linear"
      • false
      • 1
      • null
      • null
      • null
      • null
      • null
      • null
      • null
      • null
      • null
      • null
      • {} 2 keys
        • "# GPT-2 pretraining setup { # See README for MoE config docs! "moe_type": "deepspeed", "moe_token_dropping": true, # Have 4 experts per layer (every 2 layers by default) "moe_num_experts": 4, # parallelism settings "enable_expert_tensor_parallelism": true, "pipe_parallel_size": 1, # not yet supported for MoE "model_parallel_size": 1, "moe_expert_parallel_size": 1, # model settings "num_layers": 12, "hidden_size": 768, "num_attention_heads": 12, "seq_length": 2048, "max_position_embeddings": 2048, "norm": "layernorm", "pos_emb": "rotary", "no_weight_tying": true, "gpt_j_residual": false, "output_layer_parallelism": "column", # these should provide some speedup but takes a while to build, set to true if desired "scaled_upper_triang_masked_softmax_fusion": false, "bias_gelu_fusion": false, "rope_fusion": false, # init methods "init_method": "small_init", "output_layer_init_method": "wang_init", # optimizer settings "optimizer": { "type": "Adam", "params": { "lr": 0.0006, "betas": [0.9, 0.95], "eps": 1.0e-8, } }, "min_lr": 0.00006, # for all zero_optimization options, see https://www.deepspeed.ai/docs/config-json/#zero-optimizations-for-fp16-training "zero_optimization": { "stage": 1, "allgather_partitions": True, "allgather_bucket_size": 500000000, "overlap_comm": True, "reduce_scatter": True, "reduce_bucket_size": 500000000, "contiguous_gradients": True, }, # batch / data settings "train_micro_batch_size_per_gpu": 4, "data_impl": "mmap", # activation checkpointing "checkpoint_activations": true, "checkpoint_num_layers": 1, "partition_activations": true, "synchronize_each_layer": true, # regularization "gradient_clipping": 1.0, "weight_decay": 0.1, "hidden_dropout": 0.0, "attention_dropout": 0.0, # precision settings "fp16": { "enabled": true, "loss_scale": 0, "loss_scale_window": 1000, "hysteresis": 2, "min_loss_scale": 1 }, # misc. 
training settings "train_iters": 320000, "lr_decay_iters": 320000, "distributed_backend": "nccl", "lr_decay_style": "cosine", "warmup": 0.01, "checkpoint_factor": 10000, "eval_interval": 1000, "eval_iters": 10, # logging "log_interval": 10, "steps_per_print": 10, "keep_last_n_checkpoints": 4, "wall_clock_breakdown": true, # networking "hostfile": "/mock_path" } "
        • "# Data paths and options when using EleutherAI cluster { # you may include multiple distinct datasets if desired #"train_data_paths": ["/mnt/ssd-1/data/enwik8/enwik8_text_document"], #"valid_data_paths": ["/mnt/ssd-1/data/enwik8/enwik8_val_text_document"], #"test_data_paths": ["/mnt/ssd-1/data/enwik8/enwik8_test_text_document"], # if using multiple datasets, provide weights for them to be sampled with # "train-data-weights": [1., 2.], # "test-data-weights": [2., 1.], # "valid-data-weights": [0.5, 0.4], # If you would like the code to create val and test datasets from your training set use the following instead # "split" determines the relative size of train, val, and test "split": "995,4,1", "data_path": "/mnt/ssd-1/data/enwik8/enwik8_text_document", "vocab_file": "/mnt/ssd-1/data/gpt2-vocab.json", "merge_file": "/mnt/ssd-1/data/gpt2-merges.txt", "save": "/mnt/ssd-1/checkpoints", "load": "/mnt/ssd-1/checkpoints", "tensorboard_dir": "/mnt/ssd-1/tensorboard", "log_dir": "/mnt/ssd-1/logs", "wandb_team": "eleutherai", #"wandb_run_name": "experiment" "wandb_project": "neox", "wandb_group": "example" } "
      • false
      • false
      • true
      • null
      • null
      • 0
      • null
      • "mmap"
      • 46 ... 95
        96 ... 145
        146 ... 195
        196 ... 245
        246 ... 295
        296 ... 322
      • {} 7 keys
        • 500,000,000
        • true
        • 1
      Summary

      Summary metrics are your model's outputs. Learn more

      • {} 0 keys