Chilli's group workspace
jmzpl6tw_k7d4ruyz
Tags
stella-ord-0-0
State
Crashed
Start time
June 24th, 2024 4:29:19 AM
Runtime
2m 31s
Tracked hours
-
Run path
eleutherai/neox/sb6gnw18
OS
Linux-5.19.17-coreweave-x86_64-with-glibc2.17
Python version
3.8.19
Git repository
git clone https://github.com/EleutherAI/gpt-neox.git
Git state
git checkout -b "stella-ord-0-0" 4c426da8b6149e2313bc6e00584531f004cfe457
Command
train.py --local_rank=0 --deepspeed_config eyJ0cmFpbl9iYXRjaF9zaXplIjogMTI4LCAidHJhaW5fbWljcm9fYmF0Y2hfc2l6ZV9wZXJfZ3B1IjogNCwgImdyYWRpZW50X2FjY3VtdWxhdGlvbl9zdGVwcyI6IDQsICJvcHRpbWl6ZXIiOiB7InR5cGUiOiAiQWRhbSIsICJwYXJhbXMiOiB7ImxyIjogMC4wMDAyNSwgImJldGFzIjogWzAuOSwgMC45NV0sICJlcHMiOiAxZS0wOH19LCAiZnAzMl9hbGxyZWR1Y2UiOiB0cnVlLCAiZnAxNiI6IHsiZW5hYmxlZCI6IHRydWUsICJ0eXBlIjogImJmbG9hdDE2IiwgImF1dG9fY2FzdCI6IHRydWUsICJsb3NzX3NjYWxlIjogMCwgImxvc3Nfc2NhbGVfd2luZG93IjogMTAwMCwgImluaXRpYWxfc2NhbGVfcG93ZXIiOiAxMiwgImh5c3RlcmVzaXMiOiAyLCAibWluX2xvc3Nfc2NhbGUiOiAxfSwgInplcm9fb3B0aW1pemF0aW9uIjogeyJzdGFnZSI6IDAsICJhbGxnYXRoZXJfcGFydGl0aW9ucyI6IHRydWUsICJhbGxnYXRoZXJfYnVja2V0X3NpemUiOiA1MDAwMDAwMDAsICJvdmVybGFwX2NvbW0iOiB0cnVlLCAicmVkdWNlX3NjYXR0ZXIiOiB0cnVlLCAicmVkdWNlX2J1Y2tldF9zaXplIjogNTAwMDAwMDAwLCAiY29udGlndW91c19ncmFkaWVudHMiOiB0cnVlLCAiY3B1X29mZmxvYWQiOiBmYWxzZX0sICJ3YWxsX2Nsb2NrX2JyZWFrZG93biI6IHRydWV9 --megatron_config eyJ0cmFpbl9iYXRjaF9zaXplIjogMTI4LCAidHJhaW5fbWljcm9fYmF0Y2hfc2l6ZV9wZXJfZ3B1IjogNCwgImdyYWRpZW50X2FjY3VtdWxhdGlvbl9zdGVwcyI6IDQsICJvcHRpbWl6ZXIiOiB7InR5cGUiOiAiQWRhbSIsICJwYXJhbXMiOiB7ImxyIjogMC4wMDAyNSwgImJldGFzIjogWzAuOSwgMC45NV0sICJlcHMiOiAxZS0wOH19LCAiZnAzMl9hbGxyZWR1Y2UiOiB0cnVlLCAiZnAxNiI6IHsiZW5hYmxlZCI6IHRydWUsICJ0eXBlIjogImJmbG9hdDE2IiwgImF1dG9fY2FzdCI6IHRydWUsICJsb3NzX3NjYWxlIjogMCwgImxvc3Nfc2NhbGVfd2luZG93IjogMTAwMCwgImluaXRpYWxfc2NhbGVfcG93ZXIiOiAxMiwgImh5c3RlcmVzaXMiOiAyLCAibWluX2xvc3Nfc2NhbGUiOiAxfSwgInplcm9fb3B0aW1pemF0aW9uIjogeyJzdGFnZSI6IDAsICJhbGxnYXRoZXJfcGFydGl0aW9ucyI6IHRydWUsICJhbGxnYXRoZXJfYnVja2V0X3NpemUiOiA1MDAwMDAwMDAsICJvdmVybGFwX2NvbW0iOiB0cnVlLCAicmVkdWNlX3NjYXR0ZXIiOiB0cnVlLCAicmVkdWNlX2J1Y2tldF9zaXplIjogNTAwMDAwMDAwLCAiY29udGlndW91c19ncmFkaWVudHMiOiB0cnVlLCAiY3B1X29mZmxvYWQiOiBmYWxzZX0sICJ3YWxsX2Nsb2NrX2JyZWFrZG93biI6IHRydWUsICJwcmVjaXNpb24iOiAiZnAxNiIsICJudW1fbGF5ZXJzIjogMTYsICJoaWRkZW5fc2l6ZSI6IDIwNDgsICJudW1fYXR0ZW50aW9uX2hlYWRzIjogOCwgInNlcV9sZW5ndGgiOiAyMDQ4LCAibWF4X3Bvc2l0aW9uX2VtYmVkZGluZ3MiOiAyMDQ4LCAicG9zX2VtYiI6IC
Jyb3RhcnkiLCAibm9fd2VpZ2h0X3R5aW5nIjogdHJ1ZSwgImF0dGVudGlvbl9jb25maWciOiBbImdsb2JhbCIsICJnbG9iYWwiLCAiZ2xvYmFsIiwgImdsb2JhbCIsICJnbG9iYWwiLCAiZ2xvYmFsIiwgImdsb2JhbCIsICJnbG9iYWwiLCAiZ2xvYmFsIiwgImdsb2JhbCIsICJnbG9iYWwiLCAiZ2xvYmFsIiwgImdsb2JhbCIsICJnbG9iYWwiLCAiZ2xvYmFsIiwgImdsb2JhbCJdLCAic3BhcnNpdHlfY29uZmlnIjoge30sICJzY2FsZWRfdXBwZXJfdHJpYW5nX21hc2tlZF9zb2Z0bWF4X2Z1c2lvbiI6IHRydWUsICJiaWFzX2dlbHVfZnVzaW9uIjogdHJ1ZSwgInJvdGFyeV9wY3QiOiAwLjI1LCAiaW5pdF9tZXRob2QiOiAic21hbGxfaW5pdCIsICJvdXRwdXRfbGF5ZXJfaW5pdF9tZXRob2QiOiAid2FuZ19pbml0IiwgImdwdF9qX3Jlc2lkdWFsIjogdHJ1ZSwgImxyX2RlY2F5X3N0eWxlIjogImNvc2luZSIsICJscl9kZWNheV9pdGVycyI6IDE0MzAwMCwgIm1pbl9sciI6IDIuNWUtMDUsICJvcHRpbWl6ZXJfdHlwZSI6ICJBZGFtIiwgInplcm9fc3RhZ2UiOiAwLCAiemVyb19yZWR1Y2Vfc2NhdHRlciI6IHRydWUsICJ6ZXJvX2NvbnRpZ3VvdXNfZ3JhZGllbnRzIjogdHJ1ZSwgInplcm9fcmVkdWNlX2J1Y2tldF9zaXplIjogNTAwMDAwMDAwLCAiemVyb19hbGxnYXRoZXJfYnVja2V0X3NpemUiOiA1MDAwMDAwMDAsICJsciI6IDAuMDAwMjUsICJ0b2tlbml6ZXJfdHlwZSI6ICJIRlRva2VuaXplciIsICJkYXRhc2V0X3R5cGUiOiAicGF1c2UiLCAiZGF0YXNldF9jZmciOiB7InBhdXNlX2lkIjogNTAyNzd9LCAiZGF0YV9wYXRoIjogImRhdGEvZW53aWs4L2Vud2lrOF90ZXh0X2RvY3VtZW50IiwgImRhdGFfaW1wbCI6ICJtbWFwIiwgInNhdmUiOiAiY2hlY2twb2ludHMiLCAiY29uZmlnX2ZpbGVzIjogeyIxQi1wYXVzZS55bWwiOiAie1xuICBcInBpcGVfcGFyYWxsZWxfc2l6ZVwiOiAxLFxuICBcIm1vZGVsX3BhcmFsbGVsX3NpemVcIjogMSxcblxuICBcIm51bV9sYXllcnNcIjogMTYsXG4gIFwiaGlkZGVuX3NpemVcIjogMjA0OCxcbiAgXCJudW1fYXR0ZW50aW9uX2hlYWRzXCI6IDgsXG4gIFwic2VxX2xlbmd0aFwiOiAyMDQ4LFxuICBcIm1heF9wb3NpdGlvbl9lbWJlZGRpbmdzXCI6IDIwNDgsXG4gIFwicG9zX2VtYlwiOiBcInJvdGFyeVwiLFxuICBcInJvdGFyeV9wY3RcIjogMC4yNSxcbiAgXCJub193ZWlnaHRfdHlpbmdcIjogdHJ1ZSxcbiAgXCJncHRfal9yZXNpZHVhbFwiOiB0cnVlLFxuICBcIm91dHB1dF9sYXllcl9wYXJhbGxlbGlzbVwiOiBcImNvbHVtblwiLFxuXG4gIFwic2NhbGVkX3VwcGVyX3RyaWFuZ19tYXNrZWRfc29mdG1heF9mdXNpb25cIjogdHJ1ZSxcbiAgXCJiaWFzX2dlbHVfZnVzaW9uXCI6IHRydWUsXG5cbiAgXCJpbml0X21ldGhvZFwiOiBcInNtYWxsX2luaXRcIixcbiAgXCJvdXRwdXRfbGF5ZXJfaW5pdF9tZXRob2RcIjogXCJ3YW5nX2luaXRcIixcblxuICBcIm9wdGltaXplclwiOiB7XG4gICAgXCJ0eXBlXC
I6IFwiQWRhbVwiLFxuICAgIFwicGFyYW1zXCI6IHtcbiAgICAgIFwibHJcIjogMC4wMDAyNSxcbiAgICAgIFwiYmV0YXNcIjogWzAuOSwgMC45NV0sXG4gICAgICBcImVwc1wiOiAxLjBlLThcbiAgICB9XG4gIH0sXG4gIFwibWluX2xyXCI6IDAuMDAwMDI1LFxuXG4gIFwiemVyb19vcHRpbWl6YXRpb25cIjoge1xuICAgIFwic3RhZ2VcIjogMCxcbiAgICBcImFsbGdhdGhlcl9wYXJ0aXRpb25zXCI6IHRydWUsXG4gICAgXCJhbGxnYXRoZXJfYnVja2V0X3NpemVcIjogNTAwMDAwMDAwLFxuICAgIFwib3ZlcmxhcF9jb21tXCI6IHRydWUsXG4gICAgXCJyZWR1Y2Vfc2NhdHRlclwiOiB0cnVlLFxuICAgIFwicmVkdWNlX2J1Y2tldF9zaXplXCI6IDUwMDAwMDAwMCxcbiAgICBcImNvbnRpZ3VvdXNfZ3JhZGllbnRzXCI6IHRydWUsXG4gICAgXCJjcHVfb2ZmbG9hZFwiOiBmYWxzZVxuICB9LFxuXG4gIFwiZnAxNlwiOiB7XG4gICAgXCJlbmFibGVkXCI6IHRydWUsXG4gICAgXCJ0eXBlXCI6IFwiYmZsb2F0MTZcIixcbiAgICBcImF1dG9fY2FzdFwiOiB0cnVlLFxuICAgIFwibG9zc19zY2FsZVwiOiAwLFxuICAgIFwibG9zc19zY2FsZV93aW5kb3dcIjogMTAwMCxcbiAgICBcImluaXRpYWxfc2NhbGVfcG93ZXJcIjogMTIsXG4gICAgXCJoeXN0ZXJlc2lzXCI6IDIsXG4gICAgXCJtaW5fbG9zc19zY2FsZVwiOiAxXG4gIH0sXG5cbiAgXCJmcDMyX2FsbHJlZHVjZVwiOiB0cnVlLFxuXG4gIFwidHJhaW5fbWljcm9fYmF0Y2hfc2l6ZV9wZXJfZ3B1XCI6IDQsXG4gIFwiZ3JhZGllbnRfYWNjdW11bGF0aW9uX3N0ZXBzXCI6IDQsXG4gIFwiZGF0YV9pbXBsXCI6IFwibW1hcFwiLFxuICBcIm51bV93b3JrZXJzXCI6IDEsXG5cbiAgXCJjaGVja3BvaW50X2FjdGl2YXRpb25zXCI6IHRydWUsXG4gIFwiY2hlY2twb2ludF9udW1fbGF5ZXJzXCI6IDEsXG4gIFwicGFydGl0aW9uX2FjdGl2YXRpb25zXCI6IHRydWUsXG4gIFwic3luY2hyb25pemVfZWFjaF9sYXllclwiOiB0cnVlLFxuXG4gIFwiZ3JhZGllbnRfY2xpcHBpbmdcIjogMS4wLFxuICBcIndlaWdodF9kZWNheVwiOiAwLjEsXG4gIFwiaGlkZGVuX2Ryb3BvdXRcIjogMCxcbiAgXCJhdHRlbnRpb25fZHJvcG91dFwiOiAwLFxuXG4gIFwidHJhaW5faXRlcnNcIjogMTQzMDAwLFxuICBcImxyX2RlY2F5X2l0ZXJzXCI6IDE0MzAwMCxcbiAgXCJkaXN0cmlidXRlZF9iYWNrZW5kXCI6IFwibmNjbFwiLFxuICBcImxyX2RlY2F5X3N0eWxlXCI6IFwiY29zaW5lXCIsXG4gIFwid2FybXVwXCI6IDAuMDEsXG4gIFwiY2hlY2twb2ludF9mYWN0b3JcIjogMTAwMCxcbiAgXCJleHRyYV9zYXZlX2l0ZXJzXCI6IFswLDEsMiw0LDgsMTYsMzIsNjQsMTI4LDI1Niw1MTJdLFxuICBcImV2YWxfaW50ZXJ2YWxcIjogMTQzMDAwLFxuICBcImV2YWxfaXRlcnNcIjogMTAsXG5cbiAgXCJsb2dfaW50ZXJ2YWxcIjogMTAsXG4gIFwic3RlcHNfcGVyX3ByaW50XCI6IDEwLFxuICBcIndhbGxfY2xvY2tfYnJlYWtkb3duXCI6IH
RydWUsXG5cbiAgXCJ0b2tlbml6ZXJfdHlwZVwiOiBcIkhGVG9rZW5pemVyXCIsXG4gIFwiZGF0YXNldF90eXBlXCI6IFwicGF1c2VcIixcbiAgXCJkYXRhc2V0X2NmZ1wiOiB7XG4gICAgXCJwYXVzZV9pZFwiOiA1MDI3NyxcbiAgfVxufVxuIiwgImxvY2FsX3NldHVwLnltbCI6ICIjIFN1Z2dlc3RlZCBkYXRhIHBhdGhzIHdoZW4gdXNpbmcgR1BULU5lb1ggbG9jYWxseVxue1xuICBcImRhdGFfcGF0aFwiOiBcImRhdGEvZW53aWs4L2Vud2lrOF90ZXh0X2RvY3VtZW50XCIsXG5cbiAgIyBvciBmb3Igd2VpZ2h0ZWQgZGF0YXNldHM6XG4gICMgXCJ0cmFpbi1kYXRhLXBhdGhzXCI6IFtcImRhdGEvZW53aWs4L2Vud2lrOF90ZXh0X2RvY3VtZW50XCIsIFwiZGF0YS9lbndpazgvZW53aWs4X3RleHRfZG9jdW1lbnRcIl0sXG4gICMgXCJ0ZXN0LWRhdGEtcGF0aHNcIjogW1wiZGF0YS9lbndpazgvZW53aWs4X3RleHRfZG9jdW1lbnRcIiwgXCJkYXRhL2Vud2lrOC9lbndpazhfdGV4dF9kb2N1bWVudFwiXSxcbiAgIyBcInZhbGlkLWRhdGEtcGF0aHNcIjogW1wiZGF0YS9lbndpazgvZW53aWs4X3RleHRfZG9jdW1lbnRcIiwgXCJkYXRhL2Vud2lrOC9lbndpazhfdGV4dF9kb2N1bWVudFwiXSxcbiAgIyBcInRyYWluLWRhdGEtd2VpZ2h0c1wiOiBbMS4sIDIuXSxcbiAgIyBcInRlc3QtZGF0YS13ZWlnaHRzXCI6IFsyLiwgMS5dLFxuICAjIFwidmFsaWQtZGF0YS13ZWlnaHRzXCI6IFswLjUsIDAuNF0sXG5cbiAgIyBJZiB3ZWlnaHRfYnlfbnVtX2RvY3VtZW50cyBpcyBUcnVlLCBCdWlsZHMgZGF0YXNldCB3ZWlnaHRzIGZyb20gYSBtdWx0aW5vbWlhbCBkaXN0cmlidXRpb24gb3ZlciBncm91cHMgb2YgZGF0YSBhY2NvcmRpbmcgdG8gdGhlIG51bWJlciBvZiBkb2N1bWVudHMgaW4gZWFjaCBncm91cC5cbiAgIyBXQVJOSU5HOiBzZXR0aW5nIHRoaXMgdG8gVHJ1ZSB3aWxsIG92ZXJyaWRlIGFueSB1c2VyIHByb3ZpZGVkIHdlaWdodHNcbiAgIyBcIndlaWdodF9ieV9udW1fZG9jdW1lbnRzXCI6IGZhbHNlLFxuICAjIFwid2VpZ2h0ZWRfc2FtcGxlcl9hbHBoYVwiOiAwLjMsXG5cbiAgXCJ2b2NhYl9maWxlXCI6IFwiLi4vcHl0aGlhX3R5cGUyLmpzb25cIixcblxuICBcInNhdmVcIjogXCJjaGVja3BvaW50c1wiLFxuICBcImxvYWRcIjogXCJjaGVja3BvaW50c1wiLFxuICBcImNoZWNrcG9pbnRfdmFsaWRhdGlvbl93aXRoX2ZvcndhcmRfcGFzc1wiOiBGYWxzZSxcblxuICBcInRlbnNvcmJvYXJkX2RpclwiOiBcInRlbnNvcmJvYXJkXCIsXG4gIFwibG9nX2RpclwiOiBcImxvZ3NcIixcbiAgXCJ1c2Vfd2FuZGJcIjogVHJ1ZSxcbiAgXCJ3YW5kYl9ob3N0XCI6IFwiaHR0cHM6Ly9hcGkud2FuZGIuYWlcIixcbiAgXCJ3YW5kYl9wcm9qZWN0XCI6IFwibmVveFwiXG59XG4ifSwgImxvYWQiOiAiY2hlY2twb2ludHMiLCAiY2hlY2twb2ludF9mYWN0b3IiOiAxMDAwLCAiZXh0cmFfc2F2ZV9pdGVycyI6IFswLCAxLCAyLCA0LCA4LCAxNiwgMzIsIDY0LCAxMj
gsIDI1NiwgNTEyXSwgImJhdGNoX3NpemUiOiA0LCAidHJhaW5faXRlcnMiOiAxNDMwMDAsICJldmFsX2l0ZXJzIjogMTAsICJldmFsX2ludGVydmFsIjogMTQzMDAwLCAidm9jYWJfZmlsZSI6ICIuLi9weXRoaWFfdHlwZTIuanNvbiIsICJudW1fd29ya2VycyI6IDEsICJjaGVja3BvaW50X2FjdGl2YXRpb25zIjogdHJ1ZSwgInN5bmNocm9uaXplX2VhY2hfbGF5ZXIiOiB0cnVlLCAicGFydGl0aW9uX2FjdGl2YXRpb25zIjogdHJ1ZSwgImR5bmFtaWNfbG9zc19zY2FsZSI6IHRydWUsICJwaXBlX3BhcmFsbGVsX3NpemUiOiAxLCAid29ybGRfc2l6ZSI6IDEsICJpc19waXBlX3BhcmFsbGVsIjogdHJ1ZSwgInVzZV93YW5kYiI6IHRydWUsICJ3YW5kYl9ncm91cCI6ICJqbXpwbDZ0d19rN2Q0cnV5eiIsICJsb2dfZGlyIjogImxvZ3MiLCAidGVuc29yYm9hcmRfZGlyIjogInRlbnNvcmJvYXJkIiwgImxvZ19pbnRlcnZhbCI6IDEwLCAidGV4dF9nZW5fdHlwZSI6ICJ1bmNvbmRpdGlvbmFsIiwgImxvY2FsX3JhbmsiOiAwLCAicmFuayI6IDAsICJ1c2VyX3NjcmlwdCI6ICJ0cmFpbi5weSIsICJzYXZlX2l0ZXJzIjogWzAsIDEsIDIsIDQsIDgsIDE2LCAzMiwgNjQsIDEyOCwgMjU2LCA1MTIsIDEwMDAsIDIwMDAsIDMwMDAsIDQwMDAsIDUwMDAsIDYwMDAsIDcwMDAsIDgwMDAsIDkwMDAsIDEwMDAwLCAxMTAwMCwgMTIwMDAsIDEzMDAwLCAxNDAwMCwgMTUwMDAsIDE2MDAwLCAxNzAwMCwgMTgwMDAsIDE5MDAwLCAyMDAwMCwgMjEwMDAsIDIyMDAwLCAyMzAwMCwgMjQwMDAsIDI1MDAwLCAyNjAwMCwgMjcwMDAsIDI4MDAwLCAyOTAwMCwgMzAwMDAsIDMxMDAwLCAzMjAwMCwgMzMwMDAsIDM0MDAwLCAzNTAwMCwgMzYwMDAsIDM3MDAwLCAzODAwMCwgMzkwMDAsIDQwMDAwLCA0MTAwMCwgNDIwMDAsIDQzMDAwLCA0NDAwMCwgNDUwMDAsIDQ2MDAwLCA0NzAwMCwgNDgwMDAsIDQ5MDAwLCA1MDAwMCwgNTEwMDAsIDUyMDAwLCA1MzAwMCwgNTQwMDAsIDU1MDAwLCA1NjAwMCwgNTcwMDAsIDU4MDAwLCA1OTAwMCwgNjAwMDAsIDYxMDAwLCA2MjAwMCwgNjMwMDAsIDY0MDAwLCA2NTAwMCwgNjYwMDAsIDY3MDAwLCA2ODAwMCwgNjkwMDAsIDcwMDAwLCA3MTAwMCwgNzIwMDAsIDczMDAwLCA3NDAwMCwgNzUwMDAsIDc2MDAwLCA3NzAwMCwgNzgwMDAsIDc5MDAwLCA4MDAwMCwgODEwMDAsIDgyMDAwLCA4MzAwMCwgODQwMDAsIDg1MDAwLCA4NjAwMCwgODcwMDAsIDg4MDAwLCA4OTAwMCwgOTAwMDAsIDkxMDAwLCA5MjAwMCwgOTMwMDAsIDk0MDAwLCA5NTAwMCwgOTYwMDAsIDk3MDAwLCA5ODAwMCwgOTkwMDAsIDEwMDAwMCwgMTAxMDAwLCAxMDIwMDAsIDEwMzAwMCwgMTA0MDAwLCAxMDUwMDAsIDEwNjAwMCwgMTA3MDAwLCAxMDgwMDAsIDEwOTAwMCwgMTEwMDAwLCAxMTEwMDAsIDExMjAwMCwgMTEzMDAwLCAxMTQwMDAsIDExNTAwMCwgMTE2MDAwLCAxMTcwMDAsIDExODAwMCwgMTE5MDAwLCAxMjAwMDAsIDEyMTAwMCwgMTIyMDAwLCAxMjMwMDAsIDEyNDAwMC
wgMTI1MDAwLCAxMjYwMDAsIDEyNzAwMCwgMTI4MDAwLCAxMjkwMDAsIDEzMDAwMCwgMTMxMDAwLCAxMzIwMDAsIDEzMzAwMCwgMTM0MDAwLCAxMzUwMDAsIDEzNjAwMCwgMTM3MDAwLCAxMzgwMDAsIDEzOTAwMCwgMTQwMDAwLCAxNDEwMDAsIDE0MjAwMF0sICJnbG9iYWxfbnVtX2dwdXMiOiA4fQ==
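The `--deepspeed_config` and `--megatron_config` values above appear to be base64-encoded JSON. A minimal sketch for reading them follows; `decode_config_arg` is a hypothetical helper name, not part of gpt-neox, and the stand-in payload is illustrative rather than the full config from this run.

```python
import base64
import json


def decode_config_arg(blob: str) -> dict:
    """Decode a base64-encoded JSON argument into a dict (hypothetical helper)."""
    # The page wraps the long value across lines, so drop any whitespace first.
    compact = "".join(blob.split())
    return json.loads(base64.b64decode(compact).decode("utf-8"))


# The first 32 characters of the --deepspeed_config value shown above.
prefix = "eyJ0cmFpbl9iYXRjaF9zaXplIjogMTI4"
print(base64.b64decode(prefix).decode("utf-8"))  # {"train_batch_size": 128

# Round trip on a small stand-in payload (not the full config from this run).
stand_in = base64.b64encode(
    json.dumps({"train_micro_batch_size_per_gpu": 4}).encode("utf-8")
).decode("ascii")
wrapped = stand_in[:10] + "\n" + stand_in[10:]  # simulate a wrapped line
print(decode_config_arg(wrapped))
```

To decode the real arguments, paste the complete value (all wrapped lines) into `decode_config_arg`.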
System Hardware
CPU count | 48 |
Logical CPU count | 96 |
GPU count | 8 |
GPU type | NVIDIA A40 |
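The encoded config begins with a `train_batch_size` of 128, which matches the recorded micro batch size of 4, 4 gradient accumulation steps, and the 8 GPUs listed above. The sketch below checks that arithmetic. It assumes all 8 GPUs act as data-parallel replicas, which is plausible because `pipe_parallel_size` and `model_parallel_size` are both 1; `global_batch_size` is a hypothetical helper name.

```python
def global_batch_size(micro_batch_per_gpu: int,
                      grad_accum_steps: int,
                      data_parallel_ranks: int) -> int:
    """Samples consumed per optimizer step across all data-parallel ranks."""
    return micro_batch_per_gpu * grad_accum_steps * data_parallel_ranks


# Values taken from this run's config and hardware listing.
batch = global_batch_size(micro_batch_per_gpu=4,
                          grad_accum_steps=4,
                          data_parallel_ranks=8)  # assumed: 8 data-parallel GPUs
tokens_per_step = batch * 2048  # seq_length from the config
print(batch, tokens_per_step)
```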
W&B CLI Version
0.17.1
Group
jmzpl6tw_k7d4ruyz
Config
Config parameters are your model's inputs.
- {} 265 keys▶
- null
- "gelu"
- null
- false
- 1,000
- null
- false
- [] 16 items▶
- 0
- false
- null
- null
- null
- 4
- null
- false
- true
- false
- null
- true
- 1,000
- false
- 1
- "linear"
- false
- 1
- null
- null
- null
- null
- {} 2 keys▶
- "{ "pipe_parallel_size": 1, "model_parallel_size": 1, "num_layers": 16, "hidden_size": 2048, "num_attention_heads": 8, "seq_length": 2048, "max_position_embeddings": 2048, "pos_emb": "rotary", "rotary_pct": 0.25, "no_weight_tying": true, "gpt_j_residual": true, "output_layer_parallelism": "column", "scaled_upper_triang_masked_softmax_fusion": true, "bias_gelu_fusion": true, "init_method": "small_init", "output_layer_init_method": "wang_init", "optimizer": { "type": "Adam", "params": { "lr": 0.00025, "betas": [0.9, 0.95], "eps": 1.0e-8 } }, "min_lr": 0.000025, "zero_optimization": { "stage": 0, "allgather_partitions": true, "allgather_bucket_size": 500000000, "overlap_comm": true, "reduce_scatter": true, "reduce_bucket_size": 500000000, "contiguous_gradients": true, "cpu_offload": false }, "fp16": { "enabled": true, "type": "bfloat16", "auto_cast": true, "loss_scale": 0, "loss_scale_window": 1000, "initial_scale_power": 12, "hysteresis": 2, "min_loss_scale": 1 }, "fp32_allreduce": true, "train_micro_batch_size_per_gpu": 4, "gradient_accumulation_steps": 4, "data_impl": "mmap", "num_workers": 1, "checkpoint_activations": true, "checkpoint_num_layers": 1, "partition_activations": true, "synchronize_each_layer": true, "gradient_clipping": 1.0, "weight_decay": 0.1, "hidden_dropout": 0, "attention_dropout": 0, "train_iters": 143000, "lr_decay_iters": 143000, "distributed_backend": "nccl", "lr_decay_style": "cosine", "warmup": 0.01, "checkpoint_factor": 1000, "extra_save_iters": [0,1,2,4,8,16,32,64,128,256,512], "eval_interval": 143000, "eval_iters": 10, "log_interval": 10, "steps_per_print": 10, "wall_clock_breakdown": true, "tokenizer_type": "HFTokenizer", "dataset_type": "pause", "dataset_cfg": { "pause_id": 50277, } } "
- "# Suggested data paths when using GPT-NeoX locally { "data_path": "data/enwik8/enwik8_text_document", # or for weighted datasets: # "train-data-paths": ["data/enwik8/enwik8_text_document", "data/enwik8/enwik8_text_document"], # "test-data-paths": ["data/enwik8/enwik8_text_document", "data/enwik8/enwik8_text_document"], # "valid-data-paths": ["data/enwik8/enwik8_text_document", "data/enwik8/enwik8_text_document"], # "train-data-weights": [1., 2.], # "test-data-weights": [2., 1.], # "valid-data-weights": [0.5, 0.4], # If weight_by_num_documents is True, Builds dataset weights from a multinomial distribution over groups of data according to the number of documents in each group. # WARNING: setting this to True will override any user provided weights # "weight_by_num_documents": false, # "weighted_sampler_alpha": 0.3, "vocab_file": "../pythia_type2.json", "save": "checkpoints", "load": "checkpoints", "checkpoint_validation_with_forward_pass": False, "tensorboard_dir": "tensorboard", "log_dir": "logs", "use_wandb": True, "wandb_host": "https://api.wandb.ai", "wandb_project": "neox" } "
- false
- false
- true
- null
- null
- 0
- null
- "mmap"
- "data/enwik8/enwik8_text_document"
- null
- {} 1 key▶
- 50,277
- "pause"
- false
- null
- true
- {} 8 keys▶
- 500,000,000
- true
- 0
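The recorded config sets `lr` to 0.00025, `min_lr` to 0.000025, `warmup` to 0.01, `lr_decay_style` to "cosine", and `lr_decay_iters` to 143000. As a rough illustration of what such a schedule looks like, here is a generic cosine decay with linear warmup. This is a textbook form under stated assumptions (warmup as a fraction of the decay iterations, decay ending exactly at `min_lr`); the exact formula used by the training code may differ, and `lr_at` is a hypothetical name.

```python
import math


def lr_at(step: int,
          max_lr: float = 2.5e-4,
          min_lr: float = 2.5e-5,
          decay_iters: int = 143_000,
          warmup_frac: float = 0.01) -> float:
    """Generic linear-warmup + cosine-decay schedule (illustrative only)."""
    warmup_iters = round(warmup_frac * decay_iters)  # 1430 under this assumption
    if step < warmup_iters:
        # Linear ramp from 0 up to max_lr.
        return max_lr * step / warmup_iters
    # Fraction of the post-warmup decay window that has elapsed, capped at 1.
    progress = min(step - warmup_iters, decay_iters - warmup_iters) / (
        decay_iters - warmup_iters
    )
    # Cosine interpolation from max_lr (progress 0) to min_lr (progress 1).
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


for s in (0, 1430, 71_500, 143_000):
    print(s, lr_at(s))
```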
Summary
Summary metrics are your model's outputs.
No summary metrics saved for this run.
Check the summary metrics documentation for more information.
Artifact Outputs
This run produced these artifacts as outputs. Total: 1.