cleanba distributed (actor SPS improvement)

Created on February 16 | Last edited on February 17

[Figure: six line plots. One is plotted against global_step (10M to 50M); the other five are plotted against Time (minutes). Panel titles and y-axis labels were not preserved.]
Legend (run group: count shown beside it):
a0-l1,2,3,4,5,6: 4
a0-l0: 3
a0-l1,2: 3
a0-l1,2,3-d2 (same node): 1
a0-l1,2,3-d4 (2 nodes): 1
a0-l1,2,3-d2 (same hyperparams): 1
a0-l0 (high envs, 1 update): 1
a0-l1: 1
thput: 1
a0-l1,2,3-d1 (learner does data transfer): 1
a0-l1,2,3-d1 (learner does data transfer, minor improvement): 1
a0-l1,2,3-d1(K=2): 1
a0-l1,2,3-d1(K=1, N=120, b=40): 1
a0-l1,2,3-d1(K=1, N=192, b=64): 1