sebulba profile

Created on January 29 | Last edited on January 30
Finding highlights:
  • Adding a timeout to the `params_queue` get significantly reduces the time actors spend waiting on it, especially when paired with a dedicated GPU. The side effect is that an actor whose get times out cannot pull the latest params from the learner and keeps using the old params instead.
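
A minimal sketch of the trade-off above, using Python's standard `queue.Queue` as a stand-in for the actor's `params_queue`. The helper `get_params`, the `version` field, and the timeout value are hypothetical and only for illustration; this is not the code profiled in this report.

```python
import queue


def get_params(params_queue, current_params, timeout):
    """Try to pull fresh params; fall back to the current ones on timeout.

    Returns (params, refreshed). `refreshed` is False when the get timed out,
    i.e. the actor keeps acting with stale params.
    """
    try:
        return params_queue.get(timeout=timeout), True
    except queue.Empty:
        return current_params, False


# Hypothetical setup: the learner publishes params into a size-1 queue.
params_queue = queue.Queue(maxsize=1)
params = {"version": 0}

# The learner has not published anything yet: the get times out quickly
# and the actor continues with the old params instead of blocking.
params, refreshed = get_params(params_queue, params, timeout=0.01)
stale_result = (params["version"], refreshed)

# Once the learner publishes, the next get returns the new params.
params_queue.put({"version": 1})
params, refreshed = get_params(params_queue, params, timeout=0.01)
fresh_result = (params["version"], refreshed)
```

Logging how often `refreshed` is False would show how stale the actors' params become for a given timeout.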


[Panels: stats/rollout_time, stats/training_time, and stats/data_transfer_time for the runs "count data_transfer" and "device_put_sharded only in learner"; two further panels plotted against global_step (100k to 400k)]
Run sets (with the count shown next to each):
  • count data_transfer (1)
  • split_data in prepare_data (1)
  • device_put_sharded only in learner (1)
  • multi-actor-threads (1)
  • do data_transfer (2)
  • do data_transfer (1)