SBX TD3 - Influence of policy net

net_arch=[256,256] vs net_arch=[400,300]
Created on January 11 | Last edited on January 11
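
To give a sense of how different the two configurations are in size, here is a rough sketch that counts the weights and biases of a plain fully connected network for each `net_arch`. The helper `mlp_param_count` is hypothetical (it is not part of SBX or RL Zoo), and the input and output sizes of 17 and 6 are assumed, HalfCheetah-like dimensions. The count covers a single actor-style MLP only, not the critics or target networks.

```python
def mlp_param_count(input_dim, net_arch, output_dim):
    """Count weights and biases of a fully connected MLP.

    Hypothetical helper for illustration; layer sizes are
    input_dim -> net_arch[0] -> ... -> net_arch[-1] -> output_dim.
    """
    sizes = [input_dim, *net_arch, output_dim]
    # Each dense layer has n_in * n_out weights plus n_out biases.
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes[:-1], sizes[1:]))


# Assumed dimensions (HalfCheetah-like): 17 observations, 6 actions.
OBS_DIM, ACT_DIM = 17, 6

small = mlp_param_count(OBS_DIM, [256, 256], ACT_DIM)
big = mlp_param_count(OBS_DIM, [400, 300], ACT_DIM)

print(f"net_arch=[256,256]: {small} parameters")
print(f"net_arch=[400,300]: {big} parameters")
```

Under these assumptions the [400,300] network has a little under twice as many parameters as the [256,256] one, so the "TD3 bigger" runs compare a moderately larger network rather than a different order of magnitude.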

HalfCheetah-v4


[Plots: two panels, x-axis global_step (200k to 800k); y-axis ranges roughly 1400 to 1900 and 4000 to 10000]
Legend:
- TD3 bigger (algo: td3, hyperparams.policy_kwargs.net_arch: [400,300])
- TD3 RL Zoo 2.3.0 (algo: td3, saved_hyperparams.learning_rate: 0.001)
Run sets: TD3 RL Zoo 2.3.0 (7), TD3 RL Zoo 2.2.1 (11), TD3 bigger (3)



Ant-v4


Run sets: TD3 RL Zoo 2.3.0 (9), TD3 RL Zoo 2.2.1 (13), TD3 bigger (3)



Hopper-v4


Run sets: TD3 RL Zoo 2.3.0 (9), TD3 RL Zoo 2.2.1 (10), TD3 bigger (3)



Walker2d-v4


Run sets: TD3 RL Zoo 2.3.0 (9), TD3 RL Zoo 2.2.1 (10), TD3 bigger (3)



Swimmer-v4


Run sets: TD3 RL Zoo 2.3.0 (9), TD3 RL Zoo 2.2.1 (10), TD3 bigger (3)