[Plot: loss (train_reward/minibatch/loss). Run sets: my attempts (10), openai original codebase (40). Runs grouped by exp_name: train_reward; 10 visualized.]
Run table (1 run):
State: Finished
User: costa-huang
Runtime: 3m 1s
base_model: gpt2
Columns with no value: Notes, Tags, Created, Sweep, a, action_noise, action_shape, actor_buffer_size, actor_device_ids, actor_devices, agent_model_path, alg, algorithm, algorithm_spec.GAE, algorithm_spec.K_epoch, algorithm_spec.dueling, algorithm_spec.entropy_coeff, algorithm_spec.episodic_update, algorithm_spec.eps_clip, algorithm_spec.eps_decay, algorithm_spec.eps_final, algorithm_spec.eps_start, algorithm_spec.gamma, algorithm_spec.lambda, algorithm_spec.max_grad_norm, algorithm_spec.multi_step, algorithm_spec.policy_loss_coeff, algorithm_spec.replay_buffer_size, algorithm_spec.target_update_interval, algorithm_spec.vf_coeff, alpha, anneal-lr, anneal_lr, async_batch_size, async_update, asyncvec, autotune, aux_batch_size, aux_minibatch_size, backend, baseline_cost, batch_size, beta_clone, buffer_size, built_in_ais
Created with ❤️ on Weights & Biases.
https://wandb.ai/costa-huang/cleanRL/reports/sentiment-analysis--Vmlldzo0Njk2NzY3