BC #02: MMOneSphere, ntrain100, PNet2_avg_ee
Try the naive averaging layer this time, using the new BC dataset. Remember, this was the one that did really well, but only on the translation-only data, so we hope SVD can do at least as well as this when the demonstrator has no rotations. UPDATE: yes, results are really good! Now if only we can get SVD to do as well as this. :X
Created on April 28 | Last edited on May 2
Contrast these results with:
From the prior BC experiments.
eval/info_done_final LGTM; averaging the three seeds gives (0.6978 + 0.6459 + 0.5714)/3 ≈ 0.6384.
Seems better than https://wandb.ai/mooey5775/mixed_media/reports/BC-02-MMOneSphere-ntrain100-PNet2_svd_pointwise_flow--VmlldzoxOTI1MDAw. Looking at the performance curves, though, the gap is largely due to an initial jump in performance, although the long-term behavior does also seem more stable.
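The seed average quoted above can be reproduced with a few lines of Python. This is only a sanity-check sketch: the three numbers are the final eval/info_done_final values given in this report, and the variable names are made up for illustration.

```python
from statistics import mean

# Final eval/info_done_final value for each of the three seeds,
# copied from the numbers quoted in this report.
final_success = [0.6978, 0.6459, 0.5714]

# Plain arithmetic mean across seeds.
avg_success = mean(final_success)

print(f"mean over {len(final_success)} seeds: {avg_success:.4f}")
```

With only three seeds the spread between them (roughly 0.57 to 0.70) is large compared to the gap between methods, so the average is best read as a rough indicator.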
Success Rates and Train/Eval MSEs.
Example GIFs:
First seed after 250 epochs, showing recovery behavior (this is why we allow for 100 time steps, even though we only use the first 75 time steps of the demonstrator data). Great sideways movement, etc.

Second seed after 250 epochs, also looking good. Failures are largely due to pushing the ball into the water and having it sink. (The demonstrator does have that initial sideways movement, but I now think we should just remove it since it is hard to learn.) That would mean regenerating the data again, though. :/

Third seed after 250 epochs, looking really good!

The training variant / parameters:
NOTE: This uses Eddie’s action_type='mean' that we added (for converting from flow→action). See this commit for the change. This shouldn't matter too much here and is mainly added for compatibility with other code. The reason is that a simple forward pass should just average the flows, so we get a 3D output. BUT, when selecting an action (not when computing train / eval MSEs) we call self._flow_to_env_action(obs, flow=self.actor.trunk.flow_per_pt), where flow_per_pt will have been populated by the prior forward pass. This gives us per-point flow that we can then average with action_type='mean'. It is a bit of a hack and, again, not strictly necessary in this particular case, but it matters more when we need the SVD stuff.
In our code this is pointnet_avg as the encoder (at least when we ran these on the 28th).
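As a rough illustration of what averaging with action_type='mean' amounts to, here is a minimal sketch. It is not the code from the repo: the function name mean_flow_to_action is hypothetical, and it assumes the per-point flow is an N x 3 collection of translation vectors that are all averaged with equal weight (the real _flow_to_env_action may select or weight points differently).

```python
from typing import List, Sequence


def mean_flow_to_action(flow_per_pt: Sequence[Sequence[float]]) -> List[float]:
    """Average per-point 3D flow vectors into one 3D translation.

    Hypothetical helper for illustration only; assumes every point
    contributes equally to the mean.
    """
    if len(flow_per_pt) == 0:
        raise ValueError("need at least one flow vector")
    if any(len(f) != 3 for f in flow_per_pt):
        raise ValueError("each flow vector must have exactly 3 components")
    n = len(flow_per_pt)
    # Component-wise mean over all points.
    return [sum(f[i] for f in flow_per_pt) / n for i in range(3)]


# Toy example: two points whose flows differ only in x.
toy_flow = [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]]
toy_action = mean_flow_to_action(toy_flow)
print(toy_action)
```

Since the averaging encoder's forward pass already outputs this kind of mean, applying the same reduction again at action-selection time should give the same 3D translation, which is why the extra step is not strictly necessary for this variant.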
For seed 102. Nothing seems out of the ordinary here ...
{"_hidden_keys": [],"act_type": "ee","action_type": "mean","actor_lr": 0.0001,"agent": "bc","alg_policy": "ladle_algorithmic_v02","algorithm": "BC","batch_size": 24,"bc_data_dir": "/data/dseita/softgym_mm/data_demo/MMOneSphere_v01_BClone_filtered_ladle_algorithmic_v02_nVars_2000_obs_combo_act_translation","bc_data_filtered": true,"data_buffer_capacity": 1000000,"encoder_type": "pointnet_avg","env_kwargs": {"action_mode": "translation","action_repeat": 8,"camera_name": "top_down","deterministic": false,"headless": true,"horizon": 100,"num_variations": 1000,"observation_mode": "cam_rgb","render": true,"render_mode": "fluid"},"env_kwargs_action_mode": "translation","env_kwargs_camera_height": 128,"env_kwargs_camera_width": 128,"env_kwargs_deterministic": false,"env_kwargs_num_variations": 2000,"env_kwargs_observation_mode": "point_cloud","env_name": "MMOneSphere","env_version": "v01","exp_name": "MMOneSphere_v01_BC_ntrain_0100_PCL_PNet2_avg_ee_2022_04_28_11_08_09_0003","hidden_dim": 1024,"lambda_pos": 1.0,"lambda_rot": 100.0,"log_interval": 1,"n_epochs": 250,"n_eval_episodes": 10,"n_train_demos": 100,"num_filters": 32,"num_layers": 4,"project_axis_ang_y": false,"save_freq": 10,"save_model": true,"save_video": true,"seed": 102,"wandb_entity": "mooey5775","wandb_project": "mixed_media","weighted_MSE": false}