BC03 3DoF, 1 Sphere, PN++ Average Layer, EE

Seems to be doing pretty well.
Created on May 13 | Last edited on May 20
I ran these at different times, so they aren't numbered 001 through 005, but they should all have the same settings, except that one of them only ran for 200 epochs (not a big deal; we can re-run that one if needed).
Edit (05/14/2022): the two runs that started on 05/14 have the correct evaluation flow visualizations and MSE computations. The earlier runs do not (but they should have had the correct policy selection for actions...).
I'm quite confused. It looks like it is doing well, yet PN++ SVD is not?
Good news: eval/MSE_loss with the 05/14 fixes is actually going down! It's noisy, but it gets down to around 0.03 while training gets down to about 0.01. This MSE is computed on 3D vectors, so we can only compare it to other numbers that compute MSE on EE vectors. That would be the naive PN++ classification method, which gets slightly higher (worse) MSE, so good!
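For reference, here is a minimal sketch of the kind of MSE being compared above, assuming predictions and demonstrator actions are both batches of 3D EE translation vectors. The function name `ee_mse`, the array shapes, and the choice to average over all elements are assumptions for illustration; the actual reduction in the training code may differ.

```python
import numpy as np

def ee_mse(pred_actions, demo_actions) -> float:
    """MSE between two batches of 3D EE vectors, each of shape (B, 3).

    Assumption: the error is averaged over every element (batch and xyz).
    """
    pred_actions = np.asarray(pred_actions, dtype=float)
    demo_actions = np.asarray(demo_actions, dtype=float)
    if pred_actions.shape != demo_actions.shape:
        raise ValueError("prediction and target shapes must match")
    return float(np.mean((pred_actions - demo_actions) ** 2))

# Hypothetical numbers, not taken from any run.
pred = [[0.1, 0.0, 0.0], [0.0, 0.2, 0.0]]
demo = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
mse = ee_mse(pred, demo)
```

Because the value depends on both the vector representation and the reduction, it is only comparable across methods that compute it the same way.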
Also, this is indeed averaging to the right thing even with sideways movement. Even though this looks weird, the average ends up fairly close to what the demonstrator actually did!
Example evaluation going downwards: the downward movement is VERY consistent on a per-point basis! Note that this one is also trending in a sideways direction (it gets closer to that as the ladle gets closer to the water, which makes sense).
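As a rough illustration of why the average can look reasonable even when individual per-point vectors look odd, here is a minimal sketch of an averaging step, assuming the network outputs one 3D vector per point and the EE action is their plain mean. The name `average_point_vectors` and the example values are hypothetical; the real average layer may weight or mask points differently.

```python
import numpy as np

def average_point_vectors(point_vectors: np.ndarray) -> np.ndarray:
    """Collapse per-point 3D predictions of shape (N, 3) into one (3,) vector."""
    point_vectors = np.asarray(point_vectors, dtype=float)
    if point_vectors.ndim != 2 or point_vectors.shape[1] != 3:
        raise ValueError("expected an (N, 3) array of per-point vectors")
    # Assumption: unweighted mean over all points.
    return point_vectors.mean(axis=0)

# Hypothetical per-point predictions: they agree on the second axis
# but disagree on the first, so the first-axis components partly cancel.
pred = np.array([
    [0.02, -0.010, 0.0],
    [-0.01, -0.012, 0.0],
    [0.03, -0.009, 0.0],
    [0.00, -0.011, 0.0],
])
action = average_point_vectors(pred)
```

In this toy case the disagreeing components shrink in the mean while the consistent component is preserved, which matches the qualitative behavior described above.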
﻿
Example: this one is tough, but the average is trending in the right direction:
﻿
Example:
﻿
Section 1
Run set (7 runs)
Runs w/out strange physics (5 runs)
﻿
Example GIFs
BC03_MMOneSphere_v01_ntrain_0100_PCL_PNet2_avg_ee_3DoF_ar_8_hor_100_rawPCL_scaleTarg_debug_2022_05_12_20_34_06_0001 after 200:
﻿
Update: BC03_MMOneSphere_v01_ntrain_0100_PCL_PNet2_avg_ee_3DoF_ar_8_hor_100_rawPCL_scaleTarg_2022_05_13_09_55_58_0002 ended up with exploding water GIFs, so we have to re-run it. This is one of the runs that didn't do well, so re-running is likely to help PN++ averaging.
﻿
﻿
﻿
The two 05/14 runs that fixed the evaluation MSE / flow visualization calculations:
BC03_MMOneSphere_v01_ntrain_0100_PCL_PNet2_avg_ee_3DoF_ar_8_hor_100_rawPCL_scaleTarg_2022_05_14_13_39_21_0001
﻿
BC03_MMOneSphere_v01_ntrain_0100_PCL_PNet2_avg_ee_3DoF_ar_8_hor_100_rawPCL_scaleTarg_2022_05_14_13_39_21_0002 (edit: this one is at ~200. It later ran into the "exploding water" GIFs, so the run stopped, but I think that would make the PN++ averaging even better).
﻿
﻿
Code
(Edit: I did some more runs after the 05/14 change that fixed the validation flow and MSE calculations.)
After these commits:
﻿https://github.com/Xingyu-Lin/softagent_rpad/commit/f68acdd176698d92bbb4c3cd2e65e7a3b4229674﻿
﻿https://github.com/mooey5775/softgym_MM/commit/6371e3a7528b7e770d69d7233e0f70605dd0ac99﻿
﻿