
Multi-Sphere, Comparing Point Clouds

Now with updated data statistics so we can track how often we are getting one set of balls over another, etc. (Edit: investigating again in mid-July, but I may make a new report since grouping runs here is harder for some reason.)
Created on June 8 | Last edited on July 12


Overview

Finally starting to get back to this (06/08/2022); compare with the CNN here:
Surprisingly, the CNN is really bad, and it looks like it would not have improved even with more training.
NOTE! For the results here, I changed the setup so that we now train for 500 epochs and evaluate every 5 epochs (so training, rather than evaluation, is now the bottleneck). That gives 100 evaluation rounds x 10 episodes = 1000 evaluation episodes in total, and in the plots below each "epoch" value must be multiplied by 5 to get the true training epoch.
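To keep the bookkeeping straight, here is a minimal sketch of that conversion; the eval period of 5 and the 10 episodes per evaluation round are from the note above, while the names and structure are just illustrative:

# Minimal sketch of the epoch/evaluation bookkeeping described above.
# The eval period (5) and episodes per evaluation round (10) come from
# the note; everything else is illustrative.
TOTAL_EPOCHS = 500
EVAL_PERIOD = 5          # evaluate once every 5 training epochs
EPISODES_PER_EVAL = 10   # episodes rolled out per evaluation round

num_eval_rounds = TOTAL_EPOCHS // EVAL_PERIOD              # 100
total_eval_episodes = num_eval_rounds * EPISODES_PER_EVAL  # 100 x 10 = 1000

def true_epoch(plotted_epoch: int) -> int:
    """Convert an 'epoch' value on the plot x-axis to the actual training epoch."""
    return plotted_epoch * EVAL_PERIOD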

(06/08/2022) Found Inconsistency with my Evaluation

Later update: oops, I realized I should have also been using an eval period of 5 for the normal point cloud, not just the new version. :( I really should re-run this because the curves are not comparable. See https://github.com/Xingyu-Lin/softagent_rpad/commit/c46ab277e19c093fa231974db0f22e8412b5c1c5 (and https://github.com/Xingyu-Lin/softagent_rpad/commit/48ba1a45cb1f9092268561ac9479c4aabffbc2a2). Essentially this means that for the new point cloud, each "epoch" here should be multiplied by 5 to get the true one, whereas for the older point cloud that is not the case. I stopped my runs on the cluster and the local run, and for the plots below I'll only show the one local run (and will remove it later once we can replace it).

(06/08/2022) Fixed evaluation, adding more data statistics

As of 2:00pm, I'm running my second version of the first run with the new point cloud, which will also have better analysis so that we can identify whether we're retrieving the target or the distractor (or both). To clarify, this is after the SoftGym (not SoftAgent) commit:
This is the status as of 2:40pm:
(softgym) seita@takeshi:~/softagent_rpad_MM (daniel-mixed-media-bc03)$ ls -lh data/local/BC03_MMMultiSphere_v02_ntrain_0100_PCL_gt_v01_PNet2_svd_pointwise_6d_flow_ee2flow_4DoF_ar_8_hor_100_scalePCL_noScaleTarg/
total 8.0K
drwxrwxr-x 5 seita seita 4.0K Jun 8 14:07 BC03_MMMultiSphere_v02_ntrain_0100_PCL_gt_v01_PNet2_svd_pointwise_6d_flow_ee2flow_4DoF_ar_8_hor_100_scalePCL_noScaleTarg_2022_06_08_00_17_52_0001
drwxrwxr-x 5 seita seita 4.0K Jun 8 14:17 BC03_MMMultiSphere_v02_ntrain_0100_PCL_gt_v01_PNet2_svd_pointwise_6d_flow_ee2flow_4DoF_ar_8_hor_100_scalePCL_noScaleTarg_2022_06_08_14_14_44_0001
(softgym) seita@takeshi:~/softagent_rpad_MM (daniel-mixed-media-bc03)$
The second run, timestamped 06_08_14_14_44, is the one with the updated statistics.
Edit: as of 10:00pm, I should have three runs each for the old PCL and the new PCL. I also have one more run of the new PCL, but without the extra data statistics I logged (we might want to ignore that one).
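Side note: since the run directories embed a timestamp (exp_name_YYYY_MM_DD_HH_MM_SS_XXXX, as in the listing above), a small helper like the following (hypothetical, not in the codebase) can pick out the most recent run of an experiment, e.g. the 14_14_44 run with the updated statistics:

# Hypothetical helper (not in the codebase): pick the run subdirectory with
# the most recent timestamp embedded in its name.
import os
import re

def latest_run(exp_dir: str) -> str:
    stamp = re.compile(r"_(\d{4}_\d{2}_\d{2}_\d{2}_\d{2}_\d{2})_\d+$")
    runs = [d for d in os.listdir(exp_dir) if stamp.search(d)]
    # Timestamps are zero-padded, so lexicographic order matches chronological order.
    return max(runs, key=lambda d: stamp.search(d).group(1))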

(06/10/2022) Updated Results, Noticed a BUG!

Results are not looking good for some reason, even with the new point cloud (?!?). Update: never mind, I might not have actually run the baseline with the proper point clouds. :( But let's not worry about that, since even the new point cloud isn't doing well.
OK, to summarize, here's the issue. The results I have are stored here:
seita@takeshi:/data/seita/softagent_mm$ ls -lh BC03_MMMultiSphere_v02_ntrain_0100_PCL_PNet2_svd_pointwise_6d_flow_ee2flow_4DoF_ar_8_hor_100_scalePCL_noScaleTarg/
total 16K
drwxrwxr-x 5 seita seita 4.0K Jun 9 22:49 BC03_MMMultiSphere_v02_ntrain_0100_PCL_PNet2_svd_pointwise_6d_flow_ee2flow_4DoF_ar_8_hor_100_scalePCL_noScaleTarg_2022_06_08_00_18_39_0001
drwxrwxr-x 5 seita seita 4.0K Jun 9 22:50 BC03_MMMultiSphere_v02_ntrain_0100_PCL_PNet2_svd_pointwise_6d_flow_ee2flow_4DoF_ar_8_hor_100_scalePCL_noScaleTarg_2022_06_08_19_08_08_0001
drwxrwxr-x 5 seita seita 4.0K Jun 9 21:13 BC03_MMMultiSphere_v02_ntrain_0100_PCL_PNet2_svd_pointwise_6d_flow_ee2flow_4DoF_ar_8_hor_100_scalePCL_noScaleTarg_2022_06_08_19_43_52_0001
drwxrwxr-x 5 seita seita 4.0K Jun 9 21:14 BC03_MMMultiSphere_v02_ntrain_0100_PCL_PNet2_svd_pointwise_6d_flow_ee2flow_4DoF_ar_8_hor_100_scalePCL_noScaleTarg_2022_06_08_19_43_52_0002
seita@takeshi:/data/seita/softagent_mm$ ls -lh BC03_MMMultiSphere_v02_ntrain_0100_PCL_gt_v01_PNet2_svd_pointwise_6d_flow_ee2flow_4DoF_ar_8_hor_100_scalePCL_noScaleTarg/
total 16K
drwxrwxr-x 5 seita seita 4.0K Jun 8 14:07 BC03_MMMultiSphere_v02_ntrain_0100_PCL_gt_v01_PNet2_svd_pointwise_6d_flow_ee2flow_4DoF_ar_8_hor_100_scalePCL_noScaleTarg_2022_06_08_00_17_52_0001
drwxrwxr-x 5 seita seita 4.0K Jun 8 14:17 BC03_MMMultiSphere_v02_ntrain_0100_PCL_gt_v01_PNet2_svd_pointwise_6d_flow_ee2flow_4DoF_ar_8_hor_100_scalePCL_noScaleTarg_2022_06_08_14_14_44_0001
drwxrwxr-x 5 seita seita 4.0K Jun 9 21:08 BC03_MMMultiSphere_v02_ntrain_0100_PCL_gt_v01_PNet2_svd_pointwise_6d_flow_ee2flow_4DoF_ar_8_hor_100_scalePCL_noScaleTarg_2022_06_08_22_01_17_0001
drwxrwxr-x 5 seita seita 4.0K Jun 8 22:10 BC03_MMMultiSphere_v02_ntrain_0100_PCL_gt_v01_PNet2_svd_pointwise_6d_flow_ee2flow_4DoF_ar_8_hor_100_scalePCL_noScaleTarg_2022_06_08_22_07_53_0001
seita@takeshi:/data/seita/softagent_mm$
That is 8 runs. However, the first run listed above was only partially trained and should not be used, which leaves 7 runs with proper data storage (evaluating once every 5 epochs, etc.). I also marked the first GT PCL run as "cancel", so after cleaning up the directories we have 6 results total, which makes sense:
seita@takeshi:/data/seita/softagent_mm$ ls -lh BC03_MMMultiSphere_v02_ntrain_0100_PCL_*
BC03_MMMultiSphere_v02_ntrain_0100_PCL_gt_v01_PNet2_svd_pointwise_6d_flow_ee2flow_4DoF_ar_8_hor_100_scalePCL_noScaleTarg:
total 12K
drwxrwxr-x 5 seita seita 4.0K Jun 8 14:17 BC03_MMMultiSphere_v02_ntrain_0100_PCL_gt_v01_PNet2_svd_pointwise_6d_flow_ee2flow_4DoF_ar_8_hor_100_scalePCL_noScaleTarg_2022_06_08_14_14_44_0001
drwxrwxr-x 5 seita seita 4.0K Jun 9 21:08 BC03_MMMultiSphere_v02_ntrain_0100_PCL_gt_v01_PNet2_svd_pointwise_6d_flow_ee2flow_4DoF_ar_8_hor_100_scalePCL_noScaleTarg_2022_06_08_22_01_17_0001
drwxrwxr-x 5 seita seita 4.0K Jun 8 22:10 BC03_MMMultiSphere_v02_ntrain_0100_PCL_gt_v01_PNet2_svd_pointwise_6d_flow_ee2flow_4DoF_ar_8_hor_100_scalePCL_noScaleTarg_2022_06_08_22_07_53_0001

BC03_MMMultiSphere_v02_ntrain_0100_PCL_PNet2_svd_pointwise_6d_flow_ee2flow_4DoF_ar_8_hor_100_scalePCL_noScaleTarg:
total 12K
drwxrwxr-x 5 seita seita 4.0K Jun 9 22:50 BC03_MMMultiSphere_v02_ntrain_0100_PCL_PNet2_svd_pointwise_6d_flow_ee2flow_4DoF_ar_8_hor_100_scalePCL_noScaleTarg_2022_06_08_19_08_08_0001
drwxrwxr-x 5 seita seita 4.0K Jun 9 21:13 BC03_MMMultiSphere_v02_ntrain_0100_PCL_PNet2_svd_pointwise_6d_flow_ee2flow_4DoF_ar_8_hor_100_scalePCL_noScaleTarg_2022_06_08_19_43_52_0001
drwxrwxr-x 5 seita seita 4.0K Jun 9 21:14 BC03_MMMultiSphere_v02_ntrain_0100_PCL_PNet2_svd_pointwise_6d_flow_ee2flow_4DoF_ar_8_hor_100_scalePCL_noScaleTarg_2022_06_08_19_43_52_0002
seita@takeshi:/data/seita/softagent_mm$
The problem is that I think I was using the same point cloud type for both the old and the new way, so despite the directory names, BOTH ARE USING THE DATA DIRECTORY WITH THE UPDATED POINT CLOUD. (So basically we ran 6X seeds of the new way, I think.) But then, if our point cloud observation type is point_cloud, how did the latter work at all? I think it may have trained on data from the new point cloud but then been tested on the old point cloud, so I'm actually wondering why performance wasn't worse; we really do need to check this case again. However, it's not a priority right now, because even the new point cloud case, which should have been working, is not working...
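Once this gets revisited, a cheap guard against exactly this mixup would be something like the following (a hypothetical check, not in the codebase; it assumes the BC data directory name encodes the point cloud type the same way the experiment names above do, i.e. via a gt_v01 tag):

# Hypothetical sanity check (not in the codebase): make sure the variant's
# obs_type matches the point cloud type encoded in the training data
# directory name, to catch the mixup described above.
def check_obs_type(variant: dict, data_dir: str) -> None:
    obs_type = variant['obs_type']
    dir_has_gt_pcl = 'gt_v01' in data_dir
    if obs_type == 'point_cloud_gt_v01' and not dir_has_gt_pcl:
        raise ValueError(f'Variant expects the new (gt_v01) point cloud, got data dir: {data_dir}')
    if obs_type == 'point_cloud' and dir_has_gt_pcl:
        raise ValueError(f'Variant expects the old point cloud, got data dir: {data_dir}')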

(07/11/2022) Try this again.

Let's be careful to make sure our evaluation is correct. Do this after:
Also, this is with 3D flow, so results could differ from earlier; I think I did the prior experiments with 6D flow.
Compare:
SVD_POINTWISE_EE2FLOW_GT_PCL = dict(
    obs_type='point_cloud_gt_v01',  # note the change
    act_type='ee2flow',
    encoder_type='pointnet_svd_pointwise',
    method_flow2act='svd',
    use_consistency_loss=True,
    lambda_consistency=0.1,
    scale_pcl_flow=True,
    scale_pcl_val=250,
)
versus the old way:
SVD_POINTWISE_EE2FLOW = dict(
    obs_type='point_cloud',
    act_type='ee2flow',
    encoder_type='pointnet_svd_pointwise',
    method_flow2act='svd',
    use_consistency_loss=True,
    lambda_consistency=0.1,
    scale_pcl_flow=True,
    scale_pcl_val=250,
)
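As a quick check that the comparison is apples-to-apples, the two dicts above should differ only in obs_type; a tiny helper (illustrative only, not part of the codebase) makes that explicit:

# Illustrative only: confirm the two variants above differ solely in obs_type,
# so any performance gap can be attributed to the point cloud.
def variant_diff(a: dict, b: dict) -> dict:
    keys = set(a) | set(b)
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}

print(variant_diff(SVD_POINTWISE_EE2FLOW_GT_PCL, SVD_POINTWISE_EE2FLOW))
# Expected output: {'obs_type': ('point_cloud_gt_v01', 'point_cloud')}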

Results


Run sets: 6D Flow SVD PW Cons. (NEW PCL), 3 runs; 6D Flow SVD PW Cons. (old PCL), 3 runs.


More Results: How Often the Red vs. Yellow Ball is Being Raised

Unfortunately, this seems to suggest that the policy is not actually that effective at pulling out the target item or disambiguating it from the distractor. TODO: add the old (non-new) point cloud runs here and see if this resolves the issue. The metric info_height_item_1_final refers to the height of the distractor at the end of the episode.
Actually, this metric is also somewhat misleading: sometimes we lift an item and then lower it, which can cause it to fall out, so the final height may not accurately reflect what happened. We need to add yet another evaluation metric to the info dict, I think.
UPDATE (06/08/2022): we log these now (see the sketch after the list), so ignore any older runs that don't include them:
  • eval/info_done_final: whether we retrieved the target item. (Does not consider the distractor, so this includes cases where both items are retrieved.)
  • eval/info_dist_1_done_final: whether we retrieved the distractor. (Does not consider the target, so this includes cases where both items are retrieved.)
  • eval/info_done_and_dist_1_final: whether we retrieved BOTH items.
  • eval/info_done_no_dist_1_final: whether we retrieved the target and AVOIDED the distractor.
  • eval/info_no_done_dist_1_final: whether we retrieved the distractor but NOT the target!
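For reference, here's a minimal sketch (not the actual SoftGym logging code) of how these combined metrics follow from two per-episode booleans, done for the target and dist_1_done for the distractor:

# Minimal sketch (not the actual SoftGym logging code): the five metrics above
# are just the combinations of two per-episode booleans.
def retrieval_stats(done: bool, dist_1_done: bool) -> dict:
    return {
        'info_done_final': done,                                  # got the target
        'info_dist_1_done_final': dist_1_done,                    # got the distractor
        'info_done_and_dist_1_final': done and dist_1_done,       # got both
        'info_done_no_dist_1_final': done and not dist_1_done,    # target only
        'info_no_done_dist_1_final': (not done) and dist_1_done,  # distractor only
    }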

Run sets: 6D Flow SVD PW Cons. (NEW PCL), 3 runs; 6D Flow SVD PW Cons. (old PCL), 3 runs.


Code

Did these after the following commits:

Example GIFs (new point cloud)

One of the runs finished. Now that we've adjusted the train / eval balance, the bottleneck is the training (which I think is fine).
Done with BC. Elapsed train / eval time comparison:
cumulative train time: 32372.27s
cumulative eval time: 17400.58s
That is roughly 9.0 hours of training versus 4.8 hours of evaluation. The run is stored in this directory:
BC03_MMMultiSphere_v02_ntrain_0100_PCL_gt_v01_PNet2_svd_pointwise_6d_flow_ee2flow_4DoF_ar_8_hor_100_scalePCL_noScaleTarg_2022_06_08_00_17_52_0001
It does seem like the policy is not able to disambiguate. :( After 495 epochs it's still getting confused; the GIF second from the top left is really puzzling, since it clearly has two options but moves to the distractor.

The other runs:
drwxrwxr-x 5 seita seita 4.0K Jun 8 14:17 BC03_MMMultiSphere_v02_ntrain_0100_PCL_gt_v01_PNet2_svd_pointwise_6d_flow_ee2flow_4DoF_ar_8_hor_100_scalePCL_noScaleTarg_2022_06_08_14_14_44_0001
drwxrwxr-x 5 seita seita 4.0K Jun 9 21:08 BC03_MMMultiSphere_v02_ntrain_0100_PCL_gt_v01_PNet2_svd_pointwise_6d_flow_ee2flow_4DoF_ar_8_hor_100_scalePCL_noScaleTarg_2022_06_08_22_01_17_0001
drwxrwxr-x 5 seita seita 4.0K Jun 8 22:10 BC03_MMMultiSphere_v02_ntrain_0100_PCL_gt_v01_PNet2_svd_pointwise_6d_flow_ee2flow_4DoF_ar_8_hor_100_scalePCL_noScaleTarg_2022_06_08_22_07_53_0001
At 495 epochs, still (!) getting confused!
Actually, I now wonder if this happens because, when the ladle lowers, the red ball sometimes gets moved underneath the ladle's bowl, and that is what causes the ball to be lifted?
GIFs at 495 epochs (one per run listed above).

Example GIFs (old point cloud)

(softgym) seita@takeshi:/data/seita/softagent_mm$ ls -lh BC03_MMMultiSphere_v02_ntrain_0100_PCL_PNet2_svd_pointwise_6d_flow_ee2flow_4DoF_ar_8_hor_100_scalePCL_noScaleTarg/
total 16K
drwxrwxr-x 5 seita seita 4.0K Jun 9 21:11 BC03_MMMultiSphere_v02_ntrain_0100_PCL_PNet2_svd_pointwise_6d_flow_ee2flow_4DoF_ar_8_hor_100_scalePCL_noScaleTarg_2022_06_08_19_08_08_0001
drwxrwxr-x 5 seita seita 4.0K Jun 9 21:13 BC03_MMMultiSphere_v02_ntrain_0100_PCL_PNet2_svd_pointwise_6d_flow_ee2flow_4DoF_ar_8_hor_100_scalePCL_noScaleTarg_2022_06_08_19_43_52_0001
drwxrwxr-x 5 seita seita 4.0K Jun 9 21:14 BC03_MMMultiSphere_v02_ntrain_0100_PCL_PNet2_svd_pointwise_6d_flow_ee2flow_4DoF_ar_8_hor_100_scalePCL_noScaleTarg_2022_06_08_19_43_52_0002
(softgym) seita@takeshi:/data/seita/softagent_mm$
With the old point cloud, the policy still gets confused between the two balls. At least the rotation behavior seems good.

Results from (07/11/2022) Experiments

Now we use the updated evaluation metrics, with evaluation every 25 epochs, etc.
This is 3D flow, comparing the two different point clouds.
Unfortunately, I can't figure out how to group the runs here (it works when I make a brand new report, though). It's a bit of a moot point, however, as all the success rates seem low.
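One possible workaround for the grouping issue is to skip the report UI and aggregate with the public wandb API; the sketch below is only an assumption on my part (the project path and the idea that obs_type is logged flat in run.config are hypothetical), not how these panels were actually made:

# Sketch only: pull runs via the public wandb API and group the final target
# retrieval metric by observation type. The project path and the 'obs_type'
# config key are assumptions.
import collections
import wandb

api = wandb.Api()
runs = api.runs('my-entity/softagent-mm')  # hypothetical project path

grouped = collections.defaultdict(list)
for run in runs:
    obs_type = run.config.get('obs_type', 'unknown')
    done = run.summary.get('eval/info_done_final')
    if done is not None:
        grouped[obs_type].append(done)

for obs_type, vals in grouped.items():
    print(f'{obs_type}: mean eval/info_done_final = {sum(vals) / len(vals):.3f}')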

Run sets: 3D flow, observable PCL, 5 runs; 3D flow, g.t. PCL, 5 runs.