
Reproducing Results for Open-Source, PyFlex

Only ToolFlowNet and Direct Vector MSE, for both variants of PourWater (3D and 6D action spaces). Uses PyFlex, which is what we can release!
Created on February 3 | Last edited on February 5
I did this after merging:
See my Notion for other info to cross reference.
See my other REPORT to cross-reference with PyFlexRobotics
The cluster was giving a lot of problems and at first I wasn't sure why, so the plan was to see what happens when we run these locally. Resolved 02/03/2023: the problem was with compute-0-27, which probably needs a reboot, since all our runs were freezing.


02/04/2023: ah, now I realize we have to use the version of "Direct Vector MSE" that we actually reported in the paper. That uses:
exp_configs.DIRECT_VECTOR_INTRINSIC_AXIS_ANGLE
Re-running today... and as of 02/05 I have added results here with "Intrinsic" in the name.




Results, PourWater 3D

  • 2/5 seeds finished for Direct Vector MSE on the first attempt.
    • FYI, seeds 1 and 5 finished; seeds 2, 3, and 4 were the ones that didn't run.
    • Re-running these Feb 03; should be done Feb 04.
  • 5/5 seeds finished for ToolFlowNet. Gets 0.640 raw perf. In Table S6 (raw success rate) we reported 0.720 +/- 0.050, so I think the two would overlap once we include a stderr for our runs.
  • 5/5 seeds finished for Direct Vector MSE (intrinsic) -- should use this one. Gets 0.568 raw perf, so it's actually pretty close to ToolFlowNet. In Table S6 we reported 0.480 +/- 0.070, which would probably also overlap with this one.
Overall, the two methods are a bit closer in performance than I would have expected (0.640 vs. 0.568). Not a deal breaker, as ToolFlowNet still does better, but it's something to think about.
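A quick sketch for sanity-checking the "would overlap in stderr" guesses above. It assumes the +/- values in Table S6 are standard errors, and it computes how large the stderr on our new runs would have to be for the two intervals to touch. The function name `gap_to_interval` is made up for this note; the numbers are the ones quoted in the bullets.

```python
def gap_to_interval(ours, reported_mean, reported_halfwidth):
    """Distance from `ours` to the nearest edge of reported_mean +/- reported_halfwidth.

    Returns 0.0 when `ours` already lies inside the reported interval.
    """
    low = reported_mean - reported_halfwidth
    high = reported_mean + reported_halfwidth
    if ours < low:
        return low - ours
    if ours > high:
        return ours - high
    return 0.0


# (our raw perf, Table S6 mean, Table S6 +/-) for PourWater 3D, as quoted above.
comparisons = {
    "ToolFlowNet": (0.640, 0.720, 0.050),
    "Direct Vector MSE Intrinsic": (0.568, 0.480, 0.070),
}

for name, (ours, mean, half) in comparisons.items():
    gap = gap_to_interval(ours, mean, half)
    # Our own stderr must be at least this large for the intervals to overlap.
    print(f"{name}: our stderr must be >= {gap:.3f} to overlap")
```

With these numbers the gaps are about 0.030 for ToolFlowNet and 0.018 for Direct Vector MSE Intrinsic, so whether the runs really overlap depends on the per-seed spread, which is not listed in these notes.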

[Plot panel. Run sets: Direct Vector MSE (5 runs), ToolFlowNet (5 runs), Direct Vector MSE Intrinsic (5 runs).]


Results, PourWater 6D

All right, how do these look?
  • 5/5 Direct Vector MSE runs finished. Results don't look good (which is good for us!).
  • 2/5 ToolFlowNet runs finished. Seed 4 crashed, while seeds 3 and 5 seem perpetually stuck (let's just terminate them). Seeds 1 and 2, which finished, look good, but we need to run the other three again. Feb 03: re-running those 3.
    • These are done! Results are 0.696 raw performance. In Table S6 we reported 0.544 +/- 0.030, so we're actually doing a lot better here. Interesting ...
  • 5/5 Direct Vector MSE with intrinsic axis-angle (use this) finished. Results are 0.296 raw performance. In Table S6 we reported 0.328 +/- 0.030, so that's probably consistent. Good to know the baseline is not that good...
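To report our numbers in the same "mean +/- stderr" form as Table S6, something like the sketch below should work. It assumes the table's +/- is the standard error of the mean over seeds (sample standard deviation divided by sqrt(n)); the helper names are made up for this note, and the per-seed list is a placeholder, not data from these runs.

```python
import math
import statistics


def mean_and_stderr(per_seed_scores):
    """Return (mean, standard error of the mean) over per-seed scores."""
    n = len(per_seed_scores)
    if n < 2:
        raise ValueError("need at least two seeds to estimate a standard error")
    mean = statistics.fmean(per_seed_scores)
    # Sample standard deviation (n - 1 denominator) divided by sqrt(n).
    stderr = statistics.stdev(per_seed_scores) / math.sqrt(n)
    return mean, stderr


def intervals_overlap(mean_a, stderr_a, mean_b, stderr_b):
    """True if mean_a +/- stderr_a and mean_b +/- stderr_b share any point."""
    return abs(mean_a - mean_b) <= stderr_a + stderr_b


# PLACEHOLDER values only, NOT results from these runs: swap in the five
# per-seed raw performance numbers exported from the run set.
placeholder_scores = [0.1, 0.2, 0.3, 0.4, 0.5]
ours_mean, ours_stderr = mean_and_stderr(placeholder_scores)
print(f"placeholder: {ours_mean:.3f} +/- {ours_stderr:.3f}")
```

Once the real per-seed numbers are plugged in, `intervals_overlap(ours_mean, ours_stderr, 0.544, 0.030)` would be the check against the reported 6D ToolFlowNet entry, and the same call with `0.328, 0.030` covers Direct Vector MSE.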

[Plot panel. Run sets: Direct Vector MSE (5 runs), ToolFlowNet (5 runs), Direct Vector MSE Intrinsic (5 runs).]