Drake: geometry: SceneGraph may cause non-deterministic simulation?

Created on 5 Jun 2020 · 5 Comments · Source: RobotLocomotion/drake

EDIT (2020/07/17): Sub-issue of #13506

See:
https://github.com/RobotLocomotion/drake/pull/13505#pullrequestreview-425063885

@SeanCurtis-TRI Passing the buck to you on this one ;) Feel free to reassign!

\cc @DamrongGuoy @sherm1

geometry proximity dynamics

All 5 comments

Adding @rpoyner-tri to this. Possibly a different problem than #13506 which has fewer moving parts.

I remember something that I saw in FCL a long time ago. I mention it here in case it is relevant.

I saw these lines in FCL:

  NodeType* p = n->parent;
  if(p > n)

https://github.com/flexible-collision-library/fcl/blob/ff832492/include/fcl/broadphase/detail/hierarchy_tree-inl.h#L769

It compares two pointers, p and n. It is inside a function named sort (though it does not look like sorting to me):

template<typename BV>
typename HierarchyTree<BV>::NodeType* HierarchyTree<BV>::sort(NodeType* n, NodeType*& r)

https://github.com/flexible-collision-library/fcl/blob/ff832492/include/fcl/broadphase/detail/hierarchy_tree-inl.h#L766

It is called when FCL balances its bounding-volume-hierarchy tree, in:

template<typename BV>
void HierarchyTree<BV>::balanceIncremental(int iterations)

https://github.com/flexible-collision-library/fcl/blob/ff832492/include/fcl/broadphase/detail/hierarchy_tree-inl.h#L274

I'm not claiming that it definitely causes the non-determinism that we saw. Without spending more time, I cannot understand what the code is doing. Furthermore, I do not know whether it is called by Eric's example.

EDIT(eric): Permalink'd

Interesting -- would definitely cause different behavior with randomized addresses.
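
To illustrate why an address comparison like `p > n` can matter, here is a minimal sketch. It is not FCL code: `Node`, `IdsInAddressOrder`, and `IdsInIdOrder` are hypothetical names made up for this example. It assumes only that some decision depends on the relative order of two node pointers. Ordering by address follows wherever the allocator placed the nodes, while ordering by a stable id does not.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Hypothetical stand-in for a tree node; this is not FCL's NodeType.
struct Node {
  int id;  // Stable identifier, independent of where the node lives in memory.
};

// Returns the node ids ordered by node address. The result depends on memory
// layout, so it may change between runs when addresses are randomized.
// std::less provides a total order over pointers.
std::vector<int> IdsInAddressOrder(std::vector<const Node*> nodes) {
  std::sort(nodes.begin(), nodes.end(), std::less<const Node*>());
  std::vector<int> ids;
  for (const Node* n : nodes) ids.push_back(n->id);
  return ids;
}

// Returns the node ids ordered by the stable id. The result is the same
// regardless of where the nodes were allocated.
std::vector<int> IdsInIdOrder(std::vector<const Node*> nodes) {
  std::sort(nodes.begin(), nodes.end(),
            [](const Node* a, const Node* b) { return a->id < b->id; });
  std::vector<int> ids;
  for (const Node* n : nodes) ids.push_back(n->id);
  return ids;
}
```

If three nodes sit in one array with ids 2, 0, 1 in that memory order, `IdsInAddressOrder` returns 2, 0, 1 and `IdsInIdOrder` returns 0, 1, 2. With heap-allocated nodes the address order is whatever the allocator produced on that run.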

BTW I gave @rpoyner-tri a small patch that would protect Drake from FCL's BVH's re-ordering of pairs. (It can be found here). Reportedly, it eliminated some of the problem, but not all of it.
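
For readers wondering what "protecting from re-ordering of pairs" could look like in general, here is a sketch of one common approach. It is not the patch mentioned above, whose contents are not shown in this thread. `IdPair` and `CanonicalizePairs` are hypothetical names, and the sketch assumes each geometry has a stable integer id. The idea is to collect the candidate pairs first, then put them in a canonical order before any order-sensitive processing.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical pair of stable geometry ids; not a Drake or FCL type.
using IdPair = std::pair<int, int>;

// Returns the pairs in a canonical form: within each pair the smaller id
// comes first, and the pairs themselves are sorted lexicographically.
// The output no longer depends on the order in which the broadphase
// reported the pairs, nor on which geometry it listed first.
std::vector<IdPair> CanonicalizePairs(std::vector<IdPair> pairs) {
  for (IdPair& p : pairs) {
    if (p.second < p.first) std::swap(p.first, p.second);
  }
  std::sort(pairs.begin(), pairs.end());
  return pairs;
}
```

Two traversals that report the same candidates in different orders, for example (3, 1), (0, 2) and (2, 0), (1, 3), both canonicalize to (0, 2), (1, 3). The cost is an extra sort per query, which matters only if the pair list is large.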

My claims are these:

  • the causes and remedies written in #13736 cover all the phenomena speculated about in this ticket.
  • there is still a Discrete,Reuse problem, papered over by the infamous Initialize() call in #13506.
  • those two cause/effect chains are independent -- hacking one does not cure symptoms of the other and vice-versa.
  • hacking and/or fixing those two problems will completely cure the symptoms of Eric's original program (python experiments still TBD).

Therefore, I will close this ticket in favor of #13736 and do what I can to sharpen up the remaining tickets.
