drake 🚀 - Output KinematicsCache from RBP

I think this design proposal is a super-confusing way to address the challenges that #3316 was trying to solve, and that #3316 was more on the right track.

jwnimmer-tri on 1 Sep 2016

I am open to close this issue and re-open #3316. But let me see if I can share with you some thoughts.
Just think about how the design in #3316 would get connected.

Example 1. RBP --> RbpTransalator --> BotVisualizer
In order for RbpTransalator to not recompute the KinematicsCache it'd need it as an input form RBP. So the desing here proposed does not add this new AbstractValue port but it was implicit in #3316 (in case with "super-confusing" this is what you were referring to).

Example 2. SimplePendulum --> RbpTransalator --> BotVisualizer
In this case SimplePendulum does not have a KinematicsCache. Somebody needs to compute it. Well, in that case it seems we'd then need to leave the KinematicsCache port unconnected and then RbpTransalator would compute the kinematics cache.

So, all the super-confusing components were already present in #3316. Actually here I am just simplifying to two classes instead of three. Again, I might be missing something. Let me know what you think.

amcastro-tri on 1 Sep 2016

This is tagged as high priority; is that true? Is the only top-level goal to be able to write something like BotVisualizer? I feel like a whiteboard would really help, which means deferring to next week.

jwnimmer-tri on 1 Sep 2016

And a System 2.0 versions of RosTfPublisherSystem, and ImuSensorSystem. OK for deferring till next week.

liangfok on 1 Sep 2016

Yes, that is why the high priority. It should be easy to implement once we agree on what exactly to implement. The high priority is because we at least need RBP, and BotVisuzlizer to simulate something and visualize it (as done with the Kuka demo).
I like the idea of a f2f with whiteboard! and for next week, yes.

amcastro-tri on 1 Sep 2016

KinematicsCache is going to have a short lifetime since its functionality should be absorbed by System2's built-in caching facility. So I don't think we should be designing in any new dependence on it (unless this is meant as a short-term hack). Exposing meaningful information on output ports makes more sense and is more general since any System could output such data, while KinematicsCache is only available from Systems that have a RigidBodyTree embedded somewhere.

sherm1 on 1 Sep 2016

Is it possible to incrementally transform the KinematicsCache into System2's cache? Are we still planning on incrementally morphing RigidBodyTree into the version used by Dynamics 2.0?

liangfok on 1 Sep 2016

Is it possible to incrementally transform the KinematicsCache into System2's cache?

That will be much easier if we don't expose KinematicsCache outside RBTree and RBPlant. Then we can gradually move its contents in the Sys2 cache until it is empty.

sherm1 on 1 Sep 2016

Sharing of KinematicsCache (or whatever will replace it in the future) is also pretty much required for an efficient implementation of RigidBodySensors. Perhaps a quick and dirty solution is to just construct these RigidBodySensors with a reference to the KinematicsCache internal to RigidBodyPlant for now (which should by the way not be constructed on every call to EvalTimeDerivatives!).

tkoolen on 4 Sep 2016

Would it be too against the spirit of System 2.0 to have RigidBodyPlant maintain a pointer to a mutable KinematicsCache object and all consumers of the KinematicsCache maintain a const reference to it?

I hesitate to recommend such a design because it feels like it's a side-channel around System 2.0's port or cache abstractions.

liangfok on 5 Sep 2016

System 2 has its own nascent caching mechanism which should not be strangled in the womb by outputting some other caching mechanism!

sherm1 on 5 Sep 2016

👍1

For future reference, I'll restate an example I gave on slack earlier: consider a robot with 20 accelerometers attached to it (we've actually discussed doing this for Valkyrie; this is not some contrived example). If every RigidBodyAccelerometer has to redo the work to compute the data currently stored in KinematicsCache, that would be terrible for performance. In my view, this demonstrates the need for some sort of sharing of cached results between Systems.

So what is the long-term vision for this? The discussion on slack seemed to be leaning towards only outputting a minimal representation of state (q and v for a RigidBodyPlant), and that cache is only an implementation detail, hidden to someone assembling a Diagram. That would imply that sharing cached results should be done 'underhand'.

Instead, I'd like to argue that passing data between systems should always be done through ports; otherwise this nice, decoupled system design is in name only. So I'm in favor of @amcastro-tri's original plan of having a KinematicsCache output port for RigidBodyPlant (or whatever the cache for a RigidBodyPlant will be renamed to in the future).

tkoolen on 6 Sep 2016

How about instead of having _one_ output port that contains a KinematicsCache object, we have _n_ output ports, one for each of the _n_ intermediate values stored in what is now the KinematicsCache? We can always add more output ports if and when we want more intermediate values.

Maybe RigidBodyPlant can optimize itself by detecting whether there are any downstream consumers of a particular output port and only computing the intermediate value for that output port if (1) it has to anyway and (2) there are downstream consumers of the value on the port. I'm actually not sure if System 2.0 currently supports this optimization.

liangfok on 6 Sep 2016

The System 2 cache is contained inside the Context and is available to all subsystems for every computation performed at run time (in a carefully managed way that guarantees up to date results). In a sense it is _always_ exported implicitly. It should never be exposed explicitly on an output port.

If every RigidBodyAccelerometer has to redo the work to compute the data currently stored in KinematicsCache, that would be terrible for performance.

Agreed! That would never happen making proper use of the System 2 facilities.

sherm1 on 6 Sep 2016

The System 2 cache is contained inside the Context and is available to all subsystems for every computation performed at run time

I wonder what that API would look like. Right now we can access subsystem contexts like done here for the PID controller. The access can be done if you have the pointer to the system which subcontext you want to access to.
Consider a simple example like BotVisuzlizer, should BotVisualizer keep a (non-owning) pointer to the RigidBodyPlant?

amcastro-tri on 6 Sep 2016

@sherm1, hmm, I mean it works, but the output ports of a RigidBodyPlant are kind of just for show then (except for properly setting up the dependency graph). RigidBodySensors would be able to do their work just fine without any information from RigidBodyPlant output ports.

I suppose maybe you're planning to require having access to the state vector in order to access the cache for a system (this is not what's currently implemented), but in that case, why not just have a scalar output port, outputting a hash value computed from the state?

tkoolen on 6 Sep 2016

From @sherm1:

If every RigidBodyAccelerometer has to redo the work to compute the data currently stored in KinematicsCache, that would be terrible for performance.

Agreed! That would never happen making proper use of the System 2 facilities.

I agree as well, and for the same reasons. Of course, this would be clearer if the System2 cache facilities actually existed, so I will dust off that branch this week. It was previously PR #3135. We tabled it because there were competing priorities, and because @sherm1 and I have an unresolved design disagreement. That disagreement is tangential to this thread, where we entirely agree.

From @tkoolen:

Instead, I'd like to argue that passing data between systems should always be done through ports; otherwise this nice, decoupled system design is in name only.

I absolutely, 100% agree with this. If we allowed mutable members of System<T> subclasses, the whole framework would be pointless.

So I'm in favor of @amcastro-tri's original plan of having a KinematicsCache output port for RigidBodyPlant (or whatever the cache for a RigidBodyPlant will be renamed to in the future).

As an interim solution, I think this (or anything similar) is fine. It also gets at an important point. The System2 cache infrastructure will supersede the aspects of KinematicsCache that manage data lifetime. It will _not_ supersede the aspects of KinematicsCache that are a notation for the result of particular, expensive computations about RigidBodyTree. We will still need that notation, to be stored in System2 cache, and to be transmitted on a RigidBodyPlant output port to downstream consumers.

david-german-tri on 6 Sep 2016

Sigh. My post crossed the most recent from @amcastro-tri and @tkoolen in flight.

I do not think @sherm1 meant to claim that multiple subsystems would _share_ a cache. If he did, then I strongly disagree, for the reasons @tkoolen mentioned! I believe Sherm meant, and I also would assert, that (a) each subsystem always has its own _private_ cache available and (b) cache entries which depend on a given input are invalidated whenever that input changes.

So, a downstream consumer of KinematicsCache can check whether its KinematicsCache input has changed, and redo dependent computations only if it hasn't.

david-german-tri on 6 Sep 2016

I do not think @sherm1 meant to claim that multiple subsystems would share a cache.

Right! What I meant is that the full cache is always lurking behind every computation within a Diagram so that (as managed carefully via input and output ports) all needed information can be obtained without (a) passing caches around, (b) recomputing anything that has already been computed, or (c) computing things that don't get used. Each subsystem sees only its own cache, but the framework as a whole has the Diagram perspective and can efficiently deliver computations between subsystems via the ports.

sherm1 on 7 Sep 2016

👍1

OK, that makes sense. Back to the original issue :-). What ports do you think should be added to RigidBodyPlant to enable efficient RigidBodySensors and visualizers?

tkoolen on 7 Sep 2016

fyi - this is also blocking the valkyrie sim.

RussTedrake on 8 Sep 2016

To get the conversation started how about let's make it output the following at least initially? We can always add more output ports with additional data as necessary.

A vector of joint positions (q). Meta data include model instance IDs and joint names.
A vector of joint velocities (q_dot). Meta data include model instance IDs and joint names.
A newly defined WorldVisualizationDescription object that contains, for each RigidBody in the world, a description of how it should be visualized. It should contain all of the information needed by BotVisualizerSystem::initialize_drake_visualizer().
A newly defined RigidBodyPoses object that contains, for each RigidBody in the world, its ~~6-DOF pose~~ position and quaternion orientation in the world frame. It should contain all of the information needed by BotVisualizerSystem::DoPublish().
A newly defined ContactWrenches object that contains, for each contact in the world, the two bodies in contact and the contact wrench between them.

liangfok on 8 Sep 2016

@liangfok asked me to comment on this

@liangfok, i think that the conceptual output of the rigid body plant is very clear -- it should contain the generalized state of the plant (which is positions and velocities). in my view most of the discussion here is about passing the result of computations on that state (transforms, accelerations, etc) simply to make downstream computations more efficient for sensors, etc.

RussTedrake on 9 Sep 2016

👍1

Yes, the additional state that we were thinking about outputting from RigidBodyPlant is indeed mainly for computational efficiency. Another reason I've heard is to prevent downstream systems from needing to have a const reference to the RigidBodyTree (i.e., an encapsulation argument).

I believe one of the original instigators that resulted in this discussion is https://github.com/RobotLocomotion/drake/pull/3228#issuecomment-242509888, where it was pointed out that BotVisualizer 2.0 could be vastly simplified if it were provided rigid body pose information.

In https://github.com/RobotLocomotion/drake/issues/3325#issuecomment-244989985, @tkoolen makes a strong argument for why efficiency is important with respect to supporting sensors.

In System 1.0, the RigidBodyPlant and Sensors were all inside RigidBodySystem, which sidesteps the question of what RigidBodyPlant should output.

liangfok on 9 Sep 2016

i think that the conceptual output of the rigid body plant is very clear -- it should contain the generalized state of the plant (which is positions and velocities)

@RussTedrake, I disagree; I think that the _state_ of a rigid body plant is very clear, but I don't think that the state should be the only possible output presented on a rigid body plant's output ports, even conceptually.

tkoolen on 10 Sep 2016

I will summarize here a f2f conversation we just had with @sherm1 about this.

About visualizing a `RigidBodyPlant`

First, We thought that a more appropriate name for System 2 BotVisualizer would be something like RbtLcmPublisher (other similar name suggestions are welcome) so that it is abolutely clear what this system does: it publishes LCM messages for our drake-visualizer that are only valid to visualize a RigidBodyTree.
Also, it then makes sense that this system is then initialized from a constant reference to a RigidBodyTree.

Second, RigidBodyPlant would have output ports for each of its RigidBody poses so that they can be inputs to RbtLcmPublisher. Since outputs in System 2.0 will be cached, accessing these from RbtLcmPublisher won't incur in any additional computation solving the problem of unnecessarily re-doing computations.

About visualizing a system like a `Pendulum`.

We think that in a case of a system like say a simple pendulum that clearly has all the information to plot itself and was written by a user with a clear idea in mind of how that system would be visualized, it makes sense for this system to also output the poses for its links.
Therefore System 2.0 pendulum would output the rigid body pose of its link. Similarly for other systems like say SpringMassSystem.

amcastro-tri on 10 Sep 2016

I think it will be a common occurrence for Systems that connect to the RigidBodyPlant output to want the position and velocity for all bodies, not just some. The above proposal reads to me like it proposes one port per body. Offering that is fine if you want, but I would also like to see offered a single output port that provides the entire state for all bodies (in addition to whatever other outputs we also end up offering). The so-called RbtLcmPublisher would use that all-in-one port, of course.

jwnimmer-tri on 10 Sep 2016

I would also prefer a single port that contains all of the poses. It seems cleaner and more tidy that way.

liangfok on 10 Sep 2016

RigidBodyPlant would have output ports for each of its RigidBody poses

and

I would also like to see offered a single output port that provides the entire state for all bodies

This is a difficult one.

I think the 'small' output ports for each of the RigidBody poses should exist in any case, because otherwise you're probably requiring some systems downstream to accept more data than what they need to do their jobs: 'fat' output ports require fat input ports. This would be kind of like only wanting a little pet rat, but getting the whole pet store and having to deal with it. Fat output ports increase coupling in the code base, and make it harder to write unit tests for individual systems. Small output ports provide better data abstraction.

But should there be a fat output port with all poses in addition to the small output ports? I could be persuaded both ways, but I'm leaning towards a "no", just because it feels a bit redundant. Maybe what's needed instead is some syntactic sugar for connecting a whole bunch of ports in a one-liner?

RbtLcmPublisher

I prefer not to abbreviate RigidBodyTree (or rather, to abbreviate in general). LCM is a very well-established abbreviation, but RBT is more likely to confuse people that are new to the code base.

tkoolen on 10 Sep 2016

:+1: for providing sugar for wiring up a whole bunch of related ports in one line of code.

Question: Is it OK for the number of output ports to be dependent on the model that's being loaded? In other words, will the decision to make the number of output ports proportional to the number of rigid bodies introduce too much coupling between the diagram and the model (URDF, SDF, etc.) that it can load? It would be a bummer to have to recompile the code just to support a URDF or SDF with a different number of DOFs.

liangfok on 10 Sep 2016

I still think the output of the RigidBodySystem should always be the generalized state. We could have a sensor/modifier that turns that into poses of some end effector. Again, the design which was thought out and discussed (at length) before is described briefly in
https://github.com/RobotLocomotion/drake/blob/master/drake/systems/plants/RigidBodySystem.h#L15

I think we should either consider RigidBodyPlant to be what I was calling RigidBodyTree in that doc, or include sensors/actuators and call it RigidBodySystem (where the output is now the sensors).

RussTedrake on 10 Sep 2016

OK. I suppose we can stick with the System 1.0 design of lumping the RigidBodyTree, Sensors, and ForceElements all into a single System 2.0-based RigidBodySystem.

@RussTedrake, just to be clear, are you opposed to having this system also output the individual rigid body poses in addition to its generalized state? Only outputting the generalized state will force downstream systems like BotVisualizer to recompute the pose information.

Oops, I just now understood the comment above saying sensors can be added that output the poses of each rigid body. In that design, the generalized state and pose information will all be lumped into a single _very fat output port_ (a port containing a very long BasicVector). This is not necessarily a regression since it is the current state of System 1.0. I suppose System 2.0 is better in that the output port can output a subclass of BasicVector, which then provides named accessors to the data within it. All of the index bookkeeping can be hidden within this subclass of BasicVector.

liangfok on 10 Sep 2016

Thinking about this some more, I suppose there's nothing preventing us from aggregating the RigidBodyTree, Sensors, and Force Elements all into a single System 2.0-based RigidBodySystem, but still have multiple output ports, one for the generalized state of the tree, and other ports for each individual sensor and force element.

liangfok on 10 Sep 2016

lumping the RigidBodyTree, Sensors, and ForceElements all into a single System 2.0-based RigidBodySystem.

That works too.

I still think the output of the RigidBodySystem should always be the generalized state.

I really think that just can't work because of https://github.com/RobotLocomotion/drake/issues/3325#issuecomment-244989985. If sensors are separate systems (as was the case in the design you linked) and the only outputs available to the sensors are q and v, then you either have to redo lots of work, or share caching between systems, making the port-based design be just for show.

tkoolen on 10 Sep 2016

@jwnimmer-tri, @liangfok. Yes, I am sorry if I didn't explain correctly but my proposal is to provide a single port with all body poses, not a port per body.

@RussTedrake

I still think the output of the RigidBodySystem should always be the generalized state. We could have a sensor/modifier that turns that into poses of some end effector.

Yes, the proposal here is to have that output port with the state plus an additional port with poses that could for instance be used by the bot visualizer. The reason for this port is to avoid re-doing computations in the bot visualizer which were already performed by the RigidBodyPlant. These poses (and potentially other kinematic quantities) could additionally be used by sensor systems which then won't require to recompute the body poses or any kinematics in order to figure out say an acceleration.

Thinking about this some more, I suppose there's nothing preventing us from aggregating the RigidBodyTree, Sensors, and Force Elements all into a single System 2.0-based RigidBodySystem

I do not like that idea. Dividing our system into a RigidBodyPlant and into a bigger RigidBodySystem containing now additional sensors and actuators IMO allows for greater flexibility and a more incremental development. That discussion does not belong to this issue though.

amcastro-tri on 10 Sep 2016

Please let me summarize the plan of action into what I think covers all of your concerns. I am intentionally leaving out the discussion of sensors/actuators and other systems that can be attached to a RigidBodyPlant because System 2.0 now allow us to develop into smaller increments by composing simpler systems into a Diagram. RigidBodyPlant ONLY contains the RigidBodyTree model of the world with:
Inputs: the actuators' generalized forces.
Outputs: state + convenience output port (a single one) with bodies' kinematics.

The later can be used for either visualization or by smaller systems like sensors so that this information is already cached and does not need re-computations.

The output port with the system kinematics will be wrapped in a nice API so that a request on per-body basis can easily be performed as well. So this output port won't just be a bare array of doubles but more like an array of RigidBodyKinematics (for the lack of a better name).

amcastro-tri on 10 Sep 2016

@amcastro-tri I am in general agreement with you, but was only considering mirroring System 1.0's approach of combining everything into a RigidBodySystem based on @RussTedrake's comments above.

In what way does making sensors and force elements independent systems increase flexibility and incremental development? Even if they were contained in a RigidBodySystem along with a RigidBodyTree, they would still be their own independent objects that are instantiated at run-time either programmatically or via a specification file like SDF. By combining them into a single system, there will be no ports between the tree, sensors, and force elements, which eliminates the concern that they are "just for show".

liangfok on 10 Sep 2016

I am in general agreement with you

Good.

but was only considering mirroring System 1.0's approach

System 2.0 does not need to mirror System 1.0 (of course we still have similar concepts, but not implementations). We can do better with @david-german-tri's design now.

In what way does making sensors and force elements independent systems increase flexibility and incremental development?

A simple example: #3245 could be merged right now as it is without having to design right now the right abstractions to add sensors. Even adding the output poses I propose here would be a separate PR. IMO that is incremental development.

How it is more flexibly to design sensor systems? well, that'd require a f2f or VC but I don't think that belongs to this issue anyway.

amcastro-tri on 10 Sep 2016

I see, so you're saying incremental w.r.t. pull requests. That is true.

However, would you agree that lumping the tree, sensors, and actuators into a single RigidBodySystem does not enable additional incremental development w.r.t the creation of a model?

liangfok on 10 Sep 2016

@tkoolen: I think the 'small' output ports for each of the RigidBody poses should exist in any case, because otherwise you're probably requiring some systems downstream to accept more data than what they need to do their jobs: 'fat' output ports require fat input ports.

In System 2 there is no cost to having a fat poses port even if only a few poses are used, because RigidBodyPlant has to compute them anyway. When the System 2 implementation is complete there will be no copying from output ports to input ports. The connecting lines in a System Diagram should not be thought of as pipes along which data flows, but just an expression of the connectivity. So picking a particular pose from the output port is actually just obtaining a reference to a particular element of RBPlant's internal pose cache.

There would be nothing to prevent us from adding options to RBPlant to make it produce more specific output ports. That would be a performance win for quantities that wouldn't otherwise be computed, but for poses we wouldn't gain anything.

sherm1 on 10 Sep 2016

👍1

@sherm1, I wasn't talking about runtime cost.

tkoolen on 10 Sep 2016

I wasn't talking about runtime cost.

Oh. To me it seems messier to add construction options to an RBPlant to generate body-specific pose outputs, then use those to make a custom-wired Diagram, than simply to select the poses you care about from the always-available general supply. Same goes for state -- it's easy just to output the whole x even if downstream only needs some of it.

sherm1 on 10 Sep 2016

I was just thinking that if we end up lumping RigidBodyTree, Sensors, and Force Elements all within a single System 2.0 version of RigidBodySystem, we might as well also throw BotVisualizer into RigidBodySystem. I'm not sure if this is an argument for or against lumping everything together. Just something to think about. Maybe the port-based architecture is only for connecting things that are less coupled like perception ←→planner ←→ controller ←→ plant?

liangfok on 10 Sep 2016

@tkoolen et al. Perhaps I've missed some of the discussion. Is passing caching between systems really so bad? i think that's exactly the right thing to do. I do not like the idea of passing out additional kinematic outputs, because it is conceptually incorrect (over-specified). If we are doing more complicated design/analysis on these systems, then we want to have the conceptually correct (and minimal) signals passing between them -- not just for show. The caching can make the runtime performance ok.

RussTedrake on 11 Sep 2016

@RussTedrake, I believe @david-german-tri's post above (https://github.com/RobotLocomotion/drake/issues/3325#issuecomment-245105482) concisely describes the System 2.0 vision of caching. Basically, each system will have its own private cache that's _not_ shared with other systems. However, the Diagram itself will have a cache that ensures efficiency within its constituent systems.

liangfok on 11 Sep 2016

@RussTedrake, KinematicsCache will be replaced (gradually) by System 2's more-general caching mechanism so we don't want to expose KC in a System 2 system. Sys2 has more information so can trigger computation when necessary and avoid computing expensive quantities that aren't being consumed.

sherm1 on 11 Sep 2016

If that is the model, then i would argue for the rigodbodysystem approach, where the single system contains the list of sensors and actuators -- acting effectively as a very special case of diagram where cache sharing is allowed.

Perhaps the botvisualizer tools also live as a part of this system so they share the same privilege?

RussTedrake on 11 Sep 2016

@sherm1 -- totally agreed that the more general caching mechanism will be in play here. But is that mechanism restricted to a single system, or can it be shared across systems?

RussTedrake on 11 Sep 2016

is that mechanism restricted to a single system, or can it be shared across systems?

It can be shared among subsystems contained in the same Diagram (however deeply nested). Sharing is carefully controlled though to prevent out-of-date references. Subsystem A can do anything it wants with its own portion of the Context (which contains the cache), but if it wants to use something from subsystem B that is mediated by the Diagram.

sherm1 on 11 Sep 2016

Are we accidentally delaying ourselves by prematurely optimizing things? @sherm1 has previously emphasized that the entire caching mechanism is _optional_ and that it will be easily togglable on / off. From a system user's point of view, the cache does not even exist. It will only have non-functional impacts on the overall execution of the software.

Given the above, how about let's charge forward implementing the RigidBodyPlant, Sensors, ForceElements, BotVisualizer, etc., as individual System 2.0 systems that input and output only the minimal generalized state (and have multiple instantiations of RigidBodyTree as necessary)? We can then perform some benchmarks to prove that we have a performance problem, which will justify (1) a sophisticated caching mechanism, (2) exposure of non-minimal state through ports, and/or (3) lumping multiple software abstractions into a single RigidBodySystem so they can share the same cache.

liangfok on 11 Sep 2016

In https://github.com/RobotLocomotion/drake/issues/3325#issuecomment-246115243, I meant sharing caching 'under the table' - not through ports. That's what I'm against, and so is @david-german-tri. If q and v are the only outputs, but all of the sensors actually only interact with cache that's not explicitly shared through a port, then the output port is just for show in my opinion.

i would argue for the rigodbodysystem approach, where the single system contains the list of sensors and actuators -- acting effectively as a very special case of diagram where cache sharing is allowed.

This is certainly a viable option. Clearly we're having trouble defining the output ports of RigidBodyPlant, so maybe we just haven't drawn the system borders in the right place. On the other hand, if we're going to have sensors with state at some point (hysteresis perhaps), then it's going to be awkward to combine the state of the sensor with the state of the RigidBodyPlant, while it would be trivial if they were their own systems hooked up in a Diagram.

These are just some considerations, I don't know what the right answer is, but I'm still modestly in favor of a design with many small output ports that are created (selectively) upon user request after constructing the RigidBodyPlant but before wiring up the Diagram.

tkoolen on 11 Sep 2016

👍1

@liangfok, I don't think it's premature. It's pretty clear to me that if e.g. each contact sensor system has to redo all of the kinematics, collision checking, and contact force computation, then that's going to be bad for performance.

tkoolen on 11 Sep 2016

👍2

I agree it will be bad performance. However, I'm unconvinced that it will be so bad as to be unusable. Having baseline numbers will be super useful to guide the discussion and design choices we make.

liangfok on 11 Sep 2016

@RussTedrake, thinking about it some more, the thing I'm proposing ('many small output ports created upon user request') is actually the same as the original RigidBodySystem design with RigidBodySensors; I was just using different terminology. Each small output port would have the same role as a RigidBodySensor in the original RigidBodySystem design. Hysteresis and noise can be handled by hooking up a (future) sensor noise model to the perfect information coming out of the small output ports.

tkoolen on 11 Sep 2016

@tkoolen, in System 1.0, the outputs of RigidBodySensor objects are lumped together with the generalized state of the RigidBodyTree forming one potentially gigantic output coming out of the System 1.0 RigidBodySystem. I assume you're trying to avoid this in System 2.0? I would agree with that.

liangfok on 11 Sep 2016

to summarize my stance, there are two proposals that I have seen before which I agree with.
1) RigidBodySystem, which contains a list of sensors, etc, with the output being those sensors. a fullstatefeedback sensor would output the generalized state if it is present. that was the design i initially implemented in system1.0
2) RigidBodyPlant, who's output is only the generalized state. sensors, etc are wired up using Diagram, and the Diagram's caching mechanism (eventually) optimizes away repeated calls. @tkoolen -- in this model I am not proposing that anything is hidden. the sensor blocks would still call the relevant methods, like forwardKinematics, on the generalized state, but those calls would be made essentially free because of the caching mechanism.

I like 2 better, and do agree with @liangfok that we could put it in place without worrying about performance first. but I'm also ok with 1.

RussTedrake on 11 Sep 2016

@liangfok, ah yes, I'd forgotten about that. Yeah, let's not concatenate all measurements together.

For proposal 2 in @RussTedrake's comment, the blocking issue that needs to be resolved is whether or not caches are really private to each subsystem (but possibly shared explicitly via output ports), or shared with the whole diagram (implicitly, not through ports).

Just to summarize the discussion on this topic:

@RussTedrake is in favor of sharing caches implicitly with the whole diagram (https://github.com/RobotLocomotion/drake/issues/3325#issuecomment-246153583).
@david-german-tri (https://github.com/RobotLocomotion/drake/issues/3325#issuecomment-245101995, https://github.com/RobotLocomotion/drake/issues/3325#issuecomment-245105482) and I (https://github.com/RobotLocomotion/drake/issues/3325#issuecomment-245099507) are in favor of keeping caches private to each subsystem.
I'm a little unclear on @sherm1's stance (https://github.com/RobotLocomotion/drake/issues/3325#issuecomment-245092887, https://github.com/RobotLocomotion/drake/issues/3325#issuecomment-245354356, https://github.com/RobotLocomotion/drake/issues/3325#issuecomment-246155598).

If we end up going with subsystem-private caches, the next decision is whether to have many small RigidBodyPlant/RigidBodySystem output ports, or one, or a few fat output ports.

@amcastro-tri and @sherm1 (https://github.com/RobotLocomotion/drake/issues/3325#issue-174531541, https://github.com/RobotLocomotion/drake/issues/3362#issuecomment-245170332), as well as @liangfok (https://github.com/RobotLocomotion/drake/issues/3325#issuecomment-245548677) are in favor of fat output ports.
I am in favor of many small output ports (https://github.com/RobotLocomotion/drake/issues/3362#issuecomment-246158590, https://github.com/RobotLocomotion/drake/issues/3325#issuecomment-246160130). @RussTedrake is as well, but would prefer sharing caches so that this isn't relevant in the first place (https://github.com/RobotLocomotion/drake/issues/3325#issuecomment-246168296).

We'll just need to take decisions on these two topics to make progress. Please let me know if I misrepresented your stance.

tkoolen on 11 Sep 2016

👍1

I'm glad there's more focus now! I'll throw in one tangential point in case it helps you all decide.

A fat output port doesn't mean that you need to suffer System 1's manual indexing math (the RigidBodySystem 1.0 idiom with "get_num_positions" / "get_sensor_start_index" and whatever). If you choose the fat port, it should provide a RigidBodySystem-specific subclass of BasicVector that consumers would downcast. That vector subclass can offer any number of API calls to access the underlying data in portions-at-a-time, such as back-referencing to model_ids in the tree, separating position data from sensor readings, etc. The caching of the output and underlying Eigen matrix representation would be consolidated into one output port, but the using-code can be as sugar'd up for convenience as we like.

jwnimmer-tri on 11 Sep 2016

@jwnimmer-tri, I'm aware of that. My reasons for not liking fat outputs so much are outlined here: https://github.com/RobotLocomotion/drake/issues/3325#issuecomment-246087775. But fat output ports certainly are a valid solution to this issue; just a matter of preference.

tkoolen on 11 Sep 2016

👍1

One option is to initially provide both fat ports and small ports, then over time we can see if both are in fact useful for different purposes.

liangfok on 11 Sep 2016

I'm a little unclear on @sherm1's stance [on sharing of cache]

I believe mine is the same as @david-german-tri's: we have a beautiful caching mechanism designed for System 2 that is completely sufficient for efficiently and correctly connecting subsystems via input/output ports. So my stance is that this is a non-issue. The cache should be invisible and the solutions should involve designing useful output ports that expose meaningful data.

sherm1 on 11 Sep 2016

👍2

I was able to chat about this quickly with @RussTedrake and @tkoolen this afternoon. We cleared up a few misunderstandings. (Please correct me if I misrepresent anything!)

First, Russ confirmed that it's not necessarily a requirement that RigidBodyPlant, or other systems, have parsimonious output. In principle, it's OK for RigidBodyPlant to output not just x = [q v], but also other ancillary data like the mass matrix. He also agreed that the System2 cache infrastructure (#3376 et. seq.) can remain private to a System. So, that resolves the major blocking issue from https://github.com/RobotLocomotion/drake/issues/3325#issuecomment-246185918.

Second, Russ pointed out a couple of additional requirements that I hadn't thought of. (a) It should be possible to have a network boundary between RigidBodyPlant and its consumers. (b) If N sensors depend on some ancillary data output of RigidBodyPlant, they shouldn't have to recompute it, even if RigidBodyPlant wouldn't compute it by default.

That suggests an implementation spiral like the following. My primary interest is in the application of the System2 architecture; if I've missed some dynamics details please accept my apologies!

_step 0 (fat ports, no cache)._ RigidBodyPlant outputs KinematicsCache (and, separately, contact forces). It may make sense to rename KinematicsCache to RigidBodyConfiguration or similar, since it wouldn't really be a cache anymore. AIUI this is what @amcastro-tri proposed originally!

_step 1 (fat ports, cache)._ Once the System2 cache is available, KC/RBC can be stored there (and invalidated on changes to x or u) instead of recomputed on every call to EvalOutput.

_step 2, option a (thin ports, cache)._ Once #2890 is resolved, it may be worthwhile to break up KC/RBC into smaller outputs. Then the N sensors mentioned above can depend on foo_matrix_port, and it will be computed exactly once - or zero times, if no one depends on it at all. (@RussTedrake, @tkoolen, this deviates a little bit from our conversation. I realized afterwards that we don't have to resort to mutable members of KC/RBC if we have thin ports.)

_step 2, option b (no ports, cache)._ You could alternatively glom RigidBodyPlant and the sensors into a single System with a shared cache. I don't think anyone sees a good reason to prefer this, but the door wouldn't be closed to it.

david-german-tri on 13 Sep 2016

Just going to add a couple of things I took away from the meeting. Let me know if I failed to understand something. But first:

RigidBodyConfiguration

How about RigidBodyPlantState? Configuration normally refers to only the 'q' (generalized positions) part of the state, and including Plant makes it clear that it's not the state of a single rigid body. I'll tentatively use RigidBodyPlantState below.

@david-german-tri and I were initially confused about how @RussTedrake wanted to 1) output only q and v from RigidBodyPlant, 2) not redo work, and 3) not share RigidBodyPlant's private cache 'under the table'. Russ cleared that up by stating he is in favor of @amcastro-tri's original plan (and his own) of outputting KinematicsCache (RigidBodyPlantState), i.e. an object that contains q and v and can be serialized as just q and v, but also contains ancillary computation results that depend (only) on q and v.

So in a sense, RigidBodyPlantState would be a fancy state vector that has its own 'cache' of q-and-v-dependent computation results, separate from each subsystem's private cache. If the additional computation results are missing (e.g. if a RigidBodyPlantState was received over LCM, serialized as just q and v), then the receiver will just have to do a bit more computation, but the 'cache' in a RigidBodyPlantState will be invisible in the sense of https://github.com/RobotLocomotion/drake/issues/3325#issuecomment-246197657.

Consumers of a RigidBodyPlantState get a const view of it, and so if the consumer receives a state that doesn't have the right ancillary computation results stored, computing and storing those results becomes a problem. Our solution at the meeting was to make the ancillary fields mutable. David still doesn't like this, witnessed by 2a above, but I think it's a correct application of mutable: the RigidBodyPlantState is logically const in the sense that q and v don't change (so the state remains the same), but not bitwise const (see http://stackoverflow.com/a/105061). Do note that currently, computing the mass matrix and dynamics bias term in RigidBodyPlant actually requires all ancillary parts of KinematicsCache to be up to date, so for a simple diagram where RigidBodyPlantState goes straight from RigidBodyPlant to sensors (instead of being serialized in between), this problem disappears.

On a somewhat related note, I think we should move towards not exposing the CacheElements of KinematicsCache through a getElement method, but rather move the methods of RigidBodyTree to KinematicsCache, so that there is no need to expose this caching detail. These methods will then form the interface used by consumers.

RigidBodyPlant outputs KinematicsCache (and, separately, contact forces).

Our reasoning for separating out contact forces is that they may depend on the RigidBodyPlant's inputs in a direct feedthrough manner (although they currently don't, due to the penalty-based ground contact model), making them distinct from the stuff that's currently in KinematicsCache. So @amcastro-tri's ContactResults from https://github.com/RobotLocomotion/drake/issues/3362 sounds like the right way to go.

tkoolen on 13 Sep 2016

Making parts of an output mutable would be a complete end-around the System 2 caching system, identical to exporting the KinematicsCache. We should _not_ do that. If we don't have suitable caching available yet, we can just have the RBPlant do extra work to make sure everything it outputs is valid.

BTW, I agree that caching is an appropriate use of mutable; that is exactly how the System 2 caching system works (will work). But it manages that internally and guarantees correctness.

sherm1 on 13 Sep 2016

So, basically, RigidBodyPlantState _is_ a "fat port". SGTM.

Why not have downstream consumers of it simply make a mutable copy of the const version they receive?

liangfok on 13 Sep 2016

@sherm1, thinking about networking is what convinced me the approach we came up with at the meeting is the right way to go. Consider this: how would you send a ContactResults over the network? I'm using ContactResults here because that is less controversial at this point. Would you send over one giant LCM message that is an exact mirror of all of the fields of ContactResults, with all its implementation details exposed, or would you send over a more minimal representation and then recompute ancillary data on the consumer (receiver) side? I would argue the latter is the right way to go.

If you choose the minimal LCM message, you could have the LCMSubscriber system recompute _all_ the ancillary data from the data received over LCM, and that would work fine. But you might be recomputing more than what the consumer needs. Making that recomputation be lazy - but invisible to the consumer - is the optimization we wanted to implement. If this is not covered by the System 2 caching design, perhaps it should be.

Why not have downstream consumers of it simply make a [non-]mutable copy of the const version they receive?

@liangfok, that's an option, but my argument is that the input will remain const in the way that matters anyway (logically const), even if it's not bitwise const.

tkoolen on 13 Sep 2016

RigidBodyPlantState

SGTM.

Although I agree with @sherm1 that it's a bad idea, it sounds like we should also track an option 2c. Importantly, we don't have to choose among the 2x's to do steps 0 and 1, which will unblock most real use cases.

_step 2, option c (fat port, separate caching)._ Make the ancillary computation results in the KC/RBPS mutable. This allows downstream consumers can write those values back, sharing them with other consumers, without the RBPlant (or LCM receiver) having to know to compute them in advance.

On a somewhat related note, I think we should move towards not exposing the CacheElements of KinematicsCache through a getElement method, but rather move the methods of RigidBodyTree to KinematicsCache, so that there is no need to expose this caching detail. These methods will then form the interface used by consumers.

:+1: to this part. I knew I was forgetting something.

david-german-tri on 13 Sep 2016

👍1

we don't have to choose among the 2x's to do steps 0 and 1, which will unblock most real use cases.

Just step 0 already does this, and it will even make the non-networked case be as efficient as possible: computing the dynamics in RigidBodyPlant requires all ancillary parts of KinematicsCache/RigidBodyPlantState to be up to date.

tkoolen on 13 Sep 2016

@tkoolen: Making that recomputation be lazy - but invisible to the consumer - is the optimization we wanted to implement.

I totally agree and this is how the System 2 caching mechanism should behave. Surprisingly this is somewhat controversial at this point but I hope we'll resolve it soon.

sherm1 on 13 Sep 2016

The controversial thing is sharing cache that is supposed to be private to a subsystem implicitly (i.e., not through output ports) with other subsystems in a diagram (e.g., https://github.com/RobotLocomotion/drake/issues/3325#issuecomment-245105482).

The proposal here is to associate cache with the output of a subsystem, not with the subsystem itself, so that it passes through an output port, and the dependencies are explicit.

Maybe I completely misunderstood, but I was under the impression that the current cache design only associates cache with a subsystem.

tkoolen on 13 Sep 2016

@tkoolen Your understanding is mine also. I think @sherm1 is trying to connect this thread to the question of whether System outputs themselves should be computed in push order or pull/lazy order. I claim that's a totally separate, unrelated issue.

david-german-tri on 13 Sep 2016

The point is that in a networked situation, the RigidBodyPlantState (or rather, a minimal representation of it as an LCM message) may be coming over the network, and there may not _be_ any RigidBodyPlant on the receiver side whose cache consumers can access. The natural thing in this case is to associate the cache directly with the RigidBodyPlantState output, not with the RigidBodyPlant.

tkoolen on 13 Sep 2016

If it helps: another architecture for the networked plant state is to transmit the minimal representation, and then have an explicit System on the receiving side that takes minimal representation as input and provides the supplemental representation as (cached) output. Then you get minimalism, caching, explicit documentation of the computations' structure, and it would (also) be reusable code pattern if you had different plants with different supplemental state.

jwnimmer-tri on 13 Sep 2016

👍1

@jwnimmer-tri beat me to it. But this whole argument is 2a vs 2c and can wait.

david-german-tri on 13 Sep 2016

Completely agree.

tkoolen on 13 Sep 2016

👍1

@jwnimmer-tri, yep, exactly what I meant in https://github.com/RobotLocomotion/drake/issues/3325#issuecomment-246704316 with

you could have the LCMSubscriber system recompute all the ancillary data from the data received over LCM, and that would work fine [... there's a but after this.]

tkoolen on 13 Sep 2016

It sounds like I should abstain from this discussion for the time being :smile:. I am happy to talk though the computation / caching for the future numbered items if and when we need them, if anyone wants.

jwnimmer-tri on 13 Sep 2016

Sorry, didn't mean to scare you away, haha :-)

tkoolen on 14 Sep 2016

I'm good. It's an interesting discussion, just not a problem I need to own.

jwnimmer-tri on 14 Sep 2016

👍1

Maybe I completely misunderstood, but I was under the impression that the current cache design only associates cache with a subsystem.

Caching is supposed to apply to output ports as well (and also other computations like derivatives, discrete updates, energy calculations, and anything a concrete System may want to add besides the general API). Cached output ports provide a disciplined way to allow caching to reach downstream from the System that owns that piece of cache real estate, without having to expose any of the internal workings (even the fact that a cache exists at all should be evident only through performance).

sherm1 on 14 Sep 2016

👍1

Seeing as there is still no consensus on this issue (given #3471), I propose clearing things up with a meeting, this time with _everybody_ involved in this issue.

tkoolen on 19 Sep 2016

👍1

Please consider that accessing an output port triggers the associated computation (that is, the output port's value is guaranteed to be up to date when you see it). Consequently output ports need to be segregated appropriately; asking for body poses should not trigger acceleration computations. Also, a port that includes accelerations is necessarily a direct feedthough from the RBPlant's actuator inputs. That is not what you want from a port that returns kinematics, which is _not_ direct feedthrough and could thus serve as input to a sensor System whose output goes to a controller that generates actuation inputs to RBPlant.

Returning KinematicsCache as a single output port thus implies that everything in it gets calculated at once. That likely means we have to include all the sensor and controller code in RBPlant rather than using System interconnections. That could be done but seems like a complete end-around the System 2 framework.

IMO, a nice solution would involve at least 3 output ports:

one for one-time data such as body names, mass properties, dimensions, and other "parameterizable" quantities
one for state-based outputs (position and velocity kinematics)
one for dynamics outputs (accelerations, reaction forces)

It is not clear to me that we should be exporting internal quantities through output ports (rather than bespoke RBPlant methods) but if we do those would be good candidates for additional ports so that expensive computations (mass matrix, for example) could be delayed unless someone asks for them. (Accelerations can be calculated without forming a mass matrix.)

@tkoolen: I propose clearing things up with a meeting

That seems like a good idea!

sherm1 on 19 Sep 2016

The above seems to be a strong argument for using "thin" ports.

I did not realize that the act of accessing a port triggers the associated computation, though it sounds like a really nice optimization. I'm now thinking about the implications of this fact on the implementation of the system that owns the output port. If a particular output port is accessed only 50% of the simulation cycles, is it the responsibility of the system that owns the output port to only compute the value supplied on that port 50% of the time?

liangfok on 19 Sep 2016

The dataflow model (a.k.a. lazy evaluation) that we're now using never computes anything unless someone asks for it. So an output port that gets accessed occasionally only gets evaluated occasionally. The same scheme works for all the rest of the computations also, like derivatives, discrete updates, energy calculations, initial conditions, and so on. We don't know how often those need to be computed until someone asks.

sherm1 on 19 Sep 2016

The dataflow model ... that we're now using

Not on master yet - that's #3455. (pending your review)

david-german-tri on 19 Sep 2016

Drake: Output KinematicsCache from RBP

Most helpful comment

All 85 comments

About visualizing a `RigidBodyPlant`

About visualizing a system like a `Pendulum`.

Related issues

Drake: Output KinematicsCache from RBP

Most helpful comment

All 85 comments

About visualizing a RigidBodyPlant

About visualizing a system like a Pendulum.

Related issues

About visualizing a `RigidBodyPlant`

About visualizing a system like a `Pendulum`.