Drake: Adding Protocol Buffers to drake

Created on 10 Jan 2017 · 14 comments · Source: RobotLocomotion/drake

After a conversation with @david-german-tri, we think it's about time to switch from YAML to Protocol Buffers for parsing config files. https://developers.google.com/protocol-buffers/docs/cpptutorial
This is new to me, but from what I understand, the biggest benefit we get is that the library validates the config file's format for us before we write any custom parsing code.



All 14 comments

I'm intrigued and supportive of this idea, but am a little apprehensive because this feels like taking a sledgehammer to a small tack. :smile:

I'm drafting a somewhat longer manifesto at https://docs.google.com/document/d/1gGSKUZzBgn25TVNYEeDOR9VP-edljcGbrc8Im1ZkcWk/edit. Interested Drake developers outside of TRI will hit an access wall, but I'll try to grant access requests as quickly as I can. Let's start with discussion there; once it quiesces, I'll serialize it out to this issue, hopefully by the end of the week.

Overview

Protocol Buffers are a structured data serialization format from Google. They shine as a binary format for bytes on the wire, a binary format for bytes on disk, and a text format for human-editable configuration files. The protocol buffer libraries and tools are BSD-licensed.

Protocol Buffers have a number of competitors with similar feature sets, as summarized in this blog post. This document recommends Protocol Buffers over the competitors for three reasons:

  • The Google backing and ecosystem will mean more stability and less work for us.
  • Mutability is really useful.
  • The performance advantages of Cap'n Proto et al. are moot for small messages, and can be mitigated for large messages such as bitmaps or point clouds by escaping to a binary encoding.

Why Drake Needs Protocol Buffers

Configuration Files

Today, YAML is the de-facto standard for configuration files in Drake. As Drake use cases get more complex, the number and complexity of configuration files is growing. Examples include #4638 (alias groups), #4389 (maliput), #3949 (vector generation). Protocol buffers are superior to YAML for managing complex, evolving configuration formats:

  • Protobufs have a schema, and the library validates input data against it.
  • Configuration file readers can understand the configuration by reading the commented schema, without reasoning through Turing-complete code. (example)
  • Protobuf fields have names, which makes it easy to grep for their call sites.
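To make the schema points above concrete, here is a hedged sketch of what a Drake configuration schema and its matching human-editable file might look like. The message and field names are hypothetical, invented for illustration, not an existing Drake format:

```proto
// vehicle_config.proto -- hypothetical schema; names are illustrative
// only. Comments in the schema document the format for readers.
syntax = "proto2";

package drake.examples;

message VehicleConfig {
  // Maximum forward speed, in m/s.
  optional double max_speed = 1;
  // Names of the joints to control.
  repeated string joint_names = 2;
}

// A matching human-editable config file, in protobuf text format:
//
//   max_speed: 2.5
//   joint_names: "shoulder"
//   joint_names: "elbow"
```

The library rejects a text file whose fields or types don't match the schema, which is exactly the validation we currently have to hand-write for YAML.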

Network Messages

Today, LCM is the de-facto standard for network messages in Drake. Protocol Buffers solve at least two well-understood weaknesses of LCM that have bitten us repeatedly:

  • An LCM message is only valid if it has hash equality to the schema. A Protocol Buffer message is valid so long as every field in the message exists in the schema. Thus, it is possible to add a new field to a proto, and gradually update all binaries that send or receive the message to use the new field. In contrast, an LCM protocol, once deployed, can only be changed with a coordinated global release of all affected binaries.
  • In the language-specific Protocol Buffer libraries, a protobuf message inherits from a generic Message type, which supports reflection over the fields.

    • This would make it possible to write truly generic publish/subscribe nodes in Drake, instead of the extension mechanism we have right now.
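As a small sketch of that reflection capability, assuming the Python `protobuf` library is installed: the code below uses `FileDescriptorProto`, a message type that ships with the library, purely as a stand-in for a hypothetical Drake message.

```python
# Sketch: generic field iteration via the Message base class.
from google.protobuf import descriptor_pb2
from google.protobuf import text_format

msg = descriptor_pb2.FileDescriptorProto()
msg.name = "example.proto"
msg.package = "drake"

# Every generated message inherits from Message, so a generic node can
# iterate over whatever fields happen to be set, without knowing the type.
for field, value in msg.ListFields():
    print(field.name, "=", value)

# The same base class provides text-format serialization for free.
print(text_format.MessageToString(msg), end="")
```

A generic publish/subscribe node would use exactly this interface, never mentioning the concrete message type.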

Note that transport is a separate issue from serialization format. The LCM transport has some desirable properties and a lot of inertia. Protocol buffers can be exchanged as raw bytes in an LCM container, or over some other transport. This document does not consider transport further.

GitHub issues #1880 and #1881 touched on a variety of maintenance challenges with LCM, but the threads are entangled with various other historical Drake software architecture issues.

How Drake Can Get Protocol Buffers

Proto Compiler

We would need to add the proto compiler to CI instances, and to the installation instructions. In the Bazel build, we can download it hermetically instead of depending on a system version.

Proto Build Rules

We need build rules for both the proto library itself, and for individual proto message types. There appear to be existing protocol buffer rules for CMake. The Bazel rules are still in-progress upstream, but we can use the gRPC Bazel rules in the meantime.
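For orientation, the per-message targets might look roughly like the sketch below. Target and file names are hypothetical, and the exact rule names depend on which proto rules (upstream Bazel or the gRPC ones) we end up using:

```python
# BUILD -- hypothetical targets for a single message definition.
proto_library(
    name = "vehicle_config_proto",
    srcs = ["vehicle_config.proto"],
)

cc_proto_library(
    name = "vehicle_config_cc_proto",
    deps = [":vehicle_config_proto"],
)
```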

Actual Protocol Messages

Configuration files are the lowest-hanging fruit. We should start there, and think about the network migration later.

:+1:

Lack of backward compatibility in LCM types has been a pain point for a long time. If Drake provides the protobuf tools, then I might also be able to use them as part of Director's communication interface later on, which would be very nice.

@david-german-tri where do you foresee the .proto files living? Will they be in Drake, or in some other repo akin to robotlocomotion_lcmtypes?

where do you foresee the .proto files living? Will they be in Drake, or in some other repo akin to robotlocomotion_lcmtypes?

Good question. I think all the same arguments that apply to .lcm files apply to .proto files, so I'd argue we should follow the same logic as https://github.com/RobotLocomotion/lcmtypes/issues/3. Messages shared across projects go in a shared upstream repository; messages local to a project go in that project. Seem reasonable?

Agreed. It might be nice someday to communicate with Drake without depending on Drake.

Also, I think replacing the yaml configuration with protobuf is reasonable. I picked yaml because it was easy for me to read and write (and I was the only one interacting with those files at the time), and because it just required a small library as a dependency. But it makes sense that we'd want a more structured, parseable format in the future.

The protocol buffer support for Python requires a library, most easily obtained through pip. (The versions in apt are ancient, on both Xenial and Trusty.) I know there's some anti-pip sentiment, but I don't fully understand the reasons. Anyone want to chime in?

I'm pro-pip, personally.

In the Bazel build, it looks like I can get the protobuf library hermetically, so that's what I've done in #4912. I'll continue thinking about how to support CMake; pip may well be the answer.

For the record: my months-ago objection was specifically to sudo pip install in the "must always do this" setup instructions, because it puts content onto the host system that is less strictly managed and cross-integrated than what debs from the OS vendor give us.

Since then, we have added optional setup instructions of pip install --user -U pylint, which places files in the user's home directory. Many of the same problems arise there, but at least it's confined to one user's homedir -- and because it's an optional tool, pedantic users can simply choose not to run it. I think it's a fair compromise when a more robust way of obtaining tools is too expensive.

In addition, users who are concerned about their global python environment can also just use a virtualenv and pip-install into that environment (which is standard practice in python as far as I know).

We have C++ and Python protobufs in the Bazel build. I don't think we'll ever need to add them to the CMake build, so I'm going to close this as completed.

fwiw -- #5786 adds it to the cmake build.
