Drake: Compatibility with ROS2 version of spdlog

Created on 16 Oct 2020  路  23Comments  路  Source: RobotLocomotion/drake

Likely related to https://github.com/RobotLocomotion/drake/issues/7451. Not much of an issue until one tries to use drake::log() on a binary that depends on another library that uses a different spdlog version.

Context

Using ROS 2 Eloquent and latest Drake binaries on Ubuntu Bionic 18.04, (strong) spdlog symbols brought by libdrake_spdlog.so interact with (weak) spdlog symbols brought by librcl_logging_spdlog.so. Eventually, it causes a segfault within rcl_logging_spdlog.
Arguably, neither library attempts to mitigate this risk.

How to reproduce

See spdlog_pollution_repro: branch | commit.

build system linux kitware feature request

Most helpful comment

FYI fmt 7.1.3 made into Debian Sid shortly after New Year's Eve: https://tracker.debian.org/news/1209212/accepted-fmtlib-713ds1-5-source-into-unstable/ :smiley:

All 23 comments

CC @IanTheEngineer @EricCousineau-TRI

Right, it's not supported nor possible to mix and match binary compilations of two different versions spdlog in the same program. You need to compile everything using the same source version. The pre-compiled Drake images are probably not suitable for your use case.

If I recall correctly, another difference you'll find is that the ros2 bundled spdlog uses spdlog's internal vendored fmt, which is a different version that Drake uses, leading to the same kinds of problems. The patch to Drake to disable spdlog entirely would be small (already present in the Bazel build, we'd just need to thread it through to the CMake build), but we can't disable fmt, so there's still an incompatibility lurking.

Ultimately, the build system needs to use the same libraries and headers for any public library dependencies when linked into the same program.

For things like libsdformat, Drake only uses it internally, so we compile it as a static archive with hidden symbols, so it will not cause a conflict.

For the things listed in #7451, the entire program needs to agree on one single version. Especially for widely-used things like fmt and Eigen, I've been thinking we could offer in the Drake CMake build the option to find_package() for those public dependencies, so users could opt-in to libraries provided elsewhere.

From #7451, there are three public libraries that we compile from source within Drake: fmt, lcm, and spdlog. Allowing a CMake user to rebuild Drake while supplying their own compiled versions of those libraries would resolve this request, I think.

@jwnimmer-tri would this also be possible when Drake is an external dependency of a Bazel workspace? For a scenario where a binary ROS 2 distribution is pulled into that Bazel workspace as an external dependency.

In Bazel, the user already has full control of externals:

When calling Drake's add_default_repositories users can opt-out of any dependencies that they have already provided in their own WORKSPACE, by using, e.g., excludes = ["spdlog"]:

https://github.com/RobotLocomotion/drake/blob/7b7bf9e95a88f4b03aca4905f1335625228a60ce/tools/workspace/default.bzl#L88-L96

Alternatively, if the user wants to opt-in to Drake dependencies one by one instead of ingesting Drake's entire WORKSPACE, they should not call add_default_repositories but instead call, e.g., fcl_repository(name = "fcl", mirrors = mirrors) one by one to accept Drake's defaults for specific repositories (or else provide their own, when desired).

https://github.com/RobotLocomotion/drake/blob/7b7bf9e95a88f4b03aca4905f1335625228a60ce/tools/workspace/default.bzl#L134-L135

To clarify, this use case is actually Anzu consuming ROS2. Would that change the considerations here?

(Jeremy, I can page you in on more details if you want.)

Yes.

To obtain compatibility of Anzu with the ROS2 ...

Option 1 is to consume ROS2 as source and compile it ourselves. This could be a lot of work, but could also be the best way to go. In general, building from source will always be the most robust option when dealing with C++ code we intend to link into the same binary as our own source code. We've shirked this with our ROS1 integration because of its minimal use, but if we are planning on non-trivial runtime integration, the investment to rebuild from sources could pay off nicely.

Option 2 is to stick with the ROS2 precompiled binaries. For that we'd need to use the ROS2 precompiled spdlog. That would currently be a problem because ROS2 appears to be configured to use header-only spdlog, and that _severely_ bloats our binaries, which is a regression I want to avoid. If we can get ROS2 to upgrade to a modern spdlog (it seems to have that on master, just not the eloquent branch) and then _also_ enable the shared library, then we could point Anzu to use that pre-compiled spdlog.

Aye. Mich and @sloretz have an Option 3 as well, where we leverage the rcl_logging C interface, intercept it at flag-parsing time, and can tell to it to use Drake's version of spdlog (or just disable logging entirely).

I think we'll try out Option 3 for prototyping, but will keep in mind the above alternatives you mentioned.

Some of the discussion in #13698 partially relates this, but the conclusion there was, I think I am correct in saying, that we knew this was going to bite us at some point.

I don't disagree with any of the above.

Given how release schedules work though, I would like to advocate for investing in getting Ubuntu and ROS cleaned up as much as possible before their next release(s) start the process freezing things and locking in versions.

If ROS could use the Ubuntu versions of spdlog and fmt, that would go a long way towards everyone agreeing on what version (ABI) to use for those libraries. Drake could move to doing the same.

We'd like to have the _newest_ version of them made available in the releases, and also change spdlog config to use the shared library -- the header-only is _vastly_ worse on build times and footprint. If we can get all of that work into the pipeline for 21.04, that would be a great starting point.

If we can get all of that work into the pipeline for 21.04, [...]

Not sure if I understand this part - why even consider a non-LTS release? Do we have plans of having Drake support non-LTS?
Or are you saying for the ROS ecosystem in isolation?

The sooner you get it into an Ubuntu release, the better. You have work with upstream, Debian, and Ubuntu developers. If you wait until 22.04 there is a good chance you will miss it (you really need fixes to be in by 21.10).

Gotcha! Makes sense in terms of focusing non-LTS work on ROS, nothing directly on Drake (for non-LTS).

Note as It happens, one of the issues, the vendored libfmt has finally been picked up from Debian and will be in groovy.

Gotcha! Makes sense in terms of focusing non-LTS work on ROS, nothing directly on Drake (for non-LTS).

Sooner is always better, even for Drake, it just depends on the scale of what is needed. If something new needs packaging, it can sometimes take a year to do so. Fixes are a little less time sensitive, but maintainers are volunteers, and sometimes the same version just rolls over because the maintainer was pressed for time.

@jamiesnape @jwnimmer-tri to recap,

Mich and @sloretz have an Option 3 as well, where we leverage the rcl_logging C interface, intercept it at flag-parsing time, and can tell to it to use Drake's version of spdlog (or just disable logging entirely).

Our current workaround is somewhat different. ROS 2 builds strip RPATHs by default, so we basically inject an librcl_logging_spdlog.so substitute that uses drake::log() under the hood.

If ROS could use the Ubuntu versions of spdlog and fmt, that would go a long way towards everyone agreeing on what version (ABI) to use for those libraries. Drake could move to doing the same.

This is already the case for ROS 2 Foxy packages running on Ubuntu Focal 20.04 (and MacOS Mojave 10.14). You can see here that spdlog is vendored iff it's not present in the system. It also uses spdlog in its shared library form.

We'd like to have the newest version of them made available in the releases, and also change spdlog config to use the shared library -- the header-only is vastly worse on build times and footprint. If we can get all of that work into the pipeline for 21.04, that would be a great starting point.

The sooner you get it into an Ubuntu release, the better. You have work with upstream, Debian, and Ubuntu developers. If you wait until 22.04 there is a good chance you will miss it (you really need fixes to be in by 21.10).

Got it. Specifically, ensure spdlog 1.8.1 and fmt 7.1.0 get into Ubuntu 22.04 (by releasing them ASAP, preferably for 21.04). I'm not familiar with Drake release policies though. How will you ensure compatibility moving forward? I see you bump external dependencies' versions quite often.

Also, it's worth mentioning that ROS distributions do not target non-LTS releases either. Foxy targets 20.04 and so will Galactic (to be released in May 2021).

(Aside: @jwnimmer-tri not @jeremyma-tri)

Thanks for the info.

Specifically, ensure spdlog 1.8.1 and fmt 7.1.0 get into Ubuntu 22.04 (by releasing them ASAP, preferably for 21.04).

Exactly. The precise version doesn't matter that much to us, it's more about just getting as many new features and bugfixes into 22.04 as possible, for the best chance of that version being something we can live with for four years. If we develop the relationships and workflow where we can regularly shepherd new versions of these two into Debian and/or Ubuntu, that pace will help ensure that 22.04 is the best it can be for our needs.

How will you ensure compatibility moving forward? I see you bump external dependencies' versions quite often.

For dependencies _unavailable_ in Ubuntu, Drake generally uses a relatively recent tagged release from upstream and bumps it monthly, especially for private dependencies.

For public dependencies that _are_ available in Ubuntu, we try to the Ubuntu build when possible, to improve binary compatibility with other software in the ecosystem (such as ROS). The question is whether there are any big bugs or missing features in the Ubuntu version. (For example, for Eigen, as of 18.04 we finally use the host version; but in 16.04 and prior it was too buggy for us, so we had to fetch a newer version from source, which was a compatibility headache.)

We haven't spent much time polishing Drake's 20.04 integration yet. There's a chance that we could swap to using the host spdlog on 20.04, even though it's a downgrade. Dealing with the API changes is probably not the hard part. The challenge is likely to be the vendored fmt getting in the way.

I'm sure that we _could_ patch Drake to build against the older spdlog and (vendored) fmt. The question is what features and performance we lose, and whether we're willing to accept that trade-off either for the prebuilt binaries we deliver for many users, or for the Anzu builds within TRI.

I'll take some time to go look and refresh my memory about exactly how fmt and spdlog end up coupled in the source and ABI, and whether we can use inline namespaces or similar to get source and binary compatibility without having to downgrade fmt.

I'll take some time to go look and refresh my memory about exactly how fmt and spdlog end up coupled in the source and ABI, and whether we can use inline namespaces or similar to get source and binary compatibility without having to downgrade fmt.

My refreshed look is the same as it was a few years ago -- spdlog and fmt are intertwined enough that we must re-use whatever fmt sources that spdlog was compiled against.

If we use the focal spdlog 1.5.0, that implies using its bundled fmt 6.1.2.

Homebrew is up-to-date with regard both spdlog and fmt, so versions 1.8.1 and 7.1.2 respectively. Hopefully that Is not too wide a gap to bridge to the versions in Focal.

(I guess #14379 will tell us.)

My proposal for the next change to attempt here is as follows:

On 20.04 Focal, no longer compile fmt nor spdlog from source, nor install our builds of them. Instead, switch to using the host spdlog and its vendored fmt. This will effectively downgrade the version of both.

On 18.04 Bionic, continue compiling fmt and spdlog from source, and continue to install our builds of them. (The host version is _way_ too old.) However, downgrade the pinned versions to match Focal -- it seems wrong to have Bionic be newer that Focal. An upgrade from Bionic to Focal should be strictly better, and if there are problems with the spdlog 1.5.0, I'd like to find them out early, and our day-to-day use is currently on Bionic.

On macOS, no longer compile fmt nor spdlog from source, nor install our builds of them. Instead, switch to using the homebrew formulae. Compared to building our own, this means that our binaries will break out from under us when homebrew upgrades those libraries, but we already have that problem with other libraries, and so we'll just have to live with it. Our macOS releases are not very durable anyway -- generally only the most recent one will function correctly, if that. The mitigation for that will be #12782, once it lands.

This leaves us with macOS using newer libraries than Ubuntu, which has some hazard of CI results shear, but we already have that for many other libraries, and given the small size of #14379 it will hopefully not be a substantial problem.

FYI spdlog 1.8.1 got into Debian Sid today: https://tracker.debian.org/news/1200036/accepted-spdlog-1181ds-1-source-into-unstable/ (thanks for the hint @j-rivero!).

FYI fmt 7.1.3 made into Debian Sid shortly after New Year's Eve: https://tracker.debian.org/news/1209212/accepted-fmtlib-713ds1-5-source-into-unstable/ :smiley:

Resolution from #14427:

On Ubuntu 18.04, we downgrade to spdlog 5.1.0 and fmt 6.1.2, but we still compile them from github source releases.

On Ubuntu 20.04, fmt and spdlog are obtained from the host now. (The host copy of fmt is also installed by Drake, because Ubuntu's packaging is deficient.) The versions are spdlog 5.1.0 with bundled fmt 6.1.2.

On macOS, fmt and spdlog are obtained from homebrew now. The versions are the latest available, per the usual homebrew policy.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

mattcorsaro1 picture mattcorsaro1  路  4Comments

jwnimmer-tri picture jwnimmer-tri  路  4Comments

david-german-tri picture david-german-tri  路  4Comments

peteflorence picture peteflorence  路  5Comments

RussTedrake picture RussTedrake  路  4Comments