Drake: Hidden Visibility, Implicit Instantiation, and the Mach-O Linker/Loader

Created on 25 Aug 2016 · 1 Comment · Source: RobotLocomotion/drake

As discussed on #3236 and #3247, when building with -fvisibility=hidden, the Mach-O linker creates non-external typeinfo symbols for implicit template instantiations in _every_ shared library where that particular instantiation occurs. These typeinfo objects do not compare equal, so if you then pass a template object across dylib boundaries, dynamic_cast will spuriously fail.

This is a time bomb that will lurk in your code unnoticed until you actually instantiate Foo<T> : public Bar in two different dylibs, and then try to dynamic_cast a Bar from one to a Foo<T> in the other. When it goes off, it will be very hard to debug: dynamic_cast will silently return nullptr, on Mac only.
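For concreteness, here is a minimal sketch of the class shape described above. The names Bar, Foo<T>, MakeBar, and AsFooDouble are hypothetical and are not taken from Drake. A single file cannot reproduce the failure: within one binary the cast succeeds, as the check at the bottom assumes. The problem only appears when the object is created in one dylib and cast in another, with both built under -fvisibility=hidden and each holding its own implicit instantiation of Foo<double>.

```cpp
#include <cassert>
#include <memory>

// Non-template polymorphic base class.
class Bar {
 public:
  virtual ~Bar() = default;
};

// Template subclass. Every dylib that uses Foo<double> without an explicit
// instantiation gets its own implicitly instantiated copy, typeinfo included.
template <typename T>
class Foo : public Bar {
 public:
  explicit Foo(T value) : value_(value) {}
  T value() const { return value_; }

 private:
  T value_;
};

// Imagine this living in dylib A: it hands out a Foo<double> as a Bar.
inline std::unique_ptr<Bar> MakeBar() {
  return std::unique_ptr<Bar>(new Foo<double>(1.5));
}

// Imagine this living in dylib B: it tries to recover the concrete type.
// Returns nullptr when the cast fails, so callers must check the result.
inline const Foo<double>* AsFooDouble(const Bar& bar) {
  return dynamic_cast<const Foo<double>*>(&bar);
}

// Single-binary sanity check: both functions share one typeinfo for
// Foo<double> here, so the cast is expected to succeed.
inline void CheckCastWithinOneBinary() {
  assert(AsFooDouble(*MakeBar()) != nullptr);
}
```

Because the failure mode is a silent nullptr rather than a crash, any real caller of something like AsFooDouble needs an explicit null check to notice it.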

This article provides lots of helpful background, although it doesn't predict our exact issue: http://www.russellmcc.com/posts/2013-08-03-rtti.html

In Drake, we could work around this by taking care to explicitly instantiate, extern, and export all the template instantiations that would otherwise be implicit. To comply with ODR, this export should be done in the library at the lowest level of physical layering that uses the template. That is a lot of finicky bookkeeping: I expect people will get it wrong, and plant more time bombs, since there would be no real tooling to make you get it right.

Additionally, the workaround is awkward because GCC and MSVC do not agree on the appropriate syntax for both externing and exporting a symbol. GCC wants the annotations on the extern declaration in the .h (otherwise -Wattributes), whereas MSVC wants them on the explicit instantiation in the .cc (otherwise Warning C4910). As Drake is currently configured, the GCC diagnostic is an error and the MSVC one is merely a warning, so for the short-term fix in #3247, GCC won.
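The explicit-instantiate, extern, and export workaround, together with the GCC/MSVC placement disagreement, can be sketched roughly as follows. This is an illustration under stated assumptions, not Drake's actual code: the EXAMPLE_* macros and the Bar/Foo classes are hypothetical, and the header and source parts are shown in one file only so that the sketch compiles on its own.

```cpp
#include <cassert>

// Hypothetical macros (not Drake's): choose where the export annotation goes.
#if defined(_MSC_VER)
// MSVC: annotate the explicit instantiation definition (avoids C4910).
#define EXAMPLE_EXPORT_ON_DECLARATION
#define EXAMPLE_EXPORT_ON_DEFINITION __declspec(dllexport)
#else
// GCC/Clang: annotate the extern declaration (avoids -Wattributes).
#define EXAMPLE_EXPORT_ON_DECLARATION __attribute__((visibility("default")))
#define EXAMPLE_EXPORT_ON_DEFINITION
#endif

// ---- Would live in the .h ----
class Bar {
 public:
  virtual ~Bar() = default;
};

template <typename T>
class Foo : public Bar {
 public:
  explicit Foo(T value) : value_(value) {}
  T value() const { return value_; }

 private:
  T value_;
};

// Explicit instantiation declaration: tells includers not to instantiate
// Foo<double> implicitly, and (on GCC/Clang) gives it default visibility.
extern template class EXAMPLE_EXPORT_ON_DECLARATION Foo<double>;

// ---- Would live in exactly one .cc, in the lowest-level library ----
// Explicit instantiation definition: the single copy of Foo<double>.
template class EXAMPLE_EXPORT_ON_DEFINITION Foo<double>;

// ---- Any client code ----
// Recovers the concrete type through the base class.
inline double ValueViaBase(const Bar& bar) {
  const Foo<double>* foo = dynamic_cast<const Foo<double>*>(&bar);
  assert(foo != nullptr);
  return foo->value();
}
```

The bookkeeping cost mentioned above shows up here as one extern line and one definition line per instantiation, each of which has to be placed in the right library by hand.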

Alternatively, we could decide that the Mach-O linker's interpretation of -fvisibility=hidden is so pedantic that we shouldn't even bother with it. Hiding the typeinfo seems pretty extreme! Under this proposal, we'd just drop the flag from the Mac build and delete the #3247 workarounds.

I know that both of those ideas have multiple opponents. Other suggestions are welcome!

Labels: mac, medium, kitware, bug

Most helpful comment

A very good description, thanks!

I would be excited if somebody in the wider community had a silver bullet solution, but I know of none.

My own opinion (relayed vigorously to @david-german-tri in person) is that the only worthwhile purpose of the "visibility hidden by default" choice is to make Windows support less costly. It reproduces the Windows export-ornament errors on the primary developer platforms, so that such errors are (probably) detected earlier and are (more) reproducible locally, which removes Jenkins round-trips from the critical path.

To the extent that visibility satisfaction is even more challenging on OS X than on Windows, I don't see how default-to-hidden is making our development efforts less expensive -- in fact the opposite.

Unless there is a trivial solution to the hidden-typeinfo problem with Mach-O, my vote is to revert OS X to default visibility, but leave Linux as hidden-by-default. That way at least a portion of our developers will catch problems early, and even those on OS X should be able to obtain easy access to a Linux build environment for iterating on and debugging the ornaments.
