As more and more of flux-sched (and some of flux-core) uses C++, the desire to use newer features of the language correspondingly increases. If we want to include C++11 features or libraries that use them, my understanding is that we need to accept a minimum compiler version (i.e., gcc 4.8 & clang 3.3) for Flux as a whole to maintain ABI compatibility. (Is that correct? Or is this a decision that can be made for flux-sched in isolation?)
If the minimum version of gcc that we require is > 4.8.5, then we will also need to determine how to build Flux with a newer compiler on the TOSS build farm. Based on everything that I have read online, outside of a few minor bugs, gcc 4.8.1+ implements the C++11 standard. I am unclear about the STL, though. @trws probably knows the nitty-gritty details.
Based on an out-of-band conversation with @trws, it looks like the STL is C++11 compliant as of gcc 5.1 (source), but he and others saw bugs through gcc 5.3 and performance issues through to gcc 6.
Rather than targeting a particular C++ standard and inferring the compiler versions from that, maybe we go the other way: pick the newest gcc version that we are willing to require, and pin our C++ features to that version of the compiler.
This all sounds right. The real restriction is that all the C++ code that calls through C++ APIs has to use the same standard library version to be ABI compatible. Since all the API boundaries between core and sched are C, there should be no need to pin core to the higher compiler, other than that we may as well use a less broken, better refined compiler. Well, or unless components like the jobspec in core want to have a working C++11 base.
Libjobspec in core exports a C++ interface to core and sched. Do we need to fix that?
The current jobspec is the only API in core that uses and assumes C++11. It is probably good to use the same C++ compiler version as what sched wants to use.
Dong
Anywhere there is a c++ api that crosses translation units, we must use the same c++ standard lib. Easiest way is to use the same compiler or make the api internal.
Anywhere there is a c++ api that crosses translation units, we must use the same c++ standard lib. Easiest way is to use the same compiler or make the api internal.
I wonder if we should make the jobspec parser internal then, copying it in both core and sched?
It doesn't necessarily have to become internal in that sense, but exposing a C api that would make cross-library use safer would be a good idea, we probably want one for use elsewhere in core and the bindings anyway.
In fact, this is something we'll have to deal with soon regardless: the current stable version of yaml-cpp no longer depends on boost (yay!) but depends on a conforming C++11 compiler instead.
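For illustration, here is a minimal sketch of what a C API over a C++ implementation could look like. Everything in it is hypothetical: `detail::Jobspec`, `example_jobspec_t`, and the `example_jobspec_*` functions are made-up names, not the actual libjobspec interface. The point is only that the handle is opaque and that no STL type appears in a signature, so the two sides do not have to agree on a libstdc++ object layout.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical stand-in for the C++ jobspec parser; not the real class.
namespace detail {
class Jobspec {
public:
    explicit Jobspec (const std::string &text) : text_ (text) {}
    const std::string &text () const { return text_; }
private:
    std::string text_;
};
}  // namespace detail

// Public surface: only C types cross the library boundary, so callers
// never depend on the layout of std::string or any other STL type.
extern "C" {
typedef struct example_jobspec example_jobspec_t;  // opaque handle
example_jobspec_t *example_jobspec_create (const char *text);
const char *example_jobspec_text (const example_jobspec_t *js);
void example_jobspec_destroy (example_jobspec_t *js);
}

// The handle's definition stays private to the library.
struct example_jobspec {
    detail::Jobspec impl;
    explicit example_jobspec (const char *text) : impl (text) {}
};

extern "C" example_jobspec_t *example_jobspec_create (const char *text)
{
    if (!text)
        return nullptr;
    try {
        return new example_jobspec (text);
    } catch (...) {
        return nullptr;  // exceptions must not escape through a C API
    }
}

extern "C" const char *example_jobspec_text (const example_jobspec_t *js)
{
    // The pointer stays valid until example_jobspec_destroy () is called.
    return js ? js->impl.text ().c_str () : nullptr;
}

extern "C" void example_jobspec_destroy (example_jobspec_t *js)
{
    delete js;  // deleting a null pointer is a no-op
}

// Usage sketch: this caller only ever touches C types.
void example_usage ()
{
    example_jobspec_t *js = example_jobspec_create ("tasks: 1");
    assert (js != nullptr);
    assert (std::strcmp (example_jobspec_text (js), "tasks: 1") == 0);
    example_jobspec_destroy (js);
}
```

With a boundary shaped like this, the library and its callers could in principle be built with different compilers, since only `const char *` and an opaque pointer are exchanged. The same shape is also what language bindings usually want to wrap.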
I wonder if we should make the jobspec parser internal then, copying it in both core and sched?
That would be nice in that it would allow flux-sched to make the C++ compiler decision independently of flux-core. Then users that only compile flux-core could use an older compiler.
At the same time, flux-core and flux-sched are frequently going to be built together, so any compiler requirement that flux-sched has is going to "practically apply" for flux-core, with the TOSS build farm being a great example. If flux-sched requires C++17 (as an extreme example), then we will still need to figure out how to get GCC 8 on the build farm to compile flux-sched, rendering the compiler flexibility of flux-core kind of moot.
(Speaking in total and complete ignorance when it comes to the C++ environment), it seems to me that gcc 4.8.5 is a good compromise for now. It is new enough that we can use most of C++11 and only have to work around a few edge cases. At the same time, that would give us "out-of-the-box" compatibility with CentOS 7 and thus TOSS3. IMO that is a huge usability win for tri-lab users. It would also help with any future collaborations with ORNL since Summit uses 4.8.5 as the system default [source].
If we aim for an older compiler, we lose the "C++11 support" badge and the associated libraries (like yaml-cpp as @trws mentioned). Does anyone have any compelling arguments for supporting a GCC version older than 4.8.5? If not, I propose we mark that as our oldest supported compiler and then re-visit this if the need/desire for a newer compiler emerges.
exposing a C api.....we probably want one for use elsewhere in core and the bindings anyway.
Totally agree. This is something that @garlick and I were chatting about yesterday. Python bindings would also be useful as job submission is a primary use case for python.
I don't see sched going too fancy, like C++17 in the extreme case. And I sort of hate having to copy source files across different projects unless there is a compelling reason for it. Been bitten by that several times in the past.
It seems that going with gcc 4.8.5 and limiting our C++11 features to those it supports is a good compromise.
What would be the minimum compiler version for clang++ though? In general it should have good c++11 support and beyond.
And I sort of hate having to copy source files across different projects unless there is a compelling reason for it. Been bitten by that several times in the past.
Agreed.
What would be the minimum compiler version for clang++ though? In general it should have good c++11 support and beyond.
Based on the cppreference site posted earlier, it looks like Clang 3.3 has feature parity with GCC 4.8.1 and above.
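If the minimums discussed here (GCC 4.8.5, Clang 3.3) were adopted, one way to make them explicit is a compile-time guard in a common header. This is only a sketch: the helper name `version_at_least` is made up, and the thresholds are the ones floated in this thread, not a decision. It relies on the predefined `__GNUC__`/`__clang_major__` family of macros and is itself valid C++11.

```cpp
// Hypothetical helper: true if (maj.min.patch) >= (rmaj.rmin.rpatch).
// A single return statement keeps it a valid C++11 constexpr function.
constexpr bool version_at_least (int maj, int min, int patch,
                                 int rmaj, int rmin, int rpatch)
{
    return maj > rmaj
           || (maj == rmaj
               && (min > rmin || (min == rmin && patch >= rpatch)));
}

// Clang also defines __GNUC__, so it has to be checked first.
#if defined(__clang__)
static_assert (version_at_least (__clang_major__, __clang_minor__,
                                 __clang_patchlevel__, 3, 3, 0),
               "Clang 3.3 or newer is required (assumed minimum)");
#elif defined(__GNUC__)
static_assert (version_at_least (__GNUC__, __GNUC_MINOR__,
                                 __GNUC_PATCHLEVEL__, 4, 8, 5),
               "GCC 4.8.5 or newer is required (assumed minimum)");
#endif
```

Note that this checks the compiler front end only. It says nothing about which libstdc++ the build picks up, and other compilers that define `__GNUC__` for compatibility report the GCC version they emulate rather than their own.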
If that's the target, we need to be pretty careful about use of C++11; it really is not a working implementation of that dialect. The user default of 4.9.3 was set that way for precisely this reason: too many Livermore projects that rely on C++11 working were otherwise getting the libstdc++ from 4.8.5 and breaking in nasty, hard-to-diagnose ways that produced a good deal of support work for DEG.
I'm not saying we can't do it, or even necessarily shouldn't do it if there's sufficient reason, but it may have non-trivial costs.
Why can't we set the baseline to 4.9.3? I know how nasty debugging such a problem is.
This is the cruddy bit: RHEL 7 uses a 4.8.5 base, so that's what's at /usr/bin/gcc. For the TOSS3 environment John and Matt and company went through and re-did everything user-facing with gcc 4.9.3, and made it part of the default profile. That way all of the compilers (gcc, icc, pgi, everything) get the libstdc++ from 4.9.3 rather than 4.8.5, but it's still technically a 4.8.5 base system.
It doesn't necessarily have to become internal in that sense, but exposing a C api that would make cross-library use safer would be a good idea, we probably want one for use elsewhere in core and the bindings anyway.
Open flux-core bug?
Just to check my understanding (since I can't seem to find a clear answer online):
If I were to compile flux-core with gcc 4.8.5, the jobspec library will be built and linked against libstdc++.so.6.0.19 [1]. And since "binaries with equivalent DT_SONAMEs (e.g., libstdc++.so.6) are forward-compatibile" [1], in theory I could "hotswap" my environment to gcc 8, which uses libstdc++.so.6.0.25, using module load gcc/8, and the libjobspec will still work as expected within flux-core (despite the newer libstdc++ getting picked up at runtime). Right?
Does the same also hold in theory if I were to compile flux-sched with gcc 8 and link it against the flux-core/libjobspec built with gcc 4.8.5 (assuming the newer libstdc++ is used at runtime)? Also assuming that the _GLIBCXX_USE_CXX11_ABI is set to 0 [2]. Or would this require both forwards and backwards ABI compatibility? I'm guessing the latter given some of @trws's previous responses, specifically:
Anywhere there is a c++ api that crosses translation units, we must use the same c++ standard lib. Easiest way is to use the same compiler or make the api internal.
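As a rough aid for the question above, each component could report which libstdc++ ABI its translation units were compiled against. The sketch below uses the `_GLIBCXX_USE_CXX11_ABI` macro mentioned in the question and the `__GLIBCXX__` macro that libstdc++ defines; the function names are made up for illustration. Since the macro only exists in the libstdc++ shipped with GCC 5.1 and later, a 4.8.5 build falls into the "not reported" case.

```cpp
#include <string>  // any standard library header defines the libstdc++ macros

// Returns 1 if this translation unit uses libstdc++'s newer ("cxx11") ABI,
// 0 if it uses the old one, and -1 if the macro is not defined at all
// (a libstdc++ older than GCC 5.1, or a different standard library).
constexpr int libstdcxx_abi ()
{
#if defined(__GLIBCXX__) && defined(_GLIBCXX_USE_CXX11_ABI)
    return _GLIBCXX_USE_CXX11_ABI ? 1 : 0;
#else
    return -1;
#endif
}

// Human-readable form of the value above, e.g. for a --version banner.
inline const char *libstdcxx_abi_name ()
{
    return libstdcxx_abi () == 1 ? "new (cxx11) ABI"
         : libstdcxx_abi () == 0 ? "old ABI"
                                 : "not reported";
}
```

Comparing this value (and, say, `sizeof (std::string)`) between a flux-core build and a flux-sched build would flag a mismatch in `std::string` and `std::list` representation early. It does not prove compatibility in general; it only covers the one switch that the macro controls.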
Yes, it should work with a newer (or possibly even older) libstdc++ up to a certain limit. This works because the old versions of various representations and symbols are still provided in newer versions of the library. The problem happens when you build two components with different object representations and then try to pass those objects across the boundary.
Note also that if you configure gcc 8 to use the old libstdc++, headers and all, and all of those conform to its requirements, it may work. That’s why you can compile something with clang 6 or icc 18 and link with system C++ libraries on our systems: they pull their standard library from the in-path gcc by default. The problem happens when gcc 8 uses its own libstdc++ headers, with new representations for STL types, and then attempts to pass them to functions accepting the old version. That is very likely to break. While this is usually caught at link time these days (oh god, gcc 4.7 flashbacks…), it sometimes happens to pass the linker and then silently passes an object of the wrong structure, or decides “hey, I can pack these ints in a different way so that the argument gets its bottom bits silently sliced off”, or produces similar bugs.
I’m not sure that completely answered the question; if you want to chat about this, I’m around this afternoon. I’m sorry for the sometimes bombastic reactions, by the way. It has been a long-running pet peeve of mine when projects use modern C++ with compilers released more than half a decade ago, and then ask us to fix correct code that’s being misinterpreted by the compiler.
What I make out of this so far:
That summary sounds right. A recent out-of-band discussion with @trws has led me to believe that a compiled C++ library might be better than a header-only solution in certain scenarios. Specifically, if we upgrade the flux-core installation, then users will automatically get the latest jobspec parser without having to recompile their applications/workflow managers (in a header-only solution, users would be forced to recompile whenever the flux version changes to ensure compatibility). @dongahn pointed out that if the header/API changes, then users would have to recompile anyway (even in the case of a shared library). So this really only applies in the case of minor changes like bugfixes.
The other part of the discussion was that the current C++ interface could be improved to be easier to use and to avoid direct access to STL types/class members. @trws and I plan to sit down and propose an updated C++ jobspec interface, a new C jobspec interface, and a new python jobspec interface. Components of flux-core (like the job-ingest module and python bindings) can leverage the C interface while components of flux-sched (like the resource matching service) can leverage the C++ interface.
@trws and I plan to sit down and propose an updated C++ jobspec interface, a new C jobspec interface, and a new python jobspec interface. Components of flux-core (like the job-ingest module and python bindings) can leverage the C interface while components of flux-sched (like the resource matching service) can leverage the C++ interface.
This sounds good. You can also take a look at how I used the jobspec API in my resource infrastructure. Hopefully the transition path isn't too painful.
I think we've essentially reached a conclusion on this, right? Shall we close?
Closing. libjobspec C++ interface moved to flux-sched internal.