Describe the bug
For "simple" variants like std::variant<int,float,double> there can never be a "valueless_by_exception" state. Hence for these cases std::visit should be noexcept(true) and no exception code should be generated which allows for more optimizations later on.
Command-line test case
#include <variant>
template<class... Ts> struct overloaded : Ts... { using Ts::operator()...; };
template<class... Ts> overloaded(Ts...) -> overloaded<Ts...>;
struct T0 {};
struct T1 {};
struct T2 {};
struct T3 {};
struct T4 {};
struct T5 {};
struct T6 {};
struct T7 {};
struct T8 {};
struct T9 {};
using example_variant = std::variant<T0, T1, T2, T3, T4, T5, T6, T7, T8, T9>;
int do_visit(example_variant v)
{
return std::visit(overloaded {
[](T0 val) { return 3; },
[](T1 val) { return 5; },
[](T2 val) { return 8; },
[](T3 val) { return 13; },
[](T4 val) { return 17; },
[](T5 val) { return 29; },
[](T6 val) { return 53; },
[](T7 val) { return 85; },
[](T8 val) { return 65; },
[](T9 val) { return 233; },
}, v);
}
Expected behavior
You can see the generated code at https://godbolt.org/z/D2Q5ED in comparison with Clang and GCC which both have the noexcept/omission of the potential throw.
Note how the optimal code looks like for Boost.Variant2 which is like return values[v.index()]: https://godbolt.org/z/EawzdG
Summary
Track the possibility of the valueless state, make it constexpr accessible and use it to avoid generating code that can in practice never be reached. This will improve performance and reduce generated code size.
For "simple" variants like
std::variant<int,float,double>there can never be a "valueless_by_exception" state.
This is not the case. Here's a function that takes any variant and puts it into the valueless_by_exception state:
template <class T>
struct valueless_hack {
struct tag {};
operator T() const { throw tag{}; }
};
template<class First, class... Rest>
void make_valueless(std::variant<First, Rest...>& v) {
try { v.emplace<0>(valueless_hack<First>()); }
catch(valueless_hack<First>::tag const&) {}
}
I am sympathetic to the desire to guarantee that a visit will not throw, but it needs changes to the standard. Ideally someone would propose both an unchecked_get (which assumes the variant is in the proper state) and unchecked_visit (which assumes the variant is not valueless, and can therefore be conditionally noexcept) for C++23.
I am sympathetic to the desire to guarantee that a visit will not throw, but it needs changes to the standard
The standard is very vague on the valueless thing. As you can see from the linked Godbolt example, GCC 9/Clang 9 stdlib does not throw for small, trivially copyable values. It seems keeping the current value in case of an exception via an extra copy is allowed. See my question on SO and its answers: https://stackoverflow.com/q/58834389/1930508
I'm bringing this up as I expect the case of "small, trivially copyable values" to be very common and having an exceptionless variant for them would be a great optimization.
Edit: See your code example with a check: https://godbolt.org/z/uPRx2j GCC and Clang do not enter the valueless state even after calling make_valueless.
I am sympathetic to the desire to guarantee that a visit will not throw, but it needs changes to the standard
The standard is very vague on the valueless thing. As you can see from the linked Godbolt example, GCC 9/Clang 9 stdlib does not throw for small, trivially copyable values.
FYI: your Compiler Explorer example is using the same standard library (GCC's libstdc++) with both GCC and Clang. If you pass -stdlib=libc++ to Clang it will use LLVM's C++ standard library which doesn't perform this optimization.
It seems keeping the current value in case of an exception via an extra copy is allowed. See my question on SO and its answers: https://stackoverflow.com/q/58834389/1930508
Yes, thanks for your persistence: I'd forgotten the weasely "might not hold a value" in [variant.mod]/12 and [variant.mod]/18. I agree there is a conforming optimization we could make here. There are a couple of issues to address:
We need to decide what "small" means: we probably don't want to copy std::array<double, 1ull << 30> onto the stack.
We need to decide how comfortable we are with an updated visit relying on a guarantee provided by an updated emplace. If someone links an OBJ with "new" visit to an OBJ with "old" emplace, the ODR will stomp all over them (i.e., visit will rely on a guarantee that emplace doesn't provide). Should we wait for vNext to perform this optimization to avoid this potential breakage?
For reference/inspiration:
inline namespace __8 in their std namespace. This should cause any problems in almost all cases as variant is fully templated. Not sure if MS is using something similar to this "ABI versioning"is_nothrow_constructible_v<type, _Args...>==true and can directly construct it inside the variant without any copy. The fallback only "workarounds" the uncommon throwing convert caseThis should cause any problems in almost all cases as variant is fully templated.
Not true; in fact being fully templated makes the issue worse since the variant code is often inlined into user code which is not in the inline namespace.
Not true; in fact being fully templated makes the issue worse since the variant code is often inlined into user code which is not in the inline namespace.
Can you elaborate on this? Inline namespaces were IMO designed for such changes. Consider the following cases:
Generally: Due to the inline namespace no new variant can be passed to code expecting an old variant. New code will use the new one, old code the old one. There might be recompilation required if it is passed through API boundaries of already compiled code. (Not sure if this is/can be avoided)
At least one limitation is that this doesn't work transitively. I.e., a type that holds a variant as a member will have different layouts depending on what version gets used and inline functions that inline any of the variant related functions get different bodies depending on the version but the same mangled name.
Also I believe under some ABIs the return type is not part of the mangled name, so inline namespace make sure you can't pass the wrong type in, but you might still get a mismatch when returning.
Inline namespaces were IMO designed for such changes.
They were designed for this, but did not deliver on it. That's why GCC folks had to create this "ABI tag" feature.
fully inlined into function(s) with outside passing of variant -> Inline namespace will prevent passing a new variant into an old variant
As @MikeGitb indicated, it doesn't do that. Consider:
struct UserType { // in the global namespace
std::variant<a, b, c> d;
};
extern "C" void consume(UserType& u);
If consume's body is compiled with the "new" variant, it thinks that the data structure has an invariant that valueless_by_exception is impossible, but if the code calling consume was compiled with the old variant, it might be in the valueless_by_exception state. It doesn't matter that the two variant implementations are in different namespaces; the signature of consume doesn't mention any inline namespace.
There might be recompilation required if it is passed through API boundaries of already compiled code.
That is not allowed. That's what an ABI break means at all -- there are .dlls out there which already have the body of the old thing compiled into them. Those are already shipped on end users computers and the source code that produced those .dlls may no longer exist.
Most helpful comment
The standard is very vague on the valueless thing. As you can see from the linked Godbolt example, GCC 9/Clang 9 stdlib does not throw for small, trivially copyable values. It seems keeping the current value in case of an exception via an extra copy is allowed. See my question on SO and its answers: https://stackoverflow.com/q/58834389/1930508
I'm bringing this up as I expect the case of "small, trivially copyable values" to be very common and having an exceptionless variant for them would be a great optimization.
Edit: See your code example with a check: https://godbolt.org/z/uPRx2j GCC and Clang do not enter the valueless state even after calling
make_valueless.