Although documentation for static void to_cbor(const basic_json& j, detail::output_adapter<char> o) says nothing I assumed that it has some purpose but f.e. following sample:
::nlohmann::json jj;
std::vector<char> test_cbor_data;
::nlohmann::json::to_cbor(jj, test_cbor_data);
generates lots of warnings like
Warning C4309 'static_cast': truncation of constant value testapp json.hpp 7731
in class binary_writer
at lines like:
https://github.com/nlohmann/json/blob/e426219256e8011fc8e4dd0bc8f54c85034218fa/single_include/nlohmann/json.hpp#L7949
or
Basically the cause for that is static_cast<CharType>(/*some integer value > 127*/) with CharType = char leads to loss of integer value (sizeof(char) is 1 byte and it is signed, so value range is [-128;127] and it can't fit integer number > 127).
I don't know if it is an issue but that code smells.
(The function's documentation should be similar to this one.)
@nlohmann Ok, so it is not dead code. Then there is a problem. If user calls binary_writer<char>::to_something directly or indirectly like in json::to_cbor then integer values over 127 are being lost.
What do you think about using the approach of https://stackoverflow.com/a/15172304/266378 and replace all the static_cast<CharType> calls by calls to convert:
CharType convert(std::uint8_t x)
{
return *reinterpret_cast<CharType *>(&x);
}
?
@nlohmann what about CharType != signed/regular/unsigned char? reinterpret_cast to char* and following dereference of char* for any pointer is safe by standard. But reinterpret_cast of pointer to some CharType = UserDefiendType? The cast itself is UB.
{
int i;
int *ip = &i;
char *cp = reinterpret_cast<char*>(ip); // safe
*cp; // safe
struct MyUserType{};
MyUserType mut;
MyUserType *mutp = &mut;
cp = reinterpret_cast<char*>(mutp); // safe
*cp; // safe
// DANGER (undefined behaviour):
mutp = reinterpret_cast<MyUserType*>(cp); // UB
*mutp; // UB
mutp = reinterpret_cast<MyUserType*>(ip); // UB
*mutp; // UB
}
So the function should be like:
std::enable_if<
std::is_same<CharType, char>::value ||
std::is_same<CharType, unsigned char>::value ||
std::is_same<CharType, signed char>::value,
CharType
>::type convert(std::uint8_t x)
{
return *reinterpret_cast<CharType *>(&x);
}
or restrict CharType to be only one of
char, unsigned char, signed char
in the class itself.
For the input adapters, we require that CharType is integral and has a sizeof 1. Does this answer your question?
For the input adapters, we require that
CharTypeis integral and has asizeof1. Does this answer your question?
C++03 5.2.10.7:
A pointer to an object can be explicitly converted to a pointer to an object of different type.
Except that converting an rvalue of type “pointer to T1” to the type “pointer to T2” (where T1 and T2 are object types and where the alignment requirements of T2 are no stricter than those of T1) and back to its original type yields the original pointer value, the result of such a pointer conversion is unspecified.
C++03 3.10.15:
If a program attempts to access the stored value of an object through an lvalue of other than one of the following types the behavior is undefined
— the dynamic type of the object,
— a cv-qualified version of the dynamic type of the object,
— a type similar (as defined in 4.4) to the dynamic type of the object,
— a type that is the signed or unsigned type corresponding to the dynamic type of the object,
— a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,
— an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union),
— a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,
— a char or unsigned char type.
So in general cast to any pointer except of char/unsigned char with dereferencing will cause UB.
Hm. What do you propose?
This seems like the solution tho:
std::enable_if<
std::is_signed<CharType>::value && std::is_signed<char*>::value,
CharType
>::type convert(std::uint8_t x) nothrow
{
return *reinterpret_cast<char*>(&x);
}
std::enable_if<
std::is_signed<CharType>::value && std::is_unsigned<char*>::value,
CharType
>::type convert(std::uint8_t x) nothrow
{
using namespace std;
static_assert(sizeof(std::uint8_t) == sizeof(CharType), "size of CharType must be equal to std::uint8_t");
static_assert(is_pod<CharType>::value, "CharType must be POD");
CharType result;
memcpy(&result, &x, sizeof(x));
return result;
}
std::enable_if<
std::is_unsigned<CharType>::value,
CharType
>::type convert(std::uint8_t x) nothrow
{
return x;
}
may be add some constexpr.
I had to fiddle a bit to make it compile:
template < typename C = CharType,
enable_if_t < std::is_signed<C>::value and std::is_signed<char>::value > * = nullptr >
static constexpr CharType convert(std::uint8_t x) noexcept
{
return *reinterpret_cast<char*>(&x);
}
template < typename C = CharType,
enable_if_t < std::is_signed<C>::value and std::is_unsigned<char>::value > * = nullptr >
static CharType convert(std::uint8_t x) noexcept
{
static_assert(sizeof(std::uint8_t) == sizeof(CharType), "size of CharType must be equal to std::uint8_t");
static_assert(std::is_pod<CharType>::value, "CharType must be POD");
CharType result;
std::memcpy(&result, &x, sizeof(x));
return result;
}
template<typename C = CharType,
enable_if_t<std::is_unsigned<C>::value>* = nullptr>
static constexpr CharType convert(std::uint8_t x) noexcept
{
return x;
}
Is this what you mean?
Yeah, sorry for my code snippet, was writing on the fly.
so
for signed-signed conversion will be just elementary copy with cast to signed char pointer type and dereferencing (which is ok by standard),
for signed-unsigned conversion will be exact copy of bits contained with memcpy (no choice there, sorry)
and
There is some issue that I do not know if you are taking in consideration in that project: bit count in byte.
std::uint8_t contains exactly 8 bits and will not be implemented on platforms with count of bits in byte > 8. In reality there are platforms with bit count in byte > 8 (16, 9, Windows CE with 16bits, etc.) so std::uint8_t is absent there.
@oktonion Could you have a look at branch https://github.com/nlohmann/json/compare/feature/convert_char?
For the non-8-bit-char platforms, I do not have an opinion right now.
IMO, we should not worry about esoteric platforms that are not tested in our CI. We cannot verify that the unit tests even pass there, or even if they happen to, that we don't break it accidentally down the road. Platforms where "byte!=octet" (such that uint8_t is not present) are extremely rare and we should not pessimize our implementation in fear we _might_ be breaking them when we don't even test them to start with.
@jaredgrubb Good point but as it states in the README of the projects "What if JSON was part of modern C++?", so I assume that if this library aims to be part of standard lib then it should work on every platform possible (even with > 8 bits in byte?).
Anyway I'm not starting some holly wars about this point. If library supports every platform possible by standard then using uint8_t with no other option is bad design. If it is not the case then some compiler error message like
#include <climits>
#if CHAR_BIT != 8
#error "JSON library is not implemented for this platform"
#endif
would be nice in my opinion.
@nlohmann also I want to thank you and the contributors for this excellent library. And I could say that it could be implemented in pure C++ 98 so I've already ported like 90% of codebase to old standard and it works like charm.