What is the issue you have?
Can't dump json object into string
Please describe the steps to reproduce the issue. Can you provide a small but working code example?
nlohmann::json fJson;
std::string codigo_ativo("ÇÃO");
fJson["CODIGO_ATIVO"] = codigo_ativo;
fJson.dump();
What is the expected behavior?
the .dump() method to generate the serialized string of the json object.
And what is the actual behavior instead?
Exception thrown: [json.exception.type_error.316] invalid UTF-8 byte at index 1: 0xC3
Which compiler and operating system are you using? Is it a supported compiler?
cmake version 3.11.4 with -utf-8 compile option
Did you use a released version of the library or the version from the develop branch?
Release version no. 3.1.2 (https://github.com/nlohmann/json/releases/tag/v3.1.2)
If you experience a compilation error: can you compile and run the unit tests?
no compilation error
I've noticed similar errors in issues https://github.com/nlohmann/json/issues/1022 and
https://github.com/nlohmann/json/issues/1131
To try to fix it, I added the -utf-8 flag to the compiler. Before assigning the value to the fJson object, I printed the contents of the codigo_ativo variable to check its bytes in hex:
for (size_t i = 0; i < codigo_ativo.size(); ++i)
{
    std::cout << i << " " << std::hex << static_cast<int>(static_cast<uint8_t>(codigo_ativo[i])) << std::endl;
}
outputs:
0 c7
1 c3
2 4f
The string is not UTF-8 encoded. The string ÇÃO consists of the code points U+00C7, U+00C3, and U+004F, so its UTF-8 byte sequence is C3 87, C3 83, 4F. Your example program prints C7 C3 4F instead, that is, the code points stored as single bytes (as in Latin-1). This is not a bug in the library (it in fact detects that C3 at index 1 is not a valid UTF-8 byte after C7), but a problem with your compiler or the encoding of the source code file.
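If the bytes really are Latin-1 (ISO-8859-1), which is an assumption based on the C7 C3 4F output above, one possible workaround is to re-encode the string as UTF-8 before assigning it to the JSON object. A minimal sketch follows; latin1_to_utf8 is a hypothetical helper, not part of nlohmann/json:

```cpp
#include <string>

// Hypothetical helper (not part of nlohmann/json).
// Assumes the input is ISO-8859-1, where every byte equals its code point.
std::string latin1_to_utf8(const std::string& in)
{
    std::string out;
    out.reserve(in.size() * 2);
    for (unsigned char c : in)
    {
        if (c < 0x80)
        {
            // ASCII range: the byte is identical in UTF-8.
            out.push_back(static_cast<char>(c));
        }
        else
        {
            // U+0080..U+00FF: two-byte UTF-8 sequence 110xxxxx 10xxxxxx.
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}
```

With this, fJson["CODIGO_ATIVO"] = latin1_to_utf8(codigo_ativo); would store the bytes C3 87 C3 83 4F. If the source encoding is something other than Latin-1, this mapping does not apply.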
The string is encoded in ASCII, but aren't ASCII codes equivalent to their respective UTF-8 codes?
see:
https://stackoverflow.com/questions/2347783/how-to-convert-an-ascii-string-to-an-utf8-string-in-c
ASCII is a subset of UTF-8. From your string, only the last character can be expressed by ASCII.
You may want to have a look at https://www.joelonsoftware.com/2003/10/08/the-absolute-minimum-every-software-developer-absolutely-positively-must-know-about-unicode-and-character-sets-no-excuses/ and https://utf8everywhere.org
"ASCII is a subset of UTF-8", so if the API accepts UTF-8, it should accept ASCII. And all three characters can be expressed in ASCII; see its table:
Decimal 199 = Ã
Decimal 128 = Ç
That is extended ASCII. ASCII can only express 128 characters - from 0x00 to 0x7F.
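To illustrate the 0x00 to 0x7F limit, a string can be checked for pure ASCII content by testing every byte. This is only a sketch; is_ascii is a hypothetical helper name, not a library function:

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: true only if every byte is in the ASCII range 0x00..0x7F.
bool is_ascii(const std::string& s)
{
    return std::all_of(s.begin(), s.end(),
                       [](unsigned char c) { return c <= 0x7F; });
}
```

For the bytes C7 C3 4F from the example above, this returns false, because only the last byte (4F) is within the ASCII range.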