I need to serialize a binary string to CBOR.
I'm currently doing this:
const char* binaryBuffer ;  // assumed to point to at least 100000 bytes
std::string binString(binaryBuffer, 100000) ;  // first copy
json jsStr = binString ;  // second copy
json::to_cbor(jsStr) ;
binaryBuffer is first copied to binString and then binString is copied to jsStr.
Is it possible to construct jsStr directly from binaryBuffer, so that the buffer is not copied twice?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
You can use the json::parse overload which accepts iterators:
void* binaryBuffer = /* raw input data */;
int bufferStrLength = /* number of actual characters in buffer */;
auto bufferChar = static_cast<const char*>(binaryBuffer);
auto jsStr = nlohmann::json::parse(bufferChar, bufferChar + bufferStrLength);
auto result = nlohmann::json::to_cbor(jsStr);
Thanks for your reply.
I forgot to clarify that the question is not related to JSON in any way. I'm using the library as a CBOR serializer only.
I have a pointer to an array of chars and I'd like to serialize that string to CBOR without first copying that string to an std::string.
I see. The snippet I provided doesn't create a std::string; it passes your buffer directly to json::parse to create the JSON object. However, std::string objects will still be created during parsing to construct the JSON object.
Since json::to_cbor only accepts a JSON object, I'm afraid there is no way to serialize the input buffer directly into a CBOR stream, and creating the intermediate variable jsStr is unavoidable.
Right, the library currently only supports JSON <-> CBOR conversion.
The original question was distorted somehow.
In the example code I posted above, how can I avoid the second line:
std::string binString(binaryBuffer, 100000) ;
That is, is it possible to construct a json string object from a char array rather than from a std::string?
Yes, as demonstrated in https://github.com/nlohmann/json/issues/1553#issuecomment-487569496 it is possible. Does that work for you?
json::parse parses its input.
I don't want to parse a string. I want to build a json object whose content type is string, that is, .is_string() returns true.
And I want to build that json object from a char array without first converting that char array to an std::string.
Why do I need to do that?
Because I have a 20 MB binary string that I need to convert to CBOR, and right now my code copies that char array twice before converting it to CBOR.
If I could avoid converting the char array to std::string first, then I would save one copy operation.
Let's assume there is a way to construct a JSON object of type string directly from a char*. Are you sure you want to convert a raw char* to CBOR? First, I'm not sure whether to_cbor works for bare string objects.
More importantly, converting a raw string to CBOR doesn't make much sense and doesn't accomplish anything. CBOR is useful when the data can be parsed into fields, since it can encode the fields and their data more efficiently.
The CBOR representation of a raw byte stream is probably the byte stream itself, prefixed with a few header bytes. In a nutshell, you can convert the byte stream to CBOR just by prefixing it with a handful of header bytes.
Here is some sample code for demonstration:
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
#include "json.hpp"
using namespace std;
int main()
{
    // Open in binary mode so that no bytes are translated while reading.
    std::ifstream fin("input.bin", std::ios::binary);
    if(!fin)
    {
        cout << "Failed to open file." << endl;
        return 1;
    }
    // Read the whole file into a std::string.
    std::stringstream stream;
    stream << fin.rdbuf();
    std::string binary_data = stream.str();
    cout << "Raw bytes: " << binary_data.size() << endl;
    // Build a JSON value of type string and encode it as CBOR.
    nlohmann::json j_object = binary_data;
    cout << "j_object.is_string(): " << j_object.is_string() << endl;
    std::vector<std::uint8_t> v_cbor = nlohmann::json::to_cbor(j_object);
    cout << "CBOR encoded size: " << v_cbor.size() << endl;
    // Everything in front of the payload is the CBOR header.
    std::size_t cbor_header_size = v_cbor.size() - binary_data.size();
    // Compare the payload inside the CBOR output with the original bytes.
    int result = std::memcmp(binary_data.data(), &v_cbor[cbor_header_size], binary_data.size());
    cout << "Compare result: " << result << endl;
    return 0;
}
You can test this code by generating a random binary file:
dd if=/dev/urandom of=input.bin count=20 bs=1MB
It should output:
Raw bytes: 20000000
j_object.is_string(): 1
CBOR encoded size: 20000005
Compare result: 0
Note that std::memcmp returns 0 when the bytes of the two input arrays match exactly.
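To make the "handful of header bytes" concrete, here is a minimal sketch of how such a header could be built by hand. It assumes the value is encoded as a CBOR text string (major type 3), which is consistent with the 5-byte header seen in the output above. The function name cbor_text_header is hypothetical and not part of the library.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of nlohmann::json): builds the CBOR header
// for a text string (major type 3) whose payload is `length` bytes long.
std::vector<std::uint8_t> cbor_text_header(std::uint64_t length)
{
    const std::uint8_t major = 0x60;  // major type 3 in the top three bits
    std::vector<std::uint8_t> header;
    int extra_bytes = 0;              // big-endian length bytes after the first byte

    if (length <= 23) {
        // Small lengths are stored directly in the low five bits.
        header.push_back(static_cast<std::uint8_t>(major | length));
    } else if (length <= 0xFF) {
        header.push_back(major | 24);
        extra_bytes = 1;
    } else if (length <= 0xFFFF) {
        header.push_back(major | 25);
        extra_bytes = 2;
    } else if (length <= 0xFFFFFFFF) {
        header.push_back(major | 26);
        extra_bytes = 4;
    } else {
        header.push_back(major | 27);
        extra_bytes = 8;
    }

    // Append the length in network byte order (most significant byte first).
    for (int shift = 8 * (extra_bytes - 1); shift >= 0; shift -= 8) {
        header.push_back(static_cast<std::uint8_t>(length >> shift));
    }
    return header;
}
```

For a 20000000-byte payload, cbor_text_header(20000000) yields the five bytes 0x7A 0x01 0x31 0x2D 0x00, which accounts for the difference between the raw and encoded sizes. Keep in mind that the CBOR specification expects text strings to hold UTF-8; arbitrary binary data is normally encoded as a byte string (major type 2) instead.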
You are right. The question doesn't make much sense as it is. I reduced my use case to the simplest example possible for the purposes of asking the question, but ended up asking for something that's useless.
Here is an example that makes more sense:
void* binaryBuffer1, binaryBuffer2, binaryBuffer3 ;
std::string binString1(binaryBuffer1, 100000) ;
std::string binString2(binaryBuffer2, 100000) ;
std::string binString3(binaryBuffer3, 100000) ;
json obj = {
{"key1", binString1},
{"key2", binString2},
{"key3", binString3}
} ;
json::to_cbor(obj) ;
Assume that the pointers binaryBufferN point to a valid place in memory.
As you can see, the contents of each memory buffer are copied twice.
I see. Sorry for misunderstanding the issue.
Have you tried using move construction? For instance, in the above code j_object can be constructed with:
nlohmann::json j_object(std::move(binary_data));
which steals the contents of binary_data. You can verify this by checking binary_data.size() afterwards. Try it to see whether it fits your use case and brings any improvement.
Issue https://github.com/nlohmann/json/issues/786 is somewhat related too.
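The "stealing" can also be checked in isolation, without the library. The sketch below compares the data pointer before and after a move. It assumes a string long enough to live on the heap (short strings may be stored inline and are then copied) and a mainstream standard library implementation. The function name move_reuses_buffer is made up for this illustration.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper: returns true if move-constructing a std::string of
// `length` characters hands over the original buffer instead of copying it.
bool move_reuses_buffer(std::size_t length)
{
    std::string source(length, 'x');
    const char* original_data = source.data();  // where the bytes live before the move
    std::string target(std::move(source));      // move construction, no byte-wise copy expected
    return target.data() == original_data;
}
```

With a 100000-character string, move_reuses_buffer(100000) is expected to return true, meaning the 100000 bytes were not copied again.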
Yes, this code:
json obj = {
{"key1", binString1},
{"key2", binString2},
{"key3", binString3}
} ;
uses initializer lists for construction, which cannot be moved from. (This is a C++ language rule, not a rule of this library.)
You would have to decompose it into individual calls and use std::move as nickaein suggests. In other words:
json obj = json::object();
obj["key1"] = std::move(binString1);
obj["key2"] = std::move(binString2);
obj["key3"] = std::move(binString3);
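The initializer-list rule can be observed without the library. The following is a minimal sketch that uses a hypothetical counting type, Tracked, and std::vector as a stand-in for the json container; none of these names come from nlohmann::json.

```cpp
#include <utility>
#include <vector>

// Hypothetical helper type that counts how often its instances are copied
// or moved.
struct Tracked {
    static int copies;
    static int moves;
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

struct Counts {
    int copies;
    int moves;
};

// Braced initialization: the value is moved into the temporary array behind
// the std::initializer_list, but that array is const, so the container has
// to copy out of it.
Counts via_initializer_list()
{
    Tracked::copies = Tracked::moves = 0;
    Tracked value;
    std::vector<Tracked> items = {std::move(value)};
    return {Tracked::copies, Tracked::moves};
}

// Element-wise insertion: the rvalue reaches the container directly.
Counts via_push_back()
{
    Tracked::copies = Tracked::moves = 0;
    Tracked value;
    std::vector<Tracked> items;
    items.push_back(std::move(value));
    return {Tracked::copies, Tracked::moves};
}
```

Here via_initializer_list() reports one move and one copy, while via_push_back() reports one move and no copy. The same reasoning is why assigning the strings one key at a time avoids the extra copy.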
Does that mean that this obj["key1"] = std::string(binaryBuffer1, 100000) also avoids a copy?
Since the std::string is constructed as a temporary (an rvalue), the assignment operator should perform a move instead of a copy.
Am I right?
Yes that's right.
Problem solved then. The double copy can be avoided by initializing the json object with a movable std::string.
Thanks @nickaein and @jaredgrubb.