What language does this apply to?
cxx and other binding languages such as python's cpp_implenentation
Describe the problem you are trying to solve.
the Reflection class in message.h only has SetString(..., const std::string &value)interface. in some cases, we just have raw buffer (char * and size), to call the SetString method, we have to construct/memcpy/deconstruct a std::string. But in some time-sensitive case, this consumes much, especially when the the bytes is large enough (eg. 8MB), which will cause large amount of page-faults.
Describe the solution you'd like
Add SetStringByArray(..., const char *, size_t ) interface, and call it when necessorily.
Describe alternatives you've considered
even in protobuf, construct 8MB bytes(std::string) will cause large amount of page-fault, this will slow down much. is it possible to setup an allocator for the protobuf. I did not seen such method.
Additional context
Add any other context or screenshots about the feature request here.
Unfortunately I don't think there's any easy way to do this. Internally messages store string and bytes fields as std::string, so ultimately we have to construct a real std::string. However, we could potentially add an overload that takes a std::string&& and then we could at least move the string into place without creating another copy.
@acozzette
I'm sorry, I did not describe it clearly.
I uses the contruct/memcpy/deconstruct did not mean the protobuf's internal data storage but the extra std::string, which must be create to adapt to the interface.
like the following code, could we remove the temp std::string's construct/memcpy/deconstruct
bool foo(const char *buf, size_t size,...) {
...
std::string temp(buf, size); // actually, we don't need to create a temporary std::string here, if we have
// some method like SetStringByArray(..., buf, size);
reflection->SetString(...., temp);
...
}
the original case comes from the following code, and I measured it in a 8MB bytes case. the value_string's construct and memcpy take about 3ms( actually due to the page fault).
https://github.com/protocolbuffers/protobuf/blob/a03d332aca5d33c5d4b2cd25037c9e37d57eff02/python/google/protobuf/pyext/message.cc#L787
If the string is passed by value or by rvalue reference then we can eliminate one of the two copies. The user can create the string and then it can be moved directly into place.
Adding an overloaded SetString(..., std::string&& value) which utilizes move semantics will avoid using the copy constructor and the std::string object will be moved, not copied which is less costly and will improve performance significantly
copy that. when will this feature be added to the master?
I'm afraid I don't have time to work on it, but if you want to try implementing it yourself then feel free to send me a pull request.
solution:
SetString(..., const std::string&) to SetString(..., std::string). std::string s any more , using SetString(..., std::move(s))std::string s, using SetString(..., s)SetString, becuase we get the moved or copied string, we just move it forward
Most helpful comment
If the string is passed by value or by rvalue reference then we can eliminate one of the two copies. The user can create the string and then it can be moved directly into place.