As far as I know, Arena is not a memory pool that reuses allocated memory by maintaining a free list; it just caches more and more memory as messages are created on it. Isn't Google's tcmalloc a better and more straightforward way to improve overall performance? I just want to take advantage of protobuf for network transmission.
Right now using arena with opensource protobuf doesn't gain you much, but inside Google, we have seen massive improvement by adopting arena. I think protobuf arena has two advantages that tcmalloc can't offer:
I think most of the benefit we saw comes from (1). Unfortunately, this isn't the case with open-source protobuf, because string fields are not allocated in the arena. Internally we have a hack that allocates something that looks like a string in the arena and casts it to a string when accessed, but that isn't portable. We also don't have ctype=STRING_PIECE support in open source, which could help with the issue. I know some users run arena with their own patch to implement ctype=STRING_PIECE. I don't think arena can be widely used until we address the string issue.
@xfxyjwf Thanks for your quick reply. I still have two more questions about my scenario, in which a server holds tens of thousands of TCP connections kept alive with heartbeat packets:
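The common patterns:
(1) Allocate the message on a short-lived arena:
    {
      proto2::Arena arena;  // google::protobuf::Arena in the open-source release
      // The arena owns foo, so it must not be wrapped in unique_ptr or deleted.
      Foo* foo = Arena::CreateMessage<Foo>(&arena);
      foo->ParseFromString(data);
      ... use foo ...
      // arena is destructed here, releasing foo and everything it allocated
    }
(2) Reuse message objects from a free list:
    Foo* foo = free_list_->Pop();
    foo->ParseFromString(data);
    ... use foo ...
    free_list_->Push(foo);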
(1) works well if the message structure is complex. You can also fine-tune the memory allocation using ArenaOptions. For example, you can provide an initial block, so that if the message fits into this block no memory allocation/deallocation will happen. However, as I mentioned, string fields won't be allocated on the arena, so it doesn't help if you have lots of string fields.
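To make the initial-block idea concrete, here is a minimal sketch of a bump allocator that serves requests from a caller-provided buffer and only falls back to the heap once that buffer is full. `BumpArena` and `HeapBlocksNeeded` are hypothetical names invented for illustration; this is not protobuf's Arena, whose corresponding knobs are the `initial_block` and `initial_block_size` fields of `ArenaOptions`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical bump allocator, not protobuf's Arena. It serves requests from a
// caller-provided initial block and only touches the heap once that is full.
class BumpArena {
 public:
  // `initial_block` must be aligned for std::max_align_t and outlive the arena.
  BumpArena(char* initial_block, std::size_t initial_size)
      : cur_(initial_block), end_(initial_block + initial_size) {}
  BumpArena(const BumpArena&) = delete;
  BumpArena& operator=(const BumpArena&) = delete;

  // Everything is released at once; there is no per-object free.
  ~BumpArena() {
    for (void* block : heap_blocks_) std::free(block);
  }

  void* Allocate(std::size_t n) {
    constexpr std::size_t kAlign = alignof(std::max_align_t);
    constexpr std::size_t kHeapBlockSize = 4096;
    // Round up so every returned pointer stays suitably aligned.
    n = (n + kAlign - 1) / kAlign * kAlign;
    if (static_cast<std::size_t>(end_ - cur_) < n) {
      // Current block exhausted: fall back to a fresh heap block.
      const std::size_t block_size = n > kHeapBlockSize ? n : kHeapBlockSize;
      char* block = static_cast<char*>(std::malloc(block_size));
      assert(block != nullptr);
      heap_blocks_.push_back(block);
      cur_ = block;
      end_ = block + block_size;
    }
    void* result = cur_;
    cur_ += n;
    return result;
  }

  std::size_t heap_block_count() const { return heap_blocks_.size(); }

 private:
  char* cur_;
  char* end_;
  std::vector<void*> heap_blocks_;
};

// Returns how many heap blocks were needed for `count` allocations of `size`
// bytes when the arena starts with a 1024-byte stack buffer.
std::size_t HeapBlocksNeeded(std::size_t count, std::size_t size) {
  alignas(std::max_align_t) char initial_block[1024];
  BumpArena arena(initial_block, sizeof(initial_block));
  for (std::size_t i = 0; i < count; ++i) arena.Allocate(size);
  return arena.heap_block_count();
}
```

With these assumed sizes, a small request such as `HeapBlocksNeeded(8, 64)` is served entirely from the stack buffer, which is the situation described above where a message that fits in the initial block causes no heap traffic at all.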
(2) was the most common pattern before we had arena support, and that's probably still true today. Protobuf objects have the property that proto.Clear() doesn't deallocate any memory but instead caches it for reuse. So if you reuse the same proto object, memory allocation is kept to a minimum. Compared to arena, proto.Clear() still has a cost because it needs to traverse the entire message tree, but it's much cheaper than deleting the proto object and is therefore used very widely. This is likely the best pattern for your use case as well. You can use either a global free list or a per-thread free list. In its simplest form you can just reuse one single proto object again and again.
There is one catch: because proto.Clear() doesn't deallocate memory, the memory usage of the reused proto keeps increasing. The reused proto basically allocates enough memory to accommodate every message ever parsed into it. For example, if one message uses repeated field "a" and another message uses repeated field "b", the reused proto will keep the memory for both. The more complex your message structure is, the faster the memory usage grows. For this reason, a free-list implementation usually deletes an object after a certain number of uses, so that a newly allocated object starts accumulating memory afresh.
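As a rough illustration of a free list that retires objects after a fixed number of uses, here is a minimal single-threaded sketch. `RecyclingPool`, `Slot`, `max_uses` and `ObjectsCreated` are hypothetical names, `std::vector<int>` stands in for a message, and the `reset` callback plays the role of proto.Clear(); none of this is a protobuf API.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical free list: objects are reused, but each one is retired after
// `max_uses` checkouts so the memory it has accumulated is eventually freed.
template <typename T>
class RecyclingPool {
 public:
  struct Slot {
    T value;
    std::size_t uses = 0;  // Number of times this object has been handed out.
  };

  explicit RecyclingPool(std::size_t max_uses) : max_uses_(max_uses) {
    assert(max_uses > 0);
  }

  // Hands out a cached object if one is available, otherwise a new one.
  std::unique_ptr<Slot> Pop() {
    std::unique_ptr<Slot> slot;
    if (free_.empty()) {
      slot = std::make_unique<Slot>();
      ++created_;
    } else {
      slot = std::move(free_.back());
      free_.pop_back();
    }
    ++slot->uses;
    return slot;
  }

  // `reset` stands in for proto.Clear(): it empties the object but is expected
  // to keep its buffers for the next user.
  template <typename Reset>
  void Push(std::unique_ptr<Slot> slot, Reset reset) {
    if (slot->uses >= max_uses_) return;  // Retired: freed when slot goes out of scope.
    reset(slot->value);
    free_.push_back(std::move(slot));
  }

  std::size_t created() const { return created_; }

 private:
  std::size_t max_uses_;
  std::size_t created_ = 0;
  std::vector<std::unique_ptr<Slot>> free_;
};

// Simulates `requests` sequential parse cycles and returns how many objects
// the pool had to create in total.
std::size_t ObjectsCreated(std::size_t requests, std::size_t max_uses) {
  RecyclingPool<std::vector<int>> pool(max_uses);
  for (std::size_t i = 0; i < requests; ++i) {
    auto slot = pool.Pop();
    slot->value.assign(100, 0);  // "Parse": fills the reused buffer.
    pool.Push(std::move(slot), [](std::vector<int>& v) { v.clear(); });
  }
  return pool.created();
}
```

The retirement threshold trades allocation cost against peak memory: a larger value means fewer allocations, while a smaller value bounds how long an unusually large message can keep its memory pinned. A per-thread pool avoids the locking that a global one would need.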
I think I got it.
@xfxyjwf
You mention strings not working great in arenas, but what about bytes? Bytes are pseudo-strings, but since they don't need to be marshaled into some object, my assumption is that arenas would be excellent for receiving bytes.
Especially if you wanted to receive these bytes directly into some special block of pinned memory, e.g. cudaMallocHost memory, using ArenaOptions.
Do arenas make sense for FlatBuffers? It seems like this might be the mechanism to do zero copy directly in and out of the memory blocks you reserve for messages.
@ryanolson In the protobuf C++ API, string fields and bytes fields are both stored as std::string, so the same issue applies: neither of them will be stored efficiently in a protobuf arena. That can be solved by open-sourcing the zero-copy support (see https://github.com/google/protobuf/issues/1896), which includes StringPiece (basically std::string_view) support and will allow a string or bytes field to alias memory in the arena directly.
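To show what "aliasing memory in the arena" means, here is a minimal C++17 sketch in which field data is copied once into an arena buffer and then exposed as a std::string_view, with no per-field std::string allocation. `ByteArena`, `Store` and `FieldsAreContiguous` are hypothetical names for illustration only and do not reflect how protobuf implements or plans to implement this.

```cpp
#include <cstddef>
#include <cstring>
#include <string_view>
#include <vector>

// Hypothetical fixed-capacity byte arena, not protobuf's Arena. The backing
// storage is sized once and never resized, so returned views stay valid for
// the lifetime of the arena.
class ByteArena {
 public:
  explicit ByteArena(std::size_t capacity) : storage_(capacity) {}

  // Copies `data` into the arena once and returns a view aliasing that copy.
  // Returns an empty view if the remaining capacity is too small.
  std::string_view Store(std::string_view data) {
    if (data.size() > storage_.size() - used_) return {};
    char* dest = storage_.data() + used_;
    if (!data.empty()) std::memcpy(dest, data.data(), data.size());
    used_ += data.size();
    return std::string_view(dest, data.size());
  }

 private:
  std::vector<char> storage_;
  std::size_t used_ = 0;
};

// Stores two fields and reports whether the views point into one contiguous
// arena region rather than into separately heap-allocated strings.
bool FieldsAreContiguous() {
  ByteArena arena(64);
  std::string_view name = arena.Store("heartbeat");
  std::string_view payload = arena.Store("ping");
  return name == "heartbeat" && payload == "ping" &&
         payload.data() == name.data() + name.size();
}
```

Because the views only borrow the bytes, they become dangling as soon as the arena is destroyed, which is the usual lifetime constraint of arena-allocated data.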
@xfxyjwf Hi, I have a question about arena.
protobuf-3.6.1 now supports creating strings in the arena. Given that, what about the earlier advice, "I don't think arena will can be widely used until we address the string issue"?
Can I now use this version to improve performance?
Sorry, my English is bad. Thank you.
Looking forward to your reply.
@ly82882592 No, unfortunately we still do not have a solution for this. We will probably need to introduce a string ctype based on std::string_view to be able to store string data directly on the arena.
@acozzette Oh , thank you