I have a custom protocol and I'm writing a large, dynamically generated object to a channel via Channel.writeAndFlush. The custom protocol is based on protobuf, and because the object I'm writing is too large to fit into a single protobuf message, I'm breaking it into many objects ~1M in size.
This conversion happens within a ChannelOutboundHandler in my ChannelPipeline; basically, it's a for loop that chunks the large, dynamically generated object into the small protobuf messages, which are then sent down the pipeline via ChannelHandlerContext.writeAndFlush. The default memory of the JVM is 128M. The encoded protobuf objects, when put into a ByteBuf, are just over 1M, so the ByteBufs holding them jump up to 2M on resize.
Because the original Channel.writeAndFlush is not yet complete, the sub-writes via ChannelHandlerContext.writeAndFlush back up in the ChannelOutboundBuffer; I get 60 2M ByteBufs ready to go outbound in that buffer. On the 61st write, I can see in DefaultChannelHandlerContext.write that the result of next.invokeWrite(msg, promise) leaves the Promise as: DefaultChannelPromise@2479c21f(failure(io.netty.handler.codec.EncoderException: java.lang.OutOfMemoryError: Direct buffer memory)
Eventually, the EOF object, which is small and _can_ be added to the outbound buffer, is successfully written. At this point the original Channel.writeAndFlush call is completed, and the outbound buffer empties and sends the content along to the receiver, which notices that it's malformed.
(I got to the EOF-sending part without noticing the exception because I didn't realize the write promise(s) had exceptions on them.)
There has to be a way to actually flush the outbound buffer and avoid these OOMs. This is a relatively small amount of data and what seems, to me at least, to be a pretty reasonable usage.
@mhgrove when you call writeAndFlush(), it will try to flush out the data (and all pending data), but it may not be able to do so because the receiver cannot consume it fast enough. You will need to check channel.isWritable() so that you do not write too fast. It will return false once the buffer fills up. Once there is enough space again, it triggers fireChannelWritabilityChanged(), which you can intercept in your ChannelHandler to continue the write.
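To make the write-while-writable pattern concrete, here is a minimal sketch in plain Java. It deliberately does not use Netty: `WatermarkBuffer`, `ChunkedSender`, and `BackpressureDemo` are hypothetical stand-ins invented for illustration, and the low/high water mark values are arbitrary. The buffer's `isWritable()` and its writability listener only model the roles that channel.isWritable() and the writability-changed event play in the description above; treat it as a toy model of the idea, not as Netty's implementation.

```java
import java.util.ArrayDeque;
import java.util.Iterator;
import java.util.Queue;

// Hypothetical driver: pushes `total` lazily generated chunks through the toy buffer.
final class BackpressureDemo {
    // Returns {peak pending bytes, total drained bytes}.
    static long[] run(final int total, final int chunkSize, long low, long high) {
        WatermarkBuffer out = new WatermarkBuffer(low, high);
        // Chunks are produced on demand, so only queued chunks occupy memory.
        Iterator<byte[]> chunks = new Iterator<byte[]>() {
            private int produced;
            public boolean hasNext() { return produced < total; }
            public byte[] next() { produced++; return new byte[chunkSize]; }
        };
        ChunkedSender sender = new ChunkedSender(out, chunks);
        sender.pump(); // initial burst, stops once the buffer reports "not writable"
        while (!sender.done() || out.pendingBytes() > 0) {
            out.drainOne(); // simulates the receiver consuming one chunk
        }
        return new long[] { sender.peakPendingBytes(), out.drainedBytes() };
    }

    public static void main(String[] args) {
        long[] r = run(100, 1024, 4096, 8192);
        System.out.println("peak pending bytes = " + r[0] + ", drained bytes = " + r[1]);
    }
}

// Toy outbound buffer with low/high water marks (an assumption, not Netty's class).
final class WatermarkBuffer {
    private final Queue<byte[]> pending = new ArrayDeque<byte[]>();
    private final long low;
    private final long high;
    private long pendingBytes;
    private long drainedBytes;
    private boolean writable = true;
    private Runnable listener = new Runnable() { public void run() { } };

    WatermarkBuffer(long low, long high) { this.low = low; this.high = high; }

    boolean isWritable() { return writable; }
    long pendingBytes() { return pendingBytes; }
    long drainedBytes() { return drainedBytes; }
    void setWritabilityListener(Runnable r) { listener = r; }

    void write(byte[] chunk) {
        pending.add(chunk);
        pendingBytes += chunk.length;
        if (writable && pendingBytes > high) { // crossed the high water mark
            writable = false;
            listener.run();
        }
    }

    void drainOne() {
        byte[] chunk = pending.poll();
        if (chunk == null) return;
        pendingBytes -= chunk.length;
        drainedBytes += chunk.length;
        if (!writable && pendingBytes < low) { // fell below the low water mark
            writable = true;
            listener.run();
        }
    }
}

// Writes only while the buffer is writable; resumes when writability changes.
final class ChunkedSender {
    private final WatermarkBuffer out;
    private final Iterator<byte[]> chunks;
    private long peak;

    ChunkedSender(WatermarkBuffer out, Iterator<byte[]> chunks) {
        this.out = out;
        this.chunks = chunks;
        out.setWritabilityListener(new Runnable() { public void run() { pump(); } });
    }

    void pump() {
        while (out.isWritable() && chunks.hasNext()) {
            out.write(chunks.next());
            peak = Math.max(peak, out.pendingBytes());
        }
    }

    boolean done() { return !chunks.hasNext(); }
    long peakPendingBytes() { return peak; }
}
```

In this model the queued data never grows past the high water mark plus one chunk, no matter how many chunks are sent in total, which is the property the unconditional for loop in the question lacks. In a real handler the same loop would presumably live in the handler's writability-changed callback rather than in a listener registered by hand.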
Another thing that could cause what you describe is blocking the EventLoop, which would prevent it from writing. Check with YourKit or VisualVM whether this is the case.
Please use Stack Overflow or our Google group for questions in the future. This is only for bugs :)