The documentation states that the driver return -ENOBUFS when recv() was called with a buffer too small to hold the incoming frame. But two questions are remain to be decided:
Also: Some drivers seem to silently truncate the frame in this case instead of returning an error. This should be tracked an fixed. I suggest to first decide on the questions above and than make sure that all drivers comply with this API
@thomaseichinger, @haukepetersen: After reading the frame from the RX-FIFO the cc2420 treats the next two bytes in the RX-FIFO as appended status information. But this will only work if whole frame was read. In case the supplied buffer was too small the cc2420 silently truncates the frame and misinterprets the next two bytes read from the RX-FIFO (which still belong to the frame) as status information.
It's good that you opened this issue.
My opinion is that the buffer should not be implicitly dropped when a too short buffer is passed, and that the first len bytes of the frame should be written to the buffer in that case. Like @bergzand wrote in https://github.com/RIOT-OS/RIOT/issues/10410#issuecomment-439359438 this opens up for some possibilities regarding frame filtering and processing that may be useful in the future.
(There is no need to return the length of the buffer in the -ENOBUFS case since the caller is probably aware of what len they passed to the function and can thus assume all len bytes of the output buffer have been written by the driver)
@gebart: E.g. the cc2420 is unable to read data from the RX FIFO without removing it from the RX FIFO. Thus, it is impossible to implement that API without sacrificing RAM for an additional buffer on the MCU.
In general I'm in favor of your proposal. But I think need a solution for drivers than can access the incoming frame only once. Making an exception from the API for drivers that do not allow retrieving only parts of a frame would make the API too complex in my opinion. Adding a buffer to cache the retrieved chunk of the frame for subsequent calls to recv() would in my opinion be too wasteful with the already limited amount of RAM.
Even though I do prefer in general that the driver should still retrieve the chunk that fits into the buffer, I would vote against this to make life for drivers like the cc2420 easier.
@maribu You are right. It makes sense to not design an API that forces the driver to allocate another frame buffer in RAM which would most of the time be just wasted space.
Isn't our current practice to first read the frame size, then do a read with the previously returned size?
IMO, in that case, drivers should return as much as possible, drop the rest and return -ENOBUFS, which implicitly means "only \ What's the use case for first reading part of the packet, then reading the full packet?
What comes to mind would be some kind of check whether the packet is actually interesting for the node, and it would want to take a look at the first couple of bytes before deciding to allocate the full amount. To me, the stack would need to be able to allocate the full amount anyways in case the packet turns out to be interesting to the node. I can think of obscure memory saving reasons where a node might rather not free some data unless absolutely necessary, but do we want to design our API around that?
What's the use case for first reading part of the packet, then reading the full packet?
I'm just now reading @bergzand's comment linked above. How long does it take to read a whole frame vs. reading only the link layer addresses?
IMO, in that case, drivers should return as much as possible, drop the rest and return -ENOBUFS, which implicitly means "only
bytes of packet have been written to buffer".
IMHO it can lead to bugs to impose this on a driver. -ENOBUFS should in the best case keep the current frame (so the consumer can try again), and the state of the provided buffer should
Isn't our current practice to first read the frame size, then do a read with the previously returned size?
As far as I know most upper layer code does exactly this. With this practice calling recv() with a too small buffer should never occur, unless the netdev_driver_t::recv() reported a smaller size than actually needed. I would consider reporting a smaller size as a bug - if the netdev_driver_t::recv() cannot know for sure how big the frame will actually be it should in my opinion return the upper bound.
If my assumptions above are correct, the too small buffer case should only occur if some bug was triggered. If this is true, I would be in favor of just returning -ENOBUFS without retrieving a truncated version of the frame, as this would require the least lines of code and ROM size to handle.
I would consider reporting a smaller size as a bug - if the
netdev_driver_t::recv()cannot know for sure how big the frame will actually be it should in my opinion return the upper bound.
As IIRC netdev_tap is doing.
If my assumptions above are correct, the too small buffer case should only occur if some bug was triggered. If this is true, I would be in favor of just returning
-ENOBUFSwithout retrieving a truncated version of the frame, as this would require the least lines of code and ROM size to handle.
Yeah, thinking more about it, I agree. Seems to be the simplest solution. Isn't that also how most drivers currently are implemented?
If at some point we do want to do partial packet reads, we can reconsider, or even just add this as an optional netdev->recv_peek().
@kaspar030 @maribu @miri64 Correct me if I am wrong, but it seems like we have a consensus among the devs who have commented here.
When buf_len < frame_len:
Any other interested devs? @bergzand @gschorcht @aabadie might want to just take a look if they want to add something to the discussion
I found an edge-case where it is not that easy to implement such a behavior (without additional buffering): SLIP. Basically slipdev just reads the UART character-wise to allow for unescaping the stuffed bytes, so I can't go back if the buffer I'm copying the encoded data into is too small for the given packet :-/.
I think because the too-small-buffer-case would in absence of any bug never occur, this case should be handled with the least amount of RAM/ROM and runtime overhead. Taking edge-cases like slipdev and the cc2420 into account, the least effort API would be:
If recv() was called with buf != NULL and buf_len < frame_len:
-ENOBUFS@bergzand: You had a use case in mind for the peek feature. Do you agree in not providing it?
This proposal sounds sane to me. I would however add something like "The state of the data retrieval buffer is not defined on error", so we avoid requiring a reset or something like that in an error case.
@miri64
This proposal sounds sane to me. I would however add something like "The state of the data retrieval buffer is not defined on error", so we avoid requiring a reset or something like that in an error case.
Sounds good to me, the buffer should not contain anything important anyway since the caller was expecting to have it overwritten by the received frame before the error occurred.
The state of the data retrieval buffer is not defined on error
I'm strongly against this. This goal of this issue is to get rid of undefined behavior.
so we avoid requiring a reset or something like that in an error case.
But this would exactly be the outcome of that undefined behavior: A user of the API can no longer expect subsequent frames to be correct. If the buffer is in an undefined state afterwards, the user has to expect that subsequent frames could potentially be mixed with whatever was left in the buffer after the error. The upper layers will have to work around this e.g. by resetting the device to get back to a defined state. The complexity of this work around is clearly greater than bringing the buffer into a defined state in the driver.
My opinion on undefined behavior is that it only leads to increased complexity and bugs.
I'm strongly in favor of dropping the received frame completely in case of this error. Thus, the buffer should contain no information after an error. And subsequent frames are guaranteed to not get mixed up with the partly received frames.
The state of the data retrieval buffer is not defined on error
I'm strongly against this. This goal of this issue is to get rid of undefined behavior.
I think you both are talking about different buffers, if I understand the argumentation correctly.
@miri64 means that the supplied (to recv()) buffer might have some bytes changed, e.g., recv() might touch the buffer even if it is actually dropping the frame. This might help implementations where reading from the device (even if partly) needs a buffer, and it would otherwise, in the error case, need to be allocated on stack. I think mentioning that the buffer might be touched even in the drop case is perfectly acceptable. After all, in gnrc the buffer is probably full with data from the last packet(s) as not all pktbuf allocations are memset(0)'ed.
If I understand @maribu correctly, you're referring to the device's input buffer, and that we have to avoid that being in an undefined state requiring extra handling on the recv() user side? I totally agree with that.
@kaspar030: Thanks for your explanation. Now that I understand what @miri64 meant I totally agree :-)
The state of the data retrieval buffer is not defined on error
I'm strongly against this. This goal of this issue is to get rid of undefined behavior.
I think you both are talking about different buffers, if I understand the argumentation correctly.
@miri64 means that the supplied (to recv()) buffer might have some bytes changed, e.g., recv() might touch the buffer even if it is actually dropping the frame. This might help implementations where reading from the device (even if partly) needs a buffer, and it would otherwise, in the error case, need to be allocated on stack. I think mentioning that the buffer might be touched even in the drop case is perfectly acceptable. After all, in gnrc the buffer is probably full with data from the last packet(s) as not all pktbuf allocations are memset(0)'ed.If I understand @maribu correctly, you're referring to the device's input buffer, and that we have to avoid that being in an undefined state requiring extra handling on the recv() user side? I totally agree with that.
Yes I meant the supplied buffer to the recv() method. Sorry for the confusion.
Yes I meant the supplied buffer to the recv() method. Sorry for the confusion.
No that I know what you meant it is obvious to me. Everybody else understood you right away, so it's me not reading properly ;-)
So, there seems to be an agreement reached?
If recv() was called with buf != NULL and buf_len < frame_len, then:
buf is undefined garbage@maribu :+1: from me
Ok, this PR applies the current consensus to the doc and other similar changes on netdev_driver_t::recv()s documentation. I split each change into a separate commit to allow discussing them one by one and easily split off controversial parts into a separate PR -- if any.
@kaspar030 @gebart can you maybe also have a look at #10657?
Most helpful comment
@kaspar030 @maribu @miri64 Correct me if I am wrong, but it seems like we have a consensus among the devs who have commented here.
When buf_len < frame_len:
Any other interested devs? @bergzand @gschorcht @aabadie might want to just take a look if they want to add something to the discussion