Mbed-os: UDPSocket::sendto with more than 1472 bytes of data sometimes fails

Created on 12 Aug 2016 · 9 Comments · Source: ARMmbed/mbed-os

With two K64F devices using ethernet:

  1. reboot device and connect to network
  2. open a new UDPSocket
  3. call sendto with 1473 bytes or more data
  4. sendto appears to work correctly but nothing is received at the other end

Only the very first sendto fails, and only if sending more than 1472 bytes. In every other case it seems to work fine. Also, if at some point between steps 1 and 3 a TCP connection is established, the sendto at step 3 works.
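
For context, the 1472-byte boundary matches the largest UDP payload that fits in one unfragmented Ethernet frame. A minimal sketch of the arithmetic, assuming a standard 1500-byte MTU, a 20-byte IPv4 header without options, and the 8-byte UDP header (these values are not stated in the report, and the function names are illustrative only):

```cpp
#include <cstddef>

// Assumed link parameters (not taken from the report).
constexpr std::size_t kMtu = 1500;        // standard Ethernet MTU
constexpr std::size_t kIpv4Header = 20;   // IPv4 header without options
constexpr std::size_t kUdpHeader = 8;     // fixed UDP header size

// Payload carried by each non-final IPv4 fragment; must be a multiple of 8.
constexpr std::size_t kFragmentPayload = ((kMtu - kIpv4Header) / 8) * 8;

// Largest UDP payload that fits in one unfragmented IPv4 packet.
constexpr std::size_t max_unfragmented_udp_payload() {
    return kMtu - kIpv4Header - kUdpHeader;
}

// Number of IPv4 fragments needed for a UDP datagram with this payload size.
// The UDP header is sent once; every fragment has its own IPv4 header.
constexpr std::size_t ipv4_fragment_count(std::size_t udp_payload) {
    return (udp_payload + kUdpHeader + kFragmentPayload - 1) / kFragmentPayload;
}
```

Under these assumptions a 1472-byte payload fits in one packet, a 1473-byte payload needs two fragments, and a 3000-byte payload needs three.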

closed_in_jira mirrored tracking bug

All 9 comments

> sendto appears to work correctly but nothing is received at the other end

What is the return value from the sendto call? Is the UDP packet being truncated?

> What is the return value from the sendto call? Is the UDP packet being truncated?

The return value is the value of the size parameter it was called with, i.e. it appears to have successfully transmitted all of the given data.
Any subsequent sendto with up to 3000 bytes of data (at least; haven't tried more) is received with no issue, i.e. the data hasn't been truncated or corrupted and is received with a single recvfrom call, as long as the rx buffer is big enough.

ARM Internal Ref: IOTMORF-441

I'm having a hard time reproducing this. The following code seems to work fine:

```cpp
#include "mbed.h"
#include "EthernetInterface.h"

// One byte more than fits in a single Ethernet frame
uint8_t data[1473] = "Hello!";
EthernetInterface iface;

int main() {
    iface.connect();

    UDPSocket udp(&iface);
    // Send to the echo service by hostname, then wait for the echoed datagram
    int send = udp.sendto("echo.u-blox.com", 7, data, sizeof(data));
    printf("send: %d\n", send);
    int recv = udp.recvfrom(NULL, data, sizeof(data));
    printf("recv: %d\n", recv);
}
```

outputs:

```
send: 1473
recv: 1473
```

Do you have an example where this fails? Could you try the above example on your network to see if the issue might be a local one?

UDP is allowed to drop packets. Since it consistently works after the first failure, I'm wondering if your network may actually be behaving poorly.

Have you tried this example with the actual IP?
Your example works as it is, but if I replace the hostname with the IP address 195.34.89.241, it doesn't.

@geky bump

Sorry for the long wait! After significant investigation (and learning a lot about the link layer), I think I've tracked down what's going on.

I do see the error with a literal IP address. Tracing packets shows that the >1472-byte packet never makes it down to the link layer, indicating the issue lies somewhere in lwip.

Comparing the successful/unsuccessful cases I was able to narrow down the specific change in behaviour to the following condition in the etharp part of lwip:
https://github.com/ARMmbed/mbed-os/blob/master/features/FEATURE_LWIP/lwip-interface/lwip/src/core/ipv4/lwip_etharp.c#L1216

Interestingly, it looks like lwip is trying to find the MAC address of the route for the packet.

If lwip can't find the MAC address in the ARP cache, it queues the Ethernet frame and sends an ARP request. When the IP packet fits in a single Ethernet frame (DNS queries and UDP payloads of 1472 bytes or less), everything works fine: the queued frame is sent once the ARP reply is received. However, if the IP packet must be fragmented (payloads larger than 1472 bytes), lwip keeps only the last Ethernet frame in the queue, effectively corrupting the UDP packet because the other fragments are never sent.
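
The queueing behaviour described above can be illustrated with a toy model. This is not lwip code: the class, its single-slot policy, and all names are hypothetical, written only to show why a fragmented datagram is lost while a single-frame datagram survives the ARP wait.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Toy model of frames parked while an ARP lookup is pending (hypothetical).
class PendingArpQueue {
public:
    // full_queueing == false: keep only the most recently queued frame.
    // full_queueing == true: keep every frame, as described for ARP_QUEUEING=1.
    explicit PendingArpQueue(bool full_queueing) : full_queueing_(full_queueing) {}

    // Park one outgoing frame, identified here by its fragment index.
    void enqueue(int fragment_id) {
        if (!full_queueing_) {
            pending_.clear();  // the newer frame replaces the older one
        }
        pending_.push_back(fragment_id);
    }

    // Called when the ARP reply arrives: return every frame still parked.
    std::vector<int> flush() {
        std::vector<int> sent(pending_.begin(), pending_.end());
        pending_.clear();
        return sent;
    }

private:
    bool full_queueing_;
    std::deque<int> pending_;
};

// True if all fragments of one datagram are still queued when ARP resolves.
bool datagram_survives(bool full_queueing, int fragment_count) {
    assert(fragment_count > 0);
    PendingArpQueue queue(full_queueing);
    for (int i = 0; i < fragment_count; ++i) {
        queue.enqueue(i);
    }
    return queue.flush().size() == static_cast<std::size_t>(fragment_count);
}
```

In this model `datagram_survives(false, 1)` holds, `datagram_survives(false, 2)` does not, and `datagram_survives(true, 2)` holds again, which mirrors the three cases discussed in the thread.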

Note: If the route has been established by a previous ARP request, no queueing is necessary, and this issue doesn't appear.

Note: Even if the UDP packet is corrupted, the ARP request later succeeds, so sending the same UDP packet a second time will reach its destination.


UDP-based protocols must be able to handle lost packets. It's a bit idiosyncratic, but dropping this packet is legal and is actually an optimization by lwip to reduce its memory footprint.
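
One way an application can tolerate the lost first datagram is to retry until it sees an application-level acknowledgement. The following is a rough sketch only: `send_with_retry` is a hypothetical helper, not part of mbed-os or lwip, and the two callbacks stand in for whatever send call and acknowledgement check (for example, a receive that times out) the application actually uses.

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper: resend a datagram until `delivered` reports success
// or the attempts run out.
//   send_once  transmits the datagram once.
//   delivered  waits for the application's own acknowledgement and returns
//              true if it arrived.
// Returns the number of attempts used, or 0 if every attempt failed.
int send_with_retry(const std::function<void()>& send_once,
                    const std::function<bool()>& delivered,
                    int max_attempts) {
    assert(max_attempts > 0);
    for (int attempt = 1; attempt <= max_attempts; ++attempt) {
        send_once();
        if (delivered()) {
            return attempt;
        }
    }
    return 0;
}
```

With a link that drops only the first datagram, as in this issue, such a loop would succeed on its second attempt.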

That being said, lwip does have an option to enable full queueing of ethernet frames on ARP request. The define ARP_QUEUEING controls this feature. If needed you can simply pass -DARP_QUEUEING=1 to mbed compile (note that you may still lose the UDP packet anywhere on the route).

Here's the cost of the socket demo with/without ARP_QUEUEING defined:

| Build | .text (bytes) | .data (bytes) | .bss (bytes) |
| --- | --- | --- | --- |
| with ARP_QUEUEING | 130816 | 4164 | 63184 |
| without ARP_QUEUEING | 130616 | 4164 | 61616 |
| difference | 200 | 0 | 1568 |

Alrighty then, that all makes sense. Thanks for digging into it :)

Closing the issue since it's working as intended.
