Node: possible libuv issue with macos 10.15: empty udp message not registering

Created on 19 Oct 2019 · 23 Comments · Source: nodejs/node

  • Version:
    v12.12.0
  • Platform:
    macos catalina/10.15 (19A602)
  • Subsystem:
    n/a

    When sending an empty udp message, the message will not be received until another non-empty udp message is received.

Example/Reproduce: (may only affect macos 10.15?)

file server.js

var PORT = 33333;
var HOST = '127.0.0.1';

var dgram = require('dgram');
var server = dgram.createSocket('udp4');

server.on('listening', function() {
  var address = server.address();
  console.log('UDP Server listening on ' + address.address + ':' + address.port);
});

server.on('message', function(message, remote) {
  console.log(remote.address + ':' + remote.port + ' - ' + message);
});

server.bind(PORT, HOST);

file client_test.js

var PORT = 33333;
var HOST = '127.0.0.1';

var dgram = require('dgram');


var client = dgram.createSocket('udp4');
var message = Buffer.from('abc');

client.send(message, 0, message.length, PORT, HOST, function(err, bytes) {
  if (err) throw err;
  console.log('UDP message sent to ' + HOST +':'+ PORT);
  client.close();
});

file client_0.js

var PORT = 33333;
var HOST = '127.0.0.1';

var dgram = require('dgram');


var client = dgram.createSocket('udp4');
var message = Buffer.from('');

client.send(message, 0, message.length, PORT, HOST, function(err, bytes) {
  if (err) throw err;
  console.log('UDP message sent to ' + HOST +':'+ PORT);
  client.close();
});

Run the server code, then run client_0.js: the server prints nothing.
Then run client_test.js: the server now prints both the empty message and the "abc" message.
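
For convenience, the three files above can be collapsed into one script. This is only a minimal sketch and not part of the original report: the helper name probeEmptyDatagram, the OS-assigned port (port 0) and the 300 ms wait are assumptions chosen for illustration. It sends an empty datagram, waits, notes whether it arrived on its own, then sends "abc" and records the payload lengths in arrival order.

```javascript
const dgram = require('dgram');

// Hypothetical helper (not from the original report): resolves with what the
// receiving socket observed.
function probeEmptyDatagram(waitMs = 300) {
  return new Promise((resolve, reject) => {
    const server = dgram.createSocket('udp4');
    const client = dgram.createSocket('udp4');
    const lengths = [];            // payload length of each datagram, in arrival order
    let emptyArrivedAlone = false; // did the empty datagram show up before "abc" was sent?
    let done = false;

    const finish = () => {
      if (done) return;
      done = true;
      client.close();
      server.close();
      resolve({ emptyArrivedAlone, lengths });
    };

    server.on('error', reject);
    client.on('error', reject);
    server.on('message', (msg) => {
      lengths.push(msg.length);
      // The non-empty datagram is sent last, so stop once it shows up.
      if (msg.length > 0) finish();
    });

    // Port 0 asks the OS for any free port (assumption; the report used 33333).
    server.bind(0, '127.0.0.1', () => {
      const { port } = server.address();
      client.send(Buffer.alloc(0), port, '127.0.0.1', (err) => {
        if (err) return reject(err);
        setTimeout(() => {
          emptyArrivedAlone = lengths.length === 1 && lengths[0] === 0;
          client.send(Buffer.from('abc'), port, '127.0.0.1', (err2) => {
            if (err2) return reject(err2);
            // Safety net in case "abc" never arrives.
            setTimeout(finish, 1000).unref();
          });
        }, waitMs);
      });
    });
  });
}

probeEmptyDatagram().then((result) => console.log(result));
```

On a system showing the behaviour described above, emptyArrivedAlone would be expected to be false while lengths still ends up containing both datagrams; on an unaffected system it would be expected to be true.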

confirmed-bug dgram macos

All 23 comments

Does this also happen with Node 12.11.1?

@addaleax Yes, just tried with the binary using https://nodejs.org/dist/v12.11.1/

/cc @nodejs/platform-macos

@daviehh Have you tried recreating this with e.g. Python? I'm not on a Mac, but I have the suspicion that this may actually be just a platform quirk of macOS that we can't do anything about.

@addaleax I have actually tried Python, empty udp message works there:

receiving:

import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("127.0.0.1", 6000))

while True:
    data, addr = sock.recvfrom(1024)
    print("received message:", data)

sending:

import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.sendto(b"", ("127.0.0.1", 6000))

It seems our test suite also fails due to this on macOS Catalina.

=== release test-dgram-connect-send-empty-buffer ===                          
Path: parallel/test-dgram-connect-send-empty-buffer
Command: out/Release/node /Users/luigi/code/node/test/parallel/test-dgram-connect-send-empty-buffer.js
--- TIMEOUT ---
=== release test-dgram-connect-send-empty-array ===        
Path: parallel/test-dgram-connect-send-empty-array
Command: out/Release/node /Users/luigi/code/node/test/parallel/test-dgram-connect-send-empty-array.js
--- TIMEOUT ---
=== release test-dgram-connect-send-empty-packet ===       
Path: parallel/test-dgram-connect-send-empty-packet
Command: out/Release/node /Users/luigi/code/node/test/parallel/test-dgram-connect-send-empty-packet.js
--- TIMEOUT ---
=== release test-dgram-send-empty-array ===                
Path: parallel/test-dgram-send-empty-array
Command: out/Release/node /Users/luigi/code/node/test/parallel/test-dgram-send-empty-array.js
--- TIMEOUT ---
=== release test-dgram-send-empty-buffer ===               
Path: parallel/test-dgram-send-empty-buffer
Command: out/Release/node /Users/luigi/code/node/test/parallel/test-dgram-send-empty-buffer.js
--- TIMEOUT ---
=== release test-dgram-send-empty-packet ===
Path: parallel/test-dgram-send-empty-packet
Command: out/Release/node /Users/luigi/code/node/test/parallel/test-dgram-send-empty-packet.js
--- TIMEOUT ---
[03:46|% 100|+ 2791|-   6]: Done

Thanks! I first noticed this issue with a beta version of another language that uses libuv (julia), so it's likely that it's a libuv bug; is it good for the bug report to be here or is it better to report to libuv? Not sure how to write a simple c code that directly uses libuv to reproduce the issue though.

Yeah, I've reproduced the issue in libuv. Apparently the message is actually sent (I've checked it with tcpdump) but not received.

It looks like a problem with kqueue not reporting the event. The Python example does not seem to apply to our case, as it blocks in the recvfrom syscall rather than waiting for a readiness event.

@nodejs/libuv Has anybody reported this to Apple yet? I'm assuming that that's the thing to do here, no?

Yes... except sending a bug report to Apple is like throwing matter into a black hole: it disappears and is never seen again.

I have a channel to get some additional information from Apple given the bug report number. Has anyone reported it?

Oh... I just realized that I might have reported this in 2014 and provided a fix. Here is a repo (by another person) that reproduces the issue using an outdated libuv version: https://github.com/misterdjules/udp-empty-dgram-repro

So this just regressed then? If nobody has reported a new radar, I'm happy to do that.

Looks like Apple retired radar, but I've filed this using the new system as FB7503750.

IMO, our tests should be reworked so they do not require platforms to support zero-length UDP datagrams, and instead check that such sends either work or at least do not throw JS/C++ errors because our code is broken.

For platforms that do support them, it's nice that node and libuv do as well, but it's not exactly a critical feature. I'm not aware of any actual uses for zero-length UDP other than port-knocking during network intrusion, or crashing unsuspecting UDP listeners that insufficiently validate received packet sizes.

Making the tests auto-detect what is happening, and pass whether or not the packets are received seems like a long-term robust solution, and will get OS X 10.15 out of perma-yellow.

Thoughts?

cc: @AshCripps

@sam-github You could rework the tests to do that. However, from what I understand, the test is supposed to pass on Mac: Apple introduced a regression in Catalina, rather than the feature being unsupported.

@sam-github Turns out these tests used to be skipped entirely on Apple platforms when the bug first appeared in macOS 10.9. Skipping them again would get the CI out of perma-yellow, but the skip would need to be reverted if Apple ever fixes it. I'm going to try to find out tomorrow how the Apple bug report is progressing and whether anything is being done about it.

https://github.com/nodejs/node/pull/22546 (Can only find the PR to remove the skip)

Apple has told me that they have identified a fix and will ship it in a future OS update, but have not provided any further details than that.

@keno At least that's something, thanks for letting us know!

22546 (Can only find the PR to remove the skip)

The skip is very old and was added in the old repository (see https://github.com/nodejs/node/issues/30030#issuecomment-566755213).

I'm seeing multiple instances of our test suite not respecting the mark of flaky in these tests, any idea what's up?

https://ci.nodejs.org/job/node-test-commit-osx/32536/#showFailuresLink

nvm failure is due to test-timers-blocking-callback

The issue seems to be fixed in macOS 10.15.4
