Expected behavior: epoll_wait works as it did in 4.1.29.
Actual behavior: epoll_wait fails when the optimisation added in #7816 is used. timerfd_settime() is called with the following itimerspec:
Thread 26 "java" hit Catchpoint 1 (returned from syscall timerfd_settime), 0x00007faff30e15da in timerfd_settime () at ../sysdeps/unix/syscall-template.S:78
78 in ../sysdeps/unix/syscall-template.S
(gdb) print (struct itimerspec) *$rdx
$9 = {it_interval = {tv_sec = 0, tv_nsec = 0}, it_value = {tv_sec = -1, tv_nsec = -1}}
This is a continuation of the problem reported in #8348.
By my reading of the code, timerfd_settime shouldn't be called when both tvSec and tvNsec are -1. That makes me suspect that this if statement is not evaluating as expected: https://github.com/netty/netty/blob/44c3b824ecabe5d4bf52d3a9a7a30c456ee391b8/transport-native-epoll/src/main/c/netty_epoll_native.c#L187-L196
Steps to reproduce: TBD
Minimal reproducer: Sorry, I don't have one yet. I will update the issue if/when I do.
Netty version: 4.1.31.Final
JVM version (java -version):
openjdk version "1.8.0_181"
OpenJDK Runtime Environment (build 1.8.0_181-8u181-b13-0ubuntu0.18.04.1-b13)
OpenJDK 64-Bit Server VM (build 25.181-b13, mixed mode)
OS version (uname -a):
Linux ubuntu 4.15.0-29-generic #31-Ubuntu SMP Tue Jul 17 15:39:52 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
@wilkinsona I have no idea what is going on... I was not able to reproduce this with an isolated test case that passes in -1:
https://github.com/netty/netty/pull/8447
Are you sure that there isn't somehow a different native lib on the class path?
I'm as sure as I can be, but there must be something strange going on as I've been unable to reproduce the problem whenever I try to isolate it. I've got things set up in my Ubuntu VM so that I can build Netty, including the epoll transport, from source. I'm going to try and add some caveman debugging to see if I can figure out what's going on. I haven't been able to glean much via gdb.
Thanks... let me know if there is anything I can do.
Well, this is embarrassing. Your suspicion that there was a different native lib on the class path was exactly right. Sorry. Micrometer's StatsD module shades Netty but leaves the native /META-INF/native/libnetty_transport_native_epoll_x86_64.so in its default location. That one's first on the classpath so it's found and loaded by NativeLibraryLoader.
@wilkinsona doh! Thanks for closing the loop on this, this makes sense. At least we have one more unit test now :)
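For anyone who hits something similar: a quick way to check for the shaded-jar situation described above is to ask the class loader for every copy of the native library resource it can see. The following is only a minimal sketch, not part of Netty. The class name NativeLibCopies and the method locate are hypothetical names made up for illustration, and the sketch assumes the library is resolved as a plain classpath resource at the path quoted in the comment above. It uses only the standard ClassLoader.getResources call.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical diagnostic helper (not a Netty class): lists every classpath
// entry that provides a given resource, in the order the class loader reports them.
class NativeLibCopies {

    // Resource path as quoted in the thread; no leading slash for ClassLoader lookups.
    static final String EPOLL_LIB =
            "META-INF/native/libnetty_transport_native_epoll_x86_64.so";

    // Returns the URLs of all copies of the resource visible to the given loader.
    static List<URL> locate(ClassLoader loader, String resource) throws IOException {
        return Collections.list(loader.getResources(resource));
    }

    public static void main(String[] args) throws IOException {
        List<URL> copies = locate(NativeLibCopies.class.getClassLoader(), EPOLL_LIB);
        for (URL url : copies) {
            System.out.println(url);
        }
        if (copies.size() > 1) {
            // More than one jar ships the library, so the one that gets loaded
            // may not belong to the Netty version you expect.
            System.out.println("Found " + copies.size() + " copies of " + EPOLL_LIB);
        }
    }
}
```

Run it with the same classpath as the failing application. If more than one URL is printed, the jar names in those URLs should show which dependency is shipping the extra copy.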