Tokio: poll_write() wakes custom waker while returning Ready(Ok(X))?

Created on 4 Feb 2020  路  8Comments  路  Source: tokio-rs/tokio

Tokio version 0.2.11 on x86_64 Linux 5.3 in debug.

I'm not actually sure if this is a bug, but I found it super confusing, and the docs on exactly how a future should be woken aren't super well-specified here.

I wrote a custom waker/Context which is passed to tokio::io::split::WriteHalf's poll_write (no, not tokio::tcp::WriteHalf, as I need them to exist independently without the &), and on the first time poll_write is called it calls wake_by_ref before then returning Read(Ok(N)). I somewhat expected, based on the docs for std::futures, that it would only call my wakers after returning Pending when there are bytes available to be sent (which appears to maybe be the case after the first call).

st for the parts in tokio:

   9: core::task::wake::Waker::wake
             at /usr/src/rustc-1.40.0/src/libcore/task/wake.rs:252
  10: tokio::sync::task::atomic_waker::AtomicWaker::wake
             at /home/matt/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.11/src/sync/task/atomic_waker.rs:243
  11: tokio::io::driver::Inner::register
             at /home/matt/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.11/src/io/driver/mod.rs:292
  12: tokio::io::registration::Registration::poll_ready
             at /home/matt/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.11/src/io/registration.rs:228
  13: tokio::io::registration::Registration::poll_write_ready
             at /home/matt/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.11/src/io/registration.rs:193
  14: tokio::io::poll_evented::PollEvented<E>::poll_write_ready
             at /home/matt/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.11/src/io/poll_evented.rs:308
  15: tokio::net::tcp::stream::TcpStream::poll_write_priv
             at /home/matt/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.11/src/net/tcp/stream.rs:673
  16: <tokio::net::tcp::stream::TcpStream as tokio::io::async_write::AsyncWrite>::poll_write
             at /home/matt/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.11/src/net/tcp/stream.rs:836
  17: <tokio::io::split::WriteHalf<T> as tokio::io::async_write::AsyncWrite>::poll_write
             at /home/matt/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.11/src/io/split.rs:128
A-tokio C-bug M-io

All 8 comments

@TheBlueMatt I think what you are noticing is something that is in the process of being removed in #2221.

If the change goes through as is, then the first time poll_write is called it will not call wake_by_ref--instead just ensuring it's waker is registered and returning Ready(Ok(n)).

Does that answer your question?

@TheBlueMatt now that #2221 has landed, this _should_ no longer be happening. Can you give it a try on master?

Indeed, this seems to be fixed on master, thanks!

Hit this issue again on a release build (so sadly backtrace is...less than useful) on Tokio v0.2.13, though it does seem rather hard to hit (instead of reliably every time).

#1  0x0000ffffbc400ea0 in __GI___pthread_mutex_lock (mutex=0xffffb0002220) at pthread_mutex_lock.c:80
#2  0x0000aaaab4a757b8 in lightning_net_tokio::wake_socket_waker_by_ref ()
# Calls  <tokio::io::split::WriteHalf<T> as tokio::io::async_write::AsyncWrite>::poll_write
#3  0x0000aaaab4a75b04 in <lightning_net_tokio::SocketDescriptor as lightning::ln::peer_handler::SocketDescriptor>::send_data ()
# async task spawned by tokio::spawn that wakes every 60 seconds

Note that in this case I don't actully know if it was going to return Ready() or some other value - the above lock is a deadlock.

Can you try to put this in your Cargo.toml to obtain a more helpful backtrace in case you hit it again?

[profile.release]
debug = true

Hmm, sadly that didn't do much for readability of the st, but here it is:

#5  lightning_net_tokio::wake_socket_waker_by_ref (orig_ptr=0xffff78000fe0) at /root/.cargo/git/checkouts/rust-lightning-61cf011b3b289b3b/d16c1a9/lightning-net-tokio/src/lib.rs:380
#6  0x0000aaaab34577d4 in core::task::wake::Waker::wake_by_ref (self=<optimized out>) at /usr/src/rustc-1.40.0/src/libcore/task/wake.rs:267
#7  tokio::io::split::Inner<T>::poll_lock (self=<optimized out>, cx=0xffff84ef6bf0) at /root/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.13/src/io/split.rs:159
#8  <tokio::io::split::WriteHalf<T> as tokio::io::async_write::AsyncWrite>::poll_write (cx=0xffff84ef6bf0, buf=..., self=...) at /root/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.13/src/io/split.rs:127
#9  <lightning_net_tokio::SocketDescriptor as lightning::ln::peer_handler::SocketDescriptor>::send_data (self=<optimized out>, data=..., resume_read=<optimized out>) at /root/.cargo/git/checkouts/rust-lightning-61cf011b3b289b3b/d16c1a9/lightning-net-tokio/src/lib.rs:459

Looking at the source, I have a hard time seeing how this can happen, and this has been inactive for a long time. I am closing it. Feel free to reopen if you can still reproduce it.

I've worked around this issue in my own stuff so it may definitely still be happening but I wont hit it anymore.

Was this page helpful?
0 / 5 - 0 ratings