Tokio: Explore improvements to the thread pool scheduling logic

Created on 21 Feb 2018  ·  12 comments  ·  Source: tokio-rs/tokio

The PR (#141) that introduced the Tokio thread pool was an initial implementation. As such, it is not yet optimal.

This issue tracks general improvements to the scheduling algorithm. The discussion on the original PR included a number of good ideas.

All 12 comments

/cc @stjepang, @jeehoonkang, @alkis

Always happy to get help with this :)

To make meaningful contributions to this we need some profiling data of high performance asynchronous applications. Any pointers to such data?

We work on Conduit: http://github.com/runconduit/conduit (a very fancy proxy), but max performance hasn't been a huge focus yet.

As a consumer of tokio, I can try and offer up profiling data for the project I'm building -- https://github.com/nuclearfurnace/synchrotron -- which is an L7 load balancer specifically designed for caches.

You may have seen me in the Tokio gitter lately dealing with trying to eke out as much performance as I can from Tokio, and practically everything I test is about maximum throughput at the lowest possible latencies.

Happy to be a guinea pig if that would help. 👍

Since we're switching from single-global-reactor to reactor-per-worker model in #660, I've been thinking whether we should change our task spawning strategy.

Currently, we spawn new tasks like this:

  • If we're currently in the thread pool, add the task to the current worker's queue.
  • If we're outside the thread pool, pick a random worker and add the task to its queue.

There's a problem with this strategy. If a single task spawns a bunch of new I/O tasks, they will all be added to the current worker's queue. A disproportionate number of these tasks might be polled the first time by the current worker and thus their I/O resources will be assigned to the current reactor.

I believe we should always spawn new tasks onto a random worker, regardless of whether we're inside the thread pool or not. This way I/O resources will get assigned to all reactors more evenly.
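To make the difference between the two strategies concrete, here is a minimal simulation sketch. It is not Tokio's implementation: `Placement`, `XorShift` and `spawn_counts` are hypothetical names, and the model only counts where tasks are first queued, assuming a task's I/O resources end up on the reactor of the worker that first polls it.

```rust
/// Where a newly spawned task is queued. Hypothetical names, not Tokio's API.
#[derive(Clone, Copy)]
enum Placement {
    /// Queue on the spawning worker (the current in-pool behaviour described above).
    CurrentWorker,
    /// Queue on a uniformly chosen worker (the proposed behaviour).
    RandomWorker,
}

/// Tiny xorshift64 generator so the sketch needs no external crates.
struct XorShift(u64);

impl XorShift {
    fn next(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0
    }
}

/// Simulates one task running on worker `current` that spawns `tasks` new
/// tasks, and returns how many of them land in each worker's queue.
fn spawn_counts(
    placement: Placement,
    workers: usize,
    current: usize,
    tasks: usize,
    seed: u64,
) -> Vec<usize> {
    // xorshift must not start from zero, so clamp the seed to at least 1.
    let mut rng = XorShift(seed.max(1));
    let mut counts = vec![0; workers];
    for _ in 0..tasks {
        let target = match placement {
            Placement::CurrentWorker => current,
            Placement::RandomWorker => (rng.next() % workers as u64) as usize,
        };
        counts[target] += 1;
    }
    counts
}
```

With `Placement::CurrentWorker`, every spawned task lands on the spawning worker's queue. With `Placement::RandomWorker`, the same burst is spread across all workers. Work stealing would soften the first case in a real pool, so treat this only as an illustration of the initial placement.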

We might come up with a better solution in the future, but this is what I've come up with so far. Some experiments I did locally confirm that the suggested strategy scales better. I should publish benchmarks sometime...

A few ideas on how to optimize the way the reactor notifies the thread pool about readied tasks:

  • In the Notify::notify implementation for Notifier, we always upgrade a Weak<Pool> into Arc<Pool>, which is a lot of contended refcounting. We should be smarter here.

  • After a successful .turn() inside reactor, most of the readied tasks will go into the same pool and the same worker. We could probably optimize this procedure by batching notifications together.
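The batching idea could look roughly like the sketch below. It assumes a simplified model in which each worker has a lock-protected run queue and a reactor turn yields a list of `(worker, task)` pairs. `Worker`, `TaskId` and `notify_batched` are hypothetical names and do not correspond to the actual thread pool internals.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical task identifier.
type TaskId = usize;

/// Hypothetical per-worker run queue guarded by a lock.
struct Worker {
    queue: Mutex<Vec<TaskId>>,
}

/// Delivers the tasks readied by one reactor turn. Tasks are first grouped
/// by destination worker, so each worker's lock is taken once per batch
/// rather than once per task. Returns the number of lock acquisitions.
fn notify_batched(workers: &[Worker], readied: &[(usize, TaskId)]) -> usize {
    let mut batches: HashMap<usize, Vec<TaskId>> = HashMap::new();
    for &(worker, task) in readied {
        batches.entry(worker).or_default().push(task);
    }

    let mut locks = 0;
    for (worker, tasks) in batches {
        // One lock acquisition delivers the whole batch for this worker.
        workers[worker].queue.lock().unwrap().extend(tasks);
        locks += 1;
    }
    locks
}
```

If most readied tasks go to the same worker, as suggested above, the number of synchronized operations per turn drops from one per task to roughly one per destination worker. The same grouping step would also be a natural place to upgrade the `Weak<Pool>` once per batch instead of once per notification, though that part is not shown here.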

> I believe we should always spawn new tasks onto a random worker, regardless of whether we're inside the thread pool or not. This way I/O resources will get assigned to all reactors more evenly.

In some cases you want this behaviour, for very high-performance network services.

On Linux (and many other OSes, but I do not know the details of those) you can configure a network card with multiple queues, each of which has its own IRQ, then bind each IRQ to a CPU core. Userland code can then set the SO_REUSEPORT option on a TCP socket, and bind multiple TcpListeners, one per thread, to the _same_ address. Each thread should be bound to one CPU core (pthread_setaffinity_np()). The kernel will then load-balance incoming packet flows over those threads in the way that is most efficient for network processing.

So there should be a way to express "for this TcpListener I want all tasks spawned from it to always run on the reactor it was scheduled on first" to be able to use this setup. Also a way to schedule a new task on a specific worker (to initially put one TcpListener on each thread).
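As a rough illustration of "schedule a new task on a specific worker", here is a minimal sketch using only the standard library. It is not a Tokio API: `PinnedPool`, `spawn_on` and `shutdown` are hypothetical names, jobs are plain closures rather than futures, and CPU pinning and SO_REUSEPORT setup are left out.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread::{self, JoinHandle};

/// A unit of work; a real runtime would queue futures instead of closures.
type Job = Box<dyn FnOnce() + Send + 'static>;

/// Hypothetical pool in which every worker thread owns a private queue,
/// so a job submitted to worker `i` only ever runs on worker `i`.
struct PinnedPool {
    senders: Vec<Sender<Job>>,
    handles: Vec<JoinHandle<()>>,
}

impl PinnedPool {
    fn new(workers: usize) -> Self {
        let mut senders = Vec::new();
        let mut handles = Vec::new();
        for _ in 0..workers {
            let (tx, rx) = channel::<Job>();
            senders.push(tx);
            // A real setup would also bind this thread to a CPU core here.
            handles.push(thread::spawn(move || {
                // Runs jobs until every sender for this queue is dropped.
                for job in rx {
                    job();
                }
            }));
        }
        PinnedPool { senders, handles }
    }

    /// Queues `job` on one specific worker instead of a random one.
    fn spawn_on<F: FnOnce() + Send + 'static>(&self, worker: usize, job: F) {
        self.senders[worker]
            .send(Box::new(job))
            .expect("worker has exited");
    }

    /// Closes all queues and waits for the workers to drain them.
    fn shutdown(self) {
        drop(self.senders);
        for handle in self.handles {
            handle.join().expect("worker panicked");
        }
    }
}
```

In the setup described above, one listener would be placed on each worker with `spawn_on`, and every task created while handling its connections would be sent to that same worker index, so a connection never migrates away from the thread that accepted it.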

I do realize this is an optimization for only a few specific applications, so I don't expect anyone to implement this right now. I just wanted to throw the idea out there.

I am working on such a high-performance application and right now I'm simply running multiple threads, each with its own current_thread reactor. It's not yet at the stage where I can do extensive benchmarking.

@miquels that is pretty much exactly the use-case that tokio-io-pool was designed for. You may want to give it a whirl!

@jonhoo I'm not sure, it sounds like @miquels specifically wants to always spawn new tasks on the current thread executor. I believe tokio-io-pool spawns to a random thread as well.

Nope — tokio-io-pool spawns "top-level" tasks (anything spawned with Handle::spawn or Runtime::spawn) randomly, but any subsequent tasks spawned with tokio::spawn are placed on the same thread.

Ah! Well, if it proves to be useful, the threaded runtime can most likely take an option.

I'm going to close this issue. It is fairly vague, and the logic has already improved over time thanks to @stjepang's work.
