io_uring is a new method to perform efficient I/O on Linux systems. It provides a completion model (rather than a readiness model), similar to what IOCP on Windows provides, and unlike the standard poll-like interfaces, it can be used to request I/O from regular files as well (and, unlike the old/broken AIO in Linux, it doesn't require files to be opened in O_DIRECT mode).
It is a recent development, but reports of it being used by servers are very promising, often yielding gains exceeding 2 or 4x in throughput. Here's a talk by its main author with details, including benchmarks.
In addition to I/O (read/write/poll), it's also possible to handle connections (accept/connect) and a bunch of other things.
It should be possible to enable this and have both io_uring and epoll (as a fallback) in pal_networking.
Going by the PDF, polled I/O seems to be the best-suited option for PAL networking, because it is efficient, closer to the epoll implementation, and does not require elevated privileges (as the 'kernel side polling' option does). A few questions:
I'd say depend on liburing. Doing stuff by hand is possible but we would be essentially replicating it inside the runtime; better to stick with something that's been debugged and tested already. I don't know how much they care about API and ABI compatibility at this point, so using it as a shim might not be a good idea; maybe using a git submodule?
As for the minimum kernel requirement: for io_uring, we should support 5.4+ only, falling back to epoll on older kernel versions. There were many improvements in the 5.5 series too, so eventually we might even bump the requirement if we end up taking advantage of these features, just to simplify how we implement stuff -- for instance, async file I/O and not only sockets. (This kernel is still not common in most distributions, but it would be nice if the performance just appeared out of the blue after a kernel upgrade.)
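To make the "5.4+ or fall back to epoll" rule concrete, here is a minimal sketch of a version gate. The names release_at_least and kernel_supports_io_uring_backend are made up for illustration and are not existing runtime functions. It also assumes that comparing the uname release string is acceptable; distributions backport features, so treat this as a rough gate only.

```c
#include <stdio.h>
#include <sys/utsname.h>

/* Returns 1 if a release string such as "5.4.0-42-generic" is at least
 * want_major.want_minor, and 0 if it is older or cannot be parsed. */
int release_at_least(const char *release, int want_major, int want_minor)
{
    int major = 0, minor = 0;

    if (sscanf(release, "%d.%d", &major, &minor) != 2)
        return 0;                      /* unparseable: be conservative */
    if (major != want_major)
        return major > want_major;
    return minor >= want_minor;
}

/* Hypothetical helper: 1 when the running kernel reports 5.4 or newer,
 * 0 otherwise (in which case the caller would stay on epoll). */
int kernel_supports_io_uring_backend(void)
{
    struct utsname u;

    if (uname(&u) != 0)
        return 0;                      /* cannot tell: stay on epoll */
    return release_at_least(u.release, 5, 4);
}
```

If the requirement is later bumped to the 5.5 series, only the two numbers passed to release_at_least would change.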
Possible dupe of:
https://github.com/dotnet/coreclr/issues/24441
This situation with the issues not yet ported is starting to generate noise...
Indeed it's a dupe, @damageboy. (I'll keep this issue open here as it might be easier to reference it and it's unlikely a lot of folks will keep a close eye on the coreclr repo after the consolidation.)
@lpereira Aren't the issues moving? Has anything changed?
They're moving, but it should take a month or so. I can close this one once the move is complete (can't easily mark as dupe in different repos.)
It should be possible to enable this and have both io_uring and epoll (as a fallback) in pal_networking.
I think pal_networking, coming from corefx, deserves a separate issue, as there is a defined/finite surface area which is currently using epoll where io_uring can be incorporated. It can be tracked here.
The coreclr issue is a broader discussion on how to make use of io_uring in a variety of scenarios, which currently is done in coreclr's pal without using epoll and friends, in a kernel-agnostic manner, afaict.
Another thing I think we can use io_uring -- maybe not right now, but we could contribute a patch to the Linux kernel -- is to implement WaitForMultipleObjectsEx() using futexes directly, and have a command in io_uring to perform operations in multiple futexes at the same time.
@lpereira, I'm speculating, but would a new futex opcode with already implemented linked commands and timeouts suffice you?
Someone already mentioned supporting futex(2) axboe/liburing#39
epoll bare minimum echo server
50 clients, running 512 bytes, 60 sec.
Speed: 189185 request/sec, 189185 response/sec
Requests: 11351122
Responses: 11351122
io_uring bare minimum echo server (Linux 5.4 needed, lower versions don't return the right amount of bytes read from io_uring_prep_readv in cqe->res.) https://github.com/frevib/io_uring-echo-server
Benchmarking: localhost:5555
50 clients, running 512 bytes, 60 sec.
Speed: 368368 request/sec, 368368 response/sec
Requests: 22102112
Responses: 22102110
The difference looks good, even though it can do even better. E.g. io_uring allows registered buffers and fds, supports IORING_OP_ACCEPT, etc. (or get rid of callocs in the loop...)
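For anyone wanting to sanity-check the numbers above, here is a small sketch of the arithmetic. The function names are made up for illustration. It assumes both runs lasted exactly 60 seconds as reported and that the tool truncates the per-second figure, which is what reproduces the printed values.

```c
#include <stdio.h>

/* Whole requests per second over a run; integer division reproduces
 * the "Speed" figures reported above. */
long requests_per_second(long total_requests, long seconds)
{
    return total_requests / seconds;
}

/* Ratio of two request totals measured over runs of equal length. */
double speedup(long baseline_requests, long candidate_requests)
{
    return (double)candidate_requests / (double)baseline_requests;
}

/* Prints both throughputs and the ratio for the two runs above. */
void print_summary(void)
{
    const long epoll_requests = 11351122;   /* epoll run, 60 s */
    const long uring_requests = 22102112;   /* io_uring run, 60 s */

    printf("epoll:    %ld req/s\n", requests_per_second(epoll_requests, 60));
    printf("io_uring: %ld req/s\n", requests_per_second(uring_requests, 60));
    printf("speedup:  %.2fx\n", speedup(epoll_requests, uring_requests));
}
```

So the io_uring server handles a bit under twice the requests of the epoll one in this particular setup (50 clients, 512 bytes, localhost).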
edit removed links as author has decided on GPL v3.0
@benaadams changed it to MIT, sorry for the inconvenience. @isilence it definitely needs some optimizations and I think there are some tiny bugs. If you want/like/have time to issue a PR, I'm happy to merge.
edit author changed to MIT so put link back https://github.com/frevib/io_uring-echo-server :)
It's a networking example using liburing, which is LGPL, so it can be linked to (though not derived from for MIT; so don't look at the source of liburing in case we do our own implementation on io_uring, which must be clean and not derived from LGPL).
Though I don't know the dotnet policy on linking to LGPL and whether it's allowed? /cc @jkotas
There's a very detailed document from the author of liburing, @axboe, who is also one of the authors of io_uring: https://kernel.dk/io_uring.pdf. It covers the motivation for io_uring and what it achieves, as well as how to use it (including considerations around memory barriers).
That then leads to the motivations for liburing and how to use that (it simplifies all the boilerplate setup and teardown for io_uring and handles all the memory barriers etc.).
To quote:
With the inner details of the io_uring out of the way, you'll now be relieved to learn that there's a simpler way to do much of the above. The liburing library serves two purposes:
- Remove the need for boiler plate code for setup of an io_uring instance.
- Provide a simplified API for basic use cases.
Also an LWN.net article about io_uring.
As noted above, I think that, at least for the use case in pal_networking.c in this repository, where the implementation currently uses epoll, linking to liburing (a convenience library) is not required. It is more work, yes, but IMO worth it for the dotnet runtime. Taking a dependency on another runtime library comes with a packaging cost as well. For example, liburing is not readily available as a package in Alpine Linux and many other package management systems, see Absent in repositories.
Notwithstanding library availability -- because we could use git submodules, for instance, and statically link with liburing -- there's a bigger issue: linking with LGPL would require us to also distribute .o files in addition to the binaries for .NET.
So I agree that it would be better to reimplement what liburing does; it's a thin wrapper around the kernel API. It mostly reduces a lot of the boilerplate necessary to map the queues and provides a bunch of auxiliary functions and whatnot.
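To give a feel for the kernel API underneath liburing, here is a minimal sketch that only probes whether io_uring_setup(2) works and picks a backend name accordingly. The functions io_uring_available and choose_backend are hypothetical names for illustration. The sketch assumes kernel headers new enough to ship linux/io_uring.h and __NR_io_uring_setup, and it deliberately skips the ring mmap boilerplate that liburing takes care of.

```c
#include <linux/io_uring.h>   /* struct io_uring_params */
#include <string.h>
#include <sys/syscall.h>      /* __NR_io_uring_setup */
#include <unistd.h>

/* Returns 1 if the kernel accepts io_uring_setup(2), 0 otherwise
 * (old kernel, seccomp filter, resource limits, and so on). */
int io_uring_available(void)
{
    struct io_uring_params params;
    memset(&params, 0, sizeof(params));

    /* The call is made through syscall(2) here; 4 is the requested
     * number of submission queue entries. */
    long fd = syscall(__NR_io_uring_setup, 4u, &params);
    if (fd < 0)
        return 0;

    /* A real backend would now mmap the submission ring, completion
     * ring and SQE array using params.sq_off and params.cq_off.
     * This probe only checks that setup succeeds. */
    close((int)fd);
    return 1;
}

/* Hypothetical selection helper: prefer io_uring, else epoll. */
const char *choose_backend(void)
{
    return io_uring_available() ? "io_uring" : "epoll";
}
```

Probing like this sidesteps guessing from version numbers, at the cost of one extra syscall at startup.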
If we're unsure how to use the API, though, it's possible to read from other implementations; for instance, there's a dual-licensed Apache 2/MIT library for Rust that could be used for studying purposes.
Also, the libuv PR for io_uring could be something to look at: https://github.com/libuv/libuv/pull/2322 (libuv uses a Joyent attribution licence). They also state there that they can't look at the source of liburing as it's LGPL: https://github.com/libuv/libuv/pull/2322#issuecomment-500455185
FWIW, I'd be willing to change the liburing license to dual MIT/GPL. There's really nothing fancy in the library, it's mostly just helpers, and a simplified interface should the application wish to use that. But it'd be a shame to have some of this code duplicated just because of licensing constraints.
@axboe That would be appreciated; it would indeed help a lot with io_uring adoption, given that GPL family of licenses aren't, unfortunately (in my personal opinion), that popular these days.
I like GPL for applications, and I still use it, but it makes less sense for libraries. And in particular for something like liburing, which isn't really a lot of smarts, it's mostly just setup and helper code. I'm doing some due diligence by emailing folks that have more than a few commits in liburing, then I'll change it provided nobody objects (can't see why they would).
I'm doing some due diligence by emailing folks that have more than a few commits in liburing, then I'll change it provided nobody objects (can't see why they would).
This has now been done.
For the record, here's an ASP.NET transport by @tkp1n that reimplements liburing in C#: https://github.com/tkp1n/IoUring
@lpereira, I'm speculating, but would a new futex opcode with already implemented linked commands and timeouts suffice you?
Someone already mentioned supporting futex(2) axboe/liburing#39
Going back to the ignored question... Guys, what's your use case and what would you need to integrate io_uring? Support for futex(2)? Something else?
Yeah, futex support for io_uring would be very welcome, especially if it had the FUTEX_WAIT_MULTIPLE command that was proposed a while ago (the use case is for Wine's implementation of WaitForMultipleObjects(), which is currently using polled eventfds, but we also have an implementation in our PAL that could benefit from this.)
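For readers unfamiliar with the "polled eventfds" approach, here is a rough sketch of a wait-any over several eventfds using epoll. This is not Wine's or the PAL's actual code. The name wait_any is made up, and creating an epoll instance per call is a simplification for brevity.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Blocks until one of the n eventfds is signaled or timeout_ms elapses.
 * Returns the index of a signaled object, or -1 on timeout or error. */
int wait_any(const int *event_fds, int n, int timeout_ms)
{
    int ep = epoll_create1(0);
    if (ep < 0)
        return -1;

    /* Register every eventfd, tagging it with its array index. */
    for (int i = 0; i < n; i++) {
        struct epoll_event ev = { .events = EPOLLIN, .data.u32 = (uint32_t)i };
        if (epoll_ctl(ep, EPOLL_CTL_ADD, event_fds[i], &ev) < 0) {
            close(ep);
            return -1;
        }
    }

    struct epoll_event ready;
    int got = epoll_wait(ep, &ready, 1, timeout_ms);
    close(ep);
    if (got != 1)
        return -1;

    /* Consume the signal, roughly like an auto-reset event. */
    uint64_t counter;
    int idx = (int)ready.data.u32;
    if (read(event_fds[idx], &counter, sizeof(counter)) != (ssize_t)sizeof(counter))
        return -1;
    return idx;
}
```

Each wait costs several syscalls here, which is the overhead a futex-based wait-multiple could avoid.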
Great, I'll try to take a look. I'm concerned about not having fast-path in-userspace locking, but it should still be better than eventfd + epoll. I haven't seen FUTEX_WAIT_MULTIPLE, but will need it to be merged first.
This article about using io_uring in modern C++ (with coroutines et al) is a pretty good read and gives some API insights, too: https://cor3ntin.github.io/posts/iouring/
LWN article: The rapid growth of io_uring
A general update:
All prototyping is being done on https://github.com/tmds/Tmds.LinuxAsync, together with other experiments from #14304 . We hope to see some numbers soon. After that we can think about the productization of the changes.
Is it possible to dupe-close one of these two issues, so that there is one main tracking issue?
https://github.com/dotnet/runtime/issues/12650