Zig: solve concurrency

Created on 25 Aug 2016 · 20 Comments · Source: ziglang/zig

On top of the ability to create a kernel thread, Zig should have an abstraction that allows code to express concurrency.

  • [x] coroutines (async / await)
  • [x] kernel threads for linux
  • [x] atomic primitives
  • [ ] see the TODO for @atomicRmw in the docs: support bool and integers with bit counts that are not powers of 2, for @atomicRmw as well as @atomicLoad. #1220
  • [x] #461
  • [x] implement kernel threads for macos and windows
  • [x] Proof of concept M:N threading. (multiplex coroutines onto a thread pool)
  • [ ] #157
enhancement

Most helpful comment

We're one step closer to having solved concurrency now that we have coroutines.

Next steps:

  • Kernel threads in the standard library. Thread local variables?
  • Proof of concept M:N threading. (multiplex coroutines onto a thread pool)
  • Static call graph analysis to determine worst case stack upper bound.

After this, concurrency will be solved.

All 20 comments

Are you able to use atomic code generation natively in Zig? If not, that would be a great place to start.

We currently have some built-in functions for this, with plans for many more atomic primitives.

In addition, this issue calls for research into M:N threading and perhaps some kind of promise abstraction and/or cooperative multithreading.
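For illustration, here is a minimal hedged sketch of what the atomic built-ins look like in use; the exact enum spellings and orderings have changed across Zig versions, so treat it as a sketch rather than a reference:

```zig
// Minimal sketch of the atomic built-ins mentioned above. Enum spellings
// such as .Add and .SeqCst vary between Zig versions.
var counter: u32 = 0;

fn bump() u32 {
    // Atomically add 1, then read the value back with sequentially
    // consistent ordering.
    _ = @atomicRmw(u32, &counter, .Add, 1, .SeqCst);
    return @atomicLoad(u32, &counter, .SeqCst);
}
```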

Nice to hear atomics are supported. I'd like to urge keeping things simple: one thing I really like about C is that I have very good control over how big programs are. You can create programs that don't use any "std" libraries and produce binaries that are just a few bytes in size. In other words, the programmer is in complete control.

So I suggest that if you want features like M:N threading, there should be no cost in speed or size for programs that don't use them.

Your comments are in complete alignment with Zig's philosophy. Whatever concurrency features we have, they will be built on top of the kernel thread abstraction that will be provided in the standard library, and they will not depend on a black box runtime or hide errors such as memory allocation failure.

Just my opinion: I think libuv has an excellent and industry-proven abstraction for tasks and socket/filesystem I/O, and it sits roughly midway between ASIO/std C++ and C11 threads. I'm familiarising myself with Zig by trying to port parts of that library's async loop using BSD's kqueue, but first I have to finish an allocator using mmap.

but first I have to finish an allocator using mmap.

ahh, very exciting. How do you plan to handle the problem of different threads allocating and deallocating memory?

I'm currently reading the malloc-related source code at http://opensource.apple.com//source/Libc/Libc-583/gen/malloc.c. It seems it takes pages with mmap, so the expensive syscall (mmap appears to be thread safe) is done less frequently, and the actual malloc call that subdivides the page for allocations just locks a mutex and releases it when it finishes (macros MALLOC_LOCK and MALLOC_UNLOCK). Note: OS X deprecated the use of sbrk and brk.

Edit: good reading on the subject: Mac OS X Internals: A Systems Approach (memory chapter).
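To make that locking pattern concrete, here is a rough Zig sketch of the scheme described above; the `PagePool` name and layout are invented for illustration, and std.Thread.Mutex's exact API differs between Zig versions:

```zig
const std = @import("std");

// Rough sketch: pages come from mmap, and a single mutex guards the code
// that subdivides a page into allocations (analogous to the MALLOC_LOCK /
// MALLOC_UNLOCK macros). `PagePool` is a name invented here for illustration.
const PagePool = struct {
    mutex: std.Thread.Mutex = .{},
    page: []u8, // one mmap'd page, obtained elsewhere
    used: usize = 0,

    fn alloc(self: *PagePool, n: usize) ?[]u8 {
        self.mutex.lock(); // MALLOC_LOCK
        defer self.mutex.unlock(); // MALLOC_UNLOCK
        if (self.page.len - self.used < n) return null;
        const result = self.page[self.used .. self.used + n];
        self.used += n;
        return result;
    }
};
```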

Very belated comment.

I strongly suggest not supporting M:N threading.

TL;DR: nearly every system that used it in the past has since abandoned it, including Java, Solaris, and Linux...

Some light reading. Dan Kegel has some good links in his C10k web pages:

Dan Kegel's C10k notes

There was an interesting discussion on the Rust lists a few years ago about this (a bit of a slog to read):

The future of M:N threading

In view of systems selling now with 32+ cores on the desktop, it seems like the benefits of M:N threading are less and less obvious. At the same time, there seems to have been little progress made on the problems with it.

We're one step closer to having solved concurrency now that we have coroutines.

Next steps:

  • Kernel threads in the standard library. Thread local variables?
  • Proof of concept M:N threading. (multiplex coroutines onto a thread pool)
  • Static call graph analysis to determine worst case stack upper bound.

After this, concurrency will be solved.

I hope you don't substitute a serial event loop and callbacks for concurrency. I think the Rust experimental RFC for adding yield / generator-style coroutines is a pretty good read. Here it is: https://github.com/rust-lang/rfcs/blob/master/text/2033-experimental-coroutines.md

Someone else's recent thoughts on concurrency are at https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/ ; I tend to agree philosophically, though I think in Zig the right answer might be passing Nurseries around much like Allocators.

go is nice because it's an extremely _simple_ API.

I think go is not aligned with Zig design principles (it hides complexity and runtime cost, and lacks a sane error-handling or panic/unwind approach), and neither is a "concurrency framework" (framework is a dirty word for similar reasons). I think a Zig-aligned solution is to offer a small and simple set of orthogonal APIs that users can build abstractions like go or spawn on, in a safe and zero-cost way, if they would like to.
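As a hedged sketch of what such a userland abstraction could look like, here is a hypothetical `go`-style helper built directly on the kernel-thread primitive; `go` and `worker` are names invented here, and std.Thread's API differs between Zig versions:

```zig
const std = @import("std");

// Hypothetical userland `go`-style helper: nothing hidden, no runtime.
fn go(comptime f: anytype, args: anytype) !void {
    const t = try std.Thread.spawn(.{}, f, args);
    t.detach(); // fire-and-forget, like `go f(args)` in Go
}

fn worker(msg: []const u8) void {
    std.debug.print("{s}\n", .{msg});
}

pub fn main() !void {
    try go(worker, .{"hello from a spawned thread"});
    std.time.sleep(10 * std.time.ns_per_ms); // crude wait so the demo prints
}
```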

Here is another good article on problems intrinsic to "concurrency" where the goroutine model falls short:
NUMA-aware scheduler for Go (and HN discussion)

The point of that article is that error handling in the context of concurrency can be difficult using traditional primitives, and wrapping them up in something like a Nursery (see the paper) helps by having a concurrency manager (much like an Allocator is an 'allocation manager') to deal with catching/handling errors and... managing concurrency.
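To show roughly what that could mean in Zig, here is a hypothetical sketch of a nursery passed around like an Allocator; every name is invented for illustration and this is not an existing std API:

```zig
const std = @import("std");

// Hypothetical "nursery": a scope object that owns the tasks spawned within
// it and joins them all before the scope ends, giving errors a well-defined
// place to surface.
const Nursery = struct {
    threads: std.ArrayList(std.Thread),

    fn init(gpa: std.mem.Allocator) Nursery {
        return .{ .threads = std.ArrayList(std.Thread).init(gpa) };
    }

    fn spawn(self: *Nursery, comptime f: anytype, args: anytype) !void {
        const t = try std.Thread.spawn(.{}, f, args);
        try self.threads.append(t);
    }

    // No task outlives the nursery: everything is joined before returning.
    fn joinAll(self: *Nursery) void {
        for (self.threads.items) |t| t.join();
        self.threads.deinit();
    }
};
```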

I think people misunderstand the mission of Zig. The idea of "perfect software" is not to say that Zig wants to make errors impossible (which seems to be more of a Rust goal). What it means is that only Zig and C enable local handling of out of memory errors. C++, Rust, Java, Go, Python, Ruby, etc. all handle OOM through exceptions (or crashing), which leads to unsafe code that isn't handled at the call site. No one writes C++ with every std::vector::push_back wrapped in a try/catch to check for OOM. Zig is a modern flavor of the C approach without some of C's undefined behavior.
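A minimal sketch of that point, assuming nothing beyond the standard allocator interface: allocation failure is an ordinary error value the call site can handle locally, for example by degrading to a smaller buffer, rather than unwinding through an exception.

```zig
const std = @import("std");

// Allocation failure handled right at the call site.
fn grabBuffer(gpa: std.mem.Allocator, want: usize) ![]u8 {
    return gpa.alloc(u8, want) catch |err| switch (err) {
        error.OutOfMemory => try gpa.alloc(u8, 64), // fall back to a small buffer
    };
}
```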

TL;DR IMO Zig is not in the business of forbidding unsafe practices. In fact, Zig should encourage low-level, unsafe code and keep opinions to itself. Zig's only axiom is that error values are better than exceptions. I think the ideas put forward here are interesting and might be worth putting in an external library, but not fundamental to the language in any way.

Concurrency is solved. We have the ability to create kernel threads on all supported targets using std.os.spawnThread. On top of that we have M:N threading implemented in userland with coroutines using async/await syntax. For example:

  • std.event.Loop
  • std.event.Lock
  • std.event.Channel
  • std.event.Group
  • std.event.Future

The self-hosted compiler is underway using these abstractions.
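For a taste of the async/await syntax referred to above, here is a hedged sketch as it looked around the std.event era; async frames were later removed from the language, so this will not build on current Zig compilers:

```zig
// Hedged sketch of the old coroutine syntax; not valid in current Zig.
var step: u32 = 0;

fn task() void {
    step = 1;
    suspend {} // hand control back to whoever called `async`
    step = 2;
}

pub fn main() void {
    var frame = async task(); // runs until the first suspend point
    resume frame; // drives the coroutine to completion
}
```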

The @atomicRmw issue mentioned above is a separate issue: #1220
Also stack upper bound determination is a separate issue: #157

I would like to suggest that issues only be marked as done once documentation is added, because without documentation any feature is about as good as if it did not exist in the first place.

Maybe this is too much to ask at this point, but I nevertheless wanted to express this opinion because, seeing the issue is closed, I would love to read through some nice examples.

Thanks for your work 👍

  • basic docs for everything #367
  • standard library docs #21

I agree with you about how important docs are. Here are my priorities:

  1. stabilize the language
  2. document everything about the language
  3. stabilize the standard library
  4. document everything about the standard library
  5. 1.0.0

So far we're still on (1).

OK, I was just too eager to read on, but I'll wait, no problem 👍

@andrewrk have user-level threads (user-mode scheduling) on Windows been considered?
https://docs.microsoft.com/en-us/windows/desktop/procthread/user-mode-scheduling

Google apparently implemented the same thing on Linux in a proprietary way and got speedups from it.
