Nixpkgs: Musl as default instead of glibc

Created on 11 Jun 2020 · 11 Comments · Source: NixOS/nixpkgs

Thanks to many distributions such as Alpine Linux, a lot of software compiles with musl nowadays.

Switching would mainly reduce the closure size significantly and improve static linking support. See https://www.etalabs.net/compare_libcs.html

It would be interesting to see how much compiles under the pkgsMusl attribute set, and to make the switch at some point.

enhancement community feedback

All 11 comments

I don't think it should be the default, but musl is still interesting to me, e.g. for pkgsStatic. IIRC some people do submit musl-specific fixes, but I haven't seen much interest around NixOS.org so far.

musl advertises that it is "lightweight, fast, simple, free, correct", but one thing they do not advertise is resistance against exploitation.

glibc has long been in an arms race against hackers, and as new techniques are found for attacking the glibc heap, glibc introduces mitigations against them. A small part of this history is found here:

http://phrack.org/issues/61/6.html
http://phrack.org/issues/66/10.html
https://github.com/shellphish/how2heap

What mitigations does musl offer to protect users from malicious input attempting to corrupt their processes' heaps? Without this information, we can't tell if switching to musl is a security regression or not.

Oh, now I recall that systemd tends to be problematic: https://github.com/NixOS/nixpkgs/pull/37715

Part of the reason will be that systemd does lots of low-level stuff, another part that people choosing "lightweight" libcs usually don't care much for systemd... for some mysterious reasons ;-) (the correlation might go both ways)

Of course, problems can be patched, etc. For most packages one can probably find a solution somewhere already. The key question is whether there's enough motivation to also maintain this divergence from the majority.

What mitigations does musl offer to protect users from malicious input attempting to corrupt their processes' heaps? Without this information, we can't tell if switching to musl is a security regression or not.

To compare the robustness of the glibc and musl malloc implementations, I constructed the following (rough) metric.

I searched through each implementation's malloc.c file to see how many error conditions will lead to an immediate abort. The results:

musl-1.2.0:
src/malloc/malloc.c:388: if (extra & 1) a_crash();
src/malloc/malloc.c:406: if (next->psize != self->csize) a_crash();
src/malloc/malloc.c:450: if (next->psize != self->csize) a_crash();
src/malloc/malloc.c:515: if (extra & 1) a_crash();
(4 results)

glibc-2.31:
malloc/malloc.c:1454: malloc_printerr ("corrupted size vs. prev_size");
malloc/malloc.c:1460: malloc_printerr ("corrupted double-linked list");
malloc/malloc.c:1468: malloc_printerr ("corrupted double-linked list (not small)");
malloc/malloc.c:2537: malloc_printerr ("break adjusted to free malloc space");
malloc/malloc.c:2830: malloc_printerr ("munmap_chunk(): invalid pointer");
malloc/malloc.c:2858: malloc_printerr("mremap_chunk(): invalid pointer");
malloc/malloc.c:3175: malloc_printerr ("realloc(): invalid pointer");
malloc/malloc.c:3594: malloc_printerr ("malloc(): memory corruption (fast)");
malloc/malloc.c:3644: malloc_printerr ("malloc(): smallbin double linked list corrupted");
malloc/malloc.c:3736: malloc_printerr ("malloc(): invalid size (unsorted)");
malloc/malloc.c:3739: malloc_printerr ("malloc(): invalid next size (unsorted)");
malloc/malloc.c:3741: malloc_printerr ("malloc(): mismatching next->prev_size (unsorted)");
malloc/malloc.c:3744: malloc_printerr ("malloc(): unsorted double linked list corrupted");
malloc/malloc.c:3746: malloc_printerr ("malloc(): invalid next->prev_inuse (unsorted)");
malloc/malloc.c:3786: malloc_printerr ("malloc(): corrupted unsorted chunks 3");
malloc/malloc.c:3868: malloc_printerr ("malloc(): largebin double linked list corrupted (nextsize)");
malloc/malloc.c:3874: malloc_printerr ("malloc(): largebin double linked list corrupted (bk)");
malloc/malloc.c:3957: malloc_printerr ("malloc(): corrupted unsorted chunks");
malloc/malloc.c:4061: malloc_printerr ("malloc(): corrupted unsorted chunks 2");
malloc/malloc.c:4107: malloc_printerr ("malloc(): corrupted top size");
malloc/malloc.c:4173: malloc_printerr ("free(): invalid pointer");
malloc/malloc.c:4177: malloc_printerr ("free(): invalid size");
malloc/malloc.c:4201: malloc_printerr ("free(): double free detected in tcache 2");
malloc/malloc.c:4249: malloc_printerr ("free(): invalid next size (fast)");
malloc/malloc.c:4266: malloc_printerr ("double free or corruption (fasttop)");
malloc/malloc.c:4276: malloc_printerr ("double free or corruption (fasttop)");
malloc/malloc.c:4288: malloc_printerr ("invalid fastbin entry (free)");
malloc/malloc.c:4309: malloc_printerr ("double free or corruption (top)");
malloc/malloc.c:4314: malloc_printerr ("double free or corruption (out)");
malloc/malloc.c:4317: malloc_printerr ("double free or corruption (!prev)");
malloc/malloc.c:4322: malloc_printerr ("free(): invalid next size (normal)");
malloc/malloc.c:4332: malloc_printerr ("corrupted size vs. prev_size while consolidating");
malloc/malloc.c:4356: malloc_printerr ("free(): corrupted unsorted chunks");
malloc/malloc.c:4477: malloc_printerr ("malloc_consolidate(): invalid chunk size");
malloc/malloc.c:4493: malloc_printerr ("corrupted size vs. prev_size in fastbins");
malloc/malloc.c:4553: malloc_printerr ("realloc(): invalid old size");
malloc/malloc.c:4564: malloc_printerr ("realloc(): invalid next size");
(37 results)
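A count like the one above can be reproduced with a few lines of Python. This is only a minimal sketch: the helper name `count_abort_sites` and the `sample` text are made up for illustration, and the pattern simply matches the two call names that appear in the listing. Run against a real malloc.c it would also count the declaration and definition of `malloc_printerr` itself, so the totals would need adjusting by hand.

```python
import re

# Calls that end the process when a heap consistency check fails.
# a_crash is the musl name, malloc_printerr the glibc name (both taken
# from the listing above).
ABORT_CALLS = re.compile(r"\b(a_crash|malloc_printerr)\s*\(")


def count_abort_sites(source: str) -> int:
    """Count lines that contain at least one abort-style call."""
    return sum(1 for line in source.splitlines() if ABORT_CALLS.search(line))


# A few lines copied from the listing above, standing in for a real malloc.c.
sample = """\
if (extra & 1) a_crash();
if (next->psize != self->csize) a_crash();
malloc_printerr ("free(): invalid pointer");
malloc_printerr("mremap_chunk(): invalid pointer");
size_t n = chunk_size(c);
"""

print(count_abort_sites(sample))  # prints 4
```

To try it on an actual source tree, read the file with `open(path).read()` and pass the text to `count_abort_sites`.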

Admittedly, this is not a perfect metric, because the glibc malloc is more complicated, has a fastpath and slower paths, etc. But in general it seems like glibc is being much more careful.

I think sticking with glibc is the smarter decision from a security perspective.

@domenkozar what would be the advantages of musl over glibc?

@markuskowa smaller closure size, easier static linking. See https://www.etalabs.net/compare_libcs.html

Having tried to use pkgsStatic for the Mobile NixOS stage-1, I found that, as @vcunat said, systemd won't play ball, and a bunch of work is still needed to make a large proportion of Nixpkgs work. A bunch of trivial-enough things didn't work; some were fixed, some were worked around with alternatives. In the end I decided to go with glibc.

That said, I'm not opposed to the idea, but I'm not sure when or if it'll happen as a default. It'd need many person-hours to get there.

What mitigations does musl offer to protect users from malicious input attempting to corrupt their processes' heaps? Without this information, we can't tell if switching to musl is a security regression or not.

To determine the level of robustness between the glibc and musl malloc implementations, I constructed the following (rough) metric.

I searched through each implementation's malloc.c file to see how many error conditions will lead to an immediate abort. The results:

musl-1.2.0:
(4 results)

glibc-2.31:
(37 results)

Admittedly, this is not a perfect metric, because the glibc malloc is more complicated, has a fastpath and slower paths, etc. But in general it seems like glibc is being much more careful.

I think sticking with glibc is the smarter decision from a security perspective.

musl 1.2.0 malloc.c: 548 lines
glibc 2.31 malloc.c: 5610 lines

musl asserts per line: .007299
glibc asserts per line: .006595

therefore, musl is more secure.
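The two ratios follow from dividing the abort-site counts by the line counts. A quick check, using only the figures quoted in this thread (the function name `checks_per_line` is made up for the example):

```python
# (abort sites, lines in malloc.c) as quoted in the comments above
figures = {
    "musl 1.2.0": (4, 548),
    "glibc 2.31": (37, 5610),
}


def checks_per_line(checks: int, lines: int) -> float:
    """Abort sites divided by source lines."""
    return checks / lines


for name, (checks, lines) in figures.items():
    print(f"{name}: {checks_per_line(checks, lines):.6f}")
```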

glibc malloc has more sanity checks. At least compared to glibc before thread caching was added (major caveat), the current generation musl malloc is not security aware and is friendlier to exploitation. Both use a similar design based on trusted inline metadata that's inherently friendly to exploitation. glibc now has thread caches which are a huge boon for easier and more reliable exploitation, especially in complex situations involving parallelism, while musl won't be taking that approach. Thread caches bypassed a substantial amount of the previous work put into bolting on sanity checks to glibc malloc. They also make exploitation much more reliable for threaded programs and fundamentally get in the way of more meaningful deterministic security checks.
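To illustrate what "trusted inline metadata" means here, the following is a toy model, not code from either libc. It borrows the `psize`/`csize` field names from the musl check quoted earlier (`next->psize != self->csize`); the `Chunk` class and `check_neighbours` function are invented for the example. The check catches a header that was overwritten carelessly, but passes again once the neighbouring header is rewritten to match, because the values it compares live in the same memory an overflow can reach.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    psize: int  # size of the previous chunk, stored in this chunk's header
    csize: int  # size of this chunk, stored in the same header


def check_neighbours(heap: List[Chunk]) -> bool:
    """True when every chunk's psize matches its predecessor's csize."""
    return all(nxt.psize == cur.csize for cur, nxt in zip(heap, heap[1:]))


# Consistent headers: each psize mirrors the previous csize.
healthy = [Chunk(0, 32), Chunk(32, 64), Chunk(64, 16)]

# An overflow changes one csize but not the neighbour's psize: detected.
corrupted = [Chunk(0, 32), Chunk(32, 128), Chunk(64, 16)]

# The neighbour's psize is rewritten to match: the same check passes.
forged = [Chunk(0, 32), Chunk(32, 128), Chunk(128, 16)]

print(check_neighbours(healthy))    # True
print(check_neighbours(corrupted))  # False
print(check_neighbours(forged))     # True
```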

musl has a new malloc implementation landing soon, with a fundamentally more security-oriented design than glibc's approach. It isn't possible to build the same kind of security through adding weak sanity checks like glibc. It's a base that can be turned into a truly hardened allocator by bolting on additional features, unlike the traditional dlmalloc-like design used by glibc and musl (glibc makes the fundamental issues worse with tcache). You can't build decent security by bolting it on top of a fragile foundation preventing robust security checks. I strongly recommend reading this thread from Rich Felker on some of the security properties of oldmalloc vs. malloc-ng, and perhaps the other discussions about it on the mailing list and Twitter:

https://www.openwall.com/lists/musl/2020/05/13/1

If you care about exploit mitigation, wait until the next generation malloc lands. There are also other security features missing that should be added.

FORTIFY_SOURCE isn't implemented in musl, although the glibc implementation is lackluster (only checks writes, not reads, etc.) and doesn't bother with Clang compatibility. Clang actually has superior extensions for implementing it... but an implementation compatible with both that's strictly better than the glibc one is straightforward. I think the stance on this is that it should be done with a header-only approach, but I don't think a high quality production implementation is available, so it's effectively not available.

musl also doesn't currently have setjmp protection or much attempt at function pointer protection, although the implementations in glibc are lackluster and quite incomplete. This only really matters if you're using type-based CFI elsewhere, and neither musl nor glibc has support for Clang type-based CFI. Neither supports the arm64 ShadowCallStack feature, which is the approach to protecting return addresses there (hopefully CET shows up for x86 soon for a superior hardware implementation, and MTE on arm64 for similar reasons). ShadowCallStack doesn't exist for x86 since their approach didn't meet the same standards (races, etc.).

There's a fork of musl for Fuchsia with some of these features added, but not in a way that would ever be possible to land upstream since they wrote it in C++ and did it specifically for that OS with assumptions that wouldn't hold elsewhere.

glibc has a lot of security misfeatures and far more attack surface / bugs. Exploit mitigation isn't everything, especially when it's done poorly as glibc does in many cases. It also introduces problems with features like secondary stack caching that are not present in musl. I think it's a mixed bag right now. malloc is quite important so there's a very strong argument that the current generation glibc beats musl on exploit mitigation (and the libc attack surface isn't substantial in the big picture anyway) but that's not going to be true for much longer.

FORTIFY_SOURCE isn't implemented in musl,

While it is true that musl itself does not ship with a FORTIFY_SOURCE implementation, Alpine (and other apk distributions) have support for FORTIFY_SOURCE thanks to the work of our toolchain maintainers, actually. We believe it to be of higher quality than the glibc one.

But Nix's userbase is probably best off staying with glibc because there are a number of differences (even with layering other libraries on top) between musl and glibc environments, and in general, the folks maintaining the musl ecosystem at large don't wish to evolve the environment into a glibc clone.

Thanks all for the input. Seems like we still have a long way to go.
