After a long process of debugging (because almost no tools worked for this, being a setuid situation), I have figured out that keybase-mount-helper segfaults very early on in execution if LD_LIBRARY_PATH, and likely other environment variables that could cause security problems, are set at all.
My first thought is unsetting LD_LIBRARY_PATH and anything else that breaks in run_keybase before starting keybase-mount-helper, but I worry that this might break other things, or else I'd try to make a PR.
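A minimal sketch of that idea, not an actual run_keybase change: run a command in a subshell with the problematic variables removed, so the caller's own environment is left alone. The names run_scrubbed and UNSAFE_VARS are hypothetical, and the list only holds variables named in this thread; it would need checking against whatever else triggers the crash.

```shell
#!/bin/sh
# Hypothetical list of variables to strip; extend it as needed.
UNSAFE_VARS="LD_LIBRARY_PATH TZDIR"

# run_scrubbed CMD [ARGS...]
# Runs CMD with every variable in UNSAFE_VARS unset.
# The subshell keeps the unset from leaking into the calling shell.
run_scrubbed() {
    (
        for v in $UNSAFE_VARS; do
            unset "$v"
        done
        exec "$@"
    )
}
```

Inside run_keybase the call site would then look something like `run_scrubbed keybase-mount-helper`, assuming nothing the helper does relies on those variables.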
@mystfox Thanks for helping track this down! Unfortunately, I can't seem to repro on Ubuntu 16.04:
$ LD_LIBRARY_PATH=/tmp/ keybase-mount-helper
$ echo $?
0
I know $LD_LIBRARY_PATH is probably ignored for setuid programs. Are your shared libraries in an unusual location maybe? This is what they look like on my system:
$ ldd `which keybase-mount-helper`
linux-vdso.so.1 => (0x00007ffd3eea7000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f6a51a1f000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f6a51655000)
/lib64/ld-linux-x86-64.so.2 (0x00007f6a51c3c000)
Hmm, this is very weird. So, I'm on Arch Linux (using the keybase-bin package from the AUR, so still your binaries). I tried this in a Docker image of 16.04 and it still segfaulted. ldd didn't show anything unusual, on Arch or in the container. I spun up a droplet on DO to test (it was quicker than trying to set up a VM), and there I saw the same behaviour you were seeing. I'm not entirely sure what to think of this.
...Actually, I think it is very likely related to kernels - as somewhat hinted at by the container version still segfaulting. I had a droplet that had Linux fox 4.14.15-1-ARCH #1 SMP PREEMPT Tue Jan 23 21:49:25 UTC 2018 x86_64 GNU/Linux for uname -a, though a new kernel version had been installed, and it worked fine, no segfaults. I rebooted, uname -a was now Linux fox 4.15.1-2-ARCH #1 SMP Sun Feb 4 22:27:45 UTC 2018 x86_64 GNU/Linux, and LD_LIBRARY_PATH=/tmp keybase-mount-helper now segfaults. So it's likely either 4.15 or new patches for it Arch added?
(My guess would be it's 4.15, and perhaps to do with mitigations for Meltdown/Spectre, but that's only a guess.)
Weeeird. For the record I'm at 4.4.0-112-generic #135-Ubuntu SMP Fri Jan 19 11:48:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux.
I'm inclined to leave this issue open for a while and see if others start having the same problem before doing anything to fix it. As you say, we could just unset it, but I'd like to figure out a bit more about what is the underlying problem and whether that's really a safe thing to do in all cases...
Mmm. keybase/keybase-issues#3176 looks like it could be this, for one.
Another thing to note: I have tried taking a few simple executables and doing chown+setuid on them, like true or echo. They both worked just fine when setuid and I have a LD_LIBRARY_PATH. I've tried a few different programs from the current contents of my ~/go/bin. Some of them have segfaulted and some have not. So there might be a specific go library or something that is the problem?
And as for whether it's really a safe thing to just unset it... well, Nix/NixOS immediately comes to mind, for one (not something officially supported, I don't think, but it's an example of when it could be important, even with the few libraries needed here).
Okay, so. I've been digging at this even more, with the help of some others. While all my attempts to get a strace or coredump or something of it segfaulting have failed, there's enough to start digging from that side, still, from the kernel log message. Error 15 means it's a protection fault, read, userspace address, and instruction fetch, and both the segfault and the ip are at ffffffffff600000, which turns out to be the vsyscall page. Looking at https://git.archlinux.org/svntogit/packages.git/commit/trunk?h=packages/linux&id=9998d4fe8026c686abe8db9d9c5941d3936af3de , it seems that vsyscall emulation was disabled in Arch's kernel config at the same time as the move to 4.15 - hence going from 4.14 to 4.15 being part of this.
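For anyone else reading a log line like this, here is a rough decoder for the error field. It assumes the kernel prints the x86 page-fault error code in hex (so "error 15" is 0x15) and uses the usual bit layout for that code; decode_pf_error is just a name made up for this sketch.

```shell
#!/bin/sh
# decode_pf_error HEXCODE
# Decodes an x86 page-fault error code as it appears (in hex, without
# the 0x prefix) after "error" in a kernel segfault log line.
decode_pf_error() {
    code=$((0x$1))
    # bit 0: 1 = protection violation, 0 = page not present
    if [ $((code & 1)) -ne 0 ]; then out="protection fault"; else out="page not present"; fi
    # bit 1: 1 = write access, 0 = read access
    if [ $((code & 2)) -ne 0 ]; then out="$out, write"; else out="$out, read"; fi
    # bit 2: 1 = fault happened in user mode, 0 = kernel mode
    if [ $((code & 4)) -ne 0 ]; then out="$out, user mode"; else out="$out, kernel mode"; fi
    # bit 3: reserved bit was set in a page-table entry
    if [ $((code & 8)) -ne 0 ]; then out="$out, reserved bit set"; fi
    # bit 4: fault was caused by an instruction fetch
    if [ $((code & 16)) -ne 0 ]; then out="$out, instruction fetch"; fi
    printf '%s\n' "$out"
}
```

`decode_pf_error 15` prints `protection fault, read, user mode, instruction fetch`, which matches the reading given above.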
I actually looked closer at the executables in my ~/go/bin, and it turns out the ones that are showing this behaviour are all dynamically linked, not statically linked - so it seems it's dynamically linked Go programs specifically, because other dynamically linked executables I tried don't show this.
So, the suspicion starts to fall on glibc - it provides both the linker and libc, of course, so good candidates. And it turns out that the variables that cause this exactly match glibc's UNSECURE_ENVVARS - TZDIR causes the segfault, LD_ASSUME_KERNEL doesn't, etc. However, there are only two places in glibc that seem to actually use UNSECURE_ENVVARS - in elf/dl-support.c, which seems to be part of libc, and elf/rtld.c, which seems to be a lot of ld. But they both seem to just unconditionally unset the env vars if secure (the binary is suid), which is where I am stumped. Let alone why it's specifically Go.
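The per-variable check can be scripted along these lines. This is only a sketch: probe_var is a made-up helper, /tmp is an arbitrary value for the variable, and it assumes the shell reports death by SIGSEGV as exit status 139 (128 + 11), which is the usual convention on Linux.

```shell
#!/bin/sh
# probe_var VAR CMD [ARGS...]
# Runs CMD with VAR set to an arbitrary value and reports whether the
# process died with SIGSEGV (exit status 139) or exited normally.
probe_var() {
    var=$1
    shift
    env "$var=/tmp" "$@" >/dev/null 2>&1
    status=$?
    if [ "$status" -eq 139 ]; then
        echo "$var: segfault"
    else
        echo "$var: exit $status"
    fi
}
```

Running it once per candidate, for example `probe_var TZDIR keybase-mount-helper` and `probe_var LD_ASSUME_KERNEL keybase-mount-helper`, gives a quick list to compare against UNSECURE_ENVVARS.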
Edit: Oh, also. man ld.so mentioned that LD_DEBUG would work if /etc/suid-debug (iirc, taking a break rn) existed. I created that, and then LD_DEBUG both gave debug output (that looks to be pretty much useless) and still segfaulted.
Oh, and a final note on the things that don't give me useful information:
https://bugs.archlinux.org/task/57336
If this is a general issue with dynamically linked golang binaries that are SUID, I wonder why they are using vsyscall at all... might be worth taking this to the golang developers. ;)
(No comment on the existence of this specific SUID binary.)
Is someone willing to make, or has someone already made, a golang issue about this? I'm not really following how the vsyscall thing fits into the LD_LIBRARY_PATH thing.
One thing I just found is this Meltdown CVE: https://lists.debian.org/debian-security-announce/2018/msg00000.html, containing the following (bold mine):
We also identified a regression for ancient userspaces using the vsyscall
interface, for example chroot and containers using (e)glibc 2.13 and older,
including those based on Debian 7 or RHEL/CentOS 6. This regression will be
fixed in a later update.
I'm not sure if this is referring to a regression caused by the Meltdown patch which hasn't yet been fixed (and is responsible for the problems we're seeing), or if it's referring to something independent noticed during the Meltdown work, and which was then fixed, causing the problems we're seeing. Anyone know?
The previous time I’ve heard about vsyscall was in the context of building Python extensions (dynamic libraries) compatible with many Linux-based operating systems: https://github.com/python/peps/blob/master/pep-0571.rst#compatibility-with-kernels-that-lack-vsyscall
Sorry for the major delays on this issue. We've been working on replacing the mount-helper system entirely, but we thought of an idea we wanted to try to see if we can fix this. Unfortunately we haven't been able to repro this internally, so is there a Debian/Ubuntu amd64 user here that is willing to try an experimental build?
The deb is here: https://keybase.pub/strib/keybase-1.0.44-20180301194049.c84992596a-amd64.deb
It's built from this branch: https://github.com/keybase/client/tree/strib/mount-helper-segfault-cgo
And this is the relevant interesting commit: c84992596a3dfa7509cd3948c9413ac2127fc026
Can someone who was experiencing the segfault please try installing this and running run_keybase, and letting us know if it worked? Thanks!
Doesn’t help for me.
D'oh. Ok thanks for trying. We'll get something out to fix this soonish.
@merwok or anyone else: what about this one?
https://keybase.pub/strib/keybase-1.0.44-20180301212225.f7b7d96e40-amd64.deb
Branch: https://github.com/keybase/client/tree/strib/mount-helper-segfault-root
Commit: f7b7d96e
NOTE: This one loosens security slightly by using root as the suid user, rather than keybasehelper. But the binary is so simple, it hopefully doesn't matter much.
So far f7b7d96 does look a lot healthier...
@mystfox and other Arch users: I made a keybase-bin pacman package for you to try with the fix f7b7d96:
https://keybase.pub/strib/keybase-bin-1.0.44_20180301212225%2Bf7b7d96e40-1-x86_64.pkg.tar.xz
Let me know if this helps! I still don't really understand the problem, but it seems like doing a setuid to a non-root user causes the issue on some systems. Not sure what that has to do with LD_LIBRARY_PATH, but switching to setuid(0) seems to do the trick...
@strib, I confirm that the package you provided works.
@strib yes the package you provided works for me as well on arch
I prefer not to run suid root programs that haven’t been validated by Debian.
I prefer not to run suid keybasehelper programs either ;) the only thing it actually does is ensure the symlink in /var/lib/keybase/mount1 points to ~/.local/share/keybase/fs for the first user to claim it.
See https://github.com/keybase/client/issues/10033#issuecomment-364998038
I have opted to exclude the mount helper from Arch Linux packaging entirely, and keybase as well as kbfsfuse work just fine...
Indeed, to work around the segfault I comment out the mount helper lines and everything works.
Makes you wonder if the mount helper is really necessary!
Not unless you are attached to the idea of being able to tell people the specific filepath /keybase/... for a KBFS-stored file...
https://github.com/keybase/client/issues/10033#issuecomment-361001953
However, we still maintain a symlink at /keybase that points to the mountpoint of the first user to run run_keybase on the system. I understand this is controversial, but at this time we don't want to break compatibility with existing third-party applications that already depend on the /keybase path. This might be something we'll revisit in the future with a more careful plan for deprecating that old path, if we decide it's the best thing for our official packages.
I'm not an official keybase member packaging keybase, and I have no qualms about deciding the best thing for Arch Linux users (as that is actually my job). :stuck_out_tongue:
Since @strib's experiment worked to his satisfaction, there is probably no need for you to test it.
If keybase uses this as a final workaround, well, I guess you know the best way forward for yourself. You may wish to speak to Debian about having them package this themselves (it's fairly simple, see Arch packaging for keybase and kbfs, the GUI is not currently packaged).
Hopefully their redirector service will render this moot -- I'm not entirely convinced people shouldn't just use the real path in the first place, but a redirector at least achieves a more concrete goal than a static symlink...
It is necessary only to make sure the /keybase link (via /var/lib/keybase/mount1) points to a real mountpoint. If you don't care about the /keybase link, or if you set it up manually, then you have no use for the suid helper.
Very soon we will be replacing this mess with a different strategy, which also comes with a suid root helper but which can easily be configured off if you don't like it. For now I'll probably commit this though.
I have to say I don’t understand why /keybase exists. I will switch to the official Debian package when it’s released, and they certainly won’t accept this violation of the file hierarchy standard.
s/when/if/
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=792916
It was requested 2 years ago, but no one has seen fit to create one in all that time...
Yesterday a release went out with the root-suid mount-helper fix (1.0.45-20180306212653+08e1917910). If anyone is still seeing this after upgrading, please let me know!
(The other redesign I mentioned is coming soon though, possibly as early as tomorrow, so the above fix won't be around for long.)
For anyone interested, the Linux releases with the root redirector just went live, and we put up instructions on how to turn it off under the "Mountpoints" section here: https://keybase.io/docs/kbfs/understanding_kbfs