$ dpkg -S /usr/bin/crystal
crystal: /usr/bin/crystal
$ dpkg -l | grep crystal
ii crystal 0.26.1-1 i386 no description given
$ /usr/bin/crystal -h
Failed to raise an exception: END_OF_STACK
[0xb5ca8813] ???
[0xb535e383] ???
[0xb535f411] ???
[0xb53645ec] ???
[0xb539c469] ???
[0xb5362991] ???
[0xb722b816] ???
as a side note, maybe add a description to the package?
@stronny I am checking the installer using a vagrant xenial 32 bits and I was able to run the installer as documented here without troubles.
vagrant@ubuntu-xenial:~$ dpkg -S /usr/bin/crystal
crystal: /usr/bin/crystal
vagrant@ubuntu-xenial:~$ dpkg -l | grep crystal
ii crystal 0.26.1-1 i386 no description given
vagrant@ubuntu-xenial:~$ /usr/bin/crystal -h
Usage: crystal [command] [switches] [program file] [--] [arguments]
Command:
....
My setup is a debian stable virtualbox guest. I've just noticed that the bug is reproduced only as a particular user, but not as another one. I think it has to do with fd reopening in the update, here's the tail of an strace:
[pid 12842] clock_gettime(CLOCK_REALTIME, {tv_sec=1535657416, tv_nsec=132236618}) = 0
[pid 12842] epoll_create1(EPOLL_CLOEXEC) = 4
[pid 12842] pipe2([5, 6], O_NONBLOCK|O_CLOEXEC) = 0
[pid 12842] getcpu(0x9afd8, 0x10, NULL) = -1 EFAULT (Bad address)
[pid 12842] open("/dev/urandom", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 7
[pid 12842] fcntl64(7, F_SETFD, FD_CLOEXEC) = 0
[pid 12842] fstat64(7, {st_mode=S_IFCHR|0666, st_rdev=makedev(1, 9), ...}) = 0
[pid 12842] fcntl64(7, F_SETFD, FD_CLOEXEC) = 0
[pid 12842] read(7, "\271\245K\203\374\322<\270\20^\245\315l\302\35=\360\263dm$\373\326\251uV\334\325\274\352\2316"..., 8192) = 8192
[pid 12842] ioctl(0, TIOCGWINSZ, {ws_row=38, ws_col=151, ws_xpixel=0, ws_ypixel=0}) = 0
[pid 12842] readlink("/proc/self/fd/0", "/dev/pts/4", 256) = 10
[pid 12842] stat64("/dev/pts/4", {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 4), ...}) = 0
[pid 12842] fstat64(0, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 4), ...}) = 0
[pid 12842] open("/dev/pts/4", O_RDWR|O_LARGEFILE) = -1 EACCES (Permission denied)
[pid 12842] fcntl64(-1, F_GETFL) = -1 EBADF (Bad file descriptor)
[pid 12842] mmap2(NULL, 155648, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb51b4000
[pid 12842] mmap2(NULL, 155648, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb518e000
[pid 12851] futex(0xb51f4c08, FUTEX_WAIT_PRIVATE, 2, NULL) = -1 EAGAIN (Resource temporarily unavailable)
[pid 12851] futex(0xb51f4c08, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 12842] munmap(0xb518e000, 155648) = 0
[pid 12842] ioctl(2, TIOCGWINSZ, {ws_row=38, ws_col=151, ws_xpixel=0, ws_ypixel=0}) = 0
[pid 12842] readlink("/proc/self/fd/2", "/dev/pts/4", 256) = 10
[pid 12842] stat64("/dev/pts/4", {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 4), ...}) = 0
[pid 12842] fstat64(2, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 4), ...}) = 0
[pid 12842] open("/dev/pts/4", O_RDWR|O_LARGEFILE) = -1 EACCES (Permission denied)
[pid 12842] fcntl64(-1, F_GETFL) = -1 EBADF (Bad file descriptor)
[pid 12842] writev(2, [{iov_base="Failed to raise an exception: EN"..., iov_len=43}, {iov_base=NULL, iov_len=0}], 2Failed to raise an exception: END_OF_STACK
) = 43
Do you need any additional details from me to narrow down the fault?
@stronny since you are using a vagrant machine and it happens with certain user but not all it would be create to have steps to reproduce the issues consistently. From there it will be easier to narrow the cause. Otherwise we would need to guess in the dark what are the difference between these two users.
I'm not using a vagrant machine though. Comparing straces I can point to this:
This is a normal run:
[pid 13305] readlink("/proc/self/fd/0", "/dev/pts/5", 256) = 10
[pid 13305] stat64("/dev/pts/5", {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 5), ...}) = 0
[pid 13305] fstat64(0, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 5), ...}) = 0
[pid 13305] open("/dev/pts/5", O_RDWR|O_LARGEFILE) = 8
[pid 13305] fcntl64(8, F_GETFL) = 0x8002 (flags O_RDWR|O_LARGEFILE)
This is a faulty one:
[pid 12842] readlink("/proc/self/fd/0", "/dev/pts/4", 256) = 10
[pid 12842] stat64("/dev/pts/4", {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 4), ...}) = 0
[pid 12842] fstat64(0, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 4), ...}) = 0
[pid 12842] open("/dev/pts/4", O_RDWR|O_LARGEFILE) = -1 EACCES (Permission denied)
[pid 12842] fcntl64(-1, F_GETFL) = -1 EBADF (Bad file descriptor)
I think there is a lack of an error check somewhere and an error sentinel of -1 goes into fcntl64() which promptly explodes.
I think there is a lack of an error check somewhere and an error sentinel of -1 goes into fcntl64() which promptly explodes.
Like in https://github.com/crystal-lang/crystal/commit/2530bb602a0730dafdeb519d8b3bbabaf734749b#diff-97b852ed0df274031d263857a85db195R30 maybe?
Oof, ok. Checking the error is one thing, but what to do after a failure.
Can the tty be reopened with less permissions? Or is this tty somehow unopenable? If so, what's the correct fallback?
@Timbus no, just act as if ttyname_r failed, and return new(fd, blocking: true)
So the problem is that the second user (the one that's failing) runs under sudo -i. As a result it can't access /dev/pts/*, because files there are crw--w---- and owned by the first user and tty group, which none are members of.
I don't think I've seen any problems with this fairly standard setup with any other nonblocking IO, so either I'm just ignorant of other programs that can't use read_nonblock(), or perhaps they do something differently?
The problem is that there is no such thing as read_nonblock(), which is a huge flaw in posix imo.
So, in order for crystal to read a handle without blocking, you need to set the handle into nonblocking mode. This would be fine except it is the same handle as the current interactive terminal's handle, and setting nonblock on it will force anything else sharing the handle to perform nonblocking reads (this includes the shell, piped/sibling processes, and forked subprocesses).
So yes, it can be done differently without reopening the TTY. It just breaks many other things. Perhaps in the future there may be a threaded fallback mode or something, but for now it just goes back to blocking on a TTY read.
Just a quick clarification if I may. IIUC what you are doing is something like 1) new_fd = reopen(STDIN); 2) set_nonblock(new_fd) which makes perfect sense, except what is done is fcntl(char_device), which is not a handle, it's a device file itself. So what's the point of reopening the tty if at the end you operate on the same tty? If your goal is to set nonblocking mode separately for this process, how is it achieved by mutating global state?
Edit: I'm wrong, fcntl indeed accepts an fd, not a device. Please disregard the above comment.
Another thing I wanted to ask. It seems that Crystal doesn't reopen the tty, it just opens 3 new fds, and leaves the old ones hanging. Is that intentional behaviour? I've checked with this:
slice = "real stdout\n"
LibC.write(1, slice, slice.size)
puts "stdout?"
which gives me:
[pid 1066] write(1, "real stdout\n", 12real stdout
) = 12
[pid 1066] write(8, "stdout?\n", 8stdout?
) = 8
@stronny yes it's intentional, please read https://github.com/crystal-lang/crystal/pull/6518
Fix PR was merged.