Now that we have multiple threads, running this program:
100.times do |i|
spawn do
puts i
end
end
sleep 0.1
Will output stuff like this:
6
0
8
4
1
13
105
149
...
Note that there are multiple newlines together, and some numbers are mixed.
The reason is that puts(obj)
is implemented by doing print(obj)
followed by print '\n'
and so a thread might switch between these two calls.
There are multiple solutions to this. For example:
puts(obj)
just call print "#{obj}\n"
, but that involves a memory allocation for the string, at least in this simple form. To avoid this allocation Go for example has a pool of buffers that it uses to write to first, and then dumps this buffer into the final IO.There are probably other solutions to this. Maybe we can choose the simplest one and eventually improve its performance.
The question is: is this expected behavior? Do we want to change it? I think that from the outside I would expect a call to puts(obj)
to look kind of atomic.
/cc @waj
The array to_s algorithm needs to be reviewed also since the detection of recursive values is not thread safe
@bcardiff I think for that one we just need to change the current approach to use a thread local variable. ~And now that I think about it, maybe we can use one too for puts
: a buffer per thread that we write to first, then we copy to the IO.~ EDIT: hm, I think it doesn't work, we'll need a pull of buffers.
Let鈥檚 discuss one issue at a time ;)
I agree that puts
behavior should be atomic, at least for tty. Having a pool of buffers sounds great but I鈥檇 only add it if we figure out the performance degrades (do not underestimate the GC that already manages pools of memory blocks)
It鈥檚 sad that we need to allocate an extra buffer for this but the option is use a Mutex for every operation. Or have atomic/non atomic IOs. Just thinking...
For a TTY, performance is not critical, so allocating and additional string buffer shouldn't be a huge issue.
For a more performant implementation, it might be worth to revisit vectored IO (#6078).
Big +1 to vectored IO from me. read()
and write()
should support vectored IO overloads.
Unfortunately the to_s(io)
pattern doesn't support buffering the whole line, unless we added a print_atomic
and puts_atomic
which did to_s
with an IO::Memory
then sent the result to write()
. Then ::puts
and ::print
could be STDOUT.puts_atomic
?
can update the vectorted io PR, heavily prefer that too to some extra buffering.
It seems write
might write less than what we pass it and we don't take that into account. Am I seeing that right?
It also seems writev
might write less...?
@carlhoerberg vectored IO isn't relevant to this issue, there will have to be buffering unless the design of to_s
is radically changed to add something like IO lists from elixir.
@asterite the write syscall will write less than we ask for, crystal's IO#write
will always loop until it writes the whole buffer. We cannot guarantee atomic writes, we can only improve this a bit.
crystal's IO#write will always loop until it writes the whole buffer
Oooh... it's actually in evented_write
, I couldn't find it in unbuffered_write
.
So I guess that even if we send a string with newline in one call there's still a chance that it might break in the middle of a thread.
@asterite yep.
Since ruby introduced writev
in 2.5 I haven't seen interleaved newlines in multiple threads writing to stdout.. theortically it could still happen i guess, but at least trivial examples like the following won't result in interleaved newlines, same code running by ruby 2.4 will consistently mess up the output.
q = Queue.new
100.times do |i|
q << i
end
Array.new(5) do |j|
Thread.new do
puts "thread #{j} got #{q.pop}" until q.empty?
end
end.each(&:join)
The man pages for writev say that the result behaves as if it was one single atomic write, and it would be reasonable that any binding in crystal follow the same. So print_atomic
and writev
would be equivalent from a result point of view.
Unless I'm misunderstanding what is meant by atomic in this case.
The thing is, I don't understand how we can use writev in puts to fix this issue.
@asterite it doesn't help. writev
was a tangent.
i'm pretty sure it does help for writes smaller than the buffer size, as the OS gladly preempt threads at syscalls, so between write "a"
and write "\n"
, you would only get the behavior if the buffers to writev
was too large and had to be retried.
The problem is that an arbitrary object like an array is streamed to an IO in piecesb first the bracket, then the value, then comma, etc. And there's no way to turn all of these into a single writev call.
I think we should simply make STDIN, STDOUT and STDERR thread safe, so just a single operation can be called per thread (we would use a mutex for that).
To add to this, this affects exception printing to console. Exceptions and their callbacks are separated.
Most helpful comment
The problem is that an arbitrary object like an array is streamed to an IO in piecesb first the bracket, then the value, then comma, etc. And there's no way to turn all of these into a single writev call.
I think we should simply make STDIN, STDOUT and STDERR thread safe, so just a single operation can be called per thread (we would use a mutex for that).