I'm working on an HTTP service that serves big files. I noticed that parallel downloads are not possible: the process serves only one file at a time, and all other downloads wait until the previous ones finish. How can I stream multiple files at the same time?
```crystal
require "http/server"

server = HTTP::Server.new(3000) do |context|
  context.response.content_type = "application/data"
  # Block form closes the file once the copy is done
  File.open("bigfile.bin", "r") do |f|
    IO.copy f, context.response.output
  end
end

puts "Listening on http://127.0.0.1:3000"
server.listen
```
Request one file at a time:
```
$ ab -n 10 -c 1 127.0.0.1:3000/
[...]
Percentage of the requests served within a certain time (ms)
  50%      9
  66%      9
  75%      9
  80%      9
  90%      9
  95%      9
  98%      9
  99%      9
 100%      9 (longest request)
```
Request 10 files at once:
```
$ ab -n 10 -c 10 127.0.0.1:3000/
[...]
Percentage of the requests served within a certain time (ms)
  50%     52
  66%     57
  75%     64
  80%     69
  90%     73
  95%     73
  98%     73
  99%     73
 100%     73 (longest request)
```
The problem here is that neither `File#read` nor `context.response.output` will ever block. Crystal's concurrency model is based on cooperatively scheduled fibers, where switching fibers only happens when IO blocks. Reading from disk with nonblocking IO is not possible, which means the only part that can block is writing to `context.response.output`. However, disk IO is a lot slower than network IO on the same machine, so the write never blocks: `ab` reads at a rate much faster than the disk can provide data, even from the disk cache. This example is practically the perfect storm to break Crystal's concurrency.
In the real world, it's much more likely that clients of the service will sit across a network from the server, so the response write will occasionally block. Furthermore, if you were reading from another network service or a pipe/socket, the read would also block. Another solution would be to use a thread pool to implement nonblocking file IO, which is what libuv does. Perhaps Crystal will switch from libevent to libuv sometime in the future; a lot of the literature online seems to say it's a superior library.
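As a rough illustration of the scheduling behaviour described above, the sketch below spawns two fibers that only do in-memory work. The names `order` and `done` are made up for this example and do not come from the original post. Since nothing inside the loop blocks or yields, each fiber would be expected to finish its loop before the other gets a turn, at least on the default single-threaded scheduler.

```crystal
# Hypothetical demo: fibers that never block are not interleaved.
order = [] of String       # records which fiber ran each step
done = Channel(Nil).new    # lets the main fiber wait for both workers

{"a", "b"}.each do |name|
  spawn do
    3.times do |step|
      # Pure computation: no IO and no Fiber.yield, so no switch point here
      order << "#{name}#{step}"
    end
    done.send(nil)
  end
end

# Receiving blocks the main fiber, which gives the workers a chance to run
2.times { done.receive }
puts order.join(" ")
```

This mirrors the file server case: a copy loop that never blocks behaves like the inner `3.times` loop and keeps the scheduler to itself until it is finished.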
@RX14 hasn't Crystal already switched from libuv to libevent? https://github.com/crystal-lang/crystal/issues/1698#issuecomment-155240200
Thank you, that helps. I found a way to block (and yield) while reading big files:
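```crystal
# Copy in fixed-size chunks, yielding to other fibers after each chunk
def copy_in_chunks(input, output, chunk_size = 4096)
  size = 1
  while size > 0
    size = IO.copy(input, output, chunk_size)
    Fiber.yield
  end
end

# Inside the request handler:
File.open("bigfile.bin", "r") do |file|
  copy_in_chunks(file, context.response)
end
```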
@akzhan I meant moving to libuv. I haven't looked into libevent vs libuv very much, but the current libevent API uses callbacks to schedule the fibers, so I'm not sure switching to libuv would change much.
@RX14 unrelated to this discussion but Crystal moved to libevent because libuv doesn't allow a multithreaded event loop (i.e. have any thread resume any fiber).
Calling `Fiber.yield` to pass execution to any pending fiber is the correct solution.
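To see the effect of `Fiber.yield` without a server or a large file, here is a minimal sketch under some assumptions: `IO::Memory` objects stand in for the file and the response, the 4-byte chunk size is chosen only to keep the log short, and `log` and `done` are hypothetical names. Each fiber copies one chunk with `IO.copy`, notes it in `log`, and then yields so the other fiber can take its turn.

```crystal
# Hypothetical demo: chunked copies that yield after every chunk.
log = [] of String         # records which fiber copied each chunk
done = Channel(Nil).new    # lets the main fiber wait for both copies

{"a", "b"}.each do |name|
  spawn do
    source = IO::Memory.new("x" * 10)  # stands in for the big file
    sink = IO::Memory.new              # stands in for the HTTP response
    loop do
      copied = IO.copy(source, sink, 4)  # copy at most 4 bytes
      break if copied == 0               # nothing left to read
      log << "#{name}:#{copied}"
      Fiber.yield                        # let other pending fibers run
    end
    done.send(nil)
  end
end

2.times { done.receive }
puts log.join(" ")
```

With the yield in place, the log entries of the two fibers should alternate rather than appear as two separate runs, which is the behaviour the chunked file copy relies on.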
@ysbaddaden ah I see that makes sense
Hi! Some weeks ago we met @luislavena in Buenos Aires and he gave us a great idea: start using StackOverflow as KB for Crystal, by asking and answering questions that flow in through Crystal's community channels (IRC, Google Groups, GitHub, etc.).
This issue was a perfect example for that so I captured it: https://stackoverflow.com/questions/44802809/how-can-i-stream-multiple-files-at-the-same-time-using-httpserver/44802810#44802810
Thanks everyone for participating 🙏