forkall
(Current)Process.fork
currently has forkall
semantics with respect to fibers. What this means is that all the fibers which were alive during the Process.fork
call remain alive in the forked process.
Since fd/sockets are shared across forks, fibers which were doing writes will do double writes which will cause:
Similarly, fd/sockets reads may get divided across forks. So, both processes will see
An example of double writes:
spawn do
5.times do |i|
sleep rand
puts "line #{i}"
end
end
process = Process.fork do
sleep 10
end
process.wait
Expected output:
line 0
line 1
line 2
line 3
line 4
Actual output:
line 0
line 0
line 1
line 2
line 1
line 3
line 4
line 2
line 3
line 4
IMO, these are really severe disadvantages so these semantics ought to change.
fork
like semantics (linux/unix)In this scenario, only the fiber invoking Process.fork
lives on in the forked process. Rest of the fibers are either collected(majority of them) and a few core runtime fibers like SignalHandler loop
are recreated.
pthread_atfork
like api for fibers may help here though i am not sure how successful that will be.This approach is in line with what go and erlang takes. They find fork
too difficult to implement correctly in a multi-threaded/multi-fiber/multi-actor environment that they don't even bother implementing it.
exec()
s to a crystal binary.Personally, I would prefer that fork without exec be removed from the stdlib so there's no risk of deadlocks or data corruption. If that's unacceptable, even implementing fork
semantics is still a step in the right direction.
PS: I've been implementing a supervisor
clone in crystal which gave me an opportunity to look deeper into Crystal's internals.
forkall
behaviour. I'll provide a root cause analysis in that thread.Yes, I'd like that too. We have fork
because we had it before we went with non-blocking IO and fibers like Go. Now that did that, it's probably time to remove fork
(which also doesn't exist in Windows...)
Another reason: the experiments on trying to bring multithread support to Crystal were difficult to achieve, and getting fork
to work was one of the reasons (not the reason, but one of them).
I used your example and ran it... got a similar input... I have scanned over the source code to see what's going on.
I do kind of see what you are suggesting, and how it's possibly happening. I agree with the fact that the implementation of fork needs to be improved, but I don't think that we have to throw it away either.
I tweaked your example and ran a comparable in Ruby for expected output. I am fearful of runaway processes, so I will always use a wait.
`
chan = Channel(Nil).new
spawn do
10.times do |i|
sleep rand
puts "line #{i}"
end
chan.send(nil)
end
t = Process.fork do
sleep 1
end
chan.receive
t.wait
`
I didn't quite the same output:
line 0
line 0
line 1
line 1
line 2
line 3
line 4
line 5
line 6
line 7
line 8
line 9
Do you think that the differences my be OS related?
These are going to be fun bugs to eradicate... :P
@asterite I think that the lack of windows operability shouldn't mean that we need to discard the whole thing. Now that Microsoft is going into the deepend on interoperability with Linux... these functions will be able to be accessed by MS users. There is always spawn or other ways to do it also for windows users.
Ah, the output was from 5.times
and not 10.times
. Edited my original post to reflect that. Also added process.wait.
@98-f355-f1 You are only getting line 0
twice because your forked process is only sleeping for 1 sec. Increase it to see similar output.
I think the only way to improve fork while keeping it would be implement forkone semantics but then we'll have to worry about mutexes and deadlocks. I am not sure if that will be considered an improvement.
So removing it is the safest choice.
see
https://thorstenball.com/blog/2014/10/13/why-threads-cant-fork/
https://news.ycombinator.com/item?id=8449164
I think fork should be kept if possible, maybe as suggested by enforcing fork and exec pattern.
Random idea: can we allow it only before any user fibers, or generally any fibers not explicitly marked as safe for fork, are created?
I'm leaning towards both positions (keep or remove) but I still lean a bit on the keep it side.
Yes, fork is dangerous, it can lead to disasters if invoked at whatever time; you should never have to use it; except for some rare edge cases where it makes sense on a POSIX system in a controlled environment (e.g. zero downtime binary reloads) without requiring C.
Just like daemonizing, I believe we should have a proper fork handling (that fixes a few bolts, like reseeding rand), mark it as unsafe and document it as very dangerous. Maybe remove the global ::fork
and keep only Process.fork
. I think it's better than having users just call LibC.fork
manually (never do that).
I think it's possible to keep fork with a multithreaded event loop. The GC must signal all threads to stop before it can start collecting memory, fork must do the same (unless it's a fork/exec).
fork is need for example here: https://github.com/kostya/curl-downloader, i not find any efficient ways except fork (i tried scheduler mapper to events, threads, curl multi). just write some danger comment for all who wants to use fork.
@forksaber you wrote above "PS: I've been implementing a supervisor clone in crystal which gave me an opportunity to look deeper into Crystal's internals." Could you explain how you did this?
@forksaber on the second note, I got similar type erratic behavior also, but I am impatient so I wasn't going to wait 14 min for the sleep LOL. I tried it at different times, but curiously when I rewrote with the wait, it sort of straightened up the behavior.
I guess that it seems that most folks want to keep the fork, perhaps remove the exec, and put in strong warning, use at your own risk, explaining some possible side effects.
I agree with the "keep fork as-is (but fix as many bugs as we can) but mark it as super dangerous" idea. We should provide a "sanitized" safe fork that works before fibers have been spawned too.
@98-f355-f1 You can check out my supervisor implementation here https://github.com/forksaber/supervisor
Almost feature complete but not much in the way of documentation and tests.
I think we should remove plain Process.fork
from the stdlib.
fork is not supported on win32 at all. That makes it a very platform-specific feature which we would like to avoid in stdlib.
On other platforms, there are difficulties with fibers and it's entirely unsupported with multithreading enabled. AFAIK this can't work reliably ever. In a forked process only one thread continues which makes it error-prone. So with MT moving forward, future use cases look very slim.
This answer on SO explains the issues with fork in Golang, which should apply exactly to Crystal as well: https://stackoverflow.com/a/28371586/7401689
As a replacement, fork could be made available in a platform-specific shard which can be added as a dependency for the few use cases where you would really want to use bare fork behaviour.
fork immediately followed by exec would obviously still be required as implementation for Process.exec
on POSIX platforms.
fork
can be kept under a platform-specific module. This modularization is common in other languages.
I think we can keep platform depended apis along with their logical modules but mark them as so. If someone uses them he should know about consequences (i.e non-portability). See my proposal about this concept - #8524.
I finally changed my mind on fork
. Keeping support would require a _lot of work_ and _lots of bugs_ to fix, especially with MT (we already had a bunch with a single thread) but... for what purposes?
Let's not bother.
The only concevable use of fork would be to insert safe privilege-dropping commands like a chroot or setrlimit between the fork and the exec. This would be useful for a container scheduler for example.
I would support an unsafe non-documented API to make these happen if someone was interested in the capability, but otherwise i'm fully in support of pulling fork.
I would support an unsafe non-documented API to make these happen if someone was interested in the capability, but otherwise i'm fully in support of pulling fork.
That would certainly be better placed in a documented API of a dedicated shard.
@straight-shoota well Process.exec
needs to be part of the stdlib, and it can't be obvious or documented how to use a post-fork exec callback since it's very unsafe. I don't see how this could be a shard.
Most helpful comment
I finally changed my mind on
fork
. Keeping support would require a _lot of work_ and _lots of bugs_ to fix, especially with MT (we already had a bunch with a single thread) but... for what purposes?Let's not bother.