Surge: Linux: vfork() is called in openURL() in invalid way

Created on 24 Feb 2019  Â·  59Comments  Â·  Source: surge-synthesizer/surge

The current use of vfork() in openURL() is invalid because:

  • vfork() must be only called with signals blocked. Otherwise spurious signals can trigger.
  • vfork() must be called with all threads suspended because the syscall will only suspend the calling thread.

Either these conditions must be taken into consideration when calling vfork() or it must be replaced with a regular fork(). Since opening an URL is not really any kind of performance scenarion, the 2nd option is probably more suitable.

References:

Most helpful comment

No it isn't. I checked in the from kernel sources just to make sure :)

All 59 comments

@falkTX, @baconpaul, I had to create a bug report because it is not really a matter of taste but an actual bug.

Ok I have literally zero opinion here. And I think it is also irrelevant at this code point. So happy for you to remove the character and merge. But I have nothing to contribute.

Like I said: I coded it with system and it worked fine. I had never heard of vfork until yesterday. So you guys sort it out however you want in Linux land!

Thanks

@rgareus can you comment on here please, since you did the vfork wrapper for ardour?

I'm fine with any end result as long as it is such that it cannot cause spurious behaviour such as memory corruption, which uncontrolled threads can potentially do.

The issues you mention, while valid in general scenarios, I do not think apply to us.
We are calling exec right after vfork, not handling any signals, doing allocations or anything like this.

The fact that it does not block the entire process is the entire point of it, and why we want to use it (so we do not block audio while starting an external process).

Signal handling is probably true for the reason that because of their global nature good behaving libraries do not tend to install signal handlers.

Surge does use pthreads. The race condition exists in the time window in the child process after returning from vfork() and before calling execlp(). Two copies of the same threads in different processes will continue to execute, which really meets the definition of "spurious behaviour" well.

The problem with fork() is that it duplicates the page tables and file-descriptors of the parent process. It is inherently not realtime-safe. Any processes that has realtime threads can not use fork(). In context of pro-audio on POSIX systems, perfer vfork(); execve();

@x42, How do you go on sorting the race with multiple threads? I rather take any day latency than memory corruption. That is not really something that I would call _pro_ in any context.

And what is anyway the bottleneck we are dealing with here? I cannot make up scenario where I would open URL's with Surge while playing audio but I assume there is one. Usually performance is optimized when you have a workload that does not meet your criteria, not the other way around.

First of all a plugin should not use threads to begin with. A plugin has insufficient information to set the thread's priority correctly nor can it ensure that it won't interfere with host's parallel processing.

Anyway for the case at hand it won't be an issue. The GUI is single-threaded to begin with (on MacOS it's always Thread 1 due to constraints put forward by Apple, and on GNU/Linux X11 or Wayland there is also only a single UI thread).

And what is anyway the bottleneck we are dealing with here?

This is the thing I don’t get also. This function is bound to a menu which opens the manual in a web broswer. If that makes a 300 frame audio glitch does it matter? It’s not like we are forking in a tight loop or in the audio loop. We have navigated a full menu structure with the mouse in the ui doing all the Cairo and font work and everything else.

So what I’m not getting is: what bad thing happens with fork and what bad thing happens with vfork?

Seems with fork you stall the audio thread is one point of view

Seems with vfork you have a tiny micro slice where a thread could double execute on the same memory but you won’t stall the audio thread is the other

Have I at least understood properly?

I suppose I could just disable this feature on Linux. It really is just opening a web page as a user convenience.

@x42, Thanks for sharing that. And I do get that. Like for any hard/soft realtime like system you better use co-operative multitasking to get timing exact as possible. And yes, with those constraints it should be OK.

I did some research and there is two threads in total in Surge. The main thread and loadPatchInBackgroundThread (lets rename this to something more punchy at some point).

We can probably continue to use vfork() if we suspend patchLoaderThread (wouldn't this be more punchy?) before calling it. Not too bad. And as @falkTX correctly stated there is no use of signals.

Still kind of do not see the point of not just using fork(). Not too hard to switch vfork() if that seems a necessity.

I'm curious how a thread could double execute due to the use of vfork(); execve().

Also even if there were any signals, it still would be fine:

"Signals sent to the parent arrive after the child releases the parent's memory (i.e., after the child terminates or calls execve(2))"
[ http://man7.org/linux/man-pages/man2/vfork.2.html ]

Am I missing something special to surge-synth?

There is nothing special to surge synth

See now I was hoping to end this without me having to go and read the vfork man page, but you guys broke me. Chuckle.

So @x42 I think the part of the man page you cite which worries @jsakkine is:

When vfork() is called in a multithreaded process, only the calling thread is suspended until the child terminates or executes a new program. This means that the child is sharing an address space with other running code.

(I was also a bit put off by all the "this is obsolete" and "we only needed this when bsd was a bad implementation" bits before you get to there, but lets leave that aside).

So I now understand his point of view which is: When you vfork you haven't forked memory but other threads can still manipulate data. So if, say, another thread changed the url which is passed in by reference to openURL after we vfork but before we execlp in the child process, the child process would read corrupted memory. It seems to me he is correct in that regard from reading the documentation.

Practically that won't happen here but we should at least take a copy of the url.c_str() before we vfork to assuage that fear right?

Or just fork of course.

As to "other stuff" happening between vfork and exec since the fork process uses no other parent memory there's nothing else which could happen I don't think.

But in the function in question I think @jsakkine has a legit worry that url.c_str() could change from parent.

For your convenience here's the function

        void openURL(const std::string &url)
        {
           if (vfork()==0)
           {
              if (execlp("xdg-open", "xdg-open", url.c_str(), (char*)nullptr) < 0)
              {
                 exit(0);
              }
           }
        }

But look this is all a bit silly right? This is called from a menu to open a manual. What's the lowest risk way to do it. That's what I really want to know!

Thanks all!

Oh and in the call stack: The url is not shared by another thread it is only shared by the parent thread so the worry I raise about it changing in this instance isn't one which will happen.

But again I'm trying to not invest in understanding vfork too much. So if I'm wrong apologies!

@baconpaul you are correct.

void openURL(const std::string &url) ; the url is constant and cannot change. The parent thread will wait for exec or exit, so it cannot leave scope either.

But again I'm trying to not invest in understanding vfork too much

Hey, never let what you are really trying to accomplish interfere with deepening your knowledge of computer science :)

Right. If that had signature arbitrary* foo and we had other threads doing unlocked writes to foo though we could somehow break.

So what do you think @jsakkine? Is there some case in this instance we are missing which can blow us up?

Checking.

Thanks! Also @jsakkine I’m not relying in general the const meaning the url can’t change (which @x42 mentions). Insert entire history of const being confusing with multithreading and so on. Or just const being confusing. I am sure I can construct a pathological case with multithreading references and const which acts unexpectedly.

But in this case I am calling this from a single thread with a non shared variable. That fact was part of my analysis.

Thanks!

@baconpaul, fully agree.

@x42,the problem comes with the signals that are sent to the process group. Then it gets buffered for the parent and executed by the child. Kaboom. You have a race condition.

I did not see any of the comments resolve loadPatchInBackgroundThread. It should be suspended before vfork() is called.

The way I would look at this issue is this: is there any real-world use case where you simultaneously play with Surge and open URLs in real-time? I kind of find this discussion disturbing because we are using tools that require extreme care with a non-existent use case. Probably would not matter for practical use even if latency was one second.

Yeah this is definitely not an optimization that matters. But it has made me go understand vfork against my will so I’m curious :)

I think the question about the load thread - which I am really curious about - is what could it do which is bad in this case. It keeps running fine. It doesn’t adjust any of the memory used here because of the call shape. So what goes wrong?

does the thread get duplicated when you vfork - like is there a child profess with another version of the thread writing the same memory?

If not then here there’s no problem.

If so then yeah I see the problem!

@baconpaul, it does yes and will not be suspend by vfork(). vfork() only suspends the caller thread. Other ones run freely using the same memory as the parent process.

vfork() originates BTW from BSD, not from Linux :) macOS should have it too.

Gotcha. So then in this case If day a midi program change events occurs at the same time as a user opening the url, the patch could double load into one memory segment. Patch loading is idempotent so that’s probably ok but it does Malloc with a table from patch in some cases I think.

So we have to suspend that thread. Or use fork. But fork copies way more than needed so can glitch the audio thread.

So now I think I have a complete understanding of the issue which is technically correct yes?

macOS should have it too

I checked. It does. I open urls on Mac using system”open” because it’s just opening a url :)

Yes, that is how it is :) I think performance is really a priority but not over "working correctly".

I think the right path is to use fork() for the moment and if anyone shouts we will reconsider for example suspending that helper thread. I doubt anyone ever will.

Yes I agree that in this case if no one disagrees with the technical summary here, I would pick fork.

If in a live performance session you open a manual and get a glitch well ... uhh don’t do that. If in a user session you open a manual and get a crash ... that’s worse even though it won’t happen practically. And really neither will happen in any likely scenario I don’t think.

It's your synth but any potential dropout is a no-go in my book. You don’t want your software’s audio to glitch, ever.

Ps. Also keep in mind that your software calling fork() affects all other plugins that are running. It's not only your memory space and file-descriptors that are copied! In the past this lead to various issues (from OOM, to crashes other plugins when their file-des were copied).

and, yes you should also use vfork() on MacOS/X (and *BSD).

@x42: That does not take away a fact that the loader thread still exists. However, your argument about other plugins is a valid one. Thanks for sharing that.

I could look into pausing the loader thread? Putting a thread into TASK_STOPPED is only a matter of one user-kernel round trip. Does this sound like a fair compromise?

Ugh right. Other plugins. What a pain.

I’m also curious @x42 - in your synths how do you direct users to online documentation? Like what do you actually do in the exact use case here? Maybe I should just solve the problem I’m trying to solve.

@jsakkine why is the loader thread special. What about threads in the host or other plugins in the same process space? No way to stop those.

@baconpaul, AFAIK that should not be a problem as they manipulate only their local data. The malicious use does not count as they could maliciously manipulate each others data even without vfork().

BTW, a side-note, it is a flaw in existing plug-in architectures that they don't explicitly disallow the use of signals for plug-ins. In Linux you could enforce that in the plug-in host with the use of a seccomp filter (way to filter which syscalls a process can make) and I'd assume that macOS has something similar.

@baconpaul I'm usually on the host-side (ardour.org) and wrote the vfork implementation there. It was @falkTX who summoned me ^^

@jsakkine What signals do you actually use that could lead to a race? Can plugins make use of signals anyway? Signals would be sent to the process (the plugin-host). I'd expect you'll have bigger issues than a race between vfork() and exec().
Also doesn't sending signals to a process-group require privileged access (usually root)?

It can be done outside the DAW i.e. send a signal to process group if the DAW is running with the same UID or with a privileged access. Nothing prevents plugins from sending signals unless you have something like a seccomp filter.

Anyway, signals are out of the scope when it comes to Surge as it has been concluded and I proposed a solution that involves only a single kernel round trip where the action taken by the kernel is roughly setting a flag.

BTW, using multiple threads in plugins is not really something that should be avoided as long as it is well considered. For example, Serum uses separate threads for DSP and GUI and I'm sure that is not the only one.

@jsakkine but wouldn’t they also be duplicated?

I guess I am trying to understand what happens to threads other than calling thread when you Vfork. The manual leads me to believe they continue running in the same address space. @jsakkine you seem to say that they keep running in the same address space and also double - the child pid has all the threads duplicated from the parent. If so I’m amazed it ever works. But I don’t see why “write to local data” makes any difference. Our second thread also writes only to local data.

Also: this is a huge waste of time! Quite literally surge glitches all platforms when we swap sample rate now. I should spend my time there! But I still don’t think we all agree on the basics of what vfork does.

So let me ask a simple question. You have a program running with 2 pthreads.

You fork and you get 2 processes with ~2 threads each and~ 2 memory spaces (implemented as cow).

You vfork and you get what? I think it is

2 professes
~2 threads each~ EDIT: This was the crux. Only calling thread gets duplicated.
The parent vforking thread is suspended
The child vforking thread is running
Each of the “other” threads is running in each of the process spaces
Those threads share memory

In that case why don’t all threads need suspending when you vfork if you want to avoid multi writes?

And if that’s not correct what happens?

Also I want to print out this issue and put it in a book of jokes called “how many programmers does it take to open a web page on Linux”. Chuckle.

@x42

I'm usually on the host-side (ardour.org)

Thanks for writing ardour first of all!

In ardour how do you direct users to documentation? Like what does the “help/manual” menu do? (I know I could go find it but figure you know!)

Thanks

@baconpaul, yes they would :( Kind of misguided my own thoughts... In order to use vfork() we would have to have a way to pause all threads in all plugins. And using threads is not artificial only applying to Surge as my Serum example clearly showed.

To summarize fork() is the only choice.

Sadly Ardour's OpenURI is a bad example. it uses system() on Linux: https://github.com/Ardour/ardour/blob/master/libs/pbd/openuri.cc#L42 (and https://github.com/Ardour/ardour/blob/master/libs/pbd/cocoa_open_uri.mm)

For pretty much all child processes there is https://github.com/Ardour/ardour/blob/master/libs/pbd/system_exec.cc

@x42, @falkTX, @baconpaul: Anyway, thanks for this discussion! Has really made to think this context out-of-the-box. At least I have learned a thing or two :)

@jsakkine could you explain why you said "In order to use vfork() we would have to have a way to pause all threads in all plugins." -- if you don't use signals, there should be no issue.

The problem comes from, as @baconpaul correctly stated, from all other plugins running in the same process. They get duplicated too.

The way see it threads are not a bad thing. The problem is what kernel provides to launch processes. EThe OS must have an atomic syscall to do fork and exec. Unfortunately in Linux does not have such a syscall.

I think I might write a summary this to LKML at some point because it is an interesting workload for kernel but in short term the next best solution is to use fork().

@x42

I hope you realize I’m being sincere and not a jerk when I let you know that this

Sadly Ardour's OpenURI is a bad example. it uses system() on Linux

Made me laugh and laugh! Like a “we are all so silly” laugh not a “you are silly” laugh. In my life the number of long engineering threads which have ended with “yeah we do it the old way but it’s not good” is too plentiful to count :)

This thread exists because my first implementation of openURL used system on Linux and @falkTX told me it was better as vfork!

Sigh. So system is fork / exec. But looks like you made the same choice I originally did!

So uhh... I don’t know where that leads us. More educated and reverting to system “xdg-home” + url?

Well system() does not make much sense. It is just a more complex way to call fork(). I think replacing vfork() with fork() is the way to go right now.

I think I'll discuss about the atomic spawning at work and see if it is an issue that could be workaround in some future Linux version.

Ardour moved away from using system() for many things, already. I was actually considering to update the code and copy your implementation there before answering :)

Anyway vfork() does not duplicate threads. @baconpaul's description is not correct. You don't end up with *2 threads.

        void openURL(const std::string &url)
        {
           if (vfork()==0)
           {
              if (execlp("xdg-open", "xdg-open", url.c_str(), (char*)nullptr) < 0)
              {
                 exit(0);
              }
           }
        }

That would have been pretty funny especially if you made a commit referencing this issue!

OK if vfork doesn’t duplicate threads then I don’t see what the risk is @jsakkine?

Quickly/dirty test shows that a running thread is not duplicated: https://gist.github.com/x42/60dfae7e5f2ec8cdb239087c324d10d0

@x42, @baconpaul, right! only the calling thread is copied i.e. potential problem could only come if the unsuspended threads in the parent would do something bad.

No it isn't. I checked in the from kernel sources just to make sure :)

OK, anyway enlightening discussion, thank you. And I still think that it would be really nice to have an atomic fork-exec. This is way too complicated!

+1 for an atomic fork-exec. I'd vote you into the POSIX committee for that if I could :-)

Posix has posix_spawn() but it is a library call in Linux. No idea if some other *nix has it implemented as a syscall.

https://gist.github.com/baconpaul/eae4ba8fcbcfa0d25f963568777aea5d

I ran this also just now and yeah only the calling thread gets duped.

Cool so we are OK with vfork! Great. Look forward to the ardour menu update! Chuckle.

Oh also: I'll probably change the mac implementation to vfork exec next time I'm in there! Ha.

Just one last thing here: I actually didn't use system("open") on macOS it turns out. I was smarter than that.

        void openURL(const std::string &url_str)
        {
            CFURLRef url = CFURLCreateWithBytes (
                NULL,                        // allocator
                (UInt8*)url_str.c_str(),     // URLBytes
                url_str.length(),            // length
                kCFStringEncodingASCII,      // encoding
                NULL                         // baseURL
                );
            LSOpenCFURLRef(url,0);
            CFRelease(url);
        }
Was this page helpful?
0 / 5 - 0 ratings

Related issues

mortfell picture mortfell  Â·  8Comments

baconpaul picture baconpaul  Â·  7Comments

esaruoho picture esaruoho  Â·  10Comments

baconpaul picture baconpaul  Â·  5Comments

mkruselj picture mkruselj  Â·  6Comments