Emscripten: ASYNCIFY_BLACKLIST functions can't be synchronously cwrap()'d

Created on 8 Sep 2019 · 15Comments · Source: emscripten-core/emscripten

There's a sensible assertion in ccall() which helps the Asyncify build avoid situations where you might "yield while already yielding":

https://github.com/emscripten-core/emscripten/blob/incoming/src/preamble.js#L146

It's a good idea...but sometimes you would like to cwrap() a function on the ASYNCIFY_BLACKLIST (implicitly discovered, or explicitly specified). That should be okay.

My current workaround for this situation is to not use the "official" EXTRA_EXPORTED_RUNTIME_METHODS=['ccall', 'cwrap'], but maintain my own local duplicates ccall_tolerant and cwrap_tolerant which just comment out the assert (and then trust that I know what I'm doing 🤨). It's a bit unfortunate since my code isn't coming from something that is preprocessed, so all the #ifs have to be either dropped or kept. I basically lose all the asserts in the process.

More ideal would be if there could be some knowledge of what had been asyncified and what hadn't, informing the wrapping process. Since it's an assert such a probe would not need to be particularly speedy--especially since it could be done at cwrap() time, and only in the debug builds.
Perhaps the assert trigger wouldn't be on the cwrap()/ccall() side in JS, but came from the WASM side? e.g. the Asyncify process that adds instrumentation would--in the debug build--add a bit more instrumentation that checked a flag on entry to see if it was in the wait state and raised a signal of some kind at that point?

If auto-detection is not feasible I don't know if this is common enough to justify yet-another-parameter saying "I know this function won't yield". But wanted to raise the point and link to this issue as documentation to point to for the workaround, in case better ideas of how to do this become available.

wontfix

Source

hostilefork

All 15 comments

Hm, looking at this behavior this still looks right to me. I implemented this as an analogous copy of Emterpreter's Asyncify behavior / assertions.

Your function that is in ASYNCIFY_BLACKLIST should be running synchronously. There seems to be multiple things going wrong here:

When cwrap-ing a function in ASYNCIFY_BLACKLIST you should pass {async: false} in the opts argument of cwrap() (or omit it because the default is false) . It seems like you are passing true, because that assertion is only thrown when opts.async is true.
It also seems like you are running a ASYNCIFY_BLACKLIST function while a function that is async is currently executing. For this reason, when you fix the previous point, you will actually end up triggering the next assert() likely: https://github.com/emscripten-core/emscripten/blob/34a0be62d6a10148f80f1d310a4cc222c8c4160d/src/preamble.js#L149 You should wait until the any asynchronous execution is done executing (by waiting for its Promise to resolve) before starting synchronous execution.

AjayP13 on 9 Sep 2019

If auto-detection is not feasible I don't know if this is common enough to justify yet-another-parameter saying "I know this function won't yield". But wanted to raise the point and link to this issue as documentation to point to for the workaround, in case better ideas of how to do this become available.

Actually looking at this further, it looks like what you are asking for is if there is a way to not have to explicitly tell cwrap whether the function is going to execute asynchronously or not and have it determine that on it's own (i.e. remove the opts.async parameter completely).

Temporarily, what you can do is just pass {async: false} to cwrap. That is the "parameter saying 'I know this function won't yield'" you are asking for :).

opts.async likely exists a design decision. It is actually possible for us to determine at runtime which functions are executing synchronously / asynchronously by checking the Asyncify.currData or EmterpreterAsync.state. But it's better for you as the client to be explicit about whether you want the function to return a Promise or not. If we scrapped opts.async then sometimes the function would return a Promise and sometimes it wouldn't depending on whether anything async happened, which would be weird.

AjayP13 on 9 Sep 2019

Temporarily, what you can do is just pass {async: false} to cwrap. That is the "parameter saying 'I know this function won't yield'" you are asking for :).

That sounds nice, but that's not in line with what the assert says:

 assert(opts && opts.async, 'The call to ' + ident + ' is running asynchronously. If this was intended, add the async option to the ccall/cwrap call.');

If opts.async is false you will get assert(false, ...). And when you pass false to an assert, it triggers.

e.g. I think my interpretation of the situation is correct. It only permits promise-based calls in this case, regardless of blacklist status.

Perhaps this suggests some kind of tristate for async, where the default is conservative (you get the assert) and then explicit true or false is taken as a directive from the caller. Though I still like my idea of knowing about the blacklist status and using that--seems more idiot(me)-proof

it's better for you as the client to be explicit about whether you want the function to return a Promise or not.

I certainly would not argue differently! :-)

hostilefork on 9 Sep 2019

Yes, but that's actually because of point number 2 in my first response.

It also seems like you are running a ASYNCIFY_BLACKLIST function _while_ a function that is async is currently executing.

You're inside an if block that is conditioned upon Asyncify.currData being true: https://github.com/emscripten-core/emscripten/blob/incoming/src/preamble.js#L141

Asyncify.currData should only be true if there is currently an asynchronous function executing that has yielded. Let me know if you think that is not the case though, that would be a bug.

AjayP13 on 9 Sep 2019

It also seems like you are running a ASYNCIFY_BLACKLIST function while a function that is async is currently executing.

Yes (?)...and...why should I not do this? It won't yield, and I need the functionality provided by the non-yielding function. For background:

https://github.com/emscripten-core/emscripten/issues/6820

hostilefork on 9 Sep 2019

Hm, I think the stack should be fine if we allowed synchronous execution while an asynchronous execution has yielded since once synchronous execution finishes it should leave the stack in the same state.

AFAIK you shouldn't do this because it can lead to undefined behavior since the asynchronous function hasn't finished executing and you are starting up another function. Even though your program might not have this requirement, in general, the asynchronous function might expect that nothing in memory will be touched while it is sleeping, however, the synchronous function could change memory addresses. Imagine a "pure C" code base where a function sleeps. If the program is not multi-threaded, memory addresses should not have been touched when the program resumes from sleep. For this reason, it can cause undefined behavior as it is similar to running functions in a multi-threaded manner on non thread-safe code.

If your code is "thread-safe" you could just disable assertions with -s ASSERTIONS=0, but of course, this would disable all assertions.

@kripken might be able to say whether my interpretation of the limitations here are correct or not.

AjayP13 on 9 Sep 2019

👍1

Even though your program might not have this requirement, in general, the asynchronous function might expect that nothing in memory will be touched while it is sleeping, however, the synchronous function could change memory addresses.

Ugh. Well glad I brought this up... :-(

Clang must have some kind of mechanism in the emitter to put out a register flush, or memory barrier (or whatever) during emscripten_sleep(), doesn't it? There's certainly no optimization concern with doing so, right?

We really need this, for what we're doing! 😱 I don't mind a special call (or extra call) if what I'm asking for seems unusual. emscripten_sleep_may_run_sync_code(...) or emscripten_flush(); emscripten_sleep(...), for example.

hostilefork on 9 Sep 2019

So we could add a parameter something like allowConcurrentSynchronousExecution (a mouthful, so we need a better name) on an async function's cwrap that would set Asyncify.allowConcurrentSynchronousExecution and explicitly indicate that it is "thread-safe". Then, when a synchronous cwrap gets called, we can skip the assert if this is set to true.

I don't know if there are other problems that could occur though with executing two functions "concurrently" like that.

AjayP13 on 9 Sep 2019

👍1

So we could add a parameter something like allowConcurrentSynchronousExecution

My idea about knowing the blacklist status and having it be automatic sounds nice and all. But just this would be a big step up from having a copy/paste of ccall()/cwrap() minus asserts into our code!

I don't know if there are other problems that could occur though with executing two functions "concurrently" like that.

Certainly I'd prefer not to do anything unsound...

hostilefork on 9 Sep 2019

Yeah, we can't just use the blacklist for this, because there are certainly people using the BLACKLIST with code that would be non-thread safe and it would be nice to warn them if they try to do something potentially unsafe. The option parameter is the better option, so you can explicitly say "I know this function is okay with the memory state potentially being changed by some other function."

I think @kripken would know if there are any other pitfalls here / whether it makes sense to add this.

AjayP13 on 9 Sep 2019

there are certainly people using the BLACKLIST with code that would be non-thread safe and it would be nice to warn them if they try to do something potentially unsafe.

Perhaps having emscripten_sleep_threadsafe() as a separate entry point could give the best of both worlds?

This topic seems to come up here: https://github.com/WebAssembly/binaryen/pull/2327

"Names of all the asyncified export functions are now added as data of custom 'asyncify' sections.
This allows to detect them in a JS wrapper and wrap into async JS functions only if necessary."

hostilefork on 10 Sep 2019

I think the above summary is basically correct, good discussion! Yes, in theory it is safe to call synchronous code while sleeping, but it can violate assumptions in the C program. For example, while blocking on a network access in a GUI program, a key event may arrive, and if it's handled, that might break the program somehow.

In general the older Emterpretify tried to be smarter around this, and had sleep vs sleep_with_yield. But it was more trouble than it's worth. Overall in the new Asyncify we don't have as many checks.

In theory maybe we can just remove the ccall check. I think that would be more consistent actually, now that I think about it. So we should assert on an async ccall being started while one is already in flight, but not assert on a sync one. How does that sound?

kripken on 11 Sep 2019

In theory maybe we can just remove the ccall check. I think that would be more consistent actually, now that I think about it. So we should assert on an async ccall being started while one is already in flight, but not assert on a sync one. How does that sound?

That would work for my purposes, and would be much appreciated!

hostilefork on 11 Sep 2019

Sounds good, I've put the change into a PR!

AjayP13 on 11 Sep 2019

👍1

This issue has been automatically marked as stale because there has been no activity in the past year. It will be closed automatically if no further activity occurs in the next 30 days. Feel free to re-open at any time if this issue is still relevant.