Emscripten: Dynamic linking & indirect calls

Created on 8 Mar 2019 · 11 Comments · Source: emscripten-core/emscripten

Currently in emscripten dynamic linking we share functions by exporting and importing them. This should be faster than indirect calls, except for the case of calling a function from a later-linked module (since we don't have anything to import yet, so we import a JS indirection helper thunk).

This has a problem, though: pointer identity is not preserved. If module A exports a function foo, and module B imports it and wants to call foo indirectly, then in the current model B has received the function itself rather than a table index. To make it indirectly callable, we create a little helper function and put that in the table. But the helper has a different index than the original (if the original was also in the table), so two function pointers may differ even though they refer to the same actual function.
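A toy model may make the failure concrete. This is only a sketch: a plain JS array stands in for the wasm table, and `addToTable`, `fooPtrInA`, and `fooPtrInB` are hypothetical names, not Emscripten internals.

```javascript
// Toy model: a plain array stands in for the wasm function table.
const table = [];

// Appends fn to the table and returns its index (its "function pointer").
function addToTable(fn) {
  table.push(fn);
  return table.length - 1;
}

// Module A defines foo and takes its address, so foo is in the table.
function foo() { return 42; }
const fooPtrInA = addToTable(foo);

// Module B imported foo itself. To call it indirectly, B wraps it in a
// helper and puts the helper in the table.
const fooHelperInB = (...args) => foo(...args);
const fooPtrInB = addToTable(fooHelperInB);

console.log(table[fooPtrInA]() === table[fooPtrInB]()); // true: same behavior
console.log(fooPtrInA === fooPtrInB);                   // false: different "pointers"
```

Both entries end up calling the same function, yet a C-level comparison of the two pointers would report them as different.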

Options:

  1. Do all dynamic linking cross-module calls using indirect calls. This would be simplest: when a module exports functions, it would just export the function pointer indexes, and not the actual functions. But the overhead of indirect calls is high (on the other hand, this would avoid the JS indirection helpers).
  2. Also export the function pointer indexes, in addition to the functions themselves. Then another module can use either the function for a direct call or the index for an indirect one.

In both of those, a downside is that we'd need to place every exported function in the table, and also export those indexes.
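Option 2 could look roughly like the following. This is a sketch under assumptions: the array models the wasm table, and the `foo_index` export name is made up for illustration.

```javascript
// Toy model of the shared wasm table.
const table = [];

// Hypothetical module A: places foo in the table and exports both the
// function and its table index.
function instantiateModuleA() {
  function foo() { return 42; }
  table.push(foo);
  return { foo, foo_index: table.length - 1 };
}

const a = instantiateModuleA();

// Module B uses the function for direct calls...
const directResult = a.foo();

// ...and the exported index whenever it needs foo's address.
const fooPtrInB = a.foo_index;
const indirectResult = table[fooPtrInB]();

console.log(directResult === indirectResult); // true
console.log(table[fooPtrInB] === a.foo);      // true: no helper in between
```

Since every module uses the one exported index, there is a single table entry for foo and pointer comparisons agree.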

cc @sbc100 @dschuff @awtcode

All 11 comments

I also realized this problem recently.

I like the idea of (1). This means that the wasm table becomes the equivalent of the ELF PLT, where all calls go through a level of indirection. This also allows for things like symbol interposition and runtime redirection via table.set().

I think the way it would work is that DLLs that import external functions would instead import a global holding the "address" (table index) of a function. So all calls to non-local-bound symbols would look like:

 import global plt.$foo
 ...
 get_global $foo
 call_indirect

The JS loader would then construct plt as a map from function name to table index.
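A minimal sketch of that loader step, assuming a plain array in place of the wasm table. `registerExports` and `callIndirect` are hypothetical helpers, and the last lines model the `table.set()` style redirection mentioned above.

```javascript
const table = [];        // stand-in for the wasm table
const plt = new Map();   // function name => table index

// Hypothetical loader step: give each newly exported function a table slot.
// The first definition of a name wins.
function registerExports(exports) {
  for (const [name, fn] of Object.entries(exports)) {
    if (!plt.has(name)) {
      table.push(fn);
      plt.set(name, table.length - 1);
    }
  }
}

registerExports({ foo: () => 'foo from A' });
registerExports({ bar: () => 'bar from B' });

// Models "get_global $foo; call_indirect" on the importing side.
function callIndirect(name, ...args) {
  return table[plt.get(name)](...args);
}

const before = callIndirect('foo');

// Runtime redirection: overwrite the slot, and every caller sees the change.
table[plt.get('foo')] = () => 'interposed foo';
const after = callIndirect('foo');

console.log(before, '->', after);
```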

As an optimization to allow for at least some cross-DLL direct calling, we could distinguish between address-taken functions (which we import and call by table index) and non-address-taken functions, which we can import and call directly (but never put in a table).

Yeah, perhaps the importing module would import both the function and its index in the table, and then use the former for direct calls, and the latter for indirect ones. Then we can just remove the current wrapper code.

This all does mean that modules must place all exports in the table, and must export both the function and the table index, which seems annoying, but perhaps unavoidable.

> Yeah, perhaps the importing module would import both the function and its index in the table, and then use the former for direct calls, and the latter for indirect ones. Then we can just remove the current wrapper code.
>
> This all does mean that modules must place all exports in the table, and must export both the function and the table index, which seems annoying, but perhaps unavoidable.

Does it?

Couldn't modules just export the functions, and then import the table addresses of any functions they export? Then the dynamic loader could be in charge of building the table from the combined exports of all the modules.

I guess each DLL might still have its own private table section for "hidden" (non-dll-exported) symbols.

Just to clarify, when we have code like the following in the main module:

globalFnPtr == sidey

where sidey is a function imported from a side module, are we comparing the function's address or the table index? If it is the former, does it mean that the address needs to be imported from the side module as well?

With wasm, a function's address is its table index; at least, that is how we model the function address in C/C++ code today. So that operation is a comparison of two table indexes.

One approach suggested here, option (1), is to import only the table index, so that all calls become indirect calls (at least all calls across the DLL boundary).

> Couldn't modules just export the functions, and then import the table addresses of any functions they export? Then the dynamic loader could be in charge of building the table from the combined exports of all the modules.

Good point. Yeah, talking with @dschuff, it does seem like a good approach would be to have the dynamic loader do it. So modules export the functions, and the loader tracks function name => wasm function. Then we have a JS Proxy object for the imports, and when we see that someone imports __address$foo (or some other way to say "the index of foo"), we add that function to the table if it isn't there yet, and now have an index we can give the import. In other words, lazy adding to the table.
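The lazy scheme could be sketched like this. It is a model only: a plain array replaces the wasm table, ordinary JS functions replace wasm exports, and the `__address$` prefix is the placeholder naming from the comment above rather than a settled convention.

```javascript
const table = [];                   // stand-in for the wasm table
const functionsByName = new Map();  // function name => function
const indexByName = new Map();      // function name => table index, assigned lazily
const ADDRESS_PREFIX = '__address$';

// Import object handed to modules. Asking for "__address$foo" adds foo to
// the table on first use; asking for "foo" returns the function itself.
const imports = new Proxy({}, {
  get(_target, prop) {
    if (typeof prop !== 'string') return undefined;
    if (!prop.startsWith(ADDRESS_PREFIX)) return functionsByName.get(prop);
    const name = prop.slice(ADDRESS_PREFIX.length);
    if (!indexByName.has(name)) {
      table.push(functionsByName.get(name));
      indexByName.set(name, table.length - 1);
    }
    return indexByName.get(name);
  },
});

// A loaded module exported foo; the loader recorded it by name.
functionsByName.set('foo', () => 42);

const tableSizeBefore = table.length;    // nothing is in the table yet
const ptr1 = imports['__address$foo'];   // first request adds foo
const ptr2 = imports['__address$foo'];   // later requests reuse the index

console.log(tableSizeBefore, ptr1 === ptr2, table.length);
```

Functions whose address nobody asks for never occupy a table slot, and every module that does ask gets the same index.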

I'd love to see this fixed. I added a failing test case for it to confirm how it currently fails.

@kripken do you have time to look into this or should I?

I like the idea of assigning the table indexes lazily; however, that does mean that the defining module for foo will also need to call __address$foo to find its address, i.e. nobody will know the address of foo until runtime.

Another approach would be for modules to export __address$foo for every function foo that they export. That would double the number of exports, though.

@kripken and @sbc100, I found another issue last week which might be related. The problem happens when we pass a pointer to a derived object to a function in another module: the dynamic_cast fails in the other module.

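main.cpp

```cpp
#include <cstdio>
#include <emscripten.h>
// base and derived are not shown in the original report; they are assumed to
// come from a header shared by both modules, with base being polymorphic.

extern "C" void sidey(base* arg);  // defined in the side module

extern "C" EMSCRIPTEN_KEEPALIVE
void mainy(base* arg) {
  derived* temp = dynamic_cast<derived*>(arg);
  printf("temp:%p\n", static_cast<void*>(temp));
}

int main() {
  derived* temp = new derived();
  printf("main: temp:%p\n", static_cast<void*>(temp));
  sidey(temp);
  return 0;
}
```

side.cpp

```cpp
#include <cstdio>
#include <emscripten.h>
// base and derived: same assumed shared definitions as in main.cpp.

extern "C" void mainy(base* arg);  // defined in the main module

extern "C" EMSCRIPTEN_KEEPALIVE
void sidey(base* arg) {
  derived* temp1 = dynamic_cast<derived*>(arg);
  printf("sidey: arg:%p temp1:%p\n", static_cast<void*>(arg),
         static_cast<void*>(temp1));

  base* temp = new derived();
  mainy(temp);
}
```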

@sbc100 - I don't think I'll have time for this, but in any case, we should probably just focus on the LLVM wasm backend anyhow, which you're doing anyway (that is, I don't see a point to fixing this in fastcomp, since we hope to deprecate it).

I think we have a reasonable plan here, where the dynamic loader can assign the final fixed index in the table to any function, even for a function in a module not yet loaded (it reserves the spot, and will fill it in later).
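Reserving a slot ahead of time might look like the sketch below. The array models the wasm table, and `addressOf` and `onModuleLoaded` are hypothetical loader hooks.

```javascript
const table = [];                   // stand-in for the wasm table
const indexByName = new Map();      // function name => fixed table index
const functionsByName = new Map();  // function name => function, once loaded

// Returns the fixed table index for `name`. If the function is not loaded
// yet, the slot is reserved with null and filled in later.
function addressOf(name) {
  if (!indexByName.has(name)) {
    table.push(functionsByName.get(name) || null);
    indexByName.set(name, table.length - 1);
  }
  return indexByName.get(name);
}

// Called when a module finishes loading: record its exports and fill any
// slots that were reserved for them earlier.
function onModuleLoaded(exports) {
  for (const [name, fn] of Object.entries(exports)) {
    functionsByName.set(name, fn);
    if (indexByName.has(name)) table[indexByName.get(name)] = fn;
  }
}

const barPtr = addressOf('bar');           // bar's module is not loaded yet
const slotBefore = table[barPtr];          // null: reserved but empty
onModuleLoaded({ bar: () => 'bar' });
const resultAfter = table[barPtr]();       // the same index now reaches bar

console.log(slotBefore, resultAfter, addressOf('bar') === barPtr);
```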

Issues left:

  • It seems like for globals we do need mutable imported globals, as the dynamic loader can't actually figure them out ahead of time (each function takes one slot; a global is part of a module's memory allocation, whose size we only see later, unless we parse it out of the modules).
  • For direct calls, we can hook them up directly, but for cycles (a module not yet loaded) it seems we have to add a stub + an indirect call.
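For the cyclic case, the stub could be as small as the following sketch. It assumes the same array-as-table model; `slotFor` and `directImport` are hypothetical names.

```javascript
const table = [];               // stand-in for the wasm table
const indexByName = new Map();  // function name => reserved table index

// Reserves (or looks up) the table slot for `name`.
function slotFor(name) {
  if (!indexByName.has(name)) {
    table.push(null);
    indexByName.set(name, table.length - 1);
  }
  return indexByName.get(name);
}

// Import for a direct call: the function itself if its module is already
// loaded, otherwise a stub that does an indirect call through the table.
function directImport(name, loaded) {
  if (loaded.has(name)) return loaded.get(name);
  const index = slotFor(name);
  return (...args) => {
    const target = table[index];
    if (target === null) throw new Error(name + ' is not loaded yet');
    return target(...args);
  };
}

const loaded = new Map();
const callBar = directImport('bar', loaded);  // bar not loaded: we get a stub

// Later, bar's module loads and its reserved slot is filled.
const bar = (x) => x + 1;
loaded.set('bar', bar);
table[slotFor('bar')] = bar;

console.log(callBar(1));                           // 2, via the stub
console.log(directImport('bar', loaded) === bar);  // true: hooked up directly
```

Modules loaded after bar get the function itself, so only the earlier side of a cycle pays for the extra hop.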