Houston I have a problem :D
Hello hackers ::)
C-code
void best(int *x,int offset_a,int offset_b){
int * a = x;
int * b = (x+offset_a);
int buff;
for(int i=0;i<offset_a;i++){
buff=b[i];
b[i]=a[i];
a[i]=buff;
};
}
compile
emcc convolve.c -v -g3 -Os -s "WASM=1" -s "SIDE_MODULE=1" -o convolve.wasm
JS-code
default load/init like this => https://gist.github.com/kripken/59c67556dc03bb6d57052fedef1e61ab
var i32 = new Uint32Array(wmodule.imports.env.memory.buffer);
var a = new Uint32Array([10,10,10,10,10]);
var b = new Uint32Array([20,20,20,20,20]);
i32.set(a,0);
i32.set(b,a.length);
console.log("add",i32);
var time = Date.now();
wmodule.module._test(i32,a.length,b.length);
var times = Date.now();
console.log("test time: ",times-time);
console.log("mov",i32);
I run it in the browser and get:
add Uint32Array [ 10, 10, 10, 10, 10, 20, 20, 20, 20, 20, 4194294 more… ] pica.js:423:5
test time: 569
mov Uint32Array [ 20, 20, 20, 20, 20, 10, 10, 10, 10, 10, 4194294 more… ] pica.js:428:5
/******/
add Uint32Array [ 10, 10, 10, 10, 10, 20, 20, 20, 20, 20, 4194294 more… ] pica.js:423:5
test time: 388 pica.js:427:5
/*other log*/
add Uint32Array [ 10, 10, 10, 10, 10, 20, 20, 20, 20, 20, 4194294 more… ] pica.js:423:5
test time: 609
300~600 ms!
Why is it so slow?
firefox 52.0.1 x64 Debian
my repo https://github.com/fedor-elizarov/convolve-wasm
One issue at least is that you are passing the Uint32Array to a WebAssembly function. This doesn't do what you think; it will actually just coerce the Uint32Array to an int32 value and use that as an index into the WebAssembly instance's linear memory.
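A minimal sketch of that coercion, runnable in plain JavaScript with no WebAssembly involved. It assumes the engine applies the usual ToInt32 conversion to an i32 argument, which `| 0` reproduces here; `asI32` is a hypothetical helper name, not part of any API. Note that the array is converted to a string first. With a view over the whole memory that string is huge, which might account for part of the delay reported above, though that is a guess.

```javascript
// Hypothetical helper: mimics the ToInt32 conversion applied to an i32 argument.
function asI32(value) {
  return value | 0;
}

const a = new Uint32Array([10, 10, 10, 10, 10]);

// The array is first converted to the string "10,10,10,10,10".
// That string is not a number, so it becomes NaN and finally 0.
const asString = String(a);
const coerced = asI32(a);

console.log(asString); // 10,10,10,10,10
console.log(coerced);  // 0

// A one-element array happens to stringify to a valid number.
const single = asI32(new Uint32Array([7]));
console.log(single);   // 7
```

So the original call most likely behaved like passing 0, which is why the swap still appeared to work.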
I'm also not sure how you are calling the function through the module; you should be using the instance. I took your demo and tested it locally with a V8 build:
i32 = new Uint32Array(imports.env.memory.buffer, 0, 10); // makes the print below smaller
a = new Uint32Array([10, 10, 10, 10, 10]);
b = new Uint32Array([20, 20, 20, 20, 20]);
i32.set(a, 0);
i32.set(b, a.length);
print('add', i32);
var time = performance.now();
instance.exports._test(0, a.length, b.length);
var times = performance.now();
print('test time: ', times - time);
print('now', i32);
This typically prints ~0.005 ms on my machine, which is hitting the precision limits of performance.now, I believe.
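When a single call is this close to the timer's resolution, one common approach is to time many calls and divide. The sketch below assumes a hypothetical stand-in, `swapHalves`, written in JavaScript so that it runs on its own; in the real test you would call the exported wasm function in the loop instead.

```javascript
// Hypothetical stand-in for the exported wasm function, so the sketch is self-contained.
function swapHalves(view, count) {
  for (let i = 0; i < count; i++) {
    const tmp = view[count + i];
    view[count + i] = view[i];
    view[i] = tmp;
  }
}

const view = new Uint32Array([10, 10, 10, 10, 10, 20, 20, 20, 20, 20]);
const runs = 100000; // even count, so the array ends up in its original order

const start = performance.now();
for (let r = 0; r < runs; r++) {
  swapHalves(view, 5);
}
const elapsed = performance.now() - start;
const perCall = elapsed / runs;

console.log('total ms:', elapsed, 'ms per call:', perCall);
```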
Shameless plug here. You can use https://wasdk.github.io/WasmFiddle/ to toy around with WebAssembly and share examples.
Thank you! It works and works fast
The code binji posted doesn't work. It doesn't do anything to the array.
https://wasdk.github.io/WasmFiddle//?1cs2mc
Yeah, it looks like it doesn't work because the memory is exported instead of being imported. Maybe that changed sometime in the past year? Anyway, here's a standalone html example that works:
<!doctype html>
<body>
<pre></pre>
<script>
let pre = document.querySelector('pre');
let data = new Uint8Array([
0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, 0x01, 0x07, 0x01, 0x60,
0x03, 0x7f, 0x7f, 0x7f, 0x00, 0x03, 0x02, 0x01, 0x00, 0x04, 0x04, 0x01,
0x70, 0x00, 0x00, 0x05, 0x03, 0x01, 0x00, 0x01, 0x07, 0x11, 0x02, 0x06,
0x6d, 0x65, 0x6d, 0x6f, 0x72, 0x79, 0x02, 0x00, 0x04, 0x62, 0x65, 0x73,
0x74, 0x00, 0x00, 0x0a, 0x47, 0x01, 0x45, 0x01, 0x03, 0x7f, 0x02, 0x40,
0x20, 0x01, 0x41, 0x01, 0x48, 0x0d, 0x00, 0x20, 0x01, 0x41, 0x02, 0x74,
0x21, 0x03, 0x03, 0x40, 0x20, 0x00, 0x20, 0x03, 0x6a, 0x22, 0x04, 0x28,
0x02, 0x00, 0x21, 0x05, 0x20, 0x04, 0x20, 0x00, 0x28, 0x02, 0x00, 0x36,
0x02, 0x00, 0x20, 0x00, 0x20, 0x05, 0x36, 0x02, 0x00, 0x20, 0x00, 0x41,
0x04, 0x6a, 0x21, 0x00, 0x20, 0x01, 0x41, 0x7f, 0x6a, 0x22, 0x01, 0x0d,
0x00, 0x0b, 0x0b, 0x0b
]);
function log(...args) {
pre.textContent += [].join.call(args, ' ') + '\n';
}
let module = new WebAssembly.Module(data);
let instance = new WebAssembly.Instance(module);
let memory = instance.exports.memory;
let best = instance.exports.best;
i32 = new Uint32Array(memory.buffer, 0, 10);
a = new Uint32Array([10, 10, 10, 10, 10]);
b = new Uint32Array([20, 20, 20, 20, 20]);
i32.set(a, 0);
i32.set(b, a.length);
log('add', i32);
let time = performance.now();
best(0, a.length, b.length);
let times = performance.now();
log('test time: ', times - time);
log('now', i32);
</script>
</body>
If you do it this way you're stuck with its tiny default 64k buffer and can't use a WebAssembly.Memory you create in JavaScript. If you pass one in when you create the instance, it gets ignored.
There has to be a working code example on the Internet somewhere that shows a WebAssembly.Memory being created in JavaScript and then accessed in C. (All the documentation regarding WebAssembly.Memory is written for WAT programmers, as if any exist.) I've been searching for months for a C code sample but all the ones I can find are broken.
> If you do it this way you're stuck with its tiny default 64k buffer and can't use a WebAssembly.Memory you create in JavaScript.
You can call memory.grow to add additional pages. This will invalidate the buffer, however.
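A small sketch of what "invalidate" means in practice, using only the JavaScript API (no compiled module needed). It assumes a non-shared memory; the variable names are illustrative.

```javascript
const memory = new WebAssembly.Memory({ initial: 1 }); // 1 page = 64 KiB

const oldBuffer = memory.buffer;
const oldView = new Uint32Array(oldBuffer, 0, 4);
oldView[0] = 42;

const previousPages = memory.grow(1); // returns the size before growing, in pages

// The old ArrayBuffer is now detached: its length drops to 0
// and views created on it become empty.
const oldByteLength = oldBuffer.byteLength;
const oldViewLength = oldView.length;

// Re-read memory.buffer and build a fresh view; the contents are preserved.
const newView = new Uint32Array(memory.buffer, 0, 4);
const newByteLength = memory.buffer.byteLength;

console.log(previousPages, oldByteLength, oldViewLength, newByteLength, newView[0]);
```

The practical rule is to create typed array views after the last call to grow, or to recreate them each time the memory grows.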
> If you pass one in when you create the instance, it gets ignored.
That's true, because in the example above the memory is not imported. Here is an example where the memory is imported instead:
<!doctype html>
<body>
<pre></pre>
<script>
let pre = document.querySelector('pre');
let data = new Uint8Array([
0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, 0x01, 0x07, 0x01, 0x60,
0x03, 0x7f, 0x7f, 0x7f, 0x00, 0x02, 0x0f, 0x01, 0x03, 0x65, 0x6e, 0x76,
0x06, 0x6d, 0x65, 0x6d, 0x6f, 0x72, 0x79, 0x02, 0x00, 0x01, 0x03, 0x02,
0x01, 0x00, 0x04, 0x04, 0x01, 0x70, 0x00, 0x00, 0x07, 0x08, 0x01, 0x04,
0x62, 0x65, 0x73, 0x74, 0x00, 0x00, 0x0a, 0x47, 0x01, 0x45, 0x01, 0x03,
0x7f, 0x02, 0x40, 0x20, 0x01, 0x41, 0x01, 0x48, 0x0d, 0x00, 0x20, 0x01,
0x41, 0x02, 0x74, 0x21, 0x03, 0x03, 0x40, 0x20, 0x00, 0x20, 0x03, 0x6a,
0x22, 0x04, 0x28, 0x02, 0x00, 0x21, 0x05, 0x20, 0x04, 0x20, 0x00, 0x28,
0x02, 0x00, 0x36, 0x02, 0x00, 0x20, 0x00, 0x20, 0x05, 0x36, 0x02, 0x00,
0x20, 0x00, 0x41, 0x04, 0x6a, 0x21, 0x00, 0x20, 0x01, 0x41, 0x7f, 0x6a,
0x22, 0x01, 0x0d, 0x00, 0x0b, 0x0b, 0x0b
]);
function log(...args) {
pre.textContent += [].join.call(args, ' ') + '\n';
}
let memory = new WebAssembly.Memory({initial: 1});
let module = new WebAssembly.Module(data);
let instance = new WebAssembly.Instance(module, {env: {memory}});
let best = instance.exports.best;
i32 = new Uint32Array(memory.buffer, 0, 10);
a = new Uint32Array([10, 10, 10, 10, 10]);
b = new Uint32Array([20, 20, 20, 20, 20]);
i32.set(a, 0);
i32.set(b, a.length);
log('add', i32);
let time = performance.now();
best(0, a.length, b.length);
let times = performance.now();
log('test time: ', times - time);
log('now', i32);
</script>
</body>
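To see from the JavaScript side why a passed-in memory is ignored in the first HTML example, here is a sketch with a tiny hand-assembled module (an assumption for illustration: it only defines and exports a one-page memory and contains no functions). Because it declares no imports, the memory in the import object is never consulted.

```javascript
// Hand-assembled module (illustrative assumption): defines one memory of
// 1 page and exports it as "memory". It has no imports and no functions.
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic + version
  0x05, 0x03, 0x01, 0x00, 0x01,                   // memory section: min 1 page
  0x07, 0x0a, 0x01, 0x06, 0x6d, 0x65, 0x6d, 0x6f, // export section: "memory"
  0x72, 0x79, 0x02, 0x00
]);

const wasmModule = new WebAssembly.Module(bytes);
const importList = WebAssembly.Module.imports(wasmModule); // empty: nothing is imported

const mine = new WebAssembly.Memory({ initial: 4 });
const instance = new WebAssembly.Instance(wasmModule, { env: { memory: mine } });

// The instance uses its own memory, not the one we passed in.
const sameMemory = instance.exports.memory === mine;
const exportedBytes = instance.exports.memory.buffer.byteLength;

console.log(importList.length, sameMemory, exportedBytes);
```

Checking `WebAssembly.Module.imports` before instantiating is a quick way to tell which of the two situations a given .wasm file is in.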
If I import something from JS how do I even access it in C? (NOBODY explains how to do this.) What's in that data block up there?
I've figured out you can export from C by defining a magic constant "WASM_EXPORT" above a declaration, but I can't find it documented anywhere.
Also in the JS, the import should be { js: { mem: memory} } and not { env: { memory } }, according to the current docs. Maybe it's changed?
Check this source: https://github.com/nodeca/pica/tree/master/lib/mm_resize (a real project that allocates memory, loads an image into "wasm memory", transforms it, and writes it back). You can ask how it works at https://github.com/nodeca/pica/issues ; I think Vitaly Puzrin (puzrin) will be happy to answer ::)
> If I import something from JS how do I even access it in C? (NOBODY explains how to do this.)
Take a look at the emscripten documentation here:
https://kripken.github.io/emscripten-site/docs/porting/connecting_cpp_and_javascript/Interacting-with-code.html
> What's in that data block up there?
Sorry, should have included that. The data variable contains the binary encoding of this wasm module:
(module
(memory $env.memory (import "env" "memory") 1)
(func $best (export "best") (param $p0 i32) (param $p1 i32) (param $p2 i32)
(local $l0 i32)
(local $l1 i32)
(local $l2 i32)
(block $B0
(br_if $B0
(i32.lt_s
(get_local $p1)
(i32.const 1)))
(set_local $l0
(i32.shl
(get_local $p1)
(i32.const 2)))
(loop $L1
(set_local $l2
(i32.load
(tee_local $l1
(i32.add
(get_local $p0)
(get_local $l0)))))
(i32.store
(get_local $l1)
(i32.load
(get_local $p0)))
(i32.store
(get_local $p0)
(get_local $l2))
(set_local $p0
(i32.add
(get_local $p0)
(i32.const 4)))
(br_if $L1
(tee_local $p1
(i32.add
(get_local $p1)
(i32.const -1)))))))
(table $T0 0 anyfunc))
It's a bit hard to read, but this is essentially equivalent to the following C-like code:
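// Pseudocode only: "memory" stands for the start of linear memory (address 0).
extern void* memory;
int* mem32 = memory;
void best(int p0, int p1, int p2) {
int l0 = 0, l1 = 0, l2 = 0;
if (p1 < 1) return;
l0 = p1 << 2; // offset_a converted to bytes
do {
l1 = p0 + l0;
// p0 and l1 are byte addresses, so divide by 4 to index 32-bit words.
l2 = mem32[l1 / 4];
mem32[l1 / 4] = mem32[p0 / 4];
mem32[p0 / 4] = l2;
p0 += 4;
p1 -= 1;
} while(p1 != 0);
}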
> Also in the JS, the import should be { js: { mem: memory} } and not { env: { memory } }, according to the current docs. Maybe it's changed?
It depends on what name the WebAssembly module is expecting. You can see in this example that the module expects it to have the name "env" "memory". So the import object must look like:
{
'env': {
'memory': memory
}
}
I was using the ES2015 object shorthand syntax above to abbreviate it to {env: {memory}}
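If you are unsure which names a module expects, you can ask it. The sketch below uses a hand-assembled module (an assumption for illustration: its only content is the same "env" "memory" import as in the example, with no functions) and `WebAssembly.Module.imports` to list the expected names, then tries both import objects.

```javascript
// Hand-assembled module (illustrative assumption): its only content is an
// import of a memory named "env" "memory" with a minimum of 1 page.
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic + version
  0x02, 0x0f, 0x01,                               // import section, 1 entry
  0x03, 0x65, 0x6e, 0x76,                         // "env"
  0x06, 0x6d, 0x65, 0x6d, 0x6f, 0x72, 0x79,       // "memory"
  0x02, 0x00, 0x01                                // kind: memory, min 1 page
]);

const wasmModule = new WebAssembly.Module(bytes);
const expected = WebAssembly.Module.imports(wasmModule);
console.log(expected[0].module, expected[0].name, expected[0].kind);

const memory = new WebAssembly.Memory({ initial: 1 });

// Matching names: instantiation succeeds.
const ok = new WebAssembly.Instance(wasmModule, { env: { memory } });

// Names taken from a different example: instantiation throws.
let failed = false;
try {
  new WebAssembly.Instance(wasmModule, { js: { mem: memory } });
} catch (err) {
  failed = true;
  console.log('rejected:', err.name);
}
```

The { js: { mem } } form from the docs is not wrong; it simply belongs to a module that declares its import under those names.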
Your self-contained HTML with the data block works.
I tried replacing the data block with WASM compiled from that C code, using an extern void* pointer, and passing in {env:{memory}} as the import object. It doesn't work.
See demo
EDIT: Here's a simpler example. It tries to square each element of an array but isn't doing it.
main.js:
let pre = document.querySelector('pre');
fetch('../out/main.wasm').then(response =>
response.arrayBuffer()
).then(data => {
function log(...args) {
pre.textContent += [].join.call(args, ' ') + '\n';
}
let memory = new WebAssembly.Memory({initial: 1});
let module = new WebAssembly.Module(data);
let instance = new WebAssembly.Instance(module, {env: {memory}});
let squareEach = instance.exports.squareEach;
const i32 = new Uint32Array(memory.buffer, 0, 15);
for (let i = 0; i < 15; i++) {
i32[i] = i;
}
log('before', i32);
squareEach(15);
log('after', i32);
}).catch(console.error);
main.c:
#define WASM_EXPORT __attribute__((visibility("default")))
extern void* memory;
WASM_EXPORT
void squareEach(int lim) {
int* mem32 = memory;
for (int i = 0; i < lim; i++) {
int val = mem32[i];
mem32[i] = val * val;
}
}
The resulting WAT code:
(module
(type $t0 (func))
(type $t1 (func (param i32)))
(func $__wasm_call_ctors (type $t0))
(func $squareEach (export "squareEach") (type $t1) (param $p0 i32)
(local $l0 i32) (local $l1 i32)
block $B0
get_local $p0
i32.const 1
i32.lt_s
br_if $B0
i32.const 0
i32.load
set_local $l0
loop $L1
get_local $l0
get_local $l0
i32.load
tee_local $l1
get_local $l1
i32.mul
i32.store
get_local $l0
i32.const 4
i32.add
set_local $l0
get_local $p0
i32.const -1
i32.add
tee_local $p0
br_if $L1
end
end)
(table $T0 1 1 anyfunc)
(memory $memory (export "memory") 2)
(global $g0 (mut i32) (i32.const 66560))
(global $__heap_base (export "__heap_base") i32 (i32.const 66560))
(global $__data_end (export "__data_end") i32 (i32.const 1024)))
This prints
before 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14
after 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14
I also tried an example on WasmFiddle.
WasmFiddle does include (import "env" "memory" (global $memory i32)) in the WAT. But the JS crashes when you instantiate with { env: { memory } } as the import object:
"Uncaught LinkError: WebAssembly Instantiation: Import #0 module="env" function="memory" error: global import must be a number or WebAssembly.Global object"
The WAT it generates:
(module
(import "env" "memory" (global $memory i32))
(table 0 anyfunc)
(memory $0 1)
(export "memory" (memory $0))
(export "squareEach" (func $squareEach))
(func $squareEach (; 0 ;) (param $0 i32)
(local $1 i32)
(local $2 i32)
(block $label$0
(br_if $label$0
(i32.lt_s
(get_local $0)
(i32.const 1)
)
)
(set_local $2
(i32.load
(get_global $memory)
)
)
(loop $label$1
(i32.store
(get_local $2)
(i32.mul
(tee_local $1
(i32.load
(get_local $2)
)
)
(get_local $1)
)
)
(set_local $2
(i32.add
(get_local $2)
(i32.const 4)
)
)
(br_if $label$1
(tee_local $0
(i32.add
(get_local $0)
(i32.const -1)
)
)
)
)
)
)
)
Sorry, the C-like code I used above was meant to be a way to better understand what the wasm code was doing, but I think it just made everything more confusing.
Thank you for writing a simpler example! For both WebAssembly Studio and WasmFiddle, you can fix it by using the exported memory instead of the imported memory. The WebAssembly module itself dictates how it wants the memory, so you can't pass in memory if the module is already creating it itself.
Here's what the corrected C code looks like:
void squareEach(int* p, int lim) {
for (int i = 0; i < lim; i++) {
int val = p[i];
p[i] = val * val;
}
}
We pass in a pointer to the data we want to square, and the number of elements to square.
Then in JavaScript, we call squareEach and pass the address of the array. Since it is at the start of memory, we can pass 0:
squareEach(0, 15);
But let's say we only want to square the elements at indexes 5, 6, and 7. In that case, we can pass a pointer to the element at index 5 and a length of 3. Each element is 4 bytes, so the pointer is 5 * 4 = 20. You can also use the TypedArray's BYTES_PER_ELEMENT property to make it clearer:
squareEach(5 * i32.BYTES_PER_ELEMENT, 3);
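The pointer arithmetic can be checked from JavaScript alone. In this sketch `squareEachJs` is a hypothetical stand-in that mimics what the compiled squareEach does with a byte address and an element count, so it runs without a wasm file.

```javascript
const memory = new WebAssembly.Memory({ initial: 1 });
const i32 = new Uint32Array(memory.buffer, 0, 15);
for (let i = 0; i < 15; i++) {
  i32[i] = i;
}

// Hypothetical stand-in for the compiled squareEach(p, lim):
// p is a byte address into linear memory, lim is an element count.
function squareEachJs(p, lim) {
  const view = new Uint32Array(memory.buffer, p, lim);
  for (let i = 0; i < lim; i++) {
    view[i] = view[i] * view[i];
  }
}

const byteOffset = 5 * i32.BYTES_PER_ELEMENT; // 5 elements * 4 bytes = 20
squareEachJs(byteOffset, 3);

const result = Array.from(i32).join(',');
console.log(result); // 0,1,2,3,4,25,36,49,8,9,10,11,12,13,14
```

Only the elements at indexes 5, 6, and 7 change, because the pointer is a byte offset while the length is counted in elements.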
Here's the fixed example in WebAssembly Studio: https://webassembly.studio/?f=zu4h3i4olmo
And here's the fixed example in WasmFiddle: https://wasdk.github.io/WasmFiddle//?mu4bk
So this is exporting the memory. I was trying to figure out how to create a new WebAssembly.Memory and use that. Is it possible to get past the 64k limit by calling grow() on the exported memory before accessing its buffer?
Yes, here's an example that does that: https://wasdk.github.io/WasmFiddle//?b60v4
let module = new WebAssembly.Module(wasmCode);
let instance = new WebAssembly.Instance(module);
let squareEach = instance.exports.squareEach;
let memory = instance.exports.memory;
function logArray(...args) {
log([].join.call(args, ' ') + '\n');
}
log(memory.buffer.byteLength); // 1 page.
memory.grow(10);
log(memory.buffer.byteLength); // 11 pages.
const p = 9 * 65536; // Start at the 9th page.
const lim = 20;
const pArray = new Uint32Array(memory.buffer, p, lim);
for (let i = 0; i < lim; ++i) {
pArray[i] = i;
}
logArray('before', pArray);
squareEach(p, lim);
logArray('after', pArray);
Output:
720896
before 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19
after 0,1,4,9,16,25,36,49,64,81,100,121,144,169,196,225,256,289,324,361
EDIT: simplified the code
Weird... I'm trying this technique in the main project I'm working on, and it works perfectly for small arrays. As soon as the ArrayBuffer exceeds even just several hundred bytes, it goes dead. Still looking into it.
EDIT: Figured it out... I was using a byte offset that was too low, so my array began stomping on my stack variables once it exceeded 1024 bytes. That was driving me nuts.
We all need a step-by-step explanation of how memory works, and of when using it is slow and when it is fast. Unfortunately, I can only poke around and learn how everything works by trial and error, which is bad. Maybe someone will write an in-depth article covering these details. In my opinion, problems like this, where you can't understand how it fucking works inside, just shouldn't exist.
This is a petition, like it if you are for it. Taaadaaaaam ::)
P.S. Or just I'm stupid LOL :)
Don't do what I did and allocate an array on top of your stack variables. I followed binji's advice and passed in a pointer, but I left the offset at zero and it managed to fail by overwriting its own loop counter when zeroing it out (past one kilobyte). That drove me nuts for about an hour. I moved the array 64k higher and cleared the stack, but in general how do you know what the forbidden regions in memory are? I can't figure out what the WebAssembly.Memory constructor is even useful for at this point, if you can't import memory.
Part of the problem with WebAssembly documentation is that each high level language comes with a big ugly JS glue code file, and that's all anyone wants to talk about. The documentation specific to C talks about how to use the C glue code. The docs for writing WebAssembly in Go tell you about Go's glue code, etc.
If you're trying to avoid Emscripten glue, and learn how to deal with the native interface exposed to any specific high level language, there is very little documentation out there at all. Without glue, the discussion immediately changes to WAT, which is like JVM bytecode. Nobody writes JVM bytecode tutorials, only Java tutorials. But WebAssembly's whole schtick is being language-agnostic, so there is no one "Java" here- and everyone is afraid to pick any one specific high level language when they write a tutorial. So it's really hard to get started, because each high level language has its own way of dealing with the native WebAssembly API, and the only language that's properly documented is WAT. Which fills up my screen and hurts my eyes.
The JS API is pretty well documented, but everyone leaves you in the dark as to how you do things on the other side- except for some code samples in WAT. OK, how do I write a C program that compiles to these WAT instructions, just so I can get a damn pointer into my C? I couldn't figure this out for a month until binji pointed out you can pass the memory.buffer JS reference and actually get a valid address. (I was thinking it would just be a pointer to some goofball object in V8's internals.)
> Don't do what I did and allocate an array on top of your stack variables. I followed binji's advice and passed in a pointer, but I left the offset at zero and it managed to fail by overwriting its own loop counter when zeroing it out (past one kilobyte).
Right, the example above is simple enough that it doesn't use any stack, but with more complicated examples you need to follow the constraints of the source language.
> in general how do you know what the forbidden regions in memory are?
This is what the glue code provides you: it initializes the memory and gives you functions for allocating and deallocating memory regions. With emscripten, it's expected that you use _malloc and _free. See some of the examples in the preamble.js documentation.
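Without glue code, one hint is the `__heap_base` global that the clang-built module in the WAT listing earlier in this thread exports. The sketch below assumes the toolchain keeps its static data and stack below that address (the default layout with wasm-ld, as far as I know). It uses a hand-assembled stand-in module that exports only that global, with the value 66560 taken from the listing; `scratchStart` is a hypothetical name.

```javascript
// Hand-assembled stand-in (illustrative assumption): a module that only
// exports an immutable i32 global "__heap_base" with the value 66560.
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic + version
  0x06, 0x08, 0x01, 0x7f, 0x00,                   // global section: one const i32
  0x41, 0x80, 0x88, 0x04, 0x0b,                   // i32.const 66560, end
  0x07, 0x0f, 0x01, 0x0b,                         // export section, name length 11
  0x5f, 0x5f, 0x68, 0x65, 0x61, 0x70, 0x5f, 0x62, 0x61, 0x73, 0x65, // "__heap_base"
  0x03, 0x00                                      // kind: global, index 0
]);

const instance = new WebAssembly.Instance(new WebAssembly.Module(bytes));

// Newer engines export a WebAssembly.Global; older ones exported a plain number.
const exported = instance.exports.__heap_base;
const heapBase = typeof exported === 'number' ? exported : exported.value;

// Round up to a multiple of 8 so any typed array view placed there is aligned.
const scratchStart = (heapBase + 7) & ~7;

console.log(heapBase, scratchStart);
```

With a real module you would place your arrays at or above `scratchStart` and grow the memory if that address lies beyond the current buffer. This is only a convention of the toolchain, so check the linker's documentation before relying on it.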
> I can't figure out what the WebAssembly.Memory constructor is even useful for at this point, if you can't import memory.
I believe it's an option to emscripten whether to import or export the memory, though I'm not sure of the specifics. There's an example here of using -s SIDE_MODULE=1 to generate a standalone WebAssembly module that imports its memory: https://gist.github.com/kripken/59c67556dc03bb6d57052fedef1e61ab
But generally it's up to the source language how to use memory.
> The JS API is pretty well documented, but everyone leaves you in the dark as to how you do things on the other side- except for some code samples in WAT. OK, how do I write a C program that compiles to these WAT instructions, just so I can get a damn pointer into my C?
You may want to take a look at some of the other languages that compile to WebAssembly like AssemblyScript and Walt. They are generally less opinionated about how they generate WebAssembly modules, but provide languages that are more terse than a wat file.
If you want to use C without as much emscripten glue, you can do so, but it's going to be more work. You can look at the SIDE_MODULE example above, or you can try experimental projects like wasmception.
TurboScript used to compile TypeScript to asm.js but now they're rewriting it to generate WASM, and speedy.js is also compiling TypeScript. So counting AssemblyScript, that's three TypeScript -> WASM language projects going on right now. I wonder what's going to happen with that. Walt also looks a lot like TypeScript. There's also Grain which is completely different but looks like someone is still working on it.
As a former maintainer of TurboScript and a current maintainer of AssemblyScript, I would like to clarify that TurboScript and speedy.js are not maintained anymore. Walt is closer to Flow than to TypeScript.