WebAssembly allows growing memory in steps of 64 KiB (one page). However, it is not obvious whether growing in small steps is advisable: if growing memory is an expensive operation (i.e. it involves copying contents around), you might want to perform as few growth operations as possible. On the other hand, if growing is cheap, you might want to avoid reserving more memory than you currently need.
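For reference, a minimal sketch of what page-wise growth looks like through the standard JS API. The initial and maximum sizes here are arbitrary values chosen for illustration.

```javascript
// Page size fixed by the WebAssembly specification: 64 KiB.
const PAGE_SIZE = 64 * 1024;

// Start with one page and allow growth up to four pages (arbitrary example values).
const memory = new WebAssembly.Memory({ initial: 1, maximum: 4 });
const sizeBefore = memory.buffer.byteLength;

// grow() takes a delta in pages and returns the previous size in pages.
// The old ArrayBuffer is detached, so memory.buffer must be read again afterwards.
const previousPages = memory.grow(2);
const sizeAfter = memory.buffer.byteLength;

console.log(previousPages, sizeBefore / PAGE_SIZE, sizeAfter / PAGE_SIZE); // 1 1 3
```

Nothing in this API indicates whether the grow() call moved the underlying data.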
Currently, it's impossible to tell which scenario you are in. For example, the SpiderMonkey implementation will call ArrayBufferObject::wasmGrowToSizeInPlace() (no copying involved) in x64 builds or when a maximum size is set; otherwise it will use ArrayBufferObject::wasmMovingGrowToSize(), which copies the contents. The V8 implementation appears to copy memory contents around unconditionally.
IMHO, this implementation detail should be exposed so that the code can choose the best strategy dynamically. I guess that this means adding a method like Memory.prototype.willGrowCopyData(pages) - this should tell whether a Memory.prototype.grow() call with the same parameter will be expensive. Both SpiderMonkey and V8 implementations can currently ignore the parameter but other implementations might use a non-copying approach up to a certain memory amount and only perform a copy once that amount is exceeded.
How does the API you propose help? AFAICS, it will only allow you to find out that the next grow is gonna be expensive when you get there, that is, _after_ you have already picked a strategy.
Nope, you can make a decision based on the results. One possible approach would be: call memory.willGrowCopyData(1) and if it returns false actually grow the memory by one page. Otherwise double the amount of memory so that this expensive operation doesn't need to be repeated later.
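The approach described here could be sketched roughly as follows. Note that willGrowCopyData() is only the method proposed in this thread and does not exist in any engine; the helper name growAdaptive and the fallback of treating growth as expensive when the method is missing are assumptions made for illustration.

```javascript
const PAGE_SIZE = 64 * 1024;

// Sketch only: willGrowCopyData() is the method PROPOSED in this thread.
// It is not part of the WebAssembly JS API. growAdaptive is a hypothetical helper.
function growAdaptive(memory) {
  const currentPages = memory.buffer.byteLength / PAGE_SIZE;
  const copies = typeof memory.willGrowCopyData === "function"
    ? memory.willGrowCopyData(1)
    : true; // Assumption: without the query, treat growth as expensive.
  // Cheap growth: take a single page. Expensive growth: double the memory.
  const delta = copies ? Math.max(currentPages, 1) : 1;
  memory.grow(delta);
  return delta;
}

// Today's engines lack the method, so this falls back to doubling.
const memory = new WebAssembly.Memory({ initial: 1 });
growAdaptive(memory);
console.log(memory.buffer.byteLength / PAGE_SIZE); // 2
```

A real allocator would also have to handle grow() throwing a RangeError when the maximum is exceeded, which this sketch leaves out.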
What if the next small grow turns out to be cheap but causes the ones after to be much more expensive than with a larger grow factor? Inquiries about individual next steps do not enable you to find the best strategy overall. Also, what does cheap vs expensive mean? You are assuming a dichotomy when there might be a spectrum.
There can be various contributing factors here of course, depending on how this memory area is managed. But IMHO by far the biggest one is always going to be copying data from one memory block to another. And the characteristics seem pretty simple for that one: a copy is either required or it isn't. Also, growing without a copy now will of course require copying more memory later on when the copy is actually required, but that's at most a linear factor. Where could a superlinear cost possibly come from?
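One way to put numbers on this exchange is a toy cost model. The sketch below assumes the worst case mentioned earlier in the thread, namely that every grow copies all existing pages, and simply counts pages copied. The function name and the two policies are illustrative only; this is not a measurement of any engine.

```javascript
// Toy model: count pages copied while growing from 1 page to targetPages,
// ASSUMING every grow copies the entire existing memory.
// nextDelta(pages) is a hypothetical policy returning how many pages to add.
function pagesCopied(targetPages, nextDelta) {
  let pages = 1;
  let copied = 0;
  while (pages < targetPages) {
    copied += pages; // the assumed copy of everything allocated so far
    pages += nextDelta(pages);
  }
  return copied;
}

const byOnePage = pagesCopied(1024, () => 1);      // grow one page at a time
const byDoubling = pagesCopied(1024, (p) => p);    // double on every grow

console.log(byOnePage, byDoubling); // 523776 1023
```

Under this assumption, growing page by page copies a quadratic number of pages in total, while doubling stays linear in the final size. If grows never copy, both policies cost nothing in this model, which is why knowing which case applies matters to the choice.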
I think it would be useful to discuss the strategies around virtual memory reservation. These are the cases which can occur:
In 1. and 2. it's always fast to grow memory. This seems like a fact you'd like to know.
Then there's case 3, which is the one that mostly seems to be discussed here. Here there's a gray area:
i. memcpy old memory over, free the previous one.
ii. mremap or similar API which doesn't require copying, only reshuffling of the page table.
Here, 1. is in the same cost bucket as the previous two cases, whereas 2.i. and 2.ii. have a cost. These costs are different though! Further, as @rossberg-chromium points out, these costs aren't paid on each allocation and a smart engine would try to amortize them (or assume the in-wasm allocator is smart about not calling grow memory too much).
Engines are fairly new and haven't been tuned to adapt to real-world use cases yet, and neither have the in-wasm allocators. Given this, I agree there's probably use in knowing this information but I'm not sure we want to expose it for now. It could be useful, but I think we can improve on the stated problem without exposing this yet.
Of course, an engine could also be smart about it and merely reserve the requested amount of memory virtually, turning it into an actual allocation only page by page as pages are accessed. This would solve the problem at least for Emscripten without requiring any changes to the algorithm. I'm not sure how realistic that is, however.
That's another option which I didn't mention. I think it makes your request worse because it means that, at runtime, memory you thought you had can be found to not actually exist. Any memory access could cause the instance to die. It's not that hard to implement, at least for us, but I'm not sure it's expected behavior from a WebAssembly user's point of view.
I'll close for now since I don't think we want to do this at this time. Please comment with new data if we need to revisit, happy to reopen if that's the case.
> because it means that, at runtime, memory you thought you had can be found to not actually exist. Any memory access could cause the instance to die.
If I understand https://blogs.msdn.microsoft.com/oldnewthing/20170512-00/?p=96146 correctly, you are in that scenario no matter what. At least Windows won't physically reserve memory you allocate until you touch it. So if memory is really exhausted (that's physical memory + swap) then any memory access could indeed cause a crash. I suspect that other operating systems will behave similarly. Of course, if your system is in that kind of state, a random application crash is probably the least of your worries.