Emscripten: segmentation fault: wasm module tries to alloc memory from the stack

Created on 12 Jul 2020 · 14 Comments · Source: emscripten-core/emscripten

During the last 2 days, I've been trying to fix a segmentation fault error and now with -s SAFE_HEAP_LOG=1 I think emscripten consistently identifies the problem.

After successfully porting a few Unix tools, I'm now working to port the xz compressor (https://tukaani.org/xz/) to wasm in the browser. However, its wasm version only works on the command line ("node xz.js"), not in the browser.

In the browser, the memory profiler tells me that my total heap size is 16MB, the STATIC memory area size is 66.5kB and the STACK memory area size is 5MB. So, one way or another, the wasm module insists on trying to allocate below the start of the heap area.

This is what the logs look like before I call it from the JS console:

SAFE_HEAP store: 5312143,97,1,,104
SAFE_HEAP store: 5312144,109,1,,105
SAFE_HEAP store: 5312145,0,1,,106
SAFE_HEAP store: 5311752,2,1,,107
SAFE_HEAP store: 5311780,0,4,,108

But when I run Module.callMain([ "--help" ]) the first time:

SAFE_HEAP store: 66132,63,4,,109
segmentation fault

The xz source code doesn't seem to contain a hardcoded memory allocation address, so I don't understand why it is trying to allocate at such a low address.

And when I repeat it:

SAFE_HEAP store: 5311480,2,1,,110
SAFE_HEAP store: 5311508,0,4,,111
segmentation fault
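
The store addresses in these logs can be checked against the memory regions reported by the profiler. The following is only a rough sketch: it assumes the comma-separated fields of a `SAFE_HEAP store:` line are address, value, size in bytes, float flag and counter, and that memory is laid out as static data, then stack, then heap. The helper names and the boundary values (derived loosely from the 66.5kB and 5MB figures above) are illustrative, not exact.

```javascript
// Parse one SAFE_HEAP_LOG store line such as "SAFE_HEAP store: 66132,63,4,,109".
// Assumed field order: address, value, size in bytes, float flag, counter.
function parseSafeHeapStore(line) {
  const match = /^SAFE_HEAP store: (.*)$/.exec(line.trim());
  if (!match) return null; // not a SAFE_HEAP store line
  const [address, value, bytes, isFloat, counter] = match[1].split(',');
  return {
    address: Number(address),
    value: Number(value),
    bytes: Number(bytes),
    // An empty field is treated as "not a float".
    isFloat: isFloat !== '' && isFloat !== '0' && isFloat !== 'false',
    counter: Number(counter),
  };
}

// Classify an address, assuming the layout is static data, then stack, then heap.
function classifyAddress(address, layout) {
  if (address < layout.staticEnd) return 'static';
  if (address < layout.stackEnd) return 'stack';
  return 'heap';
}

// Illustrative boundaries only: 66.5kB of static data followed by a 5MB stack.
const layout = { staticEnd: 68096, stackEnd: 68096 + 5 * 1024 * 1024 };
const entry = parseSafeHeapStore('SAFE_HEAP store: 66132,63,4,,109');
console.log(entry.address, classifyAddress(entry.address, layout)); // prints "66132 static"
```

Under these assumed boundaries, the failing store at 66132 falls inside the static data area rather than the heap.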

When I run the equivalent command on Node (node xz.js --help), it works nicely:

$ node xz.js --help
SAFE_HEAP store: 5311996,7,4,,0
SAFE_HEAP store: 5311992,136,4,,1
SAFE_HEAP store: 5312008,5312048,4,,2
SAFE_HEAP store: 5312048,85,1,,3
.... etc ...
Usage: <path>/xz.js [OPTION]... [FILE]...
Compress or decompress FILEs in the .xz format.
... etc ...

At this point, I have tried lots of combinations of the Emscripten settings (within reason) but nothing seems to work. I tried to reduce the stack to the minimum size (1kB), but the heap still starts above 66,000 bytes. Finally, I also tried to run it in both Chrome and Safari, and the results are the same.

Your guidance is welcome. I tried to share only relevant details that describe the issue, but if you need more details, just ask.

All 14 comments

FYI if, instead of the SAFE_HEAP options, I compile with -fsanitize=address and run Module.callMain([ "--help" ]) this is what I see on the browser's console:

memory resize: 60424192 72548352        xz.js:1325 
exception thrown: RuntimeError: memory access out of bounds,RuntimeError: memory access out of bounds         xz.html:1246      
    at __mo_lookup (http://localhost:12345/xz-5.2.4/src/xz/xz.wasm:wasm-function[578]:0x272c7)
    at dcngettext (http://localhost:12345/xz-5.2.4/src/xz/xz.wasm:wasm-function[552]:0x260f6)
    at dgettext (http://localhost:12345/xz-5.2.4/src/xz/xz.wasm:wasm-function[555]:0x26527)
    at gettext (http://localhost:12345/xz-5.2.4/src/xz/xz.wasm:wasm-function[529]:0x24b33)
    at message_help (http://localhost:12345/xz-5.2.4/src/xz/xz.wasm:wasm-function[164]:0x97ed)
    at parse_real (http://localhost:12345/xz-5.2.4/src/xz/xz.wasm:wasm-function[67]:0x2134)
    at args_parse (http://localhost:12345/xz-5.2.4/src/xz/xz.wasm:wasm-function[65]:0x1908)
    at main (http://localhost:12345/xz-5.2.4/src/xz/xz.wasm:wasm-function[127]:0x6a6e)
    at Module._main (http://localhost:12345/xz-5.2.4/src/xz/xz.js:6178:59)
    at Object.callMain (http://localhost:12345/xz-5.2.4/src/xz/xz.js:6512:13)
  1. It's odd you get a memory access out of bounds when using ASan - I'd expect a full ASan error report. Did you both compile and link with -fsanitize=address? (it's not enough to do it only during link)
  2. If this works in node but not in the browser, then seeing where things diverge could help narrow this down quickly, since it should execute the same things in both cases (unless you are using NODERAWFS and accessing local files or something like that). That is, each allocation should be the same, etc. - everything should be 100% deterministic and not depend on where the code is running (if it uses random somehow, you can use -s DETERMINISTIC). Manually tracing execution is one option, another is the autodebugger, https://emscripten.org/docs/porting/Debugging.html#debugging-autodebugger
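
One way to see where the two runs diverge is to capture the log from each environment and compare them line by line. This is a minimal sketch with a hypothetical helper, assuming both logs have already been collected as arrays of strings (for example the SAFE_HEAP or autodebugger output); the sample input is made up.

```javascript
// Hypothetical helper: returns the index of the first line at which the two
// logs differ, or -1 if they are identical.
function firstDivergence(nodeLog, browserLog) {
  const shared = Math.min(nodeLog.length, browserLog.length);
  for (let i = 0; i < shared; i++) {
    if (nodeLog[i] !== browserLog[i]) return i;
  }
  // One log is a prefix of the other: they diverge where the shorter one ends.
  return nodeLog.length === browserLog.length ? -1 : shared;
}

// Made-up sample input, only to show the call.
const nodeLog = ['a', 'b', 'c'];
const browserLog = ['a', 'x', 'c'];
console.log(firstDivergence(nodeLog, browserLog)); // prints 1
```

The line at the returned index is the first place to look, since everything before it executed identically in both environments.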

Hey, Alon, let me first tell you that Emscripten is an awesome tool. I'm learning a lot through it.
I didn't use NODERAWFS but it shouldn't be a problem as calling xz (through callMain) with the --help option should just print a list of available options without any disk access. This even makes the issue weirder, in the sense that at this point it should just print a preprogrammed message.
The only difference when I target node is in using -s INVOKE_RUN=0 or 1.

After applying your suggestions (-s DETERMINISTIC -fsanitize=address), even on the required libraries, indeed I get a much more detailed output.
On the console:

> exception thrown: Error: Out of bounds memory access (evaluating 'asm[name].apply(null, arguments)'),<?>.wasm-function[__mo_lookup]@[wasm code]
<?>.wasm-function[dcngettext]@[wasm code]
<?>.wasm-function[dgettext]@[wasm code]
<?>.wasm-function[gettext]@[wasm code]
<?>.wasm-function[message_help]@[wasm code]
<?>.wasm-function[parse_real]@[wasm code]
<?>.wasm-function[args_parse]@[wasm code]
<?>.wasm-function[main]@[wasm code]
wasm-stub@[wasm code]
main@[native code]
http://localhost:12345/xz-5.2.4/src/xz/xz.js:1914:25
callMain@http://localhost:12345/xz-5.2.4/src/xz/xz.js:7291:26
global code
evaluateWithScopeExtension@[native code]

_wrapCall

> Error: Out of bounds memory access (evaluating 'asm[name].apply(null, arguments)')
<?>.wasm-function[__mo_lookup]
<?>.wasm-function[dcngettext]
<?>.wasm-function[dgettext]
<?>.wasm-function[gettext]
<?>.wasm-function[message_help]
<?>.wasm-function[parse_real]
<?>.wasm-function[args_parse]
<?>.wasm-function[main]
wasm-stub
main
(anonymous function) — xz.js:1914
callMain — xz.js:7291
Global code
evaluateWithScopeExtension
(anonymous function)
_wrapCall

And on the memory profiler (there is a lot more; I'm only pasting the last output):

63098880-65212416 (1 times, 2.02 MB):wasm-stub@[wasm code]
.wasm-function[sbrk]@[wasm code]
.wasm-function[dlmalloc]@[wasm code]
.wasm-function[internal_memalign]@[wasm code]
.wasm-function[dlmemalign]@[wasm code]
wasm-stub@[native code]
http://localhost:12345/xz-5.2.4/src/xz/xz.js:1914:25
[email protected]/src/xz/xz.js:5553:18
http://localhost:12345/xz-5.2.4/src/xz/xz.js:5578:22
[email protected]/src/xz/xz.js:5537:14
[email protected]/src/xz/xz.js:5577:40
.wasm-function[dlmemalign]@[wasm code]
.wasm-function[__sanitizer::internal_mmap(void*, unsigned long, int, int, int, unsigned long)]@[wasm code]
.wasm-function[__sanitizer::MmapOrDieOnFatalError(unsigned long, char const*)]@[wasm code]
.wasm-function[__sanitizer::MmapAlignedOrDieOnFatalError(unsigned long, unsigned long, char const*)]@[wasm code]
.wasm-function[__sanitizer::SizeClassAllocator32<__asan::AP32<__sanitizer::LocalAddressSpaceView> >::AllocateRegion(__sanitizer::AllocatorStats*, unsigned long)]@[wasm code]
.wasm-function[__sanitizer::SizeClassAllocator32<__asan::AP32<__sanitizer::LocalAddressSpaceView> >::PopulateFreeList(__sanitizer::AllocatorStats*, __sanitizer::SizeClassAllocator32LocalCache<__sanitizer::SizeClassAllocator32<__asan::AP32<__sanitizer::LocalAddressSpaceView> > >*, __sanitizer::SizeClassAllocator32<__asan::AP32<__sanitizer::LocalAddressSpaceView> >::SizeClassInfo*, unsigned long)]@[wasm code]
.wasm-function[__sanitizer::SizeClassAllocator32<__asan::AP32<__sanitizer::LocalAddressSpaceView> >::AllocateBatch(__sanitizer::AllocatorStats*, __sanitizer::SizeClassAllocator32LocalCache<__sanitizer::SizeClassAllocator32<__asan::AP32<__sanitizer::LocalAddressSpaceView> > >*, unsigned long)]@[wasm code]
.wasm-function[__sanitizer::SizeClassAllocator32LocalCache<__sanitizer::SizeClassAllocator32<__asan::AP32<__sanitizer::LocalAddressSpaceView> > >::Refill(__sanitizer::SizeClassAllocator32LocalCache<__sanitizer::SizeClassAllocator32<__asan::AP32<__sanitizer::LocalAddressSpaceView> > >::PerClass*, __sanitizer::SizeClassAllocator32<__asan::AP32<__sanitizer::LocalAddressSpaceView> >*, unsigned long)]@[wasm code]
.wasm-function[__sanitizer::SizeClassAllocator32LocalCache<__sanitizer::SizeClassAllocator32<__asan::AP32<__sanitizer::LocalAddressSpaceView> > >::Allocate(__sanitizer::SizeClassAllocator32<__asan::AP32<__sanitizer::LocalAddressSpaceView> >*, unsigned long)]@[wasm code]
.wasm-function[__sanitizer::CombinedAllocator<__sanitizer::SizeClassAllocator32<__asan::AP32<__sanitizer::LocalAddressSpaceView> >, __sanitizer::SizeClassAllocatorLocalCache<__sanitizer::SizeClassAllocator32<__asan::AP32<__sanitizer::LocalAddressSpaceView> > >, __sanitizer::LargeMmapAllocator<__asan::AsanMapUnmapCallback, __sanitizer::LargeMmapAllocatorPtrArrayStatic, __sanitizer::LocalAddressSpaceView>, __sanitizer::LocalAddressSpaceView>::Allocate(__sanitizer::SizeClassAllocatorLocalCache<__sanitizer::SizeClassAllocator32<__asan::AP32<__sanitizer::LocalAddressSpaceView> > >*, unsigned long, unsigned long)]@[wasm code]
.wasm-function[__asan::Allocator::Allocate(unsigned long, unsigned long, __sanitizer::BufferedStackTrace*, __asan::AllocType, bool)]@[wasm code]
.wasm-function[__asan::asan_malloc(unsigned long, __sanitizer::BufferedStackTrace*)]@[wasm code]
.wasm-function[malloc]@[wasm code]
.wasm-function[dcngettext]@[wasm code]
.wasm-function[dgettext]@[wasm code]
.wasm-function[gettext]@[wasm code]
.wasm-function[message_help]@[wasm code]
.wasm-function[parse_real]@[wasm code]
.wasm-function[args_parse]@[wasm code]
.wasm-function[main]@[wasm code]
wasm-stub@[wasm code]
main@[native code]
http://localhost:12345/xz-5.2.4/src/xz/xz.js:1914:25
callMain@http://localhost:12345/xz-5.2.4/src/xz/xz.js:7291:26
global code
evaluateWithScopeExtension@[native code]

_wrapCall

The only difference when I target node is in using -s INVOKE_RUN=0 or 1.

Why is that difference there? It should be possible to get rid of it, that is, to use the same setting in both cases.

Btw, if that setting is relevant, then it's possible this is a timing issue - maybe you are calling run yourself manually in the one case and doing so too early, before the runtime is ready. -s ASSERTIONS=1 should check that for you. And make sure to call compiled code after startup is finished (e.g. using onRuntimeInitialized).
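
A minimal sketch of that startup order, assuming the default non-modularized output (a global `Module` object) and that `callMain` has been exported at link time. `noInitialRun` is the runtime counterpart of `-s INVOKE_RUN=0`; the `--help` argument is the one used in this thread.

```javascript
// Must be defined before xz.js is loaded (for example via --pre-js or an
// earlier <script> tag) so that the generated runtime picks it up.
var Module = {
  // Runtime counterpart of -s INVOKE_RUN=0: do not run main() automatically.
  noInitialRun: true,

  // Called by the Emscripten runtime once the wasm module is ready.
  onRuntimeInitialized: function () {
    // callMain is only available if it was exported at link time.
    Module.callMain(['--help']);
  },
};
```

Because the same object works in Node and in the browser, this removes the need for different INVOKE_RUN settings between the two.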

Why is that difference there?

Oops, right, I didn't make the following clear: in order to troubleshoot this issue and avoid introducing others, when I opened this issue I was no longer using my own HTML and JS but the ones generated by Emscripten.
The last stage of compilation is emcc ... -o xz.html and I run all my tests in the browser's js console after opening xz.html.
So the difference is that I want to prevent the automatic execution of callMain in the browser, because it is called without any parameters, and I want to pass [ "--help" ] as an argument. On the other hand, when using node, if I add -s INVOKE_RUN=0 then there is no output.
Anyway, after adding -s INVOKE_RUN=1 to the previous options, the result is the same: with Node.js it works fine, while in the browser I get a memory allocation error.

When I enable -s ASSERTIONS=1 the only difference seems to be this warning becomes an error:

Calling stub instead of sigaction()

By the looks of it, it seems I will have to practice with the Autodebugger today. :)

On the other hand, when using node, if I add -s INVOKE_RUN=0 then there is no output.

You can add some JS after the main JS (using --extern-post-js for example) that does what you would do in the browser. It should be possible to get them practically identical I think.

Btw, I just opened https://github.com/emscripten-core/emscripten/pull/11629 which fixes a DETERMINISTIC mode bug, might be good to use the autodebugger with that (see docs there).

And good luck! :rocket:

Let me leave here a comment to all Internet users... The Internet is a big place and I'm sure someday someone will also try to port XZ compressor to the browser and will find this issue.
If you are that person, let me share here all the commands I ran to arrive at this stage (running on Node but failing to allocate memory in the browser), in case this makes your life easier. Please contact me as I'd love to collaborate with you on a resolution.
Emsdk 1.39.19 is installed and I'm trying to port xz 5.2.5 (currently the latest, download it from https://tukaani.org/xz/).

~$ tar -xvf xz-5.2.5.tar.gz
~$ cd xz-5.2.5
~/xz-5.2.5$ emconfigure ./configure --enable-threads=no
~/xz-5.2.5$ cd src/liblzma/
~/xz-5.2.5/src/liblzma$ emmake make clean
~/xz-5.2.5/src/liblzma$ emmake make CFLAGS="-g -Oz -s EXPORT_ALL=1 -fPIC -s USE_PTHREADS=0 -s DETERMINISTIC=1 -fsanitize=address"
~/xz-5.2.5/src/liblzma$ cd ../xz
~/xz-5.2.5/src/xz$ emmake make clean

~/xz-5.2.5/src/xz$ emmake make CFLAGS="-g -Oz -s LLD_REPORT_UNDEFINED -s ERROR_ON_UNDEFINED_SYMBOLS=0 -fPIC -s DETERMINISTIC=1 -fsanitize=address -s USE_PTHREADS=0"

~/xz-5.2.5/src/xz$ emcc -fvisibility=hidden -Wall -Wextra -Wvla -Wformat=2 -Winit-self -Wmissing-include-dirs -Wstrict-aliasing -Wfloat-equal -Wundef -Wshadow -Wpointer-arith -Wbad-function-cast -Wwrite-strings -Waggregate-return -Wstrict-prototypes -Wold-style-definition -Wmissing-prototypes -Wmissing-declarations -Wmissing-noreturn -Wredundant-decls -g -Oz -s LLD_REPORT_UNDEFINED -s ERROR_ON_UNDEFINED_SYMBOLS=0 xz-args.o xz-coder.o xz-file_io.o xz-hardware.o xz-main.o xz-message.o xz-mytime.o xz-options.o xz-signals.o xz-suffix.o xz-util.o xz-tuklib_open_stdxxx.o xz-tuklib_progname.o xz-tuklib_exit.o xz-tuklib_mbstr_width.o xz-tuklib_mbstr_fw.o xz-list.o ../../src/liblzma/.libs/liblzma.dylib -s EXIT_RUNTIME=0 -s FORCE_FILESYSTEM=1 -s EXTRA_EXPORTED_RUNTIME_METHODS="['callMain']" -s USE_PTHREADS=0 -s ALLOW_MEMORY_GROWTH=1 -s ASSERTIONS=1 -s DETERMINISTIC=1 -fsanitize=address -s INVOKE_RUN=0 --memoryprofiler -o xz.html

At this point browse to xz.html and run Module.callMain([ "--help" ]) to watch what I've been reporting here. Thank you!

Yes, I see this happening on basically everything I am building via #12204 (when I add my own _memset function to get past this error). I get this exact stack trace, hitting that __mo_lookup function, for as, ld, readelf, and objdump, which I'm building from binutils. All of these worked fine in an older emscripten build... I might need to figure out how to get an old emscripten to work again. :(

I got past this issue by disabling nls on the build: --disable-nls so it's just some strange interaction with gettext and friends.

I got past this issue by disabling nls on the build: --disable-nls so it's just some strange interaction with gettext and friends.

Thanks, @wilkie, I just saw your advice today and will try it out.

That's awesome, calling Module.callMain([ "--help" ]) worked without a "memory out of bounds" error! Thank you! I will continue to play and share the results of my experiments over here.

I feel ready to close this issue now that it was possible to compile liblzma and xz to wasm, and therefore port a project that uses them as dependencies.
For those interested, the step by step compilation instructions may be seen inside the Dockerfile in

https://github.com/fbitti/web-repaq

and a functional demonstration may be seen over here:

https://web-repaq.web.app/

Thank you, @wilkie for telling me which emconfigure flag to use. After months of being stuck, your tip allowed me to move forward.

Happy to help, as always. Have a good new year!
