I just built [email protected] on mips64el (Loongson 3A3000) with --with-snapshot, and the build succeeded. When I run nodejs, it crashes. The messages are as follows:
#
# Fatal error in ../deps/v8/src/execution/isolate.cc, line 232
# Embedded blob checksum verification failed. This indicates that the embedded blob has been modified since compilation time. A common cause is a debugging breakpoint set within builtin code.
#
#
#
#FailureMessage Object: 0xffffffacc0
Program received signal SIGTRAP, Trace/breakpoint trap.
v8::base::OS::Abort () at ../deps/v8/src/base/platform/platform-posix.cc:406
But [email protected] and [email protected] work fine. I found that when I build [email protected] with --without-snapshot, nodejs works fine. Maybe mips64el has some problems with the v8 snapshot.
The versions of v8 and nodejs are :
According to the nodejs building document, nodejs has to be built with gcc>=6.3, but I built it with gcc 4.9.3-3 without errors. After the crash, I used [email protected] to build nodejs, and the result was the same.
I also tried to build nodejs with the cross compilers [email protected] and [email protected]; nodejs crashed too. You can download them from
http://www.loongnix.org/index.php/Cross-compile (direct download URLs: [email protected] and [email protected]), then put the unpacked directory under /usr/local/
The cross-build script is as follows (host: Ubuntu 18.04 x86_64 and [email protected]):
#!/bin/bash
export PREFIX=/usr/local/mips-loongson-gcc4.9-linux-gnu/bin/mips-linux-gnu-
export CC=${PREFIX}"gcc -march=gs464e -mips64r2 -mabi=64"
export CXX=${PREFIX}"c++ -march=gs464e -mips64r2 -mabi=64"
export LINK=$CXX
export LD=${PREFIX}ld
export AR=${PREFIX}ar
export AS=${PREFIX}as
export RANLIB=${PREFIX}ranlib
export CROSS_COMPILE=mips-loongson
export ARCH=mips64el
# Native compilers
export AR_host="ar"
export CC_host="gcc"
export CXX_host="g++"
export LINK_host="g++"
export AR_HOST="ar"
export CC_HOST="gcc"
export CXX_HOST="g++"
export LINK_HOST="g++"
# extras for convenience.
export OBJD=${PREFIX}objdump
export GDB=${PREFIX}gdb
export RDE=${PREFIX}readelf
./configure --dest-cpu=mips64el --cross-compiling --with-mips-arch-variant=r2 --dest-os=linux --openssl-no-asm --verbose
make -j$(grep -c ^processor /proc/cpuinfo 2>/dev/null || 1)
Maybe nodejs crashed in v8 builtin code. But when I use ./configure xxxx with --gdb, the build fails at src\deps\v8\src\diagnostics\gdb-jit.cc@629 with "#error Unsupported target architecture":
void WriteHeader(Writer* w) {
DCHECK_EQ(w->position(), 0);
Writer::Slot<ELFHeader> header = w->CreateSlotHere<ELFHeader>();
#if (V8_TARGET_ARCH_IA32 || V8_TARGET_ARCH_ARM)
const uint8_t ident[16] = {0x7F, 'E', 'L', 'F', 1, 1, 1, 0,
0, 0, 0, 0, 0, 0, 0, 0};
#elif V8_TARGET_ARCH_X64 && V8_TARGET_ARCH_64_BIT || \
V8_TARGET_ARCH_PPC64 && V8_TARGET_LITTLE_ENDIAN
const uint8_t ident[16] = {0x7F, 'E', 'L', 'F', 2, 1, 1, 0,
0, 0, 0, 0, 0, 0, 0, 0};
#elif V8_TARGET_ARCH_PPC64 && V8_TARGET_BIG_ENDIAN && V8_OS_LINUX
const uint8_t ident[16] = {0x7F, 'E', 'L', 'F', 2, 2, 1, 0,
0, 0, 0, 0, 0, 0, 0, 0};
#elif V8_TARGET_ARCH_S390X
const uint8_t ident[16] = {0x7F, 'E', 'L', 'F', 2, 2, 1, 3,
0, 0, 0, 0, 0, 0, 0, 0};
#elif V8_TARGET_ARCH_S390
const uint8_t ident[16] = {0x7F, 'E', 'L', 'F', 1, 2, 1, 3,
0, 0, 0, 0, 0, 0, 0, 0};
#else
#error Unsupported target architecture. <--
#endif
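For reference, the ident arrays in the listing above differ only in bytes 4, 5 and 7 (ELF class, data encoding, OS ABI). Below is a minimal sketch of what a mips64el entry would presumably look like, assuming ELFCLASS64 and little-endian data. MakeElfIdent and kMips64elIdent are hypothetical names, not part of V8, and gdb-jit.cc very likely needs further MIPS-specific changes beyond the header bytes.

```cpp
#include <array>
#include <cstdint>

// e_ident values from the ELF specification.
enum class ElfClass : std::uint8_t { k32 = 1, k64 = 2 };      // EI_CLASS
enum class ElfData : std::uint8_t { kLittle = 1, kBig = 2 };  // EI_DATA

// Hypothetical helper (not V8 code): builds the 16-byte e_ident array.
constexpr std::array<std::uint8_t, 16> MakeElfIdent(ElfClass cls, ElfData data,
                                                    std::uint8_t os_abi = 0) {
  return {0x7F, 'E', 'L', 'F',
          static_cast<std::uint8_t>(cls),   // byte 4: 32-bit or 64-bit objects
          static_cast<std::uint8_t>(data),  // byte 5: little or big endian
          1,                                // byte 6: EV_CURRENT
          os_abi,                           // byte 7: OS ABI (3 in the S390 entries)
          0, 0, 0, 0, 0, 0, 0, 0};          // padding
}

// Assumption: mips64el would use the same bytes as the x64 branch above.
constexpr auto kMips64elIdent =
    MakeElfIdent(ElfClass::k64, ElfData::kLittle);
```

With these assumptions the mips64el bytes come out identical to the existing x64 / little-endian PPC64 branch.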
Hi @bnoordhuis:
Thank you for your help. Yes, mips(el) is not an officially supported
architecture, but it is better than nothing.
That "Embedded blob checksum verification failed" error is a debug check, it's not enabled in release builds, so I'm kind of surprised you're getting that message. Where did you obtain the source code from?
When nodejs crashed, I rebuilt nodejs with the --debug flag, so I got the error message. But I cannot build it with the --gdb flag.
--gdb is for debugging jitted code (the machine code V8 emits when it compiles a JS method), you probably don't need that but if you do, you can probably make it work with minor tweaks to gdb-jit.cc.
You mention release builds crash? What does the backtrace look like in gdb?
So I dug deeper into the issue. It crashed in node_mksnapshot.
With GDB, it told me
Thread 1 "node_mksnapshot" received signal SIGSEGV, Segmentation fault.
0x000000aaab6dace0 in Builtins_ConstructProxy () at ../../deps/v8/../../deps/v8/src/builtins/base.tq:412
412 elements: FixedArrayBase;
(gdb) info registers
zero at v0 v1
R0 0000000000000000 0000000000000001 000000aaab6dace0 0000000000010000
a0 a1 a2 a3
R4 000000aaaf2b1fe0 000000d3c5ec04b9 000000143ad03119 000000d3c5ec04b9
a4 a5 a6 a7
R8 0000000000000003 000000ffffff27f0 0000000000000000 0000000000000003
t0 t1 t2 t3
R12 0000000000000000 0000000000000000 0000000000000000 00000000000c0004
s0 s1 s2 s3
R16 000000aaaf2b1ee0 000000ffffff2560 000000aaaf259fe0 0000000000000005
s4 s5 s6 s7
R20 000000ffffff24d0 000000d3c5ec04b9 000000143ad03119 000000aaab6dace0
t8 t9 k0 k1
R24 0000000000000038 000000aaab6dace0 000000fff7bd0000 0000000000000000
gp sp s8 ra
R28 000000aaaf20e758 000000ffffff2480 000000ffffff2480 000000aaac33d630
status lo hi badvaddr
000000000400ccf3 0e1b9099b653a189 0000000000000001 000000aaac33d62f
cause pc
0000000010000004 000000aaab6dace0
fcsr fir restart
000c0004 00f70501 0000000000000000
With no backtrace frame.
Disassembly at that point seems strange. It should be padding or something, not code.
0x000000aaab6dac20 <+848>: sd v0,-48(s8)
0x000000aaab6dac24 <+852>: sd a5,-64(s8)
0x000000aaab6dac28 <+856>: sd a7,-72(s8)
0x000000aaab6dac2c <+860>: sd a4,-80(s8)
0x000000aaab6dac30 <+864>: ld t9,25640(s6)
0x000000aaab6dac34 <+868>: daddiu t9,t9,63
0x000000aaab6dac38 <+872>: jalr t9
0x000000aaab6dac3c <+876>: nop
0x000000aaab6dac40 <+880>: move t0,v0
0x000000aaab6dac44 <+884>: ld a6,-40(s8)
0x000000aaab6dac48 <+888>: ld v0,-48(s8)
0x000000aaab6dac4c <+892>: ld a5,-64(s8)
0x000000aaab6dac50 <+896>: ld a7,-72(s8)
0x000000aaab6dac54 <+900>: ld a4,-80(s8)
0x000000aaab6dac58 <+904>: b 0xaaab6daa84 <Builtins_ConstructProxy+436>
0x000000aaab6dac5c <+908>: nop
0x000000aaab6dac60 <+912>: daddiu sp,sp,-16
0x000000aaab6dac64 <+916>: sd v0,0(sp)
0x000000aaab6dac68 <+920>: li a2,0x3b
0x000000aaab6dac6c <+924>: dsll32 a2,a2,0x1
0x000000aaab6dac70 <+928>: sd a2,8(sp)
0x000000aaab6dac74 <+932>: ld a1,9616(s6)
0x000000aaab6dac78 <+936>: daddiu a0,zero,2
0x000000aaab6dac7c <+940>: ld s7,-40(s8)
0x000000aaab6dac80 <+944>: ld t9,30488(s6)
0x000000aaab6dac84 <+948>: daddiu t9,t9,63
0x000000aaab6dac88 <+952>: jalr t9
0x000000aaab6dac8c <+956>: nop
0x000000aaab6dac90 <+960>: break 0x150,0x321
0x000000aaab6dac94 <+964>: nop
0x000000aaab6dac98 <+968>: srav zero,zero,zero
0x000000aaab6dac9c <+972>: srl zero,zero,0x0
0x000000aaab6daca0 <+976>: sll zero,zero,0x2
0x000000aaab6daca4 <+980>: sd ra,-1(ra)
0x000000aaab6daca8 <+984>: sd ra,-1(ra)
0x000000aaab6dacac <+988>: 0x204
0x000000aaab6dacb0 <+992>: sd ra,-1(ra)
0x000000aaab6dacb4 <+996>: sd ra,-1(ra)
0x000000aaab6dacb8 <+1000>: dsll32 zero,zero,0xa
0x000000aaab6dacbc <+1004>: sd ra,-1(ra)
0x000000aaab6dacc0 <+1008>: sd ra,-1(ra)
0x000000aaab6dacc4 <+1012>: 0x2dc
0x000000aaab6dacc8 <+1016>: sd ra,-1(ra)
0x000000aaab6daccc <+1020>: sd ra,-1(ra)
0x000000aaab6dacd0 <+1024>: 0x348
0x000000aaab6dacd4 <+1028>: sd ra,-1(ra)
0x000000aaab6dacd8 <+1032>: sd ra,-1(ra)
0x000000aaab6dacdc <+1036>: tge zero,zero,0xd
=> 0x000000aaab6dace0 <+1040>: sd ra,-1(ra)
0x000000aaab6dace4 <+1044>: sd ra,-1(ra)
0x000000aaab6dace8 <+1048>: sll zero,zero,0xf
0x000000aaab6dacec <+1052>: sd ra,-1(ra)
0x000000aaab6dacf0 <+1056>: sd ra,-1(ra)
0x000000aaab6dacf4 <+1060>: 0x2001a8
0x000000aaab6dacf8 <+1064>: 0x1b80000
0x000000aaab6dacfc <+1068>: 0x1be0000
0x000000aaab6dad00 <+1072>: pref 0xc,0(a2)
0x000000aaab6dad04 <+1076>: pref 0xc,-13108(a2)
0x000000aaab6dad08 <+1080>: pref 0xc,-13108(a2)
0x000000aaab6dad0c <+1084>: pref 0xc,-13108(a2)
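A quick check supports the "not code" reading. In the MIPS I-type encoding, the word 0xFFFFFFFF decodes as opcode 0x3F (SD), base register 31 (ra), source register 31 (ra), offset -1, which is exactly the "sd ra,-1(ra)" that gdb prints over and over here. The sketch below decodes that word and recomputes the store address from the register dump; DecodeIType and EffectiveAddress are illustrative helper names, not from any library.

```cpp
#include <cstdint>

// MIPS I-type layout: opcode[31:26] rs[25:21] rt[20:16] imm[15:0].
struct ITypeFields {
  std::uint32_t opcode;
  std::uint32_t rs;  // base register number
  std::uint32_t rt;  // register being stored/loaded
  std::int16_t imm;  // signed 16-bit offset
};

// Splits a 32-bit instruction word into its I-type fields.
ITypeFields DecodeIType(std::uint32_t word) {
  ITypeFields f;
  f.opcode = word >> 26;
  f.rs = (word >> 21) & 0x1F;
  f.rt = (word >> 16) & 0x1F;
  f.imm = static_cast<std::int16_t>(word & 0xFFFF);
  return f;
}

// Address a load/store would touch: base register plus sign-extended offset.
std::uint64_t EffectiveAddress(std::uint64_t base, std::int16_t imm) {
  return base + static_cast<std::uint64_t>(static_cast<std::int64_t>(imm));
}
```

With ra = 0x000000aaac33d630 from the register dump, EffectiveAddress(ra, -1) gives 0x000000aaac33d62f, which matches badvaddr. So the CPU was most likely executing a run of 0xFFFFFFFF data words as instructions.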
So I tried to trace the RA register, and got the real call path.
Thread 1 "node_mksnapshot" hit Breakpoint 2, 0x000000aaac33d4ac in v8::internal::(anonymous namespace)::Invoke(v8::internal::Isolate*, v8::internal::(anonymous namespace)::InvokeParams const&) ()
(gdb) bt
#0 0x000000aaac33d4ac in v8::internal::(anonymous namespace)::Invoke(v8::internal::Isolate*, v8::internal::(anonymous namespace)::InvokeParams const&) ()
#1 0x000000aaac33db2c in v8::internal::Execution::Call(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*) ()
#2 0x000000aaac293918 in v8::Function::Call(v8::Local<v8::Context>, v8::Local<v8::Value>, int, v8::Local<v8::Value>*) ()
#3 0x000000aaababf584 in node::InitializePrimordials (context=...) at ../src/api/environment.cc:504
#4 0x000000aaababe90c in node::GetPerContextExports (context=...) at ../src/api/environment.cc:409
#5 0x000000aaababf264 in node::InitializePrimordials (context=...) at ../src/api/environment.cc:482
#6 0x000000aaababf06c in node::InitializeContextForSnapshot (context=...) at ../src/api/environment.cc:465
#7 0x000000aaababf670 in node::InitializeContext (context=...) at ../src/api/environment.cc:516
#8 0x000000aaababeabc in node::NewContext (isolate=0xaaaf2b1ee0, object_template=...) at ../src/api/environment.cc:422
#9 0x000000aaabab86d4 in node::SnapshotBuilder::Generate (args=std::vector of length 1, capacity 1 = {...}, exec_args=std::vector of length 0, capacity 0) at ../tools/snapshot/snapshot_builder.cc:90
#10 0x000000aaabab6dec in main (argc=2, argv=0xffffff30c8) at ../tools/snapshot/node_mksnapshot.cc:47
Looks like it jumped to a wrong address. Any further hints on debugging are appreciated.
Thanks.
@bnoordhuis
It appears to be a GCC only regression.
Clang build works fine, but there are some test failures with OpenSSL, like:
=== release test-tls-honorcipherorder ===
Path: parallel/test-tls-honorcipherorder
_tls_common.js:129
c.context.setCert(cert);
^
Error: error:140AB18F:SSL routines:SSL_CTX_use_certificate:ee key too small
at Object.createSecureContext (_tls_common.js:129:17)
at Server.setSecureContext (_tls_wrap.js:1312:27)
at new Server (_tls_wrap.js:1176:8)
at Object.createServer (_tls_wrap.js:1219:10)
at test (/home/flygoat/nodejs/test/parallel/test-tls-honorcipherorder.js:30:22)
at Object.<anonymous> (/home/flygoat/nodejs/test/parallel/test-tls-honorcipherorder.js:60:1)
at Module._compile (internal/modules/cjs/loader.js:1204:30)
at Object.Module._extensions..js (internal/modules/cjs/loader.js:1224:10)
at Module.load (internal/modules/cjs/loader.js:1053:32)
at Function.Module._load (internal/modules/cjs/loader.js:948:14) {
library: 'SSL routines',
function: 'SSL_CTX_use_certificate',
reason: 'ee key too small',
code: 'ERR_SSL_EE_KEY_TOO_SMALL'
}
Will do some investigation.
https://github.com/nodejs/node/issues/31118#issuecomment-591221731 looks like a C++ -> JS function call gone wrong but I can't really tell more from the backtrace.
It looks like this function call:
cc @joyeecheung - that code seems to have undergone quite some changes recently so perhaps this is a known and already fixed issue?
@FlyGoat You might want to try out the HEAD of the v12.x-staging branch, see if that works better.
@bnoordhuis still happens on v12.x-staging with GCC.
But I can't understand why clang just works fine.
that code seems to have undergone quite some changes recently so perhaps this is a known and already fixed issue?
Not that I know of, but I can't tell from the stack trace which part of the snapshot building was particularly relevant - I believe the call on the stack frames was just the first time V8 in our builds would ever try to call from C++ to JS through the API? It might help to just create a regular v8 context, compile a simple JS function and call into it in node::SnapshotBuilder::Generate() right after the registration of isolates to figure out if it's just general C++ -> JS call failure:
See this test on how to do this:
And, you can also try building with the configure option --without-node-snapshot and see if the node binary fails when it starts
Thanks for your suggestions. The node binary without snapshot crashed at the same point.
It looks like a v8 issue. I'm going to run some v8 tests.
Bisected down to commit f579e1194046c50f2e6bb54348d48c8e7d1a53cf, v8 broke between 7.3.492.25 and 7.4.288.13. Need to investigate deeper.
On the mips platform, we have always built v8 with LLVM before, so we did not notice this issue.
I will do some investigation about this.
This is probably a bug in the GCC assembler on mips.
The v8 snapshot uses the file "embedded.S" as an input file. If v8 does not insert source location information (".loc") into this file, everything goes right and all code is written as ".octa xxxxx". But if v8 inserts ".loc" into "embedded.S", the code is written as ".octa xxx" mixed with ".byte xxxx"; see https://github.com/nodejs/node/blob/v12.x/deps/v8/src/snapshot/embedded/embedded-file-writer.cc#L165
It looks like the GCC assembler does not handle this situation correctly, while clang does. I have reported this to our compiler team.
Not inserting source information into this file works around the issue temporarily; a patch is attached.

It's very nice of you to finish the investigation! We'll see how to best fix things upstream.
Filed a v8 upstream issue.
As it's currently restricted to Googlers, I'll paste the content below.
The root cause is that embedded-file-writer inserted a .byte between two .octa directives, and the assembler's auto-align feature added padding to align the .octa to a 128-bit boundary, which broke the relative address offsets within the code.
What is auto-align on MIPS assembler?
Demo code.
.octa ~0x0
.word 0xdeadbeef
.octa ~0x0
On other architectures, the generated binary is likely to be:
addr content
0x0 0xffffffff
0x4 0xffffffff
0x8 0xffffffff
0xc 0xffffffff
0x10 0xdeadbeef
0x14 0xffffffff
0x18 0xffffffff
0x1c 0xffffffff
0x20 0xffffffff
However, on MIPS, what will be generated is:
addr content
0x0 0xffffffff
0x4 0xffffffff
0x8 0xffffffff
0xc 0xffffffff
0x10 0xdeadbeef
0x14 0x00000000
0x18 0x00000000
0x1c 0x00000000
0x20 0xffffffff
0x24 0xffffffff
0x28 0xffffffff
0x2c 0xffffffff
0x14~0x1c is auto-align padding added by the assembler and, unfortunately, we can't turn it off. It aligns the start of every directive to its natural boundary (128-bit for .octa).
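The two layouts above can be reproduced with a small model of the assembler's location counter. This is only a sketch of the behaviour described here, not of the real assembler; Layout is a hypothetical helper that takes directive sizes in bytes (16 for .octa, 4 for .word, 1 for .byte).

```cpp
#include <cstddef>
#include <vector>

// Returns the start offset of each directive, given its size in bytes.
// With auto_align set, every directive is first rounded up to a multiple
// of its own size (its natural boundary), as described for MIPS above.
std::vector<std::size_t> Layout(const std::vector<std::size_t>& sizes,
                                bool auto_align) {
  std::vector<std::size_t> offsets;
  std::size_t pos = 0;
  for (std::size_t size : sizes) {
    if (auto_align && size > 1) {
      pos = (pos + size - 1) / size * size;  // round up; gap becomes padding
    }
    offsets.push_back(pos);
    pos += size;
  }
  return offsets;
}
```

For the demo sequence {16, 4, 16}, the model places the second .octa at 0x14 without auto-align and at 0x20 with it, so everything after the padding is shifted by 12 bytes relative to what v8 expects.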
My suggestion is to let embedded-file-writer use the 32-bit .word instead of .octa. As all MIPS (and most other RISC) instructions are 32-bit and .word has a 32-bit alignment boundary, with .word the auto-align won't insert any padding that breaks our code.
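A rough sketch of that idea follows, assuming a little-endian target such as mips64el. EmitAsWords is a hypothetical helper and not the actual embedded-file-writer code: it writes a byte buffer as one .word per 4 bytes and falls back to .byte only for a trailing remainder, so padding can never appear in the middle of the code.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Formats `code` as assembler directives: one `.word` per 4 bytes
// (composed little-endian), then `.byte` for any leftover 1-3 bytes.
std::string EmitAsWords(const std::vector<std::uint8_t>& code) {
  std::string out;
  char line[32];
  std::size_t i = 0;
  for (; i + 4 <= code.size(); i += 4) {
    const std::uint32_t w = static_cast<std::uint32_t>(code[i]) |
                            static_cast<std::uint32_t>(code[i + 1]) << 8 |
                            static_cast<std::uint32_t>(code[i + 2]) << 16 |
                            static_cast<std::uint32_t>(code[i + 3]) << 24;
    std::snprintf(line, sizeof line, "  .word 0x%08x\n",
                  static_cast<unsigned>(w));
    out += line;
  }
  for (; i < code.size(); ++i) {
    std::snprintf(line, sizeof line, "  .byte 0x%02x\n",
                  static_cast<unsigned>(code[i]));
    out += line;
  }
  return out;
}
```

Since the data starts 4-byte aligned and every .word is 4 bytes long, each directive already begins on its natural boundary and the auto-align has nothing to pad.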
@bnoordhuis
Now that we've addressed the issue and, with the workaround, NodeJS manages to pass most of the tests on mips64el, is it possible to promote mips64el to an experimental or Tier 2 supported architecture?
@xen0n and I can provide help with general MIPS issues, @xwafish is maintaining MIPS v8 upstream, and @wzssyqa from Debian can help with toolchain & system environment issues.
We can also provide a mips64el CI machine hosted in China.
Thanks.
@FlyGoat If that machine can be set up in a way where it's managed by our Build WG in order that they can run Jenkins etc. on it, then promoting mips64el to experimental shouldn't be a problem.
Tier 2 status means test failures block releases but the MIPS user base isn't large enough to warrant that.
@bnoordhuis Whom should I get in touch with for that?
Should I open a new issue?
Thanks.
@FlyGoat Can you open an issue over at https://github.com/nodejs/build/issues explaining you want to donate a machine, what specs it has, etc.? The build people will take it from there.
(Technically, I'm one of the build people but I'm no expert on how to provision machines.)
Fixed in upstream.
@FlyGoat You can open a back-port of the bug fix if you want. The process is outlined in https://github.com/nodejs/node/blob/master/doc/guides/maintaining-V8.md, specifically the "Backporting to Abandoned Branches" section.
The v8 patch is working on v12.x. Thank you all. @FlyGoat should open a back-port of the bug fix.
About the test-tls-honorcipherorder SSL test failure reported earlier: the tests are supposed to be run using the openssl.cnf distributed with nodejs,
which doesn't happen if node is built against a shared openssl; in that case one has to set:
OPENSSL_CONF=./deps/openssl/openssl/apps/openssl.cnf make test-js