Node: Segfault on v10.7.0

Created on 16 Aug 2018 · 23 comments · Source: nodejs/node

E PID 8 received SIGSEGV for address: 0x0
E /app/node_modules/segfault-handler/build/Release/segfault-handler.node(+0x2c0d)[0x7f4e099bac0d]
E /lib/x86_64-linux-gnu/libpthread.so.0(+0x128e0)[0x7f4e0bd958e0]
E node[0xe837ac]
E node(_ZN2v88internal7Factory20NewStringFromTwoByteENS0_6VectorIKtEENS0_13PretenureFlagE+0x19)[0xe83c09]
E node(_ZN2v86String14NewFromTwoByteEPNS_7IsolateEPKtNS_13NewStringTypeEi+0x9a)[0xac57ba]
E node(_ZN4node11StringBytes6EncodeEPN2v87IsolateEPKtmPNS1_5LocalINS1_5ValueEEE+0x29d)[0x95aaad]
E node[0x95c9ed]
E node[0xb44409]
E node(_ZN2v88internal21Builtin_HandleApiCallEiPPNS0_6ObjectEPNS0_7IsolateE+0xb9)[0xb44f79]
E [0x1bb2750041bd]

  • Version:
    node -v
    v10.7.0
  • Platform:
    Linux d187343d97b6 4.9.93-linuxkit-aufs #1 SMP Wed Jun 6 16:55:56 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
string_decoder

All 23 comments

Same issue on v10.9.0:

E PID 8 received SIGSEGV for address: 0x0
E /app/node_modules/segfault-handler/build/Release/segfault-handler.node(+0x2c0d)[0x7f4dfe980c0d]
E /lib/x86_64-linux-gnu/libpthread.so.0(+0x128e0)[0x7f4e00d5b8e0]
E node[0xe9eb00]
E node(_ZN2v88internal7Factory20NewStringFromTwoByteENS0_6VectorIKtEENS0_13PretenureFlagE+0x19)[0xe9ef49]
E node(_ZN2v86String14NewFromTwoByteEPNS_7IsolateEPKtNS_13NewStringTypeEi+0x9a)[0xadf7ca]
E node(_ZN4node11StringBytes6EncodeEPN2v87IsolateEPKtmPNS1_5LocalINS1_5ValueEEE+0x29d)[0x96cebd]
E node[0x96edfd]
E node[0xb5996f]
E node(_ZN2v88internal21Builtin_HandleApiCallEiPPNS0_6ObjectEPNS0_7IsolateE+0xb9)[0xb5a4d9]
E [0x30fa8a1dc01d]

Can you share code that reproduces this? Do you use the official binaries downloaded from https://nodejs.org/, or from another source?

Sadly no. It's a rather large application that only crashes in the Docker container running in our staging and production environments, so the issue seems to be Linux-specific, or at least not happening on macOS where we do development.

We are using the official binaries; you can see the Docker images we use here: https://github.com/connectedcars/docker-node

Also, we are not seeing the issue on Node 8.x.

If you could give us some instructions on extracting more debug information, we'd be happy to help with that.

@tlbdk OOM? Can you run node with gdb?

The relevant symbol here is 0x96edfd which is inside node::(anonymous namespace)::DecodeData(v8::FunctionCallbackInfo<v8::Value> const&), which is (unfortunately) not provided by segfault-handler.

How consistently are you seeing these issues? Is there a chance you could run modified node binaries, or possibly even compile a debug build of Node?

@baslr I don't think it's an out-of-memory issue. We can run node with gdb; any instructions on how to do that?

@addaleax It seems pretty consistent, so it would be easy to run a modified node binary. If you have a prebuilt debug build of node that would be the easiest.

@tlbdk I think the ideal situation would be causing a crash with a debug build to generate a core dump, then provide that debug binary + the core dump together (warning: these files are going to be huge).

If you want to run gdb yourself, you’d start gdb path/to/debug/binary and then enter run <node.js arguments> inside the shell. There, once the crash occurs, you could run bt to get a much better stack trace, use frame N to select individual frames from the stack as listed by bt, and p (expression) to print the values of variables etc.
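The steps above can be sketched as a session like the following (the binary path, script name, and printed expression are placeholders, not from this report):

```
$ gdb path/to/out/Debug/node
(gdb) run app.js          # run Node with your script until the crash
(gdb) bt                  # print the full stack trace
(gdb) frame 4             # select frame 4 from the bt output
(gdb) p length            # print a variable visible in the selected frame
```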

@addaleax OK, what does a debug build entail? Do you have any instructions, or should I just enable all the debug flags?

 ./configure | grep debug
                 'debug_nghttp2': 'false',
                 'node_debug_lib': 'false',
                 'v8_optimized_debug': 0,

@tlbdk It’s ./configure --debug – what you’re looking at in that output seems to be feature flags, not compiler settings.

@addaleax We finally managed to build a debug image to run the code and have captured a core dump. Note that we sanitized it for passwords using:

perl -pi -e 's/password/xxxxxxxx/g' core.node.7.1535125297

Hope this is OK; otherwise, please suggest another method for doing this.

Core dump: https://storage.googleapis.com/connectedcars-staging-public/node-debug/core.node.7.1535125297.gz
Debug node: https://storage.googleapis.com/connectedcars-staging-public/node-debug/node.debug.gz

@tlbdk That doesn’t quite look like a debug build – those are typically in the range of ~600 MB binaries, whereas this one is the size of a regular Node binary…

./configure --debug generates two binaries, the regular one (node aka out/Release/node) and a debug build (node_g aka out/Debug/node).
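For reference, the full debug-build sequence looks roughly like this (a sketch; the -j4 flag and the sanity check are this editor's additions, not from the thread):

```
$ ./configure --debug
$ make -j4                            # builds out/Release/node and out/Debug/node (aka node_g)
$ out/Debug/node -p process.version   # sanity check the debug binary
```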

@addaleax We just did ./configure --debug && make install

I now added:

cp -f out/Debug/node /opt/node-v10.9.0-linux-x64/bin/node
ls -lh /usr/local/bin/node
-rwxr-xr-x 1 root root 575M Aug 24 18:22 /usr/local/bin/node

Do I need to do anything else?

It would be nice to have a short how-to guide on this; I would be happy to write one if you have a good location to put it.

The build takes 2-3 hours, and hitting the issue takes another 10-20 minutes, so we might not have a new core dump before Monday.

@tlbdk That looks about right, yes. :)

Depending on how long you think a guide would be, BUILDING.md in this repo might be a good place?

@mathiastj Sorry, but I’m getting file format errors when trying to open either with gdb…

$ gdb node.debug core.node.8.1535359906
[...]
"/tmp/22358/node.debug": not in executable format: File format not recognized
"/tmp/22358/core.node.8.1535359906" is not a core dump: File format not recognized

Are you sure the password sanitization didn’t do more to the files than it should have?

(Feel free to run the above gdb command on the original core dump yourself, running backtrace on it should be pretty helpful too)

I get the same file format errors when trying to open the original core dump. I tried to extract another core dump but that too gives the same error.
I am able to open the core dumps in lldb though.

@addaleax Running gdb in the container that had created the core dump, I was able to open the files. For reference, the image is here: https://github.com/connectedcars/docker-node/blob/debug/debug/Dockerfile
Here is the output of backtrace:

#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007f59240e92f1 in __GI_abort () at abort.c:79
#2  0x000055a20a26cd16 in node::Abort () at ../src/node.cc:1193
#3  0x000055a20a26ce04 in node::Assert (args=0x55a20ccc07e0 <node::(anonymous namespace)::MakeString(v8::Isolate*, char const*, unsigned long, node::encoding)::args>) at ../src/node.cc:1210
#4  0x000055a20a364545 in node::(anonymous namespace)::MakeString (isolate=0x55a20de0bbb0, data=0x55a20df972a1 "2", length=65534, encoding=node::UCS2) at ../src/string_decoder.cc:35
#5  0x000055a20a364d6b in node::StringDecoder::DecodeData (this=0x55a20de54480, isolate=0x55a20de0bbb0, data=0x55a20df972a1 "2", nread_ptr=0x7ffd26791c70) at ../src/string_decoder.cc:219
#6  0x000055a20a365134 in node::(anonymous namespace)::DecodeData (args=...) at ../src/string_decoder.cc:272
#7  0x000055a20a6d28e4 in v8::internal::FunctionCallbackArguments::Call (this=0x7ffd26791e10, handler=0x2a8d26b8a659) at ../deps/v8/src/api-arguments-inl.h:94
#8  0x000055a20a6d584e in v8::internal::(anonymous namespace)::HandleApiCallHelper<false> (isolate=0x55a20de0bbb0, function=..., new_target=..., fun_data=..., receiver=..., args=...) at ../deps/v8/src/builtins/builtins-api.cc:109
#9  0x000055a20a6d36eb in v8::internal::Builtin_Impl_HandleApiCall (args=..., isolate=0x55a20de0bbb0) at ../deps/v8/src/builtins/builtins-api.cc:139
#10 0x000055a20a6d3485 in v8::internal::Builtin_HandleApiCall (args_length=7, args_object=0x7ffd26792008, isolate=0x55a20de0bbb0) at ../deps/v8/src/builtins/builtins-api.cc:127

@addaleax Was this enough to help with the debugging or do we need to provide more info?

@tlbdk I’m pretty sure it’s enough, I’ll try to have a patch up today or tomorrow. Thanks for the help!

@tlbdk @mathiastj It would be great if you could try the patch in #22622!

I'm out traveling, but I should have time to do a build from master by the end of the week.

@addaleax

We have now tested it out and it works flawlessly.
Thanks a lot for all the help.

