Meteor: Meteor segfaults randomly on reload (locally)

Created on 28 Apr 2017 · 101 Comments · Source: meteor/meteor

Hi,

When I use Meteor locally, the server starts correctly but I face frequent segfaults which appear randomly (I can't figure out exactly what causes them).
It seems to occur when the server reloads (due to a file modification). But it doesn't segfault on every reload, and doesn't segfault on the same files (sometimes a modification on the server side, sometimes on the client side).

My RAM isn't full (about 10 GB free when the segfault occurs).
And here are all the packages I use:

`[email protected] # Packages every Meteor app needs to have
[email protected] # Packages for a great mobile UX
[email protected] # The database Meteor supports right now
[email protected] # Compile .html files into Meteor Blaze views
[email protected] # Reactive variable for tracker
[email protected] # Helpful client-side library
[email protected] # Meteor's client-side reactive programming library

[email protected] # CSS minifier run for production mode
[email protected] # JS minifier run for production mode
[email protected] # ECMAScript 5 compatibility for older browsers.
[email protected] # Enable ECMAScript2015+ syntax in app code
[email protected] # Server-side component of the meteor shell command

[email protected]
kadira:flow-router
zimme:active-route
[email protected]
kadira:blaze-layout
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
twbs:bootstrap
tanis:bootstrap-social
fortawesome:fontawesome
[email protected]
[email protected]
[email protected]
meteorhacks:ssr
arillo:flow-router-helpers
cfs:filesystem
ostrio:files
[email protected]
dburles:google-maps
sergeyt:typeahead
raix:handlebar-helpers
dburles:collection-helpers
manuel:reactivearray
aldeed:collection2-core
mizzao:bootboxjs
meteorhacks:aggregate
aldeed:[email protected]
summernote:summernote
anback:bootstrap-validator
aldeed:autoform-bs-datepicker
natestrauser:select2
rajit:bootstrap3-datepicker
natestrauser:font-awesome
mystor:device-detection
[email protected]
aldeed:simple-schema
froala:editor
andrasph:clockpicker
aldeed:template-extension
tsega:bootstrap3-datetimepicker
`

I don't know what other information I can give you, but I remain available if you need anything.

Regards.


Most helpful comment

@abernix Segfaults happened ~once/day on Meteor 1.4.4.3. It happens multiple times per hour on Meteor 1.5; usually on code changes.

All 101 comments

Hi @damjuve - what is the exact error message you're seeing? Does #8157 look similar to your problem?

Same here after upgrading; the error is the same:

Client modified -- refreshing (x5)[1] 20587 segmentation fault meteor -s settings.json

The number changes - 20587 now, but also many others like 19119 and 16496.

@hwillson I think this is related to one or more of #8002 #6241 #4446

Seems a lot more people are reporting these lately so it may be time to take a proper look. The difficulty will probably be replicating the issue reliably without being able to access confidential code, I'm happy to help if I can though.

Edit: Just for info, I see abort trap: 6 five or more times a day (across various Meteor apps), and segmentation fault: 11 roughly once a week. Latest Meteor release (1.4.4.1) and macOS Sierra (10.12.4) on a 2014 MacBook Pro Retina 13 (2.8GHz i5, 16 GB RAM).

I'm having the same issue, and people are reporting it on the forums too.
Forum thread

@hwillson - There is no error, just a "segmentation fault (core dumped)" message from the system.
I don't know how to collect more logs.
I posted the problem on the Meteor forums (https://forums.meteor.com/t/frequent-segmentation-fault-on-local/36064/3 and https://forums.meteor.com/t/segmentation-fault-11-meteor-crashing-during-development/35234/11). It seems to happen frequently since the last Meteor update, and on several systems.
In my case, I work with 2 coworkers on the same app.
I face really frequent segfaults (about 5+ per day), on Ubuntu 16.04.
One of my colleagues faces it less frequently (about 5+ per week), on Mac 10.12.4.
The last one has faced it only twice, on Debian (kernel 3.16.0-4).

I don't know if this is relevant. I can also add that I work a lot on the back end (server files), the Mac user works on front and back, and the Debian user mostly on the front end. But I noticed that the problem happens on reloading both server and client files.

I remain available if I can provide more clues about this.

Thanks for your time.

Thanks all - these comments really help. Please keep posting any additional details you uncover.

Would it be possible for anyone getting a core dump here to share that core dump?
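For anyone who wants to capture a core dump on Linux, here's a minimal sketch of enabling them before starting Meteor (the exact core file location depends on your distro's core_pattern setting):

```shell
# Enable core dumps in the current shell before starting Meteor (Linux).
ulimit -c unlimited                  # lift the core-file size limit for this shell
ulimit -c                            # verify: should print "unlimited"
# Where the kernel will write core files (Linux only):
cat /proc/sys/kernel/core_pattern 2>/dev/null || true
# Now start the app from this same shell so the limit is inherited:
# meteor run
```

On macOS, cores are typically written to /cores instead, once the same ulimit is raised.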

Hi @hwillson,

I'm experiencing the issue as well on macOS. About twice a day.
I got this using lldb attached to the process - do you think the core dump could be useful? It seems the problem could be in Node.

Process 48213 resuming
Process 48213 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x3815fd003950)
    frame #0: 0x000000010035ab2e node`v8::internal::MarkCompactCollector::ProcessWeakCollections() + 302
node`v8::internal::MarkCompactCollector::ProcessWeakCollections:
->  0x10035ab2e <+302>: movl   0xa8(%r12,%rax), %eax
    0x10035ab36 <+310>: movzbl %cl, %ecx
    0x10035ab39 <+313>: btl    %ecx, %eax
    0x10035ab3c <+316>: jae    0x10035b020               ; <+1568>

Just had seg fault 11 happen _three_ times within 10 minutes. This is seriously harming productivity.

It would be preferable for anyone "+1"-ing this to, at the very least, include the version of Meteor that they're using. While important when reporting _any_ bug, it's particularly important on an issue of this nature as it could be tied to any number of underlying dependencies (Node.js, v8, etc.)

@rlora Your issue looks like a dangling pointer. If you're not doing manual memory management and relying on V8's automatic garbage collection, then it could certainly seem to be a V8 issue. As stated above, the Meteor version you're using would be helpful. Of course, there are no guarantees that any of the faults in this thread are related, but you're in an opportune position if you've captured it while attached. Can you get any more information from lldb with further inspection of the backtrace? bt might be a good debugger command to start with (and maybe frame info?).

Here it's occurring several times, on almost every code change and Meteor restart...
I don't know how to help debug it. Using version 1.4.4.2.

1.4.4.2 here as well.

@abernix I'm sorry, I'm trying to replicate the crash but I've spent the whole afternoon trying and it works like a charm :(

I'm not doing manual memory management. I think the issue is in V8. I'm using Meteor 1.4.4.2

For people experiencing the crash on macOS, follow these steps:

  1. Start your app meteor --settings settings.json
  2. In a new terminal window start lldb
  3. Inside lldb
(lldb) attach <PID>
(lldb) continue

Continue working in your app; if it crashes, lldb will stop and your app will become unresponsive, but lldb will prevent the app from exiting with a segmentation fault.

You can debug it from there. If you type bt you should get the backtrace.

I will continue trying to reproduce but maybe someone will capture it sooner.
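To save retyping the attach steps each time, something like this could help (the function name is my own invention; it only prints the lldb invocation so you can double-check the PID before attaching):

```shell
# print_lldb_attach: echo the lldb command line for a given node/Meteor PID,
# including SIGUSR2 handling (node uses SIGUSR2 for its debugger support, so
# we tell lldb to pass it through without stopping).
print_lldb_attach() {
  pid="$1"
  echo "lldb -p $pid \\"
  echo "  -o 'process handle SIGUSR2 -n true -p true -s false' \\"
  echo "  -o 'continue'"
}

# Example: find the app's node process first, e.g.
#   print_lldb_attach "$(pgrep -f 'local/build/main.js')"
print_lldb_attach 48213
```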

Helpful info @rlora, I'll run lldb today and see if I can come up with anything useful

Who wants to decipher this?

Process 74267 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGPIPE
    frame #0: 0x00007fffcb2bb7e6 libsystem_kernel.dylib`write + 10
libsystem_kernel.dylib`write:
->  0x7fffcb2bb7e6 <+10>: jae    0x7fffcb2bb7f0            ; <+20>
    0x7fffcb2bb7e8 <+12>: movq   %rax, %rdi
    0x7fffcb2bb7eb <+15>: jmp    0x7fffcb2b2cd4            ; cerror
    0x7fffcb2bb7f0 <+20>: retq
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGPIPE
  * frame #0: 0x00007fffcb2bb7e6 libsystem_kernel.dylib`write + 10
    frame #1: 0x00000001007910f8 node`uv__write + 215
    frame #2: 0x0000000100790fe1 node`uv_write2 + 508
    frame #3: 0x000000010079151b node`uv_try_write + 110
    frame #4: 0x000000010068dcb3 node`node::StreamWrap::DoTryWrite(uv_buf_t**, unsigned long*) + 41
    frame #5: 0x000000010068ba3c node`int node::StreamBase::WriteString<(node::encoding)1>(v8::FunctionCallbackInfo<v8::Value> const&) + 1080
    frame #6: 0x000000010068e8fa node`void node::StreamBase::JSMethod<node::StreamWrap, &(int node::StreamBase::WriteString<(node::encoding)1>(v8::FunctionCallbackInfo<v8::Value> const&))>(v8::FunctionCallbackInfo<v8::Value> const&) + 72
    frame #7: 0x000000010017859f node`v8::internal::FunctionCallbackArguments::Call(void (*)(v8::FunctionCallbackInfo<v8::Value> const&)) + 159
    frame #8: 0x00000001001a0c04 node`v8::internal::MaybeHandle<v8::internal::Object> v8::internal::HandleApiCallHelper<false>(v8::internal::Isolate*, v8::internal::(anonymous namespace)::BuiltinArguments<(v8::internal::BuiltinExtraArguments)1>&) + 1060
    frame #9: 0x00000001001a386d node`v8::internal::Builtin_HandleApiCall(int, v8::internal::Object**, v8::internal::Isolate*) + 61
    frame #10: 0x000036af2a10963b
    frame #11: 0x000036af2bb64e8a
    frame #12: 0x000036af2bb6489a
    frame #13: 0x000036af2bba591b
    frame #14: 0x000036af2c0eddaf
    frame #15: 0x000036af2a109ff7
    frame #16: 0x000036af2bb33b98
    frame #17: 0x000036af2a109ff7
    frame #18: 0x000036af2c5f53ef
    frame #19: 0x000036af2a109ff7
    frame #20: 0x000036af2a1318f9
    frame #21: 0x000036af2a115b62
    frame #22: 0x00000001002d34c8 node`v8::internal::Invoke(bool, v8::internal::Handle<v8::internal::JSFunction>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*) + 728
    frame #23: 0x000000010052440c node`v8::internal::Runtime_Apply(int, v8::internal::Object**, v8::internal::Isolate*) + 908
    frame #24: 0x000036af2a10963b
    frame #25: 0x000036af2bbd3bbf
    frame #26: 0x000036af2a109ff7
    frame #27: 0x000036af2c1a582f
    frame #28: 0x000036af2bba3706
    frame #29: 0x000036af2be4bac1
    frame #30: 0x000036af2a1318fd
    frame #31: 0x000036af2a115b62
    frame #32: 0x00000001002d34c8 node`v8::internal::Invoke(bool, v8::internal::Handle<v8::internal::JSFunction>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*) + 728
    frame #33: 0x000000010015f4c4 node`v8::Function::Call(v8::Local<v8::Context>, v8::Local<v8::Value>, int, v8::Local<v8::Value>*) + 276
    frame #34: 0x000000010064f5d7 node`node::AsyncWrap::MakeCallback(v8::Local<v8::Function>, int, v8::Local<v8::Value>*) + 609
    frame #35: 0x000000010069137f node`node::TimerWrap::OnTimeout(uv_timer_s*) + 127
    frame #36: 0x0000000100792bce node`uv__run_timers + 38
    frame #37: 0x00000001007881e2 node`uv_run + 580
    frame #38: 0x0000000100661aa1 node`node::Start(int, char**) + 735
    frame #39: 0x0000000100001834 node`start + 52

The previous comment was a seg fault 11 on one app, I think this one was an abort trap 6 on a different app:

It looks like the meteor tool crashes first, followed by the actual app...

Output of ps aux | grep node | grep ZZZ:
(Replaced ports, paths and app name with XXXX / YYY / ZZZ)

mike             73815  56.4  4.6  3918388 777380 s001  SX    9:40am  37:10.75 /Users/mike/.meteor/packages/meteor-tool/.1.4.4_2.r4ho8o++os.osx.x86_64+web.browser+web.cordova/mt-os.osx.x86_64/dev_bundle/bin/node /Users/mike/.meteor/packages/meteor-tool/.1.4.4_2.r4ho8o++os.osx.x86_64+web.browser+web.cordova/mt-os.osx.x86_64/tools/index.js --port XXXX --settings ../settings.json
mike             81476   1.5  0.7  3199824 113484 s001  S    11:32am   0:10.58 /Users/mike/.meteor/packages/meteor-tool/.1.4.4_2.r4ho8o++os.osx.x86_64+web.browser+web.cordova/mt-os.osx.x86_64/dev_bundle/bin/node /Users/mike/YYY/ZZZ/.meteor/local/build/main.js

meteor-tool:

Process 73815 stopped
* thread #11, stop reason = signal SIGABRT
    frame #0: 0x00007fffcb2b9d42 libsystem_kernel.dylib`__pthread_kill + 10
libsystem_kernel.dylib`__pthread_kill:
->  0x7fffcb2b9d42 <+10>: jae    0x7fffcb2b9d4c            ; <+20>
    0x7fffcb2b9d44 <+12>: movq   %rax, %rdi
    0x7fffcb2b9d47 <+15>: jmp    0x7fffcb2b2caf            ; cerror_nocancel
    0x7fffcb2b9d4c <+20>: retq
(lldb) bt
* thread #11, stop reason = signal SIGABRT
  * frame #0: 0x00007fffcb2b9d42 libsystem_kernel.dylib`__pthread_kill + 10
    frame #1: 0x00007fffcb3a75bf libsystem_pthread.dylib`pthread_kill + 90
    frame #2: 0x00007fffcb21f420 libsystem_c.dylib`abort + 129
    frame #3: 0x00000001007926a2 node`uv_cond_wait + 20
    frame #4: 0x000000010078617b node`worker + 227
    frame #5: 0x0000000100792300 node`uv__thread_start + 25
    frame #6: 0x00007fffcb3a49af libsystem_pthread.dylib`_pthread_body + 180
    frame #7: 0x00007fffcb3a48fb libsystem_pthread.dylib`_pthread_start + 286
    frame #8: 0x00007fffcb3a4101 libsystem_pthread.dylib`thread_start + 13
(lldb) continue
Process 73815 resuming
Process 73815 exited with status = 0 (0x00000000) Terminated due to signal 6

the app:

Process 81476 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGPIPE
    frame #0: 0x00007fffcb2bb7e6 libsystem_kernel.dylib`write + 10
libsystem_kernel.dylib`write:
->  0x7fffcb2bb7e6 <+10>: jae    0x7fffcb2bb7f0            ; <+20>
    0x7fffcb2bb7e8 <+12>: movq   %rax, %rdi
    0x7fffcb2bb7eb <+15>: jmp    0x7fffcb2b2cd4            ; cerror
    0x7fffcb2bb7f0 <+20>: retq
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGPIPE
  * frame #0: 0x00007fffcb2bb7e6 libsystem_kernel.dylib`write + 10
    frame #1: 0x00000001007910f8 node`uv__write + 215
    frame #2: 0x0000000100790fe1 node`uv_write2 + 508
    frame #3: 0x000000010079151b node`uv_try_write + 110
    frame #4: 0x000000010068dcb3 node`node::StreamWrap::DoTryWrite(uv_buf_t**, unsigned long*) + 41
    frame #5: 0x000000010068ba3c node`int node::StreamBase::WriteString<(node::encoding)1>(v8::FunctionCallbackInfo<v8::Value> const&) + 1080
    frame #6: 0x000000010068e8fa node`void node::StreamBase::JSMethod<node::StreamWrap, &(int node::StreamBase::WriteString<(node::encoding)1>(v8::FunctionCallbackInfo<v8::Value> const&))>(v8::FunctionCallbackInfo<v8::Value> const&) + 72
    frame #7: 0x000000010017859f node`v8::internal::FunctionCallbackArguments::Call(void (*)(v8::FunctionCallbackInfo<v8::Value> const&)) + 159
    frame #8: 0x00000001001a0c04 node`v8::internal::MaybeHandle<v8::internal::Object> v8::internal::HandleApiCallHelper<false>(v8::internal::Isolate*, v8::internal::(anonymous namespace)::BuiltinArguments<(v8::internal::BuiltinExtraArguments)1>&) + 1060
    frame #9: 0x00000001001a386d node`v8::internal::Builtin_HandleApiCall(int, v8::internal::Object**, v8::internal::Isolate*) + 61
    frame #10: 0x000014e493e0963b
    frame #11: 0x000014e495dea48a
    frame #12: 0x000014e495de9e9a
    frame #13: 0x000014e495de9665
    frame #14: 0x000014e495d48dda
    frame #15: 0x000014e496eb363c
    frame #16: 0x000014e496e79da1
    frame #17: 0x000014e496e79aa2
    frame #18: 0x000014e493e09ff7
    frame #19: 0x000014e496e4f792
    frame #20: 0x000014e496e78eb2
    frame #21: 0x000014e493e09ff7
    frame #22: 0x000014e496e92a47
    frame #23: 0x000014e493e09ff7
    frame #24: 0x000014e496909f62
    frame #25: 0x000014e496909ca8
    frame #26: 0x000014e493e345e7
    frame #27: 0x000014e496eda72f
    frame #28: 0x000014e493e09ff7
    frame #29: 0x000014e495d780b4
    frame #30: 0x000014e493e09ff7
    frame #31: 0x000014e4974810ca
    frame #32: 0x000014e495e8e705
    frame #33: 0x000014e496ee1de8
    frame #34: 0x000014e496ee19e5
    frame #35: 0x000014e496ee1678
    frame #36: 0x000014e496eda860
    frame #37: 0x000014e493e09ff7
    frame #38: 0x000014e495d780b4
    frame #39: 0x000014e493e09ff7
    frame #40: 0x000014e4974810ca
    frame #41: 0x000014e495e8e705
    frame #42: 0x000014e496ee1de8
    frame #43: 0x000014e496ee19e5
    frame #44: 0x000014e496ee1678
    frame #45: 0x000014e496eda860
    frame #46: 0x000014e493e09ff7
    frame #47: 0x000014e495d780b4
    frame #48: 0x000014e493e09ff7
    frame #49: 0x000014e4974810ca
    frame #50: 0x000014e495e8e705
    frame #51: 0x000014e496ee1de8
    frame #52: 0x000014e496ee19e5
    frame #53: 0x000014e496ee1678
    frame #54: 0x000014e496eda860
    frame #55: 0x000014e493e09ff7
    frame #56: 0x000014e495d780b4
    frame #57: 0x000014e493e09ff7
    frame #58: 0x000014e4974810ca
    frame #59: 0x000014e495e8e705
    frame #60: 0x000014e496ee1de8
    frame #61: 0x000014e496ee19e5
    frame #62: 0x000014e496ee1678
    frame #63: 0x000014e496eda860
    frame #64: 0x000014e496edb777
    frame #65: 0x000014e493e09ff7
    frame #66: 0x000014e495da7ee7
    frame #67: 0x000014e4974dfcc1
    frame #68: 0x000014e495d635c9
    frame #69: 0x000014e495dc60e3
    frame #70: 0x000014e495d385e7
    frame #71: 0x000014e493e318fd
    frame #72: 0x000014e493e15b62
    frame #73: 0x00000001002d34c8 node`v8::internal::Invoke(bool, v8::internal::Handle<v8::internal::JSFunction>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*) + 728
    frame #74: 0x000000010015f4c4 node`v8::Function::Call(v8::Local<v8::Context>, v8::Local<v8::Value>, int, v8::Local<v8::Value>*) + 276
    frame #75: 0x000000010064f6f3 node`node::AsyncWrap::MakeCallback(v8::Local<v8::Function>, int, v8::Local<v8::Value>*) + 893
    frame #76: 0x000000010068ae90 node`node::StreamBase::EmitData(long, v8::Local<v8::Object>, v8::Local<v8::Object>) + 224
    frame #77: 0x00000001006b198e node`node::TLSWrap::ClearOut() + 228
    frame #78: 0x00000001006b26c8 node`node::TLSWrap::DoRead(long, uv_buf_t const*, uv_handle_type) + 112
    frame #79: 0x000000010068db87 node`node::StreamWrap::OnReadCommon(uv_stream_s*, long, uv_buf_t const*, uv_handle_type) + 127
    frame #80: 0x000000010078f87e node`uv__stream_io + 1235
    frame #81: 0x0000000100797004 node`uv__io_poll + 1621
    frame #82: 0x00000001007880df node`uv_run + 321
    frame #83: 0x0000000100661aa1 node`node::Start(int, char**) + 735
    frame #84: 0x0000000100001834 node`start + 52

@mjmasn Thanks for providing that. That does look like it's stemming from the meteor tool itself, and your own app is simply being terminated (indicated as a SIGPIPE term, since it's a child process) as a by-product of that SIGABRT. This project is upgraded to Meteor 1.4.4.2, correct? (assuming based on #8630). The libuv involvement certainly makes me lean toward file watching issues, something you brought up recently in https://github.com/meteor/meteor/issues/8002#issuecomment-297612039.

@abernix: yep, yep and yep ;)

Another possibly related issue (relating to gulp, so it could be another vote for this being a file watching issue): https://github.com/nodejs/node/issues/10163. This could definitely be a bug in node / libuv...

Is there anything I can do to dig deeper into this? I'm out of the country for a week without my laptop after today but happy to do anything that might help at the back end of next week :)

@mjmasn Node 6 still uses the same version of libuv, otherwise I'd suggest taking a very experimental walk through https://github.com/meteor/meteor/pull/6923. This should in _NO_ way be suggested as a work-around for anyone experiencing this problem, as Meteor 1.6 is currently extremely experimental and unsuitable for development or production. You could consider applying the patch in the issue you linked to, though you'd need to do some even more experimental building of the dev bundle with the patched version, by making modifications to generate_dev_bundle.sh. It's a bit of the wild west if you choose to embark on that adventure though. :)

I got this with lldb:

Process 65164 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
    frame #0: 0x00007fffcaeddd96 libsystem_kernel.dylib`kevent + 10
libsystem_kernel.dylib`kevent:
->  0x7fffcaeddd96 <+10>: jae    0x7fffcaeddda0            ; <+20>
    0x7fffcaeddd98 <+12>: movq   %rax, %rdi
    0x7fffcaeddd9b <+15>: jmp    0x7fffcaed5caf            ; cerror_nocancel
    0x7fffcaeddda0 <+20>: retq
(lldb) continue
Process 65164 resuming
Process 65164 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGUSR2
    frame #0: 0x00007fffcaedd1ce libsystem_kernel.dylib`__sigsuspend + 10
libsystem_kernel.dylib`__sigsuspend:
->  0x7fffcaedd1ce <+10>: jae    0x7fffcaedd1d8            ; <+20>
    0x7fffcaedd1d0 <+12>: movq   %rax, %rdi
    0x7fffcaedd1d3 <+15>: jmp    0x7fffcaed5cd4            ; cerror
    0x7fffcaedd1d8 <+20>: retq
(lldb) continue
Process 65164 resuming
Process 65164 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGUSR2
    frame #0: 0x00007fffcaedd1ce libsystem_kernel.dylib`__sigsuspend + 10
libsystem_kernel.dylib`__sigsuspend:
->  0x7fffcaedd1ce <+10>: jae    0x7fffcaedd1d8            ; <+20>
    0x7fffcaedd1d0 <+12>: movq   %rax, %rdi
    0x7fffcaedd1d3 <+15>: jmp    0x7fffcaed5cd4            ; cerror
    0x7fffcaedd1d8 <+20>: retq
(lldb) continue
Process 65164 resuming

@abernix I was able to catch a couple of segmentation faults but the signal I'm getting is different: EXC_BAD_ACCESS, apparently related to garbage collection. I got the signal a couple of times followed by a segmentation fault.

This is the frame: v8::internal::MarkCompactCollector::ProcessWeakCollections() + 302

I just noticed that right now I got the segmentation fault when I tried to load a template from another one and it had a typo in the name - I wrote "+loading" and it was "+loadingView".

I'm using Jade instead of HTML.

1.4.2.3 and getting Abort trap 6 several times a day.

Chiming in here - this is happening to me as well, about 2-4 times a day while developing. Running on Linux 14.04, Meteor version 1.4.4.2. The exact segfault error message I receive each time is:

30026 Segmentation fault      (core dumped) meteor run 

Next time it happens I will try to capture a core dump

Been happening for a few weeks now, ever since I updated to 1.4.4.2. Several times a day, when refreshing after file change. Mac Sierra.

Client modified -- refreshing (x10)Segmentation fault: 11

Today I grabbed a core dump and lldb trace (thanks to @rlora for detailing how to do this). I'm on Ubuntu 16.04, Meteor 1.4.4.2.

I don't know what to actually do with these, but are they of use? If so, is this the appropriate place to add them?

Same issue here. Developing on Linux, I get regular segfaults on code reload.
Please note that RAM usage rushes from ~2 GB (normal) to the maximum (4 GB) before the core dump, making the whole PC freeze for 10-20 seconds. I can't do anything during that time, but I'm able to see this on my desktop widget.

Although it sometimes takes a while, I can relatively reliably reproduce the Abort issue. Here's my stack trace (pretty much the same as ones above):

Process 18226 stopped
* thread #8, stop reason = signal SIGABRT
    frame #0: 0x0000000101818d42 libsystem_kernel.dylib`__pthread_kill + 10
libsystem_kernel.dylib`__pthread_kill:
->  0x101818d42 <+10>: jae    0x101818d4c               ; <+20>
    0x101818d44 <+12>: movq   %rax, %rdi
    0x101818d47 <+15>: jmp    0x101811caf               ; cerror_nocancel
    0x101818d4c <+20>: retq   
(lldb) bt
* thread #8, stop reason = signal SIGABRT
  * frame #0: 0x0000000101818d42 libsystem_kernel.dylib`__pthread_kill + 10
    frame #1: 0x0000000101960457 libsystem_pthread.dylib`pthread_kill + 90
    frame #2: 0x000000010174a420 libsystem_c.dylib`abort + 129
    frame #3: 0x0000000100792652 node`uv_cond_wait + 20
    frame #4: 0x000000010078612b node`worker + 227
    frame #5: 0x00000001007922b0 node`uv__thread_start + 25
    frame #6: 0x000000010195d93b libsystem_pthread.dylib`_pthread_body + 180
    frame #7: 0x000000010195d887 libsystem_pthread.dylib`_pthread_start + 286
    frame #8: 0x000000010195d08d libsystem_pthread.dylib`thread_start + 13
(lldb) up
(lldb) up
(lldb) up
frame #3: 0x0000000100792652 node`uv_cond_wait + 20
node`uv_cond_timedwait:
    0x100792652 <+0>: pushq  %rbp
    0x100792653 <+1>: movq   %rsp, %rbp
    0x100792656 <+4>: subq   $0x10, %rsp
    0x10079265a <+8>: movq   %rdx, %rcx

Note that I found it useful to disable stopping on SIGUSR2, which is not an error:

(lldb) pro hand -p true -s false SIGUSR2
NAME         PASS   STOP   NOTIFY
===========  =====  =====  ======
SIGUSR2      true   false  true

Executing 'frame v' gives no output (due to the absence of debug info, presumably).
Any suggestions on the next steps? Would running with symbols help? If so, where do I get them?

Updated to 1.5 but still experiencing frequent segmentation faults

@sdarnell Thanks for that trace. Would you be interested in making a more extreme change and following the lines of what I suggested in this comment above? Note that #6923 has been superseded by https://github.com/meteor/meteor/pull/8728.

You could use that Node.js 6 branch from the new PR (release-1.6) but build with Node.js 8, by rebuilding the dev bundle after changing the NODE_VERSION here. I'm not suggesting going as far as trying to apply the patch, but the libuv involvement makes me want to generally suggest trying a newer Node.js (and with it a newer libuv), which Node.js 8 has (1.11.0 vs 1.9.1).

@abernix I had been trying to reproduce with debug symbols on 4.8.3 but hit a seemingly unrelated problem before I get to the abort: https://github.com/nodejs/node/issues/13351

As part of testing that issue, I had been running release 1.6-alpha.0 with node 6.10.3. Although I did a bit of testing I didn't get an abort. But I switched to a debug build of node 6.10.3 and it was unbelievably slow. I was also having issues with the client restarting (even without the debug build).

I'll try to find some time over the next few days to drop in a 8.x node. Strikes me it is worth running with the 6.x node for a while first though, just to confirm that it does/doesn't happen there too.

Running 1.5, still getting "Segmentation fault" when building for linux.x86_64.

Upgraded to 1.5; it worked fine for a few hours. Now almost every code change causes segmentation fault 11.
I'm on Mac 10.12.2.

@sdarnell With the identical warning as in my comment above: Meteor 1.6 is currently in an alpha stage and, as of this comment (and the 1.6-alpha.1 release), is running Node.js 8 (:tada:). If you want to give it a shot and see how the abort trapping goes, 1.6 will save you the hassle of having to build your own dev bundle!:

meteor update --release 1.6-alpha.1

(Follow the https://github.com/meteor/meteor/pull/8728 PR for updates on Meteor 1.6)

@mattandwhatnot, @hotelratepro, others: I would have no expectation of this problem going away with Meteor 1.5, as none of the dependencies (libuv, Node.js, etc.) that might be responsible have changed.

@abernix Thanks for the release. I've been running on 1.5 for the past few days, and I must admit that the 'abort gods' have been smiling on me so far. But I'll switch to the 1.6-alpha.1 for a while to see how that goes.

Sadly the 'abort gods' stopped smiling and I just got an 'Abort trap: 6' running 1.6-alpha.1 (that is, node v8.0.0). I didn't have lldb attached, so I can't confirm the stack trace matches uv_cond_wait, but I'll run with lldb attached in future.

I have very little evidence, but there seems to be some correlation with the aborts happening shortly after waking the laptop from sleep. Maybe the file system monitoring code or file system handles sometimes become errored/invalid/corrupt across a sleep?

I could be completely ignorant, but maybe you're onto something with the file system monitoring. I find mine aborts more often after saving large changes than small ones. I don't know about waking from sleep, though; I'm running in an always-up Ubuntu container on AWS.

Is this impacting dev only, since it only shows up after a code change? Or do people see this in production as well? I'm trying to figure out whether I should roll back to 1.4.2.3, my previous version that worked fine.

@hotelratepro this is definitely a dev only issue, assuming by production you mean deploying a bundle created with meteor build. If you're running meteor --production on the server (and nobody should be doing this btw!) you may also see this in 'production'.

Just adding confirmation that the node v8.0.0 abort is the same as the one seen on node v4.8.3 (llnode doesn't give any extra info either):

(lldb) v8 bt
 * thread #11: tid = 0x5b36cf, 0x00000001024edd42 libsystem_kernel.dylib`__pthread_kill + 10, stop reason = signal SIGABRT
  * frame #0: 0x00000001024edd42 libsystem_kernel.dylib`__pthread_kill + 10
    frame #1: 0x0000000102635457 libsystem_pthread.dylib`pthread_kill + 90
    frame #2: 0x000000010241f420 libsystem_c.dylib`abort + 129
    frame #3: 0x0000000100be4fe9 node`uv_cond_wait + 20
    frame #4: 0x0000000100bd88d7 node`worker + 227
    frame #5: 0x000000010263293b libsystem_pthread.dylib`_pthread_body + 180
    frame #6: 0x0000000102632887 libsystem_pthread.dylib`_pthread_start + 286
    frame #7: 0x000000010263208d libsystem_pthread.dylib`thread_start + 13

A backtrace on all threads doesn't really show anything interesting; they're all sitting waiting on mutexes or semaphores. There's a pathwatcher thread waiting on a kevent.

Having the same problem with Meteor 1.5 on macOS 10.11.6. Other members of my team report the same.

Has anyone come up with a durable fix or workaround?

Constant segfaults on 1.5 using the meteor run command. Ubuntu 16.

I have noticed it tends to happen more often when viewing the app in Desktop Chrome.

@zed-apps @abuddenb Are you saying that this did not happen to you at all on Meteor 1.4.4.3?

@abernix Segfaults happened ~once/day on Meteor 1.4.4.3. It happens multiple times per hour on Meteor 1.5; usually on code changes.

^^ what he said

I've been trying various node builds, but due to the debugging issues with 1.5 I've been running with Meteor 1.4.4.2 (node 4.8.2) for a while. Unfortunately, I didn't have lldb attached this time to confirm where the final abort occurred. But all other times it has been uv_cond_wait.

This time I got an assert going off before the abort:

=> Client modified -- refreshing (x3)Assertion failed: (loop->watchers[w->fd] == w), function uv__io_stop, file ../deps/uv/src/unix/core.c, line 888.
Abort trap: 6

where the assert is:

  if (w->pevents == 0) {
    QUEUE_REMOVE(&w->watcher_queue);
    QUEUE_INIT(&w->watcher_queue);

    if (loop->watchers[w->fd] != NULL) {
      assert(loop->watchers[w->fd] == w);  // <==== line 888
      assert(loop->nfds > 0);
      loop->watchers[w->fd] = NULL;
      loop->nfds--;
      w->events = 0;
    }
  }

Without the debugger attached I doubt there's much to be gleaned from the above (although some form of memory corruption can explain just about anything).

Another related update is that I've been testing other builds of node with some extra checking, and was able to show that the abort is (usually, for me) due to pthread_cond_wait() returning EINVAL; from my debugging, the obvious causes of EINVAL do not seem to apply.
Extra details in: https://github.com/nodejs/node/issues/10163#issuecomment-307639005

It turns out that the loop->watchers assert seems to be a known symptom when clients of libuv watch the same fd twice. The libuv issue is: https://github.com/libuv/libuv/issues/1172

I didn't notice this problem at all before updating to 1.5 on Sierra. Now I get it around every 3-4 refreshes. That's unbearable.

A few weeks ago, in a message I ended up deleting, I was saying Meteor 1.5 seemed to have solved the problem on my current development project... which was wrong, as I noticed after a few days of even more frequent segmentation fault: 11 errors.

If it can be of any help, here is a possible explanation for this misleading first impression : right before upgrading to 1.5, as a sort of "cleaning mantra" before every important Meteor release, I had deleted four elements in .meteor/local/: build, bundler-cache, plugin-cache and dev_bundle.

Last week, my project had reached such a crazy error ratio as 1 each 2 code changes : very disturbing, especially when working on a large project that takes a long time to restart, so I decided to test a few workaround approaches - and as a matter of fact, the very first one worked : I re-deleted the same four elements this morning, and all of a sudden the errors disappeared - at least for a whole working day with hundreds of server and client code changes, which is so much more comfortable than the last weeks' experience of up to 50% crashes.

Could others confirm this workaround? If so, perhaps we could try a more incremental approach to locate the possible culprit(s).

@sdarnell What are you doing to be able to attach lldb to the process? Do you need a debug build of Node? (Thanks for investigating, by the way!)

I have been just attaching lldb without a debug build (though at times in the past I have also used debug builds of node). For clarity, here is what I have been doing:

meteor --settings settings.json
... wait until things are going, then in separate window ...
ps -ef | grep meteor | grep settings
lldb -p 89226           (where 89226 is process id from ps|grep above)
process handle SIGUSR2 -n true -p true -s false       (I actually just type Ctrl+R and Enter)
break set -n abort
c

then continue working and wait for the abort, which almost always occurs at the point of a client refresh. For me, reproducibility has been on and off, and I would say that it occurs more with non-trivial client changes. I also have a fairly full stack of stuff (including TypeScript + angular-meteor) involved in the (re-)compilation process.

Following suggestions from the node team (bnoordhuis), I'd added some extra logging (fprintf's) to libuv to trace the error codes from the pthread calls, but otherwise it is a plain node executable.

Maybe I'm not writing code that challenges the TypeScript compiler enough, but recently I've not been getting the aborts very often, which is great but really awkward for trying to nail down the source of the problem.

@JanMP A friend of mine reported something similar: the aborts started happening when they updated to a newer version of macOS, though this was a little while ago. However, very similar issues also occur on Linux, so either there are two issues, or the macOS update made the system more susceptible to the problem.

@YannLeBihan it is an interesting possibility that it is somehow related to the files in the .meteor/local directory. If, for whatever reason, those files were being monitored for changes, that might create extra pressure on handle resources, or simply slow things down. Occasionally I've had half-baked builds left around in that directory. TypeScript (via angular-meteor/meteor compilers) also creates a hidden directory, .typescript-cache, which can accumulate a fair amount.
Here's my current situation:

.meteor/local $ for d in build bundler-cache plugin-cache .typescript-cache; do echo "$d `find $d | wc -l`" ; done 
build      776
bundler-cache      530
plugin-cache      447
.typescript-cache     8034

As I sometimes need to switch branches and meteor versions, I will occasionally clear out the local directory so maybe that's why it's no longer happening for me (I'm not sure).

Deleting the caches worked. No more segmentation fault 11. But the speed of refreshing pages is back to where it was in the bad old days (it takes around 10 seconds after saving the file for the refreshing message to show up in the terminal).

Getting a similar issue with meteor deploy. 1 in 3 or 4 of our deploys result in segfaults and we're unable to pinpoint since deploys run on clean CircleCI machines (Ubuntu 14.04) which we can't SSH into without restarting. We had this issue with both Meteor 1.4 and 1.5.

Is there any insight on a potential common cause? I didn't provide much info, but definitely can if necessary @benjamn. Would just like to get an idea of what we're dealing with before starting debug efforts.

Segmentation Fault 11 is back, but only every now and then (not on every other refresh like it used to before deleting the caches). I had my first Segmentation Fault 11 before installing 1.5.1 rc3 and still got them with the rc3. Refresh speed is bearable with rc3.

Sometimes it goes for like an hour, and right now it's failing on nearly every single file change; Sass or JS doesn't seem to matter, and the file size or nestedness of the components and files also seems random.

tried all the different versions of node with no changes:
node: 7.2.1
meteor: 1.5

lldb dump

Process 82307 stopped
* thread #1: tid = 0x134e4c9, 0x00007fffd8698e2a libsystem_kernel.dylib`kevent + 10, name = 'npm', queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
    frame #0: 0x00007fffd8698e2a libsystem_kernel.dylib`kevent + 10
libsystem_kernel.dylib`kevent:
->  0x7fffd8698e2a <+10>: jae    0x7fffd8698e34            ; <+20>
    0x7fffd8698e2c <+12>: movq   %rax, %rdi
    0x7fffd8698e2f <+15>: jmp    0x7fffd8690cdf            ; cerror_nocancel
    0x7fffd8698e34 <+20>: retq   

Executable module set to "/Users/kevingreen/.nvm/versions/node/v7.2.1/bin/node".
Architecture set to: x86_64-apple-macosx.
(lldb) continue
Process 82307 resuming
Process 82307 stopped
* thread #1: tid = 0x134e4c9, 0x00007fffd8697712 libsystem_kernel.dylib`__kill + 10, name = 'npm', queue = 'com.apple.main-thread', stop reason = signal SIGSEGV
    frame #0: 0x00007fffd8697712 libsystem_kernel.dylib`__kill + 10
libsystem_kernel.dylib`__kill:
->  0x7fffd8697712 <+10>: jae    0x7fffd869771c            ; <+20>
    0x7fffd8697714 <+12>: movq   %rax, %rdi
    0x7fffd8697717 <+15>: jmp    0x7fffd8690cdf            ; cerror_nocancel
    0x7fffd869771c <+20>: retq   
(lldb) continue
Process 82307 resuming
Process 82307 stopped
* thread #1: tid = 0x134e4c9, 0x00007fffd8697712 libsystem_kernel.dylib`__kill + 10, name = 'npm', queue = 'com.apple.main-thread', stop reason = signal SIGSEGV
    frame #0: 0x00007fffd8697712 libsystem_kernel.dylib`__kill + 10
libsystem_kernel.dylib`__kill:
->  0x7fffd8697712 <+10>: jae    0x7fffd869771c            ; <+20>
    0x7fffd8697714 <+12>: movq   %rax, %rdi
    0x7fffd8697717 <+15>: jmp    0x7fffd8690cdf            ; cerror_nocancel
    0x7fffd869771c <+20>: retq   
(lldb) continue
Process 82307 resuming
Process 82307 stopped
* thread #1: tid = 0x134e4c9, 0x00007fffd8697712 libsystem_kernel.dylib`__kill + 10, name = 'npm', queue = 'com.apple.main-thread', stop reason = signal SIGSEGV
    frame #0: 0x00007fffd8697712 libsystem_kernel.dylib`__kill + 10
libsystem_kernel.dylib`__kill:
->  0x7fffd8697712 <+10>: jae    0x7fffd869771c            ; <+20>
    0x7fffd8697714 <+12>: movq   %rax, %rdi
    0x7fffd8697717 <+15>: jmp    0x7fffd8690cdf            ; cerror_nocancel
    0x7fffd869771c <+20>: retq   
(lldb) continue

Working on this project is becoming very troubling since it crashes even if I change a single line of CSS. I have not tried the alpha yet, since it seems like others reported it was still crashing.

Anything we can do about this? I'm stuck developing on a Mac this summer, and this really takes the fun out of developing on Meteor.

@fullhdpixel @peteli3 @JanMP @benjamn

So I did a serious audit of my codebase today. My bundle size was around 7 MB unminified; I cleaned house and removed a lot of code that wasn't actually being used, including some legacy code that was injected before proper npm packages. Some of those files were really big, and I think that was causing a lot of the reload issues.

I've got my bundle size down to 2.48 MB unminified; I was lucky enough to be able to easily remove tons of extra packages and large legacy files from early development cycles. I have my app running on more than one computer and haven't seen any more segfaults since optimizing the codebase.

Curious to hear about the size of your app.js files which can be found here:

/app/.meteor/local/build/programs/web.browser/app/

I used source-map-explorer to find the worst offenders bloating my bundle, but you can also use bundle-visualizer (a Meteor package).

_for reference_
my meteor packages: https://gist.github.com/iamkevingreen/c0a1ebcbdc997e8b12808f3c7d1215d4
my package.json: https://gist.github.com/iamkevingreen/f0b9dbfd428d3797565d438d95dc324a

@iamkevingreen I think you're on to something.

I went through my codebase and removed unused files, node modules, and meteor packages. Since then I've only experienced one segmentation fault in 24 hours (instead of several per hour). So perhaps this issue is related to bundle size?

@iamkevingreen definitely with you on this. I'm actually in the middle of a few things, so I won't be able to provide an unbloated size. With that said, our app.js is at 9.3 MB right now, and segfaulting substantially less after cleaning out almost 20% of our build.

The issue was Fastclick meteor package. Remove it and the segfaults stop.

@zed-apps as fastclick is a relatively small, client-side-only package, that seems very strange, unless of course the re-compilation/bundling of it somehow causes the issue. If you add it back, do the segfaults start again?

@zed-apps @sdarnell I don't think it's fastclick either. After cleaning out our build, we are segfault-free, while still having fastclick installed.

@sdarnell @Hanley1 It worked for me, no further cleanup needed. I'll let y'all know if they start coming back though.

New project and they're back. This time using F7/Vue, but no fastclick :-1:

A problem on refresh that I identified while investigating the aborts on macOS involves the websocket-driver package (used by faye-websocket, which in turn is used by ddp-client): it was possibly causing corruption, and at least was triggering a debug assert when run with a debug build of Node.js. See https://github.com/faye/websocket-driver-node/issues/21

@benjamn That issue has now been fixed, so I think it would be a good idea to add a task to switch to the new faye-websocket package when it is released, as this has the potential to fix the segfault issues on refresh (or at least some of them).

From the change description, it also seems that websocket-driver was using some constants that are not appropriate for newer Node.js versions. So that's another reason to update.

Yeah, the faults came back out of nowhere and it's sort of back to the wild west. I do have fastclick installed as a nested dep from another package, so I'm not sure if I can remove it without having to refactor something else.

@sdarnell is there a way we can switch to the updated driver now somehow?

Well, I don't think it's directly related to fastclick or such; I never used it, and I have also had this problem for a long time.

@sdarnell Great find! Thanks for reporting this with faye/websocket-driver-node. It looks like the fix hasn't been released yet, but hopefully they'll get a new release out soon. To make sure upgrading faye-websocket isn't lost in this large issue thread, would you mind opening a new FR (meteor/meteor-feature-requests) requesting this?

@iamkevingreen the simple answer is no, there isn't an easy way of switching in the changes until the updated code gets released and incorporated into meteor.

However, in the interests of trying to establish whether this fixes the segfaults and given I'm not able to reproduce it myself, I'm attaching a replacement websocket-driver package. But note that this involves monkey patching your meteor installation, so I STRONGLY advise against this unless you really know what you're doing, and undo the changes before deploying etc.

Find out which meteor-tool you're using by starting meteor, and then doing a ps -ef | grep meteor.
For my test this is /Users/steve/.meteor/packages/meteor-tool/.1.5.0.q9pe82.mj0xhzd7vi++os.osx.x86_64+web.browser+web.cordova/mt-os.osx.x86_64/tools/index.js. The .1.5.0...etc is the directory you need next.
Given the path to the meteor-tool, there will be a directory under it called websocket-driver; use find ...meteor-tool.path... -type d -name websocket-driver. The one you want is the one in node_modules, e.g.: /Users/steve/.meteor/packages/meteor-tool/.1.5.0.q9pe82.mj0xhzd7vi++os.osx.x86_64+web.browser+web.cordova/mt-os.osx.x86_64/isopackets/ddp/npm/node_modules/meteor/ddp-client/node_modules/faye-websocket/node_modules/websocket-driver
Move this directory sideways and put in the files from the attached websocket-driver.zip

I was redirected here after #8945 was marked as duplicate of this one.

As you can read there, for me it crashes with segmentation fault 11.

I can confirm what @YannLeBihan wrote at https://github.com/meteor/meteor/issues/8648#issuecomment-311177605. It takes quite a while until it fails again if I delete the mentioned folders in .meteor/local/.

Another interesting piece of information might be that this holds for both meteor and meteor test. By that I mean that I can run the app in development mode (meteor) for quite a while before it starts crashing, and when I then switch from development to testing (meteor test), I also get quite a while without crashes there. If I then remove the folders in .meteor/local again, I again get some time without any errors in both development and test.

The crash-report I get using lldb is the same as https://github.com/meteor/meteor/issues/8648#issuecomment-299608260 - means, that I also get the following:

* thread #1: tid = 0x8a76a2, 0x000000010035a64e node`v8::internal::MarkCompactCollector::ProcessWeakCollections() + 302, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x241eccf03d38)
    frame #0: 0x000000010035a64e node`v8::internal::MarkCompactCollector::ProcessWeakCollections() + 302
node`v8::internal::MarkCompactCollector::ProcessWeakCollections:
->  0x10035a64e <+302>: movl   0xa8(%r12,%rax), %eax
    0x10035a656 <+310>: movzbl %cl, %ecx
    0x10035a659 <+313>: btl    %ecx, %eax
    0x10035a65c <+316>: jae    0x10035ab40               ; <+1568>

Trying to let lldb continue always ends up in the same error. The app doesn't keep running when continuing, as described in https://github.com/meteor/meteor/issues/8648#issuecomment-300066870.

@sdarnell I've tried replacing the websocket-driver, but it didn't help in my case...

For the record: I use Meteor v1.5.1 and macOS El Capitan v10.11.6 on a Mid 2012 MacBook Pro Retina 13 (2.3GHz i7, 8 GB RAM).

I've now used meteor 1.6-beta.15 and the error hasn't come in quite some hours. I'll ping you back if it crashes again.

On OS X 10.12.5
I do a fresh install of meteor and then try to run
meteor create foo
and get
Segmentation fault: 11

Meteor 1.5.1 Ubuntu 16.04 LTS. Just started noticing after upgrading my 1.2.x app to 1.5.x and modifying files. Glad to hear it doesn't happen in production on Galaxy.

@abernix I've now had Meteor 1.6-beta.15 for quite a while, and I can confirm that the rebuild of tests doesn't crash the way it does in Meteor 1.5.1. So it could well be that my issue is fixed by the libuv update that comes with Node 8.x.

I can only confirm this for the Segmentation fault: 11 errors; I've never had the Abort trap: 6 ones others mentioned here.

For me, this started about the same time as this.

Do you experience this when changing multiple files in one "build"? (e.g. find/replace)
Are you also experiencing cached code (when changing multiple files)?

So I noticed something else strange, but maybe helpful to everyone in here.

I've been moving/working on code inside of the imports directory in the project root, and I still have my routes and all of that in client/. Code changes in imports don't actually refresh the app; they restart meteor instead, which does not crash or segfault. I worked on it for something like 3-4 hours today and it never segfaulted until I created a new file in the client/ directory. Other than that, I'm working fairly pain-free in the imports directory šŸ¤”

I moved from Blaze to React, and now a segmentation fault error is very, very rare, while on Blaze it was at least 10 times a day.

I'm getting those while re-compiling a lot, too... Sometimes the usual 'Segmentation fault: 11', and now often with a bit of extra text at the end, like 'Segmentation fault: 11owser'.

Great news - check out the recently accepted node PR https://github.com/nodejs/node/pull/14829, followed by https://github.com/meteor/meteor/pull/9031. Thank you @abernix and @benjamn!

I tried the newest beta release 1.5.2-beta.13 and for a while I thought this got fixed, but...

=> Client modified -- refreshing (x16)
I20170823-13:29:14.856(3)? Serving logger on /logger
I20170823-13:29:14.871(3)? Processing jobs disabled.
=> Meteor server restarted
I20170823-13:30:39.247(3)? Serving logger on /logger
I20170823-13:30:39.263(3)? Processing jobs disabled.
=> Meteor server restarted
=> Client modified -- refreshing (x51)Segmentation fault: 11
arggh@ šŸš€  :~/Development/app$ meteor --version
Meteor 1.5.2-beta.13                          
arggh@ šŸš€  :~/Development/app$ meteor node --version
v4.8.4
arggh@ šŸš€  :~/Development/app$

@arggh That's unfortunate, though I know it has certainly fixed other segmentation faults, so your particular case must be different. If you run ulimit -c unlimited on your (Mac? *nix?) machine and continue to develop until it happens again, it should write a core dump (the default configuration usually doesn't, as ulimit -c is set to 0). I won't advocate publicly posting a core dump unless you're confident it doesn't contain sensitive or proprietary information, but if you'd be willing to upload and share it with me privately, I will try to take a look at it. I can be messaged on the Meteor forums with the same username. It'd also be important to let me know the exact version of Meteor which produced the core dump.

I'd also suggest trying Meteor 1.6 beta releases if that's an option for you!

Testing on 1.5.2-rc.2, will edit with results.

EDIT: So far so good, we had really frequent segmentation faults and haven't had any yet

I would say that this issue is lessened with Meteor 1.5.2, but that we're not out of the woods. Specifically, I _believe_ the

v8::internal::MarkCompactCollector::ProcessWeakCollections()

crash should be fixed thanks to #9031, however, there does seem to be another issue (also in similar V8 garbage-collection code) which (based on information I've analyzed privately with @arggh) appears to be the same as https://github.com/nodejs/node/issues/3715 and https://bugs.chromium.org/p/chromium/issues/detail?id=408380 which have both been dismissed upstream due to lack of reproduction.

Anyone continuing to experience crashes with Meteor 1.5.2, please do write back and provide any additional information you might have about your segmentation fault. If you're on macOS, consider posting (as a Gist and linking here) the file generated in ~/Library/Logs/DiagnosticReports/node_<TIMESTAMP>_<HOSTNAME>.crash after the segmentation fault occurs.

I'll leave this particular issue open until @arggh or I are able to write up a new issue for the v8::internal::PointersUpdatingVisitor::VisitPointer failure. And truly, thank you @arggh for working diligently and thoroughly to help me diagnose his segmentation faults. šŸ‘

Hmm, I'm seeing this a few times a day now too (OSX):

=> App running at: http://localhost:3000/
=> Client modified -- refreshing (x4)
=> Meteor server restarted                    
=> Client modified -- refreshingSegmentation fault: 11
$  meteor --version
Meteor 1.5.1  
$ meteor node --version
v4.8.4

Updating to 1.5.2 now, will report back if issue persists.

@danwild and others: If you're experiencing this problem on a regular basis, as a matter of experimentation, would you please try setting the TOOL_NODE_FLAGS environment variable to --no-expose-gc and see if the problem goes away?

On macOS this can be done with:

$ TOOL_NODE_FLAGS="--no-expose-gc" meteor ... # usual arguments here.

Reason being: the comments I listed here. Meteor does use the --expose-gc flag in order to afford ourselves better management of memory within the development process; however, it may be causing the problems you're experiencing, presumably because of an issue in V8 (as those referenced issues allude to).

By passing --no-expose-gc, you'll override our enabling of this (which we can detect) but potentially increase your memory usage (though, hopefully not to the point where you'll be exceeding any memory allocations).
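For context: --expose-gc makes a global gc() function available, which is how a tool can tell whether it is allowed to force collections. A minimal sketch of that detection (maybeCollectGarbage is a hypothetical helper, not Meteor's actual code):

```javascript
// Under --expose-gc, V8 exposes a global gc() function; under
// --no-expose-gc (or by default) it is undefined.
const canForceGC = typeof global.gc === "function";

function maybeCollectGarbage() {
  if (canForceGC) {
    global.gc(); // request a full collection explicitly
    return true;
  }
  return false; // otherwise let V8 schedule collections itself
}
```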

/cc @arggh this should be an easier way of accomplishing the same change I had you make to your meteor command in your dev_bundle. :wink:

Oh, and please do report back with your findings, @danwild!

Well, two days of dev since upgrading to Meteor 1.5.2 and I haven't seen any segfaults.
The update seems to have fixed the issue for me - thanks all!

@danwild Is that with the TOOL_NODE_FLAGS="--no-expose-gc" environment variable, or just by upgrading to Meteor 1.5.2 alone?

Just the Meteor 1.5.2 update, @abernix; I hadn't seen the error again, so I didn't seem to need --no-expose-gc.

+1 on Meteor 1.5.0. Today I got the SIGABRT for the first time without any reload; it occurred after the server received data via the websocket connection (a method call).

I'm getting segmentation faults in production, apparently in code that uses Set and Map.
When I started running the production code with Meteor 1.5.2.2 using the command line "meteor node main.js", the crashes stopped (before, I was using mup with the abernix/meteord base image).
Is the node bug supposed to happen only when running Meteor in development?

@sirpy, I don't have the answer, but I'm interested in it. Could you show some example code: Set/Map from what?

I think I isolated the crashes to a section of code in a cron job (using the syncedcron package) where the change before it started crashing was switching from regular JavaScript arrays/objects to the new JavaScript Set/Map objects.
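For illustration only (this is not @sirpy's actual code), the kind of change described, swapping a plain-object "seen" table for a Set in dedup logic, might look like this; note the crash was in V8's garbage collector, not in logic like this:

```javascript
// Before: tracking seen ids with a plain object.
function dedupeWithObject(ids) {
  const seen = Object.create(null); // null prototype avoids clashes with keys like "constructor"
  return ids.filter((id) => (seen[id] ? false : (seen[id] = true)));
}

// After: the Set-based equivalent of the same logic.
function dedupeWithSet(ids) {
  const seen = new Set();
  return ids.filter((id) => !seen.has(id) && !!seen.add(id));
}
```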


I've tried running the server with Meteor's patched node 4.8.4 (meteor node main.js), and the segmentation fault crashes stopped.

@sirpy Glad to hear that! For what it's worth, Node 4.8.6 now officially contains the Meteor-supplied fix which we patched into Meteor's Node 4.8.4. See https://github.com/meteor/meteor/pull/9320 for more information.

Reporting just in case: Meteor 1.6.1 and still getting Abort trap: 6 randomly, about once or twice a day. Now Meteor is also sometimes crashing without an error message, as if Ctrl+C was pressed. I'm observing this on all three apps I'm currently working on: one of them huge and two tiny.

@arggh Was this with a deployed version of a Meteor 1.6.1 app? (That is to say, a built app being run with node directly?) If so, which Node.js version?

@abernix Nope, they happen during local development. I haven't yet deployed with Meteor 1.6.x to any actual live app, should I expect/fear these crashes to appear in the deployed version with 1.6.1?

I started facing this today; it happens approximately every 10 minutes. I did not do a meteor update today, just an Ubuntu update (I am on the Bionic beta), so it is highly likely that the cause is OS-specific, especially since I am also getting Chrome segfaults.

Still, since I am at a loss on figuring out a fix, any help would be totally appreciated.
Meteor: v1.6.1.1
Meteor node: v8.11.1

--no-expose-gc does not help

segfault at d ip 00007f721ac9646c sp 00007fff2d860c50 error 4 in libnode.so[7f721a116000+137e000]
trap invalid opcode ip:7fd3aa039d88 sp:7fd3a4af3288 error:0 in libc-2.27.so[7fd3a9eaf000+1e7000]

I am also seeing that it's OS-specific; it also seems to point directly at the Session package?

It still happens on the latest 1.8.0.2 release.
You can have a look at the repo here: https://github.com/rikyperdana/petbum
This single problem kills the mood to continue developing in Meteor.
