We are seeing some crash reports that are missing the native frames array ("unmanaged_frames") for the crashed thread.
We are able to reproduce it using this debug command in MonoDevelop: https://github.com/mono/monodevelop/pull/9049
We see managed and native frames for every thread except the one labeled "crashed = true", which is missing its native frames.
We expect to see the backtrace for all threads, including (most importantly) the crashed thread.
[x] macOS
[ ] Linux
[ ] Windows
Version Used:
Mono JIT compiler version 6.6.0.126 (2019-08/8969f2cc99b Mon Oct 14 18:19:47 EDT 2019)
Copyright (C) 2002-2014 Novell, Inc, Xamarin Inc and Contributors. www.mono-project.com
TLS:
SIGSEGV: altstack
Notification: kqueue
Architecture: amd64
Disabled: none
Misc: softdebug
Interpreter: yes
LLVM: yes(610)
Suspend: hybrid
GC: sgen (concurrent by default)
@lambdageek I filed this for bookkeeping. The fix is in https://github.com/mono/mono/pull/17466 and we can close this when the backport to 2019-08 completes. Thanks!
Seems like the fix works:
Crashing thread before: https://gist.github.com/kdubau/50a2b627799d31f4e42bc747315b8fcf#file-gistfile1-txt-L503-L706 (missing "unmanaged_frames" array).
And after: https://gist.github.com/kdubau/387c9a6fe9ebf8b36dc0602b41971645#file-gistfile1-txt-L640-L1083 (has the frames, L843)
Recording the underlying issue:
When mono gets a SIGSEGV or SIGBUS, the signal handler runs on an altstack, which is set up with `sigaltstack` and the `SA_ONSTACK` flag for `sigaction`. (We do this to detect null pointer dereferences and turn them into managed NullReferenceExceptions, and to turn stack overflows into StackOverflowExceptions.) The signal handler is supposed to examine the context of the signal and, if the fault happened in a managed method, raise an NRE or SOE. To do that it uses `altstack_handle_and_restore`.
The issue was that the crash reporter code expects to run `backtrace`. Because we were running the crash reporting code from the main SIGSEGV handler without first switching from the altstack back to the original stack, `backtrace` (which doesn't know how to jump back from the altstack to the main stack) would not find anything that it could unwind, and we would get an empty native stack trace.
The solution was to run the crash reporting code from `altstack_handle_and_restore`, which runs back on the main stack.
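To illustrate the altstack setup described above, here is a minimal standalone sketch in C. It is not Mono's code: the names `install_altstack_handler`, `on_signal`, `demo_handler_runs_on_altstack` and the 64 KiB stack size are hypothetical, and it uses `raise(SIGSEGV)` instead of a real fault so the handler can return safely. It only shows that a handler installed with `SA_ONSTACK` executes with its locals inside the alternate stack, a memory region separate from the thread's original stack, which is where an unwinder started from inside the handler begins.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical size chosen for this sketch; Mono's real value may differ. */
#define ALT_STACK_SIZE (64 * 1024)

static void *alt_stack_base;                      /* start of the alternate stack */
static volatile sig_atomic_t handler_ran;         /* set once the handler executes */
static volatile sig_atomic_t handler_on_altstack; /* 1 if handler locals live on the altstack */

/* Signal handler: records whether one of its locals lies inside the altstack. */
static void on_signal(int sig, siginfo_t *info, void *ctx)
{
    char marker;
    uintptr_t sp = (uintptr_t)&marker;
    uintptr_t lo = (uintptr_t)alt_stack_base;

    (void)sig; (void)info; (void)ctx;
    handler_ran = 1;
    handler_on_altstack = (sp >= lo && sp < lo + ALT_STACK_SIZE);
}

/* Registers an alternate stack and installs on_signal with SA_ONSTACK. */
static int install_altstack_handler(int signo)
{
    stack_t ss;
    struct sigaction sa;

    assert(ALT_STACK_SIZE >= MINSIGSTKSZ);
    alt_stack_base = malloc(ALT_STACK_SIZE);
    if (alt_stack_base == NULL)
        return -1;

    memset(&ss, 0, sizeof ss);
    ss.ss_sp = alt_stack_base;
    ss.ss_size = ALT_STACK_SIZE;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, NULL) != 0)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_signal;
    sa.sa_flags = SA_SIGINFO | SA_ONSTACK;  /* run the handler on the altstack */
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* Returns 1 if the handler ran on the altstack, 0 if not, -1 on setup failure. */
int demo_handler_runs_on_altstack(void)
{
    if (install_altstack_handler(SIGSEGV) != 0)
        return -1;
    raise(SIGSEGV);  /* synthetic signal, so returning from the handler is safe */
    return handler_ran && handler_on_altstack;
}
```

Calling `demo_handler_runs_on_altstack()` from a small `main` should return 1 on a POSIX system. Any code invoked directly from `on_signal`, including a stack walker, starts out on that separate stack, which is consistent with the empty native trace described above.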