There seem to be runtime crashes when LLVM is built with assertions enabled.
This is one of them, but I saw at least one other trace:
[----------] 9 tests from CPU/BackendCorrectnessTest
[ RUN ] CPU/BackendCorrectnessTest.convTest/0
Program aborted due to an unhandled Error:
Error value was Success. (Note: Success values must still be checked prior to being destroyed).
#0 0x00000000032d61eb llvm::sys::PrintStackTrace(llvm::raw_ostream&) /home/visit0r/src/tcemc/tce/llvm-build-Debug/release_70/lib/Support/Unix/Signals.inc:490:0
#1 0x00000000032d627e PrintStackTraceSignalHandler(void*) /home/visit0r/src/tcemc/tce/llvm-build-Debug/release_70/lib/Support/Unix/Signals.inc:554:0
#2 0x00000000032d3f8c llvm::sys::RunSignalHandlers() /home/visit0r/src/tcemc/tce/llvm-build-Debug/release_70/lib/Support/Signals.cpp:67:0
#3 0x00000000032d5be5 SignalHandler(int) /home/visit0r/src/tcemc/tce/llvm-build-Debug/release_70/lib/Support/Unix/Signals.inc:353:0
#4 0x00007f4ee9370390 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x11390)
#5 0x00007f4ee8729428 gsignal /build/glibc-LK5gWL/glibc-2.23/signal/../sysdeps/unix/sysv/linux/raise.c:54:0
#6 0x00007f4ee872b02a abort /build/glibc-LK5gWL/glibc-2.23/stdlib/abort.c:91:0
#7 0x0000000003213f0d /home/visit0r/src/tcemc/tce/llvm-build-Debug/release_70/lib/Support/Error.cpp:102:0
#8 0x0000000000560bd7 llvm::Error::assertIsChecked() /home/visit0r/local/stow/llvm-7.0-tcepatched/include/llvm/Support/Error.h:265:3
#9 0x0000000000562600 llvm::Error::operator=(llvm::Error&&) /home/visit0r/local/stow/llvm-7.0-tcepatched/include/llvm/Support/Error.h:209:12
#10 0x000000000055ece1 glow::ExecutionEngine::runInternal(glow::ExecutionContext&, llvm::StringRef, glow::CompiledFunction&)::$_1::operator()(unsigned long, llvm::Error, std::unique_ptr<glow::ExecutionContext, std::default_delete<glow::ExecutionContext> >) const /home/visit0r/src/glow/build_Debug_llvm_7/../lib/ExecutionEngine/ExecutionEngine.cpp:131:9
...
@jackm321 Could you have a look? This is most likely related to the recent error-handling changes.
Yeah, looks like we have an Error that isn't checked. I'll look into it.
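For anyone following along: frames #8 and #9 in the trace show `assertIsChecked()` firing from `llvm::Error`'s move assignment, meaning an Error was overwritten before anyone tested it. Below is a minimal, self-contained sketch of that rule. `CheckedError`, `gViolations` and the three helper functions are hypothetical names made up for illustration; this is a simplified stand-in, not LLVM's actual implementation, and it counts violations where LLVM would abort.

```cpp
#include <cassert>
#include <utility>

// Number of "unchecked" violations seen so far. Assumption for this sketch:
// we count instead of calling abort(), so the scenarios can run to completion.
static int gViolations = 0;

// Hypothetical, simplified model of an error value carrying a "checked" flag.
class CheckedError {
public:
  // A success value starts out unchecked.
  static CheckedError success() { return CheckedError(false); }
  static CheckedError failure() { return CheckedError(true); }

  CheckedError(CheckedError &&other) noexcept
      : failed_(other.failed_), checked_(false) {
    other.failed_ = false;
    other.checked_ = true; // the moved-from value no longer needs checking
  }

  CheckedError &operator=(CheckedError &&other) noexcept {
    assert(this != &other && "self-move is not supported in this sketch");
    assertIsChecked(); // overwriting an unchecked value is a violation
    failed_ = other.failed_;
    checked_ = false;  // the new value is unchecked, even if it is a success
    other.failed_ = false;
    other.checked_ = true;
    return *this;
  }

  ~CheckedError() { assertIsChecked(); }

  // Testing the value marks a success as checked; a failure must be handled.
  explicit operator bool() {
    checked_ = !failed_;
    return failed_;
  }

  // Discards a failure and marks the value as checked.
  void consume() {
    failed_ = false;
    checked_ = true;
  }

private:
  explicit CheckedError(bool failed) : failed_(failed), checked_(false) {}
  void assertIsChecked() {
    if (!checked_)
      ++gViolations;
  }

  bool failed_;
  bool checked_;
};

// A success value destroyed without ever being tested.
int dropUnchecked() {
  gViolations = 0;
  {
    CheckedError err = CheckedError::success();
  }
  return gViolations;
}

// A success placeholder overwritten without being tested first.
int overwriteUnchecked() {
  gViolations = 0;
  {
    CheckedError err = CheckedError::success();
    err = CheckedError::success(); // violation: the old value was never checked
    if (err)
      err.consume();
  }
  return gViolations;
}

// The placeholder is tested before it is overwritten, so nothing fires.
int checkBeforeOverwrite() {
  gViolations = 0;
  {
    CheckedError err = CheckedError::success();
    if (err)
      err.consume();               // test the placeholder first
    err = CheckedError::failure(); // overwriting is now allowed
    if (err)
      err.consume();               // handle the failure before destruction
  }
  return gViolations;
}
```

In this sketch `dropUnchecked()` and `overwriteUnchecked()` each record one violation and `checkBeforeOverwrite()` records none. Builds without assertions typically skip this bookkeeping, which would explain why the problem only shows up with an assertions-enabled LLVM.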
It seems this is still happening with a few test cases, and also with LLVM 7. I assume other Glow developers have LLVM assertions turned off, so they don't see this and aren't distracted by the failing tests? I patch my Error.h manually to avoid it, since I didn't find a cleaner way.
2 - BackendTest (Child aborted)
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0720 12:14:50.814687 24923 Partitioner.cpp:1234] Profiling a model to be partitioned cross different backends. Each sub-network will be optimized and run on cpu backend.
Program aborted due to an unhandled Error:
Error value was Success. (Note: Success values must still be checked prior to being destroyed).
...
20 - MLTest (Child aborted)
I0720 12:15:01.712651 27355 Partitioner.cpp:1274] The model is too small for applying partition.
Model size : 8
Backend Name : Interpreter
Device memory: 2000000000
Program aborted due to an unhandled Error:
Error value was Success. (Note: Success values must still be checked prior to being destroyed).
...
This is pretty much going to keep recurring unless we use a debug LLVM build in our CI :-/