I'm not sure what might be causing this, but here's what I'm seeing when I run make runtest on a checkout of master. I'm running on Debian Jessie, with GCC 4.9 and CUDA 8 RC. The only interesting thing about this machine is that it has 4x GTX 1080.
[----------] 2 tests from HingeLossLayerTest/2, where TypeParam = caffe::GPUDevice<float>
[ RUN ] HingeLossLayerTest/2.TestGradientL2
[ OK ] HingeLossLayerTest/2.TestGradientL2 (6 ms)
[ RUN ] HingeLossLayerTest/2.TestGradientL1
[ OK ] HingeLossLayerTest/2.TestGradientL1 (6 ms)
[----------] 2 tests from HingeLossLayerTest/2 (12 ms total)
[----------] 9 tests from AdaGradSolverTest/2, where TypeParam = caffe::GPUDevice<float>
[ RUN ] AdaGradSolverTest/2.TestLeastSquaresUpdateWithEverythingAccumShare
[ OK ] AdaGradSolverTest/2.TestLeastSquaresUpdateWithEverythingAccumShare (12 ms)
[ RUN ] AdaGradSolverTest/2.TestAdaGradLeastSquaresUpdateWithEverythingShare
*** Aborted at 1474827886 (unix time) try "date -d @1474827886" if you are using GNU date ***
PC: @ 0x7fbbf4951e2d (unknown)
*** SIGSEGV (@0x1451f000) received by PID 23925 (TID 0x7fbc03491a00) from PID 340914176; stack trace: ***
@ 0x7fbbf4bd38d0 (unknown)
@ 0x7fbbf4951e2d (unknown)
@ 0x7fbbf5496350 std::vector<>::_M_erase()
@ 0x7fbbf549427d caffe::DevicePair::compute()
@ 0x7fbbf5499123 caffe::P2PSync<>::Prepare()
@ 0x7fbbf54997a0 caffe::P2PSync<>::Run()
@ 0x6af00e caffe::GradientBasedSolverTest<>::RunLeastSquaresSolver()
@ 0x6c2d2f caffe::GradientBasedSolverTest<>::TestLeastSquaresUpdate()
@ 0x6c31b0 caffe::AdaGradSolverTest_TestAdaGradLeastSquaresUpdateWithEverythingShare_Test<>::TestBody()
@ 0x8ff553 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x8f7eca testing::Test::Run()
@ 0x8f8018 testing::TestInfo::Run()
@ 0x8f80f5 testing::TestCase::Run()
@ 0x8f8a28 testing::internal::UnitTestImpl::RunAllTests()
@ 0x8f8d03 testing::UnitTest::Run()
@ 0x46e9df main
@ 0x7fbbf483ab45 (unknown)
@ 0x4764e9 (unknown)
@ 0x0 (unknown)
Makefile:526: recipe for target 'runtest' failed
make: *** [runtest] Segmentation fault
EDIT: I'm compiling with CuDNN enabled, but turning it off doesn't seem to make a difference.
I ran into exactly the same problem today with Ubuntu 16.04, 4x K80, CUDA 8 RC, and GCC 5.3.
Advice highly appreciated!
This may be unrelated, but as an extra data point, I also get a segfault if I import pycaffe and theano in the same file and then try to do anything with caffe. Let me know if I can provide any extra info!
One more observation: the segfaults appear in several of the "SolverTest" test cases, but they all share the same stack trace:
std::vector<>::_M_erase()
caffe::DevicePair::compute()
caffe::P2PSync<>::Prepare()
caffe::P2PSync<>::Run()
caffe::GradientBasedSolverTest<>::TestLeastSquaresUpdate()
Same here.
Titan X (Pascal) x6 + K80 x2 + GTX 1080 x1 + Ubuntu 16.04 + cuDNN v5.1 + CUDA 8 + GCC 5.4.
[----------] 12 tests from SGDSolverTest/2, where TypeParam = caffe::GPUDevice
[ RUN ] SGDSolverTest/2.TestLeastSquaresUpdateWithWeightDecay
*** Aborted at 1475986823 (unix time) try "date -d @1475986823" if you are using GNU date ***
PC: @ 0x7f13e92fd512 (unknown)
*** SIGSEGV (@0x19ae2000) received by PID 14082 (TID 0x7f13f0ac7ac0) from PID 430841856; stack trace: ***
@ 0x7f13e958a3d0 (unknown)
@ 0x7f13e92fd512 (unknown)
@ 0x7f13e9eae280 std::vector<>::_M_erase()
@ 0x7f13e9eac494 caffe::DevicePair::compute()
@ 0x7f13e9eb1d50 caffe::P2PSync<>::Prepare()
@ 0x7f13e9eb285e caffe::P2PSync<>::Run()
@ 0x5b409e caffe::GradientBasedSolverTest<>::TestLeastSquaresUpdate()
@ 0x5b49ff caffe::SGDSolverTest_TestLeastSquaresUpdateWithWeightDecay_Test<>::TestBody()
@ 0x91ad53 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x91436a testing::Test::Run()
@ 0x9144b8 testing::TestInfo::Run()
@ 0x914595 testing::TestCase::Run()
@ 0x91586f testing::internal::UnitTestImpl::RunAllTests()
@ 0x915b93 testing::UnitTest::Run()
@ 0x46d9ed main
@ 0x7f13e91d0830 __libc_start_main
@ 0x475459 _start
@ 0x0 (unknown)
Makefile:526: recipe for target 'runtest' failed
make: *** [runtest] Segmentation fault (core dumped)
I suspect this is a bug in the multi-GPU support.
I ran "export CUDA_VISIBLE_DEVICES=0" so that only one GPU is visible to Caffe, and with that setting all the tests pass.
[==========] 2081 tests from 277 test cases ran. (353009 ms total)
[ PASSED ] 2081 tests.
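For anyone who wants to try the same workaround, here is a minimal sketch of the steps described above. It only sets the variable and checks that child processes inherit it; the `make runtest` call is left as a comment because it has to be run from your own Caffe checkout. The name `visible` is just an illustrative variable.

```shell
# Make only GPU 0 visible to CUDA programs started from this shell session.
export CUDA_VISIBLE_DEVICES=0

# Confirm that child processes (such as the test binary started by make) inherit it.
visible=$(sh -c 'printf "%s" "$CUDA_VISIBLE_DEVICES"')
echo "child processes see CUDA_VISIBLE_DEVICES=$visible"

# Then, from the Caffe checkout:
#   make runtest
# To restore access to all GPUs afterwards:
#   unset CUDA_VISIBLE_DEVICES
```

Note that this only avoids the multi-GPU code path in the tests; it does not fix whatever is going wrong there.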
@nitbix
This may be unrelated, but as an extra data point, I also get a segfault if I import pycaffe and theano in the same file and then try to do anything with caffe. Let me know if I can provide any extra info!
This was fixed in Theano in commit bb170f4fb201109f88b95da282ed3a21b5021c13 (23 Sep 2016). Theano was calling cudaThreadExit on shutdown, which then caused a segfault when Caffe subsequently called cublasDestroy during cleanup.
Dear All,
Please advise how you solved this issue, as I have the same problem. Any answer is highly appreciated.

Hi all
I have the same problem on Ubuntu 16.04.

Any answer is highly appreciated. Thank you
@RuaYahya Did you solve the issue?
Hi,
The problem in my case was that my laptop does not have an NVIDIA card. Check whether your GPU is an NVIDIA one or not. It works fine when I try another laptop.
Thanks
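A quick way to do the check suggested above is sketched below. It assumes either `nvidia-smi` (shipped with the NVIDIA driver) or `lspci` is installed; `has_nvidia_gpu` is a hypothetical helper name, not something that comes with Caffe.

```shell
# Return 0 if an NVIDIA GPU appears to be present, 1 otherwise.
has_nvidia_gpu() {
  # nvidia-smi -L lists the GPUs the driver can see; it fails if there are none.
  if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi -L >/dev/null 2>&1; then
    return 0
  fi
  # Fall back to scanning the PCI device list for an NVIDIA entry.
  if command -v lspci >/dev/null 2>&1 && lspci | grep -qi 'nvidia'; then
    return 0
  fi
  return 1
}

if has_nvidia_gpu; then
  echo "NVIDIA GPU detected"
else
  echo "no NVIDIA GPU detected"
fi
```

If no NVIDIA GPU is found, the GPU tests cannot be expected to run on that machine; as far as I know, Makefile.config has a CPU_ONLY option for that case.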
@RuaYahya
*** SIGABRT (@0x113c) received by PID 4412 (TID 0x7f64016a5b00) from PID 4412; stack trace: ***
@ 0x7f63ffd094b0 (unknown)
@ 0x7f63ffd09428 gsignal
@ 0x7f63ffd0b02a abort
@ 0x7f63ffd4b7ea (unknown)
@ 0x7f63ffd53e0a (unknown)
@ 0x7f63ffd5798c cfree
@ 0x7f64008878af google::protobuf::internal::DestroyDefaultRepeatedFields()
@ 0x7f6400886b3b google::protobuf::ShutdownProtobufLibrary()
@ 0x7f63e98c6329 (unknown)
@ 0x7f64015a2c17 (unknown)
@ 0x7f63ffd0dff8 (unknown)
@ 0x7f63ffd0e045 exit
@ 0x7f63ffcf4837 __libc_start_main
@ 0x4077c9 _start
@ 0x0 (unknown)
Makefile:532: recipe for target 'runtest' failed
I have the same problem on Ubuntu 16.04. Did you solve the issue?
@Mehuli-Ruh11
I believe he would simply set it before the command: run export CUDA_VISIBLE_DEVICES=0 and then make runtest (on a single line that would be CUDA_VISIBLE_DEVICES=0 make runtest, without the export). This fixed the error for me. It's related to this line in _Makefile.config_:
# The ID of the GPU that 'make runtest' will use to run unit tests.
TEST_GPUID := 0
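To make "before the command" concrete, here is a small sketch. The first part shows that a `VAR=value` prefix applies to that one command only, using `sh -c` as a stand-in for `make runtest`. The second part reads the `TEST_GPUID` setting quoted above out of a Makefile.config. `test_gpuid` is a hypothetical helper, and it assumes the setting is written as `TEST_GPUID := <number>`.

```shell
# Start from a clean state for the demonstration (this clears any earlier export).
unset CUDA_VISIBLE_DEVICES

# The prefix form sets the variable for this one command only.
inside=$(CUDA_VISIBLE_DEVICES=0 sh -c 'printf "%s" "$CUDA_VISIBLE_DEVICES"')
after=${CUDA_VISIBLE_DEVICES:-unset}
echo "inside the command: $inside; afterwards in the shell: $after"
# The real invocation would be:  CUDA_VISIBLE_DEVICES=0 make runtest

# Print the numeric TEST_GPUID value from a Makefile.config (nothing if it is not set).
test_gpuid() {
  sed -n 's/^[[:space:]]*TEST_GPUID[[:space:]]*:=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1" | tail -n 1
}

if [ -f Makefile.config ]; then
  echo "TEST_GPUID := $(test_gpuid Makefile.config)"
fi
```

The prefix form is handy when you do not want the restriction to stick around in your shell after the tests finish.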
@denru01
I had a similar problem to yours.
I had a Boost.Python package installed through conda, and its version differed from the one installed on my system. If you are using Anaconda, just uninstall that package (conda uninstall boost).
That might fix the problem.
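If you want to check for this kind of mismatch before uninstalling anything, something like the sketch below may help. It assumes a Linux system with `ldd`, and it assumes the library was built to `build/lib/libcaffe.so`; adjust that path for your setup. `boost_libs_of` is a hypothetical helper name.

```shell
# List the Boost shared libraries that a binary or library resolves to at load time.
boost_libs_of() {
  if ! command -v ldd >/dev/null 2>&1; then
    echo "ldd not available on this system" >&2
    return 0
  fi
  ldd "$1" 2>/dev/null | grep -i 'boost' || true
}

lib="build/lib/libcaffe.so"   # assumed build output location
if [ -f "$lib" ]; then
  boost_libs_of "$lib"
fi

# Show any Boost packages installed in the active conda environment.
if command -v conda >/dev/null 2>&1; then
  conda list boost
fi
```

If the Boost paths printed for the library point into an Anaconda directory while you built against the system Boost, that would be consistent with the conflict described above.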
Did anyone find a solution? I have the same problem, and I'm running Ubuntu 16.04 with only one GPU (GTX 1080) and CUDA 8.
@FangbRen
Hello, I ran into the same problem as you when installing Caffe and would like to ask how you solved it. Thank you.
I solved this issue with these commands: export CUDA_VISIBLE_DEVICES=0, then make runtest -j.