I ran Caffe with my training data; the program crashed and reported the following errors (part of the full output).
I0707 13:28:21.967123 5959 layer_factory.hpp:74] Creating layer pair_data
I0707 13:28:21.967149 5959 net.cpp:92] Creating Layer pair_data
I0707 13:28:21.967156 5959 net.cpp:359] pair_data -> pair_data
I0707 13:28:21.967190 5959 net.cpp:129] Setting up pair_data
F0707 13:28:21.967268 5959 db.hpp:109] Check failed: mdb_status == 0 (12 vs. 0) Cannot allocate memory
* Check failure stack trace: *
@ 0x7fd5152eeb7d google::LogMessage::Fail()
@ 0x7fd5152f0c7f google::LogMessage::SendToLog()
@ 0x7fd5152ee76c google::LogMessage::Flush()
@ 0x7fd5152f151d google::LogMessageFatal::~LogMessageFatal()
@ 0x7fd5157449d9 caffe::db::LMDB::Open()
@ 0x7fd5156d0de8 caffe::DataLayer<>::DataLayerSetUp()
@ 0x7fd51569e436 caffe::BaseDataLayer<>::LayerSetUp()
@ 0x7fd51569e539 caffe::BasePrefetchingDataLayer<>::LayerSetUp()
@ 0x7fd515740b81 caffe::Net<>::Init()
@ 0x7fd5157433f1 caffe::Net<>::Net()
@ 0x7fd51577da16 caffe::Solver<>::InitTrainNet()
@ 0x7fd51577e012 caffe::Solver<>::Init()
@ 0x7fd51577e635 caffe::Solver<>::Solver()
@ 0x40e458 caffe::GetSolver<>()
@ 0x4076c7 train()
@ 0x4055fb main
@ 0x7fd51479576d (unknown)
@ 0x405a81 (unknown)
What is the cause of this issue? Thanks.
You could try using ImageDataLayer instead of DataLayer as your input. Unless you're using a slow HDD or sharing a filesystem with lots of people, ImageDataLayer is basically as fast as DataLayer, and is simpler to set up.
If ImageDataLayer is used as the network input, should I convert the original LMDB file to another DB type? I could not find an ImageDataLayer implementation in the current Caffe framework. Thanks.
No, it's not a database. It's where you specify the images as a list of files and integer class labels.
Here:
https://github.com/BVLC/caffe/blob/master/src/caffe/proto/caffe.proto#L333
https://github.com/BVLC/caffe/blob/master/src/caffe/proto/caffe.proto#L578
https://github.com/BVLC/caffe/blob/master/src/caffe/layers/image_data_layer.cpp
How big is your database, and how much memory do you have?
Can you provide steps to reproduce this error on the latest master with DEBUG enabled? (See https://github.com/BVLC/caffe/wiki/Reporting-Bugs-and-Other-Issues.)
I have the same error; I am trying to install Caffe on a GPU cluster.
I get it when I run "make runtest".
How did you solve it?
Seconded. Running on Scientific Linux, with all components compiled with GCC 4.4, and running the default ImageNet creation example given on the site. It runs locally, but bombs out with this message when running on a cluster.
I solved it by changing the following in convert_mnist_data.cpp:
CHECK_EQ(mdb_env_set_mapsize(mdb_env, 1099511627776), MDB_SUCCESS) // 1TB
to:
CHECK_EQ(mdb_env_set_mapsize(mdb_env, 1073741824), MDB_SUCCESS) // 1GB
and compiling again.
Also, in /src/caffe/util/db_lmdb.cpp, change
LMDB_MAP_SIZE
to
4294967296
(that is, 4 GB) and recompile.
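A plausible reason the smaller map size helps: status 12 is ENOMEM on Linux, and LMDB memory-maps the database using the configured map size, so a 1 TB map needs 1 TB of virtual address space (not RAM or disk). Clusters often cap per-process virtual memory, in which case the mapping fails. The sketch below is only a rough probe under that assumption. `probe_reservation` is a hypothetical helper, not part of Caffe or LMDB, and it uses an anonymous mapping rather than the file-backed mapping LMDB creates, so it approximates the limit rather than reproducing LMDB's behaviour.

```cpp
#include <sys/mman.h>

#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical helper (not part of Caffe or LMDB): tries to reserve `bytes`
// of virtual address space and releases it again immediately.
// Returns 0 on success, otherwise an errno value such as ENOMEM (12 on Linux).
int probe_reservation(std::uint64_t bytes) {
  if (bytes == 0) return EINVAL;  // mmap rejects zero-length mappings
  if (bytes > std::numeric_limits<std::size_t>::max()) {
    return ENOMEM;  // the size cannot even be expressed on this platform
  }
  const std::size_t len = static_cast<std::size_t>(bytes);
  // PROT_NONE + MAP_NORESERVE: reserve address space without touching memory.
  void* p = mmap(nullptr, len, PROT_NONE,
                 MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
  if (p == MAP_FAILED) return errno;
  munmap(p, len);
  return 0;
}
```

Calling `probe_reservation(1099511627776ULL)` and `probe_reservation(1073741824ULL)` on the failing node would indicate whether the 1 TB reservation itself is what the system refuses; a nonzero result for the first and 0 for the second would be consistent with the fix above.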
I can confirm this works on our cluster, up to a point:
F0720 13:00:15.593750 31623 db_lmdb.hpp:13] Check failed: mdb_status == 0 (-30792 vs. 0) MDB_MAP_FULL: Environment mapsize limit reached
This probably means the map size I assigned is too small for the data. I'm running the ImageNet example.
Closing as this looks like a usage / platform configuration issue.
Please ask usage questions on the caffe-users list.
Thanks!
Hi, I just started using Caffe and I am getting the following error when running make runtest:
[ RUN ] DBTest/1.TestWrite
F1030 10:10:18.350623 24197 db_lmdb.hpp:14] Check failed: mdb_status == 0 (-30792 vs. 0) MDB_MAP_FULL: Environment mapsize limit reached
* Check failure stack trace:
@ 0x2b028c02 google::LogMessage::Fail()
@ 0x2b02a2c6 google::LogMessage::SendToLog()
@ 0x2b028946 google::LogMessage::Flush()
@ 0x2b02a8dc google::LogMessageFatal::~LogMessageFatal()
@ 0x2b91938c caffe::db::LMDBTransaction::Put()
@ 0x139e1e caffe::DBTest_TestWrite_Test<>::TestBody()
@ 0x203f18 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x203f18 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x203f18 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x203f18 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x203f18 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x203f18 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x203f18 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x203f18 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x203f18 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x203f18 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x203f18 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x203f18 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x203f18 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x203f18 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x203f18 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x203f18 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x203f18 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x203f18 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x203f18 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x203f18 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x203f18 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x203f18 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x203f18 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x203f18 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x203f18 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x203f18 testing::internal::HandleExceptionsInMethodIfSupported<>()
Aborted (core dumped)
make: ** [runtest] Error 134
which I think is because I am on a 32-bit system. I tried checking related issues and applying the fix described above (changing LMDB_MAP_SIZE). Can anyone direct me to a solution?
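On the 32-bit point: mdb_env_set_mapsize takes the size as a size_t, which is 32 bits wide on a typical 32-bit build, so neither the 1 TB default nor the suggested 4294967296 can be represented there. The sketch below only simulates the truncation with a fixed-width integer; the helper names are hypothetical, and how a given Caffe/LMDB build actually reacts to such a value is not shown in this thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: what a 64-bit byte count becomes when it is stored
// in a 32-bit size_t (only the low 32 bits survive).
std::uint32_t as_32bit_size(std::uint64_t bytes) {
  return static_cast<std::uint32_t>(bytes);
}

// Hypothetical helper: true if the byte count is representable in 32 bits.
bool fits_in_32bit_size(std::uint64_t bytes) {
  return bytes <= UINT32_MAX;
}
```

Under this simulation, 4294967296 (2^32) and 1099511627776 (2^40) both truncate to 0, while 1073741824 (1 GB) fits. A value below 4 GB is therefore the only kind that can be passed on such a system, although whether it can actually be mapped still depends on the address space available to the process.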
I am also experiencing this "Cannot allocate memory" problem:
[ RUN ] DataLayerTest/1.TestReadCropTrainSequenceUnseededLMDB
F1106 16:32:51.940896 3956 db_lmdb.hpp:14] Check failed: mdb_status == 0 (12 vs. 0) Cannot allocate memory
*** Check failure stack trace: ***
@ 0x8cafd8 google::LogMessage::Fail()
@ 0x8caf32 google::LogMessage::SendToLog()
@ 0x8ca87a google::LogMessage::Flush()
@ 0x8cdb23 google::LogMessageFatal::~LogMessageFatal()
@ 0x2aea12b2e43c caffe::db::LMDB::Open()
@ 0x7a882d caffe::DataLayerTest<>::Fill()
@ 0x7a9003 caffe::DataLayerTest_TestReadCropTrainSequenceUnseededLMDB_Test<>::TestBody()
@ 0x8c61dd testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x8b84e1 testing::Test::Run()
@ 0x8b85c7 testing::TestInfo::Run()
@ 0x8b8707 testing::TestCase::Run()
@ 0x8bd60f testing::internal::UnitTestImpl::RunAllTests()
@ 0x8c5d8d testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x8b7b0a testing::UnitTest::Run()
@ 0x4fc29f main
@ 0x39b461ed5d (unknown)
@ 0x4fbff9 (unknown)
make: *** [runtest] Aborted (core dumped)
Even though I am sure I have a lot of memory here:
tesla1 % pwd
/gluster/dr01/sirawat-p/caffe
tesla1 % df -h
Filesystem         Size  Used  Avail  Use%  Mounted on
/dev/sda5           62G   13G    47G   21%  /
tmpfs               63G  3.7M    63G    1%  /dev/shm
/dev/sda1          504M  119M   361M   25%  /boot
/dev/sda6          518G  205M   491G    1%  /data
/dev/sda2          124G  2.2G   115G    2%  /var
glusterfs00:/dr01  328T   55T   273T   17%  /gluster/dr01
glusterfs00:/sr01  655T  110T   546T   17%  /gluster/st01
As you can see, the shared space dr01 has 273T available.
How can I solve this problem quickly and properly?
Also, in /src/caffe/util/db_lmdb.cpp, change
LMDB_MAP_SIZE
to
4294967296
then recompile. Solved~
@GaoTianTian96 Thank you! The code change you suggested solved my runtime error with CPU_ONLY Caffe! I was getting
Check failed: mdb_status == 0 (12 vs. 0) Cannot allocate memory
while running the Caffe unit tests with make runtest.
I am also experiencing this "Cannot allocate memory" problem.
How can I solve this problem quickly and properly?
I0208 09:43:00.615448 29833 layer_factory.hpp:77] Creating layer data
F0208 09:43:03.655230 29833 db_lmdb.hpp:15] Check failed: mdb_status == 0 (37 vs. 0) No locks available
* Check failure stack trace: *
@ 0x7f7461d67466 google::LogMessage::Fail()
@ 0x7f7461d673ab google::LogMessage::SendToLog()
@ 0x7f7461d66d50 google::LogMessage::Flush()
@ 0x7f7461d6a233 google::LogMessageFatal::~LogMessageFatal()
@ 0x7f746251e7d8 caffe::db::LMDB::Open()
@ 0x7f74624ca3cf caffe::DataLayer<>::DataLayer()
@ 0x7f74624ca562 caffe::Creator_DataLayer<>()
@ 0x7f74625b9749 caffe::Net<>::Init()
@ 0x7f74625bc0ee caffe::Net<>::Net()
@ 0x7f74625e65c5 caffe::Solver<>::InitTrainNet()
@ 0x7f74625e7a55 caffe::Solver<>::Init()
@ 0x7f74625e7d6f caffe::Solver<>::Solver()
@ 0x7f74625932a1 caffe::Creator_SGDSolver<>()
@ 0x417091 caffe::SolverRegistry<>::CreateSolver()
@ 0x40eb94 train()
@ 0x40b983 main
@ 0x7f7460d3a830 __libc_start_main
@ 0x40c329 _start
Aborted (core dumped)
How can I solve this problem? Thanks.
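It may help to note that the logs in this thread show three different status codes, which likely have different causes. Positive values are system errno codes (on Linux, 12 is ENOMEM and 37 is ENOLCK), while -30792 is LMDB's own MDB_MAP_FULL. ENOLCK is often reported when the database sits on a network filesystem where file locking is unavailable, though that is an assumption about this particular setup. The sketch below is a hypothetical stand-in for decoding such a status; real LMDB code would normally call mdb_strerror instead.

```cpp
#include <cassert>
#include <cerrno>
#include <cstring>
#include <string>

// Hypothetical helper (not part of Caffe or LMDB): turns the mdb_status value
// printed in the "Check failed" lines into a readable description.
std::string describe_mdb_status(int status) {
  if (status == 0) return "success";
  if (status == -30792) {
    // LMDB-specific code: the database outgrew the configured map size.
    return "MDB_MAP_FULL: Environment mapsize limit reached";
  }
  if (status > 0) {
    // Positive values are plain errno codes passed through from the OS.
    return std::strerror(status);
  }
  return "other LMDB-specific error (see lmdb.h)";
}
```

Read this way, status 12 points at the address-space reservation, -30792 at a map size that is too small for the data being written, and 37 at file locking, so changing LMDB_MAP_SIZE would not be expected to address the last one.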