Caffe: Finetuning out-of-memory and lack of output

Created on 13 Jul 2014 · 12 comments · Source: BVLC/caffe

Hi, I plan to apply the pretrained ImageNet model to a 2-class classification task, so I need to modify the fc8 layer and then finetune the network. I am following shelhamer's suggestion in https://github.com/BVLC/caffe/issues/186.

Here is what I do:
1) prepare the input data
2) change the data layer and the fc8 layer in imagenet_train.prototxt and imagenet_val.prototxt
3) run finetune_net imagenet_solver.prototxt caffe_reference_imagenet_model in the terminal (caffe_reference_imagenet_model is the 244 MB pretrained model file)

After that, the terminal does not respond for a long time and produces no output.

I'm new to Caffe; could someone tell me how to finetune an existing model? Thanks for any reply!
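For readers following the same recipe, here is a minimal sketch of the fc8 change described in step (2), assuming the layers/INNER_PRODUCT prototxt syntax of that Caffe version. The layer name fc8_2class and the learning-rate multipliers are illustrative, not from the thread; renaming the layer is what keeps finetune_net from copying the pretrained 1000-way fc8 weights into the new 2-way layer.

layers {
  name: "fc8_2class"     # renamed so the pretrained fc8 weights are not copied in
  type: INNER_PRODUCT
  bottom: "fc7"
  top: "fc8_2class"
  blobs_lr: 10           # optionally learn the fresh layer faster than the copied layers
  blobs_lr: 20
  inner_product_param {
    num_output: 2        # 2 classes instead of the 1000 ImageNet classes
    weight_filler {
      type: "gaussian"   # the new layer needs its own initialization
      std: 0.01
    }
  }
}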

downstream problem?

All 12 comments

Update:
in step (3), finetune_net imagenet_solver.prototxt caffe_reference_imagenet_model shows the following:

F0713 21:28:27.059324 3532 syncedmem.cpp:47] Check failed: error == cudaSuccess (2 vs. 0) out of memory
*** Check failure stack trace: ***
@ 0x7fb1ce8da9fd google::LogMessage::Fail()
@ 0x7fb1ce8dc89d google::LogMessage::SendToLog()
@ 0x7fb1ce8da5ec google::LogMessage::Flush()
@ 0x7fb1ce8dd1be google::LogMessageFatal::~LogMessageFatal()
@ 0x447284 caffe::SyncedMemory::mutable_gpu_data()
@ 0x43c9d2 caffe::Blob<>::mutable_gpu_diff()
@ 0x4934f2 caffe::InnerProductLayer<>::Backward_gpu()
@ 0x42e403 caffe::Net<>::Backward()
@ 0x445bd7 caffe::Solver<>::Solve()
@ 0x40a0c8 main
@ 0x7fb1cbf56ec5 (unknown)
@ 0x40be37 (unknown)
Aborted (core dumped)

Is there any clue?

The message is clear: you are out of memory on your GPU card, so you will need to reduce the batch_size.

Sergio
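For anyone hitting the same error, batch_size lives in the data layer of the train/val prototxt. A minimal sketch, assuming the old-style DATA layer of that era's imagenet_train.prototxt (the source path and the reduced value are illustrative; other fields such as the crop size and mean file are omitted):

layers {
  name: "data"
  type: DATA
  top: "data"
  top: "label"
  data_param {
    source: "ilsvrc12_train_leveldb"   # illustrative path
    batch_size: 64                     # reduced from 256 until training fits in GPU memory
  }
}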


@sguada Thank you! I changed to a smaller batch_size and the problem is solved.

There is a tiny issue with finetune_net.bin on Ubuntu: LOG(INFO) does not print anything to the terminal. I changed LOG(INFO) to std::cout and, from what I observe, the finetuning code works well.

Happy that you figured it out. For logging do

GLOG_logtostderr=1 finetune_net imagenet_solver.prototxt caffe_reference_imagenet_model

I am trying to implement the DeepFace model. The memory required for testing is 49724060 (with the batch size finally reduced to 1) and for training 49724060 (again with batch size 1), which makes the total memory required around 94 MB. But I am still getting the 'out of memory' error. My GPU is an NVIDIA GeForce GT 650M. Checking its status with nvidia-smi -q, I can see that the total memory (FB) is 2047 MiB and the free memory is 1646 MiB. Can anyone point out what I am overlooking?

I got the same GPU error when I trained the imagenet example. Limiting the batch size in train_val.prototxt to 4 (from 256) fixed my GPU memory shortage. I also had second thoughts about changing solver_mode from GPU to CPU, because my GPU has only 512 MB of memory (it's an old MacBook Pro), and CPU mode seemed to train without the problem. I know GPU mode is meant for massively parallel computing, but I only have a few PCs and laptops and I am just learning Caffe for my studies. Should I always use GPU mode? I think GPU mode accelerates computation, but that may not matter while I am still learning. Still, I will have to build a new PC for training with Caffe, so please advise me on what kind of GPU is good enough: I am thinking of buying an NVIDIA GeForce GTX 960 with 2 GB of memory, and I heard a GPU with 3 GB or more is sufficient for Caffe.
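For completeness, the solver_mode switch mentioned above is a single line in the solver prototxt. A minimal sketch with illustrative values (the net path and learning rate are placeholders, and most solver fields are omitted):

# fragment of a solver prototxt
net: "models/bvlc_reference_caffenet/train_val.prototxt"   # illustrative path
base_lr: 0.001
solver_mode: CPU    # GPU is faster, but CPU mode avoids the GPU memory limit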

When I try to install Fast R-CNN, I get an error like this. How do I solve it?

Loaded network /home/rvlab/Music/fast-rcnn/data/fast_rcnn_models/vgg16_fast_rcnn_iter_40000.caffemodel

Demo for data/demo/000004.jpg
F0718 22:09:35.547049 13693 syncedmem.cpp:51] Check failed: error == cudaSuccess (2 vs. 0)  out of memory
*** Check failure stack trace: ***
Aborted (core dumped)

I get this error when running the following:
jalal@klein:~/computer_vision/py-faster-rcnn$ ./tools/demo.py

I0823 20:13:31.522610 40008 layer_factory.hpp:77] Creating layer relu5_1
I0823 20:13:31.522629 40008 net.cpp:106] Creating Layer relu5_1
I0823 20:13:31.522644 40008 net.cpp:454] relu5_1 <- conv5_1
I0823 20:13:31.522662 40008 net.cpp:397] relu5_1 -> conv5_1 (in-place)
I0823 20:13:31.522843 40008 net.cpp:150] Setting up relu5_1
I0823 20:13:31.522869 40008 net.cpp:157] Top shape: 1 512 14 14 (100352)
I0823 20:13:31.522883 40008 net.cpp:165] Memory required for data: 112795648
I0823 20:13:31.522891 40008 layer_factory.hpp:77] Creating layer conv5_2
I0823 20:13:31.522902 40008 net.cpp:106] Creating Layer conv5_2
I0823 20:13:31.522909 40008 net.cpp:454] conv5_2 <- conv5_1
I0823 20:13:31.522922 40008 net.cpp:411] conv5_2 -> conv5_2
I0823 20:13:31.529803 40008 net.cpp:150] Setting up conv5_2
I0823 20:13:31.529841 40008 net.cpp:157] Top shape: 1 512 14 14 (100352)
I0823 20:13:31.529849 40008 net.cpp:165] Memory required for data: 113197056
I0823 20:13:31.529868 40008 layer_factory.hpp:77] Creating layer relu5_2
I0823 20:13:31.529887 40008 net.cpp:106] Creating Layer relu5_2
I0823 20:13:31.529903 40008 net.cpp:454] relu5_2 <- conv5_2
I0823 20:13:31.529920 40008 net.cpp:397] relu5_2 -> conv5_2 (in-place)
F0823 20:13:31.530177 40008 cudnn_relu_layer.cpp:13] Check failed: status == CUDNN_STATUS_SUCCESS (4 vs. 0)  CUDNN_STATUS_INTERNAL_ERROR
*** Check failure stack trace: ***
Aborted (core dumped)

@sguada How should I reduce the batch size? In which file? Can you show an example?

@monajalal Have you figured out which file it was? :D

@dulex123 @monajalal If you're using py-faster-rcnn, you can change the batch_size in lib/fast_rcnn/config.py.

Also, I would suggest you take a look at this issue. Hope this helps!

Using a smaller batch_size will work.

I changed my batch_size from 128 to 32, but it still fails. Do we need to rebuild after changing the config file?

