Darkflow: Out of Memory

Created on 3 Feb 2017 · 8 Comments · Source: thtrieu/darkflow

Hi @thtrieu,

What is the minimum GPU requirement?
I am using a GTX 1080, and even with batch = 1 I get an out-of-memory error:

darkflow$ ./flow --model cfg/tiny-yolo-voc.cfg --load bin/tiny-yolo-voc.weights --train --gpu 1.0
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so locally
Parsing ./cfg/tiny-yolo-voc.cfg
Parsing cfg/tiny-yolo-voc.cfg
Loading bin/tiny-yolo-voc.weights ...
Successfully identified 63471556 bytes
Finished in 0.00466394424438s

Building net ...
Source | Train? | Layer description                | Output size
-------+--------+----------------------------------+---------------
       |        | input                            | (?, 96, 96, 3)
 Load  |  Yep!  | conv 3x3p1_1 +bnorm leaky        | (?, 96, 96, 16)
 Load  |  Yep!  | maxp 2x2p0_2                     | (?, 48, 48, 16)
 Load  |  Yep!  | conv 3x3p1_1 +bnorm leaky        | (?, 48, 48, 32)
 Load  |  Yep!  | maxp 2x2p0_2                     | (?, 24, 24, 32)
 Load  |  Yep!  | conv 3x3p1_1 +bnorm leaky        | (?, 24, 24, 64)
 Load  |  Yep!  | maxp 2x2p0_2                     | (?, 12, 12, 64)
 Load  |  Yep!  | conv 3x3p1_1 +bnorm leaky        | (?, 12, 12, 128)
 Load  |  Yep!  | maxp 2x2p0_2                     | (?, 6, 6, 128)
 Load  |  Yep!  | conv 3x3p1_1 +bnorm leaky        | (?, 6, 6, 256)
 Load  |  Yep!  | maxp 2x2p0_2                     | (?, 3, 3, 256)
 Load  |  Yep!  | conv 3x3p1_1 +bnorm leaky        | (?, 3, 3, 512)
 Load  |  Yep!  | maxp 2x2p0_1                     | (?, 3, 3, 512)
 Load  |  Yep!  | conv 3x3p1_1 +bnorm leaky        | (?, 3, 3, 1024)
 Load  |  Yep!  | conv 3x3p1_1 +bnorm leaky        | (?, 3, 3, 1024)
 Load  |  Yep!  | conv 1x1p0_1 linear              | (?, 3, 3, 125)
-------+--------+----------------------------------+---------------
GPU mode with 1.0 usage
cfg/tiny-yolo-voc.cfg loss hyper-parameters:
H = 3
W = 3
box = 5
classes = 5
scales = [1.0, 5.0, 1.0, 1.0]
Building cfg/tiny-yolo-voc.cfg loss
Building cfg/tiny-yolo-voc.cfg train op
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties:
name: GeForce GTX 1080
major: 6 minor: 1 memoryClockRate (GHz) 1.86
pciBusID 0000:03:00.0
Total memory: 7.92GiB
Free memory: 7.66GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:03:00.0)
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 7.92G (8505458688 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
Segmentation fault (core dumped)

All 8 comments

TensorFlow is asking for a single block of about 8 GB on your GPU. Memory is consumed by variables, intermediate calculations, gradients, and moving averages. I suggest using --trainer sgd to avoid the extra memory needed for the gradients' moving averages.
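
To make the memory breakdown above concrete, here is a rough back-of-the-envelope sketch in Python. It is not darkflow code: the function name and the "slots" count are hypothetical, and it assumes the size of the weights file reported in the log (63471556 bytes) approximates the size of the trainable variables. It counts only parameters, gradients, and per-parameter optimizer state, and ignores intermediate activations, which depend on batch size and input resolution.

```python
# Rough estimate of training-state memory, excluding activations.
# Assumption: each parameter needs one gradient of the same size, plus
# `optimizer_slots` extra per-parameter buffers (0 for plain SGD; optimizers
# that track moving averages keep one or more such buffers).

def training_state_bytes(param_bytes, optimizer_slots):
    """Bytes for parameters + gradients + optimizer slot variables."""
    return param_bytes * (1 + 1 + optimizer_slots)

WEIGHTS_BYTES = 63471556  # "Successfully identified 63471556 bytes" in the log

sgd_bytes = training_state_bytes(WEIGHTS_BYTES, optimizer_slots=0)
two_slot_bytes = training_state_bytes(WEIGHTS_BYTES, optimizer_slots=2)

MIB = 2 ** 20
print("plain SGD:           %.1f MiB" % (sgd_bytes / MIB))
print("two slots per param: %.1f MiB" % (two_slot_bytes / MIB))
```

Under these assumptions, switching to SGD saves memory in proportion to the number of optimizer slots, but it does not change how much memory TensorFlow tries to reserve up front.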

@pribadihcr how did you solve your issue?

@thtrieu: Setting --trainer sgd does not solve the problem; it throws a different error:

 optimizer = self._TRAINER[self.FLAGS.trainer](self.FLAGS.lr)
KeyError: 'sgd'

@thtrieu: Thanks for the commit, but I am still facing the same issue:

GPU mode with 1.0 usage
cfg/yolo.cfg loss hyper-parameters:
    H       = 13
    W       = 13
    box     = 5
    classes = 3
    scales  = [1.0, 5.0, 1.0, 1.0]
Building cfg/yolo.cfg loss
Building cfg/yolo.cfg train op
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:910] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties: 
name: GeForce GTX 750 Ti
major: 5 minor: 0 memoryClockRate (GHz) 1.15
pciBusID 0000:04:00.0
Total memory: 1.95GiB
Free memory: 1.93GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0:   Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 750 Ti, pci bus id: 0000:04:00.0)
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 1.95G (2096431104 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY

run.sh: line 2: 14186 Segmentation fault      (core dumped) ./flow --model cfg/yolo.cfg --train --dataset training_data_baggage_3_claases/train_images/ --annotation training_data_baggage_3_claases/annotations/ --batch 2 --epoch 100 --savepb --gpu 1.0 --trainer sgd

Is there a minimum memory requirement for this?

It worked when I changed the 'gpu' input argument:

./flow --model cfg/yolo.cfg --train --dataset training_data_baggage_3_claases/train_images/ --annotation training_data_baggage_3_claases/annotations/ --batch 2 --epoch 100 --savepb --gpu 1.0 --trainer sgd

What was your original command, and what did you change? It might be useful for later users.

The working command is as follows:

./flow --model cfg/yolo.cfg --train --dataset training_data_baggage_3_claases/train_images/ --annotation training_data_baggage_3_claases/annotations/ --batch 2 --epoch 100 --savepb --gpu 0.8 --trainer sgd
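
A small arithmetic sketch of why lowering --gpu from 1.0 to 0.8 plausibly helps. It assumes the --gpu value is treated as a fraction of the card's total memory to reserve, which matches the logs above (the failed request equals the reported total, not the free amount), but this has not been checked against the darkflow source. The helper names are hypothetical; the byte counts and free-memory figures are copied from the two logs in this thread.

```python
GIB = 2 ** 30

def requested_bytes(total_bytes, fraction):
    """Bytes reserved if `fraction` of total GPU memory is requested."""
    return int(total_bytes * fraction)

def fits(total_bytes, free_gib, fraction):
    """True if the requested reservation is no larger than the free memory."""
    return requested_bytes(total_bytes, fraction) <= free_gib * GIB

# Figures from the logs: (total bytes in the failed allocation, free GiB).
GTX_1080 = (8505458688, 7.66)
GTX_750_TI = (2096431104, 1.93)

for name, (total, free) in (("GTX 1080", GTX_1080), ("GTX 750 Ti", GTX_750_TI)):
    for fraction in (1.0, 0.8):
        status = "fits" if fits(total, free, fraction) else "exceeds free memory"
        print("%s, --gpu %.1f: %s" % (name, fraction, status))
```

On both cards, a fraction of 1.0 asks for more than is free, because the display and driver already hold part of the memory, whereas 0.8 stays below the free amount.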