Darkflow: Segmentation fault (core dumped)

Created on 19 Dec 2019 · 5 Comments · Source: thtrieu/darkflow

I am training a custom object detector with a single class (solar panel), so I set classes=1 and changed filters to 30.
I am using VMware 15.05 with RHEL 6 installed on it.
I allocated 400 GB to RHEL 6 on VMware.
I am using the yolo.cfg configuration file and the yolo.weights weights file for training.
I changed label.txt so that it contains only "solar_panel".
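
The filters=30 value above is consistent with the usual rule for YOLOv2-style configs, where the last convolutional layer needs num * (classes + 5) filters. A minimal sketch of that arithmetic follows; the function name `expected_filters` is hypothetical, and the default of 5 anchors assumes the `num=5` setting in the `[region]` section of yolo.cfg.

```python
def expected_filters(num_classes, num_anchors=5):
    """Filters needed by the last convolutional layer of a YOLOv2-style cfg.

    num_anchors=5 is an assumption matching `num=5` in the [region] section.
    """
    # Each anchor predicts 4 box coordinates, 1 objectness score,
    # and one score per class.
    return num_anchors * (num_classes + 5)


print(expected_filters(1))   # single class (solar_panel) -> 30
print(expected_filters(80))  # 80 classes -> 425
```

If the cfg uses a different number of anchors, pass it as `num_anchors` instead of relying on the default.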

After executing this command:

    python flow  --model cfg/yolo-1c.cfg --load bin/yolo.weights --train --annotation new_model_data/annotations --dataset new_model_data/images  --epoch 400

The process fails. These are the last lines of output after running the command above:

    2019-12-19 03:45:00.680721: I tensorflow/core/common_runtime/bfc_allocator.cc:816] Sum Total of in-use chunks: 2.80GiB
    2019-12-19 03:45:00.680729: I tensorflow/core/common_runtime/bfc_allocator.cc:818] total_region_allocated_bytes_: 3011067904 memory_limit_: 3011067904 available bytes: 0 curr_region_allocation_bytes_: 4294967296
    2019-12-19 03:45:00.680748: I tensorflow/core/common_runtime/bfc_allocator.cc:824] Stats:
    Limit:                  3011067904
    InUse:                  3011067904
    MaxInUse:               3011067904
    NumAllocs:                    1574
    MaxAllocSize:            830704384

    2019-12-19 03:45:00.680786: W tensorflow/core/common_runtime/bfc_allocator.cc:319] **************************x**********************************x**************************************
    2019-12-19 03:45:00.689879: W tensorflow/core/framework/op_kernel.cc:1502] OP_REQUIRES failed at mkl_util.h:1026 : Resource exhausted: OOM when allocating tensor with shape[189267968] and type float on /job:localhost/replica:0/task:0/device:CPU:0 by allocator mklcpu
    Segmentation fault (core dumped)
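
The numbers in the log can be turned into readable sizes to see how far the request exceeded what was available. The sketch below only does arithmetic on values copied from the log; it assumes that "type float" means a 4-byte (32-bit) float, and the variable names are illustrative.

```python
BYTES_PER_FLOAT32 = 4          # assumption: "type float" is a 32-bit float

memory_limit = 3011067904      # memory_limit_ from the allocator log
tensor_elements = 189267968    # shape[189267968] from the OOM message

tensor_bytes = tensor_elements * BYTES_PER_FLOAT32
limit_gib = memory_limit / 2**30
tensor_mib = tensor_bytes / 2**20

print(f"allocator limit: {limit_gib:.2f} GiB")   # allocator limit: 2.80 GiB
print(f"failed request:  {tensor_mib:.0f} MiB")  # failed request:  722 MiB
```

Since the log reports InUse equal to Limit and 0 available bytes, a further request of roughly 722 MiB could not be satisfied.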

Problem images: (six screenshots attached to the original issue; not preserved here)

Please help me solve this problem. Thank you.

Most helpful comment

Everything is fine; you did well. Use another configuration file with fewer layers (such as tiny-yolo.cfg). yolo.cfg contains more than 22 convolutional layers, so it needs more memory, which is why this error occurs.

Even if you later provide enough memory to train with yolo.cfg, training will still be slow on a CPU, because a CPU has much less processing power.

Always work on a new copy of yolo.cfg, such as yolo-1c.cfg. Don't edit yolo.cfg itself.
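
One rough way to compare how heavy two configuration files are is to count their `[convolutional]` sections. The sketch below is only an illustration: `count_sections` is a hypothetical helper, the file paths are assumptions about where the cfg files live, and layer count is just a proxy for memory use, not a measurement of it.

```python
import os


def count_sections(cfg_text, section="[convolutional]"):
    """Count how many times a section header appears in Darknet-style cfg text."""
    return sum(1 for line in cfg_text.splitlines() if line.strip() == section)


# Assumed locations; adjust to wherever the cfg files actually are.
for path in ("cfg/yolo.cfg", "cfg/tiny-yolo.cfg"):
    if os.path.exists(path):
        with open(path) as f:
            print(path, count_sections(f.read()), "convolutional layers")
```

A config with noticeably fewer convolutional sections should generally need less memory to train, which is the reasoning behind switching to a tiny variant.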

All 5 comments

@thtrieu, please help. Thanks a lot in advance.

@ankitAMD please help.

Thank you, Ankit.

Thank you @ankitAMD, this helped me solve this error. Thank you again.