Darkflow: InvalidArgumentError: Input to reshape is a tensor with 84500 values, but the requested shape requires a multiple of 5070

Created on 7 Dec 2017 · 6 Comments · Source: thtrieu/darkflow

I'm trying to retrain tiny-yolo-voc to detect only the person class. I'm running on Windows with a GTX 750 Ti. I was getting a ResourceExhaustedError, but after I passed --gpu 0.7 and --batch 4 that error went away. Now I'm getting this error from the reshape op.

Statistics:
person: 5227
Dataset size: 4952
Dataset of 4952 instance(s)
Training statistics:
        Learning rate : 1e-05
        Batch size    : 4
        Epoch number  : 1000
        Backup every  : 2000
2017-12-07 10:04:29.378250: W C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu
\PY\35\tensorflow\core\common_runtime\bfc_allocator.cc:217] Allocator (GPU_0_bfc
) ran out of memory trying to allocate 1.15GiB. The caller indicates that this i
s not a failure, but may mean that there could be performance gains if more memo
ry is available.
2017-12-07 10:04:29.444254: W C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu
\PY\35\tensorflow\core\common_runtime\bfc_allocator.cc:217] Allocator (GPU_0_bfc
) ran out of memory trying to allocate 2.29GiB. The caller indicates that this i
s not a failure, but may mean that there could be performance gains if more memo
ry is available.
Traceback (most recent call last):
  File "C:\Users\Lucas\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\py
thon\client\session.py", line 1323, in _do_call
    return fn(*args)
  File "C:\Users\Lucas\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\py
thon\client\session.py", line 1302, in _run_fn
    status, run_metadata)
  File "C:\Users\Lucas\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\py
thon\framework\errors_impl.py", line 473, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input to reshape i
s a tensor with 84500 values, but the requested shape requires a multiple of 507
0
         [[Node: Reshape = Reshape[T=DT_FLOAT, Tshape=DT_INT32, _device="/job:lo
calhost/replica:0/task:0/device:GPU:0"](BiasAdd_8, Reshape/shape)]]
         [[Node: mul_17/_57 = _Recv[client_terminated=false, recv_device="/job:l
ocalhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/t
ask:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_972_mul_17", t
ensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "flow", line 6, in <module>
    cliHandler(sys.argv)
  File "C:\tandera\yolo\darkflow\cli.py", line 29, in cliHandler
    print('Enter training ...'); tfnet.train()
  File "C:\tandera\yolo\darkflow\net\flow.py", line 52, in train
    fetched = self.sess.run(fetches, feed_dict)
  File "C:\Users\Lucas\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\py
thon\client\session.py", line 889, in run
    run_metadata_ptr)
  File "C:\Users\Lucas\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\py
thon\client\session.py", line 1120, in _run
    feed_dict_tensor, options, run_metadata)
  File "C:\Users\Lucas\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\py
thon\client\session.py", line 1317, in _do_run
    options, run_metadata)
  File "C:\Users\Lucas\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\py
thon\client\session.py", line 1336, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input to reshape i
s a tensor with 84500 values, but the requested shape requires a multiple of 507
0
         [[Node: Reshape = Reshape[T=DT_FLOAT, Tshape=DT_INT32, _device="/job:lo
calhost/replica:0/task:0/device:GPU:0"](BiasAdd_8, Reshape/shape)]]
         [[Node: mul_17/_57 = _Recv[client_terminated=false, recv_device="/job:l
ocalhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/t
ask:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_972_mul_17", t
ensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]


Caused by op 'Reshape', defined at:
  File "flow", line 6, in <module>
    cliHandler(sys.argv)
  File "C:\tandera\yolo\darkflow\cli.py", line 22, in cliHandler
    tfnet = TFNet(FLAGS)
  File "C:\tandera\yolo\darkflow\net\build.py", line 76, in __init__
    self.setup_meta_ops()
  File "C:\tandera\yolo\darkflow\net\build.py", line 139, in setup_meta_ops
    if self.FLAGS.train: self.build_train_op()
  File "C:\tandera\yolo\darkflow\net\help.py", line 15, in build_train_op
    self.framework.loss(self.out)
  File "C:\tandera\yolo\darkflow\net\yolov2\train.py", line 56, in loss
    net_out_reshape = tf.reshape(net_out, [-1, H, W, B, (4 + 1 + C)])
  File "C:\Users\Lucas\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\py
thon\ops\gen_array_ops.py", line 3937, in reshape
    "Reshape", tensor=tensor, shape=shape, name=name)
  File "C:\Users\Lucas\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\py
thon\framework\op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "C:\Users\Lucas\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\py
thon\framework\ops.py", line 2956, in create_op
    op_def=op_def)
  File "C:\Users\Lucas\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\py
thon\framework\ops.py", line 1470, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-
access

InvalidArgumentError (see above for traceback): Input to reshape is a tensor wit
h 84500 values, but the requested shape requires a multiple of 5070
         [[Node: Reshape = Reshape[T=DT_FLOAT, Tshape=DT_INT32, _device="/job:lo
calhost/replica:0/task:0/device:GPU:0"](BiasAdd_8, Reshape/shape)]]
         [[Node: mul_17/_57 = _Recv[client_terminated=false, recv_device="/job:l
ocalhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/t
ask:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_972_mul_17", t
ensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

How can I fix this?

All 6 comments

If I remember correctly, a reshape error means the .weights and .cfg files aren't lined up correctly. Because of the way the program reads those files, a mismatch between them produces this error.

My suggestion: unless you've fiddled with the filters while training on your own dataset, this is likely a problem with the .weights and .cfg files you're using. All the useful .cfg files are already in the cfg folder, so keep downloading new weights and retrying. The ones that finally worked for me were simply yolo.weights and yolo.cfg.

Got it to work! I needed to change the number of filters in the last layer using the formula `filters = num * (#classes + 5)`.
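For anyone wondering where the two numbers in the error come from, here is a rough sketch of the arithmetic. It assumes the stock tiny-yolo-voc settings (13x13 output grid, `num = 5` anchors, 20 VOC classes) plus the batch size of 4 from the log above; the function name `expected_filters` is made up for illustration and is not part of darkflow.

```python
def expected_filters(num, classes):
    """Filters needed in the last convolutional layer of a YOLOv2-style cfg."""
    # Each anchor box predicts 4 coordinates + 1 objectness score + one score per class.
    return num * (classes + 5)


# Assumed values: stock tiny-yolo-voc grid and anchors, batch size from the log.
H = W = 13
num = 5
batch = 4

old_filters = expected_filters(num, classes=20)  # cfg still set up for 20 VOC classes
new_filters = expected_filters(num, classes=1)   # what a single "person" class needs

# What the network actually outputs for one batch with the unchanged last layer.
produced = batch * H * W * old_filters

# What tf.reshape(net_out, [-1, H, W, B, 4 + 1 + C]) needs per image when C = 1.
required_per_image = H * W * num * (4 + 1 + 1)

print(old_filters, new_filters)        # 125 30
print(produced, required_per_image)    # 84500 5070
print(produced % required_per_image)   # non-zero, so the reshape fails
```

Under those assumptions the numbers match the error message exactly: the last layer was still producing 125 filters while the loss expected 30.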

@lucasharada what is `num` in `filters = num * (#classes + 5)`?

Found the way to solve it. In my case, I didn't notice I was changing the filters in the wrong convolutional layer. To train on your own data, besides changing the number of classes, you need to modify the LAST convolutional layer in the config file.
@Estapraq `num` refers to the `num` attribute in the last layer of the config file: the [region] layer. Hope this helps.

@Estapraq `num` is the number of prior anchors in YOLO.

@Estapraq The `num` in that statement can be found under the [region] section at the end of the config file. For me, this was on line 246 in yolo.cfg, where I saw that `num = 5`.
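To catch this before training, something like the following could be used to compare the last [convolutional] layer against the [region] section. This is only a sketch: `parse_sections` and `check_last_layer` are hypothetical helpers, not darkflow functions, and `SAMPLE_CFG` is a trimmed, illustrative fragment rather than a real config file. The parser is hand-rolled because Darknet cfg files repeat section names.

```python
def parse_sections(cfg_text):
    """Parse Darknet-style cfg text into an ordered list of (name, options) pairs."""
    sections = []
    for raw in cfg_text.splitlines():
        line = raw.split('#', 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue
        if line.startswith('[') and line.endswith(']'):
            sections.append((line[1:-1], {}))
        elif '=' in line and sections:
            key, value = line.split('=', 1)
            sections[-1][1][key.strip()] = value.strip()
    return sections


def check_last_layer(cfg_text):
    """Return (actual_filters, expected_filters) for the final conv layer."""
    sections = parse_sections(cfg_text)
    # Walk backwards so we pick the LAST [region] and the LAST [convolutional].
    region = next(opts for name, opts in reversed(sections) if name == 'region')
    last_conv = next(opts for name, opts in reversed(sections) if name == 'convolutional')
    expected = int(region['num']) * (int(region['classes']) + 5)
    return int(last_conv['filters']), expected


# Illustrative fragment only: classes was changed to 1, but filters was left at 125.
SAMPLE_CFG = """
[convolutional]
filters=1024
activation=leaky

[convolutional]
filters=125
activation=linear

[region]
classes=1
num=5
"""

actual, expected = check_last_layer(SAMPLE_CFG)
if actual != expected:
    print("last conv layer has filters=%d, but [region] needs %d" % (actual, expected))
```

In practice you would read your own .cfg file into a string and pass it to `check_last_layer`; a mismatch means the `filters` value in the last convolutional layer still needs updating.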
