Hello,
I am having trouble training RetinaNet with any of the mobilenet backbones.
Whenever I start training on my custom dataset (which works with resnet50) using the command keras_retinanet/bin/train.py --backbone mobilenet128_0.75 --batch-size 2 csv annotations_train.csv classes_to_int_map.csv, I get the following error:
None
Epoch 1/50
2018-12-14 10:05:26.555197: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at gather_nd_op.cc:50 : Invalid argument: indices[274827] = [0, 274827] does not index into param shape [2,272010,1]
Traceback (most recent call last):
File "keras_retinanet/bin/train.py", line 492, in <module>
main()
File "keras_retinanet/bin/train.py", line 487, in main
callbacks=callbacks,
File "/home/marco/.local/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "/home/marco/.local/lib/python3.6/site-packages/keras/engine/training.py", line 1418, in fit_generator
initial_epoch=initial_epoch)
File "/home/marco/.local/lib/python3.6/site-packages/keras/engine/training_generator.py", line 217, in fit_generator
class_weight=class_weight)
File "/home/marco/.local/lib/python3.6/site-packages/keras/engine/training.py", line 1217, in train_on_batch
outputs = self.train_function(ins)
File "/home/marco/.local/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2715, in __call__
return self._call(inputs)
File "/home/marco/.local/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2675, in _call
fetched = self._callable_fn(array_vals)
File "/home/marco/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1439, in __call__
run_metadata_ptr)
File "/home/marco/.local/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 528, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[274827] = [0, 274827] does not index into param shape [2,272010,1]
[[{{node loss/classification_loss/GatherNd_1}} = GatherNd[Tindices=DT_INT64, Tparams=DT_FLOAT, _class=["loc:@training/Adam/gradients/loss/classification_loss/GatherNd_1_grad/ScatterNd"], _device="/job:localhost/replica:0/task:0/device:CPU:0"](classification/concat, *loss/classification_loss/Where)]]
I am up to date with keras, tensorflow, and keras-retinanet.
Thanks for your help,
M
Mobilenet is a community contribution so I can't offer much help here. Strange that it does work for resnet though. What is the content of your classes_to_int_map.csv file?
It is obst,0, since I am currently detecting a single class in the images.
I'm not sure if it's the same error, but I got this one using a mobilenet backbone:
InvalidArgumentError: flat indices[179577, :] = [0, 180077] does not index into param (shape: [1,179928,1]).
I was using the current CPU-only release of tensorflow, but as described in this issue it works on GPU, so I tried the GPU version of tensorflow and mobilenet worked.
If you are using the CPU release, maybe you can try the GPU one.
Thanks for the hint @dredonieto, but unfortunately I do not have a GPU in my machine at the moment, so I cannot install tensorflow-gpu.
I can confirm that mobilenet works when trained on GPU.
Closing this as it appears to be an upstream issue then.
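For anyone landing here later, a minimal sketch of how to read the error message above. This is plain Python and not part of keras-retinanet or tensorflow; the helper name first_out_of_bounds is made up for illustration. The failing node in the log gathers from classification/concat using indices from loss/classification_loss/Where, and the reported index [0, 274827] has a second coordinate larger than the second dimension of the param shape [2,272010,1]. So the indices appear to address more anchors than the tensor being gathered from contains. As far as I understand the tf.gather_nd documentation, an out-of-bounds index raises an error on CPU but silently stores a 0 on GPU, which would be consistent with the GPU runs reported above not crashing; whether the underlying mismatch is still present there is an assumption I have not verified.

```python
def first_out_of_bounds(indices, param_shape):
    """Return the first index tuple that does not fit inside param_shape, or None.

    Each index addresses the leading len(index) dimensions of the parameter
    tensor, which is how the coordinates in the error message are laid out.
    """
    for index in indices:
        for coord, dim in zip(index, param_shape):
            if not 0 <= coord < dim:
                return tuple(index)
    return None


# Values copied from the log in the first post.
param_shape = (2, 272010, 1)
indices = [(0, 100), (0, 274827)]

bad = first_out_of_bounds(indices, param_shape)
if bad is not None:
    # Number of positions the index is past the last valid one.
    overshoot = bad[1] - (param_shape[1] - 1)
    print("index", bad, "is", overshoot, "positions past the end of axis 1")
```

If the same kind of check is run on the shapes of the generated targets and the model output before training, a mismatch in anchor count should show up without having to wait for the loss to fail.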