Mask_rcnn: generator raised StopIteration at the very beginning of epoch

Created on 25 Oct 2019 · 15 Comments · Source: matterport/Mask_RCNN

After encountering the problem mentioned in the title, I repeated the whole setup from scratch to make sure nothing else was wrong.

1) I uninstalled and reinstalled Anaconda, as well as Mask_RCNN.
2) I installed the packages from requirements.txt and ran setup.py, as required.
3) I ran the train_shapes.ipynb notebook.
4) It first downloads the COCO model and runs until it reaches the training cell.
4i) Error: 'Model' object has no attribute 'metrics_tensors'. Solved by downgrading Keras to 2.1.0.
4ii) Error: module 'tensorflow' has no attribute 'random_shuffle'. Solved by updating TensorFlow to 1.13.
4iii) RuntimeError: generator raised StopIteration

I have not been able to get past this error, although I have investigated it thoroughly. It comes from the model.train() function. In the model code, train_generator and val_generator are created by the data_generator() function, which loads the images. From print statements I added, I concluded that execution never enters data_generator(), so no images are loaded, Keras has nothing to train on, and the result is StopIteration.
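A general Python point that may be relevant to the missing print output: calling a generator function only creates a generator object, and the body (including any print calls) does not run until something calls next() on it. The sketch below uses a hypothetical toy_data_generator, not the real data_generator() from Mask_RCNN, to illustrate this.

```python
calls = []  # records when the generator body actually starts running


def toy_data_generator(items):
    """Hypothetical stand-in for data_generator(); not the Mask_RCNN code."""
    calls.append("body started")
    for item in items:
        yield item


# Creating the generator does not execute any of its body yet.
gen = toy_data_generator([1, 2, 3])
started_before_next = list(calls)

# The body runs only once a consumer asks for the first item.
first = next(gen)
started_after_next = list(calls)
```

So if nothing ever consumes the generator (for example because no worker is pulling from it), messages placed inside it would never appear, even though the generator object was created.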

I discovered this issue while training on my own dataset, but I assumed there was a problem with my code, so I deleted everything and tried the sample code instead.

Thanks for your time.

Most helpful comment

Running Mask_RCNN in Google Colab works fine. The default versions of it are:

  • Keras==2.2.5
  • Tensorflow==1.15.0

Changing to the above versions in my personal Anaconda solved the problem.

All 15 comments

I haven't experienced such an issue when training my custom dataset. Can you try without passing the val_generator? Do you have any callbacks that are passed as arguments?

I don't quite understand what you mean. Passing the val_generator where? In the model.py there is this part of the code in the model.train() function

[Image: screenshot of the relevant part of model.train()]

While the top and bottom messages are being printed, messages I've put into the data_generator(...) function do not get printed.

@agelosk Sorry about that. That was for a different model (PSPNet) and a different issue. I posted it here by mistake.

This looks to me like a Python 3.7 issue; take a look at it here.

Are the messages you put inside data_generator() missing even when you call data_generator() for train_dataset, or do they appear for train_dataset and only go missing when you call data_generator() for val_dataset?

It happens for both generators. I even tried copying the same function under a different name and calling that, but still nothing. Could it have to do with the arguments passed?

When I downloaded Mask_RCNN a month ago, train_shapes ran fine with no errors, but I am not sure whether I already had Python 3.7 back then.

Update: Even after installing Python 3.6, the problem has not been resolved.

I ran into this problem too. I followed the README to install, but my TensorFlow version is tensorflow-2.0.0. We may need tensorflow-gpu-2.2.0, so I'm still trying to figure it out.

Well, try downgrading and see if it works. These are my versions:
tensorflow, tensorflow-base, tensorflow-gpu and tensorflow-estimator are 1.14.0;
keras, keras-base and keras-gpu are 2.2.4, keras-preprocessing is 1.1.0, and Python is 3.7.4.

The 2.0.0 GPU version failed. Maybe we do need to downgrade.

https://github.com/matterport/Mask_RCNN/pull/1817 presumably fixed compatibility issues with Tensorflow 2.0. Maybe worth a test for this issue?

@agelosk:
I ran into the same issues and fixed them by downgrading and modifying the code a little. Now I am stuck on the StopIteration issue.
Platform: Windows 10, TF 1.15, Keras 2.1.0
If you have found a way out, please share.

Configurations:
BACKBONE resnet101
BACKBONE_STRIDES [4, 8, 16, 32, 64]
BATCH_SIZE 2
BBOX_STD_DEV [0.1 0.1 0.2 0.2]
COMPUTE_BACKBONE_SHAPE None
DETECTION_MAX_INSTANCES 100
DETECTION_MIN_CONFIDENCE 0.7
DETECTION_NMS_THRESHOLD 0.3
FPN_CLASSIF_FC_LAYERS_SIZE 1024
GPU_COUNT 1
GRADIENT_CLIP_NORM 5.0
IMAGES_PER_GPU 2
IMAGE_CHANNEL_COUNT 3
IMAGE_MAX_DIM 1024
IMAGE_META_SIZE 14
IMAGE_MIN_DIM 800
IMAGE_MIN_SCALE 0
IMAGE_RESIZE_MODE square
IMAGE_SHAPE [1024 1024 3]
LEARNING_MOMENTUM 0.9
LEARNING_RATE 0.001
LOSS_WEIGHTS {'rpn_class_loss': 1.0, 'rpn_bbox_loss': 1.0, 'mrcnn_class_loss': 1.0, 'mrcnn_bbox_loss': 1.0, 'mrcnn_mask_loss': 1.0}
MASK_POOL_SIZE 14
MASK_SHAPE [28, 28]
MAX_GT_INSTANCES 100
MEAN_PIXEL [123.7 116.8 103.9]
MINI_MASK_SHAPE (56, 56)
NAME kangaroo_cfg
NUM_CLASSES 2
POOL_SIZE 7
POST_NMS_ROIS_INFERENCE 1000
POST_NMS_ROIS_TRAINING 2000
PRE_NMS_LIMIT 6000
ROI_POSITIVE_RATIO 0.33
RPN_ANCHOR_RATIOS [0.5, 1, 2]
RPN_ANCHOR_SCALES (32, 64, 128, 256, 512)
RPN_ANCHOR_STRIDE 1
RPN_BBOX_STD_DEV [0.1 0.1 0.2 0.2]
RPN_NMS_THRESHOLD 0.7
RPN_TRAIN_ANCHORS_PER_IMAGE 256
STEPS_PER_EPOCH 131
TOP_DOWN_PYRAMID_SIZE 256
TRAIN_BN False
TRAIN_ROIS_PER_IMAGE 200
USE_MINI_MASK True
USE_RPN_ROIS True
VALIDATION_STEPS 50
WEIGHT_DECAY 0.0001

WARNING:tensorflow:From c:users\kumakal\appdata\local\programs\python\python36\lib\site-packages\keras\backend\tensorflow_backend.py:492: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

WARNING:tensorflow:From c:users\kumakal\appdata\local\programs\python\python36\lib\site-packages\keras\backend\tensorflow_backend.py:63: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.

WARNING:tensorflow:From c:users\kumakal\appdata\local\programs\python\python36\lib\site-packages\keras\backend\tensorflow_backend.py:3630: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.

WARNING:tensorflow:From c:users\kumakal\appdata\local\programs\python\python36\lib\site-packages\keras\backend\tensorflow_backend.py:3458: The name tf.nn.max_pool is deprecated. Please use tf.nn.max_pool2d instead.

WARNING:tensorflow:From c:users\kumakal\appdata\local\programs\python\python36\lib\site-packages\keras\backend\tensorflow_backend.py:1822: The name tf.image.resize_nearest_neighbor is deprecated. Please use tf.compat.v1.image.resize_nearest_neighbor instead.

WARNING:tensorflow:From c:users\kumakal\appdata\local\programs\python\python36\lib\site-packages\keras\backend\tensorflow_backend.py:1208: calling reduce_max_v1 (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead
WARNING:tensorflow:From c:users\kumakal\appdata\local\programs\python\python36\lib\site-packages\keras\backend\tensorflow_backend.py:1242: calling reduce_sum_v1 (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead
WARNING:tensorflow:From c:users\kumakal\appdata\local\programs\python\python36\lib\site-packages\tensorflow_core\python\ops\array_ops.py:1475: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
WARNING:tensorflow:From C:\Users\kumakalDocuments\WorkAccessories\Mask_RCNN\mrcnnmodel.py:559: The name tf.random_shuffle is deprecated. Please use tf.random.shuffle instead.

WARNING:tensorflow:From C:\Users\kumakalDocuments\WorkAccessories\Mask_RCNN\mrcnn\utils.py:202: The name tf.log is deprecated. Please use tf.math.log instead.

WARNING:tensorflow:From C:\Users\kumakalDocuments\WorkAccessories\Mask_RCNN\mrcnnmodel.py:606: calling crop_and_resize_v1 (from tensorflow.python.ops.image_ops_impl) with box_ind is deprecated and will be removed in a future version.
Instructions for updating:
box_ind is deprecated, use box_indices instead
WARNING:tensorflow:From c:users\kumakal\appdata\local\programs\python\python36\lib\site-packages\keras\backend\tensorflow_backend.py:158: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.

WARNING:tensorflow:From c:users\kumakal\appdata\local\programs\python\python36\lib\site-packages\keras\backend\tensorflow_backend.py:163: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

WARNING:tensorflow:From c:users\kumakal\appdata\local\programs\python\python36\lib\site-packages\keras\backend\tensorflow_backend.py:168: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

WARNING:tensorflow:From c:users\kumakal\appdata\local\programs\python\python36\lib\site-packages\keras\backend\tensorflow_backend.py:172: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.

WARNING:tensorflow:From c:users\kumakal\appdata\local\programs\python\python36\lib\site-packages\keras\backend\tensorflow_backend.py:181: The name tf.is_variable_initialized is deprecated. Please use tf.compat.v1.is_variable_initialized instead.

WARNING:tensorflow:From c:users\kumakal\appdata\local\programs\python\python36\lib\site-packages\keras\backend\tensorflow_backend.py:188: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead.

Starting at epoch 0. LR=0.001

Checkpoint Path: ./kangaroo_cfg20191029T0756\mask_rcnn_kangaroo_cfg_{epoch:04d}.h5
Selecting layers to train
fpn_c5p5 (Conv2D)
fpn_c4p4 (Conv2D)
fpn_c3p3 (Conv2D)
fpn_c2p2 (Conv2D)
fpn_p5 (Conv2D)
fpn_p2 (Conv2D)
fpn_p3 (Conv2D)
fpn_p4 (Conv2D)
In model: rpn_model
rpn_conv_shared (Conv2D)
rpn_class_raw (Conv2D)
rpn_bbox_pred (Conv2D)
mrcnn_mask_conv1 (TimeDistributed)
mrcnn_mask_bn1 (TimeDistributed)
mrcnn_mask_conv2 (TimeDistributed)
mrcnn_mask_bn2 (TimeDistributed)
mrcnn_class_conv1 (TimeDistributed)
mrcnn_class_bn1 (TimeDistributed)
mrcnn_mask_conv3 (TimeDistributed)
mrcnn_mask_bn3 (TimeDistributed)
mrcnn_class_conv2 (TimeDistributed)
mrcnn_class_bn2 (TimeDistributed)
mrcnn_mask_conv4 (TimeDistributed)
mrcnn_mask_bn4 (TimeDistributed)
mrcnn_bbox_fc (TimeDistributed)
mrcnn_mask_deconv (TimeDistributed)
mrcnn_class_logits (TimeDistributed)
mrcnn_mask (TimeDistributed)
WARNING:tensorflow:From c:users\kumakal\appdata\local\programs\python\python36\lib\site-packages\keras\optimizers.py:711: The name tf.train.Optimizer is deprecated. Please use tf.compat.v1.train.Optimizer instead.

c:users\kumakal\appdata\local\programs\python\python36\lib\site-packages\tensorflow_core\python\framework\indexed_slices.py:424: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
c:users\kumakal\appdata\local\programs\python\python36\lib\site-packages\tensorflow_core\python\framework\indexed_slices.py:424: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
c:users\kumakal\appdata\local\programs\python\python36\lib\site-packages\tensorflow_core\python\framework\indexed_slices.py:424: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "

WARNING:tensorflow:From c:users\kumakal\appdata\local\programs\python\python36\lib\site-packages\keras\backend\tensorflow_backend.py:953: The name tf.assign_add is deprecated. Please use tf.compat.v1.assign_add instead.

WARNING:tensorflow:From c:users\kumakal\appdata\local\programs\python\python36\lib\site-packages\keras\backend\tensorflow_backend.py:675: calling Constant.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
WARNING:tensorflow:From c:users\kumakal\appdata\local\programs\python\python36\lib\site-packages\keras\backend\tensorflow_backend.py:940: The name tf.assign is deprecated. Please use tf.compat.v1.assign instead.

WARNING:tensorflow:From c:users\kumakal\appdata\local\programs\python\python36\lib\site-packages\kerascallbacks.py:705: The name tf.summary.merge_all is deprecated. Please use tf.compat.v1.summary.merge_all instead.

WARNING:tensorflow:From c:users\kumakal\appdata\local\programs\python\python36\lib\site-packages\kerascallbacks.py:708: The name tf.summary.FileWriter is deprecated. Please use tf.compat.v1.summary.FileWriter instead.

Epoch 1/5


StopIteration Traceback (most recent call last)
in
17 model.load_weights('mask_rcnn_coco.h5', by_name=True, exclude=["mrcnn_class_logits", "mrcnn_bbox_fc", "mrcnn_bbox", "mrcnn_mask"])
18 # train weights (output layers or 'heads')
---> 19 model.train(train_set, test_set, learning_rate=config.LEARNING_RATE, epochs=5, layers='heads')

~Documents\WorkAccessories\Mask_RCNN\mrcnnmodel.py in train(self, train_dataset, val_dataset, learning_rate, epochs, layers, augmentation, custom_callbacks, no_augmentation_sources)
2380 max_queue_size=100,
2381 workers=workers,
-> 2382 use_multiprocessing=True,
2383 )
2384 self.epoch = max(self.epoch, epochs)

c:users\kumakal\appdata\local\programs\python\python36\lib\site-packages\keras\legacy\interfaces.py in wrapper(*args, **kwargs)
85 warnings.warn('Update your ' + object_name +
86 ' call to the Keras 2 API: ' + signature, stacklevel=2)
---> 87 return func(*args, **kwargs)
88 wrapper._original_function = func
89 return wrapper

c:users\kumakal\appdata\local\programs\python\python36\lib\site-packages\keras\engine\training.py in fit_generator(self, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
2063 batch_index = 0
2064 while steps_done < steps_per_epoch:
-> 2065 generator_output = next(output_generator)
2066
2067 if not hasattr(generator_output, '__len__'):

c:users\kumakal\appdata\local\programs\python\python36\lib\site-packages\keras\utils\data_utils.py in get(self)
708 all_finished = all([not thread.is_alive() for thread in self._threads])
709 if all_finished and self.queue.empty():
--> 710 raise StopIteration()
711 else:
712 time.sleep(self.wait_time)

StopIteration:
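One possible reading of the last frame above: get() raises StopIteration as soon as every worker thread is finished and the queue is empty. If the list of worker threads is empty, all(...) over it is True, so the error would fire immediately, before any data is requested. The sketch below is a simplified, hypothetical stand-in (ToyEnqueuer and try_get are illustrative names, not the Keras implementation), and it assumes that workers ended up as 0 in the failing runs, which the thread does not confirm.

```python
import queue
import threading
import time


class ToyEnqueuer:
    """Simplified, hypothetical stand-in for the get() loop in the traceback."""

    def __init__(self, generator, workers):
        self.generator = generator
        self.workers = workers
        self.queue = queue.Queue()
        self._threads = []
        self._lock = threading.Lock()

    def start(self):
        def fill():
            # Pull items from the generator until it is exhausted.
            while True:
                with self._lock:
                    try:
                        item = next(self.generator)
                    except StopIteration:
                        return
                self.queue.put(item)

        for _ in range(self.workers):
            thread = threading.Thread(target=fill, daemon=True)
            self._threads.append(thread)
            thread.start()

    def get(self):
        while True:
            if not self.queue.empty():
                return self.queue.get()
            # With zero threads this is all([]) which is True.
            all_finished = all(not t.is_alive() for t in self._threads)
            if all_finished and self.queue.empty():
                raise StopIteration()
            time.sleep(0.01)


def try_get(workers):
    """Return the first item, or the string 'StopIteration' if get() fails."""
    enqueuer = ToyEnqueuer(iter([10, 20]), workers)
    enqueuer.start()
    try:
        return enqueuer.get()
    except StopIteration:
        return "StopIteration"


no_workers = try_get(0)   # no thread ever touches the generator
one_worker = try_get(1)   # a single thread feeds the queue
```

Under that assumption, the generator body would never be entered, which would match the observation that messages inside data_generator() are not printed.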

@sainatarajan I ran train_shapes with the same versions you mentioned above, but nothing changed.
@kpkumar3 Of course, if I had found a solution I would have mentioned it before closing the thread. Sadly, I still have no solution.

I am using MRCNN for a project, and I am stuck right now and cannot move forward. Is there an alternative solution?

@agelosk Do you have any other system where you can run the code? Try Google Colab if you don't have a secondary PC.

I had the same problem. Keras 2.1.3 may help you!

Running Mask_RCNN in Google Colab works fine. The default versions of it are:

  • Keras==2.2.5
  • Tensorflow==1.15.0

Changing to the above versions in my personal Anaconda solved the problem.
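A minimal sketch for comparing an environment against the versions reported as working above. The helper name version_mismatches and the example_installed values are hypothetical; in a real environment the installed versions could be read from keras.__version__ and tensorflow.__version__.

```python
def version_mismatches(installed, expected):
    """Return {package: (installed, expected)} for packages that differ."""
    return {
        name: (installed.get(name), want)
        for name, want in expected.items()
        if installed.get(name) != want
    }


# Versions reported as working in this thread (Google Colab defaults at the time).
EXPECTED = {"keras": "2.2.5", "tensorflow": "1.15.0"}

# Example input only; replace with the versions found in your own environment.
example_installed = {"keras": "2.1.0", "tensorflow": "1.15.0"}

mismatches = version_mismatches(example_installed, EXPECTED)
```

An empty result means the environment matches the reported versions; any entry names a package to reinstall at the expected version.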

Thank you agelosk!! It works.

Hey, what if I get the same error you mentioned in step 4iii), but with Python 3.8.5? What should I do in that case?
