Callers of generator_queue(), such as fit_generator(), have a race condition when the generator exits after yielding a number of samples equal to or slightly greater than samples_per_epoch: the consumer will occasionally fetch None from the queue instead of the final elements.
According to the documentation, the generator is expected to loop indefinitely, so if by _"when the generator exits"_ you mean that it has finished and that another next() call will raise StopIteration, then this is not a necessary fix.
You might want to iterate over a finite data set exactly once; right now you have to add an unspecified number of padding elements to do that. Also, evaluate_generator() and predict_generator() have the same problem. Failing a fix, I should at least update the docs to say "generator must return at least (samples_per_epoch * nb_epoch) elements".
If you want to iterate over a finite data set just once, then you're expected to set nb_epoch=1 when calling fit_generator(). Doesn't that work?
The thing is that the generator runs on its own thread to maximize GPU throughput (and there even used to be multi-threading support!), which is why it should never stop. The thread fills a queue of samples that handles everything, and for that to work the generator must never finish.
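For context, here is a rough sketch of that pattern (not the actual Keras internals): a background thread keeps pulling batches from the never-ending generator into a bounded queue, so the training loop can grab the next batch without waiting on preprocessing.

```python
import queue
import threading

def start_generator_queue(generator, max_q_size=10):
    # A background thread pulls batches from the (never-ending) generator
    # into a bounded queue; put() blocks when the queue is full, so at most
    # max_q_size batches are buffered ahead of the training loop.
    q = queue.Queue(maxsize=max_q_size)
    stop = threading.Event()

    def fill():
        while not stop.is_set():
            q.put(next(generator))

    thread = threading.Thread(target=fill)
    thread.daemon = True
    thread.start()
    return q, stop
```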
> If you want to iterate over a finite data set just once, then you're expected to set nb_epoch=1 when calling fit_generator(). Doesn't that work?
Yes, that's what I'm doing, but because of the race condition the generator has to stay alive even after it has yielded all of its samples. The fix is simple: just make sure the queue is empty before you check the _stop event.
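To make that concrete, the consumer side would look roughly like this (a sketch of the idea, not the actual patch): keep fetching while there is anything in the queue and only honor the stop event once it has been drained.

```python
import queue

def drain_then_stop(q, stop_event, wait_time=0.05):
    # Only give up once the queue is empty AND the producer has signalled
    # that it is done, so the last few batches are never dropped.
    while True:
        try:
            yield q.get(timeout=wait_time)
        except queue.Empty:
            if stop_event.is_set():
                return
```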
Again, the generator should yield its samples indefinitely and should therefore never exit. You're not making much sense.
"generator must return at least (samples_per_epoch * nb_epoch) elements".
No, the generator should yield exactly samples_per_epoch distinct samples repeatedly, forever.
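In other words, something like this (a minimal sketch, assuming the data is already in memory as arrays X and y):

```python
def looping_generator(X, y, batch_size):
    # Yield the same samples_per_epoch samples over and over, forever;
    # fit_generator never sees a StopIteration.
    n = len(X)
    while True:
        for start in range(0, n, batch_size):
            yield X[start:start + batch_size], y[start:start + batch_size]
```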
My dataset doesn't fit into memory, so I load new files and create a new generator after every epoch, passing it to fit_generator() with nb_epoch=1.
I guess I could use train_on_batch() or make the generator loop over its data multiple times, but I don't see why finite generators are a problem when you know exactly how many samples will be consumed.
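For reference, my per-epoch workflow looks roughly like this (a sketch: load_epoch_files() is a hypothetical stand-in for the file loading and heavy preprocessing, and model is a compiled Keras model defined elsewhere):

```python
batch_size = 32
total_epochs = 10

for epoch in range(total_epochs):
    X, y = load_epoch_files(epoch)             # load the chunk that fits in memory
    gen = looping_generator(X, y, batch_size)  # as sketched above; today it must still yield forever
    model.fit_generator(gen, samples_per_epoch=len(X), nb_epoch=1)
```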
If you do that you lose the immense performance benefit of being able to buffer up samples in the queue at the end of an epoch for the start of the next one.
Unlike the first batches of the first epoch, where your model has to wait for the initial data to be processed, every successive epoch can start immediately because the generator queue keeps filling in the background.
You really shouldn't kill the data generator thread like you're doing, for this reason alone.
What do you even gain from initializing a data generator per epoch?
As part of the preprocessing I have to evaluate the input on a separate large model that is too big to coexist with the training model in RAM. So I can't easily do this inside the generator, since it would swap out the model being trained and they would both grind to a halt. Probably suboptimal, but my epochs are so long that it isn't a huge performance hit.
That seems like a legitimate use case (albeit exotic). It seems like an uphill battle to reload models continuously during training, and nothing was built with this in mind as far as I know.
When you say RAM, do you mean CPU memory or GPU VRAM? I'd just focus on having enough memory so that both models fit, though that could be difficult with VRAM until summer. RAM is cheap though! What is your memory usage? I'm curious.
You could use an on_epoch_end callback to communicate over a network; that's probably your safest bet. There's also an option to pass a generator to validation_data in fit_generator: there the amount of data to loop over is finite, and a StopIteration indicates the end.
More broadly, I'd like Keras to have better support for generators -- they're a handy way of working with large amounts of data.
Similar problem here. I understand that the generator should yield indefinitely, and that does "fix" the problem. But to me, requiring the generator to yield indefinitely IS the problem. Here is why:
I'm using predict_generator() to classify images. My generator code is similar to this (simplified):
def gen():
    for i in range(100):
        # image_paths is the full list of image file paths (simplified);
        # note the generator stops for good after 100 batches
        yield load_images(image_paths[i * batch_size:(i + 1) * batch_size])
Keras returns this error:
ValueError: output of generator should be a tuple (x, y, sample_weight) or (x, y). Found: None
After spending about an hour debugging my code to find out why my generator was returning None, I finally stumbled upon this thread and realized that Keras expects generators to yield indefinitely, even when they don't have anything else to yield.
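For anyone else who hits this: given the current behavior, the workaround is apparently to keep the generator alive after its last useful batch, e.g. by wrapping the loop (same placeholder names as the example above):

```python
def gen():
    while True:  # keep yielding so Keras never pulls None off the queue
        for i in range(100):
            yield load_images(image_paths[i * batch_size:(i + 1) * batch_size])
```

Keras stops pulling once it has the requested number of samples, so the extra iterations are never actually consumed.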
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs, but feel free to re-open it if needed.