Keras: predict_generator errors out with predictions of varying length

Created on 5 Jan 2018 · 6 comments · Source: keras-team/keras

Please make sure that the boxes below are checked before you submit your issue. If your issue is an implementation question, please ask your question on StackOverflow or join the Keras Slack channel and ask there instead of filing a GitHub issue.

Thank you!

  • [X] Check that you are up-to-date with the master branch of Keras. You can update with:
    pip install git+git://github.com/keras-team/keras.git --upgrade --no-deps

  • [X] If running on TensorFlow, check that you are up-to-date with the latest version. The installation instructions can be found here.

  • [X] If running on Theano, check that you are up-to-date with the master branch of Theano. You can update with:
    pip install git+git://github.com/Theano/Theano.git --upgrade --no-deps

  • [X] Provide a link to a GitHub Gist of a Python script that can reproduce your issue (or just copy the script here if it is short).

When running prediction on outputs of varying length, predict_generator throws an error when trying to concatenate the predictions.
One can write a quick predict_on_batch loop oneself, but a more permanent fix would be nice.
Code to reproduce:

import keras.layers as L
from keras.models import Model
import numpy as np

x_train = np.array([[[1]], [[1], [1]]])

def generate_arrays(X):
    while 1:
        for i in X:
            yield np.array(i).reshape((1, len(i), 1))


inp = L.Input((None, 1))
out = L.TimeDistributed(L.Dense(1))(inp)
model = Model(inp, out)
model.compile('adam', 'mse')
model.predict_generator(generate_arrays(x_train), steps=2)

Offending lines are with np.concatenate() :
https://github.com/keras-team/keras/blob/35e10e91172c2fb716d5a6ec489ae71e0f488ca3/keras/engine/training.py#L2443-L2451
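The failure can be reproduced with NumPy alone: the two batches yielded by the generator above have shapes (1, 1, 1) and (1, 2, 1), whose timestep dimensions differ, so concatenating along the batch axis raises a ValueError.

```python
import numpy as np

# Minimal reproduction of the underlying failure, outside Keras:
# two prediction batches whose timestep dimensions differ, matching
# the shapes (1, 1, 1) and (1, 2, 1) yielded by the generator above.
short_batch = np.zeros((1, 1, 1))
long_batch = np.zeros((1, 2, 1))
try:
    np.concatenate([short_batch, long_batch])  # what predict_generator attempts
    failed = False
except ValueError:
    failed = True
print("concatenate raised ValueError:", failed)
```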

Checking whether the output lengths match, and returning the outputs as a list instead of an np.array when they do not, would fix the issue.
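That check could look roughly like the sketch below (concat_or_list is a hypothetical helper name, not Keras code): concatenate only when every batch shares the same non-batch shape, and fall back to a plain list otherwise.

```python
import numpy as np

def concat_or_list(batch_outs):
    # Hypothetical fix sketch: concatenate batch outputs only when their
    # non-batch dimensions match; otherwise return them as a list.
    shapes = {out.shape[1:] for out in batch_outs}
    if len(shapes) == 1:
        return np.concatenate(batch_outs)
    return list(batch_outs)

same = concat_or_list([np.zeros((1, 3, 1)), np.zeros((1, 3, 1))])    # one ndarray
ragged = concat_or_list([np.zeros((1, 1, 1)), np.zeros((1, 2, 1))])  # list of arrays
```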

All 6 comments

Did you find any solution for this?

A bit annoying, but this works fine:

results = []
gen = generate_arrays(x_train)
for i in range(steps):
    data = next(gen)
    results.append(model.predict_on_batch(data))

Of course that can be worked around by using predict_on_batch, but only by losing the convenience of predict_generator (queueing, looping, progbar, currying, aggregation).

It would be much nicer if Keras were agnostic about the length of each batch here, just as with fit_generator and evaluate_generator.

This could be implemented without breaking existing callers by adding a keyword argument (say, as_list=False) that switches to the new behaviour: simply returning a list of all batch results.

    if len(all_outs) == 1:
        if as_list:
            return all_outs[0]
        else:
            if steps_done == 1:
                return all_outs[0][0]
            else:
                return np.concatenate(all_outs[0])
    if as_list:
        return all_outs
    else:
        if steps_done == 1:
            return [out[0] for out in all_outs]
        else:
            return [np.concatenate(out) for out in all_outs]

Well, this remains a problem for me as well. What bothers me is that if I get back the full list, how would I calculate the class labels? I know I can use argmax and do it manually, but Keras used to have a function called predict_classes, and there is no such function here. Is there any other way around this?
One other thing I came across: if I am using Masking, then all the masked labels' predictions come out the same as the last true prediction in real values. These are the results from predict_generator; normal fit and predict work quite fine.
Cheers :)
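For the predict_classes question above, the missing functionality can be recovered per batch with an argmax over the class axis; the shapes below are illustrative stand-ins for variable-length softmax outputs.

```python
import numpy as np

# Illustrative per-batch softmax-style outputs of varying length:
# 4 classes over 2 and 3 timesteps respectively.
batch_preds = [np.random.rand(1, 2, 4), np.random.rand(1, 3, 4)]

# Stand-in for the removed predict_classes: argmax over the class axis.
class_labels = [np.argmax(p, axis=-1) for p in batch_preds]
```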

Neither solution works for me. What other solutions are suggested?

import numpy as np
from tqdm import tqdm

predictions = []
for batch_x, batch_test in tqdm(test_generator(np.array(data_x)), desc="Predicting"):
    batch_pred = model.predict_on_batch([batch_x, batch_test])
    y_hat = np.argmax(batch_pred, axis=-1)
    predictions.extend(y_hat)