Keras: How to batch the validation_data in fit_generator()?

Created on 12 May 2016 · 11 comments · Source: keras-team/keras

I have a dataset (X, y) that is a bit large, so I have to use batch_size = 20 for training. However, when I try to use fit_generator() this way:

history = model.fit_generator( datagen.flow( X, y, batch_size = BATCH_SIZE, shuffle = True),
        samples_per_epoch = len(X), nb_epoch = 15, callbacks = callbacks,
        validation_data = ( X, y ), verbose = 1, show_accuracy = True )

it seems like the whole dataset gets loaded into GPU memory for validation, resulting in a memory overflow. When I specify validation_data = ( X[:20], y[:20] ) it works, but I would like to validate over the whole dataset, not only the first 20 items.

Is there anything like batch_size for validation data or any other way to use it?

All 11 comments

validation_data can also be a generator.

model.fit_generator( ..., 
                    validation_data=val_datagen.flow(val_X, val_y, batch_size=BATCH_SIZE), 
                    nb_val_samples=val_X.shape[0])

@joelthchao Thanks, but this does not work; model.py has this code for the generator case:

if hasattr(validation_data, 'next'):
    # assumed to be generator
    # TODO: call self.evaluate_generator()
    _stop.set()
    raise NotImplementedError()

This might be a problem with 0.3.1 that is fixed in a more recent version, though.

@lazydroid Yeah, you may need to upgrade to Keras 1.0.2 to enjoy this feature.
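
For readers on newer versions: in Keras 2 the nb_val_samples argument was replaced by validation_steps, which counts validation batches rather than samples. A minimal sketch of the same call in the Keras 2 API, reusing the val_datagen, val_X, val_y names from the comment above:

model.fit_generator(...,
                    validation_data=val_datagen.flow(val_X, val_y, batch_size=BATCH_SIZE),
                    validation_steps=val_X.shape[0] // BATCH_SIZE)  # number of batches, not samples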

validation_data as a generator works for me.
@lazydroid Should this be closed?

@lazydroid I just ran into the inverse situation.
Some metrics, like the F-score, have to be computed over the whole validation or test dataset in one pass.
Splitting those datasets into small batches and then averaging the metric over the batches does NOT give the same (or the correct) output.

It seems that Keras's default way to train, test, or validate is to split the data into batches.

For testing, I can use test_on_batch and feed it the whole test dataset.

But for validation, I don't see an easy solution.

Could you please provide a simple way?

I think there should be a boolean argument for _model.fit_ to indicate the choice.
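
To make the batch-averaging problem concrete, here is a tiny sketch (toy labels, scikit-learn) showing that the mean of per-batch F1 scores is not the F1 score of the whole set:

import numpy as np
from sklearn.metrics import f1_score

y_true = np.array([1, 0, 1, 1])
y_pred = np.array([1, 0, 0, 1])

# F1 per batch of size 2, then averaged:
f1_a = f1_score(y_true[:2], y_pred[:2])   # 1.0
f1_b = f1_score(y_true[2:], y_pred[2:])   # 0.666...
print((f1_a + f1_b) / 2)                  # 0.833...

# F1 over the whole set in one pass:
print(f1_score(y_true, y_pred))           # 0.8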

@jcyk I don't know if this is the correct solution for your problem, but before using the generator you may call generator.fit() for ZCA whitening and other things that depend on the whole dataset to work properly. It might be a good idea to look into that.

I would like to reopen this issue, as it seems to me that with the current API it's unnaturally difficult to express the very common situation where the exact same validation set is used on each epoch.

As with @lazydroid, my entire validation set is far too large to be treated as a single batch. Only in the case that the entire validation set fits in GPU memory (almost never in practice?) is the current API sufficient.

However, with the current API, the only way I see to do this is to use a validation generator that repeats itself exactly every validation_steps batches.

That's pretty unnatural, though (a sketch of that workaround is below). I think the API should be modified to handle this very common use case.
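
A minimal sketch of that repeating-generator workaround, where val_X, val_y, train_gen, train_steps, and epochs are placeholder names and the validation arrays are assumed to fit in host memory:

def repeat_validation(val_X, val_y, batch_size):
    # Cycles through the fixed validation set forever, one batch at a time.
    # With validation_steps set to the matching batch count, each epoch
    # sees the whole validation set exactly once.
    while True:
        for start in range(0, len(val_X), batch_size):
            yield val_X[start:start + batch_size], val_y[start:start + batch_size]

validation_steps = -(-len(val_X) // batch_size)  # ceiling division
model.fit_generator(train_gen, steps_per_epoch=train_steps, epochs=epochs,
                    validation_data=repeat_validation(val_X, val_y, batch_size),
                    validation_steps=validation_steps)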

A backwards-compatible way to do it would be to optionally allow validation_data to be a method that returns a generator that halts after a single pass through the validation set. Then, after each epoch, a new validation generator would be fetched. Backwards compatibility would be achieved by checking for a next method.

Alternatively, allow validation_data when passed as an array to have an extra dimension for batches.

Also, the Tensorboard callback cannot calculate histograms when a generator is used for the validation data.

Hello,
I am trying to use model.fit_generator with a custom Callback that accesses the validation data. However, whatever I do, the validation data inside the Callback is always None.

import numpy as np
from keras.callbacks import Callback
from sklearn.metrics import f1_score, precision_score, recall_score

class RecallMetrics(Callback):
    def on_train_begin(self, logs=None):
        print('RecallMetrics ... validating')
        self.val_f1s = []
        self.val_recalls = []
        self.val_precisions = []

    def on_epoch_end(self, epoch, logs=None):
        if self.validation_data is None:
            print('Error: validation_data is None')
            return
        val_predict = np.asarray(self.model.predict(self.validation_data[0])).round()
        val_targ = self.validation_data[1]
        _val_f1 = f1_score(val_targ, val_predict)
        _val_recall = recall_score(val_targ, val_predict)
        _val_precision = precision_score(val_targ, val_predict)
        self.val_f1s.append(_val_f1)
        self.val_recalls.append(_val_recall)
        self.val_precisions.append(_val_precision)
        print(' - val_f1: %f - val_precision: %f - val_recall: %f'
              % (_val_f1, _val_precision, _val_recall))

history = model.fit_generator(generator=train_gen,
                                  validation_data=validate_gen,
                                  # validation_data=None,
                                  steps_per_epoch=len(train_file_list),
                                  validation_steps=len(val_file_list) * 3,
                                  verbose=2,
                                  epochs=int(tc.config["LUNA16"]["epochs"]),
                                  callbacks=callbacks,
                                  workers=multiprocessing.cpu_count(),
                                  use_multiprocessing=True)

How can I access validation data from a custom Callback when using fit_generator?

Best,
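
One workaround that is often suggested for this, sketched here with hypothetical val_X, val_y arrays: since Keras leaves self.validation_data unset when fit_generator receives a validation generator, give the callback its own reference to the validation data instead.

import numpy as np
from keras.callbacks import Callback
from sklearn.metrics import f1_score

class ValidationF1(Callback):
    # Holds its own copy of the validation arrays, so it does not
    # depend on self.validation_data being populated by Keras.
    def __init__(self, val_X, val_y):
        super(ValidationF1, self).__init__()
        self.val_X = val_X
        self.val_y = val_y

    def on_epoch_end(self, epoch, logs=None):
        val_predict = np.round(self.model.predict(self.val_X))
        print(' - val_f1: %f' % f1_score(self.val_y, val_predict))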

Hello, how are you? I apologize for the inconvenience; could you help me? I'm trying to create my own generator based on the comments above. However, when I call model.fit_generator, I notice that my network does not use batch_size: for example, with 32676 images and a batch_size of 64, I should get 510 iterations per epoch, but my network runs 32676 iterations per epoch. My dataset is large and uses two-channel images, so I need to write my own generator. I cannot use ImageDataGenerator, flow_from_directory, and model.fit_generator directly from Keras, because those only work with 1- and 3-channel images. Would it be possible for you to help me?

I also wrote a generator for validation; that's why I use validationGenerator().

Here is my generator:

######################## Generator ##################################

import os
import numpy as np
import imageio
from tqdm import tqdm
from skimage.transform import resize
from keras.utils import np_utils

def trainingGenerator():
    train_Class1_dir = '/media/HD500/RGB_MIN/train/Class1'
    train_Class2_dir = '/media/HD500/RGB_MIN/train/Class2'

    ############################ Class 1 ###############################
    X_trainP = []
    trainP_ids = next(os.walk(train_Class1_dir))[2]
    for n, id_ in tqdm(enumerate(trainP_ids), total=len(trainP_ids)):
        treinamento = train_Class1_dir + '/' + id_
        X_trainP.append(treinamento)
    Y_trainP = np.ones((len(X_trainP), 1), dtype=np.uint8)

    ############################ Class 2 ###############################
    X_trainPN = []
    trainPN_ids = next(os.walk(train_Class2_dir))[2]
    for n, id_ in tqdm(enumerate(trainPN_ids), total=len(trainPN_ids)):
        treinamento = train_Class2_dir + '/' + id_
        X_trainPN.append(treinamento)
    Y_trainPN = np.zeros((len(X_trainPN), 1), dtype=np.uint8)

    ######################## Dataset of Train ##########################
    X_trainFinal = X_trainP + X_trainPN
    Y_train = np.concatenate((Y_trainP, Y_trainPN), axis=0)
    num_classes = np.unique(Y_train).shape[0]
    Y_train = np_utils.to_categorical(Y_train, num_classes)  # one-hot encode the labels

    ############################ Image #################################
    img_width, img_height, img_channels = 227, 227, 4
    X_train = np.zeros((len(X_trainFinal), img_width, img_height, img_channels), dtype=np.uint8)
    for n, path in tqdm(enumerate(X_trainFinal), total=len(X_trainFinal)):
        img = imageio.imread(path)[:, :, :img_channels]
        img = resize(img, (img_height, img_width), mode='constant', preserve_range=True)
        X_train[n] = img

    batch_size = 64
    X_train = X_train.astype('float32')
    X_train /= 255  # scale pixel values to [0, 1]
    while 1:
        for i in range(len(X_train) // batch_size):
            yield X_train[i*batch_size:(i+1)*batch_size], Y_train[i*batch_size:(i+1)*batch_size]

MyTrainingGenerator = trainingGenerator()
MyValidationGenerator = validationGenerator()

Results_Train = model.fit_generator(MyTrainingGenerator,
                    steps_per_epoch=nb_train_samples // batch_size,
                    epochs=num_epochs,
                    validation_data=MyValidationGenerator,
                    validation_steps=nb_validation_samples // batch_size,
                    callbacks=[History, checkpointer, csv_logger],
                    verbose=1)

I thank you for your attention,
Gledson Melotti
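
On the iteration count: fit_generator runs exactly steps_per_epoch generator batches per epoch, so 32676 iterations per epoch suggests steps_per_epoch effectively equals the sample count rather than the batch count. A quick check of the arithmetic for the numbers above:

nb_train_samples, batch_size = 32676, 64
print(nb_train_samples // batch_size)      # 510 batches, the partial batch is dropped
print(-(-nb_train_samples // batch_size))  # 511 batches with ceiling division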

@bayesianio I am having the same issue with self.validation_data equating to None in my custom Callback when using model.fit_generator. Were you ever able to resolve your issue?
