I am trying to fit a model with:
model.fit(X_train, y_train, batch_size=32, nb_epoch=20, shuffle=True, verbose=1,
          callbacks=[remote], validation_split=0.10)
but I am getting validation accuracies close to 0:
Epoch 1/20
15174/15174 [==============================] - 312s - loss: 0.9802 - acc: 0.6364 - val_loss: 4.3220 - val_acc: 0.0000e+00
Epoch 2/20
15174/15174 [==============================] - 306s - loss: 0.5741 - acc: 0.8084 - val_loss: 5.1616 - val_acc: 0.0000e+00
Epoch 3/20
15174/15174 [==============================] - 323s - loss: 0.4029 - acc: 0.8665 - val_loss: 5.3345 - val_acc: 0.0000e+00
Epoch 4/20
15174/15174 [==============================] - 313s - loss: 0.3310 - acc: 0.8934 - val_loss: 3.3267 - val_acc: 0.0700
Epoch 5/20
15174/15174 [==============================] - 322s - loss: 0.3020 - acc: 0.9008 - val_loss: 4.8457 - val_acc: 0.0000e+00
Epoch 6/20
15174/15174 [==============================] - 326s - loss: 0.2662 - acc: 0.9130 - val_loss: 4.6635 - val_acc: 0.0136
When I do not use validation split, I get good results. Is this a problem with the validation_split argument, or am I using it in the wrong way?
Your model doesn't seem to be able to learn what you want it to, because the weights it learns are not useful on unseen data. This is hard to solve in general, as it depends on your particular model, its complexity, and your particular data. If nothing else, consider regularization techniques (a Dropout()
layer, for example) to improve validation accuracy.
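For example, a minimal sketch of inserting a Dropout() layer (the architecture and sizes here are placeholders for illustration, not the model from this thread):

```python
from keras.models import Sequential
from keras.layers import Input, Dense, Dropout

# Placeholder architecture: 100 input features, 10 classes.
model = Sequential([
    Input(shape=(100,)),
    Dense(64, activation='relu'),
    Dropout(0.5),  # randomly zeroes 50% of activations during training only
    Dense(10, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
```

Dropout is only active during training; at evaluation time the layer passes activations through unchanged, so it regularizes without affecting validation predictions.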
Thanks @carlthome for the quick response.
When I use the same model but, instead of validation_split, give it a validation set explicitly like this:
model.fit(X_train, y_train, batch_size=32, nb_epoch=30, shuffle=True, verbose=1,
          callbacks=[remote, early_stopping], validation_data=(X_validation, y_validation))
it seems to learn well. I only face this problem when I use validation_split.
Epoch 1/20
16860/16860 [==============================] - 362s - loss: 1.1185 - acc: 0.5844 - val_loss: 1.1172 - val_acc: 0.5442
Epoch 2/20
16860/16860 [==============================] - 352s - loss: 0.6780 - acc: 0.7668 - val_loss: 0.4111 - val_acc: 0.8756
Epoch 3/20
16860/16860 [==============================] - 1633s - loss: 0.4728 - acc: 0.8418 - val_loss: 0.3898 - val_acc: 0.8812
Epoch 4/20
16860/16860 [==============================] - 2082s - loss: 0.3997 - acc: 0.8678 - val_loss: 0.3083 - val_acc: 0.8964
Epoch 5/20
16860/16860 [==============================] - 423s - loss: 0.3320 - acc: 0.8928 - val_loss: 0.3072 - val_acc: 0.9000
Epoch 6/20
16860/16860 [==============================] - 378s - loss: 0.2775 - acc: 0.9077 - val_loss: 0.3502 - val_acc: 0.8858
Interesting. Please verify
that your manual validation set is selected identically to the one validation_split produces.
Thanks @carlthome. Now I understand it: my data is arranged in per-category folders and is not shuffled, so the categories in the validation data do not match the categories in the training data. The model cannot give good results for categories it never saw in the training set.
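One way to build an explicit validation set so that every category is represented is a stratified split (a sketch assuming scikit-learn is available; the toy data and names are illustrative):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data grouped by category, like data loaded folder by folder.
X = np.arange(40, dtype=float).reshape(20, 2)
y = np.repeat([0, 1, 2, 3], 5)  # 4 categories, 5 samples each, in order

# stratify=y keeps the class proportions equal in both splits,
# so no category is missing from training or validation.
X_train, X_validation, y_train, y_validation = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)
```

With 20% of 20 samples, the validation set here gets exactly one example of each category, even though the input was sorted by class.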
Hey @ykg2910, I am getting the same error. I couldn't figure out your last comment! Can you please let me know what you did to solve it?
@axn170037 It means you need to shuffle your data yourself if its categories are stored in order. For example, suppose your dataset has 10 balanced categories 0-9, arranged in order. If you use validation_split=0.2, the training set contains only categories 0-7 and the validation set contains only 8 and 9, so val_acc will be 0.
Be careful: model.fit(..., shuffle=True) will not fix the problem, because Keras performs the validation split first and only shuffles the training portion. You need to shuffle the data yourself before calling model.fit().
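A minimal sketch of shuffling before fit (NumPy only; the toy data and names are illustrative):

```python
import numpy as np

# Toy data sorted by class, like a dataset loaded folder by folder.
X = np.arange(20, dtype=float).reshape(10, 2)
y = np.repeat([0, 1, 2, 3, 4], 2)  # labels appear strictly in order

# validation_split takes the *tail* of the arrays without shuffling,
# so here the unshuffled 20% tail would contain only class 4:
split = int(len(X) * 0.8)
assert set(y[split:]) == {4}

# Shuffle features and labels with the same permutation BEFORE fit,
# so every class can land in both the training and validation slices.
rng = np.random.RandomState(0)
perm = rng.permutation(len(X))
X, y = X[perm], y[perm]

# Then call e.g. model.fit(X, y, validation_split=0.2, ...)
```

The key point is applying one permutation to both arrays, so features and labels stay aligned.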