Keras: Avoid or mitigate overfitting in Image categorization using CNN

Created on 9 Nov 2016 · 1 Comment · Source: keras-team/keras

I am using a CNN for image categorization. There are 10 categories, each with about 300-500 images. During training, the training accuracy (close to 90%) is much higher than the validation accuracy (about 30%, only somewhat better than random guessing), which indicates severe overfitting. I randomly selected 80% of the images in each category for training and used the rest for validation.

Are there any strategies to avoid or mitigate the overfitting while also improving validation performance? I have tried early stopping, but the test accuracy is still very low.
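For reference, the per-category 80/20 split described above could be sketched roughly as follows. This is only an illustration using the Python standard library; the function name `stratified_split` and the file names are hypothetical, not taken from the actual project.

```python
import random
from collections import defaultdict


def stratified_split(samples, train_fraction=0.8, seed=0):
    """Split (path, label) pairs so each label keeps the same train fraction."""
    by_label = defaultdict(list)
    for path, label in samples:
        by_label[label].append(path)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, val = [], []
    for label in sorted(by_label):
        paths = by_label[label]
        rng.shuffle(paths)
        cut = round(len(paths) * train_fraction)
        train += [(p, label) for p in paths[:cut]]
        val += [(p, label) for p in paths[cut:]]
    return train, val


# Hypothetical file names: 10 categories with 300 images each.
samples = [(f"cat{c}/img{i}.jpg", c) for c in range(10) for i in range(300)]
train, val = stratified_split(samples)
```

Splitting per category keeps the class balance the same in both sets, and checking that no file appears in both lists rules out leakage as a cause of misleading accuracy numbers.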


jingweimo

Most helpful comment

It seems you have too little training data; try data augmentation techniques.
Make the model simpler - the problem you are trying to solve may not be that hard, and the data may be easy to learn. If the model has too many parameters, it simply memorizes the training data instead of generalizing.
Use regularization techniques such as Dropout, DropConnect, and weight regularizers.
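The three suggestions above can be combined in one small model. The following is a minimal sketch, assuming a recent TensorFlow install where `tensorflow.keras` provides the `RandomFlip`, `RandomRotation`, and `RandomZoom` preprocessing layers. The 64x64 input size, the layer widths, and the regularization strengths are illustrative assumptions, not values from the question. DropConnect is not a built-in Keras layer, so only Dropout and L2 weight regularization are shown.

```python
from tensorflow import keras
from tensorflow.keras import layers, regularizers

NUM_CLASSES = 10            # from the question
INPUT_SHAPE = (64, 64, 3)   # assumed image size, adjust to your data


def build_small_cnn(input_shape=INPUT_SHAPE, num_classes=NUM_CLASSES):
    """A deliberately small CNN with augmentation, L2 penalty, and Dropout."""
    l2 = regularizers.l2(1e-4)  # assumed strength, tune on validation data
    model = keras.Sequential([
        keras.Input(shape=input_shape),
        # Data augmentation: these layers are active only during training.
        layers.RandomFlip("horizontal"),
        layers.RandomRotation(0.1),
        layers.RandomZoom(0.1),
        layers.Rescaling(1.0 / 255),
        # Few filters keep the parameter count low.
        layers.Conv2D(16, 3, activation="relu", kernel_regularizer=l2),
        layers.MaxPooling2D(),
        layers.Conv2D(32, 3, activation="relu", kernel_regularizer=l2),
        layers.MaxPooling2D(),
        layers.GlobalAveragePooling2D(),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # expects integer labels
        metrics=["accuracy"],
    )
    return model


model = build_small_cnn()

# Early stopping that restores the weights from the best validation epoch.
early_stop = keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, restore_best_weights=True
)

# x_train, y_train, x_val, y_val are placeholders for your own arrays:
# model.fit(x_train, y_train, validation_data=(x_val, y_val),
#           epochs=100, callbacks=[early_stop])
```

With only a few thousand parameters, a model like this has far less capacity to memorize roughly 3,000 training images than a large network does. If validation accuracy stays near 30%, the data pipeline and labels are worth checking before adding capacity back.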

Good luck!

varun-bankiti on 11 Nov 2016

