The MNIST CNN example code claims 99.25% test accuracy.
However, when run with the TensorFlow v2.1.0 backend (Ubuntu 18.04 + CUDA 10.1), the observed accuracy is only ~84%.
What do I need to do in order to get 99.25% test accuracy?
src/keras/examples/keras_example_mnist_cnn.py
Train on 60000 samples, validate on 10000 samples
Epoch 1/12
60000/60000 [==============================] - 7s 109us/sample - loss: 2.2754 - accuracy: 0.1646 - val_loss: 2.2287 - val_accuracy: 0.3468
Epoch 2/12
60000/60000 [==============================] - 5s 88us/sample - loss: 2.2023 - accuracy: 0.2839 - val_loss: 2.1380 - val_accuracy: 0.5022
Epoch 3/12
60000/60000 [==============================] - 5s 88us/sample - loss: 2.1075 - accuracy: 0.3794 - val_loss: 2.0163 - val_accuracy: 0.5872
Epoch 4/12
60000/60000 [==============================] - 5s 88us/sample - loss: 1.9826 - accuracy: 0.4569 - val_loss: 1.8593 - val_accuracy: 0.6505
Epoch 5/12
60000/60000 [==============================] - 5s 85us/sample - loss: 1.8317 - accuracy: 0.5159 - val_loss: 1.6713 - val_accuracy: 0.7023
Epoch 6/12
60000/60000 [==============================] - 5s 86us/sample - loss: 1.6617 - accuracy: 0.5625 - val_loss: 1.4650 - val_accuracy: 0.7441
Epoch 7/12
60000/60000 [==============================] - 5s 87us/sample - loss: 1.4878 - accuracy: 0.6030 - val_loss: 1.2637 - val_accuracy: 0.7791
Epoch 8/12
60000/60000 [==============================] - 5s 87us/sample - loss: 1.3346 - accuracy: 0.6342 - val_loss: 1.0898 - val_accuracy: 0.7997
Epoch 9/12
60000/60000 [==============================] - 6s 95us/sample - loss: 1.2022 - accuracy: 0.6603 - val_loss: 0.9495 - val_accuracy: 0.8124
Epoch 10/12
60000/60000 [==============================] - 5s 86us/sample - loss: 1.1006 - accuracy: 0.6819 - val_loss: 0.8413 - val_accuracy: 0.8239
Epoch 11/12
60000/60000 [==============================] - 5s 86us/sample - loss: 1.0165 - accuracy: 0.7002 - val_loss: 0.7581 - val_accuracy: 0.8349
Epoch 12/12
60000/60000 [==============================] - 5s 86us/sample - loss: 0.9482 - accuracy: 0.7159 - val_loss: 0.6931 - val_accuracy: 0.8427
Test loss: 0.6930903025627136
Test accuracy: 0.8427
The issue was resolved by using import keras rather than import tensorflow.keras.
However, there seems to be a deeper issue (a separate bug needs to be raised): import keras gives 99% accuracy, but import tensorflow.keras gives only 84% accuracy.
I have traced back the source of this issue, which is due to different default learning rates between keras.optimizers.Adadelta() and tf.keras.optimizers.Adadelta()
keras = venv/lib/python3.6/site-packages/keras/optimizers.py
class Adadelta(Optimizer):
    def __init__(self, learning_rate=1.0, rho=0.95, **kwargs):
tf.keras = venv/lib/python3.6/site-packages/tensorflow_core/python/keras/optimizer_v2/adadelta.py
@keras_export('keras.optimizers.Adadelta')
class Adadelta(optimizer_v2.OptimizerV2):
    def __init__(self,
                 learning_rate=0.001,
                 rho=0.95,
                 epsilon=1e-7,
                 name='Adadelta',
                 **kwargs):
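To see why a 1000x smaller default matters, here is a minimal pure-Python sketch of the Adadelta update rule. It is not the Keras or TensorFlow implementation: it assumes the learning rate simply scales each computed update, and the toy objective f(x) = x**2, the starting point, the step count, and the helper name adadelta_minimize are all made up for illustration.

```python
import math


def adadelta_minimize(learning_rate, steps=200, rho=0.95, epsilon=1e-7):
    """Minimize the toy function f(x) = x**2 from x = 1.0 with Adadelta.

    Hypothetical helper for illustration only; not library code.
    """
    x = 1.0
    avg_sq_grad = 0.0   # running average of squared gradients
    avg_sq_delta = 0.0  # running average of squared (unscaled) updates
    for _ in range(steps):
        grad = 2.0 * x  # derivative of x**2
        avg_sq_grad = rho * avg_sq_grad + (1.0 - rho) * grad ** 2
        # Adadelta step: ratio of the two running RMS values times the gradient.
        delta = (math.sqrt(avg_sq_delta + epsilon)
                 / math.sqrt(avg_sq_grad + epsilon)) * grad
        avg_sq_delta = rho * avg_sq_delta + (1.0 - rho) * delta ** 2
        # Assumption: the learning rate only scales the applied update.
        x -= learning_rate * delta
    return x


x_large = adadelta_minimize(learning_rate=1.0)    # keras default
x_small = adadelta_minimize(learning_rate=0.001)  # tf.keras default
print("learning_rate=1.0   -> x =", x_large)
print("learning_rate=0.001 -> x =", x_small)
```

With the same number of steps, the run with learning_rate=0.001 barely moves away from the starting point, while the run with learning_rate=1.0 makes visible progress toward the minimum at 0. This is consistent with the slow but steady improvement in the training log above.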
The example code produces 99%+ accuracy in 12 epochs with tf.keras if an explicit learning_rate is passed to Adadelta:
import tensorflow.keras as keras
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(learning_rate=1.0, rho=0.95),
              metrics=['accuracy'])