Keras: Can't Reproduce the Accuracy for "Pre-trained word embeddings in Keras" Example

Created on 16 Mar 2017 · 4 Comments · Source: keras-team/keras

I copied the entire example and ran it locally. The article says that validation accuracy is 95%+ after 2 epochs, but when I tried it I got 20%, and after 10 epochs it only reached around 70%.

Has anybody run into the same issue? If not, what could I have done wrong?

https://blog.keras.io/using-pre-trained-word-embeddings-in-a-keras-model.html

stale

MadLily

Most helpful comment

@queirozfcom

It seems there are two versions of the newsgroups files: one with headers and one without. The version with headers contains the newsgroup name in the text, so the pattern was easy for the model to detect.

See this comment from the pull request that removed the header:

https://github.com/fchollet/keras/pull/5585

alvinhom on 27 May 2017

👍3
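
To illustrate the header leak described in the comment above: each raw newsgroup post starts with a header block, which can include a Newsgroups line naming the class, followed by a blank line and then the body. Below is a minimal sketch of dropping that header, assuming the header ends at the first blank line. The function name `strip_header` and the sample post are hypothetical and are not taken from the pull request.

```python
def strip_header(raw_post: str) -> str:
    """Return the post body, dropping everything up to the first blank line.

    Assumption: the header block is separated from the body by one blank line.
    If no blank line is found, the text is returned unchanged.
    """
    header_end = raw_post.find("\n\n")
    if header_end > 0:
        return raw_post[header_end:].lstrip("\n")
    return raw_post


# Hypothetical sample post; the "Newsgroups:" line reveals the label.
sample = (
    "From: someone@example.com\n"
    "Newsgroups: sci.space\n"
    "Subject: Orbit question\n"
    "\n"
    "How long does a low Earth orbit take?\n"
)

body = strip_header(sample)
print(body)
```

With the header kept, a model can learn to read the label straight from the Newsgroups line, which would be consistent with the much higher accuracy reported in the article.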

All 4 comments

I've run into the same problem. I had updated my Keras and Theano packages, among others, because I got a prompt to do so and to use "gpuarray"; I don't remember the exact reason that was given. After that, I noticed that the loss and accuracy in my script using pre-trained GloVe embeddings seemed significantly worse. So I ran some sample programs from the internet, including the pre-trained embeddings example here:
https://blog.keras.io/using-pre-trained-word-embeddings-in-a-keras-model.html
The accuracy after two epochs was very far from the 95% mentioned in the example. I did not notice the same problem when running programs that don't use embeddings.
I wasn't sure whether something was wrong with my setup, so I did a fresh install of Ubuntu, installed Anaconda, and tried again. Here are the results:

Using TensorFlow backend.
Indexing word vectors.
Found 400000 word vectors.
Processing text dataset
Found 19997 texts.
Found 174074 unique tokens.
Shape of data tensor: (19997, 1000)
Shape of label tensor: (19997, 20)
Preparing embedding matrix.
Training model.
Train on 15998 samples, validate on 3999 samples
Epoch 1/10
15998/15998 [==============================] - 58s - loss: 2.4617 - acc: 0.1654 - val_loss: 2.0106 - val_acc: 0.2633
Epoch 2/10
15998/15998 [==============================] - 6s - loss: 1.9074 - acc: 0.3144 - val_loss: 1.7446 - val_acc: 0.3656
Epoch 3/10
15998/15998 [==============================] - 6s - loss: 1.5748 - acc: 0.4334 - val_loss: 1.4715 - val_acc: 0.4696
Epoch 4/10
15998/15998 [==============================] - 6s - loss: 1.3109 - acc: 0.5323 - val_loss: 1.2180 - val_acc: 0.5704
Epoch 5/10
15998/15998 [==============================] - 6s - loss: 1.0926 - acc: 0.6180 - val_loss: 1.1524 - val_acc: 0.6064
Epoch 6/10
15998/15998 [==============================] - 6s - loss: 0.9419 - acc: 0.6707 - val_loss: 1.0192 - val_acc: 0.6584
Epoch 7/10
15998/15998 [==============================] - 6s - loss: 0.8312 - acc: 0.7087 - val_loss: 0.9377 - val_acc: 0.6887
Epoch 8/10
15998/15998 [==============================] - 6s - loss: 0.7301 - acc: 0.7431 - val_loss: 0.9162 - val_acc: 0.6902
Epoch 9/10
15998/15998 [==============================] - 6s - loss: 0.6464 - acc: 0.7710 - val_loss: 0.9491 - val_acc: 0.7014
Epoch 10/10
15998/15998 [==============================] - 6s - loss: 0.5689 - acc: 0.8023 - val_loss: 0.9231 - val_acc: 0.7144

As the output shows, the accuracy is well below the expected 95% after 2 epochs. For this run, my TensorFlow version is 1.0.1 and my Keras version is 2.0.2.

I'd like to point out that in my initial setup I was using Theano as the backend, and in the new installation I am using TensorFlow. The problem shows up in both cases.
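
For reference, the split sizes in the log are what a 20% validation split of 19997 texts gives, and random guessing over 20 classes would score 5%. Below is a quick check, assuming the blog example's validation split of 0.2; that value is an assumption here, since it is not shown in the log itself.

```python
# Assumed value: 0.2 is taken to be the validation split of the blog example.
# The text and class counts come from the log output above.
VALIDATION_SPLIT = 0.2
num_texts = 19997
num_classes = 20

num_validation = int(VALIDATION_SPLIT * num_texts)
num_train = num_texts - num_validation

# Accuracy expected from uniformly random guessing over the classes.
chance_accuracy = 1 / num_classes

print(num_train, num_validation)
print(chance_accuracy)
```

The counts match the "Train on 15998 samples, validate on 3999 samples" line, so the data split itself looks the same as in the example.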

robert604 on 14 Apr 2017

I'm also experiencing this issue. I've even asked a question on Stack Overflow.

Did you actually get the 95% before (before what?) and are only now seeing these worse results?

queirozfcom on 30 Apr 2017

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.