Spacy: Keras Entailment Example

Created on 23 Jan 2017 · 18 comments · Source: explosion/spaCy

I had a few doubts/questions about the Keras Entailment example

  1. Why do we have to pass model_dir? If we don't pass it, shouldn't there be a default path? Also, it doesn't appear to be used in the train method at all. The other methods only load from the directory passed in, but I don't think the train method (or any other method) stores anything there. Am I missing something?

  2. The readme asks for a directory for training/evaluation, but what the script actually needs is the path to the json file. Unless I've misunderstood something, it would be better to clarify this in the readme (explicitly mentioning that it needs a path to the json file).

  3. If you run either demo or evaluate, which need fewer parameters, it still throws the error: too few arguments message.

  4. This could be because I've again missed something somewhere, but when I run either evaluate or demo, I get this: NameError: global name 'SimilarityModel' is not defined, from the spacy_hook file. What exactly is supposed to be happening here? The training works fine, and so does the pytest.

[btw- the pytest sometimes fails if you're using a virtualenv, where you'd have to re-install pytest]

I wouldn't mind opening a PR to make the documentation changes (if you think they would be helpful/necessary).

I'm in the process of making a Jupyter notebook which walks through all of this, because I think the information in the readme isn't detailed enough about how exactly to get this running. I also wanted to demonstrate it (with some results).

Your Environment

  • Operating System: OS X El Capitan
  • Python Version Used: 2.7.12
  • spaCy Version Used: 1.6.0
  • Environment Information:
examples

All 18 comments

Hey,

Thanks for your attention on this. I agree that the example is a bit messy at the moment, and would really appreciate the PR for the docs changes, and any general tidying you want/need to do when making the notebook.

I think a notebook for this will be really great, because I've done a bit of hacking on different model options, and it's hard to explain them in the current format. A notebook is really a better solution.

Matt

Yup, I'm on this now.

Still need some help with questions 1 & 4 though - what's the purpose of passing the model directory? I don't see the model_dir parameter being used in the train method at all. Where is the keras model being tied to the pipeline?

Also when I run evaluate, it stops at the create_similarity_pipeline method because SimilarityModel isn't defined. Or am I supposed to make the model, the way it's described in the docs? I was unsure because I thought the example was ready to run, with the way the code is right now.

Hey,

I've actually just been updating this --- let me push some state I have in my working directory.

Updated. Btw, be sure to use Theano with this — for some reason I can't get it to work on Tensorflow...

Also, just a note: it sure looks to me like the normalization done here is incorrect, because it's computed without reference to the mask. You'll get probability mass in the attention going to elements that are actually masked, so you won't have a proper distribution.
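To illustrate that point: if the softmax is computed without reference to the mask, padded positions receive nonzero attention weight, so the weights over the real tokens no longer sum to one. A minimal NumPy sketch of a mask-aware softmax (the names and shapes here are illustrative, not taken from the example code):

```python
import numpy as np

def masked_softmax(scores, mask):
    """Softmax over the last axis that assigns zero probability
    to positions where mask == 0 (i.e. padding)."""
    # Subtract the max for numerical stability, then zero out masked terms
    # before normalizing, so padding gets no probability mass.
    e = np.exp(scores - scores.max(axis=-1, keepdims=True)) * mask
    return e / e.sum(axis=-1, keepdims=True)

scores = np.array([1.0, 2.0, 3.0, 0.5])
mask = np.array([1.0, 1.0, 0.0, 0.0])  # last two positions are padding
probs = masked_softmax(scores, mask)
```

Without the `* mask` term, the two padded positions would absorb most of the probability mass here, which is exactly the bug being described.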

@honnibal , have you got either demo or evaluate to work on your machines? While the training is fine when I try to use it (theano backend), I end up getting the following error:

  File "/Users/bhargavvader/Open_Source/spacy-notebooks/venv/lib/python2.7/site-packages/spacy/language.py", line 301, in __init__
    self.pipeline = overrides['create_pipeline'](self)
  File "keras_parikh_entailment/spacy_hook.py", line 88, in create_similarity_pipeline
    KerasSimilarityShim.load(nlp.path / 'similarity', nlp, max_length=10)
  File "keras_parikh_entailment/spacy_hook.py", line 19, in load
    model = model_from_json(file_.read())
  File "/Users/bhargavvader/Open_Source/spacy-notebooks/venv/lib/python2.7/site-packages/Keras-1.2.1-py2.7.egg/keras/models.py", line 213, in model_from_json
    return layer_from_config(config, custom_objects=custom_objects)
  File "/Users/bhargavvader/Open_Source/spacy-notebooks/venv/lib/python2.7/site-packages/Keras-1.2.1-py2.7.egg/keras/utils/layer_utils.py", line 41, in layer_from_config
    custom_objects=custom_objects)
  File "/Users/bhargavvader/Open_Source/spacy-notebooks/venv/lib/python2.7/site-packages/Keras-1.2.1-py2.7.egg/keras/engine/topology.py", line 2582, in from_config
    process_layer(layer_data)
  File "/Users/bhargavvader/Open_Source/spacy-notebooks/venv/lib/python2.7/site-packages/Keras-1.2.1-py2.7.egg/keras/engine/topology.py", line 2560, in process_layer
    custom_objects=custom_objects)
  File "/Users/bhargavvader/Open_Source/spacy-notebooks/venv/lib/python2.7/site-packages/Keras-1.2.1-py2.7.egg/keras/utils/layer_utils.py", line 41, in layer_from_config
    custom_objects=custom_objects)
  File "/Users/bhargavvader/Open_Source/spacy-notebooks/venv/lib/python2.7/site-packages/Keras-1.2.1-py2.7.egg/keras/layers/core.py", line 681, in from_config
    function = func_load(config['function'], globs=globs)
  File "/Users/bhargavvader/Open_Source/spacy-notebooks/venv/lib/python2.7/site-packages/Keras-1.2.1-py2.7.egg/keras/utils/generic_utils.py", line 100, in func_load
    closure=closure)
TypeError: arg 5 (closure) must be None or tuple

This is similar to this error raised with keras.

@matt-gardner : I've struggled with this type of issue a lot actually. Do you have a good answer? So far I haven't found a better way to implement this in Keras.

I think the sequence handling in Keras is really broken, in general. The masking is very buggy and inconsistent, and even when you convince Keras to pass the mask layer forward for you, it's still very difficult to make the model correct.

@bhargavvader Can confirm that this is broken for me now too :(. I'm not sure whether Keras changed, or whether it's due to code changes I introduced.

I encountered this same problem and fixed it by selectively restoring part of an older version of keras. This commit:

https://github.com/fchollet/keras/commit/6417d90d5c1f70844d8d346312f1b40f449545a5#diff-56dc3cc42e1732fdb3a3c2c3c8efa32a

and this one:

https://github.com/fchollet/keras/commit/570fdf31c5cb9a580496d1d93320bc7ab1b9ad46#diff-56dc3cc42e1732fdb3a3c2c3c8efa32a

introduce some code into keras/utils/generic_utils.py to allow for saving and re-loading of closures, but then this one:

https://github.com/fchollet/keras/commit/edae1785327dd7a418ac06c2fe85a8c1f6ea05b7#diff-56dc3cc42e1732fdb3a3c2c3c8efa32a

removes the function that restores the closures. The comments on that commit claim this was "broken" code, but it worked fine for me.

I just made sure that the functions related to closures were as in https://github.com/fchollet/keras/commit/570fdf31c5cb9a580496d1d93320bc7ab1b9ad46#diff-56dc3cc42e1732fdb3a3c2c3c8efa32a and this part worked.

Since this is obviously a bit unstable in keras right now, it would be better to see if this can be redone without closures.

(Also note that by default the example trains a model with max_length = 100 but uses max_length = 10 when running demo or evaluate. You'll have to change one of these to make it work).

@jfoster17 Thanks!

Okay, I think it's best to avoid the json serialisation. This makes sense to me, really, especially since we have our own attributes that we're trying to pass around (sorry that the max_length=10 hack made it into master! I was hacking at this...)

So, we should write out a config.json that gives us the necessary hyper-params to make another call to build_model when loading the data. I think this is the way the sentiment analysis example in deep_learning_keras.py does this.
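A minimal sketch of that approach, assuming a `build_model(settings)` function like the one in the example (the helper names and the `settings` keys here are illustrative, not the example's actual API):

```python
import json
from pathlib import Path

def save_model(model, settings, path):
    """Persist hyper-parameters and weights separately, instead of
    relying on Keras's JSON architecture serialisation (which is what
    trips over the closures)."""
    path = Path(path)
    path.mkdir(parents=True, exist_ok=True)
    with (path / 'config.json').open('w') as file_:
        json.dump(settings, file_)
    model.save_weights(str(path / 'model.weights'))

def load_model(path, build_model):
    """Rebuild the architecture in code from config.json, then restore
    the trained weights into it."""
    path = Path(path)
    with (path / 'config.json').open() as file_:
        settings = json.load(file_)
    model = build_model(settings)
    model.load_weights(str(path / 'model.weights'))
    return model
```

Because the architecture is reconstructed by calling `build_model` rather than deserialized from JSON, the lambda/closure layers never need to round-trip through `model_from_json` at all.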

@honnibal we're building a library that tries to make NLP easier with Keras; you can see what we did for attention layers here. It took a lot of work and a lot of tests to make sure masking is done properly throughout, but we're reasonably confident that it does the right thing now. The library is close to ready for public consumption, but not quite there yet, so you'll probably notice some inconsistencies and issues still if you poke around the code.

@matt-gardner Interesting!

I think it's valuable to work within Keras this way, especially now that it's been appointed the official Tensorflow front-end. But I have to say, I think the whole masking idea is just bad, tbh.

I think it's much better to maintain an array of the sequence lengths. For this type of model, you're then able to concatenate all the inputs into a single matrix without any padding. For LSTM models, you can sort the batch by length, and then drop the short rows as they're completed.

I don't have the LSTMs implemented yet, but you can see the pooling at work in this example: https://github.com/explosion/thinc/blob/master/examples/quora_similarity.py

The code is quite different from Keras though, because it's not based on Tensorflow etc...It's just based on numpy/cupy. The flatten_with_lengths operation is here: https://github.com/explosion/thinc/blob/master/thinc/api.py#L44
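The idea can be sketched in plain NumPy: concatenate every sequence into one matrix with no padding, keep the lengths in a separate array, and pool each slice. This is an illustration of the approach, not thinc's actual implementation:

```python
import numpy as np

def mean_pool_flat(flat, lengths):
    """Mean-pool each sequence out of a single concatenated matrix.

    flat:    (sum(lengths), width) array -- all tokens, no padding
    lengths: list of ints -- how many rows belong to each sequence
    """
    out = np.zeros((len(lengths), flat.shape[1]))
    start = 0
    for i, n in enumerate(lengths):
        # Pool only the rows that belong to sequence i -- no mask needed,
        # because padding was never introduced in the first place.
        out[i] = flat[start:start + n].mean(axis=0)
        start += n
    return out

# Two sequences of lengths 3 and 2, packed into one (5, 4) matrix.
flat = np.random.rand(5, 4)
pooled = mean_pool_flat(flat, [3, 2])
```

Since the lengths fully determine which rows belong to which sequence, no probability mass or pooled value can ever leak from padding, which is the correctness problem masking keeps reintroducing.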

@honnibal I got the example running on TensorFlow and applied @jfoster17's fix to resolve the error with closures. For TensorFlow I made the following changes to keras_decomposable_attention.py and __main__.py:

  1. Imported the Keras backend:

    import keras.backend as K

  2. Added a flag to import TensorFlow:

    USE_TF = True
    if USE_TF:
        import tensorflow as tf

  3. To resolve precondition errors due to uninitialized variables, I created a TensorFlow session, initialized the model variables using the session, and assigned the Keras session to it. In particular, the following block was added to test_fit_model of keras_decomposable_attention.py and train of __main__.py, directly before the call to model.fit:

    if USE_TF:
        sess = tf.Session()
        init = tf.global_variables_initializer()
        sess.run(init)
        K.set_session(sess)

  4. Lastly, I added the following as the last block of code in the functions above:

    if USE_TF:
        sess.close()

Hope this helps! Oh I got this running using the following environment:

  • Operating System: Windows Server 2012 (16 cores no GPU)
  • Python: Anaconda python 3.5

On another note, @honnibal and @matt-gardner, what effect does an improper probability distribution have on model performance?

@enigmoization Thanks!! Are you able to make a pull request with the fixes?

The messed up attention weights might have a big impact if the length cap is relaxed, which does seem to improve accuracy.

@honnibal you're welcome and thanks for the explanation about the attention weights. Regarding a pull request, I haven't tried but can look into making one this weekend.

I made all the changes which enigmoization suggested and ran the following command:

$python keras_parikh_entailment/ demo snli_1.0/snli_1.0_train.jsonl snli_1.0/snli_1.0_dev.jsonl

But it gives the following error:

Using TensorFlow backend.
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/Users/saurabh/Desktop/ai/tools/keras/keras_parikh_entailment/__main__.py", line 155, in <module>
    plac.call(main)
  File "/Users/saurabh/Desktop/ai/tools/keras/keras_vir/lib/python2.7/site-packages/plac_core.py", line 328, in call
    cmd, result = parser.consume(arglist)
  File "/Users/saurabh/Desktop/ai/tools/keras/keras_vir/lib/python2.7/site-packages/plac_core.py", line 207, in consume
    return cmd, self.func(*(args + varargs + extraopts), **kwargs)
  File "/Users/saurabh/Desktop/ai/tools/keras/keras_parikh_entailment/__main__.py", line 152, in main
    demo()
  File "/Users/saurabh/Desktop/ai/tools/keras/keras_parikh_entailment/__main__.py", line 89, in demo
    create_pipeline=create_similarity_pipeline)
  File "/Users/saurabh/Desktop/ai/tools/keras/keras_vir/lib/python2.7/site-packages/spacy/__init__.py", line 42, in load
    return cls(**overrides)
  File "/Users/saurabh/Desktop/ai/tools/keras/keras_vir/lib/python2.7/site-packages/spacy/en/__init__.py", line 34, in __init__
    Language.__init__(self, **overrides)
  File "/Users/saurabh/Desktop/ai/tools/keras/keras_vir/lib/python2.7/site-packages/spacy/language.py", line 297, in __init__
    self.pipeline = overrides['create_pipeline'](self)
  File "keras_parikh_entailment/spacy_hook.py", line 88, in create_similarity_pipeline
    KerasSimilarityShim.load(nlp.path / 'similarity', nlp, max_length)
  File "keras_parikh_entailment/spacy_hook.py", line 19, in load
    model = model_from_json(file_.read())
  File "/Users/saurabh/Desktop/ai/tools/keras/keras_vir/lib/python2.7/site-packages/keras/models.py", line 345, in model_from_json
    return layer_module.deserialize(config, custom_objects=custom_objects)
  File "/Users/saurabh/Desktop/ai/tools/keras/keras_vir/lib/python2.7/site-packages/keras/layers/__init__.py", line 54, in deserialize
    printable_module_name='layer')
  File "/Users/saurabh/Desktop/ai/tools/keras/keras_vir/lib/python2.7/site-packages/keras/utils/generic_utils.py", line 139, in deserialize_keras_object
    list(custom_objects.items())))
  File "/Users/saurabh/Desktop/ai/tools/keras/keras_vir/lib/python2.7/site-packages/keras/engine/topology.py", line 2487, in from_config
    process_layer(layer_data)
  File "/Users/saurabh/Desktop/ai/tools/keras/keras_vir/lib/python2.7/site-packages/keras/engine/topology.py", line 2473, in process_layer
    custom_objects=custom_objects)
  File "/Users/saurabh/Desktop/ai/tools/keras/keras_vir/lib/python2.7/site-packages/keras/layers/__init__.py", line 54, in deserialize
    printable_module_name='layer')
  File "/Users/saurabh/Desktop/ai/tools/keras/keras_vir/lib/python2.7/site-packages/keras/utils/generic_utils.py", line 139, in deserialize_keras_object
    list(custom_objects.items())))
  File "/Users/saurabh/Desktop/ai/tools/keras/keras_vir/lib/python2.7/site-packages/keras/layers/core.py", line 697, in from_config
    function = func_load(config['function'], globs=globs)
  File "/Users/saurabh/Desktop/ai/tools/keras/keras_vir/lib/python2.7/site-packages/keras/utils/generic_utils.py", line 206, in func_load
    closure=closure)
TypeError: arg 5 (closure) must be None or tuple

I'm not able to run the demo function. Can someone please help?

I got it working by applying this. Why is this fix not included in the latest Keras code?

Merging this with #1445!

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
