Gensim: word2vec overflow when running

Created on 25 Aug 2016 · 9 Comments · Source: RaRe-Technologies/gensim

/gensim-0.13.1/gensim/models/word2vec.py:296: RuntimeWarning: overflow encountered in exp
fb = 1. / (1. + exp(-dot(l1, l2b.T))) # propagate hidden -> output

How can I fix it? Thank you.

bug difficulty easy

jackyjwwang

All 9 comments

Have you tried using the compiled C extension? It is much faster than pure Python.

Could you please give more instructions on how to reproduce? A sample corpus would help.

tmylk on 29 Aug 2016

Sorry for taking so long to check my mail.

I can't use the C version because it depends on the C compiler library.

Best regards,

Thanks

jackyjwwang on 19 Sep 2016

The necessary C compilation tools should be freely available on all platforms where gensim runs – and they can offer a 100X or more speedup. For example, they can turn a runtime that would be over 4 days without them into under an hour. So, we highly recommend ensuring the optimized code is running.

Still, there shouldn't be any fatal warnings in the pure-python code. Does the warning actually cause execution to stop, or does your training still complete and result in usable word-vectors?

(I think the MAX_EXP conditional check and early-loop-continue in the cython code – eg https://github.com/RaRe-Technologies/gensim/blob/9a02527ab315d00dae30088855d2ca466cc3e436/gensim/models/word2vec_inner.pyx#L83 and elsewhere – may be what prevents a similar overflow message there.)
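To illustrate the kind of bound check described above, here is a minimal NumPy sketch. The name `bounded_sigmoid` and the bound `MAX_EXP = 6.0` are assumptions for illustration and are not taken from this thread; the cython code skips out-of-range entries inside its training loop, whereas this sketch only returns a mask marking which entries were evaluated.

```python
import numpy as np

MAX_EXP = 6.0  # assumed bound for illustration, not quoted from the thread


def bounded_sigmoid(f, max_exp=MAX_EXP):
    """Evaluate the sigmoid only where |f| < max_exp (hypothetical helper).

    Returns (values, keep). Entries outside the bound are left at 0.0 and
    flagged False in `keep`, so a caller can skip them.
    """
    f = np.atleast_1d(np.asarray(f, dtype=np.float64))
    keep = np.abs(f) < max_exp
    values = np.zeros_like(f)
    # exp() is only called on entries with |f| < max_exp, so it cannot overflow
    values[keep] = 1.0 / (1.0 + np.exp(-f[keep]))
    return values, keep


values, keep = bounded_sigmoid([-2e9, 0.0, 3.0, 2e9])
```

Because `exp` never sees an argument larger than the bound, no overflow warning can be triggered for the skipped entries.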

gojomo on 19 Sep 2016

Hello @gojomo,
I think limiting the value of
fb = 1. / (1. + exp(-dot(l1, l2b.T))) # propagate hidden -> output
will fix the issue. I am willing to send a PR, but I am not quite sure which values I should limit it to. Any ideas?

markroxor on 27 Sep 2016

If the values aren't too big then this solution will help:

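import numpy as np

def inv_logit(p):
    # Logistic function for a scalar p, written so exp() never overflows.
    if p > 0:
        # exp(-p) < 1 here
        return 1. / (1. + np.exp(-p))
    elif p <= 0:
        # exp(p) <= 1 here
        return np.exp(p) / (1 + np.exp(p))
    else:
        # only reached when p is NaN
        raise ValueError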
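
The function above branches on a scalar, while the expression in word2vec.py is applied to the array returned by dot(l1, l2b.T). A minimal array-based sketch of the same idea could look like the following; the name `stable_sigmoid` is hypothetical and this is not necessarily the code that was merged.

```python
import numpy as np


def stable_sigmoid(x):
    """Element-wise logistic function that only calls exp() on values <= 0."""
    x = np.atleast_1d(np.asarray(x, dtype=np.float64))
    out = np.empty_like(x)
    pos = x >= 0
    # For x >= 0, exp(-x) is in (0, 1], so it cannot overflow.
    out[pos] = 1.0 / (1.0 + np.exp(-x[pos]))
    # For x < 0, exp(x) is in [0, 1), so it cannot overflow either.
    ex = np.exp(x[~pos])
    out[~pos] = ex / (1.0 + ex)
    return out


result = stable_sigmoid([-2e9, 0.0, 2e9])  # array([0. , 0.5, 1. ])
```

Splitting on the sign with a boolean mask keeps every exp() argument non-positive, which is the same trick as the scalar version.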

tmylk on 27 Sep 2016

That should do the trick as well. Sending a PR, merge if it helps.

markroxor on 27 Sep 2016

Is there an example of a p value where inv_logit(p) gives a valid value, while the old code 1.0 / (1.0 + exp(-p)) triggers the error?

gojomo on 27 Sep 2016

@gojomo
Consider a large negative value. For instance -2000000000.

Numpy will raise a warning, "RuntimeWarning: overflow encountered in exp".

Check this.
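
A small sketch for reproducing this, assuming a default NumPy floating-point error configuration. Both forms return 0.0 for this input; the difference is that the old expression calls exp on +2000000000, which overflows to inf, while the rearranged form calls exp on -2000000000, which quietly underflows to 0.

```python
import warnings

import numpy as np

p = np.float64(-2000000000.0)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Old form: exp(-p) = exp(2e9) overflows to inf, so the result is 1 / inf
    naive = 1.0 / (1.0 + np.exp(-p))

# True when NumPy reported "overflow encountered in exp"
overflowed = any("overflow" in str(w.message) for w in caught)

# Rearranged form for p <= 0: exp(p) underflows to 0.0 without a warning
stable = np.exp(p) / (1.0 + np.exp(p))

print(naive, stable, overflowed)
```

So for this input the rearranged form changes whether a warning is emitted rather than the numerical result.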

markroxor on 28 Sep 2016

Fixed in #895

tmylk on 28 Sep 2016