Gensim: Word2vec to use GPU

Created on 13 Sep 2015  ·  15 comments  ·  Source: RaRe-Technologies/gensim

Add an option (likely subclassing Word2vec) to train word2vec model using GPU.

Labels: difficulty hard, feature, wishlist

Most helpful comment

@yutarochan Thanks for your interest. There has been an implementation in Keras - we would really appreciate it if you could evaluate it and tell us what you think.
https://github.com/niitsuma/word2vec-keras-in-gensim

All 15 comments

Careful with the issue numbers @ziky90 .

I'd like to help contribute a PR for this. For implementation with the GPU, what sort of dependency constraints or preferences do you have?

So far, libraries we can potentially use to implement the GPU versions are:

  • Gnumpy
  • PyCUDA & PyOpenCL
  • NumbaPro
  • Theano

I already have an implementation in Theano, but I was wondering whether you have specific preferences about adding additional dependencies.
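For context on what any of these backends would have to accelerate, here is a minimal CPU sketch of one skip-gram negative-sampling update, the inner kernel of word2vec training. All names are illustrative (this is not gensim's API); a GPU port via Theano, PyCUDA, etc. would move exactly these matrix operations onto the device.

```python
import numpy as np

# Illustrative skip-gram negative-sampling step; not gensim's actual code.
rng = np.random.default_rng(0)
vocab, dim = 100, 16
W_in = rng.normal(scale=0.1, size=(vocab, dim))   # input (center-word) vectors
W_out = np.zeros((vocab, dim))                    # output (context-word) vectors

def sgns_step(center, context, negatives, lr=0.025):
    """One SGD update for a (center, context) pair plus negative samples."""
    v = W_in[center]
    targets = np.array([context] + list(negatives))
    labels = np.array([1.0] + [0.0] * len(negatives))  # 1 = true context, 0 = noise
    scores = 1.0 / (1.0 + np.exp(-W_out[targets] @ v)) # sigmoid of dot products
    g = (labels - scores) * lr                         # per-target gradient scale
    grad_v = g @ W_out[targets]                        # accumulate before updating W_out
    W_out[targets] += np.outer(g, v)
    W_in[center] += grad_v

sgns_step(center=3, context=7, negatives=[11, 42, 58])
```

The GPU question is less about this arithmetic and more about batching: a single pair per kernel launch cannot keep a GPU busy, which is why the discussion below keeps returning to data loading and batch sizes.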

@yutarochan Thanks for your interest. There has been an implementation in Keras - we would really appreciate it if you could evaluate it and tell us what you think.
https://github.com/niitsuma/word2vec-keras-in-gensim

I've successfully tested word2veckeras using keras 0.3.1 with theano backend.
I'll try to make it compatible with the current version of keras.

I also want to rewrite a part of Word2vec training using theano functions.

@SimonPavlik could you please post a link to the results of your experiments here?

BTW Deeplearning4j is also working to resolve batching issues in order to make word2vec run faster on GPU than on CPU

I have 4 Titan Xs sitting on a bus within the same Supermicro enclosure, overclocked to 1342 MHz.
If the software is stable and all it needs is either TF or Theano, I could attempt to benchmark it.

Note: Previously I observed Keras programs run 5 times faster when backed by Theano than when backed by TF.

@phalexo Thanks a lot for volunteering the hardware! Adding a Titan benchmark to this list would be great https://github.com/RaRe-Technologies/gensim/pull/1033#issuecomment-273567836
Please ask @markroxor for the exact code he ran.

@SimonPavlik: not sure I got this right from your article, but my understanding is that the GPU code was run with only one data loading thread. If that is true, then I can imagine that the speed bottleneck is at the data loading level, not at the GPU. Is there any comparison where the model has "enough" data loading threads?

That's right @octavian-ganea, only one worker was used for the preprocessing. Even with the preprocessed data ready in the memory, a single threaded generator couldn't keep the GPU busy.
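The bottleneck described here is the classic single-producer starvation pattern. A hedged sketch of the usual fix: several loader threads fill a bounded queue so the consumer (standing in for the GPU here) is never starved. All names are illustrative; this is not code from the article or from gensim.

```python
import queue
import threading

def make_batches(n):
    """Toy batch generator standing in for corpus preprocessing."""
    return (list(range(i, i + 4)) for i in range(0, n, 4))

q = queue.Queue(maxsize=8)     # bounded: applies backpressure to loaders

def loader(batches):
    for b in batches:
        q.put(b)               # blocks while the queue is full

# Four loader threads, each preprocessing its own shard of 40 items.
threads = [threading.Thread(target=loader, args=(make_batches(40),))
           for _ in range(4)]
for t in threads:
    t.start()

consumed = 0
for _ in range(4 * 10):        # 4 shards x 10 batches each
    q.get()                    # the GPU training step would happen here
    consumed += 1
for t in threads:
    t.join()
```

With a single loader thread, `q.get()` would regularly block on an empty queue, which matches the observation above that even preprocessed data behind a single-threaded generator could not keep the GPU busy.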

I don't know of any GPU implementation that works faster than the current CPU word2vec; if we have any benchmark results or good reference implementations, please post them here.

Hi, I'm using doc2vec. What is the current state? Is there GPU acceleration for doc2vec, maybe a GPU mode in gensim? Thanks a lot

No, and there is no plan for adding that either. Let me close this issue.
