Hi,
I have heard a lot about gensim, but now that I am trying to use it for probably the simplest task, i.e. loading pre-trained embeddings, I have been stuck for hours.
Consider:
from gensim.models import KeyedVectors
# Load vectors directly from the file
model = KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True)
# Access vectors for specific words with a keyed lookup:
vector = model['simple']
$ python word2vec.py
Traceback (most recent call last):
  File "word2vec.py", line 3, in <module>
    model = KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True)
  File "/home/andy/anaconda3/lib/python3.6/site-packages/gensim/models/keyedvectors.py", line 1436, in load_word2vec_format
    limit=limit, datatype=datatype)
  File "/home/andy/anaconda3/lib/python3.6/site-packages/gensim/models/utils_any2vec.py", line 178, in _load_word2vec_format
    result.vectors = zeros((vocab_size, vector_size), dtype=datatype)
MemoryError
Can gensim load the pre-trained word2vec embeddings released by Google? If so, how?
Cheers!
Hello @andymancodes,
Could gensim be used to load word2vec pre-trained embeddings released by google?
Of course, but the Google pre-trained vectors are really huge; you need enough RAM to use them. To reduce memory usage, you can load only part of the vectors by specifying the limit parameter, i.e.
model = KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True, limit=10 ** 5)
In the future, please use the mailing list for questions (GitHub issues are only for feature requests/bug reports).
Thanks @menshikh-iv! That solved the issue. I'll use the mailing list in the future, thanks for the link!
Nice!
Also worked for me
Hi,
I am trying to use a gensim model, but it raises the error below. I have been trying for 2 days and have checked my RAM: I have 16 GB, and only 29% is in use while this code runs, so I can't understand how to fix it. Please help.
Code snippet:
import gensim.downloader as api
fasttext_model300 = api.load('fasttext-wiki-news-subwords-300')
Error:
Traceback (most recent call last):
  File "C:/Users/amitabhseth/IdeaProjects/class1/Test1.py", line 34, in
    fasttext_model300 = api.load('fasttext-wiki-news-subwords-300')
  File "C:\Python3.7\lib\site-packages\gensim\downloader.py", line 502, in load
    return module.load_data()
  File "C:\Users\amitabhseth/gensim-data\fasttext-wiki-news-subwords-300\__init__.py", line 8, in load_data
    model = KeyedVectors.load_word2vec_format(path, binary=False)
  File "C:\Python3.7\lib\site-packages\gensim\models\keyedvectors.py", line 1498, in load_word2vec_format
    limit=limit, datatype=datatype)
  File "C:\Python3.7\lib\site-packages\gensim\models\utils_any2vec.py", line 349, in _load_word2vec_format
    result.vectors = zeros((vocab_size, vector_size), dtype=datatype)
MemoryError
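This is the same allocation failure as above, and the same limit workaround applies. api.load() itself does not take a limit argument, but it does accept return_path=True, which downloads the file (or reuses the cached copy) and returns its path instead of loading it; you can then load it yourself with a limit. A sketch, assuming gensim's downloader API:

```python
import gensim.downloader as api
from gensim.models import KeyedVectors

# Download (or reuse the cached copy of) the raw vectors file; with
# return_path=True, api.load returns the file path instead of a model.
path = api.load('fasttext-wiki-news-subwords-300', return_path=True)

# Load only the first 100,000 vectors to bound memory usage.
model = KeyedVectors.load_word2vec_format(path, binary=False, limit=10 ** 5)
```

Note this downloads ~1 GB the first time; the memory saving only happens at the load_word2vec_format step.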