gensim/utils.py raises TypeError: file must have a 'write' attribute

Created on 1 Feb 2017 · 24 Comments · Source: RaRe-Technologies/gensim

Hello, I am new to the gensim library.

When I run code like the example below, a TypeError occurs.

    model = models.Doc2Vec(alpha=0.025, min_alpha=0.025)  # use fixed learning rate
    model.build_vocab(sentences)
    # sentences is a labeled text data. using models.doc2vec.LabeledSentence()

    # store the model to mmap-able files
    model.save(model_basename+'.d2v')

The save method raises the error:

Traceback (most recent call last):
  File "/root/.pyenv/versions/3.6.0/lib/python3.6/site-packages/gensim/utils.py", line 493, in save
    _pickle.dump(self, fname_or_handle, protocol=pickle_protocol)
TypeError: file must have a 'write' attribute

Could you help me?

bug difficulty medium

All 24 comments

Hi,
Could you please paste code defining model_basename and complete stack trace?

This is strange because the exception should be caught by the code. The Python 3.6 tests pass in Continuous Integration where filename is given as a string.

Hi, tmylk

Thank you for your response.

The definition of model_basename is as follows.

    model_basename = './D2V/doc2vec_result/model'

I have confirmed the type of [ model_basename+'.d2v' ] is str.

The full stack trace follows.

ERROR:root:Traceback (most recent call last):
  File "/root/.pyenv/versions/3.6.0/lib/python3.6/site-packages/gensim/utils.py", line 493, in save
    _pickle.dump(self, fname_or_handle, protocol=pickle_protocol)
TypeError: file must have a 'write' attribute

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "doc2vecKmeans.py", line 114, in <module>
    model.save(model_basename+'.d2v')
  File "/root/.pyenv/versions/3.6.0/lib/python3.6/site-packages/gensim/models/word2vec.py", line 1462, in save
    super(Word2Vec, self).save(*args, **kwargs)
  File "/root/.pyenv/versions/3.6.0/lib/python3.6/site-packages/gensim/utils.py", line 497, in save
    pickle_protocol=pickle_protocol)
  File "/root/.pyenv/versions/3.6.0/lib/python3.6/site-packages/gensim/utils.py", line 369, in _smart_save
    pickle(self, fname, protocol=pickle_protocol)
  File "/root/.pyenv/versions/3.6.0/lib/python3.6/site-packages/gensim/utils.py", line 919, in pickle
    with smart_open(fname, 'wb') as fout:  # 'b' for binary, needed on Windows
  File "/root/.pyenv/versions/3.6.0/lib/python3.6/site-packages/smart_open/smart_open_lib.py", line 127, in smart_open
    return file_smart_open(parsed_uri.uri_path, mode)
  File "/root/.pyenv/versions/3.6.0/lib/python3.6/site-packages/smart_open/smart_open_lib.py", line 558, in file_smart_open
    return open(fname, mode)
FileNotFoundError: [Errno 2] No such file or directory: './D2V/doc2vec_result/model.d2v'

I am experiencing a similar issue with LsiModel. I believe you can see the problem by looking at the relevant code for saving LsiModel objects.

The save function documents that fname can be a string naming a file or an open file handle. However, if you trace back the inheritance to utils.SaveLoad you see the code that finally tries to save with pickle and it looks like this cannot possibly work with a string naming a file path. An exception will always occur in that first try block.

It also looks like the "smart save" methods always expect it to be a string naming a file path because there is an implicit assumption in the code that you can take fname and append a ".projection" onto it to get a file path for saving an LsiModel projection. This is where I'm getting this error:

Traceback (most recent call last):
  File "/home/espears/anaconda3/envs/molr-gensim/lib/python3.5/site-packages/gensim/utils.py", line 493, in save
    _pickle.dump(self, fname_or_handle, protocol=pickle_protocol)
TypeError: file must have a 'write' attribute

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "lsa_index.py", line 8, in <module>
    _save_lsi_model(lsi_model, "lsa-data/lsa-model/fixtures.model")
  File "/home/espears/molr/lib/indexing/generate_lsa_matrix.py", line 66, in _save_lsi_model
    lsi_model.save(model_path)
  File "/home/espears/anaconda3/envs/molr-gensim/lib/python3.5/site-packages/gensim/models/lsimodel.py", line 545, in save
    self.projection.save(utils.smart_extension(fname, '.projection'), *args, **kwargs)
  File "/home/espears/anaconda3/envs/molr-gensim/lib/python3.5/site-packages/gensim/utils.py", line 497, in save
    pickle_protocol=pickle_protocol)
  File "/home/espears/anaconda3/envs/molr-gensim/lib/python3.5/site-packages/gensim/utils.py", line 369, in _smart_save
    pickle(self, fname, protocol=pickle_protocol)
  File "/home/espears/anaconda3/envs/molr-gensim/lib/python3.5/site-packages/gensim/utils.py", line 919, in pickle
    with smart_open(fname, 'wb') as fout:  # 'b' for binary, needed on Windows
  File "/home/espears/anaconda3/envs/molr-gensim/lib/python3.5/site-packages/smart_open/smart_open_lib.py", line 127, in smart_open
    return file_smart_open(parsed_uri.uri_path, mode)
  File "/home/espears/anaconda3/envs/molr-gensim/lib/python3.5/site-packages/smart_open/smart_open_lib.py", line 558, in file_smart_open
    return open(fname, mode)
FileNotFoundError: [Errno 2] No such file or directory: 'lsa-data/lsa-model/fixtures.model.projection'

I can't propose a code change yet because it's very hard to jump around the "smart save" methods and determine each of the different things it will try. Can you shed any light on this?
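The chained traceback in this comment can be reproduced without gensim. Below is a minimal sketch, assuming save behaves as described here: it first tries pickle.dump on the argument and falls back to opening it as a path on TypeError. The name save_like_gensim is a hypothetical stand-in, not gensim's actual code.

```python
import os
import pickle
import tempfile


def save_like_gensim(obj, fname_or_handle):
    """Hypothetical stand-in for the try/except pattern described above."""
    try:
        # Works only for objects with a write() method; a str path raises TypeError.
        pickle.dump(obj, fname_or_handle)
    except TypeError:
        # Fallback: treat the argument as a file path.
        with open(fname_or_handle, "wb") as fout:
            pickle.dump(obj, fout)


def demo_missing_directory():
    """Return the (final, first) exception types seen when the target directory is missing."""
    with tempfile.TemporaryDirectory() as tmp:
        missing = os.path.join(tmp, "no-such-dir", "model.d2v")
        try:
            save_like_gensim({"weights": [1, 2, 3]}, missing)
        except FileNotFoundError as err:
            # err.__context__ is the TypeError raised inside the try block.
            return type(err), type(err.__context__)
    return None, None


final_error, first_error = demo_missing_directory()
```

Under this reading, the TypeError is only the first, expected failure when a string is passed. The last exception in the chain (here FileNotFoundError) is the one that explains why the save failed.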

@spearsem Does the folder 'lsa-data/lsa-model/' exist? If the folder doesn't exist then gensim will not create it.

Also @AaronKim-CN does './D2V/doc2vec_result/' folder exist?

We should probably add code to create the folders. What do you think?
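Until this is settled on the gensim side, the caller can create the target folder before saving. A minimal sketch using only the standard library; ensure_parent_dir is a hypothetical helper name, not part of gensim.

```python
import os
import tempfile
from pathlib import Path


def ensure_parent_dir(path):
    """Create the parent directory of `path` if it is missing and return `path` as a str."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    return str(target)


# Demonstration inside a temporary directory, so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    target = ensure_parent_dir(os.path.join(tmp, "D2V", "doc2vec_result", "model.d2v"))
    parent_exists = Path(target).parent.is_dir()

# Assumed usage with a trained model (not executed here):
#     model.save(ensure_parent_dir(model_basename + '.d2v'))
```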

@tmylk Is anyone working on this? Would you like me to add code to create folders?

@pranaydeep-af that would be great

@tmylk Sent PR #1225

Gensim should definitely not create any new folders. That would be unintuitive and against best practices.

I agree with @spearsem that the code path for save is too convoluted. It looks like that function, which started out simple, accumulated too much cruft. @tmylk please simplify and clarify what arguments it accepts and how the code works. If the input file is a string, we can afford to save extra files like .projections etc. But if a file object, we must use simple pickle (no extra files).

Perhaps an explicit switch (a parameter, defaults to None interpreted as True) that allows gensim to save "extra files" would be appropriate. And if the user explicitly requests True but gives a file object, fail with an exception.
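One way to read the proposed switch is sketched below, using plain pickle. The function name, the extra_files parameter and the ValueError are all assumptions made for illustration; none of this is gensim's API.

```python
import io
import pickle


def save(obj, fname_or_handle, extra_files=None):
    """Hypothetical sketch of the proposed switch; not gensim's API.

    extra_files=None means "decide automatically": side files are allowed
    for path strings and disabled for file objects.
    Returns whether side files were allowed for this call.
    """
    if hasattr(fname_or_handle, "write"):
        if extra_files:
            # An explicit True cannot be honoured without a file path.
            raise ValueError("extra files need a file path, not a file object")
        pickle.dump(obj, fname_or_handle)  # simple pickle, no extra files
        return False
    if extra_files is None:
        extra_files = True
    # With a path, a real implementation could write side files such as
    # fname + '.projection' when extra_files is True; omitted in this sketch.
    with open(fname_or_handle, "wb") as fout:
        pickle.dump(obj, fout)
    return extra_files


buffer = io.BytesIO()
allowed = save({"alpha": 0.025}, buffer)  # file object: plain pickle only
```

Checking for a write attribute up front, rather than catching TypeError, keeps the two code paths separate, so a failure in one is not reported on top of a failure in the other.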

@AaronKim-CN
I faced a similar issue. The easiest solution I came across was:

    import dill
    import gensim

    # save your model as
    with open('file-name', 'wb') as f:
        dill.dump(model, f)  # model is your trained Doc2Vec object

    # later load the model as
    model = gensim.models.Doc2Vec.load('file-name')

The error occurs because Python's pickle cannot pickle Python modules.

Hope it helps!!

CC @menshikh-iv is the behaviour described in my comment above still there?

I'd consider the current behaviour on the verge of a bug; it's really not good. Too complicated, and it still fails in common scenarios.

@lordzuko seems unrelated -- there should be no need to pickle any modules. Normal pickle works fine with gensim objects.

You're probably trying to pickle some of your lambda (unnamed) code, which pickle.dump does not support, but that's not related to the issue here.
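To check which attribute of an object blocks pickling, each instance attribute can be pickled on its own. A minimal sketch, assuming the object keeps its state in an instance `__dict__`; find_unpicklable_attributes and Holder are hypothetical names used only for this illustration.

```python
import os
import pickle


def find_unpicklable_attributes(obj):
    """Return {attribute name: error message} for instance attributes that fail to pickle."""
    failures = {}
    for name, value in vars(obj).items():
        try:
            pickle.dumps(value)
        except Exception as err:  # PicklingError, TypeError, AttributeError, ...
            failures[name] = "{}: {}".format(type(err).__name__, err)
    return failures


class Holder:
    """Toy object standing in for a model that carries stray attributes."""

    def __init__(self):
        self.alpha = 0.025        # plain data pickles fine
        self.helper = os          # a module reference does not pickle
        self.rule = lambda x: x   # neither does a lambda


problems = find_unpicklable_attributes(Holder())
```

The helper only inspects top-level attributes. If a container holds the offending value, the container's name is reported and has to be inspected separately.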

@piskvorky I tried the save method and it gave me the same error as mentioned in the issue. The error does say that the smart save fails because the model couldn't be pickled.
Doesn't that mean the object could not be pickled using Python's pickle? What I suggested did work for me, and I could easily save the model and load it back.

@lordzuko the original error was simply due to @AaronKim-CN trying to save into a directory that doesn't exist (FileNotFoundError: [Errno 2] No such file or directory).

Is it really the same with you? If so, doesn't creating the directory solve the problem for you too?

If not, it's a different issue.

My previous comment was about @spearsem's observation that save(X) works only when X is a string. It doesn't work when X is a file object (or works weirdly, and some models fail), which I consider a bug.

@piskvorky In my case the directory existed, but the model couldn't be saved and gave the same error. I was trying to save a Doc2Vec model.

@lordzuko ok, please send the full code snippet + full stack trace here. Thanks!

@piskvorky Here is the code snippet and full stack trace (Let me know if I should create a new issue)

Code Snippet:

english_em = gensim.models.wrappers.fasttext.FastTextKeyedVectors.load_word2vec_format('../data/embeddings/wiki.en.vec')
corpus_used = test_doc_for_model
# build the model
model = Doc2Vec(size=300,iter=1,  
                window=10, seed=1337, min_count=5, workers=4,
                alpha=0.025,min_alpha=0.025)
model.build_vocab(corpus_used)
model.intersect_word2vec_format(english_em.vocab)
model.train(corpus_used, total_examples=len(corpus_used), epochs=1)
model.save('model/model_en_.d2v')

Full Trace:

TypeError                                 Traceback (most recent call last)
~/virtualenv/document-clustering/lib/python3.4/site-packages/gensim/utils.py in save(self, fname_or_handle, separately, sep_limit, ignore, pickle_protocol)
    499         try:
--> 500             _pickle.dump(self, fname_or_handle, protocol=pickle_protocol)
    501             logger.info("saved %s object" % self.__class__.__name__)

TypeError: file must have a 'write' attribute

During handling of the above exception, another exception occurred:

PicklingError                             Traceback (most recent call last)
<ipython-input-23-db263796e7c8> in <module>()
----> 1 model.save('model/model_en_.d2v')

~/virtualenv/document-clustering/lib/python3.4/site-packages/gensim/models/word2vec.py in save(self, *args, **kwargs)
   1404         kwargs['ignore'] = kwargs.get('ignore', ['syn0norm', 'table', 'cum_table'])
   1405 
-> 1406         super(Word2Vec, self).save(*args, **kwargs)
   1407 
   1408     save.__doc__ = utils.SaveLoad.save.__doc__

~/virtualenv/document-clustering/lib/python3.4/site-packages/gensim/utils.py in save(self, fname_or_handle, separately, sep_limit, ignore, pickle_protocol)
    502         except TypeError:  # `fname_or_handle` does not have write attribute
    503             self._smart_save(fname_or_handle, separately, sep_limit, ignore,
--> 504                              pickle_protocol=pickle_protocol)
    505 #endclass SaveLoad
    506 

~/virtualenv/document-clustering/lib/python3.4/site-packages/gensim/utils.py in _smart_save(self, fname, separately, sep_limit, ignore, pickle_protocol)
    374                                        compress, subname)
    375         try:
--> 376             pickle(self, fname, protocol=pickle_protocol)
    377         finally:
    378             # restore attribs handled specially

~/virtualenv/document-clustering/lib/python3.4/site-packages/gensim/utils.py in pickle(obj, fname, protocol)
    928     """
    929     with smart_open(fname, 'wb') as fout:  # 'b' for binary, needed on Windows
--> 930         _pickle.dump(obj, fout, protocol=protocol)
    931 
    932 

PicklingError: Can't pickle <class 'module'>: attribute lookup module on builtins failed

Thanks! Yes, looks unrelated to the original error.

@gojomo any ideas, can you reproduce?

intersect_word2vec_format() expects a path-to-a-file, not an in-memory dict, as its argument – so I'm surprised the code gets past that line. (It looks to me like the smart_open() on a non-path should fail.)

@lordzuko Can you reproduce without the non-standard intersect_word2vec_format() usage? (Are you using a standard version of intersect_word2vec_format()? It doesn't look like your code is likely to do as intended...)

@gojomo Yes, I tried without intersect_word2vec_format(), and I am still getting the same error.

I think it's a new issue. Should I raise a new one?

print ('Data Loading finished')
corpus_used = test_doc_for_model
# build the model
model = Doc2Vec(size=300,iter=1,  
                window=10, seed=1337, min_count=5, workers=4,
                alpha=0.025,min_alpha=0.025)
model.build_vocab(corpus_used)
#model.intersect_word2vec_format(english_em.vocab)
model.save('model/model_en_.d2v')

I am getting the following trace:

TypeError                                 Traceback (most recent call last)
~/virtualenv/document-clustering/lib/python3.4/site-packages/gensim/utils.py in save(self, fname_or_handle, separately, sep_limit, ignore, pickle_protocol)
    499         try:
--> 500             _pickle.dump(self, fname_or_handle, protocol=pickle_protocol)
    501             logger.info("saved %s object" % self.__class__.__name__)

TypeError: file must have a 'write' attribute

During handling of the above exception, another exception occurred:

PicklingError                             Traceback (most recent call last)
<ipython-input-46-db263796e7c8> in <module>()
----> 1 model.save('model/model_en_.d2v')

~/virtualenv/document-clustering/lib/python3.4/site-packages/gensim/models/word2vec.py in save(self, *args, **kwargs)
   1404         kwargs['ignore'] = kwargs.get('ignore', ['syn0norm', 'table', 'cum_table'])
   1405 
-> 1406         super(Word2Vec, self).save(*args, **kwargs)
   1407 
   1408     save.__doc__ = utils.SaveLoad.save.__doc__

~/virtualenv/document-clustering/lib/python3.4/site-packages/gensim/utils.py in save(self, fname_or_handle, separately, sep_limit, ignore, pickle_protocol)
    502         except TypeError:  # `fname_or_handle` does not have write attribute
    503             self._smart_save(fname_or_handle, separately, sep_limit, ignore,
--> 504                              pickle_protocol=pickle_protocol)
    505 #endclass SaveLoad
    506 

~/virtualenv/document-clustering/lib/python3.4/site-packages/gensim/utils.py in _smart_save(self, fname, separately, sep_limit, ignore, pickle_protocol)
    374                                        compress, subname)
    375         try:
--> 376             pickle(self, fname, protocol=pickle_protocol)
    377         finally:
    378             # restore attribs handled specially

~/virtualenv/document-clustering/lib/python3.4/site-packages/gensim/utils.py in pickle(obj, fname, protocol)
    928     """
    929     with smart_open(fname, 'wb') as fout:  # 'b' for binary, needed on Windows
--> 930         _pickle.dump(obj, fout, protocol=protocol)
    931 
    932 

PicklingError: Can't pickle <class 'module'>: attribute lookup module on builtins failed

Can you supply an example that reproduces the error, without using intersect_word2vec_format(), starting fresh? (Your latest comment still shows int being used, and with the wrong argument type – an in-memory dict rather than a path-to-a-file.)

@gojomo I have commented out intersect_word2vec_format() in the above code, if that's what you meant by "Your latest comment still shows int being used, and with the wrong argument type – an in-memory dict rather than a path-to-a-file."

Anyway, I will start fresh and raise a new issue with the stack trace.
Thanks

Is there any progress on the original issue? I got the same error when trying to save a doc2vec model. Can anyone help?

 model.train(new_docs,total_examples=model.corpus_count,epochs=70)
 model.save(model_path)  #model_path is a str. 
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/gensim/utils.py", line 692, in save
    _pickle.dump(self, fname_or_handle, protocol=pickle_protocol)
TypeError: file must have a 'write' attribute

@GD-Susan What version of gensim are you using?

3.8.1. I have solved this problem in another way. Thanks for the reply.
