Gensim: Doc2Vec to wikipedia articles notebook error - object has no attribute

Created on 7 Jun 2018 · 10 Comments · Source: RaRe-Technologies/gensim

For the Doc2Vec to wikipedia articles notebook (https://github.com/RaRe-Technologies/gensim/blob/develop/docs/notebooks/doc2vec-wikipedia.ipynb) I get this error:

pre = Doc2Vec(min_count=0)
pre.scan_vocab(documents)
executed in 11ms, finished 09:09:33 2018-06-07
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-10-0b281772fe9f> in <module>()
      1 pre = Doc2Vec(min_count=0)
----> 2 pre.scan_vocab(documents)
AttributeError: 'Doc2Vec' object has no attribute 'scan_vocab'

I also get a similar error for the next cell of the notebook:

AttributeError: 'Doc2Vec' object has no attribute 'scale_vocab'

Your Doc2Vec notebook on the Lee dataset (https://github.com/RaRe-Technologies/gensim/blob/develop/docs/notebooks/doc2vec-lee.ipynb) works fine for me though.

=======================
I am using Ubuntu 16.04.4, Python 3.6, and the latest version of Gensim.

Labels: bug, difficulty easy, documentation

All 10 comments

@impulsecorp can you please try with the latest release, 3.5.0?
CC @gojomo

I think this is likely fallout from #1777 @manneshiva refactorings.

@impulsecorp is the problem still reproducible with gensim==3.5.0?

Sorry, I don't have that project running any more, so I can't try it again.

Reproduced with gensim==3.5.0

from gensim.models import Doc2Vec

model = Doc2Vec()
model.scan_vocab()
AttributeError                            Traceback (most recent call last)
<ipython-input-6-4f495aae01df> in <module>()
      2 
      3 model = Doc2Vec()
----> 4 model.scan_vocab()

AttributeError: 'Doc2Vec' object has no attribute 'scan_vocab'


But I'm not sure that this is a bug, because users shouldn't call this method directly (only build_vocab should be used). For this reason, we need to update the notebook, not the d2v code.

Pre #1777, users could choose to call the 3 steps of build_vocab() (scan_vocab(), prepare_vocab(), finalize_vocab()) manually, instead of just build_vocab(), in either Word2Vec or Doc2Vec, if they wanted to do extra reporting/tinkering between those steps.

@gojomo yes, and scan_vocab and prepare_vocab are still available as methods of model.vocabulary, but finalize_vocab was partially replaced by model.trainables.prepare_weights.
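
As a rough illustration of the comment above, a notebook that has to run on releases from both sides of #1777 could look for a step on the model first and fall back to model.vocabulary. This is only a sketch: the helper name resolve_vocab_step and the stub classes are hypothetical and stand in for real gensim models. Only the names scan_vocab and vocabulary are taken from this thread.

```python
def resolve_vocab_step(model, name):
    """Return the bound method `name` from the model or from model.vocabulary."""
    # Pre-#1777 layout: the step is a method of the model itself.
    step = getattr(model, name, None)
    if callable(step):
        return step
    # Post-#1777 layout (as described in this thread): the step lives on
    # model.vocabulary. getattr on None simply yields the default.
    vocabulary = getattr(model, "vocabulary", None)
    step = getattr(vocabulary, name, None)
    if callable(step):
        return step
    raise AttributeError(f"{type(model).__name__!r} exposes no {name!r} step")


# Hypothetical stand-ins, NOT gensim classes, used only to exercise the helper.
class _OldStyleModel:
    def scan_vocab(self, documents):
        return "model.scan_vocab"


class _Vocabulary:
    def scan_vocab(self, documents):
        return "model.vocabulary.scan_vocab"


class _NewStyleModel:
    def __init__(self):
        self.vocabulary = _Vocabulary()


old_location = resolve_vocab_step(_OldStyleModel(), "scan_vocab")([])
new_location = resolve_vocab_step(_NewStyleModel(), "scan_vocab")([])
print(old_location, new_location)
```

Locating the method is only part of a port: the argument lists of the steps may also differ between releases, so the signature should be checked against the installed version.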

Can I work on this?

@Naba7 feel free to take any open issue (no need to ask each time) :)

IMHO the ideal fix would restore the symmetry/consistency that the Word2Vec-related classes had before #1777, where you could always replace a single build_vocab() call with its individual constituent steps, under consistent names, if you wanted more control (such as extra actions/analysis between steps). And, to prevent future regressions, it would add test methods confirming that such an 'un-bundling' of the build_vocab() steps gives the same results.
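
The kind of regression test suggested here might look roughly like the sketch below. It uses a toy class rather than gensim itself: ToyVocabModel and its internals are hypothetical, and only the step names (scan_vocab, scale_vocab, finalize_vocab, build_vocab) come from this thread. A real test would build two Word2Vec or Doc2Vec models and compare their vocabularies in the same way.

```python
from collections import Counter


class ToyVocabModel:
    """Hypothetical stand-in for a Word2Vec-style model; not gensim code."""

    def __init__(self, min_count=1):
        self.min_count = min_count
        self.raw_vocab = Counter()
        self.vocab = {}
        self.index2word = []

    def scan_vocab(self, sentences):
        # Step 1: count raw token frequencies.
        self.raw_vocab = Counter(
            token for sentence in sentences for token in sentence
        )

    def scale_vocab(self):
        # Step 2: drop tokens rarer than min_count.
        self.vocab = {
            w: c for w, c in self.raw_vocab.items() if c >= self.min_count
        }

    def finalize_vocab(self):
        # Step 3: fix a deterministic order (most frequent first, ties by word).
        self.index2word = sorted(self.vocab, key=lambda w: (-self.vocab[w], w))

    def build_vocab(self, sentences):
        # The bundled call is just the three steps in order.
        self.scan_vocab(sentences)
        self.scale_vocab()
        self.finalize_vocab()


def test_unbundled_steps_match_build_vocab():
    sentences = [["a", "b", "a"], ["b", "c"], ["a"]]

    bundled = ToyVocabModel(min_count=2)
    bundled.build_vocab(sentences)

    stepwise = ToyVocabModel(min_count=2)
    stepwise.scan_vocab(sentences)
    # Extra reporting or analysis could happen here, between steps.
    stepwise.scale_vocab()
    stepwise.finalize_vocab()

    assert stepwise.vocab == bundled.vocab
    assert stepwise.index2word == bundled.index2word


test_unbundled_steps_match_build_vocab()
```

Comparing the resulting state, rather than just checking that the methods exist, is what would catch a refactoring that keeps the names but changes what a step does.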
