Gensim: Fix deprecations in SoftCosineSimilarity

Created on 9 Sep 2020  Â·  6Comments  Â·  Source: RaRe-Technologies/gensim

When running our CI test suite, I see an array of deprecation warnings:

https://dev.azure.com/rare-technologies/gensim-ci/_build/results?buildId=287&view=logs&jobId=f9575ddc-dec8-54e6-9d26-abb8bdd9bed7&j=f9575ddc-dec8-54e6-9d26-abb8bdd9bed7&t=180156a9-2bf9-537d-c84a-ef9e808c0367

Some are from gensim, some from scipy:
Screen Shot 2020-09-09 at 09 16 00

@Witiko could you please have a look? Is it something your existing PR already addresses?

If not, can you please fix those? Thanks.

bug housekeeping impact MEDIUM reach HIGH

All 6 comments

…and we're also still using the deprecated smart_open.smart_open apparently, CC @mpenkov :

gensim/test/test_api.py::TestApi::test_load_dataset
gensim/test/test_api.py::TestApi::test_multipart_load
  d:\a\1\s\.tox\py37-win\lib\site-packages\smart_open\smart_open_lib.py:254: UserWarning: This function is deprecated, use smart_open.open instead. See the migration notes for details: https://github.com/RaRe-Technologies/smart_open/blob/master/README.rst#migrating-to-the-new-open-function
    'See the migration notes for details: %s' % _MIGRATION_NOTES_URL

And more:

gensim/test/test_datatype.py: 2 warnings
gensim/test/test_doc2vec.py: 14862 warnings
gensim/test/test_fasttext.py: 96 warnings
gensim/test/test_keyedvectors.py: 20 warnings
gensim/test/test_word2vec.py: 84 warnings
  D:\a\1\s\gensim\models\keyedvectors.py:1478: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
    fout.write(utils.to_utf8(prefix + str(key)) + b" " + row.tostring())

The whole CI log could do with a careful inspection.

@piskvorky I can remove the deprecated parameters together with their tests and open a PR. Please note that this will break backwards compatibility, i.e. the PR should only be merged if we are going for the next major release of Gensim.

Thanks! Yes, we are going for the next major release. Current develop no longer supports py2.x etc, it's all 4.0.0+.

Most of the deprecation warnings are fixed by #2940 except ...

…and we're also still using the deprecated smart_open.smart_open apparently, CC @mpenkov :

gensim/test/test_api.py::TestApi::test_load_dataset
gensim/test/test_api.py::TestApi::test_multipart_load
  d:\a\1\s\.tox\py37-win\lib\site-packages\smart_open\smart_open_lib.py:254: UserWarning: This function is deprecated, use smart_open.open instead. See the migration notes for details: https://github.com/RaRe-Technologies/smart_open/blob/master/README.rst#migrating-to-the-new-open-function
    'See the migration notes for details: %s' % _MIGRATION_NOTES_URL

... this one, which originates from api.load("__testing_matrix-synopsis", return_path=True) where api is gensim.downloader. This likely needs to be fixed in gensim-data.

Awesome, thanks!

except … this one, which originates from api.load("__testing_matrix-synopsis", return_path=True) where api is gensim.downloader.

It's probably the test line a bit lower down, without the return_path=True. That line will load the __testing_matrix-synopsis dataset, which does indeed call smart_open.smart_open():
https://github.com/RaRe-Technologies/gensim-data/releases/tag/__testing_matrix-synopsis

Gensim-data releases are immutable, so we should make a new release of the __testing_matrix-synopsis dataset, with smart_open.smart_open replaced by smart_open.open. Not urgent, and definitely not blocking for this PR.

@mpenkov feel free to merge #2940 when ready.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

sairampillai picture sairampillai  Â·  3Comments

menshikh-iv picture menshikh-iv  Â·  3Comments

dancinghui picture dancinghui  Â·  4Comments

simonm3 picture simonm3  Â·  3Comments

franciscojavierarceo picture franciscojavierarceo  Â·  3Comments