Gensim: AttributeError: module 'smart_open' has no attribute 's3'

Created on 30 Mar 2020 · 18 Comments · Source: RaRe-Technologies/gensim

Python 3.6.
Importing gensim fails with:

AttributeError                            Traceback (most recent call last)
<ipython-input-8-492cbaec3fd5> in <module>()
     12 import configparser
     13 
---> 14 from dataset.conversation import Conversation
     15 # from train import train
     16 logging.basicConfig(stream=sys.stdout, format="%(asctime)s %(levelname)s %(message)s", level=logging.INFO)

~/SageMaker/conversation-platform/src/dataset/conversation.py in <module>()
      5 import torch
      6 import json
----> 7 from preprocess.builder import PipeBuilder
      8 from utils import VECTORIZED_COL, PATICIPANT_COL, BEGIN_TIME_COL, END_TIME_COL, AGENT_COL, CUSTOMER_COL
      9 

~/SageMaker/conversation-platform/src/preprocess/builder.py in <module>()
----> 1 from preprocess.filtering import IntervalsFilter, ParticipateFilter
      2 import preprocess.transformer as transformer
      3 from sklearn.pipeline import Pipeline
      4 from utils import build_embedding_matrix
      5 

~/SageMaker/conversation-platform/src/preprocess/filtering.py in <module>()
      1 from sklearn.base import BaseEstimator, TransformerMixin
      2 import pandas as pd
----> 3 from utils import UTTERANCE_ITEM
      4 
      5 

~/SageMaker/conversation-platform/src/utils.py in <module>()
      6 from sklearn import metrics
      7 import numpy as np
----> 8 from gensim.models import KeyedVectors
      9 
     10 

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/gensim/__init__.py in <module>()
      3 """
      4 
----> 5 from gensim import parsing, corpora, matutils, interfaces, models, similarities, summarization, utils  # noqa:F401
      6 import logging
      7 

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/gensim/parsing/__init__.py in <module>()
      2 
      3 from .porter import PorterStemmer  # noqa:F401
----> 4 from .preprocessing import (remove_stopwords, strip_punctuation, strip_punctuation2,  # noqa:F401
      5                             strip_tags, strip_short, strip_numeric,
      6                             strip_non_alphanum, strip_multiple_whitespaces,

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/gensim/parsing/preprocessing.py in <module>()
     40 import glob
     41 
---> 42 from gensim import utils
     43 from gensim.parsing.porter import PorterStemmer
     44 

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/gensim/utils.py in <module>()
     43 from six.moves import range
     44 
---> 45 from smart_open import open
     46 
     47 from multiprocessing import cpu_count

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/smart_open/__init__.py in <module>()
     25 from smart_open import version
     26 
---> 27 from .smart_open_lib import open, smart_open, register_compressor
     28 from .s3 import iter_bucket as s3_iter_bucket
     29 __all__ = ['open', 'smart_open', 's3_iter_bucket', 'register_compressor']

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/smart_open/smart_open_lib.py in <module>()
     36 # smart_open.submodule to reference to the submodules.
     37 #
---> 38 import smart_open.s3 as smart_open_s3
     39 import smart_open.hdfs as smart_open_hdfs
     40 import smart_open.webhdfs as smart_open_webhdfs

AttributeError: module 'smart_open' has no attribute 's3'

All 18 comments

@mpenkov could this be some version mismatch? Gensim requires smart_open >= 1.8.1, though I don't see how that could result in such an error.

@eliksr please fill in the issue template fully, including software versions.

Yeah, I'm not sure what's going on here, either. We'll need the version numbers to proceed.

We had a similar issue reported here very recently: https://github.com/RaRe-Technologies/smart_open/issues/446. It's unlikely to be a coincidence.
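For anyone asked for version numbers as above, here is a minimal sketch of a script that collects them. The helper names `installed_version` and `environment_report` are made up for this illustration, and the list of distributions is only a guess at what is relevant to this thread. It uses `importlib.metadata` where available (Python 3.8+) and falls back to `pkg_resources` on older interpreters such as the 3.6 one reported here.

```python
import platform
import sys

try:
    from importlib import metadata  # standard library from Python 3.8 onwards
except ImportError:
    metadata = None  # older interpreter: fall back to pkg_resources below


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    if metadata is None:
        import pkg_resources
        try:
            return pkg_resources.get_distribution(dist_name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def environment_report(dist_names):
    """Build a plain-text report of the platform, Python, and package versions."""
    lines = [platform.platform(), "Python " + sys.version.replace("\n", " ")]
    for name in dist_names:
        lines.append("{} {}".format(name, installed_version(name) or "not installed"))
    return "\n".join(lines)


if __name__ == "__main__":
    # The distribution names are an assumption about what matters for this issue.
    print(environment_report([
        "gensim", "smart_open", "google-cloud-storage", "protobuf", "numpy", "scipy",
    ]))
```

Reading the versions from package metadata rather than importing the packages means the report still works when `import gensim` itself fails.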

Hi @piskvorky @mpenkov,
My environment is AWS SageMaker, Python 3.6:

Linux-4.14.171-105.231.amzn1.x86_64-x86_64-with-glibc2.9
Python 3.6.5 |Anaconda, Inc.| (default, Apr 29 2018, 16:14:56) 
[GCC 7.2.0]
NumPy 1.15.4
SciPy 1.1.0

When I try to 'import gensim' I get this error message (AttributeError: module 'smart_open' has no attribute 's3').
When I pip installed gensim I got this:

fastai 1.0.60 requires nvidia-ml-py3, which is not installed.
googleapis-common-protos 1.51.0 has requirement protobuf>=3.6.0, but you'll have protobuf 3.5.2 which is incompatible.

When I try to import gensim.models.KeyedVectors I get:

~/SageMaker/conversation-platform/src/utils.py in <module>()
      6 from sklearn import metrics
      7 import numpy as np
----> 8 from gensim.models import KeyedVectors
      9 import logging
     10 

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/gensim/__init__.py in <module>()
      3 """
      4 
----> 5 from gensim import parsing, corpora, matutils, interfaces, models, similarities, summarization, utils  # noqa:F401
      6 import logging
      7 

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/gensim/parsing/__init__.py in <module>()
      2 
      3 from .porter import PorterStemmer  # noqa:F401
----> 4 from .preprocessing import (remove_stopwords, strip_punctuation, strip_punctuation2,  # noqa:F401
      5                             strip_tags, strip_short, strip_numeric,
      6                             strip_non_alphanum, strip_multiple_whitespaces,

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/gensim/parsing/preprocessing.py in <module>()
     40 import glob
     41 
---> 42 from gensim import utils
     43 from gensim.parsing.porter import PorterStemmer
     44 

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/gensim/utils.py in <module>()
     43 from six.moves import range
     44 
---> 45 from smart_open import open
     46 
     47 from multiprocessing import cpu_count

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/smart_open/__init__.py in <module>()
     25 from smart_open import version
     26 
---> 27 from .smart_open_lib import open, smart_open, register_compressor
     28 from .s3 import iter_bucket as s3_iter_bucket
     29 __all__ = ['open', 'smart_open', 's3_iter_bucket', 'register_compressor']

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/smart_open/smart_open_lib.py in <module>()
     41 import smart_open.http as smart_open_http
     42 import smart_open.ssh as smart_open_ssh
---> 43 import smart_open.gcs as smart_open_gcs
     44 
     45 from smart_open import doctools

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/smart_open/gcs.py in <module>()
     12 import sys
     13 
---> 14 import google.cloud.exceptions
     15 import google.cloud.storage
     16 import google.auth.transport.requests as google_requests

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/google/cloud/exceptions.py in <module>()
     22 from __future__ import absolute_import
     23 
---> 24 from google.api_core import exceptions
     25 
     26 try:

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/google/api_core/__init__.py in <module>()
     21 
     22 
---> 23 __version__ = get_distribution("google-api-core").version

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/pkg_resources/__init__.py in get_distribution(dist)
    470     require(dist_spec)[0].run_script(script_name, ns)
    471 
--> 472 
    473 # backward compatibility
    474 run_main = run_script

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/pkg_resources/__init__.py in get_provider(moduleOrReq)
    342 DEVELOP_DIST = -1
    343 
--> 344 
    345 def register_loader_type(loader_type, provider_factory):
    346     """Register `provider_factory` to make providers for `loader_type`

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/pkg_resources/__init__.py in require(self, *requirements)
    890         return distributions, error_info
    891 
--> 892     def require(self, *requirements):
    893         """Ensure that distributions matching `requirements` are activated
    894 

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/pkg_resources/__init__.py in resolve(self, requirements, env, installer, replace_conflicting, extras)
    781                     dist = best[req.key] = env.best_match(
    782                         req, ws, installer,
--> 783                         replace_conflicting=replace_conflicting
    784                     )
    785                     if dist is None:

ContextualVersionConflict: (protobuf 3.5.2 (/home/ec2-user/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages), Requirement.parse('protobuf>=3.6.0'), {'googleapis-common-protos'})

Thanks. Seems related to the new GCS functionality in smart_open and conflicting versions, although I don't understand what's going on exactly.

CC @petedannemann – any ideas?
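The `ContextualVersionConflict` in the traceback above comes down to one comparison: the installed protobuf 3.5.2 is older than the `protobuf>=3.6.0` that googleapis-common-protos requires. A minimal sketch of that check follows. The helpers `parse_version` and `satisfies_minimum` are hypothetical and only handle plain dotted numeric versions; real tooling such as pkg_resources also understands pre-releases and compound specifiers.

```python
def parse_version(text):
    """Turn a plain dotted version such as '3.5.2' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def satisfies_minimum(installed, minimum):
    """Return True if the installed version meets a '>=' requirement."""
    return parse_version(installed) >= parse_version(minimum)


# Values taken from the traceback: protobuf 3.5.2 is installed, >=3.6.0 is required.
print(satisfies_minimum("3.5.2", "3.6.0"))  # False
```

Comparing tuples of integers rather than raw strings matters here, because a string comparison would rank "3.10.0" below "3.6.0".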

As a work-around, I think we could move the gcs stuff behind extras. It's still relatively new, and most people would not have time to learn to expect it there.

https://github.com/RaRe-Technologies/smart_open/pull/454

OK. It's also a good case for keeping the smart_open core lean – who knows what other unnecessary issues there are with the various dependencies, that no one reported.
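To illustrate the "lean core" idea discussed above, here is a minimal sketch of loading an optional backend lazily, so that a broken optional dependency cannot break the top-level import. This is a generic pattern, not smart_open's actual implementation; `load_optional_backend` and its `extra_name` parameter are hypothetical.

```python
import importlib


def load_optional_backend(module_name, extra_name):
    """Import an optional backend on first use instead of at package import time."""
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        # Re-raise with a hint about which optional extra provides the backend.
        raise ImportError(
            "{} is unavailable; install the '{}' extra to enable it".format(
                module_name, extra_name
            )
        ) from err


# Usage: a module that is present loads normally.
json_backend = load_optional_backend("json", "demo")
print(json_backend.dumps({"ok": True}))
```

A version conflict like the one reported in this thread is not an `ImportError`, so it would still propagate, but only when the affected backend is actually requested rather than on every import of the package.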

Sure looks like a version conflict to me. Pinning a specific version of the google-cloud-storage package in smart_open could fix this.

Thanks all for looking into this. I've been running into this error since early last week and still haven't figured out a workaround.

PS - I'm working in Google Cloud Datalab with Python 3.5.6.

@aramidek it's very likely that Google Cloud Datalab uses a different version of google-cloud-storage than smart_open does. google-cloud-storage and the related Google APIs have had a lot of breaking API changes, so it will take a lot of poking around to figure out the cause.

@petedannemann which google-cloud-storage version is smart_open compatible with? Would it help if I changed that version on the instance I'm working in?

One more thing. I uninstalled gensim and reinstalled version 3.4.0. This time I got this error relating to Google Cloud Storage:

ContextualVersionConflict: (google-resumable-media 0.3.2 (/usr/local/envs/py3env/lib/python3.5/site-packages), Requirement.parse('google-resumable-media<0.6dev,>=0.5.0'), {'google-cloud-storage'})

Given that, I installed google-resumable-media==0.5.0. After running the import again, I got the same error relating to the s3 attribute:

AttributeError: module 'smart_open' has no attribute 's3'

smart-open has been tested outside of gensim with versions 1.26.0 and 1.27.0 of google-cloud-storage. Right now we don't pin a specific version of google-cloud-storage in smart-open's installation, so it pulls the latest version, 1.27.0.

I think a short-term work-around for the problem is to downgrade smart_open to the previously working version, in this case 1.9.0.

@mpenkov I ran it with the downgraded smart_open version and it works from Datalab as follows:

!pip install gensim==3.4.0
!pip install smart_open==1.9.0
import smart_open
import gensim

It did throw the deprecation warning below:

/usr/local/envs/py3env/lib/python3.5/site-packages/scipy/sparse/sparsetools.py:20: DeprecationWarning: scipy.sparse.sparsetools is deprecated!
scipy.sparse.sparsetools is a private module for scipy.sparse, and should not be used.
_deprecated()

As of this release, GCS needs to be installed separately.
https://github.com/RaRe-Technologies/smart_open/releases/tag/1.11.1

I think it would be better to change smart_open to smart_open[all] in requires.

@eliksr do you still see issues with the latest smart_open 2.0.0? That should have fixed this.

If so, please post the full traceback of import smart_open.
If not, we'll close this as fixed.

Closing for the 3.8.3 release – if you see further problems, please let us know and we can reopen.
