Today, our daily gcloud ml-engine job broke. The problem appears to come from a change in the module _grpc-google-iam-v1_: it seems it was updated from 0.11.1 (?) to 0.11.3 on the server side.
I compared the "broken" version with the "working" version and realized that the only diff is in the module grpc-google-iam-v1.
Python version: Python 2.7
Google-Cloud version:
google-cloud==0.24.0
google-cloud-storage==1.2.0
grpc-google-iam-v1==0.11.3
It broke because pip installed _grpc-google-iam-v1==0.11.3_, which calls RegisterServiceDescriptor on a SymbolDatabase object that does not provide that attribute.
Stacktrace:
Traceback (most recent call last):
File "/usr/local/bin/pip", line 5, in <module>
from pkg_resources import load_entry_point
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 3074, in <module>
@_call_aside
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 3058, in _call_aside
f(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 3102, in _initialize_master_working_set
for dist in working_set
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 3102, in <genexpr>
for dist in working_set
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 2633, in activate
declare_namespace(pkg)
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 2191, in declare_namespace
_handle_ns(packageName, path_item)
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 2126, in _handle_ns
loader.load_module(packageName)
File "/usr/lib/python2.7/pkgutil.py", line 246, in load_module
mod = imp.load_module(fullname, self.file, self.filename, self.etc)
File "/root/.local/lib/python2.7/site-packages/google/cloud/pubsub/__init__.py", line 30, in <module>
from google.cloud.pubsub.client import Client
File "/root/.local/lib/python2.7/site-packages/google/cloud/pubsub/client.py", line 28, in <module>
from google.cloud.pubsub._gax import _PublisherAPI as GAXPublisherAPI
File "/root/.local/lib/python2.7/site-packages/google/cloud/pubsub/_gax.py", line 19, in <module>
from google.cloud.gapic.pubsub.v1.publisher_client import PublisherClient
File "/root/.local/lib/python2.7/site-packages/google/cloud/gapic/pubsub/v1/publisher_client.py", line 37, in <module>
from google.iam.v1 import iam_policy_pb2
File "/root/.local/lib/python2.7/site-packages/google/iam/v1/iam_policy_pb2.py", line 296, in <module>
_sym_db.RegisterServiceDescriptor(_IAMPOLICY)
AttributeError: 'SymbolDatabase' object has no attribute 'RegisterServiceDescriptor'
Steps to reproduce:
It's very easy to reproduce. Because of #3736, we have to freeze the following versions:
google-cloud == 0.24.0
google-cloud-storage == 1.2.0
PS: I have not tested the latest versions yet, but at least _grpc-google-iam-v1==0.11.3_ is not compatible with these google-cloud versions.
Workaround: freeze the version _grpc-google-iam-v1==0.11.1_
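If you manage dependencies through a requirements file, the workaround amounts to pinning the package there (the file layout below is illustrative):

```
# requirements.txt -- pin grpc-google-iam-v1 until the incompatibility is resolved
google-cloud==0.24.0
google-cloud-storage==1.2.0
grpc-google-iam-v1==0.11.1
```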
@sonlac You can address #3736 instead by upgrading to the latest version (2.18.4) of requests
Can you upgrade to the latest version (1.4.0) of google-cloud-storage (which has a hard lower bound of 2.18.0 for requests) and see if the problem still persists?
@dhermes Thank you. I re-tested after upgrading the packages. In both cases it raises the same exception as above.
Don't you think the problem comes from the module grpc-google-iam-v1-0.11.3?
Case 1:
google-cloud==0.24.0
google-cloud-storage==1.4.0
requests==2.18.4
Successfully installed packages in ml-engine job:
17:08:29.488
Successfully installed chardet-3.0.4 gapic-google-cloud-datastore-v1-0.15.3 gapic-google-cloud-error-reporting-v1beta1-0.15.3 gapic-google-cloud-logging-v2-0.91.3 gapic-google-cloud-pubsub-v1-0.15.4 gapic-google-cloud-spanner-admin-database-v1-0.15.3 gapic-google-cloud-spanner-admin-instance-v1-0.15.3 gapic-google-cloud-spanner-v1-0.15.3 gapic-google-cloud-speech-v1beta1-0.15.3 gapic-google-cloud-vision-v1-0.90.3 google-cloud-0.24.0 google-cloud-bigquery-0.24.0 google-cloud-bigtable-0.24.0 google-cloud-core-0.24.1 google-cloud-datastore-1.3.0 google-cloud-dns-0.24.0 google-cloud-error-reporting-0.24.3 google-cloud-language-0.24.1 google-cloud-logging-0.24.0 google-cloud-monitoring-0.24.0 google-cloud-pubsub-0.24.0 google-cloud-resource-manager-0.24.0 google-cloud-runtimeconfig-0.24.0 google-cloud-spanner-0.24.2 google-cloud-speech-0.24.0 google-cloud-storage-1.4.0 google-cloud-translate-0.24.0 google-cloud-vision-0.24.0 google-resumable-media-0.2.3 grpc-google-iam-v1-0.11.3 grpcio-1.6.0 idna-2.6 pandas-0.20.3 proto-google-cloud-datastore-v1-0.90.4 proto-google-cloud-error-reporting-v1beta1-0.15.3 proto-google-cloud-logging-v2-0.91.3 proto-google-cloud-pubsub-v1-0.15.4 proto-google-cloud-spanner-admin-database-v1-0.15.3 proto-google-cloud-spanner-admin-instance-v1-0.15.3 proto-google-cloud-spanner-v1-0.15.3 proto-google-cloud-speech-v1beta1-0.15.3 proto-google-cloud-vision-v1-0.90.3 requests-2.18.4 urllib3-1.22
and
Case 2:
google-cloud==0.27.0
google-cloud-storage==1.4.0
requests==2.18.4
Successfully installed packages in ml-engine job:
19:25:07.678
Successfully installed chardet-3.0.4 gapic-google-cloud-datastore-v1-0.15.3 gapic-google-cloud-error-reporting-v1beta1-0.15.3 gapic-google-cloud-logging-v2-0.91.3 gapic-google-cloud-pubsub-v1-0.15.4 gapic-google-cloud-spanner-admin-database-v1-0.15.3 gapic-google-cloud-spanner-admin-instance-v1-0.15.3 gapic-google-cloud-spanner-v1-0.15.3 google-cloud-0.27.0 google-cloud-bigquery-0.26.0 google-cloud-bigtable-0.26.0 google-cloud-core-0.26.0 google-cloud-datastore-1.2.0 google-cloud-dns-0.26.0 google-cloud-error-reporting-0.26.0 google-cloud-language-0.27.0 google-cloud-logging-1.2.0 google-cloud-monitoring-0.26.0 google-cloud-pubsub-0.27.0 google-cloud-resource-manager-0.26.0 google-cloud-runtimeconfig-0.26.0 google-cloud-spanner-0.26.0 google-cloud-speech-0.28.0 google-cloud-storage-1.4.0 google-cloud-translate-1.1.0 google-cloud-videointelligence-0.25.0 google-cloud-vision-0.26.0 google-resumable-media-0.2.3 grpc-google-iam-v1-0.11.3 grpcio-1.6.0 idna-2.6 monotonic-1.3 pandas-0.20.3 proto-google-cloud-datastore-v1-0.90.4 proto-google-cloud-error-reporting-v1beta1-0.15.3 proto-google-cloud-logging-v2-0.91.3 proto-google-cloud-pubsub-v1-0.15.4 proto-google-cloud-spanner-admin-database-v1-0.15.3 proto-google-cloud-spanner-admin-instance-v1-0.15.3 proto-google-cloud-spanner-v1-0.15.3 requests-2.18.4 tenacity-4.4.0 urllib3-1.22
For me it comes from the lib grpc-google-iam-v1.
I downgraded to version 0.11.1 (grpc-google-iam-v1==0.11.1) and it is working fine now.
Note that I use an old version of google-cloud-pubsub (0.27.0 instead of the latest 0.28.3); I cannot upgrade now because the API has changed.
@lukesneeringer Can you confirm that grpc-google-iam-v1 (version 0.11.3 or any other) is still needed?
I _assume_ it is, but not sure. I can try to find out.
Is there an ETA on the fix? I ran into the same error running a training pipeline today. Alternatively, is there a way to modify the ML Engine worker environment when submitting a job?
@selfiebot We haven't actually confirmed it's an issue. It seems like a bad install of protobuf. (PS Are you a person or a bot?)
@dhermes Sorry about that. I commented via an internal thread, and the username surfaced as "selfie bot". :)
I'm having a similar issue to sonlac's, although the error is occurring in Apache Beam. In my case I'm trying to run a Dataflow job, the Apache Beam word count example that can be found here.
Python version: Python 2.7
Google-Cloud version:
Google Cloud SDK 174.0.0
alpha 2017.09.15
app-engine-go
app-engine-python 1.9.61
beta 2017.09.15
bq 2.0.27
core 2017.10.02
datalab 20170818
gcloud
gsutil 4.27
I run the following command:
python -m apache_beam.examples.wordcount  --project unique-test-176123  --runner DataflowRunner  --staging_location gs://unique-test/staging  --temp_location gs://unique-test/temp  --output gs://unique-test/output
When I run it, I get the following stacktrace:
Traceback (most recent call last):
File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/Users/rubinghv/Library/Python/2.7/lib/python/site-packages/apache_beam/examples/wordcount.py", line 126, in <module>
run()
File "/Users/rubinghv/Library/Python/2.7/lib/python/site-packages/apache_beam/examples/wordcount.py", line 105, in run
result = p.run()
File "/Users/rubinghv/Library/Python/2.7/lib/python/site-packages/apache_beam/pipeline.py", line 328, in run
return self.runner.run(self)
File "/Users/rubinghv/Library/Python/2.7/lib/python/site-packages/apache_beam/runners/dataflow/dataflow_runner.py", line 264, in run
super(DataflowRunner, self).run(pipeline)
File "/Users/rubinghv/Library/Python/2.7/lib/python/site-packages/apache_beam/runners/runner.py", line 133, in run
pipeline.visit(RunVisitor(self))
File "/Users/rubinghv/Library/Python/2.7/lib/python/site-packages/apache_beam/pipeline.py", line 353, in visit
self._root_transform().visit(visitor, self, visited)
File "/Users/rubinghv/Library/Python/2.7/lib/python/site-packages/apache_beam/pipeline.py", line 685, in visit
part.visit(visitor, pipeline, visited)
File "/Users/rubinghv/Library/Python/2.7/lib/python/site-packages/apache_beam/pipeline.py", line 688, in visit
visitor.visit_transform(self)
File "/Users/rubinghv/Library/Python/2.7/lib/python/site-packages/apache_beam/runners/runner.py", line 128, in visit_transform
self.runner.run_transform(transform_node)
File "/Users/rubinghv/Library/Python/2.7/lib/python/site-packages/apache_beam/runners/runner.py", line 171, in run_transform
return m(transform_node)
File "/Users/rubinghv/Library/Python/2.7/lib/python/site-packages/apache_beam/runners/dataflow/dataflow_runner.py", line 492, in run_GroupByKey
self.serialize_windowing_strategy(windowing))
File "/Users/rubinghv/Library/Python/2.7/lib/python/site-packages/apache_beam/runners/dataflow/dataflow_runner.py", line 787, in serialize_windowing_strategy
from apache_beam.runners import pipeline_context
File "/Users/rubinghv/Library/Python/2.7/lib/python/site-packages/apache_beam/runners/pipeline_context.py", line 27, in <module>
from apache_beam.portability.api import beam_fn_api_pb2
File "/Users/rubinghv/Library/Python/2.7/lib/python/site-packages/apache_beam/portability/api/beam_fn_api_pb2.py", line 2687, in <module>
_sym_db.RegisterServiceDescriptor(_BEAMFNCONTROL)
AttributeError: 'SymbolDatabase' object has no attribute 'RegisterServiceDescriptor'
I know it's in a different package, but since the error is so rare, I figured it's related.
@jonparrott @waprin Do you know who would be the point person for dataflow?
I seem to recall previous issues where pickle was being used to serialize Python objects across processes, so this might be related?
@rubinghv Do you mind running pip show protobuf from the same environment where the error occurred? (I'd also be interested in import google.protobuf as pb; print(pb.__version__) if they differ.)
@davidcavazos is a new DPE focusing on it, and @amygdala is a DA who tends to know things about it as well, in case either of them can help.
@dhermes
Sure, no worries, and thanks for the quick reply.
pip show protobuf returns:
Name: protobuf
Version: 3.2.0
Summary: Protocol Buffers
Home-page: https://developers.google.com/protocol-buffers/
Author: [email protected]
Author-email: [email protected]
License: New BSD License
Location: /usr/local/lib/python2.7/site-packages
Requires: setuptools, six
When I run import google.protobuf as pb; print(pb.__version__), it prints: 3.2.0
It seems to be a version conflict: that attribute is just not there in protobuf==3.2.0, but the beam_fn_api_pb2 module assumes it is:
In [1]: import google.protobuf
In [2]: google.protobuf.__version__
Out[2]: '3.2.0'
In [3]: import sys
In [4]: sys.version
Out[4]: '2.7.14 (default, Oct 5 2017, 09:30:02) \n[GCC 5.4.0 20160609]'
In [5]: from google.protobuf import symbol_database
In [6]: symbol_database.SymbolDatabase.RegisterServiceDescriptor
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-6-a114c41d11ce> in <module>()
----> 1 symbol_database.SymbolDatabase.RegisterServiceDescriptor
AttributeError: type object 'SymbolDatabase' has no attribute 'RegisterServiceDescriptor'
It is there in the current latest version (which is 3.4.0):
In [1]: import sys
In [2]: sys.version
Out[2]: '2.7.14 (default, Oct 5 2017, 09:30:02) \n[GCC 5.4.0 20160609]'
In [3]: import google.protobuf
In [4]: google.protobuf.__version__
Out[4]: '3.4.0'
In [5]: from google.protobuf import symbol_database
In [6]: symbol_database.SymbolDatabase.RegisterServiceDescriptor
Out[6]: <unbound method SymbolDatabase.RegisterServiceDescriptor>
Since the answer seems to be "upgrade protobuf", I am going to pre-emptively close this.
If I've misunderstood the problem and solution, I am sorry. I am certainly happy to re-open this or continue the discussion if need be.
This issue is still occurring even with protobuf on version 3.4.0
@parthmishra Can you run a few commands?
python -m pip show protobuf
python -c 'from google.protobuf import symbol_database; print(getattr(symbol_database.SymbolDatabase, "RegisterServiceDescriptor", None))'
It may be that you are checking the protobuf version and running the code in different interpreters by mistake
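One way to rule that out is to print the interpreter path and the protobuf version from the same process. This is only a diagnostic sketch (the function name is illustrative); it works whether or not protobuf is installed:

```python
import sys

def interpreter_report():
    """Return (interpreter path, protobuf version or None) for this process."""
    try:
        import google.protobuf as pb
        version = pb.__version__
    except ImportError:
        version = None
    return sys.executable, version

path, version = interpreter_report()
print(path, version)
```

Running this with the exact `python` that executes the failing job tells you which installation the traceback actually comes from, rather than which one your shell happens to resolve.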
@dhermes Sure, here's my output:
python -m pip show protobuf
Name: protobuf
Version: 3.4.0
Summary: Protocol Buffers
Home-page: https://developers.google.com/protocol-buffers/
Author: [email protected]
Author-email: [email protected]
License: 3-Clause BSD License
Location: /usr/local/lib/python2.7/dist-packages
Requires: six, setuptools
and
python -c 'from google.protobuf import symbol_database; print(getattr(symbol_database.SymbolDatabase, "RegisterServiceDescriptor", None))'
<unbound method SymbolDatabase.RegisterServiceDescriptor>
So it seems like the training submission should work, but I still get the error from issue #3967.
So it looks like there's no issue. The original report was:
AttributeError: 'SymbolDatabase' object has no attribute 'RegisterServiceDescriptor'
@dhermes So if I'm still getting that AttributeError on a training submission, there is some other dependency issue at play?
on a training submission
You need to make sure the interpreter that is running there has an up-to-date version of protobuf. Your local interpreter only tells you about itself.
I've come across the same problem Parthmishra described. I use protobuf-3.4.0, and the output of the check is as below:
python -c 'from google.protobuf import symbol_database; print(getattr(symbol_database.SymbolDatabase, "RegisterServiceDescriptor", None))'
But when I run the training, it shows the same error message:
AttributeError: 'SymbolDatabase' object has no attribute 'RegisterServiceDescriptor'
I post a new issue in https://github.com/GoogleCloudPlatform/cloudml-samples/issues/99 to describe the details of my problem.
Is there anything else I can do to solve the problem?
@luckyapplehead I'm not familiar with this service, but the issue is the environment where you "run the training". What is the environment? Can you provide a link or some other context?
I have solved the problem following parthmishra's advice, as follows:
In your setup.py just list the version number afterwards i.e. protobuf==3.4.0 instead of protobuf (or change your requirements.txt if you're using that instead)
Thanks very much:)
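For an ML Engine package, that pin goes in the `install_requires` of `setup.py`. A minimal sketch, assuming a package named `trainer` (the name and version below are illustrative, not from this thread):

```python
# setup.py -- pin protobuf so the ML Engine worker installs a version that
# defines SymbolDatabase.RegisterServiceDescriptor (missing in 3.2.0).
from setuptools import find_packages, setup

REQUIRED_PACKAGES = ['protobuf==3.4.0']  # pin instead of an unversioned 'protobuf'

if __name__ == '__main__':
    setup(
        name='trainer',            # illustrative package name
        version='0.1',
        packages=find_packages(),
        install_requires=REQUIRED_PACKAGES,
    )
```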