Models: Error running Oxford Pets Tutorial on Google Cloud ML Engine

Created on 1 Feb 2018 · 11 Comments · Source: tensorflow/models

I followed the directions to run this tutorial on Google Cloud ML Engine:
https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/running_pets.md

Setup went fine on my Mac, and I uploaded the files to GCP. I then started the training and evaluation jobs as directed. After TensorFlow starts, the job terminates with a Python error: the matplotlib.pyplot module is missing on the GCP side.

s-replica-1
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/root/.local/lib/python2.7/site-packages/object_detection/train.py", line 51, in <module>
    from object_detection.builders import model_builder
  File "/root/.local/lib/python2.7/site-packages/object_detection/builders/model_builder.py", line 29, in <module>
    from object_detection.meta_architectures import ssd_meta_arch
  File "/root/.local/lib/python2.7/site-packages/object_detection/meta_architectures/ssd_meta_arch.py", line 31, in <module>
    from object_detection.utils import visualization_utils
  File "/root/.local/lib/python2.7/site-packages/object_detection/utils/visualization_utils.py", line 24, in <module>
    import matplotlib.pyplot as plt
ImportError: No module named matplotlib.pyplot

awaiting model gardener

krystynak

All 11 comments

/CC @jesu9

reedwm on 1 Feb 2018

I have the same error. Have you solved it? @krystynak

XIONGJIECHENG on 10 Feb 2018

I got it working with the following changes.

In setup.py:
REQUIRED_PACKAGES = ['Pillow>=1.0','matplotlib']

In object_detection/utils/visualization_utils.py, comment out these lines:

import matplotlib.pyplot as plt

def add_cdf_image_summary(values, name):

and everything below that.

vahvarh on 14 Feb 2018

👍3
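
The comment above removes the matplotlib import by commenting code out. A different technique, not the one the commenter used, is to guard the import so the module still loads when the package is absent. The sketch below is only an illustration under that assumption: `optional_import` and `require` are hypothetical helper names, and the usage shown in the trailing comment is not code from the object_detection package.

```python
import importlib


def optional_import(module_name):
    """Return the imported module, or None if it cannot be imported."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # Covers both a missing top-level package and a missing submodule.
        return None


def require(module, feature):
    """Return the module, or raise a clear error if it was not importable."""
    if module is None:
        raise RuntimeError(feature + ' needs a package that is not installed')
    return module


# Hypothetical usage inside a module such as visualization_utils.py:
#   plt = optional_import('matplotlib.pyplot')
#
#   def add_cdf_image_summary(values, name):
#       require(plt, 'add_cdf_image_summary')
#       ...
```

With this pattern the import chain shown in the traceback would no longer fail at import time; only a call to a plotting function would raise. Adding matplotlib to REQUIRED_PACKAGES remains the more direct fix.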

This error was also reported as issue 2739
https://github.com/tensorflow/models/issues/2739

A solution was posted that worked for some users, but not everyone (including myself).

davidblumntcgeo on 6 Mar 2018

Hi, did you figure this out? I have the same problem when training in the cloud. Is it related to the Python version? I can train locally on my computer, where I use Python 3.6 in Anaconda.

BrownTian on 10 Mar 2018

I'm training an SSD using the fix that I referenced, except that I use TF runtime 1.2 instead of 1.4, with Python 2.7. It is working for me. I'm not exactly sure which methods contained in the fix require runtime 1.4, but evidently some of them do.

davidblumntcgeo on 10 Mar 2018
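
The runtime version mentioned above is chosen when the job is submitted. As a rough sketch, the submit command could be assembled like this. The job name, bucket, region, and package paths are placeholders assumed for illustration and do not come from this thread; only the module name `object_detection.train` appears in the tracebacks.

```python
def build_submit_command(job_name, bucket, packages, runtime_version='1.2',
                         region='us-central1'):
    """Assemble a `gcloud ml-engine jobs submit training` argument list.

    All argument values are supplied by the caller; the defaults here are
    illustrative assumptions, not settings confirmed by the thread.
    """
    return [
        'gcloud', 'ml-engine', 'jobs', 'submit', 'training', job_name,
        '--runtime-version', runtime_version,
        '--job-dir', 'gs://{}/train'.format(bucket),
        '--packages', ','.join(packages),
        '--module-name', 'object_detection.train',
        '--region', region,
    ]


# Example with placeholder values; pass the result to subprocess.check_call
# to actually submit a job.
command = build_submit_command(
    job_name='pets_object_detection_demo',
    bucket='your-gcs-bucket',
    packages=['dist/object_detection-0.1.tar.gz', 'slim/dist/slim-0.1.tar.gz'],
)
print(' '.join(command))
```

Building the list in one place makes it easy to rerun the same job with `runtime_version='1.4'` and compare the results.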

@davidblumntcgeo So you fixed the code in setup.py following the suggestion in #2739, and it works with runtime 1.2? By the way, may I ask whether you are using Windows or macOS?

BrownTian on 10 Mar 2018

@davidblumntcgeo @BrownTian
Have you found an exact solution to this issue? I have tried the fix referenced in issue #2739 many times and it doesn't work. I deployed the training in the cloud and used TF 1.4.

sainttelant on 15 Mar 2018

The replica ps 0 exited with a non-zero status of 1. Termination reason: Error.
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/root/.local/lib/python2.7/site-packages/object_detection/train.py", line 51, in <module>
    from object_detection.builders import model_builder
  File "/root/.local/lib/python2.7/site-packages/object_detection/builders/model_builder.py", line 29, in <module>
    from object_detection.meta_architectures import ssd_meta_arch
  File "/root/.local/lib/python2.7/site-packages/object_detection/meta_architectures/ssd_meta_arch.py", line 32, in <module>
    from object_detection.utils import visualization_utils
  File "/root/.local/lib/python2.7/site-packages/object_detection/utils/visualization_utils.py", line 25, in <module>
    import matplotlib; matplotlib.use('Agg')  # pylint: disable=multiple-statements
ImportError: No module named matplotlib

The replica ps 1 exited with a non-zero status of 1. Termination reason: Error.
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/root/.local/lib/python2.7/site-packages/object_detection/train.py", line 51, in <module>
    from object_detection.builders import model_builder
  File "/root/.local/lib/python2.7/site-packages/object_detection/builders/model_builder.py", line 29, in <module>
    from object_detection.meta_architectures import ssd_meta_arch
  File "/root/.local/lib/python2.7/site-packages/object_detection/meta_architectures/ssd_meta_arch.py", line 32, in <module>
    from object_detection.utils import visualization_utils
  File "/root/.local/lib/python2.7/site-packages/object_detection/utils/visualization_utils.py", line 25, in <module>
    import matplotlib; matplotlib.use('Agg')  # pylint: disable=multiple-statements
ImportError: No module named matplotlib

The replica ps 2 exited with a non-zero status of 1. Termination reason: Error.
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/root/.local/lib/python2.7/site-packages/object_detection/train.py", line 51, in <module>
    from object_detection.builders import model_builder
  File "/root/.local/lib/python2.7/site-packages/object_detection/builders/model_builder.py", line 29, in <module>
    from object_detection.meta_architectures import ssd_meta_arch
  File "/root/.local/lib/python2.7/site-packages/object_detection/meta_architectures/ssd_meta_arch.py", line 32, in <module>
    from object_detection.utils import visualization_utils
  File "/root/.local/lib/python2.7/site-packages/object_detection/utils/visualization_utils.py", line 25, in <module>
    import matplotlib; matplotlib.use('Agg')  # pylint: disable=multiple-statements
ImportError: No module named matplotlib

To find out more about why your job exited please check the logs: https://console.cloud.google.com/logs/viewer?project=741093659524&resource=ml_job%2Fjob_id%2Fgoldedseito_shaka_object_detection_1521114946&advancedFilter=resource.type%3D%22ml_job%22%0Aresource.labels.job_id%3D%22goldedseito_shaka_object_detection_1521114946%22

sainttelant on 15 Mar 2018
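
The thread does not establish why the workers in the log above still lack matplotlib. One check that might help narrow it down, offered as an assumption rather than a diagnosis, is to confirm that the uploaded source package actually records matplotlib as a requirement (for example, that the tarball was rebuilt after setup.py was edited). The sketch below reads the `requires.txt` file that setuptools writes into an sdist's egg-info directory; the function names and the example path are hypothetical.

```python
import tarfile


def sdist_requirements(sdist_path):
    """Return the non-empty lines of the sdist's egg-info requires.txt.

    Returns an empty list when the archive has no such file, which is what
    an sdist built without install_requires looks like.
    """
    with tarfile.open(sdist_path, 'r:gz') as archive:
        for member in archive.getmembers():
            if member.name.endswith('.egg-info/requires.txt'):
                data = archive.extractfile(member).read().decode('utf-8')
                return [line.strip() for line in data.splitlines()
                        if line.strip()]
    return []


def declares(sdist_path, package):
    """Rough check: does any recorded requirement start with the name?"""
    wanted = package.lower()
    return any(req.lower().startswith(wanted)
               for req in sdist_requirements(sdist_path))


# Hypothetical usage against a locally built package:
#   declares('dist/object_detection-0.1.tar.gz', 'matplotlib')
```

The prefix match is deliberately simple, so it would also accept a differently named package that merely begins with the same string.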

Has anyone found a solution to this problem? Please post it here if so.

thewozz on 28 Jun 2018

Hi There,
We are checking to see if you still need help on this, as this seems to be a considerably old issue. Please update this issue with the latest information, a code snippet to reproduce your issue, and the error you are seeing.
If we don't hear from you in the next 7 days, this issue will be closed automatically. If you don't need help on this issue any more, please consider closing this.