I followed the directions to run this tutorial on Google Cloud ML:
https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/running_pets.md
Setup went fine on my Mac, and I uploaded the files to GCP. I started the training and evaluation jobs as per the directions. After TF starts, the job terminates with a Python error: the matplotlib.pyplot module is missing on the GCP side.
s-replica-1
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/root/.local/lib/python2.7/site-packages/object_detection/train.py", line 51, in
/CC @jesu9
I have the same error. Have you solved it? @krystynak
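I got it working with two changes.
In setup.py:
REQUIRED_PACKAGES = ['Pillow>=1.0', 'matplotlib']
In object_detection/utils/visualization_utils.py, comment out these lines:
import matplotlib.pyplot as plt
def add_cdf_image_summary(values, name):
and the lines below that.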
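A possible alternative to commenting the import out is to guard it, so the module still loads on workers where matplotlib is missing. The following is only a sketch of that general pattern, not the fix described above and not code from the object_detection package; the helper names try_import and plotting_available are hypothetical.

```python
import importlib


def try_import(module_name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


# Hypothetical guard: plt stays None on workers where matplotlib is missing.
matplotlib = try_import("matplotlib")
if matplotlib is not None:
    # Select a non-interactive backend, since a cloud worker has no display.
    matplotlib.use("Agg")
    plt = try_import("matplotlib.pyplot")
else:
    plt = None


def plotting_available():
    """True when matplotlib.pyplot was imported successfully."""
    return plt is not None
```

Code that draws plots would then check plotting_available() first and skip the plot when it returns False, instead of failing at import time.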
This error was also reported as issue 2739
https://github.com/tensorflow/models/issues/2739
A solution was posted there that worked for some users, but not for everyone (myself included).
Hi, did you figure this out? I have the same problem when training in the cloud. Is it related to the Python version? I can train locally on my computer, where I use Python 3.6 in Anaconda.
I'm training an SSD using the fix that I referenced, except that I use TF runtime 1.2 instead of 1.4, with Python 2.7. It is working for me. I'm not exactly sure which methods contained in the fix require runtime 1.4, but evidently some of them do.
@davidblumntcgeo So you fixed the code in setup.py following the suggestion in #2739? And it works when the runtime is 1.2? By the way, may I ask whether you are using Windows or macOS?
@davidblumntcgeo @BrownTian
Have you found an exact solution to this issue? I've tried the fix referenced in issue #2739 many times and it doesn't work. I deployed the training in the cloud and used TF 1.4.
The replica ps 0 exited with a non-zero status of 1. Termination reason: Error.
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/root/.local/lib/python2.7/site-packages/object_detection/train.py", line 51, in
Please let us know if anyone has found a solution to this problem.
Hi There,
We are checking to see if you still need help with this, as this seems to be a considerably old issue. Please update this issue with the latest information, a code snippet to reproduce your issue, and the error you are seeing.
If we don't hear from you in the next 7 days, this issue will be closed automatically. If you don't need help on this issue any more, please consider closing this.
Most helpful comment
I got it working with two changes.
In setup.py:
REQUIRED_PACKAGES = ['Pillow>=1.0', 'matplotlib']
In object_detection/utils/visualization_utils.py, comment out these lines:
import matplotlib.pyplot as plt
def add_cdf_image_summary(values, name):
and the lines below that.