I ran into a "train_dir is missing" error while trying to train on the Oxford pet dataset on gcloud, following the tutorial. I clearly provided the command-line argument for 'train_dir', and I am able to run it locally just fine.
Attached is the error log from gcloud. Any help would be greatly appreciated!
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/root/.local/lib/python2.7/site-packages/object_detection/train.py", line 198, in train_dir is missing.'
AssertionError: train_dir is missing.
Hi @TomPyonsuke - can you copy your command line in?
Hi! "zinc-guru-3900" is my bucket. Thanks!
gcloud ml-engine jobs submit training `whoami`_object_detection_`date +%s` --job-dir=gs://zinc-guru-3900/train --packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz --module-name object_detection.train --region us-central1 --config object_detection/samples/cloud/cloud.yml -- \ --train_dir=gs://zinc-guru-3900/train --pipeline_config_path=gs://zinc-guru-3900/data/faster_rcnn_resnet101_pets.config

The problem seems to be with replica2
I solved this issue by providing the train directory path as the default value for train_dir. It's still unclear to me why the replicas are not receiving the command-line arguments. Looks like a bug to me?
Thank you!
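For readers asking how such a default can be set: the exact edit was not posted, so this is only a sketch of the idea. It assumes train.py declares train_dir as a string flag whose default is empty, and it uses argparse as a stand-in for tf.app.flags so that it runs without TensorFlow. DEFAULT_TRAIN_DIR and parse_flags are hypothetical names, and the "other" path is made up for illustration.

```python
import argparse

# Hypothetical fallback; use your own bucket or local directory here.
DEFAULT_TRAIN_DIR = "gs://zinc-guru-3900/train"


def parse_flags(argv):
    """Parse a train.py-style command line (argparse stands in for tf.app.flags)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--train_dir", default=DEFAULT_TRAIN_DIR)
    parser.add_argument("--pipeline_config_path", default="")
    # parse_known_args ignores flags that this sketch does not declare.
    flags, _unknown = parser.parse_known_args(argv)
    return flags


# A process started without --train_dir still gets a usable value.
flags = parse_flags([])
assert flags.train_dir, "train_dir is missing."
print(flags.train_dir)

# An explicit flag still overrides the default (made-up path).
flags = parse_flags(["--train_dir=gs://zinc-guru-3900/other"])
print(flags.train_dir)
```

Note that a hard-coded default only hides the symptom; it does not explain why a replica would start without the flag.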
@derekjchow PTAL
@TomPyonsuke thanks for reporting this. Could you help us reproduce it by providing some information, such as your gcloud version (gcloud --version)?
I have the same problem. Have you solved it yet?

I have the same problem. How do you set the default path? Could you copy and paste the command you used to train?
Thanks!
I have the same issue. With the new way of training from the command line, you can't add pipeline_config or train_dir. Can someone elaborate on how to do it? Thank you.
Same here. It seems I can't pass the pipeline_config path from the command line. What can I do?
(I'm using the Anaconda Prompt on Windows 10.) Please help.
Same here, I'm in an anaconda2 env (Ubuntu 16.04). Can anyone help?
@TomPyonsuke how did you set the train directory path as the default for train_dir? I tried adding the path "object_detection/models/model/train" in train.py, but then got the following error:
Traceback (most recent call last):
  File "object_detection/train.py", line 197, in <module>
    tf.app.run()
  File "/home/yhmybzc/anaconda2/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "object_detection/train.py", line 144, in main
    model_config, train_config, input_config = get_configs_from_multiple_files()
  File "object_detection/train.py", line 126, in get_configs_from_multiple_files
    text_format.Merge(f.read(), train_config)
  File "/home/yhmybzc/anaconda2/lib/python2.7/site-packages/tensorflow/python/lib/io/file_io.py", line 125, in read
    pywrap_tensorflow.ReadFromStream(self._read_buf, length, status))
  File "/home/yhmybzc/anaconda2/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/home/yhmybzc/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.FailedPreconditionError: . 
Any suggestions would be appreciated.
@TomPyonsuke "...-- \ --train_dir=gs://..."
This part looks wrong to me. Also, do not leave a space after the = sign: "--train_dir= xxx" fails with the same error. And do not copy-paste commands that have spaces after the final \ on a line (I searched a while for this one).
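The point about a space after = can be checked without running any training. The sketch below uses Python's shlex module to split a command roughly the way a POSIX shell would and reports flags that end up with an empty value. The helper name empty_valued_flags and the paths training/ and training/model.config are hypothetical.

```python
import shlex


def empty_valued_flags(command):
    """Return flags of the form --name= whose value was split off by a space."""
    tokens = shlex.split(command)
    return [tok for tok in tokens if tok.startswith("--") and tok.endswith("=")]


good = "python train.py --train_dir=training/ --pipeline_config_path=training/model.config"
bad = "python train.py --train_dir= training/ --pipeline_config_path= training/model.config"

print(empty_valued_flags(good))  # nothing reported
print(empty_valued_flags(bad))   # both flags reported
```

With the space, the program receives "--train_dir=" and "training/" as two separate arguments, so the flag value is the empty string and the assertion on train_dir fails.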
amrita@amrita-VirtualBox:~/Downloads/models/research/object_detection$ python3 train.py
WARNING:tensorflow:From /home/amrita/.local/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/datasets/base.py:198: retry (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.
Instructions for updating:
Use the retry module or similar alternatives.
Traceback (most recent call last):
  File "train.py", line 147, in <module>
    tf.app.run()
  File "/home/amrita/.local/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 126, in run
    _sys.exit(main(argv))
  File "train.py", line 68, in main
    assert FLAGS.train_dir, 'train_dir is missing.'
AssertionError: train_dir is missing.
I am getting this error. Thanks.
Automatically closing due to lack of recent activity. Please update the issue when new information becomes available, and we will reopen the issue. Thanks!
@TomPyonsuke Bro how did you make the train directory path as the default for train_dir?
(tensorflow) D:\my-work\WiS - alert - 2\models\research\object_detection>python train.py --logtostderr --train_dir= D:/my-work/WiS - alert - 2 /models/research/object_detection/training/ --pipeline_config_path= D:/my-work/WiS - alert - 2 /models/research/object_detection/training/ssd_mobilenet_v1_coco.config
D:\installation\anaconda\envs\tensorflow\lib\site-packages\h5py\__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
  from ._conv import register_converters as _register_converters
Traceback (most recent call last):
  File "train.py", line 164, in <module>
    tf.app.run()
  File "D:\installation\anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\platform\app.py", line 125, in run
    _sys.exit(main(argv))
  File "train.py", line 88, in main
    assert FLAGS.train_dir, 'train_dir is missing.'
AssertionError: train_dir is missing.
Need Solution
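In the command above, each flag is followed by a space after = and the paths themselves contain spaces, so the value never reaches the flag. One way to avoid shell splitting altogether is to pass the arguments as a list with one element per argument. The sketch below is only an illustration: argparse stands in for train.py's flag parsing so that it runs without TensorFlow, and the directory is the one shown in the prompt of the report.

```python
import argparse

# Directory taken from the prompt in the report above; adjust to your setup.
base = "D:/my-work/WiS - alert - 2/models/research/object_detection/training"

# One list element per argument: nothing is split on spaces inside a path.
argv = [
    "--logtostderr",
    "--train_dir=" + base + "/",
    "--pipeline_config_path=" + base + "/ssd_mobilenet_v1_coco.config",
]

# argparse stands in for train.py's flag parsing (an assumption of this sketch).
parser = argparse.ArgumentParser()
parser.add_argument("--logtostderr", action="store_true")
parser.add_argument("--train_dir", default="")
parser.add_argument("--pipeline_config_path", default="")
flags = parser.parse_args(argv)

assert flags.train_dir, "train_dir is missing."
print(flags.train_dir)
```

The same list can be handed to subprocess.run([sys.executable, "train.py"] + argv). When typing the command by hand, the equivalent is to quote each argument and leave no space after =, for example --train_dir="D:/my-work/WiS - alert - 2/models/research/object_detection/training/".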
I had the same problem because I was using copy-paste from some github page. You need to re-type the command letter by letter, and then it works!
worked for me as well
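If retyping a pasted command fixes it, a likely explanation is that the pasted text contained characters that look like plain ASCII but are not, such as an en dash in place of "--" or a no-break space. That is an assumption, not something confirmed in this thread. A small sketch for checking a command before running it follows; the function name and the sample command are hypothetical.

```python
import unicodedata


def suspicious_characters(command):
    """List (index, character, Unicode name) for each non-ASCII or control
    character, other than newline and tab."""
    found = []
    for index, char in enumerate(command):
        if ord(char) > 126 or (ord(char) < 32 and char not in "\n\t"):
            found.append((index, char, unicodedata.name(char, "UNKNOWN")))
    return found


# Hypothetical pasted command: an en dash instead of "--" and a no-break space.
pasted = "python train.py \u2013train_dir=training/\u00a0--logtostderr"
for index, char, name in suspicious_characters(pasted):
    print(index, repr(char), name)
```

An empty result means the command contains only printable ASCII, in which case the cause is more likely spacing or quoting.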
Did you solve this issue?
I have the same problem. Can anyone help me, please?
I went inside the train.py file and saw the following:

Make sure that you type the command exactly like that. In my case, I was not passing in the "train_dir" argument. Hope this helps.