I am trying to train on my own dataset. Setup went fine, and I ran this command:
python keras_retinanet/bin/train.py csv C:/Users/Administrator/Desktop/keras-retinanet/Part_A_labels.csv C:/Users/Administrator/Desktop/keras-retinanet/class_mapping.csv
However, I get some errors; the full output is below. Should I try running on Ubuntu, or try another version of TensorFlow (currently 1.12.0)? Can anyone help me? Thanks a lot! @hgaiser
(clw_env_py36) C:\Users\Administrator>cd C:\Users\Administrator\Desktop\keras-retinanet
(clw_env_py36) C:\Users\Administrator\Desktop\keras-retinanet>python keras_retinanet/bin/train.py csv C:/Users/Administrator/Desktop/keras-retinanet/xxx_labels.csv C:/Users/Administrator/Desktop/keras-retinanet/class_mapping.csv
Using TensorFlow backend.
2019-01-03 00:15:52.491959: I C:\users\nwani\_bazel_nwani\ujdkfsks\execroot\org_tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2019-01-03 00:15:52.701690: I C:\users\nwani\_bazel_nwani\ujdkfsks\execroot\org_tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1392] Found device 0 with properties:
name: GeForce RTX 2080 major: 7 minor: 5 memoryClockRate(GHz): 1.8
pciBusID: 0000:65:00.0
totalMemory: 8.00GiB freeMemory: 6.53GiB
2019-01-03 00:15:52.705641: I C:\users\nwani\_bazel_nwani\ujdkfsks\execroot\org_tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1471] Adding visible gpu devices: 0
2019-01-03 00:15:53.438635: I C:\users\nwani\_bazel_nwani\ujdkfsks\execroot\org_tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-01-03 00:15:53.441423: I C:\users\nwani\_bazel_nwani\ujdkfsks\execroot\org_tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:958] 0
2019-01-03 00:15:53.443366: I C:\users\nwani\_bazel_nwani\ujdkfsks\execroot\org_tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:971] 0: N
2019-01-03 00:15:53.445381: I C:\users\nwani\_bazel_nwani\ujdkfsks\execroot\org_tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6278 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080, pci bus id: 0000:65:00.0, compute capability: 7.5)
Creating model, this may take a second...
Layer (type) Output Shape Param # Connected to
input_1 (InputLayer) (None, None, None, 3 0
....................
Total params: 36,382,957
Trainable params: 36,276,717
Non-trainable params: 106,240
__________________________________________________________________________________________________
None
Epoch 1/50
Exception in thread Thread-2:
Traceback (most recent call last):
File "C:\Users\Administrator\Anaconda3\envs\clw_env_py36\lib\threading.py", line 916, in _bootstrap_inner
self.run()
File "C:\Users\Administrator\Anaconda3\envs\clw_env_py36\lib\threading.py", line 864, in run
self._target(*self._args, **self._kwargs)
File "C:\Users\Administrator\Anaconda3\envs\clw_env_py36\lib\site-packages\keras\utils\data_utils.py", line 565, in _run
with closing(self.executor_fn(_SHARED_SEQUENCES)) as executor:
File "C:\Users\Administrator\Anaconda3\envs\clw_env_py36\lib\site-packages\keras\utils\data_utils.py", line 548, in <lambda>
initargs=(seqs,))
File "C:\Users\Administrator\Anaconda3\envs\clw_env_py36\lib\multiprocessing\context.py", line 119, in Pool
context=self.get_context())
File "C:\Users\Administrator\Anaconda3\envs\clw_env_py36\lib\multiprocessing\pool.py", line 175, in __init__
self._repopulate_pool()
File "C:\Users\Administrator\Anaconda3\envs\clw_env_py36\lib\multiprocessing\pool.py", line 236, in _repopulate_pool
self._wrap_exception)
File "C:\Users\Administrator\Anaconda3\envs\clw_env_py36\lib\multiprocessing\pool.py", line 255, in _repopulate_pool_static
w.start()
File "C:\Users\Administrator\Anaconda3\envs\clw_env_py36\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "C:\Users\Administrator\Anaconda3\envs\clw_env_py36\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "C:\Users\Administrator\Anaconda3\envs\clw_env_py36\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
reduction.dump(process_obj, to_child)
File "C:\Users\Administrator\Anaconda3\envs\clw_env_py36\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
TypeError: can't pickle generator objects
Using TensorFlow backend.
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\Administrator\Anaconda3\envs\clw_env_py36\lib\multiprocessing\spawn.py", line 105, in spawn_main
exitcode = _main(fd)
File "C:\Users\Administrator\Anaconda3\envs\clw_env_py36\lib\multiprocessing\spawn.py", line 114, in _main
prepare(preparation_data)
File "C:\Users\Administrator\Anaconda3\envs\clw_env_py36\lib\multiprocessing\spawn.py", line 225, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "C:\Users\Administrator\Anaconda3\envs\clw_env_py36\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
run_name="__mp_main__")
File "C:\Users\Administrator\Anaconda3\envs\clw_env_py36\lib\runpy.py", line 263, in run_path
pkg_name=pkg_name, script_name=fname)
File "C:\Users\Administrator\Anaconda3\envs\clw_env_py36\lib\runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "C:\Users\Administrator\Anaconda3\envs\clw_env_py36\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\Users\Administrator\Desktop\keras-retinanet\keras_retinanet\bin\train.py", line 35, in <module>
from .. import layers # noqa: F401
ImportError: attempted relative import with no known parent package
I am facing this issue as well. It looks like a Windows-related issue, or the code is not fully portable to Windows.
See this question: https://stackoverflow.com/questions/42041000/error-in-use-of-python-multiprocessing-module-with-generator-function
I haven't been able to solve this yet.
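For context, here is a minimal sketch of what the first traceback seems to be reporting. On Windows, multiprocessing starts workers with the "spawn" method, which pickles the objects it hands to each child process, and generator objects cannot be pickled. The `batches` generator and the `can_pickle` helper below are hypothetical stand-ins, not code from keras-retinanet.

```python
import pickle


def batches():
    """Hypothetical stand-in for a data generator that yields training batches."""
    for i in range(3):
        yield i


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False if pickling raises TypeError."""
    try:
        pickle.dumps(obj)
        return True
    except TypeError:
        return False


gen = batches()
generator_ok = can_pickle(gen)    # a generator object is rejected by pickle
list_ok = can_pickle([0, 1, 2])   # plain data pickles without trouble
print(generator_ok, list_ok)
```

The second traceback in the log looks like a follow-on effect of the same start method: the spawned child re-runs train.py as `__mp_main__`, outside its package, so the relative import `from .. import layers` fails.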
Okay, I was able to solve this by turning off multiprocessing. You can do this by passing the "--workers=0" parameter, or by changing the default directly in train.py, in the line "parser.add_argument('--workers',
help='Number of multiprocessing workers. To disable multiprocessing, set workers to 0',
type=int, default=0)"
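As a rough illustration of the command-line route, the sketch below builds a small argparse parser shaped like the one described above and parses a command line with "--workers=0". Only the `--workers` option and its help text come from this thread. The `csv` subcommand layout, the `annotations` and `classes` argument names, and the nonzero default of 1 are assumptions made for the example, and this is not the actual parser from train.py.

```python
import argparse


def build_parser():
    """Hypothetical stand-in for a train.py-style parser; not the real keras-retinanet code."""
    parser = argparse.ArgumentParser(description="sketch of a training CLI")

    # Assumed layout: the dataset type is a subcommand taking two CSV paths.
    subparsers = parser.add_subparsers(dest="dataset_type")
    csv_parser = subparsers.add_parser("csv")
    csv_parser.add_argument("annotations")
    csv_parser.add_argument("classes")

    # Option quoted in the thread; default=1 is an assumed "multiprocessing on" value.
    parser.add_argument(
        "--workers",
        help="Number of multiprocessing workers. To disable multiprocessing, set workers to 0",
        type=int,
        default=1,
    )
    return parser


# Top-level options are placed before the subcommand in this sketch.
args = build_parser().parse_args(
    ["--workers=0", "csv", "labels.csv", "class_mapping.csv"]
)
print(args.workers, args.dataset_type)
```

If the real script is laid out the same way, the equivalent invocation would put "--workers=0" before "csv" on the command line. Editing the default in train.py has the same effect without the extra flag.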
Hello @sourcedexter, I have now tried Ubuntu 16.04. However, after 'Epoch 1' has been shown on the screen for a while, I get the error "Segmentation fault". I use Anaconda and have tried many versions of Python, but it still does not work. I don't know
Great! I'll try it later, thank you brother!
It works. Thank you so much!!
That might be because some of the dependencies were installed incorrectly.
@sourcedexter I think you are right, but I don't know how to find which package was installed incorrectly, so I will switch back to Windows.
I have another question. I recently read the book 'Deep Learning with Python', in which the creator of Keras advises users to avoid Windows, so I suspect there are more strange problems on Windows. You seem very experienced; why do you use Windows instead of Linux?
@clw5180, I do still use Linux. However, my work laptop runs Windows, so I had to use that. I will be running this on a Linux server, though.
cool!!! solved!!!