The existing tutorial documentation was useful but did not generalize well to EfficientDet. To write this tutorial, I drew from the following resources:
Using the above resources, I wrote a tutorial to train EfficientDet in Google Colab with the TensorFlow 2 Object Detection API.
You can run this tutorial by changing just one line for your custom dataset import (see the sketch below). I hope it helps newcomers to the repository get up and running quickly with TensorFlow 2 for object detection!
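For reference, the one line in question is the dataset download cell. A hypothetical sketch, assuming your dataset is exported as a zip behind a curl link (the URL below is a placeholder, not a real link):

```python
# Swap in your own dataset export link here; the URL is a placeholder.
# Everything downstream of this cell can then run unchanged.
!curl -L "https://example.com/your-dataset-export" > dataset.zip
!unzip -o dataset.zip -d /content/dataset
```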
In the tutorial, I show how to:
Thank you very much, that was really helpful!
Hi there,
Firstly, I wanted to thank you for your notebook; it was very helpful! I tried using it for my dataset, but I am getting a RAM error in Colab. I ran all of your cells step by step and only changed the curl link to my own and the corresponding train/test paths in the config file. Here is the stack trace I am getting:
```
Traceback (most recent call last):
  File "/content/models/research/object_detection/model_main_tf2.py", line 106, in <module>
    tf.compat.v1.app.run()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "/content/models/research/object_detection/model_main_tf2.py", line 103, in main
    use_tpu=FLAGS.use_tpu)
  File "/usr/local/lib/python3.6/dist-packages/object_detection/model_lib_v2.py", line 622, in train_loop
    loss = _dist_train_step(train_input_iter)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/def_function.py", line 580, in __call__
    result = self._call(*args, **kwds)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/def_function.py", line 611, in _call
    return self._stateless_fn(*args, **kwds)  # pylint: disable=not-callable
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py", line 2420, in __call__
    return graph_function._filtered_call(args, kwargs)  # pylint: disable=protected-access
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py", line 1665, in _filtered_call
    self.captured_inputs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py", line 1746, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager))
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py", line 598, in call
    ctx=ctx)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
    inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: 2 root error(s) found.
  (0) Resource exhausted: OOM when allocating tensor with shape[16,49104,4] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[node Loss/Loss/huber_loss/Minimum (defined at /local/lib/python3.6/dist-packages/object_detection/core/losses.py:176) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
	 [[Func/Loss/regularization_loss/write_summary/summary_cond/then/_20/input/_95/_364]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
  (1) Resource exhausted: OOM when allocating tensor with shape[16,49104,4] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[node Loss/Loss/huber_loss/Minimum (defined at /local/lib/python3.6/dist-packages/object_detection/core/losses.py:176) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
0 successful operations.
0 derived errors ignored. [Op:__inference__dist_train_step_87648]
Function call stack:
_dist_train_step -> _dist_train_step
```
Could you give some insight into what could be causing this error? Is it a problem with my Colab environment, or am I missing something in the notebook? Thanks!
@AarnavSawant that means you ran out of VRAM on the GPU. Reducing either the batch size or the image size should help.
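For example, here is a minimal sketch of cutting the batch size in the pipeline config before relaunching training. The config path is an assumption; point it at wherever your notebook writes its pipeline file:

```python
import re

# Hypothetical path: use wherever your notebook writes the pipeline config.
config_path = '/content/models/research/deploy/pipeline_file.config'

with open(config_path) as f:
    config_text = f.read()

# The OOM trace above shows a batch of 16; halving it to 8 (or lower)
# reduces peak GPU memory. Shrinking the fixed_shape_resizer dimensions
# in the same file trades accuracy for memory as well.
config_text = re.sub(r'batch_size: \d+', 'batch_size: 8', config_text)

with open(config_path, 'w') as f:
    f.write(config_text)
```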
Hi @Jacobsolawetz, your tutorial is amazing, thanks a lot! I was wondering, how can I get the COCO evaluation metrics?
@tazu786 thank you - I'm still working on that! The code to "eval continuously" is commented out in there, which should give us the COCO metrics, but I didn't have luck with it at first.
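For anyone who wants to experiment, this is roughly what the continuous-eval invocation looks like, judging from the flags model_main_tf2.py exposes; both paths below are placeholders for your own setup:

```python
# Passing --checkpoint_dir switches model_main_tf2.py from training to
# continuous evaluation, which is what should report the COCO metrics.
!python /content/models/research/object_detection/model_main_tf2.py \
    --pipeline_config_path=/content/models/research/deploy/pipeline_file.config \
    --model_dir=/content/training \
    --checkpoint_dir=/content/training
```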
@Jacobsolawetz yes, I have the same problem. However, training on my custom dataset works regardless. I pulled some qualitative results from the trained model and they are pretty decent. Anyone else luckier than us?
@Jacobsolawetz I tried to apply this suggestion https://github.com/tensorflow/models/issues/8856#issuecomment-664753607 to model_main_tf2.py, but without success. Any news?
@tazu786 no luck yet! Glad to hear the qualitative results at least look good... As soon as I see a fix, I will put it in there.
Any updates from the team on this eval-during-training issue? I'm willing to take a look at the code and try to solve it, but only if it is not already being resolved by someone on the team...
@Jacobsolawetz I tried to run the Colab tutorial, but it throws the error below. Maybe they removed something, as I was able to run the demo successfully before.
```
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/initializers/__init__.py in populate_deserializable_objects()
     83   v2_objs = {}
     84   base_cls = initializers_v2.Initializer
---> 85   generic_utils.populate_dict_with_module_objects(
     86       v2_objs,
     87       [initializers_v2],

AttributeError: module 'tensorflow.python.keras.utils.generic_utils' has no attribute 'populate_dict_with_module_objects'
```
The error happens when I run this:
```python
import matplotlib
import matplotlib.pyplot as plt
import os
import random
import io
import imageio
import glob
import scipy.misc
import numpy as np
from six import BytesIO
from PIL import Image, ImageDraw, ImageFont
from IPython.display import display, Javascript
from IPython.display import Image as IPyImage
import tensorflow as tf
from object_detection.utils import label_map_util
from object_detection.utils import config_util
from object_detection.utils import visualization_utils as viz_utils
from object_detection.utils import colab_utils
from object_detection.builders import model_builder
%matplotlib inline
```
Update:
I solved the problem by updating to the latest TF version with `!pip install -U --pre tensorflow_gpu` and removing the edit to the tf_utils.py file.
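In a fresh cell, that fix looks like this; restart the runtime afterwards so the upgraded package is the one that gets imported:

```python
# Upgrade to the latest pre-release TensorFlow GPU build.
!pip install -U --pre tensorflow_gpu

# After restarting the runtime, confirm the new version is active.
import tensorflow as tf
print(tf.__version__)
```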
Hello @Jacobsolawetz, did you finish the evaluation code for EfficientDet?
Hello, I can't get a custom EfficientDet training run to work due to memory issues. Any help on this? #9141