Yolov5: Folder deletion in detect.py

Created on 7 Aug 2020  路  13Comments  路  Source: ultralytics/yolov5

鉂擰uestion

Hello!
When using detect.py on a video file, I ran into a questionable design choice. When launching detect.py on a video file in the same folder as the destination folder, the video folder gets wiped clean. Wouldn't checking if the output file exists, then renaming the new one be better than just wiping the entire folder? Or just... not deleting everything in it?
Obviously this can be solved by just creating a temporary folder for the output file, but someone might irreversibly delete some data, which would cause plenty of headaches.

Additional context

Used command:
python3 detect.py --weights exp9/weights/best.pt --source Videos/Hockey/video.mp4 --output Videos/Hockey --img-size 800 --view-img

Stale question

All 13 comments

Hello @MiiaBestLamia, thank you for your interest in our work! Please visit our Custom Training Tutorial to get started, and see our Jupyter Notebook Open In Colab, Docker Image, and Google Cloud Quickstart Guide for example environments.

If this is a bug report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you.

If this is a custom model or data training question, please note Ultralytics does not provide free personal support. As a leader in vision ML and AI, we do offer professional consulting, from simple expert advice up to delivery of fully customized, end-to-end production solutions for our clients, such as:

  • Cloud-based AI systems operating on hundreds of HD video streams in realtime.
  • Edge AI integrated into custom iOS and Android apps for realtime 30 FPS video inference.
  • Custom data training, hyperparameter evolution, and model exportation to any destination.

For more information please visit https://www.ultralytics.com.

@MiiaBestLamia thanks for the feedback! What would be your expected best detect.py workflow?

In your command I see the input and output directories are also equal, which will incur an overwrite of the original file even if it's contents are not cleared first, as inference image results are saved under the same filenames.

Screen Shot 2020-08-08 at 10 57 50 AM

@glenn-jocher there are tons of ways to "fix" this, even though it's not really a bug, but something that might create problems due to oversights. The easiest way is to probably:
a)create a sub-folder in which the processed info is stored, without the need to wipe anything.
b)create the inference file with a slightly different name, adding "..._processed" to the file-name
c)have the inference results by default get saved to a directory in the project folder
It's just a design choice that matters very little in the hands of an experienced user, but someone might not pay attention, store a file without a backup in the output folder and accidentally delete it, which might happen fairly often.
Hope i helped improve this project even a little, love what you and your team and everyone who offers improvements are doing, keep up the great work!

@MiiaBestLamia hmm, actually a couple months ago we transitioned _training_ from a same-directory method to a more best-practices method which increments run directories every time a new training starts. This prevents old results from being erased, and is the default tensorboard behavior which we inherited it from. Maybe this is the best implementation on the inference side as well? The output folder would create subfolders, and every additional run of detect.py would create an additional subfolder. This would also solve the 'same source and output directories' issue, killing two birds with one stone.

yolov5
  /inference
     /source
     /output
        /exp0
        /exp1
        /exp2

@glenn-jocher that sounds marginally better than the current version, in my opinion. The new feature of each training "session" being stored in it's own directory in the runs folder was definitely a welcome change, being able to see some images of the dataset, the results and all of the used settings makes training much more convenient, as you can much more easily track what was changed, what are the results and so on.
I for one would welcome the addition of creating subfolders on detect.py, would make it more convenient, and you would be able to see the past results any time, possibly even seeing which weights/configs were used, what augmentation was used and so on.

@MiiaBestLamia sounds good, I think we are on the same page. I've added a TODO label to the issue, so not sure when we will get around to it, but it's on the list now.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@MiiaBestLamia we are working on resolving this issue finally, as a similar issue #1297 was brought up with test.py, which likewise deletes previous results by default.

Our new idea is to handle train.py, test.py and detect.py uniformly the same way that we handle training runs now, by incrementing a --save-dir on each new run.

  • train.py data will be saved to runs/exp0, runs/exp1 (current default)
  • test.py directories will be runs/test0, runs/test1, etc.
  • detect.py directories will be runs/detect0, runs/detect1, etc.
  • no data is deleted
  • changing --save-dir from the default will prevent incrementing behavior. i.e. --save-dir new_dir will create a new dir if it does not exist, but if it already exists it will not alter it's contents (other than to overwrite any same-name files already present).

I'd like to get your feedback on this before we move forward. What do you think?

@glenn-jocher
In my opinion this is a great solution for this issue, although the fact that testing gets saved to the same directory as actual training and detection might prove to be slightly annoying, since testing might be run hundreds of times, while training only gets run a couple of times. Making test runs get saved in "tests" and detection runs get saved in "detects" would make it more understandable, but also make it harder to navigate the folders, therefore I suppose that the best solution would indeed be the one you proposed.

@MiiaBestLamia yes I see your point. The directory count may grow significantly especially after long term use. How about this:

runs/
  train/
    0/
    1/
    2/
    ...etc
  test/
    0/
    1/
    2/
    ...etc
  detect/
    0/
    1/
    2/
    ...etc

@glenn-jocher yeah, that would probably be the best way to solve resolve this - this makes it easier to locate all of the necessary runs, and sorts them in it's own way.

@Borda (and anyone else) I could use your feedback on this. I was thinking of centralizing all output in the runs/ directory, and logging test.py and detect.py outputs with incrementing directories (like train.py logging), to fix data deletion issues (#663, #1297).

# Current system ----------
yolov5/
    runs/
        exp0/  # train.py outputs increment
        exp1_name/ # with python train.py --name (optional)
        ...
        test/  # test.py outputs (WARNING directory DELETED on each new test.py run)
    inference/
        output/  # detect.py outputs (WARNING directory DELETED on each new detect.py run)


# Proposed system ----------
yolov5/
    runs/
        train/  # train.py outputs increment (WARNING: naming convention change)
            0/
            1_name/  # with python train.py --name (optional)
            ...
        test/  # test.py outputs increment
            0/
            1_name/  # with python test.py --name (optional)
            ...
        detect/  # detect.py outputs increment
            0/
            1_name/  # with python detect.py --name (optional)
            ...

It would make sense to handle the outputs uniformly among the 3, but this would be a breaking change to the existing train.py logging. Alternatively we could simply leave train.py logging alone, and implement this change for test.py and detect.py only. What do you think? Any feedback is appreciated.

@MiiaBestLamia these changes are now been pushed to master with PR #1322. Thank you for your feedback!

Was this page helpful?
0 / 5 - 0 ratings

Related issues

jaqub-manuel picture jaqub-manuel  路  4Comments

KangHoyong picture KangHoyong  路  3Comments

DucTaiVu picture DucTaiVu  路  3Comments

FSNStefan picture FSNStefan  路  4Comments

milind-soni picture milind-soni  路  3Comments