Keras-retinanet: retinanet-evaluate predictions have a random component

Created on 13 Jun 2019  ·  10 Comments  ·  Source: fizyr/keras-retinanet

For this issue, please download the folder test here

https://drive.google.com/open?id=1MTRxPZckR1zBYnIluuMKWpyx7EbnJD48

and edit test_annotations.csv accordingly, if test is not in the root directory.

I run predictions on the six images in the test/images folder, with the following command:

retinanet-evaluate --save-path /outputs/inference_results/ csv /test/annotations/minitest_annotations.csv /test/annotations/classes.csv /test/resnet50_csv_inference_50.h5

The code runs smoothly, but every time I re-run it, I get different mAP values. Some of the results I got so far:

0.6667
0.5556
0.4444

This is very weird. Why is this happening?

All 10 comments

Hmm, very strange. I downloaded your data and ran exactly the same command without editing anything (aside from the paths, as I didn't have them in my root) and I consistently get:

3 instances of class defect with average precision: 0.6667
mAP using the weighted average of precisions among classes: 0.6667
mAP: 0.6667

I've run it at least a dozen times and always get this output. I have a couple of questions:

  • What is the setup you are running this on (hardware, OS, and relevant libraries and their versions: nvidia driver / CUDA / cuDNN / tensorflow / keras)?
  • Did you make any modifications to the code?
  • Did you make any modifications to Keras / Tensorflow? Could you share your ~/.keras/keras.json?
  • Do you get consistent output if you use the CPU? (you can prefix your retinanet-evaluate command with CUDA_VISIBLE_DEVICES=).
  • How and when did you install keras-retinanet? What version do you have installed? Are you sure you are using the retinanet-evaluate that you expect you're using (i.e., does which retinanet-evaluate print a path that makes sense to you)?
  • Is it possible for me to get (limited) access to the system so that I can run some tests?

Hey, thanks for the detailed answer! I won't be able to answer all your questions quickly, but I'll work through them over the next few days. Note that my issue is similar to this one: https://github.com/fizyr/keras-retinanet/issues/1015

I've run it at least a dozen times and always get this output. I have a couple of questions:

A bit more than a couple 😄 I'll try to answer.

What is the setup you are running this on (hardware, OS, and relevant libraries and their versions: nvidia driver / CUDA / cuDNN / tensorflow / keras)?

I run on two different machines: either a big one with 8 GPUs or a MacBook Pro (no usable GPU 😬). The OS is the same because I use Docker. The Dockerfile, however, is different, because I cannot use nvidia-docker on the Mac. I added the Mac Dockerfile and the GPU Dockerfile to the shared folder (after removing private information). I also added the startup.sh and requirements.txt files on which the Dockerfile(s) depend. startup.sh is not very useful; you can do without it. The requirements.txt is more important. I can add the commands I use to build the Docker image and run the Docker container if you need them. They're very tailored to my env (including proxies), but if you really need them, I can "clean" them of any sensitive information and add them.

Did you make any modifications to the code?

Not at all, as you can see the Dockerfile just clones & installs the latest version of keras_retinanet from GitHub (as well as a bunch of other libraries/apt packages, without which I could not get the damn training to work).

Did you make any modifications to Keras / Tensorflow?

None at all.

Could you share your ~/.keras/keras.json?

What? I'll look for it and add it to the Google Drive folder.

Do you get consistent output if you use the CPU? (you can prefix your retinanet-evaluate command with CUDA_VISIBLE_DEVICES=).

Don't have time now, will test this later.

How and when did you install keras-retinanet? What version do you have installed?

See above.

Are you sure you are using the retinanet-evaluate that you expect you're using (i.e., does which retinanet-evaluate print a path that makes sense to you)?

I'm not sure I understand the question. As you can see, the Dockerfile edits the PATH variable so that /root/.local/bin is in the PATH, because that's where retinanet-evaluate gets installed to by the pip3 install --user . command. Does this answer your question?

Is it possible for me to get (limited) access to the system so that I can run some tests?

Not on any company machine. However, thanks to Docker, I may try to build the image on my home computer (Ubuntu), check if the randomness appears there too, and give you access to it. I'm not a sysadmin, so I'll need guidance from you on how to do that. Can I trust you not to hack my pc? 😛

A bit more than a couple 😄 I'll try to answer.

Hehe yeah it turned out to be more than I thought.

I can add the commands I use to build the Docker image and run the Docker container if you need them. They're very tailored to my env (including proxies), but if you really need them, I can "clean" them of any sensitive information and add them.

Yeah if you would, that would help.

What? I'll look for it and add it to the Google Drive folder.

Considering you're using Docker for a fresh container, I don't think this is relevant.

I'm not sure I understand the question. As you can see, the Dockerfile edits the PATH variable so that /root/.local/bin is in the PATH, because that's where retinanet-evaluate gets installed to by the pip3 install --user . command. Does this answer your question?

Sometimes issues arise when keras-retinanet was installed at some point in the past, the repository was then updated, and the user cloned or pulled the latest changes. You then have two versions of keras-retinanet: one installed, one cloned. Depending on how you run the code, you use either the installed or the cloned one.
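One way to check which copy is actually in use is to ask Python where it would import the package from, and to look up the script on the PATH. The following is a minimal sketch, assuming the package imports as keras_retinanet and the script is named retinanet-evaluate; describe_install is a hypothetical helper written for this illustration, not part of keras-retinanet.

```python
import importlib.util
import shutil


def describe_install(module_name, script_name):
    """Report where a module would be imported from and where a script resolves on PATH."""
    spec = importlib.util.find_spec(module_name)  # None if the module is not importable
    return {
        "module_origin": spec.origin if spec is not None else None,
        "script_path": shutil.which(script_name),  # None if the script is not on PATH
    }


# Assumed names: the package imports as keras_retinanet, the script is retinanet-evaluate.
info = describe_install("keras_retinanet", "retinanet-evaluate")
print(info)
```

If the module origin points into a cloned working tree while the script path points into an install location such as /root/.local/bin, two copies are probably present and it is worth checking which one each command uses.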

Not on any company machine. However, thanks to Docker, I may try to build the image on my home computer (Ubuntu), check if the randomness appears there too, and give you access to it. I'm not a sysadmin, so I'll need guidance from you on how to do that.

Sure, no problem, it should be relatively easy. If you install and run sshd (the ssh daemon) and forward a port in your router to port 22 on the system running sshd, then I should be able to access it remotely.

Can I trust you not to hack my pc? 😛

As good as the next guy on the internet ;) I understand if you don't want to do this. You could create a separate account with limited access if you want to be more secure.

Yeah if you would, that would help.

Done. build_public.sh and run_public.sh to respectively build the image and run the container. The run_public.sh command uses interactive mode, thus it opens a bash shell in the running container. Once you're there, execute the command

retinanet-evaluate --save-path /outputs/inference_results/ csv /test/annotations/minitest_annotations.csv /test/annotations/classes.csv /test/resnet50_csv_inference_50.h5

to generate the predictions (note that there was a small typo in the original command, but you surely found out already). I ran the above command 10 times consecutively and here are the mAP results:

0.3333
0.3333
0.3333
0.5000
0.3333
0.4444
0.3333
0.6667
0.5000
0.5556
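A repeatability check like this one can be scripted. The sketch below is hedged: it assumes the final "mAP: <value>" line (the format quoted earlier in this thread) is printed to stdout, and parse_map and run_repeatedly are hypothetical helpers, not part of keras-retinanet. It needs Python 3.7 or newer for capture_output.

```python
import re
import subprocess


def parse_map(output):
    """Extract the value from the last 'mAP: <number>' line of the evaluation output."""
    matches = re.findall(r"^mAP: ([0-9.]+)\s*$", output, flags=re.MULTILINE)
    return float(matches[-1]) if matches else None


def run_repeatedly(command, runs):
    """Run the command several times and collect the parsed mAP of each run."""
    results = []
    for _ in range(runs):
        # Assumption: the mAP line is written to stdout, not stderr.
        completed = subprocess.run(command, capture_output=True, text=True, check=True)
        results.append(parse_map(completed.stdout))
    return results


# Sample output copied from earlier in this thread.
sample = (
    "3 instances of class defect with average precision: 0.6667\n"
    "mAP using the weighted average of precisions among classes: 0.6667\n"
    "mAP: 0.6667\n"
)
print(parse_map(sample))  # 0.6667
```

Calling run_repeatedly with the retinanet-evaluate command split into a list of arguments and runs=10 gives a list of ten values; the evaluation is reproducible when len(set(results)) is 1.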

As good as the next guy on the internet ;) I understand if you don't want to do this. You could create a separate account with limited access if you want to be more secure.

Hmm, let's see how far we can go before having to resort to this. In the meantime, I'll think about it 😉

Hmm this was an interesting one :p

Okay, so I think it is "resolved" in #1042. The problem was that in Python 3.5 and older, list(some_dict.keys()) would give an arbitrary order of the keys. We use this in the CSV generator to get a list of all image filenames.

That in itself is not an issue, but most of the scores from your network were 1. The evaluation runs the network on all images, and then sorts detections according to their score. If a detection is sorted first, but is actually a false detection, then that is penalized a lot more than if that same detection is sorted second. Because a lot of the detections had score 1, the ordering of the detections depended on the order in which they occurred, which depended on the order in which the images were listed in the generator, which was arbitrary in Python 3.5 and older.
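The effect of tied scores can be reproduced in a few lines. This is a simplified sketch, not the exact formula used by keras-retinanet: average_precision here is a hypothetical helper that averages the precision at each true positive over the number of annotations, and the detections are invented for illustration.

```python
def average_precision(detections, num_annotations):
    """Simplified AP: sum of the precision at each true positive, divided by the annotation count.

    detections: list of (score, is_true_positive) in the order they were produced.
    """
    # sorted() is stable, so detections with equal scores keep their input order.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    true_positives = 0
    precision_sum = 0.0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            true_positives += 1
            precision_sum += true_positives / rank
    return precision_sum / num_annotations


# Hypothetical data: three annotations, three detections that all score 1.0.
tp_first = [(1.0, True), (1.0, True), (1.0, False)]
fp_first = [(1.0, False), (1.0, True), (1.0, True)]

print(round(average_precision(tp_first, 3), 4))  # 0.6667
print(round(average_precision(fp_first, 3), 4))  # 0.3889
```

Both lists contain exactly the same detections. Only the order in which the tied detections arrive differs, and that alone changes the result.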

PR #1042 changes the dict to an OrderedDict so that order should be kept, which removes stochastic behavior from the evaluation script. However, if you would shuffle your csv file, you could get a different mAP, even using that PR.
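The idea of keeping the key order can be sketched as follows. This is an illustration only, not the code from the PR; read_image_names and the rows are hypothetical.

```python
from collections import OrderedDict


def read_image_names(rows):
    """Group annotation rows by image, remembering the order in which images first appear."""
    image_data = OrderedDict()
    for image_name, annotation in rows:
        image_data.setdefault(image_name, []).append(annotation)
    # The key order follows the row order on every Python version.
    return list(image_data.keys())


# Hypothetical (image, class) rows, as they might appear in an annotations CSV.
rows = [("b.jpg", "defect"), ("a.jpg", "defect"), ("b.jpg", "defect"), ("c.jpg", "defect")]
print(read_image_names(rows))  # ['b.jpg', 'a.jpg', 'c.jpg']
```

The result depends only on the row order, so repeated runs on the same file agree, while a shuffled file gives a different image order. From Python 3.7 onward a plain dict guarantees insertion order as well.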

@AndreaPi please test PR #1042 to see if it makes a difference for you.

Thanks! I'll test the PR during the weekend and let you know.

@hgaiser I'm trying to test the PR but I'm having some difficulties 😅 As you know, I run keras-retinanet in a Docker container. The Dockerfile is:

COPY requirements.txt /tmp/requirements.txt

COPY startup.sh /

WORKDIR /

RUN apt-get -y update -o Acquire::https::Verify-Peer=false && apt-get -y install \
    libglib2.0-0 \
    libsm6 \
    libxext6 \
    libxrender1 \
    git

RUN pip3 install -r tmp/requirements.txt

RUN git clone https://github.com/fizyr/keras-retinanet.git

ENV PATH="${PATH}:/root/.local/bin"

RUN cd keras-retinanet && \
    pip3 install --user .

WORKDIR /

RUN pip3 install --user git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI

CMD /startup.sh

I build the image with build_public.sh and run the container with run_public.sh. However, the version of keras-retinanet which gets installed is the master branch. Instead, I need to test this branch:

https://github.com/fizyr/keras-retinanet/tree/ordered-dict

right? I need to modify the Dockerfile in order to actually install and use that branch instead of the master branch. I think I should change this line

RUN git clone https://github.com/fizyr/keras-retinanet.git

to

RUN git clone --branch ordered-dict --single-branch https://github.com/fizyr/keras-retinanet.git keras-retinanet

I'll try and let you know if it works. In the meantime, if you have suggestions/comments, please let me know.

Or:

git clone -b ordered-dict https://github.com/fizyr/keras-retinanet.git

Perfect! I got the same result 10 times out of 10 now. I'm closing the issue. Thank you!

Alright, then I'm merging the PR. Thanks for testing! (and the detailed issue :))
