Deeplearning4j: Issues with ImageRecordReaders and Multiple Subfolders

Created on 5 May 2016 · 19 Comments · Source: eclipse/deeplearning4j

Right now I am trying to read in 18 different classes with 60 images per class. The files are in subfolders (each subfolder is named after its class), and all of the subfolders sit in a main "training" folder. When I use my neural network with ImageRecordReader, only one class is ever predicted, no matter how I tune the net or what data I use. I suspect the file structure may be the issue, with ImageRecordReader not recognizing it or not handling it properly.

I set up the method to read in my images using the Image Data Pipeline tutorial page:
http://deeplearning4j.org/image-data-pipeline

When ImageRecordReader is attempting to read in the images, I think there may be one of two issues - it is either not reading in the images correctly (due to not accessing each individual subfolder) or it is not assigning the class labels to each image.
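One way to rule out the first possibility independently of ImageRecordReader is to count the files in each class subfolder with plain Java. This is only a minimal sketch: the class name `CountImagesPerClass` and the default folder name "training" are assumptions for illustration, and it uses only `java.nio.file`, no DL4J or Canova calls.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class CountImagesPerClass {

    /**
     * Maps each immediate subfolder name (treated as the class label)
     * to the number of regular files directly inside it.
     */
    static Map<String, Long> countPerClass(Path root) throws IOException {
        List<Path> classDirs;
        try (Stream<Path> entries = Files.list(root)) {
            classDirs = entries.filter(Files::isDirectory).collect(Collectors.toList());
        }
        Map<String, Long> counts = new TreeMap<>();
        for (Path dir : classDirs) {
            try (Stream<Path> files = Files.list(dir)) {
                counts.put(dir.getFileName().toString(),
                           files.filter(Files::isRegularFile).count());
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // "training" is a placeholder path; pass the real folder as the first argument.
        Path root = Paths.get(args.length > 0 ? args[0] : "training");
        if (!Files.isDirectory(root)) {
            System.out.println("Not a directory: " + root);
            return;
        }
        countPerClass(root).forEach((label, n) -> System.out.println(label + ": " + n));
    }
}
```

For the layout described above, this should list 18 labels with 60 files each. If it does, the files are laid out as intended and the question narrows to how labels get attached to them.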

The link to a gist and my attached training data are below:

Gist

training.zip

All 19 comments

FYI, this is the kind of thing I'll be looking into as a priority over the next few days for https://github.com/deeplearning4j/Canova/issues/135...

I'm a newbie with DL4J, but I can help if anything is needed.

@saudet Awesome! Let me know if there is anything I can do to help.

Also, do you know when this issue might be resolved? Or is there a workaround I could use in the meantime to get it working? I'm trying to submit a research paper to a conference by May 13th, so if I could get this set up soon, that would be great.

Have you already tested with my class?

@WikiDreams One of my other team members was testing your code. I'll ask him to see how it went.

If you need anything, just ask ;)

@ChrisHayduk This might help for your immediate needs: https://github.com/deeplearning4j/nd4j/issues/897#issuecomment-217495023

@saudet Thanks for the link! I tried setting it up that way, and the accuracy for my neural net is still pretty terrible. It is only predicting 6 or 7 out of the 18 classes every time I run it, even though I am testing on the training data, so it shouldn't be too difficult to get 100% accuracy. I am not sure if this is an issue with my neural net itself, or if the labels are still being added incorrectly.

@ChrisHayduk Have you tried repartitioning your dataset to ensure minibatches are even? I'm still not sure if there was ever a bug in the first place. I keep telling people you have to TUNE your neural nets. Your neural net failing isn't always a software bug.

@agibsonccc From what I understand about my and @ChrisHayduk's situation, the problem seems to be that ImageRecordReader's default behavior does not iterate over subfolders and pick up the images (or, at most, picks them up only for the last class). @ChrisHayduk's current problem, however, could be caused by a variety of things.

@agibsonccc I've been trying to tune my neural net for the past month, but none of the changes to any of the parameters seemed to affect the situation. No matter what I did, the net would only predict one class and sometimes would only assign one class label to all of the data. The change to the ImageRecordReader call (that saudet linked) is the only thing that has produced different results, so it is leading me to believe that ImageRecordReader is the issue.

Have you actually been checking the batches going into the neural net, though? It's not hard to do a while (iterator.hasNext()) loop and just print the labels going into it.
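The check suggested here could look roughly like the sketch below. To keep it runnable without DL4J on the classpath, plain `double[][]` arrays of one-hot labels stand in for the label matrix of each minibatch; in a real pipeline those rows would come from the batches the dataset iterator returns. The class name `LabelBatchCheck` and the sample data are hypothetical.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class LabelBatchCheck {

    /** Index of the largest value in a one-hot (or probability) row. */
    static int argMax(double[] row) {
        int best = 0;
        for (int i = 1; i < row.length; i++) {
            if (row[i] > row[best]) {
                best = i;
            }
        }
        return best;
    }

    /** Counts how many rows of one batch belong to each class. */
    static int[] classCounts(double[][] oneHotLabels, int numClasses) {
        int[] counts = new int[numClasses];
        for (double[] row : oneHotLabels) {
            counts[argMax(row)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Two made-up minibatches of one-hot labels for 3 classes.
        List<double[][]> batches = Arrays.asList(
            new double[][] {{1, 0, 0}, {0, 1, 0}, {0, 0, 1}},  // mixed batch
            new double[][] {{0, 0, 1}, {0, 0, 1}, {0, 0, 1}}); // single-class batch

        Iterator<double[][]> iterator = batches.iterator();
        int batchIndex = 0;
        while (iterator.hasNext()) {
            int[] counts = classCounts(iterator.next(), 3);
            System.out.println("batch " + batchIndex++ + ": " + Arrays.toString(counts));
        }
    }
}
```

A batch whose counts are all in one slot, like the second one here, is the pattern to look for: if every batch looks like that, either the labels are not being assigned per subfolder or the data is not shuffled across classes.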

@agibsonccc When I try to print next.getLabelNames(), it's just outputting empty brackets.

Give me a GitHub gist with a printout of the labels and the like, and show me what it's outputting. If the labels aren't even, use the methods I showed you above and see what it does. If the label output matches your local file structure, then we've done our job. I'm just hesitant to call this a bug when the problem is likely something else. This is such a basic thing to have, and we have a decent number of unit tests for it.

@agibsonccc I'm sorry for the confusion. It seems like the labels are being output appropriately using the method that @saudet linked me to.

Chris, what did you do when you changed the connection parameters and the output was always the same? I get this output for a total of 10,000 images in 2 classes, each with 5,000 images:
Examples labeled as 0 classified by model as 0: 5000 times
Examples labeled as 1 classified by model as 0: 5000 times

Warning: class 1 was never predicted by the model. This class was excluded from the average precision

==========================Scores========================================
Accuracy: 0.5
Precision: 0.5
Recall: 0.5
F1 Score: 0.5
If I change the parameters of the model, the output is always the same. Can someone help?
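For what it's worth, the scores above are exactly what a model that always predicts class 0 produces on a balanced two-class set. The sketch below recomputes them from the confusion counts in the output. The class name `ConfusionMetrics` is hypothetical, and the averaging (precision only over classes that were predicted at least once, recall over all classes) is an assumption chosen to match the warning in the output, not a statement of how DL4J computes its evaluation.

```java
public class ConfusionMetrics {

    /** Fraction of all examples on the diagonal; m[actual][predicted]. */
    static double accuracy(long[][] m) {
        long correct = 0, total = 0;
        for (int i = 0; i < m.length; i++) {
            for (int j = 0; j < m[i].length; j++) {
                total += m[i][j];
                if (i == j) {
                    correct += m[i][j];
                }
            }
        }
        return (double) correct / total;
    }

    /** True positives over everything predicted as cls; NaN if cls was never predicted. */
    static double precision(long[][] m, int cls) {
        long predicted = 0;
        for (long[] row : m) {
            predicted += row[cls];
        }
        return predicted == 0 ? Double.NaN : (double) m[cls][cls] / predicted;
    }

    /** True positives over everything actually labeled cls; NaN if cls has no examples. */
    static double recall(long[][] m, int cls) {
        long actual = 0;
        for (long v : m[cls]) {
            actual += v;
        }
        return actual == 0 ? Double.NaN : (double) m[cls][cls] / actual;
    }

    public static void main(String[] args) {
        // Counts taken from the output above: every example was classified as 0.
        long[][] m = {
            {5000, 0},   // labeled 0: predicted 0, predicted 1
            {5000, 0}    // labeled 1: predicted 0, predicted 1
        };
        System.out.println("accuracy        = " + accuracy(m));
        System.out.println("precision(0)    = " + precision(m, 0));
        System.out.println("precision(1)    = " + precision(m, 1)); // never predicted
        System.out.println("recall(0)       = " + recall(m, 0));
        System.out.println("recall(1)       = " + recall(m, 1));
        System.out.println("mean recall     = " + (recall(m, 0) + recall(m, 1)) / 2);
    }
}
```

Accuracy comes out at 0.5, precision for class 0 at 0.5 (class 1 has none, since it was never predicted), and recall is 1.0 for class 0 and 0.0 for class 1, which averages to 0.5. So the scores carry no information beyond "the model collapsed to one class", and the label check earlier in the thread is worth running before tuning anything further.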
