Hello Alexey (or anyone else who can help),
Thanks for your previous answers. Hopefully you can help again.
I have a bigger question this time. I'm working with roughly 20,000 classes; for simplicity, let's say 500.
Given the following challenge (picture 1 below) and the following solution (picture 2 below):
Question 1
I have to be able to tell 1A, 1B and 1C apart (three different objects), and also to tell 2A, 2B and 2C apart (three different objects, but differing in another way).
[I've marked the distinguishing feature with a blue line so you can see what the different difference is.]
I also have to be able to see that 3A, 3B and 3C are exactly the same object.
Can this be done with darknet/YOLO? Can it all be done in one model? I think it is possible. My approach would be to train using the template backgrounds (second picture) and then use the chosen background in the production environment.
Question 2
Which of backgrounds 1-10 would be best to train on?
Of course, in the production environment I will then use the same background as the one used for training. Of course, during training I will try to take around 1,000-2,000 pictures per class on that same background.
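As a side note, the class count and the training image lists come together in the .data file passed to darknet; a minimal sketch, assuming standard darknet naming (the paths are placeholders):

```
classes = 500
train = data/train.txt
valid = data/valid.txt
names = data/obj.names
backup = backup/
```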
Question 3
Does color or color contrast make a difference for the backgrounds? Should I use a black background with white lines, or just grey/wood?
Question 4
Do I use softmaxing to object-detect between superclasses (i.e. separate Knife vs. Scissor) and then image-classify between subclasses (i.e. separate Knife 1 vs. Knife 2)?
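This two-stage idea could be prototyped with stock darknet commands rather than a single model; a sketch, assuming a separately trained superclass detector and per-superclass classifiers (all file names here are hypothetical):

```
rem Stage 1: detect the superclass (Knife vs. Scissor) and crop the box
darknet.exe detector test data/super.data yolo-super.cfg yolo-super.weights image.jpg
rem Stage 2: classify the crop into subclasses (Knife 1 vs. Knife 2)
darknet.exe classifier predict data/knife.data knife-sub.cfg knife-sub.weights crop.jpg
```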


@SJRogue Hi,
Geometric augmentation disabled:

```ini
jitter=0
random=0
```

or enabled:

```ini
jitter=0.3
random=1
```

Color augmentation:

```ini
saturation = 0
exposure = 1.5
hue=0
```
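For orientation, these keys live in two different sections of a YOLOv2 .cfg file: saturation, exposure and hue belong under [net], while jitter and random sit in each [region] layer. A minimal sketch of the placement (all unrelated lines omitted; the values shown are the augmentation-enabled variants from this thread):

```ini
[net]
# color augmentation: random saturation/exposure scaling and hue shift
saturation = 1.5
exposure = 1.5
hue = .1

[region]
# geometric augmentation: jitter randomly crops/translates the image,
# random=1 periodically resizes the network input during training
jitter = 0.3
random = 1
```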
Other backgrounds can be used only if you want to know the real size of an object (not only its proportions), but:
yolo9000.cfg with softmax-tree (superclasses and subclasses): https://github.com/AlexeyAB/darknet#using-yolo9000. This model is more difficult to train, though. You can try to train both and compare results. Based on:
```
darknet.exe detector train data/obj.data yolo-obj.cfg darknet19_448.conv.23
darknet.exe detector train data/obj9k.data yolo-obj.cfg yolo9000.conv.22
darknet.exe partial yolo9000.cfg yolo9000.weights yolo9000.conv.22 22
```

Alexey, a big thank you.
I need time to respond with more questions, but for now:
Q1:
For detection I can control scale, resolution and shooting point (distance, angle). I cannot always control lighting; every environment will be different.
For me, if shape + size + proportions are the same, then color does not matter: it is the same object.
In this case: I mean that a scissor with a red handle = a scissor with a blue handle, if the shape and the real-life size/proportions are the same.
Q2:
In detection: I cannot control the rotation of the object placed on the background by the users. They will not always place instruments precisely.
Q3:
Users will place one object at a time (in version 1).
So I'm thinking I need a background with high contrast (background color vs. line/grid/circle colors), and I need to pick two colors that never blend with the instruments.
Or, as you say, since I can control distance, shooting point and angle, I could take background 1. (Are you sure this is not a problem for determining size/proportions?)
Q4:
Exactly! I think I need YOLO9000, because without the softmax-tree approach it is really going to be a problem to categorize and differentiate.
```ini
saturation = 1.5
exposure = 1.5
hue=.1
```

In the [region] layer (jitter=0 or 0.05):

```ini
jitter=0
random=0
```
So, if you cannot control the rotation of the placed object, then your training dataset should contain every rotation of each object that is possible at the detection stage.
Background-1 isn't a problem for proportions. And it isn't a problem for size (if scale and shooting point are the same).
It can be a problem if some of the objects have the same color as background-1.
Yes, softmax is used in both cases, YOLOv2 and YOLO9000; in addition, YOLO9000 has softmax-tree, so it uses a separate softmax for each group of subclasses. This is very suitable for classifying a large number of classes.
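As an illustration of the softmax-tree setup: the tree file lists one class per line as `<name> <parent index>` (with -1 for a root), and every sibling group then gets its own softmax. A sketch using the hypothetical Knife/Scissor hierarchy from this thread:

```
knife -1
scissor -1
knife_1 0
knife_2 0
scissor_1 1
scissor_2 1
```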
Thanks, Alexey, for taking the time to understand the environment I'm describing. I think you gave me what I need to know.
I will have questions about
- capture-stage, training-stage and detection-stage resolution
- detection-stage signal registration (when is a signal emitted/interrupted?)
-> This is already for version 2 of my platform [not soon], where I will be moving objects into and out of the camera's field of view, trying to avoid the same object being detected twice while still registering two different objects of the same class as two different entries of one class (possibly belonging to a different group, but that will be handled on my level once I understand registration).
I will refer to you, no doubt, in my graduation project : )
Interesting application :+1: Good luck with your project.
@sivagnanamn
Thank you for the support; it's going to take trial and error. Whatever results I get, I will report back.