Hi, thanks for your great work. I would like to ask some easy questions:
I've downloaded MS1M-ArcFace (85k identities, 5.8M images, 112x112). Have these images already been preprocessed with MTCNN? And did you use them directly for training? I ask because the images still seem to have some background behind the face.
When I load the images with a dataloader, what mean and std should I use to normalize them? Are they all [0.5, 0.5, 0.5]?
Thanks in advance. I have been trying the original ResNet34 (imported from PyTorch) with center loss on my own cropped face dataset, and I only got 98.6% on LFW.
Joe
Hi, thanks for your response. But when I computed the mean and std of the dataset, they are not 1 and 0; instead I got [0.5412688, 0.43232402, 0.37956172] and [0.28520286, 0.2531577, 0.24701026].
This might slightly affect the performance, right?
Thanks
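For readers who want to reproduce per-channel statistics like the ones quoted above, here is a minimal sketch in plain Python. It assumes the pixels are available as (r, g, b) tuples with values in 0..255 and reports the statistics on a 0..1 scale; the function name `channel_stats` is hypothetical, and the poster's actual procedure is not shown in the thread.

```python
import math

def channel_stats(pixels):
    """Return (mean, std) per channel on a 0..1 scale.

    `pixels` is any iterable of (r, g, b) tuples with values in 0..255.
    Uses running sums so the whole dataset never has to sit in memory.
    """
    count = 0
    sums = [0.0, 0.0, 0.0]
    squares = [0.0, 0.0, 0.0]
    for pixel in pixels:
        count += 1
        for c in range(3):
            value = pixel[c] / 255.0
            sums[c] += value
            squares[c] += value * value
    if count == 0:
        raise ValueError("no pixels given")
    mean = [s / count for s in sums]
    # Population std: sqrt(E[x^2] - E[x]^2), clamped against rounding error.
    std = [math.sqrt(max(q / count - m * m, 0.0)) for q, m in zip(squares, mean)]
    return mean, std

# Toy input: one black and one white pixel.
mean, std = channel_stats([(0, 0, 0), (255, 255, 255)])
print(mean, std)
```

On a real dataset the same function could be fed a generator that yields pixels image by image.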
I don't think it will affect the performance, but you can try.
- Already aligned by using MTCNN landmark output.
- Pixel mean and std are not required; just input the original pixel values.
Hi, @nttstar
I found the normalization in symbol/fresnet.py:
https://github.com/deepinsight/insightface/blob/4a4b8d03fec981912fdef5b3232a37a827cbeed6/recognition/symbol/fresnet.py#L547
https://github.com/deepinsight/insightface/blob/4a4b8d03fec981912fdef5b3232a37a827cbeed6/recognition/symbol/fresnet.py#L548
So the mean is (127.5, 127.5, 127.5) and the scale is 0.0078125?
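If the linked lines are read as "subtract 127.5, then multiply by 0.0078125" (an assumption based on the question above, not a confirmed description of the model), the arithmetic works out as in this small sketch. Since 0.0078125 is exactly 1/128, a uint8 value lands in roughly [-1, 1), which is very close to the [0.5, 0.5, 0.5] mean/std convention applied to a 0..1 image. The function names are hypothetical.

```python
MEAN = 127.5
SCALE = 0.0078125  # exactly 1 / 128

def normalize_pixel(value):
    """Apply (value - MEAN) * SCALE to one uint8 channel value."""
    return (value - MEAN) * SCALE

def normalize_unit(value):
    """The [0.5, 0.5, 0.5] convention on a 0..1 image: (value / 255 - 0.5) / 0.5."""
    return (value / 255.0 - 0.5) / 0.5

# Compare the two conventions at the ends and middle of the uint8 range.
for v in (0, 128, 255):
    print(v, normalize_pixel(v), normalize_unit(v))
```

The two conventions differ by less than 0.004 over the whole uint8 range. Whether the caller or the network is supposed to apply this step is worth checking against the linked source.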
Could someone point out the exact input format for the image in case we are not using MTCNN?
E.g. a uint8 image, or an fp32 image divided by 255?
@clhne
@deepwilson
Did you find the correct image input format?
I am implementing the code in Java and facing similar preprocessing issues.
I tried the snippet below. It does not seem to work with my model, but it might help you:
https://github.com/deepinsight/insightface/issues/1086
It mentions
*ptr_image++ = (static_cast<float>(*data++) / 255.0 - mean_r) / std_r;
@siddharthshah3030 Have you tried fp32/255 without standardization?
@deepwilson
I tried, no luck
Any help is really appreciated
My aligned detected face
Utils.matToBitmap(aligned_MAT, aligned_face);
Imgproc.cvtColor(aligned_MAT, aligned_MAT, Imgproc.COLOR_BGRA2RGB);
I visually verified that aligned_MAT is in RGB and contains the aligned face as expected.
model_input = ByteBuffer.allocateDirect(1 * h * w * c * 4 ); // fp32 is 4 bytes per value in Java
model_input.rewind(); // position is set to zero
The code below is adapted from https://github.com/deepinsight/insightface/issues/1086 to reproduce np.transpose(nimg, (2,0,1)):
// aligned_MAT ~= image in byte array
for (int k = 0; k < 1; k++)
for (int c = 0; c < 3; ++c)
for (int i = 0; i < h; ++i)
for (int j = 0; j < w; ++j)
{
float val = (float)(aligned_MAT[(i * w + j) * 3 + c]) ;
val = val / 255.0f;
model_input.putFloat(val);
}
then
tflite.run(model_input, embeddngs);
The cosine distances are not discriminative: they fall between 0.0 and 0.4 for both same-person and different-person pairs.
I believe the issue is in the image preprocessing.
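For reference, a minimal sketch of the metric being discussed, assuming "cosine distance" here means 1 minus cosine similarity (the thread does not say which definition was used). The function name `cosine_distance` is hypothetical.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cos(angle) between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("zero-length embedding")
    return 1.0 - dot / (norm_a * norm_b)

print(cosine_distance([1.0, 0.0], [1.0, 0.0]))  # same direction
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # orthogonal
```

Under this definition, identical directions give 0 and orthogonal embeddings give 1, so a model whose distances overlap for same and different identities is not separating them.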
@siddharthshah3030 Are you applying any augmentations later in the pipeline?
Could it be that you applied some augmentations and then forgot to clip the values to between 0 and 1?
Also, please check whether your ArcFace layer is implemented properly.
@deepwilson
I basically showed you the whole pipeline.
There are no augmentations after that.
I am also missing the step below,
and I am not sure how to implement it:
input_blob = np.expand_dims(aligned, axis=0)
PS: the model is working fine in Python.
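On the np.expand_dims(aligned, axis=0) step: it only adds a leading batch axis of size 1 and does not reorder any values. A dependency-free sketch of that idea follows, using nested lists in place of NumPy arrays; the helper names are hypothetical. It suggests that a flat buffer filled for a single image already has the same element order as the batched input.

```python
def expand_dims_axis0(array):
    """Mimic np.expand_dims(array, axis=0) for nested lists: wrap in a batch of 1."""
    return [array]

def shape(nested):
    """Shape of a regular nested list, read along the first element of each level."""
    dims = []
    while isinstance(nested, (list, tuple)):
        dims.append(len(nested))
        nested = nested[0]
    return dims

def flatten(nested):
    """Row-major flattening, i.e. the order values would be written to a flat buffer."""
    if isinstance(nested, (list, tuple)):
        out = []
        for item in nested:
            out.extend(flatten(item))
        return out
    return [nested]

# A tiny 3 x 2 x 2 stand-in for an aligned face.
aligned = [[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]]
input_blob = expand_dims_axis0(aligned)

print(shape(aligned), shape(input_blob))
print(flatten(aligned) == flatten(input_blob))
```

In the Java loop above, the outer `k` loop over a single batch element plays the same role.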
@deepwilson
My issue is solved.
Everything works well now.
The model required HWC layout and I was sending CHW.
Thanks a lot for your help.
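To make the HWC versus CHW difference concrete, here is a small sketch in plain Python of the two flattening orders. It assumes the image is a nested list indexed as image[row][col] holding (r, g, b) tuples; `pack_hwc` and `pack_chw` are hypothetical helper names, not part of any library mentioned in this thread.

```python
def pack_hwc(image):
    """Flatten image[row][col] = (r, g, b) in HWC order: channels vary fastest."""
    return [float(ch) for row in image for pixel in row for ch in pixel]

def pack_chw(image):
    """Flatten the same image in CHW order: one full plane per channel."""
    channels = len(image[0][0])
    return [float(pixel[c]) for c in range(channels) for row in image for pixel in row]

# A 1 x 2 image: two pixels, three channels each.
image = [[(1, 2, 3), (4, 5, 6)]]

print(pack_hwc(image))  # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(pack_chw(image))  # [1.0, 4.0, 2.0, 5.0, 3.0, 6.0]
```

Both buffers have the same length, so a model given the wrong layout will typically run without error and simply return poor embeddings, which matches the symptom described earlier in the thread.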