Insightface: Question about preprocessing steps

Created on 21 Feb 2019 · 11 Comments · Source: deepinsight/insightface

Hi, thanks for your great work. I would like to ask some easy questions:

  1. I've downloaded MS1M-ArcFace (which has 85k identities and 5.8M images at 112x112). Are these images already preprocessed by MTCNN? And did you use them directly for training? I ask because the images still seem to show some background.

  2. When I use a dataloader to load the images, what mean and std should I use to normalize them? Are they all [0.5, 0.5, 0.5]?

Thanks in advance. I have been trying the original ResNet34 (imported from PyTorch) with center loss on my own cropped face dataset, and I only got 98.6% on LFW.

Joe

All 11 comments

  1. Already aligned by using MTCNN landmark output.
  2. Pixel mean and std is not required, just input original pixel values.

Hi, thanks for your response. But when I computed the mean and std of the dataset, they are not 0 and 1; instead I got [0.5412688, 0.43232402, 0.37956172] and [0.28520286, 0.2531577, 0.24701026].

This might slightly affect the performance, right?

Thanks
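
Per-channel statistics like the ones quoted above can be computed along these lines. This is a minimal sketch, assuming the images are available as a uint8 NumPy array of shape (N, H, W, 3) and are scaled to [0, 1] before the statistics are taken; the function name `channel_stats` and the random stand-in batch are illustrative and not taken from the dataset or from insightface.

```python
import numpy as np


def channel_stats(images_uint8):
    """Return per-channel (mean, std) of an (N, H, W, 3) uint8 batch scaled to [0, 1]."""
    scaled = images_uint8.astype(np.float64) / 255.0
    # Reduce over batch, height, and width; keep the channel axis.
    return scaled.mean(axis=(0, 1, 2)), scaled.std(axis=(0, 1, 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Random stand-in data, NOT real face images.
    fake_batch = rng.integers(0, 256, size=(4, 112, 112, 3), dtype=np.uint8)
    mean, std = channel_stats(fake_batch)
    print("mean:", mean, "std:", std)
```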

I don't think it will affect the performance, you can try.

  1. Already aligned by using MTCNN landmark output.
  2. Pixel mean and std is not required, just input original pixel values.

Hi, @nttstar
I found the normalization in symbol/fresnet.py:
https://github.com/deepinsight/insightface/blob/4a4b8d03fec981912fdef5b3232a37a827cbeed6/recognition/symbol/fresnet.py#L547
https://github.com/deepinsight/insightface/blob/4a4b8d03fec981912fdef5b3232a37a827cbeed6/recognition/symbol/fresnet.py#L548
So, the mean is 127.5, 127.5, 127.5, scale is 0.0078125 ?
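
For reference, the arithmetic implied by those two values can be sketched as follows. This is a minimal sketch, assuming the linked lines subtract 127.5 and then multiply by 0.0078125 (which equals 1/128); the function name `normalize_pixel` is hypothetical and not part of insightface. If that reading is right, raw 0-255 pixel values are mapped to roughly [-1, 1] inside the network, which would be consistent with the advice to feed original pixel values.

```python
MEAN = 127.5
SCALE = 0.0078125  # exactly 1 / 128


def normalize_pixel(value):
    """Map a raw 0..255 pixel value to roughly [-1, 1]."""
    return (value - MEAN) * SCALE


if __name__ == "__main__":
    for raw in (0, 127.5, 255):
        print(raw, "->", normalize_pixel(raw))
```

Note that this is close to, but not identical with, dividing by 255 and then normalizing with mean 0.5 and std 0.5, which gives (x - 127.5) / 127.5.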

Could someone point out the exact input format for the image in case we are not using MTCNN?

E.g. uint8 img or fp32 img /255?

@clhne
@deepwilson
did you find the correct image input format?

I am implementing the code in Java and facing similar preprocessing issues.
I tried the snippet below, but it does not seem to work with my model; it might help you:
https://github.com/deepinsight/insightface/issues/1086

It mentions
*ptr_image++ = (static_cast<float>(*data++) / 255.0 - mean_r) / std_r;

@siddharthshah3030 Have you tried fp32/255 without standardization?

@deepwilson
I tried, no luck
Any help is really appreciated

My aligned detected face

        Utils.matToBitmap(aligned_MAT, aligned_face);
        Imgproc.cvtColor(aligned_MAT, aligned_MAT, Imgproc.COLOR_BGRA2RGB);

I visually verified that aligned_MAT is in RGB and shows the aligned face as expected.

        model_input   = ByteBuffer.allocateDirect(1 * h * w * c * 4 );  // fp32 is 4 bytes per value
        model_input.rewind(); // position is set to zero

The code below is from https://github.com/deepinsight/insightface/issues/1086 and stands in for np.transpose(nimg, (2,0,1)):

        // aligned_MAT ~= image in byte array
        for (int k = 0; k < 1; k++)
            for (int c = 0; c < 3; ++c)
                for (int i = 0; i < h; ++i)
                    for (int j = 0; j < w; ++j)
                    {
                        float val = (float)(aligned_MAT[(i * w + j) * 3 + c]);
                        val = val / 255.0f;
                        model_input.putFloat(val);
                    }

then
tflite.run(model_input, embeddngs);

The cosine distances are inconclusive: they fall between 0.0 and 0.4 for both same-person and different-person pairs.
I believe the issue is in the image preprocessing.
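
For comparing two embeddings, cosine distance can be computed as sketched below. This assumes the common definition 1 - cos(a, b); the thread does not say which formula was actually used, and the function name `cosine_distance` is hypothetical.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy vectors, not real face embeddings.
    print(cosine_distance([1.0, 0.0], [1.0, 0.0]))
    print(cosine_distance([1.0, 0.0], [0.0, 1.0]))
```

Under this definition, identical directions give a distance near 0 and orthogonal vectors give 1, so heavily overlapping ranges for same-person and different-person pairs suggest the embeddings themselves carry little identity information.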

@siddharthshah3030 Are you applying any augmentations later in the pipeline?

Could it be that you applied some augmentations and after the operation you forgot to clip the values between 0 and 1?

Also, please check whether your ArcFace layer is implemented properly.

@deepwilson
I basically showed you the whole pipeline.
There are no augmentations after that.

Also, I am missing the step below and am not sure how to implement it:
input_blob = np.expand_dims(aligned, axis=0)

P.S. The model works fine in Python.
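
On the `np.expand_dims` step: it only adds a leading batch axis and does not reorder or change any pixel values. A small sketch, assuming a 112x112x3 float32 image (the zero-filled array is a stand-in for the aligned face, not real data):

```python
import numpy as np

# Stand-in for the aligned face; shape (H, W, C).
aligned = np.zeros((112, 112, 3), dtype=np.float32)

# Add a batch axis in front: shape becomes (1, H, W, C).
input_blob = np.expand_dims(aligned, axis=0)

print(aligned.shape, "->", input_blob.shape)
# The raw bytes are unchanged by the extra axis.
print(input_blob.tobytes() == aligned.tobytes())
```

If the same holds on the Java side, a flat buffer of 1 * h * w * c floats holding a single image should already match a batch of size 1, so no separate step would be needed there.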

@deepwilson
My issue is solved; everything works now.

The model required HWC input and I was sending CHW.
Thanks a lot for your help.
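
The difference between the two layouts can be sketched as follows. This is an illustration in plain Python, assuming the source pixels are stored interleaved (index `(i * w + j) * c + ch`, as in the loop earlier in the thread); the function names `flatten_chw` and `flatten_hwc` are hypothetical.

```python
def flatten_chw(pixels, h, w, c):
    """Channel-first order: all of channel 0, then channel 1, and so on."""
    return [pixels[(i * w + j) * c + ch]
            for ch in range(c) for i in range(h) for j in range(w)]


def flatten_hwc(pixels, h, w, c):
    """Channel-last order: the c values of each pixel stay together."""
    return [pixels[(i * w + j) * c + ch]
            for i in range(h) for j in range(w) for ch in range(c)]


if __name__ == "__main__":
    h, w, c = 1, 2, 3
    pixels = [10, 11, 12, 20, 21, 22]  # two RGB pixels, interleaved
    print("CHW:", flatten_chw(pixels, h, w, c))
    print("HWC:", flatten_hwc(pixels, h, w, c))
```

With the channel loop outermost, as in the Java snippet above, the buffer is filled in CHW order; moving the channel loop innermost yields HWC.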
