Face_recognition: Using more than 1 image for each person

Created on 17 Mar 2017 · 8 Comments · Source: ageitgey/face_recognition

Hi, thanks for your amazing articles! :)

I was wondering how we can use more than one image per person to build an embedding that covers different conditions (lighting, beard/no beard, glasses, ...).
I am new to this, so correct me if I am wrong: do we have to apply face_recognition.face_encodings(picture_of_me)[0] to each image (20 or 30 of them) of every person, and then train an SVM to predict the class of new images? If so, what happens when we test a new face that was not used to train the classifier?

Thank you!

Most helpful comment

You could use multiple pictures of the same person in a number of different ways. It just depends on your exact use case.

If you run face_recognition.face_encodings(picture_of_me)[0] on 30 pictures of the same person, you'll have 30 different encodings of the same person. Each encoding is an array of 128 numbers.

All 30 should be close to each other (within 0.6 in Euclidean distance, computed across all 128 numbers), but of course that might not always be the case. So you could handle that in different ways:

  1. You could compare each unknown picture separately against all 30 known samples of the same person. This would work, but it gets slow as you scale to more people. On the other hand, it's easy to tell when someone is not a match for any of the known people.
  2. You could train any kind of classifier (like an SVM) as you said. In that case, you would still get a prediction that your unknown person matches one of your known people even if they aren't really a match. But you should get a low confidence score, which you can use to decide that the match isn't good enough to count as a real match.
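
A minimal sketch of option 1, assuming each known person has a list of encodings. The names `best_match`, `known` and `TOLERANCE` are made up for illustration, and the toy vectors stand in for real output of face_recognition.face_encodings. With real data, the library's face_recognition.face_distance could do the distance computation instead of math.dist.

```python
import math

TOLERANCE = 0.6  # the distance threshold mentioned above; tune it for your data


def best_match(known, unknown, tolerance=TOLERANCE):
    """Return (name, distance) for the closest known sample, or
    (None, distance) when even the closest sample is beyond `tolerance`."""
    best_name, best_dist = None, float("inf")
    for name, samples in known.items():
        for sample in samples:
            d = math.dist(sample, unknown)  # Euclidean distance over all 128 values
            if d < best_dist:
                best_name, best_dist = name, d
    if best_dist > tolerance:
        return None, best_dist  # nobody is close enough: treat as unknown
    return best_name, best_dist


# Toy 128-dimensional stand-ins for real encodings (hypothetical data).
known = {
    "alice": [[0.0] * 128, [0.01] * 128],
    "bob": [[0.2] * 128, [0.21] * 128],
}

print(best_match(known, [0.005] * 128))  # close to alice's samples
print(best_match(known, [1.0] * 128))    # far from everyone
```

The cost grows with the total number of stored samples, which is the scaling concern raised in option 1.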

It really all depends on your use case. There's other cool stuff you can do like get the encodings for every image in a folder and use the Chinese Whispers algorithm to automatically cluster every face into groups where each group is one real person (i.e. discover all the individual people in a batch of images).
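
To give a rough idea of the clustering mentioned here, below is a simplified, unweighted sketch of Chinese Whispers in plain Python. It is not the library's implementation (dlib ships its own as dlib.chinese_whispers_clustering); the function name, the 0.6 threshold, the iteration count and the tie-breaking rule are all assumptions made for this example.

```python
import math
import random
from collections import Counter


def chinese_whispers(encodings, threshold=0.6, iterations=20, seed=0):
    """Group encodings into clusters; returns one cluster label per encoding."""
    n = len(encodings)
    # Build an unweighted graph: connect two faces when they are close enough.
    neighbours = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(encodings[i], encodings[j]) < threshold:
                neighbours[i].append(j)
                neighbours[j].append(i)

    labels = list(range(n))  # every face starts in its own cluster
    rng = random.Random(seed)
    order = list(range(n))
    for _ in range(iterations):
        rng.shuffle(order)  # visit the faces in a random order each round
        for node in order:
            if not neighbours[node]:
                continue  # an isolated face keeps its own label
            votes = Counter(labels[k] for k in neighbours[node])
            top = max(votes.values())
            # Adopt the most common neighbouring label; ties go to the
            # smallest label so that runs are reproducible.
            labels[node] = min(lab for lab, count in votes.items() if count == top)
    return labels


# Toy stand-ins for encodings of two different people (hypothetical data).
faces = [[0.0] * 128, [0.01] * 128, [0.02] * 128,
         [0.2] * 128, [0.21] * 128, [0.22] * 128]
labels = chinese_whispers(faces)
print(labels)
```

Faces that end up with the same label are treated as the same person; the number of distinct labels is the number of people discovered.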

All 8 comments

Wouldn't it be more accurate to train with multiple images of each person? Multiple images could help to build an "average" encoding of the person.

So if the script finds a person folder inside the "known images" folder, it would encode that person's face as the average of all of that person's individual encodings. Does that make sense?

I wonder if it could be a feature for the lib, or just tell me if I'm heading in the wrong direction.
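
The averaging idea could be sketched roughly as follows. This is only an illustration of the proposal, not an existing feature of the library; `mean_encoding` and the toy vectors are hypothetical, and whether an averaged encoding actually improves accuracy is not settled in this thread.

```python
import math


def mean_encoding(encodings):
    """Element-wise average of several encodings of the same person."""
    n = len(encodings)
    return [sum(values) / n for values in zip(*encodings)]


# Toy stand-ins for three encodings of one person (hypothetical data).
samples = [[0.0] * 128, [0.02] * 128, [0.04] * 128]
centroid = mean_encoding(samples)

# A new face is then compared against the single averaged encoding.
new_face = [0.03] * 128
distance = math.dist(centroid, new_face)
print(distance, distance <= 0.6)
```

Comparing against one averaged vector per person is cheaper than comparing against every stored sample, at the price of blending the different conditions into a single point.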

The face encoding is generated by a pre-trained model, which in turn was produced by a deep learning algorithm. This ensures a certain accuracy and uniqueness for each generated encoding. Even if you use multiple images of the same person, the variance between the encodings is expected to be insignificant for classification purposes. Of course, you could always set up an SVM to further improve the process, but that comes at the cost of performance and setup complexity, for an improvement that is expected to be minimal.

If I set up an SVM, what should I use as the target label?
My columns are "filepath, filename, e1, e2, e3... e128", where e[1-128] are the encoding values.

Hi, the label can be the name of the person in the image. So if you have, for example, 30 images for each person, you will have 30 arrays (face_encodings) with the label person1, then another 30 arrays with the label person2, and so on. After that, the labels can be encoded with LabelEncoder (http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.LabelEncoder.html) in order to feed them to the SVM (http://scikit-learn.org/stable/auto_examples/svm/plot_iris.html#sphx-glr-auto-examples-svm-plot-iris-py). Therefore, _fit(X, y, sample_weight=None)_ will receive as _X_ the matrix with all the encodings (arrays) from all the pictures, and as _y_ an array with all the labels.
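
As a rough sketch of that workflow, assuming scikit-learn and NumPy are installed: the random vectors below are only stand-ins for real face encodings, and the names person1 and person2 follow the example above.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder
from sklearn.svm import SVC

# Synthetic stand-ins for face encodings: 30 noisy 128-d vectors per person.
rng = np.random.default_rng(0)
centers = {"person1": 0.0, "person2": 0.3}
X, names = [], []
for name, center in centers.items():
    for _ in range(30):
        X.append(center + rng.normal(scale=0.01, size=128))
        names.append(name)
X = np.array(X)  # shape (60, 128): one row per picture

# Turn the string labels into integers (classes are sorted alphabetically).
encoder = LabelEncoder()
y = encoder.fit_transform(names)

clf = SVC()
clf.fit(X, y)

# Predict a new encoding and map the integer back to the person's name.
new_face = np.full(128, 0.3)
predicted = encoder.inverse_transform(clf.predict([new_face]))[0]
print(predicted)
```

The encoder is kept around after training because inverse_transform is what turns the numeric prediction back into a name.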

I don't even bother...

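import json

from sklearn.svm import SVC

encodings_array = []
names_array = []

# data.json is assumed to hold a list of records with 'name' and 'encoding' keys
with open('data.json') as json_data:
    nodes = json.load(json_data)
    for node in nodes:
        encodings_array.append(node['encoding'])
        names_array.append(node['name'])

# SVC accepts string labels directly, so no separate label encoding step is needed
clf = SVC()
clf.fit(encodings_array, names_array)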

I just put all the people's names (strings) into an array and fit the classifier on them without encoding.

Later on I do

name = clf.predict([face_encoding])[0]

and voilà.

This clarifies it a lot! Thanks
Just curious: SVC is a classifier, but what would you recommend for finding similar faces? Just the Euclidean distance between encodings, filtered with a threshold (~0.6)?
Are there any other alternatives?

Great question @john-bonachon, I have just reached this point myself.

In his original blog post, @ageitgey demonstrated the OpenFace demo classifier, which used SVC.

You could check out this link for more classifiers available from sklearn. I strongly recommend going through each in detail to find a suitable configuration for your use case :mag_right:
