Face_recognition: How can I make face recognition faster if I have more than 1M known images?

Created on 20 Nov 2017 · 69 Comments · Source: ageitgey/face_recognition

I tried to use your amazing project for face recognition (for example, on a single unknown image) against a large number of known images (1 million), but it's really slow. It is slow because it loads every known image (load_image_file, then face_encodings) in order to compare with ONE unknown image file.

Any ideas on how to speed up this process? I was thinking of running face_encodings on all known images and then saving the 128 values as a string in Apache Solr, but with no luck, as I would still need to run compare_faces against all known images :) ... Any suggestions?

You could create a database table (postgres, mysql, etc) with 128 columns and store the pre-calculated 1M encodings in that table. Then you could do the compare_faces math using sql against that table to check one face.

Thank you for the quick reply. How can I translate your compare_faces into SQL, since it uses np.linalg.norm? Do you have a real example?

The formula for Euclidean distance is just:

d(p, q) = sqrt((p_1 - q_1)^2 + (p_2 - q_2)^2 + ... + (p_n - q_n)^2)

So assuming you had one column for each of the 128 feature values, you could do something like:

SELECT * FROM my_stored_encodings
ORDER BY
    sqrt(
        power(e1 - TEST_ENCODING_VALUE_0_HERE, 2) +
        power(e2 - TEST_ENCODING_VALUE_1_HERE, 2) +
        power(..... etc....
    )

If you are using PostgreSQL, you can do more complex things, like using its built-in list data types to store the 128-number encoding in one column and doing the comparison with a custom stored function. Just google around for "euclidean distance in sql".
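Writing out 128 terms by hand is tedious, so the statement can be generated. The following is only a sketch under assumptions that are not stated in the thread: the columns are named e1 through e128 (extending the e1, e2 pattern above), the driver uses %s placeholders, and a LIMIT 1 is added to return only the closest row. The function name build_distance_query is hypothetical.

```python
def build_distance_query(table="my_stored_encodings", n_dims=128):
    """Return a parameterized SQL query ordering rows by Euclidean distance.

    Assumption: the encoding columns are named e1..e<n_dims> and the
    database driver accepts %s placeholders.
    """
    terms = " +\n        ".join(
        f"power(e{i} - %s, 2)" for i in range(1, n_dims + 1)
    )
    return (
        f"SELECT * FROM {table}\n"
        f"ORDER BY\n"
        f"    sqrt(\n        {terms}\n    )\n"
        f"LIMIT 1"
    )


query = build_distance_query()
# The parameters are the 128 values of the unknown face encoding, in column
# order, for example: cursor.execute(query, [float(v) for v in encoding])
print(query.count("%s"))  # one placeholder per column
```

Passing the test encoding as parameters rather than pasting the numbers into the string keeps the statement identical across lookups.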

Many thanks :)

You can use numpy to calculate the Euclidean distance between two vectors:
dist = numpy.linalg.norm(a - b)  # a and b are numpy arrays of equal length
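The same call also works against a whole matrix of stored encodings at once, which avoids re-loading and re-encoding the known images for every lookup. This is a minimal sketch with random stand-in data instead of real face encodings; the array sizes are arbitrary, and 0.6 is used because it is the default tolerance of face_recognition.compare_faces.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in data: in practice these rows would be encodings computed once with
# face_recognition.face_encodings() and saved to disk or a database.
known_encodings = rng.normal(size=(10_000, 128))
unknown_encoding = known_encodings[1234] + 0.001  # slightly perturbed copy

# One vectorized call computes the distance to every known encoding.
distances = np.linalg.norm(known_encodings - unknown_encoding, axis=1)

best_index = int(np.argmin(distances))
tolerance = 0.6  # default tolerance of face_recognition.compare_faces
is_match = bool(distances[best_index] <= tolerance)
print(best_index, is_match)
```

Whether this is fast enough for 1M encodings depends on the hardware; the thread does not report timings for this approach.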

Thank you all. I've indexed all image encodings into Apache Solr and then computed the Euclidean distance using Solr's built-in dist function, i.e. http://localhost:8983/solr/mycore/select?q=:&fl=dist(2,v_0,v_1,v_3,...,v_127,-0.0621345,0.048437204,0.0839613,...)

So far, I have indexed around 40K images and the query speed is very good (17 ms without any Solr cache).

@khaledabbad : Could you please elaborate on how you managed to do this? I'm working on something very similar.

I used an Apache Solr filter query with the dist function.
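For readers who want to try something similar, here is a rough sketch of how such a request URL could be assembled. It is not the poster's actual code: the id field, the match-all query, the distance alias, and sorting by the function are assumptions, and build_solr_dist_url is a hypothetical helper. The core name mycore and the v_0 to v_127 field names follow the URL quoted above.

```python
from urllib.parse import urlencode


def build_solr_dist_url(encoding, base="http://localhost:8983/solr/mycore/select", rows=5):
    """Build a Solr select URL that ranks documents by Euclidean distance."""
    fields = ",".join(f"v_{i}" for i in range(len(encoding)))
    values = ",".join(repr(float(v)) for v in encoding)
    # dist(2, <coordinates of point A>, <coordinates of point B>) is the L2 distance.
    dist_fn = f"dist(2,{fields},{values})"
    params = {
        "q": "*:*",                      # assumption: match every document
        "fl": f"id,distance:{dist_fn}",  # assumption: an "id" field exists
        "sort": f"{dist_fn} asc",        # closest encodings first
        "rows": rows,
    }
    return base + "?" + urlencode(params)


url = build_solr_dist_url([0.1] * 128)
print(len(url))
```

The resulting URL is long (128 field names plus 128 values, twice), so sending the same parameters in a POST body may be more practical.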

@khaledabbad & @ageitgey: How much accuracy can we expect from a Euclidean distance calculation, rather than the KNN classifier, on a large set of image data? ----> https://github.com/ageitgey/face_recognition/blob/master/examples/face_recognition_knn.py
In my tests the KNN classifier looks much more promising in terms of accuracy. Can we expect similar accuracy here?
Or is this Euclidean distance concept what the KNN classifier uses?

Hello @ageitgey,
Thanks for the great work.
I am saving the face_encodings into a database in one column, but I am not sure what data type I should select for the column. bigint? bigserial?
Can you help, please?

I don't think it's a good idea to save all face features (encodings) in one column. Try adding 128 columns with a float data type, i.e. Column0 float(22,20), Column1 float(22,20), ... Column127 float(22,20).

@khaledabbad But I am using a PostgreSQL database. @ageitgey mentioned that we can store the encodings in one column if we use Postgres.

@xenc0d3r In Postgres, you can optionally store them in a single column using the CUBE extension. However, Postgres has a 100-dimension limit on cube fields by default, and you have to edit a file and recompile Postgres yourself if you want to do it that way, since the face encodings have 128 dimensions.

See the other thread: https://github.com/ageitgey/face_recognition/issues/403#issuecomment-373437405

@ageitgey Thank you for your response. I decided to save them as multiple columns. I know you are busy with other work, but can you give an example SQL query showing how to compare an uploaded image with the saved encodings in Python?

Thanks in advance.

@xenc0d3r There's an example higher in this thread: https://github.com/ageitgey/face_recognition/issues/238#issuecomment-345847465

You'll just have to write out all 128 terms in the sql statement instead of only the 2 I put in the short example.

Hello @ageitgey @khaledabbad
When I try to save the encoded list to PostgreSQL with Python's psycopg2, I receive the following error:
TypeError: 'numpy.float64' object does not support indexing

I have 128 columns for the encoded list elements in my database table, and their data types are float.
Can you help me please ?

Best Regards

@ageitgey @khaledabbad My code is as follows:

test = list(encoded-photo)
for row in test:
c.execute("""INSERT INTO photos VALUES (DEFAULT,%s,%s,%s ---128 times,);""", row)
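A likely cause of that TypeError is that the loop hands execute() one numpy.float64 at a time, while the driver expects a sequence holding all parameters for the row. A minimal sketch of building a single insert for the whole encoding follows; the photos table and the DEFAULT first column are taken from the snippet above, and build_insert is a hypothetical helper name. No database connection is opened here.

```python
def build_insert(encoding, table="photos"):
    """Return (sql, params) for inserting one face encoding as one row."""
    # Plain Python floats are the simplest thing to hand to the driver.
    params = [float(value) for value in encoding]
    placeholders = ", ".join(["%s"] * len(params))
    sql = f"INSERT INTO {table} VALUES (DEFAULT, {placeholders});"
    return sql, params


sql, params = build_insert([0.1] * 128)
# With an open psycopg2 cursor `c`, one call inserts the whole row:
# c.execute(sql, params)
print(sql.count("%s"), len(params))
```

The key difference from the loop above is that execute() is called once per face, with all 128 values passed together.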

I had success inserting with this (Python code):
INSERT INTO test (face_encoding) VALUES ('"+face_encoding_string+"')
I just convert the values to a string and insert it into a Postgres 128-dimension cube column.
Python:
face_encoding_string = "(0.1,0.1,0.1 .... 0.1,0.1)"

I don't need to optimize filling the database, so this works in my case. I just need optimized querying for when I move on to analyzing video streams for face recognition.

@mmelatti Can you help me with this? Can we discuss it directly on Twitter, etc.? Thank you.

I don't have Twitter at the moment, sorry. I'll post my code and see if it helps you. I'm using psycopg2 as well. This isn't a final product, but you can see if it works for you:

`
# Fragment of a GUI class method; requires face_recognition and psycopg2.
image = face_recognition.load_image_file(self.filename)
# Only one face is expected when entering a 'mugshot'
face_encoding = face_recognition.face_encodings(image)[0]

# Build the cube literal "(v0,v1,...,v127)" from the 128 encoding values
face_encoding_string = "(" + ",".join(str(value) for value in face_encoding) + ")"
print(face_encoding_string)  # demo of what the face encoding data looks like

conn = psycopg2.connect(host="localhost", database="postgres", user="postgres", password="password")
cur = conn.cursor()
# Pass the values as query parameters instead of concatenating them into the SQL
cur.execute(
    "INSERT INTO wanted (first_name, last_name, face_encoding) VALUES (%s, %s, %s)",
    (self.entry_1.get(), self.entry_2.get(), face_encoding_string),
)
conn.commit()
conn.close()
`

@mmelatti thank you so much for your help. regards

@mmelatti Can you also provide the SQL query you are using to compare images? You also said you used the cube extension, but as far as I know it is limited to 100 dimensions.
May I use it?

Yes, it's limited to 100, but in this post I show my method for changing cube to 128:
https://github.com/ageitgey/face_recognition/issues/403#issuecomment-374336850

I have a link to download the source for Postgres and instructions for changing the cube data type to 128 dimensions. I'm working on my query at the moment and will share that code as soon as I finish it.

It'll basically look like this:
SELECT c FROM test ORDER BY c <-> cube(array[0.5,0.5,0.5]) LIMIT 1;
See:
https://www.postgresql.org/docs/10/static/cube.html

UPDATE:
I finished my query. I use the same method to compute the face encoding of the new picture, then query that against my database. Now I need to stress test it, test its accuracy, and test thresholds: if no face in the database closely resembles the new one, we shouldn't return anything (unknown face).

Code: (Python3)
`
conn = psycopg2.connect(host="localhost", database="postgres", user="postgres", password="password")
cur = conn.cursor()

# Here face_encoding_string holds the comma-separated values without the
# surrounding parentheses, so that array[...] is a plain numeric array.
query = "SELECT first_name FROM wanted ORDER BY face_encoding <-> cube(array[" + face_encoding_string + "]) LIMIT 1"
cur.execute(query)
print(cur.fetchall())
`
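A variant of the query above that passes the encoding as a parameter and also returns the distance, so that a threshold for "unknown face" can be applied afterwards, might look like the sketch below. It is not the poster's code: the explicit ::float8[] cast, the helper names, the sample name "Jane", and the 0.6 cutoff (the default tolerance of face_recognition.compare_faces) are assumptions. The wanted table and its columns come from the snippets above. No connection is opened here.

```python
def build_nearest_face_query(encoding):
    """Return (sql, params) selecting the single closest stored encoding."""
    # psycopg2 adapts a Python list to ARRAY[...]; the cast makes it float8[].
    values = [float(v) for v in encoding]
    sql = (
        "SELECT first_name, face_encoding <-> cube(%s::float8[]) AS distance "
        "FROM wanted ORDER BY distance LIMIT 1"
    )
    return sql, (values,)


def name_or_unknown(row, max_distance=0.6):
    """Map a fetched (first_name, distance) row to a name, or None if too far."""
    if row is None or row[1] > max_distance:
        return None
    return row[0]


sql, params = build_nearest_face_query([0.1] * 128)
# With an open cursor: cur.execute(sql, params); row = cur.fetchone()
print(name_or_unknown(("Jane", 0.42)), name_or_unknown(("Jane", 0.95)))
```

The right cutoff depends on the data, so the 0.6 here should be treated as a starting point for the threshold experiments mentioned above.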

@mmelatti thank you. I am trying to install it.

@mmelatti I modified and installed version 10.3, but I don't think I can start the server with the service postgresql start command. Everything is tangled here.. :)

Did you add a new user ($ adduser postgres)?
Did you switch to that user to issue the start command?

I'm on a Mac. If it doesn't work exactly the same for you, I'd follow the README provided in the download for installation.

You can also try this docker installation:
https://github.com/oelmekki/postgres-350d

FATAL: role "faceadmin" does not exist
It seems the role was not created correctly. Try using the steps in the README?

@mmelatti I receive FATAL: role "faceadmin" does not exist

@mmelatti OK, I finally got it. It is working.

@mmelatti Can you also send the code where you inserted the encoded array into the database?
And what data type did you set for it? Thank you.

@mmelatti I inserted successfully into the database, but when I try to compare them I get the following error:
LINE 1: ...first_name FROM wanted ORDER BY face_encoding <-> cube(array...
^
HINT: No function matches the given name and argument types. You might need to add explicit type casts.

any ideas ?

I had that same error for a while. It went away when I changed ORDER BY ________ to make sure it used the same column in which I'm storing the cube data type. It might help to review the GiST index and metric operators described on the cube data type page of the 10.3 docs.

I believe the issue is with your <-> metric operator?

Also double-check that you have at least 2 entries in your table; I've seen that be an issue for people.

Did you just make install cube, or did you make install all extensions? I ended up make installing all extensions. I'm not sure whether this could be the root of your problem.

I don't believe you should remove "ORDER BY"

@mmelatti So should I remove the ORDER BY part? Can you help, please? It does not recognize the cube command.

@mmelatti Are you storing the encoding in 128 different columns or in a single column?

1 column (cube data type), entered as 128 dimensions. It looks like this when entering: (0.1,0.1,0.1, ...., 0.1,0.1).
A row contains the data for a single person: first and last name, id, and their face encoding (the face encoding fits in one column, i.e. in one cell of my table).

That's the whole point of changing the data type and specifically using PostgreSQL instead of a typical database setup with 128 columns for the 128 face encoding values. Doing it this way, there is less to do on the application side. I also think my tables look nicer.

I'm also going to experiment with a many-to-one relationship, with multiple face encodings (pictures) stored in my "wanted list" for a single person, and see what kind of results I get (in case someone has more than one "mugshot" in our database).

I still need to experiment with thresholds for "unknown faces" as well. I'll be adding my example code and detailed setup later.

I receive an error while encoding. Any ideas ? @ageitgey @khaledabbad

File "/usr/local/lib/python3.6/dist-packages/face_recognition/api.py", line 197, in face_encodings
raw_landmarks = _raw_face_landmarks(face_image, known_face_locations, model="small")
File "/usr/local/lib/python3.6/dist-packages/face_recognition/api.py", line 151, in _raw_face_landmarks
face_locations = _raw_face_locations(face_image)
File "/usr/local/lib/python3.6/dist-packages/face_recognition/api.py", line 100, in _raw_face_locations
return face_detector(img, number_of_times_to_upsample)
MemoryError: std::bad_alloc

Hello @mmelatti

I would like to thank you first for sharing the database for face recognition.

I had a problem when running LoadDatabase.py and I would appreciate if you please help to solve it.

$ python LoadDatabase1.py
/Users/Badr/.virtualenvs/cv/lib/python2.7/site-packages/psycopg2/__init__.py:144: UserWarning: The psycopg2 wheel package will be renamed from release 2.8; in order to keep installing from binary please use "pip install psycopg2-binary" instead. For details see: http://initd.org/psycopg/docs/install.html#binary-install-from-pypi.
""")
Exception in Tkinter callback
Traceback (most recent call last):
File "/usr/local/Cellar/python/2.7.14/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-tk/Tkinter.py", line 1541, in __call__
return self.func(*args)
File "LoadDatabase1.py", line 81, in SubmitDataBase
image = face_recognition.load_image_file(self.filename)
AttributeError: LoadDatabase instance has no attribute 'filename'

@badory Happy to help. First of all, Tkinter is only used for the GUI, so if you get Tkinter errors, try the command line interface in the /command_line_interface folder, which does pretty much the same thing. As for the psycopg2 message, that library is required for connecting to the PostgreSQL database, and it seems you're having problems with the wheel package. Have you tried reinstalling it? Run sudo apt-get update, install pip for your Python version (pip2 or pip3), then pip install --upgrade pip and pip install psycopg2, making sure it's installed in your Python virtual environment. Maybe try uninstalling and reinstalling. I'm not sure whether you're using Python's virtualenv or which version of Python you're using; I believe I was using Python 2.7.

Online, this seems to be the solution for updating psycopg2:
sudo apt-get install python3-pip
sudo apt-get install libpq-dev
sudo pip3 install psycopg2

@mmelatti Thanks for the suggestions. I've already used the CLI that you provided and successfully loaded a picture into the database and recognized it too. However, I still receive an error when loading the picture using load.py, as follows:
Traceback (most recent call last):
File "load_db.py", line 40, in
self.SetDefault()
NameError: name 'self' is not defined

I also just realized that the GUI works properly if I don't use the default picture and choose another picture. I installed psycopg2 and created a new environment with Python 3 installed, yet I receive the following error:
Exception in Tkinter callback
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/tkinter/__init__.py", line 1702, in __call__
return self.func(*args)
File "LoadDatabase1py3.py", line 98, in SubmitDataBase
self.SetDefault()
File "LoadDatabase1py3.py", line 65, in SetDefault
self.entry_3.delete(0, Tkinter.END)
NameError: name 'Tkinter' is not defined

It looks like it gets confused between Tkinter (for Python 2) and tkinter (for Python 3). In other words, it refers to Tkinter instead of tkinter even though I'm using Python 3, which I think caused the problem.

Thank you again.

@ageitgey - Did you read about the FB class action suit and revised Facebook privacy terms to come out where users opt-in to face recognition features (e.g. auto-tagging) ??? I think it had something to do with storing biometric (face) data...

https://techcrunch.com/2018/04/16/judge-says-class-action-suit-against-facebook-over-facial-recognition-can-go-forward/

@mmelatti, thanks for your code :) Are you using KNN search anywhere in the database? Why not store ML models of the encodings in the database rather than the encodings themselves? Isn't the search time lower with models?

@MLDSBigGuy I haven't tested the database with models instead of encodings. However, the PostgreSQL database is a spatial database that I assume is optimized for the cube data type. I haven't personally tested a database with 50+ million entries, but I believe other people have commented in issues in this face_recognition repo about testing databases of that scale and did not mention search time as a major issue. I think that with that many faces in the database you run into other issues, which are more related to the accuracy of the API, since it uses training data that is not incredibly diverse. Unless you're Google or Facebook, it's hard to get millions of tagged images. I believe the training for this API used IMDB (Stanford study), which has proven to be most accurate with white faces. Furthermore, you're dealing with pixels, not vector images, so detail is lost. I believe you will see the best results when comparing photos of the same resolution. Anybody, please correct me if I'm wrong; I may be confusing this with another API I was reading about.

Hi @mmelatti,
Thank you for your reply,

I started with MongoDB to store the encodings. I pickled the numpy arrays as one object and stored it in a Mongo document. I am planning to compute the Euclidean distance in application code after loading all the arrays from the database. Am I doing this correctly? Any ideas? Why didn't you simply pickle the encodings and load them all into application code for comparison?

For very large numbers, having the database search for an encoding may be faster than comparing in application code. (Application code may just have to loop through the entries, because we can't hash these encodings for lookup when we need the closest match. I need confirmation on this.) I'm not that familiar with the specific details of the API. Additionally, I haven't tried it, but I'm not sure how, on the management side, you would remove entries from application code, so management is a plus or minus.
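The application-side approach discussed here amounts to a linear scan over every stored encoding. A minimal stdlib-only sketch, using tiny 3-value stand-ins for the 128-value encodings, could look like this; the names, values, and the 0.6 tolerance are made up for illustration, and closest_name is a hypothetical helper. Note that pickle.loads should only be used on data you trust.

```python
import math
import pickle

# Stand-in for a blob that would be stored in one database document.
stored_blob = pickle.dumps({
    "alice": [0.10, 0.20, 0.30],
    "bob": [0.90, 0.80, 0.70],
})


def closest_name(blob, unknown, tolerance=0.6):
    """Linear scan: return the nearest stored name, or None if too far away."""
    known = pickle.loads(blob)  # only unpickle data from a trusted source
    name, distance = min(
        ((n, math.dist(enc, unknown)) for n, enc in known.items()),
        key=lambda pair: pair[1],
    )
    return name if distance <= tolerance else None


print(closest_name(stored_blob, [0.11, 0.19, 0.31]))
```

The cost grows linearly with the number of stored encodings, which is the trade-off against a database index that the comment above is pointing at.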

@mmelatti Thank you :) Could you please share the SQL code you used to calculate the Euclidean distance in Postgres? I see that query operations usually do not include a direct distance calculation feature. Mongo has array push, pop, and other minimal operations, but no direct distance calculation query. I just want to learn from your code so I can do the same with Mongo.

Thank you,

Pretty sure an example of how the SQL should look is located right at the top of this thread (posted by ageitgey):

SELECT * FROM my_stored_encodings
ORDER BY
    sqrt(
        power(e1 - TEST_ENCODING_VALUE_0_HERE, 2) +
        power(e2 - TEST_ENCODING_VALUE_1_HERE, 2) +
        power(..... etc....
    )

@mmelatti thank you very much 👍

How can you improve the accuracy if you store the images in a database? Do we need to store more images of the same person?

Imagine I store 10 images of the same person. Could accuracy then be improved?

Regarding speed, if I have 50,000 different faces, can my response time still be in milliseconds if I check against every entry in the Postgres database? (It doesn't matter even if I get wrong matches.)

I see that Postgres implements KNN indexing with the cube <-> Euclidean distance operator. Is this the same as training a KNN machine learning model?

Thank you,

@MLDSBigGuy You might want to check out this article on convolutional neural networks for face recognition: https://hackernoon.com/building-a-facial-recognition-pipeline-with-deep-learning-in-tensorflow-66e7645015b8

You should still be able to get fast queries from a spatial database even with many entries in the DB. The neural network converts feature maps into a vector space (a 128-dimension embedding), and in that vector space we can use vector distance to determine how similar two identities are. I don't believe the KNN indexing is the same as the API's. Besides the obvious advantage of persistent data, I would take a look at some key differences here:
https://pdfs.semanticscholar.org/237b/77328f6f1c75ba4fcdca131b0d95f6bb54b3.pdf
https://en.wikipedia.org/wiki/Spatial_database
https://dl.acm.org/citation.cfm?id=280279&dl=ACM&coll=DL

I don't believe storing 10 images of the same person in the database will improve accuracy. If done for all persons, you would be blurring identities in the vector space, and it also seems counter-intuitive to how the convolutional neural network works, except in special cases. I believe the best approach to improving accuracy, short of creating a better and more accurate convolutional neural network, would be not to increase the number of stored images of the same person but to analyze more frames of a person in real-time surveillance. In other words, if you are doing video surveillance, you can look at multiple frames of a person walking by and determine who it is across multiple frames and angles. The industry seems to be taking this approach for increased accuracy.

best,
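One simple way to combine several frames, sketched here under assumptions that are not from the thread, is a majority vote over the per-frame lookups. The helper name identify_across_frames, the sample names, and the min_votes value are hypothetical; each list element stands for the name returned by one database query, or None when that frame had no match.

```python
from collections import Counter


def identify_across_frames(per_frame_matches, min_votes=3):
    """Return the most frequent name if it has enough votes, else None."""
    votes = Counter(name for name in per_frame_matches if name is not None)
    if not votes:
        return None
    name, count = votes.most_common(1)[0]
    return name if count >= min_votes else None


frames = ["ann", "ann", None, "bob", "ann"]  # one lookup result per frame
print(identify_across_frames(frames))
```

A single wrong match in one frame then no longer decides the identity on its own.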

Thank you very much @mmelatti for the links and detailed answer.

Yes, you are right. Matching the person walking through video surveillance is good 👍

I see that in your Postgres setup there is no index on the encodings. Won't queries be much faster with an index?

If so, can you please tell me whether I need to create the index when I first create the table or when I query entries?

Thank you,

How can I use face_recognition to load multiple images? please guide me, I am new to python..
I've asked it in detail on stackoverflow, please see and please guide me:
https://stackoverflow.com/questions/53042959/dynamically-store-images-via-face-recognition

@Asad2195 ageitgey/face_recognition provides exactly that functionality. See The examples in the
Example from project: (identify_and_draw_boxes_on_faces.py)

import face_recognition
from PIL import Image, ImageDraw

# This is an example of running face recognition on a single image
# and drawing a box around each person that was identified.

# Load a sample picture and learn how to recognize it.
obama_image = face_recognition.load_image_file("obama.jpg")
obama_face_encoding = face_recognition.face_encodings(obama_image)[0]

# Load a second sample picture and learn how to recognize it.
biden_image = face_recognition.load_image_file("biden.jpg")
biden_face_encoding = face_recognition.face_encodings(biden_image)[0]

# Create arrays of known face encodings and their names
known_face_encodings = [
    obama_face_encoding,
    biden_face_encoding
]
known_face_names = [
    "Barack Obama",
    "Joe Biden"
]

This loads faces. This thread specifically covers how to store face encodings in a spatial database and query for likeness. I have a public repo that deals with a PostgreSQL spatial DB.

Also check out ageitgey knn examples

Hi @ageitgey
I used the database 'sqlite3'

SELECT * from my_stored_encodings ORDER BY sqrt( power(e1 - TEST_ENCODING_VALUE_0_HERE, 2) + power(e2 - TEST_ENCODING_VALUE_1_HERE, 2) + power(..... etc.... )

I'm sorry to bother you, but when I ran this against the database I got the following error:

'Error: no such function: sqrt'

What's the point of using a database if you can't compute Euclidean distances inside a table?
Or is it just that the database I'm using doesn't support the 'sqrt' function?

@jackweiwang SQLite is not a traditional database. It's a minimal database that only works for some simple uses. It doesn't have the math functions you'd need to do this calculation in the database, so it wouldn't be a good choice for this kind of use.
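For small experiments, one possible workaround is to order by the squared distance, written with plain multiplication, since sqrt() does not change the ordering. The sketch below uses Python's built-in sqlite3 module with an in-memory table and only three columns (e1 to e3) instead of 128 for brevity; the rows are made-up sample data. It does not address whether SQLite is a sensible choice at the 1M scale discussed in this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_stored_encodings (name TEXT, e1 REAL, e2 REAL, e3 REAL)"
)
conn.executemany(
    "INSERT INTO my_stored_encodings VALUES (?, ?, ?, ?)",
    [("alice", 0.1, 0.2, 0.3), ("bob", 0.9, 0.8, 0.7)],
)

test = (0.12, 0.18, 0.31)
# Squared distance: sqrt() is monotonic, so dropping it keeps the same order.
terms = " + ".join(f"(e{i} - ?) * (e{i} - ?)" for i in range(1, 4))
params = [v for v in test for _ in range(2)]  # each value appears twice per term
row = conn.execute(
    f"SELECT name FROM my_stored_encodings ORDER BY {terms} LIMIT 1", params
).fetchone()
print(row[0])
conn.close()
```

If the actual distance value is needed, for example for a threshold, the square root can be taken in application code on the returned rows.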

@khaledabbad Can you help me with the Apache Solr part?

@khaledabbad
Hello! Please let me see your solr config for this and instructions.
Thank you!

@khaledabbad
I am getting around 1.6 seconds for 425,089 images with the default configuration.
Is there any way to get this down?

@deimsdeutsch
Great result!
Are you using Solr or not?
Can you show an example of your method?

@deimsdeutsch Can you describe the steps you took to use Solr?

@ageitgey I have a similar use case: matching one person's encoding against millions of ever-growing encodings.
Can we use clustering (for example, k-means) to create groups of similar images? Then, when a new image comes in, we first match it with the nearest cluster and then with the nearest encoding within that cluster. The distance method can be the same Euclidean distance, or another one. What are your observations on this approach?
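The two-stage lookup described in this question can be sketched as follows. This is only an illustration of the idea, not something tested in the thread: the centroids are assumed to have been computed elsewhere (for example by k-means), the encodings are 2-value stand-ins for the 128-value ones, and all function names are hypothetical. One known caveat is that the true nearest encoding can sit in a neighbouring cluster, so searching only one cluster gives an approximate result.

```python
import math
from collections import defaultdict


def nearest(point, candidates):
    """Index of the candidate closest to point (Euclidean distance)."""
    return min(range(len(candidates)), key=lambda i: math.dist(point, candidates[i]))


def build_buckets(encodings, centroids):
    """Group encoding indices by their nearest centroid."""
    buckets = defaultdict(list)
    for idx, enc in enumerate(encodings):
        buckets[nearest(enc, centroids)].append(idx)
    return buckets


def search(unknown, encodings, centroids, buckets):
    """Pick the nearest cluster first, then the nearest encoding inside it."""
    members = buckets.get(nearest(unknown, centroids), [])
    if not members:
        return None
    return min(members, key=lambda i: math.dist(unknown, encodings[i]))


encodings = [[0.0, 0.1], [0.1, 0.0], [1.0, 0.9], [0.9, 1.0]]
centroids = [[0.05, 0.05], [0.95, 0.95]]  # assumed to come from k-means
buckets = build_buckets(encodings, centroids)
match = search([0.92, 0.97], encodings, centroids, buckets)
print(match)
```

Checking the two or three nearest clusters instead of only one is a common way to reduce misses near cluster boundaries, at the cost of more comparisons.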

@khaledabbad @anigogo @deimsdeutsch @psyapathy

Hey, how should we design the schema.xml for Apache Solr in this situation? Do you currently have any examples?

Is it okay to create 128 _float_ fields like this in my schema.xml file?

<fieldType name="encoding_value" class="solr.FloatPointField" omitTermFreqAndPositions="true" omitNorms="true" required="true" stored="false" docValues="true" multiValued="false"/>

<field name="v_0" type="encoding_value"/>
<field name="v_1" type="encoding_value"/>
...
<field name="v_127" type="encoding_value"/>

1. @khaledabbad By "query", do you mean that you took the encoding of a new image and computed the Euclidean distance using Solr? If so, did it take 17 milliseconds to return a prediction?
2. It would be helpful for others if you shared your Apache Solr code for saving the encodings to the database and computing the Euclidean distance on them.

There are two approaches for comparing, right? (1) Distance-based and (2) the KNN approach.
Q1) Which approach would provide more accuracy?
Q2) Can I use cosine similarity for comparing? If yes, what is the accuracy when the data size is very large?
Q3) If the KNN classifier gives more accuracy, how can we implement it in a database? (I am guessing it is tough or nearly impossible, but I am asking out of curiosity :) )

@khaledabbad can you share the source code if possible?

Thanks

Hello,
This was interesting. I tried storing my features in a DB and used them for prediction by calculating the Euclidean distance to compare images.
But I am facing an issue: I stored only one image feature for a single class in the DB. When I use the same image slightly zoomed in or out, the Euclidean distance for that same image is high.
How can we handle this?
Basically, I am not training with multiple images here. If we did, how would we store the features?

Thanks

query = "SELECT first_name, last_name, face_encoding FROM people ORDER BY face_encoding <-> cube(array["+face_encoding_string+"]) LIMIT 1"
ERROR: Exception in Tkinter callback
Traceback (most recent call last):
File "C:\Users\huuthoang\AppData\Local\Continuum\anaconda3\envs\opencv-env\lib\tkinter\__init__.py", line 1705, in __call__
return self.func(*args)
File "D:\FaceID\New folder (2)\FaceRecognition.py", line 105, in SubmitImage
cur.execute(query) #execute query
psycopg2.errors.UndefinedFunction: function cube(text[]) does not exist
LINE 1: ...e_encoding FROM people ORDER BY face_encoding <-> cube(array...
^
HINT: No function matches the given name and argument types. You might need to add explicit type casts.

How do I contact you so you can help me resolve the error I'm having

Hi,
Thank you for the sample. I have encoded 250 faces in a MySQL DB, and Face_Recognition is trying to recognize faces from the DB using the above logic.

It recognizes incorrectly, whereas when we use the pictures in the code and use "face_distances = face_recognition.face_distance(known_face_encodings, face_encoding)" it works well.

Can you kindly suggest a solution for identifying the correct user name from the DB?

Thanks

Hi,
I tried to use your amazing project for face recognition (for example, on a single unknown image) against a large number of known images (1 million), but it's really slow.
I was thinking of storing the face_encodings of all known images in a database using psycopg2.
Can anyone share source code, if possible, for creating the database and then querying it to find the corresponding face?
Thanks

Sorry for asking on this old thread but,
given this formula:

      sqrt(
         power(e1 - TEST_ENCODING_VALUE_0_HERE, 2) + 
         power(e2 - TEST_ENCODING_VALUE_1_HERE, 2) + 
         power(..... etc.... 
     )

is the e1 supposed to be the column name and the TEST_ENCODING_VALUE_0_HERE the literal value?

Never mind, it is exactly as I said: e1 is the column name, while TEST_ENCODING_VALUE_0_HERE is the literal value.

Here is a small template (jinja2) that I came up with so that writing such a query is less tedious:

{% macro euclidea_distance(column_name) %}
sqrt(
    {% for i, term in encodings -%}
    power({{column_name}}{{ i }} - ({{ term }}), 2) + 
    {% endfor -%}
0)
{% endmacro %}
SELECT  *, {{ euclidea_distance("TERM_") }} as distance 
FROM faces f
HAVING distance < 0.5
ORDER BY distance
LIMIT 30

I found that a distance of less than 0.5 gives very good results.
Then you just provide the terms to the template as index-value pairs, like this:

template.render(encodings=enumerate(face_encodings))