Hey there!
Ever wondered how computers can recognize faces? Well, nowadays, it’s not as complicated as it used to be, thanks to the amazing advancements in computer vision. There are libraries like “face_recognition” and “deepface” that make face recognition tasks quite straightforward. You can easily recognize one or two faces, or even a hundred, with just a few lines of code. However, as you might expect, things get a bit tricky when you’re dealing with a large collection of faces. The more faces you have, the more time and effort it takes.
But fear not! In this article, we’re going to dive into how you can tackle this challenge and perform face recognition on a big bunch of faces.
Understanding Embeddings:
First things first, let’s talk about something called “embeddings.” Think of embeddings as unique signatures for each face. These are arrays of numbers that describe the essence of a face. To get these embeddings using Python’s “face_recognition” library, follow these steps:
```python
import face_recognition

# Load the known image (e.g., Joe Biden's face)
known_image = face_recognition.load_image_file("biden.jpg")
biden_embeddings = face_recognition.face_encodings(known_image)[0]
```
When you print out these embeddings, you’ll see an array of 128 numbers. Other deep-learning models may produce embeddings of different lengths.
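One detail worth guarding against: `face_encodings` returns a list with one embedding per detected face, so indexing it with `[0]` raises an `IndexError` when no face is found. Here is a minimal sketch of a guard. The helper name `first_embedding` is hypothetical (it is not part of the library), and the zero-filled array is stand-in data so the snippet runs without an image file.

```python
import numpy as np


def first_embedding(encodings):
    """Return the first embedding from a face_encodings() result, or None.

    `encodings` is the list returned by face_recognition.face_encodings():
    one array per detected face, and an empty list if no face was found.
    """
    if len(encodings) == 0:
        return None
    return np.asarray(encodings[0])


# Stand-in data (assumption): in practice, pass
# face_recognition.face_encodings(known_image) instead.
fake_encodings = [np.zeros(128)]

embedding = first_embedding(fake_encodings)
print(embedding.shape)       # (128,)
print(first_embedding([]))   # None
```

Checking for `None` before storing or comparing an embedding keeps one bad photo from crashing a long batch job.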
Calculating Similarity:
Now, what’s the use of these embeddings? Well, they help us compare faces. Let’s say we have another face, and we want to see how similar it is to Joe Biden’s face. We can use mathematical measures like “cosine similarity” or “Euclidean distance” for this.
Here’s how you calculate cosine similarity:
```python
from numpy import dot
from numpy.linalg import norm


def cosine_similarity(list_1, list_2):
    cos_sim = dot(list_1, list_2) / (norm(list_1) * norm(list_2))
    return cos_sim
```
In simple terms, the closer the similarity score is to 1, the more alike the faces are. A score of 0.86, for example, points to two quite similar faces, although it measures the angle between the two embeddings rather than a literal percentage.
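To get a feel for the range of the score, here is a small worked example on toy vectors instead of real face embeddings. It is only a sketch: the vectors are made up for illustration, and the function repeats the formula above with explicit float conversion.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors, in the range [-1, 1]."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


same_direction = cosine_similarity([1, 2, 3], [2, 4, 6])  # parallel vectors
orthogonal = cosine_similarity([1, 0], [0, 1])            # perpendicular vectors
opposite = cosine_similarity([1, 0], [-1, 0])             # opposite directions

print(round(same_direction, 6), round(orthogonal, 6), round(opposite, 6))
# 1.0 0.0 -1.0
```

Note that the length of the vectors does not matter here: `[2, 4, 6]` is twice as long as `[1, 2, 3]` and still scores 1.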
Using Vector Databases:
But wait, when you have a ton of faces, calculating similarity for each pair of faces can be slow and memory-intensive. This is where “vector databases” come to the rescue. Think of a vector database as a smart way to store and quickly retrieve embeddings.
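For comparison, the brute-force approach that a vector database replaces looks roughly like the sketch below. The function name `brute_force_nearest` and the random stand-in embeddings are assumptions made for illustration. Every query is compared against every stored row, so the cost grows linearly with the number of faces.

```python
import numpy as np


def brute_force_nearest(query, known, top_k=5):
    """Return (indices, similarities) of the top_k most similar rows.

    `known` is an (n_faces, dim) array of stored embeddings and `query`
    is a single (dim,) embedding. Every stored row is compared.
    """
    known = np.asarray(known, dtype=float)
    query = np.asarray(query, dtype=float)
    sims = known @ query / (np.linalg.norm(known, axis=1) * np.linalg.norm(query))
    order = np.argsort(-sims)[:top_k]  # highest similarity first
    return order, sims[order]


rng = np.random.default_rng(0)
known = rng.normal(size=(10_000, 128))            # random stand-ins for face embeddings
query = known[42] + 0.01 * rng.normal(size=128)   # slightly perturbed copy of row 42

indices, scores = brute_force_nearest(query, known, top_k=3)
print(indices[0])  # 42
```

This works fine for thousands of faces. An index-based store becomes more attractive as the collection grows or when the embeddings no longer fit comfortably in memory.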
Let’s take “ChromaDB” as an example. Here’s how you can use it for your face recognition task:
First, create a collection to store your images:
```python
import chromadb

# Choose where to store the database
# (placeholder path; use any writable directory)
path = "./facedb_store"
client = chromadb.PersistentClient(path=path)

db = client.get_or_create_collection(
    name='facedb',
    metadata={
        "hnsw:space": 'cosine',
    },
)
```
Now, you can add your embeddings to the database:
```python
db.add(
    ids=['1'],
    embeddings=[embeds],  # Replace with your embeddings
    metadatas=[{'name': 'Joe Biden'}]
)
```
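Since `add` takes parallel lists, many faces can go in with one call. The sketch below only builds those lists. `build_add_payload` is a hypothetical helper, the second name is a placeholder, and the 3-number vectors stand in for real 128-number embeddings.

```python
def build_add_payload(named_embeddings):
    """Turn {name: embedding} into the parallel lists that db.add() takes."""
    ids, embeddings, metadatas = [], [], []
    for i, (name, embedding) in enumerate(named_embeddings.items(), start=1):
        ids.append(str(i))                                # ids are strings
        embeddings.append([float(x) for x in embedding])  # plain Python floats
        metadatas.append({"name": name})
    return {"ids": ids, "embeddings": embeddings, "metadatas": metadatas}


# Toy data (assumption): tiny vectors instead of real face embeddings.
faces = {
    "Joe Biden": [0.1, 0.2, 0.3],
    "Another Person": [0.3, 0.1, 0.5],
}

payload = build_add_payload(faces)
print(payload["ids"])  # ['1', '2']

# With the collection created earlier, this would be passed as:
# db.add(**payload)
```

A dict keyed by name holds only one embedding per person. To store several photos of the same person, a list of `(name, embedding)` pairs would be a better input.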
To search for similar faces in the database:
```python
results = db.query(
    query_embeddings=[unknown_embeddings],  # Replace with your unknown embeddings
    n_results=5
)
```
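The results contain the IDs and metadata of the closest stored faces, along with their distances to the query. With the cosine space, a smaller distance means a closer match.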
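Turning those results into a yes/no answer usually comes down to a distance cutoff. Here is a minimal sketch. `best_match` is a hypothetical helper, the dictionary is hand-written in the nested-list shape of a query result (one inner list per query embedding), and the `0.1` cutoff is an arbitrary assumption that you would tune on your own data.

```python
def best_match(results, max_distance=0.1):
    """Return (name, distance) of the closest hit, or None if it is too far.

    `results` mirrors the shape of a query result: each field holds one
    inner list per query embedding, ordered from closest to farthest.
    """
    ids = results["ids"][0]
    if not ids:
        return None
    distance = results["distances"][0][0]
    if distance > max_distance:
        return None
    return results["metadatas"][0][0]["name"], distance


# Hand-written example data (assumption), not real query output.
example = {
    "ids": [["1", "7"]],
    "distances": [[0.04, 0.31]],
    "metadatas": [[{"name": "Joe Biden"}, {"name": "Someone Else"}]],
}

print(best_match(example))                     # ('Joe Biden', 0.04)
print(best_match(example, max_distance=0.01))  # None
```

A stricter cutoff reduces false matches at the cost of more "unknown" answers, so it is worth checking it against photos you have already labeled.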
Understanding Distance Metrics:
I ran some experiments with different distance metrics in ChromaDB. In the plots, blue indicates that the faces match and red that they don’t.
- Cosine: Cosine similarity measures angles between vectors.
- L2 (Euclidean): Euclidean distance measures straight-line distances between points.
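The practical difference between the two shows up when vectors point the same way but differ in length. This sketch uses made-up toy vectors, and cosine distance is taken as one minus cosine similarity.

```python
import numpy as np


def cosine_distance(a, b):
    """One minus the cosine similarity: 0 means same direction."""
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def l2_distance(a, b):
    """Straight-line (Euclidean) distance between two points."""
    return float(np.linalg.norm(a - b))


a = np.array([1.0, 2.0, 2.0])
b = 3.0 * a  # same direction, three times longer

print(round(cosine_distance(a, b), 6))  # 0.0
print(round(l2_distance(a, b), 6))      # 6.0
```

Cosine ignores the length of the vectors and L2 does not, which is one reason the two metrics can need different match thresholds on the same embeddings.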
Using the “facedb” Package:
To make your life easier, I’ve bundled all this functionality into a handy package called “facedb.” You can install it with a simple pip command:
```bash
pip install facedb
```
Here’s how you can use it:
```python
# Import the FaceDB library
from facedb import FaceDB

# Create a FaceDB instance and specify where to store the database
db = FaceDB(
    path="facedata",
)

# Add a new face to the database
face_id = db.add("Joe Biden", img="joe_biden.jpg")

# Recognize a face from a new image
result = db.recognize(img="new_face.jpg")

# Check if the recognized face matches the one in the database
if result and result["id"] == face_id:
    print("Recognized as Joe Biden")
else:
    print("Unknown face")
```
For More Use Cases:
If you’re interested in exploring more use cases and diving deeper into the code, you can check out the GitHub repository. There, you’ll find additional examples and resources to help you with your face recognition projects.
So, go ahead and give it a try! Goodbye, and I hope you find this information helpful for your face recognition projects!
Original article: Face Recognition on a Large Collection of Faces with Python