When I started using Facebook, what amazed me most was how quickly the website could identify the person in an image I uploaded without any input from me; it felt like magic. Google Photos automatically creates folders for the different people in your gallery and sorts your photos into them without your help. How do they do this? What’s the secret?
Face recognition is the technology these platforms use. It is a software technique for verifying or identifying a person’s identity. Typically taking a video frame or a digital image as input, it compares the facial features in that image to faces stored in a database. First, a camera detects and recognises a human face. This is easiest when the person is looking straight at the camera, although slightly angled faces can also be detected.
The face is then separated into distinguishable landmarks, which we can call nodal points. Face recognition technology analyses the eighty nodal points, taking measurements such as the distance between your eyebrows. This step is called Face Analysis.
After analysis, each nodal point becomes a number in the application database. The complete numerical code is referred to as a faceprint. Just as everyone has a unique thumbprint, everyone has a unique faceprint.
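To make the idea of a faceprint concrete, here is a minimal sketch that turns a handful of landmark coordinates into a list of numbers. The `faceprint` function, the landmark names, and the coordinates are all hypothetical and chosen only for illustration; this is not how any particular platform computes its faceprints.

```python
import math
from itertools import combinations


def faceprint(landmarks):
    """Turn named (x, y) landmarks into a list of pairwise distances.

    `landmarks` is a hypothetical dict such as {"left_eye": (x, y), ...}.
    Distances are divided by the largest one, so the result does not
    depend on the size of the image.
    """
    names = sorted(landmarks)  # fixed order keeps vectors comparable
    distances = [
        math.dist(landmarks[a], landmarks[b])
        for a, b in combinations(names, 2)
    ]
    scale = max(distances, default=0.0) or 1.0  # avoid dividing by zero
    return [d / scale for d in distances]


# Made-up coordinates for three landmarks.
face = {
    "left_eye": (30.0, 40.0),
    "right_eye": (70.0, 40.0),
    "nose": (50.0, 70.0),
}
print(faceprint(face))
```

Because the distances are normalised, the same face photographed at twice the resolution yields the same list of numbers.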
The final step of the process is finding a match. Your faceprint is compared to a database of other facial codes. The number of faces that are compared depends on the quality and access of the database. This step is called Matching.
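The matching step can be sketched as a nearest-neighbour search: compute the distance from the query faceprint to every faceprint in the database and keep the closest one. The `best_match` function and the three-number faceprints below are assumptions made for illustration, and Euclidean distance is just one possible choice of similarity measure.

```python
import math


def best_match(query, database):
    """Return (index, distance) of the database faceprint closest to `query`."""
    if not database:
        raise ValueError("database is empty")
    # Euclidean distance between the query and every stored faceprint.
    distances = [math.dist(query, candidate) for candidate in database]
    index = min(range(len(distances)), key=distances.__getitem__)
    return index, distances[index]


# Made-up faceprints, three numbers each.
database = [
    [0.1, 0.9, 0.4],
    [0.8, 0.2, 0.5],
    [0.3, 0.3, 0.3],
]
index, distance = best_match([0.75, 0.25, 0.5], database)
print(index, round(distance, 3))
```

In this puzzle the database would hold 100 entries, one per face in the grid, and the returned index identifies the matching grid cell.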
🕵️ Problem Statement
Using these concepts, build your very own face recognition system for this puzzle.
Given a target face image, find that face in a collection of 100 faces. Your input will be a target face image, and your output will be the location of that face within a grid of 100 faces.
💪 Getting Started
This puzzle's dataset is unsupervised, meaning there are no labels to train your model on. Two files are provided in the resources section: data.zip, with 1000 samples, and sample_submission.csv.
The data.zip file contains two folders, missing and target, whose images have corresponding filenames. The missing folder contains images of people who are missing, in jpg format with dimensions 512x512. The target folder contains images (with filenames matching those in the missing folder), each a grid of 100 faces, in jpg format with dimensions 2160x2160 (216x216 for each face image).
data.zip
├── missing
│   ├── fekon.jpg
│   ├── 20sbf.jpg
│   └── ... over 1,000 images
└── target
    ├── fekon.jpg
    ├── 20sbf.jpg
    └── ... over 1,000 images
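Since each 2160x2160 target image holds face tiles of 216x216 pixels, a natural first step is to cut it into a 10x10 grid. The sketch below only computes the crop boxes; `tile_boxes` is a hypothetical helper, and it assumes the tiles are laid out edge to edge with no padding between them.

```python
def tile_boxes(image_size=2160, grid=10):
    """Yield (row, col, box) for every cell of a square grid image.

    Each box is (left, upper, right, lower) in pixels, which is the
    order that Pillow's Image.crop expects.
    """
    tile = image_size // grid  # 216 for the sizes stated above
    for row in range(grid):
        for col in range(grid):
            left, upper = col * tile, row * tile
            yield row, col, (left, upper, left + tile, upper + tile)


boxes = list(tile_boxes())
print(len(boxes), boxes[0], boxes[-1])
```

With Pillow installed, each box could be passed to `Image.crop` to extract one face. The missing images are 512x512, so the crops and the missing image would probably need resizing to a common size before they are compared.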
Your task will be to recognize the person from each image in the missing folder within the corresponding image in the target folder, and to return the location of the recognized person in that target image.
The location of the recognized person is given by the position of the face in the target image's 10x10 grid, written as two digits: the row first, then the column. For example, the top-left face is 00, the top-right face is 09, the bottom-left face is 90, and the bottom-right face is 99.
- Note that all 100 face images in each image of the target folder will be unique and only 1 face image will match the corresponding face image in the missing folder.
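The location code can be sketched as a small helper. Judging from the corner examples (top right is 09, bottom left is 90), the first digit is the row and the second is the column; `location_code` and `parse_location` are hypothetical names, and whether the submission expects the code as a zero-padded string is an assumption here.

```python
def location_code(row, col, grid=10):
    """Encode a zero-based (row, col) grid position as a two-digit string."""
    if not (0 <= row < grid and 0 <= col < grid):
        raise ValueError(f"position ({row}, {col}) is outside the grid")
    return f"{row}{col}"


def parse_location(code):
    """Decode a two-digit location string back into (row, col)."""
    return int(code[0]), int(code[1])


for row, col in [(0, 0), (0, 9), (9, 0), (9, 9)]:
    print((row, col), location_code(row, col))
```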
The following files are available in the resources section:
- data.zip - ( 1k samples ) This zip file contains the missing and target folder.
- sample_submission.csv - ( 1k samples ) This csv file contains the format of your csv file for submitting the results.
Learn to make your first submission using the starter kit 🚀
- Create a submission folder in your working directory.
- Use sample_submission.csv provided in the resources section and replace the target column values with your model's predictions for the corresponding ImageID column.
- Save the CSV in the submission folder as submission.csv.
- Inside the submission directory, put the .ipynb notebook in which you trained the model and generated predictions, and save it as notebook.ipynb.
- Zip the submission directory
Overall, this is what your submission directory should look like
submission
├── assets
│   └── submission.csv
└── original_notebook.ipynb
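Writing the predictions file can be sketched with Python's standard csv module. The column names ImageID and target come from the description of sample_submission.csv; the `write_submission` helper, the form of the image IDs (filenames without extension), and the predicted values are assumptions made for illustration. The output path is a parameter, so it can point wherever the submission layout requires.

```python
import csv
import tempfile
from pathlib import Path


def write_submission(predictions, path):
    """Write {image_id: location_code} pairs as a two-column CSV file."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["ImageID", "target"])  # header row
        for image_id, code in predictions.items():
            writer.writerow([image_id, code])
    return path


# Illustrative predictions only; written to a temporary directory here.
with tempfile.TemporaryDirectory() as tmp:
    out = write_submission(
        {"fekon": "09", "20sbf": "90"},
        Path(tmp) / "submission" / "submission.csv",
    )
    print(out.read_text())
```

It is worth checking the result against sample_submission.csv, in particular the order of the rows and the exact form of the ImageID values.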
Make your first submission here 🚀 !!
🖊 Evaluation Criteria
During evaluation, the F1 score (average=weighted) is used as the primary score and the accuracy score as the secondary score to measure the performance of the model.
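To check predictions locally, both metrics can be sketched in plain Python. This is an illustrative implementation of the usual definitions, in which each class's F1 is weighted by the number of true samples of that class; it is not the evaluator's own code, and the labels below are made up.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true label."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights equal to each class's support."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    support = Counter(y_true)      # true samples per class
    predicted = Counter(y_pred)    # predicted samples per class
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    total = 0.0
    for label, n_true in support.items():
        tp = true_pos[label]
        if tp == 0:
            continue  # F1 is 0 for this class
        precision = tp / predicted[label]
        recall = tp / n_true
        total += n_true * 2 * precision * recall / (precision + recall)
    return total / len(y_true)


y_true = ["00", "09", "90", "99"]
y_pred = ["00", "09", "99", "99"]
print(accuracy(y_true, y_pred), weighted_f1(y_true, y_pred))
```

In this example three of four predictions are correct, so the accuracy is 0.75, while the weighted F1 is lower because class 90 is never predicted and class 99 is predicted once too often.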
- Divyanshu Kumar