Authors: Harshit Nigam, Mohammad Nabigh Abbas, Mohneesh Tiwari, Himanshu Mali Shalaj, Nida Hasib
Facial recognition, arguably the biggest breakthrough in biometric identification and security since fingerprints, uses an individual’s facial features to identify and recognize them. A technology that once seemed far-fetched, straight out of a science fiction novel, is now available on the smartphones in the palm of our hands. Facial recognition has gained traction as a primary method of identification, whether in mobile phones, smart security systems, ID verification or something as simple as logging in to a website. Recent strides in facial recognition technologies have made it possible to design, build and implement a facial recognition system ourselves. Using computer vision and machine learning libraries such as face_recognition and Dlib, we can create a robust system that detects faces and then matches them against a database of pre-loaded facial data to recognize them. This study conducts a literature review of these technologies and of other advancements in computer-vision-based facial recognition reported by other scholars in their research papers. The paper analyzes these domains to understand how the machine learning models work and how they are implemented in facial recognition systems. The research conducted during this review will underpin the creation of a proof-of-concept prototype facial recognition system.
Humans distinguish and identify faces based on the location, size and shape of facial features such as the nose, ears, lips, eyes and cheekbones. The face is highly non-rigid, and many of its details reflect individual differences. Generally, face recognition involves two phases: face detection and face recognition. Face detection means locating a face in an image. Face recognition then finds the matching identity by comparing the faces found in a static image or in video. It is generally used for identification and is a subset of biometric recognition. Computers that recognize faces could be applied to a wide variety of problems, including criminal identification, security systems, image and film processing, and human-computer interaction.
Face recognition research began in the early 1950s. In 1966, Bledsoe et al. studied human facial recognition based on pattern recognition, which marked the preliminary development of facial recognition technology. In 1987, Sirovich and Kirby applied principal component analysis (PCA) to face images for feature extraction. Building on PCA, Turk and Pentland developed the Eigenface method in 1991, which is considered a major milestone in the technology. Local binary pattern analysis for texture recognition was introduced in 1994 and was later adapted to facial recognition by incorporating histograms (LBPH). In 1996, the Fisherface method was developed using linear discriminant analysis (LDA) for dimensionality reduction; it can identify faces under different illumination conditions, which was a weakness of the Eigenface method.
In 1997, a facial detection system was created that could detect a particular face in a crowd. Face recognition methods based on machine vision have since achieved strong results. A recognition system must account for the intra-class changes caused by facial expression, posture, age, location and occlusion, and the inter-class changes caused by factors such as lighting and backdrop. Both kinds of change are complex and nonlinear.
Traditional methods often fail to achieve the desired result when the distribution of intra-class and inter-class changes is complex. Deep learning simulates the cognitive learning of human visual perception and can obtain higher-level features, which can be used to handle the intra-class and inter-class changes in facial recognition.
This paper summarizes facial recognition technology based on deep learning and lists the basic model structure of deep learning. It also summarizes the research on the Dlib library maintained by Davis King and on Adam Geitgey’s face_recognition module. The underlying network was trained by Davis King on a dataset of about 3 million faces. On the Labeled Faces in the Wild (LFW) benchmark, the network is comparable to other state-of-the-art methods, reaching an accuracy of 99.38%. The paper also summarizes other technologies related to and used in facial recognition.
II. DEVELOPMENT STAGE OF FACIAL RECOGNITION AND RELATED TECHNOLOGIES
A. Face Recognition
The face_recognition module lets you recognize and manipulate faces from Python. It is built on dlib’s state-of-the-art face recognition, which is itself built with deep learning. It also provides a simple command-line tool that runs face recognition on a folder of images.
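As an illustration of how the module might be called from Python, the following is a minimal sketch and not the authors' implementation. It assumes the third-party face_recognition package is installed; the function names `pick_name` and `recognize`, the `known_images` mapping and any file paths passed in are hypothetical. The tolerance of 0.6 is the default used by the library's `compare_faces` function.

```python
def pick_name(names, distances, tolerance=0.6):
    """Return the name with the smallest distance, or None if no
    distance is within the tolerance."""
    if not distances:
        return None
    best = min(range(len(distances)), key=lambda i: distances[i])
    return names[best] if distances[best] <= tolerance else None


def recognize(known_images, unknown_image_path, tolerance=0.6):
    """known_images maps a person's name to the path of one photo of them.

    Returns one name (or None) for every face found in the unknown image.
    """
    # Third-party dependency, imported here so that pick_name can be
    # used on its own where the package is not installed.
    import face_recognition

    names, encodings = [], []
    for name, path in known_images.items():
        image = face_recognition.load_image_file(path)
        found = face_recognition.face_encodings(image)
        if found:  # skip photos in which no face was detected
            names.append(name)
            encodings.append(found[0])

    unknown = face_recognition.load_image_file(unknown_image_path)
    results = []
    for encoding in face_recognition.face_encodings(unknown):
        distances = list(face_recognition.face_distance(encodings, encoding))
        results.append(pick_name(names, distances, tolerance))
    return results
```

A lower tolerance makes matching stricter; the matching rule is kept in its own small function so it can be checked without any image files.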
B. Features of Face_Recognition Module
C. Dlib
Dlib is a modern C++ toolkit containing machine learning algorithms and tools for creating complex software in C++ to solve real-world problems. In facial recognition, it is used to train the model to recognize a face in a digital image.
Major features of Dlib
D. Deep Learning features and Classifying Models
III. INNER WORKING OF THE FACIAL RECOGNITION MODULE
The face recognition process is divided into four parts:
A. Facial Detection
The technique used for facial detection is called Histogram of Oriented Gradients (HOG). The input is first converted to grayscale, since color data is not required. Each pixel is then examined together with the pixels surrounding it, and an arrow is drawn in the direction in which the image shifts from light to dark. This process is repeated for every pixel in the image. The arrows that replace the pixels are called gradients, and they plot the flow from light to dark across the input. Because only the direction of the brightness change is considered, a bright and a dark image of the same person yield nearly the same representation. A gradient for every pixel is more detail than is needed; to save space and to see the forest rather than the trees, only the higher-level pattern of gradients is kept. The image is divided into 16x16 pixel squares; in each square, the gradients pointing in each major direction are counted, and the square is replaced by one large arrow in the most common direction. The end result represents the basic structure of a face. This HOG representation is compared with a known HOG face pattern, and the regions of the image that resemble it most closely are reported as faces.
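The cell-wise step described above can be sketched in plain Python. This is a simplified illustration, not Dlib's detector: the function name `dominant_directions`, the choice of 9 orientation bins and the synthetic test image are assumptions made for the example, and the final comparison against a known face pattern is omitted.

```python
import math


def dominant_directions(gray, cell=16, bins=9):
    """Return the strongest orientation bin for every cell x cell square.

    gray is a list of rows of brightness values. Bin k covers the
    unsigned angles from k * 180 / bins up to (k + 1) * 180 / bins degrees.
    """
    height, width = len(gray), len(gray[0])
    rows, cols = height // cell, width // cell
    histograms = [[[0.0] * bins for _ in range(cols)] for _ in range(rows)]

    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Brightness change along each axis (central differences).
            gx = gray[y][x + 1] - gray[y][x - 1]
            gy = gray[y + 1][x] - gray[y - 1][x]
            magnitude = math.hypot(gx, gy)
            row, col = y // cell, x // cell
            if magnitude == 0 or row >= rows or col >= cols:
                continue
            # Unsigned orientation: light-to-dark and dark-to-light
            # along the same line fall into the same bin.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            index = min(int(angle * bins / 180.0), bins - 1)
            histograms[row][col][index] += magnitude

    def strongest(histogram):
        return max(range(bins), key=lambda k: histogram[k])

    return [[strongest(h) for h in row] for row in histograms]


# A synthetic 32x32 image that gets brighter from left to right,
# and a uniformly brighter copy of it.
ramp = [[4 * x for x in range(32)] for _ in range(32)]
brighter = [[value + 50 for value in row] for row in ramp]
print(dominant_directions(ramp) == dominant_directions(brighter))
```

The two images differ only by a constant brightness offset, so their gradients, and therefore their dominant directions, are identical.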
B. Framing and Projecting
Once detected and isolated, faces are posed in a pre-defined manner so that images of the same person at different angles, turns and orientations are not treated as different people. This is done by warping the image so that the eyes and the lips are always positioned in the same place. An algorithm called face landmark estimation is used for this. The algorithm tries to find 68 specific points on a face, called landmarks, located at the edges of the eyes, the chin, the eyebrows, the contour of the mouth, and so on. Once these landmarks are found and the locations of the eyes and mouth are known, the image is rotated, scaled and sheared so that these features are centered as well as possible without distorting the image. This kind of transformation is called an affine transformation.
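To make the alignment step concrete, the sketch below computes a transform from two eye landmarks only. It is a simplification under stated assumptions: it uses a similarity transform (rotation, uniform scale and translation, a special case of an affine transformation) rather than a fit over all 68 landmarks, and the function names, landmark coordinates and target eye positions are hypothetical.

```python
def similarity_from_eyes(left_eye, right_eye, target_left, target_right):
    """Return the 2x3 matrix that maps the two detected eye positions
    exactly onto the two target positions."""
    s0, s1 = complex(*left_eye), complex(*right_eye)
    d0, d1 = complex(*target_left), complex(*target_right)
    a = (d1 - d0) / (s1 - s0)  # rotation and uniform scale
    b = d0 - a * s0            # translation
    return [[a.real, -a.imag, b.real],
            [a.imag, a.real, b.imag]]


def apply_affine(matrix, point):
    """Apply a 2x3 affine matrix to an (x, y) point."""
    x, y = point
    return (matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
            matrix[1][0] * x + matrix[1][1] * y + matrix[1][2])


# Hypothetical landmarks from a tilted face, moved to fixed positions
# in a 100x100 aligned image.
matrix = similarity_from_eyes((40.0, 60.0), (80.0, 90.0),
                              (30.0, 40.0), (70.0, 40.0))
print(apply_affine(matrix, (80.0, 90.0)))
```

Every other pixel or landmark of the face can be passed through `apply_affine` with the same matrix, which is what brings the whole face into the standard pose.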
C. Extracting Facial Features and Profile Creation
This step is the core of the model: a deep convolutional neural network (CNN) is trained to extract relevant features from the image, generating 128 measurements for each face. The network is trained by feeding it three face images at a time: an image of a known person (#1), another image of the same person (#2), and an image of a different person (#3).
The algorithm generates measurements for all three images and tweaks the neural network so that the measurements for images #1 and #2 move closer together while the measurements for #2 and #3 move further apart. This process is repeated millions of times on millions of images so that the network learns to reliably generate 128 measurements that distinguish different people. The network shipped with Dlib has already been trained on millions of faces, and the face_recognition module is built on it to produce these measurements accurately and quickly. The process of generating the 128 measurements is called embedding. After embedding, the measurements are stored as numerical data in a profile that the model can read. This facial profile is the final data that is compared with other profiles to find similar ones and recognize faces.
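The training objective described above is commonly formalized as a triplet loss. The sketch below is one such formulation, offered as an assumption: the text does not state the exact loss used to train Dlib's network, the margin of 0.2 is illustrative, and the 2-value vectors stand in for real 128-value embeddings.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two embeddings."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(image1, image2, image3, margin=0.2):
    """Zero once images #1 and #2 (same person) are closer together than
    images #2 and #3 (different people) by at least the margin."""
    same = squared_distance(image1, image2)
    different = squared_distance(image2, image3)
    return max(0.0, same - different + margin)


# Toy embeddings: a well-separated triplet and a poorly separated one.
good = triplet_loss([0.0, 0.0], [0.1, 0.0], [1.0, 0.0])
bad = triplet_loss([0.0, 0.0], [0.1, 0.0], [0.2, 0.0])
print(good, bad)
```

During training, the network weights would be adjusted to reduce this loss over many triplets; a loss of zero means the triplet is already separated well enough and contributes no update.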
D. Recognizing Faces by comparing facial profiles
A simple machine learning classifier is used to compare profiles. The system uses a linear SVM classifier trained on the embedded measurements of the stored profiles. Given the measurements from a new input image, the classifier compares them with the stored profiles, finds the closest match and outputs the name of that person.
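A linear SVM over embeddings can be sketched in plain Python as follows. This is a toy illustration under stated assumptions, not the module's actual classifier: it trains one SVM per person (one-vs-rest) by subgradient descent on the hinge loss, the hyperparameters are arbitrary, and the 2-value embeddings and the names are hypothetical stand-ins for stored 128-value profiles.

```python
def score(weights, bias, embedding):
    """Signed score of an embedding relative to the separating hyperplane."""
    return sum(w * x for w, x in zip(weights, embedding)) + bias


def train_linear_svm(embeddings, labels, epochs=200, rate=0.1, reg=0.01):
    """Fit a linear SVM by subgradient descent on the regularised
    hinge loss. Labels must be +1 or -1."""
    weights, bias = [0.0] * len(embeddings[0]), 0.0
    count = len(embeddings)
    for _ in range(epochs):
        grad_w = [reg * w for w in weights]
        grad_b = 0.0
        for x, y in zip(embeddings, labels):
            if y * score(weights, bias, x) < 1:  # sample violates the margin
                grad_w = [g - y * xi / count for g, xi in zip(grad_w, x)]
                grad_b -= y / count
        weights = [w - rate * g for w, g in zip(weights, grad_w)]
        bias -= rate * grad_b
    return weights, bias


def train_one_vs_rest(profiles):
    """profiles maps a name to a list of embeddings; returns one SVM per name."""
    samples = [(name, e) for name, group in profiles.items() for e in group]
    embeddings = [e for _, e in samples]
    models = {}
    for name in profiles:
        labels = [1 if owner == name else -1 for owner, _ in samples]
        models[name] = train_linear_svm(embeddings, labels)
    return models


def recognize(models, embedding):
    """Return the name whose SVM gives the new embedding the highest score."""
    return max(models, key=lambda name: score(*models[name], embedding))


profiles = {
    "alice": [[0.9, 0.1], [1.0, -0.1]],
    "bob": [[-1.0, 0.1], [-0.9, -0.1]],
}
models = train_one_vs_rest(profiles)
print(recognize(models, [0.8, 0.0]))
```

This sketch always returns the best-scoring stored name; a practical system would also need a rule for rejecting faces that match no stored profile.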
After an in-depth review of these technologies and of the inner workings of their algorithms, we can summarize their use cases and implementations. Computer vision is a broad field with many areas of interest and many implementations. Cross-platform libraries like Dlib can help bridge software and hardware incompatibilities. Dlib can extract features from a still image as well as from a stream of images, such as a video or live stream, and it does so in a way that differentiates between minute details of the facial features. Because Dlib is written in C++, using it from a Python environment makes feature extraction fast. The face_recognition module in Python takes care of the rest of the process by creating individual facial profiles and storing them for later comparison. All of this requires only a single picture as input. This ease of use and the efficiency of the process make the two technologies a good match for a lightweight, software-driven facial recognition system. Using just the camera of a smartphone or laptop, a desktop webcam or a dedicated camera, such a system can extract features, create a facial profile to be stored in the system, and compare previously created profiles with a new one to recognize the face in the input, all from a single snapshot of a face.
Copyright © 2022 Harshit Nigam, Mohammad Nabigh Abbas, Mohneesh Tiwari, Himanshu Mali Shalaj, Nida Hasib. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.