1 Introduction

Due to its use in security, surveillance, access control, and personalized marketing, face recognition is a rapidly expanding field. It involves extracting and recognizing human faces from photos or videos using computer algorithms. As the demand for face recognition technology keeps growing, there has been an increase in research aimed at creating more precise, efficient, and reliable face recognition systems. These systems rely on a range of methodologies, from basic machine learning approaches to more complex deep learning architectures. The goal of this survey paper is to give readers a broad overview of the current state of facial recognition technology. The most widely used algorithms, approaches, and datasets are covered, along with the most recent research and breakthroughs in the area.

Face recognition technology enables a computer system to identify or confirm a person's identity by examining and comparing their facial features to a database of recognized faces. The system employs algorithms that examine and contrast many facial characteristics, including the separation between a person's eyes, the nose's shape, the breadth of the mouth, and the curves of the face. Face recognition has several applications, including social networking, biometric authentication, security systems, and law enforcement. It frequently works in conjunction with other biometric technologies, such as iris and fingerprint scanning, to provide more reliable and accurate identification systems.

Face recognition is a complex process that involves multiple steps. In the initial face detection step, the system uses object detection techniques to find and locate human faces in an image or video. In the subsequent face alignment stage, the detected faces are aligned to a standardized position and orientation. The algorithm then extracts the important facial characteristics from each face, such as the separation between the eyes and the contours of the nose and face, which are often represented as mathematical vectors. Next, the system performs face matching, checking the feature vector of a detected face against a database of recognized faces to find a match; deep neural networks and support vector machines are two machine learning algorithms commonly used for this. Finally, the system makes a decision based on the degree of similarity between the feature vector of the detected face and those in the database: if the similarity score rises above a particular threshold, the face is recognized as a match; otherwise, it is rejected.

A face recognition system's performance and accuracy are influenced by a number of factors, including the quality of the input images, the complexity of the facial features, and the size and quality of the database of recognized faces. A number of difficulties must also be overcome during face matching, including changes in lighting, pose, expression, and occlusion. Researchers are developing more sophisticated algorithms that can handle these difficulties and use additional information, such as 3D face models, to improve the robustness and accuracy of face matching.

The face recognition process typically involves the following steps:

  1. Face detection

  2. Feature extraction

  3. Face matching

  4. Verification or identification.
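The four steps above can be sketched end to end in plain Python. This is only an illustrative skeleton: the function names, the three-number "feature vectors," and the 0.6 threshold are invented placeholders, not a real face recognition API.

```python
# Hypothetical sketch of the four-step pipeline: detect, extract, match, verify.
# All names and values are illustrative placeholders, not a real library API.

def detect_faces(image):
    # Step 1: locate face bounding boxes in the image (placeholder).
    return [{"box": (10, 10, 100, 100)}]

def extract_features(face):
    # Step 2: encode the face region as a numeric feature vector (placeholder).
    return [0.1, 0.5, 0.3]

def dist(a, b):
    # Euclidean distance between two feature vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def match(features, database):
    # Step 3: find the enrolled identity with the smallest distance.
    return min(database, key=lambda name: dist(features, database[name]))

def verify(features, database, threshold=0.6):
    # Step 4: accept the best match only if it is close enough (assumed cutoff).
    name = match(features, database)
    return name if dist(features, database[name]) <= threshold else None

db = {"alice": [0.1, 0.5, 0.3], "bob": [0.9, 0.2, 0.7]}
face = detect_faces(None)[0]
print(verify(extract_features(face), db))  # exact match with "alice", distance 0
```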

2 Related Work

A survey of various studies is shown in Table 1.

Table 1 Literature survey

3 Techniques Used

3.1 Deep Convolutional Neural Networks

Deep convolutional neural networks (CNNs) were created to analyze and interpret pictures and other forms of structured data. A CNN's design is founded on multiple layers of convolutional and pooling operations, followed by one or more fully connected layers. The convolutional layers recognize local patterns and characteristics in the input data, such as edges, corners, and textures. These layers run the input through a series of learned filters, creating a number of feature maps that represent various facets of the original image. The pooling layers subsequently downsample the feature maps, lowering their spatial resolution while preserving the most crucial characteristics. The network's fully connected layers employ the high-level features discovered by the convolutional layers to predict outcomes based on the input data. For instance, in an image classification problem, the fully connected layers may produce a probability distribution across a set of potential classes. Deep CNNs have demonstrated outstanding performance across a range of computer vision applications, including segmentation, object identification, and image classification. They have also been applied to other kinds of structured data, such as audio signals and natural-language text. The structure of a CNN is shown in Fig. 1.

Fig. 1 Structure of CNN network (input layer, convolution and subsampling layers, hidden layer, and classification layer)
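The convolution and pooling operations described above can be demonstrated without any framework. The sketch below is a minimal, framework-free illustration on a tiny made-up image, assuming a single hand-picked edge-detection kernel rather than learned filters.

```python
# Minimal illustration of the convolution and max-pooling operations that
# form the core of a CNN, in plain Python (no deep learning framework).

def conv2d(image, kernel):
    """Valid 2D convolution (cross-correlation, as commonly used in CNNs)."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            row.append(sum(image[i + a][j + b] * kernel[a][b]
                           for a in range(kh) for b in range(kw)))
        out.append(row)
    return out

def max_pool(fmap, size=2):
    """Downsample a feature map, keeping the strongest activation per window."""
    out = []
    for i in range(0, len(fmap) - size + 1, size):
        row = []
        for j in range(0, len(fmap[0]) - size + 1, size):
            row.append(max(fmap[i + a][j + b]
                           for a in range(size) for b in range(size)))
        out.append(row)
    return out

# A vertical-edge kernel applied to a 4x4 image with an edge down the middle.
image = [[0, 0, 1, 1]] * 4
edge_kernel = [[-1, 1]]
fmap = conv2d(image, edge_kernel)  # strong response only at the edge column
pooled = max_pool(fmap)
print(pooled)  # [[1], [1]]: the edge survives pooling at half the resolution
```

In a real CNN the kernels are learned from data and stacked in many layers; this toy version only shows the mechanics of one convolution followed by one pooling step.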

3.2 Deep Face

DeepFace begins by detecting faces in images with a pre-trained face detector. It then aligns the faces to a canonical pose, normalizing for changes in position and illumination. The aligned faces are fed into a deep neural network, which learns to extract discriminative high-level features for different faces. These features are then used to calculate a similarity score between faces, which in turn drives face recognition. DeepFace was trained on a huge dataset of over 4 million labeled faces, and it performed well on multiple benchmark face recognition datasets, including Labeled Faces in the Wild (LFW) and YouTube Faces (YTF). On the LFW dataset, it achieved an accuracy of 97.35%, exceeding other classical and deep learning-based face recognition systems.

The DeepFace algorithm in face recognition involves the following steps:

  • Face detection

  • Face alignment

  • Feature extraction

  • Similarity scoring

  • Face recognition

  • Training.

3.3 VGG-Face

VGG-Face is a deep convolutional neural network architecture optimized for facial recognition applications. It is built on the VGG-16 network architecture and was trained on the VGG-Face dataset, a large-scale collection of faces. VGG-Face is made up of 13 convolutional layers and 3 fully connected layers. A face image is fed into the network and processed through the layers to extract features for face recognition. The purpose of face recognition is to compare a face image to a database of known faces and determine whether there is a match. The VGG-Face network extracts features from both the probe (input) and gallery (database) faces; the features can then be compared using a similarity measure such as cosine similarity to determine whether there is a match. Overall, VGG-Face has proven to be a highly successful face recognition architecture, delivering state-of-the-art results on a range of benchmark datasets.
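The probe-versus-gallery comparison with cosine similarity can be written directly from its definition. The short vectors below are made-up examples; real VGG-Face descriptors are much longer.

```python
import math

# Cosine similarity between two feature vectors, as used to compare a probe
# embedding against gallery embeddings. The 3-D vectors here are toy examples.

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

probe = [0.2, 0.8, 0.1]
gallery = {"id_1": [0.2, 0.7, 0.1], "id_2": [0.9, 0.1, 0.4]}

# The gallery identity with the highest similarity is the best match.
best = max(gallery, key=lambda k: cosine_similarity(probe, gallery[k]))
print(best)  # id_1: nearly parallel to the probe vector
```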

3.4 Capsule Networks

Capsule Networks (CapsNets) are a form of neural network architecture developed as an alternative to classic convolutional neural networks (CNNs) for image recognition applications. Capsule Networks include a unique building block known as a capsule, a collection of neurons that represents the instantiation parameters of a single object or feature in an image. In face recognition, Capsule Networks may be used to identify and recognize facial features such as the eyes, nose, mouth, and other facial traits. Capsules in the network can learn to represent these features as distinct entities with relationships to one another, rather than merely as individual pixels or image attributes. Capsule Networks have shown potential for face recognition tasks, particularly in situations with significant variations in lighting, pose, and facial expression. By modeling the various facial characteristics as capsules, the network can accurately capture the spatial relationships between them, which helps increase recognition rates. Although Capsule Networks have demonstrated some potential, further study is required to fully grasp their advantages and disadvantages in face recognition and other applications.

3.5 3D Face Recognition

3D face recognition is a type of face recognition technology that uses 3D reconstructions of people's faces to recognize and identify them. In contrast to standard 2D face recognition, which uses 2D photographs of the face, 3D face recognition employs a 3D model that captures the shape, texture, and other physical aspects of the face. 3D face recognition can overcome some of the drawbacks of 2D face recognition, such as sensitivity to lighting conditions, pose variations, and facial expressions. Since 3D models record the shape of the face, they can be used to identify faces under a variety of lighting conditions and viewing angles. Several techniques, such as structured light scanning, stereo photogrammetry, and laser scanning, can be used to create 3D representations of the face. These techniques record the facial shape from several angles, which can be merged into a single 3D representation. Once a 3D model of the face has been created, it can be used to extract features for facial recognition. These features might include the shape of the face, the curvature of the nose and forehead, and the depth of the eyes and lips. In general, 3D face recognition has shown promise as a more reliable and accurate method of face recognition, especially in difficult situations where 2D face recognition may struggle. However, it requires more complicated hardware and computation, which can make it more difficult to implement in some applications.

3.6 Principal Component Analysis

Principal component analysis (PCA) allows face recognition software to reduce the dimensionality of face images while still preserving key facial traits. The method involves projecting the data onto a lower-dimensional space spanned by the principal components, the directions that account for the majority of the variance in the dataset. In face recognition, PCA is frequently employed to extract a set of eigenfaces from a large number of face images. These eigenfaces represent the most important characteristics or patterns in the faces and can be used to categorize and identify new faces. In the recognition process, a new face image is projected onto the eigenface space and the most similar eigenface is found.

Algorithm for PCA:

  1. Standardization.

  2. Computation of the covariance matrix.

  3. Calculation of the eigenvectors and eigenvalues of the covariance matrix to identify the principal components.

  4. Formation of the feature vector.

  5. Recasting of the data along the principal-component axes.
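The five PCA steps can be worked through by hand on tiny 2-D data. The sketch below uses made-up numbers and the closed-form eigendecomposition of a symmetric 2x2 matrix; real eigenface computation runs on high-dimensional image vectors with a linear algebra library.

```python
import math

# Worked sketch of the five PCA steps on toy 2-D points (not face images).

data = [(2.5, 2.4), (0.5, 0.7), (2.2, 2.9), (1.9, 2.2), (3.1, 3.0)]
n = len(data)

# Step 1: standardize (here: center each dimension on its mean).
mx = sum(x for x, _ in data) / n
my = sum(y for _, y in data) / n
centered = [(x - mx, y - my) for x, y in data]

# Step 2: covariance matrix of the centered data.
cxx = sum(x * x for x, _ in centered) / (n - 1)
cyy = sum(y * y for _, y in centered) / (n - 1)
cxy = sum(x * y for x, y in centered) / (n - 1)

# Step 3: eigenvalues of the symmetric 2x2 matrix via the quadratic formula.
tr, det = cxx + cyy, cxx * cyy - cxy * cxy
lam1 = tr / 2 + math.sqrt(tr * tr / 4 - det)  # largest eigenvalue

# Step 4: the feature vector is the eigenvector of the top eigenvalue.
v = (cxy, lam1 - cxx)
norm = math.hypot(*v)
v = (v[0] / norm, v[1] / norm)

# Step 5: recast (project) each point along the principal-component axis.
projected = [x * v[0] + y * v[1] for x, y in centered]
print(round(lam1, 3))  # variance captured by the first principal component
```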

Face recognition using PCA is shown in Fig. 2.

Fig. 2 Face recognition using PCA (input image, error and Euclidean distance calculation, recognition at minimum distance)

3.7 Linear Discriminant Analysis

Linear discriminant analysis (LDA), another popular face recognition method, is similar to PCA but has a different objective: PCA is used to reduce dimensionality, whereas LDA is used for feature extraction and classification. In face recognition, LDA seeks a linear combination of features that distinguishes between distinct classes of faces. To do this, the distribution of the face images is modeled in a high-dimensional space, and the data is then projected onto a lower-dimensional space that maximizes the separation between the classes. The resulting features, known as “fisherfaces,” can be used to classify and recognize new faces. To determine the closest match, a new face image is projected onto the fisherface space and compared to the existing fisherfaces. LDA has been found to outperform PCA in face recognition tasks, particularly when working with datasets containing a large number of classes. However, LDA requires labeled training data to perform properly, which can be a constraint in some applications. The face recognition LDA subspace is shown in Fig. 3, and the principle of the LDA approach and the combined PCA-LDA approach are shown in Figs. 4 and 5.
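The class-separating projection LDA looks for can be computed in closed form for two classes as w ∝ Sw⁻¹(m1 − m2), where Sw is the within-class scatter and m1, m2 are the class means. The sketch below uses toy 2-D points; real fisherfaces operate on high-dimensional image vectors.

```python
# Toy two-class LDA: find the direction w that best separates the classes.
# The points are invented 2-D examples, not face data.

class1 = [(4.0, 2.0), (2.0, 4.0), (2.0, 3.0), (3.0, 6.0), (4.0, 4.0)]
class2 = [(9.0, 10.0), (6.0, 8.0), (9.0, 5.0), (8.0, 7.0), (10.0, 8.0)]

def mean(pts):
    n = len(pts)
    return (sum(x for x, _ in pts) / n, sum(y for _, y in pts) / n)

def scatter(pts, m):
    # Within-class scatter contributions: sum of outer products of deviations.
    sxx = sum((x - m[0]) ** 2 for x, _ in pts)
    syy = sum((y - m[1]) ** 2 for _, y in pts)
    sxy = sum((x - m[0]) * (y - m[1]) for x, y in pts)
    return sxx, sxy, syy

m1, m2 = mean(class1), mean(class2)
s1, s2 = scatter(class1, m1), scatter(class2, m2)
sxx, sxy, syy = s1[0] + s2[0], s1[1] + s2[1], s1[2] + s2[2]  # Sw

# w = Sw^{-1} (m1 - m2), using the closed-form 2x2 matrix inverse.
det = sxx * syy - sxy * sxy
dx, dy = m1[0] - m2[0], m1[1] - m2[1]
w = ((syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det)

# Projecting both classes onto w separates them with no overlap.
proj1 = [w[0] * x + w[1] * y for x, y in class1]
proj2 = [w[0] * x + w[1] * y for x, y in class2]
print(min(proj1) > max(proj2))  # the classes do not overlap along w
```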

Fig. 3 Face recognition LDA subspace (probe images undergo preprocessing and PCA projection, followed by a weighted Euclidean metric)

Fig. 4 Principle of LDA approach (training data projected into Fisher space)

Fig. 5 PCA-LDA approach (features extracted from the face database and test images, classified with a KNN classifier)

3.8 FaceNet

FaceNet is a deep learning-based facial recognition system created in 2015 by Google researchers. It learns the characteristics and patterns of faces using a deep neural network and encodes each face as a high-dimensional vector in a feature space.

FaceNet's facial recognition algorithm consists of the following steps:

  1. Face detection and alignment.

  2. Triplet loss function.

  3. Training.

  4. Face representation.

  5. Face recognition.

3.8.1 Euclidean Distance

To determine the Euclidean distance between two faces, a set of facial features, such as eye location, nose shape, and mouth size, must first be extracted from each face. In a high-dimensional space, these features are typically represented as a collection of coordinates.

The Euclidean distance between the two faces can be determined using the formula below once the feature vectors for the two faces have been obtained:

$$ \text{distance} = \sqrt{(x_{2} - x_{1})^{2} + (y_{2} - y_{1})^{2} + \cdots + (n_{2} - n_{1})^{2}} $$

where $x_1, y_1, \ldots, n_1$ are the coordinates of the first face's feature vector and $x_2, y_2, \ldots, n_2$ are the coordinates of the second face's feature vector. The Euclidean distance is the square root of the sum of the squared differences.
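The formula translates directly into code. The two feature vectors below are made-up numbers chosen so the result is easy to check by hand.

```python
import math

# Direct implementation of the Euclidean distance formula above,
# applied to two invented 3-D feature vectors.

def euclidean_distance(f1, f2):
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(f1, f2)))

face_a = [1.0, 2.0, 3.0]
face_b = [4.0, 6.0, 3.0]
print(euclidean_distance(face_a, face_b))  # sqrt(9 + 16 + 0) = 5.0
```

A small distance means the two faces are likely the same person; the decision threshold depends on the feature extractor used.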

4 Proposed Model

The proposed model for face recognition authentication in a library is a computer vision system that uses deep learning techniques to identify and verify individuals based on their facial features. Because the model can recognize faces accurately, only authorized users will be able to access library resources and services.

The system also includes a database of every book in the collection, together with information about where each book is located on the shelves and its borrowing history. For the convenience of library patrons and librarians, the system stores and manages this data.

The system consists of a camera for capturing images of library patrons’ faces, a deep neural network trained on a large dataset of facial images to recognize patterns and features of human faces, and an authentication mechanism that compares the input image with the images of authorized users stored in the database.

An image of the user's face must be taken in order to begin the face recognition procedure. The user's face is captured by the camera, which then sends the image to the deep neural network for analysis. Convolutional neural networks (CNNs) are used by the neural network to extract important elements from the image, such as the separation between the eyes, the profile of the nose, and the curve of the lips.

The system compares these properties to those in the database once the neural network has extracted them. A collection of pre-registered faces of library patrons who have permission to utilize the library's resources and services can be found in the database. The system authenticates the user and gives access to the library services if the features of the input image match the features of any of the registered faces.
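The authentication step described above, comparing an input embedding against enrolled patrons and granting access only on a sufficiently close match, can be sketched as follows. The patron IDs, embedding values, and the 0.3 threshold are all invented for illustration; a real deployment would tune the threshold on validation data.

```python
import math

# Hedged sketch of the library authentication step. All identifiers and
# numbers are hypothetical, not the paper's actual system.

ENROLLED = {
    "patron_001": [0.11, 0.52, 0.31],
    "patron_002": [0.85, 0.13, 0.66],
}
THRESHOLD = 0.3  # assumed distance cutoff; would be tuned on real data

def authenticate(embedding):
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    best_id = min(ENROLLED, key=lambda k: dist(embedding, ENROLLED[k]))
    if dist(embedding, ENROLLED[best_id]) <= THRESHOLD:
        return best_id   # access granted to library services
    return None          # access denied: no registered face close enough

print(authenticate([0.10, 0.50, 0.30]))  # close to patron_001
print(authenticate([0.50, 0.50, 0.50]))  # matches no enrolled face closely
```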

Users of the system can browse and search the library's book collection using an intuitive interface on a PC or mobile device. Additionally, they may view each book's availability and shelf location.

Each book's borrowing history is recorded by the system, which makes it possible to determine the most read titles in the collection and inform future book purchase choices. By keeping track of borrowing and return records, it also aids in preventing theft or loss of library books.

The proposed facial recognition authentication model for library services is, in general, a trustworthy and secure system that can offer seamless and practical access to library resources while guaranteeing the security and safety of the library's patrons and their data.

4.1 Future Scope in Proposed Model

To improve user experience and offer personalized services, the suggested model can potentially be expanded to include further elements like age and gender detection. It can also be improved by adding extra security features like liveness detection to thwart spoofing and hacking attempts.

5 Application

  1. Real-Time Face Recognition

Real-time facial recognition systems are projected to become increasingly widespread in applications such as security, surveillance, and access control as high-performance computing and advanced algorithms become more widely available.

  2. Mobile and Wearable Devices

Face recognition will very likely be integrated into mobile and wearable devices, allowing easier and more secure device authentication and unlocking.

  3. Biometric Passports and Travel Documents

Face recognition is already used in biometric passports and travel documents, and it is expected to become more common in the future, offering travelers greater security and convenience.

  4. Personalized Advertising and Marketing

Face recognition software can analyze facial expressions and emotions, enabling more personalized advertising and marketing campaigns tailored to individual tastes and requirements.

  5. Healthcare

Face recognition technology may be used in medical diagnosis, monitoring, and therapy, such as recognizing symptoms of illness and remotely monitoring patient status.

6 Conclusion

This survey paper has focused on various face recognition techniques. The evolution of face recognition from 2013 to 2021 is described through various algorithms. Challenges in various fields are also described, such as masked-face detection, gender detection, mobile face detection, attendance systems, and security systems. Solutions to these challenges are provided, along with the accuracy obtained using various algorithms and methods.