1 Introduction

Facial recognition technology has become one of the most widely used artificial intelligence applications today. Although it has not yet reached its full potential, it has matured enough to support a wide range of applications, and its uses have diversified considerably; the most prominent among them remain security-related applications.

The mission of a face recognition system is to identify or verify people by comparing and analyzing patterns extracted from facial features, generally in order to authenticate and distinguish human faces in a photograph or a video. A verification system performs a one-to-one matching process, where the features extracted from the user at verification time are compared only with his/her features extracted and stored in the system database during enrollment. An identification system performs a one-to-many matching process, where the system must match the input face image against all identities stored in the database. This paper focuses on the latter mode of use.

Generally, in addition to the complexity of the matching process, which is time consuming, a face identification system faces several issues such as significant changes due to age, the appearance of a beard or a mustache, the wearing of glasses, partial occlusion of the face, or difficult viewing angles where not all facial features are visible. In this paper, we are interested in identification systems in the case of multi-view face acquisitions.

Multi-view face recognition, which aims to handle variations in face pose, is a more difficult and complex task than frontal face recognition, mainly because of the nonlinear variations present in the data space. It has therefore attracted considerable research effort, both for its potential applications and for the challenge it presents.

In several critical present-day applications, such as surveillance or criminal identification systems, only a few face images per identity are usually available in the training database, and they are captured from different angles of view. This is a major issue for identification systems that aim at high precision, especially those based on Convolutional Neural Networks (CNNs) [1, 17, 22], which have proved robust for several applications including frontal face recognition. However, CNNs require several images per class for the training stage, which is not always possible for real-life identification systems, nor for multi-view face recognition. In addition, classical face recognition methods that focus on the local manifold structure can be efficient even with few face images per identity for training, but only in the case of frontal face identification. To achieve this trade-off and deal with these special circumstances, we believe that Siamese Neural Networks [2,3,4] are the best alternative, since they can learn from a small number of samples per subject using the few-shot learning technique [5]. In the literature, the proposed multi-view facial identification systems are built on databases containing several images per identity.

In this paper, we propose a few-shot multi-view face identification system, based on the Siamese Neural Network (SNN) [2, 6], for the case where only a few images per angle of view per identity are available in the training set [7]. We then compare this system with two CNN models trained from scratch and with the pre-trained VGGFACE model [6]. It should be mentioned that the proposed system requires at least two images for the training process.

The rest of the paper is organized as follows. Related works are presented in Sect. 2. Then, in Sect. 3, we describe the proposed system in detail. Our evaluation and experimental results are presented in Sect. 4. Finally, our conclusion and perspectives are given in Sect. 5.

2 Related works

Most face identification systems perform training using multiple images per subject during the feature extraction process. In several applications such as access control, surveillance systems, and criminal identification systems, usually only a few face images per identity are available, and they are taken from different angles, which makes the development of few-shot multi-view face identification systems very important for such applications. In the literature, the various methods used to build multi-view face identification systems can be classified into three categories: Machine Learning-based systems, Deep Learning-based systems, and hybrid systems that combine Machine Learning and Deep Learning methods.

Regarding Machine Learning algorithms, many works have used classical algorithms for both feature extraction and classification. Anand et al. [8] used the Local Binary Pattern algorithm for feature extraction followed by the Euclidean distance for classification. Kurita et al. [9] proposed to obtain aligned principal components from prior knowledge learned by principal component analysis on multi-view images, and then synthesized a virtual view of the observed face image and a frontal one using a linear object class. Fouad et al. [10] compared two feature extraction algorithms for face recognition, PCA and LDA, in order to determine the best technique. Yuehang et al. [11] presented a framework to improve the efficiency and the low accuracy of a multi-view face recognition system, based on a cascade face detector and an improved distance model built on DLIB face alignment. Li et al. [12] proposed a hybrid system composed of support vector regression and classification methods for multi-view face detection and recognition, where support vector regression detects the head pose in order to choose the detector for the specific view. Moujahdi et al. [13] used the LST [14] and SVDA [15] methods for feature extraction, together with a model for head pose estimation in a 2D image [16]; this model computes the accurate angle of view of an individual before starting the recognition task, thus effectively dealing with pose variability within the same class through inter-communication between several KNN classifiers. Tuncer et al. [17] proposed a new face recognition architecture based on Local Cross-Pattern, wavelet, and fuzzy logic methods for feature extraction, and on SVM, KNN, LDA, and KDA methods for classification.

Fig. 1

The general architecture of the Siamese Neural Network system

Fig. 2

The main process operations of the proposed system during the test stage

We believe that Machine Learning techniques typically improve in efficiency and accuracy when ever-increasing amounts of data about the views are processed. However, the major drawback of this category is the difficulty of generating relevant and discriminating features. In addition, building a Machine Learning model for a large-scale system is computationally expensive. Thus, improving the quality of feature extraction, measured by its ability to represent and discriminate face samples, has been one of the main challenges faced by the multi-view face recognition research community in recent years. Deep Learning techniques, and more specifically Convolutional Neural Networks (CNNs), are currently the most widely used techniques to address this challenge.

In the Deep Learning category, Zhu et al. [1] created a deep learning model named multi-view perceptron (MVP) to separate the identity and view representations of any multi-view face image. Cao et al. [18] proposed a Deep Residual Equivariant Mapping (DREAM) block that adds a residual to the input deep representation in order to transform face images from a profile pose to a canonical pose. Wanshun et al. [19] proposed a pose auto-augment framework based on a convolutional neural network model, where data augmentation is launched before training. Xiongjun et al. [20] proposed a deep convolutional framework based on the SphereFace-20 model and Batch Normalization (BN). Meddad et al. [21] proposed a hybrid face identification system based on a compressed CNN model with an indexation and parallelization method suitable for embedded devices.

We can say that Deep Learning is effective when a huge amount of data is fed into the neural network architectures during training. The main disadvantage is that Deep Learning approaches are data greedy: they require a large amount of training data, which is not always available in real-world applications, where the data rarely exceed a few samples per class, for instance in a company with only 30 employees. A more recent variant of deep learning algorithms, the Siamese Neural Network, is nowadays used for several applications that do not have a large amount of training data. For example, Bromley et al. [22] proposed an algorithm based on an SNN model for the verification of signatures written on a pen-input tablet. Chopra et al. [23] proposed a discriminative method for learning complex similarity metrics for face verification. Siamese Networks are a metric learning method, and they perform recognition using similarity scores.

Fig. 3

Architectures of the two proposed CNN Models

For hybrid Machine Learning and Deep Learning algorithms, Sarhan et al. [24] proposed a combined adaptive deep learning vector quantization (CADLVQ) classifier with a majority voting algorithm for classification and SURF for feature extraction, for only three different views. Kisku et al. [25] presented a multi-appearance fusion of the Generalization of Linear Discriminant Analysis and Principal Component Analysis (PCA) for multi-view face verification, using an SVM for binary classification. Vareto et al. [26] focused on the open-set face identification problem, evaluating both partial least squares (PLS) and multilayer perceptron (MLP) classification models in the pursuit of an approach that is not directly dependent on the gallery set size. In fact, the authors create a voting scheme (candidate list) and a collection of either PLS or MLP binary models, specified as hashing functions, to assess whether the requested subject is known or unknown. The subject is recognized if he or she stands out among the other candidates on the list. Hybrid Machine Learning and Deep Learning algorithms yield efficient systems by combining the relevant feature extraction of Deep Learning approaches with the good classification performance of Machine Learning algorithms.

Fig. 4

Training process of the scenario 1 (i.e., training samples with the same angle)

Fig. 5

Training process of scenario 2 (i.e., training samples with the same angle of view, then with different angles)

The main drawback of hybrid systems is that feature extraction from the image is slow when using Machine Learning algorithms, and if this step is not done well, the Machine Learning classifier cannot correctly predict the identity, since it depends entirely on the feature vector.

In this paper, to overcome the data scarcity limitation of classical deep learning algorithms, we use a Convolutional Siamese Neural Network for few-shot multi-view face identification.

3 Proposed system

In this section, we present our convolutional Siamese neural network for multi-view face identification, with a CNN encoder implemented in two different architectures.

3.1 Siamese neural network model

The Siamese Neural Network model usually takes two different inputs, an image A and an image B, and builds comparable feature vectors. As shown in Fig. 1, the Siamese neural network contains two identical Convolutional Neural Networks (sub-networks) with the same parameters, configuration, and weights. These sub-networks are used to compute the distance between the two inputs by comparing the feature vectors extracted by the CNN models.

Table 1 Overall face identification accuracy on the Schneiderman database using training Scenarios 1 and 2 with the full training set, with binary cross-entropy loss and contrastive loss
Table 2 Overall face identification accuracy on the Umist database using training Scenarios 1 and 2 with the full training set, with binary cross-entropy loss and contrastive loss

First, image A is fed into the first model and, after passing through the convolutional layers followed by a fully connected layer, a feature vector F(A) is extracted. Image B is passed through the second CNN model, which is identical to the first one in terms of layers, weights, and parameters, to extract the second feature vector F(B). Second, we compare the two face vectors by computing the distance between them. This distance should be smaller than the security threshold of the system if the two inputs belong to the same identity.

$$\begin{aligned} d\left( F(A),F(B)\right) = \sqrt{\sum \nolimits _{i=1}^{n} \left( F_{i}(A)-F_{i}(B)\right) ^2 } \end{aligned}$$
(1)

where F(A) and F(B) are the face vectors of images A and B, respectively. After each feature extraction phase, a distance value (the Euclidean distance, see Eq. 1) and the loss are computed to train the sub-networks. The loss functions used in our implementation are the binary cross-entropy (see Eq. 2) and the contrastive loss (see Eq. 3).

$$\begin{aligned} \textrm{Loss} = - \left[ y \cdot \log \left( P(y)\right) + (1-y) \cdot \log \left( 1-P(y)\right) \right] \end{aligned}$$
(2)

where y is the true label and P(y) is the predicted probability.

$$\begin{aligned} \textrm{Loss} = (1-y) \cdot \frac{1}{2} {D}_w^2 + y \cdot \frac{1}{2} \max \left( 0, m-{D}_w\right) ^2 \end{aligned}$$
(3)

where y is the true label: y equals 0 when the two inputs are similar and 1 when they are dissimilar, \({D}_w\) is the distance between the feature vectors of the input images, and m is the margin. In the test phase, shown in Fig. 2, we feed the test image and all reference images to the SNN architecture, and then take the label of the reference image with the smallest distance to the test image among the reference images of all identities. In the CNN encoder module, any type of neural network model and architecture can be used. In the next sub-section, we present the convolutional neural network models used in our SNN system.
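As an illustration, the following minimal PyTorch sketch shows how the shared encoder, the Euclidean distance of Eq. 1, and the contrastive loss of Eq. 3 fit together; the `encoder` module, the batch layout, and the margin value are assumptions, since the paper does not specify them.

```python
import torch
import torch.nn.functional as F

def siamese_forward(encoder, img_a, img_b):
    # The same encoder (shared weights) embeds both inputs,
    # then the Euclidean distance of Eq. 1 is computed per pair.
    feat_a = encoder(img_a)
    feat_b = encoder(img_b)
    return F.pairwise_distance(feat_a, feat_b, p=2)

def contrastive_loss(distance, y, margin=2.0):
    # Eq. 3: y = 0 for similar pairs, y = 1 for dissimilar pairs.
    # The margin value m = 2.0 is an assumption, not taken from the paper.
    similar_term = (1 - y) * 0.5 * distance.pow(2)
    dissimilar_term = y * 0.5 * torch.clamp(margin - distance, min=0).pow(2)
    return (similar_term + dissimilar_term).mean()
```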

Table 3 Overall face identification accuracy on the Schneiderman database using training Scenario 2 with only one image per angle per identity in the training set, with contrastive loss

3.2 CNN structures

As shown in Fig. 3, which presents the two CNN models used in our SNN system, each model consists of a total of six layers: five convolutional layers followed by a fully connected layer. The images fed to the models are of size \({\textbf {112}} \times {\textbf {92}}\).

In the first convolutional model, the first convolutional layer has 8 kernels of size \({\textbf {3}} \times {\textbf {3}}\), followed by a downsampling layer (i.e., max pooling). The second layer is also convolutional, with 12 kernels of the same size as those of the first layer, followed by a max-pooling layer. The third layer is a convolutional layer with 16 kernels of size \({\textbf {3}} \times {\textbf {3}}\), followed by a downsampling layer. The last two convolutional layers have 32 and 64 kernels, respectively, of size \({\textbf {3}} \times {\textbf {3}}\), each followed by a ReLU function. The last layer of the first model is a fully connected layer with 512 neurons followed by a ReLU function.

In the second convolutional model, the first convolutional layer has 8 kernels of size \({\textbf {3}} \times {\textbf {3}}\), followed by a downsampling layer (i.e., max pooling). The second layer is also convolutional, with 16 kernels of the same size as those of the first layer, followed by a max-pooling layer. The third layer is a convolutional layer with 32 kernels of size \({\textbf {3}} \times {\textbf {3}}\), followed by a downsampling layer. The last two convolutional layers have 64 and 128 kernels, respectively, of size \({\textbf {3}} \times {\textbf {3}}\), each followed by a ReLU function. The last layer of the second model is a fully connected layer with 1028 neurons followed by a ReLU function.

The optimizer used for the training process is the Adam optimizer, with a learning rate of 0.0001 and 100,000 iterations.
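For illustration only, a possible PyTorch sketch of the first encoder is given below; the input is assumed to be a single-channel \({\textbf {112}} \times {\textbf {92}}\) face image, and the padding and pooling window size are assumptions, since the paper does not report them.

```python
import torch
import torch.nn as nn

class CNNEncoder1(nn.Module):
    """Sketch of CNN model 1: five 3x3 convolutional layers (8, 12, 16, 32, 64
    kernels) followed by a 512-unit fully connected layer, as described above."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            # Padding of 1 and 2x2 pooling are assumptions; the paper only gives
            # kernel counts and sizes, and mentions no activation for these blocks.
            nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.MaxPool2d(2),
            nn.Conv2d(8, 12, kernel_size=3, padding=1), nn.MaxPool2d(2),
            nn.Conv2d(12, 16, kernel_size=3, padding=1), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
        )
        # LazyLinear infers the flattened size from the 112 x 92 input at the first call.
        self.fc = nn.Sequential(nn.Flatten(), nn.LazyLinear(512), nn.ReLU())

    def forward(self, x):
        return self.fc(self.features(x))

encoder = CNNEncoder1()
# Adam optimizer with a learning rate of 0.0001, as stated above.
optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-4)
```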

4 Experimental results

In this section, we evaluate the accuracy of several identification systems including the proposed one following several scenarios of training and test. We will present first the used datasets, then we will describe the training process scenarios, and finally, we will present and discuss the results.

4.1 Databases description and evaluation

In this paper, we have used two multi-view face databases: Umist [27] and Schneiderman [28]. The Umist database contains 475 images of 20 identities, where each identity has 19 to 36 images at various angles from the left profile to the right profile. The Schneiderman database contains 6660 images of 90 identities, where each identity has 76 images taken every 5 degrees from the right profile to the left profile. To evaluate our system, we created two training sets and one test set. First, we split the Umist database into 5 angles of view: 0, 20, 60, 80 and 90, and took 3 images per angle per identity to build the training set. We also split the Schneiderman database into 10 angles of view: +10, +20, +40, +60, +80, 0, \(-20\), \(-40\), \(-60\) and \(-80\), and then took 6 images per angle of view per identity to build the training set. Second, we took one image per angle per identity as a test image. Finally, after the training phase, we need a small dataset that contains only one image per identity, used as the reference image of that identity, to be compared with the test image during the test phase. To evaluate our system, we created one reference dataset per angle by choosing one image per angle per identity. For example, if we have 3 angles, 20, 30 and 40, we create 3 reference datasets, each containing one image per identity for the angles 20, 30 and 40, respectively.
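A minimal sketch of this per-angle reference-set construction is shown below; the `(identity, angle, path)` sample layout is a hypothetical organization of the data, not the actual file structure of the Umist or Schneiderman databases.

```python
import random
from collections import defaultdict

def build_reference_sets(samples, angles):
    """Select one reference image per identity for every angle of view.

    `samples` is assumed to be an iterable of (identity, angle, image_path)
    tuples; the real databases may be organized differently.
    """
    candidates = defaultdict(list)
    for identity, angle, path in samples:
        candidates[(angle, identity)].append(path)

    reference_sets = {angle: {} for angle in angles}
    for (angle, identity), paths in candidates.items():
        if angle in reference_sets:
            reference_sets[angle][identity] = random.choice(paths)
    # reference_sets[angle][identity] -> one reference image path
    return reference_sets
```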

The metric used to evaluate the performance/accuracy of the models is computed as follows:

$$\begin{aligned} \text {Accuracy} = \frac{\text {Number of correct predictions}}{\text {Total number of predictions}} \end{aligned}$$
(4)
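As a hedged illustration of how the test-phase matching of Fig. 2 and the accuracy of Eq. 4 could be computed, the sketch below assumes `encoder` is a trained sub-network and that `test_set` and `reference_set` are hypothetical dictionaries mapping each identity to a preprocessed image tensor.

```python
import torch

@torch.no_grad()
def identification_accuracy(encoder, test_set, reference_set):
    """Identify each test image by the reference identity at the smallest
    Euclidean distance (Eq. 1), then compute the accuracy of Eq. 4."""
    ref_ids = list(reference_set.keys())
    # Each tensor is assumed to have shape (1, 1, 112, 92).
    ref_feats = torch.cat([encoder(reference_set[i]) for i in ref_ids])
    correct = 0
    for true_id, image in test_set.items():
        distances = torch.norm(encoder(image) - ref_feats, dim=1)
        predicted_id = ref_ids[int(torch.argmin(distances))]
        correct += int(predicted_id == true_id)
    return correct / len(test_set)
```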

4.2 Training process scenarios

We propose in this paper two training process scenarios:

  • Scenario 1: Training using only images of the same angle of view (see Fig. 4)

  • Scenario 2: Training using images with the same angle of view then using different angles (see Fig. 5)

In our system, we have two inputs: the first one represents the identity whose similarity or dissimilarity with the second one we want to learn. If the second input is an image of the same identity as the first input, we have a similarity learning process; if it is an image of a different identity, we have a dissimilarity learning process.

Table 4 Overall face identification accuracy on the Umist database using training Scenario 2 with only one image per angle per identity in the training set, with contrastive loss
Fig. 6

Overall face identification accuracy of the Schneiderman database using scenario 2 of training

Fig. 7

Overall face identification accuracy of Umist database using scenario 2 of training

As shown in Fig. 4, for the similarity training process, we take an image of identity 1 at angle 20 as the first input and another image of the same identity at angle 20 as the second input, to learn the similarity; in the next iteration, we keep the same first input and pair it with another image of the same identity at angle 20, different from the one used in the previous iteration.

For the dissimilarity training process, we take an image of identity 1 at angle 20 as the first input and an image of identity 2 at angle 20 as the second input; in the next iteration, we keep the same first input and pair it with an image at angle 20 of the next different identity.

As shown in Fig. 5, for the similarity training process, we take an image of identity 3 at angle 80 as the first input and another image of the same identity at the same angle as the second input; in the next iteration, we keep the same first input and pair it with an image of the same identity at angle 60, in order to learn the similarity across different angles of view.

For the dissimilarity training process, we take an image of identity 3 at angle 80 as the first input and an image of identity 4 at the same angle as the second input; in the next iteration, we keep the same first input and pair it with an image of identity 4 at angle 60, in order to learn the dissimilarity across different angles of view.

In the training process, we train all models using all images of the training set for scenarios 1 and 2, and we also train them, for scenario 2 only, using a single image per angle per identity in order to reduce the size of the training set.

We designed these two training scenarios to enable the system to learn from multi-view face images with limited samples per subject and, in particular, with only one image per angle per identity, as illustrated by the pair-generation sketch below.
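The following simplified sketch illustrates how similar (label 0) and dissimilar (label 1) training pairs could be generated for the two scenarios; the `train_samples` structure (identity -> angle -> list of images) and the random choice of the negative identity are assumptions made for illustration.

```python
import random

def make_pairs(train_samples, scenario=2):
    """Build (image_a, image_b, label) pairs: 0 = same identity (similarity),
    1 = different identities (dissimilarity), matching the convention of Eq. 3.

    Scenario 1 pairs the anchor only with images of the same angle; Scenario 2
    additionally pairs it with images taken at the other angles of view.
    """
    pairs = []
    identities = list(train_samples.keys())
    for identity, views in train_samples.items():
        for angle, images in views.items():
            anchor = images[0]
            other_id = random.choice([i for i in identities if i != identity])
            # Same-angle pairs (both scenarios).
            pairs += [(anchor, img, 0) for img in images[1:]]
            pairs += [(anchor, img, 1) for img in train_samples[other_id].get(angle, [])]
            if scenario == 2:
                # Cross-angle pairs with the same and with the other identity.
                for other_angle in views:
                    if other_angle != angle:
                        pairs += [(anchor, img, 0) for img in views[other_angle]]
                        pairs += [(anchor, img, 1)
                                  for img in train_samples[other_id].get(other_angle, [])]
    return pairs
```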

4.3 Results and discussion

It should be noted that the CNN models 1 and 2 cited in this sub-section are the CNN models presented in Sect. 3.2. We also recall that we used two loss functions (i.e., binary cross-entropy and contrastive loss) and that all tested CNN models were trained from scratch. As shown in Table 1, for scenarios 1 and 2 using all images of the training set, we found that, for the Schneiderman database, the best accuracy is between 95.6% and 100% when using VGGFACE, and for the Umist database (see Table 2) the best accuracy is between 95% and 97% when using SNN model 2. These results show that the accuracy of VGGFACE improves as the number of training images increases, since its accuracy is better on the Schneiderman database than on the Umist database. They also show that the SNN can preserve accuracy even with a minimal number of training images. In practice, having several images per class for the training stage is not always possible for multi-view face identification, which is why we repeated the evaluations using only one image per angle per identity for the training stage (see Tables 3 and 4).

As shown in Tables 3 and 4, for scenario 2 using only one image per angle per identity during the training stage, SNN model 2 obtains the best results, with an accuracy between 77% and 92% for the Umist database (see Table 4) and between 58.1% and 98.7% for the Schneiderman database (see Table 3). These results show that the SNN model preserves its performance in real-life circumstances of multi-view identification, compared with the classical CNN and VGGFACE models.

Figures 6 and 7 summarize the results of Tables 3 and 4 and show the superiority of the proposed model in the case of few-shot multi-view face identification.

As shown by the experimental results, the classification method used is well suited to building a multi-view face identification system with only a few training samples. The distance computed between the feature vector of the test input and the feature vectors of the reference images provides a good prediction module for our system.

The advantage of our approach is the ability to identify an identity using just one image per angle of view. The use of limited samples makes the model simple to train and speeds up the training process. The limitation of our approach is its ineffectiveness when there is only one image per identity, as the SNN model requires two images as inputs.

5 Conclusion

In this paper, we have proposed a few-shot multi-view face identification system based on a Convolutional Siamese Neural Network. We have shown that the proposed system outperforms Convolutional Neural Network models, such as VGGFace and CNN models trained from scratch, in the scenario where only a few images per angle per identity are available for training, which is the case of most real-life face identification applications: for the angle of view +10, the proposed system reaches an accuracy of 74.4%, against 37% for VGGFace and, respectively, 29.1% and 23.2% for the CNNs trained from scratch. In our future work, our main objective is to improve the performance of the proposed system while handling limited image samples per individual in a large-scale multi-view database with a high number of identities.