1 Introduction

Student attendance plays a crucial role in every education system. When attendance is taken manually, it must be marked on a sheet, which costs time and invites repetition and incorrect marking. This places a considerable burden on teachers, so there is a need for an automatic attendance management system. Various automatic techniques are now available, such as face recognition, fingerprint verification, RFID systems, iris recognition, and voice or speech recognition [1, 2]. Among these, face recognition has proved to be more efficient [2]. With fingerprint verification, the students have to queue to scan their thumbs on the scanning device to mark their attendance [3], which costs time. With RFID systems, the students have to carry RFID cards, which is a weakness because they may forget to bring them [4]. Iris recognition has very high accuracy but is not practical, because a class has many pupils and scanning the iris of each student is not feasible [3]. The major drawback of voice or speech recognition is that speech features are sensitive to factors such as background noise and the change in voice that comes with age. Such systems are unreliable and may not work accurately if a student has a throat infection [1].

Face detection is the technique of locating faces in an image and marking each one with a bounding box around its extent. Facial recognition is a technology that identifies or verifies a person from images or video frames. It is also described as a biometric, artificial-intelligence-based application because it uniquely recognizes a person by analyzing their facial features. Facial recognition can work in many ways, but most methods compare the facial features with those of known faces in a database to find a match.

In this work, an automatic system that marks attendance is developed using image processing techniques. An effective face recognition algorithm is proposed that can identify students efficiently. For image processing, an effective platform is used. The objective is to propose a model that captures images from videos, detects and recognizes the faces, predicts the identity of each recognized face, and then marks attendance. Specifically, the system detects the faces of students in images using MTCNN, extracts the facial features using the FaceNet model, and classifies them using an SVM. After the faces are recognized, the attendance is marked. So far, we have worked with the LFW dataset; working with images captured from real-time videos is left as future work. Section 2 of this paper presents related work, Sect. 3 presents the motivation, Sect. 4 describes the proposed system, Sect. 5 reports the experimental results, and Sect. 6 gives the conclusion and future work.

2 Related Works

Over the last few decades, much research has focused on face detection and recognition. This section gives an overview of the different methods used to implement automatic attendance management systems.

2.1 Attendance System Using Image Processing Techniques

The idea behind the work in [5] was to develop an automatic system that handles student attendance using image processing techniques. The aim was to detect and recognize students' faces in frames taken from videos and then record their attendance by identifying them from their distinguishing features. The Viola-Jones algorithm was used for face detection and the Fisherface algorithm for face recognition. The system achieved an accuracy of 45–50%. It overcame the drawbacks of manual attendance systems, and its efficiency could be increased by improving the training process.

2.2 Attendance System Using Face Detection and Face Recognition

In [6], Haar-filtered AdaBoost was used for detecting faces, and the Principal Component Analysis (PCA) and Local Binary Pattern Histogram (LBPH) algorithms were used for identifying the detected faces. The authors report better results than traditional attendance systems and other automatic attendance systems such as biometric fingerprint and RFID systems. They obtained an accuracy in the range of 75–90% in their experiments.

2.3 Attendance System Using Deep Learning Framework

In [7], a system based on deep learning frameworks was proposed. The authors used a state-of-the-art face detection model and a novel recognition architecture for detecting and recognizing faces, respectively, and a CNN was used to develop the automated attendance system. The system achieved an accuracy of 98.67% on the LFW dataset and 100% on a classroom dataset. A spatial transformer network was used to learn the alignment of faces, which led to greater face verification accuracy. The authors mainly used frontal face images of students.

2.4 Deep Learning Paradigm for Attendance Systems

In [8], a system based on Convolutional Neural Networks was proposed. The captured frames were passed to SRNet (Single Image Super-Resolution Network), which produced super-resolution (SR) images. MTCNN detected the faces in the SR images, which were then cropped around the bounding boxes. FaceNet was then used to identify the detected faces. The authors used RAISE and DIV2K for SRNet, VGGFace2 for FaceNet, and LFW together with their own dataset for testing and validation. They report an accuracy of 96.80% on the LFW dataset.

2.5 Attendance System Using FaceNet and SVM

The objective of the work in [9] was to improve accuracy for multi-face recognition. The system uses FaceNet for feature extraction and an SVM as the classifier, and it achieved an accuracy of 99.6% for multi-face recognition. The overall accuracy obtained with a CNN model was lower than the accuracy obtained with FaceNet and SVM.

3 Motivation

In recent years, image processing has played a unique role in technological advancement, as it deals with extracting useful information from digital images. As its applications increase day by day, a lot of research is being done in this field. Image capture plays an important role in education, robotics, medicine, and smartphones. A classroom holds a large number of students, and marking their attendance manually wastes a lot of time. There is therefore a need for an automatic system built with current technology. Face recognition, an application derived from image processing, is one of the best methods for identifying people, and it helps in detecting the face of each student. The face is a multi-dimensional structure, and recognizing it requires substantial computational analysis.

4 Proposed System

An automatic attendance management system using face detection and recognition is proposed in this paper. The system aims to take images of the students from classroom videos, detect and identify their faces, and, upon successful recognition, mark their attendance automatically. Figure 1 illustrates the model representation of the proposed system. The system takes real-time classroom videos as input. In the pre-processing phase, these videos are captured and split into frames. The faces of students are then detected in the frames. After a face is detected, the system crops it and applies certain image processing techniques. Features are then extracted from the detected faces. Lastly, the classifier compares the features of the detected faces with those of the test faces, the faces are recognized, and the prediction is generated.

Fig. 1 Model representation

The proposed system requires a camera mounted on a surface in the classroom at a point from which it can capture images of all the students. Figure 2 illustrates the detailed system design of the proposed model. MTCNN is used for face detection [8, 10], FaceNet is used for creating face embeddings for a given image, and an SVM is used as the classifier [9, 10]. The working of the proposed model is explained in brief below.

Fig. 2 System design

4.1 Input

The input to the model is real-time classroom video of the students. The camera should be placed in the classroom so that it captures all the students and their faces effectively. It needs to be connected to the computer system, through either a wired or a wireless network, for further processing.

4.2 Image Pre-Processing

In this phase, frames are created from videos. For this work, a sample video was taken from the internet and split into frames. The duration of the sample video was 0.34 s, and frames were created at a rate of one frame per 0.07 s.
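The sampling schedule implied by these figures can be sketched as follows. This is a minimal illustration and not the implementation used in this work: the function name `frame_timestamps` is hypothetical, the first frame is assumed to be taken at t = 0, and reading the actual frames from the video file is left out.

```python
def frame_timestamps(duration_s, interval_s):
    """Return the times (in seconds) at which frames are sampled.

    Assumption: a frame is taken at t = 0 and then every `interval_s`
    seconds, as long as the sample time falls strictly inside the clip.
    """
    if duration_s <= 0 or interval_s <= 0:
        raise ValueError("duration and interval must be positive")
    timestamps = []
    index = 0
    # Multiply by an integer index instead of repeatedly adding the
    # interval, so floating-point error does not accumulate.
    while index * interval_s < duration_s - 1e-9:
        timestamps.append(round(index * interval_s, 6))
        index += 1
    return timestamps


# Values reported in the text: a 0.34 s clip, one frame per 0.07 s.
times = frame_timestamps(0.34, 0.07)
print(len(times), times)
```

Under these assumptions, the sample clip yields five frames, at 0.00, 0.07, 0.14, 0.21, and 0.28 s.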

4.3 Image Processing

In this phase, faces are detected in the images and face embeddings are created from the detected faces. These features are used to recognize the faces of the people in order to mark their attendance. In the proposed system, MTCNN is used to create the face detector and the FaceNet model is used to create a face embedding for each detected face. An SVM classifier is then used to predict the name for a given face [4]. The sub-steps are explained in brief below:

Face Detection. Face detection is the process of identifying and extracting the faces from images. MTCNN, a state-of-the-art deep learning model for face detection described in [10, 11], is used for this purpose. It localizes the faces in an image and creates bounding boxes around their extent. The first step is to detect the faces in the images and reduce the dataset to a set of faces. For this, each image is loaded as a NumPy array and converted to RGB if it is black and white. The MTCNN face detector is then created and applied to the image, and a new dataset of detected faces is obtained [4]. Figure 3 depicts the reduced dataset: the faces are detected in each image of one class and the dataset is reduced to a series of faces only. The face images are resized to 160 × 160 × 3, which is the input shape of FaceNet. All the images in the train and test sets are loaded and their faces extracted in this way.
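The image handling around the detector (conversion to RGB, cropping to the bounding box, and resizing to the FaceNet input size) can be sketched as below. This is an illustrative sketch only: nested Python lists stand in for NumPy arrays, the detector itself is not called, the bounding box is a made-up value in (x, y, width, height) form, and nearest-neighbour resizing is an assumption, since the interpolation method is not stated here. All function names are hypothetical.

```python
def to_rgb(image):
    """Convert a grayscale image (rows of ints) to RGB (rows of 3-tuples).

    Images whose pixels are already tuples or lists are returned unchanged.
    """
    first_pixel = image[0][0]
    if isinstance(first_pixel, (tuple, list)):
        return image
    return [[(value, value, value) for value in row] for row in image]


def crop_face(image, box):
    """Crop a face given a bounding box (x, y, width, height).

    The box is clamped to the image borders, in case a detector reports
    corners that fall slightly outside the frame.
    """
    height, width = len(image), len(image[0])
    x, y, w, h = box
    x1, y1 = max(0, x), max(0, y)
    x2, y2 = min(width, x + w), min(height, y + h)
    if x1 >= x2 or y1 >= y2:
        raise ValueError("bounding box lies outside the image")
    return [row[x1:x2] for row in image[y1:y2]]


def resize_nearest(image, out_height, out_width):
    """Resize by nearest-neighbour sampling (assumed interpolation)."""
    in_height, in_width = len(image), len(image[0])
    return [
        [image[r * in_height // out_height][c * in_width // out_width]
         for c in range(out_width)]
        for r in range(out_height)
    ]


# A tiny 4 x 4 grayscale image and a hypothetical detector box.
gray = [[r * 4 + c for c in range(4)] for r in range(4)]
rgb = to_rgb(gray)
face = resize_nearest(crop_face(rgb, (-1, 1, 3, 2)), 160, 160)
print(len(face), len(face[0]), len(face[0][0]))
```

The final print reports the shape of the resized face, which matches the 160 × 160 × 3 input shape mentioned above.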

Fig. 3 Reduced dataset

Feature Extraction. In this step, face embeddings are created. A face embedding is a vector that represents the features extracted from a face [4]. It can then be compared with the vectors created for other faces. The FaceNet model creates the face embedding for a face [12] and returns the embeddings for the train and test datasets. Each face embedding is a vector of 128 elements.
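To illustrate what comparing embeddings means, the sketch below measures the Euclidean distance between vectors and picks the closest known face. It is only an illustration of vector comparison: the proposed system classifies embeddings with an SVM rather than by nearest neighbour, the 4-element vectors are toy stand-ins for 128-element embeddings, and the names `gallery`, `person_a`, and `person_b` are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two embeddings of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_identity(query, gallery):
    """Return the name in `gallery` whose embedding is closest to `query`."""
    return min(gallery, key=lambda name: euclidean_distance(query, gallery[name]))


# Toy embeddings (real FaceNet embeddings have 128 elements).
gallery = {
    "person_a": [0.9, 0.1, 0.0, 0.2],
    "person_b": [0.1, 0.8, 0.3, 0.0],
}
query = [0.85, 0.15, 0.05, 0.2]
print(nearest_identity(query, gallery))
```

Embeddings of the same person are expected to lie close together, which is what makes a simple classifier on top of them workable.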

4.4 Face Classification

The next step is to develop a model for classifying the face embeddings. First, the face embedding vectors are normalized. An SVM is then trained on the embeddings, because SVMs are efficient at separating face embedding vectors [4]. Next, the model is evaluated and the classification accuracy is calculated.
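The normalization step can be sketched as follows, assuming L2 (unit-length) normalization. The type of normalization is not specified in the text, so this choice is an assumption, and the function name `l2_normalize` is hypothetical.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit Euclidean length (assumed normalization)."""
    norm = math.sqrt(sum(v * v for v in vector))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [v / norm for v in vector]


# Toy 2-element vector; a real embedding would have 128 elements.
unit = l2_normalize([3.0, 4.0])
print(unit, math.sqrt(sum(v * v for v in unit)))
```

After this step every embedding has length 1, so the classifier separates faces by the direction of the vector rather than by its magnitude.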

4.5 Testing

For testing, we selected a random image from the test dataset. Its face embedding is used as input to make a prediction. The expected class name, the predicted class name, and the probability of the prediction are obtained.
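One way the predicted names and probabilities could feed an attendance register is sketched below. This is a hypothetical illustration, not part of the reported experiments: the function `mark_attendance`, the roster names, and the probability threshold of 0.5 are all assumptions.

```python
def mark_attendance(roster, predictions, min_probability=0.5):
    """Return {student: "Present" or "Absent"} for one session.

    `predictions` is a list of (predicted_name, probability) pairs, one
    per recognized face. The threshold is an assumed safeguard against
    low-confidence matches.
    """
    register = {student: "Absent" for student in roster}
    for name, probability in predictions:
        if name in register and probability >= min_probability:
            register[name] = "Present"
    return register


# Hypothetical roster and classifier output for one frame.
roster = ["student_a", "student_b", "student_c"]
predictions = [("student_a", 0.93), ("student_c", 0.41)]
print(mark_attendance(roster, predictions))
```

Here a student is marked present only when a face is recognized as theirs with sufficient probability; everyone else on the roster stays absent.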

5 Experimental Results

The dataset used here is the LFW dataset [13]. For this work, 15 classes of image data were taken from LFW to implement the system. The data were divided into a train set and a test set, with 80% of the images used for training and 20% for testing. Table 1 describes the classification accuracy obtained with different numbers of classes. With 15 classes, the system produced a classification accuracy of 99.177% on the train set and 100% on the test set. With 10 classes, it produced an accuracy of 99.99% on the train set and 100% on the test set.
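An 80/20 split of this kind can be sketched as below. Splitting each class separately and shuffling with a fixed seed are assumptions made for illustration; the text does not say how the images were assigned to the two sets. The function name and file names are hypothetical.

```python
import random


def split_per_class(images_by_class, train_fraction=0.8, seed=0):
    """Split each class into train and test parts (assumed per-class split)."""
    rng = random.Random(seed)
    train, test = [], []
    for label in sorted(images_by_class):
        images = list(images_by_class[label])
        rng.shuffle(images)
        cut = round(len(images) * train_fraction)
        train.extend((image, label) for image in images[:cut])
        test.extend((image, label) for image in images[cut:])
    return train, test


# Hypothetical file names; two classes with ten images each.
data = {
    "person_a": [f"a_{i}.jpg" for i in range(10)],
    "person_b": [f"b_{i}.jpg" for i in range(10)],
}
train_set, test_set = split_per_class(data)
print(len(train_set), len(test_set))
```

Splitting per class keeps every person represented in both sets, which matters when some classes have only a few images.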

Table 1 Classification accuracy

FaceNet extracts high-quality features from faces and creates face embeddings. For prediction, a random image was selected from the test set, and the system correctly recognized the face in it. The proposed system achieved good accuracy in recognizing faces taken at random from the test set [9]. Figure 4 depicts the face recognized in the random test image, plotted along with its expected name, predicted name, and probability. The accuracy, precision, recall, and F1-score of the system were also computed. The per-class precision, recall, and F1-score are shown in Table 2. The model achieved an accuracy_score of 1.
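For reference, the metrics named above can be computed from expected and predicted labels as in the following sketch. The labels are toy values chosen to show the arithmetic, not results from this work, and the function names are hypothetical.

```python
def accuracy(expected, predicted):
    """Fraction of samples whose predicted label equals the expected one."""
    return sum(e == p for e, p in zip(expected, predicted)) / len(expected)


def per_class_report(expected, predicted):
    """Return {label: {"precision", "recall", "f1"}} for every label seen."""
    report = {}
    pairs = list(zip(expected, predicted))
    for label in sorted(set(expected) | set(predicted)):
        tp = sum(1 for e, p in pairs if e == label and p == label)
        fp = sum(1 for e, p in pairs if e != label and p == label)
        fn = sum(1 for e, p in pairs if e == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return report


# Toy labels: one of the four samples is misclassified.
expected = ["a", "a", "b", "b"]
predicted = ["a", "b", "b", "b"]
print(accuracy(expected, predicted))
print(per_class_report(expected, predicted))
```

When every prediction matches its expected label, as in the reported test result, the accuracy and all per-class scores equal 1.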

Fig. 4 Prediction done by SVM classifier [13]

Table 2 Classification performance

6 Conclusion and Future Work

An automatic attendance system has been proposed to reduce the drawbacks of traditional (manual) systems. The proposed system produced good results on the datasets used. It saves time, especially during a lecture with a large number of pupils, and illustrates the use of image processing techniques in a classroom. As future work, the system can be improved by implementing it in real time: capturing real-time videos, detecting faces in the frames created from them, recognizing the faces, and then marking the attendance.

Sentiment analysis and emotion analysis can also be implemented in the future. Sentiment analysis would provide feedback on a class by showing which topics, and which times, attract more of the pupils' concentration. Emotion analysis would help faculty members obtain feedback on their classes and so change or improve their pedagogy. These developments can broaden the applications of this work.