1 Introduction

Facial image processing has received considerable interest over the last several decades, and facial image based age synthesis and estimation have become active topics in recent years because of their emerging real-world applications, such as forensic art, electronic customer relationship management, security control and surveillance monitoring, cosmetology, entertainment and biometrics.

Many researchers work in the area of facial image processing, but only a small number study the modeling of aging effects on facial images. The reason is that age estimation is much more complicated than recognizing other attributes such as gender, facial expression and ethnicity. Furthermore, facial aging effects display some unique characteristics [12]. Estimating age is fundamentally difficult; even humans have difficulty determining a person’s age correctly. The main difficulties are: (1) Age estimation is not a standard classification problem. It can be treated either as a multi-class classification problem or as a regression problem. (2) A large aging database, especially the chronometrical image series of an individual, is often hard to collect. (3) Age progression displayed on faces is uncontrollable and personalized.

2 Related Work

Existing age estimation systems typically consist of an age image representation module and an age estimation module. Age image representation techniques are often based on shape-based and texture-based features extracted from facial images. They can be grouped into anthropometric models, Active Appearance Models (AAM), AGing pattErn Subspace (AGES), age manifold and appearance models. Age estimation can then be performed with age group classification or regression methods.

The earliest published work on age classification from facial images is that of Kwon and Lobo [15]. They computed six ratios of distances on frontal images to separate babies from adults, and used wrinkle information to separate young adults from senior adults. Their experiments used a very small database containing 45 facial images.

AAM is a statistical face model proposed initially in [5] for facial image coding. A statistical shape model and an intensity model are learned separately from training images and combined using Principal Component Analysis (PCA). Lanitis et al. [16] extended AAMs to face aging by proposing an aging function \(age=f(b)\) to explain the variation in age. AAM-based approaches consider both shape and texture rather than only the facial geometry used by anthropometric model based methods. However, they have to deal with each aging face image separately.

Geng et al. [11] proposed a method called AGES, which defines an aging pattern as a sequence of face images of the same person sorted in temporal order. A specific aging pattern is then learned for each individual. AGES can synthesize the missing age images by using an EM-like iterative learning algorithm. The reported Mean Absolute Error (MAE) is 6.77 years on the FG-NET [6] database when the algorithm is tested in Leave-One-Person-Out (LOPO) mode. They also used the MORPH [18] database in their experiments, but only to test the algorithms trained on the FG-NET database. AGES achieves an MAE of 8.83 years on the MORPH database [12].

Instead of learning a specific aging pattern for each individual as in AGES, age manifold methods learn a common aging trend or pattern from many individuals at different ages. This kind of aging pattern learning makes face aging representation very flexible. One way to learn the common aging pattern is the age manifold, which uses a manifold embedding technique to learn the low-dimensional aging trend from many face images at each age [4, 7, 8, 13].

Appearance models focus mainly on aging-related facial feature extraction. Both global and local features have been used in existing age estimation systems [1, 9, 10, 14].

As the previous work shows, many methods have been proposed for age estimation, and most of them are evaluated on the FG-NET Aging database. In this study we use the Radon transform for age estimation for the first time and run experiments on three databases: the FG-NET Aging database [6], the MORPH database [18] and the FERET database [17]. The results show that Radon features are effective for age estimation on all three databases.

This paper is organized as follows: Sect. 3 introduces the proposed method for age estimation, including the preprocessing, feature extraction, dimensionality reduction and regression modules, each described in its own subsection. Section 4 describes the databases used in the experiments. Section 5 discusses the experimental results and, finally, Sect. 6 concludes the paper.

3 Proposed Method

In this paper we propose a new age estimation method that uses local features of facial images. The local features are extracted using the regional Radon transform of facial images. The method consists of four modules: preprocessing, feature extraction with the Radon transform, dimensionality reduction with PCA and age estimation with multiple linear regression. These modules are explained in the following sections.

3.1 Preprocessing

In the preprocessing module, the facial images are cropped, scaled and transformed to the size of 88 \(\times \) 88, based on the eye center locations. Examples from all databases are given in Fig. 1.

Fig. 1 Samples from databases. (a) MORPH (b) FG-NET (c) FERET
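
The eye-based alignment of Sect. 3.1 can be sketched as follows. This is only an illustration under stated assumptions, not the implementation used in this study: the function name `align_face`, the target eye layout (`eye_row`, `eye_gap`) and the nearest-neighbour sampling are all hypothetical, since the text specifies only that images are cropped and scaled to 88 \(\times \) 88 based on the eye center locations.

```python
import numpy as np

def align_face(image, left_eye, right_eye, size=88, eye_row=0.35, eye_gap=0.5):
    """Warp a grayscale image so the eye centers land on fixed positions.

    left_eye, right_eye: (x, y) pixel coordinates in the input image.
    eye_row, eye_gap: hypothetical layout parameters (fractions of `size`);
    the source does not say where the eyes sit inside the 88 x 88 crop.
    """
    # Points are stored as complex numbers x + iy, so a single complex
    # factor encodes rotation and uniform scaling together.
    t_left = complex(size * (0.5 - eye_gap / 2), size * eye_row)
    t_right = complex(size * (0.5 + eye_gap / 2), size * eye_row)
    s_left, s_right = complex(*left_eye), complex(*right_eye)
    z = (s_right - s_left) / (t_right - t_left)  # maps output offsets to input offsets

    ys, xs = np.mgrid[0:size, 0:size]
    src = s_left + z * ((xs + 1j * ys) - t_left)

    # Nearest-neighbour sampling, clamped to the image border.
    cols = np.clip(np.rint(src.real).astype(int), 0, image.shape[1] - 1)
    rows = np.clip(np.rint(src.imag).astype(int), 0, image.shape[0] - 1)
    return image[rows, cols]
```

With the default layout, the left eye maps to column 22 and the right eye to column 66 of the output, so the inter-eye distance is always 44 pixels regardless of the input resolution.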

3.2 Feature Extraction with Radon Transform

The Radon transform [3] computes projections of an image matrix along specified directions. Applying the Radon transform to an image \(f(x,y)\) for a given set of angles can be thought of as computing the projection of the image along those angles. Each projection value is the sum of the intensities of the pixels along a line in the given direction, i.e. a line integral [2]. The result is a new image \(R(s,\alpha )\). The Radon transform of an image can be written as

$$\begin{aligned} R(s,\alpha )=\int \limits _{{-\infty }}^\infty \int \limits _{{-\infty }}^\infty f(x,y)\delta (s-x\ \mathrm{cos}\ \alpha -y\ \mathrm{sin}\ \alpha )dxdy \end{aligned}$$
(1)

where \(R(s,\alpha )\) is the line integral of a 2-D function \(f(x,y)\) along a line extending from \(-\infty \) to \(\infty \). The position of the line is determined by two parameters, \(s\) and \(\alpha \). Essentially, \(R(s,\alpha )\) is the integral of \(f\) over the line \(s=x \cos \alpha +y\sin \alpha \). In its discrete form, the Radon transform is a summation of pixel intensities along lines of different directions. The calculation of Radon coefficients is visualized in Fig. 2.

Fig. 2 Radon transform
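
As a quick check of Eq. (1), consider the two axis-aligned directions. By the sifting property of the delta function, the transform reduces to integrals along vertical and horizontal lines, which in the discrete case correspond to the column sums and row sums of the image:

$$\begin{aligned} R(s,0)=\int \limits _{-\infty }^\infty f(s,y)\,dy, \qquad R(s,\pi /2)=\int \limits _{-\infty }^\infty f(x,s)\,dx \end{aligned}$$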

The Radon image carries more geometric information than the original pixel image. For age estimation, regional texture information is more informative than global texture information. Consequently, we take the Radon projections at \(\theta =k\pi /6\), where \(k= 0, 1, 2, 3, 4, 5\) and \(\theta \) is the projection angle denoted \(\alpha \) in Eq. (1), from local image regions and concatenate them into a single feature vector. The regional Radon transform produces a feature vector of dimension 3,920 for \(4\times 4\) regions.
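
A minimal sketch of regional Radon feature extraction is given below. It assumes a simple discretization in which each pixel is accumulated into the nearest bin of \(s=x\cos \alpha +y\sin \alpha \), measured from the region centre; the text does not specify the discretization used, and the names `radon_projection` and `regional_radon_features` are hypothetical. Because the number of bins per projection depends on this choice, the length of the resulting vector will generally differ from the 3,920 reported above.

```python
import numpy as np

def radon_projection(region, angle):
    """Discrete projection of a 2-D array: sum intensities over bins of s."""
    h, w = region.shape
    ys, xs = np.mgrid[0:h, 0:w]
    xs = xs - (w - 1) / 2.0  # coordinates relative to the region centre
    ys = ys - (h - 1) / 2.0
    s = xs * np.cos(angle) + ys * np.sin(angle)
    # |s| never exceeds the half diagonal, so this bin count fits every
    # angle and all projections of a region have the same length.
    half = int(np.ceil(np.hypot((w - 1) / 2.0, (h - 1) / 2.0)))
    idx = np.clip(np.floor(s + 0.5).astype(int) + half, 0, 2 * half)
    return np.bincount(idx.ravel(), weights=region.ravel().astype(float),
                       minlength=2 * half + 1)

def regional_radon_features(image, grid=4, n_angles=6):
    """Concatenate projections at k*pi/n_angles from a grid x grid partition."""
    angles = [k * np.pi / n_angles for k in range(n_angles)]
    h, w = image.shape
    rh, rw = h // grid, w // grid
    parts = []
    for i in range(grid):
        for j in range(grid):
            region = image[i * rh:(i + 1) * rh, j * rw:(j + 1) * rw]
            for angle in angles:
                parts.append(radon_projection(region, angle))
    return np.concatenate(parts)
```

For an 88 \(\times \) 88 input, each of the 16 regions is 22 \(\times \) 22 pixels and contributes six projections. Every projection sums to the total intensity of its region, which is a convenient sanity check on the binning.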

3.3 Dimensionality Reduction

After the feature extraction module, Principal Component Analysis (PCA) is performed in order to find a lower dimensional subspace which carries significant information for age estimation. Then high-dimensional feature vectors are projected onto a low-dimensional subspace in order to improve the efficiency. Using this technique the \(p\)-dimensional feature vector \(x\) is transformed into a \(d\)-dimensional vector \(y\) with \(d<p\).

The PCA method finds the embedding that maximizes the projected variance, \(p=\mathrm {arg\,max}_{\Vert p\Vert =1}\, p^T Sp\), where \(S=\sum \nolimits ^{n}_{i=1}(x_i-{\bar{x}})(x_i-{\bar{x}})^T\) is the scatter matrix and \({\bar{x}}\) is the mean vector of \(\{x_i\}^n_{i=1}\). The solution of this problem is given by the \(d\) eigenvectors associated with the \(d\) largest eigenvalues of the scatter matrix. Once the projection subspace is determined, the training and testing images are projected onto it, which reduces their dimensionality.
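
The eigenvector solution described above can be sketched in a few lines. The helper names `fit_pca` and `project` are hypothetical, and the sketch forms the scatter matrix explicitly for clarity; it is not necessarily how PCA was computed in this study.

```python
import numpy as np

def fit_pca(X, d):
    """X is an (n, p) matrix of feature vectors; returns the mean and a (p, d) basis."""
    mean = X.mean(axis=0)
    centered = X - mean
    scatter = centered.T @ centered             # S = sum_i (x_i - mean)(x_i - mean)^T
    eigvals, eigvecs = np.linalg.eigh(scatter)  # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:d]       # indices of the d largest eigenvalues
    return mean, eigvecs[:, order]

def project(X, mean, basis):
    """Map p-dimensional vectors to their d-dimensional representation."""
    return (X - mean) @ basis
```

The mean and basis are estimated from the training vectors only and then reused unchanged for the test vectors, so that no test information enters the subspace.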

3.4 Regression

After finding the low-dimensional representation of facial images, we define age estimation as a multiple linear regression problem, \(age=f(M){:}\Leftrightarrow {\hat{L}}={\hat{f}}(Y)\), where \({\hat{L}}\) denotes the estimated age label, \(f(\cdot )\) the unknown regression function, and \({\hat{f}}(\cdot )\) the estimated regression function. The age regression function used in this study is linear, \(\hat{\ell }=\hat{\beta }_0 +\hat{\beta }_1 ^Ty\), where \(\hat{\ell }\) is the estimate of age, \(\hat{\beta }_0 \) is the offset, \(\hat{\beta }_1 \) is the weight vector and \(y\) is the extracted feature vector.
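
The linear aging function can be fitted as sketched below. The sketch assumes an ordinary least squares fit, which the text does not state explicitly, and the function names `fit_linear_age_model` and `predict_age` are hypothetical.

```python
import numpy as np

def fit_linear_age_model(Y, ages):
    """Fit age = beta0 + beta1^T y by least squares; Y is an (n, d) matrix."""
    # A leading column of ones lets the offset beta0 be estimated jointly
    # with the weight vector beta1.
    design = np.hstack([np.ones((Y.shape[0], 1)), Y])
    coef, *_ = np.linalg.lstsq(design, ages, rcond=None)
    return coef[0], coef[1:]

def predict_age(Y, beta0, beta1):
    """Evaluate the fitted aging function on one or more feature vectors."""
    return beta0 + Y @ beta1
```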

4 Databases

The databases used in this study are FG-NET, MORPH and FERET. The Face and Gesture Recognition Research Network (FG-NET) aging database [6] comprises 1,002 images of 82 subjects (6–18 images per subject) in the age range 0–69 years. Since the images were retrieved from real-life albums of different subjects, aspects such as illumination, head pose and facial expression are uncontrolled in this dataset.

The MORPH database [18] is a publicly available face database comprising face images of adults taken at different ages. The database records individuals’ metadata such as age, gender, ethnicity, height and weight, and is organized into two albums. MORPH Album-1 (A1) comprises 1,690 digitized images of 515 individuals in the age range 15–68 years.

The FERET database [17] is a comprehensive database that addresses multiple problems related to face recognition, such as illumination variations, pose variations and facial expressions. The database includes 14,126 images from 1,199 individuals. In this study we use 2,294 frontal-view facial images.

Table 1 The distribution of images in specified age groups

The age distributions of the FG-NET, MORPH and FERET databases are given in Table 1. The table shows that the images are not distributed uniformly across age groups. This imbalance affects the estimation results negatively.

5 Experiments and Results

In the training phase, we take the Radon projections of the training samples at \(\theta =k\pi /6\), where \(k = 0, 1, 2, 3, 4, 5\). We extract Radon features from local image regions and concatenate them into a single feature vector. The regional Radon transform produces a feature vector of dimension 3,920 for \(4\times 4\) regions. We then apply PCA to reduce the dimension of the feature vector. After the dimensionality reduction step, we fit an aging function using multiple linear regression. In the testing phase, the regional Radon features of the test samples are extracted in the same way, and age estimation is performed using the fitted aging function.

The evaluation framework for the FG-NET Aging database is Leave-One-Person-Out (LOPO). That is, in each fold the images of one person are used as the test set and those of the others are used as the training set. After 82 folds, each subject has been used as the test set once, and the final results are calculated from all the estimations. In this way the algorithms are tested under conditions similar to real applications.
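
The LOPO protocol amounts to grouping image indices by subject. A minimal sketch is shown below; the function name `lopo_folds` and the use of a flat list of subject identifiers (one per image) are assumptions made for illustration.

```python
def lopo_folds(subject_ids):
    """Yield (subject, train_indices, test_indices), holding out one person per fold."""
    for subject in sorted(set(subject_ids)):
        test = [i for i, s in enumerate(subject_ids) if s == subject]
        train = [i for i, s in enumerate(subject_ids) if s != subject]
        yield subject, train, test
```

For FG-NET this generator would yield 82 folds, one per subject, and no image of the held-out person ever appears in the corresponding training set.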

For MORPH and FERET we use 5-fold cross validation, in which 1/5 of the images are selected randomly as the test set and the rest are used as the training set. After the 5 folds, the mean over all estimations is taken as the estimation performance of the system.
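
A random 5-fold split can be sketched as follows. The function name `k_fold_indices` and the fixed seed are assumptions; the text does not describe how the random selection was implemented. Unlike LOPO, this split works on individual images, so images of the same person may fall into both the training and the test set.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) for k random, disjoint test folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Taking every k-th shuffled index gives folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]
```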

For performance comparison, we use the Mean Absolute Error (MAE). MAE is defined as the average absolute error between the estimated labels and the ground truth labels:

$$\begin{aligned} MAE=\frac{\sum \nolimits _{i=1}^{N_t } {\left| {\hat{y}_i -y_i } \right|} }{N_t } \end{aligned}$$
(2)

where \(\hat{y}_i \) is the estimated age for the \(i\)th testing sample, \(y_i \) is the corresponding ground truth, and \(N_t \) is the total number of testing samples. The estimation results of conventional methods and the proposed method for the FG-NET and MORPH databases are listed in Table 2. Table 2 shows that the proposed method achieves better results than conventional methods such as WAS, AAS and KNN on these databases.
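
Equation (2) translates directly into code. The sketch below uses a hypothetical function name, `mean_absolute_error`, and made-up ages purely to illustrate the arithmetic.

```python
def mean_absolute_error(estimated, truth):
    """Average of |estimated_i - truth_i| over all test samples (Eq. 2)."""
    if len(estimated) != len(truth) or not truth:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(e - t) for e, t in zip(estimated, truth)) / len(truth)

# Illustrative values only: absolute errors of 2, 0 and 4 years average to 2.0.
example_mae = mean_absolute_error([30, 25, 41], [28, 25, 45])
```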

Table 2 The comparison of estimation results (MAE) on FG-NET and MORPH databases
Table 3 The comparison of estimation results on FERET database

Finally, the performance of Radon features on the FERET database is given in Table 3. Few studies have reported results on the FERET database. In [19], LBP features of facial images are used, and the classification error rate for 3 age classes (child, youth and oldness) is 7.88 %. In this study we treat age estimation as a regression problem and achieve an MAE of 6.98 on the FERET database.

6 Conclusion

In this paper, we have presented an age estimation method that uses the regional Radon transform for age-related feature extraction from facial images. The contribution of this paper is the use of regional Radon features for age estimation. The Radon features are extracted from local image regions and concatenated into a single vector, so that both global and local geometrical information of the facial image is included in the feature vector. Experimental results on the FG-NET Aging, MORPH and FERET databases show that the proposed method performs better than most conventional methods. Furthermore, our result is slightly better than all age estimation results previously reported on the MORPH database.