1 Introduction

Facial skin wrinkles are not only important features in terms of facial aging but can also provide cues to a person’s lifestyle. For example, facial wrinkles can indicate the history of a person’s expressions (smiling, frowning, etc.) [15], or whether the person has been a smoker [29], or has had sun exposure [35]. Some of the factors influencing facial wrinkles are a person’s lifestyle, overall health, skin care routines, genetic inheritance, ethnicity and gender. Hence, computer-based analysis of facial wrinkles has great potential to exploit this underlying information for relevant applications.

Face analysis is one of the main research problems in computer vision, and facial features such as shape, geometry, eyes, nose and mouth are analyzed in one way or another for different applications. However, research has been lacking in image-based analysis of facial wrinkles specifically. For example, a review of two good survey papers on facial aging analysis [14, 34] points to the absence of wrinkle analysis in facial aging research. As our review suggests, this can most probably be attributed to the following reasons:

Image quality:

Lack of publicly available benchmark aging datasets with high-resolution, high-quality images that clearly depict facial wrinkles.

Age period:

Insufficient coverage of older ages in aging datasets; most of these datasets do not contain enough sample images of subjects aged 40 and above.

Challenges in wrinkle localization:

Even when high-quality images of aged skin are available, facial wrinkles are difficult to localize and hence are not commonly incorporated as curvilinear objects in image analysis algorithms.

Physically, skin wrinkles are 3D features on the skin surface, along with other features such as pores, moles, scars, dark spots and freckles. Most of these features are visible in 2D images due to their color or the particular image intensities they create. Image processing techniques interpret such image components as edges, contours, boundaries, texture, color space, etc. to infer information. The challenge arises because skin wrinkles cannot be assigned strictly to any one of these categories. For example, despite causing image intensity gradients, wrinkles are not continuous like typical edges or contours. Wrinkles cannot be categorized as texture because they do not depict repetitive image patterns, which is the defining characteristic of image textures. Nor can wrinkles be categorized as boundaries between two different textures, since they appear within the skin texture itself. The closest description of how wrinkles appear in a skin image is as irregularities, discontinuities, cracks or sudden changes in the surrounding/background skin texture. A parallel can be drawn between the skin texture discontinuities caused by wrinkles in images and the cracks present in industrial objects like roads, steel slabs, rail tracks, etc. In the case of skin, however, the background texture is usually not as smooth or homogeneous as that of a steel slab or road surface. The granular/rough/irregular 3D surface of skin appears as nonuniform or inhomogeneous image texture, making it more difficult to localize wrinkles in the surrounding skin texture. Although a framework based on 3D analysis of the skin surface would be better suited to drawing conclusions from facial wrinkles, such setups are not readily available for frequent use.

In this chapter, we focus on research conducted on the analysis of facial wrinkles for applications in computer vision and leave out those in computer graphics. This research can be loosely categorized as following one of the two approaches. In the first and relatively more popular approach, wrinkles are considered as so-called ‘aging skin texture’ and analyzed as image texture or intensity features. In the second approach, wrinkles are analyzed as curvilinear objects, localized automatically or hand-drawn. Figure 1 depicts a block diagram of the two approaches. Each approach starts with an analysis of input image to obtain image features which can be simple image intensity values or image features obtained after some sort of filtering. Then, in texture-based approaches, image features are analyzed directly as illustrated by path ‘B’ in the diagram. In approaches based on wrinkles as curvilinear objects, an intermediate step is included in path ‘A’ for the extraction of curvilinear objects or localization of wrinkles before any other analysis. In Sect. 3 we will review work following the first approach, incorporating wrinkles as image texture, and in Sect. 4 we will review work following the second approach, incorporating wrinkles as curvilinear objects. However, first of all, we will mention early work on image-based analysis of facial wrinkles. Then, in Sect. 2, we will review briefly image filtering techniques applied to highlight intensity gradients caused by wrinkles. Table 1 presents a summary of the work reviewed in this chapter, the corresponding analysis approaches and applications.

Fig. 1 A block diagram of the two approaches commonly employed to analyze facial aging

Table 1 Summary of research work reviewed in this chapter

Earlier Work As mentioned earlier, modeling of facial wrinkles and finer skin texture has been done commonly in computer graphics to obtain more realistic appearances of skin features. Specifically, significant efforts have been reported on photo-realistic and real-time rendering of skin texture and wrinkles on 3D animated objects. This work typically follows the main approach of generating a pattern of skin texture/wrinkles based on some learned model and then rendering the resulting texture on 3D objects. Hence, most of the earlier work focused on developing generic skin models for 3D rendering. Research focusing on other applications includes the work by Kwon and da Vitoria Lobo [21, 22] on localization of wrinkles for age determination (described in detail in Sect. 4.2). Magnenat-Thalmann et al. [25] and Wu et al. [43] presented a computational model for studying the mechanical properties of skin with aging manifested as wrinkles. The model was intended to analyze different characteristics of wrinkles such as location, number, density, cross-sectional shape, and amplitude, as a consequence of skin deformation caused by muscle actions. After analyzing skincare industry data, Boissieux et al. [6] presented eight basic wrinkle masks for aging faces corresponding to different gender, shape of the face and smiling history. Figure 2 illustrates the eight patterns included in their work.

Fig. 2 Eight basic wrinkle masks corresponding to different gender, shape of the face and smiling history (reproduced from [6])

Cula et al. [11] presented a novel skin imaging method called bidirectional imaging based on quantitatively controlled image acquisition settings. The proposed imaging setup was shown to capture significantly more properties of skin appearance than standard imaging. The observed structure of skin surface and its appearance were modeled as a bidirectional function of the angles of incident light, illumination and observation. The enhanced observations about skin structure were shown to improve results for dermatological applications. Figure 3 depicts the variations in the appearance of a skin patch due to different illumination angles.

Fig. 3 A skin patch imaged using different illumination angles (reproduced from [11])

2 Image Features for Aging Skin Texture

In this section, we review image filtering techniques commonly applied to highlight intensity gradients caused by wrinkles as well as image features based on aging appearance and texture. Most of the applications reviewed in the later sections make use of one or more of these features.

Laplacian of Gaussian The Laplacian is a 2-D isotropic measure of the second spatial derivative of an image. The Laplacian of an image highlights regions of rapid intensity change and is therefore often used for edge detection (e.g. zero crossing edge detectors). Since image operators approximating a second derivative measurement are very sensitive to noise, the Laplacian is often applied to an image that has first been smoothed with something approximating a Gaussian smoothing filter in order to reduce its sensitivity to noise, and hence, when combined, the two variants can be described together as Laplacian of Gaussian operator. The operator normally takes a single gray level image as input and produces another gray level image as output. Because of the associativity of the convolution operation, the Gaussian smoothing filter can be convolved with the Laplacian filter first, and then convolved with the image to achieve the required result. The 2D LoG function centered on zero and with Gaussian standard deviation \(\sigma\) has the form:

$$\displaystyle{ LoG(x,y;\sigma ) = -\frac{1} {\pi \sigma ^{4}} \left [1 -\frac{x^{2} + y^{2}} {2\sigma ^{2}} \right ]\exp (-\frac{x^{2} + y^{2}} {2\sigma ^{2}} ). }$$
(1)
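As a minimal sketch of how Eq. (1) can be turned into a discrete filter, the following Python snippet samples the LoG function on an integer grid. The function name `log_kernel` and the truncation of the kernel at three standard deviations are our assumptions for illustration; they are not prescribed by the formula.

```python
import math


def log_kernel(sigma, radius=None):
    """Sample Eq. (1) on a (2*radius+1) x (2*radius+1) grid centred on zero."""
    if radius is None:
        # Assumed truncation: keep samples within three standard deviations.
        radius = int(math.ceil(3 * sigma))
    kernel = []
    for y in range(-radius, radius + 1):
        row = []
        for x in range(-radius, radius + 1):
            r2 = (x * x + y * y) / (2.0 * sigma ** 2)
            value = -1.0 / (math.pi * sigma ** 4) * (1.0 - r2) * math.exp(-r2)
            row.append(value)
        kernel.append(row)
    return kernel


kernel = log_kernel(1.0)
print(len(kernel), kernel[3][3])  # kernel size and the value at the centre
```

The sampled kernel is negative at the centre and changes sign on the circle \(x^{2} + y^{2} = 2\sigma ^{2}\), which is the behaviour exploited by zero-crossing edge detectors. In practice the kernel would be convolved with a gray level image; the convolution step is omitted here.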

Hessian Filter The Hessian is a square matrix of second-order partial derivatives and is capable of capturing local structure in images. The eigenvalues of the Hessian matrix evaluated at each image point quantify the rate of change of the gradient field in various directions. A small eigenvalue indicates a low rate of change in the field in the corresponding eigen-direction, and vice versa. The Hessian matrix H of the input image I, consisting of second-order partial derivatives at scale \(\sigma\), is given as:

$$\displaystyle{ \mathbf{H} = \left [\begin{array}{*{10}c} \frac{\partial ^{2}I} {\partial x^{2}} & \frac{\partial ^{2}I} {\partial x\partial y} \\ \frac{\partial ^{2}I} {\partial y\partial x} & \frac{\partial ^{2}I} {\partial y^{2}} \end{array} \right ] = \left [\begin{array}{*{10}c} \mathbf{H}_{a}&\mathbf{H}_{b} \\ \mathbf{H}_{b} &\mathbf{H}_{c} \end{array} \right ]. }$$
(2)

In order to extract the eigen-directions into which a local structure of the image is decomposed, the eigenvalues \(\lambda _{1},\lambda _{2}\) of the Hessian matrix are computed as:

$$\displaystyle\begin{array}{rcl} \lambda _{1}(x,y;\sigma ) = \frac{1} {2}\left [H_{a} + H_{c} + \sqrt{(H_{a } - H_{c } )^{2 } + 4H_{b }^{2}}\right ],& & \\ \lambda _{2}(x,y;\sigma ) = \frac{1} {2}\left [H_{a} + H_{c} -\sqrt{(H_{a } - H_{c } )^{2 } + 4H_{b }^{2}}\right ].& &{}\end{array}$$
(3)

Different Hessian filters vary in the way the eigenvalues are analyzed to test a hypothesis about image structure. For example, to determine whether a pixel corresponded to a facial wrinkle or not, Ng et al. [27] (described in Sect. 4) defined the following similarity measures R and S to test the hypotheses:

$$\displaystyle\begin{array}{rcl} R(x,y;\sigma ) = \left (\frac{\lambda _{1}} {\lambda _{2}}\right )^{2},& & \\ S(x,y;\sigma ) =\lambda _{ 1}^{2} +\lambda _{ 2}^{2}.& &{}\end{array}$$
(4)
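The computation of the eigenvalues and of the measures R and S at a single pixel can be sketched as follows. The snippet uses the standard closed form for the eigenvalues of a symmetric 2 × 2 matrix, whose discriminant is \((H_{a} - H_{c})^{2} + 4H_{b}^{2}\). The function names, and the choice of returning infinity for R when \(\lambda _{2} = 0\), are our assumptions; how the second derivatives at scale \(\sigma\) are obtained (e.g. by Gaussian derivative filtering) is left out.

```python
import math


def hessian_eigenvalues(h_a, h_b, h_c):
    """Eigenvalues of the symmetric matrix [[h_a, h_b], [h_b, h_c]]."""
    root = math.sqrt((h_a - h_c) ** 2 + 4.0 * h_b ** 2)
    lam1 = 0.5 * (h_a + h_c + root)
    lam2 = 0.5 * (h_a + h_c - root)
    return lam1, lam2


def similarity_measures(lam1, lam2):
    """Measures R and S of Eq. (4) from the two eigenvalues."""
    # Assumed convention: R is reported as infinity when lam2 vanishes.
    r = (lam1 / lam2) ** 2 if lam2 != 0.0 else math.inf
    s = lam1 ** 2 + lam2 ** 2
    return r, s


lam1, lam2 = hessian_eigenvalues(2.0, 1.0, 2.0)
print(lam1, lam2, similarity_measures(lam1, lam2))
```

For the example matrix with \(H_{a} = H_{c} = 2\) and \(H_{b} = 1\), the eigenvalues are 3 and 1, so R = 9 and S = 10.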

Steerable Filter Bank Freeman and Adelson proposed a steerable filter [12, 17] to detect local orientation of edges. For any arbitrary orientation, a steerable filter can be generated from a linear combination of basis filters where the basis filter set for a pixel p is given by:

$$\displaystyle{ \mathbf{G}(p) = \left [\frac{\partial ^{2}g(p)} {\partial x^{2}},\; \frac{\partial ^{2}g(p)} {\partial x\partial y},\; \frac{\partial ^{2}g(p)} {\partial y^{2}} \right ]^{T}, }$$
(5)

where g(p) denotes a Gaussian function on \(\mathbb{R}^{2}\), the most commonly used example of steerable filters. Let the interpolating function of orientation \(\theta\) be given as:

$$\displaystyle{ \mathbf{k}(\theta ) = \left [\cos ^{2}\theta,\; -\sin 2\theta,\; \sin ^{2}\theta \right ]^{T}. }$$
(6)

Then the steerable filter associated with the orientation \(\theta\) can be obtained as \(g_{\theta }(p) = \mathbf{k}^{T}\mathbf{G}(p)\) and can be used to extract image structure in that orientation using convolution.
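A minimal sketch of this construction is given below. It reads \(\mathbf{G}(p)\) as the triple of second-order derivatives of a 2D Gaussian and \(\mathbf{k}(\theta )\) as the three weights \(\cos ^{2}\theta\), \(-\sin 2\theta\) and \(\sin ^{2}\theta\), so that \(\mathbf{k}^{T}\mathbf{G}(p)\) is a scalar kernel value at each position. The function names, the unit-volume normalization of the Gaussian and the kernel radius are our assumptions.

```python
import math


def gaussian_second_derivs(x, y, sigma):
    """Second-order partial derivatives (g_xx, g_xy, g_yy) of a 2D Gaussian."""
    g = math.exp(-(x * x + y * y) / (2.0 * sigma ** 2)) / (2.0 * math.pi * sigma ** 2)
    g_xx = (x * x / sigma ** 4 - 1.0 / sigma ** 2) * g
    g_xy = (x * y / sigma ** 4) * g
    g_yy = (y * y / sigma ** 4 - 1.0 / sigma ** 2) * g
    return g_xx, g_xy, g_yy


def steered_kernel(theta, sigma=1.0, radius=3):
    """Kernel k(theta)^T G(p) sampled on an integer grid centred on zero."""
    weights = (math.cos(theta) ** 2, -math.sin(2.0 * theta), math.sin(theta) ** 2)
    kernel = []
    for y in range(-radius, radius + 1):
        row = []
        for x in range(-radius, radius + 1):
            basis = gaussian_second_derivs(x, y, sigma)
            row.append(sum(w * b for w, b in zip(weights, basis)))
        kernel.append(row)
    return kernel


horizontal = steered_kernel(0.0)          # reduces to the g_xx basis filter
vertical = steered_kernel(math.pi / 2.0)  # reduces to the g_yy basis filter
```

Only the three basis filters need to be convolved with the image; by linearity, the response at any orientation is the same weighted combination of the three basis responses.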

Gabor Filter Bank The Gabor operator is a popular local feature-based descriptor due to its robustness against variation in pose or illumination. The real Gabor filter kernel oriented in a 2D image plane at angle α is given by:

$$\displaystyle{ Gab(x,y) = \frac{1} {2\pi \sigma _{x}\sigma _{y}}\exp \left [\frac{-1} {2} \left ( \frac{x'^{2}} {\sigma _{x}^{2}} + \frac{y'^{2}} {\sigma _{y}^{2}}\right )\right ]\cos (2\pi fx'), }$$
(7)

where

$$\displaystyle{ \left [\begin{array}{*{10}c} x'\\ y' \end{array} \right ] = \left [\begin{array}{*{10}c} \cos \alpha &\sin \alpha \\ -\sin \alpha &\cos \alpha \end{array} \right ]\left [\begin{array}{*{10}c} x\\ y \end{array} \right ]. }$$
(8)

Let \(\{Gab_{k}(x,y),k = 0,\cdots \,,K - 1\}\) denote the set of real Gabor filters oriented at angles \(\alpha _{k} = -\frac{\pi }{2} + \frac{\pi k} {K}\) where K is the total number of equally spaced filters over the angular range \(\left [\frac{-\pi } {2}, \frac{\pi } {2}\right ]\). Then Gabor features can be obtained by convolving this Gabor filter bank with the given image.
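The filter bank can be sketched as follows, using a Gaussian envelope that is quadratic in the rotated coordinates \(x'\) and \(y'\) of Eq. (8). The function names and the default values of f, \(\sigma _{x}\), \(\sigma _{y}\) and the kernel radius are illustrative assumptions only.

```python
import math


def gabor_kernel(alpha, f, sigma_x, sigma_y, radius):
    """Real Gabor kernel oriented at angle alpha, sampled on an integer grid."""
    kernel = []
    for y in range(-radius, radius + 1):
        row = []
        for x in range(-radius, radius + 1):
            # Rotated coordinates of Eq. (8).
            xr = x * math.cos(alpha) + y * math.sin(alpha)
            yr = -x * math.sin(alpha) + y * math.cos(alpha)
            envelope = math.exp(-0.5 * (xr ** 2 / sigma_x ** 2 + yr ** 2 / sigma_y ** 2))
            carrier = math.cos(2.0 * math.pi * f * xr)
            row.append(envelope * carrier / (2.0 * math.pi * sigma_x * sigma_y))
        kernel.append(row)
    return kernel


def gabor_bank(K, f=0.25, sigma_x=2.0, sigma_y=2.0, radius=4):
    """K kernels at the equally spaced angles alpha_k = -pi/2 + pi*k/K."""
    angles = [-math.pi / 2.0 + math.pi * k / K for k in range(K)]
    kernels = [gabor_kernel(a, f, sigma_x, sigma_y, radius) for a in angles]
    return angles, kernels


angles, bank = gabor_bank(4)
print([round(a, 4) for a in angles])
```

Convolving each kernel in the bank with the image yields K response images; taking, for example, the maximum response over k at each pixel is one common way to highlight oriented structures, although the specific use of the responses varies between the works reviewed here.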

Local Binary Pattern (LBP)

Ojala et al. [28] introduced the Local Binary Patterns (LBPs) to represent local gray-level structures. LBPs have been used widely as powerful texture descriptors. The LBP operator takes a local neighborhood around each pixel, thresholds the pixels of the neighborhood at the value of the central pixel and uses the resulting binary-valued integer number as a local image descriptor. It was originally defined for 3 × 3 pixel neighborhoods, giving 8-bit integer LBP codes based on the eight pixels around the central one. Considering a circular neighborhood denoted by (P, R) where P represents the number of sampling points and R is the radius of the neighborhood, the LBP operator takes the following form:

$$\displaystyle\begin{array}{rcl} f_{(P,R)}(p_{c}) =\sum _{ i=0}^{P-1}s(p_{ i} - p_{c})2^{i},& &{}\end{array}$$
(9)
$$\displaystyle\begin{array}{rcl} \mbox{ where }s(x) = \left [\begin{array}{*{10}c} 1\mbox{ if }x \geq 0\\ 0\mbox{ otherwise} \end{array} \right ],& &{}\end{array}$$
(10)

and \(p_{i}\) is one of the neighboring pixels around the center pixel \(p_{c}\) on a circle or square of radius R. Several extensions of the original operator have been proposed. For example, including LBPs for neighborhoods of different sizes makes it feasible to deal with textures at different scales. Another extension called ‘uniform patterns’ has been proposed to obtain rotationally invariant features from the original LBP binary codes (see [28] for details). The uniformity of an LBP pattern is determined from the total number of bitwise transitions from 0 to 1 or vice versa in the LBP bit pattern when the bit pattern is considered circular. A local binary pattern is called uniform if it has at most 2 bitwise transitions. The uniform LBP patterns are used to characterize patches that contain primitive structural information such as edges and corners. Each uniform pattern, which is also a binary pattern, has a corresponding integer value. The uniform patterns and the corresponding integer values are used to compute LBP histograms where each uniform pattern is represented by a unique bin in the histogram and all the non-uniform patterns are represented by a single bin only. For example, the 58 possible uniform patterns in a neighborhood of 8 sampling points make a histogram of 59 bins where the 59th bin represents the non-uniform patterns. It is common practice to divide an image into sub-images and then use the normalized LBP histograms gathered from each sub-image as image features.
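The basic operator and the uniformity test can be sketched for the original 3 × 3 neighborhood as follows. The function names and the clockwise ordering of the neighbors starting from the top-left pixel are our assumptions; any fixed ordering gives a valid code.

```python
# Assumed neighbour ordering: clockwise, starting at the top-left pixel.
OFFSETS = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]


def lbp_code(patch):
    """8-bit LBP code of the centre pixel of a 3x3 patch, following Eqs. (9)-(10)."""
    centre = patch[1][1]
    code = 0
    for i, (row, col) in enumerate(OFFSETS):
        if patch[row][col] - centre >= 0:
            code += 2 ** i
    return code


def transitions(code, P=8):
    """Number of 0/1 transitions in the circular P-bit pattern."""
    bits = [(code >> i) & 1 for i in range(P)]
    return sum(bits[i] != bits[(i + 1) % P] for i in range(P))


def is_uniform(code, P=8):
    """A pattern is uniform if it has at most two circular transitions."""
    return transitions(code, P) <= 2


uniform_codes = [c for c in range(256) if is_uniform(c)]
print(len(uniform_codes))  # number of uniform patterns for P = 8

edge_patch = [[9, 9, 9],
              [1, 5, 1],
              [1, 1, 1]]
print(lbp_code(edge_patch), is_uniform(lbp_code(edge_patch)))
```

Enumerating all 256 codes recovers the 58 uniform patterns mentioned above. A 59-bin histogram can then be built by giving each uniform code its own bin and sending every other code to the last bin.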

An extension of LBPs, called Local Ternary Patterns (LTP) [41], has also been used in analyzing aging skin textures. LBPs tend to be sensitive to noise, because of the selection of the threshold value to be the same as that of the central pixel, especially in near uniform image regions. LTPs were proposed to introduce robustness to noise in LBPs by introducing a threshold value r other than that of the central pixel. Since many facial regions are relatively uniform, LTPs were shown to produce better results as compared to LBP. An LTP operator is defined as follows:

$$\displaystyle\begin{array}{rcl} f_{(P,R)}^{LTP}(p_{ c}) =\sum _{ i=0}^{P-1}s_{LTP}(p_{ i},p_{c})2^{i},& & \\ \mbox{ where }s_{LTP}(x,p_{c}) = \left [\begin{array}{*{10}c} 1\mbox{ if }x \geq p_{c} + r \\ 0\mbox{ if }\vert x - p_{c}\vert <r \\ -1\mbox{ if }x \leq p_{c} - r \end{array} \right ].& &{}\end{array}$$
(11)

Each ternary pattern is split into positive and negative parts. These two parts are then processed as two separate channels of LBP codes. Each channel is used to calculate LBP histograms from LBP codes and the resulting LBP histograms from two channels are used as image features.
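The split into two binary channels can be sketched as follows for a 3 × 3 neighborhood. The positive channel sets a bit wherever the ternary value is 1 and the negative channel sets a bit wherever it is −1. The function name and the neighbor ordering are our assumptions.

```python
# Assumed neighbour ordering: clockwise, starting at the top-left pixel.
OFFSETS = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]


def ltp_codes(patch, r):
    """Positive and negative LBP-style codes of the centre pixel of a 3x3 patch."""
    centre = patch[1][1]
    positive = 0
    negative = 0
    for i, (row, col) in enumerate(OFFSETS):
        x = patch[row][col]
        if x >= centre + r:      # ternary value +1
            positive += 2 ** i
        elif x <= centre - r:    # ternary value -1
            negative += 2 ** i
        # otherwise |x - centre| < r and the ternary value is 0
    return positive, negative


patch = [[9, 5, 1],
         [5, 5, 5],
         [6, 4, 5]]
print(ltp_codes(patch, r=2))
```

In this patch only the top-left pixel exceeds the centre by at least r and only the top-right pixel falls below it by at least r; the small deviations of 6 and 4 from the centre value 5 are absorbed by the threshold, which illustrates the intended robustness in near-uniform regions.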

Active Appearance Model (AAM) The Active Appearance Model (AAM) was proposed in [8] to describe a statistical generative model of face shape and texture/intensity. It is a popular facial descriptor which makes use of Principal Component Analysis (PCA) in a multi-factored way for dimension reduction while maintaining important structure and texture elements of face images. To build an AAM model, a training set of annotated images is required where facial landmark points have been marked on each image. AAMs model shape and appearance separately. The shape model is learnt from the coordinates of the landmark points in annotated training images. Let \(N_{T}\) and \(N_{L}\) denote the total number of training images and the number of landmark points in each training facial image, respectively. Let \(\mathbf{p} = [x_{1},y_{1},x_{2},y_{2},\ldots,x_{N_{L}},y_{N_{L}}]^{T}\) be a vector of size \(2N_{L} \times 1\) denoting the planar coordinates of all landmarks. The shape model is constructed by first aligning the set of \(N_{T}\) training shapes using Generalized Procrustes Analysis and then applying PCA on the aligned shapes to find an orthonormal basis of \(N_{T}\) eigenvectors, \(\mathbf{E}_{s} \in \mathcal{R}^{2N_{L}\times N_{T}}\), and the mean shape \(\overline{\mathbf{p}}\). Then the training images are warped onto the mean shape in order to obtain the appearance model. Let \(N_{A}\) denote the number of pixels that reside inside the mean shape \(\overline{\mathbf{p}}\). For the appearance model, let \(\mathbf{l(x)},\mathbf{x} \in \mathbf{p}\) be a vector of size \(N_{A} \times 1\) denoting the intensity/appearance values of the \(N_{A}\) pixels inside the shape model. The appearance model is trained in a similar way to the shape model to obtain \(N_{T}\) eigenvectors, \(\mathbf{E}_{a} \in \mathcal{R}^{N_{A}\times N_{T}}\), and the mean appearance \(\overline{\mathbf{l}}\).

Once the shape and appearance AAM models have been learnt from the training images, any new instance \((\mathbf{p}^{\ast },\mathbf{l}^{\ast })\) can be synthesized or represented as a linear combination of the eigenvectors weighted by the model parameters as follows:

$$\displaystyle\begin{array}{rcl} \mathbf{p}^{\ast } = \overline{\mathbf{p}} + \mathbf{E}_{s}\mathbf{a},& & \\ \mathbf{l}^{\ast } = \overline{\mathbf{l}} + \mathbf{E}_{a}\mathbf{b},& &{}\end{array}$$
(12)

where a and b are the shape and appearance parameters respectively.
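The synthesis step of Eq. (12), together with the recovery of the parameters of a given instance, can be sketched as follows. The basis is stored as a list of eigenvectors (the columns of \(\mathbf{E}_{s}\) or \(\mathbf{E}_{a}\)). The function names and the toy two-landmark example are our assumptions, and the projection relies on the basis being orthonormal; alignment, PCA training and image warping are not shown.

```python
def synthesize(mean, basis, params):
    """Eq. (12): mean + E @ params, with E given as a list of column eigenvectors."""
    return [
        m + sum(column[i] * weight for column, weight in zip(basis, params))
        for i, m in enumerate(mean)
    ]


def project(instance, mean, basis):
    """Parameters of an instance for an orthonormal basis: E^T (instance - mean)."""
    diff = [value - m for value, m in zip(instance, mean)]
    return [sum(c * d for c, d in zip(column, diff)) for column in basis]


# Toy shape model with two landmarks (x1, y1, x2, y2) and two eigenvectors.
mean_shape = [0.0, 0.0, 1.0, 0.0]
shape_basis = [[1.0, 0.0, 0.0, 0.0],
               [0.0, 0.0, 0.0, 1.0]]
a = [0.5, -2.0]

new_shape = synthesize(mean_shape, shape_basis, a)
print(new_shape, project(new_shape, mean_shape, shape_basis))
```

The same two functions apply unchanged to the appearance model, with the mean appearance and the appearance eigenvectors in place of their shape counterparts.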

3 Applications Incorporating Wrinkles as Texture

Most computer vision applications involving facial aging incorporate wrinkles as aging texture where the specific appearance of the texture is incorporated as image texture features of choice. In this section, we present a review of the research work incorporating aging skin texture as image texture features.

3.1 Synthesis of Facial Aging and Expressions

Synthesis of aged facial images from younger facial images of an individual has several real-world applications, e.g. looking for lost children or wanted fugitives, developing face recognition systems robust to age-related variations, facial retouching in entertainment and, recently, assessing the long-term effects of an individual’s lifestyle in healthcare. Facial aging causes changes in both the geometry of facial muscles and skin texture. The synthesis of facial aging is a challenging problem because it is difficult to synthesize facial changes in geometry and texture which are specific to an individual. Furthermore, the availability of only a limited number of prior images at different ages, mostly low-resolution, for an individual poses an additional challenge.

In the absence of long term (i.e. across 3–4 decades) face aging sequences, Suo et al. [38, 40] made two assumptions. First, similarities exist among short term aging patterns in the same time span, especially for individuals of the same ethnic group and gender. Second, the long term aging pattern is a smooth Markov process composed of a series of short term aging patterns. In their proposed method, AAM features were used to capture and generate facial aging. Guided by face muscle clustering, a face image was divided into 13 sub-regions. An extended version of the AAM model was then used to include a global active shape model and a shape-free texture model for each sub-region. Thus the shape-free texture component of the AAM model described changes in skin texture due to wrinkling (Fig. 4). The principal components of the extended AAM model were also analyzed to separate age-related components from non-age-related components.

Fig. 4 Representation of aging texture in [40]; (a1, a2) depict the shape-free texture in the region around the eye and the corresponding synthesized images, and (b1, b2) depict the same for the forehead region (reproduced from [40])

With a large number of short term face aging sequences from publicly available face aging databases, such as FG-NET and MORPH, Suo et al. used their defined AAM model features to learn short term aging patterns from real aging sequences. A sequence of overlapping short term aging patterns in a later age span was inferred from the overlapping short term aging patterns in the current age span. The short term aging patterns for the later age were then concatenated into a smooth long term aging pattern. The diversity of aging among individuals was simulated by sampling different subsequent short term patterns probabilistically. For example, Fig. 5 shows inherent variations in the aging of a given face using their methods based on AAM and on And-Or graphs (described later). It can be observed that the appearance of a synthesized aged face varies with the increase in age. Figure 6 shows examples of age synthesis for four subjects using their AAM features.

Fig. 5 Inherent variation in different instances of synthesized aged images for the same age (top reproduced from [40], bottom reproduced from [39])

Fig. 6 Simulation of age synthesis in [38]; the leftmost column shows the input images, and the following three columns are synthesized images at later ages (reproduced from [38])

In a different approach to aging synthesis, Suo et al. [37, 39] presented a hierarchical And-Or graph based generative model to synthesize aging. Each age group was represented by a specific And-Or graph, and a face image in this age group was considered to be a traversal of that And-Or graph, called a parse graph. The And-Or graph for each age group consisted of And-nodes, Or-nodes and Leaf-nodes. The And-nodes represented different parts of the face at three levels, from coarse to fine, where wrinkles and skin marks were incorporated at the third, finest level. The Or-nodes represented the alternatives learned from a training dataset to represent the diversity of face appearance in each age group. By selecting alternatives at the Or-nodes, a hierarchical parse graph was obtained for a face instance whose face image could then be synthesized from this parse graph in a generative manner. Based on the And-Or graph representation, the dynamics of the face aging process were modeled as a first-order Markov chain on parse graphs, which was used to learn aging patterns from annotated faces of adjacent age groups.

To incorporate wrinkles in synthesized images, parameters of curves were learned in 6 wrinkle zones from the training dataset. Wrinkle curves were then stochastically generated in two steps to be rendered on synthesized face images: generation of curve shapes from a probabilistic model and calculation of curve intensity profiles from the learned dictionary. After warping the intensity profiles to the shape of wrinkle curves, Poisson image editing was used to synthesize realistic wrinkles on a face image. Figure 7 shows a series of generated wrinkle curves over four age groups on top and an example of generating the wrinkles image from the wrinkle curves on the bottom.

Fig. 7 Generation of wrinkle curves for different age patterns and synthesis of a wrinkle pattern over aged image (reproduced from [39])

Patterson et al. [31] presented a framework for aging synthesis based on a face-model including landmarks for shape, and AAMs for both shape and texture. They learned age-related AAM parameters from a training set annotated with landmarks using support-vector regression (SVR). The learned AAM parameters were used to generate feasible random faces along with their age estimated by SVR. In the final step, these simulated faces were used to generate a table of ‘representative age parameters’ which then manipulated the AAM parameters in the feature space. The manipulated AAM parameters thus obtained were used to age-progress or regress a given face image. Figure 8 shows synthesized aged images vs. original images for a subject using their AAM-SVR face model.

Fig. 8 The top row shows original images of an individual. The bottom row shows synthetic aged images where each image is synthesized at approximately the same age as that in the image above (reproduced from [31])

Ramanathan and Chellappa [33] proposed a shape variation model and a texture variation model towards modeling of facial aging in adults. Attributing facial shape variations during adulthood to the changing elastic properties of the underlying facial muscles, the shape variation model was formulated by means of physical models that characterized the functionalities of different facial muscles. Facial feature drifts were modeled as linear combinations of the drifts observed on individual facial muscles. The aging texture variation model was designed specifically to characterize facial wrinkles in predesignated facial regions such as the forehead, nasolabial region, etc. To synthesize aging texture, they proposed a texture variation model by means of image gradient transformation functions. The transformation functions for a specific age gap and wrinkle severity class (subtle/moderate/strong) were learnt from the training set. Given a test image, the transformed image according to an age group and wrinkles severity was then obtained by solving the Poisson equation of image reconstruction from gradient fields. Figure 9 illustrates the process of transforming facial appearances with increase in age in their work.

Fig. 9 Facial shape variations induced for the cases of weight-gain/loss in [33]. Further, the effects of gradient transformations in inducing textural variations using Poisson image editing are illustrated as well (reproduced from [33])

Fu and Zheng introduced a novel framework for appearance-based photorealistic facial modeling called Merging Face (M-Face) [13]. They introduced ‘merging ratio images’, which were defined as the seamless blending of individual expression ratio images, aging ratio images, and illumination ratio images. Thus the aging skin texture was also represented as a ratio image. The caricatured shape was derived from the average face by exaggerating the individual distinctiveness of the subject, while the texture ratio image was rendered during the caricaturing. This way, expression morphing, chronological aging or rejuvenating, and illumination variance could be merged seamlessly in a photorealistic way on desired view-rotated faces yielded by view morphing. Figure 10 shows an example image and the corresponding rendered images for different facial attributes in their work.

Fig. 10 An example image with photorealistically rendered images for different attributes (reproduced from [13])

As regards aging in the M-Face framework, the ‘age space’, including both shape and aging ratio images (ARI), was assumed to be a low-dimensional manifold of the image space where the origin of the manifold represented the shape and texture of the average face of a young face set. Each point in the manifold denoted a specific image with distinctive shape and ARI features. Facial attributes of a given image lay on this manifold, at some point P. Points at a farther distance from the origin than that of the original image represented aging and those closer to the origin represented rejuvenating. Different aged and rejuvenated faces were rendered by using features belonging to the points on this manifold along the line from the origin to the point P (Fig. 11).

Fig. 11 (a) Age space for aging and rejuvenating. The origin is the average face of a young face set. (b) Rejuvenation of an adult male face. (c) Original face image. (d) Aging of the face (reproduced from [13])

Following a similar approach based on ratio images, Liu et al. [23] presented a framework to map subtle changes in illumination and appearance corresponding to facial creases and wrinkles in the context of facial expressions instead of facial aging. Their work was an attempt to complement traditional expression mapping techniques which focused mostly on the analysis of facial feature motions and ignored details in illumination changes due to expression wrinkles/creases. In a generative framework, they proposed ‘expression ratio images (ERI)’ which captured illumination changes of a person’s expressions as we describe next. Under the Lambertian model, ERI is defined in terms of the changes in the illumination of skin surface due to the skin folds. For any point P on a surface, let \(\mathbf{n}\) denote its normal and assume m point light sources. Let \(\mathbf{l}_{i}\), \(1 \leq i \leq m\), denote the light direction from point P to the ith light source, and \(I_{i}\) its intensity. Assuming a diffuse surface, let \(\rho\) be its reflectance coefficient at P. Under the Lambertian model, the intensity at P is:

$$\displaystyle{ I_{P} =\rho \sum _{ i=1}^{m}I_{i}\,\mathbf{n} \cdot \mathbf{l}_{ i}. }$$
(13)

With the deformation of skin due to wrinkles, the surface normals and light intensity change. Consequently, the new intensity value at P is calculated as:

$$\displaystyle{ I'_{P} =\rho \sum _{ i=1}^{m}I'_{i}\,\mathbf{n'} \cdot \mathbf{l}_{ i}. }$$
(14)

The ratio image, ERI, is defined to be the ratio of the two images:

$$\displaystyle{ ERI = \frac{I'_{P}} {I_{P}}. }$$
(15)

The ERIs obtained in this way, corresponding to one person’s expression, were mapped to another person’s face image along with geometric warping to generate similar, and sometimes more ‘expressive’, expressions. Figure 12 depicts an example of synthesis of more expressive faces using this method.
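A minimal sketch of the ratio-image idea is given below for two pixel-aligned gray level images stored as nested lists. We assume that the mapping onto another face is a pixelwise multiplication of the target image by the ERI; the geometric warping step is omitted, and the function names and the small constant guarding against division by zero are our own additions.

```python
def expression_ratio_image(neutral, expressive, eps=1e-6):
    """Pixelwise ratio of Eq. (15) between an expressive and a neutral image."""
    # eps is an assumed guard against division by zero in dark pixels.
    return [
        [e / max(n, eps) for n, e in zip(neutral_row, expressive_row)]
        for neutral_row, expressive_row in zip(neutral, expressive)
    ]


def apply_eri(target, eri):
    """Transfer the illumination change to an aligned target image (assumed multiplicative)."""
    return [
        [t * r for t, r in zip(target_row, eri_row)]
        for target_row, eri_row in zip(target, eri)
    ]


neutral = [[100.0, 200.0]]
expressive = [[50.0, 200.0]]   # the first pixel darkens inside a crease
eri = expression_ratio_image(neutral, expressive)
print(eri, apply_eri([[80.0, 120.0]], eri))
```

Because the reflectance coefficient \(\rho\) appears in both the numerator and the denominator of the ratio, it cancels out, which is why a ratio computed on one person’s skin can plausibly be reused on another person’s face.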

Fig. 12 An expression used to map to other subjects’ facial images. (a) Neutral face. (b) Result from ERI and geometric warping. (c) The ERI used in (b) and obtained from another person’s face (wrinkles due to expressions are prominent; reproduced from [23])

3.2 Age Estimation

Shape changes are prominent in facial aging during younger years, while wrinkles and other textural pattern variations are more prominent during older years. Hence, age estimation methods try to learn patterns in both shape and textural variations using appropriate image features for specific age intervals and then infer the age of a test face image using the learned classifiers. Some of the popular image features to learn age-related changes have been Gabor features, AAM features, LBP features, LTP features or a combination of them.

Luu et al. [24] proposed an age estimation technique combining holistic and local features where AAM features were used as holistic features and local features were extracted using LTP features. These combined features from training set were then used to train age classifiers based on PCA and Support Vector Machines (SVM). The classifiers were then used to classify faces into one of two age groups—pre-adult (youth) and adult.

Chen et al. [7] conducted thorough experiments on facial age estimation using 39 possible combinations of four feature normalization methods, two simple feature fusion methods, two feature selection methods, and three face representation methods (Gabor, AAM and LBP features). LBP encoded the local texture appearance, while the Gabor features encoded facial shape and appearance information over a range of coarser scales. They systematically compared single feature types against all possible fusion combinations: AAM and LBP, AAM and Gabor, and LBP and Gabor. Feature fusion was performed using feature selection schemes such as Least Angle Regression (LAR) and sequential selection. They concluded that Gabor features outperformed LBP and even AAM as a single feature type. Furthermore, fusing a local feature (Gabor or LBP) with the global AAM feature achieved better accuracy than either feature type alone.
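To make the notion of an LBP texture feature concrete, the following is a minimal sketch of the basic 8-neighbour LBP operator and the histogram that would typically be passed to a classifier. The bit ordering, the function names and the use of nested lists for images are assumptions made for illustration; the exact LBP and LTP variants used in [7, 24] are not reproduced here.

```python
def lbp_code(image, r, c):
    """Basic 8-neighbour LBP code of pixel (r, c).

    Each neighbour whose intensity is >= the centre intensity sets one bit.
    """
    center = image[r][c]
    # Clockwise neighbour offsets, starting at the top-left pixel (an arbitrary convention).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Normalized 256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    rows, cols = len(image), len(image[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            hist[lbp_code(image, r, c)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# A patch whose top row is brighter than the centre sets only the first three bits.
edge_patch = [[9, 9, 9], [1, 5, 1], [0, 0, 0]]
flat_patch = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
ramp_image = [[r + c for c in range(4)] for r in range(4)]
```

The histogram discards pixel positions, which is why such descriptors are usually computed per facial region and concatenated before classification.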

3.3 Facial Retouching/Inpainting

Facial retouching is widely used in the media and entertainment industry and consists of changing facial features, such as removing imperfections, enhancing skin fairness, skin tanning, applying make-up, etc. A few attempts that detect and manipulate facial wrinkles and other marks for such retouching applications are described here.

Mukaida and Ando emphasized the importance of wrinkles and spots for understanding and synthesizing facial images at different ages [26]. They proposed a method based on local analysis of shape properties and pixel distributions for extracting wrinkles and spots, and demonstrated that the extracted wrinkles and spots could be manipulated in facial images to alter the visual perception of age. Morphological processing of the luminance channel was used to divide the resulting binary images into regions of wrinkles and dark spots. The extracted regions were then used to increase or decrease the luminance of the source facial image, thus giving an impression of aging or rejuvenating. Figure 13 shows an example of a facial image and the extracted binary template, which is then used to manipulate the original facial image.

Fig. 13

Manipulation of facial skin marks. (a) Original image. (b) Binary image. (c) Strengthening. (d) Weakening (reproduced from [26])

Batool and Chellappa [3] presented an approach for facial retouching applications based on the semi-supervised detection and inpainting of facial wrinkles and imperfections due to moles, brown spots, acne and scars. In their work, the detection of wrinkles/imperfections allowed those skin features to be processed differently from the surrounding skin without much user interaction. Hence, the algorithm produced better visual results of skin imperfection removal than contemporary algorithms. For detection, Gabor filter responses along with the texture orientation field were used as image features. A bimodal Gaussian mixture model (GMM) represented the distributions of Gabor features of normal skin vs. skin imperfections. A Markov random field (MRF) model was then used to incorporate spatial relationships among neighboring pixels for their GMM distributions and texture orientations. An Expectation-Maximization (EM) algorithm was used to classify skin vs. skin wrinkles/imperfections. Once detected automatically, wrinkles/imperfections were removed completely instead of being blended or blurred. For inpainting, they proposed extensions to existing exemplar-based constrained texture synthesis algorithms to inpaint the irregularly shaped gaps left by the removal of detected wrinkles/imperfections. Figures 14, 15 and 16 show some results of detection and removal of wrinkles and other imperfections using their algorithms.
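The role of the bimodal GMM and the EM algorithm can be illustrated with a deliberately simplified sketch: a two-component mixture fitted to scalar filter responses. This is an assumption-laden toy version, since the features in [3] are multi-dimensional and the MRF spatial prior is omitted entirely; the function names, the initialization and the variance floor `min_var` are illustrative choices.

```python
import math


def gaussian_pdf(v, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(v - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_two_gaussians(values, iterations=50, min_var=1e-6):
    """EM for a 1-D mixture of two Gaussians; returns (weights, means, variances)."""
    lo, hi = min(values), max(values)
    means = [lo, hi]  # start the two components at the extremes of the data
    spread = max(((hi - lo) / 4.0) ** 2, min_var)
    variances = [spread, spread]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: posterior probability of each component for every sample.
        resp = []
        for v in values:
            p = [w * gaussian_pdf(v, m, s) for w, m, s in zip(weights, means, variances)]
            total = sum(p) or 1e-300  # avoid division by zero on underflow
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weight, mean and variance of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # leave an empty component unchanged
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var = sum(r[k] * (v - means[k]) ** 2 for r, v in zip(resp, values)) / nk
            variances[k] = max(var, min_var)
    return weights, means, variances


# Toy filter responses: five low values (smooth skin) and three high ones (wrinkle pixels).
responses = [0.10, 0.12, 0.09, 0.11, 0.10, 0.90, 0.88, 0.92]
weights, means, variances = fit_two_gaussians(responses)
```

In this toy setting each pixel would simply be assigned to the component with the larger posterior; an MRF prior would additionally encourage neighboring pixels to share labels.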

Fig. 14

(Left) Wrinkle removal. (a) Original image. (b) Wrinkled areas detected by GMM-MRF. (c) Inpainted image with wrinkles removed. (d) Patches from regular grid fitted on the gap which were included in texture synthesis. (Right) (a) Original image. (b) Wrinkled areas detected by GMM-MRF. (c) Inpainted image with wrinkles removed; note that wrinkle ‘A’ has been removed since it was included in the gap whereas a part of wrinkle ‘B’ is not removed. (d) Stitching of skin patches to fill the gap (reproduced from [3])

Fig. 15

Results of wrinkle detection and removal for a subject. (a) Original image. (b) Detected wrinkled areas. (c) Image after wrinkle removal (reproduced from [3])

Fig. 16

Results of detection and removal of skin imperfections including wound scars, acne, brown spots and moles. (a) Original images. (b) Detected imperfections. (c) Images after inpainting (reproduced from [3])

4 Applications Incorporating Wrinkles as Curvilinear Objects

In this section, we present applications incorporating facial wrinkles as curvilinear objects instead of image texture features. Curvilinear objects are detected or hand-drawn in images and then analyzed for the specific application. We first describe work aimed at accurate localization of wrinkles in images.

4.1 Detection/Localization of Facial Wrinkles

Localization techniques can be grouped into two categories: stochastic and deterministic modeling techniques. The marked point process (MPP) has been the main stochastic model of choice, while deterministic techniques include modeling of wrinkles as deformable curves (snakelets) and image morphology.

Localization Using Stochastic Modeling

Batool and Chellappa [1, 2] were the first to propose a generative stochastic model for wrinkles using Marked Point Processes (MPP). In their model, wrinkles were considered as stochastic spatial arrangements of sequences of line segments and were detected in an image by proper placement of line segments. Under a Bayesian framework, a prior probability model dictated the more probable geometric properties and spatial interactions of line segments. A data likelihood term, based on intensity gradients caused by wrinkles and highlighted by Laplacian of Gaussian (LoG) filter responses, indicated the more probable locations for the line segments. Wrinkles were localized by sampling the MPP posterior probability using the Reversible Jump Markov Chain Monte Carlo (RJMCMC) algorithm. They proposed two MPP models [1, 2], of which the latter produced better localization results by introducing changes to the movements in the RJMCMC algorithm and to the data likelihood term. In [2], they also presented an evaluation setup to quantitatively measure the performance of the proposed model in terms of detection and false alarm rates. They demonstrated localization results on a variety of images obtained from the Internet. Figures 17 and 18 show examples of wrinkle localization from the two MPP models in [1, 2], respectively.
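As a reminder of what the LoG filter in the data likelihood term responds to, the standard isotropic LoG kernel can be written as follows. The scale parameter sigma is a free symbol introduced here for illustration and is not a value taken from [1, 2]:

$$\displaystyle{ \nabla ^{2}G_{\sigma }(x,y) = \frac{x^{2} + y^{2} - 2\sigma ^{2}} {\sigma ^{4}} \,G_{\sigma }(x,y),\qquad G_{\sigma }(x,y) = \frac{1} {2\pi \sigma ^{2}}\exp \left (-\frac{x^{2} + y^{2}} {2\sigma ^{2}} \right ). }$$

The kernel depends on x and y only through their squared sum, so it is rotationally symmetric: it responds to an intensity valley of matching width regardless of the valley's orientation and therefore carries no directional information.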

Fig. 17

Localization of wrinkles in three FG-NET images using MPP model in [1]. (Top) Ground Truth. (Bottom) Localization results (reproduced from [1])

Fig. 18

Localization of wrinkles as line segments for eight images of two subjects (reproduced from [2])

The Laplacian of Gaussian filter used by Batool and Chellappa [1, 2] could not measure directional information, and the solution strongly depended on the initial condition determined by the placement of the first few line segments. To address these shortcomings, Jeong et al. [17] proposed a different MPP model. To incorporate directional information, they employed steerable filters at several orientations and used second derivatives of Gaussian functions as the basis filters to extract linear structures caused by facial wrinkles. Compared with the RJMCMC algorithm used by Batool and Chellappa [1, 2], their RJMCMC algorithm included two extensions: affine movements of line segments in addition to birth and deletion, and ‘delayed’ rejection/deletion of line segments. Figure 19 shows a comparison of localization results using the MPP models of Jeong et al. [17] and Batool and Chellappa [1]. However, they reported results on fewer test images than those in [1, 2].

Fig. 19

Localization of wrinkles using Jeong et al.’s MPP model [17] vs. Batool and Chellappa’s MPP model [1] (reproduced from [17]). (a) Input. (b) Manually labelled. (c) Batool and Chellappa’s MPP model. (d) Jeong et al.’s MPP model

Several parameters are required in an MPP model to interpret the spatial distribution of curvilinear objects, i.e., modeling parameters for the geometric shape of objects and hyper-parameters to weigh the data likelihood and prior energy terms. To bypass the computationally demanding estimation of such a large number of parameters, Jeong et al. presented, in further work, a generic MPP framework to localize curvilinear objects, including wrinkles, in images [18]. They introduced a novel two-step optimization technique that avoids the selection of hyper-parameters. In the first step, an RJMCMC sampler with delayed rejection [17] was employed to collect several line configurations with different hyper-parameter values. In the second step, the consensus among line detection results was maximized by combining the whole set of line candidates to reconstruct the most plausible curvilinear structures. Figure 20 shows an example of combining linear structures using different hyper-parameter values for a DNA image. Figure 21 shows localization results for a wrinkle image using different initial conditions in the RJMCMC algorithm; their optimization scheme thus made the RJMCMC algorithm almost independent of the initial conditions.

Fig. 20

Localization of a DNA strand using different hyperparameter values in [18]. (a) Original image. (b) Gradient magnitude. (c) Mathematical morphology operator, path opening. (d)–(f) Line configurations associated with different hyperparameter vectors. (g) Final composition result (reproduced from [18])

Fig. 21

Localization of wrinkles using different initial conditions; every image row represents a different initial condition (reproduced from [18])

Localization Using Deterministic Modeling

The MPP model, despite its promising localization results, requires a large number of iterations in the RJMCMC algorithm to reach the global minimum, resulting in considerable computation time. To avoid such long computation times for larger images, Batool and Chellappa [4] proposed a deterministic approach based on image morphology for fast localization of facial wrinkles. They used image features based on a Gabor filter bank to highlight subtle curvilinear discontinuities in skin texture caused by wrinkles. Image morphology was used to incorporate geometric constraints to localize curvilinear shapes at image sites of large Gabor filter responses. In this work, they reported experiments on a much larger set of high resolution images. The localization results showed that the proposed deterministic algorithm was not only significantly faster than MPP modeling but also provided visually better results.
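The idea of highlighting curvilinear discontinuities with a Gabor filter bank can be sketched as follows. The sketch assumes the standard real (cosine) Gabor kernel and evaluates the bank at a single pixel; all parameter values (`size`, `sigma`, `wavelength`, `gamma`, `n_orient`) and function names are illustrative assumptions rather than settings from [4], and the subsequent morphological step is not shown.

```python
import math


def gabor_kernel(size, sigma, theta, wavelength, gamma=0.5):
    """Real (cosine) Gabor kernel; theta is the direction of the sinusoidal carrier."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so that xr runs along the carrier direction.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            row.append(envelope * math.cos(2 * math.pi * xr / wavelength))
        kernel.append(row)
    # Remove the mean so that a uniform skin patch gives zero response.
    mean = sum(map(sum, kernel)) / (size * size)
    return [[v - mean for v in row] for row in kernel]


def strongest_orientation(image, r, c, n_orient=4, size=7, sigma=2.0, wavelength=6.0):
    """Return (|response|, theta) of the best-matching filter in the bank at pixel (r, c)."""
    half = size // 2
    best = (-1.0, None)
    for k in range(n_orient):
        theta = k * math.pi / n_orient
        kernel = gabor_kernel(size, sigma, theta, wavelength)
        response = sum(
            kernel[dy + half][dx + half] * image[r + dy][c + dx]
            for dy in range(-half, half + 1)
            for dx in range(-half, half + 1)
        )
        best = max(best, (abs(response), theta))
    return best


# Toy patch: bright skin with a one-pixel-wide dark vertical line in the middle.
patch = [[0.0 if c == 3 else 1.0 for c in range(7)] for _ in range(7)]
strength, theta = strongest_orientation(patch, 3, 3)
```

For the vertical line in the toy patch, the filter whose carrier runs horizontally (theta equal to zero) gives the largest response, since its elongated envelope lies along the line.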

Figure 22 includes some examples of localization with high detection rates using their deterministic algorithm, and Fig. 23 presents a comparison of localization results between their proposed MPP modeling [2] and the deterministic algorithm [4].

Fig. 22

A few examples of images with detection rate greater than 70 %. (Left) Original. (Middle) Hand-drawn. (Right) Automatically localized (reproduced from [4])

Fig. 23

Comparison of localization results using MPP modeling (top row) and deterministic algorithm proposed by Batool and Chellappa (bottom row) (reproduced from [4])

For the localization of wrinkles, Ng et al. assumed facial wrinkles to be ridge-like features instead of edges [27]. They introduced a measure of ridge-likeliness obtained on the basis of all eigenvalues of the Hessian matrix (Sect. 2). The eigenvalues of the Hessian matrix were analyzed at different scales to locate ridge-like features in images. A few post-processing steps followed by a curve fitting step were then used to place wrinkle curves at image sites of ridge-like features. Figure 24 presents an example of wrinkle localization. Although their localization results were compared with earlier methods, no comparison with MPP modeling [1, 2] was reported.
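The link between Hessian eigenvalues and ridge-like structures can be illustrated with a small sketch. It assumes a single scale with plain finite differences instead of Gaussian derivatives, and it only returns the eigenvalues; the actual ridge-likeliness measure of [27] is not reproduced. The function name and the nested-list image format are illustrative assumptions.

```python
import math


def hessian_eigenvalues(image, r, c):
    """Eigenvalues of the 2x2 intensity Hessian at pixel (r, c), sorted by magnitude.

    Second derivatives are approximated by central finite differences
    (x runs along columns, y along rows).
    """
    ixx = image[r][c + 1] - 2 * image[r][c] + image[r][c - 1]
    iyy = image[r + 1][c] - 2 * image[r][c] + image[r - 1][c]
    ixy = (image[r + 1][c + 1] - image[r + 1][c - 1]
           - image[r - 1][c + 1] + image[r - 1][c - 1]) / 4.0
    # Closed-form eigenvalues of the symmetric matrix [[ixx, ixy], [ixy, iyy]].
    mean = (ixx + iyy) / 2.0
    root = math.sqrt(((ixx - iyy) / 2.0) ** 2 + ixy ** 2)
    return sorted((mean + root, mean - root), key=abs)


# Toy image: bright background with a dark vertical line in column 2.
line_image = [[0.0 if c == 2 else 1.0 for c in range(5)] for _ in range(5)]
small, large = hessian_eigenvalues(line_image, 2, 2)
```

On the dark line, one eigenvalue is large (strong curvature across the line) while the other is close to zero (no change along the line); this contrast is what ridge measures built on the Hessian typically exploit.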

Fig. 24

Automatic detection of coarse wrinkles. (a) Original image. (b), (c) and (d) show wrinkle detection by two other methods and by Ng et al.’s method, respectively. Red: ground truth, green: true positive, blue: false positive (reproduced from [27])

4.2 Age Estimation Using Localized Wrinkles

One of the earliest efforts in age estimation from digital face images, which also used detection of facial wrinkles as curvilinear features, was reported by Kwon and da Vitoria Lobo [21, 22]. They used 47 high resolution facial images for classification into one of three age groups: babies, young adults or senior adults. Their approach was based on geometric ratios of so-called primary face features (eyes, nose, mouth, chin, virtual-top of the head and the sides of the face) based on cranio-facial development theory and wrinkle analysis. In secondary feature analysis, a wrinkle geography map was used to guide the detection and measurement of wrinkles. A wrinkle index was defined based on detected wrinkles, which was sufficient to distinguish seniors/aged adults from young adults and babies. A combination rule for the face ratios and the wrinkle index allowed the categorization of a face into one of the above-mentioned three classes.

In the first step of their 2-step wrinkle detection algorithm, snakelets were dropped in random orientations in the input image in user-provided regions of potential wrinkles around the eyes and forehead. The snakelets were directed according to the directional derivatives of image intensity taken orthogonal to the snakelet curves. The snakelets that had found shallow image intensity valleys were eliminated based on the assumption that only the deep intensity valleys corresponded to narrow and deep wrinkles. In the second step, a spatial analysis of the orientations of the stabilized snakelets distinguished wrinkle snakelets from non-wrinkle snakelets. Figure 25a1, b1 shows the stabilized snakelets on an aged adult face and a young adult face, respectively. It can be seen in Fig. 25a2, b2 that a large number of stabilized snakelets correspond to wrinkles in an aged face. Figure 26 shows two examples of final results of detection of wrinkles from initial random snakelets.

Fig. 25

(a1, b1) Stabilized snakelets. (a2, b2) Snakelets passing the spatial orientation test and corresponding to wrinkles (reproduced from [22])

Fig. 26

Examples of detection of wrinkles using snakelets. (Top) Initial randomly distributed snakelets. (Bottom) Snakelets representing detected wrinkles (reproduced from [22])

4.3 Localized Wrinkles as Soft Biometrics

Recently, due to the availability of high resolution images, a new area of research in face recognition has focused on the analysis of facial marks such as scars, freckles, moles, facial shape, skin color, etc. as biometric traits. For example, facial freckles, moles and scars were used in conjunction with a commercial face recognition system for face recognition under occlusion and pose variation in [16, 30]. Another interesting application presented in [20, 32] was the recognition between identical twins using proximity analysis of manually annotated facial marks along with other typical facial features. While the uniqueness of the location of facial marks is obvious, the uniqueness of wrinkles is less so. Batool and Chellappa [5] investigated the use of a group of hand-drawn or automatically detected wrinkle curves as soft biometrics. First, they presented an algorithm to fit curves to automatically detected wrinkles, which had been localized as line segments using MPP modeling in their previous work. Figure 27 includes an example of curves fitted to the detected line segments using their algorithm.

Fig. 27

Fitting of curves to detected wrinkles as line segments using MPP modeling (reproduced from [5])

They then used the hand-drawn and automatically detected wrinkle curves on subjects’ foreheads as curve patterns. Identification of subjects was based on how closely the wrinkle curve patterns of those subjects matched. The matching of curve patterns was achieved in three steps. First, possible correspondences were determined between curves from two different patterns using a simple bipartite graph matching algorithm. Second, several metrics were introduced to quantify the similarity between two curve patterns. The metrics were based on the Hausdorff distance and the determined curve-to-curve correspondences. Third, the nearest neighbor algorithm was used to rate curve patterns in the gallery in terms of similarity to the probe pattern using their defined metrics. The recognition rate in their experiments was reported to exceed 65 % at rank 1 and 90 % at rank 4 using matching of curve patterns only.
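The Hausdorff-distance component of such a matching scheme, followed by nearest-neighbor ranking, can be sketched as below. This is a simplification under stated assumptions: each pattern is treated as a single set of sampled 2D points, the bipartite curve-to-curve correspondence step is omitted, and the specific metrics of [5] are not reproduced. The function names and the toy gallery are hypothetical.

```python
import math


def directed_hausdorff(points_a, points_b):
    """Largest distance from a point in A to its nearest point in B."""
    return max(min(math.dist(p, q) for q in points_b) for p in points_a)


def hausdorff(points_a, points_b):
    """Symmetric Hausdorff distance between two point sets."""
    return max(directed_hausdorff(points_a, points_b),
               directed_hausdorff(points_b, points_a))


def rank_gallery(probe, gallery):
    """Gallery identities sorted from most to least similar to the probe pattern."""
    return sorted(gallery, key=lambda name: hausdorff(probe, gallery[name]))


# Toy forehead patterns given as sampled curve points (x, y).
probe = [(0, 0), (1, 0), (2, 0)]
gallery = {
    "subject_A": [(0, 0.1), (1, 0.1), (2, 0.1)],  # nearly the same wrinkle
    "subject_B": [(0, 1), (1, 2), (2, 1)],        # a differently shaped wrinkle
}
ranking = rank_gallery(probe, gallery)
```

The first entry of the ranking corresponds to a rank-1 match; checking whether the true identity appears among the first k entries yields rank-k recognition rates.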

4.4 Applications in Skin Research

Cula et al. [9, 10] proposed digital imaging as a non-invasive, less expensive tool for assessing the degree of facial wrinkling to establish an objective baseline and for assessing the benefits to facial appearance of various dermatological treatments. They used finely tuned oriented Gabor filters at specific frequencies and adaptive thresholding for localization of wrinkles in forehead images acquired in controlled settings. They introduced a wrinkle measure, referred to as the wrinkle index, defined as the product of wrinkle depth and wrinkle length, to score the severity of wrinkling. The wrinkle index was calculated from Gabor responses and the length of localized wrinkles. The calculated wrinkle indices were then validated using 100 clinically graded facial images. Figure 28 shows examples of localization of wrinkles with different severity in images acquired in controlled settings, along with a plot of clinical vs. computer generated scores given in their work.

Fig. 28

(Left) Localization of wrinkles with varying severity. (Right) Plot of clinical scores vs. computer generated scores for 100 images (reproduced from [10])

Jiang et al. [19] also proposed an image-based method named ‘SWIRL’, based on different geometric characteristics of localized wrinkles, to score the severity of wrinkles. However, they used a proprietary software tool to localize wrinkles in images, which were taken in controlled lighting settings. The goal was to quantitatively assess the effectiveness of dermatological/cosmetic products and procedures on wrinkles. In their controlled illumination settings, the so-called raking light optical profilometry, lighting was cast at a scant angle to the face of the subject, casting wrinkles as dark shadows. The resulting high-resolution digital images were analyzed for the length, width, area, and relative depth of automatically localized wrinkles. The parameters were shown to correlate well with clinical grading scores. Furthermore, the proposed assessment method was also sensitive enough to detect improvement in facial wrinkles after 8 weeks of product application. Figure 29 shows a few images from different facial regions with wrinkles localized using the proprietary software tool used in their work.

Fig. 29

Localization of wrinkles in different facial regions using the proprietary software tool used in [19] (reproduced from [19])

4.5 Facial Expression Analysis

Conventional methods for the analysis of facial expressions are usually based on the Facial Action Coding System (FACS), in which a facial expression is specified in terms of Action Units (AU). Each AU is based on the actions of a single muscle or a cluster of muscles. On the other hand, little investigation has been conducted on wrinkle texture analysis for facial expression recognition. In this section we present research work in expression analysis which incorporates facial wrinkles. Facial wrinkles deepen, change or appear due to expressions and can be an important clue to recognizing expressions. Hence, the following approaches treat changes in facial wrinkles due to expressions as transient or temporary facial features.

Zang and Ji [45] presented a 3-layer probabilistic Bayesian Network (BN) to classify expressions from videos in terms of probability. The BN model consisted of three primary layers: a classification layer, a facial AU layer and a sensory information layer. Transient features, e.g., wrinkles and folds, were part of the sensory information and were modeled in the sensory information layer together with other visual information variables such as brows, lips, lip corners, eyelids, cheeks, chin and mouth. The static BN model for static images was then extended to a dynamic BN to express temporal dependencies in image sequences by interconnecting time slices of static BNs using Hidden Markov modeling. In their work, the presence of furrows and wrinkles was determined by edge feature analysis in the areas where transient features appear, i.e., on the forehead, on the nose bed/between the eyes and around the mouth (nasolabial area). Figure 30 shows examples of transient feature detection in three regions. The contraction and extension of facial muscles due to expressions result in wrinkles/folds with particular shapes, which are detected by edge detectors. The shape of wrinkles was approximated by fitting quadratic forms passing through a set of detected edge points in a least-squares sense. The coefficients of the quadratic forms then signified the curvature of the folds and indicated the presence of particular facial AUs.
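The least-squares fitting of a quadratic to detected edge points can be sketched as follows. The parametrization y = a x^2 + b x + c and the function name are assumptions made for illustration; [45] may use a different quadratic form. The fit requires at least three distinct x values.

```python
def fit_quadratic(points):
    """Least-squares fit of y = a*x**2 + b*x + c to (x, y) points; returns (a, b, c)."""
    # Power sums needed by the normal equations (A^T A) w = A^T y,
    # where each row of A is [x^2, x, 1].
    s = [sum(x ** k for x, _ in points) for k in range(5)]      # s[k] = sum of x^k
    t = [sum(y * x ** k for x, y in points) for k in range(3)]  # t[k] = sum of y * x^k
    m = [
        [s[4], s[3], s[2], t[2]],
        [s[3], s[2], s[1], t[1]],
        [s[2], s[1], s[0], t[0]],
    ]
    # Gauss-Jordan elimination with partial pivoting on the augmented matrix.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda row: abs(m[row][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(3):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [v - factor * p for v, p in zip(m[row], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


# Edge points sampled from an upward-opening fold, y = 0.5 x^2 - x + 2.
edge_points = [(-2, 6.0), (-1, 3.5), (0, 2.0), (1, 1.5), (2, 2.0)]
a, b, c = fit_quadratic(edge_points)
```

In this parametrization the sign and magnitude of the leading coefficient `a` describe how strongly and in which direction the fold bends, which is the kind of information a curvature-based cue would use.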

Fig. 30

Examples of detection of transient wrinkles during expressions in different facial regions in [45]

Tian et al. [42] proposed a system to analyze facial expressions incorporating facial wrinkles/furrows in addition to the commonly studied facial features of mouth, eyes and brows. Facial wrinkles/furrows appearing or deepening during a facial expression were termed ‘transient’ features and were detected in three pre-defined regions of a face, namely around the eyes, at the nasal root/bed and around the mouth. The Canny edge detector was used to analyze frames of a video to determine if wrinkles appeared or deepened in later frames of the video. The presence/absence of wrinkles in the three facial regions of interest, as well as the orientations of the detected wrinkles, were incorporated as an indication of the presence of specific AUs in their expression analysis system. Figure 31 shows three examples of the detection of the orientation of wrinkles around the mouth for a certain expression.

Fig. 31

(a) Three pre-defined areas of interest for detection of transient features (wrinkles/furrows). (b) Detection of orientation of expressive wrinkles. (c) Example of detection of wrinkles around eyes (reproduced from [42])

Yin et al. [44] explored changing facial wrinkle textures exclusively in videos for recognizing facial expressions. They assumed that facial texture consisted of static and active parts, where the active part of the texture changed with an expression due to muscle movements. Hence, they presented a method based on the extraction of the active part of the texture and its analysis for expression recognition, where the wrinkle textures were analyzed in four regions of the face as shown in Fig. 32a. In their method, the correlation between the wrinkle textures in the neutral expression and the active expression was determined using Gaussian blurring. The two textures were correlated several times as they gradually lost detail due to blurring. The rate of change of the correlation values reflected the dissimilarity of the two textures in the four facial regions of interest and was used as a cue for determining the six universal expressions.
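The idea of correlating two textures repeatedly while they are progressively blurred can be sketched as follows. The sketch makes several assumptions that are not taken from [44]: a separable [1, 2, 1] / 4 binomial filter stands in for Gaussian blurring, Pearson correlation is used as the similarity measure, and the function names and the number of blur levels are illustrative.

```python
import math


def blur(image):
    """One pass of a separable [1, 2, 1] / 4 filter (a small Gaussian approximation)."""
    def smooth(line):
        n = len(line)
        # Border samples are replicated so that the output keeps the input size.
        return [(line[max(i - 1, 0)] + 2 * line[i] + line[min(i + 1, n - 1)]) / 4.0
                for i in range(n)]
    rows = [smooth(row) for row in image]
    cols = [smooth(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def correlation(a, b):
    """Pearson correlation between two equally sized images."""
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    norm = math.sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return cov / norm if norm > 0 else 0.0


def correlation_profile(neutral, expressive, levels=3):
    """Correlation of the two textures at blur levels 0, 1, ..., levels."""
    values = []
    for _ in range(levels + 1):
        values.append(correlation(neutral, expressive))
        neutral, expressive = blur(neutral), blur(expressive)
    return values


# Toy textures: a smooth intensity ramp, and the same ramp with a dark line in column 2.
neutral_tex = [[float(r + c) for c in range(5)] for r in range(5)]
active_tex = [[v - 3.0 if c == 2 else v for c, v in enumerate(row)] for row in neutral_tex]
profile = correlation_profile(neutral_tex, active_tex)
# Differences between successive blur levels approximate the rate of change.
rates = [later - earlier for earlier, later in zip(profile, profile[1:])]
```

Fine wrinkle detail disappears first under blurring, so how quickly the correlation changes across levels indicates how much of the difference between the two textures is carried by such detail.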

Fig. 32

Example of wrinkle textures extracted from two expressions (smile and surprise). (a) Facial regions of interest. (b, c) Example of textures extracted from smile and surprise expressions. (d) Normalized textures (reproduced from [44])

5 Summary and Future Work

In this chapter, we presented a review of the research in computer vision focusing on the analysis of facial wrinkles as image texture or curvilinear objects, along with several applications. Facial wrinkles are important features in terms of facial aging/expressions and can be a cue to several aspects of a person’s identity and lifestyle. Image-based analysis of facial wrinkles can improve existing algorithms for facial analysis as well as pave the way for new applications. For example, patterns of personalized aging can be deduced from the spatio-temporal analysis of changes in facial wrinkles. A person’s smoking habits, facial expression and sun exposure history can be inferred from the severity of wrinkling. The specific patterns of wrinkles appearing on different facial regions can be added to facial soft biometrics or to the analysis of facial expressions. Furthermore, analysis of subtle changes in facial wrinkles can quantify the effects of different dermatological treatments. However, the first step in any of these applications would be the accurate and fast localization of facial wrinkles in high resolution images.