
1 Introduction

Beauty is an abstract concept that has attracted increasing attention in recent years. It is generally easy for people to pick the good portraits of a subject, but hard to be sure the choice is right. Which portrait is more attractive? Will other people agree with me? Many factors, such as facial expression, makeup, lighting, and resolution, contribute to why a face looks beautiful, and individual tastes differ, which means our perception of ourselves is often quite different from that of others [1]. Our method offers users feedback on how their range of portraits is perceived by others, and our model can be used to select the most attractive pictures of a person or to remove the worst ones from a photo collection.

Many researchers have used machine learning on facial features to predict facial beauty [2, 15]. Bottino and Laurentini [3] surveyed studies of facial beauty in the pattern analysis literature; this line of work suggests that one person's appearance spans multiple attractiveness levels. Zhu [4] predicts the attractiveness of a given person based on expression alone. However, lighting, makeup, and resolution are also important when judging whether a portrait is flattering. In this paper, we therefore consider a more varied dataset in order to obtain a more broadly applicable model.

Ranking and relative ordering have been thoroughly investigated in the machine learning literature, especially in applications pertaining to information and document retrieval [5, 6]; similar techniques have also been applied to image search and retrieval [7, 8]. Our work differs significantly from these in that we learn a facial ranking function [9, 14] over image pairs of an individual. Most comparative work on facial beauty [4] allows only yes/no answers. However, it is very hard to judge which picture is more flattering when the compared images are at the same attractiveness level. Instead, we collect an additional relative attribute, which we call "similar", for training; in our comparison data, more than a third of the pairs are labeled similar. Our experimental results show that this annotation scheme provides more information for training and yields a more accurate facial ranking model.

This paper is organized as follows. Section 2 introduces our dataset, the pairwise comparisons, and our learning-to-rank training approach. Section 3 describes our evaluation method and compares our experimental results with other recent methods. Section 4 discusses limitations, concludes, and outlines future work.

2 Rank Beauty

In this section we describe how we collected the portrait dataset, pre-processed it, and gathered pairwise comparisons (Sect. 2.1). We then present the main approach used to build the facial beauty models (Sect. 2.2).

2.1 Data and Crowdsourcing

Our first goal is to collect a set of portraits of each subject covering a variety of expressions, makeup, lighting, and resolutions, and then to rate them along attractiveness attributes. In this section, we first show how we collect the portraits. We then pre-process the images to normalize face positions and extract the features for our model. Finally, we collect pairwise comparisons of portraits along the attractiveness attribute.

Collecting Data and Pre-processing. We start by collecting a large number of candidate portrait photos for each subject from the Bing Search API. All pictures are of celebrities, which ensures that enough portraits are available per subject. In total, we collected data for 108 subjects, with 500 to 600 images per subject, including both male and female subjects ranging in age from 20 to 65. We perform several pre-processing steps (Fig. 1) on each image collection to align the facial data, compute facial features, and reduce data redundancy.
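As an illustration, a minimal sketch of this collection step is given below. It assumes the Bing Image Search v7 REST endpoint and the Python `requests` library; the endpoint, parameters, and result fields are our assumptions, since the paper does not specify API details.

```python
# Hedged sketch: fetch candidate portrait URLs via Bing Image Search
# (v7 REST endpoint assumed; not specified in the paper).
import requests

BING_ENDPOINT = "https://api.bing.microsoft.com/v7.0/images/search"  # assumed

def fetch_portrait_urls(subject_name, api_key, count=150):
    """Return image URLs that may contain portraits of `subject_name`."""
    resp = requests.get(
        BING_ENDPOINT,
        headers={"Ocp-Apim-Subscription-Key": api_key},
        params={"q": subject_name + " portrait", "count": count},
        timeout=30,
    )
    resp.raise_for_status()
    return [item["contentUrl"] for item in resp.json().get("value", [])]
```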

Fig. 1. Key steps in pre-processing.

Fig. 2. Accuracy for similar images by RankSVM models.

Fig. 3. Accuracy for dissimilar images by RankSVM models.

Fig. 4. Accuracy for similar images by RankSVM and SVR [4].

Fig. 5. Accuracy for dissimilar images by RankSVM and SVR [4].

We first perform face detection and cropping to normalize each face in a common reference frame. We crop faces using bounding boxes generated by the Viola-Jones face detector. We then use a face tracker [10] that accurately estimates 9 facial feature points and localizes facial parts such as the eyes, mouth, nose, and facial contour (Fig. 1). We apply a median filter with a window size of 5 frames to smooth the estimated points and suppress temporal tracking jitter. We also filter out small faces, poorly aligned faces, and non-frontal faces. Next, we warp each face into a frontal view using the 3D template model [11] and the 3D-to-2D transformation matrix [4]. We exclude portraits in which little of the face is visible or for which the tracker reports failure. This leaves 108 subjects with 100 to 200 portraits per collection; Fig. 1 shows several examples of the remaining images. Hani Altwaijry and Serge Belongie [2] found that HOG is the most effective feature type for predicting beauty preferences, so we use a simple and straightforward 3720-dimensional HOG (Histogram of Oriented Gradients) [12] descriptor, computed over five parts of each image, to capture its visual properties.
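The sketch below illustrates this feature pipeline, assuming OpenCV's Haar-cascade (Viola-Jones) detector and scikit-image's HOG implementation. The region layout, patch size, and HOG parameters are our assumptions; the paper specifies only the final 3720-dimensional descriptor over five facial parts.

```python
# Minimal sketch of the detection + HOG feature step (parameters assumed).
import cv2
import numpy as np
from skimage.feature import hog
from skimage.transform import resize

detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def crop_face(image_bgr):
    """Viola-Jones detection; returns the largest detected face in grayscale."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    boxes = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(boxes) == 0:
        return None                      # excluded: no visible face
    x, y, w, h = max(boxes, key=lambda b: b[2] * b[3])
    return gray[y:y + h, x:x + w]

def portrait_descriptor(face, regions):
    """Concatenate HOG over five facial regions (hypothetical layout);
    the paper states only that the result is 3720-dimensional."""
    parts = []
    for top, bottom, left, right in regions:
        patch = resize(face[top:bottom, left:right], (48, 48))
        parts.append(hog(patch, orientations=9, pixels_per_cell=(8, 8),
                         cells_per_block=(2, 2), block_norm="L2-Hys"))
    return np.concatenate(parts)         # feature vector x_i
```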

Pairwise Comparisons. We next collect the human response data that allows us to rank a subject's portraits by attractiveness. We developed an annotation system that collects a human response for each pairwise comparison (e.g., "Is image A more beautiful than image B?"). We suggest that volunteers consider facial expression, posture, makeup, lighting, and resolution when making their choice. We sample portraits at random to form pairwise comparisons and provide three labels: "yes", "no", and "similar". Choosing the relative attribute "similar" does not only mean that the two photos in a comparison look alike; importantly, our system also lets annotators choose "similar" when they hesitate between yes and no. Twenty volunteers supported our work; each pair was annotated once, and a single worker completed about 3 or 4 groups. In total, we received comparison collections for 78 subjects, 12434 pairs overall.
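To make the use of these labels concrete, here is a minimal sketch (our own illustration) of how the three answers map to the training sets O and S defined in Sect. 2.2:

```python
def split_pairs(annotations):
    """annotations: list of (i, j, label), label in {'yes', 'no', 'similar'},
    answering "Is image i more beautiful than image j?".
    Returns the ordered set O and the similar set S of Sect. 2.2."""
    O, S = [], []
    for i, j, label in annotations:
        if label == "yes":
            O.append((i, j))       # A_ij = 1: I_i preferred over I_j
        elif label == "no":
            O.append((j, i))       # A_ij = 0: I_j preferred over I_i
        else:
            S.append((i, j))       # A_ij = 0.5: similar attractiveness
    return O, S
```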

2.2 Learning to Rank

We use ranking functions trained with comparative labels. This method, originally introduced by Parikh and Grauman [9], compares images in terms of how strongly they exhibit a nameable visual property. We use a large-margin approach to model our facial relative attributes, and we make use of the "similar" relative attribute collected above. To learn our ranking function g, we use the set of portraits \( I = \{I_1, I_2, \ldots, I_n\} \) in the dataset, each described by image features \( x_i \in R^d \). The annotated portrait list for a subject is given as a tuple \( A = \{A_{12}, A_{23}, \ldots, A_{n-1,n}\} \) with \( A_{ij} \in \{0, 0.5, 1\} \), where \( A_{ij} = 1 \) denotes that \( I_i \) is more attractive than \( I_j \), \( A_{ij} = 0 \) means the inverse, and \( A_{ij} = 0.5 \) denotes that \( I_i \) and \( I_j \) are equally attractive. We then sort the comparisons of all subjects into two sets according to A. The first set \( O = \{(i, j)\} \) consists of ordered pairs for which the first image \( I_i \) has the attribute more than the second image \( I_j \), i.e., \( A_{ij} = 1 \) or \( A_{ji} = 0 \). The second set \( S = \{(i, j)\} \) consists of unordered pairs for which both images have the attribute to a similar extent, i.e., \( A_{ij} = 0.5 \). Our goal is to learn the function:

$$ g\left( {x_{i} } \right) = w^{T} x_{i} $$
(1)

subject to the constraints:

$$ \forall \left( {i,j} \right) \in O \to g\left( {x_{i} } \right) > g\left( {x_{j} } \right) $$
(2)
$$ \forall \left( {i,j} \right) \in S \to g\left( {x_{i} } \right) = g\left( {x_{j} } \right) $$
(3)

While this is an NP-hard problem, it can be approximated, as described in [9], by the following optimization problem:

$$ {\text{minimize}}\,\left( {\frac{1}{2}\left\| w \right\|_{2}^{2} + C\left( {\sum \xi_{ij}^{2} + \sum \gamma_{ij}^{2} } \right)} \right) $$
(4)
$$ s.t.\,\,w^{T} \left( {x_{i} - x_{j} } \right) \ge 1 - \xi_{ij} ;\forall \left( {i,j} \right) \in O $$
(5)
$$ \left| {w^{T} \left( {x_{i} - x_{j} } \right)} \right| \le \gamma_{ij} ;\forall \left( {i,j} \right) \in S $$
(6)
$$ \xi_{ij} \ge 0;\gamma_{ij} \ge 0. $$
(7)

where \( \xi_{ij} \) and \( \gamma_{ij} \) are slack variables, and the constant C balances the regularizer against the constraints, controlling how strictly the relative order must be satisfied. The original RankSVM is defined without Eq. (6); we instead impose both restrictions, Eqs. (5) and (6), to match our sorting of the annotations into the tuple A. RankSVM extended with Eq. (6) can learn additional useful information from the similar pairs, which enables a more accurate image ranking.
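A minimal sketch of this training step follows. It is our own unconstrained reformulation, not the paper's solver: because the slacks in Eq. (4) are squared, at the optimum \( \xi_{ij} = \max(0, 1 - w^{T}(x_i - x_j)) \) and \( \gamma_{ij} = |w^{T}(x_i - x_j)| \), so the problem reduces to a smooth objective that we minimize with L-BFGS.

```python
# Hedged sketch of the extended RankSVM of Eqs. (4)-(7), solved as an
# unconstrained problem (our reformulation; solver choice is an assumption).
import numpy as np
from scipy.optimize import minimize

def train_extended_ranksvm(X, ordered, similar, C=0.1):
    """X: (n, d) feature matrix. ordered: pairs (i, j) with I_i preferred
    over I_j (set O). similar: equally attractive pairs (set S)."""
    Do = np.array([X[i] - X[j] for i, j in ordered])   # differences over O
    Ds = np.array([X[i] - X[j] for i, j in similar])   # differences over S

    def objective(w):
        mo, ms = Do @ w, Ds @ w
        hinge = np.maximum(0.0, 1.0 - mo)              # optimal xi_ij
        loss = 0.5 * w @ w + C * (hinge @ hinge + ms @ ms)
        grad = w - 2 * C * (Do.T @ hinge) + 2 * C * (Ds.T @ ms)
        return loss, grad

    res = minimize(objective, np.zeros(X.shape[1]), jac=True,
                   method="L-BFGS-B")
    return res.x                                       # g(x) = w.x
```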

3 Results

In this section we present the approach we use to measure the accuracy of the ranking model (Sect. 3.1) and then report our experimental results (Sect. 3.2).

3.1 Measuring Accuracy

To test our method we collected comparison collections for an additional 30 subjects. To measure accuracy we use a standard tool for comparing ranked orders of pairs: the Kendall Tau [13].

In our implementation, we first focus on the accuracy of learning "more attractive than" relationships. For each subject, we obtain our model's predicted ranking over pairs, defined as \( E = \{(i, j)\} \), the set of ordered pairs for which \( I_i \) is predicted to be more beautiful than \( I_j \). We measure the pairwise agreement between O and E as \( \tau(E, O) = \sum_{\forall (i,j) \in E} I((i,j) \in O) \), where \( I(\cdot) \) is an indicator function.

Based on the Kendall Tau, we construct our accuracy measure as the number of correct pairs divided by the total number of pairs. If \( N_O \) is the total number of pairs in set O for a subject, then our accuracy measure for E matching O is \( \alpha(E, O) = \tau(E, O)/N_O \).
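As a concrete illustration (our own sketch, with `w` and `X` as in Sect. 2.2), this measure amounts to:

```python
# Sketch of alpha(E, O) = tau(E, O) / N_O.
import numpy as np

def pairwise_accuracy(w, X, ordered_pairs):
    """ordered_pairs is the annotated set O; a pair (i, j) counts as correct
    when the learned scores agree, i.e. g(x_i) = w.x_i exceeds g(x_j)."""
    scores = X @ w
    correct = sum(1 for i, j in ordered_pairs if scores[i] > scores[j])
    return correct / len(ordered_pairs)
```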

We also want to make full use of the set S to assess the ability of our model to rank similar portraits during testing. We define a set \( D = \{D_{ij}\} \) with \( (i, j) \in S \), where \( D_{ij} = |g(x_i) - g(x_j)|/d \) and d is a normalization parameter; \( D_{ij} \) is thus the normalized margin between the mapped images \( I_i \) and \( I_j \). In our implementation, we compare our model with the model presented in [4]; let \( D_s \) denote ours and \( D_{us} \) the other. We count the pairs on which our normalized margin is smaller (i.e., the similar images are mapped closer together): \( \tau(D_s, D_{us}) = \sum I\{D_s(i,j) < D_{us}(i,j)\} \). If \( N_S \) is the total number of pairs in set S for a subject, then our accuracy measure on similar pairs, for \( D_s \) relative to \( D_{us} \), is \( \alpha(D_s, D_{us}) = \tau(D_s, D_{us})/N_S \); correspondingly, the other model's accuracy on similar pairs is \( 1 - \alpha(D_s, D_{us}) \).
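A minimal sketch of this similar-pair measure is below (our own illustration; the normalization constants are assumptions, as the paper does not define d precisely):

```python
# Sketch of alpha(D_s, D_us): the share of similar pairs on which our
# model's normalized score gap is the smaller one.
import numpy as np

def similar_pair_accuracy(w_ours, w_other, X, similar_pairs,
                          d_ours=1.0, d_other=1.0):
    s_ours, s_other = X @ w_ours, X @ w_other
    wins = sum(1 for i, j in similar_pairs
               if abs(s_ours[i] - s_ours[j]) / d_ours
                  < abs(s_other[i] - s_other[j]) / d_other)
    return wins / len(similar_pairs)
```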

Below, we show that our model performs well on both dissimilar and similar image pairs.

3.2 Facial Beauty Ranking

We designed our experiments using the method introduced in Sect. 2.2. We trained our model on the 12434 training examples from 78 subjects, using the relative attributes supplied by our volunteers. To test the accuracy of the ranking system, we additionally collected pairwise comparisons for 30 test subjects. Figure 6 shows ranking results for four of the test subjects: for each subject's personal image collection, we show two rows, the five most attractive portraits and the five least attractive.
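Producing this kind of output reduces to scoring and sorting; a minimal sketch (our own illustration):

```python
# Sketch of the Fig. 6 style selection: score a subject's portraits with
# g(x) = w.x and keep the five best and five worst.
import numpy as np

def top_and_bottom(w, X, k=5):
    order = np.argsort(X @ w)[::-1]      # indices, most attractive first
    return order[:k], order[-k:]
```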

Fig. 6. Ranking results by Extended-RankSVM.

To observe how the similar attribute behaves, we also trained the original RankSVM model, which omits the restriction imposed by Eq. (6); note that the similar attribute contributes nothing to that model. We ran the experiment on the pairwise comparisons of each subject; the resulting ranking accuracies are plotted in Figs. 2 and 3. On dissimilar portraits the two models reach essentially the same high ranking accuracy (Fig. 3), but the original RankSVM model performs poorly on similar images (Fig. 2). We attribute this to the similar attribute: the results show that it contributes to a more accurate ranking model.

We then compared our facial beauty ranking system with the existing SVR model proposed in [4]; Figs. 4 and 5 show how the two models perform. On dissimilar pairs (Fig. 5), our ranking system achieves an average accuracy of 73 %, versus 59 % for the SVR model. Our system also performs better when ranking similar portraits, as seen in Fig. 4. Considering Figs. 2, 3, 4 and 5 together, our facial beauty ranking system is quite accurate.

Finally, we observe that portraits with good makeup, nice lighting, a flattering expression, and reasonable resolution receive higher rankings; in particular, our model favors personal photos with a professional photographic quality. The output of our facial ranking model therefore seems reasonable.

4 Conclusion

Our method has some limitations. Our model is a cross-subject ranking model that predicts attractiveness automatically, but different people suit different poses, expressions, retouching styles, and so on. Moreover, although annotations from 20 workers may approximate public opinion, individual preferences still differ in many details.

In this paper, we described a ranking system for facial beauty that ranks portraits of a subject according to public preference. More importantly, we made full use of relative attributes, especially the similar attribute, and achieved an average accuracy of 73 %. Our personal facial beauty ranking system thus considers not only expression but also lighting, clarity, makeup, and so on, and the similar relative attribute proves to be important. In future work, we plan to improve our dataset and its collection methodology; adapting the model to individual subjects is also a promising avenue.