
1 Introduction

Face clustering is a fundamental task in face analysis and has been extensively studied in recent years [8, 15, 20, 22, 23]. Existing face clustering methods roughly fall into two categories, i.e., unsupervised methods and supervised methods. Unsupervised approaches, such as K-Means [10] and DBSCAN [6], rely on specific assumptions and lack the capability of reasoning over high-dimensional structured data. Supervised face clustering methods mainly aim to learn a more discriminative embedding subspace [3, 14] or the complex cluster patterns [23]. These existing methods are mainly based on node distances in the feature space, while ignoring the negative effect caused by low-quality face images. Low-quality face nodes are common and hinder clustering due to their ambiguous identities. Figure 1(a) shows an example where a node pair containing one low-quality node can have high similarity even though the two nodes belong to different identities. Such low-quality face nodes will obviously degrade the face clustering precision if not handled appropriately. It is therefore essential for face clustering algorithms to be able to deal with this general application circumstance.

Fig. 1.

(a) Demonstration of two node pairs, with the quality score shown on each image and the similarity given in parentheses. (b) Mean absolute value of the quality score difference between paired nodes on the IJB-C dataset [11] under the same and different identities, with respect to the similarity threshold. Only node pairs with similarity higher than the threshold are involved in this statistic. The face quality scores are obtained by EQFace [9], and the identity features are extracted by a pre-trained IResNet50 [4, 5].

Fortunately, the face quality score provides helpful auxiliary information for clustering. Figure 1(b) shows the mean absolute value of the quality score difference between paired nodes under the same and different identities, with respect to the similarity threshold. Only node pairs in the IJB-C dataset [11] with similarity higher than the threshold are involved in this statistic. There always exists a gap between the two score-difference curves, which implies that two nodes that are similar in the feature subspace but belong to different identities tend to have a larger score difference. Based on this observation, this paper proposes a face clustering algorithm, referred to as FC-Q, which takes face quality as an extra input to better exploit unlabeled face data.
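
The statistic in Fig. 1(b) is straightforward to reproduce. Below is a minimal sketch in PyTorch, assuming L2-normalized identity features `feats` (N, d), quality scores `quals` (N,), and identity labels `ids` (N,); all names are illustrative.

```python
import torch

def quality_gap_stats(feats, quals, ids, threshold):
    """Mean |quality score difference| over node pairs whose cosine
    similarity exceeds `threshold`, split into same-identity and
    different-identity pairs (cf. Fig. 1(b))."""
    sim = feats @ feats.T                               # cosine similarity (feats L2-normalized)
    qdiff = (quals[:, None] - quals[None, :]).abs()     # |q_m - q_n| for every pair
    same = ids[:, None] == ids[None, :]                 # same-identity mask
    upper = torch.triu(torch.ones_like(same), diagonal=1)  # each unordered pair once, no self-pairs
    linked = (sim > threshold) & upper
    return (qdiff[linked & same].mean().item(),         # same-identity gap
            qdiff[linked & ~same].mean().item())        # different-identity gap
```

Sweeping `threshold` over a grid of similarity values then traces out the two curves of Fig. 1(b).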

Face quality can be conveniently assessed beforehand by recent deep-learning-based methods. SER-FIQ [17] obtains face quality by measuring the variations of embeddings generated from random sub-networks of the face recognition model. A deep tiny network [13] has also been proposed to learn a recognition-oriented face quality prediction function. Meanwhile, face quality can be given explicitly along with the identity feature by a face recognition network. MagFace [12] introduces an adaptive mechanism to learn a universal feature embedding whose magnitude measures the face quality. EQFace [9] outputs the face quality and the identity feature at the same time by adding a quality network branch to the baseline face recognition network. Such methods make gathering our input data efficient.

Fig. 2.

Flowchart of our FC-Q algorithm. The proposed algorithm follows the link-based clustering paradigm. Given one pivot face node and its K nearest neighbors in the feature subspace, the algorithm first obtains the input vectors by concatenating the identity features and face quality scores of pivot-neighbor pairs as well as a specific Pearson correlation encoding. Its linkage prediction module takes the form of a modified Transformer encoder, and infers the linkage likelihood with the help of a pre-calculated prior relevancy matrix. Its label propagation module transitively merges face nodes according to the linkage likelihood refined with local face quality information. Instances with the same pseudo label constitute a cluster.

The proposed FC-Q algorithm incorporates face quality into its two main modules, i.e., the linkage prediction module and the label propagation module. The linkage prediction module adopts the framework of the Transformer encoder [18], specially modified to fit the general clustering circumstance. A relevancy prior is designed according to the quality score relationships among nodes in the neighborhood. This prior helps the self-attention mechanism of the Transformer encoder better infer the linkage likelihood between node pairs. The label propagation module utilizes face quality to recalibrate abnormal linkage likelihoods based on local quality information. A linkage in which one node has a quality score inconsistent with those of its neighbors is suspected of being unreliable, and its likelihood is suppressed if the two paired nodes have a large gap in their local quality information. Finally, our proposed algorithm transitively merges face nodes according to the refined linkage likelihood, and obtains the clusters.

To summarize, the main contributions of this work are as follows:

  • A face clustering algorithm named FC-Q is proposed that takes face quality as an extra input. Compared with the state of the art, this algorithm deals with the more general circumstance in which face nodes are not all guaranteed to be of good quality.

  • The proposed FC-Q algorithm modifies the Transformer encoder and designs a relevancy prior with face quality to infer more reliable linkage likelihoods between similar pairs in the feature subspace.

  • The proposed FC-Q algorithm utilizes the local quality information of each node to further suppress pairings with abnormally high linkage likelihood.

  • The proposed FC-Q algorithm specifically conducts face clustering experiments on the IJB-C dataset with low face quality and achieves a \(91.7\%\) pairwise F-score on partial IJB-C, which provides a strong baseline for low-quality face clustering.

2 Methodology

In this section, we introduce the details of the proposed FC-Q algorithm, which consists of a specific linkage prediction module and a label propagation module. Figure 2 shows the flowchart.

2.1 Linkage Prediction with Prior Attention

Following the link-based clustering paradigm, the proposed algorithm selects every face node as a pivot, and estimates the linkage likelihood between the pivot and each of its K nearest neighbors in the feature subspace.
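
For reference, a minimal sketch of this neighbor selection follows, assuming L2-normalized features so that the inner product equals cosine similarity; for large N, an approximate nearest-neighbor library would replace the dense similarity matrix.

```python
import torch

def knn_in_feature_subspace(feats, K=80):
    """Indices of the K nearest neighbors of every pivot by cosine
    similarity. feats: (N, d_f) L2-normalized identity features."""
    sim = feats @ feats.T           # (N, N) pairwise cosine similarity
    sim.fill_diagonal_(-1.0)        # exclude each pivot from its own neighbors
    _, nbr_idx = sim.topk(K, dim=1)
    return nbr_idx                  # (N, K) neighbor indices per pivot
```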

Given the ith pivot face node \(\textbf{f}_0^{i}\in \mathbb {R}^{d_f}\) and its K nearest neighbors \(\textbf{f}_j^{i}, j\in \{1,...,K\}\), the input of the linkage prediction module consists of \(K+1\) vectors, each of the form

$$\begin{aligned} \textbf{g}_j = \textrm{cat}\left( \textbf{f}_j^i, \textbf{f}_0^{i}, q_j^i, q_0^i, \textbf{p}_j\right) \in \mathbb {R}^{d_g}, \quad j\in \{0,...,K\}, \end{aligned}$$
(1)

where \(q_0^i\in \mathbb {R}\) denotes the face quality score of the pivot node, and \(q_j^{i}, j\in \{1,...,K\}\) denotes the face quality score of its jth neighbor. Operator \(\textrm{cat}()\) denotes the concatenation operation along the feature dimension, thus \(d_g=2d_f+2+K+1\). Vector \(\textbf{p}_j\in \mathbb {R}^{K+1}\) represents the Pearson correlation encoding of each involved face node. Its kth element, i.e., the Pearson correlation coefficient, is calculated as

$$\begin{aligned} \textbf{p}_j^k = \frac{\textrm{cov}\left( \textbf{f}_j^i, \textbf{f}_k^i\right) }{\sigma _{\textbf{f}_j^i}\sigma _{\textbf{f}_k^i}}, \quad k\in \{0,...,K\}, \end{aligned}$$
(2)

where \(\textrm{cov}\) and \(\sigma \) denote the covariance and the standard deviations of \(\textbf{f}_j^i\) and \(\textbf{f}_k^i\), respectively. Concatenating the pivot feature \(\textbf{f}_0^{i}\) in (1) informs the linkage prediction module to learn the relationship within each pivot-neighbor pair. Concatenating the quality scores \(q_j^i, q_0^i\) in (1) lets the linkage prediction module infer the linkage likelihood with more references. Concatenating the Pearson correlation encoding \(\textbf{p}_j\) in (1) is based on the consideration that the Pearson correlation coefficient offers the linkage prediction module the linear relations between paired nodes and is highly sensitive to outliers [2].
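
A minimal sketch of assembling the input vectors of Eq. (1) for one pivot, using torch.corrcoef for the Pearson encoding of Eq. (2); `feats`, `quals`, and the neighbor indices are assumed to come from the preceding steps.

```python
import torch

def build_input_vectors(feats, quals, pivot, nbrs):
    """Assemble the K+1 input vectors g_j of Eq. (1) for one pivot.
    feats: (N, d_f) identity features; quals: (N,) quality scores;
    nbrs: (K,) neighbor indices of `pivot`."""
    idx = torch.cat([torch.tensor([pivot]), nbrs])  # position 0 is the pivot
    F, q = feats[idx], quals[idx]                   # (K+1, d_f), (K+1,)
    P = torch.corrcoef(F)                           # Pearson encoding, Eq. (2)
    n = F.shape[0]
    return torch.cat([F,                            # neighbor/pivot feature f_j
                      F[0].expand(n, -1),           # pivot feature f_0, repeated
                      q[:, None],                   # quality score q_j
                      q[0].expand(n)[:, None],      # pivot score q_0, repeated
                      P], dim=1)                    # (K+1, 2*d_f + 2 + K + 1)
```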

With the input vectors generated, the linkage prediction module adopts a modified Transformer encoder to estimate the linkage likelihood of each pivot-neighbor pair. Compared with GCN-based frameworks [20], the Transformer encoder can learn relation weights through its effective self-attention mechanism. As shown in Fig. 2, the linkage prediction module is composed of a stack of 3 identical layers. Each layer has three sub-layers: the first is a pre-layer normalization [21], the second is a modified multi-head self-attention mechanism, and the third is a simple fully connected feed-forward network. Note that the skip connections of the last two sub-layers are specifically removed to let the whole module focus more on inferring the difference between pivot and neighbor nodes. The last feed-forward sub-layer performs binary node classification followed by a softmax activation, which outputs the probability that the corresponding input belongs to the same class as the pivot.
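
Under our reading of this description, one encoder layer can be sketched as follows; `PriorSelfAttention` refers to the prior-guided attention sketched after Eq. (7), and the feed-forward width `d_ff` is an illustrative choice not specified in the text.

```python
import torch.nn as nn

class EncoderLayer(nn.Module):
    """One of the 3 identical layers: pre-layer normalization, modified
    multi-head self-attention, feed-forward network. The residual
    connections around the last two sub-layers are removed on purpose."""
    def __init__(self, d_g, attention, d_ff=2048):
        super().__init__()
        self.norm = nn.LayerNorm(d_g)    # pre-layer normalization [21]
        self.attn = attention            # e.g., a PriorSelfAttention instance
        self.ffn = nn.Sequential(nn.Linear(d_g, d_ff), nn.ReLU(),
                                 nn.Linear(d_ff, d_g))

    def forward(self, G, A):
        G = self.attn(self.norm(G), A)   # no skip connection here
        return self.ffn(G)               # nor here
```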

Fig. 3.

Architecture of our modified self-attention mechanism. The input vectors are first projected into three super vectors named key, query and value. The key and query are forced to share the same projection. A prior relevancy matrix is added to the learned relevancy, and offers negative relevancy information when paired nodes have a large quality score difference. The brighter the color, the larger the value.

Figure 3 further shows the architecture of the three modified self-attention sub-layers. Taking the first one as an example, every normalized vector \(\textbf{g}_j'\) in the ith input \(\textbf{G}_i'\in \mathbb {R}^{(K+1)\times d_g}\) is first linearly projected into three super vectors named key, query and value, which can be expressed in matrix form as

$$\begin{aligned} \begin{aligned}&\textbf{K}=\textbf{G}_i'\textbf{W}^S, \textbf{K}\in \mathbb {R}^{(K+1)\times d_s},\\&\textbf{Q}=\textbf{G}_i'\textbf{W}^S, \textbf{Q}\in \mathbb {R}^{(K+1)\times d_s},\\&\textbf{V}=\textbf{G}_i'\textbf{W}^V, \textbf{V}\in \mathbb {R}^{(K+1)\times d_v}, \end{aligned} \end{aligned}$$
(3)

where matrices \(\textbf{W}^S\in \mathbb {R}^{d_g\times d_s}\) and \(\textbf{W}^V\in \mathbb {R}^{d_g\times d_v}\) denote the learnable projection matrices. The key and query are forced to share the same projection, so the relevancy between any two samples is symmetric. This setting is helpful for face clustering, as shown in the experiment section.

The relevancy between the samples is constructed as

$$\begin{aligned} \begin{aligned} \frac{1}{\sqrt{d_s}}\textbf{Q}\textbf{K}^T + \textbf{A}, \end{aligned} \end{aligned}$$
(4)

where matrix \(\textbf{A}\in \mathbb {R}^{(K+1)\times (K+1)}\) represents the prior relevancy, and its element in the mth row and nth column has the form

$$\begin{aligned} \begin{aligned} \textbf{A}_{m,n} = \textrm{sim}\left( \textbf{f}^i_m, \textbf{f}^i_n\right) \cdot \left( 1 - \textrm{abs}(q^i_m - q^i_n)\right) , \end{aligned} \end{aligned}$$
(5)

where operator \(\textrm{sim}()\) calculates the similarity between identity features \(\textbf{f}^i_m\) and \(\textbf{f}^i_n\), and operator \(\textrm{abs}()\) denotes the absolute value. The prior relevancy matrix \(\textbf{A}\) is also symmetric, and offers negative relevancy information between two samples when they have a large quality score difference.
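
A minimal sketch of Eq. (5) for one pivot neighborhood, taking sim() as cosine similarity (an assumption; the text does not fix the similarity measure) and quality scores in [0, 1].

```python
import torch.nn.functional as Fn

def prior_relevancy(F, q):
    """Prior relevancy matrix A of Eq. (5).
    F: (K+1, d_f) identity features; q: (K+1,) quality scores in [0, 1]."""
    Fhat = Fn.normalize(F, dim=1)
    sim = Fhat @ Fhat.T                        # sim(f_m, f_n), cosine
    qgap = (q[:, None] - q[None, :]).abs()     # abs(q_m - q_n)
    return sim * (1.0 - qgap)                  # large quality gaps shrink the prior
```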

The output \(\textbf{Z}\in \mathbb {R}^{(K+1)\times d_v}\) in self-attention mechanism is the aggregation of value matrix \(\textbf{V}\) by attention weights, i.e.,

$$\begin{aligned} \begin{aligned} \textbf{Z}=\textrm{softmax}\left( \frac{1}{\sqrt{d_s}}\textbf{Q}\textbf{K}^T + \textbf{A}\right) \textbf{V}, \end{aligned} \end{aligned}$$
(6)

where the attention weights are obtained by applying softmax normalization to (4).

The modified self-attention is further incorporated into the multi-head mechanism, where the self-attention is performed H times in parallel. The outputs of the individual self-attention heads are concatenated and once again projected, resulting in the final output of the multi-head self-attention sub-layer

$$\begin{aligned} \begin{aligned} \textbf{Z}_{M}=\textrm{cat}\left( \textbf{Z}_1,...,\textbf{Z}_H\right) \textbf{W}_M, \end{aligned} \end{aligned}$$
(7)

where matrix \(\textbf{W}_M\in \mathbb {R}^{Hd_v\times d_g}\) denotes the learnable projection matrix.
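
Putting Eqs. (3)-(7) together, the modified multi-head self-attention can be sketched as below; the shared key/query projection and the additive prior follow the text, while details such as the absence of bias terms are our assumptions.

```python
import torch
import torch.nn as nn

class PriorSelfAttention(nn.Module):
    """Multi-head self-attention with a shared key/query projection and an
    additive prior relevancy matrix A, following Eqs. (3)-(7)."""
    def __init__(self, d_g, d_s=2048, H=2):
        super().__init__()
        self.d_s = d_s
        self.W_S = nn.ModuleList(nn.Linear(d_g, d_s, bias=False) for _ in range(H))
        self.W_V = nn.ModuleList(nn.Linear(d_g, d_s, bias=False) for _ in range(H))
        self.W_M = nn.Linear(H * d_s, d_g, bias=False)

    def forward(self, G, A):
        # G: (K+1, d_g) normalized input vectors; A: (K+1, K+1) prior relevancy
        heads = []
        for W_S, W_V in zip(self.W_S, self.W_V):
            KQ = W_S(G)                                   # K = Q: shared projection, Eq. (3)
            V = W_V(G)
            logits = KQ @ KQ.T / self.d_s ** 0.5 + A      # symmetric relevancy, Eq. (4)
            heads.append(torch.softmax(logits, -1) @ V)   # attention aggregation, Eq. (6)
        return self.W_M(torch.cat(heads, dim=-1))         # multi-head output, Eq. (7)
```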

During the training stage, the whole linkage prediction module is trained with the cross-entropy loss

$$\begin{aligned} \begin{aligned} \mathcal {L}=-\sum _{k=0}^K\log \left( \hat{y}_{k,1}^i\right) , \end{aligned} \end{aligned}$$
(8)

where \(\hat{y}_{k,1}^i\) denotes the output probability, i.e., the linkage likelihood, that the ith pivot and its kth neighbor belong to the same class.
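
In practice, Eq. (8) amounts to a standard two-way cross-entropy over the node classification outputs. A sketch with illustrative dummy tensors:

```python
import torch
import torch.nn.functional as F

K = 80
logits = torch.randn(K + 1, 2)            # scores from the final feed-forward sub-layer
labels = torch.randint(0, 2, (K + 1,))    # 1 if the node shares the pivot's identity
loss = F.cross_entropy(logits, labels, reduction='sum')  # Eq. (8), summed over the K+1 nodes
```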

Algorithm 1. The label propagation procedure with anomaly suppression.

2.2 Label Propagation with Anomaly Suppression

As all pivot nodes are processed by the linkage prediction module, a set of pivot-neighbor pairs \(\mathcal {E}=\{\textbf{e}_j\}_{j=1,...,NK}\) is obtained along with the corresponding linkage likelihood set \(\mathcal {P}=\{p_j\}_{j=1,...,NK}\). The goal of the label propagation module is to assign a pseudo label \(y_i\) to every face image \(\textbf{x}_i\) with the help of the quality score set.

The label propagation module starts with the initial linkage threshold \(\tau _p\) and proceeds iteratively. In each iteration, the module first finds all the connected neighbors of every unlabeled node according to the current pair set \(\mathcal {E}\). Given one pivot-neighbor pair \(\textbf{e}_j\) connecting nodes m and n, the module calculates the mean absolute value of the quality score difference between each node and its connected neighbors,

$$\begin{aligned} \begin{aligned} \breve{q}_{*} = \frac{1}{|\mathcal {N}_*|}\sum _{k\in \mathcal {N}_*}\textrm{abs}\left( q_*-q_k\right) ,\,\,\, *=m,n, \end{aligned} \end{aligned}$$
(9)

where \(\mathcal {N}_*\) denotes the index set of connected neighbors, and \(|\mathcal {N}_*|\) its cardinality. If the value \(\breve{q}\) of either node is larger than a predefined threshold \(\tau _q\), the pivot-neighbor pair is considered unreliable and its linkage likelihood is suppressed as

$$\begin{aligned} \begin{aligned} p_j =p_j \cdot \left( 1-\textrm{abs}\left( \bar{q}_m-\bar{q}_n\right) \right) , \end{aligned} \end{aligned}$$
(10)

where \(\bar{q}_m\) and \(\bar{q}_n\) denote the mean quality scores of the corresponding node and its connected neighbors. The gap in local face quality information between the two nodes determines the degree to which the corresponding linkage likelihood is suppressed.
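
A minimal sketch of this recalibration for a single pair, directly following Eqs. (9)-(10); `q` maps node indices to quality scores, and the neighbor lists come from the current pair set.

```python
def suppress_linkage(p, q, nbrs_m, nbrs_n, m, n, tau_q=0.3):
    """Refine the linkage likelihood p of pair (m, n) per Eqs. (9)-(10)."""
    breve_m = sum(abs(q[m] - q[k]) for k in nbrs_m) / len(nbrs_m)  # Eq. (9) at node m
    breve_n = sum(abs(q[n] - q[k]) for k in nbrs_n) / len(nbrs_n)  # Eq. (9) at node n
    if breve_m > tau_q or breve_n > tau_q:                         # suspected unreliable link
        bar_m = (q[m] + sum(q[k] for k in nbrs_m)) / (len(nbrs_m) + 1)
        bar_n = (q[n] + sum(q[k] for k in nbrs_n)) / (len(nbrs_n) + 1)
        p *= 1.0 - abs(bar_m - bar_n)                              # Eq. (10)
    return p
```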

With all linkage likelihoods updated, the label propagation module first removes from \(\mathcal {E}\) the pivot-neighbor pairs whose linkage likelihood is below the threshold \(\tau _p\). The module then finds the connected components formed by the remaining pivot-neighbor pairs. If the number of nodes in a component is below the predefined maximum size M, all nodes in the component are annotated with a new pseudo label, and the corresponding pivot-neighbor pairs are also removed from set \(\mathcal {E}\). The threshold \(\tau _p\) is increased at the end of every iteration.

The above process is iterated until the pair set \(\mathcal {E}\) is empty. The remaining orphan nodes are annotated with new individual pseudo labels. Finally, the face nodes with the same pseudo label constitute a cluster. In summary, Algorithm 1 lists the whole procedure of this label propagation. The maximum computational complexity of one iteration is of order \(\mathcal {O}(3NK)\), and decreases as the iterations progress. We observe that involving face quality helps improve the clustering precision at the cost of a slight drop in recall.
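
The whole procedure can be sketched as follows, reusing `suppress_linkage` from above and networkx for the connected components; the per-iteration increment of \(\tau _p\) (`step`) is an illustrative value the text does not specify.

```python
import networkx as nx
from collections import defaultdict

def label_propagation(pairs, likes, q, tau_p=0.9, tau_q=0.3, M=1000, T=20, step=0.05):
    """Iterative label propagation (Algorithm 1). pairs: list of (m, n)
    node-index pairs; likes: aligned linkage likelihoods; q: quality scores."""
    labels, next_label = {}, 0
    for _ in range(T):
        if not pairs:
            break
        nbrs = defaultdict(list)                  # connected neighbors per node
        for m, n in pairs:
            nbrs[m].append(n)
            nbrs[n].append(m)
        likes = [suppress_linkage(p, q, nbrs[m], nbrs[n], m, n, tau_q)
                 for p, (m, n) in zip(likes, pairs)]
        kept = [(e, p) for e, p in zip(pairs, likes) if p >= tau_p]
        G = nx.Graph([e for e, _ in kept])        # remaining pivot-neighbor pairs
        done = set()
        for comp in nx.connected_components(G):
            if len(comp) <= M:                    # annotate small components
                labels.update((v, next_label) for v in comp)
                next_label += 1
                done |= comp
        pairs = [e for e, p in kept if e[0] not in done and e[1] not in done]
        likes = [p for e, p in kept if e[0] not in done and e[1] not in done]
        tau_p += step                             # tighten the threshold
    return labels  # orphan nodes then receive fresh individual labels
```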

3 Experiments

3.1 Experimental Settings

Datasets. We use the IJB-C dataset [11] with low face quality as well as the refined MS1M dataset [4, 7] with good face quality for training and testing in face clustering. IJB-C contains about 138K face images from 3.5K identities, and MS1M contains about 5.8M face images from 85K identities. The IJB-C dataset is randomly partitioned into 10 splits with an equal number of identities, and each part has the same distribution of nodes per identity. As the IJB-C dataset is small, 9 parts are used for training to alleviate overfitting and 1 part for testing. The MS1M dataset is partitioned in the same way, with 1 part for training and the other 9 parts for testing.

We evaluate the performance of face clustering by three commonly used metrics, i.e., Pairwise [16], BCubed [1] and NMI [19]. Pairwise and BCubed both measure the precision and recall of clustering, with the F-score being their harmonic mean. The former metric places more emphasis on large clusters. NMI measures the global closeness between the output pseudo labels and the ground truth. All the metrics have a range of [0, 1], with 1 being perfect. Four state-of-the-art algorithms, namely L-GCN [20], GCN-DS [23], GCN-VE [22], and STAR-FC [15], are used for comparison.

Our framework is implemented in PyTorch. We first use the pre-trained IResNet50 [4, 5] to extract the identity features of face nodes with dimension \(d_f=512\), and EQFace [9] to obtain the corresponding face quality scores. All the competing algorithms share the same input. We then set the neighbor number \(K = 80\), the dimension of key, query and value \(d_s=d_v=2048\), and the number of attention heads \(H=2\). We also empirically set the initial linkage threshold \(\tau _p=0.9\), the quality threshold \(\tau _q=0.3\), and the maximum cluster size \(M=1000\). The maximum iteration number T is set to 20, which is sufficient to obtain a satisfactory clustering. These parameter values remain the same in the following experiments.

In addition, we train the model for 20 epochs from scratch, and optimize the loss with the SGD optimizer. The weight decay and the momentum are set to 0.0005 and 0.9, respectively. The initial learning rate is set to 0.01 and is empirically divided by 10 at epochs 8, 12 and 18. All the experiments are performed on a single Tesla-P40 GPU; more GPUs can be used for acceleration.

Table 1. Results on the IJB-C dataset and the MS1M dataset. “P” is short for “Pairwise”, and “B” is short for “BCubed”.

3.2 Experimental Results

Table 1 first shows the comparison results on the IJB-C dataset. GCN-D uses only its detection module. GCN-DS further incorporates the segmentation module but achieves no performance improvement. This is because the algorithm fails to learn the complex cluster patterns, as the identity similarities are not reliable among low-quality faces. GCN-V and GCN-VE present the same phenomenon, as their vertex confidence and edge connectivity estimation modules depend heavily on the identity similarity in the feature subspace. STAR-FC performs relatively better owing to its structure-preserved sub-graph sampling strategy. Overall, our FC-Q algorithm outperforms the other algorithms on this dataset, achieving a \(91.7\%\) pairwise F-score by employing face quality scores. Table 1 also shows the comparison results on the MS1M dataset. All the algorithms achieve performance improvements when the faces are of good quality. Although the assisting role of face quality is diluted there, our FC-Q algorithm still obtains satisfactory results. This indicates that our algorithm can be the first choice when implementing face clustering under the general circumstance where the face quality is unknown or mixed.

3.3 Ablation Studies

In this subsection, we evaluate several design elements of our algorithm. The following experiments are all conducted on the IJB-C dataset. Table 2 presents the clustering results of our algorithm when individually removing the following four design elements: concatenating the quality scores, concatenating the Pearson correlation encoding, adding the prior relevancy matrix, and making the key and query share the same projection. All four design elements contribute to the performance improvement, and the last one helps the most.

Table 2. Results on IJB-C when the four design elements are individually removed. “P” is short for “Pairwise”, and “B” is short for “BCubed”.
Table 3. Results of L-GCN and FC-Q algorithms with their label propagation modules swapped. “A+B” denotes using A for linkage prediction and B for label propagation.

We further swap the label propagation modules of L-GCN and FC-Q to validate the effectiveness of incorporating face quality. Table 3 shows that equipping L-GCN with our proposed label propagation module evidently improves its clustering precision with only a slight drop in recall. This is because our module further suppresses abnormal pairings based on the extra local quality information. The improvement is less significant for FC-Q, as its linkage prediction module has already incorporated face quality to output more reliable linkage likelihoods.

Fig. 4.

Effect of the self-attention mechanism. One attention map is shown on the left and three pivot-neighbor pairs on the right.

Figure 4 further shows the effectiveness of our modified self-attention mechanism. An attention map extracted from the last layer is shown on the left, which involves one pivot example and its \(K = 80\) nearest neighbors. The neighbors are sorted in descending order of similarity. The attention values are plotted on a logarithmic scale to better demonstrate the numerical differences. The sequence above the attention map describes the identity consistency, where a black dot indicates that the neighbor node at the same sequence position has a different identity. Three pivot-neighbor pairs are also shown on the right, with the quality score displayed on each image and the similarity below the arrow. The first pivot-neighbor pair in the red box and the second pair in the rose box have close similarity but opposite identity consistency. The self-attention mechanism successfully suppresses the attention values of the former neighbor node based on the large score difference prior. Note that although the third neighbor node in the black box has a different identity but a small score difference, the self-attention mechanism can still suppress its attention value through the relevancy aggregation among the involved nodes.

4 Conclusion

This paper has introduced a face clustering algorithm, referred to as FC-Q, to tackle face nodes of mixed quality. The algorithm takes the face quality score as an extra input, which is incorporated into the linkage prediction module and the label propagation module as a prior. The algorithm first modifies the Transformer encoder and uses a quality-based relevancy prior to infer more reliable linkage likelihoods. It then utilizes local quality information to further suppress abnormal pairings with high linkage likelihood. Experimental results validate that our algorithm achieves satisfactory clustering results under the general circumstance where the face quality is unknown or mixed.