Keywords

1 Introduction

Link prediction is a widely studied problem and receives considerable attention in data mining and machine learning in the past decades. It aims to infer a link which is not observed in current network or will arise in the future network [1,2,3,4,5,6,7,8]. The network object of link prediction research is a complex topology structure abstracted from real-world physical systems. In general, people observe the interactive system in the real-world, extract the entities in the system as the vertices, and the interaction relationship between entities as the edges, and construct a topological graph corresponding to the physical system, namely complex network model. Then, the network model is taken as the research object to explore some laws underlying the physical interaction system and simulate their evolution mechanism. However, due to the complexity of the real physical system, the extractive network models are often structurally incomplete. That is, there is always a missing situation of real information about the system in the complex network obtained through observation. The purpose of link prediction is to infer the missing or possible relationships in the future through this abstract complex network model, and to further study the evolution mechanism of real physical systems [9].

Because the research of link prediction problem is of great significance for the development of economy and society, its results are widely used in all walks of life in the real society [3, 4]. For example, analyze the evolution mechanism of the network [9], study the drug targeting relationship in the field of bioinformatics [10], realize the personalized recommendation of scenic spots or recommend new friends in social network [4, 11], and identify criminals in the field of public security [12,13,14,15].

At present, with the development of mobile Internet network, the amount of social information increases rapidly. When this real interaction system is abstracted into complex networks, the corresponding number of vertices becomes extremely large. However, the interaction relationship between the nodes did not grow significantly with the node scale. This phenomenon leads to exist many vertices in the extracted complex networks, but the edges between them appear extremely sparse. The phenomenon that the number of links known in the network is much less than the number of no links is called the structural sparsity problem. This problem has a very large impact on the performance of the link prediction [16, 17]. Therefore, how to solve the problem of declining prediction performance due to structural sparsity in large-scale networks becomes a challenge for link prediction. The motivation of this paper is to study the fusion problem of node attributes, so as to dig out the auxiliary information that can compensate for the sparsity of network structure, and build a multi-source information integration mode to realize the improvement of link prediction performance.

Recently, Social platforms based on mobile Internet networks are very frequently used, many network datasets appear with both the topology and node attribute information. For example, a webpage (i.e., vertex) can be associated with other webpages via hyperlinks, and it may have some inherent attributes of itself, like the text description in the webpage. Such type of networks is known as attributed networks. Some studies have shown that the degradation of the link prediction performance due to the sparse structure can be alleviated to some extent by using the node attribute information [17]. Recently, some link prediction methods are proposed based on attribute networks [18,19,20,21,22]. However, due to the diversity and heterogeneity of information and the variability of fusion methods, these algorithms either have poor overall prediction, or lack sufficient migration and robustness, or have too high computational complexity to adapt to large-scale networks. Therefore, the problem of how to reasonably integrate the structure and node attribute information has largely not been successfully solved.

Non-negative matrix factorization (NMF) is an important technique in the field of machine learning [23]. It can integrate heterogeneous information and promote each factor information to play a potential role [24]. In general, for a given matrix \(X\in {R}_{+}^{n\times m}\), the NMF algorithm tries to find two non-negative factor matrices \(B\in {R}_{+}^{n\times k}\) and \(C\in {R}_{+}^{k\times m}\), make \(X={X}^{\mathrm{^{\prime}}}\approx BC\). Where the \(k\) is called internal rank or hidden space, it satisfies \(\left( {m + n} \right)k \ll m\). The solution of NMF usually transforms into an optimization problem of finding \(min_{B \ge 0,C \ge 0} L\left( {X,BC} \right)\), and the symbol \(L\left( {.,.} \right)\) represents a certain loss function, such as Euclidean distance, KL divergence, or IS divergence. Given the Euclidean distance, the above optimization method can be converted to \(min_{B,C}|| X - BC||_F^2\), and the \( B \ge 0,C \ge 0\). Symbols \(|| \cdot ||_F\) indicate the Frobenius norm. The F-norm of a general matrix is usually defined as \(||X||_F = \sqrt { \sum \limits_{ij} \left| {x_{ij} } \right|^2 } = \sqrt {tr\left( {X^T X} \right)}\).

Considering the advantages of NMF models when incorporating multi-source information. In this paper, we introduce a Joint Weighted Nonnegative Matrix Factorization method for link prediction on attributed networks, namely JWNMF. For a given attributed network, our method presents a mechanism by using joint-NMF to integrate the structural and attribute information. Specifically, we design two matrix factorization terms. One is modeling the topology structure and the other is for attributes. Meanwhile, we modify the NMF by introducing a weighting variable for each attribute, which can be automatically updated and determined in each iteration.

Experiments are performed on five real-world attribute network datasets. The results show the advantages of performance of JWNMF model comparison with the benchmark methods and advanced algorithms.

The rest article develops as follows. Section 2 shows the related works. Section 3 is the network description and the problem definition. Section 4 is about the establishment of the proposed model and its optimization. Section 5 is experimental design and results analysis. The last part contains our conclusions and prospects.

2 Related Work

As a research hotspot in the field of complex network science, link prediction has been widely concerned by researchers in recent years. However, there are not much studies to fuse non-topological information like node attributes with network topological information and then realize link prediction, especially the framework based on NMF. Han et al. [16] used the configuration files of online social-contact users and other non-topological information, such as workplace and school to compute the attribute similarity, for counting the number of attributes the users all possess and the geographic distance between the users. Then, they proposed a prediction model based on support vector machines. Wang et al. [17] extracted topological and non-topological information by an implicit feature representation model, then proposed a link prediction method for missing link. Li et al. [18] proposed a link prediction for dynamic attributed networks. Moreover, for attribute networks with isolated nodes, the literature [17, 19,20,21,22] makes full use of attribute information to achieve link prediction on semi-structured networks.

However, it is difficult to integrate multi-source heterogeneous information and make them work in experimental prediction tasks. In this respect, the method based on matrix factorization is widely used [23, 24]. Menon et al. [25] proposed a link prediction algorithm based on the matrix factorization. Pech et al. [26] proposed a matrix filling-based link prediction method using the matrix filling principle in the field of recommendation systems. For the network topology sparsity, Chen et al. [27] proposed a link prediction model of robust NMF by using manifold regularization and sparse learning. To make full use of the node attribute information, Chen et al. [28] proposed a link prediction model incorporating node attribute information based on NMF, but the time complexity of their algorithm is high. Jiao et al. [29] proposed a Link predication model based on matrix factorization. This model fused multi class organizations information of network. They take advantage of the auxiliary information beyond the node attributes. Chen et al. [30] proposed a novel link prediction model based on deep NMF, which elegantly fuses topology and sparsity-constrained to perform link prediction tasks. Inspired by the matrix perturbation principle, Wang et al. [31] proposed a perturbation-based model for NMF link prediction. Moreover, there are also some NMF-based prediction models, they are used in dynamic time-varying networks [32, 33].

3 Preliminary

In this section, we introduce the formalized description of the problem of link prediction, and the network definition.

3.1 Network Representation

Given an undirected attribute network \(G\left( {V,E,A} \right)\) with n nodes, where \(V = \left\{ {v_1 ,v_2 , \cdots v_n } \right\}\) is the set of nodes and \(E = \left\{ {\left( {v_i ,v_j } \right), 1 \le i \le n, 1 \le j \le n, i \ne j} \right\}\) is the set of edges. The \(A\) is the set of attributes of all nodes in network. For the network \(G\) with \(n\) vertex, there are \(m\) attributes value for each vertex. These attributes are available to be represented by a matrix \(A_{n \times m}\). Each row of the matrix \(A_{n \times m}\) represents an attribute vector of the corresponding node \(v_i\). If the node \(v_{\text{i}}\) has the \(k\)-th attribute value, then \(A_{{\text{ik}}} = 1\), otherwise \(A_{{\text{ik}}} = 0\). The topology structure of the attribute network is represented by an adjacency matrix \(S_{n \times n}\). The element of the ith row and the jth column in the matrix correspond to the link between nodes \(v_i\) and \(v_j\) in the network, where \(S_{ij} = 1\) if there is a link from \(v_i\) to \(v_j\) and \(S_{ij} = 0\) otherwise. Multiple edges between two nodes and back edges on single nodes are not allowed.

3.2 Link Prediction Problem

The purpose of link prediction is to infer the probability \(P_{ij}\) of the existence of an edge between any two nodes \(v_i\) and \(v_j\) by using the known information in the network. In general, based on the sociological principle that “the more similar people are more likely to be connected”, the \(P_{ij}\) is treated as some similarity between nodes \(v_i\) and \(v_j\). The higher \(P_{ij}\), the more similar \(v_i\) and \(v_j\) are, and the more likely \(v_i\) is to form a link with \(v_j\). For a given observation network G, the \(P_{ij}\) probability of forming edges between unconnected nodes is inferred through the model proposed. The predicted values are then arranged in descending order, and the pairs of nodes at the top are considered the most likely to form connections. In this paper, we compute the score \(P_{xy}\) based on JWNMF model.

4 Proposed Method

In this section, we will introduce our proposed method in detail, which aims to fuse the attribute information of the nodes into the link prediction process.

4.1 Link Prediction Model: JWNMF

Excavating the available information and constructing a reasonable information fusion mode are the main ideas to solve the problem of network topology sparsity, and realize the link prediction task. Therefore, the basic framework of NMF is used to fully integrate the node attributes and network structure information to compensate for the defects of incomplete topological information, to realize the link prediction task and improve the performance in this paper. First, based on the basic principle of NMF, the adjacency matrix S representing the network topology is decomposed into the product of two non-negative factor matrices, namely S≈VVT, and the matrix A representing the node attribute information decomposed into A≈ZUT. However, the aim of this paper is to address information integration. Therefore, in order to enable the network structure and node attribute information to fully integrate and play a leading role in the link prediction, we need to attach certain constraint rules to their decomposed factor matrix. Inspired by the methods described in ref [24], which often delivers promising results for graph clustering, we apply the idea for attributed graph link prediction. Here, the hidden space V after the network structure information S is decomposed is approximately equal to the hidden space Z of the node attribute information, so that it can remain the same in the process of model learning, so as to achieve the purpose of mutual fusion and constraining the network structure and node attribute information. Therefore, the partial information of the attribute A is decomposed into the hidden space V of the structure information, namely A≈VUT. When the two-source information is integrated in a unified framework and uses Euclidean distance as a loss function, the overall model framework for the link prediction task is expressed as follows:

$$ L = min_{V,U} ||S - VV^T ||_F^2 + \lambda ||A - VU^T ||_F^2 s.t. V \ge 0,U \ge 0, $$
(1)

where \(S \in R_+^{n \times n}\), A \(\in R_+^{n \times m}\), the factor matrix \(U \in R_+^{m \times k} { }and{ }V \in R_+^{n \times k}\) represent the hidden space that integrates topological structure and node attribute information, \(R_+\) represents non-negative real number sets. The parameter \(\lambda > 0\) balance the availability of structure and attribute information.

Since the node attributes in the network are easy to mix with noise, in order to further reduce the impact of the noise on the prediction results, and promote the guiding role of the attribute information in predicting the network structure information, we also introduce a matrix \(W\) to assign a weight for each attribute. At this point, the decomposition form can be expressed as \(AW \approx VU^T\). By assigning a weight information to each node attribute with the matrix W, the effect of similarity between the node attributes can be uniformly integrated into the structure information to provide a promotion for the final results of link prediction. The weight matrix W is set to a diagonal matrix, which satisfies \(\mathop \sum \limits_{i = 1}^m W_{i,i} = 1\). After introducing the weight matrix W, the complete objective function is expressed as follows:

$$ L = min_{V,U} ||S - VV^T ||_F^2 + \lambda ||AW - VU^T ||_F^2 s.t. V \ge 0,U \ge 0, $$
(2)

where, the weight matrix \(W\epsilon R_+^{m \times m}\). To ensure that the \(W\) weights are assigned to a rule space, update operations need to be normalized to:

$$ W = \frac{W}{{\sum_{i = 1}^m W_{i,i} }}, $$
(3)

4.2 Update Rules

The solution of model \(L\) is difficult to obtain the global optimal solution, but the local optimal solution can be realized by the multiplicative iterative method. Therefore, for \(V\), \(U\) and \(W\) three factor matrices, introduce their corresponding non-negative Lagrangian multiplier \({\upalpha },{\upbeta },{\upgamma }\), thus replacing the objective function Eq. (2) with an unconstrained loss function form:

$$ L = \frac{1}{2}\left( {||S - VV^T ||_F^2 + \lambda ||AW - VU^T ||_F^2 } \right) + Tr\left( {\alpha^T V} \right) + Tr\left( {\beta^T U} \right) + Tr\left( {\gamma^T W} \right), $$
(4)

Simplified the Eq. (4) and take the partial differentiations of \(L\) for \(V,U,W\) respectively, then

$$ \frac{\partial L}{{\partial V}} = - \left( {SV + S^T V + \lambda AWU} \right) + 2VV^T V + \lambda VU^T U + \alpha , $$
(5)
$$ \frac{\partial L}{{\partial U}} = - \lambda WA^T V + \lambda UV^T V + \beta , $$
(6)
$$ \frac{\partial L}{{\partial W}} = - \lambda A^T VU^T + \lambda A^T AW + \gamma , $$
(7)

In this regard, according to complementary relaxation condition of the Karush-Kuhn-Tucker (KKT), we have \({\alpha V} = 0,\beta U = 0,\gamma W = 0\). Set \(\frac{\partial L}{{\partial V}} = 0,\frac{\partial L}{{\partial U}} = 0,{ }\frac{\partial L}{{\partial W}} = 0\), then the update rule for \(V,U,W\) is obtained.

$$ V \leftarrow V\frac{SV + S^T V + \lambda AWU}{{2VV^T V + \lambda VU^T U}}, $$
(8)
$$ U \leftarrow U\frac{WA^T V}{{UV^T V}}, $$
(9)
$$ W \leftarrow W\frac{A^T VU^T }{{A^T AW}}, $$
(10)

The above update rules Eq. (8) - Eq. (10) can be solved by element value or by matrix form as a whole. During the model learning training, the three-factor matrix \(V,U,W\) is obtained based on the convergence condition of the objective function. Then, the approximate solution of original network topology structure is solved by using the decomposition formula \(V \times V^T\). That is, after learning the matrix \(V\) through model training, we can obtain the similarity score between any two nodes in the network, or the probability \(P_{ij}\) of exist edge between two nodes, and finally realize the link prediction task.

In general, in the learning process, the model will seek the local optimal solutions of \(V,U,W\) using the update rules. However, before implementing the update operation, the adjacency matrix S and the attribute matrix A need to be preprocessed and given an initial value.

$$ S = \frac{S}{{\sum_{i = 1}^n \sum_{j = 1}^n S_{i,j} }}, $$
(11)
$$ A = \frac{A}{{\sum_{i = 1}^n \sum_{j = 1}^m A_{i,j} }}, $$
(12)

Note that updates the weight matrix W use Eq. (3).

The model JWNMF integrates the network structure and node attribute information through the NMF framework, and assigns a weight constraint information to each node attribute through the introduced diagonal matrix W, so that the network structure and node attributes can maximize their respective roles in the model training and learning process to serve the final prediction results. A schematic diagram of the principle of the model JWNMF is shown in Fig. 1.

Fig. 1.
figure 1

A schematic diagram of the principle of JWNMF model.

In conclusion, according to the basic principles of the proposed JWNMF model, the pseudo-code description of the algorithm is designed as follows (shown in Table 1).

Table 1. Pseudo-code description of JWNMF algorithm

The experimental environment of this paper is based on the operating system of windows10 of x86 computer, and then the simulation experiment of link prediction is implemented with Matlab tool programming. Here, the computational complexity is discussed. The computational complexity of JWNMF algorithm comes mainly from the time cost when iteratively updating the matrix \(V,U,W\). For a given network \(G\left( {V,E} \right)\), the number of vertices \(V\) is \(n\), and each vertex has \(m\) attributes. When updating \(V\), \(U\) and \(W\), to reduce the time overhead, we utilize the objective relative error as the stopping criterion and set to less than \(10^{ - 6}\) in experiment. Moreover, the dimension \(k\) after the matrix decomposition is a constant. Supposing the algorithm stops after t iterations, the overall cost for Symmetric NMF is \(O\left( {n^2 kt} \right)\). As the objective function adds one more linear matrix factorization term, the overall cost for updating rules is \(O\left( {(n^2 k + m^2 k + mnk} \right)t)\). According to the analysis rules of the time complexity of computer algorithms, when the scale n tends to infinity, the worst case of the time complexity of the model can be approximated by \(O\left( {n^2 } \right)\).

5 Experiment

This section mainly shows and analyzes the model prediction performance. Next, we will describe the datasets, comparison methods, evaluation metrics, and discussion of experimental results.

5.1 Datasets

This subsection mainly describes the basic topology of the datasets used in this paper, and the method of dividing training set and testing set.

To verify the model prediction performance, five real-world attribute network datasets widely used in the link prediction field were selected.

Table 2. Topological information of the network datasets

The basic topological properties of these network datasets are listed in Table 2. Where the symbol N represents the total number of network nodes, E represents the total number of existing links, < K > is the network average degree, < d > is the average shortest distance, C is the clustering coefficient, and #attributes represent dimension of node attributes. These network datasets used for the experiment can be downloaded from the following web sites. http://vladowiki.fmf.uni-lj.si/doku.php?id=pajek:data:urls:index; http://snap.stanford.edu/data/. For a detailed description of the data set, please also see the above website introduction.

5.2 Datasets Division Method

When comparing the prediction performance of the algorithm, the given network dataset needs to be divided according to the basic principles of machine learning. It is divided into training set and test set. There are many methods to divide data sets, and k-fold cross validation is used in this paper. The sample dataset was randomly divided into k parts, one of which was selected as test set and the remaining as training set, and then a prediction accuracy was calculated, so repeated k times. The prediction accuracy of the algorithm on the entire network dataset is the average of k prediction accuracies. In practical partitioning, k-taking 10 is a common method.

5.3 Evaluation Metrics

Like many existing link prediction studies, in our work adopts also the most frequently-used metrics AUC (area under the ROC curve) and the Precision to measure the performance of algorithm proposed. These metrics are viewed as a robust measure in the presence of data imbalance, which are also one of the most popular indices of evaluation link prediction. For more details on these two evaluation methods, readers can refer to the literature [1,2,3,4].

5.4 Baseline Methods

To validate the predictive performance of the newly designed algorithms, people usually select some benchmark methods and those representative up-to-date algorithms from the literature as the reference objects for comparative analysis. Generally, in order to reflect the fairness of comparison, the design principle of the comparison method selected is usually similar to the algorithm proposed. Therefore, in the experiment, several state-of-the-art algorithms based on NMF framework design often used in the link prediction research field are selected as reference objects. The benchmark methods are mainly structural similarity based classical algorithms.

We list four types of link prediction methods as the benchmark methods, including eleven local similarity indices based on the number of common neighbours between pairs of nodes (CN, AA, RA, PA, Salton, Jaccard, Sorenson, HPI, HDI, LHN and TSCN), four random walk methods (ACT, CosPlus, LRW, SRW), three local path methods (LocalP, Katz, LHN-II) and four other similarity algorithms (MFI, LNBCN, LNBAA, LNBRA). The mathematical expressions of these methods and their definitions can be found in ref. [1,2,3,4].

In addition, four state-of-the-art NMF-based link prediction algorithms were used as comparison methods in the experiment. They are the original NMF method [23], matrix fill method (MC) [26], the attribute-based NMF method (NMF_LP) [28] and the NMF method based on the perturbation principle (SPM_NMF) [31] respectively.

5.5 Experimental Results and Analysis

This section provides a comprehensive analysis of the predictive performance of JWNMF model. Experiments were performed on five real-world network datasets. The prediction performance of various algorithms is fairly judged by two evaluation indicators, Precision and AUC, and show the final evaluation results. In experiment, the datasets were divided into test set Etest and training set Etrain in different proportions. The results of experimental simulation are analyzed by taking the overall average at each proportion. Typically, the value of average prediction accuracy obtained from 100 independent simulations via Precision or AUC are taken as the final performance results.

5.6 Model Parameter Setting

To adjust the prediction performance of the JWNMF model to the optimum, the parameter \(\lambda\) in the model was analyzed before the start of the experiment. Figure 2 shows the predictive performance of the model when its parameter values are in the range from 0 to 3. Through the comparative analysis, the local optimum of the parameters \(\lambda\) is finally determined. In experiment, the parameter \(\lambda\) of JWNMF model takes a value of 0.09.

Fig.2.
figure 2

Analysis parameter \({{\varvec{\lambda}}}\) of JWNMF model

5.7 Performance Analysis

According to the conventional way in the field of link prediction research, each experimental data set is first divided by the ratio of 20% to 90%, and the step size is 10, with a total of 8 proportions. Assuming that the total number of existent edges are |E|=m in the network. It indicates that 20% of the m are used for the training set when the partition ratio is 20%, while the remaining 80% is used as the test set.

Table 3. Predictive results via AUC metrics on five network datasets

The JWNMF model was trained on this training set together with the benchmark and contrast methods. To judge their prediction performance, the test set is then used to measure the effect. Each experiment needs to be run at least 100 times independently and then averaged as the result. Although the many results generated by experiments, considering the universality and representativeness, Table 3 only shows the overall prediction effect of each method in the data set divided by 50%, the training set and the test set are in half each. The predictions values are shown in Table 3 by using AUC as the evaluation criterion.

From the numerical results presented in Table 3, The JWNMF method led the prediction performance on three datasets: Texas, Cornell and Lazega, but performed poorly on the Washington and Facebook datasets. As the overall analysis, the proposed JWNMF model showed good prediction performance on five datasets of attribute networks. This also shows that when implementing the link prediction, it can mine the external information such as the attributes of the nodes as an auxiliary, which can significantly improve the performance of the link prediction algorithm. This is significantly better effective than simply using structural information. Moreover, for networks with extremely sparse structure, using this external auxiliary information is more helpful to compensate for the insufficient performance problem caused by the sparse topological structure. Of course, the question of how to mine this auxiliary information and which external auxiliary information works better for the prediction is still under discussion. In order to show the overall predictive performance of the various methods more deeply, Fig. 3 shows the prediction effect when the data set is partitioned at 50% with Precision as the evaluation criterion.

Fig. 3.
figure 3

Performance comparison based on the Precision metic

Above, we briefly mention that the network topology sparsity has obvious effects on the prediction performance of the algorithm. To demonstrate this problem more specifically, many experiments were deliberately designed and completed during the study. In these experiments, the network dataset was divided from dense to sparse in a ratio of 90% to 20%, and under each division scheme, the prediction performance of each algorithm is verified by Precision and AUC standards, to test the impact of network topology on the prediction results of each algorithm. Moreover, it is also used to verify the adaptability and robustness of each algorithm under the different degrees of sparsity at network topology.

Taking the Facebook dataset as an example, the AUC values of each algorithm after the different partition proportions are shown in Table 4.

Table 4. The AUC value under different partitioning of Facebook dataset

From the analysis of these values, it can be seen that as the topology of the network gradually changes from dense to sparse, the prediction performance of the algorithm will have a significant downward trend. However, the prediction algorithm based on the JWNMF model still performs well at all proportions. This shows when facing different sparse degree of network topology, it can make full use of various external auxiliary information and compensate for the lack of topological information due to structure sparsity. Thus, it basically ensures the prediction performance of the algorithm in abnormal cases, and improves the adaptability and robustness of the algorithm. Figure 4 shows this result more visually.

Fig. 4.
figure 4

The AUC value under different partition of Facebook dataset

Similarly, with Precision as the evaluation criterion, we also compared the prediction performance of the various algorithms at different proportions of Facebook dataset (in Fig. 5). Although the model JWNMF is not the best under each partitioning scheme, it still shows a good prediction effect.

Fig. 5.
figure 5

The Precision value under different partition of Facebook dataset

6 Conclusion

In recent years, link prediction based on network topology has been one of the research hotspots in the field of data mining. However, in many cases, those algorithms that use only the information of network topology do not provide the accuracy required for link prediction, when the network topology is in an extremely sparse state. Furthermore, real-world networks are often sparse and contain noise, which makes the predictive performance of the algorithm very strongly correlated with the properties of the network itself. For these extremely sparse and noisy networks, the ultimate effect is not ideal if only the structural information is used to complete the prediction task. At present, with the development of mobile Internet, it is more and more convenient to obtain the non-topological information of network. This provides a hope for link prediction research.

In this paper, considering the advantages of NMF that is interpretability, nonnegative, and information fusion, we propose a link prediction model of weighted NMF. By designing a weighted matrix w to process the attribute information of each node, both the structure and attribute information fused into the NMF framework can fully play a guiding role in the link prediction task, thus solving the problem of structure sparsity and improving the prediction performance of the algorithm. Although our method can significantly improve the performance of link prediction on sparse networks, its temporal complexity is relatively high. This is also a direction that we need to improve in the future. In addition, we also consider the cold-start link prediction of complex network in a semi-structured state as another target for future studies.