
1 Introduction

Social networks are ubiquitous in our daily lives, ranging from online social-networking sites such as Facebook and Weibo to biological gene-disease networks [6, 18] and citation networks [14, 16]. Network embedding, which maps each node into a latent space, has become an effective tool for social network data mining. As a result, various data mining tasks, e.g., group clustering [27], link prediction [21], anomaly detection [3], and node classification [2, 21], can be conducted directly in the latent space.

Early works (e.g., DeepWalk [24], LINE [28], node2vec [9]) explore the effect of local and global structural similarity [8, 12, 17, 31] on network embedding. Local structural similarity indicates that a pair of nodes connected by an edge is more likely to be similar. Global structural similarity, on the other hand, reveals the status of a node in the network; for instance, a pair of nodes with similar structural neighborhoods tends to be closer. However, these methods fall short because they ignore the rich attribute information attached to nodes. Recently, several algorithms have turned to attributed network embedding, leveraging both structural and attribute information. LANE [10] only takes local attribute similarity into consideration, i.e., nodes with similar attributes are more likely to belong to the same community. UPP-SNE [35] is more inclined to capture global structural and attribute similarity under the framework of DeepWalk. How to jointly extract the local and global information in both structure and attribute remains an open problem.

In this paper, we propose a Neural-based Attributed Network Embedding framework named NANE to address the aforementioned problem. Our approach comprehensively considers various sources of information, including the local and global information in both structure and attribute. We employ an autoencoder model to encode the global attribute and structural information into a latent space, with a pairwise constraint to preserve the local information. Specifically, we regard the attribute similarity of two nodes as an uncertain link, where the similarity indicates the probability that the two nodes are connected. On top of that, an affinity matrix is introduced to represent the global attribute proximity.

In summary, the main contributions of this paper are as follows:

  • We propose a generic neural-based framework, NANE, to represent attributed social networks, capturing both structural and attribute non-linear similarity. To the best of our knowledge, our model is the first attempt to jointly capture the local and global similarity in both structure and attribute. Specifically, we introduce an affinity matrix to measure the global attribute similarity among nodes.

  • We empirically conduct experiments on three real-world datasets with multi-class node classification and node clustering tasks. Comparing with cutting-edge network embedding algorithms, we demonstrate the effectiveness of NANE.

The rest of this paper is organized as follows. We discuss related work in Sect. 2 and give the definitions used in our work in Sect. 3. We then propose the NANE algorithm in Sect. 4, followed by experiments in Sect. 5. Finally, Sect. 6 concludes the paper and outlines future work.

2 Related Work

In this section, we briefly summarize the development of network embedding methods, which we divide into two categories. The first, structure-based network embedding, focuses on structural information only. The second, content-aware network embedding, combines additional information with the network structure to obtain better graph representations.

2.1 Structure-Based Network Embedding

Some earlier works utilize manifold learning to capture structural proximity [1, 25, 29]. More recently, inspired by Skip-Gram [19, 20], DeepWalk [24] employs random walks to generate sequences of nodes and feeds them into Skip-Gram to capture the similarity of nodes. However, DeepWalk is prone to preserving global structural proximity owing to the limitations of random walks. Building on it, node2vec [9] introduces biased random walks to balance global and local structural information. LINE [28] designs an explicit objective function to capture both global and local structural information, but it lacks a deeper fusion of the two.

On the other hand, some researchers introduce neural networks to capture the highly non-linear properties of nodes. DNGR [5] adopts a random surfing model to capture the network structure and compresses it into a low-dimensional representation with an SDAE model. Similarly, SDNE [33] exploits a deep neural network with a supervised component to extract structural information.

All the methods mentioned above rely on network structure only, without considering attribute information, which is highly beneficial to graph representation.

2.2 Content-Aware Network Embedding

Some recent efforts have been devoted to leveraging additional information for a better performance.

TADW [34], a text-associated DeepWalk model, creatively explores the contribution of node contents to network embedding. It employs matrix factorization to encode text features into network representation learning. TriDNR [22] exploits DeepWalk and Doc2Vec [13] to extract structural information and content information respectively. Because it relies on a late fusion that is only a series of weighted sums, it ignores the correlation between structure and contents. Moreover, these two models, which consider node contents only, cannot handle noisy attribute information.

LANE [10] is the first attempt to utilize label information together with node attributes. It uses three matrices to describe a network: the network adjacency matrix, the node content-level similarity matrix and the node label-level similarity matrix, and then maps them into the same vector space by dimensionality reduction. LANE learns node embeddings with matrix factorization, which cannot reflect the non-linear correlation between nodes. UPP-SNE [35] is the first attempt to take user profiles into consideration. The basic idea is to extract the similarity of nodes by random walks and to embed user profile information via a non-linear mapping into a low-dimensional latent space. However, UPP-SNE is more inclined to extract the global proximity and ignores the local information.

3 Problem Definition

We first summarize the notations used in this paper. We denote scalars by lowercase letters (e.g., n), vectors by boldface lowercase letters (e.g., s), and matrices by boldface uppercase letters (e.g., S). s\(_i\) is the i\(^{th}\) row of a matrix S, and its (i,j)\(^{th}\) element is denoted by s\(_{\textit{ij}}\). We list the main notations in Table 1.

We regard a social network as a homogeneous attributed network, which means that there is only one type of relationship between nodes, each edge is undirected, and some nodes carry attributes. Under these assumptions, we denote a social graph as \(\mathcal {G}\) = (\(\mathcal {V}\), \(\mathcal {E}\), U), where \(\mathcal {V}\) is a set of n nodes, \(\mathcal {E}\) is a set of edges and U is the node attribute matrix. A is the adjacency matrix, and a\(_{ij}\) is defined as 1 if there is an edge between node i and node j.

The aim of this paper is to project the nodes into a low-dimensional vector space while preserving structural and attribute information jointly. It is therefore vital to map the matrices U and A into the same latent space. To this end, we propose a neural-based attributed network embedding model to learn a comprehensive representation.

Table 1. Main notations and descriptions

4 Methodology

In this section, we first elaborate the framework of our model NANE, which preserves structural and attribute information jointly. We then give the explicit loss function and its optimization in detail.

4.1 Overall Architecture

In this paper, we propose a neural-based attributed network embedding model to capture structural and attribute information jointly. As shown in Fig. 1, NANE consists of three major steps. Specifically, we first convert structural and attribute information into sets of binary features, and then leverage a self-feedforward layer to measure the weight of each feature. We then concatenate the outputs of the self-feedforward layer and feed them into a neural network. This early fusion makes it possible to optimize all parameters simultaneously. To capture the highly non-linear correlation between nodes, we utilize a deep autoencoder [26] to reconstruct the overall information, and design a specific loss function to preserve both local and global similarity. The details of each step are elaborated below.

Fig. 1. The framework of NANE

4.2 Global Information Encoding

To encode the global attribute and structural information, we introduce an affinity matrix S and an adjacency matrix A respectively. In this paper, we take a row of the adjacency matrix to represent structural information; for example, a\(_i\) is the structural information of vertex i. Furthermore, we use cosine similarity to represent the global attribute information: we take a row of the similarity matrix S as the attribute information, so s\(_i\) is the attribute similarity information of vertex i.

To generate the attribute affinity matrix S, we first need to define the user attribute information matrix U. Attribute information in social networks is rich and diverse, but it is inevitably incomplete and noisy due to its heterogeneity and the fact that it is often filled in manually. To tackle this problem, we first convert all discrete attributes, e.g., user demographics [4] and user interests [23], into a set of binary features by one-hot encoding. For instance, the marital status attribute has four values \(\left\{ \textit{married},\textit{single},\textit{divorced},\textit{widowed} \right\} \), so we can encode a married user as the vector v = \(\left\{ 1,0,0,0 \right\} \), where the first binary feature of value 1 represents married. For continuous attributes, e.g., age, we apply max-min normalization to reduce the impact of scale. For missing attribute values, we set the corresponding feature vector to all zeros. We aggregate all the feature vectors together to obtain the user attribute information matrix U, and we obtain \(s_{ij}\) by computing the cosine similarity of u\(_i\) and u\(_j\).
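As a concrete illustration, the following is a minimal sketch of this preprocessing step, assuming pandas and scikit-learn are available; the function and column names are hypothetical and not part of our implementation.

```python
import numpy as np
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

def build_affinity_matrix(profiles, discrete_cols, continuous_cols):
    """Return (U, S): the encoded attribute matrix and its cosine affinity matrix."""
    blocks = []
    for col in discrete_cols:
        # One-hot encoding; a missing (NaN) value yields an all-zero row.
        blocks.append(pd.get_dummies(profiles[col], dummy_na=False).to_numpy(dtype=float))
    for col in continuous_cols:
        # Max-min normalization of continuous attributes to [0, 1].
        v = profiles[col].to_numpy(dtype=float)
        rng = v.max() - v.min()
        scaled = (v - v.min()) / rng if rng > 0 else np.zeros_like(v)
        blocks.append(scaled.reshape(-1, 1))
    U = np.hstack(blocks)        # n x d attribute matrix
    S = cosine_similarity(U)     # s_ij = cosine similarity of u_i and u_j
    return U, S
```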

4.3 Self-Feedforward Layer

To weight the features in structure and attribute, we design a self-feedforward layer for both the structure and attribute embeddings. As shown in Fig. 1, the self-feedforward layer consists of two fully connected layers, which convert the structural and attribute features into two dense vectors respectively. We apply dropout, residual connections and layer normalization on the self-feedforward layer to prevent overfitting and improve training efficiency. The weights of the features are denoted as W\(_{struc}\) and W\(_{attr}\) respectively. The final output of the self-feedforward layer for vertex i is denoted as h\(_i^{(0)}\) and can be expressed as follows:

$$\begin{aligned} \mathbf h _{i}^{(0)}= \begin{bmatrix} Dropout(\sigma (\mathbf W _{struc}\cdot \mathbf a _{i})),\lambda Dropout(\sigma (\mathbf W _{attr}\cdot \mathbf s _{i})) \end{bmatrix} \end{aligned}$$
(1)

where \(\mathbf h _{i}^{(0)}\) denotes the input layer of the autoencoder for vertex i, \(\lambda \) adjusts the impact of attributes, \(\sigma \) denotes the activation function, and W\(_{struc}\) and W\(_{attr}\) are the parameters to be learned, which measure the weights of structural and attribute information respectively. For better convergence and to prevent overfitting, we add an \(\mathcal {L}\)2-norm regularizer in this part, which is defined as follows:

$$\begin{aligned} \mathcal {L}_{F-reg}=\left\| \mathbf W _{struc} \right\| _F^2 + \left\| \mathbf W _{attr} \right\| _F^2 \end{aligned}$$
(2)

where \(\left\| \cdot \right\| _F^2\) denotes the squared Frobenius norm.
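For illustration, the sketch below shows one possible PyTorch realization of the self-feedforward layer and its regularizer; the sigmoid activation and dropout rate are assumptions rather than our exact settings, and the residual connections mentioned above are omitted for brevity.

```python
import torch
import torch.nn as nn

class SelfFeedforward(nn.Module):
    """Two dense projections as in Eq. (1), with dropout and layer normalization."""
    def __init__(self, n_nodes, t, lam=1.0, p_drop=0.2):
        super().__init__()
        self.w_struc = nn.Linear(n_nodes, t)   # W_struc, applied to a_i (a row of A)
        self.w_attr = nn.Linear(n_nodes, t)    # W_attr, applied to s_i (a row of S)
        self.norm = nn.LayerNorm(t)
        self.drop = nn.Dropout(p_drop)
        self.lam = lam                         # lambda in Eq. (1)

    def forward(self, a_i, s_i):
        h_struc = self.drop(self.norm(torch.sigmoid(self.w_struc(a_i))))
        h_attr = self.drop(self.norm(torch.sigmoid(self.w_attr(s_i))))
        # h_i^(0): concatenation of the two dense vectors, attribute part scaled by lambda
        return torch.cat([h_struc, self.lam * h_attr], dim=-1)

    def l2_reg(self):
        # Eq. (2): squared Frobenius norms of the two weight matrices
        return self.w_struc.weight.pow(2).sum() + self.w_attr.weight.pow(2).sum()
```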

4.4 Reconstructing Procedure

To capture the local and global proximity in structure and attribute jointly, we extend a deep autoencoder model to measure the interplay between structure and attribute and encode them into a latent space. A deep autoencoder consists of two parts, i.e., the encoder and the decoder. At the encoding step, a multilayer non-linear function \(f(\cdot )\) maps the input data into a low-dimensional vector space, also called the embedding space. Conversely, a mirrored multilayer non-linear function \(g(\cdot )\) maps the embedding space back to the reconstruction space. In this paper, the input of the deep autoencoder is the output of the self-feedforward layer \(\mathbf h _{i}^{(0)}\), and the representation of the hidden layers can be denoted as follows:

$$\begin{aligned} \mathbf h _{i}^{(k+1)}=\sigma (\mathbf W ^{(k)} \cdot \mathbf h _{i}^{(k)} + \mathbf b ^{(k)}) \end{aligned}$$
(3)

where \(W^{(k)}\) and \(b^{(k)}\) are the weight matrix and biases of the k\(^{th}\) hidden layer.
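The following sketch shows one way to build such a mirrored encoder/decoder pair in PyTorch; the layer sizes here are placeholders and not the configurations used in our experiments.

```python
import torch.nn as nn

def build_autoencoder(in_dim, hidden_dims=(512, 256)):
    """Mirrored encoder/decoder stacks; hidden_dims[-1] is the embedding dimension."""
    dims = [in_dim, *hidden_dims]
    enc, dec = [], []
    for d_in, d_out in zip(dims[:-1], dims[1:]):
        enc += [nn.Linear(d_in, d_out), nn.Sigmoid()]      # encoder layers, Eq. (3)
    for d_in, d_out in zip(reversed(dims[1:]), reversed(dims[:-1])):
        dec += [nn.Linear(d_in, d_out), nn.Sigmoid()]      # mirrored decoder layers
    return nn.Sequential(*enc), nn.Sequential(*dec)
```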

The output of the reconstruction layer is denoted as \(\widehat{\mathbf{R }}\). Our goal in this step is to minimize the reconstruction error between output and input. The input data R to be reconstructed is defined as follows:

$$\begin{aligned} \mathbf R =\begin{bmatrix} \mathbf A , \lambda \mathbf S \end{bmatrix} \end{aligned}$$
(4)

The corresponding loss function is denoted as follows:

$$\begin{aligned} \mathcal {L}=\left\| \widehat{\mathbf{R }} -\mathbf R \right\| _F^2 \end{aligned}$$
(5)

However, due to the sparsity of the input data, the number of zero elements in R is much larger than the number of non-zero ones, so the model is more prone to reconstructing the zero elements rather than the non-zero ones, which is contrary to our intention. To address this problem, we introduce an offset coefficient matrix to reweight the elements, and the redesigned loss function is denoted as follows:

$$\begin{aligned} \mathcal {L}_{global}=\left\| (\widehat{\mathbf{R }}-\mathbf R )\odot \mathbf B \right\| _F^2 \end{aligned}$$
(6)

where B is the offset coefficient matrix with b\(_{ij}\)=1 when r\(_{ij}\)=0 and b\(_{ij}=\beta >1\) otherwise, and \(\odot \) denotes the Hadamard product. With the help of the deep autoencoder, vertices with similar features obtain similar representations. However, it is not only necessary to capture the global similarity, but also vital to preserve the local similarity: a pair of nodes connected by an edge should also have similar embeddings, i.e., the local structure similarity. Therefore, we add a loss function for local structure similarity, expressed as follows:

$$\begin{aligned} \mathcal {L}_{struc-local} = \sum _{i,j=1}^{n}a_{ij}\left\| \mathbf x _i-\mathbf x _j \right\| _2^2 \end{aligned}$$
(7)

where \(a_{ij}\) is an element of the adjacency matrix, with \(a_{ij}=1\) when there is a link between node i and node j, and \(x_{i}\) denotes the final embedding of node i. The local structure loss encourages the embeddings of two connected nodes to be close.

Similarly to the local structure similarity, we also design a loss function to constrain the local attribute similarity: a pair of nodes with similar attributes should also have similar embeddings. The loss function for local attribute similarity is denoted as follows:

$$\begin{aligned} \mathcal {L}_{attr-local} = \sum _{i,j=1}^{n}s_{ij}\left\| \mathbf x _i-\mathbf x _j \right\| _2^2 \end{aligned}$$
(8)

where \(s_{ij}\) is an element of the attribute similarity matrix; the greater the value of \(s_{ij}\), the more similar the attributes of node i and node j. Akin to the local structure loss, the local attribute loss encourages the embeddings of two nodes with similar attributes to be close.
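For clarity, the loss terms of Eqs. (6)-(8) can be written compactly as below; this is an illustrative PyTorch sketch, and the default value of \(\beta \) here is a placeholder rather than the tuned value reported in Sect. 5.3.

```python
import torch

def global_loss(r_hat, r, beta=10.0):
    # Offset matrix B (Eq. 6): beta > 1 on the non-zero entries of R, 1 elsewhere.
    b = torch.where(r != 0, torch.full_like(r, beta), torch.ones_like(r))
    return ((r_hat - r) * b).pow(2).sum()

def local_loss(w, x):
    # sum_{i,j} w_ij * ||x_i - x_j||_2^2, with w = A for Eq. (7) or w = S for Eq. (8)
    return (w * torch.cdist(x, x).pow(2)).sum()
```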

4.5 Loss Functions and Optimization

To preserve the global and local similarity simultaneously, we combine the aforementioned loss functions in an integrated framework and propose a unified structure and attribute preserving model, NANE. The joint loss function is denoted as follows:

$$\begin{aligned} \begin{aligned} \mathcal {L}&=\alpha \mathcal {L}_{global}+\gamma \mathcal {L}_{struc-local}+\theta \mathcal {L}_{attr-local}+\eta \mathcal {L}_{reg}\\&=\alpha \left\| (\widehat{\mathbf{R }}-\mathbf R )\odot \mathbf B \right\| _F^2+\gamma \sum _{i,j=1}^{n}a_{ij}\left\| \mathbf x _i-\mathbf x _j \right\| _2^2\\ {}&+\theta \sum _{i,j=1}^{n}s_{ij}\left\| \mathbf x _i-\mathbf x _j \right\| _2^2+\eta \mathcal {L}_{reg} \end{aligned} \end{aligned}$$
(9)

where \(\alpha \), \(\gamma \), \(\theta \) and \(\eta \) are four hyper-parameters that adjust the weight of each part. Moreover, \(\mathcal {L}_{reg}\) is the total regularization, which consists of two parts: the regularization of the self-feedforward layer \(\mathcal {L}_{F-reg}\) and the regularization of the deep autoencoder \(\mathcal {L}_{ae-reg}\). We define \(\mathcal {L}_{reg}\) as

$$\begin{aligned} \begin{aligned} \mathcal {L}_{reg}&=\mathcal {L}_{F-reg}+\mathcal {L}_{ae-reg}\\&=\left\| \mathbf W _{struc} \right\| _F^2 + \left\| \mathbf W _{attr} \right\| _F^2 + \sum _{k=1}^{K}(\left\| \mathbf W ^{(k)} \right\| _F^2+\left\| \mathbf b ^{(k)} \right\| _F^2) \end{aligned} \end{aligned}$$
(10)

To optimize the aforementioned framework, we apply RMSProp to minimize the objective in Eq. (9), which adjusts the learning rate for each parameter individually. We feed mini-batches into the model to accelerate training. Besides, in order to avoid falling into a poor local optimum, it is essential to find a good set of initial parameters. Therefore, we adopt a Restricted Boltzmann Machine to pre-train the parameters, a classic method of parameter initialization for neural networks [7]. The integrated algorithm is presented in Algorithm 1.

Algorithm 1
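To complement Algorithm 1, the following is a minimal training-loop sketch built on the earlier code sketches; it assumes a `model` combining the self-feedforward layer and autoencoder that returns embeddings and reconstructions, and reuses the `global_loss`/`local_loss` helpers. The RBM pre-training step is omitted, the defaults for \(\alpha \), \(\gamma \), \(\theta \) and \(\eta \) follow Sect. 5.3, and the learning rate, epoch count and batch size are placeholders.

```python
import torch

def train_nane(model, A, S, alpha=1000.0, gamma=1.0, theta=1.0, eta=1.0,
               lr=1e-3, epochs=100, batch_size=600):
    """model(a, s) is assumed to return (embeddings x, reconstructions r_hat)."""
    opt = torch.optim.RMSprop(model.parameters(), lr=lr)
    n = A.shape[0]
    for _ in range(epochs):
        for idx in torch.randperm(n).split(batch_size):
            a, s = A[idx], S[idx]
            x, r_hat = model(a, s)
            r = torch.cat([a, model.lam * s], dim=1)         # Eq. (4), restricted to the batch
            loss = (alpha * global_loss(r_hat, r)            # Eq. (6)
                    + gamma * local_loss(A[idx][:, idx], x)  # Eq. (7), within-batch pairs
                    + theta * local_loss(S[idx][:, idx], x)  # Eq. (8), within-batch pairs
                    + eta * model.l2_reg())                  # regularization, Eq. (10)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model
```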

5 Experiments

In this section, we evaluate our method by performing experiments on three real-world network datasets and compare it with several state-of-the-art baseline algorithms.

5.1 Datasets

We conduct experiments on three real-world networks: Facebook, Hamilton and Rochester. The statistics of the three datasets are summarized in Table 2.

Ego-Facebook is an ego-network collected from survey participants using the Facebook app. The dataset contains 4039 nodes, 88234 edges and 1403-dimensional node features. Education type is used as the group label [15].

Hamilton and Rochester are two datasets collected by Adam D'Angelo of Facebook, drawn from the Facebook networks of 100 American institutions [30]. Each node has 7 attributes: student/faculty status flag, gender, major, second major/minor, dorm/house, year, and high school, described by 144-dimensional and 235-dimensional feature vectors respectively; the student/faculty status flag is used as the class label. The two datasets contain 2314 nodes with 192788 edges, and 4563 nodes with 322808 edges, respectively.

Table 2. The statistics of the three datasets

5.2 Baseline Methods

We compare NANE with several state-of-the-art network embedding methods, which are divided into two categories.

Structure-Based NRL Methods

  • DeepWalk [24] generates node sequences by random walks and feeds them into the Skip-Gram model to learn network embeddings.

  • node2vec [9] introduces biased random walks into DeepWalk, aiming to capture local and global structure jointly.

  • LINE [28] uses two separate objective functions to capture both first-order and second-order proximity.

  • SDNE [33] leverages a deep neural network, a non-linear mapping, to exploit structural information.

Attribute-Aware NRL Methods

  • LANE [10] fuses structural, attribute and label information together to preserve node similarity. Here, we only use the version without utilizing label information.

  • UPP-SNE [35] generates random walks to capture node pairwise similarity and embeds user profile information into a low-dimensional vector space.

5.3 Parameter Settings

For a fair comparison, we set the embedding dimension to 256 for all methods. In DeepWalk, node2vec and UPP-SNE, we set the window size t to 10 and the walk length l to 80. In node2vec, we empirically set the return hyperparameter p to 2.0 and the in-out hyperparameter q to 0.5. In LINE, we set both the first-order and second-order vector dimensions to 128. In SDNE, we set the weights \(\gamma ,\alpha ,\beta \) to 1, 500 and 100 respectively, and the learning rate to 0.01. In LANE, we tune the hyper-parameters \(\beta _1,\beta _2\) by grid search. In UPP-SNE, we set the number of iterations to 20. For our method, we use a grid search to find the best parameters: we set \(\alpha \) to 1000, \(\beta \) to 100, and \(\gamma \), \(\theta \) and \(\eta \) to 1. The remaining parameters are given in Table 3.

Table 3. Parameter settings of the three datasets

5.4 Node Classification

We first evaluate the effectiveness of NANE by multi-class node classification. For fairness, we vary the training size from 15% to 75% in steps of 10%, and apply an rbf-kernel SVM classifier with \(\gamma =100\) to all of the generated node representations. For each training size, we split the train and test sets randomly 10 times, conduct 5-fold cross-validation, and report the average micro F1-score of node classification on the three datasets.
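A sketch of this evaluation protocol, assuming scikit-learn, is given below; the 5-fold cross-validation performed within each split is omitted for brevity, and the function name is ours.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

def micro_f1(X, y, train_ratio, runs=10, seed=0):
    """Average micro F1 over `runs` random splits at a given training ratio."""
    scores = []
    for r in range(runs):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, train_size=train_ratio, random_state=seed + r, stratify=y)
        clf = SVC(kernel="rbf", gamma=100).fit(X_tr, y_tr)
        scores.append(f1_score(y_te, clf.predict(X_te), average="micro"))
    return float(np.mean(scores))
```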

Table 4. Node classification F1-score(%) on Facebook
Table 5. Node classification F1-score(%) on Hamilton
Table 6. Node classification F1-score(%) on Rochester

Tables 4, 5 and 6 show the average classification accuracy of all methods on Facebook, Hamilton and Rochester, with the best results in bold. Our method consistently yields the best classification results among all baselines. The clear gain over the structure-based NRL algorithms demonstrates that our method fuses attribute information well. On Facebook in particular, NANE achieves more than 30% improvement over all structure-based NRL baselines, pointing to its significant performance on node classification. Moreover, NANE also outperforms all attribute-aware NRL baselines, demonstrating its effectiveness in capturing local and global similarity in structure and attribute. Our method performs an early fusion of structural and attribute information, which enables it to exploit the interplay between structure and attribute. On top of that, it encodes the global information with a pairwise constraint, leading to remarkable results.

5.5 Parameter Sensitivity

In this section, we explore the parameter sensitivity of our method. We select three crucial parameters, the batch size, \(\lambda \) and t, and investigate how they affect the results. The training ratio is set to 55%. We report the node classification accuracy on the different datasets under different parameter settings, studying the effect of one parameter in turn while fixing the rest. The results are shown in Fig. 2.

We first show how the batch size affects performance in Fig. 2(a). The performance of our method on Hamilton and Rochester is stable across different values, whereas the performance on Facebook fluctuates considerably; the best batch size on Facebook is 600. We then show how the weight of attribute information \(\lambda \) influences the result in Fig. 2(b). When \(\lambda \) approaches 1, meaning structural information is as important as attribute information, the performance is much better. However, the performance of our method drops sharply when \(\lambda \) is larger than 10, which indicates that an overly large influence of attributes can have the opposite effect. Finally, Fig. 2(c) shows how the dimension t of the output of the self-feedforward layer affects the result: when t is close to the embedding dimension, the performance is better in most cases.

Fig. 2. The impact of parameters batch size, \(\lambda \), and t

5.6 Node Clustering

For the node clustering task, we apply K-Means++ to the network embeddings on the three datasets and use ARI [11] and NMI [32] to evaluate the clustering quality. We set the number of clusters to the number of groups and compute the indexes 10 times to reduce the effect of randomness. The average results are shown in Fig. 3.
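A sketch of this clustering evaluation, assuming scikit-learn, is shown below; the function and argument names are ours.

```python
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score, normalized_mutual_info_score

def cluster_scores(X, y, n_clusters, runs=10):
    """Average ARI and NMI of K-Means++ clusterings over `runs` random seeds."""
    ari = nmi = 0.0
    for r in range(runs):
        pred = KMeans(n_clusters=n_clusters, init="k-means++",
                      n_init=10, random_state=r).fit_predict(X)
        ari += adjusted_rand_score(y, pred) / runs
        nmi += normalized_mutual_info_score(y, pred) / runs
    return ari, nmi
```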

As we can see, NANE achieves the best performance among almost all the cutting-edge baselines. Similar clusterings have a positive ARI, whereas negative values indicate poor performance, and an NMI of exactly 0 corresponds to purely independent label assignments. On Facebook, the ARI values of LINE and SDNE are below zero, indicating that their representations yield clusterings no better than independent labelings. Specifically, NANE achieves a 0.125 improvement over the second best result. On Hamilton, NANE also yields the best clustering results, with lifts of 0.127 and 0.178 on ARI and NMI respectively. This firmly demonstrates the superior performance of NANE on unsupervised learning tasks.

Fig. 3. The results of node clustering on three datasets

6 Conclusion and Future Work

Attributed networks, due to their inherent heterogeneity and sparsity, present new challenges for many tasks. In this paper, we propose a Neural-based Attributed Network Embedding method to capture structural and attribute information jointly. We argue that exploring the global and local similarity simultaneously is crucial for a better representation. We first employ fully connected layers with dropout to extract the relevance within attributes and structure respectively, and then leverage a deep autoencoder, a non-linear mapping, to reconstruct the global information under a local constraint. Experiments on three real-world datasets demonstrate the remarkable performance of our method compared with cutting-edge algorithms. Our future work will focus on the following directions: we will extend NANE to a multi-task learning algorithm and make it suitable for large-scale industrial data, and we are also interested in exploring how to capture user behaviors over time as well as user images and map them into the embedding space.