1 Introduction

A network in graph theory is a tuple \(G = (V, E)\) where V is a (finite) set of vertices and E is a (finite) set of edges [1]. Each edge is either a one or two element subset of V. When a system is represented in the form of a network, the entities of the system are denoted as the nodes of the network. If a pair of entities have an interaction or a relationship with each other then this is denoted as an edge between the entities [2]. For instance, if the transportation network of country is represented as a network, the vertices (nodes) of this network would be the different cities of a country. The edges of the network would denote the presence of a direct transport link between one or more cities. Depending on the type of interaction this transportation network aims to capture, the edges could be directed or un-directed and weighted or un-weighted. Un-directed edges would represent presence or absence of a direct route between the cities. Weighted edges would represent the volume of traffic between cities in the transportation network. Thus, graph representations offer the flexibility to capture different aspects of systems [3].

Network representation of a system also allows the application of Network science for its analysis. Network science concepts have their roots in graph theory, a fertile field of mathematics. Graph theory is concerned with proving theorems and developing algorithms that can be applied on arbitrary graphs (irrespective of what the graph models in the real world). What distinguishes network science from graph theory is its empiric [4]. It is this empirical aspect that makes Network analysis interesting [2]. Network science researchers do not study graphs from an abstract point of view but instead study graph representations of real world systems to understand their properties.

1.1 Background

Usefulness of network representations in the examination of complex systems is illustrated in the origin story of Google [5]. In their seminal paper, L Page et al. [6] argued that identification of authoritative web-pages on the internet could be done by representing the internet as a network (Fig. 1). The web-pages (entities) could be the nodes of the network and web-pages connected by a hyperlink could be shown in the network as nodes linked by a directed edge. In the web-graph that is thus formed, authoritative web-pages would be those with high eigen-vector centrality ranking [5]. The PageRank ranking technique proposed by the authors for information retrieval was based on this intuition [5]. Thus, network science provided a novel approach to analyze the internet. This proved useful in increasing the efficacy of information retrieval on the internet. This analogy shows how a concept of graph theory was used on a network representation model to develop an efficient search engine.

Fig. 1
figure 1

Internet network. Reprinted figure with permission from www.opte.org

In addition to this, there are several other useful statistical concepts in graph theory which can be applied on network models to draw inferences about the nature of the systems they represent [3, 7]. Measurements of path length, diameter, connectivity, transitivity and density allow inferences on the structural characteristics of systems [1, 2, 4, 89]. Locations of nodes in a network model can be used to draw parallels about the importance of those entities in the overall scheme of the system [2]. For instance, nodes located at the boundaries in a network are important for information exchange with nodes outside the network and nodes located at the center of a network are hubs which keep the network intact and play a key role in information interchange within the network.

Fig. 2
figure 2

A network model representing the collaborations between scientists working at the Santa Fe Institute (SFI). Reprinted figure from [2]

In the model given in Fig. 2 edges connect scientists that have coauthored at least one paper. Symbols indicate the research areas of the scientists. As seen in Fig. 2, density of edges between researchers in a particular domain are high compared to density of edges between researchers of different domains. The position of scientists (nodes) in the Santa Fe Institute network provides insights regarding their importance to the research community of the institution. The key takeaway from this example is that graph theory has several concepts that can be applied to networks to understand the behavior or trends in the real-world system they represent [10]. Similarly, data or systems across diverse domains such as computer science, transportation, social science, economics and biology too have been investigated using a network perspective [11].

1.2 Objective

This inquiry provides statistical analysis of large social network data-sets available on Stanford Network Analysis Project as well as standard synthetic benchmark data-sets such as GN-benchmark, LFR benchmark [12], dynamic LFR benchmark, small world model [13], Erdos Renyi Random Graph [14], Barabasi Albert Preferential Attachment graph and Forest Fire Graph [11]. The focus is on detailed analysis of these data-sets to uncover behavioural and structural characteristics. Domain specific phenomenon that occur in these data-sets are explained using intuition and empirical evidence is offered to support them.

The rest of the paper is organized into three main sections. Section 2 presents a review of literature which includes network theoretic concepts [15, 16]. Section 3 presents the detailed descriptions of the data-sets and their specifications. In case of synthetic benchmarks the specifications are used as provided in research papers where the generative models have been proposed. Section 3 also hosts a critical discussion of the results obtained and an explanation for the same. The concluding remarks of this inquiry are presented in Sect. 4.

2 Review of literature

2.1 Empirical analysis of social networks

Statistical features of social networks such as number of nodes n, average degree d, diameter D and average path length L of graphs can be used to infer the rate of diffusion processes amongst nodes. Shorter diameters and path lengths would indicate a faster diffusion of information Eq. 1. Modularity is a measure of the community structure in the graph and ranges from [-1, 1] for pure random graphs to perfect community structure. Reciprocity index of a graph is the proportion of mutual connections in a graph and ranges from [0, 1]. Assortativity measures homophily in a graph. Transitivity is defined as the probability of \(e_{j,k}\) in a graph where for nodes i, j, k the edges \(e_{i,j}\) and \(e_{i,k}\) exist.

$$\begin{aligned} D \propto \frac{ln(n)}{ln(d)}; L \propto \frac{ln(n)}{ln(d)}. \end{aligned}$$
(1)

Adhesion or edge connectivity E for connected graph G is the minimum number of edges \(\lambda (G)\) whose deletion from a graph G disconnects G.

Diameter is the length \(max_{(u,v)}d(u,v)\) of the “longest shortest path” (i.e., the longest graph geodesic) between any two graph vertices (uv) of a graph, where d(uv) is a graph distance.

Average path length L = \(\sum _1^E(G) \frac{d(u,v)}{E(G)}\)

Degree distribution of graph \(P(k) = \frac{n_k}{n}\) is fraction of nodes in the network with degree k i.e. \(n_k\) where n is the Graph order.

Modularity Q is a measure of quality of separation of different vertex types from each other.

$$\begin{aligned} Q = \frac{1}{2m} * \sum {A_{vw} - \frac{k_v * k_w}{2m} * \delta (c_v,c_w)} \end{aligned}$$
(2)

where, m is the number of edges in Eq. 2; \(A_{vw}\) is the element of the A adjacency matrix in row v and column w; \(k_v, k_w\) is the degree of v and w; \(c_v, c_w\) is the type (or component) of v and w; \(\delta (c_v,c_w)\) is 1 if \(c_v = c_w\) otherwise 0.

Verification of power laws \(f(k) \propto k^{-\alpha }\) in networks related to eigen-vectors distribution \(x_1, x_2,\ldots x_{20}\), component distribution \(C_1, C_2, \ldots , C_k\)

Assortativity measures the level of homophily of the graph.

$$\begin{aligned} r = \frac{\sum _{jk} jk(e_{jk} - q_jq_k)}{\sigma ^2_q} \end{aligned}$$
(3)

where, \(q_k\) is the number of edges leaving the node, other than the one that connects the pair jk; \(\sigma _q\) is the standard deviation of q in Eq. 3; \(e_{jk}\) is the refers to the joint probability distribution of the remaining degrees of the two vertices.

Graph density (\(G_D\)) is the number of edges present graph G amongst all possible edges in G. \(G_D\) for undirected and directed graphs is given by below Eqs. 4 and 5 respectively.

$$\begin{aligned} \frac{2|E|}{|V|(|V|-1)} \end{aligned}$$
(4)
$$\begin{aligned} \frac{|E|}{|V|(|V|-1)}. \end{aligned}$$
(5)

Reciprocity \(\rho\) as given in Eq. 6 is the measure of the likelihood of vertices in a directed network to be mutually linked.

$$\begin{aligned} \rho = \frac{\sum _{i \ne j (a_{ij} - {\overline{a}})(i \ne j (a_{ji} - {\overline{a}})}}{sum_{i \ne j (a_{ij} - {\overline{a}})^2}}. \end{aligned}$$
(6)

The betweenness centrality of a node g(v) is given by the Eq. 7:

$$\begin{aligned} g(v) = \sum _{s \ne v \ne t} \frac{\sigma _{st}v}{\sigma _{st}} \end{aligned}$$
(7)

where \(\sigma _{st}\) is the total number of shortest paths from node s to node t and \(\sigma _{st}(v)\) is the number of those paths that pass through v.

McGlohon et al. [7] observed that In and Out degree distributions of social networks to follow a power law, the weights of edges \(w_{i,j}\) between two nodes with weights \(w_i\), \(w_j\) follow a relation given by Eqn 8, the distribution of component sizes of the social networks follow a power law, number of edges E(t) and total weight of a graph W(t), at time t follow a power law and the first twenty eigenvalues of a social network follow a power law. In this inquiry the validity of the phenomena is tested on diverse datasets both synthetic and real.

$$\begin{aligned} w_{i,j} \propto (\sqrt{(w_i - w_{i,j})*(w_j - w_{i,j})})^\gamma . \end{aligned}$$
(8)

Tan et al. [17] observed that indegrees of nodes matched their outdegrees and presence of densely connected hubs of high degree nodes in social networks. The authors proposed other measures for social network analytics such as change in group clustering coefficient with group size, relation of out degree of a node with the number of groups to which it belongs, relation of clustering coefficient with degree of a node and degree of assortativity and reciprocity in social networks. Freidman et al. [18] observed higher graph density in online social networks such as Facebook, Twitter etc compared to social networks of other domains. Gewerc et al. [19] observed that node centrality and graph density, provide insight into how friendship would be formed in an online social network, how peers interact and how all this affects the network evolution over time. Hoff et al. [8] have argued that probability of relation between actors depends on the position of actors in an unobserved social space. Dwyer et al. [20] have applied measures of concern for privacy and trust to members of different sites, and looked for variances in behaviour. Gomez et al.  [3] have observed that social networks of various domains present common features of traditional social networks such as a presence of a large connected component, small average path length and high clustering, but differs from them in showing moderate reciprocity and neutral assortativity by degree. Using K–S Goodness of fit test, the authors show that the degree distributions are better explained by log-normal instead of power-law distributions. Another interesting observation is a high reciprocity in links in the online social network Slashdot.

The literature review highlights works that have uncovered new trends or phenomena in social networks. The focus of the current inquiry is to further this line of research i.e. use of network science to uncover patterns or trends seen in systems belonging to various domains. A second aim is to provide data driven analysis of various interesting data-sets obtained from Stanford Network Analysis Project. This shall provide novice and expert users with tools for algorithmic designs and methodologies. However, there are certain issues of the network representation models. These issues are elaborated along with possible solutions to them.

2.2 Synthetic data-sets

2.2.1 Barabasi Albert (BA): preferential attachment

Intuition behind this model is that when new nodes enter a network they prefer to attach to popular nodes (high in-degree) over others. The generative process of the network initializes with a single node. Then at each time step a node is created that initiates outgoing edges to nodes existing in the system. The probability that an existing node i is chosen by an outgoing edge is given by Eq. 9:

$$\begin{aligned} P[i] \propto k[i]^{\alpha } \end{aligned}$$
(9)

\(\alpha\) is the exponent of preferential attachment; k[i] is the in-degree of vertex i in the current time step.

Thus the probability P(k) that a newly created nodes links with k existing nodes decays as a power law. The graph generated using this model has power law distribution of degrees. However, this stochastic process assumes only linear relation for preferential attachment [9].

2.2.2 Erdős–Rényi random graph: random attachment

These graphs are of two types G(np) and G(nm). G(np) has n vertices and probability of an edge between them is constant p. G(nm) has n vertices and m edges such that m edges are chosen uniformly at random from a set of all possible edges [9]. In both G(np) and G(nm) ER generative models, it is assumed that the nodes decide to form edges with other nodes based on a constant probability (G(np)) or uniformly at random G(nm). Hence, no preferential attachment pattern is observed.

2.2.3 Preferential attachment and aging

This is a discrete time step model of a growing random graph. At each time step a single node is added and it initiates links to node already existing in the network. The probability of a node k getting an initiated edge is given by P[k] in Eq. 10 [9]. This model thus enriches the Barabasi Albert model.

$$\begin{aligned} P[k] = (c*k[i]^{\alpha } + a)*(d*l[i]^{\beta } + b) \end{aligned}$$
(10)

c, d is the coefficient of degree and age; k[i], l[i] is the in-degree and age of node i at current time step; a, b is the attractiveness of node with no adjacent edge and zero age; \(\alpha\), \(\beta\) is the preferential attachment exponent, aging exponent.

2.2.4 Watts–Strogatz model

A generative model which creates a lattice structured graph. Each node is connected to all nodes within its neighbourhood. The lattice structure that is formed is rewired i.e. edges are selected at random with a probability p and connected to nodes outside their immediate neighbourhood. This is done without altering the number of nodes or edges. The rewiring procedure, creates a “Small World” Effect i.e. reduction in the average path length of the graph [9].

2.2.5 J–R model

Jackson et al. proposed a network generative model where nodes of the social network are allowed to form links to other nodes using a hybrid strategy that encapsulates elements of preferential attachment model and the Erdős-Rényi model. Thus if there are pre-existing m nodes in a network then a newborn node links to \(a*m\) of them chosen uniformly at random and \((1-a)*m\) using a neighborhood search strategy (choice based links) and attaches to them. The hyper-parameter a is ratio of chance based interactions to choice based interactions.

2.2.6 Girvan Newman benchmark, LFR benchmark

GN benchmarks are developed using the stochastic block models. These graphs have 4 communities with 32 nodes each. There are a total of 128 nodes with each having a degree of 16. The mixing parameter \(\mu\) as given in Eq. 11 decides each community association and each node has disjoint membership to one community.

$$\begin{aligned} \mu = \frac{k_o}{k_i} + k_o \end{aligned}$$
(11)

\(k_o\) is the number of edges connecting vertices in different communities; \(k_i\) is the number of edges connected to a vertex.

Girvan Newman benchmarks [2] produce networks with poisson distribution. This is a drawback as real world graphs have power law distributed network sizes. To overcome this drawback, LFR benchmarks [2, 21] were proposed that had vertex degrees and community sizes power law distributed. LFR benchmark graphs are basically configuration models with built in communities which may be overlapping or disjoint. Another benchmark designed to model dynamic communities was proposed by Granell et al. [12] based on the planted l-partition model. Communities are allowed to grow, shrink, merge and split. However, at each time step the sub-graphs are proper communities in the probabilistic sense [2].

2.2.7 Forest fire network model

The Forest Fire network is a generative model where one vertex is added at a time. This vertex a connects to ambs vertices already present in the network, chosen uniformly at random. For each chosen vertex v the following procedure is performed:

  • Generate two random numbers that are geometrically distributed with means \(\frac{p}{1-p}\) and \(\frac{rp}{1-rp}\) such that p is forward probability, r is backward probability.

Based on these probabilities outgoing and incoming neighbours of v are connected to a. If v has neighbours below a threshold value then all of them are connected to a.

2.3 Real networks

Tables 1 and 2 are a collection of network data-sets publicly available on platforms such as SNAP [22] and UCIML [23]. The description of these data-sets are given in Tables 34 and 5.

Table 1 Description of the Social Relationship in networks (Part- I)
Table 2 Description of the social relationship (part-II)
Table 3 Description of networks—part 1
Table 4 Description of networks—part 2
Table 5 Description of networks—part 3

3 Experimental study

Measures based on literature review in Sect. 2.1 are highlighted in Table 6 and are used to evaluate synthetic and real data-sets enlisted in Sect. 2.1, 2.3. The aim is to understand and explain underlying social phenomena and derive interesting insights about the behavior of these networks.

Table 6 Measures of network analysis

3.1 Results

3.1.1 Barabasi Albert: preferential attachment [BA-game]

A connected, directed graph is generated by through this model with parameters \(N = 10000, \alpha = 1, a = 1\). Adhesion and edge connectivity is 0 and hence the graph is disconnected. A small diameter of 14 (given in Table 7) indicates diffusion of information can take place fast. However, 0.1% nodes have most of the incoming edges. Thuss “Rich getting richer” phenomenon is observed in this model. Modularity value indicates no inherent community structure. Preferential attachment model creates hubs which capture most of the incoming edges.

3.1.2 Erdos Renyi random graph [ER-game]

An undirected graph is generated through this model with parameters \(N = 10000, E = 10000\). Adhesion is 0 and hence the graph is disconnected. A diameter of 29 indicates slower diffusion rate compared to Preferential attachment model. A longer diameter is due to lower average degree. The generated graph is not connected. Modularity value indicates no community structure. Thus, it would be ideal as a null benchmark for testing community detection algorithms. The strongly connected components distribution follows a power law with \(\gamma = 3.09\) and intercept of 1. The first 20 eigenvalues of this graph also followed a power law distribution with \(\gamma = 32\) and intercept of 3.32. Both the previous results were verified on K–S test.

3.1.3 Evolving random graph with preferential attachment and ageing [BA-aging]

An directed graph is generated through this model with parameters set at \(N = 10000, c = 1, d = 1, a = 1, b = 0, \alpha = 1, \beta = 1\). Adhesion is 0 and hence the graph is easily disconnected. A diameter of 11 indicates the fast rates for diffusion of information. The generated graph is connected. Modularity value indicates no community structure. Thus, it would be ideal as a null benchmark for testing community detection algorithms. The co-efficients of power law fi to the degree distribution is given in Fig. 3.

3.1.4 Watts–Strogatz model [WS-model]

The Small World network generated by this model with parameters \(N = 100, L = 1\) has strong adhesion due to high clustering between nodes. The graph shows a small diameter \(D = 10\). Small diameter indicates possibility of fast diffusion of information through nodes. As the entire network is connected, the component distribution is not available. Also as most of the nodes have same degree, the degree distribution is also not available. However, it was found that first 20 eigenvalues followed power law distribution with \(\gamma =3.34, \mathrm{intercept} \,=\,5.313\) and a good fit as per the K–S test. This graph captures the “Small World phenomenon” observed in real networks but fails to capture other effects such as a power law distributed degree distribution and component size distribution.

3.1.5 GN benchmark graph [GN-model]

The GN Benchmark model with parameters set at \(N = 128, Deg_{Avg} = 16\) has strong adhesion due to high clustering between nodes. The graph shows a low degree of separation with diameter \(D = 4\). The graph also has inbuilt communities but community sizes don’t follow a power law distribution. As the entire network is connected, the component distribution is not available. As all the nodes have same degree and hence a degree distribution is not available. It was found that first 20 eigenvalues followed power law distribution with \(\gamma =3.15, \mathrm{intercept}\, =\, 7\) and a good fit as per the K–S test. This model fails to have a power law distributed degree distribution and component size distribution.

Table 7 Summary of results
Fig. 3
figure 3

Power-law co-efficient for BA-game, ER-game and BA-aging. All results indicate a good fit on K–S test

3.1.6 LFR benchmark graph

LFR benchmark models given in Table 8 provide a graph with power law distributed community sizes and degree distributions. At low values of \(\mu\) community structure is absent and hence the poor score for modularity. The benchmark is a good substitute for real data as it follows properties seen in real networks such as having a large connected component, a small diameter and power law distributed first 20 eigenvalues. The power law fit was verified using p-value of K–S Test as given in Fig. 4.

Static LFR Weighted benchmarks are used to verify additional properties as listed in Table 9.

Table 8 LFR benchmark graph
Fig. 4
figure 4

Power law co-efficient of LFR benchmark graphs

Table 9 LFR benchmark graph

3.1.7 Dynamic benchmark graph

The dynamic benchmarks given in Table 10 do not show the effect seen in real graphs such as “Gelling Point” [7]. The evolution of the edges with time do not show a power law relation with the nodes unlike that seen in real graphs [7].

Table 10 Dynamic benchmark graph

3.1.8 Forest fire network model

The graph generated by this model has no adhesion or edge connectivity as given in Table 11. The diameter and average path length are small which may promote fast diffusion of data. The graph has poor assortativity by degree and reciprocity. A giant connected component exists in the graph consisting of all nodes. No central hubs are present. Modularity is poor indicating no inherent communities. The clustering coefficient has negative correlation with out degree, suggesting that there is significant clustering among low-degree nodes. Majority nodes have incoming edges equal to their outgoing edges. The in-degree and out-degree distributions of nodes fit the power law well as indicated by \(R^2\) values. Phenomena seen in real networks such as homophily, reciprocity are absent.

Table 11 Forest fire network model

3.1.9 Online social networks

The commonality identified during the analysis of the social networks of these websites was the power law distribution of their in-degree with in-weights and out-degree with out-weights as given in Fig. 5. The degree distributions were also power law distributed with \(1.18 \le \gamma \le 1.74\). Adhesion for Facebook, Gowalla were 1 which indicated that users of these websites were reachable through atleast 1 route. Reciprocity of directed networks such as Epinions and Slashdot indicated higher level of mutual agreements between members. Vertex betweenness for the social networks did not indicate presence of central hubs. For social networks of such large scale lack of a central hub is quite intuitive. Poor score for modularity and graph density is seen in all these social networks, this is also common for such large scale graphs. Epinions and Slashdot see high reciprocity for links as the social relationship is that of trust.

Fig. 5
figure 5

Characteristics of online social network data-sets

3.1.10 Communication networks

Average number of frequent contacts a person has is three in the Organization under review. However an email account exists that has 7636 outgoing links. This might indicate the presence of a facility for Employee-Management communication. Assortativity by degree has a negative value which means highly active emailers send most of their emails to less active emailers. Inherent community structure is present in the graph which means that the organization could possibly have departments which communicate more internally than with other departments.

3.1.11 Citation networks

Nodes with zero citations exist which is common for citation networks. Also nodes with zero outdegree also exist (possibly because the graph is incomplete). Average number of citations a paper has is twelve. Max number of citations a paper receives is 846, max number of citations a paper makes is 411. Adhesion is zero, as many papers exist with few citations or nill citations. Assortativity in citation graphs is low as most of the links are from papers with no citations to papers with large number of citations. Reciprocity doesn’t exist in citation graphs as the social relationship is “Academic status” and is not reciprocated frequently unlike friendships.

3.1.12 Web graphs

Google Web graph has adhesion and edge connectivity 0 as the graph is not connected. It is also characterised by lack of a central hub and lack of inherent community structure. The high reciprocity value for hyperlinks between webpages is peculiar and might not be held true for a larger sample size of the internet graph. Figure 6a shows analysis of local transitivity indicated a higher level of clustering between neighbours of nodes of low degree than that of nodes of high degree. Figure 6b shows that indegree of nodes matches their outdegree except for popular webpages.

Fig. 6
figure 6

a Local transitivity, b outdegree–indegree relation

3.1.13 Product co-purchasing networks

Analysis of the product co-purchasing network of Amazon reveals that Incoming edges for products are higher than outgoing edges. A Hub is present in the graph which has 420 co-purchased products. Analysis of the modularity value indicates that there exist communities of products that are co-purchased together. \(\approx 40\%\) links are reciprocal indicating consumers opt for Customers Who Bought This Item Also Bought feature of the Amazon.

3.1.14 Collaboration networks

The analysis of the degree of the nodes in ArXiv Collaboration networks reveals that an average scientist has collaborated with 16 others and in rare cases upto 550 others for his research. The scientists that collaborate with few people for research show a higher degree of transitivity between their collaborators. Average path length of this network is \(\approx 5\) indicating fast diffusion of information in the network as given in Fig. 7.

Fig. 7
figure 7

Analysis of network data-sets

3.1.15 Other datasets

The analysis of the social networks from Voting trends (Wiki-Vote), Internet Topology (CAIDA-Net), P2P networks (GnutellaP2P) and Road networks (Road-PA) reveal a adhesion and edge connectivity of 0. Assortativity by degree on Road networks show that busy (high degree) intersections are connected to other busy intersection as given in Fig. 8. Such a design requirement is very intuitive and could be common to other road networks too. All the social networks exhibit poor community structure and also do not contain central hubs. Figure 9a shows that users that have voted for many participants have low transitivity. Transitivity patterns of all the networks under analysis reveal the same trend. Figure 9b indicates users that have supported large number of other users do not necessarily receive support from others. Component and degree distributions are analysed in Figs. 10 and  11 respectively.

Fig. 8
figure 8

Degree of assortativity, reciprocity and adhesion in the networks

Fig. 9
figure 9

a Local transitivity, b outdegree–indegree relation

Fig. 10
figure 10

Component distributions of network data-sets

Fig. 11
figure 11

Degree distribution of networks

4 Conclusion

The analysis of social networks belonging to domains such as online social networks to citation graphs was performed in this inquiry. Network theoretic concepts were used for the analysis. Observations made in the previous inquiries were validated on social network data-sets of different domains. It was observed that degree distributions on real and synthetic data-sets were power law distributed. However, component size distribution has a poor fit for the power law except in GnutellaP2P file sharing websites. In the case of the road networks of Pennsylvania state of USA, it was found that busy intersections would be connected to other busy intersection. Probably the roads were designed in this way to ensure smooth vehicle movement. Commonality between data-sets of all domains was the absence of central hubs, presence of community structure and low graph density.

Social networks are a particular class of networks that represent the sum of all personal or professional ties between the members of the system. A network perspective revealed that social networks shared properties such as negative assortativity, domination of a few members, high edge density, power law distributed degree and component sizes, high transitivity, high reciprocity and small average path length and diameter.

Representing the systems in the form of a social network facilitated easier but conceptually sound analysis. However, there are also several limitations of network representation models. The calculation of certain characteristics such as diameter and average path length require \(n*n\) computations for a graph of n nodes. This made the calculation expensive for large graphs. Machine learning applications on graphs such as node classification, link prediction, expert recognition etc. require hand-engineering features to be calculated initially. This makes the results of the exercise dependent on the ability of the researcher. Machine learning cannot be directly applied on the network as the data-points on the network are not i.i.d. The data-points are connected to each other by links or edges and thus the i.i.d assumption for machine learning is not satisfied.

A solution for these drawbacks is network representation learning (NRL). The adjacency matrix of network G(VE) is denoted by \(A \in R^{|V|*|V|}\) and \(V^a \in R^{|V|*p}\) is used to denote the vertex attribute matrix if present otherwise, \(V^a = \phi\). The problem is defined as follows: given a network \(G = (V, E)\) and associated attributes, the aim is to represent each node u in a low-dimensional vector space \(y_u\) by learning a mapping \(f : {V, V^a} \rightarrow R^d,\) namely \(y_v = f(v, V^a) \forall v \in V\). It is required that \(d \ll |V|\) and the function f preserve a proximity measure defined on the graph G. Intuitively, if two nodes u and v are “similar” in graph G, their embedding \(y_u\) and \(y_v\) should be close to each other in the embedding space i.e. \({y_u}^Ty_v \sim 1\). The notation \(f(G) \in R^{|V|*d}\) is used for the embedding matrix of all nodes in the graph G. As the nodes are converted to vector embeddings, the drawbacks of network representation models are overcome. Designing efficient NRL techniques is a promising research direction if machine learning applications have to be developed for graphs.