Optimal Representation of Large-Scale Graph Data Based on K2-Tree

Chang, Liang; Zeng, Xiangxuan; Xu, Zhoubo; Qian, Junyan; Gu, Tianlong; Song, Houbing

doi:10.1007/s11277-017-4087-5

Optimal Representation of Large-Scale Graph Data Based on K²-Tree

Published: 15 March 2017

Volume 95, pages 2271–2284, (2017)
Cite this article

Download PDF

Access provided by Autonomous University of Puebla

Wireless Personal Communications Aims and scope Submit manuscript

Optimal Representation of Large-Scale Graph Data Based on K²-Tree

Download PDF

Liang Chang¹,
Xiangxuan Zeng¹,
Zhoubo Xu¹,
Junyan Qian¹,
Tianlong Gu¹ &
…
Houbing Song ORCID: orcid.org/0000-0003-2631-9223²

362 Accesses
2 Citations
Explore all metrics

Abstract

Graph is widely used to model data in various applications. With the rapid growth of many emerging applications such as Internet of Things, it is urgent to require the processing capability on large scale graphs with billions of vertices. Web graph is a typical case of graph data that is widely used for analyzing the structure, behavior and evolution of the World Wide Web. In this paper, we focus on optimal representation of large-scale Web graphs. Our work is motivated by the need of fit large-scale graphs into the main memory and carry out analyze on them. By analyzing the adjacency matrix of Web graphs, we find two characteristics on the distribution of 1s in the matrix. Firstly, only a very small proportion of elements in the matrix are 1s. Secondly, majority of 1s gather around the principal diagonal and form a few number of clusters in the matrix. Based on these characteristics, we first develop a clustering mechanism to locate the clusters of 1s in the adjacency matrix. Then, we combine this clustering mechanism with a structure named K²-tree and propose an approach for representing large-scale Web graphs compactly. Basic idea of the approach is trying to compress a large number of zeros as a single zero. Experimental results show that, our approach not only reduces the space for representing a Web graph, but also reduces the time consumption for operations such as retrieving neighbors of any nodes on the graph; compared with existing approaches, our approach achieves the best space/time tradeoff.

Delta-K 2-tree for Compact Representation of Web Graphs

Common Neighbor Query-Friendly Triangulation-Based Large-Scale Graph Compression

Fast Construction of Compressed Web Graphs

Discover the latest articles, news and stories from top researchers in related subjects.

Artificial Intelligence

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

Graph is widely used to model data in various applications [1–7]. With the rapid growth of emerging applications like social network analysis, semantic Web analysis, bioinformatics network analysis, and Internet of Things, it is urgent to require the processing capability on large scale graphs with billions of vertices [8–14]. In order to analyze the structure, behavior and evolution of the World Wide Web, researchers often model the Web as a directed graph by treating Web pages as nodes and treating the links between pages as directed edges. Such a directed graph is also called Web graph [14]. Web graph is a typical case of graph data.

With the rapid increase in the size of the World Wide Web, it becomes a great challenge to represent a large-scale Web graph on computers. According to the report of China Internet Network Information Center, by the end of 2015, the total numbers of Web pages and hyperlinks in China were more than 212 billion and 10¹⁸ respectively. If the corresponding Web graph is represented as adjacent matrix, then the space for storing it is more than 40 TB. Such a Web graph is too big to be loaded in the main memory of computer for analyzing. Obviously, the explosive growth of the scale of Web graphs has made it difficult for using traditional storage structure to represent and manipulate the graphs.

At present, there are three approaches to deal with large-scale Web graphs. (1) Make use of external memory to store Web graph [15]. Shortcoming of this approach is a large amount of time consumption for operations on graphs; since the access speed of external memory is 4–6 orders of magnitude slower than that of main memory, and the graph has to be loaded from external memory to main memory frequently. Therefore, when taking this approach, the main concern of researchers is how to reduce the number of disk I/O operations by optimizing the spatial locality of data access. (2) Divide a big Web graph into many parts and store them in main memories of different computers in a distributed system [16]. Shortcoming of this approach is a big communication cost between computers caused by the coupling among all the parts of the Web graph. Therefore, when taking this approach, the main concern of researchers is how to reduce the communication cost and time consumption by technically designing a good segmentation algorithm. (3) Convert the Web graph into a compressed form so that it could be stored in the main memory of a computer and could be handled effectively [17]. In this paper, we follow the third approach.

By analyzing the adjacency matrix of a Web graph, we find two characteristics on the distribution of the values of ones in the matrix. First, only a very small proportion of elements in the matrix are ones. Secondly, majority of 1s gathers around the principal diagonal and form a few number of clusters in the matrix. Based on these characteristics, we develop a clustering mechanism to locate the clusters in the matrix, and incorporate this mechanism into a structure named K²-tree. As a result, we propose an algorithm for converting large-scale Web graphs into compressed forms, and we demonstrate that this algorithm achieves the best space/time tradeoff in all existing algorithms.

2 The K²-Tree

K²-tree is a structure proposed by Brisaboa et al. [18] for representing graphs in the case that the number of edges is far less than the square of the number of nodes. The basic idea is to transform the adjacency matrix of a graph into a complete K²-fork tree, where K is a positive integer. In this K²-complete tree, each node labeled with 1 just stands for an element of 1 in the matrix, whereas each node labeled with 0 will stands for an area which is full of 0s in the matrix.

More precisely, given an adjacency matrix, the K²-tree approach will first generate a root node and divides the matrix into K² submatrixes of the same size. Be corresponding to each submatrix, the approach will generate a child for the root node, and labels this child node with 0 if all elements in the submatrix is 0; otherwise the child node will be labeled with 1. During this process, all the K² submatrixes are sorted by the order that from top to bottom and then left to right. Such a sequence is reflected in the ordering of the corresponding children nodes that from left to right. Then, for each submatrix which contains 1, the approach will divide it again into K² submatrixes; and correspondingly K² children nodes will be generated and be labeled with 0 or 1 in the tree. Such a process is repeated, until the submatrix containing 1 is a 1 × 1 matrix.

In the case that the order of an adjacency matrix is not an integral power of the integer K, the K²-tree approach need to do some preprocessing. It will introduce a number of rows and columns of 0s into the bottom and the right-hand side of the adjacency matrix so that the order of the matrix is an integral power of K. Because a square matrix of 0s will be compressed as a single node labeled with 0 in the K²-tree, 0s introduced in the preprocessing stage don’t bring many effects on the performance of the K²-tree approach.

As an example, the white part of the matrix in the left part of Fig. 1 is an adjacency matrix for a graph with 11 nodes and 12 edges. We will represent it as a K²-tree with K = 2. First, 0s in the grey part of the matrix are introduced to transform the adjacency matrix into a matrix with an order 16. Then, the complete 2²-fork tree shown in the right part of this figure will be constructed by the K²-tree approach. Now, let L be a bit-vector composed of the labels of the nodes in the lowest level of the tree, with the order that from left to right; and let T be a bit-vector composed of the labels of all the other nodes in the tree, with the order that from top to bottom and then left to right. Then, we just need to store these two bit-vectors in computer since they contain all the information of original graph. In this example, in the case that the graph is represented as adjacency matrix, the space consumption is 121 bits. By making use of the K²-tree approach, we only need 72 bits to store the vectors T and L. Obviously, the K²-tree approach can reduce storage space greatly.

The K²-tree approach also supports efficient operations on graphs. Based on a top-down traverse of the tree, we can realize forward and backward navigation on graphs. Furthermore, the operations of checking individual links and retrieving links between given ranges of nodes can also be realized [19].

3 Limitations of K²-Tree in Representing Web Graphs

K²-tree provides a good approach for representing graphs compactly. However, since it divides a matrix into K rows and K columns mechanically during the compression process, it has as least three shortcomes.

First, the close coupling between nodes might be destroyed and it will increase the time consumption for query operations on the Web graph. In a World Wide Web there are always some groups of Web pages which are related tightly. In an extreme case, there exist links between every pair of pages in a group. By the K²-tree approach, nodes in such a group might be dispersed to different submatrixes and therefore make a close connection between submatrixes. As a result, for operations such as forward or backward navigation on the graph, the visit of nodes in a K²-tree has to be switched frequently between brother nodes. The switch of visit between brother nodes is time consuming, since it has to go through the father node.

Second, a large area of matrix full of 0s might be divided into many submatrixes and therefore can’t be compressed completely. The spirit behind the K²-tree approach is trying to compress a large number of zeros as a single zero. If there is a large area of matrix full of 0s, the ideal result is to represent this area by a single node labeled with 0. However, once it is divided into many submatrixes, more 0s and consequently more spaces will be used for the representation.

Thirdly, the height of the K²-tree has a major impact on the time consumption of operations on the graph. Suppose the number of nodes in a Web graph is n, then the height of the corresponding K²-tree will be \(h = \left\lceil {\log_{k} n} \right\rceil\). For operations such as forward or backward navigation on the Web graph, there will be lots of top-down traversals accompanied with backtracking processes. For example, on the data set CNR-2000 [14] of Web graph which has 325,557 nodes and 3,216,152 edges, we made an experiment to visit all the neighbors of a given node in it. We found that the K²-tree was recursively visited 450 times in average, before all the neighbors of a given node were found. Therefore, in order to reduce the time consumption of query operations, we need to reduce the height of the K²-tree.

Now let us have a closer look of Web graph and analyze the adjacency matrix of it. CNR-2000 is a public data set of Web graph that often used as benchmark in literatures. As we mentioned before, the graph has 325,557 nodes and 3,216,152 edges. In order to investigate the distribution of 1s in the adjacency matrix, we map the line number and column number of nodes into the X-coordinate and Y-coordinate respectively, and map each element in the matrix with value 1 into a corresponding point in the coordinate system. As a result, we get Fig. 2. It is obvious that majority of 1s are gather around the principal diagonal and form a few number of clusters.

Next, we make a quantitative analysis of this phenomenon. Let us horizontally divide the adjacency matrix of CNR-2000 into many parts. Without loss of generality, we let the height of each part be 65,536, and in this case, the matrix is in fact divided into 5 parts. We count the number of 1s contained in each part and denote it as total_num, as shown by the second column in Table 1. Furthermore, for each part, we pick out the square matrix whose principal diagonal is a part of the principal diagonal of the adjacency matrix; we call this square matrix as main submatrix. Now, for each part, we count the number of 1s in the main submatrix and denote it as in_num, as shown by the third column in Table 1; we also count the number of 1s not in the main submatrix and denote it as out_num, shown by the fourth column in Table 1. Finally, we calculate the ratio of out_num to total_num and call the result as outlier rate. With the index of outlier rate, it is shown again that majority of ones in the adjacency matrix gather around the principal diagonal and form a few number of clusters.

Table 1 Node distribution of the data set CNR-2000

Full size table

Based on the above observation, we get the following idea to improve the K²-tree approach. Given the adjacency matrix of a Web graph, instead of dividing it mechanically, we will carry out the following operations. (1) Find out such submatrixes of the adjacency matrix that majority elements in the submatrix are 1s; compress each submatrix by the K²-tree approach. (2) Remove all of these submatrixes from the adjacency matrix and fill their places in the adjacency matrix with 0s; compress the resulted adjacency matrix by the K²-tree approach. (3) Record all of those K²-trees generated in the previous steps by some mechanism, and get a compressed representation of the graph.

Figure 3 illustrates the above idea. In the rest of the paper, we will call the submatrixes found in the first step as clusters of 1s.

Intuitively, our method can overcome or alleviate the defects of the K²-tree approach discussed in the beginning of this section. First, the close coupling between nodes is protected by the first step. Second, areas of matrix full of 0s in the adjacency matrix are expanded in the second step, and therefore each single node labeled with 0 in the K²-tree can represent more elements of the adjacency matrix. Thirdly, the height of the resulted K²-tree is decreased; since we generate many K²-trees where each K²-tree only represents a submatrix or represents a very sparse matrix that has been backfilled by 0s.

4 Incorporating Clustering Mechanism into K²-Tree

In this section, we give a detailed description to our method. Firstly we propose an algorithm Clustering(M, MinSize, ϕ) to find out the clusters of 1s in the adjacency matrix M along the principal diagonal. In this algorithm, the parameter M is an adjacency matrix of the Web graph that we will compress; parameters MinSize and ϕ respectively sets the minimum size and the minimum outlier rate for the clusters that we are looking for. After the execution of the algorithm, we will get a queue of matrixes that will be compressed into K²-trees one by one in Algorithm 2.

Given a Web graph, it is possible that some nodes that have the same neighbors are not encoded in adjacent positions. Such an order of nodes will damage the locality characteristics of the adjacency matrix. Therefore, in the beginning of Algorithm 1, we firstly make use of the BFS technique [21] to change the order of nodes in the adjacency matrix so as to get a better locality characteristic.

In Algorithm 1, we look up clusters of 1s along the principal diagonal of the matrix M. Starting from the first element in M, the algorithm will selects a submatrix with the size of width × width, and then calculate the outlier rate of it. If this submatrix satisfies the two conditions on clusters, then it will be recorded in the queue list, and its area in the matrix M will be filled with 0s. The purpose of setting the parameter MinSize is to avoid the cases that too many clusters will increase the time consumption of operations on the compressed representation of graphs.

Now we can present our compression algorithm Cluster-K ²-tree(M, K). The first four steps of Algorithm 2 are designed to calculate the parameters MinSize and ϕ. Then the algorithm Clustering(M, MinSize, ϕ) is called and a queue list of matrixes is generated. For each matrix contained in list, we make a compression to it by the K²-tree approach, and get the corresponding T vectors and L vectors. The storage mode used here is a little different from the storage mode of the K²-tree approach. In Algorithm 2, the T vectors of compression results by the K²-tree approach will be stored independent; but all the L vectors, except the L vector corresponding to the whole adjacency matrix, will be composed to form the vector global_L, which will then be transformed into an array and a bitmap by the DACs [22] encoding approach. The DACs coding technique can compress the L vectors with a high efficiency and support fast random access [13]. Due to the space limitation, here we omit the detailed process of DACs encoding.

As an example, let us consider again the adjacency matrix in the left part of Fig. 1. In the case that MinSize = 4 and ϕ = 0.1, we will get two clusters shown in Fig. 4. For each of these clusters, we make a compression to it by the K2-tree approach with K = 2, and get the results shown in Fig. 5. Now we only need to record the bit-vectors T1, T2, L1 and L2, and the corresponding space consumption is only 64bit. Recall that, in Section II, it needs 121 bits to store the adjacency matrix of this graph, and needs 72 bits by the original K2-tree approach.

The advantage of our method not only reflects in the ratio of compression, but also reflects in the time consumption of query operations on Web graphs. The reason is that our method reduces the height of the generated K²-trees and, at the same time, preserves the close coupling between nodes. For example, the height of the K²-tree in the left part of Fig. 1 is 5, and it is now reduced to 4 in Fig. 5. Furthermore, if we need to visit all the neighbors of the tenth node in the graph, then the K²-tree in Fig. 1 will be visited recursively 31 times; however, the K²-trees in Fig. 5 will be visited only 15 times.

The above advantage is more obvious in large-scale Web graphs. For example, let us consider again the data set CNR-2000 that we discussed in Section III. In the case that MinSize = 40 and ϕ = 0.00003, we will get 221 clusters. For the query operation that visiting all the neighbors of a given node, by our method the K²-trees will be visited recursively 361 times in average. However, recall that, in the discussion of Section III, the K²-tree generated by the original K²-tree approach was visited recursively 450 times in average. Obviously, we get a reduction of almost 20%.

5 Experiments

In this section, we compare our method with existing approaches for representing large-scale Web graphs.

At present three are three typical algorithms in the literatures, named K²-tree [18], Re-Pair [23] and LZ78 [23], that have excellent space/time tradeoff. Source codes of these algorithms are available on the website of the FCWGR (Fast Compact Web Graph Representations) project [24] from the University of Chile. We downloaded the source codes from the Website and ran them in our testing environment. Our testing environment is based on a computer with Intel(R) Core(T) i5-4590 CPU@3.30 GHz and 4 GB RAM. Operation system is Ubuntu GNU/Linux in version 12.04 (64 bits). All tests only use one CPU core. Programming language is the C language and the compiler is gcc in version 4.4.7.

We use the data sets CNR-2000 and EU-2005 as benchmarks. CNR-2000 and EU-2005 are data sets for real Web graphs and are available on the website of the LAW laboratory [20] from the University of Milan. Some information on these data sets are listed in Table 2, including the number of nodes, the number of edges, ratio of the number of edges to the number of nodes, and the size for a plain adjacency list representation of the graph (using 4-byte integers).

Table 2 Description of the web graphs used in the experiment

Full size table

In the experiment, we take into account both space and time. Space consumption is measured as a ratio of bits per edge. More precisely, we calculate the total space for storing the results of compression, and divide it by the number of edges in the Web graph. For the time consumption, we focus on the operation that finding all neighbors of a given node in the graph. For every node in the graph, we respectively calculate the time consumption of finding all the neighbors of it; then we add all of these time consumptions together and divide the sum by the number of edges in the Web graph. The unit of time in the experiment is microsecond.

Figure 6 shows the space and time consumption on the data set CNR-2000. During the execution of our algorithm, there are 221 clusters been generated. Figure 7 shows the results on the data set EU-2005, and on this data set there are 180 clusters been generated during the execution of our algorithm.

Compared with the classical K²-tree approach, our method significantly reduces both the space consumption and the time consumption. More precisely, for the space consumption, our method reduces 42 and 38% respectively in the data sets CNR-2000 and EU-2005. For the time consumption, our method reduces 20 and 43% respectively.

Compared with the Re-Pair and LZ78 algorithms, our method significantly reduces the space consumption. More precisely, on the data set CNR-2000, our method reduces 59% to Re-Pair and 78% to LZ78; on the data set EU-2005, our method reduces 50% to Re-Pair and 65% to LZ78. The disadvantage of our method is that it needs more time on the operation of finding all neighbors of a given node. More precisely, on the data set CNR-2000, our method increases 6% to Re-Pair and 75% to LZ78; on the data set EU-2005, our method increases 45% to Re-Pair and 95% to LZ78.

Although our method needs more time compared with Re-Pair and LZ78, it achieves a better space/time tradeoff. Furthermore, for operations on Web graphs, both Re-Pair and LZ78 only support forward navigation. Since our method is an improvement of the K²-tree approach, in theory it supports all the operations supported by K²-tree approach, such as forward navigation, backward navigation, deciding the existence of a link, and retrieving the links between certain nodes [19].

6 Conclusions and Future Work

With the rapid development of World Wide Web and the rapid growth of many emerging applications on the Web, it is urgent to require the processing capability on large scale graphs with billions of vertices. How to represent such large scale graphs is a problem full of challenge.

In this paper, by combining a clustering mechanism with the K²-tree structure, we propose a method for representing large-scale Web graphs compactly. Our method firstly finds out clusters of 1s in the adjacency matrix, and represents each cluster compactly by making use of the K²-tree approach. Then, for the original adjacency matrix of the Web graph, our method replaces all the 1s occurring in the clusters by 0s, and gets a sparse adjacency matrix. This sparse adjacency matrix is represented compactly again by making use of the K²-tree approach. Finally, the representations of clusters and sparse adjacency matrix are well-organized and we get a compressed representation of the whole Web graph. Compared with existing approaches in the literature, such as the classical K²-tree approach, the Re-Pair approach and the LZ78 approach, our method not only reduces the space for representing Web graph but also reduces the time consumption for the query operation on the graph. Furthermore, our method has the best space/time tradeoff.

In the future work we will extend our method to support more operations on graphs such as the forward navigation, backward navigation, existence checking of certain links, and retrieving of links between nodes. Another work is to extend the method to deal with dynamic graph data.

References

Sun, Y. C., Yu, X. P., Bie, R. F., & Song, H. B. (2016). Discovering time-dependent shortest path on traffic graph for drivers towards green driving. Journal of Network and Computer Applications, 83(1), 204–212.
Google Scholar
Lv, Z., Song, H., Basanta-Val, P., Steed, A., & Jo, M. (2017). Next-generation big data analytics: State of the art, challenges, and future research topics. IEEE Transactions on Industrial Informatics. doi:10.1109/TII.2017.2650204.
Google Scholar
Song, H., Srinivasan, R., Sookoor, T., Jeschke, S., & Cities, S. (2017). Foundations, principles and applications (pp. 1–906). Hoboken: Wiley.
Google Scholar
Tawalbeh, L. A., Bakheder W., & Song H. (2016) A mobile cloud computing model using the cloudlet scheme for big data applications. In IEEE First International Conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE) (pp. 73–77). Washington, DC.
Shojafar, M., Javanmardi, S., Abolfazli, S., & Cordeschi, N. (2015). FUGE: A joint meta-heuristic approach to cloud job scheduling algorithm using fuzzy theory and a genetic method. Cluster Computing, 18(2), 829–844.
Article Google Scholar
Shojafar, M., Cordeschi, N., & Baccarelli, E. (2016). Energy-efficient adaptive resource management for real-time vehicular cloud services. IEEE Transactions on Cloud Computing. doi:10.1109/TCC.2016.2551747.
Google Scholar
Yu, J. G., Deng, X., Yu, D. X., Wang, G. H., & Gu, X. (2013). CWSC: Connected k-coverage working sets construction algorithm in wireless sensor networks. International Journal of Electronics and Communications, 67(11), 937–946.
Article Google Scholar
Sun, Y. C., Lu, C., Bie, R. F., & Zhang, J. S. (2016). Semantic relation computing theory and its application. Journal of Network and Computer Applications, 59, 219–229.
Article Google Scholar
Lin, C., Song, Z., Song, H., Zhou, Y., Wang, Y., & Guowei, W. (2016). Differential privacy preserving in big data analytics for connected health. Journal of Medical Systems, 40(4), 1–9.
Google Scholar
Tawalbeh, L. A., Mehmood, R., Benkhlifa, E., & Song, H. (2016) Mobile cloud computing model and big data analysis for healthcare applications. In IEEE Access (Vol. 4, pp. 6171–6180).
Lin, C., Wang, P., Song, H., Zhou, Y., Liu, Q., & Guowei, W. (2016). A differential privacy protection scheme for sensitive big data in body sensor networks. Annals of Telecommunications, 71(9–10), 465–475.
Article Google Scholar
Cordeschi, N., Shojafar, M., & Baccarelli, E. (2013). Energy-saving self-configuring networked data centers. Computer Networks, 57(17), 3479–3491.
Article Google Scholar
Yu, J. G., Chen, Y., Ma, L. R., Huang, B. G., & Cheng, X. Z. (2016). On connected target k-coverage in heterogeneous wireless sensor networks. Sensors, 16(1), 104.
Article Google Scholar
Meusel, R. (2015). The graph structure in the web—analyzed on different aggregation levels. The Journal of Web Science, 1(1), 33–47.
Article Google Scholar
Vitter, J. S. (2009). External memory algorithms and data structures: Dealing with massive data. ACM Computing Surveys, 33(2), 209–271.
Article Google Scholar
Khan, L. R., Thuraisingham, B., McGlothlin, J., Masud, M. M., & Husain, M. F. (2011). Heuristics-based query processing for large RDF graphs using cloud computing. IEEE Transactions on Knowledge and Data Engineering, 23(9), 1312–1327.
Article Google Scholar
Boldi, P., & Vigna, S. (2004). The webgraph framework I: Compression techniques. In: Proccedings of the 13th International Conference on World Wide Web (pp. 595–602).
Brisaboa, N. R., Ladra, S., & Navarro, G. (2009). K²-trees for compact web graph representation. In: Proceedings of the 16th International Symposium on String Processing and Information Retrieval (pp. 18–30).
Brisaboa, N. R., Ladra, S., & Navarro, G. (2014). Compact representation of Web graphs with extended functionality. Information Systems, 39(1), 152–174.
Article Google Scholar
http://law.di.unimi.it/datasets.php.
Apostolico, A., & Drovandi, G. (2009). Graph compression by BFS. Algorithms, 2(3), 1031–1044.
Article MathSciNet Google Scholar
Brisaboa, N. R., Ladra, S., & Navarro, G. (2013). DACs: Bringing direct access to variable-length codes. Information Processing and Management, 49(1), 392–404.
Article Google Scholar
Claude, F., & Navarro, G. (2010). Fast and compact web graph representations. ACM Transactions on the Web, 4(4), 523–530.
Article Google Scholar
http://webgraphs.recoded.cl/.

Download references

Acknowledgements

This work is supported by the Natural Science Foundation of China (Nos. 61572146, 61363030, 61572231, U1501252); the Natural Science Foundation of Guangxi Province (Nos. 2015GXNSFAA139285, 2016GXNSFDA380006); and the High Level of Innovation Team of Colleges and Universities in Guangxi and Outstanding Scholars Program.

Author information

Authors and Affiliations

Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin, Guangxi, 541004, China
Liang Chang, Xiangxuan Zeng, Zhoubo Xu, Junyan Qian & Tianlong Gu
Department of Electrical and Computer Engineering, West Virginia University, Montgomery, WV, 25136, USA
Houbing Song

Authors

Liang Chang
View author publications
You can also search for this author in PubMed Google Scholar
Xiangxuan Zeng
View author publications
You can also search for this author in PubMed Google Scholar
Zhoubo Xu
View author publications
You can also search for this author in PubMed Google Scholar
Junyan Qian
View author publications
You can also search for this author in PubMed Google Scholar
Tianlong Gu
View author publications
You can also search for this author in PubMed Google Scholar
Houbing Song
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding authors

Correspondence to Tianlong Gu or Houbing Song.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Chang, L., Zeng, X., Xu, Z. et al. Optimal Representation of Large-Scale Graph Data Based on K²-Tree. Wireless Pers Commun 95, 2271–2284 (2017). https://doi.org/10.1007/s11277-017-4087-5

Download citation

Published: 15 March 2017
Issue Date: August 2017
DOI: https://doi.org/10.1007/s11277-017-4087-5

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Optimal Representation of Large-Scale Graph Data Based on K²-Tree

Abstract

Similar content being viewed by others

Delta-K 2-tree for Compact Representation of Web Graphs

Common Neighbor Query-Friendly Triangulation-Based Large-Scale Graph Compression

Fast Construction of Compressed Web Graphs

1 Introduction

2 The K²-Tree

3 Limitations of K²-Tree in Representing Web Graphs

4 Incorporating Clustering Mechanism into K²-Tree

5 Experiments

6 Conclusions and Future Work

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding authors

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Optimal Representation of Large-Scale Graph Data Based on K2-Tree

Abstract

Similar content being viewed by others

Delta-K 2-tree for Compact Representation of Web Graphs

Common Neighbor Query-Friendly Triangulation-Based Large-Scale Graph Compression

Fast Construction of Compressed Web Graphs

Explore related subjects

1 Introduction

2 The K2-Tree

3 Limitations of K2-Tree in Representing Web Graphs

4 Incorporating Clustering Mechanism into K2-Tree

5 Experiments

6 Conclusions and Future Work

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding authors

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation

Optimal Representation of Large-Scale Graph Data Based on K²-Tree

2 The K²-Tree

3 Limitations of K²-Tree in Representing Web Graphs

4 Incorporating Clustering Mechanism into K²-Tree