Abstract
Distributed file systems are a key building block of data centers. With the advance of geo-replicated storage systems that span data centers, both system scale and user population keep growing. A single metadata server in a distributed file system can then become a capacity bottleneck and, because it ignores locality, a source of high latency. In this paper, we present the design and implementation of MRFS (Metadata Replication File System), a distributed file system with hierarchical, efficient distributed metadata management that introduces multiple metadata servers (MDSs) and an additional namespace server (NS). Metadata is divided into non-overlapping parts, each stored on the MDS where the creation operation was issued, while namespace and directory information is maintained in the NS. This hierarchical design achieves high scalability and also provides low latency, because a majority of requests are satisfied by the local MDS. To address hotspots and flash crowds, the system supports flexible, configurable metadata replication among MDSs. Evaluation results show that MRFS is effective and efficient, and that the replication mechanism substantially increases the share of locally served accesses at an affordable memory overhead under various scenarios.
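The division of labor described in the abstract can be illustrated with a small, purely hypothetical Python sketch. It is not the MRFS implementation: the class names, the `replicate_after` threshold, and the count-based replication trigger are all assumptions made for illustration, and the sketch ignores consistency between an owner and its replicas. It only shows the general idea that metadata lives on the MDS where the file was created, the NS records which MDS that is, and a frequently requested remote entry may be copied to the requesting MDS so that later lookups are served locally.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class NamespaceServer:
    """Hypothetical NS: records which MDS owns the metadata of each path."""
    home: dict = field(default_factory=dict)  # path -> owning MDS id

    def register(self, path, mds_id):
        self.home[path] = mds_id

    def locate(self, path):
        return self.home[path]


@dataclass(eq=False)
class MetadataServer:
    """Hypothetical MDS: owns metadata created here, may cache remote entries."""
    mds_id: str
    ns: NamespaceServer
    peers: dict = field(default_factory=dict)        # MDS id -> MetadataServer
    owned: dict = field(default_factory=dict)        # path -> metadata created here
    replicas: dict = field(default_factory=dict)     # path -> copy of remote metadata
    remote_hits: dict = field(default_factory=dict)  # path -> remote lookup count
    replicate_after: int = 2  # assumed threshold, not taken from the paper

    def create(self, path, meta):
        # Metadata stays on the MDS where the creation was issued;
        # only the path-to-owner mapping goes to the NS.
        self.owned[path] = meta
        self.ns.register(path, self.mds_id)

    def lookup(self, path):
        """Return (metadata, served_locally)."""
        if path in self.owned:
            return self.owned[path], True
        if path in self.replicas:
            return self.replicas[path], True
        # Local miss: ask the NS for the owner, then fetch from that MDS.
        owner = self.peers[self.ns.locate(path)]
        meta = owner.owned[path]
        self.remote_hits[path] = self.remote_hits.get(path, 0) + 1
        if self.remote_hits[path] >= self.replicate_after:
            # Assumed policy: replicate once the entry looks hot from here.
            self.replicas[path] = dict(meta)
        return meta, False
```

In this toy model, an MDS that repeatedly looks up a path owned elsewhere pays the remote cost until the assumed threshold is reached, after which the lookup is answered from its own replica. The memory held in `replicas` stands in for the replication overhead the abstract refers to.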
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Yu, J., Wu, W., Yang, D., Huang, N. (2014). MRFS: A Distributed Files System with Geo-replicated Metadata. In: Sun, Xh., et al. Algorithms and Architectures for Parallel Processing. ICA3PP 2014. Lecture Notes in Computer Science, vol 8631. Springer, Cham. https://doi.org/10.1007/978-3-319-11194-0_21
DOI: https://doi.org/10.1007/978-3-319-11194-0_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11193-3
Online ISBN: 978-3-319-11194-0
eBook Packages: Computer Science, Computer Science (R0)