A Review on P2P File System Based on IPFS for Concurrency Control in Hadoop

  • Conference paper
Emerging Technologies in Data Mining and Information Security

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1286)

Abstract

The InterPlanetary File System (IPFS) protocol provides a distributed file system for storing and sharing files in a peer-to-peer network. In a distributed file system, files can be stored on multiple central servers and accessed by remote clients that hold the proper authorization rights within the network. Nowadays, the amount of data generated each minute is huge, and the same data can be accessed by multiple users. This makes the data difficult to manage, access and process, and ultimately leads to concurrency control issues. Concurrency control is the process of managing simultaneous operations on a shared database while ensuring serializability for multiple users. Thus, in the context of Hadoop, if multiple clients want to write updated data to HDFS, a protocol must be followed to ensure that a write by one client does not affect the computation performed by another client. This paper explains how multiple users can access the same file without distorting its content. It also provides a theoretical solution to the concurrency control problem in Hadoop: implementing Hadoop's Java-based FileSystem interface for the decentralised, peer-to-peer file system IPFS. The proposed interface would allow Hadoop MapReduce functions to run directly on data files hosted on IPFS.
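
The idea that several clients can work on the same file without distorting its content can be illustrated with content addressing, the naming scheme IPFS uses for data. The following is a minimal sketch under stated assumptions, not the interface proposed in the paper and not the IPFS API: `ContentStore` is a hypothetical in-memory store, and a plain SHA-256 hex digest stands in for a real IPFS content identifier.

```python
import hashlib


class ContentStore:
    """Toy in-memory content-addressed store (hypothetical, not the IPFS API)."""

    def __init__(self):
        self._blocks = {}  # content id -> immutable bytes

    def put(self, data: bytes) -> str:
        # The identifier is derived from the content itself, so identical
        # content always maps to the same id and stored blocks never change.
        cid = hashlib.sha256(data).hexdigest()
        self._blocks.setdefault(cid, bytes(data))
        return cid

    def get(self, cid: str) -> bytes:
        return self._blocks[cid]


store = ContentStore()
base = store.put(b"alpha,1\nbeta,2\n")

# Two clients update the same base file independently. Each write creates
# a new object under a new id instead of overwriting the original.
cid_a = store.put(store.get(base) + b"gamma,3\n")
cid_b = store.put(store.get(base) + b"delta,4\n")
```

In this sketch a reader that holds the identifier `base` keeps seeing the original bytes regardless of what other clients write, which is the property the abstract describes. Deciding which of the two new versions becomes the current one is a separate problem that the sketch does not address.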

Author information

Corresponding author

Correspondence to Shashank Srivastava.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Sethi, J., Srivastava, S., Upadhyay, D. (2021). A Review on P2P File System Based on IPFS for Concurrency Control in Hadoop. In: Hassanien, A.E., Bhattacharyya, S., Chakrabati, S., Bhattacharya, A., Dutta, S. (eds) Emerging Technologies in Data Mining and Information Security. Advances in Intelligent Systems and Computing, vol 1286. Springer, Singapore. https://doi.org/10.1007/978-981-15-9927-9_83
