1 Introduction

Blockchain gained popularity after Bitcoin emerged in 2009, introduced by a person or group of people under the pseudonym Satoshi Nakamoto. Blockchain is often described as the new face of Industry 4.0. A blockchain [1] may be thought of as a ledger that is distributed, transparent, peer-to-peer, and consensus-based. Its adoption across diverse applications demonstrates the value of decentralization and changes how we approach system development. Many conventional concerns with data storage, such as security and privacy failures, can be mitigated by storing data in a decentralized way [2, 3]. Blockchain has redefined security by enabling “trustless transactions” that can be carried out securely without the intervention of a trusted third party. Ethereum is one of the most popular public blockchain-based platforms. Data is stored cryptographically in the blockchain, which also allows easy interaction with the client side. Data storage is a major concern in the era of IoT and other upcoming technologies. Data held in the cloud raises security concerns, and a blockchain-based approach to storage is needed to secure such indispensable data.

A blockchain can have a human-readable naming system [4] that provides ownership, readability, and decentralization. A blockchain is a decentralized P2P architecture in which transactions are stored in the form of blocks [5]. These blocks have a world state and a blockchain state. “Blocks” on the blockchain are made up of digital pieces of information. Specifically, they have three parts:

  i. Blocks store information about transactions [6, 7], such as the date, time, and dollar amount of your most recent purchase from Amazon. (This Amazon example is for illustrative purposes; Amazon retail does not work on a blockchain principle.)

  ii. Blocks store information about who is participating in transactions [6]. A block for your splurge purchase from Amazon would record you and Amazon.com, Inc. as the parties. Instead of your actual name, however, your purchase is recorded without any identifying information, using a unique “digital signature” that works somewhat like a username.

  iii. Blocks store information that distinguishes them from other blocks. Much as you and I have names to distinguish us from one another, each block stores a unique code called a “hash” that allows us to tell it apart from every other block [5,6,7]. Suppose you made your splurge purchase on Amazon but, while it is in transit, you decide you cannot resist and need a second one. Even though the details of your new transaction would look nearly identical to those of your earlier purchase, we can still tell the blocks apart because of their unique codes (Fig. 1).

Fig. 1 A structure of a data block in blockchain framework
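The three parts of a block listed above can be illustrated with a minimal Python sketch. It is not the block format of any particular blockchain: the `Block` class, its field names, the `prev_hash` link, and the sample purchase values are all assumptions made for illustration, and `signer` is a plain pseudonymous string standing in for a real digital signature.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class Block:
    """Toy block holding the three parts described in the text (hypothetical layout)."""
    transactions: list  # (i) what happened: date, time, amount
    signer: str         # (ii) who took part, as a pseudonymous identifier
    prev_hash: str      # assumed link to the preceding block
    block_hash: str = field(init=False)  # (iii) the block's unique code

    def __post_init__(self):
        self.block_hash = self.compute_hash()

    def compute_hash(self) -> str:
        # Serialize deterministically so the same content always gives the same hash.
        payload = json.dumps(
            {
                "transactions": self.transactions,
                "signer": self.signer,
                "prev_hash": self.prev_hash,
            },
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Two purchases with identical details (illustrative values only).
purchase = [{"date": "2020-01-15", "time": "10:30", "amount_usd": 49.99}]
first = Block(purchase, signer="user-7f3a", prev_hash="0" * 64)
second = Block(purchase, signer="user-7f3a", prev_hash=first.block_hash)
```

In this sketch the two blocks carry the same transaction details, yet their hashes differ because each hash also covers the hash of the preceding block.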

2 Motivation

The prime motivation behind this work is securing data in the cloud in a blockchain fashion. Combining cloud storage with blockchain makes the data more secure and reliable. The data must be distributed in a way that ensures decentralization along with scalability, availability, and reliability.

3 Objectives of the Present Work

The main objective of this research is to use blockchain to solve the problem of data security in the cloud. The proposed model uses two complementary techniques, sharding and swarming, to secure cloud data [8, 9]. In sharding, data is split horizontally, and each partition is kept together so that it can be easily retrieved by a decentralized and distributed application using a unique partition key. The collective storage of shards is called swarming. Together, these two techniques help in securing, storing, auditing, scaling, and decentralizing data in the cloud.

4 Enabling Techniques

4.1 Sharding

Blockchain and cryptocurrency developers are trying to improve transaction speed in several ways, and sharding is one of them. Sharding is a solution to scalability, latency, and transaction throughput issues [5]. It is a concept widely used in databases to make them faster and more efficient. When it is implemented in a blockchain, each block will hold only a part of the data rather than the entire information. Decentralization is still maintained because the nodes that maintain a shard hold information only on that shard, in a shared manner.
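As a rough illustration of horizontal splitting by partition key, the following Python sketch assigns whole records to shards with a stable hash. The function names `shard_for` and `partition`, the `account` key, and the sample records are hypothetical; production blockchains and databases use their own shard-assignment rules.

```python
import hashlib
from collections import defaultdict


def shard_for(partition_key: str, num_shards: int) -> int:
    """Map a partition key to a shard index using a stable hash."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(records, key_field: str, num_shards: int):
    """Split records horizontally: each whole record lands in exactly one shard."""
    shards = defaultdict(list)
    for record in records:
        shards[shard_for(record[key_field], num_shards)].append(record)
    return shards


# Illustrative records; "account" plays the role of the partition key.
records = [
    {"account": "alice", "amount": 5},
    {"account": "bob", "amount": 7},
    {"account": "alice", "amount": 2},
]
shards = partition(records, key_field="account", num_shards=4)
```

Because the shard index depends only on the partition key, all records for one key stay together, and a node responsible for a shard needs to hold only that shard's records.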

4.2 Swarming

The collective storage of shards is called swarming. Swarm [9] provides developers with a platform and infrastructure for various purposes, such as data streaming. This technique helps in securing, storing, auditing, scaling, and decentralizing data in the cloud.

5 Related Work

Hepp et al. [10] presented a novel method to safeguard physical assets using craquelure lacquers as physical unclonable functions (PUFs) to ensure data privacy. The Origin-stamp service, which provides open trusted timestamping, is used to achieve integrity. Whenever physical products are linked with a blockchain, scalability becomes a requirement. To increase the scalability of the blockchain, two complementary techniques, swarming and sharding, are used.

Phansalkar [11] mainly focuses on integrating two technologies, AI and blockchain. The decentralization of blockchain drives AI to add value in terms of security, trust, and efficiency. The paper also discusses the intrinsic differences between AI and blockchain. Finally, it discusses where the two technologies can be used together to get better results, such as in the financial, healthcare, and government sectors. Various challenges arise whenever traditional AI becomes distributed or decentralized AI.

Shafagh et al. [3] introduce the concept of blockchain-secured cloud data. IoT data in the cloud needs high-level security for sharing, auditing, and similar operations. Blockchain provides decentralization, distribution, and trustworthiness for IoT cloud-centric data. The authors proposed a layered architecture for a distributed data storage system for IoT, consisting of a control plane (blockchain, virtualchain) and a data plane (routing, storage).

Shrestha et al. [12] proposed a model for meta products that uses decentralized data storage in the cloud to ensure privacy and trustworthiness. Traditional models use a cloud model (SaaS) for the data storage and processing of meta products. This type of model has many limitations; for example, a specific product can use only a specific platform. With the integration of blockchain, a single platform that serves different users and developers with different requirements has been proposed.

Liang et al. [13] implemented ProvChain, an architecture to collect and verify cloud data provenance. Drawing on the features of blockchain technology, each record carries an unalterable timestamp, and a blockchain receipt can be generated for each validation of the data. This architecture helps in auditing, privacy, and availability.

Kaaniche et al. [14] combined a hierarchical identity-based cryptography mechanism with blockchain technology. This architecture provides a transparent and trusted environment that helps service providers obtain tamper-proof evidence of receiving the user’s consent before processing their personal data. It also ensures better confidentiality.

Eyal et al. [15] present a new blockchain protocol, Bitcoin-NG, to overcome the lack of scalability in Bitcoin-derived blockchains. It provides the same trust model as Bitcoin in a Byzantine fault-tolerant blockchain protocol that is robust to extreme churn.

Chauhan et al. [16] discussed how the number of transactions is increasing every day while miners are becoming a bottleneck, so waiting times are growing. To address this, Bitcoin developers proposed the Lightning protocol. Similarly, Ethereum developers proposed sharding to address the same problem (Table 1).

Table 1 Classification of literature survey based on work performed

6 Proposed Model

Data of various forms that needs high security is grouped into a data block. The data is then divided into small chunks called shards. The shards [8] are individually encrypted and hashed, after which the data is ready to be inserted into and synced with the blockchain ledger. Then, using a particular algorithm, the data chunks are distributed and grouped into swarms [9]. Each swarm now consists of arbitrary data. To secure the data, we do not use a single cloud service provider to store all the swarms; rather, each swarm is hosted by a different cloud service provider. Finally, the data is distributed across various clouds, and since control does not rest within a single cloud, decentralization is also preserved (Fig. 2).

Fig. 2 Proposed system architecture for data storage
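The flow of the proposed model (split, encrypt, hash, record in a ledger, distribute across providers) can be sketched in Python as follows. This is only an illustration under several assumptions: the paper does not specify the distribution algorithm, so round-robin placement is used here; the `ledger` list is an in-memory stand-in for the blockchain ledger; the provider names are hypothetical; and `toy_encrypt` is a placeholder XOR keystream that is not secure and would be replaced by a vetted cipher in practice.

```python
import hashlib


def split_into_shards(data: bytes, shard_size: int) -> list:
    """Divide the data block into fixed-size chunks (shards)."""
    return [data[i:i + shard_size] for i in range(0, len(data), shard_size)]


def toy_encrypt(shard: bytes, key: bytes) -> bytes:
    """Placeholder XOR keystream. NOT secure; applying it twice decrypts."""
    stream = b""
    counter = 0
    while len(stream) < len(shard):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(shard, stream))


def store(data: bytes, key: bytes, shard_size: int, providers: list):
    """Encrypt and hash each shard, log it, and place it with a provider."""
    ledger = []  # stand-in for the blockchain ledger
    clouds = {name: {} for name in providers}  # one swarm per provider
    for index, shard in enumerate(split_into_shards(data, shard_size)):
        sealed = toy_encrypt(shard, key + index.to_bytes(8, "big"))
        digest = hashlib.sha256(sealed).hexdigest()
        provider = providers[index % len(providers)]  # round-robin (assumption)
        clouds[provider][digest] = sealed
        ledger.append({"index": index, "hash": digest, "provider": provider})
    return ledger, clouds


def retrieve(ledger: list, clouds: dict, key: bytes) -> bytes:
    """Fetch shards in ledger order, verify each hash, then decrypt and join."""
    parts = []
    for entry in ledger:
        sealed = clouds[entry["provider"]][entry["hash"]]
        if hashlib.sha256(sealed).hexdigest() != entry["hash"]:
            raise ValueError("shard failed integrity check")
        parts.append(toy_encrypt(sealed, key + entry["index"].to_bytes(8, "big")))
    return b"".join(parts)


data = b"sensor reading " * 10
providers = ["cloud-a", "cloud-b", "cloud-c"]  # hypothetical provider names
ledger, clouds = store(data, b"secret", shard_size=16, providers=providers)
restored = retrieve(ledger, clouds, b"secret")
```

In this sketch no single provider holds all the shards, and any shard altered at a provider fails the hash check against the ledger entry during retrieval.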

7 Concluding Remarks and Future Scope

The proposed system architecture is a novel approach to data storage, availability, and scalability with the aid of blockchain technology. Furthermore, sharding and swarming working together provide a future research direction for applying blockchain technology to the secure storage of data such as IoT sensor data and healthcare data. The cloud storage facilities currently available do not provide fully adequate security mechanisms, and thus the use of blockchain is becoming a necessity in an era where data is indispensable. Research on swarming algorithms that ensure both decentralization and distribution of data is gaining popularity, and this paper presents a novel approach to the challenges faced in securing data.