1 Related Study

Big data was originally characterized by the 3 V’s (volume, variety and velocity), a list that has since grown to 7 V’s, and it refers to data whose representation cannot be confined to conventional systems. The term “Big Data” was coined in the 1970s but rose to prominence only in 2008. Big data denotes a dataset whose size is beyond the capability of traditional databases to record, store, manage and analyze [1]. There is no universally accepted threshold for how large a dataset must be to be classified as big data. Data volumes are in the range of petabytes (10^15 bytes), exabytes (10^18 bytes) and beyond [2]. Data are currently created, collected and arranged at the exabyte scale every year; creation and aggregation are accelerating, however, and volumes are expected to approach the zettabyte scale (10^21 bytes) in the coming years. A review of big data challenges and issues for data-centric applications has characterized them in terms of the type and significance of the information retrieved [3]. Big data has also extended to, and proven useful in, many further areas, including cloud computing, the Internet of Things (IoT), social networks and healthcare applications. Big data has brought advances in information management and in handling challenges such as big data analysis, knowledge diversity, knowledge extraction and reduction, and data integration and cleansing, together with several tools for analysis and mining [4]. As business spaces develop, there is a need to rework the monetary framework and to rethink the connections among producers, merchants and customers of goods and services [6].
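The orders of magnitude quoted for data volumes can be checked with a short calculation. The following is a minimal sketch, assuming decimal (SI) byte units; the names UNITS and to_unit are illustrative and do not come from the cited literature.

```python
# Decimal (SI) byte units referred to in the text (assumed, not binary units).
UNITS = {
    "petabyte": 10**15,
    "exabyte": 10**18,
    "zettabyte": 10**21,
}


def to_unit(num_bytes: int, unit: str) -> float:
    """Express a byte count in the given unit (hypothetical helper)."""
    return num_bytes / UNITS[unit]


if __name__ == "__main__":
    # One exabyte expressed in petabytes, and one zettabyte in exabytes.
    print(to_unit(UNITS["exabyte"], "petabyte"))
    print(to_unit(UNITS["zettabyte"], "exabyte"))
```

Each step between the three units is a factor of one thousand, which is why a move from exabyte-scale to zettabyte-scale volumes represents a thousandfold increase.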

1.1 Big Data Challenges and Issues

The most crucial part of big data is the information itself. The appearance of any threat indicates a need for security and protection during data transmission or storage. Classical security solutions are insufficient to ensure security and privacy for big data. The main security and privacy problems with big data, namely confidentiality, integrity, visualization and information privacy, are therefore discussed below on the basis of the literature.

1.1.1 Confidentiality

Confidentiality is a key measure for handling sensitive data, in particular the ability to store and process data while assuring the confidentiality of the collected data. Confidentiality means imposing restrictions on information to protect it against illegitimate disclosure.

1.1.2 Integrity

Data integrity gives assurance that information is not changed by an unauthorized user. Packet sniffing, password attacks, phishing and pharming, data diddling, man-in-the-middle attacks and session hijacking are the best-known attacks in which integrity is compromised. Integrity also involves data provenance, data trustworthiness, data loss and data deduplication.
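As an illustration of how unauthorized modification can be detected and how duplicate records can be identified, the sketch below uses a keyed hash (HMAC-SHA256) for tamper detection and a content hash (SHA-256) for deduplication. This is one common approach offered as an assumption, not a method taken from the reviewed literature; the function names make_tag, verify and deduplicate are hypothetical.

```python
import hashlib
import hmac


def make_tag(key: bytes, data: bytes) -> str:
    """Compute an HMAC-SHA256 tag over the data with a shared secret key."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def verify(key: bytes, data: bytes, tag: str) -> bool:
    """Return True only if the data still matches the tag (constant-time compare)."""
    return hmac.compare_digest(make_tag(key, data), tag)


def deduplicate(records: list[bytes]) -> list[bytes]:
    """Keep the first occurrence of each record, comparing by SHA-256 digest."""
    seen = set()
    unique = []
    for record in records:
        digest = hashlib.sha256(record).digest()
        if digest not in seen:
            seen.add(digest)
            unique.append(record)
    return unique


if __name__ == "__main__":
    key = b"example-shared-secret"  # illustrative key only
    tag = make_tag(key, b"patient-record-1")
    print(verify(key, b"patient-record-1", tag))  # unmodified data
    print(verify(key, b"patient-record-2", tag))  # altered data
    print(deduplicate([b"a", b"b", b"a"]))
```

A keyed hash is used for the integrity check because a plain hash can be recomputed by anyone who alters the data, whereas the tag here cannot be forged without the secret key.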

1.1.3 Visualization

Visualization provides a graphical representation of data that makes outcomes easy to understand and interpret. This technique helps decision makers to access, evaluate, analyze, comprehend and act on real-time data.

1.1.4 Security and Privacy

Big data contains huge amounts of personal and interpersonal data, and securing this private data at such volume is therefore the greatest challenge (Table 1).

Table 1. Big data challenges and issues

1.2 Big Data Analytics Methods

(See Table 2).

Table 2. Big data analytical methods

2 Open Research Challenges in Big Data

2.1 Security and Privacy

Many techniques have been developed to protect personal and private information using security protocols. Still, the infrastructure-based aspect of data privacy and management remains an issue to be resolved.

2.2 Data Fusion and Visualization

Research has been carried out on assessing the behavior of particular individuals and groups and on pattern identification. Efficient storage and collection of information are required, along with solutions to spatio-temporal problems.

2.3 Cloud Computing

On-demand services not only improve the availability of resources in the cloud environment but also reduce cost. Efficient management of data, covering storage and processing together with resource utilization, is one aspect to be considered.

2.4 Other Social Media Related Challenges

Challenges such as online social networks, data integration, and operations targeting the spread of rumors and fake news represent the current state of work in big data.

3 Conclusion

Big data is a new data analysis platform that handles multidimensional information on a large scale for data discovery and decision making. Big data technologies are widely used for data exploitation, supported by large-scale computing infrastructure and social big data analytics tools, in many areas ranging from business intelligence to scientific exploration. This paper reviews big data challenges, issues and methods in the literature, focusing on the processes and tools that will shape next-generation computing with big data. The paper also highlights current open research challenges in the field of big data.