1 Introduction

In the last two decades, the Internet has grown significantly, and this growth is largely responsible for the volume and speed of data generation [1]. Currently, most actions performed on the Internet generate data that are analyzed to identify customer preferences and behavior patterns, evaluate trends, and even detect potential crises and fraud [2]. The importance of data generation and its applications increases when one considers the estimate that data volume would grow to around 40 zettabytes by 2020 [2, 3]. Among the possibilities for extracting value from this large amount of data (structured or not), one prominent approach is to identify existing patterns in databases through the most frequently used information. Another is for companies to create and store data and obtain detailed information across a range of areas, such as inventory forecasts and demand prospects over the coming months, and then use that information to make better decisions and improve organizational performance [4].

The term Big Data is used to define a large and complex data set for which traditional database techniques, tools, and software are no longer efficient. Therefore, the scale, diversity, and complexity of these data require new techniques, architectures, and algorithms for their management and analysis, allowing easier extraction of value and knowledge.

Big Data tools and techniques help to extract value and useful information for better decision making in a wide range of areas (Table 1).

Table 1 Different areas of Big Data applications

1.1 Theoretical Framework

According to Chen [5], the use of data as information is not recent. Since the ‘50s, the administration and information technology areas have used this concept and information systems. The first research was reported in the ‘70s, giving rise to data processing methods. In the ‘90s, software emerged and was used to analyze this amount of data [6].

Big Data started to be used more frequently in the 2000s, with the expansion of new software, processing techniques, storage, and data transmission [6]. Chen [5], Davenport [7], and their collaborators emphasized that the emergence of new technologies and the spread of the Internet and global e-commerce marked the beginning of a new era of data generation, with data transmitted over this worldwide network. The authors mentioned examples of actions on the Internet that began to drive a steady and growing stream of data, such as patterns of web browsing, clicks, online commerce, and content that users themselves generate on social media, Web sites, blogs, and platforms. Thus, the term Big Data does not refer to a wholly new concept but to the updating of several technologies, tools, and techniques that already existed; this union provides a clearer understanding of Big Data today [6].

A. The 5 V’s of Big Data

There is no well-established consensus on the definition of the term Big Data, although it is generally understood as a large amount of data that is generated daily. The first definitions rest on three pillars: volume, velocity, and variety [8]. Other authors, however, define Big Data using five pillars (the 5 V’s of Big Data): volume, velocity, veracity, variety, and value.

Volume: It is related to a large amount of data that is generated daily around the world, coming from different sources [9].

Velocity: The speed at which data is generated [10].

Variety: The diversity of data from many sources, whether structured or unstructured [11].

Veracity: Data authenticity is essential for organizations [12]. It is related to the quality of the information available for decision making [9].

Value: The value that influences decision making based on data analytics [9].

To systematize the V’s that define Big Data, a timeline was created relating publication dates, definitions, and the number of V’s each definition uses (Fig. 1).

Fig. 1 Big Data evolution timeline [27–31]

Although most authors define only the five pillars (volume, velocity, variety, veracity, and value), other authors add more. For example, Ali-ud-din Khan [9] added two other pillars: validity and volatility. Sahoo [13] and collaborators highlight two others: visualization and variability. Some recent blog posts even attempt to define the term through 10 V’s, but these lack academic recognition.

2 Methodology

This research takes a qualitative approach with an explanatory purpose: it analyzes how the definition of Big Data has changed over time and describes the potential of Big Data applications in different areas of engineering.

A literature review was conducted of national and international academic publications retrieved from the databases Google Scholar, ScienceDirect, and ISI Web of Science.

The search was performed with the keywords “Big Data” + “engineering area,” recording the number of publications year by year, from the 2000s to the present date.
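The keyword search described above can be sketched in a few lines of Python. This is only an illustration of how the query phrases and year range could be enumerated; the paper does not publish a search script, so the function name `build_queries`, the exact phrase format, and the end year of 2019 (taken from the search date reported in Sect. 3.2) are assumptions.

```python
# Hypothetical sketch: enumerate the (query phrase, year) pairs implied by
# the search strategy. Names and phrase format are illustrative assumptions.
AREAS = [
    "civil", "electrical", "manufacturing", "mechanical",
    "materials", "chemical", "software",
]


def build_queries(areas, first_year=2000, last_year=2019):
    """Return one (query phrase, year) pair per engineering area and year."""
    queries = []
    for area in areas:
        phrase = f'"Big Data" + "{area} engineering"'
        for year in range(first_year, last_year + 1):
            queries.append((phrase, year))
    return queries


queries = build_queries(AREAS)
print(len(queries))   # number of area/year combinations to look up
print(queries[0])     # first phrase and year
```

Each pair would then be submitted to the chosen database by hand or through whatever interface the database offers, and the reported hit count recorded against that area and year.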

3 Analysis and Discussion

3.1 Big Data Applications in Engineering Areas

This study presents some Big Data investigations and applications in seven engineering areas (civil, electrical, manufacturing, mechanical, materials, chemical, and software).

A. Civil Engineering

Liang Wang [14] reports a new vehicle detection and tracking system based on image data collected by unmanned aerial vehicles (UAV). This system uses consecutively captured frames to generate dynamic vehicle information: position and speed.

This system can be used by traffic management centers in “smart cities” to locate accidents and monitor highways in real time, among other uses, as well as in construction areas, search and rescue applications, structural inspection, and health monitoring; with some modifications, it can also be used to track animals in the jungle.
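To make the idea of deriving speed from consecutive frames concrete, the following minimal sketch computes speed as distance travelled divided by the time between frames. It is not the method of Wang [14]; it assumes, for illustration only, that vehicle positions have already been detected and converted to ground coordinates in metres, and the function name `estimate_speeds` is hypothetical.

```python
import math


def estimate_speeds(positions, frame_interval_s):
    """Speed between consecutive frames: distance travelled / frame interval.

    positions: list of (x, y) ground coordinates in metres (assumed input).
    frame_interval_s: time between two consecutive frames, in seconds.
    """
    speeds = []
    for (x0, y0), (x1, y1) in zip(positions, positions[1:]):
        distance = math.hypot(x1 - x0, y1 - y0)
        speeds.append(distance / frame_interval_s)
    return speeds


# Illustrative track: three detections of one vehicle, 0.5 s apart.
track = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
speeds = estimate_speeds(track, frame_interval_s=0.5)
print(speeds)  # [10.0, 10.0] metres per second
```

A real system would also need to compensate for the motion of the UAV itself and to associate detections across frames, which this sketch leaves out.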

Rathore [15] presents an IoT-based system for smart city development and urban planning using Big Data analytics. The system uses different types of real-time sensors (weather sensors, water sensors, and intelligent parking), surveillance cameras, and emergency buttons on the streets, among other devices. Its implementation is structured in stages: data collection, filtering, classification, preprocessing, processing, and decision making. The sensors generate large amounts of data at high speeds, which are processed with the Hadoop and MapReduce frameworks to ensure system scalability and efficiency. The system benefits not only citizens but also authorities, by providing information that makes decision making faster and more efficient.
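The map-and-reduce style of processing mentioned above can be illustrated with a small, self-contained sketch. It is a plain-Python stand-in for the programming model, not Hadoop code and not the implementation of Rathore [15]; the sensor types, readings, and function names are invented for illustration.

```python
from collections import defaultdict

# Illustrative readings: (sensor type, value). None marks a faulty reading.
readings = [
    ("weather", 21.5), ("water", 3.2), ("weather", None),
    ("parking", 1.0), ("weather", 22.5), ("parking", 0.0),
]


def map_phase(records):
    """Filter out invalid readings and emit (key, value) pairs."""
    for sensor_type, value in records:
        if value is not None:
            yield sensor_type, value


def shuffle(pairs):
    """Group emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce each group to a single summary value (here, the mean)."""
    return {key: sum(values) / len(values) for key, values in groups.items()}


averages = reduce_phase(shuffle(map_phase(readings)))
print(averages)
```

In a distributed framework the map and reduce functions run in parallel on many machines, which is what provides the scalability the text refers to; the per-sensor averages here would feed the decision-making stage.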

B. Electrical Engineering

Chou [16] reports a Big Data analysis framework for smart grids and the components of an energy-saving decision support system. The framework analyzes electricity consumption data in real time, identifies consumption patterns, and predicts energy consumption, in addition to providing optimal operation schedules for appliances. For the experimental simulations, the measurement infrastructure was installed in a residential building.

Lee [17] reports the development of smart charging infrastructure for electric cars using Big Data tools. The structure is being implemented in Jeju Province, South Korea, where the authors present the distribution of chargers and the number of chargers in operation. The goal is to combine various statistical and machine learning techniques to identify the charging demand pattern of electric vehicles and integrate renewable energy to power charging networks.

C. Manufacturing Engineering

Zhang [18] presents a proposal for a general Big Data-based analysis architecture for product lifecycle management (PLM). This architecture integrates Big Data analytics and service-driven standards that aid decision making, coordination, and optimization of the Cleaner Production (CP) process. By applying IoT technology and concepts at every stage of PLM, an intelligent manufacturing and maintenance environment can be developed.

On the other hand, Lee [19] discusses the possibilities and trends of integrating advanced data analytics into manufacturing, products, and services, linking Big Data, the advancement of Information and Communication Technologies (ICT), and the fact that cyber-physical systems (CPS) facilitate the systematic transformation of massive data into information and assist in decision making. In addition, the authors present a systematic architecture called 5C that encompasses the steps required to fully integrate CPS into manufacturing.

D. Mechanical Engineering

Bumblauskas [20] reports a smart maintenance decision support system (SMDSS) based on a company’s corporate data. The system is able to provide the user with recommendations for improving asset life cycles. The authors illustrate how Big Data analytical tools can be used to prioritize equipment maintenance and define the architecture in which a physical asset owner can enter usage parameters for interconnected equipment and receive a comprehensive proposal for the service to meet the recommendations generated by the analytical model.

Fernandes [21] presents the results of a study in a metallurgy company where data analysis and resource selection methods were employed from continuous equipment monitoring. Machine learning and data mining techniques were used to extract information and assist in decision making. The information gained will assist in the development of adaptive learning models capable of handling complex information that can be applied across a complete line of industrial products and equipment.

E. Materials Engineering

Lu [22] provides an overview and gives examples of research progress toward the discovery of new materials and classifies this research area as Materials 4.0. The work also highlights the development of machine learning protocols and discusses quantum material property simulation software, digitized material data, and intelligent machine learning algorithms, among others. The author emphasizes the considerable reduction in time between concept and commercialization of new products and also discusses recovery at the end of the product life cycle. The challenge remains aggregating data from multiple sources, as well as managing and analyzing unstructured data, to enable Materials 4.0. The author further recommends that web-based material research platforms be developed to explore opportunities and identify gaps.

F. Chemical Engineering

May [23] presents the possibilities of advanced multidimensional separations in mass spectrometry (MS) using Big Data tools, where hybrid analytical instrumentation based on MS has been a powerful technique for meeting the challenges in science and medicine, including helping in drug discovery and synthetic biology. The paper highlights the possibility of large-scale measurements to obtain information and also highlights the challenges of Big Data in chemical analysis: from the enumeration of chemical isomers to this field of multidimensional analysis based on MS.

Chiang [24] shows recent advances in some areas such as the chemical and pharmaceutical industry, among others. The objectives of this paper are to educate the chemical engineering community about the capacity of Big Data and to improve reliability and operational efficiency in various industry sectors, including directing future research. The study also points out that 88% of chemical executives recognize that data analysis will be crucial to maintaining a competitive advantage over a 5-year interval.

G. Software Engineering

Arndt [25] explores how software engineering (SE) technology can support the development of Big Data projects, and how Big Data techniques can be used to develop new processes and evolve SE techniques. The paper points out that a lot of research has been done in recent years to improve the production of Big Data systems and to ensure fast and elastic scaling whenever and wherever needed. One of the main research focuses in applying SE methods to the production of Big Data systems is the development of software architectures.

According to Casale [26], software engineering as a discipline must be consolidated, because despite its impressive achievements, it is one of the youngest scientific disciplines. Currently, researchers in software engineering have been distinguished by the high quality of software and the adoption of controlled practices in development and operation.

3.2 Publications with the Term Big Data by Area

The search for the term “Big Data” + “engineering area” was carried out in Google Scholar on 27 September 2019 for each area studied in this work, covering the years 2000 to 2019, and produced the graph in Fig. 2.

Fig. 2 Number of publications (×1000) by engineering area

The graph shows the trend of publications in each engineering area studied in this work for the same period. The largest number of publications was found for the terms “Big Data” and “software engineering,” with 826,000 publications in the period. The lowest number among those surveyed was for “Big Data” and “civil engineering,” with 86,000 publications.

4 Conclusions

This research presented a brief overview that, in addition to the evolution of publications over a period of 20 years, highlighted some applications in civil, electrical, manufacturing, mechanical, materials, chemical, and software engineering.

Most of the cases used to exemplify Big Data applications in engineering highlight IoT, decision making, or intelligent sensing.

The results of this research demonstrate that some engineering areas, as expected, are of greater interest. But in any case, the results highlight applications that may help in future engineering research.