
1 Introduction

The industry sector consumes over a third of global energy [1], making increased energy and resource efficiency of the sector an important lever to counter climate change. Analysis of production data is an effective means to improve these efficiencies. Researchers and industry practitioners state that most problems in realizing successful production data analytics arise at the start of the data pipeline, namely in data acquisition and pre-processing [2,3,4]. In their recent publication on data-driven energy savings, Teng et al. point out that, despite this, academic research on data acquisition and pre-processing is limited, and that most efforts have focused on modelling and analysis [5]. Without suitable data available, analysis for energy and resource savings cannot be performed. Making the data available for analysis includes identifying which variables need to be measured, how to measure them, how to transmit the data, how to aggregate multiple measurements, and how to store the data in a usable form and location. A thorough systematic literature review of this topic was not found among existing publications. Thus, the motivation of this literature review is to identify what this limited research consists of, and what gaps remain.

Objective of this literature review: Determine the current state of research regarding data acquisition and pre-processing for enabling energy and resource efficient manufacturing. Questions that will be addressed:
Question 1) What types of subjects are examined? E.g., industries, company size, specific processes, etc.
Question 2) What types of variables are measured? E.g., electricity consumption, water consumption, CO2 emissions, etc.
Question 3) In what settings is the data acquired? E.g., controlled setting, live production setting, from database, etc.
Question 4) How is the data applied once obtained and processed? E.g., basic monitoring, advanced analytics, database of records, etc.

2 Methodology

Since the authors’ goal was to assess all available relevant literature on the topic, a descriptive review was conducted. In their information systems article on transparent literature reviews, Templier et al. emphasize that reproducibility of method is critical for ensuring the trustworthiness of a review [6]. Thus, PRISMA, one of the most widely used systematic methodologies, was used [7]. Due to the page limitation of this conference paper, the authors plan to elaborate on their methodology and results in a separate journal article. Only a brief summary of the methodology follows below.

The following criteria were defined for literature: type (journal articles and books), language (English and German), and publication status (published and manuscript). ScienceDirect, Scopus and Web of Science were the databases used. As an example, an excerpt of the query used in Scopus is shown below:

TITLE-ABS-KEY ((manufactur* OR production) AND ("energy efficiency" OR "energy saving*" OR "resource efficiency" OR "resource saving*") AND ("data acquisition" OR "data collection" OR "data *processing" OR "data availability"))
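To make the structure of such a query easier to reproduce, the following is a minimal sketch of how it could be assembled from its three term groups: terms within a group are joined with OR, and the groups are joined with AND. The terms are taken from the excerpt above; the function name `build_scopus_query` and the variable names are hypothetical and are not part of the authors' method.

```python
# Term groups copied from the query excerpt; everything else is illustrative.
TERM_GROUPS = [
    ["manufactur*", "production"],
    ['"energy efficiency"', '"energy saving*"',
     '"resource efficiency"', '"resource saving*"'],
    ['"data acquisition"', '"data collection"',
     '"data *processing"', '"data availability"'],
]


def build_scopus_query(groups, field="TITLE-ABS-KEY"):
    """OR the terms inside each group, AND the groups, wrap in the field code."""
    clauses = ["(" + " OR ".join(terms) + ")" for terms in groups]
    return f"{field} (" + " AND ".join(clauses) + ")"


query = build_scopus_query(TERM_GROUPS)
print(query)
```

Keeping the term groups as data rather than as one long string makes it straightforward to document which synonyms were used for each concept, which supports the reproducibility aim stated above.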

3 Results

Following the methodology described above, studies were selected as shown in Fig. 1.

Fig. 1. PRISMA results of each phase of the systematic literature review

The results are structured along the questions that the review set out to address:

What Types of Subjects are Examined?

43% of the papers concerned discrete manufacturing environments, 35% process manufacturing and 22% both. This is notable, because discrete manufacturing industries (e.g. machinery, electronics, automotive) are often not as energy intensive as process manufacturing industries (e.g. metals, paper, chemicals), which would imply a higher motivation to optimize energy consumption in process industries [8]. This should be investigated more closely, but a potential explanation could be that discrete manufacturing environments tend to have more machines, creating more dispersed and complex data, which makes data acquisition and pre-processing a more pressing research topic. Additionally, processes in process manufacturing often have to be monitored more closely, so the issue of data acquisition may have already been addressed extensively in the past, which lowers the need for new research in this area. Small- and medium-sized enterprises (SME) were only mentioned in three papers. Of these, only one considered SME requirements in the development of its methodology, while the others conducted case studies with SMEs.

What Types of Variables are Measured?

Electricity consumption is the most commonly measured variable, as shown in Table 1. In 79% of the papers, at least one further variable is considered, with auxiliary inputs or material flow through the manufacturing process being the most common. Only 35% of papers developed architectures to acquire and process three or more types of variables. This lack of multiple data sources is in line with the findings of Abele et al., who state “not many works develop integration methods of complex data sets from multiple sources” even though this could enable further efficiency gains [9].

Table 1. Variable types measured

In What Settings is the Data Acquired?

As shown in Table 2, 23 papers (55% of the literature) address continuous data acquisition, as this setting is needed for (near) real-time monitoring and meaningful predictive analytics. In this paper, continuous measurement refers to live production data acquisition on an ongoing basis, as opposed to only acquiring data once. The remaining literature is based on one-time analysis or data from databases, which are both settings with limited applicability beyond historical analysis.

Table 2. Comparison of number of papers per data acquisition method and data granularity

Granularity is defined in Table 2 as the manufacturing level at which the data is collected. Continuous measurement and one-time analysis in live settings mostly provide medium (machine level) granularity data, and occasionally high (component level) granularity data. Zhang et al. and Hu et al. point out that obtaining high granularity real-time data is a challenge [10, 11]. The literature review results indicate that one-time analyses in an experimental environment are used to collect high granularity data, but less often to collect lower granularity data. Woo et al. raise the issue that models based on empirical one-time data are, however, less reliable than those continuously using historical and real-time data [12]. Databases primarily provide only low (enterprise level) granularity data, and are mostly used for life cycle assessment analysis. Across all data acquisition methods, measurements were done at the component level in 35% of the papers, at the machine level in 52% and at the enterprise level in 13%. Very few papers covered multiple levels, even though research shows that “the extension of an analytical approach to the process and plant levels with multiple machine tools” can lead to further energy savings [13]. Few papers give reasoning for the specific data acquisition architecture, and most describe standalone solutions without significant integration into existing systems or platforms.
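The relationship between the three granularity levels can be illustrated with a minimal sketch: readings taken at the component level can be summed to the machine level and further to the enterprise level, whereas an enterprise-level total cannot be decomposed back into its components. The machine and component names and the energy values below are purely illustrative assumptions and do not stem from any of the reviewed papers.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    machine: str       # hypothetical machine identifier
    component: str     # hypothetical component identifier
    energy_kwh: float  # energy consumed in one measurement interval


def roll_up(readings):
    """Aggregate component-level energy to machine and enterprise level."""
    per_machine = defaultdict(float)
    for r in readings:
        per_machine[r.machine] += r.energy_kwh
    enterprise_total = sum(per_machine.values())
    return dict(per_machine), enterprise_total


# Illustrative high-granularity (component level) data.
readings = [
    Reading("mill_1", "spindle", 2.0),
    Reading("mill_1", "coolant_pump", 0.5),
    Reading("lathe_1", "spindle", 1.5),
]
per_machine, total = roll_up(readings)
print(per_machine, total)
```

The sketch assumes that all readings cover the same time interval and use the same unit; in practice, these are exactly the points that a data acquisition architecture has to ensure.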

How is the Data Applied Once Obtained and Processed?

As shown in Table 3, the majority of the literature considers data for (near) real-time monitoring or historical analysis. However, a trend can be seen towards predictive analytics for energy and resource efficiency, especially in the past three years. A challenge this trend faces is the above-mentioned difficulty of obtaining high granularity data from continuous measurement, which is needed for flexible predictive models. As Woo et al. explain, high granularity data “leads to precise and flexible modeling because it can decompose and re-compose models dynamically in terms of stratification.” For example, with a dynamic model, unlike with a model based on a smaller data set, the energy consumption of a machine tool could be predicted even when the product geometry is changed [12]. Data is used to build a database of energy-related KPIs primarily in the context of life cycle assessment or inventory studies. Such data is most useful for assessing energy efficiency across multiple enterprises, for example along a supply chain.

Table 3. Comparison of number of papers per data acquisition method and application of data

4 Discussion

After evaluation of the results, the following four research gaps were identified:

Research Gap 1: Few papers implement a methodology across all machines in the factory, enabling analysis from machine, to process, to plant level. Typically, the analyses are limited to one machine or one process, or only consider the plant in aggregate. However, as demonstrated, for example, by Kang et al. and Bevilacqua et al. in their integrated machine data analytics approaches, integrating data across these levels can provide valuable insights [13, 14]. Diaz et al. report various analyses that can be done with data from each manufacturing level, and highlight that few studies aim to address data collection to enable analysis across all levels [15]. This integration across levels understandably increases the complexity of the required data acquisition methodology and data architecture, which may be a reason it is rarely attempted. The variety of measurements and data types, especially the spatiotemporal properties [16], increases as more components and machines are included in the analysis, which increases the difficulty of aggregating the data. A robust architecture is required to manage this data, and these added complexities are likely difficult to address with a single methodology. These challenges should be investigated, and more widely applicable solutions should be developed.
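One concrete aspect of this aggregation difficulty is that measurements from different levels are often sampled at different intervals. The following minimal sketch shows one possible way to bring such series onto a common time base by averaging within fixed windows and keeping only the windows covered by every source. The signal names, timestamps and values are illustrative assumptions, and window averaging is only one of several possible alignment strategies, not a method prescribed by the reviewed literature.

```python
from collections import defaultdict


def window_mean(samples, window_s):
    """Average (timestamp_s, value) samples within fixed windows of window_s seconds."""
    buckets = defaultdict(list)
    for t, value in samples:
        buckets[int(t // window_s) * window_s].append(value)
    return {start: sum(vals) / len(vals) for start, vals in buckets.items()}


def align(sources, window_s):
    """Join sources on the windows for which every source has data."""
    averaged = {name: window_mean(s, window_s) for name, s in sources.items()}
    common = set.intersection(*(set(a) for a in averaged.values()))
    return {
        start: {name: averaged[name][start] for name in averaged}
        for start in sorted(common)
    }


# Hypothetical signals: a component sampled every 10 s, a machine every 60 s.
sources = {
    "spindle_power_kw": [(0, 4.0), (10, 6.0), (20, 5.0), (30, 5.0), (60, 3.0), (70, 5.0)],
    "machine_power_kw": [(0, 9.0), (60, 8.0)],
}
aligned = align(sources, window_s=60)
print(aligned)
```

Averaging discards the short-term dynamics of the faster signal, so the choice of window length is itself a trade-off between comparability across levels and retained granularity.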

Research Gap 2: Data beyond electricity consumption is rarely incorporated in data acquisition and processing methodologies. At the same time, reasoning for which data is and is not included is often lacking. There are a variety of possible reasons for selecting or omitting certain data, such as technical feasibility of the data acquisition, importance of the data regarding total energy and resource consumption, or external reporting regulations, to name a few. As shown in this literature review, many studies focus solely on electricity consumption. Though this can be a fair prioritization, there are further metrics for energy and resource consumption that can be relevant, as Mani et al. list in their paper on sustainability characterization of manufacturing processes. They mention other secondary energy sources such as fuels, as well as primary sources of the electricity consumed, as noteworthy when measuring energy consumption. They list water, material input and waste, among others, as relevant metrics when measuring resource consumption [17]. Hence, it would be valuable to research which data, especially data other than electricity consumption, can provide valuable insights in which situations, and how this data can be acquired and processed for analysis.

Research Gap 3: SME are mentioned in only three of the studies (roughly 7%). SME production environments typically differ from those of large enterprises in that they have older machines and less IT infrastructure. Compared to large enterprises, SME face additional challenges in achieving data-driven energy savings, including a lack of staff to focus on energy efficiency, small budgets and the need for a short return on investment [18], as well as less advanced IT and IoT infrastructure. Rao et al., for example, show a systematic way for companies, especially SME, to assess their sub-metering needs and prioritize investments in retrofit electricity sensors [18]. Approaches for data acquisition and pre-processing for energy and resource improvements should likely be differentiated for SME, and this requires further investigation.

Research Gap 4: Methodologies are rarely tested in multiple settings, and tend to be designed for specific factory and machining processes. This makes it unclear how widely applicable a given methodology actually is. Thus, existing methodologies should be tested in more settings to prove their applicability, or the limits of their scope should be clearly defined.

5 Conclusion

This literature review fulfills the original objective of determining the current state of research regarding data acquisition and pre-processing for enabling energy and resource efficient manufacturing. Discrete manufacturing has received more attention than process manufacturing when it comes to data acquisition and pre-processing methodology. Typically only one or two variables are measured, namely electricity consumption and material flow. Continuous measurement of machine level data is most commonly the subject of study. Data is most often used for (near) real-time monitoring or for historical analysis, to find opportunities for improving energy efficiency. However, collecting (near) real-time energy consumption data at high granularity remains a challenge.

The primary limitation of this study is that database searches were limited to data acquisition and pre-processing methods within the context of energy and resource efficiency. However, methods outside of the energy and resource context could very well be relevant and applied or adapted to the energy and resource context. Initial searches resulted in a very high number of returned documents, going beyond the scope of this conference paper. Thus, in the future, the authors aim to expand this literature review in a journal paper as described above.

Of the multiple gaps identified, the authors will prioritize in their next research endeavors the lack of data acquisition methods that are applicable across manufacturing levels and different manufacturing set-ups. In conclusion, data acquisition and pre-processing for sustainable manufacturing is a growing field in which several challenges remain to be addressed by future research.