1 Introduction

Considerable research has been carried out to improve the energy efficiency of refrigeration compressors, driven by market competitiveness and environmental factors, given that most of the energy consumption in a vapor compression refrigeration system is associated with the compressor. As a result, compressor testing has become an important daily activity for both research and development and product quality assurance [3].

One of the most important variables to obtain in compressor performance evaluation is the cooling capacity (CC). There are several tests capable of providing the CC of a compressor. These tests are regulated by international standards, such as ANSI/ASHRAE 23 [2], DIN EN 13771-1 [5], and ISO 917 [10], to allow the comparison of results between different manufacturers. Even though ISO 917 is currently withdrawn, it is still used as a consolidated standard in the compressor industry. Despite the differences between the standards (mainly the allowable measurement uncertainty and operating limits), they all require the measurement to occur under steady-state conditions in special refrigeration circuits. Among the methods provided in the standards, the most widely used in industry is the calorimeter, presented in Flesch and Normey-Rico [7]. The main problem with this method is the long transient, which takes a few hours and results in long test times [12]. This duration directly impacts the production chain, since sample tests must be completed before products are released to customers, and batches are sometimes dispatched late because the results cannot be generated in a timely manner.

Previous studies aimed to reduce the duration of performance tests using two approaches: improving the controllers of the test rig, which reduced test duration to approximately 2 h [7, 16], and using artificial neural networks (ANNs) to identify the time instant at which the measured variables reach steady state [1, 13]. However, even with these advances, the total test time is still on the order of 1 h. Because the production cycle is generally much faster than traditional tests, this makes the method unfit for production line quality control of every compressor produced and limits the tests to a few samples. This problem could be solved by using multiple calorimeters in parallel to keep up with the production line; however, the cost of around US$200,000 per rig makes this solution unfeasible in practice. As a result, manufacturers tend to use other tests for quality control and quick response on the production line, such as the pressure rise test (PRT) described in Coral et al. [3]. Among other quantities, it measures the pressure rise rate (PRR) within a vessel of known volume, which is a quantity proportional to the mass flow rate generated by the compressor.

Coral et al. [4] proposed the use of ANNs to estimate the CC based on the PRR and other measured quantities during the PRT, along with a combination of the Monte Carlo method with bootstrap aggregating to express the inference of the uncertainties of ANNs. In Pacheco et al. [11], a similar approach was taken, but the method for expressing inference uncertainty was improved. Using the PRT combined with ANNs, it was possible to reduce the time for inferring the CC to a few seconds.

Despite significantly reducing the test time, the production line test rig was not designed to evaluate performance, particularly due to the lack of laboratory conditions for testing and the large measurement uncertainty of the instruments used. The data previously used in Coral et al. [4] and Pacheco et al. [11] rely on three main variables: the PRR, the electrical power consumed, and the compressor shell temperature, with measurement uncertainties of \(\pm {15}\,\hbox {kPa} \hbox {s}^{-1}\), \(\pm {6}\,\hbox {W}\), and \(\pm {6}\,^{\circ }\hbox {C}\), respectively. In addition to these two factors, the quality of the estimates provided by the PRT is affected by the temperature measurement. A study presented in Vitor et al. [18] shows that using motor winding resistance to estimate temperature for this purpose is better than using the compressor shell temperature, given the lower uncertainty and the faster dynamic response of the winding resistance for changes in temperature caused during the test.

In this paper, a more reliable method to estimate the CC is proposed. A better result for the inference of CC is obtained by combining a new ANN model and a test rig in laboratory conditions, with a greater number of controlled and monitored variables, which are measured with smaller uncertainty. In addition, different models of compressors were used for training the neural model, in contrast to Pacheco et al. [11], in which a CC estimate was made for several examples of only one compressor model. The new measurement rig seeks to achieve CC uncertainty values closer to those of the traditional rigs described in international standards, while reducing testing time from a few hours to a couple of minutes, in order to increase the percentage of samples used for product quality assurance in each production batch.

The main novelties of this study are:

  • development of a more reliable method to estimate CC based on data measured in quick quality assurance tests;

  • proposition of a test rig which is able to increase measurement sampling for quality assurance tests by about 2000 times over traditional methods;

  • experimental study considering several compressor models of different CC values.

This paper is organized in five sections. Details about the traditional tests to measure CC, the PRR test, and the correlation between CC and PRR are presented in Sect. 2. The proposed test rig and the neural model used for CC inference are presented in Sect. 3. The results of a case study of tests carried out in a laboratory environment are presented in Sect. 4. Finally, in Sect. 5, the results are summarized and the conclusions of the work are presented.

2 Problem description

In this section, tests performed on compressors are presented, encompassing both the traditional tests described in international standards and tests that measure PRR. Additionally, the correlation between CC and PRR obtained from both types of tests is discussed.

2.1 Traditional tests

Cooling capacity is a measure of the capability of a compressor to produce mass flow rate of a refrigerant fluid for given suction and discharge pressures. It is described by:

$$\begin{aligned} \phi _{0} = q_\textrm{mf} \frac{V_{ga}}{V_{gl}}(h_{g1} - h_{f1}), \end{aligned}$$
(1)

where \(\phi _{0}\) is the cooling capacity, \(q_\textrm{mf}\) is the mass flow rate of the refrigerant fluid, \(V_{ga}\) is the specific volume of the refrigerant vapor at the suction inlet, \(V_{gl}\) is the specific volume of refrigerant vapor at the suction conditions corresponding to the specified test conditions, \(h_{g1}\) is the specific enthalpy of the refrigerant entering the compressor at the specified test conditions, and \(h_{f1}\) is the specific enthalpy of the refrigerant liquid at saturation temperature corresponding to compressor discharge pressure specified in the test conditions [10].
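As a small numerical illustration, Eq. (1) can be evaluated directly once the mass flow rate and the refrigerant properties are known. The sketch below is only a worked example: the function name and the input values are hypothetical and are not data from this study.

```python
def cooling_capacity(q_mf, v_ga, v_gl, h_g1, h_f1):
    """Evaluate Eq. (1): phi_0 = q_mf * (V_ga / V_gl) * (h_g1 - h_f1).

    q_mf        mass flow rate, kg/s
    v_ga, v_gl  specific volumes, m^3/kg (measured and specified suction conditions)
    h_g1, h_f1  specific enthalpies, J/kg
    Returns the cooling capacity in W.
    """
    return q_mf * (v_ga / v_gl) * (h_g1 - h_f1)


# Illustrative values only (assumed, not measured data).
phi_0 = cooling_capacity(q_mf=1.0e-3, v_ga=0.21, v_gl=0.20,
                         h_g1=400e3, h_f1=250e3)
print(f"cooling capacity: {phi_0:.1f} W")
```

The ratio of specific volumes corrects the measured mass flow rate to the suction conditions specified for the test, so it equals one when the test runs exactly at those conditions.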

The standard ISO 917 [10] presents nine different methods to estimate the CC. The simplest method for determining mass flow rate in a refrigeration circuit is to measure it directly using mass flow meters. The standard determines that two different methods must be used, and their results must not deviate by more than 4%. To use those methods, special test rigs must be composed of a refrigeration circuit, specialized instrumentation, controllers, and data acquisition systems, costing hundreds of thousands of U.S. dollars. As an example, Flesch and Normey-Rico [7] described a test rig, with the piping and instrumentation diagram presented in Fig. 1, which involves both the measurement of mass flow rate in the liquid phase and the calorimetry method in the evaporator. This rig is the state of the art for measuring cooling capacity, and yet it takes about 2 h to perform the test. No significant improvement in test time using regular calorimeter test rigs has been observed in recent years, so it is expected that improvement based solely on control methods has reached an economically viable limit. In this context, novel test topologies that reduce test time gain particular attention, although their results are not obtained according to the methods described in the international standards currently in use.

Fig. 1 Calorimeter piping and instrumentation diagram

2.2 Pressure rise tests

The PRT is a common way of assessing hermetic compressor quality in manufacturing lines. It evaluates the ability of the compressor to pressurize a vessel. The total test duration is typically less than 7 s, making it possible to evaluate each compressor produced in a factory with high production flow. In the typical test configuration, as shown in Fig. 2, the suction terminal of the unit under test is open to the environment and the discharge terminal is coupled to a pressure vessel of known volume. The pressure vessel is connected to an external pressure line, which has two basic functions: to keep the discharge pressure constant and to raise it to a preset value depending on the test phase. The test consists of cycling different pressures applied to the known volume. Its main steps are shown in Fig. 3, which illustrates a typical discharge pressure profile for the test cycle.

Fig. 2 PRT piping and instrumentation diagram

Fig. 3 Typical discharge pressure profile for PRT cycles

At the time instant \(t_0\) the compressor motor is started. The discharge pressure rises until it reaches a steady-state value \(P_1\) at \(t_1\). In the time interval between \(t_1\) and \(t_2\), the active electrical power consumed is measured. At \(t_2\), the external pressurization line begins to raise the discharge pressure up to a new level \(P_2\), and the vessel is sealed at \(t_3\), so that the compressor continues the pressure rise in the vessel by itself up to the time instant \(t_4\). The PRR is measured from \(t_3\) to \(t_4\). The time required between \(t_0\) and \(t_4\) is usually less than 5 s.
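The text does not state how the PRR is extracted from the sampled pressure signal. One plausible way, shown here purely as an assumption, is to take the least-squares slope of pressure against time inside the window from \(t_3\) to \(t_4\). The function name and the synthetic data are hypothetical.

```python
def pressure_rise_rate(times, pressures, t_start, t_end):
    """Least-squares slope of pressure vs. time for samples in [t_start, t_end].

    Assumption: the PRR is approximated by a straight-line fit over the
    measurement window; the source does not specify the estimator.
    Units follow the inputs (e.g. kPa and s give kPa/s).
    """
    window = [(t, p) for t, p in zip(times, pressures) if t_start <= t <= t_end]
    if len(window) < 2:
        raise ValueError("need at least two samples inside the window")
    n = len(window)
    t_mean = sum(t for t, _ in window) / n
    p_mean = sum(p for _, p in window) / n
    s_tt = sum((t - t_mean) ** 2 for t, _ in window)
    if s_tt == 0.0:
        raise ValueError("all samples share the same time stamp")
    s_tp = sum((t - t_mean) * (p - p_mean) for t, p in window)
    return s_tp / s_tt


# Synthetic ramp of 150 kPa/s starting at 200 kPa (illustrative only).
times = [i * 0.1 for i in range(11)]
pressures = [200.0 + 150.0 * t for t in times]
prr = pressure_rise_rate(times, pressures, t_start=0.3, t_end=0.8)
```

A fit over the whole window is less sensitive to sensor noise than a two-point difference, at the cost of assuming a roughly constant rate during the interval.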

The PRT is usually the last stage of a compressor assembly line, which can have many topologies. In some situations, the measurement is done right after the drying of the shell paint in a stove and in this case the variability of temperatures can be quite large, since the drying is usually done in batches and there are different times for each compressor to reach the test station after the drying process is finished. However, some of the parameters obtained in the test are strongly affected by the compressor temperature, as can be seen in Fig. 4, for PRR and active electrical power consumed (consumption). This happens because changes in the compressor internal temperature modify the density of the refrigerant fluid at the inlet of the compression cylinder [17], which translates directly as a change in the mass flow rate, given that the volumetric flow rate is constant [6]. In addition, the winding resistance of the induction motor associated with the compressor changes with temperature, which affects the electrical power consumed [8]. Owing to the hermetic characteristics of the compressors, the shell temperature is measured and used to compensate part of those changes, since the temperature of the compressor motor winding cannot be measured directly, as discussed in Coral et al. [4].

Fig. 4 Impact of the compressor shell temperature on different variables for three samples of the same model: a PRR; b Consumption

2.3 Correlation between CC and PRR

Both the CC and the PRR depend on the compressor capability to generate mass flow. The relationship between these parameters was shown in Pacheco et al. [11] and can be expressed by:

$$\begin{aligned} \phi _{0}= & {} \frac{PVM}{ZRT} \left[ P_{rr} \left( \frac{1}{P}-\frac{1}{Z}\frac{\partial Z}{\partial P} \right) \right. \nonumber \\{} & {} \quad \left. -\frac{\partial T}{\partial t}\left( \frac{1}{T}+\frac{1}{Z}\frac{\partial Z}{\partial T} \right) \right] \frac{V_{ga}}{V_{gl}}(h_{g1} - h_{f1}), \end{aligned}$$
(2)

where P is the gas pressure, V is the volume, M is the molar mass, Z is the compressibility factor, R is the universal gas constant, and T is the absolute temperature. The relation presented in (2) highlights the nonlinearity between the quantities, which arises because P, T, and \(V_\text {ga}\) are not constant during the test, and Z also depends on P and T. The temperature increases over time as a result of the compression process and the heat exchange between the compressor parts and the refrigerant fluid. In addition, the initial vessel temperature is not controlled, and the temperature is not homogeneous inside the vessel.
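To make the structure of Eq. (2) concrete, the sketch below evaluates it term by term. It is a minimal illustration under stated assumptions: the function names are hypothetical, the partial derivatives of Z default to zero (ideal-gas behavior), and the numerical inputs are invented for the example. With Z = 1 and constant temperature the bracket reduces to \(P_{rr}/P\), so the mass flow rate becomes \(VMP_{rr}/(RT)\), which is the proportionality between PRR and mass flow mentioned in the text.

```python
R_UNIVERSAL = 8.314462618  # universal gas constant, J/(mol K)


def mass_flow_from_prr(p, v, m_molar, z, t, prr,
                       dT_dt=0.0, dZ_dP=0.0, dZ_dT=0.0):
    """Mass flow term of Eq. (2), in kg/s.

    p in Pa, v in m^3, m_molar in kg/mol, t in K, prr in Pa/s, dT_dt in K/s.
    dZ_dP and dZ_dT default to zero, i.e. an ideal-gas assumption.
    """
    lead = p * v * m_molar / (z * R_UNIVERSAL * t)
    pressure_term = prr * (1.0 / p - dZ_dP / z)
    temperature_term = dT_dt * (1.0 / t + dZ_dT / z)
    return lead * (pressure_term - temperature_term)


def cooling_capacity_from_prr(mass_flow, v_ga, v_gl, h_g1, h_f1):
    """Multiply the mass flow by the remaining factors of Eq. (2), giving W."""
    return mass_flow * (v_ga / v_gl) * (h_g1 - h_f1)


# Illustrative inputs only: hypothetical 0.1 L vessel, R134a molar mass.
q = mass_flow_from_prr(p=500e3, v=1.0e-4, m_molar=0.10203, z=1.0,
                       t=300.0, prr=150e3)
```

A positive temperature derivative lowers the inferred mass flow for the same PRR, since part of the pressure rise then comes from heating of the gas rather than from added mass.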

While a mixed empirical-analytical model could potentially address the problem at hand, developing and solving the resulting model would be time-consuming and complicated. Given the high level of complexity required to estimate CC, ANNs offer a viable solution. These networks are recommended for modeling complex nonlinear tasks, where mathematical models of real physical phenomena are difficult or impossible to obtain [9, 15]. Moreover, given the dependency on P and T, the use of low-uncertainty instrumentation along with controlled conditions reduces the variation in these quantities when they are used in a regression tool.

3 Experimental analysis

Taking into account the PRT and its limitations regarding the test time and measured variables, we propose a new test rig that improves the correlation between CC and PRR. The proposed test is not as quick as the PRT, but it is considerably quicker than calorimeter tests and provides a better CC measurement confidence in relation to the PRT.

3.1 Proposed test rig

The proposed test rig is an upgrade of the rig presented in Sect. 2.2. The new rig, called laboratory pressure rise rig (LPRR), operates in a laboratory environment, where parameters such as temperature are more constant. Because the test does not depend on the production cycle time, it can be conducted without an external pressurization line. For the test proposed in this work, the pressure rise is performed by the compressor from the value used for measurement of the electric consumption up to a predefined maximum pressure. This should make it possible to characterize the pressure rise behavior over a wider range of pressures, and of all compressors over the same range, independently of their pressure rise rates. The piping and instrumentation diagram of the proposed LPRR is presented in Fig. 5.

Fig. 5 LPRR piping and instrumentation diagram

The proposed test rig has the following additional measurements when compared to the PRT rig used in the production line: suction temperature, discharge vessel temperature, compressor ambient temperature, ohmic resistance of the motor winding, suction pressure, and motor angular speed. Among the new monitored variables, a control loop was designed for the suction pressure to remain constant at 100 kPa in all tests, ensuring uniformity of this parameter. The compressor ambient temperature also remained at constant values due to the fact that the tests were carried out in a laboratory environment, where the temperature was controlled. The discharge pressure transducer was replaced by one with lower uncertainty. In addition, the compressor shell temperature measurement, done with an infrared meter in the PRT, was done in the LPRR with a Pt100, which has lower measurement uncertainty.

Since the motor winding temperature measurement in hermetic compressors is difficult, its temperature was not measured directly, but using the resistance value, which changes with temperature and impacts the electrical quantities directly. By measuring the motor winding resistance both before and after the test, it is possible to account for the heating that occurs during the PRT due to the starting current and other losses observed during regular operation. Since the pressure rise phase is still quite short, the compressor shell does not reflect noticeable changes in the temperature. Tests using instrumented compressors on the PRT show that the winding temperature rises more than 2 \(^{\circ }\hbox {C}\) during the test, while the increase in the temperature on the compressor shell is practically imperceptible. For the resistance measurement, the LPRR makes use of the winding resistance measurement device (WRMD) proposed in Vitor et al. [18]. This device consists of a combination of relays and electronic circuits that makes it possible to switch between powering the compressor under test or measuring the main winding resistance using the 4-wire method, thus achieving low measurement uncertainty. The uncertainties for the variables measured in the LPRR are shown in Table 1 for a coverage probability of 95%.
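The conversion from winding resistance to temperature is not spelled out in the text. A common approach for copper windings, used here only as an assumption, is the linear resistance-temperature relation \(T_2 = (R_2/R_1)(k + T_1) - k\), with \(k \approx 234.5\,^{\circ }\hbox {C}\) for copper. The function name and the resistance values below are hypothetical.

```python
def winding_temperature(r_before, r_after, t_before_c, k=234.5):
    """Estimate the winding temperature (deg C) after the test.

    Assumption: copper winding with a linear resistance-temperature law,
    T2 = (R2 / R1) * (k + T1) - k, where k = 234.5 deg C for copper.
    r_before and r_after are in ohms; t_before_c is the winding temperature
    at the first measurement, taken as the stabilized ambient temperature.
    """
    return (r_after / r_before) * (k + t_before_c) - k


# Illustrative values: a 1% resistance increase starting from 25 deg C.
t_after = winding_temperature(r_before=10.0, r_after=10.1, t_before_c=25.0)
rise = t_after - 25.0  # about 2.6 deg C
```

Under this relation a temperature rise of a few degrees corresponds to a resistance change of roughly one percent, which is why a low-uncertainty 4-wire measurement matters.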

Table 1 Uncertainties of the LPRR measured variables

The LPRR has an electromechanical system to control the suction and discharge pressures of the compressor under test. The pneumatic circuit is designed to maintain constant pressures during the consumption measurement. After that, it keeps the suction pressure constant while the compressor raises the discharge pressure. The new test routine is presented in Fig. 6. The main difference compared to the regular test used in the production line is that LPRR does not have an external pressurization line, as the proposed test is not required to fit into production cycle time. Another aspect is that in the line test, a time interval is defined a priori for the compressor to raise the internal pressure of the vessel, which for low capacity compressors occasionally does not allow the discharge pressure to reach sufficient values to properly characterize the behavior of the compressor. In the proposed test, the value for \(t_4\) is not defined a priori and the compressor is kept on until the pressure \(P_2\) is reached.

Fig. 6 Typical discharge pressure profile for proposed test rig cycles

Before the compressor is turned on, the initial winding resistance is measured between \(t_0\) and \(t_1\) using the WRMD. After the compressor starts, at \(t_1\), the pressure rises and reaches a steady-state value of \(P_1\) (200 kPa) at \(t_2\), which is controlled using an automatic valve. During the time interval from \(t_2\) to \(t_3\), the pressure is in steady state, so the measurement of the compressor consumption takes place. At \(t_3\), the vessel is sealed, and the compressor begins to raise the discharge pressure up to a new level \(P_2\) (around 1600 kPa), which is reached at \(t_4\). The PRR is measured from \(t_3\) to \(t_4\). After the PRR measurement is finished, the compressor is switched off at \(t_4\) and the final winding resistance measurement using the WRMD takes place. As this happens, the pressure in the vessel is released. The time required between \(t_0\) and \(t_4\) is about 60 s.

With the tests carried out on the proposed rig, it was possible to analyze the improvement of the correlation between PRR and CC when compared with the results of the production line rig. Figure 7 illustrates the improvement in the correlation for a single compressor model, measured using Pearson’s linear correlation coefficient, which increased from 0.240 [11] to 0.750. The result indicates that it was possible to improve the correlation by using the proposed test rig, as discussed in the beginning of Sect. 3.
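For reference, Pearson's linear correlation coefficient used in this comparison can be computed from paired PRR and CC values as sketched below. The function name and the sample data are hypothetical and are not the measurements behind Fig. 7.

```python
import math


def pearson_r(xs, ys):
    """Pearson's linear correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    if s_xx == 0.0 or s_yy == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    return s_xy / math.sqrt(s_xx * s_yy)


# Illustrative pairs of (PRR, CC); not data from this study.
prr_values = [140.0, 145.0, 150.0, 155.0, 160.0]
cc_values = [171.0, 176.0, 174.0, 181.0, 183.0]
r = pearson_r(prr_values, cc_values)
```

The coefficient only captures the linear part of the relationship, so a moderate value does not rule out a stronger nonlinear dependence that a regression model could still exploit.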

Fig. 7 Correlation between PRR and CC with the proposed rig

3.2 Artificial neural network

With the data obtained from LPRR and calorimeter tests, it is possible to develop a regression tool for CC inference. Given the complexity of the mathematical modeling of the physical phenomenon, supervised training using ANNs was chosen, motivated by the results in Coral et al. [4] and Pacheco et al. [11]. The data obtained in both tests are given by:

$$\begin{aligned} \mathscr {F} = \left\{ (\textbf{x}_j,y_j) \right\} _{j=1}^{n} \text {,} \end{aligned}$$
(3)

where \(\textbf{x}_j\in \mathbb {R}^r\) contains the r variables measured in the LPRR, \(y_j\) is the CC value obtained in a calorimeter for the same compressor used to obtain \(\textbf{x}_j\), and n is the sample size. The complete data set \(\mathscr {F}\) was divided into a training set S, with \(n_S\) elements, and a test set \(\mathscr {T}\), with \(n_\mathscr {T}\) elements. The desired modeling must consider a nonlinear input–output mapping, when finding a function f that satisfies the inequality:

$$\begin{aligned} \frac{1}{n_S} \sum _{j = 1}^{n_S} (\text {f}(\textbf{x}_j) - y_j)^2 \le \varepsilon , \end{aligned}$$
(4)

where \(\text {f}(\textbf{x}_j)\) is the CC inference based on LPRR results in \(\textbf{x}_j\) and \(\varepsilon\) is a small positive number, serving as the upper bound for the mean squared error between the measured CC values and the inferences. If the training sample size \(n_S\) is sufficiently large and the ANN has a suitable number of free parameters, it is possible to reduce the approximation error to a small enough value for the problem [9]. In this problem, the multilayer perceptron (MLP) topology was used to represent the nonlinear input–output mapping.

In the training process of the MLP, different initial conditions can lead to different results for the same training set [11]. The effects of the random components in the result can be reduced by combining different MLPs with the same goal in an ensemble. In this case, the results can be expressed as the arithmetic mean of the outputs, as:

$$\begin{aligned} \hat{y}_j = \left( \sum _{i = 1}^{n_u} \text {f}_i (\textbf{x}_j) \right) /n_u\text {,} \end{aligned}$$
(5)

where \(\hat{y}_j\) is the arithmetic mean of the ensemble outputs, \(n_u\) is the number of trained networks in the ensemble, and \(\text {f}_i (\textbf{x}_j)\) is the ith ANN output for input \(\textbf{x}_j\). This method was combined in Pacheco et al. [11] with bootstrap aggregating to randomly select data to build a different training set based on S for each ANN, considering the measurement uncertainty values, aiming to increase the diversity of networks in the ensemble.
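The averaging in Eq. (5) can be sketched as follows. Each trained network is represented here by a plain Python callable, which is an assumption made to keep the example independent of any training library; the three stand-in models are hypothetical.

```python
def ensemble_mean(models, x):
    """Eq. (5): arithmetic mean of the outputs of all networks for input x.

    `models` is a sequence of callables, each mapping an input vector to a
    scalar CC estimate (stand-ins for trained MLPs).
    """
    if not models:
        raise ValueError("the ensemble must contain at least one model")
    outputs = [f(x) for f in models]
    return sum(outputs) / len(outputs)


# Three hypothetical models that disagree slightly around the same trend.
models = [
    lambda x: 2.0 * x[0],
    lambda x: 2.0 * x[0] + 1.0,
    lambda x: 2.0 * x[0] - 1.0,
]
y_hat = ensemble_mean(models, [5.0])  # 10.0
```

Averaging cancels part of the scatter caused by different random initializations, which is the motivation given above for using an ensemble instead of a single network.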

Along with the CC inference, the uncertainty associated with the output of the ANN is important to ensure the metrological reliability, particularly when the ANNs are considered as part of the measurement system. For this purpose, Pacheco et al. [11] proposed the use of bootstrap techniques with Monte Carlo simulations (MCS)—considering only the input uncertainties—during the training phase, to assess the inference uncertainty. The process is repeated \(n_u\) times, with \(n_u\) being the number of ANNs in the ensemble used for assessing the uncertainty. The MCS is applied m times on the input data, with each of the results being used as input for the \(n_u\) ANNs. This results in m vectors \(\hat{\textbf{y}}_i \in \mathbb {R}^{n_u}\), for \(i=1,\ldots ,m\), concatenated as \(\textbf{Y} =[{\hat{\textbf{y}}_{1}}^{\text {T}},{\hat{\textbf{y}}_{2}}^{\text {T}},...,{\hat{\textbf{y}}_{m}}^{\text {T}}]^{\text {T}}\), where \(\textbf{Y} \in \mathbb {R}^{n_um}\). As stated in Coral et al. [4], the frequency distribution of the ensemble outputs is a good approximation of the probability density function (PDF) for \(n_u\) around \(10^3\) and the product \(n_um\) around \(10^6\). Considering a normal distribution, as the PDF is centered on the inference value, the standard uncertainty for the ensemble (\(u_{\text {E}}\)) is the same as the standard deviation of the \(n_um\) values, as:

$$\begin{aligned} u_{\text {E}} = \sqrt{\left( \sum _{i=1}^{n_um} ({Y_i} - \bar{Y})^2 \right) /(n_um-1)}\text {,} \end{aligned}$$
(6)

where \(\bar{Y}\) is the arithmetic mean of the output values in \(\textbf{Y}\) and \({Y}_i\) is the ith element of \(\textbf{Y}\). In this approach, the measurement uncertainty of the target variable is not used during the training phase. This uncertainty contribution is added later, through:

$$\begin{aligned} u_{\text {cI}} = \sqrt{{u_{\text {E}}}^2 + {u_{\text {M}}}^2} {,} \end{aligned}$$
(7)

where \(u_{\text {cI}}\) is the combined standard uncertainty of the inference and \(u_{\text {M}}\) is the standard uncertainty of the target variable, which is the CC in our case. The expanded uncertainty of the inference can be obtained by calculating the shortest interval for the stipulated coverage probability.
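The three steps just described (Eq. (6), Eq. (7), and the shortest coverage interval) can be sketched as below. The function names are hypothetical, and the interval search over sorted samples is one common way of finding the shortest interval, assumed here because the text does not detail the procedure.

```python
import math


def ensemble_standard_uncertainty(values):
    """Eq. (6): sample standard deviation of all ensemble outputs."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))


def combined_standard_uncertainty(u_e, u_m):
    """Eq. (7): root sum of squares of ensemble and target uncertainties."""
    return math.sqrt(u_e ** 2 + u_m ** 2)


def shortest_coverage_interval(values, coverage=0.95):
    """Narrowest interval containing at least `coverage` of the samples.

    Assumption: the interval is searched among windows of consecutive
    sorted samples.
    """
    ordered = sorted(values)
    n = len(ordered)
    k = min(n, max(1, math.ceil(coverage * n)))  # samples inside the interval
    start = min(range(n - k + 1), key=lambda i: ordered[i + k - 1] - ordered[i])
    return ordered[start], ordered[start + k - 1]


u_e = ensemble_standard_uncertainty([1.0, 2.0, 3.0, 4.0, 5.0])
u_ci = combined_standard_uncertainty(3.0, 4.0)  # 5.0
low, high = shortest_coverage_interval([0.0, 10.0, 11.0, 30.0], coverage=0.5)
```

For a symmetric unimodal distribution the shortest interval coincides with the usual symmetric one; it differs when the histogram of ensemble outputs is skewed.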

The activation functions used were all hyperbolic tangents, as in the ANNs in Coral et al. [4] and Pacheco et al. [11]. The training algorithm was Levenberg–Marquardt, considered one of the most efficient algorithms for this kind of problem [14] and the one that achieved the lowest RMSE values for training. The number of neurons in the single hidden layer (the same topology defined in Pacheco et al. [11]) was defined by a grid search method. Committees of 30 MLPs with hidden layer sizes between 1 and 50 neurons were tested. The ANNs were trained with a subset of the training set and evaluated with 8 examples that are in the training set but were not used as inputs for training. The different configurations are compared using the root mean squared error (RMSE), described as:

$$\begin{aligned} \text {RMSE} = \sqrt{\sum _{j=1}^{n_x} \frac{(y_j - \hat{y}_j)^2}{n_x}} \text {,} \end{aligned}$$
(8)

where \(n_x\) is the number of examples used for evaluation, with \(n_x\) assuming values of \(n_u\), \(n_S\), or \(n_\mathscr {T}\), depending upon the application. The result of this evaluation can be seen in Fig. 8, which shows that hidden layers with sizes close to five neurons give the best results when compared with calorimeter tests. Larger numbers of neurons in the hidden layer tend to cause overfitting, given the small number of samples available for training. As a consequence, five neurons were used in the hidden layer.
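The selection step can be illustrated with a short sketch: Eq. (8) is computed for each candidate hidden layer size and the size with the lowest value is kept. The function names and the RMSE values in the example are hypothetical; they only mimic the qualitative shape described for Fig. 8.

```python
import math


def rmse(targets, predictions):
    """Eq. (8): root mean squared error between targets and predictions."""
    if len(targets) != len(predictions) or not targets:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(y - y_hat) ** 2 for y, y_hat in zip(targets, predictions)]
    return math.sqrt(sum(squared) / len(squared))


def best_hidden_size(rmse_by_size):
    """Return the hidden layer size with the lowest RMSE in a grid search."""
    return min(rmse_by_size, key=rmse_by_size.get)


# Hypothetical grid search results (hidden neurons -> RMSE in W).
grid = {1: 3.1, 3: 1.4, 5: 0.9, 10: 1.6, 50: 4.2}
chosen = best_hidden_size(grid)  # 5
```

Evaluating on examples that were withheld from training is what lets the grid search expose overfitting: the error on those examples rises again once the network has more free parameters than the data can support.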

Fig. 8 RMSE of network committee inference compared to calorimeter measurements for different numbers of neurons in the hidden layer

As proposed in Pacheco et al. [11], an ensemble of 30 ANNs was trained to infer the CC of the test set samples, and an ensemble with \(n_u=10^3\) ANNs was trained to estimate the confidence interval (CI) by combining the bootstrap technique with the MCS method. To obtain the CI, the uncertainties are added to the input values of the training set S through MCS, generating a new set \(S^{*}\). From this set \(S^{*}\), a bootstrap replica \(S_i\) is generated with the same number of examples \(n_S\). This process is repeated \(10^3\) times, creating \(n_u=10^3\) bootstrap replicas (\(S_1\) to \(S_{10^3}\)), each with \(n_S\) examples. An MLP network was trained with each of these bootstrap replicas and stored to build a committee, used only for obtaining the CI of the estimates. A pseudocode of the training method is shown in Algorithm 1. With the network committee already trained, each of the examples in the test set \(\mathscr {T}\) has the uncertainties U added to its input variables via the MCS method, with \(m=10^3\) trials. Each realization of a test example with added uncertainty is used as input to all the \(n_u=10^3\) ANNs in the previously trained committee, thus resulting in \(10^6\) predictions for each example in the test set. With the results, it is possible to create a histogram and obtain the CI for a coverage probability of 95%. Finally, to take into account the measurement uncertainty of the targets, it is necessary to combine the calorimeter uncertainties with those obtained using the committee, which is done using (7). A pseudocode of the method used to obtain the CI of the inferences for new data is shown in Algorithm 2.

The inputs used to train the model were: the suction inlet temperature; the motor main winding resistance before and after the test; the initial shell temperature and its mean value; the mean electric consumption; the electric consumption rate; and the PRR. The target for the supervised training was the CC from the calorimetry method. The angular speed and pressure vessel temperature, which are measured quantities added in the proposed test rig, did not provide significant difference in the quality of the CC inferences, so they were not considered by the ANNs. Along with the input uncertainties presented previously in Table 1, which were assumed to have rectangular distributions for a more conservative evaluation, the target CC uncertainty considered was \(\pm 3\%\).
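The construction of one training replica (Monte Carlo perturbation of the inputs followed by bootstrap resampling) can be sketched as below. This is an illustration under assumptions: the inputs are perturbed with rectangular noise whose half-width equals the stated uncertainty, the perturbed set is regenerated for every replica, and the function names and numbers are hypothetical.

```python
import random


def perturb_inputs(x, half_widths, rng):
    """One Monte Carlo realization: add rectangular noise in [-U, +U] per input."""
    return [xi + rng.uniform(-u, u) for xi, u in zip(x, half_widths)]


def bootstrap_replica(training_set, half_widths, rng):
    """Build S* by perturbing every input vector, then resample n_S examples
    from S* with replacement to form one replica S_i.

    `training_set` is a list of (inputs, target) pairs; targets are left
    untouched because only input uncertainties are simulated at this stage.
    """
    s_star = [(perturb_inputs(x, half_widths, rng), y) for x, y in training_set]
    return [rng.choice(s_star) for _ in training_set]


rng = random.Random(0)  # fixed seed so the example is repeatable
training_set = [([1.0, 10.0], 100.0), ([2.0, 20.0], 200.0), ([3.0, 30.0], 300.0)]
half_widths = [0.1, 1.0]  # hypothetical input uncertainties
replicas = [bootstrap_replica(training_set, half_widths, rng) for _ in range(5)]
```

Each replica would then be used to train one network of the committee, so the spread of the committee outputs reflects both the resampling of examples and the input measurement uncertainty.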

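Algorithm 1 Pseudocode of the training method

Algorithm 2 Pseudocode of the method used to obtain the CI of the inferences for new data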

4 Experimental evaluation

The tests for this case study were carried out in a laboratory environment, in order to be independent of production time and to reduce some random effects, such as the shell temperature variability due to drying of the shell paint in a stove at the end of the line. Following the same protocol for testing in a calorimeter after the production line, the compressors were allocated in a place with monitored temperature, and there they remained for a few hours to ensure temperature homogeneity, so that the resistance variation observed is only the one imposed by the test. After this time, the compressors were tested on the LPRR, and then on a calorimeter, generating data from both methods for the supervised training carried out in this work.

In total, 70 tests were carried out with compressors from 6 different models, all of which use R134a as the working fluid. Of these, 50 samples were used for the training set, 12 for the validation set, and 8 for the test set. The training and validation sets were randomly selected and varied for each new network trained in the committee. Only the compressors in the test set were chosen deliberately rather than at random, ensuring that different models and different CC ranges are represented in the test set.

The CC of each sample was inferred with a committee of 30 MLPs using (5). To estimate the CI, a committee of \(10^3\) networks was trained with bootstrap replicas; each sample of the test set had its input variables simulated \(10^3\) times, and each realization was applied to each of the networks, thus forming \(10^6\) results. With this mass of data, a histogram was created, in which it is possible to visualize the distribution of results and find the shortest 95% coverage interval. The histogram obtained for one of the samples in the test set is shown in Fig. 9.
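The generation of the prediction cloud for one test sample can be sketched as follows. The committee is again represented by plain callables and the noise is drawn from a rectangular distribution, both of which are assumptions made for this illustration; the names and values are hypothetical, and the example uses a much smaller committee and number of trials than the \(10^3\) by \(10^3\) configuration of the study.

```python
import random


def monte_carlo_predictions(models, x, half_widths, m, rng):
    """Return m * len(models) predictions for one test sample.

    For each of the m trials, the inputs are perturbed with rectangular noise
    in [-U, +U] and the perturbed vector is fed to every model in the committee.
    """
    predictions = []
    for _ in range(m):
        x_trial = [xi + rng.uniform(-u, u) for xi, u in zip(x, half_widths)]
        predictions.extend(f(x_trial) for f in models)
    return predictions


rng = random.Random(1)  # fixed seed so the example is repeatable
committee = [lambda x: x[0], lambda x: x[0] + 1.0]  # stand-ins for trained MLPs
cloud = monte_carlo_predictions(committee, [10.0], [0.5], m=4, rng=rng)
# len(cloud) == 8; a histogram of `cloud` approximates the output distribution
```

The histogram of these values is what the coverage interval is read from, so the number of trials and networks sets how smoothly the distribution is resolved.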

Using the proposed method, the errors of the CC inference were much smaller than the confidence intervals of the traditional method used to measure it. Figure 10 shows that all inferences are within the confidence interval of the measurement made in calorimeters, for a confidence level of 95%. The average relative error found was 0.3%, and the maximum value observed was 0.7%. For the confidence interval, a mean value of 4.3% was observed, and the worst case was 4.8%. The mean uncertainty value already shows an improvement over previous works that used the PRR test. When only the highest capacity compressors are evaluated, as in Pacheco et al. [11] and Coral et al. [4], the average confidence interval becomes 4.0%, an even better result. All the results for the compressors in the test set are listed in Table 2.

Fig. 9 Frequency distribution of CC for a sample in the test set with 95% confidence interval

Fig. 10 Error between ANN committee inferences and calorimeter measurements

Table 2 Results obtained for the test set samples. Confidence intervals are presented with a 95% confidence level

An F-test was performed to verify that the measurement uncertainties provided by the proposed method are lower than those reported in Pacheco et al. [11]. The p-value obtained was \(2.43 \times 10^{-4}\), which indicates that the probability of this result occurring by chance is less than 0.03%. Consequently, even for very small significance levels, the test supports the conclusion that the uncertainty obtained using the proposed approach is lower than that of the most recent method reported in the literature [11].

5 Conclusion

This work proposed the adaptation of a PRR test rig in which it was possible to control and measure more quantities and to improve the measurement of quantities that were already measured in the original test. Regarding the test time, the proposed test takes about 1 min, including the setup time. Given this time, multiple rigs would be required for 100% inspection of the production, but the proposed alternative takes 0.8% of the test time of a regular calorimeter, with an uncertainty that is considerably smaller than the one observed in typical quality assurance tests made in the production line. The method is therefore a good choice for production quality assurance, since many more samples from a batch can be tested than is currently the case. In addition, the cost of the proposed rig is around US$10,000, a very low value when compared with the calorimeter rig. This difference also makes it possible to improve the overall batch coverage in performance tests by using more rigs in parallel for the same cost as a single calorimeter. Since the LPRR is about 20 times cheaper than the calorimeter and its test is about 100 times faster, it is possible to test approximately 2000 compressors in the LPRR with the same resources needed to test one compressor in a calorimeter.
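The resource comparison above is simple arithmetic and can be reproduced as follows. The function name is hypothetical, and the inputs are the approximate figures quoted in the text (US$200,000 and about 2 h for the calorimeter, US$10,000 and about 1 min for the LPRR), so the result is an order-of-magnitude estimate rather than a precise figure.

```python
def resource_ratio(cost_ref, cost_new, time_ref, time_new):
    """Number of tests possible on the new rig for the resources of one
    reference test: (cost ratio) * (test time ratio)."""
    return (cost_ref / cost_new) * (time_ref / time_new)


cost_factor = 200_000 / 10_000  # 20 times cheaper
time_fraction = 1 / 120         # 1 min out of about 2 h, roughly 0.8%

# With the rounded speed-up of 100 used in the text:
rounded = resource_ratio(200_000, 10_000, 100, 1)    # 2000.0
# With the unrounded 120 min / 1 min ratio:
unrounded = resource_ratio(200_000, 10_000, 120, 1)  # 2400.0
```

Both values are consistent with the approximate factor of 2000 stated above, given that the underlying costs and durations are themselves rounded.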

All samples from the test set had their inference results with errors smaller than 1% when compared with the results from the reference calorimeter used in this study. The analysis of uncertainties was done based on the method proposed in Pacheco et al. [11], which takes into account uncertainties arising from the training process, the incompleteness of the data set, and the uncertainties of measurements of the variables. The resulting uncertainty of the inferences was kept below 5% in all cases.

The case study considered in this work has a broader domain of compressor parameters than previous works that used the PRR for CC inference. In Pacheco et al. [11] and Coral et al. [4], all compressors analyzed had large CC values (greater than 170 W). In this work, CC values were inferred for compressors in the range of 80 W to 200 W. Even so, the tool achieved better CI values, as a result of the combination of more constant test conditions, a different test procedure, measurement of more quantities, improvement of the ANN inference tool, and improvement of the uncertainties of the input variables.

Since each test in a regular calorimeter rig is very expensive and those tests are required for training the proposed ANNs, the number of samples considered in the case study was limited to fewer than 100 compressors. This small number of evaluated compressors limited the size of the neural tool, since ANNs with a larger number of neurons did not improve the prediction quality due to overfitting. In future work, the number of calorimeter tests will be increased to better evaluate the trade-off between the number of samples in the training set and the uncertainty of the estimates provided by the proposed method.