Introduction

Water is a determining element for the sustainability of human activities and the maintenance of the ecosystem (Kinar & Brinkmann, 2022) and is among the scarcest natural resources (Demetillo et al., 2019; Martínez et al., 2020). Its importance for humanity and for the maintenance of animal and plant life is undeniable. For this reason, the availability of water suitable for drinking, cooking, hygiene and cleaning, food production, and other uses has become one of the most significant current concerns (Nandakumar et al., 2020). Nevertheless, more than 10% of the world’s population is affected by drinking water shortages, and the ingestion of polluted water is closely associated with diseases such as cholera, typhoid, and others (Demetillo et al., 2019; Chowdury et al., 2019; Nandakumar et al., 2020).

Therefore, methods for determining water quality parameters are the subject of a growing number of studies (Agossou & Toshiro, 2021; Sameh et al., 2021; Ang et al., 2022; Fonseca-Campos et al., 2022; Jayadi et al., 2022). Current methods for measuring water quality parameters in situ rely on professional multiparametric probes to automatically measure parameters such as turbidity, dissolved oxygen, temperature, pH, and redox potential. However, multiparametric probes are relatively expensive and make it impossible to form monitoring networks with adequate spatial distribution, especially in developing regions or less developed countries.

Up-to-date, automatically collected information on the pollution levels of water bodies, and on whether they are suitable for sustaining aquatic life or for human consumption, is directly related to two Sustainable Development Goals (SDGs) established by the United Nations (2023). Goal 6 aims to ensure the availability and sustainable management of water and sanitation for all. Goal 14 addresses the conservation and sustainable use of oceans, seas, and marine resources for sustainable development. Making the collection of such information economically accessible through low-cost sensors (LCS) is also a way to reduce inequality within and between countries, thus contributing to Goal 10 of the SDGs.

Recent advances in low-cost water quality sensors provide an opportunity to develop more affordable electronic devices capable of measuring and transmitting monitoring data to the Internet in real time. LCS technology is emerging as a promising alternative to traditional probes, primarily because of the sensors’ small size and low cost, which typically ranges from US$6.90 to US$169.00 but can reach US$500.00 (Camargo et al., 2023). Within the concept of the Internet of Things (IoT), it is an important alternative for continuous monitoring of water quality parameters (Jayaraman et al., 2024; Hamel et al., 2024). However, much remains to be done in terms of the durability, accuracy, precision, and reliability of this class of sensors.

Camargo et al. (2023) conducted a systematic literature review of 142 papers to determine which LCS have been used for remote water quality monitoring and how they performed in the measurements. One of the findings was that only a few papers compared LCS with reference devices (Bórquez López et al., 2020; Huan et al., 2020; Tsai et al., 2021; Singh et al., 2021; Adriman et al., 2022). In most published work, the capabilities and limitations of the sensors have not been thoroughly investigated to ensure that they meet the data quality requirements for the intended applications, and validation and statistical analysis of the results are absent or unclear. Other articles, such as Jayaraman et al. (2024), Hawari & Hazwan (2022), and Wannee & Samanchuen (2022), present data on the accuracy of the prototype but not on the sensors used in it, do not present a statistical analysis of the LCS used in the solution, or do not use LCS at all. A thorough investigation of the capabilities and limitations of LCS is therefore still needed.

To address this gap, the objective of this paper is to evaluate multiple low-cost water quality sensors by comparing them to a professional reference device and statistically assessing their performance in determining water quality parameters. LCS from four different manufacturers are used to measure up to five parameters commonly used to infer water quality: pH, turbidity, dissolved oxygen (DO), redox potential (ORP), and temperature. A professional Hanna probe was chosen as a reference for the evaluation of LCS. In addition, the measured values are compared against the values defined in the Standard Methods for the Examination of Water and Wastewater (SMEWW) (Lipps et al., 2023). Statistical analyses of the LCS and the Hanna device are also presented, including error analyses.

The structure of the remaining sections in this article is as follows. Section “Materials and methods” describes the methodology, providing information about the reference device, the LCS evaluated in this study, calibration procedures, the experimental setup, and data collection. Section “Results and discussions” presents the results for each LCS in relation to the water quality parameter, and Section “Conclusion” presents the conclusion.

Table 1 Technical specifications of the Hanna HI98194 probe

Materials and methods

This section presents technical information about the reference probe and the evaluated LCS in Sections “Hanna HI9829 probe” and “Low-cost sensors.” Additionally, the methods adopted for the calibrations and tests are presented in Sections “Sensor calibration procedures” and “Experimental design and data collection,” respectively.

Hanna HI9829 probe

According to the user manual (Instruments, 2023), the Hanna HI9829 probe is a multiparameter meter that reads several water quality parameters: pH, ORP, EC, TDS, turbidity, resistivity, salinity, dissolved oxygen, atmospheric pressure, and temperature. Data are collected with a configurable acquisition period ranging from 1 s to 3 h, and up to 45,000 records can be stored continuously or on demand for all parameters. The probe performs automatic temperature compensation within the measurement range (\(-\)5.00 to 55.00 \(^{\circ }\)C). Table 1 presents the specifications of the probe for the five parameters addressed in this study.

The probe is factory calibrated but allows for recalibration in two specific ways: quick calibration and individual parameter calibration. The first enables single-point calibration of pH, electrical conductivity, and/or DO. The manufacturer considers this procedure ideal for field calibrations, but it is less accurate than the second method.

The second calibration method, which allows each parameter to be individually calibrated, provides more accurate results. The pH is automatically calibrated at 1, 2, or 3 points with automatic recognition of five standard solutions (pH 4.01, 6.86, 7.01, 9.18, 10.01), also allowing for the use of a custom pH standard provided by the user. The ORP calibration, on the other hand, enables the user to perform a custom calibration at a single point (relative mV) based on the use of a standard ORP solution, compensating for any changes caused by platinum surface contamination and deviations in the reference electrode.

The temperature measurement undergoes a factory calibration, but users have the option to perform a new procedure using an alternative temperature sensor as a reference. As for the DO calibration, it can be conducted based on either percentage of saturation or concentration of dissolved oxygen. The percentage of saturation calibration offers the choice of two points (0% and 100%) or a custom point within the range of 50 to 500%. On the other hand, DO concentration calibration requires a user-configured solution with a known concentration. It is worth noting that according to the manufacturer, variables such as temperature, pressure, and sensor membrane cleanliness can negatively impact the DO calibration process.

Lastly, just like the other parameters, turbidity is also calibrated using standard solutions through automatic recognition by the probe. The process is carried out at up to eight points, creating a curve through linear interpolation between the calibrated points.

Low-cost sensors

In this study, 19 different LCS are tested for the measurement of five water quality parameters: pH, dissolved oxygen, oxidation-reduction potential, turbidity, and temperature. The tested models, selected based on the work presented by Camargo et al. (2023), are from four distinct manufacturers: DFRobot, Atlas Scientific, Vernier, and Delfino. Table 2 presents the list of tested sensors, along with their manufacturer identification, information regarding the measured parameter, measurement range, accuracy, and cost provided by manufacturers on their websites.

Table 2 Overview of low-cost sensors evaluated in this study: manufacturer-provided information

Two categories of sensors developed by Atlas Scientific were evaluated: ENV20 and ENV50. ENV20 sensors are characterized as laboratory-grade, possessing smaller dimensions and constructed from more fragile materials, which consequently leads to lower manufacturing costs and a shorter expected lifespan. In contrast, the ENV50 class is specifically designed for industrial use, constructed using more robust materials and finishing techniques that ensure enhanced sealing capabilities. Furthermore, these sensors are produced in larger dimensions to accommodate a greater volume of electrolyte solution, resulting in reduced maintenance requirements and extended continuous operation, according to the manufacturer’s instructions. Notably, the ENV50 probes are equipped with integrated temperature sensors, which allow temperature correction. It is important to emphasize that this correction is not automated and must be carried out by the user. The temperature probe tested belongs to the industrial-grade ENV50 series and is designated as PT-1000. Despite these differences, the accuracy of the ENV20 and ENV50 sensors is the same for pH and ORP and lower when compared to other LCS.

Regarding communication with ENV20 and ENV50 sensors, Atlas Scientific provides interface boards for data acquisition in both analog and digital formats, which need to be purchased separately. The interface board that enables analog signal acquisition is called Gravity, and it can be paired with sensors from the ENV20 class. Digital data acquisition, using communication protocols such as I2C or UART, is accomplished through the EZO board, which can be used with both ENV20 and ENV50 sensors. It is important to note that measuring each parameter requires separate Gravity or EZO boards. The EZO boards are equipped with an electrical isolation system designed to filter out non-natural electrical potentials originating from other devices. To use them, a carrier board is also required, which is available for purchase from the manufacturer.

Fig. 1 Schematic diagram depicting the implemented system

DFRobot sensors are also classified by the manufacturer as laboratory-grade sensors and closely resemble ENV20 sensors in terms of dimensions, operating principles, and communication methods. Regarding maintenance, these sensors require frequent replacement of the electrolyte solution. Data acquisition is carried out through Gravity interface boards sold by the manufacturer, which provide analog signals. The DFRobot temperature sensor, on the other hand, is a DS18B20 sensor enclosed in a waterproof capsule. It offers configurable resolution ranging from 9 to 12 bits and communicates using the 1-Wire protocol developed by Dallas Semiconductor.

Vernier sensors are categorized by the manufacturer as educational and laboratory development tools. Their sensors share similarities with those from DFRobot and the ENV20 series by Atlas Scientific. Vernier offers a diverse range of sensors in their portfolio, including ones related to water quality assessment. In terms of maintenance, these sensors require minimal interventions. This is either because they are optical, such as the dissolved oxygen sensor, or because they have sealed reference cells that do not require recharging, as is the case with the pH and ORP sensors.

The temperature sensor, on the other hand, is a 20 k\(\Omega \) NTC thermistor, whose non-linear relationship between resistance and temperature is approximated by the Steinhart-Hart equation. To facilitate the integration of these sensors with development platforms, the manufacturer offers shields or adapters that provide information in both analog and digital formats.
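As an illustration of the Steinhart-Hart approximation mentioned above, the following minimal Python sketch converts a thermistor resistance to temperature. The function name and the coefficient values are assumptions made for illustration only (they are chosen so that 20 kΩ maps to roughly 25 °C); the coefficients for a specific probe should be taken from the manufacturer's documentation.

```python
import math

# Illustrative Steinhart-Hart coefficients (assumed, not taken from the study).
A = 1.02119e-3
B = 2.22468e-4
C = 1.33342e-7


def thermistor_temperature_c(resistance_ohm, a=A, b=B, c=C):
    """Convert NTC thermistor resistance (ohm) to temperature (deg C).

    Steinhart-Hart: 1/T = a + b*ln(R) + c*ln(R)**3, with T in kelvin.
    """
    if resistance_ohm <= 0:
        raise ValueError("resistance must be positive")
    ln_r = math.log(resistance_ohm)
    inv_t_kelvin = a + b * ln_r + c * ln_r ** 3
    return 1.0 / inv_t_kelvin - 273.15


if __name__ == "__main__":
    # Resistance falls as temperature rises for an NTC thermistor.
    for r in (30000.0, 20000.0, 10000.0):
        print(f"{r:8.0f} ohm -> {thermistor_temperature_c(r):6.2f} deg C")
```

In a real acquisition chain, the resistance would first have to be derived from the measured voltage of the divider circuit on the interface board, a step that is omitted here.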

Similar to Vernier’s dissolved oxygen sensor, Delfino’s dissolved oxygen sensor operates on the fluorescence principle, offering advantages such as no oxygen consumption, no flow rate limitations, no need for electrolytes, and minimal maintenance and calibration requirements. It also promises strong anti-interference capabilities and excellent stability. Delfino’s turbidity sensor, on the other hand, is based on the ISO 7027 standard’s 90-degree infrared scattering method, measuring turbidity by assessing the intensity of light scattered by suspended particles in the water sample. Meanwhile, the pH sensor follows the classical electrochemical principle. All Delfino sensors communicate digitally with the development platform using the Modbus RS485 communication protocol.

Fig. 2 Photograph of the actual system

Considering Table 2, it is evident that Atlas Scientific’s pH sensors exhibit significantly higher precision compared to those from other manufacturers. Precision levels for dissolved oxygen sensors are less divergent, with ENV-50-DO standing out as the most accurate sensor. Unfortunately, the comparison of ORP sensor precision is compromised by the absence of documented values for DFRobot and Vernier sensors. Similarly, turbidity sensor precision information was only found for the Delfino sensor. Lastly, among the temperature sensors, once again, Atlas Scientific’s sensor stands out as the most precise.

Higher levels of precision typically result in higher sensor acquisition costs. This can be observed in the prices of ENV50 class sensors, which, except for the temperature sensor, exceed $230. Another noteworthy point is the higher cost of sensors employing optical technology, such as Vernier and Delfino’s dissolved oxygen sensors, as well as the turbidity sensor from Delfino. In general, DFRobot sensors, as well as those in the ENV20 class from Atlas Scientific, offer more budget-friendly options, with Vernier sensors falling just above them in terms of cost.

Hardware and firmware

To enable the execution of the tests presented in this work, an electronic data acquisition device was developed. It is based on the use of an Arduino Mega 2560 development platform, which is coupled with peripherals for interfacing with the sensors, as well as for data recording and setting a time base. A schematic of the implemented system can be observed in Fig. 1. The real system, on the other hand, is depicted in Fig. 2.

Using a single Arduino board for simultaneous measurements ensures consistent test conditions and minimizes errors from environmental changes or voltage fluctuations, resulting in more accurate and reliable data.

The system responsible for timekeeping is a DS1307 Real-Time Clock (RTC), which relies on a crystal with a frequency of 32.768 kHz. Data recording, on the other hand, is handled by an SD card module connected to the Arduino through an SPI interface. Interfacing with the sensors, except for those from Delfino, is achieved through dedicated boards produced and supplied by the sensor manufacturers.

The interfacing with both DFRobot sensors and Atlas Scientific ENV20 sensors is achieved by using Gravity-compatible boards specific to each sensor type and each of the various parameters under investigation. These boards, connected to the sensors via SMA or BNC connectors, amplify the collected signals and provide them in analog format, within specific voltage ranges, to the Arduino’s analog-to-digital converter (ADC). Power for the Gravity boards is supplied through the 5V and GND pins of the Arduino itself.

The ENV50 class sensors, on the other hand, are interfaced with the application using Atlas EZO boards associated with the carrier board provided by Atlas Scientific. In this work, the I2C protocol was employed for communication with the Arduino Mega. The Vernier sensors are interfaced through a shield sold by the manufacturer itself, which is compatible with the Arduino Mega. Through this shield, the sensors can be read using the analog ports, utilizing the library also provided by the manufacturer. Finally, for the utilization of Delfino sensors, a Modbus RS485 converter module for Arduino was applied. Regarding power supply, it should be noted that ENV50 class sensors and Vernier sensors are powered directly by the Arduino, whereas the Delfino sensors require an external DC source ranging from 9 to 36 volts.

In terms of firmware, it is important to highlight that manufacturers often offer pre-existing code and basic applications to facilitate the development process for more advanced applications. In the context of this particular project, our challenge lies in achieving simultaneous communication with diverse sensors, each employing distinct interfaces and utilizing various communication protocols.

Table 3 Calibration results of low-cost pH measurement sensors using 4, 7, and 10 pH calibration solutions

Sensor calibration procedures

The first step that precedes the acquisition of measured values is the calibration of the sensors. Due to the different measurement principles of the various parameters evaluated and the unique characteristics of the sensors themselves, each sensor has its own calibration method, which is recommended by the manufacturers and implemented in this work. All conversion equations used in this work are consistent with the manufacturer’s conversion instructions. Sections “Calibration of pH sensors” to “Calibration of turbidity sensors” present the calibration methods used in this study.

Calibration of pH sensors

Calibration of pH sensors is performed by determining the relationship between the potential measured by the sensors and the hydrogen ion potential of the tested sample. This relationship is linear and can be determined by defining the coefficients of a particular line, as shown in Eq. 1, where a and b represent the slope and intercept coefficients, respectively, and voltage denotes the electrical potential delivered by the sensor.

$$\begin{aligned} pH = \left( a \cdot voltage \right) + b \end{aligned}$$
(1)

Calibration can be performed with one, two or three points. In this study, a three-point calibration was performed to increase precision. To achieve this, standard solutions with pH values of \(4.00\pm 0.02\), \(7.00\pm 0.02\), and \(10.00\pm 0.02\) were used. Although a single measurement for each standard solution would be sufficient for calibration, 120 consecutive readings were recorded for each standard solution with a measurement interval of one second to mitigate the effects of possible electrical measurement variations. The measured values were then used in a linear regression process to establish the conversion equation for each of the sensors. This process was applied to the ENV20, DFRobot and Vernier sensors and the calibration results are shown in Table 3.
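The regression step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the voltage readings are hypothetical placeholders (three per standard instead of the 120 recorded in the study), and the helper names `fit_line`, `calibrate_ph`, and `voltage_to_ph` are not taken from any manufacturer library.

```python
from statistics import mean

# Hypothetical raw readings (volts) for each pH standard; in the study,
# 120 readings per standard were recorded at one-second intervals.
readings = {
    4.00: [2.031, 2.029, 2.030],
    7.00: [1.501, 1.499, 1.500],
    10.00: [0.969, 0.971, 0.970],
}


def fit_line(x, y):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, my - a * mx


def calibrate_ph(readings_by_standard):
    """Fit Eq. 1 using every recorded (voltage, pH) pair."""
    volts, ph = [], []
    for ph_standard, samples in readings_by_standard.items():
        volts.extend(samples)
        ph.extend([ph_standard] * len(samples))
    return fit_line(volts, ph)


a, b = calibrate_ph(readings)


def voltage_to_ph(voltage):
    """Apply Eq. 1 with the fitted slope a and intercept b."""
    return a * voltage + b


if __name__ == "__main__":
    print(f"slope a = {a:.4f} pH/V, intercept b = {b:.4f}")
    print(f"pH at 1.75 V = {voltage_to_ph(1.75):.2f}")
```

Using all individual readings in the regression, rather than one averaged value per standard, lets random electrical variations partially cancel out in the fitted coefficients.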

For the ENV50 and Delfino sensors, the manufacturers provide libraries, code, and calibration functions that perform the linear regression via commands sent over serial communication. Calibration can be performed with up to three points, and the order of calibration matters when multiple points are used. The mid-point pH calibration is taken as the reference point to initiate the procedure, and performing it clears all other calibration points from the embedded memory of each sensor. Therefore, both sensors must be calibrated with the mid-range pH solution first, followed by the low and high pH solutions.

Calibration of DO sensors

For the calibration of dissolved oxygen sensors, standard solutions of 0% and 100% saturation are used to establish the linear relationship between the electrical potential and the percentage saturation or concentration of dissolved oxygen. Ambient conditions such as temperature and pressure affect the calibration results, as the maximum concentration of dissolved oxygen in the sample is influenced by these variables. For example, the saturation concentration of dissolved oxygen in water at 20 \(^{\circ }\)C and 1 atm is 9.09 mg/L.

Table 4 Calibration results of LCS for dissolved oxygen measurement using the 100% saturation potential
Table 5 Calibration results of LCS for ORP measurement using a 225 mV ORP calibration solution

The procedure recommended by the manufacturers is to first measure the 100% saturation point and then check the 0% point. Since the potential should be zero at zero saturation, the intercept of the linear relationship is also zero, and the calibration equation is written as in Eq. 2. The percentage saturation can then be expressed as in Eq. 3.

$$\begin{aligned} DO_{mg/L}=\frac{voltage}{voltage_{100\%}}\cdot DO_{\max _{mg/L}} \end{aligned}$$
(2)
$$\begin{aligned} DO_{\%}=\frac{DO_{mg/L}}{DO_{\max _{mg/L}}}\cdot 100 \end{aligned}$$
(3)

In this study, the recommended procedure was applied to the ENV20 and DFRobot sensors. The result was the determination of a saturation potential of 100%, which was 430 mV and 1401 mV, respectively. The ENV50 sensor, on the other hand, was calibrated digitally via serial commands, as were the other parameters. No calibration is required for the Vernier and Delfino sensors as they are optical sensors, as stated by the manufacturers. In the tests performed, a maximum saturation of 9.09 mg/L was taken into account for the conversion from concentration to percentage saturation (Table 4).
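A minimal Python sketch of Eqs. 2 and 3 is given below. The function names are assumptions chosen for illustration; the 430 mV saturation potential and the 9.09 mg/L maximum concentration are the values reported above, while the 344 mV reading in the example is hypothetical.

```python
DO_MAX_MG_L = 9.09  # maximum concentration adopted in the study (20 deg C, 1 atm)


def do_concentration_mg_l(voltage_mv, voltage_100_mv, do_max_mg_l=DO_MAX_MG_L):
    """Eq. 2: scale the reading by the potential measured at 100% saturation."""
    if voltage_100_mv <= 0:
        raise ValueError("the 100% saturation potential must be positive")
    return voltage_mv / voltage_100_mv * do_max_mg_l


def do_saturation_percent(do_mg_l, do_max_mg_l=DO_MAX_MG_L):
    """Eq. 3: express a concentration as a percentage of the maximum."""
    return do_mg_l / do_max_mg_l * 100.0


if __name__ == "__main__":
    env20_v100_mv = 430.0   # saturation potential reported for the ENV20 sensor
    reading_mv = 344.0      # hypothetical raw reading
    conc = do_concentration_mg_l(reading_mv, env20_v100_mv)
    print(f"{conc:.2f} mg/L, {do_saturation_percent(conc):.1f}% saturation")
```

Because the intercept is fixed at zero, a single calibration potential per sensor is sufficient; a fixed maximum concentration, however, ignores changes in temperature and pressure during the test.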

Calibration of ORP sensors

According to the manufacturers, the calibration of ORP sensors is carried out by determining a measurement deviation and can be performed by evaluating a single solution with a known redox potential. Standard solutions with characteristic potential values, such as 225 and 240 mV, are available for this purpose. To perform the calibration, simply immerse the sensor in the standard solution, wait until the measurements have stabilized and determine the deviation of the measured value compared to the potential value of the standard solution used.

This deviation should then be added to the potential value measured by the sensor to obtain the correction shown in Eq. 4. Although this calibration technique is the same for all sensors, unlike the ENV20, DFRobot and Vernier sensors, the calibration of the ENV50 sensor is performed via the software using the serial command CAL,#, where # represents the ORP potential value in millivolts of the standard solution used. This command should be sent as soon as the measurements taken by the sensor have stabilized.

$$\begin{aligned} ORP_{mV} = voltage_{mV} + V_{offset} \end{aligned}$$
(4)

For the calibration of the sensors used in this study, an ORP calibration solution of 225 mV from Atlas Scientific was used. Table 5 shows the offset values obtained during the calibration procedure for each of the LCS.
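The single-point offset correction of Eq. 4 can be sketched in Python as follows. The function names and the raw readings are hypothetical, and the offset is assumed here to be defined as the standard value minus the stabilized reading, so that adding it to a raw value corrects that value.

```python
from statistics import mean


def orp_offset_mv(stable_readings_mv, standard_mv=225.0):
    """Offset (mV) so that the stabilized reading matches the standard solution."""
    if not stable_readings_mv:
        raise ValueError("at least one stabilized reading is required")
    return standard_mv - mean(stable_readings_mv)


def corrected_orp_mv(voltage_mv, offset_mv):
    """Eq. 4: add the calibration offset to the raw potential."""
    return voltage_mv + offset_mv


if __name__ == "__main__":
    # Hypothetical stabilized readings taken in a 225 mV standard solution.
    offset = orp_offset_mv([231.0, 229.0, 230.0])
    print(f"offset = {offset:+.1f} mV")
    print(f"corrected reading = {corrected_orp_mv(310.0, offset):.1f} mV")
```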

Calibration of turbidity sensors

The relationship between the potential measured by a turbidity sensor and its value in NTU is described by a second-order equation, resulting in a non-linear curve, as shown in Eq. 5. The calibration methodology is based either on the use of several standard solutions with linear interpolation between the calibration points or on a regression over the voltage values at the calibrated points to obtain the parameters of the equation.

$$\begin{aligned} Turbidity_{NTU} = a \cdot voltage^2 + b \cdot voltage + c \end{aligned}$$
(5)
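The regression option for Eq. 5 can be sketched in plain Python as below. This is an illustrative least-squares fit, not the routine used by any manufacturer; the function names and the calibration points in the example are hypothetical. Note that a second-order fit needs at least three standards with distinct voltages.

```python
def fit_quadratic(voltages, ntu):
    """Least-squares fit of ntu = a*v**2 + b*v + c; returns (a, b, c)."""
    if len(voltages) != len(ntu):
        raise ValueError("voltages and ntu must have the same length")
    if len(set(voltages)) < 3:
        raise ValueError("at least three distinct voltages are required")
    rows = [[v * v, v, 1.0] for v in voltages]
    # Normal equations (X^T X) p = X^T y, stored as an augmented 3x4 matrix.
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * y for r, y in zip(rows, ntu))]
        for i in range(3)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, 3):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, 4):
                aug[r][k] -= factor * aug[col][k]
    # Back substitution.
    params = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        rest = sum(aug[i][j] * params[j] for j in range(i + 1, 3))
        params[i] = (aug[i][3] - rest) / aug[i][i]
    return tuple(params)


def voltage_to_ntu(voltage, a, b, c):
    """Eq. 5: convert a sensor voltage to turbidity in NTU."""
    return a * voltage ** 2 + b * voltage + c


if __name__ == "__main__":
    # Hypothetical calibration points (volts, NTU).
    a, b, c = fit_quadratic([4.1, 3.6, 3.0, 2.5], [0.0, 100.0, 400.0, 800.0])
    print(f"a={a:.1f}, b={b:.1f}, c={c:.1f}")
    print(f"NTU at 3.3 V = {voltage_to_ntu(3.3, a, b, c):.0f}")
```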

It is worth noting that only two solutions were available for turbidity, both of which had very similar NTU values, which negatively impacted the calibration procedure. The calibration of the DFRobot sensor was even more affected because of its large measurement range. As a result, the voltage values measured with each of the solutions were almost identical, making regression impossible. Without being able to calibrate the DFRobot sensor, the authors therefore decided not to test it. However, the sensor was evaluated in Xavier et al. (2022), where it was found to be fragile when left submerged for long periods and susceptible to infiltration. The results of Xavier et al. (2022) are described in Section “Experiments with turbidity sensors”, where the turbidity results are discussed.

The calibration of the Delfino sensor was carried out digitally, as for all other sensors from the same manufacturer.

Experimental design and data collection

The tests were performed with water samples from the lake of the Diva Paim Barth Ecological Park in the city of Toledo, Paraná, Brazil. For each of the parameters tested, 20 L of surface water was collected in suitable containers, transported to the laboratory, and placed in a tank. The sample was then tested simultaneously with the LCS and the Hanna probe, which makes the intercomparison of the collected data possible, since all sensors were tested with the same water sample and under the same environmental conditions.

For tests in which the sample must be kept in motion, a system was used to circulate the liquid in the tank, namely a mechanical mixer Fisatom 713DS (Fisatom, 2024). The test apparatus can be seen in Fig. 3, with all sensors attached to the tank lid and in contact with the sample.

Fig. 3 Photograph of the device used in the tests

For each parameter tested, after stabilization of the values, data was collected over a period of 24 h with a sampling interval of 1 min, resulting in a total of 1440 samples for each sensor. At the end of the tests, the spent water was properly disposed of to make room for a new sample that was used to test a different parameter. This process was repeated for each of the five parameters tested in this study.

After running the tests, the data were filtered with a moving average smoothing filter with a window of nine samples to remove short-term overshoots and noisy fluctuations. The filtered data were then statistically analyzed using Origin 9.0™ software with a confidence level of 95% in terms of precision, bias, and comparison with the reference instrument (relative error, Pearson coefficient, etc.).
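For illustration, a nine-sample moving average can be sketched in Python as follows. The alignment of the window is an assumption: the sketch uses a centered window that shrinks near the start and end of the series, which may differ from the filter settings actually used in the analysis software.

```python
def moving_average(values, window=9):
    """Centered moving average; the window shrinks near the edges (assumed)."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        segment = values[lo:hi]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


if __name__ == "__main__":
    # Hypothetical pH series with a single short-lived spike at index 4.
    series = [6.60, 6.61, 6.60, 6.62, 7.40, 6.61, 6.60, 6.59, 6.60]
    print([round(v, 2) for v in moving_average(series)])
```

With one reading per minute, a nine-sample window averages over nine minutes of data, so a spike lasting a single sample is attenuated to one ninth of its amplitude in the interior of the series.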

To assess precision and accuracy, the values defined in the Standard Methods for the Examination of Water and Wastewater (SMEWW) (Lipps et al., 2023) are used, which are summarized in Table 6. However, it is worth noting that, to our knowledge, there is no established method for evaluating LCS. SMEWW applies to discrete laboratory measurements, whereas in the experiment performed here the sample may change over time, either due to temperature fluctuations or chemical changes. Therefore, the Hanna probe was chosen as a reference for the evaluation of LCS. The results are presented and discussed in Section “Results and discussions”.

Table 6 Standard Methods for the Examination of Water and Wastewater (SMEWW) parameters

Results and discussions

Sensors from the manufacturers Atlas Scientific, DFRobot, Vernier, and Delfino were used for the tests, with the Hanna probe as the reference. The sensors were previously calibrated following the methodology indicated by the manufacturers and presented in Section “Sensor calibration procedures.” Water sampling and measurements for all variables were performed according to the procedure described in Section “Experimental design and data collection.”

The Shapiro-Wilk normality test indicated a non-normal distribution for all data, and the Mann–Whitney test found statistically significant differences at the 5% significance level between the values provided by the LCS and those of the reference for all parameters. The results for each sensor and parameter are detailed below.

Table 7 Statistical data on pH values
Fig. 4 LCS and Hanna reference probe measurements over time for pH

Experiments with pH sensors

In Fig. 4a, the performance of the LCS is compared to Hanna’s equipment. Results from the Delfino sensor were excluded because the sensor exhibited inconsistent behavior compared to the other sensors and Hanna, with a continuous decrease in pH, starting at 6.4 and reaching a pH of 5.5 after 24 h. Table 7 presents the descriptive statistics of the dataset, which can also be observed in the boxplot graphic in Fig. 4b.

During the 24-h measurements, the water sample remained the same; however, physical-chemical reactions can occur, potentially altering the observed values compared to the beginning. Furthermore, the water temperature varied (\(\Delta \) = 1.55 °C), which may affect the measured values. Therefore, the performance of the LCS is analyzed relative to the Hanna probe in terms of accuracy, which is understood here, in all comparisons with the reference, as the combination of bias and precision of an analytical procedure and reflects the closeness of a measured value to the true value.

When assessing precision, SMEWW recommends a deviation of ± 0.02 pH units for the same sample under identical conditions. As presented in Table 7, the standard deviation of the Hanna sensor was ± 0.03 pH units. Compared to the others, the ENV50 sensor showed markedly higher variability, i.e., ± 0.09 pH units. DFRobot and Vernier reached values of ± 0.02 and ± 0.03, respectively, that is, very close to the value of the Hanna probe. The relative standard deviations (RSD) are less than 1% for almost all sensors, except for the ENV50 sensor, which presented an RSD of 1.37%. The Hanna probe obtained 0.45% and the Vernier sensor 0.43%, the lowest RSD values.
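The precision figures discussed above follow from the usual definitions of standard deviation and relative standard deviation (RSD = 100 · SD / mean). A minimal Python sketch is shown below; the function name and the sample values are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption.

```python
from statistics import mean, stdev


def relative_standard_deviation(values):
    """Return (SD, RSD in %) using the sample standard deviation."""
    if len(values) < 2:
        raise ValueError("at least two values are required")
    m = mean(values)
    if m == 0:
        raise ValueError("RSD is undefined for a zero mean")
    sd = stdev(values)
    return sd, 100.0 * sd / abs(m)


if __name__ == "__main__":
    # Hypothetical pH readings from one sensor.
    sd, rsd = relative_standard_deviation([6.64, 6.67, 6.70, 6.66, 6.69])
    print(f"SD = {sd:.3f} pH units, RSD = {rsd:.2f}%")
```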

The measurement of pH in natural waters is critical for assessing the quality of water and water bodies. Although SMEWW considers the SD limit to be ± 0.02 pH units, LCS with a similar SD (0.02\(-\)0.04 units, which corresponds to a variation of roughly 5\(-\)10% in the concentration of H+ ions in the solution) are sufficient to infer water quality for this parameter. An important aspect of environmental monitoring of water quality is the follow-up of parameters over time, for which LCS may be very useful.

Table 8 Error analysis of pH sensors compared to the Hanna reference probe
Fig. 5 Measurement results from the LCS and the Hanna reference probe over the sample time for dissolved oxygen. a Curves over time and b boxplot

Table 8 shows the error analysis of the LCS compared to the Hanna probe. It can be observed that the relative error is highest for the low-cost Vernier sensor and the ENV20, but does not exceed 5%. The Root Mean Square Error (RMSE) measures the difference between the values predicted by a model (here, the LCS readings) and the actual values (here, the reference readings), and can be interpreted as the standard deviation of the errors. The lower the value, the closer the agreement between the two sets of measurements. In this case, the DFRobot and Vernier LCS have the lowest values, with the Vernier sensor having a higher Pearson R coefficient, indicating a higher correlation between the measurements of the reference device and the LCS.
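The error metrics named above can be sketched in Python as follows. The exact formulas used in the analysis software are not stated in the text, so the definitions below (mean signed error for bias, mean absolute percentage error for relative error) are assumptions, as are the function name and the sample readings.

```python
import math


def error_metrics(lcs, reference):
    """Bias, RMSE, mean relative error (%) and Pearson r of LCS vs. reference."""
    if len(lcs) != len(reference) or len(lcs) < 2:
        raise ValueError("need two equally sized series with at least 2 values")
    n = len(lcs)
    errors = [l - r for l, r in zip(lcs, reference)]
    bias = sum(errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # Assumes no reference value is zero (true for pH in natural waters).
    rel_error = 100.0 * sum(abs(e) / abs(r) for e, r in zip(errors, reference)) / n
    mean_l, mean_r = sum(lcs) / n, sum(reference) / n
    cov = sum((l - mean_l) * (r - mean_r) for l, r in zip(lcs, reference))
    var_l = sum((l - mean_l) ** 2 for l in lcs)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    if var_l == 0 or var_r == 0:
        raise ValueError("Pearson r is undefined for a constant series")
    pearson_r = cov / math.sqrt(var_l * var_r)
    return {"bias": bias, "rmse": rmse,
            "relative_error_pct": rel_error, "pearson_r": pearson_r}


if __name__ == "__main__":
    # Hypothetical simultaneous readings (pH units).
    hanna = [6.60, 6.62, 6.65, 6.68, 6.70]
    sensor = [6.70, 6.71, 6.76, 6.77, 6.81]
    for name, value in error_metrics(sensor, hanna).items():
        print(f"{name}: {value:.4f}")
```

A sensor can show a high Pearson coefficient while still carrying a constant bias, which is why the correlation is reported alongside the RMSE and relative error rather than instead of them.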

Regarding accuracy, SMEWW specifies an error of 0.1 pH unit in water measurements. According to this criterion, only the DFRobot sensor fully meets the requirement, while the ENV50 partially complies, as their biases are 0.10 and 0.11 pH units, respectively. The low-cost DFRobot and Vernier sensors have shown greater stability over the sampling period, while the ENV20 and ENV50 sensors exhibit hysteresis in the signal with respect to temperature, a phenomenon observed to a lesser extent in the Hanna probe sensor. Hysteresis here refers to the change in the electrical properties of a device due to variations in working temperature.

Compared to the results presented in the literature, the LCS tested in this work show a significantly lower SD for pH. Huan et al. (2020) report an SD of \(\pm 0.09\) pH units for the E-201-C sensor from an unknown manufacturer, a value identical to that of the ENV50 sensor for the same parameter. Bórquez López et al. (2020) report an SD of \(\pm 0.11\) pH units for an Atlas Scientific sensor of an unspecified model.

Regarding error analysis, Huan et al. (2020) report a relative error of \(0.21\%\), Nandakumar et al. (2020) indicate a relative error of \(3.20\%\) for the DFRobot SEN 0237-A sensor, and Adriman et al. (2022) describe a relative error of \(3.15\%\) for an unidentified low-cost sensor. The results for the sensors tested in this work are therefore consistent with those reported in the literature.

In turn, Bórquez López et al. (2020) and Demetillo et al. (2019) report tests with correlation coefficients (R) of 0.84 and 0.98, respectively. These results are similar to those presented by the ENV50 and Vernier sensors.

Table 9 Statistical data on dissolved oxygen values observed during the sampling period
Table 10 Error analysis for dissolved oxygen sensors in relation to the Hanna reference probe

Environmentally, pH affects several chemical and biological processes in water and is one of the most important environmental factors limiting species distributions. A pH from 6.5 to 9 is an adequate range for most aquatic organisms (ANA, 2023). However, certain species are affected even by small variations. The tolerance of 0.1 pH unit in water measurements may be related to such potential effects (on chemical communication, reproduction, and growth) (Smithsonian Institution, 2023), which may occur with variations of 0.2–0.3 units, especially in marine life. The DFRobot and ENV50 LCS presented bias values near 0.1 unit. Based on these results, and considering that these are low-cost devices and that there is a large gap in water quality monitoring, we believe that most of the LCS tested here (DFRobot, ENV20, ENV50, and Vernier) can measure pH in freshwater for monitoring purposes.

Analysis of the results indicates that the LCS are suitable for reliable measurements, similar to commercially available multiparameter probes intended for in situ measurements of pH in aqueous media. One notable observation concerns the ENV50, which is more expensive and intended for industrial applications but did not provide satisfactory results. Robustness to real field conditions was not evaluated in this work, and the sensor may perform better under such conditions. The analyses were performed according to the procedures used for commercial water quality monitoring instruments, such as the Hanna probe.

Experiments with dissolved oxygen sensors

Figure 5a shows the performance of the LCS for measuring DO compared to the Hanna probe. The same water sample was maintained throughout the 24 h of testing and was stirred with the agitator at 180 rotations per minute. It is noteworthy that dissolved oxygen levels are closely related to temperature, which remained at 21.58 ± 0.11 °C during the sampling period. Table 9 presents the descriptive statistics of the data set, which can also be seen in the boxplot in Fig. 5b.

Figure 5a shows that the Hanna probe exhibits an increasing trend in DO values over time. In contrast, the LCS, with the exception of the ENV50, show stable readings over the entire sampling period. The ENV50 probe shows a decreasing trend and, toward the end of the experiment, an abrupt drop within a short time interval. After some time, the readings from this sensor return, also abruptly, to a level consistent with the rest of the time series.

Fig. 6 Measurement of LCS and Hanna reference probe for oxidation-reduction potential (ORP). a Curves over time and b boxplot

As presented in Table 9, LCS and Hanna results ranged from 69.29 to 82.66%, with the exception of the Vernier sensor, whose DO average is 51.53\(\%\). In terms of standard deviation, all LCS values were below those of the Hanna probe (2.79\(\%\)), with the exception of the ENV50 sensor. The SMEWW specifies that when measuring dissolved oxygen with an electrode in samples that have the same physico-chemical properties, the precision must be ± 0.1 mg/L (\(DO_{\%}\approx \pm \) 1.1\(\%\) for water at room temperature). The sensors from Vernier and DFRobot therefore meet the requirement. The other sensors have a standard deviation of less than ± 3.05% of dissolved oxygen. In comparison, the RSD values are lower than that of the Hanna probe (3.59%), except for the ENV50 sensor (\(RSD = 4.02\%\)). This shows that the LCS have a satisfactory precision compared to the reference device.
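The conversion between a tolerance in mg/L and its equivalent in percent saturation can be sketched as follows. The saturation concentration of 9.09 mg/L (fresh water at 20 °C and 1 atm, a commonly tabulated value) is an assumption chosen because it reproduces the approximate figure quoted above; the function names are hypothetical, and the appropriate saturation value depends on the temperature, pressure, and salinity of the sample.

```python
# Assumed saturation concentration of oxygen in fresh water at 20 degrees C
# and 1 atm; replace it with the value that matches the actual sample.
ASSUMED_SATURATION_MG_L = 9.09


def mg_l_to_percent_saturation(do_mg_l, saturation_mg_l=ASSUMED_SATURATION_MG_L):
    """Express a DO concentration (or tolerance) in mg/L as % of saturation."""
    if saturation_mg_l <= 0:
        raise ValueError("saturation concentration must be positive")
    return 100.0 * do_mg_l / saturation_mg_l


def percent_saturation_to_mg_l(do_percent, saturation_mg_l=ASSUMED_SATURATION_MG_L):
    """Inverse conversion: % of saturation back to mg/L."""
    if saturation_mg_l <= 0:
        raise ValueError("saturation concentration must be positive")
    return do_percent * saturation_mg_l / 100.0


if __name__ == "__main__":
    # Precision tolerance of 0.1 mg/L expressed in percent saturation.
    print(f"{mg_l_to_percent_saturation(0.1):.2f} %")
```

Because the saturation concentration falls as temperature rises, the same tolerance in mg/L corresponds to a slightly larger percentage in warmer water.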

Table 10 shows the error analysis of the LCS based on the Hanna probe. The relative error is highest for the low-cost Vernier sensor with \(-\)33.77%, followed by the ENV20 sensor from Atlas Scientific with a relative error of \(-\)10.94%. These sensors also have the highest bias values: 26.27% and 8.51%, respectively. In terms of relative error and bias, the best results are obtained for the DFRobot and ENV50 sensors. The Delfino sensor shows an average performance compared to the other sensors. In terms of accuracy, SMEWW specifies an error of 0.05 mg/L (\(DO_{\%}\approx \pm \) 0.55\(\%\)), so, as can be seen in Table 10, no LCS has a fully satisfactory performance when only this criterion is considered.

The Vernier sensor has the best agreement with the reference probe in terms of RMSE, although it has the highest relative error and bias compared to the Hanna probe. The Pearson correlation for the Vernier sensor has a negative sign, indicating opposite trends. In other words, while the dissolved oxygen measured by the Hanna probe is increasing, the values measured by the Vernier sensor are decreasing. In general, the DFRobot sensor gave the best results, with a low relative error and bias, associated with satisfactory correlations in terms of RMSE and Pearson’s R.

The evaluated sensors for DO show similar performance to the results found in related work. Bórquez López et al. (2020) report testing an Atlas Scientific sensor of an unidentified model and obtaining an SD of \(1.15\%\) and a Pearson’s R of 0.9. Huan et al. (2020) report an SD of \(6.05\%\) and a relative error of \(2.48\%\). An analysis in terms of RMSE, bias, and Pearson’s R is presented by Méndez-Barroso et al. (2020). In that paper, an Atlas Scientific sensor of an unidentified model showed an RMSE of 1.29, a bias of \(10.89\%\), and a Pearson’s R of 0.43.

The amount of oxygen in freshwater reservoirs is variable and depends on several local environmental conditions (e.g., water temperature, solar radiation, discharge, biological processes). It is an indicator of the water body’s capacity to maintain its life-cycle functions (e.g., growth, maintenance, and reproduction) and the habitat distributions of aquatic organisms. There is a consensus that a healthy DO amount should be at least 5 mg/L in water bodies; however, there are different species in the ecosystem that have different oxygen requirements (Fondriest Environmental, 2013; EPA - United States Environmental Protection Agency, 2021; Abdul-Aziz & Gebreslase, 2023).

Table 11 Statistical data on ORP values observed during the sample period
Table 12 Error analysis for ORP sensors in relation to the Hanna reference probe

The natural fluctuation of DO and the complexity of the biological and chemical processes that occur in the ecosystem do not allow us to establish the sensitivity limits of all species to DO. The variability in rivers (e.g., 12–13 mg/L in winter and 6–9 mg/L in summer in the Pompton River, New Jersey) is higher than in saltwater, where the mean annual surface DO is 9 mg/L near the poles and 4 mg/L near the equator (Fondriest Environmental, 2013). At the same time, the SMEWW values were established to apply generally to the examination of waters of a wide range of quality. Hence, considering the environmental purposes and the LCS results for DO, the precision and bias provided by the ENV50 and DFRobot LCS could be used to obtain satisfactory measurements. The difference in RSD between the ENV50 and the Hanna probe was low (0.43%, or 0.04 mg/L), with a bias of 0.2 mg/L, while the bias of the DFRobot sensor was 0.1 mg/L, which we believe is an acceptable accuracy.

Experiments with ORP sensors

Figure 6a shows the measurements of the redox potential carried out with the LCS and the Hanna probe over 24 h. The curves show that the ORP measured with the Hanna probe starts at 230 mV and rises slowly for about 10 h until it reaches 237 mV and then returns to the initial potential. From then on, the rising trend is observed again until the end of the test. Among the LCS, the DFRobot sensor shows the behavior most similar to that of the Hanna probe, even though the absolute potential values measured are different.

The other LCS show mostly stable measurements, notably the Vernier sensor, whose readings show only a small drop throughout the sampling period. The ENV20 sensor shows a decreasing trend in ORP measurements during the first half of the test, followed by a sudden increase in potential and then a sudden drop at around 12 h, with the behavior repeating after 18 h of measurements. Such behavior is not observed in the measurements of the other sensors. The ENV50 sensor, on the other hand, shows measurements with little oscillation for about 16 h, after which it shows a sudden increase of about 7 mV in its potential measurements and maintains this level until the end of the test.

Figure 6b and Table 11 show the descriptive statistics for the Hanna probe and the LCS. In terms of precision, the Vernier sensor has the lowest standard deviation, followed by the Atlas Scientific ENV50 and ENV20 sensors. The Hanna probe has an SD of ± 4.43 mV. Only the DFRobot sensor is above this value (\(SD=\pm 5.82\) mV). The SMEWW defines a maximum deviation of ± 10 mV for ORP measurements in water, so all LCS are suitable in this respect. The Vernier sensor has the lowest RSD (0.08%), and all the others have a relative standard deviation lower than or very close to that of the reference probe (\(RSD=1.89\%\)), which supports their applicability.
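The precision figures discussed above (SD and RSD, checked against a tolerance such as ± 10 mV) can be sketched in a few lines of Python. The function name `precision_summary` and the use of the sample standard deviation (n − 1 denominator) are assumptions for illustration, since this section does not state which estimator was used; the readings are hypothetical.

```python
import statistics


def precision_summary(readings, sd_limit):
    """Summarize the precision of one sensor against a maximum allowed SD."""
    if len(readings) < 2:
        raise ValueError("at least two readings are required")
    mean = statistics.fmean(readings)
    # Sample standard deviation (assumed estimator, n - 1 denominator).
    sd = statistics.stdev(readings)
    if mean == 0:
        raise ValueError("RSD is undefined when the mean is zero")
    # Relative standard deviation, in percent of the mean reading.
    rsd = 100.0 * sd / abs(mean)
    return {
        "mean": mean,
        "sd": sd,
        "rsd_percent": rsd,
        "within_limit": sd <= sd_limit,
    }


if __name__ == "__main__":
    # Hypothetical ORP readings in mV, checked against a 10 mV tolerance.
    orp_mv = [230.0, 232.0, 234.0, 236.0, 238.0]
    summary = precision_summary(orp_mv, sd_limit=10.0)
    print(f"SD = {summary['sd']:.2f} mV, RSD = {summary['rsd_percent']:.2f} %")
    print("meets tolerance:", summary["within_limit"])
```

Note that a slow drift in the sample itself inflates the SD of every instrument, so this check describes the combined variability of sample and sensor rather than the sensor alone.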

In terms of accuracy, Table 12 shows that the sensor with the lowest relative error and bias is Vernier’s LCS, followed by ENV20, ENV50, and finally DFRobot. In terms of correlation, it is noteworthy that the DFRobot sensor agrees best with the Hanna probe, with a Pearson’s R of 0.774. Meanwhile, the Vernier sensor has the lowest RMSE, which is mainly due to its lower relative error and bias.

Fig. 7 Measurement of LCS and Hanna reference probe for turbidity. a Curves over time and b boxplot

Table 13 Statistical data on turbidity values observed during the sampling period

Although the oxidation-reduction potential is not a parameter usually included in water quality indexes, it indicates the tendency of water bodies to cleanse themselves through the presence of oxidizing substances, such as oxygen and chlorine. Therefore, high ORP values usually indicate high oxygen levels in the water. The ORP, also known as redox potential, is the tendency of a chemical species to acquire electrons and thereby be reduced. In the environment, the ORP of freshwater changes over time, and following it helps to evaluate the environmental conditions of water bodies; in healthy waters, values between 300 and 500 mV can be expected (for aerobic species). However, anaerobic conditions are also found under natural conditions (Donna N. Myers, 2019). The bias and RSD obtained by the LCS for ORP raise concerns about measurements performed by these sensors, except for the Vernier sensor.

Experiments with turbidity sensors

Figure 7a shows the turbidity measurements carried out with the low-cost Delfino sensor and the Hanna probe over the 24 h of testing. The turbidity measured by both sensors starts at a certain value and decreases approximately linearly over time. Table 13 shows the descriptive statistics of the data set, which can also be seen in the boxplot in Fig. 7b.

Table 14 Error analysis for turbidity sensors in relation to the Hanna reference probe
Fig. 8 Measurements from LCS and Hanna reference probe for temperature. a Curves over time and b boxplot

In terms of precision, Table 13 shows that both the Hanna probe and the Delfino sensor have a standard deviation of less than ± 0.7 NTU and an RSD of less than 1.69%. Although the SMEWW suggests deviations of less than ± 0.02 NTU, the observed deviations are considered acceptable given that the turbidity changes during the test due to the deposition of solid particles in the sample.

In terms of accuracy, Table 14 shows that the relative error of Delfino’s LCS is approximately 15%, with a bias of 6 NTU. In terms of correlation, this LCS has a Pearson’s R of 0.9585 and an RMSE of 0.1383, indicating a high level of agreement between the two devices. Therefore, the sensor is considered suitable for turbidity measurements in aquatic environments compared to the reference probe, provided that this bias is taken into account.

The relative error is higher than that of DFRobot’s SEN0189 sensor, which was tested and presented in the paper by Nandakumar et al. (2020). In their tests, the authors report a relative error of \(1.6\%\). In the work by Xavier et al. (2022), the LCS SEN0189 only survived immersion for the first 8 h of a 42-h test. However, during the 8-h test phase, the SEN0189 delivered results that were very similar to those of the reference sensor. The average measured turbidity was 13.65 NTU with a standard deviation of 5.37, while the reference sensor recorded an average turbidity value of 15.3 NTU with a standard deviation of 0.66.

Turbidity is a measure of water clarity. Turbidity and total suspended solids (TSS) are different ways to measure similar water quality characteristics. Turbidity reflects the presence of suspended particles in the water, such as clay, silt, organic matter, and algae. It is measured by nephelometry and reported in NTU; therefore, it does not indicate the concentration or mass (i.e., mg/L) of the species present. In the environment, it can vary according to local conditions (soil, vegetation, temperature, discharge, etc.) and does not directly indicate the quality of water in the reservoir. Like ORP, turbidity is a parameter worth following over time at the same location to help analyze water quality (Fondriest Environmental, 2014; EPA - United States Environmental Protection Agency, 2021).

Experiments with temperature sensors

Figure 8a shows the temperature measurements by the LCS and the Hanna probe during the entire test period. The curves show a great similarity between the measurements of the reference probe and the LCS, although a difference in the absolute values can be seen in the results of the ENV50 sensor. In contrast, the Vernier and DFRobot sensors correspond exactly to the value measured by the Hanna probe. Table 15 and Fig. 8b show the descriptive statistics of the data set.

In terms of precision, according to Table 15, the LCS have a standard deviation of the same order of magnitude as the reference probe (SD = ± 0.41 °C) and an RSD of less than 2%, which clearly shows the equivalence of the performance of the LCS and the Hanna multiparameter probe.

Table 15 Statistical data on temperature values observed during the sampling period
Table 16 Error analysis for temperature sensors in relation to the Hanna reference probe

In terms of accuracy, Table 16 shows that the relative errors are less than 0.42%, with the exception of the ENV50 sensor, which has a relative error in the order of 4%. Table 16 shows a very high correlation between the LCS and the Hanna probe, with an RMSE of less than 0.061 and a Pearson’s R of more than 0.9877. Therefore, it is considered that the LCS are equally suitable for temperature measurement and all represent a good alternative for a water quality monitoring system.

In comparison with the results presented in the literature, the temperature sensors evaluated in this work perform better. Bórquez López et al. (2020) report testing an Atlas Scientific sensor of an unidentified model, obtaining an SD of 2.81°C and a Pearson’s R of 0.98. The same correlation coefficient is reported by Demetillo et al. (2019). Huan et al. (2020) report an SD of 0.12°C and a relative error of \(0.15\%\). An analysis in terms of RMSE, bias, and Pearson’s R is presented by Méndez-Barroso et al. (2020). In that paper, an Atlas Scientific sensor of an unidentified model showed an RMSE of 0.50, a bias of 0.99°C, and a Pearson’s R of 0.98.

Temperature is a parameter commonly used to calculate the water quality index since it affects the biological and chemical reactions in the ecosystem. It can also help to indicate the discharge of wastewater. Considering the RSD and bias, the Vernier and DFRobot LCS are adequate for performing measurements.

Conclusion

We have presented a study of 19 low-cost sensors for monitoring five parameters that contribute to the analysis of water quality: pH, dissolved oxygen, oxidation-reduction potential, turbidity, and temperature. These sensors, which are produced by four manufacturers, were selected because they are the most commonly used in the literature on the development of low-cost water monitoring systems. The sensors were evaluated in terms of their technical and operational specifications. Their performance was statistically evaluated and described in comparison with each other and with respect to the Hanna HI98194 multiparametric reference probe.

The results were statistically analyzed, with the majority showing a relative error of less than 5%. The pH and temperature sensors gave the most accurate results, with maximum relative errors of 4.95% and 4.01%, respectively. The DO, ORP, and turbidity sensors had higher relative errors, five of them with errors greater than 10%. Regarding the correlation between the low-cost sensors and the reference probe, the best results were observed for the temperature sensors, with a Pearson’s R above 0.98, followed by the turbidity sensor with \(R > 0.95\) and by two of the pH sensors and one ORP sensor with \(R > 0.77\). The worst performances were observed for some DO and ORP sensors, with high relative errors (\(>10\%\)) and low correlation coefficients (\(R<0.25\)).

In general, the low-cost sensors presented a satisfactory performance for all parameters evaluated. It can be concluded that low-cost sensors can be used as complementary instruments to quickly detect changes in water quality parameters.

Further studies are needed to assess the performance of these sensors over time in terms of data quality and reliability, resistance to environmental conditions, maintenance requirements, need for recalibration, and other characteristics in different water samples. Performance metrics should include the study of sensor drift, i.e., the subtle and gradual changes that occur in the sensor over time, especially when sensors remain submerged for long periods. Error correction methods, such as machine learning and incorporating the temperature variable when creating the sensor’s analytical curve, can also be applied to improve the accuracy of readings.
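One simple way to incorporate temperature into a sensor's analytical curve is an ordinary least-squares fit of the reference value against both the raw sensor reading and the water temperature. The sketch below is only an illustration of that idea, not a method evaluated in this work: the linear model form, the function names, and the solver are assumptions, and a real calibration would need enough samples spanning the expected temperature range.

```python
def solve_linear_system(matrix, rhs):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    solution = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (m[i][n] - tail) / m[i][i]
    return solution


def fit_temperature_calibration(raw, temperature, reference):
    """Fit reference = a + b * raw + c * temperature by least squares.

    Returns the coefficients [a, b, c] from the normal equations.
    """
    if not (len(raw) == len(temperature) == len(reference)) or len(raw) < 3:
        raise ValueError("need three series of equal length with at least 3 samples")
    rows = [(1.0, x, t) for x, t in zip(raw, temperature)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, reference)) for i in range(3)]
    return solve_linear_system(xtx, xty)


def apply_calibration(coefficients, raw_value, temperature_value):
    """Correct one raw reading using the fitted coefficients."""
    a, b, c = coefficients
    return a + b * raw_value + c * temperature_value


if __name__ == "__main__":
    # Hypothetical paired readings: raw sensor pH, temperature in degrees C,
    # and the reference probe pH for the same instants.
    raw = [6.8, 7.0, 7.3, 7.1, 6.9]
    temp = [20.0, 21.0, 22.5, 23.0, 21.5]
    ref = [7.02, 7.22, 7.52, 7.35, 7.14]
    coeffs = fit_temperature_calibration(raw, temp, ref)
    print("corrected reading:", round(apply_calibration(coeffs, 7.2, 22.0), 3))
```

The normal equations are adequate for a handful of predictors; for larger models or noisier data, a numerically more robust solver or a dedicated regression library would be a safer choice.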