Introduction

With freshwater availability decreasing in many biomes, accurately accounting for water cycling through terrestrial ecosystems is becoming increasingly important at local, regional, national, and global levels with potential policy implications across scales [1, 2]. Plants exert a particularly strong influence on the global water cycle because transpiration is the dominant source of water returning to the atmosphere from terrestrial ecosystems [3, 4], comprising 60–80% of terrestrial evapotranspiration [3, 5, 6]. While occupying only 31% of Earth’s terrestrial surface, forests account for over 50% of global productivity [7, 8], which is positively related to evapotranspiration [9]. Not surprisingly, the relative importance of transpiration increases as tree density increases across terrestrial biomes [5, 7, 8, 10]. Further importance is placed on trees and the accuracy of calculating transpiration when we consider their agronomic value and our dependence on trees as a source of food [11,12,13]. Clearly, a thorough and robust understanding of the global water cycle requires a comprehensive and accurate quantification of tree and forest transpiration.

Sap flow is the most common technique for estimating transpiration because of its relatively low cost and ease of installation [14, 15]. Heat-based sap flow methods, which use heat as a tracer to quantify the rate of water movement through xylem, have been applied for nearly a century [16]. Sap flow methods can generally be categorized into heat pulse and constant heating approaches, of which there are four common methods: heat pulse velocity, HPV [17]; heat balance, HB [18, 19]; thermal dissipation, TD [20]; and heat field deformation, HFD [21]. Each method has relative advantages and disadvantages including ease of construction, cost, as well as size and type of stem on which sensors can be used [22]. The application of TD probes occurs twice as often as any other sap flow method [14]. Lu [23] was the first to suggest that calibration should be performed on TD probes whenever the design deviated from that of the original [20], which has implications for commercially available and lab-built sensors alike. Gutierrez and Santiago [24] demonstrated that TD probes consistently underestimated sap flow rates. Furthermore, the accuracy of TD probes, as well as other sap flow methods, was scrutinized in 2010 [25] and it was concluded that species-specific calibrations were necessary to obtain accurate whole-tree transpiration estimates (hereafter, referred to as water use). The need to calibrate commercially available and lab-built sensors has been established by several publications [25,26,27]. Additional publications have since demonstrated the improved accuracy that results from calibrating sap flow sensors [28,29,30].

Sap flow calibration is the comparison of an independent reference water use with water use estimated by sap flow sensors. Independent reference water use is often measured using gravimetric or potometer approaches. Gravimetric approaches use a balance to measure the mass of water flowing through a stem segment [25, 31], whereas potometer approaches measure the volume or mass of water removed from a reservoir containing a tree stem with an intact canopy severed from its root system [32]. The magnitude and direction of error between sap flow measurements and independent reference water use are highly variable across and within previous reports, ranging from underestimating actual water use by a factor of three [33] to overestimating actual water use by 55% [32]. Although the degree of error may differ among methods, recent literature has observed inaccuracies across multiple sap flow methods, which suggests the need to calibrate is not method-specific [14, 34, 35]. Despite numerous warnings and recommendations (Table 1), the performance and application of calibrations appear to be the exception rather than the rule in reports of tree sap flow.
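
As a minimal sketch of the comparison described above, the following Python snippet relates sensor-estimated water use to an independent reference measurement and derives a single multiplicative correction factor. The data values, the function names, and the choice of a through-origin linear correction are illustrative assumptions; they are not taken from any of the cited calibration studies, which use a variety of coefficient and correction schemes.

```python
def correction_factor(reference, estimated):
    """Least-squares slope through the origin, so that reference ~ k * estimated."""
    if len(reference) != len(estimated) or not reference:
        raise ValueError("need paired, non-empty measurements")
    numerator = sum(r * e for r, e in zip(reference, estimated))
    denominator = sum(e * e for e in estimated)
    if denominator == 0:
        raise ValueError("estimated values are all zero")
    return numerator / denominator


def percent_error(reference, estimated):
    """Signed error of total estimated water use relative to the reference (%)."""
    return 100.0 * (sum(estimated) - sum(reference)) / sum(reference)


# Hypothetical paired hourly water use (kg per hour); not real data.
reference_kg_h = [0.50, 1.00, 1.50, 2.00]   # e.g., from a balance or potometer
estimated_kg_h = [0.40, 0.80, 1.20, 1.60]   # uncalibrated sensor estimate

k = correction_factor(reference_kg_h, estimated_kg_h)
err = percent_error(reference_kg_h, estimated_kg_h)
print(f"correction factor: {k:.2f}, uncalibrated error: {err:.1f}%")
```

In this hypothetical case the sensor underestimates the reference by a constant proportion, so one factor removes the error; real calibrations may require nonlinear or species-specific coefficients.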

Table 1 Recommendations to calibrate taken directly from previous literature

Here, we seek to determine how researchers have responded to recommendations for performing and applying sap flow calibrations. Specifically, we conducted a literature search to quantify the number of sap flow reports that have been published since these calibration issues have been identified to determine if sap flow calibrations have been adopted as part of a best practice for sap flow methodologies. We also explored a potential bias in calibration between commercially available and lab-built sap flow sensors based on the possibility that users of commercially available sensors may assume implicit accuracy. We further identify and discuss challenges, limitations, and research priorities related to sap flow calibration that have broader implications for the application of sap flow.

Methods and Results

We performed a literature search in Web of Science using the search terms “‘Sap fl*’ OR ‘Sapfl*’, AND ‘Tree*’” across a 9-year period from 2010 through 2018 to determine the total number of publications that reported tree sap flow. We began our search in 2010 because two publications definitively demonstrated the importance of calibrations across multiple sap flow sensors and xylem anatomies in 2010 [25, 31], and we aimed to identify if these seminal publications were incorporated as best practices in subsequent sap flow literature. We carefully read the abstract and methods of each publication and skimmed other sections for context to determine if it met our criteria. We defined calibration as the comparison of independent reference water use to estimated water use from sap flow sensors. We also considered publications that described applying calibrated coefficients from previous studies or described applying a correction equation derived from calibration. We did not consider publications that documented using other sap flow sensors as a reference water use. We also did not consider publications emphasizing the importance of calibration (including the two seminal papers from 2010) rather than using calibration to ensure or improve accuracy of tree water use estimates (Supplemental Table 2). We focused exclusively on woody trees and therefore did not include monocots, tree-like monocots, vines, or lianas. Our search criteria identified 875 publications reporting on sap flow in trees during that 9-year period (Supplemental Table 3).

Across the 9-year period, we observed a mean (± s.e.) of 97.2 ± 6.3 tree sap flow publications per year with a minimum and maximum of 66 and 128, respectively (Fig. 1a). The number of tree sap flow publications and the number reporting calibrations increased through time. We observed a mean of 5.2 ± 1.1 publications documenting calibrations per year with a minimum and maximum of 2 and 10, respectively, which translated to a mean of 5.2 ± 1.0% of publications documenting calibrations per year with a minimum of 1.9% in 2013 and a maximum of 9.4% in 2017. Of the 875 publications, 378 (43.2%) used commercially available sensors, 45 (5.1%) explicitly indicated sensors were made in the laboratory, and 452 (51.7%) did not specify sensor source or construction details, which likely indicates laboratory construction because manufacturing information is usually provided in publications. Across the 9-year period, only 47 publications (5.3%) documented the performance of calibration or the use of calibration coefficients from previous studies. Thermal dissipation and compensation heat pulse were the sap flow methods (i.e., sensor types) most commonly calibrated (Fig. 1b). Of the 47 publications that documented calibration, 35 performed calibrations, whereas 12 applied data from previous calibrations. Of the 35 publications that performed calibration, 42.9% generated species-specific coefficients, 20.0% applied a correction factor instead of new coefficients, 25.7% validated empirical or theoretical coefficients, and 11.4% did not provide enough information to understand how calibration data were applied to sap flow measurements (Table 2). Of the 35 publications that performed calibration, 17 (48.6%) used commercially available sensors, which, based on our assumption above, suggests that 18 (51.4%) used lab-built sensors.
Multiple approaches were used to determine independent reference water use during calibration, but three approaches (gravimetric, potometer, and balance) accounted for 57% of calibrations (Table 3).
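
The proportions reported above follow directly from the publication counts. The short Python sketch below reproduces the sensor-source shares and the per-year means from the counts given in the text; the variable names are ours, and the snippet is only an arithmetic check, not part of the original analysis.

```python
# Counts taken from the text above (2010 through 2018, 9 years).
total_publications = 875
years = 9
sensor_source = {
    "commercial": 378,
    "lab-built (stated)": 45,
    "unspecified": 452,
}
calibration_publications = 47

# The three sensor-source categories should partition the full set.
assert sum(sensor_source.values()) == total_publications

# Share of each category, as a percentage rounded to one decimal place.
shares = {
    name: round(100 * count / total_publications, 1)
    for name, count in sensor_source.items()
}

# Mean number of publications per year, overall and for calibrations.
mean_publications_per_year = round(total_publications / years, 1)
mean_calibrations_per_year = round(calibration_publications / years, 1)

for name, share in shares.items():
    print(f"{name}: {share}%")
print(mean_publications_per_year, mean_calibrations_per_year)
```

Standard errors are not recomputed here because the individual yearly counts are reported only in Fig. 1a.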

Fig. 1

a Total number of tree sap flow publications (black, solid line), the number of publications that document calibration or use calibrated parameters (black, dashed line), and the percent of publications that calibrated or used calibration (black, dotted line) from 2010 to 2018. b Total number of publications published from 2010 to 2018 with percent of publications calibrated for each method, including compensation heat pulse (CHP), heat balance (HB), heat field deformation (HFD), heat pulse (HP), heat ratio (HR), sap flow+ (Sap+), thermal dissipation (TD), and Tmax (Tmax). There were 55 publications that employed multiple sap flow methods (i.e., sensor types), which resulted in differences between the total number of publications in Fig. 1a (n = 875) and the number of publications summed across sap flow methods in Fig. 1b (n = 962)

Table 2 Summary of how data from the 35 publications that performed sap flow calibrations have been applied to sap flow data
Table 3 Summary of calibration approach among the 35 publications that performed sap flow calibration

Discussion

We have shown that, despite a number of publications that demonstrated the importance of calibrating sap flow sensors (e.g., Supplemental Table 1), only 5.3% of studies over the past 9 years actually performed calibrations or applied results from previous calibrations to ensure or improve the accuracy of tree water use estimates (Fig. 1). That commercially available and lab-built sensors were calibrated at similar proportions in the literature suggests that lack of calibration does not simply reflect user assumptions that commercially available sensors do not require calibration. Failure to adopt calibration as a best practice may leave a lasting impact on our ability to advance a variety of disciplines that routinely implement sap flow, which could impede our understanding of terrestrial water cycling across stand, landscape, and global scales. Specifically, we sacrifice accuracy in constructing water budgets and these inaccuracies propagate through synthesis and modeling efforts. There is also economic cost associated with not performing sap flow calibrations, as irrigation of orchards and plantations may depend on accurate estimates of transpiration, particularly in semiarid climates [36,37,38,39]. Calibrating scientific instruments or experimental measurements is a common best practice across all scientific disciplines that ensures the highest quality data and strongest inference possible regarding the phenomena and processes we are interested in understanding. We encourage researchers to place additional effort into performing calibrations on all sap flow sensor types regardless of specific sap flow method (e.g., HPV, HB, TD, HFD, etc.) or sensor origin (i.e., commercial or lab-built), which will provide the highest quality sap flow data possible and enhance the strength of inference we strive to make through our research.

There is not only a lack of calibration throughout recent sap flow literature but also relatively few references to the seminal papers highlighting the importance of calibration. Only 47 of the 875 sap flow papers (5.3%) identified in our literature search documented calibration. From July 2010 to 2018, Steppe [25] had been cited 140 times, and from December 2010 to 2018, Bush [31] had been cited 69 times, according to Web of Science. In other words, these seminal papers that advocated for implementation of calibration into sap flow methodology have each been cited by fewer than 20% of subsequent tree sap flow publications. Furthermore, these papers are cited for purposes other than calibration recommendations, which suggests their influence on promoting calibration is even lower. We would expect higher levels of citations for these seminal calibration publications, even if these publications were cited only to explain why calibration was unnecessary within the scope of various studies. It has been nearly 20 years since Lu [23] identified a need for calibration in TD and 10 years since Steppe [25] identified errors in estimated sap flow for TD, HFD, and CHP, yet calibration has still not been embraced as a best practice.

The improved accuracy in water use estimates gained through calibration represents a challenge for research because the calibration process is resource intensive. Calibration is expensive in terms of plant material, time, and equipment. Time constraints are always a factor when considering data collection protocols. Equipment costs beyond those needed for measuring sap flow can be expensive when using large precision balances or weighing lysimeters, which represents a major limitation to calibration of intact trees. Further costs may be incurred for calibration with larger trees as heavy machinery may be necessary for transport and calibration [40]. In addition, most calibration approaches require felling trees, which may present logistical limitations. Felling trees may not be possible if the study involves a rare or endangered species [41], if trees are located in restricted areas (e.g., national parks, private property, or public lands), if trees are too large to safely fell [42] or if trees provide economic value and the sacrifice is not feasible (e.g., fruit orchards, plantations, etc.). The logistics of calibration become increasingly challenging, or altogether impossible, with increasing tree size because of mass and the ability to fell, transport, and perform calibrations. Regardless of these challenges, calibration is an essential validation procedure necessary to accurately quantify tree water use. Ultimately, these logistical limitations become limitations on when and where heat-based sap flow methods can be applied to gain estimates of transpiration.

Correcting sap flow measurements using coefficients or correction factors from calibrations performed in other studies is an appealing alternative to calibration, but these approaches ultimately rely on an assumption that the coefficients or corrections result in more accurate estimates of water use, which may or may not be valid. Pooling data to determine correction factors based on site, sensor, or wood properties has been suggested in a published synthesis of 290 calibrations representing 55 studies [34••] and in two studies that each examined six species [31, 35]. Although the intent is to improve estimates of water use, the suggestion to apply a pooled correction factor likely provides a false sense of advancement that continues to perpetuate calibration avoidance, which further impedes the progress we could be making in understanding plant water use if calibrations were universally applied. Indeed, calibrations can differ among species at the same site [27, 32, 43] and within species at different sites [31, 35], indicating that applying coefficients or correction factors from other studies may do little to improve estimates of water use. However, method-specific corrections may have potential to provide a measure of uncertainty on transpiration estimates, though this needs to be explored with actual calibrations. While calibration can mitigate intrinsic errors related to sensor fabrication and specific tree and site attributes, it does not mitigate extrinsic errors related to sapwood heterogeneity and is not a substitute for proper sensor installation, functional sapwood area determination, or wounding correction.

Not all sap flow applications require estimates of absolute water use, but even these applications can benefit from calibration. For example, calibration may not be necessary when sap flow is compared among trees of the same species of a similar age and size on similar sites when the goal is to understand responses to environmental factors and quantification of absolute water use is not of interest. Sap flow methods in these circumstances would facilitate comparison of relative water use, as sap flow without calibration is positively related to plant water use [34••]. However, even when sap flow is used to explore relationships between water use and environmental factors, calibration could improve the data and potentially influence statistical inference. For example, vapor pressure deficit (VPD) exerts strong control on leaf-level and whole-tree transpiration [44,45,46], so relating sap flow to VPD is common [47,48,49,50]. Calibration could increase or decrease the volume of sap flow and thus the slope of the relationship between sap flow and VPD, which could influence statistical comparisons of how different tree species respond to environmental factors at the same site or how the same species may respond to environmental factors at different sites. Thus, calibration should be incorporated as a best practice in sap flow methods even when estimates of absolute water use are not the primary aim.
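
To illustrate the point about slopes, the following Python sketch fits an ordinary least-squares slope of sap flow against VPD before and after applying species-specific correction factors. All values, species labels, and factors are hypothetical, and the multiplicative form of the correction is an assumption made for simplicity; it is a sketch of the reasoning, not an analysis from the cited studies.

```python
def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical data: two species show identical uncalibrated responses to VPD.
vpd_kpa = [0.5, 1.0, 1.5, 2.0]
raw_sap_flow = [0.4, 0.8, 1.2, 1.6]          # same raw readings for both species

# Hypothetical species-specific multiplicative correction factors.
correction = {"species_a": 1.25, "species_b": 2.0}

raw_slope = ols_slope(vpd_kpa, raw_sap_flow)
calibrated_slopes = {
    species: ols_slope(vpd_kpa, [factor * flow for flow in raw_sap_flow])
    for species, factor in correction.items()
}

print(f"uncalibrated slope (both species): {raw_slope:.2f}")
for species, slope in calibrated_slopes.items():
    print(f"{species} calibrated slope: {slope:.2f}")
```

Under these assumptions, two species that appear to respond identically to VPD before calibration differ after calibration, because a multiplicative correction scales the slope by the same factor.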

Through examining the tree sap flow literature, we identified aspects of calibration that may be important but are unable to offer strict recommendations moving forward, as calibration literature is limited (see Flo [34••]); rather, we highlight two immediate research priorities for sap flow research. Because of the increasing difficulty of performing calibration with larger trees, calibrations are often performed on small trees or branches and calibration coefficients are then applied to larger trees [43, 44]. However, we are unaware of studies demonstrating that this approach provides robust and transferrable coefficients for larger trees of the same species at the same site. If this assumption does not hold, it would further limit the application of sap flow techniques. There are some fundamental differences in calibration approaches, including water movement through xylem (i.e., negative vs. positive pressure) and calibration material (i.e., intact trees, stems with foliage, stem segments), but there are not enough data to determine fully how these differences may affect coefficients and subsequent water use estimates [34, 35]. Testing the assumptions that calibration coefficients derived from small trees are transferrable to larger trees and that coefficients derived from different calibration approaches result in similar estimates of water use are immediate research priorities.

Though we cannot overstate recommendations to perform sap flow calibrations, we acknowledge limitations to this ideal and encourage authors to justify explicitly why calibration could not be performed or why the application did not require accurate estimates of water use. When calibrations are performed, we suggest including enough information to understand how calibration was performed and applied. This will provide further information on what other barriers are present for researchers using this technology and thus present an opportunity for researchers to explore possible solutions to these problems. Despite the clear lack of calibration to improve tree water use estimates, the publications that have focused on calibrations (Supplemental Table 2) provide important insight into the approaches and applications of sap flow calibration that will help guide future research. Further studies like these are necessary to improve, refine, and perhaps ultimately standardize calibration approaches. While we suggest scenarios when calibration is critical and when it may be less so, we cannot recommend specific sap flow best practices until calibration is more thoroughly researched and standardized. Such efforts should ultimately lead to more accurate, robust, and reliable sap flow data, which will improve our ability to assess and understand physiological mechanisms and ecosystem processes related to tree water use.