Introduction

Large expenditure has been directed to research and development of using plant sensing (PS) technology to select how much fertilizer to apply. While adoption has been increasing, adoption rates by producers are still low for most precision agriculture technologies and especially the PS technology considered here (Erickson and Widmar 2015; Schimmelpfennig and Ebel 2016; Boyer et al. 2016). Presumably, few producers have adopted PS because it does not unambiguously pay to do so.

With respect to nitrogen (N) fertilizer application, the economic feasibility of PS hinges on the capability of the technology to either (1) increase yields by recommending more N, or (2) retain yields while recommending less N. Increasing yields by applying more N is an unlikely strategy for PS technology, as producers already apply more N than necessaryFootnote 1 in most years (El-Hout and Blackmer 1990; Babcock 1992; Rajsic and Weersink 2008). Thus, economic feasibility hinges on PS technology retaining yields while recommending less N than would be applied without the technology.

The study focused on the Raun et al. (2002, 2005) precision sensing system that used PS of hard red winter wheat ((Triticum aestivum L.) in February to determine recommended levels of topdress N. Previous research has indicated that the Raun et al. PS system lowered input costs by recommending less N than farmer practice; however, it did not retain yield levels (Biermacher et al. 2009; Boyer et al. 2011). The yield loss from the PS system-recommended N could be due to an implicit assumption of zero prediction error, which is unlikely since weather after sensing affects potential yields. Note that yield losses with current commercial versions are not as large since the formulas have been adjusted to apply more nitrogen. This is a heuristic way of doing what the Bayesian model does. The PS systems evaluated here can be used on an entire field or for each plot within a field if using a variable rate system. As efforts expand in using big data for precision agriculture purposes, the Bayesian approach used here can serve as a general approach in a variety of applications.

Information not directly used by a PS system is the information a producer has about a given field. Producers sometimes intuitively apply a little more than the recommendation if it is less than their current practice or a little less if the recommendation is more, but they are not using an optimal Bayesian approach. A producer does not possess perfect information about the functional relationship of expected yield and N. For example, Rodriguez and Bullock (2015) argue that the traditional recommendation of quantity of N per yield goal does not match actual yield functions, so if experiment station recommendations are inaccurate then producer estimates are also likely inaccurate. A Bayesian approach allows a producer to combine optical sensing information with prior objective—or subjective—information to decrease the error associated with the calculation of profit-maximizing N.

Bayesian approaches have been used to determine the economic value of weather information to agricultural producers (Doll 1971; Baquet et al. 1976; Byerlee and Anderson 1982; Marshall et al. 1996), project agricultural yield (Krause 2008) and determine returns of using soil sample information (Pautsch et al. 1999). However, no research has shown how Bayesian updating can improve profit-maximizing N recommendations from a PS system.

In this study, Bayesian methods are used to combine prior information with PS information. Fertilizer costs account for more than 30% of operating costs (Huang et al. 2009) and a PS system can reduce this cost. Furthermore, advancement in PS profit-maximizing N recommendations may increase the adoption rate of precision agriculture. Moreover, increased adoption of PS would decrease N application levels and consequently reduce negative environmental externalities associated with excess N application.

Materials and methods

The Bayesian approach

The Bayesian approach combines the producer’s prior estimate of the production function with new information from sensing.

Prior information

Consider a risk neutral producer who maximizes expected profit with a choice set that is reduced to an N decision. The objective function for a producer can be represented by

$$ \begin{aligned} \mathop {\hbox{max} }\limits_{N} E(\pi |N,I_{0} ) & = \mathop {\hbox{max} }\limits_{N} pE(Y) - rN \\ s.t.\, E(Y) & = \mathop \int Y(N;\varTheta )f(\varTheta |I_{0} )d\varTheta \\ \end{aligned} $$
(1)

where \( E\left( {\pi |N,I_{0} } \right) \) is expected profit, p and r are known output price and N cost, and E(Y) is expected yield as a function of N, represented by N, prior information set I 0 and the functional relationship of expected yield and N with the parameter vector Θ.

Previous research has generally favored plateau-type models over polynomial functions. The linear stochastic plateau of Tembo et al. (2008) was used since it was developed to match the production function assumed with Raun et al.’s (2005) precision nitrogen system. Tembo et al.’s stochastic plateau model has been used in many previous studies to estimate yield response to N (Tumusiime et al. 2011; Boyer et al. 2013; Ouédraogo et al. 2016) and can be expressed as

$$ Y_{it} = \hbox{min} \left( {\beta_{0} + \beta_{1} N_{it} , \mu_{P} + v_{t} } \right) + \tau_{t} + \varepsilon_{it} $$
(2)

where Y it is the observed yield of location i in year t, N it is the level of N applied, \( v_{t} \sim N(0, \sigma_{v}^{2} ) \) is the plateau year random effect, \( \tau_{t} \sim N(0, \sigma_{\tau }^{2} ) \) is the year random effect, \( \varepsilon_{it} \sim N(0, \sigma_{\varepsilon }^{2} ) \) is a random error term and β 0, β 1 and μ P (mean plateau yield) are parameters to be estimated. The stochastic variables (v t , τ t , ε it ) are assumed to be independent. With a stochastic plateau function, the expected profit maximizing solution is to equate the marginal expected revenue and the marginal cost. If the revenue from applying nitrogen when it is needed is six times its marginal cost, then the expected profit maximizing solution is to apply nitrogen up to the point that it is only needed in one year out of six. The Raun et al. (2002, 2005) PS system measures yield with no nitrogen (β 0 + τ t ) and the plateau level of yield using a nitrogen-rich strip (μ P  + v t  + τ t ). By reducing the uncertainty about the yield response to nitrogen, the system can reduce the average amount of nitrogen used.

Equation (2) was estimated using historical data to represent prior beliefs. In practice, the parameters would need to be calibrated based on the producer’s expected yield and expected optimal level of nitrogen, as well as historical information for the field or region. Since historical data were available here, an objective prior was used.

New information

Many PS systems are available that vary by sensor type and nitrogen response index (Alchanatis et al. 2005; Begiebing et al. 2007; Ehlert et al. 2004; Havránková et al. 2007). The N fertilizer optimization algorithm (NFOA) developed by Raun et al. (2002) and updated by Raun et al. (2005) has been produced commercially under the name GreenSeeker® (Trimble, Sunnyvale, CA, USA), which uses an optical sensor and gives a recommended level of N based on mid-season growth. The PS algorithm used here corresponds closelyFootnote 2 to that of GreenSeeker®. Initial versions applied a different amount of N to each square meter, but the current commercial version is a $495 hand-held sensor and gives one recommended rate for each field.

The Raun et al. (2002, 2005) NFOA procedure can be separated into three processes: (1) estimate potential yield if no additional mid-season N is applied; (2) estimate maximum potential yield if additional mid-season N is applied; and (3) calculate the required additional mid-season N to reach maximum potential yield given an exogenously estimated slope. Note that if there is no error, the first and second processes of NFOA define the intercept and plateau of a plateau model, but NFOA does not provide information similar to the slope (β 1). Also, note that producers could also be asked to provide subjective estimates of their yield goal and the amount of nitrogen to reach that yield goal, which could be used to derive the same parameters in Eq. (2).

Processes one and two of NFOA are achieved by applying a non-yield-limiting amount of N to a narrow strip in the field before planting—also referred to as an N-rich strip. Between Feekes (Large 1954) growth stages 4 and 6, optical reflective measurements (ORM) were used to calculate a normalized difference vegetation index (NDVI) that measured the total biomass at the time of measurement. NDVI was divided by the number of growing degree days to arrive at an in-season estimated grain yield (INSEY). Raun et al. (2002, p. 816) define INSEY as INSEY = NDVI (Feekes 4–6)/days from planting where growing degree days (GDD) > 0 [GDD = (Tmin + Tmax)/2 − 4.4 °C, where Tmin and Tmax are daily ambient low and high temperatures. McMaster and Wilhelm (1997) discussed alternative ways of measuring GDD, but that is not an issue that is explored here. The INSEY where the field-level pre-plant N was applied was used to determine minimum potential yield, and the INSEY where the non-yield-limiting pre-plant N was applied was used to determine maximum potential yield.

Raun et al. (2002, 2005) used an exponential functional form for the relationship between yield and the INSEY. A linear functional form was used here since there is little difference in the explanatory power of the two functional forms, and the linear functional form allows an analytical solution for the Bayesian optimization problem. To calculate minimum and maximum yield potential for a given location and year, the linear model used is

$$ Y_{it} = \gamma_{0} + \gamma_{1} \textit{INSEY}_{it} + z_{t} + \epsilon_{it} , $$
(3)

where INSEY it is INSEY for location i in year t, \( z_{t} \sim N(0, \sigma_{z}^{2} ) \) is the year random effect, \( \epsilon_{it} \sim N(0, \sigma_{\epsilon }^{2} ) \) is a random error term, and γ 0 and γ 1 are parameters to be estimated. Estimated minimum yield potential, represented by \( {\widehat{\textit{YP}}}_{0t} , \) and maximum yield potential, represented by \( \widehat{\textit{YP}}_{Mit} , \) for location i in year t are found by

$$ \widehat{\textit{YP}}_{0t} , = \hat{\gamma }_{0} + \hat{\gamma }_{1} \overline{\textit{INSEY}}_{0t} $$
(4)

and

$$ \widehat{\textit{YP}}_{Mt} , = \hat{\gamma }_{0} + \hat{\gamma }_{1} \overline{\textit{INSEY}}_{Mt} $$
(5)

where \(\overline{\textit{INSEY}}_{Oit}\) is an average INSEY where field-level pre-plant N was applied for location i in year t, \(\overline{\textit{INSEY}}_{Mit}\) is an average INSEY across plots where non-yield-limiting pre-plant N was applied for location i in year t, and \( \hat{\gamma }_{0it} \) and \( \hat{\gamma }_{1it} \) are the estimated parameters from Eq. (5). Note that \( \widehat{\textit{YP}}_{0t} \) is an estimate of (β 0 + τ t ) and \( \widehat{\textit{YP}}_{Mt} \) is an estimate of (μ P  + v t  + τ t ). Bushong et al. (2016) propose using a soil moisture measure as an additional factor to predict yield potential. The Bayesian procedure will essentially remain the same regardless of the data used to estimate yield potential.

Data used for estimating parameters

Data from hard red winter wheat experiments conducted on agronomic research stations in Lahoma and Stillwater, Oklahoma, USA consisting of yields, N applied and the INSEY from ORM were used to estimate parameters. Observations on N applied and yield were collected at Lahoma from 1971 to 2012 and Stillwater Oklahoma from 1969 to 2012. Observations on PS variables were also collected at these locations; however, PS technology is a relatively recent invention and thus PS data collection did not begin until 1999. The Stillwater location is missing observations for 2007, 2009 and 2010. Both sites used a randomized complete block design with four replications. The data used were the average of these four replications and only observations where 67 kg/ha of P2O5 and 45 kg/ha of K2O were applied were used here.

The Lahoma site is a Grant silt loam (fine-silty, mixed, thermic Udic Argiustoll), while the Stillwater site is Kirkland silt loam (fine, mixed, thermic Udertic Paleustoll). Additional detail about these experiments is available at Oklahoma State University (2016a, b).

Parameters for the prior information (Eq. 2) and new information (Eq. 3) were estimated for every year that both sensing and yield information were available using an out-of-sample approach, similar to the grouped cross validation approach used by Norwood et al. (2004). That is, data collected from a given year were excluded when estimating parameters for that year. In contrast, the average INSEY values (i.e., \(\overline{\textit{INSEY}}_{Oit}\) and \(\overline{\textit{INSEY}}_{Mit}\)) were calculated for a given year using data only from that year.

Bullock and Mieno (2017) have proposed doing fertilizer experiments in farmers’ fields, but most producers will not have historical data on nitrogen experiments in their fields. The priors will need to be calibrated from historical yield data and could vary across the field if yield monitor data were available. Further research will be needed to extend the Bayesian approach to commercial applications. Note that a prior that varied across the field would be more expensive since it would require GPS coordinates.

Combining prior and new information with Bayesian updating

As shown in Fig. 1, the prior and new information contain analogous information, with the exception that new information does not provide an estimate of yield response to N. The process of using new information to revise prior information is the essence of Bayesian updating. Figure 2, adapted from Zellner (1971), schematically represents how prior information is used to estimate a prior density, new information is used to estimate a likelihood function, and the prior density and likelihood function are combined using Bayes’ Theorem to get a posterior density.

Fig. 1
figure 1

Representation of the relationship for parameters of the prior information and new information

Fig. 2
figure 2

The process of using new information to revise prior information

The benefit of relatively quick computation is essential for a PS system that has limited computing power such as an on-the-go system. Therefore, all densities are assumed to be multivariate normal, which greatly reduces the computational burden.

Prior information (historical yield and N levels), represented by I 0, can be used to estimate the relationship between expected yield and N in Eq. (2) and the vector of estimated parameters for the prior information is represented by

$$ \varvec{\varphi }_{0} = \left( {\begin{array}{*{20}l} {\hat{\beta }_{0} } \hfill \\ {\hat{\beta }_{1} } \hfill \\ {\hat{\mu }_{P} } \hfill \\ \end{array} } \right), $$
(6)

with covariance matrix that includes both the random effects and the estimation error:

$$ {\varvec{\Sigma}}_{0} = \left( {\begin{array}{*{20}l} {{\text{var}}\left( {\hat{\beta }_{0} } \right)} \hfill & {{\text{cov}}\left( {\hat{\beta }_{0} ,\hat{\beta }_{1} } \right)} \hfill & {{\text{cov}}\left( {\hat{\beta }_{0} ,\hat{\mu }_{t} } \right)} \hfill \\ {{\text{cov}}\left( {\hat{\beta }_{1} ,\hat{\beta }_{0} } \right)} \hfill & {{\text{var}}\left( {\hat{\beta }_{1} } \right)} \hfill & {{\text{cov}}\left( {\hat{\beta }_{1} ,\hat{\mu }_{t} } \right)} \hfill \\ {{\text{cov}}\left( {\hat{\mu }_{t} ,\hat{\beta }_{0} } \right)} \hfill & {{\text{cov}}\left( {\hat{\mu }_{t} ,\hat{\beta }_{1} } \right)} \hfill & {{\text{var}}\left( {\hat{\mu }_{t} } \right) + \hat{\sigma }_{v}^{2} } \hfill \\ \end{array} } \right). $$
(7)

The prior density then is represented by p 0(Θ) and the prior distribution of Θ is \( N\left( {\varvec{\varphi }_{0} ,\varvec{ }{\varvec{\Sigma}}_{0} } \right). \) In theory, the prior using Bayesian methods should be estimated, but using classical methods makes little difference.

New information (PS), represented by I, can be used to estimate the minimum and maximum yield potentials in Eqs. (6) and (7). However, I does not measure yield response to additional N and, therefore, the vector of estimated parameters for the new information must be supplemented with the slope estimate. The new data are uninformative with respect to the slope parameter. Thus, the vector of estimated parameters for the new information is

$$ \varvec{\varphi } = \left( {\begin{array}{*{20}l} {\widehat{\textit{YP}}_{O} } \hfill \\ {\hat{\beta }_{1} } \hfill \\ {\widehat{\textit{YP}}_{M} } \hfill \\ \end{array} } \right) $$
(8)

with lower triangular portion of covariance matrix Σ equal to

$$ \left( {\begin{array}{*{20}l} {\text{var} \left( {\hat{\gamma }_{0} } \right) + \text{var} \left( {\hat{\gamma }_{1} } \right)\overline{\textit{INSEY}}_{O}^{2} + 2\text{cov} \left( {\hat{\gamma }_{0} ,\hat{\gamma }_{1} } \right)\overline{\textit{INSEY}}_{O} + \sigma_{\epsilon}^{2} + \sigma_{z}^{2} } \hfill & . \hfill & . \hfill \\ 0 \hfill & \infty \hfill & . \hfill \\ {\text{var} (\hat{\gamma }_{0} ) + \left[ {\overline{\textit{INSEY}}_{O} + \overline{\textit{INSEY}}_{M} } \right]\text{cov} \left( {\hat{\gamma }_{1} ,\hat{\gamma }_{0} } \right) + \left[ {\overline{\textit{INSEY}}_{O} *\overline{\textit{INSEY}}_{M} } \right]\text{var} \left( {\hat{\gamma }_{1} } \right) + \sigma_{z}^{2} } \hfill & 0 \hfill & {\text{var} \left( {\hat{\gamma }_{0} } \right) + \text{var} \left( {\hat{\gamma }_{1} } \right)\overline{\textit{INSEY}}_{M}^{2} + 2\text{cov} \left( {\hat{\gamma }_{0} ,\hat{\gamma }_{1} } \right)\overline{\textit{INSEY}}_{M} + \sigma_{\epsilon}^{2} + \sigma_{z}^{2} } \hfill \\ \end{array} } \right) $$

The likelihood function then is represented by p(I|Θ) and gives that the estimate of Θ has distribution \( N\left( {\varvec{\varphi }, {\varvec{\Sigma}}} \right) \). In practice, the infinite variance for the slope parameter is approximated with a large number to make the calculations tractable.

The posterior vector of estimated parameters is a linear combination of the prior and likelihood vectors of estimated parameters (\( \varvec{\varphi }_{0} \) and \( \varvec{\varphi } \)). More specifically, the prior vector of estimated parameters is multiplied by the covariance matrix of the likelihood (\( {\varvec{\Sigma}} \)) and added to the product of the likelihood vector of estimated parameters and the prior covariance matrix (\( {\varvec{\Sigma}}_{0} \)). Both terms of the linear combination are normalized by the sum of the covariance matrices (\( {\varvec{\Sigma}}_{0} \) and \( {\varvec{\Sigma}} \)). Thus, the posterior mean vector is a weighted average of the prior and likelihood mean vectors, and the information with lower variance has a greater weight. Mathematically defined, the posterior vector of estimated parameters, derived formally in Duda et al. (2001, pp. 95–97) is

$$ \varvec{\varphi }_{b} = {\varvec{\Sigma}}_{0} \left( {{\varvec{\Sigma}}_{0} + \frac{1}{n}{\varvec{\Sigma}}} \right)^{ - 1} \varvec{\varphi } + \frac{1}{n}{\varvec{\Sigma}}\left( {{\varvec{\Sigma}}_{0} + \frac{1}{n}{\varvec{\Sigma}}} \right)^{ - 1} \varvec{\varphi }_{0} $$
(9)

with covariance matrix

$$ {\varvec{\Sigma}}_{b} = {\varvec{\Sigma}}_{0} \left( {{\varvec{\Sigma}}_{0} + \frac{1}{n}{\varvec{\Sigma}}} \right)^{ - 1} \frac{1}{n}{\varvec{\Sigma}}. $$
(10)

Note that (9) and (10) assume that \( {\varvec{\Sigma}}_{0} \) and \( {\varvec{\Sigma}} \) are known. The approach used here uses estimates of \( {\varvec{\Sigma}}_{0} \) and \( {\varvec{\Sigma}} \), which makes the calculation simple enough that it would be practical to use the formula with on-the-go sensing.

An example of estimated and combined parameters

For illustration, Table 1 shows parameter estimates for the prior density and likelihood function and the calculation of the posterior density for Stillwater in 2012. The minimum and maximum yield potential in the posterior mean vector are bounded by the minimum and maximum yield potential in the prior and likelihood mean vectors. Thus, the posterior density mean vector is a weighted average of the two sources of information.

Table 1 Parameter estimates for stillwater wheat yield (100 kg/ha) in 2012

A benefit of Bayesian updating that should not be overlooked can be seen by comparing the covariance matrices. Combining the prior and new information decreases the variance associated with estimating minimum and maximum yield potential.

N recommendations

N recommendations following Tembo et al. (2008) are denoted by N PI i , and Bayesian N recommendations are denoted by N B i . Prices were from the United States Department of Agriculture National Agricultural Statistics Service (NASS 2017a, b). The price of a kg of urea for the 2013 marketing year was $1.41 and this was used for the cost of N (r) and the price received for a kg of wheat (p) in the 2013 marketing year was $0.25.

With the Tembo et al. approach, if for example \( r/\left( {p\beta_{1} } \right) = 0.2 \) then, in four out of 5 years, producers would optimally apply more nitrogen than was needed in that year. The optimal expected yield will be less than the potential plateau yield in 1 year out of five, the maximum yield would not be reached. A perfect-information PS system would be able to achieve the plateau yield with an expected nitrogen level of \( (\mu_{P} - \beta_{0} )/\beta_{1} , \) but any imperfect information system will have a lower yield and use more nitrogen. Nevertheless, the N recommendation for new information is found by assuming that the PS system has no error:

$$ N_{i}^{NI} = \frac{{\widehat{\textit{YP}}_{M} - \widehat{\textit{YP}}_{O} }}{{\hat{\beta }_{1} }}. $$
(11)

New information recommendations were restricted to be non-negative as the yield was greater when no N was applied in some years. The approach in (11) is essentially what was proposed by Raun et al. (2002, 2005). The approach led to too little nitrogen being applied, so with experience \( \hat{\beta }_{1} , \) which is nitrogen use efficiency, has been reduced so that current formulas have been heuristically adjusted for the uncertainty. This heuristic adjustment is not considered here.

Yield and profitability comparisons

A linear plateau was estimated as a yield function with respect to N for both locations and for each year that both prior and new information were available. The linear plateau is

$$ y_{i} = \hbox{min} \left( {\delta_{0} + \delta_{1} N_{i} , \tau } \right) + \omega_{i} $$
(12)

where y i is the observed yield of plot i, N i is level of N applied to plot i, \( \omega_{i} \sim N(0, \sigma_{\omega }^{2} ) \) is a random error term, and δ 0, δ 1, and τ (plateau yield) are parameters to be estimated. The N recommendations (N PI i , N NI i , and N B i ) were then plugged into the estimated linear plateau to arrive at the estimated yields. Additional to the estimated yields for the N recommendations, yields were also estimated for a constant application of 73 and 109 kg/ha of N.

Results and discussion

To determine economic feasibility, net returns were calculated for each estimated yield and corresponding N recommendation. A custom urea application rate of $12/ha based on a survey by Doye et al. (2014) was used for the application cost.

N recommendations and net returns for Lahoma and Stillwater are shown in Tables 2 and 3, respectively. On average, in kg/ha, the prior information, new information and Bayesian N recommendations were 72, 41 and 66 in Lahoma and 61, 34 and 51 in Stillwater. The average Bayesian N recommendation was not significantly different than the average prior information N recommendation in Lahoma (F test, p value = 0.14); however, the N recommendation using the Bayesian approach was less in Stillwater (F test, p value = 0.03). On average, for both locations, new information N recommendations were significantly lower than both prior information and Bayesian N recommendations (F test, p value  < 0.01, for both locations).Footnote 3 This confirms that the new information, NFOA, tends to recommend lower levels of N. The new information N recommendation has more variation than prior information and Bayesian N recommendations. However, this result is partly due to new information recommending zero N in some years. A recommendation of zero N has the benefit of saving application costs ($12).

Table 2 Lahoma N recommendations and net returns
Table 3 Stillwater N recommendations and net returns

Average net returns in dollars per hectare for prior information, new information and Bayesian N recommendations were 692, 669 and 691 in Lahoma and 355, 345 and 354 in Stillwater. Additionally, the net returns for a constant application of 73 and 109 kg/ha of N were 690 and 716 in Lahoma and 369 and 370 in Stillwater. The only significant difference in net returns in Lahoma was between the new information N recommendation and 109 kg/ha of N (F test, p value = 0.03). There were no significant differences in net returns across N recommendations in Stillwater.

Table 4 shows how changes in the price received for wheat affects net returns for the various N recommendations. Changes in price received affect the ordering of net returns. For example, at the low wheat price ($0.13/kg), an N recommendation from new information has the highest net return for both Lahoma and Stillwater. However, the only additional significant difference in net returns found from the sensitivity analysis was between the new information N recommendation and 109 kg/ha of N at the high price ($0.38/kg) for Stillwater (F test, p value = 0.04).

Table 4 Sensitivity of net returns to a change in wheat price

Franzen et al. (2016) discuss four different plant sensing algorithms and all four of them are candidates for Bayesian updating. Note that the Bayesian approach could be revised to have a producer’s prior be optimal N with a level of uncertainty rather than to specify all the parameters with the prior as is done here. Such a prior would be easier for a producer to specify and would even further simplify the calculation.

The PS approach is noisy and so treating it as if it had no error worked poorly. The evaluation is also noisy since production functions were estimated for each year to predict yield. Note that nitrogen-rich strips in farmer fields are typically larger than experimental plots and so the noise could be higher here than in actual implementation. The prior distribution used here is from a regression using data from the same plots that were farmed in the same way each year. Producers likely have less accurate prior information than that considered here. So while the approach does not look profitable in this research, it might be closer to being profitable in actual practice.

Conclusion

This paper established a method of Bayesian updating to combine PS and prior information about the response to N for a given field. PS technology using a plug-in approach had lower N recommendations, but had the lowest return due to reduced yield. The Bayesian updating also provided a small reduction in the level of N. Thus, adoption of Bayesian updating would have social benefits by reducing negative environmental externalities associated with excess N application, such as runoff into waterways and increased carbon emissions, without sacrificing net returns. Because an analytical solution is used, it should be feasible to program the Bayesian solution in a hand-held device such as the current device that is being marketed. While current commercial algorithms focus on whole-field recommendations, the analytical solution also helps make it feasible to use the Bayesian updating algorithm in an on-the-go commercial system. The methods used provide a foundation for using Bayesian updating of parameters of a yield function in a PS system. Future research could build on this study by examining the differences between Bayesian updating of parameters to obtain an N recommendation, as done here, and Bayesian updating of N recommendations derived from separate sources of information.