Introduction

Bacterial spot is one of the most devastating diseases of tomato (Solanum lycopersicum L.), particularly in wet and humid environments (Jones et al. 1991). At least four species of Xanthomonas (X. euvesicatoria, X. vesicatoria, X. perforans, and X. gardneri) (Jones et al. 2004; Mbega et al. 2012) cause the disease on tomato, resulting in severe yield and fruit-quality losses in open field production (Yang et al. 2007; Stall et al. 2009). The recent development of solar greenhouses for vegetable production in winter (Gao et al. 2010) has accelerated the spread of the disease in protected tomato production areas in China (Yang 2013; Sun et al. 2014). The existence of multiple species and races of these pathogens, together with the lack of efficacy of commonly applied bactericides and the lack of commercially available resistant varieties, makes it difficult to control the disease effectively.

Although the bacterial spot pathogen can infect all above-ground parts of tomato plants, most studies have been based on assessment of foliar disease (Yang et al. 2007). Therefore, accurate estimation of disease severity on tomato leaves is critical for making appropriate decisions in disease management and for monitoring epidemic progress to minimize the impact of the disease. Visual assessment of disease severity with the Horsfall and Barratt (1945) category scale has been widely used for screening resistance sources and for investigating the genetic basis of resistance to bacterial spot in tomato (reviewed in Sun et al. 2014). However, the accuracy and reliability of visual assessment using the category scale have been questioned (Nutter and Esker 2006; Bock et al. 2009a, 2010a). Indeed, direct visual estimation of disease severity using a 0–100 % ratio scale agrees better with true disease severity values than Horsfall-Barratt scaled data do (Bock et al. 2009a; Sun et al. 2014). To help minimize estimation error, standard area diagram sets (SADs) have been developed as assessment aids for estimating the severity of many crop diseases, including corn white spot (Capucho et al. 2010), apple sooty blotch and flyspeck (Spolti et al. 2011), coffee leaf rust (Capucho et al. 2011), pepper anthracnose (Pedroso et al. 2011), potato early blight (Duarte et al. 2013), wheat blast (Rios et al. 2013), pecan scab (Yadav et al. 2013), wheat spot blotch (Domiciano et al. 2014), Asiatic citrus canker on sweet orange (Braido et al. 2014, 2015), and loquat scab (Gonzalez-Dominguez et al. 2014). These SADs can effectively improve the accuracy, precision and reliability of visual estimates, particularly for inexperienced raters.

Considering the economic importance of the tomato crop, the impact of bacterial spot on tomato production worldwide, and the absence of a SAD set for evaluating bacterial spot severity on tomato leaves, the objective of this study was to develop and evaluate a SAD set encompassing the full range of disease severities as an aid for raters to estimate bacterial spot severity on tomato leaves accurately and reliably.

Materials and methods

Plant growth

Tomato seeds of an F2 population consisting of 538 individuals derived from a cross between the two lines OH 88119 and Hawaii 7998 were germinated in 288 Square Plug Trays (Taizhou Longji Gardening Materials Co., Ltd, Zhejiang, China) in a mixture of peat and vermiculite (3:1). OH 88119, an elite processing tomato breeding line that is susceptible to bacterial spot, was used as the susceptible control, while the unimproved breeding line Hawaii 7998, with partial resistance to race T1 (Yang et al. 2005), was used as the resistant control. Two-week-old seedlings were transplanted into 12 cm (diameter) × 12 cm (height) pots filled with the same peat:vermiculite mixture and placed in a greenhouse (25/30 °C night/day cycle with a 14 h photoperiod). Water and fertilizer were provided as needed.

Inoculum preparation and inoculation

X. euvesicatoria race T1 strain Xcv110c was grown in Petri plates on yeast, dextrose, and calcium carbonate (YDC) agar medium (Lelliot and Stead 1987) at 28 °C for 48 to 72 h. Bacteria were washed from the agar with sterile double-distilled water (ddH2O) and the suspension was adjusted to approximately 6 × 10⁷ colony forming units (CFU) per ml, as confirmed using a serial dilution plating method (Sun et al. 2014). Plants were spray-inoculated with the bacterial suspension using a handheld sprayer approximately 6 weeks after transplanting. The inoculated plants were kept in a plastic tent in the greenhouse and misted with water twice a day (9:00 am and 5:00 pm) from 1 day before inoculation to 15 days after inoculation to increase humidity and prolong leaf wetness for disease development.
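The plate-count arithmetic behind a serial dilution estimate can be sketched as follows. This is a generic illustration and not necessarily the exact procedure of Sun et al. (2014); the function name and the colony count, dilution factor and plated volume in the example are hypothetical.

```python
def cfu_per_ml(colonies, dilution_factor, plated_volume_ml):
    """Estimate CFU/ml of the undiluted suspension from one countable plate.

    dilution_factor is the total fold-dilution of the plated sample,
    e.g. 1e5 for a 10^-5 dilution.
    """
    if colonies < 0 or dilution_factor <= 0 or plated_volume_ml <= 0:
        raise ValueError("counts must be >= 0; dilution and volume must be > 0")
    return colonies * dilution_factor / plated_volume_ml


# Hypothetical plate: 60 colonies from 0.1 ml of a 10^-5 dilution.
estimate = cfu_per_ml(colonies=60, dilution_factor=1e5, plated_volume_ml=0.1)
print(f"{estimate:.1e} CFU/ml")
```

In practice several plates per dilution would be counted and averaged; the single-plate form is kept here only to show the calculation.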

Development of the SAD set

A total of 2,398 diseased leaves were collected from all parental and F2 plants 15 days after inoculation and wiped with a moist cloth to remove dust. Leaves from the same plant were scanned as a single TIFF image using a BenQ 5560 Color Scanner (BenQ, Shanghai, China) at 600 × 600 dpi. The true disease severity (percent diseased leaf area) was measured using the leaf-by-leaf approach (Sun et al. 2014) with the image analysis software ASSESS V2.0 (Lamari 2008), and was used as the reference for evaluating the accuracy and precision of rater estimates with and without use of the SAD set. Based on the range (0.1–71.8 %) and the frequency of the true disease severities in the 2,398 diseased leaves, 10 images spanning the minimum and maximum disease severity on a linear basis were developed. Although disease severity greater than 80 % was not observed in this study, it does occur under field conditions (Scott et al. 1997; Hutton et al. 2010a; Sun et al. 2011). Therefore, two images with disease severities of 80 and 90 % were also included, resulting in a SAD set with 12 images (Fig. 1).

Fig. 1 The standard area diagram set developed as an aid for estimating disease severity of Xanthomonas euvesicatoria race T1 on tomato leaves. The values represent the percent diseased leaf area

Validation of the SAD set

Twelve inexperienced raters with no formal training in, or familiarity with, plant disease symptoms assessed the same set of 150 images of diseased leaves selected from the above 2,398 leaves, with a range (0.1–62.8 %) of known actual disease severity (Fig. 2). The 150 leaves were placed on individual slides to be viewed in a PowerPoint presentation file. The 12 raters assessed the 150 leaves twice on the same day. The first assessment was done in the morning without the use of the SAD set. The symptoms of tomato bacterial spot on leaves were described to the raters, who were instructed to provide an estimate of percent diseased area for each image. Both necrotic and chlorotic symptoms were considered part of the tomato bacterial spot symptoms. In the afternoon, the 12 raters conducted the second assessment using the SAD set as a visual aid, comparing each unknown image with the diagrams in the SAD set to obtain a new estimate of the percentage of diseased area.

Fig. 2 The frequency and range of disease severities (% diseased leaf area including the necrotic area measured using image analysis) in a set of 150 images of tomato leaves infected with Xanthomonas euvesicatoria race T1

Data analysis

All analyses were performed in either SAS V9.2 (SAS Institute Inc., Cary, NC, USA) or MS Excel 2007 (Microsoft Corp, Redmond, WA, USA). Main effects of number of SADs used (0 = without SADs and 1 = with SADs), actual disease severity on the 150 diseased leaves, and rater, together with all two-way interactions, were determined using the general linear model described in Braido et al. (2014).

The accuracy and precision of the estimates made by each rater with and without the use of the SADs were determined based on Lin’s concordance correlation coefficient (LCCC, ρc) (Lin 1989). Perfect agreement between the estimates and actual values corresponds to LCC statistics of systematic bias (scale shift or slope shift) υ = 1, constant bias (location bias or height shift) u = 0, accuracy Cb = 1, precision r = 1, and agreement ρc = 1 (Nita et al. 2003; Bock et al. 2010b). Any deviation from these values indicates a loss of accuracy, agreement or precision. Each of these statistics describing precision, bias and agreement was used in subsequent analyses. The frequency of each statistic was also calculated with and without the SADs. Absolute error (estimated severity minus actual severity) with and without the SADs was calculated for all raters and all tomato bacterial spot disease severity assessments.
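The statistics named above can be computed for one rater along the following lines. This is a minimal sketch using Lin's (1989) definitions, not the SAS code used in the study; the function name, the dictionary keys, the choice of estimate over actual for the scale ratio, and the use of population (1/n) moments are assumptions made for illustration.

```python
from math import sqrt
from statistics import fmean, pstdev


def lin_ccc_stats(actual, estimate):
    """Return r, scale shift, location shift, Cb and rho_c for one rater."""
    x = [float(a) for a in actual]
    y = [float(e) for e in estimate]
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    mx, my = fmean(x), fmean(y)
    sx, sy = pstdev(x), pstdev(y)  # population SDs (assumed convention)
    if sx == 0 or sy == 0:
        raise ValueError("both sequences must vary")
    sxy = fmean([(a - mx) * (b - my) for a, b in zip(x, y)])
    r = sxy / (sx * sy)                 # precision (Pearson correlation)
    upsilon = sy / sx                   # scale shift, 1 = no slope bias
    u = (my - mx) / sqrt(sx * sy)       # location shift, 0 = no height bias
    cb = 2.0 / (upsilon + 1.0 / upsilon + u ** 2)  # bias correction factor
    return {"r": r, "upsilon": upsilon, "u": u, "Cb": cb, "rho_c": r * cb}


# Hypothetical severities (%): the rater overestimates every leaf by 10 points.
stats = lin_ccc_stats([5, 10, 20, 40, 60], [15, 20, 30, 50, 70])
print({name: round(value, 3) for name, value in stats.items()})
```

In this made-up example the estimates are perfectly correlated with the actual values (r = 1), yet the constant offset produces a non-zero u, which lowers Cb and therefore ρc. This is why agreement is reported separately from precision.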

The inter-rater reliability of the estimates without and with the SADs was determined using the coefficient of determination (R²) from linear regression analyses (PROC REG) of the pairwise relationships between the estimates for all pairs of raters (Nutter and Schultz 1995). In addition, the intra-class correlation coefficient was calculated to measure inter-rater reliability without and with the SADs by analyzing each data set with a two-way random effect ANOVA (PROC ANOVA) as described by Nita et al. (2003). Variance components were obtained using PROC VARCOMP. The intra-class correlation coefficient (ρ) for all raters was calculated using the formula ρ = σ²_leaf / (σ²_leaf + σ²_rater + σ²_error) (Yadav et al. 2013).
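Both reliability measures are simple to express once the inputs are available. The sketch below is illustrative only: it takes the variance components as given (in the study they came from PROC VARCOMP) and uses the fact that R² from a simple linear regression equals the squared Pearson correlation. The function names and all numbers in the example are hypothetical.

```python
from itertools import combinations
from statistics import fmean


def r_squared(a, b):
    """R^2 of a simple linear regression between two raters' estimates."""
    ma, mb = fmean(a), fmean(b)
    sab = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    saa = sum((p - ma) ** 2 for p in a)
    sbb = sum((q - mb) ** 2 for q in b)
    return sab * sab / (saa * sbb)


def pairwise_r_squared(estimates_by_rater):
    """R^2 for every unordered pair of raters, keyed by (rater_i, rater_j)."""
    raters = sorted(estimates_by_rater)
    return {
        (i, j): r_squared(estimates_by_rater[i], estimates_by_rater[j])
        for i, j in combinations(raters, 2)
    }


def intraclass_correlation(var_leaf, var_rater, var_error):
    """rho = var_leaf / (var_leaf + var_rater + var_error)."""
    return var_leaf / (var_leaf + var_rater + var_error)


# Hypothetical estimates (%) from three raters on four leaves.
ratings = {"A": [5, 10, 20, 40], "B": [10, 20, 40, 80], "C": [6, 9, 25, 38]}
for pair, value in pairwise_r_squared(ratings).items():
    print(pair, round(value, 3))

# Hypothetical variance components, not values from this study.
print(intraclass_correlation(var_leaf=8.0, var_rater=1.0, var_error=1.0))
```

With 12 raters, the pairwise step yields 66 rater pairs.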

For all LCC statistics (υ, u, Cb, r, ρc) and the inter-rater reliability (R²), the difference between means for each rater from each pair of assessments (i.e., with the SAD set minus without the SAD set) was calculated. An equivalence test was used to test for significance and to calculate the 95 % confidence intervals (CIs) for each statistic (the difference between the means) by bootstrapping using the percentile method (Yi et al. 2008; Bardsley and Ngugi 2013; Yadav et al. 2013). All analyses were based on 2,000 balanced bootstrap samples drawn using PROC SURVEYSELECT, with the 95 % CI calculated using PROC UNIVARIATE (SAS Institute Inc.). The difference was considered not significant (α = 0.05) if the CI of the mean difference included zero.
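The percentile-bootstrap logic can be sketched as follows. Note that this illustration uses ordinary resampling with replacement rather than the balanced bootstrap applied in the study, and that the function name, the seed, the percentile indexing rule and the example values are all assumptions.

```python
import random
from statistics import fmean


def bootstrap_mean_diff_ci(with_sad, without_sad, n_boot=2000, alpha=0.05, seed=1):
    """Percentile CI for the mean paired difference (with minus without)."""
    diffs = [w - wo for w, wo in zip(with_sad, without_sad)]
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(diffs)
    # Ordinary (not balanced) bootstrap: resample raters with replacement.
    boot_means = sorted(fmean(rng.choices(diffs, k=n)) for _ in range(n_boot))
    lower = boot_means[round(alpha / 2 * n_boot)]
    upper = boot_means[round((1 - alpha / 2) * n_boot) - 1]
    return fmean(diffs), lower, upper


# Hypothetical agreement values for five raters, not data from this study.
without_sad = [0.70, 0.78, 0.84, 0.88, 0.92]
with_sad = [0.94, 0.95, 0.96, 0.96, 0.97]
mean_diff, lower, upper = bootstrap_mean_diff_ci(with_sad, without_sad)
significant = not (lower <= 0.0 <= upper)  # CI excluding zero => significant
print(round(mean_diff, 3), round(lower, 3), round(upper, 3), significant)
```

A balanced bootstrap would additionally ensure that every observation appears the same number of times across the full collection of resamples; the decision rule on the resulting interval is unchanged.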

Results

Effect of the SADs

The general linear model analysis demonstrated that the main effects of number of SADs, actual disease severity, and rater, and the interactions of actual disease severity × rater, actual disease severity × number of SADs, and rater × number of SADs, were significant (Table 1). Based on Lin’s concordance correlation analysis of estimated and actual severity, assessments made by the raters using the SADs were closer to the actual values than those made without the aid of the SAD set. For all raters, there was a linear relationship between estimated and actual disease severity (Fig. 3). According to the equivalence test, all statistical parameters (r, Cb, υ, u and ρc) were significantly improved when the raters used the SAD set as an aid to estimate bacterial spot severity on tomato leaves of unknown severity, demonstrating that both the accuracy and precision of the estimated values were greatly improved (Table 2).

Table 1 General linear model (GLM) analysis of the effects of the standard area diagram set (SADs) (with and without the SADs), raters (12) and actual disease severity (150) on estimates of bacterial spot severity on tomato leaves

Fig. 3 Relationship between actual and estimated disease severity of Xanthomonas euvesicatoria race T1 assessed on 150 tomato leaves by 12 different raters without and with the use of the standard area diagram set (SADs). The dashed line is the concordance line representing perfect agreement between actual and estimated severity (slope of 1, intercept of 0), and the solid line represents the best-fit linear regression line. Agreement was determined with Lin’s concordance correlation coefficient (ρc), calculated as the product of the correlation coefficient (r) and the bias correction factor (Cb). Cb is calculated from the location shift (u) and scale shift (υ), which indicate changes in line height and slope, respectively

Table 2 Effect of standard area diagram (SAD) assessment aids on the bias, precision and agreement of, and time taken for, assessments of bacterial spot severity on 150 diseased leaves of tomato by 12 raters

All bias and accuracy component statistics of Lin’s concordance correlation coefficient improved significantly when the SAD set was used to assess disease severity of bacterial spot on tomato leaves (Table 2). The agreement (ρc) was improved for all 12 raters when the SAD set was used. When the SAD set was not employed, ρc ranged from 0.71 to 0.93, with a mean of 0.83. However, when the SAD set was used, ρc ranged from 0.95 to 0.97, with a mean of 0.96 (Table 2). The combined estimate of accuracy (Cb) was closer to 1 when using SADs. Both location and scale bias (u and υ values, respectively) improved for all raters. Without the use of the SAD set, the mean was −0.34 with a range of −0.73 to 0.09 for u and 0.82 with a range of 0.62 to 1.00 for υ, whereas with the SAD set the mean was 0.01 with a range of −0.19 to 0.11 for u and 1.01 with a range of 0.90 to 1.09 for υ (Table 2). The use of the SAD set also helped raters to estimate disease severity more precisely. Without the SAD set, the precision (r) ranged from 0.84 to 0.96, with a mean of 0.92, but when the SAD set was employed, the precision ranged from 0.95 to 0.98, with a mean of 0.97 (Table 2). The more precise evaluation of bacterial spot disease severity using the SAD set was also confirmed by a reduction in absolute error. Raters had larger errors when the SAD set was not used than when it was used (Fig. 4). Without the SAD set, most raters tended to overestimate disease severity when the percent diseased leaf area was less than 40 % and to underestimate it when the percent diseased leaf area was greater than 40 %. This tendency was markedly reduced when the SAD set was used (Fig. 4).

Fig. 4 The absolute error (estimated severity minus actual severity) for assessment of a set of 150 images of tomato leaves infected with Xanthomonas euvesicatoria race T1 by each of the 12 raters with and without the use of the standard area diagram set (SADs)

The frequencies of lower bias values (u and υ) and of improved accuracy (Cb), agreement (ρc) and precision (r) statistics increased with the use of SADs (Fig. 5). The distributions of all bias and accuracy component statistics of Lin’s concordance correlation coefficient were narrower when the SAD set was used than when it was not. Based on measures of bias, accuracy and precision (Fig. 6), raters with the poorest ability to evaluate disease severity tended to improve the most, whereas raters who already showed low bias and high levels of accuracy, precision and agreement without the SAD set gained comparatively little from using it.

Fig. 5 The frequency of bias, precision and agreement values without and with use of standard area diagram set (SADs) assessment aids by 12 raters who assessed 150 images of tomato leaves infected with Xanthomonas euvesicatoria race T1. a scale bias, b location bias, c correction factor, d Lin’s concordance correlation coefficient, e correlation coefficient

Fig. 6 The relationship between gain (difference between the estimates with and without the use of the standard area diagram set (SADs)) and measures of accuracy, precision and agreement of disease severity estimates made by 12 inexperienced raters for a set of 150 images of tomato leaves infected with Xanthomonas euvesicatoria race T1. a scale bias, b location bias, c bias correction factor, d Lin’s concordance correlation coefficient, e correlation coefficient

Inter-rater reliability

Inter-rater reliability was improved by using the SAD set to assess bacterial spot severity on tomato leaves. Based on the equivalence test, both the mean pairwise coefficient of determination (R²) and the intra-class correlation coefficient (ρ) were significantly improved when the raters used the SAD set (Table 3), suggesting that raters assessed disease more uniformly when the SAD set was provided. Without its use, 86.4 % of the pairwise comparisons had an R² < 0.90, but when the SAD set was employed, only 25.8 % of pairwise comparisons had an R² < 0.90 and 74.2 % resulted in an R² ≥ 0.90 (Fig. 7a). The ρ value was greater when the SAD set was used (ρ = 0.85 without the SAD set and ρ = 0.91 with the SAD set), indicating that it provided greater inter-rater reliability (Table 3). The frequency of improved inter-rater reliability increased (Fig. 7a), and the regression analysis showed that the improvement was consistently greatest for those pairs of raters who were least reliable without the SADs (Fig. 7b).

Table 3 Inter-rater reliability of visual assessments by 12 raters of bacterial spot on 150 leaves of tomato both without and with the use of standard area diagram (SAD) assessment aids. Inter-rater reliability is measured by the intra-class correlation coefficient (ρ) and coefficient of determination (R²)

Fig. 7 The (a) frequency of the inter-rater reliability measured by the coefficient of determination (R²) without and with use of standard area diagram set (SADs) assessment aids by 12 raters who assessed a set of 150 images of tomato leaves infected with Xanthomonas euvesicatoria race T1 and (b) the relationship between the coefficient of determination without SADs and the difference between the coefficients of determination (with SADs minus without SADs)

Discussion

Resistance to bacterial spot in tomato is partially dominant or incomplete, suggesting multigene models for field resistance (Yang et al. 2007; Yang 2013). This increases the difficulty of developing varieties with resistance to the pathogen through traditional breeding approaches. Although several quantitative trait loci (QTLs) for resistance to bacterial spot have been identified in tomato (Yang et al. 2005; Hutton et al. 2010a, b; Sharma et al. 2011; Sun et al. 2011, 2014), none of them has been fine-mapped owing to the lack of a tool for accurate and precise assessment of disease severity, which prevents the use of marker-assisted selection to circumvent some of the problems associated with phenotypic selection for resistance conditioned by multiple genes. In the present study, a SAD set containing 12 images with a range of disease severity approximately linearly distributed from 0.5 to 90 % was proposed. The results demonstrated that the SADs improved the accuracy, precision and reliability of disease severity assessment, providing a tool for phenotypic data collection and fine-mapping of QTLs conferring resistance to bacterial spot in tomato.

Previous studies set the maximum severity values for constructing SADs on the basis of true disease severity data collected in a single year or growing season, and the maximum values ranged from 39 to 100 % (Capucho et al. 2010; Capucho et al. 2011; Pedroso et al. 2011; Spolti et al. 2011; Duarte et al. 2013; Rios et al. 2013; Yadav et al. 2013; Braido et al. 2014, 2015; Gonzalez-Dominguez et al. 2014; Domiciano et al. 2014). It is well known that disease severity is usually influenced by environmental conditions, so a single season of observation might not reflect the full range of disease severity. In the present study, the maximum disease severity was 71 %. The number of diagrams (10) used to depict the severity range (0.5–71 %) was similar to the number used in SADs of other pathosystems, which is of practical use (Spolti et al. 2011; Duarte et al. 2013; Rios et al. 2013; Yadav et al. 2013; Domiciano et al. 2014). However, disease severity greater than 80 % has been frequently observed in the field (Scott et al. 1997; Hutton et al. 2010a; Sun et al. 2011). Therefore, two additional diagrams (80 and 90 %) were also included, resulting in 12 diagrams in the SADs, to represent the range of intensities of bacterial spot that occur in the field. Although the use of an excessive number of diagrams can be time-consuming and might affect the efficiency of the assessments (Yadav et al. 2013), it might help to solve the problem of greater bias in disease severity estimation caused by using few severity values in a certain disease range (e.g., three values for disease severity >50 % in Domiciano et al. 2014).

It has been reported that inexperienced raters tend to overestimate low disease severity and to underestimate mid to high disease severity on leaves when SADs are not used (Forbes and Korva 1994; Bock et al. 2008, 2009b). A tendency towards slight overestimation and underestimation of disease severity has also been observed when SADs were used (Capucho et al. 2010; Duarte et al. 2013; Rios et al. 2013; Domiciano et al. 2014). In this study, without the SADs, most raters tended to overestimate when the disease severity was less than 40 % and to underestimate when the disease severity was greater than 40 %. However, this tendency was markedly reduced by using the SADs (Fig. 4). Several factors, including lesion size, lesion shape, lesion number and disease distribution, might affect the tendency to over- or underestimate disease severity (Bock et al. 2011; Sherwood et al. 1983). Lesions of bacterial spot on tomato leaves can coalesce to form long dark streaks and result in yellowing and blighting of leaflets when conditions are optimal for disease development (Jones et al. 1991). These features might make it more difficult to estimate disease severity accurately and precisely, even with the aid of SADs.

In summary, the standard area diagram set proposed herein as an aid to estimating bacterial spot severity on tomato leaves consistently and significantly improved the accuracy, precision and reliability of the assessment. The SADs should be useful for quantitative studies in epidemiology, for disease management, and for the collection of phenotypic data for fine-mapping of QTLs. It is hoped that the SADs could also be used for assessing disease severity on tomato leaves caused by other bacteria.