Introduction

The pathogen and its vector

Citrus greening or huanglongbing (HLB) is one of the most destructive diseases of citrus in the world, and accounts for substantial economic losses in Asia, Africa and the American continent. The causal agents of the disease are Candidatus Liberibacter spp., gram-negative bacteria that are limited to phloem and have not been cultured consistently (Gottwald 2010). Characteristic symptoms of the disease include yellow shoots, blotchy mottle on older leaves, and lop-sided and off-flavor fruits, as well as a severe drop in production of usable fruit. Severely infected trees often appear stunted, sparsely foliated and die back, losing their economic viability (Gottwald 2010).

There are three currently known forms of the pathogen causing HLB: Candidatus Liberibacter asiaticus (CLas) occurring in Asia and the Americas, Candidatus Liberibacter africanus (CLaf) documented in Africa, and Candidatus Liberibacter americanus (CLam) in Brazil (da Graca and Korsten 2004). CLaf is transmitted by the African citrus psyllid (Trioza erytreae del Guercio) and usually occurs in cool and moist regions in highlands (mostly above 900 m), with temperatures below 30 °C (optimum 22–25 °C). Its vector T. erytreae is sensitive to temperatures above 32 °C (da Graca and Korsten 2004). The Asian form, CLas, transmitted by the Asian citrus psyllid or ACP (Diaphorina citri Kuwayama), is more heat tolerant and can withstand temperatures of 30–35 °C, while the optimal temperature range for ACP is 25–28 °C (da Graca and Korsten 2004). D. citri normally feeds and oviposits on young twigs and leaves (flush), and injection and acquisition of CLas by ACP are related to the feeding of nymphs and adults on the phloem sap of the host plant (Chiyaka et al. 2012; Lee et al. 2015). CLas has been observed in the alimentary canal, salivary glands and haemolymph of ACP and can multiply in its vector (Ammar et al. 2011). Transmission to a previously uninfected tree primarily takes place when infectious adults inject saliva into flush tissues. CLas transmission can also occur by grafting of infected plant materials (Lopes and Frare 2007).

The latency period between infection of flush and acquisition of CLas by ACP nymphs can be as short as 15 days (Lee et al. 2015). Optimal acquisition and transmission require flush colonization by nymphs. Adults emerging from infected nymphs are directly capable of transmission (Lee et al. 2015). The incubation period, from the time of infection until HLB symptom appearance, is much longer and can vary from a few months to several years (Shen et al. 2013). Previous studies suggest that young trees have a shorter incubation time than 7–10-year-old trees (6–12 months vs. 1–2.5 years) (Gottwald 2010; Manjunath et al. 2008).

Current distribution

All Citrus species and cultivars and some related plant genera like Murraya are susceptible to infection by CLas (Halbert et al. 2012; Manjunath et al. 2008). CLas dispersal takes place not only by movement of adult psyllids, but also by transportation of infected nursery stock of Citrus and related genera (Gottwald 2010; Lopes and Frare 2007). The latter route may be more important for long distance spread (Halbert et al. 2012).

The Asian form of the pathogen is widespread in most citrus growing countries in East and South Asia, as well as South, Central and North America, wherever citrus is grown (Bove 2014). In the USA, ACP was first reported in South Florida on a key lime tree in 1998; and in 3 years it spread into 31 counties in Florida (Grafton-Cardwell et al. 2014; Halbert 2005; Shen et al. 2013). CLas was first reported in South Florida in 2005 (Halbert 2005), and spread from there throughout peninsular Florida in less than 10 years (Shen et al. 2013). In 2001, ACP was found in Texas and in 2012, CLas was reported on sweet orange in San Juan, Texas (Kunta et al. 2012). In 2008, ACP was reported in Alabama, Georgia, Louisiana, Mississippi and South Carolina (Marutani-Hert et al. 2010), and in the meantime, HLB was observed in Georgia, Louisiana and South Carolina (Marutani-Hert et al. 2010). ACP also has been found in Southern California since 2008, and has spread to the citrus growing areas in the Central Valley since then. CLas was detected in one tree in Los Angeles in 2012 and in nine other trees in the same neighborhood in 2015 (in seven properties in the Los Angeles area) (Kumagai et al. 2014; CDFA 2015; http://www.cdfa.ca.gov/plant/pe/interiorexclusion/hlb_quarantine.html). Thus, it is still limited to the Los Angeles area.

Despite the production of healthy nursery stock in an ACP-free environment since 2008 (Halbert et al. 2012) and regular insecticide applications to control ACP in citrus groves in Florida, the disease has spread unabatedly. Eradication of infected trees may have slowed down HLB spread in very large groves, but has not been effective in areas with smaller operations (Bassanezi et al. 2012). ACP populations with insecticide resistance have developed (Tiwari et al. 2011) and complete vector control is impossible (Gottwald 2010). Finally, methods to control the pathogen in infected trees, for example by heat treatment or antibiotics, have not been developed fully (Zhang et al. 2011).

Previous HLB and ACP models

Several models were developed for the spread of ACP and HLB in trees, groves or at local and regional scales. In a deterministic analytical simulation model of CLas transmission within a citrus tree, Chiyaka et al. (2012) found that the rate of flush-to-flush CLas spread through a tree and the latency development rate for psyllid transmission were important variables affecting HLB development. A similar model with delay functions for the incubation period and egg-nymph stages predicted complete symptomatic infection of a grove in 5 years (Vilamiu et al. 2012). An individual tree- and insect-based deterministic model for one grove predicted a relatively slow spread: it took more than 7 years for a field to become fully infected (Kobori et al. 2012). In the latter model, the latent period was assumed to be 3 months, but it can be less than 15 days, which would result in much faster spread (Lee et al. 2015). The spread of ACP populations and tree-to-tree infections within a small grove can be so fast that all trees can become asymptomatically infected within 1 or 2 years, depending on the initial spatial colonization pattern of ACP, as predicted by another individual-based microsimulation model (Lee et al. 2015). A stochastic dispersal model of HLB over a number of citrus blocks differing in age in a very large orchard predicted that, in the presence of intensive control measures, the rate of disease spread would depend on the age of the trees and seasonal host susceptibility (Parry et al. 2014). At an even larger scale, a simulation model for the effects of temperature on citrus flushing and ACP development predicted that the current climatic conditions in Northern and Eastern Australia are conducive for ACP establishment, even though CLas and ACP are not present there (Aurambout et al. 2009). 
Similarly, a temperature-based simulation model for ACP, HLB and citrus production predicted that areas around the Gulf of Mexico and the Mediterranean Sea would be prone to invasion by ACP and HLB (Gutierrez and Ponti 2013). The climatic suitability for citrus growth (based on temperature, rainfall and evapotranspiration) and for ACP multiplication (based on temperature), and the potential distribution of ACP (and implicitly HLB) was also assessed using the SIMPEC modeling structure for various locations in Mexico (Torres-Pacheco et al. 2013). The Pacific coastal area (including areas in Mexico close to California) and the Southern Gulf coast area were considered most conducive to ACP establishment. CLas has been documented for the Northwest of Mexico but no official documentation is available for the South of Mexico (Torres-Pacheco et al. 2013). This last area, with a similar climate as citrus producing areas in Brazil, Cuba and Florida, would be most conducive for HLB caused by CLas.

Recent models (Lee et al. 2015) as well as observations (Shen et al. 2013) indicate that regional spread of CLas infection can be very fast. Nevertheless, there are still areas that are (relatively) free from this disease. For example, CLas has not been documented for Australia and parts of North America. A single incidence of HLB caused by CLas and numerous sightings of ACP in California raised concerns about the likelihood of HLB establishment in this area and elsewhere in the USA. However, few risk assessment models currently are available for predicting the potential establishment of HLB in those regions (Gutierrez and Ponti 2013), despite a call for predictive global mapping of HLB more than 10 years ago (da Graca and Korsten 2004). Current models on potential regional spread of HLB caused by CLas are based on the climatic suitability of various locations for ACP (Aurambout et al. 2009; Gutierrez and Ponti 2013; Torres-Pacheco et al. 2013). Although insect transmission is one of the major factors in disease spread, the risk of HLB establishment should not be based merely on the risk of establishment of the vector, because the pathogen may have different environmental requirements than its vector (Gottwald 2010). Therefore, the objectives of this study were to: (i) model the potential distribution of both ACP and HLB (caused by CLas) worldwide with a focus on the USA, (ii) identify climatic variables important for the potential establishment of ACP and HLB, (iii) compare the predictions of the likelihood of HLB and ACP establishment by different models, and (iv) identify hotspots with a high probability of establishment of both HLB (CLas) and ACP according to a consensus model.

To address these objectives, we used several single-species distribution models (SDMs). SDMs relate the current distribution of the target species to historical environmental data and then predict the potential distribution based on similarity of the environmental data. They can involve mechanistic or correlative modeling. SDMs have been widely used to assess the potential distribution of different organisms such as invasive plants, insects and pathogens into new areas (Narouei Khandan 2014; Narouei Khandan et al. 2013). We used two types of correlative modeling approaches to assess the climate suitability for CLas and ACP: (1) MaxEnt, a presence-only or presence-background model (Phillips et al. 2006), and (2) the Multi-Model Framework (MMF), which includes nine different presence-pseudo-absence models (Worner et al. 2010).

Materials and methods

Occurrence (presence) data

To acquire geo-referenced data of HLB caused by CLas and ACP incidence in latitude-longitude format, the published occurrence localities were cross-checked with literature and personal communication with experts. We only used HLB (CLas) occurrence data where the presence of the pathogen had been confirmed by PCR or official reports. Reports that were only based on visual assessments were not included. When the name of a city or town was mentioned as the presence locality of HLB or ACP, Google Earth (https://www.google.com/earth/) and its satellite imagery capability were used to locate the nearest citrus grove and determine its latitude and longitude. In total, 208 points were collected for HLB and 268 points for ACP (Table S1). Locations in the USA where HLB or ACP were reported were not used for model calibration (Fig. 1). Instead, they were used to evaluate the performance of the models. To test if the predictions would change when USA data were included, the models were run again with the complete global data set after appropriate spatial rarefaction. The resulting numbers of presence points were 224 and 299 for HLB and ACP, respectively.

Fig. 1

Current global distribution of citrus huanglongbing (HLB) caused by CLas (a) and the Asian citrus psyllid (ACP) (b). Occurrences of HLB and ACP in the USA were not included in the models

Environmental data

Climate data, including 19 bioclimatic variables, were acquired from the Worldclim website (www.worldclim.org) (Table S2). These variables have been derived from long-term (1950–2000) monthly temperature and rainfall data (Hijmans et al. 2005), and are assumed to reflect the climate suitability for the growth and development of different organisms including plant pathogens (Hijmans et al. 2005). The resolution of the bioclim data used in this study was 10 arc minutes (around 18.6 km).
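As a rough check on the stated grid size, the arithmetic below converts a resolution in arc minutes to kilometres. The Earth radius constant and the function name are illustrative assumptions, not part of the study; the result at the equator is close to the figure quoted above, and the east-west width of a cell shrinks with the cosine of latitude.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius (spherical approximation)


def cell_size_km(arc_minutes, latitude_deg=0.0):
    """Approximate (north-south, east-west) size in km of one grid cell."""
    degrees = arc_minutes / 60.0
    km_per_degree = 2 * math.pi * EARTH_RADIUS_KM / 360.0
    north_south = degrees * km_per_degree
    # Meridians converge towards the poles, so cells get narrower east-west.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


ns, ew = cell_size_km(10)            # 10 arc minutes at the equator
print(round(ns, 1), round(ew, 1))
```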

MaxEnt modelling

In MaxEnt, the presence locations of the target species (occurrence data) were compared with background points (the localities where the presence or absence of the target species was unidentified or unknown). To avoid overfitting the model to spatially clustered presence points, which would impair its ability to predict spatially independent data, the presence data were spatially rarefied using SDMtoolbox (Brown 2014). This process reduced the HLB (CLas) and ACP presence points from 208 to 174 and from 268 to 254, respectively, when the USA data were left out. In addition, to avoid too large a background extent around presence points, potentially resulting in the failure of the model to predict the suitability of un-colonized locations that might be climatically suitable for the species, the background optimization method suggested by Senay et al. (2013) was used. Thus, circles with a radius of 500 km around the presence points were selected as the most appropriate background extent, because ACP can spread several hundred km (Lewis-Rosenblum et al. 2015). SDMtoolbox was also used to determine variables that were highly correlated (Pearson’s correlation coefficients higher than 0.8). The random forest option that generates 1000 decision trees in MMF was then used to select significant bioclimatic variables among the variables that were not highly correlated (Breiman 2001). This resulted in the selection of 11 and 10 bioclimatic variables for HLB and ACP, respectively (Table 1). These reduced sets of variables improved model fit and reduced computation time. The same variables selected by the random forest program in MMF were also used in MaxEnt, because MaxEnt does not have a robust method for variable selection like the random forest program. We used 75 % of the presence data to train the model and 25 % for model validation (in addition to the USA presence points which were not used in the modelling process).
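The correlation screening step can be illustrated with a minimal sketch. It is not the SDMtoolbox implementation: the greedy rule (keep a variable only if its absolute Pearson correlation with every variable already kept is at most 0.8), the function names and the toy values are all assumptions made for illustration, since the text does not state which member of a correlated pair was retained.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)  # assumes neither variable is constant


def drop_correlated(variables, threshold=0.8):
    """Greedy filter: keep a variable if |r| <= threshold with all kept ones.

    `variables` maps a variable name to its values at the sampled locations;
    variables are considered in insertion order (an assumed tie-break rule).
    """
    kept = []
    for name, values in variables.items():
        if all(abs(pearson(values, variables[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Toy values only; they are not data from the study.
toy = {
    "bio06": [1.0, 2.0, 3.0, 4.0, 5.0],
    "bio11": [1.1, 2.0, 3.2, 3.9, 5.1],  # almost collinear with bio06
    "bio13": [5.0, 1.0, 4.0, 2.0, 3.0],  # weakly correlated with bio06
}
print(drop_correlated(toy))
```

In this toy case the second variable is dropped because it is nearly collinear with the first, while the third is retained.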
As MaxEnt fails to make predictions for areas where the climate is considerably different from the current range of the target species, the MESS analysis (‘Multivariate Environmental Similarity Surfaces’) was performed to determine if the model extrapolated appropriately in the areas of interest. Because we had enough presence points for HLB and ACP, default features were used in the MaxEnt models for both ‘species’ (Merow et al. 2013). The model was set to run with 5000 iterations and >10,000 background points. The model was also set to run the jackknife test, which evaluates variable importance, and to produce response curves showing how each environmental variable affected the model prediction. The MaxEnt prediction performance for the validation locations (25 % of the data set) was evaluated by the area under the curve or AUC (Phillips et al. 2006). The AUC is the area under the receiver operating characteristic (ROC) curve, which is built from confusion matrices comprising the frequency of each possible type of binary prediction at successive thresholds. The ROC curve plots the true positive rate (presence locations with correct predictions) against the false positive rate (absence locations that were incorrectly predicted as presence).
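The AUC can equivalently be read as the probability that a randomly chosen presence point receives a higher suitability score than a randomly chosen background (or absence) point, with ties counted as one half. The sketch below computes it in that pairwise form; the function name and the toy scores are assumptions for illustration and do not reproduce MaxEnt's own code.

```python
def auc(presence_scores, background_scores):
    """AUC as the share of presence/background pairs ranked correctly."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0      # presence ranked above background
            elif p == b:
                wins += 0.5      # ties count as half
    return wins / (len(presence_scores) * len(background_scores))


# Toy scores only: three of the four pairs are ranked correctly.
print(auc([0.9, 0.4], [0.6, 0.2]))
```

A value of 0.5 corresponds to a model that ranks presence and background points no better than chance, and 1.0 to perfect separation.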

Table 1 Average percent contribution (out of 5000 iterations) of environmental variables used in the MaxEnt model for prediction of global distribution of huanglongbing, HLB caused by CLas, and the Asian Citrus Psyllid, ACP

Multi-Model Framework

To evaluate the outcomes of several modeling methods we used the Multi-Model Framework (MMF) developed by the Ecological Informatics Group in the Bio-Protection Research Centre, Lincoln University (Senay et al. 2013; Worner et al. 2010). Nine presence-pseudo-absence species distribution models that estimate probability of presence based on climatic suitability were used within the MMF: logistic regression (LOG), linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), classification and regression trees (CART), naïve Bayes (NB), conditional trees (CTREE), K-nearest neighbor (KNN), artificial neural networks (NNET) and support vector machines (SVM).

Because true absence points are usually unavailable in ecological studies, the alternative is to generate pseudo-absence points. MMF generates a set of pseudo-absence points by one-class support vector machines (OCSVMs). Instead of selecting a single best-performing OCSVM, an ensemble of 100 models fitted to different samples of the data was selected based on the lowest prediction errors in thousands of runs. Because this procedure resulted in many possible absence locations, the generated absence locations with similar environmental characteristics were clustered by K-means clustering, resulting in K clusters equal to the number of presence locations. For each cluster, the geographical location with environmental conditions closest to the cluster centroid was selected as the absence point for that cluster. The generated absence locations had a zero-probability of environmental suitability in all 100 models.
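The final selection step described above, picking for each cluster the candidate location whose environment is closest to the cluster centroid, can be sketched as follows. The cluster labels are assumed to come from a prior K-means run, and the data structures, function names and toy values are hypothetical; this is not the MMF code.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return tuple(sum(v[i] for v in vectors) / n for i in range(len(vectors[0])))


def representatives(candidates, labels):
    """Return, per cluster label, the site id nearest to the cluster centroid.

    `candidates` is a list of (site_id, environment_vector) pairs and
    `labels` gives the cluster index assigned to each candidate.
    """
    clusters = {}
    for (site_id, env), label in zip(candidates, labels):
        clusters.setdefault(label, []).append((site_id, env))
    chosen = {}
    for label, members in clusters.items():
        centre = centroid([env for _, env in members])
        nearest = min(members, key=lambda m: math.dist(m[1], centre))
        chosen[label] = nearest[0]
    return chosen


# Toy candidate absence sites with two environmental variables each.
sites = [("a", (0.0, 0.0)), ("b", (2.0, 0.0)), ("c", (1.0, 0.1)),
         ("d", (10.0, 10.0)), ("e", (12.0, 10.0)), ("f", (11.0, 11.0))]
print(representatives(sites, [0, 0, 0, 1, 1, 1]))
```

In practice the environmental variables would normally be standardized before computing distances, so that no single variable dominates; that step is omitted here.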

The same environmental variables that were selected by the random forest method mentioned above were used in MMF. The 10 and 11 selected variables for ACP and HLB, respectively, were used in each of the nine models in MMF. Model validation was carried out by bootstrapping (re-sampling data) and 10-fold cross-validation (Senay et al. 2013). Finally, the MMF models were evaluated using ten different performance criteria, which increased the ability to assess the robustness of the models. These performance criteria were: accuracy, AUC, F-score, Kappa, precision, recall (sensitivity), specificity, true skill statistic (TSS), uncertainty and a 0.632 error (Table S3). The models were ranked based on all 10 performance criteria and the model with the highest rank in both validation methods (bootstrapping and cross-validation) was selected as the best model. Model predictions were plotted globally. The output maps were generated in ASCII format and exported to ArcMap 10.1.
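Several of these criteria follow directly from the four cells of a confusion matrix. The sketch below uses the standard textbook definitions; the function name and the toy counts are assumptions, and AUC, uncertainty and the 0.632 error are not covered because they cannot be derived from a single confusion matrix.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard performance measures from confusion-matrix counts."""
    n = tp + fp + tn + fn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)              # sensitivity
    specificity = tn / (tn + fp)
    f_score = 2 * precision * recall / (precision + recall)
    tss = recall + specificity - 1       # true skill statistic
    # Agreement expected by chance, from the marginal totals (Cohen's kappa).
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy, "precision": precision, "recall": recall,
        "specificity": specificity, "f_score": f_score,
        "tss": tss, "kappa": kappa,
    }


# Toy counts only; they are not results from the study.
for name, value in binary_metrics(tp=40, fp=10, tn=40, fn=10).items():
    print(f"{name}: {value:.2f}")
```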

Principal component analysis

A principal component analysis (PCA) was performed to visualize the degree of climatic similarity of global presence locations to Florida and California climate data. The same 19 bioclimatic variables for global presence locations as initially used for the modeling were extracted in ArcGIS Desktop 10.1. The data were then transferred to R version 3.2.1 and analyzed using the “princomp” function. The results of the PCA were transferred back into ArcGIS and the actual presence points were marked with a different color to distinguish them from the Florida and California locations.

Model consensus

To identify hot spots where HLB and ACP might establish, the global probabilities of HLB and ACP establishment derived from MaxEnt and MMF were converted to binary data in ArcGIS 10.1, using the 10th percentile training presence threshold in MaxEnt (Jarnevich and Reynolds 2011) and a threshold of 0.5 in MMF (Gallien et al. 2012). Maps of the binary data of the four data sets (ACP-MaxEnt, ACP-MMF, HLB-MaxEnt, and HLB-MMF) were overlaid using ‘equal to frequency’ in ‘spatial analysis tools’ to identify the areas where one, both or neither of the models agreed on the habitat suitability for both species.
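The thresholding and overlay logic can be sketched on small grids. This is a plain reimplementation for illustration and does not call ArcGIS: the nearest-rank percentile rule, the function names and the toy grids are assumptions, and MaxEnt's own threshold calculation may differ in detail.

```python
def percentile_threshold(training_scores, percentile=10):
    """Nearest-rank percentile of the scores at training presence points."""
    ordered = sorted(training_scores)
    rank = max(1, (percentile * len(ordered) + 99) // 100)  # integer ceiling
    return ordered[rank - 1]


def to_binary(grid, threshold):
    """1 where the predicted probability reaches the threshold, else 0."""
    return [[1 if value >= threshold else 0 for value in row] for row in grid]


def consensus(acp_maxent, hlb_maxent, acp_mmf, hlb_mmf):
    """Per cell, count the models (0, 1 or 2) that predict both species."""
    result = []
    for rows in zip(acp_maxent, hlb_maxent, acp_mmf, hlb_mmf):
        result.append([
            (a_mx & h_mx) + (a_mmf & h_mmf)
            for a_mx, h_mx, a_mmf, h_mmf in zip(*rows)
        ])
    return result


# Toy 2 x 2 grids; the values are invented for illustration.
hlb_mmf = to_binary([[0.9, 0.7], [0.6, 0.1]], threshold=0.5)
acp_mmf = [[1, 1], [1, 0]]
acp_maxent = [[1, 0], [1, 0]]
hlb_maxent = [[1, 0], [0, 0]]
hotspots = consensus(acp_maxent, hlb_maxent, acp_mmf, hlb_mmf)
print(hotspots)
```

Cells with a value of 2 are those where both modeling approaches predict suitability for the pathogen and the vector, cells with 1 are supported by one approach only, and cells with 0 by neither.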

Results

HLB model performance, effect of environmental variables, and potential distribution

Model performance

The MaxEnt model fitted the current distribution of HLB reasonably well. The AUC value was 0.69, which indicates an acceptable model (Peterson 2011). In MMF, ten different performance criteria were calculated for each of the nine models. The model with the highest performance for a particular criterion was given a score of 1 and the model with the lowest performance was given a score of 9 for that criterion (Table S3). The overall performance was calculated as the sum of the scores for all criteria. Among the nine models compared in the MMF, the Support Vector Machine (SVM) was ranked the best model based on all performance criteria. For example, the AUC of the SVM model was highest (0.969) according to both validation methods (bootstrap and cross-validation) followed by the NNET (Neural Network) and KNN (K-nearest Neighbor) models.
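The rank-sum scoring described above can be sketched as follows: each model receives its rank position for every criterion (1 for the best), and the positions are summed, so the lowest total wins. The function name, the model labels and the toy values are hypothetical, and ties are not given any special treatment, which is an assumption because the text does not say how ties were handled.

```python
def rank_models(scores, higher_is_better):
    """Sum per-criterion rank positions; the lowest total is the best model.

    `scores` maps criterion -> {model: value}; `higher_is_better` maps
    criterion -> bool (False for error-type criteria).
    """
    totals = {}
    for criterion, by_model in scores.items():
        ordered = sorted(by_model, key=by_model.get,
                         reverse=higher_is_better[criterion])
        for position, model in enumerate(ordered, start=1):
            totals[model] = totals.get(model, 0) + position
    return sorted(totals.items(), key=lambda item: item[1])


# Invented values for three placeholder models and two criteria.
toy_scores = {
    "AUC": {"model_a": 0.97, "model_b": 0.95, "model_c": 0.90},
    "error": {"model_a": 0.05, "model_b": 0.10, "model_c": 0.08},
}
direction = {"AUC": True, "error": False}
print(rank_models(toy_scores, direction))
```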

Environmental variables

Eleven variables were selected as informative variables for the HLB model by the random forest method. The MaxEnt model calculated the percentage contribution of each of these variables to the final model prediction (Table 1). Variable bio12 (annual precipitation) had the highest contribution to the prediction of MaxEnt, followed by bio06 (minimum temperature of the coldest month) and bio13 (precipitation of the wettest month) (Table 1). According to the jackknife test, the variable that had the highest training gain when used alone was bio09 (mean temperature of the driest quarter). The jackknife charts also showed that the model gain decreased when variable bio08 (mean temperature of the wettest quarter) was omitted, implying that this variable had the most information that was not present in other variables (Fig. S1). According to the bio13 response curve, the probability of HLB occurrence would be very low when precipitation in the wettest month is less than 150 mm/month (Fig. 2a), a relatively high threshold. At higher precipitation, the HLB presence probability would continue to increase (with some variability). Similarly, the response curve for bio12 (annual precipitation) indicated that at least 1200 mm per year would be conducive for the development of HLB caused by CLas (Fig. S2). The response curve of bio09 (mean temperature of the driest quarter) indicated that CLas could survive in locations where the mean temperature of the dry season was between 15 and 35 °C but that the probability of CLas occurrence decreased rapidly above 20 °C (Fig. 2b). The response curve of the mean temperature of the wettest quarter (bio08) showed that CLas can tolerate a wider range of temperatures (15–35 °C) in the wet season (Fig. 2c).

Fig. 2

Response curves of the variables contributing most to the predictions by the MaxEnt models for HLB (a–c) and ACP (d–f): precipitation in the wettest month (a), mean temperature of driest quarter (b), mean temperature of wettest quarter (c), precipitation in the wettest month (d), minimum temperature of coldest month (e), and temperature annual range (f). X-axis values are in mm/month (a, d) and in °C multiplied by 10 as provided by the Worldclim website (b, c, e, f)

Potential HLB (CLas) distribution

The SVM model generally predicted higher suitability, and globally larger suitable areas compared to the MaxEnt model for HLB establishment. The USA data were not used initially in training or calibration of the models and therefore provided independent data for validation. Both the MaxEnt and SVM models predicted that Florida, the southern parts of Louisiana, Georgia, Texas and South Carolina had a highly suitable climate for CLas establishment (Fig. 3a). In addition, SVM predicted that parts of North Carolina and Arizona would be suitable for CLas establishment, although no commercial citrus is grown there. The probability of CLas establishment in California was predicted to be lower than that in the Gulf Coast states according to both models (Fig. 3a, b). In areas around San Diego and Los Angeles the probability of HLB occurrence was higher than at other locations in that state (Fig. 3a, b and Fig. S3a, b). When USA data, including findings in California, were added to the global data set, the predictions for CLas establishment did not change (data not shown). The models also predicted a high chance of CLas establishment in coastal areas of northern and eastern Australia (Fig. 3a). Both models predicted that central and eastern parts of Africa would also be environmentally suitable for the Asian form of HLB (Fig. 3a, b). According to MaxEnt, the probability of long-term establishment of CLas was lower in East than in Central and West Africa (Fig. 3a), while citrus growing areas in eastern Africa would be highly suitable according to the SVM model (Fig. 3b). In addition, Mediterranean countries in North Africa and southern Europe (especially southern Portugal and to some extent Spain) were predicted as highly suitable (Fig. 3b).
Overall, the SVM model predicted higher suitability in most areas compared to MaxEnt and may over-predict potential HLB (CLas) occurrence in some areas such as inland areas of West Africa and central South America where it may be too dry for ACP survival (Fig. 3b).

Fig. 3

Global potential distribution of citrus huanglongbing, HLB, caused by CLas by MaxEnt (a) and the Support Vector Machine, SVM (b) models

ACP model performance, effect of environmental variables, and potential distribution

Model performance

Compared to the MaxEnt model for HLB, the model for ACP had a slightly better AUC (0.72). Among the nine models tested in the MMF, SVM achieved the highest rank. For example, the AUC of the SVM model for ACP was 0.968, which suggests very good model performance (Table S3).

Environmental variables

Ten variables were selected as informative variables by the random forest method in MMF (Table 1). For ACP, bio06 (minimum temperature of the coldest month), bio13 (precipitation of the wettest month) and bio11 (mean temperature of the coldest quarter) contributed the most to the final model (Table 1). The jackknife test for the training gain also indicated that bio06 (minimum temperature of the coldest month) was the variable with the most important information by itself (Fig. S4). The response curve of bio13 (precipitation in the wettest month) was similar to that for HLB. The probability of ACP occurrence increased dramatically above 150 mm/month, and continued to increase at higher precipitation levels (Fig. 2d). The response curve of variable bio06 showed that the probability of ACP occurrence started to increase when the minimum temperature of the coldest month exceeded 4 °C and reached a maximum at 17–18 °C (Fig. 2e). The jackknife test showed that the model gain decreased noticeably when variable bio07 (temperature annual range) was omitted from the model (Fig. S3). The response curve of bio07 (temperature annual range) indicated that when the temperature annual range exceeded 25 °C, the probability of ACP occurrence started to decline substantially (Fig. 2f).

Potential ACP distribution

Similar to HLB, the areas predicted to be suitable for ACP were more extensive using the SVM model than using MaxEnt (Fig. 4a, b). Although the USA data were not used in model training and testing, MaxEnt predicted Florida as highly suitable for ACP establishment with lower probabilities in southern parts of Louisiana, Georgia, Texas, and South Carolina. Besides these states, SVM predicted that North Carolina would also be climatically suitable, although citrus is not grown commercially there. Similar to the prediction for HLB, both MaxEnt and SVM predicted that a relatively small area in southern California, especially along the coast, was highly suitable for ACP establishment, while the Central Valley was predicted to be moderately suitable, when the USA data were omitted for calibration (Fig. 4a, b and Fig. S5). The predicted areas were therefore mostly in accordance with the distribution of ACP in the USA. When the USA data were included, both models predicted more suitable areas in California, including more inland areas like the Central Valley (Figs. S6 and S7) although the ACP numbers per unit area were still very low in California compared to Florida.

Fig. 4

Global potential distribution of the Asian Citrus Psyllid, ACP, by MaxEnt (a) and the Support Vector Machine, SVM (b) models

Both models predicted that northern and eastern Australia would be highly suitable. Coastal areas in southern and western Australia were predicted as highly suitable by SVM only. The northern part of the North Island of New Zealand was predicted to be suitable by both models, whereas SVM predicted that the southern regions of the North Island would also be highly suitable. Although MaxEnt predicted a very low probability of ACP establishment in the south of Korea, where this psyllid does not occur, this area was modeled as unsuitable by SVM. Both models also predicted that the lowlands in central and eastern Africa would be highly suitable for ACP (Fig. 4). In central and west Africa, very large areas were predicted to be highly suitable for the psyllid. In addition, coastal areas in northern Africa were estimated to be highly suitable. In Europe, Italy, Portugal and part of Spain had a high probability of ACP establishment based on SVM predictions.

Validation of Florida and California predictions by PCA analysis

The results of the PCA for CLas and ACP (Fig. 5a, b) showed that for both species, the Florida climate data points were environmentally very similar to those of the worldwide presence points (excluding the USA), while the California climate data points were distinct from the Florida points and most of the global presence points (again excluding the USA). However, the climate data of a relatively small number of locations in California were similar to those of some of the global presence points.

Fig. 5

Principal component analysis showing climate similarity of the presence points of huanglongbing (HLB) caused by CLas and the Asian Citrus Psyllid (ACP) to Florida (yellow) and California (green) climate data obtained from the Worldclim website (www.worldclim.org)

Consensus model for HLB and ACP

The consensus model categorized the areas where both models agreed on the suitability of the climate for both CLas and ACP (hot spots), contrasted with areas where only one model predicted suitable climate for both species and areas where there was no agreement (Fig. 6). The areas where both models predicted high probability of both species occurrence were in agreement with observed occurrences of HLB. In addition, CLas and ACP establishment were predicted for northern and eastern Australia and central and southeastern Africa, where these species have not been detected.

Fig. 6

Consensus model showing the hot spot areas where one or two models (MaxEnt and SVM) agree on the probability of both citrus huanglongbing (HLB) caused by CLas and the Asian Citrus Psyllid (ACP) occurrence

Discussion

Model comparison

Prediction models can be useful for risk assessment of bio-security threats (da Graca and Korsten 2004). The current study is the first to address the global potential risk of HLB and ACP establishment in citrus plantings throughout the world using two different correlative modeling approaches. As expected, both models provided a good fit to the current distribution of HLB and ACP worldwide. Additionally, both models predicted well the current occurrences of the target species in the USA that had been left out of the training data set. When the USA data were included, the predictions for ACP from both models changed for California and the Mediterranean area (Figs. S6 and S7), but the CLas predictions remained essentially the same.

Although both models showed a similar pattern regarding climatic suitability of the two modelled species at the global scale, MaxEnt predicted a narrower distribution than the SVM model selected from the MMF. This is in agreement with the expectation that presence-only models have a lower ability to predict invasion in a new area compared to presence-pseudo-absence models (Townsend Peterson et al. 2007).

Global prediction of HLB and ACP

Both models predicted suitable areas in regions where HLB and/or ACP have not been reported so far. For example, both predicted that Northeast Australia could be highly suitable for HLB and ACP establishment. This result is consistent with the output of a previous simulation model (Aurambout et al. 2009). ACP was reported from Australia in 1922, but it was quickly eradicated and has not been reported since then (Bellis et al. 2005). Considering that ACP and HLB have been reported in most parts of the Asia-Pacific region, the current absence of these species in Australia may indicate the important role that effective bio-security measures can play in prevention and control of pest species.

The models also predicted that vast areas in Africa were highly suitable climatologically for CLas and ACP establishment. Only the African form of HLB has been reported in Africa so far, except for a recent report of the Asian form of HLB (caused by CLas) from Tigray in Ethiopia (Bove 2014). According to both models, the chance of CLas and ACP establishment is high in East and West Africa where citrus trees are grown. Thus, in the absence of effective preventive measures, CLas may establish in Africa if it is not outcompeted by CLaf.

For the USA, both models predicted Florida as highly suitable for HLB (CLas) and ACP establishment even though the USA data were not used in model training. Both models also predicted suitability for ACP and CLas in areas of Texas where ACP and CLas have been reported recently (da Graca et al. 2008). Similar to the predictions by Gutierrez and Ponti (2013), areas surrounding the Gulf of Mexico were predicted to be quite conducive. Both of our original models (without USA occurrence data) predicted a high probability of ACP and CLas establishment in the coastal areas of California, especially around Los Angeles, but a low probability in the Central Valley, whereas Gutierrez and Ponti (2013) predicted a moderate probability for ACP and HLB establishment throughout California. Consistent with our predictions, there were several reports of HLB (CLas) from Los Angeles in 2012 and 2015, which were followed by immediate eradication of the infected trees (Kumagai et al. 2014). ACP has been reported from many counties in California, including the Central Valley, since 2008 (Kumagai et al. 2014). When the USA occurrence data were included in the global data set, the positive ACP predictions for California extended to the Central Valley, as could be expected from correlative models (Fig. S7).

Our original results from the global data without the USA data were supported by the PCA, which showed that the California climatic conditions were mostly distinct from those of current ACP-infested areas worldwide, except for some points around Los Angeles and other locations close to the coast. Although ACP now occurs in 16 counties in California (http://www.cdfa.ca.gov/plant/pe/InteriorExclusion/acp_quarantine.html#maps), the populations of psyllids are very low there compared to Florida, except in Los Angeles County (Grafton-Cardwell et al. 2014). The low ACP densities and current absence of HLB in most of California suggest that the climate may be only moderately conducive for ACP and CLas, in addition to the possible effectiveness of control measures. HLB control measures (Martinez-Corrillo et al. 2015) were apparently effective in northern Mexico (Sonora) where there is no official report of HLB, even though ACP has been present since 2006. According to our model predictions, HLB (CLas) could establish in this region, as well as in the coastal areas of Western and Northeastern Mexico. These same areas were also predicted as suitable for ACP establishment by Torres-Pacheco et al. (2013), who developed indices for climate suitability for citrus growth and ACP development based on temperature and rainfall data in Mexico. Another important factor may be the prevention of infected plant material being transported into this area (Halbert et al. 2010). However, HLB symptoms could also have been overlooked (http://www.usda.gov/wps/portal/usda/usdahome?contentidonly=true&contentid=citrus-quarantine-maps.xml).

Unlike the positive predictions for certain areas in Australia, Africa and the USA, the southern parts of South Korea were predicted to be environmentally unsuitable for both CLas and ACP establishment, possibly due to the relatively low temperatures and the isolated position of the island where citrus is mainly produced. This is in agreement with the current absence of these species in South Korea, despite the presence of HLB and ACP in neighboring countries such as Japan and China.

The consensus model showed that central and southeastern Africa, north and eastern Australia, southern China, India, most parts of South America, Florida, Georgia, North and South Carolina, Texas, Louisiana, and small coastal areas in California were potential hot spots of CLas and ACP (Fig. 6 and Fig. S8). This last finding is in agreement with a recent increase of ACP densities in the Los Angeles area (Pers. Comm. with Elizabeth Grafton-Cardwell). Model consensus, especially when it includes the vector habitat suitability, can increase awareness of the areas which should be more intensively monitored.

Environmental variables

The variables that most clearly distinguish climates that are suitable for HLB (CLas) and ACP establishment from those that are not are annual precipitation and precipitation in the wettest month. Total annual rainfall needs to be above 700 mm for ACP establishment and above 1200 mm for HLB establishment (Fig. S2). This is a relatively high rainfall requirement which may not be met in semi-arid environments like the Southern Central Valley of California with an Interior Mediterranean climate, characterized by hot and dry summers that could potentially reduce the chance of ACP establishment (Razi et al. 2014). This area has an annual precipitation of 125–150 mm (http://frap.cdf.ca.gov/webdata/maps/statewide/rainmap.pdf/), and CLas has not yet been found there. Similarly, CLas has not been detected in central Southern California, although ACP has been found there repeatedly, possibly because irrigation may compensate for low rainfall. However, the models do not account for the potential effect of irrigation in areas like the Central and Imperial Valleys of California, because the global distribution of the amount of irrigation water is not known. Irrigation may alleviate dry conditions, increase the relative humidity and number of flush events, and allow ACP and HLB establishment in some semi-arid climates, where the models indicate low probability of ACP and HLB presence. Consequently, the model results for such areas should be interpreted carefully because it is difficult to determine to what extent irrigation may compensate for low precipitation. This highlights a critical knowledge gap and the need for studies to evaluate how ACP and HLB behave in relatively dry irrigated areas.
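As a rough illustration of the rainfall thresholds quoted above, the following Python sketch screens a site by annual precipitation alone. It is a simplification under stated assumptions: the function and constant names are hypothetical, only the 700 mm and 1200 mm values come from the text, and the screen ignores irrigation and all other climatic variables, so it does not reproduce the MaxEnt or SVM predictions.

```python
ACP_MIN_ANNUAL_RAIN_MM = 700    # threshold given in the text for ACP
HLB_MIN_ANNUAL_RAIN_MM = 1200   # threshold given in the text for HLB (CLas)


def rainfall_screen(annual_rain_mm):
    """Return whether annual rainfall alone exceeds each threshold (hypothetical helper)."""
    return {
        "ACP": annual_rain_mm > ACP_MIN_ANNUAL_RAIN_MM,
        "HLB": annual_rain_mm > HLB_MIN_ANNUAL_RAIN_MM,
    }


# Upper end of the 125-150 mm range cited for the Southern Central Valley.
southern_central_valley = rainfall_screen(150)
```

On rainfall alone, a site with 150 mm per year fails both thresholds, which is why the possible compensating effect of irrigation matters for interpreting predictions in such areas.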

When the models were tested with all presence data including those in California, only the suitability for ACP was extended into the Central Valley (Fig. S7). To date, the density of ACP is very low in California (mostly less than 20 psyllids per km², with slightly higher densities in the Los Angeles area). Thus, it is not clear if the Interior Mediterranean climate is conducive for ACP establishment or if the low ACP populations are mainly related to human-assisted activities.

Above the monthly and annual rainfall thresholds, higher rainfall continues to promote ACP and HLB development, despite the potential wash-off of psyllid nymphs from flush; there appears to be no upper limit (Fig. 2a, d; Fig. S2). Rainfall is often preceded by high wind speeds, which may promote high-altitude and long-distance ACP dispersal, followed by settling out in subsequent rain. However, the positive relationship with rain may be primarily due to indirect effects via the host, including the appearance of new flush (Chiyaka et al. 2012). In humid and warm climates, citrus trees produce flush throughout the year (Torres-Pacheco et al. 2013). The number of days suitable for new flush development as determined by temperature and rainfall has proved to be an important index determining the ACP distribution in Mexico (Liu and Tsai 2000; Torres-Pacheco et al. 2013).

In addition to rainfall, the minimum temperature in the coldest month as well as the annual temperature range contributed to climate suitability for ACP (Fig. 2d, e). Large annual temperature ranges are encountered in (semi-)arid climates and high altitude areas. A minimum temperature below about 4 °C in winter and an annual average temperature range larger than 27 °C are not conducive for ACP survival, especially in dry seasons. Winter temperatures above 8 °C are more or less uniformly conducive for ACP development. In the winter (in climates with distinct seasons), there is generally no flush and temperatures may be limiting for adult ACP survival. In flushing periods, temperatures are mostly moderate, so that egg production may not be limited; temperatures during these periods were not selected by MaxEnt as factors determining ACP presence. In the warmest season, however, a mean temperature of 33 °C limits ACP presence (data not shown), indicating that eggs and young nymphs are sensitive to high temperatures and desiccation (Liu and Tsai 2000). Earlier reports mention lower and upper thresholds for ACP development of 10 and 33 °C, which may depend on the location where ACP was collected (Liu and Tsai 2000; Torres-Pacheco et al. 2013). At the high end of this temperature range, oviposition is stimulated and the generation time is reduced compared to the lower end (Aurambout et al. 2009; Liu and Tsai 2000).
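The temperature limits described in this paragraph can be collected into a simple checklist, sketched here in Python. The function name, argument names and returned labels are hypothetical. The cut-offs (4 °C, 27 °C and 33 °C) are the approximate values quoted above, and treating them as sharp boundaries is an assumption made for illustration, since the fitted response curves are continuous.

```python
def acp_temperature_limits(min_temp_coldest_month_c,
                           annual_temp_range_c,
                           mean_temp_warmest_season_c):
    """List which temperature conditions may limit ACP at a site (hypothetical helper)."""
    limits = []
    if min_temp_coldest_month_c < 4:        # winter minimum below about 4 °C
        limits.append("cold winter minimum")
    if annual_temp_range_c > 27:            # annual temperature range above 27 °C
        limits.append("large annual temperature range")
    if mean_temp_warmest_season_c >= 33:    # warmest-season mean of 33 °C or more
        limits.append("hot warmest season")
    return limits


# Made-up example sites, not data from the study.
mild_site = acp_temperature_limits(10, 15, 28)
harsh_site = acp_temperature_limits(2, 30, 34)
```

An empty list means that none of the three temperature conditions is flagged; it says nothing about rainfall or host availability, which the text identifies as equally important.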

Optimal and limiting temperature conditions for HLB symptoms depend on rainfall. In the driest quarter, the probability of HLB (CLas) presence declines above 20 °C (but not to zero), while in the wettest quarter, the probability of CLas establishment increases sharply above 15 °C up to 25 °C and declines only slightly up to almost 40 °C. The sensitivity to high temperatures in the driest quarter indicates that the combination of dryness and high temperatures forms a stress factor either on the pathogen itself or on its vector (Razi et al. 2014). It is well known that the impact of the Asian form of HLB is most severe in warm and wet climates (Bove 2014). The temperature sensitivity under dry conditions suggests potential for HLB control by heat treatment in the dry season (Hoffman et al. 2012).

Caveats

Any habitat model output should be interpreted with caution, and the inherent uncertainties should be carefully considered. Firstly, bioclimatic models assume that a species is in equilibrium with its environment (i.e., that it has fully filled its environmental niche), and they may fail to correctly project the distribution of a species that is on the move, invading a new area or, more particularly, experiencing a range shift in response to climate change (Vaclavik and Meentemeyer 2009). Secondly, the resolution of environmental data, the quality of occurrence records, and spatial autocorrelation should be taken into account (Franklin 2013). The combination of several models like MaxEnt and the MMF can give more confidence in the results, especially when they coincide so that a consensus model can provide realistic predictions (Narouei Khandan 2014; Narouei Khandan et al. 2013).

Thirdly, if a global map of citrus production became available, the generation and selection of pseudo-absence points could be different, the predictions of HLB and ACP could be overlaid with the presence of citrus trees, and the predictions could be improved. Nevertheless, the current predictions may be useful if citrus were to be planted in the future. Finally, models may predict potential suitability for an area where the pathogen is not (yet) present, for example as a result of quarantine measures or the absence of large citrus production areas. Despite these caveats, predictions using the best possible models can be useful and are better than no science-based predictions at all.