Introduction

Classical biological control, i.e. the use of specialist natural enemies from the weed’s native range to reduce its abundance or spread in the introduced range, is an effective and cost-efficient approach to control invasive alien plant species (Culliney 2005; Page and Lacey 2006; Winston et al. 2014). A biological control program is a long-term action which includes four main steps: surveys and identification of natural enemies of the target weed, assessment of the host range (i.e. specificity tests) and impact of candidate biological control agents, petitioning for field releases of biological control agents, followed by an independent decision made by regulatory bodies on the merits of the petition, and then the release of the approved agent(s) in the new environment followed by monitoring of its population build-up and impact (Sheppard et al. 2005; Suckling and Sforza 2014).

Assessing and predicting the likelihood and consequences of non-target effects by a candidate biological control agent is one of the fundamental challenges of pre-release studies (Müller-Schärer and Schaffner 2008). Host specificity studies have a long history in weed biological control, starting with the first scientific attempts at the beginning of the 20th century (van Wilgen et al. 2013) and advancing with seminal papers on selecting test plant species and applying different test designs in the 1970s (e.g., Zwölfer and Harris 1971; Wapshere 1974). Pre-release host range testing often includes an assessment of the fundamental host range which is usually determined by testing all test plant species under no-choice conditions. However, no-choice tests often reveal a fundamental host range that tends to be larger than that known to be used in the field in the native range (e.g. Wapshere 1989; Marohasy 1998; Schaffner 2001; Hinz et al. 2014). Several extrinsic and intrinsic factors may explain why a plant species that supports development of a candidate agent under no-choice conditions is not attacked in the field, including life-history traits or the host selection behaviour of the agent (e.g., visual or olfactory host location and identification cues that are bypassed under confined conditions), the availability of preferred host plants (i.e. the target weed or weeds), the ability of the agent to disperse and forage for preferred host plants, and a low probability for the agent to encounter non-target plant species (Marohasy 1998; Hinz et al. 2014).

Differences in results from experiments conducted in confinement from those under field conditions depend on the specific traits of the biological control agent and the plant species tested, as well as on the artificial conditions used (Briese 1999; Paynter et al. 2004). Marohasy (1998) proposed the term “false positive” for those cases where attack occurs on a test-plant species, which would not be attacked in the field, and “false negative” when a plant species is not attacked in the test, but it might be attacked in the field. Thus, the challenge is to design experiments that can accurately predict risk to non-target plants in the region of introduction, minimizing the influence of both false positive and false negative experimental results.

Open-field host range studies provide the opportunity to study host selection when a biological control candidate can display the whole array of pre- and post-alightment behaviours, as well as dispersal (Briese et al. 2002). According to Briese et al. (2002), open-field host range tests are particularly suitable for avoiding “false positive” results, which are potentially expensive because they may result in the rejection of an agent that would be safe for release (Hinz et al. 2014). To our knowledge, the first open-field host range study was published by Andres and Angalet (1963). Open-field studies conducted through 1998 were reviewed by Clement and Cristofaro (1995) and Briese (1999), and both reviews concluded that this method has proved its worth in numerous cases where no-choice tests appeared to overestimate risk to some non-target species. Briese (1999) concluded that “open-field tests should continue to form an important adjunct to the more traditional laboratory-based host-testing and, where anomalies exist, continue to reduce the chances of missing effective agents without compromising safety”.

Here we review how open-field host range studies have been applied in the course of their 53-year history (1963–end of 2015) to assess the host specificity of invertebrates in weed biological control programs. Building on Clement and Cristofaro’s (1995) and Briese’s (1999) reviews, we assessed how the frequency of open-field host range studies, their main purpose and their test design have changed over the last decades. We then discuss the advantages and challenges of testing host specificity under open-field conditions and suggest ways to advance this method to improve risk assessments in weed biological control programs.

Methods

We searched for relevant papers in the ISI Web of Science in February 2014 and March 2017. We used the following search term combination: biological control AND weed AND (host range OR host specificity) AND (field OR natural) AND (herbivore OR insect OR arthropod) for the period 1999–end of 2015. We screened the reference lists of all retrieved papers for other relevant publications. We then compared the main purpose and the experimental design of the studies published between 1999 and 2015 with those published earlier and reviewed by Clement and Cristofaro (1995) and Briese (1999). To classify the studies according to their main purpose and experimental design, we followed, with modifications, the approach of Clement and Cristofaro (1995) and Briese (1999). Regarding the main purpose, we classified studies as follows: (1) general screening, i.e. studies that were done as part of a general screening of the host range; usually no prior assumptions were made about the host range of the biological control candidate; (2) clarification, i.e. studies that were conducted to clarify ambiguous or conflicting results obtained in other tests (e.g. no-choice tests and multiple-choice cage experiments); and (3) refinement, i.e. studies that were conducted at a time when the host range was generally known, e.g. to address issues brought up during the petitioning process or after field release of the agent into the region where biological control should be implemented. Hence, the first two categories include studies done in the native range, while the latter category includes studies conducted either in the native or in the invaded range.

The study designs were separated into three categories (Fig. 1): (1) interspersion design, where target and test plant species were randomly interspersed, usually within or next to natural populations of the target species and of one or several biological control candidates; this design is characterized by a relatively small number of test plant species exposed to biological control candidates in a relatively large number of naturally growing individuals of the target species; (2) reverse interspersion design, which either makes use of existing or artificially creates high densities of one or a few test plant species and introduces a small number of the target plants; naturally occurring target plants are deliberately removed in order to keep the ratio of test plant species to target species high; usually, the biological control candidate(s) is/are also deliberately released at the experimental site; and (3) set design, which is characterized by an arrangement of test and target plants according to an experimental design (e.g., randomized block, Latin square); the ratio of test to target plants tends to be relatively even but the target species may be removed between or during test runs, such as in the case of the two-phase set design proposed by Briese (1999).
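As an illustration of a set design, the following minimal Python sketch generates a randomized Latin square layout in which every plant species occurs exactly once in each row and each column. The function name, the species labels and the seed are hypothetical and are not taken from any of the reviewed studies; the sketch assumes a square plot with as many rows and columns as plant species.

```python
import random


def latin_square_layout(species, seed=None):
    """Return an n x n grid (list of rows) forming a randomized Latin square.

    Assumption: one plant per grid cell and n = number of species.
    A cyclic square is built first; rows, columns and species labels
    are then shuffled, which preserves the Latin square property.
    """
    rng = random.Random(seed)
    n = len(species)
    labels = list(species)
    rng.shuffle(labels)            # random assignment of species to symbols
    rows = list(range(n))
    cols = list(range(n))
    rng.shuffle(rows)              # random row order
    rng.shuffle(cols)              # random column order
    return [[labels[(r + c) % n] for c in cols] for r in rows]


if __name__ == "__main__":
    # Hypothetical labels: one target weed and three non-target test species.
    plants = ["target", "test A", "test B", "test C"]
    for row in latin_square_layout(plants, seed=1):
        print(" | ".join(f"{p:7s}" for p in row))
```

Shuffling rows, columns and labels only randomizes among squares derived from one cyclic square; it does not sample uniformly from all possible Latin squares, which is usually sufficient for laying out a field plot.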

Fig. 1 Experimental designs used in open-field host range studies. The varying fills/shadings in the circles represent different non-target test plant species

For both the main purpose and the test design (three categories each), Pearson’s χ2 tests were conducted to test whether the distribution of studies among categories differed among the three publications (Clement and Cristofaro 1995; Briese 1999; this review), and thus to assess whether the use of this method has shifted over the course of its 53-year history. The analyses were conducted with IBM SPSS Statistics 25.0.

Results

We found a total of 36 open-field host range studies in 32 papers that were published in ISI journals between 1999 and the end of 2015 (Supplementary material, table S1). Compared to the studies reviewed by Clement and Cristofaro (1995) and Briese (1999), the frequency of open-field host range studies published since 1999 has remained comparable to that in the 1990s (Fig. 2).

Fig. 2 Number of open-field host range studies published over the 53-year history (in 5-year intervals, until the end of 2015)

Over the course of the last 53 years, the purpose of open-field host range studies has changed considerably (Table 1; three purposes × three publications; Pearson’s χ2 = 15.76, df = 4, P = 0.003). General screening decreased from 46% for studies before 1995 to 11% after 1999, whereas most of the studies reviewed by Briese (1999), and in this paper, were intended to clarify results obtained in confinement. Open-field studies published since 1999 predominantly used set designs (69%), but there was no significant shift in the choice of test designs over the three time periods (Table 1; three test designs × three publications; Pearson’s χ2 = 5.888, df = 4, P = 0.208).
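The P values for the two Pearson’s χ2 statistics above can be cross-checked without statistical software. The following Python sketch is an illustration only and is not the SPSS procedure used for the analyses: it computes the Pearson statistic for any contingency table of counts and the upper-tail probability of the χ2 distribution, using the closed form that holds when the degrees of freedom are even (as with df = 4 for a 3 × 3 table). The function names are hypothetical, and the counts from Table 1 would have to be supplied by the reader.

```python
import math


def pearson_chi2(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table.

    `table` is a list of rows of observed counts; every row and column
    total must be greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def chi2_p_value_even_df(stat, df):
    """Upper-tail probability of a chi-square statistic, valid for even df only.

    For df = 2m: P = exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form used here requires a positive, even df")
    half = stat / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))


if __name__ == "__main__":
    # Statistics as reported in the text (df = 4 in both cases).
    print(f"{chi2_p_value_even_df(15.76, 4):.3f}")   # main purpose
    print(f"{chi2_p_value_even_df(5.888, 4):.3f}")   # test design
```

For odd degrees of freedom a general implementation, such as the χ2 distribution functions in a statistics library, would be needed instead of the closed form.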

Table 1 Number of papers with open-field studies included in the reviews by Clement and Cristofaro (1995) and Briese (1999), and in this paper (covering the literature from 1999 until the end of 2015)

During the 20th century, open-field host range studies generally used either interspersion (20/40 = 50%) or set designs (19/40 = 48%) (Table 1). Since the beginning of the 21st century, most studies have applied a set design (25/36 = 69%). The only two studies we found that used a reverse interspersion design, with relatively few target plants among many non-target plants, were those published by Andres and Angalet (1963) and Cristofaro et al. (2013) (Fig. 1). Gandolfo et al. (2007) created a plot with a large number of plants of one non-target species without the target species, and had another plot with the target species alone as a control, 40 km away.

Discussion

Opportunities

Open-field host range studies are suitable for addressing some of the key questions related to the prediction of host specificity of biological control candidates when encountering the target species among novel potential host plants in a novel environment. For example, one of the key questions classical biological control is confronted with is what will happen if the biological control agent successfully reduces or even eliminates the target weed locally. The two-phase test design originally proposed by Rizza et al. (1988) and Dunn and Campobasso (1993) and refined by Briese (1999) allows this question to be addressed. The first phase of the experiment examines the host specificity of biological control candidates in the presence of target and non-target species. During the second phase, the target plant is removed from the central plot, and the herbivores may leave the central plot to search for target plants, or stay and start colonizing non-target plants. Using a revised version of the two-phase test design, Briese et al. (2002) compared the host selection behaviour of four biological control candidates of the invasive weed Heliotropium amplexicaule Vahl (Boraginaceae). All four species established on the central plot during the first phase of the experiment. During the second phase, when the target plants were removed from the central plot, two candidate species dispersed rapidly from the plot, a third persisted for several days on a non-target species with some exploratory feeding and then could no longer be found, and the fourth rapidly colonized and fed on a non-target plant. Despite its extended persistence on the central plot, the third species was proposed for extensive host range testing (Briese et al. 2002).

The relationship between the abundance and spatial distribution of target and non-target plants and the mobility of biological control candidates is a central issue regarding acceptance of non-target plants (Schaffner 2001). By offering target and non-target plants at different spatial distances (Courtney et al. 1989), thereby increasing the interval during which a herbivore may perceive signals from either of these plant species, much can be learned about the herbivore’s host fidelity (Roitberg 2000). This approach has been applied in some recently published studies (Schooler et al. 2003; Catton et al. 2015), with the most extreme example reported by Gandolfo et al. (2007), who set up control plots with the target species some 40 km away from the plot with a non-target species, basically resulting in two separate no-choice open-field experiments. Gandolfo et al. (2007) found no noticeable feeding damage by the candidate biological control agent, Gratiana boliviana Spaeth (Coleoptera: Chrysomelidae), on the non-target species eggplant, Solanum melongena L. (Solanaceae), although G. boliviana had damaged eggplant in no-choice larval development tests, in which the beetle showed a strong decline in fitness. Subsequently, G. boliviana was approved for field release in the USA.

Challenges

Open-field tests to obtain information on host plant specificity or to measure the potential impact of candidate biological control agents are normally conducted in the region of origin of the agents before an agent has been approved for use. Clement and Cristofaro (1995) proposed that open-field experiments should be conducted at a protected or undisturbed site, where the target plant and biological control agent are naturally abundant, where there is infrastructure to support experimentation, and where alien non-target plants can be grown outdoors for testing. In addition, it is important to try to conduct such studies in a location where inter-specific competition for the same ecological niche is low and where the climate is similar to that of the region targeted for introduction, because phenological synchrony and abiotic conditions may affect risk to non-target species. However, this poses many logistical challenges. The target plant is often rare (because it is under natural control), which makes it difficult to find suitable sites for experimentation. Such rarity often limits the number of plants and agents that are available for study. Furthermore, the target plants may occur in remote or politically unstable regions, which makes it difficult to establish experimental plots and makes travelling to maintain the plants and make frequent observations expensive and time-consuming. Non-target plants of greatest interest are either cultivated species, which are generally not problematic, or native species that are closely related to the target weed. The latter are often difficult to obtain because they may be rare, or are not normally cultivated and difficult to grow because their biology is not well known; this may result in subpar non-target plant quality or plant numbers, which in turn can affect the design and outcomes of open-field tests. Furthermore, there are increasing restrictions regarding the movement of plants between countries (van Lenteren and Cock 2009; Cock et al. 2010), and in any case, it is important to prevent the escape of alien plants used in a field experiment to avoid the establishment of a possibly invasive species.

Although biological control projects usually try to match climates between the region of origin and the region targeted for control, there are cases in which interesting prospective agents occur in regions that are climatically different from that of the non-target plant species to be tested, and non-target plant species may not be well synchronized with the agent in its natural region. In such cases, it is unclear whether absence of attack is due to the unavailability of the critical host-plant stage or to unsuitability of the plant species. Sheppard et al. (2006) showed that in the case of the seed feeder Bruchidius villosus Fabricius (Coleoptera: Chrysomelidae), one critical non-target species was not attacked in an open-field experiment, while it appeared to be attacked to the same extent as the target species in a cage test where flowering was synchronised.

Clement and Cristofaro (1995) noted that prospective agents may naturally occur at such low densities that individuals may have to be collected from other sites and released at the site of the open-field host range test. In some cases, such insects have been observed to disperse immediately from the site, which may be a reaction to having been confined in a container during transport before release, especially with many other individuals of the same species. It may therefore be helpful to put low densities of insects with abundant host plant material in containers, and keep them cool to reduce insect activity until they are released. Furthermore, it may be better to release them individually or in pairs on target plants at the end of the day to minimise their tendency to disperse (Cristofaro et al. 2013). In any case, it is important to have ‘positive’ control plants (i.e. target plants) to show that the agents are capable of attacking these plants.

One of the principal reasons for using a set design is to analyse the results of the field experiments by statistical means. However, analysing the results of multiple-choice experiments is not easy because the data collected in such experiments are not independent. Since Clement and Cristofaro’s (1995) and Briese’s (1999) reviews, the availability of rigorous and biologically relevant analytical methods has considerably improved, e.g. for analysing the amount of food consumed (Prince et al. 2004), or estimating consumer movement in choice situations (Zeilinger et al. 2014). Yet, typical statistical analyses that test differences between means (e.g. for a damage index or infestation rate) are not necessarily relevant because regulatory agencies will be convinced only by results that show a risk close to zero. For example, in the USA, the federal regulatory agency must determine that there is “no significant impact” (finding of no significant impact (FONSI); Horner 2004; Hinz et al. 2014). Thus, the question becomes how confident we are that an observed attack or damage rate on a non-target plant is “insignificant”. One approach is to calculate the probability of attack based on the number of plants that were tested. If 300 plants were tested and none were attacked, then if a hypothetical 301st plant was attacked, the probability of attack would be 1/301 = 0.33% (Smith et al. 2006). So, it can be stated that the real attack rate should be less than 0.33% (a point estimate). Furthermore, a confidence limit can be calculated using the binomial distribution (e.g. the upper 99.9% confidence limit for an estimate of 0% attack is 1 − exp(ln(a)/n), where a = 0.001 and n is the number of observations; Zar 1984; Cristofaro et al. 2013). Hence, a sample of 300 plants with zero attack establishes with 99.9% confidence that the true attack rate is lower than 2.3%. Thus, if the goal is to demonstrate a low probability of attack on a non-target species, a very large number of plants will need to be used.
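The binomial confidence limit described above, 1 − exp(ln(a)/n), can be evaluated directly. The following Python sketch is a minimal illustration under the assumptions stated in the text (zero attacked plants among n independent observations, a = 0.001); the function names and the example target rate of 1% are hypothetical. It also inverts the formula to show how many unattacked plants would be needed to push the upper limit below a chosen attack rate.

```python
import math


def upper_attack_rate(n_plants, a=0.001):
    """Upper confidence limit for the true attack rate when 0 of n plants are attacked.

    Implements 1 - exp(ln(a) / n); a = 0.001 corresponds to 99.9% confidence.
    """
    return 1.0 - math.exp(math.log(a) / n_plants)


def point_estimate(n_plants):
    """Point estimate used in the text: a hypothetical (n+1)th plant is attacked."""
    return 1.0 / (n_plants + 1)


def plants_needed(max_rate, a=0.001):
    """Smallest n for which zero attacks give an upper limit at or below max_rate.

    Obtained by solving 1 - a**(1/n) <= max_rate for n.
    """
    return math.ceil(math.log(a) / math.log(1.0 - max_rate))


if __name__ == "__main__":
    for n in (300, 1000, 3000):
        print(n, f"point estimate {point_estimate(n):.4f}",
              f"upper 99.9% limit {upper_attack_rate(n):.4f}")
    # Hypothetical target: upper limit of 1% attack at 99.9% confidence.
    print("plants needed for a 1% limit:", plants_needed(0.01))
```

For 300 unattacked plants the upper limit evaluates to about 2.3%, and close to 700 unattacked plants are required before the limit drops below 1%, which illustrates why demonstrating near-zero risk demands large sample sizes.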

Some experiments have used a Latin square design with the intent of randomizing and minimising the effect of one plant on its neighbour (e.g. Smith et al. 2009). The distance between plants is usually chosen based on practicality. However, there is always the risk that a neighbouring plant could increase or decrease the risk of attack. For example, agents dispersing from overly infested or dying target plants may be more likely to land on neighbouring non-target plants (Blossey et al. 1994). Thus, experiments meant to test the presence of ‘spill-over’ effects, i.e., the transient risk of attack by agents that have become extremely numerous on their normal host plant, tend to place non-target plants close to infested target plants, or may either kill or remove the target plants to simulate a ‘spill-over’ situation (Catton et al. 2014, 2015). On the other hand, plants that are repellent may reduce attack rate on neighbours (Marohasy 1998). Only a few experiments have been performed to show how risk of attack decreases with increasing distance from infested target plants (Schooler et al. 2003; Catton et al. 2014).

Another challenge in field experiments is distinguishing among damage caused by more than one insect species. For example, field tests of the weevil Ceratapion basicorne (Illiger) (Coleoptera: Brentidae), which develops inside rosettes of yellow starthistle, Centaurea solstitialis L. (Asteraceae), were conducted in areas where similar species occurred, including some known to attack the non-target plant of interest, safflower, Carthamus tinctorius L. (Asteraceae) (Smith et al. 2006). Thus, the experimenters monitored the development of insects during the experiment and harvested the plants just as the insects completed development so that all adults could be captured and identified. Even this approach left uncertainty about the identity of individuals that died before emerging. DNA of these remaining individuals was analysed to determine their identity (Antonini et al. 2009). Molecular genetic analyses have proven to be a very useful way to identify both immature and adult insects attacking plants in such experiments (Rector et al. 2010). On the other hand, the presence of multiple species of insects can also be beneficial in that it provides the opportunity to collect data on more than one prospective agent during the experiment (Clement and Sobhian 1991; Briese et al. 2002).

Eriophyid mites (Acari: Eriophyidae) can be considered as a separate challenge because of their small size and their unusual dispersal behaviour (Smith et al. 2010). To ensure that all test plants are exposed, mites should be released on each plant. For example, Smith et al. (2009) placed an infested plant cutting in a water vial attached to the test plant so that mites could voluntarily disperse onto the test plant. After one week, the water in the vials was allowed to dry up so that the cutting would wilt, forcing remaining mites to disperse or die. Although a method to extract eriophyid mites by washing has been developed (Monfreda et al. 2007), it is extremely time-consuming to extract mites and determine whether they are alive, because dead mites do not necessarily indicate attack, and then to identify them (Smith et al. 2009). Moreover, live mites occurring on a test plant do not necessarily mean that the mites are feeding on the plant because they may have inadvertently landed on a plant while dispersing in the wind from a heavily infested target plant. Thus, measurements of impact on the plants should also be done, which means that some plants should be inoculated and others not, to serve as a control. In the end, impact will probably need to be correlated to mite numbers because of dispersal by wind which cannot be controlled.

Perennial and biennial plants can be especially difficult to work with because of the additional time needed to rear the plants to maturity. Such plants may require vernalisation in order to flower. The co-authors of the present paper have had many attempts frustrated by weather that was either too wet or too dry, winters that were too cold, and the destruction of non-target plants by various pathogens or pests before experiments could be completed. There is always a risk that open-field host range tests will fail, and we recommend scheduling plenty of time for setting up and maintaining such experiments.

The future of open-field host range studies

Open-field host range studies will continue to play a role in pre-release risk assessment studies of classical biological weed control programs, as they provide the opportunity to study host selection behaviour of a biological control candidate when it can display the whole array of pre- and post-alighting behaviour. To help in resolving discrepancies between host range results obtained in quarantine tests and those observed in the field, open-field experiments should be designed to address specific hypotheses. Throughout the 53 years, open-field host range studies have rarely been used to test hypotheses or predictions made pre-release (Frye et al. 2010; Catton et al. 2015). The two-phase experimental design described above was an important step in this direction, and it has been repeatedly applied in its original or in a revised form (Briese et al. 2002; Olckers and Borea 2009; Watson et al. 2009; Frye et al. 2010; Cao et al. 2011; Lake et al. 2015). However, publications on open-field host range studies often continue to lack expression of the specific hypotheses to be tested, especially those that are based on fundamental aspects of host-selection behaviour. For example, in an attempt to validate pre-release screening results, Frye et al. (2010) hypothesized that, when given a choice of taxonomically related plants in the field, the weevil Rhinoncomimus latipes Korotyaev (Coleoptera: Curculionidae) would rapidly shift from non-target plant species to the target weed, and that weevils would disperse away from the study area when the target weed was absent. Yet, no scientific explanation was given why R. latipes should behave as hypothesized.

Future open-field studies should attempt to elaborate more clearly on the putative mechanisms underlying the hypotheses to be addressed, where appropriate with the help of classical conceptual behavioural models such as that provided by Courtney et al. (1989). One possible mechanism by which non-target species, despite being within the fundamental host range of a biological control agent, may not experience consistent attack at the population level is that the biological control agent is not able to perceive cues emitted by the non-target species, or that the herbivore perceives these cues but is either not attracted or even repelled by them (see Park et al. 2018). Catton et al. (2014, 2015) conducted an open-field study with the weevil Mogulones crucifer Pallas (Coleoptera: Curculionidae), a biological control agent of houndstongue, Cynoglossum officinale L. (Boraginaceae), in the introduced range and showed that the confamilial non-target species, Hackelia micrantha (Eastw.) J.L. Gentry, was attacked under field conditions, but that attack of the non-target plant was localized and driven by a spill-over mechanism. A lack of attraction to a non-target species can also be experimentally assessed pre-release, e.g. by arranging a mixed plot of the target and the non-target species in the centre of the experimental site and setting up satellite patches of either the target species or the non-target species at increasing distances along transects radiating from the central patch (Häfliger and Hinz, unpublished data). If the biological control agent is not attracted to the non-target species, it could reasonably be expected that non-target attack only occurs in the central plot and possibly in the nearest satellite patches, while patches of the target species are attacked much further away.
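The satellite-patch arrangement described above can be sketched as a set of plot coordinates. The following Python example is a hypothetical illustration only and does not reproduce the unpublished design of Häfliger and Hinz: the number of transects, the distances and the rule for alternating target and non-target patches are all assumptions chosen so that both plant species are represented equally often at every distance.

```python
import math


def satellite_layout(n_transects=4, distances_m=(5, 10, 20, 40)):
    """Coordinates for a central mixed plot plus single-species satellite patches.

    Assumptions (hypothetical): transects are evenly spaced around the
    central plot, and target / non-target patches alternate along each
    transect, with the starting species alternating between transects.
    With an even number of transects, each distance then holds equal
    numbers of target and non-target patches.
    """
    patches = [{"kind": "mixed", "transect": None, "distance_m": 0, "x": 0.0, "y": 0.0}]
    for t in range(n_transects):
        angle = 2.0 * math.pi * t / n_transects
        for i, d in enumerate(distances_m):
            kind = "target" if (t + i) % 2 == 0 else "non-target"
            patches.append({
                "kind": kind,
                "transect": t,
                "distance_m": d,
                "x": round(d * math.cos(angle), 2),
                "y": round(d * math.sin(angle), 2),
            })
    return patches


if __name__ == "__main__":
    for patch in satellite_layout():
        print(patch)
```

Recording attack per patch against `distance_m` for each `kind` would then allow the prediction in the text to be examined, namely that non-target attack is confined to the central plot and the nearest patches while target patches are attacked at greater distances.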

Results from hypothesis-driven open-field host range tests will also facilitate their validation post-release. For example, in the biological control program against mile-a-minute weed, Persicaria perfoliata (L.) H. Gross (Polygonaceae), laboratory and open-field studies in the native range suggested that two congeneric North American plant species would not be attacked upon release of the weevil R. latipes in North America (Colpetzer et al. 2004). This was subsequently confirmed in an open-field host range study in the introduced range (Frye et al. 2010).

During the early stages of open-field host range testing, numerous studies used an interspersion design to assess the likelihood of non-target attacks by biological control candidates (Table 1). However, it has been known for a long time that the abundance of potential host plant species can affect the searching behaviour of herbivores (Rausher 1978). Interestingly, the first open-field host range test by Andres and Angalet (1963) used the reverse interspersion approach by placing a few cut, heavily-infested target weeds amongst plantations of test plants, thereby increasing the likelihood that the non-target species will be attacked. It took 50 years until Cristofaro et al. (2013) revived this approach when testing the risks of non-target attack by the weevil C. basicorne on the crop safflower. The weevil was placed on target plants (yellow starthistle, C. solstitialis) in a row surrounded on either side by four rows of the non-target crop, safflower (Supplementary material, fig S1; Cristofaro et al. 2013). This design was intended both to simulate the cultivation of a crop in a solid block and to force the insects to disperse through it from a central row of target plants, which also served as a ‘positive’ control to show that the insects could attack plants. In this experiment, C. basicorne infested 54% of the yellow starthistle plants, but none of 1,021 safflower plants. These authors concluded that the probability that this insect attacked the non-target plant under these experimental conditions was less than 0.74%. Using more plants would lower this percentage. However, it is not possible to reduce it to absolute zero without testing every plant in existence.

Briese (1999) found that biological control candidates were more likely to attack non-target species in open-field studies when using equal numbers of target and non-target plants than when using an interspersion design. We suggest that the reverse interspersion design, which has been largely under-utilized so far, provides an even more robust approach to assess the likelihood of non-target effects, particularly in cases where biological control successfully reduces target plant population to the extent that the target plants become significantly less common than some of the test plant species. Most crops are grown today as monocultures, so it would seem reasonable to make use of this setting and test their susceptibility to non-target attack with a reverse interspersion design.

As with host range studies conducted in confined conditions, we propose that future open-field host range studies increasingly compare the outcome of multiple experimental designs. One of the few cases where multiple experimental designs were applied is the assessment of potential non-target effects of the accidentally introduced leaf beetle Ophraella communa LeSage (Coleoptera: Chrysomelidae), a biological control agent of common ragweed, Ambrosia artemisiifolia L. (Asteraceae), on sunflower, Helianthus annuus L. (Asteraceae), in China. Cao et al. (2011) and Zhou et al. (2011) report results from three different test designs: (a) common ragweed and sunflower were grown in alternating, contiguous circles around the release point; (b) common ragweed was grown in a central square and sunflower in a surrounding layout that did not touch it; and (c) Briese’s two-phase test design. Theoretically, one would expect that the risk of non-target effects increases from design (a), where common ragweed was available next to sunflower, to (b), where, to find alternative food, beetles would have to leave the central square once common ragweed is completely defoliated, and (c), where the non-target is present in the central plot and beetles can choose between staying or leaving the plot and searching for additional host plants. Although oviposition on sunflower was low in all three test designs, the pattern found agreed with the theoretical predictions outlined above: no eggs were found on sunflower in design (a), while 0.1% of all eggs were found on sunflower in design (b) and 3% on sunflower in design (c) (Cao et al. 2011; Zhou et al. 2011).

In general, we propose that the use of multiple test designs, both in confinement and under open-field conditions, will facilitate the interpretation of results obtained in pre-release host range testing. Marohasy (1998) argued that any single test may give false results. We agree with this statement but recommend moving away from using the terms “false positives” and “false negatives”. These terms are derived from statistical hypothesis testing, where e.g. a false positive means that a test returned a positive result where a negative result should have been obtained. Yet, unexpected oviposition on a non-target species is not a false result. Rather, it means that host fidelity of an herbivore is lost under certain conditions. In essence, the strength of the two-phase test design is that it includes two test designs. For example, in the study conducted by Briese et al. (2002), one of the biological control candidates showed a switch in host usage when deprived of healthy plants of the preferred hosts, a result that would have gone unnoticed if only one test design had been applied. Setting up multiple, hypothesis-driven test designs, either in separate experiments or combined in one experiment, will help us to move away from concepts of false positive/negative results and to advance our understanding of ‘why results differ among test designs’.

Over the 53-year history of open-field host range studies, the method has experienced some major changes in application. As predicted by Briese (1999), open-field host range studies continue to play an important role in predicting the likelihood of non-target effects of weed biological control candidates. However, we suggest that future open-field host range studies should be more hypothesis-driven and apply different experimental designs that facilitate the interpretation of their results. Also, efforts should be made to elaborate the mechanisms underlying apparently ambiguous test results and to link host range testing with the more theoretical literature on extrinsic and intrinsic factors affecting host specificity and host fidelity of invertebrate herbivores.

The utility of open-field host range studies depends not only on scientific advancements in this area, but also on the ability of such results to influence decisions by regulatory authorities who must assess the risks associated with the release of candidate biological control agents. The guidelines in the USDA–Technical Advisory Group for Biological Control Agents of Weeds (TAG; USDA 2000) emphasize consideration of realized host range data. Yet, in two recent cases, the candidate biological control agents C. basicorne against yellow starthistle and Metriona elatior Klug (Coleoptera: Chrysomelidae) against tropical soda apple, Solanum viarum Dunal (Solanaceae), the results of open-field tests in the native range were not sufficient to obtain a permit, given that attack of crop species had been observed under no-choice conditions (Hinz et al. 2014). Although these results were sufficient for the advisory committee (TAG) to recommend approval, they were not sufficient for APHIS, which makes the final decision. Thus, it appears that more work needs to be done to improve our ability to assess and communicate risk. The utility of open-field host range studies, both for improving the scientific understanding of host specificity and for facilitating decision processes, is yet to be exploited fully in weed biological control programs.

In contrast to biological control of weeds, open-field host range studies are rarely conducted in arthropod biological control, either as part of pre-release studies or post-release to validate predictions made pre-release. Rather, field work to assess the host-specificity of candidate biological control agents of arthropod pests has been primarily focused on surveys of the natural enemy complexes associated with target and non-target species in the native range (van Lenteren et al. 2006). The mobility of many arthropod species and life stages tends to make experimental host range studies more difficult with parasitoid and predatory biological control candidates than with herbivores, and thus only a few examples of open-field host range studies exist from arthropod biological control (Babendreier et al. 2003; Zhang et al. 2017). For example, Zhang et al. (2017) tested the likelihood of non-target effects by the egg parasitoid Trissolcus japonicus (Ashmead) (Hymenoptera: Platygasteridae), a biological control agent of the brown marmorated stink bug, Halyomorpha halys (Stål) (Hemiptera: Pentatomidae), by attaching sentinel egg masses of the target and four suitable non-target stink bug species to the underside of leaves within mulberry and peach orchards infested with H. halys. Besides high parasitism rates of the target, non-target egg masses were frequently attacked by T. japonicus, confirming previous fundamental host range studies. We suggest that open-field host range studies, as discussed in this paper, can also contribute to better risk assessments of non-target effects in arthropod biological control.