Introduction

Fungal communities are essential for the functioning of most terrestrial ecosystems (Eastwood et al. 2011; Zak et al. 2011). Plants depend on the activity of soil fungi in particular, the mycorrhizal taxa of which provide them with nutrients, while saprobic taxa recycle plant debris. However, increasing evidence indicates that the fitness of plants may also depend on the presence of suitable or compatible microbial taxa (Redman et al. 2011). This is in line with the recent understanding that the holobiont (the host including all symbionts) should, whenever possible, be the focus of ecological studies (for reviews see: Feldhaar 2011; Rosenberg et al. 2010).

Insight into such complex pedosphere and phyllosphere fungal communities has long been gained by cultivation techniques (e.g., Gams 2007; Osono 2006; Peršoh et al. 2010), but analysis of DNA extracted from environmental samples nowadays provides access to not-readily-cultivatable microbial organisms and allows for their rapid grouping into operational taxonomic units (OTUs). High-density microarrays, such as the GeoChips (He et al. 2007) and EcoChips (Peršoh et al. 2012), or next-generation sequencing assays (e.g., Baldrian et al. 2012; Jumpponen et al. 2010; Stoeck et al. 2010), have the potential for in-depth analyses of complex communities. As an immense amount of data is obtained by such approaches and has to be processed and interpreted by highly skilled experts, these methods are preferably applied to selected samples of interest. Pre-selection of samples for in-depth analyses may be achieved by less labor-intensive DNA fingerprinting approaches, such as denaturing gradient gel electrophoresis (DGGE), single-strand conformation polymorphism (SSCP), and terminal restriction fragment length polymorphism (T-RFLP) (Cordier et al. 2012; Oros-Sichler et al. 2007). However, unique primer combinations and non-standardized data analysis workflows make overall comparison of data almost impossible and therefore require de novo assessment of reference data (control samples/treatments) in every new study.

Automated ribosomal intergenic spacer analysis (ARISA), which is easily standardizable and provides molecular signatures of metagenomic DNA samples, has been extensively used in characterizing prokaryotic communities from environmental samples (for review, see: Popa et al. 2009), but has been applied to a much lesser extent to characterize environmental samples with regard to their fungal diversity (e.g., Banning et al. 2011; Carson et al. 2010; Torzilli et al. 2006). We have further adapted the ARISA protocol for applicability to fungal communities and applied it to comparably diverse environmental substrates (i.e., soils and tree leaf samples). The resulting ARISA signatures were used to verify that the method can detect differences in fungal community composition according to spatiotemporal factors and to host plant taxonomy. We expect that the comparative and diagnostic approach of characterizing environmental samples has great potential for large-scale monitoring projects in agriculture, forestry and environmental ecology. ARISA may also be an attractive pre-screening option in large-scale sampling campaigns, to select samples for in-depth analyses, such as high-throughput sequencing. Conversely, it may serve as an up-scaling step, by correlating ARISA profiles with results from in-depth assessments to allow for an economical extension of sampling frequency.

Materials and methods

Conceptual design

For diagnostic assessment of mycodiversity, the original ARISA protocol (Fisher and Triplett 1999) was adapted for improved applicability and target organism specificity. The sampling concept can be characterized by the following variables: environmental samples were taken either from the phyllosphere (tree leaves) or from soil. While phyllosphere samples varied with respect to a) the sampling location (beech leaf samples from three sites in Southern Germany) and to b) the tree species (leaf samples of six genera from one specific location), soil samples varied with respect to c) the sampling location (three forest sites in Bavaria, Germany), d) the soil horizon (Oe, Oa, A, B horizons) and e) the season (biweekly sampling over a full year).

Sampling sites and sample collection

Leaf samples were collected at three different sites: a mixed forest west of Bayreuth (‘Heinersreuther Forst’; 49.9618° N, 11.4648° E, 473 m alt.), and two European Beech forests at ‘Naturpark Spessart’ (49.9911° N, 9.3355° E, 469 m alt.), and at ‘Naturpark Steigerwald’ (49.9260° N, 10.5701° E, 521 m alt.). Soil samples were collected also at three different sites: at the ‘Naturpark Steigerwald’ (see above), at a spruce forest in the ‘Naturpark Solling-Vogler’ (51.7657° N, 9.5778° E, 530 m alt.; for details see: Bredemeier et al. 1998), and along the hillside ‘Hohe Warte’, a pine-dominated forest north of the city of Bayreuth (49.9738° N, 11.5808° E, 460 m alt.).

Twigs were taken from fully grown trees standing 25–300 m apart at a height of 4–6 m above ground on Oct. 4, 2009, and Sep. 26 to 28, 2011. The samples were transferred to the laboratory and surface-sterilized as described previously (Unterseher and Schnittler 2009). Leaf material for subsequent analysis was taken from leaf discs (5.5 mm diam.) of five independent leaves of the following tree species: Acer platanoides, Acer pseudoplatanus, Betula pendula, Fagus sylvatica, Picea abies, Pinus sylvestris, Quercus petraea, and Quercus robur.

Soil cores were drawn at the ‘Naturpark Steigerwald’ site beneath the sampled trees and at the ‘Naturpark Solling’ site from four plots located 2–5 m apart. Samples were transferred to sterile 15 ml reaction tubes and stored at −80 °C until further processing. For seasonal monitoring at the ‘Hohe Warte’ location, the covering layer of undecomposed litter (Oi) was removed and two replicate soil samples were drawn every 2 weeks from January to December 2010 from the uppermost 5 cm of the organic Oe layer, which consists of fragmented litter. These samples were stored as described above. Data obtained from at least four of these samples were grouped to cover broader time intervals for statistical analysis.

DNA extraction and ITS1 amplification

Genomic DNA was extracted from leaf and soil samples following a protocol previously developed for DNA extraction from soil (Peršoh et al. 2008), except that for leaf samples the aluminum sulfate precipitation of humic acids was omitted and that larger ceramic beads (two beads of 2.8 mm in diam. per extraction) were used in addition to smaller glass beads in the sample homogenization process (for leaf samples, steps 12 to 35 of the above-mentioned protocol were followed). A small set of samples (one sample each of A. pseudoplatanus and Q. robur) was also extracted using a commercial genomic DNA extraction kit (NucleoMag 96 Plant Kit, Macherey-Nagel, Düren, Germany) following the homogenization step described above. The corresponding results indicated that ARISA data were not influenced by the DNA extraction method applied. ITS1 fragments were amplified from genomic DNA (about 5–50 ng) in a 12.5 μl reaction volume consisting of 0.5 μM of each primer ITS1F-Z (5′-CTWGGYYATTTAGAGGAAGTAA; modified after Gardes and Bruns 1993) and ITS2 (White et al. 1990), 1.5 mM MgCl2, 1 × PCR buffer (Promega), 0.2 μM dNTP, and 0.35 U GoTaq Hot Start DNA polymerase (Promega) using the following PCR settings: initial activation of the hot-start polymerase at 95 °C for 2 min, followed by 35 cycles at 94 °C for 30 s, 56 °C for 30 s, 72 °C for 60 s, followed by a final extension at 72 °C for 5 min. The primer ITS1F-Z was labeled with the fluorescent dye BMN-5 (biomers.net GmbH, Ulm, Germany).
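
The degenerate positions in ITS1F-Z can be made explicit by expanding the IUPAC ambiguity codes (W = A or T, Y = C or T) into the concrete oligonucleotides they encode. The following is a minimal sketch for illustration only; the function name `expand_degenerate` is hypothetical and not part of the published workflow.

```python
from itertools import product

# IUPAC codes needed for this primer only (W = A or T, Y = C or T).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "AT", "Y": "CT",
}

def expand_degenerate(primer):
    """Return all concrete sequences encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer]
    return ["".join(combo) for combo in product(*choices)]

ITS1F_Z = "CTWGGYYATTTAGAGGAAGTAA"  # primer sequence as given in the text
variants = expand_degenerate(ITS1F_Z)

print(len(variants))  # one W and two Y positions: 2 * 2 * 2 variants
for seq in variants:
    print(seq)
```

With one W and two Y positions, the primer represents eight distinct 22-mers, one of which (CTTGGTCATTTAGAGGAAGTAA) corresponds to the unmodified ITS1F sequence.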

ARISA capillary electrophoresis and primary data collection

PCR-amplified ITS1 fragments (in most cases 1 μl of a 20-fold dilution with water, but altered if primary signals were too high or too low, respectively) were combined with 0.25 μl of size standard (DNA size standard kit 600 bp, Beckman-Coulter) and 39.5 μl of sample loading solution (Beckman-Coulter). The samples were analyzed on a GenomeLab GeXP Genetic Analysis System (Beckman-Coulter). Electropherogram data were analyzed using the GenomeLab GeXP software (ver. 10.2) using standard fragment analysis parameters. Peaks exceeding a relative peak intensity threshold of 5 % were considered as present. A primary data matrix (sample vs. peak size) of absolute peak heights was obtained using an AFLP-type size calling approach (maximum bin width was set to two nucleotides to correct for minor differences in migration time observed for larger, slower-migrating fragments).
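
The construction of the primary data matrix can be sketched as follows. This is an illustrative approximation, not the GenomeLab GeXP algorithm: it assumes that the 5 % threshold is taken relative to the tallest peak of each sample and that bins are formed greedily over all observed fragment sizes, with no bin spanning more than two nucleotides. All function names and the example peak lists are hypothetical.

```python
def filter_peaks(peaks, rel_threshold=0.05):
    """Keep (size, height) peaks exceeding rel_threshold of the tallest peak.

    Assumption: the threshold is relative to the tallest peak per sample.
    """
    if not peaks:
        return []
    tallest = max(height for _, height in peaks)
    return [(s, h) for s, h in peaks if h > rel_threshold * tallest]

def make_bins(sizes, max_width=2.0):
    """Greedily group sorted fragment sizes into bins no wider than max_width."""
    bins = []
    for size in sorted(sizes):
        if bins and size - bins[-1][0] <= max_width:
            bins[-1].append(size)
        else:
            bins.append([size])
    return [(b[0], b[-1]) for b in bins]

def build_matrix(samples, rel_threshold=0.05, max_width=2.0):
    """Return (bins, matrix); matrix maps sample name to peak heights per bin."""
    filtered = {name: filter_peaks(p, rel_threshold) for name, p in samples.items()}
    all_sizes = [s for peaks in filtered.values() for s, _ in peaks]
    bins = make_bins(all_sizes, max_width)
    matrix = {}
    for name, peaks in filtered.items():
        row = [0.0] * len(bins)
        for size, height in peaks:
            for i, (low, high) in enumerate(bins):
                if low <= size <= high:
                    row[i] += height
                    break
        matrix[name] = row
    return bins, matrix

# Hypothetical peak lists: (fragment size in nt, absolute peak height).
samples = {
    "soil_1": [(201.3, 1000), (202.1, 40), (350.0, 500)],
    "leaf_1": [(202.0, 800), (351.5, 30), (420.7, 900)],
}
bins, matrix = build_matrix(samples)
print(bins)
print(matrix)
```

In this toy example the two low peaks fall below the threshold, and the fragments at 201.3 and 202.0 nt are assigned to the same bin.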

Statistical analysis

The peak height matrices were analyzed using the software package Primer 6 (v. 6.1.11) in combination with the Permanova + addon (v. 1.0.1; both Primer-E Ltd., UK). Data standardization was applied to express each peak height value in the sample as a percentage of the sample total peak heights. Similarity matrices were calculated from abundance matrices using the Bray-Curtis similarity coefficient or from presence-absence matrices using the Jaccard coefficient (as indicated in the text). Samples were averaged if independent replicates were available (e.g., replicate samples from seasonal monitoring). Analysis of similarity (ANOSIM, maximal 10,000 permutations), non-metric multi-dimensional scaling (NMDS, 100 restarts), and canonical analysis of principal coordinates (CAP) were conducted using the tools available in ‘Primer 6/Permanova+’. To test for seriation (a linear relationship) of data, the RELATE tool (Primer 6) was used to test for matched resemblance matrices.
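
The standardization and the two resemblance measures can be written out in a few lines. The sketch below uses plain Python rather than Primer 6, with hypothetical function names and example values; similarities are expressed on a 0–100 scale.

```python
def standardize(row):
    """Express each peak height as a percentage of the sample total."""
    total = sum(row)
    return [100.0 * v / total for v in row] if total else list(row)

def bray_curtis_similarity(a, b):
    """Bray-Curtis similarity (0-100) between two abundance profiles."""
    diff = sum(abs(x - y) for x, y in zip(a, b))
    total = sum(x + y for x, y in zip(a, b))
    return 100.0 * (1.0 - diff / total) if total else 0.0

def jaccard_similarity(a, b):
    """Jaccard similarity (0-100) based on presence/absence of peaks."""
    shared = sum(1 for x, y in zip(a, b) if x > 0 and y > 0)
    either = sum(1 for x, y in zip(a, b) if x > 0 or y > 0)
    return 100.0 * shared / either if either else 0.0

# Hypothetical peak height rows for two samples over three size bins.
soil = standardize([1000.0, 500.0, 0.0])
leaf = standardize([800.0, 0.0, 900.0])

bc = bray_curtis_similarity(soil, leaf)
jc = jaccard_similarity(soil, leaf)
print(round(bc, 2), round(jc, 2))
```

Because standardized profiles each sum to 100, the Bray-Curtis similarity here equals the summed minimum percentage per bin, whereas the Jaccard coefficient ignores peak heights entirely.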

Results

The ARISA signatures of endophytic fungal communities could be clearly distinguished from soil signatures on NMDS plots (Fig. 1a). The considerably high R-value indicates that the signatures obtained from soil and leaf extracts are largely separated.

Fig. 1

Similarity among soil and leaf DNA extracts according to fungal ARISA signatures visualized by non-metric multidimensional scaling (NMDS): a) endophytic and soil samples, b) soil samples collected at three different forests, and c) soil samples along the vertical profile of a spruce forest. ANOSIM R-values and significance levels indicate high resolution potential of the modified ARISA protocol for different spatiotemporal scales; NMDS (100 restarts) and ANOSIM (10,000 permutations) calculations were performed on presence-absence data matrices

More detailed comparison of ARISA signatures of soil samples revealed that the fungal community profiles at the three forest sites differed considerably from each other (Fig. 1b). Pairwise comparisons by ANOSIM showed these differences to be highly significant (p < 0.01). Furthermore, the corresponding R-values, ranging from 0.555 to 0.940, indicate quite dissimilar soil fungal communities at the sampling locations ‘Steigerwald’, ‘Solling’, and ‘Hohe Warte’, which differ with respect to the dominant tree species (beech, spruce, pine).
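
For readers unfamiliar with the ANOSIM R statistic, the following sketch shows how it is commonly defined: R = (mean between-group rank − mean within-group rank) / (M/2), where M = n(n − 1)/2 is the number of sample pairs and ranks are assigned to the pairwise dissimilarities. This is a simplified illustration, not the Primer 6 implementation; it assumes dissimilarities (e.g., 100 minus the similarity) as input, and the function names and example matrix are hypothetical.

```python
import random
from itertools import combinations

def average_ranks(values):
    """Rank values from 1 upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def anosim_r(dissim, groups):
    """ANOSIM R for a square dissimilarity matrix and one group label per sample."""
    pairs = list(combinations(range(len(groups)), 2))
    ranks = average_ranks([dissim[i][j] for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    mean_within = sum(within) / len(within)
    mean_between = sum(between) / len(between)
    return (mean_between - mean_within) / (len(pairs) / 2.0)

def anosim(dissim, groups, permutations=999, seed=0):
    """Return (R, p), with p estimated by permuting the group labels."""
    rng = random.Random(seed)
    observed = anosim_r(dissim, groups)
    labels = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(labels)
        if anosim_r(dissim, labels) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)

# Hypothetical dissimilarities: two well-separated groups of two samples each.
dissim = [
    [0, 10, 80, 90],
    [10, 0, 85, 95],
    [80, 85, 0, 20],
    [90, 95, 20, 0],
]
r_value, p_value = anosim(dissim, ["A", "A", "B", "B"])
print(r_value, p_value)
```

An R close to 1 means that all between-group dissimilarities exceed the within-group ones, while values near 0 indicate no separation. With only four samples, as in this toy example, the permutation test cannot reach a small p-value regardless of R, which illustrates why high R-values may coincide with non-significant p-values at low replication.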

Spatial analysis of soil fungal profiles along a vertical soil profile (soil horizons) at the ‘Solling’ site revealed that the layers could be distinguished with considerable resolution (Fig. 1c). While the profile of the adjacent layers (Oe–Oa, Oa–A, A–B) showed significant overlap, R-values in the range between 0.232 and 0.3 nevertheless indicate that adjacent layers tend to separate according to the fungal community profiles. Greatest differences in ARISA signatures could be observed between the lower (mineral) A and B soil horizons and the (organic) Oe layer, where high R-values indicate considerable separation of the ARISA signatures.

Pairwise comparison of all 25 soil sampling events at the ‘Hohe Warte’ site allowed us to group consecutive sampling dates into four distinct periods (seasonality) with considerable ARISA signature overlap. Testing these sampling periods by ANOSIM analysis again revealed statistically significant separation of ITS1 signatures (Fig. 2). Furthermore, considerable seriation (rho = 0.459) could be observed at a high significance level (see Fig. 2), showing that ARISA signatures of consecutive sampling events are more closely related to each other than to those of chronologically more distant samples.
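
The seriation test can be illustrated as a rank correlation between the observed dissimilarities and a model matrix in which the distance between two samples equals the number of sampling steps separating them. The sketch below is a simplified stand-in for the RELATE routine, assuming a Spearman correlation over all sample pairs and a permutation of sample order for the significance test; function names and the example matrix are hypothetical.

```python
import random
from itertools import combinations

def average_ranks(values):
    """Rank values from 1 upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; applied to ranks it yields Spearman's rho."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def seriation_rho(dissim, order=None):
    """Spearman rho between dissimilarities and the seriation model |i - j|."""
    n = len(dissim)
    order = list(range(n)) if order is None else order
    pairs = list(combinations(range(n), 2))
    observed = [dissim[order[i]][order[j]] for i, j in pairs]
    model = [abs(i - j) for i, j in pairs]
    return pearson(average_ranks(observed), average_ranks(model))

def relate_seriation(dissim, permutations=999, seed=0):
    """Return (rho, p), with p estimated by permuting the sample order."""
    rng = random.Random(seed)
    rho = seriation_rho(dissim)
    order = list(range(len(dissim)))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)
        if seriation_rho(dissim, order) >= rho:
            hits += 1
    return rho, (hits + 1) / (permutations + 1)

# Hypothetical example: five consecutive samplings with a perfect serial trend.
dissim = [[10 * abs(i - j) for j in range(5)] for i in range(5)]
rho, p_value = relate_seriation(dissim)
print(rho, p_value)
```

A rho of 1 corresponds to a perfect serial trend; the value of 0.459 reported above indicates a clear but imperfect tendency for community dissimilarity to increase with the time between sampling events.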

Fig. 2

Seasonal patterns of soil fungal ARISA signatures visualized by canonical analysis of principal coordinates (CAP); statistical support (except for the “mid. Feb – Jun” vs. the “Jul – early Oct” groups) for the manually defined time period groups is provided by the ANOSIM table. A statistically significant seriation trend (RELATE analysis) is present in the data set; CAP, ANOSIM (10,000 permutations), and RELATE (secondary matrix: seriation of sampling dates, 10,000 permutations) calculations were performed on abundance data matrices

Clusters of ARISA signatures amplified from fungal endophytes could be identified by non-metric multidimensional scaling with respect to the host species (Fig. 3). While ARISA signatures of the two investigated genera of the Coniferales (Pinus and Picea) largely overlap, genera of the Fagales (Betula, Fagus, and Quercus) are well separated (although the Fagus-Betula p-value is only close to significance). In addition, ARISA signatures of Acer spp. (Sapindales) are highly distinct from those of Fagus and the two conifers. Considerably high pairwise R-values also indicate differences between Acer spp. and Betula (R = 0.787) and Quercus (R = 0.438), although the corresponding p-values exceed the significance threshold of 5 %.

Fig. 3

Similarity among host genera according to endophyllous fungal ARISA signatures visualized by non-metric multidimensional scaling (NMDS). The two coniferalean genera can be distinguished from most other genera, although the ITS1 profiles do greatly overlap between Picea- and Pinus-inhabiting fungi. Within the Fagales, the fungal ARISA resolution potential is considerably higher in comparison to Coniferales. Statistically significant grouping of some genera is supported by ANOSIM (10,000 permutations) results based on an abundance matrix; non-significant R-values (p > 5 %) are shown in italics

In contrast, ARISA signatures obtained from Fagus leaves collected at three different sites in Northern Bavaria could not be distinguished from each other, indicating that the host factor dominates over a putative geographical factor (data not shown).

Discussion

Modification of the ARISA protocol

Molecular signatures of metagenomic DNA samples are versatile tools to characterize multipartite environmental samples. In prokaryote ARISA protocols, the intergenic region between the small and large subunit rRNA genes is amplified. Transferred to fungal genomes, this would mean that PCR fragments from 600 to over 1,000 base pairs in length would have to be amplified and analyzed. Although this does not pose a problem in standard PCR, the situation is slightly different in metagenomic (multiplex) PCR. It is well known that DNA fragments of different sizes do not amplify at comparable rates in multiplex PCR and that shorter DNA fragments tend to be overrepresented in the mixture of PCR products. This effect is less pronounced when all products are short. Therefore, we have chosen to focus on the ITS1 length polymorphism (expected ARISA fragments in the range from 200 to 600 bp) by selecting PCR primers targeting the 18S rRNA and the 5.8S rRNA gene. In addition to a more robust amplification efficiency of each individual ITS1 fragment, size separation of PCR products in the few-hundred base pair range can be performed more efficiently on most capillary electrophoresis systems.

Resolution power of ARISA fingerprints

Communities of soil fungi are among the most complex microbial communities (Gewin 2006) and the in-depth analysis of their structure is rather demanding, even with high-throughput molecular methods (e.g., Buée et al. 2009; Tedersoo et al. 2010). Probably for that reason, only a few studies analyzing the different factors shaping fungal community structure in soil have been conducted so far. Among the exceptions are laborious surveys on the temporal shifts (e.g., Dumbrell et al. 2011; Jumpponen et al. 2010), and compositional shifts along vertical soil profiles (Lindahl et al. 2007; Peršoh et al. 2013). The modified ARISA protocol allowed us to distinguish soil samples with regard to the fungal community composition according to the factors ‘dominant tree species’, ‘seasonality’, and ‘soil horizon’. While the sampling sites with different dominant forest trees were clearly separable (Fig. 1b), more detailed analyses at the single sites revealed the limits in resolution power: even though the ARISA signatures significantly differed among all layers of the vertical soil profile, signatures of adjacent layers overlapped to some extent (Fig. 1c). Similarly, chronologically adjacent sampling events exhibited overlapping ARISA signatures, but different time periods were still significantly distinguishable (Fig. 2). Considering these limitations, the ARISA signatures allow for a reasonably precise placement of soil samples in space and time. Because samples of the same categories mostly clustered in statistically delimitable groups, the protocol is particularly well suited for outlier analyses. Such deviating samples may be excluded for in-depth analyses of the communities aiming to analyze global patterns.

For larger-scale studies, such as forest monitoring programs aiming to detect and predict fitness and health of tree populations, it appears to be especially challenging to treat the plant individuals under a ‘holobiont view’ because the number of fungi even within host plant individuals is extremely high (cf. Tab. 12.1 in Stone et al. 2004), while host preference or specificity of endophytic fungi is considered to be relatively low (Rodriguez et al. 2009). Therefore, only meta-studies (Sieber 2007) or rather laborious single surveys (Arnold 2007) were so far able to discriminate different host taxa according to the inhabiting endophytic fungal community. In accordance with a survey analyzing 123,163 sequences from a massively parallel sequencing approach (Cordier et al. 2012), samples from Fagus clearly grouped together according to their endophytic ARISA profiles, as did those from Acer and the coniferalean genera. This qualifies the modified ARISA protocol as a diagnostic or pre-selection tool for endophytic assemblages in similar ways as discussed for soil samples. Deficiencies in statistical resolution power between host taxa (i.e., p-values > 0.05) were mostly accompanied by high R-values, indicating that the respective taxa are possibly separable by applying higher sampling densities. Only the two coniferalean genera may not be distinguishable, indicating that the outstanding leaf chemistry and anatomy of conifers may be selective for a specialized, but not host-specific, fungal community.

Standardized workflows

Our results indicate that ARISA signatures of metagenomic DNA preparations are widely applicable, because ARISA fragments could be amplified from DNA preparations of highly diverse substrates, such as soil samples and surface-sterilized leaves of various trees, using identical PCR protocols. At the same time, the ARISA signatures are highly specific, not only for soil and leaf-inhabiting fungi, but even at finer scale within the soil and the host range of forest trees (at different taxonomic levels). In this context, amplicons of non-fungal origin accounted for less than 0.1 % in studies using the same PCR conditions for preparation of ‘454-sequencing’ analyses of soil and leaf samples (Peršoh, unpublished data). These findings argue for developing a standardized protocol, which can be directly applied in various research areas (e.g., forestry, agriculture, biotechnology). The results show that comparative analysis of independent data (elaborated in the same project over time or even by different groups) is also feasible if the underlying primary data were obtained using standardized workflows. The most important aspects to generate standardized ARISA data are certainly related to PCR conditions:

  • Standardized PCR primers: ARISA signatures can only be compared among different experiments if the same primer pair is used in the amplification step. We have selected a set of two well-established primers. The only modification was prompted by emerging 18S sequence data from basal fungal lineages, which showed that the ITS1F primer does not perfectly match this group of fungi (Werner et al. 2012). By introducing degenerate positions in the 5′ part of ITS1F, zygomycete loci are now also perfectly matched by the modified primer (ITS1F-Z).

  • Standardized primer annealing temperature: Primer binding kinetics to (metagenomic) DNA largely depends on the annealing temperature used in PCR. To allow merging of primary data, we suggest using an annealing temperature of 56 °C at the MgCl2 concentration of 1.5 mM.

  • Amount of template DNA: In order to avoid drop-out of ITS1 fragments of rare genotypes, we suggest using at least 5–20 ng of genomic DNA per amplification reaction (Butler and Hill 2010).

Equally important are the procedures to determine the apparent sizes of the PCR products. Capillary electrophoresis (CE) has a superior size resolution capability compared to plate gel systems, and should be preferred if possible. Fragment sizes are determined in capillary electrophoresis devices by simultaneous detection of the migration time of fluorescently labeled PCR products and DNA size standards in the same capillary.

  • In practice, the chemical nature of the fluorescent dye will mainly influence the migration behavior of DNA fragments. Therefore, if different fluorescent dyes are used in the ITS1-PCR (either because the available color channels of the CE machines should be used in parallel, or because fragments were created in different laboratories), apparent migration differences with different fluorescent dyes should be adjusted for – if necessary.

  • Interestingly, previous experiments have shown that device-specific differences (separation of ARISA fragments on different capillary electrophoresis devices) can be neglected, if the same fluorescent dye is used for labeling of amplification products (de Baere et al. 2005).

Finally, it is highly advisable to archive the original electropherogram data and the raw size/abundance measurements in case platform- or protocol-specific normalizations have to be applied or modified later. Since ARISA signatures obtained from different projects over time could be successfully combined for the analyses shown above, we hope that our results will encourage different research groups to share and compare their data.