
1 Introduction

Biotyping is understood here as the study of variation within a given species in order to address questions about that species. Biotyping methods can be phenotypic or genotypic. Genotyping methods have been widely accepted because their interpretation is not subjective. They are well adapted to fundamental issues such as population genetics, reproduction modes, sexual recombination, and phenotype–genotype relationships, but also to more practical issues such as investigation of epidemics, nosocomial acquisition, routes of transmission, emergence of strains resistant to antifungal agents, recurrence of the same infection versus reinfection in a given patient, or quality control of reference strains to detect drift after repeated subcultures.

The main desirable features of a genetic marker are (1) a mean mutation rate high enough to generate the polymorphism needed to distinguish isolates, but not so high that isolates can no longer be grouped, (2) absence of selective pressure, (3) a low reversion rate, (4) absence of homoplasia, and (5) independence from the other markers. From a practical point of view, these markers must be accurate, successful for every isolate of a given species, reproducible, easy to interpret, rapid with a high throughput, and cheap. However, instead of searching for an ideal genotyping method with the highest level of discrimination, the objective should be to use the method that answers the question. For investigating Candida transmission between two patients, a single marker able to discriminate two isolates can be sufficient, knowing that when identical genotypes are observed, even with multiple markers, a well-designed temporal study must be undertaken to prove transmission between these two patients. For population genetics, several markers are mandatory, if possible well spread over the genome to give the most reliable view of the entire genome. The numerical index of discriminatory power (D) is often used to summarize the quality of a genetic marker, based on the formula [1]:

$$ D = 1 - \frac{1}{N(N-1)} \sum_{j=1}^{s} x_j \left( x_j - 1 \right) $$

where s is the number of profiles, x_j is the number of isolates of the population falling into the jth type, and N is the size of the population.
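The formula above can be evaluated directly from a list of genotype labels. The following is a minimal Python sketch; the function name and the example labels are illustrative assumptions and do not come from the cited reference.

```python
from collections import Counter


def discriminatory_power(genotypes):
    """Return D for a list of genotype labels (one label per isolate)."""
    n = len(genotypes)  # N, the size of the population
    if n < 2:
        raise ValueError("at least two isolates are required")
    # x_j: number of isolates falling into each of the s types
    counts = Counter(genotypes).values()
    return 1 - sum(x * (x - 1) for x in counts) / (n * (n - 1))


# Hypothetical example: six isolates falling into three types (3, 2, and 1 isolates)
example = ["A", "A", "A", "B", "B", "C"]
d_value = discriminatory_power(example)
```

D equals 1 when every isolate has a distinct profile and 0 when all isolates share the same profile.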

The DNA-based methods developed during the 1980s were diverse. They included pulsed-field gel electrophoresis, restriction enzyme analysis, and restriction fragment length polymorphism (RFLP) associated with probe hybridization after Southern blotting [2]. These techniques were technically demanding, had a low throughput, and generated results that were difficult to standardize. They were replaced in the 1990s by PCR-based methods such as randomly amplified polymorphic DNA (RAPD) [2], single-strand conformation polymorphism analysis [3], and amplified fragment length polymorphism [4]. These methods could be used without any previous knowledge of the genome of the studied microorganism, but yielded fingerprint profiles consisting of complex banding patterns that were difficult to reproduce in different settings [2, 5]. For instance, a PCR reaction with short, nonspecific primers and a low hybridization temperature, as used in RAPD, is impossible to reproduce when dealing with complex genomes. In the late 1990s and the early 2000s, two methods emerged that overcome most of the previous limitations of PCR-based methods, namely microsatellite length polymorphism (MLP) [6, 7] and multilocus sequence typing (MLST) [8].

In contrast to RAPD, MLP and MLST need previous knowledge of the DNA sequences of the targeted microorganisms to design primers in conserved regions, so that all the isolates of the species can be amplified. To retrieve repeated sequences, software tools are needed to select appropriate loci (e.g., http://c3.biomath.mssm.edu/trf.html) [9]. For MLST targeting housekeeping genes, preliminary comparative studies with a model organism such as Saccharomyces cerevisiae are also needed [10]. This has obvious pitfalls when dealing with a rare Candida species for which no genome has been sequenced. On the other hand, the PCR reactions in MLP and MLST rely on very specific primers and reaction conditions, which favors reproducible PCR results.

MLST methods for C. albicans and C. glabrata have been described elsewhere [10, 11] and are not described here, since we report only the methods routinely used in our laboratory to genotype isolates of C. albicans, C. glabrata, and C. parapsilosis (MLP) and of C. tropicalis and C. krusei (MLST).

1.1 Microsatellites

Microsatellites, also named short tandem repeats, were first investigated in eukaryotic genomes [12]. They are part of the numerous repeated DNA fragments found in eukaryotic genomes, mainly in noncoding regions, and are hence less susceptible to selective pressure. Microsatellites are defined as tandem stretches of two to five nucleotides repeated 5–100 times and are classically opposed to minisatellites, which consist of longer motifs (8–100 bases) repeated 2 to >100 times and are often clustered in telomeric regions. The polymorphism of these markers relies on variation, from one isolate to another, in the number of tandem repeats of a short core sequence. The loci retained for genotyping are those where several alleles are empirically observed. The discriminatory power of a locus depends on the number of alleles observed.

MLP genotyping is based on the labeling of one of the two primers used to amplify a microsatellite locus and on measurement of the amplicon length (Fig. 1). Fragment sizing is performed automatically using high-resolution electrophoresis platforms. Microsatellite alleles are often expressed as DNA fragments of different sizes obtained after PCR amplification with primers flanking the microsatellite region. The final data can be used as such in bp or converted into the corresponding number of repeats by comparison with reference strains. If several primer sets labeled with different dyes are used, multiplex PCR can save time and increase throughput. Moreover, since microsatellite markers test the presence of different alleles at a given locus, distinguishing heterozygotes in diploid organisms such as C. albicans is easy, which is impossible with RFLP or RAPD methods (Fig. 2).

Fig. 1 Example of a polymorphic CCA microsatellite (or short tandem repeat) at a given locus between three isolates. Thanks to the labeling of one primer, the PCR products can be easily differentiated on a sequencing electrophoretic gel according to the number of CCA repeats

Fig. 2 Example of allele assignment using the CDC3 allelic ladder for two isolates, numbers 22 and 15. Allele peaks in the ladder (green peaks) are marked as p1–p7. GeneFlo 625 internal size standards (red peaks) with sizes in bp are shown beneath each peak. Isolate 22 is p2–p5 heterozygous, and isolate 15 is p4 homozygous (after ref. [25])

Based on the variations in repeat numbers, genetic relatedness between different isolates can be assessed. MLP genotyping has already been reported for C. albicans [13–16], Candida glabrata [17, 18], Candida tropicalis [19], and Candida parapsilosis [20, 21]. Microsatellites provide high-resolution analysis that is consistent with MLST analysis [22].
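As an illustration of how relatedness can be derived from MLP genotypes, the sketch below computes a simple allele-sharing distance between two diploid isolates. This measure is only one possible choice and is an assumption of this example, not the method prescribed in the text; the function name, locus names, and allele sizes are hypothetical.

```python
from collections import Counter


def allele_sharing_distance(geno_a, geno_b):
    """Mean over shared loci of (1 - shared alleles / 2) for two diploid genotypes.

    Each genotype maps a locus name to a pair of alleles; a homozygous
    locus is written with the same allele twice, e.g. (120, 120).
    """
    loci = geno_a.keys() & geno_b.keys()
    if not loci:
        raise ValueError("the two genotypes have no locus in common")
    total = 0.0
    for locus in loci:
        # Multiset intersection counts the alleles shared at this locus (0, 1, or 2)
        shared = sum((Counter(geno_a[locus]) & Counter(geno_b[locus])).values())
        total += 1 - shared / 2
    return total / len(loci)


# Hypothetical genotypes: allele sizes in bp at two loci
isolate_1 = {"CDC3": (117, 125), "EF3": (120, 120)}
isolate_2 = {"CDC3": (117, 129), "EF3": (120, 120)}
distance = allele_sharing_distance(isolate_1, isolate_2)
```

A distance of 0 means identical genotypes at all compared loci and 1 means that no allele is shared. For haploid species such as C. glabrata the divisor would have to be adapted.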

Although the digital format is very attractive for exchanging results between laboratories, the present limitation of microsatellite typing is transferability. The size of the DNA fragments is calculated according to their electrophoretic mobility in capillary electrophoresis platforms. This calculation can be influenced by multiple factors such as exact base composition, separation matrix, presence of denaturing compounds, temperature, and fluorescent labels. Even the size standard and the polymerase that is used for amplification may affect the calculated size of an allele [23]. Thus, careful calibration of the different platforms should be established [23, 24]. A straightforward and universally applicable method to achieve such a calibration is through the use of allelic ladders [25]. An allelic ladder consists of a well-defined mixture of alleles with predetermined repeat numbers and can be used to create reference positions for the interpretation of typing results (Fig. 2).
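The use of an allelic ladder for calibration can be sketched as follows: each observed fragment size is matched to the closest ladder allele measured on the same platform. The ladder sizes, the peak names, and the 1 bp tolerance are hypothetical values chosen for illustration only.

```python
def assign_allele(observed_bp, ladder, tolerance_bp=1.0):
    """Return the name of the ladder allele closest to observed_bp.

    ladder maps allele names to their sizes (bp) as measured in the same run.
    None is returned when no ladder allele lies within tolerance_bp.
    """
    name, size = min(ladder.items(), key=lambda item: abs(item[1] - observed_bp))
    return name if abs(size - observed_bp) <= tolerance_bp else None


# Hypothetical ladder sizes measured in the same run as the isolates
ladder_run = {"p1": 117.2, "p2": 121.1, "p3": 125.0, "p4": 128.9}
called = assign_allele(121.4, ladder_run)
```

Because isolates and ladder are sized under the same conditions, the allele name rather than the platform-dependent size in bp is what is exchanged between laboratories.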

MLP is also subject to homoplasia, which can hamper the accuracy of the results. High-resolution DNA melting (HRM) analysis and SNaPshot minisequencing after a single amplification can help to investigate this possibility, but can also on the other hand add polymorphism and increase the discriminatory power of a single MLP marker [26].

1.2 Multilocus Sequence Typing

MLST is widely used in bacteriology and relies on DNA sequence analysis of housekeeping genes [27]. The starting point is to select genes with enough single-nucleotide polymorphisms to differentiate isolates, but with enough conserved sequence to design primers able to amplify every isolate of the studied species. Housekeeping genes fulfill these prerequisites. MLST measures the DNA sequence variations in the selected genes and characterizes strains by their unique allelic profiles. Internal fragments of approximately 450–500 bp of each gene are used, which is the length of PCR product that provides readable chromatograms upon Sanger sequencing. For each gene, each distinct sequence is assigned an allele number and, for each isolate, the alleles at each of the loci define the allelic profile or sequence type (ST), called diploid sequence type (DST) for diploid microorganisms. The main advantages of MLST are that it provides indisputable data based on Sanger sequencing and that the results can be compared with those deposited in data banks (http://calbicans.mlst.net/).
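The assignment of allele numbers and of a sequence type can be sketched as two dictionary lookups. This is a toy illustration: the sequences, allele numbers, ST numbers, and the alphabetical ordering of loci are assumptions of the example and do not reproduce the content or interface of any MLST database.

```python
def allelic_profile(sequences, allele_db):
    """Map each locus sequence to its allele number, loci taken in alphabetical order.

    A sequence absent from allele_db yields None (a candidate new allele).
    """
    return tuple(allele_db[locus].get(seq) for locus, seq in sorted(sequences.items()))


def sequence_type(profile, st_db):
    """Return the ST number for an allelic profile, or None if the profile is unknown."""
    return st_db.get(profile)


# Toy database with shortened, made-up sequences
allele_db = {
    "ICL1": {"ACGT": 1, "ACGA": 2},
    "MDR1": {"GGCC": 1},
}
st_db = {(1, 1): 10, (2, 1): 11}

isolate = {"ICL1": "ACGA", "MDR1": "GGCC"}
profile = allelic_profile(isolate, allele_db)
st = sequence_type(profile, st_db)
```

In practice the lookup is done against the public databases, which also handle the submission of new alleles and profiles.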

This method was first developed for Candida albicans by sequencing five to seven housekeeping genes [10], and a consensus on the methodology was reached [8]. Similar MLST schemes were developed for C. glabrata [11], C. tropicalis [28], and Candida krusei [29]. Besides C. albicans, active MLST schemes are publicly available for C. glabrata (http://cglabrata.mlst.net/), C. tropicalis (http://pubmlst.org/ctropicalis/), and C. krusei (http://pubmlst.org/ckrusei/). There are few technical limits to MLST when Sanger sequencing provides readable chromatograms. Sequencing of both strands is recommended to avoid reading mistakes.

1.3 Comparison of MLST and MLP

Both methods are useful tools for genotyping Candida isolates, with high typeability, high discriminatory power, and good reproducibility. MLST data are exportable through online databases, whereas MLP data are exportable only when results are normalized with an allelic ladder. MLST is more time-consuming and more expensive than MLP. Comparison of MLST using seven genes of C. albicans with three microsatellite markers showed a similar discriminatory power [22]. Like MLP, MLST readily gives access to heterozygosity in diploid microorganisms.

2 Materials

The five major yeast species responsible for invasive human infection are ascomycetes belonging to the order Saccharomycetales: Candida albicans, Candida glabrata, Candida parapsilosis, Candida tropicalis, and Candida krusei (currently named Pichia kudriavzevii). Candida albicans, C. parapsilosis, and C. tropicalis belong to the Lodderomyces/Candida albicans clade in the family Debaryomycetaceae [30]. Candida glabrata has been reclassified in the genus Nakaseomyces in the family Saccharomycetaceae, and P. kudriavzevii belongs to the family Pichiaceae [30, 31]. See also MycoBank at http://www.mycobank.org/.

  1. Genotypes for C. albicans, C. glabrata, C. parapsilosis, C. tropicalis, and C. krusei are determined after a purity check on BBL CHROMagar and species identification by carbon assimilation profile and/or ITS sequencing for C. parapsilosis or duplex ITS/actin PCR for C. albicans [32].

  2. Reference strains used as positive controls (PCR and allelic size control): C. albicans (B311, diploid), C. glabrata (ATCC2001, haploid), C. parapsilosis (ATCC22019, diploid), C. tropicalis (ATCC750, diploid), C. krusei (ATCC6258, diploid).

  3. Yeasts are subcultured on Sabouraud dextrose agar plates for 24 h at 30 °C. DNA is extracted using the High Pure PCR Template Preparation Kit (Roche Applied Science, Indianapolis, IN) according to the manufacturer’s instructions and stored at −20 °C until use.

  4. For PCR reactions: AmpliTaq Gold DNA polymerase, MgCl2, PCR buffer 10× (Roche, Applied Biosystems); deoxynucleotide (dNTP) solution mix at 25 mM; one primer of each pair (forward or reverse) used in the microsatellite assays is labeled at the 5′ end with one of the following fluorochromes:

    • HEX (4,7,2′,4′,5′,7′-hexachloro-6-carboxyfluorescein).

    • 6FAM (6-carboxyfluorescein).

    • NED (2′-chloro-5′-fluoro-7′,8′-fused phenyl-1,4-dichloro-6-carboxyfluorescein).

    • TET (6-carboxy-2′,4,7,7′-tetrachlorofluorescein).

  5. PCRs are performed on an iCycler thermocycler (Bio-Rad, Hercules, CA).

  6. Hi-Di Formamide and ROX-labeled GeneFlo 625 DNA ladder (ROX: 6-carboxy-X-rhodamine, succinimidyl ester; EURx) are used for the microsatellite mix.

  7. Conical V-bottom 96-well microplates are used for microsatellite assays.

  8. Double-strand sequencing and capillary electrophoresis are performed on an ABI PRISM 3730 XL (Applied Biosystems).

  9. For MLP analysis, allelic sizes are determined using GeneMapper or PeakScanner software (Applied Biosystems).

  10. For MLST analysis, chromatograms are analyzed using Geneious software or any other sequence editing software (Chromas, Sequencher, etc.) (see Note 1).

3 Methods

3.1 Microsatellites (MLP)

To assign a specific length to a PCR fragment, we systematically include the reference strain in every PCR run. Stutter peaks are expected: they are artifacts produced by the DNA polymerase when it encounters short tandem repeats [23]. The last highest peak is the one to be considered. Each allele is then named according to the length in bp of the amplified fragment after alignment with the reference strain. In diploid species (C. albicans and C. parapsilosis), one or two peaks can be observed for each marker in a given isolate, and each peak observed is assigned to an allele. When the electropherogram shows a single signal for a given locus, we consider the isolate to be homozygous for this locus [33]. For each species, microsatellite repeat types, primer sequences, gene products, and chromosomal locations are listed in Tables 1, 2, and 3. For C. albicans, an allelic ladder for the CDC3 marker is used to determine allelic size [25].

Table 1 Description of the microsatellite markers used to genotype isolates of Candida albicans
Table 2 Description of the microsatellite markers used to genotype isolates of Candida glabrata
Table 3 Description of the three microsatellite loci used to genotype isolates of Candida parapsilosis sensu stricto [21]
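The interpretation rules for one locus, together with the checks suggested in Notes 1 and 2, can be summarized in a short Python sketch. The function name and the returned messages are illustrative assumptions; the input is the list of allele sizes retained after stutter peaks have been discarded.

```python
def interpret_locus(allele_sizes_bp, ploidy=2):
    """Interpret the alleles called at one microsatellite locus for one isolate."""
    alleles = sorted(set(allele_sizes_bp))
    if not alleles:
        # No PCR product: the isolate may belong to another species (Note 1)
        return "no amplification: check species identification"
    if len(alleles) > ploidy:
        # e.g., three peaks for a diploid organism (Note 2)
        return "more alleles than expected: check purity"
    if ploidy == 1:
        return "single allele"
    return "homozygous" if len(alleles) == 1 else "heterozygous"


# Hypothetical allele sizes in bp
call_a = interpret_locus([117, 125])        # two peaks in a diploid species
call_b = interpret_locus([121])             # one peak in a diploid species
call_c = interpret_locus([117, 121, 125])   # three peaks in a diploid species
```

Setting ploidy=1 applies the same logic to a haploid species such as C. glabrata, where two alleles at one locus already suggest a mixture.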

3.1.1 Candida albicans

Five microsatellite markers are amplified in duplex (loci CDC3/EF3) or uniplex (loci HIS3, CDR1, and STPK) PCR using the following conditions:

  (a) Reaction volume of 20 μL contained 2 μL of genomic DNA, 1.25 U of DNA polymerase, 2 μL of 10× PCR Buffer, 5 mM of MgCl2, a 0.25 mM concentration of dNTP, 10 pmol (for EF3), 5 pmol (for HIS3), and 2 pmol (for CDC3, CDR1, and STPK) primers.

  (b) The PCR program consisted of an initial denaturation step at 95 °C for 10 min, followed by 30 cycles (or 27 cycles for CDR1 and STPK loci) of 30 s at 95 °C, 30 s at 55 °C, and 1 min at 72 °C, with a final extension step of 30 min at 72 °C.
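The volume of each stock solution needed to reach the final concentrations given above follows from C1 × V1 = C2 × V2. The sketch below applies this to the dNTP mix (25 mM stock, as listed in the Materials); it assumes that the stock and final concentrations are expressed on the same basis, and the 25 mM MgCl2 stock is a hypothetical value, since the text does not give it.

```python
def stock_volume_ul(final_mM, stock_mM, reaction_ul):
    """Volume of stock (in microliters) giving final_mM in a reaction of reaction_ul."""
    if stock_mM <= 0 or final_mM > stock_mM:
        raise ValueError("stock concentration must be positive and >= final concentration")
    return final_mM * reaction_ul / stock_mM


# dNTP: 0.25 mM final in 20 uL from the 25 mM stock
dntp_ul = stock_volume_ul(0.25, 25, 20)
# MgCl2: 5 mM final in 20 uL, assuming a hypothetical 25 mM stock
mgcl2_ul = stock_volume_ul(5, 25, 20)
```

Multiplying these volumes by the number of reactions, plus a margin for pipetting losses, gives the master mix composition.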

3.1.2 Candida glabrata

Five microsatellite markers are amplified by duplex (Cg4 and Cg6) or multiplex (RPM2/ERG3 and MTIA) PCR using the following conditions:

  (a) Reaction volume of 20 μL contained 1 μL of genomic DNA, 1.25 U of DNA polymerase, 2 μL of 10× PCR Buffer, 5 mM of MgCl2, a 0.25 mM concentration of dNTP, 10 pmol (for ERG3, MTIA, Cg4), 5 pmol (for RPM2 and Cg6) primers.

  (b) The PCR program for the duplex PCR consisted of an initial denaturation step at 95 °C for 5 min, followed by 27 cycles of 30 s at 95 °C, 30 s at 52 °C, and 45 s at 72 °C, with a final extension step of 10 min at 72 °C.

  (c) The PCR program for the multiplex PCR consisted of an initial denaturation step at 95 °C for 10 min, followed by 27 cycles of 30 s at 95 °C, 30 s at 55 °C, and 1 min at 72 °C, with a final extension step of 5 min at 72 °C.

3.1.3 Candida parapsilosis

Three microsatellite markers (CP1, CP4, and CP6) are amplified in uniplex PCR using the following conditions:

  (a) Reaction volume of 25 μL contained 2 μL of genomic DNA, 1.25 U of DNA polymerase, 2.5 μL of 10× PCR Buffer, 1.5 mM of MgCl2, a 0.2 mM concentration of dNTP, 2.5 pmol for each primer.

  (b) The PCR program consisted of an initial denaturation step at 95 °C for 10 min, followed by 28 cycles of 30 s at 95 °C, 30 s at 58 °C (for CP1) or at 60 °C (for CP4 and CP6), and 30 s at 72 °C, with a final extension step of 7 min at 72 °C.
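When several programs have to be scheduled on one thermocycler, it can help to describe each program as data and to estimate its duration. The sketch below does this for the CP1 program described above; the class and field names are illustrative assumptions, and the estimate ignores temperature ramping, so the real run is longer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PcrProgram:
    initial_denaturation_s: int
    cycles: int
    denaturation_s: int
    annealing_s: int
    extension_s: int
    final_extension_s: int
    annealing_c: float

    def hold_time_min(self):
        """Sum of all programmed holds, in minutes (ramping time excluded)."""
        cycle_s = self.denaturation_s + self.annealing_s + self.extension_s
        total_s = self.initial_denaturation_s + self.cycles * cycle_s + self.final_extension_s
        return total_s / 60


# CP1: 10 min at 95 C, 28 cycles of 30 s / 30 s at 58 C / 30 s, then 7 min at 72 C
cp1 = PcrProgram(
    initial_denaturation_s=600,
    cycles=28,
    denaturation_s=30,
    annealing_s=30,
    extension_s=30,
    final_extension_s=420,
    annealing_c=58.0,
)
cp1_minutes = cp1.hold_time_min()
```

The CP4 and CP6 programs differ only in the annealing temperature (60 °C), so their programmed hold time is the same.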

3.1.4 Microsatellite Mix

Following PCR, 2 μL of the amplification product is added to 13 μL of Hi-Di Formamide and to 0.5 μL of GeneFlo 625 ladder (see Note 2).

3.2 MultiLocus Sequence Typing (MLST)

Consensus sequences are edited by comparison of both DNA strands using the one-letter code for nucleotides from the International Union of Pure and Applied Chemistry (IUPAC) for heterozygous nucleotides. Fragmented sequences are then obtained after delimitation by start and end-sequences (Tables 4 and 5).
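The IUPAC coding of heterozygous positions can be illustrated with a small lookup table. In the sketch below, the two alleles are given as separate strings for clarity, which is an assumption of the example: in practice the heterozygous base is read from the double peak on the chromatogram. The function name and sequences are hypothetical.

```python
# IUPAC one-letter codes for single bases and for two-base ambiguities
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
}


def diploid_consensus(allele_1, allele_2):
    """Merge two aligned alleles of equal length into one IUPAC-coded sequence."""
    if len(allele_1) != len(allele_2):
        raise ValueError("alleles must be aligned and of equal length")
    return "".join(IUPAC[frozenset((a, b))] for a, b in zip(allele_1, allele_2))


# Hypothetical alleles differing at the second position (C/T)
consensus = diploid_consensus("ACGT", "ATGT")
```

Homozygous positions keep their usual letter, so a fully homozygous isolate yields an unambiguous sequence.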

Table 4 Description of the MLST markers to genotype isolates of Candida tropicalis [28]
Table 5 Description of the MLST markers to genotype isolates of Candida krusei [29]

For each gene, distinct alleles are identified and numbered using the MLST database. The sequence type (ST) results from the combination of the alleles at the different loci. For each species, primer sequences are listed in Tables 4 and 5.

3.2.1 Candida tropicalis

Six loci (ICL1, MDR1, SAPT2, SAPT4, XYR1, and ZWF1a) are amplified by PCR according to Tavanti et al. [28] with slight modifications:

  (a) Reaction volume of 50 μL contained 3 μL of genomic DNA, 2.5 U of DNA polymerase, 5 μL of 10× PCR Buffer, 2.5 mM of MgCl2, a 0.25 mM concentration of dNTP, 10 pmol for each primer.

  (b) The PCR program consisted of an initial denaturation step at 94 °C for 7 min, followed by 30 cycles of 1 min at 94 °C, 1 min at 52 °C, and 1 min 5 s at 72 °C, with a final extension step of 10 min at 72 °C.

  (c) Sequences are compared with the online database (http://pubmlst.org/ctropicalis/) developed by Jolley and hosted at the University of Oxford [34].

3.2.2 Candida krusei

Six loci (ADE2, HIS3, LEU2, LYS2, NMT1, and TRP1) are amplified by PCR according to Jacobsen et al. [29] with slight modifications:

  (a) Reaction volume of 50 μL contained 3 μL of genomic DNA, 1.5 U of DNA polymerase, 5 μL of 10× PCR Buffer, 2.5 mM of MgCl2, a 0.25 mM concentration of dNTP, 10 pmol for each primer.

  (b) The PCR program consisted of an initial denaturation step at 94 °C for 7 min, followed by 30 cycles of 1 min at 94 °C, 1 min at 52 °C, and 1 min at 72 °C, with a final extension step of 10 min at 72 °C.

  (c) Sequences are compared with the online database (http://pubmlst.org/ckrusei/) developed by Jolley and hosted at the University of Oxford [34].

4 Conclusion and Perspectives

Genotyping based on MLST or MLP can help us gain insight into the genetic relatedness of fungal isolates. Both have advantages and drawbacks depending on the task in question. Besides these well-established typing methods, other technologies that are already well established in bacteriology are now appearing in mycology. Next-generation sequencing (NGS) is probably revolutionizing the genotyping of microorganisms, although the size of fungal genomes makes NGS more expensive than for bacteria. The use of such advanced methods is currently restricted to specialized laboratories, but wider applications are possible in the near future.

5 Notes

  1. When no DNA is amplified with MLP or MLST, the identification of the isolate should be checked, since one of the main advantages of MLP and MLST is their specificity for the species studied.

  2. When more alleles than expected are observed with MLP (e.g., three peaks for a diploid organism), this suggests a mixture of isolates. A purity check can be performed to exclude this possibility. Such a mixture is difficult to investigate with MLST, since Sanger sequencing cannot detect a minority allele present at less than 40 %.