Analysis of Microsatellites in 13 Hemiascomycetous Yeast Species: Mechanisms Involved in Genome Dynamics

Malpertuy, Alain; Dujon, Bernard; Richard, Guy-Franck

doi:10.1007/s00239-002-2447-5

Analysis of Microsatellites in 13 Hemiascomycetous Yeast Species: Mechanisms Involved in Genome Dynamics

Published: June 2003

Volume 56, pages 730–741, (2003)
Cite this article

Download PDF

Access provided by CONRICYT-eBooks

Journal of Molecular Evolution Aims and scope Submit manuscript

Analysis of Microsatellites in 13 Hemiascomycetous Yeast Species: Mechanisms Involved in Genome Dynamics

Download PDF

Alain Malpertuy¹,
Bernard Dujon¹ &
Guy-Franck Richard¹

153 Accesses
15 Citations
Explore all metrics

Abstract

We have analyzed all di-, tri-, and tetranucleotide repeats in the partially sequenced genomes of 13 hemiascomycetous yeast species, and compared their sequences, lengths, and distributions to those observed in the genome of Saccharomyces cerevisiae. We found that most of the 13 species exhibit a unique distribution of microsatellites, not correlated to the base composition of their genome. Species close to S. cerevisiae exhibit a similar distribution, while species more distantly related show a more divergent distribution. We propose that de novo formation and continuous loss of microsatellites are active processes generating new DNA sequences. We also show that hemiascomycete-specific genes encoding transcription factors contain trinucleotide repeats more frequently than expected from their average frequency distribution. These transcription factors might play an important role in the speciation process, by regulating gene expression through DNA–protein or protein–protein interactions mediated by stretches of charged amino acids encoded by trinucleotide repeats.

Evolution of Nine Microsatellite Loci in the Fungus Fusarium oxysporum

Article 10 December 2015

Imperfect and Compound Microsatellites in the Genomes of Burkholderia pseudomallei Strains

Article 01 January 2019

Patterns of microsatellite distribution across eukaryotic genomes

Article Open access 22 February 2019

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Introduction

Microsatellites are short tandem DNA repeats present in variable numbers in the genomes of eubacteria, archaebacteria, and eukaryotes (reviewed in Charlesworth et al. 1994; Richard et al. 1999). Their size polymorphism was used to construct maps of the human, wheat, and trout genomes (Dib et al. 1996; Röder et al. 1998; Sakamoto et al. 2000). Microsatellites were also extensively used to type subpopulations belonging to the same species. Molecular typing was achieved by PCR amplification of selected microsatellite loci in Saccharomyces cerevisiae (Hennequin et al. 2001; Richard and Dujon 1996), Candida albicans (Lunel et al. 1998), Arabidopsis thaliana (Innan et al. 1997), or Vitis vinifira (Bowers et al. 1999).

Microsatellite tract length changes are believed to occur by replication slippage of the newly synthesized strand on its template during S phase replication or during recombination, promoted by the repetitive nature of the sequence (reviewed in Richard and Pâques 2000; Richards 2001). It was shown that perfect (uninterrupted) trinucleotide repeats were polymorphic among S. cerevisiae strains of different origins, whereas imperfect repeats were stable, supporting the notion that perfect repeats can be used to type populations differing at the molecular level (Richard and Dujon 1997).

Trinucleotide repeats are equally distributed in coding and non-coding regions, whereas other microsatellites (except hexanucleotide repeats) are underrepresented in coding regions (Debrauwère et al. 1997; Richard and Dujon 1997; Toth et al. 2000). This is most certainly due to the fact that trinucleotide repeat tract alterations within an ORF do not abolish its function by creating frameshift mutations. The most striking observation related to trinucleotide repeat-encoding genes is their propensity to encode nuclear products (Alba et al. 1999; Karlin and Burge 1996; Richard and Dujon 1997; Young et al. 2000). In S. cerevisiae, the most frequent aminoacid encoded by trinucleotide repeats is glutamine, but other charged aminoacids are also frequently encountered. Long poly-Gln stretches are supposed to be involved in a variety of nuclear processes, including protein–protein interactions and transcription activation (Karlin and Burge 1996).

During the course of the “Génolevures” sequencing project, 49,199 Random Sequenced Tags (RSTs) have been obtained from 13 hemiascomycetous yeast species (Souciet et al. 2000). Using these data (available at http://cbi.labri.u-bordeaux.fr/Genolevures) , we identified all dinucleotide, trinucleotide, and tetranucleotide repeats and compared their distribution in each of the 13 species with that found in S. cerevisiae. We show that the frequency of occurrence of each type of microsatellite varies among the different species studied, but that species close to S. cerevisiae exihibit a similar distribution. We confirm that trinucleotide repeats are more frequent in genes encoding nuclear factors, particularly transcription factors, and show that they are over-represented among the class of hemiascomycete-specific genes. We propose that trinucleotide repeats play a role in designing new regulatory domains by promoting rapid sequence diversity among the more flexible nuclear proteins.

Materials and Methods

Microsatellites were detected in the sequences of the Génolevures project (EMBL accession numbers from AL392203 to AL441602) using the software of Benson and Waterman (1994). The following parameters were used: match weight, + 1; mismatch weight, −2, −3, −4 (respectively, for di-, tri-, tetranucleotide repeats); insertion/deletion weight, −6, −9, −12; threshold to report, 10, 15, 20; pattern size, 2, 3, 4; lookcount, 2, 3, 4; no shortperiod, 1. This allowed us to detect microsatellites containing at least five perfect (uninterrupted) repeat units. Out of 49,199 RSTs sequenced during the Génolevures project, 4,814 contained rDNA, mitochondrial, Ty, or plasmid sequences and were excluded from the search.

The equation used to analyze the length distribution of di- and trinucleotide repeats is N = a × n ^−b, with n being the length class and N the number of occurrences in each length class. The resulting curves have been compared to the observed numbers using a Chi2 test (Chi2 > 6.63; p < 0.01).

Results

Sequence and Length Distributions of Microsatellites in the 13 Hemiascomycetous Species

In order to detect microsatellites in the 13 hemiascomycetous genomes, we used the same approach as the one described in former works (Richard and Dujon 1996, 1997; Richard et al. 1999). Altogether, we analyzed the 44,385 RSTs corresponding to nuclear DNA, totalling 41.2 Mb from 13 yeast species. We compared the results with the complete genome of S. cerevisiae. All di-, tri-, and tetranucleotide repeats that were at least five units long were scored. Their frequencies of occurrence are shown in Table 1. Three species, Saccharomyces exiguus, Kluyveromyces marxianus var. marxianus, and Candida tropicalis, exhibit a significantly higher number of di-, tri-, and tetranucleotide repeats as compared to S. cerevisiae. Only two species, Pichia sorbitophila and Pichia angusta, exhibit a lower number of both di- and trinucleotide repeats as compared to S. cerevisiae. The other species are either comparable to S. cerevisiae or show an excess or a deficiency in only one or two of the three classes. No species is entirely comparable to S. cerevisiae (Table 1).

Table 1 Total number of microsatellites identified in hemiascomycetous genomes

Full size table

Then, we compared the frequencies of occurrence of each of the four possible non-monotonous dinucleotide repeats found in each species to the frequency observed in S. cerevisiae. Results are shown in Table 2. AC_n and AG_n repeats are over-represented in five and nine species, respectively, and both are underrepresented only in D. hansenii var. hansenii. The situation is opposite for AT_n repeats which are underrepresented in eight species (including five of the previous ones) and over-represented only in D. hansenii var. hansenii. Finally, CG_n repeats are rare in all species except K. thermotolerans and P. angusta, where they are significantly over-represented. Interestingly, the three closest species to S. cerevisiae (S. bayanus var. uvarum, S. exiguus, and S. servazii) show similar distributions of dinucleotide repeats, except for the excess of AG_n repeats in S. bayanus var. uvarum. The most frequent dinucleotide repeats for each species are represented in Fig. 1.

Table 2 Frequencies of occurrence of each non-monotonous dinucleotide repeat in the hemiascomycetous species

Full size table

We also compared the frequency of each dinucleotide’s occurrence to the frequency of the corresponding dinucleotide repeat. There is no obvious correlation between dinucleotide frequencies and the corresponding dinucleotide repeat frequencies, except for K. thermotolerans which exhibits the highest frequency of CG dinucleotide and of CG dinucleotide repeats (Table 2). This shows that the frequency of each dinucleotide repeat is not determined by the genome average composition, suggesting that active mechanisms regulate the distribution of dinucleotide repeats in each genome.

Similarly, each of the 10 non-monotonous trinucleotide repeats was scored, and results are detailed in Table 3. The distribution of each kind of trinucleotide repeat is more homogeneous, as compared to S. cerevisiae. However, it is worth noting that AAT_n repeats are over-represented S. exiguus, S. servazii, Z. rouxii, and D. hansenii var. hansenii and that ATC_n repeats are underrepresented in K. marxianus var. marxianus, P. angusta, and Y. lipolytica. The two more distantly-related species, P. angusta and Y. lipolytica, show the greatest divergence as compared to S. cerevisiae, with four trinucleotide repeats out of ten exhibiting significantly different frequencies of occurrence. The most frequent trinucleotide repeats are represented in Fig. 1.

Table 3 Frequencies of occurence of each non-monotonous trinucleotide repeat in the hemiascomycetous species

Full size table

We have also examined the length distribution of di- and trinucleotide repeats. Mean lengths of di- and trinucleotide repeats in the 13 hemiascomycetous species range 8–12 repeat units and are not significantly different from that of S. cerevisiae (11 repeat units). Long repeats are less frequent than short ones in all species. If dinucleotide repeat lengths were not subject to any selection, we would expect to observe an exponential decrease of the number of dinucleotide repeats belonging to each length class. By fitting the observed distribution to an exponential curve, we calculated the expected number of dinucleotide repeats of each length class and for each of the 14 yeast species. We found that the most frequently over-represented length class is eight repeat units (11 species out of 14), followed by nine and 10 repeat units (seven species out of 14). For trinucleotide repeats, we did not observe an overrepresentation of a particular length class.

Amino Acids Encoded by Trinucleotide Repeats

Trinucleotide repeats located within a protein-coding sequence encode repetitions of amino acids. For each hemiascomycetous species, we analyzed all trinucleotide repeats occurring within the 339 ORFs identified by sequence homology to a S. cerevisiae ORF. The corresponding amino acids were examined and their frequency of occurrence computed (Fig. 2). A total of 15 distinct amino acids were found, the most common ones being Glutamine, Aspartic acid, Asparagine, and Glutamic acid. Phenylalanine, Tyrosine, Tryptophane, Isoleucine, and Methionine are never found.

Because the reading frames for each yeast species were determined by comparing the hemiascomycete sequence to the S. cerevisiae genome, we had more matches between S. bayanus var. uvarum and S. cerevisiae than with other, more distant species. Therefore, in order to have numbers high enough to do statistical tests, we pooled the other 12 species (Tables 4 and 5). First, we noticed that the three possible frames of the same triplet (e.g., AAC, CAA, and ACA) are not used with the same frequency. AAC is more frequently encountered in S. bayanus var. uvarum and CAA is more frequently encountered in other species than ACA. We observe that GAA, AAT, CAG, GCT, GAC, and GAT are more represented than the other frames (Table 4). This phenomenon probably reflects an effect of amino acid selection on some of the frames because, in terms of DNA sequence, the three possible frames are equivalent. Second, a given amino acid is preferentially encoded by one of its synonymous codons. For example, poly-Glu are preferentially encoded by GAA repeats, poly-Asp by GAT (although GAC is also frequently found), and poly-Gln by CAA in all species except S. bayanus var. uvarum (Table 4). Interestingly, poly-Asn are preferentially encoded by AAC in S. bayanus var. uvarum and by AAT in other species. We then compared these numbers to the Relative Synonymous Codon Usage (RSCU) of each triplet (Lloyd and Sharp 1992) for S. bayanus var. uvarum only, because the numbers were high enough to perform statistical tests (Table 5). Asparagine is preferentially encoded by AAC and Glutamine by CAG in trinucleotide repeats, opposite to what is observed in the overall genome. This is to be compared with human neurological disorders where poly-Glutamine repeats are always encoded by CAG repeats and never by CAA (Orr 2001). We concluded that for a given amino acid, the repeated codon is not determined by its absolute frequency in the genome, but by cis- or trans-acting factors which favor its presence within a repeated DNA sequence.

Table 4 Number of occurrences of each amino acid encoded by trinucleotide repeats in S. bayanus var. uvarum (Su) or in the other 12 species (others)

Full size table

Table 5 Comparisons between the expected number of codons (Exp.) (calculated by dividing the RSCU of a codon by the number of synonymous codons and then multiplying by the total number of occurrence of each amino acid), and the observed number (Obs.) of such codons, in S. bayanus var uvarum

Full size table

Conservation of Amino Acid Stretches Among the 13 Hemiascomycetes

From the alignments of each of the different hemiascomycetous sequences with S. cerevisiae sequences we could determine whether the amino acid repeats were conserved or not between the two yeast species. We found conservation in 64% of the cases and absence of conservation in 36% of the cases. Figure 3 shows some examples. In NOG1 and SSN3 gene products, the poly-Glu and poly-Ala stretches are perfectly conserved at the same position in all other yeast species where the gene was sequenced. In NAP1 and RPP2B gene products, the amino acid repeats are still detectable in all yeast species where the genes were sequenced but are not very well conserved and the length of the stretch changes between species. The YLR104w ORF product shows an interesting example of conservation of the repeat in four species out of five. Finally, one of the most intriguing cases we found was the PAB1 (YER165w) gene product. In all five species, this protein contains a poly-Ala. A poly-Gln repeat is only present in K. lactis, but two glutamines are present at the same position in the other species. The total length of the two repeats is similar in the different species, suggesting that the length of the repeat on the function and/or stability of the protein.

Trinucleotide Repeats are Over-Represented in Ascomycete-Specific Genes

In S. cerevisiae, trinucleotide repeats are often found in genes encoding products localized at the nucleus (Richard and Dujon 1997; Young et al. 2000), particularly transcription factors (Alba et al. 1999). In order to determine whether this phenomenon is specific to one or a few yeasts or is widespread throughout evolution, we examined the distribution of the 574 S. cerevisiae ORFs previously defined as containing trinucleotide repeats (Richard and Dujon 1997). We used the set of 1571 ascomycete-specific genes, as defined by Malpertuy et al. (2000), the set of 770 genes encoding nuclear products, as defined in the MIPS database, and the set of 101 transcription factors from the Yeast Protein Database (Fig. 4). This analysis confirmed that nuclear factors are over-represented among genes containing trinucleotide repeats, with twice as many observed occurrences (142) than expected (71). The bias is even stronger for the subset of genes encoding transcription factors (nine expected; 27 observed). The novelty is that genes containing trinucleotide repeats are also more frequently observed than expected among the set of ascomycete-specific genes (145 expected; 200 observed). Note that genes encoding transcription factors are themselves found more often than expected in ascomycete-specific genes, as previously observed by Malpertuy et al. (2000). The functions of the 27 transcription factors containing trinucleotide repeats are detailed in Table 6. Out of 27 proteins, 25 have been experimentally shown to interact directly with DNA in the yeast nucleus. By contrast, we did not find an over-representation of trinucleotide repeats in genes encoding the nuclear pore components or involved in nuclear export/import functions, nor in genes encoding DNA replication and/or DNA repair functions (data not shown).

Table 6 Function of the 27 transcription factors containing trinucleotide repeats, according to the Yeast Protein Database

Full size table

Discussion

Microsatellites Are De Novo Created and Lost During Evolution

In the present work, we compared the distribution of microsatellites in the genomes of 13 hemiascomycetous yeast species. Although the sequence data used are only partial (0.2–0.4 X coverage only; see Souciet et al. 2000), we believe that they represent a statistically valid sample of the entire genomes because the sequenced libraries are essentially unbiased (Tekaia et al. 2000). In addition, results previously obtained with 30% of the S. cerevisiae genome (Richard and Dujon 1996), gave the same distribution as the whole genome sequence (Richard and Dujon 1997).

It is clear that the four species that are closest to S. cerevisiae are strongly biased toward AT_n repeats, like two less-related species, K. lactis and D. hansenii var. hansenii (Table 2 and Fig. 1). If we take di- and trinucleotide repeats into consideration, only four species show similar distributions for both: S. cerevisiae, S. bayanus var. uvarum, Z. rouxii, and K. lactis. The other ten species all exhibit unique distributions, a sort of molecular signature of their genome. This suggest not only that microsatellites evolve at a rapid pace, but also that they are created de novo and lost during evolution of living organisms. Mechanisms that alter microsatellite size involve replication slippage during S-phase DNA synthesis unrepaired by the mismatch repair system (reviewed in Kolodner 1996; Modrich and Lahue 1996), or repair slippage during gene conversion of a microsatellite (reviewed in Richard and Pâques 2000). It is not clear what kind of mechanism could de novo create microsatellites. Zhu et al. (2000) suggested that replication slippage could be important even earlier in a microsatellite lifetime, expanding even very short microsatellites, but at least two tandem repeats are needed to get an expansion due to slippage, nevertheless. Gene conversion can lead to insertion of a microsatellite at a locus previously devoid of microsatellite (Richard et al. 1999; Richard et al. 2000), and could be a possible mechanism to spread microsatellites at new locations in the genome. Among the possible mechanisms responsible for loss of microsatellites, point mutations leading to degeneration of the repeat has been suggested (Zhu et al. 2000), but other mechanisms could also be involved. Figure 3 shows a possible example of a microsatellite loss by degeneration of the repeat, occurring in the NAP1 gene of S. cerevisiae and S. bayanus var. uvarum. Once point mutations occur within a microsatellite they will tend to stabilize the repeat, as was shown in mismatch repair-deficient human cells (Bacon et al. 2000). Size changes will thus become less frequent and accumulation of additional mutations will finally make the microsatellite undetectable.

Some dinucleotide-repeat length classes are over-represented in a majority of the analyzed species. If most, if not all, microsatellite length changes are produced by unrepaired slippage of DNA polymerases during replication or repair, there should be an equilibrium depending on: i) repeat length (long repeats have a higher mutation rate than short repeats; Tran et al. 1997; Wierdl et al. 1997); ii) mutations in replication and mismatch repair factors that destabilize microsatellites (Kokoska et al. 1999; Kokoska et al. 1998; Strand et al. 1993; Tran et al. 1997). Xu et al. (2000) showed that the rate of tetranucleotide repeat expansions in humans is constant for all alleles, whereas the rate of contractions increase exponentially with repeat length. Overrepresentation of eight-unit dinucleotide repeats suggests that, at equilibrium, forces driving expansions and contractions of dinucleotides would be equally balanced, leading to the accumulation of repeats of the critical length. Shorter repeats could correspond to young repeats, before equilibrium, or alternatively, old degenerated repeats.

Charged Amino Acids are Over-Represented in Trinucleotide Repeats Contained in Transcription Factors

Four amino acids (D, E, N, and Q) are most frequently found in ORFs containing trinucleotide repeats in all 13 hemiascomycetes studied. The same four amino acids were found in S. cerevisiae in addition to Serine being found frequently in short trinucleotide repeat arrays (Alba et al. 1999; Richard and Dujon 1996, 1997; Young et al. 2000). As previously observed (Richard and Dujon 1996), the three possible frames of a given repeat have distinct different frequencies of occurrence suggesting structural (or functional) contingencies at the protein level and not at the DNA level. Poly-Methionine are probably counter-selected to avoid possible interference with the translation machinery and the start codon.

From database compilation, Karlin and Burge (1996) first observed that proteins containing long homopeptide stretches were often playing a role in developmental functions, particularly in Drosophila. They encode long homopolymers of Gln, Asn, and His. Proteins containing long poly-Gln or poly-Asn tracts tend to aggregate. The best known examples are proteins involved in human neurodegenerative disorders, like Huntingtin (Scherzinger et al. 1997), and prions like the yeast Sup35p (Santoso et al. 2000) or the mammalian PrP (Prusiner et al. 1998). In the 13 hemiascomycetous yeasts studied, like S. cerevisiae, transcription factors are more often than expected encoded by genes with trinucleotide repeats, especially in ascomycete-specific genes, that were shown to diverge more rapidly during evolution (Malpertuy et al. 2000). Transcription factors might be more flexible than other proteins. By adding homopolymers of charged amino acids, biochemical properties of a transcription factor can be changed, modifying its interaction with DNA, with other DNA binding proteins, or with other transcription factors. This modified protein can then be selected for its new function, allowing the cell to increase diversity among its transcription factors, to specialize them, to adapt to a new environment, and eventually to speciate. The overrepresentation of hemiascomycete genes among these transcription factors containing trinucleotide repeats supports this hypothesis of fast evolving genes.

References

MM Alba MF Santibanez-Koref JM Hancock (1999) ArticleTitleAmino acid reiterations in yeast are overrepresented in particular classes of proteins and show evidence of a slippage-like mutational process. J Moll Evol 49 789–797 Occurrence Handle1:CAS:528:DC%2BD3cXks1Gq
CAS Google Scholar
AL Bacon SM Farrington MG Dunlop (2000) ArticleTitleSequence interruptions confer differential stability at microsatellite alleles in mismatch repair-deficient cells. Hum Mol Genet 9 2707–2713 Occurrence Handle1:CAS:528:DC%2BD3cXosVKktbo%3D Occurrence Handle11063729
CAS PubMed Google Scholar
G Benson MS Waterman (1994) ArticleTitleA method for fast database search for all k-nucleotide repeats. Nuc Acids Res 22 4828–4836 Occurrence Handle1:CAS:528:DyaK2MXitlSns74%3D
CAS Google Scholar
J Bowers J-M Boursiquot P This K Chu H Johansson C Meredith (1999) ArticleTitleHistorical genetics: the parentage of Chardonnay, Gamay, and other wine grapes of northeastern France. Science 285 1562–1565 Occurrence Handle10.1126/science.285.5433.1562 Occurrence Handle1:CAS:528:DyaK1MXlslOrt7g%3D Occurrence Handle10477519
Article CAS PubMed Google Scholar
B Charlesworth P Sniegowski W Stephan (1994) ArticleTitleThe evolutionary dynamics of repetitive DNA in eukaryotes. Nature 371 215–220 Occurrence Handle1:CAS:528:DyaK2cXmt1emu7o%3D Occurrence Handle8078581
CAS PubMed Google Scholar
H Debrauwère CG Gendrel S Lechat M Dutreix (1997) ArticleTitleDifferences and similarities between various tandem repeat sequences: minisatellites and microsatellites. Biochimie 79 577–586 Occurrence Handle9466695
PubMed Google Scholar
C Dib S Faure C Fizames D Samson N Drouot A Vignal et al. (1996) ArticleTitleA comprehensive genetic map of the human genome based on 5,264 sequences. Nature 380 152–154 Occurrence Handle8600387
PubMed Google Scholar
C Hennequin A Thierry G-F Richard G Lecointre HV Nguyen C Gaillardin B Dujon (2001) ArticleTitleMicrosatellite typing as a new tool for identification of Saccharomyces cerevisiae strains. J Clin Microbiol 39 551–559 Occurrence Handle1:CAS:528:DC%2BD3MXhtlGrtb0%3D Occurrence Handle11158105
CAS PubMed Google Scholar
H Innan R Terauchi NT Miyashita (1997) ArticleTitleMicrosatellite polymorphism in natural populations of the wild plant Arabidopsis thaliana. Genetics 146 1441–1452 Occurrence Handle1:STN:280:ByiH3czjtFM%3D Occurrence Handle9258686
CAS PubMed Google Scholar
S Karlin C Burge (1996) ArticleTitleTrinucleotide repeats and long homopeptides in genes and proteins associated with nervous system disease and development. Proc Natl Acad Sci USA 93 1560–1565 Occurrence Handle1:CAS:528:DyaK28Xht1ensbc%3D Occurrence Handle8643671
CAS PubMed Google Scholar
RJ Kokoska L Stefanovic AB Buermeyer RM Liskay TD Petes (1999) ArticleTitleA mutation of the yeast gene encoding PCNA destabilizes both microsatellite and minisatellite DNA sequences. Genetics 151 511–519 Occurrence Handle1:CAS:528:DyaK1MXhtlChu7k%3D Occurrence Handle9927447
CAS PubMed Google Scholar
RJ Kokoska L Stefanovic HT Tran MA Resnick DA Gordenin TD Petes (1998) ArticleTitleDestabilization of yeast micro- and minisatellite DNA sequences by mutations affecting a nuclease involved in Okazaki fragment processing (rad27) and DNA polymerase δ (pol3-t). Mol Cell Biol 18 2779–2788 Occurrence Handle1:CAS:528:DyaK1cXis1Gnu7w%3D Occurrence Handle9566897
CAS PubMed Google Scholar
AT Lloyd PM Sharp (1992) ArticleTitleEvolution of codon usage patterns: the extent and nature of divergence between Candida albicans and Saccharomyces cerevisiae. Nuc Acids Res 20 5289–5295 Occurrence Handle1:CAS:528:DyaK3sXjtlyqsA%3D%3D
CAS Google Scholar
FV Lunel L Licciardello S Stefani HA Verbrugh W Melchers JF Meis et al. (1998) ArticleTitleLack of consistent short sequence repeat polymorphisms in genetically homologous colonizing and invasive Candida albicans strains. J Bacteriol 180 3771–3778 Occurrence Handle1:CAS:528:DyaK1cXlt1ChsLw%3D Occurrence Handle9683470
CAS PubMed Google Scholar
A Malpertuy F Tekaia S Casaregola M Aigle F Artiguenave G Blandin et al. (2000) ArticleTitleGenomic exploration of the hemiascomycetous yeasts: 19. Ascomycetes-specific genes. FEBS Lett 487 113–121 Occurrence Handle10.1016/S0014-5793(00)02290-0 Occurrence Handle1:CAS:528:DC%2BD3MXisVSmtg%3D%3D Occurrence Handle11152894
Article CAS PubMed Google Scholar
HT Orr (2001) ArticleTitleBeyond the Qs in the polyglutamine diseases. Genes Dev 15 925–932 Occurrence Handle1:CAS:528:DC%2BD3MXjtVOku7w%3D Occurrence Handle11316786
CAS PubMed Google Scholar
SB Prusiner MR Scott SJ DeArmond FE Cohen (1998) ArticleTitlePrion protein biology. Cell 93 337–348 Occurrence Handle1:CAS:528:DyaK1cXjtFCjtLw%3D Occurrence Handle9590169
CAS PubMed Google Scholar
G-F Richard B Dujon (1996) ArticleTitleDistribution and variability of trinucleotide repeats in the genome of the yeast Saccharomyces cerevisiae. Gene 174 165–174 Occurrence Handle1:CAS:528:DyaK28XmtV2iur8%3D Occurrence Handle8863744
CAS PubMed Google Scholar
G-F Richard B Dujon (1997) ArticleTitleTrinucleotide repeats in yeast. Res Microbiol 148 731–744 Occurrence Handle1:CAS:528:DyaK1cXptlGnsg%3D%3D Occurrence Handle9765857
CAS PubMed Google Scholar
G-F Richard C Hennequin A Thierry B Dujon (1999) ArticleTitleTrinucleotide repeats and other microsatellites in yeasts. Res Microbiol 150 589–602 Occurrence Handle1:CAS:528:DC%2BD3cXoslWnsQ%3D%3D Occurrence Handle10672999
CAS PubMed Google Scholar
G-F Richard F Pâques (2000) ArticleTitleMini- and microsatellite expansions: the recombination connection. EMBO Reports 1 122–126 Occurrence Handle1:CAS:528:DC%2BD3cXnt1yrs7c%3D Occurrence Handle11265750
CAS PubMed Google Scholar
RI Richards (2001) ArticleTitleDynamic mutations: a decade of unstable expanded repeats in human genetic disease. Hum Mol Genet 10 2187–2194
Google Scholar
MS Röder V Korzun K Wendehake J Plaschke M-H Tixier P Leroy MW Ganal (1998) ArticleTitleA microsatellite map of wheat. Genetics 149 2007–2023 Occurrence Handle9691054
PubMed Google Scholar
T Sakamoto RG Danzmann K Gharbi P Howard A Ozaki SK Khoo et al. (2000) ArticleTitleA microsatellite linkage map of rainbow trout (Oncorhynchus mykiss) characterized by large sex-specific differences in recombination rates. Genetics 155 1331–1345 Occurrence Handle1:CAS:528:DC%2BD3cXlsFSku7Y%3D Occurrence Handle10880492
CAS PubMed Google Scholar
A Santoso P Chien LZ Osherovich JS Weissman (2000) ArticleTitleMolecular basis of a yeast prion species barrier. Cell 100 277–288 Occurrence Handle1:CAS:528:DC%2BD3cXotFChug%3D%3D Occurrence Handle10660050
CAS PubMed Google Scholar
E Scherzinger R Lurz M Turmaine L Mangiarini B Hollenbach R Hasenbank et al. (1997) ArticleTitleHuntingtin-encoded polyglutamine expansions form amyloid-like protein aggregates in vitro and in vivo. Cell 90 549–558 Occurrence Handle1:CAS:528:DyaK2sXlsVKrsL4%3D Occurrence Handle9267034
CAS PubMed Google Scholar
J Souciet M Aigle F Artiguenave G Blandin M Bolotin-Fukuhara E Bon et al. (2000) ArticleTitleGenomic exploration of the hemiascomycetous yeasts: 1. A set of yeast species for molecular evolution studies. FEBS Lett 487 3–12 Occurrence Handle11152876
PubMed Google Scholar
M Strand TA Prolla RM Liskay TD Petes (1993) ArticleTitleDestabilization of tracts of simple repetitive DNA in yeast by mutations affecting DNA mismatch repair. Nature 365 274–276 Occurrence Handle1:CAS:528:DyaK3sXms1equr0%3D Occurrence Handle8371783
CAS PubMed Google Scholar
F Tekaia G Blandin A Malpertuy B Llorente P Durrens C Toffano-Nioche et al. (2000) ArticleTitleGenomic exploration of the hemiascomycetous yeasts: 3. Methods and strategies used for sequence analysis and annotation. FEBS 487 17–30 Occurrence Handle1:CAS:528:DC%2BD3MXisVSgug%3D%3D Occurrence Handle11152878
CAS PubMed Google Scholar
G Toth Z Gaspari J Jurka (2000) ArticleTitleMicrosatellites in different eukaryotic genomes: survey and analysis. Genome Res 10 967–981 Occurrence Handle10899146
PubMed Google Scholar
HT Tran JD Keen M Kricker MA Resnick DA Gordenin (1997) ArticleTitleHypermutability of homonucleotide runs in mismatch repair and DNA polymerase proofreading yeast mutants. Mol Cell Biol 17 2859–2865 Occurrence Handle1:CAS:528:DyaK2sXislCms7g%3D Occurrence Handle9111358
CAS PubMed Google Scholar
M Wierdl M Dominska TD Petes (1997) ArticleTitleMicrosatellite instability in yeast: dependence on the length of the microsatellite. Genetics 146 769–779 Occurrence Handle1:CAS:528:DyaK2sXmsVartLw%3D Occurrence Handle9215886
CAS PubMed Google Scholar
X Xu M Peng Z Fang X Xu (2000) ArticleTitleThe direction of microsatellite mutations is dependent upon allele length. Nat Genet 24 396–399 Occurrence Handle10.1038/74238 Occurrence Handle1:CAS:528:DC%2BD3cXisVCjsbc%3D Occurrence Handle10742105
Article CAS PubMed Google Scholar
ET Young JS Sloan K Van Riper (2000) ArticleTitleTrinucleotide repeats are clustered in regulatory genes in Saccharomyces cerevisiae. Genetics 154 1053–1068 Occurrence Handle1:CAS:528:DC%2BD3cXitVOiu7k%3D Occurrence Handle10757753
CAS PubMed Google Scholar
Y Zhu DC Queller JE Strassmann (2000) ArticleTitleA phylogenetic perspective on sequence evolution in microsatellite loci. J Mol Evol 50 324–338 Occurrence Handle1:CAS:528:DC%2BD3cXivFyjtbc%3D Occurrence Handle10795824
CAS PubMed Google Scholar

Download references

Acknowledgements

We are grateful to our colleagues of the Génolevures project (http://cbi.labri.u-bordeaux.fr/Genolevures ) for giving us access to their sequence data before publication. We thank Florence Ricard, Gäelle Blandin, and Fredj Tekaia for excellent computer assistance and our colleagues of the Unité de Génétique Moléculaire des Levures for fruitful discussions. This work was supported by CNRS as part of the GDR 2354. B.D. is a member of the Institut Universitaire de France.

Author information

Authors and Affiliations

Unité de Génétique Moléculaire des Levures (URA 2171 CNRS, UFR 927 Université Pierre et Marie Curie), Institut Pasteur, 25 rue du Dr. Roux, 75724 Paris cedex 15, France
Alain Malpertuy, Bernard Dujon & Guy-Franck Richard

Authors

Alain Malpertuy
View author publications
You can also search for this author in PubMed Google Scholar
Bernard Dujon
View author publications
You can also search for this author in PubMed Google Scholar
Guy-Franck Richard
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Guy-Franck Richard.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Malpertuy, A., Dujon, B. & Richard, GF. Analysis of Microsatellites in 13 Hemiascomycetous Yeast Species: Mechanisms Involved in Genome Dynamics . J Mol Evol 56, 730–741 (2003). https://doi.org/10.1007/s00239-002-2447-5

Download citation

Received: 08 October 2002
Accepted: 21 January 2003
Issue Date: June 2003
DOI: https://doi.org/10.1007/s00239-002-2447-5

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Analysis of Microsatellites in 13 Hemiascomycetous Yeast Species: Mechanisms Involved in Genome Dynamics

Abstract

Similar content being viewed by others

Evolution of Nine Microsatellite Loci in the Fungus Fusarium oxysporum

Imperfect and Compound Microsatellites in the Genomes of Burkholderia pseudomallei Strains

Patterns of microsatellite distribution across eukaryotic genomes

Introduction

Materials and Methods

Results

Sequence and Length Distributions of Microsatellites in the 13 Hemiascomycetous Species

Amino Acids Encoded by Trinucleotide Repeats

Conservation of Amino Acid Stretches Among the 13 Hemiascomycetes

Trinucleotide Repeats are Over-Represented in Ascomycete-Specific Genes

Discussion

Microsatellites Are De Novo Created and Lost During Evolution

Charged Amino Acids are Over-Represented in Trinucleotide Repeats Contained in Transcription Factors

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Analysis of Microsatellites in 13 Hemiascomycetous Yeast Species: Mechanisms Involved in Genome Dynamics

Abstract

Similar content being viewed by others

Evolution of Nine Microsatellite Loci in the Fungus Fusarium oxysporum

Imperfect and Compound Microsatellites in the Genomes of Burkholderia pseudomallei Strains

Patterns of microsatellite distribution across eukaryotic genomes

Introduction

Materials and Methods

Results

Sequence and Length Distributions of Microsatellites in the 13 Hemiascomycetous Species

Amino Acids Encoded by Trinucleotide Repeats

Conservation of Amino Acid Stretches Among the 13 Hemiascomycetes

Trinucleotide Repeats are Over-Represented in Ascomycete-Specific Genes

Discussion

Microsatellites Are De Novo Created and Lost During Evolution

Charged Amino Acids are Over-Represented in Trinucleotide Repeats Contained in Transcription Factors

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation