Sugar molecules called glycans cover the outer surface of all cells, and their composition and chemistry influence interactions between molecules, cells, and individuals. Glycans show tremendous natural variation (Varki et al. 2009). Their structural properties allow many different forms and modifications, and their regulation can be fine-tuned spatially and temporally. Glycans are often variable within and between species, suggesting that they can evolve rapidly, perhaps in response to evolutionary conflict (Gagneux and Varki 1999; Springer and Gagneux 2013). Some of the first natural genetic polymorphisms discovered are now known to be glycan variants. Blood groups are a classic example (Haldane 1940). The ABO blood group results from antigenicity against different glycosylation variants and may be maintained by frequency-dependent selection, since pathogens target these molecules during infection (Ségurel et al. 2012). Natural glycan variants can thus have important effects on fitness, and glycan biology is a ubiquitous but understudied aspect of phenotypic evolution.

The first unique human biochemical difference discovered is also a glycan. Humans lack the sialic acid Neu5Gc. Sialylated human N-glycans, O-glycans, and gangliosides instead terminate with the precursor form—Neu5Ac (Chou et al. 1998). The glycans of chimpanzees, gorillas, and other mammals contain both Neu5Gc and Neu5Ac, millions of each molecule on every cell. Sialic acids contribute to self-recognition by the innate immune system, and binding by sialic acid-binding immunoglobulin-type lectin (Siglec) proteins modulates the intensity of the innate immune response. Pathogens exploit these signals by mimicking or co-opting sialic acids to enter cells or suppress the immune response (Varki 2011). Some human pathogens have evolved receptors that recognize Neu5Ac specifically, and some diseases shared by other primates cannot infect humans and vice versa. Human sialic acid biology has changed extensively to compensate for the absence of Neu5Gc (Varki 2010). Sialic acid-binding proteins like Siglecs often evolve by positive selection, and many have unique human mutations. A range of human immune phenotypes—including inflammation, cancer, and even reproduction—differ from our closest relatives because of this change in cell surface chemistry (Pearce et al. 2014; Ghaderi et al. 2011; Pham et al. 2009).

The absence of endogenous Neu5Gc in human tissues is caused by inactivation of cytidine monophosphate-N-acetylneuraminic acid hydroxylase (CMAH), the enzyme that adds a hydroxyl group to cytidine monophosphate (CMP)-Neu5Ac, forming CMP-Neu5Gc. The ability to synthesize Neu5Gc was eliminated in hominids ~3 million years ago (mya) when an Alu element deleted exon 6 of the CMAH gene (Hayakawa et al. 2006). Polymorphism data suggest that the CMAH(−) allele rapidly rose to high frequency in the ancestral hominid population. Selection apparently favored this loss-of-function allele despite its many pleiotropic effects on phenotype (Ghaderi et al. 2011). We might therefore expect to find other organisms where CMAH function has been lost, and indeed, birds and some monotremes lack Neu5Gc (Schauer et al. 2009), but to date, all other mammals tested can synthesize Neu5Gc.

New World monkeys, the platyrrhines, are a radiation of 135 extant species that descended from a common ancestor ~30 mya in South and Central America and have evolved an amazing diversity of forms, colors, and behaviors (Perez et al. 2013; Santana et al. 2012; Perelman et al. 2011). New World monkeys have become important disease models, in part because they can be infected by some human pathogens. For example, the malaria parasite Plasmodium falciparum binds specifically to Neu5Ac, cannot bind to Neu5Gc (Martin et al. 2005; Varki and Gagneux 2009), and is capable of infecting humans and New World monkeys but does not infect other primates (Ward and Vallender 2012; Galland 2000). Using this as a clue, we examined the sialic acid composition of New World monkey tissues by anti-Neu5Gc Western blot, high-pressure liquid chromatography (HPLC), and mass spectrometry to show that, like humans, New World monkeys do not make endogenous Neu5Gc (Fig. 1). Genomic information from the marmoset (Callithrix jacchus) and squirrel monkey (Saimiri boliviensis) and direct sequencing of CMAH exon 9 from several other New World monkey species reveal an inversion of exons 4 to 13 and an inactivated CMAH gene. This independent loss occurred at the base of the platyrrhine radiation, ~30 mya, compared with ~3 mya in ancestral hominids.

Fig. 1

Neu5Gc loss by CMAH inactivation in New World monkeys. a Species assayed for the absence of Neu5Gc in tissues (yellow) or with CMAH sequence (green). All five platyrrhine families are represented. b New World monkey glycoproteins do not bind to anti-Neu5Gc antibody (af). M is a size standard, Neu5Gc(+) is chimpanzee milk, and Neu5Gc(−) is human milk. c, d HPLC and mass spectrometry do not detect Neu5Gc in New World monkey tissues. Representative results from cow and owl monkey are shown. e CMAH was inactivated by an inversion that eliminated exons 4 to 13. Exon 9 is on the strand opposite the remaining exons. f Frame shifts and stop codons (red) disrupt CMAH exon 9. g New World monkeys lost Neu5Gc ~30 million years ago by an independent inactivation of CMAH, the same gene that caused humans to lose Neu5Gc ~3 mya

The absence of Neu5Gc from New World monkeys expands their usefulness as disease models and also raises awareness of a new conservation risk. Their sialic acid composition makes New World monkeys susceptible to some human pathogens that interact with sialic acids. This independent loss of CMAH and Neu5Gc in hominids and platyrrhines is also an interesting case of parallel evolution at the interface of pathogen interaction and self-recognition by the innate immune system. Interactions with pathogens have driven the rapid evolution of Siglecs in the human lineage (Varki 2010). Siglec evolution and its influence on self-recognition can now be subject to comparative analysis, to study the range of responses that follow Neu5Gc loss and the repeatability of evolution following the parallel loss of a self-signal. Because New World monkeys have had 10 times longer to compensate (~30 vs. ~3 million years), their sialic acid biology may suggest new treatments for human diseases that result from our unusual sialic acid composition.

The absence of Neu5Gc can no longer be considered a unique feature of human glycans. New World monkeys also lack this key signal of self. This parallel evolutionary change will broaden our understanding of the many roles of glycans in health and disease, allow new tests of selection and constraint in primate evolution, and refine our ability to protect the fantastic diversity of New World monkeys.

Materials and methods

Neu5Gc detection

New World monkey tissues were obtained from John Kaas (Vanderbilt University), The San Diego Zoological Society, and Alison Muotri (UC San Diego). We analyzed the sialic acid (Sia) content of New World monkey tissues by anti-Neu5Gc Western blot (Fig. 1b), HPLC (Figs. 1c and 2), and mass spectrometry (Figs. 1d and 2). Neu5Gc could not be detected in New World monkey tissues by any of these three methods.

Fig. 2

Neu5Gc is absent from New World monkey tissues. HPLC and mass spectrometry analysis of New World monkey tissue samples. Neu5Gc(+) positive control (blue) is from cow (Bos). HPLC absorbance range is from 0 to 1 × 10⁵ units on all plots. All New World monkey samples (red) are missing the peak that indicates the presence of Neu5Gc (elution time ~8.5 min). Mass spectrometry intensity range is from 0 to 8 × 10⁶ on all plots. All New World monkey samples are missing fragments associated with Neu5Gc (m/z ratio ~440)

Anti-Neu5Gc antibody Western blots

We used 250 μg of spleen or liver tissue, which is typically rich in Neu5Gc in mammals. Samples were placed in 1 mL lysis buffer (PBS with 0.2 % Triton X-100) and homogenized on a Polytron (three 20 s bursts). Tissue lysates were ultracentrifuged (100,000 g, 1 h). Protein concentrations in the resulting supernatant were determined by BCA (Pierce). Samples were run on a 10 % denaturing SDS-PAGE gel (10 μg of protein extract per well). Consistent loading across samples was verified by Ponceau staining. Separated proteins were transferred to a nitrocellulose membrane and blocked overnight in Tris-buffered saline with Tween (TBS-T). Western blots were incubated with affinity-purified anti-Neu5Gc chicken IgY (20 °C, 2 h) (Diaz et al. 2009). Membranes were washed in TBS-T (five washes, 5 min each) and incubated with HRP-conjugated goat anti-chicken secondary antibody (20 °C, 2 h). We ran two controls for antibody binding (nonspecific chicken IgY and secondary antibody only) and two controls for sialic acid composition [Neu5Gc(+) chimpanzee milk and Neu5Gc(−) human milk].

HPLC

Total Sia extracts were obtained by 2 M acetic acid hydrolysis. Released Sias were filtered through Microcon 10 spin columns (Millipore) by centrifugation (13,000 g, 20 min) and derivatized with 1,2-diamino-4,5-methylenedioxybenzene (DMB) reagent for 2.5 h at 50 °C in the dark. Derivatized Sias were separated by HPLC using a C18 reversed-phase column (Varian) and eluted (0.9 mL/min, 50 min) in an isocratic solvent (85 % water, 7 % methanol, 8 % acetonitrile) (Manzi et al. 1990). Sia standards were isolated from bovine submaxillary mucin using the same protocol.
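As a planning aid, the mobile-phase volume consumed by one isocratic run can be estimated from the flow rate, run time, and solvent proportions stated above. The short Python sketch below is illustrative only and is not part of the original protocol; it assumes the percentages are by volume, and the function name `mobile_phase_volumes` is hypothetical.

```python
# Values taken from the HPLC paragraph above; percentages are assumed to be v/v.
FLOW_ML_PER_MIN = 0.9
RUN_MIN = 50
SOLVENT_PERCENT = {"water": 85, "methanol": 7, "acetonitrile": 8}


def mobile_phase_volumes(flow_ml_per_min, run_min, percent_by_solvent):
    """Return the volume (mL) of each solvent consumed by one isocratic run."""
    if sum(percent_by_solvent.values()) != 100:
        raise ValueError("solvent percentages must sum to 100")
    total_ml = flow_ml_per_min * run_min
    return {name: total_ml * pct / 100 for name, pct in percent_by_solvent.items()}


volumes = mobile_phase_volumes(FLOW_ML_PER_MIN, RUN_MIN, SOLVENT_PERCENT)
for name, ml in volumes.items():
    print(f"{name}: {ml:.2f} mL per run")
```

Under these assumptions, each 50 min run at 0.9 mL/min uses 45 mL of mobile phase in total, before any allowance for priming or equilibration.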

Mass spectrometry

DMB derivatives of sialic acids were run on a Finnigan MAT HPLC with online mass spectrometry on an LCQ Mass Spectrometer System. The HPLC eluent from above was simultaneously monitored by absorbance at 373 nm and by mass spectrometry (capillary temperature 210 °C, capillary voltage 31 V, lens offset voltage 0 V). Mass spectra were acquired by scanning from m/z 150 to 2,000 in the positive ion mode. MS/MS was acquired by selecting the parent mass and using a 20 % normalized collision energy. Data analysis was performed with the manufacturer’s Xcalibur data analysis program.

Genomic information

Draft genome assemblies of marmoset (C. jacchus: WUGSC 3.2, March 2009) and squirrel monkey (S. boliviensis: saiBol1, October 2011) show that the CMAH gene has been disrupted in both species. Local synteny around CMAH is conserved despite a large inversion on the chromosome. CMAH is flanked by CANT1 and FAM65b as it is in other mammals, though there are other inversions along the chromosome. Chain-net alignments accessed with the UCSC genome browser identify CMAH exons 1–3, 9, 14, and 15 in marmoset and squirrel monkey. Several of the recognizable exons are truncated or contain premature stop codons. CMAH exons 4 to 8 and 10 to 13 are not identifiable in either species. CMAH exon 9 is recognizable but exists on the strand opposite the other CMAH exons. Thus, the center of the CMAH gene appears to have undergone an inversion. The cause of this event is not clear, but the inverted region is roughly 20,000 bp longer than the corresponding region in catarrhines and contains stretches of unrecognizable DNA in addition to regions with similarity to CMAH introns. BLAT searches do not find other regions with similarity to CMAH in either New World monkey genome.
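The exon-level observations above can be summarized programmatically. The following Python sketch is a minimal illustration, not the analysis pipeline used here: it encodes the exons reported as identifiable and their strand relative to the other CMAH exons, then derives the missing exons and the span of the disrupted region. The total of 15 exons is an assumption based on the highest exon number mentioned in the text, and the names `EXON_HITS`, `missing_exons`, and `contiguous_runs` are hypothetical.

```python
# Assumption: CMAH is treated as having 15 exons, the highest number named in the text.
TOTAL_EXONS = 15

# Exons identified in the chain-net alignments, with strand relative to the
# other CMAH exons ("+" = same strand, "-" = opposite strand), as described above.
EXON_HITS = {1: "+", 2: "+", 3: "+", 9: "-", 14: "+", 15: "+"}


def missing_exons(hits, total):
    """Exon numbers with no identifiable alignment."""
    return [exon for exon in range(1, total + 1) if exon not in hits]


def contiguous_runs(numbers):
    """Collapse sorted integers into (start, end) runs, e.g. [4, 5, 6] -> [(4, 6)]."""
    runs = []
    for n in sorted(numbers):
        if runs and n == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], n)
        else:
            runs.append((n, n))
    return runs


missing = missing_exons(EXON_HITS, TOTAL_EXONS)
inverted = sorted(exon for exon, strand in EXON_HITS.items() if strand == "-")
affected = sorted(set(missing) | set(inverted))
disrupted_span = (affected[0], affected[-1])

print("missing exon runs:", contiguous_runs(missing))
print("opposite-strand exons:", inverted)
print("disrupted span:", disrupted_span)
```

With the inputs above, the missing exons fall into two runs (4 to 8 and 10 to 13) on either side of the opposite-strand exon 9, which is consistent with a single disrupted region spanning exons 4 to 13.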

PCR and sequencing

We isolated DNA (DNeasy Blood and Tissue Kit; Qiagen) from various tissues of species representing four of the five platyrrhine families (all but Aotidae, which is a subclade of the Cebidae). PCR primers that bracket exon 9 were designed from Multiz alignments of several primate species (howlers: F 5′ ttcatgctctctgttcttcacc 3′, R 5′ tgcaacattttcctagcaacc 3′; other platyrrhines: F 5′ tttccttctcatgtcacattgc 3′, R 5′ tggaactttgctctatttctgc 3′). Diploid PCR products were amplified using standard reagents (Taq, buffer, and dNTPs; Invitrogen) and Sanger sequenced using standard protocols (Big Dye v3.1; Applied Biosystems). Exon sequences are shown in Fig. 1 and deposited in GenBank (accession numbers: X to X). Sequences of CMAH exon 9 contain stop codons and other disruptions in every platyrrhine species sequenced.
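For readers who wish to reuse these primers, basic properties can be checked with a few lines of Python. The sketch below is illustrative and was not part of the original study: the primer sequences are copied from the text above, while the helper names and the use of the Wallace rule (2 °C per A/T, 4 °C per G/C) are assumptions. The Wallace rule gives only a rough melting-temperature estimate for oligonucleotides of this length, so nearest-neighbor methods would be preferable for actual assay design.

```python
# Primer sequences (5' to 3') copied from the text; dictionary keys are hypothetical labels.
PRIMERS = {
    "howler_F": "ttcatgctctctgttcttcacc",
    "howler_R": "tgcaacattttcctagcaacc",
    "other_F": "tttccttctcatgtcacattgc",
    "other_R": "tggaactttgctctatttctgc",
}

COMPLEMENT = {"a": "t", "t": "a", "c": "g", "g": "c"}


def gc_percent(seq):
    """Percentage of G and C bases in the sequence."""
    seq = seq.lower()
    return 100 * (seq.count("g") + seq.count("c")) / len(seq)


def wallace_tm(seq):
    """Rough melting temperature (deg C) by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = seq.lower()
    at = seq.count("a") + seq.count("t")
    gc = seq.count("g") + seq.count("c")
    return 2 * at + 4 * gc


def reverse_complement(seq):
    """Reverse complement of a lowercase DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.lower()))


for name, seq in PRIMERS.items():
    print(
        f"{name}: length={len(seq)} nt, GC={gc_percent(seq):.1f} %, "
        f"Tm~{wallace_tm(seq)} C, revcomp={reverse_complement(seq)}"
    )
```

A check of this kind mainly confirms that the forward and reverse primers of each pair have similar lengths and estimated melting temperatures, which is a common prerequisite for amplifying with a single annealing temperature.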