Introduction

Convergent evolution, which occurs at levels ranging from the molecular to the behavioral, is the independent evolution of similar organismal traits from different ancestors under similar selective pressures in the environment (Zakon 2002). The dynamic of convergent evolution is to cope with a new stimulus and minimize unnecessary energy consumption when individuals face repetitive innocuous stimuli, which improves the survival of an organism (van Duijn 2017). Echolocation is a case of convergent evolution in which similar traits evolved through identical genetic changes in bats and dolphins (Liu et al. 2010). Convergent adaptation shows great potential for predicting evolution and for understanding how biological diversity arises. Gene loss is one way in which convergent evolution can arise (Branco et al. 2018).

G protein-coupled receptors (GPCRs), also known as seven-transmembrane (7TM) receptors, represent the largest and most diverse superfamily in vertebrates (Pedersen et al. 2018). Melanocortin 5 receptor (MC5R), a classic GPCR and one of five melanocortin receptor genes in placental mammals, is important in the regulation of energy homeostasis (Cone 2005). MC5R is expressed in terminally differentiated sebaceous glands, and MC5R knockout mice showed reduced water repellency after swimming (Chen et al. 1997). Interestingly, in both whales and manatees, sebaceous glands are lost or degenerate (Springer and Gatesy 2018), which may explain the reduced requirement for MC5R (Springer and Gatesy 2018; Wang et al. 2015). Apart from the loss of sebaceous glands and the reduction in hair, whales retain other water-dependent features, such as underwater communication and hearing, underwater birth, loss of scrotal testes, and dense limb bones to overcome buoyancy (Barklow 2004; Boisserie et al. 2011; Coughlin and Fish 2009; Gatesy 1997; Gatesy et al. 2013; Spaulding et al. 2009; Tsagkogeorga et al. 2015).

Gene loss caused by homologous recombination or by the insertion of transposons or retrotransposons has been reported in several studies (Dahal et al. 2018; Krom and Ramakrishna 2012; Sedivy and Sharp 1989). On the one hand, homologous recombination plays an important role in DNA template switching, and recombination is a common event leading to allelic loss (Henson et al. 1991). The loss of the agouti signaling protein gene in the lesser apes was mediated by unequal homologous recombination (Nakayama and Ishida 2006). Similarly, micro-homologous recombination of 5–25 base pairs efficiently repairs double-stranded breaks created during murine B lymphocyte development (Nussenzweig and Nussenzweig 2007). Homolog pairing, which plays a critical role in meiosis, poses a potential risk if it occurs in inappropriate tissues or between nonallelic sites, as it can lead to changes in gene expression, chromosome entanglements, and loss of heterozygosity due to mitotic recombination (Joyce et al. 2013). On the other hand, the insertion of transposons or retrotransposons can also lead to gene loss (Kanazawa et al. 2009). Transposons that encode the enzymes and proteins required for their own transposition are called autonomous transposons and can complete the transposition process independently (Wisman et al. 1998). The insertion of retrotransposons can induce mutations near or within genes (Niu et al. 2019). In senescent cells, the FAIRE signal is lost in promoters and enhancers of active genes and gained in heterochromatic gene-poor regions, which smooths the overall chromatin profile. Chromatin of the major retrotransposon classes, Alu, SVA, and LINE-1 (long interspersed nuclear element-1), becomes relatively more open in senescent cells, most strongly in the evolutionarily recent elements, leading to an increase in their transcription and ultimately their transposition (Cecco et al. 2013). Furthermore, retrotransposon-induced mutations are stable because of the replicative mechanism of retrotransposons (Monden et al. 2014).
LINE-1 is the largest retrotransposon family in the human genome, accounting for 17% of the genome, and is the only family capable of autonomous transposition (Goodier and Kazazian 2008). Active retrotransposons play an important role in biological evolution and species formation (Rajput 2015). For example, a heterozygous frameshift mutation was detected in a case of Meckel-Gruber syndrome, a rare ciliopathy, in which a LINE-1 insertion affected exon 7 of the paternally derived allele (Takenouchi et al. 2017). In normal somatic cells, however, host cells strictly control LINE-1 transposition in order to maintain genome stability (Ye et al. 2017).

Although MC5R was reported to be completely lost in whales and mostly deleted in manatees (Springer and Gatesy 2018), the molecular mechanism underlying MC5R loss remains poorly understood. In this study, we found different molecular mechanisms of MC5R loss in whales and manatees and revealed evidence of convergent adaptation to the marine environment.

Materials and methods

Database querying and BLAST searches

In Homo sapiens, the protein-coding sequence of MC5R is 978 nucleotides long. Protein-coding sequences of representative placental taxa that have annotated genomes (human: Homo sapiens; white-tufted-ear marmoset: Callithrix jacchus; mouse: Mus musculus; Norway rat: Rattus norvegicus; bottlenose dolphin: Tursiops truncatus; killer whale: Orcinus orca; beluga whale: Delphinapterus leucas; sperm whale: Physeter catodon; minke whale: Balaenoptera acutorostrata scammoni; Yangtze River dolphin: Lipotes vexillifer; wild yak: Bos mutus; cattle: Bos taurus; dog: Canis lupus familiaris; walrus: Odobenus rosmarus; Weddell seal: Leptonychotes weddellii; African savanna elephant: Loxodonta africana; and Florida manatee: Trichechus manatus latirostris) were extracted from NCBI. Sequences of Homo sapiens and Bos mutus were used as query sequences in BLAST searches against other placental mammals.

Phylogenetic analysis

The coding sequences downloaded from NCBI were aligned with Clustal W version 1.83 (Thompson and Higgins 1994) using default settings. Phylogenetic trees were constructed with the neighbor-joining (NJ) algorithm and 2000 bootstrap replicates in MEGA version 4 (Tamura et al. 2007).
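
As a rough illustration of what bootstrap resampling does to an alignment, the sketch below resamples alignment columns with replacement and recomputes pairwise p-distances for each replicate. It is not the MEGA implementation and does not build a neighbor-joining tree; the "closest pair" score is a simplified stand-in for clade support. The toy sequences, taxon labels, and function names are all hypothetical and are not taken from the MC5R data.

```python
import random


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Columns with a gap ('-') in either sequence are ignored.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(alignment):
    """Pairwise p-distances keyed by (name1, name2) with name1 < name2."""
    names = sorted(alignment)
    return {
        (m, n): p_distance(alignment[m], alignment[n])
        for i, m in enumerate(names)
        for n in names[i + 1:]
    }


def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


def closest_pair_support(alignment, pair, replicates=2000, seed=0):
    """Fraction of replicates in which `pair` has the smallest p-distance.

    Ties are resolved in favor of the first pair in sorted order, so this
    is only a crude proxy for bootstrap support on a real tree.
    """
    rng = random.Random(seed)
    target = tuple(sorted(pair))
    hits = 0
    for _ in range(replicates):
        dist = distance_matrix(bootstrap_replicate(alignment, rng))
        if min(dist, key=dist.get) == target:
            hits += 1
    return hits / replicates


# Hypothetical toy alignment (12 columns); not real MC5R sequence.
toy = {
    "cattle": "ATGAACTCCTCA",
    "yak": "ATGAACTCCTCA",
    "human": "ATGAATTCCTCC",
    "mouse": "ATGAGTTCTTCC",
}

if __name__ == "__main__":
    print(closest_pair_support(toy, ("cattle", "yak"), replicates=2000, seed=0))
```

In a real analysis, each replicate would be passed to the tree-building step, and support would be counted per clade rather than per closest pair.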

Results and discussion

Homologous recombination mechanism of MC5R loss in manatees

Marine mammals do not represent a distinct taxon or systematic grouping; they have no direct common ancestor, and their multilineage relationship is owing to convergent evolution (Jefferson et al. 1995). Based on molecular systematics, among marine mammals, whales belong to Cetartiodactyla and Trichechus manatus latirostris belongs to Sirenia (Springer and Gatesy 2018) (Fig. 1). Adaptation to the aquatic lifestyle varies considerably among marine mammal species. Whales and manatees are universally recognized as fully aquatic marine mammals. Hair is densely distributed in most mammalian species and is closely associated with the sebaceous glands (Springer and Gatesy 2018; Li et al. 2006). However, some taxa are largely hairless or have only sparse hair, including whales and manatees, both of which have lost MC5R (Folk and Semken 1991). During the evolution of whales and manatees, the function of sebaceous glands deteriorated slowly after the transition from land to sea, and after the sebaceous glands were lost, MC5R was lost or inactivated. Therefore, DNA sequences of the MC5R gene from various species were aligned to examine the molecular evolution of MC5R. In a large-scale comparison of 10 representative mammals and 6 whales, MC5R was found to be located between MC2R and RNMT in most mammals (Fig. 1). The genomic location of MC5R, flanked by RNMT and MC2R, is highly conserved in mammals (Fig. 1). In most mammals, such as Homo sapiens (human), Callithrix jacchus (white-tufted-ear marmoset), Mus musculus (mouse), Rattus norvegicus (Norway rat), Bos taurus (cattle), Canis lupus familiaris (dog), Odobenus rosmarus (walrus), Leptonychotes weddellii (Weddell seal), and Loxodonta africana (African savanna elephant), MC5R is intact (Fig. 1). However, MC5R is lost in whales and manatees, indicating that the level of MC5R retention varies among mammals (Fig. 1).
Only MC5R relics were detected in manatees, and no MC5R sequences could be found in whales (Tursiops truncatus, Orcinus orca, Delphinapterus leucas, Lipotes vexillifer, Physeter catodon, and Balaenoptera acutorostrata scammoni) (Fig. 2). Based on these findings, we speculate that the loss of MC5R in aquatic animals resulted from adaptation to the marine environment. Taken together, our results suggest convergent adaptation to the marine environment in whales and manatees.

Fig. 1

Genomic location of MC5R in mammals. The phylogenetic tree follows previous studies (Tarver et al. 2016) and shows the relationships among the 10 mammals and 6 whales used in the study. Outgroup branch lengths are not drawn to scale. The lifestyle and the genomic location of MC5R are shown for the 16 mammals. The complete boxes are shown in red (intact MC5R), blue (reverse transcriptase, RTASE), and gray (conserved genes around MC5R). The incomplete red box represents an MC5R relic. The squares beside the phylogenetic tree indicate lifestyle: a black square represents fully aquatic, a white square represents terrestrial, and a half-black, half-white square indicates semiaquatic. Human: Homo sapiens; white-tufted-ear marmoset: Callithrix jacchus; mouse: Mus musculus; Norway rat: Rattus norvegicus; whales (bottlenose dolphin: Tursiops truncatus; killer whale: Orcinus orca; beluga whale: Delphinapterus leucas; sperm whale: Physeter catodon; minke whale: Balaenoptera acutorostrata scammoni; Yangtze River dolphin: Lipotes vexillifer); cattle: Bos taurus; dog: Canis lupus familiaris; walrus: Odobenus rosmarus; Weddell seal: Leptonychotes weddellii; African savanna elephant: Loxodonta africana; Florida manatee: Trichechus manatus latirostris

Fig. 2

Genomic structure of whales after convergent evolution. The tree topology among the whales follows previous studies (Gatesy et al. 2013). Outgroup branch lengths are not drawn to scale. Schematic comparison of the gene organization in different whales, Bos taurus, and Bos mutus. The sequences located between MC2R and RNMT are highly conserved. The sequences in yellow are similar in Bos taurus, Bos mutus, Tursiops truncatus, Orcinus orca, and Delphinapterus leucas. The sequences in green are conserved in Bos taurus, Bos mutus, and Balaenoptera acutorostrata scammoni, and the sequences in orange are conserved in Bos taurus, Bos mutus, Lipotes vexillifer, and Physeter catodon. The brown sequences can be detected in all the species except Balaenoptera acutorostrata scammoni. The difference between Bos taurus and whales is that MC5R is located between these conserved sequences in Bos taurus, whereas RTASE is inserted in whales, which led to the loss of MC5R. The numbers above the solid lines and on the boxes indicate the sizes of the introns and exons, respectively. The dashed line indicates an uncertain number of introns and exons. Cattle: Bos taurus; wild yak: Bos mutus; bottlenose dolphin: Tursiops truncatus; killer whale: Orcinus orca; beluga whale: Delphinapterus leucas; Yangtze River dolphin: Lipotes vexillifer; sperm whale: Physeter catodon; minke whale: Balaenoptera acutorostrata scammoni

MC5R is present in L. africana, Desmodus rotundus, and Chrysochloris asiatica, whereas in T. m. latirostris, as described by Springer and Gatesy (2018), there is a 2823-bp deletion (relative to L. africana) that includes 1991 bp of 5′UTR sequence and 832 bp of MC5R coding sequence (Fig. 3). By comparing the genomes of L. africana and T. m. latirostris, we found that in L. africana there are two homologous “TTATC” sequences that are relatively conserved on the chromosome containing MC5R (Fig. 3). The first is located upstream of the coding sequence of MC5R, and the second is 131 bp before the stop codon of MC5R (Fig. 3). In T. m. latirostris, only the second homologous “TTATC” sequence is present, together with a retained relic of MC5R.
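
The deletion described above can be pictured as recombination between two direct repeats: everything between the two copies is removed and a single copy of the repeat remains. The following is a minimal sketch under that assumption. The toy sequence is hypothetical and is not actual elephant or manatee genomic sequence, and the function name `delete_between_repeats` is introduced here only for illustration.

```python
def delete_between_repeats(seq, repeat):
    """Collapse the first two copies of `repeat` into a single copy.

    Returns (product, removed_length), where removed_length counts one
    repeat copy plus everything between the two copies. If fewer than two
    non-overlapping copies exist, the sequence is returned unchanged.
    """
    first = seq.find(repeat)
    if first == -1:
        return seq, 0
    second = seq.find(repeat, first + len(repeat))
    if second == -1:
        return seq, 0
    # Keep everything before the first copy, then resume at the second copy.
    product = seq[:first] + seq[second:]
    return product, second - first


# Hypothetical toy locus: flank + repeat + intervening DNA + repeat + remainder.
toy = "CTGGGT" + "TTATC" + "GGCCAAGG" + "TTATC" + "ACGT"

if __name__ == "__main__":
    product, removed = delete_between_repeats(toy, "TTATC")
    print(product)   # CTGGGTTTATCACGT
    print(removed)   # 13 (5-bp repeat + 8 bp of intervening sequence)
    # Sizes reported in the text: 5'UTR part + coding part = total deletion.
    print(1991 + 832)  # 2823
```

Only one repeat copy survives in the product, which matches the observation that the manatee retains a single “TTATC” next to the MC5R relic.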

Fig. 3

MC5R gene with upstream and downstream sequences in manatees and related species. The phylogenetic tree follows previous studies (Tarver et al. 2016) and shows the relationships among Trichechus manatus latirostris, Loxodonta africana, Desmodus rotundus, and Chrysochloris asiatica. MC5R is marked in red; “TTATC” and similar mutated sequences are marked with ellipses. The figure shows that specific mutations have led to homologous recombination and gene loss in Trichechus manatus latirostris

The sequence upstream of the first homologous “TTATC” sequence (“CTGGGT”) is also relatively conserved in T. m. latirostris and L. africana (Fig. 3). However, the first homologous “TTATC” sequence, which lies outside MC5R, is not strictly conserved: it is mutated to “TGAGG” in D. rotundus and “ATTTT” in C. asiatica (Figs. 3 and 4; Fig. S2). In D. rotundus and C. asiatica, according to the results of a genomic BLAST, the second homologous “TTATC” sequence within MC5R is still largely conserved: “TCATC” in D. rotundus and “TAATC” in C. asiatica (Figs. 3 and 4; Fig. S2A and B). From the comparison of the genome sequences of L. africana and T. m. latirostris, we conclude that the loss of most of the MC5R sequence in T. m. latirostris was caused by recombination between the homologous “TTATC” sequences (Fig. 4A). The high conservation of the sequence adjacent to the “TAATC” sequence supports the reliability of the comparison results (Figs. 3 and 4; Fig. S2).
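
One simple way to quantify how far each variant has diverged from “TTATC” is the Hamming distance (the number of mismatched positions). The sketch below uses only the motif strings quoted in the text; the dictionary labels and the function name are introduced here for illustration and are not part of the original analysis.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


REFERENCE = "TTATC"

# Motif variants quoted in the text, keyed by (repeat copy, species).
variants = {
    ("first repeat", "D. rotundus"): "TGAGG",
    ("first repeat", "C. asiatica"): "ATTTT",
    ("second repeat", "D. rotundus"): "TCATC",
    ("second repeat", "C. asiatica"): "TAATC",
}

if __name__ == "__main__":
    for (copy, species), motif in variants.items():
        print(copy, species, motif, hamming(REFERENCE, motif))
    # first repeat D. rotundus TGAGG 3
    # first repeat C. asiatica ATTTT 3
    # second repeat D. rotundus TCATC 1
    # second repeat C. asiatica TAATC 1
```

Under this measure, the first repeat differs at three of five positions in both species, whereas the second repeat differs at only one, in line with the description of the second copy as the better-conserved one.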

Fig. 4

Two different mechanisms of MC5R loss in manatees and whales. A Sequence alignment of Trichechus manatus latirostris, Loxodonta africana, Desmodus rotundus, and Chrysochloris asiatica is shown in A1. The conserved sequence “TTATC” and the mutant sequences are indicated in red and green, respectively. The red box represents MC5R. In the genome of manatees, most of the MC5R sequence is lost and only a small part remains. In the genome of Loxodonta africana, there are two homologous “TTATC” sequences on the chromosome containing the MC5R gene. One is located between MC5R and MC2R; the other is located within MC5R. In manatees, however, the sequence between the two “TTATC” copies is lost, and only a relic of MC5R is retained together with one “TTATC.” B In the whale genome, the whole MC5R gene is lost, and RTASE is inserted. Sequence alignment of Bos taurus, Bos mutus, Tursiops truncatus, Orcinus orca, Delphinapterus leucas, Lipotes vexillifer, and Physeter catodon is shown in B1. The red box indicates MC5R, and the blue box indicates RTASE. The conserved sequences before MC5R are marked in orange, and those after MC5R are marked in brown. The intact MC5R is located between the conserved sequences in Bos taurus and Bos mutus. In whales, MC5R is completely lost and RTASE is inserted

The mechanism of MC5R loss in whales

Unlike in T. m. latirostris, MC5R sequences are completely lost in whales. Based on the phylogenetic tree of the species, whales are relatively closely related to B. taurus and B. mutus. Genome sequences of T. truncatus, O. orca, D. leucas, P. catodon, L. vexillifer, and B. a. scammoni were obtained to analyze the absence of MC5R in whales (Fig. 2). The complete MC5R coding sequence is missing in all six whales. However, the genes that flank MC5R in other mammals, MC2R and RNMT, and some conserved noncoding sequences between them were found in these species (Fig. 2). Sequences are marked with different colors, and sequences marked with the same color are highly conserved (Fig. 3; Figure S1A and B).

MC5R is located between the conserved sequences in B. taurus and B. mutus, whereas in whales MC5R is replaced by reverse transcriptase (RTASE). Furthermore, the shared absence of the MC5R ORF in whales suggests that the loss occurred in a common ancestor. L1s are a class of repetitive DNA sequences that can spontaneously “copy-paste” themselves in the human genome (Hancks et al. 2011). Sequence alignment showed that only an RTASE remains of the LINE-1 insertion and that the other elements are lost. Furthermore, the location of the RTASE insertion corresponds to that of MC5R on the chromosome in B. taurus and B. mutus (Fig. 2). The RTASE flanking sequences showed high similarity to the MC5R flanking sequences of B. taurus and B. mutus (Fig. 4B). Synthesizing all the findings and analyses above, we propose that the absence of MC5R in whales is ultimately due to the insertion of RTASE. This loss is consistent with the adaptation of whales to the marine environment during evolution.

The results show that MC5R loss in whales and manatees reflects convergent evolution in adaptation to the marine environment, and we attempted to explore its functional significance for environmental adaptation. Why this gene has not been lost in semiaquatic mammals is currently unknown and would be an interesting and important topic for further research.