Introduction

The mitochondrial genome of most vertebrates is a single extrachromosomal circular DNA molecule, 15–18 kb in length, which consistently contains 13 protein-coding genes (PCGs), two ribosomal RNA (rRNA) genes, 22 transfer RNA (tRNA) genes, and a noncoding adenine (A) + thymine (T)-rich region (the displacement loop region, D-loop) (Satoh et al. 2016). Compared with the nuclear genome, the mitogenome is characterized by its small size, simple genomic organization, maternal inheritance, high rate of evolution, and almost unambiguous orthology. Hence, mitochondrial DNA (mtDNA) is widely regarded as an informative molecular marker and has been applied in species identification, evolutionary biology, population genetics, and conservation biology (Duchene et al. 2012; Wang et al. 2016; Ko et al. 2018).

The Danioninae is one of the most species-rich subfamilies of the Cyprinidae, comprising approximately 300 species in 50 genera (Tang et al. 2010; Liao et al. 2011a). However, the subfamily and its members have had a long and somewhat convoluted taxonomic history (Tang et al. 2010): it was long treated as a large assemblage containing most taxa not accommodated by the other subfamilies of the Cyprinidae. In previous phylogenetic studies, the Danioninae was not recovered as monophyletic, with putative members scattered throughout the Cyprinidae (Tang et al. 2010). The Danioninae sensu stricto has been divided into three major lineages, the tribes Rasborini, Danionini, and Chedrini (Tang et al. 2010; Liao et al. 2011b). The remaining danionines, including Aphyocypris, Opsariichthys, Zacco and others, are non-monophyletic (Wang et al. 2007; He et al. 2008; Mayden et al. 2008, 2009; Tao et al. 2013) and their placement within the subfamily has remained unsettled. Hence, many studies have referred to these “displaced” danionines as the Ex-Danioninae (Fang 2003; Liao et al. 2011a, 2011b; Liao and Kullander 2013; Huang et al. 2017), a usage also followed in FishBase (Froese and Pauly 2019). With the growing availability of molecular markers and genome sequences, recent studies have reclassified these “displaced” danionines into the family Xenocyprididae (Tan and Armbruster 2018), a placement supported by phylogenetic analyses of mtDNA and nuclear genes (Stout et al. 2016; Schönhuth et al. 2018). In these phylogenetic trees, the family Xenocyprididae, including Aphyocypris, Opsariichthys, Zacco and other genera, is sister to the family Danionidae. However, phylogenetic studies based on complete mitogenomes are still few.

The cyprinid genus Aphyocypris was established in 1868 and is diagnosed by a subsuperior mouth, no barbels, no knob on the anterior middle of the lower jaw fitting into a notch in the upper jaw, a keel between the base of the pelvic fin and the anus, and an incomplete or absent lateral line (Chen 1998). Most species of Aphyocypris are narrowly distributed in southern China, except for A. chinensis, which is widely distributed in East Asia. Consequently, most species of the genus have been reported to be at risk of extinction as a result of low population diversity, narrow distribution and habitat destruction (Hu et al. 2009; Liao et al. 2011c; Zhu et al. 2015; Jiang et al. 2016).

The Garnet minnow Aphyocypris lini (Weitzman and Chan, 1966) is an endangered cyprinid endemic to southern China. A subtropical freshwater benthopelagic fish, it lives in clear, shallow water with dense weed growth in river ditches and rivulets (Yue and Chen 1998; Zhu et al. 2013). Aphyocypris lini had been recorded only in Hong Kong and has been declared extinct in the wild (Hu et al. 2009). Beyond these reports, there are no published records of this endangered fish. In this study, we collected A. lini specimens in Fujian Province, China, for the first time. This is the first field report of the species in nearly four decades, as well as a record of a new distribution area. We also sequenced the first complete mitogenome of A. lini and compared it with the mitogenomes of related species endemic to East Asia with respect to genome organization, composition, and codon usage. Finally, we reconstructed phylogenetic trees of the genus Aphyocypris based on the PCGs. Our aim is to characterize the A. lini mitochondrial genome and to provide insight into the evolutionary relationships within the genus Aphyocypris.

Materials and methods

Sampling and DNA extraction

Four specimens were collected from the Minjiang River, Fujian Province, China. For protection of this endangered species, the coordinates are not shown, but can be obtained by contacting the authors if needed. Specimens (Fig. 1) were identified following the diagnostic characteristics described by Chen (1998). Muscle tissues were preserved in 95% ethanol and deposited in the Institute of Oceanology, Minjiang University. Total genomic DNA was extracted by the salt-extraction method (Aljanabi and Martinez 1997).

Fig. 1

Aphyocypris lini, in an aquarium (photographed by J.Z)

Primer design, PCR amplification and sequencing

The complete mitogenome of A. lini was amplified using six universal primers and six specific primers (Table 1). First, six partial fragments were amplified and sequenced using the universal primers. Subsequently, specific primers were designed from the sequenced segments with the online tool Primer-BLAST (Ye et al. 2012) to amplify the long fragments. Polymerase chain reaction (PCR) amplification was carried out in a 20 μL reaction containing 50 ng template DNA, 10 μL SanTaq Plus PCR Mix (Sangon, Shanghai), 0.5 μL of each primer (10 μmol/L), and sterilized ddH2O to the final volume. The PCR conditions were: 94 °C for 3 min; 35 cycles of 94 °C for 30 s, the annealing temperature (Table 1) for 45 s, and 72 °C for 1 min; and a final elongation at 72 °C for 5 min. PCR products were examined by electrophoresis on 1.0% TAE agarose gels and sequenced by Sangon Biotech (Shanghai, China).
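As a quick check of the reaction set-up described above, the following sketch works out the final concentration of each primer and the total programmed hold time of the thermal profile. The variable names are illustrative, and the time estimate ignores ramping between temperatures, which depends on the instrument.

```python
# Values taken from the PCR protocol described in the text.
reaction_volume_ul = 20.0
primer_volume_ul = 0.5
primer_stock_umol_per_l = 10.0

# Dilution: C_final = C_stock * V_primer / V_reaction
final_primer_umol_per_l = (
    primer_stock_umol_per_l * primer_volume_ul / reaction_volume_ul
)

# Thermal profile, in seconds.
initial_denaturation_s = 3 * 60
cycles = 35
cycle_steps_s = (30, 45, 60)  # denaturation, annealing, extension
final_elongation_s = 5 * 60

# Sum of programmed holds only; ramp times are not included.
programmed_hold_s = (
    initial_denaturation_s + cycles * sum(cycle_steps_s) + final_elongation_s
)

print(f"Final primer concentration: {final_primer_umol_per_l} umol/L")
print(f"Programmed hold time: {programmed_hold_s / 60:.2f} min")
```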

Table 1 Information of primers in this study

Sequence assembly, annotation and analysis

The overlapping fragments were assembled into a linear mitochondrial DNA sequence using SeqMan (DNASTAR), and the assembled sequence was checked manually. The mitogenome was annotated using the MitoAnnotator web server (http://mitofish.aori.u-tokyo.ac.jp/annotation/input.html) (Iwasaki et al. 2013). Transfer RNA genes and their secondary structures were verified using the tRNAscan-SE web server (http://lowelab.ucsc.edu/tRNAscan-SE/) (Lowe and Chan 2016). A circular map of the mitogenome was drawn using the CGView online server (http://stothard.afns.ualberta.ca/cgview_server/) (Grant and Stothard 2008). Amino acid composition, nucleotide composition and relative synonymous codon usage (RSCU) were calculated using MEGA X (Kumar et al. 2018). The RSCU value of a codon is its observed frequency divided by the frequency expected if all synonymous codons for that amino acid were used equally. Nucleotide composition skew was calculated using the following formulas: AT-skew = (A − T) / (A + T) and GC-skew = (G − C) / (G + C) (Perna and Kocher 1995).
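The RSCU definition given above can be illustrated with a minimal Python sketch. It is not the MEGA X implementation; the function names and the toy sequence are hypothetical and serve only to show the arithmetic for one synonymous codon family.

```python
from collections import Counter


def count_codons(cds):
    """Count in-frame codons; a trailing partial codon is ignored."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def rscu(counts, family):
    """RSCU for each codon of one synonymous family.

    RSCU = observed count / (family total / number of synonymous codons).
    """
    total = sum(counts.get(codon, 0) for codon in family)
    if total == 0:
        return {codon: 0.0 for codon in family}
    expected = total / len(family)
    return {codon: counts.get(codon, 0) / expected for codon in family}


# Hypothetical toy sequence made of six threonine codons (not real data).
toy_cds = "ACAACAACAACCACGACT"
threonine_family = ["ACA", "ACC", "ACG", "ACT"]

values = rscu(count_codons(toy_cds), threonine_family)
for codon, value in values.items():
    print(codon, round(value, 3))
```

An RSCU above 1 indicates that a codon is used more often than expected under equal usage, and the values within one family sum to the number of synonymous codons.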

Phylogenetic analysis

A total of 22 sets of 13 PCG sequences were used for phylogenetic analysis. Sequences from other taxa were downloaded from GenBank (species and accession numbers are listed in Table 4), with Misgurnus anguillicaudatus (accession number MK093946) and Leptobotia elongata (accession number NC018764) used as outgroups. Phylogenies were reconstructed using Maximum Likelihood (ML) and Bayesian Inference (BI) methods. The best-fit partition model of nucleotide evolution for the PCGs was identified with PartitionFinder v2 (Lanfear et al. 2012) in the PhyloSuite platform (Zhang et al. 2020); GTR + I + G was selected according to the Akaike Information Criterion (Bozdogan 1987). ML and BI analyses were performed with RAxML v8.2.12 and MrBayes v3.2.7, respectively, following the program manuals (Stamatakis 2014; Ronquist et al. 2012). The ML analysis was run under the GTRGAMMAI model, with 1000 bootstrap replicates used to estimate support values and to search for the best ML tree. The BI analysis ran two simultaneous Markov Chain Monte Carlo (MCMC) chains for 10 million generations, sampling every 1000 generations, with the first 25% of samples discarded as burn-in. Phylogenetic trees were visualized with the online tool Interactive Tree Of Life (iTOL, https://itol.embl.de/) (Letunic and Bork 2019).
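Partitioned analyses of this kind start from a concatenated alignment with known gene boundaries. The sketch below shows one way such a supermatrix and its partition coordinates could be built; it is an illustration under stated assumptions, not the PhyloSuite workflow used in this study. The function name, the taxon labels and the toy sequences are hypothetical, and each gene is assumed to be already aligned.

```python
def concatenate(alignments, gene_order):
    """Concatenate per-gene alignments into one supermatrix.

    alignments: {gene: {taxon: aligned sequence}}
    Returns (supermatrix, partitions), where partitions holds
    (gene, start, end) with 1-based inclusive coordinates.
    """
    # Keep only taxa that are present in every gene alignment.
    taxa = sorted(set.intersection(*(set(alignments[g]) for g in gene_order)))
    parts = {taxon: [] for taxon in taxa}
    partitions = []
    start = 1
    for gene in gene_order:
        lengths = {len(alignments[gene][taxon]) for taxon in taxa}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences are not aligned to equal length")
        length = lengths.pop()
        for taxon in taxa:
            parts[taxon].append(alignments[gene][taxon])
        partitions.append((gene, start, start + length - 1))
        start += length
    supermatrix = {taxon: "".join(seqs) for taxon, seqs in parts.items()}
    return supermatrix, partitions


# Hypothetical toy alignments (two genes, two taxa), not real data.
toy = {
    "ND1": {"taxon_a": "ATGCCA", "taxon_b": "ATGCCG"},
    "ND2": {"taxon_a": "ATGAAC", "taxon_b": "ATGAAT"},
}
supermatrix, partitions = concatenate(toy, ["ND1", "ND2"])
print(supermatrix)
print(partitions)
```

The partition coordinates are the kind of information a partition-scheme search takes as input alongside the concatenated alignment.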

Results and discussion

Genome organization and composition

The complete mitogenome sequence of A. lini is 16,613 bp in length and contains 13 PCGs, 22 tRNA genes, two rRNA genes, and one control region (CR). The complete closed-circular mitogenome has been deposited in GenBank under accession number MW338757. All PCGs are encoded on the heavy strand (H-strand) except NADH dehydrogenase subunit 6 (ND6), which is encoded on the light strand (L-strand). Eight tRNAs (tRNAGln, Ala, Asn, Cys, Tyr, Ser, Glu, Pro) are located on the L-strand and the remaining 14 tRNAs on the H-strand (Fig. 2). This strand distribution of genes is consistent with that of most vertebrates, including fishes (Billington and Hebert 1991). The mitogenome contains 14 gene overlaps (from −1 to −31 bp in size) and 7 intergenic spacers (from 1 to 7 bp in length). The largest overlap occurs between tRNAAsn and tRNACys, and the two longest spacers lie between ATPase8 and ATPase6 and between ND4 and ND4L (Table 2).

Fig. 2

The schematic illustration for mitochondrial genome of A. lini

Note: Genes encoded on the H- and L-strands, indicated by opposite arrow directions, are shown inside and outside the circle. In the purple-green ring, which indicates GC-skew, purple is positive and green is negative. The black ring shows GC content; outward and inward peaks indicate above- and below-average GC content, respectively.

Table 2 Features of the mitochondrial genome of A. lini

The nucleotide composition of the complete mitogenome of A. lini is as follows: A = 5055 (30.4%), T = 4471 (26.9%), G = 2762 (16.6%), C = 4325 (26.0%) (Table 3). It shows a slight A + T bias (57.3%), similar to most Xenocyprididae species (53.92–60.07%) (Table 4) (Sitoh et al. 2006; Tang et al. 2010; Liaw et al. 2013; Luo et al. 2019). The AT-skew and GC-skew of the complete mitogenome were also calculated: the composition is skewed toward A over T (AT-skew = 0.0614) and toward C over G (GC-skew = −0.2205) (Table 3).
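The skew formulas from the Methods can be applied directly to the base counts reported above. The following minimal sketch uses those counts; the function and variable names are illustrative. A positive AT-skew means more A than T, and a negative GC-skew means more C than G; values recomputed this way may differ from the tabulated ones in the last decimal place because of rounding.

```python
def skew(x, y):
    """Compositional skew (x - y) / (x + y), after Perna and Kocher (1995)."""
    return (x - y) / (x + y)


# Base counts reported for the complete A. lini mitogenome.
counts = {"A": 5055, "T": 4471, "G": 2762, "C": 4325}

total = sum(counts.values())
at_content = (counts["A"] + counts["T"]) / total
at_skew = skew(counts["A"], counts["T"])
gc_skew = skew(counts["G"], counts["C"])

print(f"Length: {total} bp")
print(f"A + T content: {at_content:.1%}")
print(f"AT-skew: {at_skew:.4f}")
print(f"GC-skew: {gc_skew:.4f}")
```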

Table 3 The nucleotide composition and AT/GC-skew of the mitochondrial genome of A. lini
Table 4 Comparison of characteristics within the mitochondrial genome of members of the Aphyocypris genus and related species

Protein-coding genes and codon usage

The total length of the PCGs in A. lini is 11,417 bp. A + T content, AT-skew and GC-skew were clearly biased among codon positions, with the highest A + T content and the lowest AT- and GC-skew values observed at the first codon position (58.8%, −0.0850 and −0.3155, respectively). This suggests that relaxed negative selection at neutral sites might affect the base composition of the complete mitogenome (Bachtrog 2007). Additionally, the AT-skew of the A. lini PCGs is negative, unlike that of most species in Xenocyprididae except Zacco sieboldii (Temminck & Schlegel, 1846) (Table 4). Thus, the PCGs of A. lini show an excess of T over A, whereas the PCGs of most teleost fishes are biased toward A rather than T (Yu et al. 2019). This unusual AT-skew might be attributable to particular selective pressures or processes that have reduced the A content of the PCGs. Similar results have been reported for the bitterling Sinorhodeus microlepis (Li, Liao, Arai & Zhao, 2017) and the goby Rhinogobius leavelli (Herre, 1935) (Yu et al. 2019; Zhang and Shen 2019).
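Composition by codon position can be obtained by slicing a coding sequence with a stride of three. The sketch below is a minimal illustration of that idea, not the software used in this study; the function name and the nine-base toy sequence are hypothetical.

```python
def composition_by_codon_position(cds):
    """A + T content, AT-skew and GC-skew at codon positions 1, 2 and 3."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    results = []
    for pos in range(3):
        bases = cds[pos:usable:3]
        a, t, g, c = (bases.count(base) for base in "ATGC")
        results.append({
            "position": pos + 1,
            "at_content": (a + t) / len(bases) if bases else 0.0,
            "at_skew": (a - t) / (a + t) if a + t else 0.0,
            "gc_skew": (g - c) / (g + c) if g + c else 0.0,
        })
    return results


# Hypothetical toy sequence (three codons), not real data.
results = composition_by_codon_position("ATGACCTTA")
for row in results:
    print(row)
```

For a whole mitogenome, the same function would be applied to the concatenation of all in-frame PCG sequences.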

Of the 13 PCGs, twelve use the canonical initiation codon ATG, whereas cox1 uses GTG, which codes for valine. All PCGs terminate with the canonical stop codon TAA or TAG except four genes (ND2, cox2, ND3, and Cytb), which have the incomplete stop codon T--. Moreover, overlapping regions were detected between ATP8 and ATP6 (7 shared nucleotides), ND4L and ND4 (7 shared nucleotides) and ND5 and ND6 (4 shared nucleotides).
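Start codons and complete or incomplete stop codons can be read off an annotated coding sequence from its first three bases and its length modulo three. The sketch below illustrates this under simplifying assumptions: only the TAA and TAG stops named in the text are recognized, a truncated stop is assumed to be a prefix of TAA (such stops are generally thought to be completed to TAA by polyadenylation of the transcript), and the function name and toy sequences are hypothetical.

```python
COMPLETE_STOPS = ("TAA", "TAG")


def classify_termini(cds):
    """Return (start codon, stop label) for one annotated coding sequence.

    The stop label is 'TAA' or 'TAG' for a complete stop, 'T--' or 'TA-'
    for an incomplete stop, and 'none' if no stop is recognized.
    """
    cds = cds.upper()
    start = cds[:3]
    remainder = len(cds) % 3
    if remainder == 0:
        last = cds[-3:]
        stop = last if last in COMPLETE_STOPS else "none"
    else:
        tail = cds[-remainder:]
        # An incomplete stop is a leading fragment of TAA: 'T' or 'TA'.
        stop = tail + "-" * (3 - remainder) if "TAA".startswith(tail) else "none"
    return start, stop


# Hypothetical toy sequences, not real genes.
complete_example = classify_termini("ATGGCCTAA")
incomplete_example = classify_termini("GTGGCCT")
print(complete_example)
print(incomplete_example)
```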

The codon usage of the A. lini mitogenome was assessed using relative synonymous codon usage (RSCU) values (Fig. 3). Threonine (Thr), proline (Pro) and leucine 1 (Leu1) are the most frequently encoded amino acids, while cysteine (Cys) is the least used. The commonly used codons in the A. lini mitogenome, AAA (Lys), CCC (Pro), AUU (Ile), UUA (Leu), and ACA (Thr), mostly have A or U at the third codon position. This pattern may reflect genome compositional bias, selection for optimal tRNA usage, or DNA repair efficiency, as suggested for other teleost fishes (Fischer et al. 2013).

Fig. 3

Codon usage and relative synonymous codon usage (RSCU) in all protein coding genes of A. lini mitochondrial genome. Note: The codons are on the X-axis and RSCU values are shown on the Y-axis

Transfer and ribosomal RNA genes, non-coding region

The 22 tRNA genes of A. lini range from 68 bp to 76 bp in size and together comprise 9.43% (1556 bp) of the whole mitogenome. Except for tRNASer, the 21 remaining tRNAs can be folded into the typical cloverleaf secondary structure (Online resource 1: Fig. S1), as commonly observed in other teleost mitogenomes (Garey and Wolstenholme 1989). tRNASer, however, lacks the D-arm, which is consistent with many metazoan mitochondrial tRNAs (Frazer-Abel and Hagerman 2008; Watanabe et al. 2014). The two rRNA genes, separated by tRNAVal, are 962 bp and 1687 bp in length, close to those of most danionines (Tang et al. 2010). A strong AT bias was also found in the rRNAs, although their AT content is lower than that of the total mtDNA, tRNAs, PCGs, and control region. A strongly positive AT-skew was also found in the rRNAs (Table 3), contributing to the positive AT-skew of the complete mitogenome.

Finally, we annotated two common non-coding regions: the light-strand replication origin (OL) and the D-loop. OL is 31 bp long and is located within a cluster of five consecutive tRNA genes. The D-loop is 936 bp in length, longer than in most species of Xenocyprididae (the former Ex-Danioninae). However, short tandem repeats, which might account for differences in D-loop length, were not found in A. lini. Even so, D-loop length variation has been suggested to be an important marker in phylogenetic studies (Tang et al. 2017).

Phylogenetic relationships

To better understand the relationships among Aphyocypris and related species, phylogenetic trees were reconstructed using ML and BI methods, based on concatenated nucleotide sequences of the 13 PCGs from 22 minnows belonging to the families Danionidae or Xenocyprididae and 2 outgroup taxa. Both methods yielded a congruent topology. As shown in Fig. 4, Danioninae sensu stricto and Ex-Danioninae (now mainly assigned to Xenocyprididae) formed separate groups overall, consistent with phylogenies resolved from both morphological and molecular analyses (Tang et al. 2010; Liao et al. 2011b, 2011c). However, Danio albolineatus (Blyth, 1860) was placed in the Xenocyprididae clade with relatively high bootstrap support, a result that conflicts with the earlier conclusion of a large assemblage of danionines. Although D. albolineatus was found at the base of the Xenocyprididae clade, the relationship between Danionidae and Xenocyprididae needs to be examined with more molecular and morphological evidence.

Fig. 4

Phylogenetic relationships of the genus Aphyocypris and related species inferred using RAxML and Bayesian inference. Note: Numbers at the nodes show bootstrap value of 100/100 in ML tree and posterior probability of 1.00/1.00 in BI tree

Furthermore, there are two major tribes within Xenocyprididae. The tribe Xenocypridinae (including the genera Aphyocypris, Yaoshanicus, Nicholsicypris, and Pararasbora) is sister to the tribe Opsariichthyinae (Opsariichthys, Zacco, and other genera such as Parazacco that were not included in this study). In addition, some species, such as Tanichthys albonubes and Gobiocypris rarus, were found at the base of the Xenocyprididae clade. These results are also consistent with Tang et al. (2010) and provide strong evidence for the taxonomic validity of Xenocyprididae (Ex-Danioninae in FishBase) (Tang et al. 2010; Liao et al. 2011b, 2011c; Huang et al. 2017).

Moreover, the phylogeny clearly shows that the genus Aphyocypris is not monophyletic. Aphyocypris kikuchii and A. chinensis were found to be closely related to Yaoshanicus arcus, Nicholsicypris normalis, and Pararasbora moltrechti. The taxonomy within the Aphyocypris lineage is ambiguous and has been subject to repeated changes of names and classification (Du et al. 2003). We therefore suggest reconsidering the taxonomic status of Aphyocypris and closely related genera. Over the past hundred years, taxonomists have described differences in key morphological traits between these genera (Nichols and Pope 1927; Nichols 1943). As more specimens have been examined, however, the observed intraspecific variation has also increased, so the taxonomic status of these genera remains open to discussion. In the present study, the mitogenome-based phylogenetic evidence supports merging the genera Aphyocypris, Yaoshanicus, Nicholsicypris and Pararasbora into one genus, as proposed in the reclassification by Tan and Armbruster (2018).

Current status and conservation

As mentioned above, A. lini was listed as extinct in the wild in the Chinese Red List in the 1990s (Yue and Chen 1998), and there have been no studies or reports on this threatened species since then. Through more intensive field sampling supported by molecular methods, this endangered minnow has now been confirmed in the wild for the first time in more than 30 years, together with a new distribution record. However, the outlook is not optimistic: the detected wild population is small and its habitat requirements are extremely demanding, which poses considerable challenges for protection. In addition to the usual anthropogenic influences such as dam construction, sand mining, and commercial fishing (Maitland 1995; Cooke et al. 2012), continued attention should be paid to the native habitat of the species (George et al. 2009). Furthermore, from the perspective of population genetics and conservation biology, multiple sources of data need to be integrated to make more comprehensive protection recommendations for these minnows, which are native to streams and rivulets (Vrijenhoek 1998; Alves et al. 2001).

Conclusion

In the present study, we collected the endangered minnow A. lini in the field from a new distribution area. We then determined its complete mitochondrial genome, which contains 37 genes and one CR, as is typical of teleost mitogenomes. Comparative analysis of mitogenome structure, base composition, codon usage, and gene order revealed an unusual AT-skew in the PCGs of A. lini. The reconstructed phylogeny of Xenocyprididae and related species suggested that the genus Aphyocypris is not monophyletic and that the taxonomy of Aphyocypris and its phylogenetically closest genera, Yaoshanicus, Nicholsicypris and Pararasbora, should be reconsidered, possibly by merging them into one genus. Finally, our study provides a genetic basis for the conservation of this endangered minnow.