Corylus fargesii, a hazelnut species endemic to China, grows in mountain valleys at elevations of 800–3000 m in southwest China (Kuang et al. 1979). A small tree species in the genus Corylus, C. fargesii has been seriously threatened by human activities over the last few decades. Although several chloroplast (cp) DNA markers have previously been used for phylogenetic analysis (Erdogan and Mehlenbacher 2000; Palmé and Vendramin 2002; Boccacci and Botta 2009; Bassil et al. 2013; Martins et al. 2013), little is known about the chloroplast genome in this genus apart from that of Corylus chinensis (GenBank accession number KX814336; Hu et al. 2016). In the present study, we report the first complete chloroplast genome of C. fargesii, based on Illumina paired-end sequencing data. The annotated chloroplast genome of C. fargesii has been deposited in GenBank under accession number KX822767.

Fresh leaves were collected from a single C. fargesii plant growing in the resources nursery of the Institute of Forestry and Pomology (Beijing Academy of Agriculture and Forestry Sciences, Beijing, China). DNA was extracted using the Plant Genomic DNA Kit (TIANGEN Biotech Co., Beijing, China) according to the manufacturer’s instructions, and high-throughput sequencing was carried out on the HiSeq2500 System (Illumina, San Diego, CA) by OriGene Technologies (Beijing, China). A total of 80.43 M raw reads of 126 bp were trimmed using SolexaQA (Cox et al. 2010), yielding approximately 9.577 Gb of high-quality sequence data, which were used for chloroplast genome assembly with SOAPdenovo (Luo et al. 2012). Reference-guided assembly was then performed to reconstruct the chloroplast genome with the BLAST program (Altschul et al. 1990), using closely related species as references. After the gaps were filled with GapCloser (http://soap.genomics.org.cn/index.html), a 159,856 bp chloroplast genome of C. fargesii was obtained. Annotation was performed using the CpGAVAS web server (Liu et al. 2012), which was also used to generate a physical map of the chloroplast genome.
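The read statistics above can be cross-checked with a few lines of arithmetic. The following is only a minimal sketch: it assumes that every raw read is exactly 126 bp long (the text does not say whether this is a fixed or a maximum length), and the function and variable names are illustrative and not part of the original pipeline.

```python
# Sanity check on the sequencing yield reported in the text.
# Assumption (not stated in the source): every raw read is exactly 126 bp.

RAW_READS = 80.43e6     # number of raw reads, from the text
READ_LENGTH_BP = 126    # read length in bp, from the text
CLEAN_BASES = 9.577e9   # high-quality bases after trimming, from the text


def raw_yield_bp(n_reads: float, read_length: int) -> float:
    """Total number of bases sequenced before trimming."""
    return n_reads * read_length


def retained_fraction(clean_bases: float, raw_bases: float) -> float:
    """Share of the raw bases that survived quality trimming."""
    return clean_bases / raw_bases


raw_bases = raw_yield_bp(RAW_READS, READ_LENGTH_BP)
kept = retained_fraction(CLEAN_BASES, raw_bases)

print(f"raw yield: {raw_bases / 1e9:.3f} Gb")
print(f"retained after trimming: {kept:.1%}")
```

Under that assumption, the raw yield is a little over 10 Gb and roughly 94–95% of the bases are retained after trimming. These bases come from total genomic DNA, so they should not be read as chloroplast coverage.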

The circular chloroplast genome of C. fargesii contains a pair of inverted repeat regions (IRa and IRb) of 26,602 bp each, a large single-copy (LSC) region of 88,313 bp, and a small single-copy (SSC) region of 18,339 bp (Fig. 1). It comprises 131 genes: 94 protein-coding genes, 8 ribosomal RNA genes, and 29 transfer RNA genes. Among the annotated genes, 11 protein-coding genes contain introns: nine with a single intron each (atpF, rpoC1, rpl2, ycf15, ndhB, ndhA, ndhB, ycf15, and rpl2, where rpl2, ycf15, and ndhB each appear twice because duplicated copies are counted separately) and two with two introns each (clpP and ycf3). The overall AT content of the C. fargesii cp genome is 63.49%.
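The reported figures are internally consistent, which can be verified with a short check. This is a sketch only: the class and variable names are illustrative, and all numbers are taken directly from the paragraph above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quadripartite:
    """Region lengths (bp) of a chloroplast genome with two IR copies."""
    lsc: int  # large single-copy region
    ssc: int  # small single-copy region
    ir: int   # length of ONE inverted repeat copy

    def total(self) -> int:
        # The genome holds one LSC, one SSC, and two IR copies.
        return self.lsc + self.ssc + 2 * self.ir

    def shares(self) -> dict:
        # Fraction of the genome occupied by each region type.
        t = self.total()
        return {"LSC": self.lsc / t, "SSC": self.ssc / t, "IR (both)": 2 * self.ir / t}


# Values reported in the text for C. fargesii.
genome = Quadripartite(lsc=88_313, ssc=18_339, ir=26_602)
gene_counts = {"protein-coding": 94, "rRNA": 8, "tRNA": 29}
intron_genes = {"one intron": 9, "two introns": 2}
at_percent = 63.49

total_bp = genome.total()
total_genes = sum(gene_counts.values())
total_intron_genes = sum(intron_genes.values())
gc_percent = round(100 - at_percent, 2)

print(f"genome length: {total_bp:,} bp")
print(f"genes: {total_genes}, intron-containing: {total_intron_genes}")
print(f"GC content: {gc_percent}%")
for region, share in genome.shares().items():
    print(f"{region}: {share:.1%}")
```

The region lengths sum to the 159,856 bp assembly length given in the methods, the gene categories sum to 131, and the intron-containing genes sum to 11. The GC content implied by the AT content is 36.51%.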

Fig. 1 Gene map of the Corylus fargesii chloroplast genome

To determine the phylogenetic position of C. fargesii, a maximum likelihood analysis was performed using MEGA6 (Tamura et al. 2013) on 19 complete chloroplast genomes: 12 from Fagales species and seven from outgroup species. The cp genome of C. fargesii was closely related to that of Ostrya rehderiana of the family Betulaceae (Fig. 2).

Fig. 2 Phylogenetic tree inferred using MEGA6 software from 19 complete cp genomes

This complete chloroplast genome can be used for subsequent population, phylogenetic, and chloroplast genetic engineering studies of C. fargesii. The new information will also be fundamental to formulating conservation and management strategies for this important hazel species.