Carpinus putoensis is a critically endangered species endemic to Putuo Island in the Zhoushan Archipelago, Zhejiang Province, China. The only known wild C. putoensis tree, a single mature individual preserved on the top of Foding Mountain, is under first-grade state protection. This tree was discovered in the early 1930s, and no other wild individuals have been found since (Shaw et al. 2014). The species is monoecious and in theory able to reproduce sexually in the wild, but its rate of seed production is very low because of strong winds during flowering. Natural regeneration is therefore extremely poor, and almost no seedlings are found under the tree. C. putoensis is currently listed as Critically Endangered on the IUCN Red List (IUCN 2016; Shaw et al. 2014). A good knowledge of its genetics would therefore contribute to the formulation of a conservation strategy. In this study, we assembled and characterized the complete chloroplast (cp) genome sequence of C. putoensis based on Illumina paired-end sequencing data.

Fresh leaves of one individual were collected from Hangzhou Botanical Garden (Zhejiang, China; 30°15′N, 120°16′E) and used for total genomic DNA extraction with the modified CTAB method (Doyle 1987). Whole-genome sequencing was conducted with 150 bp paired-end reads on the Illumina HiSeq platform (Illumina, San Diego, CA). In total, about 900 million high-quality base pairs (approximately 0.9 Gb) of sequence data were obtained and used for cp genome assembly with Velvet (Zerbino and Birney 2008). The resulting contigs were linked based on overlapping regions after being aligned to Juglans regia (NC_028617) (Peng et al. 2015) and were visualized in Geneious version 8.0.5 (Kearse et al. 2012). Annotation was performed with Plann (Plastome Annotator), a newly developed command-line application suited to annotating plastomes that have a well-annotated sequenced relative (Huang and Cronk 2015). The annotation was then corrected in Geneious (Kearse et al. 2012), and a physical map of the genome was generated using OGDRAW (http://ogdraw.mpimp-golm.mpg.de/) (Lohse et al. 2013). A maximum likelihood (ML) tree with 100 bootstrap replicates was inferred using RAxML version 8 (Stamatakis 2014) from alignments of the plastid genomes of 10 species created with MAFFT (Katoh and Standley 2013) (Supplementary Figure S1). The complete chloroplast genome sequence, together with its gene annotations, was submitted to GenBank under accession number KX695124.
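The contig-linking step can be illustrated with a small sketch. It is not the procedure implemented in Geneious or Velvet; it only shows the general idea of joining contigs whose ends overlap, assuming the contigs have already been placed in reference order and share exact end overlaps. The function names (`suffix_prefix_overlap`, `link_contigs`), the `min_len` threshold and the toy sequences are hypothetical and chosen for illustration only.

```python
from typing import List


def suffix_prefix_overlap(left: str, right: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `left` that exactly matches a prefix of `right`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    max_len = min(len(left), len(right))
    # Try the longest candidate overlap first and shrink until a match is found.
    for k in range(max_len, min_len - 1, -1):
        if left[-k:] == right[:k]:
            return k
    return 0


def link_contigs(ordered_contigs: List[str], min_len: int = 3) -> str:
    """Join contigs (assumed to be in reference order) by collapsing end overlaps."""
    if not ordered_contigs:
        return ""
    merged = ordered_contigs[0]
    for contig in ordered_contigs[1:]:
        k = suffix_prefix_overlap(merged, contig, min_len)
        if k == 0:
            raise ValueError("adjacent contigs share no overlap of the required length")
        # Append only the part of the contig that extends beyond the overlap.
        merged += contig[k:]
    return merged


if __name__ == "__main__":
    # Toy sequences, invented for illustration; not real C. putoensis data.
    toy_contigs = ["ATGCGTAC", "GTACCTGA", "CTGATTAG"]
    print(link_contigs(toy_contigs))
```

Real contigs contain sequencing errors and repeats (notably the two IR copies), so in practice the overlaps are inspected against the reference alignment and not merged by exact string matching alone.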

The complete cp genome of C. putoensis is a double-stranded, circular DNA molecule 159,673 bp in length. It contains two inverted repeat (IR) regions of 26,044 and 26,043 bp, separated by a large single-copy (LSC) region of 89,020 bp and a small single-copy (SSC) region of 18,568 bp (Fig. 1). The genome contains 128 genes, including 83 protein-coding genes (PCGs), 36 tRNA genes and 8 ribosomal RNA genes. Most gene species occur as a single copy, while 16 occur in two copies: all four rRNA species (4.5S, 5S, 16S and 23S rRNA), 7 tRNA species (trnA-UGC, trnI-CAT, trnI-GAU, trnL-CAA, trnN-GTT, trnR-ACG and trnV-GAC) and 5 PCG species (ndhB, rpl2, rps12, rps7 and ycf2). The overall AT content of the C. putoensis cpDNA is 63.6%, while the corresponding values for the LSC, SSC and IR regions are 65.8, 69.8 and 57.5%, respectively.
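As a quick consistency check on these figures, the overall AT content should equal the length-weighted mean of the regional AT contents. The sketch below recomputes it from the values given in the text; the variable and function names are hypothetical, and the labels "IR copy 1" and "IR copy 2" are assigned arbitrarily. Note that the four region lengths as printed sum to 159,675 bp, so the sketch uses that sum as the denominator instead of the reported total length.

```python
from typing import List, Tuple

# (label, length in bp, AT content in percent), taken from the text above.
# The assignment of 26,044 bp and 26,043 bp to "copy 1" and "copy 2" is arbitrary.
REGIONS: List[Tuple[str, int, float]] = [
    ("LSC", 89_020, 65.8),
    ("SSC", 18_568, 69.8),
    ("IR copy 1", 26_044, 57.5),
    ("IR copy 2", 26_043, 57.5),
]

# Gene species reported as occurring in two copies: rRNA, tRNA and PCG species.
DUPLICATED_SPECIES = {"rRNA": 4, "tRNA": 7, "PCG": 5}


def total_length(regions: List[Tuple[str, int, float]]) -> int:
    """Sum of the region lengths in bp."""
    return sum(length for _, length, _ in regions)


def weighted_at_percent(regions: List[Tuple[str, int, float]]) -> float:
    """Length-weighted mean AT content (percent) across all regions."""
    weighted_sum = sum(length * at for _, length, at in regions)
    return weighted_sum / total_length(regions)


if __name__ == "__main__":
    print("Sum of region lengths (bp):", total_length(REGIONS))
    print("Weighted AT content (%):", round(weighted_at_percent(REGIONS), 1))
    print("Duplicated gene species:", sum(DUPLICATED_SPECIES.values()))
```

Rounded to one decimal place, the weighted mean agrees with the reported overall AT content of 63.6%, and the three duplicated categories add up to the 16 gene species stated in the text.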

Fig. 1 Gene map of the C. putoensis chloroplast genome

The phylogenetic analysis, in which Rosa odorata var. gigantea (Yang et al. 2014) was used as the outgroup, showed that the cp genome of C. putoensis is closely related to that of Ostrya rehderiana (Supplementary Figure S1). This complete chloroplast genome can be readily used for population genomic studies of C. putoensis, and such information would be fundamental to formulating new conservation and management strategies for this endangered species.