Introduction

Germplasm resources are the basis of genetic breeding, biotechnology and plant science. Around 61 million germplasm resources have been collected, 90% of which are seeds. Seed storage is one of the most common and cost-effective ways to conserve germplasm resources over the long term [1]. The condition of stored seeds is monitored through genetic integrity, which refers to differences in alleles and genotype frequencies relative to the original population. The genetic integrity of cross-pollinated plants such as pearl millet is mainly affected by seed aging, population and environment [2]. However, even when seeds are stored under suitable conditions, they inevitably age [3]. Seed aging can cause not only gene mutation and genetic drift but also the loss of genetic integrity, which is a serious disadvantage for the conservation of germplasm resources [4]. Thus, it is necessary to clarify the impact of seed aging on the genetic integrity of plants.

Because seeds take a long time to age under natural conditions, researchers often age seeds artificially to accelerate the process and build a germination gradient for further research [5]. Commonly used methods include hot water aging (58 ± 1 °C) [6], high temperature with high relative humidity (40 °C, 100% RH) [7] and methanol solution (MS) [8, 9].

Among the molecular markers used to evaluate the genetic integrity of plants, simple sequence repeats (SSR, also known as short tandem repeats, STR) are highly polymorphic and well conserved [10]. Genomic-SSR, which is more polymorphic than expressed sequence tag SSR (EST-SSR) [11], has been widely used in studies of different plant species, such as Cannabis sativa var. indica (Lam.) E. Small et Cronq. [12], Brassica rapa L. [13], Jatropha curcas L. [14] and Cuminum cyminum L. [15]. Thus, genomic-SSR markers are well suited to detecting genetic integrity.

Pearl millet (Cenchrus americanus (L.) Morrone, syn. Pennisetum glaucum (L.) R. Br., Poaceae) is a staple crop that feeds one-third of the world’s population [16]. It is an annual, warm-season C4 monocot crop that is widely distributed in South Asia [17], sub-Saharan Africa and Latin America [18]. Owing to its abundant biomass, high stress resistance and excellent palatability, it is regarded as an ideal food for poor people in arid districts of Africa, India and other semi-arid tropical regions [19, 20]. As green fodder, it has a high content of protein, iron, zinc, calcium and other minerals and a low hydrocyanic acid content, and it is used as a quality forage crop in South China, Korea and South America [21]. Although it has such high economic and ecological value, few studies have addressed pearl millet, especially its genetic integrity.

When markers are used to analyze the genetic integrity of plants, significant differences can be observed among different sample sizes [22]. Theoretically, an infinite sample size would cover all genetic diversity, but constraints such as experimental cost, climate and seed production make this impossible [23]. Conversely, a small sample cannot cover all the genetic diversity of a population and is unrepresentative [24]. It is therefore essential to select an optimal sample size that reduces both error and subsequent workload. This research has two aims: (i) to determine the optimal sample size of pearl millet for molecular study based on genomic-SSR markers; and (ii) to analyze the effects of seed aging on genetic integrity, which can help identify the critical germination rate at which pearl millet germplasm resources should be updated.

Materials and methods

Plant materials

The seeds of pearl millet cultivar “Tifleaf 3” (open-pollinated cultivar) were provided by Beijing Mammoth Seed Company (Beijing, China) and stored at 4 °C.

Plants used to determine the optimal sample volume were maintained in a growth chamber (Wenjiang, Sichuan, China) with a photoperiod of 16 h/8 h (day/night) at 26 °C/22 °C (day/night) [25] and 80% relative humidity (RH). After 35 days, 96 plants were selected randomly and young leaves of each were collected for DNA extraction.

Accelerated aging test and germination test

A pilot experiment was performed before the accelerated aging test to identify the most suitable conditions (for instance, temperature, humidity and aging gradient) for obtaining seeds with different germination rates. The artificial accelerated aging treatment was based on the results of this pilot test and on Delouche’s method [7]. Seeds were artificially aged in an accelerated aging chamber (Top instrument, Zhejiang, China) at 45 °C and 99% RH for 4 h, 8 h, 12 h, 16 h, 20 h and 24 h. Seeds that were not subjected to the accelerated aging treatment served as the control group.

The germination test followed the method recommended by the Rules of Seed Testing for Forage, Turfgrass and other Herbaceous Plant—Species and Variety Testing [26]. Three replicates of 75 seeds were sown in petri dishes (d = 9 cm, Biosharp, Hefei, China) on blotting paper wetted with distilled water and kept in a plant incubator (Wenjiang, Sichuan, China) at 26 °C/22 °C (day/night) with a photoperiod of 16 h/8 h (day/night). The germination rate was recorded on the 14th day (a seed was counted as germinated when the radicle was as long as the seed) [27]. All seedlings were then transplanted to the experimental field of the College of Animal Science and Technology, Sichuan Agricultural University (30° 37′ N, 103° 40′ E, Chongzhou, Sichuan, China) [28]. After 25 days, fresh leaves of each plant were collected for DNA extraction.
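The germination rate of a treatment can be summarized as the mean and standard deviation across the three replicates of 75 seeds. The following is a minimal Python sketch of that arithmetic. The function name and the replicate counts are hypothetical and are not data from this study, and the use of the sample standard deviation is an assumption, since the text does not state which form was used.

```python
from statistics import mean, stdev

SEEDS_PER_REPLICATE = 75  # replicate size stated in the germination test


def germination_summary(germinated_counts, seeds_per_replicate=SEEDS_PER_REPLICATE):
    """Return (mean %, sample SD %) of the germination rate across replicates."""
    rates = [100.0 * n / seeds_per_replicate for n in germinated_counts]
    # Assumption: sample standard deviation (n - 1 in the denominator).
    return mean(rates), stdev(rates)


# Hypothetical counts of germinated seeds in three replicate dishes.
example_counts = [60, 63, 57]
avg, sd = germination_summary(example_counts)
print(f"germination rate: {avg:.2f}% (SD {sd:.2f}%)")
```

Keeping the per-replicate rates rather than pooling all 225 seeds preserves the replicate-level variation that the error bars in a germination plot are meant to show.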

DNA extraction

Genomic DNA of pearl millet cultivar “Tifleaf 3” was extracted from fresh leaves with a genomic DNA extraction kit (Tiangen Biochemical Technology Co., Ltd., Beijing, China) [29]. DNA quality and quantity were assessed by 1% agarose gel electrophoresis (1% AGE) and a NanoDrop 2000 spectrophotometer (Thermo Fisher, Wilmington, United States), respectively. Qualified DNA was diluted to 20 ng/μL as the template for amplification and stored at −20 °C until required.
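Diluting a stock to the 20 ng/μL working concentration follows the usual relation C1 × V1 = C2 × V2. The sketch below illustrates that calculation only. The function name, the stock concentration and the final volume are hypothetical values chosen for the example; the text does not report them.

```python
def dilution_volumes(stock_ng_per_ul, target_ng_per_ul=20.0, final_volume_ul=100.0):
    """Return (stock volume, diluent volume) in microlitres using C1*V1 = C2*V2."""
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is already more dilute than the target concentration")
    stock_ul = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return stock_ul, final_volume_ul - stock_ul


# Hypothetical spectrophotometer reading of 150 ng/uL, diluted into 100 uL.
stock_ul, diluent_ul = dilution_volumes(150.0)
print(f"stock: {stock_ul:.1f} uL, diluent: {diluent_ul:.1f} uL")
```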

Genomic-SSR PCR amplification and detection

Sequence information for the 20 pairs of genomic-SSR markers was obtained from Wang’s previously published research (Supplementary Table 1) [30]. Each 15 μL polymerase chain reaction (PCR) contained 1.5 μL DNA template (20 ng/μL), 7.5 μL 2× Reaction Mix, 0.3 μL Golden DNA Polymerase (Tiangen Biochemical Technology Co., Ltd., Beijing, China), 0.6 μL forward and reverse primers (10 pmol/mL, synthesized by Shanghai Shenggong Biological Engineering Technology Services Ltd., Shanghai, China) and ddH2O to make up the volume [31]. The PCR program was as follows: pre-denaturation at 94 °C for 5 min; 35 cycles of denaturation at 94 °C for 30 s, annealing at 57–60 °C for 45 s and extension at 72 °C for 1 min; and a final extension at 72 °C for 7 min [32]. The amplified DNA products were stored at 4 °C. Amplification was performed twice to ensure that there were no missing data, and the PCR products were inspected by 8% polyacrylamide gel electrophoresis (8% PAGE). For each gel, 4 μL of sample DNA (20 ng/μL) and 4 μL of 50 bp Marker (Tiangen Biochemical Technology Co., Ltd., Beijing, China) were loaded, electrophoresis was run at 350 V for 90 min, and the gel was silver stained for 15 min and photographed.

Statistical analysis

The electrophoresis results were compared and revised manually. Present bands were scored as ‘1’ and absent bands as ‘0’ to establish the data matrix [33, 34] in Excel 2016 (each column represented a sample).
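The 0/1 scoring described above can be represented as a matrix with one row per band and one column per sample. The sketch below builds such a matrix from manually scored gels. The function name, sample names and band labels are hypothetical and serve only as illustration.

```python
def build_band_matrix(scored_bands, band_ids):
    """Return (sample_names, matrix): one row per band, one column per sample."""
    sample_names = list(scored_bands)
    matrix = [
        [1 if band in scored_bands[name] else 0 for name in sample_names]
        for band in band_ids
    ]
    return sample_names, matrix


# Hypothetical scoring: sample name -> set of band labels seen on the gel.
scored = {
    "plant_01": {"P1_150", "P1_200"},
    "plant_02": {"P1_150"},
    "plant_03": {"P1_200", "P2_300"},
}
bands = ["P1_150", "P1_200", "P2_300"]
names, matrix = build_band_matrix(scored, bands)
for band, row in zip(bands, matrix):
    print(band, row)
```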

Analysis of the detection of optimal sample amount

Columns of the 96-sample data matrix were drawn at random to generate submatrices that simulated populations with sample sizes of 15, 30, 45, 60, 75, 90 and 95. This process was repeated 1000 times using Python 3.6. From the simulated submatrices, the percentage of polymorphic bands (PPB, Eq. 1), the effective number of alleles (Ne, Eq. 2), Nei’s gene diversity index (H, Eq. 3) [35] and Shannon’s information index (I, Eq. 4) [36] were calculated with Popgene version 1.32 [37]. Finally, significance was tested by one-way analysis of variance (one-way ANOVA) using SPSS 19. The equations for the genetic diversity parameters are given below.

$$PPB=\frac{N}{M}\times 100\%$$
(1)
$$Ne=\frac{1}{\sum_{i=1}^{k}p_i^{2}}$$
(2)
$$H=1-\sum_{i=1}^{k}p_i^{2}$$
(3)
$$I=-\sum_{i=1}^{k}\left(p_i\times \ln p_i\right)$$
(4)

N is the number of polymorphic bands, M is the total number of bands, k is the number of alleles and p_i is the frequency of the i-th allele.
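The resampling procedure and Eqs. 1 to 4 can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the script or the Popgene computation used in the study. It treats each band as a locus with two states whose frequencies are the observed presence and absence frequencies, whereas Popgene may estimate allele frequencies for dominant markers differently. All function names and the demonstration matrix are hypothetical.

```python
import math
import random


def diversity_indices(matrix):
    """matrix: one row per band, one 0/1 entry per sample.

    Returns (PPB in %, mean Ne, mean H, mean I) over all bands.
    Assumption: each band is a two-state locus with frequencies p and 1 - p.
    """
    n_loci = len(matrix)
    polymorphic = 0
    ne_sum = h_sum = i_sum = 0.0
    for row in matrix:
        p = sum(row) / len(row)  # frequency of band presence
        freqs = [f for f in (p, 1.0 - p) if f > 0.0]
        if len(freqs) == 2:  # both states observed, so the band is polymorphic
            polymorphic += 1
        sum_sq = sum(f * f for f in freqs)
        ne_sum += 1.0 / sum_sq                          # Eq. 2
        h_sum += 1.0 - sum_sq                           # Eq. 3
        i_sum += -sum(f * math.log(f) for f in freqs)   # Eq. 4
    ppb = 100.0 * polymorphic / n_loci                  # Eq. 1
    return ppb, ne_sum / n_loci, h_sum / n_loci, i_sum / n_loci


def subsample_columns(matrix, size, rng):
    """Draw `size` sample columns at random without replacement."""
    cols = rng.sample(range(len(matrix[0])), size)
    return [[row[c] for c in cols] for row in matrix]


def simulate(matrix, sizes, repeats=1000, seed=0):
    """Average the four indices over `repeats` random submatrices per size."""
    rng = random.Random(seed)
    results = {}
    for size in sizes:
        runs = [
            diversity_indices(subsample_columns(matrix, size, rng))
            for _ in range(repeats)
        ]
        results[size] = tuple(sum(r[k] for r in runs) / repeats for k in range(4))
    return results


# Hypothetical demonstration matrix: 20 bands x 96 samples of random scores.
demo_rng = random.Random(42)
demo = [[demo_rng.randint(0, 1) for _ in range(96)] for _ in range(20)]
for size, (ppb, ne, h, i) in simulate(demo, [15, 60, 95], repeats=50).items():
    print(size, round(ppb, 2), round(ne, 4), round(h, 4), round(i, 4))
```

Fixing the random seed makes the 1000 draws reproducible, which helps when the per-size means are later compared by ANOVA.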

Analysis of the genetic integrity

Based on the data matrix from the accelerated aging test, PPB (Eq. 1), the number of alleles (Na), H (Eq. 3) [35] and I (Eq. 4) [36] were calculated using Popgene version 1.32 [37]. Significant differences among aging levels were tested in SPSS 19 using Duncan’s Multiple Comparisons (DMC). Finally, genetic similarity among treatments was analyzed by the unweighted pair-group method with arithmetic mean (UPGMA) based on simple matching (SM) similarity matrices, using the NTSYS-PC 2.10 software package [38].
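The clustering step can be illustrated with a small pure-Python sketch of the simple matching coefficient followed by UPGMA (average linkage) on the resulting similarities. It is not the NTSYS-PC procedure itself. It assumes each treatment is represented by a single 0/1 band profile, which the text does not specify, and the labels, profiles and function names are hypothetical.

```python
def simple_matching(a, b):
    """Fraction of positions at which two equal-length 0/1 profiles agree."""
    if len(a) != len(b):
        raise ValueError("profiles must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def upgma(labels, profiles):
    """Average-linkage clustering on SM similarity.

    Returns a list of merges: (members_a, members_b, average similarity).
    """
    n = len(labels)
    sim = {
        (i, j): simple_matching(profiles[i], profiles[j])
        for i in range(n) for j in range(i + 1, n)
    }

    def avg_sim(c1, c2):
        # UPGMA: unweighted mean over all pairs of original members.
        pairs = [sim[(min(i, j), max(i, j))] for i in c1 for j in c2]
        return sum(pairs) / len(pairs)

    clusters = [(i,) for i in range(n)]  # each cluster is a tuple of indices
    merges = []
    while len(clusters) > 1:
        s, c1, c2 = max(
            ((avg_sim(a, b), a, b)
             for k, a in enumerate(clusters) for b in clusters[k + 1:]),
            key=lambda t: t[0],
        )
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
        merges.append(
            (tuple(labels[i] for i in c1), tuple(labels[i] for i in c2), s)
        )
    return merges


# Hypothetical band profiles for three treatments.
labels = ["0 h", "8 h", "20 h"]
profiles = [[1, 1, 0, 0], [1, 1, 0, 1], [0, 0, 1, 1]]
for left, right, s in upgma(labels, profiles):
    print(left, "+", right, "at similarity", s)
```

Because the most similar pair is merged first, the similarity values in the merge list decrease from the tips of the dendrogram towards its root.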

Results

Ascertainment of optimal sample volume

In this study, 96 individual plants were amplified with 20 pairs of genomic-SSR primers and a total of 84 bands were detected, 79 of which were polymorphic, an average of 3.95 polymorphic bands per primer pair. As the sample size increased, the percentage of polymorphic bands (PPB) and the three genetic diversity indexes increased as well. The effective number of alleles (Ne) rose from 1.6554 to 1.7302, with decreasing slope and standard deviation.

The trends of Nei’s gene diversity index (H), Shannon’s information index (I) and PPB were similar to that of Ne (Fig. 1). As the sample size varied from 15 to 95, H ranged from 0.3719 to 0.4008, I from 0.5374 to 0.5780 and PPB from 87.7045% to 94.0476% (Supplementary Table 2). Across the four genetic diversity indexes, there were distinct differences among the sample sizes of 15, 30, 45 and 60, whereas there were no significant differences among the sample sizes of 60, 75, 90 and 95, which indicated that the minimum sample volume representing the population was 60. Thus, the optimal sample volume of pearl millet for molecular research was 60.

Fig. 1

Relationship between four genetic indexes and sample size of pearl millet “Tifleaf 3”. a Relationship between the percentage of polymorphic bands (PPB) and sample size, b relationship between the effective number of alleles (Ne) and sample size, c relationship between Nei’s gene diversity index (H) and sample size, d relationship between Shannon’s information index (I) and sample size. Data are expressed as means ± SD (black bars) and different letters indicate significant differences at the 0.05 level

Effects of seeds aging on genetic integrity

Analysis of seeds aging on germination and polymorphism

After the seeds were treated at 45 °C and 99% relative humidity (RH) for 4 h, 8 h, 12 h, 16 h, 20 h and 24 h, the germination rate declined as the aging level increased (Fig. 2, Supplementary Table 3). Significant differences from the control group (0 h) were observed at germination rates of 68.00% (4 h) and 68.23% (8 h), while the differences between 4 h and 8 h, between 12 h and 16 h, and between 20 h and 24 h were not significant. Seeds aged for 0 h (80.8%), 8 h (68.23%), 12 h (57.80%), 16 h (52.90%) and 20 h (39.10%) were selected to evaluate genetic integrity, because the 4 h (68.00%) and 24 h (40.87%) treatments had higher standard deviations.

Fig. 2

Germination rate of pearl millet “Tifleaf 3” at seven different aging levels (0 h, 4 h, 8 h, 12 h, 16 h, 20 h and 24 h). Data are expressed as means ± SD (black bars) of three replicates, and different letters indicate significant differences at the 0.05 level

The five selected groups were evaluated for genetic integrity with a sample size of 60. For the unaged material, 187 bands were amplified by the 20 pairs of primers, 186 (99.47%) of which were polymorphic. For the aged groups, the numbers of bands generated by the 20 pairs of primers were 138 (8 h), 115 (12 h), 139 (16 h) and 131 (20 h). The numbers and percentages of polymorphic bands were 137 (99.28%, 8 h), 110 (95.65%, 12 h), 128 (92.09%, 16 h) and 119 (90.84%, 20 h), respectively. Thus, PPB decreased with seed aging. The results for the four genetic diversity indexes are presented in Table 1. All four indexes were highest at 0 h. A significant decrease in Ne relative to the control group was first observed at 8 h, whereas for the number of alleles (Na) it was first observed at 16 h. I and H showed similar patterns relative to the control group. In general, all of the polymorphism parameters declined as the germination rate decreased. However, the values of Ne, H and I at 12 h were abnormally higher than those at 8 h.

Table 1 Four genetic diversity parameters of pearl millet “Tifleaf 3” in five different aging levels (0 h, 8 h, 12 h, 16 h, 20 h)

Analysis of seeds aging on genetic similarity

The genetic similarity coefficients of the five selected populations were calculated and clustered using the UPGMA approach (Supplementary Table 4). The average genetic similarity coefficient was 0.8140, and at this value the five treatment groups could be divided into two major groups (Fig. 3). The control group formed one cluster, while the 8 h, 12 h, 16 h and 20 h groups formed the other.

Fig. 3

UPGMA dendrogram of pearl millet “Tifleaf 3” in five different aging levels

Discussion

Aging and deterioration of seeds are inevitable during storage [39]. To ensure that the genetic integrity of seeds is preserved, it is necessary to study the genetic integrity of germplasm resources. Sampling strategy is the first problem in genetic integrity research, and developing effective sampling strategies is of great importance for evaluating the genetic diversity of plant populations [40]. Mixed sampling and single plant sampling are the two major sampling methods [41]. Mixed sampling saves time, but the more samples are mixed, the greater the deviation and the lower the reliability. The main disadvantages of single plant sampling are the time and workload it requires, but it can be used to analyze genetic variation between or within populations, which makes it comprehensive and reliable [42].

In this experiment, the single plant sampling method and four genetic diversity parameters (the percentage of polymorphic bands, PPB; the effective number of alleles, Ne; Nei’s gene diversity index, H; and Shannon’s information index, I) were used to determine the optimal sample size of pearl millet for molecular research. To make the results reliable, the selection process was repeated 1000 times. The results showed that once the sample size reached 60, the four genetic diversity parameters did not differ significantly from those at larger sample sizes. To find the minimum sample volume that could represent the population, sample sizes of 50 and 55 were added in subsequent experiments, and both differed significantly from sample sizes of 60 and above. Therefore, 60 would be the optimal sample volume for studying the genetic integrity of pearl millet. In other studies of the optimal sample size of plants for molecular research, the best sample volume was 30 for Astragalus sinicus L. [22], 20 for annual ryegrass (Lolium multiflorum Lam.) [43], 26 for Psathyrostachys Huashanica Keng ex P. C. Kuo [44], and 70 and 50 for A. nuda L. [45]. This indicates that different plants have different biological characteristics, so that the optimal sample volume usually differs. Compared with self-pollinated and asexual plants, cross-pollinated plants have high gene heterozygosity and individuals within populations differ from one another. Thus, they often have a smaller sample size than self-pollinated and asexual plants in genetic diversity analysis [46].

In this study, seeds of pearl millet were artificially aged and analyzed with the optimal sample volume of 60. In the germination test, the germination rate decreased as the aging level increased, and at 68.23% it was significantly lower than that of the unaged group. The values of Ne, the number of alleles (Na), H and I followed the same trend as the germination rate. However, the values of Ne, H and I at 12 h were abnormally higher than those at 8 h. This irregularity might arise because the pearl millet used in this study is an open-pollinated cultivar, because the genomic-SSR markers label only a small number of base pairs, or because the markers are unevenly distributed.

Because seed aging can cause the loss of genetic integrity and a reduction in yield, it is necessary to identify the critical germination rate so that germplasm resources can be updated in time. At present, this critical germination rate differs around the world. For example, in India, germplasm resources are renewed at a germination rate of 75% [47], while in Britain the threshold is 70% [48]. The United States does not update germplasm resources until the germination rate falls to 50% [49]. In China, the standard for regenerating germplasm resources is 60% according to the government documents issued by the national forage seed bank [50]. In this research, the highest germination rate that differed significantly from the control group was 68.23%, which indicated that the germplasm resources of pearl millet should be updated when the germination rate decreases to 68.23%. Although 68.23% is higher than the Chinese standard of 60%, the genetic integrity of pearl millet might already have changed at an even higher germination rate. We therefore still need to develop better molecular markers, refine the artificially accelerated aging gradient, use more advanced methods and analyze the genetic diversity parameters further to obtain a precise critical point for renewing the germplasm resources of pearl millet.