Introduction

There are many types of apples and thousands of apple varieties spread over different parts of the world (Daler et al. 2017). East Asia, Central Asia, West Asia-Europe and North America have been reported as the origin centers of apples in the world (Janick et al. 1996).

Apple genetic resources in Central Asia are critically important. Apple populations of Central Asia offer an important gene pool that can help solve outstanding problems, including resistance to biotic and abiotic stress conditions, fruit quality characteristics and tree growth form (Forsline et al. 2003). Kyrgyzstan is an important center of apple origin in the Central Asian region and hosts significant natural apple populations. However, intense human-induced losses of these genetic resources have been reported (Dzunusova et al. 2008).

Genetic markers have greatly facilitated breeding programs. Various marker techniques are used in phylogenetic and genetic diversity analyses. Morphological, biochemical and molecular markers are used for genetic characterization of plant species, and molecular markers are the most widely used because of their prominent advantages. Simple sequence repeat (SSR), inter-simple sequence repeat (ISSR), sequence-related amplified polymorphism (SRAP) and random-amplified polymorphic DNA (RAPD) techniques have mostly been used for molecular characterization of apples. Genotypes with good agronomic traits are selected with the use of genetic diversity (Rafalski 2010). Beyond genotyping and genetic diversity analyses, identifying markers for important agronomic traits also significantly facilitates breeding studies. Detecting DNA markers associated with genes that control important characters is known as quantitative trait locus (QTL) mapping. Linkage maps are generated to determine these regions, and a special population needs to be developed to create them (Rafalski 2010; Coşkun 2019).

Although genetic mapping studies have gained momentum worldwide, they generally rely on populations of F1 individuals obtained through crossbreeding. Obtaining and maintaining these populations is slow and costly, especially given problems such as sexual incompatibility and the long juvenile sterility, which lasts for 4–7 years in apples. With association mapping (AM), it is possible to relate the phenotypic and genotypic characteristics of independent individuals using molecular markers. In this study, 120 genotypes, selected from Kyrgyzstan genetic resources and including four commercially important cultivars in Turkey, were used for mapping of some morphological traits (plant height and shoot hairiness at previous year’s shoots), phenological traits (mouse ear stage and defoliation stage) and pomological traits (fruit general shape and fruit flesh color) with the use of different markers (SSR, ISSR, SRAP and RAPD). The primary objective was to determine the population structure and genetic diversity by sampling the genome at as many different points as possible with different marker techniques. Instead of developing a long-term and costly mapping population, an existing natural population of 120 apple genotypes was used to detect molecular markers associated with loci that control important agricultural traits of apples. The findings are expected to contribute to apple breeding programs worldwide.

Materials and Methods

Plant Material

A total of 120 apple genotypes were selected for this study. The selected materials included four apple species: Malus kirghisorum (37), M. sieversii (20), M. domestica (59) and M. niedzwetzkyana (4). The M. domestica genotypes included commercially important varieties such as ‘Royal Gala’, ‘Fuji’, ‘Pink Lady’, ‘Granny Smith’, ‘Hüryemez’ and ‘Amasya’ (Table 1).

Table 1 Apple genotypes used for the present study

Phenotyping

Two morphological traits (plant height and shoot hairiness at previous year’s shoots), two phenological traits (mouse ear stage and defoliation stage) and two pomological traits (fruit general shape and fruit flesh color) were determined under Kayseri, Turkey conditions according to the UPOV apple descriptor, using three trees from each genotype and 20 fruits for the pomological traits.

DNA Extraction, Marker Analysis and Genotyping Assay

Apple genomic DNA was isolated from young leaves collected from 120 genotypes, using the modified CTAB method as reported by Doyle and Doyle (1990).

SSR Analysis

Genotypes were screened with 50 SSR primers, and the 10 primers yielding high-quality bands were used for PCR on the full set of genotypes (Table 2).

Table 2 Simple sequence repeat primers and sequences

SRAP Analysis

For SRAP analysis, 208 SRAP primer combinations were screened for polymorphism and the 13 combinations that gave polymorphic bands were selected (Table 3). All genotypes were tested with these 13 SRAP primer combinations. Bands were scored as dominant markers. PCR amplification was conducted as reported by Uzun et al. (2009).

Table 3 Sequence-related amplified polymorphism primers and sequences

RAPD Analysis

For RAPD analysis, 40 different 10-mer RAPD primers were screened, with polymerase chain reaction (PCR) amplification performed as reported by Williams et al. (1990). Of the 40 RAPD primers, the 15 that were polymorphic were selected and all genotypes were tested with them (Table 4). The genotypes with the highest polymorphic results were selected using these polymorphic primers. Bands were scored as dominant markers.

Table 4 Random-amplified polymorphic DNA primers and sequences

ISSR Analysis

For ISSR analysis, 20 primers were used to screen for polymorphism and eight primers were selected. Genotypes were tested with these 10 ISSR primers (Table 5). They were scored as the dominant marker. PCR amplification was conducted as reported by Uzun et al. (2010).

Electrophoresis and Gel Imaging

About 3 µl of loading buffer was added to the PCR products, which were loaded on 2% agarose gels for ISSR, SRAP and RAPD and run at 110 V for 3.5 h. For SSR markers, PCR products were run on 3% high-resolution agarose gels at 90 V for 4 h.
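The dominant scoring applied to the SRAP, RAPD and ISSR bands can be illustrated with a minimal Python sketch that turns the band sizes read from a gel into a presence (1) or absence (0) matrix. The function name `score_dominant`, the genotype labels and the band sizes are hypothetical, and the `primer.size` naming of the columns is an assumption made only for illustration.

```python
def score_dominant(primer, bands_by_genotype):
    """Return (marker_names, matrix) with 1 = band present, 0 = band absent.

    bands_by_genotype maps a genotype label to the set of band sizes (bp)
    observed for one primer. All values used below are illustrative only.
    """
    # Every band size seen in at least one genotype becomes one marker column.
    sizes = sorted({s for bands in bands_by_genotype.values() for s in bands})
    names = [f"{primer}.{size}" for size in sizes]
    matrix = {
        genotype: [1 if size in bands else 0 for size in sizes]
        for genotype, bands in bands_by_genotype.items()
    }
    return names, matrix


# Hypothetical gel readings for one RAPD primer and three genotypes.
example = {
    "G1": {950, 1200},
    "G2": {1200},
    "G3": {950, 1050, 1200},
}
names, matrix = score_dominant("OPBG5", example)
```

Each column of such a matrix is one polymorphic band, which is the unit tested in the LD and association analyses.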

Linkage Disequilibrium Analysis

The linkage disequilibrium (LD) levels between loci were obtained from the combined marker data using TASSEL 5.2 software. Analysis was conducted after removal of the loci with low allele frequency (f < 0.10).
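A frequency filter of this kind can be sketched in a few lines of Python. The sketch assumes that f refers to the frequency of the rarer state (band present or absent) at a dominant locus, which is an interpretation rather than something stated in the text. The function name `filter_low_frequency` and the toy data are hypothetical, and the filtering performed inside TASSEL may differ in detail.

```python
def filter_low_frequency(names, rows, min_freq=0.10):
    """Keep loci whose rarer state (0 or 1) has frequency >= min_freq.

    names: list of locus names; rows: list of 0/1 lists, one per genotype.
    Assumption: f is the frequency of the rarer state at each locus.
    """
    n = len(rows)
    keep = []
    for j in range(len(names)):
        f = sum(row[j] for row in rows) / n  # band frequency at locus j
        if min(f, 1.0 - f) >= min_freq:
            keep.append(j)
    kept_names = [names[j] for j in keep]
    kept_rows = [[row[j] for j in keep] for row in rows]
    return kept_names, kept_rows


# Toy data: 20 genotypes, three loci.
# "A" is present in 10 genotypes, "B" in 1 (rare band), "C" in all 20.
loci = ["A", "B", "C"]
data = [[1 if i < 10 else 0, 1 if i < 1 else 0, 1] for i in range(20)]
kept_loci, kept_data = filter_low_frequency(loci, data)
```

In this toy example only locus "A" is retained, because "B" is too rare and "C" is monomorphic.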

Association Mapping

The population structure of the 120 genotypes was estimated using the model-based (Bayesian) clustering software STRUCTURE 2.3.4 (Pritchard et al. 2000). STRUCTURE was run under the ‘admixture model’ with a burn-in period of 10,000 iterations followed by 100,000 Markov chain Monte Carlo (MCMC) replications. Five independent runs were performed for each number of clusters (K), with K varying from 1 to 10. An ad-hoc measure, ΔK, based on the relative rate of change in the likelihood of the data between successive K values, was used to determine the optimal number of clusters (Evanno et al. 2005). Marker-trait association was estimated using the general linear model (GLM) of the TASSEL 5.2 software package (Bradbury et al. 2007), in which the percentage of admixture of each accession (Q matrix) was used as a covariate to conduct the regression between phenotypic variation and markers.
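The ΔK criterion can be sketched as follows. This is a minimal illustration using the common formulation in which the absolute second-order difference of the mean ln P(D) across successive K values is divided by the standard deviation of ln P(D) among runs at that K. The function name `evanno_delta_k` and the log-likelihood values are hypothetical and are not results of this study.

```python
from statistics import mean, stdev


def evanno_delta_k(ln_prob):
    """Return {K: Delta K} from {K: [ln P(D) of each independent run]}.

    Delta K = |L(K+1) - 2 L(K) + L(K-1)| / sd(L(K)), where L is the mean
    ln P(D) over runs. It is undefined at the smallest and largest K.
    """
    means = {k: mean(v) for k, v in ln_prob.items()}
    delta = {}
    for k in sorted(ln_prob):
        if k - 1 not in means or k + 1 not in means:
            continue
        sd = stdev(ln_prob[k])
        if sd == 0:
            continue  # Delta K cannot be computed without run-to-run variation
        delta[k] = abs(means[k + 1] - 2 * means[k] + means[k - 1]) / sd
    return delta


# Hypothetical ln P(D) values: five runs for each K from 1 to 5.
runs = {
    1: [-5000, -5001, -4999, -5000, -5000],
    2: [-4600, -4602, -4598, -4601, -4599],
    3: [-4200, -4201, -4199, -4200, -4200],
    4: [-4190, -4180, -4200, -4185, -4195],
    5: [-4185, -4170, -4200, -4190, -4175],
}
delta_k = evanno_delta_k(runs)
best_k = max(delta_k, key=delta_k.get)
```

With these made-up values the likelihood stops improving sharply after K = 3, so ΔK peaks there.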

Results

Phenotyping

Tree heights of the investigated genotypes ranged from 84.00 to 444.33 cm. The average tree height of the 120 genotypes was 249.91 cm with a variance value of 0.15. The tallest tree was genotype 17 of M. kirghisorum and the shortest was genotype 99 of M. sieversii (Fig. 1).

Fig. 1 Plant height trait of 120 genotypes

In terms of hairiness of annual shoots, 55% of the 120 genotypes had “none or slight”, 23% had “moderate”, 12% had “intense” and 10% had “highly intense” hairiness. The mouse ear stage, in which the leaves become visible but remain clustered, was seen earliest at the end of March in genotype 27 of M. kirghisorum and latest at the end of April in genotypes 104 and 105 of M. sieversii (Fig. 2).

Fig. 2 Mouse ear stage of 120 genotypes

Defoliation of the trees included in this study was completed between the first and last weeks of November. Defoliation was completed first in M. kirghisorum, followed by M. niedzwetzkyana, while M. sieversii defoliated in both the first and middle periods, and M. domestica was the last to defoliate (Fig. 3).

Fig. 3 Defoliation stage of 120 genotypes

In terms of general fruit shape, M. kirghisorum, M. niedzwetzkyana and M. domestica generally had spherical fruit, whereas M. sieversii had oblate fruit. For fruit flesh color, M. domestica and M. kirghisorum had predominantly white flesh, M. sieversii had yellowish flesh and M. niedzwetzkyana had reddish flesh (Fig. 4).

Fig. 4 Fruit flesh color of some apple genotypes

Linkage Disequilibrium Analysis

LD analysis with TASSEL 5.2 software showed that, of the 414 markers for which some level of LD was detected, 118 (28.5%) were in significant LD at the 0.05 level. D is the difference between the observed frequency of a two-locus allele combination and the frequency expected if the two loci were independent, and D′ is this difference scaled by its maximum possible value. The average LD value (D′) among loci showing significant LD was 0.58. The LD blocks obtained for the 414 markers are shown in Fig. 5 as a “heat map” and the markers associated with the traits are provided in Table 6.
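To make the D′ statistic concrete, the following minimal Python sketch computes Lewontin's |D′| for a pair of loci coded 0/1. It treats band absence and presence as the two alleles of a haploid locus, which is a simplifying assumption for dominant markers and not necessarily how TASSEL handles such data. The function name `d_prime` and the example vectors are hypothetical.

```python
def d_prime(a, b):
    """Lewontin's |D'| between two loci coded 0/1 across the same individuals.

    Simplifying assumption: each 0/1 score is treated as a haploid allele,
    so the frequency of the (1, 1) combination is observed directly.
    """
    n = len(a)
    p_a = sum(a) / n
    p_b = sum(b) / n
    p_ab = sum(1 for x, y in zip(a, b) if x == 1 and y == 1) / n
    d = p_ab - p_a * p_b  # observed minus expected under independence
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        raise ValueError("D' is undefined when a locus is monomorphic")
    return abs(d) / d_max


# Hypothetical scores for four individuals.
coupled = d_prime([1, 1, 0, 0], [1, 1, 0, 0])      # bands always co-occur
independent = d_prime([1, 1, 0, 0], [1, 0, 1, 0])  # no association
repulsion = d_prime([1, 1, 0, 0], [0, 0, 1, 1])    # bands never co-occur
```

Complete association in either direction gives |D′| = 1, and statistical independence gives 0.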

Fig. 5 Linkage disequilibrium measurement (values above the diagonal) and probability values (P, values below the diagonal)

Table 5 Inter-simple sequence repeat primers and base sequences
Table 6 Association mapping results for investigated traits by general linear model (p < 0.01)

Association Mapping

For association mapping, six traits of the 120 apple genotypes were used. Association analyses were performed with genotypic, phenotypic and Q matrix data using the GLM procedure of TASSEL 5.2 software with 10,000 permutations. Based on the Q-Q plots for the association of the 414 polymorphic DNA bands with the six traits, the GLM (Q) method gave the best results for all six traits. In total, 38 marker-trait associations were identified at p < 0.01. For morphological traits, three markers (OPBG5.1200, OPBH20.1300, Em4Me5.1300) were found to be associated with tree height and three markers (Em2Me4.600, CAC6.1200, AG7YC.750) with shoot hairiness. For phenological traits, 10 markers (OPBH13.550, Em12Me3.1300, TCC5RY.1050, OPBH13.1400, OPBG2.750, Em3Me4.2400, AG7YC.800, OPBG5.1050, OPBG5.950, Em8Me4.1000) were found to be associated with mouse ear stage and eight markers (AG7YC.800, Em7Me2.200, Em2Me4.225, TCC5RY.900, CAC6.1200, DBD_ACAX7.1000, Em6Me4.625, Em2Me5.275) with defoliation stage. For pomological traits, 10 markers (Em4Me5.1300, OPBH15.1500, OPBH15.1400, Em8Me4.1000, Em6Me6.700, OPBH10.1050, Em3Me4.950, OPBG17.700, DBD_ACAX7.350, Em11Me6.1250) were found to be associated with general fruit shape and four markers (TCC5RY.250, AG7YC.1050, OPBH19.450, CH05g11.350) with fruit flesh color (Table 6).
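The logic of a GLM (Q) test for one marker can be sketched in plain Python: a reduced model with the intercept and Q covariates is compared with a full model that also contains the marker, and the drop in residual sum of squares gives an F statistic. This is only an illustration under stated assumptions. The helper names (`solve`, `rss`, `marker_f`) and all data are hypothetical, one Q column is omitted to avoid collinearity with the intercept, and the p-values and permutation tests reported by TASSEL are not reproduced.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def rss(X, y):
    """Residual sum of squares of the least-squares fit of y on the columns of X."""
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, yi in zip(X, y))


def marker_f(y, q, marker):
    """F statistic (1 numerator df) for adding a 0/1 marker to y ~ 1 + Q."""
    reduced = [[1.0] + list(qi) for qi in q]
    full = [row + [float(m)] for row, m in zip(reduced, marker)]
    rss0, rss1 = rss(reduced, y), rss(full, y)
    df_resid = len(y) - len(full[0])
    return (rss0 - rss1) / (rss1 / df_resid)


# Hypothetical data: 8 genotypes, K = 2 (a single Q column is used).
q = [[0.9], [0.8], [0.7], [0.6], [0.4], [0.3], [0.2], [0.1]]
marker = [1, 0, 1, 0, 1, 0, 1, 0]        # marker with a simulated effect
null_marker = [1, 1, 0, 0, 1, 1, 0, 0]   # marker without a simulated effect
noise = [0.1, -0.1, -0.1, 0.1, 0.1, -0.1, -0.1, 0.1]
y = [10 + 3 * qi[0] + 2 * m + e for qi, m, e in zip(q, marker, noise)]

f_true = marker_f(y, q, marker)
f_null = marker_f(y, q, null_marker)
```

In this toy example the marker with the simulated effect yields a very large F statistic while the unrelated marker yields one close to zero, which is the contrast a marker-trait association test relies on.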

Discussion

Two approaches, QTL mapping and association mapping, are usually used to detect molecular markers associated with yield, quality and resistance. In contrast to QTL mapping, which is based on bi-parental populations, association mapping is based on LD and uses a sample of lines from the broader breeding population that are unrelated to any specific crossing design (Zhu et al. 2008). Therefore, a higher number of historical recombination events can be explored in natural populations than in bi-parental segregating populations, resulting in higher mapping resolution (Ersoz et al. 2007). Association mapping (AM), also known as linkage disequilibrium (LD) mapping, is a viable approach to overcome the limitations of pedigree-based QTL mapping. In AM, correlations between genotype and phenotype are investigated in unrelated individuals. Unlike QTL mapping, AM takes advantage of both LD and historical recombination present within the gene pool of an organism, thus utilizing a broader reference population. In plants, AM has been used in model species with available genomic resources. Pursuing AM in tree species requires both genotyping and phenotyping of large populations with unique architectures. Recently, genome sequences and genomic resources for forest and fruit crops have become available. Owing to the abundance of single nucleotide polymorphisms (SNPs) within a genome, along with the availability of high-throughput resequencing methods, SNPs can be effectively used for genotyping trees (Khan and Korban 2012).

In the past few decades, several studies identified the QTLs and major genes controlling important horticultural traits, such as fruit quality (Bai et al. 2012; Zhang et al. 2012; Khan et al. 2013; Ma et al. 2016; Qi et al. 2020), disease resistance (Xu and Korban 2002; Belfanti et al. 2004; Le Roux et al. 2010; Liu et al. 2016) and bud dormancy (Brunel et al. 2002; van Dyk et al. 2010; Ren et al. 2016).

Tree architecture plays a key role in collecting light and assimilating carbon and thus affects plant growth and yield (Valladares et al. 2002; Niinemets 2010). Moreover, by modifying plant microclimate, it also determines many characteristics of fruit quality and can affect the development of orchard pests and diseases (Lauri et al. 2008). Previous studies have demonstrated that, across vascular plants, architecture exhibits remarkable regularities (Sussex and Kerk 2001) which are assumed to result from genetic control (Reinhardt and Kuhlemeier 2002). In the present GLM (Q) analysis, three of the 414 markers were associated with tree height at the p < 0.01 level. Of these, two (OPBG5.1200 and OPBH20.1300) were RAPD markers and one (Em4Me5.1300) was an SRAP marker.

Urrestarazu et al. (2017) conducted a genetic mapping study for the flowering and ripening periods of apples and analyzed a total of 1168 apple genotypes from six different regions. They found four important QTLs on LG 3, LG 9, LG 10 and LG 16. Muranty et al. (2015) conducted an association mapping study for flowering time and harvest maturity in a population composed of six different gene sources and 1200 diploid apple genotypes. An important SNP marker was detected for each trait: the marker associated with flowering time was on chromosome 9 and the marker associated with harvest time was on chromosome 3. In this study, testing the 414 markers for association with mouse ear stage identified 10 associated markers: five RAPD (OPBH13.550, OPBH13.1400, OPBG2.750, OPBG5.1050 and OPBG5.950), three SRAP (Em12Me3.1300, Em3Me4.2400 and Em8Me4.1000) and two ISSR (TCC5RY.1050 and AG7YC.800) markers. In addition, eight markers were associated with defoliation stage: four SRAP (Em7Me2.200, Em2Me4.225, Em6Me4.625 and Em2Me5.275) and four ISSR (AG7YC.800, TCC5RY.900, CAC6.1200 and DBD_ACAX7.1000) markers.

Amyotte et al. (2017) aimed to correlate phenotypic and genotypic fruit quality parameters in a study of 85 apple genotypes. They located skin color in an important SNP region on chromosome 9, very close to the MdMYB1 gene, which controls skin color in apples. The skin color locus on chromosome 9 was reported to be significant in GLM analyses but not in kinship analysis. Also, the quantitative fruit quality traits of sensory flavor and sensory texture (such as crispness, juiciness, firmness and skin thickness) in fresh green apples were reported for all loci with repetitive effects, but none was significant in the GLM, and no SNP effect was reported. As in our study, the markers with the highest significance were detected in GLM, but no significant effect was detected for any of them in mixed linear model (MLM) analysis.

Fruit coloration is an important factor in consumer preference. Apple color can be divided into skin color and flesh color. Skin color can be sub-divided by the presence or absence of red pigmentation and by whether the red is striped or blushed. Flesh color is mainly sub-divided into red fleshed and non-red fleshed. Red pigmentation of apple fruit is determined mainly by anthocyanins, which are color-producing secondary metabolites that accumulate in different tissues and organs of plants (Jaakola 2013). The major gene MdMYB1 was the first gene identified to control the presence or absence of red pigmentation in the fruit skin (Takos et al. 2006) and associated molecular markers were explored subsequently (Kumar et al. 2012). More recently, genome-wide association study (GWAS) analysis using SNP markers confirmed the association between fruit color and the MdMYB1 locus on chromosome 9 (Migicovsky et al. 2016; McClure et al. 2019). Comparative genomic analysis of 148 apple populations and a segregating hybrid population revealed that a gypsy-like long terminal repeat retrotransposon (designated redTE) was inserted 3297 bp upstream of MdMYB1, thereby activating the expression of MdMYB1 and controlling the redness of the skin (Zhang et al. 2019). Chagné et al. (2007) identified the Rni locus, a major genetic determinant of red foliage and of red color in the core of apple fruit. In a population segregating for the red flesh and foliage phenotype, inheritance of the Rni locus and DNA polymorphisms of candidate anthocyanin biosynthetic and regulatory genes were determined. SSRs and SNPs in the candidate genes were also located on an apple genetic map. It was shown that the MdMYB10 gene co-segregates with the Rni locus and is on linkage group (LG) 09 of the apple genome.

Red flesh color in apple fruit is a trait desired by consumers and is associated with anthocyanin content, which is mainly controlled by MdMYB10 with an R6 promoter. Yang et al. (2021) constructed a high-density linkage map using the ‘Fuji’ × ‘Red3’ population, whose parents carried the homozygous alleles R1R1 and R6R6, respectively. The map consists of 7630 SNPs along 17 linkage groups, spanning 2270.21 cM, with an average density of 0.30 cM per marker. The cyanidin-3-galactoside concentration was used as the phenotypic data in QTL analysis. One QTL peak, flanked by marker2187260 and marker2173766, was detected with a LOD score of 4.49. This QTL ranged from 0 to 40.79 cM at the top of linkage group 16 (LG16). In addition, one candidate molecular marker (marker2175442) in this QTL was identified, which was significantly correlated with the flesh cyanidin-3-galactoside concentration. These genetic findings enrich the breeding basis of fruit flesh coloration in apple (Yang et al. 2021).

In cucumber, fruit apex-associated QTLs were mapped using 145 F2:3 families and 155 F2:6 populations derived from a cross between cucumbers of different ecotypes. Four major-effect QTLs, Bfal4.1, Bfai4.1, Bfad6.1 and Bfai6.1, were consistently and reliably detected across two environments and explained 11.6–33.6% of the phenotypic variation (R2) in the F2:3 families. Three major-effect QTLs, Ofai4.1 (R2 = 13.4–15.5%), Ofal4.1 (R2 = 10.7–12.8%) and Ofad6.1 (R2 = 11.6–12.4%), were stably detected in the F2:6 population in two experiments. Bfai4.1, Bfal4.1, Ofai4.1 and Ofal4.1 were integrated into the consensus QTL fa4.1, within which 11 candidate genes were predicted. Bfai6.1 and Bfad6.1 were integrated into the consensus QTL fa6.1. QTL interaction analysis showed that Bfai6.1 had an epistatic effect with Bfai4.1 (Zhu et al. 2022). The association of the 414 markers with fruit flesh color was also investigated in the present study. GLM (Q) analysis revealed four associated markers at the p < 0.01 level: two ISSR (TCC5RY.250 and AG7YC.1050), one RAPD (OPBH19.450) and one SSR (CH05g11.350) marker.

Ten markers were associated with the general fruit shape of the genotypes: five SRAP (Em4Me5.1300, Em8Me4.1000, Em6Me6.700, Em3Me4.950 and Em11Me6.1250), four RAPD (OPBH15.1500, OPBH15.1400, OPBH10.1050 and OPBG17.700) and one ISSR (DBD_ACAX7.350) marker.

In melon, an F2 mapping population from the cross ‘Piel de Sapo’ × PI124112 was selectively genotyped to study the genetic control of morphological fruit traits by QTL analysis. In all, 10 QTLs were identified: five for fruit length (FL), two for fruit diameter (FD) and three for fruit shape (FS) (Díaz et al. 2014).

Lee et al. (2020) obtained data for major fruit-related domestication traits (fruit length, width, weight, pericarp thickness and fruit position) using a highly diverse panel of 351 pepper accessions representing the worldwide Capsicum germplasm. Using a genotyping-by-sequencing (GBS) method, they developed 187,966 genome-wide high-quality SNP markers across 230 C. annuum accessions. LD analysis revealed that the average length of the LD blocks was 149 kb. Using GWAS, they identified 111 genes that were linked to 64 significant LD blocks. The GWAS results were cross-validated using 17 fruit-related QTLs, and 16 causal genes thought to be associated with fruit morphology-related domestication traits were identified, with molecular functions such as cell division and expansion. In a study carried out by Bardak et al. (2017) to determine the markers associated with gossypol in cotton by association mapping, phenotypic data of 48 cotton genotypes including number of stem (number/cm2), number of leaf (number/cm2), number of bolls (number/cm2) and number of gossypol glands in seeds (number/mm2) and seed high-performance liquid chromatography (HPLC) analysis data were used. As genotypic data, 83 polymorphic bands obtained by scanning 37 SSR markers among genotypes were used. In GLM, one marker associated with gossypol in the stem, four markers associated with gossypol in the boll, three markers associated with gossypol in the leaf, eight markers associated with gossypol in the seed and four markers associated with the seed HPLC results were determined. In the mixed linear model (MLM), one marker associated with gossypol in the stem, 12 markers associated with gossypol in the boll, three markers associated with gossypol in the leaf and three markers associated with gossypol in the seed were determined. The present findings obtained by association mapping were consistent with the literature.

Conclusion

In this study, morphological, phenological and pomological analyses were carried out on 120 apple genotypes from four different species. Genetic characterization and population structure analyses were carried out with different marker techniques, namely simple sequence repeat (SSR), inter-simple sequence repeat (ISSR), sequence-related amplified polymorphism (SRAP) and random-amplified polymorphic DNA (RAPD), and association mapping was performed for six agronomically important traits using the resultant data. Association mapping is a powerful method for revealing gene-marker relationships, and it allows high-resolution maps to be obtained from existing, diverse natural populations. Determining the effect of genetic variation on the results of association mapping and identifying the candidate genes associated with these traits will contribute to future genetic and breeding studies. Once the identified markers are verified in special populations (F2, BC1, BC, RIL) derived from the same parents, they can be used with consistent results in marker-assisted selection (MAS) studies. The results also indicate that some apple cultivars may be useful sources of genetic variation for further breeding programs.