INTRODUCTION

The Don Valley is a wine-growing region of the South of Russia, recognized as a protected designation of origin in Rostov oblast [1]. Historical and archaeological evidence points to cultivated viticulture and winemaking here in the 5th–3rd centuries BC; the first mentions of the winemaking industry in Russian sources date back to the era of Peter I (1709–1711) [2]. These historical sources raise the central question of viticulture in the region: the origin of the autochthonous grape cultivars that are common here.

Autochthonous, or native, grape varieties belong to a particular territory; having arisen through centuries of selection by people, they are the heritage of a particular nation and culture. Mostly of unknown origin, or derived from wild forms of grapes [3], they appeared as a result of spontaneous hybridization in the pre-phylloxera era, i.e., before the second half of the 19th century, when deliberate breeding and interspecific hybridization within the genus Vitis began [4].

Scientific publications of the 19th and 20th centuries describe 141 varieties of V. vinifera localized in the Don, Kuban, Crimea, and North Caucasus [5–9]. Forty-two of these can be attributed to the Don Valley. According to the main historical sources, they originate from varieties introduced from Hungary and Germany by Peter the Great, as well as from vine cuttings brought back by the Don Cossacks from the 1814 campaign in Europe [10], which the names of a number of varieties (Burgundskiy, Shampanchik, Vengerskiy) appear to attest directly. Another theory holds that varieties were brought from the territory of modern Dagestan during the migration of the Khazars to the Don Valley in the 9th century AD [11]. These theories can be challenged or confirmed only by in-depth studies of the gene pool of the Don Valley varieties.

Currently, most autochthonous varieties have been preserved only in public and private collections; in commercial plantations, they occupy only 2% of the total area of Russian vineyards (96 800 ha as of 2021 according to the Ministry of Agriculture of the Russian Federation) [12]. At the same time, the growing interest of producers and consumers of Russian wines in autochthonous varieties, together with the threat of extinction facing most of them, calls for genetic certification and the creation of a biobank of varieties.

For the first time in the history of Russian science, this process was launched at the Kurchatov Genomic Center, National Research Center “Kurchatov Institute,” at the end of 2020. At the first stage of research, it was fundamentally important to find out whether we were dealing with a unique gene pool or with varieties introduced from Western Europe before the 20th century, as the vast majority of earlier scientific sources indicated. It was also necessary to detect probable duplicates among the varieties, to separate varieties from clones, and to identify probable interspecific hybrids passed off as varieties of V. vinifera.

EXPERIMENTAL

During 2021, we conducted six expeditions to search for plantings of autochthonous grape varieties and to collect biomaterial: two to Krasnodar Krai (Anapa, Krasnodar), three to Rostov oblast (Novocherkassk, Tsimlyansk, Lower Don regions), and one to Crimea (Yalta, Sevastopol). Autochthonous varieties of the Don Valley were found in the collections of the All-Russian Scientific and Research Institute of Viticulture and Winemaking named after Ya.I. Potapenko (Novocherkassk) and the Anapa Zonal Experimental Station (Anapa); in the private collection of V.I. Kosov, an agronomist of the Department of Plant Quarantine and Seed Production of the Federal State Budgetary Institution “Rostov Reference Center of Rosselkhoznadzor”; and in private farmsteads and peasant farms of the villages of Melikhovskaya and Mishkinskaya, the settlement of Sarkel, and the city of Tsimlyansk in Rostov oblast.

Assistance in the search and ampelographic examination of the varieties was provided by Candidates of Agricultural Sciences N.V. Molchanov and S.I. Krasokhina, winemakers and experts N.P. Lukyanov, P.G. Serikov, V.A. Nefedov, E. Sofiysky, Ya.V. Kaurov, V.I. Kosov, and other winegrowers and winemakers from Rostov oblast. In addition, samples of wild grapes were obtained from wild plantations of the 1950s in the Tsimlyanskiy, Konstantinovskiy, and Aksayskiy raions of Rostov oblast.

The samples selected for research during the expeditions are presented in Table 1; notations are given in Table 2. We selected 68 samples of autochthonous varieties of the Don Valley and neighboring regions, including their clones and identified duplicates, as well as 15 samples of autochthonous varieties of Russia and other countries that showed kinship with some varieties of the Don Valley. Thus, the complete sample amounted to 83 genotypes of grape varieties and variety forms.

Table 1. Studied grape specimens
Table 2. Sample source designations
Table 3. Specimens from the publication [15]

DNA extraction from grape leaves was performed according to the previously described protocol [13]. The purity of the resulting DNA was assessed by measuring the optical density on a Nanodrop 1000 spectrophotometer (Thermo Fisher Scientific, United States). The concentration was determined using a Qubit 2.0 fluorimeter (Thermo Fisher Scientific, United States).

Whole-genome-sequencing libraries were generated using the NEBNext Ultra II DNA Library Prep Kit (New England Biolabs, Ipswich, MA, United States). The length distribution of the resulting libraries was evaluated on an Agilent Bioanalyzer 2100 (Agilent Technologies, Santa Clara, CA, United States). Whole-genome sequencing was performed on an Illumina NovaSeq 6000 device (Illumina, San Diego, CA, United States) with an S2 flow cell (Illumina, San Diego, CA, United States) and the standard reagent kit. The read length was 2 × 150 nucleotides.

The process of creating amplicon libraries included the following main steps:

– Targeted enrichment of loci with single nucleotide polymorphisms of genomic DNA (10 ng) by PCR using uracil-containing primers (524 pairs);

– Removal of primers with restoration of the ends;

– Ligation of adapter sequences;

– Amplification and indexing of libraries.

The panel consists of a single pool of primers producing 125–175 base pair (bp) amplicons that cover 524 loci of the V. vinifera genome [14]. The total length of the region covered by the primers is 57 973 bp. Amplicon-library sequencing was performed on an Illumina MiSeq device using the MiSeq Reagent Kit V2 according to the manufacturer’s protocol.

Population and Genealogical Analyses

The quality of the obtained reads was assessed using the fast program. Read preprocessing, mapping of the reads onto the reference genome, and processing of the mapping files have been described previously [14]. A reduced set of single nucleotide polymorphisms (SNPs) that are suitable for use in population and genealogy studies was derived from a previously published set of 10 000 V. vinifera SNPs [15]; the filtering steps and parameters were detailed in [14]. Assessment of the similarity of varieties, the search for relationships, the analysis of ancestral populations, and the clustering and visualization of genetic data were carried out according to the methods described in [15]. The study additionally used the estimation of genetic distances using the NeiDA algorithm. The data obtained from whole genome and amplicon sequencing were combined with a reference database of 793 varieties located in five national collections in France, Germany and Spain, as well as Georgia [15]. The reference database as well as the data of the present study are deposited in the prototype of the National Genetic Information Database, which is located at the National Research Center “Kurchatov Institute,” and can be provided upon request.
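The genetic-distance step can be illustrated with a minimal sketch. It assumes that "NeiDA" denotes Nei's D_A distance, D_A = 1 − (1/r) Σ_j Σ_i √(x_ij · y_ij), where x_ij and y_ij are the frequencies of allele i at locus j in the two samples and r is the number of loci compared; the text does not state this explicitly. The 0/1/2 genotype coding, the function names, and the handling of missing calls are assumptions made for illustration and do not reproduce the pipeline of [14, 15].

```python
from math import sqrt


def allele_freqs(genotype):
    """Convert an alt-allele count (0, 1 or 2) into (ref, alt) allele frequencies."""
    alt = genotype / 2.0
    return (1.0 - alt, alt)


def nei_da(sample_a, sample_b):
    """Nei's D_A distance between two diploid samples at biallelic SNPs.

    Genotypes are alt-allele counts; None marks a missing call (assumed coding).
    Only loci called in both samples contribute to the distance.
    """
    shared_loci = 0
    total = 0.0
    for ga, gb in zip(sample_a, sample_b):
        if ga is None or gb is None:
            continue
        freqs_a, freqs_b = allele_freqs(ga), allele_freqs(gb)
        total += sum(sqrt(x * y) for x, y in zip(freqs_a, freqs_b))
        shared_loci += 1
    if shared_loci == 0:
        raise ValueError("no loci genotyped in both samples")
    return 1.0 - total / shared_loci
```

With this coding, two identical genotypes give a distance of 0, and two samples that are opposite homozygotes at every locus give a distance of 1.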

RESULTS AND DISCUSSION

Population and Genealogical Analysis of Grape Varieties Based on Whole Genome Sequencing

Identical and “mislabeled” varieties. It has been reliably established that the Pukhlyakovskiy Belyi variety is a synonym for the Coarna Alba variety, which originates from the territory of Romania and Moldova.

Wild-grape specimens from the settlement of Sarkel (Sarkel 1 and Sarkel 2), presumed to belong to the Tsimlyanskiy Chernyi variety, were identified as the Plechistik variety on the basis of the analysis of origin and of genotype similarity with other varieties. The wild-grape specimen Mishkinskaya 2 (the lands of the PF Nefedov, Aksayskiy raion, Mishkinskaya stanitsa) and the specimen Sarkel 3 found in wild plantations (Shkol’naya Balka, the settlement of Sarkel, Tsimlyanskiy raion) belong to the Tsimlyanskiy Chernyi variety.

The Plechistik-variety specimen obtained in the PF Serikov of Tsimlyansk is presumably a descendant of the Plechistik variety based on data on the origin and genetic similarity with the specimens subjected to amplicon sequencing.

Related varieties and origin. It has been established that the Tsimlyanskiy Chernyi variety was obtained by crossing the Kokur Belyi and Plechistik varieties. Paired parent-offspring relations were found in the following specimens: Kokur Belyi and Kara oglan faux from the French collection of Montpellier [15], Plechistik (Sarkel 1) and Starenkiy, Tsimlyanskiy Chernyi, and Kumshatskiy Belyi.
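A parentage such as the one reported for Tsimlyanskiy Chernyi can be checked for Mendelian consistency locus by locus: at each SNP, the offspring genotype must be obtainable from one allele of each putative parent. The following is a minimal sketch of such a trio check, not the method of [15]; the 0/1/2 coding, the use of None for missing calls, and the function names are assumptions for illustration.

```python
def possible_gametes(genotype):
    """Alleles (0 = ref, 1 = alt) that a diploid parent with this alt count can transmit."""
    return {0: {0}, 1: {0, 1}, 2: {1}}[genotype]


def trio_mismatches(child, parent_a, parent_b):
    """Count loci where the child genotype cannot arise from the two parents.

    Returns (mismatches, loci_checked); loci with a missing call are skipped.
    """
    checked = 0
    mismatches = 0
    for c, pa, pb in zip(child, parent_a, parent_b):
        if c is None or pa is None or pb is None:
            continue
        checked += 1
        # Every alt-allele count the child could inherit from this pair of parents.
        possible = {x + y for x in possible_gametes(pa) for y in possible_gametes(pb)}
        if c not in possible:
            mismatches += 1
    return mismatches, checked
```

In practice a small number of mismatches is usually tolerated to allow for genotyping errors; the acceptable rate is a choice of the analyst and is not specified here.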

Population and genealogical analysis according to amplicon-sequencing data. The specimens Ladannyi 2 and Mushketnyi (V.I. Kosov, Ust-Donetsk raion, Melikhovskaya stanitsa) have identical genotypes. The Moldavskiy Chernyi variety is a synonym for the Darkaya Noir variety (Coarna Neagra), a close relative of the Coarna Alba variety (Pukhlyakovskiy). The Durman variety (Muskat Konstantinopolskiy) corresponds in genotype to the specimen of the Muscate variety from Romania (Collection Ravaz).
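Synonyms and duplicate accessions of the kind listed above show up as pairs of samples whose genotypes agree at essentially all loci. A minimal sketch of such a screen follows; the sample labels, the 0/1/2 coding, and the 0.99 threshold are hypothetical values chosen for illustration, not parameters taken from [14, 15].

```python
from itertools import combinations


def identity_fraction(a, b):
    """Fraction of loci, among those called in both samples, with identical genotypes."""
    called = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not called:
        return None
    return sum(x == y for x, y in called) / len(called)


def identical_pairs(genotypes, threshold=0.99):
    """Return (name_a, name_b, fraction) for sample pairs at or above the threshold.

    `genotypes` maps a sample name to a list of alt-allele counts (None = missing).
    """
    pairs = []
    for (name_a, ga), (name_b, gb) in combinations(sorted(genotypes.items()), 2):
        frac = identity_fraction(ga, gb)
        if frac is not None and frac >= threshold:
            pairs.append((name_a, name_b, frac))
    return pairs
```

A threshold slightly below 1 leaves room for occasional genotyping errors and for somatic differences between clones of one variety.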

The genotype of the specimen of the Asyl Kara (Terskiy Chernyi) variety, an autochthon of the Terek Valley selected at the Anapa Zonal Experimental Station of Viticulture and Winemaking, does not match the data on the Asyl Kara variety presented in [15], which may indicate either that these two names are incorrectly used as synonyms or that there is an error in the collections.

The origin of 16 grape varieties was established and confirmed with high reliability (Table 4).

Table 4. Origin of breeding and autochthonous Russian varieties reconstructed from amplicon-sequencing data

For the following varieties, related parent-offspring pairs were identified with a high probability: Sypun Chernyi and Plechistik, Khotsa Tsibil and Buryi 2, Staryi Goryun and Plechistik.

For a number of varieties, it is not possible to estimate the direction of parent-offspring relation without additional data: Donskoi Alyi and Dimiat, Kosorotovskiy and Pukhlyakovskiy Belyi, Alyi Terskiy and Asyl Kara, Pochatochnyi and Kokur Belyi, Dostoynyi and Krasnostop Zolotovskiy, Grdzelmtevana and Asyl Kara, Kostyukovskiy and Angur Kalan (Nimrang), Zhirnyi Slitnoy and Asyl Kara. The established phylogenetic tree of grape varieties of the Don Valley and neighboring regions is shown in Fig. 1.

Fig. 1. Established origin of Don grape varieties based on whole genome and amplicon sequencing.
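Why the direction of a parent-offspring pair cannot be read from the two genotypes alone can be illustrated with a simple pairwise test. A parent and its offspring share at least one allele at every locus, so, apart from genotyping errors, they should never be opposite homozygotes. The sketch below counts such loci; it is an illustration under an assumed 0/1/2 genotype coding with hypothetical function names, and it is not the procedure of [15].

```python
def opposite_homozygotes(a, b):
    """Count loci where two samples are homozygous for different alleles.

    Genotypes are alt-allele counts (0, 1, 2); None marks a missing call.
    Returns (opposite_homozygous_loci, loci_checked).
    """
    checked = 0
    opposite = 0
    for x, y in zip(a, b):
        if x is None or y is None:
            continue
        checked += 1
        if {x, y} == {0, 2}:
            opposite += 1
    return opposite, checked
```

The count is the same whichever sample is passed first, so a pair that passes the test is compatible with either variety being the parent. Resolving the direction requires additional information, for example the genotype of the second parent or historical records.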

The following varieties are related, but the nature of the relationship could not be established: Efremovskiy and Plechistik, Slitnoy and Khotsa Tsibil, Pervenets Praskoveyskiy and Khalili Belyi (also known as Ak Khalili, Ilyinskiy, Novrast Belyi, Tsarskiy, Yai Ouzioum), Efremovskiy and Buryi 1, Ladannyi 2 and Buryi 2, Narma and Gyulyabi Dagestanskiy.

Fig. 2. Two-dimensional embedding of the pairwise similarity of the analyzed genotypes obtained by t-distributed stochastic neighbor embedding (t-SNE).

A two-dimensional embedding of the data on the pairwise similarity of the genotypes of the studied grape varieties (Fig. 2) confirmed the differences in the genetic material of grape varieties between geographical groups and revealed the following large clusters:

– Middle and Far East (MFEAS), Eastern Mediterranean and Caucasus (EMCA), Russia and Ukraine (RUUK);

– Balkans (BALK);

– Western and Central Europe (WCEU), Apennine Peninsula (APPE);

– Maghreb (MAGH), Iberian Peninsula (IBER);

– New World (NEWO);

– A mixed cluster of hybrid varieties, including mainly varieties bred in the Balkans and the Apennine Peninsula.

The data obtained by the described method show that amplicon sequencing can be used as an alternative to whole genome sequencing, simple sequence repeat (SSR) genotyping, and DNA microarray genotyping to establish and verify the origin of grape varieties.

The performed population analysis indicates that the studied Russian autochthonous grape varieties from the Don Valley form a separate cluster among other V. vinifera varieties.

Thus, the greatest contribution to the origin of the studied autochthonous grape varieties was made by:

– The Don varieties Krasnostop Zolotovskiy, Tsimlyanskiy Belyi, and Bulannyi, for none of which ancestors originating from other regions or countries have been identified to date;

– The Romanian-Moldavian variety Pukhlyakovskiy Belyi (Coarna Alba), which played a decisive role in the origin and, probably, the directed selection of a number of varieties. In particular, this may be due to the existence, since 1814, of the Exemplary wine cellar at the Pukhlyakovskiy (Sobakinskiy) farmstead, an experimental and educational institution of the Don Army [16]. Like its descendant Sibirkovyi, Pukhlyakovskiy Belyi is today a very common variety among the Don winegrowers;

– The Buryi variety, which originates from varieties of the North Caucasus, and the Crimean autochthonous variety Kokur Belyi, both of which also made a special contribution to the gene pool of the Don varieties.

To complete the picture of the genesis of the autochthonous varieties of Russia, a more detailed study of both wild-growing forms of grapes and other autochthonous varieties is required. It must include the genomes of varieties from neighboring countries (Georgia, Armenia, Abkhazia), as well as wild grapes at the sites of pre-Soviet and late Soviet plantations, which may represent lost autochthonous varieties and ancestors of modern varieties.

CONCLUSIONS

A limited set of SNPs has proven to be a reliable tool for determining the origin of grape varieties and parent-offspring relations. For the first time in Russian science, the issues of the origin and relationship of the Don Valley autochthonous varieties have been resolved. Their place on the global phylogenetic tree of V. vinifera has been found and independently confirmed; algorithms have been developed that allow unknown grape biomaterial to be identified with a high degree of probability. The work carried out forms the basis for the development of the Russian nursery sector and new approaches to breeding. A unique biobank and genetic base of domestic autochthonous grape varieties are being formed; many of these varieties survive in public and private collections only as a few vines.

With the exception of the closely related varieties Coarna Neagra (named Moldavskiy on the Don) and Coarna Alba (which was renamed Pukhlyakovskiy Belyi and became the ancestor of a number of other varieties), no foreign influence on the gene pool of the studied autochthonous grape varieties has been established. This also applies to varieties such as Burgundskiy, Shampanchik Bessergenevskiy, and Shampanchik Tsimlyanskiy, whose French origin had seemed beyond doubt.

At the same time, the presence of descendants of the varieties Kokur Belyi (Crimea), Koz Ouzioum, and Asyl Kara (Northern Caucasus) among the autochthonous varieties of the Don indicates the historical interaction of the peoples of Russia in the field of viticulture and breeding. It also shows that most of the studied varieties of the Don, North Caucasus, and Crimea form their own unique cluster on the world phylogenetic dendrogram of Vitis vinifera, with their own parent-offspring trios and duos.