Introduction

The fungus Rhizoctonia solani Kühn (teleomorph Thanatephorus cucumeris [A. B. Frank] Donk; Deuteromycetes, Mycelia Sterilia), also known by the synonym Rhizoctonia praticola, is responsible for many economically important plant diseases worldwide. The fungus can also degrade organic matter in soil as a saprobe and represents an important evolutionary link between beneficial and plant disease-causing fungi (Cubeta and Vilgalys 2000). The species is divided into subgroups called anastomosis groups (AGs), to which isolates are assigned according to the ability of their hyphae to anastomose (fuse) with one another. At present, the R. solani species complex comprises fourteen AGs, most of which are reproductively incompatible with each other and are numbered AG1 through AG13 (Sneh et al. 1991). The ‘bridging isolate’ AG-BI is the exception, being compatible with multiple AGs (Gracia et al. 2006). All AGs and their subgroups have been reported to be genetically isolated and to represent highly divergent evolutionary units (Rosewich et al. 1999; Ceresini et al. 2007).

Anastomosis group 3 (AG3) is a common inhabitant of the soil ecosystem and has a worldwide distribution. It is an economically important pathogen that affects food crops, mostly in the plant family Solanaceae, including eggplant, pepper, tomato and potato. R. solani AG3-PT (the potato type) is a subgroup of AG3 and the main causal agent of stem and stolon canker and black scurf of potato worldwide, including in India. It is both tuber-borne and soil-borne. Losses in yield and, in particular, in quality caused by black scurf can reach up to 50% (Keiser 2008; Woodhall et al. 2013). Many disease management strategies have been developed, such as chemical sprays, maintaining low soil pH, avoiding areas with a history of disease, long crop rotations, use of disease-free certified seed, seed treatment, timely harvest, and Trichoderma and other biological treatments, but so far none of these has proven completely effective (Khaldi et al. 2015). To date, genomes of strains belonging to only four AGs have been sequenced, viz. AG1 [AG1-IA (Zheng et al. 2013), AG1-IB (Wibberg et al. 2013)], AG2-IIIB (Wibberg et al. 2015b), AG3 (Cubeta et al. 2014) and AG8 (Hane et al. 2014). The complete genome of AG3-PT strain RS-20 was therefore sequenced to gain insight into its infection process and mechanism of pathogenicity, and to determine its evolutionary relationship with other members of the Rhizoctonia species complex.

Materials and Methods

Freshly harvested potato tubers of cv. Kufri Jawahar infected with black scurf were collected in the rabi season of 2010 from Kufri (Himachal Pradesh, India; 31.0979°N 77.2678°E), which lies in the mid-Himalaya and has a wet temperate climate. The tuber samples were surface sterilized with 1% NaOCl; sections of affected tissue were excised and plated onto potato dextrose agar (PDA) medium. Rhizoctonia-like colonies were identified and the pathogen was purified by repeated sub-culturing on PDA. Fungal mycelia were harvested from 10- to 15-day-old culture broth. Genomic DNA was isolated from the harvested mycelia using the CTAB method (Murray and Thompson 1980), and the identity of the isolate was confirmed by comparing its ITS sequences (primers ITS1-F and ITS4) with AG3-PT sequences from GenBank (JX27814 and KC157664) (Fiers et al. 2010). The purified DNA was quantified using a NanoDrop 2000 spectrophotometer (Thermo Fisher, India) and used for genomic rapid library preparation (Roche Diagnostics) following the kits and user guide provided. Sequencing was then performed on the Roche 454 (GS-FLX Titanium) pyrosequencing platform. Two shotgun sequencing runs were carried out, each using the complete region of a PicoTiterPlate (PTP). The raw sequence data were assembled using GS De Novo Assembler (Version 2.5.3) and GS Reference Mapper (Roche Diagnostics). Protein coding regions were predicted using eukaryotic GeneMark.hmm (Version 2.2a) and cross-validated with AUGUSTUS (http://bioinf.uni-greifswald.de/augustus/submission/). The gene predictions were also used to calculate the average number of exons per gene manually. The open software RNAmmer 1.2 Server (www.cbs.dtu.dk/Services/RNAmmer/) and FindtRNA Version 1.0 (www.bioinformatics.org/findtrna/FindtRNA.html) were used to find the total number of rRNA and tRNA coding genes, respectively. Genome-wide repeats were studied using the MicroSAtellite identification tool (MISA; http://pgrc.ipk-gatersleben.de/misa/). The evolutionary relationship among the R. solani anastomosis groups (AGs) was analysed by one-way Average Nucleotide Identity (ANI) among the R. solani AG datasets, calculated as described by Goris et al. (2007) (http://enve-omics.ce.gatech.edu/ani/).
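The averaging step behind a one-way ANI estimate can be illustrated with a minimal Python sketch. It assumes that the query genome has already been cut into fragments and searched against the reference genome, and that the best hit per fragment is supplied as a tuple of percent identity, aligned length and fragment length. The function name, the input layout and the default cut-offs (more than 30% identity recalculated over the whole fragment, at least 70% of the fragment aligned) are assumptions made for illustration; they are not the settings of the online calculator used in this study.

```python
def one_way_ani(hits, min_identity=30.0, min_aligned_fraction=0.7):
    """Mean percent identity of the query fragments that pass both filters.

    hits: iterable of (percent_identity, aligned_length, fragment_length),
          one best hit per query fragment (hypothetical input layout).
    Returns None when no fragment passes the filters.
    """
    kept = []
    for identity, aligned_len, frag_len in hits:
        aligned_fraction = aligned_len / frag_len
        # Identity recalculated along the entire fragment, not only the aligned part.
        overall_identity = identity * aligned_fraction
        if aligned_fraction >= min_aligned_fraction and overall_identity > min_identity:
            kept.append(identity)
    return sum(kept) / len(kept) if kept else None


if __name__ == "__main__":
    example_hits = [(98.0, 1000, 1020), (96.0, 900, 1020), (99.0, 300, 1020)]
    # The third fragment aligns over less than 70% of its length and is ignored.
    print(one_way_ani(example_hits))
```

Because only the query is fragmented, the estimate is directional: swapping query and reference can give a slightly different value.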

Results and Discussion

The shotgun sequencing yielded 2,827,025 high-quality reads amounting to 1,031,371,409 bases, equivalent to 17-fold coverage of the estimated ≅60 Mb genome. The GS De Novo Assembler (version 2.5.3), run in heterozygotic mode, yielded 21,475 contigs (over 500 bp) with an average contig size of 4.1 kb, an N50 contig size of 4068 bp and a largest contig of 97.2 kb. The proportion of called bases with a quality score of 40 or above was 98.66%. The contigs covered more than 54.2 Mb of the pathogen's genome. The draft genome of AG3-PT strain RS-20 has a GC content of 48.3%, which is comparable to that of the other anastomosis groups sequenced so far (48.1 to 48.8%), except for AG1-IA, which has a GC content of 47.61% (Wibberg et al. 2013; Wibberg et al. 2015a). Gene prediction using eukaryotic GeneMark.hmm (Version 2.2a) and AUGUSTUS revealed a total of 11,431 protein coding regions (CDSs) common to both, spread over the genome and covering 13.36 Mb (29.4% of genome as coding). This number is comparable to the numbers of genes predicted for AG1-IA strain B275 (10,489) (Zheng et al. 2013), AG1-IB strain 7/3/14 (12,616) (Wibberg et al. 2013; Wibberg et al. 2015a), AG2-IIIB strain BBA69670 (11,897) (Wibberg et al. 2015b), AG3 strain Rhs-1AP (12,726) (Cubeta et al. 2014) and AG8 strain WAC10335 (13,964) (Hane et al. 2014). The average number of exons per gene was 5.24 and the average gene density was calculated to be 2.08 per 10 kb. A total of 31 rRNA genes were predicted using RNAmmer 1.2 Server for eukaryotic genomes (Lagesen et al. 2007) and a total of 181 tRNA coding genes were predicted using FindtRNA Version 1.0 (Table 1). An initial BLASTP search (blast.ncbi.nlm.nih.gov/Blast.cgi) of the predicted genes against the non-redundant protein sequences (E-value ≤ 1e-9) revealed that the distribution of the predicted proteins among functional classes was similar to that of R. solani: 43.5% of the genes had hits with R. solani AG3 (strain Rhs-1AP), 39.3% with R. solani (strain 123E), ≅2.5% with both R. solani AG8 and AG1-IB, and nearly 1.7% with other organisms. In general, hits were dominated by the membrane transporter MFS 1 (Major Facilitator Superfamily).
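For readers who wish to recompute summary statistics of this kind from a set of contigs, the following Python sketch shows how N50, fold coverage and GC content are conventionally defined. The function names are hypothetical and the code is an illustration under those standard definitions, not the procedure used by the GS De Novo Assembler.

```python
def n50(contig_lengths):
    """Length L such that contigs of length >= L hold at least half of the assembled bases."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contigs supplied")


def fold_coverage(total_bases, genome_size):
    """Average sequencing depth: sequenced bases divided by the genome size."""
    return total_bases / genome_size


def gc_percent(seq):
    """Percentage of G and C among the unambiguous bases (A, C, G, T) of a sequence."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return 100.0 * gc / acgt if acgt else 0.0


if __name__ == "__main__":
    # Read total and estimated genome size as reported in the text.
    print(round(fold_coverage(1_031_371_409, 60_000_000), 2))
```

With the reported 1,031,371,409 bases and an estimated genome size of 60 Mb, fold_coverage returns about 17.19, in line with the 17-fold coverage stated above.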

Table 1 List of sequenced Rhizoctonia solani genomes and their characteristics

Analysis of repeat regions using the MicroSAtellite identification tool revealed an abundance of mono-, di- and trimer repeats in the genome. Hexamer repeats were more numerous than tetramer and pentamer repeats, a feature found across the genomes of the anastomosis groups (AGs). A total of 178 predicted gene sequences contained at least one microsatellite (genic microsatellites) and only 11 genes possessed more than one microsatellite. Trimer repeats dominated the genic microsatellites (80%) and only 11% of these were in compound form. Among the Rhizoctonia genomes compared, AG1-IB had the highest number of microsatellites in compound form whereas AG8 had the least. The genome-wide microsatellite information can be useful for diversity and classification studies.
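The kind of search that underlies such microsatellite counts can be sketched in Python with a regular expression per motif length. The minimum repeat numbers used here (10 for mono-, 6 for di- and 5 for tri- to hexamer motifs) are assumed for illustration, since the thresholds applied in this study are not stated; the function names are likewise hypothetical, and compound microsatellites are not handled.

```python
import re

# Assumed minimum number of repeat units per motif length (illustrative only).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit (e.g. 'AA' is not)."""
    n = len(motif)
    for d in range(1, n):
        if n % d == 0 and motif[:d] * (n // d) == motif:
            return False
    return True


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect microsatellites."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # A motif of length k followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    example = "TT" + "ACG" * 6 + "TT" + "A" * 12 + "CC"
    for start, motif, count in find_ssrs(example):
        print(start, motif, count)
```

Skipping non-primitive motifs prevents a mononucleotide run such as twelve A residues from also being reported as a dimer repeat of AA.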

One-way Average Nucleotide Identity (ANI) among the R. solani anastomosis group (AG) datasets, calculated as described by Goris et al. (2007), revealed a close association between AG3-PT (strain RS-20) and AG3 (strain Rhs-1AP) (97.75%), re-confirming its recent divergence from AG3 through environmental and host adaptation, whereas the least association was found with AG1-IA (strain B275) (75.87%). The codon preference studies revealed that methionine (ATG) and tryptophan (TGG) are each encoded by a single codon, and that TGA (43.2%) is the preferred stop codon over TAG (27.5%) and TAA (29.2%). A codon preference was found for all the other amino acids, with the preferred codon favoured by 3% to 15% over the non-preferred codons. The ANI with Coprinopsis cinerea (Order: Agaricales), Cryptococcus neoformans (Order: Tremellales) and Piriformospora indica (Order: Sebacinales), belonging to class Agaricomycetes, was found to be 77.40%, 82.33% and 75.02%, respectively, indicating that the order Cantharellales (Rhizoctonia sp.) is phylogenetically closer to Tremellales than to the other two orders (Table 2). These preliminary genome comparisons indicate considerable diversity among the anastomosis groups of R. solani, which may require further sub-classification among these orders. A detailed genomic study of the sequenced genomes of the class Agaricomycetes is needed for better understanding and to fine-tune their systematic classification.
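Stop codon preference of the kind reported here can be tallied directly from the predicted coding sequences. The Python sketch below assumes that each CDS is supplied as an in-frame nucleotide string ending with its stop codon; the function name and input format are hypothetical and are not taken from the pipeline used in this study.

```python
from collections import Counter

STOP_CODONS = ("TAA", "TAG", "TGA")


def stop_codon_usage(cds_list):
    """Percentage of CDSs ending in each stop codon.

    Sequences whose length is not a multiple of three, or that do not end
    in a stop codon, are skipped as likely partial gene models.
    """
    counts = Counter()
    for cds in cds_list:
        cds = cds.upper()
        if len(cds) >= 3 and len(cds) % 3 == 0 and cds[-3:] in STOP_CODONS:
            counts[cds[-3:]] += 1
    total = sum(counts.values())
    return {codon: (100.0 * counts[codon] / total if total else 0.0)
            for codon in STOP_CODONS}


if __name__ == "__main__":
    toy_cds = ["ATGAAATGA", "ATGTGGTGA", "ATGCCCTAA", "ATGGGGTAG"]
    print(stop_codon_usage(toy_cds))
```

The same counting approach extends to sense codons by tallying every in-frame triplet and grouping the counts by the amino acid encoded.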

Table 2 One-way Average Nucleotide Identity (ANI) among Rhizoctonia solani anastomosis groups (AGs) and selected fungi from class Agaricomycetes datasets*, calculated as described by Goris et al. (2007)

More genomes representing all of the R. solani anastomosis groups need to be sequenced, as extensive genome sequence data provide new opportunities for comparative genomics between R. solani anastomosis groups. This is the first draft genome sequence report for the R. solani AG3-PT subtype, and the availability of the genome sequence should be a valuable resource for epidemiological and quarantine studies. Validation of the predicted genes will require additional transcriptome sequencing and expression experiments. A detailed examination of this genome and its comparison with other genomes of the Rhizoctonia species complex could expand our understanding of evolutionary relationships and yield new insights into their origins, diversity, pathogenicity and host specificity. Further, functional characterization of potential effector genes is required to determine their roles in pathogenesis, and this needs to be carried out across the anastomosis groups of R. solani.

Nucleotide Sequence Accession Numbers

Sequences from this whole-genome shotgun project have been deposited at NCBI/GenBank under BioProject PRJNA316175 and accession number LVWR00000000. The version described in this paper is the first version, LVER01000000. The strain is available from the Division of Plant Protection, ICAR-CPRI, Shimla, and has also been deposited with MTCC, CSIR-IMTECH, Chandigarh.