The number of cucurbit-infecting viruses worldwide has increased over the last two decades. Zucchini shoestring virus (ZSSV) has been proposed as a putative potyvirus infecting cucurbits, based on the nucleotide sequence similarity of its coat protein (CP) to those of related cucurbit-infecting potyviruses [7]. ZSSV was detected in the province of KwaZulu-Natal (KZN) in the Republic of South Africa (RSA) during virus surveys conducted in the cucurbit-growing areas between 2011 and 2013. Symptoms associated with ZSSV include severe leaf filiformy and fruit deformation on baby marrow (Cucurbita pepo L.) [7]. These symptoms were observed throughout the surveys and in all growing areas. Losses of up to 100 % were recorded when infection occurred before fruit formation. The full genome sequence of ZSSV is reported here.

The source of the ZSSV isolate used in this study was a baby marrow leaf displaying filiformy symptoms, randomly selected from the samples collected during virus surveys conducted between 2011 and 2013 in KZN. Total RNA was extracted using a NucleoSpin RNA Plant kit (Macherey-Nagel, Germany) according to the manufacturer’s instructions and shipped on dry ice to the Agricultural Research Council’s Biotechnology Platform (ARC-BTP) in Pretoria, RSA, for library preparation and next-generation sequencing (NGS) on the Illumina HiSeq platform, using paired-end chemistry with 125 × 125-bp reads. FastQC was used to assess the quality of the NGS data generated. Adapter removal, trimming, and quality filtering of the reads were performed using Trimmomatic version 0.33 [4] with the following settings: {ILLUMINACLIP:TruSeq3-PE-2.fa:2:30:9:1:true LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36}. The paired-end sequences were subsequently used as single reads for de novo assembly, which was performed using SeqMan NGen (software version 12.3.1 build 48; DNASTAR Lasergene) with the default parameters and with removal of the host data. All generated contigs were used for a BLAST search of the National Center for Biotechnology Information (NCBI) database using the Netsearch function of the program SeqMan Pro (software version 12.3.1 build 48 421; DNASTAR Lasergene).
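To make the quality-trimming settings above concrete, the following Python sketch mimics what the LEADING:3, TRAILING:3, SLIDINGWINDOW:4:15 and MINLEN:36 steps do to a single read, applied in the order listed. It is a simplified illustration only, not Trimmomatic's own code: the function name `trim_read` is hypothetical, the adapter-clipping step (ILLUMINACLIP) is omitted, and the sliding-window rule is reduced to "cut at the start of the first window whose mean quality falls below the threshold".

```python
def trim_read(seq, quals, leading=3, trailing=3, window=4, min_mean=15, min_len=36):
    """Simplified per-read quality trimming (illustrative; hypothetical helper).

    seq   -- read bases as a string
    quals -- list of integer Phred quality scores, one per base
    Returns (seq, quals) after trimming, or None if the read is discarded.
    """
    # LEADING: drop bases from the 5' end while quality is below the threshold.
    start = 0
    while start < len(quals) and quals[start] < leading:
        start += 1
    seq, quals = seq[start:], quals[start:]

    # TRAILING: drop bases from the 3' end while quality is below the threshold.
    end = len(quals)
    while end > 0 and quals[end - 1] < trailing:
        end -= 1
    seq, quals = seq[:end], quals[:end]

    # SLIDINGWINDOW: scan 5' -> 3'; cut at the first window whose mean
    # quality is below min_mean (simplified cut-point rule, see lead-in).
    keep = len(quals)
    for i in range(0, len(quals) - window + 1):
        if sum(quals[i:i + window]) < min_mean * window:
            keep = i
            break
    seq, quals = seq[:keep], quals[:keep]

    # MINLEN: discard reads that ended up shorter than min_len.
    if len(seq) < min_len:
        return None
    return seq, quals


# Example: 40 high-quality bases followed by 10 low-quality bases.
example = trim_read("A" * 50, [30] * 40 + [10] * 10)
```

In this example the low-quality tail is cut by the sliding-window step and the remaining 40 bases pass the minimum-length filter; a read that ends up shorter than 36 bases would be dropped entirely.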

A 10,308-nt contig (median coverage: 162.61) matched the ZSSV CP coding sequence (accession number: KP723639.1) with an E-value of 0.00 and was therefore used as the ZSSV draft genome. To confirm the integrity of the ZSSV genome sequence, overlapping amplicons spanning the draft genome were sequenced directly at Inqaba Biotechnical Industries (Pty) Ltd. (Pretoria, RSA). Amplicons were produced by reverse transcription polymerase chain reaction (Table S1). The Open Reading Frame (ORF) Finder was used to predict the ORFs in the ZSSV genome. The cleavage sites were identified using the data provided by Adams et al. [1] and Romay et al. [12]. Information from Chung et al. [5] and from Wen and Hajimorad [15] was used to identify the PIPO. Multiple sequence alignments were performed using MUSCLE, implemented in MEGA 6.06 [14]. Nucleotide and amino acid (aa) sequence identities were computed using the SIAS tool (http://imed.med.ucm.es/Tools/sias.html). Putative recombination junctions were checked using RDP [10], GENECONV [11], MaxChi [13], BootScan [9], and SIScan [6], as implemented in the RDP v4.56 package [8]. The full-genome sequences of all potyviruses were used with PAirwise Sequence Comparison (PASC) [3] to confirm the molecular taxonomic position of ZSSV. ZSSV phylogeny was inferred in MEGA 6.06 [14] using the maximum-likelihood method based on the general time-reversible model with a discrete gamma distribution and invariable sites, with 500 bootstrap replicates.
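As an illustration of how a pairwise sequence identity value can be derived from an alignment, the Python sketch below counts identical residues over the aligned columns in which neither sequence has a gap. This is one common definition only; SIAS offers several alternative denominators, and the text does not state which one was used, so the function `percent_identity` is a hypothetical helper and not a reimplementation of that tool.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two aligned sequences of equal length.

    Columns in which either sequence has a gap are skipped; the
    denominator is the number of remaining (gap-free) columns.
    This denominator choice is an assumption made for illustration.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    matches = 0
    compared = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # ignore gapped columns
        compared += 1
        if x.upper() == y.upper():
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy example with two short aligned nucleotide sequences.
identity = percent_identity("ACGTAC-T", "ACGAACGT")
```

In the toy example, seven gap-free columns are compared and six of them are identical, so the function returns roughly 85.7. The same function works on aligned amino acid sequences, since it only compares characters column by column.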

The ZSSV genome consists of 10,295 nucleotides (accession number: KU355553), excluding the poly(A) tail (Fig. 1), and has a genome organization typical of a potyvirus. The ZSSV genome sequence shares its highest nucleotide sequence identity (65.68 %) with Algerian watermelon mosaic virus (AWMV; EU410442.1) (Table S2). At the level of the individual protein-coding sequences, ZSSV shares its highest nucleotide sequence identity with AWMV for eight of the 11 potyvirus proteins and with Moroccan watermelon mosaic virus (MWMV) isolates for the other three (Table S2). The highest amino acid sequence identity (69.22 %) was also with AWMV, and ZSSV shared its highest aa sequence identity with AWMV for eight potyvirus-encoded proteins (Table S3). A close phylogenetic relationship between these viruses was therefore expected (Fig. 2). Two cleavage sites, CI|6K2 and NIa|NIb, were unique to the ZSSV genome. The motifs R-I-T-C (aa 624-627) and P-T-R (aa 882-884), in place of the highly conserved K-I-T-C and P-T-K motifs reported in AWMV and MWMV [16, 17], were also identified in the ZSSV genome. No recombination junctions were detected in the ZSSV genome. The PASC and nucleotide sequence identity results qualify ZSSV as a member of a distinct species in the genus Potyvirus according to the species demarcation criterion for this genus [2]. ZSSV is the second cucurbit-infecting virus of the PRSV cluster to occur in RSA, and the third one reported in Africa. The shoestring symptom cannot be solely attributed to ZSSV at this stage because a tobamovirus and another potyvirus were also detected in the sample analysed in this study (data not shown).

Fig. 1

ZSSV genome organization. The numbers on the diagram indicate the starting nucleotide position predicted for each gene. The cleavage sites are indicated by the symbol “|” between the fourth and fifth aa.

Fig. 2

Phylogram of the coding sequences of the ZSSV polyprotein. ZTMV, zucchini tigré mosaic virus