Introduction

The origin and phylogeny of the genus Corchorus as a whole, and of its two domesticated species (C. capsularis L. and C. olitorius L.), have long been controversial (Edmonds 1990; Benor et al. 2010). Because the diversity of Corchorus species is much wider in Africa, an African centre of origin for C. olitorius is now generally supported, whereas C. capsularis is thought to have originated in the Indo-Myanmar region, including south China (Kundu 1951, 1956). These competing views are, however, based on biogeographic diversity and species adaptation (Edmonds 1990).

Implicit in this ongoing debate on the origins of the two domesticated jute species, however, are three apparent caveats. Firstly, recent archaeobotanical evidence supports not only the possibility of an ancient use of jute in the Indus civilization, but also its clear differentiation from silk in that period (Wright 2010). Secondly, contrary to earlier claims that C. capsularis is not found wild in Africa and Australia, its presence on both continents is now well documented (Du Puy and Telford 1963; Benor et al. 2010). Thirdly, although both domesticated jute species are an important source of fibre in Asian countries, they are widely utilized as a mucilaginous leafy vegetable throughout Africa (Grubben and Denton 2004).

Molecular marker-based genetic diversity analyses revealed considerable divergence between the two cultivated jute species, with relatively higher level of allelic diversity in C. olitorius than in C. capsularis despite fairly low genetic diversity at the intraspecific level (Hossain et al. 2002; Basu et al. 2004; Roy et al. 2006; Mir et al. 2008, 2009). Support for a distinct genetic differentiation between the two cultivated jute species was provided by morpho-physiological (Palit et al. 1996), karyomorphological (Maity and Datta 2009) and genomic (Basu et al. 2004; Roy et al. 2006; Haque et al. 2007; Mir et al. 2008) analyses. However, none of these studies could establish a relationship between geographic origin and phenetic classification. This reported lack of correspondence between genetic divergence and geographic diversity is indicative of not only uncertain origins of cultivated jute accessions, but also the possibility of their recent migration or germplasm movement across geographical boundaries (Roy et al. 2006).

Codominant nuclear microsatellites have been shown to be ideal markers for detecting phylogenetically significant diversity within cultivated jute (Mir et al. 2008, 2009). In the present study, we analyzed the nuclear microsatellite diversity in the nine Corchorus species including the two cultivated jute species that encompass wide geographic diversity. Chloroplast microsatellites represent a general marker system for assessing the genetic structure of plant populations. Several mononucleotide repeats resident on intronic and intergenic cpDNA regions are known to be conserved across dicotyledonous angiosperms (Weising and Gardner 1999). We used a set of those consensus microsatellite primers to study mononucleotide repeat-based chloroplast haplotype diversity in Corchorus species. We also sought to phenetically classify the nine Corchorus species based on morphological similarities by using a suite of most distinguishing vegetative growth and bast fibre traits.

Materials and methods

Plant material

Nine diploid (2n = 2x = 14) Corchorus species were used in the present study: seven wild species, viz., C. aestuans L. (WCIJ 037), C. fascicularis Lam. (WCIJ 004), C. pseudo-capsularis Schweinf. (WCIJ 007), C. pseudo-olitorius Islam & Zaid (WCIJ 034), C. tridens L. (WCIJ 046), C. trilocularis L. (WCIJ 072) and C. urticifolius Wight & Arnold (WCIJ 006), and the two cultivated jute species C. capsularis L. (cv. JRC-321 and mutant CMU-010) and C. olitorius L. (cv. JRO-524 and mutant PPO-4). The selected accessions of wild species have been used in other published studies (Chakraborty and Palit 2009), whereas the selected genotypes of both cultivated jute species represent leading cultivars in an agricultural setting (global area sown and bast fibre production). Except for C. trilocularis, which is a diffuse undershrub, they are all annual herbs, with distinct growth habits, morphological features and sexual compatibilities (Supplementary Table 1).

Field study

The experiment was laid out in a randomized complete block design with three replications over two growing seasons. In each replication, each genotype was represented by three rows of 10 plants each, with plants spaced 10 cm apart within rows and rows 30 cm apart. At 90 d after sowing, observations on vegetative characters, such as plant height (cm), basal stem diameter (cm), leaf length (cm), leaf breadth (cm), leaf area (cm²) and leaf ratio (average leaf length divided by average leaf breadth), were recorded for 10 randomly selected healthy plants per replication. The mature plants were harvested at 120 d after sowing. After drying, wood and bast fibre yields (g plant−1) were determined. Bundle strength and airflow fineness testers were used to measure tensile strength (g tex−1) and fibre fineness (tex), respectively. Fibre fineness was scored on a 0–5 scale of decreasing fineness, with 5 representing the coarsest fibre.

Morpho-phenetic analysis

Prior to univariate analysis of quantitative data for each character, normality (Shapiro-Wilk test) and equal variance (Bartlett’s test) assumptions were tested (GenStat Version 7.2.0.220; VSN International Ltd., Oxford, UK). Accordingly, the data were suitably transformed and analyzed by one-way analyses of variance followed by multiple comparisons of mean values by the Tukey-Kramer honestly significant difference test. Associations between morphological and bast fibre traits were estimated by the nonparametric Kendall’s coefficient (tau-b) of rank correlation. The dendrogram was constructed using programs in NTSYSpc Version 2.20v (Applied Biostatistics, Inc., New York, USA), with resampling of means for each character (1000 bootstrap replicates) and clustering based on UPGMA. A majority-rule consensus tree was compiled from the trees that resulted from bootstrap sampling.
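As an illustration of the rank-correlation step, the following is a minimal sketch of Kendall’s tau-b written from the standard textbook definition. It is not the GenStat routine used in the study; the function name `kendall_tau_b` and the example values are hypothetical and serve only to show how tied observations enter the denominator.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b rank correlation, with the usual correction for ties."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = discordant = ties_x_only = ties_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: counted in neither term
        if dx == 0:
            ties_x_only += 1
        elif dy == 0:
            ties_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    denom = sqrt((concordant + discordant + ties_x_only)
                 * (concordant + discordant + ties_y_only))
    if denom == 0:
        raise ValueError("tau-b is undefined when a variable is constant")
    return (concordant - discordant) / denom


# Hypothetical trait values for four plants (not data from this study).
plant_height = [1, 2, 2, 3]
fibre_yield = [1, 2, 3, 3]
tau = kendall_tau_b(plant_height, fibre_yield)
```

Because tau-b works on the ordering of pairs only, it makes no assumption about the distribution of the traits, which is why it suits data that needed transformation before the analysis of variance.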

Nuclear microsatellite genotyping

DNA was isolated from 2-week-old seedlings, according to Kundu et al. (2011). Altogether 66 nSSRs that exhibited ± polymorphism due to null alleles (Mir et al. 2008, 2009) were initially selected, of which 38 single-locus polymorphic nSSRs were finally selected based on their preliminary screening in all the nine Corchorus species (Supplementary Table 2). The nSSR sequences were amplified according to Mir et al. (2008), and the amplified products were denatured and separated in 10 % denaturing polyacrylamide (19:1) gels in 1.0 × TBE buffer at 80 V cm−1. The amplified products were stained with a silver staining protocol (Sanguinetti et al. 1994), and microsatellite alleles were scored as present (1) or absent (0).

Genotypic data analyses

For each nSSR locus, the number of observed alleles (\(n_a\)), the effective number of alleles (\(n_e\)) (Kimura and Crow 1964), the Levene’s observed heterozygosity (\(H_o\)) (Levene 1949) and the Nei’s unbiased genetic diversity (\(H_e\)) (Nei 1973) were calculated by using the software POPGENE Version 1.32 (Yeh et al. 1999). PIC for each nSSR locus was calculated as \( \mathrm{PIC} = 1 - \sum_{i=1}^{n} P_i^2 \), where \(P_i\) is the frequency of the ith allele and n is the number of alleles (Botstein et al. 1980). The allelic richness (\(R_s\)) that estimates the number of alleles independent of the sample size (El Mousadik and Petit 1996) was calculated by using the software FSTAT (Goudet 1995). The Shannon’s information index (Shannon and Weaver 1949) was calculated by using POPGENE. The number of private alleles (Mohammad et al. 2004) and their average frequencies were estimated from allele distributions by using MS Office Excel. For genetic diversity estimates (\(n_a\), \(n_e\), \(H_e\) and \(R_s\)), the differences between the cultivated and wild Corchorus groups were tested by non-parametric Wilcoxon signed-paired rank test after Bonferroni correction.
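The per-locus quantities above all derive from the allele frequencies at a locus. The sketch below computes the observed and effective numbers of alleles, the PIC as defined in the text, and Shannon’s information index from allele counts. It is an illustration under stated assumptions, not a reimplementation of POPGENE or FSTAT: the function names and example counts are hypothetical, the effective number of alleles is taken as the reciprocal of the sum of squared frequencies, Shannon’s index uses natural logarithms, and no small-sample correction is applied.

```python
from math import log


def allele_frequencies(counts):
    """Convert allele counts at one locus to frequencies, dropping absent alleles."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("at least one allele must be observed")
    return [c / total for c in counts if c > 0]


def diversity_indices(counts):
    """Per-locus diversity summaries computed from allele counts."""
    p = allele_frequencies(counts)
    sum_p2 = sum(pi * pi for pi in p)
    return {
        "n_a": len(p),                            # observed number of alleles
        "n_e": 1.0 / sum_p2,                      # effective number of alleles
        "PIC": 1.0 - sum_p2,                      # PIC as defined in the text
        "I": -sum(pi * log(pi) for pi in p),      # Shannon's information index
    }


# Hypothetical locus with four equally frequent alleles (not study data).
example = diversity_indices([5, 5, 5, 5])
```

With equally frequent alleles the effective number of alleles equals the observed number; any skew in the frequencies pulls it below the observed count.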

Genetic analysis was performed by using the software DARwin5 Version 5.0.158 (Perrier and Jacquemoud-Collet 2006). A distance matrix was computed by using Jaccard’s coefficients and 1,000 bootstrap replicates. The dendrogram was constructed by using the NJ method employing the weighted version of Saitou and Nei (1987) based on a principle of parsimony that finds a tree close to the true phylogenetic tree (Rohlf 2005). Since a radial tree is the better option when the position of the root is not known (Hall 2001), the data were presented as a radial tree. The NJ tree resolved C. aestuans L. as an outgroup. This outgroup was grafted a posteriori on a tree built on the unit set without it by using the least-squares grafting method of DARwin5, and the grafting point was regarded as the ancestral root (Perrier et al. 2003). Concordance of the nSSR data to the morphological data was tested by two-way Mantel test.
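To make the distance and concordance steps concrete, the sketch below computes Jaccard distances from presence (1) or absence (0) allele profiles and correlates two distance matrices with a simple permutation Mantel test. It is an assumption-laden illustration, not the DARwin5 procedure: the function names, the one-sided p-value, the number of permutations and the four example profiles are all hypothetical.

```python
import random
from math import sqrt


def jaccard_distance(a, b):
    """1 minus (alleles shared by both) / (alleles present in either) for 0/1 profiles."""
    both = sum(1 for u, v in zip(a, b) if u and v)
    either = sum(1 for u, v in zip(a, b) if u or v)
    return 0.0 if either == 0 else 1.0 - both / either


def distance_matrix(profiles):
    """Symmetric matrix of pairwise Jaccard distances."""
    n = len(profiles)
    return [[jaccard_distance(profiles[i], profiles[j]) for j in range(n)]
            for i in range(n)]


def _upper(matrix, order):
    """Upper-triangle entries after relabelling rows and columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def _pearson(u, v):
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / sqrt(var_u * var_v)


def mantel(m1, m2, permutations=999, seed=0):
    """Mantel correlation and one-sided permutation p-value (taxa of m2 shuffled)."""
    identity = list(range(len(m1)))
    base = _upper(m1, identity)
    r_obs = _pearson(base, _upper(m2, identity))
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)
        if _pearson(base, _upper(m2, order)) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Hypothetical 0/1 allele profiles for four taxa (not study data).
profiles = [[1, 1, 0, 0], [1, 0, 1, 0], [0, 0, 1, 1], [1, 1, 1, 0]]
dist = distance_matrix(profiles)
r, p = mantel(dist, dist, permutations=99)
```

Shuffling the taxon labels of one matrix, rather than its individual cells, keeps each matrix internally consistent, which is the reason a Mantel test is used instead of an ordinary correlation test on the pairwise distances.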

Chloroplast microsatellite haplotyping

Ten pairs of consensus chloroplast microsatellite primers (Weising and Gardner 1999) were used to characterize mononucleotide SSR variation in chloroplast DNA (Table 1). The PCR was performed employing the same protocol as above, except for initial denaturation (94 °C for 5 min) and final extension (72 °C for 8 min), with an annealing temperature of 50 °C. Amplified products were separated using a non-denaturing polyacrylamide gel (8 %, 29:1) in 0.5 × Tris-glycine buffer and stained with a modified silver staining protocol (Sanguinetti et al. 1994).

Table 1 Ten pairs of consensus chloroplast microsatellite primers used to characterize mononucleotide repeat variation in intronic and intergenic cpDNA regions of the nine Corchorus species including the two cultivated jute species C. capsularis and C. olitorius

For cpSSR allele detection and scoring, Gene Profiler Version 4.05 (Scanalytics, Fairfax, USA) was used. Alleles were generated using linkage analysis database module of the software. Chloroplast haplotypes were resolved, and haplotype statistics, such as the effective number of haplotypes (\(N_e\)), the number of private haplotypes (P), haplotypic richness (\(R_h\)) (Petit et al. 1998), the Nei’s unbiased genetic diversity (\(H_e\)) (Nei 1973) and average genetic distance between individuals (\(D^2_{sh}\)) (Morgante et al. 1998) were calculated by using the software HAPLOTYPE ANALYSIS Version 1.05 (Eliades and Eliades 2009). Since length polymorphisms caused by variable mononucleotide repeats are not suitable for phylogenetic analysis, chloroplast haplotypes were mapped onto the nSSR NJ tree.

Results and discussion

Morphological and morpho-phenetic analyses

Morphological traits were highly polymorphic and differed significantly among the Corchorus species (Fig. 1). Our results showed that all the wild jute species produced finer fibre with lower tensile strength, while both cultivated jute species produced relatively coarser fibre with higher tensile strength. Non-parametric Kendall’s coefficients of rank correlation showed varying degrees of positive association among seven of the eight quantitative traits (Table 2). Except for leaf ratio, all traits were significantly associated with fibre yield plant−1.

Fig. 1

Variation in the eight highly polymorphic bast fibre yield- and quality-linked traits in Corchorus species. Means with common letters are not significantly different at P ≤ 0.05, according to Tukey-Kramer honestly significant difference test

Table 2 Associations between bast fibre yield- and quality-linked traits in Corchorus species based on non-parametric method of Kendall’s coefficients (τ)a of rank correlations

We considered here the majority-rule (>50 % bootstrap support) consensus tree produced by average taxonomic distance because of its universal use in numerical taxonomy (Rohlf 2005). The nine Corchorus species were grouped into two major clusters, each with >75 % bootstrap support: cluster I with both the cultivated jute species together with a mutant of C. olitorius, and cluster II with all the wild jute species together with a mutant of C. capsularis (Fig. 2). In the wild species cluster, C. aestuans, C. pseudo-capsularis and C. urticifolius were grouped in a distinct sub-cluster (IIb), while the remaining species formed a separate sub-cluster (IIc). Based on morpho-phenetic data, C. aestuans was found to be closely related to C. pseudo-capsularis (sub-cluster IIa). The suite of bast fibre yield- and quality-linked traits provided a high degree of phenetic support to both the cultivated jute species in relation to their wild relatives. Chakraborty and Palit (2009) have also found them to be closely related based on a large number of morphological characters.

Fig. 2

A majority-rule consensus unweighted pair-group method arithmetic average tree constructed from the eight bast fibre yield- and quality-linked traits’ similarities. Photographs from left to right show seed, leaf and capsule of individual species, and the numbers at nodes represent the occurrence in 1,000 bootstrap replicates

Nuclear microsatellite polymorphisms

Summary statistics were computed for each nSSR locus (Supplementary Table 3) and for the whole sample and each of the two groups (cultivated and wild) over the 38 nSSR loci (Table 3). A total of 222 alleles was detected, with 95 and 183 alleles distributed in cultivated and wild jute, respectively. The largest number of alleles was observed in C. fascicularis (59), followed by C. aestuans (54), and the smallest in C. capsularis (28). The number of observed alleles per locus (\(n_a\)) varied from 4 to 8, with an average (\(n_o\)) of 5.8 alleles per locus, whereas the effective number of alleles per locus (\(n_e\)) varied from 2.3 to 7.6, with an average of 4.6 alleles per locus. For \(n_o\), \(n_e\) and the allelic richness (\(R_s\)), there was significant variation between the two Corchorus groups. The average genetic diversity (\(H_e\)) was 0.81 for the whole sample, and it was significantly higher in the wild (0.78) than in the cultivated jute (0.56). Overall, the average genetic diversity (\(H_e\)) in wild Corchorus species was ~40 % higher than that in the cultivated jute species. However, the average level of the Levene’s observed heterozygosity (\(H_o\)) was relatively low (0.31) for the whole sample. The average PIC was significantly higher in the wild (0.72) than in the cultivated (0.43) jute. Similarly, Shannon’s information index (I) was significantly higher in the wild (1.43) than in the cultivated (0.77) jute, with an average of 1.60 for the whole sample (Table 3).

Table 3 Summary statistics for nuclear microsatellite diversitya in Corchorus species, for the cultivated and wild Corchorus groups and for the whole sample over the 38 nuclear microsatellite loci

With increasing allele frequency, the number of alleles decreased (Fig. 3). The number of low-frequency (≤0.3) alleles was markedly higher in the wild (156) than in the cultivated (42) jute. Alleles with a frequency of <0.1 were not detected in the cultivated jute. In contrast, the numbers of medium-frequency (>0.3 to ≤0.6) and high-frequency (>0.6) alleles were higher in the cultivated (41 and 12, respectively) than in the wild (26 and 1, respectively) jute. Out of the 222 alleles, there were 166 private alleles: 39 in the cultivated and 127 in the wild jute. The cultivated and wild jute shared a total of 56 alleles, with an average frequency of 0.26.
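The allele counts reported in this section are mutually consistent, as the following bookkeeping sketch shows. It uses only counts given in the text; the function name `allele_accounting` is hypothetical, and the shared count is derived by inclusion-exclusion on the cultivated and wild groups.

```python
def allele_accounting(total, cultivated, wild):
    """Split a pooled allele count into shared and group-private alleles."""
    shared = cultivated + wild - total  # inclusion-exclusion on the two groups
    return {
        "shared": shared,
        "private_cultivated": cultivated - shared,
        "private_wild": wild - shared,
        "private_total": total - shared,
        "share_cultivated_pct": round(100 * cultivated / total),
        "share_wild_pct": round(100 * wild / total),
    }


# Counts as reported in the text (38 nSSR loci).
summary = allele_accounting(total=222, cultivated=95, wild=183)

# Frequency classes (low, medium, high) should add up to each group's total.
classes = {"cultivated": (42, 41, 12), "wild": (156, 26, 1)}
class_totals = {group: sum(counts) for group, counts in classes.items()}
```

The derived values reproduce the 56 shared and 166 private alleles (39 cultivated, 127 wild), and the three frequency classes sum to 95 and 183 alleles for the cultivated and wild groups, respectively.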

Fig. 3

Frequency distribution of nuclear microsatellite alleles in the cultivated and wild jute over the 38 loci

Together, the two cultivated jute species contained ~70% of the nuclear microsatellite diversity present in their wild relatives, reflecting the proportion of wild progenitors’ neutral genetic diversity that is typically retained by today’s modern crops (Gepts 2004; Allaby et al. 2008). The nSSR allele frequency indicates that the wild jute contains many alleles that are missing from the cultivated jute. This is reflected not only in the proportions of the total allelic diversity represented by the wild (82 %) versus cultivated (43 %) jute, but also in the distributions and patterns of private alleles. This reduction in neutral genetic diversity may be due to genetic drift in the form of domestication bottlenecks (Gepts 2004). It is well known that C. capsularis has a much lower level of neutral genetic diversity than C. olitorius (Basu et al. 2004; Roy et al. 2006; Haque et al. 2007; Akter et al. 2008; Mir et al. 2008, 2009). We have also recorded the least nSSR polymorphism in the former. The reduced diversity in C. capsularis compared to C. olitorius indicates its relatively recent origin as a species. Based on Nei’s measure of neutral gene diversity, Allender and King (2010) have also drawn the same conclusion for Brassica napus, in relation to B. oleracea and B. rapa.

nSSR neighbor-joining tree

Jaccard’s NJ tree separated two broad groups, with the tree rooted on C. aestuans: i) C. olitorius including its mutant together with C. urticifolius, and ii) C. capsularis including its mutant together with C. fascicularis, C. pseudo-capsularis, C. pseudo-olitorius, C. tridens and C. trilocularis (Fig. 4). The two-way Mantel test showed lower concordance of the nSSR results to the morphological data, as revealed by a low correlation value between the two data sets (r = 0.35). Notwithstanding, the nSSR data still supported a close relationship between C. fascicularis, C. pseudo-olitorius, C. tridens and C. trilocularis. However, the two cultivated jute species were not closely related, in agreement with earlier results of AFLP (Basu et al. 2004), ISSR (Roy et al. 2006), SSR (Basu et al. 2004; Akter et al. 2008; Mir et al. 2008), RAPD (Hossain et al. 2002; Haque et al. 2007; Roy et al. 2006) and STMS (Roy et al. 2006) analyses. C. urticifolius was found to be closely related to C. olitorius, and therefore it could be the closest extant relative of dark jute. In contrast, the nSSR results failed to identify a close relative of C. capsularis.

Fig. 4

Phylogenetic tree based on a distance matrix of nine Corchorus species examined with 38 single-locus nuclear microsatellites. The tree was constructed by using Jaccard’s weighted version of the Neighbor-Joining method, with its topology shown as an inset. The unique chloroplast haplotype of each species, as resolved by characterizing mononucleotide repeat-based chloroplast microsatellite variation over the eight intronic and intergenic cpDNA regions, is indicated. Numbers at nodes represent % bootstrap support out of 1,000 bootstrap replicates

Our results suggest that C. aestuans could be a common ancestor to both the cultivated jute species. It is sexually compatible with both C. capsularis and C. olitorius (Arangzeb 1994). Roy et al. (2006) have earlier advocated that it could be a progenitor of C. olitorius, with a possibility that both of them might have evolved from a common ancestor. Recent morpho-molecular grouping (Chakraborty and Palit 2009) and karyomorphological analysis (Maity and Datta 2009) have also shown them to be primary relatives. In addition, a close relationship between C. aestuans and C. urticifolius may further allude to the former as an ancestor to C. olitorius, possibly introgressed via natural hybridization with C. urticifolius. The possibility of natural hybridization between the ancestral species of C. olitorius has been suggested earlier (Mir et al. 2008). Of the Corchorus species, C. aestuans is the most cosmopolitan and is not only capable of growing in extremely diverse habitats (Edmonds 1990; Benor et al. 2010), but is also present throughout the world across continents. It is thus logical to assume that both the cultivated jute species might have evolved from C. aestuans or one of its very similar ancestral species. It has been shown that they are allopatric with certain common alleles and may have originated from a closely related common ancestor (Basu et al. 2004).

Chloroplast haplotype diversity

Except for the loci rpl20-rps12 intergenic and ORF 74b-psbB intergenic, all cpSSR loci were amplified in Corchorus species. All of them, excepting trnK intron, were polymorphic and informative (Fig. 5). A total of 30 alleles was detected over the eight loci (Table 1). In total, nine unique chloroplast haplotypes (haplo-1 to haplo-9) were resolved (Table 4; Fig. 4). The haplotype of white jute was more similar to those of C. aestuans and C. pseudo-olitorius, while the haplotype of dark jute was relatively more similar to those of C. aestuans, C. pseudo-olitorius and C. urticifolius. Intra-group haplotype statistics are shown in Table 5. The wild jute had higher haplotypic richness (\(R_h\)), genetic diversity (\(H_e\)) and mean genetic distance between individuals (\(D^2_{sh}\)) than the cultivated jute. The two cultivated species contained ~70% of the chloroplast haplotype diversity present in their wild relatives.

Fig. 5

Non-denaturing polyacrylamide gels showing monomorphism at the trnK intron and polymorphism at the trnG intron, ORF 77-ORF 82 intergenic and rpl2-rps19 intergenic chloroplast microsatellite loci. M, 100-bp ladder; 1, C. capsularis cv. JRC-321; 2, C. capsularis mt. CMU-010; 3, C. olitorius cv. JRO-524; 4, C. olitorius mt. PPO-4; 5, C. aestuans; 6, C. fascicularis; 7, C. pseudo-capsularis; 8, C. pseudo-olitorius; 9, C. tridens; 10, C. trilocularis; 11, C. urticifolius

Table 4 Haplotypes of Corchorus species as resolved by characterizing the eight chloroplast microsatellite (cpSSR) loci representing the intronic and intergenic cpDNA regions
Table 5 Summary statistics for chloroplast haplotype diversitya in Corchorus species, for the cultivated and wild Corchorus groups and for the whole sample over the eight chloroplast microsatellite loci

What is intriguing in our study is that none of the wild species haplotypes is shared by either of the two domesticated jute species. We further haplotyped 19 and 20 indigenous plus exotic cultivars of white and dark jute, respectively (Supplementary Table 4). However, each cultivated species was characterized by a single chloroplast haplotype. C. olitorius cvs JRO-632 (indigenous) and Sudan Green (exotic from Sudan) were found to represent the same haplotype (haplo-2). Also, the haplotype of another exotic cv., Tanganyika-1 (from Tanzania), was found to be similar to that of the indigenous cultivars of C. olitorius. Similarly, C. capsularis accessions from Africa (CEX-033), Australia (Cap Australia Green) and South America (Solimos) shared the same haplotype (haplo-4) as those from South Asia.

Considering the magnitude of the allele size variation, it seems most likely that the haplotypes of both the cultivated jute species are the result of introgression/chloroplast capture (Wills and Burke 2006). The allele size variation due to variable mononucleotide repeats in noncoding cpDNA is known to increase with phylogenetic distance between the taxa (Weising and Gardner 1999). Thus, on the maternal lineage, the two cultivated jute species are not phylogenetically close, in agreement with an earlier report that suggested their matrilineal origins to be different (Basu et al. 2004). Since the haplotypes of C. aestuans and C. pseudo-olitorius, which are very similar to each other, are more similar to that of C. capsularis than to that of C. olitorius, either of these wild species is a much more likely matrilineal progenitor of C. capsularis. But it is not possible at this stage to identify a putative matrilineal progenitor of C. olitorius.

Did both the cultivated jute species originate in Africa, but were independently domesticated in Asia?

Our study supports an African origin of dark jute, possibly in the equatorial region of east Africa, because its closest extant relative, C. urticifolius, is restricted to this region. Archaeobotanical results also suggest its origin in Africa, with introduction to India together with many African crops that reached India in prehistory (Blench 2003). This finds support from the recent human Y-chromosome evidence that indicates an African origin of Dravidian agriculture in southern India (Winters 2010). Interestingly, however, archaeobotanical evidence suggests that C. urticifolius is of south Asian origin and reached Africa at an early period (Blench 2003). Taken together, it is thus most likely that dark jute had its origin in Africa from C. aestuans and C. urticifolius, but was domesticated in India through the development of an ennobled type from its African wild type. Benor et al. (2012) have hypothesized that dispersal of dark jute might have occurred via the Mediterranean-Indian trade route. However, there is no archaeobotanical evidence for C. capsularis, except that it was used in China and India in early history in clear distinction from silk (Gopal 1961). The possibility of the dispersal of white jute from Africa to Asia (south China) via the Nile valley along the ancient silk route cannot be ruled out. However, in the absence of corresponding African, Chinese or Indian archaeobotanical evidence for white jute, this view remains broadly speculative.