Introduction

The domestication of indigenous fruits and nuts to diversify subsistence agriculture plays an important role in achieving the Millennium Development Goals of combating poverty and hunger and mitigating environmental degradation in developing countries (Leakey et al. 2007). Studies on the biological variability of indigenous fruit tree species, their propagation using cheap and simple methods appropriate for rural development projects, and their suitability for domestication have progressively increased in West Africa over the last decades (Leakey et al. 2000). In some cases, a participatory approach to cultivar development was implemented with success (Leakey et al. 2003). In developing cultivars of priority tree species, two key elements are (1) the identification of “plus trees” in natural populations and (2) their propagation by vegetative techniques (Leakey and Page 2006). Prior to “plus tree” identification, the quantitative variation in fruits, nuts and kernels (Leakey et al. 2005a) and the variation in nutritive value and other food properties (Leakey et al. 2005b) have to be characterized, and the interactions between traits must be understood to allow multi-trait selection (Leakey 2005). Tamarind, Tamarindus indica L. (Leguminosae: Caesalpinioideae), is a semi-evergreen multipurpose tree typical of savannah ecosystems, featuring prominently in riparian habitats (Fandohan et al. 2010). The tamarind tree has an important role in local economies, supplements the local diet, and is used in traditional and modern therapies (El-Siddig et al. 2006). Its pulp is much appreciated in condiments, is used to make juice and is a good source of proteins, fats and carbohydrates that could be used to alleviate malnutrition in children (El-Siddig et al. 2006).
In efforts to enhance the species’ genetic conservation and utilization, it has recently been identified as one of the top ten agroforestry tree species to be prioritized for future crop diversification programs and development in sub-Saharan Africa (Eyog Matig et al. 2002). Although tamarind is of high local economic importance, knowledge of its morphometric and ecological diversity is still limited outside of Asia. Work on Asian tamarind populations has revealed considerable phenotypic and genotypic variation and allowed the selection of superior trees based on pulp mass, pulp taste and fruit length (El-Siddig et al. 2006). Other studies have addressed its domestication potential in Africa and provided data on biochemical analyses (Soloviev et al. 2004), on the comparison of the genetic diversity of African, Asian and South American populations as an indicator of the species’ native area, and on breeding systems and pollination-related issues (Diallo et al. 2007, 2008). To our knowledge, no study has documented (1) indigenous perception of qualitative or size-group morphological variation within the species and (2) quantitative morphological and genetic structuring to test whether locally perceived variations and preferences are genetically or ecologically determined. Such a bottom-up approach may help to identify and characterize “plus trees”, locate ecological conditions that allow the species to better express its potential (i.e. fruit size and yield, pulp productivity, pulp taste, etc.) and identify links between traits, and it is crucial to make improvement strategies realistic. The current study aims at (1) matching the quantitative assessment of tamarind fruit traits with the folk classification based on local knowledge, (2) analysing its relationship with ecological conditions and (3) analysing the implications for further improvement programs.
Thus, the following questions were addressed: Do quantitative descriptors confirm folk classification of tamarind morphotypes? Which ecological factors drive the pattern of morphological variability in T. indica?

Materials and methods

Study area

The study was conducted in Benin (West Africa). Three different ecological zones were targeted based on the distribution range of tamarind: the Sudanian zone (9°45′–12°25′ N), the Sudano-Guinean zone (7°30′–9°45′ N) and the sub-humid Guineo-Congolian zone (6°25′–7°30′ N). The Guineo-Congolian zone is the wettest, with a bimodal rainfall regime, whilst the Sudanian zone is the driest, with a drought period of nearly 7 months. In the Guineo-Congolian zone, the vegetation consists of grassland, thickets and some relict rain forests. In the Sudano-Guinean zone, the vegetation is dominated by Isoberlinia spp. woodlands, whereas in the Sudanian zone it is characterised by Combretum spp. and Acacia spp. tree savannas (White 1983). Table 1 summarizes the ecological characteristics of the three study sites.

Table 1 Characteristics of the three study sites (adapted from Hijmans et al. 2004)

Data collection

Tamarind individuals were sampled in the Sudanian, the Sudano-Guinean and the Guineo-Congolian ecological zones of Benin. Within each zone, trees were sampled where local people had experience with and knowledge of the tamarind tree. Ethnobotanical surveys were carried out on the local perception of the morphological variation in tamarind fruits. The surveys revealed that local people distinguish ten morphotypes (Table 2), a morphotype being a group of tamarinds sharing some qualitative fruit traits as perceived by interviewees. In each study site, two experienced women very familiar with the described morphotypes, chosen with the help of local leaders, were asked to participate in the selection of the tamarind individuals to be sampled for the description of fruit morphological traits. Five trees were sampled per morphotype: 25 individuals in the Sudanian zone, 15 in the Sudano-Guinean zone and 10 in the Guineo-Congolian zone. The number of sampled trees varied between zones because not all morphotypes were found in all zones. From each selected tree, 30 fruits and 30 seeds were collected for measurement following the protocol described by Leakey et al. (2000). We measured twelve morphological descriptors on fruits (length, width, thickness, number of seeds, fresh mass, dry mass, pulp mass, and the ratio pulp mass/fruit mass) and seeds (length, width, thickness and mass). To improve accuracy, fruit width and thickness were measured at the first, second and third quarter of each fruit, and the arithmetic means were taken as the fruit’s width and thickness. Similar descriptors have been used in other studies (IBPGR 1980; Leakey et al. 2000, 2005a, b; El-Siddig et al. 2006).

Table 2 Folk classification of tamarind morphotypes

To estimate pulp mass, fruits were oven-dried at 65°C for 48 h to obtain the dry mass. Dried fruits were broken open and the content extracted (pulp + seeds + fibers). The pulp was removed by soaking the content in water. The residue (seeds and fibers) was oven-dried at 65°C for 48 h. This protocol had previously been used successfully for the baobab tree (Assogbadjo et al. 2005).

Overall, 18,000 individual values were recorded for 1500 fruits and 1500 seeds from the 50 analyzed trees.
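The totals reported above follow directly from the sampling design. As a quick arithmetic check, the sketch below recomputes them from the counts stated in this section; the variable names are ours and purely illustrative.

```python
# Counts taken from the text; variable names are illustrative only.
trees_per_zone = {"Sudanian": 25, "Sudano-Guinean": 15, "Guineo-Congolian": 10}
units_per_tree = 30      # fruits (and seeds) collected per tree
fruit_descriptors = 8    # length, width, thickness, number of seeds,
                         # fresh mass, dry mass, pulp mass, pulp/fruit ratio
seed_descriptors = 4     # length, width, thickness, mass

n_trees = sum(trees_per_zone.values())
n_fruits = n_trees * units_per_tree
n_seeds = n_trees * units_per_tree
n_values = n_fruits * fruit_descriptors + n_seeds * seed_descriptors

print(n_trees, n_fruits, n_seeds, n_values)  # 50 1500 1500 18000
```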

Monthly climatic data (rainfall, relative humidity, minimum and maximum temperatures and insolation) and number of dry months per year for over 30 years (1978–2008) were obtained for each study site within ecological zones from Hijmans et al. (2004).

Analysis

The pulp mass (wp) in each fruit was computed using the following formula:

$$ wp = wP_{i} - wR_{i} $$
(1)

where wp is the pulp content of a given fruit i; \( wP_{i} \) is the dried mass of fruit i; \( wR_{i} \) is the total dried mass of the seeds, fibers and husk of fruit i.
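Equation (1) can be illustrated with a minimal Python sketch. The function names and the example masses are hypothetical, and the sketch assumes that the pulp/fruit ratio is taken against the dried fruit mass, which the text does not state explicitly.

```python
def pulp_mass(dry_fruit_mass_g: float, dry_residue_mass_g: float) -> float:
    """Equation (1): pulp mass = dried fruit mass - dried non-pulp residue mass."""
    if dry_residue_mass_g > dry_fruit_mass_g:
        raise ValueError("residue mass cannot exceed dried fruit mass")
    return dry_fruit_mass_g - dry_residue_mass_g


def pulp_fruit_ratio(dry_fruit_mass_g: float, dry_residue_mass_g: float) -> float:
    """Pulp mass as a fraction of dried fruit mass (assumed denominator)."""
    return pulp_mass(dry_fruit_mass_g, dry_residue_mass_g) / dry_fruit_mass_g


# Hypothetical fruit: 12.0 g dried, of which 7.5 g is seeds, fibers and husk.
wp = pulp_mass(12.0, 7.5)
ratio = pulp_fruit_ratio(12.0, 7.5)
print(wp, ratio)  # 4.5 0.375
```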

Univariate analyses of variance and Student–Newman–Keuls (SNK) tests were used to describe the morphotypes and identify the discriminative descriptors. Then, Least Square Means of fruits and seeds descriptors were estimated and a Canonical Discriminant Analysis (using the Mahalanobis distance) was performed to reveal links between the descriptors and plot distances between morphotypes. This multivariate analysis is a relevant and powerful test to distinguish between entities that fall into natural groupings i.e. morphological or ecological groups (Lowe et al. 2004).
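For readers unfamiliar with the univariate step, the F statistic of a one-way analysis of variance can be sketched in a few lines of Python. This is only an illustration with made-up measurements: the analyses in this study were run in SAS, and the sketch covers neither the P value (which requires the F distribution) nor the SNK post hoc test or the canonical discriminant analysis.

```python
from statistics import fmean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    groups: list of lists, one list of measurements per morphotype.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = fmean(x for g in groups for x in g)
    means = [fmean(g) for g in groups]
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical fruit lengths (cm) for three morphotypes, not data from this study.
f_stat, df_b, df_w = one_way_anova_f([[10, 12, 11], [14, 15, 16], [20, 19, 21]])
print(round(f_stat, 3), df_b, df_w)
```

A large F relative to its degrees of freedom indicates that the descriptor discriminates between morphotypes.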

Afterwards, the within and between morphotypes variability was evaluated using Variance Component Analysis (Goodnight 1978). To examine the influence of ecological conditions on tamarind fruit and seed traits, a Principal Component Analysis (PCA) was performed only on the quantitative descriptors. The PCA factor scores were correlated with the climatic index of Mangenot (1951), the minimum and the maximum temperatures and the insolation using a Pearson correlation.
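The idea behind the variance component analysis can be sketched as follows. The sketch makes two simplifying assumptions that are ours, not the study's: it uses the method-of-moments estimator for a balanced one-way layout (the estimation method used in SAS is not stated here), and it ignores the tree level nested within morphotypes. The data are hypothetical.

```python
from statistics import fmean


def variance_components(groups):
    """Between/within variance components for a balanced one-way layout.

    Returns (percent_between, percent_within).
    """
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes equal group sizes")
    k = len(groups)
    grand_mean = fmean(x for g in groups for x in g)
    means = [fmean(g) for g in groups]
    ms_between = n * sum((m - grand_mean) ** 2 for m in means) / (k - 1)
    ms_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g) / (k * (n - 1))
    # E[MS_between] = var_within + n * var_between; negative estimates are set to zero.
    var_between = max((ms_between - ms_within) / n, 0.0)
    total = var_between + ms_within
    return 100 * var_between / total, 100 * ms_within / total


# Hypothetical pulp masses (g) for three morphotypes, not data from this study.
pct_between, pct_within = variance_components([[10, 12, 11], [14, 15, 16], [20, 19, 21]])
print(round(pct_between, 1), round(pct_within, 1))
```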

The climatic index of Mangenot (\( I_{M} \)) was computed for each sampled site as follows:

$$ I_{M} = \frac{\frac{P}{100} + M_{S} + \overline{Ux}}{nS + \frac{500}{\overline{Un}}} $$
(2)

where P is the mean annual rainfall (mm), \( M_{S} \) is the mean rainfall of dry months (i.e. months with rainfall less than 50 mm), nS is the number of dry months, \( \overline{Ux} \) is the maximum annual relative humidity (%) and \( \overline{Un} \) is the minimum annual relative humidity (%). A higher Mangenot index indicates wetter ecological conditions.
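Equation (2) translates into a short Python function. The two sets of site values below are hypothetical and chosen only to show that a wetter site yields a higher index; they are not the values of Table 1.

```python
def mangenot_index(p_mm, ms_mm, n_dry, u_max, u_min):
    """Climatic index of Mangenot, following Equation (2).

    p_mm:  mean annual rainfall (mm)
    ms_mm: mean rainfall of dry months (mm)
    n_dry: number of dry months
    u_max: maximum annual relative humidity (%)
    u_min: minimum annual relative humidity (%)
    """
    numerator = p_mm / 100 + ms_mm + u_max
    denominator = n_dry + 500 / u_min
    return numerator / denominator


# Hypothetical sites (illustrative values only).
wet_site = mangenot_index(p_mm=1200, ms_mm=20, n_dry=4, u_max=90, u_min=50)
dry_site = mangenot_index(p_mm=900, ms_mm=5, n_dry=7, u_max=80, u_min=25)
print(round(wet_site, 2), round(dry_site, 2))
```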

As the pulp is the principal trait of commercial importance, we also carried out a linear regression to identify predictors of pulp yield per fruit and to test whether the predictive power of the explanatory variables differs between morphotypes. We built a linear regression for pulp mass per fruit, with eight candidate independent variables measured on fruits (length, width, thickness and mass) and seeds (length, width, thickness and mass). Pearson’s correlations were computed between the independent variables to test for multicollinearity. Since there were significant strong correlations between pairs of variables (r > 0.60, P < 0.001), only one independent variable (fruit mass) was finally used in the regression model. We inserted traditional morphotype in the model as a dummy variable (see Kutner et al. 2005). The model tested was: pulp mass = β0 + β1(fruit mass) + β2(traditional morphotype) + ε, where β0 is the intercept, β1 and β2 are the partial regression slopes and ε is the unexplained error associated with the model. The normality plot of the residuals, the residuals vs. fitted plot and the residuals vs. leverage plot with Cook’s distance were used to diagnose the regression models (Quinn and Keough 2005). Data were processed with SAS version 9.1 (SAS Inc. 2003).
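The structure of this model can be illustrated with a minimal Python sketch. It assumes that dummy coding of morphotype amounts to one intercept per morphotype with a common slope for fruit mass, in which case the ordinary least squares solution has a closed form based on within-group centring. Estimating morphotype-specific slopes would require separate fits or interaction terms, which this sketch does not cover. The function name and the data are hypothetical; the study itself used SAS.

```python
from collections import defaultdict
from statistics import fmean


def fit_common_slope(records):
    """OLS fit of y = b0_g + b1 * x with one intercept per group g.

    records: iterable of (group, x, y) tuples, e.g. (morphotype, fruit mass, pulp mass).
    Returns (b1, {group: intercept}).
    """
    by_group = defaultdict(list)
    for group, x, y in records:
        by_group[group].append((x, y))

    sxy = sxx = 0.0
    means = {}
    for group, pairs in by_group.items():
        x_bar = fmean(x for x, _ in pairs)
        y_bar = fmean(y for _, y in pairs)
        means[group] = (x_bar, y_bar)
        # Pool the within-group cross-products and sums of squares.
        sxy += sum((x - x_bar) * (y - y_bar) for x, y in pairs)
        sxx += sum((x - x_bar) ** 2 for x, _ in pairs)

    b1 = sxy / sxx
    intercepts = {g: y_bar - b1 * x_bar for g, (x_bar, y_bar) in means.items()}
    return b1, intercepts


# Hypothetical (morphotype, fruit mass in g, pulp mass in g) records.
data = [
    ("F", 10, 4.0), ("F", 20, 7.0), ("F", 30, 10.0),
    ("A", 5, 2.0), ("A", 10, 3.5), ("A", 15, 5.0),
]
slope, intercepts = fit_common_slope(data)
print(round(slope, 3), {g: round(b, 3) for g, b in intercepts.items()})
```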

Results

Quantitative morphological assessment of traditionally classified tamarind morphotypes

Table 3 shows the mean values recorded for quantitative fruit and seed descriptors in the 10 identified morphotypes of T. indica. Mean fruit traits (length, width, thickness, number of seeds, fresh mass, dry mass, pulp mass, ratio length/width and ratio pulp/fruit) and mean seed traits (length, width, thickness, mass and ratio length/width) differed significantly between morphotypes (P < 0.0001; Wilks’ Lambda ≥ 0.124). The lowest fresh and dry fruit mass, pulp mass, seed length, seed width and seed mass were recorded for fruits from morphotype A, whilst fruits from morphotype B showed the lowest fruit thickness. Fruits from morphotype C showed the highest ratio pulp/fruit and seed thickness whereas fruits from morphotype E had the lowest length. Fruits from morphotype F showed the highest values in width, fresh and dry mass and pulp mass while fruits from morphotype G showed the highest values in length and thickness, seed length, seed thickness and seed mass. Fruits from morphotype H showed the lowest values in width, number of seeds per fruit and seed thickness whereas fruits from morphotype J showed the highest values for number of seeds per fruit but the lowest values for the ratio pulp/fruit.

Table 3 Means and standard errors of quantitative morphological descriptors of fruits and seeds of the ten locally identified tamarind morphotypes

The multivariate canonical discriminant analysis on fruit and seed descriptors using the Mahalanobis distance calculation confirmed the morphotypes as discriminated by local people (P < 0.0001; Wilks’ Lambda = 0.82).

The canonical discriminant analysis performed on the ten morphotypes showed that the first two axes explained 82% of the observed variation. These axes were thus used to describe the relationships between the investigated descriptors and traditional morphotypes. The correlations between the axes and the descriptors are shown in Table 4. The first axis showed a strong and positive link with, and between, fruit length, width, thickness, fresh mass, dry mass, pulp mass and number of seeds per fruit, and seed length, seed width and seed mass. This axis was negatively correlated with the ratio pulp/fruit. Figure 1 shows the projection of the individuals from the ten morphotypes onto axes 1 and 2. From this plot and Table 4 it can be deduced that, overall, morphotypes F and G (located in the upper positive part of axis 1) outclassed the others for most of the quantitative descriptors but showed low values for the ratio pulp/fruit. In contrast, the other morphotypes had high values for the ratio pulp/fruit.

Table 4 Correlation between quantitative morphological descriptors of tamarind fruit and seed and canonical discriminant axes
Fig. 1 Canonical discriminant analysis revealing differences between morphotypes and links between descriptors. A, B, C, D, E, F, G, H, I and J are the different morphotypes; can1 = first canonical axis, can2 = second canonical axis

Despite the significant differences among morphotypes suggested by the canonical discriminant analysis, the variance components analysis revealed that the variation within morphotypes is higher than that between them for all fruit and seed descriptors except seed length (Table 5). In general, 47 to 95% of the morphological variation was present within morphotypes. Nevertheless, substantial between-morphotype variation was detected for fruit mass, pulp mass, seed length, seed width and seed mass (24 to 53%).

Table 5 Results of the variance components estimation procedure (in percentage) on tamarind fruit and seed traits

Influence of ecological conditions on the quantitative descriptors of tamarind fruits and seeds

The Principal Component Analysis performed on morphological traits showed that the first two axes explained 63% of the variation. Table 6 shows the correlation between the axes and quantitative descriptors. The first axis shows a positive link between some fruit traits (length, width, thickness, fresh mass, dry mass and pulp mass) and seed traits (length, width and mass). The second axis was correlated with seed thickness only. Moreover, the first axis was significantly and positively correlated with the climatic index of Mangenot (\( I_{M} \)), whereas it was negatively correlated with the maximum temperature (\( T_{max} \)) and insolation (Ins) (Table 6). This means that the fruit traits (length, width, thickness, fresh mass, dry mass and pulp mass) and the seed traits (length, width and mass) increase with higher \( I_{M} \) but decline with higher maximum temperature and insolation. The other relationships were not significant. Overall, it can be deduced that fruits from wetter zones (i.e. the Guineo-Congolian zone) generally had greater fruit and seed size and mass, whilst fruits from drier zones (i.e. the Sudanian zone) had thinner and lighter fruits and seeds.

Table 6 Correlation between quantitative morphological traits, ecological factors and PCA factors

Modelling pulp yield per fruit

Regression equations were used to build predictive models for pulp yield (the principal trait of commercial importance) based on fruit mass (Table 7). There were highly significant and strong relationships between fruit mass and pulp mass (R² = 0.795). However, fruit mass was a stronger predictor of pulp mass for morphotypes J, D, C, I and A (i.e. higher estimated regression slopes) than for morphotypes F, G, E, H and B (0.32 < β1 < 0.47 versus 0.07 < β1 < 0.3; P < 0.0001; Table 7).

Table 7 Linear regression model for T. indica pulp yield per fruit

Discussion

This paper quantifies variation in traditional morphotypes of tamarind and provides basic knowledge on the range of variation of several quantitative morphological descriptors within and between locally identified morphotypes, across ecologically different sites. Ten morphotypes were recorded using folk taxonomy. This is consistent with previous studies on tamarind in India, Thailand and the Philippines, where eight to fifty cultivars are differentiated based on fruit size and degree of sweetness (El-Siddig et al. 2006). These morphotypes may have resulted from complex genetic inter-crossing processes, but local people link the differences in pulp taste to habitat types. For instance, they affirm that sweet fruits are found in gallery forests whereas sour fruits are found in savannah lands (observations from an ongoing survey).

The quantitative morphological analyses of fruits and seeds of the 10 identified morphotypes confirmed that the traditional discrimination is effective. From the results we can conclude that, in general, larger and heavier fruits have a lower pulp/fruit ratio despite a significant increase in pulp mass. This may indicate that, for superior morphotypes, pulp mass increases less than the mass of the remaining parts of the fruit (seeds and husk). The correlations were, however, less evident for the number of seeds per fruit. In fact, the number of seeds per fruit seemed to result from a trade-off between fruit length and seed size (e.g. longer or shorter fruits may contain either fewer or more seeds depending on seed size; personal observation).

Despite the confirmation of the traditional discrimination, the statistical analysis revealed that most of the variability of morphological traits of fruits and seeds is present within the morphotypes. This suggests a significant heterogeneity within fruits traditionally classified as belonging to the same morphotypes. To get a more powerful morphological discrimination, quantitative descriptors should hence be combined with locally perceived qualitative traits (pulp taste and color). The very extensive variation found irrespective of the descriptors is consistent with previous studies on tamarind (El-Siddig et al. 2006) and other indigenous fruit trees such as Detarium microcarpum Guill. and Perr. (Kouyaté and Van Damme 2002), Irvingia gabonensis (Aubry-Lecomte ex O’Rorke) Baill. ex Lanen and Dacryodes edulis (G. Don) H.J. Lam (Leakey et al. 2004), Adansonia digitata L. (Assogbadjo et al. 2006, 2008, 2009; Kyndt et al. 2009), Vitellaria paradoxa C.F. Gaertn. (Sanou et al. 2006) and Canarium indicum L. (Leakey et al. 2008).

Variation in fruit size and in number of seeds per fruit was found to be significantly affected by cross pollination and resource availability for T. indica (Thimmaraju et al. 1989; Diallo et al. 2008). According to these authors, self-incompatibility in self-pollinated flowers and resources limitation (which imposes a sorting by tamarind trees) may reduce fruit size and the number of seeds per fruit. The partial link with resource limitations is mirrored by the pattern of correlations found in this study between some ecological factors and some pinpointed morphological traits. For instance, the positive link between fruit traits (length, width, thickness, fresh mass, dry mass and pulp mass) and seed traits (the length, width and mass) on one hand and ecological factors such as the climatic index of Mangenot on the other hand, and their opposite link with maximum temperature and insolation suggest that fruit and seed size and mass of tamarind trees tend to increase with humidity (i.e. higher climatic index) and decline with aridity (i.e. higher maximum temperature).

Phenotypic plasticity was found in several species of tropical and temperate trees for many traits, usually in response to changes in ecological conditions (Heaton et al. 1999) and biogeographic history of individual species (Schlichting and Pigliucci 1998). Nevertheless, ecological differences may only partly explain the observed variations, the remaining part being driven by genetic variation.

The high variability indicates great potential for further improvement through the development of cultivars from elite trees using horticultural techniques (Leakey et al. 2008). Rapid benefits may be obtained by selecting superior morphotypes and propagating such stocks as clones (El-Siddig et al. 2006). Since morphotypes F and G (see Table 2 for their specific characteristics) showed the highest values for most of the investigated descriptors, especially pulp mass per fruit, they may be of particular interest if improvement programs are to be implemented with the purpose of improving pulp yield per fruit. As they showed the greatest seed mass, they may also show higher germination, seedling growth and survival performance (Khan 2004). Thus, they are potential candidates for use as rootstock onto which cultivars can be grafted. Other morphotypes such as C, D, E, H and J (see Table 2 for their specific characteristics) showed intermediate or low fruit and seed size and mass but a high pulp/fruit ratio and sweet pulp, and hence may also be of great interest for improving the pulp/fruit ratio and pulp taste.

The relatively strong relationships between fruit mass and pulp mass suggested by the predictive models indicate that selection for pulp can be based on fruit mass. The variability of the relationship between fruit mass and pulp mass confirms the differences between morphotypes and may have been driven by both ecological and genetic variation. Thus, further use of the obtained models should be made with respect to the morphotypes.

Practical conservation measures should be taken to preserve genetic diversity and maintain multiple specimens. This study indicates that, based on the quantitative descriptors, most of the variation is held within morphotypes. Nevertheless, the between-morphotype variation was found to be relatively high, particularly for fruit mass, pulp mass and seed mass. In addition, the perceived qualitative variation may be genetically determined and should not be neglected. Thus, pending further genetic fingerprinting, one possible strategy for germplasm collection may consist of sampling a moderate number of trees within all the morphotypes. This may ensure that a wide range of variation is captured. Because phenotypic variability results from both ecological and genetic effects, studies of genetic diversity and gene flow among ecological zones are needed to explain all the observed variation prior to effective germplasm collection, “plus tree” selection and propagation in traditional agroforestry systems.

Conclusion

This study has provided preliminary information required for the further improvement of tamarind based on natural individuals. It demonstrates opportunities to select wild “plus trees” for pulp production to meet the needs of traditional and modern markets. The developed predictive models could allow researchers and policy makers, in partnership with local people, to make quantitative assessments of the pulp yield potential of tamarind trees established in traditional agroforestry systems. However, further work on phenotypic and genetic diversity in the species is required, and much larger populations should be examined to build effective management strategies for its genetic resources. If the domestication of tamarind is to be implemented, more comprehensive research extended to other sub-Saharan countries (Burkina Faso, Mali, Togo, Ghana, etc.) will be needed. Since traits related to nutritional value should be taken into account in selection processes, further evaluations of organoleptic traits are also needed. Studies on provenance variation in germination and seedling growth dynamics are also required to identify the best provenances to be used as rootstocks onto which selected cultivars will be grafted. For these further research steps, African research programs can benefit from the experience and results of Asian research teams working on the species.