Introduction

Tree domestication is an evolutionary process of transforming wild individuals into cultivated ones by continuously selecting trees with desirable morphological traits. As a result, users’ preferred traits, which are typically linked to productivity and taste (fruit and kernel size, sweeter pulp), are improved in comparison to the wild ancestors (Mboujda et al. 2022; Purugganan and Fuller 2009). Domestication of indigenous fruit tree species has become a global program with a strong West African focus since its inception in the 1990s (Fandohan et al. 2011; Leakey et al. 2022). Each decade of domestication had a distinct focus. The first decade was devoted to evaluating species potential and developing techniques for improved germplasm production, while the second focused on the characterisation of genetic variation using morphological and molecular techniques and on product commercialisation (Leakey et al. 2012). The nutritional and medicinal characterisation of trees, as well as the evaluation of natural resources and their value to local populations, were the key areas of growth in the third decade. However, testing of potential cultivars, adoption of participatory principles, and selection of plus trees and market-oriented ideotypes remained undervalued (Leakey et al. 2022).

Garcinia kola Heckel, a member of the diverse pantropical family Clusiaceae, is an underutilised African indigenous tree species native to West and Central Africa’s humid tropical forests (PROTA 2022; WFO, 2022). It was appointed a priority species for conservation by the Sub-Saharan Forest Genetic Resources Programme (SAFORGEN) (Sacandé and Pritchard 2004) and World Agroforestry (CIFOR-ICRAF) (Franzel et al., 2008; Franzel and Kindt 2012; Leakey 2014). Colloquially called bitter kola, the species plays a significant role in African ethnomedicine and traditional ceremonies (Manourova et al. 2019). Its seeds are the most valuable product, and they are also among the most traded agroforestry tree commodities in the region (Awono et al. 2016). The kernels are chewed raw to cure gastrointestinal illnesses, suppress inflammation, and treat the common cold, sore throat, and chest pain (Adegboye et al. 2008; Ayepola et al. 2014; Ijomone et al. 2012). Furthermore, the seeds are a popular aphrodisiac stimulant and snack food (Adaramoye 2010; Fondoun and Manga 2000). G. kola is a medium-sized tree that can grow up to 30 m tall and has a trunk diameter of about 100 cm (PROTA 2022). It has a dense, compact crown with upright, slightly drooping branches. The trunk is straight and cylindrical in shape, with smooth bark that is dark brown on the outside and pinkish on the inside. Fruits are globular-shaped berries. When mature, the exocarp is velvety and reddish-yellow, while the pulp is yellow/orange with an apricot odour. Because the pulp is sour and resinous, it is rarely consumed. One fruit contains about 2–4 seeds, which germinate hypogeally (Anegbeh et al. 2006; Eyog-Matig et al. 2007). The seed coat is light brown in colour and darkens as it dries or ages. The kernel is white and has brownish-red branched lines that produce red resinous globules (Onayade et al. 1998). 
The harvesting season lasts from April to October, according to the region and climate zone (Babalola and Agbeja 2010; Dosunmu and Johnson 1995). Despite widespread scientific interest in the therapeutic potential of G. kola seeds and the species’ significance for local communities, the tree’s domestication remains scarcely documented.

The initial phase in the domestication process is to describe morphological tree-to-tree variation, which is essential for future clonal cultivar development (Leakey 2005, pp. 1, 3; Tsobeng et al. 2020). Simple visual and measurement techniques can be used to determine the extent of variation in characteristics that are important for the quality and marketability of target tree products. As a result, tree and product ‘ideotypes’ are developed to meet the needs of consumers. An ideotype is the ideal model phenotype, which can be expected to perform predictably within a defined environment, thereby providing the background for genetic selection (Leakey and Page 2006). Once the ideotype is identified, there are several approaches for selecting the best individuals. If only phenotype is employed in the selection, the individuals are referred to as “plus trees”. If both phenotype and genotype are included, the trees are classified as “elite” (Costes and Gion 2015; Finkeldey and Hattemer 2007).

The objective of this study was to identify plus G. kola trees by assessing their overall phenotypic variance and determining prospective tree ideotypes in Cameroon. The level of domestication of G. kola was examined by comparing the morphological characteristics of geographically distinct populations as well as between wild and cultivated individuals.

Methodology

Study site description

The study was conducted in three regions of Cameroon (Southwest, Central and South), where Garcinia kola occurs naturally and has been reported to be highly important for local communities (Fig. 1).

The first study site, the Southwest region, borders Nigeria, an important trading partner of Cameroon, where bitter kola products are highly prized. The data were collected in the vicinity of Kumba, Lebialem, Mamfé, and Tombel villages, which are considered lowland areas with an average altitude of 325 m.a.s.l. (Fig. 1). Alfisols and ultisols are the predominant soil types (European Commission 2013). The Southwest region belongs to agroecological zone V (humid forest with monomodal rainfall), and its climate is classified as tropical monsoon (Am) according to Köppen-Geiger (Kottek et al. 2006). The average rainfall is around 3170 mm per year, while the average temperature is about 24.6 °C (Climate Data 2022).

The Central region is landlocked and home to the country’s capital city, Yaoundé. Both the Central and South regions belong to agro-ecological zone IV (humid forest with bimodal rainfall) and are dominated by rather hilly landscapes (600–660 m.a.s.l.) (Climate Data 2022). The climate is classified as tropical rainforest (Af) according to Köppen-Geiger (European Commission 2013). The data in the Central region were collected within the areas of Akok, Bot-Makak, Lekie-Assi and Nkenglikok. The average precipitation is about 1540 mm per year, while the mean annual temperature is around 23.2 °C. The predominant soil types are oxisols and ultisols (Climate Data 2022; European Commission 2013).

The South region shares borders with Equatorial Guinea, Gabon and Congo-Brazzaville. The data collection was conducted in the vicinity of Ebolowa, Kye-Ossi, Sangmelima and Zoételé, with altitude ranging from 570 to 770 m.a.s.l. The average annual rainfall is around 1770 mm, with temperatures of about 23.4 °C. Oxisols are considered the prevalent soil type in the South region (Climate Data 2022; European Commission 2013).

Fig. 1

Map of data collection sites in Cameroon, regions and their respective study sites.

Data collection

The study was conducted during the harvesting period of G. kola fruits (June–September) in the years 2016, 2018, and 2019. Altogether, 218 individual trees were measured and described, along with 1025 leaves, 1722 fruits, and 4553 seeds (Table 1). All trees were measured and described based on 18 quantitative and 8 qualitative descriptors adapted from mangosteen (Garcinia mangostana L.) (IPGRI 2003) and baobab (Adansonia digitata L.) (Kehlenbeck et al. 2015). Tree and trunk height were measured by the sine-height method using a laser rangefinder and clinometer. Diameter at breast height (DBH) was taken at a height of 130 cm with a girthing tape, and crown diameter was assessed by the cross method (Bragg 2014). Tree age was estimated by the trees’ “owners”. Where possible, 8–10 mature fruits and 5 leaves were randomly collected per individual tree. The fruits and seeds were weighed using a portable semi-analytical balance (0.01 g precision). Fruits weighing under 50 g and seeds weighing under 2 g were considered immature and excluded from the analyses. Fruit length was measured with callipers, while fruit diameter was taken with a soft tape. Fruit colour and shapes were recorded based on the above-mentioned descriptors and according to the Royal Horticultural Society (RHS) Colour Chart (2001). Seeds were manually extracted and weighed, and seed length and width were measured with callipers. Seed mass per fruit was determined as the sum of the weights of all seeds in the fruit. Additionally, the seed mass ratio was calculated as the proportion of the seed mass to the overall fruit weight. Fruit seed mass, tree height and crown diameter were determined as the most important criteria related to seed production and were thus considered the determining factors for the identification of plus trees. To see whether the growing site is linked to the species’ domestication, the tree stands were categorised as agroforest, homegarden or wild habitat (Table 2).
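The fruit-level quantities described above can be illustrated with a minimal sketch. It assumes that the seed mass ratio is the total seed mass divided by the whole-fruit weight, which is the reading consistent with the percentages reported in the Results; the function and constant names are hypothetical and do not come from the study's own scripts.

```python
# Thresholds quoted in the text: fruits under 50 g and seeds under 2 g
# were treated as immature and excluded.
MIN_FRUIT_G = 50.0
MIN_SEED_G = 2.0


def fruit_seed_summary(fruit_weight_g, seed_weights_g):
    """Return (seed mass per fruit in g, seed mass ratio in %).

    Returns None for an immature fruit. The ratio definition
    (seed mass / whole-fruit weight) is an assumption of this sketch.
    """
    if fruit_weight_g < MIN_FRUIT_G:
        return None
    mature_seeds = [w for w in seed_weights_g if w >= MIN_SEED_G]
    seed_mass_g = sum(mature_seeds)
    ratio_pct = 100.0 * seed_mass_g / fruit_weight_g
    return seed_mass_g, ratio_pct


# Hypothetical fruit: 157 g with three seeds.
example = fruit_seed_summary(157.0, [6.0, 5.5, 3.0])
immature = fruit_seed_summary(40.0, [5.0])
```

In this hypothetical fruit the three seeds sum to 14.5 g, so the ratio is a little over 9% of the fruit weight, which is the same order of magnitude as the averages reported for the dataset.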

Table 1 The number of samples used for morphological evaluations per region and study site
Table 2 Tree growing sites across the regions

Data analysis

PCA was performed in Python 3.11.1, using the pandas 1.5.3 and sklearn 1.2.1 libraries. The tree dendrogram was generated with the pandas 1.5.3 and scipy 1.10.0 libraries. The resulting PCA plot and dendrogram were drawn using the matplotlib 3.6.3 library. The correlation matrix was calculated using Mathematica 13.2.0. The selection of plus trees was performed in Python using the pandas 1.5.3 and numpy 1.24.2 libraries.

To construct the PCA plot, only rows without missing values were retained, which removed 115 of the 3 671 rows in the dataset. After standardising all tree, fruit and seed features to zero mean and unit variance, PCA was performed using the sklearn library in Python. The first two components (PC1, PC2) were then plotted on an (x, y) plane, retaining the study site of each data point. The PCA plot was complemented by information about the construction of the principal components, which, for the sake of clarity, are scaled so that the most prominent feature has a value of 1. Finally, information about the variance of each principal component is given.
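The PCA steps described above can be sketched as follows. This is an illustration only: it uses a plain numpy singular value decomposition in place of the sklearn routine used in the study, the function name and demo data are hypothetical, and the component scaling is taken to mean division by the largest absolute loading.

```python
import numpy as np


def pca_two_components(features):
    """Standardise the features and return the first two principal components.

    Returns (scores, scaled_loadings, explained_variance_ratio).
    """
    x = np.asarray(features, dtype=float)
    x = x[~np.isnan(x).any(axis=1)]            # keep complete rows only
    z = (x - x.mean(axis=0)) / x.std(axis=0)   # zero mean, unit variance
    _, singular_values, vt = np.linalg.svd(z, full_matrices=False)
    components = vt[:2]                        # rows are PC1 and PC2
    scores = z @ components.T                  # coordinates for the (x, y) plot
    explained = singular_values**2 / np.sum(singular_values**2)
    # Scale each component so that its most prominent feature has |value| = 1.
    scaled = components / np.abs(components).max(axis=1, keepdims=True)
    return scores, scaled, explained[:2]


# Hypothetical demo data: 50 rows, 4 features, one row with a missing value.
rng = np.random.default_rng(0)
demo = rng.normal(size=(50, 4))
demo[:, 1] += 0.8 * demo[:, 0]
demo[3, 2] = np.nan
scores, loadings, explained = pca_two_components(demo)
```

The sign of each component is arbitrary in PCA, so the most prominent feature may appear as 1 or as -1 after scaling; flipping the sign of a whole component does not change the interpretation.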

To construct the tree dendrogram, group-wise mean aggregation was performed, first from the seed level to the fruit level and then from the fruit level to the tree level, resulting in a dataset with rows determined by Farmer ID and Tree ID. Afterwards, mean values of all features were calculated for each study site, followed by dendrogram generation.
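A minimal sketch of this two-step aggregation and the subsequent clustering is given below. The column names and the tiny demo table are hypothetical, and both the standardisation of site means and the choice of average linkage are assumptions, since the text does not state which distance or linkage method was used.

```python
import pandas as pd
from scipy.cluster.hierarchy import linkage

FEATURES = ["seed_weight_g", "seed_length_cm"]
TREE_KEYS = ["site", "farmer_id", "tree_id"]

# Hypothetical seed-level records: one row per seed.
seeds = pd.DataFrame(
    [
        ("A", "F1", "T1", 1, 4.0, 2.8),
        ("A", "F1", "T1", 1, 6.0, 3.2),
        ("A", "F1", "T1", 2, 7.0, 3.4),
        ("B", "F2", "T1", 1, 5.0, 3.0),
        ("B", "F2", "T1", 1, 5.0, 3.0),
        ("C", "F3", "T1", 1, 8.0, 3.6),
        ("C", "F3", "T1", 1, 10.0, 4.0),
    ],
    columns=TREE_KEYS + ["fruit_id"] + FEATURES,
)

# Step 1: seeds -> fruits, step 2: fruits -> trees (mean of fruit means).
fruits = seeds.groupby(TREE_KEYS + ["fruit_id"])[FEATURES].mean().reset_index()
trees = fruits.groupby(TREE_KEYS)[FEATURES].mean().reset_index()

# Step 3: trees -> study sites, then hierarchical clustering of the site means.
sites = trees.groupby("site")[FEATURES].mean()
standardised = (sites - sites.mean()) / sites.std(ddof=0)  # assumption
merges = linkage(standardised.to_numpy(), method="average")  # assumption
```

Aggregating in two steps means that each fruit contributes equally to its tree, regardless of how many seeds it contained. The `merges` array can be passed to `scipy.cluster.hierarchy.dendrogram` for plotting.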

To select plus trees, the data were first aggregated to the tree level, as for the dendrogram construction. Subsequently, the traits defining a tree’s quality were selected: fruit seed mass (representing the commercialisation factor), and tree height and crown diameter (representing the harvesting factors). For each trait, a score function was defined: a step function for tree height and a linear function for crown diameter and fruit seed mass. The scores have an intuitive interpretation, increasing with larger fruit seed mass and crown diameter and with lower tree height. The trait scores were combined so that the relative importance of the traits was 70%, 20%, and 10% for fruit seed mass, tree height and crown diameter, respectively.
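The scoring scheme can be sketched as below. The weights follow the caption of Table 4 (70% fruit seed mass, 20% tree height, 10% crown diameter). The height cut-offs and the ranges used for linear scaling are purely hypothetical, because the text does not report them, and the two example trees are invented for illustration.

```python
# Weights as given in the Table 4 caption.
WEIGHTS = {"seed_mass": 0.7, "height": 0.2, "crown": 0.1}


def step_height_score(height_m, cutoffs=(10.0, 15.0)):
    """Step function: shorter trees score higher (cut-offs are hypothetical)."""
    if height_m <= cutoffs[0]:
        return 1.0
    if height_m <= cutoffs[1]:
        return 0.5
    return 0.0


def linear_score(value, low, high):
    """Linear score between 0 and 1, clipped outside the [low, high] range."""
    if high <= low:
        raise ValueError("high must be greater than low")
    return min(1.0, max(0.0, (value - low) / (high - low)))


def plus_tree_score(seed_mass_g, height_m, crown_m,
                    seed_range=(0.0, 30.0), crown_range=(0.0, 15.0)):
    """Weighted sum of the three trait scores (ranges are hypothetical)."""
    return (
        WEIGHTS["seed_mass"] * linear_score(seed_mass_g, *seed_range)
        + WEIGHTS["height"] * step_height_score(height_m)
        + WEIGHTS["crown"] * linear_score(crown_m, *crown_range)
    )


# Two invented trees: a tall tree with heavy seeds and a short tree with lighter seeds.
score_a = plus_tree_score(seed_mass_g=30.0, height_m=18.0, crown_m=10.0)
score_b = plus_tree_score(seed_mass_g=20.0, height_m=9.0, crown_m=12.0)
```

With a 70% weight on fruit seed mass, a tree with heavy seeds can outrank a shorter, broader tree, which is the pattern described for the top-ranked tree in the Results.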

Results

Tree phenotypical characterisation

A thorough morphological analysis was performed first in order to identify potential plus G. kola trees. Trees, leaves, fruits and seeds were measured and described based on 26 descriptors. To maintain clarity, this section focuses on finding the best trees from a domestication perspective; for all morphological results, please see the Supplementary materials.

Fruits and seeds

The most important products of G. kola are its seeds. As a result, when searching for the species’ ideotype, the number of seeds, seed weight, fruit seed mass and fruit seed mass ratio were the essential parameters to consider.

Summarising the morphological results, an average bitter kola fruit measured 6.86 ± 0.98 cm in diameter and 7.94 ± 2.37 cm in length, and weighed 157 ± 69.7 g. Fruit weight was the most variable fruit parameter (44.2%), with a maximum of 515.9 g and a minimum of 50.3 g. Less variation was detected in fruit diameter (14.3%) than in fruit length (30%). An average bitter kola seed was 3.07 ± 0.59 cm long and 1.53 ± 0.33 cm wide, with a weight of 6.01 ± 2.21 g. The heaviest seed in our study weighed 19.9 g. The number of seeds per fruit typically varied from 2 to 4; five seeds may occasionally occur as an anomaly, and a single seed could be a sign of a tree in poor condition. On average, one fruit contained 2.52 ± 1.05 seeds, with a total seed mass of 14.4 ± 8.56 g and a seed mass ratio of 9.49 ± 4.88%. The greatest variation, almost 60%, was found in fruit seed mass, followed by fruit seed mass ratio with 51%. The least variable characteristics were seed length and width, at about 20%, while seed weight and the number of seeds per fruit varied by about 40%. Detailed information can be found in Tables S1 and S2.
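The percentages of variation quoted above are coefficients of variation, that is, the standard deviation expressed as a percentage of the mean. A minimal sketch follows; the function names are hypothetical, and the use of the sample standard deviation for raw data is an assumption.

```python
import statistics


def cv_from_summary(mean, sd):
    """Coefficient of variation (%) from a reported mean and standard deviation."""
    return 100.0 * sd / mean


def coefficient_of_variation(values):
    """Coefficient of variation (%) from raw measurements (sample SD assumed)."""
    return cv_from_summary(statistics.fmean(values), statistics.stdev(values))


# Fruit diameter as reported in the text: 6.86 ± 0.98 cm.
cv_diameter = cv_from_summary(6.86, 0.98)
# Fruit length as reported in the text: 7.94 ± 2.37 cm.
cv_length = cv_from_summary(7.94, 2.37)
```

For fruit diameter this gives about 14.3%, and for fruit length just under 30%, in line with the values reported. Small differences for other traits can arise because the published means and standard deviations are rounded.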

More than half of the bitter kola seeds were of oblong-elongated shape (57.6%), while the second most common shape was oblong (35%) (Fig. 2). The other detected shapes of only minor occurrence were ellipsoid, globose, ovate, irregular and double-seeds. The most common fruit shapes were spherical and elliptical (31.5 and 28%, respectively), closely followed by flattened (23.6%). The other identified shapes were rhomboidal, oblate, kidney-shaped and irregular, in decreasing order of prevalence (Fig. 3). The distribution of shapes differed throughout the regions (Detailed information in Tables S3 and S4). Ripe fruits were primarily recognised in orange and red colours, but also in yellow tones in rare cases (Fig. 4). The seed coat was typically light brown, brown orange and dark brown, sometimes with purple overtones (Fig. 5).

Fig. 2

Morphological diversity of Garcinia kola seeds

Fig. 3

Morphological diversity of Garcinia kola fruits

Fig. 4

Variability in Garcinia kola fruit colours, from medium orange to dark red and medium yellow. The colours are based on the Royal Horticultural Society (RHS) Colour Chart, 2001

Fig. 5

Variability in Garcinia kola seed colours, from orange brown to medium brown and dark brown purple. The colours are based on the Royal Horticultural Society (RHS) Colour Chart, 2001

Trees and leaves

The criteria for direct tree production have already been described in the “Fruits and seeds” section. However, to obtain a full overview, indirect factors should not be overlooked, such as traits of tree habit that are important for harvesting or the ability to assimilate carbon through photosynthesis.

An average bitter kola tree was 13.9 ± 4.30 m in height, with first branching starting at 4.94 ± 3.57 m, and had a DBH of 51.1 ± 33.5 cm and a crown diameter of 9.73 ± 3.21 m. Based on the farmers’ estimation, the age of the measured bitter kola trees ranged from 7 to 120 years, with an average of 37.9 ± 21.0 years. The coefficient of variation shows that the tree parameters were highly variable, especially trunk height (some of the trees started branching almost from the ground), DBH and tree age (72%, 66% and 56%, respectively). In contrast, crown diameter and tree height showed moderate variability of around 30%. An average bitter kola leaf was 11.4 ± 3.20 cm long and 4.63 ± 1.44 cm wide, with a petiole 12.2 ± 4.15 mm long and 2.20 ± 0.83 mm wide. Based on the coefficient of variation, the leaf parameters exhibited moderate levels of variation of about 30%.

The majority of the observed trees (82%) were found in good growing condition (“healthy, cropping well”) (Table S7). More than half of the described trees had a pyramidal crown shape (54.2%), followed by oblong, elliptical and spherical types (Fig. 6). In most cases, the tree crown was rated as “good”, followed by “tolerable” and “poor” (44.5, 24.2, and 15.9%, respectively). The branching pattern was dominated by the irregular type (74.9%), followed by the horizontal and semi-erect types. The trunk was mainly straight (38.4%), followed by trunks forking above 6 m (22.5%), forking from the base (18.9%) and forking below 6 m (17.7%). The most prevalent leaf blade shape was oblong (55.1%), followed by elliptic and lanceolate types (21.5 and 14.2%, respectively) (Table S8). Other shapes were identified as triangular, irregular and obovate (Fig. 7).

Fig. 6

Shapes of G. kola trees based on descriptor of Garcinia mangostana (IPGRI 2003) and author’s drawing. A crown shape; B trunk shape

Fig. 7

Morphological diversity of Garcinia kola leaves

Morphological correlation and plus trees selection

The majority of the phenotypical characteristics with strong correlations (r > 0.55) were related to fruit and seed traits (Table 3). Among all the parameters, seed length and seed weight had the strongest positive correlation (r = 0.785). Focusing on prominent domestication characteristics, fruit seed mass was found to be highly correlated with number of seeds, seed length, seed weight, fruit seed mass ratio, and fruit weight (r = 0.749, 0.684, 0.681, 0.644, and 0.561, respectively). Fruit length was revealed to be associated with fruit diameter and weight (r = 0.668 and 0.604, respectively). Among the other domestication-related attributes, tree height was positively correlated with tree age and trunk height (r = 0.571 and 0.553, respectively), while crown diameter showed no positive correlation above the threshold. Furthermore, a very strong positive correlation of r = 0.704 was found between DBH and fruit length.
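Screening a correlation matrix for pairs above the 0.55 threshold can be sketched as follows. The study computed the matrix in Mathematica; this numpy version is only an illustration, the trait values are invented, and the comparison on the absolute value of r is an assumption.

```python
import numpy as np


def strong_pairs(data, names, threshold=0.55):
    """List trait pairs whose Pearson correlation exceeds the threshold in absolute value."""
    r = np.corrcoef(np.asarray(data, dtype=float), rowvar=False)
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(r[i, j]) > threshold:
                pairs.append((names[i], names[j], float(r[i, j])))
    return pairs


# Invented measurements for three traits (columns) on five samples (rows).
names = ["seed_length", "seed_weight", "petiole_width"]
data = np.column_stack([
    [1, 2, 3, 4, 5],
    [2, 4, 5, 4, 5],
    [5, 1, 4, 2, 3],
])
pairs = strong_pairs(data, names)
```

Only the first two invented traits pass the threshold here; the third is weakly and negatively related to both.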

Table 3 Pearson’s correlation matrix of Garcinia kola morphological characteristics

The top ten plus trees were determined based on the pre-selected domestication traits: fruit seed mass, tree height, and crown diameter (Table 4). The best tree, from Ebolowa (South region), had by far the largest fruit seed mass of 31 g and ranked the highest despite its greater height (17.5 m) and narrower crown (9.80 m) in comparison to the rest of the plus trees. The final rating was very close between the second and third-best trees. Although the second one had a lower fruit seed mass (27.6 g), it displayed better tree parameters, including a modest tree height (10.6 m) and a broad crown (12.9 m). Only two of the top ten trees were from the Southwest region and one from the Central region; populations from the South accounted for the majority of the plus trees. Most of the top ten trees were found in agroforestry systems. Just one individual, the second highest rated one, originated from a wild forest stand, and one tree was found in a homegarden.

Looking closely at fruit seed mass and the number of seeds as the most important production traits, higher values did not appear to be associated with a particular fruit or tree crown shape, which are morphological features easily detectable by local farmers (Table 5). However, spherical, rhomboidal and ellipsoid fruit shapes displayed higher fruit seed mass and a larger number of seeds compared to other shapes (15.1 ± 8.58 g and 2.71 ± 1.39; 15.2 ± 9.02 g and 2.54 ± 1.07; 14.8 ± 8.68 g and 2.55 ± 1.05, respectively). In contrast, kidney-shaped and irregular fruits seemed less likely to exceed average values in these traits. Even though flattened fruits possessed a high number of seeds (2.62 ± 1.03), the kernels were likely of smaller size (fruit seed mass ≈ 13.5 ± 8.08 g). No particular tree crown shape was found to be linked to significantly larger fruit seed mass or a larger number of seeds. However, the oblong crown scored the highest in both parameters (15.6 ± 9.44 g and 2.65 ± 1.31).

Table 4 Top ten plus trees selected based on their fruit seed mass (70%), tree height (20%) and crown diameter (10%) parameters and arranged according to the final score
Table 5 Fruit seed mass and number of seeds per fruit linked to fruit and crown shapes

Morphological comparison within regions and study sites

The highest number of seeds per fruit was found in the South region (2.74 ± 1.00), which may correlate with their relatively lower weight (5.61 ± 2.18 g). However, this region also ranked the highest in fruit seed mass, the most important ideotype parameter (15.4 ± 9.39 g). A good result in fruit seed mass was also obtained by the Southwest region (14.7 ± 7.71 g), where the heaviest seeds were found (6.09 ± 1.98 g). Apart from seed weight, the Central region scored the lowest values in all fruit/seed parameters (Tables S1, S2). The fruit seed mass ratio was calculated by comparing seed mass to overall fruit weight. In this characteristic, the Southwest reached the highest value of 11.1 ± 5.15%, followed by the South (9.17 ± 4.42%) and the Central region (7.23 ± 3.98%). This indicates that, despite the smaller size of the fruits in the Southwest, a substantial seed yield may still be expected. In the Southwest, spherical and flattened fruit shapes were dominant (33.1 and 28%, respectively), whereas elliptical and flattened shapes were prevalent in the Central region (39.3 and 27.9%, respectively), and spherical and elliptical fruits were the most common in the South region (34.9 and 30.2%, respectively) (Table S3). The oldest trees came from the Central region (51.0 ± 21.6 years), whereas the youngest were from the Southwest (28.5 ± 16.7 years) (Table S5). No major differences in leaf parameters were found between the regions (Table S6).

Dendrogram analysis identified five different clusters based on quantitative morphological information of trees, fruits and seeds, calculated on a tree level (Fig. 8). Cluster 1 comprised all Southwest populations (Kumba, Tombel, Lebialem, Mamfé), which were the most similar to each other and the most distant from the rest of the study sites. Cluster 2 was formed by two study sites (Zoételé and Sangmelima) from the South region. These populations were very similar to each other yet distant from the rest of the Central and South populations. In comparison, the Cluster 3 study sites (Ebolowa, Kye-Ossi) of the South region were much more closely related to the Central region. Clusters 4 and 5 contained solely populations of the Central region. While Lekie-Assi and Nkenglikok were the most similar study sites in the Central region (Cluster 5), Cluster 4 consisted of only one study site, Akok. Populations of the South and Central regions were in closer proximity to each other than to the Southwest region, which was clearly defined as the most distant cluster.

Principal component analysis (PCA) revealed that all the studied populations overlapped to some extent when all of the morphological criteria for trees, fruits, and seeds were considered (Fig. 9). Nevertheless, Cluster 1 was the most compact compared to the rest of the study sites. Only the Lebialem population (Southwest region) seemed to be somewhat more scattered and distant. This finding was supported by the preceding dendrogram clustering analysis. Clusters 2, 3, 4 and 5 overlapped considerably; only the Akok study site (Central region, Cluster 4) seemed to be more dispersed, reaching the values of Lebialem.

Fig. 8

The dendrogram presents quantitative morphological information of Garcinia kola trees, fruits and seeds, calculated on a tree level. The study sites are grouped into clusters based on the similarity of their morphological features. Cluster 1 (blue)—Southwest region: Mamfé, Lebialem, Kumba, Tombel; Cluster 2 (orange)—South region: Sangmelima, Zoételé; Cluster 3 (red)—South region: Ebolowa, Kye-Ossi; Cluster 4 (black)—Central region: Akok; Cluster 5 (green)—Central region: Bot-Makak, Lekie-Assi, Nkenglikok; Southwest region—blue, South region—orange, Central region—green

Fig. 9

Principal component analysis of morphological parameters of Garcinia kola trees, fruits and seeds. Cluster 1 (blue)—Southwest region: Mamfé, Lebialem, Kumba, Tombel; Cluster 2 (orange)—South region: Sangmelima, Zoételé; Cluster 3 (red)—South region: Ebolowa, Kye-Ossi; Cluster 4 (black)—Central region: Akok; Cluster 5 (green)—Central region: Bot-Makak, Lekie-Assi, Nkenglikok; Southwest region—blue, South region—orange, Central region—green

Discussion

Identification of plus trees

Some recent (Mboujda et al. 2022; Phurailatpam et al. 2022; Solís-Guillén et al. 2017; Tsobeng et al. 2020; Yakubu et al. 2023) and older studies (Atangana et al. 2001, 2011; Fandohan et al. 2011; Leakey 2005; Leakey et al. 2000, 2004; Onyekwelu et al. 2011) focused on tree morphological ideotypes, tree-to-tree variation and plus tree identification to maximise the commercial production of underutilised perennial species. Yet, to our knowledge, no study has examined G. kola morphological variability in order to identify prospective “plus trees”.

Identification of an ideotype is much easier in G. kola than in Irvingia gabonensis and Sclerocarya birrea, other important African fruit tree species. Based on the preferred plant part for daily use, two ideotypes, fruit-based and seed-based, were determined in these two species (Atangana et al. 2001; Leakey 2005, p. 3; Leakey et al. 2005a, b). G. kola’s situation is more similar to that of Pachylobus edulis (syn. Dacryodes edulis, hereafter D. edulis), another essential Cameroonian fruit species which was also selected as a focus of the domestication program by CIFOR-ICRAF (Franzel and Kindt 2012; Tchoundjeu et al. 2006). Because of a strong preference for the use of fruit pulp over the other possible uses, only one ideotype was identified in D. edulis (Mboujda et al. 2022). Analogously, the most important products of G. kola are clearly its seeds (Manourova et al. 2023; Yogom et al. 2020), with bark and roots also being used but to a lesser extent. Determination of the G. kola seed ideotype should therefore be the focus of domestication. Fortunately, the fruit/seed correlation results show a link between high fruit seed mass and large fruits, which is a useful indicator for local farmers, who can thus assess tree production visually. The aforementioned research on I. gabonensis and S. birrea revealed the opposite trend. We believe that selecting trees with above-average fruit seed mass (more than 14.5 g) can result in significant improvements in the quality and uniformity of marketable seeds (Leakey et al. 2008). Therefore, this study focused on fruit seed mass as the most important production factor, supplemented by other tree parameters (tree height, crown diameter) that could make fruit harvesting easier for farmers.

The top ten plus trees were identified based on their fruit seed mass (70%), tree height (20%) and crown diameter (10%). Ideally, we search for a tree with high fruit seed mass (commercialisation factor), small or average tree height (harvesting factor) and a large crown (harvesting factor). Eight of the ten best trees were found in agroforestry systems, one in a wild stand, and one in a homegarden. If the domestication process were advanced, most of the best trees would have originated in homegardens, where they would have been deliberately selected and cultivated by their owners (Leakey 2012, 2019). The fact that the majority of our plus trees come from other agroforestry systems (such as cocoa and oil-palm agroforestry stands), and that the second highest-ranked one was found in a forest, suggests that G. kola domestication has not yet progressed sufficiently to show phenotypic differences between wild and cultivated individuals. This is in accord with our previous genetic diversity findings (Maňourová et al. 2023). Moreover, no specific fruit or crown shape was found to be associated with a high fruit seed mass score. In comparison, D. edulis has already shown significant morphological differences between wild and cultivated trees (Mboujda et al. 2022). The fruits of D. edulis, however, appear to be among the most popular fruit tree products in Cameroon, and the trees are indisputably more common in farmers’ compounds than G. kola (Leakey 2014).

The majority of the highest-rated trees (7/10) were discovered in the South region, with two trees located in the Southwest and one in the Central region. This is in line with a prior study that compared the morphological and genetic diversity in the South and Central areas. The results suggested the South populations as suitable plus trees for development of future breeding strategies (Maňourová et al. 2023).

Morphological variability on the level of populations

In our study, five population clusters were identified based on the quantitative morphological information of trees, fruits and seeds. Additionally, 18 quantitative and 8 qualitative descriptors were used to characterise the species’ phenotype and determine its diversity among the studied populations in three different regions of Cameroon. The results of our study suggest that the phenotypical variation is greater within populations than between them, similar to the findings of Atangana et al. (2001, 2011), Leakey (2005, p. 1) and Maňourová et al. (2023).

Fruit seed mass, the most important domestication parameter, was the highest in the South region (15.4 ± 9.39 g), followed by the Southwest (14.7 ± 7.71 g) and Central regions (12.3 ± 8.38 g). Major differences occurred among the study sites. The highest scores were reached by Zoételé (South) and Lekie-Assi (Central) (20.7 ± 8.19 and 19.9 ± 10.4 g, respectively). Surprisingly, in the fruit seed mass ratio, which takes fruit weight into account, the Southwest region surpassed the others, reaching 11.1 ± 5.15%, followed by the South with 9.17 ± 4.42% and the Central region with 7.23 ± 3.98%. Kumba (Southwest) had the highest score (13.9 ± 6.32%). This suggests that fruits in the Southwest region generally had less pulp, which is typically not consumed, but nevertheless produced a good amount of fruit seed mass. Hence, there could be two possible paths for clonal cultivar development: one aiming at bigger fruits with many or larger seeds (as in the South region), and the other at smaller fruits with an equivalent seed mass but less pulp (as in the Southwest region).

Other investigations of G. kola morphological variation were conducted in Benin (Dadjo et al. 2018; Dah-Nouvlessounon et al. 2016). Comparing basic tree parameters, trees from Cameroon had greater DBH, which was mainly influenced by values in the South region, as well as larger crown diameter and greater trunk height. In contrast to the Benin research, the fruits were larger and heavier, with a comparable number of seeds of slightly lower weight (Table 6). Compared to our study, more tree parameters showed high levels of correlation in the Benin research, but the fruit and seed correlations were in agreement. The general difference in morphological traits might result from different ecological conditions between the countries, the time of data collection and the markedly different sample sizes (43 trees in Benin vs. 218 trees in Cameroon), which also explains the variable standard deviation span.

Table 6 Comparison of morphological characteristics between our study (Southwest, Central and South regions in Cameroon, and their mean value) and research from Benin

The way forward

Agroforestry trees are now in their fourth decade of domestication. The third decade’s key growth areas were phytochemical and genetic research on indigenous food and medicinal species, ethnobotany, and the state of natural resources. In contrast, areas including priority setting, elite tree selection, and ideotype determination were found to be underexplored (Leakey et al. 2022). The hope is that fourth-decade research will be able to combine centralised and decentralised approaches to holistically investigate intraspecific tree-to-tree variation at different sites and identify traits suitable for market-oriented ideotype/elite tree selection.

This study of phenotypic tree-to-tree variation in different G. kola populations demonstrates the possibility of identifying individual trees with fruit/seed characteristics high above the species average. However, there are important knowledge gaps that need to be addressed before progressing in the domestication of the species.

One of the missing links is the relationship between tree morphological variability and the preferences and perceptions of local communities. Do the botanical descriptors support the traditional G. kola morphotype classification? Two investigations have already been performed on the use of trees, management practices, and the commercialisation of G. kola products in relation to different ethnic groups and geographical areas in Cameroon (Manourova et al. 2023; Yogom et al. 2020). However, neither investigation provided a deeper understanding of the species’ folk taxonomy, which is essential for selecting the most locally attractive morphotypes for domestication and for helping to conserve the species in situ (Imorou et al. 2022; Leakey et al. 2022; Phurailatpam et al. 2022; Rimlinger et al. 2021).

Understanding how environmental and genetic factors, as well as their interactions, influence the tree phenotype is another key consideration for selecting the best morphotypes (Costes and Gion 2015; Tsobeng et al. 2020). According to a recent finding, the growing site factor had a minor influence on the genetic makeup of G. kola populations (Maňourová et al. 2023). This was previously hypothesised in research on other African species, such as S. birrea and I. gabonensis (Leakey 2005, p. 3; Leakey et al. 2000). However, there could be other types of environment-phenotype links. As evidenced in Tamarindus indica, fruit pulp taste can be linked to different habitat types (Fandohan et al. 2011).

Data on the marketing chains of G. kola seeds have to be completed in order to fulfil the commercial potential of the species. Even though there have been only a few investigations on the tree’s economic importance, some of them promised high market opportunities, especially for the seeds (Awono et al. 2016; Ndoye 1995). The selling price of G. kola seeds in Cameroon tends to vary greatly depending on location and is influenced by seasonality and collectors’ post-harvest practices, ranging from 10 to 48 USD per 5-litre bucket of seeds (Manourova et al. 2023).

What are the seed trait preferences of the consumers? Are seeds from one location considered superior to those from other locations in Cameroon or neighbouring countries? What is the flavour that consumers seek? Is it better to look for sweeter or more bitter cultivars? All of these questions have to be considered when looking for the ideal G. kola seed ideotype and selecting the plus trees. These trees with outstanding traits may later serve as the first clonal cultivars, which will be distributed further through vegetative propagation and serve as the foundation for more advanced breeding programs (Leakey and Page 2006).

Conclusion

This study of G. kola phenotypic tree-to-tree variation demonstrated the possibility of identifying plus trees based on species ideotype criteria relevant to its domestication. Most of these trees were discovered in agroforestry systems, with only one coming from a wild stand and a homegarden. The high fruit seed mass score was not associated with any specific fruit or tree crown shape. These findings suggest that G. kola domestication is not yet advanced enough to exhibit phenotypic differences between wild and cultivated individuals. The results of 18 quantitative and 8 qualitative descriptors, dendrogram, and principal component analysis indicated that phenotypic variation within populations is greater than variation between them. The fruit/seed correlation demonstrated a link between high seed mass and large fruits. This is a good indicator for farmers, who can easily assess their trees visually. Fruit seed mass was found to be a highly variable parameter (CV = 60%) with an average of 14.4 ± 8.56 g. We recommend selecting trees with above-average fruit seed mass, as this can lead to significant improvements in the quality and uniformity of marketable seeds. There could be two approaches to ideotype-based clonal cultivar development. The first aspires to produce larger fruits with more and/or larger seeds (fruit seed mass). The second focuses on selecting trees with smaller fruits with higher seed mass (fruit seed mass ratio), given that the fruit pulp is not commonly consumed.
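As a brief worked illustration of the figures above, the coefficient of variation can be recomputed from the reported mean and standard deviation, and the above-average selection rule can be applied to a set of per-tree means. This is only a sketch: the function names and the per-tree values are hypothetical, and the overall mean is used as the threshold purely as an assumption.

```python
REPORTED_MEAN_G = 14.4  # mean fruit seed mass reported in this study
REPORTED_SD_G = 8.56    # standard deviation reported in this study


def coefficient_of_variation(mean: float, sd: float) -> float:
    """Return the coefficient of variation as a percentage of the mean."""
    if mean == 0:
        raise ValueError("mean must be non-zero")
    return 100.0 * sd / abs(mean)


def above_average(tree_means: dict[str, float], threshold: float = REPORTED_MEAN_G) -> list[str]:
    """Return the identifiers of trees whose mean fruit seed mass exceeds the threshold."""
    return sorted(tree for tree, mass in tree_means.items() if mass > threshold)


cv = coefficient_of_variation(REPORTED_MEAN_G, REPORTED_SD_G)
print(f"CV = {cv:.1f}%")  # about 59%, consistent with the rounded 60% reported

# Hypothetical per-tree means (g), not data from this study.
hypothetical_trees = {"tree_A": 9.8, "tree_B": 21.3, "tree_C": 14.4, "tree_D": 17.0}
print(above_average(hypothetical_trees))
```

A stricter threshold, for example the mean plus one standard deviation, could be passed as `threshold` if only clearly outstanding trees are to be shortlisted.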

A few missing links must be addressed first to progress in the domestication of G. kola. More information on tree morphological variability in relation to local farmers’ preferences and perceptions is required. We need to learn more about how environmental and genetic factors, as well as their interactions, influence the tree phenotypes. To fulfil the species’ commercial potential, up-to-date data on G. kola seed marketing chains are necessary. To further identify the most suitable market-oriented elite trees, holistic investigations combining centralised and decentralised scientific approaches, without excluding local communities, should be encouraged.