Introduction

Hancornia speciosa Gomes (Apocynaceae), “mangaba tree” or “mangabeira,” is a native fruit species occurring in all Brazilian regions, but with greater incidence in areas covered by the Cerrado biome, Coastal Tablelands and coastal lowlands of the Northeast (Reflora 2023; Pereira et al. 2016). Its fruits, popularly known as “mangaba” (from Tupi Guarani “ma gawa”, “good thing to eat”) have high nutritional value and chemical and physical characteristics that make them suitable for human consumption (Nascimento et al. 2014).

Monachino (1945) described six botanical varieties for the mangaba tree: H. speciosa var. cuyabensis, H. speciosa var. gardneri, H. speciosa var. lundii, H. speciosa var. maximiliani, H. speciosa var. pubescens, and H. speciosa var. speciosa. Four of them are represented in Fig. 1. More recently, the species list of the Flora of Brazil 2020 project recognized only two varieties: H. speciosa var. pubescens, as suggested by Monachino (1945), and H. speciosa (which includes all other forms) (Reflora 2023). Other research based on genetic and morphological markers supports the classification of the plants into two botanical varieties, maintaining the status of H. speciosa var. speciosa and grouping H. speciosa var. gardneri, H. speciosa var. pubescens, and H. speciosa var. cuyabensis as a single botanical variety (Terto et al. 2022; Morais et al. 2022). H. speciosa var. maximiliani and H. speciosa var. lundii have not been reported in recent research. However, following Monachino’s description, H. speciosa var. maximiliani would be joined with H. speciosa var. speciosa and H. speciosa var. lundii would be joined with the other varieties (Morais et al. 2022).

Fig. 1

Details of the morphology of representatives of the four botanical varieties of Hancornia speciosa present in the germplasm collection of Federal University of Goiás. A: H. speciosa var. cuyabensis. B: H. speciosa var. gardneri. C: H. speciosa var. pubescens. D: H. speciosa var. speciosa

Because of their organoleptic characteristics when ripe, the fruits of H. speciosa are appreciated in natura (Pereira et al. 2016) and used as an ingredient in the production of juices, ice cream, pulps, popsicles, puddings, compotes, liqueurs, wines, vinegars, and syrups (Costa et al. 2011; Almeida et al. 2021a; Oliveira and Aloufa 2021; Siqueira et al. 2021; Zitha et al. 2021). Mangaba is considered an important source of vitamin C, fiber, folic acid, potassium, iron, zinc, carotenoids and phenolic compounds (Arruda et al. 2022), and its seeds have high levels of oleic acid (Silva and Jorge 2020). Its various parts are also often used in traditional medicine to treat symptoms and diseases (Santos et al. 2013; Coutrim and Souza 2018), and the efficacy of some of these applications has been corroborated by pharmacological research (Torres-Rêgo et al. 2016; Penido et al. 2017; D'Abadia et al. 2020).

According to the Instituto Brasileiro de Geografia e Estatística (IBGE), the national production of mangaba in 2021 reached 2173 tons (IBGE 2023). Despite its potential for exploitation, especially in the food and pharmaceutical industries, the production of mangaba results from predatory extractivism (Fonseca et al. 2018). This is a predominantly female job, carried out using rudimentary tools by family groups or groups of neighbors known as “mangaba collectors” who live in coastal areas (Neves and Liberato 2023). Furthermore, populations of this fruit species face drastic reductions as a result of the advancement of anthropogenic activities in their areas of natural occurrence (Gilberto 2015).

Thus, the adoption of strategies that guarantee the proper conservation and use of mangaba trees has become imperative. Knowledge of the phenology of this species can provide essential information about the dynamics of its reproductive patterns over time. In addition, efforts to select superior genotypes with high yield and fruit quality for use in breeding programs are timely. Research revealed genetic variability in the mangaba tree germplasm collection at the Federal University of Goiás (Goiânia, Goiás state, Brazil) (Ganga et al. 2009; Almeida et al. 2019), qualifying it for use as a base population in a pre-breeding program. However, strategies for selecting superior plants are still rarely adopted (Almeida et al. 2021b), particularly those emphasizing production and fruit quality, which are the main attractions for agribusiness and consumers.

In this context, our study aimed to investigate reproductive patterns, estimate genetic variability, and identify superior genotypes of H. speciosa for traits of agronomic importance among accessions of the germplasm collection of the Federal University of Goiás.

Material and methods

Data collection

Experimental data were obtained from November 2020 to November 2021 in the H. speciosa germplasm collection of the School of Agronomy of the Federal University of Goiás, located at Goiânia, Goiás, Brazil (lat 16° 35′ 39″ S, long 49° 17′ 26″ W, altitude 720 m above sea level). According to the Köppen (1900) classification, the local climate is Aw tropical, with a rainy season from October to April, and a dry season between May and September. The average annual rainfall is 1481 mm, the average annual temperature is 22.1 °C (Lobato 2016) and the soil is categorized as a dystrophic dark red latosol.

The germplasm collection was established in the field in December 2005 with samples (seeds from fruits) from natural subpopulations (populations) in the Brazilian states of Bahia (564,760.429 km2), Tocantins (277,423.627 km2), Minas Gerais (586,513.983 km2), Goiás (340,242.859 km2), Mato Grosso (903,208.361 km2), and Mato Grosso do Sul (357,142.082 km2), which represent almost the entire area of distribution of the species in the Brazilian Cerrado. The mother trees were found in native landscapes, in fragments of vegetation, and along roadsides. Plants were classified as belonging to the same population when they were at most 30 km apart (Ganga et al. 2009). The locations of the twenty-nine populations and four botanical varieties sampled are mapped in Fig. 2.

Fig. 2

Sites of the populations and botanical varieties represented in the germplasm collection of the Federal University of Goiás, Goiânia, Goiás state, Brazil

In the experimental area, a completely randomized block design with four replications was adopted, with one plant per plot spaced 6 m × 5 m (Ganga et al. 2009). At the time of data collection, 143 individual accessions remained in the germplasm collection, structured into 55 open-pollinated progenies originating from the sampled mother plants, representing 29 populations and four of the six botanical varieties according to the classification of Monachino (1945), namely: H. speciosa var. cuyabensis (24 accessions), H. speciosa var. gardneri (89 accessions), H. speciosa var. pubescens (19 accessions), and H. speciosa var. speciosa (11 accessions). Details of the morphology of representatives of these four botanical varieties are presented in Fig. 1.

Phenological data

We carried out monthly evaluations between November 2020 and October 2021 (totaling 12 evaluations), considering the following phenological events: flowering (appearance of flower buds, anthesis, or senescent flowers) and fruiting (mature or immature fruits attached to the plant). The procedure was based on the methodology proposed by Ribeiro and Castro (1986), with modifications. We used an arbitrary scale of points ranging from zero to 10, assigned to the trees according to the approximate number of fruits or flowers as follows: “0 points”: absence of flowers and/or fruits; “1 point”: 1 to 10 flowers and/or fruits; “2 points”: 11 to 40 flowers and/or fruits; “3 points”: 41 to 60 flowers and/or fruits; “4 points”: 61 to 80 flowers and/or fruits; “5 points”: 81 to 120 flowers and/or fruits; “6 points”: 121 to 140 flowers and/or fruits; “7 points”: 141 to 160 flowers and/or fruits; “8 points”: 161 to 180 flowers and/or fruits; “9 points”: 181 to 200 flowers and/or fruits; “10 points”: more than 200 flowers and/or fruits. The number of flowers and fruits in each category was based on the climax of the phenophases previously observed in the year 2014.
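The scoring rule can be sketched in a few lines of Python. This is only an illustration, not the authors' procedure: the function name `phenology_score` is hypothetical, and the sketch assumes that the 5-point class spans 81 to 120 flowers or fruits, which is the only range left uncovered by the other classes of the scale.

```python
# Upper bound of each class on the 0-10 scale (index = score).
# Assumption: the 5-point class covers 81-120 flowers or fruits, the only
# range not covered by the other classes listed in the text.
UPPER_BOUNDS = [0, 10, 40, 60, 80, 120, 140, 160, 180, 200]


def phenology_score(count: int) -> int:
    """Return the 0-10 score for an approximate count of flowers or fruits."""
    if count < 0:
        raise ValueError("count must be non-negative")
    for score, upper in enumerate(UPPER_BOUNDS):
        if count <= upper:
            return score
    return 10  # more than 200 flowers or fruits


if __name__ == "__main__":
    for n in (0, 7, 55, 130, 350):
        print(n, "->", phenology_score(n))
```

A lookup table of upper bounds keeps the class limits in one place, so the scale can be adjusted without touching the logic.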

Climatic data were obtained at the Campbel Automatic Station of the School of Agronomy of the Federal University of Goiás, located at Goiânia, Goiás, Brazil (lat 16° 55′ S, long 49° 27′ W, altitude 730 m above sea level). Average temperature (°C) and average rainfall (mm) were recorded daily between November 2020 and October 2021. No data are available between April 23, 2021 and April 30, 2021 because of maintenance and calibration activities carried out at the meteorological station during this period.

Monthly averages of points assigned to trees were obtained for each botanical variety, considering 12 uninterrupted months. To evaluate the simultaneity of the appearance of flowers or fruits, we adopted the synchrony index proposed by Morellato et al. (1990), which establishes three categories: asynchrony (less than 20% of the individuals manifest the phenophase), low synchrony (20 to 60% of the plants are in the phenophase), and high synchrony (more than 60% of the trees are in the phenophase). Here, a tree was considered to be in a phenophase in a given month when it received a non-zero score for flowering or fruiting.
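As a minimal sketch of this classification, the Python function below takes the scores of all trees of one variety in one month and returns the percentage of trees in the phenophase together with the synchrony category. The function name and the example scores are hypothetical; only the thresholds (20% and 60%) and the non-zero rule come from the text.

```python
def synchrony_category(scores):
    """Classify one month's synchrony from per-tree scores (0-10 scale).

    A tree counts as being in the phenophase when its score is non-zero.
    Returns (percentage of trees in the phenophase, category name).
    """
    if not scores:
        raise ValueError("at least one tree is required")
    active = sum(1 for s in scores if s > 0)
    percent = 100.0 * active / len(scores)
    if percent < 20:
        category = "asynchrony"
    elif percent <= 60:
        category = "low synchrony"
    else:
        category = "high synchrony"
    return percent, category


if __name__ == "__main__":
    # Hypothetical month: 22 of 24 trees carry at least one flower.
    example = [5] * 22 + [0] * 2
    print(synchrony_category(example))
```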

Dissimilarity between and within botanical varieties and its correlation with geographic distance

To determine the morphological dissimilarity among and within botanical varieties of H. speciosa, we performed phenotypic characterization considering 20 qualitative descriptors as follows: (i) whole trees: crown shape, trunk surface, branching pattern, tree growth habit, number of bifurcations at the height of the ramification, and branch pubescence; (ii) leaves: leaf shape, leaf base shape, leaf apex shape, leaf pubescence, number of leaves per node, phyllotaxis, and leaf texture; (iii) flowers: pedicel pubescence, calyx pubescence, corolla pubescence, and floral tube pubescence; (iv) fruit: fruit shape, color of fruit skin, and arrangement of fruits. Illustrations representing all shape-related descriptors used in our study can be consulted at Descriptors for Mangaba (Hancornia speciosa Gomes) as guides (Silva Junior et al. 2018).

Furthermore, eight quantitative traits were converted into ordinal qualitative variables by dividing the interval between the highest and lowest values assumed by each trait into three discrete classes. The quantitative descriptors analyzed qualitatively were: (i) height of the first branch, (ii) crown diameter, (iii) leaf length, (iv) leaf width, (v) petiole length, (vi) petiole thickness, (vii) internode length, and (viii) fruit peduncle length. This conversion was necessary for the adoption of Harrison's model (1968) provided by Oliveira et al. (2009), which deals only with vectors of qualitative variables (\(y^{\prime}=({y}_{1}, {y}_{2}, \dots , {y}_{l})\)), in which the i-th component assumes \({l}_{i}\) levels or classes. The levels of each qualitative descriptor were converted into binary variables, and the trees (\({o}_{i}\)) were compared pairwise for the set of binary variables arising from each qualitative criterion. As the descriptors were heterogeneous regarding the number of classes (e.g., five levels for crown shape, three levels for leaf base shape, and two levels for branch pubescence), weighting was performed by dividing the logarithm of the number of levels of each qualitative criterion (\(\ln {l}_{i}\)) by the sum of the logarithms of the number of levels of all variables evaluated (\({\sum }_{j=1}^{l}\ln {l}_{j}\)). The indicator function of coincidence (\(I\)) took the value “1” if \({y}_{i}\left({o}_{1}\right)={y}_{i}\left({o}_{2}\right)\), “0” if \({y}_{i}\left({o}_{1}\right)\ne {y}_{i}\left({o}_{2}\right)\), and \(NA\) for missing data. The similarities obtained from Harrison's model, \(s\left({o}_{1}, {o}_{2}\right)=\sum_{i=1}^{l}\frac{\ln {l}_{i}}{{\sum }_{j=1}^{l}\ln {l}_{j}}I\left[{y}_{i}\left({o}_{1}\right), {y}_{i}\left({o}_{2}\right)\right]\), were converted into measures of dissimilarity (\(d=1-s\)), organized in a 143 × 143 matrix.
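The weighted matching described above can be illustrated with a short Python sketch. It is not the authors' implementation: the function name, the descriptor keys, and the level labels are hypothetical, and the handling of missing data (skipping the descriptor and renormalizing the weights over the descriptors actually compared) is an assumption, since the text only states that missing comparisons are coded as NA.

```python
import math


def harrison_similarity(tree_a, tree_b, n_levels):
    """Log-weighted matching similarity between two trees.

    tree_a, tree_b: dict mapping descriptor name -> observed level (None if missing)
    n_levels: dict mapping descriptor name -> number of levels of that descriptor
    """
    weights = {}
    for name, levels in n_levels.items():
        if tree_a.get(name) is None or tree_b.get(name) is None:
            continue  # assumption: missing comparisons are dropped and weights renormalized
        weights[name] = math.log(levels)
    total = sum(weights.values())
    if total == 0:
        raise ValueError("no comparable descriptor with more than one level")
    matched = sum(w for name, w in weights.items() if tree_a[name] == tree_b[name])
    return matched / total


if __name__ == "__main__":
    # Numbers of levels taken from the examples in the text; labels are invented.
    n_levels = {"crown_shape": 5, "leaf_base_shape": 3, "branch_pubescence": 2}
    tree_1 = {"crown_shape": "round", "leaf_base_shape": "acute", "branch_pubescence": "absent"}
    tree_2 = {"crown_shape": "round", "leaf_base_shape": "obtuse", "branch_pubescence": "absent"}
    s = harrison_similarity(tree_1, tree_2, n_levels)
    print("similarity:", round(s, 4), "dissimilarity:", round(1 - s, 4))
```

In this toy case the two trees agree on crown shape and branch pubescence, so the similarity is (ln 5 + ln 2) / (ln 5 + ln 3 + ln 2); a mismatch on a descriptor with many levels costs more than a mismatch on a binary one.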

To investigate the possible correlation between geographic distance and morphological dissimilarity between and within botanical varieties, a similarity matrix based on the geographic coordinates of the populations of origin of the accessions was also developed. For this, the pairwise distances in kilometers between the sites of origin of the trees were rescaled to values ranging from zero to one that decrease in proportion to the geographic distance, assigning “1” to trees from the same locality and “0” to trees from the most geographically distant populations.
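A minimal sketch of this rescaling is given below. The text does not say how the distances in kilometers were computed, so the haversine great-circle formula used here is an assumption, as are the function names and the example coordinates; only the rule "1 for the same locality, 0 for the most distant pair" comes from the text.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, used by the haversine formula


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def geographic_similarity(sites):
    """Return a matrix with 1 for identical sites and 0 for the most distant pair."""
    n = len(sites)
    dist = [[haversine_km(*sites[i], *sites[j]) for j in range(n)] for i in range(n)]
    d_max = max(max(row) for row in dist)
    if d_max == 0:
        return [[1.0] * n for _ in range(n)]  # all trees come from the same locality
    return [[1.0 - d / d_max for d in row] for row in dist]


if __name__ == "__main__":
    # Hypothetical (latitude, longitude) pairs for the site of origin of three trees.
    sites = [(-16.6, -49.3), (-16.6, -49.3), (-12.0, -45.0)]
    for row in geographic_similarity(sites):
        print([round(v, 3) for v in row])
```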

Variance components and genetic parameters

We estimated variance components and related genetic parameters for the following quantitative traits: plant height (HEI), trunk circumference 10 cm above ground level (CIR), fruit volume (FRV), and estimated fruit yield (FRY). To calculate the fruit volume (FRV), we used the equation adopted by Resende et al. (2020) and Yunes et al. (2015), which considers the radial and longitudinal circumferences of fruits. The estimated fruit yield (FRY) was obtained from the product of the fruit mass and number of fruits per plant (developed or immature fruits present on the tree in September 2021, a period marked by the appearance of “falling fruits”). The mass and volume of the fruits were measured at the Laboratory of Native Plants of the Federal University of Goiás using 5–10 ripe fruits per plant (according to availability).

Analysis of variance was performed using the Restricted Maximum Likelihood/Best Linear Unbiased Prediction (REML/BLUP) method with two models. In the first model (Model 1), the effects of families, populations, and botanical varieties in a randomized complete block design were considered as follows:

$$Y=Xb+{Z}_{1}f+{Z}_{2}p+{Z}_{3}v+\varepsilon$$
(1)

where:

Y = vector of observed values of the variable;

X = incidence matrix for the fixed effects;

Z1, Z2, Z3 = incidence matrices for the random effects;

b = block effect, fixed;

f = family effect, random;

p = population effect, random;

v = botanical variety effect, random;

ε = experimental error associated with Y, random.

Based on Model 1, the components of genetic variance at the levels among botanical varieties (\({\sigma }_{var}^{2}\)), among populations within varieties (\({\sigma }_{pop}^{2}\)), and among families within populations (\({\sigma }_{fam}^{2}\)), were estimated.

An alternative mixed hierarchical linear model (Model 2) based on similarity matrices was also adopted in the estimation of variance components, considering the additive effects of half-sib families (matrix a), the geographic distance between populations (matrix c), and Harrison’s index (1968) (matrix h), in a randomized complete block design, as follows:

$$Y=Xb+{W}_{1}a+{W}_{2}c+{W}_{3}h+\varepsilon$$
(2)

where Y = vector of observed values of the variable; X = incidence matrix for the fixed effects; W1, W2, W3 = incidence matrices for the random effects; b = block effect, fixed; a = additive effect of half-sib families, random with \(a\sim N(0, A{\sigma }_{a}^{2})\); c = effect of geographic coordinates, random with \(c\sim N(0, C{\sigma }_{c}^{2})\); h = effect of the Harrison index (1968), random with \(h\sim N(0, H{\sigma }_{h}^{2})\); ε = experimental error associated with Y, random.
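For reference, the covariance structure implied by Model 2 can be written out explicitly. The expression below is a sketch under the usual mixed-model assumptions, which are not stated in the text and are assumed here: the random effects a, c, and h are mutually independent and independent of the residuals, and the residuals are independent with a common variance \({\sigma }_{\varepsilon }^{2}\).

$$\mathrm{Var}(Y)={W}_{1}A{W}_{1}^{\prime}{\sigma }_{a}^{2}+{W}_{2}C{W}_{2}^{\prime}{\sigma }_{c}^{2}+{W}_{3}H{W}_{3}^{\prime}{\sigma }_{h}^{2}+I{\sigma }_{\varepsilon }^{2}$$

Under the same assumptions, one common way to express the share of a component k (with k = a, c, or h) is \({\sigma }_{k}^{2}/({\sigma }_{a}^{2}+{\sigma }_{c}^{2}+{\sigma }_{h}^{2}+{\sigma }_{\varepsilon }^{2})\); whether the percentages reported for the variance components were computed exactly this way is an assumption.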

From the best-fitting linear model (1 or 2), the estimated breeding values (EBV) and estimated genotypic values (EGV) were obtained for each accession and each of the quantitative traits evaluated.

Selection index and response to selection

The identification of superior individuals was based on the following ideotype: short plants with voluminous fruits and high productivity. To select accessions that combined these characteristics simultaneously, we used the selection index based on the sum of ranks proposed by Mulamba and Mock (1978). The classification of individuals for the trait set was based on the EBV and EGV obtained from the best mixed hierarchical linear model (1 or 2). For individuals with the lowest EBV-based indexes, the response to selection for the next recombination cycle was calculated with selection in a single sex (maternal). The response to selection for cloning was estimated for those with the lowest rank sums for the EGV. In both cases, we selected the 15 individual accessions with the best performance (≈10%).
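The sum-of-ranks idea can be sketched as follows. Each trait is ranked so that rank 1 is the most desirable value (lowest for height, highest for fruit volume and yield), the ranks are summed per accession, and the lowest sums are selected. The function names, the example values, and the tie rule (tied values share the average of their positions) are assumptions made for illustration and are not taken from the text.

```python
def rank_positions(values, higher_is_better):
    """Rank 1 = most desirable value; ties share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=higher_is_better)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # average 1-based position of the tied block
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def rank_sum_index(traits, higher_is_better):
    """Sum of per-trait ranks for each accession (lower index = better)."""
    n = len(next(iter(traits.values())))
    index = [0.0] * n
    for name, values in traits.items():
        for i, r in enumerate(rank_positions(values, higher_is_better[name])):
            index[i] += r
    return index


if __name__ == "__main__":
    # Hypothetical predicted values for four accessions.
    traits = {
        "HEI": [4.2, 6.0, 5.1, 7.3],
        "FRV": [51.2, 30.0, 47.7, 20.0],
        "FRY": [23.1, 30.0, 8.5, 5.0],
    }
    direction = {"HEI": False, "FRV": True, "FRY": True}
    index = rank_sum_index(traits, direction)
    best_first = sorted(range(len(index)), key=lambda i: index[i])
    print(index, best_first)
```

With 143 accessions, taking the first 15 entries of `best_first` would correspond to the ≈10% selection intensity described above.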

Once the 15 best individual accessions were identified by a selection index based on EBV and EGV, the status number (\({N}_{S}\)), defined as half the inverse of the group coancestry of the selected units (Lindgren et al. 1996), was calculated considering the three traits under selection. The values of \({N}_{S}\) were also calculated considering the selection of superior individuals for each quantitative trait separately.
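As a hedged illustration of the status number, the sketch below computes the group coancestry as the average of all pairwise coancestries among the selected trees, self-coancestries included, and returns \({N}_{S}\) as half its inverse. The coancestry values are simplifying assumptions that are not taken from the text: trees are treated as non-inbred (self-coancestry 0.5), trees from the same open-pollinated family as half-sibs (0.125), and trees from different families as unrelated (0). The function name and family labels are hypothetical.

```python
def status_number(families):
    """Status number N_S = 1 / (2 * group coancestry) of the selected trees.

    families: list with the family label of each selected tree.
    Assumed coancestries: 0.5 with itself, 0.125 between half-sibs of the
    same family, 0 between trees of different families.
    """
    n = len(families)
    if n == 0:
        raise ValueError("at least one selected tree is required")
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                total += 0.5
            elif families[i] == families[j]:
                total += 0.125
    group_coancestry = total / (n * n)
    return 1.0 / (2.0 * group_coancestry)


if __name__ == "__main__":
    print(status_number(["F1", "F2", "F3"]))        # unrelated trees
    print(status_number(["F1", "F1", "F2", "F3"]))  # two half-sibs among four trees
```

Under these assumptions the status number equals the number of selected trees when all of them are unrelated and drops below it as more relatives are selected, which is why a value close to the number of selected individuals indicates little loss of diversity.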

Results

Reproductive phenology

Except for H. speciosa var. speciosa, the botanical varieties present in the germplasm collection exhibited similar reproductive patterns during the evaluation period (Fig. 3b). In H. speciosa var. cuyabensis, a high level of synchrony for the appearance of flowers was achieved in September, with flowering peaking in October (average of 5.29 points on the grade scale and 91.67% of individuals with flowers). The fruiting of H. speciosa var. cuyabensis was highly synchronous between May and November, with 100% of its representatives having at least one fruit between June and September. The lowest percentage of individuals in that phenophase and the lowest monthly average occurred in February (16.67% and 0.16 points, respectively), and the peak of fruit appearance at different stages of maturation was reached in August, with an average of 8.46 points.

Fig. 3

Reproductive patterns of botanical varieties between November 2020 and October 2021 and climatic conditions throughout the year. A Monthly average temperature (°C) and total monthly precipitation (mm) in Goiânia, Goiás, Brazil. Source: Campbel Automatic Station of the School of Agronomy of the Federal University of Goiás. B Monthly averages of the points assigned to the botanical varieties of the germplasm collection for the appearance of flowers (B1) and fruits (B2) between November 2020 and October 2021. “var”: botanical variety. “cuy”: H. speciosa var. cuyabensis. “gar”: H. speciosa var. gardneri. “pub”: H. speciosa var. pubescens; “spe”: H. speciosa var. speciosa. The continuous lines correspond to the mean values for each variety

For H. speciosa var. gardneri, flowering was asynchronous between November and August. High synchrony for flowering occurred in September, with 89.89% of the individuals of the botanical variety having at least one flower. October was marked by the climax of flower appearance, with a monthly average of 5.37 points. High synchrony for fruiting occurred between May and October, with 94.38% of individuals having at least one fruit in August and September. The peak of fruit appearance occurred in August, when the monthly average for that phenophase was 6.76 points.

Flowering of H. speciosa var. pubescens was asynchronous between November and August. High synchrony for this phenophase occurred in September, and the highest monthly average for the appearance of flowers occurred in October (4.79 points). In May, 52.63% of the individuals in this botanical variety had at least one fruit (low synchrony). From June to October, fruiting was highly synchronous, with 94.74% of the individuals in this phenophase in August and September. The peak of fruit appearance in H. speciosa var. pubescens occurred in August (a monthly average of 3.68 points).

In H. speciosa var. speciosa, flower buds, anthesis, or senescent flowers were observed only in September (apex of flowering, with an average of 0.91 points) and October (average of 0.63 points). The fruiting climax occurred in August and September; in both months, the average for the phenophase was 0.72 points. Like flowering, fruiting did not show high synchrony, and the highest percentage of H. speciosa var. speciosa individuals with at least one fruit was only 54.55%, in September and October.

Morphological dissimilarity among and within botanical varieties

Based on Harrison's model (1968), it was possible to detect greater dissimilarity among the botanical varieties of H. speciosa than within them, considering the qualitative traits evaluated (Fig. 4). The highest dissimilarity index was observed between H. speciosa var. pubescens and H. speciosa var. speciosa (pc × sc) (Fig. 4a). Within the botanical varieties, the dissimilarity was less pronounced (Fig. 4a). The botanical varieties H. speciosa var. speciosa, H. speciosa var. cuyabensis, and H. speciosa var. pubescens showed low dissimilarities among individuals within varieties. The dissimilarity index was, in general, positively correlated with the geographic distances, both among and within botanical varieties (Fig. 4b). Thus, mangaba trees belonging to the same variety or to different botanical varieties are expected to be morphologically more divergent for the traits evaluated here as the physical distance between the sites where they occur increases.

Fig. 4

Dissimilarity among and within botanical varieties of mangabeira according to Harrison's model (1968). A dissimilarity among (above) and within (below) botanical varieties. B relationship between the geographic distances of the source populations (in km) and the dissimilarity between (above) and within (below) botanical varieties. “cb”: Hancornia speciosa var. cuyabensis. “gn”: H. speciosa var. gardneri. “pc”: H. speciosa var. pubescens; “sc”: H. speciosa var. speciosa

Variance components and genetic parameters

Plant height (HEI) varied from 2.60 to 11.00 m among accessions, with an average of 6.17 m. The mean circumference of the trunk 10 cm above ground level (CIR) was 63.30 cm, with minimum and maximum values of 22 cm and 158 cm, respectively. The fruit volume (FRV) ranged from 5.98 to 69.23 cm3, with an average value of 25.59 cm3. The estimated fruit yield (FRY) was equivalent, on average, to 17.92 kg/plant, with a minimum value of 0.02 kg/plant and a maximum of 111.18 kg/plant.

For the linear models used in the analysis of variance components, the Akaike Information Criterion (AIC) values of Model 2 were lower than those of Model 1 for all quantitative traits. AIC values are used as a criterion for choosing a statistical model among a set of competitors, with the lowest value indicating the best balance between goodness of fit and parsimony (Bozdogan 1987). Thus, our results show that Model 2, which considers the effects of matrices A, C, and H, can better explain the variation observed in HEI, CIR, FRV and FRY (Table 1).
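The comparison rule can be illustrated with a small Python sketch using the standard definition AIC = 2k - 2 ln L, where k is the number of estimated parameters and ln L the maximized log-likelihood. The function names and the numeric values are hypothetical and are not the values reported in Table 1; how the parameters are counted under REML is also left as an assumption of the sketch.

```python
def aic(log_likelihood, n_parameters):
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2 * n_parameters - 2 * log_likelihood


def preferred_model(candidates):
    """Return the name of the candidate with the lowest AIC.

    candidates: dict mapping model name -> (log-likelihood, number of parameters)
    """
    return min(candidates, key=lambda name: aic(*candidates[name]))


if __name__ == "__main__":
    # Hypothetical log-likelihoods and parameter counts for one trait.
    candidates = {"Model 1": (-250.0, 5), "Model 2": (-240.0, 5)}
    for name, (loglik, k) in candidates.items():
        print(name, aic(loglik, k))
    print("preferred:", preferred_model(candidates))
```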

Table 1 Variance components, percentages of variance components and AIC values considering linear models 1 and 2

Based on Model 1, it was possible to identify the existence of genetic variation among varieties, among populations within varieties, and among families within populations for the trait HEI, with the variances corresponding to 2.86%, 42.10%, and 13.04% of the total variance, respectively. The same occurred with CIR, with variations among varieties, populations, and families equivalent to 13.97%, 20.46%, and 11.88% of the total variation, respectively. For FRV, genetic variation was detected among families (21.91% of the total variance), with null variances among populations and botanical varieties. For FRY, the genetic variance among varieties and among populations corresponded, respectively, to 4.15% and 18.30% of the total variance, with zero variance among families.

According to Model 2, based on the sibling coefficient (or the probability that two genes will originate from the same parent) (Kang et al. 2001), it was possible to verify the existence of additive genetic variation among half-sib families for all quantitative attributes evaluated. For HEI and CIR, most of the variance was due to the effect of the H matrix, which considers Harrison’s similarity index constructed from qualitative characters. For FRV, most of the variation was explained by the additive genetic variation of families (matrix A), and for FRY, most of it was due to the geographic distance among populations (matrix C).

Selection index and response to selection

The best accessions (≈10%) of the germplasm collection of the Federal University of Goiás, selected based on the Mulamba and Mock selection index, are listed in Table 2 and in Supplementary Information (Table S1). Based on EBV, 15 trees comprising 10 families, nine populations, and three botanical varieties were selected. For HEI, the response to selection for the next cycle (\({RS}_{EBV}\)) was equivalent to –4.29%. For FRV and FRY, \({RS}_{EBV}\) was equal to 22.34% and 6.57%, respectively. The top-ranked accession (G122) belongs to the variety H. speciosa var. gardneri and to the population Chapadão do Sul (Mato Grosso do Sul state, Brazil). Its HEI was 4.20 m and its FRY was 23.14 kg, with an FRV of 51.19 cm3. The 15 individual accessions selected based on EGV belonged to eight families, six populations, and two botanical varieties of the mangaba tree. For HEI, the response to selection for cloning (\({RS}_{EGV}\)) was –4.52%. For FRV and FRY, \({RS}_{EGV}\) was 45.99% and 51.12%, respectively. The top-ranked accession (G071) belongs to the variety H. speciosa var. cuyabensis and to the population Chapada dos Guimarães (Mato Grosso state, Brazil). Its HEI was 4.70 m and its FRY was 8.54 kg, with an FRV of 47.73 cm3.

Table 2 Summary of the best ≈10% accessions based on Mulamba and Mock (1978) selection index, response to selection for the next recombination cycle (\({RS}_{EBV}\)) and response to selection for cloning (\({RS}_{EGV}\))

The selection of accessions using the Mulamba and Mock index considering the EBV resulted in a slightly higher status number (\({N}_{S}\) = 12.16) than that obtained using EGV (\({N}_{S}\) = 11.25) (Fig. 5). However, in both cases, the status number values obtained approached the number of selected individuals.

Fig. 5

Simulations of selection response for the next recombination cycle (\({RS}_{EBV}\)) and selection response for cloning (\({RS}_{EGV}\)) versus the number of selected individuals versus status number values (\({N}_{S}\)), considering four quantitative traits, individually. “EBV”: estimated breeding value. “EGV”: estimated genotypic value. “CIR”: trunk circumference 10 cm above ground level (in cm). “HEI”: plant height (in m); “FRV”: fruit volume (in cm3). “FRY”: estimated fruit yield (in kg/plant)

Discussion

H. speciosa var. cuyabensis, H. speciosa var. gardneri, and H. speciosa var. pubescens showed a similar flowering pattern of high synchrony from September to October, with an apex in October. The flowering of one year corresponds to the fruiting of the following year, with a long interval between the appearance of flower buds and fruit development. Previous research evaluated the phenophases in just a few months of the year (De Souza et al. 2008; Leão-Araújo et al. 2019). Here, we chose to carry out assessments over 12 uninterrupted months in order to observe changes in the reproductive patterns of the species throughout the year. However, a longer evaluation period (two or more years) would likely improve the accuracy of our results.

Two well-defined seasons prevail in the Cerrado biome: rainy (from October to March) and dry (from April to September) (Dos Santos et al. 2020). Phenological research involving plant species from the Neotropical savannah demonstrated a pattern of flowering based mainly on precipitation, with greater flower production at the end of the dry season and the beginning of the rainy season (Leão-Araújo et al. 2019; Godim et al. 2020; Novaes et al. 2020), similar to what we observed in the mangaba tree (Fig. 3). Greater production of flowers in a short period (September and October) guarantees a concentration of nectar available to the pollinating agents of H. speciosa, whose flowers are usually visited by bees, butterflies, and moths (Darrault and Schlindwein 2005). Thus, the establishment of a commercial orchard and the increase in production of this native fruit species require proximity to areas containing other plant species capable of sustaining pollinator populations and providing them with food in periods when the mangaba tree is not flowering.

The climax of the appearance of fruits at different stages of maturation for all botanical varieties evaluated occurred in August, shortly before the rainy season and the rise in average temperature. Although native species from the Cerrado produce fruit predominantly in the rainy season (Maruyama et al. 2019), it is common for the appearance of fruits to be more pronounced during the dry-rain transition (Silva et al. 2011). In H. speciosa, the "falling fruits" (fruits that drop from the tree when ripe) appear between September and October, and the seeds are released at the beginning of the most favorable season for the establishment and development of the seedlings due to the higher soil moisture (Salazar et al. 2011). In general, this behavior is intrinsically related to species whose seeds are not dormant and germinate quickly when subjected to ideal water, light, nutrient, and temperature conditions (Ramos et al. 2017). Hancornia speciosa var. speciosa predominantly inhabits the northeastern region of the Cerrado, typically in sandy soils and under low rainfall (Chaves et al. 2020). Silva et al. (2017) evaluated four populations of this variety in a Cerrado-Restinga ecotonal zone with low fertility soil in the state of Maranhão (Brazil) and obtained mean values of 338, 330, 280, and 172 fruits per plant. Thus, the relative inferiority of H. speciosa var. speciosa regarding the production of flowers and fruits detected in our study can be explained by the fact that the environmental conditions of the experimental field of the germplasm collection at the Federal University of Goiás, especially its soil with better nutritional conditions, are different from those of its naturally occurring sites, making this variety poorly adapted, as suggested by Almeida et al. (2019).

The highest morphological dissimilarity, observed here between H. speciosa var. pubescens and H. speciosa var. speciosa, corroborates what we can see in the field (Fig. 1) and partially reinforces the findings of Flores et al. (2018), who evaluated the chemical composition of the leaves of four botanical varieties of mangaba trees and reported a more pronounced differentiation between H. speciosa var. pubescens and H. speciosa var. speciosa, as well as a chemical similarity between H. speciosa var. cuyabensis and H. speciosa var. gardneri. Our results showed a greater morphological similarity between trees belonging to the same botanical variety of mangabeira, according to the classification proposed by Monachino (1945). A study carried out on the same germplasm collection comparing the QST parameter from quantitative traits and FST from molecular data showed that the quantitative divergence among subpopulations within botanical varieties is mainly due to genetic drift, while the divergence among varieties is due to both drift and divergent selection (Chaves et al. 2020).

The most recent botanical review, proposed by the “Flora do Brasil 2020” project, recognized only two botanical varieties of the mangaba tree: Hancornia speciosa var. pubescens (comprising individuals with pubescent structures) and H. speciosa var. speciosa (encompassing all other forms) (Reflora 2023). However, the findings by Collevatti et al. (2018) in a study involving the molecular characterization of mangabeira using neutral molecular markers did not corroborate this more recent classification, suggesting the existence of at least four different groups of H. speciosa. Furthermore, when estimating genetic parameters from microsatellite markers, Collevatti et al. (2016) obtained low but significant genetic differentiation among the same botanical varieties studied here. Our results also showed that, although H. speciosa var. pubescens and H. speciosa var. speciosa are the most dissimilar, the varieties proposed by Monachino (1945) diverge phenotypically when compared to each other, mainly in terms of morphological attributes related to leaves, flowers, and branches. Considering the proposal to classify H. speciosa individuals into only two botanical varieties, it would be more coherent to separately maintain H. speciosa var. speciosa and group the other varieties into a single clade based on morphological traits. The same pattern was defended by Terto et al. (2022), based on molecular markers.

In our study, mangaba trees, whether belonging to the same or to different botanical varieties, were morphologically more divergent the greater the geographic distance between their populations of origin. Fajardo et al. (2018) compared five populations of H. speciosa from the coast of Rio Grande do Norte (Brazil) and found no correlation between genetic and geographic distances. In contrast, a positive and significant correlation between molecular genetic differentiation and geographic distance of up to 400 km was observed among 34 mangabeira tree populations from the Cerrado biome, constituting evidence of isolation by distance (Collevatti et al. 2018).
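
One common way to test an association between two distance matrices, such as morphological and geographic distances, is a Mantel test. The sketch below is illustrative only: it is not necessarily the procedure used in this study, the function names are ours, and the five collection sites and their distances are hypothetical toy data.

```python
import math
import random


def upper_triangle(matrix):
    """Flatten the upper triangle (i < j) of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equally long lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(dist_a, dist_b, permutations=999, seed=0):
    """One-sided Mantel test: returns (correlation, permutation p-value)."""
    n = len(dist_a)
    rng = random.Random(seed)
    observed = pearson(upper_triangle(dist_a), upper_triangle(dist_b))
    labels = list(range(n))
    as_extreme = 0
    for _ in range(permutations):
        # Permute rows and columns of one matrix together.
        rng.shuffle(labels)
        permuted = [[dist_a[labels[i]][labels[j]] for j in range(n)]
                    for i in range(n)]
        if pearson(upper_triangle(permuted), upper_triangle(dist_b)) >= observed:
            as_extreme += 1
    return observed, (as_extreme + 1) / (permutations + 1)


# Hypothetical sites along a transect (positions in km).
positions = [0, 100, 250, 400, 700]
geo = [[abs(a - b) for b in positions] for a in positions]
# Toy morphological distances that grow linearly with geographic distance.
morph = [[0.01 * d for d in row] for row in geo]

r, p = mantel(morph, geo)
print(r, p)
```

Because the toy morphological distances are exactly proportional to the geographic ones, the correlation is 1 by construction; real data would give a weaker correlation.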

The use of the Mulamba and Mock selection index is promising in genetic breeding programs for fruit species (Barth et al. 2020; Oliveira et al. 2018), with advantages such as relative simplicity and ease of application. The ideotype defined here to classify the trees (short plants with voluminous fruits and high productivity) considered attributes that are usually sought in other fruit trees (Castro et al. 2020; Costa et al. 2020), aiming at high aptitude for commercial cultivation and the consumer market. Although CIR was not considered in the selection index, the \({RS}_{EBV}\) for this quantitative attribute was –1.76%, while the \({RS}_{EGV}\) was 1.84%. As the average of the population selected for HEI based on EBV was also reduced, the decrease in the average trunk circumference may not compromise the stability of the trees. The choice to adopt selection at the individual level is justified by the fact that this strategy is simpler than the selection of families, in addition to allowing the exploration of the additive variance among and within families.
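
The rank-sum logic of the Mulamba and Mock index can be sketched as follows: each tree is ranked for every trait in the favorable direction, the ranks are summed, and the trees with the lowest sums are selected. This is a minimal, unweighted illustration; the trait values are hypothetical, the function names are ours, and the sketch does not reproduce the predicted values (EBV or EGV) used in the study.

```python
def rank_trait(values, higher_is_better):
    """Rank values so that 1 is the most favorable; ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=higher_is_better)
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def mulamba_mock(traits, higher_is_better):
    """Sum of ranks per individual across traits (lower sum = better)."""
    n = len(next(iter(traits.values())))
    totals = [0.0] * n
    for name, values in traits.items():
        for i, rank in enumerate(rank_trait(values, higher_is_better[name])):
            totals[i] += rank
    return totals


# Hypothetical values for five trees.
traits = {
    "plant_height": [4.0, 3.0, 5.0, 3.5, 4.5],
    "fruit_volume": [30, 25, 40, 35, 20],
    "fruit_yield": [10, 12, 8, 15, 9],
}
# Ideotype: short plants, voluminous fruits, high productivity.
higher_is_better = {
    "plant_height": False,
    "fruit_volume": True,
    "fruit_yield": True,
}

index = mulamba_mock(traits, higher_is_better)
# Select the two trees with the lowest rank sums.
selected = sorted(range(len(index)), key=lambda i: index[i])[:2]
print(index, selected)
```

Weights could be applied to the ranks of individual traits if some traits are considered more important than others; the sketch treats all traits equally.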

Research involving microsatellite markers revealed high levels of genetic diversity in natural populations of mangabeira tree (Martins et al. 2012; Maia et al. 2022), although low levels of variation have also been observed (Costa et al. 2015; Fajardo et al. 2018), thereby evidencing a clear threat to the maintenance of this important plant genetic resource. When evaluating the Mangaba Genebank of Embrapa Tabuleiros Costeiros (Brazil) between 2006 and 2017, Da Silva et al. (2021) identified a high genetic divergence among its accessions, evidencing the success of the conservation strategies adopted and the potential of this genebank to support breeding research. The genetic variability existing in the germplasm collection of the Federal University of Goiás for agronomic traits such as number of fruits, diameter of fruits, and number of seeds per fruit also attests to its potential use as a base population for the pre-breeding of H. speciosa (Almeida et al. 2019, 2021b).

The great genetic variability observed in H. speciosa allows the adoption of two strategies for its breeding: recurrent selection, which involves the selection and intercrossing of sexually propagated, open-pollinated superior plants, and the planting of clones generated by vegetative propagation (Chaves 2006; Pereira et al. 2006). For recurrent selection purposes, it is recommended that an improved population be obtained by harvesting the open-pollinated seeds of the superior individuals identified here, which characterizes selection on the female parent only (Almeida et al. 2021a, b). The higher selection gains obtained based on EGV in our study showed that clone planting is the most promising strategy for the genetic breeding of H. speciosa compared with recurrent selection, allowing, as expected, greater genetic gains in the short term. According to Chaves (2006), after the selection of superior mother trees of mangabeira (such as those identified here), they must be reproduced via asexual propagation to produce clones that will be planted and evaluated in an experimental scheme. The best clones should then be propagated vegetatively for broader evaluation, enabling the recommendation of superior materials for planting. It is important to highlight the need to use different clones, given the self-incompatibility of the species (Darrault and Schlindwein 2005; Collevatti et al. 2016).

The status number (\({N}_{S}\)) is a measure of representativeness indicative of the genetic diversity maintained after selection, and is related to the number of unrelated genotypes in the subgroup (Lindgren et al. 1996). The adoption of a greater intensity of selection allows for greater genetic gains but promotes a reduction in the status number (Andersson et al. 1998). Thus, breeders must seek a strategy that approaches the ideal scenario for both measures (Fig. 5). The higher status number obtained using the Mulamba and Mock index with EBV (\({N}_{S}\) = 12.16) than with EGV (\({N}_{S}\) = 11.25) can be explained by the fact that selecting the best ≈10% of individuals from EBV brings together a greater number of families, populations, and botanical varieties than the group selected from EGV, reducing the number of related individuals. However, in both cases, it was possible to select dissimilar genotypes, increasing the possibility of allele regrouping in new favorable combinations.
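
The effect of relatedness on the status number can be illustrated with a minimal sketch. It assumes the usual definition, \({N}_{S}\) = 1/(2Θ), where Θ is the group coancestry, that is, the mean of all pairwise coancestries in the selected group, including each individual with itself. The coancestry matrices below are hypothetical and assume non-inbred individuals (self-coancestry 0.5) and half-sib pairs (coancestry 0.125); they do not correspond to the groups selected in this study.

```python
def status_number(coancestry):
    """Status number N_S = 1 / (2 * group coancestry).

    coancestry: square matrix of pairwise coancestry coefficients,
    with self-coancestries on the diagonal.
    """
    n = len(coancestry)
    group_coancestry = sum(sum(row) for row in coancestry) / (n * n)
    return 1.0 / (2.0 * group_coancestry)


# Four unrelated, non-inbred trees.
unrelated = [
    [0.5, 0.0, 0.0, 0.0],
    [0.0, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.0],
    [0.0, 0.0, 0.0, 0.5],
]

# Four trees forming two half-sib pairs (trees 0-1 and trees 2-3).
two_half_sib_pairs = [
    [0.5, 0.125, 0.0, 0.0],
    [0.125, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.125],
    [0.0, 0.0, 0.125, 0.5],
]

print(status_number(unrelated))           # 4.0
print(status_number(two_half_sib_pairs))  # 3.2
```

With four unrelated trees the status number equals the census number, whereas relatedness among the selected trees lowers it, which is why a selected group spread over more families yields a higher \({N}_{S}\).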

Conclusion

The botanical varieties of H. speciosa present in the germplasm collection of the Federal University of Goiás flower between September and October, and the peak of immature fruits attached to trees occurs in August. The highest morphological dissimilarity was observed between H. speciosa var. speciosa and H. speciosa var. pubescens. The botanical varieties H. speciosa var. speciosa, H. speciosa var. cuyabensis, and H. speciosa var. pubescens showed low dissimilarities among individuals within varieties. Both among and within botanical varieties, this dissimilarity is generally associated with the geographic distance between their populations of origin.

There was genetic variation among the accessions of the germplasm collection for quantitative traits with agronomic importance related to size (plant height and trunk circumference) and fruits (fruit volume and estimated fruit yield). Clone selection is the most promising strategy for the breeding of H. speciosa, ensuring a better response to selection in the short term than recurrent selection. However, this strategy depends on the availability of a viable asexual propagation system, which still requires advances for the species.