1 Introduction

Fruit tree productivity is the number of fruits and their weight per unit area. The long-term sustainability of a plantation also depends on the age at which fruiting begins and how long the fruit trees remain productive (Goldschmidt 2013). This particular component is the most important for small-scale farmers in the intertropical zone as life cycles are generally long, yields decrease with age, and the costs of replanting and the subsequent waiting period are high (Somarriba et al. 2021). However, this component is rarely, if ever, taken into account when selecting for productivity.

Cacao, Theobroma cacao L., is a perennial species of the Malvaceae family, cultivated in intertropical zones under very diversified cropping systems. Globally, more than 5 million farmers produce cacao, most of them smallholders. In Latin America and the Caribbean, cacao is grown by 350,000 producing families and directly supports 1.75 million people (IICA 2016).

Planting a new cacao crop requires a high level of investment that only generates returns in the long term, as the cacao tree takes around four years to reach full production. The bean yield of cacao varieties is generally reported in publications (e.g. variety catalogues), either as the maximum annual production or as the average production over several years, as evaluated in perennial trials that rarely exceed a decade. The temporal stability of production is a key characteristic for producers who wish to receive a regular income. These essential characteristics are also major selection criteria for many perennial species that produce fruit over several decades, with the aim of improving varietal productivity over time (Alves et al. 2018; Cilas et al. 2003, 2011; Zamudio et al. 2008). Although genotype x environment interactions are well understood (Sauvadet et al. 2021), stability over time has only been assessed for a few perennial species (Silva et al. 2014, 2023; Cilas et al. 2003), and more recently for T. cacao and T. grandiflorum (Tahi et al. 2019; Chaves et al. 2022, 2023).

A few experiments have studied the production distribution of cacao varieties over time, mainly to determine how many years of production are needed to recommend new varieties (dos Santos Dias and Kageyama 1998; Mustiga et al. 2018; Tahi et al. 2019; Chaves et al. 2022, 2023). Bean yield depends on many intrinsic and extrinsic factors, one of which is the age of the tree. Annual variations in production over time matter because the same cumulative production can result from different production dynamics, shaped by earliness, higher productivity in the first years, and tree senescence, which are often linked to external factors.

To ensure the long-term success of cocoa production, clone and hybrid selection must be based on the analysis of robust data. The chosen varieties should exhibit stability over time, demonstrating resilience in the ever-changing landscape of climate conditions. The total number of pods produced by a cacao tree during its lifetime corresponds to the potential production of the tree. However, the useful harvest corresponds to the production of healthy pods, the beans of which will be marketed. The study presented here is based on the analysis of monthly and annual production results obtained in a clonal trial of 46 cacao clones. In order to gain a clearer understanding of the link between production over successive years, a longitudinal data analysis was carried out (Piepho and Eckl 2014; De Faveri et al. 2015; Reckling et al. 2021). Longitudinal data consist of repeated measurements on the same statistical unit (or elementary plot) taken over a period of time.

The objectives of this study were to:

  1. Examine the dynamics of pod production over time to determine the stability of the clones. The study addresses the research question, ‘If significant correlations exist between levels of production over successive years, what is the best longitudinal analysis model?’

  2. Understand how the dynamics of total pod and healthy pod production over time reflect the behaviour of different cacao clones and allow selection of those with high, stable and long-term production. The study addresses the research question, ‘How are the temporal production dynamics influenced by genetics?’

  3. Assess whether the intra-annual dynamics of clones affect production. The study investigates the research question, ‘How do intra-annual dynamics vary among clones and are they linked to differences between total production and healthy production?’

2 Material and methods

2.1 Planting material and experimental design

The trial took place over eighteen consecutive years, from 2001 to 2018, at the CATIE experimental farm “La Lola” in the Limon Province of Costa Rica, in the canton of Matina, at 10° 06’N and 83° 23’W, at an elevation of 40 meters. In this trial, 42 clones were planted in 1998-1999, in a randomized complete block design, with four randomized blocks and elementary plots of 8 trees for each clone (Figure 1). Additionally, four clones, each represented by 32 trees per clone, were planted at the edge of the trial area. The spacing between trees was 3 meters by 3 meters. Some trees died and were not included in the analyses. The number of surviving trees varied between clones, from 10 to 32 at the beginning of the trial in 2001 and 5 to 32, depending on the clone, at the end of the trial in 2018. The clones in the trial (Table 1) were selected primarily for their tolerance to moniliasis (Moniliophthora roreri), black pod (Phytophthora palmivora) or witches’ broom (Moniliophthora perniciosa) and/or for their productivity. Nevertheless, clones with tolerance to moniliasis predominated in the trial, as there was a lack of data on their productive potential (Phillips-Mora et al. 2013).

Fig. 1 Picture of the trial evaluated over 18 years, showing an elementary plot of trees from the cross UF 273 x Catie 1000.

Table 1 BLUP Comparison of cumulative production over 18 years of healthy pods and total pods per tree of the 46 clones and distribution of the clones in 4 classes defined by hierarchical classification of standardized monthly healthy and total pods production dynamics.

Banana plants (Musa sp.) were planted at a spacing of 6 m x 6 m to provide temporary shade. These were gradually thinned out to leave only permanent shade plants such as guava (Inga edulis) and immortelle/poró (Erythrina poeppigiana), irregularly distributed. The cacao trees were given structural pruning at the beginning of the trial and periodic maintenance pruning thereafter (Phillips-Mora et al. 2013). Fertilization consisted of applying 150 grams of 18-5-15-6-0.3-7 (N-P-K-Mg-B-S) per tree every 3 months during the 18 years of the trial. No disease control was carried out other than the removal of diseased fruits during the monthly evaluations. Weed control consisted of manual weeding every 2 months, supplemented annually by 2 directed applications of paraquat (0.2 kg/ha).

All the pods produced by the trees were recorded monthly as healthy or damaged pods. The total number of pods and healthy pods produced were used in this study. The number of total pods corresponded to all mature fruit produced and healthy pods were mature fruits free from disease symptoms or pest damage (insects or rodents). Measurements began in 2000/2001, i.e. two years after planting, and were recorded monthly over the following 18 years until 2017/2018.

2.2 Traits analysed

For longitudinal analysis and estimation of heritability, the traits analysed were: i) the number of total pods (TP) and healthy pods (HP) produced per year, per elementary plot, for each of the 18 years; ii) the sum of total pods and healthy pods produced over the 18 years; and iii) an earliness index. This earliness index (EI) was built as defined by Reddy et al. (2002) and Cilas et al. (2011), in order to study the earliness (or precocity) trait:

$$EI= \frac{n{y}_{1}+\left(n-1\right){y}_{2}+\dots +2{y}_{n-1}+ {y}_{n}}{Y}= \frac{18{hp}_{1}+17{hp}_{2}+\dots +2{hp}_{17}+ {hp}_{18}}{{hp}_{1-18}}$$

where:

\({hp}_{i}\) is the number of healthy pods in year i,

\({hp}_{1-18}\) is the total number of healthy pods, summed from year 1 to year 18,

n is the number of years studied (here, n = 18).

The results for the sum of total pods and healthy pods produced over the 18 years and for EI are presented in the supplementary material.
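As an illustration of the earliness index defined above, the following minimal Python sketch computes EI from a chronological list of yearly healthy-pod counts. The function name `earliness_index` and the toy series are assumptions made for this example; they are not part of the original analysis.

```python
def earliness_index(yearly_pods):
    """Earliness index EI for one plot or clone.

    yearly_pods: healthy-pod counts for years 1..n, in chronological order.
    Year 1 gets weight n, year 2 weight n-1, ..., year n weight 1,
    and the weighted sum is divided by the cumulative production.
    """
    n = len(yearly_pods)
    total = sum(yearly_pods)
    if total == 0:
        raise ValueError("EI is undefined when no pods were produced")
    weighted = sum((n - i) * y for i, y in enumerate(yearly_pods))
    return weighted / total


# Toy series (not trial data): same cumulative production, different timing.
early = [30, 20, 10, 0]
late = [0, 10, 20, 30]
ei_early = earliness_index(early)
ei_late = earliness_index(late)
```

With this definition, EI lies between 1 (all production in the last year) and n (all production in the first year), so the early series above receives the higher index although both series have the same cumulative production.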

For the characterization of monthly production dynamics, the traits analysed were: standardized monthly counts for HP and TP over the 18 years.

The standardized count for HP (SHP) and TP (STP) for month m and clone c was:

$${SHP}_{m,c}= \frac{{hp}_{m,c}}{\sum {hp}_{m,c}}$$
$${STP}_{m,c}= \frac{{tp}_{m,c}}{\sum {tp}_{m,c}}$$

where

\({hp}_{m,c}\) and \({tp}_{m,c}\) are the mean numbers of healthy and total pods, respectively, for clone c in month m,

\(\sum {hp}_{m,c}\) and \(\sum {tp}_{m,c}\) are the sums, over all months of the 18-year study period, of the mean numbers of healthy and total pods, respectively, for clone c.

SHP and STP are respectively the proportions of healthy and total pod production in a month compared to healthy and total production over the 18 years. They provide information on the distribution of healthy and total production over time for each clone.
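The standardization can be sketched in a few lines of Python. The function name `standardize_monthly` and the toy values are assumptions for illustration only; the sketch simply divides each monthly mean by the clone's total over the whole series.

```python
def standardize_monthly(monthly_means):
    """Convert a clone's monthly mean pod counts into proportions of its total.

    monthly_means: mean pods per month, one value per month of the study
    (216 values for 18 years), in chronological order.
    """
    total = sum(monthly_means)
    if total == 0:
        raise ValueError("cannot standardize a series with zero total production")
    return [value / total for value in monthly_means]


# Toy series (not trial data): two clones with different yields, same timing.
low_yield = [1, 3, 4, 2]
high_yield = [10, 30, 40, 20]
shp_low = standardize_monthly(low_yield)
shp_high = standardize_monthly(high_yield)
```

Both toy clones end up with the same standardized profile, which is the point of the transformation: it keeps the timing of production and discards its level.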

2.3 Statistical analyses

2.3.1 Longitudinal analyses of pods production

Longitudinal data analyses were used to identify the autocorrelation structure of elementary plot data over time. Six covariance structures were compared: a first-order autoregressive model with homogeneous variances (AR) or heterogeneous variances (ARH), a compound symmetry model (constant correlation between years) with homogeneous variances (CS) or heterogeneous variances (CSH), an antedependence model (ANTE), and an unstructured model (UN), in which each year has its own variance and each pair of years its own correlation. The models can be expressed as:

$${hp}_{ijk} \left(and\;{tp}_{ijk}\right)= \mu + {c}_{i}+ {y}_{j} + {cy}_{ij} + {e}_{ijk}$$

where:

\(\mu\) is the mean,

\({c}_{i}\) is the effect of clone i,

\({y}_{j}\) is the effect of year j,

\({cy}_{ij}\) is the clone x year interaction,

\({e}_{ijk}\) is the residual of elementary plot k belonging to clone i for year j,

\({hp}_{ijk}\) is the healthy pod production of elementary plot k belonging to clone i for year j,

\({tp}_{ijk}\) is the total pod production of elementary plot k belonging to clone i for year j.

  1. The Compound Symmetry model assumes a constant correlation between any two measurements taken on the same elementary plot in different years, regardless of how far apart in time they are. With homogeneous variances (CS), the variability of the measurements is the same over the entire study: \({\text{V}}\left({e}_{ijk}\right)={\sigma }^{2}\). With heterogeneous variances (CSH), the variance is allowed to differ between years: \({\text{V}}\left({e}_{ijk}\right)={\sigma }_{j}^{2}\). In both cases:

    $$\mathrm{Corr}\left({e}_{ijk},{e}_{i^{\prime}j^{\prime}k^{\prime}}\right)=\left\{\begin{array}{ll}\rho & \mathrm{if}\;i=i^{\prime}\;\mathrm{and}\;k=k^{\prime}\\ 0& \mathrm{otherwise}\end{array}\right.$$

  2. The autoregressive model accounts for temporal dependence in the data, recognizing that each observation is influenced by past observations on the same plot:

    $$\mathrm{V}\left({e}_{ijk}\right)=\left\{\begin{array}{ll}{\sigma }^{2}& \mathrm{if\;the\;variances\;are\;homogeneous\;between\;years\;(AR)}\\ {\sigma }_{j}^{2}& \mathrm{otherwise\;(ARH)}\end{array}\right.$$
    $$\mathrm{Corr}\left({e}_{ijk},{e}_{i^{\prime}j^{\prime}k^{\prime}}\right)=\left\{\begin{array}{ll}{\rho }^{\left|j-j^{\prime}\right|}& \mathrm{if}\;i=i^{\prime}\;\mathrm{and}\;k=k^{\prime}\\ 0& \mathrm{otherwise}\end{array}\right.$$

  3. The antedependence model (ANTE) is also an autoregressive model, but its covariance structure has heterogeneous variances and heterogeneous correlations between adjacent years. The correlation between two non-adjacent years is the product of the correlations between the successive years that lie between them. The element in row i and column j of the covariance matrix between years (i < j, where i and j here index years) is:

    $${\sigma }_{i}{\sigma }_{j}{\prod }_{k=i}^{j-1}{\rho }_{k}$$

    i.e. the matrix form for 3 years would be:

    $$\left(\begin{array}{ccc}{\sigma }_{1}^{2}& {\sigma }_{1}{\sigma }_{2}{\rho }_{1}& {\sigma }_{1}{\sigma }_{3}{\rho }_{1}{\rho }_{2}\\ {\sigma }_{2}{\sigma }_{1}{\rho }_{1}& {\sigma }_{2}^{2}& {\sigma }_{2}{\sigma }_{3}{\rho }_{2}\\ {\sigma }_{3}{\sigma }_{1}{\rho }_{1}{\rho }_{2}& {\sigma }_{3}{\sigma }_{2}{\rho }_{2}& {\sigma }_{3}^{2}\end{array}\right)$$

  4. The unstructured model imposes no pattern: every year has its own variance and every pair of years its own correlation, which makes it the most flexible description of the correlation structure in longitudinal data:

    $$\mathrm{V}\left({e}_{ijk}\right)={\sigma }_{j}^{2}$$
    $$\mathrm{Corr}\left({e}_{ijk},{e}_{i^{\prime}j^{\prime}k^{\prime}}\right)=\left\{\begin{array}{ll}{\rho }_{jj^{\prime}}& \mathrm{if}\;i=i^{\prime}\;\mathrm{and}\;k=k^{\prime}\\ 0& \mathrm{otherwise}\end{array}\right.$$

    i.e. the matrix form for 4 years would be:

    $$\left(\begin{array}{cccc}{\sigma }_{1}^{2}& {\sigma }_{12}& {\sigma }_{13}& {\sigma }_{14}\\ {\sigma }_{12}& {\sigma }_{2}^{2}& {\sigma }_{23}& {\sigma }_{24}\\ {\sigma }_{13}& {\sigma }_{23}& {\sigma }_{3}^{2}& {\sigma }_{34}\\ {\sigma }_{14}& {\sigma }_{24}& {\sigma }_{34}& {\sigma }_{4}^{2}\end{array}\right)$$
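To make the difference between these structures concrete, the following Python sketch builds the within-plot correlation matrices for the compound symmetry, first-order autoregressive and antedependence structures, and converts a correlation matrix into a covariance matrix. The function names and the numerical values are assumptions chosen for illustration; they are not estimates from the trial.

```python
def cs_corr(n, rho):
    """Compound symmetry: the same correlation rho for every pair of years."""
    return [[1.0 if i == j else rho for j in range(n)] for i in range(n)]


def ar1_corr(n, rho):
    """First-order autoregressive: correlation rho ** |j - j'|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


def ante_corr(rhos):
    """First-order antedependence.

    rhos[k] is the correlation between year k and year k + 1 (0-based);
    non-adjacent years get the product of the correlations in between.
    """
    n = len(rhos) + 1
    corr = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            prod = 1.0
            for k in range(i, j):
                prod *= rhos[k]
            corr[i][j] = corr[j][i] = prod
    return corr


def to_cov(corr, sds):
    """Scale a correlation matrix by per-year standard deviations."""
    n = len(sds)
    return [[sds[i] * sds[j] * corr[i][j] for j in range(n)] for i in range(n)]


# Illustrative values only (3 years).
ante = ante_corr([0.8, 0.5])
ante_cov = to_cov(ante, [1.0, 2.0, 3.0])
```

Passing equal standard deviations to `to_cov` gives the homogeneous variants (CS, AR), and year-specific ones give the heterogeneous variants (CSH, ARH). When all adjacent correlations are equal, `ante_corr` reduces to `ar1_corr`, which is why ANTE can be seen as a generalization of the autoregressive structure.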

Criteria such as the Akaike Information Criterion (AIC) or the Bayesian Information Criterion (BIC) are often used to determine the most appropriate correlation structure for the data, as they balance goodness of fit against model complexity (Verbeke and Molenberghs 2000). The lower the criterion, the better the model; only AIC is presented here. The longitudinal data analyses were performed with PROC MIXED of the SAS system (SAS 2019). The best linear unbiased predictors (BLUP, Henderson 1975) were estimated for HP and TP, also using PROC MIXED (Piepho et al. 2008).
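As a reminder of how these criteria trade fit against complexity, their usual textbook definitions can be written as follows. Here \(\ln L\) is the maximized (or restricted) log-likelihood, p the number of estimated covariance parameters and N the sample size used by the software; this notation is introduced for illustration and the exact definition of N depends on the software.

$$AIC=-2\ln L+2p$$
$$BIC=-2\ln L+p\ln N$$

For two structures with similar log-likelihoods, the one with fewer parameters therefore has the lower AIC.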

2.3.2 Temporal analysis of intra-annual and inter-annual pod production rates

To identify groups of clones with similar production dynamics, the monthly production dynamics of healthy and total pods for each clone were clustered jointly using hierarchical clustering (Murtagh and Legendre 2014). In order to compare the distribution of production across months and years, but not its level, the standardized monthly counts SHP and STP were used.
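The clustering step can be sketched as follows in plain Python. This is a naive agglomerative procedure using Ward's minimum-variance criterion, which is an assumption here (the text cites Murtagh and Legendre 2014 but does not state the linkage); the joint clustering of healthy and total pods is assumed to be obtained by concatenating each clone's SHP and STP series into one profile. Clone names and values are toy data.

```python
def ward_clusters(profiles, n_clusters):
    """Agglomerative clustering with Ward's criterion (naive version).

    profiles: dict mapping a clone name to its standardized series
              (all series must have the same length).
    Returns a sorted list of clusters, each a sorted list of clone names.
    """
    if not 1 <= n_clusters <= len(profiles):
        raise ValueError("n_clusters must be between 1 and the number of clones")
    # Each cluster is stored as (member names, centroid vector).
    clusters = [([name], list(series)) for name, series in profiles.items()]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                names_a, cen_a = clusters[a]
                names_b, cen_b = clusters[b]
                na, nb = len(names_a), len(names_b)
                dist2 = sum((x - y) ** 2 for x, y in zip(cen_a, cen_b))
                # Increase in within-cluster sum of squares if a and b merge.
                cost = na * nb / (na + nb) * dist2
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        names_a, cen_a = clusters[a]
        names_b, cen_b = clusters[b]
        na, nb = len(names_a), len(names_b)
        centroid = [(na * x + nb * y) / (na + nb) for x, y in zip(cen_a, cen_b)]
        merged = (names_a + names_b, centroid)
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)] + [merged]
    return sorted(sorted(names) for names, _ in clusters)


# Toy standardized profiles (hypothetical clones, not trial data).
profiles = {
    "early_1": [0.50, 0.30, 0.15, 0.05],
    "early_2": [0.45, 0.35, 0.15, 0.05],
    "late_1": [0.05, 0.15, 0.30, 0.50],
    "late_2": [0.05, 0.10, 0.35, 0.50],
}
two_classes = ward_clusters(profiles, 2)
```

Because the profiles are proportions, clones are grouped by when they produce rather than by how much, so a low-yielding and a high-yielding clone with the same timing fall into the same class.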

To calculate the average dynamics of each class, the monthly means of healthy and total pod production per class were computed and plotted, then broken down into seasonality and trend by moving average using multiplicative models, as the variability of the standardized time series SHP and STP increased with the mean. The trend corresponds to the long-term evolution of the data and is estimated by calculating the average of the values in the time series over a period longer than the production year in order to eliminate seasonality and some of the randomness. It is based on the assumption that values close in time are more similar. Seasonality corresponds to the average intra-annual variation. In our model, it does not depend on the year. The trend component was first calculated using an 18-month moving average to smooth out the dynamics and filter out random fluctuations. Then, the seasonal component was calculated by averaging the value of each month over the entire study, then dividing by the trend. Finally, the series of measured data were modelled as:

$${SHP}_{m,c}= {T}_{{H}_{m,c}}* {S}_{{H}_{m,c}}* {e}_{{H}_{m,c}}$$
$${STP}_{m,c}= {T}_{{T}_{m,c}}* {S}_{{T}_{m,c}}* {e}_{{T}_{m,c}}$$

Where

\({T}_{{H}_{m,c}}\) and \({T}_{{T}_{m,c}}\) are the trend of SHP and STP, respectively, for clone c in month m,

\({S}_{{H}_{m,c}}\) and \({S}_{{T}_{m,c}}\) are the seasonality of SHP and STP, respectively, for clone c in month m,

\({e}_{{H}_{m,c}}\) and \({e}_{{T}_{m,c}}\) are the residuals of SHP and STP, respectively, for clone c in month m.
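The multiplicative decomposition above can be sketched in plain Python as a classical moving-average decomposition. This is an illustrative reconstruction under stated assumptions, not the code used in the study: the function names, the centered form of the moving average, the rescaling of seasonal factors to a mean of 1 and the toy series are all choices made for this example.

```python
def centered_moving_average(series, window):
    """Centered moving average; None where the window does not fit.

    For an even window the two end points get half weight, so the
    average stays centered on an observed month.
    """
    half = window // 2
    trend = [None] * len(series)
    for t in range(half, len(series) - half):
        block = series[t - half : t + half + 1]
        total = sum(block)
        if window % 2 == 0:
            total -= 0.5 * (block[0] + block[-1])
        trend[t] = total / window
    return trend


def decompose_multiplicative(series, window=18, period=12):
    """Split a series into trend * seasonality * residual."""
    trend = centered_moving_average(series, window)
    # Detrend by division, then average the ratios by calendar month.
    ratios = [[] for _ in range(period)]
    for t, (y, tr) in enumerate(zip(series, trend)):
        if tr:  # skips None and a zero trend
            ratios[t % period].append(y / tr)
    if any(not r for r in ratios):
        raise ValueError("series too short to estimate every seasonal factor")
    raw = [sum(r) / len(r) for r in ratios]
    scale = sum(raw) / period
    if scale == 0:
        raise ValueError("seasonal factors are all zero")
    seasonal = [s / scale for s in raw]  # factors now average to 1
    residual = [
        y / (tr * seasonal[t % period]) if tr and seasonal[t % period] else None
        for t, (y, tr) in enumerate(zip(series, trend))
    ]
    return trend, seasonal, residual


# Toy series (not trial data): constant level 2.0 times a 12-month pattern.
pattern = [0.4, 0.6, 0.8, 1.0, 1.2, 1.6, 1.8, 1.6, 1.2, 0.8, 0.6, 0.4]
series = [2.0 * pattern[t % 12] for t in range(72)]
trend, seasonal, residual = decompose_multiplicative(series, window=12)
```

The toy call uses a 12-month window because it averages out a strictly periodic 12-month pattern exactly, which makes the result easy to check; the default of 18 months follows the window described in the text and smooths more strongly.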

To compare the dynamics of the classes, generalized additive models (GAM) were used to model the trend and seasonality of clone dynamics. The monthly means of SHP and STP for each clone were first split into seasonality and trend by moving average using a multiplicative model, as was done for the class means. The trends and seasonalities of SHP and STP were then modelled using four GAMs with a class effect and a smooth effect of date that depends on the class.

Clustering analyses were carried out with R 4.0.3 (R Core Team 2017). Data manipulation was done using the plyr package (Wickham 2011). The mgcv package (Wood 2011) was used for GAM modelling. The ggplot2 package (Wickham 2009), and itsadug package (van Rij et al. 2020) were used for plots.

3 Results and discussion

3.1 Analysis of clonal production

Clones differed markedly in both the means and the BLUPs of TP and HP production over the 18-year study (Table 1). The three new improved clones, CATIE-R1, CATIE-R4 and CATIE-R6, performed better than the other clones in terms of HP production (Table 1). They were the only clones to produce more than 500 healthy pods per tree on average, accumulated over 18 years (total production: 610 to 679 pods). ARF-10 was the most productive clone, with 1188 pods produced over 18 years, but only 15% of these pods were healthy. Clone CATIE-R6 produced the largest number of healthy pods (96% of its pods were healthy). Clone GU-133N was the least productive in terms of total pods, with only 84 pods, of which 68 were healthy. RB-41 produced the lowest proportion of healthy pods (only 10.2% of the 761 pods produced).

3.2 Longitudinal analysis of the quantity of pods produced

Longitudinal data analyses defined the correlation structure between elementary plot yields across the study period. Analysis of the data from all production years showed that, of the four models compared (two of which were subdivided into homogeneous and heterogeneous variances), the unstructured model fitted best according to the selection criteria for both HP and TP (Table 2). However, it required 171 parameters to be estimated (18 variances and 153 covariances or correlations). The worst-fitting model was compound symmetry (CS), in which the correlation between years is constant regardless of the interval between them. Indeed, the correlation between years decreased as the gap between years increased (environmental correlations, Tables 3 and 4), which is typical of an autoregressive structure rather than of compound symmetry. Of the three autoregressive models, the antedependence model (ANTE) performed best. ANTE models are useful for analysing the covariance structure of continuous longitudinal data (Jaffrézic et al. 2003). They are more parsimonious than unstructured models, i.e. they have fewer parameters to estimate. ANTE models are more general than ARH models as they do not require correlations between measurements equidistant in time to be equal (Zimmerman and Núñez-Antón 1997).
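The parameter counts behind this comparison follow from the definitions in section 2.3.1 and can be sketched in Python. The function name `n_cov_params` is an assumption for this example, and the counts assume a first-order antedependence structure and no variance components other than the within-plot covariance matrix between years.

```python
def n_cov_params(structure, n_years):
    """Number of covariance parameters for each structure over n_years."""
    counts = {
        "CS": 2,                             # one variance, one correlation
        "CSH": n_years + 1,                  # one variance per year, one correlation
        "AR": 2,                             # one variance, one correlation
        "ARH": n_years + 1,                  # one variance per year, one correlation
        "ANTE": 2 * n_years - 1,             # n variances, n - 1 adjacent correlations
        "UN": n_years * (n_years + 1) // 2,  # n variances, n(n - 1)/2 covariances
    }
    return counts[structure]


params_18 = {s: n_cov_params(s, 18) for s in ("CS", "CSH", "AR", "ARH", "ANTE", "UN")}
```

For 18 years this gives 171 parameters for the unstructured model, matching the figure quoted above, against 35 for the antedependence model, which illustrates why the latter is the more parsimonious choice.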

Table 2 Selection criteria of the longitudinal data analysis model for annual production (healthy pods) and annual potential production (total pods) - between brackets, for 18 years and for 15 years (from year 4 to year 18). CS compound symmetry model with homogeneous variances, CSH compound symmetry model with heterogeneous variances, AR1 first-order autoregressive model with homogeneous variances, ARH1 first-order autoregressive model with heterogeneous variances, ANTE antedependence model, UN unstructured model, AIC Akaike information criterion.
Table 3 Estimated correlations between annual productions and cumulative production for healthy pods (genetic, lower triangle; environmental, upper triangle).
Table 4 Estimated correlations between annual potential productions and cumulative potential production (total of pods, genetic, lower triangle; environmental, upper triangle).

Due to the gradual entry into production of certain clones, and therefore the time lag observed in earliness, it made sense to repeat the longitudinal analyses after eliminating the first few years of production (Table 2). For the full productive life of the trees (between 4 and 18 years), the longitudinal data analyses were not affected by precocity behaviour. After the unstructured model, the second-best model was the antedependence model for both HP and TP (Table 2).

Results differed for the factorial mating design studied in Côte d’Ivoire, where the Compound Symmetry model with heterogeneous variances (CSH) performed best, indicating a strong tree or elementary plot effect (Tahi et al. 2019). A tree effect is to be expected in mating designs, as each tree is a unique genotype. The good fit of the antedependence model indicated that the correlation between annual production levels decreases as the interval between years increases, for both HP and TP. There was no biennial effect on production, which would have appeared as alternating correlations. The same antedependence model performed well for healthy and total pod production. Thus, the impact of pests and diseases is significant but does not change the temporal dependency structure, which is the same for total and healthy pods, despite the fact that pest and disease effects are often spatialized (Brun et al. 1997; Sounigo et al. 2003; Ndoumbe Nkeng et al. 2017).

The genetic and environmental correlations between the 18 years were estimated for the two traits (HP and TP); longer intervals between years resulted in lower correlations between production levels (Tables 3 and 4). Correlations were generally higher between later years, probably because the start of production varies greatly from one clone to another. It is also possible that the first years of production are more sensitive to climatic hazards. From year 7 onwards, genetic correlations were very high for both healthy and total pods (>0.85), and annual production allowed cumulative production over 18 years to be estimated (data not shown). Selection based on genetic values could therefore be carried out with high accuracy from the 7th year of production. This result differs from the conclusions of a study conducted in Ghana (Owusu-Ansah et al. 2013), which recommended only 3 years of observation but was based on simulations. However, our conclusions are in line with a study on cacao carried out in Brazil, which recommended around 7 years of observation to improve the efficiency of selection (Chaves et al. 2022).

3.3 Temporal analysis of intra-annual and inter-annual pod production rates

Total pod production increased in the first seven years and then rose more slowly to reach a plateau after twelve years (Figure 2). For healthy pods, annual production increased in the first seven years, reached a plateau, peaked in the 12th season of production with an average of 20 pods per cacao tree, and then declined after the fourteenth season of production (Figure 2). The difference between the average numbers of total and healthy pods increased over time, showing that unhealthy pods take a growing share of the yield useful to the producer as the plot ages.

Fig. 2 Average, minimum and maximum number of healthy (blue) and total (red) pods produced per tree each season (from July to June).

Four dynamic patterns were identified by classifying the dynamics of standardized monthly counts SHP and STP (Table 1). These dynamics were split into seasonality over 12 months (Figure 3b) and trend (Figure 3c). The GAM models enabled comparison of classes for seasonality (intra-annual variability) and trend (inter-annual variability) (Figure 4).

Fig. 3 Mean (a), Trend (b) and Seasonality (c) of the standardized monthly dynamics of each production class for healthy (blue) and total (red) pods. The area under each mean curve is 100%. Standardized monthly production corresponds to the production rate of total and healthy pods, standardized with respect to the number of pods produced by each clone over the 18 years of the study.

Fig. 4 Smooths by class, based on predictions from the GAMs, of the mean healthy pod trend per clone (a), mean total pod trend per clone (b), mean healthy pod seasonality per clone (c) and mean total pod seasonality per clone (d). The GAMs are built on the multiplicative decomposition of the mean standardized dynamics of each clone over 18 years.

3.3.1 Inter-annual variability

Class C was the most numerous with 25 clones. Classes A, B and D had 6, 11 and 4 clones respectively. Average dynamics per class over the 18 years resulted in contrasting dynamics, with varying production earliness and distinct production peaks for each class (Figure 3a).

There was less variance in trends in total pod production between classes (Figures 3c and 4b). Regardless of the class, a stabilization or decrease in total production was recorded after a peak in production around the 14th year. Whatever the level of production of each clone, their production over time appears similar.

In contrast, trends in healthy pod production were different for each class (Figure 4a). Production of healthy pods began slightly later in Class B (around the 7th season) but was maintained for longer than in the other classes: most of its production occurred between seasons 7 and 16. Class A had a more even distribution of production over time, although production decreased slightly after the 9th year. Class D was the least stable: most production occurred during the first 8 years and, from the 13th year, relative production was low. Class C was positioned between classes B and D. Overall, in three of the classes, the production of healthy pods, which is the producer’s main source of income, took place in the first 12 years of production (Figure 3a).

The production curves of total pods and healthy pods (Figure 3a) do not follow simple exponential dynamics of the type alluded to by Ryan et al. (2009). Variation in production over time is complex and is influenced by the environment (climate, disease pressure, soil and shade), genetics (physiological behaviour, disease resistance), and genotype-by-environment interaction (competition, pollination).

Ultimately, pod production can be maintained beyond 18 years. However, production of healthy pods, which the producer needs to remain profitable, will rapidly drop off after this time. This could be linked to genetic differences in disease resistance between clones. As the trial was fertilized, it appears that the environment, and in particular the proportion of diseased or damaged pods, is responsible for the non-sustainability of production.

An earlier decrease in production (from the 6th year) was observed in a factorial mating design in Côte d’Ivoire (Tahi et al. 2019). Soil depletion and competition between cacao trees could be to blame (Euler et al. 1992; Montagnon et al. 2001; Trebissou et al. 2021). In Côte d'Ivoire, competition has had significant effects on yield from year 6 of production (Trebissou et al. 2021), with competition between cacao trees increasing with tree age. It is possible that competition occurs later in Costa Rica due to lower planting density, shade, use of fertilizer and/or better soil and water conditions (Sanchez 2002; Almeida and Valle 2007). Production is prolonged under Costa Rican conditions compared to Côte d'Ivoire. Class B appears to be the most sustainable. This could be related either to resistance (Class B produced similar numbers of total and healthy pods) or to production clustered in a period with less disease (Figure 3b).

The depletion phenomenon, also observed in other trees (McFadyen et al. 2011), requires consideration of relevant agronomic techniques to restore productive potential. This may include measures to protect the health of the plant. It would be wise, for example, to check whether the wilt rate (young fruit dieback, known as cherelle wilt), sometimes involved in regulation phenomena, increases beyond a certain age. To complete this study, which is the first to address the long-term production sustainability of various clones, it would be worthwhile to gain a better understanding of the heritability of several traits linked to yield, such as flowering, pollination, wilt rate, and number of beans per pod (Valle et al. 1990; Cilas et al. 2010). This would help pinpoint which yield components change over time. Earliness is a sought-after trait in perennial plants, but it would be desirable to identify more sustainable plant material which may extend the production life of the tree.

3.3.2 Intra-annual variability

The production periods of the different clones vary significantly, even though the main production period falls between October and January (Figures 3b and 4d). The most abundant class, class C, displayed a typical distribution of total pod production, with two production periods per season. The first period ran from October to January and the second, shorter, period took place in April. Both peaks had the same maximum, but the first lasted three times longer than the second. Both classes A and B had a main peak in total pod production, but at different times. Class B had its main production peak in April, outside the main harvesting period, while class A was earlier in the season (October - November). They both had a very small secondary production peak. Class D had a very extended and early production of healthy pods, from July to November.

Class B had similar total and healthy standardized production dynamics, whereas the standardized production dynamics of classes C and D differed, with peaks in healthy pod production slightly staggered over the year compared to total production (Figures 3b and 3c). Class A is intermediate. These differences between healthy and total pod production can also be observed in inter-annual trends, with a marked reduction in healthy pods over time in classes C and D. Class A and B clones appear to be less affected by disease (particularly class B), but have very different production dynamics, producing pods at different times of the year. Class A has a relatively stable production over 18 years, whereas class B produces relatively late. These characteristics may have a genetic origin.

Between June and September, the proportion of healthy fruit produced is higher than the proportion of total fruit, whereas between October and March it is lower (Figure 3b). The October to March period therefore appears to be the least favourable for the production of healthy fruit, probably because it is more conducive to the development of diseases. This is a period of high total pod production, except for class B clones, which differ from the other clones in this respect and can therefore escape the disease. High-yielding clones chosen from classes A or B should therefore be more durable than other clones and should be favoured by farmers who want a long-lasting plantation. Although this needs to be confirmed by a study of detailed disease data, it seems that class A clones produce healthy pods during periods when the other classes are affected by disease and would therefore be resistant, whereas class B clones produce outside the main disease period, which does not imply that they are resistant. Each class therefore carries its own risks: resistance may be bypassed, and the period favourable to disease or the production period may shift with climate change. Growers may therefore choose to combine high-yielding clones selected from these two classes to mitigate the risks and spread out the production period or, on the contrary, prefer to concentrate production by choosing clones from a single class.
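The comparison between standardized healthy and total production described above can be illustrated with a short sketch. It is a minimal example in Python and not the analysis pipeline used in this study: the monthly counts are hypothetical values chosen only to show the calculation, and the function names `standardize` and `months_favouring_healthy` are illustrative.

```python
# Hypothetical monthly pod counts for one clone, July to June.
# These values are NOT data from this study; they only illustrate the calculation.
MONTHS = ["Jul", "Aug", "Sep", "Oct", "Nov", "Dec",
          "Jan", "Feb", "Mar", "Apr", "May", "Jun"]
total_pods = [10, 10, 20, 40, 40, 30, 20, 5, 5, 10, 5, 5]
healthy_pods = [8, 8, 14, 16, 14, 10, 8, 2, 2, 8, 5, 5]


def standardize(counts):
    """Express each monthly count as a proportion of the annual sum."""
    annual = sum(counts)
    if annual == 0:
        raise ValueError("no pods recorded, proportions are undefined")
    return [c / annual for c in counts]


def months_favouring_healthy(total, healthy, months=MONTHS):
    """Return the months whose share of healthy pods exceeds their share of total pods."""
    total_share = standardize(total)
    healthy_share = standardize(healthy)
    return [m for m, t, h in zip(months, total_share, healthy_share) if h > t]


if __name__ == "__main__":
    print(months_favouring_healthy(total_pods, healthy_pods))
```

With these hypothetical counts, the months returned are those in which a clone contributes proportionally more to healthy production than to total production, which is the contrast used above to identify the periods least favourable to healthy fruit.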

3.3.3 Links between pod dynamics and quantity of pods produced

Class A had a main production peak in October and November and an average total production of 435 to 840 pods over 18 years (Table 1). Healthy pod production was relatively high but inconsistent (from 174 to 657). Class A included the CATIE-R1 clone, one of the clones producing the greatest number of healthy pods. Class B had a main production peak in April and began to produce more healthy pods after the 6th year. Class B was highly variable in its production of healthy and total pods (63 to 619 and 83 to 708, respectively), with similar numbers of healthy and total pods produced. Class B included the CATIE-R4 and CATIE-R6 clones, two of the three clones producing high numbers of healthy pods. Class C contained the most clones (25) and had two production peaks. There were large differences between clones: total production ranged from 112 to 1188 pods and could be exceptionally high. Healthy pod production was average (31 to 463) and lower than total production. Class D had a relatively long annual production peak, and pod production tended to be concentrated in the first 8 years. Class D contained only four clones, which produced high numbers of pods (761 to 1071). However, these clones produced very few healthy pods (78 to 128, an average of only 12.2% healthy pods).

These dynamic classes are based on the proportion of production at any given time and are not homogeneous in terms of production. Clones with similar dynamics can have very different potential production (Table 1), meaning that production capacity and productive dynamics are independent traits. On the other hand, clones from the same class are relatively homogeneous in terms of the ratio of healthy to total pods produced. Class B clones produced almost exclusively healthy pods, which was not the case for class A. Class B, which may combine an escape phenomenon and resistance, therefore appears to be more resilient than class A. The genetic origin of the clones is highly diversified but could explain the behaviour of some clones. For instance, CATIE-R4, CATIE-R6, and other class B clones share common parents, the moniliasis-resistant clones UF-273 or PA-169, which are responsible for the high yield and resistance of these clones. Since CATIE-R1 shares the same mother (UF-273) but has a different father (CATIE-1000), the productive dynamics of CATIE-R4, CATIE-R6, and others appear to be inherited from the father PA-169, which displays the same behaviour.
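The distinction between production level and production dynamics can be made concrete with a minimal sketch. It assumes that a dynamic profile is simply each period's share of a clone's cumulative production; the two clones, their counts, and the helper names `profile` and `healthy_ratio` are hypothetical and serve only as an illustration.

```python
import math

# Hypothetical pod counts over four successive periods for two clones.
# These values are NOT data from this study.
clone_low = [2, 4, 10, 4]        # low-yielding clone, 20 pods in total
clone_high = [20, 40, 100, 40]   # high-yielding clone, 200 pods in total


def profile(counts):
    """Share of cumulative production falling in each period (sums to 1)."""
    cumulative = sum(counts)
    if cumulative == 0:
        raise ValueError("no pods recorded, profile is undefined")
    return [c / cumulative for c in counts]


def healthy_ratio(healthy, total):
    """Ratio of healthy pods to total pods for one clone."""
    if total == 0:
        raise ValueError("no pods recorded, ratio is undefined")
    return healthy / total


def same_dynamics(a, b, tol=1e-9):
    """True when two clones have the same standardized profile."""
    return all(math.isclose(x, y, abs_tol=tol) for x, y in zip(profile(a), profile(b)))


if __name__ == "__main__":
    # Tenfold difference in yield, identical standardized dynamics.
    print(sum(clone_low), sum(clone_high), same_dynamics(clone_low, clone_high))
```

Because the profile divides out the cumulative count, two clones can fall in the same dynamic class while differing widely in yield, which is why production level and healthy-to-total ratio have to be examined separately from the class assignment.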

4 Conclusion

We found significant correlations between production levels in successive years; these correlations are lower at the beginning of the production period.

The long-term temporal production dynamics of total pods appear to be relatively consistent across clones, indicating that, whatever the clone, cocoa trees have production potential over extended periods. Production began to fall after the fourteenth year of harvest, but the decline remained modest. Production in the 18th year remained high, and it would certainly be interesting to work on methods of boosting production after a certain age, such as pruning and fertilization. In contrast, the temporal production dynamics of healthy pods vary significantly among clones, highlighting genetics as a major factor in the sustainability of useful production. A minimum of 20 years of operation is therefore conceivable, depending on our ability to manage pest and disease constraints.

We have shown that intra-annual dynamics vary significantly among clones. Although the majority of clones have a classic distribution with two production peaks, there are three other types of production distribution throughout the year that vary in spread and timing. Intra-annual variability remains relatively unexplored, yet its study has revealed a small group of clones that appear to combine several characteristics, making them the most resilient.

Intra-annual variability is expected to have a significant impact on clonal selection. Classes A and B, which include the most productive clones in terms of healthy pods, have their primary production peak shifted, early in the season for class A and late in the season for class B. Consequently, producers may benefit from selecting the most productive clones from both classes to mitigate climatic and phytosanitary risks and to spread out harvests, which represent a significant labor cost. Since the two top varieties in class B (CATIE-R4 and CATIE-R6) are self-incompatible, it is recommended to cultivate a polyclone composed of clones that are sexually compatible with each other. Interestingly, the most productive clone from class A, CATIE-R1, is compatible with these two clones. The characteristics of these clones would need to be confirmed in other agro-environmental contexts.

In the future, it would be interesting to complement this study by jointly investigating the influence of disease, climatic variations, and competition among trees on cocoa production. Integrating spatial analyses with longitudinal data analyses could enhance the management of various environmental factors and facilitate more efficient selection in numerous perennial crops (De Faveri et al. 2015). Resilience now emerges as a crucial criterion to be incorporated into cocoa breeding programs to address these new challenges effectively.