Introduction

Reputable food engineering journals should generally reject conventional articles on the use of response surface methodology (RSM) to analyze the supercritical (SC) carbon dioxide (CO2) extraction of vegetable substrates or microalgae. Indeed, RSM provides authors with optimized conditions for the selected range of operating variables, but it fails to provide an understanding of the underlying mechanisms and lacks sufficient depth in engineering science. Sharif et al. [1] published a highly cited review article on the application of RSM to SC-CO2 extraction processes that we will use as a reference point to back up the claim that heads this paragraph.

To the best of our knowledge, del Valle and Aguilera [2] were the first to apply RSM to the analysis of SC-CO2 extraction processes, although this owed more to limitations of the experimental device than to careful planning. The authors performed the SC-CO2 extraction of mushroom oleoresin in a 1-gallon batch reactor that was not equipped with a high-pressure pump, and used the CO2 density (ρ) and system temperature (T) as the independent variables defining the conditions in the reactor. CO2 density was controlled by adjusting the amount of liquid CO2 loaded into the extraction vessel, which was fed from a high-pressure cylinder equipped with a dip tube while gaseous CO2 was continuously removed from the upper end of the reactor through a tube that projected a measured length into the upper portion of the reactor. The supercritical conditions in the reactor were achieved by heating, and attainment of the desired ρ-T conditions was confirmed by measuring the temperature and pressure (P) in the reactor during extraction and using PVT tables for CO2 to interpolate the corresponding values of ρ. Dried Suillus luteus L. mushrooms used as substrate were ground, size-classified (−35 +100 mesh), moistened to 130 g kg−1 water/substrate, and uniaxially compressed in a 53-mm ID cylindrical mold to produce disks of 24-mm height in an Instron Universal Testing Machine operating with a flat plunger at 6 mm/s. The third experimental variable was the specific volume of the substrate, which was adjusted by varying the amount of conditioned powder loaded into the compression cell. The compressed substrate disks were placed between flat plates and attached to the upper lid of the reaction vessel above the level of the liquid CO2 in all experiments. Following completion of extraction, after 6 h, the loaded supercritical phase was expanded to a second 1-gallon high-pressure reaction vessel filled with gaseous CO2 at room temperature that was slowly depressurized. Because of CO2 expansion, the solubilized mushroom oleoresin came out of solution and precipitated in both the extraction reactor and the separation reactor. Thus, the oleoresin yield (the response variable) was not determined from the extract recovered in the baskets placed at the bottom of the two reactors, but from the weight and moisture losses of the substrate disks. del Valle and Aguilera [2] used a three-factor fractional factorial design of experiments (DoE) consisting of 18 runs at 16 different experimental conditions.

About 15 years after this first study, the corresponding author was back in Pontificia Universidad Católica (UC) de Chile as an assistant professor, when the Laboratorio de Extracción de Materiales Biologicos (LEMaB) procured a Process Development Unit (PDU) from Thar Technologies (Pittsburgh, PA) to carry out research on SC-CO2 extraction processes from food materials. The first published work in this second stage was on the SC-CO2 extraction of rosehip seed oil using RSM [3], which will be reviewed in “Application of RSM on the Supercritical CO2 Extraction of Oilseeds” section. At that time, about 20 years ago, the WoS database reported only five other works on the application of RSM or DoEs to SC-CO2 extraction processes in the Food Science Technology field.

Since 2000, the application of RSM and DoE in the field of food engineering, particularly in the context of SC-CO2 extraction of vegetable substrates and microalgae, has gained significant attention. While RSM enables the determination of optimized conditions for a given range of operating variables, a debate persists regarding its limitations in providing a comprehensive understanding of the underlying mechanisms and engineering principles. In light of this, the primary objective of this review is to comprehensively discuss the utilization of RSM and DoE in SC-CO2 extraction, specifically focusing on the extraction of vegetable oils and associated minor lipids from seeds and related substrates.

To achieve this goal, the review is structured as follows. “RSM and its Application in Food Engineering” section provides a brief overview of RSM and DoE applications in the field of food engineering. “Application of RSM on the Supercritical CO2 Extraction of Oilseeds” section describes and analyzes selected studies that have employed RSM for the SC-CO2 extraction of oils and minor lipids from seeds and related substrates, highlighting their advantages and limitations, the importance of an appropriate selection of dependent and independent variables, and suitable experimental design strategies. “Effect of Process Variables on the Supercritical CO2 Extraction of Oil in Seeds” section discusses the key variables that influence the supercritical extraction of vegetable oils from seeds. “Supercritical CO2 Extraction of Oils and Carotenoids in Rosehip Seeds and Shells” section presents a state-of-the-art compilation, highlighting the opportunity to enhance the extraction of oils and carotenoids from rosehip seeds by including shells as part of the substrate. “Case Study” section presents experimental work and a statistical analysis of results for the coextraction of carotenoids and oil from rosehip shells and seeds. Finally, “Conclusions” section summarizes the main conclusions of this manuscript. By addressing these topics, this review aims to provide a critical and up-to-date perspective on the application of RSM and DoE in SC-CO2 extraction, while also serving as a practical reference for researchers and professionals seeking to optimize the extraction of oils and minor lipids from seeds and related substrates.

Literature Search

As indicated above, the following sections expand on the “Introduction” section, which justifies this review article from the standpoint of the experience of one of the authors. They briefly describe the application of RSM and DoE in the food engineering field, describe and discuss 89 works in the literature on the application of these methodologies to the SC-CO2 extraction of oil and minor lipids from seeds and related substrates, and present and discuss the main variables affecting the SC-CO2 extraction of oil from seeds in general, and the extraction of oil and carotenoids from rosehip seeds and shells in particular. As part of “Supercritical CO2 Extraction of Oils and Carotenoids in Rosehip Seeds and Shells” section, we will justify the case study we devised for an improved analysis of RSM and DoE in the food engineering field, which readers should consider when planning future experimental studies.

RSM and its Application in Food Engineering

DoE and RSM are widely used research methods that use statistical and model-fitting tools to predict one or more dependent variables (responses) as a function of several independent variables (factors). These methods should be used mainly in those cases where an analytical description of the phenomenon under study is not yet available, and it is presumed that the change in the factors produces relatively mild alterations in the response(s) within an experimental region. The resulting models are categorized as empirical, or result-driven models, due to their lack of underlying theory and their limited ability to predict responses outside of the range of data that was used to fit the model.

DoEs were adapted from agricultural research to quality control in industry in the 1920s, as reviewed by Bisgaard [4]. More recent authoritative reviews on the subject were published by Tanco et al. [5] and by de Oliveira et al. [6], the latter focusing more specifically on the use of central composite designs (CCDs), of the type that will be used in our study, and on data analysis using RSM. The reader interested in DoEs and RSM may consult the books or monographs of Haaland [7], Box et al. [8], Box and Draper [9], Gacula et al. [10], Myers et al. [11], Montgomery [12], Jones and Montgomery [13], and NIST/SEMATECH [14]. Response surface methodology has a major advantage over the one-factor-at-a-time approach in that it allows the evaluation of the effect of multiple factors (or independent variables) and their interactions on the response(s) [or dependent variable(s)] with a reduced number of trials. Interaction between independent variables is manifested when one factor does not produce the same effect on the response at different levels of another factor.

RSM studies allow (1) understanding the topography of the response surface (local maximum, local minimum, ridge lines); and (2) finding the region where the optimal response occurs. The goal is to move rapidly and efficiently along a path to get to a maximum or a minimum response so that the response is optimized. The methodology involves three steps [11]: (1) experimental design in which the independent variables and their experimental levels are set using well-established statistical experimental designs such as CCD; (2) response surface modeling through regression analysis; and (3) process optimization using the response surface models.
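Step (2) above, response surface modeling through regression, can be sketched as an ordinary least-squares fit of a full second-order polynomial. The snippet below is a minimal illustration, not the procedure of any of the cited works: the function names, the two-factor three-level design in coded units, and the synthetic coefficients are all hypothetical.

```python
import numpy as np

def quadratic_design_matrix(X):
    """Columns: intercept, linear terms, two-factor interactions, squared terms."""
    X = np.asarray(X, dtype=float)
    n_runs, k = X.shape
    cols = [np.ones(n_runs)]
    cols += [X[:, i] for i in range(k)]
    cols += [X[:, i] * X[:, j] for i in range(k) for j in range(i + 1, k)]
    cols += [X[:, i] ** 2 for i in range(k)]
    return np.column_stack(cols)

def fit_second_order(X, y):
    """Least-squares estimates of the second-order model coefficients."""
    A = quadratic_design_matrix(X)
    coef, *_ = np.linalg.lstsq(A, np.asarray(y, dtype=float), rcond=None)
    return coef

# Hypothetical two-factor design in coded units (3 x 3 grid of levels).
levels = [-1.0, 0.0, 1.0]
X_demo = np.array([[a, b] for a in levels for b in levels])

# Synthetic, noise-free response generated from assumed coefficients
# (b0, b1, b2, b12, b11, b22); real data would include experimental error.
true_coef = np.array([10.0, 1.5, -0.8, 0.5, -2.0, -1.0])
y_demo = quadratic_design_matrix(X_demo) @ true_coef

coef_demo = fit_second_order(X_demo, y_demo)
print(np.round(coef_demo, 3))
```

With noise-free synthetic data the fit simply recovers the assumed coefficients; with experimental data the same call would be followed by an analysis of variance and lack-of-fit tests.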

An important aspect of RSM is the DoE [9]. The objective of DoEs is the selection of experimental conditions where the response should be evaluated. Most of the criteria for optimal DoEs are associated with the mathematical model of the process. Generally, these mathematical models are polynomials with an unknown structure, so that the corresponding experiments are designed only for every particular problem. The choice of the DoEs can have a large influence on the accuracy of the approximation and the cost of constructing the response surface.

Table 1 summarizes the number of entries on DoE or RSM in the database of WoS under the category of Food Science Technology. About one-third of the contributions relate to extraction, and most of the 2240 entries on this subject relate in turn to emerging technologies such as those on subcritical or hot-pressurized water (29% of total), ultrasound-assisted (24%), microwave-assisted (16%), or supercritical CO2 extraction (9%) processes. Extraction is followed by formulation, fermentation, reaction, and storage with about 10% each. Then, sensory analysis, drying, and separation account for 7–8% each. Finally, extrusion, packaging, and thermal processing account for less than 5% each. Consistent with our observations in Table 1, Yolmeh and Jafari [15], in reviewing the applications of RSM in the food industry, highlight extraction, drying, and production of microbial enzymes and some other metabolites (all supported by extensive tabulations of practical examples), besides formulation processes, microencapsulation and controlled-release processes, optimization of enzymatic hydrolysis and clarification, and blanching (a thermal process).

Table 1 An overview of topics covered by articles indexed to the Web of Science in the “Food Science Technology” category when “Response Surface Methodology” OR “Design of Experiments” was queried in September 2021

In “Application of RSM on the Supercritical CO2 Extraction of Oilseeds” section, we analyze critically the application of DoE and RSM to the SC-CO2 extraction of oil from seeds. This subject has been implicitly analyzed in reviews focussing on experimental designs on SCFE [1], the supercritical fluid extraction (SCFE) of vegetable matrices [16], or SCFE of seed oils [17].

Application of RSM on the Supercritical CO2 Extraction of Oilseeds

To ensure a comprehensive and focused review, we employed a specific search equation on the Web of Science (Date 08/06/2021). The search equation used was “supercritical CO2 extraction “ (searched in all fields) combined with “Response Surface Methodology” (searched in all fields), “oil” (searched in all fields), “seed” (searched in all fields), “Proceeding Paper” or “Book Chapters” or “Review Article” (document types excluded) and “Food Science Technology” or “Engineering Chemical” (Web of Science categories). This search equation allowed us to narrow down our selection criteria and identify relevant articles that addressed the specific aspects of our study. By combining these keywords and filters, we aimed to retrieve articles that focused on supercritical CO2 extraction, utilizing response surface methodology, and targeting applications in food science, technology, and chemical engineering with a specific emphasis on seed-related research. Through this rigorous search strategy, we obtained a collection of 89 scientific articles that met our predefined criteria. Appendix A in Supplementary Material lists all plant-derived substrates extracted in the selected works. Overall, 90 substrates, mostly seeds (84%), followed by byproducts (11%), and whole fruits (5%), were studied. The most popular was flaxseed (Linum usitatissimum), which appeared in six contributions.

From the chosen articles, we extracted valuable insights into the authors’ preferences concerning the factors and observables they prioritize when utilizing response surface methodology. Additionally, we captured information about their experimental strategies, considering the diverse options available for organizing experiments within the framework of response surface methodology. Table 2 summarizes the most studied factors or independent variables of the selected works. Most designs have three independent variables (60%) followed by four (21%), five (12%), and two (7%). Regardless of the number of factors, all the RSM studies reported in Table 2 used pressure (P) as an independent variable, and all but two also used temperature (T). P and T are by far the thermodynamic variables preferred by researchers for exploring the region where the solvent or solvent mixture is a supercritical fluid. The convenience of replacing one of these thermodynamic variables by another, like density, that might be more suitable will be discussed later. Thirty-nine studies considered the CO2 flow rate (Q) and 36 the dynamic extraction time (td) as independent variables, while 11 considered both. With 11 occurrences, the fifth most used factor was particle size (dp), a variable related to the inner resistance to mass transfer. Seventeen percent of the works considered the modifier concentration (MC) as a factor, in that the modifier is generally added to increase the yield of some minor polar and/or high-molecular-weight substrate component. Finally, the specific CO2 consumption (q) and static extraction time (te) were the least considered factors with 7 and 4 occurrences.

Table 2 Literature search on the extraction of vegetable oils using supercritical CO2

Six works of the 89 that we analyzed in our study are not listed in Table 2. These works are those that used one or more factors with only one occurrence. Jiang and Niu [58], for example, in addition to some of the factors mentioned in Table 2, included the substrate moisture in their preliminary trials; Wang et al. [99] used the amount of substrate loaded in the extractor (Ms); and Bernardo-Gil et al. [100] conveniently used superficial CO2 velocity (U), an intensive variable that does not depend on the geometry or size of the extraction vessel. These less frequently studied variables encompass phenomena associated with substrate composition, bed density, and flow effects from a scalability perspective. Although theoretical models show that these phenomena matter for scalability, which makes it important to assess the impact of their variations on model performance, authors have shown less interest in investigating or understanding their effects. This disparity could be attributed to the predominant focus, observed in most papers, on exploring new sources of vegetable oils, rather than optimizing the extraction process for promising well-studied vegetable oils through a comprehensive understanding of the underlying transfer phenomena. Based on our experience, few published works on RSM employ factors that could contribute novel experimental information to the understanding of phenomena that are not yet sufficiently well described for the SC-CO2 extraction of vegetable oils. Mixing speed and ultrasonic power [101], substrate pretreatment [102], and dispersion of the substrate in an inert matrix [103] may contribute to the understanding of SC-CO2 extraction of oil from seeds but are rarely studied in RSM and DoE.
Appendix A in Supplementary Material also details the selected responses: 89% of the works evaluated oil yield as a response, 34% evaluated the yield of a minor compound in the oil, 12% evaluated percent recovery, and 11% evaluated antioxidant activity. A response variable that was found with few occurrences (three times) in the review, but that may be very useful from the standpoint of describing the rate of the extraction process, was the slope of the first part of the cumulative extraction curve. The idea was used by Bernardo-Gil et al. [100], and then by Comin et al. [102]. Finally, Bojanić et al. [63] fitted semi-empirical models to extraction curves to obtain the initial slope values from best-fit parameters values. (This will be illustrated in “Effect of Process Variables on the Supercritical CO2 Extraction of Oil in Seeds” section for the model of Naik et al. [104]).

Appendix B in Supplementary Material shows experimental designs used by researchers in their RSM applications. Among them, the central composite design (CCD) (35 occurrences) and the Box–Behnken design (BBD) (29 occurrences) stand out, both with characteristics of rotatability (or near-rotatability) appropriate for the application of RSM [11]. Other less popular experimental designs include the factorial (10 occurrences), face-centered (4), factorial multilevel (1), and fractional factorial (1) ones. More than half of the consulted manuscripts (52) studied 3 experimental factors, 20 studied 4, 11 studied 5, and only 6 papers studied 2 experimental factors. The advantage of the CCD lies in the sequential nature of its implementation, which allows experiments to be added to the factorial design (FD) when a lack of fit is observed, i.e., it is possible to divide the design into blocks that do not affect the estimates of the coefficients in the second-order model (orthogonal blocking) [14]. On the other hand, the advantage of BBDs is that they require fewer treatment combinations than CCDs in cases involving 3 or 4 factors. However, BBDs have a limited capacity for orthogonal blocking compared to CCDs, and their predictive capability is poor in the corners of the experimental region [14].

Appendix B in Supplementary Material groups the works in Appendix A into two categories according to their implementation: those that used optimization designs or factorial designs, and those that screened factors prior to optimization experiments. The prescreening approach is convenient when studying many factors because it allows determining the significance of the effect of the factors on the response(s) before considering factor optimization. Examples of applications of the screening plus optimization approach include the one-factor-at-a-time (OFT) plus BBD with six occurrences, OFT + CCD with four, and fractional factorial design + CCD and factorial design + OFT with one each. This sequential approach may allow reducing the number of experiments without losing statistically relevant information. However, despite its advantages, the prescreening approach was used about four times less often (6 occurrences) than complete designs (25 occurrences) when studying the effect of 4 or 5 factors. Solving a 5-factor BBD or CCD involves performing 46 or 52 experiments, respectively. When performing prescreening experiments prior to optimization experiments, this number of experiments could be considerably reduced. Such could have been the case in the study conducted by Suryawanshi and Mohanty [23], who solved a five-factor BBD (T, P, dp, Q, and MC) for the SC-CO2 extraction of Argemone mexicana and found that T and Q did not have a significant effect on extraction yield. If the authors had employed the prescreening approach using, for example, a fractional factorial design of resolution III, i.e., a 2^(5−2) FFD, they would have found the lack of significance of T and Q with only 8 experiments, which correspond to the factorial part of a three-factor CCD. A CCD of these characteristics would have required 12 additional experimental data points corresponding to the 6 star points and 6 center points. This two-stage approach (20 experiments instead of 46) requires less than half of the time and consumables needed to achieve the same goals.
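The run counts quoted in this paragraph can be reproduced with a short tally. This is a sketch under stated assumptions: the helper names are hypothetical, the numbers of center points (6 for the BBD, 10 for the full CCD) are inferred so that the totals match the 46 and 52 runs quoted above, and the BBD formula is only valid for 3 to 5 factors.

```python
def ccd_runs(k, n_center, fraction=0):
    """Runs in a CCD: 2^(k - fraction) factorial + 2k star + center points."""
    return 2 ** (k - fraction) + 2 * k + n_center

def bbd_runs(k, n_center):
    """Runs in a Box-Behnken design; 2k(k - 1) edge points holds for k = 3, 4, 5 only."""
    if k not in (3, 4, 5):
        raise ValueError("edge-point formula used here is valid for 3 to 5 factors only")
    return 2 * k * (k - 1) + n_center

# Complete five-factor designs (center-point counts are assumptions).
full_bbd = bbd_runs(5, n_center=6)
full_ccd = ccd_runs(5, n_center=10)

# Two-stage alternative: 2^(5-2) screening runs that double as the factorial
# part of a three-factor CCD, then 6 star points and 6 center points.
screening = 2 ** (5 - 2)
two_stage = screening + 2 * 3 + 6

print(full_bbd, full_ccd, two_stage, round(two_stage / full_bbd, 2))
```

The ratio printed last is the fraction of the five-factor BBD effort that the two-stage route would need under these assumptions.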

The vast majority of the analyzed works estimated the optimal extraction conditions from a model limited to the studied experimental region. As detailed in Appendix A in Supplementary Material, 55% of these optimal conditions coincide with a corner or an edge of the experimental region. Thus, if the purpose of the RSM model is to identify an optimal condition, approximately half of the studies failed because they did not properly define the experimental region so that it actually enclosed the optimal. For the remaining 45% of the cases, the predicted optimum is within the experimental region, but only 45% of these predictions were confirmed by the authors by performing one or more experiments and comparing predicted and experimental values.
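Whether a fitted second-order model places its optimum inside the experimental region can be checked from the stationary point of the surface. The sketch below assumes the usual matrix form y = b0 + x'b + x'Bx in coded units, where B holds the pure quadratic coefficients on the diagonal and half the interaction coefficients off the diagonal; the coefficient values and function names are hypothetical.

```python
import numpy as np

def stationary_point(b, B):
    """Stationary point x_s = -0.5 * B^-1 b and its nature from the eigenvalues of B."""
    b = np.asarray(b, dtype=float)
    B = np.asarray(B, dtype=float)
    x_s = -0.5 * np.linalg.solve(B, b)
    eig = np.linalg.eigvalsh(B)  # B is symmetric by construction
    if np.all(eig < 0):
        kind = "maximum"
    elif np.all(eig > 0):
        kind = "minimum"
    else:
        kind = "saddle or ridge"
    return x_s, kind

def inside_region(x, half_width=1.0):
    """True when every coded coordinate lies strictly inside (-half_width, half_width)."""
    return bool(np.all(np.abs(x) < half_width))

B_demo = np.array([[-1.0, 0.0],
                   [0.0, -2.0]])  # assumed quadratic coefficients

x_in, kind_in = stationary_point([1.0, 1.0], B_demo)    # mild linear effects
x_out, kind_out = stationary_point([3.0, 1.0], B_demo)  # strong effect of factor 1

print(x_in, kind_in, inside_region(x_in))
print(x_out, kind_out, inside_region(x_out))
```

In the second case the maximum falls outside the coded cube, so a constrained optimizer would report a corner or edge of the region, which is the situation described above for about half of the reviewed studies.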

As also noted in Appendix A in Supplementary Material, in the majority of studies, first-order or second-order models were fitted following recommendations to implement RSM. However, we also found studies that used artificial neural networks (ANNs) in addition to second-order models. According to Ahangari et al. [17], researchers who used ANN found better predictions of optimal conditions, such is the case for the works of Chouaibi et al. [22], Amani et al. [62], Azmir et al. [69], and Zahedi and Azarpour [97]. From a practical standpoint, when there are resource limitations, it may be convenient to use ANN to find the best extraction conditions. However, the opacity of the internal mechanisms of ANNs precludes their contribution to explanations of the effects of the factors on the response(s). When the network provides a solution, it does not lead to explaining why and how the different factors contributed to the answer. This moves us away from understanding the phenomenon, which is usually sought when considering adopting DoEs, regarding building or confirming hypotheses through experiments.

From our review, only Sodeifian et al. [30], in addition to first-order and second-order models, tested the third-order model. Apparently, the authors did so without a leading reason and without adding more experiments to their design, suggesting an overfitting of the experimental data. The third-order model can improve the goodness of fit criteria of the model as in the study by Sodeifian et al. [30], but at the cost of an increased unreliability of predictions of responses outside conditions of training data.

Finally, we observed that it is not common to find studies that include validation tests for the models. Only 16% of the studies used cross-validation strategies such as leave-one-out (LOO), where a single data point at a time is excluded prior to model fitting, and a comparison is made between the prediction of this “limited model” and the actual “excluded” experimental observation. The validation and the R2-predictive or Q2 statistics inform about the accuracy of predictions of the response surface models by estimating the effect on predictions of the elimination of experimental data points used in model training. Incorporating validation tests enhances confidence in the optimization process by providing a more accurate evaluation of the model’s predictive capabilities and ensuring reliable outcomes.
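The leave-one-out procedure and the resulting Q2 statistic can be sketched as follows. This is an illustrative implementation, assuming the common definition Q2 = 1 − PRESS/SStotal, where PRESS is the sum of squared LOO prediction errors; the one-factor quadratic model and the response values are made up for the example.

```python
import numpy as np

def r2_and_q2(A, y):
    """Return (R2, Q2) for a linear-in-parameters model with design matrix A."""
    A = np.asarray(A, dtype=float)
    y = np.asarray(y, dtype=float)
    ss_tot = np.sum((y - y.mean()) ** 2)

    # Ordinary fit with all data points.
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    sse = np.sum((y - A @ coef) ** 2)

    # Leave-one-out: refit without point i, then predict the excluded point.
    press = 0.0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        coef_i, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        press += (y[i] - A[i] @ coef_i) ** 2

    return 1.0 - sse / ss_tot, 1.0 - press / ss_tot

# Hypothetical one-factor example in coded units with a quadratic model.
x_demo = np.linspace(-1.0, 1.0, 7)
A_demo = np.column_stack([np.ones_like(x_demo), x_demo, x_demo ** 2])
y_demo = np.array([1.1, 2.9, 4.2, 4.8, 4.1, 3.2, 0.9])  # made-up responses

r2_demo, q2_demo = r2_and_q2(A_demo, y_demo)
print(round(r2_demo, 3), round(q2_demo, 3))
```

Q2 never exceeds R2 for a least-squares fit, and a large gap between the two suggests that the model depends heavily on individual data points.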

The predictive power of a fitted response surface model at various points within the experimental region depends on the design being used. A second-order model can be constructed efficiently using CCDs [12]. CCDs are factorial (2^n) designs augmented by additional center and axial experimental data points that allow estimation of the tuning parameters of a second-order model. It may be desirable to have designs that predict uniformly at all constant distances from their center points. The variances and covariances of the best-fit coefficients of the model for such a rotatable design remain unchanged [10]. Because the reason for using response surface designs is generally to locate an unknown optimum, it makes sense to use a rotatable design that provides equal precision in the estimation of the shape of the surface in all directions [10]. A CCD becomes rotatable by choosing an appropriate axial distance from the center to any star point (α times the coordinate of a factorial point in the same direction). The α-value that makes a design rotatable depends on the number of independent variables (n): α = 2^(n/4) [12].
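The construction just described can be sketched in a few lines: factorial points at ±1, star points at ±α on each axis with α = 2^(n/4), and replicated center points, all in coded units. The function names and the choice of six center points in the example are illustrative assumptions.

```python
import itertools
import numpy as np

def rotatable_alpha(n):
    """Axial distance that makes a full-factorial CCD rotatable: (2^n)^(1/4)."""
    return 2.0 ** (n / 4.0)

def ccd_points(n, n_center):
    """Coded coordinates of a rotatable CCD with n factors."""
    alpha = rotatable_alpha(n)
    factorial = np.array(list(itertools.product([-1.0, 1.0], repeat=n)))
    axial = np.vstack([sign * alpha * np.eye(n)[i]
                       for i in range(n) for sign in (-1.0, 1.0)])
    center = np.zeros((n_center, n))
    return np.vstack([factorial, axial, center])

design_demo = ccd_points(4, n_center=6)  # 16 factorial + 8 star + 6 center runs
print(rotatable_alpha(4), design_demo.shape)
```

For n = 4 the axial distance is exactly 2, so each star point lies twice as far from the center as the faces of the factorial cube.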

The CCD can be applied sequentially. To form a CCD, we can start with a 2^n factorial design [10] to fit a first-order model. If the ANOVA for the first-order model shows a significant lack of fit, axial points can be added to allow the inclusion of quadratic terms in the model, as well as supplemental center points to complete a CCD [11]. The number of center points and the axial distance α are the two parameters we have to fix in a CCD. The center data points provide information about the experimental variance and curvature of the surface, whereas the axial data points allow estimating the coefficients of the second-order terms. A second-order model can describe surfaces that take many shapes. Therefore, the response surface model can represent a maximum, minimum, ridge, or saddle point [11].

When a second-order model is assembled sequentially from a first-order model, it is convenient to consider the formation of blocks to reduce the variability that arises from “uncontrollable nuisance factors” [12]. The CCD can be easily carried out using an orthogonal blocking arrangement that provides block-independent coefficients of the response surface model [10]. In the absence of orthogonal experimental blocks, the matrix X'X will have some non-zero off-diagonal elements following least squares estimation. Consequently, it is impossible to calculate the sum of squares due to quadratic effects without first estimating the linear effects in the model. This impedes the coupling of experimental blocks to perform calculations. In addition to being orthogonal, the added experimental blocks should be rotatable. However, because it is not always possible to obtain both rotatability and orthogonality, α is usually chosen to guarantee exact orthogonality while getting as close as possible to rotatability. For instance, if n = 4, with 16 cubic points, 8 axial points, and 6 replicates of the center point, we can use an α-value of 2 to satisfy both rotatability and orthogonal blocking [10]. If the number of factors is large enough, the factorial part can be divided into two or more blocks. The axial component must form a single block. Myers et al. [11] tabulated blocking arrangements for orthogonal and near-rotatable composite designs [10].

Effect of Process Variables on the Supercritical CO2 Extraction of Oil in Seeds

Ahangari et al. [17] recently reviewed current trends in the SC-CO2 extraction of seed oils, including the effect of several parameters on process rate and yield. The SC-CO2 extraction process can be visually analyzed by constructing a plot for the extraction yield of oil, or a minor lipid that is coextracted with the oil versus the extraction time or specific CO2 consumption, keeping constant several independent variables such as the extraction vessel and vessel load, the substrate pretreatment and particle size, the CO2 conditions (extraction temperature and pressure), and the CO2 flow rate passing through the extraction vessel. The extraction experiment is extended for a given extraction time up to fully removing the oil from the seed.

A cumulative extraction curve of oil extraction yield (Y) versus specific CO2 consumption (q) is fairly uneventful [105] (Fig. 1). The curve has an initial straight portion through the origin, whose slope is related to the operational solubility of the oil in SC-CO2 at the selected extraction conditions (Cop). Following the initial straight portion, the slope of the cumulative extraction curve decreases until the curve reaches a horizontal asymptote corresponding to the oil content of the substrate (Y∞). The transition between the two regions depends on the mass transfer mechanisms and mass transfer parameters, which depend in turn on the physical properties and superficial velocity (U) of the CO2, and the pretreatment and particle size and shape of the substrate. For a given substrate, changing extraction conditions affects mainly the operational solubility, but not the total extractable oil, as noted in Fig. 1B. del Valle et al. [106] noted that the operational solubility in the SC-CO2 extraction of oilseeds coincided with the actual thermodynamic solubility of vegetable oils (triacylglycerols of unbranched fatty acids) in SC-CO2 at process conditions (Csat), and did not change depending on the substrate. This is so because the thermodynamic solubility depends more on the carbon chain length of the substituent fatty acids (generally 18, in most vegetable oils) than on their number of unsaturations.

Fig. 1 Cumulative extraction curve of vegetable oil from a pretreated seed. A Actual curve, initial asymptote defining the operational solubility of the oil in supercritical CO2 at process conditions, and final, horizontal asymptote defining the extractable oil content of the pretreated seed. The two asymptotes intercept for a specific CO2 consumption of q*. B Cumulative extraction curve as a function of specific CO2 consumption (q) and operational solubility (Csat, g kg−1 oil/CO2) of the oil in supercritical CO2: high (31.25), intermediate (12.50) and low (5.0)

Figure 1 is built using the empirical equation of Naik et al. [104] that is written in Eq. (1) as a function of Csat and Y∞:

$$Y=\frac{{Y}_{\infty }\,q}{\frac{{Y}_{\infty }}{{C}_{\text{sat}}}+q}\left(=\frac{{Y}_{\infty }\,q}{{q}^{*}+q}\right)$$
(1)

The intersection between the two straight asymptotes (Y = Csat q, initially; and Y = Y∞, at the end) occurs for a specific CO2 consumption (q*) that is given by Eq. (2):

$${q}^{*}=\frac{{Y}_{\infty }}{{C}_{\text{sat}}}$$
(2)

The percent discrepancy between Y given by Eq. (1) and the horizontal asymptote (Y = Y∞, for q > q*) for n-times q* can be computed using Eq. (3a):

$${\% \;\text{Error}\,}\left(n\,{q}^{*}\right)=100\left(\frac{1}{{Y}_{\infty }}\right)\left[{Y}_{\infty }-\frac{{Y}_{\infty }\left(n{q}^{*}\right)}{{q}^{*}+\left(n{q}^{*}\right)}\right]\left(=\frac{100}{1+n}\right),$$
(3a)

whereas the percent discrepancy of Y given by Eq. (1) and the straight line by the origin (Y = Csat q, for q < q*) for the n-th fraction of q* can be computed using Eq. (3b):

$$\begin{aligned}{\% \;\text{Error} }\,\left(\frac{{q}^{*}}{n}\right)=&\;100\left(\frac{1}{{C}_{\text{sat}}{q}^{*}/n}\right)\\&\left[{C}_{\text{sat}}\left(\frac{{Y}_{\infty }/{C}_{\text{sat}}}{n}\right)-\frac{{Y}_{\infty }\left({q}^{*}/n\right)}{{q}^{*}+\left({q}^{*}/n\right)}\right]\\&\left(=\frac{100}{1+n}\right),\end{aligned}$$
(3b)

where Csat q* = Y∞ in Eq. (3b). Equations (3a) and (3b) indicate that for n = 99, the percent errors for q*/99 and 99 q* are both equal to 100/(99 + 1), or 1%. Correspondingly, the percent errors for q*/3 and 3 q* both equal 100/4 = 25%. The percent error reaches a maximal value of 50% [100/(1 + 1)] at q* (n = 1). The segmented straight lines in Fig. 1A span the range from q*/3 to 3 q*. They thus bound the specific CO2 consumption levels where assuming either that the solvent capacity of the CO2 stream is fully employed (for q < q*) or that the substrate is fully depleted (for q > q*) overpredicts the yield given by Eq. (1) by a percentage that increases from 25 to 50% between q*/3 and q*, and decreases from 50 to 25% between q* and 3 q*.
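The arithmetic of Eqs. (1) to (3b) can be checked numerically. The following is a minimal Python sketch; the function names and the value of Y∞ (80 g kg−1 oil/substrate) are assumptions chosen for illustration, whereas Csat = 12.5 g kg−1 oil/CO2 is the intermediate level quoted for Fig. 1B.

```python
def yield_curve(q, y_inf, c_sat):
    """Eq. (1): cumulative yield Y at specific CO2 consumption q."""
    q_star = y_inf / c_sat  # Eq. (2): intersection of the two asymptotes
    return y_inf * q / (q_star + q)


def pct_error_plateau(n, y_inf, c_sat):
    """Eq. (3a): percent gap between the plateau Y = Y_inf and Eq. (1) at q = n q*."""
    q = n * y_inf / c_sat
    return 100.0 * (y_inf - yield_curve(q, y_inf, c_sat)) / y_inf


def pct_error_initial(n, y_inf, c_sat):
    """Eq. (3b): percent gap between the line Y = Csat q and Eq. (1) at q = q*/n."""
    q = (y_inf / c_sat) / n
    return 100.0 * (c_sat * q - yield_curve(q, y_inf, c_sat)) / (c_sat * q)


Y_INF = 80.0   # g/kg oil/substrate (assumed, illustrative only)
C_SAT = 12.5   # g/kg oil/CO2 (intermediate level in Fig. 1B)

for n in (1, 3, 99):
    # Both columns should equal 100 / (1 + n)
    print(n, pct_error_plateau(n, Y_INF, C_SAT), pct_error_initial(n, Y_INF, C_SAT))
```

Both discrepancies collapse to 100/(1 + n) regardless of the values chosen for Y∞ and Csat, which is why the 25% and 50% bounds apply to any curve described by Eq. (1).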

Ahangari et al. [17] list, among the parameters critically affecting the extraction of seed oils in a packed bed using neat CO2, the temperature (T) and pressure (P) conditions in the extraction vessel, the particle size of the substrate (dp), and the CO2 flow rate (Q) during the dynamic extraction time. Because Q is an extensive variable, it is inadequate to describe the SC-CO2 extraction process (its effect depends on the size of the system). More appropriate intensive variables related to Q include the ratio of CO2 mass flow to substrate loaded in the extraction vessel (Q/Ms, or the weight of CO2 per unit weight of the substrate and per unit time, which is inversely proportional to the residence time of CO2 in the extraction vessel) and the superficial CO2 velocity (U, or the velocity the CO2 stream would reach within the extraction vessel if it were empty, which will be defined in “Variations on Carotenoid Content in the Oil with Process Conditions” section). “Implication of Results” section will describe the effect of the system conditions (T, P) and extraction conditions (substrate pretreatment, dp, U) on the shape of the cumulative extraction curve (Fig. 1). Figure 1 clearly indicates that, for any given substrate and extraction conditions (T, P, dp, U), oil extraction yield increases as the specific CO2 consumption (q) increases, where q increases proportionally to the dynamic extraction time (td; q = Q td/Ms).

Besides the temperature and pressure conditions of the CO2, the substrate and its pretreatment largely affect the rate and yield of the SC-CO2 extraction process. There are two main strategies to fracture the main inner barrier to mass transfer in vegetable substrates, namely the intact walls of the cells containing the target solute: the rupture of surface cells by impact, abrasion, or cutting mechanisms, and the fracture of inner cells by high-shear mechanisms. The effectiveness of the fracture of superficial cells using impact mills, abrasive mills, or cutters benefits from a reduction in particle size that increases the surface-to-volume ratio of ground particles, that is, the ratio of ruptured surface cells to intact inner cells in the pretreated vegetable substrate [17, 107]. High-shear pretreatments for vegetable substrates include flaking, prepressing, extrusion, and pelletization [108], which do not rely on a reduction in particle size to increase the fraction of ruptured cells. A high-shear pretreatment ruptures cell walls and frees solutes from inner cells, on one hand, and restructures the substrate into a high-density matrix with interconnected pores, on the other. Freeing the oil and minor lipids from the oil-containing cells without a significant reduction in particle size constitutes a significant advantage of high-shear pretreatments over particle-size-reduction pretreatments, because fines are difficult to handle in the packed beds required in batch SC-CO2 extraction processes for solid substrates. Indeed, fines tend to aggregate and cause solvent channeling in the packed bed extraction vessel, which diminishes extraction efficiency [109,110,111], and to clog the filters used to separate suspended particles from the extract-loaded solvent phase, thus delaying industrial processing [112]. 
An added advantage of high-shear pretreatments over particle-size-reduction pretreatments is that they increase the bulk density of the substrate in the packed bed, so that the extraction vessel load can increase. Densification of the substrate increases the volumetric productivity of the process (the weight of extract per unit volume of the extraction vessel and per unit time) even if such densification restricts inner mass transfer during extraction [113, 114]. In an extruder, moistened particles of ground material are forced into a spiraling and contracting channel by a screw that turns inside a heated barrel. Under the influence of lubricant water and high temperature, plasticized material emerges from the extruder through the constricting opening placed at the end of the barrel (die), and water vapor flashes to an extent that depends on the moisture content and final temperature of the material [115]. Gardner [116] reported an increase in bulk density of hop cones from 150–250 kg/m3 in ground, hammer-milled hops to 400 kg/m3 in extruded hops, as well as an increase in α-acid recovery from 75–80% for the ground material to 95% for the extruded material, which was claimed to be due to improved, shear-induced rupture of the cell walls of hop cone cells. Uquiche et al. [114] densified Jalapeño pepper using an extruder prior to SC-CO2 extraction of capsaicin-rich oleoresins. Other examples of the use of extrusion as a pretreatment include the SC-CO2 extraction of dried red pepper flakes [117, 118], Nannochloropsis gaditana microalgae [119], and marigold flower petals [120].

Supercritical CO2 Extraction of Oils and Carotenoids in Rosehip Seeds and Shells

Winther et al. [121] reviewed the chemical makeup and biological activity of rosehips. The pseudofruits are constituted of a red flesh pulp or hypanthium enclosing several thin-membrane-surrounded seeds or individual achenes (the actual botanical fruits). Following drying, the two parts of the rosehip pseudofruit give rise to shells and seeds, respectively. Rosehip seeds are rich in essential polyunsaturated fatty acids such as linoleic and α-linolenic, whereas the rosehip shells are rich in flavonoids and carotenoids. Particularly, carotenoids are important minor lipid compounds because of their function as pro-vitamin A, antioxidant, and coloring agent. However, the solubility of carotenoids in SC-CO2 is limited, which decreases their extraction yield from plant substrates.

Dąbrowska et al. [122] recently reviewed the extraction processes for rosehip seeds, with supercritical fluid extraction processes being prominently featured. Table 3 summarizes the conditions of the SC-CO2 extraction of oil from seeds. The oil of different rosehip species has been extracted using SC-CO2, including Rosa aff. rubiginosa in the cases of del Valle et al. [3, 108, 126], Eggers et al. [111], and Mabe et al. [123], and Rosa canina L. in the cases of Szentmihályi et al. [125], Machmudah et al. [127, 131], Salgin et al. [128], Jahongir et al. [129], Illés et al. [130], and Tozzi et al. [132]. SC-CO2 extraction has been carried out both at laboratory scale [3, 108, 126, 128, 131, 132] and at pilot scale [3, 111, 126, 129]. Table 3 also includes a contribution on the extraction of oils from whole fruits, and a couple of contributions on the SC-CO2 extraction of rosehip shell carotenoids, including the one by Tozzi et al. [132], although in that work ethanol was added as a cosolvent to increase carotenoid solubility and thus reduce CO2 requirements.

Table 3 Works on supercritical CO2 extraction of oil from rosehip (Rosa aff. rubiginosa or Rosa canina L) seeds and shells available in the open literature

There have been several works showing a positive effect of vegetable oil as a continuous cosolvent for the SC-CO2 extraction of lycopene from tomato products [133,134,135], β-carotene from carrots [136], lutein from marigold flower petals [137, 138], and astaxanthin from shrimp shells [139]. Besides using vegetable oil as a continuous cosolvent, which can be added using a cosolvent pump, it is also possible to carry out extraction in a mixed substrate-packed bed containing a vegetable oil source, in addition to the carotenoid source, as done by adding hazelnut to help extract tomato lycopene [140], avocado for extracting either tomato lycopene [141] or red pepper capsanthin [142], or roasted coffee for extracting brown seaweed fucoxanthin [143]. It is clearly advantageous to use a single substrate to coextract lipids and carotenoids, as done by Machmudah et al. [144] to extract tomato flesh byproducts using tomato seeds as the cosubstrate, or by Ndayishimiye and Chun [145] from Yuzu ichandrin (a citrus fruit) peels using Y. inchandrin seeds as the cosubstrate. Consequently, we proposed using rosehip seeds as a cosubstrate of rosehip shells to help extract shell carotenoids using SC-CO2. Our proposal has manifested itself in industrial practice with two products offered by Flavex Naturextrakte, a large German food ingredients company. Flavex Naturextrakte offers both an oil for use in cosmetics [146] and an extract with the carotenoids from the fruit shells, which give it a reddish color [147].

Case Study

In this section, we will study the coextraction of oil from the seeds and carotenoids from the shells of rosehips as a case study for the application of RSM and DoE to analyze the SC-CO2 extraction of oil and minor lipids. The study uses as independent variables the extraction temperature, CO2 density at the extraction conditions, specific CO2 consumption during the dynamic extraction time, and seed-to-shell ratio in the substrate, and as dependent variables the oil extraction yield and carotenoid content of the oil. Substrate mixtures were manufactured by extrusion which allows an effective destruction of inner barriers to mass transfer, while avoiding the presence of fines. The vessel load, particle size of the substrate, and superficial CO2 velocity were all kept constant in the unreported experiments of a student at Universidad de la Frontera (UFro). Appendix C in Supplementary Material presents the materials and methods of this unreported experimental work.

Table 4 shows the experimental results of oil yield and carotenoid content in the oil when extracting extruded rosehip seeds and shells using SC-CO2 as a function of the extraction temperature, CO2 density, specific CO2 consumption, and content of seeds in the pretreated substrate. Oil yield ranged from about 50.0 to 80.5 g kg−1 oil/substrate, depending on the substrate and the extraction conditions. To make results independent of the substrate, we computed the percent oil recovery, which was based on the fraction of seeds in the composite substrate (with the remainder being rosehip shells), and the experimentally measured oil contents of seeds and shells (Appendix C in Supplementary Material). The oil content of the extruded samples ranged from 50.5 g kg−1 oil/substrate when using 240 g kg−1 seed/substrate, to 99.1 g kg−1 oil/substrate when using only seeds. The percent oil recovery should have been limited to 100% but, because of experimental errors, it exceeded that limit by up to 10–15% in a few cases when applying a high specific consumption of CO2 having a high solvent power (high temperature and density), or some of those traits. The lowest percent oil recoveries, 65–70%, were observed when applying a low specific consumption of CO2 having a low solvent power (low temperature and density), or some of those traits. The carotenoid content in the oil changed even more dramatically, from 0.10 g kg−1 carotenoids/oil when using rosehip seeds as the substrate to 5.5 g kg−1 carotenoids/oil when using a substrate containing 24% of seeds and 76% of shells. The yield of carotenoids was computed as the product of the two main responses and, like the carotenoid content of the oil, ranged widely from 7.9 to 320 mg kg−1 carotenoids/substrate. 
To make these results independent of the substrate, we computed the percent carotenoid recovery, which was also based on the fraction of seeds in the composite substrate, and the experimentally measured carotenoid contents of seeds and shells (Appendix C in Supplementary Material). The carotenoid content of the extruded samples increased from 73.4 mg kg−1 carotenoids/substrate when using only seeds to 185.3 mg kg−1 carotenoids/substrate when using 240 g kg−1 seed/substrate. The percent carotenoid recovery should have been limited to 100% but, because of compound experimental errors of both the yield of oil and the carotenoid content in the oil, it exceeded that limit six times, by up to 70%. The recovery of carotenoids varied widely from 10% in the single experiment using the substrate containing only seeds to 170% in the single experiment using the substrate containing 24% of seeds and 76% of shells.
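The recovery calculations described above can be sketched in Python. This is only an illustration: the function names are hypothetical, the linear mixing rule by weight is an assumption about how the composite contents were obtained, and the oil content of shells is back-calculated here from the two contents quoted in the text rather than taken from Appendix C.

```python
def mixture_content(seed_fraction, seed_content, shell_content):
    """Content of a seed/shell mixture, assuming linear mixing by weight."""
    return seed_fraction * seed_content + (1.0 - seed_fraction) * shell_content


def percent_recovery(extracted, content):
    """Percent of the substrate content that was actually extracted."""
    return 100.0 * extracted / content


def carotenoid_yield(y_oil, c_car):
    """Carotenoid yield (mg/kg substrate) from oil yield (g/kg oil/substrate)
    and carotenoid content of the oil (g/kg carotenoids/oil)."""
    # (g oil / kg substrate) x (g carotenoids / kg oil) = mg carotenoids / kg substrate
    return y_oil * c_car


OIL_SEEDS = 99.1     # g/kg, extruded substrate with seeds only (from the text)
OIL_MIX_240 = 50.5   # g/kg, extruded substrate with 240 g/kg seeds (from the text)
# Assumption: shell oil content inferred from the two values above
OIL_SHELLS = (OIL_MIX_240 - 0.24 * OIL_SEEDS) / 0.76

oil_at_620 = mixture_content(0.62, OIL_SEEDS, OIL_SHELLS)
print(round(OIL_SHELLS, 1), round(oil_at_620, 1))
print(percent_recovery(60.0, oil_at_620))  # 60 g/kg is a hypothetical oil yield
```

Because the carotenoid yield is a product of two measured responses, its relative error is roughly the sum of the relative errors of both, which is consistent with carotenoid recoveries overshooting 100% by more than oil recoveries do.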

Table 4 Results of the application of the rotatable central composite design (CCD) to the supercritical CO2 extraction of mixtures of seeds and shells of rosehip

The remainder of this section will present the statistical analyses and graphical representations of the RSs separately for oil yield (“Variations on Oil Yield with Process Conditions” section) and carotenoid content in the oil (“Variations on Carotenoid Content in the Oil with Process Conditions” section) for the complete experimental CCD, which will be followed by a discussion of the implications of our results.

Appendix D in Supplementary Material statistically demonstrates that the surfaces for oil yield and carotenoid content in the oil for Blocks 1 and 2, respectively, exhibit considerable curvature and therefore cannot be represented using a first-order model (not significant, statistically), which constitutes the first stage in the proposed methodology, before completing the experiments (Block 3) and analyzing the response surface for each response.

Variations on Oil Yield with Process Conditions

Clearly, the results in Appendix D of Supplementary Material indicate that a linear model is not adequate to describe the response for oil yield. There are curved regions that the linear model does not explain, so that a second-order model, such as the one presented in Eq. (4) should be applied:

$$Y=\sum\limits_{{\text{i}}=0}^{4}\sum\limits_{{\text{j}}={\text{i}}}^{4}{\beta }_{\text{ij}}{X}_{\text{i}}{X}_{\text{j}}$$
(4)

where, the same as before, β00 is the intercept coefficient; β0i is the main effect for the normalized factor “i” (1, 2, 3, or 4); βii is the quadratic effect for the normalized factor “i”; βij is the interaction between the normalized factors “i” and “j” (1, 2, 3, or 4, with “j” different from “i”); Xi and Xj are the levels of the coded independent variables i and j, respectively (X0 = 1); and Y is a response variable, that in this section corresponds to oil yield (Yoil). The coded independent variables in Eq. (4) are defined as follows (Eqs. 5a–d):

$${X}_{1}=\frac{T-55}{10},$$
(5a)
$${X}_{2}=\frac{\rho -900}{40},$$
(5b)
$${X}_{3}=\frac{q-18}{6},\;\mathrm{ and}$$
(5c)
$${X}_{4}=\frac{S-620}{190},$$
(5d)

where T is the extraction temperature (°C), ρ (kg/m3) is the density of CO2 at the process conditions (defined by the extraction temperature and pressure), q (kg kg−1 CO2/substrate) is the specific solvent consumption, and S (g kg−1 seeds/substrate) is the fraction of seeds in the composite substrate.
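As an illustration of Eqs. (4) and (5a–d), the following Python sketch codes the natural variables and lists the 15 regressors XiXj of the full second-order model. The function and dictionary names are hypothetical; the centers and step sizes are those of Eqs. (5a–d).

```python
from itertools import combinations_with_replacement

# Centers and step sizes of Eqs. (5a-d)
CENTERS = {"T": 55.0, "rho": 900.0, "q": 18.0, "S": 620.0}
STEPS = {"T": 10.0, "rho": 40.0, "q": 6.0, "S": 190.0}
ORDER = ("T", "rho", "q", "S")


def code_variables(T, rho, q, S):
    """Return the coded levels [X1, X2, X3, X4] of Eqs. (5a-d)."""
    natural = {"T": T, "rho": rho, "q": q, "S": S}
    return [(natural[k] - CENTERS[k]) / STEPS[k] for k in ORDER]


def model_terms(coded):
    """Regressors Xi*Xj of Eq. (4) for 0 <= i <= j <= 4, with X0 = 1."""
    xs = [1.0] + list(coded)
    return [xs[i] * xs[j] for i, j in combinations_with_replacement(range(5), 2)]


center = code_variables(55.0, 900.0, 18.0, 620.0)
upper = code_variables(75.0, 980.0, 30.0, 1000.0)
print(center, upper, len(model_terms(upper)))
```

The 15 terms are one intercept, four main effects, four pure quadratic terms, and six two-factor interactions. A substrate made only of seeds (S = 1000) and a CO2 density of 980 kg/m3 both map to a coded level of +2.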

Table 5 summarizes the statistical indicators obtained from the ANOVA when the selected second-order model, Eq. (4), was applied to the results for Yoil in Table 4. The second-order model was considered adequate, because of the insignificant lack of fit (p ≤ 0.05) relative to the pure error, the high signal-to-noise ratio (> 14), and the high coefficient of determination (R2 = 0.85). The R2 value indicates that the second-order model explains 84.6% of the variability in the values of Yoil. Appendix D in Supplementary Material complements the statistical indicators in Table 5, showing that the residuals of the model [or the discrepancies between experimental values in Table 4 and those predicted by Eq. (4)] behave normally.

Table 5 Statistical descriptors of the second-order model performance. Analysis of variance for the fitted regression coefficients using the oil extraction yield (Yoil, g kg−1 oil/substrate) as response

To present our results graphically, we proceeded to eliminate nonsignificant coefficients (p > 0.05) in Eq. (4). However, as reported in Table 5, we kept the linear coefficient of dimensionless density (X2), which was not significant at the 5% level (p = 0.053), to keep the hierarchy of the response surface model. Only three second-order coefficients that affected oil yield were kept in the model, namely the interaction between the coded density (X2) and the coded specific CO2 consumption (X3), the interaction between the coded specific CO2 consumption (X3) and the coded fraction of seeds in the substrate (X4), and the square of the coded fraction of seeds in the substrate (X4²). Equation (6) reports the second-order model for oil extraction yield that can be derived from the significant coefficient values reported in Table 5:

$$\begin{aligned}{Y}_{\text{oil}}=&\;61.91+2.555\left(\frac{T-55}{10}\right)+1.449\left(\frac{\rho -900}{40}\right)\\&+3.426\left(\frac{q-18}{6}\right)+3.936\left(\frac{S-620}{190}\right)+1.979{\left(\frac{S-620}{190}\right)}^{2}\\&-3.463\left(\frac{\rho -900}{40}\right)\left(\frac{q-18}{6}\right)-3.007\left(\frac{q-18}{6}\right)\left(\frac{S-620}{190}\right)\end{aligned}$$
(6)
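Equation (6) can be evaluated directly. The following Python sketch is a plain transcription of the reduced model; the function name `oil_yield` is an assumption, and the example conditions are arbitrary points within the experimental region.

```python
def oil_yield(T, rho, q, S):
    """Eq. (6): predicted oil yield (g/kg oil/substrate).

    T in degrees C, rho in kg/m3, q in kg CO2/kg substrate, S in g seeds/kg substrate.
    """
    x1 = (T - 55.0) / 10.0
    x2 = (rho - 900.0) / 40.0
    x3 = (q - 18.0) / 6.0
    x4 = (S - 620.0) / 190.0
    return (61.91 + 2.555 * x1 + 1.449 * x2 + 3.426 * x3 + 3.936 * x4
            + 1.979 * x4 ** 2
            - 3.463 * x2 * x3 - 3.007 * x3 * x4)


# Center of the design, then a scan of q at the center of the other factors
print(oil_yield(55.0, 900.0, 18.0, 620.0))
for q in (6.0, 12.0, 18.0, 24.0, 30.0):
    print(q, round(oil_yield(55.0, 900.0, q, 620.0), 2))
```

At the center of the other factors the prediction is linear in q, because Eq. (6) carries no quadratic term in q. This is one reason why a polynomial of this kind cannot reproduce the saturating shape of Eq. (1).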

RSM offers useful tools for understanding the behavior of a response as a function of several factors. However, when there are three or more independent variables, as in this case, it is difficult to visualize the behavior of the response. The nature of the response surface may be determined from the stationary point and the sign and magnitude of the eigenvalues in the so-called canonical analysis [12]. The eigenvalues of Eq. (6) are λ1 =  + 1.1508, λ2 =  − 2.6540, λ3 =  − 0.3396, and λ4 =  + 2.8469. Eigenvalues can be interpreted considering them in pairs: two positive values signal the presence of a minimal value in the directions defined by the corresponding axes; two negative values signal the presence of a maximal value; and eigenvalues of opposite signs signal that the surface portrays a saddle point [10].
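The usual sign rule of canonical analysis (all eigenvalues positive for a minimum, all negative for a maximum, mixed signs for a saddle) can be sketched in a few lines of Python. The function name, the tolerance, and the "ridge" label for near-zero eigenvalues are illustrative assumptions; the eigenvalues are those quoted in the text for oil yield.

```python
def classify_stationary_point(eigenvalues, tol=1e-8):
    """Classify a second-order surface from the signs of its eigenvalues."""
    positive = [lam for lam in eigenvalues if lam > tol]
    negative = [lam for lam in eigenvalues if lam < -tol]
    if positive and negative:
        return "saddle"
    if len(positive) == len(eigenvalues):
        return "minimum"
    if len(negative) == len(eigenvalues):
        return "maximum"
    # At least one eigenvalue is numerically zero and the rest share a sign
    return "ridge"


OIL_EIGENVALUES = [1.1508, -2.6540, -0.3396, 2.8469]  # quoted in the text
print(classify_stationary_point(OIL_EIGENVALUES))
```

With two positive and two negative eigenvalues, the oil yield surface is classified as a saddle, in line with the surface plots discussed in this section.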

The statistical indicator Q2 for Yoil was 0.62, which suggests a low accuracy of predictions and overfitting of the regression model, since the fit depends considerably on the individual experimental data. The fitted model will therefore have difficulty predicting new data that do not match the calibration experiments.
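The text does not spell out how Q2 was computed. A common definition, assumed here, is the leave-one-out cross-validated coefficient Q2 = 1 − PRESS/SStot, where PRESS is the sum of squared errors when each point is predicted by a model fitted without it. The Python sketch below applies that definition to a one-factor straight-line fit with made-up data, purely to show the mechanics; it is not the second-order model nor the data of Table 4.

```python
def fit_line(xs, ys):
    """Ordinary least-squares intercept and slope for y = b0 + b1 x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def q2_leave_one_out(xs, ys):
    """Q2 = 1 - PRESS / SStot, using leave-one-out predictions (assumed definition)."""
    press = 0.0
    for i in range(len(xs)):
        b0, b1 = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (b0 + b1 * xs[i])) ** 2
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_tot


# Made-up data: an exact line, and the same line with one disturbed point
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
print(q2_leave_one_out(xs, [1.0, 3.0, 5.0, 7.0, 9.0]))
print(q2_leave_one_out(xs, [1.0, 3.0, 8.0, 7.0, 9.0]))
```

Under this definition, a large gap between R2 and Q2 means that individual runs strongly influence the fitted coefficients, which is the behavior described above for Yoil.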

Figure 2 shows the interaction between ρ and q for the extraction at 45 °C of a substrate containing 62% of seeds. The surface represents a saddle, which is common to find in cases of a relatively flat experimental region that does not include a true maximum or minimum [6]. The maximum oil yield is reached over virtually the whole region, except where little CO2 is used (small q) or its solvent power is low (low ρ), toward the front and left of the plot, where oil yield falls below the expected 75 g kg−1 oil/substrate.

Fig. 2

Surface plot of oil extraction yield (Yoil) with supercritical CO2 as a function of the specific CO2 consumption (q) and CO2 density (ρ) at T = 45°C and S = 62%. Numbers on contour map correspond to oil yield (Yoil, g kg−1 oil/substrate)

Figure 3 shows the interaction between q and S for T = 45 °C and ρ = 940 kg/m3, which also represents a saddle. As in the case of Fig. 2, we observed a plateau and a fall toward the front and left, where oil yield decreases as either S (the fraction of seeds in the substrate, and with it the oil content of the substrate) or q (the quantity of CO2 used for extraction) decreases.

Fig. 3

Surface plot of oil extraction yield (Yoil) with supercritical CO2 as a function of the fraction of seeds in the composite substrate (S) and specific CO2 consumption (q) at T = 45°C and ρ = 940 kg/m3. Numbers on contour map correspond to oil extraction yield (Yoil, g kg−1 oil/substrate)

Variations on Carotenoid Content in the Oil with Process Conditions

Equation (4) was also used to represent the effect of the experimental factors on the carotenoid content in the oil, given that Appendix D of Supplementary Material indicated that a linear model was not adequate to describe this response. Table 6 summarizes the statistical indicators obtained from the ANOVA when Eq. (4) was applied to the results for Ccar in Table 4. The same as in the case of oil yield, the second-order model was considered adequate, because of the insignificant lack of fit (p ≤ 0.01) relative to the pure error, the high signal-to-noise ratio (> 40), and the high coefficient of determination (R2 = 0.98). This last coefficient indicates that the second-order model explains 98.2% of the variability in Ccar. Appendix D in Supplementary Material complements the information provided by the statistical indicators in Table 6 showing that the residuals of Eq. (4) for carotenoid content in the oil behave normally.

Table 6 Statistical descriptors of the second-order model performance. Analysis of variance for the fitted regression coefficients using the carotenoid content in the oil (Ccar, g kg−1 carotenoids/oil) as response

As reported in Table 6, we proceeded to eliminate nonsignificant coefficients (p > 0.05) in Eq. (4) to present our results graphically. We noted that the coded specific consumption of CO2 (X3) does not affect the carotenoid content in the oil, which in practice indicates that the carotenoids and the oil are extracted in fixed proportions throughout extraction. Furthermore, only two of the remaining second-order coefficients do not affect the carotenoid content in the oil, namely the square of the temperature (X1²) and the interaction between the temperature and the fraction of seeds in the substrate (X1X4). Equation (7) reports the second-order model for the carotenoid content in the oil based on the significant coefficient values reported in Table 6:

$$\begin{aligned}{C}_{\text{car}}=&\;14.233+0.933\left(\frac{T-55}{10}\right)+2.095\left(\frac{\rho -900}{40}\right)\\&-12.101\left(\frac{S-620}{190}\right)+0.833{\left(\frac{\rho -900}{40}\right)}^{2}\\&+3.337{\left(\frac{S-620}{190}\right)}^{2}-1.367\left(\frac{T-55}{10}\right)\left(\frac{\rho -900}{40}\right)\\&-2.459\left(\frac{\rho -900}{40}\right)\left(\frac{S-620}{190}\right)\end{aligned}$$
(7)
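Equation (7) and the ρ–S interaction it contains can be explored with a short Python sketch. The function names are hypothetical, the returned values are in whatever units the coefficients of Table 6 carry, and the sensitivity function is simply the partial derivative of Eq. (7) with respect to the coded fraction of seeds (X4).

```python
def carotenoid_content(T, rho, S):
    """Eq. (7): predicted carotenoid content of the oil (units of Table 6).

    The specific CO2 consumption q is not an argument because X3 was dropped.
    """
    x1 = (T - 55.0) / 10.0
    x2 = (rho - 900.0) / 40.0
    x4 = (S - 620.0) / 190.0
    return (14.233 + 0.933 * x1 + 2.095 * x2 - 12.101 * x4
            + 0.833 * x2 ** 2 + 3.337 * x4 ** 2
            - 1.367 * x1 * x2 - 2.459 * x2 * x4)


def seed_sensitivity(rho, S):
    """Partial derivative of Eq. (7) with respect to the coded variable X4."""
    x2 = (rho - 900.0) / 40.0
    x4 = (S - 620.0) / 190.0
    return -12.101 + 2.0 * 3.337 * x4 - 2.459 * x2


print(carotenoid_content(55.0, 900.0, 620.0))
for rho in (820.0, 900.0, 980.0):
    print(rho, round(seed_sensitivity(rho, 620.0), 3))
```

The slope with respect to X4 becomes more negative as density increases, so the model predicts a stronger effect of substrate composition on the carotenoid content of the oil when the CO2 has a high solvent power.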

The eigenvalues of Eq. (7) are λ1 =  + 3.8955, λ2 =  − 0.6452, λ3 =  + 0.050 (nearly null value), and λ4 =  + 1.0779. Statistical indicator Q2 for Ccar was 0.93. Thus, the accuracy of predictions of the polynomial model is better for Ccar than for Yoil (Q2 = 0.62), which indicates that the independent variables and their chosen combinations better describe and predict carotenoid content in the oil than oil yield.

Figure 4 shows the interaction between temperature and CO2 density when extracting a substrate containing 43% seeds using 18 kg kg−1 CO2/ substrate. Ccar changed slightly with increasing T and ρ, resulting in a fairly flat response surface graph. Indeed, when the CO2 density increased from 820 to 980 kg/m3, Ccar increased slightly, perhaps because the extraction is more defined by the solute content in the substrate than by its solubility in CO2 at the selected extraction conditions.

Fig. 4

Surface plot of carotenoid content (Ccar) with supercritical CO2 as a function of the CO2 density (ρ) and temperature (T) at q = 18 kg kg−1 CO2/substrate and S = 30%. Numbers on contour map correspond to the carotenoid content in the oil (Ccar, g kg−1 carotenoids/oil)

Figure 5 shows the interaction between ρ and S when extracting the raw material using 18 kg kg−1 CO2/substrate at 75 °C. The carotenoid content in the oil depends strongly on the composition of the raw material when using CO2 with high solvent power (for ρ = 980 kg/m3), in which case the carotenoid content in the substrate determines its content in the oil. We think this may be due to the large cosolvent effect of the oil on the solubilization of carotenoids. When using CO2 with low solvent power (for ρ = 820 kg/m3), on the other hand, the composition of the raw material has a smaller effect on the content of carotenoids in the extracted oil.

Fig. 5

Surface plot of carotenoid content (Ccar) with supercritical CO2 as a function of the fraction of seeds in the composite substrate (S) and CO2 density (ρ) at T = 75 °C and q = 18 kg kg−1 CO2/substrate. Numbers on contour map correspond to the carotenoid content in the oil (Ccar, g kg−1 carotenoids/oil)

Figure 6 presents the variations in oil yield (Fig. 6A, B) and carotenoid yield (Fig. 6C, D) for rosehip seeds (Fig. 6A, C) and rosehip fruits containing approximately 25.1% of seeds and the rest of shells (Fig. 6B, D). The carotenoid yield was estimated as the product of the oil yield and the carotenoid content in the oil, which were the two responses modeled in this work. The curves represent changes in the extraction conditions that increased the oil solubility in CO2 from 5 g/kg at 43.6 °C and 23.8 MPa (ρ = 854.3 kg/m3), to 12.5 g/kg at 53.9 °C and 36.9 MPa (ρ = 895.8 kg/m3), and to 31.25 g/kg at 65.1 °C and 55.7 MPa (ρ = 940.2 kg/m3).

Fig. 6

Prediction of the extraction yield as a function of specific CO2 consumption (q) and saturation concentration of the oil in CO2 (Csat) using the response surface model for A oil from seeds, B oil from fruits, C carotenoids from seeds, D carotenoids from fruits

Implication of Results

Figure 6 shows that the response surface is not capable of predicting extraction curves of oil yield versus specific CO2 consumption, in the sense that they do not start at the origin, and that they do not approach a horizontal asymptote for high values of q as suggested by Fig. 1. However, their trends approximate expected curves between q*/3 and 3 q* in that it is not the thermodynamic solubility of the oil for low values of q, nor the oil content in the substrate which determines the variations in oil yield (Yoil) with the specific CO2 consumption (q). In the case of rosehip seeds (for S = 1000 g kg−1 seeds/substrate), it is apparent that it is the oil content in the seeds which determines the yield of oil, with the trend decreasing as the solvent power of the SC-CO2 decreases (Fig. 6A). On the other hand, in the case of the complete fruit (for S = 251 g kg−1 seeds/substrate), a trend closer to that expected is seen, which indicates that the solubility of the extract is lower than the solubility of the oil; as the solvent power of CO2 increases, oil extraction is closer to being determined by the oil content in the substrate, at least until reaching a specific CO2 consumption of 25 kg kg−1 CO2/substrate (Fig. 6B). The trend is more pronounced in the case of carotenoid extraction, in which there is a more marked effect of the solvent power of CO2 (Fig. 6D).

RSM claims that it can provide an approximate picture of the behavior of a system with a limited number of experiments. However, this is not the case for the extraction of vegetable oil from an optimally sheared substrate, where there have been notable advances in science and technology aspects to allow a more accurate picture of the phenomena with an even more limited experimental effort. Indeed, del Valle [148] claimed that a single experiment was required to determine a best-fit pretreatment-dependent microstructural factor to describe the extraction of prepressed, flaked, pelletized, or extruded seeds that were ground and size-classified using the shrinking core model. According to del Valle [148] the extraction curve predicted by the shrinking core model will depend on four groups of factors. The first group corresponds to the substrate and depends on its treatment, which will define, in turn, the microstructural factor (FM) and the particle diameter (dp). The second group refers to the target solute, which in the case of a vegetable oil can be homologated to a triolein molecule. The third group relates to the extraction conditions that can be characterized by the process temperature (T) and pressure (P). T and P, in turn, and under the assumption that the limited amount of dissolved oil does not affect the physical properties of CO2, determine the density (ρ) and viscosity (µ), which can be estimated using the NIST database [149]. The last physical property, the diffusivity of the oil in CO2 under the extraction conditions (D12), can be estimated as a function of µ and the molecular weight of triolein (a triglyceride usually employed to represent typical vegetable oils) using the equation of Funazukuri et al. [150]. 
The last group refers to the characteristics of the extraction process that depend on the CO2 flow rate (Q), and the volume (VE) and aspect ratio (L/D) of the cylindrical extraction vessel which, in turn, define its height (L) and internal diameter (D). The superficial CO2 velocity in the extraction vessel (U) can be estimated by dividing Q by ρ and the cross-sectional area of the extractor (πD2/4). Then, the dimensionless Schmidt (Sc) number can be determined, which depends on the physical properties of the system under study (ρ, µ, D12). On the other hand, the dimensionless Reynolds (Re) number, which characterizes the flow regime in the packed bed, depends on the physical properties of CO2 (ρ, µ), U, and dp. Re and Sc allow the determination of the dimensionless Sherwood (Sh) number using a specific correlation such as the one of King and Catchpole [151] for mass transfer in packed beds operating with supercritical fluids. Finally, ρ and T allow estimating the solubility of the oil in CO2 (Csat) under the extraction conditions using the equation of del Valle et al. [106]; FM and D12 allow estimating the effective diffusivity of the oil in the pretreated substrate (De) [108], which characterizes the internal mass transfer in the solid substrate; and Sh makes it possible to estimate the film coefficient (kf) that characterizes mass transfer by convection in the packed bed.
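The chain of calculations described above can be sketched in Python. This is a minimal illustration under stated assumptions: the dimensionless numbers use their usual particle-based definitions (Re = ρUdp/µ, Sc = µ/(ρD12), Sh = kf dp/D12), all numerical inputs are hypothetical, and the Sherwood number is passed in as an input because the coefficients of the correlation of King and Catchpole [151] are not reproduced here.

```python
import math


def superficial_velocity(Q, rho, D):
    """U (m/s) from CO2 mass flow Q (kg/s), density rho (kg/m3), vessel diameter D (m)."""
    return Q / (rho * math.pi * D ** 2 / 4.0)


def reynolds(rho, U, dp, mu):
    """Particle Reynolds number, Re = rho U dp / mu."""
    return rho * U * dp / mu


def schmidt(mu, rho, D12):
    """Schmidt number, Sc = mu / (rho D12)."""
    return mu / (rho * D12)


def film_coefficient(Sh, D12, dp):
    """kf (m/s) from a Sherwood number defined as Sh = kf dp / D12."""
    return Sh * D12 / dp


# Hypothetical inputs, for illustration only
Q, D, dp = 1.0e-3, 0.05, 1.0e-3          # kg/s, m, m
rho, mu, D12 = 900.0, 9.0e-5, 3.0e-9     # kg/m3, Pa s, m2/s

U = superficial_velocity(Q, rho, D)
print(U, reynolds(rho, U, dp, mu), schmidt(mu, rho, D12))
print(film_coefficient(10.0, D12, dp))   # Sh = 10 is an assumed placeholder
```

In practice, ρ and µ would come from the NIST database at the selected T and P, and Sh from the chosen correlation, so that a single extraction experiment is left to fit the microstructural factor.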

Something similar occurs in the case of seeds subjected to size reduction by impact, abrasion, or cutting mechanisms that cause rupture of surface cells. Extraction of this type of substrate can be represented by the intact-and-broken cell model [151,152,153,154], whose predictions depend on the fraction of broken cells, an oil partition coefficient between intact and broken cells, and mass transfer coefficients for the intact tissue (characterized by the inner mass transfer coefficient ks) and from broken superficial cells to the surrounding SC-CO2 phase (characterized by kf). Reverchon and Marrone [107] suggested using microscopy to characterize the parenchymatous cells and related the fraction of broken cells to the ratio α between the volume of a superficial layer half the size of a typical parenchymatous cell (δ) and the volume of the whole particle, containing intact inner cells. For a spherical particle, the relationship is given by Eq. (8) [155]:

$$\alpha \left(=\frac{{V}_{\text{free}}}{V}\right)=3\left(\frac{\delta }{{d}_{\text{p}}}\right)-3{\left(\frac{\delta }{{d}_{\text{p}}}\right)}^{2}+{\left(\frac{\delta }{{d}_{\text{p}}}\right)}^{3}$$
(8)
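As a quick numerical check, Eq. (8) can be evaluated directly. The sketch below assumes, as the polynomial implies, that α equals 1 − (1 − δ/dp)³, which is the volume fraction of a shell of thickness δ/2 on a sphere of diameter dp. The function name and the example dimensions are hypothetical.

```python
def broken_cell_fraction(delta, dp):
    """Fraction of broken cells (alpha) in a spherical particle, Eq. (8).

    delta: size of a typical parenchymatous cell (same length unit as dp)
    dp: particle diameter
    """
    x = delta / dp
    if not 0.0 <= x <= 1.0:
        raise ValueError("delta/dp must lie between 0 and 1")
    return 3.0 * x - 3.0 * x ** 2 + x ** 3


# Illustrative values only: 50-um cells in particles of decreasing diameter
for dp_um in (2000.0, 1000.0, 500.0, 250.0):
    alpha = broken_cell_fraction(50.0, dp_um)
    print(f"dp = {dp_um:6.0f} um -> alpha = {alpha:.3f}")
```

The loop shows the expected trend: for a fixed cell size, finer grinding (smaller dp) increases the fraction of oil held in broken surface cells.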

Recently, Arias et al. [156] estimated ks as a function of a pretreatment-dependent microstructural factor (FM), estimated kf using a dimensionless correlation for packed beds operating with supercritical fluids, and best-fitted the values of the substrate-dependent α and FM and of the extraction-condition-dependent partition coefficient (K) to the extraction curves of Lippia origanoides oleoresin using ethanol-modified SC-CO2. Urrego et al. [157] measured the equilibrium partition of rapeseed oil between SC-CO2 and prepressed rapeseed.

There is some support for the claim that the use of vegetable oil as cosolvent improves the solubility of carotenoids in CO2. Araus et al. reported the solubility of β-carotene [158] and of capsanthin [159] in triolein-modified SC-CO2, in both cases in the form of an enhancement factor, which indicates how many times larger the solubility of the carotenoid is in triolein-modified CO2 than in pure CO2 at the same system temperature and pressure. In both cases, the solubility of triolein in SC-CO2 is not modified by the presence of small amounts of the dissolved carotenoids.

Clearly, oil solubility in SC-CO2 under the extraction conditions determines the extraction curves for the oil and carotenoids in Fig. 6. Figure 7 shows the variations in oil solubility in SC-CO2 as a function of temperature (different isotherms) and pressure (abscissa). At low pressures, the solubility of the oil decreases as temperature increases, due to the reduction in the density and the solvent power of CO2 [160]. On the other hand, at high pressures, for which CO2 is less compressible, the solubility of the oil increases with temperature, due to the increase in the vapor pressure and volatility of the oil. Therefore, there is an intermediate pressure region around the so-called crossover pressure, highlighted in the expanded grey area in Fig. 7, in which the effects of the solvent power of CO2 and the volatility of the oil compensate each other, and the solubility of the oil in CO2 does not change with temperature [161]. In the case of vegetable oils, the crossover pressure increases with temperature, from 29 MPa for the crossover between 35 and 45 °C to 40 MPa for the crossover between 65 and 75 °C (Fig. 7).

Fig. 7

Crossover effect. A Solubility of vegetable oils in supercritical CO2 as a function of temperature (different isotherms) and pressure (abscissa axis); B variations in oil solubility in SC-CO2 in the experimental range of the extraction conditions studied

As a result of these changes in oil solubility in CO2 with temperature, a decrease in oil yield with increasing temperature was expected in studies at low pressures (below 29 MPa), as can be observed in the first three lines of Table 7. An increase in oil yield with increasing temperature would also be expected in studies at high pressures (above 40 MPa), but no such studies were found in the literature. Finally, at intermediate pressures, including the crossover pressure range between 29 and 40 MPa, mixed results would be expected, with an inconclusive effect of temperature on oil extraction yield and a high statistical significance of the interaction between pressure and temperature, as highlighted in the last two lines of Table 7.

Table 7 Effect of temperature (T) and its interaction with pressure (T×P) on oil extraction yield from oilseeds reported in the literature for different pressure and temperature ranges

The supercritical extraction model of Chen et al. [86] predicts a crossover pressure of 31.4 to 35.2 MPa for a temperature interval from ca. 40 to 60 °C, but in this case, the crossover pressure decreased rather than increased as the temperature increased. Chen et al. [86] used the system temperature and pressure, and the specific CO2 consumption, as the factors, and the percent oil extraction yield as the response. Appendix E in Supplementary Material details how this information was used to estimate the crossover pressure, which is the condition for which the derivative of an estimate of oil solubility with respect to temperature, at constant pressure, is zero.
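The zero-derivative condition can be illustrated with a generic second-order response surface. The sketch below is not the procedure of Appendix E and does not use the coefficients of Chen et al. [86]: it assumes a quadratic model whose temperature-dependent terms are b_T·T + b_TT·T² + b_TP·T·P (with the third factor held fixed and its contribution absorbed into b_T), and all coefficient values are made up for illustration. Setting the temperature derivative to zero gives P = −(b_T + 2·b_TT·T)/b_TP.

```python
def temperature_slope(T, P, b_T, b_TT, b_TP):
    """dY/dT of a quadratic response surface at constant P.

    Only the terms that depend on T are needed:
    Y = ... + b_T*T + b_TT*T**2 + b_TP*T*P
    """
    return b_T + 2.0 * b_TT * T + b_TP * P


def crossover_pressure(T, b_T, b_TT, b_TP):
    """Pressure at which dY/dT = 0 for the quadratic model above."""
    if b_TP == 0.0:
        raise ValueError("no T x P interaction, so no crossover pressure")
    return -(b_T + 2.0 * b_TT * T) / b_TP


# HYPOTHETICAL coefficients (T in deg C, P in MPa), for illustration only
coeffs = {"b_T": -2.0, "b_TT": 0.005, "b_TP": 0.05}
for T in (40.0, 50.0, 60.0):
    P_x = crossover_pressure(T, **coeffs)
    slope = temperature_slope(T, P_x, **coeffs)
    print(f"T = {T:.0f} C -> crossover P = {P_x:.1f} MPa (dY/dT = {slope:.1e})")
```

In this simplified form, the crossover pressure changes with temperature at a rate of −2·b_TT/b_TP, so whether it rises or falls depends only on the relative signs of the fitted quadratic and interaction coefficients. This is one way a fitted response surface can yield a crossover trend opposite to the one expected from solubility data.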

Unlike the complex effects of system temperature and pressure on solute solubility, which include the crossover phenomenon that modifies the functional dependency of solubility on system temperature, the equation proposed by Chrastil [162], on which Fig. 7 is based, predicts positive effects of both system temperature and CO2 density on solute solubility. Indeed, Chrastil’s equation predicts that solute solubility in SC-CO2 depends on three factors: the solubility at an arbitrary reference condition; a correction factor for the CO2 density, which depends on a factor k that defines the number of CO2 molecules that form a solvate complex with a single molecule of the solute; and a correction factor for the absolute temperature, which depends on the heat of solvation ΔH of the solute, which includes the thermal effects of first vaporizing the solute and then dissolving it in the CO2 phase [106, 163].
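The three-factor structure described above can be written as a short function. This is a minimal sketch, assuming the reference-condition form c = c_ref·(ρ/ρ_ref)^k·exp[−(ΔH/R)(1/T − 1/T_ref)], with ΔH entered as a positive magnitude so that solubility increases with temperature at constant density. The sign convention, the function name, and all parameter values are assumptions made for illustration and are not taken from [106] or [162].

```python
import math

R_GAS = 8.314  # universal gas constant, J/(mol K)


def chrastil_solubility(rho, T, c_ref, rho_ref, T_ref, k, dH):
    """Solubility from a Chrastil-type equation in reference-condition form.

    rho, rho_ref: CO2 density and reference density (same units)
    T, T_ref: absolute temperature and reference temperature (K)
    c_ref: solubility at (rho_ref, T_ref); sets the output units
    k: association number (CO2 molecules per solute molecule)
    dH: heat of solvation as a positive magnitude (J/mol) -- ASSUMED convention
    """
    density_factor = (rho / rho_ref) ** k
    temperature_factor = math.exp(-(dH / R_GAS) * (1.0 / T - 1.0 / T_ref))
    return c_ref * density_factor * temperature_factor


# HYPOTHETICAL parameters, for illustration only
params = dict(c_ref=8.0, rho_ref=900.0, T_ref=313.15, k=10.0, dH=40e3)

for T in (313.15, 333.15):
    for rho in (850.0, 900.0, 950.0):
        c = chrastil_solubility(rho, T, **params)
        print(f"T = {T:.2f} K, rho = {rho:.0f} kg/m3 -> c = {c:.2f}")
```

Because density and temperature enter as separate multiplicative factors, each has a monotonic effect when the other is held constant, which is the property that makes T and ρ convenient independent variables.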

Considering the positive effects of system temperature and CO2 density on solute solubility predicted by the equation of Chrastil [162], on the one hand, and the positive effect of solute solubility on the CO2 extraction curves (Fig. 1B), on the other, it is surprising that so few studies use T and ρ instead of T and P as independent variables. An exception is the work of del Valle and Aguilera [2] mentioned in the “Introduction” section, but some additional contributions can also be mentioned, including those on the SC-CO2 extraction of polyphenols from an inert matrix [164] or grape seeds [165], and on the SC-CO2 desorption of lemon essential oil from silica gel [166]. We have also used T and ρ as independent variables to describe the SC-CO2 extraction of oils and antioxidant carotenoids from pelletized Nannochloropsis gaditana microalgae using RSM [119, 167].

Conclusions

Researchers in process engineering and food engineering should exercise caution when reviewing the state of the art in an attempt to understand, describe, or analyze a process of interest. Their main objective should be to find and use models supported by fundamental principles, which allow safe scale-up projections. If the study is so novel that the effects of the factors on the response(s) are unknown, they should carefully select the independent variables, and the experimental space in which to study them, before undertaking the DoE and the subsequent RSM.

We suggest that future contributions to the supercritical extraction of solid matrices favor an approach based on the construction of extraction curves (e.g., cumulative extraction yield versus specific solvent consumption). This information will allow the application of models to obtain mass transfer (e.g., diffusion coefficient) and equilibrium (e.g., operational solubility) parameters, which are relevant for scaling the process and for understanding the extraction phenomenon. RSM is useful only when investigating the effect of an additional variable, such as the type of pretreatment. Researchers should avoid using variables such as temperature as RSM factors, since there are analytical expressions that describe the effect of these variables.

This study demonstrates the advantage of using T, ρ, and q instead of the very commonly used T, P, and td as independent variables in DoE and RSM studies on the SC-CO2 extraction from oilseeds. Based on the review and analysis of the literature, this work claims that response surface designs have been used excessively, and consequently, most of the articles that have used these designs have not provided new information to understand the extractive phenomenon, particularly in the extraction of vegetable oils with SC-CO2. For example, in the carotenoid extraction case study, the response surface was not able to predict extraction curves descriptive of the extraction process. Furthermore, in this case study, we propose the use of ρ and T as independent variables, instead of P and T, because they bypass confounding effects of solute solubility in CO2, and the specific consumption of CO2 is a more appropriate independent variable than time. Finally, the purpose of the case study was to exemplify the application of a sequential strategy of the RSM from a first-order model to a second-order one. This sequential experimentation strategy favored the determination of the effects of coextraction of oil and carotenoids, where the extraction of carotenoids from rosehip peels was favored when the proportion of rosehip seeds as cosubstrate increased.