Introduction

Soil water repellency (SWR) is a widespread phenomenon that occurs in forest and agricultural soils under various plant covers and climatic conditions (Doerr et al. 2000; Jordán et al. 2013). It refers to the reduced ability of soil to absorb water, leading to the formation of water-repellent zones (Doerr et al. 2000; González-Pérez et al. 2008). SWR has significant negative impacts on soil and plant health, as well as on hydrology and ecosystem functioning. One of the most significant impacts arises because water is not absorbed by the soil but instead runs off and carries soil particles with it, increasing surface runoff and soil erosion in highlands and causing waterlogging in lowland areas (Coelho et al. 2005; Urbanek et al. 2015; DeBano 2000a; Doerr and Thomas 2000; Smettem et al. 2021; Zheng et al. 2017). Moreover, SWR hinders plant growth and lowers crop yields by reducing water availability to plants (Doerr and Thomas 2000; Li et al. 2019; Müller et al. 2016). It also affects nutrient cycling, carbon storage, and microbial activity and function in soils, all of which have cascading effects on ecosystem services (Doerr et al. 2000; Schonsky et al. 2014). For example, microbial activity can be inhibited in hydrophobic soils, slowing nutrient cycling and organic matter decomposition (Muñoz-Rojas et al. 2012; Kraemer et al. 2019). This may in turn affect plant growth and soil health. In addition, the increased runoff and erosion caused by SWR contribute to the loss of nutrients locally and to contamination elsewhere (Kodešová et al. 2008). This also affects the quality of surface water and groundwater, as contaminants are carried away by runoff. Understanding the mechanisms and management of SWR is therefore an important area of research with broad implications for agriculture, forestry, environmental science, and urban development. By developing effective management strategies to mitigate the negative impacts of SWR, we can promote sustainable land use and protect the health and productivity of our ecosystems.

Although SWR is common in soils and has many detrimental effects on ecosystems, its causes are still not completely understood. Several studies have investigated the factors that contribute to SWR, including soil texture (Dekker and Ritsema 1996), soil particle-size distribution (Jiménez-Morillo et al. 2022), moisture content (Chau et al. 2014; Karunarathna et al. 2010), wildfires (DeBano 2000b; Arcenegui et al. 2007), microbial activity (White et al. 2017; Simpson et al. 2019), and soil organic carbon (SOC) (Mataix-Solera et al. 2007; Karunarathna et al. 2010; Hermansen et al. 2019; Balshaw et al. 2020; González-Pérez et al. 2004). Specifically, research has shown that the presence of certain bacterial species can contribute to SWR (Chai et al. 2022; Seaton et al. 2019; Song et al. 2019). The exact mechanism causing SWR thus remains a subject of debate, but an increasing number of studies consider hydrophobic organic compounds (i.e., SWR biomarkers), which are mainly derived from plant exudates, to be a critical factor (Jiménez-Pinilla et al. 2016; González-Peñaloza et al. 2013; Doerr et al. 2005). It has also been suggested that hydrophobic compounds produced by microbial activity or fire-induced changes in organic matter (OM) have a greater impact on SWR (Doerr et al. 2000; Jiménez-Morillo et al. 2022). To date, different extraction methods have been used to identify hydrophobic compounds, including n-alkanoic acids, n-alcohols, and n-alkanes (Franco et al. 2000; Horne and McIntosh 2000; Mao et al. 2014, 2015), alkenoic acids, α-alkenes, terpenes, and sterols (Bull et al. 2000; Franco et al. 2000; Lozano et al. 2013), and some complex biopolymers, e.g., cutin and suberin (Mao et al. 2014, 2015). However, the mechanisms by which hydrophobic substances interact with soil particles and alter SWR are still uncertain. Several theories have been proposed to explain this phenomenon, of which the most widely accepted is the “coating” or “film” theory. This theory contends that soil particles are covered in hydrophobic materials, which form aggregates that repel water (DeBano 2000a; Doerr et al. 2000). According to this theory, hydrophobic substances bind to soil particles through van der Waals forces, hydrogen bonding, or chemical reactions, forming a hydrophobic film that repels water (DeBano 2000a; Doerr et al. 2000; Doerr and Thomas 2000; Bayry et al. 2012; Mao et al. 2019; Smettem et al. 2021; Leighton-Boyce et al. 2007). Overall, the understanding of the mechanisms underlying SWR formation is still evolving, and further research is needed to fully elucidate this complex phenomenon.

Although the occurrence of SWR is usually associated with wildfires in natural ecosystems, severe SWR is not confined to burned sites, and the degree of repellency is often reported to vary with the dominant plant species. Most soils under evergreen trees such as Pinus, Eucalyptus and Quercus showed relatively high SWR (Rodríguez-Alleres and Benito 2011, 2012; Jordán et al. 2009, 2013; Lozano et al. 2013, 2014; Butzen et al. 2015; de Blas et al. 2010, 2013; Mao et al. 2014; Zavala et al. 2009; Walden et al. 2015), while soils under shrubs ranged from wettable to slightly water-repellent (Alanís et al. 2016; Mao et al. 2014) and all bare soil samples were wettable (Lozano et al. 2013; Jordán et al. 2009, 2013). Presumably, dominant plant species (DPS) belonging to different plant functional types (PFTs) supply different organic carbon (OC) inputs to the soil, including plant residues, root exudates, and decaying vegetation, which differentially alter the extent of SWR (Keesstra et al. 2016; Seaton et al. 2019). For instance, evergreen trees tend to produce more recalcitrant OC (e.g., high-molecular-weight bound lipids), contributing to the development of strong SWR (DeBano 2000a; Mao et al. 2014). In contrast, grasslands may enhance microbial activities that mitigate repellency because of their more labile organic inputs (e.g., low-molecular-weight free lipids) (Doerr et al. 2000; Mao et al. 2014; Jordán et al. 2013). Nevertheless, de Blas et al. (2013) studied the relationship of plant-derived extractable free lipids and bound lipids with the degree of SWR and obtained inconsistent results; the relationship between SWR and PFTs is therefore intricate and multifaceted. As SWR occurs in both topsoil and subsoil (Dekker and Ritsema 1994), both plant leaves and roots are expected to contribute to SWR. In particular, the leaves of evergreen trees may have a strong influence on surface SWR, thereby affecting water infiltration (Mao et al. 2015). It is therefore necessary to further investigate how the input and sequestration of source-specific SWR biomarkers modulate the effect of PFTs on SWR in artificial forest ecosystems.

The Chinese Loess Plateau (6.4 × 10^5 km2) is a semi-arid region and is considered one of the most severely eroded areas (mean soil loss rate: 2860 t km−2 year−1); it is characterized by an extremely complex soil-eroding catena (Wang et al. 2017). Loess is the most widely distributed soil on the Chinese Loess Plateau. However, little is known about the behavior of SWR in loess soils, which are the initial soils in erosive environments and have very low SOC content (Chai et al. 2022). In recent decades, with the accumulation of SOC following the world’s largest vegetation restoration project, “Grain for Green”, the SWR of loess has emerged as an important topic (Liu and Zhan 2019). Investigating the effects of cover PFTs on SWR in loess and clarifying the possible relationship between SWR and SWR biomarkers could have significant implications for developing effective strategies to manage soil water availability, particularly in arid and semi-arid regions where water is scarce.

To investigate the main reasons for SWR differences under different DPS (i.e., Pinus tabulaeformis Carr., Robinia pseudoacacia L., Hippophae rhamnoides L., Coronilla varia L., Agropyron cristatum (L.) Gaertn.) belonging to various PFTs (i.e., evergreen trees, deciduous trees, shrubs, legumes, grasses), we studied the SOM accrual and chemical composition of topsoils under/around these five DPS on the Chinese Loess Plateau. More importantly, we quantified the major leaf-derived (cutin) and root-derived (suberin) biomarkers, as well as biomarkers from other sources (short-chain fatty acids, long-chain fatty acids), to assess the relationship between the input and sequestration of source-specific groups of SWR biomarkers and SWR, thus improving our mechanistic understanding of SWR occurrence at the molecular level. Based on the SWR performance during extraction, the possible origin of the different chemical compositions, and the distinct PFTs among the five DPS, we hypothesized that (1) evergreen trees would be more likely to cause severe SWR than other PFTs; (2) all soils under/around different DPS would have a greater proportion of root-derived suberin than leaf-derived cutin, but leaf-derived cutin would be the main cause of SWR; and (3) the abundance and composition of SWR biomarkers would vary depending on PFTs, with evergreen trees providing more SWR-inducing cutin than deciduous trees, shrubs, legumes, and grasses.

Materials and methods

Sampling

This study was conducted in the Wangdonggou watershed of Changwu County, Shaanxi Province, China (35°13′N, 107°40′E; 1220 m a.s.l.). The study watershed has an area of 6.3 km2, and the total length of the main gully is 4.97 km (Wang et al. 2021). Wangdonggou is located in the typical loess hilly and gully region of the Chinese Loess Plateau, and the topography is highly fragmented. The region features a semi-arid continental climate, with an annual mean temperature of 9.1 °C. The annual mean precipitation is 568 mm and falls predominantly from July to August as short, heavy storms. The frost-free period is 194 days, lasting from early April to mid-October, and the potential evapotranspiration is 967 mm (Huang et al. 2003). There are four vegetation types in the watershed: two typical artificial forests and two herbaceous vegetation types. One artificial forest is dominated by Pinus tabulaeformis Carr. (PT), and the other is a Robinia pseudoacacia L.–Hippophae rhamnoides L. (RP-HR) forest; the two herbaceous vegetation types are dominated by Coronilla varia L. (CV) and Agropyron cristatum (L.) Gaertn. (AC), respectively. The functional types of these five dominant species are, in order, evergreen trees, deciduous trees, shrubs, legumes, and grasses. All soils were classified as typical silt loam (USDA soil texture classes), developed from aeolian loess parent materials. More details on soil texture are given in Chai et al. (2022). Except for the topsoil of PT and RP, most topsoil showed a light brownish or yellow–brown tint; the litter layer was almost absent because grass leaves decompose quickly, but debris of plant leaves and many fine roots were observed. The topsoil of PT and RP, however, had a dark grey or brownish hue and a 1–3 cm litter layer, although only a few roots were visible.

Locations in each of the four vegetation types were chosen for the collection of plant and soil samples in the watershed. The selected sampling locations are free from human disturbance and have comparable soil type, vegetation restoration years, elevation, and management practices, to minimize possible effects of non-experimental factors on SWR. The study region and sampling locations are summarized in Fig. 1 and Table 1. For each of the five dominant species, six stems were selected as replicates for taking plant samples and the soil samples under/around them. High variability of SWR was observed in our previous work (Chai et al. 2022); to avoid distance-dependent sampling of SWR and other measurements as far as possible, the six selected stems were not adjacent to one another. Under/around each stem of each species, three micro-sampling positions spaced more than 50 cm apart were established to reduce sampling variation. For trees and shrubs, the soil micro-sampling positions were located at an intermediate distance between the trunk and the edge of the canopy, where the microhabitat is most favorable for the degradation of plant litter by soil microorganisms (Moro et al. 1997). In contrast, micro-sampling positions of herbaceous plants were as close to the stem as possible to ensure that the withered litter in the collected soil samples was derived from these plants themselves. Additionally, in the RP-HR forest, the relatively low density of vegetation made it possible to sample at micro-sampling positions under each stem of each species without interference from the other species. In May 2021, after the topsoil was almost air-dried, soil samples were taken from the 0–3 cm depth of topsoil at micro-sampling positions under/around each of the five dominant plant species (n = 18 per species). The living plant leaf and root samples were taken separately from each plant species (n = 6 per species). Prior to further analysis, all collected soils were air-dried to constant weight at room temperature (20–25 °C) and sieved (2 mm) to remove coarse soil particles. All the plant samples were freeze-dried and stored in a dry place.

Fig. 1 Information for the study area (a; cited from Wang et al. 2016a) and sampling locations (b). PT, Pinus tabulaeformis Carr.; RP-HR, Robinia pseudoacacia L.–Hippophae rhamnoides L.; CV, Coronilla varia L.; AC, Agropyron cristatum (L.) Gaertn.

Table 1 The basic information for each sampling location

Soil characteristics and water repellency measurements

The soil organic matter (SOM) and total nitrogen (TN) were measured by the K2Cr2O7-H2SO4 oxidation method and the Kjeldahl method, respectively (Fu et al. 2011; Nelson and Sommers 1982; Bremner and Mulvaney 1982). The SOM value was converted to a total organic carbon (TOC) value using the established 1.724:1 SOM:TOC ratio.
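The conversion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the C/N ratio is taken here as the plain quotient TOC / TN of two values in the same units (g kg−1 soil).

```python
# Minimal sketch of the SOM -> TOC conversion and the C/N ratio.
# Function names are illustrative and not part of the original methods.

SOM_TO_TOC_FACTOR = 1.724  # SOM:TOC ratio stated in the text


def som_to_toc(som_g_per_kg: float) -> float:
    """Convert soil organic matter (g kg-1) to total organic carbon (g kg-1)."""
    return som_g_per_kg / SOM_TO_TOC_FACTOR


def c_to_n_ratio(toc_g_per_kg: float, tn_g_per_kg: float) -> float:
    """C/N ratio as the dimensionless quotient TOC / TN (same units for both)."""
    if tn_g_per_kg <= 0:
        raise ValueError("TN must be positive")
    return toc_g_per_kg / tn_g_per_kg
```

Note that a ratio computed from group means is not necessarily identical to the mean of per-sample ratios, so values obtained this way may differ slightly from reported means.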

To measure SWR, approximately 15 g of soil was placed in a 50-mm diameter aluminum box and equilibrated under controlled laboratory conditions (20 °C, ~ 50% relative humidity) for one week to eliminate the potential impacts of preceding atmospheric humidity on SWR. The SWR was measured using the water drop penetration time (WDPT) test. This simple test measures the average time taken for three 100 μl water droplets to penetrate into the soil. The method is widely known and accepted for assessing the persistence of SWR (Van’t Woudt 1959; Wessel 1988; Dekker and Ritsema 1994). The water-repellent classes were divided into five grades with different intervals according to Bisdom et al. (1993), with WDPT ≤ 5 s representing wettable and WDPT > 5 s water-repellent conditions. Following Lozano et al. (2013), to rate the SWR linearly, the logarithm of the WDPT value was used to determine whether a sample was water-repellent. Specifically, a sample was considered water-repellent if its log10 WDPT value was > 0.7 (Table 2). After the determination of the WDPT value, a series of samples under/around the same DPS with different log10 WDPT intervals were obtained for subsequent SWR biomarker extraction experiments. Finally, a total of 18, 12, 9, 9, and 6 soil samples were obtained from PT, RP, HR, CV, and AC, respectively. To compare the changes in SWR before and after extraction, the WDPT values were also measured on the air-dried soils after each extraction.
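The decision rule in this paragraph (average three droplet times, then compare log10 WDPT with 0.7) can be sketched as follows. The function names are hypothetical, and treating a penetration time of zero as wettable is an assumption made here only so that the logarithm is defined; the full five-class scale of Table 2 is not reproduced.

```python
import math

# Minimal sketch of the WDPT-based wettable / water-repellent decision.
# Names are illustrative; only the 0.7 threshold on log10 WDPT and the
# averaging over three droplets come from the text.

LOG10_WDPT_THRESHOLD = 0.7  # log10(5 s) is approximately 0.699


def mean_wdpt(droplet_times_s):
    """Average penetration time (s) over the droplets of one sample."""
    if not droplet_times_s:
        raise ValueError("at least one droplet time is required")
    return sum(droplet_times_s) / len(droplet_times_s)


def is_water_repellent(wdpt_s: float) -> bool:
    """Return True if log10(WDPT) exceeds the 0.7 threshold."""
    if wdpt_s <= 0:
        # Assumption: instantaneous penetration is treated as wettable.
        return False
    return math.log10(wdpt_s) > LOG10_WDPT_THRESHOLD


# Example with made-up droplet times (seconds) for a single sample
sample_wdpt = mean_wdpt([4.0, 5.0, 9.0])
sample_is_repellent = is_water_repellent(sample_wdpt)
```

Because log10(5) is slightly below 0.7, a sample with a mean WDPT of exactly 5 s is classified as wettable, in agreement with the WDPT ≤ 5 s criterion.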

Table 2 WDPT classes and class increments used in the present study

Soil and plant extraction procedures

To investigate SWR biomarkers, sequential extraction methods were applied to all the soil and plant samples. Lipids were extracted sequentially with dichloromethane/methanol (DCM/MeOH) (9:1 v:v) and isopropanol/ammonia solution (IPA/NH3) (7:3 v:v, 25–28% ammonia solution) (Bull et al. 2000; Mao et al. 2014).

To extract free lipids from the samples, following the method of Bull et al. (2000), each dried soil, leaf, and root sample was accurately weighed and extracted using a Soxhlet apparatus (AI-ZFCDY-6Z, Na ai Co., Ltd., Shanghai, China) containing DCM/MeOH (9:1 v:v) at 70 °C for 24 h to obtain the D fraction. The solvent was removed with a rotary evaporator (R-215, BUCHI Lab. AG, Flawil, Switzerland). After redissolving the lipids in the solvent, extracts were passed through an SPE column filled with anhydrous Na2SO4 (2000 mg, 6 ml) to remove any water, dried under a gentle stream of nitrogen, and then stored at −20 °C.

The residual soils were air-dried and extracted with IPA/NH3 (7:3 v:v) using the Soxhlet apparatus at 95 °C for 48 h. After extraction, the extract was likewise concentrated using the rotary evaporator. The soluble lipids (AS fraction) and the insoluble residues with ester bonds (AI fraction) were separated from the dried IPA/NH3 extracts with DCM/MeOH (9:1 v:v). All solutions were combined, and both this combined solution and the insoluble residue were dried under nitrogen and then stored at −20 °C.

Prior to analysis, all the D and AS fractions of the soils and DCM/MeOH extracts of the plants were methylated using 100 µl of (trimethylsilyl)diazomethane (TMS-CH2N2) at room temperature. The AI fractions and the lipid-free dried leaves and roots were depolymerized through trans-methylation using BF3–MeOH at 70 °C for 24 h (Riederer et al. 1993). Then, all the extracts were eluted over a small silica gel (100–200 mesh) column with ethyl acetate and were silylated using N,O-bis(trimethylsilyl)trifluoroacetamide (BSTFA) at 70 °C for 30 min.

Derivatized extracts were analyzed using a triple quadrupole gas chromatography-mass spectrometry (GC–MS) instrument (GCMS-TQ8050NX, Shimadzu Production Co., Kyoto, Japan) with a mass range of m/z 50–800. A 1 μl aliquot of the derivatized extract was injected onto an SH-Rxi-5Sil MS capillary column (Shimadzu; 30 m × 0.25 mm inner diameter × 0.25 μm film thickness), using helium at a constant flow rate of 1.0 ml min−1 as the carrier gas. The oven program started at an initial temperature of 70 °C, increased to 130 °C at 20 °C min−1, then increased from 130 °C to 320 °C at 4 °C min−1, and was finally held at 320 °C for 20 min.
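As a quick check of the oven program, the run time implied by the stated ramps and final hold can be computed as below. This is a sketch under the assumption that there is no initial hold at 70 °C, since none is stated; the function name is illustrative.

```python
# Minimal sketch: total GC oven run time from a start temperature,
# a list of (target temperature in deg C, ramp rate in deg C per min)
# segments, and a final hold in minutes. Assumes no initial hold.


def oven_run_time_min(start_temp_c, ramps, final_hold_min):
    total_min = 0.0
    current_temp = start_temp_c
    for target_temp, rate_c_per_min in ramps:
        if rate_c_per_min <= 0:
            raise ValueError("ramp rate must be positive")
        total_min += (target_temp - current_temp) / rate_c_per_min
        current_temp = target_temp
    return total_min + final_hold_min


# Program from the text: 70 -> 130 at 20 C/min, 130 -> 320 at 4 C/min, hold 20 min
run_time = oven_run_time_min(70, [(130, 20), (320, 4)], 20)
# 3.0 min + 47.5 min + 20 min = 70.5 min
```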

Individual compounds were identified from their mass spectra using NIST libraries, interpreted spectra, retention indexes, or comparison with literature data (Feng et al. 2010; Otto and Simpson 2007; Tamura and Tharayil 2014). Based on GC–MS analysis, the relative correction factors of the compound groups (alkanes, alcohols, fatty acids, ω-hydroxy fatty acids and α,ω-dicarboxylic acids) were rather similar and barely discriminated between the various types of compounds. Therefore, 2 μl of squalane (40.5 μg/ml in ethyl acetate) was added to the extracts as an internal standard, and the compounds, including possibly co-eluting ones, were quantified by peak area integration from GC–MS chromatograms using the following formulas.

$$c_{ir}\left(\mu\mathrm{g\,g^{-1}\,TOC}\right)=\frac{A_{i}\times c_{is}\times v_{is}}{A_{is}\times v_{i}\times m_{s}\times \mathrm{TOC}}\times f_{i'}\times \alpha$$
(1)
$$c_{ia}\left(\mu\mathrm{g\,g^{-1}\,soil}\right)=\frac{A_{i}\times c_{is}\times v_{is}}{A_{is}\times v_{i}\times m_{s}}\times f_{i'}$$
(2)

where \(c_{ir}\) = the relative concentration of the identified compound; \(c_{ia}\) = the absolute concentration of the identified compound; \(A_{i}\) = the peak area of the identified compound; \(v_{i}\) = the volume of the identified compound; \(v_{is}\) = the volume of the internal standard; \(A_{is}\) = the peak area of the internal standard; \(c_{is}\) = the concentration of the internal standard; \(m_{s}\) = the weight of the dried sample; \(f_{i'}\) = the relative correction factor, 1; \(\alpha\) = the conversion coefficient, 1000.
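Equations (1) and (2) can be written as two small functions, which also makes explicit that Eq. (1) equals Eq. (2) multiplied by α / TOC. This is a minimal sketch: the function and argument names are hypothetical, the code applies the formulas literally, and unit consistency of the inputs (for example, the same volume units for v_i and v_is) is assumed rather than checked.

```python
# Minimal sketch of Eqs. (1) and (2): internal-standard quantification.
# Argument names mirror the symbols in the equations; they are illustrative.


def absolute_concentration(a_i, a_is, c_is, v_is, v_i, m_s, f=1.0):
    """Eq. (2): concentration per unit mass of dried sample."""
    return (a_i * c_is * v_is) / (a_is * v_i * m_s) * f


def relative_concentration(a_i, a_is, c_is, v_is, v_i, m_s, toc,
                           f=1.0, alpha=1000.0):
    """Eq. (1): concentration normalized to TOC, with alpha = 1000."""
    return (a_i * c_is * v_is) / (a_is * v_i * m_s * toc) * f * alpha


# Example with made-up peak areas, volumes and sample mass (not measured data)
c_abs = absolute_concentration(a_i=2.0, a_is=1.0, c_is=40.5, v_is=2.0,
                               v_i=1.0, m_s=1.0)
c_rel = relative_concentration(a_i=2.0, a_is=1.0, c_is=40.5, v_is=2.0,
                               v_i=1.0, m_s=1.0, toc=48.01)
```

With the correction factor fixed at 1, the quantification reduces to scaling the internal standard amount by the peak area ratio, which is why similar response factors across compound groups matter for this approach.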

Statistical data analysis

All data were tested for normality and homogeneity of variance using SPSS Statistics 26 (IBM Corp., USA), and log-transformed when necessary. One-way ANOVA combined with the LSD test was used to evaluate the significance of differences in soil properties and SWR biomarkers among different DPS. Simple linear regression was performed to quantify the relationship between soil characteristics, SWR biomarkers, and SWR (i.e., the measured WDPT value). In this study, the quality parameters of compound groups were defined as the ratio of the relative concentrations of two different compound groups. In addition, functional compound groups were distinguished according to their compound classes (Table S1), i.e., long-chain fatty acids (LFA), short-chain fatty acids (SFA), cutin, or suberin.
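The regression step can be illustrated with a short, dependency-free Python sketch of an ordinary least-squares fit on log10-transformed values, returning slope, intercept, and r2. The original analysis was run in SPSS; the function names and the example values below are hypothetical and serve only to show the calculation.

```python
import math

# Minimal sketch of a simple linear regression on log10-transformed data.
# This is not the authors' SPSS workflow; names and data are illustrative.


def linear_fit(x, y):
    """Ordinary least squares for y = slope * x + intercept; returns r2 too."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y need equal length and at least 3 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = sxy ** 2 / (sxx * syy)
    return slope, intercept, r2


def log10_fit(predictor, wdpt_s):
    """Regress log10(WDPT) on log10(predictor), e.g. TOC or TN."""
    log_x = [math.log10(v) for v in predictor]
    log_y = [math.log10(v) for v in wdpt_s]
    return linear_fit(log_x, log_y)


# Made-up values chosen so that the fit is exact (slope 1, intercept -1)
slope, intercept, r2 = log10_fit([10.0, 100.0, 1000.0], [1.0, 10.0, 100.0])
```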

Results

Soil chemical properties and SWR

One-way ANOVA showed that TOC, TN, and C/N ratio differed significantly under/around different DPS. PT (48.01 g kg−1 soil) and RP (54.16 g kg−1 soil) had the highest TOC, followed by HR (38.53 g kg−1 soil) and CV (37.53 g kg−1 soil), and then AC (16.42 g kg−1 soil) (Fig. 2a). The highest TN was found in the soils under/around RP and CV, with mean values of 4.54 g kg−1 soil and 4.57 g kg−1 soil, respectively, while CV had the lowest C/N ratio, with a mean value of 8.22, which was 1.4–1.9 times lower than those of the soils under/around other DPS (Fig. 2b–c). According to the WDPT test, 78% of the measured samples were water-repellent (Fig. S1). More importantly, only the soils under PT had strong to extreme SWR, while most soils under/around the other DPS were classified as slightly water-repellent (Fig. S1).

Fig. 2 The soil TOC content, TN content, and C/N ratio under/around the different DPS (a–c). SWR (log10 WDPT (s)) as a function of log10 TOC, log10 TN, and log10 C/N ratio before and after DCM/MeOH and IPA/NH3 extractions (d–f). TOC, total organic carbon; TN, total nitrogen; C/N, TOC:TN ratio. Error bars in (a–c) represent standard deviations of the mean values (n = 18, 12, 9, 9, 6) with lowercase letters above bars indicating significant differences among plant species treatments in specific fractions at p < 0.05

Before extraction, log10 TOC showed a strong positive linear correlation with SWR (r2 = 0.65, p < 0.001) (Fig. 2d). Moreover, log10 TN (r2 = 0.31, p < 0.001) and log10 C/N ratio (r2 = 0.20, p < 0.001) showed positive but weak correlations with SWR (Fig. 2e–f). After DCM/MeOH extraction, SWR increased in most of the soils (p < 0.001) (Fig. S2), whereas the average TOC content (p < 0.001) and TN content (p < 0.01) of all soils decreased by 3.68% and 2.49%, respectively (Fig. S2). The average C/N ratio also decreased, but the difference was not significant (p > 0.05) (Fig. S2). A strong positive correlation was again found between log10 TOC and SWR (r2 = 0.66, p < 0.001) (Fig. 2d), while log10 TN (r2 = 0.40, p < 0.001) and log10 C/N ratio (r2 = 0.11, p < 0.05) also showed weak positive linear correlations with SWR (Fig. 2e–f). After IPA/NH3 extraction, the average TOC content and TN content of all soils decreased further, by 4.62% and 4.50%, respectively (p < 0.001), and the average SWR class of all soils was reduced to ‘wettable’ (p < 0.001) (Fig. S2). Meanwhile, log10 TOC (r2 = 0.16, p < 0.01) and log10 TN (r2 = 0.08, p < 0.05) showed positive but weak correlations with SWR (Fig. 2d–e), whereas log10 C/N ratio was not significantly correlated with SWR (p > 0.05) (Fig. 2f).

Biomarker groups and SWR

Soil biomarker groups

To obtain a more general view of the relationship between certain compounds and SWR, we analyzed compound groups (i.e., the sum of all compounds of the same class in the three extraction fractions) (Table S1). As for extractable plant-derived biomarkers, the relative abundance of LFA was on average 52.17% higher in the soils under HR than in the soils under PT (p < 0.01) (Fig. 3b). Cutin, primarily derived from leaf tissues, followed a trend contrary to that of LFA: the soils under PT had 35.39–78.77% higher cutin concentrations than those under/around other DPS (p < 0.05) (Fig. 3c). Additionally, the relative abundances of root-derived suberin under evergreen trees, deciduous trees, and shrubs (i.e., PT, RP and HR) were, on average, 56.77%, 19.55% and 62.69% lower, respectively, than those in the soils around the legumes and grasses (i.e., CV and AC) (p < 0.05) (Fig. 3d). To trace the microbial origin of TOC, we calculated the concentration of SFA. The lowest SFA concentrations (402.25 μg g−1 TOC) were found in the soils under PT as compared to RP, HR, CV and AC (p < 0.05) (Fig. 3a). Overall, the relative concentrations were highest for SFA in the soils under shrubs (HR) and legumes (AC), for LFA in the soils under shrubs (HR), for cutin in the soils under evergreen trees (PT), and for suberin in the soils under legumes (CV) and grasses (AC) (p < 0.05) (Fig. 3a–d).

Fig. 3 The total relative concentrations (sum of D, AS, and AI fractions) of SFA, LFA, cutin, and suberin under/around the different DPS. a Short-chain fatty acids (C16–C18 n-alkanoic and n-alkenoic acids). b Long-chain fatty acids (> C24 alkanes, > C22 n-alkanoic acids, and alkanols). c Cutin (C14–C18 ω-hydroxyalkanoic acids). d Suberin (α,ω-dicarboxylic acids [C16–C24; saturated and substituted] and ω-hydroxyalkanoic acids [C20–C30; saturated and substituted]). Error bars are ± SD of the mean (n = 18, 12, 9, 9, 6, corresponding to the number of soil samples from PT, RP, HR, CV, and AC) with lowercase letters above bars indicating significant differences between DPS at p < 0.05

Plant biomarker groups

For the group concentrations of SFA, LFA, and cutin in leaves (Fig. 4a–c), the SFA concentrations of AC were on average 41.15% higher than those of other DPS, and the LFA concentrations of RP were on average 61.67% higher than those of other DPS. More importantly, PT had the highest cutin concentration (2.16 mg g−1 dried leaf), which was 10–60 times higher than those of other DPS. In contrast, the highest concentrations of SFA and LFA were found in the roots of CV, followed by AC, and then HR, RP, and PT (Fig. 4d–e). Moreover, the suberin concentrations in the roots of HR were on average 36.13% higher than those of other DPS (Fig. 4f).

Fig. 4 The group concentrations of SFA, LFA, and cutin of leaves (a–c) and the relative abundances of SFA, LFA, and suberin of roots (d–f). Error bars represent standard deviations of the mean values (n = 6) with lowercase letters above bars indicating significant differences among DPS at p < 0.05

Relations between groups of soil–plant linked biomarkers and SWR

Linear regression analysis revealed that cutin was positively related to SWR (r2 = 0.24, p < 0.001), whereas SFA (r2 = 0.31, p < 0.001) and LFA (r2 = 0.17, p < 0.01) were negatively related to SWR (Fig. 5a–c). However, no significant correlation was found between suberin and SWR (p > 0.05) (Fig. 5d). To understand whether the quality parameters of biomarkers can describe SWR, the correlation between the ratio of two separate compound groups and SWR was analyzed. For all soils, cutin was essential for a significant relationship: when cutin was the numerator, the ratio of the two compound groups was positively correlated with SWR; otherwise, no significant relationship was observed (Fig. 5e–j).
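The quality parameters used here are simply pairwise ratios of the four compound groups, which gives six distinct pairs, matching the six panels of Fig. 5e–j. A minimal sketch of how such ratios could be enumerated is given below; the ordering of the groups (cutin first, so that it is always a numerator when present), the function name, and the example concentrations are assumptions for illustration only.

```python
from itertools import combinations

# Minimal sketch: pairwise ratios ("quality parameters") of compound groups.
# The group order is an assumption; with cutin listed first, cutin is the
# numerator in every ratio that contains it.
GROUPS = ("cutin", "suberin", "LFA", "SFA")


def quality_ratios(relative_conc):
    """Return {"num/den": value} for all six unordered pairs of groups."""
    return {
        f"{num}/{den}": relative_conc[num] / relative_conc[den]
        for num, den in combinations(GROUPS, 2)
    }


# Made-up relative concentrations (ug g-1 TOC) for a single soil sample
example = {"cutin": 200.0, "suberin": 100.0, "LFA": 50.0, "SFA": 400.0}
ratios = quality_ratios(example)
```

Each ratio, computed per sample, can then be log10-transformed and regressed against log10 WDPT in the same way as the individual compound groups.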

Fig. 5 Correlations between relative concentrations (log10 (µg g−1 TOC)) of biomarker groups and SWR (log10 WDPT (s)) in all soils (a–d), and correlations of the ratio of the two separate compound groups and SWR (log10 WDPT (s)) based on the quality factor (e–j), n = 54

Discussion

High OC input induces water repellency in surface soils

By supplying additional OC inputs to the soil, the Grain for Green Program has long been acknowledged to play a significant role in preserving or boosting soil C sequestration (Wang et al. 2016b). The higher cumulative C inputs in all sampling locations may be primarily attributed to the program's favorable effect on SOC accumulation. Because SOC introduces hydrophobic materials into the soil, the high SOC content may be one of the causes of the high SWR in the examined soils (Zavala et al. 2009). According to multiple studies (Scott 2000; Harper et al. 2000; Mataix-Solera and Doerr 2004; Mataix-Solera et al. 2007; Doerr et al. 2005; Lozano et al. 2013; Hermansen et al. 2019; Mao et al. 2019; Wang et al. 2016a; Seaton et al. 2019), there is a positive relationship between SOC and SWR, particularly when samples are taken from the same soil type and under or near the same DPS. In the present study, we found that some soil characteristics were highly correlated with SWR, in particular TOC. However, the very strong relation between TOC and SWR determined by WDPT in our study is not supported by other studies (Horne and McIntosh 2000; de Blas et al. 2010; Dekker and Ritsema 1994; Doerr et al. 2005). This discrepancy may arise primarily because the soils used in other studies came from different soil types and geographical locations, with different effects on SWR, whereas the loess soils we used were all from the same area. The positive correlation between TOC and SWR suggests that SOC contributes substantially to SWR. We also found different water-repellency classes under PT and RP (Fig. S1), despite the similar TOC content of the soils under them (Fig. 2a). This inconsistency has been attributed to the fact that SWR can be controlled by the type and quality of SOM rather than by its amount (Wallis and Horne 1992; Dekker and Ritsema 1994; de Blas et al. 2010).

TN was positively but weakly correlated with SWR, as found earlier by Lachacz et al. (2009). This is probably related to the ability of legumes to fix atmospheric N: high-C and high-N litter inputs may shift the stoichiometry of substrates, accelerating microbial mineralization and increasing the yield of polysaccharides, thereby enhancing SWR. The C/N ratios were also positively but weakly correlated with SWR, which is consistent with the findings of Lachacz et al. (2009) and Mao et al. (2014). A high C/N ratio implies a recalcitrant material that can only be decomposed by fungi (Holtkamp et al. 2011), which may contribute to more severe SWR than labile organic material (Schindler et al. 2007). Regarding the interrelation between soil parameters, the decomposition of TOC in soils likely causes the change in C/N ratio, which leads to the correlation between C/N ratio and SWR. Consequently, TOC can be considered an important soil parameter for predicting the SWR levels of loess after the return of farmland to forest or grassland.

Leaf-derived cutin contributes to surface SWR

In the present work, sequential extraction methods have been used to identify SWR biomarkers, including n-alkanoic acids, n-alkanols, n-alkanes, alkenoic acids, and some monomers derived from cutin and suberin, which are all hydrophobic compounds (Franco et al. 2000; Horne and McIntosh 2000; Mao et al. 2014, 2015; Bull et al. 2000; Lozano et al. 2013); however, the severity of SWR is not related to all the constituents of SOC, or the effectiveness of individual organic compounds on SWR decreases with increasing concentration (Doerr et al. 2005), thus we analyzed the possible relations between compound groups and SWR. The identified compounds in the D fraction were mainly the extractable free lipids, while the AS fraction and AI fraction is a combination of free lipids and ester lipids hydrolyzed by microbes and BF3–MeOH, which agreed with some previous studies (Van Bergen et al. 1997; DeBano 2000b; Franco et al. 2000; Mao et al. 2014). In terms of the extent to which SWR was represented during the extraction sequence, the SWR of most soils increased after DCM/MeOH extractions (p < 0.001), while the SWR significantly reduced after IPA/NH3 extraction (p < 0.001) (Fig. S2). These results indicated that it is ester-bound lipids, not free lipids, that play a major role in causing SWR. Although no clear evidence for such molecular behavior has been given, this paradox can be explained by a certain degree of mobility of free lipids. Due to the longer chain and less mobility of ester-bound lipids, they may be more hydrophobic than free, low-molecular-weight lipids. DCM/MeOH solvent, by extracting more lipid material, leaves more hydrophobic adsorption sites on the surface. Once the free lipids were removed, the adsorbed extent of ester-bound lipids onto high-affinity hydrophobic sites increased and as a consequence of that SWR increased as well. 
Because ester-bound lipids exist as polyfunctional macromolecules, they may change their structural conformations through interactions involving the carboxyl and alcoholic hydroxyl functional groups. These interactions are disturbed when polar compounds are extracted by the IPA/NH3 solvent, causing conformational disruption and hence desorption of polar compounds and exposure of polar adsorption sites on the surface, resulting in a wettable soil (Atanassova and Doerr 2010; Riederer et al. 1993; Mao et al. 2019). Thus, SWR increased in all the soils after DCM/MeOH solvent extraction and was eliminated by IPA/NH3. Although the IPA/NH3 solvent extracted fewer free lipids and more ester-bound lipids than the DCM/MeOH solvent, it was more suitable for extracting the compounds associated with SWR elimination.

We separately analyzed the absolute and relative concentrations of compound groups in all extraction fractions. For all soils, the absolute concentrations of most compound groups were positively associated with SWR (Fig. S4). This is supported by Mao et al. (2015), who also noted that the positive relationships between the absolute concentrations of all the compound groups and SWR follow the positive correlations between TOC and SWR. Because the relationships between absolute concentrations of compound groups and SWR are determined by the TOC content, we cannot identify which compound group affects SWR by analyzing absolute concentrations. Regarding the relative concentrations of SWR biomarker groups, PT with higher WDPT values had 59.70% higher cutin concentrations than those under/around other DPS (Fig. 3c). Cutin was the only group to show a positive relation with SWR (p < 0.001), while the LFA and SFA groups showed negative relations with SWR for all soils (p < 0.01) (Fig. 5a–c). No significant correlation was found between suberin and SWR (p > 0.05) (Fig. 5d). Similar results have been reported in many previous studies (Mao et al. 2014; Atanassova and Doerr 2010; Hansel et al. 2008), which showed that differences in SWR cannot be attributed solely to a single factor or group, but also depend on the interaction of multiple compound groups. In our study, we tried to explain which SWR biomarker groups are the most relevant to the development of SWR in loess. Our correlation data for the ratios of two separate compound groups and SWR revealed that when cutin was the numerator, the ratio was always positively correlated with SWR, whereas the other ratios showed no significant relationship with SWR (Fig. 5e–j). The positive correlations of cutin / LFA (r² = 0.27, p < 0.001) and cutin / SFA (r² = 0.30, p < 0.001) with SWR were the strongest (Fig. 5e and f). 
This finding might be mainly due to the increased adsorption of cutin onto high-affinity hydrophobic sites on the soil surface after more D fraction compounds were extracted, resulting in greater SWR. According to the above results, SWR occurring on the loess surface seems to be most influenced by cutin-derived C, and cutin / FA is considered an important soil quality parameter for predicting SWR levels on the loess surface (Fig. 6).
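The ratio analysis described above can be sketched as follows. This is only an illustration under stated assumptions, not the procedure used in this study: the concentration and WDPT values are invented, the use of log10-transformed WDPT as the SWR measure is an assumption, and the helper names `group_ratio` and `pearson_r` are hypothetical. The sketch forms the ratio of two relative concentrations per sample and reports the Pearson r and r² between that ratio and SWR.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def group_ratio(numerator, denominator):
    """Per-sample ratio of two compound-group concentrations."""
    return [a / b for a, b in zip(numerator, denominator)]


# Hypothetical relative concentrations (% of identified compounds) for five
# soil samples; these numbers are NOT data from the study.
cutin = [2.0, 4.0, 6.0, 8.0, 10.0]
lfa = [40.0, 35.0, 30.0, 25.0, 20.0]
# Hypothetical log10(WDPT in seconds); the transformation is an assumption.
log_wdpt = [0.5, 1.0, 1.4, 2.1, 2.6]

ratio = group_ratio(cutin, lfa)
r = pearson_r(ratio, log_wdpt)
print(f"cutin/LFA vs log10(WDPT): r = {r:.2f}, r^2 = {r * r:.2f}")
```

Using a ratio of relative concentrations rather than absolute concentrations keeps the comparison from being dominated by TOC content, which is the concern raised for the absolute concentrations.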

Fig. 6 Conceptual framework of the effect of biomarker groups on SWR

Cover PFTs-induced changes of the abundance and composition of hydrophobic compounds maintain SWR

Cover plant types may largely alter the content and composition of SOC by providing additional OC residues (Van Bergen et al. 1997; Nierop 2001; Kögel-Knabner 2002; Zhang et al. 2022). In the present work, the main groups of extractable and ester-bound lipids present in the leaves and roots were all identified in the soils under/around the given plant species, which suggests that the cover plants are the main sources of the SWR biomarkers (Nierop et al. 2003; Naafs et al. 2004). In line with previous studies (Naafs et al. 2004; Nierop et al. 2006; Zhang et al. 2022), the predominance of even-over-odd numbered LFA (> C24 alkanes, > C22 n-alkanoic acids and alkanols) (Fig. S5) indicated that the primary sources are plants (Jansen et al. 2006; Zhang et al. 2022). By contrast, SFA (C16–C18 n-alkanoic and n-alkenoic acids), which were identified in all of our topsoil and plant samples, are mostly microbial-derived biomarkers. Their detection in plant samples is possibly due to endophytic microorganisms in plants (Jaffé et al. 1996; Jansen et al. 2006; Zhang et al. 2022). Cutin and suberin in the AS fractions may come from microbially hydrolyzed ester lipids, while those in the AI fractions are polymers hydrolyzed by BF3–MeOH from the leaves/needles and roots of plants, respectively (Riederer et al. 1993; Mao et al. 2014). Therefore, the role of microorganisms in the hydrolysis of the biopolymers cutin and suberin in the natural environment should not be underestimated. In addition, the relationship between SWR and other microbial-derived C, such as ergosterol and glomalin-related soil protein, was not considered in our work. These carbon sources in soil may be produced by filamentous fungi (Feeney et al. 2004; Wessels 1996, 2000; White et al. 2017; Rillig et al. 2010; Bayry et al. 2012; Lozano et al. 2013), which directly affect SWR. 
However, our research emphasized only the influence of the microbial-derived SFA biomarkers on SWR; the relationship between other microbial-derived biomarkers in loess and SWR may need further validation.

From the above analysis, the ester-bound lipid biomarkers obtained after IPA/NH3 extraction represent the cutin- and suberin-derived compounds in the plant leaves/needles and roots, respectively. Differences in suberin concentrations were observed: soils under legumes, grasses, and shrubs were richer in suberin than those under evergreen and deciduous trees (Fig. 3d). This may be primarily because plants with shallow, horizontally extended roots accumulate suberin in topsoil more readily than deep-rooted plants. In our work, the small amounts of suberin under deep-rooted plants may derive from grasses supplying suberin to the topsoil. Another source could be bark, which also contains suberin, although it contributes less to the soil than roots (Preston et al. 1994). Evergreen trees with higher WDPT values had higher cutin concentrations in leaves and soils than deciduous trees, legumes, grasses, and shrubs (Fig. 3c; Fig. 4c). These results are in line with Rodríguez-Alleres and Benito (2011) and Badía et al. (2013), who both showed that the SWR persistence of grassland soil was much lower than that of forest soil. Seaton et al. (2019) and Smettem et al. (2021) both concluded that some deep-rooted plants can take advantage of SWR under drought stress, leading to the emergence of co-evolutionary behavior in natural ecosystems. This is consistent with previous studies, which showed that SWR not only allows rainwater to form preferential flow, penetrate into deep soil, and be stored in large quantities (De Boeck and Verbeeck 2011; Zeppenfeld et al. 2017; Alanís et al. 2016; Lozano et al. 2013), but also reduces soil water evapotranspiration loss through various mechanisms, making deep-rooted plants more resistant to drought than shallow-rooted plants (Doerr et al. 2006; Rye and Smettem 2018; Shahidzadeh-Bonn et al. 2007; Gupta et al. 2015). 
It can be inferred that deep-rooted evergreen trees tend to accumulate cutin rather than suberin on the loess surface, forming a hydrophobic layer, and are hence more likely to cause SWR than other PFTs.

Conclusions

This study presents the first molecular investigation of the relationship between SOM composition and SWR on the Chinese Loess Plateau, an extremely complex, erosion-prone region. Since SWR increased after DCM/MeOH extraction and almost disappeared after IPA/NH3 extraction, we speculate that the ester-bound lipids obtained after IPA/NH3 extraction were stronger SWR markers than free lipids. Our GC–MS data showed that the cutin concentrations in leaves and soils of evergreen trees with higher WDPT values were 10–60 times and 35.39–78.77% higher than those of other PFTs, respectively. However, the root-derived suberin in the soils under evergreen trees, deciduous trees, and shrubs was 19.55–62.69% lower than that around the legumes and grasses. Therefore, leaf-derived cutin appears to have the greatest effect on SWR, which is also supported by the results of the correlation analysis. In addition, the correlations between the ratios of two separate compound groups and SWR indicated that the combination cutin / FA was more strongly correlated with SWR, and it is thus considered an important soil parameter for predicting SWR. These results shed new light on how the differentiation of cover PFTs alters the abundance and composition of hydrophobic compounds in soils, thereby affecting SWR levels in a semi-arid region of the Chinese Loess Plateau. Furthermore, this study highlights the important role of plant-soil interactions in mediating the high variability of SWR in typical loess soils, which could have significant implications for developing effective strategies to manage soil water availability, particularly in arid and semi-arid regions where water is scarce.