Introduction

Plants in a community need to overcome the barriers imposed by abiotic and biotic factors to survive (Cornwell and Ackerly 2009). To understand the ecological processes that sustain plant species diversity (Mckane et al. 2002) and patterns of vegetation structure (Grime and Pierce 2012), it is necessary to recognize the essential resources used by plants in their habitats (Higgins et al. 2011). The coexistence of many plant species in communities where environmental filters shape the composition and structure of vegetation may indicate the presence of niche partitioning, which demonstrates diversification in resource use by plants (Higgins et al. 2011; Rossatto et al. 2014). Variations in soil properties are strong environmental filters that influence the composition of plant communities (Langan et al. 2017). This is because of the differential responses of plant species to variations in soil texture (Ruggiero et al. 2002; Medinski et al. 2010; Huang and Hartemink 2020), amount of available water (Medinski et al. 2010; Ribeiro et al. 2021), nutrient availability (Mota et al. 2018; Souza et al. 2021; Amaral et al. 2022), and soil pH (Medinski et al. 2010; Viani et al. 2014; Amaral et al. 2022).

Veredas (Brazilian savanna palm swamps) are one of the habitats found in the Cerrado (Brazilian savanna), and are characterized by wet, poorly drained soils that are permanently waterlogged or seasonally waterlogged during the rainy season (Durigan et al. 2022). They are commonly found in shallow valleys or flat terrains, where they may occur in poorly defined drainage lines, near springs, or bordering gallery forests (Ribeiro and Walter 2008). The fluctuation of the water table height is important for both the edaphic factors and the floristic composition of the community (Nunes et al. 2022; Nogueira et al. 2022; Horák-Terra et al. 2022). Veredas can generally be subdivided into three zones along their moisture gradient: border, middle, and core (Eiten 1994). The border is surrounded by slightly elevated areas with dry and well-drained soils and typical savanna (cerrado) vegetation; from there, the vegetation progressively transitions from typical cerrado on the drier upland towards the seasonally wet grassland on the lower region of the valley, which is predominantly composed of a continuous herbaceous and shrub stratum. The middle is periodically saturated with water and presents herbaceous and subshrub vegetation. The core is the lowest region of the terrain; it is composed of a permanently wet grassland (herbaceous and shrub stratum) and a strip of buritis (Mauritia flexuosa L.f.) with an occasional shrub layer underneath them. The transitions between the zones can be sudden or gradual. The soils of the veredas are hydromorphic, classified as Histosols and Gleysols (EMBRAPA 2006; Ramos et al. 2006, 2014). These soils are generally moderately acidic, with high levels of aluminum, organic matter, and iron and low levels of phosphorus, magnesium, and calcium (Ramos et al. 2014). Because of the characteristic topography and soil drainage of the veredas, the soil moisture gradient is correlated with differences in chemical properties (Ramos et al. 2006, 2014), thus favoring the occurrence of microhabitats, leading to variations in flora composition and richness at the same location (Araújo et al. 2002; Oliveira et al. 2009; Resende et al. 2013). These differences may be associated with water table level seasonality, distinct sediment deposition, or anthropogenic interference (Oliveira et al. 2009; Rosolen et al. 2015; Arantes et al. 2019).

Veredas play an essential role in the ecosystem by conserving the hydrological system (Queiroz 2015) and contributing to biodiversity by harboring several taxonomic groups and endemic species (Sousa et al. 2013; Bijos et al. 2017; Nogueira et al. 2022). They also provide many ecosystem services, such as the storage of large amounts of carbon in the soil (Sousa et al. 2015), water supply (Lima and Silva 2008), support for traditional farming families (Schmidt et al. 2011), and raw materials for crafts, fibers, and fruits for traditional communities (Sampaio et al. 2008). The herbaceous and shrub flora of veredas is distinct from that of other surrounding Cerrado vegetation types (Silva et al. 2018). Nevertheless, floristic heterogeneity occurs both within (Araújo et al. 2002) and between veredas from different localities (Resende et al. 2013; Bijos et al. 2017; Nogueira et al. 2022). When comparing veredas from different areas, the heterogeneity is high, as the communities exhibit a large proportion of exclusive herbaceous and shrub species despite the differences in their diversity and floristic composition (Bijos et al. 2017; Silva et al. 2018; Nogueira et al. 2022).

Although some studies have evaluated the flora (Araújo et al. 2002; Bijos et al. 2017; Silva et al. 2018; Arantes et al. 2019) and soil properties of veredas (Ramos et al. 2006, 2014; Sousa et al. 2015; Nogueira et al. 2022; Nunes et al. 2022), there is a knowledge gap regarding the correlations between the distribution, composition, and diversity of herbaceous and shrub species, and soil properties of these communities at a regional scale. Research on other Cerrado formations has demonstrated that at regional or local scales several factors, such as the presence of soil aggregates, nutrients, amount of exchangeable aluminum, water availability, and soil drainage, can directly or indirectly influence plant diversity and habitat heterogeneity (Ribeiro et al. 2021; Lira-Martins et al. 2022). In addition, soil texture (Lira-Martins et al. 2022), water table depth (Ribeiro et al. 2021), topographical relief (Oliveira-Filho et al. 1989), fire frequency (Veenendaal et al. 2018), and anthropogenic disturbance (Ribeiro and Walter 2008; Veenendaal et al. 2018) also affect the floristic composition of plant communities.

To understand the species distribution patterns of veredas and determine the main soil properties driving the formation of these communities, we investigated the relationships between the distribution and composition of herbaceous and shrub species and soil properties in the veredas of Central Brazil, covering more than 7° of latitude and 9° of longitude. We addressed the following questions: (1) Do herbaceous and shrub species composition and diversity vary significantly among veredas? (2) Are the composition and diversity of plants in veredas related to soil properties? (3) Which soil variables most strongly influence the species composition and diversity of plants in veredas on a regional scale? We hypothesized that physicochemical soil properties exhibit a significant relationship with the plant species composition and diversity of veredas. Since the soils of veredas have low phosphorus content and pH, but high organic matter content (Ramos et al. 2014), we also hypothesized that phosphorus and pH are positively correlated with the floristic diversity of veredas, whereas organic matter content is negatively correlated.

Materials and methods

Study area

The present study was conducted in 21 veredas from national parks or preserved private lands of Central Brazil. Veredas were distributed in six Brazilian states, ranging between 10–17° S and 43–52° W (Fig. 1). The phytosociological sampling data of nine veredas were published by Bijos et al. (2017), soil and phytosociological data of three veredas were published by Nogueira et al. (2022), and nine were included in the present study. According to the Köppen classification system, the climate of the study areas was Aw, which is described as tropical seasonal with a rainy summer (October to April) and a dry winter (May to September) (Alvares et al. 2013). The soil types found in the study areas are classified as Gleysols (Ramos et al. 2014), characterized as hydromorphic and poorly drained, which contain poorly decomposed organic material under a grayish layer (Reatto et al. 2008). The predominant soil classes in the regions of the veredas include dystrophic Red Ferralsols, dystrophic Leptosols, dystrophic Haplic Cambisols, and Orthic Arenosols (Santos et al. 2011) (Table S1). Study sites were located over four hydrographic regions: Tocantins-Araguaia, Paraná, São Francisco, and Atlântico Leste (National Water and Sanitation Agency, ANA 2021) (Table S2).

Fig. 1 Map of the 21 vereda study sites and a photo of the vereda GO7. RPPN = Private Natural Heritage Reserve, DF = Distrito Federal, GO = Goiás, MG = Minas Gerais, BA = Bahia, TO = Tocantins, and MT = Mato Grosso

Vegetation and soil sampling

Vegetation sampling was conducted during the rainy season (November to May), as the chances of sampling a larger number of species in their reproductive stage are greater in this period (Munhoz and Felfili 2006). In each vereda, sampling of herbaceous, subshrub, and shrub species was conducted in 15 transects of 10 m each (sample units), from the M. flexuosa line to the border of the vereda with the cerrado. The distribution of the transects was randomly stratified, with five transects in each of the three previously described zones of the veredas: border, middle, and core. To determine the horizontal linear cover and species composition, we applied the line intersection method (Canfield 1941; Munhoz and Araújo 2011). Using a millimeter tape, we recorded the length of all herbaceous, subshrub, and shrub individuals that touched or crossed the tape, covering the base and clump of plants, or perpendicular projections of leaves intercepted by the line (Cummings and Smith 2000). We calculated the absolute cover (AC) and relative cover (RC), according to the methods described by Munhoz and Felfili (2006). We considered the AC value to be the sum of species horizontal projections on the sampling units. To determine the RC of each species in each vereda, we divided the AC value of the species by the sum of the AC of all species and multiplied it by 100. Species were identified by comparison with herbarium (University of Brasília, UB) specimens, consultation with experts, and specialized literature, including the ‘Flora e Funga do Brasil’ platform and speciesLink. The collected botanical material was deposited in the UB herbarium and the Brazilian Institute of Geography and Statistics (IBGE) herbarium. The botanical nomenclature of the families was determined according to APG IV (Angiosperm Phylogeny Group IV) and that of the species according to the project ‘Flora e Funga do Brasil’ 2022 (Flora e Funga do Brasil 2022).
The species, family, and author names were checked using the ‘flora’ package (Carvalho 2020) of the R program (version 4.2.2; R Core Team 2022), which contains all the accepted botanical names and synonyms available in ‘Flora e Funga do Brasil’ (2020). Although M. flexuosa, a characteristic palm tree of veredas, was present in all the study sites, we did not include this species and trees in our analysis because their cover represents less than 10% of the vegetation, and in the middle and border zones they are absent (Ribeiro and Walter 2008).
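The cover calculation described above can be illustrated with a minimal sketch. It is written in Python for illustration only (the analyses in this study were run in R); the function name `cover_table` and the intercept lengths are hypothetical and are not data from the study.

```python
from collections import defaultdict


def cover_table(intercepts):
    """Return {species: (AC, RC)} from (species, intercept_length) records.

    AC is the summed horizontal projection of a species along the transects;
    RC is that species' share of the summed AC of all species, in percent.
    """
    absolute = defaultdict(float)
    for species, length in intercepts:
        absolute[species] += length
    total = sum(absolute.values())
    return {sp: (ac, 100.0 * ac / total) for sp, ac in absolute.items()}


# Hypothetical intercept lengths (cm) recorded along the tape
records = [
    ("Paspalum lineare", 120.0),
    ("Rhynchospora globosa", 60.0),
    ("Paspalum lineare", 30.0),
    ("Trachypogon spicatus", 90.0),
]
table = cover_table(records)
for species, (ac, rc) in sorted(table.items()):
    print(f"{species}: AC = {ac:.1f} cm, RC = {rc:.1f}%")
```

Because RC is a proportion of the total cover, the RC values of all species in a vereda sum to 100%.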

We collected soil samples in each 10 m transect from a depth of 0–20 cm. Samples were collected 1 m from the center point of each sampling unit. The physical and chemical properties of the soil were analyzed to determine the relative contents of sand, clay, and silt; pH (CaCl2 0.01 mol L−1); aluminum (Al3+); calcium (Ca2+); magnesium (Mg2+); potassium (K+); phosphorus (P); organic matter (OM); cation saturation (V); and cation exchange capacity (CEC). Soil texture (sand, clay, and silt content) was determined using the densimetric method of Bouyoucos, and the percentage of organic matter was quantified using the Walkley–Black method (% organic C × 1.724). Cations Ca2+, Mg2+, and Al3+ were extracted using a 1 mol L−1 KCl solution, and K+ and P were extracted with Mehlich-1 solution (0.0125 mol L−1 H2SO4 + 0.05 mol L−1 HCl). Potential cation exchange capacity (CEC) was estimated as $\mathrm{CEC} = \mathrm{Ca^{2+}} + \mathrm{Mg^{2+}} + \mathrm{K^{+}} + \mathrm{H^{+}} + \mathrm{Al^{3+}}$, and cation saturation (V) as $V = [(\mathrm{Ca^{2+}} + \mathrm{Mg^{2+}} + \mathrm{K^{+}})/\mathrm{CEC}] \times 100$. Soil physicochemical properties were analyzed according to the Embrapa protocol (2017).
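As a worked illustration of the two soil indices defined above, the following Python sketch computes CEC and V from exchangeable cation values. The function name and the input values are hypothetical, and the sketch assumes that all cations have already been expressed in the same charge-based unit (for example, cmolc dm−3).

```python
def cec_and_saturation(ca, mg, k, h, al):
    """Return (CEC, V%) from cation contents given in one common unit."""
    bases = ca + mg + k            # sum of exchangeable bases
    cec = bases + h + al           # potential cation exchange capacity
    v = 100.0 * bases / cec        # cation saturation, in percent
    return cec, v


# Hypothetical values for an acidic, aluminum-rich soil sample
cec, v = cec_and_saturation(ca=0.5, mg=0.3, k=0.2, h=6.0, al=3.0)
print(f"CEC = {cec:.2f}, V = {v:.1f}%")
```

In this hypothetical sample the bases sum to 1.0 out of a CEC of 10.0, so only about 10% of the exchange complex is occupied by Ca, Mg, and K.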

Data analysis

We wanted to understand how different soil properties affected species richness and species cover of the veredas and whether there were significant dissimilarities between the herbaceous and shrub species compositions in the veredas. Additionally, we wanted to understand whether the veredas significantly differed in terms of their soil properties.

To assess species diversity in the veredas, we used the Shannon–Wiener index (H'), Simpson's dominance index (D), and Pielou's evenness index (J'), with species cover data as a measure of abundance, which were calculated using the ‘diversity’ function of the ‘vegan’ package (Oksanen et al. 2022).
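For reference, the three indices can be written out as a short Python sketch (illustrative only; the study used the R package ‘vegan’). The function name and the cover values are hypothetical. The Simpson index is computed here as 1 − Σp², which is the form returned by vegan's `diversity(..., index = "simpson")`; whether this matches the exact variant reported in Table 1 is an assumption.

```python
import math


def diversity_indices(cover):
    """Return (Shannon H', Simpson 1 - sum(p^2), Pielou J') from cover values."""
    total = sum(cover)
    p = [c / total for c in cover if c > 0]      # relative abundances
    shannon = -sum(pi * math.log(pi) for pi in p)
    simpson = 1.0 - sum(pi * pi for pi in p)
    # Pielou's evenness: observed H' relative to its maximum, ln(S)
    evenness = shannon / math.log(len(p)) if len(p) > 1 else 0.0
    return shannon, simpson, evenness


# Hypothetical cover values: a perfectly even and a strongly dominated community
even = diversity_indices([25, 25, 25, 25])
dominated = diversity_indices([85, 5, 5, 5])
print(even)
print(dominated)
```

With four equally abundant species, H' equals ln(4) and J' equals 1; concentrating cover in one species lowers all three indices.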

To verify the floristic relationship among different veredas, we applied hierarchical cluster analysis (Borcard et al. 2011), using species AC data by site. We calculated the distance matrix using the Bray–Curtis dissimilarity index, then applied the ‘hclust’ function using the ‘average’ method, which at each step merges the two clusters with the lowest average pairwise distance (or highest similarity) between their members (Legendre and Legendre 2012). We selected the ‘average’ (UPGMA; Unweighted Pair-Groups Method using Arithmetic Averages) clustering model based on the cophenetic correlation value, a coefficient that measures the goodness-of-fit between the original distance matrix and the distances implied by the dendrogram. The clustering method that produces the highest cophenetic correlation may be the one that provides the best clustering model for the distance matrix (Legendre and Legendre 2012). We used the gap statistic method (Tibshirani et al. 2001) to determine the appropriate number of clusters present in the dataset by applying the ‘clusGap’ and the ‘fviz_gap_stat’ functions of the ‘cluster’ (Maechler et al. 2022) and ‘factoextra’ (Kassambara and Mundt 2020) packages, respectively. According to the gap statistic method, the optimal number of clusters present in the dataset was three, which we used in all further analyses (Fig. S1).
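The average-linkage (UPGMA) rule can be sketched in a few lines of Python. This is a naive illustration under stated assumptions, not the ‘hclust’ implementation used in the study: the function name `upgma` and the small dissimilarity matrix are hypothetical, and the distance between two clusters is taken as the arithmetic mean of all pairwise dissimilarities between their members.

```python
def upgma(dist):
    """Agglomerate sites by average linkage.

    dist: symmetric dissimilarity matrix (list of lists).
    Returns a list of merges as (members_a, members_b, merge_height).
    """
    clusters = {i: [i] for i in range(len(dist))}
    merges = []
    while len(clusters) > 1:
        best = None
        keys = sorted(clusters)
        for i, a in enumerate(keys):
            for b in keys[i + 1:]:
                # mean of all pairwise dissimilarities between the two clusters
                pairs = [dist[p][q] for p in clusters[a] for q in clusters[b]]
                d = sum(pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        clusters[a] = clusters[a] + clusters.pop(b)
    return merges


# Hypothetical Bray-Curtis dissimilarities among four sites
D = [
    [0.0, 0.2, 0.6, 0.8],
    [0.2, 0.0, 0.7, 0.9],
    [0.6, 0.7, 0.0, 0.3],
    [0.8, 0.9, 0.3, 0.0],
]
merges = upgma(D)
for a, b, height in merges:
    print(a, b, round(height, 3))
```

In this example sites 0 and 1 merge first, then sites 2 and 3, and the two pairs join at the mean of their four between-pair dissimilarities.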

To order the areas in a multidimensional space while accounting for spatial autocorrelation, we used the adjusted principal coordinate analysis (aPCoA; Shi 2021), employing the presence and absence data and species AC data in the analysis. For the presence and absence analysis, we used Jaccard's similarity coefficient, and for the AC analysis, we employed the Bray–Curtis coefficient (Borcard et al. 2011). Ordination analysis was performed using the ‘aPCoA’ function of the ‘aPCoA’ package (Shi 2021).
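The two coefficients used for the ordinations can be illustrated as follows. This Python sketch is illustrative only; the function names and the two site vectors are hypothetical. Jaccard is shown in its dissimilarity form (1 − similarity), which is an assumption about how the coefficient enters the ordination.

```python
def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two sites' species cover vectors."""
    num = sum(abs(a - b) for a, b in zip(x, y))
    den = sum(a + b for a, b in zip(x, y))
    return num / den


def jaccard_dissimilarity(x, y):
    """1 - (shared species / species present in at least one site)."""
    present_x = {i for i, a in enumerate(x) if a > 0}
    present_y = {i for i, b in enumerate(y) if b > 0}
    shared = len(present_x & present_y)
    union = len(present_x | present_y)
    return 1.0 - shared / union


# Hypothetical cover (AC) of four species at two sites
site_a = [10, 0, 5, 5]
site_b = [5, 5, 0, 10]
bc = bray_curtis(site_a, site_b)
jd = jaccard_dissimilarity(site_a, site_b)
print(bc, jd)
```

Bray–Curtis responds to differences in cover, whereas Jaccard only registers which species are present, which is why the two ordinations can show different degrees of separation among the same sites.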

To check whether there was a significant difference between the species composition of the communities and to identify the distinctions or separations between the emerged groups, we performed a permutational multivariate analysis of variance (PERMANOVA) using the ‘adonis’ function of the ‘vegan’ package (Oksanen et al. 2022), with 999 permutations.
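The logic of the test can be sketched for a one-factor design as follows. This Python code is a simplified illustration, not the ‘adonis’ implementation: the function names, the dissimilarity matrix, and the group labels are hypothetical. The pseudo-F statistic is computed from sums of squared dissimilarities, and the p-value is the proportion of label permutations (plus the observed arrangement) that yield a pseudo-F at least as large as the observed one.

```python
import random


def pseudo_f(dist, groups):
    """Pseudo-F for a one-way design from a dissimilarity matrix."""
    n = len(groups)
    labels = sorted(set(groups))
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    ss_within = 0.0
    for g in labels:
        idx = [i for i in range(n) if groups[i] == g]
        ss_within += sum(
            dist[i][j] ** 2 for k, i in enumerate(idx) for j in idx[k + 1:]
        ) / len(idx)
    ss_among = ss_total - ss_within
    a = len(labels)
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permanova(dist, groups, permutations=999, seed=0):
    """Return (observed pseudo-F, permutation p-value)."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)          # reassign group labels at random
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical dissimilarities among four sites in two groups
D = [
    [0.0, 0.2, 0.6, 0.8],
    [0.2, 0.0, 0.7, 0.9],
    [0.6, 0.7, 0.0, 0.3],
    [0.8, 0.9, 0.3, 0.0],
]
f_obs, p_val = permanova(D, ["A", "A", "B", "B"], permutations=199)
print(round(f_obs, 3), p_val)
```

With only four sites the number of distinct label arrangements is very small, so the p-value in this toy example cannot become small; the 21 veredas of the study allow far more distinct permutations.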

To identify indicator species significantly associated with each cluster group, we performed the indicator species analysis (Dufrêne and Legendre 1997) using the ‘multipatt’ function of the ‘indicspecies’ package (De Cáceres and Legendre 2009). A total of 999 permutations were performed to test species significance using the Monte Carlo test. Species were considered indicators if p ≤ 0.05 and the indicator value (IndVal) ≥ 25% (Dufrêne and Legendre 1997).
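The indicator value of Dufrêne and Legendre (1997) combines specificity (how concentrated a species' abundance is in the target group) and fidelity (the proportion of the group's sites where it occurs). The Python sketch below follows that definition; the function name and the data are hypothetical, and the scaling may differ from the statistic returned by a given software function.

```python
def indval(abundance, groups, target):
    """IndVal (%) of one species for the group `target`.

    abundance: cover of the species at each site.
    groups:    group label of each site.
    """
    labels = sorted(set(groups))
    means = {}
    for g in labels:
        vals = [a for a, lab in zip(abundance, groups) if lab == g]
        means[g] = sum(vals) / len(vals)
    # specificity: mean abundance in the target group relative to all groups
    specificity = means[target] / sum(means.values())
    in_target = [a for a, lab in zip(abundance, groups) if lab == target]
    # fidelity: fraction of the target group's sites where the species occurs
    fidelity = sum(1 for a in in_target if a > 0) / len(in_target)
    return 100.0 * specificity * fidelity


# Hypothetical cover of one species at six sites in two groups
cover = [8, 4, 0, 0, 2, 0]
labels = [1, 1, 1, 2, 2, 2]
value = indval(cover, labels, target=1)
print(round(value, 2))
```

The sketch assumes the species occurs at one site at least; a species absent from all sites would give a zero denominator. Significance is then assessed by recomputing the value after permuting the site labels.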

The Kruskal–Wallis rank sum test (Borcard et al. 2011) was used between the veredas and groups formed in the cluster analysis to assess the significant differences in each soil variable. We applied the ‘kruskal.test’ function on the soil data and later the ‘pairwise.wilcox.test’ function to assess which pairs of veredas or groups were significantly distinct with regard to the soil variable. Prior to the test, we verified that the assumptions of normality and homoscedasticity of the residuals were satisfied.
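The rank-based statistic behind the test can be illustrated with a short Python sketch (the study used R's ‘kruskal.test’). The function name and the pH values are hypothetical. Ties receive mid-ranks and the usual tie correction is applied; the p-value, which is obtained from a chi-squared distribution with k − 1 degrees of freedom, is omitted.

```python
def kruskal_wallis_h(samples):
    """Kruskal-Wallis H statistic (tie-corrected) for a list of samples."""
    pooled = sorted((v, g) for g, vals in enumerate(samples) for v in vals)
    n = len(pooled)
    rank_sums = [0.0] * len(samples)
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0        # mean of ranks i+1 .. j
        for k in range(i, j):
            rank_sums[pooled[k][1]] += midrank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    h = 12.0 / (n * (n + 1)) * sum(
        r * r / len(s) for r, s in zip(rank_sums, samples)
    ) - 3.0 * (n + 1)
    return h / (1.0 - tie_term / (n ** 3 - n))


# Hypothetical soil pH values for three groups of veredas
ph_by_group = [[4.1, 4.3, 4.0], [4.8, 5.0, 4.9], [5.5, 5.6, 5.4]]
h_stat = kruskal_wallis_h(ph_by_group)
print(round(h_stat, 3))
```

Here the three groups do not overlap at all, so the rank sums are as different as they can be for three samples of three observations.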

We used canonical correspondence analysis (CCA; Legendre and Legendre 2012) to examine the correlation between the soil physicochemical properties and flora composition of the communities. We used the AC matrix of species and soil data in the analysis and applied the ‘cca’ function from the ‘vegan’ package (Oksanen et al. 2022). CCA is used to explain species-environment and environment-area relationships, even when species distributions are skewed, samples are taken from unusual sampling designs, and environmental variables are highly intercorrelated (ter Braak and Smilauer 2012). The statistical validity of CCA, its axes, and the soil variables was assessed using the Monte Carlo permutation test (α = 0.05; Borcard et al. 2011) with 999 permutations. Prior to CCA, we checked for multicollinearity among the soil variables using Pearson correlation analysis. We retained the variables that exhibited values of |r| < 0.7 and were ecologically important for the study areas.
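The multicollinearity screening can be sketched as follows. This Python example is illustrative only: the function names and the soil values are hypothetical, and it merely flags variable pairs with |r| ≥ 0.7, leaving the choice of which member of a pair to retain to ecological judgment, as in the procedure described above.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def collinear_pairs(variables, threshold=0.7):
    """List (name_a, name_b, r) for every pair with |r| >= threshold."""
    names = sorted(variables)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson_r(variables[a], variables[b])
            if abs(r) >= threshold:
                flagged.append((a, b, round(r, 3)))
    return flagged


# Hypothetical soil measurements at five sites
soil = {
    "sand": [80, 70, 60, 50, 40],
    "clay": [10, 18, 27, 33, 45],
    "P": [1.2, 0.8, 1.5, 0.9, 1.1],
}
flagged = collinear_pairs(soil)
print(flagged)
```

In this hypothetical data set sand and clay are almost perfectly negatively correlated, so only one of them would be carried into the ordination, whereas P is nearly uncorrelated with both.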

To understand how soil physicochemical properties affect changes in richness and species cover, we used regression models. To better represent species cover, we used PCoA (principal coordinate analysis) eigenvectors from axis one as the response variable, and the soil variables selected from the Pearson correlation (V, OM, P, pH, and sand) (Table S3) as predictors, for which we applied a linear model (LM). We also ran a generalized linear model (GLM) with a Poisson distribution and log link function to understand how soil V, OM, P, pH, and sand affected changes in species richness. We selected the best model by applying a backward selection for the predictor variables (V, OM, P, pH, and sand) until only significant variables remained in the model. We analyzed and validated the models according to the recommendations by Zuur et al. (2009).

We tested for spatial autocorrelation in species richness and species composition, and in the residuals from the regression models (GLM and LM) using Moran’s I correlograms. This enabled us to evaluate whether geographically close sites were similar concerning species richness and composition and to check for the residual independence assumption of regression models. For species richness, both correlograms, based on raw data and on residuals from GLM, did not show a significant autocorrelation (p > 0.05 after Bonferroni’s correction). To apply Moran’s I correlogram to species cover data and presence/absence data, we used the coordinate points of the first axis of each PCoA, respectively. For presence/absence data, the correlogram based on the first axis of the PCoA showed a significant positive autocorrelation at the first distance class (Moran’s I = 0.96, p < 0.001 after Bonferroni’s correction), as well as for species cover data (Moran’s I = 0.82, p < 0.001 after Bonferroni’s correction). The correlogram based on residuals from LM showed a significant negative autocorrelation in the third distance class (Moran’s I = -0.45, p = 0.01 after Bonferroni’s correction). Thus, we generated distance-based Moran eigenvector maps (MEM; Dray et al. 2006) and performed a global test on the residuals from LM against all MEM variables (Blanchet et al. 2008). Neither positive nor negative eigenvectors were significant (positive MEMs: p = 0.37; negative MEMs: p = 0.96) in the global test, indicating that the spatial autocorrelation in model residuals was not strong and that it was not necessary to add spatial variables to the LM. The PCoA was performed using the ‘cmdscale’ function of the ‘vegan’ package (Oksanen et al. 2022), while for the Moran’s I analyses, we used the function ‘correlog’ of the ‘pgirmess’ package (Giraudoux 2022), and for the generation of MEM variables we used the function ‘mem.select’ of the ‘adespatial’ package (Dray et al. 2022). 
We also tested for the presence of spatial autocorrelation in each soil variable using Moran’s I correlogram, by applying ‘Moran.I’ function of the ‘ape’ package (Paradis and Schliep 2019) and corrected the p-value using Bonferroni correction.
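Moran's I for a single distance class, together with the Bonferroni adjustment, can be sketched as below. This Python code is a simplified illustration, not the ‘pgirmess’ or ‘ape’ implementation: the function names, coordinates, and values are hypothetical, coordinates are treated as planar (geographic coordinates would first need projection or great-circle distances), and weights are binary (1 for site pairs whose distance falls within the class, 0 otherwise).

```python
import math


def morans_i(values, coords, d_min, d_max):
    """Moran's I using binary weights for pairs with d_min < distance <= d_max."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    cross = 0.0
    w_sum = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dist_ij = math.dist(coords[i], coords[j])
            if d_min < dist_ij <= d_max:
                cross += dev[i] * dev[j]
                w_sum += 1.0
    return (n / w_sum) * cross / sum(e * e for e in dev)


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical values increasing along a line of four equally spaced sites
coords = [(0, 0), (1, 0), (2, 0), (3, 0)]
values = [1.0, 2.0, 3.0, 4.0]
i_first_class = morans_i(values, coords, d_min=0.0, d_max=1.0)
adjusted = bonferroni([0.01, 0.2, 0.5])
print(round(i_first_class, 3), adjusted)
```

A positive value in the first distance class, as in this gradient example, indicates that neighboring sites are more similar than expected at random; a correlogram repeats the calculation for successive distance classes and adjusts the resulting p-values for the number of classes tested.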

All analyses were performed using the R program (version 4.2.2; R Core Team 2022). We considered a 5% significance level in all statistical tests.

Results

Vegetation composition and diversity

A total of 560 plant species, belonging to 220 genera and 69 families, were sampled (Table S4). The families with the highest numbers of species were Poaceae (16.96%), Cyperaceae (10.36%), Asteraceae (9.46%), Melastomataceae (6.96%), Xyridaceae (6.96%), and Eriocaulaceae (5.89%). The number of plant species per vereda ranged from 33 to 124, while the proportion of exclusive species ranged from 4 to 38% (Table 1). Only Paspalum lineare Trin. was present in all the veredas. Rhynchospora globosa (Kunth) Roem. & Schult. occurred in 20 areas but was not recorded in MG2 (Table S4). The species with the highest relative cover were P. lineare, Trachypogon spicatus (L.f.) Kuntze, and R. globosa. P. lineare was most abundant in the veredas GO6 and GO1, T. spicatus in MG2 and GO7, and R. globosa in MG1 and MT3 (Table S4). The most abundant species in the veredas varied greatly, with GO4 and GO5 having a greater number of species (Table S4), indicating high evenness (Table 1). The veredas with few highly dominant species in the communities were DF1, GO1, GO6, MG2, and MT3 (Table S4). The veredas with most of their total relative cover concentrated in a few species were DF1, GO1, GO6, MG2, and BA1 (Table S4), which was corroborated by the low evenness values (Table 1). The Shannon diversity values differed from the species richness values of the veredas (Table 1). Simpson's index varied from 0.78 to 0.98, and the evenness was high for most areas, with higher values for GO4 and GO5 and lower values for GO6 (Table 1). Based on the diversity indices, we can consider the veredas GO4 and GO5 as the most diverse, and BA1 and GO1 as the least diverse (Table 1).

Table 1 Richness and diversity of herbaceous and shrub species in the veredas

The cluster dendrogram (Fig. 2) and PCoA of species cover (Fig. 3a, axis 1) showed a marked division between the veredas located in Tocantins, Minas Gerais, and Bahia in the northeast and eastern Cerrado, and Goiás and Distrito Federal in the central region. The PCoA second axis and dendrogram demonstrated that the veredas of Mato Grosso (MT1, MT2 and MT3) and MG1 were different from GO7, BA1, and MG2 (Figs. 2 and 3a) in terms of species cover. The dendrogram shows a formation of smaller groups between vereda pairs, such as in GO7 and MG2, GO1 and GO2, and GO3 and DF2 (Fig. 2). Groups of veredas were slightly separated in the presence/absence PCoA (Fig. 3c). The veredas from Mato Grosso showed significant differences from the other veredas in axis 2, whereas most of the veredas from Tocantins were found to be distinct from the others in axis 1 (Fig. 3c). The veredas from Group 1 were only partially differentiated from the other veredas, indicating some overlap with other groups (Fig. 3c). The PCoA showed a similar pattern to the cluster analysis, representing the compositional differences among the veredas, both in relation to species cover and presence and absence (Figs. 2 and 3). After we removed the effect of spatial location using aPCoA, the separation of the veredas into three major groups became less apparent, considering species cover (Fig. 3a and b) and presence/absence (Fig. 3c and d), showing that the species composition dataset is affected by the location. The PERMANOVA test demonstrated significant differences in species composition between different veredas (F = 9.72; R2 = 0.398; p = 0.001) and between the three major groups formed in the cluster analysis (F = 27.77; R2 = 0.151; p = 0.001).

Fig. 2 Dendrogram of floristic similarity between different veredas using Bray–Curtis dissimilarity and ‘average’ (cophenetic coefficient = 0.78) linkage method. The keys highlight the separation between the groups of veredas located in Central Cerrado and northeast and east Cerrado, and the thick black lines highlight the groups of veredas that showed similarity. The site codes are described in Fig. 1

Fig. 3 Original PCoA ordination diagram (A and C) showing floristic relationship in the studied veredas and covariate adjusted PCoA (B and D) colored by the main groups formed in the hierarchical cluster analysis (Groups 1, 2 and 3). (A and B) Analysis diagram generated from species cover data and (C and D) with presence/absence data. The ellipses represent 95% intervals around the centroid of the vereda groups. The site codes are described in Fig. 1

The most representative indicator species were from group 3 (BA1, GO7, and MG2), which had 15 species, including Borreria poaya (A.St-Hil.) DC, Palhinhaea cernua (L.) Franco & Vasc, and Macairea radula (Bonpl.) DC (Table 2). Group 2 alone had one indicator species and was more meaningful when combined with group 3; this combination was represented by T. spicatus (L.f.) Kuntze and Trichanthecium cyanescens (Nees ex Trin.) Zuloaga & Morrone (Table 2). Group 1 included five indicator species, the most significant one being Scleria leptostachya Kunth (Table 2). Groups 1 and 3 combined were represented by two species, Andropogon virgatus Desv. and Axonopus brasiliensis (Spreng.) Kuhlm.

Table 2 Relative frequency values (0–100%) of indicator species per cluster group

Overall, veredas had acidic soils with high levels of aluminum and organic matter and low levels of phosphorus, magnesium, and calcium (Table S5). The average proportion of sand in the veredas was higher than that of clay and silt (Table S5). Veredas of DF and GO1 had higher clay levels than other groups; higher levels of organic matter were recorded for DF1, DF3, DF4, GO7, MG1, and BA1; veredas of Mato Grosso, TO1, and GO7 had a higher proportion of cation saturation (V), while BA1 and MG1 had lower cation saturation; and the amount of phosphorus was higher in GO5 and lower in DF2 than in others (Table S5).

The veredas clearly differed in regional soil classes (Table S1). Veredas from groups 2 and 3 were located predominantly on sandy soils of flat relief (Orthic Arenosols), whereas veredas from group 1 were distributed in regions with three different soil classes (Table S1). Soil variables were significantly different between the veredas as well as between the three groups formed by the cluster analysis (Table S6). However, between groups 1 and 2, Al, OM, and pH were not significantly different, and between groups 2 and 3, Ca, Mg, K, and soil texture did not differ significantly. CEC, phosphorus, and cation saturation were significantly different among all groups. Moran’s I correlogram based on raw soil data showed significant spatial autocorrelation for the following variables: Al (Moran’s I = 0.27, p = 0.04 after Bonferroni’s correction), K (Moran’s I = 0.33, p = 0.02 after Bonferroni’s correction), pH (Moran’s I = 0.37, p = 0.02 after Bonferroni’s correction), V (Moran’s I = 0.50, p = 0.001 after Bonferroni’s correction), CEC (Moran’s I = 0.28, p = 0.04 after Bonferroni’s correction), sand (Moran’s I = 0.25, p = 0.04 after Bonferroni’s correction), silt (Moran’s I = 0.31, p = 0.01 after Bonferroni’s correction), and clay (Moran’s I = 0.30, p = 0.01 after Bonferroni’s correction), indicating that spatial location affects these variables. In contrast, OM, Mg, and P did not show significant autocorrelation (p > 0.05 after Bonferroni’s correction).

Soil-vegetation interactions

The soil variables retained after Pearson correlation were sand, pH, phosphorus, V, and OM. The total amount of variation explained by the CCA was 30.2%. The first four axes accounted for 27% of the variation in soil variables, with 8.60% of the variation explained by axis 1, 7.40% by axis 2, 5.84% by axis 3, and 4.73% by axis 4 (Table 3). The Monte Carlo test (Table 3) indicated that the relationship between species composition and soil properties of the veredas was significant (p = 0.002), but only for axis 1 (p = 0.012). The soil variables most closely associated with the veredas composition were cation saturation (p = 0.001), organic matter (p = 0.007), and pH (p = 0.003) (Table 4). The strongest correlation with CCA axis 1 was observed for sand (negative) (Table 4). CCA axis 2 was highly correlated with pH (positive), cation saturation (positive), and organic matter (negative), and CCA axis 3 showed strong correlation for phosphorus (positive) (Table 4). The distribution of the veredas in the CCA was similar to both the dendrogram and the PCoA, with axis 1 showing clear separation between the veredas of Mato Grosso, Tocantins, Minas Gerais, BA1, and GO7 and those of Distrito Federal and Goiás (Fig. 4). The first axis showed a positive correlation with phosphorus and pH and a negative correlation with cation saturation, organic matter, and sand; the second axis showed a negative correlation with organic matter and sand, and a positive correlation with cation saturation, pH, and phosphorus (Fig. 4).

Table 3 Summary of canonical correspondence analysis (CCA) and Monte Carlo permutation test (α = 0.05) of plant species cover and soil properties of the veredas

Table 4 Intraset correlation of soil variables with the first three axes of CCA and F and p-values estimated by permutation test (α = 0.05) of plant species cover and soil properties of the veredas

Fig. 4 Ordination diagram obtained by canonical correspondence analysis (CCA) of the 21 veredas and soil variables on the first two axes, based on species cover data. Total amount of variation explained: 30.2%. The variation explained by axes 1 and 2 was 8.6% and 7.4%, respectively. OM = organic matter, V = cation saturation, P = phosphorus. The site codes are described in Fig. 1

The results of the GLM showed that phosphorus, organic matter, and pH were significant predictors of species richness (Table 5). Phosphorus and pH were positively correlated with species richness, whereas organic matter was negatively correlated (Table 5). For compositional similarity (PCoA1 scores), organic matter, cation saturation, and sand were negatively correlated, whereas phosphorus was positively correlated (Table 5). The selected soil predictors accounted for 86% (R-squared = 0.86) of the variation in species composition and 63% (pseudo-R-squared = 0.63) of the variation in species richness.

Table 5 Significant soil predictors of herbaceous and shrub composition of the veredas

Discussion

The veredas showed major variations in species composition and diversity and shared only one common species, Paspalum lineare Trin., which is one of the most widespread species, with its distribution ranging from Mexico to northeastern Argentina (Zuloaga et al. 2004). Overall, the number of exclusive species in all veredas was similar to the total number of species recorded for each vereda, and the number of unique species was higher in the veredas that exhibited higher species richness. A low number of shared species have also been reported in other studies involving veredas (Araújo et al. 2002; Resende et al. 2013). For herbaceous and shrub species in other vegetation types, the similarities among sites are also low (Mendes et al. 2012; Amaral et al. 2022).

The studied veredas varied in altitude, temperature, and precipitation (Table S5), because of their broad spatial distribution. The occurrence of veredas can be influenced by climatic, edaphic, and topographic variables (Gonçalves et al. 2022). The most diverse veredas (GO4 and GO5) are in the mountainous regions of Chapada dos Veadeiros (GO4) and Topázios Range (GO5), two areas in higher altitudes (above 1,000 m), with cooler temperatures (mean min 15 / max 26 ºC), and with high species richness and endemism (Simon and Proença 2000; Proença et al. 2000; Vidal et al. 2019). Geographic barriers can limit species dispersal, restricting their occurrence to a specific region. The veredas located in the mountainous regions of Chapada dos Veadeiros (GO3, GO4), Topázios Range (GO5), and the Espinhaço Range (MG1) contained high-altitude endemic plant species from the families Eriocaulaceae and Xyridaceae (Giulietti et al. 1987; Munhoz and Proença 1998; Proença et al. 2000). The dispersal of these species is limited by lower temperatures and differentiated soil factors, such as the abundance of rocky soils, which are all characteristic of high-altitude areas (Giulietti et al. 1987; Stannard 1995). On the other hand, one of the least diverse veredas, with the lowest Shannon (H’) value and low evenness, was found in Serra do Tombador (GO1), northeastern Goiás, on the edge of Chapada dos Veadeiros at a lower altitude (below 1,000 m). The Cerrado vegetation surrounding the vereda GO1 in northeastern Goiás is known for having a predominance of mesotrophic species, with records of low number of woody species (Ratter et al. 2003). As previously observed for the Cerrado species, the species distribution of the veredas is also mosaic-like (Felfili et al. 2004, 2008), especially since they are restricted to hydromorphic soils and surrounded by vegetation on well-drained soils.

The PCoA and cluster analysis supported the high variation in species composition demonstrated by our results. The ordination showed that species occurrence was more similar than species cover in the groups of veredas formed by the cluster analysis. The veredas within each cluster group were closely distributed in the ordination, and the groups were separated from one another. When the effect of space on species composition was excluded, the adjusted PCoA did not reveal any clear separation among groups. This is because spatial autocorrelation, which can create patterns of similarity or dissimilarity, was no longer a factor in the analysis. Groups 1 and 2 contained five and one indicator species, respectively (Table 2), confirming that high heterogeneity existed in the areas. Two indicator species were shared between groups 2 and 3. Group 3 contained 15 indicator species with an IndVal > 72%, higher than the other groups. The cluster analysis at the third level of the divisions completely agreed with the Cerrado ecoregion zoning proposed by Sano et al. (2019), except for MG1, which is isolated in the Jequitinhonha ecoregion. The species composition of MG1 was distinct from that of the other veredas, probably because the region of the site, i.e., the south of the Espinhaço Range, is influenced by neighboring high-altitude southeastern grasslands with many records of plant endemism (Giulietti and Pirani 1988; Echternacht et al. 2011).
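
To make the indicator value (IndVal) reported above more concrete, the sketch below computes it for a single species under the common formulation in which IndVal is the product of specificity (the group's share of the summed group-mean abundances) and fidelity (the proportion of sites in the group where the species occurs), expressed as a percentage. We assume this formulation for illustration; the function name, the input layout, and the cover values are hypothetical, and the sketch omits the permutation test normally used to assess significance.

```python
def indval(abundance, groups):
    """Indicator value (%) of one species for each group of sites.

    abundance: per-site cover or abundance values for the species.
    groups: group label of each site, in the same order as abundance.
    """
    if len(abundance) != len(groups):
        raise ValueError("abundance and groups must have the same length")
    labels = sorted(set(groups))
    mean_abundance = {}
    fidelity = {}
    for label in labels:
        values = [a for a, g in zip(abundance, groups) if g == label]
        mean_abundance[label] = sum(values) / len(values)
        # Fidelity: fraction of the group's sites where the species is present.
        fidelity[label] = sum(1 for v in values if v > 0) / len(values)
    total = sum(mean_abundance.values())
    if total == 0:
        return {label: 0.0 for label in labels}
    # Specificity: share of the summed group means that falls in this group.
    return {
        label: 100.0 * (mean_abundance[label] / total) * fidelity[label]
        for label in labels
    }

# Hypothetical species present at every site of group 3 and at one site of group 2.
cover = [0, 0, 10, 0, 10, 10]
site_groups = [1, 1, 2, 2, 3, 3]
print(indval(cover, site_groups))
```

A species restricted to one group and present at all of its sites reaches the maximum value of 100%, which is why high IndVal thresholds single out species that characterize a group.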

Studies evaluating the distribution of trees (Sano et al. 2019) and herbaceous and shrub (Amaral et al. 2017) species in the Cerrado have described floristic ecoregions, i.e., geographically defined assemblages of natural communities with similar environmental conditions (Dinerstein et al. 1995), which are associated with soil, geomorphology, climate, altitude, vegetation, and plant species data. In the Brazilian Planalto Central ecoregion, the altitudinal gradient creates important centers of endemism for diverse species (Simon and Proença 2000; Vidal et al. 2019), including herbaceous and shrub flora (Amaral et al. 2017; Sano et al. 2019). Group 1 is entirely distributed in the Planalto Central ecoregion, which is near the center of the Cerrado, known to be a hotspot of biological diversity and where our richest veredas were located (GO5, GO4, DF3; Table 1). Additionally, these veredas are located far from neighboring biomes, which can also explain the greater local richness (Françoso et al. 2016). The Araguaia Tocantins ecoregion encompasses practically all the veredas from group 2 (MG1 is an exception) and lies in a hydrographic region of the same name. Few endemic and exclusive species are recorded from this ecoregion (Arruda 2003; Sano et al. 2019; Vidal et al. 2019), which is also supported by our results, especially for the veredas in Tocantins state, which have the lowest incidences of exclusive species. The veredas from group 3 (GO7, MG2, and BA1) were entirely located in the Chapadão do São Francisco ecoregion, where the soil drainage capacity and slope of the terrain are very low (Arruda 2003), characteristics that are associated with the occurrence of hydromorphic soils and veredas. The rate of species endemism in this ecoregion is intermediate when compared to the other ecoregions (Sano et al. 2019), which corresponds to our results for the veredas in group 3.

The CCA showed that soil organic matter, cation saturation, and pH were good predictors of species cover. As expected, veredas from Tocantins state (TO) and from group 3 were correlated with sand, since the sites are in a region of orthic Arenosols, which are deep soils with a sandy or sandy loam texture composed essentially of quartz (Reatto et al. 2008). Cation saturation was an important driver of species cover for the veredas, indicating high levels of this attribute at the Mato Grosso sites. The increase in cation saturation and sand proportion was negatively correlated with compositional similarity. Some studies have acknowledged cation saturation (Mota et al. 2018; Souza et al. 2021) and soil texture (Ruggiero et al. 2002; Amaral et al. 2022; Lira-Martins et al. 2022) as important factors associated with the flora communities in the Cerrado. In other savannas, higher levels of sand can indicate higher richness of herbaceous and shrub vegetation (Medinski et al. 2010). High sand content (> 50%) decreases the soil's capacity to retain or release cations or anions, as well as soil organic carbon and cation saturation (Huang and Hartemink 2020), thus reducing soil fertility in the veredas. Also, the low soil cation saturation in the veredas was probably caused by the acidic soils, which contain the acidic cation aluminum, limiting plant growth and nutrient cation absorption (Weil and Brady 2016). Although fertility is low, waterlogging makes the structure of sandy soils softer and suitable for root growth and shoot emergence (Huang and Hartemink 2020).

The increase in soil phosphorus content was positively correlated with overall species richness and compositional similarity, whereas organic matter was negatively correlated with them. Nogueira et al. (2022) also found significant interactions between phosphorus and species richness and relative cover in veredas. Phosphorus is a vital nutrient involved in photosynthesis, transformation of sugars and starches, energy transfer, nutrient movement within plants, and efficient root growth (Weil and Brady 2016). Regarding organic matter, the hydromorphic soil of the veredas is responsible for the accumulation of organic materials in the early stages of decomposition because the poor aeration conditions of the soil reduce the humification process and increase the C:N ratio (Sousa et al. 2015). Although the soil organic matter content of the veredas is high, the humus derived from the oxidation of this material is slowly produced and less available, which, combined with the low pH levels, makes the soil nutrient-poor (Sousa et al. 2015). These conditions restrict species occurrence, allowing only specialized species to survive (Weil and Brady 2016). A strong negative correlation between soil organic matter accumulation and herbaceous and shrub species richness has also been observed in other Cerrado vegetation types (Xavier et al. 2019; Souza et al. 2021).

Richness was positively correlated with pH, which was a good predictor for some of the veredas with the highest species numbers, such as those in Distrito Federal (DF2, DF3, DF4), Chapada dos Veadeiros (GO4), and Topázios Range (GO5). Plant communities in wetlands present a higher number of species at higher soil pH levels (Xiaolong et al. 2014). Extreme levels of soil pH can influence plant distribution, not only directly, but also by affecting the mineralization of soil organic matter and the availability of nutrients (Medinski et al. 2010; Viani et al. 2014). The soils of the veredas presented low pH, and the concentration of H+ ions remained high, which can hinder the absorption of essential nutrients by the plants (Ellis and Mellor 1995), thereby reducing the chances for species to colonize a habitat under such harsh conditions.

Soil water saturation and soil nutrients can limit plant establishment, since low nutrient content and water saturation create a strong environmental pressure that may filter species more rigorously (Viani et al. 2014; Lira-Martins et al. 2022). Typically, the soils of the veredas present a non-uniform distribution of moisture due to variation in water table fluctuation (Horák-Terra et al. 2022), which creates high patchiness in the soil moisture gradient and may also promote plant species richness and variation in species composition in the communities (Araújo et al. 2002; Oliveira et al. 2009; Resende et al. 2013). The water table depth of the vereda increases from the core zone to the border zone, and its level oscillates with the rainy and dry seasons of the Cerrado. During the dry season, when the water table is lowered, especially at the border of the vereda (the core zone is always wet), surface soil horizons become exposed and unsaturated, intensifying organic matter decomposition and changing nutrient availability in the soil (Horák-Terra et al. 2022). During the rainy season, suspended soil particles (Ca, Mg, Na, Fe, Mn, and Al) can be removed by the rain from the border and accumulate in the lower parts of the vereda (Ramos et al. 2014). Also, the core zone of the veredas presents higher proportions of accumulated organic matter (Horák-Terra et al. 2022), because soil-pore spaces of wet regions are predominantly filled by water (Ramos et al. 2014), which slows down the humification of organic residues (Sousa et al. 2015).

The studied veredas showed significant differences in soil properties and species composition, which were driven by specific soil filters, such as cation saturation, organic matter, phosphorus, sand proportion, and pH. These results are key to understanding how these environments differ from other vegetation types within and outside the Cerrado and how they function, and they are relevant for decision making to efficiently protect fragile wetlands such as veredas, especially under imminent global environmental changes and anthropogenic interference. The increasing conversion of native vegetation to cultivated pasturelands, along with the excessive use of center-pivot irrigation systems for agriculture, has reduced water availability in important ecoregions of the Cerrado (Planalto Central, Chapadão do São Francisco, and Jequitinhonha) (Rodrigues et al. 2022), negatively impacting the soil organic carbon stock and water holding capacity of the veredas (Horák-Terra et al. 2022). As a result, the flora of the veredas is highly endangered, and these areas are considered a high priority for conservation efforts, including the creation of new conservation units (Amaral et al. 2017; Sano et al. 2019). We hope that this study can provide new insights into this topic, with practical consequences for restoration programs concerning the management and recovery of degraded areas, as well as the conservation of natural areas.