Introduction

Montane floras have a high degree of endemism and diversity (Chen et al. 2009; Merckx et al. 2015). As on islands, the isolation of flora on mountaintops leads to similar patterns of dispersal, speciation and extinction (Merckx et al. 2015; Rahbek et al. 2019). Depending on the context and on the organism, mountains can function as bridges or barriers for species flow (Perrigo et al. 2020). Mountain regions cover around 25% of the planet’s terrestrial area, and some estimates suggest that they harbor more than 85% of the terrestrial species; this pattern is especially observed in the Neotropics, such as in the Andes and the mountains of the Atlantic Forest (Rahbek et al. 2019).

Campos de altitude (Fig. 1) are montane ecosystems consisting of unique high-altitude grasslands that occur in the highest and coldest regions of the Atlantic Forest in southeastern and southern Brazil (Eiten 1983; Veloso et al. 1991; Safford 1999a, b). Although they are surrounded by a dense forest matrix and are under the influence of similar climatic conditions, campos de altitude differ by having shallow, acidic, and extremely nutrient-poor soils (Safford 1999a, b), which favor their grassland condition. Typically, these grasslands occur at or above ca. 1800–2000 m a.s.l., especially in Caparaó, Serra dos Órgãos, and Itatiaia national parks (Safford 1999a, b). However, the proximity to the ocean and latitude strongly influence this elevational threshold; in some cases, this formation starts at ca. 1500 m a.s.l. (Vasconcelos 2011) or even at ca. 1300 m in southernmost Brazil (Paraná and Santa Catarina states) (Martinelli and Orleans e Bragança 1996; Struminski 1996; Mocochinski and Scheer 2008; ICMBIO 2021). Elevation and soil type are known to influence the phylogenetic relationships and distribution of species (Neri et al. 2017; Gastauer et al. 2020). Thus, delimiting campos de altitude based solely on elevation is complex, since these grasslands usually form mosaics with the adjacent cloud forests, with different micro-climates and biotas.

Fig. 1

Campos de altitude and their flora. a Landscape view of Parque Estadual dos Três Picos (PETP); b Grassland in Parque Nacional do Itatiaia (PNIT); c Vriesea itatiaiae Wawra in Parque Nacional do Itatiaia; d Zygopetalum maculatum (Kunth) Garay in Parque Nacional da Serra dos Órgãos; e Barbacenia irwiniana L.B.Sm. in Parque Nacional da Serra dos Órgãos. Photos: Igor M. Kessous

In Brazil, campos de altitude are not the only type of natural mountaintop grassland, contrary to a common misconception in the literature and databases. Campos de altitude occur only in the Atlantic Forest, usually on igneous or metamorphic rocks, on the mountaintops of the Serra do Mar and Serra da Mantiqueira (Safford 1999a). Also, the Serra Geral, at the southernmost part of the Serra do Mar and extending into the interior of Brazil and Paraguay (Frank et al. 2009), harbors grasslands that are recognized as campos de altitude (ICMBIO 2021). Despite originating from the same tectonic event in the Paleocene (Almeida and Carneiro 1998), the Serra do Mar and the Serra da Mantiqueira exhibit significant structural, geographical, and biological differences, making them distinct “vegetation islands”. The Serra da Bocaina is located between these two mountain ranges and is also considered part of the Serra do Mar. Recent climatic events have favored the emergence of a unique biota in the region (Behling et al. 2007).

Other open montane vegetation types occur in Brazil, mainly in the Cerrado (campos rupestres) and in the Amazon (tepuis). In contrast with campos de altitude, these vegetation types occur on sandstone or quartzitic rocks (Alves and Kolbek 2010; Silveira et al. 2020). Many plant collections from these other grasslands include the term “campo de altitude” on their labels; however, these plant communities are biogeographically different. The flora of the campos rupestres is more closely related to that of the tepuis, while that of the campos de altitude is more closely related to the flora of the Andes (Safford 1999b, 2007; Vasconcelos 2011; Guedes et al. 2020). Because of their similarities to Andean Páramos, campos de altitude came to be known as Brazilian Páramos, although with a greater influence of seasonality (Safford 1999a, 2007). Nevertheless, the term campos de altitude was already widely used in the literature and in the national biogeographic classification and is therefore more commonly used today to refer to this vegetation.

The flora of the campos de altitude includes ancient plant lineages (Silveira et al. 2020) and has high species richness and endemism (Safford 1999a; Neri et al. 2017). Advances in taxonomy and associated technology over the past 20 years have dramatically increased the number of known species in Brazil. Previous work focusing on the diversity of campos de altitude suggested that they contain fewer than 1000 vascular plant species (Safford 2007), encompassing 100 endemics (Alves and Kolbek 2010). However, in these studies, the scope of the concept of campos de altitude was not always clear, and most of them included only those high-altitude grasslands typical of the northern Serra do Mar and Serra da Mantiqueira. With the recent expansion of open-source data repositories, the use of spatial data has enabled us to reach a better understanding of species distribution patterns, habitats, and conservation.

Here, we compared two methods using spatial data to estimate the diversity of angiosperms in campos de altitude. We provide two lists of species and infraspecific taxa, the first based on sensu lato searching (filtering occurrence data by elevation, canopy height, location and keywords) and the other based on sensu stricto searching (the same as the first, but adding a “campos de altitude” filter on the vegetation type recorded in the Flora e Funga do Brasil). Furthermore, we aimed to provide an editable list of angiosperm species of the campos de altitude, and to estimate the extrapolated species richness, under the hypotheses that (1) angiosperm taxa occurring in campos de altitude are underestimated in previous inventories; (2) campos de altitude of the Serra da Mantiqueira and Serra do Mar are internally more similar than to each other; and (3) the application of “filtering” methods to spatial data and habitat information can be used to estimate the biodiversity in campos de altitude. Based on the information collected, we estimated the number of taxa per family, threatened taxa, life forms, habitat, and endemism, and provide explanations for species distribution patterns.

Methods

To construct lists of angiosperm taxa (specific and infraspecific) occurring in campos de altitude and surroundings, we downloaded from the Global Biodiversity Information Facility (GBIF) 9,101,129 records (GBIF.org, 09 February 2022, GBIF Occurrence Download https://doi.org/10.15468/dl.e9c9cf) of all preserved specimens of Magnoliopsida (Eudicots + Basal angiosperms) and Liliopsida (Monocots) from all of Brazil. We excluded ambiguous records and all records not identified at least to species level.

Backbone list

To obtain a backbone list (BBL), we divided the search into two steps: first, we obtained a list of taxa with georeferenced data (decimalLatitude and decimalLongitude) further filtered by elevation and canopy height; second, we obtained a list of taxa filtered by keywords to also encompass the occurrences without geographical coordinates.

In the first step, we removed duplicates and cleaned biased records using the R (R Core Team 2020) package CoordinateCleaner (Zizka et al. 2019), flagging and removing “capitals”, “centroids”, “equal”, “gbif”, “institutions”, “zeros” and “countries”. We then downloaded two raster files, one containing elevation information (Earth Resources Observation and Science, EROS (2018), with 30 m resolution: https://www.usgs.gov/centers/eros/science/usgs-eros-archive-digital-elevation-shuttle-radar-topography-mission-srtm?qt-science_center_objects=0#qt-science_center_objects) and one containing canopy height (with 1 km resolution), and extracted the information for each point using the R package raster 3.5–29 (Hijmans et al. 2022). There are no official maps specifically containing the sites of campos de altitude (Vasconcelos 2016), so we specified elevation filters according to the literature for each site where they occur (Table 1, Fig. 2) or, when information was lacking, using Google Earth®. In addition, we applied a < 5 m filter for canopy height, since campos de altitude are covered predominantly by grasses, shrubs, and forbs, with only a few scattered small trees.
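The two raster-based filters can be illustrated with a minimal sketch. It is written in Python rather than the R workflow used in the study, and it assumes that elevation and canopy height have already been extracted for each record. The `Record` class, the site names, and the threshold values are hypothetical placeholders; the actual per-site thresholds are those in Table 1.

```python
from dataclasses import dataclass


@dataclass
class Record:
    taxon: str
    site: str           # protected-area acronym (hypothetical here)
    elevation_m: float  # value extracted from the elevation raster
    canopy_m: float     # value extracted from the canopy-height raster


# Hypothetical thresholds for illustration only; real values are in Table 1.
MIN_ELEVATION_M = {"SITE_A": 1800.0, "SITE_B": 1300.0}
MAX_CANOPY_M = 5.0  # the text applies a < 5 m canopy-height filter


def passes_filters(record: Record) -> bool:
    """Keep a record only if it meets its site's elevation threshold
    and falls in a cell with canopy height below 5 m."""
    threshold = MIN_ELEVATION_M.get(record.site)
    if threshold is None:
        return False  # site not among those analyzed
    return record.elevation_m >= threshold and record.canopy_m < MAX_CANOPY_M


records = [
    Record("Taxon one", "SITE_A", 2100.0, 1.5),     # kept
    Record("Taxon two", "SITE_A", 1500.0, 0.5),     # below elevation threshold
    Record("Taxon three", "SITE_B", 1400.0, 12.0),  # canopy too tall
    Record("Taxon four", "SITE_B", 1350.0, 4.9),    # kept
]

kept_taxa = sorted({r.taxon for r in records if passes_filters(r)})
print(kept_taxa)
```

Because the canopy raster is much coarser (1 km) than the elevation raster (30 m), a filter of this kind can retain records from forest edges, which is consistent with treating the resulting list as "campos de altitude and surroundings".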

Table 1 Protected areas that include the campos de altitude used in the present study and their minimum elevation threshold

Fig. 2

Map with the campos de altitude analyzed here. Map color references the elevation raster. Acronyms are explained in Table 1

We then downloaded the map of the Brazilian protected areas (MMA, http://mapas.mma.gov.br/i3geo/datadownload.htm), rasterized it using the functions shapefile and rasterize of the R package raster (Hijmans et al. 2022), and extracted the information for each coordinate point to determine the distribution of each taxon in the different campos de altitude. We excluded the records corresponding to other Brazilian mountaintop vegetation types, such as the grasslands of central and northern Brazil.

For the keyword search, we used the raw list downloaded from GBIF, including records with and without coordinate data, and searched the locality column for “campo de altitude” and “campos de altitude”, ignoring case. Again, we filtered the search output according to the elevation recorded in the GBIF database, keeping only the records that met the thresholds in Table 1 or that had no elevation information. We removed duplicates and all records not corresponding to campos de altitude.
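The keyword step can be sketched as follows. This is a Python illustration, not the code used in the study; the function names and the example threshold are hypothetical. It shows a case-insensitive match for both keyword forms and the rule that records without elevation information are retained.

```python
import re
from typing import Optional

# Matches "campo de altitude" and "campos de altitude", ignoring case.
KEYWORD = re.compile(r"campos? de altitude", re.IGNORECASE)


def matches_keyword(locality: Optional[str]) -> bool:
    """True if the locality text mentions campo(s) de altitude."""
    return bool(locality) and KEYWORD.search(locality) is not None


def keep_keyword_record(locality: Optional[str],
                        elevation_m: Optional[float],
                        min_elevation_m: float) -> bool:
    """Keep keyword hits that meet the site threshold or lack elevation data."""
    if not matches_keyword(locality):
        return False
    if elevation_m is None:
        return True  # no elevation information: retained, as in the text
    return elevation_m >= min_elevation_m


# Hypothetical threshold of 1800 m, for illustration only.
print(keep_keyword_record("Trilha, Campos de Altitude", None, 1800.0))
print(keep_keyword_record("campo de altitude", 900.0, 1800.0))
```

Records retained only because their elevation is missing still need the manual check described above, since the label text alone does not guarantee the vegetation type.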

Final lists

To evaluate incorrect occurrences and obtain data on vegetation type, conservation status, endemism, life form, family, and habitat, we merged our BBL data with previously published data (Moreira et al. 2020; Carrijo et al. 2018) and with unpublished lists obtained by our research group (in PNIT and PNCP), and extracted the information from Flora e Funga do Brasil (http://floradobrasil.jbrj.gov.br/reflora/listaBrasil/). For this, we used the function get.taxa of the R package flora 0.3.5 (Carvalho 2021). The Flora e Funga do Brasil database is based on information from about 1000 taxonomists who collaborated in the description of the Brazilian Flora for more than a decade (Brazil Flora Group 2021). For the synonymized taxa not recognized in this R package, we searched for the accepted name of each separately, using the platforms World Flora Online (http://www.worldfloraonline.org/) and Tropicos (https://www.tropicos.org/), and resubmitted them for analysis in the package flora. We excluded all taxa considered not to occur in Brazil. The final output, covering all taxa and localities, is referred to here as the sensu lato list (SLL), representing species and infraspecific taxa that occur in campos de altitude and possibly in surrounding vegetation types. Because of the limited resolution of the rasters and in order to include border species, we use “campos de altitude and surroundings” to refer to the SLL. Preliminary sensu lato lists were made for all localities (Table 1) separately and then merged.

To obtain our sensu stricto list (SSL), we applied a “campos de altitude” filter to the vegetation type column in our preliminary sensu lato lists, retaining only those taxa that occur in campos de altitude according to the Flora e Funga do Brasil. Thus, we were able to compare the results and analyses between the two datasets. The resulting lists were compared by the non-parametric Mann–Whitney U test (function wilcox.test) performed in R to verify differences in the number of taxa. Here, we considered only campos de altitude located at or above 1300 m and with more than 50 taxa in SLL (except for the outgroup), after a detailed literature search for each specific case. We did not include sites that are doubtful as to their vegetation or “high altitude rupestrian complexes” (see Benites 2002; Garcia 2003; Garcia and Pirani 2005; Vasconcelos 2011). An editable SSL-based angiosperm taxa list can be found at https://doi.org/10.5281/zenodo.8114353.

Cluster, network, dissimilarity analyses, conservation status, and collections

We coded the presence and absence of each taxon by site and performed clustering analyses for both datasets, using the R package BinMat (van Steenderen 2022). For the hierarchical cluster analyses, we chose the Jaccard distance method, applying the clustering method of WPGMA (Weighted Pair Group Method with Arithmetic Mean), because the sites/clusters contain unequal numbers of taxa. We ran the function upgma and applied 10,000 bootstrap replications to calculate the p-value of each cluster. Reliability was measured by approximately unbiased (AU) probability values and bootstrap probability (BP) values. We then performed the dissimilarity analyses using the Jaccard method, and obtained the number of taxa for each site, using the functions vegdist and specnumber, respectively, of the R package vegan 2.6.2 (Oksanen et al. 2022).
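The Jaccard dissimilarity for presence/absence data can be written out in a few lines. The Python sketch below is an illustration rather than the vegan code used in the study, and the site names and taxa are hypothetical. For two sites it computes one minus the number of shared taxa divided by the number of taxa found in either site.

```python
from itertools import combinations


def jaccard_dissimilarity(a: set, b: set) -> float:
    """1 - |A ∩ B| / |A ∪ B| for two sets of taxa."""
    union = a | b
    if not union:
        raise ValueError("both sites are empty; dissimilarity is undefined")
    return 1 - len(a & b) / len(union)


# Hypothetical presence lists per site, for illustration only.
site_taxa = {
    "SITE_A": {"sp1", "sp2", "sp3"},
    "SITE_B": {"sp2", "sp3", "sp4"},
    "SITE_C": {"sp5"},
}

# Pairwise dissimilarities and the number of taxa per site.
pairwise = {
    (s1, s2): jaccard_dissimilarity(site_taxa[s1], site_taxa[s2])
    for s1, s2 in combinations(sorted(site_taxa), 2)
}
richness = {site: len(taxa) for site, taxa in site_taxa.items()}
print(pairwise)
print(richness)
```

A value of 0 means two sites share all their taxa and a value of 1 means they share none, so the lowest value in each row of a table of such values identifies the floristically closest site.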

We performed a network analysis using the eLasso (least absolute shrinkage and selection operator) method (van Borkulo et al. 2014) for SLL and SSL, using the R package IsingFit 0.3.1 (van Borkulo et al. 2016). This procedure allowed us to use binary data, using the Extended Bayesian Information Criterion (EBIC; Chen and Chen 2008), which helps to decrease false-positive rates (van Borkulo et al. 2014). We applied the AND-rule and chose the higher gamma value (1.0) to avoid unreliable connections, especially false positives (van Borkulo et al. 2014, 2016). To statistically compare the output networks resulting from SLL and SSL, we submitted our binary datasets to a direct comparison test (network comparison test), using the NCT function of the R package NetworkComparisonTest 2.2.1 (van Borkulo et al. 2019), with the same parameters as for the network construction.

Conservation threat status (CR: Critically Endangered; VU: Vulnerable; and EN: Endangered) of the taxa was obtained from the latest Official National List of Endangered Plant Species (MMA 2022). Information for near threatened (NT) and least concern (LC) taxa was obtained by use of the package flora 0.3.5 (Carvalho 2021). Life form, habitat, endemism, and the corresponding families (according to the Flora e Funga do Brasil), in addition to the conservation status, were plotted using the R package ggplot2 (Wickham 2016). We included only accurate and unambiguous information about habitat and life form. After obtaining the final lists, we estimated the mean number of collections per taxon (Table 2). We plotted the graphical networks using the package qgraph 1.9.3 (Epskamp et al. 2022).

Table 2 Estimated number of taxa, collections, and extrapolated species richness (based on the SLL vouchers) from each site

We estimated species-accumulation curves (SACs) through the method of “rarefaction” and the extrapolated species richness for the GBIF species dataset (based on the SLL vouchers), as well as for each site separately, using the R package vegan 2.6.2 (Oksanen et al. 2022). For the extrapolated species richness analyses, we used multiple estimators, as suggested by Brose et al. (2003): Chao, Jackknife (first and second order, Jack1 and Jack2 respectively) and Bootstrap (Boot). All code and the list of vouchers are available on GitHub (https://github.com/kessous/campos_de_altitude).
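The four estimators can be illustrated with their standard incidence-based forms. The Python sketch below is not the vegan implementation used in the study and may differ from it in small-sample details; the function name and the input frequencies are hypothetical. Each taxon is represented by the number of sampling units in which it was recorded, and completeness is the observed richness divided by each estimate.

```python
def richness_estimates(frequencies, n_units):
    """Incidence-based Chao, Jack1, Jack2 and Bootstrap richness estimates.

    frequencies: number of sampling units in which each taxon was recorded.
    n_units: total number of sampling units (must be at least 2).
    """
    if n_units < 2:
        raise ValueError("at least two sampling units are required")
    freqs = [f for f in frequencies if f > 0]
    s_obs = len(freqs)                      # observed richness
    a1 = sum(1 for f in freqs if f == 1)    # taxa found in exactly one unit
    a2 = sum(1 for f in freqs if f == 2)    # taxa found in exactly two units
    n = n_units
    if a2 > 0:
        chao = s_obs + (n - 1) / n * a1 * a1 / (2 * a2)
    else:
        # Bias-corrected form used when no taxon occurs in exactly two units.
        chao = s_obs + (n - 1) / n * a1 * (a1 - 1) / 2
    jack1 = s_obs + a1 * (n - 1) / n
    jack2 = s_obs + a1 * (2 * n - 3) / n - a2 * (n - 2) ** 2 / (n * (n - 1))
    boot = s_obs + sum((1 - f / n) ** n for f in freqs)
    return {"observed": s_obs, "chao": chao, "jack1": jack1,
            "jack2": jack2, "boot": boot}


# Made-up incidence frequencies for five taxa across four sampling units.
estimates = richness_estimates([4, 1, 1, 2, 1], n_units=4)
completeness = {name: estimates["observed"] / value
                for name, value in estimates.items() if name != "observed"}
print(estimates)
print(completeness)
```

All four estimators add a correction to the observed richness that grows with the number of rarely recorded taxa, which is why sites with few collections per taxon tend to show lower completeness.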

Results

Sensu lato list

Our sensu lato list (SLL) resulted in 2398 species and infraspecific taxa, 651 genera, and 131 families (Online Appendix 1). The estimated number of angiosperm taxa in each campos de altitude (and surroundings) is given in Table 2. Asteraceae is the best represented family, encompassing 345 taxa, followed by Poaceae (191 taxa) and Orchidaceae (162 taxa) (Fig. 3a). Most of the taxa are herbs (997 taxa; shrubs: 311; trees: 184; Fig. 3a), terrestrial (1792 taxa; epiphytes: 108; rupicolous: 61; Fig. 3a), and endemic to Brazil (1407 taxa; non-endemic: 926, Fig. 3a). A total of 224 taxa of angiosperms are threatened (CR: 44; VU: 45; EN: 135; Fig. 3a) and 47 taxa are near threatened. In general, our results indicated a larger number of taxa in sites with more collections and higher mean numbers of collections per taxon (Table 2). Of the four most diverse sites, PNSB had the fewest collections per taxon (2.4), while PNIT and PNSJ had the most collections per taxon (4.0 and 5.3, respectively). PEDS, PESP, PECJ, APAP, and PIMA had fewer than two collections per taxon.

Fig. 3

Number of taxa by: 1- life form; 2- endemism; 3- family; 4- conservation status; 5- habit. a sensu lato list (SLL); b sensu stricto list (SSL). Data obtained from the flora package, according to the Flora e Funga do Brasil (2022) and MMA 2022. IUCN categories: LC least concern, NT near threatened, EN endangered, CR critically endangered, VU vulnerable. Values of each line are explained on the side of the pink bar

Sensu stricto list

The sensu stricto list (SSL) resulted in 1087 species and infraspecific taxa, 347 genera, and 89 families (Online Appendix 2). This list showed similar taxonomic patterns to the SLL. Asteraceae was also the best represented family, with 225 taxa, followed by Poaceae (112 taxa) and Fabaceae (58 taxa) (Fig. 3b). Also, most of the taxa are herbs (539 taxa; shrubs: 131 taxa; subshrubs: 108 taxa; Fig. 3b), terrestrial (809 taxa; rupicolous: 37 taxa; epiphytes: 15 taxa; Fig. 3b), and endemic to Brazil (686 taxa; non-endemic: 384 taxa, Fig. 3b). According to the SSL, 133 taxa of angiosperms (12%) are threatened (CR: 33 taxa; VU: 19 taxa; EN: 81 taxa; Fig. 3b), and 20 taxa are near threatened according to the IUCN criteria. Of the total number of angiosperm taxa in the SSL, 223 (ca. 20%) are endemic to the campos de altitude vegetation. The number of taxa did not differ significantly between the two lists (W = 157, p = 0.07).

Species accumulation curves and extrapolation

The analyses of SACs and extrapolated species richness demonstrate a continued increase in the number of species within campos de altitude, suggesting that the number of species can reach up to 4052 (Fig. 4; Table 2). Across the entire dataset, the observed richness accounted for a mean of 68.5% (58–83%) of the extrapolated richness (Chao: 64%; Jack1: 69%; Jack2: 58%; Boot: 83%). Mean completeness varied across the different sites. PNIT and PNCP had the highest means (76.2% and 75%, respectively), while PESB and PIMA exhibited the lowest means (35.5% and 47.4%, respectively). Similar patterns were observed in the separate datasets, with the minimum percentage typically around 45–50%, often associated with Jack2, and the highest percentage occurring in Boot, ranging from 75 to 85% (refer to Table 2 for precise values).

Fig. 4

Species accumulation curves (SACs) per year. a SAC of the entire GBIF species dataset. The blue line represents the SAC estimated by the “rarefaction” method. The boxplot curve represents the SAC estimated by the “random” method; b SAC of each site separately using the GBIF species dataset. Confidence intervals are represented by the shaded areas. Acronyms are explained in Table 1

Hierarchical relationship and dissimilarity

The hierarchical cluster dendrograms based on the SSL and SLL data showed very similar topologies and branch support (Fig. 5; acronyms explained in Table 1). Three main clusters were formed: 1– Northern Serra do Mar (PNSO, PEDS, APAP), with PETP nested together only in SSL; 2– Serra da Mantiqueira and Serra da Bocaina; and 3– Southern Serra do Mar + Serra Geral. Importantly, the Southern and Northern Serra do Mar did not cluster together in any of the analyses; however, the Northern Serra do Mar emerged together with the Serra da Mantiqueira in SSL, and partially in SLL, excluding PETP. Within the Southern Serra do Mar + Serra Geral, the localities in Paraná state (PEPP and ARAC) clustered with strong support in both analyses. In the Northern Serra do Mar communities, PNSO and PEDS clustered together in both analyses. In the Serra da Mantiqueira, we observed a strongly supported cluster of PNSB, PNIT, and PNCP, with PNCP and PNIT clustering together. The relationship among PDMI, PESP, and PECJ was inconclusive, with PECJ + (PESP + PDMI) in SLL and PESP + (PECJ + PDMI) in SSL. The dissimilarity analyses (Table 3) showed results congruent with the cluster analyses, although with some differences. However, when analyzing the lowest dissimilarity for each site, we observed several of the same pairs in both datasets (Table 3).

Fig. 5

Hierarchical cluster dendrograms generated by the R package BinMat. a data from the sensu lato list (SLL); b data from the sensu stricto list (SSL). The number above the branches refers to the bootstrap value (BP) and the number below the branches refers to the approximately unbiased probability values (AU). Colors refer to mountain ranges: reddish-brown, Serra da Mantiqueira; green, Serra do Mar; blue, Serra Geral. Acronyms are explained in Table 1

Table 3 Jaccard dissimilarity analyses in both datasets

Network analyses

The network analyses showed similar results regarding positive and negative connections in the datasets, although with more connections (p < 0.05) in SLL than in SSL (Fig. 6). The analysis based on the SLL dataset (Fig. 6a) showed a large number of positive connections. PNSJ (Serra Geral) showed only negative connections, with PNIT, APAP, and PEDS (p < 0.05). The SSL network also showed a negative connection between PNSJ and PNIT, in addition to positive connections in three groups: Southern Serra do Mar, Northern Serra do Mar, and Serra da Mantiqueira + PNSB, with the latter two positively connected. The Network Comparison Test showed mostly non-significant differences between the network structures of the SLL and SSL datasets in edge weight (per edge: Online Appendix 3; maximum difference in edge weights: p = 1) and global strength (p = 0.77).

Fig. 6

Graphical networks among campos de altitude communities generated by the R packages qgraph and IsingFit. Each line weight of the edges indicates positive (green) and negative (red) connection strength. Alpha transparency refers to the p-values, which are transparent where p > 0.05 and less transparent toward 0. Only connections with p < 0.05 are shown. a data from the sensu lato list (SLL), b data from the sensu stricto list (SSL). Acronyms are explained in Table 1

Discussion

Output lists

Few studies have aimed to estimate the number of plant species in the campos de altitude, and most of them have focused on different collection methods, elevation gradients, or specific regions or groups (Pereira et al. 2006; Ribeiro et al. 2007; Mocochinski and Scheer 2008; Pompeu et al. 2014; Campos-Cordeiro and Neri 2019; Campos et al. 2020). Considering the sites analyzed here, we estimate that the number of angiosperms in campos de altitude is between our SLL (2398) and SSL (1087). The former number is higher since it probably contains species of other vegetation types. On the other hand, the latter may have discarded some records, particularly of grassland plants that occur bordering forests. Further advances in developing reliable species lists of the campos de altitude will depend on collection efforts and record validation by specialists on different families. Therefore, an online and editable list is available at https://doi.org/10.5281/zenodo.8114353.

Previous studies suggested that fewer than 1100 plant taxa (Fundação Florestal 2015), including fewer than 1000 vascular plants (Safford 2007), occur in campos de altitude. We demonstrate that it is not possible to estimate accurately the number of angiosperms in campos de altitude solely through the Flora e Funga do Brasil information on habitat occurrence. Our analyses confirmed that the number of taxa was underestimated in previous inventories. According to the Flora e Funga do Brasil database (searched in August 2022), an estimated 2353 species (and 179 infraspecific taxa) of angiosperms are associated with campos de altitude. However, this number includes certain taxa that do not occur in campos de altitude but in other Brazilian ecosystems, such as the Amazon (tepuis), Cerrado (campos rupestres), and Pampas (campos sulinos). Although some authors have used the term campos de altitude for Amazonian montane grasslands (e.g., Martinelli 2007), we here adopted the more widely recognized use of the term in the literature, that is, only for those grasslands in the Atlantic Forest. Applying an “Atlantic Forest” filter reduces this number to 2170 species (and 163 infraspecific taxa), although this still includes more than 1000 taxa that occur in Rio Grande do Sul state, outside the usually defined geographic range of the campos de altitude (after Martinelli and Orleans e Bragança 1996; Safford 1999a), and not included in the present study.

Family and functional diversity

Grasslands are regions of utmost importance for human survival, representing more than a third of the Earth’s land cover (Buisson et al. 2022). Despite the term “grasslands”, not only grasses, but especially forbs and shrubs are present in these environments (Dixon et al. 2014). Brazil ranks seventh among countries in total grassland area (ca. 17% of its entire land area; White et al. 2000). Here, we found slightly divergent results between the two datasets, with SLL including more tree and epiphyte families than SSL, as expected considering the two methods. On the other hand, Asteraceae and Poaceae were consistently the most diverse families in the campos de altitude (Fig. 3), as observed in previous studies of this vegetation (Safford 1999a; Fundação Florestal 2015; Alves et al. 2016; Flora e Funga do Brasil) and of the alpine flora as a whole in the Americas (Figueroa et al. 2022).

High-altitude environments around the world are occupied by herb- and shrub-rich communities, usually without tall trees (White et al. 2000; Alves and Kolbek 2010; Blair et al. 2014; Dixon et al. 2014; Le Stradic et al. 2015; Mucina 2018; Rada et al. 2019). Here, we found that the number of herbaceous plants is dramatically higher (ca. 3–4 times) than the number of shrubs, the second most common life form in both datasets, which can be explained by the difficult environmental conditions for a tropical vegetation. The campos de altitude environment is subject to wide variations in daily and seasonal temperature, solar radiation, and precipitation (Safford 1999a, b). In the Andean Páramos, grasses and forbs are estimated to be more resistant to water deficiency and low-temperature injury than woody plants (Rada et al. 2019), which could explain the predominance of these life forms in campos de altitude.

The difference in frequency of life forms between our two datasets (i.e., the larger number of tree-like life forms in SLL) reflects that SLL captured cloud forest species, while SSL was more closely restricted to species of campos de altitude. Orchidaceae and Bromeliaceae are represented in campos de altitude (Safford 1999a); however, these families showed a considerable increase in SLL, with a high number of epiphytic species. We also observed an increase in the number of Myrtaceae and Rubiaceae in SLL, which was clearly induced by the surrounding cloud forests that harbor many species of these families (Pompeu et al. 2014).

Conservation, endemism and biogeography

About 59–63% of the angiosperm taxa occurring in campos de altitude are endemic to Brazil, which is similar to the proportion for the Atlantic Forest as a whole (ca. 63%; Flora e Funga do Brasil). Also, our analysis showed that ca. 20% of the taxa (in SSL) occur only in campos de altitude, in general agreement with previous estimates based on more restricted sampling (ca. 21%; Safford 1999a). While their environmental conditions have led to high endemism, campos de altitude are hypothesized to be gateways for plant groups into the Brazilian Shield, especially regarding the Andes-Atlantic Forest pathway, as in the case of certain Bromeliaceae (Givnish et al. 2011) and liverworts (Santos and Costa 2010), and through the co-occurrence of several other groups (Martinelli and Orleans e Bragança 1996; Safford 1999a). In addition to the Andean similarity, campos de altitude share taxa with several other disjunct phytogeographic elements (temperate, holarctic, austral-antarctic; see Safford 1999a) and with other South American mountain regions (Safford 2007; Alves et al. 2007).

Montane ecosystems are rich in endemic species, which makes them a priority for conservation (Chen et al. 2009; Merckx et al. 2015; Assis and Mattos 2016; Rahbek et al. 2019). Most localities harboring campos de altitude are included in protected areas (see Table 1), and are even covered by a broader national law (the Brazilian “Lei da Mata Atlântica”). However, campos de altitude are subject to many kinds of human pressure (Martinelli 2007; Ribeiro and Freitas 2010) and are especially vulnerable to climate change (Neri et al. 2017). Only ca. 20% of the taxa in both lists have been assessed for their conservation status (SLL: 513; SSL: 231), of which a high proportion have a threatened or near threatened status (SLL: 53%; SSL: 65%). One of the primary factors affecting endemic and endangered campos de altitude species is human-caused fires during the dry season. Conversely, natural fires during the wet season are more limited and may even be beneficial for some species (Safford 1999a, 2001; Aximoff and Rodrigues 2011). These results demonstrate the urgency of acting to protect the campos de altitude, through specific national plans for this vegetation and a greater effort to assess the risks to this environment and its species.

Similarity among campos de altitude communities

Similarity analyses focused on different South American mountain ranges suggest the proximity of the campos de altitude flora to floras in other high-elevation regions in South America, especially the Andean region (Safford 2007), although the internal relationship between them still lacks a clear resolution. Our results refuted our expectation of a higher similarity among floras of the same geological formations.

In contrast to our expectation, Southern Serra do Mar sites are more similar to the Serra Geral (PNSJ) than to those in the Northern Serra do Mar (APAP, PNSO, PEDS, and PETP). One explanation for the pattern observed is that the Southern Serra do Mar is farther from the Serra da Mantiqueira + Serra da Bocaina (PEPP to PECJ: ca. 400 km) than the Northern Serra do Mar is (PNSO to PNIT: ca. 150 km); the shorter distance facilitates the flow of species between the latter two. Proximity also seemed to influence the clustering and the network connections of the Northern Serra do Mar sites.

The Serra da Bocaina clustered with the sites of the Serra da Mantiqueira, which is also explained by their proximity (40 km from PNIT, although set apart by the lowland Paraíba do Sul river valley). On the other hand, proximity does not explain the PNIT + PNCP cluster, whose sites are ca. 350 km apart. The large number of taxa shared by PNIT and PNCP is probably related to the recent Last Glacial Maximum, during which the vegetation of the campos de altitude extended to lower elevations, reaching about 700 m in the Caparaó surroundings and forming a “corridor”. These regions were subsequently isolated as the Holocene advanced, as supported by palynological and zoological data (Safford 1999a, b; Gonçalves et al. 2007; Meireles and Shepherd 2015).

Some previous analyses have attempted to investigate the similarity of the campos de altitude to other mountain ecosystems and surrounding physiognomies (Safford 2007; Meireles and Shepherd 2015; Mendonça 2017). Meireles and Shepherd (2015) found that cloud forests at high altitudes are similar to each other, which seemed to be reflected in our locality networks, as SLL showed many more connections than SSL. On the other hand, Mendonça (2017) suggested that the flora of the campos de altitude of the Serra da Mantiqueira is similar to the floras of the campos rupestres of south-central Minas Gerais and the campos sulinos (here also considered as campos de altitude) of the Serra do Mar of Paraná, in contrast to our results. Also, Safford (2007) suggested that the campos de altitude (sensu Safford 2007) are floristically similar to the Aparados da Serra (near PNSJ). Here, PNSJ clustered, with low support, with PEPP + ARAC, and is the only site located in the Serra Geral, at the southern limit of the Serra do Mar.

Extrapolation of species richness and methodological considerations

The lack of collections can lead to premature conclusions in spatial analyses (Farooq et al. 2021). As expected, our species accumulation curves (SACs) show an increasing trend in the number of species in most sites and in the total dataset. The areas differ greatly in size and accessibility, which affects sampling sufficiency and collection efforts. Taxonomic, geographic, and temporal biases (according to Meineke and Daru 2021) are often observed in these datasets.

The SACs and extrapolated richness show that the expected number of species varies among sites. However, our SAC analyses did not include the species imputed from recent reviews of two important sites (PNIT and PNCP, as detailed in the methods section), which could have made our species lists more comprehensive.

Considering the entire dataset, the observed species represent between 58% and 83% of the estimated true richness, depending on the estimation method used. Although SACs can provide an estimate of expected species numbers, they are not always considered reliable (Longino et al. 2002; Brose et al. 2003).
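To make the notion of completeness concrete, the minimal Python sketch below computes two widely used incidence-based richness estimators, the bias-corrected Chao2 and the first-order jackknife, and the ratio of observed to estimated richness. The choice of these two estimators, the treatment of each set as one sampling unit, and the toy data are assumptions made for illustration; they do not reproduce the estimators or the data used in this study.

```python
from collections import Counter


def incidence_counts(samples):
    """Count in how many sampling units each species occurs."""
    counts = Counter()
    for unit in samples:
        counts.update(set(unit))
    return counts


def _summary(samples):
    """Return (m, S_obs, Q1, Q2): units, observed species, uniques, duplicates."""
    if not samples:
        raise ValueError("at least one sampling unit is required")
    counts = incidence_counts(samples)
    q1 = sum(1 for c in counts.values() if c == 1)
    q2 = sum(1 for c in counts.values() if c == 2)
    return len(samples), len(counts), q1, q2


def chao2(samples):
    """Bias-corrected Chao2: S_obs + ((m - 1) / m) * Q1 * (Q1 - 1) / (2 * (Q2 + 1))."""
    m, s_obs, q1, q2 = _summary(samples)
    return s_obs + ((m - 1) / m) * q1 * (q1 - 1) / (2 * (q2 + 1))


def jackknife1(samples):
    """First-order jackknife: S_obs + Q1 * (m - 1) / m."""
    m, s_obs, q1, _ = _summary(samples)
    return s_obs + q1 * (m - 1) / m


def completeness(samples, estimator):
    """Observed richness as a fraction of the estimated richness."""
    return len(incidence_counts(samples)) / estimator(samples)


# Hypothetical sampling units (assumption): each set lists the species recorded once.
samples = [{"a", "b", "c"}, {"a", "b", "d"}, {"a", "c", "e"}, {"a", "f"}]

for name, estimator in (("Chao2", chao2), ("Jackknife 1", jackknife1)):
    print(name, estimator(samples), round(completeness(samples, estimator), 3))
```

Because different estimators weight rare species differently, the same dataset can yield noticeably different completeness values, which is one reason why a range rather than a single figure is reported.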

Moreover, the campos de altitude represent only approximately 350 km² of the Atlantic Forest (Safford 1999a), which covers about 200,000 km². At first glance, it might seem unlikely that this relatively small area encompasses a quarter of the native species found within this domain, given that the Flora and Funga do Brasil estimates that the Atlantic Forest hosts approximately 16,000 angiosperm species. Nevertheless, it is plausible that the number of angiosperms in the Atlantic Forest is underestimated. Furthermore, given the distinctiveness of the campos de altitude habitat in comparison with the forest matrix, it is conceivable that they contribute a substantial portion, potentially a quarter, of the species within this domain.

It is important to mention that the shape of the SAC can be influenced by the spatial distribution of species, which can vary depending on the habitat and environmental conditions (Brose et al. 2003). Hence, it is crucial to exercise caution when interpreting the results obtained from SACs and to consider other factors that may impact estimates of species richness. By employing multiple methodologies, we can obtain a more robust estimate of true species richness and enhance the reliability of our conclusions.

Although SACs extrapolate the number of species in our dataset, our statistical analyses were based on a random subset of all occurring species. In our study, the sites with the most collections were mostly those with the most species, a pattern that is influenced by several factors and that in turn affects the analyses performed. PNCP, PNSJ, and PNIT have the largest extent of campos de altitude and the most collections; however, the sampling bias goes beyond these aspects, especially regarding accessibility. Sites that are near large cities, universities, and research institutions, or that are easily accessed, are better sampled (Meineke and Daru 2021). Some of the sites are only accessible by long trails, sometimes requiring more than a day of walking (e.g., PDMI), while others can be reached by car (e.g., PNIT). Our results revealed a disproportionate collecting effort among the campos de altitude, so further collections should focus on the less collected sites, especially PEDS, PESP, PECJ, APAP, and PIMA (mean collections per taxon < 2). On the other hand, even the most intensely collected sites did not reach saturation in the number of species.
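The collecting-effort metric mentioned above, the mean number of collections per taxon, can be sketched as follows. This is a minimal Python illustration under stated assumptions: the record format (one site and taxon pair per collection), the function names, and the toy records are hypothetical, and only the threshold of 2 comes from the text.

```python
from collections import defaultdict


def mean_collections_per_taxon(records):
    """Return, per site, the number of collections divided by the number of taxa."""
    n_collections = defaultdict(int)
    taxa = defaultdict(set)
    for site, taxon in records:
        n_collections[site] += 1
        taxa[site].add(taxon)
    return {site: n_collections[site] / len(taxa[site]) for site in n_collections}


def undersampled_sites(records, threshold=2.0):
    """List sites whose mean collections per taxon falls below the threshold."""
    means = mean_collections_per_taxon(records)
    return sorted(site for site, mean in means.items() if mean < threshold)


# Hypothetical collection records (assumption): one (site, taxon) pair per collection.
records = [
    ("site_A", "sp1"), ("site_A", "sp1"), ("site_A", "sp1"),
    ("site_A", "sp2"), ("site_A", "sp2"), ("site_A", "sp2"),
    ("site_B", "sp1"), ("site_B", "sp3"), ("site_B", "sp3"),
]

print(mean_collections_per_taxon(records))
print(undersampled_sites(records))
```

A low value indicates that most taxa at a site are known from a single collection, which suggests that additional fieldwork is likely to add new records.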

Methodological bias is always present in statistical analyses. Here we tried to standardize biases and errors by submitting all data to the same method, without further external intervention. However, collection bias is present in spatial databases such as GBIF (Beck et al. 2014), a frequent problem in analyses of large-scale occurrence data. GBIF data come from different collections, which have different collection methods and focus, and therefore some groups and sites are better sampled than others. This can also be problematic when building SACs, as there may be no clear sampling events. On the other hand, by using automated methods to clean the raw matrix, we eliminated some of these biases, although it was then necessary to deal with information loss (according to Zizka et al. 2020), one of the main issues of filter methods. In traditional checklist methods, some taxonomic groups are better studied than others, and taxonomic revisions are often performed by researchers who are not specialists in all the families involved, resulting in uneven accuracy of species identification. Therefore, to reduce the bias of information loss and to help reach complete coverage of species, we propose the creation of an editable list to seek an accurate number and the identity of the taxa that occur in campos de altitude.

Conclusions

According to the two methods we tested, between 1087 and 2398 angiosperm taxa are present in the Brazilian campos de altitude, an increase over previous estimates. However, according to our extrapolation analyses, this number could potentially exceed 4000. Geographical proximity is a better predictor of floristic similarity than presence in the same mountain range. Also, we found that relatively few taxa in campos de altitude have conservation assessments, and of these, ca. 53–65% are threatened or near threatened. Many collections in campos de altitude have focused on only a few sites, and this should change if we are to gain a more comprehensive knowledge of this unique tropical ecosystem. New collections should be carried out in under-sampled sites, as only then will we be able to determine the true diversity of these sites.

Overall, our SLL and SSL showed similar patterns, indicating that our “filtering” method based on location, elevation and canopy height may be suitable for estimating biodiversity in campos de altitude and in other mountain environments. The estimated numbers and analyses are mostly proportionally similar between the two datasets. The differences between the SLL and SSL networks and numbers of taxa were not statistically significant.

We make available an online, editable list and invite contributions from taxonomists, with the aim of eventually reaching a precise number of angiosperm species that can serve as a baseline for phytogeographic and conservation analyses. Finally, we suggest that further conservation and biogeographic studies explore the actual conservation status of campos de altitude species and the drivers of their great diversity.