Introduction

Environmental DNA (eDNA) metabarcoding is a non-invasive technique focused on the retrieval of DNA materials produced by organisms and released into their environment (Fukumoto et al. 2015; Boussarie et al. 2018; Garlapati et al. 2019). The method can provide broader biodiversity assessments compared to more classic methods (e.g. Smart et al. 2015; Shaw et al. 2016; Evans et al. 2017; Ruppert et al. 2019) due to the minimal reliance on observer expertise and the ability to detect hidden species. The use of eDNA metabarcoding is increasing globally and has been applied to a wide range of fields in a variety of terrestrial and marine ecosystems. For example, eDNA has been used to understand how biodiversity might vary under different degrees of anthropogenic stress (DiBattista et al. 2019, 2020; Alexander et al. 2020), and to identify and genotype molluscs within laboratory settings (e.g. Ardura et al. 2015; Clusa et al. 2017; Holman et al. 2019). This method has also been crucial to developing biomonitoring tools to identify seasonal patterns of zooplankton in Western Australia (Berry et al. 2019; West et al. 2020). Despite its perceived usefulness and wide applications, in marine environments eDNA applications have tended to be overly wide in focus, with few studies focused on large taxonomic groups responsible for a major percentage of marine biodiversity (e.g. Closek et al. 2019). Although informative, such research can only provide broad estimates of biodiversity without providing an understanding of how each group responds to change, especially anthropogenic stresses. Therefore, research focused on identifying perturbation-sensitive taxa needs to be emphasized, as such data could help to provide reliable and comprehensive biodiversity assessments that are useful for refining conservation practices.
The Chordata (fish, mammals, birds, reptiles, and amphibians), Mollusca (characterised by a soft body and generally distinguishable head and foot regions, including sea snails, clams, octopus and squid) and Echinodermata (characterised by radial symmetry and a body covered by calcareous plates that can be moveable or fixed, including sea stars, sea cucumbers and sea urchins) are three phyla that contribute large percentages to marine biodiversity (Gray 1997; Sala and Knowlton 2006; Tittensor et al. 2010). They play important economic and ecological roles and include many of the species which act as keystone taxa pivotal for the integrity of marine ecosystems (Grassle et al. 1991; Raghukumar and Anil 2003; Strong et al. 2015). The Indonesian seas are reported to contain 10% of the global marine species diversity covering 574 coral species (Veron et al. 2009), 13 seagrass species (Burke and Selig 2002), 45 species of mangrove (Spalding 2010), and 6 species of sea turtle (Ario et al. 2016). Despite this expected high biodiversity, there is still a paucity of genetic records of marine taxa from Indonesia in the public databases and a considerable proportion of existing records are incomplete. For example, a number of locally published studies have recorded immensely rich echinoderm diversity in several sites across Indonesia, including 44 species in the Banda Sea (Setyastuti and Wirawati 2008); 76 species in Lembeh Strait (Supono et al. 2014); 692 species in North Natuna (Lane et al. 2000). High molluscan and reef fish diversity has also been reported from eastern Indonesia; for example, the Raja Ampat area is home to at least 699 mollusc species and 972 reef fishes in 45 observation sites (Allen 2003), and it plays an important role in the lives of local coastal communities (Nijman 2019).
Nonetheless, data from Indonesia are poorly represented in the BOLD system database (www.v3.boldsystems.org), with only 201 molluscan and 35 echinoderm records, and 1509 fish species belonging to the Actinopterygii (Chordata). Although there may be several other reasons, the limited number and taxonomic scope of Indonesian marine biodiversity records in public databases is most likely related to the relatively low level of genetic research and observations made in this country. This indicates a need to encourage research efforts, particularly using molecular tools (primarily DNA barcoding) combined with morphological identification in order to reliably detect and record new species and to expand the range of taxa with reference barcode database accessions, as well as the use of eDNA methods to detect the magnitude of marine biodiversity.

In some areas, coral reef organisms at their planktonic larval dispersal phase comprise a large proportion of eDNA materials in the ocean. For example, although overall eDNA release is higher in larger Japanese eels (Anguilla japonica), the rate of release (relative to body weight) is higher in early life-stages and there is a significant discharge of eDNA associated with spawning (Takeuchi et al. 2019), while juvenile bluegill sunfish (Lepomis macrochirus) also release eDNA at a higher rate than adult fishes (Maruyama et al. 2014). Many larvae float in the water column or remain close to the air-sea interface; some settle close to their natal habitat, while others drift away and can become widely dispersed. Larval dispersal could theoretically be measured directly, but exact movements are unknown without direct tracking (Saenz-Agudelo et al. 2011; Madduppa et al. 2014; Williamson et al. 2016; Timm et al. 2017). One challenge in direct larval trajectory tracking research is that species-level identification is often difficult, preventing a precise estimate of larval movement for specific taxa. To address this issue, eDNA methods can provide an alternative means of identifying larval-stage taxa (Baltazar-Soares et al. 2020; Ratcliffe et al. 2020). eDNA has been used to detect evidence of larval distribution following spawning events in sand eel and clupeid species (Ratcliffe et al. 2020). Albeit arguably still in its infancy, the use of eDNA is a promising tool that can provide species-specific data on larval dispersal and thereby aid in the detection of larval distribution patterns including settlement habitat location and quality as well as life history characteristics of the taxa involved (Cowen et al. 2006; Kool et al. 2011; Treml et al. 2012, 2015).

The high species diversity in the Coral Triangle is widely recognized (Hughes et al. 2002; Tittensor et al. 2010); however, threats to this region include over-exploitation and alarming rates of anthropogenic activities (Burke et al. 2012; Halpern et al. 2015). One response to this situation has been the growing extent of Indonesian coral reefs designated as Marine Protected Areas (MPAs) with the twin goals of protecting biodiversity and ensuring long term sustainability of their use (Mustika and Ratha 2013; Hamdani 2018; Krueck et al. 2017). There are also protection frameworks specifically targeting important taxa such as top predators (e.g. sharks and groupers), migratory species (e.g. marine mammals), and other ecologically important groups (e.g. some molluscs and echinoderms), and economically important species under two Indonesian ministerial decrees (KEPMEN-KP by Ministry of Marine Affairs and Fisheries; PERMEN KLHK by Ministry of Environment and Forestry of Republic of Indonesia). By the first quarter of 2020, around 23 million hectares had been protected by the Indonesian government under a range of MPA designations (0.07% of Indonesian territorial waters) with targets to protect another nine million hectares by 2030. In addition to MPAs, the Indonesian government is also managing fisheries within eleven Fisheries Management Areas (FMAs). These FMAs, which were designated based on ecological characteristics and the abundance of fish stocks, will serve as units for integrated fisheries management with core functions including conservation as well as the control and surveillance of fishing and related activities. However, despite ongoing improvements in fisheries management across Indonesia, there is a severe lack of research that can compare species composition between management units. To support future management activities and ensure that the FMAs can achieve their management targets (e.g. sustaining fish stocks and marine biodiversity), estimates of taxonomic composition based on reliable data are needed. Such data could be used to build effective fisheries management plans including conservation areas that are resilient to present and future threats.

Detailed information on the distribution and abundance of species is critical for the design of conservation areas, in particular MPAs, in order to ensure long term benefits from the marine biodiversity conserved (Beger et al. 2015) and maximise the potential to enhance both fisheries management and conservation (Krueck et al. 2017). Therefore, this study was conducted to assess marine biodiversity across the Indonesian coral reef systems and to investigate the relative abundance and distribution of ecologically and economically important coral reef-associated organisms across fisheries management areas in Indonesia using eDNA methods. We focused on the between-site differences in three phyla (Chordata, Mollusca, and Echinodermata) that comprise the majority of taxa in coral reef communities. The data obtained will help to elucidate patterns of biodiversity while providing crucial information for species conservation and fisheries management.

Materials and methods

eDNA seawater sample collection

A total of 92 eDNA seawater samples were collected from 17 sites across nine locations in Indonesia (Fig. 1). The sampling locations in this study fall into four Fisheries Management Areas (FMA). FMA 712 was represented by the Seribu Islands in Jakarta and Probolinggo in East Java; FMA 713 by Lombok in West Nusa Tenggara and Selayar in South Sulawesi; FMA 714 by the Wakatobi Archipelago in Southeast Sulawesi; and FMA 715 by sites in Gorontalo, North Maluku, Raja Ampat, and Misool in West Papua. At each site two to four replicate eDNA seawater samples were collected. The eDNA seawater samples were collected directly from the surface or underwater by diving using Self-Contained Underwater Breathing Apparatus (SCUBA) at depths between one and nine meters. This depth was determined by the depth at which coral reefs were mostly observed at the study sites. The seawater samples were collected in 4 L water bottles. Each seawater sample was filtered through a 0.45 µm Pall Corporation sterilized filter paper (47 mm diameter) using a peristaltic pump. The filtration process was interrupted if the flow stopped due to clogging of the filter. A protocol developed by the University of Rhode Island was strictly followed to avoid contamination, using 10% bleach and distilled water to sterilise all equipment used between samples and sampling sites. Filtration was completed at the end of each sampling day. After filtering was completed, each filter paper was placed in a 2 mL cryotube filled with 1.5 mL ZymoBIOMICS DNA/RNA shield.

Fig. 1

Nine eDNA seawater sample collection locations across Indonesia: (1) Seribu Islands, Jakarta; (2) Probolinggo, East Java; (3) Lombok, West Nusa Tenggara; (4) Selayar, South Sulawesi; (5) Wakatobi, Southeast Sulawesi; (6) Gorontalo; (7) North Moluccas; (8) Raja Ampat, West Papua; (9) Misool, West Papua. The Indonesian Fisheries Management Area (FMA) boundaries are also shown. The locations represent FMA 712 (locations 1 and 2), 713 (locations 3 and 4), 714 (location 5), and 715 (locations 6, 7, 8 and 9)

eDNA laboratory analysis, library preparation, and next-generation sequencing

DNA extraction was carried out after the field sampling trip (not more than one month after collection) using ZymoBIOMICS DNA extraction kits produced by the Zymo Research Corporation following the manufacturer's guidelines. An approximately 313 bp segment of the mtDNA COI region was amplified using a universal COI primer pair: mICOlintF-adpt as forward (5′-TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG GGW ACW GGW TGA ACW GTW TAY C) and dgHCO2198-adpt as reverse (5′-GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GTA AAC TTC AGG GTG ACC AAA RAA) (Leray et al. 2013). Polymerase Chain Reaction (PCR) reactions contained 2.5 μL 10× buffer, 2.5 μL MgCl2, 0.5 μL mICOlintF-adpt, 0.5 μL dgHCO2198-adpt, 0.5 μL dNTPs, 0.15 μL Taq polymerase, 17.35 μL ddH2O and 1 μL sample template. The PCR profile comprised an initial pre-denaturation stage at 94 °C for 1 min, 30 cycles of denaturation at 94 °C for 30 s, annealing at 42 °C for 30 s and extension at 72 °C for 1 min, followed by a final extension at 72 °C for 8 min.
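As a quick consistency check on the reaction recipe and cycling profile described above, the following Python sketch sums the listed component volumes and estimates the programmed hold time. The variable names are illustrative, and ramp times between temperatures are ignored (an assumption), so the actual run time on a thermocycler would be longer.

```python
# Component volumes (µL) for one PCR reaction, as listed in the text.
reaction_ul = {
    "10x buffer": 2.5,
    "MgCl2": 2.5,
    "forward primer": 0.5,
    "reverse primer": 0.5,
    "dNTPs": 0.5,
    "Taq polymerase": 0.15,
    "ddH2O": 17.35,
    "template": 1.0,
}
# Round to two decimals to absorb floating-point noise in the sum.
total_volume_ul = round(sum(reaction_ul.values()), 2)

# Cycling profile in seconds; ramp times are not included (assumption).
initial_denaturation_s = 60
cycles = 30
per_cycle_s = 30 + 30 + 60  # denaturation + annealing + extension
final_extension_s = 8 * 60
hold_time_s = initial_denaturation_s + cycles * per_cycle_s + final_extension_s

print(f"Total reaction volume: {total_volume_ul} µL")
print(f"Programmed hold time: {hold_time_s / 60:.0f} min")
```

Under these assumptions the components add up to a 25 µL reaction.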

Libraries were purified using AMPure XP beads (Beckman Coulter) according to the Illumina MiSeq 16S metagenomic sequencing library protocol (Illumina 2020). Dual indices and Illumina sequencing adapters from the IDT for Illumina-Nextera DNA Unique Dual Indexes, Set A (catalogue number 20027213) (Illumina, San Diego, USA) were added to the target amplicons in a second PCR step using Kapa HotStart HiFi 2× ReadyMix DNA polymerase (Kapa Biosystems Ltd., London, UK). The PCR profile was 95 °C (3 min), then 9 cycles of 95 °C (30 s), 55 °C (30 s), 72 °C (30 s), followed by a final extension at 72 °C (5 min). Libraries were quantified using the Qubit fluorometric method. The barcode amplicon libraries were combined in equal concentrations into a single pool according to their quantification measurement. The library pool was diluted and denatured according to the Illumina MiSeq library preparation guide. The amplicon library (10 pM) was spiked with 20% denatured and diluted PhiX Illumina control library version 3. The sequencing was conducted on an Illumina MiSeq using the MiSeq reagent kit V3 600 cycle. All library preparation and sequencing were conducted at the Oceanogen Environmental Biotechnology Laboklinikum (Oceanogen) in Bogor. Oceanogen is a recently established laboratory facility that aims to expand environmental DNA research efforts across Indonesia. All relevant protocols and information on ongoing projects are available at www.oceanogen.co.id.
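The pooling step can be illustrated with a minimal sketch. It assumes that all libraries have a similar fragment length, so that equal DNA mass approximates equal molarity; the function name `pooling_volumes`, the target mass and the Qubit readings are hypothetical and are not values from this study.

```python
def pooling_volumes(conc_ng_per_ul, target_ng=20.0):
    """Return the volume (µL) of each library that contributes target_ng of DNA."""
    if any(c <= 0 for c in conc_ng_per_ul.values()):
        raise ValueError("concentrations must be positive")
    return {name: target_ng / c for name, c in conc_ng_per_ul.items()}

# Hypothetical Qubit readings (ng/µL); not values from this study.
qubit = {"lib_A": 10.0, "lib_B": 4.0, "lib_C": 20.0}
volumes = pooling_volumes(qubit)
for name, vol in volumes.items():
    print(f"{name}: {vol:.1f} µL")
```

More dilute libraries contribute a larger volume, so each library adds the same mass of DNA to the pool.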

Bioinformatics and data analysis

All eDNA sequences were analysed using the Anacapa pipeline (Curd et al. 2019). Instead of using a publicly available database (e.g., MIDORI, QIIME2 database), a COI database was built using the CRUX package (Creating Reference libraries Using eXisting tools; https://github.com/limeybean/Anacapa/tree/NewMaster/Anacapa_db) to create a de novo sequence reference library combining records from publicly accessible databases, such as European Molecular Biology Laboratory (EMBL) and National Center for Biotechnology Information (NCBI). Upon completion of database development, Anacapa de-multiplexed the amplicon reads based on the primer sequences mentioned in the laboratory analysis section and trimmed the primers from the reads. Anacapa employed the DADA2 algorithm (Callahan et al. 2016) to denoise and perform error correction on the raw sequence data, merge paired-end reads, and assign high quality reads to Amplicon Sequence Variants (ASVs) through ASV Parsing. In the last step of the pipeline, Anacapa used Bowtie 2 and the Bayesian Least Common Ancestor algorithm to assign ASVs to taxa by using a default likelihood threshold value as suggested by Gao et al. (2017). For most analyses, we excluded singletons to provide the most conservative measure of diversity. To examine patterns of taxonomic diversity, ASV tables output from DADA2 were converted to the BIOM format and imported to R for diversity analyses, statistical testing, and data visualization.
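The singleton filter mentioned above can be sketched as follows. The text does not define "singleton" precisely, so this example assumes the common definition of an ASV represented by a single read across all samples; the function name `drop_singletons` and the ASV table are hypothetical.

```python
def drop_singletons(asv_table):
    """Remove ASVs whose total read count across all samples is <= 1."""
    return {
        asv: counts
        for asv, counts in asv_table.items()
        if sum(counts.values()) > 1
    }

# Hypothetical ASV table: ASV id -> {sample id -> read count}.
asv_table = {
    "ASV_1": {"S1": 120, "S2": 0},
    "ASV_2": {"S1": 1, "S2": 0},  # singleton: one read in total
    "ASV_3": {"S1": 1, "S2": 1},  # two reads in total, kept
}
filtered = drop_singletons(asv_table)
print(sorted(filtered))
```

A stricter variant could instead drop ASVs that occur in only one sample; which definition applies depends on the pipeline settings.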

The core-metrics pipeline from the phyloseq package (McMurdie and Holmes 2013) plugin in R (R development core team) was used to analyse and visualize DNA amplicon data using an even sampling (rarefaction) depth for each sampling locality. The number of reads ranged from 2,361 to 1,054,491 reads per sample (Suppl. 1). For downstream analysis, we excluded taxa within the Orders Primates and Rodentia and the family Bovidae from the Class Mammalia, as well as the Orders Passeriformes and Galliformes from the Class Aves. Furthermore, we also omitted bacterial taxa and rarefied the data to an even depth of 5,219 reads per sample. Using this filtering approach, we discarded two samples with fewer than 5,000 reads per sample (LBK_175 and RA_107); hence we only used 90 samples from nine locations and 17 sites for further data analysis. This post-filtering step provided a total of 469,710 reads with 15,739 ASVs consisting of the Eukaryote Domain and unidentified taxa. The taxonomic composition of each sample was summarized with taxa merged at seven different levels (Kingdom, Phylum, Class, Order, Family, Genus, and Species), and filtered, so that groups representing on average less than 2% (at phylum and order level) and 10% (at species level) of the community were removed. Venn diagrams were constructed through http://bioinformatics.psb.ugent.be/webtools/. We examined patterns of alpha diversity by calculating taxonomic richness for all ASVs and three major phyla (Chordata, Mollusca, and Echinodermata) for each location and Fisheries Management Area (FMA). Multivariate analysis (PERMANOVA) based on the presence/absence of ASVs (Jaccard similarity) and the abundance of reads (Bray–Curtis) was performed in the Vegan package (Oksanen et al. 2018) in R version 3.6.3 (R Development Core Team 2013). Statistical significance was tested using 9999 permutations with a confidence level of α = 0.05.
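Rarefaction to an even depth can be illustrated with the sketch below. It is a simplified stand-in written in plain Python, not the routine used in this study: it subsamples reads without replacement and drops samples with fewer reads than the target depth, whereas the R implementation may behave differently (for example, in whether it samples with replacement). The function name `rarefy`, the seed and the read counts are hypothetical.

```python
import random
from collections import Counter

def rarefy(counts, depth, rng):
    """Subsample `depth` reads without replacement; return None if too shallow."""
    total = sum(counts.values())
    if total < depth:
        return None
    # Expand the count table into one entry per read, then draw a subsample.
    reads = [asv for asv, n in counts.items() for _ in range(n)]
    return Counter(rng.sample(reads, depth))

rng = random.Random(42)  # fixed seed so the sketch is repeatable
samples = {  # hypothetical read counts per ASV
    "deep": {"ASV_1": 600, "ASV_2": 300, "ASV_3": 100},
    "shallow": {"ASV_1": 40, "ASV_2": 10},
}
depth = 200
rarefied = {name: rarefy(c, depth, rng) for name, c in samples.items()}
kept = {name: r for name, r in rarefied.items() if r is not None}
for name, r in kept.items():
    print(name, sum(r.values()), dict(r))
```

After this step every retained sample has the same number of reads, so richness comparisons are not driven by sequencing depth alone.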
We tested the alpha and beta diversity between locations and FMAs with equal numbers of samples. We analysed seven sampling areas (locations) with four samples each (two locations, East Java and Selayar, had fewer than four samples). The analysis with an equal number of samples per FMA only included one location per FMA (Seribu Islands, Lombok, Wakatobi, Raja Ampat) with four samples from each location. Principal Coordinates Analysis (PCoA) was implemented using the Ampvis2 package (Andersen et al. 2018) with the phyloseq ordination function for both Jaccard and Bray–Curtis dissimilarity matrices using ggplot2 (Wickham 2009).
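For readers unfamiliar with the two indices, the following sketch computes the Jaccard dissimilarity (presence/absence) and the Bray–Curtis dissimilarity (read abundance) for a pair of samples using their standard definitions. The function names and the two example samples are hypothetical; the study itself used R packages for these calculations.

```python
def jaccard_dissimilarity(a, b):
    """1 - |shared ASVs| / |ASVs present in either sample| (presence/absence)."""
    present_a = {k for k, v in a.items() if v > 0}
    present_b = {k for k, v in b.items() if v > 0}
    union = present_a | present_b
    if not union:
        return 0.0
    return 1 - len(present_a & present_b) / len(union)

def bray_curtis_dissimilarity(a, b):
    """Sum of absolute count differences divided by the total count of both samples."""
    keys = set(a) | set(b)
    total = sum(a.get(k, 0) + b.get(k, 0) for k in keys)
    if total == 0:
        return 0.0
    return sum(abs(a.get(k, 0) - b.get(k, 0)) for k in keys) / total

# Hypothetical read counts for two samples.
s1 = {"ASV_1": 10, "ASV_2": 5, "ASV_3": 0}
s2 = {"ASV_1": 4, "ASV_2": 0, "ASV_3": 6}
jaccard = jaccard_dissimilarity(s1, s2)
bray_curtis = bray_curtis_dissimilarity(s1, s2)
print(round(jaccard, 3), round(bray_curtis, 3))
```

Both indices range from 0 (identical composition) to 1 (no shared ASVs); Jaccard ignores read counts, while Bray–Curtis is sensitive to them.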

Results

Biodiversity assessment using COI marker with eDNA approach

A total of 13,819,634 paired-end reads corresponding to 23,252 unique sequences were generated from the COI amplicons obtained from the 92 samples collected from 17 sites across nine sampling locations. Of the reads used for further analysis, the Eukaryote taxa identified comprised 52% (8,228) of all ASVs and 68% (319,845) of all reads, while unidentified taxa comprised 48% (7,511) of ASVs and 32% (149,865) of reads.
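The figures reported here and in the Methods are internally consistent, as the short arithmetic check below shows. It uses only numbers stated in the text (90 retained samples, a rarefaction depth of 5,219 reads, and the ASV and read counts above); the variable names are illustrative.

```python
samples_retained = 90
rarefaction_depth = 5219
total_reads = samples_retained * rarefaction_depth

eukaryote_asvs, unidentified_asvs = 8228, 7511
eukaryote_reads, unidentified_reads = 319845, 149865
total_asvs = eukaryote_asvs + unidentified_asvs

def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)

print(total_reads, eukaryote_reads + unidentified_reads)
print(total_asvs)
print(pct(eukaryote_asvs, total_asvs), pct(unidentified_asvs, total_asvs))
print(pct(eukaryote_reads, total_reads), pct(unidentified_reads, total_reads))
```

The product of sample number and rarefaction depth matches the 469,710 reads reported after filtering, and the ASV counts sum to the reported 15,739.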

At the kingdom level, taxonomic composition across the sampling locations was dominated by unidentified taxa (Fig. 2). Based on taxon presence-absence, Kingdom Animalia contributed a higher proportion (13%) than Protista (11%), Plantae (5%), and Fungi (3%). Based on read abundance, the Kingdom Fungi was dominant in East Java and prominent in Raja Ampat.

Fig. 2

Barplot based on Kingdom for nine sampling locations based on the number of reads (A) and number of taxa (B). The bar plot was constructed based on phyla contributing more than 2% of the relative abundance of each sample

At the phylum level (Fig. 3), based on read abundance, Lombok and Wakatobi had a higher proportion of Chlorophyta than unidentified taxa. East Java had high read abundance of Ascomycota and Bacillariophyta, with the most Chordata detected in Raja Ampat. Based on taxon presence-absence, the phyla Arthropoda and Bacillariophyta, and unidentified taxa were the most common taxonomic groups across all locations.

Fig. 3

Barplot based on Phylum for nine sampling locations based on the number of reads (A) and the number of taxa (B). The bar plot was constructed based on phyla contributing more than 2% of the relative abundance in each sample

This study also grouped the locations based on their position in the Fisheries Management Areas (FMAs) (Fig. 4). Comparing the taxonomic composition between the FMAs, FMA 715 had the highest number of unique taxa in this study and FMA 714 the lowest. In particular, FMA 715 sites had more unique species from the Chordata, Mollusca, and Echinodermata. All FMAs shared 45 taxa. The highest species overlap was between FMA 712 and FMA 715, even though FMA 715 had the most unique taxa. Interestingly, FMA 714 and FMA 712 had no shared chordate, echinoderm, or molluscan species. This result could be due to the ASV approach, which considers every unique sequence as a single taxonomic unit.
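The quantities summarised in a Venn diagram of this kind (taxa shared by all FMAs, taxa unique to one FMA, and pairwise overlap) reduce to set operations. The sketch below uses hypothetical taxon lists, not data from this study, and the pairwise counts are total overlaps between two FMAs rather than the exclusive regions drawn in a four-set diagram.

```python
from itertools import combinations

# Hypothetical taxon lists per FMA; not data from this study.
taxa_by_fma = {
    "712": {"sp_a", "sp_b", "sp_c"},
    "713": {"sp_a", "sp_c", "sp_d"},
    "714": {"sp_a", "sp_e"},
    "715": {"sp_a", "sp_b", "sp_f", "sp_g"},
}

# Taxa detected in every FMA.
shared_by_all = set.intersection(*taxa_by_fma.values())

# Taxa detected in one FMA and in no other.
unique = {
    fma: taxa - set().union(*(t for other, t in taxa_by_fma.items() if other != fma))
    for fma, taxa in taxa_by_fma.items()
}

# Number of taxa shared by each pair of FMAs.
pairwise = {
    (a, b): len(taxa_by_fma[a] & taxa_by_fma[b])
    for a, b in combinations(sorted(taxa_by_fma), 2)
}

print(sorted(shared_by_all))
print({fma: sorted(t) for fma, t in unique.items()})
print(pairwise)
```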

Fig. 4

Venn diagram showing the number of overlapping species between Fisheries Management Areas (FMAs) for A all species, B Phylum Chordata, C Phylum Mollusca, and D Phylum Echinodermata. The FMAs (712, 713, 714 and 715) are colour-coded

At the location level, when including all ASVs, Lombok (West Nusa Tenggara) had the highest number of ASVs (3764), followed by Misool (2518 ASVs), the Seribu Islands (2473 ASVs), and North Moluccas (2204 ASVs), and the lowest number was recorded in Selayar (528 ASVs) (Table 1). However, when considering only ASVs from the phyla Chordata, Mollusca, and Echinodermata, Raja Ampat had the highest number of ASVs (140), followed by Gorontalo (138 ASVs) and the Seribu Islands (101 ASVs), and Wakatobi had the lowest number (14 ASVs). At the FMA level, FMA 715 had the highest number of total ASVs (7722) as well as for ASVs in the three major phyla (350). The lowest number of ASVs was recorded from FMA 714 for both total ASVs (1839) and the three major phyla (14) (Table 2).
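Taxonomic richness as used here is a count of distinct ASVs detected within a group of samples. A minimal sketch of that calculation is given below; the function name `richness` and the grouped samples are hypothetical and do not reproduce the values in Tables 1 and 2.

```python
def richness(sample_counts):
    """Number of distinct ASVs with at least one read across the given samples."""
    observed = set()
    for counts in sample_counts:
        observed.update(asv for asv, n in counts.items() if n > 0)
    return len(observed)

# Hypothetical samples grouped by location.
by_location = {
    "loc_1": [{"ASV_1": 5, "ASV_2": 0}, {"ASV_2": 3, "ASV_3": 1}],
    "loc_2": [{"ASV_1": 2, "ASV_4": 0}],
}
richness_by_location = {loc: richness(s) for loc, s in by_location.items()}
print(richness_by_location)
```

Because richness is pooled across samples, groups with more samples can accumulate more ASVs, which is one reason for also running the analyses with equal numbers of samples per group.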

Table 1 Taxonomic richness from environmental DNA (eDNA) data from nine sampling locations across Indonesia
Table 2 Taxonomic richness from environmental DNA (eDNA) data from four Fisheries Management Areas (FMAs) across Indonesia

Beta diversity analysis showed significant differences in taxonomic composition between locations and between FMAs (PERMANOVA: p < 0.05) based on Bray–Curtis and Jaccard indices. This is consistent with two other tests (equal numbers of samples per location and FMA). PCoA plots (Figs. 5, 6) show a clustering pattern, where Wakatobi sites closely clustered with Lombok sites. Some Raja Ampat sites clustered with Lombok, but Misool in Raja Ampat formed an isolated cluster, while sites in other locations clustered together.

Fig. 5

Beta diversity ordination plots based on Bray–Curtis and Jaccard indices: A and B include all samples (9 locations); C and D include 7 locations with an equal number of samples (four) per location (East Java and Selayar had fewer than four samples and were excluded from this analysis)

Fig. 6

Beta diversity ordination plots by Fisheries Management Area (FMA) based on Bray–Curtis and Jaccard indices: A and B include all samples (9 locations, 90 samples); C and D include one location in each FMA (Seribu Islands in FMA 712, Lombok in FMA 713, Wakatobi in FMA 714, and Raja Ampat in FMA 715) with an equal number of samples (four) per location

Assessment of eukaryotic taxa of interest using eDNA and a COI barcode

This section describes a snapshot of the distribution of three phyla (Chordata, Mollusca and Echinodermata) based on read abundance across the sampling locations (Fig. 7). Taxa of special interest were included in the dataset for these phyla (Fig. 8), including economically valuable and/or protected bony fish, sharks, and marine mammals (phylum Chordata); giant clams (phylum Mollusca); and echinoderms.

Fig. 7

The relative abundance of orders within the Phylum Chordata (A), Phylum Mollusca (B) and Phylum Echinodermata (C) based on read numbers from nine sampling locations

Fig. 8

The distribution and taxonomic composition at species level within three phyla across nine sampling locations: A Phylum Chordata, B Phylum Mollusca and C Phylum Echinodermata. The pie charts were constructed based on the read abundance dataset

Phylum Chordata

The database resulting from this study included 329 ASVs and 22,923 sequence reads, covering five classes, 21 orders, 33 families, 45 genera, and 59 species. The distribution of phylum Chordata at the Order level is shown in Fig. 7A. Within the Phylum Chordata, this study recovered 73 species from 25 orders. There were high read abundances of Pristiformes (sawfishes) at several sites, including Gorontalo and Selayar. Perciformes (ray-finned fish) were observed at all sites with the highest number of reads from Lombok and Wakatobi. Some economically valuable fish genera were detected, including groupers (Plectropomus and Epinephelus), snappers (Lutjanus), and fusiliers (Caesio) (Fig. 7A). For the Class Actinopterygii, relatively common species of interest included Plectropomus areolatus (Seribu Islands, East Java, North Moluccas and Misool); Epinephelus coeruleopunctatus (Seribu Islands, Gorontalo and Misool); Caesio cuning (Lombok and West Papua) and E. areolatus (Raja Ampat). Another order with high read numbers, the Enterogona (tunicates), was found at all sites but was most abundant at Misool. The dataset also captured a single ASV of a rare species, Orcaella brevirostris (Irrawaddy Dolphin) from class Mammalia, with a total of seven reads from a single sample, LBK_190 (Lombok, West Nusa Tenggara).

Phylum Mollusca

With 124 ASVs and 17,658 reads, taxa identified as belonging to the phylum Mollusca belonged to 17 orders and one unidentified taxon, and represented four classes (Bivalvia, Gastropoda, Cephalopoda, Polyplacophora) (Fig. 7B). A high percentage of molluscan taxa were unidentified at four sites (North Moluccas, Seribu Islands, Gorontalo and Misool). The location with the highest number of reads for the Order Veneroida (bivalves) was Lombok, followed by Raja Ampat and East Java. Read numbers of Nudibranchia (sea slugs) were highest in Raja Ampat and the Seribu Islands, and Teuthida (squid) reads were highest in Wakatobi and the Seribu Islands. Littorinimorpha (gastropods) were also common at most sampling sites.

The tridacnid clams of the subfamily Tridacninae (Class: Bivalvia, Order: Veneroida, Family Cardiidae), previously known as the Family Tridacnidae, are a molluscan taxon of special interest. They are highly sought-after due to their high economic value in, and increasing demand from, the global aquarium trade as well as for human consumption, curios and other uses (Reksodihardjo-Lilley and Lilley 2007; Yusuf and Moore 2020). Due to widespread threats and predicted future population declines, all members of this group have been listed in CITES Appendix II to provide protection from international trade (Wells 1997), although CITES listing does not protect these clams from in-country exploitation. With 26 ASVs and 17,002 reads, only two species in the genus Tridacna were recorded: Tridacna maxima (25 unique ASVs) and T. crocea (1 unique ASV) (Fig. 8). Tridacnid clams were identified from three locations with three reads from East Java (T. maxima); 5,373 reads from Lombok (14 T. crocea and 5,359 T. maxima); and 11,083 reads from Raja Ampat (T. maxima) (Fig. 8).

Phylum Echinodermata

There were 2,760 reads assigned to the Phylum Echinodermata. These reads comprised 130 ASVs belonging to six classes, ten orders, 20 families, 22 genera, and 21 species, and a group of unidentified taxa (Figs. 7C, 8C). The abundance and taxonomic composition of echinoderm reads obtained varied considerably between the nine locations. Five out of nine sites had a high proportion of unidentified taxa. The most commonly identified order was Spatangoida (heart urchins) with the highest number of reads from the Seribu Islands followed by Gorontalo, East Java, and Selayar. Among the 10 Orders, only one (Euryalida) was detected in Raja Ampat, represented by a single species (Astrochele pacifica). The most dominant echinoderm group in the Wakatobi was the order Comatulida (crinoids), also identified from five other locations (Seribu Islands, Lombok, Misool, North Moluccas, and East Java). The order Valvatida (starfish) was recorded in Misool and Gorontalo. An order of special interest is the Aspidochirotida (sea cucumbers), detected mostly in Misool, with lower read numbers from four other sites (Gorontalo, North Moluccas, Seribu Islands, and Selayar).

Discussion

Illuminating marine biodiversity across Indonesian coral reefs and fisheries regions

This study demonstrates the utility of eDNA metabarcoding to illuminate marine biodiversity across coral reef ecosystems in Indonesia. We further highlighted its potential for improving surveillance and conservation of ecologically and economically important taxa, including several taxa supporting coral reef fisheries in Indonesia. eDNA metabarcoding can provide an important complement to traditional approaches (i.e., visual census and morphology-based identification) and generate data on marine biodiversity that previously could not be detected using traditional methods (Taberlet et al. 2012; Deiner et al. 2015; Thomsen and Willerslev 2015; Shaw et al. 2016; Evans and Lamberti 2018; Evans et al. 2017; Porter and Hajibabaei 2018; Ruppert et al. 2019). Nevertheless, the application of this method has been relatively limited in the Coral Triangle (a marine biodiversity hotspot comprising much of Indonesia, Philippines and Papua New Guinea), hindering evaluation of the efficacy of the method in areas with a high concentration of marine biodiversity. As one of the few studies done in the Coral Triangle using an eDNA metabarcoding method, this study shows promising results.

The high abundance of unidentified taxa suggests that many of the taxa discovered in this research are not yet recorded in global databases. However, given the high rate of new taxa being described from the area, high marine diversity per square meter and the extent of relatively unexplored regions (Cahyani D., unpublished data), we also argue that some of the unidentified ASVs may be undescribed marine species as yet unknown to science, potentially leading to an enhanced evaluation of marine biodiversity based on these data after future species identifications have been made. As an example, only 28% of known molluscan species from Raja Ampat (McKenna et al. 2002) could be assigned based on the BOLD database, while much of our eDNA data cannot be assigned to taxa in any of the existing public databases. This suggests that much of the rich biodiversity in Indonesia remains to be explored and highlights a poor representation of taxa from the Coral Triangle and neighbouring areas in global COI databases. Further, the high abundance of unidentified taxa in this research highlights the gap that needs to be filled in global metazoan databases to support robust taxonomic assignment. Nevertheless, this research has provided important baseline sequence data useful for comparative study with species currently present in these databases and could also be important for future biodiversity and conservation research.

Coral reefs are estimated to harbour approximately 950,000 multicellular species (±40% of estimated global biodiversity), of which only around 10% have been identified (Fisher et al. 2015). These figures reflect the vast gap in our existing knowledge on the biodiversity of coral reefs and highlight how much research remains to be done (Hoeksema 2017). Therefore, research focused on filling the information gaps should be done in areas with enormous biodiversity that are as yet severely underexplored, like the Coral Triangle region, as well as in habitats under high anthropogenic pressure such as many areas in western Indonesia. Although efforts are underway, biodiversity records in the region suffer from, inter alia, methods which are ineffective when used to build comprehensive inventory catalogues in rapid and consistent ways (e.g., biodiversity monitoring mainly uses visual surveys by observers with varying levels of expertise) (Hoeksema 2017). In this context, our study has taken important steps in introducing a new method of biodiversity inventory and has highlighted the magnitude of the gap between what we know and do not know regarding marine biodiversity in the Coral Triangle. Although still relatively new and arguably not yet developed to its full potential, the application of eDNA metabarcoding can speed up the auditing process for species identification by providing a general picture regarding the magnitude of biodiversity. Furthermore, our research has successfully identified several taxa living in interstitial spaces in the coral reef environment (e.g. amphipods and isopods) and taxa that are frequently observed in coral rubble (e.g. bryozoans, sponges, polychaetes and tunicates) (Meesters et al. 1991). These findings show that large components of marine biodiversity dwelling in areas that are hidden and, in many cases, inaccessible using traditional monitoring tools, can potentially be identified using eDNA metabarcoding.
The hidden nature of these habitats, and our limitations in observing the life within them, have resulted in poor investigation of the species associated with such habitats; these include species inhabiting reef caves and crevices such as sponges (Reitner et al. 1996; Vicente et al. 2016), some scleractinian corals (Hoeksema 2012) and zoantharians (Irei et al. 2015; Kise and Reimer 2016). Our findings have further highlighted the magnitude of marine biodiversity across regions in Indonesia and the need to undertake further taxonomic studies that aim to provide reference taxonomic information complete with DNA data. The results also further emphasize the importance of eDNA in revealing hidden marine biodiversity, including finding new organisms (Maggioni et al. 2017; Montano et al. 2015; Tornabene et al. 2016), and adding new records to public databases (Valdez-Moreno et al. 2019).

Monitoring of taxonomic composition across space and time in marine biodiversity hotspots like Indonesia remains a significant challenge, especially for less studied, small, rare, and uncommon species (e.g., Madduppa et al. 2020). The alpha and beta diversity analyses in our study highlight the significant differences in species richness and biotic community composition between locations. We also show that the percentage of sequenced species among families is highly variable, which is a barrier to robust estimation of species richness. Interestingly, two locations with distinct environmental conditions had a similarly high percentage of unidentified taxa. Raja Ampat is a relatively remote area with pristine reefs in western Papua while the Seribu Islands north of Jakarta are exposed to high anthropogenic pressures. Surprisingly, even though Raja Ampat is well known as an area with particularly high marine biodiversity (Allen 2003; Hukom et al. 2018), our results show that overall the sites in the area had relatively low alpha diversity. This could be due to various ecological reasons (Gray 2001; Stanley 2007), but in the context of our study this could also be due to technical challenges during sampling and laboratory work. Nonetheless, Misool, one of the locations within Raja Ampat, had the highest alpha diversity. The sites in the Seribu Islands, on the other hand, had relatively high alpha diversity compared to the other sites in this study. Our findings contrast with the general idea that areas with historically polluted environmental conditions, like the Seribu Islands, necessarily have relatively low biodiversity. Given that this is the first study using an eDNA approach done in these areas, the results suggest that eDNA could potentially reveal hidden and cryptic diversity unobserved by traditional monitoring methods.
Therefore, the assessment of biodiversity should not be totally reliant on the environmental state of a given site, as even apparently degraded sites could house unique patterns of biodiversity. The combination of voucher-based DNA monitoring, eDNA, and conventional visual survey methods is therefore highly advisable to aid in understanding the magnitude and patterns of biodiversity.
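
To make the alpha and beta diversity comparison above concrete, the following is a minimal sketch of how such metrics can be computed from per-site ASV read counts. The metrics actually used in this study are not specified in this section, so ASV richness, the Shannon index and Bray-Curtis dissimilarity are assumed here as common choices; the site names and read counts are hypothetical and are not data from this study.

```python
import math


def richness(counts):
    """Alpha diversity as the number of ASVs with at least one read."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over ASVs with reads."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two sites (0 = identical, 1 = no shared reads)."""
    if len(a) != len(b):
        raise ValueError("both sites must be described over the same ASVs")
    denominator = sum(a) + sum(b)
    if denominator == 0:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / denominator


# Hypothetical read counts for four ASVs at two sites (illustration only).
site_a = [10, 0, 5, 5]
site_b = [0, 10, 5, 5]

print("richness A:", richness(site_a))
print("Shannon A:", round(shannon(site_a), 4))
print("Bray-Curtis A vs B:", bray_curtis(site_a, site_b))
```

In this toy example both sites have the same richness and Shannon index but differ in composition, which is the situation where beta diversity adds information that alpha diversity alone does not capture.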

The relatively comprehensive data from ecological surveillance and visual census of the coral reefs of the Seribu Islands (e.g., Rachello-Dolmen and Cleary 2007; Madduppa et al. 2013; Cleary et al. 2016; Rachman 2020) show that this archipelago suffers from a variety of anthropogenic stresses, including large numbers of introduced and invasive species induced by shipping activities (Huhn et al. 2020; Subhan et al. 2020; Anzani et al. 2019; Utami et al. 2018; Dias et al. 2018). In addition, many of the previous studies in the area are poorly documented in terms of accessions deposited in public DNA sequence databases (e.g., Polónia et al. 2014; Patantis et al. 2013; Madduppa et al. 2017, 2019). In contrast, habitats in Raja Ampat comprise a high diversity of reef formations, both typical and atypical, with varying coastal and bathymetric profiles, and include unusual types of coral habitat and coral communities (McKenna et al. 2002; Erdmann and Pet 2002). Such features are particularly noticeable in the area around Misool Island, where many of the reefs display none of the classic or predictable coral community zoning patterns. In particular, southeastern Misool has a vast variety of unexplored habitat types (Turak and Souhoka 2003) that could be refuges for unexplored and unique biodiversity, which might explain the high percentage of unidentified taxa in this region.

eDNA metabarcoding with standard and taxon-specific COI mini barcodes (up to ~400 bp) has been developed to examine degraded DNA materials present in aquatic environments. In this analysis, we successfully identified a variety of taxa from three major phyla (Chordata, Mollusca, and Echinodermata) using short mitochondrial DNA COI sequences. We found that these phyla were not equally distributed over the areas sampled. With respect to sampling methodology, the relatively low number of taxa detected from the water samples might be due to the small amount of DNA material released into the environment by the organisms present, rapid degradation processes following DNA shedding, or local current movements that consistently transport the DNA in the water column.

Unsurprisingly, Raja Ampat, a globally renowned marine biodiversity hotspot (Cox and Bright 2017), had a higher percentage of reads from Chordata than the other locations. The reputation of this area is largely based on a number of studies that have discussed the exceptional richness of chordate taxa, in particular fish (Allen 2003), sharks (Beer 2015; Jaiteh et al. 2017; Sembiring et al. 2015) and marine mammals (Agostini et al. 2012; Ender et al. 2014). However, even though Raja Ampat had the highest chordate species richness recorded in this study, we only detected ~20 chordate species, roughly 50-fold fewer than the ~1,000 species of fish reported from or thought to live in this area (Fenner 2002; McKenna et al. 2002). The dominant fish families of Raja Ampat are usually well represented on coral reefs across the Indo-Western and Central Pacific (Allen 2003). For example, Allen (2003) reported that the five most speciose families in the region are the gobies (Gobiidae; 137 species), damselfishes (Pomacentridae; 114 species), wrasses (Labridae; 109 species), cardinalfishes (Apogonidae; 73 species), and groupers (Serranidae; 58 species). In Raja Ampat, and in the Indonesian region in general, the diversity of sharks and rays (Class Chondrichthyes) can be considered very high, with a total of 207 chondrichthyan species, of which 109 are classified into 34 shark families (Fahmi 2010; White et al. 2006). In our study we found 139 shark and ray ASVs with 23 identified species, 60% of which were detected from Raja Ampat. Sharks and rays have been exploited for many years and, due to their global population status, many have been placed in one of the at-risk categories in the IUCN Red List (Suppl. 3). Another interesting finding was a high abundance of Pristiformes reads at several sites including Gorontalo, Selayar, Lombok and Wakatobi. These reads comprise the first record of Pristiformes in the Lombok and Wakatobi areas.

With respect to the Mollusca, we found a high percentage of unidentified taxa at four sites: North Moluccas, Seribu Islands, Gorontalo and Misool. The highest read numbers for the Order Veneroida (bivalve molluscs) were from Lombok, followed by Raja Ampat and East Java, for Nudibranchia (sea slugs) from Raja Ampat and the Seribu Islands and for Teuthida (squids) from Wakatobi and the Seribu Islands. The giant or tridacnid clams are another important taxonomic group identified in this study. Eight species of the former family Tridacnidae (now family Cardiidae, sub-family Tridacninae) are known to live in Indonesian waters (Neo and Low 2018; Borsa et al. 2015). Two tridacnid clam species were recorded at one or more locations in this study: the small giant clam Tridacna maxima (27 unique ASVs) and the boring clam T. crocea (1 unique ASV). T. maxima has the widest distribution among the Tridacninae, and is found throughout the Indo-Pacific region from Africa to islands in the eastern Pacific, while T. crocea (also known as the crocus clam, crocea clam or saffron-coloured clam) has a more restricted range in the central Indo-Pacific region (Bin Othman et al. 2010). Our results are in line with other studies reporting T. maxima and T. crocea as the most commonly observed tridacnid clam species in Indonesia (DeBoer et al. 2008; Kochzius and Nuryanto 2008; Nuryanto and Kochzius 2009; Hernawan 2010; Arbi 2017; Yusuf and Moore 2020). The presence of T. crocea may be attributed to the behaviour that earned this species the name of boring clam, as its shell is typically embedded into hard substrate, especially massive coral lifeforms. This boring habit makes it difficult for fishermen to harvest the species, while its relatively small size also makes it a less attractive target compared to other tridacnid clam species.
Overexploitation of tridacnid species is reported to occur due to the shell and curio industry (Pasaribu 1988; Wells 1997; Yusuf and Moore 2020), artisanal and subsistence fisheries, cultural practices, use (including of T. crocea) as feed for high-value commodities such as lobsters (Yusuf and Moore 2020), and the marine ornamental trade (Reksodihardjo-Lilley and Lilley 2007). Thus far, however, there do not seem to be any records of industrial-scale exploitation of tridacnid clams in Indonesia.

eDNA metabarcoding has proven useful as a tool for species identification in a variety of marine invertebrate taxa, including echinoderms (e.g., Günther et al. 2018). This study detected 11 orders, 21 families, 24 genera, and 23 species belonging to the Phylum Echinodermata. At seven out of nine sites there were unidentified echinoderm taxa. Interestingly, we only identified Euryalida (basket stars) from Raja Ampat. The most frequently identified orders were the Spatangoida (heart urchins) and Ophiurida (brittle stars), while the order Valvatida (starfish) was only identified from Misool and Gorontalo. One order of particular interest was the Aspidochirotida (sea cucumbers); the most abundant reads were from Misool, followed by four other sites (Gorontalo, North Moluccas, Seribu Islands, and Selayar). The Spatangoida (heart urchins) were mostly recorded from the Seribu Islands, Gorontalo, East Java, and Selayar. In contrast, the Comatulida (crinoids) reads were mostly from Wakatobi, with low read numbers from five other locations (Seribu Islands, Lombok, Misool, North Moluccas and East Java). The family-level distribution of echinoderms indicates that echinoderm community composition varied considerably between the nine locations. However, many reads were unidentified, highlighting the need for research to improve reference databases for the phylum Echinodermata.

Implications for conservation and fisheries management

Biodiversity conservation and fisheries management both depend to a varying but considerable extent on effective monitoring of the species present, in particular economically important and endangered taxa, across patchy habitats in large areas (Noss 1990; Regan et al. 2008). Many of the species detected by eDNA metabarcoding in this study are economically important and exploited by small-scale fishermen. Across the Indonesian Archipelago, the live reef food-fish trade (LRFFT) tends to concentrate on a narrow range of high-value reef fish species (Khasanah et al. 2019a). For example, fishes commonly targeted for sale in the LRFFT include coral trout (Plectropomus leopardus), predominantly in the juvenile or female phase size range (Khasanah et al. 2019a, b), and (mostly) juvenile Napoleon wrasse (Cheilinus undulatus) (Yusuf and Moore 2020). Whenever LRFFT holding facilities in Raja Ampat and Misool were encountered and inspected, they invariably contained a moderate number of these species.

With respect to the Mammalia, a single ASV of the marine mammal Orcaella brevirostris (Irrawaddy dolphin) was detected by eDNA metabarcoding in Lombok, a new species record for Lombok and indeed for West Nusa Tenggara. While often called the Irrawaddy river dolphin, O. brevirostris is not a true river dolphin but rather a euryhaline oceanic dolphin that is often found in brackish water, occurring in discontinuous subpopulations along sea coasts, including estuarine environments and rivers (Minton et al. 2017). The global conservation status of this species based on the latest IUCN Red List assessment is Endangered (EN) (Minton et al. 2017), although five subpopulations are classified as Critically Endangered (CR), including the Mahakam River population in East Kalimantan, Indonesia (Jefferson et al. 2008). Entanglement in fishing nets and habitat degradation are given as the main threats to Irrawaddy dolphin populations. No range-wide survey has been conducted for this endangered species, and direct observation and identification of these organisms is always difficult and costly (Bogich et al. 2008). However, in Indonesia Irrawaddy dolphins are frequently observed in the freshwater reaches of the Mahakam River (Kreb 1999; Oktaviani et al. 2007) and in Banten Bay (Khalifa et al. 2014). Conservation efforts are being made at international and national levels to alleviate threats to this species; however, the Irrawaddy dolphin's proximity to developing communities makes conservation difficult in practice (Beasley et al. 2013). Nonetheless, a conservation program, entitled Conservation Foundation for the Protection of Rare Aquatic Species of Indonesia, is focused on protecting the Irrawaddy dolphin population and their habitat in the Mahakam River (http://www.ykrasi.org/). The program not only educates and surveys the public, but also monitors the dolphins and their habitat. A prime example of this is the establishment of patrols in several villages.
The detection of Irrawaddy dolphin sequences from Lombok calls for verification and eventually targeted conservation measures if the record is confirmed.

The sharks and rays identified in this study can help managers in prioritizing areas and species within regions and in particular within fisheries management areas (FMAs). Among the taxa detected, two species are categorized as critically endangered (CR) in the IUCN Red List (IUCN 2021), three as endangered (EN), four as vulnerable (VU), nine as near threatened (NT), two as least concern (LC), one as data deficient (DD) and one species has not yet been evaluated (NE) (Suppl. 3). Sharks are cartilaginous fish of the Class Chondrichthyes and are mostly top predators in the ocean food chain (Fahmi and Dharmadi 2013). Sharks play an important role in maintaining the stability of the food chain through their predation on other organisms (Roff et al. 2016; Heupel et al. 2014). Therefore, it is important to manage shark populations sustainably, especially in areas such as Lombok which are known to be hotspots for shark fisheries (Sembiring et al. 2015).

The results of this study indicate that the eDNA approach used is not (at least as yet) capable of reliably identifying all the species actually present in a given area. However, this method can provide insights into which taxonomic groups occur or predominate in specific locations. This study reinforces the contention that this approach can be very useful in evaluating marine biodiversity, despite the many caveats for implementing metabarcoding analysis for eDNA samples using COI markers (Stat et al. 2017; Thomsen and Willerslev 2015; Cristescu and Hebert 2018). However, these caveats or restrictions limit or even prohibit the use of metabarcoding approaches for characterizing whole fish assemblages by direct species classification. This study also highlighted the scarcity of genetic reference sequences for Indonesia in extant databases, as evidenced by the high percentage of unidentified taxa at all locations. Largely due to a lack of genetic sampling and data submission, this scarcity is especially critical for species-rich ecosystems such as the marine biodiversity hotspots in Indonesia (Juhel et al. 2020). One reason for the limited genetic sampling could be the difficulties involved in biodiversity sampling at many Indonesian sites by classical methods and even with the advent of organism-based DNA barcoding. A second challenge is that even when sequences are deposited in the public reference repositories they may have been collected by people who are not experts in the taxonomy of the species in question, and therefore identification of the specimens from which the sequences were extracted may not always be reliable. The lack of sequence coverage highlights the immense gap that needs to be filled in the building of robust online repositories, and the large number of species which still need to be identified and barcoded.
Therefore, all unidentified sequences from this study remain potentially useful for further research and are visualized to demonstrate their similarity with other taxa. In addition to their value as useful genetic data in the future, they can serve to raise awareness, especially in the Indonesian scientific community and government.

Contributions of eDNA metabarcoding to support the management effectiveness of Marine Protected Areas (MPAs) could include monitoring the presence of targeted and important species within MPAs and the surrounding region. MPAs are seen as an important tool for achieving sustainable use and avoiding over-exploitation of living marine resources (Palumbi 2001). Protected areas have been widely promoted as an important tool for biodiversity conservation and management (Brooks 2010; Venter et al. 2014). Species and habitat protection support the achievement of MPA goals (Beger et al. 2003; Williams et al. 2009; Hart 2003; Péron et al. 2013), including fisheries management (e.g., enhanced productivity, species diversity, and spillover effect) (Russ and Alcala 2010; Hamilton et al. 2011; Edgar et al. 2014; Smith and Garcia 2014), and the provision of recreational and educational opportunities (Ballantine 2014). Consequently, MPAs are highly recommended as a management tool for marine fisheries in Indonesia (Mous et al. 2005). It is suggested that the spatial distribution of MPAs should suit the dispersal capabilities of the species to be protected to ensure connectivity between MPAs (Palumbi 2003). Two decades ago, Indonesia had established 51 MPAs with coral reefs covering an area of about 58,000 km² (Spalding et al. 2001), representing around 1% of the Indonesian Archipelago's total marine area. Since then, many more MPAs have been established under four MPA categories: "Suaka Alam Perairan" (Marine Wildlife Reserve), "Taman Nasional Perairan" (Marine National Park), "Taman Wisata Perairan" (Marine Recreational Park) and "Suaka Perikanan" (Fisheries Refuge) (Wiadnya et al. 2011). Monitoring these vast areas is clearly a major challenge. Using eDNA protocols to effectively inform management regarding the presence of certain important species could be an efficient way to support the achievement of MPA objectives.
While financial support for genetic research in Indonesia is an important issue, it is not the only aspect where resources need to be increased; human resources are also crucial. Environmental DNA (eDNA) metabarcoding surveys have proven useful for many ecosystems and can potentially provide an efficient and cost-effective complement to traditional ecosystem surveys, such as visual underwater censuses. This could be beneficial for marine conservation managers and fisheries officers to monitor the areas for which they are responsible. However, the managers need to understand the principles, and relevant people within (or collaborating with) these agencies need to be trained in eDNA sampling protocols so that they can collect seawater samples from their area. These could then be used for metabarcoding a wide range of species, depending on the taxa of interest within their locations, using general and/or specific primers with high-throughput sequencing.

Conclusion

eDNA metabarcoding is a sensitive, efficient technique for studying marine organisms with large-scale spatial distribution patterns. The use of eDNA methods can complement traditional methods. Increased availability of reference sequences, especially from similar geographic areas, will increase eDNA identification scope and confidence. eDNA metabarcoding could potentially be an excellent tool for species identification, habitat tracking, biodiversity surveys and support for environmental conservation at large spatial extents and in remote areas.