Introduction

The common ancestor of fish and mammals dates back to 440 million years ago (MYA), with the origin of vertebrates around the end of the Ordovician period. Currently, more than 45% (about 30,000 species) of vertebrates are teleosts (Ravi and Venkatesh 2018; Sacerdot et al. 2018), among which zebrafish and medaka have played important roles in our understanding of vertebrate genome function and evolution. Furthermore, comparison with predicted teleost genes has helped to characterize human gene structure, leading to insights into disease mechanisms (Howe et al. 2013; Bizuayehu and Babiak 2014; Daya et al. 2020). Humans and teleosts share many developmental pathways, organ systems, and physiological mechanisms (Crollius and Weissenbach 2005; Gorman et al. 2012; Howe et al. 2013). The respective advantages of zebrafish, medaka, tetraodon, platyfish, rainbow trout, Atlantic salmon, carp, trout, tilapia, and takifugu have been exploited through both classical genetics and advances in next generation sequencing (NGS) technologies (Volff 2005). Many species have been used as models to elucidate different biological functions such as anatomy, physiology, or cell biology (Howe et al. 2013; Bizuayehu and Babiak 2014; Daya et al. 2020). The most famous tropical teleost is Danio rerio (zebrafish), which has many advantages for genetic analysis: a short generation time (about 3 months), large egg clutches all year round, easy maintenance, and external development of a transparent embryo in combination with large-scale mutagenesis screening (Howe et al. 2013; Daya et al. 2020). Teleosts offer a wide variety of opportunities for comparative genomics to resolve many fundamental biological research questions. Tilapia, salmon, barramundi, and catfish are alternative models to study aquaculture for agricultural economic purposes (Pomeroy et al. 2008; Ansah et al. 2014). 
Therefore, additional studies of teleost model species, broadening applications to environmental genomics or aquaculture, are necessary to integrate with more advanced research disciplines. One complex trait of vertebrates is their patterns of behavioral aggression, as found in betta fish, for example; some species of five families of teleosts show forms of aggressive behavior (Villars 1983; Pitcher 2012; Watanabe and Kuczaj 2013). The emergence of cross-disciplinary platforms, including disruptive technology, allows the use of betta as a sustainable bioresource under the fourth industrial revolution (World Economic Forum 2018).

Fighting fish or betta (Betta spp.), which are native to Southeast Asia, have been proposed as a focus for the study of fighting and ornamental attributes. Out of 91 species, the most famous is B. splendens (the splendid battler), of which the male exhibits a pugnacious nature and beauty (U.S. Fish and Wildlife Service 2019). Betta can bite any targets on the opponent fish such as the swimmers and tail fins (for stability and kicking ability). Historically, betta fighting is a native sport of Thailand (Na-Ayudhya 2001; Monvises et al. 2009; Kowasupat 2012; Chailertrit et al. 2014; Prakhongcheep et al. 2018; Ponjarat et al. 2019; Ahmad et al. 2020; Singchat et al. 2020), and male betta are selected for paired-staged fights. Combating males are bred with large, strong bodies and hard scales (hard targets) but smaller fins (flimsy targets) as protection against bites from the opponent. Long-term selection as crossbreeding between different lines or species has honed their aggressive behavior (Witte and Schmidt 1992). This process has occurred for more than six centuries, resulting in short-fin varieties (Ramos and Gonçalves 2019). One line or species is selected for staged fights (“fighters”) and another line or several species for wild-types. Fighter males are more aggressive than wild-type males. Aggressive behaviors are correlated across male and female fighter siblings, suggesting possible genetic and physiological mechanisms for male and female aggression in betta lineage (Ramos and Gonçalves 2019). By contrast, these widespread activities of artificial selection have resulted in outbreeding depression for hybrid betta between different species/lines. Thus, the large biodiversity of betta is currently being lost more rapidly than at any time in the past several million years, with the invasion of alien species or hybrids introduced into the wild, leading to genetic admixture (Saint-Pé et al. 2018; Beer et al. 2019). 
This is a very serious problem in the context of conservation biology and genetics and needs to be resolved as a matter of urgency. Furthermore, the appreciation of body features such as color pattern, scale iridescence, body shape, and fin size has resulted in a flourishing global market for various bettas (U.S. Fish and Wildlife Service 2019). Changing the breeding emphasis from fighting to ornamental is preferable not only for the ethical state of the animal but also as a commercial strategy to generate higher profit margins. Breeding of betta males as ornamental fish for local sale and export is also becoming increasingly cost-effective. According to the Department of Fisheries (2010–2019), bettas are ranked among the top two ornamentals in terms of number of fish and revenue in Thailand. The latest official value (2019) for bettas is about USD $5.55 million or THB ฿172 million (Department of Fisheries (DOF) 2019). Ornamental fighting fish with novel colors, patterns, and fins are regularly exported and are in high demand. Extensive local propagation can increase prices by up to 3–5-fold. Surprisingly, these morphological varieties of betta are derived from local breeding and selection knowledge based on classical Mendelian inheritance. No scientific bioresource information is available, despite constant demand for new market features.

NGS is growing rapidly and becoming less expensive (Debnath et al. 2010). Fully sequenced genomes are publicly available for 350 vertebrate species (National Center for Biotechnology Information (NCBI) 2020) and 45 species of teleosts. These are mostly used to address fundamental questions of gene function or for economically important aquaculture (Shen and Yue 2019). The rapidly emerging field of comparative genomics has yielded dramatic results, and has become feasible given the availability of a number of completely sequenced genomes, allowing global views on vertebrate genome function and evolution. Comparison of human genes with those from teleost genomes in a genomic landscape could also help assign novel functions to unannotated genes found in zebrafish or medaka (Kasahara et al. 2007; Howe et al. 2013). Consequently, this research could generate valuable insights and improvements to structural and functional genomics of betta fish and their evolutionary, biochemical, genetic, metabolic, and physiological pathways. Study of many betta genomes is essential to better understand the patterns of genome evolution under the prediction of cryptic diversity, pigmentation, and behavioral aggression. The suborder Anabantoidei is a key group within Perciformes. Genome drafts from bettas will provide insights into ancestral Perciformes and teleost genomes. Among Anabantoidei, only the genomes of Siamese fighting fish (B. splendens, GenBank accession number: QFDP00000000), kissing gourami (Helostoma temminkii, OMLM00000000), and climbing perch (Anabas testudineus, OOHO00000000) have been sequenced (Fan et al. 2018; Jakobsen et al. 2018; Zhang et al. 2019). Betta genomes provide the best hope for elucidating the gene and genomic properties of behavior and pigmentation. An evolutionary model has already been used to elucidate the genome size and limited protein sequences. 
Betta also present several interesting fundamental biological questions that can be approached from a genomic perspective. Here, we propose a strategic plan as part of our betta bioresource project, which will involve sharing research on betta through original scientific contributions. Genomic resources such as accurate assembly and annotation of bettas will provide valuable information for further studies, demonstrating the utility of comparative genomic analyses of various species to elucidate important facets of genome evolution. Perspective scenarios are presented in order to open up many aspects of betta bioresources, biodiversity, and eco-management.

The significance of betta bioresource as an emerging vertebrate model

Southeast Asia, with its mainland and numerous islands and its hot and humid climate, harbors a huge variety of biodiversity and bioresources. Betta (Osphronemidae, Anabantoidei), with 91 nominal species, is the largest group among labyrinth fishes that originated in the Middle Eocene or the late Cretaceous period (approximately 40–90 MYA) in freshwater (Hedges and Kumar 2009). Bettas play a major role in warm water ecosystems both as predators and prey (da Silva et al. 2014). In many vertebrates, female-only parental care dominates but male-only parental care is a dominant feature in teleosts (Blumer 1979, 1982; Gross and Sargent 1985). A comparative study of parental care patterns in teleosts indicated that evolution from a ‘guarding’ form of care, such as substrate spawning or bubble nesting, to ‘mouthbrooding’ is generally assumed to be linked with shifts in reproductive life-history as predicted by Shine’s ‘safe harbor’ hypothesis (Shine 1978). The Darwinian force of natural selection most likely favors large egg numbers when the egg stage can be considered a safe harbor due to investment of parental care. In the betta lineage, males take spawned eggs into their mouths and incubate them for up to 4 weeks in 70% of species (Schmidt 1996). The remaining 30% exhibit bubble nesting as the dominant reproductive style and plesiomorphic condition among osphronemids (Britz and Cambray 2001). The eggs and larvae are guarded and defended by the male (Fig. 1), in agreement with molecular phylogenetic placement (Rüber et al. 2004). Moreover, bubble nesting and mouthbrooding bettas differ from each other in several phenotypic and behavioral aspects including head shape, spawning embrace, egg surface structure, and degree of sexual dimorphism (Vierke 1991; Schmidt 1996). It is believed that mouthbrooding evolved independently several times within the betta lineage (Rüber et al. 2004), but this scenario has not yet been addressed by new disruptive omic technologies such as genomics and transcriptomics.

Fig. 1 Phylogeny of the fighting fish genus Betta showing the 12 nominal species in Thailand. Bubble nest builders (Betta splendens, B. smaragdina, B. imbellis, B. siamorientalis, and B. mahachaiensis) and mouth brooders (B. prima, B. simplex, B. pi, B. pallida, B. apollon, B. ferox, and B. pugnax). Phylogeny was partially derived from Panijpan et al. (2014)

There are twelve nominal species of betta within Thailand. Five species are bubble nest builders (B. splendens, B. smaragdina, B. imbellis, B. siamorientalis, and B. mahachaiensis), and seven are mouth brooders (B. prima, B. simplex, B. pi, B. pallida, B. apollon, B. ferox, and B. pugnax) (Fig. 1). However, new species are discovered on average every 5–10 years and most exhibit highly similar morphologic characters (Monvises et al. 2009). This suggests that species radiation with cryptic diversity occurred in the betta lineage. Different species, such as B. splendens and B. imbellis, occupy different geographic regions, consistent with allopatric speciation models like those proposed for darter fish (Carlson and Wainwright 2010; Near et al. 2011). In Thailand, B. mahachaiensis is only found in a narrow and specific area of Samut Sakhon Province (13.550333 N, 100.273968 E) and Samut Prakan Province (13.601040 N, 100.605302 E), in tidal brackish waters with thick vegetation, subject to a daily influx of saltwater (Fig. 2) (Kowasupat 2012). Remarkably, no other wild betta species seem able to survive in such an inhospitable environment. This distribution fits a hypothetical model of parapatric speciation, as also found in mormyrid fish (Pollimyrus castelnaui) (Kramer et al. 2003). However, due to these specific water chemistry requirements and its restricted occurrence, B. mahachaiensis has been endangered by recent anthropogenic activities such as urbanization and industrial development (Griffin 2005; Monvises et al. 2009). Most wild bettas are currently listed as threatened mainly due to urbanization, industrialization, tourism, and agriculture (Andrews 1990; Monvises et al. 2009). The ‘twin emergencies’ affecting humanity are loss of biodiversity and climate change, leading to the sixth global mass extinction of possibly up to one million species by the end of this century. 
Biodiversity and genetic wealth are being eroded due to a variety of causes; most importantly, loss of habitat due to irrigation projects, industry, agriculture and aquaculture. On the other hand, aquarium betta have a high economic value in many countries (Tlusty 2002). B. splendens is the most common species used in breeding programs with various characteristics such as big ear (dumbo betta), long fin, halfmoon betta, double size (giant betta), butterfly pattern, crowntail betta, double tail betta, delta tail betta, albino betta (white solid color), bi-color betta, and various colors that increase their market value (Fig. 3). Betta are popular as freshwater fish among aquarium hobbyists for their beauty and fighting abilities. The price of male individuals can be as much as four times as high as that of females. Development of reliable and effective knowledge to control pigmentation, shape, aggression, and gonadal sex of betta, to produce all-male populations in controlled conditions would be highly advantageous for breeders.

Fig. 2 Hypothetical model of sympatric speciation, parapatric speciation, and allopatric speciation in Betta spp. in Thailand

Fig. 3 List of characteristics that increase the market demand values in bettas. Big ear (dumbo betta), long fin, short fin, halfmoon betta, crowntail betta, double tail betta, delta tail betta, double size (giant betta), marble pattern, dragon pattern, butterfly pattern, albino betta (white solid color), bi-color betta, and various colors (color figure online)

Globally, bettas were a source of trade worth more than USD $5 million or 10% of the total aquatic animal value in 2019 (DOF 2019). Given their popularity and inherent public fascination, efforts focused on betta genomics are ideally suited for modern (precise and smart) agriculture, education, and outreach focused on evolution and comparative genomics. Bettas represent important research organisms for diverse fields that include evolution and phylogenetics, functional morphology, sex determination, hybridization, physiology, developmental biology, functional trait diversity, biological diversity, aggressive behavior, and population genetics. Betta have been extensively used as a model for examining the environmental impact of various contaminants (Alyan 2007). To provide the bioresources necessary to expand our knowledge of this fascinating lineage, the National Research Council of Thailand (NRCT) and Kasetsart University (KU) are actively obtaining and organizing bioresource information through the National Betta BioResource Project (NBBRP) comprising betta biology, distribution, developmental biology, genomics, and a biobank. Betta genomes will provide useful sources of data for biological, agricultural, and future biomedical research. For further information about the project and preliminary assemblies, see http://www.nbbrp.sci.ku.ac.th.

In light of the foregoing, we propose to study: (1) monophyly of the genus Betta including a single-versus-multiple origin of mouthbrooding; (2) the state of cryptic diversity and evolutionary forces by sympatric and/or parapatric speciation in the betta lineage; (3) responsive genes or genetic interaction to parental care, behavioral aggression, pigmentation and other betta biology; and (4) preservation technology for betta as insurance against accidental loss of biodiversity this century. Future studies on this topic should also focus on basic understanding of the subject, transferring the knowledge acquired to the scientific community and public for use in betta bioresource planning for sustainable development, and establishing a national betta bioresource research center. Our recent sociological and economic programs were intended to assist and support the community in realizing the important biological functions of betta. Preliminary data from our efforts have been used in pilot activities such as a drawing competition to integrate conservation efforts for young teenagers in Thailand, as part of a campaign to improve public awareness of betta bioresources.

Betta can also act as an exciting vertebrate model to obtain various scientific outcomes for better comprehension of the mechanisms of important biological phenomena. One notable trait of betta fish is behavioral aggression, which is why we propose betta as an excellent model to investigate the genetics of social behavior. Aggression comprises a complex suite of behaviors serving a number of adaptive purposes. Fish use aggression to protect offspring, monopolize resources such as food, territory, and mates, and establish dominance hierarchies. Recent research involving comparative transcriptomics has demonstrated the suitability of B. splendens to model some aspects of complex fighting behavior, with the discovery of several differentially expressed genes linked with such behavior (Vu et al. 2020). Results suggest that reward behavior, learning and memory, aggression, anxiety, and sleep might be conserved regulatory processes between fish and mammals. As a result, betta fish as a vertebrate model can further help to elucidate the mechanisms of genetic pathways and neural circuits that control vertebrate behavior. Comparative studies of many model organisms, including betta, are necessary to determine the general principles of behavioral control. Another scientific advantage of betta as a vertebrate model is the remarkable variation in phenotypes such as color patterns and fin/tail shapes. The genetic basis of such morphological variation, both within and between species, is a major research area in evolutionary biology. Betta produce complex, elaborate color patterns, and many suitable species can be chosen for more detailed analyses to study particular aspects of color pattern evolution. We plan to collect different betta species with certain color and pattern variants, and perform genome-wide single nucleotide polymorphism (SNP) analyses as well as transcriptome analyses to better understand gene expression dynamics. 
Other novel experimental techniques, such as gene editing with CRISPR/Cas9, will allow the production of mutants to unlock basic genetic information of the major pathways involved in morphological trait variations.

Properties of betta genomes and available genomic resources

Recent next-generation sequencing has been performed to assemble the B. splendens genome using short-read (Illumina) sequencing integrated with Hi-C data (Fan et al. 2018). The assembled genome is 465.24 Mb, organized into 21 chromosome pairs, a common chromosome number in teleosts (Fontana et al. 1970; Natarajan and Subrahmanyam 1974). However, this Illumina-based genome assembly needs to be improved with high coverage long-read sequencing such as PacBio and ultra-long read sequencing, e.g., using Oxford Nanopore technologies, as a means of achieving highly accurate chromosome-scale genomes and accurate gene and repeat representations. In addition, due to the variety of bettas with genetic admixture under the influence of anthropogenic breeding selection, it is necessary to perform several reference betta de novo genome assemblies and annotations from different species of this group. The current version of the B. splendens genome must also be updated into a complete high-quality assembly. We further recommend that genomes of native species from naturally occurring populations should be sequenced and compared with the existing genome of B. splendens, which originated from a pet shop, as reported by Fan et al. (2018). Wild original localities are also very important to trace the lineage of species as opposed to pet shop and aquarium specimens. Betta have a very short genetic distance under the state of cryptic diversity (Sriwattanarothai et al. 2010; Panijpan et al. 2014). Currently, several specimens of vertebrate species from pet shops have been misidentified but still act as published reference data (Srikulnath et al. 2012, 2015). The genetic sex-determination system of bettas still remains unknown because the species do not have heteromorphic sex chromosomes (Srikulnath et al. 2011). The similarity or dissimilarity of sex chromosomes in Osphronemidae requires phylogenetic study, and may lead to the hypothesis that betta exhibit an XX/XY system (Rüber et al. 2006).

The main challenge when assembling a genome is to resolve the repetitive stretches of DNA that occur throughout coding and non-coding regions. Their specific nature may result in introducing and propagating bias in various analyses. This could lead to errors and false information when interpreting genomic organization and function. A considerable portion of genome repeats are transposable elements (TEs) that make up a total of 15.12% of betta genomes, with enrichment of long interspersed nuclear elements (LINEs) (Fan et al. 2018). We therefore recommend focusing on TEs to identify novel repeats and compare their abundance and proportion within betta lineages as well as in other groups such as teleosts and amniotes (Volff 2005). In addition to nuclear genomic sequences and mapping information, complete mitochondrial genomes (mito genomes) of betta are available for six species, namely B. splendens, B. pi, B. apollon, B. simplex, B. mahachaiensis, and B. imbellis (Song et al. 2016; Prakhongcheep et al. 2018; Ponjarat et al. 2019; Ahmad et al. 2020; Singchat et al. 2020). However, the phylomitogenomics of betta have not yet been constructed to delineate the evolutionary relationships of the betta group. Only partial nuclear and mitochondrial genes and non-coding genes have been used to reveal evolutionary diversity and new betta species (Chailertrit et al. 2014; Ponjarat et al. 2019).
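To make the reported TE proportion more concrete, the short Python sketch below converts it into an absolute amount of sequence. It assumes that the 15.12% figure applies to the 465.24 Mb B. splendens assembly of Fan et al. (2018); the function name is illustrative only and is not part of any published pipeline.

```python
# Minimal sketch: convert a repeat proportion into megabases of sequence.
# Assumption: the 15.12% TE content refers to the 465.24 Mb assembly.

GENOME_SIZE_MB = 465.24  # assembled B. splendens genome size (Fan et al. 2018)
TE_FRACTION = 0.1512     # proportion of the genome annotated as TEs


def repeat_content_mb(genome_size_mb: float, fraction: float) -> float:
    """Return the amount of sequence (in Mb) occupied by a repeat class."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    return genome_size_mb * fraction


te_mb = repeat_content_mb(GENOME_SIZE_MB, TE_FRACTION)
print(f"Estimated TE content: {te_mb:.1f} Mb of {GENOME_SIZE_MB} Mb")
```

Under this assumption, roughly 70 Mb of the assembly would consist of TEs. The same function could be reused for other betta species once their assembly sizes and repeat annotations become available.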

Strategy for betta genomes

Due to the availability of betta legacy data (Fan et al. 2018), we are currently planning to perform whole-genome sequencing (WGS) and de novo assembly of each betta genome using a combination of multiple sequencing technologies and optical mapping. We plan to develop a strategy generating high coverage short-read Illumina, long-read PacBio, and ultra long-read Oxford Nanopore sequencing data. The sequenced reads will be assembled into a hybrid genome assembly, comprising scaffolds that will subsequently be anchored to chromosomes via the chromatin conformation capture technique Hi-C and BioNano optical mapping. A combined short- and long-read sequencing approach (including PacBio) integrated with Hi-C was recently applied successfully to generate high-quality complete assemblies of different species, such as the Indian cobra (Suryamohan et al. 2020). The different sequencing technologies proposed for our betta genome project are summarized in Table 1. Apart from de novo genome assembly, we also plan to collect wild specimens of each betta species (B. splendens, B. smaragdina, B. imbellis, B. siamorientalis, B. mahachaiensis, B. prima, B. simplex, B. pi, B. pallida, B. apollon, B. ferox, and B. pugnax) to perform WGS and generate about 35× coverage from an Illumina mate-pair library. We will sequence the transcriptomes from different tissues (gonads, skin, muscle, brain) and analyze transcriptomic data for structural and functional annotation of the assembled betta genome. We will further perform repeatomic analysis to annotate the mobilome and satellitome (TEs and satellite repeats). The annotated repeat sequences will be mapped to physical chromosomes using fluorescence in situ hybridization (FISH) to elucidate their organization and facilitate the anchoring of portions of the genome assembly to chromosomes. 
Current betta genome resources coupled with cytogenetic techniques such as FISH will pave the way towards exciting new areas such as cytogenomics (genomics + cytogenetics), chromosomics, and comparative genomics (Deakin et al. 2019).
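As a rough guide to the sequencing effort implied by a target of about 35× coverage, the following Python sketch applies the standard relation coverage = (number of reads × read length) / genome size. Two inputs are assumptions made for illustration: the 465.24 Mb B. splendens assembly size is used as a proxy for every betta species, and the 150 bp read length is hypothetical rather than a value stated for this project.

```python
import math

# Minimal sketch: sequencing yield needed for a target coverage.
# Assumptions: genome size taken from the B. splendens assembly as a proxy
# for other betta species; the read length is a hypothetical value.

GENOME_SIZE_BP = 465_240_000  # 465.24 Mb (Fan et al. 2018)
TARGET_COVERAGE = 35          # about 35x, as proposed for WGS
READ_LENGTH_BP = 150          # hypothetical Illumina read length


def reads_for_coverage(genome_size_bp: int, coverage: float, read_length_bp: int) -> int:
    """Smallest number of reads N such that N * L / G >= coverage."""
    return math.ceil(coverage * genome_size_bp / read_length_bp)


total_bases = TARGET_COVERAGE * GENOME_SIZE_BP
n_reads = reads_for_coverage(GENOME_SIZE_BP, TARGET_COVERAGE, READ_LENGTH_BP)

print(f"Bases required: {total_bases / 1e9:.2f} Gb")
print(f"Reads required at {READ_LENGTH_BP} bp: {n_reads:,}")
```

With these inputs, each specimen would require about 16.3 Gb of sequence. Actual requirements will depend on the true genome size of each species and on the read length delivered by the chosen platform.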

Table 1 Multiple sequencing approaches proposed for Betta genomic analysis

In the field of molecular evolution, comparative genomics has many applications when selecting model organisms. A model is a simple, idealized system that is accessible and can be easily manipulated. Gene finding is an important application of comparative genomics to identify linkage homology or synteny, and hence reveal gene clusters and assist in the clustering of regulatory sites, allowing recognition of unknown regulatory regions in other genomes. Regulation of metabolic pathways, and adaptive properties of organisms such as sex evolution and gene silencing can also be correlated to genome sequences using comparative genomics. Here, we study the conservation of linkage homologies flanked by Perciformes. Association between bettas is of great interest when selecting candidate genes to detect economically important quantitative trait loci such as various pigmentations or behavioral aggression. In addition to promoting the aquaculture industry, investigation of these phenomena encompasses important ecological and evolutionary implications that can foster developments in genomics and broaden the range of teleosts for comparative genomics by adding phylogenetically related but ecologically divergent non-model species. Apart from betta biology, a crucial element in translational biomedical research applications involves linking molecular readouts measured in betta to information relevant to human health and disease. One key requirement involves matching gene sequences from betta to humans at a genome-wide scale. This goes beyond the automated conversion of gene symbols and involves investigating the association between multiple homologous sequences in RNA sequencing experiments. In our prospective research, major obstacles will be overcome to allow systematic gene mapping between bettas and humans. Important disparities and partial agreements will be identified between two public homology resources: HomoloGene and Ensembl (Herrero et al. 2016) in order to meet the need for standardized, comprehensive genomic mapping.
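The comparison of two homology resources described above can be illustrated with a minimal Python sketch. It assumes that each resource has already been exported as a simple mapping from a betta gene identifier to a set of human homolog identifiers; the function name and all identifiers are hypothetical placeholders, not real HomoloGene or Ensembl records or APIs.

```python
from typing import Dict, List, Set

Mapping = Dict[str, Set[str]]  # betta gene ID -> set of human homolog IDs


def compare_homology_maps(resource_a: Mapping, resource_b: Mapping) -> Dict[str, List[str]]:
    """Classify each gene by how the two resources agree on its human homologs."""
    report: Dict[str, List[str]] = {
        "agree": [], "partial": [], "conflict": [], "only_a": [], "only_b": [],
    }
    for gene in sorted(resource_a.keys() | resource_b.keys()):
        in_a, in_b = gene in resource_a, gene in resource_b
        if in_a and not in_b:
            report["only_a"].append(gene)    # mapped by resource A only
        elif in_b and not in_a:
            report["only_b"].append(gene)    # mapped by resource B only
        elif resource_a[gene] == resource_b[gene]:
            report["agree"].append(gene)     # identical homolog sets
        elif resource_a[gene] & resource_b[gene]:
            report["partial"].append(gene)   # overlapping but not identical
        else:
            report["conflict"].append(gene)  # no homolog in common
    return report


# Hypothetical placeholder identifiers, used only to exercise the function.
resource_a = {
    "betta_gene_1": {"HUMAN_A"},
    "betta_gene_2": {"HUMAN_B", "HUMAN_C"},
    "betta_gene_3": {"HUMAN_D"},
    "betta_gene_4": {"HUMAN_E"},
}
resource_b = {
    "betta_gene_1": {"HUMAN_A"},
    "betta_gene_2": {"HUMAN_B"},
    "betta_gene_3": {"HUMAN_X"},
    "betta_gene_5": {"HUMAN_F"},
}

report = compare_homology_maps(resource_a, resource_b)
for category, genes in report.items():
    print(category, genes)
```

Separating partial agreements from outright conflicts is a design choice that mirrors the distinction drawn in the text; one-to-many homologs are kept as sets so that they are not collapsed into a single gene symbol.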

Betta genomics promises exciting advances toward the important conservation goal of maximizing evolutionary potential and diversity. Many traits might be polygenic and strongly influenced by minor differences in regulatory networks, with epigenetic variation not visible in DNA sequences due to the surrounding habitat of low heritability. This critical complexity is difficult to detect by commonly used methods to identify adaptive variation. Appropriate consideration is required when planning genomic screens, and when basing management decisions on genomic data (Harrisson et al. 2014). The genomic basis of adaptation and future threats must be well understood before focusing on particular adaptive traits. Screening genome-wide variation is a sensible approach for more typical conservation scenarios. This may provide a generalized measure of evolutionary potential that accounts for the contributions of small-effect loci and cryptic variation, while also remaining robust to uncertainty about future change and required adaptive responses. The best outcomes can be achieved by investigating genomic estimates of evolutionary dynamics in a well-controlled situation.

Roadmap for research analyses and experiments

The importance of biodiversity is increasingly being recognized. The goal of bioresource management is to support sustainable development by protecting and using resources in a way that does not diminish the variety of genes and species or destroy important habitats and ecology (Niesenbaum 2019). A general lack of information highlights the need for urgent development of scientific, technical, and institutional capacities to provide basic understanding upon implementation of appropriate measures. There is a need to take one step backwards before moving forwards, given that betta bioresources cannot be restored once they are lost, especially wild betta. Establishment of betta bioresources is essential to provide cutting edge research in agriculture and future biomedical models (Fig. 4). A bioresource framework can be enhanced by increasing value-added genomic resources and developing preservation technologies such as cloning for systematic collection, preservation, and distribution of bioresources. The focus will be on the required strategic development at the highest global standards such as the Convention on Biological Diversity and the International Organization for Standardization. Further insight and better knowledge are needed to maintain wild bettas in their habitats, while keeping domesticated ones as diverse as possible to develop new varieties. We plan to collect, preserve, and provide bioresources as essential experimental materials for all research fields and agricultural development. 
The NBBRP will be upgraded with the relevant related technologies to promote education and dissemination of information and implement the following four core programs to facilitate collection, preservation, and arrangement of bioresources as: (1) core facility program by collecting and maintaining wild betta in the aquarium to improve knowledge of betta biology; (2) genome information program; (3) fundamental preservation technology program; and (4) information center program (Fig. 4).

Fig. 4 Infographic showing national betta bioresources research center with cutting edge research, agriculture, and biomedical model

Core facility program: Having surveyed the habitats and diversity of bettas in Thailand, we highlight the need for: (1) detailed information on all localities for all betta species; (2) preserved specimens for all species/subspecies/varieties in the NBBRP to identify genetic profiles using mitochondrial DNA (Song et al. 2016; Prakhongcheep et al. 2018; Ponjarat et al. 2019), microsatellites (Chailertrit et al. 2014), and genome-wide SNPs; and (3) comprehensive research on betta biology such as systematics, ecology, developmental biology, pathology, functional morphology, sex determination, hybridization, physiology, functional trait diversity, biological diversity, aggressive behavior, and population genetics. This will allow us to design a species maintenance program for vulnerable or threatened species.

Pigments and body colors: Pigments of bettas provide red, blue, green, and yellow body surface colors and scale iridescence (Monvises et al. 2009). Pigments are compounds of general structures such as melanins, carotenoids, xanthines and pterins, whereas iridescent compounds on the scales and body surfaces are guanines and purines (Delgado-Vargas et al. 2000; Lorin et al. 2018). These pigments are also common in other vertebrates including humans. However, teleosts have special pigment cells (chromatophores) that respond rapidly to neural signals by deepening color intensity and changing shades. The genetic basis and synthetic routes for these pigments have not been well established, let alone the interactions between them. The chromatophores of teleosts change color in accordance with habitat, environment, and stimuli (Sugimoto 2002). Restriction site-associated DNA sequencing (RAD-Seq) markers were used for a genetic study related to the color pattern of cichlid fish. Several genomic regions associated with carotenoid, melanin, and mixed color traits were found to be highly differentiated between distinct color morphs (San-Jose and Roulin 2017). Developmental biology and genome information will therefore be useful for producing desired body color patterns on demand.

Aggressive behavior: The betta is a particularly interesting model for investigating proximate and ultimate questions about aggressive behavior. Its aggressive behavior has been well characterized and female–female aggression occurs naturally (Braddock and Braddock 1995; Simpson 1968; Ramos and Gonçalves 2019). Bettas are social animals capable of living in groups with a pecking order or in isolation. They are very territorial, especially after isolation or during courtship. Their aggressiveness is usually expressed by body color intensity, expansion of fins, and opening of gill covers (opercula), all of which are features preferred by females. Although such displays appeal to gamblers who bet on the outcome of fights, people who raise bettas as ornamentals would prefer beautiful displays with minimal combat. Fighter males have higher swimming activity, perform frequent fast strikes in the direction of the intruder, and display from a distance. Wild-type males are less active and exhibit aggressive displays mostly in close proximity to the stimuli. Females of the fighter strain, which are not used in fights, are also more aggressive than wild-type females. The same fighter and wild-type strains show differences in male cortisol response to unfamiliar environments, with wild-types but not fighters displaying increased cortisol levels (Verbeek et al. 2008). The betta is a good model for artificial selection and experimental evolution when testing proximate and ultimate causes of behavior. Testing may be carried out under controlled conditions in the laboratory or in natural settings, or may arise from unintended natural experiments. The study of these systems may provide relevant information on the underlying genetic and physiological mechanisms of aggression and also on ways in which sexual conflict may shape the evolution of aggressive behavior (Rice 1992; Wright et al. 2018).

Genome information: We will outline major structural tasks, including population genetics and genomics analyses that use SNPs to reveal population status and to identify species-specific and morphological variety-specific SNPs. This will be the responsibility of the betta genome project as the core of the genome information. We propose a number of research questions and hypotheses at the level of genome evolution and betta biology. A crucial step in making genome resources useful to the scientific community is the generation of gene annotations and an understanding of diversity. We plan to annotate the complete genome de novo, predicting novel genes as well as eukaryotic core genes and their structure, frequency and function (Klasberg et al. 2016; Daya et al. 2020). We will also carry out a number of traditional analyses of genome content using betta genomes, focusing on repeated sequences and gene families. Repeatomics analysis will enable the identification of novel repeat families as well as conserved elements, and reveal the patterns of transposable element (TE) evolutionary landscapes. We will compare repeat family content within betta genomes and with other teleosts and vertebrates. We will also conduct analyses of gene family evolution within teleosts to identify specific genes and other functional elements, including ultra-conserved regions and potential microRNA sequences, with a specific focus on sequences that could have been gained or lost either within bettas or in comparison with other relevant lineages that are now available for investigation. There are also several fundamental biological questions specific to bettas, such as sex determination systems, which we will address by analyzing genomic and RNA sequencing data alongside experimental techniques. We will also perform genome-wide screening of SNPs and structural variation analyses and extend applications of genome sequencing that are particularly relevant to farm-bred betta breeding programs.
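One way to make the notion of a species-specific SNP concrete is as an allele that is fixed in one species and absent from every other species surveyed. The following is a minimal sketch under that working definition, which is our assumption rather than a criterion fixed by the project. The function name, the input layout, and the toy genotypes are all hypothetical and serve only as illustration.

```python
from collections import defaultdict


def species_specific_snps(genotypes):
    """Find alleles fixed in one species and absent from all others.

    genotypes: {site: {species: set of alleles observed in that species}}
    Returns {species: [(site, allele), ...]}.
    """
    result = defaultdict(list)
    for site, by_species in genotypes.items():
        for species, alleles in by_species.items():
            if len(alleles) != 1:
                continue  # polymorphic within this species, so not fixed
            (allele,) = alleles
            others = [a for sp, a in by_species.items() if sp != species]
            # Require at least one other species for the comparison.
            if others and all(allele not in a for a in others):
                result[species].append((site, allele))
    return dict(result)


# Toy data only (hypothetical sites and alleles, not project results).
toy = {
    "chr1:100": {"B. splendens": {"A"}, "B. imbellis": {"G"}, "B. smaragdina": {"G"}},
    "chr1:250": {"B. splendens": {"C"}, "B. imbellis": {"C", "T"}, "B. smaragdina": {"C"}},
    "chr2:75": {"B. splendens": {"T"}, "B. imbellis": {"T"}, "B. smaragdina": {"T"}},
}

print(species_specific_snps(toy))
```

In practice the allele sets would be derived from called genotypes across many individuals per species, and a frequency threshold may be preferable to strict fixation when sample sizes are small.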

Fundamental preservation technology: The loss of biodiversity in wild bettas has prompted an increase in research aimed at developing conservation strategies. A number of teleosts now face extinction due to overfishing and habitat destruction. Although considerable efforts have been expended on preservation, such as the establishment of closed areas and seasons, improvements in teleost resources have been limited. Once teleost habitats are destroyed, they are extremely difficult to repair in a short period of time (Yoshizaki and Lee 2018). Therefore, emergency measures are needed to preserve valuable genetic resources of endangered teleosts. Several techniques for teleost conservation are available, including assisted reproductive techniques such as artificial insemination, embryo transfer (ET), in vitro fertilization, and cloning using somatic cell nuclear transfer, as well as biobanking (genome resource banking) with cryopreservation (Coward et al. 2002; Vasta et al. 2004; Luo et al. 2011; Agca 2012; Peuß et al. 2019). Biobanking is based on the storage of biological material at sub-zero temperatures using cryopreservation. The material can be in the form of cell lines, gametes, embryos, tissue, blood, DNA, somatic cells, or chromosomes (Strand et al. 2020). The remaining techniques require considerable time and highly skilled human resources to develop into a practical approach. Interestingly, potential uses of stored tissue samples and cell lines include development and application of species-specific stem cell technologies (including induced pluripotent stem cells), collection of natural gametes and production of artificial gametes, in vitro embryo production, ET into surrogate mothers, and rearing of offspring. Based on experience with similar technologies in several teleosts such as carp, salmon, rainbow trout and medaka, these methods may be adapted to bettas.
However, cryopreservation of teleost eggs remains very difficult due to their large size and high lipid and yolk content (Mazur et al. 2008). Sperm alone cannot produce live teleost individuals. A few reports have described the possibility of cryopreserving teleost eggs (Zhang et al. 1989; Chen and Tian 2005) but reproducibility of the techniques has not been confirmed (Edashige et al. 2006). As an alternative to fish egg cryopreservation, we plan to focus on the preservation of primordial germ cells (PGCs). These are small enough to be cryopreserved and do not contain much lipid and yolk material (Patino et al. 1995). Therefore, PGCs could be kept in liquid nitrogen semi-permanently, and whenever bettas were needed, the frozen PGCs could be converted into functional gametes by transplanting them into recipient bettas or teleosts of a species closely related to the donor. When the recipient teleosts matured, they would be able to produce donor-derived gametes. Thus, simply by mating male and female recipients, endangered or even extinct bettas could be regenerated solely from frozen genetic material. We therefore need to preserve genetic material by collecting biological material from threatened betta species.

Information center: The NBBRP has a role in assisting technology transfer to farmers for aquaculture of all bettas and in educating the general public, students, and researchers. Several betta species are listed as critically endangered (B. mahachaiensis and B. simplex), vulnerable (B. prima, B. pi and B. splendens) and threatened in situ (B. imbellis and B. smaragdina). A number of wild betta forms have been domesticated and incorporated into farming programs to serve the export industry. The greatest demands are for B. splendens, B. smaragdina, B. imbellis and all nest-building bettas (Monvises et al. 2009). Mouth-brooding species serve a small niche market and are less popular. Habitats for all bettas must be protected as a matter of urgency.

Project timeline and goals

The first phase of our bioresource project will involve generating 35× coverage Illumina WGS data for 15 different species of wild bettas native to Thailand. These WGS data will be mapped to the reference genome assembly generated in the second phase for a polymorphism survey to understand species-level divergence. Simultaneously, we will survey ecological habitats and collect bettas across Thailand. The second phase will involve generating high coverage Illumina and PacBio sequencing with at least 50× depth for one species to generate hybrid assemblies. Each individual will also be subjected to Hi-C and optical mapping to anchor the hybrid scaffolds into chromosomes. We will study population genomics by ddRAD sequencing of 10 specimens collected across regions from more than ten populations. The third phase will focus on genome annotation and gene ontology of betta biology including species radiation, parental care, behavioral aggression, and pigmentation as a relevant model for human biology. This will also involve FISH mapping of repeats to assign scaffolds to chromosomes. Biobank cryopreservation and cell cloning will also be conducted to maintain both wild specimens and live specimens in aquariums. The fourth phase will involve the development of molecular markers for a breeding and conservation program. We will also perform comparative genomic analyses within bettas and among teleosts and other vertebrates. The final phase will involve setting up a bioresource center to educate the general public. When all phases are completed, we can assess the most pressing questions involving betta genomics. Individual genes and their regulatory regions will be of primary interest. Completion of each phase will be publicly communicated via our website and social networks. Links to the data and assemblies will be available to researchers.
At the same time, we will transfer our knowledge and propose public activities to include teenagers, students, researchers, scientists, farmers, government sectors, private sectors, and stakeholders via books, mobile applications, and movie clips. We anticipate data collection and initial analyses to be completed by June 2025.
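The coverage targets in the first and second phases translate into raw sequencing yield through the standard relationship that mean coverage equals total sequenced bases divided by genome size. The sketch below illustrates the arithmetic only. The genome size (450 Mb) and read length (150 bp, paired-end) are placeholder assumptions that are not stated in this project description, and they should be replaced with per-species estimates and the actual library design.

```python
import math


def required_bases(genome_size_bp, coverage):
    """Total bases to sequence so that mean depth equals `coverage`."""
    if genome_size_bp <= 0 or coverage <= 0:
        raise ValueError("genome size and coverage must be positive")
    return genome_size_bp * coverage


def required_read_pairs(genome_size_bp, coverage, read_length_bp):
    """Paired-end read pairs (two reads per pair) needed for the target depth."""
    bases = required_bases(genome_size_bp, coverage)
    return math.ceil(bases / (2 * read_length_bp))


# Placeholder values: assumptions for illustration, not project figures.
ASSUMED_GENOME_SIZE_BP = 450_000_000
ASSUMED_READ_LENGTH_BP = 150

# Phase 1: 35x coverage for each of 15 species.
phase1_per_species = required_bases(ASSUMED_GENOME_SIZE_BP, 35)
phase1_total = 15 * phase1_per_species

# Phase 2: at least 50x depth for the reference species.
phase2_minimum = required_bases(ASSUMED_GENOME_SIZE_BP, 50)

print(f"Phase 1 per species: {phase1_per_species / 1e9:.2f} Gb")
print(f"Phase 1 total:       {phase1_total / 1e9:.2f} Gb")
print(f"Phase 2 minimum:     {phase2_minimum / 1e9:.2f} Gb")
print(f"Read pairs per species at 35x: "
      f"{required_read_pairs(ASSUMED_GENOME_SIZE_BP, 35, ASSUMED_READ_LENGTH_BP):,}")
```

These figures are lower bounds on raw yield, since reads lost to quality filtering, duplication, and unmapped sequence reduce the effective depth.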

How other groups can join the project or publish independently with our early release data

This project is affiliated with the NRCT and KU. We invite the broader scientific community to access and make use of the bioresources that we will produce. Any group performing independent analyses is welcome to use our data without restriction. As a matter of courtesy and to avoid duplicated effort, we request that competing bioresource-scale projects or analyses that overlap with the areas stated above disclose their status to the NBBRP, and that relevant papers describing the data are cited. As other groups build their own bioresource research and carry out data collection, the research community has an opportunity to build a legacy that will influence public opinion on research, data-sharing, and informational privacy for years to come. Research has also shown that members of the public have privacy preferences that are out of step with long-standing privacy policies applied to research. There have been efforts to educate the public about the benefits of bioresource research. A further project description and a complete list of current NBBRP members can be accessed on the website dedicated to this project.

Final remarks and recommendations

The future of betta genomics is bright. This prediction is supported in three ways listed below, in which specific characteristics of bettas can be successfully exploited to gain insights into a broad range of subjects. First, bettas are a highly diversified group that has experienced an astonishing range of environmental conditions to which their physiologies, body shapes, and lifestyles have adapted. This study will be performed in a non-model species, exemplifying the use of a specific species of teleost with a specific feature of interest. Bettas share many aspects of their developmental pathways, physiological mechanisms and organ systems with mammals, and these results are also relevant to human physiology. Our findings should be prioritized by biologists and breeders wishing to make contributions to future knowledge by considering bettas in innovative ways to promote them as an ornamental species for enjoyment and export, while conserving their habitats and diversity. The second illustration of the strength of future genomic research using bettas is the recognition by international funding agencies that bettas are comparable to zebrafish as models for fundamental and translational biomedical research. Bettas might be used as in vivo models with potential clinical relevance in the near future to elucidate disease mechanisms, novel therapeutic targets, and candidate therapeutics. This is a critical requirement for enabling research across different application domains. The complexity of this endeavor is magnified by intertwined evolutionary and genomic factors, including considerable levels of similarity at the genome and gene-family levels. Further standardized public efforts are needed, which will depend greatly on stronger support from research funders, researchers and other stakeholders. Genomic techniques and sequence comparisons with genomic models are cornerstones of this project.
We hope to illustrate how fish genomics may grow in the coming years beyond fundamental research labs towards objectives that are more applied and more essential to human welfare. Third, there is an opportunity to expand phylogenomic studies. Because many teleosts are routinely studied at both molecular and genomic levels, by drawing on the similarities and differences between teleosts and other vertebrate genomes, it is possible to gain profound insights into genomic evolution and the function of individual genes associated with human disorders.