Introduction

Being sessile, plants are constantly exposed to a variety of pathogens, which has driven the evolution of robust immune systems and intricate defense mechanisms, including pathogen-associated molecular pattern-triggered immunity (PTI) and effector-triggered immunity (ETI), to detect and combat pathogens (Zhang et al. 2014; Yu et al. 2017). Upon local induction, PTI and ETI cause systemic acquired resistance (SAR) (Fu and Dong 2013). PTI is the first line of inducible defense in the plant, and it is activated by highly conserved pathogen-associated molecular patterns (PAMPs) found in pathogens. Pattern recognition receptors (PRRs), such as transmembrane receptor-like kinases (RLKs) or receptor-like proteins (RLPs) present in the plasma membrane, recognize PAMPs and activate downstream immune signaling. The second layer of inducible defense is ETI, which is triggered by resistance (R) genes upon recognition of effector molecules, i.e., the products of pathogen avirulence (Avr) genes; this interaction typically results in the hypersensitive response (HR), a form of programmed cell death (PCD) that stops the pathogen’s growth (Cui et al. 2015). Due to the emergence of new virulent strains, pathogens often manage to overcome ETI resistance (Yu et al. 2014). Therefore, identifying R genes is essential for comprehending the molecular basis of resistance and creating resistant cultivars.

Currently, more than 300 R genes that confer resistance to various diseases have been cloned from various plant species (Yang and Wang 2016; Kourelis and van der Hoorn 2018). According to their domains, R genes can be divided into five groups, with the nucleotide-binding site leucine-rich repeat (NBS-LRR) class accounting for more than 60% of all characterized R genes (Kourelis and van der Hoorn 2018). This family of R genes encodes a variable domain at the N terminus, a central NBS domain, and an LRR domain at the C terminus, and its members identify avirulence proteins from pathogens either directly or via the Guard model (Collier and Moffett 2009; Dangl and Jones 2001).

Based on structural characteristics at the N terminus, the NBS-LRR family of genes has been further divided into three main subclasses. These include the TIR-NBS-LRR (TNL) proteins, which have a domain that resembles the intracellular signaling domains of Drosophila Toll and mammalian IL-1 receptors; the CC-NBS-LRR (CNL) proteins, which contain a putative coiled-coil domain; and the RPW8-NBS-LRR (RNL) proteins, which harbor a resistance to powdery mildew8 (RPW8) domain (Shao et al. 2016). TNLs are found solely in dicots, but CNLs and RNLs are found in both monocots and dicots (Shao et al. 2016). The NBS domain binds ATP/GTP and hydrolyzes it, supplying energy for downstream signaling, whereas the C-terminal LRR domain is involved in pathogen recognition and protein–protein interaction (Goyal et al. 2020). This class of genes primarily operates in disease resistance following pathogen detection, which initiates downstream cascades that result in a variety of defense responses, including HR and PCD (Guo et al. 2011).

NBS-LRR genes evolved early in the plant lineage, and due to the ongoing arms race between pathogens and plants, plants maintain a large number of R genes in their genome (Gu et al. 2015; Li et al. 2016; Shao et al. 2019). Genomic and evolutionary studies have provided a detailed and accurate understanding of how functional R genes evolved and were preserved. NBS-LRR genes are mostly involved in defense responses, but ADR1, an Arabidopsis CC-NBS-LRR, and At5g17880, an Arabidopsis TIR-NBS-LRR, have been linked to drought tolerance (Chini et al. 2004) and photomorphogenic development, respectively (Faigón-Soverna et al. 2006). NBS-LRR genome-wide analyses in many monocots and dicots have yielded a wealth of information on the structure and function of this gene family (Meyers et al. 1999, 2003; Bai et al. 2002; Mondragon-Palomino et al. 2002; Baumgarten et al. 2003; Ayliffe and Lagudah 2004; Zhou et al. 2004; Ameline-Torregrosa et al. 2008).

Banana is a major food crop in many developing countries, but it is susceptible to a variety of biotic (Fusarium wilt, leaf spot, bacterial wilt, bunchy top virus disease, weevils, and nematodes) and abiotic (salinity and drought) stresses that limit its production (Nansamba et al. 2020). To reduce the impact of these stresses, it is necessary to identify resistance genes and use novel transgenic strategies to develop improved banana cultivars. Furthermore, understanding genetic variations in NBS-LRR genes and the number of conserved genes in bananas will greatly aid in estimating the genetic diversity of R genes available in Musa species (Yang et al. 2006). In addition, studies on the genomic evaluation and evolutionary patterns of R genes will result in a better understanding of the basis of resistance and susceptibility, leading to the identification of functional R genes in bananas. As a result, there is a need for systematic evaluation of NBS genes from bananas, and with the availability of complete genome sequences (D'Hont et al. 2012; Wang et al. 2019), characterization is much needed to elucidate the diverse molecular mechanisms underlying host–pathogen interaction, as well as for mapping and cloning of R genes, which will help with the mining and exploitation of R genes for developing improved resistant varieties.

Materials and methods

Identification of NBS gene family members in banana

To create a local protein database, NBS gene and protein sequences were downloaded from the banana genome hub database (https://banana-genome-hub.southgreen.fr/) for both the A and B genomes. NBS-encoding genes were identified, and motif architecture was investigated, using a method similar to that used in Oryza sativa L. var. Nipponbare, Arabidopsis thaliana, and Brachypodium distachyon (Meyers et al. 2003; Zhou et al. 2004; Tan and Wu 2012). A reiterative process was used to identify NBS gene sequences from both the A and B genomes of banana. A Hidden Markov Model (HMM) (Eddy 1998) search with the Pfam NBS domain profile (PF00931; http://pfam.sanger.ac.uk/search) was used to select candidate NBS genes from the entire set of predicted M. acuminata (DH Pahang) and M. balbisiana (DH PKW) proteins. BLASTP searches in NCBI (Altschul et al. 1990) were used to compare the sequences of the predicted NBS-containing proteins to the non-redundant (nr) database (the threshold expectation value was set to 1e−10), allowing the identification of regular and non-regular NBS genes. NBS-encoding genes were classified into sub-groups based on domain information from the NCBI conserved domain database (CDD) (Tan and Wu 2012). The genome IDs of NBS genes were arranged according to their chromosomal location, from Chr01 to Chr11 (Supplementary Data: 1a, 1b).
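The BLASTP filtering step described above can be illustrated with a minimal sketch. It assumes the BLAST results were exported in the standard 12-column tabular format (`-outfmt 6`), which the text does not state; the function name `filter_hits` and the gene and subject identifiers are hypothetical and serve only as an example.

```python
import csv
import io

EVALUE_CUTOFF = 1e-10  # threshold expectation value stated in the text


def filter_hits(tabular_text, cutoff=EVALUE_CUTOFF):
    """Return the best-scoring hit per query among hits with e-value <= cutoff.

    Assumes BLAST tabular output (-outfmt 6) with the standard columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send
    evalue bitscore
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        query, subject = row[0], row[1]
        pident, evalue, bitscore = float(row[2]), float(row[10]), float(row[11])
        if evalue > cutoff:
            continue  # hit is weaker than the expectation threshold
        if query not in best or bitscore > best[query]["bitscore"]:
            best[query] = {"subject": subject, "pident": pident,
                           "evalue": evalue, "bitscore": bitscore}
    return best


# Hypothetical example records (not real data from the study).
example = "\n".join([
    "\t".join(["geneA", "subj1", "62.5", "300", "100", "2", "1", "300", "5", "304", "1e-50", "400"]),
    "\t".join(["geneA", "subj2", "40.0", "280", "150", "4", "1", "280", "9", "288", "1e-20", "150"]),
    "\t".join(["geneB", "subj3", "35.0", "120", "70", "3", "1", "120", "2", "121", "1e-3", "45"]),
])
hits = filter_hits(example)
print(hits)
```

Here the second query is dropped because its only hit exceeds the expectation threshold, and the first query keeps the hit with the higher bit score.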

Phylogenetic tree construction

Using the CLUSTALW tool in the BioEdit sequence alignment editor version 7.0.3.1, NBS gene sequences from both genomes were aligned (Hall 1999). Multiple sequence alignments were performed using MUSCLE v.3.8.31 with default parameters to examine the evolutionary relationships of NBS between Musa (A, B genome), O. sativa, and A. thaliana, and MEGA X was used to create a maximum likelihood (ML) phylogenetic tree using all sites with bootstrap analysis (1000 replicates) (Kumar et al. 2018).

Gene structure, cis-regulatory elements, and genomic distribution of NBS genes

The Bio-sequence Structure Illustrator application of the TBtools software version 1.077 was used to illustrate the exon–intron structure of NBS genes (Chen et al. 2020). The MEME tool (http://meme-suite.org/tools/meme) was used to identify conserved motifs with the default parameter settings: maximum number of motifs = 20, and the results were displayed by TBtools. Using TBtools, each NBS gene was assigned to a chromosome based on its location on the Musa A and B genomes. We utilized the TBtools Quick MCScanX Wrapper program to find the tandem and segmental duplication gene pairs. PlantCARE (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/) was used to investigate the upstream sequences (2 kb) of each NBS gene to determine the expected cis-regulatory elements.
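As an illustration of how a 2 kb upstream (promoter) window can be derived from gene coordinates before submission to a tool such as PlantCARE, the following is a minimal sketch. It assumes 1-based inclusive coordinates and a known strand for each gene; the function name `upstream_region` and the example coordinates are hypothetical and are not taken from the study.

```python
def upstream_region(start, end, strand, chrom_length, flank=2000):
    """Return the 1-based inclusive (start, end) of the upstream window.

    For '+' strand genes the window lies before the gene start; for '-' strand
    genes it lies after the gene end. The window is clipped at the chromosome
    boundaries, and None is returned if no upstream sequence is available.
    """
    if strand == "+":
        up_start = max(1, start - flank)
        up_end = start - 1
    elif strand == "-":
        up_start = end + 1
        up_end = min(chrom_length, end + flank)
    else:
        raise ValueError("strand must be '+' or '-'")
    if up_start > up_end:
        return None
    return up_start, up_end


# Hypothetical gene at positions 5001-8000 on a 100 kb chromosome.
print(upstream_region(5001, 8000, "+", 100_000))
print(upstream_region(5001, 8000, "-", 100_000))
```

For minus-strand genes the extracted sequence would additionally need to be reverse-complemented before motif scanning.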

Synteny and gene duplication analysis

Duplication events of NBS genes were examined using the default settings of MCScanX (Wang et al. 2012). Based on the collinear pairs and gene locations in both the Musa A and B genomes, Advanced Circos was used to produce diagrams for collinearity analysis (Chen et al. 2020). Using TBtools, a multiple synteny plot was created between the Musa A and B genomes and the O. sativa and A. thaliana genomes.

Ka/Ks analysis and estimated divergence time for the duplicated Musa NBS genes

Using the yn00 program of the PAML package, the non-synonymous substitution rate (Ka) and synonymous substitution rate (Ks) of gene duplication events were calculated (Xu and Yang 2013). The approximate time of each duplication event (T, in million years ago, Mya) was calculated from the Ks value of each gene pair, because synonymous substitutions in duplicated genes are expected to accumulate at a roughly constant rate under a molecular clock (Shiu et al. 2004), using the formula $T = \frac{K_s}{2\lambda} \times 10^{-6}$, where $\lambda$ is the clock-like substitution rate (Lynch and Conery 2000) and $\lambda = 4.5 \times 10^{-9}$ for banana (Lescot et al. 2008).
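The divergence-time formula can be expressed as a short worked example. This is a minimal sketch that uses only the rate given in the text (λ = 4.5 × 10−9); the function name `divergence_time_mya` and the example Ks value are hypothetical.

```python
LAMBDA_BANANA = 4.5e-9  # clock-like substitution rate for banana, from the text


def divergence_time_mya(ks, rate=LAMBDA_BANANA):
    """Estimate duplication time in million years ago (Mya): T = Ks / (2 * rate) * 1e-6."""
    if ks < 0:
        raise ValueError("Ks must be non-negative")
    return ks / (2 * rate) * 1e-6


# Hypothetical Ks value: 0.09 / (2 * 4.5e-9) = 1e7 years, i.e. about 10 Mya.
print(round(divergence_time_mya(0.09), 3))
```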

Expression analysis of NBS genes

The differential expression (DE) value (log2 fold change) of each NBS gene under Pseudocercospora eumusae, Pratylenchus coffeae, and drought stress conditions in the corresponding contrasting cultivars was obtained from the database maintained at ICAR-NRCB (http://nrcb.res.in/nrcbbio/about.html), along with DE values for Fusarium oxysporum f. sp. cubense (Foc) race1 (VCG 0124) and Foc tropical race 4 (TR4). A heat map of the significant log2 fold change values of the differentially expressed NBS genes was generated using TBtools.

Plant materials and stress treatments

Three-month-old healthy tissue-cultured plants of Foc race1 (VCG 0124) (Thangavelu et al. 2021) resistant (cv. Rose, AA) and susceptible (cv. Namarai, AA) cultivars, and of Foc TR4 resistant (cv. Rose, AA) and susceptible (cv. Matti, AA) cultivars, were individually planted in pots containing a pasteurized potting mixture and maintained in a greenhouse at 25 °C with a 12-h photoperiod. Separately, 30 g of the Foc race1 and TR4 fungal mixtures were inoculated around the root zone of the respective contrasting genotypes at 2–3 cm below the soil surface. Root and corm samples were collected from Foc-inoculated plants on the 0th, 2nd, 4th, 6th, and 8th day after inoculation. Leaf and root samples were collected and snap frozen in liquid nitrogen before being stored at −80 °C until use.

Regulation by microRNAs

M. acuminata miRNAs were retrieved from the Prediction miRNA site (PmiREN2.0: https://www.pmiren.com/), as miRNAs from Musa spp. were not found in miRBase. The downloaded mature miRNAs and the NBS CDS of both genomes were submitted to psRNATarget: A Plant Small RNA Target Analysis Server (https://www.zhaolab.org/psRNATarget/) to identify targets, using default search parameters after selecting the option “submit small RNAs and targets”.

Gene Ontology classification

The functional annotation of the NBS proteins from Musa spp. was investigated using the Gene Ontology Functional Annotation Tool Blast2GO version 3.3.5 (https://www.blast2go.com/blast2go-pro) (Conesa et al. 2005). BLASTP was used for the annotation with default setup parameters and an E-value filter of ≤ 10^−5 against the NCBI non-redundant (nr) protein database. GO terms associated with each of the hits were examined based on their molecular functions, biological processes, and cellular localization to illustrate the potential functions of our genes.

Interaction network of NBS proteins

Protein–protein interactions (PPI) of the NBS-LRR proteins were investigated using STRING v11.0, and a PPI network was built using A. thaliana as a reference (Szklarczyk et al. 2019). The medium confidence level was chosen for the minimum required interaction score.

Results and discussion

NBS genes are the largest R gene family in the plant genome, and they play an important role in pathogen response. A comprehensive analysis of NBS-encoding genes across the entire banana genome (A and B) was performed in this study, providing an opportunity to mine and use these in disease resistance breeding.

Identification and features of NBS genes

The Musa A and B genomes contained 116 and 43 NBS genes, respectively, fewer than in A. thaliana (174), rice (636), B. distachyon (239), Zea mays (129), Sorghum bicolor (245), grapevine (535), and poplar (416) (Table 1). This suggests that the total number of NBS genes is not related to genome expansion; one possible explanation for the lower number of genes in Musa species is that transposable elements cause pseudogenization of NBS genes, resulting in gene loss (Li et al. 2010a). Moreover, the number of NBS genes in the B genome is markedly lower than that of the A genome, and similar findings have been reported in orchids (Zhang et al. 2016) and in the three Cucurbitaceae species Cucumis sativus, C. melo, and Citrullus lanatus, particularly in Ci. lanatus, where only 45 genes have been reported (Lin et al. 2013). This could be due to stringent gene loss or to a limited number of gene duplication and diploidization events that occurred extensively in the B genome following whole genome duplication (WGD) events in the Musa lineage (Wang et al. 2019; Zhang et al. 2016). Alternatively, the nearly threefold difference in the number of NBS genes between the A and B genomes could be explained by a recent expansion of NBS genes in the A genome. Similar variation between related species has also been observed in potato and tomato (Qian et al. 2017), as well as in Oryza, Glycine, and Gossypium (Zhang et al. 2010).

Table 1 Number of NBS encoding genes in Musa spp. and other crops

In addition, Li et al. (2010a) reported that gene gain and loss events cause inconsistency, resulting in a shrinking pattern of NBS genes in Asian rice, maize, S. bicolor, and B. distachyon. Gene family expansion and contraction studies between M. acuminata and M. balbisiana revealed that 83 gene families, including those involved in plant-pathogen interaction, notably expanded in the A genome, while they significantly contracted in the B genome (Wang et al. 2019). Varied evolutionary patterns have likewise been reported in other families: NBS genes in the Fabaceae and Rosaceae families expanded continuously (Shao et al. 2014; Jia et al. 2015), whereas in the Brassicaceae family the genes first expanded and then contracted (Zhang et al. 2016). In the Solanaceae family, the genes showed a shrinking pattern in pepper, a consistently expanding pattern in potato, and a pattern of expansion followed by shrinking in tomato (Qian et al. 2017).

Furthermore, NBS genes were classified as regular if the aligned region shared ≥ 50% identity with a subject sequence in the nr database, and as non-regular otherwise. In contrast to regular genes, non-regular genes have short motif lengths and diverse motifs despite harboring the NBS structure (Tan and Wu 2012). By this criterion, 38 and 13 hits were defined as regular NBS genes, and the remaining 78 and 30 hits were defined as non-regular NBS-encoding genes, in the banana A and B genomes, respectively. Similarly, not all of the NBS genes discovered in plants such as rice, B. distachyon, tomato, potato, pepper, and orchids have all of the domains intact, which could be the result of recombination, fusion, and pseudogenization (Zhou et al. 2004; Tan and Wu 2012; Qian et al. 2017; Xue et al. 2020). Given that their genomes have been thoroughly sequenced and annotated, rice and Arabidopsis have fewer truncated genes than either genome of Musa spp. (Meyers et al. 2003; Zhang et al. 2016; Xue et al. 2020).
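The regular/non-regular split described above reduces to a single threshold on percent identity. The following minimal sketch assumes that one percent-identity value (from the best nr hit) is available per gene; the function name `classify_regularity` and the example values are hypothetical.

```python
from collections import Counter

IDENTITY_THRESHOLD = 50.0  # percent identity cutoff stated in the text


def classify_regularity(pident, threshold=IDENTITY_THRESHOLD):
    """Label a gene 'regular' if its best nr hit shares >= threshold % identity."""
    return "regular" if pident >= threshold else "non-regular"


# Hypothetical percent-identity values for four genes (not data from the study).
identities = {"gene1": 72.4, "gene2": 50.0, "gene3": 49.9, "gene4": 31.0}
labels = {gene: classify_regularity(p) for gene, p in identities.items()}
counts = Counter(labels.values())
print(labels)
print(counts)
```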

In a prior work, Chang et al. (2020) reported 98 NBS-LRR genes in M. acuminata, whereas we have reported 116 genes (Anuradha et al. 2022a); the difference could be due to the stringent criteria followed and the use of the HMM model, and similar results have been reported for A. lyrata and B. distachyon (Guo et al. 2011; Tan and Wu 2012). The NBS protein family typically contains two major subfamilies, toll/interleukin-1 receptor-NBS-LRR (TNL) and coiled-coil NBS-LRR (CNL). Of the 38 and 13 regular NBS genes in the A and B genomes, 24 and 5 were NBS-LRR types, respectively; of these, 21 and 3 were CC-NBS-LRRs having all the domains, and 3 and 2 were NBS-LRRs without the coiled-coil (CC) motif. Further, among the regular NBS genes, 5 and 3 were CC-NBSs lacking LRR domains, 2 and 3 were X-NBS, where X is an unknown motif, and 7 and 2 had only the NBS domain in the A and B genomes, respectively (Supplementary data: 1a, 1b). Many crops, including rice, B. distachyon, maize, sorghum, Arabidopsis, C. sativus, grapevine, poplar, and cabbage, have a similar grouping of NBS genes (Li et al. 2010a; Guo et al. 2011; Wan et al. 2013; Goyal et al. 2020; Liu et al. 2021a, b). Furthermore, banana being a monocot, there are no TNL groups of NBS-LRR, which are usually present in dicots, and reports of TIR in monocots are scarce (Li et al. 2010b; Pan et al. 2000).
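The domain-based grouping used above (CNL, NL, CN, XN, N) can be sketched as a simple rule on the set of domains detected in each protein. The domain labels ("TIR", "CC", "X", "NBS", "LRR") and the function name `classify_architecture` are assumptions made for illustration; they are not the output format of the CDD search used in the study.

```python
def classify_architecture(domains):
    """Map a collection of detected domain labels to an NBS architecture class.

    The N-terminal prefix is T (TIR), C (coiled-coil), X (unknown motif) or
    empty; the suffix L is added when an LRR domain is present.
    """
    found = set(domains)
    if "NBS" not in found:
        return "non-NBS"
    if "TIR" in found:
        prefix = "T"
    elif "CC" in found:
        prefix = "C"
    elif "X" in found:
        prefix = "X"
    else:
        prefix = ""
    suffix = "L" if "LRR" in found else ""
    return prefix + "N" + suffix


# Hypothetical domain sets illustrating each class named in the text.
for doms in (["CC", "NBS", "LRR"], ["NBS", "LRR"], ["CC", "NBS"], ["X", "NBS"], ["NBS"]):
    print(doms, "->", classify_architecture(doms))
```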

In the A and B genomes, the length of NBS family proteins varied from 51 to 2275 and 118 to 2254 amino acids (aa), respectively. The molecular weight of these proteins ranged from 5.78 to 258.01 kDa in the A genome and 12.79 to 255.59 kDa in the B genome. The isoelectric point (pI) of NBS proteins ranged from 5.55 to 9.88 and 4.82 to 9.56 in the A and B genomes, respectively. Of the 116 and 43 proteins from the A and B genomes, only 23 and 6 had signal peptides, which ranged from 16 to 31 aa and 17 to 54 aa in length, respectively (Supplementary data: 1a, 1b). These included mitochondrial and chloroplast targeting peptides, and some proteins may be localized in the secretory pathway. Similar findings have been reported on the protein length, molecular weight, and isoelectric point of NBS proteins of grapevine and cabbage (Goyal et al. 2020; Liu et al. 2021a, b).

Phylogenetic analysis

A phylogenetic tree of 159 NBS genes, including regular and non-regular genes, was constructed to investigate the evolutionary relationships among all of the identified NBS genes from both banana genomes (Fig. 1). The tree showed separate clustering of CNL, CN, XNL, XN, XL, and N genes, with TIR motif-harboring NBS genes as an out-group. Many of the clusters contained a few genes of other classes, indicating that they are co-evolving or that genetic material is being exchanged between the genes (Yang and Wang 2016). Further, most NBS genes on the same chromosome, as well as those having high sequence similarity and similar motifs, were grouped in the same clades, with the exception of a few genes, which suggests that tandem duplication has occurred (Guo et al. 2011; Wan et al. 2013; Mace et al. 2014; Yang and Wang 2016; Chang et al. 2020). None of the banana NBS genes clustered with Arabidopsis TNL genes, consistent with the absence of TIR motifs in monocots (Meyers et al. 2002; Richly et al. 2002; Tan and Wu 2012; Chang et al. 2020). Representative NBS genes from Arabidopsis, rice, and Brachypodium clustered in different clades, which may reflect species differences arising from the loose functional domain structures of the CC, NBS, and LRR domains (Liu et al. 2021a, b).

Fig. 1

Phylogenetic analysis of NBS genes from the Musa A and B genomes and representative sequences from A. thaliana and O. sativa. The maximum likelihood phylogenetic tree was constructed using MEGA X with 1000 bootstrap replications. Squares, circles, diamonds, triangles (pink), and triangles (blue) represent the NBS genes of Musa A, Musa B, rice, Arabidopsis, and Brachypodium, respectively

Gene structure, conserved protein motifs

Analysis of gene structure helps in understanding the structural evolution of the NBS gene family. Both genomes have similar gene structures with few variations, which is common in many functional genes (Lescot et al. 2008; Wang et al. 2019). Most of the genes, i.e., 21 regular and 35 non-regular NBS genes of the A genome and 6 regular and 11 non-regular genes of the B genome, had no introns, and 42 and 11 genes had a single exon in the A and B genomes, respectively (Supplementary Fig. 1a, 1b, 1c, 1d). A total of 6 regular and 12 non-regular NBS genes of the A genome and 2 regular and 11 non-regular NBS genes of the B genome had more than two introns. Meyers et al. (2003) reported low diversity in exon number, with most CNLs having only one exon. In general, NBS genes have few introns and their structure is not conserved, and our findings are consistent with NBS genes from other crops (Meyers et al. 2003; Mun et al. 2009; Lozano et al. 2015; Shao et al. 2016; Chang et al. 2020; Liu et al. 2021a, b; Goyal et al. 2020). Exon/intron gain, deletions/insertions, and exonization/pseudo-exonization are the main causes of the variation in gene structure, which may contribute to enhanced gene expression (Roy and Gilbert 2005; Long et al. 2013; Xu et al. 2012; Wan et al. 2013; Jo and Shim 2015; Goyal et al. 2020; Anuradha et al. 2022b). Furthermore, the majority of NBS genes with the same exon–intron organization clustered together, indicating a high degree of conservation throughout evolution (Chang et al. 2020; Liu et al. 2021a, b). In addition, the number of motifs increases with the number of exons, which is associated with gene length in both genomes (Yang and Wang 2016), and similar results were also reported by Yu et al. (2021).

NBS proteins in both the genomes were examined to see whether they shared any common motifs, and a total of 20 motifs were discovered (Supplementary Fig. 1a, 1b, 1c, 1d). Motifs 1, 3, 7, and 13 were found in the NBS domain of all the genes, and motifs 4 and 8 were present only in one cluster of the regular genes whereas motifs 2, 4, 6, 8, 9, 10 of the NBS domain were present in all the non-regular genes of A genome of banana. The most conserved motifs are the P-loop, GLPL, Kinase-2, RNBS-A, RNBS-B, RNBS-D, and MHDV, with the C1, P-loop, GLPL, Kinase-2 being the most common, and many have reported the presence of these motifs in the NBS genes (Meyers et al. 2003; Zhou et al. 2004; Tan and Wu 2012) (Supplementary data 2a, 2b). The motifs are highly conserved and ordered in the signaling NBS domains, whereas the LRR domain interacts with the pathogen, resulting in changes in the LRR binding specificities (Yu et al. 2021). Moreover, the members with similar motifs clustered together in the phylogenetic tree as well (Fig. 1; Supplementary Fig. 1). The P-Loop and Kinase-2 motifs are involved in ATP/GTP binding and their high conservation is critical for protein function (Traut 1994; Meyers et al. 2003; Habachi-Houimli et al. 2018) and the presence of GLPL motif is essential for disease resistance (Dodds et al. 2001). Furthermore, the last amino acid of the kinase-2 motif was W (tryptophan) and many have also reported the presence of W in non-TNL genes, and based on the presence of W/D as the last residue we could distinguish the type of NBS genes as Non-TNL/TNL (Meyers et al. 2003; Wan et al. 2013; Die et al. 2018) (Supplementary data 2a, 2b).
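The rule described above, which uses the last residue of the kinase-2 motif (W for non-TNL, D for TNL) to distinguish gene types, can be sketched as follows. The function name `classify_by_kinase2` and the two example motif strings are illustrative assumptions and are not sequences reported in this study.

```python
def classify_by_kinase2(motif):
    """Classify an NBS gene from the final residue of its kinase-2 motif.

    W (tryptophan) indicates a non-TNL gene and D (aspartate) a TNL gene;
    any other residue is left undetermined.
    """
    cleaned = motif.strip().upper()
    if not cleaned:
        raise ValueError("empty motif")
    last = cleaned[-1]
    if last == "W":
        return "non-TNL"
    if last == "D":
        return "TNL"
    return "undetermined"


# Illustrative motif strings only (assumed, not taken from the study).
print(classify_by_kinase2("LLVLDDVW"))
print(classify_by_kinase2("LIVLDDVD"))
```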

Gene distribution, collinearity, and synteny analysis

NBS genes are unevenly distributed and exist as clusters (a cluster being a region that contains four or more genes within 200 kb or less; Holub 2001) across the chromosomes of both the A and B genomes of banana (Supplementary Fig. 2a, 2b, 2c, 2d), and similar findings have been reported for rice, Arabidopsis, grapevine, poplar, and Brachypodium (Richly et al. 2002; Meyers et al. 2003; Zhou et al. 2004; Yang et al. 2008; Tan and Wu 2012). The majority of NBS genes are found in clusters, which serve as a reservoir of genetic variation for NBS genes via gene conversion, duplication, and diversifying selection (Meyers et al. 2005; Ameline-Torregrosa et al. 2008).
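The cluster definition cited above (four or more genes within 200 kb) can be applied to gene coordinates with a simple scan along each chromosome. The sketch below is one possible reading of that definition, in which the distance is measured between the start positions of the first and last gene in a window; the function name `find_clusters` and the example coordinates are hypothetical.

```python
from collections import defaultdict


def find_clusters(genes, max_span=200_000, min_genes=4):
    """Group genes into clusters of >= min_genes whose starts span <= max_span bp.

    genes: iterable of (gene_id, chromosome, start_position) tuples.
    Returns a list of (chromosome, [gene_ids in positional order]).
    """
    by_chrom = defaultdict(list)
    for gene_id, chrom, start in genes:
        by_chrom[chrom].append((start, gene_id))

    clusters = []
    for chrom in sorted(by_chrom):
        loci = sorted(by_chrom[chrom])
        i = 0
        while i < len(loci):
            j = i
            # Extend the window while the next gene is within max_span of gene i.
            while j + 1 < len(loci) and loci[j + 1][0] - loci[i][0] <= max_span:
                j += 1
            if j - i + 1 >= min_genes:
                clusters.append((chrom, [gene for _, gene in loci[i:j + 1]]))
                i = j + 1  # continue after the reported cluster
            else:
                i += 1
    return clusters


# Hypothetical coordinates (not data from the study).
example_genes = [
    ("g1", "chr03", 100_000), ("g2", "chr03", 150_000),
    ("g3", "chr03", 220_000), ("g4", "chr03", 290_000),
    ("g5", "chr03", 900_000),
    ("g6", "chr01", 10_000), ("g7", "chr01", 500_000),
]
print(find_clusters(example_genes))
```

In this example only the four closely spaced genes on chr03 form a cluster; the remaining genes would be treated as singletons.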

In the A genome, regular NBS genes were distributed across eight chromosomes (all except chr2, 5, and 11), whereas in the B genome they were located on chr1, 4, 7, and 8. Non-regular NBS genes were distributed across all 11 chromosomes in the A genome and across all chromosomes except chr3 and 11 in the B genome. The largest numbers of regular NBS genes were found on chr3 (23 genes) in the A genome and chr1 (10 genes) in the B genome, while the largest numbers of non-regular NBS genes were found on chr6 and chr10 (13 genes) in the A genome and chr6 (8 genes) in the B genome. Gene duplication, uneven crossing over, ectopic recombination, gene conversion, and diversifying selection may all have contributed to the distribution of R genes (Friedman and Baker 2007; Yang et al. 2015a; Chang et al. 2020).

A total of two clusters (chr3) in regular and six clusters (chr1, 3, 6, 7, 9, 10) in non-regular NBS genes were observed in the A genome, whereas in the B genome a single cluster (chr1) was observed. The clusters may be monophyletic (sequence with high sequence similarity and a close relationship) or mixed clusters (sequence with low sequence similarity and a diverged relationship). NBS genes in Musa spp. have monophyletic clusters in both regular (chr3, 6, 9 and chr1) and non-regular genes (chr1, 3, 6, 7, 9 and chr1, 6, 9) of the A and B genomes, respectively and mixed clusters of genes (Supplementary data: 3a, 3b, 3c, 3d). NBS genes in monophyletic clusters are small and have higher sequence similarity than NBS genes in the mixed clusters, indicating that they evolved through different mechanisms. Most monophyletic clusters may have resulted from a local duplication event, which contributes to the gene diversification and increase in the number, and the difference in numbers may also be attributed to the pressure exerted by the pathogens (Mace et al. 2014; Yang and Wang 2016) whereas mixed clusters may have resulted from ectopic recombination (Yang and Wang 2016). Eitas and Dangl (2010) discovered that two NBS genes are involved in resistance, with the majority of them coming from clusters. Most NBS genes are clustered, but some are present as singletons on the chromosome, which could be due to gene loss by pseudogene formation, or deletion or these genes may not have undergone local duplication (Zhang et al. 2014). These singletons may further act as trailblazers, resulting in the formation of new NBS regions and clusters.

Tandem duplication events were discovered in both banana genomes, resulting in the clustering of NBS genes on the chromosome (Fig. 2a–d). Tandem duplication events were found in chr1, 3 and chr1, 3, 4, 6, 7, 9, and 10 in the A genome’s regular and non-regular genes, respectively. In the case of the B genome, tandem duplication events were found in chr1 (regular genes) and chr6, 9 (non-regular genes), respectively. This demonstrates that tandem duplication events played a significant role in the expansion of the NBS genes in bananas and similar findings have been reported in many other crop species (Meyers et al. 2003; Li et al. 2010b; Kang et al. 2012; Wan et al. 2013; Shao et al. 2014; Yang and Wang 2016; Zhang et al. 2016, 2020; Qian et al. 2017; Chang et al. 2020; Liu et al. 2021a, b). Many studies have revealed that the evolution of most of the NBS genes falls under medium to high tandem duplication classes (Santamaria et al. 2001; Die et al. 2018). Further, the low number of duplication events in the B genome indicates a lack of recent duplication events, as well as contraction of gene families involved in plant–pathogen interactions following divergence from the A genome (Wang et al. 2019). The low number of genes and duplication events also revealed that these genes are sufficient for pathogen surveillance (Porter et al. 2009).

Fig. 2

a, b Collinearity mapping of regular and non-regular NBS genes in Musa A genome. Different chromosomes are shown in yellow color. The gene density is displayed in the form of histogram. The inner colored lines represent the collinearity relationships of NBS families. c, d Collinearity mapping of regular and non-regular NBS genes in Musa B genome. Different chromosomes are shown in yellow color. The gene density is displayed in the form of histogram. The inner colored lines represent the collinearity relationships of NBS families

The synteny relationship of NBS genes from banana was investigated to explore the evolutionary events that occurred between the orthologs from O. sativa and A. thaliana (Fig. 3a). The results revealed that NBS genes from both banana genomes had homologous regions in rice and Arabidopsis, but the degree of synteny was higher with rice than with Arabidopsis, which could be due to whole genome duplication (WGD) even before Musa spp. diverged from Poaceae, as well as to the existence of microsynteny between Musa, rice, and Arabidopsis followed by independent cycles of WGD and diploidization (Lescot et al. 2008; D'Hont et al. 2012). In addition, a high degree of orthologous relationship was observed between the A and B genomes of banana, which might be attributed to a high degree of homology between the two genomes (Davey et al. 2013) (Fig. 3b). Lescot et al. (2005, 2008) reported a high level of microsynteny with gene order preservation in the genic regions between the two genomes. Wang et al. (2019) also reported greater genomic collinearity and sequence similarity between the two genomes. However, the number of orthologous genes between the two banana genomes is lower, which could be due to less expansion and more contraction of the gene family in the B genome after divergence from the A genome (Liu et al. 2021a, b).

Fig. 3

a Synteny analysis of NBS genes between Musa A genome, Musa B genome, A. thaliana, and O. sativa. The grey lines in the background indicate the collinear blocks within Musa A, B, A. thaliana, and Oryza sativa genomes, while the blue and red lines highlight the syntenic NBS gene pairs. b Synteny analysis of NBS genes between Musa A and B genomes. The grey lines in the background indicate the collinear blocks within Musa A, B genomes, while the blue lines highlight the syntenic NBS gene pairs

Cis-elements

Cis-elements of promoters are crucial for gene regulation, and the types of cis-elements indicate a gene’s potential function in response to pathogens (Rushton et al. 2002). Cis-elements were identified in the 2 kb promoter region of banana NBS genes (Supplementary data: 5a, 5b). Many cis-elements were identified from both genomes (Supplementary Fig. 3a, 3b, 3c, 3d), and these were similar to the cis-elements identified from NBS genes of Brachypodium, grapevine, and Chinese cabbage (Tan and Wu 2012; Goyal et al. 2020; Liu et al. 2021a, b). The NBS promoter regions contained cis-elements from various classes, including defense and stress-related, development-related, and hormone-related elements, particularly salicylic acid-responsive elements, ethylene-responsive elements, methyl jasmonate-responsive elements, and abscisic acid-responsive elements, indicating their extensive role in the resistance mechanism (Goyal et al. 2020) (Supplementary Fig. 4a, 4b).

Many of the NBS genes had defense and stress-related elements, including the wound- and pathogen-responsive elements WUN-motif and W box, the latter associated with WRKY transcription factors, and the stress-responsive GT-1 box, associated with GT-1-like transcription factors, resulting in induced expression upon pathogen infection (Dong et al. 2003; Goyal et al. 2020; Liu et al. 2021a, b) (Supplementary Fig. 5a, 5b, 5c, 5d). The binding of GT-1 factors to the promoter’s GT-1 elements reduced TMV infection and influenced the expression of genes induced by SA (Buchel et al. 1999). The W box is a distinctive disease-related element present in NBS genes as well as in the upstream promoter regions of NPR1, a positive regulator of inducible plant disease resistance, and PR1 (Yu et al. 2001; Rushton et al. 1996), and in most Arabidopsis and Brachypodium pathogen response genes (Li et al. 2004; Tan and Wu 2012). WRKY transcription factors bind specifically to W box elements of PR10 and also play a role in ABA and GA signaling (Eulgem et al. 2000; Zhang et al. 2004). Additionally, it was discovered that the NBS genes in both banana genomes have an excess of regulatory regions linked to stress-related transcription factors like MADS, C2H2, MYB, HD-ZIP, WRKY, bHLH, ERF, and bZIP (Tan and Wu 2012; Goyal et al. 2020; Liu et al. 2021a, b). Overall, the various types and number of regulatory elements present in the promoter region of NBS genes indicate that these genes may be involved in the defense mechanism.

Ka/Ks analysis and estimated divergence time for the duplicated NBS genes

Positive selection drives host–pathogen co-evolution and selection for new resistance genes. The driving force behind gene duplication was examined using the non-synonymous (Ka) and synonymous (Ks) nucleotide substitution rates of the duplicated genes, and the Ka/Ks ratio was used to identify and quantify the direction and strength of selection (Habachi-Houimli et al. 2018). A total of 119 (53 in regular, 66 in non-regular) and 20 (17 in regular, 3 in non-regular) putative paralogous gene pairs were identified from the A and B genomes, respectively (Supplementary data: 5a, 5b). The maximum number of paralogs was observed for non-regular genes in the A genome and for regular genes in the B genome.

The analysis of selection pressure among the duplicated NBS genes in both genomes revealed that the genes were under purifying (negative) selection, as Ka/Ks was less than one in each duplicated gene pair; this could be due to the highly conserved NBS domain, whose strictly ordered motifs are involved in signaling (Mace et al. 2014). Moreover, many of the paralogous gene pairs under negative selection belong to the N, CN, and XN categories, which lack the LRR region. The LRR region is involved in pathogen-ligand recognition, is highly variable because it typically evolves new binding specificities, and is subject to positive selection (Yoshimura et al. 1998; Mace et al. 2014; Yang and Wang 2016). Our findings are consistent with those of Andersen et al. (2016), Habachi-Houimli et al. (2018), and Li et al. (2016).
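The interpretation of the Ka/Ks ratio described above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in this study: the function name `classify_selection`, the tolerance used to call a ratio "neutral", and the example Ka and Ks values are all hypothetical.

```python
def classify_selection(ka: float, ks: float, tol: float = 1e-9) -> str:
    """Classify the selection regime of a duplicated gene pair from Ka and Ks.

    Ka/Ks < 1 suggests purifying (negative) selection, Ka/Ks > 1 suggests
    positive selection, and Ka/Ks equal to 1 suggests neutral evolution.
    """
    if ks <= 0:
        # Ka/Ks is undefined when no synonymous substitutions are observed.
        return "undefined"
    ratio = ka / ks
    if abs(ratio - 1.0) <= tol:
        return "neutral"
    return "purifying" if ratio < 1.0 else "positive"


# Hypothetical (Ka, Ks) values for illustration only; not data from this study.
example_pairs = {
    "pair_1": (0.05, 0.20),
    "pair_2": (0.30, 0.15),
    "pair_3": (0.10, 0.00),
}

for name, (ka, ks) in example_pairs.items():
    print(name, classify_selection(ka, ks))
```

In practice, Ka and Ks would come from a codon-aware alignment of each paralogous pair, and pairs with Ks of zero or with saturated Ks are usually set aside before interpretation.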

The average Ka/Ks ratio of NBS genes in banana is much higher than in bread wheat and barley; this higher value, together with the presence and absence of genes between the two genomes, indicates that NBS genes are evolving rapidly following natural ploidization and expansion under artificial selection (Gu et al. 2015; Habachi-Houimli et al. 2018). Duplication events between gene pairs may have occurred under negative selection, implying an expansion through tandem duplication. This suggests that the gene family is conserved and may not be quickly overcome by virulence evolution, and that the natural diversity available in NBS genes is likely an important source of durable resistance (Santamaria et al. 2001; Cannon et al. 2004; Li et al. 2010b). However, evolutionary pressure may cause structural and functional variation within paralogs (Lan et al. 2009).

The estimated time of the duplication events of paralogous NBS gene pairs was calculated based on the divergence rate of 4.5 × 10⁻⁹ synonymous mutations per synonymous site per year proposed for banana. Paralog duplication events may have occurred between 1.10 and 20.91 million years ago (MYA) in regular NBS genes and between 0.28 and 19.43 MYA in non-regular NBS genes in the A genome, and between 6.13 and 21.09 MYA in regular NBS genes and between 8.43 and 21.10 MYA in non-regular NBS genes in the B genome. Many of the duplication events were found to have occurred recently in both banana genomes, which may have contributed to the evolution of various NBS gene functions.
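The dating step can be illustrated with a short Python sketch. The text gives only the substitution rate; the relation T = Ks / (2λ) used here is the formula commonly applied for dating duplications and is assumed rather than taken from this section. The function name `divergence_time_mya` and the example Ks value are hypothetical, and the result is expressed in millions of years.

```python
# Synonymous substitutions per synonymous site per year (rate given in the text).
RATE = 4.5e-9


def divergence_time_mya(ks: float, rate: float = RATE) -> float:
    """Estimate the time since duplication, in millions of years.

    Uses T = Ks / (2 * rate), which is an assumption of this sketch: the
    factor of two accounts for substitutions accumulating on both copies.
    """
    if ks < 0 or rate <= 0:
        raise ValueError("ks must be non-negative and rate must be positive")
    years = ks / (2.0 * rate)
    return years / 1e6


# Hypothetical Ks value for illustration only; not a value from this study.
example_ks = 0.09
print(round(divergence_time_mya(example_ks), 2))
```

Under this relation, a smaller Ks corresponds to a more recent duplication, which is why pairs with low Ks are read as recent events.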

Differential expression of NBS genes under various stress conditions

To investigate the potential biological functions of NBS genes, we examined the expression patterns of various NBS genes in bananas using challenged and unchallenged transcriptome data from resistant and susceptible cultivars under various biotic and abiotic stress conditions (Fig. 4).

Fig. 4

a, b A graphical representation of the expression of NBS genes under biotic and abiotic stresses in Musa cultivars [a Eumusae leaf spot, nematode, and drought, b Foc race 1 and TR4]. The heat map was drawn using log2-transformed expression values. Green to red represents low to high expression levels. Based on their expression, the NBS genes were hierarchically clustered and divided into various gene clusters in the figure
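The log2 transformation mentioned in the caption can be sketched as follows. This is only an illustration under stated assumptions: the pseudocount of 1 (added so that zero expression values remain defined), the function name `log2_transform`, and the expression values are hypothetical and are not taken from this study.

```python
import math


def log2_transform(values, pseudocount=1.0):
    """Return log2(value + pseudocount) for each expression value.

    The pseudocount is an assumption of this sketch; it keeps zero
    expression values from producing an undefined logarithm.
    """
    if pseudocount <= 0:
        raise ValueError("pseudocount must be positive")
    return [math.log2(v + pseudocount) for v in values]


# Hypothetical expression values per gene across three samples.
expr = {
    "gene_a": [0.0, 3.0, 15.0],
    "gene_b": [1.0, 7.0, 31.0],
}

log_expr = {gene: log2_transform(vals) for gene, vals in expr.items()}
for gene, vals in log_expr.items():
    print(gene, vals)
```

The transformed matrix is what would then be passed to a hierarchical clustering routine to group genes with similar expression profiles.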

The results indicated that NBS genes such as Macma4_03_g09360.1 (Ma03_g09130), Macma4_05_g04110 (Ma05_g04000), Macma4_06_g00260.1 (Ma06_g00230), Macma4_06_g00610.1 (Ma06_g00230), Macma4_10_g11850.1 (Ma10_g08140), Macma4_07_g23010.1 (Ma07_g21730), Macma4_09_g02770.1 (Ma09_g02710), and Macma4_04_g35000.1 (Ma04_g32720) were significantly more up-regulated in the resistant cultivar (cv. Rose, AA) than in the susceptible cultivar (Matti, AA) upon Foc TR4 infection. NBS genes such as Macma4_03_g10980.1 (Ma03_g10480), Macma4_06_g25530.1 (Ma06_g23910), Macma4_10_g11840.1 (Ma10_g08250), and Macma4_03_g09360.1 (Ma03_g09130) showed significantly greater up-regulation in the resistant cultivar (cv. Rose, AA) than in the susceptible cultivar (Namarai, AA) upon Foc race1 infection. Although Macma4_03_g09360.1 (Ma03_g09130) showed higher expression in resistant cultivars upon infection with both Foc races, significant up-regulation was observed upon Foc TR4 rather than race1. Different NBS genes responded to Foc race1 and TR4 infection, implying the involvement of race-specific pathogen effectors; furthermore, NBS genes confer race-specific resistance, which supports the gene-for-gene hypothesis (Jones and Dangl 2006; Chen et al. 2018). Peraza-Echeverria et al. (2008) identified the NBS-type resistance gene candidates RGC1-5 (Macma4_08_g32130.1, Macma4_03_g09360.1, Macma4_04_g37480.1, Macma4_03_g10980.1, Macma4_06_g38310.1) from the Foc TR4-resistant wild banana M. acuminata ssp. malaccensis. Among these five genes, RGC2 (Macma4_03_g09360.1) and RGC5 (Macma4_06_g38310.1) were compared with I2, and the expression of RGC2 correlated with resistance to Foc STR4 (Peraza-Echeverria et al. 2009). Sutanto et al. (2014) isolated RGAs from three fusarium-resistant banana cultivars and identified MNBS17, which shared 50.5% identity with RGC2 (ABY75802), an NBS gene associated with Foc TR4 resistance in bananas. These findings show that Macma4_03_g09360.1 is up-regulated upon Foc infection, which corroborates our results. Dale et al. (2017) also reported that, in transgenic Cavendish lines transformed with RGC2, its expression strongly correlated with resistance to Foc TR4. Miller et al. (2008) reported significant up-regulation of Macma4_04_g35000.1 (Ma04_g32720) upon TR4 infection. Macma4_03_g10980.1 was found to be significantly up-regulated in resistant cultivars upon Foc race1 infection (Peraza-Echeverria et al. 2008). Chang et al. (2020) found that the genes Ma09_g12410, Ma07_g22920, and Ma09_g08180, Ma07_g22920 were significantly up-regulated in resistant cultivars after Foc race1 and TR4 infection; our results showed similar trends, but they were not significant. This could be due to the different genetic backgrounds of the hosts used in the two studies (Wang et al. 2019). Furthermore, the number of NBS genes responsive to TR4 was higher than the number responsive to race1, which could be due to race-specific recognition of different races of a pathogen (Chang et al. 2020).

Upon P. eumusae infection, the Macma4_10_g11850.1 (Ma10_t08140.1) gene showed significant up-regulation. Other genes, such as Macma4_06_g21640.1 (Ma06_g21200.1) and Macma4_03_g24860.1 (Ma03_g23360.1), also showed significant differential expression in the resistant cultivar (Manoranjitham, AAA) as compared to the susceptible cultivar (Grand Nain, AAA). According to Timm et al. (2016), RGA1 (Ma06_g21200.1) was up-regulated in the resistant cultivar (Calcutta-4) rather than in the susceptible cultivar (Grand Nain) against P. fijiensis. Passos et al. (2013) also reported that NBS genes are the most abundant class of R genes expressed upon P. musicola infection. Emediato et al. (2013) compared the transcriptional activity of NBS genes in Calcutta-4 (AA, resistant) and Cavendish Grand Nain (AAA, susceptible) cultivars after P. musicola infection and found that some RGAs displayed constitutively higher or lower expression in resistant cultivars at an early stage, whereas others were expressed across the infection time course.

Macma4_06_g25510.1 (Ma06_t23890.1) was significantly more up-regulated in the P. coffeae-resistant cultivar (YKM5, AAA) than in the susceptible cultivar (Nendran, AAB). Backiyarani et al. (2013) reported that root lesion nematode infection induced the RGA clusters C1 and C5 in the resistant cultivar (cv. Karthobiumtham) but not in the susceptible cultivar (cv. Nendran).

Some of the NBS genes were expressed more in sensitive cultivars than in tolerant cultivars, indicating that these genes play an active role in drought response. Chini et al. (2004) reported that drought tolerance is associated with increased expression of the CC-NBS-LRR genes. CC-NBS-LRR expression was found to be higher in hexaploid sweet potato and soybean during drought (Arisha et al. 2020; Kim et al. 2020). Many studies have reported that the B genome contributes to resistance to both biotic and abiotic stresses (Davey et al. 2013; Hu et al. 2015). Generally, cultivars with AAB or ABB genomes are more drought tolerant and hardy due to the presence of the B genome, and this could be one of the reasons for the drought tolerance of Saba (ABB) (Davey et al. 2009; Liu et al. 2010; Vanhove et al. 2012; Ravi et al. 2013; Muthusamy et al. 2016). Furthermore, although the B genome contains fewer NBS genes, its resistance may be due to B genome-specific miRNAs regulating these R genes. Even though a similar number of miRNA families were reported in both banana genomes, additional miRNAs unique to the B genome were discovered, and their predicted targets may indicate novel stress-related pathways that evolved separately in M. balbisiana (Davey et al. 2013). Hence, there is a need to investigate the NBS genes and miRNAs expressed in B genome cultivars to gain a better understanding of resistance mechanisms to various biotic and abiotic stresses.

Interaction of NBS genes with miRNAs

A total of 247 miRNAs from M. acuminata were retrieved and used to identify miRNA targets in the NBS genes of both genomes; 90 and 77 miRNAs were found to have targets in 104 and 34 NBS genes in the A and B genomes, respectively. The ratio of miRNAs to NBS target genes is higher for the B genome, and it has been suggested that several miRNAs from the B genome are involved in tolerance or response to biotic and abiotic stresses (Davey et al. 2013). Eight and four NBS genes that were significantly up-regulated in resistant cultivars against Foc TR4 and race1, respectively, have target sites for miRNAs. Macma4_03_g09360.1, which was up-regulated in resistant cultivars against both Foc races, had target sites for three miRNAs (Supplementary data: 6a, 6b). Similarly, all the significantly up-regulated NBS genes against leaf spot disease and nematodes had miRNA target sites, indicating that these miRNAs may regulate gene expression in susceptible cultivars. By triggering the production of phased, trans-acting siRNAs (phasiRNAs) that target NBS domains, miRNAs serve as master regulators that modify the arms race between hosts and pathogens (Park and Shin 2015; Koroban et al. 2016; Yang et al. 2021). Yang et al. (2015a, b) reported that miR482 modulates resistance in potato during Verticillium dahliae infection by suppressing the expression of NBS-LRR genes. Resistance to V. dahliae in cotton and to Fusarium oxysporum in tomato was attributed to reduced expression of miR482 and increased expression of NBS genes (Zhu et al. 2013; Ouyang et al. 2014). Yang et al. (2021) reported that miR482 causes the decay of NBS-LRR mRNAs in potato, tomato, and tobacco, and that miR9863 acts against Mla transcripts in barley upon powdery mildew infection, in both cases by triggering the production of phasiRNAs. Hence, the low expression of NBS genes in susceptible banana cultivars under various stresses might be due to miRNAs; this needs to be validated by studying the expression of miRNAs and their targets in both resistant and susceptible cultivars upon infection with specific pathogens.
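The comparison of miRNAs to NBS targets between the two genomes can be made explicit with a small calculation. The text does not define how this proportion was computed, so the sketch below assumes one plausible reading: the number of miRNAs with predicted targets divided by the number of targeted NBS genes. The counts come from the paragraph above; the function name `mirna_per_target` is hypothetical.

```python
# Counts reported in the text: miRNAs with predicted targets and targeted NBS genes.
counts = {
    "A": {"mirnas": 90, "nbs_targets": 104},
    "B": {"mirnas": 77, "nbs_targets": 34},
}


def mirna_per_target(mirnas: int, nbs_targets: int) -> float:
    """Return the number of targeting miRNAs per targeted NBS gene.

    This definition is an assumption of the sketch, not a formula
    stated in the text.
    """
    if nbs_targets <= 0:
        raise ValueError("nbs_targets must be positive")
    return mirnas / nbs_targets


ratios = {genome: mirna_per_target(**c) for genome, c in counts.items()}
for genome, ratio in ratios.items():
    print(f"{genome} genome: {ratio:.2f} miRNAs per targeted NBS gene")
```

Under this reading the B genome has the higher value, which matches the statement in the text. By contrast, the share of the 247 retrieved miRNAs that have NBS targets is higher for the A genome (90 versus 77), so the per-target ratio appears to be the intended comparison.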

GO functional annotation of NBS genes

NBS genes from both genomes were annotated into different classes, namely biological process, molecular function, and cellular component, which provides insight into the proteins’ molecular and biological functions (Ashburner et al. 2000) (Supplementary data: 7a, 7b). Most of the genes were under biological processes (238, 205), followed by molecular functions (181, 200) and cellular components (67, 67) in the A and B genomes, respectively (Fig. 5a, b). NBS genes under biological processes are diverse and span a broad range of GO terms. In both genomes, the maximum number of genes under the biological processes category belonged to the GO terms response to stimuli, response to stress, and defense response, which is in line with their function in disease resistance. Under molecular function and cellular components, most of the genes were under binding and cell periphery, respectively, in both genomes. Similar outcomes have been documented for blueberry, Brassica napus, grapes, and durian (Die et al. 2018; Fu et al. 2019; Goyal et al. 2020; Cortaga et al. 2022). NBS genes under molecular function are involved in binding, as they are mostly cellular receptors engaged in signaling via kinase cascades (Cristina et al. 2010). NBS genes under cellular components convert external stimuli into intracellular responses for defense activation because they are primarily found in the cell membrane, cytoplasm, and nucleus and serve as recognition sites for PAMP/MAMP and effector proteins (Cortaga et al. 2022). In general, GO term analysis supports the NBS genes’ ability to recognize conserved binding sites and trigger defense responses.

Fig. 5

a, b Functional classification of NBS genes from the Musa A and B genomes on the basis of Gene Ontology (GO) terms assigned to the genes using the BLAST2GO tool

Interaction network of NBS proteins

The NBS protein interaction network was built using the interaction relationships of homologous NBS proteins from A. thaliana (Supplementary Fig. 6). The interaction network analysis revealed that NBS proteins from both genomes interacted with the following proteins: CAX7, solute carrier family 24 (sodium/potassium/calcium exchanger), member 6; CYTC-1, cytochrome C-1, electron carrier protein; CYTC-2, cytochrome C-2, electron carrier protein; DAR5, DA1-related protein 5; AT5G45510, probable disease resistance protein At5g45510; At5g46520, disease resistance protein (TIR-NBS-LRR class) family; RLP36, disease resistance family protein/LRR family protein; ASA2, anthranilate synthase alpha subunit 2, chloroplastic; At5G53850, probable bifunctional methylthioribulose-1-phosphate dehydratase/enolase-phosphatase E1; and AT2G34930, disease resistance family protein/LRR family protein. Santaella et al. (2004) found that NBS and solute carrier family genes were up-regulated in Manihot esculenta during infection with Xanthomonas axonopodis pv. manihotis. According to Wang et al. (2022), NBS genes from Lagenaria siceraria were found to interact with two electron carrier proteins (CYTC-1 and CYTC-2), transporter family proteins, NB-ARC domain-containing proteins (SNC1, ADR1-L2, ADR1, ZAR1, TIR, RPM1), anthranilate synthase (ASA2), and receptor-like protein (RLP36).

Conclusion

A comprehensive analysis of NBS genes is a valuable resource for many questions about immune system evolution, such as natural variation in innate immunity and hybrid failure. This study provides a thorough understanding of the NBS gene family in bananas, including its classification, genome organisation, phylogenetic relationships, gene structure, motifs, evolution, gene expression patterns, and regulation by miRNAs under biotic and abiotic stresses. Even though NBS genes are present in susceptible cultivars, the higher expression of NBS genes in resistant cultivars indicates that the NBS gene family contributes to resistance to various biotic and abiotic stresses. More research on NBS genes, as well as on the expression of miRNAs and their targets under different stresses, is needed to functionally validate their biological significance and molecular mechanisms before these genes can be used as defence sentinels.