ASLncR: a novel computational tool for prediction of abiotic stress-responsive long non-coding RNAs in plants

Pradhan, Upendra Kumar; Meher, Prabina Kumar; Naha, Sanchita; Rao, Atmakuri Ramakrishna; Gupta, Ajit

doi:10.1007/s10142-023-01040-0


Methodology
Published: 31 March 2023

Volume 23, article number 113, (2023)

Functional & Integrative Genomics


Upendra Kumar Pradhan¹,
Prabina Kumar Meher ORCID: orcid.org/0000-0002-7098-8785¹,
Sanchita Naha²,
Atmakuri Ramakrishna Rao³ &
Ajit Gupta¹


Abstract

Abiotic stresses are detrimental to plant growth and development and have a major negative impact on crop yields. A growing body of evidence indicates that a large number of long non-coding RNAs (lncRNAs) are key to many abiotic stress responses. Thus, identifying abiotic stress-responsive lncRNAs is essential in crop breeding programs that aim to develop cultivars resistant to abiotic stresses. In this study, we have developed the first machine learning-based computational model for predicting abiotic stress-responsive lncRNAs. The lncRNA sequences that were responsive and non-responsive to abiotic stresses served as the two classes of the dataset for binary classification. The training dataset was created using 263 stress-responsive and 263 non-stress-responsive sequences, whereas the independent test set consisted of 101 sequences from each class. As machine learning models accept only numeric data, Kmer features of sizes 1 to 6 were used to represent lncRNAs in numeric form. To select important features, four different feature selection strategies were utilized. Among the seven learning algorithms, the support vector machine (SVM) achieved the highest cross-validation accuracy with the selected feature sets. The observed 5-fold cross-validation accuracy, AU-ROC, and AU-PRC were 68.84, 72.78, and 75.86%, respectively. Furthermore, the robustness of the developed model (SVM with the selected features) was evaluated using an independent test dataset, where the overall accuracy, AU-ROC, and AU-PRC were 76.23, 87.71, and 88.49%, respectively. The developed computational approach was also implemented in an online prediction tool, ASLncR, accessible at https://iasri-sg.icar.gov.in/aslncr/. The proposed computational model and the developed prediction tool are expected to supplement existing efforts to identify abiotic stress-responsive lncRNAs in plants.


Introduction

Global food demand is rising with the growing world population and is expected to double by 2050 (Tilman et al. 2011). Abiotic stresses, on the other hand, present a substantial challenge to agriculture and the ecosystem due to changing climatic conditions, resulting in significant crop yield loss (Saeed et al., 2023; Wani et al., 2016). In order to adapt to challenging environmental conditions, plants modify the expression of several genes at the transcriptional, post-transcriptional, and epigenetic levels in response to different abiotic stresses (Liu et al. 2022a; Choudhury et al. 2021; Zhu et al., 2022). The functional elucidation of many genes at the transcriptional, post-transcriptional, post-translational, and epigenetic levels has been significantly improved by advances in genome sequencing technology, especially next-generation sequencing (NGS) (Li et al. 2018). NGS technologies have led to the identification of novel non-coding RNAs (ncRNAs) (Öztürk Gökçe et al., 2021; Bhogireddy et al., 2021; Yu et al. 2019) and their roles in the regulation of multiple biological processes, including plant response to various abiotic stresses (Yang et al. 2023; Yu et al. 2019).

The long non-coding RNAs (lncRNAs) are a group of ncRNAs that are more than 200 bp long and are not translated into proteins (Quan et al. 2015). lncRNAs act as gene regulatory factors in three ways: transcriptional, post-transcriptional, and epigenetic regulation of gene expression (Quan et al. 2015). The lncRNAs are reported to be important modulators of various biological processes (Mercer et al., 2009). Their involvement in controlling transcription through enhancers and providing regulatory binding sites has been well documented (Wang and Chekanova, 2017). They are also reported to act as miRNA sponges, suppressing miRNA function by diverting miRNAs from their potential targets (Wang et al. 2010). The lncRNAs are also found in the nucleus, where they serve as major components of nuclear speckles (Hutchinson et al. 2007). In the cytoplasm, lncRNAs interact with a variety of RNA-binding proteins (RBPs) to monitor and control their regulatory dynamics (Glisovic et al. 2008).

Plant lncRNAs make up around 80% of all ncRNAs and are involved in a wide range of biological processes, including abiotic stress response (Wang et al. 2021). The first lncRNA reported in plants was ENOD40 in soybean (Yang et al. 1993). Although plant genomes are more complicated than animal genomes, the number of experimentally identified lncRNAs in plants is much lower than that reported for animals. Several lncRNAs that respond to abiotic stresses have been reported in a wide range of plant species. Table 1 contains a list of recently identified lncRNAs reported to be involved in various abiotic stresses. The discovery of abiotic stress-responsive lncRNAs and their target genes in a range of plant species has improved our understanding of the molecular mechanisms underlying these stress adaptations. For example, under drought conditions in Arabidopsis thaliana, the lncRNA lincRNA340 is induced to repress miR169, relieving nuclear factor Y (NF-Y) gene expression to improve stress tolerance (Qin et al. 2017). Further, lncRNA973 functions as a positive regulator of salt-responsive genes in ROS (reactive oxygen species), enhancing salinity tolerance in cotton (Zhang et al., 2019). Similarly, GhDNA1, which targets AAAG DNA double strands to regulate drought-responsive genes in trans, was discovered to be associated with drought tolerance in cotton (Tao et al., 2021). These findings support the idea that lncRNAs can be induced or suppressed in response to abiotic stress. Furthermore, these abiotic stress-responsive lncRNAs have been linked to phytohormone signal transduction, secondary metabolite biosynthesis, and sucrose metabolism pathways, each of which has been reported to be engaged in plant abiotic stress response (Ding et al. 2019; Yang et al., 2022; Lamin-Samu et al., 2022).

Table 1 Representative lncRNAs found to be involved in plants responding to different abiotic stresses


The studies cited above indicate that lncRNAs may be exploited as genetic targets to develop crop cultivars that are resistant to abiotic stresses. However, the lncRNAs must first be identified before they can be used as genetic targets. To date, techniques such as serial analysis of gene expression (SAGE), expressed sequence tags (EST), whole-genome tiling arrays, lncRNA microarrays, RNA capture sequencing (RNA CaptureSeq), and RNA-sequencing (RNA-seq) have all been employed to identify abiotic stress-related lncRNAs. However, wet-lab experiments consume a lot of resources (Lee and Kikyo 2012). Furthermore, the advanced sequencing techniques are species-specific. Thus, there is a need to develop a computational method for predicting abiotic stress-responsive lncRNAs using lncRNA sequence data. In other words, machine learning-based computational methods may be a better alternative for predicting lncRNAs associated with abiotic stress. Considering the above facts, the present study is devoted to developing the first machine learning-based computational model for predicting abiotic stress-responsive lncRNAs using sequence-derived features. The proposed approach is expected to supplement wet-lab methods and other sequencing techniques for identifying abiotic stress-responsive lncRNAs in plants.

Materials and methods

Collection of abiotic stress-responsive lncRNA sequence data

The PncStress database (Wu et al., 2020) is the most recent source for abiotic stress-responsive lncRNAs. It contains experimentally validated ncRNA sequences linked to a variety of abiotic and biotic stresses. With 114 species responding to 48 abiotic and 91 biotic stresses, PncStress now has 4227 entries, including 2523 miRNAs, 444 lncRNAs, and 52 circRNAs validated by different experimental methods. The PncStress database (Wu et al., 2020) was accessed on July 30, 2022, in order to retrieve lncRNA sequences relevant to abiotic stresses. A total of 444 abiotic stress-responsive lncRNA sequences, representing 27 different abiotic stress categories, were obtained from 24 plant species.

Construction of positive and negative dataset

The abiotic stress-responsive lncRNA sequences obtained from the PncStress database were used to construct the positive set. On the other hand, 238,226 lncRNA sequences retrieved from the PLncDB V2.0 database (accessed on August 05, 2022) (Jin et al., 2021) were used to construct the negative set. To prevent homology bias in the prediction accuracy, redundancy was reduced at 50% sequence identity in both the positive and negative datasets using the CD-HIT method (Huang et al., 2010). After the redundant sequences were removed, the positive and negative sets contained 364 and 97,654 lncRNA sequences, respectively. To avoid prediction bias toward the non-abiotic stress class, which has a larger number of sequences, a balanced dataset with an equal number of abiotic stress-responsive and non-abiotic stress-responsive lncRNA sequences was used. In other words, 364 non-abiotic stress sequences were chosen at random from the pool of 97,654 sequences so that both classes had the same number of sequences. Out of the 364 sequences in each class, 101 lncRNA sequences were set aside to prepare the independent dataset. The remaining 263 stress-responsive lncRNAs and 263 non-stress-responsive lncRNAs were used as the positive and negative sets of the training dataset.
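The balancing and hold-out steps described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `build_balanced_split`, the random seed, and the placeholder identifiers are assumptions, and the redundancy reduction with CD-HIT is taken as already done.

```python
import random

def build_balanced_split(positive_ids, negative_pool, n_test=101, seed=0):
    """Undersample negatives to match positives, then hold out n_test per class."""
    rng = random.Random(seed)
    # Random undersampling: draw as many negatives as there are positives.
    negatives = rng.sample(negative_pool, len(positive_ids))
    positives = list(positive_ids)
    rng.shuffle(positives)
    # The first n_test sequences of each class form the independent test set.
    return {
        "test_pos": positives[:n_test],
        "test_neg": negatives[:n_test],
        "train_pos": positives[n_test:],
        "train_neg": negatives[n_test:],
    }

# Placeholder identifiers (hypothetical) standing in for the 364 positive
# and 97,654 negative sequences left after redundancy reduction.
pos = [f"pos_{i}" for i in range(364)]
neg = [f"neg_{i}" for i in range(97654)]
split = build_balanced_split(pos, neg)
print({name: len(ids) for name, ids in split.items()})
```

With these sizes, each class contributes 101 test and 263 training sequences, matching the counts stated in the text.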

Numeric feature generation

In this study, we generated Kmer features to transform each lncRNA sequence into a numeric feature vector. The Kmer features are the occurrence frequencies of k neighboring nucleotides (Lee et al. 2011), and they have been successfully used in several computational studies, including lncRNA prediction (Sun et al. 2013). The numeric value for a Kmer of size k can be calculated as

$$f_k(t)=\frac{N_k(t)}{N-k+1},$$

(1)

where $N_k(t)$ is the number of occurrences of Kmer type $t$ of size $k$, and $N$ is the length of the nucleotide sequence. For example, for an RNA sequence ‘CUGACUGACUGACUGUA’, ${f}_1(C)=\frac{4}{17}$, ${f}_2(CU)=\frac{4}{16}$, ${f}_3(CUG)=\frac{4}{15},$ ${f}_4(CUGA)=\frac{3}{14}$, ${f}_5(CUGAC)=\frac{3}{13}$, and ${f}_6(CUGACU)=\frac{3}{12}$. A brief representation of the Kmer feature is shown in Fig. 1. The number of Kmer features of size $k$ is $4^k$. In this study, we have considered Kmer sizes 1 to 6 to generate the features for each sequence. Thus, for Kmer sizes 1, 2, 3, 4, 5, and 6, the number of features generated was 4, 16, 64, 256, 1024, and 4096, respectively. The Kmer sizes 1 to 6 were denoted as K₁, K₂, K₃, K₄, K₅, and K₆. In total, 5460 features were generated for each lncRNA sequence.
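The worked example for Eq. (1) can be checked with a short sketch. The function name `kmer_frequency` is an assumption introduced for illustration; overlapping windows are counted, which is what the denominator N − k + 1 implies.

```python
def kmer_frequency(sequence, kmer):
    """Frequency of `kmer` among all overlapping windows of its size (Eq. 1)."""
    k = len(kmer)
    windows = len(sequence) - k + 1  # N - k + 1 overlapping windows
    if windows <= 0:
        return 0.0
    count = sum(1 for i in range(windows) if sequence[i:i + k] == kmer)
    return count / windows

seq = "CUGACUGACUGACUGUA"  # example sequence from the text, N = 17
for t in ["C", "CU", "CUG", "CUGA", "CUGAC", "CUGACU"]:
    print(t, kmer_frequency(seq, t))
```

The printed values correspond to 4/17, 4/16, 4/15, 3/14, 3/13, and 3/12, in that order.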

Prediction algorithms

Machine learning techniques have been applied effectively for prediction purposes in several fields of bioinformatics (Guo et al. 2017; Pradhan et al. 2022; Abbas and El-Manzalawy 2020; Pradhan et al. 2021). The seven machine learning techniques used in this study were the support vector machine (SVM; Vapnik 1963), extreme gradient boosting (XGB; Chen and Guestrin 2016), random forest (RF; Breiman 2001), light-gradient boosting machine (LGBM; Ke et al. 2017), bagging (BAG; Breiman 1996), adaptive boosting (ADB; Freund and Schapire 1999), and gradient boosting decision trees (GBDT; Friedman 2001). Table 2 lists the R-packages used to implement the learning models and the parameter settings for each learning model.

Table 2 Software used and parameter setting for different machine learning models used for prediction of abiotic stress-responsive lncRNAs


Feature selection approach

By eliminating redundant and irrelevant features, feature selection reduces the computational burden while increasing classification accuracy (Pradhan et al., 2022). The support vector machine recursive feature elimination (SVM-RFE; Guyon et al., 2002), random forest variable importance measure (RF-VIM; Díaz-Uriarte and Alvarez de Andrés, 2006), XGB variable importance measure (XGB-VIM; Sandri and Zuccolotto, 2008), and LGBM variable importance measure (LGB-VIM; Ke et al., 2017) were used to select important and relevant features. Following past studies (Guyon et al., 2002; Pradhan et al., 2022), the top-ranked features that led to the classifier with the best classification accuracy were chosen. The sigFeature R-package was used to implement the SVM-RFE technique (Das et al., 2020). The R-packages randomForest (Liaw and Wiener 2002), xgboost (Chen et al., 2021b), and lightgbm (Shi et al. 2022) were used to implement the RF-VIM, XGB-VIM, and LGB-VIM methods, respectively.

Cross-validation and performance metrics

A five-fold cross-validation approach was used to assess the performance of the prediction models. Both the positive and negative datasets were randomly separated into five subsets of equal size to perform the five-fold cross-validation (Jiang and Wang, 2017). In each fold of the cross-validation, one subset from each class served as the test set, while the remaining four subsets from both classes were pooled to serve as the training set. With distinct training and test sets for each fold, the experiment was carried out five times, and the accuracy over the five folds was recorded. The different steps involved in developing the proposed approach are shown in Fig. 2. The following metrics were used to evaluate the performance of the prediction models: sensitivity, specificity, accuracy, precision, area under the receiver operating characteristic curve (AU-ROC; Fawcett, 2006), and area under the precision-recall curve (AU-PRC; Boyd et al., 2013). In the following formulae, TP and FP respectively represent the numbers of samples correctly and wrongly predicted as positive, whereas TN and FN respectively represent the numbers of samples correctly and wrongly predicted as negative. P = TP + FN and N = TN + FP denote the total numbers of positive and negative samples.

$$\textrm{Sensitivity}=\frac{TP}{TP+ FN}$$

(2)

$$\textrm{Specificity}=\frac{TN}{TN+ FP}$$

(3)

$$\textrm{Accuracy}=\frac{1}{2}\left(\frac{TP}{TP+ FN}+\frac{TN}{TN+ FP}\right)$$

(4)

$$\textrm{Precision}=\frac{TP}{TP+ FP}$$

(5)

$$AU- ROC={\int}_0^1\frac{TP}{P}d\left(\frac{FP}{N}\right)$$

(6)

$$AU- PRC={\int}_0^1\frac{TP}{TP+ FP}d\left(\frac{TP}{P}\right)$$

(7)
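Formulae (2) to (6) can be sketched in a few lines. The counts and scores below are hypothetical values for a 101-versus-101 test set (the source reports percentages only), and the function names are assumptions. The AU-ROC is computed in its equivalent pairwise-ranking form: the fraction of positive-negative pairs in which the positive sample receives the higher score, with ties counted as one half.

```python
def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, accuracy and precision as in Eqs. (2)-(5)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Eq. (4) averages the two class-wise rates.
    accuracy = 0.5 * (sensitivity + specificity)
    precision = tp / (tp + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "precision": precision,
    }

def au_roc(pos_scores, neg_scores):
    """AU-ROC as the fraction of correctly ordered positive-negative pairs."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical confusion-matrix counts, not taken from the source.
metrics = classification_metrics(tp=92, fn=9, tn=62, fp=39)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")

# Hypothetical prediction scores for three positive and three negative samples.
print(au_roc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2]))
```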

Results

Performance analysis of MLAs with independent Kmer feature set

The performance of each machine learning method was evaluated independently with each Kmer feature set (K₁ to K₆). The highest sensitivity of 69.05% was achieved with LGBM for K₄, followed by BAG (67.59%) with K₂ (Fig. 3). In comparison to the other combinations of Kmer size and learning algorithm, BAG achieved the highest specificity (72.68%) with K₄. The BAG algorithm also achieved the highest precision of 68.10% for K₄ (Fig. 3). As far as overall accuracy is concerned, RF achieved the highest value of 61.79% with tri-nucleotide compositional features (K₃), followed by XGB (61.95%) and GBDT (61.21%) with dinucleotide (K₂) and tri-nucleotide (K₃) features, respectively (Fig. 3). With K₃, RF also achieved the highest AU-ROC (70.70%) and AU-PRC (70.69%). In comparison to the remaining learning algorithms, XGB with K₂ was found to produce higher AU-ROC (70.32%) and AU-PRC (69.51%) (Fig. 3). Because the features generated with large Kmer sizes are sparse, the accuracy obtained with K₅ and K₆ may be lower than with K₁, K₂, K₃, and K₄, as was observed in the present study.

Performance analysis of MLAs with combined Kmer feature set

In addition to evaluating the accuracy of each Kmer feature set separately, the performance of the machine learning algorithms was evaluated using combined Kmer feature sets: K₁₂ (K₁+K₂), K₁₂₃ (K₁+K₂+K₃), K₁₂₃₄ (K₁+K₂+K₃+K₄), K₁₂₃₄₅ (K₁+K₂+K₃+K₄+K₅), and K₁₂₃₄₅₆ (K₁+K₂+K₃+K₄+K₅+K₆). The highest sensitivity (79.98%) was achieved by SVM with K₁₂ features, whereas the BAG method achieved the highest sensitivity for the rest of the feature combinations (Fig. 4). The highest specificity (66.17%) and precision (62.82%) were achieved by GBDT with K₁₂₃, followed by RF (65.36%, 61.91%) with K₁₂ features. The highest accuracy was 62.16%, obtained by XGB with K₁₂ features, followed by GBDT (62.15%) with K₁₂₃ and RF (62.14%) with K₁₂ features (Fig. 4). Barring a few exceptions, the accuracies declined as further Kmer feature sets were added (Fig. 4). RF achieved the highest AU-ROC (69.4%) with K₁₂₃, followed by XGB (69.37%) with K₁₂ features (Fig. 4). The highest AU-ROC with K₁₂₃ features was lower than that obtained with RF for K₃ (70.70%). When RF was employed as the classifier, K₁₂ produced the highest AU-PRC (70.18%), which was also lower than the AU-PRC of RF achieved with K₃ (70.69%) (Fig. 4).
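A combined feature set such as K₁₂ or K₁₂₃ amounts to concatenating the per-size Kmer frequency vectors. The sketch below assumes an RNA alphabet (DNA input would need T mapped to U first) and a fixed lexicographic Kmer order; the function names are hypothetical and not taken from the authors' R code.

```python
from itertools import product

ALPHABET = "ACGU"  # assumed RNA alphabet

def kmer_vector(sequence, k):
    """Frequencies of all 4**k Kmers of size k, in a fixed lexicographic order."""
    windows = len(sequence) - k + 1
    counts = {"".join(p): 0 for p in product(ALPHABET, repeat=k)}
    for i in range(max(windows, 0)):
        window = sequence[i:i + k]
        if window in counts:  # windows with symbols outside ACGU are skipped
            counts[window] += 1
    return [c / windows if windows > 0 else 0.0 for c in counts.values()]

def combined_features(sequence, max_k):
    """Concatenate the Kmer vectors for sizes 1..max_k (e.g. max_k=2 gives K12)."""
    features = []
    for k in range(1, max_k + 1):
        features.extend(kmer_vector(sequence, k))
    return features

seq = "CUGACUGACUGACUGUA"
for max_k in (2, 3, 6):
    print(max_k, len(combined_features(seq, max_k)))
```

The vector lengths for K₁₂, K₁₂₃, and K₁₂₃₄₅₆ are 20, 84, and 5460, the last agreeing with the total number of features stated in the text.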

Performance analysis of MLAs with selected Kmer features

In order to improve prediction accuracy further, four different feature selection procedures (SVM-RFE, RF-VIM, XGB-VIM, and LGB-VIM) were employed to identify relevant and non-redundant features. The features were ranked in order of relevance, with the first being the most important and the last being the least important. The prediction accuracy of the learning algorithms was then evaluated in terms of AU-PRC by adding the 10 top features at a time (Fig. 5). The BAG method was observed to achieve the highest AU-PRC of 65.08% using the top 70 XGB-VIM features (Table 3). Similarly, BAG achieved the highest AU-PRC of 65.66% with the 590 top-selected features of LGB-VIM. SVM was found to achieve the maximum accuracy (72.66%) among the considered models with the 100 top features chosen by RF-VIM (Table 3). Furthermore, SVM was observed to achieve the highest AU-PRC of 76.16% using the top 530 SVM-RFE features (Table 3). The prediction accuracy of the learning algorithms improved when compared to the performance with all 5460 features. The SVM was found to be the best performer, followed by the RF, when the prediction was done using the selected features of SVM-RFE and RF-VIM (Fig. 5). The BAG method performed better than the other methods when the prediction was done using the selected features of XGB-VIM and LGB-VIM (Fig. 5).
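The incremental search over ranked features can be sketched generically. Here `score_fn` is a hypothetical callable standing in for a cross-validated AU-PRC of a chosen classifier; the toy scoring function exists only so the example runs and does not reflect the study's results.

```python
def best_feature_count(ranked_features, score_fn, step=10):
    """Score growing prefixes of a ranked feature list and keep the best one."""
    best_k, best_score = 0, float("-inf")
    for k in range(step, len(ranked_features) + 1, step):
        score = score_fn(ranked_features[:k])  # evaluate the top-k features
        if score > best_score:
            best_k, best_score = k, score
    return best_k, best_score

# Toy stand-in for a cross-validated AU-PRC: it peaks at 30 features.
def toy_score(selected):
    return 1.0 - abs(len(selected) - 30) / 100

ranked = [f"feature_{i}" for i in range(100)]  # hypothetical ranking
print(best_feature_count(ranked, toy_score))
```

In practice the ranking would come from a method such as SVM-RFE, and each call to `score_fn` would train and evaluate the classifier on the selected columns only.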

Table 3 Performance metrics of different machine learning methods using the selected features


Analysis of cross-validation and independent test set prediction

Since the SVM was found to achieve the highest accuracy with the 530 top-selected features of SVM-RFE, the same combination was employed for cross-validation performance analysis. In the cross-validation analysis, the sensitivity, specificity, overall accuracy, precision, AU-ROC, and AU-PRC were observed to be 73.03, 64.61, 68.84, 67.58, 73.98, and 75.54%, respectively (Table 4). The model trained with SVM using the 530 selected features was also employed to predict the independent test set (101 positive and 101 negative sequences). For the independent test set, the sensitivity, specificity, overall accuracy, precision, AU-ROC, and AU-PRC were found to be 91.08, 61.38, 76.23, 70.22, 87.71, and 88.49%, respectively (Table 4). The higher accuracy on the independent test set compared with the cross-validation accuracy may be attributed to a higher degree of sequence similarity with the training dataset.

Table 4 Performance metrics for the training and independent test datasets


Development of an online prediction tool

In order to predict the abiotic stress-responsive lncRNAs, we further developed an online prediction tool called ASLncR (https://iasri-sg.icar.gov.in/aslncr/). The front end of the server was designed using HTML, while its back end uses PHP to execute the developed in-house R-code. This server implemented the SVM model using the 530 chosen features. For prediction, the user has to either paste or upload the lncRNA sequences in FASTA format. The results are displayed in tabular format, where the probability of each lncRNA being associated with stress is provided.
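Input in FASTA format can be read with a few lines of code before features are computed. The parser below is only an illustrative sketch, not the server's implementation; the function name, the example records, and the conversion of T to U are assumptions.

```python
def parse_fasta(text):
    """Return {header: sequence} from FASTA-formatted text."""
    records, header, chunks = {}, None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records[header] = "".join(chunks)
            header, chunks = line[1:].strip(), []
        elif header is not None:
            # Assumption: normalise case and map DNA-style T to RNA-style U.
            chunks.append(line.upper().replace("T", "U"))
    if header is not None:
        records[header] = "".join(chunks)
    return records

# Hypothetical two-record input; sequences may span several lines.
example = """>lnc_demo_1
CUGACUGA
CUGACUGUA
>lnc_demo_2
acgtacgt
"""
records = parse_fasta(example)
print(records)
```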

Performance analysis of ASLncR with experimentally validated dataset

To further confirm the efficiency of the developed tool ASLncR, lncRNA sequences for various abiotic stresses were manually collected from published literature (Jha et al. 2020; Urquiaga et al. 2020; Patra et al. 2023). For 9 different plant species, a total of 190 sequences were collected for the abiotic stresses cold, heat, light, salt, drought, flood, and others. We were left with 138 sequences for the evaluation using our model after eliminating the sequences that were present in the positive set of training and independent test dataset. The abiotic stress responsiveness of the sequences was predicted using the ASLncR server, and it was discovered that 81.88% (113 out of 138) of the sequences were correctly identified.

Discussion

Abiotic stresses brought about by climate change pose a serious challenge to crop production and productivity. Therefore, it is necessary to develop abiotic stress-tolerant crop cultivars to meet the food security demand. In the last decade, a considerable amount of research has focused on understanding the different regulatory roles of lncRNAs in plant response to abiotic stresses and their indispensable roles in environmental adaptation (Chen et al., 2023; Yang et al., 2022; Liu et al., 2022b; Zhang et al., 2022; Tian et al., 2023; Ye et al., 2022; Chen et al., 2022). To put it another way, lncRNAs are multifaceted regulatory components that are essential for controlling cellular stress in response to various abiotic stimuli. For instance, Eom et al. (2019) revealed that lncRNAs are co-expressed with mRNAs in tomato in response to drought stress. Network analysis of the interactions between lncRNAs and miRNAs in Brassica juncea revealed a target for regulating drought tolerance (Bhatia et al., 2020). In order to understand how plants respond to various environmental stresses, it is crucial to identify abiotic stress-responsive lncRNAs. However, due to intricate genomic architecture, wet-lab experiments for lncRNA identification are costly and time-consuming. Thus, we developed a machine learning-based computational model for predicting abiotic stress-responsive lncRNAs based on sequence-derived features.

Though several tools are available for plant lncRNA prediction, no tool is available for predicting abiotic stress-responsive lncRNAs. It has been shown that lncRNAs with related functions share comparable Kmer profiles (Kirk et al., 2018). Additionally, Kmer features have been successfully utilized to establish relationships between sequence and function among lncRNAs (Kirk et al. 2018; Kirk et al. 2021). In order to capture the abundance of short motifs in an lncRNA, Kmer features were used in the present study to encode lncRNAs into numeric feature vectors. The Kmer features have also been successfully applied in other areas of bioinformatics such as sequence assembly (Li et al. 2010), metagenomics (Dubinkina et al. 2016), DNA barcoding (Meher et al. 2016), and lncRNA prediction (Sun et al. 2013). We considered Kmer sizes 1 to 6, and the accuracy obtained with individual Kmer feature sets was found to be higher than the accuracy obtained by combining all 5460 Kmer features. Shorter Kmers are more common, and their relative frequencies are more strongly cross-correlated than those of longer Kmers (Klapproth et al. 2021), which could be a reason for the low accuracy with larger Kmer sizes.

The prediction accuracy was low when all 5460 features were utilized. Thus, in order to improve prediction accuracy, important and non-redundant features were selected by employing feature selection methods. Four distinct feature selection strategies, namely SVM-RFE, RF-VIM, XGB-VIM, and LGB-VIM, were adopted. BAG achieved the highest accuracy with the 70 and 590 features selected using the XGB-VIM and LGB-VIM methods, respectively. Similarly, SVM achieved the highest accuracy with the 100 and 530 features selected using the RF-VIM and SVM-RFE methods, respectively. Compared to the other three approaches, the features ranked by SVM-RFE gave greater accuracy. Furthermore, prediction with the selected features improved the accuracy of the learning algorithms relative to using all 5460 features. When using the 530 top-ranked features of SVM-RFE, SVM had the highest accuracy among the learning algorithms, despite being the least effective when the prediction was done with individual or combined Kmer features.

The robustness of the proposed approach was also assessed using an independent dataset. The higher accuracy with the independent dataset as compared to the cross-validation accuracy may be attributed to a higher degree of sequence similarity between the training and independent test datasets. For easy implementation of our computational approach to predict abiotic stress-responsive lncRNAs, we have established an online prediction tool, ASLncR. Furthermore, to check the effectiveness of ASLncR, 138 experimentally confirmed abiotic stress-related lncRNAs were revalidated. The accuracy obtained from the cross-validation, the independent test set validation, and the revalidation of ASLncR supports the applicability of the proposed model for predicting abiotic stress-responsive lncRNAs in plants.

Conclusion

Growing evidence from various plant species indicates that lncRNAs play critical roles in abiotic stress responses. Compared with lncRNA research in humans, the application of lncRNAs in plant breeding is still in its initial phase. Although lncRNAs mediate plant regulation in response to abiotic stresses in many species, their potential as valuable genomic resources in plant molecular breeding or as indicators has yet to be confirmed. Studies of lncRNAs in a wider range of plant species will aid in understanding the evolution and diversity of their roles in environmental adaptation. Due to the dearth of wet-lab as well as computational approaches, potential applications of lncRNAs in plant abiotic stress are currently lacking. The present work provides one of the first computational methods, ASLncR (https://iasri-sg.icar.gov.in/aslncr/), for predicting lncRNAs that are responsive to abiotic stress. ASLncR can be employed for large-scale prediction of abiotic stress-responsive lncRNAs using only sequence information. Given the significance of lncRNAs in plant response to abiotic challenges, the suggested strategy is expected to supplement the current experimental approaches for identifying abiotic stress-related lncRNAs.

References

Abbas M, El-Manzalawy Y (2020) Machine learning based refined differential gene expression analysis of pediatric sepsis. BMC Med Genet 13:122. https://doi.org/10.1186/s12920-020-00771-4
Alfaro E, Gamez M, Garcia N (2013) adabag: an R package for classification with boosting and bagging. J Stat Softw 54(2):1–35 http://www.jstatsoft.org/v54/i02/
Bhatia G, Singh A, Verma D et al (2020) Genome-wide investigation of regulatory roles of lncRNAs in response to heat and drought stress in Brassica juncea (Indian mustard). Environ Exp Bot 171:103922. https://doi.org/10.1016/j.envexpbot.2019.103922
Bhogireddy S, Mangrauthia SK, Kumar R et al (2021) Regulatory non-coding RNAs: a new frontier in regulation of plant biology. Funct Integr Genom 21:313–330. https://doi.org/10.1007/s10142-021-00787-8
Boyd K, Eng KH, Page CD (2013) Area under the precision-recall curve: point estimates and confidence intervals. In: Blockeel H, Kersting K, Nijssen S, Železný F (eds) Machine Learning and Knowledge Discovery in Databases. Springer, Berlin, Heidelberg, pp 451–466
Breiman L (1996) Bagging predictors. Mach Learn 24:123–140. https://doi.org/10.1007/BF00058655
Breiman L (2001) Random forests. Mach Learn 45:5–32. https://doi.org/10.1023/A:1010933404324
Cao Z, Zhao T, Wang L et al (2021) The lincRNA XH123 is involved in cotton cold-stress regulation. Plant Mol Biol 106:521–531. https://doi.org/10.1007/s11103-021-01169-1
Chen J, Zhong Y, Qi X (2021a) LncRNA TCONS_00021861 is functionally associated with drought tolerance in rice (Oryza sativa L.) via competing endogenous RNA regulation. BMC Plant Biol 21:410. https://doi.org/10.1186/s12870-021-03195-z
Chen L, Shi S, Jiang N et al (2018) Genome-wide analysis of long non-coding RNAs affecting roots development at an early stage in the rice response to cadmium stress. BMC Genomics 19:460. https://doi.org/10.1186/s12864-018-4807-6
Chen P, Song Y, Liu X et al (2022) LncRNA PMAT–PtoMYB46 module represses PtoMATE and PtoARF2 promoting Pb2+ uptake and plant growth in poplar. J Hazard Mater 433:128769. https://doi.org/10.1016/j.jhazmat.2022.128769
Chen T, Guestrin C (2016) XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Association for Computing Machinery, New York, pp 785–794
Chen X, Jiang X, Niu F et al (2023) Overexpression of lncRNA77580 regulates drought and salinity stress responses in soybean. Plants 12:181. https://doi.org/10.3390/plants12010181
Choudhury S, Mansi MSK et al (2021) Genome-wide identification of Ran GTPase family genes from wheat (T. aestivum) and their expression profile during developmental stages and abiotic stress conditions. Funct Integr Genom 21:239–250. https://doi.org/10.1007/s10142-021-00773-0
Das P, Roychowdhury A, Das S et al (2020) sigFeature: novel significant feature selection method for classification of gene expression data using support vector machine and t statistic. Front Genet 11:247. https://doi.org/10.3389/fgene.2020.00247
Díaz-Uriarte R, Alvarez de Andrés S (2006) Gene selection and classification of microarray data using random forest. BMC Bioinform 7:3. https://doi.org/10.1186/1471-2105-7-3
Ding Z, Tie W, Fu L et al (2019) Strand-specific RNA-seq based identification and functional prediction of drought-responsive lncRNAs in cassava. BMC Genomics 20:214. https://doi.org/10.1186/s12864-019-5585-5
Dubinkina VB, Ischenko DS, Ulyantsev VI et al (2016) Assessment of k-mer spectrum applicability for metagenomic dissimilarity analysis. BMC Bioinform 17:38. https://doi.org/10.1186/s12859-015-0875-7
Eom SH, Lee HJ, Lee JH et al (2019) Identification and functional prediction of drought-responsive long non-coding RNA in tomato. Agronomy 9:629. https://doi.org/10.3390/agronomy9100629
Fawcett T (2006) An introduction to ROC analysis. Pattern Recogn Lett 27:861–874. https://doi.org/10.1016/j.patrec.2005.10.010
Freund Y, Schapire RE (1999) A short introduction to boosting. Jpn Soc Artif Intell 14(5):771–780
Friedman JH (2001) Greedy function approximation: a gradient boosting machine. Ann Stat 29:1189–1232. https://doi.org/10.1214/aos/1013203451
Glisovic T, Bachorik JL, Yong J, Dreyfuss G (2008) RNA-binding proteins and post-transcriptional gene regulation. FEBS Lett 582:1977–1986. https://doi.org/10.1016/j.febslet.2008.03.004
Greenwell B, Boehmke B, Cunningham J, et al (2022). gbm: generalized boosted regression models. R package version 2.1.8.1. https://CRAN.R-project.org/package=gbm
Guo F-B, Dong C, Hua H-L et al (2017) Accurate prediction of human essential genes using only nucleotide composition and association information. Bioinformatics 33:1758–1764. https://doi.org/10.1093/bioinformatics/btx055
Guyon I, Weston J, Barnhill S, Vapnik V (2002) Gene selection for cancer classification using support vector machines. Mach Learn 46:389–422. https://doi.org/10.1023/A:1012487302797
He X, Guo S, Wang Y et al (2020) Systematic identification and analysis of heat-stress-responsive lncRNAs, circRNAs and miRNAs with associated co-expression and ceRNA networks in cucumber (Cucumis sativus L.). Physiol Plant 168:736–754. https://doi.org/10.1111/ppl.12997
Huang Y, Niu B, Gao Y et al (2010) CD-HIT suite: a web server for clustering and comparing biological sequences. Bioinformatics 26:680–682. https://doi.org/10.1093/bioinformatics/btq003
Hutchinson JN, Ensminger AW, Clemson CM et al (2007) A screen for nuclear transcripts identifies two linked noncoding RNAs associated with SC35 splicing domains. BMC Genomics 8:39. https://doi.org/10.1186/1471-2164-8-39
Jha UC, Nayyar H, Jha R et al (2020) Long non-coding RNAs: emerging players regulating plant abiotic stress response and adaptation. BMC Plant Biol 20:466. https://doi.org/10.1186/s12870-020-02595-x
Jiang G, Wang W (2017) Error estimation based on variance analysis of k-fold cross-validation. Pattern Recogn 69:94–106. https://doi.org/10.1016/j.patcog.2017.03.025
Jin J, Lu P, Xu Y et al (2021) PLncDB V2.0: a comprehensive encyclopedia of plant long noncoding RNAs. Nucleic Acids Res 49:D1489–D1495. https://doi.org/10.1093/nar/gkaa910
Ke G, Meng Q, Finley T et al (2017) LightGBM: a highly efficient gradient boosting decision tree. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. Curran Associates Inc., Red Hook, pp 3149–3157
Kirk JM, Kim SO, Inoue K et al (2018) Functional classification of long non-coding RNAs by kmer content. Nat Genet 50:1474–1482. https://doi.org/10.1038/s41588-018-0207-8
Kirk JM, Sprague D, Calabrese JM (2021) Classification of long noncoding RNAs by k-mer content. Methods Mol Biol 2254:41–60. https://doi.org/10.1007/978-1-0716-1158-6_4
Klapproth C, Sen R, Stadler PF et al (2021) Common features in lncRNA annotation and classification: a survey. Non-Coding RNA 7:77. https://doi.org/10.3390/ncrna7040077
Lamin-Samu AT, Zhuo S, Ali M, Lu G (2022) Long non-coding RNA transcriptome landscape of anthers at different developmental stages in response to drought stress in tomato. Genomics 114:110383. https://doi.org/10.1016/j.ygeno.2022.110383
Lee D, Karchin R, Beer MA (2011) Discriminative prediction of mammalian enhancers from DNA sequence. Genome Res 21:2167–2180. https://doi.org/10.1101/gr.121905.111
Lee C, Kikyo N (2012) Strategies to identify long noncoding RNAs involved in gene regulation. Cell Biosci 2:37. https://doi.org/10.1186/2045-3701-2-37
Li C, Nong W, Zhao S et al (2022b) Differential microRNA expression, microRNA arm switching, and microRNA:long noncoding RNA interaction in response to salinity stress in soybean. BMC Genomics 23:65. https://doi.org/10.1186/s12864-022-08308-y
Li J-R, Liu C-C, Sun C-H, Chen Y-T (2018) Plant stress RNA-seq nexus: a stress-specific transcriptome database in plant cells. BMC Genomics 19:966. https://doi.org/10.1186/s12864-018-5367-5
Li R, Zhu H, Ruan J et al (2010) De novo assembly of human genomes with massively parallel short read sequencing. Genome Res 20:265–272. https://doi.org/10.1101/gr.097261.109
Li S, Cheng Z, Dong S et al (2022a) Global identification of full-length cassava lncRNAs unveils the role of cold-responsive intergenic lncRNA 1 in cold stress response. Plant Cell Environ 45:412–426. https://doi.org/10.1111/pce.14236
Liaw A, Wiener M (2002) Classification and regression by randomForest. R News 2(3):18–22
Liu G, Liu F, Wang Y, Liu X (2022b) A novel long noncoding RNA CIL1 enhances cold stress tolerance in Arabidopsis. Plant Sci 323:111370. https://doi.org/10.1016/j.plantsci.2022.111370
Liu P, Zhang Y, Zou C et al (2022a) Integrated analysis of long non-coding RNAs and mRNAs reveals the regulatory network of maize seedling root responding to salt stress. BMC Genomics 23:50. https://doi.org/10.1186/s12864-021-08286-7
Meher PK, Sahu TK, Rao AR (2016) Identification of species based on DNA barcode using k-mer feature vector and random forest classifier. Gene 592:316–324. https://doi.org/10.1016/j.gene.2016.07.010
Mercer TR, Dinger ME, Mattick JS (2009) Long non-coding RNAs: insights into functions. Nat Rev Genet 10:155–159. https://doi.org/10.1038/nrg2521
Meyer D, Dimitriadou E, Hornik K et al (2021) e1071: Misc functions of the department of statistics, probability theory group (formerly: E1071), TU Wien. R package version 1.7-9. https://CRAN.R-project.org/package=e1071
Öztürk Gökçe ZN, Aksoy E, Bakhsh A et al (2021) Combined drought and heat stresses trigger different sets of miRNAs in contrasting potato cultivars. Funct Integr Genom 21:489–502. https://doi.org/10.1007/s10142-021-00793-w
Patra GK, Gupta D, Rout GR, Panda SK (2023) Role of long non coding RNA in plants under abiotic and biotic stresses. Plant Physiol Biochem 194:96–110. https://doi.org/10.1016/j.plaphy.2022.10.030
Pradhan UK, Sharma NK, Kumar P et al (2021) miRbiom: machine-learning on Bayesian causal nets of RBP-miRNA interactions successfully predicts miRNA profiles. PLoS ONE 16:e0258550. https://doi.org/10.1371/journal.pone.0258550
Qin T, Zhao H, Cui P et al (2017) A nucleus-localized long non-coding RNA enhances drought and salt stress tolerance. Plant Physiol 175:1321–1336. https://doi.org/10.1104/pp.17.00574
Quan M, Chen J, Zhang D (2015) Exploring the secrets of long noncoding RNAs. Int J Mol Sci 16:5467–5496. https://doi.org/10.3390/ijms16035467
Quan M, Liu X, Xiao L et al (2021) Transcriptome analysis and association mapping reveal the genetic regulatory network response to cadmium stress in Populus tomentosa. J Exp Bot 72:576–591. https://doi.org/10.1093/jxb/eraa434
Ramírez Gonzales L, Shi L, Bergonzi SB et al (2021) Potato cycling DOF factor 1 and its lncRNA counterpart StFLORE link tuber development and drought response. Plant J 105:855–869. https://doi.org/10.1111/tpj.15093
Ren J, Jiang C, Zhang H et al (2022) LncRNA-mediated ceRNA networks provide novel potential biomarkers for peanut drought tolerance. Physiol Plant 174:e13610. https://doi.org/10.1111/ppl.13610
Rutley N, Poidevin L, Doniger T et al (2021) Characterization of novel pollen-expressed transcripts reveals their potential roles in pollen heat stress response in Arabidopsis thaliana. Plant Reprod 34:61–78. https://doi.org/10.1007/s00497-020-00400-1
Saeed F, Chaudhry UK, Raza A et al (2023) Developing future heat-resilient vegetable crops. Funct Integr Genom 23:47. https://doi.org/10.1007/s10142-023-00967-8
Sandri M, Zuccolotto P (2008) A bias correction algorithm for the gini variable importance measure in classification trees. J Comput Graph Stat 17:611–628. https://doi.org/10.1198/106186008X344522
Shi Y, Ke G, Soukhavong D et al (2022) lightgbm: light gradient boosting machine. R package version 3.3.4. https://CRAN.R-project.org/package=lightgbm
Suksamran R, Saithong T, Thammarongtham C, Kalapanulak S (2020) Genomic and transcriptomic analysis identified novel putative cassava lncRNAs involved in cold and drought stress. Genes 11:366. https://doi.org/10.3390/genes11040366
Sun L, Luo H, Bu D et al (2013) Utilizing sequence intrinsic composition to classify protein-coding and long non-coding transcripts. Nucleic Acids Res 41:e166. https://doi.org/10.1093/nar/gkt646
Tan X, Li S, Hu L, Zhang C (2020) Genome-wide analysis of long non-coding RNAs (lncRNAs) in two contrasting rapeseed (Brassica napus L.) genotypes subjected to drought stress and re-watering. BMC Plant Biol 20:81. https://doi.org/10.1186/s12870-020-2286-9
Tao X, Li M, Zhao T et al (2021) Neofunctionalization of a polyploidization-activated cotton long intergenic non-coding RNA DAN1 during drought stress regulation. Plant Physiol 186:2152–2168. https://doi.org/10.1093/plphys/kiab179
Tian R, Sun X, Liu C et al (2023) A Medicago truncatula lncRNA MtCIR1 negatively regulates response to salt stress. Planta 257:32. https://doi.org/10.1007/s00425-022-04064-1
Tilman D, Balzer C, Hill J, Befort BL (2011) Global food demand and the sustainable intensification of agriculture. Proc Natl Acad Sci 108:20260–20264. https://doi.org/10.1073/pnas.1116437108
Urquiaga MCO, Thiebaut F, Hemerly AS, Ferreira PCG (2020) From trash to luxury: the potential role of plant LncRNA in DNA methylation during abiotic stress. Front Plant Sci 11:603246. https://doi.org/10.3389/fpls.2020.603246
Vapnik V (1963) Pattern recognition using generalized portrait method. Autom Remote Control 24:774–780
Wang H-LV, Chekanova JA (2017) Long noncoding RNAs in plants. In: Rao MRS (ed) Long Non Coding RNA Biology. Springer, Singapore, pp 133–154
Wang J, Liu X, Wu H et al (2010) CREB up-regulates long non-coding RNA, HULC expression through interaction with microRNA-372 in liver cancer. Nucleic Acids Res 38:5366–5383. https://doi.org/10.1093/nar/gkq285
Wang J, Chen Q, Wu W et al (2021) Genome-wide analysis of long non-coding RNAs responsive to multiple nutrient stresses in Arabidopsis thaliana. Funct Integr Genom 21:17–30. https://doi.org/10.1007/s10142-020-00758-5
Wani SH, Kumar V, Shriram V, Sah SK (2016) Phytohormones and their metabolic engineering for abiotic stress tolerance in crop plants. Crop J 4:162–176. https://doi.org/10.1016/j.cj.2016.01.010
Wen X, Ding Y, Tan Z et al (2020) Identification and characterization of cadmium stress-related LncRNAs from Betula platyphylla. Plant Sci 299:110601. https://doi.org/10.1016/j.plantsci.2020.110601
Wu W, Wu Y, Hu D et al (2020) PncStress: a manually curated database of experimentally validated stress-responsive non-coding RNAs in plants. Database 2020:baaa001. https://doi.org/10.1093/database/baaa001
Xu S, Dong Q, Deng M et al (2021) The vernalization-induced long non-coding RNA VAS functions with the transcription factor TaRF2b to promote TaVRN1 expression for flowering in hexaploid wheat. Mol Plant 14:1525–1538. https://doi.org/10.1016/j.molp.2021.05.026
Yang H, Cui Y, Feng Y et al (2023) Long non-coding RNAs of Plants in response to abiotic stresses and their regulating roles in promoting environmental adaption. Cells 12:729. https://doi.org/10.3390/cells12050729
Yang W-C, Katinakis P, Hendriks P et al (1993) Characterization of GmENOD40, a gene showing novel patterns of cell-specific expression during soybean nodule development. Plant J 3:573–585. https://doi.org/10.1046/j.1365-313X.1993.03040573.x
Yang X, Liu C, Niu X et al (2022) Research on lncRNA related to drought resistance of Shanlan upland rice. BMC Genomics 23:336. https://doi.org/10.1186/s12864-022-08546-0
Ye X, Wang S, Zhao X et al (2022) Role of lncRNAs in cis- and trans-regulatory responses to salt in Populus trichocarpa. Plant J 110:978–993. https://doi.org/10.1111/tpj.15714
Yu F, Tan Z, Fang T et al (2020) A comprehensive transcriptomics analysis reveals long non-coding RNA to be involved in the key metabolic pathway in response to waterlogging stress in maize. Genes 11:267. https://doi.org/10.3390/genes11030267
Yu Y, Zhang Y, Chen X, Chen Y (2019) Plant noncoding RNAs: hidden players in development and stress responses. Annu Rev Cell Dev Biol 35:407–431. https://doi.org/10.1146/annurev-cellbio-100818-125218
Zhang X, Dong J, Deng F et al (2019) The long non-coding RNA lncRNA973 is involved in cotton response to salt stress. BMC Plant Biol 19:459. https://doi.org/10.1186/s12870-019-2088-0
Zhang X, Shen J, Xu Q et al (2021) Long noncoding RNA lncRNA354 functions as a competing endogenous RNA of miR160b to regulate ARF genes in response to salt stress in upland cotton. Plant Cell Environ 44:3302–3321. https://doi.org/10.1111/pce.14133
Zhang Z, Zhong H, Nan B, Xiao B (2022) Global identification and integrated analysis of heat-responsive long non-coding RNAs in contrasting rice cultivars. Theor Appl Genet 135:833–852. https://doi.org/10.1007/s00122-021-04001-y
Zhu L, Wang X, Tian J et al (2022) Genome-wide analysis of VPE family in four Gossypium species and transcriptional expression of VPEs in the upland cotton seedlings under abiotic stresses. Funct Integr Genom 22:179–192. https://doi.org/10.1007/s10142-021-00818-4
Chen T, He T, Benesty M, et al (2021b). xgboost: extreme gradient boosting. R package version 1.5.0.2. https://CRAN.R-project.org/package=xgboost
Peters A, Hothorn T, Ripley BD, et al (2023) ipred: improved predictors. https://cran.r-project.org/package=ipred
Pradhan UK, Meher PK, Naha S et al (2022) PlDBPred: a novel computational model for discovery of DNA binding proteins in plants. Brief Bioinform:bbac483. https://doi.org/10.1093/bib/bbac483

Acknowledgments

The authors sincerely acknowledge the Director, ICAR-IASRI, New Delhi, for providing the necessary facilities to carry out the research work. The authors also acknowledge the ASHOKA supercomputing facilities available at ICAR-IASRI, New Delhi.

Funding

This work was funded by the ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi-110012, India.

Author information

Authors and Affiliations

Division of Statistical Genetics, ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi, 110012, India
Upendra Kumar Pradhan, Prabina Kumar Meher & Ajit Gupta
Division of Computer Applications, ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi, 110012, India
Sanchita Naha
Indian Council of Agricultural Research (ICAR), New Delhi, India
Atmakuri Ramakrishna Rao

Contributions

Conceptualization: PKM and UKP; methodology: UKP, PKM, and ARR; software: SN, PKM, and UKP; validation: UKP; formal analysis: UKP, PKM, SN, and AG; investigation: PKM, AG, and ARR; resources: SN and AG; data curation: UKP and PKM; writing (original draft preparation): UKP and PKM; writing (review and editing): PKM, UKP, ARR, and AG; visualization: PKM, UKP, and SN; supervision: PKM, ARR, and AG. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Prabina Kumar Meher.

Ethics declarations

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors. All secondary data used in the study are available at https://iasri-sg.icar.gov.in/aslncr/dataset.php

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Pradhan, U.K., Meher, P.K., Naha, S. et al. ASLncR: a novel computational tool for prediction of abiotic stress-responsive long non-coding RNAs in plants. Funct Integr Genomics 23, 113 (2023). https://doi.org/10.1007/s10142-023-01040-0

Received: 30 December 2022
Revised: 23 March 2023
Accepted: 24 March 2023
Published: 31 March 2023
DOI: https://doi.org/10.1007/s10142-023-01040-0
