Introduction

Antibiotic resistance is a global healthcare concern, and Nocardia farcinica poses a significant threat because of its inherent resistance to multiple antibiotics. The genus Nocardia belongs to the actinomycetes, a group of aerobic bacilli commonly found in soil and water (Mehta and Shamoo 2020; Conville et al. 2017). Although Nocardia comprises more than 80 species, approximately 54 of them, notably the N. nova complex, N. abscessus complex, N. transvalensis complex, N. farcinica, N. asteroides type VI (N. cyriacigeorgica), N. brevicatena/N. paucivorans complex, and N. brasiliensis, are pathogenic to humans (Duggal and Chugh 2020). The mechanism of pathogenesis in Nocardia is not completely understood (Ji et al. 2020). Virulence in Nocardia has been ascribed to its ability to survive and grow in various human cells and to evade the immune response by producing antioxidant enzymes (catalase and superoxide dismutase (SOD)), inhibiting phagolysosome formation, reducing levels of phosphomonoesterases II in tissue macrophages, and, in a few cases, secreting toxins and hemolysin (Mehta and Shamoo 2020; Conville et al. 2017). Because the disease is difficult to diagnose and may go untreated, it can spread to other organs of the body, including the spine and brain (Kövér et al. 2023). Nocardiosis of the brain or spinal cord is fatal in more than 85% of cases (https://www.cdc.gov/nocardiosis/infection/index.html). Nocardial infection can also enter through injuries to the subcutaneous tissue, where it may present as cellulitis, pyoderma, or abscess formation and closely resemble staphylococcal or streptococcal infection. Spread of the infection via the lymphatics to the regional lymph nodes, however, may indicate lymphocutaneous nocardiosis (Duggal and Chugh 2020).
Nocardia infections constitute an important threat to human health, and the diagnosis and etiology of this disease are very important (Zhang et al. 2023). Regrettably, the roles of the majority of Nocardia genes have not yet been studied (Ji et al. 2022). Few studies have addressed the virulence factors of N. farcinica, and many potential virulence factors are yet to be discovered. The known virulence factors are as follows. Superoxide dismutase (SOD) and catalase: N. farcinica prevents phagosome-lysosome fusion, reduces lysosomal enzyme action in macrophages, neutralizes phagosomal acidification, and inhibits the oxidative killing mechanisms of phagocytes. This resistance is due to the production of surface-associated superoxide dismutase and high levels of catalase. Cell wall glycolipids: Complex cell wall glycolipids act as a virulence factor in N. farcinica. In chronic granulomatous disease, the burst of oxidative metabolism in neutrophils and macrophages during phagocytosis is impaired, which reduces the intracellular killing of catalase-positive bacteria such as N. farcinica, although N. farcinica may also resist a typical oxidative burst. Mammalian cell entry protein family: The Mce1C and Mce1D proteins are situated in the cell wall of N. farcinica. Both genes are expressed at the mRNA and protein levels and elicit an antibody response during infection. The Mce1C and Mce1D proteins inhibit the expression of proinflammatory cytokines and block the NF-κB and MAPK signaling pathways, thereby inhibiting the innate immune response. Mycolic acid: N. farcinica belongs to the actinomycetes commonly referred to as mycolata, which contain mycolic acid; this helps the pathogen resist the host immune system. rox gene: In N. farcinica, the rox gene plays a secondary role in rifampicin resistance by initiating the breakdown of rifampicin and producing a novel metabolite in the first phase. The rox gene expresses a rifampicin monooxygenase that converts rifampicin into 2-N-hydroxy-4-oxo-rifampicin, which has reduced antibiotic activity. Other factors: phospholipase C destroys tissue, hemolysins (toxic proteins) destroy red blood cells, and lipases and proteases have also been reported (Cai et al. 2022; Ji et al. 2022).

Antibiotic resistance is an alarming issue and a worldwide threat faced by the healthcare community (CDC 2019). N. farcinica remains a virulent species and exhibits intrinsic resistance to many antibiotics, including third-generation cephalosporins (Mehta and Shamoo 2020). N. farcinica is a Gram-positive organism that can cause life-threatening infections because of its tendency to disseminate rapidly and resist antibiotics (Wu et al. 2021). There is currently no defined procedure for antimicrobial susceptibility testing to guide clinical therapy, and antimicrobial susceptibility varies with the species of the Nocardia isolate (Lee et al. 2021). N. farcinica has a distinctive antibiotic susceptibility pattern and is resistant to several antimicrobial agents (Gao et al. 2021). Globally, N. farcinica is the most common Nocardia species causing pulmonary infections in humans (Yetmar et al. 2023), with a mortality rate of 10–31% in Asia, Europe, and North America (Jiao et al 2021). Sulfonamides, chiefly trimethoprim–sulfamethoxazole (TMP–SMX), are the standard treatment for nocardiosis, based partly on the results of a few retrospective reviews, which found that patients receiving regimens containing sulfonamides had a trend toward longer survival. In patients with sulfonamide allergy or resistance, as well as in cases where clinical treatment failures have been documented, alternative or combination therapy is chosen. The objective of that study was to assess the factors linked to in-hospital mortality in patients with pulmonary nocardiosis, including risk factors, clinical, radiographic, and microbiologic features, as well as outcomes (Rahim et al. 2023). N. farcinica is extremely virulent and is known to be naturally resistant to numerous antibiotics (Adapa et al. 2020). Antimicrobial susceptibility testing is presently suggested for all Nocardia spp. 
isolates before starting antimicrobials, because sensitivities are often difficult to predict and patients may not tolerate first-line antibiotic therapy. Cotrimoxazole, imipenem, linezolid, and amikacin, which are recommended as initial therapy for nocardiosis, were the most commonly used antibiotics and showed the most favorable sensitivity patterns. Cotrimoxazole has been regarded as the initial therapy for nocardiosis because most studies report high sensitivity rates to this drug; however, the number of cotrimoxazole-resistant Nocardia isolates is reported to be increasing. Compared with some previous studies, higher rates of resistance to imipenem, ciprofloxacin, and amoxicillin–clavulanate were found, although resistance to ceftriaxone was lower than in other reports (Besteiro et al. 2023). Difficulties in timely detection and challenging diagnoses often lead to treatment delays, resulting in poorer patient outcomes (Wu et al. 2021). N. farcinica can cause various clinical manifestations, including brain abscess, keratitis, bacteremia, and infections of the lungs, kidneys, and skin (Bell et al. 2019). In certain instances, N. farcinica has been identified as the causative agent of peritonitis in immunocompromised individuals. The clinical manifestation depends on the patient’s medical history, as nosocomial infections caused by N. farcinica are frequently encountered in patients undergoing chemotherapy or dialysis or receiving treatment for conditions such as HIV infection and autoimmune diseases. The preferred antibiotic for treating N. farcinica infections is trimethoprim–sulfamethoxazole (SXT). Yet, because of rapidly evolving antibiotic resistance, a combination of SXT with other antibiotics, including amikacin, imipenem–cilastatin, and moxifloxacin, is currently used. 
Despite all these therapeutic efforts, treated patients often experience a relapse, and the mortality rate remains around 39% (Adapa et al. 2020). Sulfonamides, aminoglycosides, β-lactams (penicillins, carbapenems, and cephalosporins) and β-lactam/β-lactamase inhibitors, quinolones, macrolides, and tetracyclines are the six main categories of antimicrobial substances presently in clinical use. Aminoglycoside antibiotics such as amikacin and tobramycin are bactericidal: they disrupt the 30S ribosomal subunit, causing faulty protein synthesis. β-Lactam antibiotics (penicillins, carbapenems, and cephalosporins such as cefotaxime, ceftriaxone, cefixime, and cefuroxime) and the β-lactam/β-lactamase inhibitor combination amoxicillin–clavulanic acid achieve bactericidal activity by impeding bacterial cell wall construction: they inhibit the transpeptidase that catalyzes peptidoglycan cross-linking. In the combination, clavulanic acid protects amoxicillin from degradation by inactivating a broad spectrum of β-lactamases. In contrast, the macrolides erythromycin and clarithromycin inhibit protein synthesis by binding to the 50S ribosomal subunit. Oxazolidinones, exemplified by linezolid, also act by preventing the synthesis of bacterial proteins. Quinolones (e.g., ciprofloxacin, moxifloxacin), sulfonamides (sulfamethoxazole, trimethoprim/TMP–SMX), and tetracyclines (doxycycline, minocycline) disrupt other bacterial processes: quinolones inhibit gyrase, which is essential for DNA supercoiling; sulfonamides block the folate pathway; and tetracyclines inhibit protein synthesis by binding to the 30S ribosomal subunit (Nouioui et al. 2020).

In the present study, whole-genome sequencing was performed using the Illumina sequencing platform to decipher the genomic basis of virulence factors and antimicrobial resistance. The obtained sequence data were then assembled and annotated using various bioinformatics tools. This research demonstrates the importance of genome-based analysis in assessing the potential health risks associated with emerging Nocardia pathogens. The abundance of bacterial genome sequence information has facilitated various distinct methods for identifying therapeutic targets. The objective of the study was to perform whole-genome sequencing of N. farcinica to detect resistance to first- and second-line anti-nocardiosis drugs.

Materials and methods

Isolation and culture conditions of Nocardia farcinica

Nocardia farcinica, isolated from sputum samples, was cultured at Frontier Lifeline Hospitals, Chennai, Tamil Nadu, India. Conventional culture methods were employed using Luria-Bertani (LB) medium for bacterial cultivation at 37 °C for 42 h. Gram stain, colony morphology, and biochemical analysis were employed to confirm the identity of the acquired Gram-positive N. farcinica (Fig. 1a). The strain was cultured on blood agar, and after 3 days of incubation at 37 °C, sufficient growth was observed. Characteristic dry, chalky, gray, and wrinkled colonies appeared on the blood agar (Fig. 1b).

Fig. 1

a Gram staining of N. farcinica revealed Gram-positive bacteria characterized by rod-shaped morphology and visible branches. b Colonial characteristics of N. farcinica on blood agar plates include pale yellow pigmentation, elevated growth, a rough surface, and opacity (color figure online)

Antibiotic susceptibility test

The antibiotic sensitivity test was performed at Orbito Asia Diagnostics, Coimbatore, Tamil Nadu, India using the Kirby–Bauer disc diffusion method on cation-adjusted LB agar plates. The inocula of N. farcinica were prepared in accordance with Clinical and Laboratory Standards Institute (CLSI) standards. The antibiotic discs used, which included vancomycin (VA), tobramycin (TOB), tetracycline (TE), teicoplanin (TEI), spectinomycin (SPT), amikacin (AK), ofloxacin (OF), fusidic acid (FC), carbenicillin (CB), imipenem (IPM), ciprofloxacin (CIP), levofloxacin (LE), netillin (NET), polymyxin-B (PB), rifampicin (RIF), streptomycin (S), gentamicin (GEN), kanamycin (K), penicillin-G (P), norfloxacin (NX), nalidixic acid (NA), erythromycin (E), clindamycin (CD), and chloramphenicol (C), were procured from Himedia. The results were read after 72 h of incubation. The diameter of the inhibition zone was measured for each antibiotic disc and compared with the interpretation thresholds. A ‘non-susceptible’ isolate was considered resistant (R) and a ‘susceptible’ isolate was defined as sensitive (S). All tests were performed in duplicate (Mozrall et al. 2022).
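
The interpretation step described above can be sketched as follows. This is a minimal illustration only: the breakpoint value, the zone diameters, and the choice to average the duplicate readings are assumptions made for the example and are not taken from the study or from CLSI tables.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Breakpoint:
    """Hypothetical interpretation threshold for one antibiotic disc."""
    susceptible_min_mm: float  # zone diameter at or above which the isolate is called S


def interpret_zone(zone_mm: float, breakpoint: Breakpoint) -> str:
    """Return 'S' (sensitive) or 'R' (resistant), mirroring the two-category scheme in the text."""
    return "S" if zone_mm >= breakpoint.susceptible_min_mm else "R"


def interpret_duplicates(zones_mm: list[float], breakpoint: Breakpoint) -> str:
    """Interpret duplicate plates by their mean zone diameter (an assumption, not the study's rule)."""
    return interpret_zone(mean(zones_mm), breakpoint)


if __name__ == "__main__":
    example_breakpoint = Breakpoint(susceptible_min_mm=17.0)  # placeholder value
    print(interpret_duplicates([20.0, 22.0], example_breakpoint))
    print(interpret_duplicates([9.0, 11.0], example_breakpoint))
```

In practice, each disc would have its own threshold taken from the interpretation table used by the laboratory.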

Genomic DNA extraction and quantification

Genomic DNA extraction and quantification were carried out at Genotypic Technology Pvt. Ltd., Bengaluru-560094, Karnataka, India. N. farcinica was grown at 37 °C for 42 h, and the DNA was extracted from the Nocardia cell pellet using the Qiagen DNeasy Blood and Tissue Kit (Cat No. 69506). The cell pellet was resuspended in lysozyme (Sigma; Cat. No. L7651) at the recommended concentration of 10 mg/ml. Because the enzyme activity is optimal at 37 °C, the sample was incubated at this temperature for 30 min to hydrolyze the peptidoglycan layer, following the protocol recommended by Sigma. The bacterial suspension was treated with AL buffer and Proteinase K and incubated at 56 °C for 2 h, followed by RNaseA (Cat. No. 2101076; MP Biomedicals) treatment for 20 min at 65 °C. The lysate was mixed with half its volume of absolute ethanol and loaded onto the spin column, which was placed in a 2 ml collection tube. The tube was centrifuged at 8000 rpm for 1 min, and the flow-through was discarded. The remaining wash steps were performed according to the manufacturer’s protocol. DNA was eluted in 10 mM Tris HCl, pH 8.0. Genomic DNA concentration and purity were measured using the Nanodrop 2000 spectrophotometer (Thermo Scientific), and DNA integrity and quantity were analyzed using agarose gel electrophoresis and the Qubit dsDNA HS assay kit (Cat No: Q32854), respectively. For the integrity analysis, the DNA was loaded on a 1% agarose gel and electrophoresis was performed at 100 V. The pure samples with optimal yield and concentration were considered suitable for Illumina and Nanopore library preparation.

Strain purity check

The purity of the bacterial strain was assessed using the 16S rRNA gene. PCR amplification was conducted using 30–50 ng of genomic DNA as a template, the 16S rDNA primers 27 Forward (AGAGTTTGATCCTGGCTCAG) and 1492 Reverse (TACGGCTACCTTGTTACGACTT), and Takara ExTaq in a 25 μl reaction mix. The resulting 1.5 kb PCR product was purified with a column-based PCR clean-up kit (Genetix) and used for Sanger sequencing (Naveed et al. 2023).

The PCR conditions were as follows:

  • Step 1: Initial Denaturation 98 °C, 2 min

  • Step 2: Denaturation 98 °C, 20 s

  • Step 3: Annealing 60 °C, 30 s

  • Step 4: Extension 72 °C, 30 s

  • Step 5: Go to step 2, 30 times

  • Step 6: Final extension 72 °C, 1 min.
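
As a rough check on the program above, the total hold time can be estimated with a short sketch. It assumes that the loop repeats only the denaturation, annealing, and extension steps, that 30 cycles are run in total, and that ramp times between temperatures are ignored; the variable names are illustrative.

```python
# Hold times in seconds, taken from the PCR conditions listed above.
INITIAL_DENATURATION_S = 2 * 60
CYCLE_STEPS_S = {"denaturation": 20, "annealing": 30, "extension": 30}
FINAL_EXTENSION_S = 1 * 60
CYCLES = 30  # assumption: 30 cycles in total


def total_hold_time_s(cycles: int = CYCLES) -> int:
    """Sum of all programmed hold times, excluding thermocycler ramping."""
    per_cycle = sum(CYCLE_STEPS_S.values())
    return INITIAL_DENATURATION_S + cycles * per_cycle + FINAL_EXTENSION_S


if __name__ == "__main__":
    seconds = total_hold_time_s()
    print(f"{seconds} s = {seconds / 60:.1f} min")  # 2580 s = 43.0 min
```

The actual run takes longer than this figure because of heating and cooling between steps.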

Library construction and genome sequencing

Library construction was carried out at Genotypic Technology using the QIASeq FX DNA Library Preparation protocol (Cat#180475) following the manufacturer’s instructions. Briefly, 50 ng of Qubit-quantified DNA was enzymatically fragmented, end-repaired, and A-tailed in a one-tube reaction using the FX Enzyme Mix provided in the QIASeq FX DNA kit. The end-repaired and adenylated fragments were subjected to adapter ligation, whereby an index-incorporated Illumina adapter was ligated to generate a sequencing library. The library was subjected to 6 cycles of index PCR (initial denaturation at 98 °C for 2 min; cycling at 98 °C for 20 s, 60 °C for 30 s, and 72 °C for 30 s; and final extension at 72 °C for 1 min) to enrich the adapter-tagged fragments. Finally, the amplified library was purified using JetSeq Beads (Bio, # 68031), followed by a library quality control check. The Illumina-compatible sequencing library was quantified with a Qubit fluorometer (Thermo Fisher Scientific, MA, USA), and its fragment size distribution was analyzed on an Agilent 2200 TapeStation. The libraries were sequenced on the Illumina NovaSeq 6000 (Illumina, San Diego, USA) using 150 bp paired-end chemistry following the manufacturer’s instructions.

Genome sequencing, assembly, and annotation

The obtained genomic DNA of N. farcinica was sequenced on the Illumina NovaSeq 6000 using 150 bp paired-end chemistry following the manufacturer’s instructions (Cortese et al. 2021). A total of ~7.1 million Illumina reads was generated for the Nocardia sample. Adapters were removed from the raw fastq files using fastp (version 0.23.0), and the quality of the raw reads was examined with the FastQC tool. Unicycler (version 0.5.0) was then used with default settings for de novo assembly of the good-quality short reads (Mozrall et al. 2022; Juraschek et al. 2021; Irfan et al. 2023). Unicycler (0.5.0) was chosen because it filters out low-depth contigs, yielding clean assemblies even if the read set has a low level of contamination (https://github.com/rrwick/Unicycler). The assembled genome was evaluated using the Quality Assessment Tool for Genome Assemblies (QUAST) software with default parameters, including a minimum contig length of 500 bp (Mozrall et al. 2022). QUAST generates a complete set of metrics and statistics to evaluate the accuracy of a genome assembly. These include N50, L50, the number of contigs and scaffolds, and the total length of the assembly, which help researchers assess the completeness of the assembly (Sharma et al. 2023). The Rapid Annotation using Subsystem Technology (RAST) server (version 2.0) and Prokka (Prokaryotic Genome Annotation) software were used to annotate the assembled N. farcinica genome (Madaha et al. 2020; Zhao et al. 2020). RAST (https://rast.nmpdr.org/rast.cgi) was run fully automated with default parameters, and Prokka (https://github.com/tseemann/prokka) was run with default settings, including a similarity e-value cut-off of 1e-09, a minimum coverage on the query protein of 80, and a minimum contig size of 1 (Aziz et al. 2008; Overbeek et al 2014; Brettin et al. 2015; Pei et al 2021; Seemann 2014). 
The antimicrobial resistance genes and virulence factors were predicted using the ABRicate (version 1.0.1) program with default parameter settings (Seemann 2022) (https://github.com/tseemann/abricate). ABRicate is widely used for mass screening of contigs to identify antimicrobial resistance and virulence genes. It bundles the NCBI, ResFinder, CARD (Comprehensive Antibiotic Resistance Database), and Virulence Factor Database (VFDB) databases (Zakaria et al. 2021). The presence of plasmids was predicted computationally through the PLSDB server using the default parameters of the mash screen search strategy, namely a maximum p value of 0.1 and a minimum identity of 0.99 (https://ccb-microbe.cs.unisaarland.de/plsdb/plasmids/search_form/seq/), and the sequences were visualized using the Proksee tool (https://proksee.ca) (Tian et al. 2022). PLSDB is a database of 13,789 plasmid records collected from the NCBI nucleotide database and is widely used for predicting plasmids within bacterial genome assemblies (Galata et al. 2019). Proksee displays bacterial genomes assembled and annotated with different third-party tools in one graphical map, in which plasmids can also be visualized (Grant et al. 2023). Moreover, DNA–DNA hybridization analysis was performed using the Type Strain Genome Server (TYGS) with default settings (https://tygs.dsmz.de/user_requests/new). TYGS was chosen because it provides a genome-based replacement for conventional taxonomic techniques, including DNA:DNA hybridization (DDH), 16S rRNA gene sequencing, G + C content measurement, and multi-locus sequence analysis. The server computes digital DNA:DNA hybridization values, which quantify the degree of genetic resemblance between two bacterial strains. 
This approach is used for identifying new bacterial species and for assessing the degree of genetic resemblance between two bacterial strains (Meier-Kolthoff and Göker 2019). The average nucleotide identity (ANI) score was calculated using the FastANI tool in the Proksee server. ANI is a whole-genome similarity metric that facilitates taxonomic analysis at high resolution across thousands of genomes from varied phylogenetic groups, and it is a reliable and useful metric for determining the relationship between two genomes. FastANI is accurate for both complete and draft genomes and up to three orders of magnitude faster than alignment-based methods (Jain et al. 2018). Prophage sequences were identified with the PHASTER server (https://phaster.ca/) using default parameters. PHASTER (PHAge Search Tool Enhanced Release) is an upgrade of the popular PHAST web server for rapidly finding and annotating prophage sequences in the genomes of bacteria and plasmids. PHASTER assigns prophages to three categories: intact, questionable, and incomplete. These categories reflect the confidence of the prediction: intact prophages are predicted with the highest confidence to be complete, whereas questionable prophages are predicted with less confidence. These are the criteria used for prophage sequence identification in the PHASTER server (Arndt et al. 2016; Zhou et al. 2011). The presence of secondary metabolite gene clusters was detected using the antiSMASH server (version 7.0.0) with default settings (https://antismash.secondarymetabolites.org/#!/start). Gene profiles for particular kinds of gene clusters are stored as hidden Markov models (HMMs). Gene clusters are determined from the co-occurrence of genes involved in secondary metabolism using the ClusterFinder algorithm. Manually curated clusters of biosynthetic genes and related metadata are available in the MIBiG database. 
The antiSMASH database consists of gene clusters found with antiSMASH version 4 in over 6000 completed bacterial genomes. All publicly available genomes are analyzed using the ClusterBlast algorithm, which identifies similar gene clusters. These are the databases used in antiSMASH for detecting secondary metabolite gene clusters in bacterial genomes (Blin et al. 2023).

Results

Antibiotic susceptibility test

The results of the antibiotic susceptibility test of N. farcinica with 24 antibiotics are shown in Fig. 2. N. farcinica was susceptible to 13 antibiotics, namely vancomycin (VA), tobramycin (TOB), tetracycline (TE), teicoplanin (TEI), spectinomycin (SPT), amikacin (AK), ofloxacin (OF), fusidic acid (FC), carbenicillin (CB), imipenem (IPM), ciprofloxacin (CIP), levofloxacin (LE), and netillin (NET), and resistant to 11 antibiotics, namely polymyxin-B (PB), rifampicin (RIF), streptomycin (S), gentamicin (GEN), kanamycin (K), penicillin-G (P), norfloxacin (NX), nalidixic acid (NA), erythromycin (E), clindamycin (CD), and chloramphenicol (C). The resistance and susceptibility patterns of N. farcinica are shown in Table 1.

Fig. 2

Antibiotic susceptibility testing of N. farcinica by the disc diffusion method. The zones of inhibition show the bacterium’s response to the various antibiotics

Table 1 Antibiotic susceptibility test and interpretation thresholds

Genomic DNA extraction and quantification

The samples that passed the quality assessment with optimal yield and concentration were deemed suitable for Illumina and Nanopore library preparation (Table 2 and Fig. 3).

Table 2 Estimated DNA concentration and purity
Fig. 3

Agarose gel electrophoresis (AGE) of the N. farcinica DNA sample. ‘I’ denotes the sample lane and ‘L’ the molecular weight ladder

Strain purity check

The 16S rRNA sequence was subjected to a BLAST search against the nucleotide collection (nr/nt) database to assess the purity of the strain. The top BLAST hit had a percentage identity of 100% and a total score of 7404, indicating that the sample was a pure culture matching the known bacterial species N. farcinica with high similarity. Amplification of the 16S rRNA gene for the strain purity test is shown in Fig. 4.

Fig. 4

Amplification of the ~1500 bp 16S rRNA amplicon from N. farcinica. The gel shows distinct bands of approximately 1500 base pairs, corresponding to the amplified 16S rRNA gene

The sequences obtained from the forward and reverse primers were used to generate a contig, which was subjected to BLAST analysis (NCBI) against the nr/nt database. The bacterial strain at the top of the BLAST results, with maximum query coverage and percent identity, was identified as the strain (Figs. 5 and 6). Chromatograms of the 16S rRNA forward and reverse sequencing are shown in Fig. 7a, b.

Fig. 5

Identification by 16S rRNA gene sequencing. Visualization of the 16S rRNA sequence of Nocardia farcinica

Fig. 6

Sequence identity analysis using the 16S rRNA sequence of N. farcinica. The BLAST result for the 16S ribosomal RNA sequence shows 100% identity with N. farcinica, indicating a high level of sequence similarity and supporting the taxonomic assignment. The bacterial strain at the top of the BLAST results, with maximum query coverage and percent identity, is identified as the strain

Fig. 7

a, b Representation of a chromatogram depicting the outcomes of 16S rRNA forward and reverse sequencing in the N. farcinica strain, offering insights into the genetic composition. Peaks and patterns in the chromatogram illustrate the sequencing results, contributing to the molecular characterization of the bacterial strain. Red, yellow, and blue indicated the pure base; whereas, brown, orange, and violet indicated the mixed base (color figure online)

Library construction and genome sequencing

Sequencing yielded 7,181,520 raw reads with a GC content of 70.7%. The Illumina data generated for the sample thus comprised ~7.1 million reads, with a sequencing coverage of 308X (Table 3).
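
Sequencing coverage is conventionally estimated as the number of sequenced bases divided by the genome size. The sketch below applies this formula using the raw read count above, the nominal 150 bp read length, and the 6,123,581 bp assembly length reported in the QUAST results. It is illustrative only: whether the read count refers to single reads or read pairs, and how trimming shortens the reads, are not stated in the text, so the sketch does not attempt to reproduce the reported 308X.

```python
def mean_depth(n_reads: int, read_length_bp: int, genome_size_bp: int) -> float:
    """Estimate mean sequencing depth as total sequenced bases / genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return n_reads * read_length_bp / genome_size_bp


if __name__ == "__main__":
    RAW_READS = 7_181_520      # raw read count from the text
    READ_LENGTH_BP = 150       # nominal read length; trimmed reads are shorter
    GENOME_SIZE_BP = 6_123_581  # total assembly length reported by QUAST

    as_single_reads = mean_depth(RAW_READS, READ_LENGTH_BP, GENOME_SIZE_BP)
    as_read_pairs = mean_depth(2 * RAW_READS, READ_LENGTH_BP, GENOME_SIZE_BP)
    print(f"count treated as single reads: {as_single_reads:.1f}x")
    print(f"count treated as read pairs:   {as_read_pairs:.1f}x")
```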

Table 3 Illumina read statistics

The Illumina-compatible sequencing library for the sample showed an average fragment size of 381 bp as well as sufficient concentration for obtaining the desired sequencing data. Table 4 lists the concentrations of the libraries obtained and the indices used, and Fig. 8 shows the TapeStation profile of the sequencing library.

Table 4 Description of libraries
Fig. 8

TapeStation profile of the sequencing library, providing insights into library composition and quality

Adapter removal with fastp and assembly evaluation with QUAST (Quality Assessment Tool for Genome Assemblies)

fastp automatically trims adapters from both single-end and paired-end Illumina data; for paired-end data, the adapters are identified from the overlap of each pair of sequences (Supplementary 1 and 2). Subsequently, the draft genome assembly metrics obtained with QUAST showed a total length of 6,123,581 bp, 103 contigs, an N50 of 29,253 bp, a GC content of 70.78%, and 63 contigs longer than 1000 bp (Table 5 and Fig. 9).
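
For readers unfamiliar with the N50 and L50 metrics, the following minimal sketch shows how they are defined. It is not QUAST's implementation, and the contig lengths in the example are made up for illustration. N50 is the length of the contig at which the cumulative length of the contigs, sorted from longest to shortest, first reaches half of the total assembly length; L50 is the number of contigs needed to reach that point.

```python
def n50_l50(contig_lengths: list[int]) -> tuple[int, int]:
    """Return (N50, L50) for a list of contig lengths in bp."""
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for rank, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, rank
    raise ValueError("contig_lengths must not be empty")


if __name__ == "__main__":
    # Toy lengths (bp), not from the study. Total = 280, half = 140.
    toy_contigs = [100, 80, 50, 30, 20]
    print(n50_l50(toy_contigs))  # (80, 2)
```

Note that QUAST computes these statistics only over contigs that pass its minimum length filter, so values depend on that setting.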

Table 5 Overview of genomic assembly statistics
Fig. 9

Genomic evaluation of the N. farcinica assembly by QUAST analysis, providing detailed insights into the structure and quality of the assembled genome

Annotation results from Rapid Annotation using Subsystem Technology (RAST) and Prokka

The genome annotations were predicted by Prokka and RAST. Prokka identified a genome of 6,131,470 bp with 5683 protein-coding sequences, 4 rRNA, 7 repeat regions, 60 tRNA, and 1 tmRNA. RAST found 5945 protein-coding sequences belonging to 302 subsystems and 54 rRNA (Table 6). The functional analysis obtained from RAST (Fig. 10) revealed the following subsystem categories in the genome:

  a) Cofactors, vitamins, prosthetic groups, and pigments: 185 genes

  b) Cell wall and capsule: 24 genes

  c) Virulence, disease, and defense: 58 genes

  d) Potassium metabolism: 8 genes

  e) Miscellaneous: 34 genes

  f) Membrane transport: 36 genes

  g) acquisition and metabolism: 7 genes

  h) RNA metabolism: 55 genes

  i) Nucleosides and nucleotides: 79 genes

  j) Protein metabolism: 186 genes

  k) Regulation and cell signaling: 16 genes

  l) Secondary metabolism: 5 genes

  m) DNA metabolism: 96 genes

  n) Fatty acids, lipids, and isoprenoids: 203 genes

  o) Nitrogen metabolism: 23 genes

  p) Dormancy and sporulation: 1 gene

  q) Respiration: 114 genes

  r) Stress response: 49 genes

  s) Metabolism of aromatic compounds: 62 genes

  t) Amino acid and derivatives: 364 genes

  u) Sulfur metabolism: 13 genes

  v) Phosphorus metabolism: 24 genes

  w) Carbohydrates: 248 genes
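
The subsystem counts listed above can be tallied to obtain the proportions that a pie chart of this kind would display. The sketch below simply transcribes the counts as printed (including the category label "acquisition and metabolism" exactly as it appears) and computes each category's share of the listed genes; it is an illustration, not part of the RAST pipeline.

```python
# Gene counts per RAST subsystem category, transcribed from the list above.
SUBSYSTEM_GENES = {
    "Cofactors, vitamins, prosthetic groups, and pigments": 185,
    "Cell wall and capsule": 24,
    "Virulence, disease, and defense": 58,
    "Potassium metabolism": 8,
    "Miscellaneous": 34,
    "Membrane transport": 36,
    "acquisition and metabolism": 7,  # label as printed in the source
    "RNA metabolism": 55,
    "Nucleosides and nucleotides": 79,
    "Protein metabolism": 186,
    "Regulation and cell signaling": 16,
    "Secondary metabolism": 5,
    "DNA metabolism": 96,
    "Fatty acids, lipids, and isoprenoids": 203,
    "Nitrogen metabolism": 23,
    "Dormancy and sporulation": 1,
    "Respiration": 114,
    "Stress response": 49,
    "Metabolism of aromatic compounds": 62,
    "Amino acid and derivatives": 364,
    "Sulfur metabolism": 13,
    "Phosphorus metabolism": 24,
    "Carbohydrates": 248,
}


def category_shares(counts: dict[str, int]) -> dict[str, float]:
    """Return each category's percentage of the total listed genes."""
    total = sum(counts.values())
    return {name: 100 * n / total for name, n in counts.items()}


if __name__ == "__main__":
    shares = category_shares(SUBSYSTEM_GENES)
    for name, pct in sorted(shares.items(), key=lambda kv: kv[1], reverse=True)[:5]:
        print(f"{name}: {SUBSYSTEM_GENES[name]} genes ({pct:.1f}%)")
```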

Table 6 Decoding N. farcinica for a comprehensive overview of genomic insights
Fig. 10

Subsystem distribution in different categories of Nocardia farcinica. Subsystem coverage indicates the total genes in the subsystems. Each part of the pie graph represents different functions and proportions of genes

Identification of antimicrobial resistance and virulence genes

CARD (Comprehensive Antibiotic Resistance Database) analysis of the N. farcinica genome revealed genes associated with rifamycin resistance (RbpA), macrolide and penam resistance (mtrA), and penam resistance (FAR-1). ResFinder predicted the resistance gene blaFAR-1 (blaFAR-1_1). The NCBI database predicted genes responsible for rifampin resistance (rox) and beta-lactam resistance (blaFAR-1). The virulence factors relA, icl, and mbtH were found in VFDB. The genome of N. farcinica thus carries both virulence and resistance genes (Table 7 and Fig. 11).
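
Per-database results of this kind can be collected from ABRicate's tab-separated report with a short script. The sketch below assumes that the report contains columns named GENE, DATABASE, and %IDENTITY, as in ABRicate's documented output; the helper name `genes_by_database` is hypothetical. The gene names in the example come from the text above, while the identity values are placeholders rather than results from this study.

```python
import csv
import io
from collections import defaultdict


def genes_by_database(tsv_text: str, min_identity: float = 0.0) -> dict[str, list[str]]:
    """Group gene names by source database from an ABRicate-style TSV report."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    grouped: dict[str, set[str]] = defaultdict(set)
    for row in reader:
        if float(row["%IDENTITY"]) >= min_identity:
            grouped[row["DATABASE"]].add(row["GENE"])
    return {db: sorted(genes) for db, genes in grouped.items()}


# Illustrative report: only three columns are shown, and the identity
# values are placeholders, not values reported in the study.
EXAMPLE_REPORT = (
    "GENE\t%IDENTITY\tDATABASE\n"
    "rox\t100.0\tncbi\n"
    "blaFAR-1\t100.0\tncbi\n"
    "relA\t100.0\tvfdb\n"
    "icl\t100.0\tvfdb\n"
    "mbtH\t100.0\tvfdb\n"
)

if __name__ == "__main__":
    for database, genes in genes_by_database(EXAMPLE_REPORT).items():
        print(database, genes)
```

A real report has further columns (for example contig name and coordinates), which `csv.DictReader` reads alongside the three used here.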

Table 7 Exploration of antimicrobial resistance and virulence genes for a comprehensive understanding of microbial defense mechanisms
Fig. 11

Distribution of antibiotic resistance and virulence genes in the genome of N. farcinica. The distribution pattern provides insights into the genetic elements contributing to antibiotic resistance and virulence in this bacterium

Plasmid identification and genetic analysis of plasmids

The PLSDB server v2.1.1 was used to detect contigs that could belong to a plasmid. The results showed that N. farcinica had a chromosome of 6.1 Mb and one predicted plasmid (Accession: NZ_LN868939.1) with a sequence length of 2,657,929 bp (Table 8). The Proksee server was used to visualize the plasmid map of N. farcinica (Fig. 12).

Table 8 Identification of plasmids in N. farcinica utilizing PLSDB Server
Fig. 12

Plasmid map of Nocardia farcinica generated with the Proksee tool. The annotated map depicts key genetic elements, including coding sequences, structural features, and functional annotations associated with the identified plasmid

Prophage regions in the genome

Three prophage regions were identified in the chromosome using the PHASTER tool. The results are as follows.

  1. Sequence “>2” (length = 928,982 bp, depth = 1.08×): 0 intact, 0 questionable, and 1 incomplete region are present in the chromosome (Fig. 13). Prophage region 1 has start 596,655, end 604,096, 8 CDSs, a predicted type of incomplete, and a GC content of 68.23%. PHASTER classified this region as incomplete because it scored <70. The gene functions of region 1 were found to be essential for phage activity, with a phage-like protein, a hypothetical protein, and a fiber protein identified as playing a crucial role.

  2. Sequence “>3” (length = 775,693 bp, depth = 0.94×): 0 intact, 0 questionable, and 1 incomplete region are present in the chromosome (Fig. 14). Prophage region 1 has start 76,665, end 85,470, 8 CDSs, a predicted type of incomplete, and a GC content of 66.90%. PHASTER classified this region as incomplete because it scored <70. The gene functions of region 1 were found to be essential for phage activity, with 3 phage-like proteins and 5 hypothetical proteins identified as playing a crucial role.

  3. Sequence “>6” (length = 274,970 bp, depth = 1.00×): 0 intact, 0 questionable, and 1 incomplete region are present in the chromosome (Fig. 15). Prophage region 1 has start 202,280, end 210,331, 10 CDSs, a predicted type of incomplete, and a GC content of 72.62%. PHASTER classified this region as incomplete because it scored <70. The gene functions of region 1 were found to be necessary for phage activity, with 2 phage-like proteins and 8 hypothetical proteins identified as playing a crucial role.
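The completeness categories used above follow the score bands given in the captions of Figs. 13 to 15 (intact > 90, questionable 70–90, incomplete < 70). The following minimal sketch shows that rule and derives region lengths from the reported coordinates. The function names and the dictionary layout are illustrative assumptions. The individual PHASTER scores are not reported in the text, so none are included.

```python
def classify_prophage(score: float) -> str:
    """Map a PHASTER completeness score to the category used in the figure captions."""
    if score > 90:
        return "intact"
    if score >= 70:
        return "questionable"
    return "incomplete"


def region_length(start: int, end: int) -> int:
    """Length in bp of a region given inclusive, 1-based start and end coordinates."""
    return end - start + 1


# Coordinates, CDS counts, and GC content as reported for the three regions.
# The dictionary layout is an assumption made for illustration.
REGIONS = [
    {"sequence": ">2", "start": 596_655, "end": 604_096, "cds": 8, "gc_percent": 68.23},
    {"sequence": ">3", "start": 76_665, "end": 85_470, "cds": 8, "gc_percent": 66.90},
    {"sequence": ">6", "start": 202_280, "end": 210_331, "cds": 10, "gc_percent": 72.62},
]

if __name__ == "__main__":
    for region in REGIONS:
        length = region_length(region["start"], region["end"])
        print(f'{region["sequence"]}: {length} bp, {region["cds"]} CDSs')
```

Under the assumption of inclusive coordinates, the three regions span roughly 7.4 to 8.8 kb each, which is small relative to the sequences that carry them.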

Fig. 13

a One prophage region is located in the chromosome. Green indicates intact prophage regions (score > 90), blue represents questionable prophage regions (score 70–90), and red specifies incomplete prophage regions (score < 70). b Structure of the identified prophage region. Genes are colored based on the predicted functions (color figure online)

Fig. 14

a One prophage region is located in the chromosome. Green indicates intact prophage regions (score > 90), blue represents questionable prophage regions (score 70–90), and red specifies incomplete prophage regions (score < 70). b Structure of the identified prophage region. Genes are colored based on the predicted functions (color figure online)

Fig. 15

a One prophage region is located in the chromosome. Green indicates intact prophage regions (score > 90), blue represents questionable prophage regions (score 70–90), and red specifies incomplete prophage regions (score < 70). b Structure of the identified prophage region. Genes are colored based on the predicted functions (color figure online)

Annotation of DNA–DNA Hybridization

TYGS was used to infer the taxonomic status of the isolate within the bacterial dataset. Formulae d0 and d6 measure similarity in gene content, whereas formula d4 reports similarity based on sequence identity. The whole-genome phylogeny showed that the draft bacterial genome (assembly.fasta) was closely related to the N. farcinica NBRC 15532 strain (Fig. 16).

Fig. 16

DNA–DNA hybridization identification using the Type Strain Genome Server (TYGS). The whole-genome-based phylogenetic tree built with TYGS was inferred using FastME 2.1.4 with a BioNJ starting tree and subtree pruning and regrafting postprocessing

Average nucleotide identity (ANI)

A similarity analysis between the assembled genome and three reference genomes (N. farcinica NCTC1134, N. farcinica DSM 43257, and N. farcinica NBRC 15532) was carried out using FastANI, which showed identities of 99.0739%, 99.0679%, and 99.0454%, respectively. Each red line segment denotes a reciprocal mapping between the query and reference genome, indicating their evolutionarily conserved regions (Fig. 17a–c).
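To put the ANI values in context, the following minimal sketch ranks the reference genomes and compares each value with a species cutoff. The 95% cutoff is a commonly cited ANI boundary for species delineation and is an assumption here; the text does not state a threshold. The function names are illustrative.

```python
# ANI values (%) as reported for the assembled genome against each reference.
ANI_PERCENT = {
    "N. farcinica NCTC1134": 99.0739,
    "N. farcinica DSM 43257": 99.0679,
    "N. farcinica NBRC 15532": 99.0454,
}

# Assumption: a commonly cited ANI cutoff for species delineation.
# This value is not taken from the study.
SPECIES_ANI_CUTOFF = 95.0


def rank_references(ani_percent):
    """Return (reference, ANI) pairs ordered from highest to lowest identity."""
    return sorted(ani_percent.items(), key=lambda pair: pair[1], reverse=True)


def above_cutoff(ani_value, cutoff=SPECIES_ANI_CUTOFF):
    """True when the ANI value meets or exceeds the chosen species cutoff."""
    return ani_value >= cutoff


if __name__ == "__main__":
    for reference, ani in rank_references(ANI_PERCENT):
        print(f"{reference}: {ani:.4f}% (above cutoff: {above_cutoff(ani)})")
```

All three values exceed the assumed cutoff by a wide margin, and they differ from one another by less than 0.03 percentage points.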

Fig. 17

Similarity analysis between the assembled genome and reference strains. Panels a, c, and e show the query; b shows N. farcinica NCTC1134, d shows N. farcinica DSM 43257, and f shows N. farcinica NBRC 15532. Average nucleotide identity (FastANI) was determined using Proksee. Each red line segment represents a reciprocal mapping between the two genomes, showing their evolutionarily conserved sequences

Identification of gene clusters involved in secondary metabolite synthesis

Gene clusters involved in bioactive compound synthesis were identified, comprising the following types: 4 terpene, ranthipeptide, NRP-metallophore, 12 NRPS, 2 NRPS-like, 4 T1PKS, hglE-KS, ectoine, redox-cofactor, amino polycarboxylic acid, 2 T3PKS, NAPAA, aryl polyene, RiPP-like, and furan (Table 9).

Table 9 Identification of secondary metabolite gene clusters in N. farcinica using antiSMASH
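The cluster types listed above can be tallied with a short script. This is a minimal sketch that parses the list as written in the text, not antiSMASH output. It assumes that a type listed without a number occurs once; Table 9 should be consulted for the authoritative counts. The function name `parse_cluster_counts` is illustrative.

```python
import re

# Cluster types as listed in the text, with counts where the text gives them.
LISTED_TYPES = (
    "4 terpene, ranthipeptide, NRP-metallophore, 12 NRPS, 2 NRPS-like, "
    "4 T1PKS, hglE-KS, ectoine, redox-cofactor, amino polycarboxylic acid, "
    "2 T3PKS, NAPAA, aryl polyene, RiPP-like, furan"
)


def parse_cluster_counts(listed: str) -> dict:
    """Parse 'N type' items; a type without a leading number is assumed to occur once."""
    counts = {}
    for item in listed.split(","):
        item = item.strip()
        match = re.fullmatch(r"(\d+)\s+(.+)", item)
        if match:
            counts[match.group(2)] = int(match.group(1))
        else:
            counts[item] = 1  # assumption: unnumbered types occur once
    return counts


if __name__ == "__main__":
    counts = parse_cluster_counts(LISTED_TYPES)
    for cluster_type, n in sorted(counts.items(), key=lambda kv: -kv[1]):
        print(f"{cluster_type}: {n}")
    print("total clusters (under the stated assumption):", sum(counts.values()))
```

Under this assumption, NRPS clusters are the most numerous type, in line with the emphasis on non-ribosomal peptide synthesis in the paragraphs that follow.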

Terpene: Terpene compounds in Nocardia have been found to have antibiotic and cytostatic properties. Terpenoids and meroterpenoids produced by actinomycetes that have significant antibacterial activity are of interest as a source of new antibiotics effective against drug-resistant pathogenic bacteria. The synthesis of terpenes and terpenoids is an important pathway in the secondary metabolism of actinomycetes.

NRPS and RiPP-like:

Nocardia genomes can generate a variety of secondary metabolites, siderophores, antibiotics, and other small bioactive molecules due to the high number of non-ribosomal peptide synthetases (NRPS) in each strain.

Both RiPPs (ribosomally synthesized and post-translationally modified peptides) and NRPS (non-ribosomal peptide synthetase) products are produced by Nocardia strains.

Aryl polyene:

Aryl polyenes protect against potential damage resulting from exposure to reactive oxygen species.

Discussion

Nocardia is a facultative intracellular pathogen that infects the brain and lungs of immunosuppressed patients, with potentially fatal consequences. Nocardia is difficult to identify because of its prolonged incubation time, and infections often go undiagnosed, unrecognized, and neglected because of their non-specific symptoms. Nocardia can enter and survive within host cells, including macrophages and epithelial cells, and resist the host immune system by generating several virulence factors, including hemolysin and superoxide dismutase (Ji et al. 2022). The use of potent antibiotics to treat N. farcinica leads to adverse side effects and paves the way for drug resistance. There is an urgent need to identify alternative drug targets for combating multi-drug resistance. This research aims to design potent inhibitors of N. farcinica infections while minimizing or eliminating side effects. As previously reported by Ji et al. (2022), Nocardia virulence factors such as mce, hbha, and nfa34810 play a significant role in adhesion and invasion. Complete genome sequence data indicated that the genome of N. farcinica contains many inducible virulence genes, including catalases and nbt, which may play important roles during the infection process (Ji et al. 2022). This study undertook whole-genome sequencing of N. farcinica strains resistant to commonly employed antibiotics. The analysis revealed six antibiotic resistance genes: RbpA, mtrA, FAR-1, rox, blaFAR-1, and blaFAR-1_1. The resistance phenotype of the RbpA gene is resistance to rifampicin (rifampin). Rifampicin inhibits DNA-dependent RNA polymerase, leading to the suppression of RNA synthesis and cell death (Newell et al. 2006; Hu et al. 2012; Wang et al. 2020). FAR-1 is a class A β-lactamase gene, and its resistance phenotype is resistance to penicillin antibiotics.
All penicillins are beta-lactam antibiotics in the penam sub-group and are used in the treatment of bacterial infections caused by Gram-positive organisms. The resistance mechanism of FAR-1 is antibiotic inactivation (De Pascale and Wright 2010; Wang et al. 2020). The mtrA gene is associated with resistance to macrolide antibiotics, particularly erythromycin, through a drug efflux mechanism (Wang et al. 2020; Sun et al. 2014). In N. farcinica, Rox confers resistance to rifampicin by inactivating the antibiotic through a monooxygenase mechanism that converts it into a new metabolite (Hoshino et al. 2010). The resistance phenotype of the blaFAR-1 gene is beta-lactam resistance. Notably, penicillin resistance was associated with the presence of the blaFAR-1 gene (Wang et al. 2020). Moreover, whole-genome sequencing of multidrug-resistant N. farcinica revealed three virulence factor genes: relA, icl, and mbtH. In this era of antibiotic resistance, our findings underscore the need to explore novel approaches and strategies to combat AMR. This study links genetic detail to therapeutic challenges by elucidating the genetic basis of multi-drug resistance in N. farcinica. Additionally, the identified virulence factors give insight into how the pathogen survives in the host. In summary, this research is a step toward addressing the pressing need for new interventions against N. farcinica infections. It examines both drug resistance and virulence factors, providing insights that could support the development of new therapies. With this understanding, the goal is to identify potent lead compounds that overcome the multi-drug resistance of N. farcinica.
The significance of Nocardia species lies in their emergence as human pathogens that closely resemble other mycolic acid-containing genera of the order Actinomycetales, particularly Mycobacterium tuberculosis. As a result, misdiagnosis and treatment failure are possible in clinical settings. This research aimed to address these gaps by using a collection of clinical isolates and investigating potential identification approaches, antimicrobial susceptibility patterns, and resistance mechanisms in Nocardia species. The outcomes of this research will play an important role in improving the diagnosis and effective treatment of infections caused by Nocardia species. Additionally, this research aims to establish clear and standardized criteria for assessing antimicrobial susceptibility, aiding more precise interpretation.

Conclusion

The present study has characterized the complex genetic makeup of N. farcinica through genome analysis. It has identified important genetic components, such as the putative virulence factors relA, icl, and mbtH, and has shed light on the pathogen’s versatility by identifying genes for resistance to beta-lactams, macrolides, penams, and rifampin. The genetically closest match was predicted to be the N. farcinica NBRC 15532 strain. The complete genome characterization also offers prospective directions for the development of novel drug targets by providing insights into the clinically isolated N. farcinica sample. Whole-genome sequencing technology aids the investigation of the molecular characteristics and resistance mechanisms of Nocardia species. In the future, subtractive genomics, comparative genomics, and reverse vaccinology approaches will be used to predict pathogen-specific potent drug and vaccine targets within significant metabolic pathways (Khan et al. 2023; Afzal et al. 2023). Furthermore, virtual screening strategies will be employed to identify therapeutic drug targets in future studies (Irfan et al. 2023; Hassan et al. 2023; Hassan et al. 2023).