Introduction

Natural products have served as a major foundation for drug development since the golden age of antibiotics in the mid-twentieth century (Rossiter et al. 2017). A significant number of natural product drugs and leads are derived from microbes (Barzkar et al. 2019; Cragg and Newman 2013; Fenical 2020; Newman and Cragg 2016). Actinomycetes, particularly the genus Streptomyces, are an important microbial resource that produces antibiotics and other active compounds. To date, approximately 50% of microbial natural products have been derived from actinomycetes (Yang et al. 2019). Secondary metabolites from actinomycetes commonly possess antibacterial, antiviral, antitumor, or anti-inflammatory activities. However, the unchecked use of known drugs and the increasing emergence of resistance to them (Asbell et al. 2020; Rossiter et al. 2017) demonstrate that new drugs are critical for modern medicine.

Genomic sequencing and bioinformatic analysis have revealed that each Streptomyces strain contains a wide range of secondary metabolite biosynthetic gene clusters (BGCs), indicating potential sources of novel bioactive agents (Lee et al. 2020). However, most of these gene clusters are cryptic under typical growth conditions because they are tightly controlled in response to direct or indirect environmental signals (Bentley et al. 2002; Challinor and Bode 2015; Ohnishi et al. 2008; Zhong et al. 2013). With the new tools recently developed in bioinformatics, analytics, and structural and molecular biology, various approaches have been devised for finding the products of silent BGCs. These strategies include varying growth conditions, manipulating global or pathway-specific regulators, epigenetic perturbation, heterologous expression, refactoring, and reporter-guided mutant selection (Guo et al. 2015; Tanaka et al. 2010; Zhang et al. 2017; Mao et al. 2018; Rutledge and Challis 2015). However, these strategies can only be applied on a case-by-case basis, and each has its own limitations.

The genome of each Streptomyces strain encodes numerous transcription factors, indicating that the biosynthesis of secondary metabolites in Streptomyces is controlled by subtle and precise regulatory systems (Liu et al. 2013; Xu and Yang 2019; Romero-Rodriguez et al. 2015). Manipulation of these regulators has been an effective strategy for improving yields of valuable secondary metabolites (Xia et al. 2020; Martin and Liras 2010). Among these numerous regulators, the family of Streptomyces antibiotic regulatory proteins (SARPs) has been found only in actinomycetes, and most of its members occur in Streptomyces (Romero-Rodriguez et al. 2015). SARPs have been shown to function mainly as activators of secondary metabolism, as exemplified by the well-characterized ActII-ORF4 and DnrI (Bruheim et al. 2002; Li et al. 2019; Tang et al. 1996).

In this work, by overexpressing the SARPs of Streptomyces tsukubaensis L20 and applying bioactivity-guided screening, we activated a silent BGC and identified a novel compound, tsukubarubicin. Bioactivity experiments showed that tsukubarubicin exhibited more potent antitumor activity than doxorubicin against the tested cell lines. We also identified the previously unreported gene cluster responsible for tsukubarubicin biosynthesis. This strategy may be applicable to many other silent BGCs and could streamline the discovery of new compounds.

Materials and methods

Bacterial strains and culture conditions

All strains used in this study are listed in Table 1. Escherichia coli TG1 was used for plasmid cloning. E. coli ET12567/pUZ8002 was used for conjugation to transfer plasmids into S. tsukubaensis L20 (China General Microbiological Culture Collection, CGMCC 11252). E. coli strains and Bacillus subtilis ATCC 67736 were cultured on LB agar plates or in LB liquid medium at 37 °C. All Streptomyces strains were grown on ISP4 solid medium (1% soluble starch, 0.1% K2HPO4, 0.1% MgSO4·7H2O, 0.1% NaCl, 0.2% (NH4)2SO4, 0.2% CaCO3, 0.0001% FeSO4·7H2O, 0.0001% MnCl2·7H2O, and 2% agar) for spore preparation or conjugation and in TSB (3% trypticase soy broth, w/v) for preparation of genomic DNA or as seed medium. YEME, R5 (Kieser et al. 2000), and ISP4 liquid media were used as fermentation media for production of the potential compounds. Streptomyces strains were cultured and fermented at 28 °C. Appropriate antibiotics were added to the medium when needed (ampicillin, 50 μg/mL; kanamycin, 25 μg/mL; apramycin, 50 μg/mL; and chloramphenicol, 30 μg/mL).

Table 1 Strains and plasmids used in this study

Plasmid construction

Plasmids and primers used in this work are listed in Table 1 and Table S1, respectively. The genes encoding putative SARPs were amplified using primer pairs 1 to 12. The fragments were then individually cloned into NdeI/NotI-digested pLM1 (Mao et al. 2009) using the ClonExpress II one-step cloning kit (Vazyme Biotech, Nanjing, China) and confirmed by sequencing, yielding the plasmids for gene overexpression. Two 1.5-kb DNA fragments flanking the gene tsuA were amplified from the genomic DNA of S. tsukubaensis L20 using primer pairs 13 and 14, respectively, and then cloned into EcoRI/HindIII-digested pKC1139 (Bierman et al. 1992), generating the disruption plasmid pKC1139-ΔPKS. The disruption plasmid pKC1139-ΔBO (used to delete the intergenic region between the genes fkbB and fkbO) was constructed in the same way, using primer pairs 15 and 16.

Construction of S. tsukubaensis strains

The overexpression plasmids including pLM107 (used for gene tsuR1 overexpression) were transformed into E. coli ET12567/pUZ8002 and then introduced into S. tsukubaensis L191 via intergeneric conjugation to get the various overexpression strains as described previously (Macneil and Klapko 1987; Zhang et al. 2016).

For gene deletion, the plasmid pKC1139-ΔBO was introduced into S. tsukubaensis L20 by conjugation as described above. Strains with single-crossover recombination were selected by culturing the transformants on ISP4 plates containing 50 μg/mL apramycin at 37 °C. Subsequently, after two rounds of sporulation on plates without antibiotics at 28 °C, double-crossover mutants (designated L191) were selected by their apramycin sensitivity and further confirmed by PCR. Similarly, the gene tsuA, which encodes subunits of the minimal polyketide synthase (PKS) for tsukubarubicin biosynthesis, was deleted in L1907, and the resulting strain was named L1907-ΔPKS.

Fermentation and analysis of products from S. tsukubaensis strains

The strains were cultured on ISP4 plates for about 7–10 days at 28 °C for sporulation. The spores were inoculated into 30-mL TSB medium (seed medium) in 250-mL flasks and then cultured at 28 °C and 220 rpm for 24 h. For shake-flask fermentation, the seed culture was inoculated into 30-mL fermentation medium, giving a final OD600 of 0.4, and cultured at 28 °C and 220 rpm for 168 h. For fermentation in a 15-L fermenter, the seed culture (200 mL) was then transferred into the fermenter containing R5 medium (8 L). The fermentation was carried out at 28 °C and 200 rpm for 7 days with an air-flow rate of one volume per volume of media per minute.
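The inoculation and aeration settings above reduce to simple arithmetic. The following is a minimal sketch, not part of the original protocol: the function names are illustrative, the seed-culture OD600 of 6.0 is a hypothetical value, and the 30 mL is treated as the final culture volume.

```python
def seed_volume_ml(target_od: float, final_volume_ml: float, seed_od: float) -> float:
    """Seed volume that gives target_od in final_volume_ml (C1*V1 = C2*V2)."""
    if seed_od <= target_od:
        raise ValueError("seed culture must be denser than the target OD600")
    return target_od * final_volume_ml / seed_od


def air_flow_l_per_min(vvm: float, medium_volume_l: float) -> float:
    """Air flow for a given vvm (volume of air per volume of medium per minute)."""
    return vvm * medium_volume_l


# OD600 0.4 and 30 mL come from the text; the seed OD600 of 6.0 is hypothetical.
print(seed_volume_ml(0.4, 30.0, 6.0))   # mL of seed culture to add
# 1 vvm and 8 L of R5 medium come from the text.
print(air_flow_l_per_min(1.0, 8.0))     # L of air per minute
```

In practice the seed OD600 would be measured for each batch before inoculation.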

For compound analysis, the crude extracts were injected into a high-performance liquid chromatography (HPLC) system (Agilent, Palo Alto, CA, USA) equipped with an Eclipse Plus C18 column (5 μm, 4.6 × 150 mm). The mobile phase consisted of H2O containing 0.1% formic acid and acetonitrile, with a linear gradient from 5 to 55% (v/v) acetonitrile over 25 min; the column was then washed with 90% acetonitrile for 10 min and equilibrated with 5% acetonitrile for 10 min.
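The elution program can be written as a piecewise function of time. This is a sketch only: it assumes that the transitions between gradient, wash, and equilibration are instantaneous steps, which the text does not specify, and the function name is illustrative.

```python
def acetonitrile_percent(t_min: float) -> float:
    """Percent (v/v) acetonitrile at time t_min for the 45-min program."""
    if t_min < 0 or t_min > 45:
        raise ValueError("time outside the 45-min program")
    if t_min <= 25:
        # Linear gradient from 5% to 55% over the first 25 min.
        return 5.0 + (55.0 - 5.0) * t_min / 25.0
    if t_min <= 35:
        # Column wash: 10 min at 90%.
        return 90.0
    # Re-equilibration: 10 min at 5%.
    return 5.0


for t in (0, 12.5, 25, 30, 40):
    print(t, acetonitrile_percent(t))
```

The gradient slope is 2% acetonitrile per minute, which can help when translating the method to a column of different dimensions.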

Purification of tsukubarubicin

After 7 days, 15 L of culture was obtained, and the fermentation broth was harvested by centrifugation at 4000×g for 10 min. The broth was then extracted three times with an equal volume of dichloromethane at pH 8.5. The extracts were combined and evaporated to dryness under vacuum at 35 °C, affording 11 g of dark reddish powder. The powder was then dissolved in methanol (10 mL) and chromatographed on a silica gel column (40 × 220 mm), which was eluted with a mixture of dichloromethane and methanol (4:1) to yield 800 mg of crude powder. This powder was further purified by reversed-phase column chromatography (35 × 400 mm column), eluting with 40%, 50%, 60%, and 70% methanol (two bed volumes at each concentration) to afford 24 fractions. The fractions were analyzed by HPLC as described above, and fractions containing pure tsukubarubicin were pooled and dried to obtain 10.3 mg of reddish powder (tsukubarubicin).

Structure elucidation

ESI-MS was conducted on a triple quadrupole mass spectrometer coupled with an Agilent 1290 HPLC system (Agilent, Palo Alto, CA, USA). 1H and 13C (DEPT-135) NMR, 1H-1H COSY, NOESY, HSQC, and HMBC spectra were recorded on an NMR spectrometer as described previously (Zhou et al. 2015). CD3OD was used as the solvent for NMR experiments, and chemical shifts were referenced to the solvent peaks. The NMR data are shown in Table S2 and Figs. S1–S6.

Biological activity assays

For the antimicrobial activity bioassay, E. coli TG1 and B. subtilis ATCC 67736 were used as indicator strains. The strains were grown overnight in LB liquid medium at 37 °C and then spread on LB agar plates. The Streptomyces strains were fermented in YEME medium for 7 days, and equal volumes (1 mL) of each culture were harvested and mixed with an equal volume of methanol to disrupt the cells. The supernatants were then evaporated to dryness to obtain the crude extract samples. The antibacterial activities of the crude extract samples, each dissolved in the same volume (40 μL) of DMSO, were assessed as zones of growth inhibition after overnight incubation at 37 °C on LB agar plates spread with E. coli TG1 or B. subtilis ATCC 67736. DMSO and the crude extract sample from strain L191 were used as negative controls.

All the cancer cell lines (MGC803, MDA-MB-231, A549, and HCT116) were obtained from the National Collection of Authenticated Cell Cultures (Shanghai, China). MGC803 and MDA-MB-231 cells were cultured in RPMI-1640 medium, and A549 and HCT116 cells were cultured in Dulbecco’s modified Eagle’s medium (DMEM) and McCoy’s 5A medium, respectively. All the cells were cultured at 37 °C in a humidified atmosphere with 5% CO2. Cell suspension (100 μL, 2 × 10³ cells/well) was added to 96-well microtiter plates and incubated for 24 h. The cells were then treated with different concentrations of compounds (0–2000 nM, 10 μL) for 48 h. After addition of 10 μL of CCK-8 solution (Vazyme Biotech, Nanjing, China) to each well, cells were incubated for another 2 h according to the manufacturer’s instructions. The absorbance was measured at 450 nm using a microplate reader. The percentage of cell viability was plotted against concentration using GraphPad Prism (version 6.02), and the IC50 values were calculated by non-linear curve fitting. The clinical anticancer agent doxorubicin was used as a reference compound, and 0.1% DMSO (final concentration) was used as a negative control. Each assay was repeated three times. Statistical significance of differences between test and control data was calculated by Student’s t test.
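To illustrate how viability percentages and an IC50 estimate follow from absorbance readings, the sketch below uses log-linear interpolation between the two concentrations that bracket 50% viability. This is a simpler stand-in for the non-linear curve fitting performed in GraphPad Prism, not the procedure used in this study. The blank subtraction, the function names, and the dose-response values are all assumptions made for illustration.

```python
import math


def percent_viability(a_sample: float, a_control: float, a_blank: float = 0.0) -> float:
    """Viability relative to the vehicle control, from A450 readings."""
    return 100.0 * (a_sample - a_blank) / (a_control - a_blank)


def ic50_by_interpolation(concs_nm, viabilities):
    """Estimate IC50 by linear interpolation on log10(concentration).

    Expects concentrations > 0 in ascending order. Returns None when the
    curve never crosses 50% viability within the tested range.
    """
    points = list(zip(concs_nm, viabilities))
    for (c_lo, v_lo), (c_hi, v_hi) in zip(points, points[1:]):
        if v_lo >= 50.0 >= v_hi and v_lo != v_hi:
            frac = (v_lo - 50.0) / (v_lo - v_hi)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10.0 ** log_ic50
    return None


# Hypothetical dose-response data (nM, % viability), for illustration only.
concs = [1, 10, 100, 1000]
viab = [90.0, 60.0, 40.0, 10.0]
print(ic50_by_interpolation(concs, viab))
```

Interpolation uses only two data points, so a four-parameter logistic fit over the full curve, as done in dedicated software, is generally preferable for reported values.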

RNA analysis by quantitative real-time PCR

Total RNA of S. tsukubaensis L20 and its derivative strains was extracted from R5 medium cultures after 36 h, using EASYspin Plus bacteria RNA extract kit (Aidlab Biotech, Beijing, China) according to the manufacturer’s instructions. Genomic DNA was removed with RNase-free DNase I (TaKaRa, Tokyo, Japan). The cDNA was then synthesized using M-MLV reverse transcriptase according to the protocol (TaKaRa, Tokyo, Japan). Subsequently, quantitative real-time polymerase chain reaction (qRT-PCR) was performed using SYBR Premix Ex Taq II (TaKaRa, Tokyo, Japan) in 20 μL volume following the manufacturer’s instructions. The sigma factor gene hrdB was used as an internal control to normalize the transcriptional levels. The fold changes of the transcriptional levels were calculated by the comparative Ct method according to the manufacturer’s protocol (TaKaRa, Tokyo, Japan). The transcription levels of the corresponding genes in the strain L191 were defined as having a relative value of 1. The software GraphPad Prism (version 6.02) was used to analyze qRT-PCR data. To compare the difference between the test and control data, P values were calculated by Student’s t test. PCR primers used here are listed in Table S1. Each experiment was performed in triplicate.
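The comparative Ct calculation can be sketched as follows, assuming the standard 2^(-ΔΔCt) form with roughly 100% amplification efficiency for both the target gene and hrdB. The function name and the Ct values in the example are hypothetical.

```python
def relative_expression(ct_target_test: float, ct_ref_test: float,
                        ct_target_ctrl: float, ct_ref_ctrl: float) -> float:
    """Fold change by the comparative Ct (2^-ddCt) method.

    Assumes ~100% amplification efficiency for target and reference genes.
    """
    d_ct_test = ct_target_test - ct_ref_test   # normalize to hrdB in the test strain
    d_ct_ctrl = ct_target_ctrl - ct_ref_ctrl   # normalize to hrdB in the control strain
    dd_ct = d_ct_test - d_ct_ctrl
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values: test strain (e.g. L1907) versus control strain (L191).
print(relative_expression(20.0, 18.0, 25.0, 18.0))
```

With this definition the control strain compared against itself always gives a relative value of 1, matching the normalization described above.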

Bioinformatics analysis of the SARPs and the tsu gene cluster

Using the well-characterized SARPs ActII-ORF4 and DnrI as queries, putative SARPs in S. tsukubaensis L20 were predicted by local BLASTp analysis using BLAST+ software (version 2.11.0+, ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/) (Camacho et al. 2009) against a reference database built from the proteins encoded by all the genes in strain L20, following the detailed user manual (https://www.ncbi.nlm.nih.gov/books/NBK279690/). With the thresholds set to an E value below 1 × 10⁻¹⁰ and 30% identity, 12 SARPs in strain L20 were selected. These SARPs were further analyzed with the online program BLASTP (version 2.11.0+, https://blast.ncbi.nlm.nih.gov/Blast.cgi) against the non-redundant protein sequence database of NCBI to identify their predicted functions and homologous proteins (Altschul et al. 1997).
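The hit-selection step can be sketched as a filter over BLAST+ tabular output (the default 12-column `-outfmt 6` layout). This is an illustration under stated assumptions rather than the script used in this study: the rows below are invented, the function name is hypothetical, and the identity threshold is read as "at least 30%".

```python
def filter_blast_hits(lines, max_evalue=1e-10, min_identity=30.0):
    """Return subject IDs that pass the E-value and identity thresholds.

    Expects BLAST+ tabular output (-outfmt 6) with the default columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    hits = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        sseqid, pident, evalue = fields[1], float(fields[2]), float(fields[10])
        if evalue < max_evalue and pident >= min_identity:
            hits.add(sseqid)
    return sorted(hits)


# Invented example rows; only the identifier orf02677 appears in the text.
rows = [
    "ActII-ORF4\torf02677\t45.2\t250\t130\t3\t1\t248\t5\t252\t3e-60\t195",
    "DnrI\torf02677\t41.0\t240\t135\t4\t2\t240\t6\t244\t1e-52\t172",
    "DnrI\torf00001\t28.5\t120\t80\t2\t10\t128\t15\t133\t2e-12\t60",
    "ActII-ORF4\torf00002\t35.0\t90\t55\t1\t3\t92\t8\t97\t1e-05\t40",
]
print(filter_blast_hits(rows))
```

Collecting subject IDs in a set means a protein hit by both queries is counted once.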

For prediction of BGCs within strain L20, the genome sequence of strain L20 (GenBank accession number CP070379) was analyzed by antiSMASH (antibiotic and secondary metabolite analysis shell) following instructions of the website (https://antismash.secondarymetabolites.org/). Among the predicted gene clusters, only the gene cluster tsu possessed both one type II polyketide synthase (PKS) and five glycosyltransferases which was consistent with the structure of tsukubarubicin. The gene cluster was further verified by gene deletion. Subsequently, detailed function annotations of proteins encoded by the tsu gene cluster were conducted based on the online program BLASTP (version 2.11.0+, https://blast.ncbi.nlm.nih.gov/Blast.cgi) as described above.

Accession number of nucleic acid sequence

The genome sequence of S. tsukubaensis L20 has been deposited in GenBank (accession number CP070379). The DNA sequence of the tsu gene cluster has been deposited in GenBank (accession number MW561258).

Results

Activation of the tsukubarubicin biosynthetic gene cluster

The putative uncharacterized SARPs were identified among the proteins encoded by S. tsukubaensis L20 via local BLASTp analysis using the well-characterized ActII-ORF4 and DnrI as prototypes (Table 2). In parallel, FK506 production was abolished by deleting the intergenic region between fkbB and fkbO to create a cleaner background and increase the sensitivity of phenotypic screens, yielding the strain L191. To examine the functions of these SARPs, the genes encoding them were individually cloned into the integrative vector pLM1 under the control of the constitutive ermEp* promoter. The resultant plasmids were individually integrated into the chromosome of S. tsukubaensis L191 to create the recombinant strains.

Table 2 Putative SARPs obtained from strain S. tsukubaensis L20

The strains were then cultured on ISP4 agar plates for 7–10 days and further fermented in YEME medium for 7 days. The crude extracts of the strains were subjected to bioactivity assays against gram-positive bacteria (B. subtilis ATCC 67736) and gram-negative bacteria (E. coli TG1), using extracts from the parental strain L191 as a negative control. Among the recombinant strains, the extract from strain L1907, which overexpresses the gene orf02677 (namely tsuR1), produced an obvious growth inhibition zone against B. subtilis ATCC 67736, whereas no obvious inhibition zone was found for the extract from the parental strain L191 (Fig. 1a). Furthermore, strain L1907 produced a red substance when cultured either in liquid YEME medium or on ISP4 agar plates (Fig. 1b). Comparison of the HPLC profiles of the extracts from strains L191 and L1907 revealed distinctive peaks in L1907, implying that they might be new products generated by strain L1907, since no other obvious peaks were detected under the same conditions (Fig. 1c). The absorption spectrum of the main peak (tsukubarubicin) is also shown in Fig. 1d.

Fig. 1

Bioassays, phenotypes, and HPLC analysis of strains L1907 and L191. a The bioactivity of crude extracts against B. subtilis ATCC 67736. DMSO and the crude extract sample from strain L191 were used as negative controls. b Comparison of phenotypes between strains L1907 and L191 cultured on ISP4 agar plates and in YEME liquid medium, respectively. c HPLC analysis of the crude extracts from strains L191 (1), L1907-ΔPKS (2), and L1907 (3). The detection wavelength was 500 nm. d The absorption spectrum of tsukubarubicin (star)

Isolation and structure elucidation of tsukubarubicin

The observed bioactivity and distinctive peaks prompted us to isolate the potential novel compounds. Different fermentation media (YEME, ISP4, and R5 liquid media) were evaluated to determine the best culture conditions for producing the new products. R5 medium, which gave the highest production as determined by HPLC, was chosen as the fermentation medium (Fig. 2). Next, scale-up fermentation was carried out. The fermentation broth (15 L) was harvested by centrifugation, adjusted to pH 8.5, and extracted three times with dichloromethane. The organic phase was then concentrated and evaporated to dryness under vacuum at 35 °C. The powder was dissolved in methanol and subjected to silica gel column chromatography and reversed-phase column chromatography to obtain 10.3 mg of reddish powder (tsukubarubicin).

Fig. 2

HPLC analysis of tsukubarubicin production in different media. Strain L1907 was fermented in ISP4, YEME, and R5 medium. Equal volumes of fermentation broth from the different media were harvested and subjected to HPLC analysis. The detection wavelength was 500 nm. (a) ISP4 liquid medium, the yield was 15 mg/L. (b) YEME liquid medium, the yield was 28 mg/L. (c) R5 liquid medium, the yield was 45 mg/L

HPLC-ESI-MS of tsukubarubicin displayed an ion at m/z 942.50 [M+H]+ in positive-ion mode. Its molecular formula was determined as C48H67N3O16 from the mass spectrum and further NMR analysis. Based on analysis of the 1H-NMR and 13C-NMR spectra (Table S2), the data for tsukubarubicin showed high similarity to those of the known compound avidinorubicin (Aoki et al. 1991), except for the absence of the succinyl and L-rhodosaminyl signals, indicating that tsukubarubicin may be a derivative of avidinorubicin. This hypothesis was further supported by the molecular formula and detailed analysis of the key 1H-1H COSY and HMBC correlations (Fig. 3a). In addition, the key NOESY correlations were summarized to confirm the stereochemistry of the sugar moieties (Fig. 3b). Therefore, by combining all NMR data with the previously reported literature, the structure of tsukubarubicin was elucidated as shown in Fig. 4.
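As a consistency check between the reported ion and the proposed formula, the monoisotopic mass of C48H67N3O16 and the m/z of its [M+H]+ ion can be computed directly. This is a sketch with a deliberately simple formula parser (no brackets or charges) and illustrative function names; the isotope masses are standard values rounded to eight decimals.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
PROTON = 1.00727647  # proton mass (u)


def monoisotopic_mass(formula: str) -> float:
    """Monoisotopic mass of a simple formula such as 'C48H67N3O16'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def mz_protonated(formula: str) -> float:
    """m/z of the singly charged [M+H]+ ion."""
    return monoisotopic_mass(formula) + PROTON


print(round(mz_protonated("C48H67N3O16"), 4))
```

The calculated value is about 942.46, within roughly 0.04 of the reported m/z 942.50, which is compatible with the mass accuracy expected from a triple quadrupole instrument.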

Fig. 3

2D NMR analysis of the tsukubarubicin. a Selected HMBC (→) and COSY (bond) correlations of tsukubarubicin. b Selected NOESY (↔) correlations of tsukubarubicin

Fig. 4

Chemical structure of tsukubarubicin

Bioactivity analysis of the tsukubarubicin

To investigate the biological activity of tsukubarubicin, IC50 values were determined against four human cancer cell lines (Fig. S7). Cell viability assays indicated that tsukubarubicin was more active than the anticancer agent doxorubicin against all the tested cell lines (Table 3). Tsukubarubicin was more potent against A549 and MDA-MB-231 cells (lung cancer and breast cancer cell lines, respectively) than against HCT116 and MGC803 cells (colon cancer and gastric cancer cell lines, respectively), as indicated by the IC50 values. Notably, tsukubarubicin was much more potent than doxorubicin against MDA-MB-231 cells (IC50 = 2.93 ± 0.28 nM and 57.54 ± 1.72 nM, respectively). Together, these results indicate that tsukubarubicin might be a promising new lead for anticancer drug discovery, especially for breast cancer.
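The potency difference in MDA-MB-231 cells can be expressed as a ratio of the two IC50 values. The sketch below also propagates the reported standard deviations to the ratio, which assumes that the two measurements are independent; that assumption and the function name are not taken from the original analysis.

```python
import math


def fold_potency(ic50_ref: float, sd_ref: float, ic50_test: float, sd_test: float):
    """Return (ratio, sd) for ic50_ref / ic50_test.

    The standard deviation uses first-order error propagation for a quotient
    and assumes the two IC50 estimates are independent.
    """
    ratio = ic50_ref / ic50_test
    rel_sd = math.sqrt((sd_ref / ic50_ref) ** 2 + (sd_test / ic50_test) ** 2)
    return ratio, ratio * rel_sd


# IC50 values (nM) for MDA-MB-231 cells, as reported in the text.
ratio, sd = fold_potency(57.54, 1.72, 2.93, 0.28)
print(round(ratio, 1), round(sd, 1))
```

On these figures the IC50 of doxorubicin is roughly 20 times that of tsukubarubicin in this cell line.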

Table 3 Bioactivity analysis of tsukubarubicin against lung cancer cell line A549, colon cancer cell line HCT116, breast cancer cell line MDA-MB-231, and gastric cancer cell line MGC803. Each assay was repeated three times and the standard deviations were shown. All the results were statistically significant (P < 0.0001)

Characterization of the tsukubarubicin biosynthetic gene cluster

Based on the structure of tsukubarubicin and the structural conservation of the aglycone among anthracyclines, at least one type II polyketide synthase (PKS) and one glycosyltransferase were expected to be involved in tsukubarubicin biosynthesis. The genome sequence of S. tsukubaensis L20 was therefore analyzed with antiSMASH (Blin et al. 2013) and screened for candidate gene clusters for tsukubarubicin biosynthesis. Among the predicted gene clusters, a previously unreported gene cluster (tsu) attracted our attention. To confirm that tsukubarubicin is the product of the tsu gene cluster, the gene tsuA, which encodes subunits of the minimal PKS in the cluster, was disrupted by homologous recombination in strain L1907 (the resulting strain was designated L1907-ΔPKS). As expected, HPLC analysis showed that inactivation of tsuA abolished tsukubarubicin production in strain L1907 (Fig. 1c).

To further determine the exact boundaries of the tsu gene cluster, based on the antiSMASH prediction, we used qRT-PCR to analyze the transcriptional levels of selected genes in the upstream and downstream regions of the cluster. Compared with the parental strain L191, expression of the tested genes (excluding the flanking genes orf2676 and orf2722) was significantly upregulated (Fig. 5). Combining these experimental data with the bioinformatic analysis, the tsu gene cluster (approximately 47 kb) was finally defined as spanning the region from gene tsuR1 to tsuR2 and comprising 43 genes (Fig. 6).

Fig. 5

Analysis of relative transcription levels of the selected genes. The selected genes are located in the upstream and downstream regions of the tsu gene cluster. RNA samples were prepared from 36-h cultures of strains L191 and L1907 in R5 medium. The sigma factor gene hrdB was used as an internal control to normalize the transcriptional levels. The transcription levels of the corresponding genes in strain L191 were defined as having a relative value of 1. Error bars show standard deviations. Asterisks indicate statistically significant differences. ***P < 0.0003; ****P < 0.0001; N, no significant difference

Fig. 6

Organization of the tsu gene cluster. The cluster spanned roughly 47 kb and contained 43 predicted ORFs. The colors of genes delineated their putative roles in tsukubarubicin biosynthesis based on protein analysis

Detailed annotation of the tsu gene cluster was then conducted using the online BLASTP program (Table S3), and putative roles of the genes in tsukubarubicin biosynthesis were proposed and are delineated by colors in Fig. 6. The results revealed that the cluster includes thirteen genes expected to participate in polyketide construction or modification (tsuA to tsuM), five glycosyltransferase genes (tsuT1 to tsuT5), fifteen genes consistent with the expected production of sugar moieties (tsu1 to tsu15), three putative regulatory elements (tsuR1 to tsuR3), two genes associated with membrane transport (tsuP1 and tsuP2), two other biosynthetic genes (tsuO1 and tsuO2), and three ORFs (tsuU1 to tsuU3) of unknown function. Furthermore, according to the antiSMASH prediction, 41 proteins encoded by the tsu gene cluster shared high identity with those of the komodoquinone B BGC (Grocholski et al. 2019), suggesting that komodoquinone B may be an intermediate of the aglycone during tsukubarubicin biosynthesis.
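The category counts listed above can be tallied as a quick consistency check against the 43 genes reported for the cluster. The dictionary keys are descriptive labels chosen for this sketch; the counts are taken from the text.

```python
# Gene counts per functional category, as listed in the text.
tsu_gene_counts = {
    "polyketide construction or modification (tsuA-tsuM)": 13,
    "glycosyltransferases (tsuT1-tsuT5)": 5,
    "sugar biosynthesis (tsu1-tsu15)": 15,
    "regulators (tsuR1-tsuR3)": 3,
    "membrane transport (tsuP1, tsuP2)": 2,
    "other biosynthetic genes (tsuO1, tsuO2)": 2,
    "unknown function (tsuU1-tsuU3)": 3,
}

total_genes = sum(tsu_gene_counts.values())
print(total_genes)
```

The categories sum to 43, in agreement with the number of ORFs stated for the cluster.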

Discussion

New drugs and leads from natural products are still critical for modern medicine because of evolving pathogens and newly emerging diseases. The genus Streptomyces is an important microbial resource for novel compounds, as shown by the wide range of silent secondary metabolite BGCs present in each Streptomyces strain (Lee et al. 2020). This is also the case for the genome-sequenced S. tsukubaensis L20, which was mainly used as an industrial strain for FK506 production in our previous work. However, activating the silent gene clusters in Streptomyces is still a challenge, owing to the subtle and complicated regulation in these strains (Jones and Elliot 2018; Kong et al. 2019; Liu et al. 2013). Among the numerous regulators, the family of SARPs has been found mainly within Streptomyces, and almost all of them act as activators of secondary metabolism (Li et al. 2020; Ma et al. 2018; Yang et al. 2015). Inspired by this, we individually overexpressed the SARPs of S. tsukubaensis L20 and conducted bioactivity-guided screening. As expected, a silent BGC (tsu) was activated, and the novel anthracycline tsukubarubicin, which showed better bioactivity than doxorubicin, was quickly identified. However, many bioactivity screens are only sensitive enough to detect the most abundant and bioactive metabolites. In strains overexpressing SARPs other than TsuR1, we failed to detect new compounds by bioactivity assays. These observations suggest that highly targeted screening methods and/or other culture conditions should also be employed to improve our strategy. It should be noted that this strategy is generally applicable and potentially scalable. Besides the SARPs, transcriptional regulators of the LuxR family also act mainly as activators of secondary metabolism (Guerra et al. 2012; Mo and Yoon 2016; Panthee et al. 2020; Romero-Rodriguez et al. 2015) and can be considered in future studies. Using the same experimental setup, this strategy can in principle enable the activation of other cryptic BGCs in Streptomyces and other actinomycetes.

As the search for compounds with better activity is our ultimate purpose, the bioactivity of tsukubarubicin against cancer cell lines was also evaluated. The results indicated that tsukubarubicin is a promising lead for anticancer drug discovery, especially for breast cancer. From a biosynthetic point of view, the structure of tsukubarubicin is unusual because of the dual attachment of the deoxy-sugar through both O-glycosylation and a carbon-carbon bond between C2 and C5′, as in nogalamycin (Bhuyan and Dietz 1965; Siitonen et al. 2016). It also contains two units of the aminosugar avidinosamine, which was first found in avidinorubicin (Aoki et al. 1991). Many natural products owe their biological activity to carbohydrate units attached to aglycones (Kren and Martinkova 2001; Thorson et al. 2001). For nogalamycin, both the C2–C5′ carbocyclization and the C2′ hydroxylation are important for bioactivity because these features enhance the interaction of nogalamycin with DNA (Smith et al. 1995; Temperini et al. 2005; Williams et al. 1990). Similarly, both the characteristic structure shared with nogalamycin and the number of sugars (including the distinctive aminosugar avidinosamine) may contribute to the more potent activity of tsukubarubicin.

Although tsukubarubicin appears to be a derivative of avidinorubicin (Aoki et al. 1991), the gene cluster and biosynthetic mechanism of avidinorubicin have not yet been reported. Previous studies on avidinorubicin and its analogue decilorubicin were rare and mainly focused on elucidating their structures (Ishii et al. 1983a, b, 1984; Izawa et al. 1991; Nishimura and Ishii 1990). In this work, we identified and characterized the gene cluster (tsu) for tsukubarubicin biosynthesis. Intriguingly, the gene tsuU2 of unknown function in the tsu gene cluster resembles the gene kijd3, which was postulated to catalyze the oxidation of an amino group to a nitro group (Bruender et al. 2010), indicating that there might be a correlation between tsukubarubicin and its analogue decilorubicin, which contains a nitro-sugar. Future investigation of the biosynthetic route of tsukubarubicin may assist in elucidating the biosynthetic mechanisms of avidinorubicin and decilorubicin and in discovering novel biosynthetic elements (such as the nitro-sugar and avidinosamine biosynthetic elements) for further combinatorial biosynthesis of anthracycline analogues.

In summary, we activated and discovered the novel compound tsukubarubicin, analyzed its bioactivity, and identified its previously unreported BGC. The strategy presented here may streamline the discovery of novel compounds in Streptomyces. Our future research may focus on elucidating the biosynthetic mechanism of tsukubarubicin, which may reveal novel biosynthetic elements and assist in elucidating the biosynthetic mechanisms of avidinorubicin and decilorubicin.