Introduction

Brain metastases (BM) are the most common tumor to affect the adult central nervous system, occurring at a rate ten times greater than that of primary brain tumors [37]. With a median survival calculated in weeks in untreated patients, extended to only 3-18 months following multimodal therapy, BM represent a leading cause of cancer mortality [14]. Although BM pose a significant clinical burden, the metastatic process is highly inefficient as individual tumor cells maintain a variable capacity for invasion, dissemination, therapeutic resistance, and other “hallmarks of metastasis” [44]. While it is apparent that metastatic lesions are genetically divergent from their primary tumors [4, 55], determining the subclonal architecture of primary and metastatic lesions may lead to novel therapeutic opportunities, the identification of predictive biomarkers and subsequent prophylactic treatment of those patients most likely to develop metastases.

The use of primary patient samples to study metastasis is limited by several factors [49], from initial procurement of specimens to gathering a sufficient number of cells for experiments. Additionally, models utilizing primary human cells have yet to be successful in recapitulating each stage of the metastatic cascade [44]. Instead, current frameworks investigate each metastatic stage in isolation, limiting the translational efficacy of putative therapeutic targets [40]. In the current study, we have established in-house early-passage brain metastasis cell lines, termed brain metastasis-initiating cells (BMICs), from primary patient samples of lung-to-brain metastases. We demonstrate that BMIC lines are enriched for stem cell properties as well as the ability to recapitulate the metastatic process both in vitro and in vivo. We also established a BMIC patient-derived xenotransplantation (PDXT) model to interrogate human lung-derived BM. We performed in vitro and in vivo RNA interference screens to identify SPOCK1, a TGF-β target gene [29], and TWIST2 as important BMIC regulators. SPOCK1 in particular is a potent regulator of BMIC self-renewal, tumor initiation, and metastasis. Moreover, our results were translatable to clinical samples, where in a prospective cohort of primary non-small cell lung cancer patient slides, both targets were expressed only in those patients who developed BM. We then examined whether an indirect relationship exists between SPOCK1 and TWIST2 using protein–protein interaction networks, through which we identified a unique set of BMIC interactors that have significant prognostic value in lung cancer patient survival. One such marker, INHBA, a TGF-β ligand found mutated in lung adenocarcinoma [41], showed reduced expression in BMICs with knockdown of SPOCK1, further implicating a role for the TGF-β pathway in BM development as well as identifying novel therapeutic targets in the pathway.

With this work, we present a unique method of modeling human BM using patient-derived BMICs. Using this model, we have successfully identified and functionally validated novel regulators of primary and metastatic tumor growth, and clinically validated our targets as predictive biomarkers for BM. In blocking the metastatic process, we aim to transform a uniformly fatal systemic disease into one that is locally controlled and eminently more treatable.

Materials and methods

Patient sample processing and cell culture

Brain metastases (BM) originating from non-small cell lung carcinoma (NSCLC) primary samples were obtained from consenting patients, as approved by the Research Ethics Board at Hamilton Health Sciences. BMs were processed and maintained in Tumor Sphere Media (TSM) as previously described [54]. BMICs were grown as tumorspheres that were maintained at 37 °C with a humidified atmosphere of 5% CO2. Each BMIC line was characterized by performing limiting dilution assays (LDA) as previously described [45]. The most aggressive BMIC line with an enhanced engraftment capacity, BT478, was the primary BMIC line used for in vivo and RNAi experiments.

Cell proliferation assay

Single cells were plated in a 96-well plate at a density of 1000 cells/200 µL per well in quadruplicate and incubated for 4 days. 20 µL of Presto Blue (Invitrogen) was added to each well approximately 2 h prior to the readout time point. Fluorescence was measured using a FLUOstar Omega Fluorescence 556 Microplate reader (BMG LABTECH) at excitation and emission wavelengths of 535 and 600 nm, respectively. Readings were analyzed using Omega analysis software.

Zone exclusion migration assay

A zone exclusion assay was used to assess cell migration over 3 days, a time point determined to be most appropriate for the slow doubling times associated with glioblastoma (GBM) and BMIC populations. BMIC and GBM spheres were dissociated to single cells and plated at a density of 30,000 cells per well in a 96-well plate containing TSM + 1% FBS, with a 1% agar drop in the center of each well. Following cell adherence (24 h), the agar drop was removed to create a cell-free zone, the wells were washed gently with PBS, and the media was replaced with TSM. Migration into the empty zone was monitored over 3 days, with images taken at day 0 and day 3. The percentage of migration was calculated as shown below:

$$\%\,\text{Migration} = \frac{\text{Area of empty space at endpoint}}{\text{Area of empty space at } T = 0} \times 100.$$
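As a worked illustration of the formula above, the following minimal Python sketch evaluates the ratio for one well. The area values are hypothetical and in arbitrary units; the second function, which reports the complement of the ratio as gap closure, is an assumption added for illustration and is not stated in the text.

```python
def percent_migration(area_endpoint: float, area_t0: float) -> float:
    """Ratio from the formula above: open area at endpoint relative to T = 0, in percent."""
    if area_t0 <= 0:
        raise ValueError("area at T = 0 must be positive")
    return area_endpoint / area_t0 * 100.0


def percent_gap_closure(area_endpoint: float, area_t0: float) -> float:
    """Complement of the ratio above (an assumption, not stated in the text)."""
    return 100.0 - percent_migration(area_endpoint, area_t0)


# Hypothetical cell-free areas in arbitrary units.
ratio = percent_migration(area_endpoint=250.0, area_t0=1000.0)
closure = percent_gap_closure(area_endpoint=250.0, area_t0=1000.0)
print(ratio, closure)
```

As written, the ratio describes how much of the original cell-free zone remains open at the endpoint, so a smaller value corresponds to more cells having moved into the zone.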

Flow cytometric characterization

BMIC tumorspheres were dissociated and single cells resuspended in PBS + 2 mM EDTA. Cell suspensions were stained with anti-CD133, anti-CD44, anti-CD15, anti-EpCAM or matched isotype controls (Miltenyi) and incubated for 30 min on ice. For TWIST2 and SPOCK1 co-expression analysis, single-cell suspensions of BMIC lines BT478 and BT530 were stained with primary SPOCK1 antibody (mouse monoclonal IgG, Abcam) followed by goat anti-mouse Alexa Fluor 647 secondary antibody (1:1000, Invitrogen), and TWIST2 antibody (sheep polyclonal IgG, R&D Systems) followed by donkey anti-sheep FITC secondary antibody (1:1000, Life Technologies), with 30-min incubations on ice for each antibody.

Samples were run on a MoFlo XDP Cell Sorter (Beckman Coulter). Dead cells were excluded using the viability dye 7AAD (1:10; Beckman Coulter). Compensation was performed using mouse IgG CompBeads (BD). Surface marker expression was defined as positive or negative based on the analysis regions established using the isotype control.

Immunofluorescence staining of BMIC lines for SPOCK1 and TWIST2

50,000 cells from BMIC lines BT478, BT530, and BT751 were cultured on Matrigel-coated coverslips in 24-well plates for 72 h. Wells were washed with PBS, and cells were fixed with 4% paraformaldehyde for 20 min at room temperature (RT), permeabilized with 0.1% Triton X-100 for 5 min, then blocked with 1% BSA at RT for 20 min. Cells were stained for 2 h at RT with primary anti-human SPOCK1 antibody (10 µg, mouse monoclonal IgG, Abcam) or anti-human TWIST2 antibody (10 µg, sheep polyclonal IgG, R&D Systems), followed by a 1 h incubation at RT with donkey anti-mouse Alexa Fluor 647 secondary antibody (1:1000, Invitrogen) or donkey anti-sheep FITC secondary antibody (1:1000, Life Technologies), respectively. Slides were then mounted with mounting medium and analyzed using Volocity imaging software.

In vivo modeling of metastasis

All experimental procedures involving animals were reviewed and approved by the McMaster University Animal Research Ethics Board (AREB). NOD-SCID mice were used for all experiments. Mice were anesthetized using gas anesthesia (isoflurane: 5% induction, 2.5% maintenance) before minimally invasive surgery.

(a) Intracranial injections (ICr): Intracranial injections were performed as previously described [53]. Briefly, 10 μL of cell suspension (Online Resource 1 Table 2) was injected into the right frontal lobe of 8- to 10-week-old mice.

(b) Intrathoracic injections (IT): 30 μL of cell suspension (5 × 10⁵ cells) mixed with 30 μL of growth factor-reduced Matrigel (BD Bioscience) was injected into the right upper chest of 8- to 10-week-old mice.

(c) Intracardiac injections (ICa): 100 μL of cell suspension (2.5 × 10⁵ cells) was injected into the left ventricle over 15 s in 8- to 12-week-old mice.

Mice were monitored weekly, and upon reaching endpoint, brains and lungs (for IT and ICa injections) were harvested, sectioned, and paraffin-embedded for hematoxylin and eosin (H&E) and anti-human COXIV immunohistochemistry. Images were scanned using an Aperio Slide Scanner and analyzed with ImageScope v11.1.2.760 software (Aperio).

Reverse transcription and quantitative PCR of mRNA

Total RNA was isolated using a Norgen RNA extraction kit (Biotek) and reverse transcribed using qScript cDNA Super Mix (Quanta Biosciences) and a C1000 Thermo Cycler (Bio-Rad). qRT-PCR was performed on the CFX96 (Bio-Rad) with SsoAdvanced SYBR Green (Bio-Rad), using gene-specific primers (Online Resource 1 Table 7) and 28S rRNA as the internal control.

shRNA dropout screen

shRNA constructs (5–8 individual shRNA constructs per gene) were obtained through the RNAi Consortium and the pool was constructed as previously described [19]. Replication-incompetent lentivirus was produced as described in Venugopal et al. [53].

400 million cells from BT478 BMIC line were transduced with the pooled lentiviral library at an MOI of 0.3 for the shRNA screens, with approximately 80,000-fold representation (cells per construct). Cells were treated for 48 h with puromycin and then utilized for either the in vitro or in vivo screen. 10% of cells were collected for use as the baseline for all screens.

(a) In vitro tumorsphere formation shRNA screen: Transduced cells were cultured in T150 flasks, propagated as secondary spheres, and divided into three replicates. Tumorspheres were passaged for four passages. At each passage, genomic DNA was isolated for sequencing.

(b) In vivo screen: 200,000 cells were injected intracranially into NOD-SCID mice (n = 10). At endpoint, brains were harvested, perfused, and genomic DNA isolated for sequencing.

Lentivirus was prepared for individual shRNA constructs similar to the pooled shRNA library. BMIC lines were transduced with lentivirus along with an shGFP control and treated with puromycin for 48 h prior to in vitro assessment of sphere formation, proliferation, and migration or in vivo injections by all three routes (ICr, IT, and ICa) to assess tumor formation and metastasis.

shRNA dropout screen analysis

A list of 150 genes involved in regulation of BM was expertly curated from the literature (Online Resource 1 Table 4). We used Wordle (http://wordle.net) to highlight biological processes associated with the 144 selected shRNAs (select genes in Online Resource 1 Table 5, full list in Online Resource 2). Using Pathway Data Integration Portal ver. 1 (pathDIP, http://ophid.utoronto.ca/pathDIP), we first performed pathway enrichment analysis across 19 major pathways databases and identified 532 pathways that are significantly enriched by the selected shRNAs (p < 0.05). Pathway names were corrected by removing uninformative words (e.g., Homo sapiens) prior to generating the word frequency figure (Fig. 3b).

In vitro analysis

Libraries were sequenced on an Illumina HiSeq 2500. Library preparation and sequencing were carried out as described previously [20]. An average of 3.1 M reads were obtained per sample. Reads consisting of 21–22 nucleotides, followed by an XhoI restriction site, allowing a single mismatch, were retained for mapping. Reads were mapped against a library-specific, non-redundant FASTA file using bowtie (v 0.12.7) with the following command-line options: -l 21 -m 1 -n 2 --best --strata. On average, 99.5% of properly formatted reads aligned, allowing two mismatches and suppressing multi-mapping reads.

Mapped hairpins were enumerated from the resulting SAM file using a bespoke Java program, which also merges together all samples and hairpin annotations. Each sample was normalized by summing the total number of reads assigned to hairpins, dividing the sum by 1,000,000, and then dividing the individual hairpin counts by the normalized sum. The resulting value is “reads per million mapped reads,” which is then transformed to log base 2.
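The normalization described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the bespoke Java program; the hairpin names and counts are hypothetical, and skipping zero-count hairpins before the log transform is an assumption, since the text does not say how zeros were handled.

```python
import math


def log2_rpm(counts: dict[str, int]) -> dict[str, float]:
    """Convert raw hairpin counts to log2 reads per million mapped reads."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no mapped reads")
    scale = total / 1_000_000  # the "normalized sum" from the text
    # Assumption: hairpins with zero reads are skipped, since log2(0) is undefined.
    return {hp: math.log2(c / scale) for hp, c in counts.items() if c > 0}


# Hypothetical sample with three hairpins.
sample_counts = {"hp_A": 500_000, "hp_B": 250_000, "hp_C": 250_000}
normalized = log2_rpm(sample_counts)
print(normalized)
```

Because every sample is scaled to the same total, a twofold difference in raw abundance within a sample appears as a difference of one unit on the log2 scale.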

Linear regression was performed for each hairpin to model the dropout trend over time. Normalized abundance counts were regressed against the passage number using the lm function in R (v 3.1.2). The majority of hairpins, as expected, had zero slope. The 5% of hairpins with the most negative slopes were selected for validation.
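The slope-based selection can be illustrated with a short Python sketch. The original analysis used the lm function in R; the ordinary least-squares slope below is a stand-in written in plain Python, and the hairpin names, abundance values, and passage numbering (0 to 4) are hypothetical.

```python
def ols_slope(x: list[float], y: list[float]) -> float:
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def most_negative_slopes(abundance, passages, fraction=0.05):
    """Rank hairpins by slope and keep the most negative fraction (at least one)."""
    slopes = {hp: ols_slope(passages, values) for hp, values in abundance.items()}
    n_keep = max(1, int(len(slopes) * fraction))
    ranked = sorted(slopes, key=slopes.get)  # most negative slope first
    return ranked[:n_keep], slopes


# Hypothetical log2 abundances over passages 0 to 4.
passages = [0, 1, 2, 3, 4]
abundance = {
    "hp_flat": [10.0, 10.0, 10.0, 10.0, 10.0],
    "hp_drop": [10.0, 9.0, 8.0, 7.0, 6.0],
    "hp_mild": [10.0, 9.9, 9.8, 9.7, 9.6],
}
selected, slopes = most_negative_slopes(abundance, passages)
print(selected, slopes)
```

A hairpin whose target gene is required for self-renewal loses abundance at each passage and therefore has a negative slope, whereas a neutral hairpin stays near zero.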

In vivo analysis

Sequencing and read mapping were performed as per the in vitro screen. Between 1.1 and 9.5 M reads (mean = 4.8 M) were processed per sample. Log fold-change values were computed for each hairpin in each in vivo sample by taking the ratio against the matching P0 hairpin abundance and transforming to log2. Only hairpins observed in both the P0 sample and at least one of the in vivo samples were retained for further analysis. Gene depletion was then ranked by comparing the aggregate fold-change value across all samples to the population of negative control hairpins using the Wilcoxon rank-sum test. P values were adjusted for multiple testing correction using the method of Benjamini and Hochberg. Finally, dropouts were ranked by the adjusted P value.
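Two of the steps above, the log2 fold change against P0 and the Benjamini and Hochberg adjustment, can be sketched in Python as follows. The hairpin names, abundances, and P values are hypothetical, and the Wilcoxon rank-sum comparison against negative control hairpins is not reproduced here; the P values are simply taken as given inputs.

```python
import math


def log2_fold_change(sample: dict[str, float], baseline: dict[str, float]) -> dict[str, float]:
    """log2(sample / P0) for hairpins observed (abundance > 0) in both samples."""
    return {
        hp: math.log2(sample[hp] / baseline[hp])
        for hp in sample
        if sample[hp] > 0 and baseline.get(hp, 0) > 0
    }


def benjamini_hochberg(pvalues: list[float]) -> list[float]:
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical normalized abundances for one in vivo sample and its P0 baseline.
p0 = {"hp_A": 100.0, "hp_B": 100.0, "hp_C": 50.0}
in_vivo = {"hp_A": 25.0, "hp_B": 100.0, "hp_C": 0.0}
lfc = log2_fold_change(in_vivo, p0)

# Hypothetical per-gene P values.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
print(lfc, adjusted)
```

In this toy input, hp_C is dropped because it is not observed in the in vivo sample, and hp_A shows a fourfold depletion (log2 fold change of -2).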

Immunohistochemical staining of patient samples for SPOCK1 and TWIST2

4 µm formalin-fixed paraffin-embedded sections were dewaxed in 5 changes of xylene and brought down to water through graded alcohols. Antigen retrieval was performed by pretreatment with Tris–EDTA (pH 9.0), after which primary antibodies for SPOCK1 (Abnova #H00117581-M01; 1/2000) or TWIST2 (Atlas #HPA007450; 1/800) were applied. Endogenous peroxidase was blocked with 3% hydrogen peroxide. The detection systems used were the MACH 4 universal HRP polymer system (TWIST2, Intermedico Cat# BC-M4U534) and the species-specific ImmPRESS polymer system (SPOCK1, Vector Labs). After following kit instructions, color development was performed with freshly prepared DAB (DAKO Cat# K3468). Finally, sections were counterstained lightly with Mayer’s hematoxylin, dehydrated in alcohols, cleared in xylene, and mounted with Permount mounting medium (Fisher Cat# SP15-500). The optical density (OD) of each sample was determined by densitometric analysis using ImageJ software.

Protein–Protein Interaction Mapping of SPOCK1 and TWIST2

Physical protein interactions (PPI) for SPOCK1 and TWIST2 and their interacting partners were identified using the Integrated Interaction Database (IID) v.2016-03 (http://ophid.utoronto.ca/iid) [22]. Protein interaction networks were visualized and analyzed using NAViGaTOR version 2.3 (http://ophid.utoronto.ca/navigator) [5]. The corresponding NAViGaTOR XML file can be downloaded from http://www.cs.utoronto.ca/~juris/data/NCB/BMnetwork.xml. We calculated centrality/betweenness by counting the number of shortest paths through a given node and edge, connecting all possible pairs of nodes in the network, resulting in the all pairs shortest path (APSP) count. All proteins in the network were selected for pathway enrichment analysis as described above, using the Pathway Data Integration Portal (pathDIP) ver. 1 (http://ophid.utoronto.ca/pathDIP).

Edge centrality was evaluated by betweenness, and nodes participating in the most central interactions were selected for further analysis of prognostic properties (all nodes, edges, and betweenness scores are listed in Online Resource 1 Table 8 and connection derivations in Online Resource 3).
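The shortest-path counting behind the APSP score can be illustrated with a small Python sketch. This is a plain breadth-first-search implementation on a hypothetical five-node toy network, not the NAViGaTOR implementation, and the node names are placeholders rather than real proteins or interactions.

```python
from collections import deque
from itertools import combinations


def shortest_paths(graph, source, target):
    """Return all shortest paths between two nodes of an unweighted graph."""
    dist = {source: 0}
    preds = {source: []}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in graph[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                preds[nb] = [node]
                queue.append(nb)
            elif dist[nb] == dist[node] + 1:
                preds[nb].append(node)  # another equally short route
    if target not in dist:
        return []
    # Walk predecessor lists back from the target; all paths have equal length.
    paths = [[target]]
    while paths[0][-1] != source:
        paths = [p + [q] for p in paths for q in preds[p[-1]]]
    return [p[::-1] for p in paths]


def apsp_counts(graph):
    """Count shortest paths passing through each interior node and each edge."""
    node_counts = {n: 0 for n in graph}
    edge_counts = {}
    for s, t in combinations(sorted(graph), 2):
        for path in shortest_paths(graph, s, t):
            for n in path[1:-1]:
                node_counts[n] += 1
            for a, b in zip(path, path[1:]):
                edge = tuple(sorted((a, b)))
                edge_counts[edge] = edge_counts.get(edge, 0) + 1
    return node_counts, edge_counts


# Hypothetical toy network (undirected adjacency lists).
toy = {
    "P1": ["H"], "P2": ["H"],
    "H": ["P1", "P2", "X"],
    "X": ["H", "P3"], "P3": ["X"],
}
node_counts, edge_counts = apsp_counts(toy)
print(node_counts, edge_counts)
```

In the toy network, the edge between H and X carries the most shortest paths because it is the only route between the two halves of the graph, which is the property used to pick out central connectors.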

Cox modeling of prognostic potential of top 20 genes identified through PPI network

To evaluate the multivariate prognostic potential of the top 20 genes, we developed a Cox proportional hazards model (SPOCK1, PLAT, APBB1, CELSR2, PLG, PLAU, PRSS3, PRKCQ, TCF4, INSR, TP53, ATXN1, EP300, JAG2, INHBA, LAMA3, ELAVL1, STMN2, VAV2, ARHGAP26) in which gene expression values served as the only covariates. The model was derived using the R package glmnet [13], applying ridge regression (also known as L2 or Tikhonov regularization) to prevent over-fitting. TCGA lung adenocarcinoma (ADC) RNA-seq data were standardized by converting to z scores and, along with the corresponding clinical data, were used as “training data.” The model was validated on independent, publicly available lung ADC data (GSE31210) [35], and its prognostic performance was evaluated by the concordance index (function survConcordance from the R package survival [50], v 2.38.3) and the area under the receiver operating characteristic (ROC) curve (AUC), measured at the third year after the initial time point (function AUC.cd from the R package survAUC, v 1.0.5). The resulting model then predicted the risk score for each patient in the validation dataset. Patients were divided into two groups based on the predicted risk score, using a threshold obtained prior to validation to maximize the hazard ratio on the training data (for more details see Royston et al. [39]). Finally, the validated hazard ratio (HR) between these two groups and its associated statistical significance (log-rank test) were calculated (function survdiff from the survival package), and a Kaplan–Meier (KM) plot showing time-dependent survival probabilities of the two groups was generated.
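To make the scoring and evaluation steps concrete, the following Python sketch computes a Cox-style risk score (the linear predictor), splits patients at a threshold, and evaluates Harrell's concordance index. It does not fit a model and does not reproduce glmnet or the survival package; the gene names, coefficients, expression values, survival data, and the threshold of 0.0 are all hypothetical, and z scores are computed within the toy cohort using the population standard deviation as a simplifying assumption.

```python
from statistics import mean, pstdev


def z_scores(values: list[float]) -> list[float]:
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]


def risk_scores(expression: dict[str, list[float]], coefficients: dict[str, float]) -> list[float]:
    """Cox linear predictor: sum over genes of coefficient * standardized expression."""
    genes = list(coefficients)
    standardized = {g: z_scores(expression[g]) for g in genes}
    n_patients = len(expression[genes[0]])
    return [sum(coefficients[g] * standardized[g][i] for g in genes) for i in range(n_patients)]


def concordance_index(times, events, scores) -> float:
    """Harrell's C: fraction of comparable pairs where higher risk has shorter survival."""
    concordant = tied = comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable if the patient with the shorter time had an event.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1
                elif scores[i] == scores[j]:
                    tied += 1
    return (concordant + 0.5 * tied) / comparable


# Hypothetical data for four patients and two placeholder genes.
expression = {"GENE_A": [1.0, 2.0, 3.0, 4.0], "GENE_B": [4.0, 3.0, 2.0, 1.0]}
coefficients = {"GENE_A": 0.5, "GENE_B": -0.2}  # hypothetical, not fitted values
times = [40, 30, 20, 10]   # follow-up in months
events = [0, 1, 1, 1]      # 1 = death observed, 0 = censored

scores = risk_scores(expression, coefficients)
groups = ["high" if s > 0.0 else "low" for s in scores]  # hypothetical threshold
c_index = concordance_index(times, events, scores)
print(scores, groups, c_index)
```

In the toy data, higher scores line up perfectly with shorter survival, so every comparable pair is concordant; real cohorts give values between 0.5 (no discrimination) and 1.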

Generation of univariate gene expression Kaplan–Meier plots

Kaplan–Meier (KM) curves depicting patient survival by risk group based on SPOCK1 and TWIST2 expression in Fig. 6c were generated using SurvExpress, a suite for validation of biomarkers and survival analysis [1] using TCGA Lung Adenocarcinoma (TCGA-LUAD) dataset (version Sept. 2012) [6]. KM curves were generated from quantile-normalized TCGA lung ADC expression data and matching information about patient survival.

Prognostic properties of the most central nodes were evaluated using http://kmplot.com (version 2015; data downloaded on March 6, 2016) [17]. Only lung ADC samples were used, and biased samples were removed (n = 2437). The “auto select best cutoff” and “censor at threshold” options were used, with JetSet probe selection. The resulting hazard ratios and corresponding P values are plotted in Fig. 7b. The resulting KM plots for overall survival, with the probe sets used, are shown in Fig. 7c and Online Resource 1 Fig. 11d.

Statistical analysis

Replicates from at least three samples were used for each experiment. Data represent mean ± SD, with n values listed in figure legends. Student’s t test and 2-way ANOVA analyses were performed using GraphPad Prism 5. P values < 0.05 were considered significant.

Results

Patient-derived BMICs exhibit stem cell characteristics and enhanced migratory potential in vitro

We successfully established several early-passage BM cell lines termed as brain metastasis-initiating cells (BMICs) from human patient BM of lung origin (Online Resource 1 Table 1). BMIC lines formed tumorspheres [36] in serum-free media (Fig. 1a) and limiting dilution analysis (LDA) quantified median stem cell frequencies ranging from 1/100 to 1/350 cells (Fig. 1b, Online Resource 1 Fig. 1). BMIC sphere formation was comparable to that of primary glioblastoma (GBM). Thus, BMIC cultures are highly enriched for BMIC properties but not to homogeneity.

Fig. 1

Characterization of BMICs in primary BM from the lung. a Representative bright-field images of BMIC spheres (20×). b Self-renewal of BMICs and primary GBM CSCs determined through secondary sphere formation. c Migration capacity of BMICs and GBMs. d Surface expression of CSC markers. Each dot represents a single patient sample, bar indicates mean. e BT478 and BT530 re-formed tumors from as few as 100 cells after intracranial (ICr) injections. Images shown are representative H&E sections. Red scale bar = 4 mm, blue scale bar = 100 µm, n = 5

Migratory capacity was variable between individual BMIC lines (Fig. 1c), but overall BMIC lines had enhanced migration compared to invasive GBM lines. Through analysis of each patient’s BMICs for known cancer stem cell (CSC) markers, we found differential expression of CSC surface markers CD133, CD15 [46], and CD44 [2, 10, 25], as well as epithelial cell adhesion marker (EpCAM) [59] (Fig. 1d, Online Resource 1 Fig. 2). Collectively, these data confirm the presence of a migratory CSC-like population in vitro and underscore the existence of patient-to-patient heterogeneity between BMIC lines. The variation in CSC marker expression between patients further supports our previous work, in which typical primary tumor CSC markers were inefficient at identifying the BMIC population [34], and emphasizes the need for novel BMIC markers.

Patient-derived xenotransplantation models effectively recapitulate human brain metastases in vivo

We have utilized our unique BMICs to establish appropriate patient-derived xenotransplantation (PDXT) models of BM that allow for serial in vivo enrichment and propagation of the functional tumor-initiating cell (TIC) population that initiates BMs. First, BMIC tumor initiation capacity was assessed in vivo through intracranial injections of two BMIC lines (BT478 and BT530) in NOD-SCID mice. This model was adapted from a PDXT model established by our lab for the primary brain tumor-initiating cell (BTIC) population [46]. In vivo LDA (Online Resource 1 Table 2) showed that BMICs were able to form tumors from as few as 100 cells (Fig. 1e). To evaluate whether our BMICs retained the original patient tumor marker profile, xenotransplants were stained with routine marker profiles commonly utilized to diagnose brain metastases from a primary lung cancer, and we found that xenografts recapitulated the original patient tumor histology and cyto-architecture (Online Resource 1 Fig. 3).

Second, to assess the ability of BMICs to complete different stages of the metastatic cycle in vivo, we injected two BMIC lines (BT478 and BT530) into NOD-SCID mice through two different injection routes: a) intracardiac injections (ICa) and b) intrathoracic injections (IT) (Online Resource 1 Table 3).

Homing potential of BMICs was assessed through ICa injections of 250,000 cells, resulting in preferential micro- and macro-metastasis formation in the brain and none in the lung or heart (Fig. 2, Online Resource 1 Fig. 4a-4b). IT injection of 500,000 cells showed that BMICs are not only capable of re-forming tumors in the lung environment but can also migrate through the vasculature and seed the brain to form micro-metastases. However, mice succumbed to gross lung tumor burden before full macro-metastases developed (Fig. 2). Primary GBM lines injected through our previously developed BTIC-PDX model [46] as well as our novel BMIC PDXT model achieved tumor formation only through ICr injections and not through IT or ICa injections (Online Resource 1 Fig. 4c). Taken together, these data verify the self-renewal and metastatic properties of BMICs in vivo and establish our PDXT models as appropriate model systems for human BM.

Fig. 2

Development of a novel human-mouse xenograft model of lung-to-brain metastasis. Schematic representation of novel in vivo BM model. BMIC capacity is demonstrated through formation of seeded tumor cell nests and large tumors in the brain after IT and ICa injections, respectively. Images shown are representative H&E sections. Red scale bar 4 mm, blue scale bar 100 µm. ICr, n = 2; ICa, n = 6; IT, n = 6

shRNA drop out screen identifies TWIST2 and SPOCK1 as novel genetic regulators of brain metastasis

RNA interference (RNAi) using short-hairpin RNAs (shRNAs) provides a versatile tool allowing for rapid interrogation of gene function in mammalian cells, and has been used to identify genetic regulators of tumorigenesis and metastasis of various solid cancers [32, 48, 56].

To identify key regulators of BMIC migration and self-renewal, we functionally interrogated the 30 Nolte candidates previously identified by our lab [34] as well as an additional 120 key genes known to regulate metastasis [3, 27, 52] using pooled shRNA screens followed by in vitro and in vivo validation of chosen hits in our PDXT model systems (Fig. 3a). Pathway enrichment analysis using 19 pathway databases integrated from pathDIP revealed our 150 curated genes to be involved in several key processes and pathways, specifically gene regulation and cell adhesion (Fig. 3b, Online Resource 1 Fig. 5, Table 5, Online Resource 2). We obtained a corresponding pooled library of shRNAs targeting these 150 genes, together with negative control hairpins (shGFP, shLacZ) and positive control hairpins (shMET, mTOR pathway genes) (Online Resource 1 Table 4, Fig. 6a–6c, Online Resource 2), and performed a functional in vitro shRNA dropout screen. Since tumorsphere culture conditions enrich for self-renewing stem-like populations with metastatic potential to seed the brain, we used these conditions to screen the 150 genes for their role in BMIC self-renewal in vitro. We identified several genes that had an influence on self-renewal capacity (Fig. 3d).

Fig. 3

shRNA screen identifies genes involved in BMIC self-renewal. a Schematic representation of shRNA screen procedure. b Word frequency analysis from significantly enriched pathways using the selected 150 genes. c Scatter plots of normalized reads per shRNA between passage 0 (P0) and passage 1 (P1) or passage 4 (P4). Data shown highlight genes whose corresponding shRNAs were specifically depleted in long-term passaging BMICs (red dots). Blue line is diagonal line with ratio of 1.0. Red-dashed line shows cut-off for 1.7-fold change. d Heat map generated from top 5% of hairpins screen hits. Genes selected for further validation and their corresponding slope values are highlighted in the adjoining box

Genes from the top 5% of all hits were selected for further validation through our in vitro BMIC model based on their rate of dropout from the shRNA screen (Fig. 3c), as depicted by the negative slope, as well as their novelty and potential role in BM development (Fig. 3d, Online Resource 1 Fig. 6a, 6b, 7, 8). SPOCK1 and TWIST2 showed the most significant results in vitro and were validated further with our in vivo PDXT model. TWIST2 (or Dermo-1) belongs to the family of basic helix-loop-helix (bHLH) transcription factors and is implicated as an epithelial–mesenchymal transition (EMT) regulator [12], while SPOCK1 (or testican-1) is a seminal plasma proteoglycan and TGF-β target gene [29]. Both genes are overexpressed in several primary cancers [28, 57], yet SPOCK1 has no known relevance to BM development. Knockdown of TWIST2 or SPOCK1 with shRNA in three patient-derived BMIC lines showed the most significant reduction of sphere formation, migration, and proliferation (Fig. 4, Online Resource 1 Fig. 9, 10). shTWIST2 and shSPOCK1 reduced expression of CD133 as seen by FACS analysis (Online Resource 1 Fig. 11a), further implicating SPOCK1 and TWIST2 in regulating BMIC stemness.

Fig. 4

SPOCK1 and TWIST2 identified as novel regulators of BM through shRNA screen. a Transcript levels of SPOCK1 and TWIST2 after lentiviral knockdown. In vitro characterization was carried out through b sphere formation, c migration, and d proliferation assays. Data are expressed as mean ± SEM of n = 3. ns not significant, *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001

Our functional shRNA screen was repeated in vivo to validate our in vitro results. BMICs transduced with the pooled shRNA library were injected ICr into NOD-SCID mice, with tumor formation as the primary outcome measured. BMICs proliferate until tumors are detectable, where the initial homogeneous representation of transduced cells (P0) is skewed towards a set of tumor cells that received shRNAs that do not functionally affect the propagation of cells. SPOCK1 and TWIST2 were found to have high dropout rates as compared to initial representation at P0, confirming both genes as regulators of tumor formation from ICr injections (Fig. 5a, Online Resource 1 Fig. 6d, 6e).

Fig. 5

KD of SPOCK1 and TWIST2 inhibits BM formation in vivo. a Evaluation of SPOCK1 and TWIST2 shRNA dropout after in vivo shRNA screen. b Representative H&E images of brain and lungs collected at endpoint after ICr (n = 2), IT (n = 4), and ICa (n = 5) injections of BT478 cells transduced with either shSPOCK1, shTWIST2, or shGFP as control. shSPOCK1 and shTWIST2 both inhibited metastasis to the brain through the IT and ICa routes. shTWIST2 slightly reduced tumor formation from the ICr route but had no effect on tumor formation in the lung from the IT route. shSPOCK1 inhibited tumor formation in both the lung and the brain from the IT and ICr routes, respectively. Below are graphs depicting relative tumor volume (mm2) for ITB and ITL samples. Red scale bar 4 mm, blue scale bar 100 µm, ns not significant, *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001. c Kaplan–Meier survival curves for ICr injections of shGFP, shSPOCK1, and shTWIST2, where mice had increased survival for shSPOCK1 and shTWIST2 as compared to the shGFP control

Knockdown of SPOCK1 or TWIST2 leads to inhibition of brain metastasis in vivo

BMICs were transduced with shGFP, shTWIST2, or shSPOCK1 and injected through the ICr, IT, and ICa routes. In ICr injections, which served as controls to validate tumor engraftment, shTWIST2 reduced tumor volume compared with shGFP-transduced control cells, while shSPOCK1 prevented tumor formation altogether (Fig. 5b, n = 2, p values not calculated). For IT injections, although shTWIST2 did not reduce lung tumor formation, the cells failed to metastasize to the brain. shSPOCK1 both greatly reduced lung tumor formation and prevented BM (Fig. 5b). Mice receiving shSPOCK1 or shTWIST2 cells had a survival advantage over mice that received control cells through ICa injections. Moreover, knockdown of these genes completely ablated BMIC metastasis to the brain (Fig. 5b, c). Collectively, these data implicate SPOCK1 as a vital regulator of BMIC self-renewal and the development of brain metastasis, whereas TWIST2 was confirmed to be important to the initiation of BMIC metastasis.

Expression levels of SPOCK1 and TWIST2 in lung cancer patient samples are predictive of brain metastasis

SPOCK1 and TWIST2 expression were profiled by immunohistochemistry (Fig. 6a) in primary NSCLC biopsies from patients who developed BM (n = 12) and patients who did not (n = 14) to determine whether these proteins are predictive of BM formation (Online Resource 1 Table 6). The second cohort (n = 14) had a median follow-up time of 36 months, and all patients were clearly documented to have no clinical or radiological development of BM; this included five patients who were alive and thriving at 5 years and nine patients who died of other disease burden (primarily lung disease burden or metastasis to bone). Conversely, the majority of BM patients (n = 12) were diagnosed synchronously with the primary lung cancer, and thus the short 3–6 month survivorship of these patients was dictated by the BM. Follow-up for the two sets of patients differed only due to the nature of the disease; the poor survival associated with BM resulted in a much shorter follow-up period (19 months, with all patients dying from brain metastatic burden).

Fig. 6

SPOCK1 and TWIST2 are predictive of lung-to-brain metastasis. a Expression of SPOCK1 and TWIST2 in patient biopsy samples of BM from lung, lung primary adenocarcinomas with known BM, and lung primary adenocarcinomas with no known metastases. b Quantification of optical densities of a. Red arrows indicate positively stained cells. Data are expressed as mean ± SEM (BM n = 10, lung with BM n = 6, lung without BM n = 6). *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001. c Kaplan–Meier curves depicting gene expression by risk group, as obtained from SurvExpress using TCGA data from lung adenocarcinoma patients

SPOCK1 and TWIST2 expression was observed only in primary lung cancer specimens from patients who ultimately developed BM (Fig. 6b, SPOCK1 p < 0.01; TWIST2 p < 0.0003), and was also present in the corresponding patient-matched BM (n = 12). Interrogation of The Cancer Genome Atlas (TCGA) for SPOCK1 and TWIST2 expression in lung cancer patients determined that, in a validation cohort of 255 NSCLC patients, high expression of these genes predicted poor patient survival (Fig. 6c, SPOCK1 p = 0.0036; TWIST2 p = 0.0001). These data suggest that SPOCK1 and TWIST2 may have clinical utility as predictive biomarkers of BM formation in NSCLC patients.
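
To make the survival comparison concrete, the following is a minimal Python sketch of a Kaplan–Meier estimate computed separately for high- and low-expression groups. It is an illustration only and not the SurvExpress procedure: the median split, the function names `kaplan_meier` and `split_by_median`, and all expression, time, and event values are hypothetical assumptions rather than study data.

```python
from statistics import median
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) steps of a Kaplan-Meier estimate.

    times  : follow-up time per patient (e.g. months)
    events : 1 if death was observed at that time, 0 if the patient was censored
    """
    order = sorted(range(len(times)), key=lambda i: times[i])
    at_risk = len(times)
    survival = 1.0
    curve = []
    i = 0
    while i < len(order):
        t = times[order[i]]
        deaths = 0
        removed = 0
        # Group all patients who share the same follow-up time
        while i < len(order) and times[order[i]] == t:
            deaths += events[order[i]]
            removed += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving past t
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


def split_by_median(expression, times, events):
    """Split patients into high/low groups at the median expression (an assumed cutoff)."""
    cutoff = median(expression)
    rows = list(zip(expression, times, events))
    high = [(t, e) for x, t, e in rows if x > cutoff]
    low = [(t, e) for x, t, e in rows if x <= cutoff]
    return high, low


# Hypothetical toy cohort (NOT data from this study)
expression = [9.1, 8.7, 8.2, 7.9, 7.5, 3.1, 2.8, 2.2, 1.9, 1.4]
times = [3, 5, 6, 8, 12, 10, 14, 20, 24, 30]
events = [1, 1, 1, 0, 1, 0, 1, 0, 0, 1]

high, low = split_by_median(expression, times, events)
high_curve = kaplan_meier([t for t, _ in high], [e for _, e in high])
low_curve = kaplan_meier([t for t, _ in low], [e for _, e in low])
print("high expression:", high_curve)
print("low expression:", low_curve)
```

In this toy cohort the high-expression curve drops earlier than the low-expression curve, which is the pattern a log-rank test would then assess for significance.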

Protein–protein interaction mapping between SPOCK1 and TWIST2 identifies novel pathway interactors

We characterized possible interactions between SPOCK1 and TWIST2 that could mediate their role in BM development. Co-expression of SPOCK1 and TWIST2 was confirmed in a minority of BMICs through FACS analysis and immunofluorescence (Online Resource 1 Fig. 11b, c). Although co-expression of SPOCK1 and TWIST2 may not be essential to BMIC function, since we showed that each gene regulates functional BMIC properties, we questioned whether their interaction in a small subset of BMICs could identify further novel regulators of brain metastasis. A direct interaction between SPOCK1 and TWIST2 was unlikely, as in silico probing of transcription factor binding databases revealed that the SPOCK1 promoter lacked TWIST2 binding domains [24]. Thus, we derived a physical protein–protein interaction (PPI) network to identify indirect connections between TWIST2 and SPOCK1 (Fig. 7a). In a protein–protein network model, betweenness centrality refers to the number of shortest paths passing through a given node or edge, where higher centrality denotes a more important protein hub. We analyzed centrality/betweenness of the nodes and edges to identify the 20 most central connector proteins (Online Resource 1 Table 8). These genes were then interrogated for prognostic value using transcriptomic data from an NSCLC patient cohort. Individually, 19 of the 20 genes were found to have a significant impact on patient survival (Fig. 7b, Online Resource 1 Fig. 11d and Table 8). We applied these unique connector genes as a signature to probe the TCGA mRNA-seq dataset of 433 lung adenocarcinoma patients, then validated it in an independent dataset of 204 lung adenocarcinoma patients. Strikingly, we found that the signature had significant prognostic value (Fig. 7c). RT-PCR screening of select PPI connector genes found expression of only INHBA, the most direct connector between SPOCK1 and TWIST2, to be significantly reduced in BMICs with shTWIST2 or shSPOCK1 (Fig. 7d, Online Resource 1 Fig. 12). INHBA is a ligand of the TGF-β superfamily and was found to be significantly mutated in lung adenocarcinoma [9, 41], which further hints at the involvement of the TGF-β pathway in BM development. Collectively, these data implicate a novel set of connectors between TWIST2 and SPOCK1 involved in BM development that could predict patient prognosis and provide unique therapeutic targets.
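
As an illustration of the centrality measure used to rank connector proteins, the sketch below computes node betweenness on a small undirected graph using Brandes' algorithm in plain Python. It is a hedged example and not the analysis pipeline of this study: the edge list is invented, and the names `hubX`, `geneA`, `geneB`, and `betweenness_centrality` are hypothetical placeholders rather than entries from the IID network.

```python
from collections import deque
from typing import Dict, List, Tuple


def betweenness_centrality(graph: Dict[str, List[str]]) -> Dict[str, float]:
    """Node betweenness for an unweighted, undirected graph (Brandes' algorithm)."""
    centrality = {v: 0.0 for v in graph}
    for source in graph:
        # Breadth-first search from source, counting shortest paths (sigma)
        stack = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}
        dist = {v: -1 for v in graph}
        sigma[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:  # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate pair dependencies from the farthest nodes inward
        delta = {v: 0.0 for v in graph}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != source:
                centrality[w] += delta[w]
    # Each unordered pair was counted from both endpoints
    return {v: c / 2.0 for v, c in centrality.items()}


def build_graph(edges: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Build a symmetric adjacency list from an undirected edge list."""
    graph: Dict[str, List[str]] = {}
    for a, b in edges:
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, []).append(a)
    return graph


# Hypothetical toy network (NOT the network shown in Fig. 7a)
edges = [
    ("SPOCK1", "hubX"), ("hubX", "TWIST2"),
    ("SPOCK1", "geneA"), ("geneA", "hubX"),
    ("TWIST2", "geneB"), ("geneB", "hubX"),
]
scores = betweenness_centrality(build_graph(edges))
top_connectors = sorted(scores, key=scores.get, reverse=True)[:1]
print(scores)
print(top_connectors)
```

Here `hubX` lies on the only shortest path for four node pairs, so it receives the highest score, while the remaining nodes score zero. Ranking by this score and keeping the top entries mirrors, in miniature, the selection of the most central connector proteins.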

Fig. 7

Novel interactors between SPOCK1 and TWIST2 predict BM. a Network of physical protein–protein interactions from the IID database linking TWIST2 and SPOCK1. Node color corresponds to GeneOntology biological function; edge color represents tissue evidence for the interaction. The top 20 genes with the highest centrality measure in the network are labeled by name and highlighted in red. b Volcano plot depicting HRs and corresponding P values of the 20 genes with the highest centrality in (a). Red dots depict harmful genes and blue dots depict protective genes. c Kaplan–Meier plot showing overall survival rates of low- and high-risk patients, whose risk score was predicted by the Cox model, where expression of the selected 20 genes served as covariates. Prognostic performance of the model was validated on the independent datasets from [35]. d Transcript level of INHBA in BMIC lines (BT478, BT530, BT751) transduced with shGFP (control), shSPOCK1, or shTWIST2

Discussion

The invasive nature of BM and their ability to escape aggressive treatments predict uniformly poor patient outcome [47]. Advances in screening, detection, and therapeutics for systemic cancers have led to an increase in cancer patient survival, yet leave cancer survivors vulnerable to an increased prevalence of BM [23, 38]. Here, we have successfully developed BMIC lines from primary patient BM originating from lung cancer; these samples represent an enriched source of human cells capable of completing the lung-to-brain metastatic cascade.

Syngeneic mouse models have significantly improved our understanding of molecular factors that govern subclone survival throughout stages of metastasis [3, 16, 18], though the clinical utility of these models is limited, as transgenic murine models may not recapitulate the inter- and intratumoral heterogeneity of lesions in humans. Current models for studying metastasis investigate each stage in isolation, failing to capture the dynamic evolution of tumor subclones during primary tumor growth. Nonetheless, these models have contributed to the identification of metastatic drivers, such as ST6GALNAC5 [3] and L1CAM [7] for breast cancer, and LEF1 and HOXB9 [33] for lung cancer. When paired with RNAi screens, these models have aided the identification of novel metastasis suppressor genes [15, 32].

Through our BMICs derived from patient samples of BM of lung origin, we have successfully generated an appropriate and feasible model of human BM. We build upon our previous work, in which we confirmed that our BMICs possess typical stem cell characteristics such as self-renewal and tumor initiation [34]. In our current work, we show that our BMICs are capable of completing the complex process of metastasizing to the brain. Our PDXT model provides a unique opportunity to isolate and characterize BMICs, as well as a novel platform to screen and validate BM treatments.

Our shRNA dropout screens performed with our in vitro BMIC and in vivo BMIC PDXT model systems identified SPOCK1 and TWIST2 as regulators of BMICs, with SPOCK1 in particular proving to be critical to BMIC self-renewal, tumor initiation, and migration. BMICs with knockdown of SPOCK1 and TWIST2 also displayed a reduction in stemness, as shown by downregulation of the CSC marker CD133, suggesting a potential relationship between SPOCK1, TWIST2, and stemness. Functional validation of both genes by knockdown in our PDXT model resulted in complete abrogation of brain metastasis through ICa and IT injections. Interestingly, knockdown of SPOCK1 completely inhibited tumor formation in both the primary lung and secondary brain environments, implicating SPOCK1 as an essential regulator of tumor initiation irrespective of the microenvironment. The dramatic reduction in secondary sphere formation seen with shSPOCK1 further corroborated its role in governing BMIC self-renewal. Conversely, knockdown of TWIST2 had no significant effect on primary lung tumor formation but reduced BM, providing further validation for the role of TWIST2 in BMIC initiation of metastasis. Our data implicate SPOCK1 and TWIST2 in metastatic progression, and these genes may prove useful as candidates for the detection of migrating BMICs or circulating tumor cells (CTCs). Development of anti-metastatic therapies has been limited by a lack of understanding of the intricacies of the metastatic cycle, with most drugs targeting the end stages [30].

Systemic cancer progression follows two basic models: linear progression, where the metastatic cells develop within the primary tumor prior to dissemination to a secondary site, or parallel progression, which is independent of primary tumor formation. Current research favors the parallel progression model, indicating a need to re-evaluate current treatment modalities [21]. Additionally, recent studies have shown that the presence of metastatic CSCs correlates with the overall incidence of metastasis development in patients, and so targeting these cells offers a novel pre-metastatic therapeutic option for metastasis prevention. However, targeting these cells faces several difficulties. For instance, the limited knowledge of the biology of metastatic CTCs challenges the ability to precisely identify this particular population [11], and conventional therapeutics have a short time in circulation that may result in inefficient exposure and consequently ineffective killing of CTCs [26]. The identification of novel biomarkers of this metastatic population opens up a unique therapeutic avenue for prevention of metastatic development. Early detection and therapeutic targeting of CTCs based on SPOCK1 or TWIST2 expression could terminate the metastatic process at its initiation, ideally reducing the risk of BM development.

Given that not all lung cancer patients develop BM, the immediate clinical utility of our findings is in the ability to identify those patients who are more likely to develop a BM such that they may receive targeted therapy or an escalation in current treatment protocols of NSCLC. Interestingly, SPOCK1 and TWIST2 immunohistochemistry was only positive in those lung cancer patients who developed BM, whereas lung cancers that did not develop BMs were devoid of SPOCK1 and TWIST2 expression. These data suggest that SPOCK1 and TWIST2 may have clinical utility as predictive biomarkers of BM formation in newly diagnosed NSCLC patients.

Despite SPOCK1 and TWIST2 being implicated in other primary cancers and metastases [43, 58], to date no work has shown any relationship between these two genes. We have shown that although SPOCK1 and TWIST2 are co-expressed in a minority population of BMICs, they have no direct interaction. Through PPI network analysis, we have identified several genes that connect SPOCK1 and TWIST2 indirectly and are specific to lung tissue. Several of these connecting genes have been implicated in neurological development [42] or the progression of neural diseases such as Alzheimer’s [8, 60], suggesting that these genes may predispose BMICs to home from the lung to the brain. None of these genes have been implicated in co-operating with SPOCK1 or in BM development. Of the top twenty connector genes with the highest centrality in the SPOCK1-TWIST2 PPI network (Fig. 7a), nineteen genes each had significant predictive potential for BMs in a primary lung adenocarcinoma patient cohort. When applied as a signature, these genes had an even greater impact on predicting poor patient survival. Expression of the most significant connector, INHBA, was correlated with SPOCK1 or TWIST2 expression, validating an interaction between these three genes. INHBA is a known TGF-β ligand, whose binding to TGF-β receptors promotes the downstream activation of the SMAD pathway [51]. Activation of the TGF-β pathway upregulates expression of several transcription factors known to regulate the epithelial-mesenchymal transition (EMT), such as TWIST2 [31]. SPOCK1 has also been shown to be a TGF-β downstream target gene, whose activity also promotes EMT [29]. Inhibition of the TGF-β pathway through INHBA has been shown to reduce experimental metastatic development [51]. From this circumstantial evidence, we suggest a possible interactive pathway of INHBA-activated TGF-β regulation of SPOCK1 and TWIST2, promoting BMIC proliferation and metastasis through EMT.

The known involvement of TGF-β with SPOCK1 and INHBA implicates a role for the TGF-β signaling pathway in BM development [29, 41]. Our data present a novel group of genes that interact with TWIST2 and SPOCK1 to influence BMIC metastasis and BM development in lung cancer patients. The multifaceted functional roles of BMIC genes in driving metastasis and the potent ability of BMICs to evade most therapies render BMIC regulatory genes ideal therapeutic targets.

In conclusion, we have successfully developed a novel BMIC model system of human lung-to-brain metastasis through the establishment of unique patient-derived BMIC lines. Through our clinically relevant BMIC models, we identified SPOCK1 as a novel predictive biomarker of BM and a critical regulator of the metastatic process for lung-derived BM. We have also identified a prognostic signature for lung cancer patients and present novel therapeutic targets. Blocking the metastatic process would transform a uniformly fatal systemic disease into a locally controlled and eminently more treatable one.