Single-cell transcriptomic analyses reveal distinct immune landscapes in primary and brain metastatic NSCLC
To investigate the differing immune microenvironments of PTs and BMs in NSCLC, we performed scRNA-seq on live, immune-related (CD45⁺) cells isolated from both sites using fluorescence-activated cell sorting (FACS) (Fig. 1A). We also integrated scRNA-seq data from previous studies by selecting BM samples containing more than 2,000 cells14. Together, these constituted our discovery cohort of 19 samples (8 PTs and 11 BMs). After stringent quality control and doublet removal, our final dataset comprised 117,752 cells with an average of 1532 genes and 5519 unique molecular identifiers (UMIs) per cell (Figure S1A–C, Supplementary Data 1). Removal of batch effects between our inhouse and integrated samples were performed using Harmony algorithm (Figure S1D).
Fig. 1: Single-cell RNA-seq profiles of PT and BM NSCLC samples.
A Schematic overview of the study design. Created in BioRender. Bai, M. (2026) https://BioRender.com/b4nlzov. B–D UMAP plots of 117,752 cells colored by cell type (B), sample origin—primary or brain metastasis (C), and individual samples (D). E A dot plot showing expression levels of marker genes across the indicated cell types. F UMAP plots displaying marker gene expression used for cell type identification.
We used unsupervised clustering based on gene expression profiles to identify seven major cell types characterized by lineage-specific markers: 15,019 epithelial cells (EPCAM, CDH1), 774 cancer-associated fibroblasts (COL1A2, SPARC), 70,895 T and natural killer (NK) cells (CD2, CD3D), 7097 B cells (CD19, MS4A1), 3484 plasma cells (MZB1, IGHG3), 2798 mast cells (KIT, CPA3), and 17,685 myeloid cells (LYZ, ITGAX) (Fig. 1B–F). Data on epithelial cells and cancer-associated fibroblasts were obtained from previous studies and were excluded from further analyses.
HSP70-High T cells enriched in BMs are associated with poor immunotherapy outcomes in NSCLC
Given the pivotal role of tumor-infiltrating lymphocytes in the tumor microenvironment and their impact on patient outcomes, we first evaluated T cell populations. We compared the transcriptional profiles of CD4⁺ and CD8⁺ T cells from PTs and BMs to investigate the unique transcriptional state of T cells in BMs. The heat shock proteins HSPA1A and HSPA1B were among the most upregulated genes in both CD4⁺ and CD8⁺ T cells in BMs (Fig. 2A; Figure S2A, B, and Supplementary Data 2, 3). To determine whether this upregulation extended to other HSP70 family members, we examined the expression of additional isoforms (HSPA5, HSPA8, HSPA9 etc.). However, these isoforms showed no significant differences between PTs and BMs (Figure S2C), indicating that the observed upregulation was specific to the inducible HSP70 isoforms HSPA1A and HSPA1B. Sample-level analysis of HSP70 module scores based on expression of HSPA1A and HSPA1B confirmed significantly elevated expression in both CD4⁺ (p = 0.027) and CD8⁺ (p = 0.0018) T cells from BMs compared to PTs (Fig. 2B). Based on this score, we categorized CD4⁺ and CD8⁺ T cells into HSP70-High, HSP70-Medium, and HSP70-Low groups. scRNA-seq data revealed an increased proportion of HSP70-High CD4⁺ and CD8⁺ T cells in BMs (Figure S2D).
Fig. 2: Elevated HSP70 expression in T cells from BMs is associated with poor prognosis in patients undergoing immunotherapy.
A Volcano plots showing differentially expressed genes between primary and brain metastatic NSCLC in CD4⁺ and CD8⁺ T cells. B Boxplots comparing the mean HSP70 score between primary and brain metastatic samples. P-values from two-sided Wilcoxon Rank Sum tests are shown. CD4⁺ T cells: Primary n = 8 and Brain n = 10; CD8⁺ T cells: Primary n = 8 and Brain n = 11. C Representative multiplex immunofluorescent images of NSCLC sections stained for CD4 (green), CD8 (red), and HSP70 (yellow); PTs and BMs are compared. Scale bars, 50 μm (main panels) and 10 μm (insets). Images are representative of n = 34 PT and n = 26 BM patients. D, E Box plots quantifying HSP70⁺ CD4⁺ T (D) and HSP70⁺ CD8⁺ T E cells from multiplex immunofluorescence images. (n = 34 Primary and n = 26 Brain samples). P-values from two-sided Wilcoxon Rank Sum tests are shown. F A dotplot illustrating average score of curated gene signatures in HSP70-High, HSP70-Medium, and HSP70-Low CD4+ (left) and CD8+ (right) T cells. G, H Enriched Gene Ontology (GO) terms in HSP70-High CD8⁺ T (G) and HSP70-High CD4⁺ T (H) cells. I, J Kaplan–Meier survival curves for patients with NSCLC in the Ravi (n = 122 patients) and Kim cohorts (n = 27 patients), stratified by HSP70-High CD8⁺ T cell (I) and HSP70-High CD4⁺ T cell (J) signature scores. Optimal cutoffs were determined using “surv_cutpoint” within each cohort. Hazard ratios (HRs) with 95% confidence intervals (CIs) and P-values from two-sided log-rank tests are shown. Survival curves represent Kaplan–Meier estimates; +, censored observations. In box plots (B, D, E), center line represents the median, box bounds indicate the 25th and 75th percentiles (IQR), whiskers extend to 1.5× IQR from box bounds, and outliers are shown as individual points.
To address potential confounding effects of disease stage, we assembled a stage-matched validation dataset by integrating publicly available scRNA-seq data of T cells from stage III-IV NSCLC patients with both primary tumors and brain metastases14,20,21,22 (Supplementary Data 4, Figure S2D, E). The upregulation of HSPA1A and HSPA1B in T cells showed a consistent trend in this stage-matched validation cohort (Figure S2E, F), suggesting that elevated HSP70 expression may be intrinsic to the brain metastatic microenvironment rather than solely a consequence of disease stage differences. In our in-house SDZL cohort (Supplementary Data 5), comprised of treatment naïve PT (n = 34) and BM (n = 26) samples form surgical resection or diagnostic biopsy, multiplex immunofluorescence staining revealed elevated HSP70 expression across multiple cell types in BMs (Fig. 2C). While HSP70 upregulation was not restricted to T cells, reflecting a broader stress response in the metastatic microenvironment, we focused detailed quantification on T cells given their central role as effector cells and pharmacological targets of immune checkpoint blockade. This analysis showed a significantly higher proportion of HSP70⁺ CD4⁺ T cells (p = 0.002, Fig. 2D) and a trend toward higher HSP70⁺ CD8⁺ T cells (p = 0.081, Fig. 2E) in BMs compared to PTs. Consistently, when we restricted the mIF analysis to stage IV patients (10 PT vs. 26 BM), BMs again showed higher proportions of HSP70⁺ CD4⁺ (p = 0.019) and a trend toward higher HSP70+ CD8⁺ (p = 0.073) T cells than extracranial PTs, indicating that this pattern persists within advanced-stage disease (Figure S2G).
The elevated expression of HSPA1A or HSPA1B in CD4⁺ and CD8⁺ T cells is associated with a heightened cellular response state linked to immunotherapy resistance23. Thus, we analyzed curated T cell signatures to further explore the functional characteristics of HSP70-High T cells (Fig. 2F). Both CD4⁺ and CD8⁺ HSP70-High T cells displayed elevated stress response signatures and reduced naïve signatures compared to their HSP70-Low counterparts. Beyond these shared features, HSP70-High CD4⁺ T cells exhibited increased exhaustion signatures, whereas HSP70-High CD8⁺ T cells showed elevated cytotoxicity signatures accompanied by attenuated NF-κB signaling. Pathway enrichment analysis based on sub-cluster-specific differentially expressed genes revealed that both HSP70-High CD4⁺ and CD8⁺ T cells were significantly enriched for biological processes related to protein folding, response to topologically incorrect protein, and chaperone-mediated protein folding (Fig. 2G, H; Supplementary Data 6, 7), consistent with an active cellular stress response program.
To assess the clinical relevance of HSP70-High T cells, we analyzed their association with patient outcomes in two independent NSCLC immunotherapy cohorts24,25. Higher signature scores for HSP70-High CD8⁺ T cells were significantly associated with shorter progression-free survival (PFS) in both the Ravi cohort (p = 0.015) and Kim cohort (p = 0.032), indicating association with poor immunotherapy outcome (Fig. 2I). For HSP70-High CD4⁺ T cells, higher signature scores were similarly associated with worse outcomes in the Ravi cohort (p = 0.038), with a consistent trend observed in the Kim cohort (p = 0.081) (Fig. 2J).
To elucidate the mechanisms underlying the upregulation of HSP70 expression in T cells, we performed NicheNet analysis to predict the ligands driving transcriptomic changes in these cells. PTGS2, encoding the COX-2 enzyme, was prioritized as a top candidate upstream regulator for both HSP70-high CD4⁺ and CD8⁺ T cells (Figure S2H), which we interpret as implicating activation of a COX-2–PGE₂ axis rather than PTGS2 acting as a classical secreted ligand. Additionally, the signatures of HSP70-High CD4+ and CD8+ T cells were significantly correlated with PTGS2 expression in the TCGA-LUAD cohort (Figure S2I).
In summary, our analysis revealed that T cells in brain metastases are characterized by elevated expression of HSPA1A and HSPA1B, which encode inducible HSP70 proteins. These HSP70-high T cells exhibit heightened stress responses, and their enrichment is associated with poor outcomes following immunotherapy, suggesting they may contribute to the immunosuppressive microenvironment in brain metastatic NSCLC.
Depletion of memory T cells in brain metastases is associated with immunotherapy resistance in NSCLC
Further re-clustering of T/NK cells in PTs and BMs based on transcriptional profiles and curated gene signatures revealed 15 distinct clusters. These included two NK cell clusters—NK_FGFBP2 and NK_XCL1—characterized by high expression of natural killer markers, and 13 T cell clusters marked by CD3D⁺, CD4⁺, and CD8⁺ expression (Fig. 3A, B, Figure S3, Figure S4A, Supplementary Data 8).
Fig. 3: Characterization of T/NK cells in PT and BM samples reveal enrichment of immune-suppressive Cycling T cells in BMs.
A A UMAP plot of all T and NK cells, colored by identified cell clusters. B A heat map illustrating normalized expression of canonical T and NK cell marker genes across clusters; TRM denotes tissue-resident memory cells. C A dotplot showing average score of curated gene signatures among T cell clusters. D Box plots comparing the proportions of T and NK cell subclusters between primary (n = 8) and brain metastatic (n = 11) NSCLC samples. P-values from two-sided Wilcoxon Rank Sum tests are shown. E, F Enriched GO terms in CD4_IL7R (E) and cycling (F) cells. P-values were calculated using one-sided hypergeometric tests with Benjamini–Hochberg FDR correction. G, H Kaplan–Meier survival curves for patients with NSCLC in the Ravi (n = 122 patients) and Kim (n = 27 patients) cohorts, as stratified by high and low levels of CD4_IL7R (G) and cycling (H) cells signature score; optimal cutoff values were determined using “survminer.” Hazard ratios (HRs) with 95% confidence intervals (CIs) and P-values from two-sided log-rank tests are shown. Survival curves represent Kaplan–Meier estimates; +, censored observations. In box plots (D), center line represents the median, box bounds indicate the 25th and 75th percentiles (IQR), whiskers extend to 1.5× IQR from box bounds, and outliers are shown as individual points.
Among CD4⁺ T cells, we identified seven functionally distinct subtypes. The central memory-like T cell cluster (Tcm-like, CD4_IL7R) exhibited elevated expression of canonical central memory markers including CCR7, SELL, IL7R, TCF7, and KLF2, alongside early activation markers CD69 and CD40LG (Fig. 3B, Supplementary Data 8). While these cells are present in the tumor microenvironment and undergo clonal expansion due to antigen stimulation, as indicated by CD69 expression, they exhibit gene expression patterns associated with long-term survival and self-renewal, similar to central memory T cells. The TIMP1⁺ memory T cell cluster (TIMP1+ Tn, CD4_TIMP1) showed high expression of IL7R, TIMP1, FTH1, and CD40LG. The interferon-stimulated gene (ISG)-enriched helper T cell cluster (ISG⁺ Th, CD4_ISG) was characterized by elevated expression of interferon-response genes and displayed Th1-associated features including CXCR3 and TBX21 expression (Fig. 3C, Figure S4B, C). The stress-response T cell cluster (Tstr, CD4_NR4A1) exhibited elevated expression of heat shock proteins (HSPA1A, HSPA1B, HSP90AA1) and showed high stress response signature scores among CD4⁺ T cells. The follicular helper T cell cluster (Tfh, CD4_CXCL13) was marked by high expression of CXCL13, CTLA4, PDCD1, TOX, and TIGIT, with elevated exhaustion signatures. The regulatory T cell cluster (Treg, CD4_FOXP3) displayed classical Treg markers including TNFRSF4, IL2RA, FOXP3, and CTLA4. An additional CD4⁺ cluster (CD4_XIST) was also identified. Assessment of classical Th1/Th2/Th17 lineage markers revealed that while CD4_ISG exhibited Th1-associated features, no clusters showed strong Th2 (CCR4⁺GATA3⁺) or Th17 (CCR6⁺RORC⁺IL17A⁺) polarization (Figure S4B, C).
Among CD8⁺ T cells, we identified six functionally distinct subtypes. The effector memory T cell cluster (Tem, CD8_GZMK) displayed high expression of effector molecules GZMK, GZMA, KLRG1, and CD44, with elevated activation signature scores. The tissue-resident memory T cell cluster (Trm, CD8_ZNF683) exhibited characteristic residency markers, including CD69, ITGAE, and ZNF683, with downregulated tissue egress markers CCR7, SELL, and S1PR1. The interferon-response T cell cluster (Tisg, CD8_ISG) showed elevated expression of interferon-stimulated genes and high interferon-response signature scores. The proliferating T cell cluster (CD8_Cycling) was characterized by high expression of proliferation markers MKI67, TOP2A, and STMN1, alongside elevated metabolic signatures. The exhausted T cell cluster (Tex, CD8_HAVCR2) displayed high expression of exhaustion markers HAVCR2, LAG3, TIGIT, PDCD1, and CTLA4. Consistent with previous findings, this cluster also expressed cytotoxicity-associated genes including CXCL13, ENTPD1, and TNFRSF9, suggesting these are antigen-experienced cells. The precursor exhausted T cell cluster (Tpex, CD8_SPRY1) occupied an intermediate state between effector and exhausted T cells, showing relatively lower expression of TOX and checkpoint inhibitors, while maintaining higher expression of TCF7, IL7R, and CD28 (Fig. 3A–C, Figure S4D, Supplementary Data 8). The annotation of these T cell clusters was supported by gene signatures from published studies23,26,27.
Comparative analysis of these sub-clusters between PTs and BMs, using both Wilcoxon sum rank test and scCODA analysis28, a Bayesian compositional data analysis framework, revealed notable differences (Fig. 3D, Figure S4E). BMs exhibited a lower fraction of CD4⁺ Tcm-like cells (CD4_IL7R) and CD8+ Trm cells (CD8_ZNF683) than the PTs. In contrast, BMs showed a trend of enrichment proliferating CD8⁺ T cells (CD8_Cycling). Given the imbalanced EGFR mutation distribution between PTs and BMs in our cohort (Supplementary Data 1), we performed logistic regression analysis adjusting for EGFR status. Both CD4_IL7R and CD8_ZNF683 depletion in BMs remained statistically significant after adjustment (CD4_IL7R: adjusted P = 0.0002, OR change = 5.1%; CD8_ZNF683: adjusted P = 0.00046, OR change = 31.7%), demonstrating that memory T cell depletion is largely independent of EGFR mutation status, though CD8_ZNF683 shows potential EGFR-associated variation (Supplementary Data 9). In the stage-matched validation cohort, we identified 12 major T/NK cell clusters (Figure S4F, G, Supplementary Data 10). Consistent with our discovery cohort, we identified CD4_IL7R as central memory-like T cells characterized by elevated IL7R and CD40LG expression pattern (Figure S4H). Further, we found CD4_IL7R were significantly depleted in BMs compared to PTs in validation cohort (p = 0.012, Figure S4I).
Pathway enrichment analysis of subcluster-specific differentially expressed genes revealed that CD4_IL7R cells were enriched in cellular responses to extracellular stimuli, including oxygen levels, hormonal signals, and mechanical cues, alongside lymphocyte differentiation pathways (Fig. 3E). CD8_ZNF683 cells displayed elevated expression of genes associated with lymphocyte-mediated immunity and cytotoxicity, including T cell receptor signaling, antigen receptor-mediated signaling, and cell killing pathways (Fig. 3F). These findings suggest that both CD4_IL7R and CD8_ZNF683 cells contribute to active immune surveillance and anti-tumor responses.
Pseudotime trajectory analysis using Monocle 2 with latent-dynamics validation by scTour confirmed the developmental relationships among T cell subclusters (Figure S4J, K). Both CD4_IL7R and CD8_ZNF683 occupied early positions in their respective developmental trajectories, consistent with their memory phenotypes and capacity for differentiation into more specialized effector states.
To assess the clinical relevance of these T cell populations, we analyzed bulk RNA-seq data from patients with NSCLC undergoing immunotherapy. Higher abundance of CD4_IL7R cells was significantly associated with improved PFS in the Ravi cohort (p = 0.0078) and showed a consistent trend in the Kim cohort (p = 0.057) (Fig. 3G). Similarly, elevated CD8_ZNF683 levels correlated with better prognosis in the Kim cohort (p = 0.00073), although this association did not reach statistical significance in the Ravi cohort (p = 0.28) (Fig. 3H). In contrast, increased abundance of proliferating CD8⁺ T cells (CD8_Cycling) were associated with poor outcomes in both cohorts (Ravi: p = 0.0028; Kim: p = 0.057) (Figure S4L). Additionally, we examined the clinical relevance of the precursor exhausted T cell cluster CD8_SPRY1 (Tpex), which represents a critical target population for PD-1 inhibitors. Higher abundance of CD8_SPRY1 cells was associated with improved prognosis in patients receiving immunotherapy (Ravi: p = 0.0014; Kim: p = 0.0065) (Figure S4M), consistent with their role as a renewable reservoir for anti-tumor immune responses.
These findings revealed distinct T cell populations with contrasting prognostic roles: CD4_IL7R central memory-like T cells and CD8_ZNF683 tissue-resident memory T cells, which are enriched in PTs and associated with favorable clinical outcomes; and proliferating CD8_Cycling cells, which are enriched in BMs and linked to poor prognosis following immunotherapy.
Anti-inflammatory cDC2-like dendritic cells are abundant in primary NSCLC and enhance immunotherapeutic efficacy
Owing to the crucial regulatory role of myeloid cells in the TIME, particularly in BMs, we analyzed this cell lineage. Unsupervised clustering revealed 12 myeloid cell subsets, including four tumor-associated macrophage clusters (TAM_MARCO, TAM_PLTP, TAM_SPP1, and TAM_MKI67), two monocyte clusters (Mono_FCN1 and Mono_TIMP1), four dendritic cell (DC) clusters (DC_CLEC10A, DC_CLEC9A, DC_CD207, and DC_LAMP3), and one neutrophil cluster (Fig. 4A; Figure S5). These clusters exhibited distinct gene expression patterns and transcription factor profiles (Fig. 4B, C; Supplementary Data 11).
Fig. 4: Elevated anti-inflammatory cDC2-like DCs in primary NSCLC correlate with immunotherapy efficacy.
A A UMAP plot of all myeloid cells, colored by identified cell clusters. B–D Heat maps showing normalized expression of differentially expressed genes (B), transcription factors (C), and canonical dendritic cell marker genes (D) among clusters; TLRs denote toll-like receptors. E Box plots comparing the proportions of dendritic cell subclusters between primary (n = 8) and brain metastatic (n = 11) NSCLC samples. P values from two-sided Wilcoxon Rank Sum tests are shown. F, G Enriched GO terms in DC_CLEC10A (F) and DC_CD207 (G) cells. P-values were calculated using one-sided hypergeometric tests with Benjamini–Hochberg FDR correction. H, I Trajectory analysis of dendritic cells inferred by Monocle 2, colored by cell cluster (H) and pseudotime (I). J Kaplan–Meier survival curve for patients with NSCLC in the Ravi cohort (n = 122 patients), as stratified by high and low levels of DC_CLEC10A cells; the optimal cutoff value was determined using “survminer.” Hazard ratios (HRs) with 95% confidence intervals (CIs) and P-values from two-sided log-rank tests are shown. Survival curves represent Kaplan–Meier estimates; +, censored observations. In box plots (E), center line represents the median, box bounds indicate the 25th and 75th percentiles (IQR), whiskers extend to 1.5× IQR from box bounds, and outliers are shown as individual points.
Examination of DC populations (Figure S6A) revealed distinct transcriptional profiles between DCs from PTs and BMs. Differential expression analysis showed that DCs in PTs upregulated genes associated with antigen presentation and immune activation, while DCs in BMs showed upregulation of genes related to metabolic reprogramming and stress response (Figure S6B; Supplementary Data 12). Pathway enrichment analysis further confirmed these differences: DCs from PTs were enriched for pathways related to antigen processing and presentation, peptide antigen assembly with MHC class II, and adaptive immune response, whereas DCs from BMs showed enrichment in ATP metabolic processes and oxidative phosphorylation (Figure S6C). Among the four DC clusters, DC_CLEC9A displayed high expression of cDC1 markers (XCR1, CLEC9A, CADM1); DC_CD207 and DC_CLEC10A were associated with cDC2 markers (CD1C, FCER1A, CLEC10A); and DC_LAMP3 was classified as activated DCs, characterized by high expression of activation and co-stimulatory genes (LAMP3, CD80, CD86) (Fig. 4D).
Comparative analysis of DC subcluster proportions between PTs and BMs, using both Wilcoxon rank-sum tests and scCODA, revealed that the two cDC2 clusters, DC_CLEC10A and DC_CD207, were significantly enriched in PTs (p = 0.00099 and p = 0.028, respectively, scCODA FDR < 0.05) (Fig. 4E, Figure S6D). Pathway enrichment analysis of cluster-specific differentially expressed genes revealed that DC_CLEC10A was enriched for antigen presentation of exogenous antigen, antigen presentation via MHC class II, peptide antigen assembly with MHC class II, and MHC protein complex assembly (Fig. 4F). Similarly, DC_CD207 displayed enrichment in antigen processing and presentation, peptide antigen assembly with MHC class II, and lymphocyte-mediated immunity (Fig. 4G). These findings indicate that both cDC2 populations contribute to antigen presentation and T cell activation in primary tumors. Pseudotime trajectory analysis using Monocle 2, validated by scTour, revealed the developmental relationships among DC subclusters (Fig. 4H, I; Figure S6E). Both DC_CD207 and DC_CLEC10A occupied early to intermediate positions along the developmental trajectory, consistent with their roles as antigen-presenting cells capable of T cell priming.
To assess the clinical relevance of these DC populations, we analyzed their association with patient outcomes in NSCLC patients receiving immunotherapy. The abundance of DC_CLEC10A was significantly correlated with improved overall survival (p = 0.035, Fig. 4J), whereas DC_CD207 showed a trend toward worse survival that did not reach statistical significance (p = 0.073, Figure S6F). These findings suggest a distinct distribution pattern of DC subclusters between PTs and BMs in NSCLC, with DC_CLEC10A representing a prognostically favorable population associated with enhanced immunotherapy efficacy.
Distinct monocyte subsets in NSCLC with differential associations to immunotherapy response
We identified two monocyte subclusters characterized by elevated expression of monocyte markers, such as FCN1, VCAN, S100A8, and S100A9 (Fig. 5A; Figure S7A). Comparative analysis of monocyte proportions between PTs and BMs revealed significant differences by Wilcoxon rank-sum test (Mono_FCN1 enriched in PTs, p = 0.011; Mono_TIMP1 enriched in BMs, p = 2.6 × 10⁻⁵). scCODA compositional analysis, which accounts for the interdependence of cell type proportions, confirmed Mono_FCN1 depletion in BMs as the primary compositional shift (Fig. 5B, Figure S6D). This distribution pattern was validated by multiplex immunofluorescence, which demonstrated significantly higher levels of CD14⁺FCN1⁺ cells in primary tumors (p = 7.5 × 10⁻⁵), with a trend toward increased CD14⁺TIMP1⁺ cells in brain metastases, though this did not reach statistical significance (p = 0.15) (Fig. 5C, D). In a subset analysis restricted to stage IV disease (10 PT vs. 26 BM), PTs again exhibited higher proportions of CD14⁺FCN1⁺ cells (p = 0.0085), whereas BMs showed a similar trend toward higher CD14⁺TIMP1⁺ cells (p = 0.17), indicating that this pattern persists within advanced-stage disease and is more consistent with tissue-specific rather than purely stage-driven differences (Figure S7B).
Fig. 5: Monocytes in BM and PT samples exhibit distinct functions and predict immunotherapy outcomes.
A UMAP plots showing two monocyte subclusters. B Box plots comparing the proportions of monocyte subclusters between primary (n = 8) and brain metastatic (n = 11) NSCLC samples. P values from two-sided Wilcoxon Rank Sum tests are shown. C Representative multiplex immunofluorescent images of NSCLC sections stained for TIMP1 (green), FCN1 (red), and CD14 (yellow), comparing PTs and BMs. Scale bars, 50 μm. Images are representative of n = 34 PT and n = 26 BM patients. D Box plots quantifying FCN1⁺ CD14⁺ and TIMP1⁺ CD14⁺ cells from the immunofluorescence images (n = 34 PT and n = 26 BM patients). P values from two-sided Wilcoxon Rank Sum tests are shown. E Violin plots showing pseudobulk expression (log2 CPM) of representative genes in Mono_FCN1 (green) and Mono_TIMP1 (orange) (n = 8 PT and n = 11 BM samples). P-values calculated using two-sided quasi-likelihood F-tests in edgeR with FDR correction. F Enriched GO terms in Mono_FCN1 and Mono_TIMP1 cells. P-values were calculated using one-sided hypergeometric tests with Benjamini–Hochberg FDR correction. G Trajectory analysis of monocytes and tumor-associated macrophages inferred by Monocle 2, colored by cell cluster. H, I Kaplan–Meier survival curves for NSCLC patients (n = 122 patients) in the Ravi cohort, as stratified by high and low levels of Mono_FCN1 (H) and Mono_TIMP1 (I) cells; optimal cutoff values were determined using “survminer.” Hazard ratios (HRs) with 95% confidence intervals (CIs) and P-values from two-sided log-rank tests are shown. Survival curves represent Kaplan–Meier estimates; +, censored observations. In box plots (B, D), center line represents the median, box bounds indicate the 25th and 75th percentiles (IQR), whiskers extend to 1.5× IQR from box bounds, and outliers are shown as individual points.
Transcriptional profiling using pseudobulk analysis revealed distinct functional programs between these monocyte subsets (Figure S7C). Mono_FCN1 was characterized by elevated expression of immediate early transcriptional regulators (FOS, FOSB, and NR4A2) together with immune activation markers (NLRP3 and CD83), suggesting that this subset represents an activated monocyte population undergoing transcriptional reprogramming associated with differentiation and immune activation. In contrast, Mono_TIMP1 exhibited higher expression of genes involved in immunometabolic reprogramming including MIF, LDHA, ENO1, TPI1 and GSTO1, indicative of a metabolically altered state that may support immunosuppressive functions (Fig. 5E, Figure S5C, Supplementary Data 13).
Pathway enrichment analysis further confirmed these functional distinctions (Fig. 5F). Mono_FCN1 was enriched for pathways related to response to corticosteroid, positive regulation of leukocyte-mediated cytotoxicity, chronic inflammatory response, and regulation of B cell activation, consistent with an activated, pro-inflammatory phenotype. In contrast, Mono_TIMP1 showed enrichment in ribonucleoside metabolic processes, ATP metabolic processes, and purine nucleoside triphosphate metabolic processes, underscoring its metabolically reprogrammed state. Trajectory analysis using Monocle 2 and scTour confirmed that Mono_TIMP1 represents naïve-like monocytes, while Mono_FCN1 represents an intermediate state between monocytes and macrophages (Fig. 5G, Figure S7D).
We analyzed bulk RNA-seq data from patients with NSCLC to evaluate the potential clinical relevance of these monocyte subclusters. While Mono_FCN1 abundance showed no significant association with PFS (p = 0.25, Fig. 5H), higher abundance of Mono_TIMP1 was associated with poorer prognosis (p = 0.02, Fig. 5I). These findings suggest that Mono_FCN1 is associated with an activated, pro-inflammatory state in primary tumors, whereas Mono_TIMP1 exhibits metabolic reprogramming in brain metastases that may contribute to an immunosuppressive microenvironment and reduced immunotherapy efficacy.
TAM subclusters in brain metastatic NSCLC modulate the immunosuppressive microenvironment and correlate with immunotherapeutic efficacy
Tumor-associated macrophages (TAMs) mediate antitumor immunity in PTs and BMs. We identified four distinct TAM sub-clusters in the present study (Fig. 6A). Of these subclusters, TAM_MARCO and TAM_PLTP exhibited elevated expression of immunosuppressive genes such as C1QA, C1QB, GPNMB, and APOE. Similarly, the expression of M2-like signature genes such as CD163, CCL18, MSR1, and that of MHC class II genes was elevated. This expression profile suggests an involvement in antigen presentation and the modulation of the immune microenvironment. Furthermore, TAM_SPP1 was associated with tumor angiogenesis and immune evasion based on elevated levels of VEGFA, CD163 and TGFB1. Contrastingly, TAM_MKI67 was characterized by high expression of proliferation markers MKI67 and TOP2A, indicating its role in the proliferation of TAMs (Figure S7A).
Fig. 6: Diverse tumor-associated macrophages (TAMs) in PT and BM are associated with a distinct functional profile and prognostic significance.
A UMAP plots showing four TAM subclusters. B Box plots comparing the proportions of TAM subclusters between primary (n = 8) and brain metastatic (n = 11) NSCLC samples. P-values from two-sided Wilcoxon Rank Sum tests are shown. C Representative multiplex immunofluorescent images of NSCLC sections stained for CD68 (green) and PLTP (red), comparing PTs and BMs. Scale bars, 50 μm. Images are representative of n = 34 PT and n = 26 BM patients. D Box plots quantifying PLTP⁺ CD68⁺ cells from the immunofluorescence images (n = 34 PT and n = 26 BM patients). P-values from two-sided Wilcoxon rank-sum tests are shown. E Volcano plot of pseudobulk differential gene expression between TAM_MARCO and TAM_PLTP. Significantly upregulated genes (FDR < 0.05, |log2FC | > 0.5) are colored: TAM_MARCO-enriched genes in dark grey, TAM_PLTP-enriched genes in red. F Violin plots showing pseudobulk expression (log2 CPM) of representative genes in TAM_MARCO (dark grey) and TAM_PLTP (blue) (n = 8 PT and n = 11 BM samples). P-values calculated using two-sided quasi-likelihood F-tests in edgeR with FDR correction. G Bar plots showing enriched GO terms in TAM_MARCO and TAM_PLTP cells. P-values were calculated using one-sided hypergeometric tests with Benjamini–Hochberg FDR correction. H, I Kaplan–Meier survival curves for NSCLC patients (n = 122 patients) in the Ravi cohort, as stratified by high and low levels of TAM_MARCO (H) and TAM_PLTP (I) cells; optimal cutoff values were determined using “survminer.” Hazard ratios (HRs) with 95% confidence intervals (CIs) and P-values from two-sided log-rank tests are shown. Survival curves represent Kaplan–Meier estimates; +, censored observations. In box plots (B, D), center line represents the median, box bounds indicate the 25th and 75th percentiles (IQR), whiskers extend to 1.5× IQR from box bounds, and outliers are shown as individual points.
Comparative analysis of TAM subcluster proportions between PTs and BMs, using both Wilcoxon rank-sum tests and scCODA, revealed significant differences (Fig. 6B, Figure S6D). TAM_MARCO was significantly enriched in PTs (p = 0.0053), whereas TAM_PLTP was predominantly found in BMs (p = 2.6 × 10⁻⁵), indicating site-specific functional specialization of macrophage populations. This differential distribution was validated by multiplex immunofluorescence staining, which demonstrated significantly higher levels of CD68⁺PLTP⁺ cells in BMs compared to PTs (p = 6.8 × 10⁻⁸) (Fig. 6C, D). A subset analysis restricted to stage IV disease (10 PT vs. 26 BM) likewise confirmed enrichment of CD68⁺PLTP⁺ cells in BMs (p = 0.0009), indicating that this pattern persists within advanced-stage disease and is more consistent with tissue-specific rather than purely stage-driven differences (Figure S7E).
To characterize the functional differences between TAM_MARCO and TAM_PLTP, we evaluated macrophage polarization signatures and performed pseudobulk differential expression analysis. TAM_MARCO exhibited elevated M0 and M1 signature scores and higher expression of interferon-stimulated genes (MX1, IFIT3, ISG15) alongside immune activation markers (CD44, TRIM22). In contrast, TAM_PLTP showed elevated M2 signature scores and higher expression of genes associated with immunoregulatory functions, including PLTP, FOLR2, and CD300A, as well as complement components C1QA and C1QB (Fig. 6E, F; Figure S7F, G; Supplementary Data 14).
Pathway enrichment analysis of differentially expressed genes between two subclusters revealed that TAM_MARCO and TAM_PLTP engage distinct immune activation programs (Fig. 6G). TAM_MARCO was enriched for interferon-mediated signaling and antiviral defense pathways, characteristic of anti-tumor macrophages. TAM_PLTP was enriched for complement activation (classical pathway, humoral immune response by immunoglobulin, B cell-mediated immunity) and adaptive immune responses. While both populations display immune activation signatures, complement-mediated immunity in the tumor microenvironment has been linked to tumor promotion through angiogenesis and recruitment of immunosuppressive myeloid cells29,30,31, consistent with the divergent clinical outcomes associated with these TAM subtypes.
To assess the clinical relevance of these TAM populations, we analyzed their association with patient outcomes in NSCLC patients receiving immunotherapy. Higher abundance of TAM_MARCO showed a trend toward improved PFS (p = 0.071, Fig. 6H), whereas elevated TAM_PLTP levels were significantly associated with poor prognosis (p = 0.016, Fig. 6I). These findings indicate that TAM_PLTP may contribute to the immunosuppressive microenvironment characteristic of brain metastatic NSCLC.
Characteristics of B and plasma cell subclusters in PTs and BMs
B and plasma cells are actively involved in antitumor immunity and are associated with the efficacy of immunotherapy. We re-clustered these cells into 14 distinct sub-clusters to assess their landscape in the PTs and BMs of NSCLC. These included five memory B-cell clusters (Bm_MS4A1, Bm_CD83, Bm_CD74, Bm_ALDOA, and Bm_FCRL4), two naïve B-cell clusters (Bn_IGHD and Bn_SPP1), a germinal center B-cell cluster (Bgc_LMO2), and six plasma cell clusters (PC_IGHG1, PC_FKBP2, PC_XBP1, PC_JSRP1, PC_MKI67, and PC_IGHM) (Fig. 7A, Figure S8A–E, Supplementary Data 15).
Fig. 7: Characterization of B and plasma cells in PT and BM reveals enrichment of naïve B cells in PTs.
A UMAP plots showing B and plasma cell subclusters. B Box plots comparing the proportions of B and plasma cell subclusters between primary (n = 8) and brain metastatic (n = 11) NSCLC samples. P-values from two-sided Wilcoxon Rank Sum tests are shown. C Violin plots showing marker gene expression of Bn_IGHD cells across clusters. D A heat map displaying potential ligands driving the phenotype of Bn_IGHD cells. E Bar plots showing enriched GO terms in Bn_IGHD cells. P-values were calculated using one-sided hypergeometric tests with Benjamini–Hochberg FDR correction. F Kaplan–Meier survival curve for NSCLC patients (n = 122 patients) in the Ravi cohort, as stratified by high and low levels of Bn_IGHD cells; the optimal cutoff value was determined using “survminer”. Hazard ratios (HRs) with 95% confidence intervals (CIs) and P-values from two-sided log-rank tests are shown. Survival curves represent Kaplan–Meier estimates; +, censored observations. In box plots (B), center line represents the median, box bounds indicate the 25th and 75th percentiles (IQR), whiskers extend to 1.5× IQR from box bounds, and outliers are shown as individual points.
We compared the relative abundance of these clusters between PTs and BMs to characterize their functions. We observed a decreased proportion of the naïve B cell cluster Bn_IGHD in BMs in both Wilcoxon and scCODA analysis (Fig. 7B, Figure S8F). This cluster was characterized by the elevated expression of IGHD, FCER2, IL4R, and YBX3 (Fig. 7C). B cells expressing IL4R and FCER2 are associated with B cell activation and increased IgE production32. Ligand prediction using NicheNet further identified interleukin-4 as a specific ligand for Bn_IGHD (Fig. 7D). Pathway enrichment analysis revealed that this cluster was enriched for B cell activation, leukocyte cell-cell adhesion, and antigen presentation pathways, including peptide antigen assembly with MHC class II and MHC class II protein complex assembly (Fig. 7E). These findings suggest that Bn_IGHD cells contribute to B cell-mediated immune responses and antigen presentation in primary tumors.
We explored the clinical relevance of Bn_IGHD by assessing its association with patient outcomes. Enrichment of this naïve B cell cluster showed a trend toward improved PFS (p = 0.06, Fig. 7F). These findings suggest that Bn_IGHD may contributes to antitumor immunity and serves as a potential biomarker for therapeutic responses.
Coordinated immune cell networks differ between primary tumors and brain metastases
To assess whether the immune cell populations identified above vary independently or form coordinated modules, we next quantified the correlations between the relative abundances of all immune subclusters across PT and BM samples. Spearman correlation analysis revealed groups of positively correlated clusters as well as pairs with strong inverse relationships, indicating that patients tend to segregate into distinct immune compositions rather than showing isolated changes in single subsets (Figure S9A). We then compared correlation coefficients between PT and BM samples and observed substantial remodeling of these associations, including both strengthened and weakened correlations, with multiple cluster pairs showing large shifts in correlation strength (|Δr | > 1.0), consistent with rewiring of the immune ecosystem in brain metastases (Figure S9B).
To probe potential functional links between key clusters, we further applied CellChat to model ligand–receptor–mediated communication among Mono_FCN1, DC_CLEC10A, CD4_IL7R, CD8_TRM, and TAM_PLTP. This analysis identified several interaction pathways, including adhesion and antigen-presentation related axes (e.g., ADGRE5–CD55, ANXA1–FPR1, APP–CD74, ICAM–ITGB2 and LGALS9–HAVCR2), that connect Mono_FCN1 and DC_CLEC10A with CD4_IL7R and CD8_TRM, many of which showed altered communication probabilities between PT and BM, and highlighted additional signals from TAM_PLTP to CD4_IL7R (Figure S9C). Together, these data indicate that the favorable PT-enriched populations (Mono_FCN1, DC_CLEC10A, CD4_IL7R, CD8_TRM) and the BM-enriched immunosuppressive TAM_PLTP cluster participate in a coordinated and site-specific interaction network, rather than acting in isolation.
BM-derived Immune Signature (BMIS) associates with immunotherapy efficacy across multiple cancers
To validate the clinical relevance of the cell subpopulations identified through scRNA-seq, we initially performed unsupervised clustering of the Ravi cohort based on signature scores of six key cell populations differentially abundant between primary tumors and brain metastases (CD4_IL7R, DC_CLEC10A, TAM_MARCO, HSP70-High CD8, HSP70-High CD4, TAM_PLTP). This analysis identified four distinct patient clusters characterized by differing immune infiltration profiles (Fig. 8A). Cluster 2, marked by elevated infiltration of brain metastasis-enriched populations (HSP70-High T cells and TAM_PLTP), showed a trend toward worse overall survival in the Ravi cohort, though this did not reach statistical significance (p = 0.054, Fig. 8B, Figure S10A). A similar pattern was observed in the mUC cohort (p = 0.065, Fig. 8C, Figure S10B, C).
Fig. 8: Development and validation of Brain Metastasis-derived Immune Signature (BMIS) associated with immunotherapy outcomes across cancer types.
A Heatmap showing unsupervised clustering of NSCLC patients in the Ravi cohort (n = 122 patients) based on signature scores of six cell populations. Clinical characteristics (gender, age, initial stage) and signature scores are annotated. B, C Kaplan-Meier survival curves comparing overall survival of Cluster 2 (high brain metastasis-enriched populations) versus other clusters in the Ravi cohort (n = 122 NSCLC patients) (B) and mUC cohort (n = 298 mUC patients) (C). P-values from two-sided log-rank tests are shown. D Workflow for BMIS development. Differentially expressed genes (DEGs) from brain metastasis-enriched cell populations were subjected to feature selection using Variable Selection Using Random Forests (VSURF), yielding a 7-gene signature used to construct a logistic regression model. Created in BioRender. Bai, M. (2026) https://BioRender.com/xfyb5zp. E Kaplan-Meier curves for progression-free survival (left) and overall survival (right) stratified by BMIS in the Ravi cohort (n = 122 NSCLC patients). Tick marks indicate censored observations. F Distribution of BMIS scores across immunotherapy response categories (PD, n = 41 patients; SD, n = 32 patients; PR, n = 42 patients; CR, n = 7 patients) in the Ravi cohort. G Proportion of patients achieving objective response (PR/CR) or progressive/stable disease (PD/SD) stratified by BMIS group in the Ravi cohort. Two-sided Fisher’s exact test P-value is shown. H Progression-free survival curves (left) and response rates (right) stratified by BMIS in the Kim cohort (n = 27 NSCLC patients). Two-sided Fisher’s exact test P value is shown. I Overall survival curves (left) and response rates (right) stratified by BMIS in the mUC cohort (n = 298 mUC patients). Tick marks indicate censored observations. Two-sided Fisher’s exact test P-value is shown. J, K Receiver operating characteristic (ROC) curves (left) and forest plots (right) comparing the performance of BMIS with established immunotherapy biomarkers, including tumor mutational burden (TMB), gene expression profile score (GEP), and cytolytic activity score (CYT), in the Ravi (n = 122 NSCLC patients) (J) and mUC (n = 298 mUC patients). Area under the curve (AUC) with 95% confidence intervals are shown. In survival analysis, hazard ratios (HRs) with 95% confidence intervals (CIs) and P-values from two-sided log-rank tests are shown. Survival curves represent Kaplan–Meier estimates; +, censored observations. In box plots (F), center line represents the median, box bounds indicate the 25th and 75th percentiles (IQR), whiskers extend to 1.5× IQR from box bounds, and outliers are shown as individual points.
Given the association between brain metastasis-enriched cell populations and immunotherapy resistance, we developed a quantitative prognostic model by extracting differentially expressed genes from cell populations associated with poor outcomes (HSP70-High CD4, HSP70-High CD8, TAM_PLTP). Using Variable Selection Using Random Forests (VSURF) for feature selection (Fig. 8D), we identified a robust 7-gene signature (PLK3, VAMP8, CKS2, TSPYL4, CDKN1A, LAMTOR1, IFNG). We then trained a logistic regression model to predict therapeutic response (CR/PR vs PD/SD) and used the model-derived predicted probabilities as continuous BMIS scores (Figure S10D–F).
We first evaluated BMIS performance in the discovery cohort (Ravi, n = 122). Using a prespecified cutoff (BMIS = 0.5; the standard decision boundary for predicted probabilities), patients with high BMIS scores exhibited significantly shorter PFS (log-rank p < 0.0001) and OS (log-rank p = 0.0019) (Fig. 8E). Multivariate Cox regression analysis, adjusting for age, sex, and initial tumor stage, confirmed that BMIS served as an independent prognostic factor for both PFS (p < 0.001) and OS (p = 0.016) (Figure S10G, H). Furthermore, BMIS scores were significantly elevated in patients with no therapeutic benefit (PD/SD) compared to responders (PR/CR) (Fig. 8F). Correspondingly, the objective response rate (PR/CR) in the high BMIS group was only 27.2%, substantially lower than 65.9% in the low BMIS group (Fisher’s exact test p = 7.03 × 10⁻⁵) (Fig. 8G).
To assess the generalizability of BMIS, we performed validation in three independent cohorts. In the Kim cohort (n = 27 NSCLC patients), the high BMIS group showed a consistent trend toward shorter PFS (log-rank p = 0.1), though this did not reach statistical significance, likely due to limited sample size (Fig. 8H). In the larger metastatic urothelial carcinoma (mUC) cohort (n = 298), BMIS demonstrated robust prognostic value, with high BMIS scores significantly associated with worse OS (log-rank p < 0.0001) and substantially lower objective response rates (14.4% vs 38.8%, Fisher’s exact test p < 0.0001) (Fig. 8I). However, in the melanoma cohort (n = 121), we did not observe significant associations between BMIS and either PFS or OS (PFS: p = 0.28; OS: p = 0.14) (Figure S10I–K). This suggests that the NSCLC brain metastasis-derived signature may not be directly applicable to melanoma, likely reflecting fundamental differences in tumor biology and immune microenvironments between cancer types.
Finally, we compared the predictive performance of BMIS with existing biomarkers, including tumor mutational burden (TMB), a widely used biomarker for immunotherapy response. Receiver operating characteristic (ROC) analysis in the Ravi cohort showed that BMIS (AUC = 0.773) exhibited comparable predictive power to TMB (AUC = 0.715). Importantly, a combined model integrating BMIS and TMB achieved superior performance (AUC = 0.805) (Fig. 8J). This complementary predictive value was confirmed in the mUC cohort, where the combined BMIS + TMB model (AUC = 0.764) outperformed either BMIS (AUC = 0.682) or TMB (AUC = 0.730) alone (Fig. 8K). These findings indicate that BMIS may captures immune microenvironment features distinct from tumor mutational landscape and that their integration enhances prediction of immunotherapy outcomes.