Prognostic power of a lipid metabolism gene panel for diffuse gliomas

Abstract Lipid metabolism reprogramming plays important role in cell growth, proliferation, angiogenesis and invasion in cancers. However, the diverse lipid metabolism programmes and prognostic value during glioma progression remain unclear. Here, the lipid metabolism‐related genes were profiled using RNA sequencing data from The Cancer Genome Atlas (TCGA) and Chinese Glioma Genome Atlas (CGGA) database. Gene ontology (GO) and gene set enrichment analysis (GSEA) found that glioblastoma (GBM) mainly exhibited enrichment of glycosphingolipid metabolic progress, whereas lower grade gliomas (LGGs) showed enrichment of phosphatidylinositol metabolic progress. According to the differential genes of lipid metabolism between LGG and GBM, we developed a nine‐gene set using Cox proportional hazards model with elastic net penalty, and the CGGA cohort was used for validation data set. Survival analysis revealed that the obtained gene set could differentiate the outcome of low‐ and high‐risk patients in both cohorts. Meanwhile, multivariate Cox regression analysis indicated that this signature was a significantly independent prognostic factor in diffuse gliomas. Gene ontology and GSEA showed that high‐risk cases were associated with phenotypes of cell division and immune response. Collectively, our findings provided a new sight on lipid metabolism in diffuse gliomas.

analysis. 6,7 At present, more and more studies focus on revealing the biological phenotype and molecular mechanism that altered lipid component leads to in glioma. Offer et al found that extracellular lipid loading augments hypoxic paracrine signalling and promotes glioma angiogenesis and macrophage infiltration. 8 GPIHBP1, a GDP-anchored protein of capillary endothelial cells, facilitated triglyceride-rich lipoproteins (TRLs) processing and provided a source of lipid nutrients for glioma cells. 9 Marifia and colleagues revealed that sphingosine-1-phosphate (S1P) fuelled proliferative and stemness qualities of glioblastoma stem cells. 10 However, the distinct lipid metabolism programmes and prognostic value in glioma progression need further study.
In this study, we profiled the lipid metabolism status in 859 diffuse glioma samples with gene expression data from TCGA and CGGA database. Distinct enrichments of lipid metabolism phenotype were observed between LGGs and GBM. Then, we constructed a lipid metabolism-related gene set for evaluating the risk of poor outcome, which was also validated in CGGA cohort. The gene set was closely associated with the pathological factors and could be identified as an independent prognostic feature. Taken together, our results indicated a strong connection between patients' survival and lipid metabolism in diffuse glioma.

| Patients and datasets
We collected 550 and 309 diffuse gliomas with RNA-seq data and clinical information from TCGA and CGGA database, respectively. 11,12 TCGA cohort was used as training set and CGGA cohort as validation set. All tissues and clinicopathologic information were obtained with written informed consents. This study was approved by ethics committee of Tiantan Hospital. The patient characteristics of these two cohorts were summarized in Table S1.

| Gene set selection
Four lipid metabolism-related gene sets (Reactome metabolism of lipids and lipoproteins, Reactome phospholipid metabolism, Hallmark fatty acid metabolism and KEGG glycerophospholipid metabolism) were collected from the Molecular Signature Database v5.1 (MSigDB). 13 After removing the overlapped genes, 614 lipid metabolism-related genes were obtained. The differential lipid metabolism genes between LGG and GBM were selected. By using the R package 'survival', univariate Cox analysis performed to prefilter the genes based on the P values. Then, the Cox proportional hazards model with elastic net penalty was applied for selecting signature gene, which was performed with the R package 'glmnet'. 14,15 A linear combination of signature genes expression level weighted by regression coefficients (Coeffs) was developed to calculate the risk score of each patient in training set. Then, the regression Coeffs from training set was used to compute the risk scores for cases of validation set.

| Bioinformatic analysis
Gene ontology (GO) analysis was performed for function annotation of differential genes. 16 Gene set enrichment analysis (GSEA) was applied for identifying statistically different gene sets between two groups with GSEA v3 software. 13 Principal components analysis (PCA) was carried out using the R package 'princomp' to analyse the expression pattern of grouped patients. 17,18 Utilizing the gene expression data, stromal and immune score of each sample was calculated with R package 'ESTIMATE' which reflected the gene signature enrichment of stromal and immune cells. 19

| Statistical analysis
Patients in both training and validation cohorts were assigned into high-or low-risk group based on the median value of risk score.
Kaplan-Meier curves and 2-sided log-rank test were applied to assess the survival difference between high-and low-risk groups.
Chi-square test was conducted to detect the pathologic differences between high-and low-risk patients. Univariate and multivariate Cox regression analyses were performed to assess the independent prognostic factors by using SPSS software. ROC curve analysis was used to predict overall survival (OS) with R package 'pROC'. P value <.05 was considered significant statistically.

| LGG and GBM show distinct lipid metabolism phenotypes
To detect the lipid metabolism differences during the progression of diffuse gliomas, we collected 550 patients with RNA sequencing data and clinical information from TCGA database and four lipid metabolism-related gene sets, which were integrated into one set containing 614 genes. Gene clustering using the R package 'pheatmap' found that the profile of lipid metabolism-related genes between LGG and GBM showed obvious differences ( Figure 1A). Principal components analysis based on these selected genes showed that GBM and LGG were distributed in different regions, suggesting distinct lipid metabolism phenotypes between them ( Figure 1B).
To further explore the lipid metabolism phenotypes, we performed GO analysis and found that GBM mainly exhibited an enrichment of glycosphingolipid metabolic progress, whereas LGG displayed enrichment of phosphatidylinositol metabolic progress ( Figure 1C).
Gene set enrichment analysis analysis also confirmed this finding ( Figure 1D,E). In addition, we also analysed the CGGA cohort of 309 glioma samples using the above methods, and the same results were observed between LGG and GBM ( Figure S1). Heat maps showed the differential genes between LGG and GBM, involving in glycosphingolipid and phosphatidylinositol metabolic progress ( Figure   S2). These results indicated LGG and GBM displayed distinct lipid metabolic phenotypes.

F I G U R E 1 Distinct lipid metabolism status between LGG and GBM. A, Heat map of lipid metabolism-related genes between
LGG and GBM of TCGA cohort. B, Principal components analysis of lipid metabolism-related genes between LGG and GBM. C, GO analysis of differential genes between LGG and GBM. D and E, Gene set enrichment analysis of lipid metabolism status between LGG and GBM. NES, normalized enrichment score F I G U R E 2 Identification of a prognostic signature by Cox proportional hazards model in TCGA cohort. A, Venn diagram shows prognosisrelated lipid metabolism genes which are also differentially expressed between GBM and LGG. B, Cross-validation for tuning parameter selection in the proportional hazards model. C, Heat map shows the signature genes. D, Coefficient (Coeff) values of the nine selected genes. E, Survival analysis of OS in high-and low-risk groups of patients

| Identification of a lipid metabolism-related gene set for prognostic prediction
Considering the distinct profile of lipid metabolism between LGG and GBM, we proposed to build a lipid metabolism-related gene set for predicting prognosis. By performing univariate Cox regression analysis, 297 prognosis-related genes remained (P < .05). Thirty-one out of prognosis-related genes were involved in glycosphingolipid and phosphatidylinositol metabolic progress (Figure 2A). Then, we performed the Cox proportional hazards model with elastic net regression for gene selection ( Figure 2B). Consequently, a nine-gene signature was obtained as a classifier ( Figure 2C,D), and risk score of each patient was computed with expression value and the coeffs of multivariable Cox regression.
Then, based on the median risk score, patients were assigned into high-or low-risk group. Kaplan-Meier analysis found the high-risk cases had a significantly shorter OS than low-risk ones (P < .001, Figure 2E). To validate this gene set, we also calculated patients' risk scores of CGGA cohort with same regression Coeffs.
Heat map showed the expression of signature genes in CGGA cohort ( Figure S3A). As expected, we acquired consensus result ( Figure S3B).

| The nine-gene set shows strong prognostic power
We next performed univariate and multivariate Cox regression analyses to determine the prognostic value of the acquired gene set. The results showed that the lipid metabolism-related gene set was independently correlative with OS (P = .017) ( Table 1). Consistently, this gene set could also be served as an independent prognostic factor in CGGA validation set (P = .003) ( Table 1). By computing the AUC of risk score, age and grade, we next assessed the predictive accuracy with ROC curve and found that AUC of risk score (0.86) was much higher than that of age (0.801) or grade (0.83) ( Figure S4A).
Similar results were also observed in CGGA validation set ( Figure   S4B). These results indicated that the acquired lipid metabolic gene set had strong power for prognosis prediction.

| The acquired nine-gene set is correlated with pathologic features in diffuse gliomas
We further detected whether the gene set was associated with pathologic features. As shown in Figure 3, higher level of risk scores preferred to distribute in higher grade, classical, mesenchymal, IDH-wt, MGMT promoter unmethylated or 1p/19q non-codeleted patients. We also assessed the distributive differences of these pathologic features between high-and low-risk groups by performing chi-square test. In both cohorts, most of pathologic features had significantly different distribution between risk groups except gender (Table S1). These results suggested a significant association between the lipid metabolism gene set and clinical molecular features.
P value (<.05) marked in bold was considered significant statistically.

| Application of the nine-gene panel in stratified patients
We further explored the prognostic significance of the gene panel in patients stratified by grade, IDH, MGMT promoter and 1p/19q status. In both cohorts, Kaplan-Meier analysis showed that cases with high-risk score had shorter overall survival than the low-risk ones in most stratified patients (Figure 4, Figure S5). The similar trend occurred in GBM or 1p/19q codeleted cases despite of no statistical difference ( Figure 4B,G). After that, patients were also stratified by WHO 2016 molecular subtype. Consensus results were obtained in cases of IDH-mutant LGG, whereas in IDH-wt LGG, IDH-wt GBM and IDH-mutant GBM found no significant differences ( Figure S6). These data revealed that acquired signature could accurately predict the unfavourable outcome in most stratified patients.

| High-risk cases show enhanced cell division and immune response phenotypes
To detect the biological function differences, we further compared gene expression of patients between low-and high-risk groups. PCA found that low-and high-risk cases distributed in two regions clearly ( Figure S7). Based on the differentially expressed genes (P < .05) which were identified by SAM, GO analysis found that cell division and immune response were significantly enriched in high-risk patients, whereas low-risk cases showed enrichments of chemical synaptic transmission and neurotransmitter secretion ( Figure 5A,B).

| D ISCUSS I ON
Compelling evidence has suggested that metabolism deregulation is one of the emerging hallmarks of cancer cells, due to its important role in cell growth, proliferation, angiogenesis and invasion. Warburg reported that cancer cells mainly obtain energy by shifting their metabolism towards glycolysis pathway rather than oxidative phosphorylation. 1 In addition to the abnormal glucose metabolism, lipids, amino acids and nucleic acids metabolism are also altered in cancer cells. 20 Recent studies have found that lipid metabolism reprogramming plays a crucial part in membrane synthesis, energetic production and signal transduction in the progression of cancer cells. 21 Glioma, an intractable cancer, is one of the most lethal human brain malignancies with frequent recurrences 6 months after surgery. Although great efforts have made on the glucose metabolism alterations, increasing research has indicated that lipid metabolism Risk score is a widely used approach to construct a meaningful signature. 28   LGG, and their high levels were associated with favourable outcome.
In contrast, the other five genes were up-regulated in GBM, and high expression indicated poor outcome ( Figure S8). The biological roles of these nine genes in gliomagenesis need to be further explored.
Since GO and GSEA revealed that high-risk cases showed an enhanced phenotype of immune response, we also performed the ESTIMATE algorithm to compare inflammatory microenvironment between high-and low-risk groups. Consequently, we found a significant increase in ESTIMATE scores in the high-risk group ( Figure   S9), indicating that the lipid metabolism status is associated with inflammatory microenvironment in diffuse gliomas.

| CON CLUS ION
Collectively, we profiled the lipid metabolism phenotype in diffuse gliomas and identified a lipid metabolic gene signature that could classify patients with high-and low-risk categories of poor outcome. Our workflow was summarized in Figure S10. However, prospective studies were further needed and the predictive capacity of the gene panel regarding lipid metabolism should be tested for clinical application.

ACK N OWLED G EM ENTS
The authors accomplishing this work represent the Chinese Glioma Cooperative Group (CGCG).
F I G U R E 5 Functional enrichments between low-and high-risk cases. A and B, GO analysis of differential genes between low-and highrisk cases in two cohorts. C and D, GSEA analysis based on the median value of the risk scores in two cohorts

CO N FLI C T O F I NTE R E S T
The authors confirm that there are no conflicts of interest.

AUTH O R CO NTR I B UTI O N
TJ designed the study and wrote the manuscript. FW, RC and ZZ performed the gene analysis. YL, GL and HJ collected the clinical data.

DATA AVA I L A B I L I T Y S TAT E M E N T
All data can be downloaded from TCGA database (http://cance rgemo me.nih.gov/) and CGGA database (http://www.cgga.org.cn).