Improving the prediction of chemotherapeutic sensitivity of tumors in breast cancer via optimizing the selection of candidate genes

Comput Biol Chem. 2014 Apr:49:71-8. doi: 10.1016/j.compbiolchem.2013.12.002. Epub 2014 Jan 1.

Abstract

Estrogen receptor status and the pathologic response to preoperative chemotherapy are two important indicators of the chemotherapeutic sensitivity of breast tumors and are used to guide the selection of specific regimens for patients. Microarray-based gene expression profiling, which has been applied successfully to the discovery of tumor biomarkers and the prediction of drug response, has been proposed for predicting cancer outcomes from gene signatures that are differentially expressed between two clinical states. However, when genes are selected by statistical methods alone, e.g. Student's t test, the lists of differentially expressed genes (DEGs) include many false positive genes unrelated to the phenotypic differences, and these subsequently degrade the performance of the predictive models. To improve the prediction of clinical outcomes, we optimized the selection of DEGs with a combined strategy in which the DEGs were first identified by statistical methods and then filtered by a similarity profiling approach that is used for candidate gene prioritization. We first verified the molecular functions of the DEGs identified by the combined strategy using gene expression data generated in microarray experiments on Si-Wu-Tang, a popular formula in traditional Chinese medicine. For the Si-Wu-Tang data set, gene set enrichment analysis showed that cancer-related signaling pathways were significantly enriched when the DEG lists generated by the combined strategy were used, confirming the potentially cancer-preventive effect of Si-Wu-Tang. To verify the performance of the predictive models in clinical application, we used the combined strategy to select DEGs as features from gene expression data of clinical samples collected from breast cancer patients, and constructed models to predict the chemotherapeutic sensitivity of breast tumors. After the DEG lists were refined by the similarity profiling approach, the Matthews correlation coefficients for predicting estrogen receptor status and the pathologic response to preoperative chemotherapy were 0.770 and 0.428, respectively, with the DEGs selected by fold change ranking, and 0.748 and 0.373, respectively, with the DEGs selected by significance analysis of microarrays (SAM). These values were generally higher than those achieved with the unrefined DEG lists and those achieved by the candidate models in the second phase of the Microarray Quality Control project (0.732 and 0.301, respectively). Our results demonstrate that integrating statistical methods with gene prioritization methods based on similarity profiling is a powerful strategy for DEG selection that effectively improves the performance of prediction models in clinical applications and can better guide personalized chemotherapy.
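The two-stage selection and the evaluation metric described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes expression values are already on a log2 scale, so that a difference of group means acts as a log fold change, and it stands in for the similarity profiling step with a plain set of prioritized genes supplied by the caller. The function names `rank_by_fold_change` and `refine_with_prioritized`, and the toy gene labels, are hypothetical.

```python
import math
from statistics import mean


def rank_by_fold_change(expr, labels, top_n):
    """Return the top_n genes ranked by absolute difference of group means.

    expr:   dict mapping gene name -> list of log2 expression values,
            one value per sample (assumed input layout).
    labels: list of 0/1 class labels, one per sample.
    """
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    scores = {}
    for gene, values in expr.items():
        # On log2 data, the difference of means is the log2 fold change.
        diff = mean(values[i] for i in pos) - mean(values[i] for i in neg)
        scores[gene] = abs(diff)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_n]


def refine_with_prioritized(degs, prioritized):
    """Keep only DEGs that also appear in a prioritized gene set.

    The prioritized set is a stand-in for the output of a similarity
    profiling method; how that set is produced is outside this sketch.
    """
    return [gene for gene in degs if gene in prioritized]


def matthews_corrcoef(y_true, y_pred):
    """Matthews correlation coefficient for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy data: three genes, four samples (two per class).
toy_expr = {"G1": [8, 8, 2, 2], "G2": [5, 5, 5, 5], "G3": [3, 3, 6, 6]}
toy_labels = [1, 1, 0, 0]

candidate_degs = rank_by_fold_change(toy_expr, toy_labels, top_n=2)
refined_degs = refine_with_prioritized(candidate_degs, {"G2", "G3"})
toy_mcc = matthews_corrcoef([1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 1])
```

In a setting like the one the abstract describes, the refined gene list would serve as the feature set for a classifier (the keywords mention a support vector machine), and the Matthews correlation coefficient would be computed on held-out predictions.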

Keywords: Breast cancer; Cancer outcome prediction; Gene expression profiling; Gene prioritization; Support vector machine.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Antineoplastic Agents, Phytogenic / pharmacology
  • Antineoplastic Agents, Phytogenic / therapeutic use*
  • Breast Neoplasms / drug therapy*
  • Breast Neoplasms / genetics*
  • Drugs, Chinese Herbal / pharmacology
  • Drugs, Chinese Herbal / therapeutic use*
  • Female
  • Gene Expression Profiling
  • Humans
  • MCF-7 Cells
  • Oligonucleotide Array Sequence Analysis
  • Predictive Value of Tests

Substances

  • Antineoplastic Agents, Phytogenic
  • Drugs, Chinese Herbal
  • si-wu-tang