Estrogen receptor status prediction by gene component regression: a comparative study

Int J Data Min Bioinform. 2014;9(2):149-71. doi: 10.1504/ijdmb.2014.059065.

Abstract

The aim of the study is to evaluate gene component analysis for microarray studies. Three dimension reduction strategies, Principal Component Regression (PCR), Partial Least Squares (PLS) and Reduced Rank Regression (RRR), were applied to a publicly available breast cancer microarray dataset, and the derived gene components were used for tumor classification by Logistic Regression (LR) and Linear Discriminant Analysis (LDA). The impact of gene selection/filtration was evaluated as well. We demonstrated that gene component classifiers could reduce both the high dimensionality of gene expression data and the collinearity problem inherent in most modern microarray experiments. In our study, gene component analysis could discriminate Estrogen Receptor (ER) positive breast cancers from ER negative cancers, and the proposed classifiers were successfully reproduced and projected onto an independent microarray dataset with high predictive accuracy.
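As a rough illustration of the gene component idea described in the abstract, the following minimal sketch derives a single principal component from a many-gene expression matrix and feeds its scores to a logistic regression classifier, then projects held-out samples onto the same component. It is not the authors' implementation. The data are synthetic, and the sample and gene counts, the use of one component only, the power-iteration routine (swapped in for a full decomposition), and all function and variable names are assumptions made for illustration.

```python
import math
import random

random.seed(0)

# Synthetic stand-in for expression data (NOT the study's dataset):
# one latent factor drives all genes, so the genes are strongly collinear.
N_SAMPLES, N_GENES, N_TRAIN = 60, 50, 40
loadings = [random.uniform(0.5, 1.5) for _ in range(N_GENES)]
y = [i % 2 for i in range(N_SAMPLES)]  # 1 = "ER positive", 0 = "ER negative"
X = []
for label in y:
    t = (1.0 if label else -1.0) + random.gauss(0, 0.3)  # latent factor
    X.append([l * t + random.gauss(0, 1.0) for l in loadings])


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def first_component(Xc, iters=100):
    """Leading principal axis of centered data via power iteration on X^T X."""
    p = len(Xc[0])
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        scores = [dot(row, v) for row in Xc]                   # X v
        w = [sum(s * row[j] for s, row in zip(scores, Xc))     # X^T (X v)
             for j in range(p)]
        norm = math.sqrt(dot(w, w))
        v = [x / norm for x in w]
    return v


def sigmoid(a):
    """Numerically stable logistic function."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)


def fit_logistic(z, labels, lr=0.5, epochs=300):
    """One-predictor logistic regression fitted by batch gradient descent."""
    b0 = b1 = 0.0
    n = len(z)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for zi, yi in zip(z, labels):
            err = sigmoid(b0 + b1 * zi) - yi
            g0 += err
            g1 += err * zi
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1


X_train, y_train = X[:N_TRAIN], y[:N_TRAIN]
X_test, y_test = X[N_TRAIN:], y[N_TRAIN:]

# Center with training means only, so the test set stays independent.
means = [sum(row[j] for row in X_train) / N_TRAIN for j in range(N_GENES)]


def centered(rows):
    return [[x - m for x, m in zip(row, means)] for row in rows]


# Derive the gene component on training data and fit the classifier on its scores.
v = first_component(centered(X_train))
z_train = [dot(row, v) for row in centered(X_train)]
scale = math.sqrt(sum(z * z for z in z_train) / N_TRAIN)  # score standard deviation
b0, b1 = fit_logistic([z / scale for z in z_train], y_train)

# Project the held-out samples onto the same component and classify them.
z_test = [dot(row, v) / scale for row in centered(X_test)]
pred = [1 if b0 + b1 * z > 0 else 0 for z in z_test]
accuracy = sum(p == t for p, t in zip(pred, y_test)) / len(y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

The classifier sees one score per sample rather than 50 correlated gene values, which is the sense in which a component-based approach sidesteps both dimensionality and collinearity. PLS and RRR differ in that they use the outcome when constructing the components, which this sketch does not attempt.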

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Breast Neoplasms* / genetics
  • Breast Neoplasms* / metabolism
  • Databases, Genetic*
  • Female
  • Humans
  • Neoplasm Proteins* / genetics
  • Neoplasm Proteins* / metabolism
  • Oligonucleotide Array Sequence Analysis*
  • Receptors, Estrogen* / genetics
  • Receptors, Estrogen* / metabolism

Substances

  • Neoplasm Proteins
  • Receptors, Estrogen