The use of genetic programming in the analysis of quantitative gene expression profiles for identification of nodal status in bladder cancer

BMC Cancer. 2006 Jun 16:6:159. doi: 10.1186/1471-2407-6-159.

Abstract

Background: Previous studies on bladder cancer have shown nodal involvement to be an independent indicator of prognosis and survival. This study aimed to develop an objective method for detecting nodal metastasis from molecular profiles of primary urothelial carcinoma tissues.

Methods: The study included primary bladder tumor tissues from 60 patients across different stages and 5 control tissues of normal urothelium. The entire cohort was divided into training and validation sets, each comprising node-positive and node-negative subjects. Quantitative expression profiling was performed for a panel of 70 genes using standardized competitive RT-PCR. The expression values of the training set samples were then run through genetic programming, an iterative machine learning process, which employed N-fold cross-validation to generate classifier rules of limited complexity. These rules were then used in a voting algorithm to classify the validation set samples as associated with or without nodal metastasis.
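The voting step described in the Methods can be sketched as follows. This is a minimal illustration, not the study's implementation: the abstract does not give the form of the rules or the tie-breaking behavior, so the three rules, the majority threshold, and the names `Profile`, `Rule`, `example_rules` and `majority_vote` are all assumptions made for the example.

```python
from typing import Callable, Dict, List

# A sample's expression profile: gene symbol -> expression value.
Profile = Dict[str, float]
# A classifier rule returns True when it votes "node positive".
Rule = Callable[[Profile], bool]

# Hypothetical rules, chosen only to illustrate the mechanism.
# The rules actually evolved in the study are not given in the abstract.
example_rules: List[Rule] = [
    lambda p: p["ICAM1"] > p["MAP2K6"],
    lambda p: p["MAP2K6"] > p["KDR"],
    lambda p: p["ICAM1"] > p["CDK8"],
]


def majority_vote(rules: List[Rule], profile: Profile) -> str:
    """Classify one sample by a simple majority of rule votes.

    Assumption: a strict majority is required for "node_positive",
    so a tie falls back to "node_negative".
    """
    if not rules:
        raise ValueError("at least one rule is required")
    positive_votes = sum(1 for rule in rules if rule(profile))
    if positive_votes * 2 > len(rules):
        return "node_positive"
    return "node_negative"
```

Each validation sample would be passed through `majority_vote` once, and the resulting label compared with the pathological nodal status.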

Results: The classifier rules generated using all 70 genes achieved 81% accuracy on the validation set when compared with the pathological nodal status. The rules showed a strong predilection for ICAM1, MAP2K6 and KDR, resulting in gene expression motifs that cumulatively suggested a pattern of ICAM1 > MAP2K6 > KDR for node-positive cases. Additionally, the motifs showed CDK8 to be lower relative to ICAM1, and ANXA5 to be relatively high by itself, in node-positive tumors. Rules generated using only ICAM1, MAP2K6 and KDR were comparably robust, with a single representative rule producing an accuracy of 90% when used by itself on the validation set, suggesting a crucial role for these genes in nodal metastasis.
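The ICAM1 > MAP2K6 > KDR pattern and the accuracy measure can be made concrete with a short sketch. It treats the pattern as a plain ordering test, which is a simplification: the abstract reports the pattern as a cumulative tendency across rules, not as the literal form of any single rule. The function names and the toy values below are invented for illustration and are not data from the study.

```python
from typing import Dict, List

# A sample's expression profile: gene symbol -> expression value.
Profile = Dict[str, float]


def matches_motif(profile: Profile) -> bool:
    """Return True when the sample shows ICAM1 > MAP2K6 > KDR."""
    return profile["ICAM1"] > profile["MAP2K6"] > profile["KDR"]


def accuracy(predicted: List[bool], observed: List[bool]) -> float:
    """Fraction of samples whose predicted status matches the observed one."""
    if not predicted or len(predicted) != len(observed):
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return correct / len(predicted)


# Made-up toy profiles and labels, for illustration only.
toy_profiles: List[Profile] = [
    {"ICAM1": 9.0, "MAP2K6": 4.0, "KDR": 1.0},
    {"ICAM1": 2.0, "MAP2K6": 5.0, "KDR": 3.0},
    {"ICAM1": 7.0, "MAP2K6": 3.0, "KDR": 6.0},
    {"ICAM1": 1.0, "MAP2K6": 2.0, "KDR": 8.0},
]
toy_node_positive: List[bool] = [True, False, True, False]

toy_predictions = [matches_motif(p) for p in toy_profiles]
toy_accuracy = accuracy(toy_predictions, toy_node_positive)
```

In the study, the comparison is against the pathological nodal status of the validation set samples.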

Conclusion: Our study demonstrates the use of standardized quantitative gene expression values from primary bladder tumor tissues as inputs in a genetic programming system to generate classifier rules for determining the nodal status. Our method also suggests the involvement of ICAM1, MAP2K6, KDR, CDK8 and ANXA5 in unique mathematical combinations in the progression towards nodal positivity. Further studies are needed to identify more class-specific signatures and confirm the role of these genes in the evolution of nodal metastasis in bladder cancer.

Publication types

  • Research Support, N.I.H., Extramural
  • Validation Study

MeSH terms

  • Algorithms
  • Cohort Studies
  • Computer Simulation / statistics & numerical data*
  • Gene Expression
  • Gene Expression Profiling*
  • Gene Frequency
  • Genetic Testing / statistics & numerical data*
  • Humans
  • Lymph Nodes / pathology
  • Lymphatic Metastasis / diagnosis*
  • Lymphatic Metastasis / genetics
  • Predictive Value of Tests
  • Reproducibility of Results
  • Sensitivity and Specificity
  • Statistics as Topic
  • Urinary Bladder Neoplasms / genetics*
  • Urinary Bladder Neoplasms / secondary*