Dissecting systems-wide data using mixture models: application to identify affected cellular processes

J Peter Svensson; Renée X de Menezes; Ingela Turesson; Micheline Giphart-Gassler; Harry Vrieling

doi:10.1186/1471-2105-6-177

BMC Bioinformatics. 2005 Jul 14;6:177. doi: 10.1186/1471-2105-6-177.

Authors

J Peter Svensson¹, Renée X de Menezes, Ingela Turesson, Micheline Giphart-Gassler, Harry Vrieling

Affiliation

¹ Department of Toxicogenetics, Leiden University Medical Centre, P.O. Box 9503, 2300 RA Leiden, The Netherlands. p.svensson@lumc.nl

Abstract

Background: Functional analysis of data from genome-scale experiments, such as microarrays, requires an extensive selection of differentially expressed genes. Under many conditions, the proportion of differentially expressed genes is considerable, so the selection criteria must balance the inclusion of false positives against the loss of truly affected genes as false negatives.

Results: We developed an analytical method to determine a p-value threshold from a microarray experiment that depends on the quality and design of the data set. To this end, populations of p-values are modeled as mathematical functions whose parameters are estimated in an unsupervised manner. The strength of the method is exemplified by its application to a published gene expression data set of sporadic and familial breast tumors with BRCA1 or BRCA2 mutations.
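The idea of modeling a population of p-values as a mixture and deriving a threshold from the fitted parameters can be illustrated with a minimal sketch. The abstract does not state which functions the authors used, so this example assumes a generic beta-uniform mixture, f(p) = lam + (1 - lam) * a * p^(a - 1), fitted by expectation-maximization; the function names, the simulation helper, and the 0.5 posterior cutoff are all hypothetical choices for illustration, not the published method.

```python
import math
import random


def fit_beta_uniform(p_values, n_iter=500, tol=1e-9):
    """EM fit of the ASSUMED mixture f(p) = lam + (1 - lam) * a * p**(a - 1).

    lam is the weight of the uniform (unaffected) component and
    a in (0, 1) shapes the Beta(a, 1) (affected) component.
    """
    # Clamp away from 0 so that log(p) stays finite.
    log_p = [math.log(min(max(x, 1e-300), 1.0)) for x in p_values]
    n = len(log_p)
    lam, a = 0.5, 0.5  # arbitrary starting values
    for _ in range(n_iter):
        # E-step: responsibility of the Beta(a, 1) component for each p-value.
        w = []
        for lp in log_p:
            alt = (1.0 - lam) * a * math.exp((a - 1.0) * lp)
            w.append(alt / (lam + alt))
        # M-step: closed-form weighted maximum-likelihood updates.
        sw = sum(w)
        swl = sum(wi * lp for wi, lp in zip(w, log_p))
        new_a = -sw / swl if swl < 0.0 else a
        new_a = min(max(new_a, 1e-3), 1.0 - 1e-6)
        new_lam = min(max(1.0 - sw / n, 1e-6), 1.0 - 1e-6)
        converged = abs(new_a - a) < tol and abs(new_lam - lam) < tol
        lam, a = new_lam, new_a
        if converged:
            break
    return lam, a


def posterior_threshold(lam, a, posterior=0.5):
    """Largest p whose posterior probability of being affected is >= posterior."""
    # Solve (1 - lam) * a * p**(a - 1) / f(p) = posterior for p.
    ratio = posterior * lam / ((1.0 - posterior) * (1.0 - lam) * a)
    if ratio <= 1.0:
        return 1.0  # the affected component dominates on the whole interval
    return ratio ** (1.0 / (a - 1.0))


def simulate_p_values(n, lam, a, seed=0):
    """Hypothetical test data: uniform with probability lam, else Beta(a, 1)."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        u = rng.random()
        # Inverse-CDF sampling: the CDF of Beta(a, 1) is p**a.
        out.append(u if rng.random() < lam else u ** (1.0 / a))
    return out
```

Calling `fit_beta_uniform` on a list of p-values returns the estimated `(lam, a)`, and `posterior_threshold(lam, a)` converts them into a data-dependent cutoff. Because the cutoff follows from the fitted mixture rather than from a fixed significance level, it shifts with the proportion and strength of affected genes in the data set, which is the behavior the abstract describes.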

Conclusion: We present an objective and unsupervised way to set thresholds adapted to the quality and design of the experiment. The resulting mathematical description of data sets from genome-scale experiments enables a probabilistic approach in systems biology.

MeSH terms

Breast Neoplasms / genetics
Cell Cycle / genetics
Computational Biology / methods*
DNA-Binding Proteins / metabolism
Gene Expression Profiling / methods*
Gene Expression Regulation, Neoplastic / genetics
Genetic Testing / methods
Humans
Models, Genetic*
Phosphorylation
Predictive Value of Tests
Protein Array Analysis / methods

Substances

DNA-Binding Proteins