Logistic regression model for analyzing extended haplotype data

Genet Epidemiol. 1998;15(2):173-81. doi: 10.1002/(SICI)1098-2272(1998)15:2<173::AID-GEPI5>3.0.CO;2-7.

Abstract

Recently, there has been increased interest in evaluating extended haplotypes in p53 as risk factors for cancer. An allele-specific polymerase chain reaction (PCR) method, confirmed by restriction analysis, has been used to determine absolute extended haplotypes in diploid genomes. We describe statistical analyses for comparing cases and controls, or comparing different ethnic groups with respect to haplotypes composed of several biallelic loci, especially in the presence of other covariates. Tests based on cross-tabulating all possible genotypes by disease state can have limited power due to the large number of possible genotypes. Tests based simply on cross-tabulating all possible haplotypes by disease state cannot be extended to account for other variables measured on the individual. We propose imposing an assumption of additivity upon the haplotype-based analysis. This yields a logistic regression in which the outcome is case or control, and the predictor variables include the number of copies (0, 1, or 2) of each haplotype, as well as other explanatory variables. In a case-control study, the model can be constructed so that each coefficient gives the log odds ratio for disease for an individual with a single copy of the suspect haplotype and another copy of the most common haplotype, relative to an individual with two copies of the most common haplotype. We illustrate the method with published data on p53 and breast cancer. The method can also be applied to any polymorphic system, whether multiple alleles at a single locus or multiple haplotypes over several loci.
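
The additive coding described in the abstract can be illustrated with a minimal sketch. It assumes each individual's two haplotypes are already resolved. The haplotype labels ("A", "B", "C"), the function name haplotype_copy_counts, and the choice to pick the reference from the pooled sample are illustrative assumptions, not details taken from the paper. The most common haplotype is treated as the reference and gets no column; every other haplotype gets a column holding its copy number (0, 1, or 2).

```python
from collections import Counter

def haplotype_copy_counts(pairs):
    """Build an additive haplotype coding (hypothetical helper).

    pairs: list of 2-tuples, one per individual, giving the two
    haplotypes that individual carries.
    Returns (reference, columns, rows).
    """
    # Count each haplotype over all chromosomes in the sample.
    totals = Counter(h for pair in pairs for h in pair)
    # Assumption: the reference is the most common haplotype in the
    # pooled sample; ties are broken alphabetically.
    reference = max(sorted(totals), key=lambda h: totals[h])
    # One predictor column per non-reference haplotype.
    columns = [h for h in sorted(totals) if h != reference]
    # Each entry is the number of copies (0, 1, or 2) an individual carries.
    rows = [[pair.count(h) for h in columns] for pair in pairs]
    return reference, columns, rows

# Made-up genotypes for five individuals, for illustration only.
pairs = [("A", "A"), ("A", "B"), ("B", "C"), ("A", "C"), ("A", "A")]
reference, columns, rows = haplotype_copy_counts(pairs)
```

The resulting rows, with any other explanatory variables appended as extra columns, could be passed to a standard logistic regression routine with case or control status as the outcome. Under this coding, the exponentiated coefficient for a haplotype column estimates the odds ratio for one copy of that haplotype plus one copy of the reference, relative to two copies of the reference.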

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Alleles
  • Breast Neoplasms / ethnology
  • Breast Neoplasms / genetics
  • Case-Control Studies
  • Cohort Studies
  • Data Interpretation, Statistical
  • Female
  • Genotype
  • Haplotypes*
  • Humans
  • Logistic Models*
  • Models, Genetic*
  • Racial Groups / genetics