The sampling distribution of disease-associated alleles

Genetics. 1997 Dec;147(4):1855-61. doi: 10.1093/genetics/147.4.1855.

Abstract

A theory is developed that provides the sampling distribution of low-frequency alleles at a single locus under the assumption that each allele is the result of a unique mutation. The number of copies of each allele is assumed to follow a linear birth-death process with sampling. If the population is of constant size, standard results from the theory of birth-death processes show that the distribution of the number of copies of each allele is logarithmic and that the joint distribution of the numbers of copies of k alleles found in a sample of size n follows the Ewens sampling distribution. If the population from which the sample was obtained was increasing in size, if there are different selective classes of alleles, or if there are differences in penetrance among alleles, the Ewens distribution no longer applies. Likelihood functions for a given set of observations are obtained under different alternative hypotheses. These results are applied to published data from the BRCA1 locus (associated with early-onset breast cancer) and the factor VIII locus (associated with hemophilia A) in humans. In both cases, the sampling distribution of alleles allows rejection of the null hypothesis, but relatively small deviations from the null model can account for the data. In particular, roughly the same population growth rate appears consistent with both data sets.
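For readers unfamiliar with the null model named in the abstract, the following is a minimal sketch of the Ewens sampling formula in its standard textbook form. It is not the paper's own likelihood code, and it does not cover the alternative hypotheses (population growth, selection, penetrance). The parameter name `theta` (a scaled mutation parameter, assumed positive) and the function names are assumptions introduced here for illustration; the abstract does not state them.

```python
from collections import Counter
from math import exp, lgamma, log


def ewens_log_prob(allele_counts, theta):
    """Log-probability of an allele configuration under the Ewens sampling formula.

    allele_counts: number of copies of each distinct allele in the sample,
                   e.g. [3, 1, 1] for k = 3 alleles in a sample of n = 5.
    theta:         scaled mutation parameter (hypothetical name, must be > 0).
    """
    n = sum(allele_counts)
    # a[j] = number of alleles that appear exactly j times in the sample
    a = Counter(allele_counts)
    # log of n! / (theta * (theta + 1) * ... * (theta + n - 1))
    log_p = lgamma(n + 1) - (lgamma(theta + n) - lgamma(theta))
    # product over j of theta^a_j / (j^a_j * a_j!)
    for j, a_j in a.items():
        log_p += a_j * (log(theta) - log(j)) - lgamma(a_j + 1)
    return log_p


def ewens_prob(allele_counts, theta):
    """Probability of the configuration (exponentiated log-probability)."""
    return exp(ewens_log_prob(allele_counts, theta))
```

As a sanity check on the sketch, the probabilities of all configurations for a fixed sample size should sum to one; for n = 3 these are `[3]`, `[2, 1]`, and `[1, 1, 1]`. The order of entries in `allele_counts` does not matter, since only the multiplicities enter the formula.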

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Alleles*
  • BRCA1 Protein / genetics
  • Breast Neoplasms / genetics
  • Factor VIII / genetics
  • Female
  • Hemophilia A / genetics
  • Humans
  • Mathematical Computing
  • Models, Genetic*
  • Models, Statistical*
  • Mutation
  • Population Density
  • Sampling Studies*
  • Selection, Genetic

Substances

  • BRCA1 Protein
  • Factor VIII