A Fast Association Test for Identifying Pathogenic Variants Involved in Rare Diseases

Am J Hum Genet. 2017 Jul 6;101(1):104-114. doi: 10.1016/j.ajhg.2017.05.015. Epub 2017 Jun 29.

Abstract

We present a rapid and powerful inference procedure for identifying loci associated with rare hereditary disorders using Bayesian model comparison. Under a baseline model, disease risk is fixed across all individuals in a study. Under an association model, disease risk depends on a latent bipartition of rare variants into pathogenic and non-pathogenic variants, the number of pathogenic alleles that each individual carries, and the mode of inheritance. A parameter indicating presence of an association and the parameters representing the pathogenicity of each variant and the mode of inheritance can be inferred in a Bayesian framework. Variant-specific prior information derived from allele frequency databases, consequence prediction algorithms, or genomic datasets can be integrated into the inference. Association models can be fitted to different subsets of variants in a locus and compared using a model selection procedure. This procedure can improve inference if only a particular class of variants confers disease risk and can suggest particular disease etiologies related to that class. We show that our method, called BeviMed, is more powerful and informative than existing rare variant association methods in the context of dominant and recessive disorders. The high computational efficiency of our algorithm makes it feasible to test for associations in the large non-coding fraction of the genome. We have applied BeviMed to whole-genome sequencing data from 6,586 individuals with diverse rare diseases. We show that it can identify multiple loci involved in rare diseases, while correctly inferring the modes of inheritance, the likely pathogenic variants, and the variant classes responsible.
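The model comparison described in the abstract can be illustrated with a small toy sketch. It is not the BeviMed algorithm or its API. It assumes Beta priors on disease risk, an independent prior probability `omega` that each variant is pathogenic, a fixed mode of inheritance expressed as a minimum number of pathogenic alleles (`min_alleles=1` for dominant, `2` for recessive), and exhaustive enumeration of every bipartition, which is only feasible for a handful of variants. All function names, parameter names, and default values are hypothetical choices made for illustration.

```python
import itertools
import math


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_marginal(cases, controls, a, b):
    """Log marginal likelihood of case/control labels with risk ~ Beta(a, b)."""
    return log_beta(a + cases, b + controls) - log_beta(a, b)


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def posterior_prob_association(y, G, prior_assoc=0.01, omega=0.5,
                               tau_prior=(1, 10), pi_prior=(10, 1),
                               min_alleles=1):
    """Toy posterior probability of association at one locus.

    y: list of 0/1 case labels, one per individual.
    G: list of per-individual allele-count rows, one column per rare variant.
    All priors here are illustrative assumptions, not values from the paper.
    """
    n_variants = len(G[0])

    # Baseline model: one disease risk shared by all individuals.
    n_cases = sum(y)
    log_ml_base = log_marginal(n_cases, len(y) - n_cases, *tau_prior)

    # Association model: sum over every bipartition z of the variants
    # into pathogenic (1) and non-pathogenic (0).
    terms = []
    for z in itertools.product((0, 1), repeat=n_variants):
        k = sum(z)
        log_prior_z = k * math.log(omega) + (n_variants - k) * math.log(1 - omega)
        cases = [0, 0]      # indexed by "has a pathogenic configuration"
        controls = [0, 0]
        for y_i, g_i in zip(y, G):
            n_path = sum(g * z_j for g, z_j in zip(g_i, z))
            x = int(n_path >= min_alleles)
            if y_i:
                cases[x] += 1
            else:
                controls[x] += 1
        # Background risk for x = 0, elevated risk for x = 1.
        log_ml_z = (log_marginal(cases[0], controls[0], *tau_prior)
                    + log_marginal(cases[1], controls[1], *pi_prior))
        terms.append(log_prior_z + log_ml_z)
    log_ml_assoc = log_sum_exp(terms)

    # Bayes' rule over the two competing models.
    log_num = math.log(prior_assoc) + log_ml_assoc
    log_den = log_sum_exp([log_num, math.log(1 - prior_assoc) + log_ml_base])
    return math.exp(log_num - log_den)


if __name__ == "__main__":
    # Four cases each carry variant 0 or 1; one of 16 controls carries variant 2.
    y = [1, 1, 1, 1] + [0] * 16
    G = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 1, 0], [0, 0, 1]] + [[0, 0, 0]] * 15
    print(posterior_prob_association(y, G))
```

Enumeration keeps the sketch exact and short, but its cost doubles with every added variant, so it only serves to make the structure of the comparison concrete. Variant-specific prior information of the kind the abstract mentions could be represented in this toy by replacing the single `omega` with one prior probability per variant.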

Keywords: Bayesian inference; Mendelian diseases; hereditary disorders; rare diseases; rare variant association test; rare variants; whole-genome sequencing.

MeSH terms

  • Cardiomyopathies / genetics
  • Computer Simulation
  • Genetic Loci
  • Genetic Predisposition to Disease*
  • Genetic Variation*
  • Genome-Wide Association Study*
  • Humans
  • Immunologic Deficiency Syndromes / genetics
  • Intercellular Signaling Peptides and Proteins
  • Mental Retardation, X-Linked / genetics
  • Nuclear Proteins / genetics
  • Osteochondrodysplasias / genetics
  • Primary Immunodeficiency Diseases
  • Probability
  • Rare Diseases / genetics*
  • Retinal Diseases / genetics
  • Thrombocytopenia / genetics

Substances

  • ANKRD26 protein, human
  • Intercellular Signaling Peptides and Proteins
  • Nuclear Proteins

Supplementary concepts

  • Roifman syndrome