Recursive organizer (ROR): an analytic framework for sequence-based association analysis

Hum Genet. 2013 Jul;132(7):745-59. doi: 10.1007/s00439-013-1285-4. Epub 2013 Mar 14.

Abstract

The advent of next-generation sequencing technologies makes it possible to sequence thousands of subjects cost-effectively and is revolutionizing the landscape of genetic research. As genotyping and sequencing technologies evolve, it is not unrealistic to expect that a pair of fully phased diploid genome sequences will soon be obtainable from each subject. In light of this potential, we propose an analytic framework called the recursive organizer (ROR), which recursively groups sequence variants, based on sequence similarities and their empirical disease associations, into fewer and potentially more interpretable super sequence variants (SSVs). As an illustration, we applied ROR to assess the association between HLA-DRB1 and type 1 diabetes (T1D), discovering SSVs of HLA-DRB1 with sequence data from the Wellcome Trust Case Control Consortium. Specifically, ROR reduces 36 observed unique HLA-DRB1 sequences to 8 SSVs that empirically associate with T1D, a roughly 4.5-fold reduction in sequence complexity. Using HLA-DRB1 data from the Type 1 Diabetes Genetics Consortium as cases and data from the Fred Hutchinson Cancer Research Center as controls, we are able to validate the associations of these SSVs with T1D. Further, the SSVs consist of nine nucleotides, and each associates with its corresponding amino acids. Detailed examination of these selected amino acids reveals their potential functional roles in protein structure and possible implications for the mechanism of T1D.
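
The grouping idea described in the abstract can be illustrated with a minimal sketch. This is not the ROR algorithm itself, whose details are not given here; it is a simple greedy agglomerative stand-in that merges groups of sequences only when they are both similar in sequence and similar in empirical disease association. The names `Group`, `merge_groups`, `max_hamming`, and `max_log_or_gap`, the use of Hamming distance, and the 0.5-corrected log odds ratio are all assumptions made for illustration.

```python
from dataclasses import dataclass
from itertools import combinations
from math import log
from typing import List


@dataclass
class Group:
    """A candidate grouping of sequences with pooled case/control counts (hypothetical type)."""
    sequences: List[str]
    cases: int
    controls: int


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def log_odds(g: Group, total_cases: int, total_controls: int) -> float:
    """Log odds ratio of the group versus all other sequences, with a 0.5 continuity correction."""
    a = g.cases + 0.5
    b = g.controls + 0.5
    c = total_cases - g.cases + 0.5
    d = total_controls - g.controls + 0.5
    return log((a * d) / (b * c))


def group_distance(g1: Group, g2: Group) -> int:
    """Smallest Hamming distance between any pair of member sequences (single linkage)."""
    return min(hamming(s, t) for s in g1.sequences for t in g2.sequences)


def merge_groups(groups: List[Group], max_hamming: int, max_log_or_gap: float) -> List[Group]:
    """Repeatedly merge the closest eligible pair until no pair satisfies both thresholds."""
    total_cases = sum(g.cases for g in groups)
    total_controls = sum(g.controls for g in groups)
    groups = list(groups)
    while True:
        best = None
        for i, j in combinations(range(len(groups)), 2):
            d = group_distance(groups[i], groups[j])
            gap = abs(log_odds(groups[i], total_cases, total_controls)
                      - log_odds(groups[j], total_cases, total_controls))
            if d <= max_hamming and gap <= max_log_or_gap:
                key = (d, gap)
                if best is None or key < best[0]:
                    best = (key, i, j)
        if best is None:
            return groups
        _, i, j = best
        merged = Group(groups[i].sequences + groups[j].sequences,
                       groups[i].cases + groups[j].cases,
                       groups[i].controls + groups[j].controls)
        groups = [g for k, g in enumerate(groups) if k not in (i, j)] + [merged]
```

Called on a list with one `Group` per observed unique sequence, `merge_groups` returns a shorter list of pooled groups, which play the role that SSVs play in the abstract. The two thresholds are arbitrary tuning parameters in this sketch and carry no meaning from the paper.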

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms*
  • Diabetes Mellitus, Type 1 / genetics*
  • Female
  • Genome, Human*
  • Genome-Wide Association Study / methods*
  • HLA-DRB1 Chains / genetics*
  • Humans
  • Male
  • Sequence Analysis, DNA / methods*

Substances

  • HLA-DRB1 Chains