Recovery of novel association loci in Arabidopsis thaliana and Drosophila melanogaster through leveraging INDELs association and integrated burden test

PLoS Genet. 2018 Oct 16;14(10):e1007699. doi: 10.1371/journal.pgen.1007699. eCollection 2018 Oct.

Abstract

Short insertions and deletions (INDELs) and larger structural variants have been increasingly employed in genetic association studies, but few improvements over SNP-based association have been reported. To understand why this might be the case, we analysed two publicly available datasets and observed that 63% of INDELs called in A. thaliana and 64% in D. melanogaster populations are misrepresented as multiple alleles with different functional annotations, i.e. the same underlying variant is represented by inconsistent alignments, leading to different variant calls. To address this issue, we have developed the software Irisas to reclassify and re-annotate these variants, which we then used for single-locus tests of association. We also integrated the re-annotated variants to predict the functional impact of SNPs, INDELs, and structural variants for burden testing. Using both approaches, we re-analysed the genetic architecture of complex traits in A. thaliana and D. melanogaster. Heritability analysis using SNPs alone explained on average 27% and 19% of phenotypic variance for A. thaliana and D. melanogaster, respectively. Our method explained an additional 11% and 3%, respectively. We also identified novel trait loci that previous SNP-based association studies failed to map, and which contain established candidate genes. Our study shows the value of testing INDELs for association and of integrating multiple types of variants in association studies in plants and animals.
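The misrepresentation described in the abstract can be illustrated with a toy example. The sketch below is not the Irisas algorithm. It only shows, under simplifying assumptions, how two INDEL calls with different positions and alleles can describe the same underlying sequence in a repeat region. The reference string, the 0-based coordinates, and the helper names `apply_variant` and `same_haplotype` are all hypothetical and chosen for illustration.

```python
# Toy illustration (not the Irisas method): two differently aligned
# deletion calls in a "CA" repeat produce the same haplotype sequence.

def apply_variant(ref: str, pos: int, ref_allele: str, alt_allele: str) -> str:
    """Apply one VCF-style variant to a reference string.

    `pos` is 0-based here for simplicity (VCF itself is 1-based).
    """
    if ref[pos:pos + len(ref_allele)] != ref_allele:
        raise ValueError("REF allele does not match the reference sequence")
    return ref[:pos] + alt_allele + ref[pos + len(ref_allele):]


def same_haplotype(ref: str, call_a: tuple, call_b: tuple) -> bool:
    """Return True if both calls yield an identical alternative sequence."""
    return apply_variant(ref, *call_a) == apply_variant(ref, *call_b)


# Hypothetical reference containing a short CA repeat.
reference = "GGCACACACTT"

# Left-aligned call: delete the first "CA" unit (anchored on the G at index 1).
call_left = (1, "GCA", "G")
# Right-shifted call: delete "AC" near the end of the repeat (anchored at index 6).
call_right = (6, "CAC", "C")

print(apply_variant(reference, *call_left))
print(apply_variant(reference, *call_right))
print(same_haplotype(reference, call_left, call_right))
```

Comparing the resulting sequences rather than the call coordinates is one simple way to recognise that two records describe a single allele. A real pipeline would also have to handle overlapping variants, multi-allelic sites, and annotation, none of which is attempted here.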

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Arabidopsis / genetics
  • Drosophila melanogaster / genetics
  • Genetic Association Studies / methods*
  • Genotype
  • INDEL Mutation / genetics*
  • Phenotype
  • Polymorphism, Single Nucleotide / genetics
  • Quantitative Trait Loci / genetics
  • Sequence Analysis, DNA / methods*
  • Software

Grants and funding

This work was funded by a Max Planck Society core grant to the Department of Comparative Development and Genetics. BS is supported by a China Scholarship Council Fellowship, no. 201306300026. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.