Barnacle: detecting and characterizing tandem duplications and fusions in transcriptome assemblies

BMC Genomics. 2013 Aug 14;14:550. doi: 10.1186/1471-2164-14-550.

Abstract

Background: Chimeric transcripts, including partial and internal tandem duplications (PTDs, ITDs) and gene fusions, are important in the detection, prognosis, and treatment of human cancers.

Results: We describe Barnacle, a production-grade analysis tool that detects such chimeras in de novo assemblies of RNA-seq data and supports prioritizing them for review and validation by reporting the relative coverage of co-occurring chimeric and wild-type transcripts. We demonstrate applications in large-scale disease studies by identifying PTDs in MLL, ITDs in FLT3, and reciprocal fusions between PML and RARA in two deeply sequenced acute myeloid leukemia (AML) RNA-seq datasets.
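
As an illustration of prioritizing by relative coverage, the following minimal Python sketch ranks candidate chimeras by the fraction of coverage supporting the chimeric transcript. It is not Barnacle's implementation: the `ChimeraCall` record, the `relative_coverage` formula (chimeric coverage divided by chimeric plus wild-type coverage), the `min_fraction` threshold, and all gene names and coverage values are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class ChimeraCall:
    """Hypothetical record for one predicted chimeric transcript."""
    gene: str                 # placeholder gene label, not real data
    kind: str                 # "PTD", "ITD", or "fusion"
    chimeric_coverage: float  # read coverage supporting the chimera
    wildtype_coverage: float  # read coverage of the co-occurring wild type


def relative_coverage(call: ChimeraCall) -> float:
    """Assumed definition: chimeric share of the combined coverage."""
    total = call.chimeric_coverage + call.wildtype_coverage
    return call.chimeric_coverage / total if total > 0 else 0.0


def prioritize(calls, min_fraction=0.0):
    """Drop calls below the threshold, then sort strongest first."""
    kept = [c for c in calls if relative_coverage(c) >= min_fraction]
    return sorted(kept, key=relative_coverage, reverse=True)


# Invented values, for illustration only.
calls = [
    ChimeraCall("GENE_A", "PTD", 30.0, 70.0),
    ChimeraCall("GENE_B", "ITD", 5.0, 95.0),
    ChimeraCall("GENE_C", "fusion", 60.0, 40.0),
]

for call in prioritize(calls, min_fraction=0.1):
    print(f"{call.gene}\t{call.kind}\t{relative_coverage(call):.2f}")
```

Under this assumed definition, a call whose chimeric transcript carries a larger share of the coverage is reviewed first, and the threshold stands in for the kind of filter setting mentioned in the conclusions.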

Conclusions: Our analyses of real and simulated datasets show that, with appropriate filter settings, Barnacle makes highly specific predictions for three types of chimeric transcripts that are important in a range of cancers: PTDs, ITDs, and fusions. High specificity makes manual review and validation efficient, which is necessary in large-scale disease studies. Characterizing an extended range of chimera types will help generate insights into progression, treatment, and outcomes for complex diseases.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Breast Neoplasms / genetics
  • Exons / genetics
  • Gene Duplication / genetics*
  • Gene Expression Profiling / methods*
  • Gene Fusion / genetics*
  • Genomics*
  • Humans
  • Leukemia, Myeloid, Acute / genetics
  • Molecular Sequence Annotation
  • RNA, Messenger / genetics
  • Statistics as Topic

Substances

  • RNA, Messenger