Assembly of the working draft of the human genome with GigAssembler

Genome Res. 2001 Sep;11(9):1541-8. doi: 10.1101/gr.183201.

Abstract

The data for the public working draft of the human genome contain roughly 400,000 initial sequence contigs in approximately 30,000 large-insert clones. Many of these initial sequence contigs overlap. A program, GigAssembler, was built to merge them and to order and orient the resulting larger sequence contigs based on mRNA, paired plasmid ends, EST, BAC end pairs, and other information. This program produced the first publicly available assembly of the human genome, a working draft containing roughly 2.7 billion base pairs and covering an estimated 88% of the genome. This draft has been used for several recent studies of the genome. Here we describe the algorithm used by GigAssembler.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms*
  • Chromosomes, Artificial, Bacterial / genetics
  • Computational Biology / methods
  • Contig Mapping / methods
  • Expressed Sequence Tags
  • Genome, Human*
  • Human Genome Project*
  • Humans
  • RNA, Messenger / genetics
  • Repetitive Sequences, Nucleic Acid
  • Sequence Alignment / methods
  • Software*

Substances

  • RNA, Messenger