DDBJ new system and service refactoring

Osamu Ogasawara; Jun Mashima; Yuichi Kodama; Eli Kaminuma; Yasukazu Nakamura; Kousaku Okubo; Toshihisa Takagi

doi:10.1093/nar/gks1152

Nucleic Acids Res. 2013 Jan;41(Database issue):D25-9. doi: 10.1093/nar/gks1152. Epub 2012 Nov 24.

Authors

Osamu Ogasawara¹, Jun Mashima, Yuichi Kodama, Eli Kaminuma, Yasukazu Nakamura, Kousaku Okubo, Toshihisa Takagi

Affiliation

¹ DDBJ Center, National Institute of Genetics, Yata 1111, Mishima, Shizuoka 411-8540, Japan. oogasawa@nig.ac.jp

Abstract

The DNA Data Bank of Japan (DDBJ, http://www.ddbj.nig.ac.jp) maintains a primary nucleotide sequence database and provides analytical resources for biological information to researchers. The database content is exchanged with the US National Center for Biotechnology Information (NCBI) and the European Bioinformatics Institute (EBI) within the framework of the International Nucleotide Sequence Database Collaboration (INSDC). Resources provided by the DDBJ include traditional nucleotide sequence data, released as 27 316 452 entries comprising 16 876 791 557 base pairs (as of June 2012), and raw reads from new-generation sequencers in the Sequence Read Archive (SRA). A Japanese researcher published his own genome sequence via DDBJ-SRA on 31 July 2012. To cope with the ongoing genomic data deluge, our previous computer system was completely replaced in March 2012 by a commodity cluster-based system that provides 122.5 TFlops of CPU capacity and 5 PB of storage space. During this upgrade, it was considered crucial to replace and refactor substantial portions of the DDBJ software systems as well. As a result of the replacement process, which took more than 2 years, we have achieved significant improvements in system performance.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Base Sequence*
Databases, Nucleic Acid*
Genomics
High-Throughput Nucleotide Sequencing
Internet
Sequence Analysis, DNA
Software