Mobilization and integration of bacterial phenotypic data - Enabling next generation biodiversity analysis through the BacDive metadatabase

J Biotechnol. 2017 Nov 10:261:187-193. doi: 10.1016/j.jbiotec.2017.05.004. Epub 2017 May 6.

Abstract

Microbial data and metadata are scattered throughout the scientific literature, databases, and unpublished lab notes and are therefore often difficult to access. Hot spots of (meta)data are internal descriptions of culture collections and initial descriptions of novel taxa in the primary literature. Here we describe three exemplary mobilization projects which yielded metadata published through the prokaryotic metadatabase BacDive. The Reichenbach collection of myxobacteria includes information on 12,535 typewritten index cards, which were digitized. A total of 37,156 data points were extracted by text mining. In the second mobilization project, Analytical Profile Index (API) tests on paper forms were targeted. Overall, 6,820 API tests were digitized, which provide physiological data for 4,524 microbial strains. Thirdly, the extraction of metadata from 523 new species descriptions in the International Journal of Systematic and Evolutionary Microbiology, yielding 35,651 data points, is described. All data sets were integrated and published in BacDive. As a result, these metadata not only became accessible and searchable but were also linked to strain taxonomy, isolation source, cultivation condition, and molecular biology data.

Keywords: Bacterial biodiversity; Bacterial phenotype; Data mobilization; Database; Metadata.

Publication types

  • Review

MeSH terms

  • Bacteria* / genetics
  • Bacteria* / metabolism
  • Biodiversity*
  • Computational Biology*
  • Database Management Systems
  • Databases, Genetic*
  • Phenotype