A genomic perspective on protein families

Science. 1997 Oct 24;278(5338):631-7. doi: 10.1126/science.278.5338.631.

Abstract

In order to extract the maximum amount of information from the rapidly accumulating genome sequences, all conserved genes need to be classified according to their homologous relationships. Comparison of proteins encoded in seven complete genomes from five major phylogenetic lineages and elucidation of consistent patterns of sequence similarities allowed the delineation of 720 clusters of orthologous groups (COGs). Each COG consists of individual orthologous proteins or orthologous sets of paralogs from at least three lineages. Orthologs typically have the same function, allowing transfer of functional information from one member to an entire COG. This relation automatically yields a number of functional predictions for poorly characterized genomes. The COGs comprise a framework for functional and evolutionary genome analysis.
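
The membership rule and the annotation transfer described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: it assumes candidate clusters have already been built by some similarity comparison, and it only checks the "at least three lineages" criterion and proposes a function for uncharacterized members. All genome, lineage, and protein names, and the function names `is_cog` and `transfer_annotation`, are hypothetical placeholders.

```python
MIN_LINEAGES = 3  # from the abstract: members "from at least three lineages"


def lineages_covered(cluster, lineage_of):
    """Return the set of lineages represented in a candidate cluster.

    cluster    -- iterable of (genome, protein_id) pairs (hypothetical format)
    lineage_of -- dict mapping each genome to its phylogenetic lineage
    """
    return {lineage_of[genome] for genome, _protein in cluster}


def is_cog(cluster, lineage_of, min_lineages=MIN_LINEAGES):
    """True if the cluster spans at least `min_lineages` distinct lineages."""
    return len(lineages_covered(cluster, lineage_of)) >= min_lineages


def transfer_annotation(cluster, known_function):
    """Propose a function for uncharacterized members of a cluster.

    Assumption of this sketch: a proposal is made only when the characterized
    members agree on exactly one function; otherwise nothing is transferred.
    """
    functions = {known_function[p] for _g, p in cluster if p in known_function}
    if len(functions) != 1:
        return {}
    (function,) = functions
    return {p: function for _g, p in cluster if p not in known_function}


if __name__ == "__main__":
    # Placeholder data, not taken from the paper.
    lineage_of = {
        "genome_A": "lineage_1",
        "genome_B": "lineage_1",
        "genome_C": "lineage_2",
        "genome_D": "lineage_3",
    }
    cluster = [
        ("genome_A", "protA1"),
        ("genome_B", "protB1"),
        ("genome_C", "protC1"),
        ("genome_D", "protD1"),
    ]
    known_function = {"protA1": "hypothetical kinase"}

    if is_cog(cluster, lineage_of):
        print(transfer_annotation(cluster, known_function))
```

Requiring agreement among characterized members before transferring a function is a conservative choice made for this sketch. The abstract says only that orthologs typically have the same function, so any transferred annotation is a prediction, not a confirmed result.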

Publication types

  • Review

MeSH terms

  • Amino Acid Sequence
  • Archaeal Proteins / chemistry
  • Archaeal Proteins / classification
  • Archaeal Proteins / genetics
  • Archaeal Proteins / physiology
  • Bacteria / chemistry
  • Bacteria / genetics
  • Bacterial Proteins / chemistry
  • Bacterial Proteins / classification
  • Bacterial Proteins / genetics
  • Bacterial Proteins / physiology
  • Conserved Sequence
  • Evolution, Molecular
  • Fungal Proteins / chemistry
  • Fungal Proteins / classification
  • Fungal Proteins / genetics
  • Fungal Proteins / physiology
  • Genes, Archaeal*
  • Genes, Bacterial*
  • Genes, Fungal*
  • Methanococcus / chemistry
  • Methanococcus / genetics
  • Multigene Family*
  • Phylogeny*
  • Proteins / chemistry
  • Proteins / classification
  • Proteins / genetics*
  • Proteins / physiology
  • Saccharomyces cerevisiae / chemistry
  • Saccharomyces cerevisiae / genetics
  • Species Specificity

Substances

  • Archaeal Proteins
  • Bacterial Proteins
  • Fungal Proteins
  • Proteins