ProForma: A Standard Proteoform Notation

J Proteome Res. 2018 Mar 2;17(3):1321-1325. doi: 10.1021/acs.jproteome.7b00851. Epub 2018 Feb 14.

Abstract

The Consortium for Top-Down Proteomics (CTDP) proposes a standardized notation, ProForma, for writing the sequence of fully characterized proteoforms. ProForma provides a means to communicate any proteoform by writing the amino acid sequence using standard one-letter notation and specifying modifications or unidentified mass shifts within brackets following certain amino acids. The notation is unambiguous, human-readable, and can easily be parsed and written by bioinformatic tools. This system uses seven rules and supports a wide range of possible use cases, ensuring compatibility and reproducibility of proteoform annotations. Standardizing proteoform sequences will simplify storage, comparison, and reanalysis of proteomic studies, and the Consortium welcomes input and contributions from the research community on the continued design and maintenance of this standard.
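
The bracket convention described in the abstract can be illustrated with a short Python sketch. It is not the reference implementation and does not encode the seven rules, which the abstract does not enumerate. It assumes only that each residue is one uppercase letter and that an annotation, if present, sits in square brackets directly after the residue it modifies. The function names, the example strings, and the modification label "Phospho" are hypothetical choices for illustration; terminal modifications and nested brackets are deliberately not handled.

```python
import re
from typing import List, Optional, Tuple

# Assumption: one uppercase residue letter, optionally followed by a
# non-empty annotation in square brackets (a modification name or a mass shift).
_TOKEN = re.compile(r"([A-Z])(?:\[([^\[\]]+)\])?")

Token = Tuple[str, Optional[str]]


def parse_bracketed_sequence(text: str) -> List[Token]:
    """Split a bracket-annotated sequence into (residue, annotation) pairs."""
    tokens: List[Token] = []
    pos = 0
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            # Anything that is not a residue (with an optional bracket) is rejected.
            raise ValueError(f"unexpected character at position {pos}: {text[pos]!r}")
        tokens.append((match.group(1), match.group(2)))
        pos = match.end()
    return tokens


def write_bracketed_sequence(tokens: List[Token]) -> str:
    """Inverse of parse_bracketed_sequence: rebuild the annotated string."""
    return "".join(
        residue if annotation is None else f"{residue}[{annotation}]"
        for residue, annotation in tokens
    )


def bare_sequence(tokens: List[Token]) -> str:
    """Return the plain amino acid sequence with all annotations removed."""
    return "".join(residue for residue, _ in tokens)
```

Under these assumptions, parsing the hypothetical string "PEPT[Phospho]IDE" yields seven residue tokens, with the annotation "Phospho" attached to the T, and writing the tokens back reproduces the original string. A round trip of this kind is one simple way to check that a notation can be both parsed and written by software.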

Keywords: human readable; machine readable; proteoform; standard.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Computational Biology / methods*
  • Computational Biology / statistics & numerical data
  • Databases, Protein / statistics & numerical data
  • Humans
  • Information Dissemination
  • International Cooperation
  • Molecular Sequence Annotation
  • Protein Processing, Post-Translational*
  • Proteome / analysis*
  • Proteome / genetics
  • Proteome / metabolism
  • Proteomics / methods*
  • Proteomics / statistics & numerical data
  • Reproducibility of Results
  • Software*
  • Tandem Mass Spectrometry / methods
  • Tandem Mass Spectrometry / standards*

Substances

  • Proteome