PlasmoSEP: Predicting surface-exposed proteins on the malaria parasite using semisupervised self-training and expert-annotated data

Proteomics. 2016 Dec;16(23):2967-2976. doi: 10.1002/pmic.201600249. Epub 2016 Nov 21.

Abstract

Accurate and comprehensive identification of surface-exposed proteins (SEPs) in parasites is a key step in developing novel subunit vaccines. However, the reliability of MS-based high-throughput methods for proteome-wide mapping of SEPs continues to be limited due to high rates of false positives (i.e., proteins mistakenly identified as surface exposed) as well as false negatives (i.e., SEPs not detected due to low expression or other technical limitations). We propose a framework called PlasmoSEP for the reliable identification of SEPs using a novel semisupervised learning algorithm that combines SEPs identified by high-throughput experiments and expert annotation of high-throughput data to augment labeled data for training a predictive model. Our experiments using high-throughput data from the Plasmodium falciparum surface-exposed proteome provide several novel high-confidence predictions of SEPs in P. falciparum and also confirm expert annotations for several others. Furthermore, PlasmoSEP predicts that 25 of 37 experimentally identified SEPs in Plasmodium yoelii salivary gland sporozoites are likely to be SEPs. Finally, PlasmoSEP predicts several novel SEPs in P. yoelii and Plasmodium vivax malaria parasites that can be validated for further vaccine studies. Our computational framework can be easily adapted to improve the interpretation of data from high-throughput studies.
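
The abstract does not specify the classifier, features, or confidence thresholds used by PlasmoSEP, so the following is only a minimal sketch of generic semisupervised self-training, not the authors' implementation. It assumes a hypothetical two-feature representation of each protein, a simple nearest-centroid scorer standing in for the predictive model, and an arbitrary confidence threshold of 0.8. A small seed set of labeled proteins (1 = SEP, 0 = not SEP) is grown by repeatedly adding unlabeled proteins whose predictions are confident.

```python
from math import dist


def centroid(points):
    """Mean of a list of equal-length feature tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def fit(labeled):
    """Fit the stand-in model: one centroid per class (SEP, non-SEP)."""
    pos = [x for x, y in labeled if y == 1]
    neg = [x for x, y in labeled if y == 0]
    return centroid(pos), centroid(neg)


def prob_sep(model, x):
    """Score in [0, 1]; higher means x lies closer to the SEP centroid."""
    c_pos, c_neg = model
    d_pos, d_neg = dist(x, c_pos), dist(x, c_neg)
    if d_pos + d_neg == 0:
        return 0.5
    return d_neg / (d_pos + d_neg)


def self_train(labeled, unlabeled, threshold=0.8, max_rounds=10):
    """Generic self-training loop (threshold and round limit are assumptions)."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        model = fit(labeled)
        confident, remaining = [], []
        for x in pool:
            p = prob_sep(model, x)
            if p >= threshold:
                confident.append((x, 1))      # pseudo-label: SEP
            elif p <= 1 - threshold:
                confident.append((x, 0))      # pseudo-label: not SEP
            else:
                remaining.append(x)           # too uncertain, keep unlabeled
        if not confident:
            break                             # nothing new to learn from
        labeled.extend(confident)
        pool = remaining
    return fit(labeled), labeled, pool


# Hypothetical toy data: (feature_1, feature_2) per protein.
seed = [((0.9, 0.8), 1), ((0.8, 0.9), 1), ((0.1, 0.2), 0), ((0.2, 0.1), 0)]
candidates = [(0.85, 0.75), (0.15, 0.25), (0.5, 0.5)]

model, grown, still_unlabeled = self_train(seed, candidates)
```

In this toy run the two candidates near a class centroid receive pseudo-labels and join the training set, while the ambiguous midpoint stays unlabeled. In a setting like the one described above, the seed set would correspond to expert-annotated proteins and the pool to the remaining high-throughput identifications.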

Keywords: Bioinformatics; Malaria; Plasmodium; Predicting surface-exposed proteins; Semi-supervised learning; Surface-exposed proteomics.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms*
  • High-Throughput Screening Assays / methods
  • Humans
  • Membrane Proteins / analysis*
  • Membrane Proteins / metabolism
  • Models, Theoretical
  • Plasmodium falciparum / chemistry*
  • Plasmodium vivax / metabolism
  • Plasmodium vivax / pathogenicity
  • Plasmodium yoelii / chemistry
  • Proteomics / methods*
  • Protozoan Proteins / analysis*
  • Protozoan Proteins / metabolism
  • Salivary Glands / metabolism

Substances

  • Membrane Proteins
  • Protozoan Proteins