CRISPRCasdb a successor of CRISPRdb containing CRISPR arrays and cas genes from complete genome sequences, and tools to download and query lists of repeats and spacers

Nucleic Acids Res. 2020 Jan 8;48(D1):D535-D544. doi: 10.1093/nar/gkz915.

Abstract

In Archaea and Bacteria, the arrays called CRISPRs, for 'clustered regularly interspaced short palindromic repeats', and the CRISPR-associated genes, or cas, provide adaptive immunity against viruses, plasmids and transposable elements. Short sequences called spacers, corresponding to fragments of invading DNA, are stored between repeated sequences. The CRISPR-Cas systems target sequences homologous to spacers, leading to their degradation. To facilitate investigations of CRISPRs, we developed a website holding the CRISPRdb 12 years ago. We now propose CRISPRCasdb, a completely new version giving access to both CRISPRs and cas genes. We used CRISPRCasFinder, a program that identifies CRISPR arrays and cas genes and determines the system's type and subtype, to process public whole genome assemblies. Strains are displayed either in an alphabetic list or in taxonomic order. The database is part of the CRISPR-Cas++ website, which also offers the possibility to analyse submitted sequences and to download programs. A BLAST search against lists of repeats and spacers extracted from the database is also available. To date, 16 990 complete prokaryote genomes (16 650 bacteria from 2973 species and 340 archaea from 300 species) are included. CRISPR-Cas systems were found in 36% of Bacteria and 75% of Archaea strains. CRISPRCasdb is freely accessible at https://crisprcas.i2bc.paris-saclay.fr/.
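
The repeat/spacer layout described in the abstract can be illustrated with a minimal Python sketch. It assumes a toy array in which every repeat is an exact copy of a known sequence; the function name `extract_spacers` and the sequences are hypothetical and made up for illustration. This is not how CRISPRCasFinder detects arrays, and real repeats may carry mismatches that exact string matching would miss.

```python
def extract_spacers(array_seq: str, repeat: str) -> list[str]:
    """Return the sequences lying between consecutive exact copies of `repeat`.

    Assumption: repeats are identical and non-degenerate, which is a
    simplification of real CRISPR arrays.
    """
    parts = array_seq.split(repeat)
    # parts[0] and parts[-1] are the flanking regions on either side of the
    # array; every piece in between sits between two repeats, i.e. a spacer.
    return [piece for piece in parts[1:-1] if piece]


# Made-up toy sequences, not taken from CRISPRCasdb.
repeat = "GTTCCA"
array_seq = "AAAA" + repeat + "ACGTAC" + repeat + "TTGACA" + repeat + "CCCC"

spacers = extract_spacers(array_seq, repeat)
print(spacers)
```

On this toy input the two pieces found between the three repeat copies are returned as spacers, while the flanking `AAAA` and `CCCC` are discarded.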

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Archaea / classification
  • Archaea / enzymology
  • Archaea / genetics
  • Bacteria / classification
  • Bacteria / enzymology
  • Bacteria / genetics
  • CRISPR-Associated Proteins / chemistry
  • CRISPR-Associated Proteins / genetics*
  • CRISPR-Associated Proteins / metabolism
  • CRISPR-Cas Systems
  • Clustered Regularly Interspaced Short Palindromic Repeats / genetics*
  • Databases, Genetic*
  • Genome, Archaeal*
  • Genome, Bacterial*
  • Phylogeny
  • Software*

Substances

  • CRISPR-Associated Proteins