Centrifuge

Getting Help

Please submit an issue on GitHub, or send an email to centrifuge.metagenomics@gmail.com for private communications.

Releases

version 1.0.3-beta (old)	12/06/2016
Source code
Linux x86_64 binary
Mac OS X x86_64 binary

Indexes

last updated:	12/06/2016
Bacteria, Archaea, Viruses, Human (compressed)	5.4 GB
Bacteria, Archaea, Viruses, Human	7.9 GB

last updated:	3/3/2018
NCBI nucleotide non-redundant sequences	64 GB

last updated:	4/15/2018
Bacteria, Archaea (compressed)	6.3 GB

MD5 checksum

Related Tools

Pavian: Tool for interactive analysis of pathogen and metagenomics data
HISAT2: Graph-based alignment to a population of genomes
Bowtie2: Ultrafast read alignment

Publications

Kim D, Song L, Breitwieser FP, and Salzberg SL. Centrifuge: rapid and sensitive classification of metagenomic sequences. Genome Research 2016

Have a look at https://github.com/fbreitwieser/pavian for visual analysis of results generated with Centrifuge!

Due to the rapid spread of SARS-CoV-2 and its devastating effects, we provide additional Centrifuge indexes in the hope that they will be useful for biomedical research related to the virus. The first two indexes include 106 complete SARS-CoV-2 genomes downloaded from GenBank, as follows (3/29/2020):

h+v+c: human genome and viral genomes including 106 SARS-CoV-2 complete genomes (download link)
h+p+v+c: human genome, prokaryotic genomes, and viral genomes including 106 SARS-CoV-2 complete genomes (download link)
Additional indexes, including an nt index, are also available at Genexa. (Note: these indexes include one reference SARS-CoV-2 genome.)

Centrifuge 1.0.4-beta release 6/5/2018

Support running multiple samples while loading the index only once.
Fix a bug that misassigned reads falling near the boundary of a genome.
centrifuge-kreport uses the lowest common ancestor taxonomy ID by default for reads with multiple assignments.
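
To illustrate what a lowest common ancestor (LCA) taxonomy ID means for a read with several assignments, here is a minimal Python sketch. It is not the code used by centrifuge-kreport. The taxonomy IDs and the `PARENT` table are hypothetical toy values, and the convention that the root is its own parent is an assumption of this sketch.

```python
# Toy taxonomy (hypothetical IDs): child taxonomy ID -> parent taxonomy ID.
# Assumption of this sketch: the root (1) is its own parent.
PARENT = {
    1: 1,    # root
    10: 1,   # a family
    20: 10,  # genus A
    21: 20,  # species A1
    22: 20,  # species A2
    30: 10,  # genus B
    31: 30,  # species B1
}


def lineage(taxid, parent):
    """Return the path from taxid up to the root, inclusive."""
    path = [taxid]
    while parent[taxid] != taxid:
        taxid = parent[taxid]
        path.append(taxid)
    return path


def lowest_common_ancestor(taxids, parent):
    """Return the deepest taxonomy ID shared by the lineages of all taxids."""
    if not taxids:
        raise ValueError("need at least one taxonomy ID")
    first, *rest = taxids
    common = lineage(first, parent)  # ordered from the taxon up to the root
    for taxid in rest:
        other = set(lineage(taxid, parent))
        common = [t for t in common if t in other]
    return common[0]  # first surviving entry is the deepest shared ancestor


if __name__ == "__main__":
    # A read assigned to two species of the same genus collapses to the genus.
    print(lowest_common_ancestor([21, 22], PARENT))
    # A read assigned to species in different genera collapses to the family.
    print(lowest_common_ancestor([21, 31], PARENT))
```

In this toy example a read hitting both species A1 and A2 is reported once at genus A rather than being counted under each species.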

Centrifuge 1.0.3 release 2/23/2018

Fix several bugs.
Output unclassified reads.
Make the sequence output options (--un, --al, --un-conc, --al-conc) work.

Centrifuge 1.0.3-beta release 12/06/2016

Fixed Perl hash bangs (thanks to Andreas Sjödin / @druvus).
Updated nt database building to work with new accession code scheme.
Added option --tab-fmt-cols to specify output format columns.
A minor fix for traversing the hitmap up the taxonomy tree.

Centrifuge paper published in Genome Research 11/16/2016

Centrifuge 1.0.2-beta release 5/25/2016

Fixed a runtime error during abundance analysis.
Changed a default report file name from centrifuge_report.csv to centrifuge_report.tsv.

Centrifuge preprint is available at bioRxiv 5/24/2016

Centrifuge 1.0.1-beta release 3/8/2016

Centrifuge is now able to work directly with SRA data, either downloaded on demand over the internet or prefetched to local disks.
- For example, you can run Centrifuge with SRA data (SRR353653) as follows.
  centrifuge -x /path/to/index --sra-acc SRR353653
- This eliminates the need to download SRA reads manually and convert them into FASTA/FASTQ format, without affecting the run time.
We provide a Centrifuge index (nt index) for NCBI nucleotide non-redundant sequences collected from plasmids, organelles, viruses, archaea, bacteria, and eukaryotes, totaling ~109 billion bp. Centrifuge is a very good alternative to Megablast (or BLAST) for searching through this huge database.
Fixed Centrifuge's scripts related to sequence downloading and index building.
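
The SRA invocation shown in the notes above can be scripted when several accessions need to be processed. The Python sketch below only builds the command line, using the two options that appear in the example (`-x` and `--sra-acc`). The helper name `centrifuge_sra_command` is hypothetical, and `/path/to/index` is the same placeholder used in the example.

```python
import shlex


def centrifuge_sra_command(index_prefix, accession):
    """Build the argument list for running Centrifuge on one SRA accession.

    Only the options shown in the release notes are used here.
    """
    return ["centrifuge", "-x", index_prefix, "--sra-acc", accession]


if __name__ == "__main__":
    accessions = ["SRR353653"]  # extend with further accessions as needed
    for acc in accessions:
        cmd = centrifuge_sra_command("/path/to/index", acc)
        # Print a shell-safe rendering of the command. To actually run it,
        # pass the list to subprocess.run(cmd, check=True) on a machine
        # where a centrifuge build with SRA support is installed.
        print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids shell quoting problems if the index path contains spaces.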

Centrifuge 1.0.0-beta release 2/19/2016 - first release

The first release of Centrifuge features a dramatically reduced database size, higher classification accuracy and sensitivity, and comparably rapid classification speed.
Please refer to the manual for details on how to run Centrifuge and interpret Centrifuge’s classification results.
We provide several standard indexes designed to meet the needs of most users (see Indexes in the side panel).
- For compressed indexes, we first combined bacterial genomes belonging to the same species and removed redundant sequences, and built indexes using the combined sequences. As a result, those compressed indexes are much smaller than uncompressed indexes. Centrifuge classifies reads at the species level when using the compressed indexes and at the strain level (or the genome level) when using the uncompressed indexes.
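
The following Python sketch gives a rough picture of why compressed indexes classify at the species level. It is a toy model, not the index-building pipeline: it treats only exactly identical sequences as redundant, which is an assumption made for brevity, and the function name, species and strain labels, and sequences are all hypothetical.

```python
from collections import defaultdict


def compress_by_species(genomes):
    """Merge genomes of the same species and drop repeated sequences.

    genomes: iterable of (species, strain, sequences) tuples.
    Returns {species: [unique sequences]} in first-seen order.
    Toy assumption: "redundant" means an exact duplicate string.
    """
    merged = defaultdict(list)
    seen = defaultdict(set)
    for species, _strain, sequences in genomes:
        for seq in sequences:
            if seq not in seen[species]:
                seen[species].add(seq)
                merged[species].append(seq)
    return dict(merged)


if __name__ == "__main__":
    genomes = [
        ("species_A", "strain_1", ["ACGT", "GGCC"]),
        ("species_A", "strain_2", ["ACGT", "TTAA"]),  # shares ACGT with strain_1
        ("species_B", "strain_1", ["ACGT"]),
    ]
    for species, seqs in compress_by_species(genomes).items():
        print(species, seqs)
```

After merging, the strain labels are gone: a read matching `ACGT` can only be attributed to species_A or species_B, not to a particular strain, which mirrors the species-level versus strain-level distinction described above.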