How to Choose Your Metagenomics Classification Tool

Introduction

Authors: Jennifer Lu (JL), Florian P. Breitwieser (FB), Derrick E. Wood (DW), Li Song (LS), Daehwan Kim (DK), Ben Langmead (BL), Christopher Pockrandt (CP), Steven L. Salzberg (SLS)

Between 2014 and 2018, the Center for Computational Biology released four metagenomics classification software packages: Kraken, KrakenUniq, Kraken 2, and Centrifuge. This page describes:
  1. The history of each software package
  2. The differences between each software package
  3. The best software package for users
  4. Additional software provided for post-processing/analyzing classification results.

Table of Contents

#1) Introduction
#2) Software Packages
#3) Links to Software Websites & Papers
#4) General Comparison Table
#5) How to Choose
#6) About the Authors

Page Updated: 2022/09/29 by Jennifer Lu
( jlu26 jhmi edu )

Software Packages: A Brief Description

Links to Software Websites & Papers

For the most comprehensive understanding of each software package, please refer to the individual websites and papers:
On September 28th, 2022, the Nature Protocols paper "Metagenome analysis using the Kraken software suite" was published. It describes how the Kraken suite (Kraken 2, KrakenUniq, Bracken, and KrakenTools) can be used for 1) microbiome analysis and 2) pathogen identification.

General Comparison

| | Kraken | KrakenUniq | Kraken 2 | Centrifuge |
|---|---|---|---|---|
| First Release Date (yyyy/mm/dd) | 2014/01/04 | 2018/05/30 | 2018/06/26 | 2016/10/04 |
| Latest Release Date (yyyy/mm/dd) | 2017/12/05 | 2022/09/09 | 2021/09/10 | 2021/08/16 |
| Paper Date (yyyy/mm/dd) | 2014/03/03 | 2018/11/18 | 2019/11/28 | 2016/10/17 |
| Original Authors | DW/SLS | FB/SLS | DW/JL/BL | DK/LS/FB |
| Currently Supported? | No | Yes, FB | Yes, DW/JL | Yes, LS |
| Memory (A) | 240.8 GB | 240.8 GB | 34.7 GB | 25.2 GB |
| Database Build Time (A) | 16 hours | 16 hours | 4 hours | 17 hours |
| Processing Time per 10 Million Reads (A) | 60 sec | 55 sec | 13 sec | 70 sec |
| Abundance Estimation | Bracken | Bracken | Bracken | Built-in |
| Supported Databases | Refseq, GRCh38 | Refseq, GRCh38, microbial nt | Refseq, GRCh38, nt, 16S Greengenes, 16S Silva, 16S RDP, nr, protein (translated search) | Refseq, GRCh38, nt |
A: Memory and times were measured for databases containing GRCh38 and Refseq bacterial/archaeal/viral sequences downloaded in Sept 2018. Database build speed was measured using 32 threads on a 48-core machine with 512 GB of memory. Processing speed was measured using 16 threads during classification on the same machine. Memory and speed were measured using each program's defaults (including the default k-mer size).
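
As a rough illustration of how the benchmark figures above might feed into a first-pass choice, the following Python sketch filters the four tools by available memory and scales the per-10-million-read time to a given sample size. The numbers are copied from the comparison table; the names `Benchmark`, `tools_within_memory`, and `estimated_seconds` are hypothetical helpers written for this example and are not part of any of the packages. The linear scaling of processing time is an assumption, and the memory figures reflect default settings on the benchmark databases only (for example, KrakenUniq's memory-chunking mode described under About the Authors is not modeled here).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Benchmark:
    memory_gb: float          # memory at default settings
    build_hours: float        # database build time, 32 threads
    sec_per_10m_reads: float  # classification time, 16 threads
    supported: bool           # "Currently Supported?" row


# Values copied from the comparison table (Sept 2018 databases, defaults).
BENCHMARKS = {
    "Kraken": Benchmark(240.8, 16, 60, False),
    "KrakenUniq": Benchmark(240.8, 16, 55, True),
    "Kraken 2": Benchmark(34.7, 4, 13, True),
    "Centrifuge": Benchmark(25.2, 17, 70, True),
}


def tools_within_memory(ram_gb, supported_only=True):
    """Return tools whose benchmarked memory fits in ram_gb, fastest first."""
    fits = [
        name
        for name, b in BENCHMARKS.items()
        if b.memory_gb <= ram_gb and (b.supported or not supported_only)
    ]
    return sorted(fits, key=lambda name: BENCHMARKS[name].sec_per_10m_reads)


def estimated_seconds(tool, n_reads):
    """Scale the per-10-million-read time linearly (an assumption)."""
    return BENCHMARKS[tool].sec_per_10m_reads * n_reads / 10_000_000


if __name__ == "__main__":
    print(tools_within_memory(64))
    print(estimated_seconds("Kraken 2", 50_000_000))
```

A filter like this only narrows the candidates by hardware; the project goal, the databases required, and the need for abundance estimation still determine the final choice.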

How to Choose

Kraken 1 is no longer supported:

While many continue to use this software, we encourage all Kraken users to upgrade to either KrakenUniq or Kraken 2.


KrakenUniq and Kraken 2 are each useful depending on the project goal:
Kraken 2 and Centrifuge are distinctly different, each with its own advantages:

About the Authors

Jennifer Lu (JL) is a Staff Scientist at Johns Hopkins University in the Center for Computational Biology in Steven Salzberg's and Trish's labs. She maintains the Bracken and KrakenTools software packages and works alongside Derrick Wood and Ben Langmead to maintain Kraken 2. (Jennifer Lu's webpage)

Florian P Breitwieser (FB) is a former post-doctoral researcher at Johns Hopkins University in Steven Salzberg's Lab. He is one of the original authors of Centrifuge and is the author of KrakenUniq and Pavian. (Florian Breitwieser's former Hopkins webpage)

Derrick E Wood (DW) received his PhD in 2014 for his work on Kraken with Steven Salzberg at the University of Maryland. For his post-doctoral work, Derrick worked with Ben Langmead in Johns Hopkins Computer Science to develop Kraken 2. (Derrick Wood's former Hopkins webpage)

Li Song (LS) received his PhD in 2018 working with Liliana Florea at Johns Hopkins University in the Computer Science Department. He is now a post-doctoral researcher at the Dana-Farber Cancer Institute in Shirley Liu’s lab. He is one of the original authors of Centrifuge and continues to maintain and update the software.

Daehwan Kim (DK) received his PhD at the University of Maryland in Steven Salzberg's lab, and then conducted post-doctoral research with Salzberg at Johns Hopkins University, during which he developed the HISAT and HISAT2 spliced alignment programs. He wrote Centrifuge alongside Florian Breitwieser and Li Song. He is now an Assistant Professor at the University of Texas Southwestern Medical Center. (Kim Lab webpage)

Christopher Pockrandt (CP) was a postdoctoral researcher in Steven Salzberg's lab from 2019 through June of 2022. He developed and implemented the memory-chunking algorithm that allows KrakenUniq to run on low-memory computers.

Natalia Rincon (NR) is a current Ph.D. student in Biomedical Engineering in Steven Salzberg's lab. She is the author of the diversity scripts for the KrakenTools suite and is one of the co-first authors for the Kraken metagenome protocol paper.

Martin Steinegger (MS) is an Assistant Professor in the Biology Department at Seoul National University. He is a former postdoctoral researcher in Steven Salzberg's lab. He incorporated the KrakenUniq k-mer-counting features into Kraken 2. He also led the effort for the Kraken Nature Protocols paper. (Steinegger Lab webpage)

Ben Langmead (BL) is an Associate Professor at Johns Hopkins University in the Department of Computer Science. He is the primary advisor to the Kraken 2 project. (Langmead Lab webpage)

Steven L Salzberg (SLS) is the Bloomberg Distinguished Professor of Biomedical Engineering, Computer Science, and Biostatistics at Johns Hopkins University. He is/was the primary advisor for the students and postdocs who developed Kraken 1, Centrifuge, KrakenUniq, Bracken, and Pavian. (Salzberg Lab webpage)