This page describes the Kraken 2 protocol detailed in the Nature Protocols paper "Metagenome analysis using the Kraken software suite". Published on September 28, 2022, the protocol explains how Kraken 2, Bracken, KrakenUniq, and KrakenTools are used for both microbiome analysis and pathogen detection.
We provide Jupyter notebooks for users to run the full Kraken 2 protocol.
Software
The following software programs must be downloaded to execute the Kraken2 protocol:
- Kraken2 software ( https://github.com/DerrickWood/kraken2/, version 2.1.1 or later)
- Bracken software ( https://github.com/jenniferlu717/Bracken, version 2.6.2 or later)
- KrakenTools software ( https://github.com/jenniferlu717/KrakenTools, version 1.1 or later)
- Pavian software ( https://github.com/fbreitwieser/pavian, version 1.0 or later)
- Bowtie2 software ( https://github.com/BenLangmead/bowtie2, version 2.4.4 or later)
- R (https://www.r-project.org)
- RStudio (https://www.rstudio.com/)
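Before starting, it can help to confirm that the command-line tools are on the `PATH`. The sketch below is a minimal check and not part of the published protocol. The function name `missing_tools` is hypothetical, and the executable names suggested in the usage note are assumptions about how a typical installation exposes each program.

```shell
#!/bin/sh
# Print the name of every tool (given as arguments) that is not found on PATH.
# Returns 0 if all tools were found, 1 otherwise.
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            rc=1
        fi
    done
    return "$rc"
}
```

For example, `missing_tools kraken2 bracken bowtie2 Rscript` prints nothing when all four executables are found. KrakenTools scripts and Pavian (an R package) are not covered by this check and need to be verified separately.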
Downloads
Required Data:
- Kraken2 Database (8 GB). After downloading, extract the archive with:
tar -xzvf k2_standard_eupath_20201202.tar.gz
- 3 Microbiome Analysis Samples (See SRA downloads)
- 10 Pathogen identification Samples (See SRA downloads)
- Bowtie2 Indices for the following genomes: k2protocol_bowtie2indices.tgz
  - Anncaliia algerae genome (EupathDB46_Clean)
  - Aspergillus flavus genome (EupathDB46_Clean)
  - Candida albicans genome (EupathDB46_Clean)
  - Mycobacterium chelonae genome (NCBI RefSeq NZ_CP007220)
  - Streptococcus agalactiae genome (NCBI RefSeq GCF_900638415)
  - Staphylococcus aureus genome (NCBI RefSeq GCF_000013425.1)
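The `tar` command shown for the database unpacks the archive into the current directory. As a hedged sketch, the helpers below unpack an archive into a directory of its own and then check for the three files a built Kraken 2 database is expected to contain (`hash.k2d`, `opts.k2d`, `taxo.k2d`). The function names `extract_into` and `check_k2_db` are hypothetical, as is the directory name in the usage note. The internal layout of `k2protocol_bowtie2indices.tgz` is not described on this page, so no check is attempted for it.

```shell
#!/bin/sh
# Unpack a gzip-compressed tar archive ($1) into a target directory ($2),
# creating the directory first if needed.
extract_into() {
    archive=$1
    dest=$2
    mkdir -p "$dest" && tar -xzf "$archive" -C "$dest"
}

# Report whether directory $1 holds the three core Kraken 2 database files.
# Prints a line for each missing file; returns 0 only if all are present.
check_k2_db() {
    db_dir=$1
    rc=0
    for f in hash.k2d opts.k2d taxo.k2d; do
        if [ ! -f "$db_dir/$f" ]; then
            echo "missing: $f"
            rc=1
        fi
    done
    return "$rc"
}
```

A possible invocation is `extract_into k2_standard_eupath_20201202.tar.gz k2_db && check_k2_db k2_db`. If the archive turns out to unpack into a subdirectory, point `check_k2_db` at that subdirectory instead.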
We provide a bash script, download_samples.sh, for downloading these samples using NCBI's SRA Toolkit. Ensure that the SRA Toolkit is installed, then download the script and run it with the following command:
sh download_samples.sh
Microbiome Analysis

Sample | SRA ID | File size | Description | Expected Diversity |
---|---|---|---|---|
1 | SRR14143424 | 5.9 Gb | Day -2 | Normal |
2 | SRR14092160 | 3.4 Gb | Day -9 | Normal |
3 | SRR14092310 | 2.3 Gb | Day 12 | Low |

Pathogen Identification

Sample | SRA ID | File size | Case ID | Expected Pathogen |
---|---|---|---|---|
1 | SRR12486971 | 178.2 Mb | Case 10 | Anncaliia algerae |
2 | SRR12486972 | 318.5 Mb | Case 9 | Aspergillus fumigatus |
3 | SRR12486974 | 141.0 Mb | Case 7 | Candida albicans |
4 | SRR12486978 | 110.5 Mb | Case 3 | Mycobacteroides chelonae |
5 | SRR12486979 | 112.3 Mb | Case 20 | Control |
6 | SRR12486981 | 311.1 Mb | Case 18 | Control |
7 | SRR12486983 | 236.5 Mb | Case 16 | HSV 1 |
8 | SRR12486988 | 351.0 Mb | Case 11 | Acanthamoeba castellanii |
9 | SRR12486989 | 228.2 Mb | Case 2 | Streptococcus agalactiae |
10 | SRR12486990 | 465.7 Mb | Case 1 | Staphylococcus aureus |
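
The contents of `download_samples.sh` are not reproduced on this page. As an illustration only, the sketch below shows one way the accessions in the tables above could be fetched with the SRA Toolkit's `prefetch` and `fasterq-dump` commands. The function name `fetch_sra`, the `DRY_RUN` variable, and the output directory names are hypothetical, and the provided script may work differently.

```shell
#!/bin/sh
# Fetch each SRA accession and convert it to FASTQ files in an output directory.
# Usage: fetch_sra OUTDIR ACCESSION...
# With DRY_RUN=1 the commands are printed instead of executed.
fetch_sra() {
    outdir=$1
    shift
    for acc in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "prefetch $acc"
            echo "fasterq-dump --split-files -O $outdir $acc"
        else
            mkdir -p "$outdir" || return 1
            prefetch "$acc" || return 1
            fasterq-dump --split-files -O "$outdir" "$acc" || return 1
        fi
    done
}

# Accessions copied from the tables above.
MICROBIOME="SRR14143424 SRR14092160 SRR14092310"
PATHOGEN="SRR12486971 SRR12486972 SRR12486974 SRR12486978 SRR12486979 SRR12486981 SRR12486983 SRR12486988 SRR12486989 SRR12486990"
```

Setting `DRY_RUN=1` and calling `fetch_sra microbiome_reads $MICROBIOME` prints the commands without downloading anything, which is a cheap way to review them first. The variables are left unquoted in the call on purpose so that the shell splits them into one argument per accession.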
Authors/Contributors
Jennifer Lu, Ph.D. (jlu26 jhmi edu)
Natalia Rincon
Derrick Wood, Ph.D.
Florian Breitwieser, Ph.D.
Ben Langmead
Steven Salzberg, Ph.D.
Martin Steinegger, Ph.D.
Page Updated: 2021/09/15