HISAT is a fast and sensitive spliced alignment program for mapping RNA-seq reads.
In addition to one global FM index that represents a whole genome, HISAT uses a large set of small FM indexes that collectively cover the whole genome
(each index represents a genomic region of ~64,000 bp and ~48,000 indexes are needed to cover the human genome).
These small indexes (called local indexes) combined with several alignment strategies enable effective alignment of RNA-seq reads, in particular, reads spanning multiple exons.
The memory footprint of HISAT is relatively low (~4.3 GB for the human genome).
We developed HISAT on top of the Bowtie2 implementation, which handles most of the operations on the FM index.
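As a rough check of the figures above, the number of local indexes can be estimated by dividing the genome length by the region size. The sketch below is only illustrative: it assumes a human genome length of about 3.1 billion bp and non-overlapping regions, neither of which is stated above, and the function name is hypothetical.

```python
import math

def estimate_local_index_count(genome_length_bp: int,
                               region_size_bp: int = 64_000) -> int:
    """Estimate how many fixed-size regions are needed to cover a genome.

    Assumes non-overlapping regions, which is a simplification made
    for this illustration only.
    """
    if genome_length_bp <= 0 or region_size_bp <= 0:
        raise ValueError("lengths must be positive")
    return math.ceil(genome_length_bp / region_size_bp)

# Assumed approximate human genome length; not a figure from the text above.
HUMAN_GENOME_BP = 3_100_000_000

print(estimate_local_index_count(HUMAN_GENOME_BP))
```

Under these assumptions the estimate lands close to the ~48,000 local indexes quoted for the human genome.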
News and updates
New releases and related tools will be announced through the Bowtie mailing list.
Please email Daehwan Kim for questions.
H. sapiens, UCSC hg19: 7.0 GB
M. musculus, UCSC mm10: 6.4 GB
The current version of HISAT does not use some of the files in the index, so the actual memory requirement is much lower than the index size. For example, the memory footprint of HISAT for the human genome is about 4.3 GB.
Kim D, Langmead B and Salzberg SL. HISAT: a fast spliced aligner with low memory requirements. Nature Methods 2015
HISAT2 released 9/7/2015
- HISAT2 is a successor to both HISAT and TopHat2. We recommend that HISAT and TopHat2 users switch to HISAT2.
HISAT 0.1.6-beta release 4/17/2015
- Added NH tags to SAM output to specify the number of alignments.
- Fixed a bug where the -s/--summary option in hisat-inspect did not work.
- Rewrote extract_splice_sites.py to better handle various GTF files, with thanks to George Young.
- Improved alignment accuracy by preventing reads from being mapped to pseudogenes when known or novel splice sites are provided (--known-splicesite-infile/--novel-splicesite-infile).
- Reads are no longer allowed to have short anchors with mismatches.
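The NH tag mentioned above can be used downstream, for example to keep only reads that align to a single location. The following is a minimal sketch that assumes tab-separated SAM records with optional fields in the standard TAG:TYPE:VALUE form; the function names and the sample records are made up for illustration and are not HISAT output.

```python
from typing import Optional

def get_nh(sam_line: str) -> Optional[int]:
    """Return the NH:i value of a SAM alignment line, or None if absent."""
    fields = sam_line.rstrip("\n").split("\t")
    # The first 11 columns are mandatory; optional fields follow them.
    for field in fields[11:]:
        parts = field.split(":", 2)
        if len(parts) == 3 and parts[0] == "NH" and parts[1] == "i":
            return int(parts[2])
    return None

def is_uniquely_aligned(sam_line: str) -> bool:
    """True when the record reports exactly one alignment for the read."""
    return get_nh(sam_line) == 1

# Hypothetical records, for illustration only.
mandatory = ["read1", "0", "chr1", "100", "255", "50M", "*", "0", "0", "*", "*"]
unique_record = "\t".join(mandatory + ["NH:i:1"])
multi_record = "\t".join(mandatory + ["NH:i:3"])

print(is_uniquely_aligned(unique_record), is_uniquely_aligned(multi_record))
```

Header lines (those starting with `@`) have no optional fields in this sense and should be skipped before calling these helpers.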
HISAT source is available in a public GitHub repository (3/30/2015).
The HISAT paper is out in Nature Methods 3/9/2015
HISAT 0.1.5-beta release 2/25/2015
- HISAT can now work directly with SRA data, either downloaded on demand over the internet or prefetched to local disk.
- Uses the new NGS API (https://github.com/ncbi/ngs) created and published by the SRA Toolkit team at NCBI.
For example, you can run HISAT with SRA data (SRR353653) as follows.
hisat -x /path/to/index --sra-acc SRR353653
- This eliminates the need to download SRA reads manually and convert them into FASTA/FASTQ format, without affecting the run time.
- It is compatible with the environment created by an SRA Toolkit installation and uses any existing access rights to dbGaP-protected data as well as cached file downloads.
- The --pen-intronlen option is now handled correctly.
- Two new options, --min-intronlen and --max-intronlen, are introduced, enabling users to specify minimum and maximum intron lengths.
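For scripted runs, the SRA invocation and the intron-length options above can be assembled programmatically. This is a minimal sketch: the helper name is hypothetical, the intron-length values in the usage line are arbitrary illustrations rather than recommended settings, and only options named above are used. It builds and prints the command without running it.

```python
import shlex
from typing import List, Optional

def build_hisat_sra_command(index_path: str,
                            sra_accession: str,
                            min_intronlen: Optional[int] = None,
                            max_intronlen: Optional[int] = None) -> List[str]:
    """Build an argument list for running HISAT on an SRA accession."""
    if (min_intronlen is not None and max_intronlen is not None
            and min_intronlen > max_intronlen):
        raise ValueError("min_intronlen must not exceed max_intronlen")
    cmd = ["hisat", "-x", index_path, "--sra-acc", sra_accession]
    if min_intronlen is not None:
        cmd += ["--min-intronlen", str(min_intronlen)]
    if max_intronlen is not None:
        cmd += ["--max-intronlen", str(max_intronlen)]
    return cmd

# Illustrative values only; 20 and 500000 are assumptions, not defaults.
command = build_hisat_sra_command("/path/to/index", "SRR353653",
                                  min_intronlen=20, max_intronlen=500000)
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a machine where `hisat` is installed and the index path is real.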
HISAT 0.1.4-beta release 1/30/2015
- The alignment score for the second-best alignment (XS:i) is no longer reported because it conflicts with the XS:A tag. The XS:A tag is required by transcript assemblers such as Cufflinks and StringTie.
- Improved alignment accuracy involving multiple introns.
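The XS conflict noted above arises because the SAM format allows each tag name to appear only once per alignment line, so XS:i and XS:A cannot both be present. The sketch below, with a hypothetical helper name and made-up records, shows one way to detect such duplicated tag names.

```python
from collections import Counter
from typing import List

def duplicated_tags(sam_line: str) -> List[str]:
    """Return the tag names that occur more than once in a SAM record."""
    fields = sam_line.rstrip("\n").split("\t")
    # Optional fields start at column 12; the tag name precedes the first colon.
    names = [field.split(":", 1)[0] for field in fields[11:]]
    counts = Counter(names)
    return sorted(name for name, n in counts.items() if n > 1)

# Hypothetical records, for illustration only.
mandatory = ["read1", "0", "chr1", "100", "255", "20M500N30M", "*", "0", "0", "*", "*"]
conflicting = "\t".join(mandatory + ["XS:i:-5", "XS:A:+", "NH:i:1"])
clean = "\t".join(mandatory + ["XS:A:+", "NH:i:1"])

print(duplicated_tags(conflicting), duplicated_tags(clean))
```

A check like this can be useful when validating SAM files produced by older tool versions before passing them to a transcript assembler.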
HISAT 0.1.3-beta release 1/27/2015
- Fixed an occasional runtime error.
- Fixed the Python script extract_splice_sites.py so that it handles gene annotation files (GTF files) correctly.
A preprint version of the HISAT manuscript is available 12/12/2014
HISAT 0.1.2-beta release 11/03/2014
- This version includes improvements in alignment sensitivity and splice site discovery.
HISAT 0.1.1-beta release 7/29/2014
- This version supports genomes of any size, including those larger than 4 billion base pairs (see the manual).
HISAT 0.1.0-beta release 7/10/2014
- First release of HISAT.