Overview
StringTie is a fast and highly efficient assembler of RNA-Seq alignments into potential transcripts. It uses a novel network flow algorithm as well as an optional de novo assembly step to assemble and quantitate full-length transcripts representing multiple splice variants for each gene locus. Its input can include not only alignments of short reads that can also be used by other transcript assemblers, but also alignments of longer sequences that have been assembled from those reads. In order to identify differentially expressed genes between experiments, StringTie's output can be processed by specialized software like Ballgown, Cuffdiff or other programs (DESeq2, edgeR, etc.).
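As a rough illustration of the workflow described above, the following sketch builds two typical command lines: one for assembling transcripts from an alignment file, and one for estimating expression in a form that Ballgown can read. All file names are hypothetical placeholders. The options `-G` (reference annotation), `-o` (output GTF), `-p` (threads) and `-B` (Ballgown table output) are not described on this page and are taken here as assumptions from the StringTie command-line help. The commands are printed, and only executed when `stringtie` and the input files are actually present.

```shell
#!/usr/bin/env bash
# Sketch only: file names are hypothetical placeholders.
set -euo pipefail

bam="sample.sorted.bam"     # hypothetical coordinate-sorted RNA-Seq alignments
guide="annotation.gtf"      # hypothetical reference annotation
out="sample.stringtie.gtf"  # assembled transcripts will be written here

# Assembly guided by a reference annotation, using 4 threads.
assemble_cmd=(stringtie "$bam" -G "$guide" -o "$out" -p 4)

# Quantification of the annotated transcripts only (-e), also writing
# the table files read by Ballgown (-B) next to the output GTF.
quant_cmd=(stringtie -e -B -G "$guide" -o "ballgown/sample/sample.gtf" "$bam")

# Show what would be run.
echo "${assemble_cmd[*]}"
echo "${quant_cmd[*]}"

# Execute only if the tool and the inputs exist on this machine.
if command -v stringtie >/dev/null 2>&1 && [[ -f "$bam" && -f "$guide" ]]; then
    "${assemble_cmd[@]}"
    "${quant_cmd[@]}"
fi
```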
News
- 1/6/2025 - v3.0.0 release:
- increase of accuracy in many cases, as StringTie can now better handle poor or inconsistent guide coverage by short or long reads
- the -N/--nasc options can now be used to assemble and better handle incompletely processed (nascent) RNAs that are abundant in the case of rRNA-depletion sequencing samples (e.g. Total RNA Ribo-Zero libraries)
- fixes and improvements for the -e expression quantification mode, for both long and short RNAseq reads
- 5/7/2024 - v2.2.3 release:
- fixes for long-read assembly and build scripts
- 4/20/2024 - v2.2.2 release:
- fixes an out-of-bounds issue when a large number of predictions are generated
- fixes other rare situations causing a program crash
- 1/26/2022 - v2.2.1 release:
- addressed an issue causing the -e option to not output low coverage transcripts
- fixed a --ptf data loading issue
- 12/3/2021 - v2.2.0 release highlights:
- the --mix option allows StringTie to take both short and long read alignments; when this option is used, the 2nd BAM (or CRAM) file in the command line must be a long reads alignment file (the 1st being the short-reads alignment file); the -L option should not be used in this case.
- improved transcriptome assembly on mixed data with annotation
- added support for CRAM input files, as StringTie is now built using HTSlib
- 3/9/2021 - v2.1.5 release changes:
- adjusted the calculation of coverage for long reads assembly
- new trimming procedure implemented
- 7/7/2020 - v2.1.4 release is a maintenance release with the following fixes:
- fixed an issue with --merge sometimes incorrectly processing the order of the reference sequences
- fixed -e issue sometimes entering an infinite loop
- fixed compatibility issue between long transfrag and path assembled so far
- small bug fix in computing long read assemblies
- 5/12/2020 - v2.1.3 release new features and fixes
- added the --viral option for long reads from viral data where splice sites do not follow consensus
- adjustments to the assembly of long read alignment data
- fixed an occasional issue with the --merge option
- made the -e option compatible with long reads (-L option)
See entire release history here.
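The input-order rule for the --mix option (v2.2.0 notes above) is easy to get wrong, so here is a small sketch of how the two long-read modes might be invoked. File names are hypothetical placeholders, and the `-G` and `-o` options are assumptions taken from the StringTie command-line help rather than from this page. The commands are printed, and executed only when `stringtie` and the inputs are available.

```shell
#!/usr/bin/env bash
# Sketch only: file names are hypothetical placeholders.
set -euo pipefail

short_bam="short_reads.bam"  # hypothetical short-read alignments
long_bam="long_reads.bam"    # hypothetical long-read alignments
guide="annotation.gtf"       # hypothetical reference annotation

# Mixed mode: the short-read file comes first, the long-read file second,
# and -L is not given.
mix_cmd=(stringtie --mix -G "$guide" -o mixed.gtf "$short_bam" "$long_bam")

# Long-read-only mode: a single alignment file together with -L.
long_cmd=(stringtie -L -G "$guide" -o long.gtf "$long_bam")

# Show what would be run.
echo "${mix_cmd[*]}"
echo "${long_cmd[*]}"

# Execute only if the tool and all inputs exist on this machine.
if command -v stringtie >/dev/null 2>&1 \
   && [[ -f "$short_bam" && -f "$long_bam" && -f "$guide" ]]; then
    "${mix_cmd[@]}"
    "${long_cmd[@]}"
fi
```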
Obtaining and installing StringTie
The current version of StringTie can be downloaded as a precompiled binary or as a source package:
- stringtie-3.0.0.tar.gz : source package
- stringtie-3.0.0.Linux_x86_64.tar.gz : Linux x86_64 binary package
- stringtie-3.0.0.OSX_x86_64.tar.gz : Apple OS X binary package (for OS X v10.7 and above)
In order to build and install StringTie from the source package the following steps can be taken:
- Unpack the downloaded StringTie source archive in a directory of your choice, e.g.:
    cd ~/src/
    tar xvfz ~/Downloads/stringtie-VER.tar.gz
A directory called stringtie-VER (where VER is the current numeric version of the program) will be created in the current directory.
- Change directory and build the stringtie executable:
    cd stringtie-VER
    make release
- Optionally, the stringtie executable can be copied to one of the shell's PATH directories for easy access.
Alternatively, the source tree can be downloaded from GitHub and built in a similar fashion:
    git clone https://github.com/gpertea/stringtie
    cd stringtie
    make release
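The optional step of copying the executable into a PATH directory could look like the sketch below. The target directory `$HOME/bin` and the helper function `in_path` are hypothetical choices made for illustration; any directory already listed in PATH would work equally well. The copy is attempted only if a `stringtie` executable exists in the current directory.

```shell
#!/usr/bin/env bash
# Sketch only: the install directory is a hypothetical choice.
set -euo pipefail

install_dir="${HOME:-.}/bin"

# Succeed if the directory given as $1 is listed in PATH.
in_path() {
    case ":${PATH}:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Copy the freshly built executable, if there is one in this directory.
if [[ -x ./stringtie ]]; then
    mkdir -p "$install_dir"
    cp ./stringtie "$install_dir/"
fi

# Report whether the shell will find the installed copy.
if in_path "$install_dir"; then
    echo "$install_dir is already in PATH"
else
    echo "add it with: export PATH=\"$install_dir:\$PATH\""
fi
```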
For evaluating and further processing the GTF output of StringTie, the utility gffcompare can be downloaded from the GFF utilities page.
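A minimal sketch of such an evaluation is shown below: it compares a StringTie GTF against a reference annotation. The file names and the output prefix are hypothetical placeholders, and the `-r` (reference annotation) and `-o` (output prefix) options are assumptions taken from the gffcompare documentation, not from this page. The command is printed, and executed only when `gffcompare` and both files are present.

```shell
#!/usr/bin/env bash
# Sketch only: file names and the output prefix are hypothetical.
set -euo pipefail

reference="annotation.gtf"        # hypothetical reference annotation
assembled="sample.stringtie.gtf"  # hypothetical StringTie output
prefix="sample_vs_ref"            # prefix for the files gffcompare writes

# Compare the assembled transcripts with the reference annotation.
compare_cmd=(gffcompare -r "$reference" -o "$prefix" "$assembled")

# Show what would be run.
echo "${compare_cmd[*]}"

# Execute only if the tool and both inputs exist on this machine.
if command -v gffcompare >/dev/null 2>&1 \
   && [[ -f "$reference" && -f "$assembled" ]]; then
    "${compare_cmd[@]}"
fi
```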
Licensing and Contact Information
StringTie is free, open source software released under an MIT License.
You can contact us about StringTie at:
mpertea jhu edu
For technical issues, bug reports and code contributions please use StringTie's GitHub repository.
Publications
Shumate A, Wong B, Pertea G, Pertea M. Improved transcriptome assembly using a hybrid of long and short reads with StringTie. PLOS Computational Biology 18, 6 (2022), doi:10.1371/journal.pcbi.1009730
Kovaka S, Zimin AV, Pertea GM, Razaghi R, Salzberg SL, Pertea M. Transcriptome assembly from long-read RNA-seq alignments with StringTie2. Genome Biology 20, 278 (2019), doi:10.1186/s13059-019-1910-1
Pertea M, Kim D, Pertea GM, Leek JT, Salzberg SL. Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. Nature Protocols 11, 1650-1667 (2016), doi:10.1038/nprot.2016.095
Pertea M, Pertea GM, Antonescu CM, Chang TC, Mendell JT, Salzberg SL. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nature Biotechnology (2015), doi:10.1038/nbt.3122