StringTie

Overview

StringTie is a fast and highly efficient assembler of RNA-Seq alignments into potential transcripts. It uses a novel network flow algorithm as well as an optional de novo assembly step to assemble and quantitate full-length transcripts representing multiple splice variants for each gene locus. Its input can include not only the short-read alignments that other transcript assemblers also accept, but also alignments of longer sequences that have been assembled from those reads. To identify differentially expressed genes between experiments, StringTie's output can be processed by specialized software such as Ballgown, Cuffdiff, or other programs (DESeq2, edgeR, etc.).
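As a minimal sketch of a typical assembly run: the file names below (sample.bam, annotation.gtf, sample.gtf) and the thread count are placeholders, and the alignments are assumed to be sorted by coordinate. The -G option supplies an optional reference annotation, -o names the output GTF, and -p sets the number of threads. The command is printed first and executed only if stringtie and the input file are actually present.

```shell
#!/bin/sh
# Placeholder file names; replace with your own data.
ALIGNMENTS="sample.bam"        # coordinate-sorted RNA-Seq alignments
ANNOTATION="annotation.gtf"    # optional reference annotation used as a guide
OUTPUT="sample.gtf"            # assembled transcripts are written here
THREADS=4

# Build the command as text first so it can be inspected before running.
ASSEMBLE_CMD="stringtie $ALIGNMENTS -G $ANNOTATION -o $OUTPUT -p $THREADS"
echo "$ASSEMBLE_CMD"

# Run only when stringtie is installed and the input file exists.
if command -v stringtie >/dev/null 2>&1 && [ -f "$ALIGNMENTS" ]; then
    stringtie "$ALIGNMENTS" -G "$ANNOTATION" -o "$OUTPUT" -p "$THREADS"
fi
```

The resulting GTF file is what downstream tools such as Ballgown or DESeq2 would consume, usually after further preparation steps that are not shown here.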

News

  • 1/6/2025 - v3.0.0 release:
    • improved accuracy in many cases, as StringTie can now better handle poor or inconsistent guide coverage by short or long reads
    • -N/--nasc options can now be used to assemble and better handle incompletely processed (nascent) RNAs that are abundant in the case of rRNA-depletion sequencing samples (e.g. Total RNA Ribo-Zero libraries)
    • fixes and improvements for the -e expression quantification mode, for both long and short RNA-Seq reads
  • 5/7/2024 - v2.2.3 release:
    • fixes for long-read assembly and build scripts
  • 4/20/2024 - v2.2.2 release:
    • fixes an out-of-bounds issue when a large number of predictions are generated
    • fixes other rare situations causing a program crash
  • 1/26/2022 - v2.2.1 release:
    • fixed an issue that caused the -e option to omit low-coverage transcripts from the output
    • fixed a --ptf data loading issue
  • 12/3/2021 - v2.2.0 release highlights:
    • --mix option allows StringTie to take both short and long read alignments; when this option is used, the 2nd BAM (or CRAM) file in the command line must be a long reads alignment file (the 1st being the short-reads alignment file); -L option should not be used in this case.
    • improved transcriptome assembly on mixed data with annotation
    • added support for CRAM input files, as StringTie is now built using HTSlib
  • 3/9/2021 - v2.1.5 release changes:
    • adjusted the calculation of coverage for long reads assembly
    • new trimming procedure implemented
  • 7/7/2020 - v2.1.4 release is a maintenance release with the following fixes:
    • fixed an issue with --merge sometimes incorrectly processing the order of the reference sequences
    • fixed -e issue sometimes entering an infinite loop
    • fixed a compatibility issue between a long transfrag and the path assembled so far
    • small bug fix in computing long read assemblies
  • 5/12/2020 - v2.1.3 release new features and fixes:
    • added the --viral option for long reads from viral data where splice sites do not follow consensus
    • adjustments to the assembly of long read alignment data
    • fixed an occasional issue with the --merge option
    • made the -e option compatible with long reads (-L option)
  • See entire release history here.
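
The argument order required by the --mix option (v2.2.0 and later) can be illustrated with a short sketch. The file names are placeholders, and mix_command is a hypothetical helper defined here only to print the command. The short-read alignment file comes first, the long-read file second, and -L is not used.

```shell
#!/bin/sh
# Placeholder file names; replace with your own alignments.
SHORT_READS="short_reads.bam"   # must come first on the command line
LONG_READS="long_reads.bam"     # must come second; -L is not added
OUTPUT="mixed.gtf"

# Hypothetical helper: print the --mix command with the inputs in the
# required order (short reads, then long reads).
mix_command() {
    printf 'stringtie --mix -o %s %s %s\n' "$3" "$1" "$2"
}

MIX_CMD=$(mix_command "$SHORT_READS" "$LONG_READS" "$OUTPUT")
echo "$MIX_CMD"

# Execute only when stringtie and both input files are available.
if command -v stringtie >/dev/null 2>&1 \
    && [ -f "$SHORT_READS" ] && [ -f "$LONG_READS" ]; then
    stringtie --mix -o "$OUTPUT" "$SHORT_READS" "$LONG_READS"
fi
```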


Obtaining and installing StringTie

The current version of StringTie can be downloaded as a precompiled binary or as a source package.


To build and install StringTie from the source package, follow these steps:

  1. Unpack the downloaded StringTie source archive in a directory of your choice, e.g.:
       cd ~/src/
       tar xvfz ~/Downloads/stringtie-VER.tar.gz
                
    A directory called stringtie-VER (where VER is the current numeric version of the program) will be created in the current directory.

  2. Change directory and build the stringtie executable:
       cd stringtie-VER
       make release
    
  3. Alternatively, the source tree can be downloaded from GitHub and built in a similar fashion:
       git clone https://github.com/gpertea/stringtie
       cd stringtie
       make release
         
  4. Optionally, the stringtie executable can be copied to one of the shell's PATH directories for easy access.
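
Step 4 could look like the following sketch. It assumes the script is run from the build directory, and it uses $HOME/bin as the install location, which is only an example (set the BIN_DIR variable to choose another directory). The PATH change applies to the current shell session only.

```shell
#!/bin/sh
# Assumed install location; any directory on PATH works.
BIN_DIR="${BIN_DIR:-$HOME/bin}"

# Copy the freshly built executable if it is in the current directory.
if [ -x ./stringtie ]; then
    mkdir -p "$BIN_DIR"
    cp ./stringtie "$BIN_DIR/"
fi

# Add the directory to PATH for this session if it is not there already.
case ":$PATH:" in
    *":$BIN_DIR:"*) ;;
    *) PATH="$BIN_DIR:$PATH"; export PATH ;;
esac

# Report the installed version, or note that the binary was not found.
if command -v stringtie >/dev/null 2>&1; then
    stringtie --version
else
    echo "stringtie not found on PATH"
fi
```

To make the change permanent, the PATH assignment would normally go into the shell's startup file.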

For evaluating and further processing the GTF output of StringTie, the utility gffcompare can be downloaded from the GFF utilities page.
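
A comparison against a reference annotation might be run as sketched below. The file names and the output prefix are placeholders. The -r option names the reference annotation and -o sets the prefix for the output files; the command is executed only if gffcompare and both input files are present.

```shell
#!/bin/sh
# Placeholder file names; replace with your own data.
REFERENCE="annotation.gtf"   # reference annotation to compare against
ASSEMBLY="sample.gtf"        # GTF produced by StringTie
PREFIX="cmp"                 # output file names start with this prefix

# Build the command as text first so it can be inspected before running.
COMPARE_CMD="gffcompare -r $REFERENCE -o $PREFIX $ASSEMBLY"
echo "$COMPARE_CMD"

# Run only when gffcompare is installed and both inputs exist.
if command -v gffcompare >/dev/null 2>&1 \
    && [ -f "$REFERENCE" ] && [ -f "$ASSEMBLY" ]; then
    gffcompare -r "$REFERENCE" -o "$PREFIX" "$ASSEMBLY"
fi
```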



Licensing and Contact Information

StringTie is free, open source software released under the MIT License.
You can contact us about StringTie at: mpertea jhu edu

For technical issues, bug reports, and code contributions, please use StringTie's GitHub repository.



Publications

Shumate A, Wong B, Pertea G, Pertea M. Improved transcriptome assembly using a hybrid of long and short reads with StringTie. PLOS Computational Biology 18, 6 (2022), doi:10.1371/journal.pcbi.1009730

Kovaka S, Zimin AV, Pertea GM, Razaghi R, Salzberg SL, Pertea M. Transcriptome assembly from long-read RNA-seq alignments with StringTie2. Genome Biology 20, 278 (2019), doi:10.1186/s13059-019-1910-1

Pertea M, Kim D, Pertea GM, Leek JT, Salzberg SL. Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. Nature Protocols 11, 1650-1667 (2016), doi:10.1038/nprot.2016.095

Pertea M, Pertea GM, Antonescu CM, Chang TC, Mendell JT, Salzberg SL. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nature Biotechnology (2015), doi:10.1038/nbt.3122

