StringTie

Overview

StringTie is a fast and highly efficient assembler of RNA-Seq alignments into potential transcripts. It uses a novel network flow algorithm as well as an optional de novo assembly step to assemble and quantitate full-length transcripts representing multiple splice variants for each gene locus. Its input can include not only the short-read alignments that other transcript assemblers also accept, but also alignments of longer sequences that have been assembled from those reads. To identify differentially expressed genes between experiments, StringTie's output can be processed by specialized software such as Ballgown, Cuffdiff, or other programs (DESeq2, edgeR, etc.).
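As a rough sketch of the workflow described above, the commands below assemble one sample and then estimate abundances against a reference annotation for downstream tools. The file names (sample.bam, ref.gtf, sample_list.txt) and the output layout are placeholders, the input is assumed to be a coordinate-sorted BAM file, and `run` is a hypothetical helper that only prints each command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical helper (not part of StringTie): print each command and
# execute it only when DRY_RUN=0. The default is a dry run.
DRY_RUN="${DRY_RUN:-1}"
run() {
    echo "+ $*"
    [ "$DRY_RUN" = 1 ] || "$@"
}

# Placeholder inputs: a sorted alignment file and a reference annotation.
BAM=sample.bam
REF=ref.gtf

# 1. Assemble transcripts from the alignments, guided by the annotation (-G).
run stringtie "$BAM" -G "$REF" -o sample.gtf

# 2. Estimate abundances for the reference transcripts only (-e) and
#    write Ballgown table files next to the output GTF (-B).
run stringtie "$BAM" -e -B -G "$REF" -o ballgown/sample/sample.gtf

# 3. For count-based tools such as DESeq2 or edgeR, build count matrices
#    with prepDE.py (assumed here to be on PATH; -i names a sample list).
run prepDE.py -i sample_list.txt
```

Running the script as is only lists the commands; setting DRY_RUN=0 executes them, which requires stringtie and the input files to be present.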

News

  • 1/25/2020 - v2.1.0 release brings improvements, corrections and new features.
    • more accurate assembly of long RNA reads from complex genomes with many overlapping genes
    • improved start/end trimming for long read alignments
    • new -R option for long read alignments (-L alternative) to output unassembled, cleaned, and non-redundant long read alignments (collapsing similar long read alignments at the same location).
    • (experimental) verify consensus splice sites if genomic sequence is provided with -g
  • 12/16/2019 - v2.0.6 release is a maintenance StringTie release fixing a regression bug (occasional crash/instability) affecting the previous release.
  • 10/29/2019 - v2.0.4 release is a maintenance StringTie release:
    • fixed an occasional instability problem occurring in some situations when long read alignments (-L option) were used in conjunction with guides (reference annotation, -G option).
    • fixed a problem with the -e option sometimes adding newly assembled transcripts to the output that were not present in the -G reference transcripts file; this caused the prepDE.py script to fail in those cases.
    • added a few more input checks in prepDE.py to verify if the GTF files being processed were generated as expected.
  • 7/30/2019 - v2.0 release is a major StringTie release adding new features and improvements.
    • added support for long read alignments (enabled with -L option)
    • added a new "super-reads" module (found in the SuperReads_RNA directory of the source distribution) which can be used to perform de novo assembly and alignment of RNA-Seq reads, preparing them for assembly with StringTie
    • overall improved handling of read alignments and their transcription strand assignment
  • 5/2/2019 - v1.3.6 release is a maintenance build addressing a few issues found since the previous release:
    • fixing a GFF/GTF sorting issue causing occasional errors when the --merge option was used
    • addressing a float precision problem causing negative coverage/TPM/FPKM values in some cases
    • various GFF/GTF parsing adjustments improving support for some reference annotation sources
  • See entire release history here.
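
The options named in the release notes above can be combined roughly as follows. This is a minimal sketch: the file names are placeholders, and `run` is a hypothetical helper that only prints each command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical helper (not part of StringTie): print each command and
# execute it only when DRY_RUN=0. The default is a dry run.
DRY_RUN="${DRY_RUN:-1}"
run() {
    echo "+ $*"
    [ "$DRY_RUN" = 1 ] || "$@"
}

# Assemble long read alignments (-L), here guided by a reference annotation (-G).
run stringtie -L long_reads.bam -G ref.gtf -o long_reads.gtf

# Alternative to -L: output cleaned, collapsed long read alignments
# without assembling them (-R).
run stringtie -R long_reads.bam -o long_reads.cleaned.gtf

# Merge per-sample assemblies into one non-redundant set of transcripts.
run stringtie --merge -G ref.gtf -o merged.gtf sample1.gtf sample2.gtf
```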


Obtaining and installing StringTie

The current version of StringTie can be downloaded as a precompiled binary or as a source package.


To build and install StringTie from the source package, follow these steps:

  1. Unpack the downloaded StringTie source archive in a directory of your choice, e.g.:
       cd ~/src/
       tar xvfz ~/Downloads/stringtie-VER.tar.gz
                
    A directory called stringtie-VER (where VER is the current numeric version of the program) will be created in the current directory.

  2. Change directory and build the stringtie executable:
       cd stringtie-VER
       make release
    
  3. Alternatively, the source tree can be downloaded from GitHub and built in a similar fashion:
         git clone https://github.com/gpertea/stringtie
         cd stringtie
         make release
         
  4. Optionally, the stringtie executable can be copied to one of the shell's PATH directories for easy access.
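
The optional last step can be sketched as follows. The install directory ~/bin is an assumption (any directory works), and `add_to_path` is a hypothetical helper that changes PATH for the current shell session only.

```shell
#!/bin/sh
# Assumption: ~/bin is a personal directory for executables.
INSTALL_DIR="${INSTALL_DIR:-$HOME/bin}"

# Hypothetical helper: prepend a directory to PATH for the current
# session unless it is already listed.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                          # already present, nothing to do
        *) PATH="$1:$PATH"; export PATH ;;
    esac
}

# Copy the freshly built executable when run from the build directory.
if [ -f ./stringtie ] && [ -x ./stringtie ]; then
    mkdir -p "$INSTALL_DIR"
    cp ./stringtie "$INSTALL_DIR/"
fi

add_to_path "$INSTALL_DIR"

# Show which executable the shell now resolves, if any.
command -v stringtie || echo "stringtie not found on PATH yet"
```

To make the change permanent, the PATH line would typically go into the shell's startup file.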

For evaluating and further processing the GTF output of StringTie, the utility gffcompare can be downloaded from the GFF utilities page.



Licensing and Contact Information

StringTie is free, open-source software released under an MIT License.
You can contact us about StringTie at: mpertea jhu edu

For technical issues, bug reports and code contributions please use StringTie's GitHub repository.



Publications

Kovaka S, Zimin AV, Pertea GM, Razaghi R, Salzberg SL, Pertea M. Transcriptome assembly from long-read RNA-seq alignments with StringTie2. Genome Biology 20, 278 (2019), doi:10.1186/s13059-019-1910-1

Pertea M, Kim D, Pertea GM, Leek JT, Salzberg SL. Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. Nature Protocols 11, 1650-1667 (2016), doi:10.1038/nprot.2016.095

Pertea M, Pertea GM, Antonescu CM, Chang TC, Mendell JT, Salzberg SL. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nature Biotechnology (2015), doi:10.1038/nbt.3122

