CLASS - transcript assembly from RNA-seq sequences
CLASS (Constraint-based Local Assembly
and Selection of Splice variants) is a program for
assembling transcripts from RNA-seq reads aligned to a reference genome.
It produces a set of transcripts in three stages. Stage 1 uses
linear programming to determine a set of exons. Stage 2 builds a splice
graph representation of a gene, by connecting the exons (vertices) via
introns (edges) extracted from spliced read alignments. Stage 3 selects
a subset of the candidate transcripts encoded in the graph, according
to the constraints derived from mate pairs and spliced alignments and,
optionally, using knowledge about gene structure extracted from known
annotation or alignments of cDNA sequences ('evidence').
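As a rough illustration of Stage 2 only, the sketch below builds a splice graph from a list of exons and introns and enumerates the candidate transcripts it encodes. This is not CLASS's actual code: the function names, the coordinate convention (1-based inclusive intervals, with each intron given as its first and last intronic base), and the toy coordinates are all assumptions made for the example. The Stage 1 linear program and the Stage 3 selection step are not modeled.

```python
def build_splice_graph(exons, introns):
    """Connect exons (vertices) via introns (edges).

    Hypothetical convention: exons and introns are (start, end) tuples,
    1-based and inclusive. An intron links every exon ending at
    intron_start - 1 to every exon starting at intron_end + 1.
    """
    exons_by_start = {}
    exons_by_end = {}
    for exon in exons:
        exons_by_start.setdefault(exon[0], []).append(exon)
        exons_by_end.setdefault(exon[1], []).append(exon)

    graph = {exon: [] for exon in exons}
    for intron_start, intron_end in introns:
        for upstream in exons_by_end.get(intron_start - 1, []):
            for downstream in exons_by_start.get(intron_end + 1, []):
                if downstream not in graph[upstream]:
                    graph[upstream].append(downstream)
    return graph


def candidate_transcripts(graph):
    """Enumerate every path from an exon with no incoming edge to an
    exon with no outgoing edge. Each path is one candidate transcript."""
    has_incoming = {d for downs in graph.values() for d in downs}
    sources = sorted(exon for exon in graph if exon not in has_incoming)
    paths = []

    def extend(path):
        successors = sorted(graph[path[-1]])
        if not successors:
            paths.append(tuple(path))
            return
        for nxt in successors:
            extend(path + [nxt])

    for source in sources:
        extend([source])
    return paths


# Toy gene: three exons, with one intron that skips the middle exon.
toy_exons = [(100, 200), (300, 400), (500, 600)]
toy_introns = [(201, 299), (401, 499), (201, 499)]
toy_graph = build_splice_graph(toy_exons, toy_introns)
toy_candidates = candidate_transcripts(toy_graph)
for transcript in toy_candidates:
    print(transcript)
```

In this toy gene the graph encodes two candidates, one including the middle exon and one skipping it. Exhaustive enumeration like this grows quickly with the number of alternative exons, which is why a selection step such as Stage 3 is needed to keep only the candidates supported by the data.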
NOTE: This version is obsolete. For an up-to-date method please visit CLASS2.
CLASS is described in:
- Song, L. and L. Florea (2013). CLASS: Constrained Transcript Assembly of RNA-seq Reads. Third Annual RECOMB Satellite Workshop on Massively Parallel Sequencing - RECOMB-SEQ 2013, BMC Bioinformatics 14(Suppl 5):S14. [Open Access]
Download CLASS here.
Download and installation procedure
The program was written for Linux platforms. It may require some
modifications to run on another Unix platform. CLASS is still under
development, so please write to us with feedback and bug reports.
To install and run CLASS, you will need the samtools package, which manages large short-read alignment files.
- Download and install the samtools package, if it is not already on the system. Save the executables somewhere in your PATH.
- Download and unpack the gzipped tar file CLASS.tar.gz:
  gunzip < CLASS.tar.gz | tar -xvf -
- The tar file will unpack into a directory named CLASS.[version]. (You will see the precise name while tar is unpacking.) Make that directory current:
  cd CLASS.*
- Follow the instructions in the included COMPILING file to compile.
NOTE: Using an 'evidence' file will generally improve the results. You can find an 'evidence' file, consisting of spliced alignments of human dbEST and RefSeq sequences produced with the software ESTmapper/sim4db, here.
This work was supported in part by NSF grant ABI-1159078 to Liliana Florea.