CLASS

CLASS - transcript assembly from RNA-seq sequences

CLASS (Constraint-based Local Assembly and Selection of Splice variants) is a program for assembling transcripts from RNA-seq reads aligned to a reference genome. It produces a set of transcripts in three stages. Stage 1 uses linear programming to determine a set of exons. Stage 2 builds a splice graph representation of a gene by connecting the exons (vertices) via introns (edges) extracted from spliced read alignments. Stage 3 selects a subset of the candidate transcripts encoded in the graph, according to constraints derived from mate pairs and spliced alignments and, optionally, using knowledge about gene structure extracted from known annotation or from alignments of cDNA sequences ('evidence').

NOTE: This version is obsolete. For an up-to-date method please visit CLASS2.

CLASS is described in:

Song, L. and L. Florea (2013). CLASS: Constrained Transcript Assembly of RNA-seq Reads. Third Annual RECOMB Satellite Workshop on Massively Parallel Sequencing - RECOMB-SEQ 2013, BMC Bioinformatics 14(Suppl 5):S14. [Open Access]

Download CLASS here.

Download and installation procedure

The program was written for Linux platforms. It may require some modifications to run on other Unix platforms. CLASS is still under development, so please write to us with feedback and bug reports.

To install and run CLASS, you will need the samtools package, which is used for managing large short-read alignment files.

Download and install the samtools package, if not already on the system. Save the executables somewhere in your PATH.
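One quick way to confirm that samtools is reachable through your PATH is sketched below. The `have_tool` function is a hypothetical helper written for this illustration and is not part of CLASS or samtools; it relies only on the standard `command -v` shell builtin.

```shell
# Hypothetical helper (not part of CLASS): report whether a tool is on PATH.
have_tool() {
    # command -v prints the resolved location and returns 0 when the tool is found
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1 -> $(command -v "$1")"
        return 0
    else
        echo "missing: $1 (install it and add its directory to PATH)" >&2
        return 1
    fi
}

# Check for samtools; print a reminder instead of aborting if it is absent.
have_tool samtools || echo "samtools must be installed before running CLASS"
```

If the check reports samtools as missing, install it or add its directory to PATH before continuing with the steps below.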

Download and unpack the gzipped tar file CLASS.tar.gz

        gunzip < CLASS.tar.gz | tar -xvf -

The archive will unpack into a directory named CLASS.[version]. (You'll see the precise name while tar is unpacking.) Change into that directory.
```
        cd CLASS.*
```
To compile, follow the instructions in the included COMPILING file.

NOTE: Using an 'evidence' file will generally improve the results. You can find an 'evidence' file, consisting of spliced alignments of human dbEST and RefSeq sequences produced with the software ESTmapper/sim4db, here.

This work was supported in part by NSF grant ABI-1159078 to Liliana Florea.