JIGSAW is available for all species. We have tested JIGSAW on Human, Rice (Oryza sativa), Arabidopsis thaliana, C. elegans, Brugia malayi, Cryptococcus neoformans, Entamoeba histolytica, Theileria parva, Aspergillus fumigatus, Plasmodium falciparum and Plasmodium yoelii.
The linear combiner option is now available in the current JIGSAW software distribution. This allows JIGSAW to be run without the use of training data. A weight is assigned to each evidence source, and gene predictions are based on a weighted voting scheme, yielding the best 'consensus' predictions.
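The weighted voting idea can be illustrated with a small sketch. This is a simplified per-nucleotide vote written for illustration only; it is not JIGSAW's implementation, which works on whole gene structures. The function names, the source names, the weights and the 0.5 threshold are all assumptions made for this example.

```python
from collections import defaultdict


def weighted_vote(predictions, weights, threshold=0.5):
    """Return positions whose weighted share of supporting sources exceeds threshold.

    predictions: dict mapping a source name to a list of (start, end)
                 inclusive intervals that the source labels as coding.
    weights:     dict mapping a source name to a non-negative weight.
    """
    total = sum(weights[source] for source in predictions)
    if total <= 0:
        return []
    score = defaultdict(float)
    for source, intervals in predictions.items():
        covered = set()
        for start, end in intervals:
            covered.update(range(start, end + 1))
        # Each source votes at most once per position.
        for pos in covered:
            score[pos] += weights[source]
    return sorted(pos for pos, s in score.items() if s / total > threshold)


def to_intervals(positions):
    """Merge a sorted list of positions into inclusive (start, end) intervals."""
    intervals = []
    for pos in positions:
        if intervals and pos == intervals[-1][1] + 1:
            intervals[-1][1] = pos
        else:
            intervals.append([pos, pos])
    return [tuple(iv) for iv in intervals]


# Hypothetical evidence: two gene finders and one alignment source.
predictions = {
    "genefinder_a": [(10, 20)],
    "genefinder_b": [(15, 25)],
    "alignment": [(12, 22)],
}
weights = {"genefinder_a": 1.0, "genefinder_b": 1.0, "alignment": 2.0}

consensus = to_intervals(weighted_vote(predictions, weights))
print(consensus)  # [(12, 22)]
```

In this toy case the alignment source carries twice the weight of each gene finder, so the consensus region follows the alignment wherever at least one gene finder agrees with it.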
Predictions are now available for the ENCODE regions in Human and are viewable as custom tracks in the UCSC Human Genome Browser.
Predictions are available for the Human genome and are viewable as custom tracks in the UCSC Human Genome Browser.
The results in Table 1 measure the accuracy of JIGSAW, Ensembl and cDNA alignments from the UCSC genome browser in Human. The test set is made up of 1,563 genes. JIGSAW uses the output from Ensembl and the cDNA alignments along with many other evidence sources available in the UCSC genome database, including other gene finders and expression evidence. Sensitivity measures the percentage of true genes (exons/nucleotides) that the program finds. Precision measures the percentage of the program's predicted genes (exons/nucleotides) that are correct.
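The two measures defined above can be computed as in the following sketch. It assumes that a predicted exon counts as correct only when both of its boundaries match a true exon exactly; that matching rule, the function name and the coordinates are assumptions for illustration and are not taken from the evaluation behind the tables. The function returns fractions, which are multiplied by 100 to give percentages.

```python
def sensitivity_precision(true_items, predicted_items):
    """Return (sensitivity, precision) as fractions between 0 and 1.

    sensitivity = correct predictions / number of true items
    precision   = correct predictions / number of predicted items
    """
    true_set = set(true_items)
    pred_set = set(predicted_items)
    correct = len(true_set & pred_set)
    sensitivity = correct / len(true_set) if true_set else 0.0
    precision = correct / len(pred_set) if pred_set else 0.0
    return sensitivity, precision


# Hypothetical exon coordinates as (start, end) pairs.
true_exons = [(100, 200), (300, 400), (500, 600), (700, 800)]
predicted_exons = [(100, 200), (300, 400), (510, 600)]

sn, pr = sensitivity_precision(true_exons, predicted_exons)
print(f"sensitivity = {100 * sn:.1f}%, precision = {100 * pr:.1f}%")
```

Here two of the four true exons are found, and two of the three predicted exons are correct. The same function applies at the gene or nucleotide level if the items are gene models or individual positions.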
The results in Table 2 measure the accuracy of JIGSAW, FgenesH and GeneMark.hmm in Oryza sativa. The test set includes 5,595 genes comprising 26,827 exons. JIGSAW uses the output from FgenesH, GlimmerR, GeneMark.hmm and Genscan, splice site predictions from GeneSplicer, sequence alignments from a protein database and sequence alignments from the TIGR gene indices.
The results in Table 3 measure the accuracy of gene prediction programs in Arabidopsis thaliana. The test set includes 1,783 genes comprising 7,510 exons. JIGSAW uses the output from the other gene prediction programs listed in the table, an earlier version of GlimmerM, splice site predictions from GeneSplicer, sequence alignments from a protein database and sequence alignments from the TIGR gene indices.
JIGSAW predicts gene models for a user-supplied genomic sequence. The main interface is a simple "evidence list" file, which lists, for each prediction program, the name of its output file, the file format and the type of evidence. JIGSAW reads several coordinate-based file formats, including GFF.
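As an illustration of what a coordinate-based format looks like, the sketch below reads feature coordinates from GFF lines, using the standard tab-separated GFF columns (sequence name, source, feature, start, end, score, strand, frame, attributes). It is not JIGSAW's own parser and does not show the evidence list layout, which is not described here; the class name, function name and sample records are hypothetical.

```python
from typing import Iterable, Iterator, NamedTuple


class GffFeature(NamedTuple):
    seqid: str
    source: str
    feature: str
    start: int
    end: int
    strand: str


def parse_gff(lines: Iterable[str]) -> Iterator[GffFeature]:
    """Yield one GffFeature per data line, skipping comments and short lines."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) < 8:
            continue  # not a complete GFF record
        yield GffFeature(cols[0], cols[1], cols[2],
                         int(cols[3]), int(cols[4]), cols[6])


# Hypothetical GFF records for illustration.
sample = [
    "##gff-version 2",
    "chr1\tgenefinder_a\texon\t100\t200\t.\t+\t0\tgene1",
    "chr1\tgenefinder_a\texon\t300\t400\t.\t+\t2\tgene1",
]

features = list(parse_gff(sample))
for f in features:
    print(f.seqid, f.feature, f.start, f.end, f.strand)
```

GFF coordinates are 1-based and inclusive, so the first exon above spans 101 bases.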
System requirements
JIGSAW is developed in C++ and compiles using GNU gcc 3.2 or newer.
This software is OSI Certified Open Source Software.
Software development documentation
Library API
References
J. E. Allen, W. H. Majoros, M. Pertea and S. L. Salzberg. JIGSAW, GeneZilla, and GlimmerHMM: puzzling out the features of human genes in the ENCODE regions. Genome Biology 7(Suppl 1): S9, 2006.
J. E. Allen and S. L. Salzberg. JIGSAW: integration of multiple sources of evidence for gene prediction. Bioinformatics 21(18): 3596-3603, 2005.
J. E. Allen, M. Pertea and S. L. Salzberg. Computational gene prediction using multiple sources of evidence. Genome Research, 14(1), 2004.
Supported by NIH grant R01-LM06845 to SLS.
jeallen - umiacs umd edu