GlimmerM is a gene finder derived from Glimmer, but developed specifically for eukaryotes. It is based on a dynamic programming algorithm that considers all combinations of possible exons for inclusion in a gene model and chooses the best of these combinations. The choice of the best gene model is based on a combination of the strength of the splice sites and the exon scores generated by an interpolated Markov model (IMM). The system has been trained for Arabidopsis thaliana, Oryza sativa (rice), and Plasmodium falciparum (the malaria parasite), and should work well on closely related organisms. See below for instructions on downloading the complete system including source code.
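The idea of choosing the best combination of candidate exons by dynamic programming can be illustrated with a small sketch. This is not GlimmerM's actual implementation: the `Exon` type, the `best_gene_model` function, and the single per-exon `score` (assumed here to already combine splice-site strength and an IMM-style exon score) are hypothetical, and real gene models also have to respect constraints such as reading-frame consistency that this sketch ignores.

```python
from bisect import bisect_left
from typing import List, NamedTuple, Tuple


class Exon(NamedTuple):
    """Hypothetical candidate exon; coordinates are inclusive."""
    start: int
    end: int
    score: float  # assumed to combine splice-site strength and exon score


def best_gene_model(exons: List[Exon]) -> Tuple[float, List[Exon]]:
    """Pick the highest-scoring chain of non-overlapping exons."""
    ordered = sorted(exons, key=lambda e: e.end)
    ends = [e.end for e in ordered]
    n = len(ordered)

    best = [0.0] * (n + 1)   # best[i]: best score using the first i exons
    take = [False] * (n + 1)  # take[i]: whether exon i is part of best[i]
    prev = [0] * (n + 1)      # prev[i]: how many exons end before exon i starts

    for i, exon in enumerate(ordered, start=1):
        # Count of exons that end strictly before this exon starts.
        prev[i] = bisect_left(ends, exon.start)
        with_exon = best[prev[i]] + exon.score
        if with_exon > best[i - 1]:
            best[i] = with_exon
            take[i] = True
        else:
            best[i] = best[i - 1]

    # Trace back through the table to recover the chosen exons.
    chosen: List[Exon] = []
    i = n
    while i > 0:
        if take[i]:
            chosen.append(ordered[i - 1])
            i = prev[i]
        else:
            i -= 1
    chosen.reverse()
    return best[n], chosen


candidates = [
    Exon(1, 100, 5.0),
    Exon(50, 150, 6.0),   # overlaps the first and third candidates
    Exon(120, 200, 4.0),
    Exon(210, 300, 3.0),
]
total_score, model = best_gene_model(candidates)
```

Sorting by end coordinate lets each exon look up, with one binary search, the best partial model that finishes before it starts, so every combination is considered implicitly without enumerating them all.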
GlimmerM is released as source code and was tested on Linux RedHat 6.x+, Sun Solaris, and Alpha OSF1, but should work on any Unix system.
This software is OSI Certified Open Source Software.
% tar -xzf GlimmerM.tar.gz
A directory named 'GlimmerM/' will be created which contains the executable, training data sets, and other supporting files.
Training data sets are available here.
You can contact us about GlimmerM at: mpertea jhu edu
1. A.L. Delcher, D. Harmon, S. Kasif, O. White, and S.L. Salzberg. Improved microbial gene identification with GLIMMER. Nucleic Acids Research, 27:23 (1999), 4636-4641.
2. Gardner MJ, Tettelin H, Carucci DJ, Cummings LM, Aravind L, Koonin EV, Shallom S, Mason T, Yu K, Fujii C, Peterson J, Shen K, Jing J, Aston C, Lai Z, Schwartz DC, Pertea M, Salzberg S, Zhou L, Sutton GG, Clayton R, White O, Smith HO, Fraser CM, Hoffman SL, et al. Chromosome 2 sequence of the human malaria parasite Plasmodium falciparum. Science. 1998 Nov 6;282(5391):1126-32.
3. Salzberg, S., Delcher, A., Fasman, K., and Henderson, J. (1998a). A decision tree system for finding genes in DNA. J. Comput. Biol. 5(4), 667-680.
4. S. Salzberg, A. Delcher, S. Kasif, and O. White. Microbial gene identification using interpolated Markov models. Nucleic Acids Research 26:2 (1998b), 544-548. Reproduced with permission from NAR Online at http://www.oup.co.uk/nar.
5. Salzberg SL, Pertea M, Delcher AL, Gardner MJ, Tettelin H. Interpolated Markov models for eukaryotic gene finding. Genomics. 1999 Jul 1;59(1):24-31.
6. Pertea M, Salzberg SL, Gardner MJ. Finding genes in Plasmodium falciparum. Nature, 2000 Mar 2;404(6773):34.
7. Yuan Q, Quackenbush J, Sultana R, Pertea M, Salzberg SL, Buell CR. Rice bioinformatics: analysis of rice sequence data and leveraging the data to other plant species. Plant Physiol. 2001 Mar;125(3):1166-74.
8. Pertea, M. and Salzberg, S.L. Computational gene finding in plants. Plant Mol Biol 2002; 48(1-2): 39-48.
9. Pertea, M. and Salzberg, S.L. Using GlimmerM to find genes in eukaryotic genomes. Current Protocols in Bioinformatics, 2002.