Glimmer3
|
Table 5: Glimmer3 long-orfs and Glimmer2.13 long-orfs Compared to All Annotated Genes
|
This table compares the accuracy of the output of
the Glimmer3 version of the long-orfs program with
that of Glimmer2.13 long-orfs.
The long-orfs program creates a set of orfs for training
the ICM (interpolated context model) used
by the main Glimmer gene-prediction program.
The 30 microbial genomes were obtained from
RefSeq at GenBank.
The test set consisted of all annotated genes at least 90 bp
long with no frameshifts or in-frame stop codons.
|
Both versions of long-orfs were run in default mode,
i.e., they both computed the orf length
L that maximized the number of non-overlapping orfs
of length L or greater. In addition, the
Glimmer3 version of long-orfs was run
with the "-t 1.15" option, which first
filters the set of candidate orfs based on amino-acid composition
before computing L. For high-GC genomes in particular,
this results in a much larger training set of orfs.
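
The default-mode choice of L described above can be illustrated with a minimal sketch. It is not the long-orfs source code: the function names, the coordinate convention, the brute-force overlap test, and the tie-breaking rule (smallest L wins) are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: (start, end), 1-based inclusive, start <= end.
Orf = Tuple[int, int]


def overlaps(a: Orf, b: Orf) -> bool:
    """True if the two intervals share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


def count_nonoverlapping(orfs: List[Orf], min_len: int) -> int:
    """Count orfs of length >= min_len that overlap no other such orf."""
    kept = [o for o in orfs if o[1] - o[0] + 1 >= min_len]
    count = 0
    for i, o in enumerate(kept):
        if not any(overlaps(o, p) for j, p in enumerate(kept) if j != i):
            count += 1
    return count


def best_min_length(orfs: List[Orf]) -> Tuple[int, int]:
    """Return (L, count) for the length L giving the most non-overlapping orfs.

    Only lengths that actually occur are tried; ties go to the smallest L
    (an assumption, the source does not say how ties are broken).
    """
    best_len, best_count = 0, -1
    for length in sorted({e - s + 1 for s, e in orfs}):
        c = count_nonoverlapping(orfs, length)
        if c > best_count:
            best_len, best_count = length, c
    return best_len, best_count


# Made-up coordinates, not taken from any genome in the table.
example = [(1, 300), (250, 400), (500, 900), (850, 1000), (1200, 1500)]
print(best_min_length(example))  # (300, 3)
```

In the toy data, raising L to 300 drops the two short orfs that caused overlaps, which leaves three mutually non-overlapping orfs; raising L further only shrinks the set.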
|
"Matches" are predictions that had the same
reading frame and stop codon as an annotated gene in the test set.
"Extra" are predictions that are not matches.
Start-codon information is not included in this table
because the long-orfs programs automatically choose
the most upstream start codon.
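
As a rough illustration of the "Matches" and "Extra" definitions above, the following sketch counts predictions by comparing only reading frame and stop-codon position. The `Orf` record and the `score` function are hypothetical names introduced here, and the coordinates are made up.

```python
from typing import Iterable, NamedTuple, Tuple


class Orf(NamedTuple):
    """Hypothetical record for a predicted or annotated gene."""
    frame: int  # reading frame, e.g. +1..+3 or -1..-3
    start: int  # start-codon coordinate (ignored when scoring)
    stop: int   # stop-codon coordinate


def score(predictions: Iterable[Orf], annotated: Iterable[Orf]) -> Tuple[int, int]:
    """Return (matches, extra) for a set of predictions.

    A prediction is a match when an annotated gene has the same reading
    frame and stop codon; every other prediction counts as extra.
    """
    known = {(g.frame, g.stop) for g in annotated}
    preds = list(predictions)
    matches = sum(1 for p in preds if (p.frame, p.stop) in known)
    return matches, len(preds) - matches


annotated = [Orf(1, 100, 400), Orf(-2, 900, 650)]
predicted = [
    Orf(1, 40, 400),    # same frame and stop, different start: match
    Orf(2, 500, 800),   # no annotated gene with this frame and stop: extra
    Orf(-2, 950, 650),  # match
]
print(score(predicted, annotated))  # (2, 1)
```

Because the start coordinate never enters the comparison, a prediction that extends to the most upstream start codon still counts as a match.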
|
The columns labelled "G3 vs. G2.13" are the
Glimmer3 value minus the corresponding Glimmer2.13 value.
For example, an entry of "+2" in the "Matches" column means
that Glimmer3 had 2 more matches than Glimmer2.13 on the
genome for that row.
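
The difference columns can be reproduced from the other columns, as in the sketch below, which uses the Archaeoglobus fulgidus row. The function name is hypothetical, and taking the percentage difference relative to the number of annotated genes is an inference from the table values rather than something the text states.

```python
from typing import Tuple


def g3_vs_g2(n_genes: int,
             g3: Tuple[int, int],
             g2: Tuple[int, int]) -> Tuple[int, float, int]:
    """Return (matches diff, matches diff as % of genes, extra diff).

    g3 and g2 are (matches, extra) pairs for Glimmer3 and Glimmer2.13.
    """
    d_matches = g3[0] - g2[0]
    d_percent = round(100.0 * d_matches / n_genes, 1)
    d_extra = g3[1] - g2[1]
    return d_matches, d_percent, d_extra


# Archaeoglobus fulgidus: 2398 genes, G3 = 1083 matches / 26 extra,
# G2.13 = 706 matches / 18 extra.
print(g3_vs_g2(2398, (1083, 26), (706, 18)))  # (377, 15.7, 8)
```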
|
| Organism | Length | GC% | # Genes | Glimmer3 Matches | Glimmer3 Matches % | Glimmer3 Extra | Glimmer2.13 Matches | Glimmer2.13 Matches % | Glimmer2.13 Extra | G3 vs. G2.13 Matches | G3 vs. G2.13 Matches % | G3 vs. G2.13 Extra |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Archaeoglobus fulgidus | 2.18Mb | 48.6 | 2398 | 1083 | 45.2% | 26 | 706 | 29.4% | 18 | +377 | +15.7% | +8 |
| Bacillus anthracis | 5.23Mb | 35.4 | 5308 | 3494 | 65.8% | 194 | 2934 | 55.3% | 160 | +560 | +10.6% | +34 |
| Bacillus subtilis | 4.21Mb | 43.5 | 4095 | 2647 | 64.6% | 21 | 2062 | 50.4% | 21 | +585 | +14.3% | 0 |
| Campylobacter jejuni | 1.78Mb | 30.3 | 1836 | 1347 | 73.4% | 58 | 743 | 40.5% | 43 | +604 | +32.9% | +15 |
| Carboxydothermus hydrogenoformans | 2.40Mb | 42.0 | 2606 | 1587 | 60.9% | 42 | 1181 | 45.3% | 27 | +406 | +15.6% | +15 |
| Caulobacter crescentus | 4.02Mb | 67.2 | 3737 | 1578 | 42.2% | 53 | 388 | 10.4% | 87 | +1190 | +31.8% | -34 |
| Chlorobium tepidum | 2.15Mb | 56.5 | 2252 | 943 | 41.9% | 37 | 438 | 19.4% | 30 | +505 | +22.4% | +7 |
| Clostridium perfringens | 3.03Mb | 28.6 | 2660 | 2111 | 79.4% | 16 | 1885 | 70.9% | 16 | +226 | +8.5% | 0 |
| Colwellia psychrerythraea | 5.37Mb | 38.0 | 4902 | 3543 | 72.3% | 31 | 3143 | 64.1% | 21 | +400 | +8.2% | +10 |
| Dehalococcoides ethenogenes | 1.47Mb | 48.9 | 1579 | 807 | 51.1% | 33 | 595 | 37.7% | 20 | +212 | +13.4% | +13 |
| Escherichia coli | 4.64Mb | 50.8 | 4231 | 2754 | 65.1% | 39 | 1815 | 42.9% | 17 | +939 | +22.2% | +22 |
| Geobacter sulfurreducens | 3.81Mb | 60.9 | 3438 | 1432 | 41.7% | 59 | 553 | 16.1% | 61 | +879 | +25.6% | -2 |
| Haemophilus influenzae | 1.83Mb | 38.1 | 1649 | 1281 | 77.7% | 51 | 1088 | 66.0% | 41 | +193 | +11.7% | +10 |
| Helicobacter pylori | 1.67Mb | 38.9 | 1556 | 1141 | 73.3% | 20 | 831 | 53.4% | 10 | +310 | +19.9% | +10 |
| Listeria monocytogenes | 2.91Mb | 38.0 | 2819 | 2005 | 71.1% | 29 | 1607 | 57.0% | 21 | +398 | +14.1% | +8 |
| Methylococcus capsulatus | 3.30Mb | 63.6 | 2958 | 1053 | 35.6% | 60 | 302 | 10.2% | 70 | +751 | +25.4% | -10 |
| Mycobacterium tuberculosis | 4.40Mb | 65.6 | 4189 | 1500 | 35.8% | 44 | 866 | 20.7% | 37 | +634 | +15.1% | +7 |
| Neisseria meningitidis | 2.27Mb | 51.5 | 2055 | 994 | 48.4% | 70 | 537 | 26.1% | 50 | +457 | +22.2% | +20 |
| Porphyromonas gingivalis | 2.34Mb | 48.3 | 1909 | 950 | 49.8% | 54 | 639 | 33.5% | 34 | +311 | +16.3% | +20 |
| Pseudomonas fluorescens | 7.07Mb | 63.3 | 6134 | 2873 | 46.8% | 71 | 579 | 9.4% | 129 | +2294 | +37.4% | -58 |
| Pseudomonas putida | 6.18Mb | 61.5 | 5349 | 2616 | 48.9% | 121 | 836 | 15.6% | 99 | +1780 | +33.3% | +22 |
| Ralstonia solanacearum | 3.72Mb | 67.0 | 3435 | 1133 | 33.0% | 42 | 157 | 4.6% | 131 | +976 | +28.4% | -89 |
| Staphylococcus epidermidis | 2.62Mb | 32.1 | 2487 | 1797 | 72.3% | 40 | 1480 | 59.5% | 27 | +317 | +12.7% | +13 |
| Streptococcus agalactiae | 2.16Mb | 35.6 | 2122 | 1589 | 74.9% | 44 | 1224 | 57.7% | 38 | +365 | +17.2% | +6 |
| Streptococcus pneumoniae | 2.16Mb | 39.7 | 2093 | 1289 | 61.6% | 67 | 1001 | 47.8% | 50 | +288 | +13.8% | +17 |
| Thermotoga maritima | 1.86Mb | 46.2 | 1854 | 797 | 43.0% | 20 | 448 | 24.2% | 14 | +349 | +18.8% | +6 |
| Treponema denticola | 2.84Mb | 37.9 | 2761 | 1839 | 66.6% | 18 | 1426 | 51.6% | 16 | +413 | +15.0% | +2 |
| Treponema pallidum | 1.14Mb | 52.8 | 1034 | 507 | 49.0% | 7 | 379 | 36.7% | 6 | +128 | +12.4% | +1 |
| Ureaplasma parvum | 0.75Mb | 25.5 | 614 | 400 | 65.1% | 0 | 338 | 55.0% | 9 | +62 | +10.1% | -9 |
| Wolbachia endosymbiont | 1.08Mb | 34.2 | 805 | 536 | 66.6% | 43 | 441 | 54.8% | 41 | +95 | +11.8% | +2 |
| Averages: | | | | | 57.4% | | | 38.9% | | +566 | +18.6% | +2 |