FlyBase Progress Report December 2008
Staff:
Susan Tweedie (1 FTE on GO funded by MRC UK)
Michael Ashburner
Annotation Progress
In Fall 2008, FlyBase deleted all electronic (IEA) annotations that did not comply with current annotation guidelines (i.e. those that were more than one year old and/or lacked entries in the 'with' column). These annotations were based on mappings to Panther protein domains and were frequently redundant with current InterPro-based GO annotations. As a result, the total number of Drosophila annotations has decreased slightly. The vast majority of new annotations were based on experimental evidence.
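The compliance rule described above can be sketched as a simple filter. This is only an illustration, not the FlyBase pipeline: the `Annotation` record, its field names, the function names, and the 365-day cutoff are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Annotation:
    """Hypothetical, simplified annotation record (not a FlyBase schema)."""
    gene: str
    go_id: str
    evidence: str      # evidence code, e.g. "IEA", "IDA", "ISS"
    with_field: str    # contents of the 'with' column; empty if absent
    assigned: date     # date the annotation was made


def is_compliant(ann: Annotation, today: date, max_age_days: int = 365) -> bool:
    """Return False for IEA annotations that are too old and/or lack a 'with' entry."""
    if ann.evidence != "IEA":
        return True  # the rule only concerns electronic annotations
    too_old = (today - ann.assigned) > timedelta(days=max_age_days)
    missing_with = not ann.with_field.strip()
    return not (too_old or missing_with)


def drop_noncompliant(annotations, today: date):
    """Keep only annotations that pass the compliance check."""
    return [a for a in annotations if is_compliant(a, today)]
```

For example, an IEA annotation dated two years before the check, or one with an empty 'with' column, would be dropped, while a manually curated annotation of any age would be kept.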
Table showing Drosophila GO annotation progress in 2008 by evidence type
Evidence type | Dec 2008 | Dec 2007 | % Change |
---|---|---|---|
Annotations with experimental evidence | 17568 | 14811 | +18.6% |
Annotations with IEA | 16130 | 22885 | -29.5% |
Annotations with ISS | 15478 | 15381 | +0.6% |
Annotations with ND | 5003 | 5016 | -0.3% |
Other annotations (NAS,TAS,RCA,IC) | 16487 | 16750 | -1.6% |
Total annotations | 70666 | 74843 | -5.6% |
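The percentage changes and totals in the table above can be reproduced with a few lines of arithmetic. The sketch below uses only the counts from the table; the dictionary layout and the function name `percent_change` are illustrative choices.

```python
# Annotation counts from the table: evidence type -> (Dec 2008, Dec 2007)
counts = {
    "Experimental": (17568, 14811),
    "IEA": (16130, 22885),
    "ISS": (15478, 15381),
    "ND": (5003, 5016),
    "Other (NAS, TAS, RCA, IC)": (16487, 16750),
}


def percent_change(new: int, old: int) -> float:
    """Relative change from old to new, expressed as a percentage."""
    return (new - old) / old * 100.0


# Column totals, which should match the 'Total annotations' row
total_2008 = sum(new for new, _ in counts.values())
total_2007 = sum(old for _, old in counts.values())

for label, (new, old) in counts.items():
    print(f"{label}: {percent_change(new, old):+.1f}%")
print(f"Total: {total_2008} vs {total_2007} "
      f"({percent_change(total_2008, total_2007):+.1f}%)")
```

The per-row figures sum to the reported totals, and the overall decrease is driven almost entirely by the drop in IEA annotations.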
Table showing annotation status of Drosophila genes in October 2008
Annotation status | Biological Process | Molecular Function | Cellular Component | Combined GO |
---|---|---|---|---|
Genes with any GO annotation | 8080 | 9253 | 6893 | 10131 |
Genes with > 1 experimentally based term | 2603 | 1217 | 1403 | 3163 |
Genes with only IEA | 2288 | 2055 | 1684 | 1716 |
Genes with only ND | 855 | 859 | 1013 | 728 |
Methods and strategies for annotation
Literature curation:
Almost 100% of curation effort is devoted to capturing GO terms from published literature.
Priorities for annotation
- Genes assigned by Reference Genome Project
- Genes that lack any annotation
- Genes encoding transcription factors (see highlights section)
Computational annotation strategies:
The only computational strategy used is based on mappings from InterPro domains to GO terms. These annotations are updated for each release of FlyBase (every 4-6 weeks).
Presentations and Publications
a. Papers with substantial GO content
Tweedie, S., Ashburner, M., Falls, K., Leyland, P., McQuilton, P., Marygold, S., Millburn, G., Osumi-Sutherland, D., Schroeder, A., Seal, R., Zhang, H., and the FlyBase Consortium. FlyBase: Enhancing Drosophila Gene Ontology (GO) annotations. Nucleic Acids Res. [in press]
b. Presentations including Talks and Tutorials and Teaching
c. Poster presentations
Susan Tweedie, The FlyBase Consortium and The Gene Ontology Consortium. Drosophila melanogaster: a GO reference genome. 49th Drosophila Research Conference, April 2-6, 2008, San Diego, CA.
Other Highlights:
A. Ontology Development Contributions:
- S. Tweedie contributed to the project identifying terms that should be limited to specific taxa.
B. Annotation Outreach and User Advocacy Efforts:
- S. Tweedie is a member of the GO newsletter team.
- S. Tweedie answers GO-related user questions from the FlyBase and GO help email lists.
- S. Tweedie participated in the Reference Genome curation e-Jamborees (July and October 2008).
C. Other:
FlyBase funded a project student (Ulrike Pfreundt) to work with Boris Adryan at the Drosophila Transcription Factor Database (www.FlyTF.org) to review and improve GO annotations for transcription factors. New GO annotation data from this work will be added to FlyBase in 2009.
Terms predicted by electronic annotation have been reviewed. All terms assigned more than one year ago have now been removed (this became public in November 2008). All IEA-based terms now come solely from InterPro to GO mappings.