FlyBase Progress Report December 2008

Susan Tweedie (1 FTE on GO funded by MRC UK)

Michael Ashburner

Annotation Progress

In Fall 2008, FlyBase deleted all electronic (IEA) annotations that did not comply with current annotation guidelines (i.e. those that were over 1 year old and/or lacked entries in the 'with' column). These annotations were based on mappings to Panther protein domains and were frequently redundant with current InterPro-based GO annotations. As a result, the total number of Drosophila annotations has decreased by 5.6%. The vast majority of new annotations were based on experimental evidence.
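The compliance rule described above can be illustrated with a minimal sketch. This is not FlyBase's actual pipeline: the record layout (`gene`, `evidence`, `date`, `with`), the function names, the sample records, and the reading of "1 year" as 365 days are all assumptions made for illustration.

```python
from datetime import date, timedelta

# Assumption: "over 1 year old" is approximated as more than 365 days.
MAX_AGE = timedelta(days=365)


def is_compliant_iea(annotation, today):
    """True if an IEA annotation is recent enough and has a 'with' entry."""
    recent = (today - annotation["date"]) <= MAX_AGE
    has_with = bool(annotation["with"].strip())
    return recent and has_with


def filter_annotations(annotations, today):
    """Keep all non-IEA annotations; drop IEA annotations that fail the rule."""
    return [
        a for a in annotations
        if a["evidence"] != "IEA" or is_compliant_iea(a, today)
    ]


# Made-up records with placeholder identifiers (hypothetical field names).
today = date(2008, 12, 1)
sample = [
    {"gene": "gene_a", "evidence": "IEA", "date": date(2008, 10, 1), "with": "InterPro:PLACEHOLDER"},
    {"gene": "gene_b", "evidence": "IEA", "date": date(2007, 1, 15), "with": "InterPro:PLACEHOLDER"},
    {"gene": "gene_c", "evidence": "IEA", "date": date(2008, 10, 1), "with": ""},
    {"gene": "gene_d", "evidence": "IDA", "date": date(2005, 3, 9), "with": ""},
]
kept = [a["gene"] for a in filter_annotations(sample, today)]
print(kept)
```

In this sketch, age and a missing 'with' entry are each sufficient grounds for removal, which matches the "and/or" wording of the guideline; annotations with other evidence codes are never touched.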

Table showing Drosophila GO annotation progress in 2008 by evidence type

                                          Dec 2008  Dec 2007  % Change
Annotations with experimental evidence       17568     14811    +18.6%
Annotations with IEA                         16130     22885    -29.5%
Annotations with ISS                         15478     15381     +0.6%
Annotations with ND                           5003      5016     -0.3%
Other annotations (NAS, TAS, RCA, IC)        16487     16750     -1.6%
Total annotations                            70666     74843     -5.6%
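As a cross-check of the table above, the percentage changes and the totals can be recomputed from the per-category counts. The sketch below copies the counts from the table; the variable and function names are illustrative only.

```python
# Counts copied from the table above as (Dec 2008, Dec 2007).
counts = {
    "experimental": (17568, 14811),
    "IEA": (16130, 22885),
    "ISS": (15478, 15381),
    "ND": (5003, 5016),
    "other": (16487, 16750),  # NAS, TAS, RCA, IC
}


def percent_change(new, old):
    """Relative change from old to new, in percent, rounded to one decimal."""
    return round(100.0 * (new - old) / old, 1)


changes = {name: percent_change(new, old) for name, (new, old) in counts.items()}

# The totals should equal the sum of the five evidence categories.
total_2008 = sum(new for new, _ in counts.values())
total_2007 = sum(old for _, old in counts.values())
total_change = percent_change(total_2008, total_2007)

for name, change in changes.items():
    print(f"{name}: {change:+.1f}%")
print(f"total: {total_2008} vs {total_2007} ({total_change:+.1f}%)")
```

The category counts sum to the reported totals, and the drop in the total is accounted for almost entirely by the removal of IEA annotations.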

Table showing annotation status of Drosophila genes in October 2008

                                          Biological Process  Molecular Function  Cellular Component  Combined GO
Genes with any GO annotation                            8080                9253                6893        10131
Genes with > 1 experimentally based term                2603                1217                1403         3163
Genes with only IEA                                     2288                2055                1684         1716
Genes with only ND                                       855                 859                1013          728

Methods and strategies for annotation

Literature curation:

Almost 100% of curation effort is devoted to capturing GO terms from published literature.

Priorities for annotation

  1. Genes assigned by the Reference Genome Project
  2. Genes that lack any annotation
  3. Genes encoding transcription factors (see highlights section)

Computational annotation strategies:

The only computational strategy used is based on the mapping to InterPro domains. This is updated for each release of FlyBase (every 4-6 weeks).

Presentations and Publications

a. Papers with substantial GO content

Tweedie, S., Ashburner, M., Falls, K., Leyland, P., McQuilton, P., Marygold, S., Millburn, G., Osumi-Sutherland, D., Schroeder, A., Seal, R., Zhang, H., and the FlyBase Consortium. FlyBase: Enhancing Drosophila Gene Ontology (GO) annotations. Nucleic Acids Res. [in press]

b. Presentations including Talks and Tutorials and Teaching

c. Poster presentations

Susan Tweedie, The FlyBase Consortium and The Gene Ontology Consortium. Drosophila melanogaster: a GO reference genome. 49th Drosophila Research Conference, April 2-6, 2008, San Diego, CA.

Other Highlights:

A. Ontology Development Contributions:

    • S. Tweedie contributed to the project identifying terms that should be limited to specific taxa.

B. Annotation Outreach and User Advocacy Efforts:

    • S. Tweedie is a member of the GO newsletter team.
    • S. Tweedie answers users' GO-related questions from the FlyBase and GO help email lists.
    • S. Tweedie participated in the Reference Genome curation e-Jamborees (July and October 2008).

C. Other:

FlyBase funded a project student (Ulrike Pfreundt) to work with Boris Adryan at the Drosophila Transcription Factor Database to review and improve GO annotations for transcription factors. New GO annotation data from this work will be added to FlyBase in 2009.

Terms predicted by electronic annotation have been reviewed. All electronically assigned terms that were more than one year old have now been removed (this change became public in November 2008). All IEA-based terms now come solely from InterPro to GO mappings.