TAIR Nov2011
TAIR, The Arabidopsis Information Resource, November 2011
Statement of Year 11 objectives (March 1, 2011 – February 29, 2012)
1. Staff working on GOC tasks
Tanya Berardini, Donghui Li
The total number of FTE working on GOC tasks is 1.4.
2. Annotation progress
Table 1: Number of Annotations to Various GO Aspects
Annotations | BP (12/10) | BP (11/02/11) | change | MF (12/10) | MF (11/02/11) | change | CC (12/10) | CC (11/02/11) | change
---|---|---|---|---|---|---|---|---|---
non-IEA/non-ND | 17690 | 19401 | + 1711 | 11219 | 12293 | + 1074 | 19841 | 24348 | + 4507
IEA | 12095 | 11276 | - 819 | 19293 | 18479 | - 814 | 10505 | 8335 | - 2170
ND | 9875 | 9904 | + 29 | 5060 | 5271 | + 211 | 10237 | 9120 | - 1117
Table 2: Number of Genes Annotated to Various GO Aspects
Genes | BP (12/10) | BP (11/02/11) | change | MF (12/10) | MF (11/02/11) | change | CC (12/10) | CC (11/02/11) | change
---|---|---|---|---|---|---|---|---|---
non-IEA/non-ND | 7981 | 8463 | + 482 | 7189 | 7410 | + 221 | 7619 | 8661 | + 1042
IEA | 6973 | 6620 | - 353 | 7924 | 7672 | - 252 | 7538 | 6148 | - 1390
ND | 9875 | 9904 | + 29 | 5059 | 5270 | + 211 | 10233 | 9116 | - 1117
- Numbers of annotations/genes do not include annotations to Arabidopsis genes contributed by TIGR, GOC, RefGenome, IntAct or UniProtKB group. See section 3c for those numbers.
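Each change column is the later count minus the earlier count for the same aspect. As a minimal sketch of that arithmetic, the snippet below recomputes the changes from the gene counts in Table 2. The dictionary layout and the function name `changes` are illustrative choices and not part of any TAIR tool.

```python
# Gene counts copied from Table 2: (12/10 count, 11/02/11 count) per aspect.
TABLE2 = {
    "non-IEA/non-ND": {"BP": (7981, 8463), "MF": (7189, 7410), "CC": (7619, 8661)},
    "IEA": {"BP": (6973, 6620), "MF": (7924, 7672), "CC": (7538, 6148)},
    "ND": {"BP": (9875, 9904), "MF": (5059, 5270), "CC": (10233, 9116)},
}


def changes(table):
    """Return later-minus-earlier differences for every row and aspect."""
    return {
        row: {aspect: after - before for aspect, (before, after) in cols.items()}
        for row, cols in table.items()
    }


if __name__ == "__main__":
    for row, cols in changes(TABLE2).items():
        # Print signed differences, e.g. "BP +482".
        print(row, " ".join(f"{aspect} {delta:+d}" for aspect, delta in cols.items()))
```

The same function can be applied to the annotation counts in Table 1 by swapping in that table's numbers.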
3. Methods and strategies for annotation
a. Literature curation: We continue to put most of our effort (95%) into annotation of gene products from the literature.
b. Computational annotation strategies: With every genome release, we run two computational GO annotation pipelines, one based on INTERPROtoGO mapping and the other based on a TargetP analysis. These results are integrated into our GO annotation file. This represents roughly 5% of our annotation effort.
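The domain-to-GO half of this approach can be illustrated with a short sketch. It is not the TAIR pipeline itself: it assumes mapping lines of the form `InterPro:<id> <name> > GO:<term name> ; <GO id>`, and the domain ID `IPR999999`, the locus `AT0G00000`, and both function names are hypothetical placeholders. The TargetP-based pipeline is not covered here.

```python
def parse_interpro2go(lines):
    """Build {InterPro id: [GO ids]} from mapping lines (assumed format)."""
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("!"):  # skip blanks and comment lines
            continue
        left, sep, right = line.partition(" > ")
        if not sep:
            continue  # not a mapping line
        ipr_id = left.split()[0].split(":", 1)[1]    # "InterPro:IPR..." -> "IPR..."
        go_id = right.rsplit(";", 1)[-1].strip()     # text after the last ';'
        mapping.setdefault(ipr_id, []).append(go_id)
    return mapping


def infer_iea(gene_domains, mapping):
    """Emit (gene, GO id, evidence code, supporting domain) tuples."""
    rows = []
    for gene, domains in gene_domains.items():
        for ipr_id in domains:
            for go_id in mapping.get(ipr_id, []):
                rows.append((gene, go_id, "IEA", f"InterPro:{ipr_id}"))
    return rows


# Hypothetical sample data, for illustration only.
SAMPLE = [
    "!example mapping file",
    "InterPro:IPR999999 Hypothetical example domain > GO:DNA binding ; GO:0003677",
    "InterPro:IPR999999 Hypothetical example domain > GO:nucleus ; GO:0005634",
]

if __name__ == "__main__":
    for row in infer_iea({"AT0G00000": ["IPR999999"]}, parse_interpro2go(SAMPLE)):
        print(row)
```

Every inferred annotation carries the IEA evidence code and records the domain that supports it, which is why these annotations are counted separately from the literature-based ones in Tables 1 and 2.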
c. Integration of non-TAIR Arabidopsis annotations: We integrate the following annotations into our gene association file so that Arabidopsis annotations, regardless of original source, are now relayed to GO via TAIR with the appropriate source attribution.
1. Experimental-code Arabidopsis GO annotations from GOA (4017 annotations)
2. PAINT-based annotations from the RefGenome group (1080 annotations)
3. Function-Process link-based annotations from GOC (40 annotations; this number is lower than expected, apparently because of a problem with the pipeline)
4. TIGR (29860 annotations)
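The idea of relaying annotations from several groups through one file while preserving the source can be sketched as follows. This is an illustration under stated assumptions, not the TAIR integration code: the tuple layout, the function name `merge_with_attribution`, and the placeholder records are hypothetical, and only the four counts are taken from the list above.

```python
from collections import Counter

# Counts of integrated non-TAIR annotations, as reported in the list above.
REPORTED_COUNTS = {"GOA": 4017, "RefGenome": 1080, "GOC": 40, "TIGR": 29860}


def merge_with_attribution(sources):
    """Flatten {source: [(gene, GO id, evidence)]} into rows tagged with their source."""
    merged = []
    for source, rows in sources.items():
        for gene, go_id, evidence in rows:
            merged.append((gene, go_id, evidence, source))
    return merged


if __name__ == "__main__":
    # Hypothetical placeholder records, for illustration only.
    example = {
        "GOA": [("AT0G00000", "GO:0003677", "IDA")],
        "TIGR": [("AT0G00000", "GO:0005634", "ISS")],
    }
    merged = merge_with_attribution(example)
    print(Counter(source for *_, source in merged))
    print("Reported non-TAIR total:", sum(REPORTED_COUNTS.values()))
```

Keeping the source as a field on every row lets per-source totals like those in the list be recomputed from the merged file at any time.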
d. Priorities for annotation:
1. literature of any age pertaining to Reference Genome genes
2. literature describing the characterization of previously undescribed ('novel') genes
3. genes that do not have any GO annotations at all (none of the three aspects)
4. recent literature from high impact factor journals
e. Integration of community-submitted GO annotations: These are added to the TAIR database after curator review. Curator review takes less time than curation of the paper itself, though submitters may provide less detail than a curator would.
4. Presentations and publications
Publications, Talks, Posters 2010
5. Other Highlights
A. Ontology Development Contributions
- GO terms contributed by TAIR
Donghui Li submitted 42 SourceForge term requests on behalf of TAIR curators from December 2010 to December 2011 (each request may contain multiple terms). Of these 42 requests, 41 have been closed. As a result, 39 new GO terms have been created and 18 existing terms have been updated (definition, parentage, obsoletion, etc.).
- Other ontology development work
Tanya Berardini:
- worked on a team with David Hill, Harold Drabkin, Chris Mungall, Midori Harris, Jane Lomax and ChEBI curators to align GO with ChEBI. This will result in the first set of cross-products between GO and an external ontology. The group also has a manuscript describing the collaborative work that should be ready for submission within the next month.
- acts as the gatekeeper for the TermGenie requests, merging and committing these into the main ontology using the scripts written by Chris Mungall.
B. Annotation outreach and user advocacy efforts
- TAIR/Journal collaboration
The collaboration to collect functional information about Arabidopsis genes from authors at the time of submission includes ten journals (Plant Physiology; Plant Cell; The Plant Journal; Molecular Plant; Journal of Experimental Botany; Plant Science; Plant, Cell and Environment; Environmental Botany; Plant Physiology and Biochemistry; and the Journal of Integrative Plant Biology). Our online data submission tool for collecting structured annotations to GO and PO terms is available here:
TAIR Online Annotation Submission Tool (TOAST)
TAIR can accept annotations based on any journal article, regardless of the journal it was published in, provided that the article has a DOI or a PMID. Submitters must be registered at TAIR.
We have a manuscript in preparation that describes the community annotation effort and analyzes annotations made by the community.
- GO help
Tanya Berardini continues to help staff the GO helpdesk. This involves answering the questions that come in through gohelp@geneontology.org or forwarding them to the appropriate parties for response. Eleven GOC curators rotate this task, one week at a time.
C. Other highlights - none