TAIR December 2015
IN PROGRESS
TAIR, The Arabidopsis Information Resource, December 2015
1. Staff working on GOC tasks
Tanya Berardini (0.35 FTE), Donghui Li (0.3 FTE), Leonore Reiser (0.5 FTE)
The total number of FTE working on GOC tasks is 1.15.
2. Annotation progress
In October 2015, we dropped about 37K RCA annotations that came from a single 2012 paper, which we deemed out of date and which many community members had criticized as unreliable. This removal lowers many of the numbers shown below and partly masks the increase in experimentally based annotations. Going forward, it may be more useful to report and compare the number of experimentally based annotations (IDA, IGI, IPI, IMP, IEP) as a better measure of literature-based curation.
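As a rough illustration of the measure suggested above, the sketch below counts annotations that carry one of the five experimental evidence codes, grouped by GO aspect. It assumes a tab-delimited gene association file in GAF 2.x layout, where the evidence code is in column 7 and the aspect (P, F, or C) is in column 9. The function name and the sample rows are hypothetical and only serve the example; the rows are truncated to the first nine columns.

```python
from collections import Counter

# Experimental evidence codes listed in the report.
EXPERIMENTAL_CODES = {"IDA", "IGI", "IPI", "IMP", "IEP"}


def count_experimental(gaf_lines):
    """Count experimentally supported annotations per GO aspect (P, F, C)."""
    counts = Counter()
    for line in gaf_lines:
        # Skip blank lines and GAF header/comment lines, which start with "!".
        if not line.strip() or line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a complete annotation row
        evidence, aspect = cols[6], cols[8]  # columns 7 and 9 (1-based)
        if evidence in EXPERIMENTAL_CODES:
            counts[aspect] += 1
    return counts


# Hypothetical rows for illustration only (identifiers are placeholders).
sample = [
    "!gaf-version: 2.0",
    "\t".join(["TAIR", "locus:0000001", "GENE1", "", "GO:9999001", "PMID:0000001", "IDA", "", "P"]),
    "\t".join(["TAIR", "locus:0000002", "GENE2", "", "GO:9999002", "PMID:0000002", "IEA", "", "F"]),
    "\t".join(["TAIR", "locus:0000003", "GENE3", "", "GO:9999003", "PMID:0000003", "IMP", "", "F"]),
    "\t".join(["TAIR", "locus:0000004", "GENE4", "", "GO:9999004", "PMID:0000004", "ND", "", "C"]),
]

print(dict(count_experimental(sample)))
```

The same function can be given an open file handle instead of a list, since it only iterates over lines.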
In the tables below, 12/15 refers to data as of 12/31/2015 and 12/14 refers to data as of 12/09/2014.
Table 1: Number of Annotations to Various GO Aspects
Annotations | BP (12/15) | BP (12/14) | change | MF (12/15) | MF (12/14) | change | CC (12/15) | CC (12/14) | change
---|---|---|---|---|---|---|---|---|---
non-IEA/non-ND | 22786 | 58326 | -35540 | 14584 | 14098 | +486 | 56339 | 55941 | +398
IEA | 5689 | 17211 | -11522 | 9236 | 33757 | -24521 | 1354 | 7158 | -5804
ND | 8106 | 9176 | -1070 | 4861 | 5193 | -332 | 4065 | 4452 | -387
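The change columns in Table 1 are the 12/15 count minus the 12/14 count. A minimal sketch of that arithmetic, using the counts copied from Table 1, is shown here; the dictionary layout and the function name are illustrative choices, not part of any TAIR pipeline.

```python
# (12/15 count, 12/14 count) per GO aspect, copied from Table 1.
TABLE_1 = {
    "non-IEA/non-ND": {"BP": (22786, 58326), "MF": (14584, 14098), "CC": (56339, 55941)},
    "IEA": {"BP": (5689, 17211), "MF": (9236, 33757), "CC": (1354, 7158)},
    "ND": {"BP": (8106, 9176), "MF": (4861, 5193), "CC": (4065, 4452)},
}


def changes(table):
    """Return the year-over-year change (new minus old) for every cell."""
    return {
        row: {aspect: new - old for aspect, (new, old) in cells.items()}
        for row, cells in table.items()
    }


for row, deltas in changes(TABLE_1).items():
    # The "+d" format spec prints an explicit sign, matching the table style.
    print(row, " | ".join(f"{aspect} {delta:+d}" for aspect, delta in deltas.items()))
```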
Table 2: Number of Genes Annotated to Various GO Aspects
Genes | BP (12/15) | BP (12/14) | change | MF (12/15) | MF (12/14) | change | CC (12/15) | CC (12/14) | change
---|---|---|---|---|---|---|---|---|---
non-IEA/non-ND | 9730 | 13406 | -4036 | 8020 | 7870 | +150 | 23782 | 23756 | +26
IEA | 4593 | 9840 | -5247 | 6574 | 12434 | -5860 | 1224 | 4379 | -3152
ND | 8106 | 9176 | -1070 | 4861 | 5193 | -332 | 4065 | 4452 | -387
3. Methods and strategies for annotation
a. Literature curation: We continue to put most of our annotation effort (95%) into annotation of gene products from the literature.
b. We have transitioned to incorporating UniProtKB's INTERPRO2GO-based IEA annotations on a monthly basis. This replaces our in-house pipeline for generating a similar dataset.
c. Integration of non-TAIR Arabidopsis annotations: Our Arabidopsis gene association file combines TAIR's own annotations with annotations from external sources. It includes the following types of annotations:
1. Literature-based annotations made by TAIR curators
2. Community annotations made via TAIR's TOAST annotation tool (see below)
3. GOA annotations for Arabidopsis with experimental evidence codes
4. PAINT-based Arabidopsis annotations from the RefGenome group
5. Function-Process link-based annotations from GOC
6. INTERPRO2GO-based IEA annotations from UniProtKB
7. TIGR's annotations from the Arabidopsis functional annotation project
d. Priorities for annotation:
1. literature describing the characterization of previously undescribed ('novel') genes
2. genes that do not have any GO annotations at all (none of the three aspects)
3. recent literature from high impact factor journals
4. annotations requested by users
e. Review of user-submitted annotations (see TOAST section below)
Donghui, Leonore, and Tanya review the annotations submitted via TOAST, making sure that terms were mapped correctly and that the proper evidence_with information is entered, if necessary. Sometimes, follow-up with the submitter via email is necessary.
4. Presentations and publications
GO 2015 Publications, Talks, Posters
Publications - GO-related but not primarily about GO
Berardini TZ, Reiser L, Li D, Mezheritsky Y, Muller R, Strait E, Huala E. The Arabidopsis information resource: Making and mining the "gold standard" annotated reference plant genome. Genesis. 2015 Aug;53(8):474-85. doi: 10.1002/dvg.22877.
Talks - GO-related but not primarily about GO
None
5. Other Highlights
A. Ontology Development Contributions
- GO terms contributed by TAIR
Tanya Berardini and Donghui Li have created 90 new GO terms using TermGenie and OBO-Edit (search the GO flat file by 'created_by: tanyaberardini', 'created_by: tb', 'created_by: dhl', 'created_by: donghui', date range: 1/1/2015 to 12/31/2015).
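The flat-file search described above could be sketched roughly as follows. This assumes the GO flat file is in OBO format, with `[Term]` stanzas that carry `created_by:` and `creation_date:` tags and ISO-style dates such as `2015-03-10T12:00:00Z`. The function names and the sample stanzas are hypothetical and exist only for illustration.

```python
# Creator tags named in the report.
TAIR_CREATORS = {"tanyaberardini", "tb", "dhl", "donghui"}


def _is_match(stanza, creators, year):
    """True if the stanza is a [Term] created by one of `creators` in `year`."""
    return (
        stanza.get("type") == "[Term]"
        and stanza.get("created_by") in creators
        and stanza.get("creation_date", "").startswith(f"{year}-")
    )


def count_terms(obo_lines, creators, year):
    """Count [Term] stanzas matching the creator set and creation year."""
    count = 0
    stanza = {}  # tags collected for the stanza currently being read
    for raw in obo_lines:
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            # A new stanza header closes the previous stanza.
            count += _is_match(stanza, creators, year)
            stanza = {"type": line}
        elif line.startswith(("created_by:", "creation_date:")):
            tag, value = line.split(":", 1)
            stanza[tag] = value.strip()
    # Account for the final stanza in the file.
    count += _is_match(stanza, creators, year)
    return count


# Hypothetical stanzas for illustration only (identifiers are placeholders).
sample = """\
format-version: 1.2

[Term]
id: GO:9999001
name: hypothetical term one
created_by: tb
creation_date: 2015-03-10T12:00:00Z

[Term]
id: GO:9999002
name: hypothetical term two
created_by: dhl
creation_date: 2014-11-02T09:30:00Z

[Term]
id: GO:9999003
name: hypothetical term three
created_by: someone_else
creation_date: 2015-06-01T08:00:00Z

[Typedef]
id: part_of
""".splitlines()

print(count_terms(sample, TAIR_CREATORS, 2015))
```

Filtering on the year prefix of `creation_date` covers the 1/1/2015 to 12/31/2015 range without parsing full timestamps.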
- Other ontology development work
Tanya Berardini:
- continues to participate in creating cross-products for terms within and among the three GO namespaces
- continues to participate in creation, testing, and deployment of new TermGenie templates for use by the broader community
- participates in the rota for GitHub (formerly Sourceforge) requests (4 total people in rota, one week at a time, biweekly conference call)
- participates in the rota for the gatekeeper for the TermGenie requests (4 total people in rota, one week at a time)
- attends weekly GO editors conference call
- periodically attends biweekly GO annotation conference calls
Donghui Li:
- attends GO annotation conference calls
- requests and creates GO terms related to Arabidopsis and plants
B. Annotation outreach and user advocacy efforts
- PAINT annotation
Donghui Li does PAINT-based annotation:
- attends regular PAINT annotation conference calls
- curated 43 families
- TOAST (TAIR Online Annotation Submission Tool)
TAIR continues to collect controlled vocabulary annotations via its online tool, TOAST.
TAIR can accept annotations based on any journal article, regardless of the journal it was published in, provided that the article has a DOI or a PMID. Submitters must be registered at TAIR.
- GO help
Tanya Berardini and Donghui Li help staff the GO helpdesk. This involves answering the questions that come in through the gohelp email address or forwarding them to the appropriate parties for response. Several GOC curators rotate this task, one week at a time.
C. Other highlights