TAIR December 2015: Difference between revisions

(30 intermediate revisions by 2 users not shown)
Line 1: Line 1:
'''IN PROGRESS - STRAIGHT COPY OF DEC 2014 PAGE'''


== TAIR, The Arabidopsis Information Resource, December 2015 ==
== TAIR, The Arabidopsis Information Resource, December 2015 ==
Line 5: Line 4:
=== 1. Staff working on GOC tasks ===
=== 1. Staff working on GOC tasks ===


Tanya Berardini, Donghui Li
Tanya Berardini (0.35 FTE), Donghui Li (0.3 FTE), Leonore Reiser (0.5 FTE)


The total number of FTE working on GOC tasks is 1.2.
The total number of FTE working on GOC tasks is 1.15.


=== 2. Annotation progress ===
=== 2. Annotation progress ===


Dates are 12/13 = 12/02/2013 and 12/14 = 12/9/14.
In October 2015, we dropped about 37K RCA annotations from a single 2012 paper that we deemed out of date and that many community members had criticized as unreliable.  This adversely affects many of the numbers shown below and the increase in experimentally based annotations is masked to some degree.  Going forward, it may be more useful to show and compare the number of experimentally based annotations (IDA, IGI, IPI, IMP, IEP) as a better measure of literature based curation.
 
Dates are 12/15 = 12/31/2015 and 12/14 = 12/09/14.


'''Table 1: Number of Annotations to Various GO Aspects'''
'''Table 1: Number of Annotations to Various GO Aspects'''
Line 18: Line 19:
|-
|-
! Annotations
! Annotations
! BP (12/13)
! BP (12/15)
! BP (12/14)
! BP (12/14)
! change
! change
! MF (12/13)
! MF (12/15)
! MF (12/14)
! MF (12/14)
! change
! change
! CC (12/13)
! CC (12/15)
! CC (12/14)
! CC (12/14)
! change
! change
|-
|-
| non-IEA/non-ND|| 57816 || 58326 ||+ 510 ||13649 ||14098  ||+ 449 || 55665 || 55941  ||  + 276 ||
| non-IEA/non-ND|| 22786 || 58326 ||- 35540  ||14584 ||14098  ||+ 486 || 56339 || 55941  ||  + 398 ||
|-
|-
| IEA || 17665 || 17211 || -  454 || 34239 || 33757 || - 482  || 7200  || 7158  || - 42 ||
| IEA || 5689 || 17211 || - 11522 ||9236  || 33757 || -24521  || 1354 || 7158  || -5804  ||
|-
|-
| ND || 9206 || 9176  || - 30 || 5199 || 5193 || - 6 || 4471 || 4452 || - 19 ||
| ND ||8106 || 9176  || - 1070 || 4861|| 5193 || - 332  || 4065|| 4452 || - 387 ||


|}
|}
Line 42: Line 43:
|-
|-
! Genes
! Genes
! BP (12/13)
! BP (12/15)
! BP (12/14)
! BP (12/14)
! change
! change
! MF (12/13)
! MF (12/15)
! MF (12/14)
! MF (12/14)
! change
! change
! CC (12/13)
! CC (12/15)
! CC (12/14)
! CC (12/14)
! change
! change
|-
|-
| non-IEA/non-ND|| 13335 ||13406 || + 71 || 7805 || 7870 || + 65 || 23723 || 23756  || + 33  ||
| non-IEA/non-ND|| 9730 ||13406 || -4036  || 8020 || 7870 || + 150 ||23782  || 23756  || + 26  ||
|-
|-
| IEA || 10038 || 9840 || - 198 || 12602 || 12434 || -168  || 4404 || 4379  || - 25 ||
| IEA || 4593 || 9840 || - 5247 || 6574 ||12434 || - 5860  || 1224 || 4379  || - 3152 ||
|-
|-
| ND || 9206 || 9176 || - 30 || 5199 || 5193 || - 6 || 4471 || 4452 || - 19 ||
| ND ||8106 || 9176 || - 1070  || 4861|| 5193 || - 332  || 4065|| 4452 || - 387 ||


|}
|}
Line 64: Line 65:
a. Literature curation: We continue to put most of our annotation effort (95%) into annotation of gene products from the literature.
a. Literature curation: We continue to put most of our annotation effort (95%) into annotation of gene products from the literature.


b. Computational annotation strategies: Last year, we switched to using a new program for our cellular component IEA annotations.  Instead of TargetP, we now use [http://bioinfo3.noble.org/AtSubP/ AtSubP] with an ISM evidence code.  The analysis was run for TAIR by the author on the last Arabidopsis genome release, TAIR10We are working on transitioning to using UniProtKB's IEA results for all other computational predictions to replace our aged Interpro2GO mapped ones.
b. We have transitioned to incorporating UniProtKB's INTERPRO2GO-based IEA annotations on a monthly basisThis replaces our in-house pipeline for
generating a similar dataset.


c. Integration of non-TAIR Arabidopsis annotations:  The following types of external annotations are included in our Arabidopsis gene association file:
c. Integration of non-TAIR Arabidopsis annotations:  The following types of external annotations are included in our Arabidopsis gene association file:
Line 73: Line 75:
   4. PAINT-based Arabidopsis annotations from RefGenome group
   4. PAINT-based Arabidopsis annotations from RefGenome group
   5. Function-Process link-based annotations from GOC  
   5. Function-Process link-based annotations from GOC  
   6. TIGR's annotations from Arabidopsis functional annotation project
   6. INTERPRO2GO-based IEA annotations from UniProtKB
  7. TIGR's annotations from Arabidopsis functional annotation project


d. Priorities for annotation:  
d. Priorities for annotation:  
Line 80: Line 83:
   2. genes that do not have any GO annotations at all (none of the three aspects),
   2. genes that do not have any GO annotations at all (none of the three aspects),
   3. recent literature from high impact factor journals
   3. recent literature from high impact factor journals
  4. user requested annotation requests


e.      Review of user-submitted annotations (see TOAST section below)
e.      Review of user-submitted annotations (see TOAST section below)
   Donghui and Tanya review the annotations submitted via TOAST,  
   Donghui, Leonore, and Tanya review the annotations submitted via TOAST,  
   making sure that terms were mapped correctly and that the proper  
   making sure that terms were mapped correctly and that the proper  
   evidence_with information is entered, if necessary.  Sometimes,  
   evidence_with information is entered, if necessary.  Sometimes,  
Line 88: Line 92:


=== 4. Presentations and publications ===
=== 4. Presentations and publications ===
[http://wiki.geneontology.org/index.php/Publications,_Talks,_Posters_2014 GO 2014 Publications, Talks, Posters]
[http://wiki.geneontology.org/index.php/Publications,_Talks,_Posters_2015 GO 2015 Publications, Talks, Posters]


'''Publications - GO-related but not primarily about GO'''
'''Publications - GO-related but not primarily about GO'''


Li D, Dreher K, Knee E, Brkljacic J, Grotewold E, Berardini TZ, Lamesch P, Garcia-Hernandez M, Reiser L, Huala E (2014) Arabidopsis database and stock resources. Methods Mol Biol. 2014;1062:65-96. doi: 10.1007/978-1-62703-580-4_4. PMID: 24057361
Berardini TZ, Reiser L, Li D, Mezheritsky Y, Muller R, Strait E, Huala E. The Arabidopsis information resource: Making and mining the "gold standard" annotated reference plant genome. Genesis. 2015 Aug;53(8):474-85. doi: 10.1002/dvg.22877.


'''Talks - GO-related but not primarily about GO'''
'''Talks - GO-related but not primarily about GO'''


Donghui Li, TAIR: A Sustainable Community Database for International Arabidopsis Research. July 28, 2014. International Conference on Arabidopsis Research (ICAR), Vancouver, Canada.
None


=== 5. Other Highlights ===
=== 5. Other Highlights ===
Line 103: Line 107:
* '''GO terms contributed by TAIR'''
* '''GO terms contributed by TAIR'''


Tanya Berardini and Donghui Li have created 108 new GO terms using TermGenie and Obo-Edit (search GO flat file by 'created_by: tanyaberardini', 'created_by: tb', 'created_by: dhl', 'created_by: donghui', date range: 12/5/13 - 12/9/14).
Tanya Berardini and Donghui Li have created 90 new GO terms using TermGenie and Obo-Edit (search GO flat file by 'created_by: tanyaberardini', 'created_by: tb', 'created_by: dhl', 'created_by: donghui', date range: 1/1/2015 to 12/31/2015).


* '''Other ontology development work'''
* '''Other ontology development work'''
Line 110: Line 114:
*continues to participate in creating cross-products for terms within and among the three GO namespaces  
*continues to participate in creating cross-products for terms within and among the three GO namespaces  
*continues to participate in creation, testing, and deployment of new TermGenie templates for use by the broader community
*continues to participate in creation, testing, and deployment of new TermGenie templates for use by the broader community
*participated in Modular Annotation prototyping and modeling
*participates in the rota for GitHub (formerly Sourceforge) requests (4 total people in rota, one week at a time, biweekly conference call)
*participates in the rota for Sourceforge requests (4 total people in rota, one week at a time, biweekly conference call)
*participates in the rota for the gatekeeper for the TermGenie requests (4 total people in rota, one week at a time)
*participates in the rota for the gatekeeper for the TermGenie requests (4 total people in rota, one week at a time)
*attends weekly GO editors conference call
*attends weekly GO editors conference call
Line 117: Line 120:


Donghui Li:
Donghui Li:
*attends regular GO annotation conference calls as the TAIR representative
*attends GO annotation conference calls
*continues to request and create GO terms related to Arabidopsis and plants
*requests and creates GO terms related to Arabidopsis and plants


'''B. Annotation outreach and user advocacy efforts'''
'''B. Annotation outreach and user advocacy efforts'''
Line 125: Line 128:
Donghui Li does PAINT-based annotation.  
Donghui Li does PAINT-based annotation.  
*attends regular PAINT annotation conference calls
*attends regular PAINT annotation conference calls
*participated in the PAINT jamboree between July 1st and September 30th. Curated 38 families with a total of 2896 sequences.
*curated 43 families.


* '''TOAST (TAIR Online Annotation Submission Tool)'''
* '''TOAST (TAIR Online Annotation Submission Tool)'''
Line 136: Line 139:
* '''GO help'''
* '''GO help'''


Tanya Berardini continues to participate in manning the GO helpdesk.  This involves answering the questions that come in through the gohelp email address or forwarding them to the appropriate parties for response.  There are several GOC curators that rotate this task, one week at a time.  Beginning in 2015, Donghui Li will participate in the GO helpdesk rota as well.
Tanya Berardini and Donghui Li participate in manning the GO helpdesk.  This involves answering the questions that come in through the gohelp email address or forwarding them to the appropriate parties for response.  There are several GOC curators that rotate this task, one week at a time.  


'''C. Other highlights'''
'''C. Other highlights'''


 
[[Category:Reports - TAIR]]
[[Category:Reports]]
[[Category:TAIR]]

Latest revision as of 19:19, 6 March 2020

TAIR, The Arabidopsis Information Resource, December 2015

1. Staff working on GOC tasks

Tanya Berardini (0.35 FTE), Donghui Li (0.3 FTE), Leonore Reiser (0.5 FTE)

The total number of FTE working on GOC tasks is 1.15.

2. Annotation progress

In October 2015, we dropped about 37K RCA annotations derived from a single 2012 paper; we deemed them out of date, and many community members had criticized them as unreliable. This removal lowers many of the numbers shown below and partly masks the increase in experimentally based annotations. Going forward, it may be more useful to report and compare the number of experimentally based annotations (IDA, IGI, IPI, IMP, IEP) as a measure of literature-based curation.
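As a rough illustration of the measure suggested above, the following sketch counts experimentally based annotations per GO aspect in a gene association file. It assumes the standard tab-delimited GAF layout (evidence code in column 7, aspect in column 9, comment lines starting with "!"). The function names and sample rows are hypothetical and are not part of TAIR's pipeline.

```python
from collections import Counter

# Evidence codes named in the report as experimentally based.
EXPERIMENTAL_CODES = {"IDA", "IGI", "IPI", "IMP", "IEP"}


def count_experimental(gaf_lines):
    """Count experimentally based annotations per GO aspect (P, F, C)."""
    counts = Counter()
    for line in gaf_lines:
        if line.startswith("!") or not line.strip():
            continue  # skip GAF header/comment lines and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # too few columns to hold evidence code and aspect
        evidence, aspect = cols[6], cols[8]
        if evidence in EXPERIMENTAL_CODES:
            counts[aspect] += 1
    return counts


def make_row(evidence, aspect):
    """Build a made-up 15-column GAF row for demonstration only."""
    cols = [""] * 15
    cols[0] = "TAIR"
    cols[6] = evidence
    cols[8] = aspect
    return "\t".join(cols)


# Hypothetical sample data, not real annotations.
sample_rows = [
    "!gaf-version: 2.0",
    make_row("IDA", "P"),
    make_row("IMP", "P"),
    make_row("IEA", "F"),
    make_row("IPI", "F"),
    make_row("ND", "C"),
]

experimental_counts = count_experimental(sample_rows)
print(dict(experimental_counts))
```

On a real file, the same function could be called with an open file handle, since it only needs an iterable of lines.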

Column dates: 12/15 = 12/31/2015 and 12/14 = 12/09/2014.

Table 1: Number of Annotations to Various GO Aspects

{|
|-
! Annotations !! BP (12/15) !! BP (12/14) !! change !! MF (12/15) !! MF (12/14) !! change !! CC (12/15) !! CC (12/14) !! change
|-
| non-IEA/non-ND || 22786 || 58326 || -35540 || 14584 || 14098 || +486 || 56339 || 55941 || +398
|-
| IEA || 5689 || 17211 || -11522 || 9236 || 33757 || -24521 || 1354 || 7158 || -5804
|-
| ND || 8106 || 9176 || -1070 || 4861 || 5193 || -332 || 4065 || 4452 || -387
|}


Table 2: Number of Genes Annotated to Various GO Aspects

{|
|-
! Genes !! BP (12/15) !! BP (12/14) !! change !! MF (12/15) !! MF (12/14) !! change !! CC (12/15) !! CC (12/14) !! change
|-
| non-IEA/non-ND || 9730 || 13406 || -4036 || 8020 || 7870 || +150 || 23782 || 23756 || +26
|-
| IEA || 4593 || 9840 || -5247 || 6574 || 12434 || -5860 || 1224 || 4379 || -3152
|-
| ND || 8106 || 9176 || -1070 || 4861 || 5193 || -332 || 4065 || 4452 || -387
|}

3. Methods and strategies for annotation

a. Literature curation: We continue to put most of our annotation effort (95%) into annotation of gene products from the literature.

b. Computational annotation strategies: We now incorporate UniProtKB's INTERPRO2GO-based IEA annotations on a monthly basis. This replaces our in-house pipeline for generating a similar dataset.

c. Integration of non-TAIR Arabidopsis annotations: Our Arabidopsis gene association file includes the following types of annotations, both TAIR-generated (items 1 and 2) and external:

 1. Literature-based annotations made by TAIR curators
 2. Community annotations made via TAIR's TOAST annotation tool (see below)
 3. GOA annotations for Arabidopsis with experimental evidence codes
 4. PAINT-based Arabidopsis annotations from RefGenome group
 5. Function-Process link-based annotations from GOC 
 6. INTERPRO2GO-based IEA annotations from UniProtKB
 7. TIGR's annotations from Arabidopsis functional annotation project
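To see how much each source in the list above contributes to a combined file, one option is to tally rows by the Assigned_By field. This is a minimal sketch, assuming the standard GAF layout in which Assigned_By is column 15. The function names and the source labels in the sample rows are placeholders, not the values TAIR actually uses.

```python
from collections import Counter


def count_by_source(gaf_lines):
    """Tally annotation rows by the Assigned_By field (GAF column 15)."""
    counts = Counter()
    for line in gaf_lines:
        if line.startswith("!") or not line.strip():
            continue  # skip GAF header/comment lines and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) >= 15:
            counts[cols[14]] += 1
    return counts


def make_source_row(assigned_by):
    """Build a made-up 15-column GAF row for demonstration only."""
    cols = ["x"] * 15
    cols[14] = assigned_by
    return "\t".join(cols)


# Placeholder source labels, chosen only for illustration.
sample_file = [
    "!gaf-version: 2.0",
    make_source_row("SOURCE_A"),
    make_source_row("SOURCE_A"),
    make_source_row("SOURCE_B"),
]

source_counts = count_by_source(sample_file)
for source, n in source_counts.most_common():
    print(source, n)
```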

d. Priorities for annotation:

 1. literature describing the characterization of previously undescribed ('novel') genes,
 2. genes that do not have any GO annotations at all (none of the three aspects),
 3. recent literature from high impact factor journals,
 4. annotations requested by users

e. Review of user-submitted annotations (see TOAST section below)

 Donghui, Leonore, and Tanya review the annotations submitted via TOAST, 
 making sure that terms were mapped correctly and that the proper 
 evidence_with information is entered, if necessary.  Sometimes, 
 follow-up with the submitter via email is necessary.

4. Presentations and publications

GO 2015 Publications, Talks, Posters

Publications - GO-related but not primarily about GO

Berardini TZ, Reiser L, Li D, Mezheritsky Y, Muller R, Strait E, Huala E. The Arabidopsis information resource: Making and mining the "gold standard" annotated reference plant genome. Genesis. 2015 Aug;53(8):474-85. doi: 10.1002/dvg.22877.

Talks - GO-related but not primarily about GO

None

5. Other Highlights

A. Ontology Development Contributions

  • GO terms contributed by TAIR

Tanya Berardini and Donghui Li created 90 new GO terms using TermGenie and OBO-Edit (found by searching the GO flat file for 'created_by: tanyaberardini', 'created_by: tb', 'created_by: dhl', or 'created_by: donghui' within the date range 1/1/2015 to 12/31/2015).
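The flat-file search described above could be scripted along these lines. This is only a sketch: it assumes the OBO flat file layout in which each [Term] stanza carries `created_by` and `creation_date` tags, with dates written like `2015-03-04T12:00:00Z`. The function names and sample stanzas (with `EX:` identifiers) are invented for illustration, and this is not the procedure the curators actually ran.

```python
from datetime import date

CREATORS = {"tanyaberardini", "tb", "dhl", "donghui"}


def iter_term_stanzas(obo_lines):
    """Yield a dict of tag -> first value for each [Term] stanza."""
    stanza = None
    for raw in obo_lines:
        line = raw.strip()
        if line.startswith("["):
            if stanza is not None:
                yield stanza
            # Only [Term] stanzas are collected; others (e.g. [Typedef]) are skipped.
            stanza = {} if line == "[Term]" else None
        elif stanza is not None and ": " in line:
            tag, value = line.split(": ", 1)
            stanza.setdefault(tag, value)
    if stanza is not None:
        yield stanza


def count_terms_created(obo_lines, creators, start, end):
    """Count [Term] stanzas created by `creators` between start and end, inclusive."""
    count = 0
    for stanza in iter_term_stanzas(obo_lines):
        stamp = stanza.get("creation_date", "")
        if stanza.get("created_by") not in creators or len(stamp) < 10:
            continue
        created_on = date.fromisoformat(stamp[:10])  # keep only YYYY-MM-DD
        if start <= created_on <= end:
            count += 1
    return count


# Invented stanzas for demonstration only.
SAMPLE_OBO = """format-version: 1.2

[Term]
id: EX:0000001
created_by: tb
creation_date: 2015-03-04T12:00:00Z

[Term]
id: EX:0000002
created_by: dhl
creation_date: 2014-11-30T09:00:00Z

[Term]
id: EX:0000003
created_by: someoneelse
creation_date: 2015-06-01T00:00:00Z

[Typedef]
id: example_relation
created_by: tb
creation_date: 2015-01-01T00:00:00Z
""".splitlines()

new_terms = count_terms_created(SAMPLE_OBO, CREATORS, date(2015, 1, 1), date(2015, 12, 31))
print(new_terms)
```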

  • Other ontology development work

Tanya Berardini:

  • continues to participate in creating cross-products for terms within and among the three GO namespaces
  • continues to participate in creation, testing, and deployment of new TermGenie templates for use by the broader community
  • participates in the rota for GitHub (formerly SourceForge) requests (4 people in the rota, one week at a time, biweekly conference call)
  • participates in the rota for the gatekeeper for the TermGenie requests (4 total people in rota, one week at a time)
  • attends weekly GO editors conference call
  • periodically attends biweekly GO annotation conference calls

Donghui Li:

  • attends GO annotation conference calls
  • requests and creates GO terms related to Arabidopsis and plants

B. Annotation outreach and user advocacy efforts

  • PAINT annotation

Donghui Li does PAINT-based annotation.

  • attends regular PAINT annotation conference calls
  • curated 43 families.
  • TOAST (TAIR Online Annotation Submission Tool)

TAIR continues to collect controlled vocabulary annotations via its online tool, TOAST.

TAIR can accept annotations based on any journal article, regardless of the journal it was published in, provided that the article has a DOI or a PMID. Submitters must be registered at TAIR.

  • GO help

Tanya Berardini and Donghui Li help staff the GO helpdesk. This involves answering questions that come in through the gohelp email address or forwarding them to the appropriate parties for response. Several GOC curators rotate this task, one week at a time.

C. Other highlights