TAIR December 2010: Difference between revisions

From GO Wiki
Revision as of 19:19, 6 March 2020

TAIR, The Arabidopsis Information Resource, December 2010

1. Staff working on GOC tasks

Tanya Berardini, Donghui Li

The total number of FTEs working on GOC tasks is 1.4.

2. Annotation progress

Table 1: Number of Annotations to Various GO Aspects

Annotations      BP (12/10)  BP (12/09)  change   MF (12/10)  MF (12/09)  change   CC (12/10)  CC (12/09)  change
non-IEA/non-ND   17690       15868       +1822    11219       10603       +616     19841       19209       +632
IEA              12095       10688       +1407    19293       19934       -641     10505       10452       +53
ND               9875        14284       -4409    5060        8813        -3753    10237       14501       -4264

Table 2: Number of Genes Annotated to Various GO Aspects

Genes            BP (12/10)  BP (12/09)  change   MF (12/10)  MF (12/09)  change   CC (12/10)  CC (12/09)  change
non-IEA/non-ND   7981        7385        +533     7189        6988        +201     7619        7378        +241
IEA              6973        6807        +166     7924        8135        -211     7538        7783        -245
ND               9875        14284       -4409    5059        8812        -3753    10233       14497       -4264
  • The numbers of ND annotations and of genes with ND annotations decreased because annotations to pseudogenes and transposable element genes were removed after a QC check revealed that such annotations existed.
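
The change columns in Tables 1 and 2 can be recomputed from the two counts in each cell. The following is a minimal Python sketch of that check, with the values transcribed from the tables; the dictionary layout and the function name `find_mismatches` are illustrative assumptions, not part of any TAIR tooling. With these values, every cell agrees except the BP non-IEA/non-ND entry in Table 2, where 7981 - 7385 gives 596 rather than the stated 533. The report does not indicate which of the three figures is the misprint.

```python
# Counts transcribed from Tables 1 and 2 as (12/10 value, 12/09 value, stated change).
TABLES = {
    "annotations": {
        "non-IEA/non-ND": {"BP": (17690, 15868, 1822), "MF": (11219, 10603, 616), "CC": (19841, 19209, 632)},
        "IEA": {"BP": (12095, 10688, 1407), "MF": (19293, 19934, -641), "CC": (10505, 10452, 53)},
        "ND": {"BP": (9875, 14284, -4409), "MF": (5060, 8813, -3753), "CC": (10237, 14501, -4264)},
    },
    "genes": {
        "non-IEA/non-ND": {"BP": (7981, 7385, 533), "MF": (7189, 6988, 201), "CC": (7619, 7378, 241)},
        "IEA": {"BP": (6973, 6807, 166), "MF": (7924, 8135, -211), "CC": (7538, 7783, -245)},
        "ND": {"BP": (9875, 14284, -4409), "MF": (5059, 8812, -3753), "CC": (10233, 14497, -4264)},
    },
}


def find_mismatches(tables):
    """Return (table, row, aspect, stated, recomputed) for each inconsistent cell."""
    mismatches = []
    for table_name, rows in tables.items():
        for row_name, aspects in rows.items():
            for aspect, (new, old, stated) in aspects.items():
                recomputed = new - old  # change = 12/10 count minus 12/09 count
                if recomputed != stated:
                    mismatches.append((table_name, row_name, aspect, stated, recomputed))
    return mismatches


if __name__ == "__main__":
    for mismatch in find_mismatches(TABLES):
        print(mismatch)
```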

3. Methods and strategies for annotation

a. Literature curation: We continue to put most of our effort (95%) into annotation of gene products from the literature.

b. Computational annotation strategies: With every genome release, we run two computational GO annotation pipelines, one based on INTERPROtoGO mapping and the other based on a TargetP analysis. These results are integrated into our GO annotation file. This represents roughly 5% of our annotation effort. We integrate GOA Arabidopsis GO annotations into our gene association file so that all Arabidopsis annotations, regardless of original source, are now relayed to GO via TAIR with the appropriate source attribution.

c. Priorities for annotation:

(1) literature of any age pertaining to Reference Genome genes,

(2) literature describing the characterization of previously undescribed ('novel') genes,

(3) recent literature from high impact factor journals
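
The InterPro-based pipeline mentioned in item b can be pictured with a short Python sketch. This is only an illustration under stated assumptions, not TAIR's actual pipeline: it assumes mapping lines in the interpro2go style (`InterPro:IPRxxxxxx description > GO:term name ; GO:xxxxxxx`), and the function names, the output tuple layout, and the toy gene-to-domain input are hypothetical.

```python
import re

# One mapping line in the interpro2go style, for example:
#   InterPro:IPR000003 Retinoid X receptor/HNF4 > GO:DNA binding ; GO:0003677
MAPPING_LINE = re.compile(r"^InterPro:(IPR\d{6})\s.*;\s*(GO:\d{7})\s*$")


def parse_interpro2go(lines):
    """Build {InterPro accession: set of GO IDs} from mapping lines."""
    mapping = {}
    for line in lines:
        if line.startswith("!"):  # header/comment lines start with "!"
            continue
        match = MAPPING_LINE.match(line.strip())
        if match:
            ipr, go_id = match.groups()
            mapping.setdefault(ipr, set()).add(go_id)
    return mapping


def annotate(gene_to_domains, mapping, assigned_by="TAIR"):
    """Turn domain hits into (gene, GO ID, evidence, with, source) rows.

    Every row gets the IEA evidence code, and the InterPro accession is kept
    so the origin of the inference stays traceable.
    """
    rows = []
    for gene in sorted(gene_to_domains):
        for ipr in sorted(gene_to_domains[gene]):
            for go_id in sorted(mapping.get(ipr, ())):
                rows.append((gene, go_id, "IEA", f"InterPro:{ipr}", assigned_by))
    return rows


if __name__ == "__main__":
    mapping = parse_interpro2go([
        "!example header line",
        "InterPro:IPR000003 Retinoid X receptor/HNF4 > GO:DNA binding ; GO:0003677",
    ])
    # GENE_A and its domain hits are made-up input for illustration only.
    for row in annotate({"GENE_A": {"IPR000003"}}, mapping):
        print(row)
```

Domains without a mapping entry simply yield no rows, so a genome release with new, unmapped InterPro entries would not break the run in this sketch.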

4. Presentations and publications

See the GO Wiki page Publications,_Talks,_Posters_2010-

5. Other Highlights

A. Ontology Development Contributions

  • GO terms contributed by TAIR

Donghui Li submitted 58 SourceForge term requests on behalf of TAIR curators between December 2009 and December 2010 (each request may contain multiple terms). Of these 58 requests, 52 have been closed, and 56 new GO terms have been created.

  • Other ontology development work

Tanya Berardini:

  • worked on a team with David Hill, Chris Mungall, Midori Harris and Jane Lomax to develop cross-products within and among the three GO namespaces.
  • worked on a team with David Hill, Harold Drabkin, Chris Mungall, Midori Harris, Jane Lomax and ChEBI curators to align GO with ChEBI. This will result in the first set of cross-products between GO and an external ontology.
  • continues to work with Chris Mungall and David Hill on inter-ontology links within GO.
  • acts as the gatekeeper for TermGenie requests, merging and committing them into the main ontology using scripts written by Chris Mungall.

B. Annotation outreach and user advocacy efforts

  • TAIR/Journal collaboration

The collaboration to collect functional information about Arabidopsis genes from authors at the time of submission has expanded from Plant Physiology to include nine other journals (Plant Cell, The Plant Journal, Molecular Plant, Journal of Experimental Botany, Plant Science, Plant, Cell and Environment, Environmental Botany, Plant Physiology and Biochemistry, and the Journal of Integrative Plant Biology). We have also developed and released an online data submission tool for collecting structured annotations to GO and PO terms. The URL for this tool is:

TAIR Online Submission Tool: http://www.arabidopsis.org/doc/submit/functional_annotation/123

TAIR can accept annotations based on any journal article, regardless of the journal it was published in, provided that the article has a DOI or a PMID. Submitters must be registered at TAIR.
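
The two acceptance conditions stated above (an article with a DOI or a PMID, and a registered submitter) can be expressed as a small check. The Python sketch below is an assumption-laden illustration and not the submission tool's actual validation: the function name, the DOI pattern (a commonly used shape for modern DOIs) and the PMID pattern (a plain run of digits) are all hypothetical choices.

```python
import re

# Assumed shapes, not taken from the TAIR tool:
# a DOI looks like "10.<registrant>/<suffix>", a PMID is a short run of digits.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")
PMID_PATTERN = re.compile(r"^\d{1,8}$")


def can_submit(article_id, submitter_is_registered):
    """Return True when the article has a DOI or PMID and the submitter is registered."""
    has_valid_id = bool(DOI_PATTERN.match(article_id) or PMID_PATTERN.match(article_id))
    return has_valid_id and submitter_is_registered


if __name__ == "__main__":
    print(can_submit("10.1016/j.jbi.2010.02.002", True))   # DOI, registered submitter
    print(can_submit("12345678", False))                    # made-up PMID, unregistered
```

Note that the journal of publication does not appear anywhere in the check, which mirrors the statement that annotations are accepted regardless of journal.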


  • GO help

Tanya Berardini continues to help staff the GO helpdesk. This involves answering the questions that come in through gohelp@geneontology.org or forwarding them to the appropriate parties for response. Nine GOC curators rotate this task, one week at a time.


C. Other highlights - none