MGI December 2014


Latest revision as of 02:40, 9 December 2014

Overview:

Staff:

[please include FTEs working on GOC tasks, designating as well how many FTEs are funded by the GOC NHGRI grant]

Judith Blake*

Karen R Christie*

Mary E Dolan*

Harold J Drabkin*

David Hill*

Li Ni*

Dmitry Sitnikov

* Funded entirely or partially by GO

Annotation Progress

{| class="wikitable" cellpadding="5"
!Annotation Type !! Dec 5 2014 !! Dec 5 2013 !! Change !! % change
|-
|Total Genes annotated with at least one GO term of any kind || 24226 || 25406 || -1180 * || -4.64
|-
|Total Annotations: || 341687 || 303352 || 38335 || 12.64
|-
| colspan=5 align=center bgcolor=white| '''Total non-IEA Annotation'''
|-
|Total Number of Genes: || 23844 || 24600 || -756 || -3.07
|-
|Total Annotations: || 242025 || 209013 || 33012 || 15.79
|-
| colspan=5 align=center bgcolor=white| '''Annotation by Direct Experiment'''
|-
|MGI Curated Mouse Genes || 12170 || 11569 || 601 || 5.15
|-
|MGI Curated Annotations || 82573 || 75555 || 7018 || 9.39
|-
|GOA Curated Mouse Genes: || 4565 || 4075 || 490 || 12.03
|-
|GOA Curated Annotations: || 26002 || 22017 || 3985 || 18.10
|-
| colspan=5 align=center bgcolor=white| '''Annotation by Orthology'''
|-
|Total Genes Annotated by Orthology || 11435 || 9827 || 1608 || 16.36
|-
|Total Orthology Annotation || 92787 || 74768 || 18019 || 24.10
|-
|Genes Annotated by Human Orthology Load (GOA) || 10207 || 8173 || 2034 || 24.89
|-
|Total Annotation by Human Orthology Load || 61355 || 45989 || 15366 || 33.41
|-
|Genes Annotated by Rat Orthology Load (RGD) || 4415 || 4160 || 255 || 6.13
|-
|Total Annotations by Rat Orthology Load || 27006 || 24577 || 2429 || 9.88
|-
| colspan=5 align=center bgcolor=white| '''IEA Annotation'''
|-
|Total Genes with IEA Annotations || 14602 || 16784 || -2182 || -13.00
|-
|Total IEA Annotations || 99662 || 94339 || 5323 || 5.64
|-
|Total Genes with SwissProt to GO Annotations || 14211 || 16414 || -2203 || -13.42
|-
|Total SwissProt to GO Annotations || 55788 || 65799 || -10011 || -15.21
|-
|Total Genes with Interpro to GO Annotations || 10054 || 10224 || -170 || -1.66
|-
|Total Interpro to GO Annotations || 25511 || 27375 || -1864 || -6.81
|-
|Total Genes with EC to GO Annotations || 1734 || 930 || 804 || 86.45
|-
|Total EC to GO Annotations || 18363 || 1165 || 17198 || 1476
|}
* Losses due to gene merges, withdrawals, or marker type changes (gene to pseudogene).
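
The Change and % change columns can be derived from the two counts in each row. The following is a minimal sketch, assuming the percentage is taken relative to the Dec 5 2013 count; the function name `change_stats` is ours and is not part of any MGI tooling. Recomputing this way reproduces most rows, though a few reported percentages (for example, the MGI curated rows) differ slightly from the recomputed values.

```python
def change_stats(current, previous):
    """Return (absolute change, percent change relative to the previous count)."""
    change = current - previous
    # Percent change is assumed to be relative to the earlier (2013) count.
    pct = round(100.0 * change / previous, 2)
    return change, pct


# Counts taken from the first two rows of the table above.
print(change_stats(24226, 25406))    # genes with at least one GO term
print(change_stats(341687, 303352))  # total annotations
```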

Methods and strategies for annotation

Literature curation:

Literature curation continues to be the major focus of our annotation efforts.

Computational annotation strategies:

As always, current strategies involve the use of translation tables to mine SwissProt keywords, InterPro domains, and EC numbers for IEA annotation. These loads run automatically on a nightly basis and require little human intervention.
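
As a rough illustration of how a translation table yields IEA annotations, the sketch below looks up each external identifier attached to a gene and emits one annotation per mapped GO term. This is not the MGI load itself: the `Annotation` record, the `infer_iea` function, and the gene ID are hypothetical, and the three table entries are illustrative examples that should be checked against the current mapping files.

```python
from collections import namedtuple

# Hypothetical record type for one inferred annotation.
Annotation = namedtuple("Annotation", "gene go_id evidence source")

# Illustrative translation-table entries (external ID -> GO IDs).
# Real loads read these from the published mapping files.
TRANSLATION_TABLE = {
    "EC:1.1.1.1": ["GO:0004022"],
    "InterPro:IPR000719": ["GO:0004672", "GO:0005524", "GO:0006468"],
    "UniProtKB-KW:KW-0418": ["GO:0016301"],
}


def infer_iea(gene, external_ids, table=TRANSLATION_TABLE):
    """Translate a gene's external IDs into IEA annotations."""
    annotations = []
    seen = set()
    for ext_id in external_ids:
        # IDs with no entry in the table simply yield no annotation.
        for go_id in table.get(ext_id, []):
            key = (go_id, ext_id)
            if key not in seen:
                seen.add(key)
                annotations.append(Annotation(gene, go_id, "IEA", ext_id))
    return annotations


# "MGI:0000000" is a placeholder gene ID, not a real marker.
for ann in infer_iea("MGI:0000000", ["EC:1.1.1.1", "InterPro:IPR000719"]):
    print(ann.gene, ann.go_id, ann.evidence, ann.source)
```

Keeping the source identifier on each annotation makes it possible to trace an IEA annotation back to the keyword, domain, or EC number that produced it.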

Priorities for annotation

  • Isoform curation (Harold, Karen, Protein Ontology project); focusing on genes that have isoforms or whose products are modified, and coordinating with the Protein Ontology Protein Complex project.
  • Genes with no GO annotation but with literature (Li and Dmitry)
  • Genes with only IEA annotation but with literature (Li)
  • Genes marked as having GO annotation completed, but now having new literature (Dmitry)
  • Genes that have an annotation to one of the three root nodes of GO, but have new literature (David, Karen)
  • Dmitry has been focused on annotation of miRNAs in MGI
  • Karen has been focusing on annotation of ciliary genes
  • Reannotation of ubiquitin linking enzymes E1, E2, E3 (David)
  • Annotation of metabolic genes, glycolysis and molybdenum cofactor metabolism (David)
  • In November, Karen and Dmitry focused on annotating human or mouse genes, respectively, from the Synapse Jamboree target list in preparation for the jamboree on December 5th.
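
The gene-level criteria in the list above can be pictured as a simple triage rule. The sketch below is only an illustration under stated assumptions: the function `curation_priority`, its arguments, the category labels, and the order in which the checks are applied are all ours, not a description of how MGI selects genes. The three root-node IDs are the GO roots for biological process, molecular function, and cellular component.

```python
# The three root nodes of GO.
ROOT_TERMS = {"GO:0008150", "GO:0003674", "GO:0005575"}


def curation_priority(annotations, has_literature, marked_complete=False):
    """Classify a gene into one of the priority categories, or None.

    annotations: list of (go_id, evidence_code) tuples for the gene.
    has_literature: True if uncurated literature exists for the gene.
    marked_complete: True if GO annotation was previously marked complete.
    """
    if not has_literature:
        return None  # nothing to curate from
    if not annotations:
        return "no GO annotation, with literature"
    if all(evidence == "IEA" for _, evidence in annotations):
        return "IEA only, with literature"
    if any(go_id in ROOT_TERMS for go_id, _ in annotations):
        return "root-node annotation, new literature"
    if marked_complete:
        return "marked complete, new literature"
    return None


print(curation_priority([], True))
print(curation_priority([("GO:0016301", "IEA")], True))
print(curation_priority([("GO:0008150", "ND")], True))
```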

Presentations and Publications

a. Papers with substantial GO content

  • Wick HC, Drabkin H, Ngu H, Sackman M, Fournier C, Haggett J, Blake JA, Bianchi DW, Slonim DK. DFLAT: functional annotation for human development. BMC Bioinformatics. 2014 Feb 7;15:45. doi: 10.1186/1471-2105-15-45. PubMed PMID:24507166; PubMed Central PMCID: PMC3928322.
  • Alam-Faruque Y, Hill DP, Dimmer EC, Harris MA, Foulger RE, Tweedie S, Attrill H, Howe DG, Thomas SR, Davidson D, Woolf AS, Blake JA, Mungall CJ, O'Donovan C, Apweiler R, Huntley RP. Representing kidney development using the gene ontology. PLoS One. 2014 Jun 18;9(6):e99864.
  • Huntley RP, Harris MA, Alam-Faruque Y, Blake JA, Carbon S, Dietze H, Dimmer EC, Foulger RE, Hill DP, Khodiyar VK, Lock A, Lomax J, Lovering RC, Mutowo-Meullenet P, Sawford T, Van Auken K, Wood V, Mungall CJ. A method for increasing expressivity of Gene Ontology annotations using a compositional approach. BMC Bioinformatics. 2014 May 21;15:155.


b. Presentations including Talks and Tutorials and Teaching

  • Drabkin HJ. Summary of Text Mining Tool Usage in the Mouse Genome Informatics Database Resource. BioCreative 2014 Workshop, Toronto. (Covered text mining used for GO literature triage.)


c. Poster presentations

  • Christie KR and Blake JA. Comprehensive Gene Ontology annotation of ciliary genes in the mouse, Mus musculus. Keystone Symposia: Cilia, Development, and Human Disease. Tahoe City, CA. March 2014.
  • Ni L, Dolan ME and Blake JA. Assessment of comparative functional annotation propagation in mouse. ISMB 2014, Boston.

Other Highlights:

A. Ontology Development Contributions:

  • David Hill continues working with the ontology development group to implement logical definitions for GO terms.
  • David Hill and Jane Lomax oversee the biological content development of GO. David now has a regular spot in the rotation to address GO Sourceforge items.
  • David Hill has completed an overhaul of glycolytic processes and has begun implementing changes to glycolytic fermentation.
  • David Hill has worked with the BioGrid group to correct and refactor all of the small conjugating protein enzyme activities.


B. Annotation Outreach and User Advocacy Efforts:

  • The Protein Ontology project continues to provide a web interface (http://pir.georgetown.edu/cgi-bin/pro/race_pro) whereby functional annotation using the GO can be applied to PRO submissions.
  • Harold Drabkin continues to serve on the GO-help rota.
  • Judith Blake and Karen Christie are working with Rachael Huntley to coordinate with Astrid Laegrid and Martin Kuiper of the Norwegian University of Science and Technology in Trondheim on how to incorporate the GO annotations their group makes for mammalian (human, mouse, and rat) transcription factors and their target genes.
  • Karen Christie led a remote GO annotation training session for Lora Nacheva, who will be annotating human ciliary genes in Tony Gibson's group at the EMBL in Germany.


C. Other Highlights:

  • As the designated coordinator of the MGI/GO project with the GO Reference Genome project, Li Ni participates in annotation of genes assigned by the Reference Genome Project and oversees the curation of Reference Genome genes for the mouse group. Li responds to and resolves questions about MGI GO annotations for the Reference Genome annotation project genes, especially questions from the lead PAINT curator (see Reference Genome Project report for a description of PAINT). Li is also part of the PAINT curation team.
  • Karen Christie and Li Ni serve as MGI representatives on the PAINT curation team. As members of this team, Li and Karen curate Panther families in PAINT to propagate annotations based on evolutionary relationships. They also file bug reports on PAINT and contribute to the improvement of the PAINT software. Both attended the PAINT Jamboree at Stanford in July, along with Judy Blake.
  • Mary Dolan has begun working with other members of the GO software team on Galaxy for GO. In the past, most groups have used Galaxy for sequence analysis. The focus of the GO Galaxy initiative will be functional analysis, implementing GO tools in Galaxy. Mary also provides various files for the Reference Genome Project, for example, a report to assess the GO annotation status of PANTHER families and subfamilies based on annotations for all reference genome organism genes in the groups.