MGI December 2015

Latest revision as of 11:25, 14 December 2015

Overview:

Staff:

[please include FTEs working on GOC tasks, designating as well how many FTEs are funded by the GOC NHGRI grant]

Judith Blake*

Karen R Christie*

Mary E Dolan*

Harold J Drabkin*

David Hill*

Li Ni*

Dmitry Sitnikov

* Funded entirely or partially by GO

Annotation Progress

{|
! Annotation Type !! Dec 5, 2014 !! Dec 5, 2015 !! Change !! % change
|-
| Total Genes annotated with at least one GO term of any kind || 24226 || 24224 || -2* || -0.01
|-
| Total Annotations || 341687 || 362727 || 21040 || 6.2
|-
| colspan=5 align=center bgcolor=white| '''Total non-IEA Annotation'''
|-
| Total Number of Genes || 23844 || 23979 || 135 || 0.6
|-
| Total Annotations || 242025 || 262218 || 20193 || 8.3
|-
| colspan=5 align=center bgcolor=white| '''Annotation by Direct Experiment'''
|-
| MGI Curated Mouse Genes || 12170 || 12433 || 263 || 2.2
|-
| MGI Curated Annotations || 82573 || 87159 || 4586 || 5.6
|-
| GOA Curated Mouse Genes || 4565 || 5075 || 510 || 11.2
|-
| GOA Curated Annotations || 26002 || 30177 || 4175 || 16.1
|-
| colspan=5 align=center bgcolor=white| '''Annotation by Orthology'''
|-
| Total Genes Annotated by Orthology || 11435 || 11866 || 431 || 3.8
|-
| Total Orthology Annotation || 92787 || 102212 || 9425 || 10.2
|-
| Genes Annotated by Human Orthology Load (GOA) || 10207 || 10701 || 494 || 4.8
|-
| Total Annotation by Human Orthology Load || 61355 || 68129 || 6774 || 11.0
|-
| Genes Annotated by Rat Orthology Load (RGD) || 4415 || 4696 || 281 || 6.4
|-
| Total Annotations by Rat Orthology Load || 27006 || 30769 || 3763 || 13.9
|-
| colspan=5 align=center bgcolor=white| '''IEA Annotation'''
|-
| Total Genes with IEA Annotations || 14602 || 14724 || 122 || 0.8
|-
| Total IEA Annotations || 99662 || 100509 || 847 || 0.8
|-
| Total Genes with SwissProt to GO Annotations || 14211 || 14337 || 126 || 0.9
|-
| Total SwissProt to GO Annotations || 55788 || 56888 || 1100 || 2.0
|-
| Total Genes with Interpro to GO Annotations || 10054 || 9966 || -88 || -0.9
|-
| Total Interpro to GO Annotations || 25511 || 24408 || -1103 || -4.3
|-
| Total Genes with EC to GO Annotations || 1734 || 1709 || -25 || -1.4
|-
| Total EC to GO Annotations || 18363 || 19213 || 850 || 4.6
|}
* Losses due to gene merges, withdrawals, or marker type changes (gene to pseudogene).
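
For readers who want to check or reproduce the Change and % change columns of the annotation table, the arithmetic can be sketched as follows. This is a minimal illustration, assuming that percent change is taken relative to the Dec 5, 2014 count and rounded to one decimal place; the function name `change_and_percent` is introduced here for illustration and is not part of any MGI or GO tooling.

```python
from typing import Tuple


def change_and_percent(before: int, after: int) -> Tuple[int, float]:
    """Return (absolute change, percent change relative to `before`).

    Assumption: the baseline for the percentage is the earlier count.
    """
    change = after - before
    percent = round(100.0 * change / before, 1)
    return change, percent


# Counts copied from the annotation table (Dec 5, 2014 and Dec 5, 2015).
rows = {
    "Total Annotations": (341687, 362727),
    "GOA Curated Mouse Genes": (4565, 5075),
    "Total Interpro to GO Annotations": (25511, 24408),
}

for label, (before, after) in rows.items():
    change, percent = change_and_percent(before, after)
    # Explicit signs make decreases (for example, InterPro) easy to spot.
    print(f"{label}: {change:+d} ({percent:+.1f}%)")
```

Printing an explicit sign for every row is a deliberate choice: it keeps decreases visually distinct from increases, which is easy to lose when the values are entered by hand.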

Methods and strategies for annotation

Literature curation:

Literature curation continues to be the major focus of our annotation efforts.

Computational annotation strategies:

As always, current strategies involve the use of translation tables to mine SwissProt keywords, InterPro domains, and EC numbers for IEA annotation. These loads run automatically on a nightly basis and require little human intervention.
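
The translation-table approach described above can be illustrated with a short sketch. It is not the MGI load code: the `Annotation` record, the `infer_iea` function, the gene identifier, and the one-entry table are all hypothetical and serve only to show how an external identifier (here an EC number) is looked up in a mapping and turned into an annotation with the IEA evidence code.

```python
from typing import Dict, List, NamedTuple


class Annotation(NamedTuple):
    gene_id: str   # gene receiving the annotation
    go_id: str     # GO term assigned through the translation table
    evidence: str  # always "IEA" for this kind of automated load
    source: str    # external identifier that triggered the mapping


# Illustrative translation table: external identifier -> GO term IDs.
# The single entry is an example for this sketch, not a copy of a production file.
EC2GO: Dict[str, List[str]] = {
    "EC:1.1.1.1": ["GO:0004022"],
}


def infer_iea(gene_xrefs: Dict[str, List[str]],
              table: Dict[str, List[str]]) -> List[Annotation]:
    """Create one IEA annotation per (gene, mapped GO term) pair."""
    annotations = []
    for gene_id, xrefs in gene_xrefs.items():
        for xref in xrefs:
            # Identifiers missing from the table simply yield no annotation.
            for go_id in table.get(xref, []):
                annotations.append(Annotation(gene_id, go_id, "IEA", xref))
    return annotations


# Hypothetical gene with one mapped and one unmapped EC number.
example = infer_iea({"GENE:1": ["EC:1.1.1.1", "EC:9.9.9.9"]}, EC2GO)
print(example)
```

Because the lookup is a pure table join, the same function would work unchanged for keyword or domain mappings, provided each source supplies its own table.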

Harold Drabkin monitors weekly QC reports on manual and automatic annotation stats, and responds to questions about specific annotations as required.

Priorities for annotation

  • Isoform curation (Harold, Karen, Protein Ontology project), focusing on genes that have isoforms or whose products are modified, and coordinating with the Protein Ontology Protein Complex project.
  • Genes with no GO annotation but with literature (Li and Dmitry)
  • Genes with only IEA annotation but with literature (Li)
  • Genes marked as having GO annotation completed, but now having new literature (Dmitry)
  • Genes that have an annotation to one of the three root nodes of GO, but have new literature (David)
  • Dmitry has been focused on annotation of miRNAs in MGI
  • Annotation of ciliary genes (Karen)
  • Annotation of metabolic genes, glycolysis, pyruvate metabolism, and carbohydrate catabolism in general (David)
  • Autophagy genes

Presentations and Publications

a. Papers with substantial GO content

  • Drabkin HJ, Christie KR, Dolan ME, Hill DP, Ni L, Sitnikov D, Blake JA. Application of comparative biology in GO functional annotation: the mouse model. Mamm Genome. 2015;26(9):574-58. PubMed PMID: 26141960.
  • Dolan ME, Baldarelli RM, Bello SM, Ni L, McAndrews MS, Bult CJ, Kadin JA, Richardson JE, Ringwald M, Eppig JT, Blake JA. Orthology for comparative genomics in the mouse genome database. Mamm Genome. 2015 Aug;26(7-8):305-13. doi: 10.1007/s00335-015-9588-5. Epub 2015 Jul 30. PubMed PMID: 26223881; PubMed Central PMCID: PMC4534493.
  • Gene Ontology Consortium. Gene Ontology Consortium: going forward. Nucleic Acids Res. 2015 Jan;43(Database issue):D1049-56. doi: 10.1093/nar/gku1179. Epub 2014 Nov 26. PubMed PMID: 25428369; PubMed Central PMCID: PMC4383973.

b. Presentations including Talks and Tutorials and Teaching

  • to be filled in

c. Poster presentations

  • David Hill, Peter D'Eustachio, Nikolai Renedo, Judith Blake. Linking GO Pathways to the Bigger Picture. July 2015. ISMB/ECCB, Dublin, Ireland (audience: Computational Biologists)
  • Karen Christie and Judith Blake. Comprehensive Gene Ontology annotation of ciliary genes in the laboratory mouse. July 2015. FASEB Biology of Cilia & Flagella, Snowmass, CO (audience: Cilia researchers)

Other Highlights:

A. Ontology Development Contributions:

  • David Hill continues working with the ontology development group to implement logical definitions for GO terms.
  • David Hill is in the rotation to address GO-ontology GitHub items.
  • David Hill has completed an overhaul of carbohydrate catabolic processes to the TCA cycle.
  • David Hill has been working with a focus group on representing autophagy in the ontology.


B. Annotation Outreach and User Advocacy Efforts:

  • The Protein Ontology project continues to provide a web interface (http://pir.georgetown.edu/cgi-bin/pro/race_pro) whereby functional annotation using the GO can be applied to PRO submissions.
  • Harold Drabkin continues to serve on the GO-help rota.
  • Judith Blake and Karen Christie are working with Astrid Laegrid and Martin Kuiper of the Norwegian University of Science and Technology in Trondheim to determine how to incorporate GO annotations for mammalian (human, mouse, and rat) transcription factors, and their target genes, made by this group.
  • Karen Christie reviewed and finalized annotations for human ciliary genes made using UniProt's Protein2GO tool by Lora Nacheva, who was a student in Tony Gibson's group at the EMBL in Germany.
  • David Hill is now co-managing the annotation group with Kimberly Van Auken (WormBase).


C. Other Highlights:

  • As the designated coordinator of the MGI/GO project with the GO Reference Genome project, Li Ni participates in annotation of genes assigned by the Reference Genome Project and oversees the curation of Reference Genome genes for the mouse group. Li responds to and resolves questions about MGI GO annotations for the Reference Genome annotation project genes, especially questions from the lead PAINT curator (see the Reference Genome Project report for a description of PAINT). Li is also part of the PAINT curation team.

  • Karen Christie serves as the MGI representative on the PAINT curation team. As a member of this team, Karen curates PANTHER families in PAINT to propagate annotations based on evolutionary relationships. She also files bug reports on PAINT and contributes to the improvement of the PAINT software.
  • Mary Dolan works with other members (particularly Seth Carbon) of the GO software team to develop automated unit tests for AmiGO2 in order to facilitate releases.