MGI December 2012

From GO Wiki
Revision as of 07:50, 10 December 2012 by Hdrabkin (talk | contribs) (Staff:)


Mouse Genome Informatics Summary, December 2011

Overview

Staff:

Judith Blake*

Karen R Christie*

Alexander Diehl*

Mary Dolan*

Harold J Drabkin*

David Hill*

Li Ni

Dmitry Sitnikov

* Funded entirely or partially by GO

Annotation Progress

We continue to emphasize genes selected for the Reference Genome Project. We also prioritize genes that have experimental literature but no GO annotations.


MGI GO STATS as of May 2011


| Annotation Type | 05_Nov_2011 | 05_May_2011 | Change | % Change |
|---|---|---|---|---|
| Total Genes annotated with at least one GO term of any kind | 25106 | 33693* | -8587 | -25.5 |
| Total annotations | 268669 | 288498* | -19829 | -6.9 |
| Total non-IEA Annotation | | | | |
| Total Number of Genes | 23968 | 32611* | -8643 | -26.5 |
| Total Annotations | 174915 | 196934* | -22019 | -11.2 |
| MGI Curated Genes | 10306 | 10223 | 83 | 0.8 |
| MGI Curated Annotations | 63951 | 62146 | 1805 | 2.9 |
| GOA Curated Genes | 2858 | 2644 | 214 | 8.1 |
| GOA Curated Annotations | 13612 | 10679 | 2933 | 27.5 |
| Genes Annotated by Orthology Total | 8579 | 8739 | -160 | -1.8 |
| Total Orthology Annotations | 60082 | 59095 | 987 | 1.7 |
| Genes Annotated by Human Orthology Load (GOA) | 6742 | 7339 | -597 | -8.1 |
| Total Annotations by Human Orthology Load | 34656 | 35376 | 723 | 2 |
| Genes Annotated by Rat Orthology Load (RGD) | 3957 | 3870 | 87 | 2.2 |
| Total Annotations by Rat Orthology Load | 22250 | 21095 | 1155 | 5.5 |
| IEA Annotation | | | | |
| Total Genes with IEA | 16319 | 16267 | 52 | 0.3 |
| Total IEA Annotations | 93754 | 91564 | 2190 | 2.4 |
| Total Genes with SwissProt to GO | 15812 | 15684 | 128 | 0.8 |
| Total SwissProt to GO Annotations | 62352 | 60213 | 2139 | 3.6 |
| Total Genes with InterPro to GO | 9792 | 10333 | -541 | -5.2 |
| Total InterPro to GO Annotations | 29893 | 30152 | -259 | -0.9 |
| Total Genes with EC to GO Annotations | 925 | 985 | -60 | -6.1 |
| Total EC to GO Annotations | 1509 | 1199 | 310 | 25.9 |

100% of current gene models. *Drops reflect changes in gene number (markers of type gene changed to pseudogenes, DNA segments, etc.). The total gene annotation counts dropped mostly because markers were reclassified from gene to a non-gene type, such as pseudogene or cluster. When this happens, any root annotations and any annotations by orthology are removed automatically.
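
The Change and % Change columns can be reproduced from the two counts in each row. The following is a minimal sketch, assuming that the percent change is taken relative to the earlier (05_May_2011) count and rounded to one decimal place; the function name is hypothetical and is not part of any MGI tooling.

```python
from typing import Tuple


def change_and_percent(current: int, previous: int) -> Tuple[int, float]:
    """Return (absolute change, percent change relative to the previous count).

    Assumption: percent change = 100 * (current - previous) / previous,
    rounded to one decimal place.
    """
    change = current - previous
    percent = round(100.0 * change / previous, 1)
    return change, percent


if __name__ == "__main__":
    # "Total annotations" row: 05_Nov_2011 vs 05_May_2011
    print(change_and_percent(268669, 288498))
    # "MGI Curated Genes" row
    print(change_and_percent(10306, 10223))
```

A negative result means the count fell between the two dates, as in the rows affected by markers being reclassified from gene to a non-gene type.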

Methods and strategies for annotation

Literature curation:

Literature curation continues to be the major focus of our annotation efforts.

Computational annotation strategies:

As before, our current strategies use translation tables to map SwissProt keywords, InterPro domains, and EC numbers to GO terms for IEA annotation. These mappings run automatically every night and require little human intervention.
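
The translation-table approach can be sketched as a simple lookup from external identifiers to GO terms. This is an illustration only, not the MGI load itself: the function name, the gene identifiers, and the layout of the table are assumptions, and the few table entries are examples rather than a copy of the actual SwissProt keyword, InterPro, or EC mapping files.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Illustrative translation table: external identifier -> GO term IDs.
# Assumption: entries are examples only; consult the real mapping files.
TRANSLATION_TABLE: Dict[str, List[str]] = {
    "EC:2.7.11.1": ["GO:0004674"],
    "InterPro:IPR000719": ["GO:0004672", "GO:0006468"],
    "SP_KW:KW-0418": ["GO:0016301"],
}

# Hypothetical genes and the external identifiers attached to them.
EXAMPLE_GENES: Dict[str, List[str]] = {
    "GENE_A": ["EC:2.7.11.1", "InterPro:IPR000719"],
    "GENE_B": ["SP_KW:KW-0418"],
    "GENE_C": ["InterPro:IPR999999"],  # no table entry, so no annotation
}


def infer_iea_annotations(
    gene_xrefs: Dict[str, List[str]],
    table: Dict[str, List[str]],
) -> Dict[str, List[Tuple[str, str, str]]]:
    """Map each gene's external identifiers to (GO ID, evidence, source) rows."""
    inferred = defaultdict(set)
    for gene_id, xrefs in gene_xrefs.items():
        for xref in xrefs:
            # Identifiers missing from the table contribute nothing.
            for go_id in table.get(xref, []):
                inferred[gene_id].add((go_id, "IEA", xref))
    # Sort rows so the output is stable from run to run.
    return {gene_id: sorted(rows) for gene_id, rows in inferred.items()}


if __name__ == "__main__":
    results = infer_iea_annotations(EXAMPLE_GENES, TRANSLATION_TABLE)
    for gene_id, rows in results.items():
        print(gene_id, rows)
```

Recording the source identifier alongside each GO term keeps the origin of every inferred annotation traceable, which matters when a mapping entry is later revised.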

Priorities for annotation

  • Genes assigned by Reference Genome Project (everyone)
  • Isoform curation (Harold, Protein Ontology project); now coordinated with the Reference Genome priority above by focusing on reference genes that have isoforms, and coordinated with the Protein Ontology Protein Complex project.
  • Genes with no GO annotation but with literature (Li and Dmitry)
  • Genes with only IEA annotation but with literature (Li)
  • Genes identified as being important in lung development (Dmitry)
  • Genes marked as having GO annotation completed, but now having new literature (Dmitry)
  • Dmitry has been focused on annotation of miRNAs in MGI

Presentations and Publications

Nucleic Acids Research title here: in press

Ontology Development Contributions:

  • 1. David Hill, working with Tanya Berardini, Chris Mungall, Midori Harris and Jane Lomax, develops cross-products within and among the three GO namespaces.
  • 2. David Hill works with Chris Mungall and Tanya Berardini to add inter-ontology links in GO.
  • 3. Alexander Diehl is project leader for the Cell Ontology (CL), with Chris Mungall, and Terry Meehan was a full-time curator working on the CL. Terry worked on general improvements to the CL and on the import of FMA cell types, among other tasks. Alex expanded the representation of neurons in the CL as part of an ongoing collaboration with the International Neuroinformatics Coordinating Facility (INCF). In summer 2011 Terry mentored Christopher Carr, a former participant in the Jackson Laboratory Summer Student Program, and two new participants, Krystal English and Danielle Penny. Christopher worked on identifying mouse genotypes that share autistic-like phenotypes; Krystal identified developmental processes linked to interstitial lung disease; Danielle worked on using spatial and temporal qualities of gene expression to prioritize candidate genes involved in autism. All three made extensive use of various ontologies, including the GO, in their analyses.
  • 4. Alex continues to act as the GOC liaison to the Infectious Disease Ontology and Vaccine Ontology groups and to act on term requests for the GO from those groups, and is active in the GO Signaling and Virus content development groups.
  • 5. David Hill and Jane Lomax oversee the biological content development of GO. In particular, all new developmental biology-related terms submitted to SourceForge are handled by David Hill.
  • 6. David Hill and Harold Drabkin continue working with Tanya Berardini, Chris Mungall, Midori Harris, Jane Lomax and ChEBI curators to align GO with ChEBI. This will result in the first set of cross-products between GO and an external ontology.
  • 7. David Hill and Karen Christie continue to revise and update the transcription area of GO.

Annotation Outreach and User Advocacy Efforts:

  • The Protein Ontology project continues to provide a web interface (http://pir.georgetown.edu/cgi-bin/pro/race_pro) through which functional annotation using the GO can be applied to PRO submissions. Harold attended the Protein Society meeting in Boston to encourage bench scientists to use the tool.
  • David Hill and Harold Drabkin serve on the GO-help rota.
  • Terry Meehan, Chris Mungall, and Alex Diehl (Univ. of Buffalo) continue their editing of the Cell Ontology (CL), including cross-products to the GO. Chris is helping David Hill mine the notes in MGI's GO annotations that refer to CL identifiers.

Other Highlights:

  • As the designated coordinator of the MGI/GO project with the GO Reference Genome project, Li Ni participates in annotating genes assigned by the Reference Genome Project, maintains the mouse Reference Genome list on the MGI GO wiki and Google spreadsheet, maintains the Reference Genome status table on the GO wiki, and oversees the curation of Reference Genome genes for the mouse group. Li responds to and resolves questions about MGI GO annotations for Reference Genome project genes, especially questions from the lead PAINT curator (see the Reference Genome Project report for a description of PAINT).
  • Mary Dolan has been collaborating with Carol Bult at MGI on aligning Gene Ontology annotations for mouse genes assigned to MouseCyc pathways (see http://www.informatics.jax.org/pathways.shtml) and on exploring computational methods for associating functional, pathway, and phenotypic data. Mary also provides various files for the Reference Genome Project, for example a report that assesses the GO annotation status of PANTHER families and subfamilies based on annotations for all reference genome organism genes in the groups.