MGI, Sept 2010

From GO Wiki
Mouse Genome Informatics September, 2010

Overview

Staff:

Judith Blake

Alexander Diehl

Mary Dolan

Harold J Drabkin

David Hill

Terry Meehan

Li Ni

Dmitry Sitnikov

Annotation Progress

We continue to put emphasis on those genes selected for the Reference Genome Project. Additional emphasis has been placed on certain genes associated with lung development.


MGI GO STATS as of August 1, 2010


Annotation Type                                                  01_Aug_10   11_Mar_10   Change   % Change
Total Genes annotated (with at least one GO term of any kind)    34271*      35191
Total Manual Annotation
  Number of Genes                                                32488       33228
Orthology                                                        4191        4006
IEA Annotation
  SwissProt to GO                                                15797       15942
  Interpro to GO                                                 10791       10592
  EC to GO                                                       1969        1248
* 100% of current gene models
  • Note: decreases in the number of genes annotated reflect effects of MGI Sequence Group reclassifying various markers as no longer type "gene", but pseudogene, complex/cluster, or "other genome feature". Previous annotation to these markers, even if to root, has been stricken.
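The Change and % Change columns of the table carry no values here. They can be derived from the two counts; the following is a minimal sketch, assuming that Change means the August count minus the March count and that % Change is taken relative to the March count (both conventions are assumptions, not stated in the report). The function and variable names are illustrative.

```python
# Counts copied from the table above: (01_Aug_10, 11_Mar_10).
COUNTS = {
    "Total genes annotated": (34271, 35191),
    "Manual annotation, number of genes": (32488, 33228),
    "Orthology": (4191, 4006),
    "SwissProt to GO": (15797, 15942),
    "Interpro to GO": (10791, 10592),
    "EC to GO": (1969, 1248),
}


def change_and_percent(current, previous):
    """Return (absolute change, percent change relative to `previous`).

    Assumed convention: change = current - previous.
    """
    change = current - previous
    percent = 100.0 * change / previous
    return change, percent


def build_rows(counts):
    """Map each row label to its (change, percent change) pair."""
    return {
        label: change_and_percent(current, previous)
        for label, (current, previous) in counts.items()
    }


if __name__ == "__main__":
    for label, (change, percent) in build_rows(COUNTS).items():
        print(f"{label:<40}{change:>+7d}{percent:>+9.2f}%")
```

Negative values correspond to the decreases explained in the note on marker reclassification.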


Methods and strategies for annotation

Literature curation:

Literature curation remains the major focus of our annotation efforts. We are currently using ProMiner to improve the association of incoming literature with genes in MGI (indexing). These associations drive various QC reports used to identify annotation priorities.


Computational annotation strategies:

As before, our computational strategy uses translation tables to mine SwissProt keywords and InterPro domains for IEA annotation. These mappings are applied automatically each night and require little human intervention.
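To illustrate how a translation table can drive IEA annotation, here is a minimal sketch. It assumes mapping lines in the general style of the GO external-to-GO mapping files (`external_id name > GO:term name ; GO:id`, with `!` marking comment lines). The sample lines, the gene identifiers, and the function names are illustrative assumptions; this is not the MGI pipeline itself.

```python
# Illustrative mapping lines; not copied from the real mapping files.
SAMPLE_MAPPING = [
    "! illustrative translation table",
    "InterPro:IPR000003 Retinoid X receptor/HNF4 > GO:DNA binding ; GO:0003677",
    "SP_KW:KW-0418 Kinase > GO:kinase activity ; GO:0016301",
]


def parse_mapping(lines):
    """Build {external_id: set of GO ids} from mapping lines."""
    table = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("!"):
            continue  # skip blanks and comment lines
        left, sep, right = line.partition(" > ")
        if not sep:
            continue  # not a mapping line
        external_id = left.split(None, 1)[0]
        go_id = right.rsplit(" ; ", 1)[-1].strip()
        table.setdefault(external_id, set()).add(go_id)
    return table


def iea_annotations(gene_features, table):
    """Return sorted (gene, GO id, evidence code, source) tuples.

    `gene_features` maps a gene id to the keyword/domain ids attached to it.
    """
    annotations = set()
    for gene, features in gene_features.items():
        for feature in features:
            for go_id in table.get(feature, ()):
                annotations.add((gene, go_id, "IEA", feature))
    return sorted(annotations)


if __name__ == "__main__":
    table = parse_mapping(SAMPLE_MAPPING)
    # Hypothetical gene ids and feature assignments.
    genes = {"gene_A": ["InterPro:IPR000003"], "gene_B": ["SP_KW:KW-0418"]}
    for row in iea_annotations(genes, table):
        print(row)
```

Keeping the source identifier in each tuple makes it possible to trace an IEA annotation back to the keyword or domain that produced it.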


Priorities for annotation

  1. Genes assigned by Reference Genome Project (everyone)
  2. Isoform curation (Harold, Protein Ontology project); now co-ordinating with 1 by focusing on reference genes that have isoforms.
  3. Genes with no GO annotation but with literature (Li and Dmitry) (see ** above)
  4. Genes identified as being important in lung development (Dmitry)
  5. Genes marked as having GO annotation completed, but now having new literature (Dmitry)

Presentations and Publications

Add new since March 2010:

b. Presentations including Talks and Tutorials and Teaching

Add new since March 2010

A. Ontology Development Contributions:

Edit, adding new since March 2010:

1. David Hill has worked on a team with Tanya Berardini, Chris Mungall, Midori Harris, Jen Deegan and Jane Lomax to develop cross-products within the three GO namespaces. The regulation cross-products have been released in the extended GO. David and Tanya are now quality checking the internal biological process cross-products.

2. David Hill has worked with Tanya Berardini to continue adding inter-ontology links between MF and BP.

3. David Hill has worked with Varsha Khodiyar, Tanya Berardini, Doug Howe, Susan Tweedie, Ruth Lovering and community experts to expand the heart development portion of the ontology.

4. David Hill has worked with Yasmin Alam-Faruque, Midori Harris, Becky Folger, Doug Howe, Emily Dimmer, Rachel Huntley and community experts to expand the kidney development portion of the ontology.

5. David Hill and Tanya Berardini are maintaining the automated quality checks of the ontology on an ongoing basis.

6. David Hill and Harold Drabkin are working with Jane Lomax, Midori Harris and Tanya Berardini to align the representation of biochemicals in GO.

7. David Hill and Tanya Berardini added or modified 231 terms as a result of attending the American Society for Cell Biology Meeting in December.

8. Alexander Diehl is part of the GO Signaling Working Group led by Becky Folger that continues to work on improvements in the representation of signaling in the GO.

9. Alexander Diehl is project leader for the Cell Ontology (with Chris Mungall) and Terry Meehan is a full-time curator working on the Cell Ontology. Terry has finished implementing cross-products for hematopoietic cell ontology terms and is working on general improvements to the CL. Alex has recently worked with two summer students on improvements to the representation of neurons in the CL, part of an ongoing collaboration with the International Neuroinformatics Framework. In May 2010 we held a very successful Cell Ontology Workshop at The Jackson Laboratory, where many issues regarding the long-term development of the ontology were settled. We will hold another workshop on neurons in the first quarter of 2011.

Annotation Outreach and User Advocacy Efforts:

The Protein Ontology project is providing a web interface (http://pir.georgetown.edu/cgi-bin/pro/race_pro) whereby functional annotation using the GO can be applied to PRO submissions. These are reviewed by Cecilia Arighi of Georgetown. At present, only the PRO curators (Georgetown and MGI) are using the tool, but it is available to anyone.

Harold and David, along with Emily Dimmer, are mentoring Heather Wick of Tufts University in the annotation of human fetal lung development.

Other Highlights:

We have implemented a pipeline that adds "missing" annotations based on the function x process inter-ontology links being added to the GO.
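The idea behind such a pipeline can be sketched as follows: if a gene is annotated to a molecular function term that is linked to a biological process term, and the gene lacks an annotation to that process, the process annotation is proposed. This is a minimal sketch under stated assumptions, not the MGI implementation; the link shown, the gene identifiers, and the function name are illustrative.

```python
# Illustrative function-to-process links: {MF term: set of BP terms}.
# The pairing below is an example only, not an extract from the ontology.
LINKS = {
    "GO:0004672": {"GO:0006468"},  # protein kinase activity -> protein phosphorylation
}


def infer_missing(annotations, links):
    """Return sorted (gene, process term) pairs implied by the links
    but absent from the existing annotations.

    `annotations` is an iterable of (gene id, GO id) pairs.
    """
    existing = set(annotations)
    inferred = set()
    for gene, term in existing:
        for process in links.get(term, ()):
            candidate = (gene, process)
            if candidate not in existing:
                inferred.add(candidate)
    return sorted(inferred)


if __name__ == "__main__":
    # Hypothetical genes: gene_A lacks the process annotation, gene_B has it.
    current = [
        ("gene_A", "GO:0004672"),
        ("gene_B", "GO:0004672"),
        ("gene_B", "GO:0006468"),
    ]
    print(infer_missing(current, LINKS))
```

Checking against the existing annotation set keeps the step idempotent: running it again after the inferred annotations are loaded adds nothing new.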