SGD December 2010


Saccharomyces Genome Database Summary, 2010



J. Michael Cherry, Eurie Hong, Rama Balakrishnan, Gail Binkley, Karen Christie, Maria Costanzo, Selina Dwight, Stacia Engel, Dianna Fisk, Jodi Hirschman, Ben Hitz, Cindy Krieger, Stuart Miyasato, Rob Nash, Julie Park, Marek Skrzypek, Shuai Weng, Edith Wong, Kalpana Karra, Craig Amundsen, Jo Chung, Esther Chan

Annotation Progress

Since October 2009, the following annotations have been added to SGD:
  • Note that these numbers count manually curated and high-throughput annotations only for ORFs that are Verified or Uncharacterized (Dubious ORFs are excluded), for RNA genes (ncRNA, rRNA, snRNA, snoRNA, or tRNA), and for genes encoded within transposable elements. The counts may include both new annotations and updated annotations that replaced older ones.
GO Aspect            Annotations added   Genes updated   Publications used
Biological Process   1560                845             743
Molecular Function   677                 455             377
Cellular Component   687                 458             317

State of GO annotations genome-wide

Type                                                  Counts as of November 29, 2010
GP with Any Annotation                                6357
GP with Manual Annotation                             6357
GP with Experimental and Computational Evidence       3554
GP with Computational Evidence                        3554
GP with Curator Evidence (TAS, NAS, IC)               1615
GP with No Data (ND)                                  2310
All Annotations                                       79378
Total Annotations with Experimental Evidence          37045
Total Annotations with Computational Evidence         35580
Annotations with Curator Evidence (TAS, NAS, IC)      3753
Annotations with No Data (ND)                         4072

Reference Genome Curation Status

Month         Number of S. cerevisiae genes   Number completed
October 09    15                              15
November 09   11                              11
December 09   6                               6
January 10    14                              14

Methods and strategies for annotation

a. Literature curation: 100% of SGD’s effort is dedicated to manual curation based on the published literature for budding yeast genes and their products.

b. Computational annotation strategies: SGD does not employ automated methods to assign annotations; rather, we incorporate the computationally predicted annotations made by the UniProtKB GOA project for S. cerevisiae. The IEA annotations are loaded into the SGD database from the GOA gene association file after each release. In addition, SGD has incorporated computationally predicted annotations generated by the integrated bioinformatic analysis of high-throughput data from the Roth and Troyanskaya labs (Tian et al., 2008; Huttenhower and Troyanskaya, 2008). These computational predictions are updated once a year. All of these annotations are included in the gene_association.sgd file, a significant expansion of the types of evidence codes and data sources that SGD provides.
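For readers who want to reproduce per-evidence-type counts from a gene association file, the following is a minimal sketch rather than SGD's own procedure. It assumes the standard tab-delimited GAF layout (DB Object ID in column 2, evidence code in column 7, comment lines starting with "!"). The function names and the grouping of codes into "experimental" and "computational" are assumptions made for illustration; only the curator group (TAS, NAS, IC) and ND are taken from the table in this report.

```python
from collections import Counter

# Assumed groupings of GO evidence codes; SGD's exact grouping is not stated here.
EXPERIMENTAL = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}
COMPUTATIONAL = {"ISS", "ISO", "ISA", "ISM", "IGC", "RCA", "IEA"}
CURATOR = {"TAS", "NAS", "IC"}  # grouping used in this report


def evidence_group(code):
    """Map a single evidence code to a coarse group name."""
    if code in EXPERIMENTAL:
        return "experimental"
    if code in COMPUTATIONAL:
        return "computational"
    if code in CURATOR:
        return "curator"
    if code == "ND":
        return "no data"
    return "other"


def tally_gaf(lines):
    """Count annotations and distinct gene products per evidence group.

    `lines` is any iterable of GAF lines, e.g. an open file handle.
    """
    annotation_counts = Counter()
    gene_sets = {}
    for line in lines:
        # Skip blank lines and "!" comment/header lines.
        if not line.strip() or line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 15:  # not a complete GAF row
            continue
        group = evidence_group(cols[6])  # column 7: evidence code
        annotation_counts[group] += 1
        gene_sets.setdefault(group, set()).add(cols[1])  # column 2: DB Object ID
    gene_counts = {group: len(ids) for group, ids in gene_sets.items()}
    return annotation_counts, gene_counts
```

A typical call would be `with open("gene_association.sgd") as fh: annotations, genes = tally_gaf(fh)`, assuming an uncompressed local copy of the file.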

c. Priorities for annotation: The highest priority is to capture annotations where new information is available for an Uncharacterized gene product. These papers are identified during the literature triage process. In addition, we update older annotations and replace annotations that use the TAS evidence code with ones supported by experimental evidence. SGD records the date on which the annotations for a gene were last reviewed. Using this review date, older annotations are checked for consistency with the current literature.
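The review-date and TAS-replacement priorities described above could be sketched as a simple filter. This is an illustration only: the record layout (gene, evidence code, review date), the function name `review_queue`, and the cutoff date are hypothetical and do not describe SGD's actual curation tools.

```python
from datetime import date


def review_queue(annotations, cutoff):
    """Return {gene: [reasons]} for genes that may need re-curation.

    `annotations` is an iterable of (gene, evidence_code, date_reviewed)
    tuples; `cutoff` is a date before which a review counts as stale.
    """
    flagged = {}
    for gene, code, reviewed in annotations:
        reasons = flagged.setdefault(gene, set())
        if reviewed < cutoff:
            reasons.add("stale review")
        if code == "TAS":
            reasons.add("TAS to replace")
    # Drop genes with nothing to do; sort reasons for stable output.
    return {gene: sorted(r) for gene, r in flagged.items() if r}


# Hypothetical example records (gene names are placeholders).
records = [
    ("YFG1", "TAS", date(2005, 3, 1)),
    ("YFG2", "IDA", date(2010, 6, 1)),
    ("YFG3", "IMP", date(2004, 1, 1)),
]
queue = review_queue(records, cutoff=date(2008, 1, 1))
```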

Presentations and Publications


Other Highlights

A. Ontology Development Contributions:

1) K. Christie has been working on improving the ontology branches related to transcription, and the has_part relationship. She has added 129 terms in MF and BP aspects so far and is working on adding more.

2) K. Christie participates in the OBO-Edit working group.

3) SGD curators participate in the Annotation Jamboree. R. Balakrishnan, K. Christie, and J. Park participated in the Annotation Jamboree held in November 2010.

SGD's participation in SourceForge items:
New Term Requests submitted by SGD: 55
New Term Requests by others, commented on by SGD: 11
Ontology-related changes initiated by SGD: 56
Ontology-related changes initiated by others, commented on by SGD: 32

B. Reference Genome Project:

1) R. Balakrishnan has been working on generating inferences using PAINT.

C. Evidence Code Ontology:

1) R. Balakrishnan and K. Christie are working with Michelle and Marcus on the ECO project.

D. Annotation Working Groups:

1) J. Hirschman and R. Balakrishnan are part of the regulation_of working group.

2) K. Christie and R. Balakrishnan are part of the response_to and downstream_effects working groups.

3) R. Balakrishnan is part of the HTP working group.

4) D. Fisk and R. Balakrishnan are part of the Protein complex working group.

5) J. Hirschman is part of the Signaling working group.

Annotation Outreach and User Advocacy Efforts

1) E. Hong is part of the rotation that answers user email from gohelp.

2) R. Balakrishnan, E. Wong and B. Hitz participate in the WebPresence and AmiGO Hub working groups.

3) B. Hitz, G. Binkley and S. Miyasato are part of the software group.