28 SEPT 2010 RefGen Priorities Discussion (Archived)

From GO Wiki
Revision as of 10:58, 28 September 2010 by Pascale (talk | contribs) (→‎PTHR10150)

Mike annotated:

  • PTHR10000
  • PTHR10150

Pascale annotated:

  • PTHR10003
  • PTHR10150

Mike's annotations are in the CVS repository: http://cvsweb.geneontology.org/cgi-bin/cvsweb.cgi/go/gene-associations/submission/paint/#dirlist



PTHR10000

Mike

  • 65 sequences; 4 with EXP data
  • Propagated 6 annotations:
    • MF: GO:0000287: magnesium ion binding
    • MF: GO:0004647: phosphoserine phosphatase activity
    • MF: GO:0016791: phosphatase activity
    • CC: GO:0009507: chloroplast
    • CC: GO:0005737: cytoplasm
    • BP: GO:0006564: L-serine biosynthetic process
  • Comparing with InterPro:
    • Yeast serB matches IPR004469
    • BP: GO:0006564 L-serine biosynthetic process
    • MF: GO:0004647 phosphoserine phosphatase activity

PTHR10150

Mike

  • Propagate GO:0016532 "superoxide dismutase copper chaperone activity" to AN3.

  • There is an annotation to SOD activity on Arabidopsis CCS, but no other SOD annotations in the CCS clade. There are widespread SOD annotations throughout the rest of the family.

  • Propagate GO:0004784 "superoxide dismutase activity" to AN0.
  • E. coli SOD (http://biocyc.org/ECOLI/NEW-IMAGE?type=NIL&object=G6886-MONOMER) has 4 active site histidines that have been identified, at positions 67, 69, 92, and 147, corresponding approximately to positions 320 (for 67 and 69), 348, and 470 in this alignment. The first three of these histidines are conserved throughout this family, except for the fungal and plant CCS proteins, which include the Arabidopsis CCS with SOD activity. The fourth histidine is absent from most of the CCS clade, including plant. So, either there are 3 incorrect annotations (including an IDA) from 2 different papers showing SOD activity on plant CCS, or the plant CCS has acquired SOD activity through a different mechanism. Let's go with the latter explanation for now. Block propagation of SOD activity to the CCS clade by placing an IRD at AN3; plant CCS will still have the positive annotations curated to it. Correction: block propagation at AN4, since PAINT will not allow the IRD at AN3.
  • Both GO:0016532 and GO:0004784 should have a has_part relationship to GO:0005507 "copper ion binding"; until this is implemented, propagate GO:0005507 to AN0.
  • Also propagate GO:0008270 "zinc ion binding" to AN0. In the absence of contradictory information, allow this to propagate to the CCS clade, but be prepared to change this decision.
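The histidine check described above (are the active-site positions conserved in each sequence?) can be sketched as a simple column lookup. This is only an illustration: the alignment, sequence names, and column numbers below are invented toy data, not the PTHR10150 alignment, and the function names are hypothetical.

```python
from typing import Dict, List


def column_conservation(alignment: Dict[str, str], column: int,
                        residue: str = "H") -> Dict[str, bool]:
    """For each sequence, report whether the 1-based alignment column holds `residue`."""
    return {name: seq[column - 1] == residue for name, seq in alignment.items()}


def missing_residue(alignment: Dict[str, str], columns: List[int],
                    residue: str = "H") -> Dict[int, List[str]]:
    """Map each column to the sorted names of sequences lacking `residue` there."""
    result = {}
    for col in columns:
        flags = column_conservation(alignment, col, residue)
        result[col] = sorted(name for name, ok in flags.items() if not ok)
    return result


# Toy data (assumption): three short aligned sequences, gaps written as "-".
toy_alignment = {
    "ecoli_sodC": "AHGHKH",
    "yeast_SOD1": "AHGHRH",
    "arath_CCS":  "AQGHK-",
}
toy_columns = [2, 4, 6]  # hypothetical columns standing in for the active-site histidines

report = missing_residue(toy_alignment, toy_columns)
for col, names in report.items():
    print(col, names)
```

A column with an empty list is fully conserved; any sequence listed for a column is a candidate for blocking propagation of the activity.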

Time to curate MF: 47 minutes


CC

  • SOD/mitochondrion: There are annotations to mitochondrion or some child of mitochondrion for mouse, rat, worm, and yeast SOD1. Propagate mitochondrion to AN48 and block propagation to the SOD3 clade.
  • The eukaryotic SODs have multiple annotations to "extracellular region" or its children. Propagate to AN46. Similarly, propagate "periplasmic space" from E. coli sodC to the other bacterial proteins.
  • Can't really make any good inference for the CCS clade.
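The "propagate to a node, block at a subclade" decisions above amount to a tree walk that skips blocked subtrees. The sketch below is not PAINT's implementation; the tree topology, clade names, and leaf names are hypothetical stand-ins, with only the node label AN48 and the SOD3 block taken from the notes.

```python
from typing import Dict, List, Set


def propagate(tree: Dict[str, List[str]], start: str, blocked: Set[str]) -> List[str]:
    """Return the leaves under `start` that inherit an annotation, skipping blocked subtrees."""
    inherited = []
    stack = [start]
    while stack:
        node = stack.pop()
        if node in blocked:
            continue  # a block stops inheritance for this node and everything below it
        children = tree.get(node, [])
        if not children:
            inherited.append(node)  # a node with no children is a leaf sequence
        else:
            stack.extend(children)
    return sorted(inherited)


# Toy tree (assumption): parent node -> list of children.
toy_tree = {
    "AN48": ["SOD1_clade", "SOD3_clade"],
    "SOD1_clade": ["mouse_SOD1", "rat_SOD1", "yeast_SOD1"],
    "SOD3_clade": ["human_SOD3", "mouse_SOD3"],
}

leaves = propagate(toy_tree, "AN48", {"SOD3_clade"})
print(leaves)
```

With the block in place only the SOD1 leaves inherit the annotation; calling the function with an empty blocked set would return every leaf under AN48.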

Time to annotate CC: 14 minutes


BP

Start with functions directly related to the MFs of these proteins. Set aside multicellular processes.

  • CCS clade: Propagate:
    • GO:0051341 : regulation of oxidoreductase activity
    • GO:0015680 : intracellular copper ion transport
  • Propagate to AN0:
    • GO:0019430 : removal of superoxide radicals

That covers most of the cellular processes.


Pascale

MOLECULAR FUNCTION

- 20 min, 1 annotation propagated.
- Only 3 direct EXP:
-- GO:0046982 protein heterodimerization activity
-- GO:0000014 single-stranded DNA specific endodeoxyribonuclease activity
-- GO:0003697 single-stranded DNA binding


MF ANNOTATIONS:

1. Propagated to the whole tree: GO:0004520 : endodeoxyribonuclease activity - Could perhaps have propagated the more specific term "GO:0000014 : single-stranded DNA specific endodeoxyribonuclease activity", but this needs more data (one mammalian species would be great).

2. NOT to long branches:

  • ORNAN ENSOANG00000025598
  • XP_001199502, XP_001186979 (Strongylocentrotus purpuratus - however, reported as partial sequences in GenBank)
  • Rat XP_573032 (record removed in GenBank)
  • Rat XP_001077837 - looks like it needs to be merged with another sequence (5' only)

Questions for MODs:

SGD:

1. Is the gene comprehensively annotated? There seem to be few annotations compared with the volume of literature. Date last reviewed: 2007-10-01.

2. Cerevisiae has more processes, possibly based on more biochemical data. Are those all independent?
-- GO:0000735 removal of nonhomologous ends
-- GO:0006296 nucleotide-excision repair, DNA incision, 5'-to lesion
-- GO:0000736 double-strand break repair via single-strand annealing, removal of nonhomologous ends


============
CELLULAR COMPONENT

- Annotation: 15 min

  • Again, human annotations don't show up: "GO:0000109 nucleotide-excision repair complex" is missing in PAINT. GOA should have annotated to GO:0000110, not GO:0000109.


CC ANNOTATIONS:

1. Annotated "GO:0000110 : nucleotide-excision repair factor 1 complex" to AN1 (plants left out). We need to know whether the other species have RAD10 as well, but since yeast and human have it, I am assuming it is conserved.

2. NOT to long branches:

  • ORNAN ENSOANG00000025598,
  • XP_001199502, XP_001186979 (Strongylocentrotus purpuratus - however, reported as partial sequences in GenBank)
  • Rat XP_573032 (record removed in GenBank)
  • Rat XP_001077837 - looks like it needs to be merged with another sequence (5' only)

- GOA: ERCC4 should be annotated to GO:0000110, not GO:0000109.

Did not annotate:
- Spindle body: S. pombe HTP. This is the only outlier.

============
BIOLOGICAL PROCESS

- Annotation: 30 minutes
- GO:0006289 nucleotide-excision repair: TAIR, pombe, cerevisiae, fly, human (annotations not showing up)
- Annotations are related (recombination, meiosis, recombinational repair, etc.) but scattered all over the GO, probably depending on the experiments or the annotators.

BP ANNOTATIONS:

1. GO:0006289 nucleotide-excision repair: Propagate to all (except long branches, see MF annotations)

2. GO:0006310 DNA recombination: Annotate all, based on:

  • Fly: GO:0007131 reciprocal meiotic recombination
  • cerevisiae: GO:0006312 mitotic recombination
  • S. pombe: GO:0007534 : gene conversion at mating-type locus (child of GO:0006312 : mitotic recombination )
  • A. thaliana: GO:0000724 double-strand break repair via homologous recombination
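The choice of GO:0006310 as the term to propagate can be pictured as finding an ancestor shared by all four observed terms. The sketch below is a toy: only the GO:0007534 to GO:0006312 link is stated in these notes, and the other parent links are assumptions added for illustration rather than relationships read from the ontology file.

```python
from typing import Dict, List, Set


def ancestors(term: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Return `term` together with every term reachable by following parent links."""
    seen = {term}
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def common_ancestors(terms: List[str], parents: Dict[str, List[str]]) -> Set[str]:
    """Return the terms that are ancestors of (or equal to) every term in `terms`."""
    return set.intersection(*(ancestors(t, parents) for t in terms))


toy_parents = {
    "GO:0007534": ["GO:0006312"],  # stated in the notes: child of mitotic recombination
    "GO:0006312": ["GO:0006310"],  # assumption for this sketch
    "GO:0007131": ["GO:0006310"],  # assumption for this sketch
    "GO:0000724": ["GO:0006310"],  # assumption (intermediate terms omitted)
}
observed = ["GO:0007131", "GO:0006312", "GO:0007534", "GO:0000724"]

shared = common_ancestors(observed, toy_parents)
print(shared)
```

In a real ontology the shared set would also include more general terms up to the root, so a curator would pick the most specific member of that set.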

Outliers:
- Do not propagate GO:0009792 embryo development ending in birth or egg hatching (worm RNAi)
- Do not propagate GO:0016321 female meiosis chromosome segregation (fly IMP)


PTHR10003