RefG Princeton April 12-13 2010: Difference between revisions

From GO Wiki

Revision as of 10:32, 13 April 2010

Propagation Rules/SOP

  • It is very useful to spend a few minutes looking at a review, GeneWiki, etc. for an overview of the family when the PAINT curator is not familiar with it.
  • Look at the tree topology to see if it makes sense. For example, use OrthoMCL mapping to do a reality check on the tree. If the topology does not make sense, contact Paul and the tree will be edited as appropriate.
  • It is generally easiest to start with Molecular Function, then Cellular Component, then Biological Process.
  • In general, we will annotate to the most specific term possible and propagate as far back as possible.
  • We will curate exhaustively by examining every experimental annotation
  • It can be useful (it leads to improvements to the GO structure) to download the terms and view the DAG for all terms (possible future feature request).
  • Every NOT must have a manual note added in the Evidence pane. Add notes below the generic paragraph that pops up.
  • When a PAINT curator finds a possible experimental annotation that has not yet been added, the SOP is to contact the MOD curator to request that the annotation be added. The PAINT curator does not need to wait before doing the PAINT curation: they can add a note to the Evidence entry that the annotation exists, and the tree will be revisited.
  • NOT + rapid divergence = the line will not be in the GAF provided to the MOD but will be retained in the PAINT GAF. This makes it possible to say "do not propagate" to a particular clade, as distinct from adding an explicit NOT. For a "real" NOT, we will use a different qualifier; these will be exported in the GAF. This SOP was discussed for quite some time. Alternative solutions that we did not like as well: 1) "Do Not Propagate" pruning automatically based on branch length; 2) manually examine the Ref Genome proteins, but do not look at every other protein for other species.
  • Use common sense and keep the big picture of the tree and knowledge about the family in mind (e.g. LON family: propagation of the mito. light strand promoter anti-sense binding annotation to the base of eukaryotes), i.e. we should not always limit ourselves to the bare minimum triangulation. Always include an evidence note when doing so.
  • Closely related genes with opposite annotations: look at the PMIDs and see if they are really contradictory. If so, don't propagate. If not, contact the MODs to correct the annotation.
  • Still do the multiple annotations in cases where we make sourceforge requests for new links in the ontology.
  • Do not propagate GO:0005515 protein binding (will be suppressed from PAINT), GO:0005488 binding, and enzyme binding.
  • We will only propagate children of protein binding when the terms are specific enough to indicate a specific protein family and/or provide useful biological information to the biologist wanting to learn more about this term, i.e. the molecular function is related to the biological process(es) that are annotated in this family.
  • We will propagate small molecule binding terms.
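
The GAF export rule above (NOT + rapid divergence stays in the PAINT GAF but is left out of the GAF sent to the MOD, while a "real" NOT is exported) can be sketched roughly as follows. This is a minimal illustration, not PAINT code: the dictionary layout, the function names, and the reason labels `rapid_divergence` and `missing_residues` are assumptions made for the example, since the notes do not say how the qualifiers are encoded.

```python
# Hypothetical reason labels. The notes name "rapid divergence" and, for the
# LONP1/2 family, a missing_residues qualifier, but not how they are encoded.
RAPID_DIVERGENCE = "rapid_divergence"
MISSING_RESIDUES = "missing_residues"


def is_do_not_propagate(annotation):
    """True for a NOT that only means 'do not propagate into this clade'."""
    return annotation["negated"] and annotation.get("reason") == RAPID_DIVERGENCE


def paint_gaf_rows(annotations):
    """The PAINT GAF retains every line, including do-not-propagate markers."""
    return list(annotations)


def mod_gaf_rows(annotations):
    """The GAF provided to the MOD omits NOT + rapid divergence lines."""
    return [a for a in annotations if not is_do_not_propagate(a)]


# Made-up annotations on made-up clades, for illustration only.
annotations = [
    {"node": "clade_A", "term": "term_X", "negated": False, "reason": None},
    {"node": "clade_B", "term": "term_X", "negated": True, "reason": RAPID_DIVERGENCE},
    {"node": "clade_C", "term": "term_X", "negated": True, "reason": MISSING_RESIDUES},
]

for row in mod_gaf_rows(annotations):
    print(row["node"], "NOT" if row["negated"] else "", row["term"])
```

In this sketch the explicit NOT on `clade_C` reaches the MOD, while the marker on `clade_B` is visible only in the PAINT GAF.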

Misc Notes/Action items/Still pending questions

  • Missing MOD annotation to 'sequence-specific DNA binding', will request this: LON family
  • ser-dependent (parent) -> ATP-dependent peptidase (child): need this link, check up to endopeptidase -> sourceforge item: LON family.
  • Read document of proposal about binding terms
  • Write to Emily: remove ADP binding from the human annotation in the LON family
  • DNA polymerase binding: ask for a new term, DNA polymerase gamma binding, and ask that the human annotation be changed to the new term. LON family
  • Request Sequence Specific RNA Binding as a new term, and request that the annotation be changed. LON family.
  • Request that RGD change rat LONP1 annotation (Q924S5) from peroxisome to mitochondrion

PAINT feature requests/bugs

  • Down the road feature: be able to launch a DAG viewer to see all annotations in context of GO structure
  • Add domain information
  • Radio buttons color coded based on GO aspect
  • Scrolling in MSA view alters the residue number (bug); enable search to go to specific residues
  • Remove GO:0005515 (protein binding) from the list of terms we see in PAINT
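
The request to hide GO:0005515, together with the SOP rule against propagating generic binding terms, amounts to a small filter over candidate terms. The sketch below is only an illustration of that idea: the function names are invented, and because the notes name "enzyme binding" without an identifier, it is matched by label and shown with a placeholder ID.

```python
# IDs taken from the notes: GO:0005515 protein binding, GO:0005488 binding.
BLOCKED_IDS = {"GO:0005515", "GO:0005488"}
# The notes list "enzyme binding" without an ID, so match it by label.
BLOCKED_LABELS = {"enzyme binding"}


def is_propagatable(go_id, label):
    """Return False for the generic binding terms the SOP says not to propagate."""
    return go_id not in BLOCKED_IDS and label.lower() not in BLOCKED_LABELS


def filter_candidates(candidates):
    """Keep only the (go_id, label) pairs a curator may consider propagating."""
    return [(go_id, label) for go_id, label in candidates
            if is_propagatable(go_id, label)]


candidates = [
    ("GO:0005515", "protein binding"),
    ("GO:0005488", "binding"),
    ("GO:PLACEHOLDER", "enzyme binding"),  # placeholder ID, not a real accession
    ("GO:0005524", "ATP binding"),         # small molecule binding: kept
]

kept = filter_candidates(candidates)
print(kept)
```

Children of protein binding that are specific enough to be informative would still need a curator's judgment; a list-based filter like this one only handles the terms that are always suppressed.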

Quick tour for new PAINT users (Li and Mary)

Ed gave a quick tour of the latest version of PAINT.

Review protein families, see: http://wiki.geneontology.org/index.php/GAFs_for_trees-based_annotations

While reviewing protein families, we can generate a list of propagation rules. We can pick up lunch in our cafe, and work through lunch.

LONP1/2

  • Annotate root to 'ATP-dependent peptidase activity' based on experimental annotations spanning multiple species
  • NOT to radA clade: we know that they do not have this activity; use the missing_residues qualifier
  • Scrolled through rest of alignment to identify others that do not have the active site
  • Missing MOD annotation to 'sequence-specific DNA binding', will request this of MOD, and annotate to root
  • Annotate the mito. light strand promoter anti-sense binding annotation to the base of eukaryotes. Based simply on the data, it would go to the human-mouse base, but after some thought about where this function arose, it should go to the base of eukaryotes.
  • See notes in the abstract generated by this family for more details
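
The LONP1/2 decisions above can be pictured on a toy tree: a term annotated at an ancestral node is inherited by every descendant unless a NOT on an intervening clade removes it. The sketch below is a simplified illustration under stated assumptions. The topology and node names are invented and are not the real PANTHER tree, and the `Node` class and `inherited_terms` function are hypothetical helpers rather than anything PAINT provides.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)


def inherited_terms(node, positive, negated, inherited=frozenset()):
    """Map each leaf name to the set of terms it inherits.

    positive: {node name: terms annotated at that node}
    negated:  {node name: terms marked NOT at that node}
    """
    current = (set(inherited) | positive.get(node.name, set())) - negated.get(node.name, set())
    if not node.children:
        return {node.name: current}
    result = {}
    for child in node.children:
        result.update(inherited_terms(child, positive, negated, frozenset(current)))
    return result


PEPTIDASE = "ATP-dependent peptidase activity"
LSP = "mito. light strand promoter anti-sense binding"

# Invented topology: a radA clade and a LON clade below the root.
tree = Node("root", [
    Node("radA_clade", [Node("radA_1"), Node("radA_2")]),
    Node("lon_clade", [
        Node("bacterial_lon"),
        Node("eukaryote_base", [Node("human_LONP1"), Node("mouse_Lonp1")]),
    ]),
])

positive = {"root": {PEPTIDASE}, "eukaryote_base": {LSP}}
negated = {"radA_clade": {PEPTIDASE}}

leaf_terms = inherited_terms(tree, positive, negated)
for leaf, terms in leaf_terms.items():
    print(leaf, sorted(terms))
```

Annotating `eukaryote_base` rather than a human-mouse ancestor is what extends the promoter-binding term to all eukaryotic leaves in this toy example.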


CPS

HPRT

GO annotation camp discussion

The SOP will be presented at GO annotation camp. The LONP family will be used as an illustrative, "easy" example, and perhaps a "hard" one as well, for example PGM5: duplication at the base of vertebrates, a good example of NOT annotation. Advance discussion with Alan Bridge, Compara.

Misc. discussion items

  • (Mike): Is the PANTHER to P-POD OrthoMCL mapping using the most recent data? Can we add InParanoid results soon, too?
  • (Mike): Should we fix the dates in the new GAF files to reflect when the annotations were actually made?
  • (Mike): Could we re-generate the statistics from the GAF files using a script (rather than manually)?
    • Probably easy enough, once we discuss what stats we're interested in capturing (Ed)
  • (Mike): Pascale noticed a problem with the literature linkouts to WormBase, and I just had some trouble with ZFIN
    • Fixed (Ed)
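
On regenerating statistics from the GAF files with a script: since the statistics of interest were still to be agreed, the following is only a sketch of one possibility, counting annotation lines per aspect and evidence code. It assumes tab-separated GAF 2.0 lines with the evidence code in column 7 and the aspect in column 9, and comment lines starting with `!`. The function names and the sample rows are made up for the example.

```python
from collections import Counter

# GAF 2.0 column positions (0-based) used below.
COL_EVIDENCE = 6  # column 7: Evidence Code
COL_ASPECT = 8    # column 9: Aspect (F, P or C)


def gaf_stats(lines):
    """Count annotation lines per (aspect, evidence code) pair."""
    counts = Counter()
    for line in lines:
        if not line.strip() or line.startswith("!"):
            continue  # skip blank lines and GAF header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= COL_ASPECT:
            continue  # skip malformed lines
        counts[(fields[COL_ASPECT], fields[COL_EVIDENCE])] += 1
    return counts


def make_row(aspect, evidence):
    """Build a made-up 15-column GAF line; only two columns carry real values."""
    fields = ["x"] * 15
    fields[COL_EVIDENCE] = evidence
    fields[COL_ASPECT] = aspect
    return "\t".join(fields)


sample = [
    "!gaf-version: 2.0",
    make_row("F", "IDA"),
    make_row("F", "IDA"),
    make_row("C", "ISS"),
]

stats = gaf_stats(sample)
for (aspect, evidence), n in sorted(stats.items()):
    print(aspect, evidence, n)
```

For a real file, an open file handle can be passed directly, for example `gaf_stats(open(path))`, where `path` is wherever the GAF is stored.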

Annotation tracker

Sven, CJM, Seth

  • Current issues/roadblocks
  • Subfamily issue: can we deal?
  • Feedback
  • Can we leverage the annotation reporter to generate the monthly lists in an easier way?

Paper

  • Title: ???
  • Authors: as on this mailing list, and possibly adding CJM, Seth, and Sven if we add a section on the DB and the other GO-top PIs
  • Affiliations: obviously
  • Abstract: Paul?
  • Author Summary: Suzi will take a crack at this
  • Introduction: Pascale&Paul
  • Results: as below, with possible addition of DB section and web interface, although this could be a different paper. Ed and Suzi can write #1, Paul&Mike for #2?
  • Discussion: #3 (Mike & Kara)
  • Materials and Methods: cut and dry, write at the end
  • Acknowledgments: all the curators at the MODs, the grant...
  • References, Figure Legends, Tables: as they fall out from the above.