Reference Genome progress report for 2009

From GO Wiki
Revision as of 11:30, 14 December 2009 by Pascale

Publications/communications

  • Paper has been accepted in PLoS Computational Biology (April 24)
  • A note about the reference genome project was published in the GO news site: http://go.berkeleybop.org/news4go/node/27
  • Pascale presented the Reference Genome project at the Biocurator meeting (Berlin, April 09), the Quest for Orthologs meeting (Hinxton, July 09), and the Dicty meeting (Estes Park, CO, September 09)

Number of families annotated

  • ~ 750 families (424 PANTHER families)
  • ~ 6,000 gene products
  • Annotation rate has held constant at about 2,000 per year


PAINT updates

  • PAINT update: Vbeta15, November 2009
    1. You can now enter and save your comments/evidence for the annotations. This is the feature Pascale, Paul and I discussed. It is nothing more than a simple text box, but cutting and pasting works, so you can enter URLs and whatever else you used as background information.
    2. To make this work, the evidence has to be saved separately from the GAF, so while I was at it I added the ability to save a complete session (tree, gene data, MSA, GAF, and evidence). These are stored as a suite of files, with an additional XML file as an index to the individual files. This means that you can completely restore a session, although you will still need a connection to get the GO term file and the most current GO annotations. But it does round trip.
    3. The file menu now works as follows:
      • "Open from database ...": just as before, loads the family from the PANTHER db, and the GO annotations too.
      • "Open from files ...": restores tree, gene data, MSA, GAF, and evidence from the local file system.
      • "Save annotations ...": records tree, gene data, MSA, GAF, and evidence to the local file system.
      • "Restore annotations ...": to be used in conjunction with the "Open from database ..." option. It lets you first retrieve the tree from the PANTHER db and then overlay your existing locally saved GAF and evidence onto that tree.
      • "Export ...": saves a stripped-down version of the GAF to the local file system. These are the files that can be delivered to the MODs.
    4. Appearance: the color scheme was getting far too overloaded, so I tried to simplify it. Here is how it now works:
      • Shapes are used to indicate the state/type of node:
        • speciation nodes: circles (as before)
        • duplication nodes: squares
        • rerooted: triangle (as before)
        • collapsed: vertical rectangle
        • subfamily: diamond (as before)
      • Selection is now "pink", no stars, just lines.
      • Colors are strictly reserved for "painting": as you add more annotation the nodes become more colored (it starts out just black and white with a sprinkling of deep red):
        • deep red: experimental annotations for the node/gene
        • yellow-orange: direct annotation added by one of you
        • dark blue: inferred annotation


  • April 09:
    1. PAINT no longer relies on a local GO database installation (or Eclipse)
    2. the node annotation panel layout has been simplified somewhat. There are a couple of speed-ups to the code as well.
  • May 09:
    1. The ability to select terms that are not on the list of annotations (i.e., parent terms). Annotations are sometimes more granular in one organism than in another, and in that case a shared parent term could be chosen. The software must allow this for us to be able to annotate.
    2. Add links to AmiGO, QuickGO or OBO-Edit to visualize the ontology easily.
    3. Clicking on a GO term should highlight, on the tree, all proteins (or collapsed nodes) in the clade
    4. Exporting annotations
  • June 09
    1. The main change is that the 'NOT' information is now sent from the server to the client. You should now see this information populated in the annotation panel when you click a node that has been annotated with a 'NOT' qualifier.
  • August 09
    1. Allows curators to annotate to parents of terms with annotations.
    2. PAINT now has flexible docking, proper GAF files, and a proper GO hierarchy.
    3. Some degree of speedup thanks to threading.
    4. Annotation by dragging terms from the term tree onto the gene info.
    5. Highlighting of nodes that share annotations to terms.
    6. Negation and qualifiers.
    7. And more, including some known bugs.
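
Annotating to a parent term, as described in the May and August updates above, amounts to collecting the ancestors of an annotated term in the ontology graph and picking one that the annotations have in common. The following is a minimal Python sketch under stated assumptions: the `PARENTS` table, the term IDs, and both function names are hypothetical placeholders for illustration, not PAINT's data structures or real GO content.

```python
from collections import deque

# Hypothetical toy ontology: child term -> list of direct parent terms.
# These IDs are placeholders, not real GO identifiers.
PARENTS = {
    "T:leaf_a": ["T:mid"],
    "T:leaf_b": ["T:mid", "T:other"],
    "T:mid": ["T:root"],
    "T:other": ["T:root"],
    "T:root": [],
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable from `term` by following parent links."""
    seen = set()
    queue = deque(parents.get(term, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        queue.extend(parents.get(current, []))
    return seen


def shared_parent_terms(term_a, term_b, parents=PARENTS):
    """Terms that cover both annotations: each term itself plus its ancestors."""
    a = ancestors(term_a, parents) | {term_a}
    b = ancestors(term_b, parents) | {term_b}
    return a & b
```

With two genes annotated to `T:leaf_a` and `T:leaf_b`, `shared_parent_terms` returns the candidate terms a curator could choose for their common ancestor node. A real ontology has more relationship types than a single parent link, which this sketch ignores.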

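The session save described in the Vbeta15 update above (a suite of files plus an XML file that indexes them) can be illustrated with a short round-trip sketch. Everything specific here is an assumption made for illustration: the element and attribute names, the file naming scheme, and the functions `save_session` and `load_session` are hypothetical and do not reflect PAINT's actual file format.

```python
import os
import xml.etree.ElementTree as ET

# Component names follow the list in the update note; the spelling is ours.
COMPONENTS = ["tree", "gene_data", "msa", "gaf", "evidence"]


def save_session(directory, family_id, contents):
    """Write one file per component plus an XML index that lists them."""
    root = ET.Element("session", {"family": family_id})
    for name in COMPONENTS:
        filename = f"{family_id}.{name}.txt"
        path = os.path.join(directory, filename)
        with open(path, "w", encoding="utf-8") as fh:
            fh.write(contents.get(name, ""))
        # The index stores paths relative to its own location.
        ET.SubElement(root, "file", {"type": name, "path": filename})
    index_path = os.path.join(directory, f"{family_id}.index.xml")
    ET.ElementTree(root).write(index_path, encoding="utf-8", xml_declaration=True)
    return index_path


def load_session(index_path):
    """Read the index and restore each component's text from disk."""
    directory = os.path.dirname(index_path)
    root = ET.parse(index_path).getroot()
    restored = {}
    for entry in root.findall("file"):
        path = os.path.join(directory, entry.get("path"))
        with open(path, encoding="utf-8") as fh:
            restored[entry.get("type")] = fh.read()
    return restored
```

Keeping the evidence in its own file, referenced from the index, is what lets free-text notes travel with the session without changing the GAF itself.
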
PAINT-based annotations

  • We have started to annotate families using the PAINT tool. The data are available here for review by curators and integration into their respective databases: GAFs_for_trees-based_annotations

Visualising PAINT annotations with the GONUTS wiki

GONUTS GOsummary extension pulls all gene associations for a gene into the summary table and graph, not just those added from the PAINT GAF.

  • Currently IEA is excluded from the summary table and graph. Should other evidence codes be excluded? [Mary's comment: the refG graphs display only experimental evidence code annotations (EXP, IDA, IPI, IMP, IGI, IEP); IC; and the ISS codes (ISS, ISO, ISA, ISM), which are displayed (and labeled as "ISS_only") only if there is no experimental annotation to the term.]
  • ISS to an ancestor node is now displayed as ISS-An in the table.
  • How should qualifiers be handled? From the electronic jamborees, it was important to display NOT in the table. What about other qualifiers?
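
The display rule in Mary's comment above can be sketched as a small filter. This is an illustration of that rule as literally stated, not the GONUTS or refG graph code: the function name `displayed_labels` and the input shape (pairs of term and evidence code) are assumptions, and so is showing IC alongside "ISS_only" when a term has both but no experimental annotation.

```python
# Evidence code groups as listed in the comment above.
EXPERIMENTAL = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}
ISS_CODES = {"ISS", "ISO", "ISA", "ISM"}


def displayed_labels(annotations):
    """Map each term to the labels that would be displayed for it.

    `annotations` is an iterable of (term, evidence_code) pairs.
    Terms with nothing displayable (for example IEA only) are omitted.
    """
    by_term = {}
    for term, code in annotations:
        by_term.setdefault(term, set()).add(code)

    shown = {}
    for term, codes in by_term.items():
        # Experimental codes and IC are always displayed.
        labels = sorted(codes & (EXPERIMENTAL | {"IC"}))
        # ISS-type codes appear only when the term has no experimental annotation.
        if (codes & ISS_CODES) and not (codes & EXPERIMENTAL):
            labels.append("ISS_only")
        if labels:
            shown[term] = labels
    return shown
```

Codes outside the two groups and IC, such as IEA, simply fall through, which matches the current exclusion of IEA from the table and graph.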


Examples

  • To show only the graph, enter this line on a category page.
<GOsummary>notable</GOsummary>
  • Alternatively, you can show only the table with:
<GOsummary>nograph</GOsummary>

PPOD update

From Kara, July 24, 2009: We have just released new PPOD families based on the new protein sets assembled by Paul Thomas' group. Note that this time around we have also run InParanoid/MultiParanoid, in addition to OrthoMCL. We are in contact with Mary about regenerating her GO graphs for the new families, and we will add back those links when the graphs are available. One interface change to note: the Functional Conservation info is now on a separate page, rather than at the bottom of the family page as it was previously.

Electronic annotation jamboree: ETFDH and SLC11A1, SLC11A2

Electronic_jamboree_july28-2009