AmiGO Big Ideas

From GO Wiki

Please submit your Big Ideas for AmiGO future directions, add-ons, and extensions here!

Visualizing Annotations To A Term

The current method for visualizing annotations to a term and its children is a pie chart. I think there are some issues with this.

  • Gene products can be annotated to more than one of the children of a term, so the totals in a pie chart can add up to more than 100%.
  • The pie chart appears to represent only direct annotations, and doesn't take into account annotations to child terms. This means the total can add up to less than 100%; e.g. have a look at the pie chart for plastid ; GO:0009536. 'Plastid part' has 0 gps / annots, even though in the tree view, it has 1002 annotations.
    • Update: this is a bug in AmiGO
  • There is also a problem if a term has is-a and part-of children, e.g.
     GO:0031012 : extracellular matrix [586]  
    [i] GO:0048196 : extracellular matrix (sensu Magnoliophyta) [7]
    [i] GO:0005578 : extracellular matrix (sensu Metazoa) [552]
    [p] GO:0044420 : extracellular matrix part [291]
    

    It doesn't make sense to have the subclasses mixed up with the parts.

  • The total at the bottom of the table is the number of annotations, but the table talks about the number of gene products annotated to a node or its children. That is super-confusing. I am not sure whether it would make more sense for the pie chart to represent the number of annotations rather than gene products.
  • The pie chart is titled as being the distribution of annotations below a term, but it's actually the distribution to the term and its children.

I think we need to decide what we want to represent; should it be annotations or should it be GPs?
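
To make the "more than 100%" problem concrete, here is a minimal sketch in Python. The child terms and gene product IDs are made up for illustration and are not real AmiGO data; the point is only that counting gene products per child double-counts any gene product annotated to several children.

```python
# Hypothetical toy data: child term -> set of gene products annotated to it
# (directly or via its own descendants). None of these IDs are real.
child_gps = {
    "child_a": {"gp1", "gp2", "gp3"},
    "child_b": {"gp2", "gp3", "gp4"},
    "child_c": {"gp4"},
}


def slice_percentages(child_gps):
    """Share of the distinct gene products under the parent that fall in each child."""
    all_gps = set().union(*child_gps.values())
    return {
        child: 100.0 * len(gps) / len(all_gps)
        for child, gps in child_gps.items()
    }


percentages = slice_percentages(child_gps)
total = sum(percentages.values())
print(percentages)
print(f"sum of slices: {total:.0f}%")
```

With four distinct gene products, the slices come to 75%, 75% and 25%, so a pie chart built from them cannot be drawn honestly. Counting annotations instead of gene products would avoid the overlap, at the cost of answering a different question.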

Visualization Ideas for the Reference Genomes

How should we display reference genome data?

Ideally we would have a method of visualising the terms to which a single gene product is annotated, which would be extensible and could be used for multiple gene products. With a single gene product, one simple visualisation would be to show a DAG with the terms to which the gene product is annotated shaded in. Unfortunately, that isn't all that scalable.

There are twelve reference genomes, and something that is familiar to most people is a clock face, so I was wondering if that kind of representation could be used. Perhaps each term could have a pie with twelve slices attached to it, with each slice representing one of the genomes. If there was a direct annotation to the node, the slice would be shaded; if it was to a child node, perhaps it could be shaded a different colour. No shading would mean there was no annotation to the term or its children for the gene product from that species. Alternatively, instead of a pie, you could have dots at each of the 'hour' positions, with a similar scheme of dark shading for direct annotation and light for an annotation to a child term.

Circles.png
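
As a rough sketch of the data behind such a clock face, the Python below works out the shading state of each of the twelve positions for a given term. The ontology, genome names and annotations are all placeholders, and "child" is taken here to mean any descendant of the term, which is an assumption rather than something decided above.

```python
# Placeholder names for the twelve reference genomes, in clock order
GENOMES = [f"genome_{i}" for i in range(1, 13)]

# Hypothetical toy ontology: term -> list of child terms
children = {
    "root": ["mid"],
    "mid": ["leaf"],
    "leaf": [],
}

# Hypothetical direct annotations for one gene product: genome -> set of terms
direct = {
    "genome_1": {"mid"},
    "genome_2": {"leaf"},
}


def descendants(term):
    """All terms below `term` in the toy ontology."""
    found = set()
    stack = list(children.get(term, []))
    while stack:
        t = stack.pop()
        if t not in found:
            found.add(t)
            stack.extend(children.get(t, []))
    return found


def clock_face(term):
    """One shading state per genome: 'direct', 'child' or 'none'."""
    below = descendants(term)
    face = []
    for genome in GENOMES:
        terms = direct.get(genome, set())
        if term in terms:
            face.append("direct")   # dark shading
        elif terms & below:
            face.append("child")    # light shading
        else:
            face.append("none")     # no shading
    return face


print(clock_face("mid"))
```

A renderer would then only need to map the three states to colours, whether it draws slices or dots.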

There's a rather snazzy Periodic Table of Visualisation Methods at http://www.visual-literacy.org/periodic_table/periodic_table.html . A layer chart might be a suitable representation - the base layer would be the tree, and you could add layers on top for different species.

GO Slims

A few ideas involving GO slims:

  • rather than requiring the terms in a GO slim to be true GO terms, let them be "artificial" terms which can represent boolean concepts, e.g. 'binding NOT carbohydrate binding', 'glucose metabolism AND sucrose metabolism', '(membrane OR organelle) NOT cytoskeleton'. This is similar to the existing map2slim bucket terms but more flexible.
  • have a server-based GO slim script where the user can choose the annotation data set and upload their custom GO slim set. A downloadable version would let people use it with their own data sets.
  • some kind of exciting AmiGO-esque viewer for people to see the results of the above ideas. The GO browser could just show the GO slim tree structure and annotation numbers. The numbers could then be expanded into a list of gene products with links to the relevant AmiGO pages.
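
One way to picture the "artificial" boolean slim terms from the first idea is as set operations over the gene products annotated to each real term. The Python below is only a sketch under that assumption: the gene product IDs are invented, and the `annotated` table is assumed to already include annotations inherited from descendant terms. It is not how map2slim works.

```python
# Hypothetical toy data: GO term name -> set of gene products annotated to it
# or to any of its descendants. The gene product IDs are made up.
annotated = {
    "binding": {"gp1", "gp2", "gp3"},
    "carbohydrate binding": {"gp2"},
    "membrane": {"gp4", "gp5"},
    "organelle": {"gp5", "gp6"},
    "cytoskeleton": {"gp6"},
}

# Each artificial slim term is a rule mapping the annotation table
# to the set of gene products that belong in that bucket.
slim_terms = {
    "binding NOT carbohydrate binding":
        lambda a: a["binding"] - a["carbohydrate binding"],
    "(membrane OR organelle) NOT cytoskeleton":
        lambda a: (a["membrane"] | a["organelle"]) - a["cytoskeleton"],
}

buckets = {name: rule(annotated) for name, rule in slim_terms.items()}

for name, gps in buckets.items():
    print(f"{name}: {len(gps)} gene products")
```

The per-bucket counts are the numbers a slim viewer could show, and each set is the list of gene products the number would expand into.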


Taxonomy Browser

A taxonomic tree browser linked to the AmiGO annotation data set so that you could view the number of annotations made to each species, genus, family, kingdom, etc.

This would need to integrate the annotation data from AmiGO with a taxonomy browser, e.g. that at NCBI or NEWT.
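
The core of such a browser would be rolling annotation counts up the taxonomic tree. A minimal Python sketch follows; the tree is a hand-written fragment and the counts are invented, so nothing here reflects real AmiGO, NCBI or NEWT data or APIs.

```python
# Hypothetical fragment of a taxonomy: taxon -> parent taxon (None for the root)
parent = {
    "Insecta": None,
    "Apis": "Insecta",
    "Apis mellifera": "Apis",
    "Drosophila": "Insecta",
    "Drosophila melanogaster": "Drosophila",
}

# Made-up numbers of annotations attached directly to each species
direct_counts = {"Apis mellifera": 120, "Drosophila melanogaster": 900}


def rolled_up_counts(parent, direct_counts):
    """Annotations at each taxon, including those made to taxa below it."""
    totals = {taxon: 0 for taxon in parent}
    for taxon, count in direct_counts.items():
        node = taxon
        while node is not None:   # walk from the species up to the root
            totals[node] += count
            node = parent[node]
    return totals


totals = rolled_up_counts(parent, direct_counts)
for taxon, count in totals.items():
    print(f"{taxon}: {count}")
```

Each node in the tree view could then display its rolled-up total, with the direct count shown separately where the two differ.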

AmiGO as an environment

Here's a little rambling scenario to illustrate some possible future ideas of AmiGO as an all-encompassing Web 2.0 environment.

I'm exploring the process hierarchy; I have several nodes open; I have to leave for the day, so I choose to save my subset as a slim. I export it as an obo file, but I also save it in the AmiGO workspace. I mark it as "shared/public" and I "tag" it with some keywords - say, "behavior", "insect", "royal jelly" and "caste", presumably reflecting the content of the slim.

The next day another researcher is browsing PubMed through the AmiGO publications portal. She finds an interesting paper on the bee genome (perhaps because another user has tagged it, Connotea-style). The paper describes a term enrichment analysis on bee ESTs. The analysis was deposited in AmiGO, so she can rerun it on the original dataset at the touch of a button. She can also apply the same analysis to the existing, more up-to-date bee genes in AmiGO. She tags the resulting set of enriched terms as a slim. She then comes across the slim I saved and marked as shared. Finding it interesting, she merges it into a new slim which she saves and tags. After that, she uploads her own dataset - an annotated set of ant genes. Then, using the AmiGO taxonomy browser, she queries AmiGO for all insect genes, merges this with her current set, then does a map2slim against the merged slim she created earlier.

Pleased with the results, she saves the resulting analysis, marks it "private", but adds read-rights for her group, LiveJournal style, so her collaborators can log on and see it for themselves.



AmiGO workflow interface

If users are going to be creating their own AmiGO workflows, it might be good to use an interface like that of Apple's Automator application, with 'inputs' and 'outputs' and transformations. See Automator for more info and pics of the interface.
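
Underneath an Automator-style interface, a workflow could be as simple as a list of steps where each step's output is the next step's input. The Python below is a sketch of that idea only; the step names and the gene records are hypothetical and do not correspond to any existing AmiGO function.

```python
from functools import reduce


def filter_by_taxon(taxon):
    """Build a step that keeps only the genes from one taxon."""
    def step(genes):
        return [g for g in genes if g["taxon"] == taxon]
    return step


def extract_terms(genes):
    """Step that turns a list of genes into a sorted list of their distinct terms."""
    return sorted({term for g in genes for term in g["terms"]})


def run_workflow(steps, data):
    """Feed `data` through each step in order, passing outputs along as inputs."""
    return reduce(lambda value, step: step(value), steps, data)


# Made-up input records
genes = [
    {"id": "gp1", "taxon": "bee", "terms": ["behavior", "caste"]},
    {"id": "gp2", "taxon": "ant", "terms": ["behavior"]},
    {"id": "gp3", "taxon": "bee", "terms": ["caste"]},
]

workflow = [filter_by_taxon("bee"), extract_terms]
result = run_workflow(workflow, genes)
print(result)
```

Because a saved workflow is just the ordered list of steps, it could be stored, shared and rerun on a different input in the way the scenario above describes.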