GO Usage FAQ


Warning: The contents of this FAQ have been moved to the new website.

How is the GO used in genome analysis?

Genome and full-length cDNA sequencing projects often include computational (putative) assignments of molecular function based on sequence similarity to annotated genes or sequences. A common approach is to search computationally for a SWISS-PROT sequence whose similarity to the gene model exceeds a chosen threshold. The GO terms associated with that SWISS-PROT sequence can then be retrieved and associated with the gene model. Under the GO guidelines, the evidence code for such an annotation is 'inferred from electronic annotation' (IEA).
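The transfer step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Hit` record, the `transfer_annotations` function, the 40% identity threshold, and the accession names are all hypothetical, and in practice the hits would come from a similarity search tool and the GO associations from an annotation file.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One similarity-search result (hypothetical record layout)."""
    gene_model: str   # identifier of the newly predicted gene model
    accession: str    # identifier of the matched SWISS-PROT sequence
    identity: float   # percent sequence identity of the match


def transfer_annotations(hits, go_by_accession, min_identity=40.0):
    """Copy GO terms from matched sequences to gene models, tagged IEA.

    The threshold is an arbitrary example value, not a GO recommendation.
    """
    annotations = set()
    for hit in hits:
        if hit.identity < min_identity:
            continue  # match too weak to justify transferring annotations
        for go_id in go_by_accession.get(hit.accession, ()):
            annotations.add((hit.gene_model, go_id, "IEA"))
    return sorted(annotations)


# Made-up example data; accessions and pairings are placeholders.
hits = [
    Hit("gene_model_1", "SPROT_A", 62.5),
    Hit("gene_model_2", "SPROT_B", 25.0),
]
go_by_accession = {
    "SPROT_A": ["GO:0004396", "GO:0006096"],
    "SPROT_B": ["GO:0005975"],
}

result = transfer_annotations(hits, go_by_accession)
for gene_model, go_id, evidence in result:
    print(gene_model, go_id, evidence)
```

Only `gene_model_1` receives annotations here, because the match for `gene_model_2` falls below the assumed threshold.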

One benefit of using the GO to annotate large data sets is the ability to group gene products under a high-level term. For example, while gene products may be precisely annotated as having a role in a particular function in carbohydrate metabolism (e.g., glucose catabolism), in the summary documentation of the data set all gene products functioning in carbohydrate metabolism could be grouped together as being involved in the more general phenomenon 'carbohydrate metabolism'. Various sets of GO terms have been used to summarize experimental data sets in this way. The expectation is that published sets of high-level GO terms used in genome annotations and publications will be archived at the GO site. Some of these 'GO slims' are already available.
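As a rough illustration of this grouping, the sketch below maps each precise annotation up to any high-level term it descends from. It assumes a deliberately simplified fragment of the ontology (intermediate terms are omitted) and a one-term slim; the function names `ancestors` and `summarize`, the gene names, and the data are hypothetical.

```python
def ancestors(term, parents):
    """Return every term reachable from `term` by following parent links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def summarize(annotations, parents, slim_terms):
    """Group genes under each slim term that their annotations roll up to."""
    summary = {}
    for gene, terms in annotations.items():
        for term in terms:
            # A term counts toward a slim term if it is that term or a descendant.
            for slim in (ancestors(term, parents) | {term}) & slim_terms:
                summary.setdefault(slim, set()).add(gene)
    return {slim: sorted(genes) for slim, genes in summary.items()}


# Toy child -> parents fragment, simplified for illustration only.
parents = {
    "GO:0006007": ["GO:0006006"],  # glucose catabolic process
    "GO:0006006": ["GO:0005975"],  # glucose metabolic process
    "GO:0005975": ["GO:0008152"],  # carbohydrate metabolic process
}
slim_terms = {"GO:0005975"}        # assumed one-term slim

annotations = {
    "gene_a": ["GO:0006007"],
    "gene_b": ["GO:0006006"],
    "gene_c": ["GO:0008152"],      # too general to fall under the slim term
}

summary = summarize(annotations, parents, slim_terms)
print(summary)
```

Here `gene_a` and `gene_b` are both reported under carbohydrate metabolism, while `gene_c`, annotated only to the more general parent, is not assigned to the slim term.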

What are all the possible uses of GO?

It would be impossible to list all the potential applications of GO, but applications for which GO has already been used include the following:

  • integrating proteomic information from different organisms;
  • assigning functions to protein domains;
  • finding functional similarities in genes that are overexpressed or underexpressed in diseases and during ageing;
  • predicting the likelihood that a particular gene is involved in diseases that haven't yet been mapped to specific genes;
  • analysing groups of genes that are co-expressed during development;
  • developing automated ways of deriving information about gene function from the literature;
  • verifying models of genetic, metabolic and product interaction networks.

For references to these and other studies that have used GO, see the GO Publications.