See this document

Evidence Code Hierarchy

It has been suggested that the evidence codes do not have a hierarchy. We disagree. There should be no argument that an experiment-backed annotation is better than any annotation based on a prediction. Therefore, any of the experimental evidence codes carries more weight than the others. If there were no hierarchy, then all IEA-based annotations should be displayed in AmiGO.

Different rules for Evidence code usage based on database status

It is currently the practice that certain ways of using evidence codes are “relaxed” if the annotations are to an organism not part of the Reference Genome Project (RGP).

  1. ISS: the object in the “with” field must come from an organism in which an experiment backing the annotation has been performed. This is not a requirement for non-RGP databases.
  2. The RGP groups will refrain from using TAS and NAS.

The value of relaxing these rules should be questioned. Users not familiar with the distinction between RGP and non-RGP groups will most likely not notice this difference. We should make usage rules that do not vary with database status.

The utility of TAS and NAS

There currently appear to be three classes of evidence codes (ND being a special case):

  1. Experimental: IDA, IPI, IMP, IEP, and IGI
  2. Predictive: ISS, IEA, RCA, IGC
  3. TAS and NAS
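The grouping above can be sketched as a small lookup table. This is only an illustration of the three classes as listed here: the class names ("experimental", "predictive", "statement") and the helper functions `classify` and `outranks` are our own labels, not part of any GO tool. Codes that are not in the list (IC, for example) are rejected rather than guessed at.

```python
# Evidence-code classes exactly as listed above. The class names and the
# helper functions are illustrative assumptions, not part of any GO tooling.
EVIDENCE_CLASSES = {
    "experimental": {"IDA", "IPI", "IMP", "IEP", "IGI"},
    "predictive": {"ISS", "IEA", "RCA", "IGC"},
    "statement": {"TAS", "NAS"},
}


def classify(code: str) -> str:
    """Return the class of an evidence code, or 'special' for ND."""
    if code == "ND":
        return "special"
    for name, codes in EVIDENCE_CLASSES.items():
        if code in codes:
            return name
    raise ValueError(f"unknown evidence code: {code}")


def outranks(a: str, b: str) -> bool:
    """True only when `a` is an experimental code and `b` is not.

    This encodes just the one ordering argued for in this document:
    experimental evidence is better than everything else.
    """
    return classify(a) == "experimental" and classify(b) != "experimental"
```

With this sketch, `outranks("IMP", "IEA")` is true, while two predictive codes such as ISS and IEA are left unordered relative to each other.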

Since the RGP groups will not use the third group, one has to ask why. If the assertion is “traceable”, then it should be quite easy to obtain the original experimental evidence. If the assertion is “non-traceable”, it is really no better than hearsay. TAS appears to be merely a way to save time. One can see its use when a review article is being cited. However, it is difficult to justify the use of such literature for the annotation of a specific gene product if published experimental evidence actually exists. It may not be clear at all that the statement made in the review article reflects the actual status of the specific gene product in question (organism, isoform, etc.).

A proposal is being discussed to allow the use of the WITH field with certain NAS annotations. If NAS actually means non-traceable, then there is no rationale for putting anything in the WITH field to accompany it: it is non-traceable. This also suggests that any IC annotation whose WITH field contains a GO_ID supported by a NAS annotation is not valuable.

We might want to seriously think about dropping the use of TAS and NAS altogether. If an annotation cannot be backed up by either an experiment, or a predictive method, then how can any annotation be made at all?

When to use IEA vs ISS

The GO should restrict the ISS code to situations where a comparison is made to a sequence from a source that has had an experiment done to back up the annotation. One can think of that situation as a computational method (the sequence alignment) plus an experiment. What do we do with situations such as sequence plus structural analysis, for example programs that predict snoRNAs and tRNAs? These predict that a particular sequence may have certain functions because it looks like something; the analysis appears to be based solely on computation. Such cases seem more like IEA or RCA.
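The proposed ISS restriction could be checked mechanically, along the following lines. This is a minimal sketch under stated assumptions: the `Annotation` record and the function `iss_is_supported` are hypothetical, and "an experiment done to back up the annotation" is read narrowly as an experimental annotation of the WITH object to exactly the same GO term. A real check might also want to accept more specific terms.

```python
from dataclasses import dataclass

# Experimental codes as grouped in this document.
EXPERIMENTAL = {"IDA", "IPI", "IMP", "IEP", "IGI"}


@dataclass(frozen=True)
class Annotation:
    """Hypothetical, simplified annotation record."""
    gene_product: str      # identifier of the annotated object
    go_id: str             # GO term the object is annotated to
    evidence: str          # evidence code, e.g. "ISS"
    with_field: str = ""   # identifier named in the WITH field, if any


def iss_is_supported(ann, all_annotations):
    """Check that an ISS annotation's WITH object has experimental backing.

    Assumption: backing means an experimental annotation of the WITH
    object to exactly the same GO term as the ISS annotation.
    """
    if ann.evidence != "ISS":
        raise ValueError("only ISS annotations are checked here")
    return any(
        other.gene_product == ann.with_field
        and other.go_id == ann.go_id
        and other.evidence in EXPERIMENTAL
        for other in all_annotations
    )
```

An ISS annotation with an empty WITH field, or one pointing at an object that itself carries only predictive annotations, fails this check regardless of which database produced it.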

The WITH field

The WITH/Inferred field always tries to answer the question WHAT? Based on sequence similarity to WHAT? Based on WHAT mutant phenotype? Based on interaction with WHAT gene? Based on WHAT kind of electronic method (reviewed or not reviewed)?

Judy, David, Harold