Evidence codes documentation - to do

From GO Wiki


Since I'm due on September 20th and will be going on maternity leave shortly before the GO meeting, Mike asked me to send these remaining items to finish up the Evidence Code documentation directly to the list to at least get the discussion started. Some issues may need to be discussed at the GO meeting as well.

This email will contain some responses to Midori's last email and a few other email comments to resolve some minor comments on the current draft of the new Evidence Code documentation. I will send separate emails to deal with each of these specific issues:

1. Restriction that all unknowns MUST use ND

2. IMP vs IGI for single gene mutations, regardless of gene being annotated

3. How to put program or method names in the with column for ISS

4. Scope of the RCA evidence code

For both issues 2 and 4, I think that the recommendations I've made will help make it possible to create a decision tree/flowchart that is fairly simple and clear. I'll send a very rough draft of a flowchart separately as well.

Note that for both #s 3 and 4, I have put some supplemental info into html docs in my personal space. I did not spend much time doing html formatting for these docs, on the thought that people might prefer to move them to the GOC wiki. However, as the Evidence Code Committee was not designated as a Working Group, I have no idea where to put them within the wiki structure. If a spot is designated for them, they can be moved to the wiki.


Responses and comments on things in red on this page:


1. GO_REF documentation

> We should have documentation that explains GO_REFs and links to it when we refer to them.

Midori (15 Jun 2007):

 Links can go to the existing GO References page:
 I can write up a description (which will be brief; there's not an
 enormous amount to say) and give it to Amelia to be added to the blurb
 at the top of this page. The plain text file from which the web page
 is generated contains a brief description of the format, which could
 be HTMLified and also added to the blurb if it would be useful.

Karen (9 Sept 2007):

 Please do. It would also be good if the page for the GO_REFs is made
 easier to find in general in our documentation.

2. ChEBI IDs in with field?

> Do we allow things like ChEBI IDs in the with field?

Midori (15 Jun 2007):

 I would say yes.

Karen (9 Sept 2007):

 Perhaps we should make this a quick agenda item for the next GO
 meeting, so that people can ratify this face to face, unless we get an
 overwhelming response via email to proceed with allowing this new ID
 for the with field.

3. IMP examples

> any more positive examples for IMP?, e.g. phenotypic similarity

Midori (15 Jun 2007):

 Dredged up from email from January 2002 ...
 Erich Schwarz needed to know which code to use for "other mutations
 sharing a complex mutant phenotype syndrome with [a well-characterized
 mutant]." My comment at the time was: "The situation you've described
 is IMP, not IGI, because (if I understand correctly) you're looking at
 one mutation at a time. Comparing the phenotype of one mutation to
 that of another helps you interpret the meaning, but is not a kind of
 genetic interaction."
 I think this still holds. Erich provided some details of an example,
 which I can forward if you want.

Karen (9 Sept 2007):

 We can certainly include it, the more examples the better in my
 opinion, but don't send it to me. I'll be going on maternity leave
 soon and don't want to be responsible for this getting added.

4. use of with field for NAS

> The Evidence Code Committee discussed the idea of making GO annotations from Reactome entries. ... What does the full group feel about the idea of allowing the ID for a database record, when such exist, in the with field?

Midori (15 Jun 2007):

 I'm all for including annotations based on Reactome entries -- they
 have a well-developed curation system that deeply involves expert
 biologists, so the statements in their records are very reliable.
 I am not in favor of putting the Reactome ID in the with field for
 these annotations, however, because the Reactome entry does not modify
 or supplement the evidence; rather, the entry provides the
 evidence. GO would effectively be using a Reactome record as a source
 of information about a gene product, so it would make much more sense
 to put the Reactome ID in the reference field.
 For the more general database record case, it may be that I don't
 sufficiently understand what might go in a GO_REF (or equivalent), so
 I don't understand the rationale for allowing 'with' for NAS.
 For the case where the author infers one thing from another, using a
 GO ID in 'with' makes more sense, but I think it's not really
 necessary because the author (presumably) hasn't actually made any GO
 annotations, and hasn't stated observations or conclusions in terms
 of, well, GO terms. (Perhaps this will change some day!) Also, note
 that we have expressly disallowed the use of 'with' for NAS, so the
 script would have to be changed if the use of with-for-NAS is agreed.

Karen (9 Sept 2007):

 Regarding the idea of allowing Reactome IDs in the with field, the
 thought was that it provided the specific information about which
 record in Reactome made the statement, but the idea was
 controversial even within the Evidence Code Committee.
 Regarding the idea of allowing GOids for NAS, I think you bring up a
 good point that this may not make sense since the author has typically
 not stated their statement in terms of a GOid from which an inference
 was made. Allowing this may just be more confusing than helpful,
 especially since deciding which GOid to put in the with field will
 almost always be a curator judgement.
 However, I wasn't one of the proponents of this idea, so those who
 are may wish to defend it.
 In any case, rather than adding yet another usage of the with column
 that is potentially confusing to users, I could personally just go
 with not allowing use of the with column at all for NAS.
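
As an illustration of the rule discussed in this item, here is a minimal sketch of a check that flags a NAS annotation whose with column is not empty. It is not the actual checking script Midori refers to; the function name, the message text, and the placeholder ID `Reactome:example_ID` are all assumptions made for the example.

```python
# Hypothetical check, not the real GO checking script:
# flag annotations that use the with column together with NAS.

def check_with_for_nas(evidence_code, with_field):
    """Return an error message if a NAS annotation has a with value, else None."""
    if evidence_code == "NAS" and with_field.strip():
        return "with column must be empty for NAS"
    return None


# 'Reactome:example_ID' is a made-up placeholder, not a real identifier.
print(check_with_for_nas("NAS", "Reactome:example_ID"))  # with column must be empty for NAS
print(check_with_for_nas("NAS", ""))                     # None
print(check_with_for_nas("IPI", "UniProt:P33897"))       # None
```

If the group did agree to allow with-for-NAS, a check of this kind is the part that would need to be relaxed.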

5. Representation of examples for with/from:

Susan (14 Jun 2007):

 IPI examples
 Looks good, but there is something odd about the IPI example,
 assuming I am looking at the latest version ok.
 Firstly, the paper is about mouse proteins not Drosophila so could we
 change FB to MGI please. Also, I am confused as to why there are three
 lines shown - MGI just lists the middle one:
 FB:gene_1_ID    Abcd3    GO:0005515    PMID:10551832    IPI    UniProt:protein_2_ID    ...
 FB:gene_1_ID    Abcd3    GO:0005515    PMID:10551832    IPI    UniProt:protein_2_ID|UniProt:protein_3_ID    ...
 FB:gene_1_ID    Abcd3    GO:0005515    PMID:10551832    IPI    FB:gene_2_ID
 So unless I'm missing something I suggest we lose the extra lines and
 have either:
 MGI:1349216    Abcd3    GO:0005515    PMID:10551832    IPI    UniProt:P33897|UniProt:Q61285
 MGI:gene_1_ID    Abcd3    GO:0005515    PMID:10551832    IPI    UniProt:protein_2_ID|UniProt:protein_3_ID
 I'd prefer to include the real identifiers so it isn't a mix of 'real'
 and 'example'.
 Similarly there seems to be a mix of FB and SGD db identifiers in the
 IGI examples. A possible alternative for IGI is:
 In PMID:9043060, flies simultaneously mutant for three genes: klingon
 (klg), sevenless (sev) and Son of sevenless (Sos) are used to show that
 klingon participates in R7 photoreceptor fate commitment. This leads to
 the annotation:
 FB:FBgn0017590    klg    GO:0045466    PMID:9043060    IGI    FB:FBgn0003366|FB:FBgn0001965

Karen (9 Sept 2007):

 I'm all for real examples, but I don't have time to dig them up for
 every evidence code. Perhaps we could distribute this task around, so
 that we have multiple real examples for each evidence code. It would
 be good to have at least one example with one entry in the with
 column, as well as the one with multiple. It would also be good if
 they showed various IDs in the with field.
 This would be a reasonable task if there was one person for each
 evidence code to find some real examples, and then hopefully it would
 be easy for Amelia to put them in the right format if she was given
 all the specific info that should be in the table.
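
To make the layout of these example lines concrete, the sketch below splits one of Susan's lines into its six columns and breaks the with column on `|`. It assumes the simplified column order used in the examples above (object ID, symbol, GO ID, reference, evidence code, with); this is not the full annotation file format, and the names `ExampleAnnotation`, `parse_example`, and `with_prefixes` are invented for the illustration.

```python
from typing import List, NamedTuple, Set


class ExampleAnnotation(NamedTuple):
    """One example line, using the simplified six-column layout shown above."""
    object_id: str
    symbol: str
    go_id: str
    reference: str
    evidence: str
    with_ids: List[str]


def parse_example(line: str) -> ExampleAnnotation:
    """Split a whitespace-separated example line; the with column is '|'-separated."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 columns, got {len(fields)}")
    object_id, symbol, go_id, reference, evidence, with_field = fields
    return ExampleAnnotation(
        object_id, symbol, go_id, reference, evidence, with_field.split("|")
    )


def with_prefixes(annotation: ExampleAnnotation) -> Set[str]:
    """Return the database prefixes (text before the first ':') used in the with column."""
    return {with_id.split(":", 1)[0] for with_id in annotation.with_ids}


igi = parse_example(
    "FB:FBgn0017590    klg    GO:0045466    PMID:9043060    IGI    FB:FBgn0003366|FB:FBgn0001965"
)
print(igi.with_ids)        # ['FB:FBgn0003366', 'FB:FBgn0001965']
print(with_prefixes(igi))  # {'FB'}
```

A helper like `with_prefixes` could make it easier to spot examples that mix database identifiers, which is the inconsistency Susan points out.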

6. ISS & with col:

> Note that there should be good evidence that the gene product(s) placed in the with/from column actually has the activity, process, etc. being annotated.

Midori (15 Jun 2007):

 Do we want to specifically say the "good evidence" should be
 *experimental* evidence? Would be consistent with the Ref Genome
 requirement, and good practice generally ...

Karen (9 Sept 2007):

 We do have to remember that this Evidence Code document is not just
 for the use of the Reference Genomes. While we did agree that ISS should
 not be made from pairwise BLAST unless the gene to be placed in the
 with column has been experimentally characterized, the ISS code covers
 more situations than just that. The with field may also contain Pfams,
 Prosite, TIGRFAMS, CBS, COG, PANTHER, and we also have to determine
 how to include method names here for stuff like tRNAscan and my
 specific question about snoRNAs. Michelle Gwinn may wish to comment
 on this too.

Typos, other trivial fixes:

1. IGI

> Should we add a statement in the paragraph above to IGI, similar to the one in IMP, about care in making annotations from gain of function mutations ...?

Midori (15 Jun 2007):

 Sounds reasonable to me.

Karen (9 Sept 2007):

 OK, added to first paragraph of IGI.

2. Last paragraph of Introduction:

Midori (15 Jun 2007):

 Change "effect" to "affect" in "... will also effect the quality of the resulting annotation."

Karen (9 Sept 2007):


3. IDA & IMP:

Midori (15 Jun 2007):

 Does "over-expression" really need to be hyphenated? I've seen it
 unhyphenated more frequently; also, there's one unhyphenated
 occurrence in the document.

Karen (9 Sept 2007):

 changed to unhyphenated

4. IGI examples:

Midori (15 Jun 2007):

 The statements "For this type of experiment, use the IGI Code" could
 be deleted -- they're redundant with the fact that the description
 appears in a list headed "where the IGI code should be used."

Karen (9 Sept 2007):