Annotation consistency: IEA, ISS, IC Usage Discussion: Difference between revisions

From GO Wiki

Revision as of 16:53, 6 December 2007

Email exchange

Tanya Berardini, Emily Dimmer, Pascale Gaudet, Chris Mungall, Kimberly VanAuken


Hello,

I am going through the action items for next week's reference genome conference call. At the reference genome meeting we all eagerly volunteered to come up with some guidelines as to how to use ISS, IEA and IC (too much coffee??). Here are the major points I remember:

  • IC: came up during the example of a translation initiation factor that was annotated to a good function and component but had only a root annotation for process. It was suggested that it could perhaps have been given an IC annotation. (The gene is mouse Eif2b2.)
  • Related are ISS and IEA: David pointed out that this gene probably had a good IEA annotation from InterPro. The question was how to address this: dicty and pombe would make this an ISS annotation, to avoid the root annotation. The problem is that InterPro domains and mappings can change, so maybe that's not such a good practice after all. However, since IEAs are not displayed in AmiGO (and are perhaps excluded from certain studies as poorly reliable), some valuable information is not used.

Was that your recollection as well? Can we make a plan to discuss that at some point and come up with suggestions? (It doesn't have to be before the call next Tuesday; but I wanted to write down the important discussion points).

Cheers, Pascale


Hi,

My memory of the Reference Genome discussion for IEA was that, as many of the IEA methods used have increased in quality over the last 5 years, some groups are becoming more accepting of the data they provide. I thought that Judy's suggestion was that a group should review these methods and decide which sets of data should be displayed in AmiGO (i.e. if an annotation is supported by multiple independent IEA methods).

If that is a true memory it might be handy to draw up a list of the IEA methods that we want to discuss. From the GOA perspective, I would like to include:

  1. InterPro2GO
  2. SwissProt Keyword 2GO
  3. EC2GO
  4. HAMAP2GO
  5. UniProt subcellular location 2GO
  6. Ensembl Compara - projection of annotation from ortholog data.
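
To make the "supported by multiple independent IEA methods" idea concrete, here is a minimal sketch in Python. It is only an illustration: the record layout (gene product, GO term, evidence code, method), the placeholder IDs, and the function name `multiply_supported_iea` are all assumptions made for this example, not an existing GO or AmiGO interface.

```python
from collections import defaultdict

# Hypothetical records: (gene product ID, GO term ID, evidence code, method).
# The IDs are placeholders for illustration, not real annotations.
annotations = [
    ("GENE:1", "GO:A", "IEA", "InterPro2GO"),
    ("GENE:1", "GO:A", "IEA", "EC2GO"),
    ("GENE:1", "GO:B", "IEA", "InterPro2GO"),
    ("GENE:2", "GO:A", "IEA", "HAMAP2GO"),
    ("GENE:2", "GO:A", "IDA", "curator"),
]


def multiply_supported_iea(records, min_methods=2):
    """Return (gene, term) pairs backed by at least min_methods distinct IEA methods."""
    methods = defaultdict(set)
    for gene, term, evidence, method in records:
        if evidence == "IEA":
            methods[(gene, term)].add(method)
    return {
        pair: sorted(found)
        for pair, found in methods.items()
        if len(found) >= min_methods
    }


supported = multiply_supported_iea(annotations)
print(supported)
```

Counting distinct method names is only a proxy for independence: two methods that draw on the same underlying data would need to be grouped before counting.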

Cheers, Emily


Hi,

Thanks for getting the ball rolling on this one, Pascale.

My recollection of this discussion was that it stemmed from a concern about annotation consistency and that Suzi was concerned that there seemed to be lots of 'missing' ISS annotations in the reference genome work. One response to this was that for many gene products, the IEA annotations were actually providing sufficient information and that perhaps IEAs suffer from past negative perceptions that are no longer accurate given that substantial feedback has greatly improved some of the mappings.

Just speaking for myself here, when curating I often make the very pragmatic decision to focus on getting as many experimental annotations in as possible and then, time permitting, go back and try to fill in ISS annotations where we don't have experimental data. In the meantime, though, I do look at our existing IEA annotations and find that many of them are just fine and, if included in AmiGO, would help plug some perceived annotation holes.

I agree with Emily that it would be worthwhile to look at the various IEA methods and come up with some metrics for evaluating their accuracy. I'm not sure what these could be, but (thinking out loud) perhaps there's a way to determine what percentage of these mappings are supported by experimental data in any organism.
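
One possible version of such a metric, sketched in Python under stated assumptions: annotations are available as simple (gene product, GO term, evidence code, method) records, an IEA annotation counts as "supported" only if the same gene product has an experimental annotation to exactly the same term, and the ontology hierarchy is ignored. The record layout, placeholder IDs, and function name are hypothetical.

```python
from collections import defaultdict

# GO experimental evidence codes.
EXPERIMENTAL = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}

# Hypothetical records: (gene product ID, GO term ID, evidence code, method).
records = [
    ("GENE:1", "GO:A", "IEA", "InterPro2GO"),
    ("GENE:1", "GO:A", "IDA", "curator"),
    ("GENE:2", "GO:B", "IEA", "InterPro2GO"),
    ("GENE:3", "GO:C", "IEA", "EC2GO"),
    ("GENE:3", "GO:C", "IMP", "curator"),
]


def experimental_support_by_method(rows):
    """Percentage of each IEA method's (gene, term) pairs that also have experimental evidence."""
    experimental_pairs = {
        (gene, term) for gene, term, evidence, _ in rows if evidence in EXPERIMENTAL
    }
    iea_pairs = defaultdict(set)
    for gene, term, evidence, method in rows:
        if evidence == "IEA":
            iea_pairs[method].add((gene, term))
    return {
        method: 100.0 * len(pairs & experimental_pairs) / len(pairs)
        for method, pairs in iea_pairs.items()
    }


support = experimental_support_by_method(records)
print(support)
```

A fuller version would probably also count an experimental annotation to a more specific or more general term as support, and would need to account for gene products that simply have no experimental data yet.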

Also, with the proposed changes to the ISS branch of the evidence codes, is it worth considering promoting some of these mappings to one of the new IS* evidence codes, if the method ultimately stems from sequence analysis?

Emily, does some of the BioCreAtIvE work speak to this subject as well? I seem to recall Evelyn talking about this issue at past GO meetings and commenting that the BioCreAtIvE work provided support for the idea that some electronic annotations were actually of high quality.

Cheers, Kimberly