Difference between revisions of "2010 GO camp downstream effect"

From GO Wiki

Revision as of 08:05, 27 April 2010

1. Background

2. Review of current GO annotation practices

  • Annotating signaling biological processes to transcription factors
  • When not to capture phenotypes (from the 22nd Feb Jamboree call [1]). Tanya: It's not uncommon for the initial publications to describe a mutant phenotype with a developmental defect, and for later publications to describe much more explicit functions or processes. You should always annotate based on whatever evidence is available. Once you've done that, the question becomes, "When do we keep or remove the phenotype-based annotations?" At TAIR, the policy is to keep the developmental terms if the curators think that their users would expect to see them. Some participants suggested that one would expect all orthologs to have the same development-type annotations across organisms. Others disagreed with this expectation.
  • [From Karen] This is fairly anecdotal. I wasn't able to find papers about this, but it's been known for a long time so that isn't necessarily surprising.

In S. cerevisiae, a number of genes encode components of the spliceosome and, when mutated, produce strains with defects in protein production/accumulation. Early on, some of these genes were thought to be involved in translation. It was later determined that their products are components of the spliceosome, which is involved in mRNA splicing and not directly in translation at all. The reason splicing defects produce translation defects lies in the distribution of introns in S. cerevisiae. Out of about 6000 genes, only about 270 (roughly 4.5%) contain introns, and many of the intron-containing genes are ribosomal protein genes. Because ribosomal protein genes are also highly transcribed, splicing defects have a disproportionate effect on the production of ribosomal proteins and thus on translation.

So, while it is true that mutations in many spliceosomal genes produce a phenotype of defective protein production, it is very clear that this is a downstream effect: the majority of mRNAs that need to be spliced come from ribosomal protein genes. Thus, we do not use the mutant phenotype of a defect in protein production to annotate these genes to GO terms related to translation.

  • Ranjana from WormBase: Checking for embryonic lethality, or for larval stages that do not develop further, is a very common assay in the C. elegans field, so it may feel as though our genes are over-annotated to the terms "embryonic development ending in birth or egg-hatching" and/or "nematode larval development". Also, as Tanya pointed out, we too annotate to the paper: if you want to say you have annotated every paper that talks about a gene, then you record everything. Sometimes it's hard to tell whether something is a downstream effect. We have no mechanism in place to go back and remove these high-level development terms once the core process/function is known.
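
As a rough illustration of the kind of "go back and review" mechanism Ranjana says is missing, the Python sketch below lists gene products that carry an IMP annotation to one of the two high-level development terms named above and also carry at least one annotation to any other term. Everything in it is an assumption made for illustration: the `Annotation` record, the `review_candidates` function, the gene names and the example data are hypothetical and do not come from WormBase or any GO tool. It only produces a review list; whether to keep or remove an annotation remains a curator decision.

```python
from collections import defaultdict
from typing import NamedTuple


class Annotation(NamedTuple):
    """Hypothetical, simplified annotation record (not a real GAF row)."""
    gene: str
    term: str
    evidence: str


# The two high-level development terms quoted in the text above.
HIGH_LEVEL_TERMS = {
    "embryonic development ending in birth or egg-hatching",
    "nematode larval development",
}


def review_candidates(annotations):
    """Map each gene to its high-level IMP annotations worth a second look.

    A gene is listed only if it has an IMP annotation to a high-level
    development term AND at least one annotation to some other term.
    """
    by_gene = defaultdict(list)
    for ann in annotations:
        by_gene[ann.gene].append(ann)

    flagged = {}
    for gene, anns in by_gene.items():
        high = [a for a in anns
                if a.evidence == "IMP" and a.term in HIGH_LEVEL_TERMS]
        other = [a for a in anns if a.term not in HIGH_LEVEL_TERMS]
        if high and other:
            flagged[gene] = high
    return flagged


# Made-up example data: gene-a has a more specific annotation, gene-b does not.
example = [
    Annotation("gene-a", "nematode larval development", "IMP"),
    Annotation("gene-a", "mRNA splicing, via spliceosome", "IDA"),
    Annotation("gene-b", "nematode larval development", "IMP"),
]

for gene, anns in review_candidates(example).items():
    print(gene, [a.term for a in anns])
```

With the made-up data, only gene-a is listed, because gene-b has nothing but the phenotype-based annotation and so falls under area 5 (a phenotype is all you have to annotate with).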

Annotating to downstream processes

Could at least one person from each group fill in this survey, to show how much discrepancy there is between groups/annotators? This will give us an idea of which issues are controversial and need to be discussed. In the first conference call (see the discussions page for minutes), these five areas were identified as ones where downstream process annotations are made:

  • 1. development
  • 2. ageing
  • 3. signalling - when does a process start/end
  • 4. IMPs from large-scale mutant screens and from individual experiments
  • 5. gene products with few papers available (e.g. a phenotype is all you have to annotate with)

It would be useful to know which groups annotate to any of these areas and if you DON'T annotate to a certain area, why not? If your examples are too long for the table, please insert them into the main body of section 2. Review of current GO annotation practices.

Name | Group | Do you always annotate to downstream (DS) processes? | Do you never annotate to downstream processes? | Do you sometimes annotate to downstream processes?* (If yes, please answer next question) | *Give examples of when you would AND wouldn't annotate to a DS process | Do you go back and remove annotations if you find out more specific information about a gene product?
Rachael | GOA | No | No | Yes | General: If it adds information that may be useful to users I would usually annotate DS, but not at the cost of diluting what the gp is centrally involved in. I would not annotate examples such as Karen's above, which clearly demonstrates that the effect on translation is a side effect of mutating the spliceosome. 1. Development - I think this is useful when users want to know all genes involved in e.g. eye development. I would annotate if a gp was repeatedly shown to have an effect on 'x' development (and, if the evidence was convincing, a single paper reporting an effect on development, as this may be previously unknown information), but I maybe wouldn't annotate if there were pleiotropic effects. 3. Signalling - If a gp was repeatedly shown to have an effect on a particular pathway I would annotate. 4. I wouldn't annotate large-scale screens, but I do annotate individual experiments. 5. I would annotate if few papers are available and the evidence is good. | If they are old NAS/TAS evidence annotations I'll try to improve the evidence/granularity of the term, or if there are high-level terms without much meaning I will probably delete them. I wouldn't look at a paper that had already been curated with experimental evidence codes. If a previous annotation is noticeably different from what I have read about the gp, I will go and check the original paper.
Fiona | AgBase | No | No | Yes | General: There are a lot of chicken papers where they do IMP on very early embryonic stages and see many phenotypic effects. If the gene is well studied we can sometimes tease apart which processes we should annotate to, but oftentimes this may be the only gene product information we have. | No
Rebecca | GOA | No | No | Yes | General: A lot of frog annotation is development based, and a role in a particular part of development is annotated based on mutant phenotype. It's not always possible to see what is direct and what is downstream, so I would record what the paper shows with IMP. Even if it is a downstream effect, if it isn't an artefact of the assay, I think it's still valid to annotate it. Transcription factors, for example, can regulate a number of different genes and so have a whole load of downstream effects, which I would curate if the paper showed them. I probably wouldn't curate downstream effects if the authors explicitly say they are knock-on effects. I tend to be more generous than cautious in annotating downstream events. | Not often. Generally I curate on a protein-by-protein basis, so unless I'm adding in a new paper, I don't revisit the same protein often. I would replace IC, TAS or NAS with an experimental evidence tag if possible, and relook at annotations if there are glaring inconsistencies, but otherwise no.
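
The replacement practice that Rachael and Rebecca describe in the last column (upgrading NAS, TAS or IC annotations when experimental evidence is available) could be supported by a simple check like the Python sketch below. It assumes annotations are available as plain (gene product, term, evidence code) tuples; the `superseded` function and the sample data are hypothetical, and the experimental set is just the standard GO experimental evidence codes. The check only lists candidates, and a curator would still decide what to do with each one.

```python
# Standard GO experimental evidence codes.
EXPERIMENTAL = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}
# The non-experimental codes the respondents say they would try to replace.
NON_EXPERIMENTAL = {"NAS", "TAS", "IC"}


def superseded(annotations):
    """Return NAS/TAS/IC annotations whose (gene product, term) pair
    also has at least one experimentally supported annotation."""
    experimental_pairs = {
        (gp, term) for gp, term, code in annotations if code in EXPERIMENTAL
    }
    return [
        (gp, term, code)
        for gp, term, code in annotations
        if code in NON_EXPERIMENTAL and (gp, term) in experimental_pairs
    ]


# Made-up sample data for illustration only.
sample = [
    ("gp1", "eye development", "TAS"),
    ("gp1", "eye development", "IMP"),
    ("gp2", "signal transduction", "NAS"),
]

for row in superseded(sample):
    print(row)
```

In the sample, only the TAS annotation for gp1 is reported, since gp2 has no experimental annotation to the same term.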

3. Proposed annotation policy

4. Examples (papers) and discussion of GO annotation issues

  • Submitted by Pascale: There are several SF items about growth/cell growth/cell proliferation. I know some of the terms were done to accommodate experiments done in Dicty - often people look at the rate of cell proliferation as a general phenotype, and we have been capturing this. It's very high level and usually IMP, but in the absence of other information it seems relevant (otherwise people would not bother testing it).

What do people think about this? To me it's similar to the issue of annotating from IEP or to high level developmental terms by IMP. The question is, what data are too general to be useful to capture?

  • Organismal behaviors are always quite controversial. For example, the rat lonp gene is annotated to aging.

5. Suggestions for Quality Control procedures

Back to 2010_GO_camp_Meeting_Agenda