Annotating from phenotypes

From GO Wiki

Revision as of 14:38, 13 November 2018

Introduction

Mutant phenotypes can provide useful insights into gene function. To use mutant phenotypes as supporting evidence for a GO annotation, however, curators should keep the following general principles in mind:

  1. GO annotations based on a mutant phenotype should represent the normal role of a gene in biology.
  2. Mutant phenotypes need to be interpreted in the overall context of what is known about a gene product so that appropriately proximal Biological Process terms can be selected.
  3. GO does not aim to capture individual phenotypes. Most groups that contribute annotations to GO also curate phenotypes as a separate data type.

Guidelines

Ask yourself: what is the normal Biological Process for this gene?

  • Remember that annotations based on mutant phenotypes are inferences about the normal Biological Process (BP) for a gene.
  • To decide if a gene is directly 'involved in' a BP, determine whether its Molecular Function (MF) is one of the MFs that make up that BP.
    • To help determine whether this is the case, it is useful to create an activity-based GO-CAM model in Noctua or consult a pathway or process model from a recent paper or review.
  • If no MF is known, a phenotype can only be annotated with ‘acts upstream of or within’. Alternatively, consider not making a GO annotation at all; that is OK!
    • Q: Is it possible to have a gene product with a novel or unknown MF (or just a protein binding MF) that is otherwise well characterized genetically (and maybe somewhat biochemically?) such that it could be placed within a given pathway or process? Does anyone have an example of this, where the annotation would be informative?
  • Example (nuclear pore): BRR6 is involved in nuclear envelope organization. When mutated, it causes nucleocytoplasmic transport defects, but it is NOT involved in nuclear transport.
    • An essential nuclear envelope integral membrane protein, Brr6p, required for nuclear transport (2001) (PMID:11483521). The authors identified an essential gene, BRR6, whose product, Brr6p, behaves like a nuclear envelope integral membrane protein. Notably, the brr6-1 mutant specifically affects transport of mRNA and of a protein reporter containing a nuclear export signal. In addition, Brr6p depletion alters nucleoporin distribution and nuclear envelope morphology, suggesting that the protein is required for the spatial organization of nuclear pores.
    • Integral membrane proteins Brr6 and Apq12 link assembly of the nuclear pore complex to lipid homeostasis in the endoplasmic reticulum (2010) (PMID:20016074). The authors found that Brr6 has an essential function in regulating lipid homeostasis in the NE-ER, thereby impacting NPC formation and nucleocytoplasmic transport.
  • Annotations:
    • GO:0055088 lipid homeostasis
    • GO:0006998 nuclear envelope organization
    • NOT nucleocytoplasmic transport
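
The decision rule in the bullets above can be sketched in Python. This is only an illustration under stated assumptions, not part of any GO tooling: the function name `choose_bp_relation`, its arguments, and the returned labels are hypothetical, and the third outcome reflects one reading of the guideline (a known MF that is not part of the process does not support 'involved in').

```python
from typing import Optional, Set


def choose_bp_relation(gene_mf: Optional[str], process_mfs: Set[str]) -> str:
    """Suggest how a mutant phenotype might be annotated to a Biological Process.

    gene_mf:     the gene product's known Molecular Function, or None if unknown.
    process_mfs: the MFs that make up the process, e.g. read off a GO-CAM
                 model or a pathway figure in a review.
    Both inputs and all return labels are hypothetical, for illustration only.
    """
    if gene_mf is None:
        # No known MF: the phenotype supports at most the weaker relation.
        # Not making an annotation at all is also acceptable.
        return "acts_upstream_of_or_within"
    if gene_mf in process_mfs:
        # The MF is one of the activities that constitute the process.
        return "involved_in"
    # MF known but not one of the process's activities: the phenotype is
    # probably indirect, so 'involved in' is not supported.
    return "not_involved_in"
```

With placeholder labels, `choose_bp_relation("MF_A", {"MF_A", "MF_B"})` returns `"involved_in"`, whereas `choose_bp_relation(None, {"MF_A"})` returns `"acts_upstream_of_or_within"`. In practice the set of MFs for a process comes from curator judgment, which the sketch does not attempt to model.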

Being ‘required for’ a process does not mean a protein is ‘involved in’ a process

It is common for authors to state that a gene or gene product is required for a given process. However, the meaning of this statement varies, so it is the curator's responsibility to determine whether the gene product's activity is indeed an integral part of the process.

To help make this decision, it may be useful to think about the gene's role as a set of concentric circles of processes, starting with its MF. For a given MF, what is the most proximal process in which the gene is involved? For a transcription factor, the most proximal process would be regulation of transcription. From there, what is the next most proximal process? In a developmental context, for example, this might be cell fate specification or differentiation. Moving outward, what phenotypes might indicate defects in cell fate specification? If, for example, the cell belongs to a particular class of neurons, one of the defects may manifest as a change in a behavior. Is the transcription factor thus 'involved in' the behavior? No, but by annotating the TF to regulation of transcription (perhaps with relevant target genes) and to specification of the particular cell types, it is possible to capture the most relevant aspects of that gene's function. The defects in behavior could then be captured with phenotype annotations. Possible example: C. elegans ttx-1.

Note that placing a gene or gene product in its appropriate biological context may require reading more than one paper!
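
The concentric-circle reasoning can be written down as a small data structure. The sketch below is a hypothetical illustration: the `Layer` class and `split_layers` function are invented here, and the `integral` flag stands in for the curator's own judgment about whether the gene's activity is part of that layer.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Layer:
    name: str       # process or phenotype, ordered from the MF outward
    integral: bool  # curator's judgment: is the gene's activity part of it?


def split_layers(layers: List[Layer]) -> Tuple[List[str], List[str]]:
    """Separate layers suited to GO annotation from those left to phenotype curation."""
    go_terms = [layer.name for layer in layers if layer.integral]
    phenotypes = [layer.name for layer in layers if not layer.integral]
    return go_terms, phenotypes


# The transcription factor walk-through from the text, innermost layer first.
tf_layers = [
    Layer("regulation of transcription", integral=True),
    Layer("cell fate specification", integral=True),
    Layer("behavior", integral=False),
]
go_terms, phenotypes = split_layers(tf_layers)
```

Here `go_terms` holds the two proximal processes and `phenotypes` holds the behavioral defect, mirroring the conclusion that the behavior belongs in phenotype annotation rather than in GO.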

apoE example? https://www.uniprot.org/uniprot/P02649

Pleiotropic effects should not usually be captured

A pleiotropic phenotype, or being ‘required for’ a process, does not mean that a protein is ‘part of’ that process.

For example, splicing factors are often required for cell cycle transition, but they are not part of the cell cycle transition.

  • A good clue is the viability of mutants: inviable mutants often have pleiotropic phenotypes, or a strong terminal phenotype that can easily be misinterpreted (cell cycle transition blocks/checkpoints, chromosome mis-segregation, etc.).
  • Beware of read-outs (DNA replication, apoptotic DNA fragmentation, etc.).

Cell proliferation, cell migration and apoptosis

  • Mutants showing increased or decreased cell proliferation, cell migration or apoptosis need to be analyzed carefully. If the underlying molecular/cellular mechanism is not known, these annotations should not be made.
  • Mutually exclusive terms: cell proliferation should not be used for proteins involved generally in growth or division.
  • I would also add to this list: lethality, low brood size, slow growth, perhaps also sluggish locomotion.

Mutant phenotypes and regulation terms

  • It can be difficult to assess whether a gene or gene product regulates a process based on mutant phenotypes alone. As annotation to regulation terms in the BP ontology requires an understanding of the molecular basis for that regulation, mutant phenotypes may more often be used as supporting, rather than definitive, evidence for a gene's regulatory role. As with annotating to BP terms from phenotypes more generally, consider what is known about the MFs involved in the process and use that information to guide your annotation practice.
  • Mutants annotated to ‘regulation’ with no molecular function annotation (sometimes an annotation to a protein complex with a known function) should be examined closely and reviewed.
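
One way to find candidates for such a review is to scan a set of annotations for gene products that have IMP-supported 'regulation' BP annotations but no MF annotation. The sketch below makes several assumptions: the `Annotation` record and the function `regulation_without_mf` are hypothetical, the aspect codes follow the F/P/C convention used in GAF files, and matching the phrase "regulation of" in the term label is a crude stand-in for a proper ontology lookup.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Annotation(NamedTuple):
    gene: str        # gene product identifier
    aspect: str      # "F" (MF), "P" (BP) or "C" (CC)
    term_label: str  # GO term label
    evidence: str    # evidence code, e.g. "IMP"


def regulation_without_mf(annotations: List[Annotation]) -> Dict[str, List[str]]:
    """Flag IMP 'regulation' BP annotations for genes lacking any MF annotation."""
    genes_with_mf = {a.gene for a in annotations if a.aspect == "F"}
    flagged: Dict[str, List[str]] = defaultdict(list)
    for a in annotations:
        is_regulation = a.aspect == "P" and "regulation of" in a.term_label
        if is_regulation and a.evidence == "IMP" and a.gene not in genes_with_mf:
            flagged[a.gene].append(a.term_label)
    return dict(flagged)


# Made-up records, for illustration only.
example = [
    Annotation("geneA", "P", "regulation of cell migration", "IMP"),
    Annotation("geneA", "C", "nucleus", "IDA"),
    Annotation("geneB", "P", "positive regulation of apoptotic process", "IMP"),
    Annotation("geneB", "F", "protein kinase activity", "IDA"),
]
to_review = regulation_without_mf(example)
```

In this made-up input only `geneA` is flagged, because `geneB` has an MF annotation. A flagged gene is a prompt for a curator to look again, not a verdict that the annotation is wrong.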

Mutant phenotypes and other BP relations

  • acts upstream of or within
  • acts upstream of
  • We have some info on the relations pages, but I think we really need more examples.

Reviewing and removing older phenotype annotations

When new knowledge becomes available, older IMP annotations should be reviewed and removed as required (link to section on removing annotations?).