Annotation Extension: Capturing participants
Revision as of 04:13, 17 April 2012
N.B. This page is under construction!
Introduction
Allowable relations for target annotation extensions
- has_regulation_target
- has_downstream_target
- has_target_cell
- has_target_anatomical_entity
- dependent_on
Usage notes
Specifying interacting partner gene products
Gene products are participants in processes/functions. We would use the has_participant relation or one of its subtypes to indicate the particular role that additional gene products play. These will typically be has_input or has_output.
Note that the gene product in col 2 is also a participant. However, the relationship between the process in col5 and this gene product is currently implicit, so we don't need to worry about it here.
If in doubt about which relation to use, it is always possible to use the most generic relation, has_participant. This does not carry as much information, but it should at least be correct.
The gene product could be identified by a UniProtKB ID or by a gene product ID from the same MOD that is contributing the annotation (e.g. FBtr or FBpp for FlyBase). If the target is a protein, always use a protein ID rather than a gene ID as a proxy. Similarly, if the target is a gene (e.g. in transcription), use a gene ID.
As an example, consider SIRP beta2, which in concert with CD47 positively regulates cell-cell adhesion (PMID:15383453).
The annotation for SIRP beta2 would be:
col5: GO:0022409 col16: has_participant(UniProtKB:Q08722)
Here Q08722 is the ID for CD47
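To make the shape of a col16 value concrete, here is a minimal Python sketch that splits one into (relation, target) pairs. It assumes the relation(ID) form shown in the example above, with several expressions separated by commas as elsewhere on this page. The function name parse_extensions and the regular expression are illustrative assumptions, not part of any GO tool.

```python
import re

# Matches one extension expression such as "has_input(SGD:B)".
# Assumption: relation names use letters and underscores, and IDs
# contain no parentheses, commas or whitespace.
_EXTENSION = re.compile(r"^\s*([A-Za-z_]+)\(([^(),\s]+)\)\s*$")


def parse_extensions(col16):
    """Split a comma-separated col16 value into (relation, target) pairs."""
    if not col16.strip():
        return []
    pairs = []
    for part in col16.split(","):
        match = _EXTENSION.match(part)
        if match is None:
            raise ValueError(f"not a relation(ID) expression: {part!r}")
        pairs.append((match.group(1), match.group(2)))
    return pairs


# The SIRP beta2 example: one participant, CD47.
print(parse_extensions("has_participant(UniProtKB:Q08722)"))
```

A value that does not follow the relation(ID) pattern raises ValueError rather than being silently skipped, which seems the safer choice for curation data.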
NOTE: (2010-04-30) - we decided there is a simpler way to do this using GO:0050839 ! cell adhesion molecule binding. However, this carries less information than saying specifically that it's involved in cell-cell adhesion, and I don't think it's appropriate to add a F->P link from 'cell adhesion molecule binding'.
Phosphorylation targets
If protein SGD:A phosphorylates protein SGD:B then annotate A to:
col5: GO:0004672 col16: has_input(SGD:B)
GO:0004672 is protein kinase activity
Note that we would not include a separate annotation line for B, because we only have annotation lines for active participants.
Strictly speaking, the input is SGD:B in the unphosphorylated state and the output is SGD:B in the phosphorylated state. However, we do not currently have IDs for these separate protein forms, so B is really both an input and an output. We standardize on has_input here.
Note that there is no need to say
col16: has_input(SGD:B),has_input(CHEBI:15422)
or even
col16: has_input(SGD:B),has_input(CHEBI:15422),has_output(SGD:B),has_output(CHEBI:16761)
These are correct but pointless, because the additional information is redundant with what we already know about kinase activity (this is actually made computable in MF x CHEBI).
Also, there is no need to make a separate col 16 annotation for the phosphorylation process, as this can be inferred.
If protein SGD:A phosphorylates protein SGD:B and SGD:C then annotate A to:
col5: GO:0004672 (protein kinase activity) col16: has_input(SGD:B),has_input(SGD:C)
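As a sketch of the convention above, the following Python function builds the col5 and col16 values for a kinase from a list of its protein substrates, using only has_input as recommended. The function name kinase_annotation and the dictionary layout are illustrative assumptions and do not correspond to an existing annotation tool.

```python
# GO:0004672 is protein kinase activity, as used in the examples above.
PROTEIN_KINASE_ACTIVITY = "GO:0004672"


def kinase_annotation(substrates):
    """Return the col5 and col16 values for a kinase and its protein substrates.

    Each substrate gets one has_input(...) expression. ATP/ADP and
    has_output are deliberately left out, since they are redundant
    with the definition of kinase activity.
    """
    if not substrates:
        raise ValueError("at least one substrate ID is needed")
    col16 = ",".join(f"has_input({substrate})" for substrate in substrates)
    return {"col5": PROTEIN_KINASE_ACTIVITY, "col16": col16}


# SGD:A phosphorylates SGD:B and SGD:C, so A is annotated with:
print(kinase_annotation(["SGD:B", "SGD:C"]))
```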
There is some redundancy with interaction databases here. Capturing this as GO annotation is more expressive, as you can say "A phosphorylates B during pathway C". But if you want to capture this exclusively in interaction databases, we have tools for generating GO annotations from these (just as we have tools for capturing GO annotations from pathway databases).
Transport targets
TODO
Transcription targets
TODO
Specifying inter-species protein binding partners
If an experiment showed binding of two proteins from the same species, then the identifier for the binding partner would go in both column 8 and column 16. If it was an inter-species experiment, i.e. a protein from one species and a binding partner from another species, then the accession for the binding partner actually used in the binding experiment would go in column 8 and the accession for the inferred in vivo binding partner would go in column 16.
Use case
1. Chicken SFRP1 (Q9DEQ4) interacts with mouse Frizzled-2 (Q9JIP6) (PMID:16172602). The actual experiment was performed with chicken and mouse proteins, but a curator can infer that the chicken SFRP1 would bind the chicken Frizzled-2 (Q9IA06) and the mouse Frizzled-2 would bind the mouse SFRP1 (Q8C4U3). The GO term used to annotate chicken SFRP1 should be 'frizzled binding' (GO:0005109).
So the reciprocal annotations would be:
DB Object ID (Col 2) | DB Object Symbol (Col 3) | GO ID (Col 5) | Reference (Col 6) | With (Col 8) | Extension (Col 16) |
---|---|---|---|---|---|
Q9DEQ4 | SFRP1 | GO:0005109 | PMID:16172602 | Q9JIP6 | has_participant(Q9IA06) |
Q9JIP6 | FZD2 | GO:0005515 | PMID:16172602 | Q9DEQ4 | has_participant(Q8C4U3) |
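The rule for filling columns 8 and 16 can be sketched in Python as follows. This is an illustration only: the function name binding_partner_columns is hypothetical, and using has_participant for the same-species case is an assumption carried over from the table above.

```python
def binding_partner_columns(assayed_partner, in_vivo_partner=None):
    """Return (col8, col16) for a protein binding annotation.

    assayed_partner: accession of the partner actually used in the experiment.
    in_vivo_partner: accession of the inferred same-species partner; leave as
        None when the experiment already used proteins from one species.
    """
    # Same-species experiment: the assayed partner goes in both columns.
    inferred = in_vivo_partner if in_vivo_partner is not None else assayed_partner
    return assayed_partner, f"has_participant({inferred})"


# Chicken SFRP1 (Q9DEQ4) assayed against mouse Frizzled-2 (Q9JIP6);
# the inferred in vivo partner is chicken Frizzled-2 (Q9IA06).
col8, col16 = binding_partner_columns("Q9JIP6", "Q9IA06")
print(col8, col16)
```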