Annotation Extension: Capturing participants

Revision as of 04:13, 17 April 2012

N.B. This page is under construction!

Introduction

Allowable relations for target annotation extensions

* has_regulation_target
* has_downstream_target
* has_target_cell
* has_target_anatomical_entity
* dependent_on

Usage notes

Specifying interacting partner gene products

Gene products are participants in processes and functions. Use the has_participant relation, or one of its subtypes, to indicate the particular role that an additional gene product plays. These will typically be has_input or has_output.

Note that the gene product in col 2 is also a participant. However, the relationship between the process in col5 and this gene product is currently implicit, so we don't need to worry about it here.

If in doubt about which relation to use, you can always fall back on the most generic relation, has_participant. It carries less information, but it should at least be correct.

The gene product can be identified by a UniProtKB ID or by a gene product ID from the same MOD that is contributing the annotation (e.g. FBtr or FBpp for FlyBase). If the target is a protein, always use a protein ID rather than a gene ID as a proxy. Similarly, if the target is a gene (e.g. in transcription), use a gene ID.

As an example, consider SIRP beta2, which in concert with CD47 positively regulates cell-cell adhesion (PMID:15383453).

The annotation for SIRP beta2 would be:

 col5: GO:0022409
 col16: has_participant(UniProtKB:Q08722)

Here Q08722 is the ID for CD47.
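
As a rough illustration of the col 16 syntax used in this example, the sketch below builds and parses extension strings of the form relation(ID), separated by commas. The helper names format_extension and parse_extension are hypothetical and not part of any GO tool; only the relation(ID) syntax is taken from the examples on this page.

```python
import re

# Hypothetical helpers (assumption): only the relation(ID) syntax comes from this page.
_EXTENSION = re.compile(r"^\s*([A-Za-z_]+)\(([^()\s]+)\)\s*$")


def format_extension(pairs):
    """Join (relation, target_id) pairs into a col 16 string."""
    return ",".join(f"{relation}({target})" for relation, target in pairs)


def parse_extension(col16):
    """Split a col 16 string back into (relation, target_id) pairs."""
    pairs = []
    for part in col16.split(","):
        match = _EXTENSION.match(part)
        if match is None:
            raise ValueError(f"not a relation(ID) expression: {part!r}")
        pairs.append((match.group(1), match.group(2)))
    return pairs


# The SIRP beta2 annotation from the example above.
sirp_beta2 = {
    "col5": "GO:0022409",
    "col16": format_extension([("has_participant", "UniProtKB:Q08722")]),
}
```

Round-tripping through parse_extension is a cheap way to catch malformed extensions, such as a missing closing parenthesis, before they reach an annotation file.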

NOTE: (2010-04-30) - we decided there is a simpler way to do this using GO:0050839 ! cell adhesion molecule binding. However, this carries less information than saying specifically that it's involved in cell-cell adhesion, and I don't think it's appropriate to add a F->P link from 'cell adhesion molecule binding'.


Phosphorylation targets

If protein SGD:A phosphorylates protein SGD:B then annotate A to:

 col5: GO:0004672
 col16: has_input(SGD:B)

GO:0004672 is protein kinase activity.

Note that we would not include a separate annotation line for B, because we only have annotation lines for active participants.

Strictly speaking, the input is SGD:B in the unphosphorylated state and the output is SGD:B in the phosphorylated state. However, we do not currently have IDs for these separate protein forms, so B is really both an input and an output. We standardize on has_input here.

Note that there is no need to say:

 col16: has_input(SGD:B),has_input(CHEBI:15422)

or even

 col16: has_input(SGD:B),has_input(CHEBI:15422),has_output(SGD:B),has_output(CHEBI:16761)

These are correct but pointless, because the additional information is redundant with what we already know about kinase activity (this is actually made computable in MF x CHEBI).
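
To make the redundancy concrete, here is a minimal sketch that reduces a verbose kinase extension to the standardized form. It assumes a hand-written set of implied expressions (the two CHEBI expressions from the examples above) and drops has_output(X) whenever has_input(X) is present for the same ID. The function name and the hard-coded set are illustrative assumptions; this is not how MF x CHEBI is implemented.

```python
# Assumption: expressions treated as already implied by GO:0004672,
# copied from the verbose examples above rather than computed from MF x CHEBI.
IMPLIED_BY_KINASE_ACTIVITY = {
    "has_input(CHEBI:15422)",
    "has_output(CHEBI:16761)",
}


def simplify_kinase_extension(col16):
    """Drop implied expressions and keep only has_input for the substrate."""
    expressions = [e.strip() for e in col16.split(",") if e.strip()]
    inputs = {e for e in expressions if e.startswith("has_input(")}
    kept = []
    for expr in expressions:
        if expr in IMPLIED_BY_KINASE_ACTIVITY:
            continue  # redundant with the GO term itself
        if expr.startswith("has_output("):
            as_input = "has_input(" + expr[len("has_output("):]
            if as_input in inputs:
                continue  # same entity already recorded as an input
        if expr not in kept:
            kept.append(expr)
    return ",".join(kept)


verbose = "has_input(SGD:B),has_input(CHEBI:15422),has_output(SGD:B),has_output(CHEBI:16761)"
print(simplify_kinase_extension(verbose))
```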

Also, there is no need to make a separate col 16 annotation for the phosphorylation process, as this can be inferred.

If protein SGD:A phosphorylates protein SGD:B and SGD:C then annotate A to:

 col5: GO:0004672 (protein kinase activity)
 col16: has_input(SGD:B),has_input(SGD:C)
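
The pattern for one kinase with several substrates can be sketched as follows. The function kinase_annotation and the dictionary layout are hypothetical conveniences for illustration; the GO ID and the has_input convention come from the text above.

```python
PROTEIN_KINASE_ACTIVITY = "GO:0004672"


def kinase_annotation(substrates):
    """Return col 5 and col 16 values for a kinase that phosphorylates `substrates`.

    Hypothetical helper: one has_input(...) expression per substrate,
    all on a single annotation line for the kinase.
    """
    if not substrates:
        raise ValueError("at least one substrate is required")
    col16 = ",".join(f"has_input({substrate})" for substrate in substrates)
    return {"col5": PROTEIN_KINASE_ACTIVITY, "col16": col16}


# Annotation for SGD:A, which phosphorylates SGD:B and SGD:C.
annotation_for_a = kinase_annotation(["SGD:B", "SGD:C"])
print(annotation_for_a["col16"])
```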

There is some redundancy with interaction databases here. Capturing this as a GO annotation is more expressive, as you can say "A phosphorylates B during pathway C". But if you want to capture this exclusively in interaction databases, we have tools for generating GO annotations from these (just as we have tools for capturing GO annotations from pathway databases).

Transport targets

TODO

Transcription targets

TODO

Specifying inter-species protein binding partners

If an experiment showed binding of two proteins from the same species, then the identifier for the binding partner would go in both column 8 and column 16. If it was an inter-species experiment, i.e. a protein from one species and a binding partner from another species, then the accession for the binding partner actually used in the binding experiment would go in column 8 and the accession for the inferred in vivo binding partner would go in column 16.

Use case

1. Chicken SFRP1 (Q9DEQ4) interacts with mouse Frizzled-2 (Q9JIP6) (PMID:16172602). The actual experiment was performed with chicken and mouse proteins, but a curator can infer that chicken SFRP1 would bind chicken Frizzled-2 (Q9IA06) and that mouse Frizzled-2 would bind mouse SFRP1 (Q8C4U3). The GO term used to annotate chicken SFRP1 should be 'frizzled binding' (GO:0005109).

So the reciprocal annotations would be:

 DB (Col 2)  Object (Col 3)  GO ID (Col 5)  Reference (Col 6)  With (Col 8)  Extension (Col 16)
 Q9DEQ4      SFRP1           GO:0005109     PMID:16172602      Q9JIP6        has_participant(Q9IA06)
 Q9JIP6      FZD2            GO:0005515     PMID:16172602      Q9DEQ4        has_participant(Q8C4U3)
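
The rule for filling column 8 and column 16 can be sketched as a small function. The name binding_annotation and the dictionary keys are hypothetical, and using has_participant in the same-species case is an assumption carried over from the inter-species table above.

```python
def binding_annotation(object_id, symbol, go_id, reference,
                       assayed_partner, in_vivo_partner=None):
    """Build one binding annotation row (hypothetical helper).

    assayed_partner: accession of the partner actually used in the experiment (col 8).
    in_vivo_partner: accession of the inferred same-species partner (col 16).
    For a same-species experiment, leave in_vivo_partner unset and the
    assayed partner is used in both columns.
    """
    if in_vivo_partner is None:
        in_vivo_partner = assayed_partner
    return {
        "col2": object_id,
        "col3": symbol,
        "col5": go_id,
        "col6": reference,
        "col8": assayed_partner,
        # Assumption: has_participant, as in the table above.
        "col16": f"has_participant({in_vivo_partner})",
    }


# The two reciprocal rows from the chicken SFRP1 / mouse Frizzled-2 use case.
chicken_sfrp1 = binding_annotation(
    "Q9DEQ4", "SFRP1", "GO:0005109", "PMID:16172602", "Q9JIP6", "Q9IA06")
mouse_fzd2 = binding_annotation(
    "Q9JIP6", "FZD2", "GO:0005515", "PMID:16172602", "Q9DEQ4", "Q8C4U3")
```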


Use cases

Multiple annotation extensions for targets