Ontology meeting 2013-10-24: Difference between revisions

From GO Wiki
Revision as of 15:47, 23 October 2013

Attendees:

Minutes:


Legitimate GO term?

GO:0003869 4-nitrophenylphosphatase activity

https://sourceforge.net/p/geneontology/ontology-requests/10452/

Substrate appears to be an indicator but the activity is widely described by the GO term name in the literature. Keep?


Review of Jira tickets

...

Classification of protein complexes

  • Should we take components and protein families into account? If so, how?
  • Should we add more abstract, functional classes, or just rely on adding capable_of/capable_of_part_of links to molecular function terms?

This agenda item was prompted by the edits and questions described in SF ticket 10443. Briefly, it is clear that the only safe way to write XP definitions for 'ATP-binding cassette (ABC) transporter complex' (GO:0043190) would be with reference to the ABC transporter protein superfamily, perhaps via InterPro. Formal definition of at least some child terms would require this plus references to components, or perhaps to external resources collecting information on protein complexes (via IntAct?).

More generally: the hierarchy under 'protein complex' (~1500 terms) is currently quite flat and consists mostly of terms for complexes defined in part by their constituent proteins. It contains relatively little abstractly defined classification of complexes based on location and function. About 1/3 of these terms have some assertion of function, derived from about 100 direct assertions, almost all of which are in XP definitions of the form 'protein complex that capable_of some X'. Given the problems described in SF-10443, I think these need to be reviewed, questioning whether the XP defs are safe or should be relaxed to relationships (AKA SubClassing assertions).
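To make "relaxing an XP def to relationships" concrete, here is a minimal sketch that rewrites the `intersection_of` lines of an OBO-format stanza as `is_a` and `relationship` lines. The function name `relax_xp` and the `EX:` identifiers are hypothetical, chosen only for illustration; the sketch assumes each `intersection_of` line carries either a bare genus ID or a relation followed by an ID, and it is not how the GO editors' tooling works.

```python
def relax_xp(stanza_lines):
    """Rewrite OBO 'intersection_of' lines as is_a / relationship lines.

    The genus (a bare ID) becomes an is_a parent; each differentia
    (relation + ID) becomes a plain relationship. Other lines are kept.
    """
    relaxed = []
    for line in stanza_lines:
        if not line.startswith("intersection_of:"):
            relaxed.append(line)
            continue
        body = line[len("intersection_of:"):].strip()
        # Separate the trailing "! label" comment, if there is one.
        target, _, comment = body.partition("!")
        parts = target.split()
        suffix = f" ! {comment.strip()}" if comment.strip() else ""
        if len(parts) == 1:
            relaxed.append(f"is_a: {parts[0]}{suffix}")
        else:
            relaxed.append(f"relationship: {parts[0]} {parts[1]}{suffix}")
    return relaxed


# Hypothetical stanza; the EX: identifiers are placeholders, not GO IDs.
stanza = [
    "[Term]",
    "id: EX:0000001",
    "name: example transporter complex",
    "intersection_of: EX:0000002 ! protein complex",
    "intersection_of: capable_of EX:0000003 ! example transporter activity",
]

for out_line in relax_xp(stanza):
    print(out_line)
```

With the XP definition, a reasoner would classify anything that is a protein complex and capable_of the activity under the term; after relaxation the term merely asserts both facts about itself, which is the weaker and safer statement.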

  • Should we take components and protein families into account? If so, how?
    • DOS: My instinct is that we should try to leverage external classification systems and reference sources for this, e.g. IntAct & InterPro
  • Should we add more abstract, functional classes, or just rely on adding capable_of/capable_of_part_of links to molecular function terms?
    • DOS: I think that recording function and location plays to the strengths of GO. We should definitely be doing this as comprehensively and completely as possible. I'm agnostic about whether we should add a layer of abstract classes for complexes defined purely by function and location. These would certainly be useful for grouping, but adding them would be a lot of work, and I could imagine purpose-built tools that allow users to construct their own queries for complexes based on function and location.

Addition of disjointness declarations to TermGenie regulation template?

This seems like a sensible error check to have:

'positive regulation of X' disjointWith 'negative regulation of X'

It would have instantly caught

positive regulation of strand invasion is_a: negative regulation of strand invasion

which I accidentally added last week (now corrected).

Can we add it to the TermGenie template?

Comment by Heiko: Could/Should this be a (set of) general disjoints? This looks like a good check for the main validation pipeline.
I would like to avoid special checks for a template. A set of axioms in OWL and using the OWL reasoner
would be better than an ad hoc solution in TG.
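As an illustration of what the disjointness check would catch, the sketch below walks asserted is_a edges and reports any term subsumed by both members of a declared disjoint pair. It is a plain-Python stand-in under stated assumptions, not the OWL reasoner approach Heiko recommends: it only follows asserted is_a links, and the names `ancestors`, `disjointness_violations`, and the small `is_a` dictionary are hypothetical.

```python
def ancestors(term, is_a):
    """Return every term reachable from `term` via is_a edges (excluding itself)."""
    seen = set()
    stack = list(is_a.get(term, ()))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(is_a.get(parent, ()))
    return seen


def disjointness_violations(is_a, disjoint_pairs):
    """List (term, a, b) where `term` falls under both members of a disjoint pair."""
    violations = []
    for term in is_a:
        subsumers = ancestors(term, is_a) | {term}
        for a, b in disjoint_pairs:
            if a in subsumers and b in subsumers:
                violations.append((term, a, b))
    return violations


# The erroneous edit from the minutes, expressed as asserted is_a edges.
is_a = {
    "positive regulation of strand invasion": ["negative regulation of strand invasion"],
    "negative regulation of strand invasion": ["regulation of strand invasion"],
    "regulation of strand invasion": [],
}
disjoint_pairs = [
    ("positive regulation of strand invasion", "negative regulation of strand invasion"),
]

for violation in disjointness_violations(is_a, disjoint_pairs):
    print("unsatisfiable:", violation[0])
```

An OWL reasoner given the same DisjointClasses axioms would flag the term as unsatisfiable and would also take inferred subsumptions into account, which this sketch does not.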

TG template Protein complex by activity

Chris mentioned that most of the XP definitions for protein complexes defined by their activity are in place. There is a ticket here: https://www.ebi.ac.uk/panda/jira/browse/GO-204

Some things to define/discuss:

  • Name of the template?
  • A generic template for the textual definition
  • Some of the complexes have a comment stanza to also look at the activity term, should this also be generated?
  • Do we also want to handle: 'capable_of_part_of'? Right now it would just be 'protein complex' and 'capable_of' some 'molecular function'
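To give the discussion points above something concrete to react to, here is a rough sketch of what such a template might generate from a molecular function label. Everything in it is an assumption made for illustration: the function name `complex_by_activity`, the naming rule (drop a trailing " activity" and append " complex"), and the wording of the textual definition are all still open questions in these minutes, and the sketch covers only the 'capable_of' case.

```python
def complex_by_activity(activity_label):
    """Draft a name, textual definition and logical definition for a complex.

    Assumed conventions (not agreed in the minutes):
      - name: activity label minus a trailing ' activity', plus ' complex'
      - textual definition: one generic sentence built from the label
      - logical definition: 'protein complex' and capable_of some <activity>
    """
    stem = activity_label
    if stem.endswith(" activity"):
        stem = stem[: -len(" activity")]
    return {
        "name": f"{stem} complex",
        "definition": f"A protein complex which is capable of {activity_label}.",
        "logical_definition": (
            f"'protein complex' and capable_of some '{activity_label}'"
        ),
    }


draft = complex_by_activity("protein kinase activity")
for field, value in draft.items():
    print(f"{field}: {value}")
```

Handling 'capable_of_part_of' would need a second relation and a different definition sentence, which is one reason to settle the generic definition wording first.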