Ontology meeting 2015-06-25

Attendees: Paola, David OS, Melanie, Harold, Heiko, Paul T, David H

Minutes: Paola

transcription factor name changes

Quick decision? https://www.ebi.ac.uk/panda/jira/browse/GO-350

 Needs DPH input. Is leaving out 'sequence-specific DNA binding' from names a potential cause of dangerous ambiguity?  Karen suggests it might, but hasn't provided examples.
 DPH - can't give examples off-hand.   Is happy with compromise naming solution.  DOS will go with this.

Protein family complex terms

For protein complexes consisting of multiple subunits from the same family (e.g. smads, integrins, some channels), should we just use a general family term and then rely on IntAct for specific classes?

(We provisionally agreed to this before). For many such cases, our coverage is very spotty and not likely to get better soon.

BUT NOTE: This approach involves obsoleting the more specific classes we currently have. Ideally we'd have a way to reference the IntAct ID in an annotation extension (AE), although this rather stretches the semantics of AEs.

examples:

http://sourceforge.net/p/geneontology/ontology-requests/11740/ http://sourceforge.net/p/geneontology/ontology-requests/11735/

 Shall we obsolete such terms if they have never been used (or nearly never)? 
 Can we come up with a relationship/extension for the children of e.g. ‘integrin complex’ to represent information that would otherwise be lost if we obsoleted? 
 Maybe not? 
 Until then, it’s difficult to go ahead with the obsoletions - we’d have to maintain things manually. 
 We’ll need to discuss in DC. We’ll get a list of these families from Sandra and/or Birgit at IntAct, then we’ll look into the annotations.

 AI: David OS to follow up with IntAct on this - requesting more examples  -  and add to the GOC meeting agenda page.

Follow-up: New TG templates

Where are we with action items on these? Copying over from last discussion:

If the export templates are ready, we'll test them, then advertise them along with the ones below:

cellular component binding

cellular component organization

import_across_membrane

import_across_plasma_membrane

import_into_cell

STATUS OF TEMPLATES:

cellular component binding  - FINISHED.  READY FOR RELEASE

cellular component organization  - FINISHED.  READY FOR RELEASE

import_across_membrane - FINISHED.  READY FOR RELEASE

export_across_membrane - PATTERN SPECIFIED*.  NOT YET ON TG new.

*https://www.ebi.ac.uk/panda/jira/browse/GO-348  This has export pattern, ready to be implemented.

 https://www.ebi.ac.uk/panda/jira/browse/GO-331 
   
  Discussion of problems with current 'import into cell' and 'import across plasma membrane' & relevance to export equivalents

   'import across plasma membrane' terms are currently inferred subclasses of import into cell. 
   This is perhaps questionable for cells with a double membrane, but it is definitely wrong to have the reverse inference on export for cells with a double membrane.
   We may need a new term for a plasma membrane that is the outer layer of the cell. It could be defined using the 'bounding layer of' relationship,
   which is used for defining the bounding membranes of membrane-bound organelles.

   We need to look more carefully at the import to cell and import across plasma membrane patterns. 
   In the meantime, advertise other templates above, including generic export template

AI: Paola to advertise templates for 'cellular component binding' and 'cellular component organization'.

AI: Heiko to implement generic 'export across membrane' template

AI: David OS to coordinate analysis and resolution about use of new term or extending existing relationship.

Follow-up: GO-SO issues

Where are we? Copying from last discussion:

There are various problems with our use of SO, some of which require co-ordination with SO dev:

We need a bridge from SO transcript terms -> ChEBI:RNA. In the absence of this, lots of inference is missing. Will the long-planned molecular SO (MSO) save us, or do we need our own bridge axioms?
We need a differentia for recording which RNA metabolic processes are processing (i.e. involve maturation). We may be able to do this using terms from SO (see next item), or we could use a strategy similar to the one we use for developmental progression, via a 'results in maturation of' relation. (We may, in fact, need a combination of these.)
We use the SO terms mRNA, ncRNA and its children as if they refer to both mature and immature states of transcripts. In fact, according to SO they refer to the mature state. To align with SO properly we would need to review usage and use alternative SO terms where available. SO has an additional set of terms for primary transcripts, but no terms for immature transcripts. Primary transcript refers only to the state before splicing, so there are no terms for intermediate states after splicing and before the other modifications involved in maturation, such as capping and polyadenylation for mRNA. Need to discuss possibilities of adding these with SO.

Meeting report: 
  * We committed to using MSO once it is ready, which should be soon. They are working on MSO->SO sync mechanisms.
    * In the meantime we will continue to use SO, adding a bridging axiom to GO  for SO:transcript -> ChEBI:RNA.
  * SO will add a set of terms for transcripts that are agnostic as to their maturity, as well as mature and primary (which they already have).
  * We came up with a set of patterns to use to keep the 3 hierarchies in sync in SO.
  * We will switch to using the appropriate SO terms until MSO is available.

We need a formal way to refer to transcript maturation; it’s all manual right now. 
It would be nice if we could do it by referring to the participating transcript, but maybe that’s a bit too complicated. 
Otherwise use results_in_maturation_of - currently it’s used strictly in development; could it be expanded? 
No, that may lead to too many unwanted inferences. 
We probably need a whole new relation to refer to transcript maturation. 
AI: come up with a new relation. Do that in discussion with SO.

https://code.google.com/p/sequence-ontology/source/browse/#svn%2Ftrunk%2Fsrc%2Fontology%2Fdev

 AI:  DONE DOS to add ticket for new relation to RO repo - with aim of adding to ontology ASAP.
    https://code.google.com/p/obo-relations/issues/detail?id=63&thanks=63&ts=1434641139

Follow-up: X dependent process & involved in

We have many X-dependent process terms. My initial reading of the names and definitions of these terms was that the X must be upstream, but typically X is (implicitly) treated as part of the process that depends on it, as shown by the large number of terms of the form 'X involved in X-dependent process Y'. This may be a good thing, but perhaps could be made clearer with some design patterns.

Prompted by: https://sourceforge.net/p/geneontology/ontology-requests/11713/#a397

 Ruth & Rachael are already reviewing ~ 2000 annotation extensions using dependent_on.
 The results are posted on the AE GitHub site https://github.com/geneontology/annotation_extensions/issues/17
 TODO: Following this, DOS to review the list of terms in the ontology using 'during' to see where the same patterns might apply. <- PRIORITY?

 in_presence_of is much more defensible than dependent_on. The former is used for chemicals/CC/proteins (https://github.com/geneontology/annotation_extensions/issues/18).
 The conservative proposal is to obsolete dependent_on and tighten up in_presence_of to ban annotation to substrates.

Aim for guidance on the use of 'during' in the ontology and by annotators. To be discussed in annotation call.

Follow-up: Remove redundant relations for release files

As discussed last week, we updated the release process to remove redundant super classes for go.{obo|owl} and go-simple.{obo|owl} (go-basic and other legacy files are derived from go-simple, so they should be fine too).

 Implemented.  Would be good to also remove redundant relationships from the release file (rather than just classifications).  But this has to be a longer term aim as we can't just use built-in OWL-API methods.

 Still to do: remove redundant classification from the go-plus file release, which needs more work. (It is currently also a different Jenkins job.)
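
For reference, the idea of removing redundant super classes can be sketched as a transitive reduction over asserted is_a links. This is only an illustrative sketch, not the release pipeline's implementation (which uses the OWL-API): the function names and the term IDs A, B, C are hypothetical, and it assumes the is_a graph is acyclic and given as a plain dictionary.

```python
from typing import Dict, Set


def ancestors(term: str, parents: Dict[str, Set[str]],
              cache: Dict[str, Set[str]]) -> Set[str]:
    """Return all strict ancestors of `term` (assumes an acyclic graph)."""
    if term in cache:
        return cache[term]
    result: Set[str] = set()
    for parent in parents.get(term, set()):
        result.add(parent)
        result |= ancestors(parent, parents, cache)
    cache[term] = result
    return result


def remove_redundant_superclasses(
        parents: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Drop each asserted parent that is already implied by another parent."""
    cache: Dict[str, Set[str]] = {}
    reduced: Dict[str, Set[str]] = {}
    for term, direct in parents.items():
        redundant = {
            p for p in direct
            for q in direct
            if q != p and p in ancestors(q, parents, cache)
        }
        reduced[term] = direct - redundant
    return reduced


# Hypothetical example: A is_a B, B is_a C, and a redundant A is_a C.
example = {"A": {"B", "C"}, "B": {"C"}, "C": set()}
reduced_example = remove_redundant_superclasses(example)
```

Here the asserted A is_a C link is dropped because it already follows from A is_a B and B is_a C. Doing the same for other relationships is harder, since the implied links then depend on relation-specific rules rather than on plain subclass transitivity.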