Ontology meeting 2014-11-13

From GO Wiki

Attendees: David H, Harold, Tanya, David OS, Paola, Heiko, Jane, Paul T

Minutes: David H


Follow-up: Relationships

[Copied from last meeting - where are we? Anything we should discuss today?]

Follow-up to the GOC meeting discussion with Ruth on annotation extension relationships. The action item was:

"Ontology group will dedicate a meeting to looking through the list of all relations used in annotation extensions. They will document those that are used in the ontology and will indicate whether the remainder should be dropped or kept and therefore documented. They will also note any new relations that need to be added e.g. adjacent_to.

Ontology group will make sure all the relationships are modeled correctly in LEGO."

Also see email thread 'Annotation extension relations'.

Chris writes: "The first task will be to migrate the editors version of the annotators extensions to RO to OWL so we can better manage these."

See email thread 'AE relationships from last meeting'.

  PROGRESS: 
       - reviewed all except 'over' relations  and marked whether in RO, should be obsoleted, or throw back to editors for more work 
           (see http://wiki.geneontology.org/images/2/28/AE_relations_GO_EDs_review.txt)
       - Paul T has made LEGO models in noctua based on existing AE wiki doc - he will link to these from the appropriate AE wiki pages.
  TODO: Work through relations in RO to check whether RO defs are clear enough and consistent with usage guidance from annotators (WHO, WHEN?)
  TODO: schedule meeting to discuss has_input  :DOS:
     When: Week of Nov 17th if poss (Chris away next week)
     invite: DH, Ruth, Val, Chris, DOS, Jane, Paul T - anyone else?
     We MUST prepare the various uses and suggested names for the variants beforehand
  TODO:  Discuss better formalising local domain and range (currently stored as ID strings - target should be actual ontology classes, perhaps unions) :David:Chris:Tony:
  TODO:  Further restrict local domains so that protein2GO can help annotators appropriately restrict term choice.
       This is particularly important for specific relations required for dev branch.

We will have the has_input meeting next Monday. Once we are done, we need to go back to the annotators and report which relations will be valid for use. Will the RO definitions be of use for annotators? We need one person from the annotation group and one from the ontology group to get together to make sure the RO definitions and the annotation usage guidance agree: Ruth and David OS.

Implementing new merge strategy / representation in OWL

Last week we agreed:

   AI: OBO2OWL round-tripping alt_id -> obsolete + replaced_by in OWL only.
       We gradually move towards using obsoletion + replaced_by as part of our Ed policy.   
       Need to trail this to MODs (who will then gnash teeth)
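A minimal sketch of what the first part of this AI could look like, assuming terms are held as plain Python dicts. The field names ("alt_ids", "is_obsolete", "replaced_by") and the "merged" marker are illustrative assumptions, not the agreed round-trip spec, and the EX: identifiers are made up. The marker is there so that stubs created from alt_ids can be told apart from ordinary obsolete terms on the way back to OBO.

```python
def alt_ids_to_stubs(terms):
    """OBO -> OWL direction: emit one obsolete stub per alt_id.

    Each stub points at the surviving term through replaced_by.
    The "merged" flag is a hypothetical marker for this sketch.
    """
    records = []
    for term in terms:
        # The primary term is kept as-is, minus its alt_ids.
        primary = {k: v for k, v in term.items() if k != "alt_ids"}
        records.append(primary)
        for alt_id in term.get("alt_ids", []):
            records.append({
                "id": alt_id,
                "is_obsolete": True,
                "replaced_by": term["id"],
                "merged": True,
            })
    return records


def stubs_to_alt_ids(records):
    """OWL -> OBO direction: fold merge stubs back into alt_ids."""
    primaries = []
    by_id = {}
    # First pass: collect everything that is not a merge stub.
    for record in records:
        if not record.get("merged"):
            copy = dict(record)
            primaries.append(copy)
            by_id[copy["id"]] = copy
    # Second pass: attach each stub's ID to its replacement term.
    for record in records:
        if record.get("merged"):
            target = by_id[record["replaced_by"]]
            target.setdefault("alt_ids", []).append(record["id"])
    return primaries


obo_terms = [
    {"id": "EX:0000001", "name": "term a", "alt_ids": ["EX:0000002", "EX:0000003"]},
    {"id": "EX:0000004", "name": "term b"},
]
owl_records = alt_ids_to_stubs(obo_terms)
round_tripped = stubs_to_alt_ids(owl_records)
```

In this sketch the round trip is lossless only because of the explicit marker; without something equivalent, a stub and a genuinely obsoleted term with a replaced_by would be indistinguishable.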

First part of this AI is for Chris and Heiko (should it also be part of standard roundtrip spec?) Second and third parts: Do we need more specific AIs with clear responsibility, or is this still just an aspiration?

Heiko and Chris will work on the round-tripping. We don't need to make a plan for the second part until we have implemented the first part.

Transcription factor activity

We had a request from the GOC meeting to add a union term for DNA-binding sequence-specific etc TF and protein-binding sequence-specific etc TF. What do we call this term? Should 'transcription factor activity' be reserved for DNA-binding only?

We will rename the DNA-binding transcription factor activity to be just transcription factor activity. We will reinstate transcription regulator activity. We will label the new high-level term as a 'do not annotate' term. Instead we will annotate to the process.

Other action items for the whole team from GOC meeting?

Minutes here: https://docs.google.com/document/d/1NonH97s8xEpDdx6DfonKPKI_RdHbbG-yft85UZtUmF0/edit#

Paola will make a list of AIs and present it at the next meeting.

Follow-up: cell migration

Action items from last call here: http://wiki.geneontology.org/index.php/Ontology_meeting_2014-10-09#TG_template_for_.27cell_migration.27

Left to do: Heiko to change relationship type in TermGenie for this template (from the current 'alters location of' to the new 'results in movement of'). For the ontology bit, I (Paola) made the changes, but am keeping a SF ticket open as some cell migration terms lack logical defs and we should add them.

Ontology term for protein family / protein domain (family)

Tony S is redoing imports for QuickGO and trying to be as ontology compliant as possible in how things are typed for import (JIRA ticket here: https://www.ebi.ac.uk/panda/jira/browse/GO-322)

I've persuaded him that in most cases, he should be able to use an existing foundry ontology class. But there are still a few cases where we don't have an obvious term. In particular, what do Panther IDs and InterPro IDs refer to?

It is not particularly useful to have them refer to proteins and protein domains respectively - as Tony wants to distinguish from a protein in a specific species. If we say that Panther IDs refer to classes of protein, then we would type using an ID for protein. But perhaps it is better to treat protein family as some kind of information artefact - or in OWL terms as an individual with members rather than a class?

Bumped until Chris gets back. Judy, Chris and Cathy will talk about this and coordinate once Chris gets back.

Links to documentation

[Copying from last week - should we follow up on decisions below now, or bump to 1-2 weeks in the future?]

Following old discussion and recent email thread, in search of a more robust and logic-oriented approach than the one currently adopted in QuickGO.

See email thread 'Rough notes for OG from PAINT call'. Summary:

"The goal is to have a procedure that runs as part of the release process that propagates the URLs (or any annotation property) down the hierarchy based on a structured query (OPPL, SPARQL etc)

DavidOS - do you think this is possible based on your axiomatization for membrane proteins

Others - can we axiomatize the production terms to avoid dependence on regexs?"

   DECISIONS: 
      - Propagation rules will go in an OWL file - using DL or SPARQL or some combination (rather than hacking something with xref_abbs)
         - We should investigate using OWL image annotation system developed at VFB for OWL formalisation
         - *could* use super-relations in order to do 'unions' (e.g. IPR)
         - Chris notes - may also be an option for linking to LEGO exemplars for some terms.
      - Propagation would then happen at build time so the public would see propagated in dbxrefs.
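To illustrate the kind of build-time propagation described above, here is a rough sketch in plain Python. It stands in for the OWL/SPARQL rules that were actually decided on and is not a proposal to replace them. The input shapes (a map from each term to its direct superclasses, and a map from terms to directly asserted documentation URLs) and the example.org URLs are assumptions made for the example, and the hierarchy is assumed to be acyclic.

```python
def propagate_links(parents, links):
    """Push documentation URLs down an is_a hierarchy.

    parents: term -> list of direct superclass terms
    links:   term -> set of URLs asserted directly on that term
    Returns: term -> set of URLs asserted on the term or any ancestor.
    """
    cache = {}

    def inherited(term):
        # Memoised walk up the hierarchy (assumes no cycles).
        if term not in cache:
            urls = set(links.get(term, ()))
            for parent in parents.get(term, ()):
                urls |= inherited(parent)
            cache[term] = urls
        return cache[term]

    # Every term mentioned anywhere gets an entry in the result.
    terms = set(parents) | set(links)
    for direct_parents in parents.values():
        terms.update(direct_parents)
    return {term: inherited(term) for term in terms}


parents = {"child": ["mid"], "mid": ["root"], "other": ["root"]}
links = {
    "mid": {"http://example.org/doc/mid"},
    "root": {"http://example.org/doc/root"},
}
propagated = propagate_links(parents, links)
```

A real implementation would also need to decide which relations beyond is_a count for propagation, which is where the super-relation idea for 'unions' would come in.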


Bumped until next week.

Membrane proteins

David OS writes:

"I've made a Google Doc on the current state of the ontology wrt membranes and their components - and trying to relate this to the biology. I think the new pictures help enormously. Comments most welcome.

https://docs.google.com/document/d/1e05OH9v82bOLglMGnKpy4_BiotlHxh12IVKRXczJk-w/edit#heading=h.41zc0ewzwqjd

Key points: I completely agree that we need to get rid of the clause: "with the bulk of the gene product residing on the external side of the membrane". But there are still many cases where a protein is only on one side of the membrane, and we should be able to record that. I also see no reason to exclude recording that a protein is membrane spanning - unless we consider it simply too late in the game to have this term (actually, we've had it for a few years, so better to say - too late in the game to publicise it?)."

David OS went through the new proposal. It seems that all of the possible combinations of what an experiment might show are covered by the new structure. We might want to use an overlaps relation in the ontology structure. We need to figure out how to integrate how the protein is attached to a membrane with the topology of the membrane. Some of the terms came in based on integration with keywords. We should make the assays part of the documentation.

protein binding - families vs domains

(DOS: I'm sure this has been discussed before. But did we agree on a strategy?)

Protein binding terms can be defined wrt families or domains. The latter is clearly better in providing useful generalisations as binding to a member of a family could be due to binding to non-conserved domains. We have terms defined wrt families - and lots of annotations to these, so we don't have the luxury of making a choice, but should we have both types of terms with relationships between them where appropriate* and comments to encourage annotation to domain terms if possible?

(* e.g. if all members of family Y have domain X, then 'domain X binding' can safely be subclass of 'family Y binding')

Domain binding will be the thing that is reported in a paper. But we have defined the existing terms with respect to protein families. Domain binding should be treated differently from family binding. We will make both terms, the family term and the domain term, but to be safe we will not relate the two in the ontology for now.