Ontology meeting 2014-11-20

Attendees:

Minutes: Paola


Downloads page on GO website

Some time ago (before the Barcelona meeting), we (esp. Chris) mentioned that we may want to improve/increase documentation on this page - or is it ok as is now? http://geneontology.org/page/download-ontology

Related: Susan says that "FlyBase are finally switching from gene_ontology.obo to go-basic.obo. Would it be an idea to mention gene_ontology.obo in the legacy section in case folks are confused about this change. I haven't managed to find an official announcement about these name changes in gofriends emails though I could have missed it." and "Searching the new site with gene_ontology.obo gives no hits which isn't ideal."

What's in the cell and what's out

Chris to report - he set up a Google doc to discuss this some time ago. Is this resolved now?

(Chris, could you post the link to the Google doc here if needed, thanks.)

Propagation of intersections to relationships

Why isn't this part of the pipeline? Is it just because we have a tradition of manually saying everything twice (once a logical def and once in a relationship)?

See https://sourceforge.net/p/geneontology/ontology-requests/11311/ for example.
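To make the question concrete, here is a minimal sketch of what such a propagation step might look like at the OBO text level. It assumes a simple genus-plus-differentia logical definition; the stanza, the `EX:` IDs and the function names are hypothetical, and a real pipeline step would more likely rely on an OWL reasoner than on string handling.

```python
def strip_comment(line):
    """Drop a trailing '! comment' and surrounding whitespace from an OBO line."""
    return line.split("!", 1)[0].strip()


def implied_by_logical_def(stanza_lines):
    """Return the is_a / relationship lines implied by intersection_of lines.

    A genus line (one ID) implies an is_a; a differentia line
    (relation + ID) implies a relationship with the same relation.
    """
    implied = []
    for line in map(strip_comment, stanza_lines):
        if not line.startswith("intersection_of:"):
            continue
        parts = line[len("intersection_of:"):].split()
        if len(parts) == 1:
            implied.append(f"is_a: {parts[0]}")
        elif len(parts) == 2:
            implied.append(f"relationship: {parts[0]} {parts[1]}")
    return implied


def missing_relationships(stanza_lines):
    """Return implied lines that are not already asserted in the stanza."""
    asserted = {strip_comment(line) for line in stanza_lines}
    return [line for line in implied_by_logical_def(stanza_lines)
            if line not in asserted]


# Hypothetical stanza: the logical definition is present, but the
# matching relationship was never asserted by hand.
stanza = [
    "id: EX:0000003",
    "name: example transport to example location",
    "intersection_of: EX:0000001 ! example transport",
    "intersection_of: has_target_end_location EX:0000002 ! example location",
    "is_a: EX:0000001 ! example transport",
]
print(missing_relationships(stanza))
```

Run over every stanza at build time, a check like this would list the relationships that currently have to be added by hand.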

Follow-up: Relationships (Annotation extensions)

Where are we with this: "We will have the has_input meeting next Monday [i.e. 3 days ago]. Once we are done, we need to go back to the annotators and report about which will be valid for use. Will the RO definitions be of use for annotators? We need a person from the annotation group and the ontology group to get together to make sure both are the same, David OS and Ruth."

Background here: http://wiki.geneontology.org/index.php/Ontology_meeting_2014-11-13#Follow-up:_Relationships

Ontology term for protein family / protein domain (family) (bumped from last week)

Tony S is redoing imports for QuickGO and trying to be as ontology compliant as possible in how things are typed for import (JIRA ticket here: https://www.ebi.ac.uk/panda/jira/browse/GO-322)

I've persuaded him that in most cases, he should be able to use an existing foundry ontology class. But there are still a few cases where we don't have an obvious term. In particular, what do Panther IDs and InterPro IDs refer to?

It is not particularly useful to have them refer to proteins and protein domains respectively, as Tony wants to distinguish them from a protein in a specific species. If we say that Panther IDs refer to classes of protein, then we would type using an ID for protein. But perhaps it is better to treat protein family as some kind of information artefact, or in OWL terms as an individual with members rather than a class?

AI from last week: Judy, Chris and Cathy will talk about this and coordinate once Chris gets back.

Links to documentation (bumped from last week)

[Copying from last week - should we follow up on decisions below now, or bump to 1-2 weeks in the future?]

Following an old discussion and a recent email thread, we are looking for a more robust and logic-oriented approach than the one currently adopted in QuickGO.

See email thread 'Rough notes for OG from PAINT call'. Summary:

"The goal is to have a procedure that runs as part of the release process that propagates the URLs (or any annotation property) down the hierarchy based on a structured query (OPPL, SPARQL etc)

DavidOS - do you think this is possible based on your axiomatization for membrane proteins

Others - can we axiomatize the production terms to avoid dependence on regexs?"

   DECISIONS: 
      - Propagation rules will go in an OWL file - using DL or SPARQL or some combination (rather than hacking something with xref_abbs)
         - We should investigate using OWL image annotation system developed at VFB for OWL formalisation
         - *could* use super-relations in order to do 'unions' (e.g. IPR)
         - Chris notes - may also be an option for linking to LEGO exemplars for some terms.
      - Propagation would then happen at build time, so the public would see the propagated links in dbxrefs.
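
As an illustration of the intended effect only, the sketch below propagates directly asserted links down an is_a hierarchy so that each term also carries the links of all its ancestors. The decision above is to express the rules in OWL (DL and/or SPARQL), so this Python is not the proposed mechanism; the term IDs, URLs and function name are hypothetical, and the hierarchy is assumed to be acyclic.

```python
def propagate_down(parents, direct):
    """Return term -> set of links, including links inherited from ancestors.

    parents: dict mapping a term to the list of its is_a parents.
    direct:  dict mapping a term to the set of links asserted on it.
    Assumes the is_a graph has no cycles.
    """
    cache = {}

    def inherited(term):
        # Memoise so shared ancestors are only visited once.
        if term not in cache:
            links = set(direct.get(term, ()))
            for parent in parents.get(term, ()):
                links |= inherited(parent)
            cache[term] = links
        return cache[term]

    terms = set(parents) | set(direct)
    for parent_list in parents.values():
        terms.update(parent_list)
    return {term: inherited(term) for term in terms}


# Hypothetical hierarchy: EX:3 is_a EX:2 is_a EX:1, and EX:4 is_a EX:1.
parents = {"EX:2": ["EX:1"], "EX:3": ["EX:2"], "EX:4": ["EX:1"]}
direct = {
    "EX:1": {"http://example.org/doc/general"},
    "EX:2": {"http://example.org/doc/specific"},
}
propagated = propagate_down(parents, direct)
for term in sorted(propagated):
    print(term, sorted(propagated[term]))
```

Running something equivalent at build time would mean the links are stored once, on the most general applicable term, and appear on descendants only in the released file.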

Action items for ontology team from GOC meeting

Paola to report. (FYI, minutes are at https://docs.google.com/document/d/1NonH97s8xEpDdx6DfonKPKI_RdHbbG-yft85UZtUmF0/edit#)

TG:

Action: Heiko: In the TG form, change ‘add template’ to ‘Use template’

Action: Ontology group to look into creating intermediate templates for complexes

GO annotations for IntAct complexes:

Action: Harold: GO, IntAct, and PRO need to coordinate better so that they are aware of the new and planned developments on each project, to ensure there is no duplicated effort.

Readability of term labels:

Chris’s 3 proposals based on David’s slides:

   - If TG detects that the term is overly composed (automatically detected based on a profile of ‘overly composed’), the requester is forced to enter an example of usage in the comment (i.e. a free-text description of their annotation).
      - Pascale: The automated check, based on term string length (100 characters?) and other characteristics, would trigger a mandatory comment. In practice we recommend that ALL new GO terms should have at least ONE example of a correct annotation (PubMed ID + figure included). That would help people understand what the term means, and perhaps help them annotate to an already existing term.
   - A term that a sub-editor tags as overly composed will be auto-unfolded and shown in a decomposed fashion in AmiGO.
   - If a term is auto-detected as being overly composed, the curator is forced to enter a Noctua model ID before the term is generated.

Guidelines: If there is a commonly used name then that is the first choice for a term/class label. The labels of the parental terms may be helpful, but there is no requirement to reiterate the parental labels in the labels of the child classes.
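
The automated check discussed above could be sketched roughly as follows. This is only an illustration: it uses label length alone, with the tentative 100-character threshold from the discussion, because the "other characteristics" are not specified here. The function names are hypothetical and are not part of TG.

```python
def is_overly_composed(label, max_length=100):
    """Flag a label as overly composed using string length only.

    The 100-character default is the tentative figure from the
    discussion; whether the limit is inclusive is an assumption.
    """
    return len(label.strip()) > max_length


def validate_request(label, usage_example, max_length=100):
    """Return a list of error messages for a term request (empty if OK)."""
    errors = []
    if is_overly_composed(label, max_length) and not usage_example.strip():
        errors.append(
            "Label is longer than %d characters: an example of usage "
            "is required in the comment." % max_length
        )
    return errors


# A short label passes without an example; a long one does not.
short_label = "example process"
long_label = "regulation of " * 8 + "example process"
print(validate_request(short_label, ""))
print(validate_request(long_label, ""))
print(validate_request(long_label, "Free-text description of the annotation"))
```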