Ontology meeting 2015-06-04
Attendees: Paola, David OS, Tanya, David H, Harold, Chris, Heiko
Minutes: David OS
- 1 Moving to GitHub
- 2 Renaming of transcription-related terms
- 3 Follow-up: New TG templates
- 4 Follow-up: No more merges
- 5 Follow-up: GO-SO issues
- 6 Follow-up: Intersections vs. relationships in children of 'protein complex'
- 7 Follow-up: part_of vs has_part in export from nucleus
- 8 X dependent process & involved in
Moving to GitHub
Has everyone added their GitHub username in the appropriate file?
What do converted tickets look like? Assume we can do a better job than the current default?
Original ticket in SF: https://sourceforge.net/p/geneontology/ontology-requests/11358/
Is the discussion not ported over to GitHub? Same for status (that ticket was closed) and owner. These should be minimum requirements, followed by milestone and label(s).
Chris' progress report: First attempts used standard GitHub API and default mapping. Mapping not good enough and got locked out of API even with requests timed at one every 3 seconds (could have been due to cc'ing authors in tickets). Contacted GitHub who pointed to new GitHub API. Looks nice and straightforward to get all our data in, and seems more likely GitHub won't lock out scripts.
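To make the minimum requirements above concrete, here is a minimal sketch of a ticket mapping that carries over discussion, status, owner, milestone and labels. Everything in it is an assumption for illustration: the SourceForge field names (`summary`, `assigned_to`, `comments`, ...), the `user_map` lookup (standing in for the username file mentioned above) and the output layout are hypothetical, and it is not Chris' script or a statement of what the GitHub API expects.

```python
def to_issue_payload(ticket, user_map):
    """Map one exported ticket (hypothetical dict layout) to an issue payload."""
    issue = {
        "title": ticket["summary"],
        # Keep a pointer back to the original ticket in the body.
        "body": f"{ticket['description']}\n\nOriginal ticket: {ticket['url']}",
        # Status: anything starting with "closed" is treated as closed.
        "closed": ticket["status"].startswith("closed"),
        "labels": list(ticket.get("labels", [])),
    }
    # Owner: only set when the SourceForge user has a known GitHub username.
    owner = user_map.get(ticket.get("assigned_to"))
    if owner:
        issue["assignee"] = owner
    if ticket.get("milestone"):
        issue["milestone"] = ticket["milestone"]
    # Discussion: authors are written as plain text (no "@"), so that
    # importing a ticket does not notify everyone who commented on it.
    comments = [
        {"body": f"{c['author']} wrote:\n\n{c['text']}"}
        for c in ticket.get("comments", [])
    ]
    return {"issue": issue, "comments": comments}


# Made-up ticket content; only the URL comes from the minutes.
example = to_issue_payload(
    {
        "summary": "example request",
        "description": "Please add a term.",
        "url": "https://sourceforge.net/p/geneontology/ontology-requests/11358/",
        "status": "closed-fixed",
        "assigned_to": "sf_user",
        "milestone": "example milestone",
        "labels": ["New term request"],
        "comments": [{"author": "sf_user", "text": "Done."}],
    },
    {"sf_user": "gh_user"},
)
```

Writing comment authors as plain text rather than mentions is one way to avoid the cc'ing that may have contributed to the lock-out.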
Renaming of transcription-related terms
Outcome of email discussion?
Summary of changes:
1. Move the mechanism to the end of the term, after a comma:
transcription factor binding transcription factor activity > transcription factor activity, transcription factor binding
copper ion regulated core promoter proximal region sequence-specific DNA binding RNA polymerase II transcription factor activity > RNA polymerase II transcription factor activity, copper ion regulated core promoter proximal region binding
2. Use transcriptional repressor/activator in place of 'transcription factor activity involved in negative regulation of transcription' and 'transcription factor activity involved in positive regulation of transcription':
RNA polymerase II transcription factor binding transcription factor activity involved in negative regulation of transcription > transcriptional repressor activity, RNA polymerase II transcription factor binding
RNA polymerase II transcription factor binding transcription factor activity involved in positive regulation of transcription > transcriptional activator activity, RNA polymerase II transcription factor binding
3. Change 'X region sequence-specific DNA binding' to 'X binding', in cases where it is obvious from the name that this is a case of 'sequence specific DNA binding'. e.g.:
bacterial-type RNA polymerase core promoter proximal region sequence-specific DNA binding transcription factor activity > transcription factor activity, bacterial-type RNA polymerase core promoter proximal region binding
AI: DOS to double check with Karen + Norwegian group before implementing.
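The three renaming rules above can be sketched as a small string rewrite. This is only an illustration checked against the examples in these minutes, not the script that would be used for the real change. The function name `rename` is made up, rule 3 is applied unconditionally (the minutes restrict it to cases where sequence-specific DNA binding is obvious from the name, which needs human review), and labels that do not match are returned unchanged.

```python
NEG = " involved in negative regulation of transcription"
POS = " involved in positive regulation of transcription"
# Longest base name first, so the RNA polymerase II form wins when present.
BASES = (
    "RNA polymerase II transcription factor activity",
    "transcription factor activity",
)


def rename(label):
    """Apply rules 1-3 to one term label; return it unchanged if no rule fits."""
    original = label
    activity = None
    # Rule 2: repressor/activator replaces the 'involved in ...' suffix.
    if label.endswith(NEG):
        label, activity = label[: -len(NEG)], "transcriptional repressor activity"
    elif label.endswith(POS):
        label, activity = label[: -len(POS)], "transcriptional activator activity"
    # Rule 1: move the mechanism to the end of the name, after a comma.
    for base in BASES:
        if label.endswith(" " + base):
            mechanism = label[: -len(base) - 1]
            if activity is None:
                activity = base
            break
    else:
        return original
    # Rule 3: 'X region sequence-specific DNA binding' becomes 'X region binding'.
    mechanism = mechanism.replace(
        " region sequence-specific DNA binding", " region binding"
    )
    return f"{activity}, {mechanism}"
```

Note that a label combining the RNA polymerase II base name with an 'involved in ...' suffix does not occur in the examples above; the sketch would drop 'RNA polymerase II' from such a name, so that case would need its own decision.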
Follow-up: New TG templates
Where are we with action items on these? Copying over from last week:
If the export templates are ready, we'll test them, then advertise them along with the ones below:
cellular component binding
cellular component organization
We resolved to change the template for 'cellular component organization' to make 'cellular process' the genus (let’s try that, rather than further up the graph, and see what happens). Jira ticket: https://www.ebi.ac.uk/panda/jira/browse/GO-327 UPDATE - THIS IS NOW DONE.
Once that's done and tested, we can advertise the 5 templates above.
As for the export templates, David OS is still working on them. He'll check that he has all the relevant info in one ticket.
https://www.ebi.ac.uk/panda/jira/browse/GO-348 This has the export pattern, ready to be implemented. https://www.ebi.ac.uk/panda/jira/browse/GO-331 David is looking into whether patterns for export from cell are safe. Paola suggests waiting until all templates are ready before announcing.
Follow-up: No more merges
Where are we with action items on this? There was a bit of discussion at the managers call too. Copying over from last week:
Obsolete instead and use the replace tag, as discussed in the recent past.
Chris tracked the Protege request here as it's best done as an extension to Jim's plugin: https://github.com/balhoff/obo-actions/issues/1
Any further action item we need to take to make sure this becomes regular practice in GO?
We need a long-term plan to deal with these changes, alert the databases, make sure the infrastructure is up to speed. David H will ask Harold about MGI expected behavior. We need to make a detailed proposal and bring it up as a heads-up on an annotation call. In the meantime, as a short-term solution, we’ll work on the one outlined in the github link above. Then we need to test AmiGO load before going live.
(Related) AI for Chris: remember to prepend all existing obsolete term names with ‘obsolete’ (see http://wiki.geneontology.org/index.php/Ontology_meeting_2015-04-23#Follow-up:_Prefixing_of_obsolete_to_label_of_all_obsoletes)
Ultimate aim: Use obsolete & replaced_by, where replaced_by has a cardinality of 1. We will only announce obsoletion where there is either no suggested replacement, or multiple suggested replacements (via consider).
Reason: The current strategy causes loss of URIs and their associated data. This makes life hard for groups who track the OWL and assume URIs will remain stable and resolvable. It also loses useful information associated with terms.
Short term:
- Change the OBO to OWL conversion to ensure URIs are not lost on the OWL side (this has already been done and will be in OWL-API).
- Note: with this change to the OWL, we need to check that the AmiGO load (from OWL) works OK.
- Editors merge as usual, but also add xref to losing synonym: alt_id:GO:nnnnnnn.
Medium term:
- AI: Change all 1:many replaced_by to 1:1. (Chris?)
- AI: Make a list of things that would break in GO when we make the full change. (Chris, Heiko)
Long term:
- AI: Chris takes charge of announcing the full change to the world, giving 1 year lead time.
AI: Update doc (for now stays on wiki).
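As a starting point for the 1:many replaced_by clean-up, the following is a minimal sketch that scans OBO-format text for obsolete terms and reports which ones break the cardinality-1 rule and which ones would need an announcement. It is an assumption-laden illustration, not an existing GO tool: the parser only handles the `id`, `is_obsolete`, `replaced_by` and `consider` tags, the function names are made up, the IDs in the sample are placeholders, and "needs announcement" is read here as "obsolete without exactly one replaced_by".

```python
def parse_terms(obo_text):
    """Collect id, obsolete flag, replaced_by and consider for each [Term] stanza."""
    terms, current = [], None
    for raw in obo_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas are tracked.
            current = None
            if line == "[Term]":
                current = {"id": None, "obsolete": False,
                           "replaced_by": [], "consider": []}
                terms.append(current)
        elif current is not None and ": " in line:
            tag, value = line.split(": ", 1)
            value = value.split(" ! ", 1)[0].strip()  # drop trailing "! comment"
            if tag == "id":
                current["id"] = value
            elif tag == "is_obsolete":
                current["obsolete"] = value == "true"
            elif tag in ("replaced_by", "consider"):
                current[tag].append(value)
    return terms


def multiple_replacements(terms):
    """IDs of terms with more than one replaced_by (the 1:many cases)."""
    return [t["id"] for t in terms if len(t["replaced_by"]) > 1]


def needs_announcement(term):
    """Obsolete terms without exactly one replaced_by get announced."""
    return term["obsolete"] and len(term["replaced_by"]) != 1


# Placeholder stanzas for illustration only.
SAMPLE = """\
[Term]
id: GO:0000001
name: obsolete example A
is_obsolete: true
replaced_by: GO:0000010

[Term]
id: GO:0000002
name: obsolete example B
is_obsolete: true
replaced_by: GO:0000010
replaced_by: GO:0000011

[Term]
id: GO:0000003
name: obsolete example C
is_obsolete: true
consider: GO:0000010
consider: GO:0000011

[Term]
id: GO:0000010
name: example D
"""

terms = parse_terms(SAMPLE)
to_fix = multiple_replacements(terms)
to_announce = [t["id"] for t in terms if needs_announcement(t)]
```

In this sample, example B is the only 1:many case, and examples B and C are the ones that would be announced; example A has a single replacement and stays silent.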
Follow-up: GO-SO issues
Where are we? Copying from last week:
There are various problems with our use of SO, some of which requires co-ordination with SO dev:
- We need a bridge from SO transcript terms -> ChEBI:RNA. In the absence of this, lots of inference is missing. Will the long planned SO molecular save us, or do we need our own bridge axioms?
- We need a differentia for recording which RNA metabolic processes are processing (involve maturation). We may be able to do this using terms from SO (see next item), or we could use a similar strategy to the one we use for developmental progression via a 'results in maturation of' relation. (We may, in fact, need a combination of these).
- We use the SO terms nRNA, ncRNA and its children as if they refer to both mature and immature states of transcripts. In fact, according to SO they refer to the mature state. To align with SO properly we would need to review usage and use alternative SO terms where available. SO has an additional set of terms for primary transcripts, but no terms for immature transcripts. 'Primary transcript' refers only to the state before splicing, so there are no terms for the intermediate states after splicing and before other modifications involved in maturation, such as capping and polyadenylation for mRNA. We need to discuss the possibility of adding these with SO.
DOS organising meeting with Karen Eilbeck. Who wants to be involved?
AI: organise a meeting with Mike Bada and Karen Eilbeck. David OS and Chris will be involved. AI: David OS will make a doodle poll. DONE
We need a formal way to refer to transcript maturation; it’s all manual right now. It would be nice if we could do it by referring to the participating transcript, but maybe that’s a bit too complicated. Otherwise use results_in_maturation_of - currently it’s used strictly in development; could it be expanded? No, that may lead to too many unwanted inferences. We probably need a whole new relation to refer to transcript maturation. AI: come up with a new relation. Do that in discussion with SO.
AI: DOS to add ticket for new relation to RO repo - with aim of adding to ontology ASAP.
Follow-up: Intersections vs. relationships in children of 'protein complex'
Guess what... where are we? Last week we wrote...:
(David OS to elaborate) To avoid mis-classification of protein complexes that may have the same function or be involved in the same process, we need to retrospectively check all children of 'protein complex' that have intersections (not relationships!) with capable_of function links and capable_of_part_of process links. David OS prepared a list:
Sent list to Harold. He will review.
Follow-up: part_of vs has_part in export from nucleus
DPH: All of the named RNAs that are exported from the nucleus are exported as part of RNP complexes. In the ontology we had asserted that the RNA export has_part RNP export. This seemed backwards to me since the RNA is part of the RNP. I reversed these relations to make the RNA transport part_of the RNP transport.
DPH to look at how this works for protein complexes, LDLs etc. If we go all the way with this then we should add the property chain transport o has_part -> transports. But this may be too strong. Leave use of this pattern up to editor discretion?
P.S. [from Paola] Also please let's make sure that the terms' definitions and/or comments fully detail the biology here. Might be confusing otherwise. PUNTED AGAIN
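To see what the property chain discussed in this section would infer, here is a minimal sketch over plain triples. It assumes the chain is read as `transports o has_part -> transports` (the minutes write the first relation as "transport"), uses made-up labels rather than real term names, and applies the chain once rather than to a fixpoint. It is not how a reasoner implements property chains.

```python
def apply_chain(triples, first, second, result):
    """Return new triples inferred by the chain `first o second -> result`."""
    # Index the second relation by subject: x -> {y : (x, second, y)}.
    second_by_subject = {}
    for s, p, o in triples:
        if p == second:
            second_by_subject.setdefault(s, set()).add(o)
    inferred = set()
    for s, p, o in triples:
        if p == first:
            # (s first o) and (o second y) together give (s result y).
            for y in second_by_subject.get(o, ()):
                inferred.add((s, result, y))
    return inferred - set(triples)


# Hypothetical labels for illustration only.
asserted = {
    ("RNP export from nucleus", "transports", "RNP complex"),
    ("RNP complex", "has_part", "RNA"),
    ("RNP complex", "has_part", "protein"),
}
new_triples = apply_chain(asserted, "transports", "has_part", "transports")
```

Here the chain yields that the RNP export transports both the RNA and the protein. Every part of a transported entity becomes something the process transports, which is the sense in which the chain may be too strong.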
X dependent process & involved in
We have many X-dependent process terms. My initial reading of the names and definitions of these terms was that the X must be upstream, but typically X is (implicitly) treated as part of the process that depends on it, as shown by the large number of terms of the form 'X involved in X dependent process Y'. This may be a good thing, but perhaps could be made clearer with some design patterns.
Ruth & Rachael are already reviewing ~ 2000 annotation extensions using dependent_on. They plan to post the results to the annotation extensions GitHub site (https://github.com/geneontology/annotation_extensions). Following this, DOS to review list of terms in ontology using during to see where the same patterns might apply.
Aim for guidance on use of during in Ontology and by annotators. To be discussed in annotation call.