Manager Call 2016-06-15


Minutes from the previous GO Managers Meeting can be found here. GO Managers meet using the GO Phone Conference Line from Jackson Laboratory.


Identifier space in GO annotations (GAF, GPAD)

  • Address pending questions:
    • Should GO provide a digested GAF that contains only canonical IDs in all columns except Column 17?
    • How should mapping of protein complex members be handled? We probably want a mechanism for mapping between gene or gene-centric protein IDs and protein complexes, and then automatically unfolding annotations to each member of a complex, using the contributes_to qualifier for MF.
    • Do we need to establish guidelines for curators who want to use more granular identifiers in AEs to make sure the IDs used correspond to the correct entity given the GO term used? How much effort should be put into this?
    • If we allow any gene or gene-centric identifier in AEs or With/From, what effects does this have on error checking?

  • For the purpose of this discussion:
    • Determine whether there are currently use cases for the more granular gene or gene product information in AEs and With/From. Consult with Val and Ruth on this. (A late-notice invitation was sent to Ruth and Val; hopefully their schedules will allow them to join.)
    • Look for examples of AE usage in the literature. One possible use case is mentioned in the Discussion section of "Gene Prioritization for Imaging Genetics Studies Using Gene Ontology and a Stratified False Discovery Rate Approach".
    • Need to check with Sandra Orchard about how protein complex mappings to gene or GCRP IDs are currently handled.



Regrets: Moni Munoz-Torres (USCIS).

Agenda: Moni | Minutes: Chris

Identifier space in GO annotations (GAF, GPAD)

Clarification: we are discussing the possibility of allowing people to use whichever ID they want, with the burden on the user to map those IDs back. Does GO want to provide files in which all IDs are mapped back to GCRP, for example?

Chris: the semantics for GAF are clear: if column 17 is filled in, then it names the specific entity carrying the function. If it is not filled in, the entity in column 2 is itself the entity carrying the function.
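
The column rule Chris describes can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not an official GO tool: it assumes a tab-separated GAF 2.x line, that column 17 (when present) already holds a prefixed ID, and that column 2 is a local ID to be prefixed with the database name from column 1. The name `functional_entity` and all example values are hypothetical placeholders.

```python
from typing import NamedTuple


class FunctionalEntity(NamedTuple):
    identifier: str     # prefixed ID of the entity carrying the function
    source_column: int  # 17 if taken from column 17, otherwise 2


def functional_entity(gaf_line: str) -> FunctionalEntity:
    """Pick the entity that carries the function for one GAF line."""
    cols = gaf_line.rstrip("\n").split("\t")
    # Pad short lines so that a missing column 17 is treated as empty.
    cols += [""] * (17 - len(cols))
    col17 = cols[16].strip()
    if col17:
        # Column 17 is filled in: it names the specific entity.
        return FunctionalEntity(col17, 17)
    # Otherwise the entity in column 2 is itself the functional entity.
    # Assumption: column 1 (DB) plus column 2 (local ID) form the full ID.
    return FunctionalEntity(f"{cols[0]}:{cols[1]}", 2)


# Placeholder line: every value below is made up for illustration.
example_fields = [
    "UniProtKB", "P00000", "Abc1", "", "GO:0000000", "PMID:0", "IDA", "",
    "F", "", "", "protein", "taxon:10090", "20160615", "MGI", "",
    "UniProtKB:P00000-2",
]
example_line = "\t".join(example_fields)
```

Calling `functional_entity(example_line)` returns the column 17 ID; blanking the last field makes it fall back to the column 1 and column 2 pair.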

Melanie: the semantics for GPAD are clear, and it uses column 2.

Chris: there is still flexibility: do you use UniProt or PRO, and which type of gene ID?

Judy: e.g. if we annotate to a mouse gene ID, should we enforce usage of an MGI ID (and disallow the use of a UniProt ID, for example, to represent an MGI gene)? Multiple IDs can identify the same thing, but how would the user know, unless GO takes responsibility for doing some of that wrangling for users?

June 6th: GOA files have been renamed. The old files will be maintained for a month or so; a new announcement will then be made and they will be removed.

Discussion: should we follow the new naming and splitting convention (gaf, gaf_rna, gaf_complex)?

Chris: this should be scheduled for after February 2017, and may be taken care of by the data capture group.

AI: Chris to add as a Trello card.

AI: data capture group to extend scope to release.

KVA: What identifier space should people be allowed to use for annotation extensions? We haven't made a decision on that.

Consensus: anything that is in the GPI/mapping file must be mapped back to a standard gene ID.

Continue to allow GCRP identifiers for cows, etc.

Ruth: RNA binding specificity use case.

DavidH: Make sure it's in the GPI file.

Summary: Can use anything in your organism's GPI file.
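
As a rough sketch of what "in the GPI file and mapped back to a standard gene ID" could mean in practice, the Python below checks an identifier against a parent mapping and follows it to the top-level entry. The dictionary `gpi_parents` is a hypothetical stand-in for whatever would be loaded from an organism's GPI/mapping file (ID to parent ID, with `None` for a standard gene ID); the function name and all IDs are invented for illustration and do not reflect an agreed GO implementation.

```python
from typing import Dict, Optional


def canonical_id(identifier: str, gpi_parents: Dict[str, Optional[str]]) -> str:
    """Map an identifier listed in the GPI data back to its top-level ID."""
    if identifier not in gpi_parents:
        # Only identifiers present in the organism's GPI file are allowed.
        raise KeyError(f"{identifier} is not in the GPI file")
    seen = set()
    current = identifier
    # Follow parent links until an entry with no parent is reached.
    while gpi_parents.get(current):
        if current in seen:
            raise ValueError(f"cyclic parent mapping at {current}")
        seen.add(current)
        current = gpi_parents[current]
    return current


# Hypothetical mapping: an isoform points to a protein, which points to a gene.
gpi_parents = {
    "MGI:0000000": None,
    "UniProtKB:P00000": "MGI:0000000",
    "UniProtKB:P00000-2": "UniProtKB:P00000",
}
```

With this mapping, `canonical_id("UniProtKB:P00000-2", gpi_parents)` resolves to the gene-level ID, while an identifier absent from the mapping raises `KeyError`, which is one way the error checking raised in the agenda could be expressed.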

AI: Need to get GPIs from everyone. Ticket filed:

Protein Complexes

Should we always propagate function annotations on a complex to the individual gene products, using the contributes_to qualifier?

Consensus appeared to be yes.

Should we also propagate BP and CC the same way?

Consensus appeared to be yes.
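
The propagation discussed above might look roughly like the following Python sketch. It is an illustration only: the `Annotation` class, the `propagate` function, and the member list are hypothetical, and the complex-to-member mapping is assumed to come from elsewhere. The sketch adds contributes_to only to MF annotations (aspect "F"), as stated in the minutes; copying BP and CC annotations with their qualifier unchanged is an assumption, since the call did not specify a qualifier for those aspects.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Annotation:
    entity: str          # annotated complex or gene product ID
    go_id: str           # GO term ID
    aspect: str          # "F" (MF), "P" (BP) or "C" (CC)
    qualifier: str = ""  # pipe-separated qualifiers, as in GAF


def propagate(annotation: Annotation, members: List[str]) -> List[Annotation]:
    """Copy a complex-level annotation onto each member gene product."""
    qualifier = annotation.qualifier
    if annotation.aspect == "F" and "contributes_to" not in qualifier.split("|"):
        # MF annotations on members carry contributes_to.
        qualifier = "|".join(q for q in (qualifier, "contributes_to") if q)
    # Assumption: BP and CC annotations keep their original qualifier.
    return [replace(annotation, entity=m, qualifier=qualifier) for m in members]


# Placeholder complex, members and GO ID, invented for illustration.
complex_mf = Annotation("ComplexPortal:CPX-0", "GO:0000000", "F")
members = ["UniProtKB:P00000", "UniProtKB:P00001"]
member_annotations = propagate(complex_mf, members)
```

Each entry in `member_annotations` points at one member and carries the contributes_to qualifier; the same call on an annotation with aspect "P" or "C" returns copies with the qualifier left as it was.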

AI : update: