Revision as of 11:25, 2 June 2016

Agenda

In response to the discussion on the May 18th call about the gene and gene product identifier space (see minutes), I've put together a spreadsheet that documents our current practice for GAF and GPAD with respect to:
- Annotated Entity IDs
- With/From Entity IDs (note: only for genes and gene products)
- Annotation Extension Entity IDs (note: only for genes and gene products)
- Annotation Isoform Entity IDs
Then, for the purposes of discussion, I also added two other possible approaches:
- Gene IDs only
- Broad range of gene, transcript, protein, protein complex entity IDs
At the top of the spreadsheet are three general questions that we need to consider. There may be more; please add them if needed.
The plan was to review the different approaches, debate the pros and cons, and then either get more feedback or finalize the proposal for presentation on an annotation or all-hands call.

Attendees: Chris, David H, Kimberly, Moni, Paola, Paul T.

Regrets: Moni Munoz-Torres (Teaching 9th & 10th graders about research and the scientific method from 7:00AM - 9:30AM PDT).

Agenda: Paola; Minutes: Kimberly

We discussed different options for what to use as gene and gene product identifiers in GAF and GPAD.
Much of the discussion centered on the cost/benefit, for curators and users, of annotating to gene or gene-centric protein identifiers versus more specific or granular identifiers, such as UniProtKB protein isoform IDs or PRO IDs for modified forms of proteins.
There is currently an important distinction between GAF and GPAD in that GPAD specs indicate that Column 2 can use the more granular identifier, e.g. P34187-1, while in GAF Column 2 uses canonical identifiers for gene, protein, ncRNA, or protein complex.
Curators may want to capture the most granular information possible, but what use cases do we have for that more granular information?
Enrichment analysis still seems to be the more common use case for GO annotations, and for that, most users still just use gene or gene-centric annotations.
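To make the gene-centric vs. granular distinction concrete, here is a minimal sketch of how a consumer might collapse an isoform-level ID such as P34187-1 to its gene-centric accession before an enrichment analysis. The function names and the simplified accession pattern are assumptions for illustration only; they are not part of the GAF or GPAD specs, and the pattern is not the full UniProtKB accession grammar.

```python
import re
from typing import Optional, Tuple

# Assumed, simplified pattern: an accession followed by "-<isoform number>".
# Illustrative only; this is not the official UniProtKB accession grammar.
ISOFORM_PATTERN = re.compile(r"^(?P<base>[A-Z0-9]+)-(?P<isoform>\d+)$")


def split_isoform(accession: str) -> Tuple[str, Optional[str]]:
    """Split an accession into (base accession, isoform number or None)."""
    match = ISOFORM_PATTERN.match(accession)
    if match is None:
        # No isoform suffix: the ID is already the gene-centric form.
        return accession, None
    return match.group("base"), match.group("isoform")


def to_gene_centric(accession: str) -> str:
    """Drop any isoform suffix, keeping only the base accession."""
    return split_isoform(accession)[0]
```

With this sketch, an isoform-level ID and its canonical accession collapse to the same key, which is how granular annotations would be folded back into a gene-centric analysis.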
AI: Determine whether there are currently use cases for the more granular gene or gene product information in annotation extensions (AEs) and With/From. Consult with Val and Ruth on this.
AI: More generally, look for examples of AE usage in the literature.
- One possible use case is mentioned in the Discussion of "Gene Prioritization for Imaging Genetics Studies Using Gene Ontology and a Stratified False Discovery Rate Approach" (http://www.ncbi.nlm.nih.gov/pubmed/27092072).

 [[Category:GO Managers Meetings ]]