Manager Call 2015-12-16: Difference between revisions

Revision as of 18:46, 16 December 2015

Agenda

1. Acceptable Gene Identifiers

Is this the rule?

  • IDs used to represent genes:
    • MOD gene identifiers (MGI:MGI:, WB:, ZFIN:ZDB-GENE-, TAIR:locus: etc)
    • Generic UniProtKB IDs (UniProtKB:)
    • Ensembl gene IDs (Ensembl:)
    • NCBI gene IDs (NCBI_gene:)
    • RNAcentral IDs (RNAcentral:)
    • HGNC IDs (HGNC:)


This is what is loaded in the Noctua-Entity-Ontology

  • MOD IDs (MGI:MGI:, FBL, etc)
  • UniProtKB:
    • The MOD ID takes precedence over the UniProtKB ID in Noctua. Based on feedback from Geneva, curators will be allowed to enter a UniProtKB ID, but it will still be resolved to a MOD ID if there is a MOD for the product.
    • In NEO, we formally commit to the entity being uncommitted with respect to gene vs. product, i.e. we use MOD IDs and UniProtKB IDs interchangeably.
  • ENSEMBL IDs: not currently loaded -- but we may allow these for search
  • NCBIGene: ditto
  • HGNC: ditto
  • RNAcentral: will be loaded soon
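
The precedence rule above (a UniProtKB ID is accepted on entry but resolved to a MOD ID when one exists) can be sketched roughly as follows. This is only an illustration, not how Noctua or NEO implements it: the lookup table `UNIPROT_TO_MOD`, the function `resolve_entity`, and the identifiers in it are all made-up placeholders.

```python
# Hypothetical cross-reference table. The identifiers are placeholders,
# not real UniProtKB or MGI accessions.
UNIPROT_TO_MOD = {
    "UniProtKB:X00001": "MGI:MGI:0000001",
}


def resolve_entity(curie: str) -> str:
    """Return the MOD ID if one is known for this UniProtKB ID.

    Any ID without a known MOD equivalent (for example a human
    UniProtKB ID, since there is no MOD for human) is returned unchanged.
    """
    return UNIPROT_TO_MOD.get(curie, curie)


print(resolve_entity("UniProtKB:X00001"))  # resolved to the MOD ID
print(resolve_entity("UniProtKB:X99999"))  # no MOD entry, kept as entered
```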


Minutes

The BIG question is: Which gene IDs are valid for the GOC?

  • There is a motion to make a final decision so that curators may use whichever one they want, choosing from a list of valid ID options.
  • This will be the list of valid IDs that the GOC will recognize.


The consensus response we have reached so far is:

  • We will use MOD gene IDs and UniProtKB IDs to represent genes and gene products, and we will not add semantics into that.
  • Both sets of IDs are consistent with what we do in Column 2.
  • We WILL NOT use Ensembl gene IDs, NCBI gene IDs, or HGNC IDs.

Special notes: RNAcentral IDs will also be OK. For human genes, we will use UniProt IDs.
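
As a rough illustration of the consensus above, a prefix check could look like the sketch below. The prefixes are the ones listed in the agenda. The MOD list is only the subset named there (the agenda ends it with "etc"), and the function name `classify_gene_id` and the sample IDs are assumptions made for this example.

```python
# Prefixes taken from the agenda list. The MOD list is incomplete.
ACCEPTED_PREFIXES = (
    "MGI:MGI:",
    "WB:",
    "ZFIN:ZDB-GENE-",
    "TAIR:locus:",
    "UniProtKB:",
    "RNAcentral:",
)
# ID spaces the consensus says WILL NOT be used.
REJECTED_PREFIXES = ("Ensembl:", "NCBI_gene:", "HGNC:")


def classify_gene_id(curie: str) -> str:
    """Classify an ID as 'accepted', 'rejected', or 'unknown' by its prefix."""
    if curie.startswith(REJECTED_PREFIXES):
        return "rejected"
    if curie.startswith(ACCEPTED_PREFIXES):
        return "accepted"
    return "unknown"


# Placeholder IDs, for illustration only.
for example in ("UniProtKB:X00001", "HGNC:0000", "FOO:123"):
    print(example, classify_gene_id(example))
```

HGNC is listed as rejected here because that is the consensus recorded so far. The minutes leave it on the table for further discussion.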


MGI ID update:

  • The prefix is MGI, then a colon, then the MGI number, e.g. MGI:xxxxx. Note: I don't know the number of digits after the colon.
  • Resolving the issue of incorporating the new MGI ID will take a lot of coordination, as all IDs have to change at the same time.
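
A minimal sketch of a format check for the new-style MGI ID described above. Because the minutes do not record how many digits follow the colon, the pattern simply assumes one or more. The name `is_new_style_mgi_id` is hypothetical.

```python
import re

# "MGI", a colon, then the MGI number. The digit count is not stated
# in the minutes, so one or more digits is assumed here.
NEW_MGI_ID = re.compile(r"MGI:\d+")


def is_new_style_mgi_id(identifier: str) -> bool:
    """True if the identifier has the form MGI:<digits> and nothing else."""
    return NEW_MGI_ID.fullmatch(identifier) is not None


print(is_new_style_mgi_id("MGI:12345"))      # new-style form
print(is_new_style_mgi_id("MGI:MGI:12345"))  # doubled prefix, does not match
```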

During the 'meeting of the MODs', NHGRI put forward the following initiative: the common API needs standard IDs to work with, so IDs have to be a component of that. ACTION: Judy will bring Chris into this conversation.


FAQs

  • 1) Is a UniProtKB ID a gene ID, or is it a protein ID?

Answer: we are using MOD IDs and UniProtKB IDs to mean the union of the gene and the product. For humans, there is no MOD ID.


  • 2) Why are we removing NCBI, Ensembl, and HGNC?

Answer: Because we want one canonical ID for everything, and to ensure we do the same in Column 16 as we did in Column 2. We are standardizing on a restricted space of IDs to use.


Additional Thoughts

  • Judy expressed that there is a case to be made for HGNC IDs.
    • Paul: we could consider it as the equivalent of the MOD ID for human. But why would we want to use it at this point if we don't already use it in Column 2?
    • Judy: Until now, the human curation has come out of UniProt. If mouse were doing human and mouse curation, we would be requesting HGNC IDs.
    • Paul: it's ok to extend that to one more identifier space. Let us leave it on the table for further discussion.
  • When MGI (and also WormBase) receives a GAF with Ensembl or NCBI IDs, they translate them to MGI IDs and output MGI. They are not doing that for Column 16 UniProt IDs. They do translate Column 2 IDs into MGI IDs.
    • Paul: we are proposing to be consistent between Col2 and Col16.
    • DavidH: we still have generic UniProt IDs in Col16 and we will allow these for mouse genes at this time.
    • Judy: in Col 16, as we get annotations to particular isoforms, we use UniProtIDs-ProIDs.
    • DavidH: in Geneva the consensus was that we want to have proteoform annotations.
    • Chris: There are constraints dictated at the moment of entering information about an isoform. You annotate at the gene level unless you know specifically that it is an isoform; then you use the isoform / proteoform ID.
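
The rule Chris describes (annotate at the gene level unless a specific isoform is known) can be sketched as a simple fallback. The function name `annotation_target` and the identifiers are placeholders for illustration, not part of any GO tool.

```python
from typing import Optional


def annotation_target(gene_id: str, isoform_id: Optional[str] = None) -> str:
    """Pick the ID to annotate to.

    Use the isoform / proteoform ID only when one is specifically known.
    Otherwise fall back to the gene-level ID.
    """
    return isoform_id if isoform_id else gene_id


# Placeholder identifiers, for illustration only.
print(annotation_target("MGI:MGI:0000001"))
print(annotation_target("MGI:MGI:0000001", "UniProtKB:X00001-2"))
```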


Conclusions

  • Curators may use MOD ID or UniProt generic ID to represent a gene
  • We now have a Google document and will finalize the discussion there. The document can be found on the GO Drive, under the GO Annotation Directory, at https://goo.gl/JxHiUN


2. Brief Geneva-LEGO report

  • 2015_LEGO_Jamboree_Logistics
  • LEGO curation docs: https://goo.gl/olzAUL

Minutes

  • Strategy: Paul gave a talk, then curators went on to curate a paper using the tool and later brainstormed about what they wanted to see. Seth made it all happen very quickly!
  • Praise from the attendees:
    • David Hill: Tried to write a biologist's interpretation of each of the relations, i.e. how to use the relations in the Noctua world. Reviewing before making it public.
    • David Hill: We can take single annotations and link them together to tell a biological story. We can export into GAF files, hopefully also into GPAD files. Now we can import existing annotations into Noctua. This is the way of the future!
    • Kimberly: LEGO makes GO annotation a lot easier for people. It lessens the information stage stops people need to make. It simplifies the process and makes the annotation more accurate.
    • If you go through the curation documentation, you can see it. It is no longer necessary to choose the longest term first to reach the level of detail you'd like.
  • Proposal to do a live presentation during an annotation call; there are also videos available at https://vimeo.com/channels/Noctua
  • Curators at the meeting gave very positive feedback; they were very happy with it. Their models are available on the Noctua site.
  • Notes for the meeting are available at https://goo.gl/mVOcyI