Annotation Conf. Call 2015-11-10

From GO Wiki



Revision as of 11:34, 8 November 2015

Agenda

Deprecated Annotation Extension Relations

  • Please see Rachael's email from 2015-10-25
  • A number of annotation extension relations have been deprecated. Three relations, in particular, have quite a few annotations that need to be updated:
    • Dependent_on
    • In_presence_of
    • In_absence_of
  • Guidance for updating annotations is available here: http://wiki.geneontology.org/index.php?title=Guidance_for_updating_deprecated_Annotation_Extension_Relations
  • The wiki page also contains a link, at the bottom of the General Guidance section, to a spreadsheet that lists each annotation by Source (curation group)
  • Please go through your annotations by the end of this year, and note any cases where you think the guidelines may not address the experiment you are trying to annotate.
  • We will collate and discuss any outstanding issues on the first call of the new year, January 12th, 2016.
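
As an aid for the review above, here is a minimal sketch of how a group might flag extensions that still use one of the three deprecated relations. It assumes the extension field is written as relation(ID) expressions separated by commas or pipes; the function name and the example IDs are illustrative only and are not taken from the guidance page.

```python
import re

# Deprecated relations named in the agenda, compared case-insensitively.
DEPRECATED = {"dependent_on", "in_presence_of", "in_absence_of"}

# Assumed shape of a single extension expression: relation(ID)
EXPR = re.compile(r"\s*([A-Za-z_]+)\(([^()]+)\)\s*")


def deprecated_relations(extension):
    """Return (relation, entity_id) pairs that use a deprecated relation."""
    hits = []
    # Assumption: commas and pipes only ever separate expressions.
    for part in re.split(r"[|,]", extension):
        match = EXPR.fullmatch(part)
        if match and match.group(1).lower() in DEPRECATED:
            hits.append((match.group(1), match.group(2)))
    return hits


# Illustrative extension string, not a real annotation.
example = "occurs_in(CL:0000057),in_presence_of(CHEBI:15422)|part_of(GO:0006954)"
print(deprecated_relations(example))
```

Running this over the extension column of a group's annotation file would give a list to check against the spreadsheet linked from the wiki page.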

Entity IDs in Annotation Extensions

Goal

  • To agree on a set of ID spaces that will be used in annotation extensions to provide both manual and computational consistency.
  • Ideally, a user would be able to query with ANY type of ID recognized by the GOC and get results that are seamlessly integrated across those ID spaces.
  • To use the results of this discussion for the creation of annotation documentation.

Initial Assumptions

  • We will not be able to mandate the types of IDs that annotation groups use, but we should be able to mandate that those IDs are compatible with, or translatable to, the ID spaces used by the group (MOD) that is primarily responsible for submitting annotations to the GOC.
  • When processing submitted annotations, the primarily responsible group (MOD) will translate IDs into the objects that group uses for curation. That group will then provide the translated (normalized) annotations to the GOC.
  • The GOC may then translate the submitted IDs for the purpose of data integration across species.
  • Before submission of annotations, the submitter and the primary responsible group (MOD) will work together to make sure that all ID spaces can be normalized.
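
The normalization step described in these assumptions could look roughly like the sketch below. This is one possible shape, not an agreed procedure: the lookup table, the placeholder IDs, and the function name are all hypothetical, and a real table would come from the MOD's own cross-reference data.

```python
# Hypothetical lookup table: submitted ID -> ID used for curation by the MOD.
# The accessions below are placeholders, not real identifiers.
TO_MOD_ID = {
    "UniProtKB:EXAMPLE1": "MGI:MGI:EXAMPLE1",
    "Ensembl:EXAMPLE2": "MGI:MGI:EXAMPLE2",
}


def normalize(ids, mapping, mod_prefix):
    """Split IDs into (normalized, unmapped) lists.

    IDs already in the MOD's own ID space pass through unchanged.
    IDs with no known translation are reported instead of dropped, so the
    submitter and the MOD can resolve them before submission.
    """
    normalized, unmapped = [], []
    for entity_id in ids:
        if entity_id.startswith(mod_prefix):
            normalized.append(entity_id)
        elif entity_id in mapping:
            normalized.append(mapping[entity_id])
        else:
            unmapped.append(entity_id)
    return normalized, unmapped


submitted = ["UniProtKB:EXAMPLE1", "MGI:MGI:EXAMPLE9", "NCBI_gene:EXAMPLE3"]
print(normalize(submitted, TO_MOD_ID, "MGI:"))
```

Returning the unmapped IDs separately reflects the last assumption: any ID space that cannot be normalized is something for the submitter and the MOD to sort out together.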

Types of IDs used in Annotation Extensions

  • IDs used to represent genes
    • MOD gene identifiers (MGI:MGI:, WB:, ZFIN:ZDB-GENE-, TAIR:locus:, etc.)
    • Generic UniProtKB IDs (UniProtKB:)
    • Ensembl gene IDs (Ensembl:)
    • NCBI gene IDs (NCBI_gene:)
  • IDs used to represent cell types
    • Cell Ontology IDs
    • WormBase anatomy and cell IDs
    • Plant ontology IDs
  • IDs used to represent chemicals
    • ChEBI
  • IDs used to represent gene products
    • Protein ontology IDs (Pro:)
    • UniProt isoform-specific IDs (UniProtKB:######-#)
    • EMBL IDs
  • IDs used to represent protein domains
    • InterPro IDs
  • IDs used to represent biological processes
    • GO IDs
  • IDs used to represent molecular functions
    • GO IDs
  • IDs used to represent cellular components
    • GO IDs
  • IDs used to represent anatomical structures
    • EMAPA IDs
    • UBERON IDs
    • WormBase anatomy and cell IDs
    • Plant ontology IDs
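
To illustrate why a query layer needs to know these ID spaces, here is a small sketch that maps an ID's prefix to the entity types listed above. Only the prefixes written out in the list (MGI:, WB:, ZFIN:, TAIR:, UniProtKB:, Ensembl:, NCBI_gene:, Pro:) come from this agenda; the others (CL, CHEBI, GO, EMAPA, UBERON, PO) are assumed here, and the isoform pattern is a loose reading of UniProtKB:######-#. The function name and example accessions are illustrative.

```python
import re

GO_ASPECTS = ["biological process", "molecular function", "cellular component"]

PREFIX_TYPES = {
    # Prefixes written out in the list above.
    "MGI": ["gene"],
    "WB": ["gene"],
    "ZFIN": ["gene"],
    "TAIR": ["gene"],
    "UniProtKB": ["gene"],
    "Ensembl": ["gene"],
    "NCBI_gene": ["gene"],
    "Pro": ["gene product"],
    # Assumed prefixes; the agenda names these resources without a prefix.
    "CL": ["cell type"],
    "CHEBI": ["chemical"],
    "GO": GO_ASPECTS,
    "EMAPA": ["anatomical structure"],
    "UBERON": ["anatomical structure"],
    "PO": ["cell type", "anatomical structure"],
}

# Loose pattern for isoform-specific IDs: accession, hyphen, isoform number.
ISOFORM = re.compile(r"UniProtKB:[A-Za-z0-9]+-\d+")


def candidate_types(entity_id):
    """Return the entity types an ID could represent, judged by prefix alone."""
    if ISOFORM.fullmatch(entity_id):
        return ["gene product"]
    prefix = entity_id.split(":", 1)[0]
    return PREFIX_TYPES.get(prefix, [])


for example_id in ["UniProtKB:P12345", "UniProtKB:P12345-2", "GO:0006954", "XYZ:1"]:
    print(example_id, candidate_types(example_id))
```

Some prefixes map to more than one type (GO IDs cover three aspects, and Plant Ontology IDs appear under both cell types and anatomical structures), so the prefix alone does not always settle what kind of entity an ID represents.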

Next Curation Consistency Exercise

  • 2015-11-24
  • TAIR is up next to select a paper
  • Continue with consistency exercises in 2016?
  • Suggestions for changes or improvements?
    • Model each paper in LEGO
  • Groups still to select paper: dictyBase, EBI/UniProt, BBOP (Moni?), NextProt, USC, AgBase, anyone else?

Minutes