Agenda

Deprecated Annotation Extension Relations

Please see Rachael's email from 2015-10-25
A number of annotation extension relations have been deprecated. Three relations, in particular, have quite a few annotations that need to be updated:
- Dependent_on
- In_presence_of
- In_absence_of
Guidance for updating annotations is available here: http://wiki.geneontology.org/index.php?title=Guidance_for_updating_deprecated_Annotation_Extension_Relations
The wiki page also contains a link, at the bottom of the General Guidance section, to a spreadsheet that lists each annotation by Source (curation group)
Please go through your annotations by the end of this year, and note any cases where you think the guidelines may not address the experiment you are trying to annotate.
Document those cases here: http://wiki.geneontology.org/index.php/Annotation_Conf._Call_2016-01-12#Deprecated_Annotation_Extensions
We will discuss any outstanding issues on the first call of the new year, January 12th, 2016.
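One way to locate affected annotations is to scan the annotation extension column (Col. 16) of a GAF for the three relations listed above. The Python sketch below is illustrative only. It assumes GAF 2.x extension syntax (relation(ID) expressions, comma-separated within a group, pipe-separated between groups), and the function names are hypothetical rather than part of any GOC tool.

```python
import re

# Relation names taken from the list above, lower-cased for comparison.
DEPRECATED_RELATIONS = {"dependent_on", "in_presence_of", "in_absence_of"}

# One extension expression is assumed to look like relation(ID).
EXTENSION_RE = re.compile(r"^\s*([A-Za-z_]+)\((.+)\)\s*$")


def parse_extensions(col16):
    """Split a Col. 16 value into (relation, entity_id) pairs.

    Assumes '|' separates alternative groups and ',' separates
    expressions within a group.
    """
    pairs = []
    for group in col16.split("|"):
        for expr in group.split(","):
            match = EXTENSION_RE.match(expr)
            if match:
                pairs.append((match.group(1), match.group(2)))
    return pairs


def deprecated_extensions(col16):
    """Return the pairs whose relation is on the deprecated list."""
    return [(rel, ent) for rel, ent in parse_extensions(col16)
            if rel.lower() in DEPRECATED_RELATIONS]
```

Calling deprecated_extensions on each row's Col. 16 value would yield the expressions to revisit. The relation comparison is case-insensitive because the relation names appear capitalized in the list above.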

Entity IDs in Annotation Extensions

There is varied entity ID usage in the annotation extensions.
We'd like to review what currently exists and reach agreement on what we should use going forward
IDs used should be compatible with the source group submitting the GAF
The goal is to enable seamless curation and query
Suzi: Will alt_ids for these entities be supplied in the GAF?
- David H.: No, we don't currently supply synonyms in the GAF for entities used in Col. 16/AEs, but each group could supply all valid IDs in a gpad file
Paul T.: UniProt has mapping files and we should take advantage of this
- Midori: Who is ultimately responsible for the mappings? The MODs? Not all groups maintain mappings for all possible IDs.
- Suzi: With advice from the MODs, UniProt has agreed to provide mapping files
- Kimberly, Midori: This is okay for MOD and UniProt IDs, but what about other sources?
Suzi: What other ID spaces are in use?
- David H.: This is part of the purpose of this exercise - to see what ID space is being used.

Entity IDs in Annotation Extensions

Goals

For this meeting

To understand the scope of the project by discussing what types of objects annotators would like to use in annotation extensions.

Final Goals

To agree on a set of ID spaces that will be used in annotation extensions that provides both manual and computational consistency.
Ideally a user would be able to query on ANY type of ID that is deemed to be recognized by the GOC and return results of a seamless query of those IDs.
To use the results of this discussion for the creation of annotation documentation.
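The "query on ANY type of ID" goal can be pictured as a lookup over sets of equivalent IDs. The following is a minimal Python sketch under stated assumptions: the equivalence sets are supplied by some mapping source and do not overlap, annotations are simplified to (annotated object, relation, entity ID) tuples, and all names are hypothetical.

```python
def build_alias_index(equivalent_id_sets):
    """Map every ID to the full set of IDs considered equivalent to it.

    Assumes the input sets do not overlap; a later set would otherwise
    overwrite the entry created by an earlier one.
    """
    index = {}
    for ids in equivalent_id_sets:
        group = frozenset(ids)
        for entity_id in group:
            index[entity_id] = group
    return index


def query_by_any_id(query_id, alias_index, annotations):
    """Return annotations whose extension entity is equivalent to query_id.

    annotations: iterable of (annotated_object, relation, entity_id) tuples.
    An ID with no known aliases is matched only against itself.
    """
    aliases = alias_index.get(query_id, frozenset([query_id]))
    return [a for a in annotations if a[2] in aliases]
```

With such an index, a query on any member of an equivalence set returns the same annotations, regardless of which ID the submitting group recorded.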

Initial Assumptions

We will not be able to mandate the types of IDs that are used by annotation groups, but we should be able to mandate that the IDs used are compatible/translatable to the ID spaces used by the group (MOD) that is primarily responsible for submission of annotations to the GOC.
Upon processing of submitted annotations, primary responsible groups (MODs) will translate IDs into the objects used for curation by that group. That group will then provide the translated (normalized) annotations to the GOC.
The GOC may then translate the submitted IDs for the purpose of data integration across species.
Before submission of annotations, the submitter and the primary responsible group (MOD) will work together to make sure that all ID spaces can be normalized. This should become an SOP for any new group wishing to submit annotations.
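The normalization step described in these assumptions could be sketched as follows. This is a hedged Python illustration, not an agreed procedure: the mapping dictionary stands in for whatever mapping file a group actually uses, and the function names and parameters are hypothetical.

```python
def normalize_id(entity_id, mapping, home_prefixes):
    """Translate entity_id into the primary responsible group's ID space.

    mapping: dict from a foreign ID to the group's own ID (hypothetical;
             in practice it would be built from a mapping file).
    home_prefixes: set of prefixes the group already curates with.
    Returns (normalized_id, was_resolved).
    """
    prefix = entity_id.split(":", 1)[0]
    if prefix in home_prefixes:
        return entity_id, True
    if entity_id in mapping:
        return mapping[entity_id], True
    return entity_id, False


def unresolved_ids(entity_ids, mapping, home_prefixes):
    """List IDs that cannot be normalized, for follow-up before submission."""
    return [eid for eid in entity_ids
            if not normalize_id(eid, mapping, home_prefixes)[1]]
```

Running unresolved_ids before submission would give the submitter and the primary responsible group a concrete list of ID spaces that still need a mapping.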

Types of IDs used in Annotation Extensions

IDs used to represent genes
- MOD gene identifiers (MGI:MGI:, WB:, ZFIN:ZDB-GENE-, TAIR:locus:, etc.)
- Generic UniProtKB IDs (UniProtKB:)
- ENSEMBL gene IDs (Ensembl:)
- NCBI gene IDs (NCBI_gene:)
- RNAcentral IDs (RNAcentral:)
- HGNC IDs (HGNC:)
IDs used to represent cell types
- Cell Ontology IDs
- WormBase anatomy and cell IDs
- Plant Ontology IDs
IDs used to represent chemicals
- ChEBI
IDs used to represent gene products
- Proteins/Proteoforms
  - Protein ontology IDs (PR:)
  - UniProt isoform-specific IDs (UniProtKB:######-#)
  - MOD gene identifiers
- Transcripts
  - EMBL IDs
  - MOD gene identifiers
IDs used to represent protein domains
- InterPro IDs
IDs used to represent biological processes
- GO IDs
IDs used to represent molecular functions
- GO IDs
IDs used to represent cellular components
- GO IDs
IDs used to represent anatomical structures
- EMAPA IDs
- UBERON IDs
- WormBase anatomy and cell IDs
- Plant Ontology IDs
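The list above shows that a prefix alone does not always determine the entity type: MOD gene identifiers appear under both genes and gene products, and GO IDs under three headings. A minimal Python sketch of a prefix lookup that returns every candidate type is given below. The table is an assumption assembled from the list: the prefixes CL, PO, CHEBI, InterPro, GO, EMAPA, and UBERON are not spelled out above, the isoform pattern is a simplification of the UniProtKB:######-# form, and all names are hypothetical.

```python
import re

# Prefix -> entity types that prefix may denote in an annotation extension.
# Prefixes not spelled out in the list above (CL, PO, CHEBI, InterPro, GO,
# EMAPA, UBERON) are assumed here.
PREFIX_TYPES = {
    "MGI": ["gene", "gene product"],
    "WB": ["gene", "gene product"],
    "ZFIN": ["gene", "gene product"],
    "TAIR": ["gene", "gene product"],
    "UniProtKB": ["gene"],
    "Ensembl": ["gene"],
    "NCBI_gene": ["gene"],
    "RNAcentral": ["gene"],
    "HGNC": ["gene"],
    "PR": ["gene product"],
    "CL": ["cell type"],
    "PO": ["cell type", "anatomical structure"],
    "CHEBI": ["chemical"],
    "InterPro": ["protein domain"],
    "GO": ["biological process", "molecular function", "cellular component"],
    "EMAPA": ["anatomical structure"],
    "UBERON": ["anatomical structure"],
}

# Simplified pattern for isoform-specific IDs (accession, hyphen, number).
ISOFORM_RE = re.compile(r"^UniProtKB:[A-Z0-9]{6,10}-\d+$")


def candidate_types(entity_id):
    """Return the entity types an ID could represent, or [] if unknown."""
    if ISOFORM_RE.match(entity_id):
        return ["gene product"]
    prefix = entity_id.split(":", 1)[0]
    return list(PREFIX_TYPES.get(prefix, []))
```

Returning a list rather than a single type keeps the ambiguity visible, so that a curator or a downstream check can use the relation to decide which type is intended.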

Next Curation Consistency Exercise

2015-11-24
TAIR is up next to select a paper
Continue with consistency exercises in 2016?
Suggestions for changes or improvements?
- Model each paper in LEGO
Groups still to select paper: dictyBase, EBI/UniProt, BBOP (Moni?), NextProt, USC, AgBase, anyone else?

Minutes

On call: Aleks, Alex, David H., Edith, Kimberly, Li, Melanie, Midori, Paul T., Petra, Rachael, Ruth, Shur-Jen, Stacia, Stan, Suzi, Tanya

Deprecated Annotation Extension Relations

Background - at the Texas meeting, we agreed to retire some annotation extension relations
Rachael and Ruth have looked over a number of the existing annotations associated with these relations and put together guidelines for how to re-house the information
Annotations that need attention are in a spreadsheet linked to the guidelines wiki page (see below)
Going through these annotations will take some time, but we ask everyone to review theirs by the end of the year and to bring forward any annotations that do not seem to fit the recommended guidelines, so that we can discuss them on the first annotation call of the new year, 2016-01-12
Guidelines: http://wiki.geneontology.org/index.php?title=Guidance_for_updating_deprecated_Annotation_Extension_Relations
Use Cases for Further Discussion: http://wiki.geneontology.org/index.php/Annotation_Conf._Call_2016-01-12#Deprecated_Annotation_Extensions

Entity IDs in Annotation Extensions

There is varied ID usage in annotation extensions
We'd like to review what currently exists and agree on what we should use going forward
IDs used should be compatible with ID space of the submitting source of the GAF
The goal is to have seamless curation and querying of the annotations
Suzi: Will alt_ids be supplied in the GAFs?
- David H.: No, ID synonyms are not provided in the GAFs for entities used in Col. 16, but each group could provide all valid IDs in a gpi file
Paul T.: UniProt has mapping files; we should take advantage of this
- Midori: Who is responsible for maintaining the mappings? MODs? Not all groups maintain mappings for all possible namespaces.
- Suzi: With advice from the MODs, UniProt will provide the mapping files.
- Kimberly, Midori: This is okay for MOD and UniProt IDs, but what about other IDs?
Suzi: What other ID spaces are in use?
- David H.: This is one of the goals of this work - to determine what ID space is currently used and what we should use in the future.
Ruth, Rachael: Have been using ENSEMBL gene IDs for AEs, but could change to UniProt IDs to represent a gene. Also have been using RNAcentral IDs, but this is okay for representing ncRNAs.
David H.: The submitting group needs to make sure all ID spaces can be normalized; the MOD, UniProtKB, and the annotator all need to be in agreement
Review of specific types of IDs used
- Gene IDs
  - HGNC - the human MOD?
  - Paul T.: UniProt Reference Proteomes map to HGNC identifiers as the preferred source of human gene identifier
  - Ruth: Can see why UniProt uses HGNC as this allows them to retrieve the correct gene symbol, but there won't necessarily be a 1:1 relationship between HGNC gene and UniProt protein IDs
  - David H.: MGI leaves UniProtKB IDs as is - should this continue? This question touches on the issue of what identifiers should be used with what relation
  - Rachael: If we change the ENSEMBL gene IDs to UniProtKB IDs, will we then have to change back to a gene ID if we decide that the AE entity should be a gene?
  - David H.: Parent UniProtKB IDs are thought of as the equivalent of gene IDs in GO, so they should be okay as a gene ID.
- Gene Product IDs
  - Existing IDs used here are okay, but we need to have guidelines on how to treat isoforms
  - Please send examples to David and Kimberly for AEs where you'd like to represent isoforms
- InterPro IDs
  - David H.: There was some discussion of this with respect to protein binding annotations at the DC meeting. See the meeting minutes for the proposal.
  - Would like more examples for this, too.
- Anatomical Structures
  - Mappings between different ontologies still need to be done
  - What anatomy ontologies do groups use?
  - Suzi: Groups should coordinate with Chris to make sure that their cell and anatomy ontologies are compatible with UBERON; some work still needs to be done here.
- Midori: Need to also include SO identifiers and Pfam identifiers

Curation Consistency Exercises

Tanya: Will choose a paper from TAIR for the next exercise and collate annotations
David H., Ruth, Paul T. expressed support for continuing with the consistency exercise
Ruth: If papers chosen are particularly complicated, we could always decide to focus on a subset of Figures/experiments
Tanya: Would be nice to have a consistent template for collating annotations
Ruth: From these exercises, we've reached some agreements on how to annotate, but we need to provide better documentation, ideally linked to the papers, to illustrate how to annotate and why
Paul T.: Agrees that we need to work on the documentation - this will be a formidable task

Annotation Conf. Call 2015-11-10

Bluejeans URL: https://bluejeans.com/993661940
