Annotation Conf. Call, April 10, 2012


Agenda:

Update Contact Info in GAF config file

The email address in the GAF conf (config) file is used to contact your group to report errors in your GAF file (for example, SGD's config file is named gene_association.sgd.conf). Please take a moment to check your config file and make sure the email address is correct. The config files can be found in the submission directory: http://cvsweb.geneontology.org/cgi-bin/cvsweb.cgi/go/gene-associations/submission/
Having the correct email address in the header of your GAF file is not enough.
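
As a rough sketch of the check requested above, the snippet below reads the text of a .conf file and reports any email-like values it finds. It assumes the file consists of simple key=value lines; the key names and addresses in the sample (contact_email, email_report, example.org) are placeholders for illustration and are not taken from a real submission.

```python
import re

# Loose pattern for something that looks like an email address.
EMAIL_RE = re.compile(r"[^@\s,;]+@[^@\s,;]+\.[^@\s,;]+")

def find_emails(conf_text):
    """Return {key: [emails]} for every key=value line holding an email-like value."""
    found = {}
    for line in conf_text.splitlines():
        line = line.strip()
        # Skip blanks, comment-style lines, and lines without a key=value shape.
        if not line or line.startswith(("#", "!")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        emails = EMAIL_RE.findall(value)
        if emails:
            found[key.strip()] = emails
    return found

# Hypothetical config content, for illustration only.
sample = """\
project_name=Example Database
contact_email=curator@example.org
email_report=curator@example.org, helpdesk@example.org
"""

for key, emails in find_emails(sample).items():
    print(key, "->", ", ".join(emails))
```

Printing the addresses per key makes it easy to spot a stale contact by eye; the script does not try to decide whether an address is still valid.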

Discussion of the annotation_extension field

1. Introduction to the basic format of this field

2. Data inappropriate for the Annotation Extension field

3. What information are groups currently adding into this field? File:C16 values per group.xls

4. How can we improve the quality of this data?

  • Kimberly: I'd just like clarification on the exact relations file to use. I'm assuming that these two links point to the same relations file, so it doesn't matter which one we grab for our curation form?

http://www.geneontology.org/scratch/xps/go_annotation_extension_relations.obo

http://cvsweb.geneontology.org/cgi-bin/cvsweb.cgi/go/scratch/xps/go_annotation_extension_relations.obo


Will the relations file eventually be submitted to OBO or be made available for download from the GO web site (like the ontologies, GO references and evidence codes files)?

  • Emily OR: SVN /trunk/ontology/extensions/gorel.owl  ?


  • Emily: how can we reason over range identifiers? e.g. range: TEMP:0000001 ! gene product or complex


Total number of annotations from all groups with a filled c.16 field = 83,588

  • Kimberly: Would it be possible to link out from each of the examples in the table below to an instance of that curation described on the wiki?

Annotation Count | Relationship and associated ID types | GO ID Process | GO ID Component | GO ID Function | Assigned_by
--- | --- | --- | --- | --- | ---
4 | acts_at(SO id) | P | C | | PomBase
3 | dependent_on(GO id) | P | C | | BHF-UCL
38 | dependent_on(GO id, Pfam id or PomBase id) | P | C | F | PomBase
39 | dependent_on(GO id, UniProtKB id) | P | C | F | UniProt-GOA
288 | during(GO id) | P | C | F | PomBase
3 | exists_during(GO id) | | C | | PomBase
3 | exists_during(GO id) | | | F | UniProt-GOA
2 | exists_during(GO id) | | C | | dictyBase
7 | happens_during(GO id) | P | | | BHF-UCL
1 | happens_during(GO id) | P | | | PomBase
39 | happens_during(GO id) | P | C | F | UniProt-GOA
2 | has_binding_specificity(SO id) | | C | | PomBase
12 | has_downstream_target(UniProt id) | P | | | BHF-UCL
18 | has_downstream_target(UniProt id) | P | | | UniProt-GOA
139 | has_input(CL, UBERON, UniProt and CHEBI ids) | P | | F | UniProt-GOA
7 | has_input(UniProt id) | P | | | BHF-UCL
1 | has_output(UniProt id) | P | | | BHF-UCL
1 | has_participant(GO id) | P | | | BHF-UCL
49 | has_participant(GO, UniProt id) | P | | F | UniProt-GOA
26 | has_regulation_target(ChEBI id, Ensembl, UniProt ids) | P | | | BHF-UCL
745 | has_regulation_target(UniProt id) | P | | F | UniProt-GOA
48 | has_substrate(Genedb_Spombe, PomBase, PR, GO ids) | P | | F | PomBase
329 | has_substrate(CHEBI and UniProt ids) | P | | F | UniProt-GOA
1 | has_substrate(CL id) | | | F | BHF-UCL
1 | has_target_anatomical_entity(GO id) | | | F | BHF-UCL
37 | has_target_cell(CL id) | P | | | UniProt-GOA
1 | has_upstream_or_downstream_target(UniProt id) | P | | | UniProt-GOA
76 | in_presence_of(GO, ChEBI, UniProt id) | P | | | UniProt-GOA
1 | in_presence_of(UniProt id) | | | F | BHF-UCL
3 | independent_of(GO id) | | C | | PomBase
1 | independent_of(GO ID) | P | | | UniProt-GOA
4 | localization_dependency(PomBase ids) | P | | | PomBase
93 | localization_dependent_on(CHEBI and GO id) | | C | F | UniProt-GOA
1 | localization_target(PomBase ids) | P | | | PomBase
60 | localizes(GO, PomBase ids) | P | C | | PomBase
10 | localizes(UBERON id) | P | C | F | BHF-UCL
1 | localizes(UniProtKB id) | P | | | dictyBase
27,131 | occurs_in(CL id) | P | | F | MGI
15 | occurs_in(CL id) | P P | | | BHF-UCL
30,477 | occurs_in(GO id) | | | F | IntAct
9 | occurs_in(Pfam or GO id) | P | | F | PomBase
11,095 | occurs_in(Reactome protein complex identifier) | P | C | F | Reactome
1,113 | occurs_in(UBERON,CL id) | P | C | F | UniProt-GOA
265 | part_of(CL, UBERON id, GO id) | P | C | | UniProt-GOA
10,880 | part_of(CL id) | | C | | MGI
1 | protease_activator(PomBase) | P | | | PomBase
1 | protease_inhibitor_of(PomBase) | | | F | PomBase
2 | required_for(GO id) | | C | F | PomBase
2 | requires_substance(ChEBI ID) | | | F | UniProt-GOA
2 | stabilizes(UniProt id) | P | | | UniProt-GOA
2 | stablizes(PomBase ids) | P | | | PomBase
1 | transcriptionally_regulates(PomBase id) | P | | | PomBase

WormBase: the field contains either ChEBI or WB identifiers, but no relationships are supplied.

SGN: the field contains genomic identifiers, but no relationships are supplied, e.g. AB623205.1; missing a tab?
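
Tallies like those in the table above, and the WormBase and SGN cases of identifiers without relationships, could be found with a small parser. The following is a sketch only. It assumes column 16 values follow a relation(ID) pattern, with `,` and `|` separating multiple entries; the sample rows are made up for illustration (apart from AB623205.1, which is the SGN example quoted above).

```python
import re
from collections import Counter

# One entry: a relation name followed by a single identifier in parentheses.
EXT_RE = re.compile(r"^([A-Za-z_]+)\(([^()\s]+)\)$")

def parse_extension(col16):
    """Split a column 16 value into (relation, id) pairs plus any unparseable parts."""
    parsed, bad = [], []
    for part in re.split(r"[|,]", col16):
        part = part.strip()
        if not part:
            continue
        m = EXT_RE.match(part)
        if m:
            parsed.append((m.group(1), m.group(2)))
        else:
            bad.append(part)  # e.g. a bare identifier with no relationship
    return parsed, bad

def tally(rows):
    """Count entries per (assigned_by, relation) from (assigned_by, col16) rows."""
    counts = Counter()
    for assigned_by, col16 in rows:
        parsed, _ = parse_extension(col16)
        for relation, _identifier in parsed:
            counts[(assigned_by, relation)] += 1
    return counts

# Illustrative rows only; not real annotations.
rows = [
    ("MGI", "occurs_in(CL:0000084)"),
    ("UniProt-GOA", "has_input(CHEBI:15377),occurs_in(CL:0000084)"),
    ("SGN", "AB623205.1"),
]

print(tally(rows))
for group, value in rows:
    _, bad = parse_extension(value)
    if bad:
        print(group, "has entries without a relationship:", bad)
```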

Minutes

Present:

dictyBase: Pascale, Bob and Petra
SGD: Rama and Karen
MGI: Li, Mary and Judy
TAIR: Tanya and Donghui
WormBase: Kimberly
PomBase: Val and Midori
UniProt-GOA: Emily, Prudence, Tony
FlyBase: Susan
Berkeley: Chris
PO Group: Ramona Walls, Laurel Cooper


Update Contact Info in GAF config file

Action: groups to review the contact email in the .conf files they submit to the go/gene-associations/submission directory.


Discussion of the annotation_extension field

Action: Include a section in the documentation guiding curators on when to request a new GO term and when to add data to an annotation via the annotation_extension field.

The OBO version of the relations file (go_annotation_extension_relations.obo) is the primary file. It is located here (the cvsweb URL points to the same file):

http://www.geneontology.org/scratch/xps/go_annotation_extension_relations.obo

Chris: However, software developers should be encouraged to consume the OWL version if at all possible.

Action: Ensure the annotation examples in the documentation clearly distinguish what should be included in the annotation line in the files from what appears only in the documentation to aid readability.

There was a discussion on how to handle experimental data (cell lines, chemicals) that are important to the interpretation of the annotation. Some groups have added this data to the annotation_extension field, but this is not appropriate: the field is intended to enhance the descriptiveness of the GO term in column 5, and GO does not include descriptions of assay conditions. However, an enhanced evidence record in a new, extended annotation format would be desirable so that such information could be included in future.

Chris: negative information cannot be included in the annotation_extension field; the annotation line must still be interpreted correctly if columns 16/17 are excluded from a user's view (these fields are optional).

Action: More examples, and more information from Chris's original cross-products documentation, need to be included in the guidance before it is converted to a GOC web page.

Action: Sanity checks need to be discussed with Chris and Mike C.

Midori: annotations that fall foul of annotation_extension checks should only have the contents of their column 16 field removed - not the removal of the whole annotation line.
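
Midori's suggestion could be sketched as follows. The snippet assumes tab-separated GAF 2.0 lines with column 16 at index 15, and it uses a hypothetical is_valid_extension function as a stand-in for whatever sanity checks are eventually agreed; the sample line is invented for illustration.

```python
import re

# Stand-in pattern: each entry must look like relation(ID).
EXT_RE = re.compile(r"^[A-Za-z_]+\([^()\s]+\)$")

def is_valid_extension(col16):
    """Hypothetical check: every entry in column 16 looks like relation(ID)."""
    parts = [p.strip() for p in re.split(r"[|,]", col16) if p.strip()]
    return all(EXT_RE.match(p) for p in parts)

def scrub_line(line):
    """Blank column 16 if it fails the check; keep the rest of the line intact."""
    line = line.rstrip("\n")
    if line.startswith("!"):
        return line  # header/comment lines pass through unchanged
    cols = line.split("\t")
    if len(cols) >= 16 and cols[15] and not is_valid_extension(cols[15]):
        cols[15] = ""  # remove only the failing extension, not the annotation
    return "\t".join(cols)

# Illustrative 17-column line with a bare identifier in column 16.
cols = ["DB", "ID0001", "sym", "", "GO:0005515", "PMID:1", "IPI", "",
        "F", "", "", "protein", "taxon:1", "20120410", "SGN",
        "AB623205.1", ""]
print(scrub_line("\t".join(cols)).split("\t")[15] == "")
```

Keeping the column count unchanged means the scrubbed line still parses as a GAF 2.0 line, which is the point of removing the field contents rather than the whole annotation.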

Emily: could information contained in the TEMP ids, next to the range parameters for a relationship type, be included in the go.db_xrefs file?