Annotation Conf. Call, November 22, 2011

Webex session URL

1. Proposal for the Apoptosis ontology and annotation project; Paola and Pablo [1]

2. Description of integral_to qualifier annotation proposal; Chris (File:GO-integral-to.pdf)

Details: http://wiki.geneontology.org/index.php/Proposal_for_integral_to_qualifier

3. Review of specific proposed annotation QC checks; Emily

All IC annotations should include a GO ID in column 8 (with)

Supported by current Evidence Code documentation

Annotations currently contravening this rule, by group: ASAP: 1, EcoliWiki: 20, GR: 1959, JCVI: 2, MGI: 1, PAMGO_GAT: 2, SGN: 2, TAIR: 1, WB: 29

SQL to find the annotation exceptions for individual groups named above:

SELECT concat(dbxref.xref_dbname, ':', dbxref.xref_key) AS dbx,
       gene_product.symbol,
       term.name,
       term.acc,
       evidence.code,
       db.name AS assigned_by
FROM association, evidence, term, gene_product, dbxref, db
WHERE association.id = evidence.association_id
  AND association.gene_product_id = gene_product.id
  AND association.term_id = term.id
  AND gene_product.dbxref_id = dbxref.id
  AND association.source_db_id = db.id
  AND evidence.code = 'IC'
  AND evidence.id NOT IN (SELECT evidence_dbxref.evidence_id FROM evidence_dbxref)
  AND db.name = 'ADD_YOUR_GROUPS_NAME'
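For groups that would rather check a gene association file directly than query the database, the same rule can be sketched in Python. This is a minimal sketch, not an agreed GOC check: it assumes a tab-delimited GAF 2.0 file with the evidence code in column 7 and with/from in column 8, and the function name `ic_without_go_with` is hypothetical. Unlike the SQL, which finds IC annotations with no with entry at all, it also flags IC rows whose column 8 is populated but contains no GO ID.

```python
def ic_without_go_with(gaf_lines):
    """Return the split columns of IC rows whose with column (8) holds no GO ID."""
    offenders = []
    for line in gaf_lines:
        # Skip blank lines and GAF header/comment lines, which start with "!".
        if not line.strip() or line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 8 or cols[6] != "IC":
            continue
        # Column 8 may hold several identifiers separated by "|" or ",".
        ids = [x.strip() for x in cols[7].replace(",", "|").split("|")]
        if not any(i.startswith("GO:") for i in ids):
            offenders.append(cols)
    return offenders
```

Passing an open file handle, for example `ic_without_go_with(open(path))` with your own file path, returns one list of columns per offending annotation.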

All IDA annotations should NOT include any ID in column 8 (with)

Supported by current Evidence Code documentation

No annotations in the GO database conflict with this proposed rule.

To find annotations, use the following SQL in GOOSE:

SELECT concat(dbxref.xref_dbname, ':', dbxref.xref_key) AS dbx,
       gene_product.symbol,
       term.name,
       term.acc,
       evidence.code,
       db.name AS assigned_by
FROM association, evidence, term, gene_product, dbxref, db
WHERE association.id = evidence.association_id
  AND association.gene_product_id = gene_product.id
  AND association.term_id = term.id
  AND gene_product.dbxref_id = dbxref.id
  AND association.source_db_id = db.id
  AND evidence.code = 'IDA'
  AND evidence.id IN (SELECT evidence_dbxref.evidence_id FROM evidence_dbxref)
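A file-level counterpart to the query above might look like the following. It is only a sketch under the same assumptions about GAF 2.0 column order (evidence code in column 7, with/from in column 8), and `ida_with_populated` is a hypothetical name.

```python
def ida_with_populated(gaf_lines):
    """Return (gene product ID, GO ID, with value) for IDA rows with a non-empty column 8."""
    offenders = []
    for line in gaf_lines:
        # Skip blank lines and GAF header/comment lines.
        if not line.strip() or line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) >= 8 and cols[6] == "IDA" and cols[7].strip():
            # Columns 1 and 2 together identify the gene product.
            offenders.append((cols[0] + ":" + cols[1], cols[4], cols[7]))
    return offenders
```

An empty result would be consistent with the finding that no annotations currently conflict with the rule.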

ND-evidenced annotations should be made to root nodes only

Supported by current Evidence Code documentation

Annotations currently contravening this rule, by group: dictyBase: 4, GeneDB_Pfalciparum: 1, GeneDB_Tbrucei: 5, JCVI: 1, WB: 1

SQL to find these annotation exceptions:

SELECT concat(dbxref.xref_dbname, ':', dbxref.xref_key) AS dbx,
       gene_product.symbol,
       term.name,
       term.acc,
       evidence.code,
       db.name AS assigned_by
FROM association, evidence, term, gene_product, dbxref, db
WHERE association.id = evidence.association_id
  AND association.gene_product_id = gene_product.id
  AND association.term_id = term.id
  AND gene_product.dbxref_id = dbxref.id
  AND association.source_db_id = db.id
  AND evidence.code = 'ND'
  AND term.acc NOT IN ('GO:0005575','GO:0003674','GO:0008150')
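The root-node rule can also be checked against a gene association file. The sketch below reuses the three root accessions from the query above and assumes GAF 2.0 column order (GO ID in column 5, evidence code in column 7); the name `nd_to_non_root` is hypothetical.

```python
# Root terms taken from the SQL query: cellular_component, molecular_function, biological_process.
ROOT_TERMS = {"GO:0005575", "GO:0003674", "GO:0008150"}

def nd_to_non_root(gaf_lines):
    """Return the split columns of ND rows annotated to anything other than a root term."""
    offenders = []
    for line in gaf_lines:
        # Skip blank lines and GAF header/comment lines.
        if not line.strip() or line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) >= 7 and cols[6] == "ND" and cols[4] not in ROOT_TERMS:
            offenders.append(cols)
    return offenders
```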

Should all ND annotations refer to a GO_REF rather than a PubMed reference?

Supported by current Evidence Code documentation:

'Annotations made with the ND evidence code should be accompanied by a reference that explains that curators looked but found no information. Note that some groups check only published literature while other groups also make sequence comparisons to see if an annotation can be made on the basis of a sequence comparison. The GO Reference collection includes a reference that can be used with ND when both literature and sequence have been checked; to use it, put "GO_REF:0000015" in the reference column of a gene association file. ... Even if an author states in a paper that there is no data available or nothing is known about the gene product in a particular GO aspect, annotation to the corresponding root node should be made with ND evidence code citing either the annotating group's internal reference or the GOC's reference on use of the ND evidence code, not a specific paper.'

ND annotations currently citing a PMID, by group: dictyBase: 168, GeneDB_Pfalciparum: 3, PAMGO_MGG: 2, RGD: 13, SGD: 2, TAIR: 3, UniProtKB: 55

SQL query to locate these PMID-referenced annotations:

SELECT term.name,
       term.acc,
       evidence.code,
       gene_product.symbol,
       concat(dbxref.xref_dbname, ':', dbxref.xref_key) AS dbx,
       concat(ev_dbxref.xref_dbname, ':', ev_dbxref.xref_key) AS ev_dbx
FROM association, evidence, term, gene_product, dbxref, dbxref AS ev_dbxref
WHERE association.id = evidence.association_id
  AND evidence.dbxref_id = ev_dbxref.id
  AND association.gene_product_id = gene_product.id
  AND association.term_id = term.id
  AND gene_product.dbxref_id = dbxref.id
  AND evidence.code = 'ND'
  AND ev_dbxref.xref_dbname = 'PMID'
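To produce a per-group tally of PMID-referenced ND annotations from a gene association file, something like the following could be used. It is a sketch that assumes GAF 2.0 column order (reference in column 6, evidence code in column 7, assigned_by in column 15); `nd_pmid_counts` is a hypothetical name.

```python
from collections import Counter

def nd_pmid_counts(gaf_lines):
    """Count ND rows that cite a PMID in the reference column, keyed by assigned_by."""
    counts = Counter()
    for line in gaf_lines:
        # Skip blank lines and GAF header/comment lines.
        if not line.strip() or line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 15 or cols[6] != "ND":
            continue
        # The reference column may hold several references separated by "|".
        if any(ref.strip().startswith("PMID:") for ref in cols[5].split("|")):
            counts[cols[14]] += 1
    return counts
```

Counts from a single submitted file may differ from the database figures, since they reflect only that file's contents.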


  • In addition, some MOD internal references are not indicated as equivalent to GO_REF:0000015. Could the following be added to the GO Reference file as equivalents?

dictyBase_REF:9 (dictyBase)

GO_REF:nd (GeneDB_Pfalciparum, GeneDB_Tbrucei, TIGR, WB)

GOC:unpublished (GeneDB_Pfalciparum)
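If the three references above were accepted as equivalents, a QC script could normalize them before checking ND references. The sketch below assumes that outcome purely for illustration; the equivalence is the open question raised here, not a decision, and `normalize_nd_reference` is a hypothetical helper.

```python
ND_REFERENCE = "GO_REF:0000015"

# ASSUMPTION: treated as equivalents only for illustration; this is the proposal, not a decision.
PROPOSED_EQUIVALENTS = {"dictyBase_REF:9", "GO_REF:nd", "GOC:unpublished"}

def normalize_nd_reference(ref):
    """Map a proposed equivalent to GO_REF:0000015; return any other reference unchanged."""
    return ND_REFERENCE if ref.strip() in PROPOSED_EQUIVALENTS else ref
```

Keeping the proposed equivalents in one set makes it easy to drop or extend entries once the GO Reference file is updated.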


Possible actions for some of the above QC checks:

1. Delete contravening annotations? They would then be filtered from GOC-submitted GAF files.

2. Grandfather old annotations that contravene this rule and filter out newly created IC annotations that contravene it?

3. Take up the EcoliWiki group's offer to improve the contravening IC annotation set where the annotating group is unable to afford the curation time?

... in all cases, decisions from these discussions need to be emailed to the GO list so that groups that are not present have a chance to respond.
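Option 2 could be sketched as a date-based split of the contravening annotations. This is illustration only: it assumes rows already split into GAF 2.0 columns with the annotation date (YYYYMMDD) in column 14, and both the function name `split_by_cutoff` and the cutoff date are hypothetical, since no cutoff has been agreed.

```python
def split_by_cutoff(offending_rows, cutoff):
    """Split contravening rows into (grandfathered, filtered) by annotation date.

    offending_rows: lists of GAF columns, each with at least 14 entries.
    cutoff: hypothetical YYYYMMDD string; rows dated before it are grandfathered.
    """
    grandfathered, filtered = [], []
    for cols in offending_rows:
        date = cols[13]  # GAF column 14; YYYYMMDD strings compare correctly as text
        if date < cutoff:
            grandfathered.append(cols)
        else:
            filtered.append(cols)
    return grandfathered, filtered
```

Rows in the second list are the ones that would be removed from submitted files under this option.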