Annotation Advocacy progress report for 2012

Management

Annotation Consistency

Annotation call

IKR (Inferred from Key Residues) evidence code: a type of manually curated evidence derived from sequence analysis and characterized by the absence of key sequence residues. Every annotation that uses this evidence code should carry the 'NOT' qualifier.

Discussed removal of fraction-type terms from the ontology.

New annotation datasets

Transcription factor annotation by NTNU

Annotation Guidelines

Documentation

Improvements to the GO Consortium's annotation documentation

Annotation Extension column (Column 16)

The aim of the Annotation Extension column is to allow curators to provide additional information within a single annotation. This information could take the form of gene products, GO IDs or terms from other OBO ontologies.


Annotation quality control

New checks were implemented on February 1, 2012, as follows:

1) All IC annotations should include a GO ID in column 8 (with/from)

2) All IDA annotations should NOT include any ID in column 8 (with/from)

3) Annotations with the ND evidence code should be made to root nodes only (http://geneontology.org/GO.annotation_qc.shtml#GO_AR:0000011)

4) ND annotations should NOT have a PMID in the reference column

5) References in column 6 of the GAF should follow the format SGD_REF:S000047763|PMID:2676709. References such as PMID:PMID:14561399, PMID:unpublished or GOC:unpublished should be filtered out.

6) Annotations to GO:0005488 'binding' should be made with the IPI evidence code, and the interacting partner should be given in the 'with' column.

7) Annotations with the IPI evidence code made after January 1, 2012 that lack an ID in the 'with' column should be filtered out (older annotations are grandfathered)
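
The seven checks above can be sketched as a small line-level validator. This is only an illustration, not the Consortium's actual QC script: the function name `check_gaf_line` is hypothetical, the column positions assume the 17-column tab-delimited GAF 2.0 layout, and the reading of check 5 (a PMID must be numeric, and any ID of `unpublished` is rejected) is an assumption drawn from the examples given.

```python
import re

# The three GO root terms: molecular_function, cellular_component, biological_process.
ROOT_TERMS = {"GO:0003674", "GO:0005575", "GO:0008150"}
GO_ID = re.compile(r"GO:\d{7}")
PMID = re.compile(r"PMID:\d+")


def check_gaf_line(line):
    """Return the numbers of the checks (1-7) that a GAF line fails.

    Hypothetical helper: assumes tab-delimited GAF 2.0 columns, with
    column 5 = GO ID, 6 = reference, 7 = evidence code, 8 = with/from,
    and 14 = date in YYYYMMDD form.
    """
    cols = line.rstrip("\n").split("\t")
    cols += [""] * (17 - len(cols))  # pad short lines so indexing is safe
    go_id, refs, evidence, with_from = cols[4], cols[5], cols[6], cols[7]
    annotation_date = cols[13]
    failed = []

    if evidence == "IC" and not GO_ID.search(with_from):
        failed.append(1)
    if evidence == "IDA" and with_from:
        failed.append(2)
    if evidence == "ND" and go_id not in ROOT_TERMS:
        failed.append(3)
    if evidence == "ND" and "PMID:" in refs:
        failed.append(4)
    # Assumed reading of check 5: reject malformed PMIDs and "unpublished" IDs.
    for ref in refs.split("|"):
        bad_pmid = ref.startswith("PMID:") and not PMID.fullmatch(ref)
        if bad_pmid or ref.endswith(":unpublished"):
            failed.append(5)
            break
    if go_id == "GO:0005488" and (evidence != "IPI" or not with_from):
        failed.append(6)
    # YYYYMMDD strings sort chronologically, so a string comparison suffices.
    if evidence == "IPI" and not with_from and annotation_date > "20120101":
        failed.append(7)
    return failed
```

A line that returns an empty list passes all seven checks; a filtering step would drop or report any line for which the list is non-empty.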

GO Helpdesk

GO helpdesk staff

Rama Balakrishnan, Rachael Huntley, David Hill, Harold Drabkin, Jane Lomax, Kimberly Van Auken, Tanya Berardini, Susan Tweedie, Rebecca Foulger, Prudence Mutowo-Muellenet, Paola Roncaglia.

Statistics for the past year

Other

New annotation files

GPAD/GPI: An alternative means of exchanging annotations. The GPAD 1.1 format is designed to be more normalized than GAF and is intended to work in conjunction with a separate format for exchanging gene product information (GPI 1.1). This separates data on the genes and gene products (the objects being annotated) from the annotation data itself. The GPAD format also allows the use of Evidence Code Ontology (ECO) codes.

The GPAD/GPI format also:

  • allows unannotated gene products to be submitted to the GO database
  • reduces the amount of redundant gene product information in the GAF files
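
The normalization idea behind these two points can be sketched as follows. This is a toy illustration only: the function `split_rows` and the dictionary keys are hypothetical and do not reproduce the actual GPAD 1.1 or GPI 1.1 column layouts.

```python
def split_rows(rows):
    """Split combined GAF-like rows into gene product records and annotations.

    Hypothetical sketch: `rows` is a list of dicts with illustrative keys,
    not the real GPAD/GPI columns. A row with no GO ID describes a gene
    product that has not been annotated.
    """
    gene_products = {}  # one record per gene product (the GPI-like side)
    annotations = []    # one record per annotation (the GPAD-like side)
    for row in rows:
        key = (row["db"], row["object_id"])
        # Descriptive fields are stored once, however many annotations exist.
        gene_products.setdefault(key, {
            "symbol": row["symbol"],
            "type": row["type"],
            "taxon": row["taxon"],
        })
        if row.get("go_id"):
            annotations.append({
                "gene_product": key,
                "go_id": row["go_id"],
                "evidence": row["evidence"],
                "reference": row["reference"],
            })
    return gene_products, annotations
```

In this arrangement a gene product with two annotations has its symbol, type and taxon stored once rather than twice, and a gene product with no annotations still appears in the gene product table.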

GP2RNA: A gp2rna file is a tab-delimited file that provides a mapping between MOD database object IDs and ncRNA gene/sequence IDs. Contribution of this file is a new requirement (starting 03/2012).

GP_unlocalized: If annotating groups have provided annotations to gene identifiers that have been manually curated from the literature, but where no sequence or genomic location is known (such genes have been variously described as 'unlocalised genes', 'single heritable traits' or 'phenotypic orphans'), then the group should also provide a gp_unlocalized file containing all the non-genome localized identifiers available in their database, including those not annotated to GO.

Plans for 2011

In the upcoming year we will:

  • Continue to promote annotation consistency through regular annotation calls and jamborees and through up-to-date guideline documentation
  • Enforce annotation guidelines by implementing further quality control checks as needed
  • Complete the re-organization of the annotation documentation provided by the GO Consortium through its website
  • Continue to assist external annotating groups who wish to provide annotation sets to the GO Consortium by providing them with the GO Consortium's annotation policies and guidelines and reviewing their annotations as necessary
  • Ensure that the new Evidence Code Ontology is adopted by all annotating groups within the GO Consortium by promoting its use and providing up-to-date documentation
  • Continue to keep curators informed of updates to the ontology by providing documentation and holding conference calls