Oct 2012 Meeting to finalize GPAD specification (Archived)

From GO Wiki
Revision as of 07:48, 24 October 2012 by Huntley (talk | contribs)

Link to the previously proposed GPI format: http://wiki.geneontology.org/index.php/Gene_Product_Data_File_Format

Link to the previously proposed GPAD format: http://wiki.geneontology.org/index.php/Gene_Product_Association_Data_%28GPAD%29_Format

Issues for discussion:

gp_association files

  1. Translating existing qualifiers CONTRIBUTES_TO and COLOCALIZES_WITH into annotation relations (+ NOT?)
  2. Filling in relationships retrospectively - implicit or explicit values in column 4?
  3. spliceforms - separate column with canonical form in column 2 as per GAF, or column 2 only?


contents                                             required?  cardinality   old column #  extra info
DB                                                   required   1             1             must be in xrf_abbs
DB_Object_ID                                         required   1             2
Qualifier                                            optional   0 or greater  4             (NOT or integral_to)? (other_organism or colocalizes_with or contributes_to)? annotation_relation
GO ID                                                required   1             5             must be extant GO ID
DB:Reference(s)                                      required   1 or greater  6             DB must be in xrf_abbs
Evidence code                                        required   1             7             from ECO
With (or) From                                       optional   0 or greater  8
Interacting taxon ID (for multi-organism processes)  optional   0 or 1        13            ncbi taxon ID
Date                                                 required   1             14            YYYYMMDD
Assigned_by                                          required   1             15            from xrf_abbs
Annotation XP (Annotation Cross Products)            optional   0 or greater  16
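
The required/cardinality rules in the table above can be checked mechanically. The following is a minimal sketch, not part of the proposal. It assumes one annotation per tab-separated line with the eleven columns in table order, and multiple values within a column separated by "|". Both the layout and the names GPAD_COLUMNS, split_values and check_gpad_line are assumptions made for illustration.

```python
# Column rules transcribed from the table above: (name, min_count, max_count).
# max_count None stands for "or greater".
GPAD_COLUMNS = [
    ("DB", 1, 1),
    ("DB_Object_ID", 1, 1),
    ("Qualifier", 0, None),
    ("GO ID", 1, 1),
    ("DB:Reference(s)", 1, None),
    ("Evidence code", 1, 1),
    ("With (or) From", 0, None),
    ("Interacting taxon ID", 0, 1),
    ("Date", 1, 1),
    ("Assigned_by", 1, 1),
    ("Annotation XP", 0, None),
]


def split_values(field):
    """Split one field into values; an empty field holds zero values.

    The "|" separator is an assumption, not something the table states.
    """
    return [v for v in field.split("|") if v]


def check_gpad_line(line):
    """Return a list of cardinality problems for one tab-separated line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(GPAD_COLUMNS):
        return ["expected %d columns, found %d" % (len(GPAD_COLUMNS), len(fields))]
    problems = []
    for (name, lo, hi), field in zip(GPAD_COLUMNS, fields):
        count = len(split_values(field))
        if count < lo:
            problems.append("%s: required value missing" % name)
        elif hi is not None and count > hi:
            problems.append(
                "%s: at most %d value(s) allowed, found %d" % (name, hi, count)
            )
    return problems
```

The sketch checks counts only. It does not verify that DB and Assigned_by appear in xrf_abbs, that the GO ID is extant, or that the date is a valid YYYYMMDD value.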


gp_information files

gp_information.goa_uniprot (gpi-version: 1.0) currently has these columns, some of which (db_subset, annotation_target_set, annotation_completed) are not present in the proposed 1.1 format.

It does not contain xrefs to other databases. These would be useful for mapping MOD-specific identifiers, symbols and synonyms to UniProt accessions, so that MOD curators moving to Protein2GO can search for familiar IDs and gene names.

column  name                   required?  cardinality   GAF column  Example
01      DB                     required   1             1           UniProtKB
02      DB_Subset              optional   0 or 1        -           Swiss-Prot or TrEMBL
03      DB_Object_ID           required   1             2           Q4VCS5-1
04      DB_Object_Symbol       required   1             3           AMOT
05      DB_Object_Name         optional   0 or 1        10          Angiomotin
06      DB_Object_Synonym(s)   optional   0 or greater  11          KIAA1071|IPI:IPI00163085|IPI:IPI00644547|UniProtKB:AMOT_HUMAN
07      DB_Object_Type         required   1             12          protein
08      Taxon                  required   1             13          taxon:9606
09      Annotation_Target_Set  optional   0 or greater  -           KRUK|Reference Genome
10      Annotation_Completed   optional   1             -           timestamp (YYYYMMDD)
11      Parent_Object_ID       optional   0 or 1        -           UniProtKB:Q4VCS5
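
As an illustration of how such a file might be read, the sketch below parses one line into a record and builds a symbol/synonym lookup of the kind a curator could use to find an entry from a familiar name. It is a hedged example rather than a reference parser. It assumes tab-separated columns in the order listed above, and "|" between multiple values as seen in the synonym example. The names parse_gpi_line and build_lookup are hypothetical, and the Annotation_Completed value in the example is made up.

```python
# Column names in the order of the table above.
GPI_COLUMNS = [
    "DB", "DB_Subset", "DB_Object_ID", "DB_Object_Symbol", "DB_Object_Name",
    "DB_Object_Synonym", "DB_Object_Type", "Taxon", "Annotation_Target_Set",
    "Annotation_Completed", "Parent_Object_ID",
]
# Columns whose cardinality is "0 or greater".
MULTI_VALUED = {"DB_Object_Synonym", "Annotation_Target_Set"}


def parse_gpi_line(line):
    """Turn one tab-separated line into a dict keyed by column name."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(GPI_COLUMNS):
        raise ValueError(
            "expected %d columns, found %d" % (len(GPI_COLUMNS), len(fields))
        )
    record = {}
    for name, field in zip(GPI_COLUMNS, fields):
        if name in MULTI_VALUED:
            # Pipe-separated list; an empty field becomes an empty list.
            record[name] = [v for v in field.split("|") if v]
        else:
            # Single value; an empty field becomes None.
            record[name] = field or None
    return record


def build_lookup(records):
    """Map each symbol and synonym to the set of DB_Object_IDs using it."""
    index = {}
    for rec in records:
        for key in [rec["DB_Object_Symbol"]] + rec["DB_Object_Synonym"]:
            index.setdefault(key, set()).add(rec["DB_Object_ID"])
    return index


# Example row assembled from the Example column of the table; the
# Annotation_Completed date is an invented placeholder.
example = "\t".join([
    "UniProtKB", "Swiss-Prot", "Q4VCS5-1", "AMOT", "Angiomotin",
    "KIAA1071|IPI:IPI00163085|IPI:IPI00644547|UniProtKB:AMOT_HUMAN",
    "protein", "taxon:9606", "KRUK|Reference Genome", "20121024",
    "UniProtKB:Q4VCS5",
])
record = parse_gpi_line(example)
lookup = build_lookup([record])
```

If xrefs to other databases were added to the file, they could be indexed the same way as the synonyms are here.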