Oct 2012 Meeting to finalize GPAD specification (Archived)
Link to the previously proposed GPI format: http://wiki.geneontology.org/index.php/Gene_Product_Data_File_Format
and to the previously proposed GPAD format: http://wiki.geneontology.org/index.php/Gene_Product_Association_Data_%28GPAD%29_Format
Issues for discussion:
gp_association files
- Translating existing qualifiers CONTRIBUTES_TO and COLOCALIZES_WITH into annotation relations (+ NOT?)
- Filling in relationships retrospectively - implicit or explicit values in qualifier column?
- Splice forms - a separate column, with the canonical form in column 2 as per GAF, or column 2 only?
contents | required? | cardinality | old column # | extra info |
---|---|---|---|---|
DB | required | 1 | 1 | must be in xrf_abbs |
DB_Object_ID | required | 1 | 2 | |
Qualifier | optional | 0 or greater | 4 | (NOT or integral_to)? (other_organism or colocalizes_with or contributes_to)? annotation_relation |
GO ID | required | 1 | 5 | must be extant GO ID |
DB:Reference(s) | required | 1 or greater | 6 | DB must be in xrf_abbs |
Evidence code | required | 1 | 7 | from ECO |
With (or) From | optional | 0 or greater | 8 | |
Interacting taxon ID (for multi-organism processes) | optional | 0 or 1 | 13 | ncbi taxon ID |
Date | required | 1 | 14 | YYYYMMDD |
Assigned_by | required | 1 | 15 | from xrf_abbs |
Annotation XP (Annotation Cross Products) | optional | 0 or greater | 16 | |
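
To make the required/cardinality rules in the table above concrete, here is a minimal Python sketch of a row check. It assumes tab-delimited columns in the order listed and a "|" separator for multi-valued fields, both borrowed from GAF rather than stated in these notes. The function name `parse_gpad_line` and the sample values are hypothetical, and the qualifier grammar itself is not checked because it was still under discussion.

```python
import re

# (name, required, multi-valued), in the column order of the table above.
GPAD_COLUMNS = [
    ("db", True, False),
    ("db_object_id", True, False),
    ("qualifier", False, True),
    ("go_id", True, False),
    ("references", True, True),
    ("evidence_code", True, False),
    ("with_from", False, True),
    ("interacting_taxon", False, False),
    ("date", True, False),
    ("assigned_by", True, False),
    ("annotation_xp", False, True),
]


def parse_gpad_line(line):
    """Split one row into named fields and enforce required/cardinality rules."""
    # Assumption: columns are tab-separated; empty optional columns are kept.
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(GPAD_COLUMNS):
        raise ValueError(f"expected {len(GPAD_COLUMNS)} columns, got {len(fields)}")
    record = {}
    for (name, required, multiple), raw in zip(GPAD_COLUMNS, fields):
        if multiple:
            # Assumption: multiple values are pipe-separated, as in GAF.
            values = [v for v in raw.split("|") if v]
        else:
            values = [raw] if raw else []
        if required and not values:
            raise ValueError(f"{name} is required")
        record[name] = values if multiple else (values[0] if values else None)
    if not re.fullmatch(r"\d{8}", record["date"]):
        raise ValueError("date must be YYYYMMDD")
    return record


# Illustrative row only; the identifiers are placeholders, not real annotations.
example = "\t".join([
    "UniProtKB", "Q4VCS5", "NOT|part_of", "GO:0000001", "PMID:0000000",
    "ECO:0000000", "", "", "20121001", "UniProt", "",
])
record = parse_gpad_line(example)
```

Checks that depend on external resources (xrf_abbs membership, extant GO IDs, ECO codes) are left out of the sketch on purpose.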
gp_information files
gp_information.goa_uniprot (gpi-version: 1.0) currently has the columns listed below. Some of them (db_subset, annotation_target_set, annotation_completed) are not present in the proposed 1.1 format.
The file does not contain xrefs to other databases. Such xrefs would be useful for mapping MOD-specific identifiers, symbols, and synonyms to UniProt accessions, so that MOD curators moving to Protein2GO can search by familiar IDs and gene names.
column | name | required? | cardinality | GAF column | Example |
---|---|---|---|---|---|
01 | DB | required | 1 | 1 | UniProtKB |
02 | DB_Subset | optional | 0 or 1 | - | Swiss-Prot or TrEMBL |
03 | DB_Object_ID | required | 1 | 2 | Q4VCS5-1 |
04 | DB_Object_Symbol | required | 1 | 3 | AMOT |
05 | DB_Object_Name | optional | 0 or 1 | 10 | Angiomotin |
06 | DB_Object_Synonym(s) | optional | 0 or greater | 11 | KIAA1071\|IPI:IPI00163085\|IPI:IPI00644547\|UniProtKB:AMOT_HUMAN |
07 | DB_Object_Type | required | 1 | 12 | protein |
08 | Taxon | required | 1 | 13 | taxon:9606 |
09 | Annotation_Target_Set | optional | 0 or greater | - | KRUK\|Reference Genome |
10 | Annotation_Completed | optional | 0 or 1 | - | timestamp (YYYYMMDD) |
11 | Parent_Object_ID | optional | 0 or 1 | - | UniProtKB:Q4VCS5 |
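
As a rough illustration of the identifier lookup discussed above, the sketch below builds an index from symbols and synonyms to accessions using only the columns the 1.0 file already has. It assumes tab-delimited rows, "|" between synonyms, and header lines starting with "!" (a GAF convention, not confirmed here). The function name `build_synonym_index` is hypothetical, and the date in the sample row is made up.

```python
from collections import defaultdict

GPI_COLUMN_COUNT = 11  # columns 01-11 in the table above


def build_synonym_index(lines):
    """Map each symbol or synonym to the set of DB:DB_Object_ID values using it."""
    index = defaultdict(set)
    for line in lines:
        # Assumption: blank lines and "!"-prefixed header lines carry no records.
        if not line.strip() or line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != GPI_COLUMN_COUNT:
            raise ValueError(f"expected {GPI_COLUMN_COUNT} columns, got {len(cols)}")
        accession = f"{cols[0]}:{cols[2]}"  # DB + DB_Object_ID
        index[cols[3]].add(accession)       # DB_Object_Symbol
        for synonym in cols[5].split("|"):  # DB_Object_Synonym(s)
            if synonym:
                index[synonym].add(accession)
    return index


# Sample row assembled from the Example column; the date is illustrative.
row = "\t".join([
    "UniProtKB", "Swiss-Prot", "Q4VCS5-1", "AMOT", "Angiomotin",
    "KIAA1071|IPI:IPI00163085|IPI:IPI00644547|UniProtKB:AMOT_HUMAN",
    "protein", "taxon:9606", "KRUK|Reference Genome", "20121001",
    "UniProtKB:Q4VCS5",
])
index = build_synonym_index(["!gpi-version: 1.0", row])
```

A set is used as the value because the same symbol or synonym may be shared by several splice forms or gene products.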