Oct 2012 Meeting to finalize GPAD specification (Archived)
Revision by Tonysawford
Revision as of 10:41, 21 November 2012
Link to the previously proposed GPI format: http://wiki.geneontology.org/index.php/Gene_Product_Data_File_Format
and to the previously proposed GPAD format: http://wiki.geneontology.org/index.php/Gene_Product_Association_Data_%28GPAD%29_Format
Issues for discussion:
gp_association files
- Translating existing qualifiers CONTRIBUTES_TO and COLOCALIZES_WITH into annotation relations (+ NOT?)
- Filling in relationships retrospectively - implicit or explicit values in qualifier column?
contents | required? | cardinality | old column # | extra info |
---|---|---|---|---|
DB | required | 1 | 1 | must be in xrf_abbs |
DB_Object_ID | required | 1 | 2 | canonical or spliceform ID |
Qualifier | optional | 0 or greater | 4 | (NOT or integral_to)? (other_organism or colocalizes_with or contributes_to)? annotation_relation |
GO ID | required | 1 | 5 | must be extant GO ID |
DB:Reference(s) | required | 1 or greater | 6 | DB must be in xrf_abbs |
Evidence code | required | 1 | 7 | from ECO |
With (or) From | optional | 0 or greater | 8 | |
Interacting taxon ID (for multi-organism processes) | optional | 0 or 1 | 13 | ncbi taxon ID |
Date | required | 1 | 14 | YYYYMMDD |
Assigned_by | required | 1 | 15 | from xrf_abbs |
Annotation XP (Annotation Cross Products) | optional | 0 or greater | 16 | |
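The required/optional and cardinality rules in the table above could be checked mechanically. The following is a minimal sketch, not part of the proposal. It assumes tab-separated columns in the order listed, a pipe between multiple values in one column, and empty strings for absent optional values. The function name `parse_gpad_line` and the field names are hypothetical. Checks that need external resources (xrf_abbs membership, extant GO IDs, ECO terms) are left out.

```python
import re
from datetime import datetime

# (name, required, allows_multiple), in the column order of the table above.
# Field names and separators are assumptions made for this sketch.
COLUMNS = [
    ("DB", True, False),
    ("DB_Object_ID", True, False),
    ("Qualifier", False, True),
    ("GO_ID", True, False),
    ("DB_Reference", True, True),
    ("Evidence_code", True, False),
    ("With_From", False, True),
    ("Interacting_taxon_ID", False, False),
    ("Date", True, False),
    ("Assigned_by", True, False),
    ("Annotation_XP", False, True),
]


def parse_gpad_line(line):
    """Split one association row and enforce required fields and cardinality."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} columns, got {len(fields)}")

    record = {}
    for (name, required, multiple), raw in zip(COLUMNS, fields):
        if multiple:
            values = [v for v in raw.split("|") if v]
        else:
            values = [raw] if raw else []
        if required and not values:
            raise ValueError(f"{name} is required but empty")
        # Multi-valued columns become lists; single-valued ones a string or None.
        record[name] = values if multiple else (values[0] if values else None)

    # Date must be exactly eight digits and a real calendar date (YYYYMMDD).
    if not re.fullmatch(r"\d{8}", record["Date"]):
        raise ValueError("Date must be YYYYMMDD")
    datetime.strptime(record["Date"], "%Y%m%d")
    return record
```

A row with an empty required column, the wrong number of columns, or a malformed date raises `ValueError`. Empty optional columns come back as `None` or an empty list.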
gp_information files
1. gp_information.goa_uniprot (gpi-version: 1.0) currently has these columns, some of which (db_subset, annotation_target_set, annotation_completed) are not present in the proposed 1.1 format.
2. It does not contain xrefs to other databases. These would be useful for mapping MOD-specific identifiers, symbols, and synonyms to UniProt accessions, so that MOD curators moving to Protein2GO can search for familiar IDs and gene names.
3. Is it possible to drop column 1 and replace it with a header line specifying the namespace of the annotating group's identifiers, e.g. WB: or UniProtKB:?
None of the gp2protein files in the GOC SVN repository refer to objects from more than one namespace in column 1, so repeating the namespace in every row of the gpi file seems redundant. Parent_object_id would only need to be qualified if it referred to an object from a different namespace (is this ever likely to happen in practice?).
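To make the header-line idea in point 3 concrete, here is a rough sketch of how a reader might reconstruct fully qualified identifiers when the namespace is declared once per file. The header syntax `!namespace: UniProtKB` and the function name `qualify_ids` are invented for illustration and are not part of any agreed format. The sketch also assumes that the object ID becomes the first tab-separated column once the DB column is dropped.

```python
def qualify_ids(lines, header_key="!namespace:"):
    """Yield namespace-qualified object IDs from a gpi-like file.

    The namespace is read from a single header line (hypothetical syntax).
    IDs that already carry a prefix are passed through unchanged.
    """
    namespace = None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(header_key):
            namespace = line[len(header_key):].strip()
            continue
        if not line or line.startswith("!"):
            continue  # skip blank lines and other header/comment lines
        if namespace is None:
            raise ValueError("no namespace header before the first data row")
        object_id = line.split("\t")[0]
        # Only unqualified IDs need the file-level namespace prepended.
        yield object_id if ":" in object_id else f"{namespace}:{object_id}"
```

Under this scheme an ID from another namespace, such as a Parent_Object_ID pointing elsewhere, stays explicitly qualified, while everything else inherits the file-level namespace.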
column | name | required? | cardinality | GAF column | Example |
---|---|---|---|---|---|
01 | DB | required | 1 | 1 | UniProtKB |
02 | DB_Subset | optional | 0 or 1 | - | Swiss-Prot or TrEMBL |
03 | DB_Object_ID | required | 1 | 2 | Q4VCS5-1 |
04 | DB_Object_Symbol | required | 1 | 3 | AMOT |
05 | DB_Object_Name | optional | 0 or 1 | 10 | Angiomotin |
06 | DB_Object_Synonym(s) | optional | 0 or greater | 11 | KIAA1071|IPI:IPI00163085|IPI:IPI00644547|UniProtKB:AMOT_HUMAN |
07 | DB_Object_Type | required | 1 | 12 | protein |
08 | Taxon | required | 1 | 13 | taxon:9606 |
09 | Annotation_Target_Set | optional | 0 or greater | - | KRUK|Reference Genome |
10 | Annotation_Completed | optional | 1 | - | timestamp (YYYYMMDD) |
11 | Parent_Object_ID | optional | 0 or 1 | - | UniProtKB:Q4VCS5 |
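The column layout above can be read as a simple record type. The sketch below is one possible reading, assuming tab-separated columns in the order shown, a pipe between multiple values (as in the synonym example), and empty strings for absent optional values. The names `GpiRecord` and `parse_gpi_row` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GpiRecord:
    db: str
    db_subset: Optional[str]
    db_object_id: str
    db_object_symbol: str
    db_object_name: Optional[str]
    synonyms: List[str]
    db_object_type: str
    taxon: str
    annotation_target_sets: List[str]
    annotation_completed: Optional[str]  # YYYYMMDD when present
    parent_object_id: Optional[str]


def _optional(value):
    """Map an empty column to None."""
    return value or None


def _multi(value):
    """Split a pipe-separated column into a list, dropping empty items."""
    return [item for item in value.split("|") if item]


def parse_gpi_row(line):
    """Parse one 11-column gp_information row into a GpiRecord."""
    f = line.rstrip("\n").split("\t")
    if len(f) != 11:
        raise ValueError(f"expected 11 columns, got {len(f)}")
    # Columns 01, 03, 04, 07 and 08 are marked required in the table.
    for index in (0, 2, 3, 6, 7):
        if not f[index]:
            raise ValueError(f"column {index + 1:02d} is required but empty")
    return GpiRecord(
        db=f[0],
        db_subset=_optional(f[1]),
        db_object_id=f[2],
        db_object_symbol=f[3],
        db_object_name=_optional(f[4]),
        synonyms=_multi(f[5]),
        db_object_type=f[6],
        taxon=f[7],
        annotation_target_sets=_multi(f[8]),
        annotation_completed=_optional(f[9]),
        parent_object_id=_optional(f[10]),
    )
```

Fed the example values from the table, the row for Q4VCS5-1 would yield four synonyms and two annotation target sets, with `parent_object_id` set to `UniProtKB:Q4VCS5`.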