|
|
(4 intermediate revisions by 3 users not shown) |
Line 1: |
Line 1: |
| ===gp_information files (GPI)===
| | Moved to https://github.com/geneontology/go-annotation/tree/master/specs |
|
| |
|
|
| |
|
| <pre>
| | [[Category:Software]] |
| N.B. The first line in the gp_information file should be:
| |
| | |
| !gpi-version: 1.2
| |
| | |
| </pre>
| |
| | |
| ====Proposed format (March 2014)====
| |
| | |
| | |
| {| cellspacing="2" border="1"
| |
| |-
| |
| ! column
| |
| ! name
| |
| ! required?
| |
| ! cardinality
| |
| ! GAF column
| |
| ! Example for UniProt
| |
| ! Example for IntAct
| |
| |-
| |
| | 01 || DB || required || 1 || 1 || UniProtKB || IntAct
| |
| |-
| |
| | 02 || DB_Object_ID || required || 1 || 2/17 || Q4VCS5-1 || EBI-9008420
| |
| |-
| |
| | 03 || DB_Object_Symbol || required || 1 || 3 || AMOT || HBA1:HBB
| |
| |-
| |
| | 04 || DB_Object_Name || optional || 0 or 1 || 10 || Angiomotin || Hemoglobin HbA complex
| |
| |-
| |
| | 05 || DB_Object_Synonym(s) || optional || 0 or greater || 11 || AMOT_HUMAN|KIAA1071|AMOT || HBA-HBB complex|HBA1-HBB complex|HBA1-HBB heterotetramer
| |
| |-
| |
| | 06 || DB_Object_Type || required || 1 || 12 || protein || complex
| |
| |-
| |
| | 07 || Taxon || required || 1 || 13 || 9606 || 9606
| |
| |-
| |
| | 08 || Parent_Object_ID || optional || 0 or 1 || || UniProtKB:Q4VCS5 ||
| |
| |-
| |
| | 09 || DB_Xref(s) || optional || 0 or greater || || UniProtKB:P38433 || PR:000025934
| |
| |-
| |
| | 10 || Gene_Product_Properties || optional || 0 or greater || || See Note 4 below ||
| |
| |-
| |
| |}
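The column layout above can be sketched as a small reader. This is an illustrative sketch only, not a reference implementation: it assumes the columns are tab-separated and that any line starting with "!" is a header or comment line, neither of which is stated in the table itself. The names parse_gpi_line and read_gpi are hypothetical helpers introduced for this example.

```python
from typing import Dict, Iterable, Iterator, List, Union

Record = Dict[str, Union[str, List[str]]]

# Column names in table order (columns 01 to 10).
GPI_COLUMNS = [
    "DB", "DB_Object_ID", "DB_Object_Symbol", "DB_Object_Name",
    "DB_Object_Synonym", "DB_Object_Type", "Taxon",
    "Parent_Object_ID", "DB_Xref", "Gene_Product_Properties",
]
# Columns with cardinality "0 or greater" hold pipe-separated lists.
MULTI_VALUED = {"DB_Object_Synonym", "DB_Xref", "Gene_Product_Properties"}
# Columns marked "required" in the table.
REQUIRED = {"DB", "DB_Object_ID", "DB_Object_Symbol", "DB_Object_Type", "Taxon"}


def parse_gpi_line(line: str) -> Record:
    """Split one data line into named fields (assumes tab-separated columns)."""
    fields = line.rstrip("\r\n").split("\t")
    # Pad so that omitted trailing optional columns read as empty.
    fields += [""] * (len(GPI_COLUMNS) - len(fields))
    record: Record = {}
    for name, value in zip(GPI_COLUMNS, fields):
        if name in MULTI_VALUED:
            record[name] = [v for v in value.split("|") if v]
        else:
            record[name] = value
    missing = sorted(name for name in REQUIRED if not record[name])
    if missing:
        raise ValueError("missing required column(s): " + ", ".join(missing))
    return record


def read_gpi(lines: Iterable[str]) -> Iterator[Record]:
    """Check the version line, then yield one record per data line."""
    iterator = iter(lines)
    first = next(iterator, "").strip()
    if first != "!gpi-version: 1.2":
        raise ValueError("first line must be '!gpi-version: 1.2'")
    for line in iterator:
        # Assumption: further '!' lines are header/comment lines.
        if line.startswith("!") or not line.strip():
            continue
        yield parse_gpi_line(line)
```

Passing an open file handle to read_gpi yields one dictionary per gene product; multi-valued columns come back as lists, and the other columns as plain strings.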
| |
| | |
| | |
| '''Notes'''
| |
| | |
| 1. Where a column may hold more than one value (e.g. 'with', DB_Object_Synonym(s), DB_Xref(s)), the values should be given as a pipe-separated list.
| |
| | |
| | |
| 2. The DB_Xrefs column will be useful for mapping MOD-specific identifiers/symbols/synonyms to UniProt accessions, to assist MOD curators moving to Protein2GO in searching for familiar IDs/gene names. In the case of IntAct complex IDs, it will be useful to include PRO IDs as an xref to enable a look-up function in Protein2GO. In the case where the value in column #2 represents a MOD gene identifier, the Xref should correspond to the UniProtKB identifier for the GCRP.
| |
| | |
| 3. Identifiers in the Parent_Object_ID column must have a prefix to avoid confusion in cases where an ID from a different database to the one specified in the header is included.
| |
| | |
| 4. The Gene Product Properties column can be filled with a pipe-separated list of "property_name = property_value" pairs. The property names will come from a fixed vocabulary, and this list can be extended when necessary.
| |
| Supported properties will include: 'GO annotation complete' and 'Phenotype annotation complete' (the value for these two properties would be a date), 'Target set' (e.g. Reference Genome, Kidney), 'Database subset' (e.g. Swiss-Prot, TrEMBL), and 'go_annotation_summary' (a textual summary of the annotations for an entity).
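The "property_name = property_value" convention in Note 4 can be sketched as follows. This is a minimal sketch under stated assumptions: whitespace around "=" is treated as insignificant, and a property name may repeat (for example two 'Target set' entries), so values are collected into lists. Neither point is stated in the notes, and parse_properties is a hypothetical helper name.

```python
from typing import Dict, List


def parse_properties(column_value: str) -> Dict[str, List[str]]:
    """Parse a pipe-separated list of 'name = value' pairs from column 10."""
    properties: Dict[str, List[str]] = {}
    for item in column_value.split("|"):
        item = item.strip()
        if not item:
            continue  # an empty column yields no properties
        name, separator, value = item.partition("=")
        if not separator:
            raise ValueError("expected 'property_name = property_value', got: " + item)
        # Assumption: a property name may occur more than once.
        properties.setdefault(name.strip(), []).append(value.strip())
    return properties


example = "Database subset = Swiss-Prot|Target set = Reference Genome|Target set = Kidney"
parsed = parse_properties(example)
```

Here parsed maps 'Database subset' to a one-element list and 'Target set' to a two-element list. A stricter reader could additionally reject property names that are not in the fixed vocabulary.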
| |
| | |
| | |
| | |
| [[Category:Specification]]
| |
| [[Category:GPAD]] | |