GAF file 2.0

  • GAF file header: (please fill in)
  • Column 2 should represent the 'canonical gene' and should map to column 1 of the Gene2geneproduct_file.
  • Column 12 now describes the entity in column 17 (or the entity in column 2 if column 17 is empty).
  • Column 16 can be left blank.
  • Column 17 should represent the unit annotated (if different from that in column 2).
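
The fallback rule in the bullets above (use column 17 when it is filled, otherwise column 2) can be sketched as follows. This is a minimal sketch, assuming tab-delimited lines with exactly 17 columns and assuming that header/comment lines start with '!'; the function names `parse_gaf_line` and `annotated_unit` are hypothetical and not part of any GO tool.

```python
from typing import List, Optional

GAF2_COLUMNS = 17  # the format discussed here has 17 columns


def parse_gaf_line(line: str) -> Optional[List[str]]:
    """Split one tab-delimited annotation line into its 17 fields.

    Returns None for blank lines and for header/comment lines, which are
    assumed here to start with '!'.
    """
    line = line.rstrip("\r\n")  # strip the line ending only, keep trailing tabs
    if not line or line.startswith("!"):
        return None
    fields = line.split("\t")
    if len(fields) != GAF2_COLUMNS:
        raise ValueError(f"expected {GAF2_COLUMNS} columns, got {len(fields)}")
    return fields


def annotated_unit(fields: List[str]) -> str:
    """Return the entity that column 12 describes.

    Column 17 wins when it is filled; otherwise fall back to column 2.
    """
    col2 = fields[1]            # column 2 (0-based index 1)
    col17 = fields[16].strip()  # column 17 (0-based index 16)
    return col17 if col17 else col2
```

For a row whose column 2 is `P12345` and whose column 17 is empty, `annotated_unit` returns `P12345`; if column 17 holds an isoform identifier, that identifier is returned instead.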



Can all groups provide the new file? (Yes or No; if not, please explain):

Reference Genome Group Contact Person Y/N Reason
SGD Stacia Engel
MGI Li Ni MGI is ready to start including data in column 16 for CL, although this additional information will not be displayed on the web.
Added by Emily from 14-12-2009 Col.16 meeting
FlyBase Susan Tweedie No FlyBase does not currently support curation of column 16 or 17 data during GO annotation. As at ZFIN, the required changes to the DB, curation interface, data checking/integration tool and web pages are not a high priority, but I will continue to push for this. I will also try to generate some example data and use it to convince the other curators that the extra effort involved in capturing this information is worthwhile (they have expressed concern that finding out which isoform was used in an experiment, for col 17, could be very time consuming).
dictyBase Pascale Gaudet No dictyBase does not currently support curation of column 16 or 17 data during GO annotation. We require major changes to our data schema before we can modify our GO tool (we are currently migrating to Chado), so implementing column 16 and 17 for curation will likely take months.
E.coli Jim Hu E.coli would like to start supplying such data in ~6 months.
Added by Emily from 14-12-2009 Col.16 meeting
TAIR Tanya Berardini
WormBase Kimberly Van Auken WormBase would like to start supplying such data in ~6 months.
Added by Emily from 14-12-2009 Col.16 meeting
S. pombe Val Wood No GeneDB does not currently support curation of column 16 or 17 data during GO annotation. The required changes to curation interface, database and web pages are not a high priority for GeneDB.

However, I will try to get the option added to the curation tool so that I can curate the additional information, even if I can't display it.

Column 16: I would like to start to collect data for 17 as soon as I can use it. My interests are mainly in the BP-MF type examples. I also have a number of "qualifiers" which represent the cell cycle or life cycle phase when a process or localization is occurring, for example during_early_meiosis, during_meiotic_prophase, during_meiosis_I, during_meiosis_II, during_anaphase and during_vegetative_growth. These could probably be represented in a "life cycle stage" ontology and referenced in this column?

Column 17: Fission yeast does not have many alternatively spliced genes, although there are a couple of examples of alternative transcripts. These are not currently suitable for GO annotation, as all of the experimentation is done on the canonical form. For some examples the alternative transcripts are regulatory but do not produce a protein coding gene, and so have been given a separate identifier as a non-coding RNA (I can think of one example). I don't think I would use this column much for alternative splice forms. If I remember correctly, this column will also be used for representing a modified (for example phosphorylated) form of a protein. I can see that I would potentially use the column for this, but at present I have no identifiers to represent modified/non-modified forms, although I do have this information in a qualifier, "phosphorylated_form". I don't feel a pressing need to get this information into GO in the grand scheme of things.

RGD Stan Laulederkind No RGD doesn't currently have the capability to create column 16 or 17 from our curation software. We may start using column 16 within 6 months, but our use of column 17 is less likely.
Human Emily Dimmer Yes Column 16: GOA has recently started to capture data relevant for column 16 as free text. We would like to model this information in our database and eventually release the data in column 16, but this is likely to take some time.

Column 17: GOA currently annotates to both UniProtKB accessions (e.g. P12345) and UniProtKB isoform ids (P12345-1). Therefore these isoform ids could be used to populate column 17 with information on specific protein products. We would like to start producing a GAF 2.0 format file in December 2009.

Zebrafish Doug Howe No ZFIN does not currently support curation of column 16 or 17 data during GO annotation. We are not likely to support curation of this type of data in ZFIN any time soon, as it will require DB, curation interface, and public display changes that we are unlikely to have time to implement, though I wish we could. I may still try to drum up some example data, but will have to manually manufacture a GAF with col. 16/17 data added on. When these columns become mandatory, we can of course upgrade our GAF to include 17 columns even if we don't populate 16/17 with curated data.
Chicken Fiona McCarthy
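
Several responses above describe emitting the two new columns before a group is able to curate them: ZFIN mentions upgrading its GAF to 17 columns while leaving 16/17 unpopulated, and GOA notes that UniProtKB isoform ids could populate column 17. A minimal sketch of that upgrade step follows, assuming tab-delimited input lines with 15 columns in the old layout. The function name `upgrade_to_17_columns` and its `product_form_id` parameter are hypothetical, and the `DB:ID` form of the isoform identifier in the usage example is an assumption rather than something this page specifies.

```python
def upgrade_to_17_columns(line: str, product_form_id: str = "") -> str:
    """Pad a 15-column annotation line to 17 columns.

    Columns 16 and 17 are appended empty. If product_form_id is given
    (hypothetical parameter), it is written into column 17. Lines that
    already have 17 columns are accepted and only column 17 is touched.
    """
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) == 15:
        fields += ["", ""]  # add empty columns 16 and 17
    elif len(fields) != 17:
        raise ValueError(f"expected 15 or 17 columns, got {len(fields)}")
    if product_form_id:
        fields[16] = product_form_id  # column 17 (0-based index 16)
    return "\t".join(fields)


# Usage with made-up values: an annotation to accession P12345 whose
# experiment used isoform P12345-1. The "UniProtKB:" prefix is an assumption.
old_line = "\t".join(
    ["UniProtKB", "P12345", "GENE1", "", "GO:0005515", "PMID:0000000",
     "IPI", "", "F", "", "", "protein", "taxon:9606", "20091214", "UniProt"]
)
new_line = upgrade_to_17_columns(old_line, "UniProtKB:P12345-1")
```

Calling the function without `product_form_id` gives the "17 columns, 16/17 empty" file that ZFIN describes.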

Back to Reference_Genome_Annotation_Project