GO Consortium Meeting 2007

From GO Wiki

Revision as of 09:40, 26 October 2006

Topics

Please add items below that you think need to be presented. We are not -yet- putting these in any particular order or time, we're just collecting the topics we need to address while we are together.

GO Team and other Status Reports

Listed with potential people to provide the summaries

  1. Reference Genomes - Rex&Karen
  2. Ontology Content - David&Midori
    • IS_A complete
    • regulates
    • Cell Ontology links
  3. Ontology & Software - Chris&Ben/Mike
    • Includes OBO-Edit working group report
  4. Annotation outreach - Jen&Michelle/Simon
  5. User Advocacy - Eurie&Jane
    • Includes AmiGO working group report
  6. Operations - Suzi
  7. Publications

Issues to be addressed

  1. Whether to use GO_REF references or to extend the evidence codes in order to describe the method used in more detail.
  2. gp2protein file:
    • The lag time between publication of the protein sequences of a newly sequenced and annotated organism and their appearance in UniProt. For example, even now only ~10% of the 27,855 Arabidopsis protein sequences are contained in Swiss-Prot. For the final release (version 5) of Arabidopsis, 306 proteins (~1%) are available in Swiss-Prot and 374 in TrEMBL. Older Arabidopsis sequences are found in TrEMBL, but fully one third of the sequences in the first release have changed over the life of the project.

(Answer from Evelyn): This problem stems from the fact that corrections to the original genome sequence have been submitted only to TAIR, not to EMBL. Paul Kersey at EBI is responsible for importing sequences from TAIR into UniProtKB (Evelyn is querying this). Why is this data or annotation not submitted to the EMBL/DDBJ/GenBank international nucleotide sequence databases? Or is it?

    • GOST needs to use the precise sequences, as supplied by the submitting group
  3. Handling multiple identifiers for gene products and sequences
  4. Evaluation of project tracking methods
  5. GO policy on incorporating GOA annotations into MOD annotations (Evelyn and Mike/Judy?)

GO annotations have been stripped out of the GOA-UniProt (all species) file on the GO site for taxon IDs represented by other GOC members. The idea was that those GOC members would integrate annotations from GOA. In practice, experimentally verified data from GOA was being lost: roughly 6000 for mouse alone. The GOA group has been receiving a lot of questions about how to get complete annotation datasets from GOC. The unstripped GOA-UniProt file is available on the EBI FTP site. GOA is now integrating all experimental data from all other GOC members on a monthly basis. Can we have a GO policy on this? If a GOC member cannot integrate GOA manual annotations, should that taxon ID continue to be stripped from the GOA-UniProt file?
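To make the stripping step concrete, here is a minimal sketch of filtering an annotation file by taxon ID. It assumes a tab-delimited gene association layout with the taxon in the 13th column and comment lines starting with `!`; the function name `strip_taxa` and the sample rows are hypothetical and are not taken from the GOA pipeline.

```python
from typing import Iterable, Iterator, Set

# Assumption: 0-based index of the taxon column in a tab-delimited
# gene association row (values look like "taxon:10090").
TAXON_COL = 12


def strip_taxa(lines: Iterable[str], stripped: Set[str]) -> Iterator[str]:
    """Yield every line except annotation rows whose taxon is in `stripped`."""
    for line in lines:
        # Keep comment/header lines untouched.
        if line.startswith("!"):
            yield line
            continue
        cols = line.rstrip("\n").split("\t")
        # Rows too short to carry a taxon column are passed through.
        if len(cols) > TAXON_COL:
            taxon = cols[TAXON_COL].split("|")[0]
            if taxon in stripped:
                continue
        yield line


if __name__ == "__main__":
    # Hypothetical rows, for illustration only.
    def row(taxon: str) -> str:
        cols = ["DB", "ID", "SYM", "", "GO:0005515", "PMID:1", "IPI",
                "", "F", "", "", "protein", taxon, "20061026", "DB"]
        return "\t".join(cols) + "\n"

    data = ["!comment\n", row("taxon:10090"), row("taxon:9606")]
    for kept in strip_taxa(data, {"taxon:10090"}):
        print(kept, end="")
```

Whether a taxon is stripped then comes down to membership in a single set, which is the part a consortium-wide policy would need to define.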

  6. More?? Claim some? Explain some?
  7. Piped data for IPI, need consistency in usage (Evelyn)

IGI data allows piped accessions in the 'with' column to capture the fact that two or more genes may be interacting simultaneously. IPI data also allows piped accessions in the 'with' column, but usage differs. Some GOC members use the pipe specifically to say that, in a given paper, proteins A, B and C precipitated together or form part of a complex. Others, I think, also use it where two separate experiments in the same paper showed that protein A interacted with protein B and with protein C. GOA prefers using it as for IGI, for that specific circumstance only; otherwise information is lost. Others?

Related issue: GOA has decided for the moment not to pipe several protein binding interactions simply because they come from the same paper. We unwrap piped data from MODs because of the inconsistency in usage and because this data is not normalised (which causes problems for the database and web services).
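As an illustration of what unwrapping piped data could look like, the sketch below expands one annotation row with a piped 'with' value into one row per accession. It assumes a tab-delimited row in which the 'with' value is the 8th column; the function name `unwrap_with` and the accessions are hypothetical, and this is not GOA's actual code.

```python
from typing import Iterator, List

# Assumption: 0-based index of the 'with' column in an annotation row.
WITH_COL = 7


def unwrap_with(row: List[str]) -> Iterator[List[str]]:
    """Yield one copy of `row` per accession in its piped 'with' value."""
    parts = [p.strip() for p in row[WITH_COL].split("|") if p.strip()]
    if not parts:
        # Empty 'with' value: pass the row through unchanged.
        yield list(row)
        return
    for accession in parts:
        new_row = list(row)
        new_row[WITH_COL] = accession
        yield new_row


if __name__ == "__main__":
    # Hypothetical IPI annotation with two piped accessions.
    example = ["UniProtKB", "P00001", "GENE1", "", "GO:0005515", "PMID:1",
               "IPI", "UniProtKB:P00002|UniProtKB:P00003", "F", "", "",
               "protein", "taxon:9606", "20061026", "UniProtKB"]
    for r in unwrap_with(example):
        print("\t".join(r))
```

Note that unwrapping discards whatever the pipe was meant to convey, whether a single complex or separate experiments, which is why consistent usage matters before the data is normalised.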

  8. GO Consortium Tools (Evelyn, Emily)

GOA feels that GOC should not list tools on the GO tools page unless they are maintained, or should at least highlight which ones are not. We also feel that we should consider a GOC-reviewed set of perhaps the top 10 tools that we can recommend and liaise with on a regular basis. GOA can do that independently of GOC if GOC does not want to take such a position. Most users want advice on GO tools, and presenting them with over 100 is not very helpful. We also need to consider how to modify the next GO users/tools meeting (already discussed on GO management, I think?).

New proposals

  1. Protein Family based annotation tool - Suzi
  2. Term history tracking capability - John/Chris/and OBO-Edit group
  3. Incorporation of all gene product sequences and IDs into the GO database and FASTA files. How are we to accomplish this?

Venue

 Click here to register for this meeting. Please do this by October 31st
 Jesus College
 Jesus Lane, Cambridge, CB5 8BL, UK
 Cambridge, UK