RefGenome10Jun08 Phone Conference (Archived)

From GO Wiki
Revision as of 11:35, 10 June 2008 by Pascale (talk | contribs)

Tuesday June 10, 10 AM CDT (8 AM PDT, 4 PM BST)

Present

  • Emily GOA
  • Rachael GOA
  • Pascale dictyBase
  • Kara
  • Donghui TAIR
  • Mary MGI
  • Judy MGI
  • Susan Flybase
  • Suzi
  • Chris
  • Val
  • Fiona
  • Kimberly
  • Stacia SGD

ACTION ITEMS

Review old action items

  1. All: Annotation Quality control: Please pick an ortholog set from the Curation Targets table [1]
  2. All: Annotation Quality control: Have a look at the SF items and see if the ortholog from your organism is correctly annotated ("comprehensive"). Let the lead curator for that set know when you're done.
  3. Seth: send the URL of the ortholog tool prototype sometime this week

Action items from the meeting

  1. Paul will grab gp2protein files on May 1st and begin his run
  2. Develop annotation SOPs - this needs to be assigned

New action items

[Action item] : All : fill in the old Google spreadsheets so that Mary can generate the ortho sets for making the graphs.

Meetings

  • Please sign up for the electronic refgenome curation jamboree http://doodle.ch/np9byb6avdi8zwc4
  • Next GOC meeting will be in Montreal, Tues-Wed Oct 21-22. Should we add a day for Ref Genome annotation discussion? That would likely be Monday Oct 20.

Software

Graphs

Mary: I have updated the PPOD12 graphs (spreadsheet links should work as before): http://proto.informatics.jax.org/prototypes/GOgraphEX/PPOD12_Graphs/

I have also made graphs for the new targets (not much there yet): http://proto.informatics.jax.org/prototypes/GOgraphEX/PPOD_Graphs_June/

I can continue to update both sets as folks add annotations until software takes over.

Orthology Analysis

Paul:

Refgenome update 2008-06-10

  • Completed to date:
  1. Assemble gp2protein files for all Ref Genome species, and other phylogenetically diverse species to help evolutionary reconstruction
    1. For most Ref Genome species, take gp2protein file
    2. For a few Ref Genome species, make a gp2protein file
      1. Human: Genes from Ensembl, use Ensembl mapping to UniProt for protein (SwissProt over Trembl, Ensembl protein if no mapping)
      2. Chicken: Genes from Entrez Gene, use RefSeq mapping to UniProt for protein (SwissProt over Trembl, if no mapping then use reviewed RefSeq over provisional)
      3. Zebrafish: Genes from Ensembl, use ZFIN mapping to ZFIN gene IDs wherever possible, use Ensembl mapping to UniProt for protein as in 1.2.1.
    3. For other species available at Ensembl, construct gp2protein file as in 1.2.1.
      1. Chimpanzee, Macaque, Cow, Dog, Opossum, Platypus, Ciona, Frog, Fugu, Mosquito
    4. For other species available at Entrez Gene, construct gp2protein file as in 1.2.2.
      1. Eukaryotes: Sea urchin, Ashbya, Aspergillus, Neurospora, Rice, Tetrahymena, Leishmania, Chlamydomonas, Entamoeba, Plasmodium
      2. Archaea: Sulfolobus, Methanosarcina
      3. Eubacteria: Streptomyces, Aquifex, Bacteroides, Chlamydia, Chloroflexus, Gloeobacter, Deinococcus, B. subtilis, Leptospira, Thermotoga, Pseudomonas, Bradyrhizobium, Geobacter
  2. Assign proteins to existing PANTHER families
    1. Get sequences for each protein above
    2. Score sequences against PANTHER families (search vs. PANTHER HMMs)
      1. Approximately 75% of the proteins hit an existing PANTHER family, though this number differs for different organisms (~85% for vertebrates to ~40% for E. coli)
  • To be completed in July:
  1. For each family, build tree for all sequences that hit in 2.2.
    1. Build multiple sequence alignment using MAFFT
    2. Build phylogenetic tree
  2. Load trees into database that is accessible to GO curators
    1. Trees can be searched, accessed and viewed using Tree Curation Tool (Java application)
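
The protein selection rule in step 1.2.1 (SwissProt over Trembl, Ensembl protein if no mapping) can be sketched roughly as below. This is only an illustration, not Paul's actual pipeline: the `Candidate` type, the source labels, the `UniProtKB`/`ENSEMBL` prefixes, the two-column tab-separated output line, and all IDs are assumptions made up for the example.

```python
from typing import NamedTuple, Optional, Sequence


class Candidate(NamedTuple):
    """One possible protein for a gene (hypothetical structure)."""
    accession: str
    source: str  # assumed labels: "swissprot", "trembl", "ensembl_protein"


# Preference order taken from step 1.2.1: SwissProt first, then Trembl,
# then the Ensembl protein when there is no UniProt mapping.
PRIORITY = ("swissprot", "trembl", "ensembl_protein")


def pick_protein(candidates: Sequence[Candidate]) -> Optional[Candidate]:
    """Return the most preferred candidate, or None if none is usable."""
    rank = {source: i for i, source in enumerate(PRIORITY)}
    usable = [c for c in candidates if c.source in rank]
    if not usable:
        return None
    # min() keeps the first candidate among equally ranked ones.
    return min(usable, key=lambda c: rank[c.source])


def gp2protein_line(gene_id: str, protein: Candidate) -> str:
    """Format one gene-to-protein line (assumed layout: gene, tab, protein)."""
    uniprot_sources = ("swissprot", "trembl")
    prefix = "UniProtKB" if protein.source in uniprot_sources else "ENSEMBL"
    return f"{gene_id}\t{prefix}:{protein.accession}"


if __name__ == "__main__":
    # Placeholder IDs only; these are not real genes or accessions.
    found = [Candidate("TR_ACC_1", "trembl"), Candidate("SP_ACC_1", "swissprot")]
    best = pick_protein(found)
    if best is not None:
        print(gp2protein_line("ENSEMBL:GENE_A", best))
```

The chicken rule in 1.2.2 would follow the same pattern with a longer priority list (reviewed RefSeq before provisional as the fallback).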

Comments about new targets

http://spreadsheets.google.com/ccc?key=pZhlLFuj8ewDe799QTmxzCA&hl=en

Annotation QC

http://sourceforge.net/tracker/?group_id=36855&atid=1040173

  • GRIN1 Donghui
  • EIF2B2 Stacia
  • ACVR1B Ranjana

Next conference call

Tuesday July 8, 2008, 1 PM CDT, 11 AM PDT, 7 PM BST
