RefGenome10Jun08 Phone Conference (Archived)
Tuesday June 10, 10 AM CDT (8 AM PDT, 4 PM BST)
- Emily GOA
- Rachael GOA
- Pascale dictyBase
- Donghui TAIR
- Mary MGI
- Judy MGI
- Susan FlyBase
Review old action items
- All: Annotation Quality control: Please pick an ortholog set from the Curation Targets table 
- All: Annotation Quality control: Have a look at the SF items and see if the ortholog from your organism is correctly annotated ("comprehensive"). Let the lead curator for that set know when you're done.
- Seth: send the URL for the prototype of the ortholog tool sometime this week
Action items from the meeting
- Paul will grab gp2protein files on May 1st and begin his run
- Develop annotation SOPs - this needs to be assigned
New action items
- All: fill in the old Google spreadsheets so that Mary can generate the ortholog sets for making the graphs.
- Please sign up for the electronic refgenome curation jamboree http://doodle.ch/np9byb6avdi8zwc4
- Next GOC meeting will be in Montreal, Tues-Wed Oct 21-22. Should we add a day for Ref Genome annotation discussion? That would likely be Monday Oct 20.
Mary: I have updated the PPOD12 graphs (spreadsheet links should work as before): http://proto.informatics.jax.org/prototypes/GOgraphEX/PPOD12_Graphs/
I have also made graphs for the new targets (not much there yet): http://proto.informatics.jax.org/prototypes/GOgraphEX/PPOD_Graphs_June/
I can continue to update both sets as folks add annotations until software takes over.
Refgenome update 2008-06-10
- Completed to date:
- Assemble gp2protein files for all Ref Genome species, and other phylogenetically diverse species to help evolutionary reconstruction
- For most Ref Genome species, take gp2protein file
- For a few Ref Genome species, make a gp2protein file
- Human: Genes from Ensembl, use Ensembl mapping to UniProt for protein (SwissProt over TrEMBL, Ensembl protein if no mapping)
- Chicken: Genes from Entrez Gene, use RefSeq mapping to UniProt for protein (SwissProt over TrEMBL; if no mapping, use reviewed RefSeq over provisional)
- Zebrafish: Genes from Ensembl, use ZFIN mapping to ZFIN gene IDs wherever possible, use Ensembl mapping to UniProt for protein as for Human.
- For other species available at Ensembl, construct gp2protein file as for Human.
- Chimpanzee, Macaque, Cow, Dog, Opossum, Platypus, Ciona, Frog, Fugu, Mosquito
- For other species available at Entrez Gene, construct gp2protein file as for Chicken.
- Eukaryotes: Sea urchin, Ashbya, Aspergillus, Neurospora, Rice, Tetrahymena, Leishmania, Chlamydomonas, Entamoeba, Plasmodium
- Archaea: Sulfolobus, Methanosarcina
- Eubacteria: Streptomyces, Aquifex, Bacteroides, Chlamydia, Chloroflexus, Gloeobacter, Deinococcus, B. subtilis, Leptospira, Thermotoga, Pseudomonas, Bradyrhizobium, Geobacter
- Assign proteins to existing PANTHER families
- Get sequences for each protein above
- Score sequences against PANTHER families (search vs. PANTHER HMMs)
- Approximately 75% of the proteins hit an existing PANTHER family, though this number differs for different organisms (~85% for vertebrates to ~40% for E. coli)
- To be completed in July:
- For each family, build a tree for all sequences that hit that family in the PANTHER HMM scoring step.
- Build multiple sequence alignment using MAFFT
- Build phylogenetic tree
- Load trees into database that is accessible to GO curators
- Trees can be searched, accessed and viewed using Tree Curation Tool (Java application)
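The protein-selection rule given above for the Human gp2protein file (SwissProt over TrEMBL, Ensembl protein if no mapping) can be sketched roughly as follows. This is only an illustration, not the script used for the run: the `GeneMappings` record, the function names, the ID prefixes, the tab-separated output line and the example IDs are all assumptions made for the sketch.

```python
from typing import NamedTuple, Optional, Sequence


class GeneMappings(NamedTuple):
    """Hypothetical record: protein IDs mapped to one Ensembl gene."""
    gene_id: str
    swissprot: Sequence[str] = ()
    trembl: Sequence[str] = ()
    ensembl_protein: Sequence[str] = ()


def pick_protein(gene: GeneMappings) -> Optional[str]:
    """Pick one protein ID: SwissProt, else TrEMBL, else Ensembl protein."""
    tiers = (
        ("UniProtKB", gene.swissprot),       # preferred: SwissProt entry
        ("UniProtKB", gene.trembl),          # next: TrEMBL entry
        ("ENSEMBL", gene.ensembl_protein),   # fallback: no UniProt mapping
    )
    for prefix, ids in tiers:
        if ids:
            # Taking the first ID is an arbitrary tie-break chosen for this
            # sketch; the notes do not say how multiple mappings are resolved.
            return f"{prefix}:{ids[0]}"
    return None


def gp2protein_line(gene: GeneMappings) -> Optional[str]:
    """Format one gene/protein pair as a tab-separated line (assumed layout)."""
    protein = pick_protein(gene)
    if protein is None:
        return None
    return f"ENSEMBL:{gene.gene_id}\t{protein}"


# Made-up IDs, for illustration only.
example = GeneMappings("GENE_A", swissprot=["SP_1"], trembl=["TR_1"])
print(gp2protein_line(example))
```

The Chicken rule has the same shape, with reviewed and provisional RefSeq entries as the last two tiers instead of the Ensembl protein.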
Comments about new targets
- GRIN1 Donghui
- EIF2B2 Stacia
- ACVR1B Ranjana
Next conference call
Tuesday July 8, 2008, 1 PM CDT, 11 AM PDT, 7 PM BST