RefGenome10Jun08 Phone Conference (Archived): Difference between revisions
Revision as of 10:49, 10 June 2008
Tuesday June 10, 10 AM CDT (8 AM PDT, 4 PM BST)
Present
ACTION ITEMS
Review old action items
- All: Annotation Quality control: Please pick an ortholog set from the Curation Targets table [1]
- All: Annotation Quality control: Have a look at the SF items and check whether the ortholog from your organism is correctly annotated ("comprehensive"). Let the lead curator for that set know when you're done.
- Seth: send the URL for the prototype of the ortholog tool this week
Action items from the meeting
- Paul will grab gp2protein files on May 1st and begin his run
- Develop annotation SOPs - this needs to be assigned
New action items
- All: fill in the old Google spreadsheets so that Mary can generate the ortholog sets for making the graphs.
Meetings
- Please sign up for the electronic refgenome curation jamboree http://doodle.ch/np9byb6avdi8zwc4
- Next GOC meeting will be in Montreal, Tues-Wed Oct 21-22. Should we add a day for Ref Genome annotation discussion? That would likely be Monday Oct 20.
Software
Graphs
Mary: I have updated the PPOD12 graphs (spreadsheet links should work as before): http://proto.informatics.jax.org/prototypes/GOgraphEX/PPOD12_Graphs/
I have also made graphs for the new targets (not much there yet): http://proto.informatics.jax.org/prototypes/GOgraphEX/PPOD_Graphs_June/ I can continue to update both sets as folks add annotations until software takes over.
Orthology Analysis
Paul:
Refgenome update 2008-06-10
- Completed to date:
  1. Assemble gp2protein files for all Ref Genome species, and for other phylogenetically diverse species to help evolutionary reconstruction
     a. For most Ref Genome species, take the existing gp2protein file
     b. For a few Ref Genome species, make a gp2protein file
        i. Human: Genes from Ensembl, use Ensembl mapping to UniProt for protein (SwissProt over TrEMBL, Ensembl protein if no mapping)
        ii. Chicken: Genes from Entrez Gene, use RefSeq mapping to UniProt for protein (SwissProt over TrEMBL, if no mapping then use reviewed RefSeq over provisional)
        iii. Zebrafish: Genes from Ensembl, use ZFIN mapping to ZFIN gene IDs wherever possible, use Ensembl mapping to UniProt for protein as in 1.b.i.
     c. For other species available at Ensembl, construct gp2protein file as in 1.b.i.
        - Chimpanzee, Macaque, Cow, Dog, Opossum, Platypus, Ciona, Frog, Fugu, Mosquito
     d. For other species available at Entrez Gene, construct gp2protein file as in 1.b.ii.
        - Eukaryotes: Sea urchin, Ashbya, Aspergillus, Neurospora, Rice, Tetrahymena, Leishmania, Chlamydomonas, Entamoeba, Plasmodium
        - Archaea: Sulfolobus, Methanosarcina
        - Eubacteria: Streptomyces, Aquifex, Bacteroides, Chlamydia, Chloroflexus, Gloeobacter, Deinococcus, B. subtilis, Leptospira, Thermotoga, Pseudomonas, Bradyrhizobium, Geobacter
  2. Assign proteins to existing PANTHER families
     a. Get sequences for each protein above
     b. Score sequences against PANTHER families (search vs. PANTHER HMMs)
        - Approximately 75% of the proteins hit an existing PANTHER family, though this number differs for different organisms (~85% for vertebrates to ~40% for E. coli)
- To be completed in July:
  3. For each family, build tree for all sequences that hit in 2.b.
     a. Build multiple sequence alignment using MAFFT
     b. Build phylogenetic tree
  4. Load trees into database that is accessible to GO curators
     a. Trees can be searched, accessed and viewed using Tree Curation Tool (Java application)
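The protein-selection rule used when making a gp2protein file (SwissProt over TrEMBL, then a fallback source if there is no UniProt mapping) could be sketched roughly as below. This is only an illustration and not Paul's actual pipeline: the function names, the ID prefixes, the example identifiers, and the alphabetical tie-break within a tier are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple

# One tier is (ID prefix, candidate protein IDs); tiers are listed best-first,
# e.g. SwissProt, then TrEMBL, then Ensembl protein (the human case).
Tier = Tuple[str, List[str]]


def pick_protein(tiers: List[Tier]) -> Optional[str]:
    """Return a protein ID from the first non-empty tier, or None."""
    for prefix, ids in tiers:
        if ids:
            # Assumption: break ties inside a tier alphabetically.
            return f"{prefix}:{sorted(ids)[0]}"
    return None


def build_gp2protein(genes: Dict[str, List[Tier]]) -> List[str]:
    """Build tab-separated 'gene<TAB>protein' lines, skipping unmapped genes."""
    lines = []
    for gene_id, tiers in genes.items():
        protein = pick_protein(tiers)
        if protein is not None:
            lines.append(f"{gene_id}\t{protein}")
    return lines


# Hypothetical identifiers, for illustration only.
example_genes = {
    "ENSEMBL:GENE_A": [("UniProtKB", ["P00002", "P00001"]), ("UniProtKB", ["Q00001"]), ("ENSEMBL", ["PROT_A"])],
    "ENSEMBL:GENE_B": [("UniProtKB", []), ("UniProtKB", ["Q00002"]), ("ENSEMBL", ["PROT_B"])],
    "ENSEMBL:GENE_C": [("UniProtKB", []), ("UniProtKB", []), ("ENSEMBL", ["PROT_C"])],
    "ENSEMBL:GENE_D": [("UniProtKB", []), ("UniProtKB", []), ("ENSEMBL", [])],
}

gp2protein_lines = build_gp2protein(example_genes)
```

For the chicken-style rule in 1.b.ii, the same function would take four tiers (SwissProt, TrEMBL, reviewed RefSeq, provisional RefSeq), again under the assumptions above.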
Comments about new targets
http://spreadsheets.google.com/ccc?key=pZhlLFuj8ewDe799QTmxzCA&hl=en
Annotation QC
http://sourceforge.net/tracker/?group_id=36855&atid=1040173
- GRIN1 Donghui
- EIF2B2 Stacia
- ACVR1B Ranjana
Next conference call
Tuesday July 8, 2008, 1 PM CDT, 11 AM PDT, 7 PM BST
Return to Reference_Genome_Annotation_Project