RefGenome10Jun08 Phone Conference (Archived)

Latest revision as of 11:28, 16 January 2018

Tuesday June 10, 10 AM CDT (8 AM PDT, 4 PM BST)

Present

  • Emily GOA
  • Rachael GOA
  • Pascale dictyBase
  • Kara
  • Donghui TAIR
  • Tanya TAIR
  • Mary MGI
  • Judy MGI
  • David MGI
  • Susan FlyBase
  • Suzi
  • Chris
  • Val S. pombe
  • Fiona AgBase
  • Kimberly WormBase
  • Stacia SGD
  • Ruth UCL

ACTION ITEMS

Review old/ongoing action items

  1. All: Annotation Quality control: Please pick an ortholog set from the Curation Targets table: http://spreadsheets.google.com/ccc?key=pwOksMOra5uq4vIYjPgefPw (see also Annotation_QC for some general documentation and previous issues).
  2. All: Annotation Quality control: Have a look at the SF (SourceForge) items and check whether the ortholog from your organism is correctly annotated ("comprehensive"). Let the lead curator for that set know when you're done.
  3. Seth: send the URL for the prototype of the ortholog tool sometime this week (will do!)

Action items from the meeting

  1. [DONE] Paul will grab gp2protein files on May 1st and begin his run; see details in software/orthology section
  2. Develop annotation SOPs - this needs to be assigned

There are some wiki pages about that on the ref genome main page's annotation section: Reference_Genome_Annotation_Project#Gene_Annotation

New action items

  • [Action item]: All: fill in the old Google spreadsheets so that Mary can generate the ortholog sets for making the graphs.
  • [Action item]: Discuss use of binding/regulation terms (GOC meeting)
  • [Action item]: MGI: verify GRIN1 annotation binding/complex

Meetings

  • Please sign up for the electronic refgenome curation jamboree http://doodle.ch/np9byb6avdi8zwc4
  • Next GOC meeting will be in Montreal, Tues-Wed Oct 21-22. Should we add a day for Ref Genome annotation discussion? That would likely be Monday Oct 20.

Software

Graphs

Mary: I have updated the PPOD12 graphs (spreadsheet links should work as before): http://proto.informatics.jax.org/prototypes/GOgraphEX/PPOD12_Graphs/

I have also made graphs for the new targets (not much there yet): http://proto.informatics.jax.org/prototypes/GOgraphEX/PPOD_Graphs_June/

I can continue to update both sets as folks add annotations until software takes over.

Orthology Analysis

Paul:

Refgenome update 2008-06-10

  • Completed to date:
  1. Assemble gp2protein files for all Ref Genome species, and other phylogenetically diverse species to help evolutionary reconstruction
    1. For most Ref Genome species, take gp2protein file
    2. For a few Ref Genome species, make a gp2protein file
      1. Human: Genes from Ensembl, use Ensembl mapping to UniProt for protein (SwissProt over Trembl, Ensembl protein if no mapping)
      2. Chicken: Genes from Entrez Gene, use RefSeq mapping to UniProt for protein (SwissProt over Trembl, if no mapping then use reviewed RefSeq over provisional)
      3. Zebrafish: Genes from Ensembl, use ZFIN mapping to ZFIN gene IDs wherever possible, use Ensembl mapping to UniProt for protein as in 1.2.1.
    3. For other species available at Ensembl, construct gp2protein file as in 1.2.1.
      1. Chimpanzee, Macaque, Cow, Dog, Opossum, Platypus, Ciona, Frog, Fugu, Mosquito
    4. For other species available at Entrez Gene, construct gp2protein file as in 1.2.2.
      1. Eukaryotes: Sea urchin, Ashbya, Aspergillus, Neurospora, Rice, Tetrahymena, Leishmania, Chlamydomonas, Entamoeba, Plasmodium
      2. Archaea: Sulfolobus, Methanosarcina
      3. Eubacteria: Streptomyces, Aquifex, Bacteroides, Chlamydia, Chloroflexus, Gloeobacter, Deinococcus, B. subtilis, Leptospira, Thermotoga, Pseudomonas, Bradyrhizobium, Geobacter
  2. Assign proteins to existing PANTHER families
    1. Get sequences for each protein above
    2. Score sequences against PANTHER families (search vs. PANTHER HMMs)
      1. Approximately 75% of the proteins hit an existing PANTHER family, though this number differs for different organisms (~85% for vertebrates to ~40% for E. coli)
  • To be completed in July:
  1. For each family, build tree for all sequences that hit in 2.2.
    1. Build multiple sequence alignment using MAFFT
    2. Build phylogenetic tree
  2. Load trees into database that is accessible to GO curators
    1. Trees can be searched, accessed and viewed using Tree Curation Tool (Java application)
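
The protein-selection rule in step 1.2.1 (SwissProt over Trembl, Ensembl protein if no mapping) could be sketched roughly as below. This is only an illustration of the stated preference order, not Paul's actual pipeline code: the `UniProtMapping` type, the `choose_protein` function and all identifiers in the usage example are hypothetical.

```python
from typing import NamedTuple, Optional, Sequence


class UniProtMapping(NamedTuple):
    """One gene-to-UniProt mapping (hypothetical structure for illustration)."""
    accession: str
    reviewed: bool  # True for a SwissProt entry, False for a Trembl entry


def choose_protein(
    mappings: Sequence[UniProtMapping],
    ensembl_protein: Optional[str],
) -> Optional[str]:
    """Pick one protein ID for a gene, following the order in step 1.2.1.

    1. a SwissProt (reviewed) accession if one is mapped,
    2. otherwise a Trembl (unreviewed) accession,
    3. otherwise the Ensembl protein ID, which may itself be None.
    """
    # First pass: prefer any reviewed (SwissProt) entry.
    for mapping in mappings:
        if mapping.reviewed:
            return mapping.accession
    # Second pass: only unreviewed (Trembl) entries remain, take the first.
    for mapping in mappings:
        return mapping.accession
    # No UniProt mapping at all: fall back to the Ensembl protein.
    return ensembl_protein


# Usage with made-up identifiers:
both = [UniProtMapping("TR_0001", False), UniProtMapping("SP_0001", True)]
print(choose_protein(both, "ENSP_FAKE_1"))  # SwissProt entry wins
print(choose_protein([], "ENSP_FAKE_1"))    # falls back to Ensembl protein
```

The chicken rule in 1.2.2 has the same shape, with reviewed RefSeq preferred over provisional RefSeq as the fallback tier.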

Comments about new targets

http://spreadsheets.google.com/ccc?key=pZhlLFuj8ewDe799QTmxzCA&hl=en

Annotation QC

http://sourceforge.net/tracker/?group_id=36855&atid=1040173

  • GRIN1 Donghui
  • Issues with 'X protein binding' when co-annotated with 'X complex component': people generally feel that these two should not co-occur, but we also acknowledge that this sometimes depends on the experiment that was done. In the case of mouse GRIN1, since the two annotations come from the same paper, it seems that 'glutamate receptor binding' should be removed (to be confirmed by David).
  • Regulation: the mouse gene is annotated to 'positive regulation of transcription from RNA polymerase II promoter'; this is a downstream effect, and the question is how far downstream one can annotate. In this case the paper shows that activating or blocking the receptor causes decreased/increased rates of transcription of a specific gene.
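
As a rough aid for the binding/complex issue above, a check along the following lines could list candidate cases for a curator to review. It is a minimal sketch under stated assumptions: the `Annotation` record, the function name and the suffix-matching heuristic are hypothetical, and the term names and reference IDs in the usage example are illustrative rather than GRIN1's actual annotations. It only flags pairs that come from the same reference, since the notes acknowledge that co-occurrence can be legitimate depending on the experiment.

```python
from collections import defaultdict
from typing import Iterable, List, NamedTuple, Tuple


class Annotation(NamedTuple):
    """Simplified annotation record (hypothetical, for illustration only)."""
    gene: str
    term: str       # GO term name
    reference: str  # e.g. a paper identifier


def find_binding_complex_pairs(
    annotations: Iterable[Annotation],
) -> List[Tuple[str, str, str]]:
    """Return (gene, reference, X) where one paper supports both
    'X binding' and 'X complex' annotations for the same gene."""
    # (gene, reference) -> {"binding": {stems}, "complex": {stems}}
    stems = defaultdict(lambda: {"binding": set(), "complex": set()})
    for ann in annotations:
        for suffix in ("binding", "complex"):
            if ann.term.endswith(" " + suffix):
                # Strip the trailing " binding" / " complex" to get X.
                stem = ann.term[: -len(suffix) - 1].lower()
                stems[(ann.gene, ann.reference)][suffix].add(stem)

    flagged = []
    for (gene, reference), kinds in sorted(stems.items()):
        for stem in sorted(kinds["binding"] & kinds["complex"]):
            flagged.append((gene, reference, stem))
    return flagged


# Usage with illustrative data (reference IDs are made up):
example = [
    Annotation("GRIN1", "glutamate receptor binding", "REF:0001"),
    Annotation("GRIN1", "glutamate receptor complex", "REF:0001"),
    Annotation("GRIN1", "calcium ion binding", "REF:0002"),
]
for hit in find_binding_complex_pairs(example):
    print(hit)
```

Anything flagged this way would still need a curator's judgment; the sketch only narrows down where to look.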

Next conference call

Tuesday July 8, 2008, 1 PM CDT, 11 AM PDT, 7 PM BST

Return to Reference_Genome_Annotation_Project