Phylogenetic Annotation Project: Difference between revisions

(109 intermediate revisions by the same user not shown)
Line 1: Line 1:
[[Category:Reference Genome]] [[Category:Working Groups]]
''Note that this project was formerly called the Reference Genome Annotation Project.''
=Reference Genome Annotation Project: Purpose=
* Comprehensive information about the group's purpose and objectives can be found at [[Reference Genome Annotation Project Summary]]
====[[Reference Genome Progress Reports]]====


====[[Progress_Reports#Reference_Genomes | Project timeline]]====
=Overview=
====[[Reference_Genome Contact Persons from each database]]====
The Phylogenetic Annotation Project infers annotations across evolutionarily related proteins, based on the known functions of proteins within PANTHER [http://pantherdb.org/] phylogenetic family trees.
= PAINT (Phylogenetic Annotation and INference Tool) =
PAINT is a Java software application for supporting inference of ancestral as well as present-day characters (represented by ontology terms) in the context of a phylogenetic tree.  PAINT is currently being used in the GO [[Phylogenetic Annotation Project]] to support inference of GO function terms (molecular function, cellular component and biological process) by homology.


====[http://geneontology.org/GO.refgenome.shtml?all Reference genome web page]====
==Resources for PAINT annotation==


----
===[[PAINT annotation guidelines|PAINT Annotation principles]]===
=Communications=
The [[PAINT SOP|PAINT Annotation principles]] page describes the PAINT annotation guidelines.
====[[Reference Genome Mailing list]]====


==== [[Conference Calls]]====
===[[PAINT_User_Guide|PAINT User Guide]]===
The [[PAINT_User_Guide|PAINT User Guide]] provides annotation guidelines for PAINT annotation as well as step-by-step instructions on how to use the PAINT tool.


====[[Reference Genomes Meetings | Meetings]]====
===[http://pantree.org/tree/allTrees.jsp PAINT trees curation status]===
The [http://pantree.org/tree/allTrees.jsp PAINT trees curation status] page provides a list of all Panther trees and the history of their curation status.


====  [[Electronic_jamborees| Electronic jamborees ]]====
=[[PAINT_database_update_pipeline|PAINT Update pipeline]]=


===[[Reference Genome Gene Index | Gene Annotation wiki pages]]===
=[[PAINT_GAF_production|PAINT GAF production]]=
* The purpose of these pages is to allow discussion of annotation and orthology issues related to particular genes.  Individual gene pages are created as needed.
==[[PAINT GAF QC-examples|PAINT GAF QC-examples]]==


----
= [[PAINT Conference Calls]]=
* Monthly, on the first Tuesday of the month, 9 AM Pacific/6 PM Europe (as of March 2019)
* The Zoom link is in the Google Calendar


=PAINT : Software for tree annotation=
* [[PAINT_Conference_Calls]] Agendas and Minutes
The reference project group has developed software, [[PAINT]], to perform annotations based on phylogenetic trees.


----
=PAINT Curation guidelines=
[[PAINT_SOP]]


----
=Reporting bugs or likely errors in the trees=


= Data availability =
==Tree issues==
* PAINT-generated GAF files are in the CVS repository:
If a Panther tree needs to be reviewed, please create a ticket in the Panther GitHub tracker: https://github.com/pantherdb/Helpdesk/issues
http://cvsweb.geneontology.org/cgi-bin/cvsweb.cgi/go/gene-associations/submission/paint/#dirlist
* By Species: http://cvsweb.geneontology.org/cgi-bin/cvsweb.cgi/go/gene-associations/submission/#dirlist
* Viewing reference genome annotations in [http://amigo.geneontology.org/cgi-bin/amigo/amigo?mode=homolset_summary AmiGO ]
* Reference GO:0000033: [http://geneontology.org/doc/GO.references]


----
==PAINT issues==
Issues with the PAINT tools should be reported in this tracker: https://github.com/pantherdb/db-PAINT/issues


=Annotation Priorities=
==Pantree issues==
Issues with the Pantree.org site should be reported at: https://github.com/pantherdb/PanTree


* '''[[RefG annotation priorities]]''' as of September 2009 (following the GOC meeting held in Cambridge), including the procedure to propose new targets.
=Pages to review=
* http://wiki.geneontology.org/index.php/PAINT_annotation_working_group
* [[reference proteomes files]]: to be moved elsewhere
* Metrics: Discussion on annotation progress measurements
**From 2017 Grant, suggestions for metrics:
*** fraction of human proteins in annotated families (PAINT progress)
*** impact: number of annotations added, for human and for other species
** From a previous grant, see [[Image:HowToCaptureMetrics3.doc|thumb|Description]]
** Other ideas (to be reviewed): [[Metrics:_breath_and_depth_of_annotations |Breadth and Depth]]
**** http://wiki.geneontology.org/index.php/GO_Reference_Genome_Meeting_Metric_Plan


== Annotation targets ==
=Archived & retired Pages=
===PAINT Annotation target 2014===
*DNA repair family list: http://goo.gl/BaQxMC
*PAINT family list based on number of genes, duplication nodes and GO terms: https://docs.google.com/spreadsheets/d/1uHgcaXO7t9__9GuXgBibT0H-YlnU252IsRGWIyhCWN8/edit?usp=sharing. A family is usually easier to annotate in PAINT if it has fewer duplication nodes, but please select families that have more GO terms and more genes (>100).
Please contact the PAINT team for access to the document.


=== Nov 2013 [[PAINT - Apoptosis]] ===
These pages are kept for reference, but the information in them is not the most current.
 
* [[Reference Genome Mailing list]] - disabled
=== May- Sept 2011: [[RefG_Heart_Development_co-curation#Heart_Development_Transcription_Annotation_Targets]]===
* [[Electronic_jamborees| Electronic jamborees ]]
 
* [[Annotation_pipeline]] By Judy, Suzi, Michael
=== February-April 2011: [[Apoptosis Reference Genome Targets]]===
* [[Ideas for publicizing Ref.Genome Annotation Data]]
* [[PAINT-GONUTS integration]]
* [[Reference Genome Annotation Project Summary]]
* [[Progress_Reports#Reference_Genomes | Project timeline]]
* [[Reference_Genome Contact Persons from each database]]
* [[Reference Genome Progress Reports]]
* [[Procedure for selection of target genes]]
* [[Procedure for filling Genome-Specific spreadsheets]]
* [[Tools_for_identifying_orthologs|Tools for orthology determination]]: A summary of tools available to identify orthologs.
* [[Orthology discussion page|SOP for determining ortholog (by database)]]: The purpose of this page was to discuss the method by which each group establishes orthology between reference genome genes and human disease genes. We now collaborate with PANTHER to provide that. (Issues are different)
* [[Ref_Gen_pub_draft | Reference Genome Web Page Draft]]: We now have a real web page!
* [[List of potentially problematic families for all vs. all BLAST methods of orthology determination]]
* [[Running P-POD orthology tool on the reference genomes gene set]] by Kara Dolinski at Princeton - Nov 2007.
* [[Reference_Genome_sequence_annotation]]: GFF3 sequence files for reference genome MODs
* [[Reference Genome Database Requirements Discussion]]
* [[Source_Forge_items_for_reference_genomes_(Retired)]]
* [[Reference Genome Publication Counts]]
* [[Review_of_trees-based_annotations_(Retired)]]
* [[GAF file 2.0]] survey of contributing groups
* [[RG:_Software|Reference Genome Software]]: Plan for a tracking system, supplanted by the db version of PAINT (2017)
* [[Ref_genome_Annotation_progress_ideas_(Retired)]]


=== June-Sept 2010: [[Wnt_signaling_Pathway]]===


==Past Annotation targets==


====[[Lung_branching_morphogenesis_genes]] Annotation Progress ====
* [[RefG annotation priorities]] of September 2009
 
*[[Lung_branching_morphogenesis_genes]] December 2009
====[[Panther gene lists]]====
* [http://proto.informatics.jax.org/prototypes/GOgraphEX/PPOD12_Graphs/ All PPOD clusters with at least one object from each of the twelve refG organisms]  
 
*[http://spreadsheets.google.com/ccc?key=pZhlLFuj8ewDe799QTmxzCA&hl=en Target Gene List]: May 2008-Jan 2010
==== [http://proto.informatics.jax.org/prototypes/GOgraphEX/PPOD12_Graphs/ All PPOD clusters with at least one object from each of the twelve refG organisms] ====
*[[Tree annotation progress]] 2010-2011
 
* [[RefG_Heart_Development_co-curation#Heart_Development_Transcription_Annotation_Targets]]: May- Sept 2011
From May 2008
* [[Wnt_signaling_Pathway]] June-Sept 2010
====[http://spreadsheets.google.com/ccc?key=pZhlLFuj8ewDe799QTmxzCA&hl=en Target Gene List] (May 2008-Jan 2010) ====
* [[Apoptosis Reference Genome Targets]] February-April 2011
 
* [[PAINT_-_Apoptosis_(Archived)]]
==== [http://www.geneontology.org/images/RefGenomeGraphs/ Selected refG target sets] ====
* [[PAINT - Apoptosis]] Nov 2013
* PPOD clusters selected since April 2008
* DNA repair family list: http://goo.gl/BaQxMC 2014
* Manually curated target sets selected before April 2008
* http://dcn.spreadsheets.google.com/ccc?id=o16926456948884040128.4584390909151853752.07000735126025259412.442372083524637957
 
Target Gene List August 2006-April 2008
====[http://dcn.spreadsheets.google.com/ccc?id=o16926456948884040128.4584390909151853752.07000735126025259412.442372083524637957 Target Gene List August 2006-April 2008]====
* [[Reference_Genome_Genes_(Retired)]]
 
* [[PAINT_trees_to_review (Retired)]]
* Access requires your email to be added to the system.  Email Pascale if you would like to be added.
== Review Status ==
* This spreadsheet contains links to separate spreadsheets maintained by each of the reference genome groups.
 
 
----
 
=[[Scripts to ensure PAINT data integrity]]=
 
=Work in progress=
 
==[[Tree annotation progress]]==
 
==PAINT annotation SOPs==
 
====[[GAFs for trees-based annotations]]====
 
====[[PAINT_SOP |Standard Operating Procedure for Tree-based propagation of annotations]]====
* [[PAINT-GONUTS integration]]
 
==[[Reference Genomes Metrics | Metrics: Discussion on annotation progress measurements]]==
 
==Branding Ref.Genome Project==
[[ Ideas for publicizing Ref.Genome Annotation Data ]]
 
----
 
=Orthology determination=
 
====[[List of potentially problematic families for all vs. all BLAST methods of orthology determination]] ====
 
==Data used to make orthology calls==
 
====[[reference proteomes files]]====
At the July 2009 Quest for Orthologs meeting, participants agreed to decide upon a standard set of genomes and to compile "complete" sets of protein-coding genes for each genome, with a representative protein sequence for each gene.
 
====New [[gene2geneproduct file]]====
At the April 2009 Reference Genome meeting, it was decided to create a new file, called 'gene2geneproduct', to replace the GP2protein file. Specifications can be found on this page (will be added soon).
 
====[[GAF file 2.0]]====
The GAF file should contain 17 columns, and the meanings of columns 2, 12 and 17 have been modified. See that page for specifications.
 
====Data used for [[Running P-POD orthology tool on the reference genomes gene set]] ====
by Kara Dolinski at Princeton - Nov 2007
* This page contains a description of the project and the requirements for providing files for the P-POD analysis.
 
====GFF3 sequence files for reference genome MODs====
[[Reference_Genome_sequence_annotation]]
----
 
=Software/database development=
 
*'''[[Reference Genome Database Requirements Discussion]]'''
*'''[[RG:_Software|Reference Genome Software]]'''
*'''[[RG_Software_group|Software group]]'''
*'''[[PAINT_SOP|PAINT]]'''
 
 
The purpose of this page is to discuss features and requirements that would be desirable in a database used to replace the existing Google Spreadsheet system for managing target genes, their annotations and metrics.
 
 
 
 
=Retired Pages=
These pages are kept for reference, but the information in them is not the most current.
 
 
====[[Procedure for selection of target genes]]====
 
====[[Procedure for filling Genome-Specific spreadsheets]]====
 
===[[Annotation_pipeline]]===
By Judy, Suzi, Michael
 
====[[Tools_for_identifying_orthologs|Tools for orthology determination]]====
A summary of tools available to identify orthologs.
 
==== [[Orthology discussion page|SOP for determining ortholog (by database)]]====
 
* The purpose of this page was to discuss the method by which each group establishes orthology between reference genome genes and human disease genes.
We now collaborate with PANTHER and P-POD to provide that. (Issues are different)


==== [[Ref_Gen_pub_draft | Reference Genome Web Page Draft]]====
Last reviewed: 2021-07-01
* We now have a real web page!
[[Category:PAINT]]
----

Latest revision as of 01:50, 1 July 2021

Note that this project was formerly called the Reference Genome Annotation Project.

Overview

The Phylogenetic Annotation Project infers annotations across evolutionarily related proteins, based on the known functions of proteins within PANTHER [1] phylogenetic family trees.

PAINT (Phylogenetic Annotation and INference Tool)

PAINT is a Java software application for supporting inference of ancestral as well as present-day characters (represented by ontology terms) in the context of a phylogenetic tree. PAINT is currently being used in the GO Phylogenetic Annotation Project to support inference of GO function terms (molecular function, cellular component and biological process) by homology.

Resources for PAINT annotation

PAINT Annotation principles

The PAINT Annotation principles page describes the PAINT annotation guidelines.

PAINT User Guide

The PAINT User Guide provides annotation guidelines for PAINT annotation as well as step-by-step instructions on how to use the PAINT tool.

PAINT trees curation status

The PAINT trees curation status page provides a list of all Panther trees and the history of their curation status.

PAINT Update pipeline

PAINT GAF production

PAINT GAF QC-examples

PAINT Conference Calls

  • Monthly, on the first Tuesday of the month, 9 AM Pacific/6 PM Europe (as of March 2019)
  • The Zoom link is in the Google Calendar


Reporting bugs or likely errors in the trees

Tree issues

If a Panther tree needs to be reviewed, please create a ticket in the Panther GitHub tracker: https://github.com/pantherdb/Helpdesk/issues

PAINT issues

Issues with the PAINT tools should be reported in this tracker: https://github.com/pantherdb/db-PAINT/issues

Pantree issues

Issues with the Pantree.org site should be reported at: https://github.com/pantherdb/PanTree

Pages to review

Archived & retired Pages

These pages are kept for reference, but the information in them is not the most current.


Past Annotation targets

Target Gene List August 2006-April 2008

Review Status

Last reviewed: 2021-07-01