Phylogenetic Annotation Project: Difference between revisions

From GO Wiki
(491 intermediate revisions by 24 users not shown)
The GO consortium has established the complete annotation of 12 reference genomes as a priority goal. These reference genomes are:
''Note that this project was formerly called the Reference Genome Annotation Project.''


*<i>Arabidopsis thaliana</i>
=Overview=
*<i>Caenorhabditis elegans</i>
The Phylogenetic Annotation Project infers annotations across evolutionarily related proteins, based on the known functions of proteins within PANTHER [http://pantherdb.org/] phylogenetic family trees.
*<i>Danio rerio</i> (zebrafish)
*<i>Dictyostelium discoideum</i>
= PAINT (Phylogenetic Annotation and INference Tool) =
*<i>Drosophila melanogaster</i>
PAINT is a Java software application for supporting inference of ancestral as well as present-day characters (represented by ontology terms) in the context of a phylogenetic tree.  PAINT is currently being used in the GO [[Phylogenetic Annotation Project]] to support inference of GO function terms (molecular function, cellular component and biological process) by homology.
*<i>Escherichia coli</i>
*<i>Homo sapiens</i>
*<i>Saccharomyces cerevisiae</i>
*<i>Mus musculus</i>
*<i>Schizosaccharomyces pombe</i>
*<i>Gallus gallus</i>
*<i>Rattus norvegicus</i>


The Reference Genome GO Annotation Team, with representatives from each genome annotation group, will coordinate annotation, facilitate implementation of GO Consortium annotation priorities, and provide metrics to assess progress toward the goal of broad and deep annotation of the reference genomes. This group will be responsible for coordinating the annotation of the nine reference genomes. It represents the annotation expertise within the GO Consortium and provides key liaisons to the model organism databases that have primary responsibility for annotating the reference genomes.
==Resources for PAINT annotation==


[[Reference Genome Annotation Project Summary]]
===[[PAINT annotation guidelines|PAINT Annotation principles]]===
The [[PAINT SOP|PAINT Annotation principles]] page describes the PAINT annotation guidelines.


----
===[[PAINT_User_Guide|PAINT User Guide]]===
The [[PAINT_User_Guide|PAINT User Guide]] provides annotation guidelines for PAINT annotation as well as step-by-step instructions on how to use the PAINT tool.


The first [[Reference Genome Annotation Meeting]] will be September 26 to 27, 2007, in Princeton, NJ (tentative)
===[http://pantree.org/tree/allTrees.jsp PAINT trees curation status]===
----
The [http://pantree.org/tree/allTrees.jsp PAINT trees curation status] page lists all PANTHER trees and the history of their curation status.


==The Reference Genome Gene List and Summary==
=[[PAINT_database_update_pipeline|PAINT Update pipeline]]=


The spreadsheet is located at
=[[PAINT_GAF_production|PAINT GAF production]]=
==[[PAINT GAF QC-examples|PAINT GAF QC-examples]]==


http://dcn.spreadsheets.google.com/ccc?id=o16926456948884040128.4584390909151853752.07000735126025259412.442372083524637957
= [[PAINT Conference Calls]]=
* Monthly, on the first Tuesday of the month, 9 AM Pacific / 6 PM Europe (as of March 2019)
* The Zoom link is in the Google Calendar


Access requires your email to be added to the system.  Email Rex if you would like to be added.
* [[PAINT_Conference_Calls]] Agendas and Minutes


This spreadsheet contains links to separate spreadsheets maintained by each of the reference genome groups.


----
=Reporting bugs or likely errors in the trees=


==Reference Genome Wiki Pages==
==Tree issues==
If a PANTHER tree needs to be reviewed, please create a ticket in the PANTHER GitHub tracker: https://github.com/pantherdb/Helpdesk/issues


The following Wiki Pages are available to reference genome participants for discussions:
==PAINT issues==
Issues with the PAINT tools should be reported in this tracker: https://github.com/pantherdb/db-PAINT/issues


[[Reference Genome Database Requirements Discussion]]
==Pantree issues==
Issues with the Pantree.org site should be reported at: https://github.com/pantherdb/PanTree


The purpose of this page is to discuss features and requirements that would be desirable in a database used to replace the existing Google Spreadsheet system for managing target genes, their annotations and metrics.
=Pages to review=
* http://wiki.geneontology.org/index.php/PAINT_annotation_working_group
* [[reference proteomes files]]: to be moved elsewhere
* Metrics: Discussion on annotation progress measurements
**From 2017 Grant, suggestions for metrics:
*** fraction of human proteins in annotated families (PAINT progress)
*** impact: number of annotations added, for human and for other species
** From a previous grant, see [[Image:HowToCaptureMetrics3.doc|thumb|Description]]
** Other ideas (to be reviewed): [[Metrics:_breath_and_depth_of_annotations |Breadth and Depth]]
*** http://wiki.geneontology.org/index.php/GO_Reference_Genome_Meeting_Metric_Plan


[[Orthology discussion page]]
=Archived & retired Pages=


The purpose of this page is to discuss general principles and problems with establishing orthology between reference genome genes and human disease genes.
These pages are kept for reference, but the information in them may no longer be current.
* [[Reference Genome Mailing list]] - disabled
* [[Electronic_jamborees| Electronic jamborees ]]
* [[Annotation_pipeline]] By Judy, Suzi, Michael
* [[Ideas for publicizing Ref.Genome Annotation Data]]
* [[PAINT-GONUTS integration]]
* [[Reference Genome Annotation Project Summary]]
* [[Progress_Reports#Reference_Genomes | Project timeline]]
* [[Reference_Genome Contact Persons from each database]]
* [[Reference Genome Progress Reports]]
* [[Procedure for selection of target genes]]
* [[Procedure for filling Genome-Specific spreadsheets]]
* [[Tools_for_identifying_orthologs|Tools for orthology determination]]: A summary of tools available to identify orthologs.
* [[Orthology discussion page|SOP for determining ortholog (by database)]]: The purpose of this page was to discuss the method by which each group establishes orthology between reference genome genes and human disease genes. We now collaborate with PANTHER to provide that. (Issues are different)
* [[Ref_Gen_pub_draft | Reference Genome Web Page Draft]]: We now have a real web page!
* [[List of potentially problematic families for all vs. all BLAST methods of orthology determination]]
* [[Running P-POD orthology tool on the reference genomes gene set]] by Kara Dolinski at Princeton - Nov2007.
* [[Reference_Genome_sequence_annotation]]: GFF3 sequence files for reference genome MODs
* [[Reference Genome Database Requirements Discussion]]
* [[Source_Forge_items_for_reference_genomes_(Retired)]]
* [[Reference Genome Publication Counts]]
* [[Review_of_trees-based_annotations_(Retired)]]
* [[GAF file 2.0]] survey of contributing groups
* [[RG:_Software|Reference Genome Software]]: a plan for a tracking system, supplanted by the database version of PAINT (2017)
* [[Ref_genome_Annotation_progress_ideas_(Retired)]]


[[Reference Genome Gene Index|Index of wiki pages for Reference Genome Genes]]


The purpose of these pages is to allow discussion of annotation and orthology issues related to particular genes.  The individual gene pages are to be created as needed.
==Past Annotation targets==


[[Reference Genome Publication Counts]]
* [[RefG annotation priorities]] of September 2009
*[[Lung_branching_morphogenesis_genes]] December 2009
* [http://proto.informatics.jax.org/prototypes/GOgraphEX/PPOD12_Graphs/ All PPOD clusters with at least one object from each of the twelve refG organisms]
*[http://spreadsheets.google.com/ccc?key=pZhlLFuj8ewDe799QTmxzCA&hl=en Target Gene List]: May 2008-Jan 2010
*[[Tree annotation progress]] 2010-2011
* [[RefG_Heart_Development_co-curation#Heart_Development_Transcription_Annotation_Targets]]: May- Sept 2011
* [[Wnt_signaling_Pathway]] June-Sept 2010
* [[Apoptosis Reference Genome Targets]] February-April 2011
* [[PAINT_-_Apoptosis_(Archived)]]
* [[PAINT - Apoptosis]] Nov 2013
* DNA repair family list: http://goo.gl/BaQxMC 2014
* [http://dcn.spreadsheets.google.com/ccc?id=o16926456948884040128.4584390909151853752.07000735126025259412.442372083524637957 Target Gene List]: August 2006-April 2008
* [[Reference_Genome_Genes_(Retired)]]
* [[PAINT_trees_to_review (Retired)]]
== Review Status ==


A summary of how each group is completing columns L,M,N,O for publications metrics.
Last reviewed: 2021-07-01
 
[[Category:PAINT]]
----
 
==Reference Genome Mailing list==
 
The Reference Genome GO Annotation Team uses an email discussion list to facilitate communication.  The list is open to curators of the identified reference genomes involved in reference genome annotation.
 
To join the list send an email to:
 
  refgenome-request@geneontology.org
 
In the body of the message add the line:
 
  subscribe <insert your email address>
 
 
To access the email archive for this mailing list:
 
http://www.geneontology.org/GO.list.refgenome.shtml
 
-----
 
==Guidelines for Characterization of Reference Genome Descriptions==
 
All descriptions are based on Sequence Ontology terms.

All counts are necessarily estimates, but some can be estimated to the ones digit, while others only to the thousands. There is therefore no need to distinguish between them; just report to the significant digit. It is recognized that different databases can currently provide different portions of this information. A goal should be for each database to provide numbers for each of these categories.

Numbers to be presented:
 
- CDS: count one per genomic occurrence (mRNA? This might need to be refined if the group is annotating proteins, not genes). Required.
 
- snoRNA: count one per genomic occurrence
 
- rRNA: count one per type
 
- snRNA: count one per genomic occurrence
 
- tRNA: count one per genomic occurrence
 
- ncRNA: count one per genomic occurrence and do not double count (i.e.
if snoRNA count is supplied, don't double count it here)
 
- transposable_element: count one per genomic occurrence
 
- transposable_element_gene: count one per unique mRNA occurrence per
transposable_element type
 
- pseudogene: count one per genomic occurrence
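
One way to obtain counts of this kind, assuming the genome annotation is available as GFF3 with Sequence Ontology terms in the type column, is sketched below. The function name and the sample rows are made up for illustration. The sketch counts one feature per distinct `ID`, so a CDS split over several rows is counted once. It does not attempt the "one per type" rule for rRNA or the removal of double-counted ncRNA subtypes.

```python
from collections import Counter


def count_features(gff3_lines):
    """Tally features per type (GFF3 column 3), one per distinct ID."""
    counts = Counter()
    seen = set()
    for row_number, line in enumerate(gff3_lines):
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a feature row
        feature_type = cols[2]
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        # Rows sharing an ID describe one discontinuous feature; rows
        # without an ID are treated as separate features.
        key = (feature_type, attrs.get("ID", f"row-{row_number}"))
        if key not in seen:
            seen.add(key)
            counts[feature_type] += 1
    return counts


# Made-up rows for illustration only.
sample = [
    "##gff-version 3",
    "chr1\tdemo\tCDS\t100\t200\t.\t+\t0\tID=cds1",
    "chr1\tdemo\tCDS\t300\t400\t.\t+\t1\tID=cds1",
    "chr1\tdemo\ttRNA\t900\t970\t.\t-\t.\tID=trna1",
    "chr1\tdemo\tpseudogene\t1500\t1800\t.\t+\t.\tID=pg1",
]
totals = count_features(sample)
print(totals)
```

Because the counts are estimates anyway, a group could round the resulting totals to whatever significant digit it considers reliable before reporting them.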
 
----
 
==Minutes of Reference Genome Phone Conferences==
 
[[RefGenome24Jan07_Phone_Conference.doc]]

Latest revision as of 01:50, 1 July 2021

Note that this project was formerly called the Reference Genome Annotation Project.

Overview

The Phylogenetic Annotation Project infers annotations across evolutionarily related proteins, based on the known functions of proteins within PANTHER (http://pantherdb.org/) phylogenetic family trees.

PAINT (Phylogenetic Annotation and INference Tool)

PAINT is a Java software application for supporting inference of ancestral as well as present-day characters (represented by ontology terms) in the context of a phylogenetic tree. PAINT is currently being used in the GO Phylogenetic Annotation Project to support inference of GO function terms (molecular function, cellular component and biological process) by homology.
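
As a rough illustration of what inference "in the context of a phylogenetic tree" can look like, the sketch below passes terms asserted at an ancestral node down to the present-day leaves and stops wherever a loss is recorded. It is not PAINT's implementation (PAINT is a Java application). The `Node` class, the `propagate` function, and the node names are hypothetical, and the GO identifier serves only as a placeholder.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of a toy family tree (hypothetical structure, not PAINT's)."""
    name: str
    children: list[Node] = field(default_factory=list)
    gained: set[str] = field(default_factory=set)  # terms asserted to arise here
    lost: set[str] = field(default_factory=set)    # terms asserted to be lost here


def propagate(node: Node, inherited: frozenset[str] = frozenset(),
              result: dict[str, set[str]] | None = None) -> dict[str, set[str]]:
    """Return {leaf name: inferred terms} for every leaf under `node`."""
    if result is None:
        result = {}
    # A node carries what it inherits plus what it gains, minus what it loses.
    terms = (set(inherited) | node.gained) - node.lost
    if not node.children:          # present-day sequence
        result[node.name] = terms
    for child in node.children:    # ancestral node: pass terms down
        propagate(child, frozenset(terms), result)
    return result


# Placeholder term and made-up node names, for illustration only.
tree = Node("ancestor", gained={"GO:0003677"}, children=[
    Node("leaf_A"),
    Node("duplication", lost={"GO:0003677"}, children=[Node("leaf_B")]),
    Node("leaf_C"),
])
inferred = propagate(tree)
print(inferred)
```

In this toy tree, `leaf_A` and `leaf_C` inherit the term from the ancestor, while `leaf_B` does not because a loss is recorded on the branch leading to it.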

Resources for PAINT annotation

PAINT Annotation principles

The PAINT Annotation principles page describes the PAINT annotation guidelines.

PAINT User Guide

The PAINT User Guide provides annotation guidelines for PAINT annotation as well as step-by-step instructions on how to use the PAINT tool.

PAINT trees curation status

The PAINT trees curation status page lists all PANTHER trees and the history of their curation status.

PAINT Update pipeline

PAINT GAF production

PAINT GAF QC-examples

PAINT Conference Calls

  • Monthly, on the first Tuesday of the month, 9 AM Pacific / 6 PM Europe (as of March 2019)
  • The Zoom link is in the Google Calendar


Reporting bugs or likely errors in the trees

Tree issues

If a PANTHER tree needs to be reviewed, please create a ticket in the PANTHER GitHub tracker: https://github.com/pantherdb/Helpdesk/issues

PAINT issues

Issues with the PAINT tools should be reported in this tracker: https://github.com/pantherdb/db-PAINT/issues

Pantree issues

Issues with the Pantree.org site should be reported at: https://github.com/pantherdb/PanTree

Pages to review

Archived & retired Pages

These pages are kept for reference, but the information in them may no longer be current.


Past Annotation targets

Target Gene List August 2006-April 2008

Review Status

Last reviewed: 2021-07-01