Phylogenetic annotation overview: Difference between revisions

From GO Wiki
 
(16 intermediate revisions by 3 users not shown)
Line 1: Line 1:
[[Category:SWUG Projects]] [[Category: PAINT]]
#REDIRECT [[PAINT User Guide]]
= PAINT (Phylogenetic Annotation and INference Tool) =
 
PAINT is a Java software application for supporting inference of ancestral as well as present-day characters (represented by ontology terms) in the context of a phylogenetic tree.  PAINT is currently being used in the GO [[Reference Genome Annotation Project]] to support inference of GO function terms (molecular function, cellular component and biological process) by homology.
[[Category: PAINT]]
 
== Principles underlying PAINT ==
== Principles underlying PAINT ==
Annotation of a gene's function by homology is often referred to as "transitive annotation", in which an experimentally-characterized function of one gene is "transferred" to another gene because of their similarity in sequence.  This pairwise transfer paradigm derives from the success of sequence searching algorithms such as BLAST and Smith-Waterman. Of course, pairwise conservation of function is really due to descent from a common ancestor (homology).  In other words, two sequences of sufficient length are similar because they share a common ancestor, and the reason they have a common function is most likely that they inherited that function from their common ancestor.  This process can be explicitly captured using a phylogenetic model.
Annotation of a gene's function by homology is often referred to as "transitive annotation", in which an experimentally-characterized function of one gene is "transferred" to another gene because of their similarity in sequence.  This pairwise transfer paradigm derives from the success of sequence searching algorithms such as BLAST and Smith-Waterman. Of course, pairwise conservation of function is really due to descent from a common ancestor (homology).  In other words, two sequences of sufficient length are similar because they share a common ancestor, and the reason they have a common function is most likely that they inherited that function from their common ancestor.  This process can be explicitly captured using a phylogenetic model.


Rather than a pairwise paradigm, PAINT uses this more accurate phylogenetic model to infer gene function by homology.  PAINT annotation is intended to capture actual inferences about the evolution of gene function within a gene family: the gain, inheritance, modification and loss of function over evolutionary time.  Inference is a two-step process, and involves directly annotating a phylogenetic tree.  In the first step, experimental GO annotations for extant sequences are used to make inferences about when a given function may have first evolved.  In PAINT, this is referred to as "up-propagation", in which ancestral genes are annotated based on information about extant sequences.  In the second step, "down-propagation", ancestral annotations are used to make inferences about unannotated extant sequences, based on the principle of inheritance from the common ancestor, and allowing for modification and even loss of function during evolution.
Rather than a pairwise paradigm, PAINT uses this more accurate phylogenetic model to infer gene function by homology.  PAINT (Phylogenetic Annotation and INference Tool) annotation is intended to capture actual inferences about the evolution of gene function within a gene family: the gain, inheritance, modification and loss of function over evolutionary time.  Inference is a two-step process, and involves directly annotating a phylogenetic tree.  In the first step, experimental GO annotations for extant sequences are used to make inferences about when a given function may have first evolved.  In PAINT, this is referred to as "up-propagation", in which ancestral genes are annotated based on information about extant sequences.  In the second step, "down-propagation", ancestral annotations are used to make inferences about unannotated extant sequences, based on the principle of inheritance from the common ancestor, and allowing for modification and even loss of function during evolution.
 
'''For a more complete description, please see the publication on the GO Phylogenetic Annotation process, [https://www.ncbi.nlm.nih.gov/pubmed/21873635 Gaudet et al, Briefings in Bioinformatics, 2011].'''


== PAINT software ==
== PAINT software ==
PAINT is currently being implemented in Java 1.6, as a joint project between Paul Thomas's group (USC) and Suzanna Lewis's group (LBL).  Development of PAINT has being funded by grant [http://projectreporter.nih.gov/project_info_description.cfm?aid=7591614&icde=3337927 GM081084] from the U.S. National Institutes of Health, and the GO Consortium grant 5U41HG002273.   
PAINT is implemented in Java, as a joint project between Paul Thomas's group (USC) and Suzanna Lewis's group (LBL).  Development of PAINT has been funded by grant [http://projectreporter.nih.gov/project_info_description.cfm?aid=7591614&icde=3337927 GM081084] from the U.S. National Institutes of Health, and the GO Consortium grant 5U41HG002273.   


== Touch Up (automates updates of phylogenetic annotation files) ==
* check on whether each term has been obsoleted
** if obsolete, replace the term when it has exactly one replacement term; otherwise report it in the log
* check on validity of each annotation
** is_pruned? does the annotation occur on a branch that was pruned (removed) from the tree?
** is_valid_term? does the term appear on the DO NOT USE FOR ANNOTATION list?
** is_redundant?  is the term more general (an is_a or part_of ancestor) than a more specific annotated term?
** has_evidence? is there at least one descendant leaf node with experimental evidence for the annotated term?
** violates_taxon_constraint? does the ancestral node violate a taxon constraint?
* propagation non-redundancy
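
The checks listed above can be sketched in Python roughly as follows. This is an illustration only, not the Touch Up implementation: the <code>Annotation</code> class, the <code>ctx</code> dictionary layout, the function names and the term IDs are all hypothetical, the taxon-constraint check is left out, and <code>has_evidence</code> is simplified to an exact term match on descendant leaves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    node: str  # ID of the tree node that carries the annotation (hypothetical)
    term: str  # ontology term ID (hypothetical)


def ancestors(term, parents):
    """Return every is_a/part_of ancestor of `term` (child -> parents map)."""
    seen, stack = set(), list(parents.get(term, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def resolve_obsolete(term, replacements, log):
    """Swap an obsolete term for its single replacement, otherwise log it."""
    if term not in replacements:
        return term  # not obsolete
    candidates = replacements[term]
    if len(candidates) == 1:
        return candidates[0]
    log.append(f"{term}: obsolete, {len(candidates)} replacement candidates")
    return term


def check_annotation(ann, same_node_terms, ctx):
    """Return a label for each check that `ann` fails."""
    problems = []
    if ann.node in ctx["pruned_nodes"]:
        problems.append("pruned")
    if ann.term in ctx["do_not_annotate"]:
        problems.append("do_not_annotate")
    # Redundant if another term on the same node is more specific.
    if any(ann.term in ancestors(other, ctx["parents"])
           for other in same_node_terms if other != ann.term):
        problems.append("redundant")
    # Needs at least one descendant leaf with experimental evidence.
    leaves = ctx["leaves_under"].get(ann.node, ())
    if not any(ann.term in ctx["experimental"].get(leaf, ()) for leaf in leaves):
        problems.append("no_experimental_evidence")
    return problems


# Toy data, assumed for illustration only.
ctx = {
    "parents": {"T:kinase": ["T:catalytic"], "T:catalytic": []},
    "pruned_nodes": {"AN9"},
    "do_not_annotate": {"T:banned"},
    "leaves_under": {"AN1": ["leafA", "leafB"]},
    "experimental": {"leafA": {"T:kinase"}},
}
log = []
print(resolve_obsolete("T:old", {"T:old": ["T:kinase"]}, log))
print(check_annotation(Annotation("AN1", "T:catalytic"),
                       {"T:kinase", "T:catalytic"}, ctx))
```

Returning a list of failed checks, rather than stopping at the first failure, lets every problem with an annotation be reported in a single pass.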
== Availability ==
== Availability ==
PAINT is freely available for download on [https://github.com/geneontology/paint/releases GitHub].
PAINT is freely available for download at: https://go.paint.usc.edu/


== Installing and using PAINT ==
== Installing and using PAINT ==
See the [[PAINT User Guide]] for more information.
See the [[PAINT User Guide]] for more information.
== Standard operating procedures for GO annotation using PAINT ==
== Standard operating procedures for GO annotation using PAINT ==
The GO Reference Genome Project is annotating trees generated for [http://pantherdb.org PANTHER version 11].
The PAN-GO Phylogenetic Annotation with GO Project is annotating trees generated for [https://pantherdb.org PANTHER version 17], released on 2022-02-23.
 
Annotation of GO terms using PAINT follows a set of SOPs (see the [[PAINT SOP]]).
 
== Review Status ==


Annotation of GO terms using PAINT requires an understanding of the semantics of annotations, and is governed by a set of SOPs (see the [[PAINT SOP]]).  An annotation of an ancestral node means that a gene function is inferred to have first arisen somewhere along the branch of the tree immediately preceding that node.  Loss of a function "X" is annotated using the GO "NOT" qualifier, and means that a given function was inferred to have been lost along the branch immediately preceding the annotated node.  Only experimental GO annotations (as represented by the [http://www.geneontology.org/GO.evidence.shtml evidence code]) can be used as a basis for annotation of ancestral genes.  NOT annotations can be supported by either an experimental GO annotation, absence of specific residues in the sequences, or generally accelerated evolutionary rate.
Last reviewed: October 16, 2023

Latest revision as of 15:10, 17 October 2023

Redirect to: PAINT User Guide


Principles underlying PAINT

Annotation of a gene's function by homology is often referred to as "transitive annotation", in which an experimentally-characterized function of one gene is "transferred" to another gene because of their similarity in sequence. This pairwise transfer paradigm derives from the success of sequence searching algorithms such as BLAST and Smith-Waterman. Of course, pairwise conservation of function is really due to descent from a common ancestor (homology). In other words, two sequences of sufficient length are similar because they share a common ancestor, and the reason they have a common function is most likely that they inherited that function from their common ancestor. This process can be explicitly captured using a phylogenetic model.

Rather than a pairwise paradigm, PAINT uses this more accurate phylogenetic model to infer gene function by homology. PAINT (Phylogenetic Annotation and INference Tool) annotation is intended to capture actual inferences about the evolution of gene function within a gene family: the gain, inheritance, modification and loss of function over evolutionary time. Inference is a two-step process, and involves directly annotating a phylogenetic tree. In the first step, experimental GO annotations for extant sequences are used to make inferences about when a given function may have first evolved. In PAINT, this is referred to as "up-propagation", in which ancestral genes are annotated based on information about extant sequences. In the second step, "down-propagation", ancestral annotations are used to make inferences about unannotated extant sequences, based on the principle of inheritance from the common ancestor, and allowing for modification and even loss of function during evolution.
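
The two-step inference described above can be illustrated with a small Python sketch. It is a toy under stated assumptions, not PAINT's code: in PAINT a curator decides which ancestral node to annotate, whereas this sketch substitutes a simple heuristic (the most recent common ancestor of the leaves with experimental evidence). The tree, the node names and the function names are all hypothetical.

```python
# Toy gene tree as a child -> parent map; node names are hypothetical.
PARENT = {"A": "root", "B": "root",
          "a1": "A", "a2": "A", "b1": "B", "b2": "B"}


def path_to_root(node, parent):
    """List of nodes from `node` up to the root, inclusive."""
    path = [node]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def up_propagate(evidence_leaves, parent):
    """Stand-in for curation: most recent common ancestor of the evidence."""
    paths = [path_to_root(leaf, parent) for leaf in evidence_leaves]
    shared = set(paths[0]).intersection(*paths[1:])
    # The first shared node on a leaf-to-root path is the deepest one.
    return next(node for node in paths[0] if node in shared)


def down_propagate(annotated, parent, lost_at=frozenset()):
    """Leaves inheriting from `annotated`, skipping subtrees marked as loss."""
    leaves = set(parent) - set(parent.values())
    inherited = set()
    for leaf in leaves:
        path = path_to_root(leaf, parent)
        if annotated in path:
            # Nodes from the leaf up to, but not including, the annotated node.
            below = path[:path.index(annotated)]
            if not lost_at.intersection(below):
                inherited.add(leaf)
    return inherited


evidence = ["a1", "b1"]                      # leaves with experimental data
ancestor = up_propagate(evidence, PARENT)    # step 1: annotate an ancestor
inferred = down_propagate(ancestor, PARENT, lost_at={"B"}) - set(evidence)
print(ancestor, sorted(inferred))            # step 2: unannotated leaves
```

Here <code>lost_at</code> plays the role of a loss annotation on a descendant node: leaves below such a node do not inherit the function, which mirrors the allowance for loss of function during evolution.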

For a more complete description, please see the publication on the GO Phylogenetic Annotation process, Gaudet et al, Briefings in Bioinformatics, 2011.

PAINT software

PAINT is implemented in Java, as a joint project between Paul Thomas's group (USC) and Suzanna Lewis's group (LBL). Development of PAINT has been funded by grant GM081084 from the U.S. National Institutes of Health, and the GO Consortium grant 5U41HG002273.

Availability

PAINT is freely available for download at: https://go.paint.usc.edu/

Installing and using PAINT

See the PAINT User Guide for more information.

Standard operating procedures for GO annotation using PAINT

The PAN-GO Phylogenetic Annotation with GO Project is annotating trees generated for PANTHER version 17, released on 2022-02-23.

Annotation of GO terms using PAINT follows a set of SOPs (see the PAINT SOP).

Review Status

Last reviewed: October 16, 2023