Tips to Produce High Quality Annotations
Revision as of 19:33, 6 March 2020
Get the wider perspective
- Favor a pathway-by-pathway or gene product-by-gene product approach for curation, rather than paper-by-paper.
  - GO annotations should be made considering the overall context in which a gene product functions, which may not be apparent from individual papers.
- Read recent publications, including recent reviews if available.
  - Recent reviews can also be helpful for understanding the overall pathway or process you are annotating.
- Look at existing annotations for the gene product and for the term you have chosen to annotate, to ensure consistency.
  - This may trigger the need to revise other annotations.
- Remove incorrect annotations based on invalidated hypotheses.
  - If potentially incorrect annotations have been made by another group, create an issue in the go-annotation tracker (https://github.com/geneontology/go-annotation/issues).
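As a rough illustration of the gene product-by-gene product approach, the sketch below groups annotation records by gene product so that all annotations for one gene product can be reviewed together, and lists which gene products share a term. The record layout, the gene product names (geneA, geneB) and the references (REF:1, REF:2, REF:3) are hypothetical placeholders, not real annotations.

```python
from collections import defaultdict

# Hypothetical records: (gene product, GO term ID, reference).
# These are placeholders for illustration only, not real annotations.
ANNOTATIONS = [
    ("geneA", "GO:0004531", "REF:1"),
    ("geneB", "GO:0042060", "REF:2"),
    ("geneA", "GO:0006309", "REF:3"),
]


def group_by_gene_product(records):
    """Collect (term, reference) pairs per gene product, sorted for stable review."""
    grouped = defaultdict(list)
    for gene_product, term, reference in records:
        grouped[gene_product].append((term, reference))
    return {gene: sorted(pairs) for gene, pairs in grouped.items()}


def gene_products_with_term(records, term):
    """Return the set of gene products annotated to a given term."""
    return {gene for gene, go_term, _ in records if go_term == term}


if __name__ == "__main__":
    for gene, pairs in group_by_gene_product(ANNOTATIONS).items():
        print(gene, pairs)
```

Reviewing the grouped output before adding a new annotation makes it easier to spot an existing annotation that the new one contradicts or supersedes.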
Focus on the research hypothesis
- Use existing knowledge to understand the hypothesis being tested and its relation to the experimental observation.
Capture the conclusion, not the assay
Make sure to distinguish between assays and GO terms
- Some assays appear to correspond directly to GO terms. Be careful when interpreting the results of these assays, as they may not reflect the actual role of the gene product in a process.
- Examples:
  - GO:0006309 apoptotic DNA fragmentation
  - GO:0006919 activation of cysteine-type endopeptidase activity involved in apoptotic process (AKA caspase activation)
    - These assays are often used to measure whether apoptosis occurred, but NOT to measure a specific, direct role of a gene product in that process.
  - GO:0042060 wound healing: it is incorrect to use this term for wound healing assay experiments, which are assays for 'GO:0048870 cell motility', and NOT wound healing.
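One way to act on this during review is to flag annotations that use a term whose name matches a common assay readout, so a curator can confirm the paper shows a direct role. The sketch below is a minimal illustration, assuming annotations are available as (gene product, GO term ID) pairs. The lookup table contains only the three terms named in the examples above, and the gene product names are hypothetical.

```python
# GO terms named in the examples above whose names resemble assay readouts.
ASSAY_LIKE_TERMS = {
    "GO:0006309": "apoptotic DNA fragmentation",
    "GO:0006919": "activation of cysteine-type endopeptidase activity "
                  "involved in apoptotic process",
    "GO:0042060": "wound healing",
}


def flag_for_review(annotations):
    """Return the (gene product, term) pairs that use an assay-like term.

    A flagged pair is not necessarily wrong; it only marks an annotation
    where the curator should check that the experiment demonstrates a
    direct role rather than a readout of the process.
    """
    return [pair for pair in annotations if pair[1] in ASSAY_LIKE_TERMS]


if __name__ == "__main__":
    # Hypothetical gene products, for illustration only.
    candidates = [("geneA", "GO:0042060"), ("geneB", "GO:0004531")]
    for gene, term in flag_for_review(candidates):
        print(f"{gene}: check use of {term} ({ASSAY_LIKE_TERMS[term]})")
```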
Some assay conditions test general protein properties, not their function
Examples:
- 1. Requirement for a post-translational modification for activity: PMID:16500043, Fig. 3. DNAse II is a glycosylated protein, and the glycosylation is presumably necessary for the correct protein conformation: the protein is less active when glycosylation is impaired, for example with tunicamycin.
  - Hints that tunicamycin does not regulate the activity of DNAse II:
    - The paper doesn't mention that DNAse II binds tunicamycin (a direct interaction is necessary for a regulation annotation).
    - The paper indirectly shows that in the presence of tunicamycin DNAse II is not glycosylated (it has a lower molecular weight), and this lower molecular weight form is less active.
  - The conclusion of the experiment is that DNAse II is glycosylated. Since glycosylation is not known to be regulated (unlike phosphorylation), it is unlikely that these data support any kind of regulation. The effect of tunicamycin is outside the scope of GO. The only annotation that can be made from Fig. 3 is GO:0004531 deoxyribonuclease II activity.
- 2. Impact of chemicals that affect the cellular environment on an activity: PMID:19690162, Fig. 3. Trx contains a conserved active site with two cysteines that are essential for its redox activity. The authors investigated whether this site was necessary for interaction with SlrP by treating the cells with hydrogen peroxide. The interaction was reduced. In the discussion the authors suggest that the oxidized form of Trx (i.e. in the presence of H2O2) doesn't have the correct conformation for interaction with SlrP ("The residues necessary for interaction with SlrP could be buried in the dimeric form of Trx.").
  - Hints that hydrogen peroxide does not regulate Trx/SlrP binding:
    - Based on experimental data: hydrogen peroxide is not shown to directly interact with either interacting partner.
    - Based on biological knowledge: the likely effect of hydrogen peroxide is the oxidation of the active-site cysteines of Trx, which likely affects its conformation.
Annotate gene products to their normal function
- The scope of the GO is to capture the normal function of gene products, so special care must be taken to understand how experimental observations inform this.
- Mutant phenotypes are an important source of GO annotations, but curators should consider carefully how to use that information for annotation.
- Specific guidelines for annotating mutant phenotypes are available at http://wiki.geneontology.org/index.php/Annotating_from_phenotypes.
Check term placement in the ontology
Make sure that the parents of the term are also consistent with the annotation. For example, 'GO:0015616 DNA translocase activity' is a child of 'GO:0008094 DNA-dependent ATPase activity', although the term name alone does not reveal that DNA translocase is a type of ATPase.
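The parent check can be sketched as a walk up the is_a hierarchy that lists every ancestor a curator is implicitly asserting when choosing a term. The example below is only an illustration: the `IS_A` dictionary is a hand-written stand-in that holds just the single relationship cited above, whereas a real check would need the full ontology loaded from a current release.

```python
# Toy is_a map holding only the relationship cited in the text.
# This dictionary is an assumption for illustration, not the real ontology.
IS_A = {
    "GO:0015616": ["GO:0008094"],  # DNA translocase activity -> DNA-dependent ATPase activity
    "GO:0008094": [],
}


def ancestors(term, is_a=IS_A):
    """Return every term reachable from `term` by following is_a links."""
    seen = set()
    stack = list(is_a.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(is_a.get(parent, []))
    return seen


if __name__ == "__main__":
    # Annotating to GO:0015616 also asserts every ancestor printed here.
    print(sorted(ancestors("GO:0015616")))
```

If the experiment does not support one of the listed ancestors (here, ATPase activity), the chosen term is probably too specific or simply the wrong term.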
Represent current knowledge
The GO is not an archive of all findings published on a protein. Do not hesitate to remove older annotations that are inconsistent with the current state of knowledge of a protein's role.
Annotations from large-scale datasets
The term high-throughput data is often used to describe data that have been generated by automatic or semi-automatic methodology without validation of the results for individual gene products. The experiments can be viewed as screens: experiments performed in parallel without explicit target selection; they are generally not hypothesis-driven. More details on how to annotate high-throughput data can be found in the High Throughput Experimental Evidence Codes section of the Guide to GO Evidence Codes page.
Avoiding predatory journals
Make sure you select high-quality papers. Some journals have been labeled 'predatory journals' for their dubious practices with respect to the publication process. There is a list of these journals here: https://predatoryjournals.com/journals/. If you are in doubt, this list may help you decide whether or not to annotate the paper.
More information about predatory journals may be found in these articles:
- https://www.the-scientist.com/news-opinion/german-scientists-frequently-publish-in-predatory-journals-64518
- https://www.the-scientist.com/news-opinion/indian-government-aims-to-take-down-predatory-journals-64731?utm_campaign=TS_DAILY%20NEWSLETTER_2018&utm_source=hs_email&utm_medium=email&utm_content=65569194&_hsenc=p2ANqtz-8vKr7yMcdVq-SddM-VUDuhEkiuw_GUGkhM8JomWp1adoKTdafscdN7dP2Y-PP2zwhFVC3e0zD9SSshnGzZ6T9hyOHOCQ&_hsmi=65569194
- DEF CON 26 - Svea, Suggy, Till - Inside the Fake Science Factory https://www.youtube.com/watch?v=ras_VYgA77Q
Review Status
Last reviewed: April 11, 2019