2010 GO camp Annotation of HTP data
Revision as of 10:48, 29 April 2010
Members of this working group
1. Background
With advances in technology and the availability of genomic data, it is now possible to study various aspects of gene products at a genomic scale, for example the localization of thousands of gene products or phenotypes on a large scale. These studies typically describe the experimental method in the main text but cannot show the results for each gene product (gp) within the paper, so the results are often supplied as supplementary data. It is time-consuming for a curator to review the data for thousands of gps. What is the best way to annotate these types of studies?
2. Review of current GO annotation practices
- ChIP data (Rachael?): Annotation_consistency_:_ChIP_experiments
- SGD's HTP annotation practice
3. Proposed annotation policy
4. Examples (papers) and discussion of GO annotation issues
- Hazbun TR, et al. (2003) Assigning function to yeast proteins by integration of technologies. Mol Cell 12(6):1353-65 PMID:14690591
- Kumar A, et al. (2002) Subcellular localization of the yeast proteome. Genes Dev 16(6):707-19, PMID 11914276
- Reinders J, et al. (2006) Toward the complete yeast mitochondrial proteome: multidimensional separation techniques for mitochondrial proteomics. J Proteome Res 5(7):1543-54, PMID 16823961
- Huh WK, et al. (2003) Global analysis of protein localization in budding yeast. Nature 425(6959):686-91, PMID 14562095
- Sickmann A, et al. (2003) The proteome of Saccharomyces cerevisiae mitochondria. Proc Natl Acad Sci U S A 100(23):13207-12, PMID 14576278
5. Examples of annotation incoherence for the user
- Multiple subcellular locations
At the moment, there is no way to distinguish large-scale results from those of specific, small-scale papers. What will happen with propagations?
AC: Q42290
GO:0005618 cell wall IDA
GO:0005730 nucleolus IDA
GO:0005739 mitochondrion
GO:0005741 mitochondrial outer membrane IDA
GO:0005750 mitochondrial respiratory chain complex III IDA
GO:0005758 mitochondrial intermembrane space IDA
GO:0005759 mitochondrial matrix IDA
GO:0005886 plasma membrane IDA
GO:0009507 chloroplast IDA
GO:0016020 membrane IDA
AC: Q9FT52
GO:0005730 nucleolus IDA
GO:0005739 mitochondrion IDA
GO:0005886 plasma membrane IDA
GO:0009507 chloroplast IDA
GO:0009535 chloroplast thylakoid membrane IDA
GO:0009579 thylakoid IDA
GO:0022626 cytosolic ribosome IDA
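One way to surface cases like the two above is to count how many distinct cellular component terms each accession carries with IDA evidence and to list the accessions above a cutoff. The sketch below is only an illustration: the tuple layout, the function name `flag_many_locations` and the cutoff of 5 are assumptions, not part of any GO tool or agreed policy. The input reproduces the annotations listed above, with the evidence code left empty where none is shown.

```python
from collections import defaultdict

# (accession, GO id, term name, evidence code). The layout is an assumption
# made for this example; evidence is None where the list shows no code.
ANNOTATIONS = [
    ("Q42290", "GO:0005618", "cell wall", "IDA"),
    ("Q42290", "GO:0005730", "nucleolus", "IDA"),
    ("Q42290", "GO:0005739", "mitochondrion", None),
    ("Q42290", "GO:0005741", "mitochondrial outer membrane", "IDA"),
    ("Q42290", "GO:0005750", "mitochondrial respiratory chain complex III", "IDA"),
    ("Q42290", "GO:0005758", "mitochondrial intermembrane space", "IDA"),
    ("Q42290", "GO:0005759", "mitochondrial matrix", "IDA"),
    ("Q42290", "GO:0005886", "plasma membrane", "IDA"),
    ("Q42290", "GO:0009507", "chloroplast", "IDA"),
    ("Q42290", "GO:0016020", "membrane", "IDA"),
    ("Q9FT52", "GO:0005730", "nucleolus", "IDA"),
    ("Q9FT52", "GO:0005739", "mitochondrion", "IDA"),
    ("Q9FT52", "GO:0005886", "plasma membrane", "IDA"),
    ("Q9FT52", "GO:0009507", "chloroplast", "IDA"),
    ("Q9FT52", "GO:0009535", "chloroplast thylakoid membrane", "IDA"),
    ("Q9FT52", "GO:0009579", "thylakoid", "IDA"),
    ("Q9FT52", "GO:0022626", "cytosolic ribosome", "IDA"),
]


def flag_many_locations(annotations, evidence="IDA", cutoff=5):
    """Return {accession: sorted GO ids} for accessions that carry more than
    `cutoff` distinct terms with the given evidence code."""
    terms = defaultdict(set)
    for accession, go_id, _name, code in annotations:
        if code == evidence:
            terms[accession].add(go_id)
    return {acc: sorted(ids) for acc, ids in terms.items() if len(ids) > cutoff}


flagged = flag_many_locations(ANNOTATIONS)
for acc, ids in sorted(flagged.items()):
    print(acc, len(ids), "IDA terms")
```

A flat count like this treats parent and child terms (for example membrane and plasma membrane) as separate locations, so it can only point a curator at entries worth a closer look; it does not decide which annotations are wrong.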
- When no distinction between 2 subcellular locations is made in the paper...
Should we annotate it at all?
pubmed:16618929 makes no distinction between mitochondrion and chloroplasts, but both subcellular locations have been annotated. See page 3: '...corresponding to mitochondria/plastids (which were not resolved from each other within the density gradient used in this study),...'
Is that correct, or should we discard such results?
AC: O04308
GO:0005739 mitochondrion IDA
GO:0009536 plastid IDA
- Induction by high-throughput analysis...
Should we check whether the induction is significant, or just take it as the authors suggest?
pubmed: 16463103 reports responses to many biotic and abiotic stresses, scored as:
+
++
+++
++++
>++++
When the control is + and the induction is ++, is that really significant, given that the same table goes up to >++++? And what about propagation afterwards, especially in this case, where a large family is analyzed with differential inductions?
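To make the question concrete, the scoring symbols can be treated as ordinal ranks and an induction accepted only if it spans a minimum number of steps. This is a minimal sketch under stated assumptions: treating the symbols as evenly spaced ranks, the function names, and the cutoff of two steps are all hypothetical and not taken from the paper or from any annotation guideline.

```python
# Ordinal ranks for the scoring symbols. Treating them as evenly spaced
# is an assumption; no numeric scale is given in the discussion.
SCALE = {"+": 1, "++": 2, "+++": 3, "++++": 4, ">++++": 5}


def induction_steps(control, treated):
    """Number of scale steps between the control and treated scores."""
    return SCALE[treated] - SCALE[control]


def is_candidate(control, treated, min_steps=2):
    """True if the change spans at least `min_steps` steps.
    The default cutoff is a hypothetical value, not an agreed policy."""
    return induction_steps(control, treated) >= min_steps


print(is_candidate("+", "++"))    # one step above control
print(is_candidate("+", "++++"))  # three steps above control
```

With such a rule, the + to ++ case raised above would be left unannotated while stronger inductions in the same table would be kept; where the cutoff belongs is exactly the open question.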
- Limit of TAS statement
pubmed: 11118137, family of 1984 putative transcription factors
GO:0045449 regulation of transcription TAS
Should we consider TAS statements that cover such a large family?
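Large batches like this one could be found automatically by counting how many distinct gene products share the same reference and evidence code. The sketch below assumes annotations are available as (accession, reference, evidence) tuples; the function name `large_batches`, the cutoff of 100 and the placeholder accessions are invented for illustration. Only the PMID, the evidence code and the family size come from the example above.

```python
from collections import defaultdict


def large_batches(annotations, cutoff=100):
    """Return {(reference, evidence): number of gene products} for every
    reference/evidence pair that annotates more than `cutoff` gene products."""
    products = defaultdict(set)
    for accession, reference, evidence in annotations:
        products[(reference, evidence)].add(accession)
    return {key: len(accs) for key, accs in products.items() if len(accs) > cutoff}


# Placeholder accessions (GP0000, GP0001, ...) are made up for this example.
demo = [(f"GP{i:04d}", "PMID:11118137", "TAS") for i in range(1984)]
# A made-up small-scale record, to show that it is not reported.
demo.append(("GP9999", "PMID:0000000", "IDA"))

print(large_batches(demo))
```

The same count could also help with the earlier point about telling large-scale results apart from specific papers, since a reference that annotates hundreds of gene products with one evidence code is likely to be a high-throughput study.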
6. Suggestions for Quality Control procedures