Taxon-GO Checks and Commentary - Part 1: Difference between revisions

==Peptidoglycan biosynthetic process==
 
3 IEA from TAIR to the bacterial peptidoglycan biosynthetic process, instead of to another similar-sounding term.
 
The def of this term has been improved to be much more comprehensive:
 
def: The chemical reactions and pathways resulting in the formation of peptidoglycans, any of a class of glycoconjugates found in bacterial cell walls. Peptidoglycans are polymers consisting of sugars and amino acids. The sugar component consists of alternating residues of β-(1,4) linked N-acetylglucosamine and N-acetylmuramic acid residues. Attached to the N-acetylmuramic acid is a peptide chain of three to five amino acids. The peptide chain can be cross-linked to the peptide chain of another strand forming the 3D mesh-like layer.
 
TAIR GO:0009252 "peptidoglycan biosynthetic process" only_in NCBITaxon:2 "Bacteria" :: TAIR gene:2097647 AT3G08840.1 GO:0019047 PMID:16291214 EXP P protein NCBITaxon:11694
 
TAIR GO:0009252 "peptidoglycan biosynthetic process" only_in NCBITaxon:2 "Bacteria" :: TAIR gene:1006228677 AT3G08840.2 GO:0005758 PMID:10620603 EXP C protein NCBITaxon:11683


*Emily is looking at the file too, and is making notes of the numbers of annotations errors found, and the fixes that she has applied. She is feeding this information back, along with any ontology errors that she finds.  
TAIR GO:0009252 "peptidoglycan biosynthetic process" only_in NCBITaxon:2 "Bacteria" :: TAIR gene:1006228676 AT3G08840.3 GO:0019047 PMID:16291214 EXP P protein NCBITaxon:11683




  def: Functions in the storage of nutritious substrates.


*Emily has checked the UniProtKB manual annotation lines and has provided comments with CAPS at the beginning of full annotation lines.  
TAXON RULE INCORRECT. (annotation to a major plant storage protein) GO:0045735 "nutrient reservoir activity" only_in NCBITaxon:88770 "Panarthropoda" :: Swiss-Prot Q9XHP1 Q9XHP1 GO:0045735 PMID:10606554 NAS F protein NCBITaxon:4182 20051104 UniProtKB
TAXON RULE INCORRECT. (annotation to a major plant storage protein) GO:0045735 "nutrient reservoir activity" only_in NCBITaxon:88770 "Panarthropoda" :: Swiss-Prot Q9SPL3 AMP2-3 GO:0045735 PMID:10571855 NAS F protein NCBITaxon:60698 20060904 UniProtKB
TAXON RULE INCORRECT. (annotation to a major plant storage protein) GO:0045735 "nutrient reservoir activity" only_in NCBITaxon:88770 "Panarthropoda" :: Swiss-Prot Q9XHP0 Q9XHP0 GO:0045735 PMID:10606554 NAS F protein NCBITaxon:4182 20051104 UniProtKB
TAXON RULE INCORRECT. (annotation to a major plant storage protein) GO:0045735 "nutrient reservoir activity" only_in NCBITaxon:88770 "Panarthropoda" :: Swiss-Prot Q39649 Q39649 GO:0045735 PMID:8275099 TAS F protein NCBITaxon:3661 20050516 UniProtKB


==Multi-organism process==
'''Before''':

[[Image:DNA.gif|1000px]]


'''After''':

[[Image:DNANew.gif|1000px]]


==photoreceptor cell morphogenesis==


This term was developed for use with Drosophila melanogaster but has annotations as follows:


MGI GO:0008594 "photoreceptor cell morphogenesis" only_in NCBITaxon:50557 "Insecta" :: MGI MGI:1920910 Cabp4 GO:0009297 PMID:16729895 RCA TIGR_CMR:PSPTO0926 P protein NCBITaxon:208964

MGI GO:0008594 "photoreceptor cell morphogenesis" only_in NCBITaxon:50557 "Insecta" :: MGI MGI:1277215 Bbs1 GO:0009297 PMID:94049125 IDA P protein NCBITaxon:208964

MGI GO:0008594 "photoreceptor cell morphogenesis" only_in NCBITaxon:50557 "Insecta" :: MGI MGI:1345146 Grk1 GO:0019047 PMID:16291214 EXP P protein NCBITaxon:11695

Ensembl GO:0008594 "photoreceptor cell morphogenesis" only_in NCBITaxon:50557 "Insecta" :: RGD 619712 Grk1 GO:0005758 PMID:10620603 EXP C protein NCBITaxon:11686

RGD GO:0008594 "photoreceptor cell morphogenesis" only_in NCBITaxon:50557 "Insecta" :: RGD 1306083 Cabp4 GO:0019047 PMID:16291214 EXP P protein NCBITaxon:11686

RGD GO:0008594 "photoreceptor cell morphogenesis" only_in NCBITaxon:50557 "Insecta" :: RGD 1307581 Bbs1 GO:0019047 PMID:16291214 EXP P protein NCBITaxon:11689

RGD GO:0008594 "photoreceptor cell morphogenesis" only_in NCBITaxon:50557 "Insecta" :: RGD 619712 Grk1 GO:0019047 PMID:16291214 EXP P protein NCBITaxon:362651

Ensembl GO:0008594 "photoreceptor cell morphogenesis" only_in NCBITaxon:50557 "Insecta" :: Swiss-Prot P28327 GRK1 GO:0019047 PMID:16291214 EXP P protein NCBITaxon:11703

Ensembl GO:0008594 "photoreceptor cell morphogenesis" only_in NCBITaxon:50557 "Insecta" :: Swiss-Prot Q8HZJ4 CABP4 GO:0019047 PMID:16291214 EXP P protein NCBITaxon:11692

Ensembl GO:0008594 "photoreceptor cell morphogenesis" only_in NCBITaxon:50557 "Insecta" :: Swiss-Prot Q8NFJ9-2 BBS1 GO:0019047 PMID:16291214 EXP P protein NCBITaxon:11688

Ensembl GO:0008594 "photoreceptor cell morphogenesis" only_in NCBITaxon:50557 "Insecta" :: Swiss-Prot P57796-2 CABP4 GO:0019047 PMID:16291214 EXP P protein NCBITaxon:11678

Ensembl GO:0008594 "photoreceptor cell morphogenesis" only_in NCBITaxon:50557 "Insecta" :: Swiss-Prot P57796 CABP4 GO:0019047 PMID:16291214 EXP P protein NCBITaxon:11699

Ensembl GO:0008594 "photoreceptor cell morphogenesis" only_in NCBITaxon:50557 "Insecta" :: Swiss-Prot Q8NFJ9 BBS1 GO:0019047 PMID:16291214 EXP P protein NCBITaxon:11689

Ensembl GO:0008594 "photoreceptor cell morphogenesis" only_in NCBITaxon:50557 "Insecta" :: Swiss-Prot Q15835 GRK1 GO:0019047 PMID:16291214 EXP P protein NCBITaxon:11692


8 EXP from Ensembl<br>
1 RCA, 1 EXP and 1 IDA from MGI<br>
3 EXP from RGD<br>


The term is fairly ambiguously defined. I need to ask Becky Foulger if it is appropriate to use the same term for all kinds of photoreceptor cell. She is away, but I will email and wait for a response.  


I DON'T KNOW!! ISS FROM FLYBASE ANNOTATION GO:0042331 "phototaxis" only_in NCBITaxon:33090 "Viridiplantae" :: Swiss-Prot Q9GRW7 nonA GO:0042331 GO_REF:0000024 ISS UniProtKB:Q04047 P protein NCBITaxon:7244 20060207 UniProtKB


I DON'T KNOW!! ISS FROM FLYBASE ANNOTATION GO:0042331 "phototaxis" only_in NCBITaxon:33090 "Viridiplantae" :: Swiss-Prot Q9GRX4 nonA GO:0042331 GO_REF:0000024 ISS UniProtKB:Q04047 P protein NCBITaxon:47316 20060207 UniProtKB
==System problems==


===Taxon hierarchy problems===

THIS ORGANISM IS A CHILD OF EUKARYOTA; SOMETHING MISSING IN TAXON HIERARCHY? GO:0009407 "toxin catabolic process" only_in NCBITaxon:131567 "cellular organisms" :: Swiss-Prot P86101 P86101 GO:0009407 PMID:18599388 IDA P protein NCBITaxon:569175 20081222 UniProtKB

FROM SGD ANNOTATION: AND YEAST DO HAVE THESE; PMID: 15103330. SOMETHING MISSING IN TAXON HIERARCHY? GO:0016226 "iron-sulfur cluster assembly" only_in NCBITaxon:131567 "cellular organisms" :: TrEMBL B9W8S4 CD36_08500 GO:0016226 GO_REF:0000024 ISS UniProtKB:P23503 P protein NCBITaxon:573826 20090714 UniProtKB

Chris responds: Looks like I was using a slightly out of date taxonomy, hence the false positives for bivalves, Candida strains etc. Will try and fix.

==UniProt Comments==

Comments from Emily:

'''nucleolar ribonuclease P complex'''

'''FIXED''': Multimeric ribonuclease P complex no longer has any taxon link.

TAXON RULE INCORRECT (see GO term def: A ribonuclease P complex located in the nucleolus of a eukaryotic cell, and PMID:8918471) GO:0005655 "nucleolar ribonuclease P complex" only_in NCBITaxon:2157 "Archaea" :: Swiss-Prot Q99575 POP1 GO:0005655 PMID:8918471 IDA C protein NCBITaxon:9606 20010813 UniProtKB

'''gamma-tubulin ring complex'''


'''FIXED''':  Only_in_taxon link from gamma-tubulin large complex to Fungi removed.


TAXON RULE INCORRECT GO:0008274 "gamma-tubulin ring complex" only_in NCBITaxon:4751 "Fungi" :: Swiss-Prot Q95K09 TUBGCP5 GO:0008274 PMID:11694571 ISS UniProtKB:Q96RT8 C protein NCBITaxon:9541 20041006 UniProtKB
e.g. this complex exists in vertebrates and invertebrates: PMID:10854328, PMID:11134080, PMID:19509060
TAXON RULE INCORRECT GO:0008274 "gamma-tubulin ring complex" only_in NCBITaxon:4751 "Fungi" :: Swiss-Prot Q9GKK8 BRCA1 GO:0008274 PMID:12214252 NAS C protein NCBITaxon:9598 20021218 UniProtKB


GO:0008274 "gamma-tubulin ring complex" only_in NCBITaxon:4751 "Fungi" :: Swiss-Prot Q96RT8 TUBGCP5 GO:0008274 PMID:11694571 IDA C protein NCBITaxon:9606 20021106 UniProtKB


GO:0008274 "gamma-tubulin ring complex" only_in NCBITaxon:4751 "Fungi" :: Swiss-Prot Q9UGJ1 TUBGCP4 GO:0008274 PMID:10562286 NAS C protein NCBITaxon:9606 20021218 UniProtKB


'''odontogenesis of dentine-containing tooth'''

'''FIXED'''


TAXON RULE INCORRECT.GO:0042475 "odontogenesis of dentine-containing tooth" only_in NCBITaxon:33317 "Protostomia" :: MGI MGI:104659 Dll1 GO:0042475 MGI:MGI:1327472|PMID:9882480 NAS P gene NCBITaxon:10090 20021218 UniProtKB
GO:0042475 "odontogenesis of dentine-containing tooth" only_in NCBITaxon:33317 "Protostomia" :: Swiss-Prot P10070 GLI2 GO:0042475 GO_REF:0000024 ISS UniProtKB:Q0VGT2 P protein NCBITaxon:9606 20070125 UniProtKB


==Annotation problems found and fixed by UniProt==
INAPPROPRIATE ANNOTATION; (replaced with GO:0006948 induction by virus of host cell-cell fusion)GO:0000747 "conjugation with cellular fusion" only_in NCBITaxon:4751 "Fungi" :: Swiss-Prot P01730 CD4 GO:0000747 PMID:9166430 IDA P protein NCBITaxon:9606 20060110 UniProtKB
INAPPROPRIATE ANNOTATION; INAPPROPRIATE ISS TRANSFER GO:0007424 "open tracheal system development" only_in NCBITaxon:6656 "Arthropoda" :: Swiss-Prot Q5XXA6 ANO1 GO:0007424 GO_REF:0000024 ISS UniProtKB:Q8BHY3 P protein NCBITaxon:9606 20080917 UniProtKB
INAPPROPRIATE ANNOTATION; GO TERM DEF IMPROVEMENTS? GO:0048222 "glycoprotein network" only_in NCBITaxon:33090 "Viridiplantae" :: Swiss-Prot Q8IX30 SCUBE3 GO:0048222 PMID:15234972 NAS C protein NCBITaxon:9606 20060904 UniProtKB
INCORRECT ANNOTATION -INAPPROPRIATE ISS TRANSFER GO:0005739 "mitochondrion" only_in NCBITaxon:2759 "Eukaryota" :: Swiss-Prot O34708 yflA GO:0005739 GO_REF:0000024 ISS UniProtKB:Q9WVL0 C protein NCBITaxon:1423 20041006 UniProtKB
INCORRECT ANNOTATION -HOST TERM NEEDED GO:0005783 "endoplasmic reticulum" only_in NCBITaxon:2759 "Eukaryota" :: Swiss-Prot P03247 E1B GO:0005783 PMID:10381623 IDA C protein NCBITaxon:10515 20060314 UniProtKB
INCORRECT ANNOTATION -HOST TERM NEEDED GO:0005783 "endoplasmic reticulum" only_in NCBITaxon:2759 "Eukaryota" :: Swiss-Prot P03247 E1B GO:0005783 PMID:7954800 IDA C protein NCBITaxon:10515 20060214 UniProtKB
INCORRECT ANNOTATION - GO:0009252 "peptidoglycan biosynthetic process" only_in NCBITaxon:2 "Bacteria" :: Swiss-Prot P83944 P83944 GO:0009252 PMID:12431412 IDA P protein NCBITaxon:6687 20040526 UniProtKB
INCORRECT ANNOTATION - INAPPROPRIATE ISS TRANSFER GO:0001541 "ovarian follicle development" only_in NCBITaxon:32523 "Tetrapoda" :: Swiss-Prot Q90511 eif2b2 GO:0001541 GO_REF:0000024 ISS UniProtKB:P49770 P protein NCBITaxon:31033 20060905 UniProtKB
INCORRECT ANNOTATION -NEED NEW TERM TO DESCRIBE DEFENCE/ATTACK MECHANISM; IF APPROPRIATE. GO:0043303 "mast cell degranulation" only_in NCBITaxon:131567 "cellular organisms" :: Swiss-Prot P86170 P86170 GO:0043303 PMID:18942691 IDA P protein NCBITaxon:582867 20081222 UniProtKB
INCORRECT ANNOTATION - SHOULD HAVE BEEN EXTRACELLULAR MATRIX. GO:0005578 "proteinaceous extracellular matrix" only_in NCBITaxon:33208 "Metazoa" :: Swiss-Prot Q06100 Cgl1 GO:0005578 PMID:10931889 IDA C protein NCBITaxon:5346 20040723 UniProtKB
INCORRECT ANNOTATION - INAPPROPRIATE ISS TRANSFER  GO:0001701 "in utero embryonic development" only_in NCBITaxon:40674 "Mammalia" :: TrEMBL Q641A9 LOC403373 GO:0001701 GO_REF:0000024 ISS UniProtKB:Q9R207 P protein NCBITaxon:8355 20061204 UniProtKB
INCORRECT ANNOTATION - INAPPROPRIATE ISS TRANSFER GO:0005634 "nucleus" only_in NCBITaxon:2759 "Eukaryota" :: TrEMBL Q9J190 gag GO:0005634 PMID:11322894 ISS UniProtKB:Q9UBS8 C protein NCBITaxon:11676 20041006 UniProtKB
INCORRECT ANNOTATION - WRONG AC USED (replaced_by O88869). GO:0005768 "endosome" only_in NCBITaxon:2759 "Eukaryota" :: TrEMBL Q88869 Q88869 GO:0005768 PMID:9837933 IDA C protein NCBITaxon:11023 20031216 UniProtKB
INCORRECT ANNOTATION - WRONG AC USED (replaced_by O88869). GO:0016197 "endosome transport" only_in NCBITaxon:131567 "cellular organisms" :: TrEMBL Q88869 Q88869 GO:0016197 PMID:9837933 NAS P protein NCBITaxon:11023 20031216 UniProtKB
INCORRECT ANNOTATION - INAPPROPRIATE ISS TRANSFER GO:0001541 "ovarian follicle development" only_in NCBITaxon:32523 "Tetrapoda" :: TrEMBL C0H8R1 EI2BE GO:0001541 GO_REF:0000024 ISS UniProtKB:Q13144 P protein NCBITaxon:8030 20090514 UniProtKB
INAPPROPRIATE ANNOTATION - WRONG AC USED. GO:0007517 "muscle organ development" only_in NCBITaxon:2759 "Eukaryota" :: TrEMBL P84067 tthHB8IM GO:0007517 GO_REF:0000029 NAS P protein NCBITaxon:274 20040707 UniProtKB
INAPPROPRIATE ANNOTATION -  WRONG AC USED.GO:0007517 "muscle organ development" only_in NCBITaxon:2759 "Eukaryota" :: TrEMBL P84068 tthHB8IM GO:0007517 GO_REF:0000029 NAS P protein NCBITaxon:274 20040707 UniProtKB
GO:0001701 "in utero embryonic development" only_in NCBITaxon:40674 "Mammalia" :: TrEMBL Q6XV80 nbn GO:0001701 GO_REF:0000024 ISS UniProtKB:Q9R207 P protein NCBITaxon:8355 20061204 UniProtKB
INAPPROPRIATE TERM DUE TO HOST/PATHOGEN INTERACTION GO:0001915 "negative regulation of T cell mediated cytotoxicity" only_in NCBITaxon:7742 "Vertebrata" :: TrEMBL Q90VU7 nef GO:0001915 GO_REF:0000024 ISS UniProtKB:P03406 P protein NCBITaxon:11676 20060116 UniProtKB
should have been: GO:0052085 negative regulation by symbiont of host T-cell mediated immune response
INAPPROPRIATE ISS) ANNOTATION - TERM DUE TO HOST/PATHOGEN INTERACTION GO:0001915 "negative regulation of T cell mediated cytotoxicity" only_in NCBITaxon:7742 "Vertebrata" :: TrEMBL O40177 nef GO:0001915 GO_REF:0000024 ISS UniProtKB:P03406 P protein NCBITaxon:11676 20060116 UniProtKB
INAPPROPRIATE ANNOTATION - TERM DUE TO HOST/PATHOGEN INTERACTION GO:0001915 "negative regulation of T cell mediated cytotoxicity" only_in NCBITaxon:7742 "Vertebrata" :: TrEMBL Q71VG3 nef GO:0001915 GO_REF:0000024 ISS UniProtKB:P03406 P protein NCBITaxon:11676 20060116 UniProtKB
INAPPROPRIATE ANNOTATION - TERM DUE TO HOST/PATHOGEN INTERACTION GO:0001915 "negative regulation of T cell mediated cytotoxicity" only_in NCBITaxon:7742 "Vertebrata" :: TrEMBL Q9WM24 nef GO:0001915 GO_REF:0000024 ISS UniProtKB:P03406 P protein NCBITaxon:11676 20060116 UniProtKB
INAPPROPRIATE ANNOTATION - TERM DUE TO HOST/PATHOGEN INTERACTION GO:0001915 "negative regulation of T cell mediated cytotoxicity" only_in NCBITaxon:7742 "Vertebrata" :: TrEMBL Q9WLM4 nef GO:0001915 GO_REF:0000024 ISS UniProtKB:P03406 P protein NCBITaxon:11676 20060116 UniProtKB
INAPPROPRIATE ANNOTATION - TERM DUE TO HOST/PATHOGEN INTERACTION GO:0001915 "negative regulation of T cell mediated cytotoxicity" only_in NCBITaxon:7742 "Vertebrata" :: Swiss-Prot Q70627 nef GO:0001915 GO_REF:0000024 ISS UniProtKB:P03406 P protein NCBITaxon:82834 20060116 UniProtKB
* INAPPROPRIATE ANNOTATION - TERM DUE TO HOST/PATHOGEN INTERACTION GO:0001915 "negative regulation of T cell mediated cytotoxicity" only_in NCBITaxon:7742 "Vertebrata" :: Swiss-Prot P03406 nef GO:0001915 PMID:11689886 IDA P protein NCBITaxon:11686 20060113 UniProtKB
INAPPROPRIATE ANNOTATION - TERM DUE TO HOST/PATHOGEN INTERACTION GO:0001915 "negative regulation of T cell mediated cytotoxicity" only_in NCBITaxon:7742 "Vertebrata" :: Swiss-Prot P03404 nef GO:0001915 GO_REF:0000024 ISS UniProtKB:P03406 P protein NCBITaxon:11678 20061106 UniProtKB
INAPPROPRIATE ANNOTATION - TERM DUE TO HOST/PATHOGEN INTERACTION GO:0001915 "negative regulation of T cell mediated cytotoxicity" only_in NCBITaxon:7742 "Vertebrata" :: Swiss-Prot P04324 nef GO:0001915 GO_REF:0000024 ISS UniProtKB:P03406 P protein NCBITaxon:11679 20060116 UniProtKB
INAPPROPRIATE ANNOTATION - TERM DUE TO HOST/PATHOGEN INTERACTION GO:0001915 "negative regulation of T cell mediated cytotoxicity" only_in NCBITaxon:7742 "Vertebrata" :: UniProtKB Q9PXW9 Q9PXW9 GO:0001915 GO_REF:0000024 ISS UniProtKB:P03406 P protein NCBITaxon:11676 20060116 UniProtKB
INAPPROPRIATE ANNOTATION; GO TERM DEF IMPROVEMENTS? GO:0007600 "sensory perception" only_in NCBITaxon:2759 "Eukaryota" :: Swiss-Prot Q9AE24 rprY GO:0007600 GO_REF:0000029 NAS P protein NCBITaxon:817 20030605 UniProtKB
INAPPROPRIATE ANNOTATION; GO TERM DEF IMPROVEMENTS? GO:0007600 "sensory perception" only_in NCBITaxon:2759 "Eukaryota" :: Swiss-Prot Q08408 BF3865 GO:0007600 GO_REF:0000029 NAS P protein NCBITaxon:817 20030512 UniProtKB
INAPPROPRIATE ANNOTATION - ISS FROM MOUSE -> FROG GO:0001701 "in utero embryonic development" only_in NCBITaxon:40674 "Mammalia" :: TrEMBL Q6EKW1 NBS1 GO:0001701 GO_REF:0000024 ISS UniProtKB:Q9R207 P protein NCBITaxon:8355 20061204 UniProtKB
MGI (col 15 incorrect) GO:0007424 "open tracheal system development" only_in NCBITaxon:6656 "Arthropoda" :: MGI MGI:2142149 Ano1 GO:0007424 MGI:MGI:3805598|PMID:18585372 IMP P gene NCBITaxon:10090 20080917 UniProtKB






[[Category:Taxon]]

Latest revision as of 10:49, 4 September 2009

28th August 2009

I am looking through the file of errors that Chris Mungall's checking script has produced. It is called gaf-taxon-gaffes.txt and is in go/scratch/go-taxon.

Here are the things that I have found:

Annotation errors

Senescence

Human annotations to the plant senescence (GO:0010149) term. Senescence in plants is the process whereby cells die just before the shedding of an organ, whereas senescence in humans is the more general aging of cells that remain alive in the organism. We may need to make a new term for the human process, or add a synonym to the existing aging (GO:0007568) term. I have written to ask people, as I am not sure how much these processes differ, and whether we need separate terms.

  • 8 manual annotations by BHF-UCL to plant senescence. I have reported this to GOA, and they will liaise with BHF-UCL and send back any term changes that need to be made.
  • 7 RGD ISO annotations to the same term. Wrote to Simon Twigger.

organ senescence

2 manual from RGD, 1 manual from MGI, 3 IEA from Ensembl. I have written to Harold Drabkin and Simon Twigger, but Emily will pass on the information to Ensembl later.

Lactation

Mammalian reproduction in chickens: There are 6 AgBase ISS annotations to GO:0001553 "luteinization" or GO:0007595 "lactation", which are both mammalian reproductive processes. Written to Fiona McCarthy.

phycobilisome

I will come back to phycobilisome GO:0030089 later as it needs some research.

From Emily:

TAXON RULE INCORRECT. (DEF OF GO TERM INCLUDES CYANOBACTERIUM) GO:0030089 "phycobilisome" only_in NCBITaxon:2759 "Eukaryota" :: Swiss-Prot Q54714 cpcB GO:0030089 PMID:14730074 IDA C protein NCBITaxon:1148 20050613 UniProtKB

TAXON RULE INCORRECT. (DEF OF GO TERM INCLUDES CYANOBACTERIUM) GO:0030089 "phycobilisome" only_in NCBITaxon:2759 "Eukaryota" :: Swiss-Prot Q54715 cpcA GO:0030089 PMID:14730074 IDA C protein NCBITaxon:1148 20050613 UniProtKB

TAXON RULE INCORRECT. (DEF OF GO TERM INCLUDES CYANOBACTERIUM) GO:0030089 "phycobilisome" only_in NCBITaxon:2759 "Eukaryota" :: Swiss-Prot Q01951 apcA GO:0030089 PMID:14730074 IDA C protein NCBITaxon:1148 20050613 UniProtKB
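
Each line of the report appears to follow one layout: an optional curator comment, the GO term with its only_in rule, then `::` and the offending annotation. A minimal Python sketch for pulling those parts apart is below. The pattern is inferred from the lines quoted on this page, not taken from Chris's script, and `parse_gaffe_line` and its field names are made up for illustration.

```python
import re

# Layout inferred from the report lines quoted on this page; the real
# gaf-taxon-gaffes.txt may contain lines that do not fit this pattern.
RULE_RE = re.compile(
    r'(?P<comment>.*?)\s*'
    r'(?P<go_id>GO:\d{7}) "(?P<go_name>[^"]+)" '
    r'only_in (?P<rule_taxon>NCBITaxon:\d+) "(?P<rule_taxon_name>[^"]+)" '
    r':: (?P<annotation>.*)'
)


def parse_gaffe_line(line):
    """Split one report line into the violated rule and the annotation.

    Returns a dict of fields, or None if the line does not fit the layout.
    """
    match = RULE_RE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # The annotated organism is the NCBITaxon ID inside the annotation part.
    taxon = re.search(r'NCBITaxon:\d+', fields["annotation"])
    fields["annotated_taxon"] = taxon.group(0) if taxon else None
    return fields


example = (
    'TAXON RULE INCORRECT. (DEF OF GO TERM INCLUDES CYANOBACTERIUM) '
    'GO:0030089 "phycobilisome" only_in NCBITaxon:2759 "Eukaryota" :: '
    'Swiss-Prot Q01951 apcA GO:0030089 PMID:14730074 IDA C protein '
    'NCBITaxon:1148 20050613 UniProtKB'
)
parsed = parse_gaffe_line(example)
print(parsed["go_id"], parsed["rule_taxon"], parsed["annotated_taxon"])
```

Keeping the curator comment as its own field makes it easy to group lines by verdict (rule wrong versus annotation wrong) when reporting back to each annotation group.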

gastrulation with mouth forming first

67 WormBase annotations to 'gastrulation with mouth forming first' GO:0001703. All are IMP. The script objects to nematodes being on a term with only_in_taxon Protostomia. I have written to Kimberly van Auken to ask if this is right.

Reply:

Hi Jen--

Thanks for asking about these annotations.

C. elegans is actually grouped with the protostomes, which means that with
respect to this basic phylogenetic division, their gastrulation is lumped
in the 'mouth forming first' category.

This categorization of gastrulation is probably a bit of an
oversimplification, as it's really the middle, or gut, that forms first in
elegans.  Nonetheless, I imagine we chose GO:0001703 'gastrulation with
mouth forming first' as the GO term for these IMP annotations to be sure
we annotated our genes to the 'protostome' gastrulation term so the
elegans annotations would be annotated to the same term as genes involved
in protostome gastrulation in other species.

Hope that rationale makes sense.

Cheers,
--Kimberly

I have then checked and this is the lineage of C. elegans in NCBI:

cellular organisms; Eukaryota; Fungi/Metazoa group; Metazoa; Eumetazoa; Bilateria; Pseudocoelomata; Nematoda; Chromadorea; Rhabditida; Rhabditoidea; Rhabditidae; Peloderinae; Caenorhabditis
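The test the script applies to this lineage can be sketched as a simple membership check. This is only an illustration: it compares taxon names taken from the lineage string above, whereas the real script presumably works on NCBITaxon IDs, and the function names are invented here.

```python
# Lineage string as given by NCBI for C. elegans (copied from the line above).
LINEAGE = (
    "cellular organisms; Eukaryota; Fungi/Metazoa group; Metazoa; Eumetazoa; "
    "Bilateria; Pseudocoelomata; Nematoda; Chromadorea; Rhabditida; "
    "Rhabditoidea; Rhabditidae; Peloderinae; Caenorhabditis"
)


def lineage_names(lineage: str) -> list:
    """Split a semicolon-separated lineage into clean taxon names."""
    return [name.strip() for name in lineage.split(";") if name.strip()]


def is_under(lineage: str, taxon: str) -> bool:
    """True if `taxon` appears anywhere in the lineage."""
    return taxon in lineage_names(lineage)


print(is_under(LINEAGE, "Protostomia"))  # False
print(is_under(LINEAGE, "Bilateria"))    # True
```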

C. elegans doesn't seem to be under Protostomia, so I sent this reply to Kimberly:

Hi Kimberly,

Thanks for getting back to me so swiftly.

How do you mean 'grouped with Protostomes'? It seems not to be under
protostomes in the NCBI classification. Should I be marking this
go term as intended for use with protostomes and some other taxonomic grouping as well?
 
Thanks,

Jen 

reply:

Hi Jen,

In the past, I think the prevailing view was that nematodes didn't neatly
fit into either the protostome or deuterostome classification, so they
were kind of off on their own as 'pseudocoelomates'.  More recently,
however,  a newer classification grouped nematodes, along with arthropods,
as Ecdysozoa, a protostome superclade.  This more recent classification is
what we've been following with these annotations.

Cheers,
--Kimberly

I have written to suggest that we come up with a union_of taxon term that covers all the needed groups.
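A rough sketch of how a union_of taxon term could behave in the check: the annotation passes if the organism's lineage contains any member of the union. The members chosen below (Protostomia plus Nematoda) are only placeholders for illustration; the actual groups still have to be agreed.

```python
def satisfies_only_in(lineage, allowed):
    """True if at least one taxon in `allowed` appears in the organism's lineage."""
    return any(taxon in lineage for taxon in allowed)


# C. elegans lineage as listed in NCBI (names only).
c_elegans = [
    "cellular organisms", "Eukaryota", "Fungi/Metazoa group", "Metazoa",
    "Eumetazoa", "Bilateria", "Pseudocoelomata", "Nematoda", "Chromadorea",
    "Rhabditida", "Rhabditoidea", "Rhabditidae", "Peloderinae", "Caenorhabditis",
]

# Hypothetical union term; its membership is an assumption, not an agreed rule.
PROTOSTOMIA_OR_NEMATODA = frozenset({"Protostomia", "Nematoda"})

print(satisfies_only_in(c_elegans, {"Protostomia"}))        # False
print(satisfies_only_in(c_elegans, PROTOSTOMIA_OR_NEMATODA))  # True
```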

nucleus

14 EcoCyc IEA annotations of E. coli gene products to 'nucleus' GO:0005634. Written to Jim Hu to ask about these.

Peptidoglycan biosynthetic process

3 IEA from TAIR to the bacterial peptidoglycan biosynthetic process, instead of to another similar-sounding term.

The def of this term has been improved to be much more comprehensive:

def: The chemical reactions and pathways resulting in the formation of peptidoglycans, any of a class of glycoconjugates found in bacterial cell walls. Peptidoglycans are polymers consisting of sugars and amino acids. The sugar component consists of alternating residues of β-(1,4) linked N-acetylglucosamine and N-acetylmuramic acid residues. Attached to the N-acetylmuramic acid is a peptide chain of three to five amino acids. The peptide chain can be cross-linked to the peptide chain of another strand forming the 3D mesh-like layer.

TAIR GO:0009252 "peptidoglycan biosynthetic process" only_in NCBITaxon:2 "Bacteria" :: TAIR gene:2097647 AT3G08840.1 GO:0019047 PMID:16291214 EXP P protein NCBITaxon:11694

TAIR GO:0009252 "peptidoglycan biosynthetic process" only_in NCBITaxon:2 "Bacteria" :: TAIR gene:1006228677 AT3G08840.2 GO:0005758 PMID:10620603 EXP C protein NCBITaxon:11683

TAIR GO:0009252 "peptidoglycan biosynthetic process" only_in NCBITaxon:2 "Bacteria" :: TAIR gene:1006228676 AT3G08840.3 GO:0019047 PMID:16291214 EXP P protein NCBITaxon:11683


Ontology errors

sensory perception

Sensory perception GO:0007600 seems to be much less granular than its parent GO:0050890 cognition

    • I need a taxon link for neurological system process, which is the ancestor that indicates involvement of the brain. TODO
    • I need to add the cognitive part to the def of sensory perception. DONE

sensory perception
old def: The series of events required for an organism to receive a sensory stimulus, convert it to a molecular signal, and recognize and characterize the signal.

new def: The series of events required for an organism to receive a sensory stimulus, convert it to a molecular signal, and recognize and characterize the signal. This is a neurological process.

cognition
def: The operation of the mind by which an organism becomes aware of objects of thought or perception; it includes the mental activities associated with thinking, learning, and memory.

nutrient reservoir activity

I have written to Tanya and Donghui about this term, as it has lots of TAIR annotations but an Arthropoda taxon link. I'm hoping they'll be able to shed light on what it means so we can improve the def and correct the taxon link.

GO:0045735
nutrient reservoir activity
def: Functions in the storage of nutritious substrates.

TAXON RULE INCORRECT. (annotation to a major plant storage protein) GO:0045735 "nutrient reservoir activity" only_in NCBITaxon:88770 "Panarthropoda" :: Swiss-Prot Q9XHP1 Q9XHP1 GO:0045735 PMID:10606554 NAS F protein NCBITaxon:4182 20051104 UniProtKB

TAXON RULE INCORRECT. (annotation to a major plant storage protein) GO:0045735 "nutrient reservoir activity" only_in NCBITaxon:88770 "Panarthropoda" :: Swiss-Prot Q9SPL3 AMP2-3 GO:0045735 PMID:10571855 NAS F protein NCBITaxon:60698 20060904 UniProtKB

TAXON RULE INCORRECT. (annotation to a major plant storage protein) GO:0045735 "nutrient reservoir activity" only_in NCBITaxon:88770 "Panarthropoda" :: Swiss-Prot Q9XHP0 Q9XHP0 GO:0045735 PMID:10606554 NAS F protein NCBITaxon:4182 20051104 UniProtKB

TAXON RULE INCORRECT. (annotation to a major plant storage protein) GO:0045735 "nutrient reservoir activity" only_in NCBITaxon:88770 "Panarthropoda" :: Swiss-Prot Q39649 Q39649 GO:0045735 PMID:8275099 TAS F protein NCBITaxon:3661 20050516 UniProtKB

Multi-organism process

Emily's comment:

TAXON RULE INCORRECT THIS TERM SHOULD INCLUDE VIRUSES GO:0019064 "viral envelope fusion with host membrane" only_in NCBITaxon:131567 "cellular organisms" :: Swiss-Prot P36334 S GO:0019064 PMID:11162792 TAS P protein NCBITaxon:31631 20030508 UniProtKB

viral envelope fusion with host membrane: How do I stop this term from inheriting the restriction to cellular organisms?

I could make a link for the term that is only_in_taxon viruses, as that is what the term is for, but it seems as if there is some principle of multi-organism process terms showing here. I wonder if we can solve this using internal cross products?
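To make the inheritance problem concrete, here is a small sketch of how a term picks up only_in restrictions from its ancestors. The graph is a toy: the parent and ancestor names are placeholders rather than real GO terms, and the function name is invented for illustration.

```python
def inherited_only_in(term, parents, only_in):
    """Return {term_or_ancestor: taxon} for every only_in link reachable from `term`."""
    found = {}
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # a term can be reached by more than one path
        seen.add(current)
        if current in only_in:
            found[current] = only_in[current]
        stack.extend(parents.get(current, ()))
    return found


# Toy graph: "placeholder parent" and "placeholder ancestor" are NOT real GO terms.
parents = {
    "viral envelope fusion with host membrane": ["placeholder parent"],
    "placeholder parent": ["placeholder ancestor"],
}
only_in = {"placeholder ancestor": "cellular organisms"}

print(inherited_only_in("viral envelope fusion with host membrane", parents, only_in))
# {'placeholder ancestor': 'cellular organisms'}
```

Because the restriction is collected from every ancestor, adding an only_in viruses link directly on the child would not remove the inherited one; the two would simply conflict.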

Photoprotection

Emily's comment:

TAXON RULE INCORRECT GO:0010117 "photoprotection" only_in NCBITaxon:33090 "Viridiplantae" :: Swiss-Prot P83690 P83690 GO:0010117 PMID:22511991 TAS P protein NCBITaxon:105610 20041119 UniProtKB (PMID: 12595737:"Reef-building corals contain fluorescent pigments termed pocilloporins that function by regulating the light environment of coral and acting as a photoprotectant in excessive sunlight.)"

This taxon rule is actually okay, but the definition of the term is very woolly and needs to be improved. This term was intended for the plant processes, but these are quite diverse, and should really be covered by several terms. There will need to be a new term for the coral process. There could perhaps be a broad grouping term above all of these, and I think that the annotator assumed that that was what this term was when annotating the coral protein. I have requested a pdf from the library for an article that will enable creation of these terms:

http://arjournals.annualreviews.org/doi/abs/10.1146%2Fannurev.arplant.50.1.333

Euglena

Comments from Emily:

TAXON RULE INCORRECT EUGLENA CONTAIN CHLOROPLASTS AND cytochrome b6f complex GO:0009775 "photosynthetic electron transport in cytochrome b6/f" only_in ID:0000000 "Viridiplantae or Bacteria" :: Swiss-Prot P31480 petB GO:0009775 PMID:12837550 NAS P protein NCBITaxon:3039 20031120 UniProtKB

TAXON RULE INCORRECT EUGLENA CONTAIN CHLOROPLASTS GO:0009507 "chloroplast" only_in NCBITaxon:33090 "Viridiplantae" :: Swiss-Prot P83687 PSBP GO:0009507 PMID:12837550 IDA C protein NCBITaxon:3039 20031120 UniProtKB

TAXON RULE INCORRECT EUGLENA CONTAIN CHLOROPLASTS GO:0009507 "chloroplast" only_in NCBITaxon:33090 "Viridiplantae" :: Swiss-Prot Q8GZR2 petA GO:0009507 PMID:12837550 IDA C protein NCBITaxon:3039 20031120 UniProtKB

TAXON RULE INCORRECT EUGLENA CONTAIN CHLOROPLASTS GO:0009507 "chloroplast" only_in NCBITaxon:33090 "Viridiplantae" :: Swiss-Prot P83688 P83688 GO:0009507 PMID:12837550 IDA C protein NCBITaxon:3039 20031120 UniProtKB

TAXON RULE INCORRECT EUGLENA CONTAIN CHLOROPLASTS AND cytochrome b6f complex GO:0009512 "cytochrome b6f complex" only_in JD:0000002 "Viridiplantae or Cyanobacteria" :: Swiss-Prot Q84TU6 petD GO:0009512 PMID:12837550 IDA C protein NCBITaxon:3039 20040507 UniProtKB

TAXON RULE INCORRECT EUGLENA CONTAIN CHLOROPLASTS AND cytochrome b6f complex GO:0009512 "cytochrome b6f complex" only_in JD:0000002 "Viridiplantae or Cyanobacteria" :: Swiss-Prot P31480 petB GO:0009512 PMID:12837550 IDA C protein NCBITaxon:3039 20031120 UniProtKB

I am going to need to make changes to the taxon links to take account of Euglena. Apparently it is thought to have been a cell that acquired an endosymbiotic green algal cell, and contains chloroplasts similar to those in algae. http://euglena.msu.edu/chloroplast_mito.shtml

Need links to show that chloroplasts are in Viridiplantae or Euglena.

I will do this later, as it may be a big job.

DNA replication

Emily's comment:

TAXON RULE INCORRECT: GO:0006260 "DNA replication" only_in NCBITaxon:131567 "cellular organisms" :: TrEMBL O00529 O00529 GO:0006260 GO_REF:0000029 NAS P protein NCBITaxon:333760 20021106 UniProtKB (ATP-dependent DNA helicase required for initiation of viral DNA replication. It forms a complex with the viral E2 protein. The E1-E2 complex binds to the replication origin which contains binding sites for both proteins)

I discussed this with Jane and we decided to rename DNA replication to be cellular DNA replication and make a new generic parent that would be okay for annotation of viruses. There was a whole string of ancestors of DNA replication that were not named or defined as cellular but were logically defined as cellular by their position in the graph. I have renamed all of these to be the cellular version and made generic parents. The before and after graphs are shown below.

Before:

After:

photoreceptor cell morphogenesis

This term was developed for use with Drosophila melanogaster but has annotations as follows:

MGI GO:0008594 "photoreceptor cell morphogenesis" only_in NCBITaxon:50557 "Insecta" :: MGI MGI:1920910 Cabp4 GO:0009297 PMID:16729895 RCA TIGR_CMR:PSPTO0926 P protein NCBITaxon:208964

MGI GO:0008594 "photoreceptor cell morphogenesis" only_in NCBITaxon:50557 "Insecta" :: MGI MGI:1277215 Bbs1 GO:0009297 PMID:94049125 IDA P protein NCBITaxon:208964

MGI GO:0008594 "photoreceptor cell morphogenesis" only_in NCBITaxon:50557 "Insecta" :: MGI MGI:1345146 Grk1 GO:0019047 PMID:16291214 EXP P protein NCBITaxon:11695

Ensembl GO:0008594 "photoreceptor cell morphogenesis" only_in NCBITaxon:50557 "Insecta" :: RGD 619712 Grk1 GO:0005758 PMID:10620603 EXP C protein NCBITaxon:11686

RGD GO:0008594 "photoreceptor cell morphogenesis" only_in NCBITaxon:50557 "Insecta" :: RGD 1306083 Cabp4 GO:0019047 PMID:16291214 EXP P protein NCBITaxon:11686

RGD GO:0008594 "photoreceptor cell morphogenesis" only_in NCBITaxon:50557 "Insecta" :: RGD 1307581 Bbs1 GO:0019047 PMID:16291214 EXP P protein NCBITaxon:11689

RGD GO:0008594 "photoreceptor cell morphogenesis" only_in NCBITaxon:50557 "Insecta" :: RGD 619712 Grk1 GO:0019047 PMID:16291214 EXP P protein NCBITaxon:362651

Ensembl GO:0008594 "photoreceptor cell morphogenesis" only_in NCBITaxon:50557 "Insecta" :: Swiss-Prot P28327 GRK1 GO:0019047 PMID:16291214 EXP P protein NCBITaxon:11703

Ensembl GO:0008594 "photoreceptor cell morphogenesis" only_in NCBITaxon:50557 "Insecta" :: Swiss-Prot Q8HZJ4 CABP4 GO:0019047 PMID:16291214 EXP P protein NCBITaxon:11692

Ensembl GO:0008594 "photoreceptor cell morphogenesis" only_in NCBITaxon:50557 "Insecta" :: Swiss-Prot Q8NFJ9-2 BBS1 GO:0019047 PMID:16291214 EXP P protein NCBITaxon:11688

Ensembl GO:0008594 "photoreceptor cell morphogenesis" only_in NCBITaxon:50557 "Insecta" :: Swiss-Prot P57796-2 CABP4 GO:0019047 PMID:16291214 EXP P protein NCBITaxon:11678

Ensembl GO:0008594 "photoreceptor cell morphogenesis" only_in NCBITaxon:50557 "Insecta" :: Swiss-Prot P57796 CABP4 GO:0019047 PMID:16291214 EXP P protein NCBITaxon:11699

Ensembl GO:0008594 "photoreceptor cell morphogenesis" only_in NCBITaxon:50557 "Insecta" :: Swiss-Prot Q8NFJ9 BBS1 GO:0019047 PMID:16291214 EXP P protein NCBITaxon:11689

Ensembl GO:0008594 "photoreceptor cell morphogenesis" only_in NCBITaxon:50557 "Insecta" :: Swiss-Prot Q15835 GRK1 GO:0019047 PMID:16291214 EXP P protein NCBITaxon:11692

8 EXP from Ensembl
1 RCA, 1 EXP and 1 IDA from MGI
3 EXP from RGD
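The tally above can be reproduced from the report lines with a short script. This is a sketch under two assumptions: each line starts with the name of the flagged source (as in the list above), and the evidence code is the first recognised code after the GO ID on the right-hand side of `::`. The function name and the sample list are illustrative only.

```python
from collections import Counter

# GO evidence codes to look for in the annotation fields.
EVIDENCE_CODES = {
    "EXP", "IDA", "IMP", "IGI", "IEP", "IPI",
    "ISS", "ISO", "RCA", "TAS", "NAS", "IC", "ND", "IEA",
}


def tally(lines):
    """Count report lines per (source, evidence code) pair."""
    counts = Counter()
    for line in lines:
        head, sep, tail = line.partition(" :: ")
        if not sep or not head.split():
            continue  # not a report line
        source = head.split()[0]
        fields = tail.split()
        # Only look after the annotation's GO ID, so gene symbols are skipped.
        go_at = next((i for i, f in enumerate(fields) if f.startswith("GO:")), None)
        if go_at is None:
            continue
        evidence = next((f for f in fields[go_at + 1:] if f in EVIDENCE_CODES), None)
        if evidence is not None:
            counts[(source, evidence)] += 1
    return counts


# Three of the lines listed above, copied verbatim.
sample = [
    'MGI GO:0008594 "photoreceptor cell morphogenesis" only_in NCBITaxon:50557 "Insecta" :: MGI MGI:1920910 Cabp4 GO:0009297 PMID:16729895 RCA TIGR_CMR:PSPTO0926 P protein NCBITaxon:208964',
    'MGI GO:0008594 "photoreceptor cell morphogenesis" only_in NCBITaxon:50557 "Insecta" :: MGI MGI:1277215 Bbs1 GO:0009297 PMID:94049125 IDA P protein NCBITaxon:208964',
    'Ensembl GO:0008594 "photoreceptor cell morphogenesis" only_in NCBITaxon:50557 "Insecta" :: RGD 619712 Grk1 GO:0005758 PMID:10620603 EXP C protein NCBITaxon:11686',
]

for (source, evidence), n in sorted(tally(sample).items()):
    print(n, evidence, "from", source)
```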

The term is fairly ambiguously defined. I need to ask Becky Foulger if it is appropriate to use the same term for all kinds of photoreceptor cell. She is away, but I will email and wait for a response.


System problems

Taxon hierarchy problems

THIS ORGANISM IS A CHILD OF EUKARYOTA; SOMETHING MISSING IN TAXON HIERARCHY? GO:0009407 "toxin catabolic process" only_in NCBITaxon:131567 "cellular organisms" :: Swiss-Prot P86101 P86101 GO:0009407 PMID:18599388 IDA P protein NCBITaxon:569175 20081222 UniProtKB


FROM SGD ANNOTATION: AND YEAST DO HAVE THESE; PMID: 15103330. SOMETHING MISSING IN TAXON HIERARCHY? GO:0016226 "iron-sulfur cluster assembly" only_in NCBITaxon:131567 "cellular organisms" :: TrEMBL B9W8S4 CD36_08500 GO:0016226 GO_REF:0000024 ISS UniProtKB:P23503 P protein NCBITaxon:573826 20090714 UniProtKB

Chris responds: Looks like I was using a slightly out of date taxonomy, hence the false positives for bivalves, Candida strains etc. Will try and fix.

UniProt Comments

Comments from Emily:

nucleolar ribonuclease P complex

FIXED: Multimeric ribonuclease P complex no longer has any taxon link.

TAXON RULE INCORRECT (see GO term def: A ribonuclease P complex located in the nucleolus of a eukaryotic cell, and PMID:8918471) GO:0005655 "nucleolar ribonuclease P complex" only_in NCBITaxon:2157 "Archaea" :: Swiss-Prot Q99575 POP1 GO:0005655 PMID:8918471 IDA C protein NCBITaxon:9606 20010813 UniProtKB


gamma-tubulin ring complex

FIXED: Only_in_taxon link from gamma-tubulin large complex to Fungi removed.

TAXON RULE INCORRECT GO:0008274 "gamma-tubulin ring complex" only_in NCBITaxon:4751 "Fungi" :: Swiss-Prot Q95K09 TUBGCP5 GO:0008274 PMID:11694571 ISS UniProtKB:Q96RT8 C protein NCBITaxon:9541 20041006 UniProtKB

TAXON RULE INCORRECT GO:0008274 "gamma-tubulin ring complex" only_in NCBITaxon:4751 "Fungi" :: Swiss-Prot Q9GKK8 BRCA1 GO:0008274 PMID:12214252 NAS C protein NCBITaxon:9598 20021218 UniProtKB

GO:0008274 "gamma-tubulin ring complex" only_in NCBITaxon:4751 "Fungi" :: Swiss-Prot Q96RT7 TUBGCP6 GO:0008274 PMID:11694571 IDA C protein NCBITaxon:9606 20021106 UniProtKB

GO:0008274 "gamma-tubulin ring complex" only_in NCBITaxon:4751 "Fungi" :: Swiss-Prot P38398 BRCA1 GO:0008274 PMID:12214252 NAS C protein NCBITaxon:9606 20021023 UniProtKB

GO:0008274 "gamma-tubulin ring complex" only_in NCBITaxon:4751 "Fungi" :: Swiss-Prot Q96RT8 TUBGCP5 GO:0008274 PMID:11694571 IDA C protein NCBITaxon:9606 20021106 UniProtKB

GO:0008274 "gamma-tubulin ring complex" only_in NCBITaxon:4751 "Fungi" :: Swiss-Prot Q9UGJ1 TUBGCP4 GO:0008274 PMID:10562286 NAS C protein NCBITaxon:9606 20021218 UniProtKB

odontogenesis of dentine-containing tooth

FIXED

TAXON RULE INCORRECT. GO:0042475 "odontogenesis of dentine-containing tooth" only_in NCBITaxon:33317 "Protostomia" :: MGI MGI:104659 Dll1 GO:0042475 MGI:MGI:1327472|PMID:9882480 NAS P gene NCBITaxon:10090 20021218 UniProtKB

TAXON RULE INCORRECT. GO:0042476 "odontogenesis" only_in NCBITaxon:33317 "Protostomia" :: MGI MGI:1914505 Zfp422 GO:0042476 MGI:MGI:2449683|PMID:12489153 NAS P gene NCBITaxon:10090 20040226 UniProtKB

TAXON RULE INCORRECT. GO:0042476 "odontogenesis" only_in NCBITaxon:33317 "Protostomia" :: RGD 620229 Zfp422 GO:0042476 RGD:14706453 NAS P gene NCBITaxon:10116 20040226 UniProtKB

GO:0042476 "odontogenesis" only_in NCBITaxon:33317 "Protostomia" :: Swiss-Prot P17026 ZNF22 GO:0042476 PMID:14630903 NAS P protein NCBITaxon:9606 20040226 UniProtKB

GO:0042476 "odontogenesis" only_in NCBITaxon:33317 "Protostomia" :: Swiss-Prot Q9NNX1 TUFT1 GO:0042476 PMID:12489194 NAS P protein NCBITaxon:9606 20030115 UniProtKB

GO:0042476 "odontogenesis" only_in NCBITaxon:33317 "Protostomia" :: Swiss-Prot P02818 BGLAP GO:0042476 PMID:11856645 NAS P protein NCBITaxon:9606 20030114 UniProtKB

GO:0042475 "odontogenesis of dentine-containing tooth" only_in NCBITaxon:33317 "Protostomia" :: Swiss-Prot Q12948 FOXC1 GO:0042475 PMID:12614756 IMP P protein NCBITaxon:9606 20061201 UniProtKB

GO:0042476 "odontogenesis" only_in NCBITaxon:33317 "Protostomia" :: Swiss-Prot Q6W2J9 BCOR GO:0042476 PMID:17517692 IMP P protein NCBITaxon:9606 20081114 UniProtKB

GO:0042476 "odontogenesis" only_in NCBITaxon:33317 "Protostomia" :: Swiss-Prot Q9NRM1 ENAM GO:0042476 GO_REF:0000029 NAS P protein NCBITaxon:9606 20030113 UniProtKB

GO:0042476 "odontogenesis" only_in NCBITaxon:33317 "Protostomia" :: Swiss-Prot P08123 COL1A2 GO:0042476 PMID:1740554 NAS P protein NCBITaxon:9606 20080918 UniProtKB

GO:0042476 "odontogenesis" only_in NCBITaxon:33317 "Protostomia" :: Swiss-Prot P07942 LAMB1 GO:0042476 PMID:15894315 IDA P protein NCBITaxon:9606 20060815 UniProtKB

GO:0042475 "odontogenesis of dentine-containing tooth" only_in NCBITaxon:33317 "Protostomia" :: Swiss-Prot P10070 GLI2 GO:0042475 GO_REF:0000024 ISS UniProtKB:Q0VGT2 P protein NCBITaxon:9606 20070125 UniProtKB

Annotation problems found and fixed by UniProt

INAPPROPRIATE ANNOTATION; (replaced with GO:0006948 induction by virus of host cell-cell fusion) GO:0000747 "conjugation with cellular fusion" only_in NCBITaxon:4751 "Fungi" :: Swiss-Prot P01730 CD4 GO:0000747 PMID:9166430 IDA P protein NCBITaxon:9606 20060110 UniProtKB

INAPPROPRIATE ANNOTATION; INAPPROPRIATE ISS TRANSFER GO:0007424 "open tracheal system development" only_in NCBITaxon:6656 "Arthropoda" :: Swiss-Prot Q5XXA6 ANO1 GO:0007424 GO_REF:0000024 ISS UniProtKB:Q8BHY3 P protein NCBITaxon:9606 20080917 UniProtKB

INAPPROPRIATE ANNOTATION; GO TERM DEF IMPROVEMENTS? GO:0048222 "glycoprotein network" only_in NCBITaxon:33090 "Viridiplantae" :: Swiss-Prot Q8IX30 SCUBE3 GO:0048222 PMID:15234972 NAS C protein NCBITaxon:9606 20060904 UniProtKB

INCORRECT ANNOTATION -INAPPROPRIATE ISS TRANSFER GO:0005739 "mitochondrion" only_in NCBITaxon:2759 "Eukaryota" :: Swiss-Prot O34708 yflA GO:0005739 GO_REF:0000024 ISS UniProtKB:Q9WVL0 C protein NCBITaxon:1423 20041006 UniProtKB

INCORRECT ANNOTATION -HOST TERM NEEDED GO:0005783 "endoplasmic reticulum" only_in NCBITaxon:2759 "Eukaryota" :: Swiss-Prot P03247 E1B GO:0005783 PMID:10381623 IDA C protein NCBITaxon:10515 20060314 UniProtKB

INCORRECT ANNOTATION -HOST TERM NEEDED GO:0005783 "endoplasmic reticulum" only_in NCBITaxon:2759 "Eukaryota" :: Swiss-Prot P03247 E1B GO:0005783 PMID:7954800 IDA C protein NCBITaxon:10515 20060214 UniProtKB

INCORRECT ANNOTATION - GO:0009252 "peptidoglycan biosynthetic process" only_in NCBITaxon:2 "Bacteria" :: Swiss-Prot P83944 P83944 GO:0009252 PMID:12431412 IDA P protein NCBITaxon:6687 20040526 UniProtKB

INCORRECT ANNOTATION - INAPPROPRIATE ISS TRANSFER GO:0001541 "ovarian follicle development" only_in NCBITaxon:32523 "Tetrapoda" :: Swiss-Prot Q90511 eif2b2 GO:0001541 GO_REF:0000024 ISS UniProtKB:P49770 P protein NCBITaxon:31033 20060905 UniProtKB

INCORRECT ANNOTATION -NEED NEW TERM TO DESCRIBE DEFENCE/ATTACK MECHANISM; IF APPROPRIATE. GO:0043303 "mast cell degranulation" only_in NCBITaxon:131567 "cellular organisms" :: Swiss-Prot P86170 P86170 GO:0043303 PMID:18942691 IDA P protein NCBITaxon:582867 20081222 UniProtKB

INCORRECT ANNOTATION - SHOULD HAVE BEEN EXTRACELLULAR MATRIX. GO:0005578 "proteinaceous extracellular matrix" only_in NCBITaxon:33208 "Metazoa" :: Swiss-Prot Q06100 Cgl1 GO:0005578 PMID:10931889 IDA C protein NCBITaxon:5346 20040723 UniProtKB

INCORRECT ANNOTATION - INAPPROPRIATE ISS TRANSFER GO:0001701 "in utero embryonic development" only_in NCBITaxon:40674 "Mammalia" :: TrEMBL Q641A9 LOC403373 GO:0001701 GO_REF:0000024 ISS UniProtKB:Q9R207 P protein NCBITaxon:8355 20061204 UniProtKB

INCORRECT ANNOTATION - INAPPROPRIATE ISS TRANSFER GO:0005634 "nucleus" only_in NCBITaxon:2759 "Eukaryota" :: TrEMBL Q9J190 gag GO:0005634 PMID:11322894 ISS UniProtKB:Q9UBS8 C protein NCBITaxon:11676 20041006 UniProtKB

INCORRECT ANNOTATION - WRONG AC USED (replaced_by O88869). GO:0005768 "endosome" only_in NCBITaxon:2759 "Eukaryota" :: TrEMBL Q88869 Q88869 GO:0005768 PMID:9837933 IDA C protein NCBITaxon:11023 20031216 UniProtKB

INCORRECT ANNOTATION - WRONG AC USED (replaced_by O88869). GO:0016197 "endosome transport" only_in NCBITaxon:131567 "cellular organisms" :: TrEMBL Q88869 Q88869 GO:0016197 PMID:9837933 NAS P protein NCBITaxon:11023 20031216 UniProtKB


INCORRECT ANNOTATION - INAPPROPRIATE ISS TRANSFER GO:0001541 "ovarian follicle development" only_in NCBITaxon:32523 "Tetrapoda" :: TrEMBL C0H8R1 EI2BE GO:0001541 GO_REF:0000024 ISS UniProtKB:Q13144 P protein NCBITaxon:8030 20090514 UniProtKB

INAPPROPRIATE ANNOTATION - WRONG AC USED. GO:0007517 "muscle organ development" only_in NCBITaxon:2759 "Eukaryota" :: TrEMBL P84067 tthHB8IM GO:0007517 GO_REF:0000029 NAS P protein NCBITaxon:274 20040707 UniProtKB

INAPPROPRIATE ANNOTATION - WRONG AC USED. GO:0007517 "muscle organ development" only_in NCBITaxon:2759 "Eukaryota" :: TrEMBL P84068 tthHB8IM GO:0007517 GO_REF:0000029 NAS P protein NCBITaxon:274 20040707 UniProtKB

GO:0001701 "in utero embryonic development" only_in NCBITaxon:40674 "Mammalia" :: TrEMBL Q6XV80 nbn GO:0001701 GO_REF:0000024 ISS UniProtKB:Q9R207 P protein NCBITaxon:8355 20061204 UniProtKB


INAPPROPRIATE TERM DUE TO HOST/PATHOGEN INTERACTION GO:0001915 "negative regulation of T cell mediated cytotoxicity" only_in NCBITaxon:7742 "Vertebrata" :: TrEMBL Q90VU7 nef GO:0001915 GO_REF:0000024 ISS UniProtKB:P03406 P protein NCBITaxon:11676 20060116 UniProtKB should have been: GO:0052085 negative regulation by symbiont of host T-cell mediated immune response

INAPPROPRIATE (ISS) ANNOTATION - TERM DUE TO HOST/PATHOGEN INTERACTION GO:0001915 "negative regulation of T cell mediated cytotoxicity" only_in NCBITaxon:7742 "Vertebrata" :: TrEMBL O40177 nef GO:0001915 GO_REF:0000024 ISS UniProtKB:P03406 P protein NCBITaxon:11676 20060116 UniProtKB

INAPPROPRIATE ANNOTATION - TERM DUE TO HOST/PATHOGEN INTERACTION GO:0001915 "negative regulation of T cell mediated cytotoxicity" only_in NCBITaxon:7742 "Vertebrata" :: TrEMBL Q71VG3 nef GO:0001915 GO_REF:0000024 ISS UniProtKB:P03406 P protein NCBITaxon:11676 20060116 UniProtKB

INAPPROPRIATE ANNOTATION - TERM DUE TO HOST/PATHOGEN INTERACTION GO:0001915 "negative regulation of T cell mediated cytotoxicity" only_in NCBITaxon:7742 "Vertebrata" :: TrEMBL Q9WM24 nef GO:0001915 GO_REF:0000024 ISS UniProtKB:P03406 P protein NCBITaxon:11676 20060116 UniProtKB

INAPPROPRIATE ANNOTATION - TERM DUE TO HOST/PATHOGEN INTERACTION GO:0001915 "negative regulation of T cell mediated cytotoxicity" only_in NCBITaxon:7742 "Vertebrata" :: TrEMBL Q9WLM4 nef GO:0001915 GO_REF:0000024 ISS UniProtKB:P03406 P protein NCBITaxon:11676 20060116 UniProtKB

INAPPROPRIATE ANNOTATION - TERM DUE TO HOST/PATHOGEN INTERACTION GO:0001915 "negative regulation of T cell mediated cytotoxicity" only_in NCBITaxon:7742 "Vertebrata" :: Swiss-Prot Q70627 nef GO:0001915 GO_REF:0000024 ISS UniProtKB:P03406 P protein NCBITaxon:82834 20060116 UniProtKB

INAPPROPRIATE ANNOTATION - TERM DUE TO HOST/PATHOGEN INTERACTION GO:0001915 "negative regulation of T cell mediated cytotoxicity" only_in NCBITaxon:7742 "Vertebrata" :: Swiss-Prot P03406 nef GO:0001915 PMID:11689886 IDA P protein NCBITaxon:11686 20060113 UniProtKB

INAPPROPRIATE ANNOTATION - TERM DUE TO HOST/PATHOGEN INTERACTION GO:0001915 "negative regulation of T cell mediated cytotoxicity" only_in NCBITaxon:7742 "Vertebrata" :: Swiss-Prot P03404 nef GO:0001915 GO_REF:0000024 ISS UniProtKB:P03406 P protein NCBITaxon:11678 20061106 UniProtKB

INAPPROPRIATE ANNOTATION - TERM DUE TO HOST/PATHOGEN INTERACTION GO:0001915 "negative regulation of T cell mediated cytotoxicity" only_in NCBITaxon:7742 "Vertebrata" :: Swiss-Prot P04324 nef GO:0001915 GO_REF:0000024 ISS UniProtKB:P03406 P protein NCBITaxon:11679 20060116 UniProtKB

INAPPROPRIATE ANNOTATION - TERM DUE TO HOST/PATHOGEN INTERACTION GO:0001915 "negative regulation of T cell mediated cytotoxicity" only_in NCBITaxon:7742 "Vertebrata" :: UniProtKB Q9PXW9 Q9PXW9 GO:0001915 GO_REF:0000024 ISS UniProtKB:P03406 P protein NCBITaxon:11676 20060116 UniProtKB

INAPPROPRIATE ANNOTATION; GO TERM DEF IMPROVEMENTS? GO:0007600 "sensory perception" only_in NCBITaxon:2759 "Eukaryota" :: Swiss-Prot Q9AE24 rprY GO:0007600 GO_REF:0000029 NAS P protein NCBITaxon:817 20030605 UniProtKB

INAPPROPRIATE ANNOTATION; GO TERM DEF IMPROVEMENTS? GO:0007600 "sensory perception" only_in NCBITaxon:2759 "Eukaryota" :: Swiss-Prot Q08408 BF3865 GO:0007600 GO_REF:0000029 NAS P protein NCBITaxon:817 20030512 UniProtKB

INAPPROPRIATE ANNOTATION - ISS FROM MOUSE -> FROG GO:0001701 "in utero embryonic development" only_in NCBITaxon:40674 "Mammalia" :: TrEMBL Q6EKW1 NBS1 GO:0001701 GO_REF:0000024 ISS UniProtKB:Q9R207 P protein NCBITaxon:8355 20061204 UniProtKB

MGI (col 15 incorrect) GO:0007424 "open tracheal system development" only_in NCBITaxon:6656 "Arthropoda" :: MGI MGI:2142149 Ano1 GO:0007424 MGI:MGI:3805598|PMID:18585372 IMP P gene NCBITaxon:10090 20080917 UniProtKB