Jamboree 18 July 2008 - minutes

Present: Pascale, Jim, David, Victoria, Petra, Tanya, Debby, Michael, Suzi, Stacia, Dmitri, Alex, Kimberley, Emily, Rachael, Varsha, Ruth, Becky Foulger, Michele Magrane.

Discussion on GCH1

1. (Debby) FolE in E. coli is a decamer. Should it be annotated using protein homooligomerisation? Or with contributes_to? The active site is in the oligomer, not in the individual subunits.

(Michael) Homooligomerization is not an appropriate annotation here; it is only appropriate if the gene product is a chaperone helping the formation of a complex. Should annotate to the relevant protein complex term.

(Debby) Protein homooligomerization is a process, Def: The process of creating protein oligomers, compounds composed of a small number, usually between three and ten, of identical component monomers. Oligomers may be formed by the polymerization of a number of monomers or the depolymerization of a large protein polymer.

(Michael) This should be discussed offline. The annotation should not be attached to a protein which is itself oligomerized, but rather to those which help the formation of a complex. The definition(s) should be changed and each annotation using these terms should be checked.

(Pascale) Also include ‘protein complex assembly’

(David) CC assembly terms are on the list to be discussed.

ACTION ITEM 1. (Debby) Put in a SourceForge request that definitions of terms for oligomerization, dimerization, protein complex assembly etc. should be clarified as to their use and annotations using these terms be checked and, if necessary, changed.

2. (Debby) The InterPro IEA annotations for FolE are incorrect. For example, InterPro annotates FolE to GO:0006556 S-adenosylmethionine biosynthetic process. Methionine biosynthesis in E. coli requires 5-methyltetrahydropteroyltri-L-glutamate, which is derived from tetrahydrofolate, but FolE is not directly involved in the biosynthesis of methionine.

ACTION ITEM 2. Emily will make a SourceForge request to get the mappings for GO:0006556 S-adenosylmethionine biosynthetic process changed.

3. Petra noted on the GONUTS wiki that E. coli FolE was not annotated to tetrahydrobiopterin biosynthesis process and suggested that it should be, as it is already annotated to GTP cyclohydrolase I activity.

Debby responded that she couldn’t find any evidence that E. coli made tetrahydrobiopterin

(Jim has since put up this comment on the wiki – “There's a NAS in a paper showing reconstitution of a tetrahydrobiopterin synthesis pathway in E. coli that says E. coli and most bacteria don't make tetrahydrobiopterin”)

(David) In several experiments they only measure the neopterin product not tetrahydrobiopterin so you can’t tell if it’s involved in tetrahydrobiopterin biosynthesis process

(Debby) Should annotations be made to the process if it’s not shown in the paper and they only show one step in the process? Very rarely does one paper show the entire process experimentally.

(David) This will be partially solved when we have function/process links

Petra has been making the process annotations with an IDA

(David) Should annotate the function with IDA, but if the paper does not investigate the role of the protein in the entire pathway, the process term should be annotated with ‘IC’, as the role of the enzyme in the pathway has not been tested biochemically by the paper’s authors.

Pascale suggests it should be IDA

Ruth was concerned that this treats the authors’ statement that the enzyme is part of a particular process as less important than the curator inferring it, so in effect the authors’ statement is being downgraded from IDA to IC.

(Debby) Common knowledge should be considered, e.g. everyone knows the TCA cycle

(Petra) This used to be a TAS annotation but we stopped making those

David insisted it should be IC, with the relevant molecular function term in the ‘with’ column.

(Ruth) What about regulation of transcription? Should all processes not fully characterized in one paper be IC? Correct use of IC is, e.g., the curator inferring ATP binding from ATPase activity; here it is the author who is making the statement.

Emily suggested taking the discussion to the GOC meeting in October

ACTION ITEM 3. Put “Process IC or IDA?” on the agenda for the Consortium meeting
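
For illustration only, the two options debated under item 3 can be written out as annotation records. This is a minimal sketch in Python: the `Annotation` class, its field names, the `ic_is_supported` helper and the placeholder reference are assumptions made for this example and are not a GO or GOA file format. Terms are given by name and the gene product is a generic GCH1.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Annotation:
    """Hypothetical, simplified annotation record (not a real GO file format)."""
    gene_product: str
    term: str
    aspect: str                      # "F" (function) or "P" (process)
    evidence: str                    # evidence code, e.g. "IDA" or "IC"
    reference: str
    with_from: Optional[str] = None  # the 'with' column


PAPER = "PMID:placeholder"  # placeholder, not a real reference

# Both options agree on the function: the activity was assayed directly.
function = Annotation("GCH1", "GTP cyclohydrolase I activity", "F", "IDA", PAPER)

# Option 1 (David): the process is inferred by the curator, with the
# molecular function term recorded in the 'with' column.
process_ic = Annotation(
    "GCH1", "tetrahydrobiopterin biosynthetic process", "P", "IC", PAPER,
    with_from=function.term,
)

# Option 2 (Pascale): the process keeps the experimental code.
process_ida = Annotation(
    "GCH1", "tetrahydrobiopterin biosynthetic process", "P", "IDA", PAPER,
)


def ic_is_supported(annotation: Annotation) -> bool:
    """Illustrative check: an IC annotation should name what it was inferred from."""
    return annotation.evidence != "IC" or bool(annotation.with_from)
```

The only difference between the two process records is the evidence code and whether the 'with' column is filled, which is the question being sent to the Consortium meeting.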

4. (Petra) ‘membrane fraction’ should be part of ‘insoluble fraction’; currently they are siblings. All agreed.

(Michael) These terms are only in GO as they are standard biochemical terms. They should only be used when the author specifically states that the gene product was found in the ‘membrane fraction’/‘insoluble fraction’.

ACTION ITEM 4. (Petra) Make SourceForge item for making membrane fraction a child of insoluble fraction
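
As a rough sketch of what the change in item 4 would mean for queries, the toy example below compares the two structures. It is written in Python; the parent maps, the shared parent ‘cell fraction’ and the `ancestors` helper are assumptions for illustration, and the sketch does not distinguish is_a from part_of links.

```python
# Toy parent maps. The two fraction terms come from the minutes; the shared
# parent "cell fraction" is an assumption used to make them siblings.
CURRENT_PARENTS = {
    "membrane fraction": {"cell fraction"},
    "insoluble fraction": {"cell fraction"},
    "cell fraction": set(),
}

# Proposed structure: membrane fraction moves under insoluble fraction.
PROPOSED_PARENTS = {
    **CURRENT_PARENTS,
    "membrane fraction": {"insoluble fraction"},
}


def ancestors(term, parents):
    """Return every term reachable by following parent links upwards."""
    seen = set()
    stack = list(parents.get(term, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


# As siblings, the two terms are unrelated; after the move, one is an
# ancestor of the other.
print("insoluble fraction" in ancestors("membrane fraction", CURRENT_PARENTS))
print("insoluble fraction" in ancestors("membrane fraction", PROPOSED_PARENTS))
```

Under the proposed structure, a query for gene products in ‘insoluble fraction’ that follows parent links would also pick up those annotated to ‘membrane fraction’.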

5. (Suzi) Were there any contradictions in the GCH1 annotation set?

(Debby) Rat has coenzyme binding

(Victoria) GCH1 binds to the end product tetrahydrobiopterin, a coenzyme, and is inhibited by it.

(Debby) This doesn’t fit the definition of coenzyme binding (Interacting selectively with a coenzyme, any of various nonprotein organic cofactors that are required, in addition to an enzyme and a substrate, for an enzymatic reaction to proceed.) as the reaction is not proceeding.

(David) Is there an experiment showing GCH1 binding tetrahydrobiopterin?

(Victoria) Yes

(David) Then that is the function. The first part of the definition describes the binding function, the second part of the definition is describing what a coenzyme is.

(Pascale) The cofactor is the protein doing the regulating, but you cannot annotate the cofactor


6. (Susan) Fly has three isoforms with different N-termini which bind to their own active site. Can we have regulation of own activity?

(Pascale) What about enzyme regulator activity term?

All agreed that we can have a new term for the specific enzyme regulation

ACTION ITEM 5. (Susan) Request new term for Regulation of GTP Cyclohydrolase I activity.

7. Doug had a question on whether to annotate mutant worm as regulation of blood pressure

(David) Important to know what genes are involved in what disease, so Yes.

Discussion on HSPD1

1. (David) Described an annotation conflict. In early papers, the human protein was expressed in E. coli and caused T cell activation. A later paper then disputed the quality of the results, as there was endotoxin contamination in the original experiment. A third, more recent paper has now repeated the experiments and says the original paper was correct. David decided to delete the annotations made to the original paper and re-add them with the later reference.

All agreed this was acceptable and Debby suggested David make a note of the disagreement between papers so that a later curator doesn’t go through the same thing.

2. (Ruth) Question on how to annotate cell component terms. HSPD1 binds the cell surface, is then endocytosed into the cell, and is then returned to the cell surface for antigen presentation. In this process, HSPD1 travels through a number of different cellular component pathways. Should HSPD1 be annotated to each of these cellular component terms (e.g. endocytic vesicle), or only to ‘cell surface’?

(Alex) Not sure this is an OK annotation, as the protein is in the vesicle only transiently, BUT the protein’s function requires it to go through endocytosis. And the majority of gene products in endocytic vesicles are only located there transiently.

All agreed that if it’s important for the function of the protein then it can be annotated to each of the cellular component terms.

3. (Pascale) In PMID 12952082, they show that Dicty hspA is *not* up-regulated by heat shock. Do we put a NOT annotation?

Susan has also made NOT annotations of this kind for fly.

(Alex) Perhaps the authors are only assuming they are heat shock proteins because of their name but they aren’t necessarily. This may be an issue with the nomenclature rather than GO.

(Ruth) In the paper that Susan read, the proteins were clustered with heat shock proteins therefore they WERE expected to be heat shock proteins.

(Michael) In that case you can make a ‘NOT’ annotation.

4. (Pascale) In PMID 12952082, they use helix prediction software to assess that the gene is likely mitochondrial; how should we annotate that? ISS? ISM?

(Susan/Emily) Would use ISM if happy with software predicting it.

5. (Doug) In PMID 12416612, they show that expression of zebrafish hspd1, unlike dicty and worm, IS upregulated by heat shock. Should we make an annotation to 'response to heat' by IDA? Worm has a similar "NOT" annotation using IEP.

(Pascale) Make an IEP annotation as the expression is upregulated

6. (Victoria) GO has 'chaperone activity' as a synonym for 'unfolded protein binding' but not for 'misfolded protein binding' or otherwise damaged/modified/abnormal which the chaperone(s) recognize as well. I had an annotation for 'unfolded protein binding' because I was looking for chaperone activity and the misfolded term was not pulled out. I removed the annotation and replaced it with the 'misfolded' term which is closer to an abnormal state than unfolded; in my case the proteins recognized by the chaperone were modified proteins.

(Becky) Need to add chaperone activity as a synonym to both the unfolded and misfolded terms

(Debby) There is also ‘de novo protein folding’

Question raised as to whether to combine the terms. Decided against this; the terms will be re-defined instead.

ACTION ITEM 6. (Victoria) Make a SourceForge request to clarify the definitions of unfolded/misfolded protein binding and add chaperone activity as a synonym to both of the terms. Also add ‘de novo’ synonym to ‘unfolded protein binding’.

7. (Tanya) In PMID: 14730085, the expression of all three genes (Arabidopsis Hsp60-3a, Hsp60-3b, and Hsp6-3c) is affected by rotenone treatment. Rotenone is a mitochondrial electron transport chain inhibitor. Expression levels of all three genes increase about 2 fold with respect to no treatment. The only assay for this is a microarray experiment. There is no follow-up assay of a different sort. I've chosen not to annotate this.

(Debby) Is this stress response?

(David) As it’s a microarray, you are never really sure if it is directly involved in the response to stress.

(Tanya) This was the only type of experiment found for this gene

(David) Could do an IEP ‘Response to stress’

(Ruth) IEP is a good idea, but you have to be very careful with microarray data: users of GO are doing microarray analysis with GO data, so it ends up too circular. The genes might be up- or down-regulated by stress but not actually involved in the stress response process. Would you annotate all 2000 genes in the experiment to this process? Also, there was only a 2-fold increase, which is a minimal amount. Midori requested that IEP be used sparingly and with caution.

(Varsha) 2-fold is the baseline for significance in a microarray experiment; I wouldn’t annotate this.

Michael agrees: microarray data has limited possibilities for GO annotation. Perhaps we should think about whether to allow any microarray data in GO.

(Pascale) What about Mass-spec data?

(Michael) MS can be useful for complexes

(Ruth) GOA does not take/make any annotations to microarray data

(Suzi) The goal is good annotation, if you are unsure you should always err on the side of caution

(Emily) Don’t blacklist microarray data yet as SGD and Val use it and need to be consulted

(Michael) We can say don’t use microarray data for Reference Genomes.

ACTION ITEM 7. Discuss microarray data further, possibly GOC meeting item.

8. (Susan) Question about high-throughput data. HSP60 had annotation lipid particle IDA from protein sequencing, should it be marked as HTP?

(Pascale, Stacia) SGD has implemented HTP and found it useful.

(Stacia) SGD has had long (unconcluded) discussions about what constitutes an HTP experiment. Very difficult to define.

(Michael) But that is not in the context of Reference Genomes. Need to decide on a case-by-case basis for HTP

(Emily) We should tag those we find and SGD could send around their protocol for defining HTP data to help further this discussion.

ACTION ITEM 8. (Stacia) Send around SGD’s SOP for HTP annotation. More comfortable with HTP for cell component than for process.

ACTION ITEM 9. Make HTP annotation a topic for a Reference Genome call.

9. (Becky) It is common knowledge that protein A is a chaperone, and protein B binds protein A. Can you annotate protein B to ‘chaperone binding’ even though there is no direct evidence?

(Debby) Depends how confident you are that protein A is a chaperone

Some people suggest it is curator judgement as to whether or not to make the annotation.

(Pascale) Why not just protein binding?

(Ruth) Is it a part of the activity of the protein? Is it significant that it binds a chaperone?

(Becky) Can’t tell from the immunoprecipitation experiment in the paper

(Pascale) Would be safer to annotate protein binding

(Suzi) If there is a term for it then you should use it or could make an even more specific term

(David) Have to take things in context, is it useful to the community to know this?

(Suzi) We need three levels of specificity:
- Protein binding
-- Chaperone binding
--- Chaperone binding as a target for a process

ACTION ITEM 10. (Becky) Move to email discussion on whether we need more terms under protein binding to describe protein binding as a target for a process

10. Somebody asked how to get the MGI ‘J’ references, David replied that they will appear in GOA as PMIDs.