The Annotation conference call for November will feature an annotation jamboree.
The call is scheduled for Nov 8, 2010 (Monday) at 8:00 AM California time.
We will try a new format this time. We have picked 2 papers on transcription for this jamboree (yes, they are yeast papers!).
1) PMID 7568152
Role of the GATA factors Gln3p and Nil1p of Saccharomyces cerevisiae in the expression of nitrogen-regulated genes. Stanbrough M, Rowen DW, Magasanik B.
2) PMID 8655543
G1n3p is capable of binding to UAS(NTR) elements and activating transcription in Saccharomyces cerevisiae. Cunningham TS, Svetlov VV, Rai R, Smart W, Cooper TG.
I provided a simple template in my email to enter the annotations (I am unable to upload the template here). But basically provide annotations in the format:
Gene name [tab] GO Aspect [tab] GOID [tab] GO term [tab] Evidence [tab] With/From [tab] Qualifier [tab] Comments
Please send your annotations to me by November 3rd, and remember to name your sheet so I know who gave what. I will collate them and upload them to the wiki by Nov 5th. Each curator can send in an individual sheet, or each MOD can send me its consensus sheet.
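For anyone who wants to check a sheet before sending it, here is a minimal sketch of reading the tab-delimited format above and collating several sheets by submitter. It is not the template from the email; the column keys, the function names, and the submitter labels are assumptions made for illustration only.

```python
from collections import defaultdict

# Column order follows the format line above; the key names are illustrative.
COLUMNS = ["gene", "aspect", "go_id", "go_term",
           "evidence", "with_from", "qualifier", "comments"]


def parse_sheet(text):
    """Parse one tab-delimited annotation sheet into a list of dicts."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        fields = [field.strip() for field in line.split("\t")]
        # Pad short rows so trailing empty columns may be omitted.
        fields += [""] * (len(COLUMNS) - len(fields))
        rows.append(dict(zip(COLUMNS, fields)))
    return rows


def collate(sheets):
    """Group annotations by (gene, GO ID, evidence) across sheets.

    `sheets` maps a submitter label (hypothetical, e.g. a curator or MOD name)
    to the text of that submitter's sheet. The result maps each annotation key
    to the sorted list of submitters who made it.
    """
    collated = defaultdict(set)
    for submitter, text in sheets.items():
        for row in parse_sheet(text):
            key = (row["gene"], row["go_id"], row["evidence"])
            collated[key].add(submitter)
    return {key: sorted(names) for key, names in collated.items()}
```

Grouping on gene, GO ID and evidence code makes it easy to see which annotations all groups agreed on and which were made by only one group.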
Collated annotations for
- PMID 7568152 - Role of the GATA factors Gln3p and Nil1p of Saccharomyces cerevisiae in the expression of nitrogen-regulated genes. Stanbrough M, Rowen DW, Magasanik B.
Note about the PDF: the text is barely readable when you first open it, so please zoom in to read it.
- PMID 8655543 - G1n3p is capable of binding to UAS(NTR) elements and activating transcription in Saccharomyces cerevisiae. Cunningham TS, Svetlov VV, Rai R, Smart W, Cooper TG.
Present: SGD- Rama, Karen, Julie, Jodi, Ruth, Rachael, Becky, Yasmin, Kimberly, Stan (RGD), Suzi, Pascale?
1) The annotation conf. call and the Ref. genome conf. call are on two consecutive days (second Monday and second Tuesday of the month). If any of you have issues with this schedule, please email Rama and Rachael.
2) If you have specific annotation-related issues that you want discussed at this Annotation conf. call, please post them on this wiki page: http://gocwiki.geneontology.org/index.php/Topics_for_annotation_call. Rachael and Rama will review and prioritize them.
Papers for discussion
PMID 8655543 (GLN3 paper)
- What level of term should be chosen for the MF annotation?
- Question: Is it appropriate to pick RNA Pol II terms although the paper doesn't mention RNA Pol II? Yes, because all nuclear protein-coding genes are transcribed by RNA Pol II (with some exceptions in trypanosomatids). People doing research on transcription that is NOT done by RNA polymerase II will tell you which RNA polymerase is involved, whether it's another nuclear RNA polymerase or one from an organelle. When researchers do not specify which RNA polymerase and just talk about regulation of gene expression, RNAP II can be safely used. Of course, if you can't tell and you are not comfortable, go one level up. Karen will check with David on whether we could put comments in relevant terms to help annotators remember this information.
- Comment on multiple versus single MF annotation. Some groups tend to make only the most granular annotation that a given paper supports, in this case the GO:0001077 term. However, some groups make multiple annotations even if some terms are parents to other terms also annotated. It is therefore also fine to make a direct annotation to the DNA binding aspect alone, even though the GO:0001077 term already incorporates a has_part relationship to this DNA binding term. That is, the "RNA polymerase II regulatory transcription factor site sequence-specific DNA binding" (GO:0000978) annotation by IDA is fine if your group chooses to make it.
- Question: Why is there no function annotation to the MF term "promoter binding" or one of its child terms? As the authors describe in the introduction, the binding shown here is to a transcription factor binding site that is in the Upstream Activating Sequence (UAS), which is upstream of the promoter region, not part of the promoter. The background here is that the word "promoter" is loosely used by many people. The strict interpretation is that the promoter is only the sequence to which the RNA polymerase and basal factors bind, while any other regulatory sequences are not part of the promoter. However, many RNAP II researchers use the term "promoter" to refer to all of the regulatory sequence that is proximal to the gene itself (but excluding enhancers). These two usages are mutually exclusive. To try to be clear about what region is involved, Karen and David decided that we will use the phrase "core promoter" (commonly used in RNAP II literature) to describe the basal sequence where the RNAP binds and "transcription regulatory region" to describe all of the rest of it. Note that an ancestor of the recommended function term (GO:0001077) already incorporates a has_part relationship to the appropriate type of DNA binding.
- Questions on BP annotations from this paper
- Question: How can you NOT make a Process annotation when you know the function? The experiments in this paper primarily generate annotations that are all in MF for GLN3. Since this paper doesn't show evidence for the nitrogen catabolite repression part and explicitly states that the Gln3-LexA fusion constructs do not show appropriate response to nitrogen source, you cannot make an annotation to a process as specific as "nitrogen catabolite activation of transcription from RNA polymerase II promoter" (GO:0001080) based on the experimental evidence. If you feel you need to make a process annotation based on the experimental evidence, you can only go as far as "positive regulation of transcription from RNA polymerase II promoter" (GO:0045944). In addition, if you make a function annotation to "sequence-specific DNA binding RNA polymerase II transcription factor activity" (GO:0000981) or to one of its children, including the recommended GO:0001077 term, then a BP inference will be made from the MF annotation based on the inter-ontology relationship.
- Possible new term request: "positive regulation of organic nitrogen metabolic process by activation of transcription". Ruth: Should a new term be created in order to distinguish between nitrogen in the media and stored nitrogen that is broken down? Does it matter where the nitrogen is, or what form it takes? Is it true for multicellular organisms? Doesn't the existing term refer specifically to breakdown of internally stored nitrogen compounds? Karen & Julie: The authors clearly state these genes are acting in the Nitrogen Catabolite Repression (NCR) process, where transcription is regulated by the presence of nitrogen compounds, in this case added exogenously. This is a standard way to do this type of assay. The phrase "nitrogen catabolite" is the standard terminology for this process, so this is what we should go with. Reading the definition does not reveal any requirement that the nitrogen be produced by catabolism of internally stored nitrogen. Karen thinks it is likely that this term is equally relevant to multicellular eukaryotes, but if someone is ever able to prove that there is different regulation when the nitrogen source is internally catabolized nitrogen versus exogenously added, then we can make subtypes of the term. Currently this term would be appropriate regardless of whether the nitrogen was produced internally or provided externally.
- Question: Can you make an annotation to a process term that includes the fact that the regulation occurs in response to nitrogen catabolites by IC? No, because the most specific GO term that can be selected is the MF term "sequence-specific regulatory transcription factor site binding RNA polymerase II transcription factor activity involved in positive regulation of transcription from RNA polymerase II promoter" (GO:0001077). This term is not specific enough to allow the inference that the regulation is in response to nitrogen compounds. This function term is appropriate for transcription factors involved in regulation of many different processes, not just the process discussed in this paper. If you do wish to make a process annotation at that level, you can make one by TAS because the introduction clearly cites previous work demonstrating that GLN3 is involved in this regulation.
- Tips on finding the correct term, including viewing the has_part relationships. AmiGO does not display the has_part relationships, so you will not see the connection between "sequence_specific DNA binding transcription factor activity" terms and the relevant "DNA binding" term. You can see these in QuickGO; alternatively, Rama recommended that annotators start using OBO-Edit. David says that there is a way to have OBO-Edit load the ontology directly from the web without having to fiddle with CVS or do a separate download. OBO-Edit shows the has_part, and other, relationships very clearly.
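The has_part point in the comments above can also be checked directly against the ontology file. The following is a rough sketch only: it reads a few OBO-style stanzas and lists the terms reachable from a given term through is_a or has_part. The GO IDs are the ones discussed above, but the links in the toy stanzas are simplified assumptions (for example, GO:0001077 is shown as a direct child of GO:0000981), not the real ontology, and the function names are made up for illustration.

```python
# Toy OBO-style stanzas. The links are simplified assumptions for
# illustration and do not reproduce the real ontology structure.
TOY_OBO = """\
[Term]
id: GO:0000978
name: RNA polymerase II regulatory transcription factor site sequence-specific DNA binding

[Term]
id: GO:0000981
name: sequence-specific DNA binding RNA polymerase II transcription factor activity

[Term]
id: GO:0001077
name: sequence-specific regulatory transcription factor site binding RNA polymerase II transcription factor activity involved in positive regulation of transcription from RNA polymerase II promoter
is_a: GO:0000981
relationship: has_part GO:0000978
"""


def parse_links(obo_text):
    """Return {term_id: {"is_a": [...], "has_part": [...]}} from OBO text."""
    links = {}
    current = None
    for raw in obo_text.splitlines():
        line = raw.split("!")[0].strip()  # drop trailing "! term name" comments
        if line.startswith("id: "):
            current = line[len("id: "):]
            links[current] = {"is_a": [], "has_part": []}
        elif current and line.startswith("is_a: "):
            links[current]["is_a"].append(line[len("is_a: "):].strip())
        elif current and line.startswith("relationship: has_part "):
            links[current]["has_part"].append(line.split()[2])
    return links


def related_terms(term, links):
    """Collect every term reachable from `term` via is_a or has_part."""
    seen = set()
    stack = [term]
    while stack:
        node = stack.pop()
        for relation in ("is_a", "has_part"):
            for target in links.get(node, {}).get(relation, []):
                if target not in seen:
                    seen.add(target)
                    stack.append(target)
    return seen
```

With the toy stanzas, calling related_terms("GO:0001077", parse_links(TOY_OBO)) returns both GO:0000981 and GO:0000978, which is the connection that AmiGO does not show.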
Acceptable annotations from this paper:
Recommended MF annotations from experimental evidence (or less granular if you feel you cannot make the call that it is RNAP II):
MF | GO:0001077 - sequence-specific regulatory transcription factor site binding RNA polymerase II transcription factor activity involved in positive regulation of transcription from RNA polymerase II promoter | IDA
MF | GO:0001077 - sequence-specific regulatory transcription factor site binding RNA polymerase II transcription factor activity involved in positive regulation of transcription from RNA polymerase II promoter | IMP
Optional MF annotation:
MF | GO:0000978 - RNA polymerase II regulatory transcription factor site sequence-specific DNA binding | IDA
Other annotations which can be made if you feel you need to make BP and CC annotations from this paper:
BP | GO:0001080 - nitrogen catabolite activation of transcription from RNA polymerase II promoter | TAS
BP | GO:0045944 - positive regulation of transcription from RNA polymerase II promoter | IDA
CC | GO:0005634 - nucleus | IC from GO:0001077 (or related MF term provided it includes "transcription factor activity")
PMID 7568152 (GAT/NIL1 paper)
Acceptable annotations from this paper:
FYI: NIL1 is an alias name for GAT1.
GAT1 | F | 1077 | sequence-specific regulatory transcription factor site binding RNA polymerase II transcription factor activity involved in positive regulation of transcription from RNA polymerase II promoter | ISS with GLN3
GAT1 | P | 1080 | positive regulation of transcription from RNA polymerase II promoter by nitrogen catabolites | IMP
GAT1 | P | 1080 | positive regulation of transcription from RNA polymerase II promoter by nitrogen catabolites | IGI | 7568152 | GLN3
GLN3 | P | 1080 | positive regulation of transcription from RNA polymerase II promoter by nitrogen catabolites | IMP | 7568152
GLN3 | P | 1080 | positive regulation of transcription from RNA polymerase II promoter by nitrogen catabolites | IGI | 7568152 | GAT1 (NIL1)
- What is the appropriate level of granularity of P terms, i.e. why stop at nitrogen catabolites, and not go on to make terms for proline, glutamine, glutamate, etc.? Because nothing is different at the more specific levels. The main idea of the paper is to show transcription in response to NCR.