- 1 Annotation Jamboree
- 2 Collated Annotations
- 3 Minutes
- 3.1 Announcements
- 3.2 Tips on finding the correct term
- 3.3 Papers for discussion
The Annotation conference call for November will feature an annotation jamboree.
The call is scheduled for Nov 8, 2010 (Monday) at 8:00 AM California time.
We will try a new format this time. We have picked 2 papers on transcription for this jamboree (yes, they are yeast papers!).
1) PMID 7568152
Role of the GATA factors Gln3p and Nil1p of Saccharomyces cerevisiae in the expression of nitrogen-regulated genes. Stanbrough M, Rowen DW, Magasanik B.
2) PMID 8655543
G1n3p is capable of binding to UAS(NTR) elements and activating transcription in Saccharomyces cerevisiae. Cunningham TS, Svetlov VV, Rai R, Smart W, Cooper TG.
I provided a simple template in my email for entering the annotations (I am unable to upload the template here). Basically, provide annotations in this format:
Gene name [tab] GO Aspect [tab] GOID [tab] GO term [tab] Evidence [tab] With/From [tab] Qualifier [tab] Comments
Please send your annotations to me by November 3rd, and remember to name your sheet so I know who gave what. I will collate them and upload them to the wiki by Nov 5th. Each curator can send in an individual sheet, or each MOD can send me its consensus sheet.
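As an illustration of the tab-delimited format above, here is a minimal Python sketch that writes and reads one annotation row. The helper names (`format_row`, `parse_row`) and the example values are assumptions made for illustration; they are not part of the emailed template.

```python
# Column order taken from the format line above.
COLUMNS = ["Gene name", "GO Aspect", "GOID", "GO term",
           "Evidence", "With/From", "Qualifier", "Comments"]

def format_row(annotation):
    """Return one tab-delimited line; columns left out become empty fields."""
    return "\t".join(annotation.get(column, "") for column in COLUMNS)

def parse_row(line):
    """Split one tab-delimited line back into a column -> value dict."""
    fields = line.rstrip("\n").split("\t")
    # Pad short rows so every column is present in the result.
    fields += [""] * (len(COLUMNS) - len(fields))
    return dict(zip(COLUMNS, fields))

# Hypothetical example row (values chosen only to show the layout).
example = {
    "Gene name": "GLN3",
    "GO Aspect": "F",
    "GOID": "GO:0001077",
    "Evidence": "IDA",
}
row = format_row(example)
parsed = parse_row(row)
```

Leaving a column such as With/From or Qualifier empty still yields eight fields, so collated sheets from different curators line up column by column.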
Collated annotations for
- PMID 7568152 - Role of the GATA factors Gln3p and Nil1p of Saccharomyces cerevisiae in the expression of nitrogen-regulated genes. Stanbrough M, Rowen DW, Magasanik B.
Note about the PDF: the text is barely readable when you first open it. Please zoom in to read.
- PMID 8655543 - G1n3p is capable of binding to UAS(NTR) elements and activating transcription in Saccharomyces cerevisiae. Cunningham TS, Svetlov VV, Rai R, Smart W, Cooper TG.
Present: SGD- Rama, Karen, Julie, Jodi, Ruth, Rachael, Yasmin, Kimberly, Stan, Tim, Tom, Shur-Jen (RGD), Suzi, Pascale, Susan, Serenella (UniProt-SIB), Michele + Mindi (UniProt-EBI)[can't remember]
1) The annotation conf. call and the Ref. genome conf. call are on two consecutive days (second Monday and second Tuesday of the month). If any of you have issues with this schedule, please email Rama and Rachael.
2) If you have specific annotation-related issues that you want discussed at this Annotation conf. call, please post them on this wiki page: http://gocwiki.geneontology.org/index.php/Topics_for_annotation_call. Rachael and Rama will review and prioritize them.
Tips on finding the correct term
Many of these new terms have has_part relationships to other function terms that represent a single aspect of their overall function. For example, a term for "sequence-specific DNA binding transcription factor" would have a has_part relationship to "sequence-specific DNA binding". Since AmiGO does not display has_part relationships, you will not see the connection there between "sequence-specific DNA binding transcription factor activity" terms and the relevant "DNA binding" term. You can see these relationships in QuickGO, and Rama recommended that annotators start using OBO-Edit, which shows has_part and other relationships very clearly. David says that there is a way to have OBO-Edit load the ontology directly from the web without having to fiddle with cvs or do a separate download.
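To picture why a viewer that hides has_part relationships loses the link to the DNA binding terms, here is a small Python sketch over a toy set of edges. The edge list is a simplified assumption built from the example in the paragraph above, not an extract of the real ontology, and `related_terms` is a hypothetical helper.

```python
# Toy edges: (subject term, relationship, object term).
# Simplified for illustration; not taken from the actual ontology file.
TF_ACTIVITY = "sequence-specific DNA binding transcription factor activity"
EDGES = [
    (TF_ACTIVITY, "is_a", "molecular_function"),
    (TF_ACTIVITY, "has_part", "sequence-specific DNA binding"),
    ("sequence-specific DNA binding", "is_a", "DNA binding"),
]

def related_terms(term, shown_relationships):
    """Collect every term reachable from `term` using only the
    relationship types that a given viewer displays."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for subject, relationship, target in EDGES:
            if (subject == current
                    and relationship in shown_relationships
                    and target not in seen):
                seen.add(target)
                stack.append(target)
    return seen

# A viewer that follows only is_a never reaches the DNA binding terms.
without_has_part = related_terms(TF_ACTIVITY, {"is_a"})
# A viewer that also follows has_part does.
with_has_part = related_terms(TF_ACTIVITY, {"is_a", "has_part"})
```

In this toy graph, `without_has_part` contains only the root term, while `with_has_part` also contains "sequence-specific DNA binding" and "DNA binding".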
Papers for discussion
PMID 8655543 (GLN3 paper)
- What level of term should be chosen for the MF annotation?
- Question: Is it appropriate to pick RNA Pol II terms although the paper doesn't mention RNA Pol II? Yes, because all nuclear protein-coding genes are transcribed by RNA Pol II (with some exceptions in trypanosomatids). People doing research on transcription that is NOT done by RNA polymerase II will tell you which RNA polymerase is involved, whether it's another nuclear RNA polymerase or one from an organelle. When researchers do not specify which RNA polymerase and just talk about regulation of gene expression, RNAP II is a really safe bet. If you feel you can't tell and you are not comfortable, you can always go one level up. However, Karen and David encourage people to try to make the call to RNAP II: both have seen lots of papers on regulation of transcription of genes transcribed by RNAP II where researchers don't bother to mention the RNAP, but neither remembers ever seeing a paper on regulation of transcription by another RNAP where it is not mentioned. If curators do not make this call, there will be many annotations where we lose the specificity that it is RNAP II. Karen will check with David on whether we could put comments in relevant terms to help annotators remember this information.
- Comment on multiple versus single MF annotation. Some groups tend to make only the most granular annotation that a given paper supports, in this case the GO:0001077 term. However, some groups make multiple annotations even if some terms are parents of other terms also annotated. Thus, it is also fine if you choose to make a separate direct annotation to the DNA binding aspect, even though the GO:0001077 term already incorporates a has_part relationship to this DNA binding term. An annotation to "RNA polymerase II regulatory transcription factor site sequence-specific DNA binding" (GO:0000978) by IDA is therefore also fine if your group chooses to make it.
- Question: Why is there no function annotation to the MF term "promoter binding" or one of its child terms? As the authors describe in the introduction, the binding shown here is to a transcription factor binding site that is in the Upstream Activating Sequence (UAS), which is upstream of the promoter region, not part of the promoter. The background here is that the word "promoter" is loosely used by many people. The strict interpretation is that the promoter is only the sequence to which the RNA polymerase and basal factors bind, while any other regulatory sequences are not part of the promoter. However, many RNAP II researchers use the term "promoter" to refer to all of the regulatory sequence that is proximal to the gene itself (but excluding enhancers). These two usages are mutually exclusive. To try to be clear about what region is involved, Karen and David decided that we will use the phrase "core promoter" (commonly used in RNAP II literature) to describe the basal sequence where the RNAP binds and "transcription regulatory region" to describe all of the rest of it. Note that an ancestor of the recommended function term (GO:0001077) already incorporates a has_part relationship to the appropriate type of DNA binding.
- Questions on BP annotations from this paper
- Question: How can you NOT make a Process annotation when you know the function? The experiments in this paper primarily generate annotations that are all in MF for GLN3. Since this paper doesn't show evidence for the nitrogen catabolite repression part and explicitly states that the Gln3-LexA fusion constructs do not show appropriate response to nitrogen source, you cannot make an annotation to a process as specific as "nitrogen catabolite activation of transcription from RNA polymerase II promoter" (GO:0001080) based on the experimental evidence. If you feel you need to make a process annotation based on the experimental evidence, you can only go as far as "positive regulation of transcription from RNA polymerase II promoter" (GO:0045944). In addition, if you make a function annotation to "sequence-specific DNA binding RNA polymerase II transcription factor activity" (GO:0000981) or to one of its children including the recommended GO:0001077 term, then a BP inference will be made from the MF annotation based on inter-ontology relationship.
- Possible new term request: "positive regulation of organic nitrogen metabolic process by activation of transcription". Ruth: Should a new term be created in order to distinguish between nitrogen in the media and stored nitrogen that is broken down? Does it matter where the nitrogen is, or what form it takes? Is it true for multicellular organisms? Doesn't the existing term refer specifically to breakdown of internally stored nitrogen compounds? Karen & Julie: The authors clearly state these genes are acting in the Nitrogen Catabolite Repression (NCR) process, where transcription is regulated by the presence of nitrogen compounds, in this case added exogenously. This is a standard way to do this type of assay. The phrase "nitrogen catabolite" is the standard terminology for this process, so this is what we should go with. Reading the definition does not reveal any requirement that the nitrogen be produced by catabolism of internally stored nitrogen. Karen thinks it is likely that this term is equally relevant to multicellular eukaryotes, but if someone is ever able to prove that there is different regulation when the nitrogen source is internally catabolized nitrogen versus exogenously added, then we can make subtypes of the term. Currently this term would be appropriate regardless of whether the nitrogen was produced internally or provided externally.
- Question: Can you make an annotation to a process term that includes the fact that the regulation occurs in response to nitrogen catabolites by IC? No, because the most specific GO term that can be selected is the MF term "sequence-specific regulatory transcription factor site binding RNA polymerase II transcription factor activity involved in positive regulation of transcription from RNA polymerase II promoter" (GO:0001077). This term is not specific enough to allow the inference that the regulation is in response to nitrogen compounds. This function term is appropriate for transcription factors involved in regulation of many different processes, not just the process discussed in this paper. If you do wish to make a process annotation at that level, you can make one by TAS because the introduction clearly cites previous work demonstrating that GLN3 is involved in this regulation.
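The BP inference discussed above (a process annotation derived from a function annotation through an inter-ontology relationship) can be pictured with the following Python sketch. The link table is an assumption: the minutes say only that such links exist, and the specific GO:0001077 to GO:0045944 pairing is guessed from the wording of the function term name. `infer_bp` is a hypothetical helper, not a GO Consortium tool.

```python
# Hypothetical inter-ontology links (MF term -> BP term).
# Assumption: this pairing is inferred from the GO:0001077 term name
# ("... involved in positive regulation of transcription from RNA
# polymerase II promoter"); check the ontology before relying on it.
MF_TO_BP = {"GO:0001077": "GO:0045944"}

def infer_bp(annotations):
    """Return the BP annotations implied by MF annotations via MF_TO_BP.

    `annotations` is a list of (gene, GO ID, evidence code) tuples.
    """
    inferred = []
    for gene, go_id, _evidence in annotations:
        bp_term = MF_TO_BP.get(go_id)
        if bp_term is not None:
            inferred.append((gene, bp_term, "inferred from " + go_id))
    return inferred

# Manual MF annotations for GLN3 as recommended in the discussion.
manual = [
    ("GLN3", "GO:0001077", "IDA"),
    ("GLN3", "GO:0000978", "IDA"),
]
inferred = infer_bp(manual)
```

Under these assumptions the inference reaches only the general positive regulation term. It never produces the nitrogen-specific GO:0001080, which matches the point above that the function term is not specific enough to support that process annotation.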
Annotations from this paper
Recommended MF annotations from experimental evidence (or less granular if you feel you cannot make the call that it is RNAP II):
- GO:0001077 | IDA, IMP | sequence-specific regulatory transcription factor site binding RNA polymerase II transcription factor activity involved in positive regulation of transcription from RNA polymerase II promoter
Optional MF annotation:
- GO:0000978 | IDA | RNA polymerase II regulatory transcription factor site sequence-specific DNA binding
Other annotations which can be made if you feel you need to make BP and CC annotations from this paper:
- BP | GO:0001080 | TAS | nitrogen catabolite activation of transcription from RNA polymerase II promoter
- BP | GO:0045944 | IDA | positive regulation of transcription from RNA polymerase II promoter
- CC | GO:0005634 - nucleus | IC from GO:0001077 (or related MF term provided it includes "transcription factor activity")
Notes for this paper
Results section 1 - Expression of GLN3 in E. coli
- construct HA-GLN3 in expression vector for E. coli
- purification: 2 major & many minor species (full length is very minor species)
Results section 2 - Specific Gln3p-dependent binding to DAL3 DNA fragments containing GATAA sequences
- use small DNA fragment from UAS3 containing 3 well characterized UAS(NTR) sites
- get Gln3p-dependent band shifts
- DNA fragment CAN1-1 (previously shown to support Gln3p-dependent NCR-sensitive txn) was effective competitor
- mutation of GATAA sequences away from consensus abolished binding
Results section 3 - Gln3p-dependent binding to wild-type and mutant DNA fragments derived from GLN1
- binds to GLN1-162
- does not bind when GATAA sequence is mutated
Results section 4 - Gln3p-dependent binding to wild-type and mutant PUT1 DNA fragments
- binds to the wt sequence containing 3 sites: GATAAA, GATAG, and CTTATC (GATAAG)
- mutation of the 5' GATAAA sequence (fragment PUT1-2M1) = modest decrease
- mutation of the 3' GATAAG sequence = barely detectable binding
- double mutant lacking both the GATAAA and GATAAG sequences = no demonstrable binding
Results section 5 - Gln3p-dependent binding to wild-type and mutant UGA4 DNA fragments
Results section 6 - Gln3p-dependent binding to wild-type and mutant GDH2 DNA fragments
Results section 7 - Gln3p-dependent binding to wild-type and mutant CAR1 DNA fragments (more of the same)
Evidence: IDA; terms to consider based on these results:
GO:0000987 - regulatory transcription factor site sequence-specific DNA binding
GO:0000978 - RNA polymerase II regulatory transcription factor site sequence-specific DNA binding
Results section 8 - Reporter gene expression mediated by LexAp-Gln3p
- generate a LexA-Gln3p fusion protein
- assay plasmid-based reporters containing 1 LexA site: txn (none with LexA protein alone)
- assay plasmid-based reporters containing 4 LexA sites: much more txn (none with LexA protein alone)
Evidence: IDA (using cell based system with reporter constructs)
Results section 9 - Assay of gln3 mutant alleles for the ability to support reporter gene expression
1st paragraph - using reporter construct above
-- deletion of 470-670 or 470-end: no effect
-- deletion of the first 150 Gln3p residues decreased reporter gene expression 15-fold
Evidence: IMP
2nd paragraph - to assess in vivo function, constructed mutant alleles and assayed abilities to complement a gln3-delta allele
-- zinc finger mutant (pVS316-3): Gln3p residue Cys-306 was mutated to encode Ser
-- acidic region mutant (pVS316-2): residues 2 to 150 were deleted
growth rate in glucose-Asn medium supported by a plasmid-borne wt GLN3 allele was two to three times lower than that of
- gln3-delta strain RR91
- strain RR91 transformed with plasmid pRS316 (empty control plasmid)
- strain RR91 transformed with plasmid pVS316-3 (zinc finger mutant)
- strain RR91 transformed with plasmid pVS316-2 (acidic region deletion)
Remember that this nitrogen source (Asn) activates NCR: "When readily used nitrogen sources (e.g., Asn, Gln, or ammonia in some strains) are available, NCR (Nitrogen Catabolite Repression) sensitive genes are expressed only at low levels (14)."
=> either mutant is the same as empty control plasmid, i.e. inactive in NCR
Evidence: IMP
Terms to consider based on all experimental results:
GO:0000987 - regulatory transcription factor site sequence-specific DNA binding
GO:0000978 - RNA polymerase II regulatory transcription factor site sequence-specific DNA binding
GO:0000982 - sequence-specific regulatory transcription factor site binding RNA polymerase II transcription factor activity
GO:0001077 - sequence-specific regulatory transcription factor site binding RNA polymerase II transcription factor activity involved in positive regulation of transcription from RNA polymerase II promoter
PMID 7568152 (GAT/NIL1 paper)
- What is the appropriate level of granularity of P terms, i.e. why stop at nitrogen catabolites, and not go on to make terms for proline, glutamine, glutamate, etc.? Because nothing is different at the more specific levels. The main idea of the paper is to show transcription in response to nitrogen compounds in general.
Annotations from this paper
Gene | Aspect | GO ID | GO term | Evidence | With
GAT1 | F | GO:0001077 | sequence-specific regulatory transcription factor site binding RNA polymerase II transcription factor activity involved in positive regulation of transcription from RNA polymerase II promoter | ISS | GLN3
GAT1 | P | GO:0001080 | positive regulation of transcription from RNA polymerase II promoter by nitrogen catabolites | IMP |
GAT1 | P | GO:0001080 | positive regulation of transcription from RNA polymerase II promoter by nitrogen catabolites | IGI | GLN3
GLN3 | P | GO:0001080 | positive regulation of transcription from RNA polymerase II promoter by nitrogen catabolites | IMP |
GLN3 | P | GO:0001080 | positive regulation of transcription from RNA polymerase II promoter by nitrogen catabolites | IGI | GAT1 (NIL1)
Notes for this paper
FYI: NIL1 is an alias name for GAT1
Results section 2: Isolation of the NIL1 Gene (aka GAT1)
- cloned NIL1 using degenerate primers for the characteristic zinc finger of GATA factors based on three genes: Sc Gln3, N. crassa NIT2, & A. nidulans AREA
-- obtained 4 groups of clones: GLN3, DAL80, NIL1 (aka GAT1), NIL2
-- NIL1 (aka GAT1) located to Chr VI & NIL2 on Chr X
- comparison of predicted protein sequences:
-- NIL1 is more similar to GLN3 than either is to DAL80 or NIL2
-- both GLN3 & NIL2 have an N-terminal 160 aa acidic region and a zinc finger
These two lines of sequence evidence:
- cloning with degenerate primers targeted for zinc finger region of GATA factors
- sequence comparison of cloned, sequenced gene NIL1 (aka GAT1) with previously sequenced GLN3: contains both a zinc finger similar to GAT1 and a highly acidic region that "could indicate a transcriptional activation domain"
evidence: ISS; what term can be used?
GO:0000987 - regulatory transcription factor site sequence-specific DNA binding
GO:0000978 - RNA polymerase II regulatory transcription factor site sequence-specific DNA binding
GO:0000982 - sequence-specific regulatory transcription factor site binding RNA polymerase II transcription factor activity
GO:0001077 - sequence-specific regulatory transcription factor site binding RNA polymerase II transcription factor activity involved in positive regulation of transcription from RNA polymerase II promoter
- Disruption of NIL1 (aka GAT1) in a gln3 background
-- "total inability to produce the permease" GAP1
-- adding back either NIL1 or GLN3 restores ability to produce Gap1p permease
-- assayed with proline as N source
-- measuring Gap1p activity
==> QUOTE: "demonstrate that either Gln3p or Nil1p is required for the formation of the general amino acid permease."
evidence: IGI with GLN3; for what term? Can probably combine with section 3.
Results section 3: Activation of Transcription by Gln3p and Nil1p (aka Gat1p)
QUOTE: "We determined the abilities of strains defective in GLN3, NIL1, or both to grow in media containing various sources of nitrogen."
- txn activation in single or double mutants of Gln3 and Nil1
-- either single mutant results in small decrease in growth rate
-- double mutant results in severely reduced rate of growth, except on glutamine (wt didn't grow either)
-- used CYC1-lacZ fusion to UAS regions of GLN1, GDH2, and GAP1 on plasmids in vivo
-- measured by b-gal activity
-- compared ability of Gln3 and Nil1 to activate txn of each of these genes in media containing nitrogen sources, either Glu, Gln, urea, ammonia, or proline
Note that Gln was not shown in table because activity of WT cells was very low
evidence: IMP, IGI with GLN3