Difference between revisions of "20101108 AnnotationCallMinutes"

From GO Wiki
[[Category:Annotation Working Group]]
 
==Annotation Jamboree==

The Annotation conference call for November will feature an annotation jamboree.<br>

==Minutes==
Present: SGD- Rama, Karen, Julie, Jodi, Ruth, Rachael, Yasmin, Kimberly, Stan, Tim, Tom, Shur-Jen (RGD), Suzi, Pascale, Susan, Serenella (UniProt-SIB), Michele + Mindi (UniProt-EBI)[can't remember]<br>

===Announcements===
2) If you have specific annotation-related issues that you want discussed at this Annotation conf. call, 
please post them on this wiki page- http://gocwiki.geneontology.org/index.php/Topics_for_annotation_call. Rachael and Rama will review them and prioritize.

===Tips on finding the correct term===

Many of these new terms have has_part relationships to other function terms that represent a single aspect of their overall function. For example, a term for "sequence-specific DNA binding transcription factor" would have a has_part relationship to "sequence-specific DNA binding". Since AmiGO does not display the has_part relationships, you will not see the connection between "sequence-specific DNA binding transcription factor activity" terms and the relevant "DNA binding" term. You can see these relationships in QuickGO; alternatively, Rama recommended that annotators start using OBO-Edit. David says that there is a way to have OBO-Edit load the ontology directly from the web without having to fiddle with cvs or do a separate download. OBO-Edit shows has_part and other relationships very clearly.

===Papers for discussion===

====PMID 8655543 (GLN3 paper)====

=====Discussion=====
* What level of term should be chosen for the MF annotation?
** Question: Is it appropriate to pick RNA Pol II terms although the paper doesn't mention RNA Pol II? Yes, because all nuclear protein-coding genes are transcribed by RNA Pol II (with some exceptions in trypanosomatids). People doing research on transcription that is '''NOT''' done by RNA polymerase II will tell you which RNA polymerase is involved, whether it's another nuclear RNA polymerase or one from an organelle. When researchers do not specify which RNA polymerase and just talk about regulation of gene expression, RNAP II is a really safe bet. If you feel you can't tell and you are not comfortable, you can always go one level up. However, Karen and David encourage people to try to make the call to RNAP II, since we have both seen lots of papers on regulation of transcription of genes transcribed by RNAP II where researchers don't bother to mention the RNAP, but neither of us remembers ever seeing a paper on regulation of transcription by another RNAP where it is not mentioned. If curators do not make this call, we feel that there will be many annotations where we lose the specificity that it is RNAP II. Karen will check with David on whether we could put comments in relevant terms to help annotators remember this information.
** Comment on multiple versus single MF annotation. Some groups tend to make only the most granular annotation that a given paper supports, in this case the GO:0001077 term. However, some groups make multiple annotations even if some terms are parents to other terms also annotated. Thus, it is also fine if you choose to make the direct annotation to the DNA binding aspect alone, even though the GO:0001077 term incorporated a has_part relationship to this DNA binding term already. Thus, the "RNA polymerase II regulatory transcription factor site sequence-specific DNA binding" (GO:0000978) by IDA is also fine if your group chooses to make this annotation.
* Question: Why is there no function annotation to the MF term "promoter binding" or one of its child terms? As the authors describe in the introduction, the binding shown here is to a transcription factor binding site that is in the Upstream Activating Sequence (UAS), which is upstream of the promoter region, not part of the promoter. The background here is that the word "promoter" is loosely used by many people. The strict interpretation is that the promoter is only the sequence to which the RNA polymerase and basal factors bind, while any other regulatory sequences are not part of the promoter. However, many RNAP II researchers use the term "promoter" to refer to all of the regulatory sequence that is proximal to the gene itself (but excluding enhancers). These two usages are mutually exclusive. To try to be clear about what region is involved, Karen and David decided that we will use the phrase "core promoter" (commonly used in RNAP II literature) to describe the basal sequence where the RNAP binds and "transcription regulatory region" to describe all of the rest of it. Note that an ancestor of the recommended function term (GO:0001077) already incorporates a has_part relationship to the appropriate type of DNA binding.
* Questions on BP annotations from this paper
** Question: How can you NOT make a Process annotation when you know the function? The experiments in this paper primarily generate annotations that are all in MF for GLN3. Since this paper doesn't show evidence for the nitrogen catabolite repression part and explicitly states that the Gln3-LexA fusion constructs do not show appropriate response to nitrogen source, you cannot make an annotation to a process as specific as "nitrogen catabolite activation of transcription from RNA polymerase II promoter" (GO:0001080) based on the experimental evidence. If you feel you need to make a process annotation based on the experimental evidence, you can only go as far as "positive regulation of transcription from RNA polymerase II promoter" (GO:0045944). In addition, if you make a function annotation to "sequence-specific DNA binding RNA polymerase II transcription factor activity" (GO:0000981) or to one of its children including the recommended GO:0001077 term, then a BP inference will be made from the MF annotation based on inter-ontology relationship.
** Possible new term request: "positive regulation of organic nitrogen metabolic process by activation of transcription". Ruth: Should a new term be created in order to distinguish between nitrogen in the media vs stored nitrogen that is broken down? Does it matter where the nitrogen is, or what form it takes? Is it true for multicellular organisms? Doesn't the existing term refer specifically to breakdown of internally stored nitrogen compounds? Karen & Julie: The authors clearly state these genes are acting in the Nitrogen Catabolite Repression (NCR) process, where transcription is regulated by the presence of nitrogen compounds, in this case added exogenously. This is a standard way to do this type of assay. "Nitrogen catabolite" is the standard terminology for this process, so this is what we should go with. Reading the definition does not reveal any requirement that the nitrogen be produced by catabolism of internally stored nitrogen. Karen thinks it is likely that this term is equally relevant to multicellular eukaryotes, but if someone is ever able to prove that there is different regulation when the nitrogen source is internally catabolized nitrogen versus exogenously added, then we can make subtypes of the term. Currently this term would be appropriate regardless of whether the nitrogen was produced internally or provided externally.
** Question: Can you make an annotation to a process term that includes the fact that the regulation occurs in response to nitrogen catabolites by IC? No, because the most specific GO term that can be selected is the MF term "sequence-specific regulatory transcription factor site binding RNA polymerase II transcription factor activity involved in positive regulation of transcription from RNA polymerase II promoter" (GO:0001077). This term is not specific enough to allow the inference that the regulation is in response to nitrogen compounds. This function term is appropriate for transcription factors involved in regulation of many different processes, not just the process discussed in this paper. If you do wish to make a process annotation at that level, you can make one by TAS because the introduction clearly cites previous work demonstrating that GLN3 is involved in this regulation.

=====Annotations from this paper=====
<pre>
Recommended MF annotations from experimental evidence (or less granular if you feel you cannot make the call that it is RNAP II):
GO:0001077 | IDA, IMP | sequence-specific regulatory transcription factor site binding RNA polymerase II transcription factor activity
                        involved in positive regulation of transcription from RNA polymerase II promoter

Optional MF annotation:
GO:0000978 | IDA  | RNA polymerase II regulatory transcription factor site sequence-specific DNA binding

Other annotations which can be made if you feel you need to make BP and CC annotations from this paper:
BP | GO:0001080 | TAS | nitrogen catabolite activation of transcription from RNA polymerase II promoter
BP | GO:0045944 | IDA | positive regulation of transcription from RNA polymerase II promoter

CC | GO:0005634 - nucleus | IC from GO:0001077 (or related MF term provided it includes "transcription factor activity")
</pre>

=====Notes for this paper=====
<pre>
Results section 1 - Expression of GLN3 in E. coli
- construct HA-GLN3 in expression vector for E. coli
- purification: 2 major & many minor species (full length is very minor species)

Results section 2 - Specific Gln3p-dependent binding to DAL3 DNA fragments containing GATAA sequences
- use small DNA fragment from UAS3 containing 3 well characterized UAS(NTR) sites
- get Gln3p-dependent band shifts
- DNA fragment CAN1-1 (previously shown to support Gln3p-dependent
  NCR-sensitive txn) was effective competitor
- mutation of GATAA sequences away from consensus abolished binding

Results section 3 - Gln3p-dependent binding to wild-type and mutant DNA fragments derived from GLN1
- binds to GLN1-162
- does not bind when GATAA sequence is mutated

Results section 4 - Gln3p-dependent binding to wild-type and mutant PUT1 DNA fragments
- binds to the wt sequence containing 3 sites: GATAAA, GATAG, and CTTATC (GATAAG)
- mutation of the 5' GATAAA sequence (fragment PUT1-2M1) = modest decrease
- mutation of the 3' GATAAG sequence = barely detectable binding
- double mutant lacking both the GATAAA and GATAAG sequences = no demonstrable binding

Results section 5 - Gln3p-dependent binding to wild-type and mutant UGA4 DNA fragments
Results section 6 - Gln3p-dependent binding to wild-type and mutant GDH2 DNA fragments.
Results section 7 - Gln3p-dependent binding to wild-type and mutant CAR1 DNA fragments.
(more of the same)

Evidence: IDA; terms to consider based on these results:
  GO:0000987 - regulatory transcription factor site sequence-specific DNA binding
  GO:0000978 - RNA polymerase II regulatory transcription factor site sequence-specific DNA binding

Results section 8 - Reporter gene expression mediated by LexAp-Gln3p.
- generate a LexA-Gln3p fusion protein
- assay plasmid-based reporters containing 1 LexA site: txn (none with LexA protein alone)
- assay plasmid-based reporters containing 4 LexA sites: much more txn (none with LexA protein alone)

Evidence: IDA (using cell based system with reporter constructs)

Results section 9 - Assay of gln3 mutant alleles for the ability to support reporter gene expression.
1st paragraph
- using reporter construct above
-- deletion of 470-670 or 470-end: no effect
-- deletion of the first 150 Gln3p residues decreased reporter gene expression 15-fold.

Evidence: IMP

2nd paragraph
- to assess in vivo function, constructed mutant alleles and assayed abilities to complement a gln3-delta allele.
-- zinc finger mutant (pVS316-3), Gln3p residue Cys-306 was mutated to encode Ser
-- acidic region mutant (pVS316-2), residues 2 to 150 were deleted

growth rate in glucose-Asn medium supported by a
plasmid-borne wt GLN3 allele was
two to three times lower than that of
- gln3-delta strain RR91
- strain RR91 transformed with plasmid pRS316 (empty control plasmid)
- strain RR91 transformed with plasmid pVS316-3 (zinc finger mutant)
- strain RR91 transformed with plasmid pVS316-2 (acidic region deletion)

Remember that this nitrogen source (Asn) activates NCR
  "When readily used nitrogen sources (e.g., Asn, Gln, or ammonia in
  some strains) are available, NCR (Nitrogen Catabolite Repression)
  sensitive genes are expressed only at low levels (14)."
=> either mutant is the same as empty control plasmid, i.e. inactive in NCR

Evidence: IMP

Terms to consider based on all experimental results:
  GO:0000987 - regulatory transcription factor site sequence-specific DNA binding
  GO:0000978 - RNA polymerase II regulatory transcription factor site sequence-specific DNA binding
  GO:0000982 - sequence-specific regulatory transcription factor site binding RNA polymerase II transcription factor activity
  GO:0001077 - sequence-specific regulatory transcription factor site binding RNA polymerase II transcription factor activity
              involved in positive regulation of transcription from RNA polymerase II promoter
</pre>

====PMID 7568152 (GAT/NIL1 paper)====

=====Discussion=====
* What is the appropriate level of granularity of P terms, i.e. why stop at nitrogen catabolites, and not go on to make terms for proline, glutamine, glutamate, etc.? Because nothing is different at the more specific levels. The main idea of the paper is to show transcription in response to nitrogen compounds in general.

=====Annotations from this paper=====
<pre>
GAT1 | F | GO:0001077 | sequence-specific regulatory transcription factor site binding RNA polymerase II transcription factor activity involved in positive regulation of transcription from RNA polymerase II promoter | ISS with GLN3
GAT1 | P | GO:0001080 | positive regulation of transcription from RNA polymerase II promoter by nitrogen catabolites | IMP
GAT1 | P | GO:0001080 | positive regulation of transcription from RNA polymerase II promoter by nitrogen catabolites | IGI with GLN3
GLN3 | P | GO:0001080 | positive regulation of transcription from RNA polymerase II promoter by nitrogen catabolites | IMP
GLN3 | P | GO:0001080 | positive regulation of transcription from RNA polymerase II promoter by nitrogen catabolites | IGI with GAT1 (NIL1)
</pre>

=====Notes for this paper=====
<pre>
FYI- NIL1 is an alias name for GAT1

Results section 2: Isolation of the NIL1 Gene (aka GAT1)

- cloned NIL1 using degenerate primers for the characteristic zinc
  finger of GATA factors based on three genes:
  S. cerevisiae GLN3, N. crassa NIT2, & A. nidulans AREA
-- obtained 4 groups of clones: GLN3, DAL80, NIL1 (aka GAT1), NIL2
-- NIL1 (aka GAT1) located to Chr VI & NIL on Chr X

- comparison of predicted protein sequences:
-- NIL1 is more similar to GLN3 than either is to DAL80 or NIL2
-- both GLN3 & NIL2 have an N-terminal 160 aa acidic region and a zinc finger

These two lines of sequence evidence
- cloning with degenerate primers targeted for zinc finger region of
  GATA factors

- sequence comparison of cloned sequenced gene NIL1 (aka GAT1) with
  previously sequenced GLN3, contains both a zinc finger similar to
  GAT1 and a highly acidic region that "could indicate a
  transcriptional activation domain"

Evidence: ISS; what term can be used?
  GO:0000987 - regulatory transcription factor site sequence-specific DNA binding
  GO:0000978 - RNA polymerase II regulatory transcription factor site sequence-specific DNA binding
  GO:0000982 - sequence-specific regulatory transcription factor site binding RNA polymerase II transcription factor activity
  GO:0001077 - sequence-specific regulatory transcription factor site binding RNA polymerase II transcription factor activity
              involved in positive regulation of transcription from RNA polymerase II promoter

- Disruption of NIL1 (aka GAT1) in a gln3 background
-- "total inability to produce the permease" GAP1
-- adding back either NIL1 or GLN3 restores ability to produce Gap1p permease
-- assayed with proline as N source
-- measuring Gap1p activity
==> QUOTE: "demonstrate that either Gln3p or Nil1p is required for the
    formation of the general amino acid permease."

Evidence: IGI with GLN3; for what term?
can probably combine with section 3

Results section 3: Activation of Transcription by Gln3p and Nil1p (aka Gat1p)

QUOTE: "We determined the abilities of strains defective in GLN3,
NIL1, or both to grow in media containing various sources of
nitrogen."

- txn activation in single or double mutants of Gln3 and Nil1
-- either single mutant results in small decrease in growth rate
-- double mutant results in severely reduced rate of growth, except on glutamine (wt didn't grow either)
-- used CYC1-lacZ fusion to UAS regions of GLN1, GDH2, and GAP1 on plasmids in vivo
-- measured by b-gal activity
-- compared ability of Gln3 and Nil1 to activate txn of each of these
  genes in media containing nitrogen sources, either Glu, Gln, urea,
  ammonia, or proline. Note that Gln was not shown in table because
  activity of WT cells was very low

Evidence: IMP, IGI with GLN3
</pre>

Latest revision as of 12:42, 9 April 2014

Annotation Jamboree

The Annotation conference call for November will feature an annotation jamboree.
The call is scheduled for Nov 8, 2010 (Monday) at 8:00 AM California time.

We will try a new format this time. We have picked 2 papers on transcription for this jamboree (yes, they are yeast papers!).
1) PMID 7568152
Role of the GATA factors Gln3p and Nil1p of Saccharomyces cerevisiae in the expression of nitrogen-regulated genes. Stanbrough M, Rowen DW, Magasanik B.
2) PMID 8655543
G1n3p is capable of binding to UAS(NTR) elements and activating transcription in Saccharomyces cerevisiae. Cunningham TS, Svetlov VV, Rai R, Smart W, Cooper TG.

I provided a simple template in my email to enter the annotations (I am unable to upload the template here). But basically provide annotations in the format:

Gene name [tab] GO Aspect [tab] GOID [tab] GO term [tab] Evidence [tab] With/From [tab] Qualifier [tab] Comments

Please send your annotations to me by November 3rd, and remember to name your sheet so I know who gave what. I will collate them and upload them on the wiki by Nov 5th. Each curator can send in individual sheets or each MOD can send their consensus sheet to me.
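
For anyone collating sheets by script, the tab-separated layout above can be read along the following lines. This is only a minimal sketch: the names `AnnotationRow`, `parse_rows` and `SAMPLE` are made up for illustration, and the sample row simply reuses the GLN3 / GO:0001077 / IDA annotation discussed in the minutes.

```python
from typing import List, NamedTuple


class AnnotationRow(NamedTuple):
    """One line of the jamboree template (field names are illustrative)."""
    gene: str
    aspect: str
    go_id: str
    go_term: str
    evidence: str
    with_from: str
    qualifier: str
    comments: str


def parse_rows(text: str) -> List[AnnotationRow]:
    """Split tab-separated lines into rows, padding missing trailing fields."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip empty lines between curators' entries
        fields = [f.strip() for f in line.split("\t")][:8]
        fields += [""] * (8 - len(fields))  # With/From, Qualifier, Comments may be absent
        rows.append(AnnotationRow(*fields))
    return rows


# Sample row for illustration only; With/From and Qualifier are left empty.
SAMPLE = (
    "GLN3\tF\tGO:0001077\t"
    "sequence-specific regulatory transcription factor site binding "
    "RNA polymerase II transcription factor activity involved in positive "
    "regulation of transcription from RNA polymerase II promoter\t"
    "IDA\t\t\tsample row\n"
)

if __name__ == "__main__":
    for row in parse_rows(SAMPLE):
        print(row.gene, row.aspect, row.go_id, row.evidence)
```

Padding short rows keeps sheets usable even when a curator leaves the last columns off.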

Collated Annotations

Collated annotations for

  • PMID 7568152 - Role of the GATA factors Gln3p and Nil1p of Saccharomyces cerevisiae in the expression of nitrogen-regulated genes. Stanbrough M, Rowen DW, Magasanik B.

Note about the PDF- when you open it you can barely read the text. Please zoom to read.
File:Jam7568152 Sheet2.pdf

  • PMID 8655543 - G1n3p is capable of binding to UAS(NTR) elements and activating transcription in Saccharomyces cerevisiae. Cunningham TS, Svetlov VV, Rai R, Smart W, Cooper TG.

File:Jam8655543.pdf

Minutes

Present: SGD- Rama, Karen, Julie, Jodi, Ruth, Rachael, Yasmin, Kimberly, Stan, Tim, Tom, Shur-Jen (RGD), Suzi, Pascale, Susan, Serenella (UniProt-SIB), Michele + Mindi (UniProt-EBI)[can't remember]

Announcements

1) The annotation conf. call and the Ref. genome conf. call are on two consecutive days (second Monday and second Tuesday of the month). If any of you have issues with this schedule please email Rama and Rachael.
2) If you have specific annotation-related issues that you want discussed at this Annotation conf. call, please post them on this wiki page- http://gocwiki.geneontology.org/index.php/Topics_for_annotation_call. Rachael and Rama will review them and prioritize.

Tips on finding the correct term

Many of these new terms have has_part relationships to other function terms that represent a single aspect of their overall function. For example, a term for "sequence-specific DNA binding transcription factor" would have a has_part relationship to "sequence-specific DNA binding". Since AmiGO does not display the has_part relationships, you will not see the connection between "sequence-specific DNA binding transcription factor activity" terms and the relevant "DNA binding" term. You can see these relationships in QuickGO; alternatively, Rama recommended that annotators start using OBO-Edit. David says that there is a way to have OBO-Edit load the ontology directly from the web without having to fiddle with cvs or do a separate download. OBO-Edit shows has_part and other relationships very clearly.
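
If you would rather check has_part links from a downloaded ontology file than browse for them, something like the sketch below may help. It assumes the usual OBO flat-file lines (`is_a: ...` and `relationship: has_part ...`). The three stanzas in `TOY_OBO` are a toy arrangement invented for illustration, not the real GO structure, and the function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Toy stanzas for illustration only: the links below are assumptions, not GO content.
TOY_OBO = """\
[Term]
id: GO:0000978
name: RNA polymerase II regulatory transcription factor site sequence-specific DNA binding

[Term]
id: GO:0000982
name: sequence-specific regulatory transcription factor site binding RNA polymerase II transcription factor activity
relationship: has_part GO:0000978 ! toy link

[Term]
id: GO:0001077
name: (child term used in the minutes)
is_a: GO:0000982 ! toy link
"""

Links = Dict[str, List[str]]


def parse_links(obo_text: str) -> Tuple[Links, Links]:
    """Collect is_a and has_part targets for each [Term] stanza."""
    is_a: Links = defaultdict(list)
    has_part: Links = defaultdict(list)
    current = None
    prefix = "relationship: has_part "
    for raw in obo_text.splitlines():
        line = raw.split("!", 1)[0].strip()  # drop trailing "! comment"
        if line == "[Term]":
            current = None
        elif line.startswith("id: "):
            current = line[4:].strip()
        elif current and line.startswith("is_a: "):
            is_a[current].append(line[6:].strip())
        elif current and line.startswith(prefix):
            has_part[current].append(line[len(prefix):].strip())
    return is_a, has_part


def inherited_has_part(term: str, is_a: Links, has_part: Links) -> Set[str]:
    """has_part targets of a term plus those of all its is_a ancestors."""
    seen: Set[str] = set()
    parts: Set[str] = set()
    stack = [term]
    while stack:
        t = stack.pop()
        if t in seen:
            continue
        seen.add(t)
        parts.update(has_part.get(t, []))
        stack.extend(is_a.get(t, []))
    return parts


if __name__ == "__main__":
    is_a, has_part = parse_links(TOY_OBO)
    print(sorted(inherited_has_part("GO:0001077", is_a, has_part)))
```

Walking up the is_a parents matters because, as noted in the discussion of the GLN3 paper, the has_part link may sit on an ancestor of the term you annotate to rather than on the term itself.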

Papers for discussion

PMID 8655543 (GLN3 paper)

Discussion
  • What level of term should be chosen for the MF annotation?
    • Question: Is it appropriate to pick RNA Pol II terms although the paper doesn't mention RNA Pol II? Yes, because all nuclear protein-coding genes are transcribed by RNA Pol II (with some exceptions in trypanosomatids). People doing research on transcription that is NOT done by RNA polymerase II will tell you which RNA polymerase is involved, whether it's another nuclear RNA polymerase or one from an organelle. When researchers do not specify which RNA polymerase and just talk about regulation of gene expression, RNAP II is a really safe bet. If you feel you can't tell and you are not comfortable, you can always go one level up. However, Karen and David encourage people to try to make the call to RNAP II, since we have both seen lots of papers on regulation of transcription of genes transcribed by RNAP II where researchers don't bother to mention the RNAP, but neither of us remembers ever seeing a paper on regulation of transcription by another RNAP where it is not mentioned. If curators do not make this call, we feel that there will be many annotations where we lose the specificity that it is RNAP II. Karen will check with David on whether we could put comments in relevant terms to help annotators remember this information.
    • Comment on multiple versus single MF annotation. Some groups tend to make only the most granular annotation that a given paper supports, in this case the GO:0001077 term. However, some groups make multiple annotations even if some terms are parents to other terms also annotated. Thus, it is also fine if you choose to make the direct annotation to the DNA binding aspect alone, even though the GO:0001077 term incorporated a has_part relationship to this DNA binding term already. Thus, the "RNA polymerase II regulatory transcription factor site sequence-specific DNA binding" (GO:0000978) by IDA is also fine if your group chooses to make this annotation.
  • Question: Why is there no function annotation to the MF term "promoter binding" or one of its child terms? As the authors describe in the introduction, the binding shown here is to a transcription factor binding site that is in the Upstream Activating Sequence (UAS), which is upstream of the promoter region, not part of the promoter. The background here is that the word "promoter" is loosely used by many people. The strict interpretation is that the promoter is only the sequence to which the RNA polymerase and basal factors bind, while any other regulatory sequences are not part of the promoter. However, many RNAP II researchers use the term "promoter" to refer to all of the regulatory sequence that is proximal to the gene itself (but excluding enhancers). These two usages are mutually exclusive. To try to be clear about what region is involved, Karen and David decided that we will use the phrase "core promoter" (commonly used in RNAP II literature) to describe the basal sequence where the RNAP binds and "transcription regulatory region" to describe all of the rest of it. Note that an ancestor of the recommended function term (GO:0001077) already incorporates a has_part relationship to the appropriate type of DNA binding.
  • Questions on BP annotations from this paper
    • Question: How can you NOT make a Process annotation when you know the function? The experiments in this paper primarily generate annotations that are all in MF for GLN3. Since this paper doesn't show evidence for the nitrogen catabolite repression part and explicitly states that the Gln3-LexA fusion constructs do not show appropriate response to nitrogen source, you cannot make an annotation to a process as specific as "nitrogen catabolite activation of transcription from RNA polymerase II promoter" (GO:0001080) based on the experimental evidence. If you feel you need to make a process annotation based on the experimental evidence, you can only go as far as "positive regulation of transcription from RNA polymerase II promoter" (GO:0045944). In addition, if you make a function annotation to "sequence-specific DNA binding RNA polymerase II transcription factor activity" (GO:0000981) or to one of its children including the recommended GO:0001077 term, then a BP inference will be made from the MF annotation based on inter-ontology relationship.
    • Possible new term request: "positive regulation of organic nitrogen metabolic process by activation of transcription". Ruth: Should a new term be created in order to distinguish between nitrogen in the media vs stored nitrogen that is broken down? Does it matter where the nitrogen is, or in what form? Is it true for multicellular organisms? Doesn't the existing term refer specifically to breakdown of internally stored nitrogen compounds? Karen & Julie: The authors clearly state these genes are acting in the Nitrogen Catabolite Repression (NCR) process, where transcription is regulated by the presence of nitrogen compounds, in this case added exogenously. This is a standard way to do this type of assay. Use of the phrase "nitrogen catabolite" is the standard terminology for this process, so this is what we should go with. Reading the definition does not reveal any requirement that the nitrogen be produced by catabolism of internally stored nitrogen. Karen thinks it is likely that this term is equally relevant to multicellular eukaryotes, but if someone is ever able to prove that there is different regulation when the nitrogen source is internally catabolized nitrogen versus exogenously added, then we can make subtypes of the term. Currently this term would be appropriate regardless of whether the nitrogen was produced internally or provided externally.
    • Question: Can you make an annotation to a process term that includes the fact that the regulation occurs in response to nitrogen catabolites by IC? No, because the most specific GO term that can be selected is the MF term "sequence-specific regulatory transcription factor site binding RNA polymerase II transcription factor activity involved in positive regulation of transcription from RNA polymerase II promoter" (GO:0001077). This term is not specific enough to allow the inference that the regulation is in response to nitrogen compounds. This function term is appropriate for transcription factors involved in regulation of many different processes, not just the process discussed in this paper. If you do wish to make a process annotation at that level, you can make one by TAS because the introduction clearly cites previous work demonstrating that GLN3 is involved in this regulation.
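A minimal sketch of the "multiple versus single MF annotation" point above, assuming a one-entry lookup table in place of the real ontology. The minutes state that GO:0001077 reaches the DNA binding term GO:0000978 through a has_part relationship held by one of its ancestors; the table collapses that into a direct entry. The names IMPLIED_BY_HAS_PART and split_annotations are hypothetical and are not part of any GO tool.

```python
# Assumption: this table stands in for the ontology. Per the minutes,
# GO:0001077 reaches GO:0000978 via a has_part relationship on an ancestor;
# here that path is collapsed into one direct entry for illustration.
IMPLIED_BY_HAS_PART = {"GO:0001077": {"GO:0000978"}}


def split_annotations(term_ids):
    """Split annotated terms into (stand-alone, already implied via has_part).

    A term counts as implied only when another term in the same set
    points to it in IMPLIED_BY_HAS_PART.
    """
    terms = set(term_ids)
    implied = set()
    for term in terms:
        implied |= IMPLIED_BY_HAS_PART.get(term, set()) & terms
    return sorted(terms - implied), sorted(implied)


if __name__ == "__main__":
    standalone, implied = split_annotations(["GO:0001077", "GO:0000978"])
    print("stand-alone:", standalone)
    print("optional (implied by has_part):", implied)
```

Either outcome is acceptable under the discussion above: a group that annotates only the most granular term keeps the stand-alone list, and a group that also records the DNA binding aspect keeps both.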
Annotations from this paper
Recommended MF annotations from experimental evidence (or less granular if you feel you cannot make the call that it is RNAP II):
MF | GO:0001077 | IDA, IMP | sequence-specific regulatory transcription factor site binding RNA polymerase II transcription factor activity involved in positive regulation of transcription from RNA polymerase II promoter
Optional MF annotation:
MF | GO:0000978 | IDA | RNA polymerase II regulatory transcription factor site sequence-specific DNA binding

Other annotations which can be made if you feel you need to make BP and CC annotations from this paper:

BP | GO:0001080 | TAS | nitrogen catabolite activation of transcription from RNA polymerase II promoter 
BP | GO:0045944 | IDA | positive regulation of transcription from RNA polymerase II promoter


CC | GO:0005634 | IC | nucleus (IC from GO:0001077, or a related MF term provided it includes "transcription factor activity")
Notes for this paper
Results section 1 - Expression of GLN3 in E. coli
- construct HA-GLN3 in expression vector for E. coli
- purification: 2 major & many minor species (full length is very minor species)

Results section 2 -  Specific Gln3p-dependent binding to DAL3 DNA fragments containing GATAA sequences
- use small DNA fragment from UAS3 containing 3 well characterized UAS(NTR) sites
- get Gln3p-dependent band shifts
- DNA fragment CAN1-1 (previously shown to support Gln3p-dependent
   NCR-sensitive txn) was effective competitor
- mutation of GATAA sequences away from consensus abolished binding

Results section 3 - Gln3p-dependent binding to wild-type and mutant DNA fragments derived from GLN1
- binds to GLN1-162
- does not bind when GATAA sequence is mutated

Results section 4 - Gln3p-dependent binding to wild-type and mutant PUT1 DNA fragments
- binds to the wt sequence containing 3 sites: GATAAA, GATAG, and CTTATC (GATAAG)
- mutation of the 5' GATAAA sequence (fragment PUT1-2M1) = modest decrease
- mutation of the 3' GATAAG sequence = barely detectable binding
- double mutant lacking both the GATAAA and GATAAG sequences = no demonstrable binding
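The parenthetical "CTTATC (GATAAG)" in the site list above reflects that CTTATC, read on the opposite strand, is GATAAG. A minimal sketch that checks this by reverse complement; the helper names are hypothetical, and using "GATA" as the shared core of the three listed sites is an assumption made for illustration only.

```python
# Translation table mapping each DNA base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite-strand sequence, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def strand_with_core(site: str, core: str = "GATA"):
    """Report which strand carries the core: '+', '-', or None.

    The default core "GATA" is an illustrative assumption, not a
    definition taken from the paper.
    """
    site = site.upper()
    if core in site:
        return "+"
    if core in reverse_complement(site):
        return "-"
    return None


if __name__ == "__main__":
    # The three PUT1 sites as written in the notes.
    for site in ("GATAAA", "GATAG", "CTTATC"):
        print(site, reverse_complement(site), strand_with_core(site))
```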

Results section 5 - Gln3p-dependent binding to wild-type and mutant UGA4 DNA fragments
Results section 6 - Gln3p-dependent binding to wild-type and mutant GDH2 DNA fragments.
Results section 7 - Gln3p-dependent binding to wild-type and mutant CAR1 DNA fragments.
(more of the same)

Evidence: IDA; terms to consider based on these results:
  GO:0000987 - regulatory transcription factor site sequence-specific DNA binding
  GO:0000978 - RNA polymerase II regulatory transcription factor site sequence-specific DNA binding

Results section 8 - Reporter gene expression mediated by LexAp-Gln3p.
- generate a LexA-Gln3p fusion protein
- assay plasmid-based reporters containing 1 LexA site: txn (none with LexA protein alone)
- assay plasmid-based reporters containing 4 LexA sites: much more txn (none with LexA protein alone)

Evidence: IDA (using cell based system with reporter constructs)

Results section 9 - Assay of gln3 mutant alleles for the ability to support reporter gene expression.
1st paragraph
- using reporter construct above
-- deletion of 470-670 or 470-end: no effect
-- deletion of the first 150 Gln3p residues decreased reporter gene expression 15-fold.

evidence: IMP 

2nd paragraph
- to assess in vivo function, constructed mutant alleles and assayed abilities to complement a gln3-delta allele.
-- zinc finger mutant (pVS316-3), Gln3p residue Cys-306 was mutated to encode Ser
-- acidic region mutant (pVS316-2), residues 2 to 150 were deleted

Growth rate in glucose-Asn medium supported by a plasmid-borne wt GLN3 allele was two to three times lower than that of:
- gln3-delta strain RR91
- strain RR91 transformed with plasmid pRS316 (empty control plasmid)
- strain RR91 transformed with plasmid pVS316-3 (zinc finger mutant)
- strain RR91 transformed with plasmid pVS316-2 (acidic region deletion)

Remember that this nitrogen source (Asn) activates NCR
  "When readily used nitrogen sources (e.g., Asn, Gln, or ammonia in
  some strains) are available, NCR (Nitrogen Catabolite Repression)
  sensitive genes are expressed only at low levels (14)."
=> either mutant is the same as empty control plasmid, i.e. inactive in NCR

Evidence: IMP

Terms to consider based on all experimental results:
  GO:0000987 - regulatory transcription factor site sequence-specific DNA binding
  GO:0000978 - RNA polymerase II regulatory transcription factor site sequence-specific DNA binding
  GO:0000982 - sequence-specific regulatory transcription factor site binding RNA polymerase II transcription factor activity
  GO:0001077 - sequence-specific regulatory transcription factor site binding RNA polymerase II transcription factor activity 
               involved in positive regulation of transcription from RNA polymerase II promoter


PMID 7568152 (GAT1/NIL1 paper)

Discussion
  • What is the appropriate level of granularity of P terms, i.e. why stop at nitrogen catabolites, and not go on to make terms for proline, glutamine, glutamate, etc.? Because the regulation is no different at the more specific levels. The main idea of the paper is to show transcription in response to nitrogen compounds in general.
Annotations from this paper
GAT1	F	GO:0001077	sequence-specific regulatory transcription factor site binding RNA polymerase II transcription factor activity involved in positive regulation of transcription from RNA polymerase II promoter	ISS	with GLN3
GAT1	P	GO:0001080	positive regulation of transcription from RNA polymerase II promoter by nitrogen catabolites	IMP	
GAT1	P	GO:0001080	positive regulation of transcription from RNA polymerase II promoter by nitrogen catabolites	IGI	with GLN3
GLN3	P	GO:0001080	positive regulation of transcription from RNA polymerase II promoter by nitrogen catabolites	IMP	
GLN3	P	GO:0001080	positive regulation of transcription from RNA polymerase II promoter by nitrogen catabolites	IGI	with GAT1 (NIL1)
Notes for this paper
FYI: NIL1 is an alias for GAT1

Results section 2: Isolation of the NIL1 Gene (aka GAT1)

- cloned NIL1 using degenerate primers for the characteristic zinc
  finger of GATA factors based on three genes: 
  S. cerevisiae GLN3, N. crassa NIT2, & A. nidulans AREA
-- obtained 4 groups of clones: GLN3, DAL80, NIL1 (aka GAT1), NIL2
-- NIL1 (aka GAT1) mapped to Chr VI & NIL2 to Chr X

- comparison of predicted protein sequences:
-- NIL1 is more similar to GLN3 than either is to DAL80 or NIL2
-- both GLN3 & NIL1 have an N-terminal 160 aa acidic region and a zinc finger

These two lines of sequence evidence 
- cloning with degenerate primers targeted for zinc finger region of
  GATA factors

- sequence comparison of cloned sequenced gene NIL1 (aka GAT1) with
  previously sequenced GLN3, contains both a zinc finger similar to
  GAT1 and a highly acidic region that "could indicate a
  transcriptional activation domain"

evidence: ISS ; what term can be used?
  GO:0000987 - regulatory transcription factor site sequence-specific DNA binding
  GO:0000978 - RNA polymerase II regulatory transcription factor site sequence-specific DNA binding
  GO:0000982 - sequence-specific regulatory transcription factor site binding RNA polymerase II transcription factor activity
  GO:0001077 - sequence-specific regulatory transcription factor site binding RNA polymerase II transcription factor activity 
               involved in positive regulation of transcription from RNA polymerase II promoter

- Disruption of NIL1 (aka GAT1) in a gln3 background
-- "total inability to produce the permease" GAP1
-- adding back either NIL1 or GLN3 restores ability to produce Gap1p permease
-- assayed with proline as N source
-- measuring Gap1p activity
==> QUOTE: "demonstrate that either Gln3p or Nil1p is required for the
    formation of the general amino acid permease."

evidence: IGI with GLN3; for what term?
can probably combine with section 3


Results section 3: Activation of Transcription by Gln3p and Nil1p (aka Gat1p)

QUOTE: "We determined the abilities of strains defective in GLN3,
NIL1, or both to grow in media containing various sources of
nitrogen."

- txn activation in single or double mutants of Gln3 and Nil1
-- either single mutant results in small decrease in growth rate
-- double mutant results in severely reduced rate of growth, except on glutamine (wt didn't grow either)
-- used CYC1-lacZ fusion to UAS regions of GLN1, GDH2, and GAP1 on plasmids in vivo
-- measured by b-gal activity
-- compared ability of Gln3 and Nil1 to activate txn of each of these
   genes in media containing nitrogen sources, either Glu, Gln, urea,
   ammonia, or proline. Note that Gln was not shown in table because
   activity of WT cells was very low

evidence: IMP, IGI with GLN3