Transcription

From GO Wiki

Background

There are a large number of SourceForge items (some as old as 2003) relating to problems with the representation of transcription, especially in the Molecular Function and Biological Process aspects. A major goal of this reorganization is to come up with a more sensible and consistent organization of transcription-related terms.

Karen Christie will be attending the Gene Transcription in Yeast meeting in Sant Feliu de Guixols in Spain in late June. The plan is to have a good idea of what needs to be done prior to the meeting and to have a list of specific questions which need expert input, so that we can make the needed changes and conclude this project within a few weeks after the meeting.

July 2010 - Karen and David discussed the issues during May and June and came up with ideas about how we would like to change the representation of transcription related terms in both the Molecular Function and Biological Process aspects prior to the Gene Transcription in Yeast meeting which Karen attended in June. Discussing issues with experts there largely clarified our thoughts on how to proceed. We now have a number of proposals for how we would like to reorganize the representation of transcription in GO.

Action Items

  1. Karen - contact Jim Hu re
    • whether basal vs specific transcription is a relevant concept for bacterial transcription
    • possible obsoletion of RNA polymerase core enzyme binding
  2. David - find example of heterodimer that binds DNA and also interacts with basal and specific transcription factors to help clarify activator vs coactivator issue (and repressor vs corepressor too if you see one). Examples: PMID: 9838003, PMID: 19879326, PMID: 10410794, PMID: 8999959.

SourceForge items

commencing June 2008, trying to get a handle on the various issues that have been reported

2010-04-29 - "RNA polymerase II transcription factor activity" parentage closed, consolidated into 2003 item
2010-02-19 - TPV: "reg of txn factor activ" not always "reg of DNA bind"
2010-02-17 - protein-regulatory region DNA complex
2010-02-16 - placement of Transcription cofactor activity
2010-01-04 - specific RNA polymerase II transcription factor activity closed, consolidated into 2003 item
2009-11-27 - NTR: Basal transcription activator activity
2009-11-18 - MP:specific RNA polymerase II transcription factor activity closed, consolidated into 2003 item
2008-11-03 - transcription repressor activity? Confusion
2008-06-21 - transcription factor activity, child problem closed, consolidated into 2003 item
2008-05-12 - transcription activator/coactivator
2008-04-28 - mp:specific RNA polymerase II transcription fact closed, consolidated into 2003 item
2008-03-21 - NTR: specific transcriptional activator activity
2008-02-12 - transcriptional preinitiation/initiation
2008-02-12 - transcription initiation def closed, consolidated into 2007-06-13 item
2008-02-04 - mv DNA-directed RNA polymerase II, holoenzyme complex?
2007-12-18 - term merge?nonspecific RNA polymerase
2007-12-18 - missing parent:RNA pol II transcription factor activity closed, consolidated into 2003 item
2007-11-14 - rg: def. of holo vs. core TFIIH complex
2007-08-06 - query parent hyperphosphorylation of RNA polymerase II done and closed
2007-07-25 - transcription related terms
2007-06-13 - transcription initiation def
2006-11-03 - general and specific transcription
2006-08-16 - update def: transcription factor TFIIIC complex
2006-08-10 - Definition change GO:0030014; CCR4-NOT complex
2006-05-23 - new MF terms for eukaryotic RNA polymerase binding
2006-02-02 - Transcription factor related closed, consolidated into 2003 item
2005-10-06 - snoRNA transcription
2004-05-14 - transcription factors
2004-02-02 - promoter escape
2004-02-02 - promoter opening
2003-03-17 - transcription factor activity
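
Several entries above carry a trailing status note ("closed, consolidated into ... item"). When tallying which items still stand on their own, it can help to split each entry into date, title, and consolidation target. The following is a minimal sketch, assuming the "YYYY-MM-DD - title" layout used in this list; the helper name `parse_item` and the field names are hypothetical and not part of any SourceForge or GO tooling.

```python
import re

# Hypothetical pattern for one line of the list above. Assumes the layout
# "YYYY-MM-DD - title [closed, consolidated into <target> item]".
ITEM_RE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) - (?P<title>.*?)"
    r"(?:\s*closed, consolidated into (?P<target>\S+) item)?$"
)


def parse_item(line):
    """Split one list entry into date, title, and consolidation target."""
    match = ITEM_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized item line: {line!r}")
    return {
        "date": match.group("date"),
        "title": match.group("title").strip(),
        # None means the entry has no consolidation note.
        "consolidated_into": match.group("target"),
    }


# Three entries copied from the list above.
sample = [
    '2010-04-29 - "RNA polymerase II transcription factor activity" '
    "parentage closed, consolidated into 2003 item",
    "2010-02-17 - protein-regulatory region DNA complex",
    "2008-02-12 - transcription initiation def closed, "
    "consolidated into 2007-06-13 item",
]

items = [parse_item(line) for line in sample]
not_consolidated = [
    item["title"] for item in items if item["consolidated_into"] is None
]
print(not_consolidated)
```

An entry without a consolidation note is not necessarily open (one item above is marked "done and closed"), so the sketch only separates consolidated duplicates from everything else.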

Issues to address

Here I've gone through all the transcription related items, identified which aspect they relate to, and summarized the issues. Several items were closed as duplicates of other items. When comments were present, they were pasted into the remaining open item.

Molecular Function

  1. Which "txn factor activities" need to be related to DNA binding? and which do not?
    • There is no relationship between transcription factor activity and specific Pol II txn factor activity
      • Notes
        - reported in both directions, but multiple essentially identical items: main is transcription factor activity 2003-03-17; (Also: 2006-02-02, 2007-12-18, 2008-04-28, 2008-06-21, 2010-04-29, 2010-01-04, 2009-11-18)
        - transcription factor activity has DNA binding parentage, but it is not parent of specific Pol II txn factor activity which has no relationship to DNA binding
      • Discussion (May 12):
        - The existing term transcription factor activity probably needs to be renamed to reflect its current definition and relationship to DNA binding, to something like regulatory DNA-binding transcription factor activity
        - Existing terms like "RNA polymerase II transcription factor activity" are fine as is without relationship to "DNA binding" because they are intended to cover all factors needed for txn from Pol II promoter, including general txn factors that do not appear to bind DNA on their own. There will need to be child terms to be specific for DNA binding Pol II txn factors vs general, etc.
    • Defs of transcription activator activity (GO:0016563) vs transcription coactivator activity (GO:0003713); item from 2008-05-12: transcription activator/coactivator
      • Notes
        - The activator term is a parent of the coactivator term. The Coactivator term specifies lack of DNA binding, while the activator term makes no comment on DNA binding.
        - Zanetti et al. 2004 defines a coactivator as a "category of eukaryotic proteins termed co-activators, which enhance transcription by interacting with both general and gene-specific transcription factors,"
        - also read about MBF1
        - question about whether an activator is equivalent to "specific RNA polymerase II transcription factor activity"
      • Discussion (May 12):
        - This is clearly an area that is represented poorly in the ontology and which is confusing to annotators.
        - It may be that the usage of coactivator in the literature is not consistent across the board. However, the Zanetti et al. paper suggests that we may be able to represent the type of activity that bridges between general and specific txn factors.
        - Karen plans to utilize attendance at the meeting to get expert opinions on this, since previous reading has not brought clarity.
        - David brought up the question about what to do in a case of a heterodimer where it binds DNA and also bridges between general and specific txn factors. He will look up an example paper. Examples: PMID: 9838003, PMID: 19879326, PMID: 10410794.
      • More Notes
        - Note that these three terms have general sounding names but Pol II specific defs: transcription cofactor activity GO:0003712, transcription coactivator activity GO:0003713, transcription corepressor activity GO:0003714
        => look into cAMP response element binding protein binding. This term is undefined but placed under transcription coactivator activity
        - Note that Gross et al 1998 talk about activators in context of E. coli in which most activators "bind to sites at or near the promoter and make direct contact with one of the subunits of RNAP"
    • Defs of transcription repressor activity (GO:0016564) vs transcription corepressor activity (GO:0003714) and relationships to DNA binding; item from 2004-05-14: transcription factors
      Discussion (May 12):
      - This is a parallel situation to coactivator activity and will likely need the same treatment
    • NTR for specific transcriptional activator activity for when you don't know whether or not it binds DNA; item from 2008-03-21: NTR: specific transcriptional activator activity
      If you don't know how it acts, is this a function? Maybe if you only know that it has an effect, but nothing about how, you should make only process annotations...
      Discussion (May 12):
      - Agreed, this case seems appropriate for process annotation only, no function term needed.
  2. Distinction between basal/general versus specific transcription (at multiple levels): need clear definitions
    • Notes
      - need clear definition of general txn (basal used instead in Sikorski & Buratowski 2009); item from 2007-07-25: transcription related terms
      - same issue applies to children of transcription repressor activity (GO:0016564) ; item from 2004-05-14 transcription factors
      - Merge basal transcription repressor activity (GO:0017163) and general transcriptional repressor activity (GO:0016565); item from 2004-05-14 transcription factors
      - Is nonspecific txn equivalent/identical to general txn?; item from 2007-12-18: term merge?nonspecific RNA polymerase
      - Does this apply beyond RNAP II??? [no SF item]
    • Discussion (May 12):
      - We are in agreement here that basal = general = non-specific and that we need to have a clear definition and appropriate synonyms.
      - Regarding the request to implement the basal vs specific distinction at a higher level than RNAP II, there does not seem to be any current justification for doing this for RNAPs I or III. Karen will check with Jim Hu on whether this would be a useful distinction for prokaryotic txn. If there is a general need for this type of distinction, we will have higher level terms for it, but if it is truly an RNAP II specific issue, then it will stay under RNAP II.
  3. requests for more specific RNA polymerase binding terms; item from 2006-05-23 - new MF terms for eukaryotic RNA polymerase binding
    • some history here:
      - We had one term that is phrased and defined to be specific to E. coli/prokaryotic RNAP
      - The SF item requested binding terms for binding to core and holoenzyme forms for RNAP II [Note that since the time of this item, I have seen reviews discussing the multiplicity of large RNAP II complexes called "holoenzyme".]
      - the SF item also contains a list of a number of possible RNAP ___ binding terms that would be specific for a variety of RNAPs found in eukaryotes, as well as some possible grouping terms
      - A second more general term was added for "RNA polymerase binding" that is very general (see RNA polymerase binding)
    • How do we want to move forwards?
    - Is this better handled with a single term for RNA polymerase binding, using col 16 with a PRO ID if you want to be more specific?
    - Alternatively, is there reason to make a specific term for different varieties of RNAP when there is more than one kind in the same cell normally (e.g. RNAPs I, II, III, IVa, IVb, mito, & chloro all exist in normal plant cells)?
    - Is there any reason to keep the existing term specific for E. coli/prok RNAP, when there is only 1 RNAP in E. coli? (should get input from Jim Hu, etc.)
    • Discussion (May 12):
      - David has an issue about these binding terms in general as there is no clear link to be made to process.
      - For the present, we are both in favor of having a single general term for RNA polymerase binding and those who wish to capture more detail will need to do so via column 16 or via other systems than GO.
      - We recommend merging the E. coli specific term into the more general one. [Karen is to get input from Jim Hu on this.]
  4. Making Function-Process links [no SF item]
    I'm thinking that we'll need terms for things like
    • DNA-directed RNA polymerase activity involved in transcription from RNA polymerase I promoter
    • DNA-directed RNA polymerase activity involved in transcription from RNA polymerase II promoter
    • DNA-directed RNA polymerase activity involved in transcription from RNA polymerase III promoter
    What is the limit of this? Cells that have those three types of RNAPs also have mito RNAPs, and some additional RNAPs if they are plant cells. Do we need a term for every different type of RNAP in a single cell normally?
    • Discussion (May 12):
      - Funnily enough, we used to have terms called things like "RNA polymerase I activity". However, the definitions were problematic in that they attempted to use the RNAs produced as the defining characteristic and this doesn't work.
      - In the current method of making function-process links, the defining characteristic will merely be the type of promoter.
      - We are currently thinking that we will have a function-process linking term for each different RNAP that exists in a single cell. To take the example with the largest number of types, a plant cell has multiple nuclear RNAPs (I, II, III, IVa, and IVb) and multiple organellar RNAPs (mitochondrial and a couple different plastid ones), so likely we'll need to represent each polymerase separately, providing that it makes sense to have corresponding process terms.
      - A question is how granular we will want to go. For example, E. coli has multiple sigma factors (70, 32, more) that each recognize a class of promoters. Should we represent this? Possibly, but we will let the E. coli people drive the need here.
      - We are both in agreement that we do not want to go so granular that every different gene gets a specific function-process linking term. We want to keep this at the level of broad classes of promoters.
      - For the moment, Karen will only represent the eukaryotic ones she understands, but the structure should be such that additional terms can be added if needed.
  5. additional catalytic activity of RNAPs [no SF item]
    - RNAPs, at least multisubunit ones, all seem to have a "transcript cleavage" activity
    - even when an additional factor is required (e.g. Gre, TFIIS), the cleavage is thought to be performed by the RNAP catalytic site (Sigurdson 2010)
    - type of endoribonuclease activity ?
    - Is it known if it is more specific, e.g. producing 3' or 5' phosphomonoesters [? for mtg]
    • Discussion (May 12):
      - David asked if the txpt cleavage activity is intrinsic to the process of polymerizing or regulatory. Based on recent Sigurdson paper, I'm inclined to think that it is intrinsic to the process of polymerizing RNA, at least for multisubunit RNAPs. However, based on the fact that multiple forms of TFIIS, some of which are tissue specific, exist in mammals, it seems that it may also be a regulated activity, at least for RNAP II. These are questions to try to get input on at the meeting.
  6. Terms that are too specific with respect to DNA sequence or gene products interacted with [no SF item]
    • Existing children of promoter binding (GO:0010843)
      Do we want to go that specific, as in specific DNA motifs???; I'm inclined to say no, and obsolete these; some way of indicating the motif via col 16 seems better if that level of detail is desired
    • Existing descendants (3 total) of transcription coactivator activity (GO:0003713) seem too specific, too tied to specific gene products, and cAMP one is not even defined
      - cAMP response element binding protein binding GO:0008140
      - ligand-dependent nuclear receptor transcription coactivator activity GO:0030374
      - thyroid hormone receptor coactivator activity GO:0030375
    • Discussion (May 12):
    - We are both in favor of obsoleting the existing children of promoter binding and of transcription coactivator activity, though we may have to wait until column 16 is in production so that those who capture this level of detail may put it there.
  7. Are these terms really functions?
    There are some terms in the function ontology that do not seem to represent a clear function, but rather represent a process.
    - transcription regulator activity - Based on the definition of this term, "Plays a role in regulating transcription; may bind a promoter or enhancer DNA sequence or interact with a DNA-binding transcription factor.", this term does not seem to represent functions grouped based on similarity in how they function, but instead a grouping term designed to keep together things which act in the same process. Is this essentially equivalent to the process term regulation of transcription?
    - transcription elongation regulator activity - This seems to represent basically anything that affects elongation and may be equivalent to the process term regulation of RNA elongation. Things which have been described as elongation factors include TFIIS, which acts directly on RNAP II to stimulate txpt cleavage, and lots of things which modulate chromatin structure.
    - zinc-mediated transcriptional activator activity seems like a process, and perhaps too specific
  8. What is transcription initiation factor activity?
    - This term is not defined.
    - It has two child terms (mito init factor and sigma factor), both of which are defined and which basically mean a subunit which interacts with core RNAP and which then confers sequence specific binding to the promoter sequence by the holoenzyme. Neither the core RNAP nor the "init factor" binds to DNA independently. Since sigma binds directly to core RNAP, should it perhaps have "RNAP binding" as a parent? It also binds DNA, but only when in a complex with RNAP; should it have "sequence specific DNA/promoter binding" as a parent also?
    - TFIIIB is also described as an initiation factor (Schramm & Hernandez 2002 cite Kassavetis et al. 1990) and like sigma plays a role recruiting RNAP III to the promoter and also in opening the transcription bubble
  9. Some ideas and notes attempting to define common functions without regard to commonly used words
    • recruitment factor or promoter recognition activity - "Thus, TFIIIA, TFIIIC, and SNAPc can be viewed as recruitment factors whose main function is to recruit TFIIIB to promoters of various structures, which then allows the recruitment of RNA polymerase III." from Schramm & Hernandez 2002
      => sequence specific DNA binding activity involved in pre-initiation complex formation??? [not specific enough, this would be true of TFIIIB also]
      => sequence specific DNA binding activity involved in recruiting basal transcription factor to promoter region??? (this would be true for both TFIIIA and TFIIIC, i.e. things that bind DNA on their own, and then recruit another txn factor)
      => DNA binding activity involved in recruiting RNA polymerase to promoter region??? (This would be true for TFIIIB, which binds DNA though it needs TFIIIC to bring it to the right place, but which once there binds on its own, and recruits RNAP to the promoter)
    • how to describe things like sigma?
      - sigma binds core RNAP (to make holo) and then sigma binds DNA as part of holo while TFIIIB, for example, is brought to DNA and then recruits RNAP III
      => insert phrase to describe what sigma does here
    • E. coli transcription activator activity - binds specific DNA sequence and contacts some component of the basal txn machinery, usually sigma or alpha. Generally basal promoter is "impaired" such that it needs activator in order for holo to be brought to promoter
      => sequence specific DNA binding activity involved in activation of transcription??? (might want to be more specific about what is doing to bring about "activation")
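
The parentage questions in item 1 (which terms imply DNA binding through their ancestors, and which do not) can be checked mechanically once the is_a relationships are written down. The following is a minimal sketch over a toy graph that contains only relationships mentioned in the notes above. The dictionary `IS_A` and the helper names are hypothetical, the direct edge from transcription factor activity to DNA binding is a simplification of "has DNA binding parentage", and none of this is the real GO graph.

```python
# Toy is_a graph built only from relationships mentioned in the notes above.
# It is NOT the real GO graph; real parentage must be read from the ontology.
IS_A = {
    # "has DNA binding parentage" (shown here as a direct edge, an assumption)
    "transcription factor activity": ["DNA binding"],
    # the activator term is a parent of the coactivator term
    "transcription coactivator activity": ["transcription activator activity"],
    # undefined term that is placed under transcription coactivator activity
    "cAMP response element binding protein binding": [
        "transcription coactivator activity"
    ],
    # its parents are not listed in the notes, so none are recorded here
    "specific RNA polymerase II transcription factor activity": [],
}


def ancestors(term, graph):
    """Return every term reachable from `term` by following is_a edges."""
    seen = set()
    stack = list(graph.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(graph.get(parent, []))
    return seen


def implies_dna_binding(term, graph):
    """True if annotating to `term` also implies DNA binding via is_a."""
    return "DNA binding" in ancestors(term, graph)


for term in IS_A:
    print(f"{term}: implies DNA binding = {implies_dna_binding(term, IS_A)}")
```

In this toy graph only transcription factor activity implies DNA binding, which mirrors the inconsistency reported in the 2003-03-17 item.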

Biological Process

  1. general vs. specific transcription (in Process)
  2. RNA pol transcription cycle
  3. some focused, more specific issues in Process
    • 2005-10-06 - snoRNA transcription
      immediate question of whether any snoRNAs are directly transcribed was answered, but it led to some suggestions for better representation of how snoRNAs are produced; rather specialized to snoRNAs though


Cellular Component

  1. composition of complexes
  2. parentage issues
  3. Are TF complexes always nuclear? If we change the TF activity term, we will need to change TF complex to follow.

Meetings

  • Wednesday, May 12
    Time: 9-10 am Pacific
    Present: David Hill & Karen Christie
    We discussed Function items 1-6. We were generally in agreement. Specific comments have been added to each item.

Questions to resolve

  1. Definitions of:
    • coactivator
    • corepressor
    • activator - Is an activator the same as "specific transcription factor activity"?
  2. new function term to represent the intrinsic txpt cleavage activity of RNAPs
    • Placement of this term, is it known what it releases?
    • Found in all multisubunit RNAPs (E. coli, euk I, II, III, ...), is it also found in single subunit RNAPs?
  3. Are basal & specific/activated relevant for all RNAPs?
    • Is basal vs specific relevant for either RNAP I or RNAP III?
    • Is the usage of basal vs activated/specific the same for E. coli and RNAP II?

Summary of Issues [old - to be merged into updated issues to address]

Many of these issues are derived from the various SourceForge items. However, I am also putting in other things I come across.

Ontological Issues

In Process
  1. sets of terms that may be equivalent/synonymous
    • positive regulation of transcription from RNA polymerase II promoter (GO:0045944) and positive regulation of transcription from RNA polymerase II promoter, global (GO:0045817) or if regulation of global transcription needs a distinct term, perhaps defs could be clearer
    • tRNA transcription from RNA polymerase III promoter (GO:0042797) and tRNA transcription (GO:0009304) - not sure the more specific term is really needed (right now it's the only child of its parent), unless we're going to go the other way and have mitochondrial tRNA transcription.
  2. inconsistency in term name organization
    • positive regulation of specific transcription from RNA polymerase II promoter (GO:0010552) and positive regulation of transcription from RNA polymerase II promoter, global (GO:0045817) - would be good to have these two term names done in the same way instead of one with a comma and one without
  3. no representation of transcript cleavage, like that stimulated by PPR2, see Weilbaecher RG, et al. (2003) Intrinsic transcript cleavage in yeast RNA polymerase II elongation complexes. J Biol Chem 278(26):24189-99 PMID:12692127, or should this be in function?...

Biological Questions

  1. General vs specific transcription
    1. Is this distinction specific to Pol II, or does it also apply to Pol I and Pol III (other RNAPs)?
    2. Defining the distinction between general and specific txn?
    3. Defining the distinction between general and specific repressors?
    4. Is nonspecific synonymous with general?
    5. Do specific transcriptional activators need to have DNA binding activity?
  2. the transcription cycle
    1. defining initiation - does initiation include PIC formation?
    2. promoter opening - Is this step regulated/activated for RNAPs other than RNAP II?
    3. promoter escape - part of initiation or elongation?
    4. CTD phosphorylation - part of initiation or elongation?

Research

start on p 16


List of references relevant to transcription

Phrases of note

- from Geiduschek EP & Kassavetis GA. (2001)

  • core transcription factors (for Pol III)
  • direct promoter recognition (for Pol III)
  • RNA chain elongation
  • hydrolytic retraction (a form of transcript cleavage)
  • endopyrophosphorolytic retraction (a form of transcript cleavage)
  • holoenzyme - "In the original bacterial terminology, a holoenzyme is the form of the RNA polymerase that is competent to execute an entire transcription cycle, including specific initiation."
  • promoter recognition - primary function of bacterial sigma
  • regulation of promoter clearance - secondary function of sigma70
  • assembly of anti-terminating elongation complexes - secondary function of sigma70

- from Oficjalska-Pham D, et al. (2006)

  • general negative cofactor (for Pol III)

- from Schramm L & Hernandez N. (2002)

  • recruitment factor - factors like TFIIIA, TFIIIC, and SNAPc that bind a specific DNA sequence and function to bring another factor or the polymerase to the DNA

- from Beskow A & Wright AP. (2006)

  • regulatory transcription factor: "(rTFs) that bind specific DNA-binding sites and activate or repress expression of adjacent genes"

Transcription cycle

"In bacterial transcription, the functions of core promoter sequence recognition and initiation of promoter opening are combined in one subunit of the RNA polymerase holoenzyme. In eukaryotic and archaeal transcription this labor is divided: core promoter recognition is assigned to the transcription initiation factors. Initiation of promoter opening is largely or entirely sequence-independent (perhaps preferring only a particular purine or pyrimidine-purine pair), and might be executed by one of the pol III or TFIIIB subunits known to be located in the vicinity." Geiduschek EP & Kassavetis GA. (2001)

Initial phase of RNA chain elongation for RNAPIII Geiduschek EP & Kassavetis GA. (2001)

  • TFIIIB-pol III contact is not severed until several nucleotides have been incorporated
  • pol III abortively makes short txpts, mostly di- and tri-nucleotides and including hydrolytic cleavage products

Elongation Geiduschek EP & Kassavetis GA. (2001)

  • "RNA chain elongation (measured at each of 58 successive steps) proceeds unevenly, and particularly slowly where three UMP residues in succession are added."
  • "Pausing sets cycles of hydrolytic RNA chain retraction and resynthesis in motion."
  • "Pol III is termination-prone, consistent with its role in producing short transcripts. Short runs of U provoke termination" "Termination by scpol III is accompanied by (extensive) generation of short oligonucleotides by hydrolytic cleavage at the RNA 3' end," "One can, therefore, regard the C11 subunit as the intrinsic termination factor of pol III."

Recycling (by RNAP III)

  • reinitiation of the same RNAP III complex at the same promoter that it just transcribed
  • thought to be particularly relevant to Pol III since it only transcribes short txn units, which are at times highly expressed

potential references for definitions

  • Specificity of TBP controlled by "proteins termed "activating factors," "coactivators," and/or "initiators" (2-6)" Librizzi et al. (1998)
  • Featherstone M. Coactivators in transcription initiation: here are your orders. Curr Opin Genet Dev. 2002 Apr;12(2):149-55. PMID:11893487
  • Naar AM, Lemon BD, Tjian R. Transcriptional coactivator complexes. Annu Rev Biochem. 2001;70:475-501. PMID:11395415

RNA polymerase III

transcription initiation factors - (Geiduschek EP & Kassavetis GA. 2001):

  1. TFIIIA
    • 5 S rRNA gene-specific DNA-binding factor
    • forms the platform for binding TFIIIC to these genes
  2. TFIIIC
    • recruits TFIIIB to its DNA sites upstream of the transcriptional start
    • binds DNA directly at the internal boxB and boxA promoter sites of tRNA (and similar) genes
    • hTFIIIC seems to have additional functions not shared by yTFIIIC:
      • hTFIIIC2 contains 3 HATs
      • hTFIIIC1 appears to have a direct role in polymerase recruitment independently of TFIIIC2
      • hTFIIIC0 "facilitates termination of transcription and contributes a factor that binds in the vicinity of the terminator"
    • TFIIIC "plays an important role in reserving pol III-transcribed genes for transcription in the face of competition by chromatin"
  3. TFIIIB
    • recruits pol III to the promoter
    • DNA binding factor brought to DNA in two ways:
      • binds directly to strong TATAs via TBP
      • through interaction with DNA-anchored TFIIIC
    • bends DNA (sharply)
    • involved in promoter opening (from deletions in both Brf and B)
  4. factors for vertebrate U class snRNAs
    1. Distal Sequence Element binding factors
      • DSE is compound element
      • bound by Oct-1 and Staf
    2. Proximal Sequence Element binding factors
      • bound by SNAPc (snRNA Activator Protein complex)
      • SNAPc is also a factor for pol II
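
The recruitment order described in the list above (and in the Schramm & Hernandez 2002 quote on recruitment factors) can be written as a simple lookup and followed until the polymerase is reached. The following is a minimal sketch; the dictionary `RECRUITS` and the function `recruitment_chain` are hypothetical names used only for illustration, and each factor is reduced to a single recruitment target, which ignores details such as TFIIIB binding strong TATA elements directly.

```python
# Who recruits whom at pol III promoters, as summarized in the notes above.
# One target per factor is a simplification made for this sketch.
RECRUITS = {
    "TFIIIA": "TFIIIC",   # forms the platform for binding TFIIIC (5S rRNA genes)
    "TFIIIC": "TFIIIB",   # recruits TFIIIB upstream of the transcriptional start
    "SNAPc": "TFIIIB",    # recruitment factor per Schramm & Hernandez 2002
    "TFIIIB": "pol III",  # recruits pol III to the promoter
}


def recruitment_chain(factor):
    """Follow RECRUITS from `factor` until nothing further is recruited."""
    chain = [factor]
    while chain[-1] in RECRUITS:
        target = RECRUITS[chain[-1]]
        if target in chain:
            raise ValueError(f"cycle detected at {target!r}")
        chain.append(target)
    return chain


print(" -> ".join(recruitment_chain("TFIIIA")))
# prints "TFIIIA -> TFIIIC -> TFIIIB -> pol III"
```

Writing the chain out this way makes the distinction drawn earlier explicit: TFIIIA, TFIIIC, and SNAPc recruit another factor, while TFIIIB is the one that recruits the polymerase itself.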

negative regulators of RNAPIII:

  1. mammalian - three pleiotropic negative regulators: Dr1, p53 and RB (Geiduschek EP & Kassavetis GA. 2001)
    • Unphosphorylated RB suppresses pol III transcription by interactions with TFIIIB and TFIIIC2
    • Dr1, an abundant small nuclear phosphorylated protein, is a competing TBP ligand that sequesters TBP from its association with Brf (as well as TFIIB and TFIIA).
    • interaction of p53 with Brf and TBP represses pol III transcription in vitro
  2. S. cerevisiae - MAF1