Proposal for fate of "transcription" and corresponding regulation terms

From GO Wiki
Revision as of 18:13, 7 January 2011 by Kchris

There are a couple of problems with the term transcription (GO:0006350) and with its three associated regulation terms:

  • GO:0006350 transcription
  • GO:0045449 regulation of transcription
  • GO:0045941 positive regulation of transcription
  • GO:0016481 negative regulation of transcription

This page contains the 2 possible options for resolution followed by a detailed discussion of the issues and some supporting data.

Proposal

There are 2 possible ways to go with these terms:

  1. Merge the transcription terms into the 4 corresponding transcription, DNA-dependent terms (pairs shown below). Since we think that essentially all of the annotations should have been made to the more granular transcription, DNA-dependent terms, we feel this might be the conservative approach with respect to annotations.
  2. Obsolete the 4 transcription terms and suggest replacement terms. The vast majority of annotations are IEAs, which would be fixed by correcting the mappings rather than by manually evaluating each annotation. The group may therefore prefer this option over merging, which risks inappropriately moving a small number of annotations that actually belong under reverse transcription or transcription, RNA-dependent to the transcription, DNA-dependent term.

Regardless of which option is chosen, we are not planning to create a replacement grouping term to group transcription, DNA-dependent and transcription, RNA-dependent. We feel that the existing term RNA biosynthetic process is sufficient, and unlikely to produce confusion in the way that having a grouping term named transcription has.

Problems

  1. Incorrect definition
    Transcription (GO:0006350) is defined incorrectly. As written, the definition groups "normal" (DNA-dependent) transcription from DNA templates with reverse transcription, which is actually a type of DNA synthesis and should NOT be grouped with transcription at all. It also excludes production of RNA transcripts from RNA templates, which is a kind of transcription and occurs in some viruses.
  2. Confusing use in annotations
    • The term transcription (GO:0006350) is being used for annotations as if it were equivalent to transcription, DNA-dependent (GO:0006351), and similarly for the regulation of transcription terms compared with the regulation of transcription, DNA-dependent terms. This is occurring for thousands of annotations, slightly over 20% of the annotations made to transcription or its child terms (some numbers from AmiGO are below). David and I have scanned a small subset of these annotations and strongly suspect that almost all of these annotations should have been made to the equivalent transcription, DNA-dependent terms, rather than just to transcription.
    • In addition to thousands of annotations (numbers below) to these 4 top-level transcription terms, there are over 500 mappings (counts of dbxref mappings below). These mappings are probably the main source of annotations, as the vast majority of annotations to these high-level grouping terms are by IEA (annotation counts for "transcription" below). So we will want to consider the fate of the mappings in this decision as well.

Fate of direct child terms of transcription

  • already included in proposal
    - GO:0045449 regulation of transcription
    - GO:0045941 positive regulation of transcription
    - GO:0016481 negative regulation of transcription
    These terms should be handled in the same manner as transcription.
  • input requested
    - GO:0000988 protein binding transcription factor activity
    - GO:0001071 nucleic acid binding transcription factor activity
    These are both new terms that have been added recently as part of the transcription overhaul. I put these in thinking only of DNA-dependent transcription. However, at the moment, nothing about the names or definitions of these terms distinguishes between DNA-dependent or RNA-dependent processes. If anyone knowledgeable about the RNA-dependent process thinks these functions would be needed for the RNA-dependent process, we can make these general terms and create new terms that are subtypes. If no one speaks up though, I might be inclined to just make these terms specific to DNA-dependent transcription since I know almost nothing about RNA-dependent transcription.
  • cross product relationships to be changed
    - GO:0034401 regulation of transcription by chromatin organization
    This term has transcription as part of its cross product definition, but you have to have DNA to have chromatin, so this relationship seems like it would be more accurately made to transcription, DNA-dependent.
  • new parentage needed
    - GO:0019083 viral transcription
    As with transcription, DNA-dependent, we think that RNA biosynthetic process would be a sufficient is_a parent for this term. It is not currently a parent, so it would need to be added when transcription is removed.
  • no change needed
    - GO:0006351 transcription, DNA-dependent
    This already has an additional is_a parent of RNA biosynthetic process. We feel this is sufficient.
  • to be dealt with separately
    - GO:0006410 transcription, RNA-dependent
    - GO:0032199 transcription involved in RNA-mediated transposition


-Karen & David

Supporting Information

transcription & transcription, DNA-dependent term pairs

GO:0006350 transcription
 => GO:0006351 transcription, DNA-dependent

GO:0045449 regulation of transcription
 => GO:0006355 regulation of transcription, DNA-dependent

GO:0045941 positive regulation of transcription
 => GO:0045893 positive regulation of transcription, DNA-dependent

GO:0016481 negative regulation of transcription
 => GO:0045892 negative regulation of transcription, DNA-dependent
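
The pairs above can be read as a simple lookup table. The following is a minimal sketch of how option 1 (merge) might be applied to a list of annotations. The function name remap_term and the sample gene products are hypothetical and only for illustration; the GO ID pairs are the ones listed above.

```python
# Term pairs copied from the list above: general term -> DNA-dependent term.
MERGE_TARGETS = {
    "GO:0006350": "GO:0006351",  # transcription
    "GO:0045449": "GO:0006355",  # regulation of transcription
    "GO:0045941": "GO:0045893",  # positive regulation of transcription
    "GO:0016481": "GO:0045892",  # negative regulation of transcription
}


def remap_term(go_id):
    """Return the DNA-dependent counterpart of a general transcription term.

    IDs that are not one of the four general terms are returned unchanged.
    """
    return MERGE_TARGETS.get(go_id, go_id)


# Hypothetical (gene product, GO ID) annotations, for illustration only.
annotations = [
    ("geneA", "GO:0006350"),
    ("geneB", "GO:0045941"),
    ("geneC", "GO:0006410"),  # transcription, RNA-dependent: left untouched
]

remapped = [(gene, remap_term(go_id)) for gene, go_id in annotations]
for gene, go_id in remapped:
    print(gene, go_id)
```

Under option 2 (obsolete), the same table would instead serve as the list of suggested replacement terms, and each non-IEA annotation would be reviewed rather than moved automatically.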

Counts for dbxref mappings to these terms

# of
dbxrefs GO term name
*225    transcription
  43    transcription, DNA-dependent
*231    regulation of transcription
 708    regulation of transcription, DNA-dependent
* 71    positive regulation of transcription
   8    positive regulation of transcription, DNA-dependent
* 57    negative regulation of transcription
  19    negative regulation of transcription, DNA-dependent
1362    Grand Total

* mappings which need to be changed
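
As a quick check on the "over 500 mappings" figure, the starred rows can be totalled. This is only a sketch that re-enters the counts from the table above by hand; the variable names are made up for the example.

```python
# (count, term name, needs_change) rows copied from the table above.
# needs_change is True for the rows marked with "*".
DBXREF_COUNTS = [
    (225, "transcription", True),
    (43, "transcription, DNA-dependent", False),
    (231, "regulation of transcription", True),
    (708, "regulation of transcription, DNA-dependent", False),
    (71, "positive regulation of transcription", True),
    (8, "positive regulation of transcription, DNA-dependent", False),
    (57, "negative regulation of transcription", True),
    (19, "negative regulation of transcription, DNA-dependent", False),
]

total = sum(count for count, _, _ in DBXREF_COUNTS)
to_change = sum(count for count, _, starred in DBXREF_COUNTS if starred)

print(f"mappings to change: {to_change} of {total} ({to_change / total:.1%})")
```

With these numbers, 584 of the 1362 mappings (about 43%) point at one of the 4 general terms and would need to be changed.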

Gene product Annotation counts from AmiGO

GO:0006350 : transcription [28093 gene products]
- GO:0006351 : transcription, DNA-dependent [21637 gene products]
- GO:0006410 : transcription, RNA-dependent [79 gene products]
- GO:0019083 : viral transcription [265 gene products]

GO:0045449 : regulation of transcription [25298 gene products]
- GO:0006355 : regulation of transcription, DNA-dependent [19684 gene products]
- GO:0046782 : regulation of viral transcription [80 gene products]

GO:0045941 : positive regulation of transcription [3897 gene products]
- GO:0045893 : positive regulation of transcription, DNA-dependent [3099 gene products]
- GO:0050434 : positive regulation of viral transcription [58 gene products]

GO:0016481 : negative regulation of transcription [4008 gene products]
- GO:0075182 : negative regulation of symbiont transcription in response to host [1 gene product]
- GO:0045892 : negative regulation of transcription, DNA-dependent [2852 gene products]
- GO:0032897 : negative regulation of viral transcription [25 gene products]
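
The "slightly over 20%" figure quoted in the Problems section can be roughly reproduced from these counts. The sketch below subtracts the listed child counts from each parent count. This is only an estimate: it assumes the parent count includes the listed children, that no gene product is counted under more than one child, and that no other descendants contribute. None of this is guaranteed by the AmiGO numbers alone.

```python
# Gene product counts copied from the AmiGO listing above:
# parent term -> (parent count, [counts of the listed child terms]).
COUNTS = {
    "transcription": (28093, [21637, 79, 265]),
    "regulation of transcription": (25298, [19684, 80]),
    "positive regulation of transcription": (3897, [3099, 58]),
    "negative regulation of transcription": (4008, [1, 2852, 25]),
}


def direct_estimate(parent_count, child_counts):
    """Estimate gene products annotated to the parent but to no listed child."""
    return parent_count - sum(child_counts)


estimates = {
    term: direct_estimate(parent, children)
    for term, (parent, children) in COUNTS.items()
}

for term, direct in estimates.items():
    parent = COUNTS[term][0]
    print(f"{term}: ~{direct} of {parent} ({direct / parent:.1%})")
```

For transcription itself this gives roughly 6112 of 28093 gene products, or about 21.8%.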

Annotation counts for transcription by evidence code

GO:0006350      transcription   2       IC
GO:0006350      transcription   51      IDA
GO:0006350      transcription   469711  IEA
GO:0006350      transcription   20      IEP
GO:0006350      transcription   9       IGI
GO:0006350      transcription   32      IMP
GO:0006350      transcription   8       IPI
GO:0006350      transcription   6       ISA
GO:0006350      transcription   26      ISO
GO:0006350      transcription   273     ISS
GO:0006350      transcription   22      NAS
GO:0006350      transcription   12      RCA
GO:0006350      transcription   240     TAS
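
To put the evidence code breakdown in proportion, the share of IEA annotations can be computed directly from the table above. This is a small sketch using the counts as listed; the variable names are arbitrary.

```python
# Annotation counts for GO:0006350 (transcription) by evidence code,
# copied from the table above.
EVIDENCE_COUNTS = {
    "IC": 2,
    "IDA": 51,
    "IEA": 469711,
    "IEP": 20,
    "IGI": 9,
    "IMP": 32,
    "IPI": 8,
    "ISA": 6,
    "ISO": 26,
    "ISS": 273,
    "NAS": 22,
    "RCA": 12,
    "TAS": 240,
}

total = sum(EVIDENCE_COUNTS.values())
iea = EVIDENCE_COUNTS["IEA"]
non_iea = total - iea

print(f"IEA: {iea} of {total} ({iea / total:.2%})")
print(f"non-IEA annotations that could need manual review: {non_iea}")
```

On these numbers, 469711 of 470412 annotations are IEA, leaving 701 annotations with other evidence codes.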