Proposals to overhaul transcription in GO - 2010
Philosophy of Overhaul
Over the last few years, GO has changed how we talk about and create Function terms so that they now represent how something occurs, e.g. binding, catalytic activities, etc. We are now also avoiding Function terms that duplicate Process terms, that is, Function terms that do not describe how the gene product acts but only specify that it is involved in a process. However, although we have always said that Function and Process should represent non-overlapping aspects, we still have many older terms in Function that essentially duplicate a Process term. Compare, for example, the Function term "transcription regulator activity" with the Process term "regulation of transcription". Both terms mean essentially the same thing. In addition, the Function term "transcription regulator activity" does not group the terms below it on the basis of having similar functions, but rather on the basis of being involved in the same process. This blurred distinction between Function and Process generates confusion, both for annotators and for users. One researcher at the meeting told me that she only uses GO occasionally and can never remember whether the term she wants is in Function or Process.
One of the major goals of this overhaul is to make the distinction between the Function terms and the Process terms for transcription clear. We are proposing to eliminate some Function terms that are equivalent to Process terms and that cannot be converted into a description of the molecular activity, or activities, involved. In other cases, we are proposing changes to Function terms so that they actually describe molecular activities.
With respect to annotation, these changes will mean that when the experiments indicate that a gene product is involved in regulating transcription, but give no indication as to how it acts, it would be appropriate to annotate only with a Process term and not with a Function term. With the recently developed method of creating links between Function and Process terms, the old motivations for having terms like "transcription regulator activity" should be addressed anyway, since terms representing functions involved in regulation of transcription will have a relationship to that Process term.
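The annotation rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the link table `FUNCTION_TO_PROCESS` and the helper names `choose_annotations` and `functions_in_process` are hypothetical, and the single link shown is made up for the example rather than taken from the ontology.

```python
# Hypothetical Function -> Process links (illustrative only, not real GO data).
FUNCTION_TO_PROCESS = {
    "sequence specific DNA-binding transcription factor activity":
        "regulation of transcription",
}


def choose_annotations(process, function=None):
    """Return (aspect, term) pairs for one gene product.

    `function` stays None when the experiments show involvement in the
    process but give no indication of how the gene product acts.
    """
    annotations = [("Process", process)]
    if function is not None:
        annotations.append(("Function", function))
    return annotations


def functions_in_process(process):
    """Recover the old 'grouping by process' by following the links."""
    return sorted(f for f, p in FUNCTION_TO_PROCESS.items() if p == process)
```

Calling `choose_annotations("regulation of transcription")` with no function yields a Process-only annotation, while `functions_in_process` shows how a grouping term in Function becomes unnecessary once the links exist.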
"transcription factor activity"
The term "transcription factor activity" has parentage under "DNA binding" and was probably intended to represent the type of transcription factor that binds a specific DNA sequence in a limited set of promoters to activate transcription when the basal factors are not sufficient to drive transcription.
However, we also need to account for the fact that there are basal transcription factors that bind specific sequences, but that not all basal transcription factors bind DNA. This is why we also have terms like "RNA polymerase II transcription factor activity" that do not have parentage under "DNA binding" and that really mean anything involved in regulating RNAP II transcription, i.e. basically a process definition.
I would like to have Function terms something like the following, where the function is described in terms of the type of sequence being bound:
- sequence specific DNA-binding transcription factor activity
-- core promoter binding transcription factor activity
--- RNA pol I core promoter binding transcription factor activity
--- RNA pol II core promoter binding transcription factor activity
--- RNA pol III core promoter binding transcription factor activity
-- regulatory transcription factor site* binding transcription factor activity
--- promoter transcription factor site binding tfa
---- RNA pol I promoter transcription factor site binding tfa
---- RNA pol II promoter transcription factor site binding tfa
---- RNA pol III promoter transcription factor site binding tfa
--- enhancer transcription factor site binding tfa
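As a rough sketch, the proposed hierarchy can be written as a nested structure so that the parentage of any term can be checked mechanically. The names `TREE`, `path_to`, and `render` are hypothetical helpers, the footnote marker on "site" is omitted, and placing the three "RNA pol ... promoter transcription factor site" terms under the promoter-site term is an assumption about the intended layout.

```python
# Proposed hierarchy as (term, children) pairs. The nesting of the three
# "RNA pol ... promoter" terms under the promoter-site term is assumed.
TREE = ("sequence specific DNA-binding transcription factor activity", [
    ("core promoter binding transcription factor activity", [
        ("RNA pol I core promoter binding transcription factor activity", []),
        ("RNA pol II core promoter binding transcription factor activity", []),
        ("RNA pol III core promoter binding transcription factor activity", []),
    ]),
    ("regulatory transcription factor site binding transcription factor activity", [
        ("promoter transcription factor site binding tfa", [
            ("RNA pol I promoter transcription factor site binding tfa", []),
            ("RNA pol II promoter transcription factor site binding tfa", []),
            ("RNA pol III promoter transcription factor site binding tfa", []),
        ]),
        ("enhancer transcription factor site binding tfa", []),
    ]),
])


def path_to(term, node=TREE, trail=()):
    """Return the list of terms from the root down to `term`, or None."""
    name, children = node
    trail = trail + (name,)
    if name == term:
        return list(trail)
    for child in children:
        found = path_to(term, child, trail)
        if found is not None:
            return found
    return None


def render(node=TREE, depth=1):
    """Return the outline lines, one dash per level of depth."""
    name, children = node
    lines = ["-" * depth + " " + name]
    for child in children:
        lines.extend(render(child, depth + 1))
    return lines
```

`path_to` returns every ancestor of a term, which makes it easy to confirm that each proposed term sits under the sequence-specific DNA-binding root, and `print("\n".join(render()))` reproduces the dash-indented outline.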