Noctua

From GO Wiki
Revision as of 20:07, 17 January 2018 by Paul Thomas (talk | contribs)

Summary

Noctua is an online system for making extensible GO annotations, which we call "GO-CAM models". Anything from a simple annotation to a complicated pathway is supported. However, the overall goal should be for a model to represent a unit that roughly corresponds to a biological pathway. This document describes how to make GO-CAM models using Noctua.

What is a Noctua model?

A Noctua model is a generalization of a GO annotation that specifies how different "traditional" GO annotations can be combined into a more complex annotation ("model"). It must contain at least one GENE PRODUCT and at least one ACTIVITY (a molecular-level process carried out by a population of gene products of a given type). An activity can have a CAUSAL EFFECT on another activity (this corresponds to traditional GO "regulation of molecular function" annotations). In general, a Noctua model will contain multiple activities with causal relations linking them together into a pathway.

Molecular activity

An ACTIVITY is represented by a GO Molecular Function term, and additional "aspects" that describe when and where the ACTIVITY occurs. Ideally, all of the following “aspects” of the gene product’s function will be specified in the model. However, in cases where some or most of these aspects are unknown, a model may still be constructed with details added as more information becomes available. Users should attempt to specify functions as fully as possible, but partial models are expected and still contribute to the GO knowledgebase. The following aspects are represented in a model:

Molecular function (MF)

The type of molecular activity carried out by a gene product as part of a larger biological process; this is specified by a term from the GO molecular function ontology. MF may be qualified, using defined relations, as follows:

  • If the function acts upon another “target” molecule, this can be specified using a gene product identifier (for a protein or a gene) or term from the ChEBI ontology (for a small molecule)
  • If the function acts during a particular “biological phase” (e.g. a particular stage in organism development), this can be specified using a term from an appropriate ontology

Cellular component (CC)

The location of the gene product when it is carrying out its activity; this is specified by a term from the GO cellular component ontology. CC may be qualified, using defined relations, as follows:

  • If the activity occurs in a specific cell type, this can be specified using a term from a Cell Type or Anatomy Ontology.
  • If the activity occurs in a specific anatomical structure, this can be specified using a term from the Uberon, or other organismal Anatomy, ontology.

Biological process (BP)

The larger “biological program” to which the activity contributes; this is specified by a term from the GO biological process ontology. BP may be qualified, using defined relations, as follows:

  • If the process is a part of a larger biological program, it can be linked to the larger biological program with another GO biological process term.
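
The aspects listed above can be pictured as a simple record. The following is a minimal Python sketch for illustration only: the class and field names are hypothetical and are not Noctua's internal data model. It shows that a partial activity (only gene product and molecular function) is still a valid starting point, with the remaining aspects left unspecified.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class Activity:
    """One molecular activity and its aspects (all names are illustrative)."""
    gene_product: str                          # gene product that enables the activity
    molecular_function: str                    # GO molecular function term
    target: Optional[str] = None               # gene product ID or ChEBI term acted upon
    biological_phase: Optional[str] = None     # phase during which the function acts
    cellular_component: Optional[str] = None   # GO cellular component term
    cell_type: Optional[str] = None            # cell type ontology term
    anatomy: Optional[str] = None              # Uberon or other anatomy ontology term
    biological_process: Optional[str] = None   # GO biological process term
    larger_process: Optional[str] = None       # larger program the process is part of


def unspecified_aspects(activity: Activity) -> List[str]:
    """Return the names of the aspects that have not been filled in yet."""
    return [f.name for f in fields(activity) if getattr(activity, f.name) is None]


# A partial activity: only the gene product and molecular function are known.
partial = Activity(gene_product="MAP3K1", molecular_function="protein kinase activity")
print(unspecified_aspects(partial))
```

Listing the unspecified aspects is one way to keep track of what could be added later as more information becomes available.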

Molecular activities can be linked by causal relations

Activities can be linked together by relations that describe their causal dependence. The most common relations are “regulates” and “provides input for”, but there are other relations of greater and lesser specificity, depending on what is known. “Regulates” should be used to denote biological control of a downstream activity. “Provides input for” should be used when there is no control, but an upstream function creates a molecular entity that is the target of the downstream function, such as in a metabolic pathway.

Requirements

A web browser. Chrome is recommended.

Launching Noctua

Setup

  • Before using Noctua to edit or create models, please follow this procedure to request edit access. You will need an ORCID (https://orcid.org), so that you can be uniquely identified. Each part of a Noctua model is individually attributed to an editor, as well as to the project that provided their funding (if applicable).

Using Noctua

Login

You can view models without logging in, but you must log in before creating new annotations (by editing an existing model or creating a new model). Click on the Login button in the upper right corner of the page. There are several options for logging in. We recommend using GitHub (if you don't already have an account, just go to http://github.com).

Editing an existing model

Just click on the "Edit" button in the rightmost column of the model list. The model list can be filtered using the search box just above the list of available models.

Starting a new model

Just click on the blue "Create Noctua Model" button.

Creating a new activity and its properties

After either selecting an existing model or starting a new one, you will see the graph view by default. To create new activities, you should use the “Simple annoton editor” tool, available in the Workbench menu: Workbench -> Simple annoton editor

Fig. 1 Launching the simple annoton editor

This will launch the editor in a new browser tab.

Step 1. Fill in the form

Fill in as many fields as possible in the form, by typing in the field, and then selecting from the autocomplete suggestions by moving the mouse over your selection and clicking.

Tips:
In the autocomplete, enter a space after a complete word, to narrow down the choices.

1.a. Enter gene product or macromolecular complex to be annotated

By default, the form allows you to enter a single gene product. Start typing; choices will appear, and you can then select the gene product.

Tips:
For gene products, you can type in the gene symbol, e.g. Wnt3a. If you need to narrow down the choices, type a space after the symbol and enter the three-letter code for the species (the first letter of the genus and two letters from the species name, e.g. mmu for Mus musculus). Each entry in the autocomplete also shows the associated unique database identifier or accession, so curators can confirm that they are selecting the appropriate entity for annotation.
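
As an illustration of the species code described in this tip, here is a small Python sketch. It assumes the "two letters from the species name" are the first two, which matches the mmu example; the function names are hypothetical and not part of Noctua.

```python
def species_code(binomial: str) -> str:
    """Build the three-letter code: first letter of genus + first two of species."""
    genus, species = binomial.split()[:2]  # raises ValueError if only one word is given
    return (genus[0] + species[:2]).lower()


def autocomplete_query(symbol: str, binomial: str) -> str:
    """Gene symbol, a space, then the species code, as typed into the field."""
    return f"{symbol} {species_code(binomial)}"


print(autocomplete_query("Wnt3a", "Mus musculus"))  # Wnt3a mmu
```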

You can annotate a complex instead of a single gene product, by choosing "macromolecular complex" from the drop-down menu on the right of the section header instead of "single gene product". When annotating a complex, you will need to specify the GO macromolecular complex term (use "macromolecular complex" if you don't have a more specific term), and ideally also specify the gene products in the complex.

Tip:
Whenever possible, an annoton should have a single gene product.  A complex should be used only when the function has been characterized for the entire complex, and it cannot be adequately described in terms of the functions of its component parts (or these functions are unknown).  For instance, the ribosome would be represented as a complex, but a calcium-dependent protein kinase would be represented as two different activities (annotons), a calcium sensor activity enabled by the regulatory subunit, and a protein kinase activity enabled by the catalytic subunit.

1.b. Enter the molecular function and evidence

These fields are required. If the molecular function is not specified, enter "molecular_function".

1.c. Enter other fields (optional)

If you fill in the first field of a line, you will need to add evidence. Otherwise any field can be left blank.

We recommend that you fill in as many fields as possible before creating the activity. Once it is created, it can only be edited from the graph canvas, which requires more steps.
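
The rules in Step 1 (gene product, molecular function and its evidence are required; any other line whose first field is filled in also needs evidence) can be sketched as a small check. This is a hypothetical Python illustration of those rules, not the form's actual validation code; the row names and the dictionary layout are assumptions.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[str]]  # assumed layout: {"term": ..., "evidence": ...}
REQUIRED_ROW = "molecular function"


def missing_fields(gene_product: Optional[str], rows: Dict[str, Row]) -> List[str]:
    """List what is still missing before the activity can be created."""
    missing = []
    if not gene_product:
        missing.append("gene product")
    # Make sure the required row is always checked, even if it was omitted.
    rows = {REQUIRED_ROW: {}, **rows}
    for name, row in rows.items():
        required = name == REQUIRED_ROW
        if required and not row.get("term"):
            missing.append(name)
        # Evidence is needed for the required row and for any filled-in row.
        if (required or row.get("term")) and not row.get("evidence"):
            missing.append(f"{name} evidence")
    return missing


form = {
    "molecular function": {"term": "molecular_function", "evidence": None},
    "cellular component": {"term": "nucleus", "evidence": None},
    "biological process": {"term": None, "evidence": None},
}
print(missing_fields("Wnt3a", form))
```

An empty result corresponds to a form that is ready to be submitted; a blank optional line, such as the biological process line here, is not reported.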

Step 2. Add the new activity to a model

Press the CREATE button. A new activity will appear on the graph canvas (the main window).

Tips:
1. Each new activity will appear on the same part of the canvas, so if you add more than one activity you will need to move them around on the canvas (by clicking and dragging) to see the ones underneath.
2. If the CREATE button is grayed out, some required information is still missing from the form. You can press "why is the save button disabled?" for a list of missing fields.

Specifying the causal ordering of the activities

Once you have created at least two activities, you can specify the causal relations between them. This is done on the graph canvas, by dragging from the blue circle of the upstream activity box, onto the downstream activity box. You can then select the relation. Relations that are “direct” mean that there is a physical interaction mediating the effect on the downstream activity.

Choosing the right causal relation between activities/annotons

  • If the upstream activity regulates the downstream activity through direct binding or by covalent modification, or by creating a small molecule that directly binds to the downstream gene product, use the “directly (positively/negatively) regulates” relation. Examples:
    • Receptor ligand activity enabled by Wnt1 directly positively regulates receptor activity enabled by Fzd1 (Wnt1 binds to the Fzd1 receptor and activates it).
    • Protein kinase activity enabled by MAP3K1 directly positively regulates protein kinase activity enabled by MAP2K1 (MAP3K1 phosphorylates MAP2K1 and activates it)
    • Adenylate cyclase activity of ADCYA1 directly positively regulates protein kinase activity enabled by PKC1 (ADCYA1 creates cAMP, which directly binds to PKC1 and activates it; this case is slightly different from the ones above, but for simplicity we use the same relation).
    • Acetylcholinesterase activity of ACHE1 directly negatively regulates acetylcholine receptor activity of ACHR1 (ACHE1 breaks down acetylcholine, which directly binds to ACHR1 and activates it).
  • If an upstream activity creates a molecule that is acted upon by the downstream activity, use the “provides direct input for” relation. Examples:
    • Glucose-6-phosphate dehydrogenase activity of GAPDH provides direct input for phosphofructokinase activity of PFK2 (the small molecule output from the GAPDH activity is acted upon by PFK2 as the next step in the metabolism of glucose).
    • (X phosphorylates Y, creating a molecule that is then acted upon by Z)
  • If the mechanism of the causal relation is not known, use the more general “causally upstream of” relations (these can include a positive/negative effect, if known).
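
The decision rules above can be summarized as a small lookup. The Python sketch below is only an illustration: the mechanism keywords are invented for this example, and the returned strings paraphrase the relation names used on this page, so the exact wording in the Noctua relation menu may differ.

```python
from typing import Optional

# Hypothetical keywords for mechanisms that count as "direct" regulation.
DIRECT_MECHANISMS = {"direct binding", "covalent modification", "small molecule binding"}


def choose_relation(mechanism: Optional[str], effect: Optional[str] = None) -> str:
    """Pick a relation label from the mechanism and the effect direction.

    mechanism: a member of DIRECT_MECHANISMS, "creates substrate", or None if unknown.
    effect: "positive", "negative", or None if unknown.
    """
    if effect not in (None, "positive", "negative"):
        raise ValueError(f"unknown effect: {effect!r}")
    if mechanism in DIRECT_MECHANISMS:
        return "directly regulates" if effect is None else f"directly {effect}ly regulates"
    if mechanism == "creates substrate":
        # No control: the upstream output is the target of the downstream activity.
        return "provides direct input for"
    if mechanism is None:
        base = "causally upstream of"
        return base if effect is None else f"{base}, {effect} effect"
    raise ValueError(f"unknown mechanism: {mechanism!r}")


# MAP3K1 phosphorylates MAP2K1 and activates it.
print(choose_relation("covalent modification", "positive"))  # directly positively regulates
```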

Subfunctions: specifying more detail about molecular activities

Sometimes, molecular activities are composed of distinct subfunctions, and those subfunctions may even be carried out in distinct locations, or by distinct subunits of a complex. For example, you may want to specify “hormone binding” in the “cytosol” as a subfunction of a nuclear receptor, which then activates (directly positively regulates) “transcription factor activity” in the “nucleus”. To specify subfunctions, create new activities and link them to a previously created activity that describes the overall function of the gene product (e.g. “nuclear receptor activity”). Subfunctions (e.g. “hormone binding”) can be created using the simple annoton editor, but do not fill in the biological process field, as it is the same as for the overall function. Once the new annoton is created, link it to the overall molecular function by dragging (on the graph canvas) from the blue circle of the subfunction annoton to the overall activity, and selecting the “part of” relation. You will then need to add evidence: click on the "part of" edge, fill in the evidence fields in the box that pops up, and press the "Add" button.

Editing the model

Editing can currently be performed only on the graph canvas (the simple annoton editor form does not pick up any operations you have performed on the graph canvas).

Note that only one edit operation can be done at a time.  To change something on the canvas, you will need to first ADD the correct part, and then DELETE the incorrect part, as separate operations.  We recommend that you add first, so that you can transfer evidence from the incorrect part if necessary, by using the “clone other” operation.
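
The add-first, delete-second pattern can be written out as a two-step plan. This is a hypothetical Python sketch of the workflow only; it does not call Noctua, and the operation names are illustrative.

```python
from typing import List, Tuple

Operation = Tuple[str, str]  # (action, term)


def replacement_plan(old_term: str, new_term: str) -> List[Operation]:
    """Express 'replace old with new' as two separate single-edit operations."""
    if old_term == new_term:
        return []
    # Add first, so evidence can still be copied from the old part before deleting it.
    return [("add", new_term), ("delete", old_term)]


def apply_plan(types: List[str], plan: List[Operation]) -> List[str]:
    """Apply the operations one at a time to a copy of the node's terms."""
    result = list(types)
    for action, term in plan:
        if action == "add":
            result.append(term)
        elif action == "delete":
            result.remove(term)
    return result


plan = replacement_plan("binding", "hormone binding")
print(apply_plan(["binding"], plan))  # ['hormone binding']
```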

Editing relations

Relations can be removed by dragging the end of the relation arrow away from the box it connects to, into an empty part of the canvas. Relations can be added by clicking on the blue circle inside the upstream box, and dragging to the downstream box. Evidence for a relation can be edited by clicking on the relation arrow.

Editing the type/label on a graph node

To edit a simple box on the graph (one with no colored bars indicating that it has multiple parts folded together for easy viewing), just click on the green square. To change the term, first add the new term by filling in the field under “add type” and clicking "add". Then reopen the box and delete the old term by clicking on the red “x” next to it.

Editing types/labels that are inside a graph node

  • To edit properties of an activity that are “folded” into the molecular activity box on the canvas, click on the green box in the corner of a box. Note that only one edit operation can be done at a time, so do not make more than one edit before pressing a button to save the edit. To change part of the annoton, you will need to first ADD the corrected part, and then DELETE the incorrect part, as separate operations.
  • To remove a property of the annoton, click the “x” next to it.
  • To edit the evidence, click on the “E” next to the part for which you want to edit evidence (e.g., the “E” next to enabled by is the evidence that the molecular function is enabled by the gene product).

Making "traditional" GO annotations using Noctua

Molecular function annotation

  • Fill in the gene product field
  • Fill in the molecular function field, including evidence
  • Optionally, the following "extensions" can be added:
    • has_input(molecule): fill in the "has input" field and evidence
    • happens_during(biological phase): fill in the "happens during" field and evidence
    • occurs_in(cellular component): fill in the "cellular component" field and evidence
    • part_of(biological_process): fill in the "biological process" field and evidence
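
The correspondence between form fields and annotation extensions listed above can be sketched as a lookup table. The Python below is illustrative only: the dictionary layout is an assumption, and the example values are plain labels rather than real identifiers.

```python
from typing import Dict, List

# Form field label -> annotation extension relation, as listed above.
FIELD_TO_EXTENSION = {
    "has input": "has_input",
    "happens during": "happens_during",
    "cellular component": "occurs_in",
    "biological process": "part_of",
}


def extensions_from_form(form: Dict[str, str]) -> List[str]:
    """Render each filled-in optional field as relation(value)."""
    return [
        f"{relation}({form[field]})"
        for field, relation in FIELD_TO_EXTENSION.items()
        if form.get(field)
    ]


form = {"has input": "MAP2K1", "cellular component": "cytosol"}
print(extensions_from_form(form))  # ['has_input(MAP2K1)', 'occurs_in(cytosol)']
```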

Cellular component annotation

  • Use the "Component only" version of the form. This is still under development.

Biological process annotation

  • Fill in the gene product field.
  • Fill in the molecular function field with "molecular_function" if unknown. Fill in the evidence with the same evidence code and PMID that you will use for biological process.
  • Fill in the biological process field with the desired GO term, and evidence.
  • Optionally, the following "extensions" can be added:
    • part_of(larger biological process): fill in the "part of biological process" field. Other extensions should be added to the molecular function.