Noctua: Difference between revisions

From GO Wiki
 
(455 intermediate revisions by 8 users not shown)
[[Category: Annotation]]
= GO-CAMs and Noctua =
=Summary=
*This documentation is presented in two parts:
Noctua is an online system for making extensible GO annotations, which we call "GO-CAM models". Anything from simple annotations to complicated pathways is supported. However, the overall goal should be for a model to represent a unit that roughly corresponds to a biological pathway. This document describes how to make GO-CAM models using Noctua.
** GO-CAM Principles
==What is a Noctua model?==
** Noctua Curation Tool
A Noctua model is a generalization of a GO annotation that specifies how different "traditional" GO annotations can be combined into a more complex annotation ("model"). It must contain at least one GENE PRODUCT and at least one ACTIVITY (a molecular-level process carried out by a population of gene products of a given type). An activity can have a CAUSAL EFFECT on another activity (traditional GO "regulation of molecular function" annotations). In general, a Noctua model will contain multiple activities with causal relations linking them together into a pathway.
 
===Molecular activity===
== GO-CAM Principles ==
An ACTIVITY is represented by a GO Molecular Function term, and additional "aspects" that describe when and where the ACTIVITY occurs. Ideally, all of the following “aspects” of the gene product’s function will be specified in the model. However, in cases where some or most of these aspects are unknown, a model may still be constructed with details added as more information becomes available. Users should attempt to specify functions as fully as possible, but partial models are expected and still contribute to the GO knowledgebase. The following aspects are represented in a model:
=== Standard GO Annotations and GO-CAM Models ===
====Molecular function (MF)====
==== Standard GO Annotations ====
The type of molecular activity carried out by a gene product as part of a larger biological process; this is specified by a term from the GO molecular function ontology. MF may be qualified, using defined relations, as follows:
 
*If the function acts upon another “target” molecule, this can be specified using a gene product identifier (for a protein or a gene) or term from the ChEBI ontology (for a small molecule)
[http://geneontology.org/docs/go-annotations/ GO annotations] have been a key component of GO since its inception. Standard annotations are defined as an association between a gene and a biological concept from one of the [http://geneontology.org/docs/ontology-documentation/ three GO aspects]: Molecular Function (MF), Biological Process (BP), and Cellular Component (CC). Standard annotations always contain a reference (either a published, peer-reviewed paper or internal GO reference) and an evidence code that indicates the type of experiment or method used to make the assertion. Standard GO annotations may further be qualified using annotation extensions that provide additional biological context to a GO term using a relation from the [http://www.obofoundry.org/ontology/ro.html Relations Ontology (RO)] and a term from GO or an external ontology, e.g. [http://uberon.github.io/ UBERON].
*If the function acts during a particular “biological phase” (e.g. a particular stage in organism development), this can be specified using a term from an appropriate ontology
 
====Cellular component (CC)====
==== GO-CAM Models ====
The location of the gene product when it is carrying out its activity; this is specified by a term from the GO cellular component ontology. CC may be qualified, using defined relations, as follows:
 
*If the activity occurs in a specific cell type, this can be specified using a term from a Cell Type or Anatomy Ontology.
While standard GO annotations are very useful for discerning basic information about genes, they provide only a partial view of each gene's role in a larger biological context.  To provide more comprehensive annotation of genes and link their activities in a causal framework, the GO developed [http://geneontology.org/docs/gocam-overview/ GO-CAMs]. Using RO relations, GO-CAMs link GO annotations together with biological entities and external ontology terms to model how a gene functions in the broader context of a biological process or pathway. GO-CAMs thus provide structured descriptions of biological systems and allow for interrogation of causal events in biology through use of clearly defined, and consistently applied, semantics.
*If the activity occurs in a specific anatomical structure, this can be specified using a term from the Uberon, or other organismal Anatomy, ontology.
 
====Biological process (BP)====
A summary of the GO-CAM model specifications is presented in Figure 1.  [[File:GO CAM figure from Google drive.png|thumb|center|Figure 1. GO-CAM Model Specifications|800px]]
The larger “biological program” to which the activity contributes; this is specified by a term from the GO biological process ontology. BP may be qualified, using defined relations, as follows:
 
*If the process is a part of a larger biological program, it can be linked to the larger biological program with another GO biological process term.
The basic unit of a GO-CAM model is the Activity Unit, outlined on the left, which represents a set of standard GO annotations with select annotation extensions, e.g. the inputs and outputs of a molecular function.  GO-CAM models are constructed by filling in as many pieces of relevant information in an Activity Unit as possible and then linking different Activity Units in a causal chain to model a biological process.  Thus, GO-CAM models use standard GO annotations as the foundation on which to build more comprehensive representations of biology.
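The description of an Activity Unit above can be read as a small data structure. The following Python sketch is illustrative only: the class names, field names, and the <code>EXAMPLE:</code> identifiers are assumptions made for this example and are not Noctua's internal representation. It shows one way to hold a molecular function with its optional context, and to chain units together with labelled causal edges.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ActivityUnit:
    """One activity unit: a molecular function plus whatever context is known."""
    molecular_function: str             # GO MF term label or ID
    enabled_by: str                     # gene product or complex identifier
    part_of: Optional[str] = None       # GO BP term, if known
    occurs_in: Optional[str] = None     # GO CC term, if known
    has_input: List[str] = field(default_factory=list)
    has_output: List[str] = field(default_factory=list)


@dataclass
class CausalModel:
    """Activity units plus the causal edges that chain them together."""
    title: str
    activities: List[ActivityUnit] = field(default_factory=list)
    # Each edge is (upstream index, relation label, downstream index).
    edges: List[Tuple[int, str, int]] = field(default_factory=list)

    def add(self, unit: ActivityUnit) -> int:
        """Append a unit and return its index for later linking."""
        self.activities.append(unit)
        return len(self.activities) - 1

    def link(self, upstream: int, relation: str, downstream: int) -> None:
        """Record a causal edge between two units already in the model."""
        for index in (upstream, downstream):
            if not 0 <= index < len(self.activities):
                raise IndexError(f"no activity unit at index {index}")
        self.edges.append((upstream, relation, downstream))


# Hypothetical two-step example; identifiers are placeholders, not real accessions.
model = CausalModel(title="Hypothetical kinase cascade")
a = model.add(ActivityUnit("protein kinase activity", enabled_by="EXAMPLE:MAP3K1"))
b = model.add(ActivityUnit("protein kinase activity", enabled_by="EXAMPLE:MAP2K1"))
model.link(a, "directly positively regulates", b)
```

Fields left as <code>None</code> or empty reflect the point made above: a unit can be created with partial information and filled in as more becomes known.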
===Molecular activities can be linked by causal relations===
 
Activities can be linked together by relations that describe their causal dependence. The most common relations are “directly (positively/negatively) regulates” and “provides input for”, but there are other relations of greater and lesser specificity, depending on what is known.  “Regulates” should be used to denote biological control of a downstream activity.  “Provides input for” should be used when there is no control, but an upstream function creates a molecular entity that is the target of the downstream function, such as in a metabolic pathway.
=== Use of GO in GO-CAMs ===
 
==== Molecular Activities in GO-CAMs ====
 
* Wherever possible, curators should strive to select the single most granular GO Molecular Function (MF) term that best describes the overall activity of the gene, gene product, or protein-containing complex being annotated.
 
* If desired, individual "sub-functions" may be captured by using the 'part of' relation between the main MF and its sub-functions.
 
==== Biological Processes in GO-CAMs ====
 
* The ultimate aim of GO-CAMs is to create a suite of Biological Process (BP)-centric models that can be used to interrogate causal effects of molecular activities on one another as part of the execution of a larger BP.
 
* When annotating, curators should always think about the BP they are modeling and what MFs are 'part of' that BP.
 
* Additional relations between MFs and BPs, e.g. 'causally upstream of or within', can be used to capture experimental information that, in the future, will be incorporated into a more complete model of that process.
 
==== Cellular Components in GO-CAMs ====
 
* Cellular component annotations are intended to capture where the MF enabled by a gene, gene product, or protein-containing complex occurs.
 
* Cellular component annotations may be further qualified with cell, tissue, and organismal context if that information is germane to the process being modeled.
 
== Noctua: the Gene Ontology's GO-CAM Annotation Tool ==
[http://noctua.geneontology.org/ Noctua] is a web-based, collaborative [http://geneontology.org/ Gene Ontology (GO)] annotation tool developed by the GO Consortium. Noctua can be used to create standard GO annotations as well as more expressive models of biological processes, known as [http://geneontology.org/docs/gocam-overview/ GO-CAMs (Gene Ontology Causal Activity Models)]. There are two types of user interface available in Noctua: 1) a form interface and 2) a graph interface.
 
=== System Requirements ===
Noctua is a web-based annotation tool and thus requires only a web browser to access and use. 
 
We have tested Noctua primarily with Chrome on a Mac operating system. 
 
Issues that arise using other browsers and operating systems may be reported on [https://github.com/geneontology/helpdesk/issues go-helpdesk]
 
=== User Account Setup ===
GO-CAMs can be browsed using Noctua, but no annotations can be created or edited unless a user has a registered account.
To create a new account, please email help@geneontology.org or enter a ticket on the [https://github.com/geneontology/helpdesk helpdesk repo in github].   
 
Note that to create a Noctua account, you will need an [https://orcid.org/ ORCID] and a [https://github.com/ github account].
 
Please allow 24 hours for your account to be created.
 
If you have any questions about user accounts, please contact help@geneontology.org
 
==Entities and Ontologies for Annotation==
=== Entities for Annotation ===
 
==== Genes and Gene Products ====
 
* The primary nodes in a GO-CAM model are the ACTIVITIES (Molecular Functions (MFs)) of genes, gene products, or protein-containing complexes.
 
* Every gene, gene product, and protein-containing complex annotated in GO-CAMs must be associated with a stable database identifier and represented either in a GPI (Gene Product Information) file (preferred), in an existing annotation file, e.g. a GAF (Gene Association File), or in the GO Cellular Component aspect.
 
* Curators should strive to annotate activities (MFs) to individual genes or gene products wherever possible.
 
==== Protein-Containing Complexes ====
 
* In GO-CAMs, curators should always try to assign each member of a protein-containing complex its specific activity (e.g. regulatory activity vs catalytic activity).
 
* However, if the main activity of the protein-containing complex cannot be ascribed to a single subunit of the complex (e.g. RNA polymerase II activity), then the activity should be enabled by an appropriate protein-containing complex (e.g. a GO protein-containing complex), with each gene or gene product associated with that protein-containing complex with a 'part of' relation.
 
* Requests to add new entity identifiers to Noctua should be directed to help@geneontology.org
 
=== Ontologies for Annotation ===
 
==== Gene Ontology ====
* The GO is used to annotate:
** Molecular Activities (MF)
** Biological Processes (BP)
** Cellular Component (CC)
 
* To provide appropriate biological context to an annotation or an activity unit, additional ontologies may be used either in GO-CAM or in annotation extensions [link].
 
====Relations Ontology====
Relations used in GO-CAM are a subset of the Relations Ontology (RO). These relations and their usage are described on the [[Annotation Relations]] page.
 
==Cell and Anatomy Ontologies==
* Describes the location where processes and functions occur.
* Describes the location of a GO cellular component.
* Add list
==Biological Phase and Life Stage Ontologies==
* Describes the temporal period during which processes and functions occur.
* Describes the temporal period during which a cellular component or anatomical entity exists.
* Add list
==Chemical Ontology (ChEBI) [link]==
* Can be used to capture inputs and outputs of processes and functions.
* GO-CAM uses the Chemical Entities of Biological Interest (ChEBI)
* Sequence Ontology [link]
 
Requests to add ontologies to Noctua should be sent to help@geneontology.org
 
== GO-CAM Workflow ==
* The ultimate goal for GO-CAMs is to create a knowledge graph whereby users can use the GO to traverse a causal representation of a biological system.
* To that end, curators should try, as much as possible, to make individual annotations in the context of the overall process being modeled.
* It can be very helpful to refer to a summary figure from a recent research article or review to help visualize a potential GO-CAM.
* When making a GO-CAM model, we suggest working through these questions:
** What are the main activities (MFs) of each of the gene products in a model?
** How do those activities relate, in a causal chain, to each other?
** What processes are those activities involved in?
** Where do the activities occur?
* Even when annotating a single paper, try to incorporate as much of this workflow as possible.  This will make it easier, in the future, to build on existing models with new curation.
 
== Noctua Users Manual ==
* Noctua is accessed via: http://noctua.geneontology.org/
 
=== Noctua Landing Page ===
* The Noctua landing page is the portal by which curators can browse or search and filter models.   
* It is also the starting point for curation (when logged in) and where individual GPAD and OWL files for a model can be downloaded.
 
* By default, the Noctua landing page displays models by date, descending order, i.e. the most recently edited models are shown at the top of the list.
 
 
[[File:NLP_home.png|thumb|center|upright=3.00|Noctua Landing Page]]
 
 
==== Filtering Models ====
* There are two ways to filter GO-CAMs on the landing page:
#Click on the magnifying glass icon in the upper left
#Click on the metadata icons to the right of the model title in the table.
 
 
[[File:Filtering options.png|thumb|center|upright=3.00|Filtering Options]]
 
 
===== Filtering with the Magnifying Glass =====


=Requirements=
Clicking on the magnifying glass opens up the menu of available filter options:
A web browser.  Chrome is recommended.


=Launching Noctua=
#'''Ontology term''' (autocomplete)
* Go to http://geneontology.org/cam.
#'''Gene product''' (autocomplete)
#'''Reference'''
## If entered as free-text, must be the full reference id, e.g. PMID:31884020 or doi:10.1016/j.ydbio.2019.12.010
## Can also use the drop-down prefix menu (and PMID look-up feature) by clicking on the =+ icon.
## Must press return after entering search string.
#'''Organism''' (drop-down list)
#'''Contributor''' (autocomplete and drop-down list)
#'''Groups''' (autocomplete and drop-down list)
#'''Exact date''' (enter YYYY-MM-DD/return or calendar, select/return)
#'''Date range''' (enter YYYY-MM-DD/return or calendar, select/return)
#'''Title''' (enter/return)
#'''State''' (drop-down list)
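The free-text rules for the '''Reference''' and date filters can be checked before typing them into the search box. The Python sketch below is an illustration only: the regular expressions are assumptions inferred from the two example identifiers and the YYYY-MM-DD format given above, not Noctua's actual validation, and the function names are hypothetical.

```python
import re
from datetime import date

# Assumed shapes, inferred from the examples "PMID:31884020" and
# "doi:10.1016/j.ydbio.2019.12.010"; Noctua may accept other prefixes.
PMID_RE = re.compile(r"^PMID:\d+$")
DOI_RE = re.compile(r"^doi:10\.\d{4,9}/\S+$", re.IGNORECASE)
DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def is_full_reference_id(text: str) -> bool:
    """Return True if text looks like a full, prefixed reference id."""
    text = text.strip()
    return bool(PMID_RE.match(text) or DOI_RE.match(text))


def parse_filter_date(text: str) -> date:
    """Parse a YYYY-MM-DD string; raise ValueError for anything else."""
    text = text.strip()
    if not DATE_RE.match(text):
        raise ValueError(f"expected YYYY-MM-DD, got {text!r}")
    return date.fromisoformat(text)
```

A bare number such as <code>31884020</code> fails the check, which mirrors the requirement above that free-text references include the full id with its prefix.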


=Setup=
[[File:Filter options menu.png|thumb|center|upright=3.00|Filtering Options]]
* Before using Noctua to edit or create models, please follow this procedure to request edit access.  You will need an ORCID (https://orcid.org), so you can be uniquely identified.  Each part of a Noctua model is individually attributed to an editor, as well as the project that provided their funding (if applicable).


=Using Noctua=
=== Noctua Form and Graph Editors ===
Noctua has three curation interfaces: 1) the [https://docs.google.com/document/d/1RNbr-2T6CSZfaYKDrom7259IK6ck82vZLThgymyNZ28 Noctua Visual Editor], 2) the Noctua Form and 3) the Noctua Graph Editor.


==Login==
* The [https://docs.google.com/document/d/1RNbr-2T6CSZfaYKDrom7259IK6ck82vZLThgymyNZ28 Noctua Visual Editor] (VPE) is designed for curating GO-CAM models, i.e. models that include at least two activities/MFs linked by causal relations.
You can view models without logging in, but you must log in before creating new annotations (by editing an existing model, or creating a new model).  Click on the Login button in the right upper corner of the page.  There are several options for logging in.  We recommend using Github (if you don't already have an account just go to http://github.com).
* The Noctua Form is a structured annotation form that is recommended for creating 'standard' GO annotations.
* The Noctua Graph Editor is best suited for linking standard annotations together to create a causal model (GO-CAM).


==Editing an existing model==
Just click on the "Edit" button in the rightmost column of the model list.  The model list can be filtered using the search box just above the list of available models.


==Starting a new model==
Just click on the blue "Create Noctua Model" button.


==Creating a new activity and its properties==
{| class="wikitable"
After either selecting an existing model or starting a new one, you will see the graph view by default. To create new activities, you should use the “Simple annoton editor” tool, available in the Workbench menu:
!colspan="2" | Noctua Users Manual
Workbench -> Simple annoton editor
|-
[[File:Launch_noctua_activity_form.png|thumb|Fig. 1 Launching the simple annoton editor|400px]]
!rowspan="2" | Getting started
This will launch a new browser tab
|[[Browsing and searching annotations and models]]
===Step 1. Fill in the form===
|-
Fill in as many fields as possible in the form, by typing in the field, and then selecting from the autocomplete suggestions by moving the mouse over your selection and clicking.
|[[Login]]
Tips:
|-
In the autocomplete, enter a space after a complete word, to narrow down the choices.
!colspan="2" |Creating standard GO annotations
====1.a. Enter gene product or macromolecular complex to be annotated====
|-
By default, the form allows you to enter a single gene product. Start typing, choices will appear, and then select the gene product.
!rowspan = "3" | Form Editor
Tips:
|[http://wiki.geneontology.org/index.php/Noctua_Form_Molecular_Function Molecular Function]
For gene products, you can type in the gene symbol, e.g. Wnt3a. If necessary to narrow down the choices, type a space after the symbol, and enter the three letter code for the species (first letter from genus and two from species name, e.g. mmu for Mus musculus).  Each entry in the autocomplete will also show the associated unique database identifier or accession, so curators can confirm that they are selecting the appropriate entity for annotation.
|-
You can annotate a complex instead of a single gene product, by choosing "macromolecular complex" from the drop-down menu on the right of the section header instead of "single gene product". When annotating a complex, you will need to specify the GO macromolecular complex term (use "macromolecular complex" if you don't have a more specific term), and ideally also specify the gene products in the complex.
|[http://wiki.geneontology.org/index.php/Noctua_Form_Biological_Process Biological Process]
Tip:
|-
Whenever possible, an annoton should have a single gene product. A complex should be used only when the function has been characterized for the entire complex, and it cannot be adequately described in terms of the functions of its component parts (or these functions are unknown). For instance, the ribosome would be represented as a complex, but a calcium-dependent protein kinase would be represented as two different activities (annotons), a calcium sensor activity enabled by the regulatory subunit, and a protein kinase activity enabled by the catalytic subunit.
|[http://wiki.geneontology.org/index.php/Noctua_Form_Cellular_Component Cellular Component]
|-
!rowspan = "3" | Graph Editor
|[http://wiki.geneontology.org/index.php/Noctua_Graph_Molecular_Function Molecular Function]
|-
|[http://wiki.geneontology.org/index.php/Noctua_Graph_Biological_Process Biological Process]
|-
|[http://wiki.geneontology.org/index.php/Noctua_Graph_Cellular_Component Cellular Component]
|-
!colspan="3" |Adding contextual information (annotation extensions)
|-
!rowspan = "3" | Form Editor
|[http://wiki.geneontology.org/index.php/Noctua_Form_MF_Extensions Molecular Function]
|-
|[http://wiki.geneontology.org/index.php/Noctua_Form_BP_Extensions Biological Process]
|-
|[http://wiki.geneontology.org/index.php/Noctua_Form_CC_Extensions Cellular Component]
|-
!rowspan = "3" | Graph Editor
|[http://wiki.geneontology.org/index.php/Noctua_Graph_MF_Extensions Molecular Function]
|-
|[http://wiki.geneontology.org/index.php/Noctua_Graph_BP_Extensions Biological Process]
|-
|[http://wiki.geneontology.org/index.php/Noctua_Graph_CC_Extensions Cellular Component]
|-
!colspan="2" | Editing annotations
|-
|[http://wiki.geneontology.org/index.php/Noctua_Manual:_Editing_in_form Form Editor]
|-
|[[Graph Editor]]
|-
!colspan="2" | Creating GO-CAMs
|-
!rowspan = "2" | Creating an activity unit
|[http://wiki.geneontology.org/index.php/Noctua_Manual:_Creating_an_Activity_Unit_in_Form Form Editor]
|-
|[[Graph Editor]]
|-
!rowspan = "2" | Linking Activities
|[http://wiki.geneontology.org/index.php/Noctua_Manual:_Linking_Activities_in_Form Form Editor]
|-
|[[Graph Editor]]
|-
!colspan = "2" | Model metadata
|-
!rowspan = "3" | Model titles
|[http://wiki.geneontology.org/index.php/Noctua_Manual:_General_Guidelines_for_Model_Titles General Guidelines]
|-
|[http://wiki.geneontology.org/index.php/Noctua_Manual:_Naming_Models_in_Form Form Editor]
|-
|[[Graph Editor]]
|-
!rowspan = "2" | Releasing models to production
|[http://wiki.geneontology.org/index.php/Noctua_Manual:_Making_Models_Production_in_Form Form Editor]
|-
|[[Graph Editor]]
|-
!rowspan = "7" | Other tips and tricks
|[http://wiki.geneontology.org/index.php/Noctua_Manual:_Adding_NOT_Qualifer_in_Form Adding a NOT qualifier to an annotation]
|-
|[http://wiki.geneontology.org/index.php/Noctua_Manual:_Importing_existing_annotations Importing existing annotations]
|-
|[http://wiki.geneontology.org/index.php/Noctua_Manual:_Changing_annotation_group Changing annotation group]
|-
|[http://wiki.geneontology.org/index.php/Noctua_Manual:_Model_validation Model validation]
|-
|[http://wiki.geneontology.org/index.php/Noctua_Manual:_Running_the_reasoner Running the reasoner]
|-
|[http://wiki.geneontology.org/index.php/Noctua_Manual:_Viewing_GPAD_export Viewing GPAD export]
|-
|[http://wiki.geneontology.org/index.php/Noctua_Manual:_Using_templates Using templates]
|-
|}


====1.b. Enter the molecular function and evidence====
=== Noctua Annotation Review ===
These fields are required.  If the molecular function is not specified, enter "molecular_function".
Documentation on how to use the Noctua Annotation Review workbench is [https://docs.google.com/document/d/1Yo2O7LWj1wdRGbbOlaxX_AFDl99T-P59JOf5YoebX0Q/edit#heading=h.jjddl63ap4eh here].
====1.c. Enter other fields (optional)====
If you fill in the first field of a line, you will need to add evidence.  Otherwise any field can be left blank.
We recommend that you fill in as many fields as possible before creating the activity, as after it is created, you will need to edit it from the graph canvas, which requires more steps to do.


===Step 2. Add the new activity to a model===
=== Model Copy ===
Press the CREATE button. A new activity will appear on the graph canvas (the main window).
* Noctua has a model copy functionality that allows users to copy an entire model, minus the evidence.
Tips:
* Model copy can help curators make efficient use of existing models to create new content.
1. Each new activity will appear on the same part of the canvas, so if you add more than one activity you will need to move them around on the canvas (by clicking and dragging) to see the ones underneath.
* More details on model copy and how to use it can be found [https://docs.google.com/document/d/17qlOwSOov-hTKt1wxcyhF1M0XzfQG-r2N6RmhaNj2-E here].
2. If the CREATE button is grayed out, some required information is still missing from the form.  You can click the "why is the save button disabled?" prompt for a list of missing fields.


==Specifying the causal ordering of the activities==
== Noctua Maintenance ==
Once you have created at least two activities, you can specify the causal relations between them.  This is done on the graph canvas, by dragging from the blue circle of the upstream activity box, onto the downstream activity box. You can then select the relation.  Relations that are “direct” mean that there is a physical interaction mediating the effect on the downstream activity.
* The Noctua curation tool undergoes routine maintenance on the second and fourth Thursdays of each month from ~4pm - 7pm PDT.
===Choosing the right causal relation between activities/annotons===
* During these maintenance outages the following tasks are typically performed:
====Direct relations====
** Ontology updates:
*If the upstream activity regulates the downstream activity through direct binding or by covalent modification, or by creating a small molecule that directly binds to the downstream gene product, use the “directly (positively/negatively) regulates” relation. Examples:
*** incorporate the latest versions of all ontologies used in Noctua, including NEO (Noctua Entity Ontology)
**Receptor ligand activity enabled by Wnt1 directly positively regulates receptor activity enabled by Fzd1 (Wnt1 binds to the Fzd1 receptor and activates it).
** Model updates:
**Protein kinase activity enabled by MAP3K1 directly positively regulates protein kinase activity enabled by MAP2K1 (MAP3K1 phosphorylates MAP2K1 and activates it)
*** replace obsolete ontology terms if there is a 'replaced by' value
*If an upstream activity creates a molecule that is acted upon by the downstream activity, use “provides direct input for” relation. Examples:
*** replace ontology terms if usage has changed
**Glucose-6-phosphate dehydrogenase activity of GAPDH provides direct input for phosphofructokinase activity of PFK2 (the small molecule output from the GAPDH activity is acted upon by PFK2 as the next step in the metabolism of glucose).
*** delete any models marked with state 'delete'
**(X phosphorylates Y, creating a molecule that is then acted upon by Z)
** Software updates:
====Indirect but proximal causal mechanisms====
*** Incorporate bug fixes
*Regulation via a small molecule intermediate. Create an instance of the small molecule (an instance of a CHEBI term), using the "Add Individual" item on the left of the graph canvas.  Connect the upstream activity to the small molecule with "directly increases/decreases concentration of", and the small molecule to the downstream activity with "directly positively/negatively regulates". Examples:
*** Add new features
**ADCYA1 creates cAMP, which directly binds to PKC1 and activates it
* Noctua maintenance outage reminders are sent out on the go-consortium mailing list ~24 hrs. prior to the outage.
**ADCHE1 breaks down acetylcholine, which directly binds to ACHR1 and activates it.
* For each outage, a list of specific tasks is enumerated in a ticket on the noctua github repo.
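For planning curation sessions around the maintenance schedule, the second and fourth Thursdays of a month can be computed with the Python standard library. This is a minimal sketch: the function name is hypothetical, and it covers only the dates, not the approximate 4pm - 7pm PDT window or any schedule changes announced on the mailing list.

```python
import calendar
from datetime import date
from typing import List


def maintenance_thursdays(year: int, month: int) -> List[date]:
    """Return the second and fourth Thursdays of the given month."""
    days_in_month = calendar.monthrange(year, month)[1]
    thursdays = [
        date(year, month, day)
        for day in range(1, days_in_month + 1)
        if date(year, month, day).weekday() == calendar.THURSDAY
    ]
    # Every month contains at least four Thursdays, so both indexes exist.
    return [thursdays[1], thursdays[3]]


print(maintenance_thursdays(2024, 1))
# [datetime.date(2024, 1, 11), datetime.date(2024, 1, 25)]
```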
*Transcriptional regulation.  Create an instance of the transcriptional process, using the "Add Process" item on the left of the graph canvas. Connect the upstream activity to the transcriptional process with "directly positively regulates", and the transcriptional process to the downstream activity with "causally upstream of".
====Indirect and unknown causal mechanisms====
*If the mechanism of the causal relation is not known, use the more general “causally upstream of” relations (these can include a positive/negative effect, if known).
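The guidance above for choosing between direct and unknown mechanisms can be summarised as a small decision function. The Python sketch below is an illustration, not part of Noctua: the function name and the <code>mechanism</code> keywords are hypothetical, and the relation strings for the unknown-mechanism case are assumed to follow the Relations Ontology labels. It covers only the three simplest cases and leaves out the small-molecule and transcriptional patterns, which need an intermediate node.

```python
from typing import Optional

# Adverb forms used in the relation labels.
_EFFECT_ADVERB = {"positive": "positively", "negative": "negatively"}


def suggest_causal_relation(mechanism: str, effect: Optional[str] = None) -> str:
    """Suggest a relation label for an edge between two activities.

    mechanism: "direct_regulation", "direct_input", or "unknown"
    effect:    "positive", "negative", or None if the direction is not known
    """
    if effect is not None and effect not in _EFFECT_ADVERB:
        raise ValueError(f"unknown effect: {effect!r}")
    if mechanism == "direct_input":
        # Upstream activity creates the molecule the downstream activity acts on.
        return "provides direct input for"
    if mechanism == "direct_regulation":
        # Binding, covalent modification, or a directly binding small molecule.
        if effect is None:
            return "directly regulates"
        return f"directly {_EFFECT_ADVERB[effect]} regulates"
    if mechanism == "unknown":
        if effect is None:
            return "causally upstream of"
        return f"causally upstream of, {effect} effect"
    raise ValueError(f"unknown mechanism: {mechanism!r}")
```

For the MAP3K1 and MAP2K1 example, <code>suggest_causal_relation("direct_regulation", "positive")</code> returns "directly positively regulates".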


==Subfunctions: specifying more detail about molecular activities==
Sometimes, molecular activities are composed of distinct subfunctions, and those subfunctions may even be carried out in distinct locations, or by distinct subunits of a complex.  For example you may want to specify “hormone binding” in the “cytosol” as a subfunction of a nuclear receptor, that then activates (directly positively regulates) “transcription factor activity” in the “nucleus”.  To specify subfunctions, you will create new activities and link them to an activity that you have previously created that describes the overall function of the gene product (e.g. “nuclear receptor activity”).  Subfunctions (e.g. “hormone binding”) can be created using the simple annoton editor, but do not fill in the biological process field as it is the same as for the overall function.  Once the new annoton is created, link it to the overall molecular function you created earlier, by dragging (on the graph canvas) from the subfunction annoton (blue circle) to the overall activity, and selecting the “part of” relation.  You will then need to add evidence by clicking on the "part of" edge; a box will pop up; fill in the evidence fields and press the "Add" button.


==Editing the model==
[[Category: Annotation]]
Editing can currently be performed only on the graph canvas (the simple annoton editor form does not pick up any operations you have performed on the graph canvas).
Note that only one edit operation can be done at a time.  To change something on the canvas, you will need to first ADD the correct part, and then DELETE the incorrect part, as separate operations.  We recommend that you add first, so that you can transfer evidence from the incorrect part if necessary, by using the “clone other” operation.
===Editing relations===
Relations can be removed by dragging the end of the relation arrow away from the box it connects to, into an empty part of the canvas.  Relations can be added by clicking on the blue circle inside the upstream box, and dragging to the downstream box.  Evidence for a relation can be edited by clicking on the relation arrow.
===Editing the type/label on a graph node===
To edit a simple box on the graph (no colored bars indicating that it has multiple parts folded together for easy viewing), just click on the green square.  To change it, first add the new term by filling in the field under “add type”, and clicking add.  Then reopen the box again and delete the old term by clicking on the red “x” next to it.
===Editing types/labels that are inside a graph node===
* To edit properties of an activity that are “folded” into the molecular activity box on the canvas, click on the green box in the corner of a box.  Note that only one edit operation can be done at a time, so do not make more than one edit before pressing a button to save the edit.  To change part of the annoton, you will need to first ADD the corrected part, and then DELETE the incorrect part, as separate operations.
* To remove a property of the annoton, click the “x” next to it.
* To edit the evidence, click on the “E” next to the part for which you want to edit evidence (e.g., the “E” next to enabled by is the evidence that the molecular function is enabled by the gene product).
==Making "traditional" (single aspect) GO annotations using Noctua==
===Molecular function annotation===
* Fill in the gene product field
* Fill in the molecular function field, including evidence
* Optionally, the following "extensions" can be added:
** has_input(molecule): fill in the "has input" field and evidence
** happens_during(biological phase): fill in the "happens during" field and evidence
** occurs_in(cellular component): fill in the "cellular component" field and evidence
** part_of(biological_process): fill in the "biological process" field and evidence
===Cellular component annotation===
* Use the "Component only" version of the form.  This is still under development.
===Biological process annotation===
* Fill in the gene product field.
* Fill in the molecular function field with "molecular_function" if unknown.  Fill in the evidence with the same evidence code and PMID that you will use for biological process.
* Fill in the biological process field with the desired GO term, and evidence.
* Optionally, the following "extensions" can be added:
** part_of(larger biological process): fill in the "part of biological process" field.  Other extensions should be added to the molecular function.
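The mapping from form fields to molecular function extensions listed above can be written out explicitly. The Python sketch below is illustrative: the dictionary keys are the form field names as quoted in this section, the <code>relation(value)</code> output simply mirrors the notation used in the lists above, and the function name is hypothetical. It does not reproduce how Noctua stores or exports annotations.

```python
from typing import Dict, List

# Form field name -> extension relation, as listed for molecular function.
MF_EXTENSION_FIELDS = {
    "has input": "has_input",
    "happens during": "happens_during",
    "cellular component": "occurs_in",
    "biological process": "part_of",
}


def mf_extensions(form_fields: Dict[str, str]) -> List[str]:
    """Return relation(value) strings for every optional field that is filled in."""
    extensions = []
    for field_name, relation in MF_EXTENSION_FIELDS.items():
        value = form_fields.get(field_name, "").strip()
        if value:  # blank optional fields produce no extension
            extensions.append(f"{relation}({value})")
    return extensions


# Only the cellular component field is filled in here (GO:0005634 is nucleus).
print(mf_extensions({"cellular component": "GO:0005634", "has input": ""}))
# ['occurs_in(GO:0005634)']
```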

Latest revision as of 11:36, 8 January 2024

GO-CAMs and Noctua

  • This documentation is presented in two parts:
    • GO-CAM Principles
    • Noctua Curation Tool

GO-CAM Principles

Standard GO Annotations and GO-CAM Models

Standard GO Annotations

GO annotations have been a key component of GO since its inception. Standard annotations are defined as an association between a gene and a biological concept from one of the three GO aspects: Molecular Function (MF), Biological Process (BP), and Cellular Component (CC). Standard annotations always contain a reference (either a published, peer-reviewed paper or internal GO reference) and an evidence code that indicates the type of experiment or method used to make the assertion. Standard GO annotations may further be qualified using annotation extensions that provide additional biological context to a GO term using a relation from the Relations Ontology (RO) and a term from GO or an external ontology, e.g. UBERON.
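
As an illustration only, the components of a standard annotation described above can be sketched as a small Python data structure. The class and field names (`StandardAnnotation`, `Extension`) are hypothetical choices made for this sketch and do not correspond to a GO file format or a Noctua API; the gene and reference identifiers are placeholders.

```python
from dataclasses import dataclass, field
from typing import List

# The three GO aspects named in the text.
ASPECTS = {"MF", "BP", "CC"}


@dataclass(frozen=True)
class Extension:
    """An annotation extension: an RO relation plus an ontology term."""
    relation: str  # relation label, e.g. "occurs in"
    filler: str    # term from GO or an external ontology, e.g. UBERON


@dataclass
class StandardAnnotation:
    """Hypothetical container for one standard GO annotation."""
    gene: str           # stable database identifier (placeholder below)
    go_term: str        # GO term identifier
    aspect: str         # "MF", "BP", or "CC"
    reference: str      # paper or internal GO reference
    evidence_code: str  # type of experiment or method
    extensions: List[Extension] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.aspect not in ASPECTS:
            raise ValueError(f"unknown aspect: {self.aspect!r}")
        # A standard annotation always has a reference and an evidence code.
        if not self.reference or not self.evidence_code:
            raise ValueError("reference and evidence code are required")


# Usage with placeholder gene and reference identifiers.
annotation = StandardAnnotation(
    gene="EXAMPLE:gene1",
    go_term="GO:0004672",
    aspect="MF",
    reference="PMID:0000000",
    evidence_code="IDA",
    extensions=[Extension(relation="occurs in", filler="GO:0005829")],
)
```

The validation in `__post_init__` mirrors the statement that every standard annotation carries a reference and an evidence code; extensions remain optional.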

GO-CAM Models

While standard GO annotations are very useful for discerning basic information about genes, they provide only a partial view of each gene's role in a larger biological context. To provide more comprehensive annotation of genes and link their activities in a causal framework, the GO developed GO-CAMs. Using RO relations, GO-CAMs link GO annotations together with biological entities and external ontology terms to model how a gene functions in the broader context of a biological process or pathway. GO-CAMs thus provide structured descriptions of biological systems and allow for interrogation of causal events in biology through use of clearly defined, and consistently applied, semantics.

A summary of the GO-CAM model specifications is presented in Figure 1.

Figure 1. GO-CAM Model Specifications

The basic unit of a GO-CAM model is the Activity Unit, outlined on the left, which represents a set of standard GO annotations with select annotation extensions, e.g. the inputs and outputs of a molecular function. GO-CAM models are constructed by filling in as many pieces of relevant information in an Activity Unit as possible and then linking different Activity Units in a causal chain to model a biological process. Thus, GO-CAM models use standard GO annotations as the foundation on which to build more comprehensive representations of biology.
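
The idea of Activity Units linked in a causal chain can be sketched as a small directed graph. This is a conceptual sketch under stated assumptions, not how Noctua stores models: the names `ActivityUnit`, `Model`, `link`, and `downstream_of` are hypothetical, the gene product identifiers are placeholders, and the relation label is given only as an example of a causal relation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class ActivityUnit:
    """One activity with as much context as is known (all names hypothetical)."""
    enabled_by: str                   # gene, gene product, or complex
    molecular_function: str           # GO MF term
    part_of: Optional[str] = None     # GO BP term, if known
    occurs_in: Optional[str] = None   # GO CC term, if known


@dataclass
class Model:
    """Activity units plus the causal edges that connect them."""
    title: str
    activities: Dict[str, ActivityUnit] = field(default_factory=dict)
    causal_edges: List[Tuple[str, str, str]] = field(default_factory=list)

    def add_activity(self, key: str, unit: ActivityUnit) -> None:
        self.activities[key] = unit

    def link(self, upstream: str, relation: str, downstream: str) -> None:
        # Both ends of a causal edge must already be activities in the model.
        for key in (upstream, downstream):
            if key not in self.activities:
                raise KeyError(key)
        self.causal_edges.append((upstream, relation, downstream))

    def downstream_of(self, key: str) -> List[str]:
        return [d for u, _, d in self.causal_edges if u == key]


# Usage: two partially specified activities joined by one causal relation.
model = Model(title="Example pathway")
model.add_activity("kinase", ActivityUnit("EXAMPLE:gp1", "protein kinase activity"))
model.add_activity("tf", ActivityUnit("EXAMPLE:gp2", "DNA-binding transcription factor activity"))
model.link("kinase", "directly positively regulates", "tf")
```

Optional fields default to `None` because a partially filled Activity Unit is still useful; details can be added as more information becomes available.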

Use of GO in GO-CAMs

Molecular Activities in GO-CAMs

  • Wherever possible, curators should strive to select the single most granular GO Molecular Function (MF) term that best describes the overall activity of the gene, gene product, or protein-containing complex being annotated.
  • If desired, individual "sub-functions" may be captured by using the 'part of' relation between the main MF and its sub-functions.

Biological Processes in GO-CAMs

  • The ultimate aim of GO-CAMs is to create a suite of Biological Process (BP)-centric models that can be used to interrogate causal effects of molecular activities on one another as part of the execution of a larger BP.
  • When annotating, curators should always think about the BP they are modeling and what MFs are 'part of' that BP.
  • Additional relations between MFs and BPs, e.g. 'causally upstream of or within', can be used to capture experimental information that, in the future, will be incorporated into a more complete model of that process.

Cellular Components in GO-CAMs

  • Cellular component annotations are intended to capture where the MF enabled by a gene, gene product, or protein-containing complex occurs.
  • Cellular component annotations may be further qualified with cell, tissue, and organismal context if that information is germane to the process being modeled.

Noctua: the Gene Ontology's GO-CAM Annotation Tool

Noctua is a web-based, collaborative Gene Ontology (GO) annotation tool developed by the GO Consortium. Noctua can be used to create standard GO annotations as well as more expressive models of biological processes, known as GO-CAMs (Gene Ontology Causal Activity Models). Noctua provides both form-based and graph-based user interfaces.

System Requirements

Noctua is a web-based annotation tool and thus requires only a web browser to access and use.

We have tested Noctua primarily with Chrome on macOS.

Issues that arise when using other browsers and operating systems may be reported on go-helpdesk.

User Account Setup

GO-CAMs can be browsed using Noctua, but no annotations can be created or edited unless a user has a registered account.

To create a new account, please email help@geneontology.org or open a ticket on the helpdesk repo on GitHub.

Note that to create a Noctua account, you will need an ORCID and a GitHub account.

Please allow 24 hours for your account to be created.

If you have any questions about user accounts, please contact help@geneontology.org.

Entities and Ontologies for Annotation

Entities for Annotation

Genes and Gene Products

  • The primary nodes in a GO-CAM model are the ACTIVITIES (Molecular Functions (MFs)) of genes, gene products, or protein-containing complexes.
  • Every gene, gene product, and protein-containing complex annotated in GO-CAMs must be associated with a stable database identifier and represented in a GPI (Gene Product Information) file (preferred), in an existing annotation file, e.g. a GAF (Gene Association File), or in the GO Cellular Component aspect.
  • Curators should strive to annotate activities (MFs) to individual genes or gene products wherever possible.

Protein-Containing Complexes

  • In GO-CAMs, curators should always try to assign each member of a protein-containing complex its specific activity (e.g. regulatory activity vs catalytic activity).
  • However, if the main activity of the protein-containing complex cannot be ascribed to a single subunit of the complex (e.g. RNA polymerase II activity), then the activity should be enabled by an appropriate protein-containing complex (e.g. a GO protein-containing complex), with each gene or gene product linked to that complex by a 'part of' relation.
  • Requests to add new entity identifiers to Noctua should be directed to help@geneontology.org.

Ontologies for Annotation

Gene Ontology

  • The GO is used to annotate:
    • Molecular Activities (MF)
    • Biological Processes (BP)
    • Cellular Components (CC)
  • To provide appropriate biological context to an annotation or an activity unit, additional ontologies may be used either in GO-CAM or in annotation extensions [link].

Relations Ontology

Relations used in GO-CAM are a subset of the Relations Ontology (RO). These relations and their usage are described on the Annotation Relations page.

Cell and Anatomy Ontologies

  • Describes the location where processes and functions occur.
  • Describes the location of a GO cellular component.
  • Add list

Biological Phase and Life Stage Ontologies

  • Describes the temporal period during which processes and functions occur.
  • Describes the temporal period during which a cellular component or anatomical entity exists.
  • Add list

Chemical Ontology (ChEBI) [link]

  • Can be used to capture inputs and outputs of processes and functions.
  • GO-CAM uses the Chemical Entities of Biological Interest (ChEBI) ontology.
  • Sequence Ontology [link]

Requests to add ontologies to Noctua should be sent to help@geneontology.org.

GO-CAM Workflow

  • The ultimate goal for GO-CAMs is to create a knowledge graph whereby users can use the GO to traverse a causal representation of a biological system.
  • To that end, curators should try, as much as possible, to make individual annotations in the context of the overall process being modeled.
  • It can be very helpful to refer to a summary figure from a recent research article or review to help visualize a potential GO-CAM.
  • When making a GO-CAM model, we suggest working through these questions:
    • What are the main activities (MFs) of each of the gene products in a model?
    • How do those activities relate, in a causal chain, to each other?
    • What processes are those activities involved in?
    • Where do the activities occur?
  • Even when annotating a single paper, try to incorporate as much of this workflow as possible. This will make it easier, in the future, to build on existing models with new curation.

Noctua Users Manual

Noctua Landing Page

  • The Noctua landing page is the portal through which curators can browse, search, and filter models.
  • It is also the starting point for curation (when logged in) and the place where individual GPAD and OWL files for a model can be downloaded.
  • By default, the Noctua landing page sorts models by date in descending order, i.e. the most recently edited models are shown at the top of the list.


Noctua Landing Page


Filtering Models

  • There are two ways to filter GO-CAMs on the landing page:
  1. Click on the magnifying glass icon in the upper left.
  2. Click on the metadata icons to the right of the model title in the table.


Filtering Options


Filtering with the Magnifying Glass

Clicking on the magnifying glass opens up the menu of available filter options:

  1. Ontology term (autocomplete)
  2. Gene product (autocomplete)
  3. Reference
    1. If entered as free-text, must be the full reference id, e.g. PMID:31884020 or doi:10.1016/j.ydbio.2019.12.010
    2. Can also use the drop-down prefix menu (and PMID look-up feature) by clicking on the =+ icon.
    3. Must press return after entering search string.
  4. Organism (drop-down list)
  5. Contributor (autocomplete and drop-down list)
  6. Groups (autocomplete and drop-down list)
  7. Exact date (type YYYY-MM-DD and press return, or select a date from the calendar and press return)
  8. Date range (type YYYY-MM-DD and press return, or select dates from the calendar and press return)
  9. Title (type the title text and press return)
  10. State (drop-down list)
Filtering Options
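
The input formats mentioned in the filter list (a full reference id such as PMID:31884020 or doi:10.1016/j.ydbio.2019.12.010, and dates written as YYYY-MM-DD) can be checked before pasting them into the search fields. The following is a minimal sketch, not part of Noctua: the function names and the regular expression are assumptions made for illustration, and Noctua may accept other reference prefixes that this pattern rejects.

```python
import re
from datetime import date, datetime

# Assumed pattern covering only the two reference forms shown in the text:
# "PMID:<digits>" and "doi:10.<registrant>/<suffix>".
REFERENCE_PATTERN = re.compile(r"^(PMID:\d+|doi:10\.\d{4,9}/\S+)$")


def is_full_reference_id(text: str) -> bool:
    """Return True if text looks like a full PMID or DOI reference id."""
    return REFERENCE_PATTERN.match(text.strip()) is not None


def parse_filter_date(text: str) -> date:
    """Parse a YYYY-MM-DD string; raises ValueError if it does not parse."""
    return datetime.strptime(text.strip(), "%Y-%m-%d").date()


# Usage
print(is_full_reference_id("PMID:31884020"))
print(is_full_reference_id("31884020"))
print(parse_filter_date("2024-01-08"))
```

A bare number without the `PMID:` prefix fails the check, which matches the requirement that free-text references be entered as the full reference id.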

Noctua Form and Graph Editors

Noctua has three curation interfaces: 1) the Noctua Visual Pathway Editor (VPE), 2) the Noctua Form and 3) the Noctua Graph Editor.

  • The Noctua Visual Pathway Editor (VPE) is designed for curating GO-CAM models, i.e. models that include at least two activities/MFs linked by causal relations.
  • The Noctua Form is a structured annotation form that is recommended for creating 'standard' GO annotations.
  • The Noctua Graph Editor is best suited for linking standard annotations together to create a causal model (GO-CAM).


Noctua Users Manual

  • Getting started
    • Browsing and searching annotations and models
    • Login
  • Creating standard GO annotations
    • Form Editor
      • Molecular Function
      • Biological Process
      • Cellular Component
    • Graph Editor
      • Molecular Function
      • Biological Process
      • Cellular Component
  • Adding contextual information (annotation extensions)
    • Form Editor
      • Molecular Function
      • Biological Process
      • Cellular Component
    • Graph Editor
      • Molecular Function
      • Biological Process
      • Cellular Component
  • Editing annotations
    • Form Editor
    • Graph Editor
  • Creating GO-CAMs
    • Creating an activity unit
      • Form Editor
      • Graph Editor
    • Linking Activities
      • Form Editor
      • Graph Editor
  • Model metadata
    • Model titles
      • General Guidelines
      • Form Editor
      • Graph Editor
    • Releasing models to production
      • Form Editor
      • Graph Editor
  • Other tips and tricks
    • Adding a NOT qualifier to an annotation
    • Importing existing annotations
    • Changing annotation group
    • Model validation
    • Running the reasoner
    • Viewing GPAD export
    • Using templates

Noctua Annotation Review

Documentation on how to use the Noctua Annotation Review workbench is here.

Model Copy

  • Noctua has a model copy functionality that allows users to copy an entire model, minus the evidence.
  • Model copy can help curators make efficient use of existing models to create new content.
  • More details on model copy and how to use it can be found here.

Noctua Maintenance

  • The Noctua curation tool undergoes routine maintenance on the second and fourth Thursdays of each month, from approximately 4pm to 7pm PDT.
  • During these maintenance outages the following tasks are typically performed:
    • Ontology updates:
      • incorporate the latest versions of all ontologies used in Noctua, including NEO (Noctua Entity Ontology)
    • Model updates:
      • replace obsolete ontology terms if there is a 'replaced by' value
      • replace ontology terms if usage has changed
      • delete any models marked with state 'delete'
    • Software updates:
      • Incorporate bug fixes
      • Add new features
  • Noctua maintenance outage reminders are sent out on the go-consortium mailing list ~24 hrs. prior to the outage.
  • For each outage, a list of specific tasks is enumerated in a ticket on the noctua GitHub repo.
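
To plan curation sessions around the routine outages, the second and fourth Thursdays of a month can be computed with the Python standard library. This is a convenience sketch only; the function name `maintenance_thursdays` is hypothetical, and the outage reminders sent to the mailing list remain the authoritative schedule.

```python
import calendar
from datetime import date
from typing import List


def maintenance_thursdays(year: int, month: int) -> List[date]:
    """Return the second and fourth Thursdays of the given month."""
    days_in_month = calendar.monthrange(year, month)[1]
    thursdays = [
        date(year, month, day)
        for day in range(1, days_in_month + 1)
        if date(year, month, day).weekday() == calendar.THURSDAY
    ]
    # Every month contains at least four Thursdays, so both indexes exist.
    return [thursdays[1], thursdays[3]]


# Usage: January 2024 begins on a Monday, so its Thursdays fall on
# the 4th, 11th, 18th and 25th.
print(maintenance_thursdays(2024, 1))
# → [datetime.date(2024, 1, 11), datetime.date(2024, 1, 25)]
```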