Annotation Extension: Capturing cell and tissue types

From GO Wiki
Revision as of 07:46, 8 December 2010 by Huntley (talk | contribs)

Introduction

This page describes the guidelines for using cell type information, e.g. from the cell ontology or the plant ontology, in Column 16 (Annotation Extension) of the Gene Association File. It is a subset of the guidelines laid out in Annotation_Cross_Products. The use of Column 16 will be introduced incrementally; cell type is the first vocabulary to be rolled out.

Usage notes

1. When including cell type information in column 16, no judgment is made as to whether a gene product is involved in a particular process in just a particular cell type or in all cell types. In other words, curators simply annotate all available data in a paper.

Therefore it is incorrect to assume that a gene product in a GO annotation that has a cell type identifier in column 16 is involved in the curated process only in that annotated cell type. Similarly, it would be a mistake to conclude that the lack of a cell type identifier in column 16 indicates that a given gene product is involved in a process in all cell types where it is found. The only correct interpretation of a GO annotation with a cell type identifier in column 16 is that, in one particular experiment, a given gene product was found to be involved in a particular process in a particular cell type.

2. Two annotations must not differ solely in the contents of column 16, because processing this field is optional for users. All such information should therefore be added to one line, with separate statements in column 16 separated by pipes (|). See Multiple annotation extensions for cell type.

3. Cell type location should not be inferred from investigations that use immortalized cell lines. Such cell lines should be treated as an experimental tool rather than an indication of the biological context of function. As the process of immortalization is known to involve multiple genetic changes, a curator should never assume that the studied process is carried out in the equivalent normal cell type.
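
Usage note 2 can be illustrated with a short Python sketch. It assumes each annotation is held as a list of the 17 tab-delimited fields of a GAF line; the function name merge_extensions is hypothetical and is not part of any GO tool.

```python
EXT_COL = 15  # zero-based index of column 16 (Annotation Extension)


def merge_extensions(rows):
    """Collapse rows that differ only in column 16 into a single row.

    rows: list of lists, each holding the fields of one GAF line.
    Returns a new list in which the column 16 statements of otherwise
    identical rows are joined with pipes.
    """
    merged = {}  # dicts keep insertion order in Python 3.7+
    for row in rows:
        # Key on every field except column 16.
        key = tuple(row[:EXT_COL] + row[EXT_COL + 1:])
        extensions = merged.setdefault(key, [])
        for ext in row[EXT_COL].split("|"):
            if ext and ext not in extensions:
                extensions.append(ext)

    result = []
    for key, extensions in merged.items():
        fields = list(key)
        fields.insert(EXT_COL, "|".join(extensions))
        result.append(fields)
    return result
```

Rows that already share a line are left unchanged, and duplicate statements are written only once.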

Allowable relations for cell type annotation extensions

  • part_of - Indicates a GO Cellular Component is part_of a specific cell type from a cell type ontology.
  • occurs_in - Indicates a GO Molecular Function or GO Biological Process occurs_in a specific cell type from a cell type ontology.
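
The pairing of relation and ontology aspect described above could be checked mechanically. The following Python sketch assumes the aspect is given as the single-letter code used in column 9 of the Gene Association File (C, F or P); the names ALLOWED_RELATION and relation_allowed are illustrative only.

```python
# Relation permitted for a cell type extension, keyed by GO aspect code.
ALLOWED_RELATION = {
    "C": "part_of",    # Cellular Component
    "F": "occurs_in",  # Molecular Function
    "P": "occurs_in",  # Biological Process
}


def relation_allowed(aspect, relation):
    """Return True if `relation` is the one permitted for `aspect`."""
    return ALLOWED_RELATION.get(aspect) == relation
```

For instance, relation_allowed("C", "part_of") holds, whereas relation_allowed("C", "occurs_in") does not.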


Using cell type ontologies to enhance Cellular Component annotations

Specifying that a gene product is located in a cellular component of a specific cell type

For example: If a gene product is localized to the mitochondrial membrane (GO:0031966) in a spermatocyte (CL:0000017):

 col 5: GO:0031966
 col 16: part_of(CL:0000017)

Or, if a gene product is localized to the cell hair (GO:0070451) of a root hair cell (PO:0000256):

 col 5: GO:0070451
 col 16: part_of(PO:0000256)

Use cases

1. Toll-like receptor 4 (TLR4) (UniProtKB:O00206) is located intracellularly in the perinuclear region (GO:0048471 perinuclear region of cytoplasm) only in dendritic cells (CL:0000451), PMID:15027902

So the annotation would be:

DB Object ID (Col 2)  DB Object Symbol (Col 3)  GO ID (Col 5)  Reference (Col 6)  Extension (Col 16)
O00206                TLR4                      GO:0048471     PMID:15027902      part_of(CL:0000451)


2. TLR4 is located on the cell surface (GO:0009986) in monocytes (CL:0000576), PMID:15027902

So the annotation would be:

DB Object ID (Col 2)  DB Object Symbol (Col 3)  GO ID (Col 5)  Reference (Col 6)  Extension (Col 16)
O00206                TLR4                      GO:0009986     PMID:15027902      part_of(CL:0000576)


Using cell type ontologies to enhance Molecular Function and Biological Process annotations

Specifying that a gene product is involved in a process in a specific cell type

For example: If a gene product is involved in transcription (GO:0006350) in Purkinje cells (CL:0000121):

 col 5: GO:0006350
 col 16: occurs_in(CL:0000121)

Use cases

1. Human SLC22A5 (UniProtKB:O76082) is involved in quorum sensing involved in interaction with host (GO:0052106) in colonic epithelial cells (CL:0000066), PMID:18005709

So the annotation would be:

DB Object ID (Col 2)  DB Object Symbol (Col 3)  GO ID (Col 5)  Reference (Col 6)  Extension (Col 16)
O76082                SLC22A5                   GO:0052106     PMID:18005709      occurs_in(CL:0000066)


2. Human Wnt7a (UniProtKB:O00755) is involved in positive regulation of epithelial cell proliferation involved in wound healing (GO:0060054) in corneal epithelial cells (CL:0000575), PMID:15802269

So the annotation would be:

DB Object ID (Col 2)  DB Object Symbol (Col 3)  GO ID (Col 5)  Reference (Col 6)  Extension (Col 16)
O00755                Wnt7a                     GO:0060054     PMID:15802269      occurs_in(CL:0000575)


Exception

One exception to using the occurs_in relation for enhancing Biological Process annotations arises when annotating a gene product to terms such as '<X> cell fate commitment'. The commitment actually occurs in a stem cell before the '<X> cell' forms. For example, an annotation to 'myoblast cell fate commitment' should not carry the annotation extension occurs_in(CL:0000056), which would indicate that the commitment to become a myoblast cell is occurring in the myoblast cell (CL:0000056) when, in fact, it is occurring in a stem cell.
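
A curation check for this exception might look like the Python sketch below. It is only a heuristic under stated assumptions: it matches on the GO term name ending in "cell fate commitment" and flags any occurs_in statement for manual review, because it cannot tell which cell type the term refers to. The function name flag_fate_commitment is hypothetical.

```python
def flag_fate_commitment(go_term_name, extension):
    """Return True if the annotation should be reviewed by a curator.

    go_term_name: name of the GO term in column 5, e.g.
                  'myoblast cell fate commitment'.
    extension:    contents of column 16, possibly pipe-separated.
    """
    name = go_term_name.strip().lower()
    is_commitment = name.endswith("cell fate commitment")
    # Look at each pipe-separated statement in column 16.
    uses_occurs_in = any(
        part.strip().startswith("occurs_in(")
        for part in extension.split("|")
    )
    return is_commitment and uses_occurs_in
```

A flagged annotation is not necessarily wrong; the check only marks cases where the cell type in column 16 should be compared with the cell type named in the term.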

Multiple annotation extensions for cell type

A publication may describe the localization of a gene product in two or more distinct cell types.

For example: Theoretical gene 1234 is located in the mitochondrial membrane (GO:0031966) of Purkinje cells (CL:0000121) and bipolar neurons (CL:0000103), PMID:54321

So the annotation would be:

DB Object ID (Col 2)  DB Object Symbol (Col 3)  GO ID (Col 5)  Reference (Col 6)  Extension (Col 16)
1234                  Theo                      GO:0031966     PMID:54321         part_of(CL:0000121)|part_of(CL:0000103)

N.B. No meaning is attached to the order of the cell type identifiers listed in column 16.
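
Because the order carries no meaning, software reading column 16 may prefer to treat its contents as an unordered set. The Python sketch below shows one way to do this. The regular expression is an assumption based on the examples on this page (a relation name followed by a PREFIX:digits identifier in parentheses), and parse_extensions is a hypothetical name.

```python
import re

# Assumed shape of one statement, e.g. part_of(CL:0000121).
EXTENSION_RE = re.compile(r"^(\w+)\(([A-Za-z_]+:\d+)\)$")


def parse_extensions(col16):
    """Return a frozenset of (relation, identifier) pairs from column 16."""
    pairs = set()
    for part in col16.split("|"):
        part = part.strip()
        if not part:
            continue  # tolerate an empty column or a stray pipe
        match = EXTENSION_RE.match(part)
        if match is None:
            raise ValueError("unrecognized extension: %r" % part)
        pairs.add((match.group(1), match.group(2)))
    return frozenset(pairs)
```

Two column 16 values that list the same statements in a different order then compare as equal.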


Requesting new cell type ontology terms

If the cell type term you require does not exist, you can make a request on the Cell Type Ontology SourceForge tracker, or for plant cell types, on the Plant Ontology SourceForge tracker.