Contributing GO Annotations
Latest revision as of 10:36, 9 April 2019

Contributions of GO annotations

The GO Consortium annotation set is created from a large number of annotation efforts associated with the GO Consortium. New GO annotation groups are strongly encouraged to contact the GO Consortium (http://www.geneontology.org/GO.contacts.shtml), which is happy to assist and support new groups.

A number of possible methods of interacting with the GO Consortium are outlined below:

A. Small-scale annotation contributions

Where users find errors or omissions in the GO annotation set, they are strongly encouraged to contact the GO Consortium with details of the data that requires improvement.

In all cases, it is extremely useful to include in your feedback:

  • An identifier for the gene product/gene of interest
  • A citation to a publicly-available reference supporting a need for annotation improvement (such as a PubMed identifier)
  • Some details on the specific annotations that should be reviewed or added.

Annotation enquiries can be communicated to the GO Consortium via:

1. Email

Users can contact the GO Consortium via the GO Help Desk (gohelp@geneontology.org). The most appropriate annotation group in the GO Consortium (chosen on the basis of current annotation contributions, species or process expertise) will then be asked to investigate the annotation request and make the most appropriate changes.

2. SourceForge Annotation tracker

The SourceForge Annotation Tracker can be used to request annotation updates or to report errors.

B. Large-scale annotation contributions

1. Single File Contribution

Some research communities do not have an established database group with the funding and time to commit to long-term maintenance of their GO annotation datasets. Such groups can nevertheless contribute annotations to the central GO Consortium repository on a single-submission basis.

Under such circumstances, the external group sends its data on the understanding that, although it will be fully acknowledged as the creator of the annotations, it is unable to commit to any future updates of the annotation set. Therefore, once the GO Consortium annotation group has accepted the annotations, the GO Consortium immediately becomes responsible for the future maintenance of the annotation set.

In these cases it is vital that the submitters have been in contact with the GO Consortium before submitting their annotation file, so the Consortium can work with the submitting annotation group to ensure submission of the highest-quality data possible.

The GO Consortium would become responsible for future annotation updates in response to user feedback, changes occurring in the GO (e.g. GO terms being made secondary or obsolete), changes to annotated sequence identifiers, or annotation format changes. Changes made by GO Consortium curators to submitted annotations will be attributed to the GO Consortium.
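As an illustration of the kind of maintenance described above, the following is a minimal Python sketch of how annotations might be checked against GO term changes. It is not the GO Consortium's own tooling: the function name `review_go_ids`, the input structures, and the GO identifiers in the usage note are all hypothetical, and the mapping of secondary to primary identifiers and the set of obsolete terms are assumed to have been extracted from the ontology beforehand.

```python
def review_go_ids(annotations, secondary_to_primary, obsolete_ids):
    """Split annotations into those that can be kept and those needing review.

    annotations          -- list of (gene_product_id, go_id) pairs
    secondary_to_primary -- dict mapping secondary GO IDs to their primary ID
    obsolete_ids         -- set of GO IDs that have been made obsolete
    """
    updated = []
    needs_review = []
    for gene_product_id, go_id in annotations:
        # A secondary ID is replaced by its primary ID; other IDs are unchanged.
        go_id = secondary_to_primary.get(go_id, go_id)
        if go_id in obsolete_ids:
            # Obsolete terms have no automatic replacement here,
            # so a curator has to choose a new term.
            needs_review.append((gene_product_id, go_id))
        else:
            updated.append((gene_product_id, go_id))
    return updated, needs_review
```

For example, calling `review_go_ids([("P12345", "GO:9999902")], {"GO:9999902": "GO:9999901"}, set())` with these made-up identifiers would rewrite the annotation to the primary identifier, while any annotation to an identifier listed in `obsolete_ids` would be returned separately for manual curation.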

Interested annotation efforts should contact the GO Consortium using the email address: gohelp@geneontology.org.


2. On-going GO Annotation Contributions/Collaborations

Annotation groups may alternatively choose to regularly supply the GO Consortium with an annotation file.

In this instance the annotation group would remain responsible for the maintenance and improvement of the annotation set, and for responding to any requests for annotation changes.

Requirements for both input mechanisms:

1. It is important that the external curation group contacts the GO Consortium before annotation work is carried out, so that mentors/trainers can be allocated from the GO Consortium and the data produced can be checked against all GO Consortium annotation and format requirements.

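2. All annotation sets should be supplied in a standard GO Consortium annotation format, such as GAF 2.0 (http://www.geneontology.org/GO.format.gaf-2_0.shtml#db_object_id). Guidance for new annotation groups on how to create an annotation file is available from the Submit GO annotations page (http://wiki.geneontology.org/index.php/Submit_GO_annotations).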

3. Annotated gene products should ideally be identified by a UniProtKB accession (e.g. P12345) or an NCBI accession. Where alternative identifier types are used, these need to be stable, and a gp2protein file (see PAGE xyz) that can be used to map annotations to central (UniProt/NCBI) gene/gene product identifier types must be generated and submitted alongside the annotation file.

4. As described in the GAF 2.0 format, the contributing annotation group must supply a name that will be used to acknowledge their annotation set. This database name will be visible in the 'assigned_by' field (column 15) of all annotation lines contributed by the group, and will be included in the list of annotation providers (http://www.geneontology.org/cgi-bin/xrefs.cgi).

5. Where an 'On-going GO Annotation Contribution/Collaboration' is entered into, a primary contact person/email list needs to be identified, so that any annotation requests can be fed back to the group and acted upon in a timely manner.

This information should be submitted in an annotation .conf file, as displayed here: ftp://ftp.geneontology.org/../go/gene-associations/submission/gene_association.aspgd.conf (note: a readme for .conf files still needs to be created).
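Requirements 2 and 4 above can be sketched in Python as follows. This is a minimal illustration rather than an official GO Consortium tool: it assumes the 17 tab-separated columns of GAF 2.0, and the helper names (`format_gaf_line`, `assigned_by`), the group name `ExampleDB`, the symbol `EXAMPLE1` and the placeholder reference are all hypothetical. The annotation values are chosen only to show the layout and do not represent a curated annotation.

```python
# The 17 columns of a GAF 2.0 annotation line, in order.
GAF_COLUMNS = [
    "db", "db_object_id", "db_object_symbol", "qualifier", "go_id",
    "db_reference", "evidence_code", "with_from", "aspect",
    "db_object_name", "db_object_synonym", "db_object_type",
    "taxon", "date", "assigned_by", "annotation_extension",
    "gene_product_form_id",
]


def format_gaf_line(annotation):
    """Return one tab-separated GAF 2.0 line; unset columns are left empty."""
    unknown = set(annotation) - set(GAF_COLUMNS)
    if unknown:
        raise ValueError("unknown GAF columns: " + ", ".join(sorted(unknown)))
    return "\t".join(annotation.get(column, "") for column in GAF_COLUMNS)


def assigned_by(line):
    """Return the 'assigned_by' value, i.e. column 15 (index 14) of a GAF line."""
    return line.split("\t")[14]


# Illustrative values only (hypothetical group name, symbol and reference).
example = {
    "db": "UniProtKB",
    "db_object_id": "P12345",
    "db_object_symbol": "EXAMPLE1",
    "go_id": "GO:0005739",
    "db_reference": "PMID:0000000",   # placeholder, not a real citation
    "evidence_code": "IDA",
    "aspect": "C",
    "db_object_type": "protein",
    "taxon": "taxon:9986",
    "date": "20190409",
    "assigned_by": "ExampleDB",
}

gaf_header = "!gaf-version: 2.0"
gaf_line = format_gaf_line(example)
```

A submitting group could run a check such as `assigned_by(gaf_line) == "ExampleDB"` over every line of its file to confirm that the agreed group name appears consistently in column 15 before sending the file to the GO Consortium.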

Provide a downloadable Excel spreadsheet to support annotation provision?

- might support groups before a community annotation tool becomes available?

Something similar to this File:Annotation import form.xls, for example?