How External Communities can contribute annotations to the GO Consortium
- How communities can interact with the GO Consortium, either contributing or feeding back on annotations.
Contributions of GO annotations
The GO Consortium annotation set is drawn from a large number of annotation efforts associated with the GO Consortium. New GO annotation groups are strongly encouraged to contact the GO Consortium, which is happy to assist and support such efforts.
A number of possible methods of interacting with the GO Consortium are outlined below:
A. Small-scale annotation contributions
Where users find errors or omissions in the GO annotation set, they are strongly encouraged to contact the GO Consortium with details of the data that requires improvement.
In all cases, it is extremely useful to include in your feedback:
- An identifier for the gene product/gene of interest
- A citation to a publicly-available reference supporting a need for annotation improvement (such as a PubMed identifier)
- Some details on the specific annotations that should be reviewed or added.
Annotation enquiries can be communicated to the GO Consortium via:
1. GO Help Desk
Users can contact the GO Consortium via the GO Help Desk (firstname.lastname@example.org). The most appropriate annotation group in the GO Consortium (based on current annotation contributions, species or process expertise) will then be asked to investigate the annotation request and make the appropriate changes.
2. SourceForge Annotation tracker
The SourceForge Annotation Tracker can be used to request annotation updates or to report errors.
B. Large-scale annotation contributions
1. Single File Contribution
Some research communities do not have an established database group with the funding and time to commit to long-term maintenance of their GO annotation datasets. Such groups can contribute annotations to the central GO Consortium repository as a single submission. The submitters are fully acknowledged as the creators of the annotations, but are understood to be unable to commit to any future updates of the annotation set. Once a GO Consortium annotation group has accepted the annotations, the GO Consortium therefore becomes responsible for the future maintenance of the annotation set.
In these cases it is vital that the submitters have been in contact with the GO Consortium before submitting their annotation file, so the Consortium can work with the submitting annotation group to ensure submission of the highest-quality data possible.
The GO Consortium would become responsible for future annotation updates in response to user feedback or in response to changes occurring in the GO (e.g. GO terms being made secondary or obsolete), changes to annotated sequence identifiers or annotation format changes. Changes made by the GO Consortium curators to submitted annotations will be attributed to the GO Consortium.
Interested annotation efforts should contact the GO Consortium using the email address: email@example.com.
Some GO Consortium groups may be able to offer access to their curation tool.
2. On-going GO Annotation Contributions/Collaborations
External groups may alternatively choose to regularly supply the GO Consortium with an annotation file.
In this instance, the annotation group would continue to be the 'owner' of the annotation set, and would be responsible for responding to any requests for annotation changes.
Requirements for both input mechanisms:
1. The external curation group should contact the GO Consortium before annotation work begins, so that mentors/trainers can be allocated from the GO Consortium and the data produced will satisfy all GO Consortium annotation and format requirements.
2. A reference must be cited in each annotation line that provides details on the methods and results from which the annotation was made. The reference should be either a PubMed identifier or an abstract (GO_REF) describing how the annotation was made (see http://www.geneontology.org/doc/GO.references for all current GO References).
3. The object identifier used in an annotation should ideally be a UniProtKB accession (e.g. P12345) or another stable database identifier. Where alternative identifier types are used, these need to be stable and a gp2protein file (see http://wiki.geneontology.org/index.php/Gp2protein_file) must be available that can be used to map to central (UniProt/NCBI) identifier types.
4. The group must supply a name that will be used to acknowledge their annotation set. This database name will be visible in the 'assigned_by' field (column 15) of all annotation lines contributed by the group, and will be added to the list of annotation providers.
6. Where an 'On-going GO Annotation Contribution/Collaboration' is entered into, a primary contact person must be identified, so that any annotation requests can be fed back and acted upon in a timely manner.
7. All annotation sets should be supplied in standard GO Consortium annotation format, such as GAF 2.0. Special attention should be paid to ensure the file is in the correct format, especially the following:
a) the file must be Unix format
b) the file must have the correct file header
c) if the file contains column names, these must be commented out using ! at the start of the line
d) ensure there are no leading/trailing spaces
e) ensure the file has the correct number of columns, even if these are unpopulated
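The format points in requirement 7 can be pre-checked before a file is sent. The following is a minimal sketch, not an official GO Consortium validator: the function name `check_gaf_format` is hypothetical, and it assumes a GAF 2.0 file whose first line is the `!gaf-version: 2.0` header and whose annotation rows have 17 tab-separated columns. It cannot tell whether a line of column names has been left uncommented if that line happens to contain 17 tab-separated values.

```python
GAF_HEADER = "!gaf-version: 2.0"  # version line used by GAF 2.0 files
GAF_COLUMNS = 17                  # GAF 2.0 rows have 17 tab-separated columns


def check_gaf_format(data: bytes) -> list[str]:
    """Return a list of format problems found in a GAF 2.0 file (empty if none)."""
    problems = []

    # a) Unix format: carriage returns indicate DOS or old Mac line endings.
    if b"\r" in data:
        problems.append("carriage returns found: file is not in Unix format")

    # Normalise line endings so the remaining checks still run on such files.
    text = data.decode("utf-8").replace("\r\n", "\n").replace("\r", "\n")
    lines = text.split("\n")
    if lines[-1] == "":
        lines.pop()  # the final newline leaves one empty trailing element

    # b) File header: assumed here to be the very first line.
    if not lines or lines[0] != GAF_HEADER:
        problems.append(f"first line is not the '{GAF_HEADER}' header")

    for number, line in enumerate(lines, start=1):
        # c) Lines starting with "!" are comments (e.g. commented-out column names).
        if line.startswith("!"):
            continue
        fields = line.split("\t")
        # d) No leading/trailing spaces in any field.
        if any(field != field.strip(" ") for field in fields):
            problems.append(f"line {number}: leading/trailing spaces in a field")
        # e) Every row needs all columns, even when some are left empty.
        if len(fields) != GAF_COLUMNS:
            problems.append(
                f"line {number}: {len(fields)} columns, expected {GAF_COLUMNS}"
            )
    return problems
```

Reading the file in binary mode, for example `check_gaf_format(open(path, "rb").read())`, keeps the original line endings visible to check a).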
PLEASE NOTE: Some GO Consortium annotation groups may impose additional annotation restrictions before agreeing to integrate external annotation sets.
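Requirements 2 to 4 above (a reference on each annotation line, a stable object identifier, and a group name in the 'assigned_by' field) can also be checked line by line. The sketch below is illustrative only. It assumes the GAF 2.0 column layout, in which column 1 is the database, column 2 the object identifier, column 6 the reference and column 15 'assigned_by'. The function name `check_annotation_line`, the reference pattern and the example row are assumptions made for this example, and the accession pattern follows the published UniProtKB accession format.

```python
import re

# Assumed reference syntax: a PubMed identifier or a GO_REF identifier.
REFERENCE_RE = re.compile(r"^(PMID:\d+|GO_REF:\d+)$")

# UniProtKB accession format (6 or 10 characters), e.g. P12345.
UNIPROT_RE = re.compile(
    r"^([OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9]([A-Z][A-Z0-9]{2}[0-9]){1,2})$"
)


def check_annotation_line(line: str) -> list[str]:
    """Return content problems for one tab-separated GAF 2.0 annotation line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 15:
        return ["fewer than 15 columns: cannot read the assigned_by field"]

    database, object_id = fields[0], fields[1]
    references = fields[5].split("|")  # several references may be pipe-separated
    assigned_by = fields[14]

    problems = []
    # Requirement 2: at least one PubMed or GO_REF reference.
    if not any(REFERENCE_RE.match(ref) for ref in references):
        problems.append("no PMID or GO_REF reference in column 6")
    # Requirement 3: UniProtKB accessions should be well formed. Other
    # identifier types are not checked here and need a gp2protein mapping.
    if database == "UniProtKB" and not UNIPROT_RE.match(object_id):
        problems.append(f"'{object_id}' is not a valid UniProtKB accession")
    # Requirement 4: the contributing group must be named.
    if not assigned_by:
        problems.append("assigned_by (column 15) is empty")
    return problems


# Hypothetical annotation row, used only to exercise the checks.
example = "\t".join([
    "UniProtKB", "P12345", "SYMBOL", "", "GO:0005634", "PMID:1234567", "IDA",
    "", "C", "", "", "protein", "taxon:9606", "20120101", "ExampleDB", "", "",
])
print(check_annotation_line(example))
```

A line that passes these checks may still be rejected for other reasons, so this is best treated as a first filter before contacting the GO Consortium.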
Credit for annotation work
Every annotation is marked with the name of the database that made, or last updated, the annotation. This ensures that the database making or editing the annotation will receive full credit for their work.
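Because the credited database name travels with each annotation line, the contribution of each group can be read directly from an annotation file. As a rough sketch, assuming GAF 2.0 lines in which column 15 holds 'assigned_by' (the function name `count_by_assigned_by` and the database names in the usage example are hypothetical):

```python
from collections import Counter


def count_by_assigned_by(lines):
    """Count annotation lines per credited database (column 15 of each line)."""
    counts = Counter()
    for line in lines:
        # Skip comment/header lines and blank lines.
        if line.startswith("!") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 15:
            counts[fields[14]] += 1
    return counts
```

For example, passing an open annotation file to `count_by_assigned_by` returns a `Counter` keyed by database name, from which `most_common()` lists the largest contributors first.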