Drosophila melanogaster (Retired)






L - Total Publications

This is the total number of publications about the gene based on 3 sources:

1. Publications that are already associated with the gene in FlyBase. Only primary research literature is included in this list (reviews, abstracts, personal communications, FlyBase analysis refs, Drosophila Information Service reports, etc. are excluded). These papers are known to contain some mention of the gene because they have already been curated by FlyBase; however, that curation may pre-date GO.

2. Publications (excluding reviews) identified in a PubMed search (title/abstract/MeSH terms) for all identifiers associated with that gene and/or gene product in Drosophila. Some searches are modified (e.g. short synonyms are ignored) to reduce the number of false positives. Where there are very few or no PubMed hits, a full-text search is carried out at PubMed Central.

3. Any additional publications that are cited within other publications and were used for GO annotation. Other papers that were 'skimmed' in the annotation process are not included in the total pubs count.
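As a rough illustration of how the three sources combine into L, here is a minimal Python sketch. It assumes each publication is identified by a PubMed ID and that a paper found by more than one source is counted once; the function and parameter names are hypothetical and are not part of any FlyBase tool.

```python
def total_publications(flybase_pmids, pubmed_search_pmids, cited_pmids):
    """Return L: the number of distinct publications across the three sources.

    Assumption (not stated on this page): papers are keyed by PubMed ID
    and a paper found by several sources is only counted once.
    """
    # A set union removes any paper that appears in more than one source.
    distinct = set(flybase_pmids) | set(pubmed_search_pmids) | set(cited_pmids)
    return len(distinct)


# Made-up identifiers, for illustration only.
example_total = total_publications(["1", "2"], ["2", "3"], ["4"])
```

In this made-up example, paper "2" is found both in FlyBase and by the PubMed search, so it contributes only once to the total.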

M - Triaged papers

FlyBase do not triage papers in the course of normal curation and no link is made between papers and genes until the paper is fully curated. As a result, I've had to work out my own triage process for this project, and the process varies somewhat depending on the number of publications and the status of our current GO data:

For a low number of publications (<10), I review the titles and abstracts of the PubMed hits and eliminate only the obvious false positives. In this case M is very similar to N.

For a modest number of publications (<100), I review the titles and abstracts of the PubMed hits, eliminate the obvious false positives and prioritise the ones that look most promising for GO data. In this case the difference between M and N is more marked.

For a large number of publications (>100), even triaging them is too time-consuming, so I annotate these genes via reviews. Historically, FlyBase has curated a lot of GO data from reviews; where there are many existing NAS or TAS annotations, I would normally start by tracing those back to the source. Frequently this approach is sufficient to achieve 'complete' annotation. For these cases, M = N as there is no triage.

Note: TAS/NAS annotations are removed where appropriate for all genes, not just those associated with many pubs.
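The three tiers described in this section can be summarised as a small decision function. This is only a sketch of the thresholds given above, not a tool used by FlyBase: the names are hypothetical, and placing exactly 10 hits in the middle tier and exactly 100 hits in the top tier is an assumption, since the text does not say where those boundary values fall.

```python
from enum import Enum


class TriageStrategy(Enum):
    REMOVE_FALSE_POSITIVES = "review titles/abstracts; drop obvious false positives"
    REMOVE_AND_PRIORITISE = "drop false positives; prioritise promising papers"
    ANNOTATE_VIA_REVIEWS = "no triage; annotate via reviews (M = N)"


def choose_triage_strategy(publication_count: int) -> TriageStrategy:
    """Pick a triage approach from the number of publications for a gene."""
    if publication_count < 0:
        raise ValueError("publication_count must be non-negative")
    if publication_count < 10:
        # Low number of publications: M ends up very similar to N.
        return TriageStrategy.REMOVE_FALSE_POSITIVES
    if publication_count < 100:
        # Modest number: the gap between M and N is more marked.
        return TriageStrategy.REMOVE_AND_PRIORITISE
    # Large number (assumed to include exactly 100): triage is skipped.
    return TriageStrategy.ANNOTATE_VIA_REVIEWS
```

For example, `choose_triage_strategy(45)` returns `TriageStrategy.REMOVE_AND_PRIORITISE`. In practice the choice also depends on the status of the existing GO data, which this sketch does not model.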

N - Number of papers read

As the name says, but this number may be smaller than the actual number read (see L, item 3 above). FlyBase has been routinely curating GO data since 2000; papers associated with the gene that FlyBase has curated since 2000 are included in this count even though they may not have been read as part of this curation initiative.

O - Number of papers producing GO annotations

The total number of newly and previously curated papers producing GO annotations.