Saccharomyces cerevisiae (Retired)

From GO Wiki
Latest revision as of 11:24, 12 April 2019




L - Total Publications

For this metric, we count the total number of articles of any kind, including both research articles and reviews, associated with a gene in SGD.

Two scripts run to identify relevant papers. Genes are then associated with papers in two ways: 1) via a script that automatically associates the papers with the genes mentioned in the title or abstract, and 2) via manual association as part of our first-pass paper curation.

1. The PubMed script searches all of PubMed, weekly, for citations containing 'cerevisiae' in the title or abstract plus an S. cerevisiae gene name, alias, or systematic name in the title or abstract. It compares the references returned by the search with those currently in the database and adds new references to the database, unless the corresponding PubMed ID has been previously marked as inappropriate for SGD.

This script also makes automated links between references and the genes mentioned in each. For newly added papers, the genes mentioned within are automatically linked to the paper in SGD. It also compares the genes mentioned in each reference with those currently linked in the database and will add new links between a gene and a reference, unless the link has been previously marked as inappropriate. New links between genes and references are assigned the 'Not yet curated' topic.

In subsequent curation, curators may add links that were missed and will remove any inappropriate links that were made automatically.

2. Because the PubMed script misses some references, we also run a broader weekly search ('yeast' OR 'cerevisiae') of the references added to PubMed each week. This script runs automatically, immediately after the PubMed script runs on Friday nights. The script compares its list of hits to the references already in the SGD database, removes any that are already present, and loads the remainder automatically into a temporary location. Because this broader search contains a high proportion of irrelevant hits, the references must be reviewed to select the relevant ones. Links between these papers and any relevant genes are made manually by curators during the curation process.

Note that this second script was first run in 2005. Prior to that time, our coverage of references relevant to S. cerevisiae is less complete.
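
The load logic of the two scripts can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual SGD code: the `ReferenceStore` class, its field names, and both function names are hypothetical, the database is modelled as in-memory sets, and the PubMed searches themselves are assumed to have already run and produced the `hits` inputs.

```python
from dataclasses import dataclass, field


@dataclass
class ReferenceStore:
    """Hypothetical in-memory stand-in for the SGD reference tables."""
    pmids: set = field(default_factory=set)            # references already in the database
    rejected_pmids: set = field(default_factory=set)   # PMIDs marked inappropriate for SGD
    links: set = field(default_factory=set)            # (gene, pmid) links in the database
    rejected_links: set = field(default_factory=set)   # links marked inappropriate
    topics: dict = field(default_factory=dict)         # (gene, pmid) -> topic
    staging: set = field(default_factory=set)          # temporary location awaiting review


def run_gene_search_load(store, hits):
    """Script 1 sketch: `hits` maps each PMID to the genes named in its title/abstract."""
    for pmid, genes in hits.items():
        if pmid in store.rejected_pmids:
            continue  # previously marked inappropriate, never re-added
        store.pmids.add(pmid)  # no-op if the reference is already present
        for gene in genes:
            link = (gene, pmid)
            if link in store.links or link in store.rejected_links:
                continue  # already linked, or link previously rejected
            store.links.add(link)
            store.topics[link] = "Not yet curated"


def run_broad_search_load(store, hits):
    """Script 2 sketch: `hits` is the PMID set from the 'yeast' OR 'cerevisiae' search."""
    new_pmids = set(hits) - store.pmids  # drop references already in the database
    store.staging |= new_pmids           # staged only; curators review and link by hand
    return new_pmids


# Made-up PMIDs and links, purely for illustration.
store = ReferenceStore(
    pmids={100}, rejected_pmids={200}, rejected_links={("ACT1", 100)}
)
run_gene_search_load(store, {100: {"ACT1", "CDC28"}, 200: {"ACT1"}, 300: {"URA3"}})
staged = run_broad_search_load(store, {100, 300, 400})
```

In this toy run the first function adds reference 300 and links CDC28 and URA3, while skipping the rejected reference 200 and the rejected ACT1 link. The second function stages only reference 400, because it runs after the first and 100 and 300 are already present. It creates no gene links, matching the manual linking described for the broader search.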

In addition to references added by scripts, curators will occasionally come across a paper that should be linked to a gene but is not, or a book chapter that is relevant to that gene. These references are added directly by curators.

M - Triaged papers

SGD does not do any triage to target papers for GO, so this column will be blank for SGD.

N - Number of papers read

This is the number of papers read or checked for possible GO annotations by the curator who was responsible for the Reference Genome curation of that gene. This number is manually tracked by the curator during the curation process for reference genome genes. We don't track this number as a part of standard curation, just when genes are designated for the Reference Genomes project.

O - Number of papers producing GO annotations

This is the number of papers from which we actually made GO annotations.
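
As a rough illustration of how columns L and O relate, both can be viewed as counts of distinct papers per gene, taken over different sets of gene-paper pairs. The sketch below assumes the pairs are available as simple `(gene, pmid)` tuples; the function name and the sample data are hypothetical and do not reflect the SGD schema.

```python
from collections import defaultdict


def count_papers_per_gene(pairs):
    """Count distinct PMIDs per gene from an iterable of (gene, pmid) pairs."""
    papers = defaultdict(set)
    for gene, pmid in pairs:
        papers[gene].add(pmid)  # a set ignores repeated pairs
    return {gene: len(pmids) for gene, pmids in papers.items()}


# Made-up data: every gene-reference link (column L) versus only the
# references that produced at least one GO annotation (column O).
all_links = [("ACT1", 1), ("ACT1", 2), ("ACT1", 2), ("CDC28", 2)]
go_annotation_refs = [("ACT1", 2)]

column_l = count_papers_per_gene(all_links)
column_o = count_papers_per_gene(go_annotation_refs)
```

Column N does not fit this pattern, since it is tracked manually by the curator rather than derived from stored links.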