L - Total Publications

Total number of papers returned by a PubMed search using the gene/protein name and synonyms AND (Gallus OR chick* NOT pea). I use GeneRIF and UniProtKB entries to get the list of protein names and synonyms, and may refine the search as I read the articles and find other names that have been used. This number does contain some false positives, but I still think it is a useful starting point, and I have used it to estimate the percentage of chicken literature that contains GO annotation data.
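As a rough illustration of how such a search string can be assembled, here is a minimal sketch. The function name, the quoting of each name as a phrase, and the example names are assumptions made for illustration only; the species filter is the one quoted above.

```python
def build_pubmed_query(names):
    """Combine a gene product's name and synonyms with the chicken filter.

    `names` is the list of names and synonyms collected for one gene
    product (for example from GeneRIF and UniProtKB entries).
    """
    # Quote each name so multi-word synonyms are searched as phrases
    # (an assumption; the original search may not quote names).
    quoted = [f'"{n.strip()}"' for n in names if n.strip()]
    if not quoted:
        raise ValueError("at least one gene/protein name is required")
    name_clause = " OR ".join(quoted)
    # Species filter exactly as given in the description of column L.
    return f"({name_clause}) AND (Gallus OR chick* NOT pea)"


# Hypothetical placeholder names, not a real gene product.
query = build_pubmed_query(["ABC1", "abc one protein"])
print(query)
```

New synonyms found while reading can simply be appended to the list and the query rebuilt, which matches the way the search is refined over time.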

M - Triaged papers

For triage, I take all the papers found (L) and read through each abstract (and sometimes the methods and results) to determine whether the paper has "GO annotatable" information. This process culls many papers that only report sequence, RNA expression patterns, etc.

N - Number of papers read

Ideally this number should be the same as the number of papers triaged (M), but sometimes I can't get access to papers. We have developed a Journal Database at AgBase where I record the articles I read and mark them as "read, contains GO", "read, no GO" or "not accessible". This allows MSU Libraries to track journals I would use but currently can't access, and lets us revisit articles as our access to journals improves.

Sometimes as I read these papers I find references to other papers that should be included in the literature for the gene product. These are added and the numbers (L, M, N) recalculated.

O - Number of papers producing GO annotations

As the name implies.
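To show how the four columns relate, here is a small sketch that derives M, N and O from the status labels recorded for one gene product. The function names, the idea of keeping one status per triaged paper, and the O/L percentage formula are assumptions for illustration; they are not a description of how the Journal Database actually stores or reports its data.

```python
from collections import Counter

# Status labels as recorded in the Journal Database.
READ_GO = "read, contains GO"
READ_NO_GO = "read, no GO"
NOT_ACCESSIBLE = "not accessible"


def tally(total_found, triaged_statuses):
    """Return the L, M, N, O counts for one gene product.

    total_found: number of PubMed hits (L).
    triaged_statuses: one status label per paper kept after triage
    (assumed layout, so M is simply the length of this list).
    """
    counts = Counter(triaged_statuses)
    return {
        "L": total_found,
        "M": len(triaged_statuses),
        # N counts only papers that could actually be read.
        "N": counts[READ_GO] + counts[READ_NO_GO],
        "O": counts[READ_GO],
    }


def percent_with_go(counts):
    """Share of all papers found (L) that produced GO annotations (O)."""
    return 100.0 * counts["O"] / counts["L"] if counts["L"] else 0.0


# Made-up example: 40 hits, 4 papers kept after triage.
example = tally(40, [READ_GO, READ_GO, READ_NO_GO, NOT_ACCESSIBLE])
print(example, percent_with_go(example))
```

When a newly discovered reference is added, it raises L and M together, and rerunning the tally gives the recalculated numbers.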