QCQA call 2018-03-20



GOC NYC May 2018

What do we want to report on? Review Cambridge meeting minutes: https://docs.google.com/document/d/1Y9_Mvqes3op36TPHgfaS7K5FHnGZLgUApghIFYyUKR8/edit

  • HTP report (Helen)
  • HTP/propagation: do we have guidelines yet?
  • Annotation reviews: update; perspective
  • How GO annotation errors can affect analysis: https://github.com/geneontology/go-annotation/issues/1869
  • Berkeley should provide a slim report for each organism so we can see whether there are major (unexpected) changes
  • Priorities for annotation reviews could be driven by the slim analysis: review when the numbers seem inconsistent between (IBA + IEA InterPro) and EXP, especially when the numbers are high
  • Major causes of errors/inconsistencies/omissions (see below)
  • Derivative annotations must be flagged and filtered (done automatically in P2GO; this needs to be done at GOC as well)
  • Improve guidelines: again, prioritize which guidelines to update; get each group to review the new guidelines
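
The slim-driven prioritisation mentioned in the list above could be sketched roughly as follows. This is only an illustration, not an existing GOC tool: the row layout (gene, slim term, evidence code, source), the function names, and the `min_count` and `max_ratio` thresholds are all assumptions made for the example.

```python
from collections import defaultdict

# Experimental evidence codes treated as "EXP" for this sketch.
EXPERIMENTAL = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}


def is_predicted(evidence, source):
    """IBA, or IEA coming from InterPro (source labels are assumed)."""
    return evidence == "IBA" or (evidence == "IEA" and source == "InterPro")


def flag_inconsistent_terms(rows, min_count=50, max_ratio=5.0):
    """Return (term, n_predicted, n_experimental) for slim terms whose
    gene counts differ by more than max_ratio and are high enough to matter.

    rows: iterable of (gene, slim_term, evidence, source) tuples.
    min_count and max_ratio are hypothetical thresholds.
    """
    predicted = defaultdict(set)
    experimental = defaultdict(set)
    for gene, term, evidence, source in rows:
        if is_predicted(evidence, source):
            predicted[term].add(gene)
        elif evidence in EXPERIMENTAL:
            experimental[term].add(gene)

    flagged = []
    for term in sorted(set(predicted) | set(experimental)):
        n_pred = len(predicted[term])
        n_exp = len(experimental[term])
        high, low = max(n_pred, n_exp), min(n_pred, n_exp)
        # Only flag when the numbers are high and clearly out of proportion.
        if high >= min_count and high > max_ratio * max(low, 1):
            flagged.append((term, n_pred, n_exp))
    return flagged
```

Counting distinct genes rather than annotation rows keeps redundant annotations from inflating either side of the comparison.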

General overview

How GO annotation errors/inconsistencies/omissions can affect analysis


  • Encourage people to think about annotation depth (e.g. cytoplasmic translation)
  • Slim numbers *should* now be relatively stable (ontology-wise), and only affected by annotation increases and decreases (I might be able to illustrate how errors affect analyses...)

How errors/inconsistencies/omissions are identified

  • Rules
      i) Blocking of very general high-level GO terms to increase specificity
      ii) Taxon constraints (broken?)
      iii) Other rules? (Matrix, see below)
  • Annotation reviews from ontology development work
      Specificity in the ontology changed to annotation + extension
      General curator error
  • Matrix and slimming (assessing annotation at the organism level)
      General curation errors
      Ontology fixes (true path violations and missing parent)
      Incorrect IEA mappings (SPKW and InterPro)
  • Other methods
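
As an illustration of rule i) above, a check that reports annotations made directly to very general terms might look like the sketch below. The blocklist contents and the `check_blocked_terms` helper are hypothetical; the actual list of blocked terms would need to be agreed by the group.

```python
# Hypothetical blocklist of very general terms (IDs mapped to labels).
BLOCKED_TERMS = {
    "GO:0005488": "binding",
    "GO:0009987": "cellular process",
    "GO:0008152": "metabolic process",
}


def check_blocked_terms(annotations, blocked=BLOCKED_TERMS):
    """Return one message per annotation made directly to a blocked term.

    annotations: iterable of (gene, term, evidence) tuples.
    """
    violations = []
    for gene, term, evidence in annotations:
        if term in blocked:
            violations.append(
                f"{gene}: {term} ({blocked[term]}) is too general "
                f"for a direct {evidence} annotation"
            )
    return violations
```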

Summarize major causes of errors/ inconsistencies/ omissions

  • a) experimental readout
  • b) annotating a cargo or target to a process
  • c) ignoring author intent
  • d) New biological insights
      We need a list of blacklisted papers and retractions
  • e) A phenotype not specific for a process


Why we should prioritise error fixing

  • Should be considered critical like fixing bugs in software
  • Small errors can uncover large issues (fix many problems simultaneously across multiple species)
  • Prevents future propagation via PAINT, Ensembl, and InterPro mappings

What next?

  • Removing redundancy? This is related to QC: when the redundancy is removed, it's easier to spot errors
  • Really need a way to identify unsupported ISS annotations (there is no alert when the primary annotation is removed)
  • Block terms not suitable for annotation transfer

Examples: transporter specificity (changes frequently between species), drug-related terms, "response to" terms. Add examples....

  • I would like to propose that we phase out TAS and NAS (not to be used for future annotations). There should always be an experiment (even an ISO is better than TAS/NAS)
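
For the unsupported ISS problem raised in the list above, one possible check is sketched here. It assumes each annotation is a dict with `gene`, `term`, `evidence`, and `with` keys, that the `with` value lists source gene products separated by `|`, and that "supported" means the source gene still has an experimental annotation to exactly the same term. A real check would probably also need to accept more specific terms on the source gene; that is left out to keep the sketch short.

```python
# Experimental evidence codes that can support an ISS annotation (sketch).
EXPERIMENTAL = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}


def find_unsupported_iss(annotations):
    """Return ISS annotations whose 'with' gene no longer has an
    experimental annotation to the same term.

    annotations: list of dicts with keys gene, term, evidence, with.
    The dict layout is an assumption made for this example.
    """
    supported = {
        (a["gene"], a["term"])
        for a in annotations
        if a["evidence"] in EXPERIMENTAL
    }
    unsupported = []
    for a in annotations:
        if a["evidence"] != "ISS":
            continue
        sources = [w for w in a.get("with", "").split("|") if w]
        # An empty 'with' field also counts as unsupported.
        if not any((src, a["term"]) in supported for src in sources):
            unsupported.append(a)
    return unsupported
```

Running something like this after each release would provide the alert that is currently missing when a primary annotation is removed.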

Rules for protein complexes in 'with' column


Progress on annotation reviews


Documentation for annotation reviews

Requesting an annotation review tool?

Review existing pages under Quality Control


Meeting notes

  • Ask PIs/managers: every contributing group should have one member attending the annotation call
  • Contributing groups that do not participate must give access to their annotations OR we reserve the right to filter out annotations that do not match our criteria
  • Many RCA annotations come from AgBase and Gramene, who haven't responded to our requests