QCQA call 2018-03-20
Agenda
GOC NYC May 2018
What do we want to report on? Review Cambridge meeting minutes: https://docs.google.com/document/d/1Y9_Mvqes3op36TPHgfaS7K5FHnGZLgUApghIFYyUKR8/edit
- Annotation reviews: update; perspective
- How GO annotation errors can affect analysis: https://github.com/geneontology/go-annotation/issues/1869
General overview
How GO annotation errors/inconsistencies/omissions can affect analysis
https://github.com/geneontology/go-annotation/issues/1869
- Encourage people to think about annotation depth (e.g. cytoplasmic translation)
- Slim numbers *should* now be relatively stable (ontology-wise), and only affected by annotation increases and decreases (I might be able to illustrate how errors affect analyses...)
How errors/inconsistencies/omissions are identified
- Rules
- i) Blocking of very general high-level GO terms to increase specificity
- ii) Taxon constraints (broken?)
- iii) Other rules? (Matrix see below)
- Annotation reviews from ontology development work
- examples
- identifies:
- specificity moved from the ontology to annotation + extension
- general curator error
- Matrix and slimming (assessing annotation at the organism level)
- https://www.slideshare.net/ValerieWood/copy-of-biocuration-2017
- https://drive.google.com/drive/folders/0B0YtE_BqXTzQbkdFZmwzV1M2TjA
- identifies:
- general curation errors
- ontology fixes (true path violations and missing parent)
- Incorrect IEA mappings (SPKW and InterPro)
- other methods
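As an illustration of rule i) above, a check that blocks very general terms could be sketched roughly as follows. This is a minimal sketch, not the GO rule implementation: the `Annotation` record, the `find_blocked` helper and the gene names are hypothetical, and the block list (GO:0008152 metabolic process, GO:0009987 cellular process) is only an assumed example of terms that might be considered too general for direct annotation.

```python
from typing import Iterable, List, NamedTuple, Set


class Annotation(NamedTuple):
    """Hypothetical, simplified annotation record (not the full GAF format)."""
    gene: str
    term: str
    evidence: str


# Assumed block list for illustration only; the real list would be agreed by curators.
BLOCKED_TERMS: Set[str] = {"GO:0008152", "GO:0009987"}


def find_blocked(annotations: Iterable[Annotation],
                 blocked: Set[str] = BLOCKED_TERMS) -> List[Annotation]:
    """Return annotations made directly to a blocked high-level term."""
    return [a for a in annotations if a.term in blocked]


if __name__ == "__main__":
    example = [
        Annotation("geneA", "GO:0008152", "IMP"),  # too general, should be flagged
        Annotation("geneB", "GO:0002181", "IDA"),  # specific term, passes
    ]
    for hit in find_blocked(example):
        print(f"{hit.gene}: {hit.term} is blocked, please annotate to a more specific term")
```

A report like this only flags candidates; deciding which more specific term applies still needs a curator.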
Summarize major causes of errors/inconsistencies/omissions
- a) experimental readout
- b) annotating a cargo or target to a process
- c) ignoring author intent
- d) New biological insights
- https://github.com/geneontology/go-annotation/issues/1806
- we need a list of blacklisted papers and retractions
- e) A phenotype not specific for a process
etc
Why we should prioritise error fixing
- Should be considered critical like fixing bugs in software
- Small errors can uncover large issues (fix many problems simultaneously across multiple species)
- Prevents future propagation via PAINT, Ensembl and InterPro mappings
What next?
- Removing redundancy? This is related to QC: when the redundancy is removed, it's easier to spot errors
- https://github.com/geneontology/go-site/issues/436
- https://github.com/geneontology/go-annotation/issues/1544
- https://github.com/geneontology/go-annotation/issues/1674
- https://github.com/geneontology/amigo/issues/43
- https://github.com/geneontology/amigo/issues/295
- https://github.com/geneontology/amigo/issues/440
- We really need a way to identify unsupported ISS annotations (there is no alert when the primary annotation is removed)
- Block terms not suitable for annotation transfer
Examples: transporter specificity (changes frequently between species), drug-related terms, "response to" terms (add examples....)
- I would like to propose that we phase out TAS and NAS (i.e. not use them for future annotations). An annotation should always be supported by an experiment (or by an ISO, which is better than TAS/NAS)
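The unsupported-ISS item above could be prototyped along these lines. This is a rough sketch under stated assumptions: annotations are plain dicts with hypothetical keys `gene`, `term`, `evidence` and `with` (the gene the ISS annotation was transferred from), the set of experimental codes is a simplified one, and support is checked by exact term match only. A real check would also accept a source annotation to a more specific (descendant) term.

```python
from typing import Dict, Iterable, List

# Simplified set of experimental evidence codes (assumption for this sketch).
EXPERIMENTAL = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}


def unsupported_iss(annotations: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return ISS annotations whose source gene no longer has an
    experimental annotation to the same term."""
    annotations = list(annotations)
    # (gene, term) pairs that are backed by an experimental annotation.
    supported = {
        (a["gene"], a["term"])
        for a in annotations
        if a["evidence"] in EXPERIMENTAL
    }
    return [
        a for a in annotations
        if a["evidence"] == "ISS"
        and (a.get("with", ""), a["term"]) not in supported
    ]


if __name__ == "__main__":
    example = [
        {"gene": "src1", "term": "GO:0002181", "evidence": "IDA", "with": ""},
        {"gene": "tgt1", "term": "GO:0002181", "evidence": "ISS", "with": "src1"},
        # The primary annotation for src2 has been removed, so this ISS is orphaned.
        {"gene": "tgt2", "term": "GO:0002181", "evidence": "ISS", "with": "src2"},
    ]
    for a in unsupported_iss(example):
        print(f"{a['gene']}: ISS from {a['with']} to {a['term']} has no primary annotation")
```

Running something like this over each release would give the alert that is currently missing when a primary annotation is removed.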
Rules for protein complexes in 'with' column
https://github.com/geneontology/go-ontology/issues/15421
Progress on annotation reviews
Documentation for annotation reviews
Requesting an annotation review tool?
Review existing pages under Quality Control
http://wiki.geneontology.org/index.php/Category:Quality_Control