QCQA call 2018-03-20

Latest revision as of 09:54, 15 April 2019

Agenda

GOC NYC May 2018

What do we want to report on?

Review Cambridge meeting minutes: https://docs.google.com/document/d/1Y9_Mvqes3op36TPHgfaS7K5FHnGZLgUApghIFYyUKR8/edit

  • HTP report (Helen)
  • HTP/propagation: do we have guidelines yet?
  • Annotation reviews: update; perspective
  • How GO annotation errors can affect analysis: https://github.com/geneontology/go-annotation/issues/1869
  • Berkeley should provide a report for each organism for a slim to see if there are major (unexpected) changes
  • Priorities for annotation reviews could be driven by the slim analysis: review when the numbers from IBA + IEA (InterPro) seem inconsistent with those from EXP, especially when the numbers are high
  • Major causes of errors/inconsistencies/omissions (see below)
  • Derivative annotations must be flagged and filtered (done automatically in P2GO; this needs to be done at GOC as well)
  • Improve guidelines: again, prioritize which guidelines to update; get each group to review the new guidelines
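
The slim-driven prioritisation mentioned above could be sketched roughly as follows. This is only an illustration: the input shape (per-slim-term gene counts split into predicted, i.e. IBA + IEA, and experimental), the function name `flag_slim_terms`, and both thresholds are assumptions, not an agreed GOC procedure.

```python
def flag_slim_terms(counts, min_predicted=50, max_ratio=5.0):
    """Return slim terms whose predicted counts look out of line with EXP.

    counts: dict mapping a slim term label to (predicted, experimental),
            where predicted = genes annotated via IBA + IEA and
            experimental = genes annotated via EXP-type evidence.
    min_predicted and max_ratio are hypothetical thresholds.
    """
    flagged = []
    for term, (predicted, experimental) in counts.items():
        if predicted < min_predicted:
            continue  # only look at terms where the numbers are high
        # Avoid division by zero when there is no experimental support.
        ratio = predicted / max(experimental, 1)
        if ratio > max_ratio:
            flagged.append((term, predicted, experimental, ratio))
    # Largest discrepancy first, so reviewers start with the worst cases.
    return sorted(flagged, key=lambda row: row[3], reverse=True)


if __name__ == "__main__":
    example = {"term A": (200, 10), "term B": (60, 40), "term C": (30, 0)}
    for row in flag_slim_terms(example):
        print(row)
```

The check only looks in one direction (many predicted annotations, few experimental ones); a real report would probably also want the reverse case and per-organism baselines.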

General overview

How GO annotation errors/inconsistencies/omissions can affect analysis

https://github.com/geneontology/go-annotation/issues/1869

  • Encourage people to think about annotation depth (e.g. cytoplasmic translation)
  • Slim numbers *should* now be relatively stable (ontology-wise), and only affected by annotation increases and decreases (I might be able to illustrate how errors affect analyses...)

How errors/inconsistencies/omissions are identified

  • Rules
    i) Blocking of very general high-level GO terms to increase specificity
       https://github.com/geneontology/go-annotation/issues/1659
    ii) Taxon constraints (broken?)
    iii) Other rules? (see Matrix below)
  • Annotation reviews from ontology development work
    examples
    identifies:
      Specificity in the ontology changed to annotation + extension
      general curator error
  • Matrix and slimming (assessing annotation at the organism level)
    https://www.slideshare.net/ValerieWood/copy-of-biocuration-2017
    https://drive.google.com/drive/folders/0B0YtE_BqXTzQbkdFZmwzV1M2TjA
    identifies:
      general curation errors
      ontology fixes (true path violations and missing parent)
      Incorrect IEA mappings (SPKW and InterPro)
  • other methods
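
As an illustration of rule i), a check for annotations made directly to blocked high-level terms might look like the sketch below. The blocked set here contains only the three GO root terms; the actual list discussed in the linked issue is not reproduced, and the tuple layout and function name are assumptions made for the example. Annotations with the ND evidence code are skipped, since root annotations with ND are the usual way of recording that nothing is known.

```python
# The three GO root terms; any further blocked terms would be added here.
ROOT_TERMS = {
    "GO:0008150",  # biological_process
    "GO:0003674",  # molecular_function
    "GO:0005575",  # cellular_component
}


def blocked_annotations(annotations, blocked=ROOT_TERMS):
    """Return annotations made directly to a blocked term.

    annotations: iterable of (gene_id, go_id, evidence_code) tuples
                 (a simplified, hypothetical record layout).
    """
    return [
        (gene, go_id, evidence)
        for gene, go_id, evidence in annotations
        if go_id in blocked and evidence != "ND"
    ]


if __name__ == "__main__":
    sample = [
        ("gene1", "GO:0008150", "IMP"),  # too general: flagged
        ("gene2", "GO:0008150", "ND"),   # root + ND: allowed
        ("gene3", "GO:0002181", "IDA"),  # specific term: allowed
    ]
    print(blocked_annotations(sample))
```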

Summarize major causes of errors/inconsistencies/omissions

  • a) experimental readout
  • b) annotating a cargo or target to a process
  • c) ignoring author intent
  • d) New biological insights
    https://github.com/geneontology/go-annotation/issues/1806
    we need a list of blacklisted papers and retractions
  • e) A phenotype not specific for a process

etc

Why we should prioritise error fixing

  • Should be considered critical like fixing bugs in software
  • Small errors can uncover large issues (fix many problems simultaneously across multiple species)
  • Prevents future propagation via PAINT, Ensembl, and InterPro mappings


What next?

  • Removing redundancy? This is related to QC: when the redundancy is removed, it's easier to spot errors
    https://github.com/geneontology/go-site/issues/436
    https://github.com/geneontology/go-annotation/issues/1544
    https://github.com/geneontology/go-annotation/issues/1674
    https://github.com/geneontology/amigo/issues/43
    https://github.com/geneontology/amigo/issues/295
    https://github.com/geneontology/amigo/issues/440
  • We really need a way to identify unsupported ISS annotations (there is no alert when the primary annotation is removed)
  • Block terms not suitable for annotation transfer

    examples:
      transporter specificity (changes frequently between species)
      drug-related terms
      "response to" terms
      add examples....

  • I would like to propose that we phase out TAS and NAS (i.e. not use them for future annotations). There should always be an experiment (or an ISO, which is better than TAS/NAS)
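
For the unsupported ISS point in the list above, one possible check is sketched here. It assumes simplified annotation records (gene, GO term, evidence code, with/from value) rather than a full GAF parser, and it treats an ISS annotation as supported only if a gene product in its with/from field carries an experimental annotation to exactly the same term. Both the record layout and that matching rule are assumptions; a real check would probably also accept more specific terms on the source gene.

```python
import re
from typing import NamedTuple

# Experimental evidence codes treated as valid support for an ISS transfer.
EXPERIMENTAL = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}


class Annotation(NamedTuple):
    gene: str            # e.g. "DB:gene1" (illustrative identifier)
    go_id: str
    evidence: str
    with_from: str = ""  # IDs separated by "|" or ",", as in GAF column 8


def unsupported_iss(annotations):
    """Return ISS annotations whose with/from source lacks experimental support."""
    supported = {
        (a.gene, a.go_id) for a in annotations if a.evidence in EXPERIMENTAL
    }
    flagged = []
    for a in annotations:
        if a.evidence != "ISS":
            continue
        sources = [s for s in re.split(r"[|,]", a.with_from) if s]
        if not any((s, a.go_id) in supported for s in sources):
            flagged.append(a)
    return flagged


if __name__ == "__main__":
    data = [
        Annotation("DB:src1", "GO:0002181", "IDA"),
        Annotation("DB:tgt1", "GO:0002181", "ISS", "DB:src1"),  # supported
        Annotation("DB:tgt2", "GO:0002181", "ISS", "DB:src2"),  # source has nothing
    ]
    for a in unsupported_iss(data):
        print(a)
```

Run periodically over the combined annotation set, a report like this would provide the missing alert when a primary annotation is removed.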

Rules for protein complexes in 'with' column

https://github.com/geneontology/go-ontology/issues/15421

Progress on annotation reviews

https://github.com/geneontology/go-annotation/issues?q=is%3Aissue+is%3Aopen+label%3A%22annotation+review%22


Documentation for annotation reviews

Requesting an annotation review tool?

Review existing pages under Quality Control

http://wiki.geneontology.org/index.php/Category:Quality_Control


Meeting notes

  • Ask PIs/managers: every contributing group should have one member attending annotation call
  • Contributing groups that do not participate must give access to their annotations OR we reserve the right to filter out annotations that do not match our criteria
  • Many RCA annotations come from AgBase and Gramene, who haven't responded to our requests