Annotation Conf. Call 2019-07-09

Latest revision as of 09:53, 9 July 2019

Agenda

GitHub submission for InterPro2GO mappings

Protein oligomerization terms in GO

  • Follow up to 2019-06-11 annotation conference call to review recommendations and actions
  • Proposal and Recommendations
  • Biological Process Annotations
  • Molecular Function Annotations
    • Protein Dimerization Activity (includes regulation of activity)
    • For MF annotations using the IPI evidence code, we are proposing term merges, so curators hopefully will not have to do manual revision here (programmatic upgrades can be done locally?)
      • For example, 'protein homodimerization activity' should be handled by merging into 'identical protein binding'
    • However, there are some evidence code issues/questions that we should review
      • For IC, NAS, and TAS evidence codes, recommend upgrading to an experimentally supported annotation or removing
    • For IDA annotations, we'd like to convert as many of those as possible into IPI, even if the With field will contain the same ID as the annotated entity. This allows for consistent representation of binding annotations in GO.
      • ~1200 IDA annotations to 'protein homodimerization activity'. These should be automatically converted (locally) to 'identical protein binding' using IPI and the annotated entity in the With field.
      • ~350 IDA annotations to 'protein heterodimerization activity'. Are these all cases where the interactor could not be assigned an ID? Or are these cases where curators didn't feel the assay fit with IPI?
    • There are also IGI, IMP, and RCA annotations - can any of these be converted to IPI (if there isn't a corresponding IPI already)?
      • IGI (4) to 'protein heterodimerization activity' - are these supporting evidence in addition to IPI? Or something else?
      • IMP (133) to one of the MF terms - are these supporting evidence in addition to IPI? Or something else?
      • RCA (21) to one of the MF terms - most of these are older annotations from Gramene, a handful from AgBase
  • Gitter question from Petra
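
The proposed local conversion of IDA annotations for 'protein homodimerization activity' could be sketched roughly as follows. This is only an illustration, not an agreed procedure: it assumes annotations are held as tab-separated GAF 2.x lines, that the With field should be filled only when it is empty, and that qualifiers are left untouched. The function name convert_line is hypothetical.

```python
# GO IDs for the two terms named in the agenda.
HOMODIMERIZATION = "GO:0042803"   # protein homodimerization activity
IDENTICAL_BINDING = "GO:0042802"  # identical protein binding

# Zero-based positions of the GAF 2.x columns used here (tab-separated).
DB, DB_OBJECT_ID, GO_ID, EVIDENCE, WITH_FROM = 0, 1, 4, 6, 7


def convert_line(line: str) -> str:
    """Rewrite one GAF line if it is an IDA homodimerization annotation."""
    if line.startswith("!"):          # header/comment lines pass through
        return line
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 8:                 # not a usable annotation row
        return line
    if cols[GO_ID] == HOMODIMERIZATION and cols[EVIDENCE] == "IDA":
        cols[GO_ID] = IDENTICAL_BINDING
        cols[EVIDENCE] = "IPI"
        # Assumption: an empty With field is filled with the annotated
        # entity itself, written as DB:DB_Object_ID.
        if not cols[WITH_FROM]:
            cols[WITH_FROM] = f"{cols[DB]}:{cols[DB_OBJECT_ID]}"
    return "\t".join(cols) + ("\n" if line.endswith("\n") else "")
```

Annotations with other evidence codes (IGI, IMP, RCA) and those to 'protein heterodimerization activity' pass through unchanged, since the agenda still lists them as open questions.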

QC Reports

Pipeline Reports

Matrix QC checks

  • How is this work going for people?
  • Questions, comments?

GO-CAMs and Annotation Extensions

  • After USC hackathon in June, we formed a small working group to fully articulate the GO-CAM model specifications
  • This will allow uniform QC across workbenches, projects (e.g. Reactome and MOD imports)
  • Will also allow us to move forward with harmonizing AE relations used in GO-CAM and conventional annotation

Minutes

  • Present: Birgit, Bob, Colin, David, Dmitry, Dustin, Edith, Giulia, Harold, Judy, Karen, Kimberly, Laurent-Philippe, Li, Michele, Midori, Pascale, Patrick N, Petra, Ruth, Sabrina, Stacia, Stan, Seth, Suzi A, Tanya, Tom, Val

GitHub submission for InterPro2GO mappings

  • For now, add the InterPro curators listed in the ticket (https://github.com/geneontology/go-annotation/issues/2472) to any InterPro2GO mapping issues reported in the go-annotation tracker
  • Also add InterPro2GO mapping label
  • We'll investigate if there are any ways to automatically assign tickets

Protein oligomerization terms in GO

  • Work on revising BP annotations seems to be going well, although some groups still have a lot of review to do
  • For MF annotations, less manual work is likely required
  • MF ticket will be created when spreadsheet recommendations are finished; will highlight those annotations that should be manually reviewed
  • RCA annotations - will double-check documentation on whether they should be filtered after a certain amount of time, e.g. a year
    • Since they are supposed to be manually reviewed, there may not be a limit, but it's not clear that that's how the evidence code is being used by all groups
  • Kimberly will contact Alex about annotation updates in Protein2GO
  • Deadline to complete this work - end of summer (although spreadsheets will always be there in case people aren't able to finish annotation review before the obsoletions go into effect)

QC Reports

Pipeline Reports

  • We reviewed the GOC pipeline reports
  • The table view (http://snapshot.geneontology.org/reports/gorule-report.html) covers the subset of QC rules that are implemented
  • Full group reports (html likely most useful) should also be reviewed
  • Additional work is being done to incorporate more rules, e.g. taxon constraints
  • Can work on improving documentation on how to use the reports

Matrix QC checks

  • Val gave a summary of how she's currently using the matrix tool for QC checks
  • She's focusing on intersections with low numbers as these are often sources of annotation error
  • With regard to NOT annotations, the matrix tool ignores them, but it looks like some annotations that should carry a NOT qualifier and don't are still coming in via PAINT (there is a ticket for this)
  • The matrix reports are available as a shared annotation check (http://snapshot.geneontology.org/reports/shared-annotation-check.html), but the errors are not yet sorted by contributor, so it's hard for groups to find annotations that need to be fixed
  • 130 matrix-derived rules with this latest batch (up from 107)
  • In the future, plan to extend the rules to component-process intersections
  • Possibly also figure out a way to use the matrix to find genes annotated to both a process and regulation of that process
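
The idea behind the low-count intersection check described above can be illustrated with a small sketch. It is not the matrix tool itself: it assumes annotations arrive as (gene, term, qualifier) tuples, counts only direct annotations (no ontology closure or term grouping), and uses a hypothetical function name and threshold.

```python
from collections import defaultdict
from itertools import combinations


def low_count_intersections(annotations, max_shared=2):
    """Return {(term_a, term_b): shared_genes} for term pairs that share
    between 1 and max_shared genes."""
    genes_by_term = defaultdict(set)
    for gene, term, qualifier in annotations:
        # NOT annotations are ignored, as in the matrix tool.
        if "NOT" in qualifier.split("|"):
            continue
        genes_by_term[term].add(gene)

    flagged = {}
    for term_a, term_b in combinations(sorted(genes_by_term), 2):
        shared = genes_by_term[term_a] & genes_by_term[term_b]
        # Rare co-annotations are the candidates for manual review.
        if 0 < len(shared) <= max_shared:
            flagged[(term_a, term_b)] = shared
    return flagged
```

Pairs that share many genes are treated as expected biology and skipped; only the rare co-annotations are returned for a curator to inspect.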