Annotation Conf. Call 2019-07-09: Difference between revisions

== github submission for InterPro2GO mappings ==
* [https://github.com/geneontology/go-annotation/issues/2472 Tagging InterPro in issues]
* Lorna has requested that we tag InterPro curators so they can be alerted to issues relevant to them (the relevant GitHub handles can be found in the group-contacts.csv file: https://github.com/geneontology/go-site/blob/master/metadata/group-contacts.csv)
* Please also add the InterPro mapping label (always good practice to add labels)
== Protein oligomerization terms in GO ==
* Follow up to 2019-06-11 annotation conference call to review recommendations and actions
* [https://docs.google.com/document/d/1JxzNwl1fr5Fj-IJy_Fo1lLfAlAQB8IxPDwwYKxIXBZo Proposal and Recommendations]
* Biological Process Annotations
** [https://docs.google.com/spreadsheets/d/15tIvSts5PcWidtMpXTQQfjFY3LTYcfjOc62Y5_FeRpU Protein Complex Oligomerization]
** [https://docs.google.com/spreadsheets/d/1tVkRG81cfLxbrnS279RlJ-u1k20ee_NwHRXlWKUeWuo Regulation of Protein Complex Oligomerization]
* Molecular Function Annotations
** [https://docs.google.com/spreadsheets/d/1xOvmAmg5agQxo11pqYa6sGvbuJhOIyRGkZRjrcZHF8s Protein Dimerization Activity (includes regulation of activity)]
** For MF annotations using the IPI evidence code, we are proposing term merges, so curators hopefully will not have to do manual revision here (programmatic upgrades can be done locally?)
*** For example, 'protein homodimerization activity' should be handled by merging into 'identical protein binding'
** However, there are some evidence code issues/questions that we should review
*** For IC, NAS, and TAS evidence codes, recommend upgrading to an experimentally supported annotation or removing
** For IDA annotations, we'd like to convert as many of those as possible into IPI, even if the With field will contain the same ID as the annotated entity.  This allows for consistent representation of binding annotations in GO.
*** ~1200 IDA annotations to 'protein homodimerization activity'.  These should be automatically converted (locally) to 'identical protein binding' using IPI and the annotated entity in the With field.
*** ~350 IDA annotations to 'protein heterodimerization activity'.  Are these all cases where the interactor could not be assigned an ID?  Or are these cases where curators didn't feel the assay fit with IPI?
** There are also IGI, IMP, and RCA annotations - can any of these be converted to IPI (if there isn't a corresponding IPI already)?
*** IGI (4) to 'protein heterodimerization activity'  - are these supporting evidence in addition to IPI? Or something else?
*** IMP (133) to one of the MF terms - are these supporting evidence in addition to IPI? Or something else?
*** RCA (21) to one of the MF terms - most of these are older annotations from Gramene, a handful from AgBase
* [https://gitter.im/geneontology/go-annotation Gitter question] from Petra
== QC Reports ==
=== Pipeline Reports ===
*[http://snapshot.geneontology.org/reports/index.html snapshot reports]
=== Matrix QC checks ===
* How is this work going for people?
* Questions, comments?
== GO-CAMs and Annotation Extensions ==
* After USC hackathon in June, we formed a small working group to fully articulate the GO-CAM model specifications
* This will allow uniform QC across workbenches, projects (e.g. Reactome and MOD imports)
* Will also allow us to move forward with harmonizing AE relations used in GO-CAM and conventional annotation


= Minutes =
*Present: Birgit, Bob, Colin, David, Dmitry, Dustin, Edith, Giulia, Harold, Judy, Karen, Kimberly, Laurent-Philippe, Li, Michele, Midori, Pascale, Patrick N, Petra, Ruth, Sabrina, Stacia, Stan, Seth, Suzi A, Tanya, Tom, Val
 
== github submission for InterPro2GO mappings ==
* For now, add InterPro curators as listed in [https://github.com/geneontology/go-annotation/issues/2472 ticket] to any InterPro2GO mapping issues reported in go-annotation tracker
* Also add InterPro2GO mapping label
* We'll investigate if there are any ways to automatically assign tickets
 
== Protein oligomerization terms in GO ==
* Work on revising BP annotations seems to be going well, although some groups still have a lot of review to do
* For MF annotations, less manual work is likely required
* MF ticket will be created when spreadsheet recommendations are finished; will highlight those annotations that should be manually reviewed
* RCA annotations - will double-check documentation on whether they should be filtered out after a certain amount of time, e.g. a year
** Since they are supposed to be manually reviewed, there may not be a limit, but it's not clear that that's how the evidence code is being used by all groups
* Kimberly will contact Alex about annotation updates in Protein2GO
* Deadline to complete this work - end of summer (although spreadsheets will always be there in case people aren't able to finish annotation review before the obsoletions go into effect)
 
== QC Reports ==
 
=== Pipeline Reports ===
* We reviewed the GOC pipeline reports
* [http://snapshot.geneontology.org/reports/gorule-report.html Table view] provides a subset of QC rules that are implemented
* Full group reports (html likely most useful) should also be reviewed
* Additional work is being done to incorporate more rules, e.g. taxon constraints
* Can work on improving documentation on how to use the reports
 
=== Matrix QC checks ===
* Val gave a summary of how she's currently using the matrix tool for QC checks
* She's focusing on intersections with low numbers as these are often sources of annotation error
* The matrix tool ignores NOT annotations, but some annotations that should carry the NOT qualifier and don't still appear to be coming in via PAINT (there is a ticket for this)
* The matrix reports are available as a [http://snapshot.geneontology.org/reports/shared-annotation-check.html shared annotation check], but the errors are not sorted by contributor yet, so it's hard for groups to find annotations that need to be fixed
* 130 matrix-derived rules with this latest batch (up from 107)
* In the future, plan to extend the rules to component - process
* Possibly also figure out a way to use the matrix to find genes annotated to both a process and regulation of that process
 
 


[[Category:Annotation Working Group]]

Latest revision as of 13:53, 9 July 2019

Agenda

github submission for InterPro2GO mappings

Protein oligomerization terms in GO

  • Follow up to 2019-06-11 annotation conference call to review recommendations and actions
  • Proposal and Recommendations
  • Biological Process Annotations
  • Molecular Function Annotations
    • Protein Dimerization Activity (includes regulation of activity)
    • For MF annotations using the IPI evidence code, we are proposing term merges, so curators hopefully will not have to do manual revision here (programmatic upgrades can be done locally?)
      • For example, 'protein homodimerization activity' should be handled by merging into 'identical protein binding'
    • However, there are some evidence code issues/questions that we should review
      • For IC, NAS, and TAS evidence codes, recommend upgrading to an experimentally supported annotation or removing
    • For IDA annotations, we'd like to convert as many of those as possible into IPI, even if the With field will contain the same ID as the annotated entity. This allows for consistent representation of binding annotations in GO.
      • ~1200 IDA annotations to 'protein homodimerization activity'. These should be automatically converted (locally) to 'identical protein binding' using IPI and the annotated entity in the With field.
      • ~350 IDA annotations to 'protein heterodimerization activity'. Are these all cases where the interactor could not be assigned an ID? Or are these cases where curators didn't feel the assay fit with IPI?
    • There are also IGI, IMP, and RCA annotations - can any of these be converted to IPI (if there isn't a corresponding IPI already)?
      • IGI (4) to 'protein heterodimerization activity' - are these supporting evidence in addition to IPI? Or something else?
      • IMP (133) to one of the MF terms - are these supporting evidence in addition to IPI? Or something else?
      • RCA (21) to one of the MF terms - most of these are older annotations from Gramene, a handful from AgBase
  • Gitter question from Petra
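
The local, automatic conversion proposed above for IDA annotations to 'protein homodimerization activity' could look roughly like the following. This is only a minimal sketch, not an agreed procedure: it assumes annotations are held as tab-delimited GAF lines, the function name convert_homodimer_ida is hypothetical, and the two GO IDs in the constants should be checked against the current ontology before use. Leaving lines with a non-empty With field untouched is also an assumption, made so that nothing is overwritten without review.

```python
# 0-based positions of the GAF columns used here (assumes tab-delimited GAF lines).
DB, DB_OBJECT_ID, GO_ID, EVIDENCE, WITH_FROM = 0, 1, 4, 6, 7

# GO IDs to be verified against the ontology before running anything like this.
HOMODIMERIZATION = "GO:0042803"   # protein homodimerization activity
IDENTICAL_BINDING = "GO:0042802"  # identical protein binding


def convert_homodimer_ida(gaf_line: str) -> str:
    """Rewrite an IDA homodimerization line as IPI to identical protein binding.

    The annotated entity itself is placed in the With field. Any line that
    does not match (comments, other terms, other evidence codes, or a With
    field that is already filled in) is returned unchanged.
    """
    if gaf_line.startswith("!"):
        return gaf_line  # GAF header/comment line
    newline = "\n" if gaf_line.endswith("\n") else ""
    cols = gaf_line.rstrip("\n").split("\t")
    if len(cols) <= WITH_FROM:
        return gaf_line  # too short to be an annotation line
    if cols[GO_ID] != HOMODIMERIZATION or cols[EVIDENCE] != "IDA":
        return gaf_line
    if cols[WITH_FROM]:
        return gaf_line  # assumption: leave these for manual review
    cols[GO_ID] = IDENTICAL_BINDING
    cols[EVIDENCE] = "IPI"
    cols[WITH_FROM] = f"{cols[DB]}:{cols[DB_OBJECT_ID]}"
    return "\t".join(cols) + newline
```

The ~350 IDA annotations to 'protein heterodimerization activity' are deliberately not handled here, since the interactor for the With field cannot be derived from the annotation line itself.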

QC Reports

Pipeline Reports

Matrix QC checks

  • How is this work going for people?
  • Questions, comments?

GO-CAMs and Annotation Extensions

  • After USC hackathon in June, we formed a small working group to fully articulate the GO-CAM model specifications
  • This will allow uniform QC across workbenches, projects (e.g. Reactome and MOD imports)
  • Will also allow us to move forward with harmonizing AE relations used in GO-CAM and conventional annotation

Minutes

  • Present: Birgit, Bob, Colin, David, Dmitry, Dustin, Edith, Giulia, Harold, Judy, Karen, Kimberly, Laurent-Philippe, Li, Michele, Midori, Pascale, Patrick N, Petra, Ruth, Sabrina, Stacia, Stan, Seth, Suzi A, Tanya, Tom, Val

github submission for InterPro2GO mappings

  • For now, add InterPro curators as listed in ticket to any InterPro2GO mapping issues reported in go-annotation tracker
  • Also add InterPro2GO mapping label
  • We'll investigate if there are any ways to automatically assign tickets

Protein oligomerization terms in GO

  • Work on revising BP annotations seems to be going well, although some groups still have a lot of review to do
  • For MF annotations, less manual work is likely required
  • MF ticket will be created when spreadsheet recommendations are finished; will highlight those annotations that should be manually reviewed
  • RCA annotations - will double-check documentation on whether they should be filtered out after a certain amount of time, e.g. a year
    • Since they are supposed to be manually reviewed, there may not be a limit, but it's not clear that that's how the evidence code is being used by all groups
  • Kimberly will contact Alex about annotation updates in Protein2GO
  • Deadline to complete this work - end of summer (although spreadsheets will always be there in case people aren't able to finish annotation review before the obsoletions go into effect)

QC Reports

Pipeline Reports

  • We reviewed the GOC pipeline reports
  • Table view provides a subset of QC rules that are implemented
  • Full group reports (html likely most useful) should also be reviewed
  • Additional work is being done to incorporate more rules, e.g. taxon constraints
  • Can work on improving documentation on how to use the reports

Matrix QC checks

  • Val gave a summary of how she's currently using the matrix tool for QC checks
  • She's focusing on intersections with low numbers as these are often sources of annotation error
  • The matrix tool ignores NOT annotations, but some annotations that should carry the NOT qualifier and don't still appear to be coming in via PAINT (there is a ticket for this)
  • The matrix reports are available as a shared annotation check, but the errors are not sorted by contributor yet, so it's hard for groups to find annotations that need to be fixed
  • 130 matrix-derived rules with this latest batch (up from 107)
  • In the future, plan to extend the rules to component - process
  • Possibly also figure out a way to use the matrix to find genes annotated to both a process and regulation of that process
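
The last idea above, finding genes annotated to both a process and regulation of that process, could be prototyped along these lines. This is a rough sketch and not how the matrix tool works: the function name co_annotated, the (gene, GO ID, qualifier) tuple layout, and the regulates lookup table are all hypothetical, and in practice the lookup would have to be derived from the ontology's regulates relations. It only compares direct annotations (no propagation up the ontology) and skips NOT annotations, mirroring the matrix tool's behaviour described above.

```python
from collections import defaultdict


def co_annotated(annotations, regulates):
    """Find genes annotated to a process and to regulation of that process.

    annotations: iterable of (gene_id, go_id, qualifier) tuples.
    regulates:   dict mapping a regulation term to the process it regulates
                 (hypothetical input, to be built from the ontology).
    Returns a sorted list of (gene_id, process_term, regulation_term).
    """
    terms_by_gene = defaultdict(set)
    for gene, go_id, qualifier in annotations:
        if "NOT" in qualifier.split("|"):
            continue  # ignore NOT annotations
        terms_by_gene[gene].add(go_id)

    hits = []
    for gene in sorted(terms_by_gene):
        terms = terms_by_gene[gene]
        for reg_term in sorted(terms):
            process = regulates.get(reg_term)
            if process in terms:
                hits.append((gene, process, reg_term))
    return hits


# Example with one process/regulation pair (IDs should be verified).
regulates = {"GO:0042981": "GO:0006915"}  # regulation of apoptotic process -> apoptotic process
annotations = [
    ("geneA", "GO:0006915", ""),
    ("geneA", "GO:0042981", ""),
    ("geneB", "GO:0042981", ""),
    ("geneC", "GO:0006915", "NOT"),
    ("geneC", "GO:0042981", ""),
]
print(co_annotated(annotations, regulates))
```

Such hits would not necessarily be errors; they would be candidates for curator review, in the same way as the low-count intersections.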