Difference between revisions of "Annotation Conf. Call 2017-03-14"


Revision as of 08:20, 14 March 2017

Bluejeans URL

https://bluejeans.com/993661940

Agenda

GO Meeting Reminder

Software, Infrastructure

Annotation Review

  • Review of IMP annotations for possible use of new qualifiers
    • On the February 14th call, we discussed adding new qualifiers to describe the relationship between a gene/gene product and a GO BP term
    • Right now, the default relation between a gene/gene product and a GO BP term is 'involved in', but for a long time we've wanted a way to be more specific about describing this relation
    • The action item from the February 14th call was for groups to start looking at their existing BP annotations, specifically IMP, to determine whether the default 'involved in' qualifier is still appropriate and, if not, how they would describe the relation between the gene/gene product and the process.
    • BP_Annotation_Qualifiers_Spreadsheet
    • Are there general principles we can derive to help with applying qualifiers to new and existing legacy annotations?
      • Defining starts and ends of processes will be critical for doing this.
        • Signaling, or otherwise well-defined, molecular pathways vs BPs like behaviors or developmental processes
        • Look at genes annotated to both the process and regulation of the process
        • Look at genes for which there is a regulation annotation and no MF annotation.
    • Explore effects on enrichment analyses - if we give users the option to filter annotations based on use of specific qualifiers, what might the outcome be?
    • David H has run gene sets from three papers using VLAD including and excluding IMP evidence codes.
    • GO and phenotype enrichment analyses may be complementary. Can/should we start encouraging users to do both?
  • Single-step biological processes - review annotations to help assess impact if such terms are obsoleted
    • Github ticket 'Remove all single-step BP classes'
    • What terms could possibly be considered single-step processes?
      • Look at MF-BP links
    • Mary D's analysis shows that there are 2380 bioentities that have experimental annotation to phosphorylation or its children that do not have annotation to kinase or its children. This is just one example.
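
The kind of comparison described in Mary D's analysis can be sketched as a set difference: bioentities with an experimental annotation under a process term, minus bioentities with any annotation under the matching function term. This is only an illustration on made-up data, not the analysis that was actually run. The function names, the toy annotations, the use of term labels in place of GO IDs, and the list of evidence codes treated as experimental are all assumptions.

```python
# Evidence codes treated as experimental in this sketch (an assumption).
EXPERIMENTAL = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}


def descendants(term, children):
    """Return the term plus all of its descendants in a parent -> children map."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(children.get(current, ()))
    return seen


def process_without_function(annotations, children, process_root, function_root):
    """Bioentities with an experimental annotation to the process (or a child)
    that have no annotation at all to the function (or a child)."""
    process_terms = descendants(process_root, children)
    function_terms = descendants(function_root, children)
    with_process = {
        entity
        for entity, term, evidence in annotations
        if term in process_terms and evidence in EXPERIMENTAL
    }
    with_function = {
        entity for entity, term, _ in annotations if term in function_terms
    }
    return with_process - with_function


# Hypothetical toy data: (bioentity, term label, evidence code).
children = {
    "phosphorylation": ["protein phosphorylation"],
    "kinase activity": ["protein kinase activity"],
}
annotations = [
    ("G1", "protein phosphorylation", "IMP"),
    ("G1", "protein kinase activity", "IDA"),
    ("G2", "phosphorylation", "IMP"),
    ("G3", "phosphorylation", "IEA"),  # not experimental, so ignored
]

result = process_without_function(
    annotations, children, "phosphorylation", "kinase activity"
)
print(sorted(result))
```

The same helper could be pointed at other process/function pairs when looking for candidate single-step processes, provided the MF-BP pairing is supplied by hand.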

Working Groups

  • Transcription factor decision tree
    • From Rachael: The update is that we have revised the decision tree, which is attached above (see github ticket), and we would like feedback on it from this working group. When the working group are happy with it, then we will announce it at a future annotation call and get it added to the website.
  • Annotating high throughput experiments
    • Proposed first meeting: Tuesday, March 21st, 8am PST

Minutes

  • On call: Chris, David H., Edith, Eric, George, Giulia, Harold, Helen, Jim, Judy, Karen, Kimberly, Li, Mary, Midori, Moni, Pascale, Petra, Sabrina, Shur-Jen, Val

GOC Meeting - Corvallis, Oregon

  • Early June
  • Three days of GOC meeting
  • Noctua workshop
  • Reactome workshop
    • AI: Need to check what the focus of the Reactome workshop will be

Software, Infrastructure

  • Chris - update
  • Check slides from USC presentation in GO Google directory
  • GO will be ceasing support for subversion (SVN)
    • Replaced with a mixture of github (for the ontology)
      • Meeting in Berkeley at end of February for training on git and ontology editing - generally going well
    • Will bypass version control for ontology files and publish directly on Amazon S3
  • Look at go-site directory on github
    • go-site/metadata/datasets
    • datasets includes .yaml files of metadata for all GAF providers
      • gives info about the project, what types of files are submitted
      • use this metadata for validation checks, e.g. taxon
      • the metadata will be used to drive future Jenkins jobs - validations, OWL tools checks, prediction GAFs
        • see go-gaf-pipeline-NEW
        • produces a folder for each producer, e.g. pipeline/target/mgi
        • note file format with type of file as extension, e.g. .gpi or .gaf
        • prediction GAFs can be slurped into their own database pipeline or can just get incorporated into the GAF as part of the Jenkins pipeline
    • Harold - what is happening with the gp2protein, gp2rna, etc.?
      • Chris - could be incorporated into the target directory, but these files seem to have been subsumed by the Quest for Orthologs project
    • Harold - For the error files, Mike's script would remove the offending lines; what does Jenkins do with the errors?
      • Chris - the current Jenkins filters might be a bit more liberal right now; Mike's script checked for more line-by-line types of errors, e.g. a missing column, while the Jenkins checks are more involved and require things like loading the ontologies, etc.
    • Midori - What is the mechanism for alerting curators to errors? emails?
      • Chris - emails turn out to be problematic
    • Harold - Will errors block the GAF from being released?
      • Chris - Gross violations will be filtered
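
To make the discussion above concrete, here is a minimal sketch of the two kinds of checks mentioned: a line-by-line check for a missing column, and a taxon check driven by per-dataset metadata, with failing lines filtered out rather than blocking the whole file. It is not the Jenkins pipeline or Mike's script. The function names and the shape of the `metadata` dictionary (standing in for a parsed datasets .yaml file) are assumptions; the only format facts relied on are that GAF 2.x lines have 17 tab-separated columns, that the taxon is column 13, and that header lines start with "!".

```python
GAF_COLUMNS = 17  # GAF 2.x annotation lines have 17 tab-separated columns
TAXON_COL = 12    # zero-based index of the Taxon column (column 13)


def check_line(line, allowed_taxa):
    """Return None if the line passes, otherwise a short error message."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != GAF_COLUMNS:
        return f"expected {GAF_COLUMNS} columns, found {len(cols)}"
    # The taxon column may hold "taxon:A|taxon:B"; check the first entry only.
    primary_taxon = cols[TAXON_COL].split("|")[0]
    if primary_taxon not in allowed_taxa:
        return f"taxon {primary_taxon} not listed for this dataset"
    return None


def filter_gaf(lines, metadata):
    """Split lines into those kept and a list of (line number, error) pairs."""
    allowed = set(metadata["taxa"])
    kept, errors = [], []
    for number, line in enumerate(lines, start=1):
        if line.startswith("!"):  # header/comment lines pass through untouched
            kept.append(line)
            continue
        problem = check_line(line, allowed)
        if problem is None:
            kept.append(line)
        else:
            errors.append((number, problem))
    return kept, errors


def make_line(taxon):
    """Build a hypothetical 17-column GAF line for the demo below."""
    cols = ["MGI", "MGI:1", "Abc1", "", "GO:0008150", "PMID:1", "IMP", "",
            "P", "", "", "protein", taxon, "20170314", "MGI", "", ""]
    return "\t".join(cols) + "\n"


# Hypothetical metadata, standing in for one parsed datasets .yaml entry.
metadata = {"id": "mgi", "taxa": ["taxon:10090"]}
lines = [
    "!gaf-version: 2.1\n",
    make_line("taxon:10090"),  # passes both checks
    make_line("taxon:9606"),   # wrong taxon for this dataset
    "MGI\tMGI:3\n",            # missing columns
]

kept, errors = filter_gaf(lines, metadata)
for number, problem in errors:
    print(f"line {number}: {problem}")
```

Returning the errors as a separate list leaves open how curators are alerted, which the minutes do not settle.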