2016 Los Angeles GOC Meeting Agenda

From GO Wiki
Revision as of 10:02, 3 November 2016 by Paul Thomas (talk | contribs)


Agenda

Overview/Plan for Upcoming Five Years

  • GO PIs presentation

Review action items from GOC meeting in Geneva, April 2016

Geneva

  • AI: GOC to decide what is HTP data
  • AmiGO and web site action items (Seth)
    • AI: Mechanism to remove redundant annotations in GO display
    • AI: GOC to extract this data and display annual stats on web page
    • AI: Submit suggestions for go-site to github
    • AI: Val give presentation on call for term matrix
    • AI: High level discussions needed to discuss development of browsers
    • AI: Working group to be set up to work on community annotation web presence (on behalf of Moni)
  • AI: Seth, Chris and Tony to discuss proposed extension to GPAD to capture source data for certain annotations

Informatics and Infrastructure 5 year plan

Update on various changes (Chris)

Proposal

  • Switch to global monthly releases (cjm)
    • Still provide daily snapshots

Conference Calls and Communication

Estimated time: 30 minutes

  • Discuss different options for reducing the number of conference calls, while still facilitating effective communication between the different GO groups, e.g. annotators, ontology editors, software team
    • Consolidate all annotation calls (Monday LEGO, Tuesday Annotation, Tuesday PAINT) into one Tuesday annotation call, frequency TBD
    • Consolidate LEGO, Annotation, PAINT, and Ontology Development calls into one weekly GO call
  • Discussion on the design of new SOPs for mechanisms of communication
    • What is the best mechanism to alert annotation groups of changes to the ontology that will affect annotations? We have started a table of contacts, but is this how annotation groups would like to proceed?
    • Review of github repositories, what to record where, who is processing/clearing tickets, etc.
  • Discussion on what it means to be a member of the Gene Ontology Consortium, not just the NHGRI grant.
    • Agreed to standards, which ones?

Annotation Metrics

Estimated time: 1 hour

  • What are the optimal metrics to assess progress in GO annotation?
    • Number of annotations
    • Number of references
      • Recall ZFIN's 'paper complexity' measure as a way of normalizing for different paper content (Doug mentioned in Geneva)
    • Revised annotations, e.g. updating to a new term
    • Removing annotations, e.g. improving knowledge about how a gene product affects a downstream process
    • Adding appropriate contextual information to existing annotations
    • Percentage of genome annotated vs percentage of genome with annotatable information?
  • How does LEGO modeling change our assessment of a curator's contributions?
  • Multiple funding bodies (Ruth)
  • Distinguishing annotations that are created automatically, e.g. inference pipelines (Tony)
  • Individual curator attribution via ORCID iDs: it is important to establish whether this is wanted and, if so, at what level of detail: annotation by annotation, or just as a summary of each curator's contribution?

Genetic entities in GO annotation

  • Can someone claim these two topics, if they are still interested in discussing them?
    • DOS: What genetic entities do we need to refer to as the primary object of GO annotation and in extended forms of conventional or LEGO annotation?
    • What conventions should we enforce regarding which identifiers to use to refer to these genetic features and how?
  • Update on UniProt GCRP sets (Maria)
  • Update on gpi specifications and uses (Kimberly, Chris) - 10-15 minutes

Ontology Group Update (DavidH)

  • GH tracker progress
  • Special Projects
    • Cilia
    • Autophagy
    • Apoptosis
    • Plant Enzymes
    • Synapse (DOS/PDT)
  • GO help report

Annotation Issues - Conventional Annotations

Aligning Conventional and LEGO Annotations

  • A proposal to make Conventional Annotation align better with LEGO modelling (F-P linking) (Val)
https://github.com/geneontology/go-ontology/issues/12739#issuecomment-254623691

Modified Protein Binding

  • Modified protein binding: GO terms & annotations are very inconsistent. (DavidH to present Paola's proposal)
    • Recent github issues:
glycoprotein binding: https://github.com/geneontology/go-ontology/issues/12580#issuecomment-240782020
ubiquitinated protein binding: https://github.com/geneontology/go-ontology/issues/12582#issuecomment-240452320

Protein Family Terms in the Ontology

  • Protein families in terms (DavidH)
    • Currently the inclusion of protein family information in term names is leading to inconsistent annotation.
      • For now, the ontology editors have not been adding terms that specifically refer to protein families with the exception of signaling pathways. Should we make this a rule? If so, how will we capture the detail desired by annotators and how will we make this backward compatible?
https://github.com/geneontology/go-ontology/issues/12440

Multiple Evidences to Support an Inference

  • How are people capturing data where both a direct assay AND protein motif/domain/sequence information need to be used by the curator to support the annotation? [15 min presentation: Ruth, started by Rebecca] A system needs to be in place so that the more specific annotations can be created for orthologous proteins (which cannot be done across all species with the IC evidence code).
    • e.g. a transmembrane domain is used as evidence for the annotation 'integral to membrane' (IEA evidence); immunofluorescence localises the protein to 'plasma membrane' (IDA evidence); the ideal annotation to create would be 'integral to plasma membrane'
    • 3 obvious options (any others?)
      • new evidence code IDD 'inferred by direct assay AND protein domain (sequence/motif?)' (would probably also want IMD, IGD, IED)
        • Note that ECO has a combinatorial evidence code that could possibly be used as the parent for new GO combinatorial codes:
          • combinatorial evidence used in manual assertion - ECO:0000244
      • no new evidence code required, as this is implied by the 'inferred' aspect of the evidence code as well as 'author intent'
      • Create a GOC pipeline that creates the CC annotations based on the IDA annotation (eg plasma membrane) and the IEA information (eg integral to membrane) to create the more specific annotation (eg integral to plasma membrane).
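
The third option above could be sketched roughly as follows. This is only an illustration of the idea, not an agreed GOC pipeline: the `Annotation` record, the `COMBINED_TERMS` lookup table, and the use of `IDD` as the resulting evidence code are all assumptions made for the example (`IDD` is only the code proposed in the first option).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Hypothetical, simplified annotation record (not the GAF/GPAD format)."""
    gene_product: str
    term: str
    evidence: str


# Hypothetical lookup: (term observed by direct assay, term predicted from
# sequence/domain) -> the more specific term both together support.
COMBINED_TERMS = {
    ("plasma membrane", "integral to membrane"): "integral to plasma membrane",
}


def infer_specific_annotations(annotations, combined_evidence="IDD"):
    """Combine IDA and IEA annotations on the same gene product.

    `combined_evidence` defaults to the proposed (not existing) code IDD.
    """
    ida_terms = {}
    iea_terms = {}
    for ann in annotations:
        if ann.evidence == "IDA":
            ida_terms.setdefault(ann.gene_product, set()).add(ann.term)
        elif ann.evidence == "IEA":
            iea_terms.setdefault(ann.gene_product, set()).add(ann.term)

    inferred = []
    # Only gene products with both kinds of evidence can be combined.
    for gene_product in sorted(ida_terms.keys() & iea_terms.keys()):
        for (assayed, predicted), specific in COMBINED_TERMS.items():
            if assayed in ida_terms[gene_product] and predicted in iea_terms[gene_product]:
                inferred.append(Annotation(gene_product, specific, combined_evidence))
    return inferred
```

A gene product with only the IDA annotation, or only the IEA annotation, yields nothing here; whether that is the desired behaviour, and which evidence code the derived annotation should carry, are exactly the points for discussion.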

Transcription Factor Annotation Decision Tree

  • Flowchart guidelines for transcription factor annotations [10 min presentation: Rachael/Ruth/Barbara]. To improve consistency, the UCL team has created an annotation flowchart, which is being circulated to GOC members.

Consistent Classification of Signaling Pathway Terms

  • Conventions for signalling pathway terms
    • Currently you can request signalling pathway terms along multiple axes of classification including:
      • signalling module (MAPK cascade, GTPase etc.)
      • process regulated
      • target TFs
      • ligand/pheromone activating pathway
      • condition activating pathway (in response to hydrogen peroxide and other oxidants for oxidative stress pathway)

This results in an almost infinite number of ways to describe some pathways.

https://github.com/geneontology/go-ontology/issues/12701

Annotations from High Throughput Experiments

  • Annotations from high-throughput experiments (Ruth, David Hill, Kimberly)
    • How do we decide when to make annotations from high-throughput experiments?
    • If we decide that annotations from high-throughput experiments should be removed, what are the procedures (all annotations, some annotations)?
    • Do we want new evidence codes to indicate that the annotation was inferred from a high-throughput experiment?

Annotation Issues - LEGO Annotations

  • How are we going to handle ECO codes in Noctua? Currently only a limited number of codes fall under 'used in manual assertion'. If we use codes that are not specific to the manual assertion part of the ontology, they map to EXP. Are we going to request the entire set of codes that we think we might want to use, or are we going to have an automated way to map to the correct code?

Example: http://noctua.berkeleybop.org/editor/graph/gomodel:5745387b00001874
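
As a starting point for that discussion, the fallback behaviour described above can be sketched as a simple lookup. This is an illustrative sketch, not Noctua's actual implementation: the table is a small subset that should be checked against the current ECO-to-GO evidence code mapping, and the function names are hypothetical.

```python
# Partial, illustrative table of ECO 'used in manual assertion' classes and
# the GO evidence codes they correspond to. Verify against the current
# ECO-to-GO mapping before relying on it.
MANUAL_ASSERTION_TO_GO = {
    "ECO:0000269": "EXP",
    "ECO:0000314": "IDA",
    "ECO:0000353": "IPI",
    "ECO:0000315": "IMP",
    "ECO:0000316": "IGI",
    "ECO:0000270": "IEP",
}


def to_go_evidence(eco_id, fallback="EXP"):
    """Return (GO evidence code, used_fallback) for an ECO class ID.

    IDs missing from the table fall back to EXP, mirroring the behaviour
    described in the agenda item.
    """
    code = MANUAL_ASSERTION_TO_GO.get(eco_id)
    if code is None:
        return fallback, True
    return code, False


def codes_needing_request(eco_ids):
    """List the distinct ECO IDs that would silently collapse to the fallback."""
    return sorted({eco_id for eco_id in eco_ids if to_go_evidence(eco_id)[1]})
```

Running `codes_needing_request` over the ECO IDs used in existing models would give a concrete list of codes to request, which is one way to choose between the two options in the question.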

  • Generating conventional annotations from Noctua models
    • Are we going to allow Noctua to generate conventional annotations to the root nodes of the ontology?
      • This would be useful for contextual annotations that would otherwise be to root nodes.
      • However some groups block these kinds of annotations because in the past, these annotations were used to keep track of genes about which we had no information.
      • Note that the evidence code for a root node annotation in Noctua would/could be different in that the curator might assert that a gene product has some molecular function due to the observation that, when mutated, there is a phenotypic outcome, e.g. apoptosis execution fails.
      • This is a different statement from no biological data (ND) in which there is no information at all to assert a role in any biological process.
  • Are some conventional annotation rules inappropriate for Noctua annotation?
    • For a molecular function occurring in a cellular location, isn't IEP a more appropriate evidence code? IDA would mean that the function was assayed in situ. https://github.com/geneontology/go-annotation/issues/1395
    • Since binding is a part of many molecular functions, should we allow evidence codes other than IPI for binding (eg TAS)?
  • MGI's experience roundtripping with Noctua Models (DavidH)

Regulation relations

Regulation and causal relations are central to LEGO annotation and to inference based on LEGO models, but definitions and guidelines still need work to ensure consistency and clarity. DOS: I would like to present progress on the development of the relevant relations along with a proposal for how to improve them. This would probably work best as a collaborative presentation with LEGO annotators where we can show application to LEGO models.

GAF/GPAD inference from LEGO models

Introduction to inferring annotations from LEGO: Extended Gene Product to GO term relations; Reasoning across causal chains.

Jim Balhoff: Inference using Blazegraph & RDFox

DOS: Templates, design patterns and inference.