Annotation Conf. Call 2016-07-26

From GO Wiki
Revision as of 07:07, 22 July 2016 by Vanaukenk (talk | contribs)

Bluejeans URL:


Next GOC Meeting - USC, Los Angeles, CA, November 4-6, 2016

  • Please indicate on the Meeting Logistics Page if you plan to come.
  • We are also gauging interest in a one- or half-day Noctua/LEGO workshop at USC (either before or after the main consortium meeting) so have added another column to the table to see if people can also attend that.
  • Please add your information as soon as possible so we can make arrangements.

Annotation Consistency Exercise for the Remainder of 2016

  • Thanks to everyone who agreed to select and present a paper for the remainder of this year:
  • August 23 - SGD
  • September 27 - dictyBase
  • October 25 - RGD
  • November 22 - ZFIN
  • If you have thoughts about the exercises, please send them along to David and Kimberly. At the end of the year, we will evaluate how this is working and if we need to make any changes to the format to keep these exercises useful for people.

Revised Protein Binding Documentation

  • On the 2016-06-28 call, we discussed how each group currently annotates protein binding experiments, since it was pointed out that the current documentation likely does not reflect universal practice, specifically with respect to whether the interactions captured using 'protein binding' (GO:0005515) or its children are direct or indirect.
 Current Documentation: The 'with' column (8) and the annotation extension column (16) should be used only for direct 
 interactions and only when the binding relationship is not already included in the GO term and/or definition. See the column 16 
 documentation for relationship types to use when adding IDs in the annotation extension column (16). 
  • We surveyed curators on the call and found that there are differences in how groups use interaction experiments for GO annotation.
  • We also discussed whether we are comfortable with having differences or should try to adhere to a common practice; generally, people felt it was okay to have some differences here, but we need to reflect that in the documentation.
  • Here is a draft of an update to the binding section of our curation documentation. Let's discuss if this accurately reflects what we do and why, and then make changes, if needed, and update the documentation.
 Proposed New Guideline: The Molecular Function (MF) ontology can be used to capture macromolecular interactions, such as protein-
 protein, protein-nucleic acid, protein-lipid interactions, etc.  While GO annotations are not considered to be a repository of all 
 protein-protein interactions, many gene products are annotated to 'protein binding' (GO:0005515) or one of its child terms.  In making 
 these annotations, contributing groups may follow slightly different practices with respect to the types of experimental evidence used 
 to support these inferences, e.g. some groups may use co-immunoprecipitation as supporting evidence for a protein binding annotation 
 between two gene products, others not.  However, all groups generally adhere to the principle that, when annotated, protein binding 
 interactions inform what is believed to be the normal biological role of a gene product, i.e. the protein-protein interactions support 
 an author's hypothesis about how the gene product is thought to execute its molecular function in the context of a normal biological 
 process.  Protein-protein interactions for which there is not yet sufficient biological context are discouraged as sources of GO MF 
 annotations.
  • We also discussed, on the last conference call, the criteria by which protein binding annotations from IntAct are exported to GO. A response from Sandra Orchard is in the 2016-06-28 minutes.
  • A summary:
 * Only experimental data is used for making the decision to export the protein pair to UniProtKB/GOA as a true binary interacting pair.
 * The export decision is always based on at least two pieces of experimental data. A single piece of evidence cannot score highly 
 enough to trigger an export.
 * An export cannot be triggered if the protein pair only ever co-occurs in larger complexes; there must be at least one piece of 
 evidence that the proteins are probably in physical contact.
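The three criteria summarized above can be read as minimum conditions on the evidence for a protein pair. The following is a rough sketch of that reading only, not IntAct's actual scoring system; the `Evidence` class, its fields, and the function name are hypothetical and were invented for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Evidence:
    """One piece of evidence for a protein pair (hypothetical structure)."""
    experimental: bool     # True if the evidence comes from an experiment
    physical_contact: bool  # True if it suggests the proteins are probably in
                            # physical contact; False if the pair was only
                            # seen together in a larger complex


def meets_minimum_export_criteria(evidences: List[Evidence]) -> bool:
    """Check the three summarized criteria (assumed to be necessary conditions)."""
    # Criterion 1: only experimental data counts toward the decision.
    experimental = [e for e in evidences if e.experimental]
    # Criterion 2: a single piece of evidence is never enough.
    if len(experimental) < 2:
        return False
    # Criterion 3: at least one piece of evidence must indicate physical contact.
    return any(e.physical_contact for e in experimental)


# Two experiments, one of which suggests physical contact.
print(meets_minimum_export_criteria([Evidence(True, True), Evidence(True, False)]))
```

A pair that passes this check could still fall below the real score threshold, so the function only rules pairs out; it does not reproduce the export decision itself.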

Questions about Membrane Cellular Component Annotations

  • The UCL group would like clarification and guidelines on how curators should annotate the various membrane and child terms that describe the extent to which a gene product is contained within a membrane.
  • Here is a representative branch of the CC ontology wrt these types of terms:
    • membrane part
      • [isa]intrinsic component of membrane
 Definition:  The component of a membrane consisting of the gene products having some covalently attached portion, for example part of 
 a peptide sequence or some other covalently attached group such as a GPI anchor, which spans or is embedded in one or both leaflets of 
 the membrane. Source: GOC:mah 
 Comment:  Note that proteins intrinsic to membranes cannot be removed without disrupting the membrane, e.g. by detergent.
        • [isa]integral component of membrane
 Definition:  The component of a membrane consisting of the gene products and protein complexes having at least some part of their 
 peptide sequence embedded in the hydrophobic region of the membrane. Source: GOC:go_curators, GOC:dos
        • [isa]anchored component of membrane
 Definition:  The component of a membrane consisting of the gene products that are tethered to the membrane only by a covalently 
 attached anchor, such as a lipid group that is embedded in the membrane. Gene products with peptide sequences that are embedded in the 
 membrane are excluded from this grouping. Source: GOC:dos, GOC:mah
      • [isa]extrinsic component of membrane
 Definition:  The component of a membrane consisting of gene products and protein complexes that are loosely bound to one of its 
 surfaces, but not integrated into the hydrophobic region. Source: GOC:dos, GOC:mah, GOC:jl 
 Comment:  Note that proteins extrinsic to membranes can be removed by treatments that do not disrupt the membrane, such as salt 
 solutions.
  • Examples from the literature:
 Example 1: PMID:18502731
 What annotations should be made for VGAT and VGLUT2?
 Summary of methods in the paper:
 Electron microscopy shows synaptic vesicle localisation (Figure 4)
 Immunolocalization supports the localisation of VGAT and VGLUT2 to synaptic vesicles
 The curator knows that these proteins have transmembrane domains
 Would you annotate to:
 integral component of synaptic vesicle membrane ; GO:0030285 | IDA
 synaptic vesicle ; GO:0008021 | IDA
 integral component of synaptic vesicle membrane ; GO:0030285 | IC from GO:0008021
 (NB: The IC doesn’t show the full picture because the membrane domains/anchors are author knowledge, so they are often curated as 
 NAS/TAS, which can’t be included in the with statement for an IC annotation).

 Example 2: PMID:17110340
 integral component of synaptic vesicle membrane ; GO:0030285
 anchored component of synaptic vesicle membrane ; GO:0098993
 Summary of methods in the paper:
 The protein composition of purified synaptic vesicles (SVs) was analysed by mass spectrometry (MS) and 1D SDS-PAGE, and 410 proteins 
 were unambiguously identified
 Proteins are classified as: 
   Copurifying with SVs
   Ubiquitously distributed on subcellular membranes (i.e. present on SVs but not enriched relative to other fractions).
 Western blots were used to quantitate the levels of SV proteins.
 Three different electron microscopy (EM) procedures imaged surface proteins and show the surface of SVs to be covered with proteins, 
 but do not identify individual proteins
 They model the SV (Figure 4) to show transmembrane domains of proteins and anchored proteins; some of these are known to be membrane 
 proteins from their previously described structure (e.g. they are known ion channels)
 From this paper, would you annotate to:
 integral component of synaptic vesicle membrane ; GO:0030285 | IDA
 anchored component of synaptic vesicle membrane ; GO:0098993 | IDA
 (e.g. incorporating author say-so/previous knowledge of the protein domains into the IDA evidence code)
 synaptic vesicle membrane ; GO:0030672 | IDA
 integral component of synaptic vesicle membrane ; GO:0030285 | IC from GO:0030672
 anchored component of synaptic vesicle membrane ; GO:0098993 | IC from GO:0030672
 (NB: The IC doesn’t show the full picture because the membrane domains/anchors are author knowledge, so they are often curated as 
 NAS/TAS, which can’t be included in the with statement for an IC annotation).
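For reference while discussing the examples, the representative branch of the CC ontology listed earlier in this section can be written down as a small lookup table. This is only a sketch of the four terms shown on this page; it assumes one is_a parent per term, whereas the real ontology is a graph in which a term can have several parents, and the names `IS_A` and `ancestors` are made up for illustration.

```python
# is_a parents for the branch shown on this page (child -> parent).
IS_A = {
    "intrinsic component of membrane": "membrane part",
    "integral component of membrane": "intrinsic component of membrane",
    "anchored component of membrane": "intrinsic component of membrane",
    "extrinsic component of membrane": "membrane part",
}


def ancestors(term):
    """Return the is_a ancestors of a term, nearest first, within this branch."""
    chain = []
    while term in IS_A:
        term = IS_A[term]
        chain.append(term)
    return chain


# Both 'integral' and 'anchored' roll up to 'intrinsic component of membrane'.
print(ancestors("integral component of membrane"))
print(ancestors("anchored component of membrane"))
```

Seen this way, the question in both examples is how far down this branch the experimental evidence alone lets a curator go, and which evidence code covers the remaining step.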

LEGO Model for FlyBase Annotation Consistency Exercise Paper


  • On call: Alice, David H., Elena, Giulia, Kimberly, Melanie, Rebecca, Ruth, Sabrina, Stacia

Annotation Consistency Exercises

  • PomBase is up in two weeks - Val was notified
  • August is SGD - okay with them, but we need to make sure we'll have enough people around in August to make this worthwhile. So far, we're okay, but will continue to check on the next two calls.
  • September is dictyBase - Petra was contacted
  • October is RGD - Stan was contacted
  • November is ZFIN - okay with Sabrina

Protein Binding Documentation

  • Following up on the protein binding documentation discussion, we reviewed the draft of new guidelines/explanation of how the GOC approaches annotations to protein binding (GO:0005515) and its children.
  • The new documentation seems okay with everyone, so we will go ahead and update the section on binding guidelines on the website.
  • Ruth also suggested we review how groups are annotating to protein complex terms using the IPI evidence code, specifically to see how curators populate the With/From field, and make sure we are being consistent here.
    • For protein complex annotations, does the With/From field list every member of the complex, pipe-separated, or only, for example, the tagged member of the complex used to identify multiple other members?
    • We will need to survey groups about this and then take it from there.
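To make the two options concrete, here is a minimal sketch of how a pipe-separated With/From value could be split into individual identifiers and compared. The identifiers are placeholders rather than real accessions, and `parse_with_from` is a hypothetical helper, not part of any GO tool.

```python
def parse_with_from(field):
    """Split a pipe-separated With/From value into a list of identifiers."""
    return [item.strip() for item in field.split("|") if item.strip()]


# Option A: every member of the complex is listed (placeholder IDs).
all_members = parse_with_from("UniProtKB:EXAMPLE1|UniProtKB:EXAMPLE2|UniProtKB:EXAMPLE3")

# Option B: only the tagged member used to identify the others is listed.
tagged_only = parse_with_from("UniProtKB:EXAMPLE1")

print(len(all_members), len(tagged_only))
```

Counting the identifiers per annotation in this way could be one simple input for the survey, since it would show which of the two practices each group follows.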

Protein Binding Documentation - IntAct Protein Binding Annotations

  • Melanie followed up with Sandra Orchard to get more information on the criteria by which IntAct's protein-protein interactions are incorporated into the GO.
  • IntAct uses a scoring system that requires high-confidence experimental data to allow export.
  • We will add the explanation of this pipeline to the GO website's annotation FAQs.