| [[Category:Meetings]]
| =Day 1 morning session=
| |
| ===9:00 Introductions and objectives of the meeting===
| |
| # Introductions & Logistics: Serenella Ferro Rojas
| |
| * Poll for Thursday lunch reservations, depending on weather.
| |
| * Dinner at [http://www.resto.ch/labourse/ Brasserie la Bourse] on the Carouge
| |
| ** ~ 1.9 km from meeting site
| |
| Friday
| |
| Reception at noon to celebrate Amos Bairoch's Otto Naegeli prize.
| |
| | |
| Introductions
| |
| | |
| ==== Goals: Pascale Gaudet ====
| |
| | |
| ===GO – Ontology, annotation, tools and technical aspects===
| |
| '''Chairs: Serenella Ferro Rojas and Pascale Gaudet'''
| |
| ==== GO overview ====
| |
| An introduction to the GO ontology: terms, definitions, synonyms, relationships, cross-products. Jane Lomax
| |
| * Inter-ontology links
| |
| ** Most tools don't make inferences across the ontologies, so curators make redundant annotations.
| |
| ** Cross products
| |
| *** between GO ontologies
| |
| *** external ontologies (cell ontology; CHEBI)
| |
| * Ontology development
| |
| ** large scale targeted projects
| |
| ** logical consistency
| |
| ** small scale requests (Sourceforge tracker; future via Amigo)
| |
| | |
| Q/A: classical relationships (e.g. part_of within an ontology) are a subset of cross-products.
| |
| ==== Annotation Process ====
| |
| ===== General overview of the annotation guidelines used by GO, and contributing resources. Rama Balakrishnan =====
| |
| ** Annotation guidelines
| |
| Goal: say as much as possible about a gene product. Be useful to bench and computational biologists.
| |
| * GO annotation: Gene product association with GO terms and other info.
| |
| ** Core
| |
| *** gene product identifiers
| |
| *** GO term
| |
| *** Reference
| |
| *** Evidence code
| |
| ** Additional info
| |
| *** qualifiers
| |
| *** with/from
| |
| *** Annotation detail (16)
| |
| *** Isoform
| |
| * Sources
| |
| ** Manual
| |
| ** Automated
| |
| ** PAINT (new)
| |
| *** inter-ontology inferences (new)
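Illustration (not from the meeting): a minimal Python sketch of the annotation record outlined above, with the four core fields and the optional extras. The class name, field names, and the example identifiers are assumptions made for this sketch; they are not the GAF column names.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Annotation:
    # Core fields listed in the minutes
    gene_product_id: str
    go_term: str
    reference: str
    evidence_code: str
    # Additional, optional information
    qualifiers: List[str] = field(default_factory=list)
    with_from: List[str] = field(default_factory=list)
    annotation_extension: Optional[str] = None  # the "column 16" detail
    isoform: Optional[str] = None

    def is_complete(self) -> bool:
        """True when all four core fields are non-empty."""
        return all(
            [self.gene_product_id, self.go_term, self.reference, self.evidence_code]
        )


example = Annotation(
    gene_product_id="EXAMPLE:0001",  # hypothetical identifier
    go_term="GO:0005515",            # protein binding
    reference="PMID:0000000",        # placeholder reference
    evidence_code="IPI",
    with_from=["EXAMPLE:0002"],      # hypothetical interaction partner
)
```

Keeping the optional fields separate from the core ones makes it easy to check that a record carries the minimum needed before any extra detail is considered.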
| |
| Differences between previous GO camps and this one: this one is more internal and focused on strengthening guidelines.
| |
| | |
| * Challenges ...
| |
| * Avoiding redundancy.
| |
| ** Authoritative sources
| |
| *** no MOD - UniProt-GOA.
| |
| | |
| | |
| ===== General overview UniProtKB/SwissProt manual annotation. Serenella =====
| |
| * Proteins are selected for manual annotation based on priorities
| |
| ** Recent papers chosen for high impact
| |
| ** Curation of specific processes (e.g. ubiquitin-like conjugation)
| |
| ** User requests
| |
| Flow
| |
| * sequence curation
| |
| ** One record for all different products for the same gene
| |
| * Sequence analysis: automated, with manual checking. Domains, PTMs, etc.
| |
| * Literature curation. Species, protein names, gene names, journals, tissues, plasmids
| |
| ** Store as comment lines free text with controlled tags(?)
| |
| ** Sequence annotation of features (relation to SO?)
| |
| ** GO annotation: 50 curators. Automated: spkw2go, mappings2GO, etc.
| |
| * Family-based curation
| |
| * Attribution
| |
| * QA and integration
| |
| ** e.g. throw error when nucleus kw for bacterial protein
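Illustration (not from the meeting): a QA rule of this kind could be sketched as a lookup of keywords that are not expected for a taxonomic group. The function name, the rule table, and the keyword strings are assumptions; only the nucleus/bacteria rule itself comes from the minutes.

```python
from typing import Dict, List, Set

# Hypothetical rule table: keywords that should not appear for a given
# taxonomic group. Only the nucleus/bacteria rule comes from the minutes.
FORBIDDEN_KEYWORDS: Dict[str, Set[str]] = {
    "Bacteria": {"Nucleus"},
}


def check_keywords(taxon_group: str, keywords: List[str]) -> List[str]:
    """Return one error message per keyword that conflicts with the taxon."""
    forbidden = FORBIDDEN_KEYWORDS.get(taxon_group, set())
    return [
        f"keyword '{kw}' is not expected for a {taxon_group} protein"
        for kw in keywords
        if kw in forbidden
    ]


errors = check_keywords("Bacteria", ["Nucleus", "ATP-binding"])
```

Returning a list of messages rather than raising on the first hit lets an integration step report every conflict in an entry at once.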
| |
| | |
| Q: Isoforms?
| |
| | |
| A: linked to parent ID - ACCESSION_#
| |
| | |
| Q: Connection between references and items.
| |
| | |
| A: Findable in the XML. This is being retrofitted to older entries.
| |
| | |
| Q: What is the unit of annotation - Genes, isoforms?
| |
| | |
| A: Isoforms yes. Not yet things like cleavage products, but should be in the future.
| |
| | |
| Break
| |
| | |
| == Binding documentation ==
| |
| * '''Chairs: Ruth Lovering and Ursula Hinz '''
| |
| * '''Minutes''': Jim Hu - Damien Lieberherr
| |
| * '''Working group''': [[2010_GO_camp_working_groups_composition]]
| |
| * '''Working group notes''':[[2010_GO_camp_binding documentation issues| Binding documentation issues]]
| |
| | |
| Binding has been discussed at three consortium meetings.
| |
| === Current guidelines ===
| |
| Ursula:
| |
| * Binding biological entity (not today)
| |
| ==== Macromolecules (proteins) ====
| |
| ** specific proteins vs. protein classes vs. protein domains
| |
| * GO:0005515 must be used with IPI, and a reciprocal annotation should be made.
| |
| * Use child terms
| |
| * Evidence
| |
| ** IPI for specific proteins
| |
| ** IDA for classes of protein
| |
| * Propagation
| |
| ** GO:0005515 should not be propagated via ISS.
| |
| ** propagation of child term annotations is OK
| |
| * Do not use NOT with GO:0005515
| |
| * NOT with child terms is OK.
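Illustration (not from the meeting): the protein binding rules listed above could be checked mechanically along these lines. This is a sketch under assumptions: the function name is hypothetical, "must be with IPI" is read as "the evidence code must be IPI", and the reciprocal-annotation rule is not checked.

```python
from typing import List, Sequence

PROTEIN_BINDING = "GO:0005515"


def check_binding_annotation(
    go_term: str, evidence_code: str, qualifiers: Sequence[str] = ()
) -> List[str]:
    """Return guideline problems for one annotation (empty list if none)."""
    problems: List[str] = []
    if go_term != PROTEIN_BINDING:
        # Child terms: ISS propagation and NOT are acceptable.
        return problems
    if evidence_code == "ISS":
        problems.append("GO:0005515 should not be propagated via ISS")
    elif evidence_code != "IPI":
        problems.append("GO:0005515 should be annotated with IPI evidence")
    if "NOT" in qualifiers:
        problems.append("do not use NOT with GO:0005515")
    return problems
```

For example, `check_binding_annotation("GO:0005515", "ISS", ["NOT"])` reports two problems, while the same evidence and qualifier on a child term (any other GO ID in this sketch) reports none.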
| |
| | |
| ==== Small molecules ====
| |
| * avoid redundant annotation of substrates, including transporter substrates
| |
| ** e.g. ATP binding for ATPases (exceptions where hydrolysis not shown)
| |
| ** Example DNA demethylase/dioxygenase
| |
| *** Are annotations to alkylated DNA binding, O2 binding etc. redundant?
| |
| ==== Discussion ====
| |
| | |
| Q: protein binding - evidence that it does not bind a specific protein. Need a new GO term?
| |
| | |
| A: No. Use column 16 or create new GO term. Still in discussion. GO terms if the proteins can be put into groups. Don't want specific protein terms.
| |
| | |
| Q: What is wrong with having 25K GO terms?
| |
| | |
| A: Does it matter? May be able to do all PRO classes. Instantiate as needed.
| |
| | |
| Comment: NOT terms. IntAct only annotates negative interactions for isoforms where a different isoform has a positive interaction. Negatives are not exported to GO.
| |
| | |
| Judy summary: discussion of whether we are going to instantiate lots of protein binding terms. PRO families could be used for terms. Column 16 could be used for NOT and specific isoforms.
| |
| | |
| Emily: some things are not well captured by GO.
| |
| | |
| === Annotation extension discussion ===
| |
| Ruth
| |
| * Annotation extension = column 16
| |
| * Should only be used for direct targets.
| |
| * Examples
| |
| ** Co-IP. Lnx-I and Boz. Use two txn factor binding annotations with IPI and with for partner.
| |
| **:Q: Do we need exp evidence that (e.g.) Boz is a txn factor?
| |
| **:A: curator judgement at present. Rama: SGD would read the paper and check other annotations of Boz, not just rely on the assertion in the paper. The same paper does not have to show Boz is a txn factor. Ruth: in humans, would use sequence analysis, e.g. domains. Actually SGD doesn't annotate protein binding.
| |
| | |
| Paul: Annotations for the target must exist somewhere. Does this create redundancy to annotate binding to proteins of function X where target has function X?
| |
| | |
| Jane: Won't always be function terms. e.g. LIM binding domain binding.
| |
| | |
| Ruth: GOC still needs more discussion.
| |
| | |
| Judy: no inconsistency in what SGD does and what Ruth does. Annotations are consistent but SGD chooses different annotations to make. MODs bring specific special experimental strengths. This is a difference, not an inconsistency.
| |
| | |
| Mike L.: Biogrid curation does a lot of this. How much can be transferred? Ruth: more on this later.
| |
| | |
| * Column 16 example: Lnx-1 ubiquitinates Boz but not Gsc.
| |
| ** Annotation. Lnx-1 has ubiquitin-protein ligase activity IDA Col 16:Boz
| |
| ** Annotate protein ubiquitination IDA w/o target.
| |
| Q: problem of propagation across species. Col 16 identifier is species-specific.
| |
| | |
| A: Transferring from human to mouse. Use col 16 or not?
| |
| | |
| Propagation of Col 16 raises issues. Binding across species requires additional discussion. Should the column 16 identifier point to a class? Should column 16 be transferred in an ISS transfer? More discussion!
| |
| | |
| Q: is this redundant annotation of enzyme substrates?
| |
| | |
| A: No, we are doing substrate binding if the GO term does not provide the information.
| |
| | |
| Judy: knowledge statements vs description of the experiment.
| |
| | |
| Jim: column 16 post composition is equivalent to creation of a precomposed term, so ISS should be allowed (as appropriate, depending on whether the 16 ID is a class vs a specific product).
| |
| | |
| Paul: Think in terms of how we will do this with PAINT. We are annotating to ancestor nodes.
| |
| | |
| Comment: is the discussion generalizing? More general solution is to associate records with an external reference. Relational structure problem. In terms of binding let the protein interaction databases handle these.
| |
| | |
| Several people suggest that we should not have terms like "txn factor binding".
| |
| | |
| Ruth: Quick summary
| |
| * Use with term with IPIs if the GO term definition does not provide information
| |
| * Use column 16 for target
| |
| * In disagreement about propagation of column 16 by ISS
| |
| * Ideally, use info from with or col 16 to make inferences about the function of the protein. Other functions could come from other annotations of the target.
| |
| | |
| Kimberly: this has major implications for display. Keep the more specific terms (at least for now).
| |
| | |
| Ruth: enumeration of the kinds of targets could make things less clear.
| |
| | |
| ==== When not to use Col 16 ====
| |
| * For indirect targets
| |
| * FGF2 -> receptor -> phosphorylation of Erk2 goes up. Erk2 is NOT a direct target of FGF2. Activation goes via Ras.
| |
| | |
| ==== Col 16 relationship ontology ====
| |
| * has_participant
| |
| ** has_input
| |
| ** has_output
| |
| Relationships go along with the ID in Col 16.
| |
| | |
| Usage
| |
| * Lnx-1 is_a ub protein ligase IDA has_input Boz.
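Illustration (not from the meeting): a minimal sketch of how the small relation hierarchy above might be used when querying column 16 entries, so that a query for has_participant also finds has_input and has_output targets. The function names and the tuple representation of an extension are assumptions for this sketch.

```python
from typing import Dict, List, Optional, Tuple

# Parent of each relation; has_participant is the root of the hierarchy.
RELATION_PARENT: Dict[str, Optional[str]] = {
    "has_participant": None,
    "has_input": "has_participant",
    "has_output": "has_participant",
}


def is_a_relation(relation: str, ancestor: str) -> bool:
    """True if `relation` equals `ancestor` or descends from it."""
    current: Optional[str] = relation
    while current is not None:
        if current == ancestor:
            return True
        current = RELATION_PARENT.get(current)
    return False


def targets(extensions: List[Tuple[str, str]], relation: str) -> List[str]:
    """Targets whose relation is `relation` or one of its sub-relations."""
    return [target for rel, target in extensions if is_a_relation(rel, relation)]


# Column 16 content for the Lnx-1 ubiquitin-protein ligase annotation.
lnx1_extensions = [("has_input", "Boz")]
```

With this data, `targets(lnx1_extensions, "has_participant")` finds Boz, while a query for `has_output` finds nothing.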
| |
| | |
| ==== Col 16 and CHEBI ====
| |
| Example: steroid hydroxylase.
| |
| * CYP11B2 is_a steroid hydroxylase activity IDA has_input CHEBI:16827 Corticosterone
| |
| * CYP11B2 is_a steroid hydroxylase activity IDA has_output CHEBI:27584 Aldosterone
| |
| Where to draw the line with respect to specificity continues to be an issue for discussion.
| |
| | |
| Kimberly: Connections between CHEBI IDs and process terms: how will these be handled by GO? Will CHEBI IDs in the function ontology propagate to process terms?
| |
| | |
| | |
| IPI and catalytic activity. Deprecate these?
| |
| * Rama: in SGD these came from combination of IPI and IMP evidence (Editorial comment: this is because SGD doesn't do GO:0005515).
| |
| | |
| Binding is not sufficient to infer activity by itself. GO does not capture multiple experiments in a single annotation. This is a general problem.
| |
| | |
| Judy: rules are made to be broken. (!)
| |
| | |
| Interaction with the IMEx consortium.
| |
| | |
| ==== Survey responses ====
| |
| See slides.
| |
| | |
| = Day 2 Morning =
| |
| == Binding continued ==
| |
| === Summary of ontology development ===
| |
| Chris Mungall presentation postponed from yesterday
| |
| * addition of has_part for reactions
| |
| * True path issues and propagation
| |
| Rules for has_part are more difficult than for other propagation.
| |
| | |
| Example G capable_of ATPase activity -> G capable_of ATP binding
| |
| | |
| * Materialize relationships at central location
| |
| | |
| Workflow:
| |
| * Curator annotates to ATPase activity
| |
| * GAF pipeline materializes ATP binding using same EC
| |
| * Reimport allows a query against ATP binding to recover ATPases etc.
| |
| ** Q: does redundancy of annotation raise issues? Probably not?
| |
| * Alternatives
| |
| ** Navigation via CHEBI too complex.
| |
| ** is_a between ATPase activity and ATP binding
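Illustration (not from the meeting): the materialization workflow above could be sketched as follows. The curator annotates only to ATPase activity; a pipeline step adds the implied ATP binding line, reusing the original evidence code and reference. The record layout, the `IMPLIED_TERMS` table, and the placeholder reference are assumptions; the real GAF pipeline is not shown here.

```python
from typing import Dict, List, NamedTuple


class Line(NamedTuple):
    gene_product: str
    go_term: str
    evidence_code: str
    reference: str


# Hypothetical table of terms implied by another term (e.g. via has_part).
IMPLIED_TERMS: Dict[str, List[str]] = {
    "ATPase activity": ["ATP binding"],
}


def materialize(lines: List[Line]) -> List[Line]:
    """Return the input lines plus any implied lines not already present."""
    result = list(lines)
    seen = set(lines)
    for line in lines:
        for implied in IMPLIED_TERMS.get(line.go_term, []):
            # Same gene product, evidence code and reference; new GO term.
            new_line = line._replace(go_term=implied)
            if new_line not in seen:
                seen.add(new_line)
                result.append(new_line)
    return result


curated = [Line("G", "ATPase activity", "IDA", "PMID:0000000")]
expanded = materialize(curated)
# A query against ATP binding now recovers the ATPase.
atp_binders = [l.gene_product for l in expanded if l.go_term == "ATP binding"]
```

Running the step twice adds nothing new, so redundant lines do not accumulate when the expanded file is re-imported and processed again.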
| |
| | |
| Automated population of ontology using intersection_of terms ... has_input + has_output
| |
| | |
| Ontology will contain information to relieve annotators of making redundant annotations.
| |
| | |
| Q: why not have a necessitates relationship. e.g. ATPase necessitates ATP_binding. That's an alternative (different from the is_a alternative?)
| |
| | |
| Q: How will the chain of evidence work for the materialized ATP binding added to the GAF? A: original EC, reference, and ...?
| |
| | |
| Q: Look at other ontologies, e.g. txn factors. A: Don't want txn factor as a child of binding.
| |
| | |
| Q: is materializing a permanent solution? A: See later discussion.
| |
| | |
| ==== Extended GO ====
| |
| | |
| * Problem: software development assumes a prior version of the GO structure
| |
| * Links are only in GO_ext files.
| |
| * Future: more links. Software will have to catch up.
| |
| * Materialization service for function to process links
| |
| | |
| ==== Column 16 ====
| |
| * Want to limit precomposition
| |
| * Annotate as if relationships are there
| |
| | |
| Syntax:
| |
| relation (class)
| |
| | |
| * When to request new term vs use col16 - would the term make sense in an enrichment analysis
| |
| * Reasoner can find equivalent terms if they exist, and materializer will add lines to the GAF.
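Illustration (not from the meeting): a minimal parser for the `relation (class)` syntax noted above. It is a sketch assuming one relation and one class per entry, with optional whitespace around the parentheses; the function name and the regular expression are not part of any GO specification.

```python
import re
from typing import Tuple

# One relation name followed by a single parenthesised class identifier.
_EXTENSION = re.compile(r"^\s*(\w+)\s*\(\s*([^()]+?)\s*\)\s*$")


def parse_extension(text: str) -> Tuple[str, str]:
    """Split 'relation(class)' into (relation, class)."""
    match = _EXTENSION.match(text)
    if match is None:
        raise ValueError(f"not in relation(class) form: {text!r}")
    return match.group(1), match.group(2)
```

For example, `parse_extension("has_input(CHEBI:16827)")` yields the relation `has_input` and the class `CHEBI:16827`; text without parentheses raises `ValueError`.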
| |
| | |
| ==== Column 17 ====
| |
| Isoforms. No time to discuss
| |
| | |
| ==== Discussion ====
| |
| * Extensions provide greater expressivity
| |
| * Possibility of expressing things different ways, but reasoner can link synonymous annotations made in different ways by annotators.
| |
| Q: relationship matrix? A: this exists in part
| |