SLC GO Consortium Meeting Minutes April 2008: Difference between revisions

(New page: '''GO Consortium Meeting Minutes Tuesday''' '''''ACTION ITEM:''' GO Top needs to sign off on agenda before the GOC meeting & Ref Genome meetings prior to these meetings. If there is no ac...)
 
mNo edit summary
 
(51 intermediate revisions by 11 users not shown)
'''GO Consortium Meeting Minutes'''
'''Tuesday'''


'''''ACTION ITEM:''' GO Top needs to sign off on agenda before the GOC meeting & Ref Genome meetings prior to these meetings. If there is no action item stated, this is not the forum for the item.''


= Introduction & Review of Agenda =
 
Judy – Intro
 
New people introduced – Siddhartha Basu (dictyBase), Nomi Harris (BBOP), Ramana Madupu and Scott Durkin from JCVI
 
Welcome back Jane (EBI)
 
= GO Reference Genome Annotation Team Report (Pascale) =
 
Objectives of the Project
Goal: comprehensive annotation of 12 genomes


Currently:
394 genes in target list – from disease genes, hot genes, metabolic pathways, unannotated gene lists.


2 methods being investigated for ortholog determination: PPOD (Kara) & tree-based (Paul Thomas, SRI).
Most groups have looked at all the target genes and determined orthologs for their species.


Ref Genome Process:


1. generate focal sets – have been doing manually but will move to using Kara & Paul’s data.
1. generate focal sets
This has been a difficult process since some of our genomes are not included in the common ortholog resources.
a. P-POD (Kara Dolinski, Princeton)
b. Tree-based method (Paul Thomas, SRI)


2. select common curation targets


New curation priority is to annotate the families generated by PPOD where there is at least 1 member from each Ref Gen species – there are 150 of these.
New curation priority is to annotate the families generated by PPOD where there is at least 1 member from each Ref Gen species – there are 153 of these sets.


Rex: focus on the clusters that have 1 in each species (rather than multiples) as these will be the least confusing.


QC issues – need QC at each step in the process. For GO annotation, look at outliers, check co-occurrences of annotations. For orthologs,  manual verification of ortho sets.
b. Methods
i. Number of genes with ‘no data’ that have data in other organisms
ii. Graphical view displays ‘outliers’
iii. Looking for co-occurrences of annotations at a high level (confusion matrices)
iv. Verification of ortho sets


Rex: will monitor to see if we are making improvements each time we do manual QC.
Proposed Changes to the GA File:


1. column 2: longest version of the gene/protein ID
a. Column 2: longest version of the gene/protein ID (gene preferable, protein acceptable; other objects such as complexes are also accepted)


2. new column (17): ID of the object being annotated. Will be different from column 2 in the case of isoforms
b. New Column (17) ID of the object being annotated. Will be different from column 2 in the case of isoforms


3. column 12: will refer to column 17
c. Column 12 would refer to column 17


4. new header information to be added to the file regarding gene products annotated and expected no. genes in organism
d. Header of the GAF will have a standard sentence that says how many gene products are annotated and the expected total from that organism
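
The proposed column changes can be sketched in a few lines of Python. This is only an illustration, assuming a tab-delimited annotation line in which the proposed column 17 may or may not be present; the function names, the database name and the IDs are made up for the example and are not part of the proposal.

```python
from typing import List


def split_gaf_line(line: str) -> List[str]:
    """Split one tab-delimited annotation line into columns.

    The minutes number columns from 1; the returned list is 0-based,
    so column 2 is cols[1] and the proposed column 17 is cols[16].
    """
    return line.rstrip("\n").split("\t")


def annotated_object_id(cols: List[str]) -> str:
    """Return the ID of the object being annotated.

    Uses the proposed column 17 when it is present and non-empty
    (for example an isoform), otherwise falls back to column 2.
    """
    col17 = cols[16] if len(cols) >= 17 else ""
    return col17 or cols[1]


# Hypothetical rows: a 15-column gene-level line and a 17-column isoform line.
gene_row = ["EXAMPLE_DB", "GENE0001"] + [""] * 13
isoform_row = gene_row + ["", "GENE0001-ISOFORM2"]

gene_cols = split_gaf_line("\t".join(gene_row) + "\n")
isoform_cols = split_gaf_line("\t".join(isoform_row) + "\n")

print(annotated_object_id(gene_cols))     # falls back to column 2
print(annotated_object_id(isoform_cols))  # uses the proposed column 17
```

Keeping the fallback in one helper means existing files without column 17 would still be readable under this sketch.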


=> Chris: more information about this is now on the wiki.
1. should only contain one version of each gene sequence


2. ---
2. ---? Pascale?


=> more information about this on the ref genome wiki.
Software development discussed during Ref Genome meeting:


1. reference genome tracking tool to replace google spreadsheets
1. Siddhartha, Chris, Seth, Mary – database and tool where target genes and orthosets and their curation status will be maintained


2. graphical display to look at annotation data
2. Will replace Google spreadsheet


3. integration of ref genome genes into Amigo for next release.
3. Graphical displays- several improvements


4. Integration of ref genomes genes into AmiGO


= Ontology Development =


'''3.1. Report on accomplishments (Midori)'''
March 25: regulates relationships go live!


Outcomes:
Summary of Work
 
i. Biological process ontology was improved in several areas


1. BP ontology improved – new terms, definitions
ii. New relationships portray the biology correctly


2. New relationships portray the biology correctly
Regulates
 
Positively regulates
 
Negatively regulates
 
iii. New logical definitions allow automated consistency checks as the ontology is developed
 
Nightly report in update file
 
Will be viewable in OBO-Edit 2
 
Semantic parse – new bug report


3. New logical definitions allow automated creation of error reports as ontology is changed.


Next:
1. intersection tags: cross products between the GO ontologies


2. intra- and inter-ontology links for regulates relationships: occurrents can regulate occurrents
2. Inter- and intra-ontology links


MF can regulate MF, BP can regulate BP BUT ALSO MF can regulate BP (and vice versa)
Will change MF from being a strict is_a ontology
- Add regulates links between function and process
- Question: how to make relationship between ‘transcription regulator activity’ in MF to either ‘transcription’ as process with regulations
- Sue Rhee mentions work by Peter Karp on relationships in Cyc databases – has about 2000 relationships


These are ready to go now but will put out notifications first.


Metabolism – intermediate regulation parents filled in manually in one section; moving towards computational analysis to find these terms for larger areas of the ontology followed by biocurator review before adding new terms.
Signal transduction – can say that signal transduction is_a regulation of cell communication.  
 
Signal transduction
 
- ‘ultimately effecting a change in the functioning of a cell’ – sounds like regulation activity to Tanya…could make a regulation process…
 
– can say that signal transduction is_a regulation of cell communication.  
Will be starting a push to revamp signal transduction: need definitions with a start point and an end point. Beginning of signal transduction is ligand-receptor binding then the rest is a cellular process.
Finding general BP that signal transduction can fit into as a regulation.
EG: biological objective of BMP signaling pathways is to regulate transcription. Where it is now, isn’t connected to transcription
4. QC: continue reviewing QC reports that will be run regularly


'''3.3. Function-Process Links (Harold)'''

1. Considerations on Glycolysis and TCA Cycle
First thing is to define start and end points for glycolysis. Made stop point at pyruvate so that didn’t have to consider aerobic/anaerobic processes - same process of defining start and end points for the Krebs cycle.




Peter: reaction can be slightly different depending on the outcome. Should we look at the pathway as the common element for all processes or should we look at these as separate processes that have separate purposes (with overlap). The latter will become enormously complex as we start considering diverse species.
David & Midori: Historically, BPs are defined by their objective, therefore it makes sense to take the latter view. Will require parent terms to collect these processes. IF different gene products are involved, it should be defined as a different process.


'''3.4. Electron Transport Cross-products (Jen)'''

Looking at electron transport region and representing BP & MF cross-products.
Developed lists of MFs, their BP and a taxonomy group that it applies to
Used this to make has_part relationships: BP can’t exist without MF in this taxon
Put has_part relationship to put BP term under MF term (at the moment there are difficulties visualizing this).
Taxon issue – how do we represent that these relationships should be qualified by taxon? It is possible to have a general parent term with several child terms to represent the different taxa.


In order to create links between BP and MF, we are going to need the new relationship, has_part. At the moment, need to come up with a way of adding this relationship into the graphs (violates expectations – makes syntactical sense but is harder to understand for biologists).
Judy: Should more of these projects be undertaken at this point?
Jen, David, Harold: These projects should continue, improve ontology.
David: Bottom line – needs ‘has_part’ relationship to relate function to process
Alex: has_part changes expectation of directionality of ontology; how do you maintain directionality in parallel?
Need feedback from users – Chris can make a separate version of AmiGO that includes these relationships.
Michael: need careful documentation for users.
David: needs to be presented at conferences to get it out there for comment, get users used to it.


'''''ACTION ITEM''': Nomi to fix OBO_Edit to view occurrents?''
 
'''''ACTION ITEM''': Jen and Harold to continue with this process, making improvements to ontology as they do.''


'''''ACTION ITEM''': Everyone else to look at these relationships and give Chris feedback. Files available at OBO Edit scratch page.''


'''3.5. Cross Products between GO BP & CC (Chris)'''
OBO 2 has ways of looking at cross products – intersection editor; files for CC-BP are available in the scratch file.


To capture what is happening with the BP, need a number of new relationships:
 
- Results_in_structural_change_to
 
- Results_in_formation_of
 
- Results_in_connection_of
 
Found inconsistencies between ontologies - Idea is to use the intersection editor to keep track of consistency issues between BP & CC


Cross Product timeline:
1. Move cross-product files into OBO edit.
 
2. Then move cross-products into main ontology file – dependent upon new release of OBO edit.
1. Formalize cross-product relationships
3. Integration of Cell Line (CL) cross-products into edit cycle – no plan at this time to integrate into the main ontology file.
 
2. Integration of internal XPs into main ontology file
 
- Introduction of ‘intersection tags’ into ontology itself
 
- This is purely an augmentation- so won’t break software
 
3. Focus on cell cross-products
 
- Have already generated QC with SF tracker…
 
– no plan at this time to integrate into the main ontology file.


Working on how to display cross-products in AmiGO. Example: AmiGO with GO & Mouse anatomy; heart development.


'''''ACTION ITEM''': Need feedback from users on AmiGO display.''
 
=  Outstanding Issues, session 1 (Midori) =


Proteases – are there any objections to a reorganization of MF relating to protease activity? Proposal on wiki for review.
NOTE: If we do it, many protease terms will become obsolete on the basis that they are gene products rather than functions.


Peter: Proteases can be distinguished generically by structure: (1) active sites (4 different types) and (2) where in the target peptide they cut (3 types). This would give us 12 MF terms. Would this cover all proteases?
Need cases from the group to see if lumping the annotations into one of these 12 terms would lose information.
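
The count of 12 follows from crossing the two axes Peter describes (4 active-site types by 3 cut positions). A minimal sketch, assuming placeholder labels because the minutes do not name the individual types; the label strings and the term-name pattern are purely illustrative:

```python
from itertools import product

# Placeholder labels: the minutes give only the counts (4 and 3), not the names.
ACTIVE_SITE_TYPES = ["active_site_A", "active_site_B", "active_site_C", "active_site_D"]
CUT_POSITIONS = ["cut_position_1", "cut_position_2", "cut_position_3"]


def candidate_terms(site_types, positions):
    """Build one candidate MF term label per (active site, cut position) pair."""
    return [
        f"{site} / {position} protease activity"
        for site, position in product(site_types, positions)
    ]


terms = candidate_terms(ACTIVE_SITE_TYPES, CUT_POSITIONS)
print(len(terms))  # 4 * 3 combinations
for term in terms:
    print(term)
```

Listing the combinations this way makes it easy to check each existing protease annotation against the grid and spot cases where information would be lost.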


Midori: there are some MF terms that are clearly differentiated from other MF terms. These terms should be used.


ACTION ITEM: continue to investigate the protease terms to remove concerns about terms that reflect activity of single gene products.
'''''ACTION ITEM''': Peter and Midori – use Peter’s orthogonal activity to merge terms when there is no distinction functionally.''
 
= Advocacy/Outreach/Collaborations (Jen/Jane) =
 
'''5.1. Help Desk (stats report) and newsletter'''


Most queries to GO help desk answered within 24h.
Newsletter released quarterly.
Web presence working group (formerly AmiGO WG) – advocacy group will determine features then the AmiGO WG to implement.
AmiGO future features:
Ref genome display
Wiki style user annotation – GONUTS
AmiGO web services
How to prioritize development and implementation of AmiGO features under discussion.


'''5.2. Annotation Outreach'''

Not cold calling dbs any more – all have been contacted.
TAIR have made agreement with journal to accept annotation by submitting authors & discussion at PAG 2008
Reactome have developed a ga file?
Muscle annotation wiki developed
Sol Genomics ga file submitted


= AmiGO, GO Database =

'''6.1. Software group progress report (Seth)'''

GO Term enrichment
AmiGO 1.5:
– new advanced search page & improved search functions
– search function is extended if no result found; may need to add a limit to this search function
- results sorted by relevance
- pie charts have been replaced with bar graphs
- regulates relationship icons have been added
- now have GOOSE links back into AmiGO

Next release of AmiGO (1.6):
- ref genome support is main focus
- display Mary’s graphs with pan & zoom function, more interactive (highlight direct vs indirect, species, ISS only, etc.)
- display of homolosets on AmiGO
- display of intersections between GO terms (bi-axial viewer)
- links to and from GONUTS to make it easier for people to leave comments & for biocurators to see these comments


'''6.2. Reference Genes - DB management extensions'''

Populating the database with NCBI taxonomy trees – use to filter queries

Added support for curation of ref genome curation sets

'''''ACTION ITEM''': Ben & Mike – get isoforms into GO database''


Annotation Cross Products:
wiki page now for cross product annotation will have case studies added for next GO meeting.
Using column 16 to capture cross-product information?
Question: will we need another qualifier to capture this refining information or do we use a separate column (column 16)?
Column 16 is used for refining the information that is captured, e.g. a particular CL for a BP
The relationship is between whatever the GO term is and the refining object.


= SO (Karen E) =
Since Princeton have been working on SO tracker items – 16 items closed.
 
Since Princeton have been working on SO tracker items – 50 items closed.
 
Working to sort out sequence attributes of SO: looking at annotator consistency of SO attributes.
Go through list of sequence attributes and divide into BFO classes: quality, disposition, function, role (from Barry’s definitions). This is done individually, with comments on why the decision was made. Then results pooled and analyzed statistically to determine biocurator agreement. Iterative process.
Doing this to make SO attribute classes that reflect BFO.
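
The minutes do not say which statistic is used to measure biocurator agreement. As one possible illustration, the sketch below computes Cohen's kappa for two curators who each assigned the same attributes to a BFO class; the attribute assignments are invented for the example.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two curators labelling the same items.

    kappa = (observed agreement - chance agreement) / (1 - chance agreement)
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both curators must label the same non-empty item list")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    # Chance agreement: probability both pick the same class independently.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both curators used a single identical class throughout
    return (observed - expected) / (1 - expected)


# Invented assignments of four sequence attributes to BFO classes.
curator_1 = ["quality", "quality", "role", "function"]
curator_2 = ["quality", "role", "role", "function"]

kappa = cohen_kappa(curator_1, curator_2)
print(round(kappa, 3))
```

With more than two curators, a pairwise average or a multi-rater statistic would be needed; this sketch covers only the two-curator case.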


= Progress with OBO-Edit (Nomi) =

OBO-Edit 2 improvements:
Documentation and installer
I/O & command-line options
functionality
GUI look and feel
Priorities for release of OE2:
memory/speed issues
restore all functionality that was present in OE1
bug fixes in new components e.g. graph editor
documentation
new feature requests
Michael Schroeder's plug-in – link GOPubmed to OBO Editor?


= Overall Project Management =

'''9.1 Structure of Management Groups'''

What is working (and What is not)?
Wiki is too Kafka-esque: many pages are not linked properly and are hard to find. Could be using wiki categories more.


GO Top needs to review draft GOC and Ref Genome agendas as a group.


Judging by progress, things are working pretty well but need to balance overhead by biocurators.


GO Manager calls: not enough time to resolve issues brought up. On some calls, one issue has been discussed for most of the call and haven’t moved on to other calls. Is it necessary to get the update from each group every 2 weeks? Calls should de-emphasize progress reports, but focus on the things stopping progress.


'''9.2 Identify communication bottlenecks and time-sinks, knowing who is responsible and time estimates'''


Does everyone know who to contact to make progress on a road block? Yes, know who to go to but not sure if there will be a response.
One roadblock is to post a source forge item and then not hear back from anyone. Biggest problem with closing items is being unable to make progress because of lack of information. Suggestion is to remove the SF item if there is no response after a designated time.


= Outstanding issues (continued) =

'''10.1 Discuss Establishment and Maintenance Terms'''
 
Proposal: localization has two parts:
 
1. Establishment of localization
 
2. Maintenance of localization: make this a regulation of biological quality. It doesn’t mean maintaining the process of moving something to a location, but rather maintenance of location.

'''''ACTION ITEM''': change to maintenance of location and add parentage under 'regulation of biological quality'.''
 
'''10.2 Discuss IMP and the with column.'''


Guidelines for what goes there - phenotype ID, genotype ID

'''''ACTION ITEM''': Chris will be talking to individual groups about how they use the with column for IMP. Each MOD group needs to respond to Chris on this.''

'''10.3 Discuss how we are going to handle 'response to drug''''

SF 1242405 and 1494526 and 'response to toxin' SF1658374. Are they normal biological processes? Response to X (drug, toxin) Response to Chemical (see also Use of Response To Terms in Annotation, for a related issue)(David and Tanya)
1. We keep ‘response to drug’, add response to ‘XX’ response by role chemical is playing would be in ontology with co annotation to chemical. We keep stub of ‘response to drug’, no children.  Roles are basically stub terms in ontology
Transport of drug<br>
Degradation of drug<br>
<pre>
2. Response to Chemical [ ID as Chebi ID ultimately]
Response to Cocaine[XX]
Response to Cocaine[XX] as a Drug
</pre>
sometimes things are toxins, sometimes not; same as drug
similar to situation in PAMGO with ‘pathogenesis’, not always deleterious
Drugs would be put in the chemical organization.
Note added Day 2: Harold's proposal:<br>
annotate to response to X with whatever evidence code<br>
annotate to response to drug with IC<br>
some discussion of this proposal, qualifiers, other gp-term rels, etc.
this is effectively a refinement of yesterday's proposal
'''''ACTION ITEM''': Implement Michelle's proposal; decide whether to put 'response to drug' ID in column 16 or as a separate IC annotation. Annotate to chemical term ‘response to cocaine’, co-annotate with chemical term for now, then later when available, put GO ID for ‘response to drug’ in column 16 (or separate IC annotation).''
'''10.4 Chemical derivatives and metabolism terms:'''
How should we handle derivative compounds in the metabolism parts of GO--should we continue to include metabolism of X derivatives as is_a descendants of X metabolism? It can be confusing. For example, SFs 1885151 and 1847808 note that 'gamma-aminobutyric acid metabolic process' is a descendant of 'fatty acid metabolic process' because GABA is a derivative of butyric acid. (Midori and Val)
Resolution: GO should follow ChEBI, but use only is_a relationships from ChEBI; ignore all other ChEBI relationship types (including has_functional_parent, which will solve the derivative problem).
'''''ACTION ITEM''': Midori and David will deal with looking at these terms; Chris will help. We'll also approach Mike Bada, because he's been looking at GO and ChEBI terms for a while.''
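
The effect of the resolution (follow only is_a links, ignore has_functional_parent and other relationship types) can be illustrated with a small graph walk. This is a sketch on a toy edge list written for this example, not data copied from ChEBI; the function name is hypothetical.

```python
def is_a_ancestors(term, edges):
    """Collect every ancestor reachable from `term` via is_a edges only.

    `edges` is an iterable of (child, relationship, parent) triples;
    all relationship types other than is_a are ignored.
    """
    parents = {}
    for child, relationship, parent in edges:
        if relationship == "is_a":
            parents.setdefault(child, set()).add(parent)

    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# Toy edges for illustration only.
TOY_EDGES = [
    ("gamma-aminobutyric acid", "has_functional_parent", "butyric acid"),
    ("butyric acid", "is_a", "fatty acid"),
    ("gamma-aminobutyric acid", "is_a", "amino acid"),
]

gaba_ancestors = is_a_ancestors("gamma-aminobutyric acid", TOY_EDGES)
print(sorted(gaba_ancestors))
print("fatty acid" in gaba_ancestors)
```

Because the derivative link is a has_functional_parent edge in this toy graph, GABA no longer picks up 'fatty acid' as an ancestor, which is the behaviour the resolution is after for the corresponding metabolism terms.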
== Collaboration with other projects ==
Reports on new/ongoing collaborations with these groups:
PRO: Protein Ontology (Harold) - PRO being made to relate proteins to splice forms, variants, etc. Should be able to attach GO information to these forms.
Reactome - GAF done and ready to be submitted
GOA will be picking this file up. Reminder to check with Mike and make sure that this data is being picked up.
Panther: Will be running Ref Genome targets through Panther.
NCBO
MP-PATO
CARO
= Discussion Points =
1. Discuss: At one of our meetings Chris suggested we might keep an archive of gene sets used in publications that were analyzed using GO. Should we do this? (David)
This seems to be a research project.
Should the AmiGO WG provide a way for people to provide datasets?
= Review of Action Items from Princeton Meeting =
Day 1
1. Tutorial on wiki discipline (assigned to Jim Hu ?).
NOT DONE
2. (ALL) Look at and comment on outstanding items Outstanding Action Items from 17th GOC Meeting, Cambridge UK
DONE
3. Check whether there should be a relationship between pigment metabolic process and pigmentation
DONE
4. Jen: A reference to these pages should go in next newsletter.
DONE
5. Jen Add a link from outreach to something (SOP?)
DONE
6. investigate why term requests aren’t coming in, are there things we need to do to make it easier, SF tracker list/annotation list/ who are on these lists/ do other people need to be on those lists?
DONE
7. NML Michael sent ISSN URL to Eurie – Action Eurie!
DONE
8. e-mail Ben if you are not getting a gp2protein check for your database.
DONE
9. Somebody mentioned RSS feed, is this a potential action?
Web presence group to discuss
10. Reactome annotations should be available from GO by the next GO Consortium meeting. Chris, Alex, Jen and Ruth to be responsible. # Add new evidence code EXP for 1:1 Reactome to literature, add all other Reactome with TAS to Reactome source.
DONE
11. Convert Reactome complex terms to GO terms
Some progress: review at next meeting.
12. Jen to do a pilot project with a minimal set of terms, as an experiment and bring back results for next GO meeting
DONE
13. (David Hill) Make difficult sensu terms organism specific (biologist intuitive) (e.g. plant vacuole, fungal vacuole). However, GO definitions will still be designed to be formal, not depending on species to define the term.
This needs to be moved to a SourceForge item.
Day 2 Action Items
1. (David Midori Seth) Deploy the part that created SF items based on a friendly webform, and would like to see an OBO format in the SF item
In progress
2. Seth, ORB: Make link to how to make a perfect GO term from the term request tool
In Progress
3. Amelia link GOOSE from front page
DONE
4. DH: Cross products: need to have a WebEx meeting so everyone understands what to do.
DONE
5. OBO file renaming. JB: add a link to Wiki: http://gocwiki.geneontology.org/index.php/Versioning_Proposal On the best practises page: http://gocwiki.geneontology.org/index.php/Best_Practices
DONE
6. Web presence WG work on specification needed for new AmiGO features.
Ongoing
7. Gene Association files: to work on a more advanced interface to download custom files (Chris)
DONE
8. Gene Association files: to filter files as they come in. (Mike)
No longer relevant?
9. Judy: Predictive Activities. Collaborations with external groups. Reports into next GOC meeting as to these kinds of activities.<br>
AND <br>
10. Jim: Suggested Making a repository for predictions POSSIBLE ACTION ITEM?
Pending
11. Finalizing proposed evidence code documentation – abbreviated version on web pages and more detailed on GOC Wiki (Rama)
Need to make sure that the evidence code ontology reflects the documentation
Also, need to make sure that there is a tracker for this.
12. Eurie: querying communities on awareness of evidence codes – do you know what it is, what do you use it for? Also proposal of expanding, then get a feel for what would benefit them? So that we have a large audience.
PENDING
13. Sue, Michelle, Rama put evidence code proposal in the context of what we discussed today
DONE
14. Evidence code committee. Documentation for users and curators.
Curator - DONE
User documentation - not completed
15. Evidence code Revise evidence code documentation so that a mutation in only one gene can only be IMP (protein localization IGI example)
DONE
16. (Curators) Check whether you have used IGI in this way and update annotations
All groups to check this. David is DONE.
17. (Curators) 'with' column optional for NAS - document
DONE
18. Update evidence code decision tree in response to today's discussion on evidence code usage (Jen and EV Code WG)
Pending: check with Karen?
19. (Curators) only ND allowed to root nodes - clarify this in the documentation (Rama)
DONE
20. Karen E and Chris M will work on GO-SO cross products
In progress
= Taxon and GO terms (Jen) =
Details in slides: [http://cvsweb.geneontology.org/cgi-bin/cvsweb.cgi/go/meeting/minutes/20080420_Additional_Material/GO-Taxon_Links_Report.ppt]
Highlight 1: annotation errors uncovered in pilot project on term taxon "restriction"; vast majority are IEA
Highlight 2: proposal for separate file of term - rel - taxon, where rel is 'never in taxon', 'never outside taxon'
:useful for QC
:well received
Action item: Jen to implement system.
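The QC use of a term - rel - taxon file can be sketched as follows. This is only an illustration, not the system Jen is to implement: the taxon lineage, the restriction triples and the annotation tuples are all made up, and it assumes the two relations are 'never in taxon' and 'never outside taxon'. A real check would also need to propagate restrictions through the ontology, which is omitted here.

```python
# Hypothetical taxon lineage (child -> parent); names are illustrative only.
PARENT = {
    "mouse": "mammals",
    "mammals": "animals",
    "arabidopsis": "plants",
    "animals": "eukaryotes",
    "plants": "eukaryotes",
}

# Hypothetical term - rel - taxon triples, as in the proposed separate file.
RESTRICTIONS = [
    ("photosynthesis", "never_in_taxon", "animals"),
    ("lactation", "never_outside_taxon", "mammals"),
]


def lineage(taxon):
    """Return the taxon together with all of its ancestors."""
    result = {taxon}
    while taxon in PARENT:
        taxon = PARENT[taxon]
        result.add(taxon)
    return result


def violations(annotations):
    """Flag (gene, term, taxon, evidence) tuples that break a restriction."""
    flagged = []
    for gene, term, taxon, evidence in annotations:
        taxa = lineage(taxon)
        for r_term, rel, r_taxon in RESTRICTIONS:
            if r_term != term:
                continue
            # 'never in taxon': the annotated species must not fall under r_taxon.
            if rel == "never_in_taxon" and r_taxon in taxa:
                flagged.append((gene, term, taxon, evidence))
            # 'never outside taxon': the annotated species must fall under r_taxon.
            elif rel == "never_outside_taxon" and r_taxon not in taxa:
                flagged.append((gene, term, taxon, evidence))
    return flagged
```

Running such a check over each incoming annotation file would give a list of suspect annotations to review, grouped by evidence code if desired.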
= Annotation issues =
1. Evidence codes and propagating annotations (Emily; slides)
Main question was ISS (and subcodes) vs. IEA for certain methods; details in slides
Chris: this discussion shows that the evidence codes are way out of date. use evidence ontology so you can describe exactly what was done. (much nodding & agreement-ish muttering)
Judy - more detail on how MGI does their orthology sets
A long discussion ensued, which I (Midori) couldn't accurately transcribe. Points included (''not'' an exhaustive list):
*QC and general annotation practices for ISS and its subcodes
*Current documented definition of ISS
*Actual, precise meaning of "manual review" in ISS documentation
*what one can reliably infer from a sequence comparison or computational result
*Kara's proposal for IEA/manual orthogonal to method codes
There was no resolution at all, but lots of ideas.
2. Contributes_to qualifier
Val - ways it's been misused, e.g. annotating DNA pol alpha to primase, use with IPI, etc.<br>
:general agreement
'''''ACTION ITEM:''' Val will circulate draft doc on how contributes_to can & can't be used; will include: "Would this annotation make sense if this subunit was" ... [thought not finished; might be something like "viewed by itself"].''
'''''ACTION ITEM:''' Check existing annotations; we think contributes_to should only be allowed with IDA. Look into adding a check to the annotation checking script to flag contributes_to.''
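A check of this kind could look roughly like the sketch below. It assumes the standard tab-delimited gene association file layout (column 2 = object ID, column 4 = qualifier, column 5 = GO ID, column 7 = evidence code); the function name and the default rule of allowing only IDA are assumptions for illustration, not the existing checking script.

```python
def flag_contributes_to(gaf_lines, allowed_evidence=("IDA",)):
    """Return (object ID, GO ID, evidence) for contributes_to annotations
    whose evidence code is not in allowed_evidence."""
    flagged = []
    for line in gaf_lines:
        # Skip blank lines and '!' comment/header lines.
        if not line.strip() or line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 7:
            continue  # malformed row; a real script would report it
        # The qualifier column may hold several values separated by '|'.
        qualifiers = cols[3].split("|")
        if "contributes_to" in qualifiers and cols[6] not in allowed_evidence:
            flagged.append((cols[1], cols[4], cols[6]))
    return flagged
```

The flagged list is meant for curator review rather than automatic rejection, since the rule itself was still under discussion.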
= Next meeting =
looking at Oct 21-22 (or 21-23), Montreal
= ACTION ITEMS =
* Includes Reference Genome action items
{|border="1" cellspacing="0" cellpadding="4" align="center"
|-
! Status
! Responsible Party
! Task
! Comments
|-
| In Progress
| Documentation working group
| Document annotation SOPs
| Another factor we have been tracking is when a curator judges that the curation of a gene is ‘comprehensive’, that is, that it accurately represents the biology (irrespective of the number of papers available or read). This appears in the spreadsheets. The guideline is that when there are few papers, all papers should be read; when there are many (a curator can judge what is too many), then a review should be read to find the important primary literature and decide what information needs to be captured. We don’t keep track of whether or not reviews have been read. WormBase uses Textpresso (PMID 15383839), which helps ensure curators do not overlook information. The ‘comprehensive’ curation status doesn’t get invalidated when a newer paper is published; however, curators may (and are encouraged to) update the date when the newer literature is curated.
|-
| DONE
| Mike Cherry
| Reference genome annotation camp
| Held in summer of 2008
|-
|
| Chris Mungall
| Re-calculate with is_a only paths
|
|-
|
| Chris Mungall
| Re-calculate with experimental codes only
| generate several versions of the data classified by different evidence codes?
|-
|
| Chris Mungall
| Provide such reports on a regular basis
|
|-
|
| Judy Blake
| Contact NCBI/NLM/OMIM to link to reference genome genes
|
|-
| In progress
| Documentation working group
| Document Changes to Gene Association File (GAF)
| column 2 is canonical gene ID; column 17 is the thing you are annotating (always required); column 12 matches column 17 and contains SO IDs; add header to gene association file
|-
| In progress
| Documentation working group
| Document Changes to gp2protein file
| includes complete gene index (except for pseudogenes and transposons); column 1 is canonical gene ID; column 2 is accession for sequence of longest form of protein from UniProtKB or NCBI; syntax of gp2protein file will be provided by Mike and Chris
|-
| In progress
| (Jane)
| write notice of changes to GAF and gp2protein to users
|
|-
| In progress
| MODs + Ben Hitz
| make sure that their input matches new GAF and gp2protein requirements
|
|-
|
| Seth Carbon
| Have AmiGO show co-occurring terms
| similar to function in QuickGO.
|-
|
| Seth Carbon & Val Wood
| SLIM by SLIM matrix
| Would be used to review intersections of different cellular processes and look for unexpected intersections which may identify possible errors. Try first applying to function and component terms; outline cells that you expect to be empty. Have these matrices generated automatically from the AmiGO database.
|-
| DONE
| MODs and Paul Thomas
| PANTHER run with new gp2protein files
|
|-
| DONE
| Nomi
| Fix OBO_Edit to view new cross product relationships.
|
|-
|
| Ben & Mike
| Get isoforms into GO database
|
|-
| DONE
| Peter and Midori
| Use Peter’s orthogonal activity to merge terms when there is no distinction functionally
| see http://wiki.geneontology.org/index.php/Proteases for a summary; thanks also to Ben, Colin, and the MEROPS team
|-
| DONE
| <strike>Midori</strike> David and Tanya
| For Establishment and Maintenance Terms, change to maintenance of location
| 'regulation of biological quality' parentage also done
|-
|
| MODs & Chris
| Consistent use of IMP "with" column
| Chris will be talking to individual groups about how they use the with column for IMP. Each MOD group needs to respond to this for Chris.
|-
|
| Michelle
| Implement Michelle's proposal
| decide whether to put the 'response to drug' ID in column 16 or in a separate IC annotation. Annotate to chemical term ‘response to cocaine’, co-annotate with chemical term for now, then later when available, put GO ID for ‘response to drug’ in column 16 (or separate IC annotation).
|-
| pending
| Midori, David, Chris, Mike Bada
| Chemical derivatives and metabolism terms
| Need input from Chris and Mike on how much can be automated; possibly also current and near-future state of ChEBI
|-
| DONE
| <strike>David</strike> Midori
| Make difficult sensu terms organism specific (biologist intuitive)
| e.g. plant vacuole, fungal vacuole. GO definitions will still be designed to be formal, not depending on species to define the term.
|-
|
| MODs & Pascale
| All groups to check on how they use IGI and update annotations as per Princeton discussion.
|
|-
|
| Val
| Circulate draft doc on how contributes_to can & can't be used
| Will include: "Would this annotation make sense if this subunit was" ... [thought not finished; might be something like "viewed by itself"].
|-
|
| MODs & Pascale
| Check existing annotations for "contributes_to" with IDA
| We think only allow contributes_to with IDA. Look into adding to annotation checking script to flag contributes_to.
|-
|
| Jen
| Implement rules and software for sanity checking automated annotations (species-based trigger file).
|
|}
[[Category:GO Consortium Meetings]]

Latest revision as of 10:06, 15 April 2019

Tuesday

ACTION ITEM: GO Top needs to sign off on agenda before the GOC meeting & Ref Genome meetings prior to these meetings. If there is no action item stated, this is not the forum for the item.


Introduction & Review of Agenda

Judy – Intro

New people introduced – Siddhartha Basu (dictyBase), Nomi Harris (BBOP), Ramana Madupu and Scott Durkin from JCVI

Welcome back Jane (EBI)

GO Reference Genome Annotation Team Report (Pascale)

Objectives of the Project

Goal: comprehensive annotation of 12 genomes

Currently:

394 genes in target list – from disease genes, hot genes, metabolic pathways, unannotated gene lists.


Ref Genome Process:

1. generate focal sets

This has been a difficult process since some of our genomes are not included in the common ortholog resources

a. P-POD (Kara Dolinski, Princeton)

b. Tree-based method (Paul Thomas, SRI)

2. select common curation targets

New curation priority is to annotate the families generated by PPOD where there is at least 1 member from each Ref Gen species – there are 153 of these sets.

Rex: focus on the clusters that have 1 in each species (rather than multiples) as these will be the least confusing.

3. do experimental annotations – notification once completed

4. do inferential (ISS) annotations


=> at each step generate error reports

QC issues – need QC at each step in the process. For GO annotation, look at outliers, check co-occurrences of annotations. For orthologs, manual verification of ortho sets.

b. Methods

i. Number of genes with ‘no data’ with data in other organisms

ii. Graphical view displays ‘outliers’

iii. Looking for co-occurrences of annotations at a high level (confusion matrices)

iv. Verification of ortho sets

Rex: will monitor to see if we are making improvements each time we do manual QC.

Proposed Changes to the GA File:

a. Column 2: longest version of the gene/protein (gene preferable, protein acceptable; other objects such as complexes are also accepted)

b. New Column (17) ID of the object being annotated. Will be different from column 2 in the case of isoforms

c. Column 12 would refer to column 17

d. Header of the GAF will have a standard sentence that says how many gene products are annotated and the expected total from that organism

=> Chris: more information about this is now on the wiki.

Rex: key point is that this allows us to annotate isoforms and connect those isoforms to a gene.

Judy: this change will feed into generating the gene index file
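To illustrate Rex's point, the sketch below groups the annotated forms (proposed column 17) under their gene (column 2). It assumes a tab-delimited file with '!' header lines; the function name is hypothetical, and falling back to column 2 when column 17 is absent or empty is an assumption made for the example, not part of the proposal.

```python
from collections import defaultdict


def isoforms_by_gene(gaf_lines):
    """Map each column-2 gene ID to the set of column-17 object IDs annotated."""
    genes = defaultdict(set)
    for line in gaf_lines:
        # Skip blank lines and '!' header lines.
        if not line.strip() or line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 2:
            continue
        gene_id = cols[1]
        # Column 17 (index 16) names the object actually annotated; if it is
        # missing, assume the annotation is to the gene itself.
        form_id = cols[16] if len(cols) > 16 and cols[16] else gene_id
        genes[gene_id].add(form_id)
    return dict(genes)
```

A mapping like this is also one possible starting point for the gene index that Judy mentions.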


Proposed changes to the gp2protein file:

1. should only contain one version of each gene sequence

2. ---? Pascale?

=> more information about this on the ref genome wiki.


Software development discussed during Ref Genome meeting:

1. Siddhartha, Chris, Seth, Mary – database and tool where target genes and orthosets and their curation status will be maintained

2. Will replace Google spreadsheet

3. Graphical displays- several improvements

4. Integration of ref genomes genes into AmiGO

Ontology Development

3.1. Report on accomplishments (Midori)

1. Sensu – not used any more in GO terms

2. Disjoint violations corrected

3. Cardio & muscle content meeting work – this work has gone live

4. Electron transport terms reworked

5. SourceForge statistics 500 opened; 478 closed!

Have about 200 items constantly open. At the end of this meeting will close all open source forge items that are more than a year old.

Sue: request that emails be sent out warning you that the items are going to be closed – prompt submitters to reread their items.

6. Regulates relationships – gone live

3.2. Implementing the new regulates relationships (Tanya)

If regulation of X was part_of X, then all that needed to be changed was to convert the relationship to regulates. BP terms reviewed, some added as part_of relations (many new definitions and synonyms also added). Reasoner – used to identify inconsistencies between regulate terms and BP terms, missing terms. QC reports generated.

QC reports: - missing links between terms

- internal consistency within the DAG

- relationship problems (regulate terms that are is_a instead of part_of)

- multiple part_of parentage (if you regulate a process, do you regulate its parent? Doesn’t work in terms with multiple part_of relationships.)

March 25: regulates relationships go live!

Summary of Work

i. Biological process ontology was improved in several areas

ii. New relationships portray the biology correctly

Regulates

Positively regulates

Negatively regulates

iii. New logical definitions allow automated consistency checks as the ontology is developed

Nightly report in update file

Will be viewable in OBO-Edit 2

Semantic parse – new bug report


Next:

1. intersection tags: cross products between the GO ontologies

2. Inter-and Intra ontology links

MF can regulate MF, BP can regulate BP BUT ALSO MF can regulate BP (and vice versa). This will change MF from being a strict is_a ontology.

- Add regulates links between function and process

- Question: how to make relationship between ‘transcription regulator activity’ in MF to either ‘transcription’ as process with regulations

- Sue Rhee mentions work by Peter Karp on relationships in Cyc databases – has about 2000 relationships

These are ready to go now but will put out notifications first.

MF that always occurs in a single type of BP are part_of the BP. Reasoning: that BP are a collection of MFs, therefore the relationship should be part_of

=> would appreciate feedback about what people think about this

Comment: when to annotate to regulate term?

- need to decide based on author’s argument

- annotation to regulation of BP will also still be annotation to BP parent term

3. BP ontology improvements:

Metabolism – intermediate regulation parents filled in manually in one section; moving towards computational analysis to find these terms for larger areas of the ontology followed by biocurator review before adding new terms.

Signal transduction

- ‘ultimately effecting a change in the functioning of a cell’ – sounds like regulation activity to Tanya…could make a regulation process…

– can say that signal transduction is_a regulation of cell communication. Will be starting a push to revamp signal transduction: need definitions with a start point and an end point. Beginning of signal transduction is ligand-receptor binding then the rest is a cellular process.

Finding general BP that signal transduction can fit into as a regulation. EG: biological objective of BMP signaling pathways is to regulate transcription. Where it is now, isn’t connected to transcription

4. QC: continue reviewing QC reports that will be run regularly

3.3. Function -Process Links (Harold)

1. Considerations on Glycolysis and TCA Cycle

First thing is to define start and end points for glycolysis. Made the stop point at pyruvate so that we didn’t have to consider aerobic/anaerobic processes - same process of defining start and end points for the Krebs cycle.

Examine terms & definitions within the defined start & end of the process - find discrepancies, possibly missing terms. Looking at definitions and refining/adding. Looking at the comparable Reactome data.

Peter: reaction can be slightly different depending on the outcome. Should we look at the pathway as the common element for all processes or should we look at these as separate processes that have separate purposes (with overlap). The latter will become enormously complex as we start considering diverse species.

David & Midori: Historically, BPs are defined by their objective, therefore it makes sense to take the latter view. Will require parent terms to collect these processes. IF different gene products are involved, it should be defined as a different process.

3.4. Electron Transport Cross-products (Jen)

Looking at electron transport region and representing BP & MF cross-products.

Developed lists of MFs, their BP and a taxonomy group that it applies to

Used this to make has_part relationships: BP can’t exist without MF in this taxon

Put has_part relationship to put BP term under MF term (at the moment there are difficulties visualizing this).

Taxon issue – how do we represent that these relationships should be qualified by taxon? It is possible to have a general parent term with several child terms to represent the different taxa.

To create links between BP and MF, we are going to need the new relationship, has_part. At the moment, need to come up with a way of adding this relationship into the graphs (violates expectations – makes syntactical sense but is harder to understand for biologists).

Judy: Should more of these projects be undertaken at this point?

Jen, David, Harold: These projects should continue, improve ontology.

David: Bottom line – needs ‘has_part’ relationship to relate function to process

Alex: has_part changes expectation of directionality of ontology; how do you maintain directionality in parallel?

Need feedback from users – Chris can make a separate version of AmiGO that includes these relationships.

Michael: need careful documentation for users.

David: needs to be presented at conferences to get it out there for comment, get users used to it.

ACTION ITEM: Nomi to fix OBO_Edit to view occurrents?

ACTION ITEM: Jen and Harold to continue with this process, making improvements to ontology as they do.

ACTION ITEM: Everyone else to look at these relationships & giving Chris feedback. Files available at OBO Edit scratch page.

3.5. Cross Products between GO BP & CC (Chris)

OBO 2 has ways of looking at cross products – intersection editor; files for CC-BP are available in the scratch file.

To capture what is happening with the BP, need a number of new relationships:

- Results_in_structural_change_to

- Results_in_formation_of

- Results_in_connection_of

Found inconsistencies between ontologies - Idea is to use the intersection editor to keep track of consistency issues between BP & CC

Cross Product timeline:

1. Formalize cross-product relationships

2. Integration of internal XPs into main ontology file

- Introduction of ‘intersection tags’ into ontology itself

- This is purely an augmentation- so won’t break software

3. Focus on cell cross-products

- Have already generated QC with SF tracker…

– no plan at this time to integrate into the main ontology file.

Working on how to display cross-products in AmiGO. Example: AmiGO with GO & Mouse anatomy; heart development.

ACTION ITEM: Need feedback from users on AmiGO display.

Outstanding Issues, session 1 (Midori)

Proteases – are there any objections to a reorganization of MF relating to protease activity? Proposal on wiki for review.

NOTE: If we do it, many protease terms will become obsolete on the basis that they are gene products rather than functions.

Peter: Proteases can be distinguished generically (1) by the structure of their active sites (4 different types) and (2) by where in the target peptide they cut (3 types). This would give us 12 MF terms. Would this cover all proteases?

Need cases from the group to see if lumping the annotations into one of these 12 terms would lose information.

Judy: when do MF terms stop representing classes and start representing individual genes? Specific gene products should not be represented.

Midori: there are some MF terms that are clearly differentiated from other MF terms. These terms should be used.

ACTION ITEM: Peter and Midori – use Peter’s orthogonal activity to merge terms when there is no distinction functionally.

Advocacy/Outreach/Collaborations (Jen/Jane)

5.1. Help Desk (stats report) and newsletter

Most queries to GO help desk answered within 24h.

Newsletter released quarterly.

Web presence working group (formerly AmiGO WG) – advocacy group will determine features then the AmiGO WG to implement.

AmiGO future features:

Ref genome display

Wiki style user annotation – GONUTS

AmiGO web services

How to prioritize development and implementation of AmiGO features under discussion.

5.2. Annotation Outreach

Not cold calling dbs any more – all have been contacted.

TAIR have made agreement with journal to accept annotation by submitting authors & discussion at PAG 2008

Reactome have developed a ga file?

Muscle annotation wiki developed

Sol Genomics ga file submitted

AmiGO, GO Database

6.1. Software group progress report (Seth)

GO Term enrichment

AmiGO 1.5:

– new advanced search page & improved search functions

– search is extended if no result is found; may need to add a limit to this search function

- results sorted by relevance

- pie charts have been replaced with bar graphs

- regulates relationship icons have been added

- now have GOOSE links back into AmiGO


Next release of AmiGO (1.6):

- ref genome support is main focus

- display Mary’s graphs with pan & zoom function, more interactive (highlight direct vs indirect, species, ISS only, etc.)

- display of homolosets on AmiGO

- display of intersections between GO terms (bi-axial viewer)

- links to and from GONUTS to make it easier for people to leave comments & for biocurators to see these comments

6.2. Reference Genes - DB management extensions

Populating the database with NCBI taxonomy trees – use to filter queries

Added support for curation of ref genome curation sets

ACTION ITEM: Ben & Mike – get isoforms into GO database

Annotation Cross Products:

wiki page now for cross product annotation will have case studies added for next GO meeting. Using column 16 to capture cross-product information?

Question: will we need another qualifier to capture this refining information or do we use a separate column (column 16)?

Column 16 is used for refining the information that is captured, e.g. a particular CL for a BP. The relationship is between whatever the GO term is and the refining object.

SO (Karen E)

Since Princeton have been working on SO tracker items – 50 items closed.

Working to sort out sequence attributes of SO: looking at annotator consistency of SO attributes. Go through list of sequence attributes and divide into BFO classes: quality, disposition, function, role (from Barry’s definitions). This is done individually, with comments on why the decision was made. Then results pooled and analyzed statistically to determine biocurator agreement. Iterative process.

Doing this to make SO attribute classes that reflect BFO.

Progress with OBO-Edit (Nomi)

OBO-Edit 2 improvements:

Documentation and installer

I/O & command-line options

functionality

GUI look and feel

Priorities for release of OE2:

memory/speed issues

restore all functionality that was present in OE1

bug fixes in new components eg. graph editor

documentation

new features requests

Michael Schroeder's plug-in – link GOPubmed to OBO Editor?

Overall Project Management

9.1 Structure of Management Groups

What is working (and What is not)?

Wiki is too Kafka-esque: many pages are not linked properly and are hard to find. Could be using wiki categories more.

Too much work time taken up in meetings, not enough time to biocurate. Has the project reached a level of complexity where it is not possible to know everything going on? Should work be summarized so that people can be updated more efficiently? The opportune time to be updated is to review the reports posted by other groups; progress reports are now by WG rather than by MOD.

GO Top needs to review draft GOC and Ref Genome agendas as a group.

Judging by progress, things are working pretty well but need to balance overhead by biocurators.

GO Manager calls: not enough time to resolve issues brought up. On some calls, one issue has been discussed for most of the call and we haven’t moved on to other issues. Is it necessary to get the update from each group every 2 weeks? Calls should de-emphasize progress reports and focus on the things stopping progress.

9.2 Identify communication bottlenecks and time-sinks, knowing who is responsible and time estimates

Does everyone know who to contact to make progress on a road block? Yes, know who to go to but not sure if there will be a response.

Could we collect Skype addresses & have them available (but hidden from the public)? Would be good to solve problems in real time.

Wiki, Webex & Skype are helping solve problems in real time. This is really helping

One roadblock is to post a source forge item and then not hear back from anyone. Biggest problem with closing items is unable to make progress because of lack of information. Suggestion is to remove the SF item if there is no response after a designated time.

Outstanding issues (continued)

10.1 Discuss Establishment and Maintenance Terms

Proposal: localization has two parts:

1. Establishment of localization

2. Maintenance of Localization: make regulation of biological quality doesn’t mean maintaining the process of moving something to location, but rather Maintenance of Locations

ACTION ITEM: change to maintenance of location and add parentage under 'regulation of biological quality'

10.2 Discuss IMP and the with column.

Guidelines for what goes there - phenotype ID, genotype ID

ACTION ITEM: Chris will be talking to individual groups about how they use the with column for IMP. Each MOD group needs to respond to this for Chris.

10.3 Discuss how we are going to handle 'response to drug'

SF 1242405 and 1494526 and 'response to toxin' SF1658374. Are they normal biological processes? Response to X (drug, toxin) Response to Chemical (see also Use of Response To Terms in Annotation, for a related issue)(David and Tanya)

1. We keep ‘response to drug’, add response to ‘XX’ response by role chemical is playing would be in ontology with co annotation to chemical. We keep stub of ‘response to drug’, no children. Roles are basically stub terms in ontology

Transport of drug
Degradation of drug

2. Response to Chemical [ ID as Chebi ID ultimately]
	Response to Cocaine[XX]
		Response to Cocaine[XX] as a Drug

Sometimes things are toxins, sometimes not; the same goes for drugs. This is similar to the situation in PAMGO with ‘pathogenesis’, which is not always deleterious. Drugs would be put in the chemical organization.

Note added Day 2: Harold's proposal:
annotate to response to X with whatever evidence code
annotate to response to drug with IC

some discussion of this proposal, qualifiers, other gp-term rels, etc.

this is effectively a refinement of yesterday's proposal

ACTION ITEM: Implement Michelle's proposal; decide whether to put the 'response to drug' ID in column 16 or in a separate IC annotation. Annotate to chemical term ‘response to cocaine’, co-annotate with chemical term for now, then later when available, put GO ID for ‘response to drug’ in column 16 (or separate IC annotation).

10.4 Chemical derivatives and metabolism terms:

How should we handle derivative compounds in the metabolism parts of GO--should we continue to include metabolism of X derivatives as is_a descendants of X metabolism? It can be confusing. For example, SFs 1885151 and 1847808 note that 'gamma-aminobutyric acid metabolic process' is a descendant of 'fatty acid metabolic process' because GABA is a derivative of butyric acid. (Midori and Val)

Resolution: GO should follow ChEBI, but use only is_a relationships from ChEBI; ignore all other ChEBI relationship types (including has_functional_parent, which will solve the derivative problem).
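The effect of this resolution can be sketched with a toy graph. The edges below are simplified illustrations, not copied from ChEBI, and the function is a hypothetical helper: it collects ancestors while following only the relationship types that are allowed.

```python
# Illustrative, simplified edges: (child, relationship, parent).
EDGES = [
    ("gamma-aminobutyric acid", "is_a", "amino acid"),
    ("gamma-aminobutyric acid", "has_functional_parent", "butyric acid"),
    ("butyric acid", "is_a", "fatty acid"),
]


def ancestors(term, edges, allowed=("is_a",)):
    """Return all ancestors of term reachable through allowed relationships."""
    # Index parents by child, keeping only the allowed relationship types.
    parents = {}
    for child, rel, parent in edges:
        if rel in allowed:
            parents.setdefault(child, set()).add(parent)
    seen = set()
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen
```

With only is_a followed, 'fatty acid' is no longer an ancestor of GABA in this toy graph, so a GABA metabolism term built from it would not end up under fatty acid metabolism; allowing has_functional_parent as well brings the unwanted ancestor back.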

ACTION ITEM: Midori and David will deal with looking at these terms; Chris will help. We'll also approach Mike Bada, because he's been looking at GO and ChEBI terms for a while.

Collaboration with other projects

Reports on new/ongoing collaborations with these groups:

PRO: Protein Ontology (Harold) - PRO is being built to relate proteins to splice forms, variants, etc. It should be possible to attach GO information to these forms.

Reactome - GAF done and ready to be submitted

GOA will be picking this file up. Reminder to check with Mike and make sure that this data is being picked up.

Panther: Will be running Ref Genome targets through Panther.

NCBO

MP-PATO

CARO

Discussion Points

1. Discuss: At one of our meetings Chris suggested we might keep an archive of gene sets used in publications that were analyzed using GO. Should we do this? (David)

This seems to be a research project.

Should the AmiGO WG provide a way for people to provide datasets?

Review of Action Items from Princeton Meeting

Day 1

1. Tutorial on wiki discipline (assigned to Jim Hu ?).

NOT DONE

2. (ALL) Look at and comment on the Outstanding Action Items from the 17th GOC Meeting, Cambridge UK

DONE

3. Check whether there should be a relationship between pigment metabolic process and pigmentation

DONE

4. Jen: A reference to these pages should go in next newsletter.

DONE

5. Jen Add a link from outreach to something (SOP?)

DONE

6. Investigate why term requests aren’t coming in. Is there anything we need to do to make it easier? Consider the SF tracker list and annotation list: who is on these lists, and do other people need to be on them?

DONE

7. NML Michael sent ISSN URL to Eurie – Action Eurie!

DONE

8. e-mail Ben if you are not getting a gp2protein check for your database.

DONE

9. Somebody mentioned RSS feed, is this a potential action?

Web presence group to discuss

10. Reactome annotations should be available from GO by the next GO Consortium meeting. Chris, Alex, Jen and Ruth to be responsible. # Add new evidence code EXP for 1:1 Reactome to literature, add all other Reactome with TAS to Reactome source.

DONE

11. Convert Reactome complex terms to GO terms

Some progress: review at next meeting.

12. Jen to do a pilot project with a minimal set of terms, as an experiment and bring back results for next GO meeting

DONE

13. (David Hill) Make difficult sensu terms organism specific (biologist intuitive) (e.g. plant vacuole, fungal vacuole). However, GO definitions will still be designed to be formal, not depending on species to define the term.

This needs to be moved to a SourceForge item.

Day 2 Action Items

1. (David Midori Seth) Deploy the part that created SF items based on a friendly webform, and would like to see an OBO format in the SF item

In progress

2. Seth, ORB: Make link to how to make a perfect GO term from the term request tool

In Progress

3. Amelia link GOOSE from front page

DONE

4. DH: Cross products: need to have a WebEx meeting so everyone understands what to do.

DONE

5. OBO file renaming. JB: add a link to Wiki: http://gocwiki.geneontology.org/index.php/Versioning_Proposal On the best practises page: http://gocwiki.geneontology.org/index.php/Best_Practices

DONE

6. Web presence WG to work on the specification needed for new AmiGO features.

Ongoing

7. Gene Association files: to work on a more advanced interface to download custom files (Chris)

DONE

8. Gene Association files: to filter files as they come in. (Mike)

No longer relevant?

9. Judy: Predictive Activities. Collaborations with external groups. Reports into next GOC meeting as to these kinds of activities.
AND
10. Jim: Suggested Making a repository for predictions POSSIBLE ACTION ITEM?

Pending

11. Finalizing proposed evidence code documentation – abbreviated version on web pages and more detailed on GOC Wiki (Rama)

Need to make sure that the evidence code ontology reflects the documentation

Also, need to make sure that there is a tracker for this.

12. Eurie: querying communities on awareness of evidence codes – do you know what it is, what do you use it for? Also proposal of expanding, then get a feel for what would benefit them? So that we have a large audience.

PENDING

13. Sue, Michelle, Rama put evidence code proposal in the context of what we discussed today

DONE

14. Evidence code committee. Documentation for users and curators.

Curator documentation: DONE. User documentation: not completed.

15. Evidence code: Revise evidence code documentation so that a mutation in only one gene can only be IMP (protein localization IGI example)

DONE

16. (Curators) Check whether you have used IGI in this way and update annotations

All groups to check this. David is DONE.

17. (Curators) 'with' column optional for NAS - document

DONE

18. Update evidence code decision tree in response to today's discussion on evidence code usage (Jen and EV Code WG)

Pending: check with Karen?


19. (Curators) only ND allowed to root nodes - clarify this in the documentation (Rama)

DONE

20. Karen E and Chris M will work on GO-SO cross products

In progress


Taxon and GO terms (Jen)

Details in slides: [1]

Highlight 1: annotation errors uncovered in pilot project on term taxon "restriction"; vast majority are IEA

Highlight 2: proposal for separate file of term - rel - taxon, where rel is 'never in taxon' or 'never outside taxon'

useful for QC
well received

Action item: Jen to implement system.
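As an illustration only, the proposed term - rel - taxon file could drive a QC check along the following lines. This is a minimal sketch, not Jen's implementation: the tab-separated layout, the relation spellings ('never in taxon', 'never outside taxon'), the term IDs, and the toy taxon hierarchy are all assumptions made for the example. A real check would also need to propagate constraints to child terms in the ontology, which is omitted here.

```python
# Hypothetical taxon hierarchy: child -> parent (illustrative labels only).
TAXON_PARENT = {
    "human": "mammals",
    "mammals": "eukaryotes",
    "yeast": "fungi",
    "fungi": "eukaryotes",
    "E. coli": "bacteria",
}


def lineage(taxon):
    """Return the taxon followed by all of its ancestors."""
    out = [taxon]
    while taxon in TAXON_PARENT:
        taxon = TAXON_PARENT[taxon]
        out.append(taxon)
    return out


def parse_constraints(text):
    """Parse 'term<TAB>rel<TAB>taxon' lines (assumed layout); skip blanks and '#' comments."""
    constraints = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        term, rel, taxon = line.split("\t")
        constraints.append((term, rel, taxon))
    return constraints


def violations(annotations, constraints):
    """Return (gene, term, taxon, rel, constraint_taxon) for each annotation breaking a constraint."""
    found = []
    for gene, term, taxon in annotations:
        ancestors = lineage(taxon)
        for c_term, rel, c_taxon in constraints:
            if c_term != term:
                continue
            inside = c_taxon in ancestors
            if (rel == "never in taxon" and inside) or (
                rel == "never outside taxon" and not inside
            ):
                found.append((gene, term, taxon, rel, c_taxon))
    return found


# Hypothetical constraint file contents and annotations.
CONSTRAINT_TEXT = (
    "TERM:0000001\tnever in taxon\tbacteria\n"
    "TERM:0000002\tnever outside taxon\tfungi\n"
)
ANNOTATIONS = [
    ("geneA", "TERM:0000001", "E. coli"),
    ("geneB", "TERM:0000001", "human"),
    ("geneC", "TERM:0000002", "yeast"),
    ("geneD", "TERM:0000002", "human"),
]

for hit in violations(ANNOTATIONS, parse_constraints(CONSTRAINT_TEXT)):
    print(hit)
```

Keeping the constraints in a separate file, as proposed, means the check can be rerun against any annotation set without touching the ontology itself.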

Annotation issues

1. Evidence codes and propagating annotations (Emily; slides)

Main question was ISS (and subcodes) vs. IEA for certain methods; details in slides

Chris: this discussion shows that the evidence codes are way out of date. use evidence ontology so you can describe exactly what was done. (much nodding & agreement-ish muttering)

Judy - more detail on how MGI does their orthology sets

A long discussion ensued, which I (Midori) couldn't accurately transcribe. Points included (not an exhaustive list):

  • QC and general annotation practices for ISS and its subcodes
  • Current documented definition of ISS
  • Actual, precise meaning of "manual review" in ISS documentation
  • what one can reliably infer from a sequence comparison or computational result
  • Kara's proposal for IEA/manual orthogonal to method codes

There was no resolution at all, but lots of ideas.

2. Contributes_to qualifier

Val - ways it's been misused, e.g. annotating DNA pol alpha to primase, use with IPI, etc.

general agreement

ACTION ITEM: Val will circulate draft doc on how contributes_to can & can't be used; will include: "Would this annotation make sense if this subunit was" ... [thought not finished; might be something like "viewed by itself"].

ACTION ITEM: Check existing annotations; we think contributes_to should only be allowed with IDA. Look into adding a check to the annotation checking script to flag contributes_to.
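A check of this kind might look like the sketch below. It is not the existing annotation checking script; the function name and the sample rows are hypothetical. It assumes the standard tab-separated gene association file layout, with the qualifier in column 4 and the evidence code in column 7, and treats IDA as the only allowed evidence code, following the tentative rule discussed above.

```python
def flag_contributes_to(gaf_lines, allowed_evidence=("IDA",)):
    """Return (line_no, object_id, go_id, evidence) for contributes_to rows
    whose evidence code is not in allowed_evidence."""
    flagged = []
    for line_no, line in enumerate(gaf_lines, start=1):
        # Skip header/comment lines (start with '!') and blank lines.
        if line.startswith("!") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 7:
            continue  # malformed row; a real script would report this
        qualifiers = cols[3].split("|")  # column 4: pipe-separated qualifiers
        evidence = cols[6]               # column 7: evidence code
        if "contributes_to" in qualifiers and evidence not in allowed_evidence:
            flagged.append((line_no, cols[1], cols[4], evidence))
    return flagged


# Hypothetical rows; the IDs are placeholders, not real database entries.
SAMPLE = [
    "!gaf-version: 1.0",
    "\t".join(["DB", "GP1", "sym1", "contributes_to", "GO:9999991", "PMID:0", "IDA", "", "F"]),
    "\t".join(["DB", "GP2", "sym2", "contributes_to", "GO:9999991", "PMID:0", "IPI", "DB:GP1", "F"]),
    "\t".join(["DB", "GP3", "sym3", "", "GO:9999992", "PMID:0", "IMP", "", "F"]),
    "\t".join(["DB", "GP4", "sym4", "NOT|contributes_to", "GO:9999992", "PMID:0", "TAS", "", "F"]),
]

for row in flag_contributes_to(SAMPLE):
    print(row)
```

Passing the allowed codes as a parameter keeps the rule adjustable if the group settles on a different policy.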

Next meeting

looking at Oct 21-22 (or 21-23), Montreal

ACTION ITEMS

  • Includes Reference Genome action items
Status | Responsible Party | Task | Comments
In progress | Documentation working group | Document annotation SOPs | Another factor we have been tracking is when a curator judges that the curation of a gene is ‘comprehensive’, that is, that it accurately represents the biology (irrespective of the number of papers available or read). This appears in the spreadsheets. The guideline is that when there are few papers, all papers should be read; when there are many (a curator can judge what is too many), a review should be read to find the important primary literature and decide what information needs to be captured. We don’t keep track of whether or not reviews have been read. WormBase uses Textpresso (PMID 15383839), which helps ensure curators do not overlook information. The ‘comprehensive’ curation status doesn’t get invalidated when a newer paper is published; however, curators may (and are encouraged to) update the date when the newer literature is curated.
DONE | Mike Cherry | Reference genome annotation camp | Held in summer of 2008
 | Chris Mungall | Re-calculate with is_a only paths |
 | Chris Mungall | Re-calculate with experimental codes only | Generate several versions of the data classified by different evidence codes?
 | Chris Mungall | Provide such reports on a regular basis |
 | Judy Blake | Contact NCBI/NLM/OMIM to link to reference genome genes |
In progress | Documentation working group | Document changes to Gene Association File (GAF) | Column 2 is canonical gene ID; column 17 is thing you are annotating (always required); column 12 matches column 17 and contains SO IDs; add header to gene association file
In progress | Documentation working group | Document changes to gp2protein file | Includes complete gene index (except for pseudogenes and transposons); column 1 is canonical gene ID; column 2 is accession for sequence of longest form of protein from UniProtKB: or NCBI; syntax of gp2protein file will be provided by Mike and Chris
In progress | Jane | Write notice of changes to GAF and gp2protein to users |
In progress | MODs + Ben Hitz | Make sure that their input matches new GAF and gp2protein requirements |
 | Seth Carbon | Have AmiGO show co-occurring terms, similar to the function in QuickGO |
 | Seth Carbon & Val Wood | SLIM by SLIM matrix | Would be used to review intersections of different cellular processes and look for unexpected intersections which may identify possible errors. Try first applying to function and component terms; outline cells that you expect to be empty. Have these matrices generated automatically from the AmiGO database.
DONE | MODs and Paul Thomas | PANTHER run with new gp2protein files |
DONE | Nomi | Fix OBO-Edit to view new cross product relationships |
 | Ben & Mike | Get isoforms into GO database |
DONE | Peter and Midori | Use Peter’s orthogonal activity to merge terms when there is no distinction functionally | See http://wiki.geneontology.org/index.php/Proteases for a summary; thanks also to Ben, Colin, and the MEROPS team
DONE | Midori, David and Tanya | For Establishment and Maintenance Terms, change to maintenance of location | 'regulation of biological quality' parentage also done
 | MODs & Chris | Consistent use of IMP "with" column | Chris will be talking to individual groups about how they use the with column for IMP. Each MOD group needs to respond to this for Chris.
 | Michelle | Implement Michelle's proposal | Decide whether to put 'response to drug' ID in column 16 or as a separate IC annotation. Annotate to chemical term ‘response to cocaine’, co-annotate with chemical term for now, then later, when available, put GO ID for ‘response to drug’ in column 16 (or separate IC annotation).
Pending | Midori, David, Chris, Mike Bada | Chemical derivatives and metabolism terms | Need input from Chris and Mike on how much can be automated; possibly also current and near-future state of ChEBI
DONE | David, Midori | Make difficult sensu terms organism specific (biologist intuitive), e.g. plant vacuole, fungal vacuole | GO definitions will still be designed to be formal, not depending on species to define the term.
 | MODs & Pascale | All groups to check on how they use IGI and update annotations as per Princeton discussion |
 | Val | Circulate draft doc on how contributes_to can & can't be used | Will include: "Would this annotation make sense if this subunit was" ... [thought not finished; might be something like "viewed by itself"].
 | MODs & Pascale | Check existing annotations for "contributes_to" with IDA | We think contributes_to should only be allowed with IDA. Look into adding a check to the annotation checking script to flag contributes_to.
 | Jen | Implement rules and software for sanity checking automated annotations (species-based trigger file) |
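The "SLIM by SLIM matrix" item above could be prototyped roughly as follows. This is a sketch under stated assumptions, not the AmiGO implementation: annotations are taken as simple (gene product, slim term) pairs already mapped up to slim terms, and the function names and sample data are hypothetical. Each cell counts the gene products annotated to both the row term and the column term; cells that curators expect to be empty but that hold a non-zero count are reported for review.

```python
from collections import defaultdict
from itertools import product


def slim_matrix(annotations, row_terms, col_terms):
    """Count gene products co-annotated to each (row term, column term) pair."""
    genes_by_term = defaultdict(set)
    for gene, term in annotations:
        genes_by_term[term].add(gene)
    return {
        (r, c): len(genes_by_term[r] & genes_by_term[c])
        for r, c in product(row_terms, col_terms)
    }


def unexpected_cells(matrix, expected_empty):
    """Return cells marked as 'expected empty' that nevertheless contain gene products."""
    return sorted(cell for cell in expected_empty if matrix.get(cell, 0) > 0)


# Hypothetical slim-mapped annotations: function terms as rows, component terms as columns.
ANNOTATIONS = [
    ("g1", "kinase activity"), ("g1", "nucleus"),
    ("g2", "kinase activity"), ("g2", "cytoplasm"),
    ("g3", "transporter activity"), ("g3", "nucleus"),
]
ROWS = ["kinase activity", "transporter activity"]
COLS = ["nucleus", "cytoplasm"]

# Cells a curator has outlined as expected to be empty (illustrative only).
EXPECTED_EMPTY = {
    ("transporter activity", "nucleus"),
    ("transporter activity", "cytoplasm"),
}

matrix = slim_matrix(ANNOTATIONS, ROWS, COLS)
for cell in unexpected_cells(matrix, EXPECTED_EMPTY):
    print(cell, matrix[cell])
```

A non-zero count in an outlined cell is only a prompt for review; it may reflect real biology rather than an annotation error.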