SLC GO Consortium Meeting

Venue

Meeting Room

The meeting will be held in the Salt Lake City Library, Conference Room 4
210 East 400 South, SLC UT 84111
http://www.slcpl.lib.ut.us/locations.jsp?parent_id=8&page_id=20

Conference Room 4 is the large conference room on the 4th floor, located behind the front desk to the right.
We are not allowed to put posters up to direct you, but if you get lost, the library staff will point you in the right direction.

Dates:

April 22-23, 2008
We will be providing lunch on the 22nd
The meeting ends at lunchtime on the 23rd.
Dinner will be held on Tuesday, April 22, at Cannellas. The reservation is for 6:30 pm, and the restaurant is one block south of the library.

Cost

$50.00; a receipt will be provided at registration. We will be taking cash and checks. Checks should be made payable to the Department of Human Genetics. The address, if needed for the check, is: Eccles Institute of Human Genetics, Building 533, University of Utah, 15 North 2030 East, Salt Lake City, Utah 84112.
Dinner: cost TBD; we are looking for a sponsor
You are responsible for your lodging costs

Lodging:

The University Guest House is a good place to stay as it is a short walk to HSEB. If you are also attending the Reference Genome Meeting, it is a short walk from that venue and a door-to-door TRAX ride from the library. The one-way fare is $1.75, or a day pass can be purchased for $4.50. Tickets can be purchased at the platform. A map is available here: http://www.rideuta.com/images/TRAX082607.jpg

TRAX schedule from the guest house to the library. (The library opens at 9 am.)

MEDICAL CENTER	FT DOUGLAS	SOUTH CAMPUS	STADIUM TRAX STATION	900 EAST & 400 SOUTH (TRAX)	TROLLEY TRAX STATION	LIBRARY TRAX STATION
08:23 AM	08:25 AM	08:27 AM	08:29 AM	08:32 AM	08:34 AM	08:36 AM
08:38 AM	08:40 AM	08:42 AM	08:44 AM	08:47 AM	08:49 AM	08:51 AM

Please make reservations early as it is a popular place and there will be extra people in town for the marathon.

If you need to make other arrangements let me know (Karen E) and I will find a different hotel.
The University Marriott at the university is a cab ride from both venues.

There are many hotels downtown with reasonable rates such as the Little America and the Red Lion.
The Downtown Marriott is within walking distance of the library, but it is $259 per night!
To book a taxi from the library to the airport: 801 521 2100

Registration

Please register for one or both meetings.

Draft Agenda

Tue am -

9:00- Intro, review agenda

Annotations

9:15 - 10:00 - Ref Genome report (Pascale)
1. Annotation Metrics
2. Annotation Pipeline
3. Annotation to Isoforms (new columns ?? )

10:00-10:45 Ontology Development

Report on accomplishments (15 min. for 1st several items; 30 min. for 'regulates')
1. Sensu
2. Disjoint violations corrected
3. Cardio & muscle content meeting work
4. Electron transport
5. SourceForge statistics
6. Regulates (including activators and inhibitors)

break 10:45-11:00

11:00-noon Ontology Development continued

Work in progress
1. Function-Process Links (Harold, Chris, Jen)
  1. Considerations on Glycolysis [[1]] and TCA cycle [[2]]
  2. Also see "term granularity" below
2. Taxon
3. Lung Development
Future work
1. Signal transduction
2. Morphogenesis
3. Transport
4. Metabolism cleanup
Outstanding issues
1. Response to X (drug, toxin) Response to Chemical
2. Establishment and Maintenance Terms
3. Chemical derivatives and metabolism terms
4. Term granularity (and Reactome)
5. Proteases

noon - lunch

Tues Afternoon Session -

1:00 - 1:30 Advocacy/Outreach/Collaborations (Jen/Jane)
1. Help Desk (stats report) and newsletter
2. User Oversight of AmiGO functionality (look and feel)
3. Active outreach collaborations
4. New outreach contacts.
1:30 - 2:00 - SO (Karen E)
2:00 - 2:30 - OBO-Edit
1. Might include demo of Michael Schoeder's plug-in

break

3:00 - 5:00 AmiGO
1. ReferenceGenes - DB management extensions

Wed AM

Overall Project Management
Metrics for Success

Agenda Items

At the moment this is a free-form list of items suggested for discussion.

Have Tanya give a presentation on implementation of 'regulates' and the work that she and David and Chris (and others) have done in this area. 'Regulates' will go live at the end of March, so there may be discussion points on this. We will also discuss 'regulation of biological quality' and how its negative and positive children relate.
Discuss how we are going to handle function terms such as activators and inhibitors. When do we create them, and how do they relate to processes and other functions? (David and Tanya)
Discuss how we are going to handle 'response to drug' (SF 1242405 and 1494526) and 'response to toxin' (SF 1658374). Are they normal biological processes? (David and Tanya)
Discuss function-process links. What relationships should we use? Who will make the assertions? (Harold, Jen & Chris)
1. See current work and discussions Function-Process_Links
(from ref genomes group): Would it be useful to add the 'comprehensively annotated' tag to all genes (not just the ref genome genes), either in the gene association file or in the database somehow?
At one of our meetings Chris suggested we might keep an archive of gene sets used in publications that were analyzed using GO. Should we do this? (David)
Collaboration with other projects
1. PRO: Protein Ontology (Harold)
2. Reactome
3. Panther
4. NCBO
5. MP-PATO
6. KARO
Annotation Objects: how to handle GO annotation to specific isoforms, modified proteins, etc. (Harold)
Report results of experiment on finding annotation errors using GO-taxon links. (Jen)
How should we handle derivative compounds in the metabolism parts of GO--should we continue to include metabolism of X derivatives as is_a descendants of X metabolism? It can be confusing. For example, SFs 1885151 and 1847808 note that 'gamma-aminobutyric acid metabolic process' is a descendant of 'fatty acid metabolic process' because GABA is a derivative of butyric acid. (Midori and Val)
Should we approach authors of papers describing computationally assisted annotation of genes from non-MOD species with GO terms to integrate their results into the GO database? If not, can we take any advantage of these studies? For example, click on B2G in Papers on this site [[3]] (Tanya)
Proteases: any objections to making a lot of protease activity MF terms obsolete, on the grounds that they're gene products? See Proteases for a detailed proposal. (Midori et al.)
Granularity in GO - Peter (Reactome) has raised this question again, because we've discussed it but not fully reached consensus. An example from Reactome's perspective: "GO, for instance, has a single term GO:0031261 'DNA replication preinitiation complex', with no child terms, while Reactome enumerates all the intermediate stages of its assembly during G1 of the cell cycle and activation at G1/S." Should GO simply add the very specific terms from Reactome? (I think the same question applies to GO process terms -- many Reactome events are much more fine-grained than most GO processes.) (Midori and Peter)
contributes_to qualifier: problematic, because it is used to mean a number of different things (Val)

1. Mixing evidence and annotation method

Email from SGD about the need to annotate new computational datasets.

SGD has come across a couple of computationally predicted GO
annotation data sets for S. cerevisiae that we would like to add to
our database. The GO annotations from these data sets are
predictions based on multiple high-throughput data sets. RCA
evidence code came to our minds but according to the documentation,
the annotations all have to be manually reviewed by a curator to
use this evidence. There are several hundred annotations of this kind
and it is not feasible for us to manually review these annotations.

Hence, we thought these annotations can be bulk loaded with IEA
evidence code. However, in the Jan 2007 (Cambridge) GO meeting, it
was decided that the 'with' column information has to be filled in
for all IEAs (else Mike's filtering script strips them out). But
these GO annotations being predictions based on multiple high-
throughput data sets, don't have any information for the with
column.  So, we are left with no choice.

Which evidence code do people think should be used for these kinds
of computational datasets when there is not an obvious "with"?

Thanks for your input.
Rama
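
The constraint Rama describes can be illustrated with a minimal sketch. This is not Mike's filtering script; it only assumes tab-separated gene association rows in which the evidence code is the 7th column and the 'with' value is the 8th. The function names and the sample rows (gene names, IDs, references) are hypothetical placeholders.

```python
# Illustrative only: NOT the actual filtering script mentioned in the email.
# Assumes tab-separated gene association rows where (0-based) column 6 is
# the evidence code and column 7 is the 'with' value.
EVIDENCE_COL = 6
WITH_COL = 7


def keep_annotation(line):
    """Return False for IEA rows whose 'with' column is empty."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) <= WITH_COL:
        return True  # short row: leave it for other checks
    evidence = cols[EVIDENCE_COL].strip()
    with_value = cols[WITH_COL].strip()
    return not (evidence == "IEA" and with_value == "")


def filter_annotations(lines):
    """Keep comment lines (starting with '!') and rows that pass the rule."""
    return [ln for ln in lines if ln.startswith("!") or keep_annotation(ln)]


def make_row(evidence, with_value):
    # Hypothetical placeholder values, not real annotations.
    return "\t".join(["SGD", "S000000000", "GENE1", "", "GO:0000001",
                      "PMID:0000000", evidence, with_value, "P"])


if __name__ == "__main__":
    rows = [make_row("IEA", ""), make_row("IEA", "DB:0000000"),
            make_row("RCA", "")]
    for row in filter_annotations(rows):
        print(row)
```

Under this rule a bulk-loaded prediction tagged IEA with no 'with' value is dropped, which is the dilemma the email raises.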

Response from Kara (Princeton)

Hi,

The root of the problem, as I see it, is that we are mixing apples
and oranges with evidence codes.  All but one of the evidence codes
indicate the type of experimental evidence for a GO annotation, but
we have one oddball, IEA, that indicates not what the experiment is,
but rather how the annotation was done.  We keep running into
variations of the same problem:  we have some evidence (whether
experimental or computational) for a GO annotation, but also want to
indicate whether a curator looked at it or not.

My proposed (albeit radical) solution:

Remove IEA as an evidence code.

Create a new property for GO annotations (or add a new type of
qualifier) that captures how the annotation was done:  manual or
automated.

Everything that is currently IEA would be given the 'automated'
property/qualifier, and then would be given a new evidence code as
appropriate (mostly a flavor of ISS I would assume).
There can be a rule that all 'automated' annotations that are a
flavor of ISS must have a 'with' value.

This would allow us to use 'RCA' as appropriate, in some cases
they'd be 'manual', in others, they'd be 'automated'.  In Rama's
case, the annotations would be 'RCA' with an 'automated' qualifier.

I realize the issues involved in making such a drastic change, so I
understand if we don't go there, but I do think that some approach
such as the one above is the best representation of the information
that we are trying to capture.
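
Kara's proposal can be sketched as a small data model, purely to make the discussion concrete. Everything here is an assumption for illustration: the class and field names are hypothetical, and the set of codes treated as "a flavor of ISS" is a guess, since the email does not define it.

```python
from dataclasses import dataclass
from enum import Enum


class Method(Enum):
    """How the annotation was made, kept separate from the evidence code."""
    MANUAL = "manual"
    AUTOMATED = "automated"


# Assumption: which codes count as "a flavor of ISS" is not defined in the email.
ISS_FLAVORS = frozenset({"ISS", "ISA", "ISO", "ISM"})


@dataclass(frozen=True)
class Annotation:
    gene: str            # hypothetical field names throughout
    go_id: str
    evidence: str        # type of evidence, e.g. "RCA" or "ISS"
    method: Method       # manual or automated
    with_value: str = ""


def violations(a):
    """Return the proposed rules that the annotation breaks."""
    problems = []
    if a.evidence == "IEA":
        problems.append("IEA would no longer be an evidence code")
    if (a.method is Method.AUTOMATED
            and a.evidence in ISS_FLAVORS
            and not a.with_value):
        problems.append("automated ISS-type annotation needs a 'with' value")
    return problems


if __name__ == "__main__":
    # Rama's case: a computational prediction with no obvious 'with'.
    predicted = Annotation("GENE1", "GO:0000001", "RCA", Method.AUTOMATED)
    print(violations(predicted))
```

In this sketch an automated RCA annotation passes without a 'with' value, while an automated ISS annotation without one is flagged, matching the rule suggested in the email.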