AmiGO-chat-August1407

From GO Wiki
Revision as of 11:23, 14 August 2007 by Hitz (talk | contribs)

ben: OK.
ben: I guess this is it.  Only item is slim mapper.
ben: URL?
sjcarbon: http://toy.lbl.gov:9002/cgi-bin/amigo/map2slim.cgi
8:40 AM
ben: well, there is 1 bug
ben: Firefox 2.0 Mac OSX has 2 floating scroll bars about 3/4ths of the way down on the rhs
sjcarbon: !
sjcarbon: lemme turn my mac on
ben: ah, they are useful when you open [Display gene product filters]
j-lo: I have a request from Matt
j-lo: that when you add a load of GO ids as a slim, it can display them as an indented set
val: did you make a decision about how IDs are handled?
j-lo: so you can see the hierarchy
sjcarbon: an indented set meaning just fancier whitespace?
j-lo: yep
val: I mean gene product IDs, sorry 2 questions at once
8:45 AM
ben: another usefulness - ga files are typically gzipped.  You get an error if you upload a gzipped gene association file
sjcarbon: whitespace should not be an issue
val: Matt's request is difficult, how would you decide how to indent?
val: so he wants a complete graph as output?
rama: Aren't we doing what Princeton does? Provide all GA files in a pull down?
j-lo: we're doing that too
j-lo: I don't think he wants a full graph, Val, just the terms you list
val: I don't understand Matt's request, sounds like he just wants the graphical output
rama: I think Matt's request should be an enhancement - shd add to SF
val: but if they have multiple parents this will be like the AmiGO 'view'
ben: seth, are you debugging when you should be answering questions?
sjcarbon: ben--i can add the questions are coming a bit fast
sjcarbon: oops two thoughts
ben: ok  no more q's for a minute guys
ben: I am running the sgd ga file vs. Plant slim
ben: because that's the kind of guy I am
sjcarbon: i can add the gzip if people find that helpful
val: I still find the input confusing.
sjcarbon: that's already written somewhere
ben: seth - re: gzip, at least have it catch the exception.  Low priority to actually have it unzip
sjcarbon: i still need more information about exactly what matt wants
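
A minimal sketch of the gzip handling discussed above, assuming the upload arrives as raw bytes. This is not AmiGO code; `read_association_upload` is a hypothetical name. It detects a gzipped gene association file by its magic bytes and either decompresses it or raises a specific error instead of a generic 'nothing found'.

```python
import gzip
import zlib

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def read_association_upload(raw: bytes) -> str:
    """Return the uploaded file as text, decompressing it if it is gzipped.

    Hypothetical helper: raises ValueError with a gzip-specific message
    when the upload looks gzipped but cannot be decompressed.
    """
    if raw[:2] == GZIP_MAGIC:
        try:
            raw = gzip.decompress(raw)
        except (OSError, EOFError, zlib.error) as exc:
            raise ValueError(
                "upload looks gzipped but could not be decompressed"
            ) from exc
    return raw.decode("utf-8", errors="replace")
```

Catching the exception alone would cover the low-priority version of the request; the `gzip.decompress` call is the optional extra step.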
8:50 AM
val: I can input pombe gene symbols cdc2,cdc25,act1 etc.
j-lo: yeah - I'll check and get back to you!
val: There is nothing in the output that tells me which org I have searched.
val: When I click on the GO id link, I get the AmiGO view for all genes annotated to this term,
ben: my yeast-to-plant finished, took about 5 minutes (2800 terms, 6500 gps)
val: not the list of genes I added
sjcarbon: ben--you want an error for the gzipping specifically rather than a 'nothing found' error?
val: I don't think that is too long, as long as the user knows it is running
ben: I think this came up in a previous chat that I missed, but the distinction between #Inferred GPs and # Direct GPs is not obvious at all.
ben: Also, I think people would like Pvalues
sjcarbon: no pvalues in mapping
ben: Also we need to think a little about what to do with root/unknown terms
val: Ben, why do you need P values for a slim?
rama: How would you calculate pvalues for mappings
val: they are just mapping up the tree?
rama: I think counts makes sense
sjcarbon: that should be term enrichment
ben: well, never mind then
ben: but why do we need 2 tools?
val: also, don't understand direct vs inferred?
val: They do different jobs?
val: one is an enrichment tool, one is a slimmer
ben: I actually DO understand direct/inferred, but it requires exposition, at least a click away.
sjcarbon: direct is the most direct mapping in the slim while inferred is all of the ancestors in the slim
rama: Val- ben direct annotations vs all 
rama: Sorry, I meant- ben meant Direct annotations vs All
ben: users won't understand the why or how about the difference between direct/inferred.
sjcarbon: ben we can add help to the column description
val: it shouldn't be inferred. These aren't inferred
val: it should be INDIRECT
ben: Also - this is a little curious, "inferred" GPs to root terms should be 100%?
val: No actually Rama is correct, it should be direct vs ALL
ben: in my sgd->plant slim I get 100% CC, 99.3% MF, 96.9% BP
rama: At SGD, we just should the genes mapped to the GO slim. We don't show the direct vs all
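
The direct vs. ALL distinction above can be illustrated with a toy example. This is only a sketch under stated assumptions: the ontology, term names, and function names are all hypothetical, and it is not how map2slim is implemented. "ALL" counts a gene product under every slim term that is the annotated term or one of its ancestors; "direct" keeps only the most specific of those slim terms.

```python
from collections import defaultdict

# Hypothetical toy ontology: term -> list of parent terms.
PARENTS = {
    "process": [],                      # root term
    "metabolism": ["process"],
    "lipid_metabolism": ["metabolism"],
    "transport": ["process"],
}


def ancestors_and_self(term):
    """Return the term plus every term reachable by following parents."""
    seen, stack = set(), [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, []))
    return seen


def all_slim_terms(term, slim):
    """ALL mapping: every slim term at or above the annotated term."""
    return ancestors_and_self(term) & slim


def direct_slim_terms(term, slim):
    """Direct mapping: slim hits that are not an ancestor of another hit."""
    hits = all_slim_terms(term, slim)
    return {
        h for h in hits
        if not any(h != other and h in ancestors_and_self(other) for other in hits)
    }


def count_all(annotations, slim):
    """Map (gene, term) pairs to slim terms and collect genes per slim term."""
    genes_by_term = defaultdict(set)
    for gene, term in annotations:
        for slim_term in all_slim_terms(term, slim):
            genes_by_term[slim_term].add(gene)
    return dict(genes_by_term)


slim = {"process", "metabolism", "transport"}
annotations = [("geneA", "lipid_metabolism"), ("geneB", "transport")]
counts = count_all(annotations, slim)
```

In this toy case the root term `process` collects both genes under ALL, which is the 100% effect seen when a slim includes its root terms.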
8:55 AM
rama: I am really sorry. my typing is bad today
val: I don't think we should allow direct/indirect for slimming, it doesn't make sense. It should be ALL always
j-lo: I agree
ben: yeah, actually, the whole point of the slim (one of them) is that we don't give a fig about the ontological structuremcgimmick
rama: I meant At SGD, we just *show*
ben: so I would just dump inferred/indirect
ben: unless you can show it graphically.
j-lo: yep
ben: which would be an enhancement
val: I don't even think it is needed graphically, if they want to look at this they want an enrichment tool, not a slimmer
rama: I think it is confusing to show direct and All. We shd list all genes that map to a GO slim term
sjcarbon: rama--couldn't that get rather large and confusing?
j-lo: what's the alternative?
val: So do we agree, ditch direct, show ALL?
sjcarbon: already ditched
j-lo: great
val: It needs to be ALL, that is what the tool is for, 
val: great
ben: duh
rama: Thanks Val. That is exactly the point
ben: I get it now
ben: but why is "cellular process" in the Plant slim... (plz. don't answer here)
rama: Ben, please send mail to TAIR curator
j-lo: Shall we discuss gene ids?
val: please
ben: and we need to handle root terms specially
ben: otherwise they will always all be 100/100/100%
ben: we had this problem at SGD
val: I think there was an earlier mail to only show the output for the terms in the slims, not for all their parents too, this would solve this
9:00 AM
rama: do all the slims have the root terms listed? I know SGD does
ben: unless people put the root terms in their slims
sjcarbon: only slim terms are displayed now
val: Although I am getting the same problem Ben pointed out so it could be a bug.
val: Ah so the yeast slim contains the root nodes?
rama: yes
val: OK,
ben: OK, gene ids?  Can someone summarize the Q/issue for me/
ben: ?
j-lo: I can try
val: Did you see the e-mail, it was quite long, I'll have a go
val: you GO jane
j-lo: when people input their gene list into the mapper
j-lo: it may be in any of several formats
j-lo: some of which have name redundancy between and within species
j-lo: also, some genes may simply not be in the database
ben: val - I just got back from vacay
j-lo: so how should AmiGO handle this?
pasc_g has joined the channel
ben: got it.  The fundamental job of bioinformatics
val: With GO term mapper and GO term finder you preselect your organism, this reduces the problem partially.
j-lo: heh - yep!
ben: I suggest we do not attempt to solve this problem in general for the next amigo release
9:05 AM
rama: the interface will show an intermediate page with IDS that it can find with an option to either move on or go back and fix
rama: Sorry, typo- I meant 'IDS it cannot find'
val: Sounds sensible
ben: there are two variants of this:
val: but you should also be able to preselect your organism
ben: 1) abort function, and query user
ben: 2) just pass through with message that some IDs could not be mapped uniquely
rama: yes, the intermediate page should say that those IDS were not valid for that species
donghui has joined the channel
val: Most of the problems would go away if people were encouraged to use the systematic identifiers for their species (as most users do with any genome-wide data).
ben: or uniprot ids
val: Most problems occur because we are allowing cdc2 cdc28 etc
val: I don't think Term mapper allows this, it expects systematic identifiers
val: (I meant the Princeton version)
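
The intermediate-page idea above boils down to splitting the submitted IDs into those found for the preselected organism and those not found, then letting the user continue or go back and fix. A minimal sketch, assuming a set of known identifiers for the chosen species; `partition_ids` and the example IDs are hypothetical, not part of AmiGO.

```python
def partition_ids(submitted, known_ids):
    """Split submitted gene IDs into (found, missing) for one organism.

    Blank entries and repeated IDs are skipped; input order is kept so
    the intermediate page can list problems as the user entered them.
    """
    found, missing = [], []
    seen = set()
    for raw in submitted:
        gene_id = raw.strip()
        if not gene_id or gene_id in seen:
            continue
        seen.add(gene_id)
        if gene_id in known_ids:
            found.append(gene_id)
        else:
            missing.append(gene_id)
    return found, missing


# Hypothetical identifiers for the preselected organism.
known = {"GENE0001", "GENE0002", "GENE0003"}
found, missing = partition_ids(["GENE0001", " GENE0002", "cdc2", "", "GENE0001"], known)
```

The page would list `missing` and submit only `found` to the mapper if the user chooses to continue.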
9:10 AM
ben: going back to 1) vs. 2) above, the "correct" answer is some non-linear function of the number of "whiffs" in your input file
ben: although since (for a large number of input genes) these apps take a while anyway - maybe we can split the difference.
ben: User hits "go" - sent to intermediate page which 
j-lo: Ben, I'm not sure I understand what you're saying
ben: a) lists IDs that could not be mapped
j-lo: ah - okay - yes
ben: b) says "working......" with a status bar
ben: c) a big ol' red ABORT button if user decides the number of whiffs is too large
sjcarbon: asynchronous will be hard/take time
sjcarbon: also, the mapping is the time consuming part
j-lo: we thought an intermediate page, and the user can assess the 'whiffs', and then alter their input file and rerun, or just continue
j-lo: mapping?
ben: if mapping takes time, then just always stop and ask.
rama: keep it simple. Provide the intermediate page and force the user to make a decision
ben: Also, is that also true for term enrichment?
sjcarbon: that is fine, and can be added once the session stuff that amelia is working on is completed
ben: It doesn't have to be asynchronous.... if the intermediate page is the app that actually submits a (valid) gene list to the mapper...
9:15 AM
sjcarbon: am i correct that you want to have a job running in the background while the user decides if they want to abort?
sjcarbon: that will depend on the session stuff
ben: I see...
sjcarbon: either way
ben: not if they just hit the "stop" and "back" buttons on their browser!
j-lo: so what would the user see?
ben: I wonder how hard that would be to emulate in j/s
sjcarbon: j/s?
ben: ANYWAY.... I have a hard 9:22 PDT deadline, and I am chair, so let's wrap up
ben: javascript
ben: we all agree that we need an intermediate page
ben: we all agree not to solve ID mapping problem in general
rama: yes
val: yep
j-lo: I'm to get more details about Matt's request
ben: details left to seth to implement in a timely manner
j-lo: and we're losing the All/direct options for just ALL
ben: anything else?
9:20 AM
j-lo: yes?
rama: yes
j-lo: cool
ben: it would be great to have a working beta, with some solid examples for the interactome meeting in UK at the end of the month.  
ben: we are doing a go demo/tutorial
ben: ok, meeting closed.  have a nice day/night all