AmiGO-chat-August1407

ben: OK.
ben: I guess this is it.  Only item is slim mapper.
ben: URL?
sjcarbon: http://toy.lbl.gov:9002/cgi-bin/amigo/map2slim.cgi
8:40 AM
ben: well, there is 1 bug
ben: Firefox 2.0 Mac OSX has 2 floating scroll bars about 3/4ths of the way down on the rhs
sjcarbon: !
sjcarbon: lemme turn my mac on
ben: ah, they are useful when you open [Display gene product filters]
j-lo: I have a request from Matt
j-lo: that when you add a load of GO ids as a slim, it can display them as an indented set
val: did you make a decision about how IDs are handled?
j-lo: so you can see the hierarchy
sjcarbon: an indented set meaning just fancier whitespace?
j-lo: yep
val: I mean gene product IDs, sorry 2 questions at once
8:45 AM
ben: another usefulness - ga files are typically gzipped.  You get an error if you upload a gzipped gene association file
sjcarbon: whitespace should not be an issue
val: Matt's request is difficult, how would you decide how to indent?
val: so he wants a complete graph as output?
rama: Aren't we doing what Princeton does? Provide all GA files in a pull down?
j-lo: we're doing that too
j-lo: I don't think he wants a full graph, Val, just the terms you list
val: I don't understand Matt's request, sounds like he just wants the graphical output
rama: I think Matt's request should be an enhancement- shd add to SF
val: but if they have multiple parents this will be like the AmiGO 'view'
ben: seth, are you debugging when you should be answering questions?
sjcarbon: ben--i can add the questions are coming a bit fast
sjcarbon: oops two thoughts
ben: ok  no more q's for a minute guys
ben: I am running the sgd ga file vs. Plant slim
ben: because that's the kind of guy I am
sjcarbon: i can add the gzip if people find that helpful
val: I still find the input confusing.
sjcarbon: that's already written somewhere
ben: seth - re: gzip, at least have it catch the exception.  Low priority to actually have it unzip
sjcarbon: i still need more information about exactly what matt wants
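A minimal sketch of the gzip handling ben asks for above: detect a gzipped upload and either decompress it or fail with a specific error instead of a generic 'nothing found'. The function name `read_association_text` and the error wording are assumptions for illustration, not part of AmiGO.

```python
import gzip
import zlib

# The first two bytes of any gzip stream.
GZIP_MAGIC = b"\x1f\x8b"


def read_association_text(raw: bytes) -> str:
    """Return the text of an uploaded gene association file.

    Hypothetical helper: accepts plain or gzipped bytes. A broken gzip
    upload raises ValueError so the caller can show a specific message.
    """
    if raw[:2] == GZIP_MAGIC:
        try:
            raw = gzip.decompress(raw)
        except (OSError, EOFError, zlib.error) as exc:
            raise ValueError(
                "upload looks gzipped but could not be decompressed"
            ) from exc
    return raw.decode("utf-8", errors="replace")
```

Catching the exception alone covers the low-priority request; the `gzip.decompress` call covers the optional "actually unzip it" part.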
8:50 AM
val: I can input pombe gene symbols cdc2,cdc25,act1 etc.
j-lo: yeah - I'll check and get back to you!
val: There is nothing in the output that tells me which org I have searched.
val: When I click on the GO id link, I get the AmiGO view for all genes annotated to this term,
ben: my yeast-to-plant finished, took about 5 minutes (2800 terms, 6500 gps)
val: not the list of genes I added
sjcarbon: ben--you want an error for the gzipping specifically rather than a 'nothing found' error?
val: I don't think that is too long, as long as the user knows it is running
ben: I think this came up in a previous chat that I missed, but the distinction between #Inferred GPs and # Direct GPs is not obvious at all.
ben: Also, I think people would like Pvalues
sjcarbon: no pvalues in mapping
ben: Also we need to think a little about what to do with root/unknown terms
val: Ben, why do you need P values for a slim?
rama: How would you calculate pvalues for mappings
val: they are just mapping up the tree?
rama: I think counts makes sense
sjcarbon: that should be term enrichment
ben: well, never mind then
ben: but why do we need 2 tools?
val: also, don't understand direct vs inferred?
val: They do different jobs?
val: one is an enrichment tool, one is a slimmer
ben: I actually DO understand direct/inferred, but it requires exposition, at least a click away.
sjcarbon: direct is the most direct mapping in the slim while inferred is all of the ancestors in the slim
rama: Val- ben direct annotations vs all 
rama: Sorry, I meant- ben meant Direct annotations vs All
ben: users won't understand the why or how about the difference between direct/inferred.
sjcarbon: ben we can add help to the column description
val: it shouldn't be inferred. These aren't inferred
val: it should be INDIRECT
ben: Also - this is a little curious, "inferred" GPs to root terms should be 100%?
val: No actually Rama is correct, it should be direct vs ALL
ben: in my sgd->plant slim I get 100% CC, 99.3% MF, 96.9% BP
rama: At SGD, we just should the genes mapped to the GO slim. We don't show the direct vs all
8:55 AM
rama: I am really sorry. my typing is bad today
val: I don't think we should allow direct/indirect for slimming, it doesn't make sense. It should be ALL always
j-lo: I agree
ben: yeah, actually, the whole point of the slim (one of them) is that we don't give a fig about the ontological structuremcgimmick
rama: I meant At SGD, we just *show*
ben: so I would just dump inferred/indirect
ben: unless you can show it graphically.
j-lo: yep
ben: which would be an enhancement
val: I don't even think it is needed graphically, if they want to look at this they want an enrichment tool, not a slimmer
rama: I think it is confusing to show direct and All. We shd list all genes that map to a GO slim term
sjcarbon: rama--couldn't that get rather large and confusing?
j-lo: what's the alternative?
val: So do we agree, ditch direct, show ALL?
sjcarbon: already ditched
j-lo: great
val: It needs to be ALL, that is what the tool is for, 
val: great
ben: duh
rama: Thanks Val. That is exactly the point
ben: I get it now
ben: but why is "cellular process" in the Plant slim... (plz. don't answer here)
rama: Ben, please send mail to TAIR curator
j-lo: Shall we discuss gene ids?
val: please
ben: and we need to handle root terms specially
ben: otherwise they will always all be 100/100/100%
ben: we had this problem at SGD
val: I think there was an earlier mail to only show the output for the terms in the slims, not for all their parents too, this would solve this
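A rough sketch of the "ALL" mapping agreed above: each gene product counts toward every slim term that is an ancestor of (or equal to) one of its annotated terms, and only slim terms are reported. The term IDs, gene product names, and function names below are hypothetical; this is not the map2slim implementation.

```python
# Hypothetical toy ontology: term -> list of direct parents.
PARENTS = {
    "T:root": [],
    "T:A": ["T:root"],
    "T:B": ["T:root"],
    "T:A1": ["T:A"],
    "T:AB": ["T:A", "T:B"],  # a term with two parents
}


def ancestors_or_self(term, parents):
    """Return the term plus every term reachable by following parent links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def map_to_slim(annotations, slim, parents):
    """Map gene products to slim terms (ALL, i.e. direct plus indirect).

    annotations: gene product -> iterable of annotated terms
    slim: set of slim terms
    Returns: slim term -> set of gene products mapped to it.
    """
    result = {term: set() for term in slim}
    for gene_product, terms in annotations.items():
        for term in terms:
            for hit in ancestors_or_self(term, parents) & slim:
                result[hit].add(gene_product)
    return result


ANNOTATIONS = {"gp1": ["T:A1"], "gp2": ["T:AB"], "gp3": ["T:B"]}
SLIM = {"T:root", "T:A", "T:B"}
RESULT = map_to_slim(ANNOTATIONS, SLIM, PARENTS)
```

In this toy data every gene product lands on `T:root`, which is the 100% effect ben describes whenever a slim includes a root term.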
9:00 AM
rama: do all the slims have the root terms listed? I know SGD does
ben: unless people put the root terms in their slims
sjcarbon: only slim terms are displayed now
val: Although I am getting the same problem Ben pointed out so it could be a bug.
val: Ah so the yeast slim contains the root nodes?
rama: yes
val: OK,
ben: OK, gene ids?  Can someone summarize the Q/issue for me/
ben: ?
j-lo: I can try
val: Did you see the e-mail, it was quite long, I'll have a go
val: you GO jane
j-lo: when people input their gene list into the mapper
j-lo: it may be in any of several formats
j-lo: some of which have name redundancy between and within species
j-lo: also, some genes may simply not be in the database
ben: val - I just got back from vacay
j-lo: so how should AmiGO handle this?
pasc_g has joined the channel
ben: got it.  The fundamental job of bioinformatics
val: With GO term mapper and GO term finder you preselect your organism, this reduces the problem partially.
j-lo: heh - yep!
ben: I suggest we do not attempt to solve this problem in general for the next amigo release
9:05 AM
rama: the interface will show an intermediate page with IDS that it can find with an option to either move on or go back and fix
rama: Sorry, typo- I meant 'IDS it cannot find'
val: Sounds sensible
ben: there are two variants of this:
val: but you should also be able to preselect your organism
ben: 1) abort function, and query user
ben: 2) just pass through with message that some IDs could not be mapped uniquely
rama: yes, the intermediate page should say that those IDS were not valid for that species
donghui has joined the channel
val: Most of the problems would go away if people were encouraged to use the systematic identifiers for their species (as most users do with any genome-wide data).
ben: or uniprot ids
val: Most problems occur because we are allowing cdc2 cdc28 etc
val: I don't think Term mapper allows this, it expects systematic identifiers
val: (I meant the Princeton version)
9:10 AM
ben: going back to 1) vs. 2) above, the "correct" answer is some non-linear function of the number of "whiffs" in your input file
ben: although since (for a large number of input genes) these apps take a while anyway - maybe we can split the difference.
ben: User hits "go" - sent to intermediate page which 
j-lo: Ben, I'm not sure I understand what you're saying
ben: a) lists IDs that could not be mapped
j-lo: ah - okay - yes
ben: b) says "working......" with a status bar
ben: c) a big ol' red ABORT button if user decides the number of whiffs is too large
sjcarbon: asynchronous will be hard/take time
sjcarbon: also, the mapping is the time consuming part
j-lo: we thought an intermediate page, and the user can assess the 'whiffs', and then alter their input file and rerun, or just continue
j-lo: mapping?
ben: if mapping takes time, then just always stop and ask.
rama: keep it simple. Provide the intermediate page and force the user to make a decision
ben: Also, is that also true for term enrichment?
sjcarbon: that is fine, and can be added once the session stuff that amelia is working on is completed
ben: It doesn't have to be asynchronous.... if the intermediate page is the app that actually submits a (valid) gene list to the mapper...
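One way the intermediate page could be fed, as a sketch only: split the user's input into IDs that map uniquely, IDs that are ambiguous, and IDs that cannot be found for the preselected organism, then show the last two groups before anything is submitted to the mapper. The function name, the index structure, and the `GP:` accessions are made up for illustration.

```python
def partition_ids(input_ids, symbol_index):
    """Split user-supplied identifiers into mapped, ambiguous, and missing.

    symbol_index: identifier -> list of gene product accessions for the
    preselected organism (hypothetical structure).
    Returns (mapped, ambiguous, missing):
      mapped:    identifier -> the single accession it resolves to
      ambiguous: identifier -> all candidate accessions
      missing:   identifiers with no match, in input order
    """
    mapped, ambiguous, missing = {}, {}, []
    for raw in input_ids:
        key = raw.strip()
        if not key:
            continue  # ignore blank entries
        hits = symbol_index.get(key, [])
        if len(hits) == 1:
            mapped[key] = hits[0]
        elif hits:
            ambiguous[key] = list(hits)
        else:
            missing.append(key)
    return mapped, ambiguous, missing


# Hypothetical index; the accessions are placeholders, not real IDs.
EXAMPLE_INDEX = {
    "cdc2": ["GP:0001"],
    "cdc25": ["GP:0002"],
    "act1": ["GP:0003", "GP:0004"],  # a symbol that is not unique
}
```

With this split, the page can list `missing` and `ambiguous` and only pass the `mapped` values on if the user chooses to continue, which keeps the flow synchronous.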
9:15 AM
sjcarbon: am i correct that you want to have a job running in the background while the user decides if they want to abort?
sjcarbon: that will depend on the session stuff
ben: I see...
sjcarbon: either way
ben: not if they just hit the "stop" and "back" buttons on their browser!
j-lo: so what would the user see?
ben: I wonder how hard that would be to emulate in j/s
sjcarbon: j/s?
ben: ANYWAY.... I have a hard 9:22 PDT deadline, and I am chair, so let's wrap up
ben: javascript
ben: we all agree that we need an intermediate page
ben: we all agree not to solve ID mapping problem in general
rama: yes
val: yep
j-lo: I'm to get more details about Matt's request
ben: details left to seth to implement in a timely manner
j-lo: and we're losing the All/direct options for just ALL
ben: anything else?
9:20 AM
j-lo: yes?
rama: yes
j-lo: cool
ben: it would be great to have a working beta, with some solid examples for the interactome meeting in UK at the end of the month.  
ben: we are doing a go demo/tutorial
ben: ok, meeting closed.  have a nice day/night all