AmiGO-chat-19June07

gwg: OK, let's go then!
gwg: shall we start with the easiest one, which is the AmiGO user queries one
gwg: the question for discussion is when should we send users to the SF tracker, and when should we send them to GO help?
gwg: my feeling is that we should just send users to GO help, and if there is something tracker-worthy, the GO help person adds it
ben: I think the SF tracker would be "cleaner" if only GOC people entered it.  So, I would say, go-help, then a responder enters whatever into SF.
sjcarbon: i'd agree
j-lo: I agree
eurie: I personally think we should always just send users to GO help because that minimizes the confusion about who is being contacted
val: That sounds sensible and seems to be the consensus
gwg: anyone who disagrees, speak now or forever hold your peace!
rama has joined the channel
gwg: OK, we can make sure that all AmiGO-related help links go to GO help and not to SF
gwg: Hi Rama!
rama: Hi, sorry I am late. Not feeling good
val: or to AWG?
val: hi rama
gwg: I'm sure the AWG chat will make you feel better
rama: Absolutely
gwg: I think send to GO help, and they can forward to AWG if required
val: the current form directs to AWG so that also needs changing
j-lo: yep
gwg: OK, next item: Ben's bulk golite load
rama: Do you think the ORB client form will confuse users?
sjcarbon: it would only show up when someone can't find a term
val: yes so that's a bit different (although I'm sure it will confuse users.....)
ben: well, users are confused by anything
gwg: very true!
rama: yes, i agree. But still I found it overwhelming rather than simply writing to gohelp (sorry I am digressing)
val: it's a good point though, but it will be interesting to see whether users can cope with it!
gwg: time will no doubt tell.
gwg: Anyway, on to Ben and the golite bulk load. Take it away, Ben!
rama: we don't have to discuss it today
ben: OK, I have a test amigo load of a golite-like load at
ben: goweb-dev.stanford.edu.
cjm: is that url open to the outside?
ben: It is basically the same as current production amigo
ben: yes, it is open
eurie: Can you provide some background as to purpose/reason for this?
ben: there are some subtleties in the database, and I think it's short a couple hundred sequences.
ben: We are moving to the bulk loader to allow us to load the database faster
ben: so we can do a "full" load in < 10 days (I think we should be down to 2-3)
ben: we could also do a lite "non IEA" load every night
gwg: wow, that would be good!
ben: there is no rush to get this into production, since we have no application that uses the full +IEA database.  But we will eventually move to this
ben: well, golite still takes 8.5 hours.  so you COULD load it every day.
ben: I would also like to test this independently of any amigo changes.
gwg: what are your plans on the testing front?
ben: so we should do it before or after the next release.  It doesn't need extensive banging - just term structure, xrefs, associations, sequences
eurie: If you load production and the dev database with the same datasets, do the numbers of rows in each of the tables match?
ben: I have done many comparisons in sql and the dbs are identical at the association level.  But there have been other changes - like obo 1.2 format - that slightly alter the database.
ben: also, dbxref table changed the order of its columns
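A minimal sketch of the per-table row-count comparison eurie asks about. It is illustrative only: it uses two in-memory SQLite databases as stand-ins for the production and dev GO databases, and the table names and helper functions (`table_row_counts`, `diff_row_counts`, `make_demo_db`) are assumptions made for the example, not part of any existing GO tooling.

```python
import sqlite3


def table_row_counts(conn, tables):
    """Return {table_name: row_count} for each table in `tables`."""
    counts = {}
    for table in tables:
        # Table names come from a fixed list in this sketch, never from user input.
        counts[table] = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    return counts


def diff_row_counts(prod_conn, dev_conn, tables):
    """Return {table: (prod_count, dev_count)} for tables whose counts differ."""
    prod = table_row_counts(prod_conn, tables)
    dev = table_row_counts(dev_conn, tables)
    return {t: (prod[t], dev[t]) for t in tables if prod[t] != dev[t]}


def make_demo_db(n_assoc, n_seq):
    """Build a toy database; the schema here is a placeholder, not the real one."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE association (id INTEGER PRIMARY KEY)")
    conn.execute("CREATE TABLE seq (id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO association (id) VALUES (?)",
                     [(i,) for i in range(n_assoc)])
    conn.executemany("INSERT INTO seq (id) VALUES (?)",
                     [(i,) for i in range(n_seq)])
    return conn


TABLES = ["association", "seq"]
prod = make_demo_db(n_assoc=5, n_seq=4)  # stand-in for production
dev = make_demo_db(n_assoc=5, n_seq=3)   # stand-in for the bulk-loaded dev db
mismatches = diff_row_counts(prod, dev, TABLES)
print(mismatches)
```

With these toy numbers the associations match and only the sequence table is reported as different, which mirrors the situation ben describes. A matching row count does not prove the rows are identical; it is only a quick first check.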
ben: if people would just take a quick look at areas in amigo they are familiar with, to make sure it looks right that would be great.
cjm: seth, we could use your mechanizer script to automate the testing, couldn't we?
ben: well, it's not identical.
sjcarbon: we could
sjcarbon: the looking right part might be trickier
ben: in case it's not clear - the issue is that the underlying TERM database has changed slightly.  I didn't do this.
ben: This slightly tweaks the numbers, even though the gp # and association numbers are all identical.
ben: we also changed the seq loading to be faster, and we seem to have lost a few sequences (but it was far from complete anyway)
eurie: So we need to be checking to make sure that the ontology was loaded and is being presented accurately?
gwg: I was going to suggest we could compare results from AmiGO working on the public database vs on this test db, but then it would be simpler just to query via sql, I would have thought
gwg: any more thoughts on this issue?
ben: yes, there is no need to compare exact numbers side by side.   the only problem I foresee is something really subtle, like the use of SGD reference ids rather than pubmed ids
ben: this isn't bad - it's just a little different, has to do with the ordering of the dbxrefs.
ben: anyway - it's low priority.  but if anyone wants to take a look that would be great
val: i don't see any problems with the refs .....
gwg: OK, cool. The URL is out there, so if people want to test...
gwg: go for it!
ben: we probably SHOULD be using obo1.2 instead of the old format, especially since the graph_path comes out different
val: why is this?
ben: for some reason, the code that generates the graph_path creates many DUPLICATE links using the old obo file.
ben: so the graph is the same, but the counts are all screwy
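A small sketch of the effect ben describes: duplicate path rows leave the set of reachable terms unchanged but inflate any count taken over the rows. The simplified `(ancestor, descendant, distance)` tuples and the term names are assumptions for illustration and do not reproduce the real graph_path schema or its loader.

```python
from collections import Counter


def rows_per_ancestor(path_rows):
    """Count path rows per ancestor term; duplicate rows inflate this count."""
    return Counter(ancestor for ancestor, _descendant, _distance in path_rows)


# Toy path table with one duplicated row (A -> C at distance 2).
rows = [
    ("A", "B", 1),
    ("A", "C", 2),
    ("A", "C", 2),  # duplicate link
    ("B", "C", 1),
]

raw_counts = rows_per_ancestor(rows)
deduped_counts = rows_per_ancestor(set(rows))

# The graph itself (which term reaches which) is the same either way.
same_graph = {(a, d) for a, d, _ in rows} == {(a, d) for a, d, _ in set(rows)}

print(raw_counts["A"], deduped_counts["A"], same_graph)
```

Deduplicating on the full row is only a workaround for counting; it does not explain why the duplicates are generated in the first place.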
cjm: but why? there should be no diffs in 1.0 vs 1.2 that affect this
cjm: either way the path code needs fixed
ben: I don't know, I never touched the goterm loading
val: is it to do with the many is_a and part_of changes affecting the paths/counts?
cjm: but this is nothing to do with 1.0 vs 1.2 - the formats are virtually identical. must be some other cause
val: ah right, it's exactly the same graph...
cjm: we want to fix the path code anyway to allow us to get separate DAGs for is_a and part_of to improve the displays
ben: well, we should probably dump this on the sw group instead of these poor people chris
gwg: OK, so we're kicking this over to the software group?
ben: final thought:
ben: I would like a comment on :
ben: http://amigo.geneontology.org/cgi-bin/amigo/go.cgi?view=assoc&search_constraint=terms&depth=0&query=GO031012
ben: vs:
ben: http://goweb-dev.stanford.edu/cgi-bin/amigo/go.cgi?view=assoc&search_constraint=terms&depth=0&query=GO031012
ben: vis-a-vis the links in the reference column.
ben: we can do this off line
gwg: why don't you send an email about it and we will discuss it that way?
eurie: I really prefer the PubMed IDs being displayed if there is one.
val: same here
gwg: same here
cjm: same here
rama: i agree with eurie. i prefer PMIDs too.
ben: see, that wasn't so bad, was it?
gwg: I didn't expect everyone to be in agreement! :o
ben: interesting question - should ALL mods provide pubmed ids here?
val: I think users would prefer it...
gwg: I think so - where available, obviously
ben: this was just an (unhappy) accident in how the dbxrefs were ordered I guess
pascale: I don't know about others - at dictybase we have both internal IDs and PMIDs, so unless a paper is not indexed (or it's an internal reference), we could provide either
gwg: the more we can get everyone using a standardized system, the better I think
pascale: I agree
gwg: should that be proposed to the annotation group or something?
gwg: (there is an annotation group, isn't there?)
ben: probably need to be precise.  I will take a look at the GA files 
eurie: What is the current status of the files?
rama: yes, mailing list
ben: what do you mean status, eurie
eurie: Just what the column in the GA files actually contains
ben: I just looked, it's a | (pipe) delimited list of ids
ben: either db ids or pubmed.
ben: my code just takes the first one, so it's the order that determines
ben: what shows up in amigo
gwg: I think it needs to be suggested to the annotation group then - use PMID unless there isn't one available; in that case, use the internal ref
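A minimal sketch of the preference discussed here, assuming the reference field is a pipe-delimited list of ids as ben describes: show a PMID when one is present, otherwise fall back to the first id in the list. The function name `pick_display_ref` and the example ids are placeholders made up for illustration; this is not the AmiGO or loader code.

```python
def pick_display_ref(ref_field):
    """Choose one reference id to display from a pipe-delimited field.

    Prefers the first id with a PMID: prefix; otherwise returns the first
    id in the list (e.g. an internal database reference). Returns None if
    the field is empty.
    """
    refs = [r.strip() for r in ref_field.split("|") if r.strip()]
    if not refs:
        return None
    for ref in refs:
        if ref.upper().startswith("PMID:"):
            return ref
    return refs[0]


# Placeholder ids, for illustration only.
print(pick_display_ref("SGD_REF:S000000001|PMID:12345"))
print(pick_display_ref("RGD:0000001"))
```

Selecting by prefix rather than by position means the displayed reference no longer depends on the order in which a group happens to list its ids.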
ben: rgd, for example, has only RGD identifiers.
ben: but something else is wacky here
ben: on that (production) page I sent, the ISS have the SEQUENCE
ben: dbxref (for uniprot anyway)
ben: there is no checking on the type of "ref" you put in the go_ref column (column 6 I think)
gwg: is this a problem with AmiGO itself or an annotation file format / parser problem? I think we should move on to the map2slim if it's the latter
ben: I think latter... we should move on anyway
gwg: ok then, moving on
gwg: map2slim!
gwg: comments, questions, etc.?
sjcarbon: I haven't had a chance to incorporate everyone's comments and suggestions in yet
sjcarbon: thanks for the feedback
sjcarbon: i'll respond to the email threads as I work through them
eurie: I haven't had a chance to catch up on the emails so it may already be in there.  But is there going to be a mechanism for directly inputting a gene list from AmiGO into the map2slim?
gwg: I have a question for AWGers - do you think that users might want to feed a certain subset of the preexisting GO database annots into map2slim?
cjm: amelia - yes
gwg: (oh, Eurie has just asked the same thing)
gwg: Seth, can m2s do that at the moment?
gwg: or is it just a wrapper round the map2slim script?
sjcarbon: no--currently map2slim doesn't have a way of directly reading those in
sjcarbon: we should be able to fix this though
cjm: but if we lifted the code up from the script and integrated it into the API then we could.. 
cjm: perhaps it might be possible to have a 'pervasive' map2slim mode, like a filter, so all query results are automatically mapped. would this be useful - or confusing?
gwg: seems to me if we had like a 'shopping basket' in which you could collect gps which you could then do m2s on, that might be useful
eurie: no worries if it's difficult at this point.  
j-lo: I like the shopping basket idea
gwg: it would be simple to have another option at the bottom of tables where you can select GPs which was something like 'Add GPs to basket', and then something where you could choose what you wanted to do with the GPs in your basket
gwg: e.g. download associations, m2s, get the FASTA seqs, whatever
sjcarbon: i think it would be easier for some of this stuff to wait until we have AmiGO a bit more AJAXified
gwg: but the principle is simple enough - it's just like any online shop where you collect up items and then do something with them
gwg: at the moment, you can collect up items and do something with them if they appear on the same results page
sjcarbon: yes, but it would be easier
cjm: (except you don't often buy 1000s of items in a shop but you might want to for map2slim)
gwg: you don't buy 1000s of items when you're shopping online? 
gwg: Anyway, we are just about out of time now
eurie: Waiting sounds fine with me
gwg: so shall we bring matters to a close?
rama: sure
gwg: action items from today's fun chat:
gwg: 1) Make sure any help-related links from AmiGO go to GO help; don't confuse users with the SF trackers
gwg: 2) Ben has a dev version of the GO db loaded via the new system available if people would like to have a point 'n' click around
gwg: 3) Could someone on the annotation mailing list volunteer to propose that PMIDs be used as the standard literature reference where they are available, and if they aren't available, the internal db ref can be used
gwg: 4) possible problems with the graph_path tables in the db and various db loading issues to be passed on to the software group
ben: 4) go-sw group to look into termdb loading
cjm: (regarding #3 - I don't think anything needs done. this is already the case. but we can follow up in email)
gwg: 5) isn't really an action item, but Seth is looking at suggestions / etc. about m2s.
gwg: if someone could follow it up in email that might be good
gwg: OK, I think that's it. The torture is over for another fortnight!
ben: adios