From GO Wiki
Revision as of 10:27, 3 July 2007 by Eurie

(08:38:13 AM) rama: shall we get started?
(08:38:21 AM) sjcarbo1: sorry--new client
(08:38:44 AM) sjcarbo1: yes
(08:39:39 AM) rama: we were going to review the download options at biomart, NCBI etc
(08:40:12 AM) rama: who wants to go first?
(08:40:17 AM) ben: I can go first
(08:40:22 AM) rama: Go for it Ben
(08:40:46 AM) ben: well - I looked at the Biomart 0.5 implementation at EBI
(08:41:17 AM) ben: this querying system allows you to basically construct your own "excel" table from various columns from various databases.
(08:41:33 AM) ben: There are probably millions of possible outputs.
(08:42:12 AM) ben: Basically you go through some javascript forms to set up the columns you want in your output, then you construct "filters" which limit the rows (so you don't get back the whole genome or whatever)
(08:42:30 AM) rama: How intuitive and easy are those forms?
(08:42:39 AM) ben: they suck
(08:42:48 AM) rama: ho
(08:42:56 AM) ben: but no fault of design - it's just an impossible task
(08:43:05 AM) ben: I mean there are HUNDREDS of columns
(08:43:11 AM) ben: but we won't have that problem
(08:43:32 AM) sjcarbo1: besides the volume, were there any other problems?
(08:43:37 AM) ben: I think for amigo purposes, we can probably just ship some set of columns. The filtering system is pretty good.
(08:43:51 AM) ben: No, I actually like it for what it does
(08:44:01 AM) ben: other cool things:
(08:44:22 AM) ben: "count" button - so when you've assembled a query and possibly filters you can see how many rows you get back.
(08:44:40 AM) ben: Easy to get varaious options (HTML, CSV, TSV, XSL, XML)
(08:44:45 AM) ben: obv. I meant various
(08:45:05 AM) ben: can export to File, browser, or compressed file
(08:45:25 AM) ben: The ability to conjoin with other databases is pretty cool - but I wouldn't think of implementing this for us.
(08:46:27 AM) rama: you mean other databases like Taxon, OMIM, Pubmed
(08:46:28 AM) ben: oh, and they have a web service and a GUI button to ship you the XML you need to re-fetch the same query.
(08:46:53 AM) ben: no, sorry - I didn't explain scope. Biomart generally searches genomes - human, rat, etc.
(08:47:04 AM) ben: if you link databases, you can combine queries.
(08:47:17 AM) ben: genome _databases_
(08:47:30 AM) rama: I see
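The model Ben describes (pick output columns, add filters that limit the rows, check a "count" before exporting) can be sketched roughly as follows. This is a minimal illustration only: the rows, the column names, and the function names are all hypothetical and do not reflect the actual Biomart or AmiGO schema or API.

```python
import csv
import io

# Hypothetical sample rows; the column names are illustrative assumptions,
# not a real Biomart or AmiGO schema.
ROWS = [
    {"gene_product": "gp1", "species": "human", "evidence": "IDA", "reference": "ref1"},
    {"gene_product": "gp2", "species": "rat", "evidence": "IEA", "reference": "ref2"},
    {"gene_product": "gp3", "species": "human", "evidence": "IEA", "reference": "ref3"},
]


def apply_filters(rows, filters):
    """Row selection: keep rows matching every column == value filter."""
    return [r for r in rows if all(r.get(col) == val for col, val in filters.items())]


def count(rows, filters):
    """The 'count' button: how many rows the query would return."""
    return len(apply_filters(rows, filters))


def export(rows, columns, filters, delimiter="\t"):
    """Column selection plus export as delimited text (TSV by default)."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    writer.writerow(columns)
    for row in apply_filters(rows, filters):
        writer.writerow([row[col] for col in columns])
    return out.getvalue()


if __name__ == "__main__":
    human_only = {"species": "human"}
    print(count(ROWS, human_only))
    print(export(ROWS, ["gene_product", "evidence"], human_only))
```

Passing `delimiter=","` would give CSV instead of TSV; shipping a fixed default column set, as suggested for AmiGO, amounts to hardcoding the `columns` argument.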
(08:47:56 AM) j-lo: I've used Biomart before, and although it's not immediately intuitive, it is very cool because it's so configurable
(08:48:14 AM) sjcarbo1: how was the learning curve?
(08:48:36 AM) j-lo: not too bad - one session and I had a fairly good idea how to use
(08:48:37 AM) j-lo: it
(08:48:52 AM) ben: it's irrelevant - right?
(08:49:09 AM) ben: we are not going to actually USE biomart, just rip off its interfaces (if we like them)
(08:49:24 AM) sjcarbo1: thus the learning curve
(08:49:45 AM) sjcarbo1: where stuff is, how to get to things
(08:49:52 AM) eurie: yeah, i think the hardest thing was understanding what all the labels meant but I think if we label things on the interface appropriately, this should be easier.
(08:50:02 AM) rama: even to get the general idea of what it can do
(08:50:42 AM) j-lo: I agree with Eurie - it's figuring out what data are in the different columns that makes it tricky
(08:51:00 AM) j-lo: if we can make that intuitive we'd be laughing
(08:51:26 AM) ben: I think we would be better served by giving fewer options and just defaulting the output to more columns
(08:51:47 AM) j-lo: yeah - our database is much simpler so we could do that
(08:52:13 AM) eurie: Also, alluding to the wizard vs. single page discussion over email, the filtering options (which data you want to restrict by) are all on one page. which helps you see everything at once.
(08:52:20 AM) ben: biomart is like that because it has to be generic - you can interface any database to it, so obviously it doesn't know what columns to show.
(08:52:35 AM) ben: although in principle you could "hardwire" biomart ... but I digress
(08:53:32 AM) eurie: I think the older biomart implementation was a wizard - multi page step through.
(08:54:29 AM) gwg: the ideal implementation would be something which allowed the more advanced users to configure the results to their liking, but which wouldn't alienate newbies by being too complex
(08:54:29 AM) rama left the room (quit: Read error: Connection reset by peer).
(08:55:29 AM) eurie: oops, we lost our chair
(08:55:51 AM) gwg: perhaps the default should be simple, so that people aren't overwhelmed by huge amounts of data
(08:56:01 AM) rama [] entered the room.
(08:56:11 AM) rama: sorry, i got disconnected
(08:56:22 AM) ben: I guess to break it down:
(08:56:42 AM) ben: row selection good. column selection bad.
(08:56:55 AM) rama: Can somebody save the transcript for me ( I lost parts of it).
(08:57:02 AM) ben: maybe in blocks it's OK -
(08:57:21 AM) ben: we assume people want gene_products -> associations
(08:57:31 AM) gwg: for example, I think that the a
(08:57:35 AM) gwg: ooops, sorry
(08:57:44 AM) ben: +/- species +/- evidence (?!?) +/- references
(08:57:50 AM) sjcarbo1: (rama--I'll save and send what I have)
(08:57:59 AM) ben: oh, we should also move on to another interface
(08:58:10 AM) rama: I was just about to say that Ben
(08:58:19 AM) rama: Donghui, do you want to go?
(08:58:37 AM) donghui: sure
(08:59:16 AM) donghui: just a second
(09:00:16 AM) donghui: <a href=""></a>
(09:00:32 AM) donghui: first, the Batch Search has now been combined with the "regular" Entrez Search: "You will need Batch Entrez now only to upload a file of GI or accession numbers for an Entrez search. You can do all other large searches directly within Entrez."
(09:01:22 AM) donghui: I don't know exactly how to do a large search directly within entrez
(09:01:34 AM) donghui: So i tried the Batch search
(09:02:07 AM) donghui: The nicest part of Entrez is its result display: very flexible. You can filter the results in all kinds of ways.
(09:02:07 AM) donghui: You can choose to display different parts of the gene information: Brief, Summary, Sequence in different format etc.
(09:02:07 AM) donghui: You can specify the number of hits to display per page as well as sort results by accession and Taxonomy ID.
(09:02:07 AM) donghui: It allows users to send the results to various destinations: as txt, as a file, printer or clipboard.
(09:03:11 AM) donghui: the other nice feature is that Entrez allows users to save search history for 8 hrs, which is useful
(09:04:55 AM) donghui: if you want to try the search you can put the following GI/accession numbers into a txt file:
(09:05:00 AM) donghui: The nicest part of Entrez is its result display: very flexible. You can filter the results in all kinds of ways.
(09:05:00 AM) donghui: You can choose to display different parts of the gene information: Brief, Summary, Sequence in different format etc.
(09:05:00 AM) donghui: You can specify the number of hits to display per page as well as sort results by accession and Taxonomy ID.
(09:05:00 AM) donghui: It allows users to send the results to various destinations: as txt, as a file, printer or clipboard.
(09:05:04 AM) donghui: sorry
(09:05:13 AM) donghui: AL436771
(09:05:14 AM) donghui: AL423804
(09:05:14 AM) donghui: AL422882
(09:05:14 AM) donghui: CZ193470
(09:05:14 AM) donghui: DQ052056
(09:05:16 AM) donghui: DQ052055
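A small sketch of preparing the upload file from the accession numbers listed above. The one-identifier-per-line layout is an assumption about what the batch upload expects, and the function names and the file name `batch_ids.txt` are hypothetical.

```python
from pathlib import Path

# Accession numbers quoted in the transcript.
ACCESSIONS = ["AL436771", "AL423804", "AL422882", "CZ193470", "DQ052056", "DQ052055"]


def format_id_list(ids):
    """Return the identifiers one per line, dropping blanks and repeats
    while keeping the original order. The one-per-line layout is assumed."""
    seen = set()
    cleaned = []
    for raw in ids:
        ident = raw.strip()
        if ident and ident not in seen:
            seen.add(ident)
            cleaned.append(ident)
    return "\n".join(cleaned) + "\n"


def write_id_file(ids, path):
    """Write the formatted list to a plain-text file for upload."""
    Path(path).write_text(format_id_list(ids), encoding="ascii")


# Example (hypothetical file name):
# write_id_file(ACCESSIONS, "batch_ids.txt")
```

As noted later in the discussion, the same file only returned results once the database was switched from Nucleotide to GSS, so the file format alone does not guarantee a hit.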
(09:06:54 AM) rama: are there links and tools relevant to your search results?
(09:07:26 AM) donghui: what do you mean by tools?
(09:07:42 AM) rama: for example ClustalW to see alignment?
(09:08:09 AM) donghui: I did not see that
(09:08:38 AM) j-lo: I couldn't get it to work - says 'No query #1 in history for database nucleotide'
(09:09:09 AM) donghui: that's strange, it happened to me too.
(09:10:39 AM) j-lo: anyone else get it to work?
(09:10:57 AM) rama: Donghui, for batch download do you always need GI/accession #s or will gene names work?
(09:13:09 AM) rama: I am going to move on
(09:13:14 AM) rama: to SRS
(09:13:18 AM) donghui: yes, that's a pain
(09:13:49 AM) donghui: the file can only contain GI or accession numbers, I have tried these numbers when I tested it
(09:14:30 AM) rama: thanks Donghui.
(09:14:43 AM) donghui: if I want to retrieve sequences for Arabidopsis genes and I only know the Arabidopsis locus identifier, these identifiers have to be converted to GI or accession numbers first
(09:15:44 AM) rama: that is a pain.
(09:16:03 AM) rama: the take home is we can look at their results page for display ideas
(09:16:05 AM) rama: ?
(09:16:37 AM) rama: <a href=""></a>
(09:16:51 AM) rama: Anybody, any questions on NCBI?
(09:17:52 AM) rama: Okay...moving on. Please visit SRS at the above URL
(09:18:43 AM) rama: A few things I liked: 1) They have a whole list of databases that one can choose from. They include Pubmed, Taxon, etc. (databases are not limited to what is maintained at EBI)
(09:19:57 AM) rama: 2) Views- you can control what you want to see, what you want hyperlinked. For example, for a pubmed citation, you can view only the PMID and not the other details.
(09:20:11 AM) rama: 3) Tools relevant to your search results
(09:20:52 AM) rama: If you have a list of proteins as output, it suggests tools that can be used to analyze that list further. ClustalW for example. They have a whole suite of tools
(09:21:41 AM) rama: 4) All your queries are saved in the 'Projects' link on the top tool bar.
(09:22:50 AM) rama: First time I saw the interface it was overwhelming (like Biomart), but then it has lots of features
(09:23:10 AM) eurie: Rama, how long are your queries saved? Just for that session?
(09:23:29 AM) rama: i think just for that session.
(09:24:34 AM) rama: it is more than a tool for downloading
(09:25:06 AM) rama: I meant, it has more utilities than just downloading options
(09:25:31 AM) rama: I am not sure if they store all the Pubmed data inhouse or if it is accessed on the fly
(09:25:50 AM) j-lo: I like their 'quick search' page - the library page is a bit off-putting
(09:26:11 AM) j-lo: (PubMed is kept in-house btw)
(09:26:19 AM) rama: Jane, I totally agree with you. Everything is overwhelming
(09:27:03 AM) rama: one needs to get familiar with the system
(09:27:08 AM) j-lo: it's better than it used to be - last time i used it you went straight to that library page
(09:27:30 AM) j-lo: if you're not familiar with what's in what database, you don't have a clue
(09:27:46 AM) rama: absolutely.
(09:28:10 AM) rama: I am familiar with only a few of the DBs listed on that page
(09:29:03 AM) gwg: I've always found it too intimidating to use!
(09:29:11 AM) rama: in the Pathways section for example, I had no idea what Ureaction Lreaction etc., meant
(09:29:49 AM) rama: I agree it is intimidating. But I think there are some features that we might want to copy
(09:30:41 AM) gwg: I don't think AmiGO could ever be that intimidating as we're only covering one database - they have so many databases to query that if you don't know what you're looking for, it's just overwhelming
(09:30:42 AM) rama: As I said earlier, it is not just a tool for downloading. It aims to be comprehensive and take a user through a workflow
(09:31:22 AM) rama: I just realized it is almost 9:30 AM
(09:31:26 AM) donghui: I got the NCBI search to work with the same file, do we still have time to give it a try after the SRS? We need to change the database from Nucleotide to GSS. The same file will work.
(09:32:05 AM) rama: hmmm...I am afraid we need to wrap up things soon.
(09:32:20 AM) rama: Any questions so far?
(09:32:37 AM) ben: maybe email the file donghui and we can run it on our own?
(09:32:43 AM) donghui: OK, here is the result page:
(09:32:45 AM) gwg: does anyone need to run off? otherwise I reckon keep going a bit longer as it's interesting to see the + / - points of these tools
(09:32:46 AM) donghui: <a href=""></a>
(09:32:48 AM) rama: Sure. good idea Ben
(09:33:24 AM) rama: I am fine to continue
(09:33:54 AM) sjcarbo1: i'm not going anywhere
(09:34:11 AM) j-lo: me too
(09:34:14 AM) rama: Please speak up if you *have to go*
(09:34:28 AM) donghui: alright, let's go back to NCBI for a few min
(09:34:34 AM) rama: okay
(09:34:50 AM) donghui: the above link will direct you to the results page
(09:34:55 AM) j-lo: I like the 'sort by' and 'send to' pull-down menus
(09:35:51 AM) ben: I must leave in 15-20 mins or so.
(09:35:58 AM) donghui: The NCBI interface is very simple, however, like what happened, the query file has to be properly formatted (with GI or accession numbers), which is not very flexible
(09:36:03 AM) donghui: but I do like the results page
(09:36:17 AM) donghui: Click the Display button
(09:36:20 AM) rama: I love the results page. It is simple
(09:36:50 AM) donghui: the results page is great, that's the nicest part of Entrez
(09:37:12 AM) j-lo: so results page good, configurability bad
(09:37:16 AM) ben: I don't really care for it
(09:38:55 AM) ben: the results page... that is. It's fine I guess
(09:40:03 AM) rama: any other comments?
(09:40:06 AM) eurie: The one that frustrates me is that once you've selected an alternate display on the pull-down, it's not clear how to get back to your original results page.
(09:40:51 AM) eurie: And the pull-down options on other displays aren't consistent
(09:41:17 AM) gwg: I agree with Eurie on the alt displays
(09:41:18 AM) eurie: for example, you don't get all the same options if you've selected "graph"
(09:41:38 AM) donghui: you are right, they are not consistent
(09:41:44 AM) j-lo: it's a bit buggy too - keeps giving me error messages
(09:41:58 AM) gwg: I also don't like the way that there's no 'confirm', it just automatically switches displays - that's annoying if you pick the wrong one
(09:42:15 AM) eurie: maybe we've confused the crap out of it because we've got ip addresses around the world hitting the same page.  ;)
(09:42:31 AM) rama: haha
(09:43:24 AM) eurie: but one principle that i like in general is if there are multiple pages or views or lots of categories, how do i get to a previous one without hitting the back button.
(09:43:25 AM) donghui: I actually like the No Confirm, you don't have to do that extra click
(09:45:52 AM) rama: any more comments on NCBI?
(09:46:58 AM) rama: any questions on SRS?
(09:47:52 AM) gwg: could I ask a couple of general questions if the summaries are finished?
(09:48:05 AM) rama: Go on GWG
(09:48:19 AM) gwg: these are for all the people who've looked at tools
(09:48:58 AM) gwg: did you need to have a clear goal in mind and background knowledge of the data source to be able to use the tool effectively?
(09:50:16 AM) ben: no
(09:50:32 AM) ben: although "what is effective"
(09:50:49 AM) gwg: I guess 'to be able to use the tool' would be better
(09:50:53 AM) rama: I had some goal in mind I would say
(09:51:03 AM) rama: otherwise I couldn't evaluate the results
(09:51:29 AM) gwg: were there specific features that made the tool easier to use?
(09:51:40 AM) ben: I liked "count" button
(09:52:02 AM) rama: nothing in particular for SRS
(09:52:06 AM) gwg: and conversely, things that made it hard to use?
(09:52:12 AM) ben: too many options
(09:52:26 AM) rama: SRS suffers from data overload
(09:52:30 AM) ben: ok, I gotta go
(09:52:40 AM) gwg: thanks Ben!
(09:52:41 AM) donghui: have to know which specific database to search
(09:52:44 AM) rama: see you. thanks Ben
(09:52:55 AM) donghui: shouldn't be a problem for amigo
(09:53:03 AM) donghui: I have to go too
(09:53:13 AM) rama: shall we wrap up?
(09:53:20 AM) gwg: yup, sounds good
(09:53:29 AM) rama: before I finish, somebody please save the transcript
(09:53:40 AM) sjcarbo1: saved
(09:53:50 AM) rama: I guess, I don't have any action items
(09:54:12 AM) gwg: quick! Let's think of some action items for Rama!
(09:54:30 AM) rama: We all like some options in all these sites. Seems like Biomart is very flexible and usable.
(09:54:39 AM) rama: NCBI has good results display
(09:55:20 AM) rama: SRS kind of integrates tools to the data and tries to offer a complete utility
(09:55:34 AM) rama: Quick action items?
(09:56:28 AM) gwg: I don't know that there are any really...
(09:56:34 AM) sjcarbo1: if we missed anything, we could finish over email
(09:56:44 AM) rama: okay. good. see you all in 2 weeks
(09:56:54 AM) gwg: see you then!
(09:56:57 AM) rama: Seth, please email transcript to the WG
(09:56:57 AM) gwg left the room (quit: Quit: Goodbye cruel world!).
(09:56:58 AM) rama: thanks
(09:57:08 AM) sjcarbo1: no problem--bye