SWUG:Meeting 2011 06 14

Topic: loading of PANTHER family trees into Gold

Present: Chris, Sven, Mary, Shahid, Seth

Summary:

  • PTL (PhyloTreeLoader) currently works standalone, but needs JUnit tests and integration into the gold framework
  • The new paintscraper dumps NHX plus additional metadata; the files are currently on Sven's machine
  • Mary gave an explanation of subfamilies (SFs) in PANTHER

TODO:

  • phylo.sql needs to be documented better -- cjm
  • phylo.sql -- additional fields for SFs, speciation/duplication and other metadata -- cjm/sven
  • junit tests plus documentation for PhyloTreeLoader -- sven
  • ?integrate paintscraper and PhyloTreeLoader (org.geneontology.gold.io) -- sven?
  • ?include SFs? -- sven, mary
  • integrate PhyloTreeLoader into admin web interface -- shahid, sven
  • define JSON format for trees -- seth, sven
  • extend gold->solr sql mapping to include families -- seth, cjm
    • add family as a document in itself in solr
    • add bioentity2family mappings to current bioentity documents (from family_member table)
  • add filter boxes in amigo2 demo interface -- seth
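
For the "define JSON format for trees" item, one possible starting point is to turn the paintscraper's extended-Newick (NHX) output into nested objects. The following is only a rough sketch, not an agreed format: the JSON field names (`id`, `species`, `event`, `branch_length`, `children`) are assumptions made for illustration, and so is the reading of `Ev=0>1` as speciation and `Ev=1>0` as duplication (it is consistent with `S=` appearing only on `0>1` nodes in the tree pasted in the transcript). The sample string is one subtree taken from that tree.

```python
import json
import re

NHX_RE = re.compile(r"\[&&NHX:([^\]]*)\]")   # e.g. [&&NHX:Ev=0>1:S=Homo-Pan:ID=AN16]
LABEL_RE = re.compile(r"[^:,()\[\];]*")      # node label, may be empty
LENGTH_RE = re.compile(r":([0-9.eE+-]+)")    # branch length, e.g. :0.049

# Assumed interpretation of the Ev tag; unknown values are passed through as-is.
EVENTS = {"0>1": "speciation", "1>0": "duplication"}


def parse_tree(text):
    """Parse one extended-Newick string into nested dicts (hypothetical JSON shape)."""
    text = text.strip()
    if not text.endswith(";"):
        text += ";"
    pos = 0

    def parse_node():
        nonlocal pos
        children = []
        if text[pos] == "(":
            pos += 1  # consume "("
            children.append(parse_node())
            while text[pos] == ",":
                pos += 1
                children.append(parse_node())
            if text[pos] != ")":
                raise ValueError(f"expected ')' at offset {pos}")
            pos += 1  # consume ")"

        node = {}
        label = LABEL_RE.match(text, pos)
        if label.group(0):
            node["id"] = label.group(0)
        pos = label.end()

        length = LENGTH_RE.match(text, pos)
        if length:
            node["branch_length"] = float(length.group(1))
            pos = length.end()

        nhx = NHX_RE.match(text, pos)
        if nhx:
            fields = nhx.group(1).split(":")
            tags = dict(f.split("=", 1) for f in fields if "=" in f)
            if "ID" in tags:
                node["id"] = tags["ID"]
            if "S" in tags:
                node["species"] = tags["S"]
            if "Ev" in tags:
                node["event"] = EVENTS.get(tags["Ev"], tags["Ev"])
            pos = nhx.end()

        if children:
            node["children"] = children
        return node

    return parse_node()


if __name__ == "__main__":
    sample = "(AN17:0.049,AN18:2.000):0.017[&&NHX:Ev=0>1:S=Homo-Pan:ID=AN16];"
    print(json.dumps(parse_tree(sample), indent=2))
```

A nested layout like this keeps the speciation/duplication and species metadata on each internal node, which is the same information the additional phylo.sql fields would need to hold.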

Skype transcript:

[6/14/11 1:17:49 PM] Chris Mungall: http://amigo.berkeleybop.org/amigo/panther/
[6/14/11 1:19:46 PM] Chris Mungall: http://www.geneontology.org/gene-associations/submission/paint/PTHR10000/PTHR10000.tree
[6/14/11 1:20:23 PM] Chris Mungall: http://amigo.berkeleybop.org/amigo/panther/PTHR10000.tree
[6/14/11 1:20:58 PM] Chris Mungall: submission dir - extended newick - has speciation/duplication
[6/14/11 1:21:14 PM] Chris Mungall: old paint scraper files in amigo dir - plain newick
[6/14/11 1:25:55 PM] Chris Mungall: http://genomics.princeton.edu/~sven/PTHR10000.tree
[6/14/11 1:26:15 PM] Chris Mungall: new paint scraper - extended newick
[6/14/11 1:27:07 PM] Chris Mungall: [title:PHOSPHOSERINE PHOSPHATASE]
(((((((((((((((((AN17:0.049,AN18:2.000):0.017[&&NHX:Ev=0>1:S=Homo-Pan:ID=AN16],(AN20:0.016,AN21:0.005):0.032[&&NHX:Ev=0>1:S=Murinae:ID=AN19]):0.022[&&NHX:Ev=0>1:S=Euarchontoglires:ID=AN15],(AN23:0.027,AN24:0.032):0.021[&&NHX:Ev=0>1:S=Laurasiatheria:ID=AN22]):0.027[&&NHX:Ev=0>1:S=Eutheria:ID=AN14],AN25:0.055):0.069[&&NHX:Ev=0>1:S=Theria:ID=AN13],AN26:0.169):0.132[&&NHX:Ev=0>1:S=Mammalia:ID=AN12],AN27:0.101):0.069[&&NHX:Ev=0>1:S=Amniota:ID=AN11],AN28:0.136):0.083[&&NHX:Ev=0>1:S=Tetrapoda:ID=AN10],(AN30:0.137,AN31:0.160):0.055[&&NHX:Ev=0>1:S=Teleostei:ID=AN29]):0.195[&&NHX:Ev=0>1:S=Osteichthyes:ID=AN9],AN32:0.318):0.000[&&NHX:Ev=0>1:S=Chordata:ID=AN8],(AN34:2.000,AN35:2.000,AN36:2.000):0.000[&&NHX:Ev=1>0:ID=AN33]):0.232[&&NHX:Ev=0>1:S=Deuterostomia:ID=AN7],(AN38:0.430,(AN40:0.337,AN41:0.408):0.313[&&NHX:Ev=0>1:S=Insecta:ID=AN39]):0.179[&&NHX:Ev=0>1:S=Ecdysozoa:ID=AN37]):0.718[&&NHX:Ev=0>1:S=Bilateria:ID=AN6],(((AN45:0.368,AN46:0.301):0.231[&&NHX:Ev=0>1:S=Saccharomycetaceae:ID=AN44],AN47:0.510):0.255[&&NHX:Ev=0>1:S=Pezizomycotina-Saccharomycotina:ID=AN43],AN48:0.566):0.393[&&NHX:Ev=0>1:S=Ascomycota:ID=AN42]):0.068[&&NHX:Ev=0>1:S=Opisthokonts:ID=AN5],AN49:0.893):0.190[&&NHX:Ev=0>1:S=Unikonts:ID=AN4],((AN52:0.595,AN53:0.416):0.100[&&NHX:Ev=1>0:ID=AN51],((AN56:0.142,AN57:0.118):0.114[&&NHX:Ev=1>0:ID=AN55],AN58:0.198):0.329[&&NHX:Ev=0>1:S=Magnoliophyta:ID=AN54]):0.637[&&NHX:Ev=0>1:S=Viridiplantae:ID=AN50]):0.303[&&NHX:Ev=0>1:S=Eukaryota:ID=AN3],AN59:0.755):0.112[&&NHX:Ev=0>1:S=Archaea-Eukaryota:ID=AN2],(((AN63:0.943,AN64:1.003,AN65:0.886,AN66:0.845):0.242[&&NHX:Ev=1>0:ID=AN62],AN67:0.620,(AN69:0.605,AN70:0.637,AN71:0.670):0.193[&&NHX:Ev=1>0:ID=AN68]):0.938[&&NHX:Ev=0>1:S=Eubacteria:ID=AN61],(AN73:0.384,AN74:0.595,AN75:0.384,(AN77:0.556,((AN80:1.156,AN81:0.490):0.183[&&NHX:Ev=1>0:ID=AN79],AN82:0.424):0.260[&&NHX:Ev=0>1:S=Gammaproteobacteria:ID=AN78]):0.346[&&NHX:Ev=0>1:S=Proteobacteria:ID=AN76]):0.532[&&NHX:Ev=0>1:S=Eubacteria:ID=AN72]):0.165[&&NHX:Ev=1>0:ID=AN60]):0.773[&&NHX:Ev=0>1:S=LUCA:ID=AN1],((((AN87:0.000,AN88:0.000):1.149[&&NHX:Ev=1>0:ID=AN86],AN89:1.031):0.249[&&NHX:Ev=0>1:S=Eukaryota:ID=AN85],AN90:0.880):0.334[&&NHX:Ev=0>1:S=Archaea-Eukaryota:ID=AN84],((AN93:0.630,AN94:0.797,AN95:0.907,AN96:0.892,AN97:0.797):0.345[&&NHX:Ev=1>0:ID=AN92],AN98:0.829,AN99:0.996,(AN101:0.794,AN102:0.676):0.511[&&NHX:Ev=1>0:ID=AN100],AN103:1.161):0.587[&&NHX:Ev=0>1:S=Eubacteria:ID=AN91]):0.705[&&NHX:Ev=0>1:S=LUCA:ID=AN83],((AN106:0.575,AN107:0.585):0.503[&&NHX:Ev=0>1:S=Pezizomycotina-Saccharomycotina:ID=AN105],AN108:0.973):0.835[&&NHX:Ev=0>1:S=LUCA:ID=AN104],(AN110:0.951,AN111:0.923):0.660[&&NHX:Ev=0>1:S=LUCA:ID=AN109])[&&NHX:Ev=1>0:ID=AN0];
AN17:HUMAN|ENSEMBL=ENSG00000146733|UniProtKB=P78330;
AN18:PANTR|ENSEMBL=ENSPTRG00000019205|ENSEMBL=ENSPTRP00000039757;
AN20:MOUSE|MGI=MGI=97788|UniProtKB=Q99LS3;
AN21:RAT|RGD=1308764|UniProtKB=Q5M819;
AN23:BOVIN|ENSEMBL=ENSBTAG00000013081|UniProtKB=Q2KHU0;
AN24:CANFA|ENSEMBL=ENSCAFG00000010123|ENSEMBL=ENSCAFP00000014881;
AN25:MONDO|ENSEMBL=ENSMODG00000006585|ENSEMBL=ENSMODP00000008170;
AN26:ORNAN|ENSEMBL=ENSOANG00000001839|ENSEMBL=ENSOANP00000028697;
AN27:CHICK|ENTREZ=417540|NCBI=XP_415786;
[6/14/11 1:27:08 PM] Chris Mungall: .....
[6/14/11 1:36:00 PM] Mary Dolan: http://www.geneontology.org/gene-associations/submission/paint/PTHR10000/PTHR10000.sfan
[6/14/11 1:36:29 PM] Mary Dolan: http://www.geneontology.org/gene-associations/submission/paint/PTHR10009/PTHR10009.sfan
[6/14/11 1:36:49 PM] Chris Mungall: http://www.geneontology.org/gene-associations/submission/paint/PTHR10009/PTHR10009.gaf

[6/14/11 1:37:15 PM] Chris Mungall: PANTHER	PTHR10009_AN3	PTHR10009_AN3		GO:0005576	PAINT_REF:0010009	IDS	FB:FBgn0004034|FB:FBgn0038105|FB:FBgn0041710	C			protein		20110106	RefGenome
[6/14/11 1:40:14 PM] Mary Dolan: ftp://ftp.pantherdb.org/sequence_classifications/current_release/PTHR7.0_Mouse
[6/14/11 1:42:19 PM] Mary Dolan: ftp://ftp.pantherdb.org/hmm_classifications/current_release/PANTHER7.0_HMM_classifications
[6/14/11 1:43:21 PM] Mary Dolan: http://www.pantherdb.org/panther/familyList.do?searchType=basic&fieldName=all&organism=all&listType=6&fieldValue=PTHR22920%3ASF0
[6/14/11 1:45:22 PM] Chris Mungall: phylo.sql needs documented better -- cjm
[6/14/11 1:45:42 PM] Chris Mungall: phylo.sql -- additional fields for SFs, speciation/duplication and other metadata -- cjm/sven
[6/14/11 1:45:56 PM] Chris Mungall: junit tests for PhyloTreeLoader
[6/14/11 1:46:14 PM] Chris Mungall: ?integrated paintscraper and PhyloTreeLoader
[6/14/11 1:46:38 PM] Chris Mungall: (org.geneontology.gold.io)
[6/14/11 1:46:55 PM] Chris Mungall: integrate PhyloTreeLoader into admin web interface
[6/14/11 1:47:47 PM] Chris Mungall: extend gold->solr sql mapping to include families
[6/14/11 1:48:33 PM] Chris Mungall: * add family as a document in itself in solr
[6/14/11 1:48:58 PM] Chris Mungall: * add bioentity2family mappings to current bioentity documents
[6/14/11 1:49:23 PM] Chris Mungall: (from family_member table)
[6/14/11 1:49:49 PM] Chris Mungall: add filter boxes in amigo2 demo interface
[6/14/11 1:51:25 PM] Mary Dolan: ? lost
[6/14/11 1:51:56 PM] Seth Carbon: lost chris again...?
[6/14/11 1:52:12 PM] Seth Carbon: anybody still on?
[6/14/11 1:52:23 PM] Mary Dolan: i'm here
[6/14/11 1:52:45 PM] Shahid Manzoor: only chat window is active on my computer
[6/14/11 1:52:49 PM] Shahid Manzoor: but no voice
[6/14/11 1:52:49 PM] Seth Carbon: hrm.
[6/14/11 1:52:56 PM] Chris Mungall: starting again
[6/14/11 1:54:42 PM] Chris Mungall: json format trees
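
The leaf lines pasted with the tree in the transcript above (`AN17:HUMAN|ENSEMBL=...|UniProtKB=P78330;`) carry what a loader would need to populate bioentity-to-family mappings. As a hedged sketch only: the function names and the row layout below are hypothetical and do not reflect the actual `family_member` columns in phylo.sql, and the family ID `PTHR10000` is assumed from the links posted just before the tree. Each cross-reference is split on its first `=`, so `MGI=MGI=97788` comes out as database `MGI` with accession `MGI=97788`; whether that is the intended reading would need checking.

```python
def parse_leaf_line(line):
    """Split 'AN17:HUMAN|ENSEMBL=...|UniProtKB=P78330;' into its parts."""
    line = line.strip().rstrip(";")
    node_id, rest = line.split(":", 1)
    species, *xrefs = rest.split("|")
    # Split each xref on the first "=" only, so values may themselves contain "=".
    pairs = [tuple(x.split("=", 1)) for x in xrefs]
    return {"node": node_id, "species": species, "xrefs": pairs}


def family_member_rows(family_id, lines):
    """Yield one (family, node, species, db, accession) row per cross-reference.

    The row layout is an illustrative assumption, not the real table schema.
    """
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        leaf = parse_leaf_line(line)
        for db, acc in leaf["xrefs"]:
            yield (family_id, leaf["node"], leaf["species"], db, acc)


if __name__ == "__main__":
    # Three of the leaf lines pasted in the transcript.
    leaf_lines = [
        "AN17:HUMAN|ENSEMBL=ENSG00000146733|UniProtKB=P78330;",
        "AN20:MOUSE|MGI=MGI=97788|UniProtKB=Q99LS3;",
        "AN27:CHICK|ENTREZ=417540|NCBI=XP_415786;",
    ]
    for row in family_member_rows("PTHR10000", leaf_lines):
        print(row)
```

Emitting one row per cross-reference leaves the choice of which identifier becomes the bioentity ID to the loader, rather than fixing it in the parser.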