2 June 2015 PAINT Conference Call

From GO Wiki


PAINT conference call minutes, June 2nd, 2015

Participants: Marc, Pascale, Paul, Karen, Moni, Chris, Rama, Janos, Li, and Huaiyu (left early).

Agenda and Minutes

Update on Releases

  • PANTHER10 was released.
  • Paul: In PANTHER10 the species are organized according to the NCBI Taxonomy rather than manually, as before. You will notice that change in this version.
  • PAINT v1.14 was released, pointing to the new library. Unfortunately, some formatting errors were found (IDs with an unnecessary colon), and because of this the PanTree may not point to the new version of the library. Please wait until it is debugged and tested. PAINT will then be released again as v1.15, available to all. An announcement will be made.


Following the PAINT curation progress

  • A Google spreadsheet tracking progress on the families being annotated can be found at:

https://docs.google.com/spreadsheets/d/1eVhVeNq9FEEo-mH9TAa6CaoerUtgVIVrOaA6yodBeOM/edit?usp=sharing


Family presentation by Marc Feuermann

PTHR11075 - Peptide chain release factor

PAINT v1.14 was used on screen for demo purposes. Trees have been updated to PANTHER10.


Technical note: Marc is unable to open an "already in progress" family on a Microsoft Windows computer. Pascale mentioned that this is a known problem - I did not find an issue tracking this on SF or GH.


Marc:

  • Translation release factors were chosen because the family shows a nice duplication and a nice neofunctionalization, giving two kinds of functions for these proteins. Localization: mitochondrial. There is an issue with microbes, and Marc received a recommendation that this localization should not be propagated into plants: because plants have multiple organelles, it is very difficult to know where the products are localized. It is easy to distinguish nucleus from mitochondria, but not so easy to distinguish one organelle from another.


Paul: PANTHER10 also includes more plant genomes - duplication events are quite common in these genomes.


Karen: Plants may have two different forms of RNA polymerase, one of them more closely related to the bacterial form and still present in the plastids. Duplications would be visible in this case as well.


Marc:

  • It is difficult to identify the exact organelle, so it is better not to propagate the localization. When annotating mitochondrial ribosomal proteins, you annotate the ribosomal part to the root, and the mitochondrial part only on non-plant eukaryotes.
  • There is one duplication at the root, with three nodes. Nodes 3 and 1 each contain bacteria, plants, and animals. No human, animals, or fungi are found on node 2. Node 2 has duplication events in plants, and also has bacteria. This tree has a large number of annotations.
  • In prokaryotes there are 2 canonical peptide chain release factors (proteins that release the peptide at the end of translation), which do not recognize the same stop codons: one recognizes UAG and UAA (factor 1), and the other UGA and UAA (factor 2). This tree accurately represents what we know from the literature about the family.
  • If in the same node I can open various annotations for the same kind of proteins, it is possible to see this function of recognizing the stop and lowering the release of the peptide - (? -- someone please check this sentence - I lost the meaning of what Marc discussed.)
  • Sometimes the ribosome starts working on the mRNA and translation stops. In these cases release of the peptide is still needed, even though translation has not been completed. These proteins have peptidyl-tRNA hydrolase ICT1 activity. Again, these have codon specificity. The initial function was the release of the finished peptide. Following duplication, a new function arose: release of a peptide that has not been finished. It could be that node 3 is the ancestral one, because it contains the widest variety of taxa. The first duplication came between 3 and 1, and node 2 was the result of a second duplication. In bacteria there is a higher than normal number of release factors - 3. Animals and fungi only have 2 different release factors.
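
As a small illustration of the codon specificity discussed above, the sketch below encodes the textbook assignment for the two bacterial class I release factors (RF1: UAA and UAG; RF2: UAA and UGA) and looks up which factor reads a given stop codon. It is only an illustration and is not part of PAINT or PANTHER; the names `RELEASE_FACTOR_CODONS` and `factors_for` are made up for this example.

```python
# Stop codons recognized by each bacterial class I release factor
# (textbook assignment; the dictionary and function names are illustrative only).
RELEASE_FACTOR_CODONS = {
    "RF1": {"UAA", "UAG"},
    "RF2": {"UAA", "UGA"},
}


def factors_for(codon):
    """Return the sorted names of the release factors that recognize `codon`."""
    return sorted(
        name for name, codons in RELEASE_FACTOR_CODONS.items() if codon in codons
    )


if __name__ == "__main__":
    for codon in ("UAA", "UAG", "UGA"):
        print(codon, factors_for(codon))
```

UAA is shared by both factors, while UAG and UGA are each read by only one of them, which is the distinction that the current GO terms cannot express.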


Paul: There were three (3) genes in the last common ancestor.


Marc:

  • Top node has the non-specific release factor. Bacteria already have a non-specific release factor. These factors recognize all stop codons in animals and bacteria (which have the 2 factors) but not in other phyla.
  • We do not have GO terms specific enough to tell the story about the difference in recognition of stop codons.


Paul:

  • It is not quite consistent with a loss in the opisthokonts: the chytrids (basal fungi) still have it. The tree describes 2 losses: one in the fungi, following the divergence from the chytrids - though this divergence time might be short - and one in the opisthokonts.
  • Looking at the bacterial outgroup sequences, it is actually the bacterial part of the tree - this clade of bacteria demonstrates a bacterial gene. These genes (globin - Synechocystis cyanobacteria). This tells you there was an HGT from an ancestral cyanobacterium into plants. This must have been the origin of the plastids in plants. So these started out as plastid genes. Phypa (Physcomitrella patens) - a basal divergence within the land plants - shows more evidence of the origin of these genes. Then there is a further duplication, and one of the copies really diverges. Looking at the Arabidopsis sequence, it is likely going to be a chloroplast protein. -- yep, it was.
  • A cyanobacterial gene got sucked up into an ancestor and then stayed in the chloroplast. At least propagate to that single clade of plants. It would be appropriate to go all the way back to Physcomitrella. Paul thinks it is still an absolutely reasonable guess that these are all mitochondrial genes.
  • The workflow should always be to see how far back it is appropriate to propagate the annotation. Trying to go further back in time requires looking into the additional clades.
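
The propagation rule mentioned in this discussion (annotate the general term at the root, and the localization only where it is safe, for example excluding plants) can be sketched on a toy tree as follows. This is a minimal illustration under stated assumptions, not how PAINT implements propagation: the `Node` class, the `propagate` function, the `stop_at` parameter, and the tree itself are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node in a toy gene tree (hypothetical structure, not PAINT's)."""
    name: str
    children: list = field(default_factory=list)
    annotations: set = field(default_factory=set)


def propagate(node, term, stop_at=()):
    """Annotate `node` and its descendants with `term`.

    Any clade whose root name is listed in `stop_at` is skipped entirely.
    """
    if node.name in stop_at:
        return
    node.annotations.add(term)
    for child in node.children:
        propagate(child, term, stop_at)


# Toy tree: root -> (bacteria, eukaryotes -> (plants, animals, fungi)).
plants = Node("plants", [Node("ARATH"), Node("PHYPA")])
animals = Node("animals", [Node("HUMAN")])
fungi = Node("fungi", [Node("YEAST")])
eukaryotes = Node("eukaryotes", [plants, animals, fungi])
root = Node("root", [Node("bacteria"), eukaryotes])

# The general term goes to the root and reaches every descendant.
propagate(root, "structural constituent of ribosome")
# The localization starts at the eukaryotic ancestor and skips the plant clade.
propagate(eukaryotes, "mitochondrion", stop_at={"plants"})

if __name__ == "__main__":
    for leaf in (plants.children[0], animals.children[0], fungi.children[0]):
        print(leaf.name, sorted(leaf.annotations))
```

Choosing the starting node corresponds to deciding how far back the annotation can be propagated; the `stop_at` set stands in for clades where the curator judges the inference unsafe.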


Pascale: wishes to clarify that this process is more reliable for function than for component.


Paul:

  • The new version of the tree provides 215 spp.; these were trimmed back to 104. In most cases the new species helped, but in other cases, with the inclusion of other spp. (esp. those with more genes), the multiple alignments tended to get worse. Some of those trees will have to be broken up for the next version. For anything but the really large trees, the new species have helped.
  • Statistics on HGT showed that transfers most commonly take place from proteobacteria into eukaryotes - and whether the gene is nuclear or not, it tends to be used in the mitochondria. Transfers from cyanobacteria to chloroplasts are also common.
  • It is very hard to recognize ancient events, so it is amazing to be able to follow the history of the ancestral events.


Marc:

  • About function: the codon-specific function was propagated to nodes 2 and 3. The nonspecific function was propagated only in node 1.


Paul:

  • Is it important in biology to capture the distinction in codons? Or are we getting into too much detail and we can leave the current terms?