BHF-UCL

Latest revision as of 18:44, 11 April 2014

Overview:

The aim of the Cardiovascular GO Annotation Initiative (BHF-UCL, British Heart Foundation – University College London) is to provide GO annotation to human cardiovascular-associated genes. This project represents a successful collaboration between University College London (UCL) and the European Bioinformatics Institute (EBI); the annotations created by the UCL-based curators are made directly into the GOA database at the EBI. 4000 human genes have been identified as associated with cardiovascular processes, and annotation priorities are agreed on an annual basis in consultation with the Co-Grant holders, the International Scientific Advisory Committee and the UCL-based GO curators. The Initiative aims to comprehensively annotate 1500 genes in 5 years. BHF-UCL has been a GOC member since July 2008.

1. Staff:

  • Dr Ruth Lovering, 1 FTE – UCL-based curator, BHF grant to November 2012
  • Dr Varsha Khodiyar, 0.7 FTE – UCL-based curator, BHF grant to January 2013
  • Tony Sawford, 0.25 FTE – EBI-based Software engineer, BHF grant to November 2012

No funding by GOC NHGRI grant

[please include FTEs working on GOC tasks, designating as well how many FTEs are funded by the GOC NHGRI grant]

2. Annotation Progress

The annotation progress reflects the priority of this project to annotate human genes, with 15,760 GO terms associated to 1,831 human proteins (1st November 2007 to 1st November 2011). Across all species BHF-UCL have annotated 3,017 proteins with 22,148 GO terms.
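
As a rough illustration of the figures above, the average number of GO term associations per annotated protein can be worked out directly. The short Python sketch below is not part of the Initiative's tooling: the function name is hypothetical, it uses only the counts quoted in this section, and it assumes that the all-species totals include the human totals.

```python
# Counts quoted in this report (1st November 2007 to 1st November 2011).
HUMAN_PROTEINS = 1_831
HUMAN_ANNOTATIONS = 15_760
ALL_PROTEINS = 3_017
ALL_ANNOTATIONS = 22_148


def terms_per_protein(annotations: int, proteins: int) -> float:
    """Average number of GO term associations per annotated protein."""
    if proteins <= 0:
        raise ValueError("proteins must be positive")
    return annotations / proteins


human_avg = terms_per_protein(HUMAN_ANNOTATIONS, HUMAN_PROTEINS)
all_avg = terms_per_protein(ALL_ANNOTATIONS, ALL_PROTEINS)

# Assumption: the all-species protein count includes the human proteins.
human_share = HUMAN_PROTEINS / ALL_PROTEINS

print(f"human proteins: {human_avg:.1f} GO terms per protein")
print(f"all species:    {all_avg:.1f} GO terms per protein")
print(f"human share of annotated proteins: {human_share:.0%}")
```

On these counts the averages come out at roughly 8.6 GO terms per human protein and 7.3 per protein across all species, consistent with the stated priority on human genes.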

3. Methods and strategies for annotation

(please note % effort on literature curation vs. computational annotation methods)

a. Literature curation (100%): The aim of this Initiative is to provide complete and deep annotation of 300 human proteins per year. This is achieved through both protein-centric and process-centric targeting of proteins to annotate. The process-centric annotation enables the curators to gain a better understanding of the targeted process and, using the GONUTs table, to ensure that relevant terms are associated with all proteins involved in a particular process. The protein-centric annotation is undertaken when annotating proteins on the reference genome list. In addition, we annotate proteins following requests from cardiovascular scientists or when they have been annotated by attendees of our MSc module or 2-day annotation workshops. The following approaches are taken to achieve this:

  • To ensure a rapid improvement in the annotations available for a large number of cardiovascular-associated proteins, the curators spend a maximum of one day researching the literature associated with each protein.
  • The protein will be marked as ‘complete’ if the curator feels there are no further terms to add.
  • If complete annotation cannot be achieved in a day, the protein record is marked as ‘first pass complete’. The intention is to revisit these first pass proteins, hopefully with some expert scientist input, in the following year.
  • The approved gene symbol (and relevant gene and protein aliases) are used to query a variety of biomedical search engines, including NCBI PubMed, iHOP and GOPubMed, to identify suitable papers for the GO annotation of each target protein (with highly researched genes the search is usually limited to human entries only).
  • The curators will usually associate GO terms to all of the human proteins mentioned in each paper read, depending on the experimental evidence available (occasionally GO terms are associated with non-human proteins too).
  • Preference is given to the use of experimental-based evidence codes; however, these are only used when the curator is completely confident of the identity of the protein and its derivative species.
  • Reviews are also used to provide an overview of the characteristics of a protein and an insight into the complete set of GO terms required.
  • Experimental data relating to model organism proteins may be included in our GO annotation process, through the direct annotation of the model organism protein and the use of the ‘inferred by sequence similarity’ evidence code to transfer the information to the orthologous human protein.
  • When experimentally supported literature is unobtainable, due to insufficient information about the species the protein is derived from, the lack of access to a referenced paper, or simply because the knowledge is considered so well accepted that references are not supplied, author statements are used.
  • When possible we associate the chronologically first paper that provides experimental evidence for the characteristic features of a given human protein.
  • We aim to capture the knowledge about each protein using a limited number of papers, with experimental evidence.
  • We do not annotate all relevant papers if this would lead to repeated duplication of GO terms associated to the protein.
  • GO terms are chosen by querying the GO files with QuickGO or AmiGO.
  • Before assigning a GO term, its definition and position within the ontology are checked to ensure its suitability.
  • The GO editorial office is contacted, via SourceForge, when a new GO term is required, or modifications are needed to an existing GO term.
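
The one-day rule and the evidence preferences described in the list above can be summarised as a small decision sketch. This is only an illustration in Python, not the curators' actual tooling: the function names, the `PaperEvidence` fields, the "in progress" status, and the choice to model an author statement with the TAS code are all assumptions made for the example. ISS is the GO evidence code for inference from sequence or structural similarity; IDA stands in here for any experimental evidence code.

```python
from dataclasses import dataclass


@dataclass
class PaperEvidence:
    """Hypothetical summary of what one paper offers for one protein."""
    has_experiment: bool       # the paper reports experimental data
    identity_confirmed: bool   # curator is confident of the protein's identity
    species_confirmed: bool    # curator is confident of the species of origin
    from_model_organism: bool  # the data concern a model organism protein


def curation_status(days_spent: float, terms_remaining: bool) -> str:
    """Apply the one-day rule to a protein record."""
    if not terms_remaining:
        return "complete"
    if days_spent >= 1.0:
        # Out of time: flag the record for a revisit the following year.
        return "first pass complete"
    return "in progress"  # assumed label, not taken from the report


def evidence_for_human_protein(ev: PaperEvidence) -> str:
    """Pick an evidence code for the human protein (illustrative only)."""
    if ev.has_experiment and ev.identity_confirmed and ev.species_confirmed:
        if ev.from_model_organism:
            # The model organism protein is annotated directly; the
            # orthologous human protein receives the annotation via ISS.
            return "ISS"
        return "IDA"  # placeholder for any experimental evidence code
    # Experimental support unusable: fall back to an author statement.
    return "TAS"


example = PaperEvidence(
    has_experiment=True,
    identity_confirmed=True,
    species_confirmed=True,
    from_model_organism=True,
)
print(curation_status(days_spent=1.0, terms_remaining=True))
print(evidence_for_human_protein(example))
```

Keeping the status decision separate from the evidence decision mirrors the list above, where the time limit applies per protein while the evidence code is chosen per paper.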

b. Computational annotation strategies: None used

c. Priorities for annotation: Human genes involved in cardiovascular-related processes, as agreed by the International Scientific Advisory Panel

4. Presentations and Publications

a. Papers with substantial GO content

'The representation of heart development in the Gene Ontology' Varsha K Khodiyar, David P Hill, Doug Howe, Tanya Z Berardini, Susan Tweedie, Philippa J Talmud, Ross Breckenridge, Shoumo Bhattacharya, Paul Riley, Pete Scambler, Ruth C Lovering. Developmental Biology 2011 Jun;354(1):9-17.

b. Presentations including Talks and Tutorials and Teaching

The BHF-UCL GO curators are closely associated with the Cardiovascular Genetics group at UCL and have given 6 presentations at their group meetings.

The BHF-UCL team teaches a ‘bioinformatics’ module for Genetics of Human Disease MSc students this year. By focusing on the annotation of published scientific papers, the students constructively apply their newly acquired knowledge of a variety of online biological resources, including Ensembl, EntrezGene, IntAct, UniProt, QuickGO, AmiGO and HCOP. In addition, the students learn the importance of including full experimental detail in scientific publications. Last year's students submitted 600 correct annotations, based on 162 publications, which have been included in the GOA dataset. The review of these annotations led to a further 1,000 annotations being included in the GOA database.

One of the MSc students, Sejal Patel, undertook a GO annotation project and annotated the neuroligin and neurexin gene families; mutations in several of these genes are known to contribute to autistic spectrum disorders. Sejal created over 300 annotations and requested several terms within the neurogenesis ontology.

In April and September the BHF-UCL team ran 2-day GO annotation workshops at UCL. These were attended by 38 UCL scientists, who learnt how to use some of the freely available biological databases and how to analyse high-throughput datasets. The workshop in April led to the creation of over 200 annotations, based on 28 papers.

c. Poster presentations

Poster presented by Ruth and Varsha entitled: 'Bringing Gene Ontology to cardiovascular research' at the Computational Biology Symposium, UCL, 15th February 2011

Poster presented by Ruth and Varsha entitled: 'Bringing Gene Ontology to cardiovascular research' at the ENFIN Enabling Systems Biology Conference & Bioinformatics Workshop, UCL, 11-14th April 2011

5. Other Highlights:

A. Ontology Development Contributions:

Since 1st November 2007, requests made by the BHF-UCL team have led to the creation of 1106 new GO terms. The majority of these requests were relevant to cardiovascular processes, for example adiponectin secretion, adrenergic receptor signaling pathway, cellular response to epinephrine stimulus, cellular triglyceride homeostasis, medium-chain-acyl-CoA dehydrogenase activity, blood microparticle and type B pancreatic cell apoptosis.

Ruth has been involved in the signalling and apoptosis working groups.

B. Annotation Outreach and User Advocacy Efforts:

The BHF-UCL team encouraged last year's Genetics of Human Disease MSc students to take part in an annotation competition, CACAO, being run by Brenley McIntosh at Texas A&M University. Four students are currently taking part and have already contributed more than 300 annotations, as well as submitting a substantial number of annotation challenges.

C. Other Highlights:

This year the Initiative has circulated four newsletters, in January, April, July and October: by direct email to the International Advisory Committee and individuals who have expressed an interest in this project; by indirect email, through the mailing lists of several cardiovascular-related societies and the UCL Department of Medicine mailing list; as hard copies at meetings; and through our web site.

Varsha co-ordinated the heart development transcription project http://gocwiki.geneontology.org/index.php/RefG_Heart_Development_co-curation#Heart_Development_Transcription_Annotation_Targets. This annotation project is part of the Reference Genome project and has led to the annotation of 25 subfamilies of transcription factors involved in heart development. Varsha played a key role in choosing the annotation targets for this project.