Publications by authors named "Ruth C Lovering"

54 Publications

The genomics of heart failure: design and rationale of the HERMES consortium.

ESC Heart Fail 2021 Sep 3. Epub 2021 Sep 3.

Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden.

Aims: The HERMES (HEart failure Molecular Epidemiology for Therapeutic targetS) consortium aims to identify the genomic and molecular basis of heart failure.

Methods and Results: The consortium currently includes 51 studies from 11 countries, including 68 157 heart failure cases and 949 888 controls, with data on heart failure events and prognosis. All studies collected biological samples and performed genome-wide genotyping of common genetic variants. The enrolment of subjects into participating studies ranged from 1948 to the present day, and the median follow-up following heart failure diagnosis ranged from 2 to 116 months. Forty-nine of 51 individual studies enrolled participants of both sexes; in these studies, participants with heart failure were predominantly male (34-90%). The mean age at diagnosis or ascertainment across all studies ranged from 54 to 84 years. Based on the aggregate sample, we estimated 80% power to detect genetic variant associations with risk of heart failure with an odds ratio of ≥1.10 for common variants (allele frequency ≥ 0.05) and ≥1.20 for low-frequency variants (allele frequency 0.01-0.05) at P < 5 × 10⁻⁸ under an additive genetic model.

Conclusions: HERMES is a global collaboration aiming to (i) identify the genetic determinants of heart failure; (ii) generate insights into the causal pathways leading to heart failure and enable genetic approaches to target prioritization; and (iii) develop genomic tools for disease stratification and risk prediction.

Source
http://dx.doi.org/10.1002/ehf2.13517
September 2021

Gene Ontology representation for transcription factor functions.

Biochim Biophys Acta Gene Regul Mech 2021 Aug 28;1864(11-12):194752. Epub 2021 Aug 28.

Division of Bioinformatics, Department of Preventive Medicine, University of Southern California, Los Angeles, CA, USA.

Transcription plays a central role in defining the identity and functionalities of cells, as well as in their responses to changes in the cellular environment. The Gene Ontology (GO) provides a rigorously defined set of concepts that describe the functions of gene products. A GO annotation is a statement about the function of a particular gene product, represented as an association between a gene product and the biological concept a GO term defines. Critically, each GO annotation is based on traceable scientific evidence. Here, we describe the different GO terms that are associated with proteins involved in transcription and its regulation, focusing on the standard of evidence required to support these associations. This article is intended to help users of GO annotations understand how to interpret the annotations and can contribute to the consistency of GO annotations. We distinguish between three classes of activities involved in transcription or directly regulating it - general transcription factors, DNA-binding transcription factors, and transcription co-regulators.

Source
http://dx.doi.org/10.1016/j.bbagrm.2021.194752
August 2021

Sequence Ontology terminology for gene regulation.

Biochim Biophys Acta Gene Regul Mech 2021 10 11;1864(10):194745. Epub 2021 Aug 11.

Department of Biomedical Informatics, University of Utah, Salt Lake City, UT, USA.

The Sequence Ontology (SO) is a structured, controlled vocabulary that provides terms and definitions for genomic annotation. The Gene Regulation Ensemble Effort for the Knowledge Commons (GREEKC) initiative has gathered input from many groups of researchers, including the SO, the Gene Ontology (GO), and gene regulation experts, with the goal of curating information about how gene expression is regulated at the molecular level. Here we discuss recent updates to the SO reflecting current knowledge. We have developed more accurate human-readable terms (also known as classes), including new definitions, and relationships related to the expression of genes. New findings continue to give us insight into the biology of gene regulation, including the order of events, and participants in those events. These updates to the SO support logical reasoning with the current understanding of gene expression regulation at the molecular level.

Source
http://dx.doi.org/10.1016/j.bbagrm.2021.194745
October 2021

Plasma proteins, cognitive decline, and 20-year risk of dementia in the Whitehall II and Atherosclerosis Risk in Communities studies.

Alzheimers Dement 2021 Aug 2. Epub 2021 Aug 2.

Department of Epidemiology and Public Health, University College London, London, UK.

Introduction: Plasma proteins affect biological processes and are common drug targets but their role in the development of Alzheimer's disease and related dementias remains unclear. We examined associations between 4953 plasma proteins and cognitive decline and risk of dementia in two cohort studies with 20-year follow-ups.

Methods: In the Whitehall II prospective cohort study proteins were measured using SOMAscan technology. Cognitive performance was tested five times over 20 years. Linkage to electronic health records identified incident dementia. The results were replicated in the Atherosclerosis Risk in Communities (ARIC) study.

Results: Fifteen non-amyloid/non-tau-related proteins were associated with cognitive decline and dementia, were consistently identified in both cohorts, and were not explained by known dementia risk factors. Levels of six of the proteins are modifiable by currently approved medications for other conditions.

Discussion: This study identified several plasma proteins in dementia-free people that are associated with long-term risk of cognitive decline and dementia.

Source
http://dx.doi.org/10.1002/alz.12419
August 2021

Towards a unified open access dataset of molecular interactions.

Nat Commun 2020 12 1;11(1):6144. Epub 2020 Dec 1.

European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Campus, Hinxton, Cambridge, CB10 1SD, UK.

The International Molecular Exchange (IMEx) Consortium provides scientists with a single body of experimentally verified protein interactions curated in rich contextual detail to an internationally agreed standard. In this update to the work of the IMEx Consortium, we discuss how this initiative has been working in practice, how it has ensured database sustainability, and how it is meeting emerging annotation challenges through the introduction of new interactor types and data formats. Additionally, we provide examples of how IMEx data are being used by biomedical researchers and integrated in other bioinformatic tools and resources.

Source
http://dx.doi.org/10.1038/s41467-020-19942-z
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7708836
December 2020

Term Matrix: a novel Gene Ontology annotation quality control system based on ontology term co-annotation patterns.

Open Biol 2020 09 2;10(9):200149. Epub 2020 Sep 2.

Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA.

Biological processes are accomplished by the coordinated action of gene products. Gene products often participate in multiple processes, and can therefore be annotated to multiple Gene Ontology (GO) terms. Nevertheless, processes that are functionally, temporally and/or spatially distant may have few gene products in common, and co-annotation to unrelated processes probably reflects errors in literature curation, ontology structure or automated annotation pipelines. We have developed an annotation quality control workflow that uses rules based on mutually exclusive processes to detect annotation errors, based on and validated by case studies including the three we present here: fission yeast protein-coding gene annotations over time; annotations for cohesin complex subunits in human and model species; and annotations using a selected set of GO biological process terms in human and five model species. For each case study, we reviewed available GO annotations, identified pairs of biological processes which are unlikely to be correctly co-annotated to the same gene products (e.g. amino acid metabolism and cytokinesis), and traced erroneous annotations to their sources. To date we have generated 107 quality control rules, and corrected 289 manual annotations in eukaryotes and over 52 700 automatically propagated annotations across all taxa.
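
The rule-based check described in this abstract can be illustrated with a minimal Python sketch. It is not the Term Matrix implementation: the rule set, the gene names and the function name are hypothetical, and the only pair of "mutually exclusive" processes used is the one the abstract itself gives as an example (amino acid metabolism and cytokinesis).

```python
from itertools import combinations

# Hypothetical rule set: pairs of biological processes assumed to be
# mutually exclusive (the example pair comes from the abstract).
EXCLUSIVE_PAIRS = {
    frozenset({"amino acid metabolism", "cytokinesis"}),
}


def find_suspect_coannotations(annotations, exclusive_pairs=EXCLUSIVE_PAIRS):
    """Flag gene products annotated to two processes that a rule marks as exclusive.

    annotations: dict mapping a gene product name to a set of process terms.
    Returns a sorted list of (gene, term_a, term_b) tuples for manual review.
    """
    flagged = []
    for gene, terms in annotations.items():
        # Check every unordered pair of terms attached to this gene product.
        for term_a, term_b in combinations(sorted(terms), 2):
            if frozenset({term_a, term_b}) in exclusive_pairs:
                flagged.append((gene, term_a, term_b))
    return sorted(flagged)


# Made-up annotations for illustration only.
example = {
    "geneA": {"amino acid metabolism", "cytokinesis"},
    "geneB": {"cytokinesis", "mitotic spindle assembly"},
}
print(find_suspect_coannotations(example))
```

A flagged pair is only a candidate error; as the abstract notes, each one still has to be traced back to its source before an annotation is corrected.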

Source
http://dx.doi.org/10.1098/rsob.200149
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7536087
September 2020

A Coordinated Approach by Public Domain Bioinformatics Resources to Aid the Fight Against Alzheimer's Disease Through Expert Curation of Key Protein Targets.

J Alzheimers Dis 2020;77(1):257-273

European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Campus, Hinxton, Cambridge, UK.

Background: The analysis and interpretation of data generated from patient-derived clinical samples relies on access to high-quality bioinformatics resources. These are maintained and updated by expert curators extracting knowledge from unstructured biological data described in free-text journal articles and converting this into more structured, computationally-accessible forms. This enables analyses such as functional enrichment of sets of genes/proteins using the Gene Ontology, and makes the searching of data more productive by managing issues such as gene/protein name synonyms, identifier mapping, and data quality.

Objective: To undertake a coordinated annotation update of key public-domain resources to better support Alzheimer's disease research.

Methods: We have systematically identified target proteins critical to disease process, in part by accessing informed input from the clinical research community.

Results: Data from 954 papers have been added to the UniProtKB, Gene Ontology, and the International Molecular Exchange Consortium (IMEx) databases, with 299 human proteins and 279 orthologs updated in UniProtKB. 745 binary interactions were added to the IMEx human molecular interaction dataset.

Conclusion: This represents a significant enhancement in the expert curated data pertinent to Alzheimer's disease available in a number of biomedical databases. Relevant protein entries have been updated in UniProtKB and concomitantly in the Gene Ontology. Molecular interaction networks have been significantly extended in the IMEx Consortium dataset and a set of reference protein complexes created. All the resources described are open-source and freely available to the research community and we provide examples of how these data could be exploited by researchers.

Source
http://dx.doi.org/10.3233/JAD-200206
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7592670
September 2021

The Minimum Information about a Molecular Interaction CAusal STatement (MI2CAST).

Bioinformatics 2021 04;36(24):5712-5718

Department of Biology, Norwegian University of Science and Technology (NTNU), Trondheim 7491, Norway.

Motivation: A large variety of molecular interactions occurs between biomolecular components in cells. When a molecular interaction results in a regulatory effect, exerted by one component onto a downstream component, a so-called 'causal interaction' takes place. Causal interactions constitute the building blocks in our understanding of larger regulatory networks in cells. These causal interactions and the biological processes they enable (e.g. gene regulation) need to be described with a careful appreciation of the underlying molecular reactions. A proper description of this information enables archiving, sharing and reuse by humans and for automated computational processing. Various representations of causal relationships between biological components are currently used in a variety of resources.

Results: Here, we propose a checklist that accommodates current representations, called the Minimum Information about a Molecular Interaction CAusal STatement (MI2CAST). This checklist defines both the required core information, as well as a comprehensive set of other contextual details valuable to the end user and relevant for reusing and reproducing causal molecular interaction information. The MI2CAST checklist can be used as reporting guidelines when annotating and curating causal statements, while fostering uniformity and interoperability of the data across resources.

Availability and Implementation: The checklist together with examples is accessible at https://github.com/MI2CAST/MI2CAST.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
http://dx.doi.org/10.1093/bioinformatics/btaa622
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8023674
April 2021

PINOT: an intuitive resource for integrating protein-protein interactions.

Cell Commun Signal 2020 06 11;18(1):92. Epub 2020 Jun 11.

School of Pharmacy, University of Reading, Whiteknights, Reading, RG6 6AP, UK.

Background: The past decade has seen the rise of omics data for the understanding of biological systems in health and disease. This wealth of information includes protein-protein interaction (PPI) data derived from both low- and high-throughput assays, which are curated into multiple databases that capture the extent of available information from the peer-reviewed literature. Although these curation efforts are extremely useful, reliably downloading and integrating PPI data from the variety of available repositories is challenging and time consuming.

Methods: We here present a novel user-friendly web-resource called PINOT (Protein Interaction Network Online Tool; available at http://www.reading.ac.uk/bioinf/PINOT/PINOT_form.html) to optimise the collection and processing of PPI data from IMEx consortium associated repositories (members and observers) and WormBase, for constructing, respectively, human and Caenorhabditis elegans PPI networks.

Results: Users submit a query containing a list of proteins of interest for which PINOT extracts data describing PPIs. At every query submission PPI data are downloaded, merged and quality assessed. Then each PPI is confidence scored based on the number of distinct methods used for interaction detection and the number of publications that report the specific interaction. Examples of how PINOT can be applied are provided to highlight the performance, ease of use and potential utility of this tool.

Conclusions: PINOT is a tool that allows users to survey the curated literature, extracting PPI data in relation to a list of proteins of interest. PINOT extracts a similar number of PPIs to other, analogous tools and incorporates a set of innovative features. PINOT is able to process large queries, it downloads human PPIs live through PSICQUIC, and it applies quality control filters to the downloaded PPI data (i.e. removing the need for manual inspection by the user). PINOT provides the user with information on detection methods and publication history for each downloaded interaction data entry and outputs the results in a table format that can be straightforwardly further customised and/or directly uploaded into network visualization software.
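
The confidence scoring mentioned in the Results can be sketched as follows. The abstract states only that the score is based on the number of distinct detection methods and the number of publications reporting an interaction; the plain sum used here, the record layout and the function name are assumptions made for illustration, not PINOT's actual scoring scheme.

```python
from collections import defaultdict


def score_interactions(records):
    """Score each protein pair from merged PPI records.

    records: iterable of (protein_a, protein_b, detection_method, publication_id).
    Returns a dict mapping an ordered protein pair to its counts and score.
    ASSUMPTION: score = distinct methods + distinct publications.
    """
    methods = defaultdict(set)
    publications = defaultdict(set)
    for protein_a, protein_b, method, publication in records:
        # Treat A-B and B-A as the same undirected interaction.
        pair = tuple(sorted((protein_a, protein_b)))
        methods[pair].add(method)
        publications[pair].add(publication)
    return {
        pair: {
            "methods": len(methods[pair]),
            "publications": len(publications[pair]),
            "score": len(methods[pair]) + len(publications[pair]),
        }
        for pair in methods
    }


# Placeholder records; protein, method and publication names are invented.
records = [
    ("P1", "P2", "method_x", "pub1"),
    ("P2", "P1", "method_y", "pub1"),
    ("P1", "P2", "method_y", "pub2"),
    ("P1", "P3", "method_x", "pub3"),
]
scores = score_interactions(records)
print(scores[("P1", "P2")])
```

Counting distinct methods and publications, rather than raw records, keeps an interaction reported many times by a single assay in a single paper from outranking one confirmed independently.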

Source
http://dx.doi.org/10.1186/s12964-020-00554-5
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7291677
June 2020

Gene Ontology Curation of Neuroinflammation Biology Improves the Interpretation of Alzheimer's Disease Gene Expression Data.

J Alzheimers Dis 2020;75(4):1417-1435

Functional Gene Annotation, Preclinical and Fundamental Science, UCL Institute of Cardiovascular Science, University College London, London, UK.

Background: Gene Ontology (GO) is a major bioinformatic resource used for analysis of large biomedical datasets, for example from genome-wide association studies, applied universally across biological fields, including Alzheimer's disease (AD) research.

Objective: We aim to demonstrate the applicability of GO for interpretation of AD datasets to improve the understanding of the underlying molecular disease mechanisms, including the involvement of inflammatory pathways and dysregulated microRNAs (miRs).

Methods: We have undertaken a systematic full article GO annotation approach focused on microglial proteins implicated in AD and the miRs regulating their expression. PANTHER was used for enrichment analysis of previously published AD data. Cytoscape was used for visualizing and analyzing miR-target interactions captured from published experimental evidence.

Results: We contributed 3,084 new annotations for 494 entities, i.e., on average six new annotations per entity. This included a total of 1,352 annotations for 40 prioritized microglial proteins implicated in AD and 66 miRs regulating their expression, yielding an average of twelve annotations per prioritized entity. The updated GO resource was then used to re-analyze previously published data. The re-analysis showed novel processes associated with AD-related genes, not identified in the original study, such as 'gliogenesis', 'regulation of neuron projection development', or 'response to cytokine', demonstrating enhanced applicability of GO for neuroscience research.

Conclusions: This study highlights ongoing development of the neurobiological aspects of GO and demonstrates the value of biocuration activities in the area, thus helping to delineate the molecular bases of AD to aid the development of diagnostic tools and treatments.

Source
http://dx.doi.org/10.3233/JAD-200207
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7369085
May 2021

Genome-wide association and Mendelian randomisation analysis provide insights into the pathogenesis of heart failure.

Nat Commun 2020 01 9;11(1):163. Epub 2020 Jan 9.

Department of Biostatistics, University of Liverpool, Liverpool, UK.

Heart failure (HF) is a leading cause of morbidity and mortality worldwide. A small proportion of HF cases are attributable to monogenic cardiomyopathies and existing genome-wide association studies (GWAS) have yielded only limited insights, leaving the observed heritability of HF largely unexplained. We report results from a GWAS meta-analysis of HF comprising 47,309 cases and 930,014 controls. Twelve independent variants at 11 genomic loci are associated with HF, all of which demonstrate one or more associations with coronary artery disease (CAD), atrial fibrillation, or reduced left ventricular function, suggesting shared genetic aetiology. Functional analysis of non-CAD-associated loci implicates genes involved in cardiac development (MYOZ1, SYNPO2L), protein homoeostasis (BAG3), and cellular senescence (CDKN1A). Mendelian randomisation analysis supports causal roles for several HF risk factors, and demonstrates CAD-independent effects for atrial fibrillation, body mass index, and hypertension. These findings extend our knowledge of the pathways underlying HF and may inform new therapeutic strategies.

Source
http://dx.doi.org/10.1038/s41467-019-13690-5
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6952380
January 2020

RNA sequencing-based transcriptome profiling of cardiac tissue implicates novel putative disease mechanisms in FLNC-associated arrhythmogenic cardiomyopathy.

Int J Cardiol 2020 03 6;302:124-130. Epub 2019 Dec 6.

Centre for Heart Muscle Disease, Institute of Cardiovascular Science, University College London, London, UK.

Arrhythmogenic cardiomyopathy (ACM) encompasses a group of inherited cardiomyopathies including arrhythmogenic right ventricular cardiomyopathy (ARVC) whose molecular disease mechanism is associated with dysregulation of the canonical WNT signalling pathway. Recent evidence indicates that ARVC and ACM caused by pathogenic variants in the FLNC gene encoding filamin C, a major cardiac structural protein, may have different molecular mechanisms of pathogenesis. We sought to identify dysregulated biological pathways in FLNC-associated ACM. RNA was extracted from seven paraffin-embedded left ventricular tissue samples from deceased ACM patients carrying FLNC variants and sequenced. Transcript levels of 623 genes were upregulated and 486 genes were reduced in ACM in comparison to control samples. The cell adhesion pathway and ILK signalling were among the prominent dysregulated pathways in ACM. Consistent with these findings, transcript levels of cell adhesion genes JAM2, NEO1, VCAM1 and PTPRC were upregulated in ACM samples. Moreover, several actin-associated genes, including FLNC, VCL, PARVB and MYL7, were suppressed, suggesting dysregulation of the actin cytoskeleton. Analysis of the transcriptome for dysregulated biological pathways predicted activation of inflammation and apoptosis and suppression of oxidative phosphorylation and MTORC1 signalling in ACM. Our data suggest dysregulated cell adhesion and ILK signalling as novel putative pathogenic mechanisms of ACM caused by FLNC variants, which are distinct from the postulated disease mechanism of classic ARVC caused by desmosomal gene mutations. This knowledge could help in the design of future gene therapy strategies which would target specific components of these pathways and potentially lead to novel treatments for ACM.

Source
http://dx.doi.org/10.1016/j.ijcard.2019.12.002
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6940594
March 2020

Non-coding RNA regulatory networks.

Biochim Biophys Acta Gene Regul Mech 2020 06 4;1863(6):194417. Epub 2019 Sep 4.

European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton CB10 1SD, UK.

It is well established that the vast majority of human RNA transcripts do not encode proteins and that non-coding RNAs regulate cell physiology and shape cellular functions. A subset of them is involved in gene regulation at different levels, from epigenetic gene silencing to post-transcriptional regulation of mRNA stability. Notably, the aberrant expression of many non-coding RNAs has been associated with aggressive pathologies. Rapid advances in network biology indicate that the robustness of cellular processes is the result of specific properties of biological networks such as scale-free degree distribution and hierarchical modularity, suggesting that regulatory network analyses could provide new insights into gene regulation and dysfunction mechanisms. In this study we present an overview of public repositories where non-coding RNA-regulatory interactions are collected and annotated, we discuss unresolved questions for data integration and we recall existing resources to build and analyse networks.

Source
http://dx.doi.org/10.1016/j.bbagrm.2019.194417
June 2020

SynGO: An Evidence-Based, Expert-Curated Knowledge Base for the Synapse.

Neuron 2019 07 3;103(2):217-234.e4. Epub 2019 Jun 3.

Molecular Physiology of the Synapse Laboratory, Biomedical Research Institute Sant Pau, 08025 Barcelona, Spain; Universitat Autònoma de Barcelona, 08193 Bellaterra, Cerdanyola del Vallès, Spain.

Synapses are fundamental information-processing units of the brain, and synaptic dysregulation is central to many brain disorders ("synaptopathies"). However, systematic annotation of synaptic genes and ontology of synaptic processes are currently lacking. We established SynGO, an interactive knowledge base that accumulates available research about synapse biology using Gene Ontology (GO) annotations to novel ontology terms: 87 synaptic locations and 179 synaptic processes. SynGO annotations are exclusively based on published, expert-curated evidence. Using 2,922 annotations for 1,112 genes, we show that synaptic genes are exceptionally well conserved and less tolerant to mutations than other genes. Many SynGO terms are significantly overrepresented among gene variations associated with intelligence, educational attainment, ADHD, autism, and bipolar disorder and among de novo variants associated with neurodevelopmental disorders, including schizophrenia. SynGO is a public, universal reference for synapse research and an online analysis platform for interpretation of large-scale -omics data (https://syngoportal.org and http://geneontology.org).

Source
http://dx.doi.org/10.1016/j.neuron.2019.05.002
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6764089
July 2019

Annotation of gene product function from high-throughput studies using the Gene Ontology.

Database (Oxford) 2019 01 1;2019. Epub 2019 Jan 1.

Zebrafish Information Network, University of Oregon, Eugene, OR, USA.

High-throughput studies constitute an essential and valued source of information for researchers. However, high-throughput experimental workflows are often complex, with multiple data sets that may contain large numbers of false positives. The representation of high-throughput data in the Gene Ontology (GO) therefore presents a challenging annotation problem, when the overarching goal of GO curation is to provide the most precise view of a gene's role in biology. To address this, representatives from annotation teams within the GO Consortium reviewed high-throughput data annotation practices. We present an annotation framework for high-throughput studies that will facilitate good standards in GO curation and, through the use of new high-throughput evidence codes, increase the visibility of these annotations to the research community.

Source
http://dx.doi.org/10.1093/database/baz007
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6355445
January 2019

Distinct proteomic profiles in monozygotic twins discordant for ischaemic stroke.

Mol Cell Biochem 2019 Jun 29;456(1-2):157-165. Epub 2019 Jan 29.

Centre for Cardiovascular Genetics, Institute of Cardiovascular Science, University College London, London, UK.

Stroke is a common disorder with significant morbidity and mortality, and complex aetiology involving both environmental and genetic risk factors. Although some of the major risk factors for stroke, such as smoking and hypertension, are well-documented, the underlying genetic and detailed molecular mechanisms remain elusive. Exploring the relevant biochemical pathways may contribute to the clinical diagnosis of stroke and shed light on its aetiology. A comparative proteomic analysis of blood serum of a pair of monozygotic (MZ) twins discordant for ischaemic stroke (IS) was performed using a label-free quantitative proteomics approach. To overcome the limit of reproducibility in the serum preparation, two separate runs were performed, each consisting of three technical replicates per sample. Biological processes associated with proteins differentially expressed between the twins were explored with gene ontology (GO) classification using the functional analysis tool g:Profiler. ANOVA test performed in Progenesis LC-MS identified 179 (run 1) and 209 (run 2) proteins as differentially expressed between the affected and unaffected twin (p < 0.05). Furthermore, the level of serum fibulin 1, an extracellular matrix protein associated with arterial stiffness, was on average 13.37-fold higher in the affected twin. Each dataset was then analysed independently, and the proteins were classified according to GO terms. The categories overrepresented in the affected twin predominantly corresponded to stroke-relevant processes, including wound healing, blood coagulation and haemostasis, with a high proportion of the proteins overexpressed in the affected twin associated with these terms. By contrast, in the unaffected twin diagnosed with atopic dermatitis, there were increased levels of keratin proteins and GO terms associated with skin development.
The identification of cellular pathways enriched in IS as well as the upregulation of fibulin 1 sheds new light on the underlying disease-causing mechanisms at the molecular level. Our findings of distinct proteomic signatures associated with IS and atopic dermatitis suggest proteomic profiling could be used as a general approach for improved diagnostic, prognostic and therapeutic strategies.

Source
http://dx.doi.org/10.1007/s11010-019-03501-2
June 2019

Improving the Gene Ontology Resource to Facilitate More Informative Analysis and Interpretation of Alzheimer's Disease Data.

Genes (Basel) 2018 Nov 29;9(12). Epub 2018 Nov 29.

UCL Institute of Cardiovascular Science, University College London, Rayne Building, 5 University Street, London WC1E 6JF, UK.

The analysis and interpretation of high-throughput datasets relies on access to high-quality bioinformatics resources, as well as processing pipelines and analysis tools. Gene Ontology (GO, geneontology.org) is a major resource for gene enrichment analysis. The aim of this project, funded by the Alzheimer's Research United Kingdom (ARUK) foundation and led by the University College London (UCL) biocuration team, was to enhance the GO resource by developing new neurological GO terms, and use GO terms to annotate gene products associated with dementia. Specifically, proteins and protein complexes relevant to processes involving amyloid-beta and tau have been annotated and the resulting annotations are denoted in GO databases as 'ARUK-UCL'. Biological knowledge presented in the scientific literature was captured through the association of GO terms with dementia-relevant protein records; GO itself was revised, and new GO terms were added. This literature biocuration increased the number of Alzheimer's-relevant gene products that were being associated with neurological GO terms, such as 'amyloid-beta clearance' or 'learning or memory', as well as neuronal structures and their compartments. Of the total 2055 annotations that we contributed for the prioritised gene products, 526 have associated proteins and complexes with neurological GO terms. To ensure that these descriptive annotations could be provided for Alzheimer's-relevant gene products, over 70 new GO terms were created. Here, we describe how the improvements in ontology development and biocuration resulting from this initiative can benefit the scientific community and enhance the interpretation of dementia data.

Source
http://dx.doi.org/10.3390/genes9120593
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6315915
November 2018

GWAS and colocalization analyses implicate carotid intima-media thickness and carotid plaque loci in cardiovascular outcomes.

Nat Commun 2018 12 3;9(1):5141. Epub 2018 Dec 3.

Department of Medicine, University of Mississippi Medical Center, Jackson, MS, 39216, USA.

Carotid artery intima media thickness (cIMT) and carotid plaque are measures of subclinical atherosclerosis associated with ischemic stroke and coronary heart disease (CHD). Here, we undertake meta-analyses of genome-wide association studies (GWAS) in 71,128 individuals for cIMT, and 48,434 individuals for carotid plaque traits. We identify eight novel susceptibility loci for cIMT, one independent association at the previously-identified PINX1 locus, and one novel locus for carotid plaque. Colocalization analysis with nearby vascular expression quantitative trait loci (cis-eQTLs) derived from arterial wall and metabolic tissues obtained from patients with CHD identifies candidate genes at two potentially additional loci, ADAMTS9 and LOXL4. LD score regression reveals significant genetic correlations between cIMT and plaque traits, and both cIMT and plaque with CHD, any stroke subtype and ischemic stroke. Our study provides insights into genes and tissue-specific regulatory mechanisms linking atherosclerosis both to its functional genomic origins and its clinical consequences in humans.

Source
DOI: http://dx.doi.org/10.1038/s41467-018-07340-5
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6277418
December 2018

Expansion of the Human Phenotype Ontology (HPO) knowledge base and resources.

Nucleic Acids Res 2019 Jan;47(D1):D1018-D1027

The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA.

The Human Phenotype Ontology (HPO), a standardized vocabulary of phenotypic abnormalities associated with 7000+ diseases, is used by thousands of researchers, clinicians, informaticians and electronic health record systems around the world. Its detailed descriptions of clinical abnormalities and computable disease definitions have made HPO the de facto standard for deep phenotyping in the field of rare disease. The HPO's interoperability with other ontologies has enabled it to be used to improve diagnostic accuracy by incorporating model organism data. It also plays a key role in the popular Exomiser tool, which identifies potential disease-causing variants from whole-exome or whole-genome sequencing data. Since the HPO was first introduced in 2008, its users have become both more numerous and more diverse. To meet these emerging needs, the project has added new content, language translations, mappings and computational tooling, as well as integrations with external community data. The HPO continues to collaborate with clinical adopters to improve specific areas of the ontology and extend standardized disease descriptions. The newly redesigned HPO website (www.human-phenotype-ontology.org) simplifies browsing terms and exploring clinical features, diseases, and human genes.

Source
DOI: http://dx.doi.org/10.1093/nar/gky1105
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6324074
January 2019

Stratification of candidate genes for Parkinson's disease using weighted protein-protein interaction network analysis.

BMC Genomics 2018 Jun 13;19(1):452. Epub 2018 Jun 13.

Department of Molecular Neuroscience, UCL Institute of Neurology, Queen Square, London, WC1B 5EH, UK.

Background: Genome wide association studies (GWAS) have helped identify large numbers of genetic loci that significantly associate with increased risk of developing diseases. However, translating genetic knowledge into understanding of the molecular mechanisms underpinning disease (i.e. disease-specific impacted biological processes) has to date proved to be a major challenge. This is primarily due to difficulties in confidently defining candidate genes at GWAS-risk loci. The goal of this study was to better characterize candidate genes within GWAS loci using a protein interactome based approach and with Parkinson's disease (PD) data as a test case.

Results: We applied a recently developed Weighted Protein-Protein Interaction Network Analysis (WPPINA) pipeline as a means to define impacted biological processes, risk pathways and therein key functional players. We used previously established Mendelian forms of PD to identify seed proteins, and to construct a protein network for genetic Parkinson's and carried out functional enrichment analyses. We isolated PD-specific processes indicating 'mitochondria stressors mediated cell death', 'immune response and signaling', and 'waste disposal' mediated through 'autophagy'. Merging the resulting protein network with data from Parkinson's GWAS we confirmed 10 candidate genes previously selected by pure proximity and were able to nominate 17 novel candidate genes for sporadic PD.

Conclusions: With this study, we were able to better characterize the underlying genetic and functional architecture of idiopathic PD, thus validating WPPINA as a robust pipeline for the in silico genetic and functional dissection of complex disorders.

Source
DOI: http://dx.doi.org/10.1186/s12864-018-4804-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6000968
June 2018

Expanding the horizons of microRNA bioinformatics.

RNA 2018 Aug 5;24(8):1005-1017. Epub 2018 Jun 5.

Institute of Cardiovascular Science, University College London, London WC1E 6JF, United Kingdom.

MicroRNA regulation of key biological and developmental pathways is a rapidly expanding area of research, accompanied by vast amounts of experimental data. This data, however, is not widely available in bioinformatic resources, making it difficult for researchers to find and analyze microRNA-related experimental data and define further research projects. We are addressing this problem by providing two new bioinformatics data sets that contain experimentally verified functional information for mammalian microRNAs involved in cardiovascular-relevant, and other, processes. To date, our resource provides over 4400 Gene Ontology annotations associated with over 500 microRNAs from human, mouse, and rat and over 2400 experimentally validated microRNA:target interactions. We illustrate how this resource can be used to create microRNA-focused interaction networks with a biological context using the known biological role of microRNAs and the mRNAs they regulate, enabling discovery of associations between gene products, biological pathways and, ultimately, diseases. This data will be crucial in advancing the field of microRNA bioinformatics and will establish consistent data sets for reproducible functional analysis of microRNAs across all biological research areas.

Source
DOI: http://dx.doi.org/10.1261/rna.065565.118
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6049505
August 2018

Exploring autophagy with Gene Ontology.

Autophagy 2018 17;14(3):419-436. Epub 2018 Feb 17.

European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton, Cambridge, UK.

Autophagy is a fundamental cellular process that is well conserved among eukaryotes. It is one of the strategies that cells use to catabolize substances in a controlled way. Autophagy is used for recycling cellular components, responding to cellular stresses and ridding cells of foreign material. Perturbations in autophagy have been implicated in a number of pathological conditions such as neurodegeneration, cardiac disease and cancer. The growing knowledge about autophagic mechanisms needs to be collected in a computable and shareable format to allow its use in data representation and interpretation. The Gene Ontology (GO) is a freely available resource that describes how and where gene products function in biological systems. It consists of 3 interrelated structured vocabularies that outline what gene products do at the biochemical level, where they act in a cell and the overall biological objectives to which their actions contribute. It also consists of 'annotations' that associate gene products with the terms. Here we describe how we represent autophagy in GO, how we create and define terms relevant to autophagy researchers and how we interrelate those terms to generate a coherent view of the process, therefore allowing an interoperable description of its biological aspects. We also describe how annotation of gene products with GO terms improves data analysis and interpretation, hence bringing a significant benefit to this field of study.

Source
DOI: http://dx.doi.org/10.1080/15548627.2017.1415189
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5915032
March 2019

Improving Interpretation of Cardiac Phenotypes and Enhancing Discovery With Expanded Knowledge in the Gene Ontology.

Circ Genom Precis Med 2018 Feb;11(2):e001813

From the Institute of Cardiovascular Science (R.C.L., V.K.K., R.E.F., N.H.C., R.P.H., P.J.T., P.D.L., P.M.E., L.C.) and Metabolism and Experimental Therapeutics, Division of Medicine (R.B.), University College London, United Kingdom; European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Hinxton, United Kingdom (P.R., D.O.-S.); Gene Ontology Consortium (P.R., T.Z.B., D.O.-S., J.A.B., D.P.H.); The Zebrafish Model Organism Database, University of Oregon, Eugene (D.G.H.); Rat Genome Database, Human Molecular Genetics Center, Medical College of Wisconsin, Milwaukee (S.J.F.L.); Arabidopsis Information Resource, Phoenix Bioinformatics, Fremont, CA (T.Z.B.); FlyBase, University of Cambridge, United Kingdom (S.T.); Mouse Genome Informatics, The Jackson Laboratory, Bar Harbor, ME (J.A.B., D.P.H.); Oxbridge BHF Centre of Regenerative Medicine, Department of Physiology, Anatomy and Genetics, University of Oxford, United Kingdom (P.R.R.); and William Harvey Heart Centre, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, United Kingdom (A.T.).

Background: A systems biology approach to cardiac physiology requires a comprehensive representation of how coordinated processes operate in the heart, as well as the ability to interpret relevant transcriptomic and proteomic experiments. The Gene Ontology (GO) Consortium provides structured, controlled vocabularies of biological terms that can be used to summarize and analyze functional knowledge for gene products.

Methods And Results: In this study, we created a computational resource to facilitate genetic studies of cardiac physiology by integrating literature curation with attention to an improved and expanded ontological representation of heart processes in the Gene Ontology. As a result, the Gene Ontology now contains terms that comprehensively describe the roles of proteins in cardiac muscle cell action potential, electrical coupling, and the transmission of the electrical impulse from the sinoatrial node to the ventricles. Evaluating the effectiveness of this approach to inform data analysis demonstrated that Gene Ontology annotations, analyzed within an expanded ontological context of heart processes, can help to identify candidate genes associated with arrhythmic disease risk loci.

Conclusions: We determined that a combination of curation and ontology development for heart-specific genes and processes supports the identification and downstream analysis of genes responsible for the spread of the cardiac action potential through the heart. Annotating these genes and processes in a structured format facilitates data analysis and supports effective retrieval of gene-centric information about cardiac defects.

Source
DOI: http://dx.doi.org/10.1161/CIRCGEN.117.001813
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5821137
February 2018

MicroRNA Biomarkers and Platelet Reactivity: The Clot Thickens.

Circ Res 2017 Jan;120(2):418-435

From the King's British Heart Foundation Centre, King's College London, United Kingdom (N.S., P.S., T.B., R.L., A.J., M.M.); and Centre for Cardiovascular Genetics, Institute of Cardiovascular Science, University College London, United Kingdom (R.P.H., R.C.L.).

Over the last few years, several groups have evaluated the potential of microRNAs (miRNAs) as biomarkers for cardiometabolic disease. In this review, we discuss the emerging literature on the role of miRNAs and other small noncoding RNAs in platelets and in the circulation, and the potential use of miRNAs as biomarkers for platelet activation. Platelets are a major source of miRNAs, YRNAs, and circular RNAs. By harnessing multiomics approaches, we may gain valuable insights into their potential function. Because not all miRNAs are detectable in the circulation, we also created a gene ontology annotation for circulating miRNAs using the gene ontology term extracellular space as part of blood plasma. Finally, we share key insights for measuring circulating miRNAs. We propose ways to standardize miRNA measurements, in particular by using platelet-poor plasma to avoid confounding caused by residual platelets in plasma or by adding RNase inhibitors to serum to reduce degradation. This should enhance comparability of miRNA measurements across different cohorts. We provide recommendations for future miRNA biomarker studies, emphasizing the need for accurate interpretation within a biological and methodological context.

Source
DOI: http://dx.doi.org/10.1161/CIRCRESAHA.116.309303
January 2017

Vascular Endothelial Growth Factor (VEGF) Promotes Assembly of the p130Cas Interactome to Drive Endothelial Chemotactic Signaling and Angiogenesis.

Mol Cell Proteomics 2017 Feb 22;16(2):168-180. Epub 2016 Dec 22.

From the Centre for Cardiovascular Biology and Medicine, Division of Medicine, The Rayne Building, University College London, London WC1E 6JJ, United Kingdom.

p130Cas is a polyvalent adapter protein essential for cardiovascular development, and with a key role in cell movement. In order to identify the pathways by which p130Cas exerts its biological functions in endothelial cells we mapped the p130Cas interactome and its dynamic changes in response to VEGF using high-resolution mass spectrometry and reconstruction of protein interaction (PPI) networks with the aid of multiple PPI databases. VEGF enriched the p130Cas interactome in proteins involved in actin cytoskeletal dynamics and cell movement, including actin-binding proteins, small GTPases and regulators or binders of GTPases. Detailed studies showed that p130Cas association of the GTPase-binding scaffold protein, IQGAP1, plays a key role in VEGF chemotactic signaling, endothelial polarization, VEGF-induced cell migration, and endothelial tube formation. These findings indicate a cardinal role for assembly of the p130Cas interactome in mediating the cell migratory response to VEGF in angiogenesis, and provide a basis for further studies of p130Cas in cell movement.

Source
DOI: http://dx.doi.org/10.1074/mcp.M116.064428
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5294206
February 2017

Weighted Protein Interaction Network Analysis of Frontotemporal Dementia.

J Proteome Res 2017 Feb 12;16(2):999-1013. Epub 2017 Jan 12.

Department of Molecular Neuroscience, UCL Institute of Neurology, Russell Square House, 9-12 Russell Square House, London WC1B 5EH, United Kingdom.

The genetic analysis of complex disorders has undoubtedly led to the identification of a wealth of associations between genes and specific traits. However, moving from genetics to biochemistry one gene at a time has, to date, proved rather inefficient and under-powered to comprehensively explain the molecular basis of phenotypes. Here we present a novel approach, weighted protein-protein interaction network analysis (W-PPI-NA), to highlight key functional players within relevant biological processes associated with a given trait. This is exemplified in the current study by applying W-PPI-NA to frontotemporal dementia (FTD): we first built the state-of-the-art FTD protein network (FTD-PN) and then analyzed both its topological and functional features. The FTD-PN resulted from the sum of the individual interactomes built around FTD-spectrum genes, leading to a total of 4198 nodes. Twenty-nine of the 4198 nodes, called inter-interactome hubs (IIHs), represented those interactors able to bridge over 60% of the individual interactomes. Functional annotation analysis not only reiterated and reinforced previous findings from single genes and gene-coexpression analyses but also indicated a number of novel potential disease-related mechanisms, including DNA damage response, gene expression regulation, and cell waste disposal, and potential biomarkers or therapeutic targets including EP300. These processes and targets likely represent the functional core impacted in FTD, reflecting the underlying genetic architecture contributing to disease. The approach presented in this study can be applied to other complex traits for which risk-causative genes are known, as it provides a promising tool for setting the foundations for collating genomics and wet laboratory data in a bidirectional manner. This is and will be critical to accelerate molecular target prioritization and drug discovery.
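
The hub definition in this abstract (interactors that bridge over 60% of the individual interactomes) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name, the toy seed and protein names, the choice to count each seed as a member of its own interactome, and the strict "greater than 60%" reading of the threshold are all assumptions made for illustration, and the weighting step of W-PPI-NA is omitted.

```python
from collections import Counter


def inter_interactome_hubs(interactomes, threshold=0.6):
    """Return nodes present in more than `threshold` of the interactomes.

    `interactomes` maps a seed protein to the set of its interactors
    (hypothetical structure; each seed is counted in its own interactome).
    """
    if not interactomes:
        return set()
    counts = Counter()
    for seed, partners in interactomes.items():
        # Count each node at most once per interactome.
        counts.update(set(partners) | {seed})
    cutoff = threshold * len(interactomes)
    return {node for node, n in counts.items() if n > cutoff}


# Toy data: hypothetical seeds and interactors, not taken from the study.
toy = {
    "SEED_A": {"P1", "P2", "P3"},
    "SEED_B": {"P1", "P2"},
    "SEED_C": {"P1", "P4"},
}

# The combined network is the union of all individual interactomes.
network_nodes = set(toy) | set().union(*toy.values())
hubs = inter_interactome_hubs(toy)
```

With the toy data, the combined network has seven nodes, and only the interactors shared by at least two of the three interactomes pass the 60% cutoff.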

Source
DOI: http://dx.doi.org/10.1021/acs.jproteome.6b00934
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6152613
February 2017

Annotation Extensions.

Methods Mol Biol 2017;1446:233-243

Functional Gene Annotation Initiative, Centre for Cardiovascular Genetics, Institute of Cardiovascular Science, University College London, 5 University Street, London, WC1E 6JF, UK.

The specificity of knowledge that Gene Ontology (GO) annotations currently can represent is still restricted by the legacy format of the GO annotation file, a format intentionally designed for simplicity to keep the barriers to entry low and thus encourage initial adoption. Historically, the information that could be captured in a GO annotation was simply the role or location of a gene product, although genetically interacting or binding partners could be specified. While there was no mechanism within the original GO annotation format for capturing additional information about the context of a GO term, such as the target gene of an activity or the location of a molecular function, the long-term vision for the GO Consortium was to provide greater expressivity in its annotations to capture physiologically relevant information. Thus, as a step forwards, the GO Consortium has introduced a new field into the annotation format, annotation extensions, which can be used to capture valuable contextual detail. This provides experimentally verified links between gene products and other physiological information that is crucial for accurate analysis of pathway and network data. This chapter will provide a simple overview of annotation extensions, illustrated with examples of their usage, and explain why they are useful for scientists and bioinformaticians alike.
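
As a rough illustration of what an annotation extension looks like in practice, the sketch below parses an extension string of the form `relation(identifier)`. It assumes the convention used in GO annotation files that a comma joins extensions that apply together and a pipe separates independent alternatives; the function name, the regular expression, and the example identifiers are illustrative placeholders, not content from the chapter.

```python
import re

# One extension looks like relation(ID), e.g. occurs_in(CL:0000746).
_EXTENSION = re.compile(r"^\s*([A-Za-z_]+)\(([^()]+)\)\s*$")


def parse_extensions(field):
    """Parse an annotation-extension string into groups of (relation, id).

    Assumed convention: '|' separates independent groups, ',' joins the
    extensions that belong to the same group.
    """
    groups = []
    for group in filter(None, (g.strip() for g in field.split("|"))):
        parsed = []
        for item in group.split(","):
            match = _EXTENSION.match(item)
            if match is None:
                raise ValueError(f"Malformed extension: {item!r}")
            parsed.append((match.group(1), match.group(2)))
        groups.append(parsed)
    return groups


# Illustrative string; the identifiers are placeholders for this example.
example = "occurs_in(CL:0000746),has_input(UniProtKB:P12345)|part_of(GO:0005634)"
parsed = parse_extensions(example)
```

Here `parsed` holds two groups: the first pairs a location with an input, and the second stands alone. Keeping the groups separate preserves the distinction between contextual details that hold jointly and those that are independent statements.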

Source
DOI: http://dx.doi.org/10.1007/978-1-4939-3743-1_17
December 2017

How Does the Scientific Community Contribute to Gene Ontology?

Authors:
Ruth C Lovering

Methods Mol Biol 2017;1446:85-93

Functional Gene Annotation Initiative, Centre for Cardiovascular Genetics, Institute of Cardiovascular Science, University College London, 5 University Street, London, WC1E 6JF, UK.

Collaborations between the scientific community and members of the Gene Ontology (GO) Consortium have led to an increase in the number and specificity of GO terms, as well as an increase in the number of GO annotations. A variety of approaches have been taken to encourage research scientists to contribute to the GO, but the success of these approaches has been variable. This chapter reviews both the successes and failures of engaging the scientific community in GO development and annotation, and provides motivation and advice to encourage individual researchers to contribute to GO.

Source
DOI: http://dx.doi.org/10.1007/978-1-4939-3743-1_7
December 2017

An expanded evaluation of protein function prediction methods shows an improvement in accuracy.

Genome Biol 2016 Sep 7;17(1):184. Epub 2016 Sep 7.

Department of Information Technology, University of Turku, Turku, Finland.

Background: A major bottleneck in our understanding of the molecular underpinnings of life is the assignment of function to proteins. While molecular experiments provide the most reliable annotation of proteins, their relatively low throughput and restricted purview have led to an increasing role for computational function prediction. However, assessing methods for protein function prediction and tracking progress in the field remain challenging.

Results: We conducted the second critical assessment of functional annotation (CAFA), a timed challenge to assess computational methods that automatically assign protein function. We evaluated 126 methods from 56 research groups for their ability to predict biological functions using Gene Ontology and gene-disease associations using Human Phenotype Ontology on a set of 3681 proteins from 18 species. CAFA2 featured expanded analysis compared with CAFA1, with regard to data set size, variety, and assessment metrics. To review progress in the field, the analysis compared the best methods from CAFA1 to those of CAFA2.

Conclusions: The top-performing methods in CAFA2 outperformed those from CAFA1. This increased accuracy can be attributed to a combination of the growing number of experimental annotations and improved methods for function prediction. The assessment also revealed that the definition of top-performing algorithms is ontology specific, that different performance metrics can be used to probe the nature of accurate predictions, and the relative diversity of predictions in the biological process and human phenotype ontologies. While there was methodological improvement between CAFA1 and CAFA2, the interpretation of results and usefulness of individual methods remain context-dependent.

Source
DOI: http://dx.doi.org/10.1186/s13059-016-1037-6
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5015320
September 2016
-->