Publications by authors named "Ioannis Xenarios"

168 Publications

Blood Virosphere in Febrile Tanzanian Children.

Emerg Microbes Infect 2021 Apr 30:1-237. Epub 2021 Apr 30.

Division of Infectious Diseases, Geneva University Hospitals, 1205 Geneva, Switzerland.

Viral infections are the leading cause of childhood acute febrile illnesses motivating consultation in sub-Saharan Africa. The majority of causal viruses are never identified in low-resource clinical settings, as such testing is either not part of routine screening or available diagnostic tools have limited ability to detect new/unexpected viral variants. An in-depth exploration of the blood virome is therefore necessary to clarify the potential viral origin of fever in children. Metagenomic next-generation sequencing is a powerful tool for such broad investigations, allowing the detection of RNA and DNA viral genomes. Here, we describe the blood virome of 816 febrile children (<5 years) presenting at outpatient departments in Dar es Salaam over one year. We show that half of the patients (394/816) had at least one detected virus recognized as a cause of human infection/disease (13.8% enteroviruses (enterovirus A, B, C, and rhinovirus A and C), 12% rotaviruses, 11% human herpesvirus type 6). Additionally, we report the detection of a large number of viruses (related to arthropod, vertebrate or mammalian viral species) not yet known to cause human infection/disease, highlighting those that should be on the radar and deserve specific attention in the febrile paediatric population and, more broadly, for surveillance of emerging pathogens. ClinicalTrials.gov identifier: NCT02225769.
DOI: http://dx.doi.org/10.1080/22221751.2021.1925161
April 2021

Virosaurus A Reference to Explore and Capture Virus Genetic Diversity.

Viruses 2020 11 1;12(11). Epub 2020 Nov 1.

Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, 1011 Geneva, Switzerland.

The huge genetic diversity of circulating viruses is a challenge for diagnostic assays for emerging or rare viral diseases. High-throughput technology offers a new opportunity to explore the global virome of patients without preconception about the culpable pathogens. It requires a solid reference dataset to be accurate. Virosaurus has been designed to offer a non-biased, automated and annotated database for clinical metagenomics studies and diagnosis. Raw viral sequences have been extracted from GenBank, and cleaned up to remove potentially erroneous sequences. Complete sequences have been identified for all genera infecting vertebrates, plants and other eukaryotes (insect, fungus, etc.). To facilitate the analysis of clinically relevant viruses, we have annotated all sequences with official and common virus names, acronyms, genotypes, and genomic features (linear, circular, DNA, RNA, etc.). Sequences have been clustered to remove redundancy at 90% or 98% identity. The analysis of clustering results reveals the state of the virus genetic landscape knowledge. Because herpes and poxviruses were under-represented in complete genomes considering their potential diversity in nature, we used genes instead of complete genomes for those in Virosaurus.
DOI: http://dx.doi.org/10.3390/v12111248
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7693494
November 2020

Predicting combinations of immunomodulators to enhance dendritic cell-based vaccination based on a hybrid experimental and computational platform.

Comput Struct Biotechnol J 2020 8;18:2217-2227. Epub 2020 Aug 8.

Department of Oncology, University Hospital of Lausanne, Lausanne, Switzerland.

Dendritic cell (DC)-based vaccines have been largely used in the adjuvant setting for the treatment of cancer; however, despite their proven safety, clinical outcomes still remain modest. In order to improve their efficacy, DC-based vaccines are often combined with one or multiple immunomodulatory agents. However, the selection of the most promising combinations is hampered by the plethora of agents available and the unknown interplay between these different agents. To address this point, we developed a hybrid experimental and computational platform to predict the effects and immunogenicity of dual combinations of stimuli once combined with DC vaccination, based on the experimental data of a variety of assays to monitor different aspects of the immune response after a single stimulus. To assess the stimuli behavior when used as single agents, we first developed an in vitro co-culture system of T cell priming using monocyte-derived DCs loaded with whole tumor lysate to prime autologous peripheral blood mononuclear cells in the presence of the chosen stimuli, as single adjuvants, and characterized the elicited response assessing 18 different phenotypic and functional traits important for an efficient anti-cancer response. We then developed and applied a prediction algorithm, generating a ranking for all possible dual combinations of the different single stimuli considered here. The ranking generated by the prediction tool was then validated with experimental data showing a strong correlation with the predicted scores, confirming that the top ranked conditions globally significantly outperformed the worst conditions. Thus, the method developed here constitutes an innovative tool for the selection of the best immunomodulatory agents to implement in future DC-based vaccines.
DOI: http://dx.doi.org/10.1016/j.csbj.2020.08.001
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7475195
August 2020

Three-dimensional chromatin interactions remain stable upon CAG/CTG repeat expansion.

Sci Adv 2020 Jul 3;6(27):eaaz4012. Epub 2020 Jul 3.

UK Dementia Research Institute at Cardiff University, Hadyn Ellis Building, Maindy Road, CF24 4HQ Cardiff, UK.

Expanded CAG/CTG repeats underlie 13 neurological disorders, including myotonic dystrophy type 1 (DM1) and Huntington's disease (HD). Upon expansion, disease loci acquire heterochromatic characteristics, which may provoke changes to chromatin conformation and thereby affect both gene expression and repeat instability. Here, we tested this hypothesis by performing 4C sequencing at the DMPK and HTT loci from DM1 and HD-derived cells. We find that allele sizes ranging from 15 to 1700 repeats displayed similar chromatin interaction profiles. This was true for both loci and for alleles with different DNA methylation levels and CTCF binding. Moreover, the ectopic insertion of an expanded CAG repeat tract did not change the conformation of the surrounding chromatin. We conclude that CAG/CTG repeat expansions are not enough to alter chromatin conformation in cis. Therefore, it is unlikely that changes in chromatin interactions drive repeat instability or changes in gene expression in these disorders.
DOI: http://dx.doi.org/10.1126/sciadv.aaz4012
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7334000
July 2020

Setting the basis of best practices and standards for curation and annotation of logical models in biology: highlights of the [BC]2 2019 CoLoMoTo/SysMod Workshop.

Brief Bioinform 2021 Mar;22(2):1848-1859

The fast accumulation of biological data calls for their integration, analysis and exploitation through more systematic approaches. The generation of novel, relevant hypotheses from this enormous quantity of data remains challenging. Logical models have long been used to answer a variety of questions regarding the dynamical behaviours of regulatory networks. As the number of published logical models increases, there is a pressing need for systematic model annotation, referencing and curation in community-supported and standardised formats. This article summarises the key topics and future directions of a meeting entitled 'Annotation and curation of computational models in biology', organised as part of the 2019 [BC]2 conference. The purpose of the meeting was to develop and drive forward a plan towards the standardised annotation of logical models, review and connect various ongoing projects of experts from different communities involved in the modelling and annotation of molecular biological entities, interactions, pathways and models. This article defines a roadmap towards the annotation and curation of logical models, including milestones for best practices and minimum standard requirements.
DOI: http://dx.doi.org/10.1093/bib/bbaa046
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7986594
March 2021

New genome assembly of the barn owl.

Ecol Evol 2020 Mar 19;10(5):2284-2298. Epub 2020 Feb 19.

Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland.

New genomic tools open doors to study ecology, evolution, and population genomics of wild animals. For the Barn owl species complex, a cosmopolitan nocturnal raptor, a very fragmented draft genome was assembled for the American species (Jarvis et al. 2014). To improve the genome, we assembled de novo Illumina and Pacific Biosciences (PacBio) long-read sequences of its European counterpart. This genome assembly of 1.219 Gbp comprises 21,509 scaffolds and results in an N50 of 4,615,526 bp. BUSCO (Benchmarking Universal Single-Copy Orthologs) analysis revealed an assembly completeness of 94.8% with only 1.8% of the genes missing out of 4,915 avian orthologs searched, a proportion similar to that found in the genomes of the zebra finch or the collared flycatcher. By mapping the reads of the female American barn owl to the male European barn owl reads, we detected several structural variants and identified 70 Mbp of the Z chromosome. The barn owl scaffolds were further mapped to the chromosomes of the zebra finch. In addition, the completeness of the European barn owl genome is demonstrated with 94 of 128 proteins missing in the chicken genome retrieved in the European barn owl transcripts. This improved genome will help future barn owl population genomic investigations.
DOI: http://dx.doi.org/10.1002/ece3.5991
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7069322
March 2020
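
The N50 quoted in the barn owl abstract above (4,615,526 bp) is a standard contiguity statistic: the length of the scaffold at which the running total of scaffold lengths, sorted from longest to shortest, first reaches half of the assembly size. The following is a minimal Python sketch of that definition, not the authors' pipeline; the function name `n50` and the toy scaffold lengths are assumptions made purely for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in bp).

    N50 is the length L such that scaffolds of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)   # longest scaffold first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly (not data from the paper): total length 300.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))  # 80 + 70 = 150 reaches half of 300, so N50 is 70
```

A higher N50 for the same total assembly size means that fewer, longer scaffolds carry most of the sequence, which is why the statistic is commonly reported alongside the scaffold count.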

Integrated proteogenomic deep sequencing and analytics accurately identify non-canonical peptides in tumor immunopeptidomes.

Nat Commun 2020 03 10;11(1):1293. Epub 2020 Mar 10.

Ludwig Institute for Cancer Research, University of Lausanne, Agora Center, Rue du Bugnon 25A, 1005, Lausanne, Switzerland.

Efforts to precisely identify tumor human leukocyte antigen (HLA) bound peptides capable of mediating T cell-based tumor rejection still face important challenges. Recent studies suggest that non-canonical tumor-specific HLA peptides derived from annotated non-coding regions could elicit anti-tumor immune responses. However, sensitive and accurate mass spectrometry (MS)-based proteogenomics approaches are required to robustly identify these non-canonical peptides. We present an MS-based analytical approach that characterizes the non-canonical tumor HLA peptide repertoire, by incorporating whole exome sequencing, bulk and single-cell transcriptomics, ribosome profiling, and two MS/MS search tools in combination. This approach results in the accurate identification of hundreds of shared and tumor-specific non-canonical HLA peptides, including an immunogenic peptide derived from an open reading frame downstream of the melanoma stem cell marker gene ABCB5. These findings hold great promise for the discovery of previously unknown tumor antigens for cancer immunotherapy.
DOI: http://dx.doi.org/10.1038/s41467-020-14968-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7064602
March 2020

Contribution of exome sequencing to the identification of genes involved in the response to clopidogrel in cardiovascular patients.

J Thromb Haemost 2020 06 20;18(6):1425-1434. Epub 2020 Mar 20.

Geneva Platelet Group, Faculty of Medicine, University of Geneva, Geneva, Switzerland.

Background: On-clopidogrel platelet reactivity (PR) is associated with the risk of thrombotic or bleeding event in selected populations of high-risk patients. PR is a highly heritable phenotype and a few variants of cytochrome genes, essentially CYP2C19, are associated with PR but only explain 5% to 12% of the variability.

Objective: The aim of this study is to delineate genetic determinants of on-clopidogrel PR using high-throughput sequencing.

Methods: We performed a whole exome sequencing of 96 low- and matched high-PR patients in a discovery cohort. Exomes from genes with variants significantly associated with PR were sequenced in 96 low- and matched high-PR patients from an independent replication cohort.

Results: We identified 585 variants in 417 genes with an adjusted P value < .05. In the replication cohort, all top variants including CYP2C8, CYP2C18, and CYP2C19 from the discovery population were found again. An original network analysis identified several candidate genes of potential interest such as a regulator of PI3K, a key actor in the downstream signaling pathway of the P2Y12 receptor.

Conclusion: This study emphasizes the role of CYP-related genes as major regulators of clopidogrel response, including the poorly investigated CYP2C8 and CYP2C18.
DOI: http://dx.doi.org/10.1111/jth.14776
June 2020
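
The clopidogrel abstract above reports variants with "an adjusted P value < .05" but does not name the adjustment method. As an illustration only, the sketch below uses the Benjamini-Hochberg false discovery rate procedure, a common choice for this kind of screen; treating it as the method used in the study is an assumption, and the function name and example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: BH is used here only as a familiar example of p-value
    adjustment; the abstract does not state which method was applied.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four variants.
raw = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adjusted) if q < 0.05]
print(adjusted, significant)
```

With these toy values only the first variant stays below the 0.05 threshold after adjustment, which shows how a raw p-value of 0.03 or 0.04 can lose significance once multiple testing is taken into account.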

Fisetin protects against cardiac cell death through reduction of ROS production and caspases activity.

Sci Rep 2020 02 19;10(1):2896. Epub 2020 Feb 19.

Quantitative Biology Unit, Luxembourg Institute of Health (LIH), Luxembourg, 1445, Strassen, Luxembourg.

Myocardial infarction (MI) is a leading cause of death worldwide. Reperfusion is considered as an optimal therapy following cardiac ischemia. However, the promotion of a rapid elevation of O2 levels in ischemic cells produces high amounts of reactive oxygen species (ROS) leading to myocardial tissue injury. This phenomenon is called ischemia reperfusion injury (IRI). We aimed at identifying new and effective compounds to treat MI and minimize IRI. We previously studied heart regeneration following myocardial injury in zebrafish and described each step of the regeneration process, from the day of injury until complete recovery, in terms of transcriptional responses. Here, we mined the data and performed a deep in silico analysis to identify drugs highly likely to induce cardiac regeneration. Fisetin was identified as the top candidate. We validated its effects in an in vitro model of MI/IRI in mammalian cardiac cells. Fisetin enhances viability of rat cardiomyocytes following hypoxia/starvation - reoxygenation. It inhibits apoptosis, decreases ROS generation and caspase activation and protects from DNA damage. Interestingly, fisetin also activates genes involved in cell proliferation. Fisetin is thus a highly promising candidate drug with clinical potential to protect from ischemic damage following MI and to overcome IRI.
DOI: http://dx.doi.org/10.1038/s41598-020-59894-4
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7031222
February 2020

HAMAP as SPARQL rules: A portable annotation pipeline for genomes and proteomes.

Gigascience 2020 02;9(2)

Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Centre Médical Universitaire, 1 rue Michel-Servet, CH-1211 Geneva 4, Switzerland.

Background: Genome and proteome annotation pipelines are generally custom built and not easily reusable by other groups. This leads to duplication of effort, increased costs, and suboptimal annotation quality. One way to address these issues is to encourage the adoption of annotation standards and technological solutions that enable the sharing of biological knowledge and tools for genome and proteome annotation.

Results: Here we demonstrate one approach to generate portable genome and proteome annotation pipelines that users can run without recourse to custom software. This proof of concept uses our own rule-based annotation pipeline HAMAP, which provides functional annotation for protein sequences to the same depth and quality as UniProtKB/Swiss-Prot, and the World Wide Web Consortium (W3C) standards Resource Description Framework (RDF) and SPARQL (a recursive acronym for the SPARQL Protocol and RDF Query Language). We translate complex HAMAP rules into the W3C standard SPARQL 1.1 syntax, and then apply them to protein sequences in RDF format using freely available SPARQL engines. This approach supports the generation of annotation that is identical to that generated by our own in-house pipeline, using standard, off-the-shelf solutions, and is applicable to any genome or proteome annotation pipeline.

Conclusions: HAMAP SPARQL rules are freely available for download from the HAMAP FTP site, ftp://ftp.expasy.org/databases/hamap/sparql/, under the CC-BY-ND 4.0 license. The annotations generated by the rules are under the CC-BY 4.0 license. A tutorial and supplementary code to use HAMAP as SPARQL are available on GitHub at https://github.com/sib-swiss/HAMAP-SPARQL, and general documentation about HAMAP can be found on the HAMAP website at https://hamap.expasy.org.
DOI: http://dx.doi.org/10.1093/gigascience/giaa003
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7007698
February 2020

PamgeneAnalyzeR: open and reproducible pipeline for kinase profiling.

Bioinformatics 2020 12;36(20):5117-5119

Center for Integrative Genomics, University of Lausanne, Lausanne CH-1015, Switzerland.

Protein phosphorylation, catalyzed by protein kinases, is the most common post-translational modification. It increases the functional diversity of the proteome, influences various aspects of normal physiology, and can be altered in disease states. High-throughput profiling of kinases is becoming an essential experimental approach to investigate their activity, and this can be achieved using technologies such as PamChip® arrays provided by PamGene for kinase activity measurement. Here, we present 'pamgeneAnalyzeR', an R package developed as an alternative to the manual steps necessary to extract the data from PamChip® peptide microarray images in a reproducible and robust manner. The extracted data can be directly used for downstream analysis.

Availability And Implementation: PamgeneAnalyzeR is implemented in R and can be obtained from https://github.com/amelbek/pamgeneAnalyzeR.
DOI: http://dx.doi.org/10.1093/bioinformatics/btz858
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7755406
December 2020

Incorporating heterogeneous sampling probabilities in continuous phylogeographic inference - Application to H5N1 spread in the Mekong region.

Bioinformatics 2020 04;36(7):2098-2104

Spatial Epidemiology Lab (SpELL), Université Libre de Bruxelles, 1050 Bruxelles, Belgium.

Motivation: The potentially low precision associated with the geographic origin of sampled sequences represents an important limitation for spatially explicit (i.e. continuous) phylogeographic inference of fast-evolving pathogens such as RNA viruses. A substantial proportion of publicly available sequences is geo-referenced at broad spatial scale such as the administrative unit of origin, rather than more precise locations (e.g. geographic coordinates). Most frequently, such sequences are either discarded prior to continuous phylogeographic inference or arbitrarily assigned to the geographic coordinates of the centroid of their administrative area of origin for lack of a better alternative.

Results: Here we implement and describe a new approach that incorporates heterogeneous prior sampling probabilities over a geographic area. External data, such as outbreak locations, are used to specify these prior sampling probabilities over a collection of sub-polygons. We apply this new method to the analysis of highly pathogenic avian influenza H5N1 clade data in the Mekong region. Our method makes it possible to properly include, in continuous phylogeographic analyses, H5N1 sequences that are only associated with large administrative areas of origin and to assign them more accurate locations. Finally, we use continuous phylogeographic reconstructions to analyse the dispersal dynamics of different H5N1 clades and investigate the impact of environmental factors on lineage dispersal velocities.

Availability And Implementation: Our new method allowing heterogeneous sampling priors for continuous phylogeographic inference is implemented in the open-source multi-platform software package BEAST 1.10.

Supplementary Information: Supplementary data are available at Bioinformatics online.
DOI: http://dx.doi.org/10.1093/bioinformatics/btz882
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7141868
April 2020

Sleep-wake-driven and circadian contributions to daily rhythms in gene expression and chromatin accessibility in the murine cortex.

Proc Natl Acad Sci U S A 2019 12 27;116(51):25773-25783. Epub 2019 Nov 27.

Center for Integrative Genomics, Faculty of Biology and Medicine, University of Lausanne, CH-1015 Lausanne, Switzerland.

The timing and duration of sleep result from the interaction between a homeostatic sleep-wake-driven process and a periodic circadian process, and involve changes in gene regulation and expression. Unraveling the contributions of both processes and their interaction to transcriptional and epigenomic regulatory dynamics requires sampling over time under conditions of unperturbed and perturbed sleep. We profiled mRNA expression and chromatin accessibility in the cerebral cortex of mice over a 3-d period, including a 6-h sleep deprivation (SD) on day 2. We used mathematical modeling to integrate time series of mRNA expression data with sleep-wake history, which established that a large proportion of rhythmic genes are governed by the homeostatic process with varying degrees of interaction with the circadian process, sometimes working in opposition. Remarkably, SD caused long-term effects on gene-expression dynamics, outlasting phenotypic recovery, most strikingly illustrated by a damped oscillation of most core clock genes, including Arntl/Bmal1, suggesting that enforced wakefulness directly impacts the molecular clock machinery. Chromatin accessibility proved highly plastic and dynamically affected by SD. Dynamics in distal regions, rather than promoters, correlated with mRNA expression, implying that changes in expression result from constitutively accessible promoters under the influence of enhancers or repressors. Serum response factor (SRF) was predicted as a transcriptional regulator driving immediate response, suggesting that SRF activity mirrors the build-up and release of sleep pressure. Our results demonstrate that a single, short SD has long-term aftereffects at the genomic regulatory level and highlight the importance of the sleep-wake distribution to diurnal rhythmicity and circadian processes.
DOI: http://dx.doi.org/10.1073/pnas.1910590116
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6925978
December 2019

Enzyme annotation in UniProtKB using Rhea.

Bioinformatics 2020 03;36(6):1896-1901

Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Geneva 1211-4, Switzerland.

Motivation: To provide high quality computationally tractable enzyme annotation in UniProtKB using Rhea, a comprehensive expert-curated knowledgebase of biochemical reactions which describes reaction participants using the ChEBI (Chemical Entities of Biological Interest) ontology.

Results: We replaced existing textual descriptions of biochemical reactions in UniProtKB with their equivalents from Rhea, which is now the standard for annotation of enzymatic reactions in UniProtKB. We developed improved search and query facilities for the UniProt website, REST API and SPARQL endpoint that leverage the chemical structure data, nomenclature and classification that Rhea and ChEBI provide.

Availability And Implementation: UniProtKB at https://www.uniprot.org; UniProt REST API at https://www.uniprot.org/help/api; UniProt SPARQL endpoint at https://sparql.uniprot.org/; Rhea at https://www.rhea-db.org.
DOI: http://dx.doi.org/10.1093/bioinformatics/btz817
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7162351
March 2020

A multi-omics digital research object for the genetics of sleep regulation.

Sci Data 2019 10 31;6(1):258. Epub 2019 Oct 31.

Ludwig Cancer Research/CHUV-UNIL, Lausanne, Switzerland.

With the aim of uncovering the molecular pathways underlying the regulation of sleep, we recently assembled an extensive and comprehensive systems genetics dataset interrogating a genetic reference population of mice at the levels of the genome, the brain and liver transcriptomes, the plasma metabolome, and the sleep-wake phenome. To facilitate a meaningful and efficient re-use of this public resource by others, we designed, described in detail, and made available a Digital Research Object (DRO), embedding data, documentation, and analytics. We present and discuss both the advantages and limitations of our multi-modal resource and analytic pipeline. The reproducibility of the results was tested by a bioinformatician not implicated in the original project and the robustness of results was assessed by re-annotating genetic and transcriptome data from the mm9 to the mm10 mouse genome assembly.
DOI: http://dx.doi.org/10.1038/s41597-019-0171-x
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6823400
October 2019

Characterization and mutagenesis of Chinese hamster ovary cells endogenous retroviruses to inactivate viral particle release.

Biotechnol Bioeng 2020 02 12;117(2):466-485. Epub 2019 Nov 12.

Institute of Biotechnology and Department of Fundamental Microbiology, University of Lausanne, Lausanne, Switzerland.

The Chinese hamster ovary (CHO) cells used to produce biopharmaceutical proteins are known to contain type-C endogenous retrovirus (ERV) sequences in their genome and to release retroviral-like particles. Although evidence for their infectivity is missing, this has raised safety concerns. As the genomic origin of these particles remained unclear, we characterized type-C ERV elements at the genome, transcriptome, and viral particle RNA levels. We identified 173 type-C ERV sequences clustering into three functionally conserved groups. Transcripts from one type-C ERV group were full-length, with intact open reading frames, and cognate viral genome RNA was loaded into retroviral-like particles, suggesting that this ERV group may produce functional viruses. CRISPR-Cas9 genome editing was used to disrupt the gag gene of the expressed type-C ERV group. Comparison of CRISPR-derived mutations at the DNA and RNA level led to the identification of a single ERV as the main source of the release of RNA-loaded viral particles. Clones bearing a Gag loss-of-function mutation in this ERV showed a reduction of RNA-containing viral particle release down to detection limits, without compromising cell growth or therapeutic protein production. Overall, our study provides a strategy to mitigate potential viral particle contaminations resulting from ERVs during biopharmaceutical manufacturing.
DOI: http://dx.doi.org/10.1002/bit.27200
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7003738
February 2020

HENA, heterogeneous network-based data set for Alzheimer's disease.

Sci Data 2019 08 14;6(1):151. Epub 2019 Aug 14.

Quretec Ltd., Ülikooli 6a, 51003, Tartu, Estonia.

Alzheimer's disease and other types of dementia are the top cause for disabilities in later life and various types of experiments have been performed to understand the underlying mechanisms of the disease with the aim of coming up with potential drug targets. These experiments have been carried out by scientists working in different domains such as proteomics, molecular biology, clinical diagnostics and genomics. The results of such experiments are stored in the databases designed for collecting data of similar types. However, in order to get a systematic view of the disease from these independent but complementary data sets, it is necessary to combine them. In this study we describe a heterogeneous network-based data set for Alzheimer's disease (HENA). Additionally, we demonstrate the application of state-of-the-art graph convolutional networks, i.e. deep learning methods for the analysis of such large heterogeneous biological data sets. We expect HENA to allow scientists to explore and analyze their own results in the broader context of Alzheimer's disease research.
DOI: http://dx.doi.org/10.1038/s41597-019-0152-0
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6694132
August 2019

Gene expression across mammalian organ development.

Nature 2019 07 26;571(7766):505-509. Epub 2019 Jun 26.

Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA.

The evolution of gene expression in mammalian organ development remains largely uncharacterized. Here we report the transcriptomes of seven organs (cerebrum, cerebellum, heart, kidney, liver, ovary and testis) across developmental time points from early organogenesis to adulthood for human, rhesus macaque, mouse, rat, rabbit, opossum and chicken. Comparisons of gene expression patterns identified correspondences of developmental stages across species, and differences in the timing of key events during the development of the gonads. We found that the breadth of gene expression and the extent of purifying selection gradually decrease during development, whereas the amount of positive selection and expression of new genes increase. We identified differences in the temporal trajectories of expression of individual genes across species, with brain tissues showing the smallest percentage of trajectory changes, and the liver and testis showing the largest. Our work provides a resource of developmental transcriptomes of seven organs across seven species, and comparative analyses that characterize the development and evolution of mammalian organs.
DOI: http://dx.doi.org/10.1038/s41586-019-1338-5
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6658352
July 2019

DynaStI: A Dynamic Retention Time Database for Steroidomics.

Metabolites 2019 Apr 30;9(5). Epub 2019 Apr 30.

School of Pharmaceutical Sciences, University of Geneva, University of Lausanne, 1206 Geneva, Switzerland.

Steroidomics studies face the challenge of separating analytical compounds with very similar structures (i.e., isomers). Liquid chromatography (LC) is commonly used to this end, but the shared core structure of this family of compounds compromises effective separations among the numerous chemical analytes with comparable physico-chemical properties. Careful tuning of the mobile phase gradient and an appropriate choice of the stationary phase can be used to overcome this problem, in turn modifying the retention times in different ways for each compound. In the usual workflow, this approach is suboptimal for the annotation of features based on retention times since it requires characterizing a library of known compounds for every fine-tuned configuration. We introduce a software solution, DynaStI, that is capable of annotating liquid chromatography-mass spectrometry (LC-MS) features by dynamically generating the retention times from a database containing intrinsic properties of a library of metabolites. DynaStI uses the well-established linear solvent strength (LSS) model for reversed-phase LC. Given a list of LC-MS features and some characteristics of the LC setup, this software computes the corresponding retention times for the internal database and then annotates the features using the exact masses with predicted retention times at the working conditions. DynaStI (https://dynasti.vital-it.ch) is able to automatically calibrate its predictions to compensate for deviations in the input parameters. The database also includes identification and structural information for each annotation, such as IUPAC name, CAS number, SMILES string, metabolic pathways, and links to external metabolomic or lipidomic databases.
DOI: http://dx.doi.org/10.3390/metabo9050085
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6572260
April 2019
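
The linear solvent strength (LSS) model named in the DynaStI abstract above can be illustrated for the simplest case. This is a minimal sketch, not DynaStI's implementation: it assumes isocratic elution (a constant organic fraction), whereas the tool presumably handles gradients and calibration. The function and parameter names are hypothetical. The model relates the retention factor k to the organic fraction phi as log10(k) = log10(k_w) - S * phi, and the retention time follows as t_R = t_0 * (1 + k).

```python
def lss_retention_time(t0, log_kw, s, phi):
    """Isocratic retention time under the LSS model.

    t0     -- column dead time (same unit as the returned time)
    log_kw -- log10 of the retention factor extrapolated to pure water
    s      -- solvent strength parameter of the compound
    phi    -- volume fraction of organic modifier, between 0 and 1
    """
    if not 0.0 <= phi <= 1.0:
        raise ValueError("phi must be a fraction between 0 and 1")
    # LSS model: log10(k) = log10(k_w) - S * phi
    k = 10 ** (log_kw - s * phi)
    # Retention time from the retention factor: t_R = t0 * (1 + k)
    return t0 * (1 + k)


# Illustrative values only (not taken from the DynaStI database).
print(lss_retention_time(t0=1.0, log_kw=3.0, s=4.0, phi=0.5))
```

Because log_kw and s are intrinsic to a compound and stationary phase, the same stored pair can be reused to predict retention at any phi, which is the idea behind generating retention times dynamically instead of re-measuring a library for each configuration.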

Laser capture microdissection of human pancreatic islets reveals novel eQTLs associated with type 2 diabetes.

Mol Metab 2019 06 18;24:98-107. Epub 2019 Mar 18.

Imperial College London, Department of Genomics of Common Disease, London, UK; University of Lille, CNRS, Institute Pasteur de Lille, UMR 8199 - EGID, F-59000, Lille, France.

Objective: Genome-wide association studies (GWAS) for type 2 diabetes (T2D) have identified genetic loci that often localise in non-coding regions of the genome, suggesting gene regulation effects. We combined genetic and transcriptomic analysis from human islets obtained from brain-dead organ donors or surgical patients to detect expression quantitative trait loci (eQTLs) and shed light on the regulatory mechanisms of these genes.

Methods: Pancreatic islets were isolated either by laser capture microdissection (LCM) from surgical specimens of 103 metabolically phenotyped pancreatectomized patients (PPP) or by collagenase digestion of pancreas from 100 brain-dead organ donors (OD). Genotyping (> 8.7 million single nucleotide polymorphisms) and expression (> 47,000 transcripts and splice variants) analyses were combined to generate cis-eQTLs.

Results: After applying genome-wide false discovery rate significance thresholds, we identified 1,173 and 1,021 eQTLs in samples of OD and PPP, respectively. Among the strongest eQTLs shared between OD and PPP were CHURC1 (OD p-value = 1.71 × 10; PPP p-value = 3.64 × 10) and PSPH (OD p-value = 3.92 × 10; PPP p-value = 3.64 × 10). We identified eQTLs in linkage disequilibrium with GWAS loci for T2D and associated traits, including TTLL6, MLX and KIF9 loci, which do not implicate the nearest gene. In the PPP datasets we found 11 eQTL genes that were differentially expressed in T2D and two genes (CYP4V2 and TSEN2) associated with HbA1c, but none in the OD samples.

Conclusions: eQTL analysis of LCM islets from PPP led us to identify novel genes which had not been previously linked to islet biology and T2D. The understanding gained from eQTL approaches, especially using surgical samples of living patients, provides a more accurate 3-dimensional representation than those from genetic studies alone.
DOI: http://dx.doi.org/10.1016/j.molmet.2019.03.004
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6531807
June 2019

Control of Cognate Sense mRNA Translation by cis-Natural Antisense RNAs.

Plant Physiol 2019 05 13;180(1):305-322. Epub 2019 Feb 13.

Department of Plant Molecular Biology, University of Lausanne, Biophore Building, CH-1015 Lausanne, Switzerland

Cis-Natural Antisense Transcripts (cis-NATs), which overlap protein coding genes and are transcribed from the opposite DNA strand, constitute an important group of noncoding RNAs. Whereas several examples of cis-NATs regulating the expression of their cognate sense gene are known, most cis-NATs function by altering the steady-state level or structure of mRNA via changes in transcription, mRNA stability, or splicing, and very few cases involve the regulation of sense mRNA translation. This study was designed to systematically search for cis-NATs influencing cognate sense mRNA translation in Arabidopsis (Arabidopsis thaliana). Establishment of a pipeline relying on sequencing of total polyA and polysomal RNA from Arabidopsis grown under various conditions (i.e. nutrient deprivation and phytohormone treatments) allowed the identification of 14 cis-NATs whose expression correlated either positively or negatively with cognate sense mRNA translation. Using a combination of cis-NAT stable over-expression in transgenic plants and transient expression in protoplasts, the impact of cis-NAT expression on mRNA translation was confirmed for 4 out of 5 tested cis-NAT:sense mRNA pairs. These results expand the number of cis-NATs known to regulate cognate sense mRNA translation and provide a foundation for future studies of their mode of action. Moreover, this study highlights the role of this class of noncoding RNAs in translation regulation.
DOI: http://dx.doi.org/10.1104/pp.19.00043
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6501089
May 2019

Improving the quality and workflow of bacterial genome sequencing and analysis: paving the way for a Switzerland-wide molecular epidemiological surveillance platform.

Swiss Med Wkly 2018 12 15;148:w14693. Epub 2018 Dec 15.

SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland.

Facing multidrug resistant (MDR) bacterial pathogens is one of the most important challenges for our society. The spread of highly virulent and resistant pathogens can be described using molecular typing technologies; in particular, whole genome sequencing (WGS) data can be used for molecular typing purposes with high resolution. WGS data analysis can explain the spatiotemporal patterns of pathogen transmission. However, the transmission between compartments (human, animal, food, environment) is very complex. Interoperable and curated metadata are a key requirement for fully understanding this complexity. In addition, high quality sequence data are a key element between centres using WGS data for diagnostic and epidemiological applications. We aim to describe steps to improve WGS data analysis and to implement a molecular surveillance platform allowing integration of high resolution WGS typing data and epidemiological data.
DOI: http://dx.doi.org/10.4414/smw.2018.14693
December 2018

Genome-wide identification of microRNAs regulating the human prion protein.

Brain Pathol 2019 03 21;29(2):232-244. Epub 2018 Dec 21.

Institute of Neuropathology, University of Zürich, Zürich, Switzerland.

The cellular prion protein (PrPC) is best known for its misfolded disease-causing conformer, PrPSc. Because the availability of PrPC is often limiting for prion propagation, understanding its regulation may point to possible therapeutic targets. We sought to determine to what extent the human microRNAome is involved in modulating PrPC levels through direct or indirect pathways. We probed PrPC protein levels in cells subjected to a genome-wide library encompassing 2019 miRNA mimics using a robust time-resolved fluorescence-resonance screening assay. Screening was performed in three human neuroectodermal cell lines: U-251 MG, CHP-212 and SH-SY5Y. The three screens yielded 17 overlapping high-confidence miRNA mimic hits, 13 of which were found to regulate PrPC biosynthesis directly via binding to the PRNP 3'UTR, thereby inducing transcript degradation. The four remaining hits (miR-124-3p, 192-3p, 299-5p and 376b-3p) did not bind either the 3'UTR or CDS of PRNP, and were therefore deemed indirect regulators of PrPC. Our results show that multiple miRNAs regulate PrPC levels both directly and indirectly. These findings may have profound implications for prion disease pathogenesis and potentially also for their therapy. Furthermore, the possible role of PrPC as a mediator of Aβ toxicity suggests that its regulation by miRNAs may also impinge on Alzheimer's disease.
DOI: http://dx.doi.org/10.1111/bpa.12679
March 2019

Updates in Rhea: SPARQLing biochemical reaction data.

Nucleic Acids Res 2019 01;47(D1):D596-D600

Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, CMU, 1 rue Michel-Servet, CH-1211 Geneva 4, Switzerland.

Rhea (http://www.rhea-db.org) is a comprehensive and non-redundant resource of over 11 000 expert-curated biochemical reactions that uses chemical entities from the ChEBI ontology to represent reaction participants. Originally designed as an annotation vocabulary for the UniProt Knowledgebase (UniProtKB), Rhea also provides reaction data for a range of other core knowledgebases and data repositories including ChEBI and MetaboLights. Here we describe recent developments in Rhea, focusing on a new Resource Description Framework (RDF) representation of Rhea reaction data and a SPARQL endpoint (https://sparql.rhea-db.org/sparql) that provides access to it. We demonstrate how federated queries that combine the Rhea SPARQL endpoint and other SPARQL endpoints such as that of UniProt can provide improved metabolite annotation and support integrative analyses that link the metabolome through the proteome to the transcriptome and genome. These developments will significantly boost the utility of Rhea as a means to link chemistry and biology for a more holistic understanding of biological systems and their function in health and disease.
DOI: http://dx.doi.org/10.1093/nar/gky876
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6324061
January 2019
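
To make the SPARQL access described in the Rhea abstract above more concrete, here is a minimal sketch of how a client might build and send a query to the endpoint URL given there. The endpoint address comes from the abstract; everything else is an assumption to be checked against the Rhea documentation, in particular the `rh:` namespace and the `rh:Reaction` and `rh:equation` terms. The function names are hypothetical. Only the standard SPARQL protocol (`query` parameter, JSON results requested via the `Accept` header) is relied on.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Endpoint URL as stated in the abstract.
RHEA_SPARQL_ENDPOINT = "https://sparql.rhea-db.org/sparql"


def build_rhea_query(limit=5):
    """Build a query listing reactions with their textual equations.

    The rh: prefix and the rh:Reaction / rh:equation terms are assumptions
    about the Rhea RDF vocabulary, not taken from the abstract.
    """
    return (
        "PREFIX rh: <http://rdf.rhea-db.org/>\n"
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?reaction ?equation WHERE {\n"
        "  ?reaction rdfs:subClassOf rh:Reaction ;\n"
        "            rh:equation ?equation .\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


def build_request_url(query, endpoint=RHEA_SPARQL_ENDPOINT):
    """Encode the query as a GET request URL (SPARQL protocol 'query' parameter)."""
    return endpoint + "?" + urlencode({"query": query})


def run_query(query, endpoint=RHEA_SPARQL_ENDPOINT):
    """Send the query and return the parsed JSON results (needs network access)."""
    request = Request(
        build_request_url(query, endpoint),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urlopen(request, timeout=30) as response:
        return json.load(response)


if __name__ == "__main__":
    # Print the query and URL only; call run_query(...) to contact the endpoint.
    print(build_rhea_query(limit=3))
    print(build_request_url(build_rhea_query(limit=3)))
```

A federated query of the kind the abstract mentions would add a `SERVICE <...>` clause pointing at a second endpoint inside the `WHERE` block; that part is left out here because the exact graph patterns depend on the other resource's schema.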

Estimating the Contribution of Proteasomal Spliced Peptides to the HLA-I Ligandome.

Mol Cell Proteomics 2018 12 31;17(12):2347-2357. Epub 2018 Aug 31.

Vital-IT, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland.

Spliced peptides are short protein fragments spliced together in the proteasome by peptide bond formation. True estimation of the contribution of proteasome-spliced peptides (PSPs) to the global human leukocyte antigen (HLA) ligandome is critical. A recent study suggested that PSPs contribute up to 30% of the HLA ligandome. We performed a thorough reanalysis of the reported results using multiple computational tools and various validation steps and concluded that only a fraction of the proposed PSPs passes the quality filters. To better estimate the actual number of PSPs, we present an alternative workflow. We performed sequencing of the HLA-peptide spectra and discarded all sequences found in the UniProt database. We checked whether the remaining sequences could match spliced peptides from human proteins. The spliced sequences were appended to the UniProt fasta file, which was searched by two search tools at a false discovery rate (FDR) of 1%. We find that 2-6% of the HLA ligandome could be explained as spliced protein fragments. The majority of these potential PSPs have good peptide-spectrum match properties and are predicted to bind the respective HLA molecules. However, it remains to be shown how many of these potential PSPs actually originate from proteasomal splicing events.
DOI: http://dx.doi.org/10.1074/mcp.RA118.000877
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6283289
December 2018
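
The step in the abstract above that checks whether a sequence "could match spliced peptides from human proteins" can be sketched as follows. This is an illustration under stated assumptions, not the authors' workflow: it considers only cis-splicing (both fragments from the same protein, in either order), puts no limit on the distance between fragments, and uses hypothetical function names and a made-up toy sequence.

```python
def find_all(text, fragment):
    """Return every start index at which fragment occurs in text."""
    starts = []
    i = text.find(fragment)
    while i != -1:
        starts.append(i)
        i = text.find(fragment, i + 1)
    return starts


def cis_splice_explanations(peptide, protein, min_fragment=1):
    """List (left, right, left_start, right_start) tuples that explain the
    peptide as two non-overlapping, non-contiguous fragments of one protein."""
    explanations = []
    for cut in range(min_fragment, len(peptide) - min_fragment + 1):
        left, right = peptide[:cut], peptide[cut:]
        for a in find_all(protein, left):
            for b in find_all(protein, right):
                left_end = a + len(left)
                right_end = b + len(right)
                overlaps = a < right_end and b < left_end
                contiguous = b == left_end  # plain, unspliced stretch
                if not overlaps and not contiguous:
                    explanations.append((left, right, a, b))
    return explanations


def is_candidate_psp(peptide, protein):
    """True if the peptide is absent from the protein as a contiguous
    sequence but can be assembled from two of its fragments."""
    if peptide in protein:
        # Mirrors the abstract's step of discarding known sequences first.
        return False
    return bool(cis_splice_explanations(peptide, protein))


# Toy sequence invented for illustration.
toy_protein = "MKTAYIAKQRQISFVKSHFSRQ"
print(cis_splice_explanations("KTAQIS", toy_protein))
```

Short fragments match by chance very often, which is one reason the abstract stresses that a matching sequence is only a potential PSP; raising `min_fragment` is a simple way to see how sharply the number of explanations drops.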

Scaling up data curation using deep learning: An application to literature triage in genomic variation resources.

PLoS Comput Biol 2018 08 13;14(8):e1006390. Epub 2018 Aug 13.

National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, Maryland, United States of America.

Manually curating biomedical knowledge from publications is necessary to build a knowledge-based service that provides highly precise and organized information to users. The process of retrieving relevant publications for curation, which is also known as document triage, is usually carried out by querying and reading articles in PubMed. However, this query-based method often obtains unsatisfactory precision and recall on the retrieved results, and it is difficult to manually generate optimal queries. To address this, we propose a machine-learning assisted triage method. We collect previously curated publications from two databases, UniProtKB/Swiss-Prot and the NHGRI-EBI GWAS Catalog, and use them as a gold-standard dataset for training deep learning models based on convolutional neural networks. We then use the trained models to classify and rank new publications for curation. For evaluation, we apply our method to the real-world manual curation process of UniProtKB/Swiss-Prot and the GWAS Catalog. We demonstrate that our machine-assisted triage method outperforms the current query-based triage methods, improves efficiency, and enriches curated content. Our method achieves a precision 1.81 and 2.99 times higher than that obtained by the current query-based triage methods of UniProtKB/Swiss-Prot and the GWAS Catalog, respectively, without compromising recall. In fact, our method retrieves many additional relevant publications that the query-based method of UniProtKB/Swiss-Prot could not find. As these results show, our machine learning-based method can make the triage process more efficient and is being implemented in production so that human curators can focus on more challenging tasks to improve the quality of knowledge bases.
DOI: http://dx.doi.org/10.1371/journal.pcbi.1006390
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6107285
August 2018
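
The precision and recall figures quoted in the triage abstract above compare sets of retrieved publications against a curated gold standard. The sketch below shows how such a comparison can be computed; the document identifiers are invented for illustration and the function name is hypothetical, so the numbers bear no relation to the reported 1.81 and 2.99 ratios.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) of a retrieved set against a gold standard."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Invented toy data: two relevant papers among ten candidates.
gold_standard = {"doc1", "doc2"}
query_based = {f"doc{i}" for i in range(1, 11)}          # retrieves all ten
model_based = {"doc1", "doc2", "doc3", "doc4", "doc5"}   # retrieves five

p_query, r_query = precision_recall(query_based, gold_standard)
p_model, r_model = precision_recall(model_based, gold_standard)
print(p_query, r_query, p_model, r_model, p_model / p_query)
```

In this toy case both methods find every relevant paper, so recall is unchanged, while the smaller retrieved set doubles precision; that is the sense in which a method can raise precision "without compromising recall".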

A systems genetics resource and analysis of sleep regulation in the mouse.

PLoS Biol 2018 08 9;16(8):e2005750. Epub 2018 Aug 9.

Center for Integrative Genomics, University of Lausanne, Switzerland.

Sleep is essential for optimal brain functioning and health, but the biological substrates through which sleep delivers these beneficial effects remain largely unknown. We used a systems genetics approach in the BXD genetic reference population (GRP) of mice and assembled a comprehensive experimental knowledge base comprising a deep "sleep-wake" phenome, central and peripheral transcriptomes, and plasma metabolome data, collected under undisturbed baseline conditions and after sleep deprivation (SD). We present analytical tools to interactively interrogate the database, visualize the molecular networks altered by sleep loss, and prioritize candidate genes. We found that a one-time, short disruption of sleep already extensively reshaped the systems genetics landscape by altering 60%-78% of the transcriptomes and the metabolome, with numerous genetic loci affecting the magnitude and direction of change. Systems genetics integrative analyses drawing on all levels of organization imply α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid (AMPA) receptor trafficking and fatty acid turnover as substrates of the negative effects of insufficient sleep. Our analyses demonstrate that genetic heterogeneity and the effects of insufficient sleep itself on the transcriptome and metabolome are far more widespread than previously reported.
DOI: http://dx.doi.org/10.1371/journal.pbio.2005750
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6085075
August 2018