Publications by authors named "Johannes Eichner"

24 Publications

Systematic discovery of uncharacterized transcription factors in Escherichia coli K-12 MG1655.

Nucleic Acids Res 2018 Nov;46(20):10682-10696

Department of Bioengineering, University of California San Diego, La Jolla, CA 92093, USA.

Transcriptional regulation enables cells to respond to environmental changes. Of the estimated 304 candidate transcription factors (TFs) in Escherichia coli K-12 MG1655, 185 have been experimentally identified, but ChIP methods have been used to fully characterize only a few dozen. Identifying these remaining TFs is key to improving our knowledge of the E. coli transcriptional regulatory network (TRN). Here, we developed an integrated workflow for the computational prediction and comprehensive experimental validation of TFs using a suite of genome-wide experiments. We applied this workflow to (i) identify 16 candidate TFs from over a hundred uncharacterized genes; (ii) capture a total of 255 DNA binding peaks for ten candidate TFs resulting in six high-confidence binding motifs; (iii) reconstruct the regulons of these ten TFs by determining gene expression changes upon deletion of each TF and (iv) identify the regulatory roles of three TFs (YiaJ, YdcI, and YeiE) as regulators of L-ascorbate utilization, proton transfer and acetate metabolism, and iron homeostasis under iron-limited conditions, respectively. Together, these results demonstrate how this workflow can be used to discover, characterize, and elucidate regulatory functions of uncharacterized TFs in parallel.

Source
DOI: http://dx.doi.org/10.1093/nar/gky752
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6237786
November 2018

Network analysis of coronary artery disease risk genes elucidates disease mechanisms and druggable targets.

Sci Rep 2018 Feb 21;8(1):3434. Epub 2018 Feb 21.

Clinical Gene Networks AB, Stockholm, Sweden.

Genome-wide association studies (GWAS) have identified over two hundred chromosomal loci that modulate risk of coronary artery disease (CAD). The genes affected by variants at these loci are largely unknown and an untapped resource to improve our understanding of CAD pathophysiology and identify potential therapeutic targets. Here, we prioritized 68 genes as the most likely causal genes at genome-wide significant loci identified by GWAS of CAD and examined their regulatory roles in 286 metabolic and vascular tissue gene-protein sub-networks ("modules"). The modules and genes within were scored for CAD druggability potential. The scoring enriched for targets of cardiometabolic drugs currently in clinical use and in-depth analysis of the top-scoring modules validated established and revealed novel target tissues, biological processes, and druggable targets. This study provides an unprecedented resource of tissue-defined gene-protein interactions directly affected by genetic variance in CAD risk loci.

Source
DOI: http://dx.doi.org/10.1038/s41598-018-20721-6
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5821758
February 2018

Mesenchyme-derived factors enhance preneoplastic growth by non-genotoxic carcinogens in rat liver.

Arch Toxicol 2018 Feb 21;92(2):953-966. Epub 2017 Dec 21.

Department of Medicine I, Comprehensive Cancer Center, Institute of Cancer Research, Medical University of Vienna, Borschkegasse 8a, 1090, Vienna, Austria.

Many frequently prescribed drugs are non-genotoxic carcinogens (NGC) in rodent liver. Their mode of action and health risks for humans remain to be elucidated. Here, we investigated the impact of two model NGC, the anti-epileptic drug phenobarbital (PB) and the contraceptive cyproterone acetate (CPA), on intrahepatic epithelial-mesenchymal crosstalk and on growth of first stages of hepatocarcinogenesis. Unaltered hepatocytes (HC) and preneoplastic HC were isolated from rat liver for primary culture. DNA replication of both unaltered and preneoplastic HC was increased by in vitro treatment with 10 µM CPA, but not 1 mM PB. Next, mesenchymal cells (MC) obtained from liver of rats treated with either PB (50 mg/kg bw/day) or CPA (100 mg/kg bw/day) were cultured. Supernatants from both types of MC raised DNA synthesis of both HC populations. This indicates that PB induces replication of unaltered and preneoplastic HC only indirectly, via growth factors secreted by MC. CPA, however, acts on both HC populations directly as well as indirectly via mesenchymal factors. Transcriptomics and bio-informatics revealed that PB and CPA induce extensive changes in the expression profile of MC affecting many growth factors and pathways. MC from PB-treated rats produced and secreted enhanced levels of HBEGF and GDF15, factors found to suppress apoptosis and/or induce DNA synthesis in cultured unaltered and preneoplastic HC. MC from CPA-treated animals showed enhanced expression and secretion of HGF, which strongly raised DNA replication of both HC populations. In conclusion, our findings reveal profound effects of two prototypical NGC on the hepatic mesenchyme. The resulting release of factors, which suppress apoptosis and/or enhance cell replication preferentially in cancer prestages, appears to be crucial for tumor promotion by NGC in the liver.

Source
DOI: http://dx.doi.org/10.1007/s00204-017-2080-0
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5818586
February 2018

The potential of circulating tumor DNA methylation analysis for the early detection and management of ovarian cancer.

Genome Med 2017 Dec 22;9(1):116. Epub 2017 Dec 22.

Genedata AG, Margarethenstrasse 38, 4053, Basel, Switzerland.

Background: Despite a myriad of attempts in the last three decades to diagnose ovarian cancer (OC) earlier, this clinical aim still remains a significant challenge. Aberrant methylation patterns of linked CpGs analyzed in DNA fragments shed by cancers into the bloodstream (i.e. cell-free DNA) can provide highly specific signals indicating cancer presence.

Methods: We analyzed 699 cancerous and non-cancerous tissues using a methylation array or reduced representation bisulfite sequencing to discover the most specific OC methylation patterns. A three-DNA-methylation-serum-marker panel was developed using targeted ultra-high coverage bisulfite sequencing in 151 women and validated in 250 women with various conditions, particularly in those associated with high CA125 levels (endometriosis and other benign pelvic masses), serial samples from 25 patients undergoing neoadjuvant chemotherapy, and a nested case control study of 172 UKCTOCS control arm participants which included serum samples up to two years before OC diagnosis.

Results: The cell-free DNA amount and average fragment size in the serum samples was up to ten times higher than average published values (based on samples that were immediately processed) due to leakage of DNA from white blood cells owing to delayed time to serum separation. Despite this, the marker panel discriminated high grade serous OC patients from healthy women or patients with a benign pelvic mass with specificity/sensitivity of 90.7% (95% confidence interval [CI] = 84.3-94.8%) and 41.4% (95% CI = 24.1-60.9%), respectively. Levels of all three markers plummeted after exposure to chemotherapy and correctly identified 78% and 86% responders and non-responders (Fisher's exact test, p = 0.04), respectively, which was superior to a CA125 cut-off of 35 IU/mL (20% and 75%). 57.9% (95% CI 34.0-78.9%) of women who developed OC within two years of sample collection were identified with a specificity of 88.1% (95% CI = 77.3-94.3%). Sensitivity and specificity improved further when specifically analyzing CA125 negative samples only (63.6% and 87.5%, respectively).

Conclusions: Our data suggest that DNA methylation patterns in cell-free DNA have the potential to detect a proportion of OCs up to two years in advance of diagnosis and may potentially guide personalized treatment. The prospective use of novel collection vials, which stabilize blood cells and reduce background DNA contamination in serum/plasma samples, will facilitate clinical implementation of liquid biopsy analyses.

Source
DOI: http://dx.doi.org/10.1186/s13073-017-0500-7
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5740748
December 2017

Methylation patterns in serum DNA for early identification of disseminated breast cancer.

Genome Med 2017 Dec 22;9(1):115. Epub 2017 Dec 22.

Genedata AG, Margarethenstrasse 38, 4053, Basel, Switzerland.

Background: Monitoring treatment and early detection of fatal breast cancer (BC) remains a major unmet need. Aberrant circulating DNA methylation (DNAme) patterns are likely to provide a highly specific cancer signal. We hypothesized that cell-free DNAme markers could indicate disseminated breast cancer, even in the presence of substantial quantities of background DNA.

Methods: We used reduced representation bisulfite sequencing (RRBS) of 31 tissues and established serum assays based on ultra-high coverage bisulfite sequencing in two independent prospective serum sets (n = 110). The clinical use of one specific region, EFC#93, was validated in 419 patients (in both pre- and post-adjuvant chemotherapy samples) from SUCCESS (Simultaneous Study of Gemcitabine-Docetaxel Combination adjuvant treatment, as well as Extended Bisphosphonate and Surveillance-Trial) and 925 women (pre-diagnosis) from the UKCTOCS (UK Collaborative Trial of Ovarian Cancer Screening) population cohort, with overall survival and occurrence of incident breast cancer (which will or will not lead to death), respectively, as primary endpoints.

Results: A total of 18 BC specific DNAme patterns were discovered in tissue, of which the top six were further tested in serum. The best candidate, EFC#93, was validated for clinical use. EFC#93 was an independent poor prognostic marker in pre-chemotherapy samples (hazard ratio [HR] for death = 7.689) and superior to circulating tumor cells (CTCs) (HR for death = 5.681). More than 70% of patients with both CTCs and EFC#93 serum DNAme positivity in their pre-chemotherapy samples relapsed within five years. EFC#93-positive disseminated disease in post-chemotherapy samples seems to respond to anti-hormonal treatment. The presence of EFC#93 serum DNAme identified 42.9% and 25% of women who were diagnosed with a fatal BC within 3-6 and 6-12 months of sample donation, respectively, with a specificity of 88%. The sensitivity with respect to detecting fatal BC was ~ 4-fold higher compared to non-fatal BC.

Conclusions: Detection of EFC#93 serum DNAme patterns offers a new tool for early diagnosis and management of disseminated breast cancers. Clinical trials are required to assess whether EFC#93-positive women in the absence of radiological detectable breast cancers will benefit from anti-hormonal treatment before the breast lesions become clinically apparent.

Source
DOI: http://dx.doi.org/10.1186/s13073-017-0499-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5740791
December 2017

ZBIT Bioinformatics Toolbox: A Web-Platform for Systems Biology and Expression Data Analysis.

PLoS One 2016 Feb 16;11(2):e0149263. Epub 2016 Feb 16.

Department of Computer Science, University of Tübingen, Tübingen, Germany.

Bioinformatics analysis has become an integral part of research in biology. However, installation and use of scientific software can be difficult and often requires technical expert knowledge. Reasons are dependencies on certain operating systems or required third-party libraries, missing graphical user interfaces and documentation, or nonstandard input and output formats. In order to make bioinformatics software easily accessible to researchers, we here present a web-based platform. The Center for Bioinformatics Tuebingen (ZBIT) Bioinformatics Toolbox provides web-based access to a collection of bioinformatics tools developed for systems biology, protein sequence annotation, and expression data analysis. Currently, the collection encompasses software for conversion and processing of community standards SBML and BioPAX, transcription factor analysis, and analysis of microarray data from transcriptomics and proteomics studies. All tools are hosted on a customized Galaxy instance and run on a dedicated computation cluster. Users only need a web browser and an active internet connection in order to benefit from this service. The web platform is designed to facilitate the usage of the bioinformatics tools for researchers without advanced technical background. Users can combine tools for complex analyses or use predefined, customizable workflows. All results are stored persistently and are reproducible. For each tool, we provide documentation, tutorials, and example data to maximize usability. The ZBIT Bioinformatics Toolbox is freely available at https://webservices.cs.uni-tuebingen.de/.

Source
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0149263
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4801062
July 2016

SBMLsqueezer 2: context-sensitive creation of kinetic equations in biochemical networks.

BMC Syst Biol 2015 Oct 9;9:68. Epub 2015 Oct 9.

Center for Bioinformatics Tuebingen (ZBIT), University of Tuebingen, Sand 1, Tübingen, 72076, Germany.

Background: The size and complexity of published biochemical network reconstructions are steadily increasing, expanding the potential scale of derived computational models. However, the construction of large biochemical network models is a laborious and error-prone task. Automated methods have simplified the network reconstruction process, but building kinetic models for these systems is still a manually intensive task. Appropriate kinetic equations, based upon reaction rate laws, must be constructed and parameterized for each reaction. The complex test-and-evaluation cycles that can be involved during kinetic model construction would thus benefit from automated methods for rate law assignment.

Results: We present a high-throughput algorithm to automatically suggest and create suitable rate laws based upon reaction type according to several criteria. The criteria for choices made by the algorithm can be influenced in order to assign the desired type of rate law to each reaction. This algorithm is implemented in the software package SBMLsqueezer 2. In addition, this program contains an integrated connection to the kinetics database SABIO-RK to obtain experimentally-derived rate laws when desired.

Conclusions: The described approach fills a heretofore absent niche in workflows for large-scale biochemical kinetic model construction. In several applications the algorithm has already been demonstrated to be useful and scalable. SBMLsqueezer is platform independent and can be used as a stand-alone package, as an integrated plugin, or through a web interface, enabling flexible solutions and use-case scenarios.

Source
DOI: http://dx.doi.org/10.1186/s12918-015-0212-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4600286
October 2015

Proinflammatory mesenchymal effects of the non-genotoxic hepatocarcinogen phenobarbital: a novel mechanism of antiapoptosis and tumor promotion.

Carcinogenesis 2015 Dec 16;36(12):1521-30. Epub 2015 Sep 16.

Center of Bioinformatics Tübingen (ZBIT), University of Tübingen, 72070 Tübingen, Germany.

Many environmental pollutants and drugs, including steroid hormones, hypolipidemics and antiepileptics, are non-genotoxic carcinogens (NGC) in rodent liver. The mechanism of action and the risk for human health are still insufficiently known. Here, we study the effects of phenobarbital (PB), a widely used model NGC, on hepatic epithelial-mesenchymal crosstalk and the impact on hepatic apoptosis. Mesenchymal cells (MC) and hepatocytes (HC) were isolated from control and PB-treated rat livers. PB induced extensive changes in gene expression in MC and much less in HC as shown by transcriptomics with oligoarrays. In MC only, transcript levels of numerous proinflammatory cytokines were elevated. Correspondingly, ELISA on the supernatant of MC from PB-treated rats revealed enhanced release of various cytokines. In cultured HC, this supernatant caused (i) nuclear translocation and activation of nuclear factor-κB (shown by immunoblots of nuclear extracts and reporter gene assays), (ii) elevated expression of proinflammatory genes and (iii) protection from the proapoptotic action of transforming growth factor beta 1 (TGFß1). PB treatment in vivo or in vitro elevated the production and release of tumor necrosis factor alpha from MC, which was identified as mainly responsible for the inhibition of apoptosis in HC. In conclusion, our findings reveal profound proinflammatory effects of PB on hepatic mesenchyme and mesenchymal-epithelial interactions. The resulting release of cytokines acts antiapoptotic in HC, an effect crucial for tumor promotion and carcinogenesis by NGC.

Source
DOI: http://dx.doi.org/10.1093/carcin/bgv135
December 2015

JSBML 1.0: providing a smorgasbord of options to encode systems biology models.

Bioinformatics 2015 Oct 16;31(20):3383-6. Epub 2015 Jun 16.

University of California, San Diego, La Jolla, CA, USA, Center for Bioinformatics Tuebingen (ZBIT), University of Tuebingen, Tübingen, Germany.

JSBML, the official pure Java programming library for the Systems Biology Markup Language (SBML) format, has evolved with the advent of different modeling formalisms in systems biology and their ability to be exchanged and represented via extensions of SBML. JSBML has matured into a major, active open-source project with contributions from a growing, international team of developers who not only maintain compatibility with SBML, but also drive steady improvements to the Java interface and promote ease-of-use with end users.

Availability And Implementation: Source code, binaries and documentation for JSBML can be freely obtained under the terms of the LGPL 2.1 from the website http://sbml.org/Software/JSBML. More information about JSBML can be found in the user guide at http://sbml.org/Software/JSBML/docs/.

Contact: jsbml-development@googlegroups.com or andraeger@eng.ucsd.edu

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/btv341
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4595895
October 2015

ToxDBScan: Large-scale similarity screening of toxicological databases for drug candidates.

Int J Mol Sci 2014 Oct 21;15(10):19037-55. Epub 2014 Oct 21.

Center of Bioinformatics Tuebingen (ZBIT), University of Tuebingen, Tübingen 72074, Germany.

We present a new tool for hepatocarcinogenicity evaluation of drug candidates in rodents. ToxDBScan is a web tool offering quick and easy similarity screening of new drug candidates against two large-scale public databases, which contain expression profiles for substances with known carcinogenic profiles: TG-GATEs and DrugMatrix. ToxDBScan uses a set similarity score that computes the putative similarity based on similar expression of genes to identify chemicals with similar genotoxic and hepatocarcinogenic potential. We propose using a discretized representation of expression profiles, which uses only information on up- or down-regulation of genes as relevant features. Therefore, only the deregulated genes are required as input. ToxDBScan provides an extensive report on similar compounds, which includes additional information on compounds, differential genes and pathway enrichments. We evaluated ToxDBScan with expression data from 15 chemicals with known hepatocarcinogenic potential and observed a sensitivity of 88%. Based on the identified chemicals, we achieved perfect classification of the independent test set. ToxDBScan is publicly available from the ZBIT Bioinformatics Toolbox.
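
The idea of scoring discretized profiles by set similarity can be illustrated with a minimal sketch. It assumes a Jaccard index over (gene, direction) pairs; the abstract does not state which set similarity score ToxDBScan uses, and the function names, threshold, compound names and expression values below are hypothetical.

```python
# Hypothetical sketch: Jaccard similarity over discretized expression profiles.
# The score used by ToxDBScan itself may differ; all names and data are invented.

def discretize(log_ratios, threshold=1.0):
    """Keep only deregulated genes as (gene, direction) pairs."""
    profile = set()
    for gene, value in log_ratios.items():
        if value >= threshold:
            profile.add((gene, "up"))
        elif value <= -threshold:
            profile.add((gene, "down"))
    return profile


def jaccard(a, b):
    """Shared (gene, direction) pairs divided by all pairs seen in either set."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_references(query, references):
    """Sort reference compounds by decreasing similarity to the query profile."""
    scores = {name: jaccard(query, profile) for name, profile in references.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy query: three deregulated genes and one unchanged gene that is dropped.
query = discretize({"Cdkn1a": 2.1, "Ccng1": 1.4, "Abcb4": -1.2, "Actb": 0.1})
references = {
    "compound_A": {("Cdkn1a", "up"), ("Ccng1", "up"), ("Abcb4", "down")},
    "compound_B": {("Cdkn1a", "down"), ("Akr7a3", "up")},
}
ranking = rank_references(query, references)
```

Because direction is part of each set element, a gene regulated in opposite directions in two profiles does not count as shared, which is why `compound_B` scores zero against this query.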

Source
DOI: http://dx.doi.org/10.3390/ijms151019037
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4227259
October 2014

RPPApipe: a pipeline for the analysis of reverse-phase protein array data.

Biosystems 2014 Aug 18;122:19-24. Epub 2014 Jun 18.

Center for Bioinformatics Tuebingen (ZBIT), University of Tuebingen, Tübingen 72076, Germany.

Background And Scope: Today, web-based data analysis pipelines exist for a wide variety of microarray platforms, such as ordinary gene-centered arrays, exon arrays and SNP arrays. However, most of the available software tools provide only limited support for reverse-phase protein arrays (RPPA), as relevant inherent properties of the corresponding datasets are not taken into account. Thus, we developed the web-based data analysis pipeline RPPApipe, which was specifically tailored to suit the characteristics of the RPPA platform and encompasses various tools for data preprocessing, statistical analysis, clustering and pathway analysis.

Implementation And Performance: All tools which are part of the RPPApipe software were implemented using R/Bioconductor. The software was embedded into our web-based ZBIT Bioinformatics Toolbox which is a customized instance of the Galaxy platform.

Availability: RPPApipe is freely available under GNU Public License from http://webservices.cs.uni-tuebingen.de. A full documentation of the tool can be found on the corresponding website http://www.cogsys.cs.uni-tuebingen.de/software/RPPApipe.

Source
DOI: http://dx.doi.org/10.1016/j.biosystems.2014.06.009
August 2014

Cross-platform toxicogenomics for the prediction of non-genotoxic hepatocarcinogenesis in rat.

PLoS One 2014 May 15;9(5):e97640. Epub 2014 May 15.

Center of Bioinformatics Tuebingen (ZBIT), University of Tuebingen, Tübingen, Germany.

In the area of omics profiling in toxicology, i.e. toxicogenomics, characteristic molecular profiles have previously been incorporated into prediction models for early assessment of a carcinogenic potential and mechanism-based classification of compounds. Traditionally, the biomarker signatures used for model construction were derived from individual high-throughput techniques, such as microarrays designed for monitoring global mRNA expression. In this study, we built predictive models by integrating omics data across complementary microarray platforms and introduced new concepts for modeling of pathway alterations and molecular interactions between multiple biological layers. We trained and evaluated diverse machine learning-based models, differing in the incorporated features and learning algorithms on a cross-omics dataset encompassing mRNA, miRNA, and protein expression profiles obtained from rat liver samples treated with a heterogeneous set of substances. Most of these compounds could be unambiguously classified as genotoxic carcinogens, non-genotoxic carcinogens, or non-hepatocarcinogens based on evidence from published studies. Since mixed characteristics were reported for the compounds Cyproterone acetate, Thioacetamide, and Wy-14643, we reclassified these compounds as either genotoxic or non-genotoxic carcinogens based on their molecular profiles. Evaluating our toxicogenomics models in a repeated external cross-validation procedure, we demonstrated that the prediction accuracy of our models could be increased by joining the biomarker signatures across multiple biological layers and by adding complex features derived from cross-platform integration of the omics data. Furthermore, we found that adding these features resulted in a better separation of the compound classes and a more confident reclassification of the three undefined compounds as non-genotoxic carcinogens.

Source
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0097640
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4022579
January 2015

Evaluation of toxicogenomics approaches for assessing the risk of nongenotoxic carcinogenicity in rat liver.

PLoS One 2014 May 14;9(5):e97678. Epub 2014 May 14.

Center for Bioinformatics Tuebingen (ZBIT), University of Tuebingen, Tübingen, Germany.

The current gold-standard method for cancer safety assessment of drugs is a rodent two-year bioassay, which is associated with significant costs and requires testing a high number of animals over lifetime. Due to the absence of a comprehensive set of short-term assays predicting carcinogenicity, new approaches are currently being evaluated. One promising approach is toxicogenomics, which by virtue of genome-wide molecular profiling after compound treatment can lead to an increased mechanistic understanding, and potentially allow for the prediction of a carcinogenic potential via mathematical modeling. The latter typically involves the extraction of informative genes from omics datasets, which can be used to construct generalizable models allowing for the early classification of compounds with unknown carcinogenic potential. Here we formally describe and compare two novel methodologies for the reproducible extraction of characteristic mRNA signatures, which were employed to capture specific gene expression changes observed for nongenotoxic carcinogens. While the first method integrates multiple gene rankings, generated by diverse algorithms applied to data from different subsamplings of the training compounds, the second approach employs a statistical ratio for the identification of informative genes. Both methods were evaluated on a dataset obtained from the toxicogenomics database TG-GATEs to predict the outcome of a two-year bioassay based on profiles from 14-day treatments. Additionally, we applied our methods to datasets from previous studies and showed that the derived prediction models are on average more accurate than those built from the original signatures. The selected genes were mostly related to p53 signaling and to specific changes in anabolic processes or energy metabolism, which are typically observed in tumor cells. Among the genes most frequently incorporated into prediction models were Phlda3, Cdkn1a, Akr7a3, Ccng1 and Abcb4.
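
The first of the two methods, integrating gene rankings from several algorithms over different subsamplings, can be sketched roughly as follows. This is a toy illustration under stated assumptions: the two scoring functions, the mean-rank aggregation, the class-balanced subsampling and all data are hypothetical stand-ins, not the procedure from the paper.

```python
import random
from statistics import mean


def mean_gap(pos, neg):
    """Absolute difference of class means."""
    return abs(mean(pos) - mean(neg))


def range_gap(pos, neg):
    """Distance between the class value ranges; 0.0 when they overlap."""
    return max(0.0, min(pos) - max(neg), min(neg) - max(pos))


def rank_genes(expr, labels, samples, scorer):
    """Order genes best-first for one subsample under one scoring function."""
    scores = {}
    for gene, values in expr.items():
        pos = [values[s] for s in samples if labels[s] == 1]
        neg = [values[s] for s in samples if labels[s] == 0]
        scores[gene] = scorer(pos, neg)
    return sorted(scores, key=lambda g: scores[g], reverse=True)


def ensemble_ranking(expr, labels, scorers, n_rounds=20, keep=3, seed=0):
    """Average each gene's rank position over all subsamples and scorers."""
    rng = random.Random(seed)
    pos_ids = [s for s in labels if labels[s] == 1]
    neg_ids = [s for s in labels if labels[s] == 0]
    ranks = {gene: [] for gene in expr}
    for _ in range(n_rounds):
        # Draw `keep` samples per class so both classes are always present.
        samples = rng.sample(pos_ids, keep) + rng.sample(neg_ids, keep)
        for scorer in scorers:
            ordered = rank_genes(expr, labels, samples, scorer)
            for position, gene in enumerate(ordered):
                ranks[gene].append(position)
    return sorted(ranks, key=lambda g: mean(ranks[g]))


# Invented data: 1 = carcinogen-treated sample, 0 = control sample.
labels = {"p1": 1, "p2": 1, "p3": 1, "p4": 1, "n1": 0, "n2": 0, "n3": 0, "n4": 0}
expr = {
    "geneA": {"p1": 5.0, "p2": 5.2, "p3": 4.9, "p4": 5.1,
              "n1": 1.0, "n2": 1.1, "n3": 0.9, "n4": 1.2},
    "geneB": {"p1": 2.0, "p2": 2.5, "p3": 1.8, "p4": 2.2,
              "n1": 1.5, "n2": 1.0, "n3": 1.6, "n4": 1.2},
    "geneC": {"p1": 1.0, "p2": 2.0, "p3": 1.5, "p4": 1.2,
              "n1": 1.1, "n2": 1.9, "n3": 1.4, "n4": 1.3},
}
ranking = ensemble_ranking(expr, labels, [mean_gap, range_gap])
```

Averaging rank positions rather than raw scores keeps scoring functions with different scales comparable, and repeating the ranking over subsamples favors genes whose signal does not depend on a few samples.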

Source
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0097678
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4020844
June 2015

Integrated enrichment analysis and pathway-centered visualization of metabolomics, proteomics, transcriptomics, and genomics data by using the InCroMAP software.

J Chromatogr B Analyt Technol Biomed Life Sci 2014 Sep 25;966:77-82. Epub 2014 Apr 25.

Institute for Diabetes Research and Metabolic Diseases of the Helmholtz Centre Munich at the University of Tübingen, Tübingen, Germany; Division of Clinical Chemistry and Pathobiochemistry, Department of Internal Medicine IV, University Hospital Tübingen, Tübingen, Germany; German Center for Diabetes Research (DZD), Germany.

In systems biology, the combination of multiple types of omics data, such as metabolomics, proteomics, transcriptomics, and genomics, yields more information on a biological process than the analysis of a single type of data. Thus, data from different omics platforms is usually combined in one experimental setup to obtain insight into a biological process or a disease state. Particularly high accuracy metabolomics data from modern mass spectrometry instruments is currently more and more integrated into biological studies. Reflecting this trend, we extended InCroMAP, a data integration, analysis and visualization tool for genomics, transcriptomics, and proteomics data. Now, the tool is able to perform an integrated enrichment analysis and pathway-based visualization of multi-omics data and thus, it is suitable for the evaluation of comprehensive systems biology studies.

Source
DOI: http://dx.doi.org/10.1016/j.jchromb.2014.04.030
September 2014

Ha-ras and β-catenin oncoproteins orchestrate metabolic programs in mouse liver tumors.

Int J Cancer 2014 Oct 3;135(7):1574-85. Epub 2014 Mar 3.

Institute of Experimental and Clinical Pharmacology and Toxicology Department of Toxicology, Eberhard Karls University of Tübingen, Tübingen, 72074, Germany.

The process of hepatocarcinogenesis in the diethylnitrosamine (DEN) initiation/phenobarbital (PB) promotion mouse model involves the selective clonal outgrowth of cells harboring oncogene mutations in Ctnnb1, while spontaneous or DEN-only-induced tumors are often Ha-ras- or B-raf-mutated. The molecular mechanisms and pathways underlying these different tumor sub-types are not well characterized. Their identification may help identify markers for xenobiotic promoted versus spontaneously occurring liver tumors. Here, we have characterized mouse liver tumors harboring either Ctnnb1 or Ha-ras mutations via integrated molecular profiling at the transcriptional, translational and post-translational levels. In addition, metabolites of the intermediary metabolism were quantified by high resolution ¹H magic angle nuclear magnetic resonance. We have identified tumor genotype-specific differences in mRNA and miRNA expression, protein levels, post-translational modifications, and metabolite levels that facilitate the molecular and biochemical stratification of tumor phenotypes. Bioinformatic integration of these data at the pathway level led to novel insights into tumor genotype-specific aberrant cell signaling and in particular to a better understanding of alterations in pathways of the cell intermediary metabolism, which are driven by the constitutive activation of the β-Catenin and Ha-ras oncoproteins in tumors of the two genotypes.

Source
DOI: http://dx.doi.org/10.1002/ijc.28798
October 2014

TFpredict and SABINE: sequence-based prediction of structural and functional characteristics of transcription factors.

PLoS One 2013 Dec 12;8(12):e82238. Epub 2013 Dec 12.

Center of Bioinformatics Tuebingen (ZBIT), University of Tuebingen, Tübingen, Germany.

Among the key mechanisms of transcriptional control are the specific connections between transcription factors (TF) and cis-regulatory elements in gene promoters. The elucidation of these specific protein-DNA interactions is crucial to gain insights into the complex regulatory mechanisms and networks underlying the adaptation of organisms to dynamically changing environmental conditions. As experimental techniques for determining TF binding sites are expensive and mostly performed for selected TFs only, accurate computational approaches are needed to analyze transcriptional regulation in eukaryotes on a genome-wide level. We implemented a four-step classification workflow which for a given protein sequence (1) discriminates TFs from other proteins, (2) determines the structural superclass of TFs, (3) identifies the DNA-binding domains of TFs and (4) predicts their cis-acting DNA motif. While existing tools were extended and adapted for performing the latter two prediction steps, the first two steps are based on a novel numeric sequence representation which allows for combining existing knowledge from a BLAST scan with robust machine learning-based classification. By evaluation on a set of experimentally confirmed TFs and non-TFs, we demonstrate that our new protein sequence representation facilitates more reliable identification and structural classification of TFs than previously proposed sequence-derived features. The algorithms underlying our proposed methodology are implemented in the two complementary tools TFpredict and SABINE. The online and stand-alone versions of TFpredict and SABINE are freely available to academics at http://www.cogsys.cs.uni-tuebingen.de/software/TFpredict/ and http://www.cogsys.cs.uni-tuebingen.de/software/SABINE/.

Source
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0082238
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3861411
October 2014

A toxicogenomic approach for the prediction of murine hepatocarcinogenesis using ensemble feature selection.

PLoS One 2013 10;8(9):e73938. Epub 2013 Sep 10.

Center for Bioinformatics Tuebingen (ZBIT), University of Tuebingen, Tübingen, Germany.

The current strategy for identifying the carcinogenicity of drugs involves the 2-year bioassay in male and female rats and mice. As this assay is cost-intensive and time-consuming, there is a high interest in developing approaches for the screening and prioritization of drug candidates in preclinical safety evaluations. Predictive models based on toxicogenomics investigations after short-term exposure have shown their potential for assessing the carcinogenic risk. In this study, we investigated a novel method for the evaluation of toxicogenomics data based on ensemble feature selection in conjunction with bootstrapping in order to derive reproducible and characteristic multi-gene signatures. This method was evaluated on a microarray dataset containing global gene expression data from liver samples of both male and female mice. The dataset was generated by the IMI MARCAR consortium and included gene expression profiles of genotoxic and nongenotoxic hepatocarcinogens obtained after treatment of CD-1 mice for 3 or 14 days. We developed predictive models based on gene expression data of both sexes and the models were employed for predicting the carcinogenic class of diverse compounds. Comparing the predictivity of our multi-gene signatures against signatures from the literature, we demonstrated that incorporating our gene sets as features yields, on average, slightly higher accuracy across a representative set of state-of-the-art supervised learning methods. The constructed models were also used for the classification of cyproterone acetate (CPA), Wy-14643 (WY) and thioacetamide (TAA), whose primary mechanism of carcinogenicity is still debated. Based on the extracted mouse liver gene expression patterns, CPA would be predicted as a nongenotoxic compound. In contrast, both WY and TAA would be classified as genotoxic mouse hepatocarcinogens.
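
As an illustration of the general idea of ensemble feature selection combined with bootstrapping, the following minimal sketch may help. It is not the method of the paper: the two rankers (`mean_difference` and `moderated_t_score`), the fudge term `s0`, the voting scheme and the toy data are all assumptions chosen for brevity. Each bootstrap resample is scored by every ranker, each ranker votes for its top `k` features, and the features selected most often form the signature.

```python
import random
import statistics


def split_by_class(samples, labels, feature):
    """Return one feature's values for class 1 and for class 0."""
    pos = [s[feature] for s, y in zip(samples, labels) if y == 1]
    neg = [s[feature] for s, y in zip(samples, labels) if y == 0]
    return pos, neg


def mean_difference(pos, neg):
    """Ranker 1 (assumed): absolute difference of the class means."""
    return abs(statistics.fmean(pos) - statistics.fmean(neg))


def moderated_t_score(pos, neg, s0=0.1):
    """Ranker 2 (assumed): mean difference over its standard error plus s0."""
    se = (statistics.pvariance(pos) / len(pos)
          + statistics.pvariance(neg) / len(neg)) ** 0.5
    return mean_difference(pos, neg) / (se + s0)


def top_k(samples, labels, scorer, k):
    """Indices of the k best-scoring features for one ranker."""
    scored = [(scorer(*split_by_class(samples, labels, j)), j)
              for j in range(len(samples[0]))]
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [j for _, j in scored[:k]]


def ensemble_signature(samples, labels, k=2, n_bootstrap=50, seed=0):
    """Return [(feature index, selection frequency)] for the k most-voted features.

    Labels must be binary (0 or 1).
    """
    if len(set(labels)) < 2:
        raise ValueError("both classes must be present")
    rng = random.Random(seed)
    rankers = (mean_difference, moderated_t_score)
    votes = [0] * len(samples[0])
    n, done = len(samples), 0
    while done < n_bootstrap:
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        boot_y = [labels[i] for i in idx]
        if len(set(boot_y)) < 2:
            continue  # redraw: a resample needs both classes
        boot_x = [samples[i] for i in idx]
        for ranker in rankers:
            for j in top_k(boot_x, boot_y, ranker, k):
                votes[j] += 1
        done += 1
    total = n_bootstrap * len(rankers)
    best = sorted(range(len(votes)), key=lambda j: (-votes[j], j))[:k]
    return [(j, votes[j] / total) for j in best]


# Hypothetical toy data: features 0 and 3 separate the classes, 1 and 2 do not.
samples = [
    [5.0, 1.0, 0.2, 4.1], [5.2, 0.9, 0.1, 3.9],
    [4.8, 1.1, 0.3, 4.0], [5.1, 1.0, 0.2, 4.2],
    [0.1, 1.0, 0.2, 0.1], [0.0, 1.1, 0.1, 0.0],
    [0.2, 0.9, 0.3, 0.2], [-0.1, 1.0, 0.2, -0.1],
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
signature = ensemble_signature(samples, labels, k=2, n_bootstrap=20, seed=1)
```

Reporting the selection frequency alongside each feature gives a rough indication of how stable a signature member is across resamples, which is the property the abstract refers to as reproducibility.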

Source
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0073938
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3769381
June 2014

Inference of alternative splicing from tiling array data.

Authors:
Johannes Eichner

Methods Mol Biol 2013 ;1067:143-64

Center for Bioinformatics, University of Tuebingen, Tübingen, Germany.

Alternative splicing (AS) is an important mechanism implicated in eukaryotic gene expression, whereby exon segments of precursor-mRNA transcripts are joined together in different arrangements corresponding to diverse isoforms of mature mRNA. Accumulating evidence suggests that in many instances this process is specifically regulated and contributes to the structural and functional diversification of tissues and cell types. Furthermore, several studies support the view that environmental stresses dramatically affect AS, and have reported novel transcript isoforms in response to biotic or abiotic stresses. Since specific regulation of AS in plants is a largely unexplored field of research, large-scale approaches aimed at monitoring AS on a genome-wide level are of increasing importance to gain insights into tissue-specific splicing regulation and to study the effects of changed environmental conditions on pre-mRNA splicing. Here, we describe the concepts of a traditional statistical approach and of a more recently developed machine learning-based method for AS detection from tiling arrays. The approaches presented here were employed for the detection and profiling of AS events in the model plant A. thaliana, and applied to a large dataset comprising transcriptomic expression data from 11 tissues and 13 stress conditions.

Source
DOI: http://dx.doi.org/10.1007/978-1-62703-607-8_10
March 2014

InCroMAP: integrated analysis of cross-platform microarray and pathway data.

Bioinformatics 2013 Feb 20;29(4):506-8. Epub 2012 Dec 20.

Center for Bioinformatics Tuebingen (ZBIT), University of Tuebingen, 72076 Tübingen, Germany.

Summary: Microarrays are commonly used to detect changes in gene expression between different biological samples. For this purpose, many analysis tools have been developed that offer visualization, statistical analysis and more sophisticated analysis methods. Most of these tools are designed specifically for messenger RNA microarrays. However, a growing variety of microarray platforms is now available. Changes in DNA methylation, microRNA expression or even protein phosphorylation states can be detected with specialized arrays. For these microarray technologies, the number of available tools is small compared with mRNA analysis tools. In particular, joint analysis of data from different microarray platforms applied to the same set of biological samples is poorly supported by most microarray analysis tools. Here, we present InCroMAP, a tool for the analysis and visualization of high-level microarray data from individual or multiple different platforms. Currently, InCroMAP supports mRNA, microRNA, DNA methylation and protein modification datasets. Several methods are offered that allow for an integrated analysis of data from those platforms. The features of InCroMAP range from visualization of DNA methylation data, through annotation of microRNA targets and integrated gene set enrichment analysis, to joint visualization of data from all platforms in the context of metabolic or signalling pathways.

Availability: InCroMAP is freely available as a Java™ application at www.cogsys.cs.uni-tuebingen.de/software/InCroMAP, including a comprehensive user's guide and example files.

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/bts709
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3570209
February 2013

Pathway-based visualization of cross-platform microarray datasets.

Bioinformatics 2012 Dec 9;28(23):3021-6. Epub 2012 Oct 9.

Center for Bioinformatics Tuebingen (ZBIT), University of Tuebingen, 72076 Tübingen, Germany.

Motivation: Traditionally, microarrays were almost exclusively used for the genome-wide analysis of differential gene expression. But nowadays, their scope of application has been extended to various genomic features, such as microRNAs (miRNAs), proteins and DNA methylation (DNAm). Most available methods for the visualization of these datasets are focused on individual platforms and are not capable of integratively visualizing multiple microarray datasets from cross-platform studies. Above all, there is a demand for methods that can visualize genomic features that are not directly linked to protein-coding genes, such as regulatory RNAs (e.g. miRNAs) and epigenetic alterations (e.g. DNAm), in a pathway-centred manner.

Results: We present a novel pathway-based visualization method that is especially suitable for the visualization of high-throughput datasets from multiple different microarray platforms that were used for the analysis of diverse genomic features in the same set of biological samples. The proposed methodology includes concepts for linking DNAm and miRNA expression datasets to canonical signalling and metabolic pathways. We further point out strategies for displaying data from multiple proteins and protein modifications corresponding to the same gene. Ultimately, we show how data from four distinct platform types (messenger RNA, miRNA, protein and DNAm arrays) can be integratively visualized in the context of canonical pathways.

Availability: The described method is implemented as part of the InCroMAP application that is freely available at www.cogsys.cs.uni-tuebingen.de/software/InCroMAP.

Contact: clemens.wrzodek@uni-tuebingen.de or andreas.zell@uni-tuebingen.de.

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/bts583
December 2012

Qualitative translation of relations from BioPAX to SBML qual.

Bioinformatics 2012 Oct 24;28(20):2648-53. Epub 2012 Aug 24.

Center for Bioinformatics Tuebingen (ZBIT), University of Tuebingen, 72076 Tübingen, Germany.

Motivation: The biological pathway exchange language (BioPAX) and the systems biology markup language (SBML) are among the most popular modeling and data exchange languages in systems biology. The focus of SBML is quantitative modeling and dynamic simulation of models, whereas the BioPAX specification concentrates mainly on visualization and qualitative analysis of pathway maps. BioPAX describes reactions and relations. In contrast, SBML core exclusively describes quantitative processes such as reactions. With the SBML qualitative models extension (qual), it has recently also become possible to describe relations in SBML. Before the development of SBML qual, relations could not be properly translated into SBML. Until now, no BioPAX to SBML converter has been fully capable of translating both reactions and relations.

Results: The entire Nature Pathway Interaction Database (PID) has been converted from BioPAX (Level 2 and Level 3) into SBML (Level 3 Version 1) including both reactions and relations by using the new qual extension package. Additionally, we present the new webtool BioPAX2SBML for further BioPAX to SBML conversions. Compared with previous conversion tools, BioPAX2SBML is more comprehensive, more robust and more exact.

Availability: BioPAX2SBML is freely available at http://webservices.cs.uni-tuebingen.de/ and the complete collection of the PID models is available at http://www.cogsys.cs.uni-tuebingen.de/downloads/Qualitative-Models/.

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/bts508
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3467751
October 2012

Linking the epigenome to the genome: correlation of different features to DNA methylation of CpG islands.

PLoS One 2012 30;7(4):e35327. Epub 2012 Apr 30.

Center for Bioinformatics Tübingen, ZBIT, University of Tübingen, Tübingen, Germany.

DNA methylation of CpG islands plays a crucial role in the regulation of gene expression. More than half of all human promoters contain CpG islands with a tissue-specific methylation pattern in differentiated cells. To date, the process by which DNA methyltransferases determine which regions are methylated is not completely understood. There are many hypotheses about which genomic features are correlated to the epigenome that have not yet been evaluated. Furthermore, many explorative approaches to measuring DNA methylation are limited to a subset of the genome and thus cannot be employed, e.g., for genome-wide biomarker prediction methods. In this study, we evaluated the correlation of genetic, epigenetic and hypothesis-driven features to DNA methylation of CpG islands. To this end, various binary classifiers were trained and evaluated by cross-validation on a dataset comprising DNA methylation data for 190 CpG islands in HEPG2, HEK293, fibroblasts and leukocytes. We achieved an accuracy of up to 91% with an MCC of 0.8 using ten-fold cross-validation and ten repetitions. With these models, we extended the existing dataset to the whole genome and thus predicted the methylation landscape for the given cell types. The method used for these predictions is also validated on another external whole-genome dataset. Our results reveal features correlated to DNA methylation and confirm or disprove various hypotheses of DNA methylation related features. This study confirms correlations between DNA methylation and histone modifications, DNA structure, DNA sequence, genomic attributes and CpG island properties. Furthermore, the method has been validated on a genome-wide dataset from the ENCODE consortium. The developed software, as well as the predicted datasets and a web-service to compare methylation states of CpG islands, are available at http://www.cogsys.cs.uni-tuebingen.de/software/dna-methylation/.
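
For readers unfamiliar with the two performance measures quoted above, accuracy and the Matthews correlation coefficient (MCC) can both be computed from the four cells of a binary confusion matrix. The sketch below uses the standard definitions; the function names are our own, and returning 0.0 when the MCC denominator is zero is a common convention assumed here, not something stated in the abstract.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (assumed convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom
```

Unlike accuracy, MCC stays informative when the two classes (here, methylated and unmethylated CpG islands) are unevenly represented: it ranges from -1 for total disagreement through 0 for chance-level prediction to +1 for perfect prediction.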

Source
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0035327
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3340366
September 2012

Support vector machines-based identification of alternative splicing in Arabidopsis thaliana from whole-genome tiling arrays.

BMC Bioinformatics 2011 Feb 16;12:55. Epub 2011 Feb 16.

Friedrich Miescher Laboratory, Max Planck Society, Spemannstr. 39, 72076 Tübingen, Germany.

Background: Alternative splicing (AS) is a process which generates several distinct mRNA isoforms from the same gene by splicing different portions out of the precursor transcript. Due to the (patho-)physiological importance of AS, a complete inventory of AS is of great interest. While this is within reach for human and mammalian model organisms, our knowledge of AS in plants has remained less complete. Experimental approaches for monitoring AS are either based on transcript sequencing or rely on hybridization to DNA microarrays. Among the microarray platforms facilitating the discovery of AS events, tiling arrays are well-suited for identifying intron retention, the most prevalent type of AS in plants. However, analyzing tiling array data is challenging because of high noise levels and limited probe coverage.

Results: In this work, we present a novel method to detect intron retentions (IR) and exon skips (ES) from tiling arrays. While statistical tests have typically been proposed for this purpose, our method instead utilizes support vector machines (SVMs) which are appreciated for their accuracy and robustness to noise. Existing EST and cDNA sequences served for supervised training and evaluation. Analyzing a large collection of publicly available microarray and sequence data for the model plant A. thaliana, we demonstrated that our method is more accurate than existing approaches. The method was applied in a genome-wide screen which resulted in the discovery of 1,355 IR events. A comparison of these IR events to the TAIR annotation and a large set of short-read RNA-seq data showed that 830 of the predicted IR events are novel and that 525 events (39%) overlap with either the TAIR annotation or the IR events inferred from the RNA-seq data.

Conclusions: The method developed in this work expands the scarce repertoire of analysis tools for the identification of alternative mRNA splicing from whole-genome tiling arrays. Our predictions are highly enriched with known AS events and complement the A. thaliana genome annotation with respect to AS. Since all predicted AS events can be precisely attributed to experimental conditions, our work provides a basis for follow-up studies focused on the elucidation of the regulatory mechanisms underlying tissue-specific and stress-dependent AS in plants.

Source
DOI: http://dx.doi.org/10.1186/1471-2105-12-55
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3051901
February 2011

Predicting DNA-binding specificities of eukaryotic transcription factors.

PLoS One 2010 Nov 30;5(11):e13876. Epub 2010 Nov 30.

Center for Bioinformatics Tübingen (ZBIT), University of Tübingen, Tübingen, Germany.

Today, annotated amino acid sequences of more and more transcription factors (TFs) are readily available. Quantitative information about their DNA-binding specificities, however, is hard to obtain. Position frequency matrices (PFMs), the most widely used models to represent binding specificities, are experimentally characterized only for a small fraction of all TFs. Even for some of the most intensively studied eukaryotic organisms (i.e., human, rat and mouse), only roughly one-sixth of all proteins with an annotated DNA-binding domain have been characterized experimentally. Here, we present a new method based on support vector regression for predicting quantitative DNA-binding specificities of TFs in different eukaryotic species. This approach estimates a quantitative measure for the PFM similarity of two proteins, based on various features derived from their protein sequences. The method is trained and tested on a dataset containing 1,239 TFs with known DNA-binding specificity, and used to predict specific DNA target motifs for 645 TFs with high accuracy.
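
To make the notion of a position frequency matrix concrete, the sketch below builds one from a set of aligned binding sites by counting each nucleotide per column. It only illustrates the data structure named in the abstract; the function names and the example sites are hypothetical, and the support vector regression and PFM similarity measure of the paper are not reproduced.

```python
ALPHABET = "ACGT"


def position_frequency_matrix(sites):
    """Count each nucleotide per column of equally long, aligned binding sites."""
    if not sites:
        raise ValueError("at least one binding site is required")
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all binding sites must have the same length")
    pfm = {base: [0] * length for base in ALPHABET}
    for site in sites:
        for pos, base in enumerate(site.upper()):
            if base not in pfm:
                raise ValueError(f"unexpected symbol {base!r} in {site!r}")
            pfm[base][pos] += 1
    return pfm


def column_frequencies(pfm):
    """Convert raw counts into relative frequencies per column."""
    n_sites = sum(pfm[base][0] for base in ALPHABET)
    return {base: [count / n_sites for count in pfm[base]] for base in ALPHABET}


# Hypothetical aligned sites, for illustration only.
sites = ["TATAAA", "TATATA", "TATAAG", "CATAAA"]
pfm = position_frequency_matrix(sites)
freqs = column_frequencies(pfm)
```

Each column of the matrix sums to the number of sites, so dividing by that number yields the per-position nucleotide frequencies that motif comparison methods typically operate on.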

Source
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0013876
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2994704
November 2010