Publications by authors named "Andreas Zell"

74 Publications

Mesenchyme-derived factors enhance preneoplastic growth by non-genotoxic carcinogens in rat liver.

Arch Toxicol 2018 Feb 21;92(2):953-966. Epub 2017 Dec 21.

Department of Medicine I, Comprehensive Cancer Center, Institute of Cancer Research, Medical University of Vienna, Borschkegasse 8a, 1090, Vienna, Austria.

Many frequently prescribed drugs are non-genotoxic carcinogens (NGC) in rodent liver. Their mode of action and health risks for humans remain to be elucidated. Here, we investigated the impact of two model NGC, the anti-epileptic drug phenobarbital (PB) and the contraceptive cyproterone acetate (CPA), on intrahepatic epithelial-mesenchymal crosstalk and on growth of first stages of hepatocarcinogenesis. Unaltered hepatocytes (HC) and preneoplastic HC were isolated from rat liver for primary culture. DNA replication of both unaltered and preneoplastic HC was increased by in vitro treatment with 10 µM CPA, but not 1 mM PB. Next, mesenchymal cells (MC) obtained from liver of rats treated with either PB (50 mg/kg bw/day) or CPA (100 mg/kg bw/day) were cultured. Supernatants from both types of MC raised DNA synthesis in both HC populations. This indicates that PB induces replication of unaltered and preneoplastic HC only indirectly, via growth factors secreted by MC. CPA, however, acts on both HC populations directly as well as indirectly via mesenchymal factors. Transcriptomics and bio-informatics revealed that PB and CPA induce extensive changes in the expression profile of MC affecting many growth factors and pathways. MC from PB-treated rats produced and secreted enhanced levels of HBEGF and GDF15, factors found to suppress apoptosis and/or induce DNA synthesis in cultured unaltered and preneoplastic HC. MC from CPA-treated animals showed enhanced expression and secretion of HGF, which strongly raised DNA replication in both HC populations. In conclusion, our findings reveal profound effects of two prototypical NGC on the hepatic mesenchyme. The resulting release of factors, which suppress apoptosis and/or enhance cell replication preferentially in cancer prestages, appears to be crucial for tumor promotion by NGC in the liver.

Source
http://dx.doi.org/10.1007/s00204-017-2080-0 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5818586 (PMC)
February 2018

Xenobiotic CAR Activators Induce Dlk1-Dio3 Locus Noncoding RNA Expression in Mouse Liver.

Toxicol Sci 2017 08;158(2):367-378

Preclinical Safety, Translational Medicine, Novartis Institutes for Biomedical Research, CH-4057 Basel, Switzerland.

Derisking xenobiotic-induced nongenotoxic carcinogenesis (NGC) represents a significant challenge during the safety assessment of chemicals and therapeutic drugs. The identification of robust mechanism-based NGC biomarkers has the potential to enhance cancer hazard identification. We previously demonstrated Constitutive Androstane Receptor (CAR) and WNT signaling-dependent up-regulation of the pluripotency associated Dlk1-Dio3 imprinted gene cluster noncoding RNAs (ncRNAs) in the liver of mice treated with tumor-promoting doses of phenobarbital (PB). Here, we have compared phenotypic, transcriptional, and proteomic data from wild-type, CAR/PXR double knock-out and CAR/PXR double humanized mice treated with either PB or chlordane, and show that hepatic Dlk1-Dio3 locus long ncRNAs are upregulated in a CAR/PXR-dependent manner by two structurally distinct CAR activators. We further explored the specificity of Dlk1-Dio3 locus ncRNAs as hepatic NGC biomarkers in mice treated with additional compounds working through distinct NGC modes of action. We propose that up-regulation of Dlk1-Dio3 cluster ncRNAs can serve as an early biomarker for CAR activator-induced nongenotoxic hepatocarcinogenesis and thus may contribute to mechanism-based assessments of carcinogenicity risk for chemicals and novel therapeutics.

Source
http://dx.doi.org/10.1093/toxsci/kfx104 (DOI)
August 2017

ZBIT Bioinformatics Toolbox: A Web-Platform for Systems Biology and Expression Data Analysis.

PLoS One 2016 16;11(2):e0149263. Epub 2016 Feb 16.

Department of Computer Science, University of Tübingen, Tübingen, Germany.

Bioinformatics analysis has become an integral part of research in biology. However, installation and use of scientific software can be difficult and often requires technical expert knowledge. Reasons are dependencies on certain operating systems or required third-party libraries, missing graphical user interfaces and documentation, or nonstandard input and output formats. In order to make bioinformatics software easily accessible to researchers, we here present a web-based platform. The Center for Bioinformatics Tuebingen (ZBIT) Bioinformatics Toolbox provides web-based access to a collection of bioinformatics tools developed for systems biology, protein sequence annotation, and expression data analysis. Currently, the collection encompasses software for conversion and processing of community standards SBML and BioPAX, transcription factor analysis, and analysis of microarray data from transcriptomics and proteomics studies. All tools are hosted on a customized Galaxy instance and run on a dedicated computation cluster. Users only need a web browser and an active internet connection in order to benefit from this service. The web platform is designed to facilitate the usage of the bioinformatics tools for researchers without advanced technical background. Users can combine tools for complex analyses or use predefined, customizable workflows. All results are stored persistently and are reproducible. For each tool, we provide documentation, tutorials, and example data to maximize usability. The ZBIT Bioinformatics Toolbox is freely available at https://webservices.cs.uni-tuebingen.de/.

Source
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0149263 (PLOS)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4801062 (PMC)
July 2016

Tumor promotion and inhibition by phenobarbital in livers of conditional Apc-deficient mice.

Arch Toxicol 2016 Jun 2;90(6):1481-94. Epub 2016 Feb 2.

Department of Toxicology, University of Tuebingen, Wilhelmstr. 56, 72074, Tuebingen, Germany.

Activation of Wnt/β-catenin signaling is important for human and rodent hepatocarcinogenesis. In mice, the tumor promoter phenobarbital (PB) selects for hepatocellular tumors with activating β-catenin mutations via constitutive androstane receptor activation. PB-dependent tumor promotion was studied in mice with genetic inactivation of Apc, a negative regulator of β-catenin, to circumvent the problem of randomly induced mutations by chemical initiators and to allow monitoring of PB- and Wnt/β-catenin-dependent tumorigenesis in the absence of unknown genomic alterations. Moreover, the study was designed to investigate PB-induced proliferation of liver cells with activated β-catenin. PB treatment provided Apc-deficient hepatocytes with only a minor proliferative advantage, and additional connexin 32 deficiency did not affect the proliferative response. PB significantly promoted the outgrowth of Apc-deficient hepatocellular adenoma (HCA), but simultaneously inhibited the formation of Apc-deficient hepatocellular carcinoma (HCC). The probability of tumor promotion by PB was calculated to be much lower for hepatocytes with loss of Apc, as compared to mutational β-catenin activation. Comprehensive transcriptomic and phosphoproteomic characterization of HCA and HCC revealed molecular details of the two tumor types. HCC were characterized by a loss of differentiated hepatocellular gene expression, enhanced proliferative signaling, and massive over-activation of Wnt/β-catenin signaling. In conclusion, PB exerts a dual role in liver tumor formation by promoting the growth of HCA but inhibiting the growth of HCC. Data demonstrate that one and the same compound can produce opposite effects on hepatocarcinogenesis, depending on context, highlighting the necessity to develop a more differentiated view on the tumorigenicity of this model compound.

Source
http://dx.doi.org/10.1007/s00204-016-1667-1 (DOI)
June 2016

Coordinating Role of RXRα in Downregulating Hepatic Detoxification during Inflammation Revealed by Fuzzy-Logic Modeling.

PLoS Comput Biol 2016 Jan 4;12(1):e1004431. Epub 2016 Jan 4.

Dr. Margarete Fischer Bosch-Institute of Clinical Pharmacology, Stuttgart.

During various inflammatory processes circulating cytokines including IL-6, IL-1β, and TNFα elicit a broad and clinically relevant impairment of hepatic detoxification that is based on the simultaneous downregulation of many drug metabolizing enzymes and transporter genes. To address the question whether a common mechanism is involved we treated human primary hepatocytes with IL-6, the major mediator of the acute phase response in liver, and characterized acute phase and detoxification responses in quantitative gene expression and (phospho-)proteomics data sets. Selective inhibitors were used to disentangle the roles of JAK/STAT, MAPK, and PI3K signaling pathways. A prior knowledge-based fuzzy logic model comprising signal transduction and gene regulation was established and trained with perturbation-derived gene expression data from five hepatocyte donors. Our model suggests a greater role of MAPK/PI3K compared to JAK/STAT with the orphan nuclear receptor RXRα playing a central role in mediating transcriptional downregulation. Validation experiments revealed a striking similarity of RXRα gene silencing versus IL-6 induced negative gene regulation (rs = 0.79; P<0.0001). These results concur with RXRα functioning as obligatory heterodimerization partner for several nuclear receptors that regulate drug and lipid metabolism.

Source
http://dx.doi.org/10.1371/journal.pcbi.1004431 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4699813 (PMC)
January 2016

SBMLsqueezer 2: context-sensitive creation of kinetic equations in biochemical networks.

BMC Syst Biol 2015 Oct 9;9:68. Epub 2015 Oct 9.

Center for Bioinformatics Tuebingen (ZBIT), University of Tuebingen, Sand 1, Tübingen, 72076, Germany.

Background: The size and complexity of published biochemical network reconstructions are steadily increasing, expanding the potential scale of derived computational models. However, the construction of large biochemical network models is a laborious and error-prone task. Automated methods have simplified the network reconstruction process, but building kinetic models for these systems is still a manually intensive task. Appropriate kinetic equations, based upon reaction rate laws, must be constructed and parameterized for each reaction. The complex test-and-evaluation cycles that can be involved during kinetic model construction would thus benefit from automated methods for rate law assignment.

Results: We present a high-throughput algorithm to automatically suggest and create suitable rate laws based upon reaction type according to several criteria. The criteria for choices made by the algorithm can be influenced in order to assign the desired type of rate law to each reaction. This algorithm is implemented in the software package SBMLsqueezer 2. In addition, this program contains an integrated connection to the kinetics database SABIO-RK to obtain experimentally-derived rate laws when desired.

Conclusions: The described approach fills a heretofore absent niche in workflows for large-scale biochemical kinetic model construction. In several applications the algorithm has already been demonstrated to be useful and scalable. SBMLsqueezer is platform independent and can be used as a stand-alone package, as an integrated plugin, or through a web interface, enabling flexible solutions and use-case scenarios.

Source
http://dx.doi.org/10.1186/s12918-015-0212-9 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4600286 (PMC)
October 2015

Proinflammatory mesenchymal effects of the non-genotoxic hepatocarcinogen phenobarbital: a novel mechanism of antiapoptosis and tumor promotion.

Carcinogenesis 2015 Dec 16;36(12):1521-30. Epub 2015 Sep 16.

Center of Bioinformatics Tübingen (ZBIT), University of Tübingen, 72070 Tübingen, Germany.

Many environmental pollutants and drugs, including steroid hormones, hypolipidemics and antiepileptics, are non-genotoxic carcinogens (NGC) in rodent liver. The mechanism of action and the risk for human health are still insufficiently known. Here, we study the effects of phenobarbital (PB), a widely used model NGC, on hepatic epithelial-mesenchymal crosstalk and the impact on hepatic apoptosis. Mesenchymal cells (MC) and hepatocytes (HC) were isolated from control and PB-treated rat livers. PB induced extensive changes in gene expression in MC and much less in HC as shown by transcriptomics with oligoarrays. In MC only, transcript levels of numerous proinflammatory cytokines were elevated. Correspondingly, ELISA on the supernatant of MC from PB-treated rats revealed enhanced release of various cytokines. In cultured HC, this supernatant caused (i) nuclear translocation and activation of nuclear factor-κB (shown by immunoblots of nuclear extracts and reporter gene assays), (ii) elevated expression of proinflammatory genes and (iii) protection from the proapoptotic action of transforming growth factor beta 1 (TGFβ1). PB treatment in vivo or in vitro elevated the production and release of tumor necrosis factor alpha from MC, which was identified as mainly responsible for the inhibition of apoptosis in HC. In conclusion, our findings reveal profound proinflammatory effects of PB on hepatic mesenchyme and mesenchymal-epithelial interactions. The resulting release of cytokines acts antiapoptotic in HC, an effect crucial for tumor promotion and carcinogenesis by NGC.

Source
http://dx.doi.org/10.1093/carcin/bgv135 (DOI)
December 2015

Influence of Feature Encoding and Choice of Classifier on Disease Risk Prediction in Genome-Wide Association Studies.

PLoS One 2015 18;10(8):e0135832. Epub 2015 Aug 18.

Cognitive Systems Group, University of Tübingen, Tübingen, Germany.

Various attempts have been made to predict the individual disease risk based on genotype data from genome-wide association studies (GWAS). However, most studies only investigated one or two classification algorithms and feature encoding schemes. In this study, we applied seven different classification algorithms to GWAS case-control data sets for seven different diseases to create models for disease risk prediction. Further, we used three different encoding schemes for the genotypes of single nucleotide polymorphisms (SNPs) and investigated their influence on the predictive performance of these models. Our study suggests that an additive encoding of the SNP data should be the preferred encoding scheme, as it proved to yield the best predictive performances for all algorithms and data sets. Furthermore, our results showed that the differences between most state-of-the-art classification algorithms are not statistically significant. Consequently, we recommend preferring algorithms with simple models, such as the linear support vector machine (SVM), as they allow for better subsequent interpretation without significant loss of accuracy.

Source
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0135832 (PLOS)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4540285 (PMC)
May 2016

A ranking method for the concurrent learning of compounds with various activity profiles.

J Cheminform 2015 16;7(1). Epub 2015 Jan 16.

Center for Bioinformatics Tübingen (ZBIT), University of Tuebingen, Sand 1, Tübingen, 72076 Germany.

Background: In this study, we present an SVM-based ranking algorithm for the concurrent learning of compounds with different activity profiles and their varying prioritization. To this end, a specific labeling of each compound was elaborated in order to infer virtual screening models against multiple targets. We compared the method with several state-of-the-art SVM classification techniques that are capable of inferring multi-target screening models on three chemical data sets (cytochrome P450s, dehydrogenases, and a trypsin-like protease data set) containing three different biological targets each.

Results: The experiments show that ranking-based algorithms achieve increased performance for single- and multi-target virtual screening. Moreover, compounds that do not completely fulfill the desired activity profile are still ranked higher than decoys or compounds with an entirely undesired profile, compared to other multi-target SVM methods.

Conclusions: SVM-based ranking methods constitute a valuable approach for virtual screening in multi-target drug design. The utilization of such methods is most helpful when dealing with compounds with various activity profiles and the finding of many ligands with an already perfectly matching activity profile is not to be expected.

Source
http://dx.doi.org/10.1186/s13321-014-0050-6 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4306736 (PMC)
February 2015

SBMLSimulator: A Java Tool for Model Simulation and Parameter Estimation in Systems Biology.

Computation (Basel) 2014 Dec 18;2(4):246-257. Epub 2014 Dec 18.

Center for Bioinformatics Tuebingen (ZBIT), University of Tuebingen, Sand 1, 72076 Tübingen, Germany.

The identification of suitable model parameters for biochemical reactions has been recognized as a quite difficult endeavor. Parameter values from literature or experiments often cannot be combined directly in complex reaction systems. Nature-inspired optimization techniques can find appropriate sets of parameters that calibrate a model to experimentally obtained time series data. We present SBMLsimulator, a tool that combines the Systems Biology Simulation Core Library for dynamic simulation of biochemical models with the heuristic optimization framework EvA2. SBMLsimulator provides an intuitive graphical user interface with various options as well as a fully-featured command-line interface for large-scale and script-based model simulation and calibration. In a parameter estimation study based on a published model and artificial data we demonstrate the capability of SBMLsimulator to identify parameters. SBMLsimulator is useful both for the interactive simulation and exploration of the parameter space and for large-scale model calibration and estimation of uncertain parameter values.

Source
http://dx.doi.org/10.3390/computation2040246 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7093077 (PMC)
December 2014

ToxDBScan: Large-scale similarity screening of toxicological databases for drug candidates.

Int J Mol Sci 2014 Oct 21;15(10):19037-55. Epub 2014 Oct 21.

Center of Bioinformatics Tuebingen (ZBIT), University of Tuebingen, Tübingen 72074, Germany.

We present a new tool for hepatocarcinogenicity evaluation of drug candidates in rodents. ToxDBScan is a web tool offering quick and easy similarity screening of new drug candidates against two large-scale public databases, which contain expression profiles for substances with known carcinogenic profiles: TG-GATEs and DrugMatrix. ToxDBScan uses a set similarity score that computes the putative similarity based on similar expression of genes to identify chemicals with similar genotoxic and hepatocarcinogenic potential. We propose using a discretized representation of expression profiles, which uses only information on up- or down-regulation of genes as relevant features. Therefore, only the deregulated genes are required as input. ToxDBScan provides an extensive report on similar compounds, which includes additional information on compounds, differential genes and pathway enrichments. We evaluated ToxDBScan with expression data from 15 chemicals with known hepatocarcinogenic potential and observed a sensitivity of 88%. Based on the identified chemicals, we achieved perfect classification of the independent test set. ToxDBScan is publicly available from the ZBIT Bioinformatics Toolbox.

Source
http://dx.doi.org/10.3390/ijms151019037 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4227259 (PMC)
October 2014

Dysregulated serum response factor triggers formation of hepatocellular carcinoma.

Hepatology 2015 Mar 30;61(3):979-89. Epub 2015 Jan 30.

Department for Molecular Biology, Interfaculty Institute of Cell Biology, Tuebingen University, Germany.

The ubiquitously expressed transcriptional regulator serum response factor (SRF) is controlled by both Ras/MAPK (mitogen-activated protein kinase) and Rho/actin signaling pathways, which are frequently activated in hepatocellular carcinoma (HCC). We generated SRF-VP16iHep mice, which conditionally express constitutively active SRF-VP16 in hepatocytes, thereby controlling subsets of both Ras/MAPK- and Rho/actin-stimulated target genes. All SRF-VP16iHep mice develop hyperproliferative liver nodules that progress to lethal HCC. Some murine (m)HCCs acquire Ctnnb1 mutations equivalent to those in human (h)HCC. The resulting transcript signatures mirror those of a distinct subgroup of hHCCs, with shared activation of oncofetal genes including Igf2, correlating with CpG hypomethylation at the imprinted Igf2/H19 locus.

Conclusion: SRF-VP16iHep mHCC reveal convergent Ras/MAPK and Rho/actin signaling as a highly oncogenic driver mechanism for hepatocarcinogenesis. This suggests simultaneous inhibition of Ras/MAPK and Rho/actin signaling as a treatment strategy in hHCC therapy.

Source
http://dx.doi.org/10.1002/hep.27539 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4365683 (PMC)
March 2015

RPPApipe: a pipeline for the analysis of reverse-phase protein array data.

Biosystems 2014 Aug 18;122:19-24. Epub 2014 Jun 18.

Center for Bioinformatics Tuebingen (ZBIT), University of Tuebingen, Tübingen 72076, Germany.

Background And Scope: Today, web-based data analysis pipelines exist for a wide variety of microarray platforms, such as ordinary gene-centered arrays, exon arrays and SNP arrays. However, most of the available software tools provide only limited support for reverse-phase protein arrays (RPPA), as relevant inherent properties of the corresponding datasets are not taken into account. Thus, we developed the web-based data analysis pipeline RPPApipe, which was specifically tailored to suit the characteristics of the RPPA platform and encompasses various tools for data preprocessing, statistical analysis, clustering and pathway analysis.

Implementation And Performance: All tools which are part of the RPPApipe software were implemented using R/Bioconductor. The software was embedded into our web-based ZBIT Bioinformatics Toolbox which is a customized instance of the Galaxy platform.

Availability: RPPApipe is freely available under GNU Public License from http://webservices.cs.uni-tuebingen.de. Full documentation of the tool can be found on the corresponding website http://www.cogsys.cs.uni-tuebingen.de/software/RPPApipe.

Source
http://dx.doi.org/10.1016/j.biosystems.2014.06.009 (DOI)
August 2014

Cross-platform toxicogenomics for the prediction of non-genotoxic hepatocarcinogenesis in rat.

PLoS One 2014 15;9(5):e97640. Epub 2014 May 15.

Center of Bioinformatics Tuebingen (ZBIT), University of Tuebingen, Tübingen, Germany.

In the area of omics profiling in toxicology, i.e. toxicogenomics, characteristic molecular profiles have previously been incorporated into prediction models for early assessment of a carcinogenic potential and mechanism-based classification of compounds. Traditionally, the biomarker signatures used for model construction were derived from individual high-throughput techniques, such as microarrays designed for monitoring global mRNA expression. In this study, we built predictive models by integrating omics data across complementary microarray platforms and introduced new concepts for modeling of pathway alterations and molecular interactions between multiple biological layers. We trained and evaluated diverse machine learning-based models, differing in the incorporated features and learning algorithms on a cross-omics dataset encompassing mRNA, miRNA, and protein expression profiles obtained from rat liver samples treated with a heterogeneous set of substances. Most of these compounds could be unambiguously classified as genotoxic carcinogens, non-genotoxic carcinogens, or non-hepatocarcinogens based on evidence from published studies. Since mixed characteristics were reported for the compounds Cyproterone acetate, Thioacetamide, and Wy-14643, we reclassified these compounds as either genotoxic or non-genotoxic carcinogens based on their molecular profiles. Evaluating our toxicogenomics models in a repeated external cross-validation procedure, we demonstrated that the prediction accuracy of our models could be increased by joining the biomarker signatures across multiple biological layers and by adding complex features derived from cross-platform integration of the omics data. Furthermore, we found that adding these features resulted in a better separation of the compound classes and a more confident reclassification of the three undefined compounds as non-genotoxic carcinogens.

Source
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0097640 (PLOS)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4022579 (PMC)
January 2015

Evaluation of toxicogenomics approaches for assessing the risk of nongenotoxic carcinogenicity in rat liver.

PLoS One 2014 14;9(5):e97678. Epub 2014 May 14.

Center for Bioinformatics Tuebingen (ZBIT), University of Tuebingen, Tübingen, Germany.

The current gold-standard method for cancer safety assessment of drugs is a rodent two-year bioassay, which is associated with significant costs and requires testing a high number of animals over their lifetime. Due to the absence of a comprehensive set of short-term assays predicting carcinogenicity, new approaches are currently being evaluated. One promising approach is toxicogenomics, which by virtue of genome-wide molecular profiling after compound treatment can lead to an increased mechanistic understanding, and potentially allow for the prediction of a carcinogenic potential via mathematical modeling. The latter typically involves the extraction of informative genes from omics datasets, which can be used to construct generalizable models allowing for the early classification of compounds with unknown carcinogenic potential. Here we formally describe and compare two novel methodologies for the reproducible extraction of characteristic mRNA signatures, which were employed to capture specific gene expression changes observed for nongenotoxic carcinogens. While the first method integrates multiple gene rankings, generated by diverse algorithms applied to data from different subsamplings of the training compounds, the second approach employs a statistical ratio for the identification of informative genes. Both methods were evaluated on a dataset obtained from the toxicogenomics database TG-GATEs to predict the outcome of a two-year bioassay based on profiles from 14-day treatments. Additionally, we applied our methods to datasets from previous studies and showed that the derived prediction models are on average more accurate than those built from the original signatures. The selected genes were mostly related to p53 signaling and to specific changes in anabolic processes or energy metabolism, which are typically observed in tumor cells. Among the genes most frequently incorporated into prediction models were Phlda3, Cdkn1a, Akr7a3, Ccng1 and Abcb4.

Source
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0097678 (PLOS)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4020844 (PMC)
June 2015

Integrated enrichment analysis and pathway-centered visualization of metabolomics, proteomics, transcriptomics, and genomics data by using the InCroMAP software.

J Chromatogr B Analyt Technol Biomed Life Sci 2014 Sep 25;966:77-82. Epub 2014 Apr 25.

Institute for Diabetes Research and Metabolic Diseases of the Helmholtz Centre Munich at the University of Tübingen, Tübingen, Germany; Division of Clinical Chemistry and Pathobiochemistry, Department of Internal Medicine IV, University Hospital Tübingen, Tübingen, Germany; German Center for Diabetes Research (DZD), Germany.

In systems biology, the combination of multiple types of omics data, such as metabolomics, proteomics, transcriptomics, and genomics, yields more information on a biological process than the analysis of a single type of data. Thus, data from different omics platforms are usually combined in one experimental setup to obtain insight into a biological process or a disease state. In particular, high-accuracy metabolomics data from modern mass spectrometry instruments are increasingly integrated into biological studies. Reflecting this trend, we extended InCroMAP, a data integration, analysis and visualization tool for genomics, transcriptomics, and proteomics data. The tool is now able to perform an integrated enrichment analysis and pathway-based visualization of multi-omics data and is thus suitable for the evaluation of comprehensive systems biology studies.

Source
http://dx.doi.org/10.1016/j.jchromb.2014.04.030 (DOI)
September 2014

Ha-ras and β-catenin oncoproteins orchestrate metabolic programs in mouse liver tumors.

Int J Cancer 2014 Oct 3;135(7):1574-85. Epub 2014 Mar 3.

Institute of Experimental and Clinical Pharmacology and Toxicology, Department of Toxicology, Eberhard Karls University of Tübingen, Tübingen, 72074, Germany.

The process of hepatocarcinogenesis in the diethylnitrosamine (DEN) initiation/phenobarbital (PB) promotion mouse model involves the selective clonal outgrowth of cells harboring oncogene mutations in Ctnnb1, while spontaneous or DEN-only-induced tumors are often Ha-ras- or B-raf-mutated. The molecular mechanisms and pathways underlying these different tumor sub-types are not well characterized. Their identification may help identify markers for xenobiotic-promoted versus spontaneously occurring liver tumors. Here, we have characterized mouse liver tumors harboring either Ctnnb1 or Ha-ras mutations via integrated molecular profiling at the transcriptional, translational and post-translational levels. In addition, metabolites of the intermediary metabolism were quantified by high-resolution ¹H magic angle nuclear magnetic resonance. We have identified tumor genotype-specific differences in mRNA and miRNA expression, protein levels, post-translational modifications, and metabolite levels that facilitate the molecular and biochemical stratification of tumor phenotypes. Bioinformatic integration of these data at the pathway level led to novel insights into tumor genotype-specific aberrant cell signaling and in particular to a better understanding of alterations in pathways of the cell intermediary metabolism, which are driven by the constitutive activation of the β-Catenin and Ha-ras oncoproteins in tumors of the two genotypes.

Source
http://dx.doi.org/10.1002/ijc.28798
October 2014

Identification of short terminal motifs enriched by antibodies using peptide mass fingerprinting.

Bioinformatics 2014 May 9;30(9):1205-13. Epub 2014 Jan 9.

Natural and Medical Sciences Institute (NMI) at the University of Tübingen, Markwiesenstr. 55, D-72770 Reutlingen and Center for Bioinformatics, University of Tübingen, Sand 1, D-72076 Tuebingen, Germany.

Motivation: Mass spectrometry-based protein profiling has become a key technology in biomedical research and biomarker discovery. Sample preparation strategies that reduce the complexity of tryptic digests by immunoaffinity substantially increase throughput and sensitivity in proteomic mass spectrometry. The scarce availability of peptide-specific capture antibodies limits these approaches. Recently, antibodies directed against short terminal motifs were found to enrich subsets of peptides with identical terminal sequences. This approach holds the promise of a significant gain in efficiency. TXP (Triple X Proteomics) and context-independent motif specific/global proteome survey binders are variants of this concept. In principle, the binding motifs of such antibodies have to be elucidated after the antibodies have been generated. This entails a substantial effort in the lab, as it requires synthetic peptide libraries and numerous mass spectrometry experiments.

Results: We present an algorithm for predicting the antibody-binding motif in a mass spectrum obtained from a tryptic digest of a common cell line after immunoprecipitation. The epitope prediction, based on peptide mass fingerprinting, reveals the most enriched terminal epitopes. The tool provides a P-value for each potential epitope, estimated by sampling random spectra from a peptide database. The second algorithm combines the predicted sequences to more complex binding motifs. A comparison with library screenings shows that the predictions made by the novel methods are reliable and reproducible indicators of the binding properties of an antibody.
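
The P-value estimation by sampling random spectra from a peptide database can be sketched as a Monte Carlo test. This is a minimal illustration, not the published tool: the scoring function (a count of mass matches within a tolerance), the add-one correction, and all names and parameters below are assumptions.

```python
import random


def empirical_p_value(observed: float, null_scores: list) -> float:
    """Fraction of null scores at least as extreme as the observed score.

    The +1 in numerator and denominator avoids reporting a P-value of zero
    from a finite number of samples.
    """
    exceed = sum(1 for score in null_scores if score >= observed)
    return (exceed + 1) / (len(null_scores) + 1)


def count_motif_matches(spectrum_masses, motif_peptide_masses, tol=0.01):
    """Count spectrum peaks that match any peptide carrying the candidate motif."""
    return sum(
        any(abs(mass - peptide) <= tol for peptide in motif_peptide_masses)
        for mass in spectrum_masses
    )


def motif_p_value(spectrum_masses, motif_peptide_masses, database_masses,
                  n_samples=1000, tol=0.01, seed=0):
    """Estimate how unusual the observed match count is under random spectra.

    Each random spectrum has the same number of peaks as the observed one and
    is drawn without replacement from the peptide masses in the database.
    """
    rng = random.Random(seed)
    observed = count_motif_matches(spectrum_masses, motif_peptide_masses, tol)
    null_scores = []
    for _ in range(n_samples):
        random_spectrum = rng.sample(database_masses, len(spectrum_masses))
        null_scores.append(count_motif_matches(random_spectrum, motif_peptide_masses, tol))
    return empirical_p_value(observed, null_scores)
```

With this construction, the smallest reportable P-value is 1 / (n_samples + 1), so the number of samples bounds the resolution of the estimate.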

Source
http://dx.doi.org/10.1093/bioinformatics/btu009
May 2014

TFpredict and SABINE: sequence-based prediction of structural and functional characteristics of transcription factors.

PLoS One 2013 12;8(12):e82238. Epub 2013 Dec 12.

Center of Bioinformatics Tuebingen (ZBIT), University of Tuebingen, Tübingen, Germany.

One of the key mechanisms of transcriptional control is the specific interaction between transcription factors (TFs) and cis-regulatory elements in gene promoters. The elucidation of these specific protein-DNA interactions is crucial to gain insights into the complex regulatory mechanisms and networks underlying the adaptation of organisms to dynamically changing environmental conditions. As experimental techniques for determining TF binding sites are expensive and mostly performed for selected TFs only, accurate computational approaches are needed to analyze transcriptional regulation in eukaryotes on a genome-wide level. We implemented a four-step classification workflow which for a given protein sequence (1) discriminates TFs from other proteins, (2) determines the structural superclass of TFs, (3) identifies the DNA-binding domains of TFs and (4) predicts their cis-acting DNA motif. While existing tools were extended and adapted for performing the latter two prediction steps, the first two steps are based on a novel numeric sequence representation which allows for combining existing knowledge from a BLAST scan with robust machine learning-based classification. By evaluation on a set of experimentally confirmed TFs and non-TFs, we demonstrate that our new protein sequence representation facilitates more reliable identification and structural classification of TFs than previously proposed sequence-derived features. The algorithms underlying our proposed methodology are implemented in the two complementary tools TFpredict and SABINE. The online and stand-alone versions of TFpredict and SABINE are freely available to academics at http://www.cogsys.cs.uni-tuebingen.de/software/TFpredict/ and http://www.cogsys.cs.uni-tuebingen.de/software/SABINE/.

Source
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0082238
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3861411
October 2014

Integrative pathway-based approach for genome-wide association studies: identification of new pathways for rheumatoid arthritis and type 1 diabetes.

PLoS One 2013 25;8(10):e78577. Epub 2013 Oct 25.

Center for Bioinformatics Tuebingen (ZBIT), University of Tuebingen, Tübingen, Germany.

Genome-wide association studies (GWAS) led to the identification of numerous novel loci for a number of complex diseases. Pathway-based approaches using genotypic data provide tangible leads which cannot be identified by single marker approaches as implemented in GWAS. The available pathway analysis approaches mainly differ in the employed databases and in the applied statistics for determining the significance of the associated disease markers. So far, pathway-based approaches using GWAS data failed to consider the overlapping of genes among different pathways or the influence of protein-interactions. We performed a multistage integrative pathway (MIP) analysis on three common diseases--Crohn's disease (CD), rheumatoid arthritis (RA) and type 1 diabetes (T1D)--incorporating genotypic, pathway, protein- and domain-interaction data to identify novel associations between these diseases and pathways. Additionally, we assessed the sensitivity of our method by studying the influence of the most significant SNPs on the pathway analysis by removing those and comparing the corresponding pathway analysis results. Apart from confirming many previously published associations between pathways and RA, CD and T1D, our MIP approach was able to identify three new associations between disease phenotypes and pathways. This includes a relation between the influenza-A pathway and RA, as well as a relation between T1D and the phagosome and toxoplasmosis pathways. These results provide new leads to understand the molecular underpinnings of these diseases. The developed software herein used is available at http://www.cogsys.cs.uni-tuebingen.de/software/GWASPathwayIdentifier/index.htm.

Source
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0078577
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3808349
August 2014

Parkinson's disease: dopaminergic nerve cell model is consistent with experimental finding of increased extracellular transport of α-synuclein.

BMC Neurosci 2013 Nov 6;14:136. Epub 2013 Nov 6.

Center for Bioinformatics Tuebingen (ZBIT), University of Tuebingen, 72076 Tübingen, Germany.

Background: Parkinson's disease is an age-related disease whose pathogenesis is not completely known. Animal models exist for investigating the disease but not all results can be easily transferred to humans. Therefore, mathematical or probabilistic models for the human disease are to be constructed in silico in order to predict specific processes within a cell, such as the dopamine metabolism and transport processes in a neuron.

Results: We present a Systems Biology Markup Language (SBML) model of a whole dopaminergic nerve cell consisting of 139 reactions and 111 metabolites which includes, among others, the dopamine metabolism and transport, oxidative stress, aggregation of α-synuclein (αSYN), lysosomal and proteasomal degradation, and mitophagy. The predictive power of the model was investigated using flux balance analysis for the identification of steady model states. To this end, we performed six experiments: (i) investigation of the normal cell behavior, (ii) increase of O2, (iii) increase of ATP, (iv) influence of neurotoxins, (v) increase of αSYN in the cell, and (vi) increase of dopamine synthesis. The SBML model is available in the BioModels database with identifier MODEL1302200000.

Conclusion: It is possible to simulate the normal behavior of an in vivo nerve cell with the developed model. We show that the model is sensitive to neurotoxins and oxidative stress. Further, an increased level of αSYN induces apoptosis, and an increased flux of αSYN to the extracellular space was observed.

Source
http://dx.doi.org/10.1186/1471-2202-14-136
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3871002
November 2013

Path2Models: large-scale generation of computational models from biochemical pathway maps.

BMC Syst Biol 2013 Nov 1;7:116. Epub 2013 Nov 1.

European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.

Background: Systems biology projects and omics technologies have led to a growing number of biochemical pathway models and reconstructions. However, the majority of these models are still created de novo, based on literature mining and the manual processing of pathway data.

Results: To increase the efficiency of model creation, the Path2Models project has automatically generated mathematical models from pathway representations using a suite of freely available software. Data sources include KEGG, BioCarta, MetaCyc and SABIO-RK. Depending on the source data, three types of models are provided: kinetic, logical and constraint-based. Models from over 2 600 organisms are encoded consistently in SBML, and are made freely available through BioModels Database at http://www.ebi.ac.uk/biomodels-main/path2models. Each model contains the list of participants, their interactions, the relevant mathematical constructs, and initial parameter values. Most models are also available as easy-to-understand graphical SBGN maps.

Conclusions: To date, the project has resulted in more than 140 000 freely available models. Such a resource can tremendously accelerate the development of mathematical models by providing initial starting models for simulation and analysis, which can be subsequently curated and further parameterized.

Source
http://dx.doi.org/10.1186/1752-0509-7-116
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4228421
November 2013

Automated label-free quantification of metabolites from liquid chromatography-mass spectrometry data.

Mol Cell Proteomics 2014 Jan 31;13(1):348-59. Epub 2013 Oct 31.

Applied Bioinformatics, Center for Bioinformatics, Quantitative Biology Center, and Department of Computer Science, University of Tuebingen, Sand 14, 72076 Tuebingen, Germany.

Liquid chromatography coupled to mass spectrometry (LC-MS) has become a standard technology in metabolomics. In particular, label-free quantification based on LC-MS is easily amenable to large-scale studies and thus well suited to clinical metabolomics. Large-scale studies, however, require automated processing of the large and complex LC-MS datasets. We present a novel algorithm for the detection of mass traces and their aggregation into features (i.e. all signals caused by the same analyte species) that is computationally efficient and sensitive and that leads to reproducible quantification results. The algorithm is based on a sensitive detection of mass traces, which are then assembled into features based on mass-to-charge spacing, co-elution information, and a support vector machine-based classifier able to identify potential metabolite isotope patterns. The algorithm is not limited to metabolites but is applicable to a wide range of small molecules (e.g. lipidomics, peptidomics), as well as to other separation technologies. We assessed the algorithm's robustness with regard to varying noise levels on synthetic data and then validated the approach on experimental data investigating human plasma samples. We obtained excellent results in a fully automated data-processing pipeline with respect to both accuracy and reproducibility. Relative to state-of-the-art algorithms, ours demonstrated increased precision and recall. The algorithm is available as part of the open-source software package OpenMS and runs on all major operating systems.

Source
http://dx.doi.org/10.1074/mcp.M113.031278
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3879626
January 2014

Screening for protein-DNA interactions by automatable DNA-protein interaction ELISA.

PLoS One 2013 11;8(10):e75177. Epub 2013 Oct 11.

Plant Physiology, Center for Plant Molecular Biology, University of Tuebingen, Tuebingen, Germany.

DNA-binding proteins (DBPs), such as transcription factors, constitute about 10% of the protein-coding genes in eukaryotic genomes and play pivotal roles in the regulation of chromatin structure and gene expression by binding to short stretches of DNA. Despite their number and importance, the binding sequence has been disclosed for only a minor portion of DBPs. Methods that allow the de novo identification of DNA-binding motifs of known DBPs, such as protein binding microarray technology or SELEX, are not yet suited for high-throughput and automation. To close this gap, we report an automatable DNA-protein-interaction (DPI)-ELISA screen of an optimized double-stranded DNA (dsDNA) probe library that allows the high-throughput identification of hexanucleotide DNA-binding motifs. In contrast to other methods, this DPI-ELISA screen can be performed manually or with standard laboratory automation. Furthermore, output evaluation does not require extensive computational analysis to derive a binding consensus. We could show that the DPI-ELISA screen disclosed the full spectrum of binding preferences for a given DBP. As an example, AtWRKY11 was used to demonstrate that the automated DPI-ELISA screen revealed the entire range of in vitro binding preferences. In addition, protein extracts of AtbZIP63 and the DNA-binding domain of AtWRKY33 were analyzed, which led to a refinement of their known DNA-binding consensus sequences. Finally, we performed a DPI-ELISA screen to disclose the DNA-binding consensus of a yet uncharacterized putative DBP, AtTIFY1. A palindromic TGATCA-consensus was uncovered and we could show that the GATC-core is compulsory for AtTIFY1 binding. This specific interaction between AtTIFY1 and its DNA-binding motif was confirmed by in vivo plant one-hybrid assays in protoplasts. Thus, the DPI-ELISA screen for de novo binding site identification of DBPs, also under automated conditions, is a promising approach for a deeper understanding of gene regulation in any organism of choice.

Source
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0075177
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3795721
July 2014

A toxicogenomic approach for the prediction of murine hepatocarcinogenesis using ensemble feature selection.

PLoS One 2013 10;8(9):e73938. Epub 2013 Sep 10.

Center for Bioinformatics Tuebingen (ZBIT), University of Tuebingen, Tübingen, Germany.

The current strategy for identifying the carcinogenicity of drugs involves the 2-year bioassay in male and female rats and mice. As this assay is cost-intensive and time-consuming, there is a high interest in developing approaches for the screening and prioritization of drug candidates in preclinical safety evaluations. Predictive models based on toxicogenomics investigations after short-term exposure have shown their potential for assessing the carcinogenic risk. In this study, we investigated a novel method for the evaluation of toxicogenomics data based on ensemble feature selection in conjunction with bootstrapping in order to derive reproducible and characteristic multi-gene signatures. This method was evaluated on a microarray dataset containing global gene expression data from liver samples of both male and female mice. The dataset was generated by the IMI MARCAR consortium and included gene expression profiles of genotoxic and nongenotoxic hepatocarcinogens obtained after treatment of CD-1 mice for 3 or 14 days. We developed predictive models based on gene expression data of both sexes and the models were employed for predicting the carcinogenic class of diverse compounds. Comparing the predictivity of our multi-gene signatures against signatures from literature, we demonstrated that a representative set of state-of-the-art supervised learning methods achieves, on average, slightly higher accuracy when our gene sets are used as features. The constructed models were also used for the classification of cyproterone acetate (CPA), Wy-14643 (WY) and thioacetamide (TAA), whose primary mechanism of carcinogenicity is controversially discussed. Based on the extracted mouse liver gene expression patterns, CPA would be predicted as a nongenotoxic compound. In contrast, both WY and TAA would be classified as genotoxic mouse hepatocarcinogens.
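
Ensemble feature selection in conjunction with bootstrapping can be illustrated with a small voting scheme. The sketch below is an assumption-laden toy, not the method of the study: the two scoring functions, the top-k voting rule, and all names are hypothetical, and binary labels 0/1 stand in for the carcinogenic classes.

```python
import random
from collections import Counter
from statistics import mean, pstdev


def mean_diff_score(pos, neg):
    """Absolute difference of class means."""
    return abs(mean(pos) - mean(neg))


def t_like_score(pos, neg):
    """Mean difference scaled by the within-class spread (a crude t-like statistic)."""
    spread = pstdev(pos) + pstdev(neg) + 1e-9  # small constant avoids division by zero
    return abs(mean(pos) - mean(neg)) / spread


def top_k(X, y, scorer, k):
    """Indices of the k best-scoring features for one scorer on one sample set."""
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append((scorer(pos, neg), j))
    scores.sort(reverse=True)
    return [j for _, j in scores[:k]]


def ensemble_signature(X, y, k=2, n_bootstrap=50, seed=0):
    """Vote over bootstrap resamples and scorers; return the k most-voted features."""
    rng = random.Random(seed)
    votes = Counter()
    n = len(X)
    for _ in range(n_bootstrap):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        Xb = [X[i] for i in idx]
        yb = [y[i] for i in idx]
        if len(set(yb)) < 2:
            continue  # a resample needs both classes to be scored
        for scorer in (mean_diff_score, t_like_score):
            votes.update(top_k(Xb, yb, scorer, k))
    return [feature for feature, _ in votes.most_common(k)]
```

Features that are selected consistently across resamples and scorers receive the most votes, which is one simple way to favor reproducible signatures over ones that depend on a particular sample split.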

Source
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0073938
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3769381
June 2014

Loss of mitochondrial peptidase Clpp leads to infertility, hearing loss plus growth retardation via accumulation of CLPX, mtDNA and inflammatory factors.

Hum Mol Genet 2013 Dec 12;22(24):4871-87. Epub 2013 Jul 12.

Experimental Neurology.

The caseinolytic peptidase P (CLPP) is conserved from bacteria to humans. In the mitochondrial matrix, it multimerizes and forms a macromolecular proteasome-like cylinder together with the chaperone CLPX. In spite of a known relevance for the mitochondrial unfolded protein response, its substrates and tissue-specific roles are unclear in mammals. Recessive CLPP mutations were recently observed in the human Perrault variant of ovarian failure and sensorineural hearing loss. Here, a first characterization of CLPP null mice demonstrated complete female and male infertility and auditory deficits. Disrupted spermatogenesis already at the spermatid stage and ovarian follicular differentiation failure were evident. Reduced pre-/post-natal survival and marked ubiquitous growth retardation contrasted with only light impairment of movement and respiratory activities. Interestingly, the mice showed resistance to ulcerative dermatitis. Systematic expression studies detected up-regulation of other mitochondrial chaperones, accumulation of CLPX and mtDNA as well as inflammatory factors throughout tissues. T-lymphocytes in the spleen were activated. Thus, murine Clpp deletion represents a faithful Perrault model. The disease mechanism probably involves deficient clearance of mitochondrial components and inflammatory tissue destruction.

Source
http://dx.doi.org/10.1093/hmg/ddt338
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7108587
December 2013

Inferring multi-target QSAR models with taxonomy-based multi-task learning.

J Cheminform 2013 Jul 11;5(1):33. Epub 2013 Jul 11.

Center for Bioinformatics (ZBIT), University of Tübingen, Sand 1, Tübingen 72076, Germany.

Background: A plethora of studies indicate that the development of multi-target drugs is beneficial for complex diseases like cancer. Accurate QSAR models for each of the desired targets assist the optimization of a lead candidate by the prediction of affinity profiles. Often, the targets of a multi-target drug are sufficiently similar such that, in principle, knowledge can be transferred between the QSAR models to improve the model accuracy. In this study, we present two different multi-task algorithms from the field of transfer learning that can exploit the similarity between several targets to transfer knowledge between the target specific QSAR models.

Results: We evaluated the two methods on simulated data and a data set of 112 human kinases assembled from the public database ChEMBL. The relatedness between the kinase targets was derived from the taxonomy of the human kinome. The experiments show that, on both types of data, multi-task learning increases the performance compared to training separate models, given a sufficient similarity between the tasks. On the kinase data, the best multi-task approach improved the mean squared error of the QSAR models of 58 kinase targets.

Conclusions: Multi-task learning is a valuable approach for inferring multi-target QSAR models for lead optimization. The application of multi-task learning is most beneficial if knowledge can be transferred from a similar task with a lot of in-domain knowledge to a task with little in-domain knowledge. Furthermore, the benefit increases with a decreasing overlap between the chemical space spanned by the tasks.

Source
http://dx.doi.org/10.1186/1758-2946-5-33
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4104930
July 2013

The systems biology simulation core algorithm.

BMC Syst Biol 2013 Jul 5;7:55. Epub 2013 Jul 5.

Center for Bioinformatics Tuebingen (ZBIT), University of Tuebingen, Tübingen, Germany.

Background: With the increasing availability of high dimensional time course data for metabolites, genes, and fluxes, the mathematical description of dynamical systems has become an essential aspect of research in systems biology. Models are often encoded in formats such as SBML, whose structure is very complex and difficult to evaluate due to many special cases.

Results: This article describes an efficient algorithm to solve SBML models that are interpreted in terms of ordinary differential equations. We begin our consideration with a formal representation of the mathematical form of the models and explain all parts of the algorithm in detail, including several preprocessing steps. We provide a flexible reference implementation as part of the Systems Biology Simulation Core Library, a community-driven project providing a large collection of numerical solvers and a sophisticated interface hierarchy for the definition of custom differential equation systems. To demonstrate the capabilities of the new algorithm, it has been tested with the entire SBML Test Suite and all models of BioModels Database.

Conclusions: The formal description of the mathematics behind the SBML format facilitates the implementation of the algorithm within specifically tailored programs. The reference implementation can be used as a simulation backend for Java™-based programs. Source code, binaries, and documentation can be freely obtained under the terms of the LGPL version 3 from http://simulation-core.sourceforge.net. Feature requests, bug reports, contributions, or any further discussion can be directed to the mailing list simulation-core-development@lists.sourceforge.net.
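
To make the idea of interpreting a model as a system of ordinary differential equations concrete, here is a minimal sketch that integrates a single mass-action reaction A → B with a fixed-step fourth-order Runge-Kutta scheme. It is written in Python for illustration and does not reproduce the API of the Java-based Systems Biology Simulation Core Library; the function names and the choice of solver are assumptions.

```python
def rk4_step(f, t, y, h):
    """Advance the state y by one classical Runge-Kutta step of size h."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [
        yi + h / 6 * (a + 2 * b + 2 * c + d)
        for yi, a, b, c, d in zip(y, k1, k2, k3, k4)
    ]


def simulate(f, y0, t_end, steps):
    """Integrate dy/dt = f(t, y) from t = 0 to t_end with equal steps."""
    h = t_end / steps
    t, y = 0.0, list(y0)
    for _ in range(steps):
        y = rk4_step(f, t, y, h)
        t += h
    return y


def mass_action_rhs(k):
    """Right-hand side for A -> B with rate constant k (state is [A, B])."""
    def rhs(t, y):
        a, _b = y
        rate = k * a
        return [-rate, rate]
    return rhs
```

For example, `simulate(mass_action_rhs(0.5), [1.0, 0.0], 2.0, 200)` approximates the analytical solution A(t) = exp(-k t) at t = 2. A full SBML interpreter would additionally have to handle events, rules, and the other special cases mentioned in the abstract, which this sketch omits.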

Source
http://dx.doi.org/10.1186/1752-0509-7-55
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3707837
July 2013

Applicability of different hydraulic parameters to describe soil detachment in eroding rills.

PLoS One 2013 24;8(5):e64861. Epub 2013 May 24.

Department of Physical Geography, Trier University, Trier, Germany.

This study presents the comparison of experimental results with assumptions used in numerical models. The aim of the field experiments is to test the linear relationship between different hydraulic parameters and soil detachment. For example, correlations between shear stress, unit length shear force, stream power, unit stream power and effective stream power and the detachment rate do not reveal a single parameter which consistently displays the best correlation. More importantly, the best fit does not only vary from one experiment to another, but even between distinct measurement points. Different processes in rill erosion are responsible for the changing correlations. However, not all these processes are considered in soil erosion models. Hence, hydraulic parameters alone are not sufficient to predict detachment rates. They predict the fluvial incising in the rill's bottom, but the main sediment sources are not considered sufficiently in their equations. The results of this study show that there is still a lack of understanding of the physical processes underlying soil erosion. Exerted forces, soil stability and its expression, and the abstraction of the detachment and transport processes in shallow flowing water remain poorly described, and their dependencies remain unclear.

Source
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0064861
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3663750
November 2013

Optimization and visualization of the edge weights in optimal assignment methods for virtual screening.

BioData Min 2013 Mar 26;6(1). Epub 2013 Mar 26.

University of Tübingen, Center for Bioinformatics (ZBIT), Sand 1, 72076 Tübingen, Germany.

Background: Ligand-based virtual screening plays a fundamental part in the early drug discovery stage. In a virtual screening, a chemical library is searched for molecules with similar properties to a query molecule by means of a similarity function. The optimal assignment of chemical graphs has proven to be a valuable similarity function for many cheminformatic tasks, such as virtual screening. The optimal assignment assumes all atoms of a query molecule to be equally important, which is not realistic depending on the binding mode of a ligand. The importance of a query molecule's atoms can be integrated in the optimal assignment by weighting the assignment edges. We optimized the edge weights with respect to the virtual screening performance by means of evolutionary algorithms. Furthermore, we propose a visualization approach for the interpretation of the edge weights.

Results: We evaluated two different evolutionary algorithms, differential evolution and particle swarm optimization, for their suitability for optimizing the assignment edge weights. The results showed that both optimization methods are suited to optimize the edge weights. Furthermore, we compared our approach to the optimal assignment with equal edge weights and two literature similarity functions on a subset of the Directory of Useful Decoys using sophisticated virtual screening performance metrics. Our approach achieved a considerably better overall and early enrichment performance. The visualization of the edge weights enables the identification of substructures that are important for a good retrieval of ligands and for the binding to the protein target.

Conclusions: The optimization of the edge weights in optimal assignment methods is a valuable approach for ligand-based virtual screening experiments. The approach can be applied to any similarity function that employs the optimal assignment method, which includes a variety of similarity measures that have proven to be valuable in various cheminformatic tasks. The proposed visualization helps to get a better understanding of the binding mode of the analyzed query molecule.
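
The effect of weighting assignment edges can be illustrated with a toy optimal assignment. The sketch below assumes one weight per query atom and searches all assignments exhaustively, which is only feasible for very small molecules; practical implementations typically use a polynomial-time method such as the Hungarian algorithm. The function name and the weighting scheme are hypothetical and are not taken from the paper.

```python
from itertools import permutations


def weighted_optimal_assignment(sim, weights):
    """Find the assignment of query atoms to candidate atoms with maximal weighted similarity.

    sim[i][j] is the similarity of query atom i and candidate atom j;
    weights[i] is the assumed importance of query atom i.
    Returns (best_score, assignment), where assignment[i] is the index of the
    candidate atom assigned to query atom i.
    """
    n_query, n_cand = len(sim), len(sim[0])
    if n_query > n_cand:
        raise ValueError("the candidate needs at least as many atoms as the query")
    best_score, best = float("-inf"), None
    # Exhaustive search over all injective mappings (factorial cost).
    for perm in permutations(range(n_cand), n_query):
        score = sum(weights[i] * sim[i][j] for i, j in enumerate(perm))
        if score > best_score:
            best_score, best = score, perm
    return best_score, list(best)
```

With `sim = [[0.9, 0.8], [0.8, 0.1]]`, equal weights favor the crossed assignment, whereas down-weighting the second query atom (for instance `weights = [1.0, 0.1]`) switches the optimum to the assignment that matches the first atom best. This is the kind of change in ranking behavior that optimized weights are meant to exploit.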

Source
http://dx.doi.org/10.1186/1756-0381-6-7
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3639874
March 2013