Publications by authors named "Stephan Aiche"

14 Publications


ProteomicsDB: a multi-omics and multi-organism resource for life science research.

Nucleic Acids Res 2020 01;48(D1):D1153-D1163

Chair of Proteomics and Bioanalytics, Technical University of Munich (TUM), Freising, Bavaria, Germany.

ProteomicsDB (https://www.ProteomicsDB.org) started as a protein-centric in-memory database for the exploration of large collections of quantitative mass spectrometry-based proteomics data. Its data types and contents have grown over time to include RNA-Seq expression data, drug-target interactions and cell line viability data. In this manuscript, we summarize new developments since the previous update, published in Nucleic Acids Research in 2017. Over the past two years, we have enriched the data content with additional datasets and extended the platform to support protein turnover data. Another important addition is that ProteomicsDB now supports the storage and visualization of data collected from other organisms, exemplified by Arabidopsis thaliana. Owing to the generic design of ProteomicsDB, all analytical features available for the original human resource transfer seamlessly to other organisms. Furthermore, we introduce a new service in ProteomicsDB which allows users to upload their own expression datasets and analyze them alongside the data stored in ProteomicsDB. Initially, users will be able to make use of this feature in the interactive heat map functionality as well as in drug sensitivity prediction, but ultimately they will be able to use all analytical features of ProteomicsDB in this way.
DOI: http://dx.doi.org/10.1093/nar/gkz974
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7145565

Prosit: proteome-wide prediction of peptide tandem mass spectra by deep learning.

Nat Methods 2019 06 27;16(6):509-518. Epub 2019 May 27.

Chair of Proteomics and Bioanalytics, Technical University of Munich, Freising, Germany.

In mass-spectrometry-based proteomics, the identification and quantification of peptides and proteins heavily rely on sequence database searching or spectral library matching. The lack of accurate predictive models for fragment ion intensities impairs the realization of the full potential of these approaches. Here, we extended the ProteomeTools synthetic peptide library to 550,000 tryptic peptides and 21 million high-quality tandem mass spectra. We trained a deep neural network, termed Prosit, resulting in chromatographic retention time and fragment ion intensity predictions that exceed the quality of the experimental data. Integrating Prosit into database search pipelines led to more identifications at >10× lower false discovery rates. We show the general applicability of Prosit by predicting spectra for proteases other than trypsin, generating spectral libraries for data-independent acquisition and improving the analysis of metaproteomes. Prosit is integrated into ProteomicsDB, allowing search result re-scoring and custom spectral library generation for any organism on the basis of peptide sequence alone.
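Prediction quality in this setting is typically judged by comparing predicted and observed fragment ion intensities. Below is a minimal sketch of one widely used similarity measure, the normalized spectral contrast angle; this is my own illustration of the metric, not code from the paper, and it assumes the two intensity vectors have already been aligned to the same fragment-ion order.

```python
import math

def spectral_angle(observed, predicted):
    """Normalized spectral contrast angle between two intensity vectors.

    Both vectors are L2-normalized; identical spectra score 1.0 and
    orthogonal spectra score 0.0. Peak matching is assumed to have
    happened already (same fragment-ion order in both lists).
    """
    norm_o = math.sqrt(sum(x * x for x in observed))
    norm_p = math.sqrt(sum(x * x for x in predicted))
    if norm_o == 0 or norm_p == 0:
        return 0.0
    dot = sum(o * p for o, p in zip(observed, predicted)) / (norm_o * norm_p)
    dot = max(-1.0, min(1.0, dot))  # guard against floating-point drift
    return 1.0 - 2.0 * math.acos(dot) / math.pi

# identical spectra score (close to) 1.0
print(round(spectral_angle([1.0, 0.5, 0.2], [1.0, 0.5, 0.2]), 6))
```

Because the angle is taken on normalized vectors, the measure is insensitive to absolute intensity scale, which is why it is popular for comparing predicted with measured spectra.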
DOI: http://dx.doi.org/10.1038/s41592-019-0426-7

Smart Medical Information Technology for Healthcare (SMITH).

Methods Inf Med 2018 07 17;57(S 01):e92-e105. Epub 2018 Jul 17.

Introduction: This article is part of the Focus Theme of Methods of Information in Medicine on the German Medical Informatics Initiative. "Smart Medical Information Technology for Healthcare (SMITH)" is one of four consortia funded by the German Medical Informatics Initiative (MI-I) to create an alliance of universities, university hospitals, research institutions and IT companies. SMITH's goals are to establish Data Integration Centers (DICs) at each SMITH partner hospital and to implement use cases which demonstrate the usefulness of the approach.

Objectives: To give insight into architectural design issues underlying SMITH data integration and to introduce the use cases to be implemented.

Governance And Policies: SMITH implements a federated approach both for its governance structure and for its information system architecture. SMITH has designed a generic concept for its data integration centers, which share identical services and functionalities to take full advantage of the planned interoperability architectures and of the data use and access process. The DICs provide access to the local hospitals' Electronic Medical Records (EMR), based on data trustee and privacy management services. DIC staff will curate and amend EMR data in the Health Data Storage.

Methodology And Architectural Framework: To share medical and research data, SMITH's information system is based on communication and storage standards. We use the Reference Model of the Open Archival Information System and will consistently implement profiles of Integrating the Health Care Enterprise (IHE) and Health Level Seven (HL7) standards. Standard terminologies will be applied. The SMITH Market Place will be used for devising agreements on data access and distribution. 3LGM for enterprise architecture modeling supports a consistent development process. The DIC reference architecture determines the services, applications and the standards-based communication links needed to efficiently support the ingesting, data nourishing, trustee, privacy management and data transfer tasks of the SMITH DICs. The reference architecture is adopted at the local sites. Data sharing services and the market place enable interoperability.

Use Cases: The methodological use case "Phenotype Pipeline" (PheP) constructs algorithms for annotations and analyses of patient-related phenotypes according to classification rules or statistical models based on structured data. Unstructured textual data will be subject to natural language processing to permit integration into the phenotyping algorithms. The clinical use case "Algorithmic Surveillance of ICU Patients" (ASIC) focuses on patients in Intensive Care Units (ICUs) with acute respiratory distress syndrome (ARDS). A model-based decision-support system will give advice on mechanical ventilation. The clinical use case HELP develops a "hospital-wide electronic medical record-based computerized decision support system to improve outcomes of patients with blood-stream infections" (HELP). ASIC and HELP use the PheP. The clinical benefit of the use cases ASIC and HELP will be demonstrated in a change-of-care clinical trial based on a stepped-wedge design.

Discussion: SMITH's strength is the modular, reusable IT architecture based on interoperability standards, the integration of the hospitals' information management departments and the public-private partnership. The project aims at sustainability beyond the first 4-year funding period.
DOI: http://dx.doi.org/10.3414/ME18-02-0004
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6193398

The target landscape of clinical kinase drugs.

Science 2017 12;358(6367)

Center of Thoracic Surgery, Krefeld, Germany.

Kinase inhibitors are important cancer therapeutics. Polypharmacology is commonly observed, requiring thorough target deconvolution to understand drug mechanism of action. Using chemical proteomics, we analyzed the target spectrum of 243 clinically evaluated kinase drugs. The data revealed previously unknown targets for established drugs, offered a perspective on the "druggable" kinome, highlighted (non)kinase off-targets, and suggested potential therapeutic applications. Integration of phosphoproteomic data refined drug-affected pathways, identified response markers, and strengthened rationale for combination treatments. We exemplify translational value by discovering SIK2 (salt-inducible kinase 2) inhibitors that modulate cytokine production in primary cells, by identifying drugs against the lung cancer survival marker MELK (maternal embryonic leucine zipper kinase), and by repurposing cabozantinib to treat FLT3-ITD-positive acute myeloid leukemia. This resource, available via the ProteomicsDB database, should facilitate basic, clinical, and drug discovery research and aid clinical decision-making.
DOI: http://dx.doi.org/10.1126/science.aan4368
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6542668

ProteomicsDB.

Nucleic Acids Res 2018 01;46(D1):D1271-D1281

Chair of Proteomics and Bioanalytics, Technical University of Munich (TUM), Freising, 85354 Bavaria, Germany.

ProteomicsDB (https://www.ProteomicsDB.org) is a protein-centric in-memory database for the exploration of large collections of quantitative mass spectrometry-based proteomics data. ProteomicsDB was first released in 2014 to enable the interactive exploration of the first draft of the human proteome. To date, it contains quantitative data from 78 projects totalling over 19k LC-MS/MS experiments. A standardized analysis pipeline enables comparisons between multiple datasets to facilitate the exploration of protein expression across hundreds of tissues, body fluids and cell lines. We recently extended the data model to enable the storage and integrated visualization of other quantitative omics data. This includes transcriptomics data from e.g. NCBI GEO, protein-protein interaction information from STRING, functional annotations from KEGG, drug-sensitivity/selectivity data from several public sources and reference mass spectra from the ProteomeTools project. The extended functionality transforms ProteomicsDB into a multi-purpose resource connecting quantification and meta-data for each protein. The rich user interface helps researchers to navigate all data sources in either a protein-centric or multi-protein-centric manner. Several options are available to download data manually, while our application programming interface enables accessing quantitative data systematically.
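As a sketch of what programmatic access to such a resource looks like, the snippet below assembles an OData-style query URL for retrieving quantitative expression values. The base path follows ProteomicsDB's published URL scheme, but the service name, input parameters, and result fields used here are illustrative placeholders, not the documented API; consult the resource itself for the actual service definitions.

```python
# Base path and endpoint/parameter names below are illustrative placeholders,
# not the documented ProteomicsDB API.
BASE = "https://www.proteomicsdb.org/proteomicsdb/logic/api"

def build_expression_query(endpoint, params, fields):
    """Assemble an OData-style GET URL for a quantitative-data query."""
    args = ",".join(f"{k}='{v}'" for k, v in params.items())
    query = "$select=" + ",".join(fields) + "&$format=json"
    return f"{BASE}/{endpoint}/InputParams({args})/Results?{query}"

url = build_expression_query(
    "proteinexpression.xsodata",            # placeholder service name
    {"PROTEINFILTER": "P00533"},            # UniProt accession for EGFR
    ["TISSUE_NAME", "NORMALIZED_INTENSITY"],
)
print(url)
```

The `InputParams(...)/Results` pattern with `$select` and `$format` options is the general shape of OData service calls; only the concrete names would need to be replaced with the ones the API actually exposes.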
DOI: http://dx.doi.org/10.1093/nar/gkx1029
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5753189

Building ProteomeTools based on a complete synthetic human proteome.

Nat Methods 2017 03 30;14(3):259-262. Epub 2017 Jan 30.

Chair of Proteomics and Bioanalytics, Technical University of Munich, Freising, Germany.

We describe ProteomeTools, a project building molecular and digital tools from the human proteome to facilitate biomedical research. Here we report the generation and multimodal liquid chromatography-tandem mass spectrometry analysis of >330,000 synthetic tryptic peptides representing essentially all canonical human gene products, and we exemplify the utility of these data in several applications. The resource (available at http://www.proteometools.org) will be extended to >1 million peptides, and all data will be shared with the community via ProteomicsDB and ProteomeXchange.
DOI: http://dx.doi.org/10.1038/nmeth.4153
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5868332

OpenMS: a flexible open-source software platform for mass spectrometry data analysis.

Nat Methods 2016 08;13(9):741-8

Department of Computer Science, University of Tübingen, Tübingen, Germany.

High-resolution mass spectrometry (MS) has become an important tool in the life sciences, contributing to the diagnosis and understanding of human diseases, elucidating biomolecular structural information and characterizing cellular signaling networks. However, the rapid growth in the volume and complexity of MS data makes transparent, accurate and reproducible analysis difficult. We present OpenMS 2.0 (http://www.openms.de), a robust, open-source, cross-platform software framework specifically designed for the flexible and reproducible analysis of high-throughput MS data. The extensible OpenMS software implements common mass spectrometric data processing tasks through a well-defined application programming interface in C++ and Python and through standardized open data formats. OpenMS additionally provides a set of 185 tools and ready-made workflows for common mass spectrometric data processing tasks, which enable users to perform complex quantitative mass spectrometric analyses with ease.
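To illustrate the kind of processing task such a framework implements, here is a deliberately simplified centroiding (peak-picking) routine on profile data. This is a stand-in for illustration only, not the OpenMS implementation or its pyOpenMS API; real peak pickers fit peak shapes and estimate centroid m/z with sub-sampling accuracy.

```python
def pick_peaks(mz, intensity, threshold=0.0):
    """Report local intensity maxima above a noise threshold as centroid peaks.

    A deliberately simplified stand-in for profile-to-centroid conversion;
    production peak pickers fit peak shapes rather than taking the raw
    local-maximum sample point.
    """
    peaks = []
    for i in range(1, len(intensity) - 1):
        if (intensity[i] > threshold
                and intensity[i] >= intensity[i - 1]
                and intensity[i] > intensity[i + 1]):
            peaks.append((mz[i], intensity[i]))
    return peaks

mz = [100.00, 100.01, 100.02, 100.03, 100.04]
inten = [10.0, 80.0, 150.0, 70.0, 5.0]
print(pick_peaks(mz, inten, threshold=50.0))  # → [(100.02, 150.0)]
```

Note the asymmetric comparison (`>=` on the left neighbor, `>` on the right) so that a plateau of equal intensities yields exactly one peak.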
DOI: http://dx.doi.org/10.1038/nmeth.3959

From the desktop to the grid: scalable bioinformatics via workflow conversion.

BMC Bioinformatics 2016 Mar 12;17:127. Epub 2016 Mar 12.

Center for Bioinformatics and Dept. of Computer Science, University of Tübingen, Sand 14, Tübingen, 72070, Germany.

Background: Reproducibility is one of the tenets of the scientific method. Scientific experiments often comprise complex data flows, selection of adequate parameters, and analysis and visualization of intermediate and end results. Breaking down the complexity of such experiments into the joint collaboration of small, repeatable, well-defined tasks, each with well-defined inputs, parameters, and outputs, offers immediate benefits such as identifying bottlenecks and pinpointing sections that could benefit from parallelization. Workflows rest upon the notion of splitting complex work into the joint effort of several manageable tasks. Several engines give users the ability to design and execute workflows. Each engine was created to address the problems of a specific community, so each has its advantages and shortcomings. Furthermore, not all features of all workflow engines are royalty-free, an aspect that could potentially drive away members of the scientific community.

Results: We have developed a set of tools that enables the scientific community to benefit from workflow interoperability. We developed a platform-free structured representation of the parameters, inputs, and outputs of command-line tools in so-called Common Tool Descriptor documents. We have also overcome the shortcomings and combined the features of two royalty-free workflow engines with a substantial user community: the Konstanz Information Miner, an engine which we see as a formidable workflow editor, and the Grid and User Support Environment, a web-based framework able to interact with several high-performance computing resources. We have thus created a free and highly accessible way to design workflows on a desktop computer and execute them on high-performance computing resources.
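A Common Tool Descriptor is, in essence, a structured XML description of a command-line tool's interface. The sketch below generates a CTD-like document with the standard library; the element and attribute names are illustrative rather than the exact CTD schema, so treat it as a shape of the idea, not a schema-valid descriptor.

```python
import xml.etree.ElementTree as ET

def make_tool_descriptor(name, version, params):
    """Build a minimal CTD-like XML document describing a command-line tool.

    Element and attribute names are illustrative, not the exact CTD schema;
    see the CTD specification used by the workflow-conversion tools for the
    authoritative layout.
    """
    tool = ET.Element("tool", name=name, version=version)
    plist = ET.SubElement(tool, "PARAMETERS")
    for pname, ptype, pvalue in params:
        ET.SubElement(plist, "ITEM", name=pname, type=ptype, value=pvalue)
    return ET.tostring(tool, encoding="unicode")

xml_doc = make_tool_descriptor(
    "PeakPicker", "1.0",
    [("in", "input-file", "spectra.mzML"),
     ("out", "output-file", "picked.mzML"),
     ("signal_to_noise", "double", "1.0")],
)
print(xml_doc)
```

Because the descriptor is plain XML, any workflow engine can parse it and render the same tool as a node in its own canvas, which is the crux of the interoperability argument above.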

Conclusions: Our work will not only reduce time spent on designing scientific workflows, but also make executing workflows on remote high-performance computing resources more accessible to technically inexperienced users. We strongly believe that our efforts not only decrease the turnaround time to obtain scientific results but also have a positive impact on reproducibility, thus elevating the quality of obtained scientific results.
DOI: http://dx.doi.org/10.1186/s12859-016-0978-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4788856

Evaluation of drug-induced neurotoxicity based on metabolomics, proteomics and electrical activity measurements in complementary CNS in vitro models.

Toxicol In Vitro 2015 Dec 27;30(1 Pt A):138-65. Epub 2015 May 27.

European Commission Joint Research Centre, Institute for Health and Consumer Protection, I-21027 Ispra, VA, Italy.

The present study was performed in an attempt to develop an in vitro integrated testing strategy (ITS) to evaluate drug-induced neurotoxicity. A number of endpoints were analyzed using two complementary brain cell culture models and an in vitro blood-brain barrier (BBB) model after single and repeated exposure treatments with selected drugs that covered the major biological, pharmacological and neurotoxicological responses. Furthermore, four drugs (diazepam, cyclosporine A, chlorpromazine and amiodarone) were tested in more depth as representatives of different classes of neurotoxicants, inducing toxicity through different pathways. The developed in vitro BBB model allowed detection of toxic effects at the level of the BBB and evaluation of drug transport through the barrier for predicting free brain concentrations of the studied drugs. The measurement of neuronal electrical activity was found to be a sensitive tool to predict the neuroactivity and neurotoxicity of drugs after acute exposure. The histotypic 3D re-aggregating brain cell cultures, containing all brain cell types, were found to be well suited for omics analyses after both acute and long-term treatment. The obtained data suggest that an in vitro ITS based on the information obtained from BBB studies, combined with metabolomics, proteomics and neuronal electrical activity measurements performed in stable in vitro neuronal cell culture systems, has high potential to improve current in vitro drug-induced neurotoxicity evaluation.
DOI: http://dx.doi.org/10.1016/j.tiv.2015.05.016

Workflows for automated downstream data analysis and visualization in large-scale computational mass spectrometry.

Proteomics 2015 Apr 14;15(8):1443-7. Epub 2015 Feb 14.

Department of Mathematics and Computer Science, Freie Universität Berlin, Berlin, Germany.

MS-based proteomics and metabolomics are rapidly evolving research fields driven by the development of novel instruments, experimental approaches, and analysis methods. Monolithic analysis tools perform well on single tasks but lack the flexibility to cope with the constantly changing requirements and experimental setups. Workflow systems, which combine small processing tools into complex analysis pipelines, allow custom-tailored and flexible data-processing workflows that can be published or shared with collaborators. In this article, we present the integration of established tools for computational MS from the open-source software framework OpenMS into the workflow engine Konstanz Information Miner (KNIME) for the analysis of large datasets and production of high-quality visualizations. We provide example workflows to demonstrate combined data processing and visualization for three diverse tasks in computational MS: isobaric mass tag based quantitation in complex experimental setups, label-free quantitation and identification of metabolites, and quality control for proteomics experiments.
DOI: http://dx.doi.org/10.1002/pmic.201400391
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4415483

Mechanism of cisplatin proximal tubule toxicity revealed by integrating transcriptomics, proteomics, metabolomics and biokinetics.

Toxicol In Vitro 2015 Dec 23;30(1 Pt A):117-27. Epub 2014 Oct 23.

Division of Physiology, Department of Physiology and Medical Physics, Medical University of Innsbruck, Innsbruck 6020, Austria.

Cisplatin is one of the most widely used chemotherapeutic agents for the treatment of solid tumours. The major dose-limiting factor is nephrotoxicity, in particular in the proximal tubule. Here, we use an integrated omics approach, including transcriptomics, proteomics and metabolomics coupled to biokinetics, to identify cell stress response pathways induced by cisplatin. The human renal proximal tubular cell line RPTEC/TERT1 was treated with sub-cytotoxic concentrations of cisplatin (0.5 and 2 μM) in a daily repeat-dose treatment regimen for up to 14 days. Biokinetic analysis showed that cisplatin was taken up from the basolateral compartment, transported to the apical compartment, and accumulated in cells over time. This is in line with basolateral uptake of cisplatin via organic cation transporter 2 and bioactivation via gamma-glutamyl transpeptidase located on the apical side of proximal tubular cells. Cisplatin affected several pathways including p53 signalling, Nrf2-mediated oxidative stress response, mitochondrial processes, and mTOR and AMPK signalling. In addition, we identified novel pathways changed by cisplatin, including eIF2 signalling, actin nucleation via the ARP/WASP complex and regulation of cell polarization. In conclusion, using an integrated omics approach together with biokinetics, we have identified both novel and established mechanisms of cisplatin toxicity.
DOI: http://dx.doi.org/10.1016/j.tiv.2014.10.006

qcML: an exchange format for quality control metrics from mass spectrometry experiments.

Mol Cell Proteomics 2014 Aug 23;13(8):1905-13. Epub 2014 Apr 23.

Department of Medical Protein Research, VIB, B-9000 Ghent, Belgium; Department of Biochemistry, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium.

Quality control is increasingly recognized as a crucial aspect of mass spectrometry-based proteomics. Several recent papers discuss relevant parameters for quality control and present applications to extract these from the instrumental raw data. What has been missing, however, is a standard data exchange format for reporting these performance metrics. We therefore developed the qcML format, an XML-based standard that follows the design principles of the related mzML, mzIdentML, mzQuantML, and TraML standards from the HUPO-PSI (Proteomics Standards Initiative). In addition to the XML format, we also provide tools for the calculation of a wide range of quality metrics as well as a database format and interconversion tools, so that existing LIMS can easily add relational storage of the quality control data to their existing schema. Here we describe the qcML specification, along with possible use cases and an illustrative example of the subsequent analysis possibilities. All information about qcML is available at http://code.google.com/p/qcml.
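As an illustration of the kind of metric qcML carries, the sketch below computes a typical instrument-performance metric (the fraction of MS2 spectra that yielded an identification) and wraps it in a qualityParameter-style XML element. The element and attribute names approximate rather than reproduce the qcML schema, so this is a shape of the idea, not a schema-valid document.

```python
import xml.etree.ElementTree as ET

def id_rate(n_ms2, n_identified):
    """Fraction of acquired MS2 spectra that led to a peptide identification,
    a common quality-control metric for an LC-MS/MS run."""
    return n_identified / n_ms2 if n_ms2 else 0.0

def to_qcml_parameter(name, value):
    """Wrap a metric in a qualityParameter-style XML element.

    Element and attribute names here approximate, but are not guaranteed
    to match, the actual qcML schema.
    """
    el = ET.Element("qualityParameter", name=name, value=str(value))
    return ET.tostring(el, encoding="unicode")

rate = id_rate(n_ms2=12000, n_identified=4800)
print(rate)  # → 0.4
print(to_qcml_parameter("MS2 identification rate", rate))
```

Storing such metrics in a shared XML vocabulary is what makes runs comparable across labs and instruments, which is the point of the standard described above.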
DOI: http://dx.doi.org/10.1074/mcp.M113.035907
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4125725

Inferring proteolytic processes from mass spectrometry time series data using degradation graphs.

PLoS One 2012 17;7(7):e40656. Epub 2012 Jul 17.

Department of Mathematics and Computer Science, Freie Universität Berlin, Berlin, Germany.

Background: Proteases play an essential part in a variety of biological processes. Besides their importance under healthy conditions, they are also known to play a crucial role in complex diseases like cancer. In recent years, it has been shown that not only the fragments produced by proteases but also their dynamics, especially ex vivo, can serve as biomarkers. But so far, only a few approaches have been taken to explicitly model the dynamics of proteolysis in the context of mass spectrometry.

Results: We introduce a new concept to model proteolytic processes, the degradation graph. The degradation graph is an extension of the cleavage graph, a data structure to reconstruct and visualize the proteolytic process. In contrast to previous approaches we extended the model to incorporate endoproteolytic processes and present a method to construct a degradation graph from mass spectrometry time series data. Based on a degradation graph and the intensities extracted from the mass spectra it is possible to estimate reaction rates of the underlying processes. We further suggest a score to rate different degradation graphs in their ability to explain the observed data. This score is used in an iterative heuristic to improve the structure of the initially constructed degradation graph.

Conclusion: We show that the proposed method is able to recover all degraded and generated peptides, the underlying reactions, and the reaction rates of proteolytic processes based on mass spectrometry time series data. We use simulated and real data to demonstrate that a given process can be reconstructed even in the presence of extensive noise, isobaric signals and false identifications. While the model is currently only validated on peptide data it is also applicable to proteins, as long as the necessary time series data can be produced.
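To make the forward model concrete, the sketch below simulates first-order proteolysis on a small hypothetical degradation graph by forward-Euler integration. The peptide names and rates are invented for illustration, and note the direction: the paper estimates reaction rates from observed intensities, i.e. the inverse of what is simulated here.

```python
def simulate_degradation(edges, amounts, dt=0.01, steps=1000):
    """Forward-Euler simulation of first-order proteolysis on a degradation
    graph. `edges` maps a substrate peptide to (rate, [product peptides]);
    each substrate decays as d[S]/dt = -k[S], and each cleavage product
    gains one copy per cleaved substrate molecule.
    """
    for _ in range(steps):
        deltas = {p: 0.0 for p in amounts}
        for substrate, (k, products) in edges.items():
            flux = k * amounts[substrate] * dt
            deltas[substrate] -= flux
            for p in products:
                deltas[p] += flux
        for p in amounts:
            amounts[p] += deltas[p]
    return amounts

# hypothetical graph: ABCD is cleaved into AB + CD, and CD further into C + D
edges = {"ABCD": (0.5, ["AB", "CD"]), "CD": (0.2, ["C", "D"])}
state = {"ABCD": 1.0, "AB": 0.0, "CD": 0.0, "C": 0.0, "D": 0.0}
final = simulate_degradation(edges, state, dt=0.01, steps=1000)
print(round(final["ABCD"], 3))  # decays toward exp(-0.5 * 10) ≈ 0.007
```

Given such a forward model, rate estimation amounts to choosing the rates (and, per the paper, the graph structure itself) that best reproduce the intensity time series observed in the mass spectra.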
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0040656
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3398944

MSSimulator: Simulation of mass spectrometry data.

J Proteome Res 2011 Jul 28;10(7):2922-9. Epub 2011 Apr 28.

Institute of Computer Science, Department of Mathematics and Computer Science, Freie Universität Berlin, Berlin, Germany.

Mass spectrometry coupled to liquid chromatography (LC-MS and LC-MS/MS) is commonly used to analyze the protein content of biological samples in large-scale studies, enabling quantitation and identification of proteins and peptides using a wide range of experimental protocols, algorithms, and statistical models to analyze the data. Currently, it is difficult to compare the plethora of algorithms for these tasks. So far, curated benchmark data exists for peptide identification algorithms, but data that represents a ground truth for the evaluation of LC-MS data is limited. Hence, there have been attempts to simulate such data in a controlled fashion to evaluate and compare algorithms. We present MSSimulator, a simulation software for LC-MS and LC-MS/MS experiments. Starting from a list of proteins in a FASTA file, the simulation performs in-silico digestion, retention time prediction, ionization filtering, and raw signal simulation (including MS/MS), while providing many options to change the properties of the resulting data, such as elution profile shape, resolution and sampling rate. Several protocols for SILAC, iTRAQ or MS(E) are available, in addition to the usual label-free approach, making MSSimulator the most comprehensive simulator for LC-MS and LC-MS/MS data.
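The first simulation step, in-silico tryptic digestion, can be sketched with the standard trypsin rule (cleave C-terminal to K or R, but not when the next residue is P). This is a minimal illustration of the rule, not MSSimulator's implementation; the example sequence is invented.

```python
import re

def tryptic_digest(sequence, missed_cleavages=0):
    """In-silico tryptic digestion: cleave C-terminal to K or R, except
    when the following residue is P (the standard trypsin rule used by
    most simulators and search engines). Optionally also emits peptides
    with up to `missed_cleavages` internal cleavage sites.
    """
    # positions immediately after which trypsin cuts
    sites = [m.end() for m in re.finditer(r"[KR](?!P)", sequence)]
    bounds = [0] + sites
    if not sites or sites[-1] != len(sequence):
        bounds.append(len(sequence))
    fragments = [sequence[a:b] for a, b in zip(bounds, bounds[1:])]
    peptides = list(fragments)
    for mc in range(1, missed_cleavages + 1):
        peptides += ["".join(fragments[i:i + mc + 1])
                     for i in range(len(fragments) - mc)]
    return peptides

print(tryptic_digest("MKWVRPTALR"))  # → ['MK', 'WVRPTALR']
```

The `(?!P)` lookahead encodes the proline exception: the R at position 5 of the example is followed by P and therefore not cleaved, while the K at position 2 and the terminal R are.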
DOI: http://dx.doi.org/10.1021/pr200155f