Publications by authors named "Steffen Neumann"

72 Publications

Metabolic drift in the aging nervous system is reflected in human cerebrospinal fluid.

Sci Rep 2021 Sep 22;11(1):18822. Epub 2021 Sep 22.

Department of Medical Sciences, Clinical Chemistry, Uppsala University, 751 85, Uppsala, Sweden.

Chronic diseases affecting the central nervous system (CNS), such as Alzheimer's or Parkinson's disease, typically develop with advanced chronological age. Yet aging at the metabolic level has been explored only sporadically in humans using biofluids in close proximity to the CNS, such as the cerebrospinal fluid (CSF). We used an untargeted liquid chromatography high-resolution mass spectrometry (LC-HRMS) based metabolomics approach to measure the levels of metabolites in the CSF of non-neurological control subjects aged 20 to 74. Using a random forest-based feature selection strategy, we extracted 69 features that were strongly related to age (p < 0.001, r = 0.762, R = 0.764). Combining an in-house library of known substances with in silico chemical classification and functional semantic annotation, we successfully assigned putative annotations to 59 out of the 69 CSF metabolites. We found alterations in metabolites related to the Cytochrome P450 system, perturbations in the tryptophan and kynurenine pathways, metabolites associated with cellular energy (NAD+, ADP), mitochondrial and ribosomal metabolisms, neurological dysfunction, and an increase of adverse microbial metabolites. Taken together, our results point to a key role for CSF metabolites related to the Cytochrome P450 system, which were the ones most often associated with metabolic aging.

DOI: http://dx.doi.org/10.1038/s41598-021-97491-1
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8458502
September 2021

Modulation of Phosphate Deficiency-Induced Metabolic Changes by Iron Availability in .

Int J Mol Sci 2021 Jul 16;22(14). Epub 2021 Jul 16.

Department of Molecular Signal Processing, Leibniz Institute of Plant Biochemistry, Weinberg 3, D-06120 Halle, Germany.

Concurrent suboptimal supply of several nutrients requires the coordination of nutrient-specific transcriptional, phenotypic, and metabolic changes in plants in order to optimize growth and development in most agricultural and natural ecosystems. Phosphate (P) and iron (Fe) deficiency induce overlapping but mostly opposing transcriptional and root growth responses in . On the metabolite level, P deficiency negatively modulates Fe deficiency-induced coumarin accumulation, which is controlled by Fe as well as P deficiency response regulators. Here, we report the impact of Fe availability on seedling growth under P limiting conditions and on P deficiency-induced accumulation of amino acids and organic acids, which play important roles in P use efficiency. Fe deficiency in P replete conditions hardly changed growth and metabolite profiles in roots and shoots of , but partially rescued growth under conditions of P starvation and severely modulated P deficiency-induced metabolic adjustments. Analysis of T-DNA insertion lines revealed the concerted coordination of metabolic profiles by regulators of Fe (FIT, bHLH104, BRUTUS, PYE) as well as of P (SPX1, PHR1, PHL1, bHLH32) starvation responses. The results show the interdependency of P and Fe availability and the interplay between P and Fe starvation signaling on the generation of plant metabolite profiles.

DOI: http://dx.doi.org/10.3390/ijms22147609
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8306678
July 2021

Mass spectrometry-based metabolomics: a guide for annotation, quantification and best reporting practices.

Nat Methods 2021 07 8;18(7):747-756. Epub 2021 Jul 8.

CAS Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, China.

Mass spectrometry-based metabolomics approaches can enable detection and quantification of many thousands of metabolite features simultaneously. However, compound identification and reliable quantification are greatly complicated owing to the chemical complexity and dynamic range of the metabolome. Simultaneous quantification of many metabolites within complex mixtures can additionally be complicated by ion suppression, fragmentation and the presence of isomers. Here we present guidelines covering sample preparation, replication and randomization, quantification, recovery and recombination, ion suppression and peak misidentification, as a means to enable high-quality reporting of liquid chromatography- and gas chromatography-mass spectrometry-based metabolomics-derived data.

DOI: http://dx.doi.org/10.1038/s41592-021-01197-1
July 2021

Untargeted In Silico Compound Classification-A Novel Metabolomics Method to Assess the Chemodiversity in Bryophytes.

Int J Mol Sci 2021 Mar 23;22(6). Epub 2021 Mar 23.

Bioinformatics & Scientific Data, Leibniz Institute of Plant Biochemistry, Weinberg 3, 06120 Halle (Saale), Germany.

In plant ecology, biochemical analyses of bryophytes and vascular plants are often conducted on dried herbarium specimens, as species typically grow in distant and inaccessible locations. Here, we present an automated in silico compound classification framework to annotate metabolites using an untargeted data independent acquisition (DIA)-LC/MS-QToF-sequential windowed acquisition of all theoretical fragment ion mass spectra (SWATH) ecometabolomics analytical method. We perform a comparative investigation of the chemical diversity at the global level and the composition of metabolite families in ten different species of bryophytes using fresh samples collected on-site and dried specimens stored in a herbarium for half a year. Shannon and Pielou's diversity indices, hierarchical clustering analysis (HCA), sparse partial least squares discriminant analysis (sPLS-DA), distance-based redundancy analysis (dbRDA), ANOVA with post-hoc Tukey honestly significant difference (HSD) test, and Fisher's exact test were used to determine differences in the richness and composition of metabolite families with regard to herbarium conditions, ecological characteristics, and species. We functionally annotated metabolite families to biochemical processes related to the structural integrity of membranes and cell walls (proto-lignin, glycerophospholipids, carbohydrates), chemical defense (polyphenols, steroids), reactive oxygen species (ROS) protection (alkaloids, amino acids, flavonoids), nutrition (nitrogen- and phosphate-containing glycerophospholipids), and photosynthesis. Changes in the composition of metabolite families also explained variance related to ecological functioning, such as physiological adaptations of bryophytes to dry environments (proteins, peptides, flavonoids, terpenes), light availability (flavonoids, terpenes, carbohydrates), temperature (flavonoids), and biotic interactions (steroids, terpenes).
The results from this study make it possible to construct chemical traits that can be attributed to biogeochemistry, habitat conditions, environmental changes and biotic interactions. Our classification framework accelerates the complex annotation process in metabolomics and can be used to simplify biochemical patterns. We show that compound classification is a powerful tool for exploring relationships both in molecular biology by "zooming in" and in ecology by "zooming out". The insights revealed by our framework make it possible to construct new research hypotheses and enable detailed follow-up studies.
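
The two diversity indices named in this abstract can be illustrated with a minimal sketch. It assumes the standard textbook definitions (Shannon H' = -sum(p_i ln p_i) with the natural logarithm, Pielou J = H' / ln(S) with S the number of non-empty classes); the abstract does not state which logarithm base or implementation the authors used, and the function names and counts below are hypothetical.

```python
import math


def shannon_index(abundances):
    """Shannon diversity H' = -sum(p_i * ln(p_i)) over non-zero classes.

    `abundances` is a sequence of non-negative counts, e.g. the number of
    features assigned to each metabolite family (hypothetical input).
    """
    total = sum(abundances)
    if total <= 0:
        raise ValueError("at least one abundance must be positive")
    proportions = [a / total for a in abundances if a > 0]
    return -sum(p * math.log(p) for p in proportions)


def pielou_evenness(abundances):
    """Pielou's evenness J = H' / ln(S), where S counts non-empty classes."""
    richness = sum(1 for a in abundances if a > 0)
    if richness < 2:
        # ln(1) = 0, so evenness is undefined for a single class.
        raise ValueError("evenness needs at least two non-empty classes")
    return shannon_index(abundances) / math.log(richness)


# Hypothetical counts of features per metabolite family in one sample.
family_counts = [12, 30, 7, 1]
h = shannon_index(family_counts)
j = pielou_evenness(family_counts)
```

A perfectly even sample gives J = 1, and J approaches 0 as one class dominates.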

DOI: http://dx.doi.org/10.3390/ijms22063251
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8005083
March 2021

Empowering large chemical knowledge bases for exposomics: PubChemLite meets MetFrag.

J Cheminform 2021 Mar 8;13(1):19. Epub 2021 Mar 8.

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA.

Compound (or chemical) databases are an invaluable resource for many scientific disciplines. Exposomics researchers need to find and identify relevant chemicals that cover the entirety of potential (chemical and other) exposures over entire lifetimes. This daunting task, with over 100 million chemicals in the largest chemical databases, coupled with broadly acknowledged knowledge gaps in these resources, leaves researchers faced with too much, yet not enough, information at the same time to perform comprehensive exposomics research. Furthermore, the improvements in analytical technologies and computational mass spectrometry workflows coupled with the rapid growth in databases and increasing demand for high throughput "big data" services from the research community present significant challenges for both data hosts and workflow developers. This article explores how to reduce candidate search spaces in non-target small molecule identification workflows, while increasing content usability in the context of environmental and exposomics analyses, so as to profit from the increasing size and information content of large compound databases, while increasing efficiency at the same time. In this article, these methods are explored using PubChem, the NORMAN Network Suspect List Exchange and the in silico fragmentation approach MetFrag. A subset of the PubChem database relevant for exposomics, PubChemLite, is presented as a database resource that can be (and has been) integrated into current workflows for high resolution mass spectrometry. Benchmarking datasets from earlier publications are used to show how experimental knowledge and existing datasets can be used to detect and fill gaps in compound databases to progressively improve large resources such as PubChem, and topic-specific subsets such as PubChemLite.
PubChemLite is a living collection, updating as annotation content in PubChem is updated, and exported to allow direct integration into existing workflows such as MetFrag. The source code and files necessary to recreate or adjust this are jointly hosted between the research parties (see data availability statement). This effort shows that enhancing the FAIRness (Findability, Accessibility, Interoperability and Reusability) of open resources can mutually enhance several resources for whole community benefit. The authors explicitly welcome additional community input on ideas for future developments.

DOI: http://dx.doi.org/10.1186/s13321-021-00489-0
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7938590
March 2021

LC-MS based plant metabolic profiles of thirteen grassland species grown in diverse neighbourhoods.

Sci Data 2021 02 9;8(1):52. Epub 2021 Feb 9.

Bioinformatics & Scientific Data, Leibniz Institute of Plant Biochemistry, Weinberg 3, 06120, Halle, Germany.

In plants, secondary metabolite profiles provide a unique opportunity to explore seasonal variation and responses to the environment. These include both abiotic and biotic factors. In field experiments, such stress factors occur in combination. This variation alters the plant metabolic profiles in yet uninvestigated ways. This data set contains trait and mass spectrometry data of thirteen grassland species collected at four time points in the growing season in 2017. We collected above-ground vegetative material of seven grass and six herb species that were grown in plant communities with different levels of diversity in the Jena Experiment. For each sample, we recorded visible traits and acquired shoot metabolic profiles on a UPLC-ESI-Qq-TOF-MS. We performed the raw data pre-processing in Galaxy-W4M and prepared the data for statistical analysis in R by applying missing data imputation, batch correction, and validity checks on the features. This comprehensive data set provides the opportunity to investigate environmental dynamics across diverse neighbourhoods that are reflected in the metabolomic profile.

DOI: http://dx.doi.org/10.1038/s41597-021-00836-8
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7873126
February 2021

Reshaping of the Arabidopsis thaliana Proteome Landscape and Co-regulation of Proteins in Development and Immunity.

Mol Plant 2020 12 29;13(12):1709-1732. Epub 2020 Sep 29.

Leibniz Institute of Plant Biochemistry, Biochemistry of Plant Interactions Department, Proteome Biology of Plant Interactions Research Group, Weinberg 3, Halle/Saale D-06120, Germany.

Proteome remodeling is a fundamental adaptive response, and proteins in complexes and functionally related proteins are often co-expressed. Using a deep sampling strategy we define core proteomes of Arabidopsis thaliana tissues with around 10 000 proteins per tissue, and absolutely quantify (copy numbers per cell) nearly 16 000 proteins throughout the plant lifecycle. A proteome-wide survey of global post-translational modification revealed amino acid exchanges pointing to potential conservation of translational infidelity in eukaryotes. Correlation analysis of protein abundance uncovered potentially new tissue- and age-specific roles of entire signaling modules regulating transcription in photosynthesis, seed development, and senescence and abscission. Among others, the data suggest a potential function of RD26 and other NAC transcription factors in seed development related to desiccation tolerance as well as a possible function of cysteine-rich receptor-like kinases (CRKs) as ROS sensors in senescence. All of the components of ribosome biogenesis factor (RBF) complexes were found to be co-expressed in a tissue- and age-specific manner, indicating functional promiscuity in the assembly of these less-studied protein complexes in Arabidopsis. Furthermore, we characterized detailed proteome remodeling in basal immunity by treating Arabidopsis seedlings with flg22. By simultaneously monitoring phytohormone and transcript changes upon flg22 treatment, we obtained strong evidence of suppression of jasmonate (JA) and JA-isoleucine (JA-Ile) levels by deconjugation and hydroxylation by IAA-ALA RESISTANT3 (IAR3) and JASMONATE-INDUCED OXYGENASE 2 (JOX2), respectively, under the control of JASMONATE INSENSITIVE 1 (MYC2), suggesting an unrecognized role of a new JA regulatory switch in pattern-triggered immunity.
Taken together, the datasets generated in this study present extensive coverage of the Arabidopsis proteome in various biological scenarios, providing a rich resource available to the whole plant science community.

DOI: http://dx.doi.org/10.1016/j.molp.2020.09.024
December 2020

Feature-based molecular networking in the GNPS analysis environment.

Nat Methods 2020 09 24;17(9):905-908. Epub 2020 Aug 24.

Univ. Grenoble Alpes, CNRS, Grenoble INP, CHU Grenoble Alpes, TIMC-IMAG, Grenoble, France.

Molecular networking has become a key method to visualize and annotate the chemical space in non-targeted mass spectrometry data. We present feature-based molecular networking (FBMN) as an analysis method in the Global Natural Products Social Molecular Networking (GNPS) infrastructure that builds on chromatographic feature detection and alignment tools. FBMN enables quantitative analysis and resolution of isomers, including from ion mobility spectrometry.

DOI: http://dx.doi.org/10.1038/s41592-020-0933-6
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7885687
September 2020

Chemical Diversity and Classification of Secondary Metabolites in Nine Bryophyte Species.

Metabolites 2019 Oct 11;9(10). Epub 2019 Oct 11.

Leibniz Institute of Plant Biochemistry, Bioinformatics and Scientific Data, Weinberg 3, 06120 Halle (Saale), Germany.

The central aim in ecometabolomics and chemical ecology is to pinpoint chemical features that explain molecular functioning. The greatest challenge is the identification of compounds due to the lack of constitutive reference spectra, the large number of completely unknown compounds, and bioinformatic methods to analyze the big data. In this study we present an interdisciplinary methodological framework that extends ultra-performance liquid chromatography coupled to electrospray ionization quadrupole time-of-flight mass spectrometry (UPLC/ESI-QTOF-MS) with data-dependent acquisition (DDA-MS) and the automated classification of fragment peaks into compound classes. We synthesize findings from a prior study that explored the influence of seasonal variations on the chemodiversity of secondary metabolites in nine bryophyte species. Here we reuse and extend the representative dataset with DDA-MS data. Hierarchical clustering, heatmaps, dbRDA, and ANOVA with post-hoc Tukey HSD were used to determine relationships of the study factors species, seasons, and ecological characteristics. The tested bryophytes showed species-specific metabolic responses to seasonal variations (50% vs. 5% of explained variation). , , and were biochemically most diverse and unique. Flavonoids and sesquiterpenoids were upregulated in all bryophytes in the growing seasons. We identified ecological functioning of compound classes indicating light protection (flavonoids), biotic and pathogen interactions (sesquiterpenoids, flavonoids), low temperature and desiccation tolerance (glycosides, sesquiterpenoids, anthocyanins, lactones), and moss growth supporting anatomic structures (few methoxyphenols and cinnamic acids as part of proto-lignin constituents). The reusable bioinformatic framework of this study can differentiate species based on automated compound classification. 
Our study provides detailed insights into the ecological roles of biochemical constituents of bryophytes with regard to seasonal variations. We demonstrate that compound classification can be improved by adding constitutive reference spectra to existing spectral libraries. We also show that generalization to compound classes improves our understanding of molecular ecological functioning and can be used to generate new research hypotheses.

DOI: http://dx.doi.org/10.3390/metabo9100222
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6835487
October 2019

The metaRbolomics Toolbox in Bioconductor and beyond.

Metabolites 2019 Sep 23;9(10). Epub 2019 Sep 23.

Leibniz Institute of Plant Biochemistry (IPB Halle), Bioinformatics and Scientific Data, 06120 Halle, Germany.

Metabolomics aims to measure and characterise the complex composition of metabolites in a biological system. Metabolomics studies involve sophisticated analytical techniques such as mass spectrometry and nuclear magnetic resonance spectroscopy, and generate large amounts of high-dimensional and complex experimental data. Open source processing and analysis tools are of major interest in light of innovative, open and reproducible science. The scientific community has developed a wide range of open source software, providing freely available advanced processing and analysis approaches. The programming and statistics environment R has emerged as one of the most popular environments to process and analyse Metabolomics datasets. A major benefit of such an environment is the possibility of connecting different tools into more complex workflows. Combining reusable data processing R scripts with the experimental data thus allows for open, reproducible research. This review provides an extensive overview of existing packages in R for different steps in a typical computational metabolomics workflow, including data processing, biostatistics, metabolite annotation and identification, and biochemical network and pathway analysis. Multifunctional workflows, possible user interfaces and integration into workflow management systems are also reviewed. In total, this review summarises more than two hundred metabolomics specific packages primarily available on CRAN, Bioconductor and GitHub.

DOI: http://dx.doi.org/10.3390/metabo9100200
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6835268
September 2019

Golden Mutagenesis: An efficient multi-site-saturation mutagenesis approach by Golden Gate cloning with automated primer design.

Sci Rep 2019 07 29;9(1):10932. Epub 2019 Jul 29.

Leibniz Institute of Plant Biochemistry, Weinberg 3, 06120, Halle (Saale), Germany.

Site-directed methods for the generation of genetic diversity are essential tools in the field of directed enzyme evolution. The Golden Gate cloning technique has been proven to be an efficient tool for a variety of cloning setups. The utilization of restriction enzymes which cut outside of their recognition domain allows the assembly of multiple gene fragments obtained by PCR amplification without altering the open reading frame of the reconstituted gene. We have developed a protocol, termed Golden Mutagenesis, that allows the rapid, straightforward, reliable and inexpensive construction of mutagenesis libraries. One to five amino acid positions within a coding sequence could be altered simultaneously using a protocol which can be performed within one day. To facilitate the implementation of this technique, a software library and web application were developed for automated primer design and for the graphical evaluation of the randomization success based on the sequencing results. This allows facile primer design and application of Golden Mutagenesis even for laboratories that are not specialized in molecular biology.

DOI: http://dx.doi.org/10.1038/s41598-019-47376-1
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6662682
July 2019

Improving MetFrag with statistical learning of fragment annotations.

BMC Bioinformatics 2019 Jul 5;20(1):376. Epub 2019 Jul 5.

Institute of Computer Science, Martin Luther University Halle-Wittenberg, Von-Seckendorff-Platz 1, Halle (Saale), 06099, Germany.

Background: Molecule identification is a crucial step in metabolomics and environmental sciences. Besides in silico fragmentation, as performed by MetFrag, machine learning and statistical methods have also evolved, showing an improvement in molecule annotation based on MS/MS data. In this work we present a new statistical scoring method where annotations of m/z fragment peaks to fragment-structures are learned in a training step. Based on a Bayesian model, two additional scoring terms are integrated into the new MetFrag2.4.5 and evaluated on the test data set of the CASMI 2016 contest.

Results: The results on the 87 MS/MS spectra from positive and negative mode show a substantial improvement compared to submissions made by the former MetFrag approach. Top1 rankings increased from 5 to 21 and Top10 rankings from 39 to 55, both exceeding the values for CSI:IOKR, the winner of the CASMI 2016 contest. For the negative mode spectra, MetFrag's statistical scoring outperforms all other participants that submitted results for this type of spectra.

Conclusions: This study shows how statistical learning can improve molecular structure identification based on MS/MS data compared with the same method using combinatorial in silico fragmentation only. MetFrag2.4.5 performs better than the other participating approaches, especially in negative mode.
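
The Top1 and Top10 figures reported above are counts of spectra whose correct structure is ranked within the first k candidates. A minimal sketch of that tally follows; it is not MetFrag code, the function name is hypothetical, and the rank list is made-up illustration data rather than CASMI 2016 results.

```python
def top_k_count(ranks, k):
    """Count spectra whose correct candidate is ranked at position k or better.

    `ranks` holds the 1-based rank of the correct structure for each
    spectrum, or None when the correct structure was not among the
    candidates at all (such spectra never count as a hit).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return sum(1 for rank in ranks if rank is not None and rank <= k)


# Hypothetical ranks of the correct structure for six spectra.
example_ranks = [1, 3, 12, None, 1, 10]

top1 = top_k_count(example_ranks, 1)
top10 = top_k_count(example_ranks, 10)
```

Comparing these counts for two scoring schemes on the same spectra is the kind of comparison the abstract summarises.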

DOI: http://dx.doi.org/10.1186/s12859-019-2954-7
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6612146
July 2019

Supporting non-target identification by adding hydrogen deuterium exchange MS/MS capabilities to MetFrag.

Anal Bioanal Chem 2019 Jul 17;411(19):4683-4700. Epub 2019 Jun 17.

Helmholtz Centre for Environmental Research - UFZ, Permoserstr. 15, 04318, Leipzig, Germany.

Liquid chromatography coupled with high-resolution mass spectrometry (LC-HRMS) is increasingly popular for the non-targeted exploration of complex samples, where tandem mass spectrometry (MS/MS) is used to characterize the structure of unknown compounds. However, mass spectra do not always contain sufficient information to unequivocally identify the correct structure. This study investigated how much additional information can be gained using hydrogen deuterium exchange (HDX) experiments. The exchange of "easily exchangeable" hydrogen atoms (connected to heteroatoms), with predominantly [M+D] ions in positive mode and [M-D] in negative mode was observed. To enable high-throughput processing, new scoring terms were incorporated into the in silico fragmenter MetFrag. These were initially developed on small datasets and then tested on 762 compounds of environmental interest. Pairs of spectra (normal and deuterated) were found for 593 of these substances (506 positive mode, 155 negative mode spectra). The new scoring terms resulted in 29 additional correct identifications (78 vs 49) for positive mode and an increase in top 10 rankings from 80 to 106 in negative mode. Compounds with dual functionality (polar head group, long apolar tail) exhibited dramatic retention time (RT) shifts of up to several minutes, compared with an average 0.04 min RT shift. For a smaller dataset of 80 metabolites, top 10 rankings improved from 13 to 24 (positive mode, 57 spectra) and from 14 to 31 (negative mode, 63 spectra) when including HDX information. The results of standard measurements were confirmed using targets and tentatively identified surfactant species in an environmental sample collected from the river Danube near Novi Sad (Serbia). The changes to MetFrag have been integrated into the command line version available at http://c-ruttkies.github.io/MetFrag and all resulting spectra and compounds are available in online resources and in the Electronic Supplementary Material (ESM). 

DOI: http://dx.doi.org/10.1007/s00216-019-01885-0
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6611743
July 2019

Interoperable and scalable data analysis with microservices: applications in metabolomics.

Bioinformatics 2019 10;35(19):3752-3760

CEA, LIST, Laboratory for Data Analysis and Systems' Intelligence, MetaboHUB, Gif-sur-Yvette, France.

Motivation: Developing a robust and performant data analysis workflow that integrates all necessary components whilst still being able to scale over multiple compute nodes is a challenging task. We introduce a generic method based on the microservice architecture, where software tools are encapsulated as Docker containers that can be connected into scientific workflows and executed using the Kubernetes container orchestrator.

Results: We developed a Virtual Research Environment (VRE) which facilitates rapid integration of new tools and the development of scalable and interoperable workflows for performing metabolomics data analysis. The environment can be launched on-demand on cloud resources and desktop computers. IT-expertise requirements on the user side are kept to a minimum, and workflows can be re-used effortlessly by any novice user. We validate our method in the field of metabolomics on two mass spectrometry studies, one nuclear magnetic resonance spectroscopy study and one fluxomics study. We showed that the method scales dynamically with increasing availability of computational resources. We demonstrated that the method facilitates interoperability by integrating the major software suites, resulting in a turn-key workflow encompassing all steps for mass-spectrometry-based metabolomics including preprocessing, statistics and identification. Microservices is a generic methodology that can serve any scientific discipline and opens the way to new types of large-scale integrative science.

Availability And Implementation: The PhenoMeNal consortium maintains a web portal (https://portal.phenomenal-h2020.eu) providing a GUI for launching the Virtual Research Environment. The GitHub repository https://github.com/phnmnl/ hosts the source code of all projects.

Supplementary Information: Supplementary data are available at Bioinformatics online.

DOI: http://dx.doi.org/10.1093/bioinformatics/btz160
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6761976
October 2019

mzTab-M: A Data Standard for Sharing Quantitative Results in Mass Spectrometry Metabolomics.

Anal Chem 2019 03 13;91(5):3302-3310. Epub 2019 Feb 13.

Institute of Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom.

Mass spectrometry (MS) is one of the primary techniques used for large-scale analysis of small molecules in metabolomics studies. To date, there has been little data format standardization in this field, as different software packages export results in different formats represented in XML or plain text, making data sharing, database deposition, and reanalysis highly challenging. Working within the consortia of the Metabolomics Standards Initiative, Proteomics Standards Initiative, and the Metabolomics Society, we have created mzTab-M to act as a common output format from analytical approaches using MS on small molecules. The format has been developed over several years, with input from a wide range of stakeholders. mzTab-M is a simple tab-separated text format, but importantly, the structure is highly standardized through the design of a detailed specification document, tightly coupled to validation software, and a mandatory controlled vocabulary of terms to populate it. The format is able to represent final quantification values from analyses, as well as the evidence trail in terms of features measured directly from MS (e.g., LC-MS, GC-MS, DIMS, etc.) and different types of approaches used to identify molecules. mzTab-M allows for ambiguity in the identification of molecules to be communicated clearly to readers of the files (both people and software). There are several implementations of the format available, and we anticipate widespread adoption in the field.
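
Because mzTab-M is a tab-separated text format whose lines begin with a section prefix, a reader can be sketched in a few lines. The sketch below assumes the prefixes MTD (metadata), SMH (small molecule header) and SML (small molecule row); the sample text is a heavily truncated, hypothetical fragment for illustration only and would not pass the format's validation software, and the function names are invented here.

```python
from collections import defaultdict


def group_mztab_lines(text):
    """Group tab-separated lines by their leading section prefix."""
    sections = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank separator lines
        prefix, *fields = line.split("\t")
        sections[prefix].append(fields)
    return dict(sections)


def read_table(sections, header_prefix, row_prefix):
    """Pair each row with the column names from the matching header line."""
    headers = sections.get(header_prefix, [])
    if not headers:
        return []
    column_names = headers[0]
    return [dict(zip(column_names, row)) for row in sections.get(row_prefix, [])]


# Hypothetical, truncated fragment (real files carry many more columns).
EXAMPLE = "\n".join([
    "MTD\tmzTab-version\t2.0.0-M",
    "",
    "SMH\tSML_ID\tchemical_name\tabundance_assay[1]",
    "SML\t1\tcaffeine\t1234.5",
    "SML\t2\tnull\t87.0",
])

sections = group_mztab_lines(EXAMPLE)
small_molecules = read_table(sections, "SMH", "SML")
```

For real files, one of the existing implementations mentioned in the abstract, together with the official validator, is the safer choice; this sketch only shows why a line-prefixed tabular layout is easy to consume.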

DOI: http://dx.doi.org/10.1021/acs.analchem.8b04310
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6660005
March 2019

PhenoMeNal: processing and analysis of metabolomics data in the cloud.

Gigascience 2019 02;8(2)

Leibniz Institute of Plant Biochemistry, Stress and Developmental Biology, Weinberg 3, 06120 Halle (Saale), Germany.

Background: Metabolomics is the comprehensive study of a multitude of small molecules to gain insight into an organism's metabolism. The research field is dynamic and expanding with applications across biomedical, biotechnological, and many other applied biological domains. Its computationally intensive nature has driven requirements for open data formats, data repositories, and data analysis tools. However, the rapid progress has resulted in a mosaic of independent, and sometimes incompatible, analysis methods that are difficult to connect into a useful and complete data analysis solution.

Findings: PhenoMeNal (Phenome and Metabolome aNalysis) is an advanced and complete solution to set up Infrastructure-as-a-Service (IaaS) that brings workflow-oriented, interoperable metabolomics data analysis platforms into the cloud. PhenoMeNal seamlessly integrates a wide array of existing open-source tools that are tested and packaged as Docker containers through the project's continuous integration process and deployed based on a Kubernetes orchestration framework. It also provides a number of standardized, automated, and published analysis workflows in the user interfaces Galaxy, Jupyter, Luigi, and Pachyderm.

Conclusions: PhenoMeNal constitutes a keystone solution in cloud e-infrastructures available for metabolomics. PhenoMeNal is a unique and complete solution for setting up cloud e-infrastructures through easy-to-use web interfaces that can be scaled to any custom public and private cloud environment. By harmonizing and automating software installation and configuration and through ready-to-use scientific workflow user interfaces, PhenoMeNal has succeeded in providing scientists with workflow-driven, reproducible, and shareable metabolomics data analysis platforms that are interfaced through standard data formats, representative datasets, versioned, and have been tested for reproducibility and interoperability. The elastic implementation of PhenoMeNal further allows easy adaptation of the infrastructure to other application areas and 'omics research domains.

Source
DOI: http://dx.doi.org/10.1093/gigascience/giy149
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6377398
February 2019

Seasonal variation of secondary metabolites in nine different bryophytes.

Ecol Evol 2018 Sep 22;8(17):9105-9117. Epub 2018 Aug 22.

Leibniz Institute of Plant Biochemistry, Stress and Developmental Biology, Halle, Germany.

Bryophytes occur in almost all land ecosystems and contribute to global biogeochemical cycles, ecosystem functioning, and influence vegetation dynamics. As growth and biochemistry of bryophytes are strongly dependent on the season, we analyzed metabolic variation across seasons with regard to ecological characteristics and phylogeny. Using bioinformatics methods, we present an integrative and reproducible approach to connect ecology with biochemistry. Nine different bryophyte species were collected in three composite samples in four seasons. Untargeted liquid chromatography coupled with mass spectrometry (LC/MS) was performed to obtain metabolite profiles. Redundancy analysis, Pearson's correlation, Shannon diversity, and hierarchical clustering were used to determine relationships among species, seasons, ecological characteristics, and hierarchical clustering. Metabolite profiles of and which are species with ruderal life strategy (R-selected) showed low seasonal variability, while the profiles of the pleurocarpous mosses and which have characteristics of a competitive strategy (C-selected) were more variable. and had intermediary life strategies. Our study revealed strong species-specific differences in metabolite profiles between the seasons. Life strategies, growth forms, and indicator values for light and soil were among the most important ecological predictors. We demonstrate that untargeted Eco-Metabolomics provide useful biochemical insight that improves our understanding of fundamental ecological strategies.
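
The Shannon diversity mentioned above can be illustrated with a short sketch. It assumes that the intensities of the features in one metabolite profile are used as abundances; this is an assumption for illustration, since the abstract does not say how the index was computed. The function name shannon_diversity is hypothetical.

```python
import math


def shannon_diversity(intensities):
    """Return H = -sum(p_i * ln(p_i)) over the positive intensities."""
    positive = [x for x in intensities if x > 0]
    total = sum(positive)
    if total == 0:
        return 0.0  # an empty profile has no diversity
    return -sum((x / total) * math.log(x / total) for x in positive)


# Four equally abundant features give the maximum for four classes, ln(4).
even_profile = shannon_diversity([10.0, 10.0, 10.0, 10.0])
# A profile dominated by one feature has a lower diversity.
skewed_profile = shannon_diversity([97.0, 1.0, 1.0, 1.0])
```

The index rises with both the number of detected features and the evenness of their intensities, which is why it is a convenient single number for comparing profiles across species or seasons.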

Source
DOI: http://dx.doi.org/10.1002/ece3.4361
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6157681
September 2018

Expanding the Use of Spectral Libraries in Proteomics.

J Proteome Res 2018 12 11;17(12):4051-4060. Epub 2018 Oct 11.

The Donnelly Centre, University of Toronto, 160 College Street, Toronto, Ontario M5S 3E1, Canada.

The 2017 Dagstuhl Seminar on Computational Proteomics provided an opportunity for a broad discussion on the current state and future directions of the generation and use of peptide tandem mass spectrometry spectral libraries. Their use in proteomics is growing slowly, but there are multiple challenges in the field that must be addressed to further increase the adoption of spectral libraries and related techniques. The primary bottlenecks are the paucity of high quality and comprehensive libraries and the general difficulty of adopting spectral library searching into existing workflows. There are several existing spectral library formats, but none captures a satisfactory level of metadata; therefore, a logical next improvement is to design a more advanced, Proteomics Standards Initiative-approved spectral library format that can encode all of the desired metadata. The group discussed a series of metadata requirements organized into three designations of completeness or quality, tentatively dubbed bronze, silver, and gold. The metadata can be organized at four different levels of granularity: at the collection (library) level, at the individual entry (peptide ion) level, at the peak (fragment ion) level, and at the peak annotation level. Strategies for encoding mass modifications in a consistent manner and the requirement for encoding high-quality and commonly seen but as-yet-unidentified spectra were discussed. The group also discussed related topics, including strategies for comparing two spectra, techniques for generating representative spectra for a library, approaches for selection of optimal signature ions for targeted workflows, and issues surrounding the merging of two or more libraries into one. We present here a review of this field and the challenges that the community must address in order to accelerate the adoption of spectral libraries in routine analysis of proteomics datasets.

Source
DOI: http://dx.doi.org/10.1021/acs.jproteome.8b00485
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6443480
December 2018

Mind the Gap: Mapping Mass Spectral Databases in Genome-Scale Metabolic Networks Reveals Poorly Covered Areas.

Metabolites 2018 Sep 15;8(3). Epub 2018 Sep 15.

Metabolomics Platform, IISPV, Department of Electronic Engineering, Universitat Rovira i Virgili, Avinguda Paisos Catalans 26, 43007 Tarragona, Spain.

The use of mass spectrometry-based metabolomics to study human, plant and microbial biochemistry and their interactions with the environment largely depends on the ability to annotate metabolite structures by matching mass spectral features of the measured metabolites to curated spectra of reference standards. While reference databases for metabolomics now provide information for hundreds of thousands of compounds, barely 5% of these known small molecules have experimental data from pure standards. Remarkably, it is still unknown how well existing mass spectral libraries cover the biochemical landscape of prokaryotic and eukaryotic organisms. To address this issue, we have investigated the coverage of 38 genome-scale metabolic networks by public and commercial mass spectral databases, and found that on average only 40% of nodes in metabolic networks could be mapped by mass spectral information from standards. Next, we deciphered computationally which parts of the human metabolic network are poorly covered by mass spectral libraries, revealing gaps in the eicosanoids, vitamins and bile acid metabolism. Finally, our network topology analysis based on the betweenness centrality of metabolites revealed the top 20 most important metabolites that, if added to MS databases, may facilitate human metabolome characterization in the future.

Source
DOI: http://dx.doi.org/10.3390/metabo8030051
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6161000
September 2018

Computational workflow to study the seasonal variation of secondary metabolites in nine different bryophytes.

Sci Data 2018 08 28;5:180179. Epub 2018 Aug 28.

Leibniz Institute of Plant Biochemistry, Stress and Developmental Biology, Weinberg 3, 06120 Halle (Saale), Germany.

Eco-Metabolomics studies the interactions of non-model organisms in their natural environment and relates biochemistry to ecological function. Current challenges when processing such metabolomics data involve complex experiment designs which are often carried out in large field campaigns involving multiple study factors, peak detection parameter settings, the high variation of metabolite profiles and the analysis of non-model species with scarcely characterised metabolomes. Here, we present a dataset generated from 108 samples of nine bryophyte species obtained in four seasons using an untargeted liquid chromatography coupled with mass spectrometry acquisition method (LC/MS). Using this dataset we address the current challenges when processing Eco-Metabolomics data. We also present a reproducible and reusable computational workflow implemented in Galaxy focusing on standard formats, data import, technical validation, feature detection, diversity analysis and multivariate statistics. We expect that the representative dataset and the reusable processing pipeline will facilitate future studies in the research field of Eco-Metabolomics.

Source
DOI: http://dx.doi.org/10.1038/sdata.2018.179
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6111888
August 2018

ChemFrag: Chemically meaningful annotation of fragment ion mass spectra.

J Mass Spectrom 2018 Nov;53(11):1104-1115

Department of Bioorganic Chemistry, Leibniz Institute of Plant Biochemistry, Weinberg 3, Halle (Saale), 06120, Germany.

Identification and structural determination of small molecules by mass spectrometry is an important step in chemistry and biochemistry. However, the chemically realistic annotation of a fragment ion spectrum can be a difficult challenge. We developed ChemFrag, for the detection of fragmentation pathways and the annotation of fragment ions with chemically reasonable structures. ChemFrag combines a quantum chemical with a rule-based approach. For different doping substances as test instances, ChemFrag correctly annotates fragment ions. In most cases, the predicted fragments are chemically more realistic than those from purely combinatorial approaches, or approaches based on machine learning. The annotation generated by ChemFrag often coincides with spectra that have been manually annotated by experts. This is a major advance in peak annotation and allows a more precise automatic interpretation of mass spectra.

Source
DOI: http://dx.doi.org/10.1002/jms.4278
November 2018

Current Challenges in Plant Eco-Metabolomics.

Int J Mol Sci 2018 May 6;19(5). Epub 2018 May 6.

German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, 04103 Leipzig, Germany.

The relatively new research discipline of Eco-Metabolomics is the application of metabolomics techniques to ecology with the aim to characterise biochemical interactions of organisms across different spatial and temporal scales. Metabolomics is an untargeted biochemical approach to measure many thousands of metabolites in different species, including plants and animals. Changes in metabolite concentrations can provide mechanistic evidence for biochemical processes that are relevant at ecological scales. These include physiological, phenotypic and morphological responses of plants and communities to environmental changes and also interactions with other organisms. Traditionally, research in biochemistry and ecology comes from two different directions and is performed at distinct spatiotemporal scales. Biochemical studies most often focus on intrinsic processes in individuals at physiological and cellular scales. Generally, they take a bottom-up approach scaling up cellular processes from spatiotemporally fine to coarser scales. Ecological studies usually focus on extrinsic processes acting upon organisms at population and community scales and typically study top-down and bottom-up processes in combination. Eco-Metabolomics is a transdisciplinary research discipline that links biochemistry and ecology and connects the distinct spatiotemporal scales. In this review, we focus on approaches to study chemical and biochemical interactions of plants at various ecological levels, mainly plant–organismal interactions, and discuss related examples from other domains. We present recent developments and highlight advancements in Eco-Metabolomics over the last decade from various angles. We further address the five key challenges: (1) complex experimental designs and large variation of metabolite profiles; (2) feature extraction; (3) metabolite identification; (4) statistical analyses; and (5) bioinformatics software tools and workflows. The presented solutions to these challenges will advance connecting the distinct spatiotemporal scales and bridging biochemistry and ecology.

Source
DOI: http://dx.doi.org/10.3390/ijms19051385
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5983679
May 2018

The future of metabolomics in ELIXIR.

F1000Res 2017 6;6. Epub 2017 Sep 6.

Luxembourg Centre For Systems Biomedicine (LCSB), University of Luxembourg, Belvaux, L-4367, Luxembourg.

Metabolomics, the youngest of the major omics technologies, is supported by an active community of researchers and infrastructure developers across Europe. To coordinate and focus efforts around infrastructure building for metabolomics within Europe, a workshop on the "Future of metabolomics in ELIXIR" was organised at Frankfurt Airport in Germany. This one-day strategic workshop involved representatives of ELIXIR Nodes, members of the PhenoMeNal consortium developing an e-infrastructure that supports workflow-based metabolomics analysis pipelines, and experts from the international metabolomics community. The workshop established as the critical area, where a maximal impact of computational metabolomics and data management on other fields could be achieved. In particular, the existing four ELIXIR Use Cases, where the metabolomics community - both industry and academia - would benefit most, and which could be exhaustively mapped onto the current five ELIXIR Platforms were discussed. This opinion article is a call for support for a new ELIXIR metabolomics Use Case, which aligns with and complements the existing and planned ELIXIR Platforms and Use Cases.

Source
DOI: http://dx.doi.org/10.12688/f1000research.12342.2
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5627583
September 2017

Critical Assessment of Small Molecule Identification 2016: automated methods.

J Cheminform 2017 Mar 27;9(1):22. Epub 2017 Mar 27.

Department of Stress and Developmental Biology, Leibniz Institute of Plant Biochemistry, Weinberg 3, 06120, Halle, Germany.

Background: The fourth round of the Critical Assessment of Small Molecule Identification (CASMI) Contest (www.casmi-contest.org) was held in 2016, with two new categories for automated methods. This article covers the 208 challenges in Categories 2 and 3, without and with metadata, from organization, participation, results and post-contest evaluation of CASMI 2016 through to perspectives for future contests and small molecule annotation/identification.

Results: The Input Output Kernel Regression (CSI:IOKR) machine learning approach performed best in "Category 2: Best Automatic Structural Identification-In Silico Fragmentation Only", won by Team Brouard with 41% challenge wins. The winner of "Category 3: Best Automatic Structural Identification-Full Information" was Team Kind (MS-FINDER), with 76% challenge wins. The best methods were able to achieve over 30% Top 1 ranks in Category 2, with all methods ranking the correct candidate in the Top 10 in around 50% of challenges. This success rate rose to 70% Top 1 ranks in Category 3, with candidates in the Top 10 in over 80% of the challenges. The machine learning and chemistry-based approaches are shown to perform in complementary ways.

Conclusions: The improvement in (semi-)automated fragmentation methods for small molecule identification has been substantial. The achieved high rates of correct candidates in the Top 1 and Top 10, despite large candidate numbers, open up great possibilities for high-throughput annotation of untargeted analysis for "known unknowns". As more high quality training data becomes available, the improvements in machine learning methods will likely continue, but the alternative approaches still provide valuable complementary information. Improved integration of experimental context will also improve identification success further for "real life" annotations. The true "unknown unknowns" remain to be evaluated in future CASMI contests.

Source
DOI: http://dx.doi.org/10.1186/s13321-017-0207-1
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5368104
March 2017

nmrML: A Community Supported Open Data Standard for the Description, Storage, and Exchange of NMR Data.

Anal Chem 2018 01 14;90(1):649-656. Epub 2017 Dec 14.

European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, U.K.

NMR is a widely used analytical technique with a growing number of repositories available. As a result, demands for a vendor-agnostic, open data format for long-term archiving of NMR data have emerged with the aim to ease and encourage sharing, comparison, and reuse of NMR data. Here we present nmrML, an open XML-based exchange and storage format for NMR spectral data. The nmrML format is intended to be fully compatible with existing NMR data for chemical, biochemical, and metabolomics experiments. nmrML can capture raw NMR data, spectral data acquisition parameters, and where available spectral metadata, such as chemical structures associated with spectral assignments. The nmrML format is compatible with pure-compound NMR data for reference spectral libraries as well as NMR data from complex biomixtures, i.e., metabolomics experiments. To facilitate format conversions, we provide nmrML converters for Bruker, JEOL and Agilent/Varian vendor formats. In addition, easy-to-use Web-based spectral viewing, processing, and spectral assignment tools that read and write nmrML have been developed. Software libraries and Web services for data validation are available for tool developers and end-users. The nmrML format has already been adopted for capturing and disseminating NMR data for small molecules by several open source data processing tools and metabolomics reference spectral libraries, e.g., serving as storage format for the MetaboLights data repository. The nmrML open access data standard has been endorsed by the Metabolomics Standards Initiative (MSI), and we here encourage user participation and feedback to increase usability and make it a successful standard.

Source
DOI: http://dx.doi.org/10.1021/acs.analchem.7b02795
January 2018

Bioinformatics can boost metabolomics research.

J Biotechnol 2017 Nov 26;261:137-141. Epub 2017 May 26.

Leibniz Institute of Plant Biochemistry, Department of Stress and Developmental Biology, Weinberg 3, 06120 Halle, Germany; German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, 04103 Leipzig, Germany.

Metabolomics is the modern term for the field of small molecule research in biology and biochemistry. Currently, metabolomics is undergoing a transition where the classic analytical chemistry is combined with modern cheminformatics and bioinformatics methods, paving the way for large-scale data analysis. We give some background on past developments, highlight current state-of-the-art approaches, and give a perspective on future requirements.

Source
DOI: http://dx.doi.org/10.1016/j.jbiotec.2017.05.018
November 2017

LipidFrag: Improving reliability of in silico fragmentation of lipids and application to the Caenorhabditis elegans lipidome.

PLoS One 2017 9;12(3):e0172311. Epub 2017 Mar 9.

Research Unit Analytical BioGeoChemistry, Helmholtz Zentrum München, German Research Center for Environmental Health, Ingolstaedter Landstrasse, Neuherberg, Germany.

Lipid identification is a major bottleneck in high-throughput lipidomics studies. However, tools for the analysis of lipid tandem MS spectra are rather limited. While the comparison against spectra in reference libraries is one of the preferred methods, these libraries are far from being complete. In order to improve identification rates, the in silico fragmentation tool MetFrag was combined with Lipid Maps and lipid-class specific classifiers which calculate probabilities for lipid class assignments. The resulting LipidFrag workflow was trained and evaluated on different commercially available lipid standard materials, measured with data dependent UPLC-Q-ToF-MS/MS acquisition. The automatic analysis was compared against manual MS/MS spectra interpretation. With the lipid class specific models, identification of the true positives was improved especially for cases where candidate lipids from different lipid classes had similar MetFrag scores by removing up to 56% of false positive results. This LipidFrag approach was then applied to MS/MS spectra of lipid extracts of the nematode Caenorhabditis elegans. Fragments explained by LipidFrag match known fragmentation pathways, e.g., neutral losses of lipid headgroups and fatty acid side chain fragments. Based on prediction models trained on standard lipid materials, high probabilities for correct annotations were achieved, which makes LipidFrag a good choice for automated lipid data analysis and reliability testing of lipid identifications.

Source
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0172311
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5344313
September 2017

Prediction, Detection, and Validation of Isotope Clusters in Mass Spectrometry Data.

Metabolites 2016 Oct 20;6(4). Epub 2016 Oct 20.

Department of Stress and Developmental Biology, Leibniz Institute for Plant Biochemistry, Weinberg 3, Halle 06120, Germany.

Mass spectrometry is a key analytical platform for metabolomics. The precise quantification and identification of small molecules is a prerequisite for elucidating the metabolism, and the detection, validation, and evaluation of isotope clusters in LC-MS data are important for this task. Here, we present an approach for the improved detection of isotope clusters using chemical prior knowledge and the validation of detected isotope clusters depending on the substance mass using database statistics. We find remarkable improvements regarding the number of detected isotope clusters and are able to predict the correct molecular formula in the top three ranks in 92% of the cases. We make our methodology freely available as part of the Bioconductor packages version 1.50.0 and version 1.30.0.
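
To make the notion of an isotope cluster concrete, the sketch below chains peaks whose m/z values differ by roughly the 13C/12C mass difference (about 1.003355 Da) divided by the charge. This is only a generic spacing heuristic under assumed parameters (the tolerance, the charge, and the function name find_isotope_chains are hypothetical). It is not the method of the paper, which additionally uses chemical prior knowledge and database statistics.

```python
C13_C12_DELTA = 1.003355  # approximate mass difference between 13C and 12C, in Da


def find_isotope_chains(mzs, charge=1, tol=0.005):
    """Greedily chain m/z values spaced by about C13_C12_DELTA / charge."""
    spacing = C13_C12_DELTA / charge
    peaks = sorted(mzs)
    used = set()
    chains = []
    for i, start in enumerate(peaks):
        if i in used:
            continue
        chain = [start]
        last = start
        for j in range(i + 1, len(peaks)):
            if j in used:
                continue
            diff = peaks[j] - last
            if abs(diff - spacing) <= tol:
                # Peak j continues the chain at the expected spacing.
                chain.append(peaks[j])
                used.add(j)
                last = peaks[j]
            elif diff > spacing + tol:
                break  # peaks are sorted, so no later peak can match
        if len(chain) > 1:
            used.add(i)
            chains.append(chain)
    return chains


# Made-up peak list: three peaks at isotope spacing plus one unrelated peak.
example_chains = find_isotope_chains([181.0707, 182.0741, 183.0774, 200.0])
```

A spacing check alone accepts many chemically implausible clusters, which is the gap that intensity-ratio validation against known compound statistics is meant to close.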

Source
DOI: http://dx.doi.org/10.3390/metabo6040037
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5192443
October 2016

Plant-to-Plant Variability in Root Metabolite Profiles of 19 Arabidopsis thaliana Accessions Is Substance-Class-Dependent.

Int J Mol Sci 2016 Sep 16;17(9). Epub 2016 Sep 16.

Leibniz Institute of Plant Biochemistry, Stress and Developmental Biology, Weinberg 3, 06120 Halle (Saale), Germany.

Natural variation of secondary metabolism between different accessions of Arabidopsis thaliana (A. thaliana) has been studied extensively. In this study, we extended the natural variation approach by including biological variability (plant-to-plant variability) and analysed root metabolic patterns as well as their variability between plants and naturally occurring accessions. To screen 19 accessions of A. thaliana, comprehensive non-targeted metabolite profiling of single plant root extracts was performed using ultra performance liquid chromatography/electrospray ionization quadrupole time-of-flight mass spectrometry (UPLC/ESI-QTOF-MS) and gas chromatography/electron ionization quadrupole mass spectrometry (GC/EI-QMS). Linear mixed models were applied to dissect the total observed variance. All metabolic profiles pointed towards a larger plant-to-plant variability than natural variation between accessions and variance of experimental batches. Ratios of plant-to-plant to total variability were high and distinct for certain secondary metabolites. None of the investigated accessions displayed a specifically high or low biological variability for these substance classes. This study provides recommendations for future natural variation analyses of glucosinolates, flavonoids, and phenylpropanoids and also reference data for additional substance classes.

Source
DOI: http://dx.doi.org/10.3390/ijms17091565
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5037833
September 2016
-->