Publications by authors named "Anna Gambin"

66 Publications

TADeus2: a web server facilitating the clinical diagnosis by pathogenicity assessment of structural variations disarranging 3D chromatin structure.

Nucleic Acids Res 2022 May 7. Epub 2022 May 7.

Faculty of Mathematics, Informatics, and Mechanics, University of Warsaw, 2 Banacha street, 02-097 Warsaw, Poland.

In recent years, great progress has been made in the identification of structural variants (SVs) in the human genome. However, the interpretation of SVs, especially those located in non-coding DNA, remains challenging. One of the reasons is the lack of tools designed exclusively for clinical SV evaluation that take the 3D chromatin architecture into account. Therefore, we present TADeus2, a web server dedicated to quick investigation of chromatin conformation changes, providing a visual framework for the interpretation of SVs affecting topologically associating domains (TADs). This tool provides a convenient visual inspection of SVs, both in a continuous genome view and from a rearrangement's breakpoint perspective. Additionally, TADeus2 allows the user to assess the influence of analyzed SVs within flanking coding/non-coding regions based on the Hi-C matrix. Importantly, SV pathogenicity is quantified and ranked using the TADA and ClassifyCNV tools and a sampling-based P-value. TADeus2 is publicly available at https://tadeus2.mimuw.edu.pl.
DOI: http://dx.doi.org/10.1093/nar/gkac318
May 2022

Data-driven case fatality rate estimation for the primary lineage of SARS-CoV-2 in Poland.

Methods 2022 Jan 24. Epub 2022 Jan 24.

Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland.

More than one and a half years after the outbreak of the COVID-19 pandemic, the scientific world is still trying to understand its dynamics. In this paper on the case fatality rate (CFR) of COVID-19, we study historical data on mortality in Poland during the first six months of the pandemic, when no SARS-CoV-2 variants of concern were present among the infected. To this end, we apply competing risk models to perform both uni- and multivariate analyses on specific subpopulations selected by different factors, including the key indicators: age, sex, and hospitalization. The study explores the case fatality rate and finds a decreasing trend over time. Furthermore, we describe the differences in mortality between hospitalized and other cases, indicating a sudden increase in mortality among hospitalized cases at the end of the 2020 spring season. Exploratory and multivariate analyses revealed the real impact of each variable, and besides the expected factors indicating increased mortality (age, comorbidities), we track less obvious indicators. Recent medical care as well as the identification of the source contact, independently of comorbidities, significantly impact an individual's mortality risk. As a result, the study provides a twofold insight into COVID-19 mortality in Poland. On the one hand, we explore mortality in different groups with respect to different variables; on the other, we indicate novel factors that may be crucial in reducing mortality. The latter can be addressed, e.g., by more efficient contact tracing and by proper organization and management of the health care system to support those who need medical care independently of comorbidities or COVID-19 infection.
DOI: http://dx.doi.org/10.1016/j.ymeth.2022.01.006
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8785264
January 2022

Harvest time affects antioxidant capacity, total polyphenol and flavonoid content of Polish St John's wort's (Hypericum perforatum L.) flowers.

Sci Rep 2021 02 17;11(1):3989. Epub 2021 Feb 17.

Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Stefana Banacha 2, 02-097, Warszawa, Poland.

The polyphenol content and antioxidant capacity of hyperforin and hypericin-standardized H. perforatum L. extracts may vary due to the harvest time. In this work, ethanol and ethanol-water extracts of air-dried and lyophilized flowers of H. perforatum L., collected throughout a vegetation season in central Poland, were studied. Air-dried flower extracts had higher polyphenol (371 mg GAE/g) and flavonoid (160 mg CAE/g) content, DPPH radical scavenging (1672 mg DPPH/g), ORAC (5214 µmol TE/g) and FRAP (2.54 mmol Fe/g) than lyophilized flower extracts (238 mg GAE/g, 107 mg CAE/g, 1287 mg DPPH/g, 3313 µmol TE/g and 0.31 mmol Fe/g, respectively). Principal component analysis showed that the collection date influenced the flavonoid and polyphenol contents and FRAP of ethanol extracts, and DPPH and ORAC values of ethanol-water extracts. The ethanol extracts with the highest polyphenol and flavonoid content protected human erythrocytes against bisphenol A-induced damage. Both high-field and benchtop NMR spectra of selected extracts revealed differences in composition caused by extraction solvent and raw material collection date. Moreover, we have shown that benchtop NMR can be used to detect the compositional variation of extracts if the assignment of signals is done previously.
DOI: http://dx.doi.org/10.1038/s41598-021-83409-4
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7889936
February 2021

Low Entropy Sub-Networks Prevent the Integration of Metabolomic and Transcriptomic Data.

Entropy (Basel) 2020 Oct 31;22(11). Epub 2020 Oct 31.

Institute of Informatics, University of Warsaw, 02-097 Warsaw, Poland.

The constantly and rapidly increasing amount of biological data gained from many different high-throughput experiments opens up new possibilities for data- and model-driven inference. Yet, alongside, there emerges a problem of risks related to data integration techniques, which are not widely taken into account. In particular, approaches based on flux balance analysis (FBA) are sensitive to the structure of a metabolic network, in which low-entropy clusters can prevent inference from the activity of the metabolic reactions. In the following article, we set forth problems that may arise during the integration of metabolomic data with gene expression datasets. We analyze common pitfalls, provide possible solutions, and exemplify them by a case study of renal cell carcinoma (RCC). Using the proposed approach, we provide a metabolic description of the known morphological RCC subtypes and suggest the possible existence of a poor-prognosis cluster of patients, who are commonly characterized by low activity of the drug-transporting enzymes crucial in chemotherapy. This discovery fits and extends the already known poor-prognosis characteristics of RCC. Finally, the goal of this work is also to point out the problem that arises from the integration of high-throughput data with inherently nonuniform, manually curated low-throughput data. In such cases, the over-represented information may overshadow non-trivial discoveries.
DOI: http://dx.doi.org/10.3390/e22111238
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7712986
October 2020

Masserstein: Linear regression of mass spectra by optimal transport.

Rapid Commun Mass Spectrom 2020 Sep 30:e8956. Epub 2020 Sep 30.

Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland.

Rationale: The linear regression of mass spectra is a computational problem defined as fitting a linear combination of reference spectra to an experimental one. It is typically used to estimate the relative quantities of selected ions. In this work, we study this problem in an abstract setting to develop new approaches applicable to a diverse range of experiments.

Methods: To overcome the sensitivity of the ordinary least-squares regression to measurement inaccuracies, we base our methods on a non-conventional spectral dissimilarity measure, known as the Wasserstein or the Earth Mover's distance. This distance is based on the notion of the cost of transporting signal between mass spectra, which renders it naturally robust to measurement inaccuracies in the mass domain.

Results: Using a data set of 200 mass spectra, we show that our approach is capable of estimating ion proportions accurately without extensive preprocessing of spectra required by other methods. The conclusions are further substantiated using data sets simulated in a way that mimics most of the measurement inaccuracies occurring in real experiments.

Conclusions: We have developed a linear regression algorithm based on the notion of the cost of transporting signal between spectra. Our implementation is available in a Python 3 package called masserstein, which is freely available at https://github.com/mciach/masserstein.
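To illustrate the dissimilarity measure named in the Methods paragraph, the following is a minimal sketch of the one-dimensional Wasserstein (Earth Mover's) distance between two centroided spectra. It is not the masserstein package's API: the function name `wasserstein_1d` and the input layout (parallel lists of m/z values and intensities) are assumptions made for this example, and both spectra are normalized to unit total intensity before comparison.

```python
def wasserstein_1d(mz_a, int_a, mz_b, int_b):
    """First Wasserstein distance between two peak lists (hypothetical helper).

    Each spectrum is a list of m/z positions with matching intensities.
    Intensities are normalized so each spectrum carries total signal 1.
    """
    total_a = sum(int_a)
    total_b = sum(int_b)
    # Signed events: spectrum A adds signal, spectrum B removes it.
    events = [(m, i / total_a) for m, i in zip(mz_a, int_a)]
    events += [(m, -i / total_b) for m, i in zip(mz_b, int_b)]
    events.sort()

    distance = 0.0
    cdf_difference = 0.0  # running difference of the two cumulative distributions
    previous_mz = events[0][0]
    for mz, weight in events:
        # Area between the two cumulative distributions over this m/z gap.
        distance += abs(cdf_difference) * (mz - previous_mz)
        cdf_difference += weight
        previous_mz = mz
    return distance


if __name__ == "__main__":
    # One peak shifted by 0.5 on the m/z axis: the cost equals the shift.
    print(wasserstein_1d([100.0], [1.0], [100.5], [1.0]))
```

In this sketch a small shift on the mass axis yields a proportionally small distance, whereas a point-wise comparison of non-overlapping peaks would treat the two spectra as entirely different. This is the robustness property the abstract refers to.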
DOI: http://dx.doi.org/10.1002/rcm.8956
September 2020

Breakpoint Mapping of Symptomatic Balanced Translocations Links the , and Genes to Novel Disease Phenotype.

J Clin Med 2020 Apr 25;9(5). Epub 2020 Apr 25.

Department of Medical Genetics, Medical University of Warsaw, 02-106 Warsaw, Poland.

De novo balanced chromosomal aberrations (BCAs), such as reciprocal translocations and inversions, are genomic aberrations that, in approximately 25% of cases, affect the human phenotype. Delineation of the exact structure of BCAs may provide a precise diagnosis and/or point to new disease loci. We report on six patients with de novo balanced chromosomal translocations (BCTs) and one patient with a de novo inversion, in whom we mapped breakpoints to a resolution of 1 bp, using shallow whole-genome mate pair sequencing. In all seven cases, a disruption of at least one gene was found. In two patients, the phenotypic impact of the disrupted genes is well known (). In five patients, the aberration damaged genes: and , whose influence on the human phenotype is poorly understood. In particular, our results suggest novel candidate genes for retinal degeneration with anophthalmia (), developmental delay with speech impairment (), and developmental delay with brain dysembryoplastic neuroepithelial tumor (). In conclusion, identification of the exact structure of symptomatic BCTs using next generation sequencing is a viable method for both diagnosis and finding novel disease candidate genes in humans.
DOI: http://dx.doi.org/10.3390/jcm9051245
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7287862
April 2020

Jaccard/Tanimoto similarity test and estimation methods for biological presence-absence data.

BMC Bioinformatics 2019 Dec 24;20(Suppl 15):644. Epub 2019 Dec 24.

Institute of Informatics, Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Stefana Banacha 2, Warsaw, 02-097, Poland.

Background: Surveys of presences and absences of specific species across multiple biogeographic units (or bioregions) are used in a broad range of biological studies, from ecology to microbiology. Using binary presence-absence data, we evaluate species co-occurrences that help elucidate relationships among organisms and environments. To summarize similarity between occurrences of species, we routinely use the Jaccard/Tanimoto coefficient, which is the ratio of their intersection to their union. It is natural, then, to identify statistically significant Jaccard/Tanimoto coefficients, which suggest non-random co-occurrences of species. However, statistical hypothesis testing using this similarity coefficient has been seldom used or studied.

Results: We introduce a hypothesis test for similarity for biological presence-absence data, using the Jaccard/Tanimoto coefficient. Several key improvements are presented including unbiased estimation of expectation and centered Jaccard/Tanimoto coefficients, that account for occurrence probabilities. The exact and asymptotic solutions are derived. To overcome a computational burden due to high-dimensionality, we propose the bootstrap and measurement concentration algorithms to efficiently estimate statistical significance of binary similarity. Comprehensive simulation studies demonstrate that our proposed methods produce accurate p-values and false discovery rates. The proposed estimation methods are orders of magnitude faster than the exact solution, particularly with an increasing dimensionality. We showcase their applications in evaluating co-occurrences of bird species in 28 islands of Vanuatu and fish species in 3347 freshwater habitats in France. The proposed methods are implemented in an open source R package called jaccard (https://cran.r-project.org/package=jaccard).

Conclusion: We introduce a suite of statistical methods for the Jaccard/Tanimoto similarity coefficient for binary data, that enable straightforward incorporation of probabilistic measures in analysis for species co-occurrences. Due to their generality, the proposed methods and implementations are applicable to a wide range of binary data arising from genomics, biochemistry, and other areas of science.
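The coefficient described in the Background paragraph, and the idea of centering it by its expectation, can be sketched in a few lines of Python. This is an illustration only, not the interface of the jaccard R package. The function names are hypothetical, the expectation formula p_x p_y / (p_x + p_y - p_x p_y) is an assumption of this sketch (a plug-in value for independent occurrences), and the shuffling-based p-value is a simple stand-in for the exact, asymptotic, and bootstrap procedures the abstract mentions.

```python
import random


def jaccard(x, y):
    """Ratio of shared presences to presences in either binary vector."""
    both = sum(1 for a, b in zip(x, y) if a and b)
    either = sum(1 for a, b in zip(x, y) if a or b)
    return both / either if either else 0.0


def centered_jaccard(x, y):
    """Jaccard coefficient minus an assumed expectation under independence."""
    px = sum(x) / len(x)
    py = sum(y) / len(y)
    denominator = px + py - px * py
    expected = px * py / denominator if denominator else 0.0
    return jaccard(x, y) - expected


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """One-sided p-value from shuffling y (a stand-in, not the paper's test)."""
    rng = random.Random(seed)
    observed = jaccard(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if jaccard(x, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    species_a = [1, 1, 0, 0]  # presence/absence across four sites
    species_b = [1, 0, 1, 0]
    print(jaccard(species_a, species_b))
    print(centered_jaccard(species_a, species_b))
    print(permutation_p_value(species_a, species_b))
```

Centering matters because two common species share many sites by chance alone. Subtracting the expected value makes coefficients comparable across species with different occurrence probabilities.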
DOI: http://dx.doi.org/10.1186/s12859-019-3118-5
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6929325
December 2019

Knot_pull-python package for biopolymer smoothing and knot detection.

Bioinformatics 2020 02;36(3):953-955

Centre of New Technologies, Warsaw 02-097, Poland.

Summary: The biggest hurdle in studying topology in biopolymers is the steep learning curve for actually seeing the knots in structure visualization. Knot_pull is a command line utility designed to simplify this process: it presents the user with a smoothing trajectory for provided structures (any number and length of protein, RNA or chromatin chains in PDB, CIF or XYZ format), and calculates the knot type (including presence of any links, and slipknots when a subchain is specified).

Availability And Implementation: Knot_pull works under Python >=2.7 and is system independent. Source code and documentation are available at http://github.com/dzarmola/knot_pull under GNU GPL license and include also a wrapper script for PyMOL for easier visualization. Examples of smoothing trajectories can be found at: https://www.youtube.com/watch?v=IzSGDfc1vAY.

Supplementary Information: Supplementary data are available at Bioinformatics online.
DOI: http://dx.doi.org/10.1093/bioinformatics/btz644
February 2020

MIND: A Double-Linear Model To Accurately Determine Monoisotopic Precursor Mass in High-Resolution Top-Down Proteomics.

Anal Chem 2019 08 23;91(15):10310-10319. Epub 2019 Jul 23.

UA-VITO Center for Proteomics, University of Antwerp, 2000 Antwerp, Belgium.

Top-down proteomics approaches are becoming ever more popular, due to the advantages offered by knowledge of the intact protein mass in correctly identifying the various proteoforms that potentially arise due to point mutation, alternative splicing, post-translational modifications, etc. Usually, the average mass is used in this context; however, it is known that this can fluctuate significantly due to both natural and technical causes. Ideally, one would prefer to use the monoisotopic precursor mass, but this falls below the detection limit for all but the smallest proteins. Methods that predict the monoisotopic mass based on the average mass are potentially affected by imprecisions associated with the average mass. To address this issue, we have developed a framework based on simple, linear models that allows prediction of the monoisotopic mass based on the exact mass of the most-abundant (aggregated) isotope peak, which is a robust measure of mass, insensitive to the aforementioned natural and technical causes. This linear model was tested experimentally, as well as in silico, and typically predicts monoisotopic masses with an accuracy of only a few parts per million. A confidence measure is associated with the predicted monoisotopic mass to handle the off-by-one-Da prediction error. Furthermore, we introduce a correction function to extract the "true" (i.e., theoretically) most-abundant isotope peak from a spectrum, even if the observed isotope distribution is distorted by noise or poor ion statistics. The method is available online as an R shiny app: https://valkenborg-lab.shinyapps.io/mind/.
DOI: http://dx.doi.org/10.1021/acs.analchem.9b02682
August 2019

Biologically sound formal model of Hsp70 heat induction.

J Theor Biol 2019 10 8;478:74-101. Epub 2019 Jun 8.

Institute of Informatics, Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, ul. Banacha 2, Warsaw 02-097, Poland; Department of Biosystems, Science and Engineering, ETH Zurich, Basel, Switzerland.

A proper response to rapid environmental changes is essential for cell survival and requires efficient modifications in the pattern of gene expression. In this respect, a prominent example is Hsp70, a chaperone protein whose synthesis is dynamically regulated in stress conditions. In this paper, we expand a formal model of Hsp70 heat induction originally proposed in previous articles. To accurately capture various modes of heat shock effects, we not only introduce temperature dependencies in transcription to Hsp70 mRNA and in dissociation of transcriptional complexes, but we also derive a new formal expression for the temperature dependence in protein denaturation. We calibrate our model using comprehensive sets of both previously published experimental data and also biologically justified constraints. Interestingly, we obtain a biologically plausible temperature dependence of the transcriptional complex dissociation, despite the lack of biological constraints imposed in the calibration process. Finally, based on a sensitivity analysis of the model carried out in both deterministic and stochastic settings, we suggest that the regulation of the binding of transcriptional complexes plays a key role in Hsp70 induction upon heat shock. In conclusion, we provide a model that is able to capture the essential dynamics of the Hsp70 heat induction whilst being biologically sound in terms of temperature dependencies, description of protein denaturation and imposed calibration constraints.
DOI: http://dx.doi.org/10.1016/j.jtbi.2019.05.022
October 2019

Truncated Robust Principal Component Analysis and Noise Reduction for Single Cell RNA Sequencing Data.

J Comput Biol 2019 08 1;26(8):782-793. Epub 2019 May 1.

Institute of Informatics, Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warszawa, Poland.

DOI: http://dx.doi.org/10.1089/cmb.2018.0255
August 2019

Automatic mapping of atoms across both simple and complex chemical reactions.

Nat Commun 2019 03 29;10(1):1434. Epub 2019 Mar 29.

Institute of Organic Chemistry, Polish Academy of Sciences, Ul. Kasprzaka 44/52, Warsaw, 02-224, Poland.

Mapping atoms across chemical reactions is important for substructure searches, automatic extraction of reaction rules, identification of metabolic pathways, and more. Unfortunately, the existing mapping algorithms can deal adequately only with relatively simple reactions but not those in which expert chemists would benefit from a computer's help. Here we report how a combination of algorithmics and expert chemical knowledge significantly improves the performance of atom mapping, allowing the machine to deal with even the most mechanistically complex chemical and biochemical transformations. The key feature of our approach is the use of few but judiciously chosen reaction templates that are used to generate plausible "intermediate" atom assignments which then guide a graph-theoretical algorithm towards the chemically correct isomorphic mappings. The algorithm performs significantly better than the available state-of-the-art reaction mappers, suggesting its uses in database curation, mechanism assignments, and, above all, machine extraction of reaction rules underlying modern synthesis-planning programs.
DOI: http://dx.doi.org/10.1038/s41467-019-09440-2
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6441094
March 2019

masstodon: A Tool for Assigning Peaks and Modeling Electron Transfer Reactions in Top-Down Mass Spectrometry.

Anal Chem 2019 02 28;91(3):1801-1807. Epub 2019 Jan 28.

Department of Mathematics, Informatics, and Mechanics, University of Warsaw, Warsaw 02-097, Poland.

Top-down mass spectrometry methods are becoming continuously more popular in the effort to describe the proteome. They rely on the fragmentation of intact protein ions inside the mass spectrometer. Among the existing fragmentation methods, electron transfer dissociation is known for its precision and wide coverage of different cleavage sites. However, several side reactions can occur under electron transfer dissociation (ETD) conditions, including nondissociative electron transfer and proton transfer reaction. Evaluating their extent can provide more insight into reaction kinetics as well as instrument operation. Furthermore, preferential formation of certain reaction products can reveal important structural information. To the best of our knowledge, there are currently no tools capable of tracing and analyzing the products of these reactions in a systematic way. In this Article, we present in detail masstodon: a computer program for assigning peaks and interpreting mass spectra. Besides being a general purpose tool, masstodon also offers the possibility to trace the products of reactions occurring under ETD conditions and provides insights into the parameters driving them. It is available free of charge under the GNU AGPL V3 public license.
DOI: http://dx.doi.org/10.1021/acs.analchem.8b01479
February 2019

Mapping of breakpoints in balanced chromosomal translocations by shallow whole-genome sequencing points to , and as novel candidates for genes causing human Mendelian disorders.

J Med Genet 2019 02 23;56(2):104-112. Epub 2018 Oct 23.

Department of Medical Genetics, Medical University of Warsaw, Warsaw, Poland.

Background: Mapping the breakpoints in de novo balanced chromosomal translocations (BCT) in symptomatic individuals provides a unique opportunity to identify in an unbiased way the likely causative genetic defect and thus find novel human disease candidate genes. Our aim was to fine-map breakpoints of de novo BCTs in a case series of nine patients.

Methods: Shallow whole-genome mate pair sequencing (SGMPS) together with long-range PCR and Sanger sequencing. In one case (BCT disrupting and ) cDNA analysis was used to verify expression of a fusion transcript in cultured fibroblasts.

Results: In all nine probands 11 disrupted genes were found, that is, and . Five subjects had translocations that disrupted genes with so far unknown () or poorly delineated impact on the phenotype ( two previous reports of BCT disrupting the gene). The four genes with no previous disease associations (), when compared with all human genes by a bootstrap test, had significantly higher pLI (p<0.017) and DOMINO (p<0.02) scores indicating enrichment in genes likely to be intolerant to single copy damage. Inspection of individual pLI and DOMINO scores, and local topologically associating domain structure suggested that and were particularly good candidates for novel disease loci. The pathomechanism for may involve deregulation of expression due to fusion with promoter.

Conclusion: SGMPS in symptomatic carriers of BCTs is a powerful approach to delineate novel human gene-disease associations.
DOI: http://dx.doi.org/10.1136/jmedgenet-2018-105527
February 2019

LINE- and Alu-containing genomic instability hotspot at 16q24.1 associated with recurrent and nonrecurrent CNV deletions causative for ACDMPV.

Hum Mutat 2018 12 22;39(12):1916-1925. Epub 2018 Aug 22.

Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas.

Transposable elements modify the human genome by inserting into new loci or by mediating homology-, microhomology-, or homeology-driven DNA recombination or repair, resulting in genomic structural variation. Alveolar capillary dysplasia with misalignment of pulmonary veins (ACDMPV) is a rare lethal neonatal developmental lung disorder caused by point mutations or copy-number variant (CNV) deletions of FOXF1 or its distant tissue-specific enhancer. Eighty-five percent of the 45 ACDMPV-causative CNV deletions whose junctions have been sequenced had at least one of their two breakpoints located in a retrotransposon, with more than half of them being Alu elements. We describe a novel ∼35 kb genomic instability hotspot at 16q24.1, involving two evolutionarily young LINE-1 (L1) elements, L1PA2 and L1PA3, flanking AluY, two AluSx, AluSx1, and AluJr elements. The occurrence of L1s at this location coincided with the branching out of the Homo-Pan-Gorilla clade, and was preceded by the insertion of AluSx, AluSx1, and AluJr. Our data show that, in addition to mediating recurrent CNVs, L1 and Alu retrotransposons can predispose the human genome to formation of variably sized CNVs, both of clinical and evolutionary relevance. Nonetheless, epigenetic or other genomic features of this locus might also contribute to its increased instability.
DOI: http://dx.doi.org/10.1002/humu.23608
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6240370
December 2018

Inferring Molecular Processes Heterogeneity from Transcriptional Data.

Biomed Res Int 2017 6;2017:6961786. Epub 2017 Dec 6.

Institute of Informatics, University of Warsaw, Banacha 2, 02-097 Warsaw, Poland.

RNA microarrays and RNA-seq are nowadays standard technologies to study the transcriptional activity of cells. Most studies focus on tracking transcriptional changes caused by specific experimental conditions. Information on gene up- and downregulation is evaluated by analyzing the behaviour of a relatively large population of cells and averaging its properties. However, even assuming perfect sample homogeneity, different subpopulations of cells can exhibit diverse transcriptomic profiles, as they may follow different regulatory/signaling pathways. The purpose of this study is to provide a novel methodological scheme to account for possible internal, functional heterogeneity in homogeneous cell lines, including cancer ones. We propose a novel computational method to infer the proportion between subpopulations of cells that manifest different functional behaviour in a given sample. Our method was validated using two datasets from RNA microarray experiments. Both experiments aimed to examine cell viability in specific experimental conditions. The presented methodology can be easily extended to RNA-seq data as well as other molecular processes. Moreover, it complements standard tools for indicating the most important networks from transcriptomic data and in particular could be useful in the analysis of cancer cell lines affected by biologically active compounds or drugs.
DOI: http://dx.doi.org/10.1155/2017/6961786
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5736944
August 2018

Inferring transposons activity chronology by TRANScendence - TEs database and de-novo mining tool.

BMC Bioinformatics 2017 Oct 16;18(Suppl 12):422. Epub 2017 Oct 16.

Institute of Informatics, University of Warsaw, Banacha 2, Warsaw, 02097, Poland.

Background: The constant progress in sequencing technology leads to ever increasing amounts of genomic data. In the light of current evidence, transposable elements (TEs for short) are becoming useful tools for learning about the evolution of the host genome. Therefore, software for genome-wide detection and analysis of TEs is of great interest.

Results: Here we describe a computational tool for mining, classifying and storing TEs from newly sequenced genomes. This is an online, web-based, user-friendly service, enabling users to upload their own genomic data and perform de-novo searches for TEs. The detected TEs are automatically analyzed, compared to reference databases, annotated, clustered into families, and stored in a TE repository. Also, the genome-wide nesting structure of the elements found is detected and analyzed by a new method for inferring the evolutionary history of TEs. We illustrate the functionality of our tool by performing a full-scale analysis of the TE landscape in the Medicago truncatula genome.

Conclusions: TRANScendence is an effective tool for the de-novo annotation and classification of transposable elements in newly-acquired genomes. Its streamlined interface makes it well-suited for evolutionary studies.
DOI: http://dx.doi.org/10.1186/s12859-017-1824-4
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5657132
October 2017

Leaf and Plant Age Affects Photosynthetic Performance and Photoprotective Capacity.

Plant Physiol 2017 Dec 10;175(4):1634-1648. Epub 2017 Oct 10.

Biophysics of Photosynthesis/Energy, Faculty of Sciences, Department of Physics and Astronomy, VU Amsterdam, 1081 HV Amsterdam, The Netherlands.

In this work, we studied the changes in high-light tolerance and photosynthetic activity in leaves of the Arabidopsis () rosette throughout the vegetative stage of growth. We implemented an image-analysis work flow to analyze the capacity of both the whole plant and individual leaves to cope with excess excitation energy by following the changes in absorbed light energy partitioning. The data show that leaf and plant age are both important factors influencing the fate of excitation energy. During the dark-to-light transition, the age of the plant affects mostly steady-state levels of photochemical and nonphotochemical quenching, leading to an increased photosynthetic performance of its leaves. The age of the leaf affects the induction kinetics of nonphotochemical quenching. These observations were confirmed using model selection procedures. We further investigated how different leaves on a rosette acclimate to high light and show that younger leaves are less prone to photoinhibition than older leaves. Our results stress that both plant and leaf age should be taken into consideration during the quantification of photosynthetic and photoprotective traits to produce repeatable and reliable results.
DOI: http://dx.doi.org/10.1104/pp.17.00904
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5717728
December 2017

Estimation of Rates of Reactions Triggered by Electron Transfer in Top-Down Mass Spectrometry.

J Comput Biol 2018 03 25;25(3):282-301. Epub 2017 Sep 25.

Faculty of Mathematics, Informatics, and Mechanics, University of Warsaw, Warsaw, Poland.

Electron transfer dissociation (ETD) is a versatile technique used in mass spectrometry for the high-throughput characterization of proteins. It consists of several concurrent reactions triggered by the transfer of an electron from its anion source to sample cations. Transferring an electron causes peptide backbone cleavage while leaving labile post-translational modifications intact. The obtained fragmentation spectra provide valuable information for sequence and structure analyses. In this study, we propose a formal mathematical model of the ETD fragmentation process in the form of a system of stochastic differential equations describing its joint dynamics. Parameters of the model correspond to the rates of occurring reactions. Their estimates for various experimental settings give insight into the dynamics of the ETD process. We estimate the model parameters from the relative quantities of fragmentation products in a given mass spectrum by solving a nonlinear optimization problem. The cost function penalizes for the differences between the analytically derived average number of reaction products and their experimental counterparts. The presented method proves highly robust to noise in silico. Moreover, the model can explain a considerable amount of experimental results for a wide range of instrumentation settings. The implementation of the presented workflow, code-named ETDetective, is freely available under the two-clause BSD license.

Source
DOI: http://dx.doi.org/10.1089/cmb.2017.0156
March 2018

IsoSpec: Hyperfast Fine Structure Calculator.

Anal Chem 2017 03 8;89(6):3272-3277. Epub 2017 Mar 8.

Department of Mathematics, Informatics, and Mechanics, University of Warsaw, 02-097 Warsaw, Poland.

As high-resolution mass spectrometry (HRMS) becomes increasingly available, the need for software tools capable of handling more complex data is surging. The complexity of the HRMS data stems partly from the presence of isotopes that give rise to more peaks to interpret compared to lower resolution instruments. A new generation of fine isotope calculators is on the rise. They calculate the smallest possible sets of isotopologues. However, none of these calculators lets the user specify the joint probability of the revealed envelope in advance. Instead, the user must provide a lower limit on the probability of isotopologues of interest, that is, provide a minimal peak height. The choice of such a threshold is far from obvious. In particular, it is impossible to balance a priori the trade-off between the algorithm speed and the portion of the revealed theoretical spectrum. We show that this leads to considerable inefficiencies. Here, we present IsoSpec: an algorithm for fast computation of isotopologues of chemical substances that can alternate between joint probability and peak height threshold. We prove that IsoSpec is optimal in terms of time complexity. Its implementation is freely available under a 2-clause BSD license, with bindings for C++, C, R, and Python.
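
The difference between the two stopping criteria contrasted in this abstract can be illustrated with a minimal Python sketch. It is not IsoSpec's algorithm or API: it enumerates isotopologues by brute force, handles only elements treated as having two isotopes, and uses approximate natural abundances chosen for illustration. All function names are hypothetical.

```python
from itertools import product
from math import comb

# Approximate natural abundance of the heavier isotope (illustrative values only).
HEAVY_ABUNDANCE = {"C": 0.0107, "H": 0.000115, "N": 0.00364}

def isotopologues(formula):
    """List (heavy-atom counts, probability) pairs, most probable first.

    `formula` maps an element symbol to its atom count, e.g. {"C": 2, "H": 6}.
    """
    elements = sorted(formula)
    peaks = []
    for heavy in product(*(range(formula[e] + 1) for e in elements)):
        prob = 1.0
        for e, k in zip(elements, heavy):
            n, p = formula[e], HEAVY_ABUNDANCE[e]
            prob *= comb(n, k) * p**k * (1 - p) ** (n - k)  # binomial term per element
        peaks.append((dict(zip(elements, heavy)), prob))
    return sorted(peaks, key=lambda peak: peak[1], reverse=True)

def by_peak_height(peaks, threshold):
    """Keep every isotopologue whose probability is at least `threshold`."""
    return [peak for peak in peaks if peak[1] >= threshold]

def by_joint_probability(peaks, coverage):
    """Keep the smallest prefix of the sorted peaks whose total reaches `coverage`."""
    chosen, total = [], 0.0
    for peak in peaks:
        if total >= coverage:
            break
        chosen.append(peak)
        total += peak[1]
    return chosen
```

With a height threshold, the covered fraction of the spectrum is known only after the computation; with a joint-probability target, the coverage is fixed up front and the number of peaks follows from it.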

Source
DOI: http://dx.doi.org/10.1021/acs.analchem.6b01459
March 2017

Lethal lung hypoplasia and vascular defects in mice with conditional Foxf1 overexpression.

Biol Open 2016 Nov 15;5(11):1595-1606. Epub 2016 Nov 15.

Department of Molecular & Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA

FOXF1 heterozygous point mutations and genomic deletions have been reported in newborns with the neonatally lethal lung developmental disorder, alveolar capillary dysplasia with misalignment of pulmonary veins (ACDMPV). However, no gain-of-function mutations in FOXF1 have been identified yet in any human disease conditions. To study the effects of FOXF1 overexpression in lung development, we generated a Foxf1 overexpression mouse model by knocking-in a Cre-inducible Foxf1 allele into the ROSA26 (R26) locus. The mice were phenotyped using micro-computed tomography (micro-CT), head-out plethysmography, ChIP-seq and transcriptome analyses, immunohistochemistry, and lung histopathology. Thirty-five percent of heterozygous R26-Lox-Stop-Lox (LSL)-Foxf1 embryonic day (E)15.5 embryos exhibit subcutaneous edema, hemorrhages and die perinatally when bred to Tie2-cre mice, which targets Foxf1 overexpression to endothelial and hematopoietic cells. Histopathological and micro-CT evaluations revealed that R26Foxf1; Tie2-cre embryos have immature lungs with a diminished vascular network. Neonates exhibited respiratory deficits verified by detailed plethysmography studies. ChIP-seq and transcriptome analyses in E18.5 lungs identified Sox11, Ghr, Ednrb, and Slit2 as potential downstream targets of FOXF1. Our study shows that overexpression of the highly dosage-sensitive Foxf1 impairs lung development and causes vascular abnormalities. This has important clinical implications when considering potential gene therapy approaches to treat disorders of FOXF1 abnormal dosage, such as ACDMPV.

Source
DOI: http://dx.doi.org/10.1242/bio.019208
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5155529
November 2016

Conformational Space and Stability of ETD Charge Reduction Products of Ubiquitin.

J Am Soc Mass Spectrom 2017 01 5;28(1):69-76. Epub 2016 Aug 5.

Biomolecular and Analytical Mass Spectrometry Group, Department of Chemistry, University of Antwerp, Antwerpen, Belgium.

Owing to its versatility, electron transfer dissociation (ETD) has become one of the most commonly utilized fragmentation techniques in both native and non-native top-down mass spectrometry. However, several competing reactions, primarily different forms of charge reduction, occur under ETD conditions, as evidenced by the distorted isotope patterns usually observed. In this work, we analyze these isotope patterns to compare the stability of nondissociative electron transfer (ETnoD) products, specifically noncovalent c/z fragment complexes, across a range of ubiquitin conformational states. Using ion mobility, we find that more extended states are more prone to fragment release. We obtain evidence that for a given charge state, populations of ubiquitin ions formed either directly by electrospray ionization or through collapse of more extended states upon charge reduction span a similar range of collision cross-sections. Products of gas-phase collapse are, however, less stabilized towards unfolding than the native conformation, indicating that the ions retain a memory of previous conformational states. Furthermore, this collapse of charge-reduced ions is promoted if the ions are 'preheated' using collisional activation, with possible implications for the kinetics of gas-phase compaction.

Source
DOI: http://dx.doi.org/10.1007/s13361-016-1444-7
January 2017

Gene Expression Profile of the Clinically Aggressive Micropapillary Variant of Bladder Cancer.

Eur Urol 2016 10 15;70(4):611-620. Epub 2016 Mar 15.

Department of Pathology, University of Texas MD Anderson Cancer Center, Houston, TX, USA.

Background: Progression of conventional urothelial carcinoma of the bladder to a tumor with unique microscopic features referred to as micropapillary carcinoma is coupled with aggressive clinical behavior signified by a high propensity for metastasis to regional lymph nodes and distant organs resulting in shorter survival.

Objective: To analyze the expression profile of micropapillary cancer and define its molecular features relevant to clinical behavior.

Design, Setting, And Participants: We retrospectively identified 43 patients with micropapillary bladder cancers and a reference set of 89 patients with conventional urothelial carcinomas and performed whole-genome expression messenger RNA profiling.

Outcome Measurements And Statistical Analysis: The tumors were segregated into distinct groups according to hierarchical clustering analyses. They were also classified according to luminal, p53-like, and basal categories using a previously described algorithm. We applied Ingenuity Pathway Analysis software (Qiagen, Redwood City, CA, USA) and gene set enrichment analysis for pathway analyses. Cox proportional hazards models and Kaplan-Meier methods were used to assess the relationship between survival and molecular subtypes. The expression profile of micropapillary cancer was validated for selected markers by immunohistochemistry on parallel tissue microarrays.

Results And Limitations: We show that the striking features of micropapillary cancer are downregulation of miR-296 and activation of chromatin-remodeling complex RUVBL1. In contrast to conventional urothelial carcinomas that based on their expression can be equally divided into luminal and basal subtypes, micropapillary cancer is almost exclusively luminal, displaying enrichment of active peroxisome proliferator-activated receptor γ and suppression of p63 target genes. As with conventional luminal urothelial carcinomas, a subset of micropapillary cancers exhibit activation of wild-type p53 downstream genes and represent the most aggressive molecular subtype of the disease with the shortest survival. The involvement of miR-296 and RUVBL1 in the development of micropapillary bladder cancer was identified by the analyses of correlative associations of genome expression profiles and requires mechanistic validation.

Conclusions: Micropapillary cancer evolves through the luminal pathway and is characterized by the activation of miR-296 and RUVBL1 target genes.

Patient Summary: Our observations have important implications for prognosis and for possible future development of more effective therapies for micropapillary bladder cancer.

Source
DOI: http://dx.doi.org/10.1016/j.eururo.2016.02.056
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5804336
October 2016

Computational modeling of sphingolipid metabolism.

BMC Syst Biol 2015 Aug 15;9:47. Epub 2015 Aug 15.

Institute of Informatics, University of Warsaw, Warsaw, Poland.

Background: As suggested by the origin of the word, sphingolipids are mysterious molecules with various roles in antagonistic cellular processes such as autophagy, apoptosis, proliferation and differentiation. Moreover, sphingolipids have recently been recognized as important messengers in cellular signaling pathways. Notably, sphingolipid metabolism disorders have been observed in various pathological conditions such as cancer and neurodegeneration.

Results: The existing formal models of sphingolipid metabolism focus mainly on de novo ceramide synthesis or are limited to biochemical transformations of particular subspecies. Here, we propose the first comprehensive computational model of sphingolipid metabolism in human tissue. Contrary to the previous approaches, we use a model that reflects cell compartmentalization thereby highlighting the differences among individual organelles.

Conclusions: The model that we present here was validated using recently proposed methods of model analysis, allowing us to detect the most sensitive and experimentally non-identifiable parameters and to determine the main sources of model variance. Moreover, we demonstrate the usefulness of our model in the study of molecular processes underlying Alzheimer's disease, which are associated with sphingolipid metabolism.

Source
DOI: http://dx.doi.org/10.1186/s12918-015-0176-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4537549
August 2015

On the Fine Isotopic Distribution and Limits to Resolution in Mass Spectrometry.

J Am Soc Mass Spectrom 2015 Oct 12;26(10):1732-45. Epub 2015 Aug 12.

Institute of Informatics, University of Warsaw, Warsaw, Poland.

Mass spectrometry enables the study of increasingly larger biomolecules with increasingly higher resolution, which is able to distinguish between fine isotopic variants having the same additional nucleon count, but slightly different masses. Therefore, the analysis of the fine isotopic distribution becomes an interesting research topic with important practical applications. In this paper, we propose a comprehensive methodology for studying the basic characteristics of the fine isotopic distribution. Our approach uses a broad spectrum of methods ranging from generating functions (which allow us to estimate the variance and the information theory entropy of the distribution) to the theory of thermal energy fluctuations. Having characterized the variance, spread, shape, and size of the fine isotopic distribution, we are able to indicate limitations to high resolution mass spectrometry. Moreover, the analysis of "thermorelativistic" effects (i.e., mass uncertainty attributable to relativistic effects coupled with the statistical mechanical uncertainty of the energy of an isolated ion), in turn, gives us an estimate of impassable limits of isotopic resolution (understood as the ability to distinguish fine structure peaks), which can be moved further only by cooling the ions. The presented approach highlights the potential of theoretical analysis of the fine isotopic distribution, which allows modeling the data more accurately, with the aim of supporting successful experimental measurements.

Source
DOI: http://dx.doi.org/10.1007/s13361-015-1180-4
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4565875
October 2015

MuTAnT: a family of Mutator-like transposable elements targeting TA microsatellites in Medicago truncatula.

Genetica 2015 Aug 17;143(4):433-40. Epub 2015 May 17.

Institute of Plant Biology and Biotechnology, University of Agriculture in Krakow, Al. 29 Listopada 54, 31-425, Kraków, Poland.

Transposable elements (TEs) are mobile DNA segments, abundant and dynamic in plant genomes. Because their mobility can be potentially deleterious to the host, a variety of mechanisms have evolved to limit that negative impact, one of them being preference for a specific target insertion site. Here, we describe a family of Mutator-like DNA transposons in Medicago truncatula targeting TA microsatellites. We identified 218 copies of MuTAnTs and an element carrying a complete ORF encoding a mudrA-like transposase. Most insertion sites are flanked by a variable number of TA tandem repeats, indicating that MuTAnTs are specifically targeting TA microsatellites. Other TE families flanked by TA repeats (e.g. TAFT elements in maize) were described previously; however, we identified the first putative autonomous element sharing that characteristic with a related group of short non-autonomous transposons.

Source
DOI: http://dx.doi.org/10.1007/s10709-015-9842-5
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4486113
August 2015

Genome-wide analyses of LINE-LINE-mediated nonallelic homologous recombination.

Nucleic Acids Res 2015 Feb 22;43(4):2188-98. Epub 2015 Jan 22.

Faculty of Mathematics, Informatics, and Mechanics, University of Warsaw, 2 Banacha street, 02-097 Warsaw, Poland; Mossakowski Medical Research Centre, Polish Academy of Sciences, 5 Pawińskiego street, 02-106 Warsaw, Poland.

Nonallelic homologous recombination (NAHR), occurring between low-copy repeats (LCRs) >10 kb in size and sharing >97% DNA sequence identity, is responsible for the majority of recurrent genomic rearrangements in the human genome. Recent studies have shown that transposable elements (TEs) can also mediate recurrent deletions and translocations, indicating that the features of substrates that mediate NAHR may be significantly less stringent than previously believed. Using >4 kb length and >95% sequence identity criteria, we analyzed the genome-wide distribution of long interspersed element (LINE) retrotransposons and their potential to mediate NAHR. We identified 17 005 directly oriented LINE pairs located <10 Mbp from each other as potential NAHR substrates, placing 82.8% of the human genome at risk of LINE-LINE-mediated instability. Cross-referencing these regions with CNVs in the Baylor College of Medicine clinical chromosomal microarray database of 36 285 patients, we identified 516 CNVs potentially mediated by LINEs. Using long-range PCR of five different genomic regions in a total of 44 patients, we confirmed that the CNV breakpoints in each patient map within the LINE elements. To additionally assess the scale of the LINE-LINE/NAHR phenomenon in the human genome, we tested DNA samples from six healthy individuals on a custom aCGH microarray targeting LINE elements predicted to mediate CNVs and identified 25 LINE-LINE rearrangements. Our data indicate that LINE-LINE-mediated NAHR is widespread and under-recognized, and is an important mechanism of structural rearrangement contributing to human genomic variability.
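
The selection criteria stated in this abstract (length >4 kb, sequence identity >95%, direct orientation, distance <10 Mbp) can be written as a simple filter. The following Python sketch is only an illustration under assumptions, not the authors' pipeline: the `Line` record, the `identity` callback, the same-chromosome requirement, and the definition of distance as the gap between the two elements are all hypothetical choices.

```python
from dataclasses import dataclass
from itertools import combinations

@dataclass(frozen=True)
class Line:
    """Hypothetical annotation record for one LINE element."""
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"

MIN_LENGTH = 4_000         # >4 kb, from the abstract
MIN_IDENTITY = 0.95        # >95% sequence identity, from the abstract
MAX_DISTANCE = 10_000_000  # <10 Mbp apart, from the abstract

def candidate_pairs(elements, identity):
    """Return directly oriented LINE pairs that meet the stated thresholds.

    `identity(a, b)` is an assumed callback returning the fraction of
    identical bases between two elements (e.g. from a pairwise alignment).
    """
    long_enough = [e for e in elements if e.end - e.start > MIN_LENGTH]
    ordered = sorted(long_enough, key=lambda e: (e.chrom, e.start))
    pairs = []
    for a, b in combinations(ordered, 2):
        same_chrom = a.chrom == b.chrom          # assumption: pairs lie on one chromosome
        direct = a.strand == b.strand            # directly oriented repeats
        close = b.start - a.end < MAX_DISTANCE   # assumption: distance is the gap between elements
        if same_chrom and direct and close and identity(a, b) > MIN_IDENTITY:
            pairs.append((a, b))
    return pairs
```

The quadratic pairwise loop is adequate for a small example; a genome-wide scan would normally restrict comparisons to a sliding window along each chromosome.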

Source
DOI: http://dx.doi.org/10.1093/nar/gku1394
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4344489
February 2015

Molecular and clinical analyses of 16q24.1 duplications involving FOXF1 identify an evolutionarily unstable large minisatellite.

BMC Med Genet 2014 Dec 4;15:128. Epub 2014 Dec 4.

Interdepartmental Program in Translational Biology & Molecular Medicine, Baylor College of Medicine, Houston, TX, USA.

Background: Point mutations or genomic deletions of FOXF1 result in a lethal developmental lung disease Alveolar Capillary Dysplasia with Misalignment of Pulmonary Veins. However, the clinical consequences of the constitutively increased dosage of FOXF1 are unknown.

Methods: Copy-number variations and their parental origin were identified using a combination of array CGH, long-range PCR, DNA sequencing, and microsatellite analyses. Minisatellite sequences across different species were compared using a greedy clustering algorithm, and genome-wide analysis of the distribution of minisatellite sequences was performed using R statistical software.

Results: We report four unrelated families with 16q24.1 duplications encompassing the entire FOXF1. In a 4-year-old boy with speech delay and a café-au-lait macule, we identified an ~15 kb 16q24.1 duplication inherited from the reportedly healthy father, in addition to a de novo ~1.09 Mb mosaic 17q11.2 NF1 deletion. In a 13-year-old patient with autism and mood disorder, we found an ~0.3 Mb duplication harboring FOXF1 and an ~0.5 Mb 16q23.3 duplication, both inherited from the father with bipolar disorder. In a 47-year-old patient with pyloric stenosis, mesenterium commune, and aplasia of the appendix, we identified an ~0.4 Mb duplication in 16q24.1 encompassing 16 genes including FOXF1. The patient transmitted the duplication to her daughter, who presented with similar symptoms. In a fourth patient with speech and motor delay, and borderline intellectual disability, we identified an ~1.7 Mb FOXF1 duplication adjacent to a large minisatellite. This duplication has a complex structure and arose de novo on the maternal chromosome, likely as a result of a DNA replication error initiated by the adjacent large tandem repeat. Using bioinformatic and array CGH analyses of the minisatellite, we found a large variation of its size in several different species and individuals, demonstrating both its evolutionary instability and population polymorphism.

Conclusions: Our data indicate that constitutional duplication of FOXF1 in humans is not associated with any pediatric lung abnormalities. We propose that patients with gut malrotation, pyloric or duodenal stenosis, and gall bladder agenesis should be tested for FOXF1 alterations. We suggest that instability of minisatellites greater than 1 kb can lead to structural variation due to DNA replication errors.

Source
DOI: http://dx.doi.org/10.1186/s12881-014-0128-z
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4411736
December 2014

Towards automated discrimination of lipids versus peptides from full scan mass spectra.

EuPA Open Proteom 2014 Sep;4:87-100

Applied Bio & molecular Systems, VITO, Mol, Belgium; Center for Proteomics, Antwerp, Belgium; Interuniversity Institute for Biostatistics and Statistical Bioinformatics, Hasselt University, Diepenbeek, Belgium.

Although physicochemical fractionation techniques play a crucial role in the analysis of complex mixtures, they are not necessarily the best solution to separate specific molecular classes, such as lipids and peptides. Any physical fractionation step such as, for example, those based on liquid chromatography, will introduce its own variation and noise. In this paper we investigate to what extent the high sensitivity and resolution of contemporary mass spectrometers offer viable opportunities for computational separation of signals in full scan spectra. We introduce an automatic method that can discriminate peptide from lipid peaks in full scan mass spectra, based on their isotopic properties. We systematically evaluate which features maximally contribute to a peptide versus lipid classification. The selected features are subsequently used to build a random forest classifier that enables almost perfect separation between lipid and peptide signals without requiring ion fragmentation and classical tandem MS-based identification approaches. The classifier is trained on in silico data, but is also capable of discriminating signals in real world experiments. We evaluate the influence of typical data inaccuracies of common classes of mass spectrometry instruments on the optimal set of discriminant features. Finally, the method is successfully extended towards the classification of individual lipid classes from full scan mass spectral features, based on input data defined by the Lipid Maps Consortium.

Source
DOI: http://dx.doi.org/10.1016/j.euprot.2014.05.002
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4234154
September 2014

Human endogenous retroviral elements promote genome instability via non-allelic homologous recombination.

BMC Biol 2014 Sep 23;12:74. Epub 2014 Sep 23.

Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Rm ABBR-R809, Houston, TX, USA.

Background: Recurrent rearrangements of the human genome resulting in disease or variation are mainly mediated by non-allelic homologous recombination (NAHR) between low-copy repeats. However, other genomic structures, including AT-rich palindromes and retroviruses, have also been reported to underlie recurrent structural rearrangements. Notably, recurrent deletions of Yq12 conveying azoospermia, as well as non-pathogenic reciprocal duplications, are mediated by human endogenous retroviral elements (HERVs). We hypothesized that HERV elements throughout the genome can serve as substrates for genomic instability and result in human copy-number variation (CNV).

Results: We developed parameters to identify HERV elements similar to those that mediate Yq12 rearrangements as well as recurrent deletions of 3q13.2q13.31. We used these parameters to identify HERV pairs genome-wide that may cause instability. Our analysis highlighted 170 pairs, flanking 12.1% of the genome. We cross-referenced these predicted susceptibility regions with CNVs from our clinical databases for potentially HERV-mediated rearrangements and identified 78 CNVs. We subsequently molecularly confirmed recurrent deletion and duplication rearrangements at four loci in ten individuals, including reciprocal rearrangements at two loci. Breakpoint sequencing revealed clustering in regions of high sequence identity enriched in PRDM9-mediated recombination hotspot motifs.

Conclusions: The presence of deletions and reciprocal duplications suggests NAHR as the causative mechanism of HERV-mediated CNV, even though the length and the sequence homology of the HERV elements are less than currently thought to be required for NAHR. We propose that in addition to HERVs, other repetitive elements, such as long interspersed elements, may also be responsible for the formation of recurrent CNVs via NAHR.

Source
DOI: http://dx.doi.org/10.1186/s12915-014-0074-4
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4195946
September 2014
-->