Publications by authors named "Maido Remm"

82 Publications

Competitiveness for Nodule Colonization in Sinorhizobium meliloti: Combined -Tagged Strain Competition and Genome-Wide Association Analysis.

mSystems 2021 Aug 27;6(4):e0055021. Epub 2021 Jul 27.

Department of Biology, University of Bari Aldo Moro, Bari, Italy.

Associations between leguminous plants and symbiotic nitrogen-fixing rhizobia are a classic example of mutualism between a eukaryotic host and a specific group of prokaryotic microbes. Although this symbiosis is in part species specific, different rhizobial strains may colonize the same nodule. Some rhizobial strains are commonly known as better competitors than others, but detailed analyses that aim to predict rhizobial competitive abilities based on genomes are still scarce. Here, we performed a bacterial genome-wide association (GWAS) analysis to define the genomic determinants related to the competitive capabilities in the model rhizobial species Sinorhizobium meliloti. For this, 13 tester strains were green fluorescent protein (GFP) tagged and assayed versus 3 red fluorescent protein (RFP)-tagged reference competitor strains (Rm1021, AK83, and BL225C) in a Medicago sativa nodule occupancy test. Competition data and strain genomic sequences were employed to build a model for GWAS based on k-mers. Among the k-mers with the highest scores, 51 k-mers mapped on the genomes of four strains showing the highest competition phenotypes (>60% single strain nodule occupancy; GR4, KH35c, KH46, and SM11) versus BL225C. These k-mers were mainly located on the symbiosis-related megaplasmid pSymA, specifically on genes coding for transporters, proteins involved in the biosynthesis of cofactors, and proteins related to metabolism (e.g., fatty acids). The same analysis was performed considering the sum of single and mixed nodules obtained in the competition assays versus BL225C, retrieving k-mers mapped on the genes previously found and on genes. Therefore, the competition abilities seem to be linked to multiple genetic determinants and comprise several cellular components. Decoding the competitive pattern that occurs in the rhizosphere is challenging in the study of bacterial social interaction strategies.
To date, the single-gene approach has mainly been used to uncover the bases of nodulation, but there is still a knowledge gap regarding the main features that characterize rhizobial strains able to outcompete indigenous rhizobia. Therefore, tracking down which traits make different rhizobial strains able to win the competition for plant infection over other indigenous rhizobia will improve the strain selection process and, consequently, plant yield in sustainable agricultural production systems. We proved that a k-mer-based GWAS approach can efficiently identify the competition determinants of a panel of strains previously analyzed for their plant tissue occupancy using double fluorescent labeling. The reported strategy will be useful for detailed studies on the genomic aspects of the evolution of bacterial symbiosis and for an extensive evaluation of rhizobial inoculants.

Source
http://dx.doi.org/10.1128/mSystems.00550-21 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8407117 (PMC)
August 2021

Molecular Characterization of Isolates From Different Sources in Estonia Reveals Potential Transmission of Resistance Genes Among Different Reservoirs.

Front Microbiol 2021 Mar 26;12:601490. Epub 2021 Mar 26.

Institute of Technology, University of Tartu, Tartu, Estonia.

In this study, we aimed to characterize the population structure, drug resistance mechanisms, and virulence genes of isolates in Estonia. Sixty-one and 34 isolates were collected between 2012 and 2014 across the country from various sites and sources, including farm animals and poultry (n = 53), humans (n = 12), environment (n = 24), and wild birds (n = 44). Clonal relationships of the strains were determined by whole-genome sequencing and analyzed by multi-locus sequence typing. We determined the presence of acquired antimicrobial resistance genes and 23S rRNA mutations, virulence genes, and also the plasmid or chromosomal origin of the genes using dedicated DNA sequence analysis tools available and/or homology search against a compiled database of relevant sequences. Two isolates from human with genes were highly resistant to vancomycin. Closely related strains were isolated from different host species. This indicates interspecies spread of strains and potential transfer of antibiotic resistance. Genomic context analysis of the resistance genes indicated frequent association with plasmids and mobile genetic elements. Resistance genes are often present in the identical genetic context in strains with diverse origins, suggesting the occurrence of transfer events.

Source
http://dx.doi.org/10.3389/fmicb.2021.601490 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8032980 (PMC)
March 2021

The new COST Action European Venom Network (EUVEN)-synergy and future perspectives of modern venomics.

Gigascience 2021 Mar;10(3)

Department of Ecology and Evolution, University of Lausanne, UNIL Sorge Le Biophore, CH - 1015 Lausanne, Switzerland.

Venom research is a highly multidisciplinary field that involves multiple subfields of biology, informatics, pharmacology, medicine, and other areas. These different research facets are often technologically challenging and pursued by different teams lacking connection with each other. This lack of coordination hampers the full development of venom investigation and applications. The COST Action CA19144-European Venom Network was recently launched to promote synergistic interactions among different stakeholders and foster venom research at the European level.

Source
http://dx.doi.org/10.1093/gigascience/giab019 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7992391 (PMC)
March 2021

KATK: Fast genotyping of rare variants directly from unmapped sequencing reads.

Hum Mutat 2021 06 1;42(6):777-786. Epub 2021 Apr 1.

Department of Bioinformatics, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia.

KATK is a fast and accurate software tool for calling variants directly from raw next-generation sequencing reads. It uses predefined k-mers to retrieve only the reads of interest from the FASTQ file and calls genotypes by aligning retrieved reads locally. KATK does not use data about known polymorphisms and has NC (no call) as the default genotype. The reference or variant allele is called only if there is sufficient evidence for their presence in data. Thus it is not biased against rare variants or de-novo mutations. With simulated datasets, we achieved a false-negative rate of 0.23% (sensitivity 99.77%) and a false discovery rate of 0.19%. Calling all human exonic regions with KATK requires 1-2 h, depending on sequencing coverage.
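
The read-retrieval step described in this abstract can be illustrated with a minimal Python sketch. It is not KATK's implementation: the k-mer length, the function names, and the toy reads are all assumptions made for illustration. The sketch keeps only those reads that contain at least one predefined k-mer, which is the general idea of selecting reads of interest without mapping the whole data set.

```python
def kmers(seq, k):
    """Yield every substring of length k from seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def retrieve_reads(reads, target_kmers, k):
    """Return the reads that share at least one k-mer with target_kmers."""
    targets = set(target_kmers)
    return [r for r in reads if any(km in targets for km in kmers(r, k))]


K = 5  # toy k-mer length, chosen only to keep the example readable
region = "ACGTTGCAAGGCT"  # hypothetical region of interest
target_kmers = set(kmers(region, K))

reads = ["TTACGTTGCA", "GGGGGGGGGG", "CAAGGCTTTA"]  # hypothetical reads
selected = retrieve_reads(reads, target_kmers, K)
print(selected)
```

Only the selected reads would then need local alignment, which is where the time saving of such an approach would come from.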

Source
http://dx.doi.org/10.1002/humu.24197 (DOI)
June 2021

A human-specific VNTR in the TRIB3 promoter causes gene expression variation between individuals.

PLoS Genet 2020 08 3;16(8):e1008981. Epub 2020 Aug 3.

Estonian Biocentre, Institute of Genomics, University of Tartu, Tartu, Estonia.

Tribbles homolog 3 (TRIB3) is a pseudokinase involved in intracellular regulatory processes and has been implicated in several diseases. In this article, we report that the human TRIB3 promoter contains a 33-bp variable number tandem repeat (VNTR) and characterize the heterogeneity and function of this genetic element. Analysis of human populations around the world uncovered the existence of alleles ranging from 1 to 5 copies of the repeat, with 2-, 3- and 5-copy alleles being the most common but displaying considerable geographical differences in frequency. The repeated sequence overlaps a C/EBP-ATF transcriptional regulatory element and is highly conserved, but not repeated, in various mammalian species, including great apes. The repeat is however evident in Neanderthal and Denisovan genomes. Reporter plasmid experiments in human cell culture reveal that an increased copy number of the TRIB3 promoter 33-bp repeat results in increased transcriptional activity. In line with this, analysis of whole genome sequencing and RNA-Seq data from human cohorts demonstrates that the copy number of TRIB3 promoter 33-bp repeats is positively correlated with TRIB3 mRNA expression level in many tissues throughout the body. Moreover, the copy number of the TRIB3 33-bp repeat appears to be linked to known TRIB3 eQTL SNPs as well as TRIB3 SNPs reported in genetic association studies. Taken together, the results indicate that the promoter 33-bp VNTR constitutes a causal variant for TRIB3 expression variation between individuals and could underlie the results of SNP-based genetic studies.

Source
http://dx.doi.org/10.1371/journal.pgen.1008981 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7425993 (PMC)
August 2020

Characteristics of Extended-Spectrum Beta-Lactamase-Producing Enterobacteriaceae and Contact to Animals in Estonia.

Microorganisms 2020 Jul 27;8(8). Epub 2020 Jul 27.

Department of Microbiology, Institute of Biomedicine and Translational Medicine, University of Tartu, 50411 Tartu, Estonia.

We have attempted to define the prevalence and risk factors of extended-spectrum beta-lactamase-producing Enterobacteriaceae (ESBL-Enterobacteriaceae) carriage, and to characterize antimicrobial susceptibility, beta-lactamase genes, and major types of isolated strains in volunteers, with a specific focus on humans in contact with animals. Samples were collected from 207 volunteers (veterinarians, pig farmers, dog owners, etc.) and cultured on selective agar. Clonal relationships of the isolated ESBL-Enterobacteriaceae were determined by whole genome sequencing and multi-locus sequence typing. Beta-lactamases were detected using a homology search. Subjects filled in questionnaires analyzed by univariate and multiple logistic regression. Colonization with ESBL-Enterobacteriaceae was found in fecal samples of 14 individuals (6.8%; 95%CI: 3.75-11.09%). In multiple regression analysis, working as a pig farmer was a significant risk factor for ESBL-Enterobacteriaceae carriage (OR 4.8; 95%CI 1.2-19.1). The only species isolated was that distributed into 11 sequence types. All ESBL-Enterobacteriaceae isolates were of CTX-M genotype, with the CTX-M-1 being the most prevalent and more common in pig farmers than in other groups. Despite the generally low prevalence of ESBL-Enterobacteriaceae in Estonia, the pig farmers may still pose a threat to transfer resistant microorganisms. The clinical relevance of predominant CTX-M-1 carrying is still unclear and needs further studies.
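
As a worked illustration of the prevalence estimate (14 of 207 individuals), the sketch below computes an exact (Clopper-Pearson) binomial confidence interval in Python. The abstract does not say which interval method was used, so the choice of method here is an assumption, and the function names are hypothetical. If the authors used an exact interval, the output should land close to the reported 3.75-11.09%.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval, found by bisection on binomial tails."""
    def bisect(goes_right):
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if goes_right(mid):
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: P(X >= k | p) = alpha / 2; this tail grows with p.
    lower = 0.0 if k == 0 else bisect(
        lambda p: 1 - binom_cdf(k - 1, n, p) < alpha / 2)
    # Upper bound: P(X <= k | p) = alpha / 2; this tail shrinks with p.
    upper = 1.0 if k == n else bisect(
        lambda p: binom_cdf(k, n, p) > alpha / 2)
    return lower, upper


carriers, volunteers = 14, 207  # counts taken from the abstract
low, high = clopper_pearson(carriers, volunteers)
print(f"prevalence {carriers / volunteers:.1%}, 95% CI {low:.2%} to {high:.2%}")
```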

Source
http://dx.doi.org/10.3390/microorganisms8081130 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7465280 (PMC)
July 2020

Method for the Identification of Plant DNA in Food Using Alignment-Free Analysis of Sequencing Reads: A Case Study on Lupin.

Front Plant Sci 2020 May 21;11:646. Epub 2020 May 21.

Department of Bioinformatics, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia.

Fast and reliable analytical methods for the identification of plants from metagenomic samples play an important role in identifying the components of complex mixtures of processed biological materials, including food, herbal products, gut contents or environmental samples. Different PCR-based methods that are commonly used for plant identification from metagenomic samples are often inapplicable due to DNA degradation, a low level of successful amplification or a lack of detection power. We introduce a method that combines metagenomic sequencing and an alignment-free k-mer based approach for the identification of plant DNA in processed metagenomic samples. Our method identifies plant DNA directly from metagenomic sequencing reads and does not require mapping or assembly of the reads. We identified more than 31,000 lupin-specific 32-mers from assembled chloroplast genome sequences. We demonstrate that lupin DNA can be detected from controlled mixtures of sequences from target species (different lupin species) and closely related non-target species. Moreover, these 32-mers are detectable in the following processed samples: lupin flour, conserved seeds and baked cookies containing different amounts of lupin flour. Under controlled conditions, lupin-specific components are detectable in baked cookies containing a minimum of 0.05% of lupin flour in wheat flour.

Source
http://dx.doi.org/10.3389/fpls.2020.00646 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7253697 (PMC)
May 2020

Chromosomal toxin-antitoxin systems in Pseudomonas putida are rather selfish than beneficial.

Sci Rep 2020 06 8;10(1):9230. Epub 2020 Jun 8.

Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia.

Chromosomal toxin-antitoxin (TA) systems are widespread genetic elements among bacteria, yet, despite extensive studies in the last decade, their biological importance remains ambivalent. The ability of TA-encoded toxins to affect stress tolerance when overexpressed supports the hypothesis of TA systems being associated with stress adaptation. However, the deletion of TA genes usually has no effect on stress tolerance, supporting the selfish elements hypothesis. Here, we aimed to evaluate the cost and benefits of chromosomal TA systems to Pseudomonas putida. We show that multiple TA systems do not confer fitness benefits to this bacterium as deletion of 13 TA loci does not influence stress tolerance, persistence or biofilm formation. Our results instead show that TA loci are costly and decrease the competitive fitness of P. putida. Still, the cost of multiple TA systems is low and detectable in certain conditions only. Construction of antitoxin deletion strains showed that only five TA systems code for toxic proteins, while other TA loci have evolved towards reduced toxicity and encode non-toxic or moderately potent proteins. Analysis of P. putida TA systems' homologs among fully sequenced Pseudomonads suggests that the TA loci have been subjected to purifying selection and that TA systems spread among bacteria by horizontal gene transfer.

Source
http://dx.doi.org/10.1038/s41598-020-65504-0 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7280312 (PMC)
June 2020

Phenotypic and Molecular Epidemiology of ESBL-, AmpC-, and Carbapenemase-Producing in Northern and Eastern Europe.

Front Microbiol 2019 Nov 22;10:2465. Epub 2019 Nov 22.

Department of Microbiology, Institute of Biomedicine and Translational Medicine, University of Tartu, Tartu, Estonia.

Extended-spectrum beta-lactamases (ESBL) and AmpC producing- have spread worldwide, but data about ESBL-producing- in the Northern and Eastern regions of Europe is scant. The aim of this study has been to describe the phenotypical and molecular epidemiology of different ESBL/AmpC/Carbapenemases genes in strains isolated from the Baltic States (Estonia, Latvia, and Lithuania), Norway and St. Petersburg (Russia), and to determine the predominant multilocus sequence type and single nucleotide polymorphisms diversity of isolates deduced by whole genome sequencing (WGS). A total of 10,780 clinical strains were screened for reduced sensitivity to third-generation cephalosporins. They were collected from 21 hospitals located in Estonia, Latvia, Lithuania, Norway and St. Petersburg during a 5 month period in 2012. The overall prevalence of ESBL/AmpC strains was 4.7% by phenotypical test and 3.9% by sequencing. We found more strains with the ESBL/AmpC phenotype and genotype in St. Petersburg and Latvia than other countries. Of phenotypic strains, 85% contained confirmed ESBL genes (including , , ), AmpC genes ( , , , , ), or carbapenemase genes ( ). , and were found in all countries, but prevalence was higher in Latvia than in St. Petersburg (Russia), Estonia, Norway and Lithuania. The dominating AmpC genes were in the Baltic States and Norway, and in St. Petersburg. strains belonged to 83 different sequence types, of which the most prevalent was ST131 (40%). In conclusion, we generally found low ESBL/AmpC/Carbapenemase prevalence in strains isolated in Northern/Eastern Europe. However, several inter-country differences in distribution of particular genes and multilocus sequence types were found.

Source
http://dx.doi.org/10.3389/fmicb.2019.02465 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6882919 (PMC)
November 2019

Muropeptides Stimulate Growth Resumption from Stationary Phase in Escherichia coli.

Sci Rep 2019 12 2;9(1):18043. Epub 2019 Dec 2.

Institute of Technology, University of Tartu, Tartu, Estonia.

When nutrients run out, bacteria enter a dormant metabolic state. This low or undetectable metabolic activity helps bacteria to preserve their scant reserves for future needs, yet it also diminishes their ability to scan the environment for new growth-promoting substrates. However, neighboring microbial growth is a reliable indicator of a favorable environment and can thus serve as a cue for exiting dormancy. Here we report that for Escherichia coli and Pseudomonas aeruginosa this cue is provided by the basic peptidoglycan unit (i.e. muropeptide). We show that several forms of muropeptides from a variety of bacterial species can stimulate growth resumption of dormant cells and the sugar-peptide bond is crucial for this activity. These results, together with previous research that identifies muropeptides as a germination signal for bacterial spores, and their detection by mammalian immune cells, show that muropeptides are a universal cue for bacterial growth.

Source
http://dx.doi.org/10.1038/s41598-019-54646-5 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6888817 (PMC)
December 2019

Application of Molecular Methods for Carbapenemase Detection.

Front Microbiol 2019 Aug 2;10:1755. Epub 2019 Aug 2.

Department of Microbiology, Institute of Biomedicine and Translational Medicine, University of Tartu, Tartu, Estonia.

This study has evaluated the correlation between different carbapenemase detection methods on carbapenem non-susceptible strains from Northern and Eastern Europe; 31 institutions in 9 countries participated in the research project, namely Finland, Estonia, Latvia, Lithuania, Russia (St. Petersburg), Poland, Belarus, Ukraine, and Georgia. During the research program, a total of 5,001 clinical isolates were screened for any carbapenem non-susceptibility by the disk diffusion method, Vitek 2 or Phoenix system following the EUCAST guideline on detection of resistance mechanisms, version 1.0. Strains isolated from outpatients and hospitalized patients from April 2015 to June 2015 were included. All types of samples (blood, pus, urine, etc.) excluding fecal screening or fecal colonization samples have been represented. In total, 171 carbapenemase screening-positive isolates (3.42%) were found and characterized. Several methods were used for detection of carbapenemase production, including Luminex assay (PCR and hybridization), whole genome sequencing, MALDI-TOF based Imipenem degradation assay, and immunochromatography testing. Minimal inhibitory concentration determination for Meropenem by agar-based gradient method was also used. Finally, 83 strains were carbapenemase negative by all confirmation methods (49.4% of all screening-positive ones), 74 were positive by three methods (44.0%), 8 by two methods (4.8%) and 3 by only one method (1.8%). The sensitivity of the tests was 96.3% for whole genome sequencing and MALDI-TOF assay (three undetected cases each), and 95.1% for Luminex-Carba (4 undetected cases). The most commonly detected carbapenemases were NDM (n = 54) and OXA-48 (n = 26), followed by KPC-2, VIM-5, and OXA-72 (one case of each). Our results showed that different types of carbapenemases can be detected in the countries involved in the project. The sensitivity of our methods for carbapenemase detection (including screening as a first step and further confirmation tests) was >95%, but we would recommend using different methods to increase the sensitivity of detection and make it more precise.
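
The reported sensitivities can be checked with a short Python calculation. The abstract gives the number of undetected cases per method but not the denominator; a count of 81 confirmed carbapenemase producers is an inference that happens to reproduce both percentages, and it should be read as an assumption, not as a figure from the study.

```python
def sensitivity(true_positives, false_negatives):
    """Sensitivity = TP / (TP + FN)."""
    return true_positives / (true_positives + false_negatives)


CONFIRMED = 81  # assumed number of confirmed producers (inferred, not stated)

wgs_or_maldi = sensitivity(CONFIRMED - 3, 3)  # three undetected cases
luminex = sensitivity(CONFIRMED - 4, 4)       # four undetected cases

print(f"WGS / MALDI-TOF: {wgs_or_maldi:.1%}")
print(f"Luminex-Carba:   {luminex:.1%}")
```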

Source
http://dx.doi.org/10.3389/fmicb.2019.01755 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6687770 (PMC)
August 2019

AluMine: alignment-free method for the discovery of polymorphic Alu element insertions.

Mob DNA 2019 Jul 18;10:31. Epub 2019 Jul 18.

Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia.

Background: Recently, alignment-free sequence analysis methods have gained popularity in the field of personal genomics. These methods are based on counting frequencies of short k-mer sequences, thus allowing faster and more robust analysis compared to traditional alignment-based methods.

Results: We have created a fast alignment-free method, AluMine, to analyze polymorphic insertions of Alu elements in the human genome. We tested the method on 2,241 individuals from the Estonian Genome Project and identified 28,962 potential polymorphic Alu element insertions. Each tested individual had on average 1,574 Alu element insertions that were different from those in the reference genome. In addition, we propose an alignment-free genotyping method that uses the frequency of insertion/deletion-specific 32-mer pairs to call the genotype directly from raw sequencing reads. Using this method, the concordance between the predicted and experimentally observed genotypes was 98.7%. The running time of the discovery pipeline is approximately 2 h per individual. The genotyping of potential polymorphic insertions takes between 0.4 and 4 h per individual, depending on the hardware configuration.
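
The genotyping idea in the Results, calling a genotype from the frequencies of an insertion-specific and a reference-specific k-mer, can be sketched in Python as follows. This is an illustration only, not AluMine's code: the thresholds, the minimum depth, the no-call label, and the function name are hypothetical choices.

```python
def call_genotype(ins_count, ref_count, min_depth=4, het_low=0.2, het_high=0.8):
    """Call a genotype from counts of an insertion-specific and a
    reference-specific k-mer observed in the raw reads.

    All thresholds are illustrative assumptions, not published values.
    """
    total = ins_count + ref_count
    if total < min_depth:
        return "no-call"  # too few supporting k-mers to decide
    ins_fraction = ins_count / total
    if ins_fraction < het_low:
        return "ref/ref"
    if ins_fraction > het_high:
        return "ins/ins"
    return "ref/ins"


# Hypothetical k-mer counts for four individuals at one insertion site
for counts in [(0, 30), (14, 16), (28, 1), (1, 1)]:
    print(counts, call_genotype(*counts))
```

Because only two k-mer counts per site are needed, such a call can be made without aligning reads, which is consistent with the short per-individual running times reported above.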

Conclusions: AluMine provides tools that allow discovery of novel Alu element insertions and/or genotyping of known Alu element insertions from personal genomes within a few hours.

Source
http://dx.doi.org/10.1186/s13100-019-0174-3 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6639938 (PMC)
July 2019

The Arrival of Siberian Ancestry Connecting the Eastern Baltic to Uralic Speakers further East.

Curr Biol 2019 05 9;29(10):1701-1711.e16. Epub 2019 May 9.

Estonian Biocentre, Institute of Genomics, University of Tartu, Tartu 51010, Estonia.

In this study, we compare the genetic ancestry of individuals from two as yet genetically unstudied cultural traditions in Estonia in the context of available modern and ancient datasets: 15 from the Late Bronze Age stone-cist graves (1200-400 BC) (EstBA) and 6 from the Pre-Roman Iron Age tarand cemeteries (800/500 BC-50 AD) (EstIA). We also included 5 Pre-Roman to Roman Iron Age Ingrian (500 BC-450 AD) (IngIA) and 7 Middle Age Estonian (1200-1600 AD) (EstMA) individuals to build a dataset for studying the demographic history of the northern parts of the Eastern Baltic from the earliest layer of Mesolithic to modern times. Our findings are consistent with EstBA receiving gene flow from regions with strong Western hunter-gatherer (WHG) affinities and EstIA from populations related to modern Siberians. The latter inference is in accordance with Y chromosome (chrY) distributions in present day populations of the Eastern Baltic, as well as patterns of autosomal variation in the majority of the westernmost Uralic speakers [1-5]. This ancestry reached the coasts of the Baltic Sea no later than the mid-first millennium BC; i.e., in the same time window as the diversification of west Uralic (Finnic) languages [6]. Furthermore, phenotypic traits often associated with modern Northern Europeans, like light eyes, hair, and skin, as well as lactose tolerance, can be traced back to the Bronze Age in the Eastern Baltic.

Source
http://dx.doi.org/10.1016/j.cub.2019.04.026 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6544527 (PMC)
May 2019

Endophytic bacterial communities in peels and pulp of five root vegetables.

PLoS One 2019 Jan 11;14(1):e0210542. Epub 2019 Jan 11.

Institute of Technology, University of Tartu, Tartu, Estonia.

Plants contain endophytic bacteria, whose communities both influence plant growth and can be an important source of probiotics. Here we used deep sequencing of a 16S rRNA gene fragment and bacterial cultivation to independently characterize the microbiomes of five plant species from divergent taxonomic orders-potato (Solanum tuberosum), carrot (Daucus sativus), beet (Beta vulgaris), neep (Brassica napus spp. napobrassica), and topinambur (Helianthus tuberosus). We found that both species richness and diversity tend to be higher in the peel, where Alphaproteobacteria and Actinobacteria dominate, while Gammaproteobacteria and Firmicutes dominate in the pulp. A statistical analysis revealed that the main characteristic features of the microbiomes of plant species originate from the peel microbiomes. Topinambur pulp displayed an interesting characteristic feature: it contained up to 10^8 CFUs of lactic acid bacteria, suggesting its use as a source of probiotic bacteria. We also detected Listeria sp. in topinambur pulps; however, the 16S rRNA gene fragment cannot distinguish between pathogenic and non-pathogenic species, so the evaluation of this potential health risk is left to a future study.

Source
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0210542 (PLOS)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6329509 (PMC)
January 2019

Genetic variation in the Estonian population: pharmacogenomics study of adverse drug effects using electronic health records.

Eur J Hum Genet 2019 03 12;27(3):442-454. Epub 2018 Nov 12.

Estonian Genome Center, Institute of Genomics, University of Tartu, Tartu, 51010, Estonia.

Pharmacogenomics aims to tailor pharmacological treatment to each individual by considering associations between genetic polymorphisms and adverse drug effects (ADEs). With technological advances, pharmacogenomic research has evolved from candidate gene analyses to genome-wide association studies. Here, we integrate deep whole-genome sequencing (WGS) information with drug prescription and ADE data from Estonian electronic health record (EHR) databases to evaluate genome- and pharmacome-wide associations on an unprecedented scale. We leveraged WGS data of 2240 Estonian Biobank participants and imputed all single-nucleotide variants (SNVs) with allele counts over 2 for 13,986 genotyped participants. Overall, we identified 41 (10 novel) loss-of-function and 567 (134 novel) missense variants in 64 very important pharmacogenes. The majority of the detected variants were very rare with frequencies below 0.05%, and 6 of the novel loss-of-function and 99 of the missense variants were only detected as single alleles (allele count = 1). We also validated documented pharmacogenetic associations and detected new independent variants in known gene-drug pairs. Specifically, we found that CTNNA3 was associated with myositis and myopathies among individuals taking nonsteroidal anti-inflammatory oxicams and replicated this finding in an extended cohort of 706 individuals. These findings illustrate that population-based WGS-coupled EHRs are a useful tool for biomarker discovery.

Source
http://dx.doi.org/10.1038/s41431-018-0300-6 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6460570 (PMC)
March 2019

A k-mer-based method for the identification of phenotype-associated genomic biomarkers and predicting phenotypes of sequenced bacteria.

PLoS Comput Biol 2018 10 22;14(10):e1006434. Epub 2018 Oct 22.

Department of Bioinformatics, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia.

We have developed an easy-to-use and memory-efficient method called PhenotypeSeeker that (a) identifies phenotype-specific k-mers, (b) generates a k-mer-based statistical model for predicting a given phenotype and (c) predicts the phenotype from the sequencing data of a given bacterial isolate. The method was validated on 167 Klebsiella pneumoniae isolates (virulence), 200 Pseudomonas aeruginosa isolates (ciprofloxacin resistance) and 459 Clostridium difficile isolates (azithromycin resistance). The phenotype prediction models trained from these datasets obtained the F1-measure of 0.88 on the K. pneumoniae test set, 0.88 on the P. aeruginosa test set and 0.97 on the C. difficile test set. The F1-measures were the same for assembled sequences and raw sequencing data; however, building the model from assembled genomes is significantly faster. On these datasets, the model building on a mid-range Linux server takes approximately 3 to 5 hours per phenotype if assembled genomes are used and 10 hours per phenotype if raw sequencing data are used. The phenotype prediction from assembled genomes takes less than one second per isolate. Thus, PhenotypeSeeker should be well-suited for predicting phenotypes from large sequencing datasets. PhenotypeSeeker is implemented in Python programming language, is open-source software and is available at GitHub (https://github.com/bioinfo-ut/PhenotypeSeeker/).
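
For readers unfamiliar with the F1-measure quoted above, the following Python sketch shows how it is computed from a confusion matrix. The counts in the example are invented for illustration and are not taken from the PhenotypeSeeker evaluation; the function name is likewise hypothetical.

```python
def f1_measure(tp, fp, fn):
    """F1 = harmonic mean of precision and recall."""
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Invented counts: 44 isolates correctly predicted resistant,
# 6 susceptible isolates predicted resistant, 6 resistant isolates missed.
score = f1_measure(tp=44, fp=6, fn=6)
print(f"F1 = {score:.2f}")
```

With equal numbers of false positives and false negatives, precision and recall coincide, so the F1-measure equals both of them.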

Source
http://dx.doi.org/10.1371/journal.pcbi.1006434 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6211763 (PMC)
October 2018

Multidrug resistant Pseudomonas aeruginosa in Estonian hospitals.

BMC Infect Dis 2018 Oct 11;18(1):513. Epub 2018 Oct 11.

Department of Microbiology, Institute of Biomedicine and Translational Medicine, University of Tartu, Ravila 19, 50411, Tartu, Estonia.

Background: We aimed to identify the main spreading clones, describe the resistance mechanisms associated with carbapenem- and/or multidrug-resistant P. aeruginosa and characterize patients at risk of acquiring these strains in Estonian hospitals.

Methods: Ninety-two non-duplicated carbapenem- and/or multidrug-resistant P. aeruginosa strains were collected between 27th March 2012 and 30th April 2013. Clinical data of the patients was obtained retrospectively from the medical charts. Clonal relationships of the strains were determined by whole genome sequencing and analyzed by multi-locus sequence typing. The presence of resistance genes and beta-lactamases and their origin was determined. Combined-disk method and PCR was used to evaluate carbapenemase and metallo-beta-lactamase production.

Results: Forty-three strains were carbapenem-resistant, 11 were multidrug-resistant and 38 were both carbapenem- and multidrug-resistant. Most strains (54%) were isolated from respiratory secretions and caused an infection (74%). Over half of the patients (57%) were ≥65 years old and 85% had ≥1 co-morbidity; 96% had contacts with healthcare and/or had received antimicrobial treatment in the previous 90 days. Clinically relevant beta-lactamases (OXA-101, OXA-2 and GES-5) were found in 12% of strains, 27% of which were located in plasmids. No Ambler class B beta-lactamases were detected. Aminoglycoside modifying enzymes were found in 15% of the strains. OprD was defective in 13% of the strains (all with CR phenotype); carbapenem resistance triggering mutations (F170L, W277X, S403P) were present in 29% of the strains. Ciprofloxacin resistance correlated well with mutations in topoisomerase genes gyrA (T83I, D87N) and parC (S87L). Almost all strains (97%) with these mutations showed ciprofloxacin-resistant phenotype. Multi-locus sequence type analysis indicated high diversity at the strain level - 36 different sequence types being detected. Two sequence types (ST108 (n = 23) and ST260 (n = 18)) predominated. Whereas ST108 was associated with localized spread in one hospital and mostly carbapenem-resistant phenotype, ST260 strains occurred in all hospitals, mostly with multi-resistant phenotype and carried different resistance genotype/machinery.

Conclusions: Diverse spread of local rather than international P. aeruginosa strains harboring multiple chromosomal mutations, but not plasmid-mediated Ambler class B beta-lactamases, was found in Estonian hospitals.

Trial Registration: This trial was registered retrospectively in ClinicalTrials.gov ( NCT03343119 ).

Source
DOI: http://dx.doi.org/10.1186/s12879-018-3421-1
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6182868
October 2018

PlasmidSeeker: identification of known plasmids from bacterial whole genome sequencing reads.

PeerJ 2018 2;6:e4588. Epub 2018 Apr 2.

Department of Bioinformatics, IMCB, University of Tartu, Tartu, Estonia.

Background: Plasmids play an important role in the dissemination of antibiotic resistance, making their detection an important task. Using whole genome sequencing (WGS), it is possible to capture both bacterial and plasmid sequence data, but short read lengths make plasmid detection a complex problem.

Results: We developed a tool named PlasmidSeeker that enables the detection of plasmids from bacterial WGS data without read assembly. The PlasmidSeeker algorithm is based on k-mers and uses k-mer abundance to distinguish between plasmid and bacterial sequences. We tested the performance of PlasmidSeeker on a set of simulated and real bacterial WGS samples, resulting in 100% sensitivity and 99.98% specificity.

Conclusion: PlasmidSeeker enables quick detection of known plasmids and complements existing tools that assemble plasmids de novo. The PlasmidSeeker source code is stored on GitHub: https://github.com/bioinfo-ut/PlasmidSeeker.
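The idea of using k-mer abundance as evidence for a plasmid can be illustrated with a minimal Python sketch. This is not the PlasmidSeeker implementation: the function `plasmid_evidence`, the `baseline_coverage` parameter (standing in for the typical abundance of chromosomal k-mers) and the toy counts are all assumptions made for illustration.

```python
from collections import Counter
from statistics import median


def kmers(seq: str, k: int) -> list[str]:
    """Return all overlapping k-mers of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def plasmid_evidence(read_counts: Counter, plasmid_seq: str, k: int,
                     baseline_coverage: float) -> tuple[float, float]:
    """Hypothetical helper: summarize how well a known plasmid is supported.

    Returns (fraction of the plasmid's distinct k-mers seen in the reads,
    median abundance of the seen k-mers relative to the chromosomal baseline).
    """
    plasmid_kmers = set(kmers(plasmid_seq, k))
    found = [read_counts[x] for x in plasmid_kmers if read_counts[x] > 0]
    fraction = len(found) / len(plasmid_kmers) if plasmid_kmers else 0.0
    relative_copy = median(found) / baseline_coverage if found else 0.0
    return fraction, relative_copy


# Toy k-mer counts from sequencing reads (assumed values).
read_counts = Counter({"ACG": 20, "CGT": 20, "GTA": 20})
fraction, relative_copy = plasmid_evidence(read_counts, "ACGTAC", 3, 10.0)
```

In this toy case three of the four plasmid k-mers are observed, at twice the assumed chromosomal abundance, which is the kind of signal a k-mer-abundance approach could use to separate plasmid from chromosomal sequence.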

Source
DOI: http://dx.doi.org/10.7717/peerj.4588
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5885972
April 2018

Method for the Identification of Taxon-Specific k-mers from Chloroplast Genome: A Case Study on Tomato Plant (Solanum lycopersicum).

Front Plant Sci 2018 17;9. Epub 2018 Jan 17.

Department of Bioinformatics, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia.

Polymerase chain reaction and different barcoding methods commonly used for plant identification from metagenomics samples are based on the amplification of a limited number of pre-selected barcoding regions. These methods are often inapplicable due to DNA degradation, low amplification success or low species discriminative power of selected genomic regions. Here we introduce a method for the rapid identification of plant taxon-specific k-mers that is applicable to the fast detection of plant taxa directly from raw sequencing reads without aligning, mapping or assembling the reads. We identified more than 800 specific k-mers (32 nucleotides in length) from 42 different chloroplast genome regions using the developed method. We demonstrated that the identified k-mers are also detectable in whole genome sequencing raw reads from tomato. We also demonstrated the usability of taxon-specific k-mers in artificial mixtures of sequences from closely related species. The developed method offers a novel strategy for fast identification of taxon-specific genome regions and new perspectives for detection of plant taxa directly from raw sequencing reads.

Source
DOI: http://dx.doi.org/10.3389/fpls.2018.00006
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5776150
January 2018

Primer3_masker: integrating masking of template sequence with primer design software.

Bioinformatics 2018 06;34(11):1937-1938

Department of Bioinformatics, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia.

Summary: Designing PCR primers for amplifying regions of eukaryotic genomes is a complicated task because the genomes contain a large number of repeat sequences and other regions unsuitable for amplification by PCR. We have developed a novel k-mer based masking method that uses a statistical model to detect and mask failure-prone regions on the DNA template prior to primer design. We implemented the method as a standalone program, primer3_masker, and integrated it into the primer design program Primer3.

Availability And Implementation: The standalone version of primer3_masker is implemented in C. The source code is freely available at https://github.com/bioinfo-ut/primer3_masker/ (standalone version for Linux and macOS) and at https://github.com/primer3-org/primer3/ (integrated version). Primer3 web application that allows masking sequences of 196 animal and plant genomes is available at http://primer3.ut.ee/.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/bty036
June 2018

Increased sequencing depth does not increase captured diversity of arbuscular mycorrhizal fungi.

Mycorrhiza 2017 Nov 20;27(8):761-773. Epub 2017 Jul 20.

Department of Botany, Institute of Ecology and Earth Sciences, University of Tartu, 40 Lai Str, 51005, Tartu, Estonia.

The arrival of 454 sequencing represented a major breakthrough by allowing deeper sequencing of environmental samples than was possible with existing Sanger approaches. Illumina MiSeq provides a further increase in sequencing depth but shorter read length compared with 454 sequencing. We explored whether Illumina sequencing improves estimates of arbuscular mycorrhizal (AM) fungal richness in plant root samples, compared with 454 sequencing. We identified AM fungi in root samples by sequencing amplicons of the SSU rRNA gene with 454 and Illumina MiSeq paired-end sequencing. In addition, we sequenced metagenomic DNA without prior PCR amplification. Amplicon-based Illumina sequencing yielded two orders of magnitude higher sequencing depth per sample than 454 sequencing. Initial analysis with minimal quality control recorded five times higher AM fungal richness per sample with Illumina sequencing. Additional quality control of Illumina samples, including restriction of the marker region to the most variable amplicon fragment, revealed AM fungal richness values close to those produced by 454 sequencing. Furthermore, AM fungal richness estimates were not correlated with sequencing depth between 300 and 30,000 reads per sample, suggesting that the lower end of this range is sufficient for adequate description of AM fungal communities. By contrast, metagenomic Illumina sequencing yielded very few AM fungal reads and taxa and was dominated by plant DNA, suggesting that AM fungal DNA is present at prohibitively low abundance in colonised root samples. In conclusion, Illumina MiSeq sequencing yielded higher sequencing depth, but similar richness of AM fungi in root samples, compared with 454 sequencing.

Source
DOI: http://dx.doi.org/10.1007/s00572-017-0791-y
November 2017

FastGT: an alignment-free method for calling common SNVs directly from raw sequencing reads.

Sci Rep 2017 05 31;7(1):2537. Epub 2017 May 31.

Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia.

We have developed a computational method that counts the frequencies of unique k-mers in FASTQ-formatted genome data and uses this information to infer the genotypes of known variants. FastGT can detect the variants in a 30x genome in less than 1 hour using ordinary low-cost server hardware. The overall concordance with the genotypes of two Illumina "Platinum" genomes is 99.96%, and the concordance with the genotypes of the Illumina HumanOmniExpress is 99.82%. Our method provides a k-mer database that can be used for the simultaneous genotyping of approximately 30 million single nucleotide variants (SNVs), including >23,000 SNVs from the Y chromosome. The source code of the FastGT software is available at GitHub (https://github.com/bioinfo-ut/GenomeTester4/).
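The general principle of alignment-free genotyping from k-mer counts can be sketched in a few lines of Python. This is a simplified illustration, not FastGT's method: the fixed `min_count` threshold replaces whatever statistical model the tool actually uses, and the function names, reads and allele-specific k-mers are hypothetical.

```python
from collections import Counter


def count_kmers(reads: list[str], k: int) -> Counter:
    """Count every k-mer occurring in a list of read sequences."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def call_genotype(counts: Counter, ref_kmer: str, alt_kmer: str,
                  min_count: int = 2) -> str:
    """Call a genotype from the counts of two allele-specific k-mers.

    Assumption: an allele is considered present if its k-mer was seen
    at least min_count times; a real caller would model coverage and errors.
    """
    ref_seen = counts[ref_kmer] >= min_count
    alt_seen = counts[alt_kmer] >= min_count
    if ref_seen and alt_seen:
        return "0/1"  # heterozygous
    if ref_seen:
        return "0/0"  # homozygous reference
    if alt_seen:
        return "1/1"  # homozygous alternative
    return "./."      # no call


# Toy reads covering one known variant (G/T at the third position).
reads = ["ACGTA", "ACGTA", "ACTTA", "ACTTA"]
genotype = call_genotype(count_kmers(reads, 5), "ACGTA", "ACTTA")
```

Because only the counts of pre-selected k-mers are consulted, no read has to be aligned to a reference, which is what makes this style of genotyping fast.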

Source
DOI: http://dx.doi.org/10.1038/s41598-017-02487-5
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5451431
May 2017

StrainSeeker: fast identification of bacterial strains from raw sequencing reads using user-provided guide trees.

PeerJ 2017 18;5:e3353. Epub 2017 May 18.

Department of Bioinformatics, University of Tartu, Tartu, Estonia.

Background: Fast, accurate and high-throughput identification of bacterial isolates is in great demand. The present work was conducted to investigate the possibility of identifying isolates from unassembled next-generation sequencing reads using custom-made guide trees.

Results: A tool named StrainSeeker was developed that constructs a list of specific k-mers for each node of any given Newick-format tree and enables the identification of bacterial isolates in 1-2 min. It uses a novel algorithm, which analyses the observed and expected fractions of node-specific k-mers to test the presence of each node in the sample. This allows StrainSeeker to determine where the isolate branches off the guide tree and assign it to a clade, whereas other tools assign each read to a reference genome. Using a dataset of 100 isolates, we demonstrate that StrainSeeker can predict the clades of the isolates with 92% accuracy and the correct tree branch assignment with 98% accuracy. Twenty-five thousand Illumina HiSeq reads are sufficient for identification of the strain.

Conclusion: StrainSeeker is a software program that identifies bacterial isolates by assigning them to nodes or leaves of a custom-made guide tree. StrainSeeker's web interface and pre-computed guide trees are available at http://bioinfo.ut.ee/strainseeker. Source code is stored at GitHub: https://github.com/bioinfo-ut/StrainSeeker.

Source
DOI: http://dx.doi.org/10.7717/peerj.3353
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5438578
May 2017

Natural Variation in Arabidopsis Cvi-0 Accession Reveals an Important Role of MPK12 in Guard Cell CO2 Signaling.

PLoS Biol 2016 Dec 6;14(12):e2000322. Epub 2016 Dec 6.

Institute of Technology, University of Tartu, Tartu, Estonia.

Plant gas exchange is regulated by guard cells that form stomatal pores. Stomatal adjustments are crucial for plant survival; they regulate uptake of CO2 for photosynthesis, loss of water, and entrance of air pollutants such as ozone. We mapped ozone hypersensitivity, more open stomata, and stomatal CO2-insensitivity phenotypes of the Arabidopsis thaliana accession Cvi-0 to a single amino acid substitution in MITOGEN-ACTIVATED PROTEIN (MAP) KINASE 12 (MPK12). In parallel, we showed that stomatal CO2-insensitivity phenotypes of a mutant cis (CO2-insensitive) were caused by a deletion of MPK12. Lack of MPK12 impaired bicarbonate-induced activation of S-type anion channels. We demonstrated that MPK12 interacted with the protein kinase HIGH LEAF TEMPERATURE 1 (HT1), a central node in guard cell CO2 signaling, and that MPK12 functions as an inhibitor of HT1. These data provide a new function for plant MPKs as protein kinase inhibitors and suggest a mechanism through which guard cell CO2 signaling controls plant water management.

Source
DOI: http://dx.doi.org/10.1371/journal.pbio.2000322
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5147794
December 2016

Plasmid with Colistin Resistance Gene mcr-1 in Extended-Spectrum-β-Lactamase-Producing Escherichia coli Strains Isolated from Pig Slurry in Estonia.

Antimicrob Agents Chemother 2016 11 21;60(11):6933-6936. Epub 2016 Oct 21.

Institute of Technology, University of Tartu, Tartu, Estonia

A plasmid carrying the colistin resistance gene mcr-1 was isolated from a pig slurry sample in Estonia. The gene was present on a 33,311-bp plasmid of the IncX4 group. mcr-1 is the only antibiotic resistance gene on the plasmid, with the other genes mainly coding for proteins involved in conjugative DNA transfer (taxA, taxB, taxC, trbM, and the pilX operon). The plasmid pESTMCR was present in three phylogenetically very different Escherichia coli strains, suggesting that it has high potential for horizontal transfer.

Source
DOI: http://dx.doi.org/10.1128/AAC.00443-16
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5075111
November 2016

GenomeTester4: a toolkit for performing basic set operations - union, intersection and complement on k-mer lists.

Gigascience 2015 3;4:58. Epub 2015 Dec 3.

Department of Bioinformatics, University of Tartu, Riia 23, Tartu, 51010 Estonia ; Estonian Biocentre, Riia 23B, Tartu, 51010 Estonia.

Background: K-mer-based methods of genome analysis have attracted great interest because they do not require genome assembly and can be performed directly on sequencing reads. Many analysis tasks require one to compare k-mer lists from different sequences to find words that are either unique to a specific sequence or common to many sequences. However, no stand-alone k-mer analysis tool currently allows one to perform these algebraic set operations.

Findings: We have developed the GenomeTester4 toolkit, which contains a novel tool GListCompare for performing union, intersection and complement (difference) set operations on k-mer lists. We provide examples of how these general operations can be combined to solve a variety of biological analysis tasks.

Conclusions: GenomeTester4 can be used to simplify k-mer list manipulation for many biological analysis tasks.
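The three set operations named above can be illustrated on toy data with plain Python sets. This sketch does not use GListCompare or its file format; the helper `kmer_set` and the two sequences are hypothetical and serve only to show what union, intersection and complement (difference) mean for k-mer lists.

```python
def kmer_set(seq: str, k: int) -> set[str]:
    """Return the set of distinct k-mers (length-k substrings) of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


a = kmer_set("ACGTACGA", 4)  # k-mers of the first toy sequence
b = kmer_set("TACGATTT", 4)  # k-mers of the second toy sequence

union = a | b    # k-mers present in either sequence
shared = a & b   # k-mers common to both sequences
only_a = a - b   # k-mers unique to the first sequence
```

Here `shared` corresponds to words common to both sequences and `only_a` to words specific to one of them, the two kinds of comparison the abstract mentions. In-memory sets like these would not scale to whole genomes, which is presumably why a dedicated toolkit is useful.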

Source
DOI: http://dx.doi.org/10.1186/s13742-015-0097-y
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4669650
July 2016

Microbial population dynamics in response to Pectobacterium atrosepticum infection in potato tubers.

Sci Rep 2015 Jun 29;5:11606. Epub 2015 Jun 29.

Institute of Molecular and Cell Biology, University of Tartu, 23 Riia Street, Tartu 51010, Estonia.

Endophytes are microbes and fungi that live inside plant tissues without damaging the host. Herein we examine the dynamic changes in the endophytic bacterial community in potato (Solanum tuberosum) tuber in response to pathogenic infection by Pectobacterium atrosepticum, which causes soft rot in numerous economically important crops. We quantified community changes using both cultivation and next-generation sequencing of the 16S rRNA gene and found that, despite observing significant variability in both the mass of macerated tissue and structure of the endophytic community between individual potato tubers, P. atrosepticum is always taken over by the endophytes during maceration. 16S rDNA sequencing revealed bacteria from the phyla Proteobacteria, Actinobacteria, Firmicutes, Bacteroidetes, Fusobacteria, Verrucomicrobia, Acidobacteria, TM7, and Deinococcus-Thermus. Prior to infection, Propionibacterium acnes is frequently among the dominant taxa, yet it is outcompeted by relatively few dominant taxa as the infection proceeds. Two days post-infection, the most abundant sequences in macerated potato tissue are Gammaproteobacteria. The most dominant genera are Enterobacter and Pseudomonas. Eight days post-infection, the number of anaerobic pectolytic Clostridia increases, probably due to oxygen depletion. These results demonstrate that the pathogenesis is strictly initiated by the pathogen (sensu stricto) and proceeds with a major contribution from the endophytic community.

Source
DOI: http://dx.doi.org/10.1038/srep11606
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4484245
June 2015

Haplotype phasing and inheritance of copy number variants in nuclear families.

PLoS One 2015 8;10(4):e0122713. Epub 2015 Apr 8.

Department of Bioinformatics, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia; Estonian Biocentre, Tartu, Estonia.

DNA copy number variants (CNVs) that alter the copy number of a particular DNA segment in the genome play an important role in human phenotypic variability and disease susceptibility. A number of CNVs overlapping with genes have been shown to confer risk to a variety of human diseases, thus highlighting the relevance of addressing the variability of CNVs at a higher resolution. So far, it has not been possible to deterministically infer the allelic composition of different haplotypes present within the CNV regions. We have developed a novel computational method, called PiCNV, which makes it possible to resolve the haplotype sequence composition within CNV regions in nuclear families based on SNP genotyping microarray data. The algorithm allows one to i) phase normal and CNV-carrying haplotypes in the copy number variable regions, ii) resolve the allelic copies of rearranged DNA sequence within the haplotypes and iii) infer the heritability of identified haplotypes in trios or larger nuclear families. To our knowledge this is the first program available that can deterministically phase null, mono-, di-, tri- and tetraploid genotypes in CNV loci. We applied our method to study the composition and inheritance of haplotypes in CNV regions of 30 HapMap Yoruban trios and 34 Estonian families. For 93.6% of the CNV loci, PiCNV made it possible to unambiguously phase normal and CNV-carrying haplotypes and follow their transmission in the corresponding families. Furthermore, allelic composition analysis identified the co-occurrence of alternative allelic copies within 66.7% of haplotypes carrying copy number gains. We also observed less frequent transmission of CNV-carrying haplotypes from parents to children compared to normal haplotypes and identified the emergence of several de novo deletions and duplications in the offspring.

Source
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0122713
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4390228
March 2016

A recent bottleneck of Y chromosome diversity coincides with a global change in culture.

Genome Res 2015 Apr 13;25(4):459-66. Epub 2015 Mar 13.

Center of Molecular Diagnosis and Genetic Research, University Hospital of Obstetrics and Gynecology, Tirana, ALB1005, Albania.

It is commonly thought that human genetic diversity in non-African populations was shaped primarily by an out-of-Africa dispersal 50-100 thousand yr ago (kya). Here, we present a study of 456 geographically diverse high-coverage Y chromosome sequences, including 299 newly reported samples. Applying ancient DNA calibration, we date the Y-chromosomal most recent common ancestor (MRCA) in Africa at 254 (95% CI 192-307) kya and detect a cluster of major non-African founder haplogroups in a narrow time interval at 47-52 kya, consistent with a rapid initial colonization model of Eurasia and Oceania after the out-of-Africa bottleneck. In contrast to demographic reconstructions based on mtDNA, we infer a second strong bottleneck in Y-chromosome lineages dating to the last 10 ky. We hypothesize that this bottleneck is caused by cultural changes affecting variance of reproductive success among males.

Source
DOI: http://dx.doi.org/10.1101/gr.186684.114
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4381518
April 2015

MultiPLX: automatic grouping and evaluation of PCR primers.

Methods Mol Biol 2015 ;1275:127-42

Department of Bioinformatics, University of Tartu, Tartu, Estonia.

In this chapter we describe MultiPLX, a tool for automatic grouping of PCR primers for multiplexed PCR. Both the generic working principle and step-by-step practical procedures with examples are presented. MultiPLX performs grouping by calculating many important interaction levels between the different primer pairs and then distributes primer pairs to groups so that the strength of unwanted interactions is kept below a user-defined compatibility level. In addition, it can be used to select optimal primer pairs for multiplexing from a list of candidates. MultiPLX can be downloaded from http://bioinfo.ut.ee/?page_id=167. A graphical web-based interface to most functions of MultiPLX is available at http://bioinfo.ut.ee/multiplx/.
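The grouping step described above can be pictured as a greedy assignment under a pairwise compatibility constraint. The Python sketch below is an assumption-laden illustration, not MultiPLX's algorithm: the `interaction` callback, the numeric scores and the first-fit strategy are hypothetical stand-ins for the interaction levels the tool actually calculates.

```python
from typing import Callable


def group_primers(names: list[str],
                  interaction: Callable[[str, str], float],
                  max_score: float) -> list[list[str]]:
    """Greedy first-fit grouping: place each primer pair in the first group
    where its interaction with every member stays below max_score."""
    groups: list[list[str]] = []
    for name in names:
        for group in groups:
            if all(interaction(name, other) < max_score for other in group):
                group.append(name)
                break
        else:
            # No existing group is compatible, so open a new one.
            groups.append([name])
    return groups


# Hypothetical pairwise interaction scores between three primer pairs.
scores = {
    frozenset(("p1", "p2")): 5.0,
    frozenset(("p1", "p3")): 1.0,
    frozenset(("p2", "p3")): 1.0,
}
groups = group_primers(["p1", "p2", "p3"],
                       lambda a, b: scores[frozenset((a, b))],
                       max_score=3.0)
```

With these toy scores, p1 and p2 interact too strongly to share a reaction, so the sketch places p3 with p1 and leaves p2 in its own group. A first-fit pass like this does not guarantee the smallest number of groups.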

Source
DOI: http://dx.doi.org/10.1007/978-1-4939-2365-6_9
December 2015