22,172 results match your criteria Bioinformatics [Journal]


SCIP: A Single-Cell Image Processor toolbox.

Bioinformatics 2018 Jun 21. Epub 2018 Jun 21.

CA3 CTS/UNINOVA. Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, Portugal.

Summary: Each cell is a phenotypically unique individual that is influenced by internal and external processes, operating in parallel. To characterize the dynamics of cellular processes one needs to observe many individual cells from multiple points of view and over time, so as to identify commonalities and variability. With this aim, we engineered a software tool, "SCIP", to analyse multi-modal, multi-process, time-lapse microscopy morphological and functional images.


Reverse-engineering flow-cytometry gating strategies for phenotypic labelling and high-performance cell sorting.

Bioinformatics 2018 Jun 21. Epub 2018 Jun 21.

Singapore Immunology Network, Agency for Science Technology and Research.

Motivation: Recent flow and mass cytometers generate datasets of dimensions 20 to 40 and a million single cells. From these, many tools facilitate the discovery of new cell populations associated with diseases or physiology. These new cell populations require the identification of new gating strategies, but gating strategies become exponentially more difficult to optimize when dimensionality increases.


snpAD: An ancient DNA genotype caller.

Authors:
Kay Prüfer

Bioinformatics 2018 Jun 21. Epub 2018 Jun 21.

Max Planck Institute for Evolutionary Anthropology, 04103 Leipzig, Germany.

Motivation: The study of ancient genomes can elucidate the evolutionary past. However, analyses are complicated by base-modifications in ancient DNA molecules that result in errors in DNA sequences. These errors are particularly common near the ends of sequences and pose a challenge for genotype calling.


PASTA for Proteins.

Bioinformatics 2018 Jun 19. Epub 2018 Jun 19.

Department of Computer Science, University of Illinois, Urbana, 61866, USA.

Summary: PASTA is a multiple sequence alignment method that uses divide-and-conquer plus iteration to enable base alignment methods to scale with high accuracy to large sequence datasets. By default, PASTA included MAFFT L-INS-i; our new extension of PASTA enables the use of MAFFT G-INS-i, MAFFT Homologs, CONTRAlign, and ProbCons. We analyzed the performance of each base method and PASTA using these base methods on 224 datasets from BAliBASE 4 with at least 50 sequences.


Meffil: efficient normalization and analysis of very large DNA methylation datasets.

Bioinformatics 2018 Jun 21. Epub 2018 Jun 21.

MRC Integrative Epidemiology Unit, University of Bristol, Bristol, UK.

Motivation: DNA methylation datasets are growing ever larger both in sample size and genome coverage. Novel computational solutions are required to efficiently handle these data.

Results: We have developed meffil, an R package designed for efficient quality control, normalization and epigenome-wide association studies of large samples of Illumina Methylation BeadChip microarrays.


Accurate Prediction of Protein Contact Maps by Coupling Residual Two-Dimensional Bidirectional Long Short-Term Memory with Convolutional Neural Networks.

Bioinformatics 2018 Jun 19. Epub 2018 Jun 19.

Institute for Glycomics and School of Information and Communication Technology, Griffith University, Southport, 4222, QLD, Australia.

Motivation: Accurate prediction of a protein contact map depends greatly on capturing as much contextual information as possible from surrounding residues for a target residue pair. Recently, ultra-deep residual convolutional networks were found to be state-of-the-art in the latest Critical Assessment of Structure Prediction techniques (CASP12; Schaarschmidt et al., 2018) for protein contact map prediction by attempting to provide a protein-wide context at each residue pair.


BioStructMap: A Python tool for integration of protein structure and sequence-based features.

Bioinformatics 2018 Jun 21. Epub 2018 Jun 21.

Life Sciences, Burnet Institute, Melbourne, Australia.

Summary: A sliding window analysis over a protein or genomic sequence is commonly performed, and we present a Python tool, BioStructMap, that extends this concept to three-dimensional (3D) space, allowing the application of a 3D sliding window analysis over a protein structure. BioStructMap is easily extensible, allowing the user to apply custom functions to spatially aggregated data. BioStructMap also allows mapping of underlying genomic sequences to protein structures, allowing the user to perform genetic-based analysis over spatially linked codons; this has applications when selection pressures arise at the level of protein structure.
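
The step from a sequence window to a structural window can be illustrated with a minimal Python sketch. This is not BioStructMap's API: the function names, the toy coordinates and the 4.0 Å radius are assumptions made purely for illustration. The 1D window aggregates runs of consecutive values, while the 3D window aggregates the values of all residues whose coordinates lie within a given radius of each residue.

```python
import math


def sliding_window_1d(values, width, func):
    """Apply func to every run of `width` consecutive values."""
    return [func(values[i:i + width]) for i in range(len(values) - width + 1)]


def sliding_window_3d(coords, values, radius, func):
    """For each residue, apply func to the values of all residues
    (itself included) lying within `radius` of it in 3D space."""
    results = []
    for center in coords:
        nearby = [v for c, v in zip(coords, values)
                  if math.dist(center, c) <= radius]
        results.append(func(nearby))
    return results


def mean(xs):
    return sum(xs) / len(xs)


# Hypothetical toy structure: one (x, y, z) coordinate per residue,
# plus one per-residue score to aggregate.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (3.8, 3.8, 0.0)]
values = [1.0, 2.0, 3.0, 10.0]

window_1d = sliding_window_1d(values, 2, mean)
window_3d = sliding_window_3d(coords, values, 4.0, mean)
```

Here `window_1d` is `[1.5, 2.5, 6.5]` and `window_3d` is `[1.5, 4.0, 2.5, 6.0]`. The second and fourth residues are two positions apart in sequence but adjacent in space, so only the 3D window aggregates them together.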


ICBdocker: a Docker image for proteome annotation and visualization.

Bioinformatics 2018 Jun 19. Epub 2018 Jun 19.

Laboratory of Evolutionary Innovations, Centro Andaluz de Biologia del Desarollo, CSIC, Universidad Pablo de Olavide, Carretera de Utrera, Km 1, 41013 Seville, Spain.

Summary: We introduce ICBdocker, a Docker environment that allows the annotation of functional and structural features of proteomes through a Python/Perl pipeline. DataTables pages make it easy to set up a web-resource for research groups with a focus on the same organisms or datasets. The results are available as tab-separated values (tsv) files and HTML, allowing data analysis and browsing.


MolArt: A molecular structure annotation and visualization tool.

Bioinformatics 2018 Jun 19. Epub 2018 Jun 19.

Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 6, avenue du Swing L-4367 Belvaux, Luxembourg.

Summary: MolArt fills the gap between sequence and structure visualization by providing a light-weight, interactive environment enabling exploration of sequence annotations in the context of available experimental or predicted protein structures. Provided a UniProt ID, MolArt downloads and displays sequence annotations, sequence-structure mapping and relevant structures. The sequence and structure views are interlinked, so that sequence annotations can be color-overlaid on the mapped structures, thus providing an enhanced understanding and interpretation of the available molecular data.


GIFT: Guided and Interpretable Factorization for Tensors with an Application to Large-Scale Multi-platform Cancer Analysis.

Bioinformatics 2018 Jun 21. Epub 2018 Jun 21.

Computer Science and Engineering, Seoul National University, Seoul, Korea.

Motivation: Given multi-platform genome data with prior knowledge of functional gene sets, how can we extract interpretable latent relationships between patients and genes? More specifically, how can we devise a tensor factorization method which produces an interpretable gene factor matrix based on functional gene set information while maintaining the decomposition quality and speed?

Method: We propose GIFT, a Guided and Interpretable Factorization for Tensors. GIFT provides interpretable factor matrices by encoding prior knowledge as a regularization term in its objective function.

Results: We apply GIFT to the PanCan12 dataset (TCGA multi-platform genome data) and compare the performance with P-Tucker, our baseline method without prior knowledge constraint, and Silenced-TF, our naive interpretable method.


13Check_RNA: A tool to evaluate 13C chemical shifts assignments of RNA.

Bioinformatics 2018 Jun 19. Epub 2018 Jun 19.

Instituto de Matemática Aplicada San Luis, Universidad Nacional de San Luis, CONICET, Avenida Italia 1556, 5700, San Luis-Argentina.

Motivation: Chemical shifts (CS) are an important source of structural information of macromolecules such as RNA. In addition to the scarce availability of CS for RNA, the observed values are prone to errors due to incorrect re-calibration or misassignments. Different groups have dedicated their efforts to correcting systematic CS errors in RNA.


Estimating pseudocounts and fold changes for digital expression measurements.

Authors:
Florian Erhard

Bioinformatics 2018 Jun 19. Epub 2018 Jun 19.

Institut für Virologie und Immunbiologie, Julius-Maximilians-Universität Würzburg, Versbacher Straße 7, 97078 Würzburg, Germany.

Motivation: Fold changes from count-based high-throughput experiments such as RNA-seq suffer from a zero-frequency problem. To circumvent division by zero, so-called pseudocounts are added to make all observed counts strictly positive. The magnitude of pseudocounts for digital expression measurements, and the stage of the analysis at which they are introduced, have remained arbitrary choices.
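
The zero-frequency problem and the effect of the pseudocount choice can be seen in a small Python sketch. The function name and the default pseudocount of 1 are illustrative assumptions. This is the conventional fixed-pseudocount calculation that the abstract calls arbitrary, not the estimation method the paper proposes.

```python
import math


def log2_fold_change(count_a, count_b, pseudocount=1.0):
    """log2 ratio of two counts after adding the same pseudocount to both.

    With pseudocount=0 a zero count would cause a division by zero
    (or a log of zero), which is the zero-frequency problem.
    """
    return math.log2((count_a + pseudocount) / (count_b + pseudocount))


# A gene observed 3 times in condition A and never in condition B:
fc_default = log2_fold_change(3, 0)                   # log2(4 / 1)
fc_small = log2_fold_change(3, 0, pseudocount=0.5)    # log2(3.5 / 0.5)
```

With a pseudocount of 1 the log2 fold change is 2.0. With a pseudocount of 0.5 it is log2(7), roughly 2.81. The same observation therefore gives a noticeably different effect size depending on the choice.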


SONiCS: PCR stutter noise correction in genome-scale microsatellites.

Bioinformatics 2018 Jun 19. Epub 2018 Jun 19.

Department of Anthropology, National Museum of Natural History, Smithsonian Institution, Washington, DC 20560 USA.

Motivation: Massively parallel capture of short tandem repeats (STRs, or microsatellites) provides a strategy for population genomic and demographic analyses at high resolution with or without a reference genome. However, the high Polymerase Chain Reaction (PCR) cycle numbers needed for target capture experiments create genotyping noise through polymerase slippage known as PCR stutter.

Results: We developed SONiCS (Stutter mONte Carlo Simulation), a solution for stutter correction based on dense forward simulations of PCR and capture experimental conditions.


Grimon: Graphical interface to visualize multi-omics networks.

Bioinformatics 2018 Jun 19. Epub 2018 Jun 19.

Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita 565-0871, Japan.

Summary: Rapid advances in high-throughput sequencing technologies have enabled more efficient acquisition of massive amounts of multi-omics data. However, interpretation of the underlying relationships across multi-omics networks has not fully succeeded, partly due to the lack of effective visualization methods. To aid interpretation of the results from such multi-omics data, we here present Grimon (Graphical interface to visualize multi-omics networks), an R package that visualizes high-dimensional multi-layered data sets in three-dimensional parallel coordinates.


3DPatch: fast 3D structure visualization with residue conservation.

Bioinformatics 2018 Jun 19. Epub 2018 Jun 19.

European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Summary: Amino acid residues showing above background levels of conservation are often indicative of functionally significant regions within a protein. Understanding how the sequence conservation profile relates in space requires projection onto a protein structure, a potentially time-consuming process. 3DPatch is a web application that streamlines this task by automatically generating multiple sequence alignments (where appropriate) and finding structural homologs, presenting the user with a choice of structures matching their query, annotated with residue conservation scores in a matter of seconds.


iLoc-lncRNA: predict the subcellular location of lncRNAs by incorporating octamer composition into general PseKNC.

Bioinformatics 2018 Jun 21. Epub 2018 Jun 21.

Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China.

Motivation: Long non-coding RNAs (lncRNAs) are a class of RNA molecules with more than 200 nucleotides. They have important functions in cell development and metabolism, such as genetic markers, genome rearrangements, chromatin modifications, cell cycle regulation, transcription and translation. Their functions are generally closely related to their localization in the cell.


Exploring the functional impact of alternative splicing on human protein isoforms using available annotation sources.

Brief Bioinform 2018 Jun 21. Epub 2018 Jun 21.

Department of Human Genetics, University of Chicago, 920 E. 58th Street, Chicago, IL, USA.

In recent years, the emphasis of scientific inquiry has shifted from whole-genome analyses to an understanding of cellular responses specific to tissue, developmental stage or environmental conditions. One of the central mechanisms underlying the diversity and adaptability of the contextual responses is alternative splicing (AS). It enables a single gene to encode multiple isoforms with distinct biological functions.


SpliceRover: Interpretable Convolutional Neural Networks for Improved Splice Site Prediction.

Bioinformatics 2018 Jun 21. Epub 2018 Jun 21.

Center for Biotech Data Science, Ghent University Global Campus, Songdo, Incheon, 305-701, South Korea.

Motivation: During the last decade, improvements in high-throughput sequencing have generated a wealth of genomic data. Functionally interpreting these sequences and finding the biological signals that are hallmarks of gene function and regulation is currently mostly done using automated genome annotation platforms, which mainly rely on integrated machine learning frameworks to identify different functional sites of interest, including splice sites. Splicing is an essential step in the gene regulation process, and the correct identification of splice sites is a major cornerstone in a genome annotation system.


Prediction of kinetics of protein folding with non-redundant contact information.

Bioinformatics 2018 Jun 19. Epub 2018 Jun 19.

Institute of Chemistry and Center for Computational Engineering & Science, University of Campinas, Campinas, SP, Brazil.

Motivation: The majority of the inter-residue distances in a protein structure are correlated given a fixed topology. Here, we investigate whether we are able to predict a structure's folding rate, which is known to depend on the complexity of its fold, while considering only a small, uncorrelated fraction of its contacts.

Results: We define an expression for the probabilistic information content associated with the relative position of a pair of amino acid residues in a protein structure.


Deep convolutional networks for quality assessment of protein folds.

Bioinformatics 2018 Jun 19. Epub 2018 Jun 19.

Department of Chemistry and Biochemistry and Centre for Research in Molecular Modeling (CERMM), Concordia University, Montréal, H4B 1R6, Canada.

Motivation: The computational prediction of a protein structure from its sequence generally relies on a method to assess the quality of protein models. Most assessment methods rank candidate models using heavily engineered structural features, defined as complex functions of the atomic coordinates. However, very few methods have attempted to learn these features directly from the data.


Operon-mapper: A Web Server for Precise Operon Identification in Bacterial and Archaeal Genomes.

Bioinformatics 2018 Jun 19. Epub 2018 Jun 19.

Department of Molecular Microbiology, Instituto de Biotecnología, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, México.

Summary: Operon-mapper is a web server that accurately, easily, and directly predicts the operons of any bacterial or archaeal genome sequence. The operon predictions are based on the intergenic distance of neighboring genes as well as the functional relationships of their protein-coding products. To this end, Operon-mapper finds all the ORFs within a given nucleotide sequence, along with their genomic coordinates, orthology groups, and functional relationships.


panISa: Ab initio detection of insertion sequences in bacterial genomes from short read sequence data.

Bioinformatics 2018 Jun 19. Epub 2018 Jun 19.

UMR CNRS 6249, Chrono-environnement, Université de Bourgogne Franche-Comté, Besançon, France.

Motivation: The advent of next-generation sequencing has boosted the analysis of bacterial genome evolution. Insertion sequence (IS) elements play a key role in prokaryotic genome organization and evolution, but their repetitions in genomes complicate their detection from short-read data.

Results: PanISa is a software pipeline that identifies IS insertions ab initio in bacterial genomes from short-read data.


htsget: a protocol for securely streaming genomic data.

Bioinformatics 2018 Jun 19. Epub 2018 Jun 19.

European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, CB10 1SD, UK.

Summary: Standardised interfaces for efficiently accessing high-throughput sequencing data are a fundamental requirement for large-scale genomic data sharing. We have developed htsget, a protocol for secure, efficient and reliable access to sequencing read and variation data. We demonstrate four independent client and server implementations, and the results of a comprehensive interoperability demonstration.


Modelling Signalling Networks from Perturbation Data.

Bioinformatics 2018 Jun 19. Epub 2018 Jun 19.

Institute of Pathology, Charité Universitätsmedizin, Berlin, Germany.

Motivation: Intracellular signalling is realised by complex signalling networks which are almost impossible to understand without network models, especially if feedbacks are involved. Modular Response Analysis (MRA) is a convenient modelling method to study signalling networks in various contexts.

Results: We developed the software package STASNet that provides an augmented and extended version of MRA suited to model signalling networks from incomplete perturbation schemes and multi-perturbation data.


Predicting clone genotypes from tumor bulk sequencing of multiple samples.

Bioinformatics 2018 Jun 19. Epub 2018 Jun 19.

Institute for Genomics and Evolutionary Medicine.

Motivation: Analyses of data generated from bulk sequencing of tumors have revealed extensive genomic heterogeneity within patients. Many computational methods have been developed to enable the inference of genotypes of tumor cell populations (clones) from bulk sequencing data. However, the relative and absolute accuracy of available computational methods in estimating clone counts and clone genotypes is not yet known.


GWASinlps: Nonlocal prior based iterative SNP selection tool for genome-wide association studies.

Bioinformatics 2018 Jun 19. Epub 2018 Jun 19.

Department of Radiology, University of California, San Diego, La Jolla, CA 92093, USA.

Motivation: Multiple marker analysis of the genome-wide association study (GWAS) data has gained ample attention in recent years. However, because of the ultra high-dimensionality of GWAS data, such analysis is challenging. Frequently used penalized regression methods often lead to a large number of false positives, whereas Bayesian methods are computationally very expensive.


Using tree-based methods for detection of gene-gene interactions in the presence of a polygenic signal: simulation study with application to educational attainment in the Generation Scotland Cohort Study.

Bioinformatics 2018 Jun 19. Epub 2018 Jun 19.

Centre for Genomic and Experimental Medicine, Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, UK.

Motivation: The genomic architecture of human complex diseases is thought to be attributable to single markers, polygenic components and epistatic components. No study has examined the ability of tree-based methods to detect epistasis in the presence of a polygenic signal. We sought to apply decision tree-based methods, C5.


mol2sphere: Spherical Decomposition of Multi-Domain Molecules for Visualization and Coarse Grained Spatial Modeling.

Bioinformatics 2018 Jun 19. Epub 2018 Jun 19.

Department of Molecular Biology and Biophysics, UConn Health, Farmington, CT 06030 USA.

Motivation: Proteins, especially those involved in signaling pathways, are composed of functional modules connected by linker domains with varying degrees of flexibility. To understand the structure-function relationships in these macromolecules, it is helpful to visualize the geometric arrangement of domains. Furthermore, accurate spatial representation of domain structure is necessary for coarse-grain models of the multi-molecular interactions that comprise signaling pathways.


HUME: Large-scale Detection of Causal Genetic Factors of Adverse Drug Reactions.

Bioinformatics 2018 Jun 19. Epub 2018 Jun 19.

Department of Computing Science, Simon Fraser University, Burnaby, Canada.

Motivation: Adverse Drug Reactions (ADRs) are one of the major factors that affect the wellbeing of patients and financial costs of healthcare systems. Genetic variations of patients have been shown to be a key factor in the occurrence and severity of many ADRs. However, the large number of confounding drugs and genetic biomarkers for each adverse reaction case demands a method that evaluates all potential genetic causes of ADRs simultaneously.


Gene length corrected trimmed mean of M-values (GeTMM) processing of RNA-seq data performs similarly in intersample analyses while improving intrasample comparisons.

BMC Bioinformatics 2018 Jun 22;19(1):236. Epub 2018 Jun 22.

Department of Medical Oncology, Erasmus MC Cancer Institute, Erasmus MC University Medical Center, 3015 CE, Rotterdam, The Netherlands.

Background: Current normalization methods for RNA-sequencing data allow either for intersample comparison to identify differentially expressed (DE) genes or for intrasample comparison for the discovery and validation of gene signatures. Most studies on optimization of normalization methods typically use simulated data to validate methodologies. We describe a new method, GeTMM, which allows for both inter- and intrasample analyses with the same normalized data set.


A selective method for optimizing ensemble docking-based experiments on an InhA Fully-Flexible receptor model.

BMC Bioinformatics 2018 Jun 22;19(1):235. Epub 2018 Jun 22.

Bioinformatics and Biossystems Modeling and Simulation Lab-LABIO, School of Technology, PUCRS, Av. Ipiranga, 6681, Building 32, Room 602, Porto Alegre, RS, Brazil.

Background: In the rational drug design process, an ensemble of conformations obtained from a molecular dynamics simulation plays a crucial role in docking experiments. Some studies have found that Fully-Flexible Receptor (FFR) models predict realistic binding energy accurately and improve scoring to enhance selectiveness. At the same time, methods have been proposed to reduce the high computational costs involved in considering the explicit flexibility of proteins in receptor-ligand docking.


ARKS: chromosome-scale scaffolding of human genome drafts with linked read kmers.

BMC Bioinformatics 2018 Jun 20;19(1):234. Epub 2018 Jun 20.

BC Cancer Genome Sciences Centre, Vancouver, BC, V5Z 4S6, Canada.

Background: The long-range sequencing information captured by linked reads, such as those available from 10× Genomics (10xG), helps resolve genome sequence repeats, and yields accurate and contiguous draft genome assemblies. We introduce ARKS, an alignment-free linked read genome scaffolding methodology that uses linked reads to organize genome assemblies further into contiguous drafts. Our approach departs from other read alignment-dependent linked read scaffolders, including our own (ARCS), and uses a kmer-based mapping approach.


Epigenetic machine learning: utilizing DNA methylation patterns to predict spastic cerebral palsy.

BMC Bioinformatics 2018 Jun 21;19(1):225. Epub 2018 Jun 21.

Nemours Biomedical Research, Nemours - Alfred I. duPont Hospital for Children, 1600 Rockland Rd, Wilmington, DE, 19803, USA.

Background: Spastic cerebral palsy (CP) is a leading cause of physical disability. Most people with spastic CP are born with it, but early diagnosis is challenging, and no current biomarker platform readily identifies affected individuals. The aim of this study was to evaluate epigenetic profiles as biomarkers for spastic CP.


Feature extraction method for proteins based on Markov tripeptide by compressive sensing.

Authors:
C F Gao, X Y Wu

BMC Bioinformatics 2018 Jun 18;19(1):229. Epub 2018 Jun 18.

School of Science, Jiangnan University, Wuxi, 214122, China.

Background: In order to capture the vital structural information of the original protein, the symbol sequence was transformed into the Markov frequency matrix according to the consecutive three residues throughout the chain. A three-dimensional sparse matrix sized 20 × 20 × 20 was obtained and expanded to a one-dimensional vector. Then, an appropriate measurement matrix was selected for the vector to obtain a compressed feature set by random projection.
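
The pipeline described above (tripeptide counts in a 20 × 20 × 20 array, flattening to a vector, then random projection) can be sketched in a few lines of Python. The details here are assumptions rather than the authors' choices: the amino-acid ordering, the normalization to frequencies, the Gaussian measurement matrix, the output size of 16 and the example sequence are all hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def tripeptide_vector(sequence):
    """Frequencies of consecutive residue triples, stored as the
    20 x 20 x 20 array flattened to a vector of length 8000."""
    vec = [0.0] * (20 ** 3)
    for a, b, c in zip(sequence, sequence[1:], sequence[2:]):
        # Skip triples containing non-standard residue symbols.
        if a in INDEX and b in INDEX and c in INDEX:
            vec[(INDEX[a] * 20 + INDEX[b]) * 20 + INDEX[c]] += 1.0
    total = sum(vec)
    return [v / total for v in vec] if total else vec


def random_projection(vec, n_features, seed=0):
    """Compress vec with a seeded Gaussian measurement matrix
    (assumed here; the paper's matrix may differ)."""
    rng = random.Random(seed)
    scale = 1.0 / n_features ** 0.5
    # One matrix row per output feature; every entry is drawn even when
    # the input value is zero, so the same seed always yields the same matrix.
    return [sum(rng.gauss(0.0, 1.0) * scale * v for v in vec)
            for _ in range(n_features)]


features = random_projection(tripeptide_vector("MKVLAAGIVKVLA"), 16)
```

Because the generator is seeded, every protein is projected with the same measurement matrix, so the compressed features of different sequences remain comparable.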


Performance of epistasis detection methods in semi-simulated GWAS.

BMC Bioinformatics 2018 Jun 18;19(1):231. Epub 2018 Jun 18.

SANOFI R&D, Translational Sciences, Chilly Mazarin, 91385, France.

Background: Part of the missing heritability in Genome Wide Association Studies (GWAS) is expected to be explained by interactions between genetic variants, also called epistasis. Various statistical methods have been developed to detect epistasis in case-control GWAS. These methods face major statistical challenges due to the number of tests required, the complexity of the Linkage Disequilibrium (LD) structure, and the lack of consensus regarding the definition of epistasis.


SamSelect: a sample sequence selection algorithm for quorum planted motif search on large DNA datasets.

BMC Bioinformatics 2018 Jun 18;19(1):228. Epub 2018 Jun 18.

School of Computer Science and Technology, Xidian University, Xi'an, 710071, China.

Background: Given a set of t n-length DNA sequences, q satisfying 0 < q ≤ 1, and l and d satisfying 0 ≤ d < l < n, the quorum planted motif search (qPMS) finds l-length strings that occur in at least qt input sequences with up to d mismatches and is mainly used to locate transcription factor binding sites in DNA sequences. Existing qPMS algorithms have been able to efficiently process small standard datasets (e.g.
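
The qPMS problem statement above can be made concrete with a brute-force Python sketch. This is only an illustration of the definition, not SamSelect or any published qPMS algorithm: it enumerates all 4^l candidate strings, so it is usable only for very small l. The function names and the toy sequences are hypothetical.

```python
import math
from itertools import product


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def occurs(motif, sequence, d):
    """True if some substring of sequence matches motif with at most d mismatches."""
    l = len(motif)
    return any(hamming(motif, sequence[i:i + l]) <= d
               for i in range(len(sequence) - l + 1))


def qpms_bruteforce(sequences, l, d, q):
    """All l-length DNA strings occurring, with up to d mismatches,
    in at least q * t of the t input sequences."""
    needed = math.ceil(q * len(sequences))
    motifs = []
    for candidate in product("ACGT", repeat=l):
        motif = "".join(candidate)
        if sum(occurs(motif, s, d) for s in sequences) >= needed:
            motifs.append(motif)
    return motifs


sequences = ["ACGTAC", "TTACGT", "GGGGGG", "CCCCCC"]
exact = qpms_bruteforce(sequences, l=4, d=0, q=0.5)
```

With t = 4 and q = 0.5 a motif must occur in at least two sequences. Only `ACGT` does so without mismatches, so `exact` is `["ACGT"]`. Allowing d = 1 admits further candidates, which shows how quickly the search space grows.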


Algorithms designed for compressed-gene-data transformation among gene banks with different references.

BMC Bioinformatics 2018 Jun 18;19(1):230. Epub 2018 Jun 18.

NHPCC/Guangdong Key Laboratory of popular HPC and College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518060, China.

Background: With the falling cost of gene sequencing and the demand from emerging technologies such as precision medicine and deep learning in genomics, the volume of gene data is growing explosively. How to store, transmit and analyze these data has become a hotspot of current research. Reference-based compression algorithms are now widely used because of their high compression ratio.


Evaluating methods of inferring gene regulatory networks highlights their lack of performance for single cell gene expression data.

BMC Bioinformatics 2018 Jun 19;19(1):232. Epub 2018 Jun 19.

Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, New York, USA.

Background: A fundamental fact in biology states that genes do not operate in isolation, and yet, methods that infer regulatory networks for single cell gene expression data have been slow to emerge. With single cell sequencing methods now becoming accessible, general network inference algorithms that were initially developed for data collected from bulk samples may not be suitable for single cells. Meanwhile, although methods that are specific for single cell data are now emerging, whether they have improved performance over general methods is unknown.


Predicting drug-disease associations by using similarity constrained matrix factorization.

BMC Bioinformatics 2018 Jun 19;19(1):233. Epub 2018 Jun 19.

School of Computer Science, Wuhan University, Wuhan, 430072, China.

Background: Drug-disease associations provide important information for drug discovery. Wet experiments that identify drug-disease associations are time-consuming and expensive. However, many drug-disease associations are still unobserved or unknown.


Metaxa2 Database Builder: Enabling taxonomic identification from metagenomic or metabarcoding data using any genetic marker.

Bioinformatics 2018 Jun 15. Epub 2018 Jun 15.

Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, SE-405 30 Gothenburg, Sweden.

Motivation: Correct taxonomic identification of DNA sequences is central to studies of biodiversity using both shotgun metagenomic and metabarcoding approaches. However, no genetic marker gives sufficient performance across all the biological kingdoms, hampering studies of taxonomic diversity in many groups of organisms. This has led to the adoption of a range of genetic markers for DNA metabarcoding.


GeneSpy, a user-friendly and flexible genomic context visualizer.

Bioinformatics 2018 Jun 15. Epub 2018 Jun 15.

Univ Lyon, Université Lyon 1, CNRS, UMR5558, Laboratoire de Biométrie et Biologie Évolutive, 43 bd du 11 novembre 1918, F-69622, Villeurbanne, France.

Summary: The exploration and comparison of genome organization is routinely used in genomic and phylogenomic analyses. As a consequence, in the past few years, various tools for visualizing genomic contexts have been developed. However, their use is often hampered by a lack of flexibility, particularly concerning databases, input formats and figure customization.


iMetaLab 1.0: A web platform for metaproteomics data analysis.

Bioinformatics 2018 Jun 15. Epub 2018 Jun 15.

Department of Biochemistry, Microbiology and Immunology, Ottawa Institute of Systems Biology, Faculty of Medicine, University of Ottawa, Ottawa, Ontario, Canada.

Summary: The human gut microbiota, a complex, dynamic and biodiverse community, has been increasingly shown to influence many aspects of health and disease. Metaproteomic analysis has proven to be a powerful approach to study the functionality of the microbiota. However, the processing and analysis of metaproteomic mass spectrometry (MS) data remain a daunting task.


Chemical shift-based identification of monosaccharide spin-systems with NMR spectroscopy to complement untargeted glycomics.

Bioinformatics 2018 Jun 15. Epub 2018 Jun 15.

Department of Biosciences, University of Salzburg, Billrothstrasse 11, 5020 Salzburg, Austria.

Motivation: A better understanding of oligosaccharides and their wide-ranging functions in almost every aspect of biology and medicine promises to uncover hidden layers of biology and will support the development of better therapies. Elucidating the chemical structure of an unknown oligosaccharide is still a challenge. Efficient tools are required for non-targeted glycomics.


TiSAn: Estimating Tissue Specific Effects of Coding and Noncoding Variants.

Bioinformatics 2018 Apr 18. Epub 2018 Apr 18.

Department of Psychiatry, University of Iowa, Carver College of Medicine, Iowa City, 52240, USA.

Motivation: Model-based estimates of general deleteriousness, like CADD, DANN or PolyPhen, have become indispensable tools in the interpretation of genetic variants. However, these approaches say little about the tissues in which the effects of deleterious variants will be most meaningful. Tissue-specific annotations have been recently inferred for dozens of tissues/cell types from large collections of cross-tissue epigenomic data, and have demonstrated sensitivity in predicting affected tissues in complex traits.


Identifying differentially methylated sites in samples with varying tumor purity.

Bioinformatics 2018 Apr 18. Epub 2018 Apr 18.

Research Programs Unit, Genome-Scale Biology, Medicum and Department of Biochemistry and Developmental Biology, Faculty of Medicine, University of Helsinki, Helsinki, Finland.

Motivation: DNA methylation aberrations are common in many cancer types. A major challenge hindering comparison of patient-derived samples is that they comprise a heterogeneous collection of cancer and microenvironment cells. We present a computational method that allows comparing cancer methylomes in two or more heterogeneous tumor samples featuring differing, unknown fractions of cancer cells.


Prioritizing Predictive Biomarkers for Gene Essentiality in Cancer Cells with mRNA Expression Data and DNA Copy Number Profile.

Bioinformatics 2018 Jun 15. Epub 2018 Jun 15.

Department of Computational Medicine and Bioinformatics, University of Michigan. 100 Washtenaw Avenue, Ann Arbor, MI 48109-2218, USA.

Motivation: Finding driver genes that are responsible for the aberrant proliferation rate of cancer cells is informative for both cancer research and the development of targeted drugs. The established experimental and computational methods are labor-intensive. To make algorithms feasible in real clinical settings, methods that can predict driver genes using less experimental data are urgently needed.


The Ancestral KH Peptide at the Root of a Domain Family With Three Different Folds.

Bioinformatics 2018 Jun 15. Epub 2018 Jun 15.

Department of Protein Evolution, Max-Planck-Institute for Developmental Biology, Tübingen 72076, Germany.

Motivation: The direct ancestor of the DNA-protein world of today is considered to have been an RNA-peptide world, in which peptides were co-factors of RNA-mediated catalysis and replication. Evidence for these ancestral peptides, from which folded proteins evolved, can be derived even today from regions of local sequence similarity within globally dissimilar folds. One of these is the 45-residue motif common to both folds of the hnRNP K homology (KH) domain.
