    21627 results match your criteria Bioinformatics [Journal]

    Bacmeta: simulator for genomic evolution in bacterial metapopulations.
    Bioinformatics 2018 Feb 20. Epub 2018 Feb 20.
    Department of Mathematics and Statistics, University of Helsinki, Helsinki, 00014, Finland.
    Summary: The advent of genomic data from densely sampled bacterial populations has created a need for flexible simulators by which models and hypotheses can be efficiently investigated in the light of empirical observations. Bacmeta provides fast stochastic simulation of neutral evolution within a large collection of interconnected bacterial populations with completely adjustable connectivity network. Stochastic events of mutations, recombinations, insertions/deletions, migrations and microepidemics can be simulated in discrete non-overlapping generations with a Wright-Fisher model that operates on explicit sequence data of any desired genome length. Read More

    STAR Chimeric Post For Rapid Detection of Circular RNA and Fusion Transcripts.
    Bioinformatics 2018 Feb 20. Epub 2018 Feb 20.
    Department of Genetics and Genomic Sciences, The Icahn Institute for Genomics and Multiscale Biology Icahn School of Medicine at Mount Sinai, New York, NY, USA.
    Motivation: The biological relevance of chimeric RNA alignments is now well established. Chimeras arising from chromosomal fusions are often drivers of cancer, and recently discovered circular RNAs are only now being characterized. While software already exists for fusion discovery and quantitation, high false positive rates and high run-times hamper scalable fusion discovery on large datasets. Read More

    PennDiff: Detecting Differential Alternative Splicing and Transcription by RNA Sequencing.
    Bioinformatics 2018 Feb 20. Epub 2018 Feb 20.
    Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA.
    Motivation: Alternative splicing and alternative transcription are major mechanisms for generating transcriptome diversity. Differential alternative splicing and transcription (DAST), which describe different usage of transcript isoforms across different conditions, can complement differential expression in characterizing gene regulation. However, the analysis of DAST is challenging because only a small fraction of RNA-seq reads is informative for isoforms. Read More

    MUGAN: Multi-GPU accelerated AmpliconNoise server for rapid microbial diversity assessment.
    Bioinformatics 2018 Feb 20. Epub 2018 Feb 20.
    Electrical and Computer Engineering, Seoul National University, Seoul 08826, Korea.
    Motivation: Metagenomic sequencing has become a crucial tool for obtaining a gene catalogue of operational taxonomic units (OTUs) in a microbial community. A typical metagenomic sequencing produces a large amount of data (often in the order of terabytes or more), and computational tools are indispensable for efficient processing. In particular, error correction in metagenomics is crucial for accurate and robust genetic cataloging of microbial communities. Read More

    EPIC-CoGe: Managing and Analyzing Genomic Data.
    Bioinformatics 2018 Feb 20. Epub 2018 Feb 20.
    BIO5 Institute, School of Plant Sciences, University of Arizona, Tucson, AZ 85721, USA.
    Summary: The EPIC-CoGe browser is a web-based genome visualization utility that integrates the GMOD JBrowse genome browser with the extensive CoGe genome database (currently containing over 30,000 genomes). In addition, the EPIC-CoGe browser boasts many additional features over basic JBrowse, including enhanced search capability and on-the-fly analyses for comparisons and analyses between all types of functional and diversity genomics data. There is no installation required and data (genome, annotation, functional genomic, and diversity data) can be loaded by following a simple point and click wizard, or using a REST API, making the browser widely accessible and easy to use by researchers of all computational skill levels. Read More

    IWTomics: testing high-resolution sequence-based "Omics" data at multiple locations and scales.
    Bioinformatics 2018 Feb 20. Epub 2018 Feb 20.
    MOX - Modeling and Scientific Computing, Dept. of Mathematics, Politecnico di Milano, Milano, Italy.
    Summary: With increased generation of high-resolution sequence-based "Omics" data, detecting statistically significant effects at different genomic locations and scales has become key to addressing several scientific questions. IWTomics is an R/Bioconductor package (integrated in Galaxy) that, exploiting sophisticated Functional Data Analysis techniques (i.e. Read More

    Progressive Approach for SNP Calling and Haplotype Assembly Using Single Molecular Sequencing Data.
    Bioinformatics 2018 Feb 19. Epub 2018 Feb 19.
    Department of Computer Science, City University of Hong Kong, 83 Tat Chee Ave., Hong Kong.
    Motivation: Haplotype information is essential to the complete description and interpretation of genomes, genetic diversity and genetic ancestry. New technologies can provide Single Molecular Sequencing (SMS) data that cover about 90% of positions over chromosomes. However, SMS data have a higher error rate compared with the 1% error rate of short reads. Read More

    ViCTree: An automated framework for taxonomic classification from protein sequences.
    Bioinformatics 2018 Feb 20. Epub 2018 Feb 20.
    MRC-University of Glasgow Centre for Virus Research, Glasgow, UK.
    Motivation: The increasing rate of submission of genetic sequences into public databases is providing a growing resource for classifying the organisms that these sequences represent. To aid viral classification, we have developed ViCTree, which automatically integrates the relevant sets of sequences in NCBI GenBank and transforms them into an interactive maximum likelihood phylogenetic tree that can be updated automatically. ViCTree incorporates ViCTreeView, which is a JavaScript-based visualisation tool that enables the tree to be explored interactively in the context of pairwise distance data. Read More

    Cost Function Network-based Design of Protein-Protein Interactions: predicting changes in binding affinity.
    Bioinformatics 2018 Feb 20. Epub 2018 Feb 20.
    LISBP, Université de Toulouse, CNRS, INRA, INSA, Toulouse, France.
    Motivation: Accurate and economic methods to predict change in protein binding free energy upon mutation are imperative to accelerate the design of proteins for a wide range of applications. Free energy is defined by enthalpic and entropic contributions. Following recent progress in Artificial Intelligence-based algorithms for guaranteed NP-hard energy optimization and partition function computation, it becomes possible to quickly compute minimum energy conformations and to reliably estimate the entropic contribution of side-chains in the change of free energy of large protein interfaces. Read More

    How large B-factors can be in protein crystal structures.
    BMC Bioinformatics 2018 Feb 23;19(1):61. Epub 2018 Feb 23.
    Department of Structural and Computational Biology, University of Vienna, Campus Vienna Biocenter 5, A-1030, Vienna, Austria.
    Background: Protein crystal structures are potentially over-interpreted since they are routinely refined without any restraint on the upper limit of atomic B-factors. Consequently, some of their atoms, undetected in the electron density maps, are allowed to reach extremely large B-factors, even above 100 square Angstroms, and their final positions are purely speculative and not based on any experimental evidence.

    Results: A strategy to define upper limits for B-factors is described here, based on the analysis of protein crystal structures deposited in the Protein Data Bank prior to 2008, when the tendency to allow B-factors to inflate arbitrarily was limited. Read More

    Multiobjective multifactor dimensionality reduction to detect SNP-SNP interactions.
    Bioinformatics 2018 Feb 19. Epub 2018 Feb 19.
    Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung, Taiwan.
    Motivation: Single-nucleotide polymorphism (SNP)-SNP interactions (SSIs) are popular markers for understanding disease susceptibility. Multifactor dimensionality reduction (MDR) can successfully detect considerable SSIs. Currently, MDR-based methods mainly adopt a single-objective function (a single measure based on contingency tables) to detect SSIs. Read More

    The VAAST Variant Prioritizer (VVP): ultrafast, easy to use whole genome variant prioritization tool.
    BMC Bioinformatics 2018 Feb 20;19(1):57. Epub 2018 Feb 20.
    Department of Human Genetics, University of Utah, Salt Lake City, UT, USA.
    Background: Prioritization of sequence variants for diagnosis and discovery of Mendelian diseases is challenging, especially in large collections of whole genome sequences (WGS). Fast, scalable solutions are needed for discovery research, for clinical applications, and for curation of massive public variant repositories such as dbSNP and gnomAD. In response, we have developed VVP, the VAAST Variant Prioritizer. Read More

    The lncLocator: a subcellular localization predictor for long non-coding RNAs based on a stacked ensemble classifier.
    Bioinformatics 2018 Feb 15. Epub 2018 Feb 15.
    Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, China.
    Motivation: Studies of long non-coding RNAs (lncRNAs) have been a hot topic in the field of RNA biology. Recent studies have shown that their subcellular localizations carry important information for understanding their complex biological functions. Considering the costly and time-consuming experiments for identifying subcellular localization of lncRNAs, computational methods are urgently desired. Read More

    A new approach for interpreting random forest models and its application to the biology of ageing.
    Bioinformatics 2018 Feb 16. Epub 2018 Feb 16.
    School of Computing, University of Kent, Canterbury, Kent, CT2 7NF, UK.
    Motivation: This work uses the Random Forest (RF) classification algorithm to predict if a gene is overexpressed, underexpressed or has no change in expression with age in the brain. RFs have high predictive power, and RF models can be interpreted using a feature (variable) importance measure. However, current feature importance measures evaluate a feature as a whole (all feature values). Read More

    Combining co-evolution and secondary structure prediction to improve fragment library generation.
    Bioinformatics 2018 Feb 15. Epub 2018 Feb 15.
    Department of Statistics, University of Oxford, Oxford, OX1 3LB, United Kingdom.
    Motivation: Recent advances in co-evolution techniques have made possible the accurate prediction of protein structures in the absence of a template. Here, we provide a general approach that further utilizes co-evolution constraints to generate better fragment libraries for fragment-based protein structure prediction.

    Results: We have compared five different fragment library generation programmes on three different data sets encompassing over 400 unique protein folds. Read More

    flowLearn: Fast and precise identification and quality checking of cell populations in flow cytometry.
    Bioinformatics 2018 Feb 15. Epub 2018 Feb 15.
    CITEC centre of excellence, Bielefeld, 33619, Germany.
    Motivation: Identification of cell populations in flow cytometry is a critical part of the analysis and lays the groundwork for many applications and research discovery. The current paradigm of manual analysis is time consuming and subjective. A common goal of users is to replace manual analysis with automated methods that replicate their results. Read More

    SecretSanta: flexible pipelines for functional secretome prediction.
    Bioinformatics 2018 Feb 16. Epub 2018 Feb 16.
    University of Cambridge, Sainsbury Laboratory, Cambridge, United Kingdom.
    Motivation: The secretome denotes the collection of secreted proteins exported outside of the cell. The functional roles of secreted proteins include the maintenance and remodelling of the extracellular matrix as well as signalling between host and non-host cells. These features make secretomes rich reservoirs of biomarkers for disease classification and host-pathogen interaction studies. Read More

    LS-align: an atom-level, flexible ligand structural alignment algorithm for high-throughput virtual screening.
    Bioinformatics 2018 Feb 15. Epub 2018 Feb 15.
    Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw, Ann Arbor, MI 48109-2218, USA.
    Motivation: Sequence-order independent structural comparison, also called structural alignment, of small ligand molecules is often needed for computer-aided virtual drug screening. Although many ligand structure alignment programs are proposed, most of them build the alignments based on rigid-body shape comparison which cannot provide atom-specific alignment information nor allow structural variation; both abilities are critical to efficient high-throughput virtual screening.

    Results: We propose a novel ligand comparison algorithm, LS-align, to generate fast and accurate atom-level structural alignments of ligand molecules, through an iterative heuristic search of the target function that combines inter-atom distance with mass and chemical bond comparisons. Read More

    CEMiTool: a Bioconductor package for performing comprehensive modular co-expression analyses.
    BMC Bioinformatics 2018 Feb 20;19(1):56. Epub 2018 Feb 20.
    Department of Clinical and Toxicological Analyses, School of Pharmaceutical Sciences, University of São Paulo, São Paulo, SP, 05508-900, Brazil.
    Background: The analysis of modular gene co-expression networks is a well-established method commonly used for discovering the systems-level functionality of genes. In addition, these studies provide a basis for the discovery of clinically relevant molecular pathways underlying different diseases and conditions.

    Results: In this paper, we present a fast and easy-to-use Bioconductor package named CEMiTool that unifies the discovery and the analysis of co-expression modules. Read More

    StructRNAfinder: an automated pipeline and web server for RNA families prediction.
    BMC Bioinformatics 2018 Feb 17;19(1):55. Epub 2018 Feb 17.
    Centro de Genómica y Bioinformática, Facultad de Ciencias, Universidad Mayor, 8580745, Santiago, Chile.
    Background: The function of many noncoding RNAs (ncRNAs) depends upon their secondary structures. Over the last decades, several methodologies have been developed to predict such structures or to use them to functionally annotate RNAs into RNA families. However, to fully perform this analysis, researchers should utilize multiple tools, which require the constant parsing and processing of several intermediate files. Read More

    pBRIT: Gene Prioritization by Correlating Functional and Phenotypic Annotations Through Integrative Data Fusion.
    Bioinformatics 2018 Feb 14. Epub 2018 Feb 14.
    Center of Medical Genetics, University of Antwerp & Antwerp University Hospital, Antwerp, Belgium.
    Motivation: Computational gene prioritization can aid in disease gene identification. Here, we propose pBRIT (prioritization using Bayesian Ridge regression and Information Theoretic model), a novel adaptive and scalable prioritization tool, integrating Pubmed abstracts, Gene Ontology, Sequence similarities, Mammalian and Human Phenotype Ontology, Pathway, Interactions, Disease Ontology, Gene Association database and Human Genome Epidemiology database, into the prediction model. We explore and address effects of sparsity and inter-feature dependencies within annotation sources, and the impact of bias towards specific annotations. Read More

    StructureMapper: a high-throughput algorithm for analyzing protein sequence locations in structural data.
    Bioinformatics 2018 Feb 14. Epub 2018 Feb 14.
    Faculty of Medicine and Life Sciences and BioMediTech, University of Tampere, Arvo Ylpön katu 34, 33520 Tampere, Finland.
    Motivation: StructureMapper is a high-throughput algorithm for automated mapping of protein primary amino sequence locations to existing three-dimensional protein structures. The algorithm is intended for facilitating easy and efficient utilization of structural information in protein characterization and proteomics. StructureMapper provides an analysis of the identified structural locations that includes surface accessibility, flexibility, protein-protein interfacing, intrinsic disorder prediction, secondary structure assignment, biological assembly information, and sequence identity percentages, among other metrics. Read More

    SEED 2: a user-friendly platform for amplicon high-throughput sequencing data analyses.
    Bioinformatics 2018 Feb 14. Epub 2018 Feb 14.
    Institute of Microbiology of the CAS, Videnská 1083, 14220 Prague 4, Czech Republic.
    Motivation: Modern molecular methods have increased our ability to describe microbial communities. Along with the advances brought by new sequencing technologies, we now require intensive computational resources to make sense of the large numbers of sequences continuously produced. The software tools developed by the scientific community to address this demand, although very useful, require experience of the command-line environment and extensive training, and have steep learning curves, limiting their use. Read More

    RaMWAS: Fast Methylome-Wide Association Study Pipeline for Enrichment Platforms.
    Bioinformatics 2018 Feb 12. Epub 2018 Feb 12.
    Center for Biomarker Research and Precision Medicine, Virginia Commonwealth University, Richmond, VA 23298, USA.
    Motivation: Enrichment-based technologies can provide measurements of DNA methylation at tens of millions of CpGs for thousands of samples. Existing tools for methylome-wide association studies cannot analyze data sets of this size and lack important features like principal component analysis, combined analysis with SNP data, and outcome predictions that are based on all informative methylation sites.

    Results: We present a Bioconductor R package called RaMWAS with a full set of tools for large-scale methylome-wide association studies. Read More

    Enhancing protein fold determination by exploring the complementary information of chemical cross-linking and coevolutionary signals.
    Bioinformatics 2018 Feb 12. Epub 2018 Feb 12.
    Institute of Chemistry.
    Motivation: Elucidation of protein native states from amino acid sequences is a primary computational challenge. Modern computational and experimental methodologies, such as molecular coevolution and chemical cross-linking mass-spectrometry, have extended protein structural characterization to previously intangible systems. Despite several independent successful examples, data from these distinct methodologies have not been systematically studied in conjunction. Read More

    The BioCyc collection of microbial genomes and metabolic pathways.
    Brief Bioinform 2017 Aug 17. Epub 2017 Aug 17.
    BioCyc.org is a microbial genome Web portal that combines thousands of genomes with additional information inferred by computer programs, imported from other databases and curated from the biomedical literature by biologist curators. BioCyc also provides an extensive range of query tools, visualization services and analysis software. Read More

    ChemDistiller: an engine for metabolite annotation in mass spectrometry.
    Bioinformatics 2018 Feb 12. Epub 2018 Feb 12.
    Department of Surgery and Cancer, Faculty of Medicine, Imperial College London, London, UK.
    Motivation: High-resolution mass spectrometry permits simultaneous detection of thousands of different metabolites in biological samples; however their automated annotation still presents a challenge due to the limited number of tailored computational solutions freely available to the scientific community.

    Results: Here we introduce ChemDistiller, a customizable engine that combines automated large-scale annotation of metabolites using tandem MS data with a compiled database containing tens of millions of compounds with pre-calculated 'fingerprints' and fragmentation patterns. Our tests using publicly and commercially available tandem MS spectra for reference compounds show retrieval rates comparable to or exceeding those obtainable with current state-of-the-art solutions in the field while offering higher throughput, scalability and processing speed. Read More

    Oasis 2: improved online analysis of small RNA-seq data.
    BMC Bioinformatics 2018 Feb 14;19(1):54. Epub 2018 Feb 14.
    Laboratory of Computational Systems Biology, German Center for Neurodegenerative Diseases, Göttingen, Germany.
    Background: Small RNA molecules play important roles in many biological processes and their dysregulation or dysfunction can cause disease. The current method of choice for genome-wide sRNA expression profiling is deep sequencing.

    Results: Here we present Oasis 2, which is a new main release of the Oasis web application for the detection, differential expression, and classification of small RNAs in deep sequencing data. Read More

    Evaluation of reaction gap-filling accuracy by randomization.
    BMC Bioinformatics 2018 Feb 14;19(1):53. Epub 2018 Feb 14.
    SRI International/Artificial Intelligence Center, 333 Ravenswood Ave, Menlo Park, 94025, USA.
    Background: Completion of genome-scale flux-balance models using computational reaction gap-filling is a widely used approach, but its accuracy is not well known.

    Results: We report on computational experiments of reaction gap filling in which we generated degraded versions of the EcoCyc-20.0-GEM model by randomly removing flux-carrying reactions from a growing model. Read More

    Compression of genomic sequencing reads via hash-based reordering: algorithm and analysis.
    Bioinformatics 2018 Feb;34(4):558-567
    Department of Electrical Engineering, Stanford University, Stanford, CA 94305, USA.
    Motivation: Next-generation sequencing (NGS) technologies for genome sequencing produce large amounts of short genomic reads per experiment, which are highly redundant and compressible. However, general-purpose compressors are unable to exploit this redundancy due to the special structure present in the data.

    Results: We present a new algorithm for compressing reads both with and without preserving the read order. Read More

    findGSE: estimating genome size variation within human and Arabidopsis using k-mer frequencies.
    Bioinformatics 2018 Feb;34(4):550-557
    Department of Plant Developmental Biology, Max Planck Institute for Plant Breeding Research, 50829 Cologne, Germany.
    Motivation: Analyzing k-mer frequencies in whole-genome sequencing data is becoming a common method for estimating genome size (GS). However, the accuracy of the method remains uninvestigated, especially whether it can capture intra-species GS variation.

    Results: We present findGSE, which fits skew normal distributions to k-mer frequencies to estimate GS. Read More

    Squeakr: an exact and approximate k-mer counting system.
    Bioinformatics 2018 Feb;34(4):568-575
    Department of Computer Science, Stony Brook University, Stony Brook, NY 11790, USA.
    Motivation: k-mer-based algorithms have become increasingly popular in the processing of high-throughput sequencing data. These algorithms span the gamut of the analysis pipeline from k-mer counting (e.g. Read More

    ThreaDNA: predicting DNA mechanics' contribution to sequence selectivity of proteins along whole genomes.
    Bioinformatics 2018 Feb;34(4):609-616
    Microbiologie, Adaptation et Pathogénie, UMR5240, INSA Lyon, Université de Lyon, 69622 Villeurbanne.
    Motivation: Many DNA-binding proteins recognize their target sequences indirectly, by sensing DNA's response to mechanical distortion. ThreaDNA estimates this response based on high-resolution structures of the protein-DNA complex of interest. Implementing an efficient nanoscale modeling of DNA deformations involving essentially no adjustable parameters, it returns the profile of deformation energy along whole genomes, at base-pair resolution, within minutes on usual laptop/desktop computers. Read More

    BetaSerpentine: a bioinformatics tool for reconstruction of amyloid structures.
    Bioinformatics 2018 Feb;34(4):599-608
    Structural Bioinformatics and Molecular Modeling, Centre de Recherche en Biologie Cellulaire de Montpellier, CNRS, Université Montpellier, Montpellier 34293, France.
    Motivation: Numerous experimental studies have suggested that polypeptide chains of large amyloidogenic regions zig-zag in β-serpentine arrangements. These β-serpentines are stacked axially and form the superpleated β-structure. Despite this progress in the understanding of amyloid folds, the determination of their 3D structure at the atomic level is still a problem due to the polymorphism of these fibrils and incompleteness of experimental structural data. Read More

    HiCapTools: a software suite for probe design and proximity detection for targeted chromosome conformation capture applications.
    Bioinformatics 2018 Feb;34(4):675-677
    KTH - Royal Institute of Technology, Science for Life Laboratory, School of Biotechnology, Solna 171 65, Sweden.
    Summary: Folding of eukaryotic genomes within nuclear space enables physical and functional contacts between regions that are otherwise kilobases away in sequence space. Targeted chromosome conformation capture methods (T2C, chi-C and HiCap) are capable of informing genomic contacts for a subset of regions targeted by probes. We here present HiCapTools, a software package that can design sequence capture probes for targeted chromosome capture applications and analyse sequencing output to detect proximities involving targeted fragments. Read More

    ssbio: A Python Framework for Structural Systems Biology.
    Bioinformatics 2018 Feb 12. Epub 2018 Feb 12.
    Department of Bioengineering, University of California, San Diego, CA 92093.
    Summary: Working with protein structures at the genome scale has been challenging in a variety of ways. Here, we present ssbio, a Python package that provides a framework to easily work with structural information in the context of genome-scale network reconstructions, which can contain thousands of individual proteins. The ssbio package provides an automated pipeline to construct high-quality genome-scale models with protein structures (GEM-PROs), wrappers to popular third-party programs to compute associated protein properties, and methods to visualize and annotate structures directly in Jupyter notebooks, thus lowering the barrier of linking 3D structural data with established systems workflows. Read More

    Hierarchical Analysis of RNA-Seq Reads Improves the Accuracy of Allele-specific Expression.
    Bioinformatics 2018 Feb 12. Epub 2018 Feb 12.
    The Jackson Laboratory, Bar Harbor, ME 04609, USA.
    Motivation: Allele-specific expression (ASE) refers to the differential abundance of the allelic copies of a transcript. RNA sequencing (RNA-Seq) can provide quantitative estimates of ASE for genes with transcribed polymorphisms. When short-read sequences are aligned to a diploid transcriptome, read-mapping ambiguities confound our ability to directly count reads. Read More

    HIITE: HIV-1 Incidence and Infection Time Estimator.
    Bioinformatics 2018 Feb 9. Epub 2018 Feb 9.
    Department of Molecular Microbiology and Immunology, Keck School of Medicine, University of Southern California, Los Angeles, CA, United States.
    Motivation: Around 2.1 million new HIV-1 infections were reported in 2015, indicating that the HIV-1 epidemic remains a significant global health challenge. Precise incidence assessment strengthens epidemic monitoring efforts and guides strategy optimization for prevention programs. Read More

    KMgene: a unified R package for gene-based association analysis for complex traits.
    Bioinformatics 2018 Feb 9. Epub 2018 Feb 9.
    Division of Pulmonary Medicine, Allergy and Immunology, Department of Pediatrics, Children's Hospital of Pittsburgh of UPMC, University of Pittsburgh, Pittsburgh, PA 15224, USA.
    Summary: In this report, we introduce an R package KMgene for performing gene-based association tests for familial, multivariate or longitudinal traits using kernel machine (KM) regression under a generalized linear mixed model (GLMM) framework. Extensive simulations were performed to evaluate the validity of the approaches implemented in KMgene.

    Availability: http://cran. Read More

    chromswitch: A flexible method to detect chromatin state switches.
    Bioinformatics 2018 Feb 9. Epub 2018 Feb 9.
    Department of Human Genetics, McGill University.
    Summary: Chromatin state plays a major role in controlling gene expression, and comparative analysis of ChIP-seq data is key to understanding epigenetic regulation. We present chromswitch, an R/Bioconductor package to integrate epigenomic data in a defined window of interest to detect an overall switch in chromatin state. Chromswitch accurately classifies a benchmarking dataset, and when applied genome-wide, the tool successfully detects chromatin changes that result in brain-specific expression. Read More

    Artificial intelligence in drug combination therapy.
    Brief Bioinform 2018 Feb 9. Epub 2018 Feb 9.
    Currently, the development of medicines for complex diseases requires the development of combination drug therapies. It is necessary because in many cases, one drug cannot target all necessary points of intervention. For example, in cancer therapy, a physician often meets a patient having a genomic profile including more than five molecular aberrations. Read More

    Antigenic cartography of H1N1 influenza viruses using sequence-based antigenic distance calculation.
    BMC Bioinformatics 2018 Feb 12;19(1):51. Epub 2018 Feb 12.
    New York Influenza Center of Excellence at David Smith Center for Immunology and Vaccine Biology, Department of Microbiology and Immunology, University of Rochester School of Medicine and Dentistry, Rochester, NY, USA.
    Background: The ease with which influenza virus sequence data can be used to estimate antigenic relationships between strains and the existence of databases containing sequence data for hundreds of thousands of influenza strains make sequence-based antigenic distance estimates an attractive approach for researchers. Antigenic mismatch between circulating strains and vaccine strains results in significantly decreased vaccine effectiveness. Furthermore, antigenic relatedness between the vaccine strain and the strains an individual was originally primed with can affect the cross-reactivity of the antibody response. Read More

    CrossPlan: Systematic Planning of Genetic Crosses to Validate Mathematical Models.
    Bioinformatics 2018 Feb 8. Epub 2018 Feb 8.
    Dept. of Computer Science, Virginia Tech, Blacksburg, VA.
    Motivation: Mathematical models of cellular processes can systematically predict the phenotypes of novel combinations of multi-gene mutations. Searching for informative predictions and prioritizing them for experimental validation is challenging since the number of possible combinations grows exponentially in the number of mutations. Moreover, keeping track of the crosses needed to make new mutants and planning sequences of experiments is unmanageable when the experimenter is deluged by hundreds of potentially informative predictions to test. Read More

    WDL-RF: Predicting Bioactivities of Ligand Molecules Acting with G Protein-coupled Receptors by Combining Weighted Deep Learning and Random Forest.
    Bioinformatics 2018 Feb 8. Epub 2018 Feb 8.
    Department of Biological Chemistry, University of Michigan, Ann Arbor, MI 48109, USA.
    Motivation: Precise assessment of ligand bioactivities (including IC50, EC50, Ki, Kd, etc.) is essential for virtual screening and lead compound identification. However, not all ligands have experimentally-determined activities. Read More

    Spectral clustering based on learning similarity matrix.
    Bioinformatics 2018 Feb 8. Epub 2018 Feb 8.
    Department of Biostatistics, School of Public Health, Yale University, New Haven, CT, 06511, USA.
    Motivation: Single-cell RNA-sequencing (scRNA-seq) technology can generate genome-wide expression data at the single-cell level. One important objective in scRNA-seq analysis is to cluster cells so that each cluster consists of cells belonging to the same cell type, based on gene expression patterns.

    Results: We introduce a novel spectral clustering framework that imposes sparse structures on a target matrix. Read More

    Improving SNP Prioritization and Pleiotropic Architecture Estimation by Incorporating Prior Knowledge Using graph-GPA.
    Bioinformatics 2018 Feb 8. Epub 2018 Feb 8.
    Department of Public Health Science, Medical University of South Carolina, Charleston, 29425, USA.
    Availability: graph-GPA is implemented as an R package 'GGPA', which is publicly available at http://dongjunchung.github.io/GGPA/. Read More

    FMLRC: Hybrid long read error correction using an FM-index.
    BMC Bioinformatics 2018 Feb 9;19(1):50. Epub 2018 Feb 9.
    Department of Biology and Integrative Program for Biological and Genome Sciences, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.
    Background: Long read sequencing is changing the landscape of genomic research, especially de novo assembly. Despite the high error rate inherent to long read technologies, increased read lengths dramatically improve the continuity and accuracy of genome assemblies. However, the cost and throughput of these technologies limit their application to complex genomes. Read More

    Selenzyme: Enzyme selection tool for pathway design.
    Bioinformatics 2018 Feb 7. Epub 2018 Feb 7.
    BBSRC/EPSRC Manchester Centre for Synthetic Biology of Fine and Speciality Chemicals (SYNBIOCHEM), Manchester Institute of Biotechnology, The University of Manchester, Manchester M1 7DN, United Kingdom.
    Summary: Synthetic biology applies the principles of engineering to biology in order to create biological functionalities not seen before in nature. One of the most exciting applications of synthetic biology is the design of new organisms with the ability to produce valuable chemicals, including pharmaceuticals and biomaterials, in a greener, sustainable fashion. Selecting the right enzymes to catalyze each reaction step in order to produce a desired target compound is, however, not trivial. Read More

    INfORM: Inference of NetwOrk Response Modules.
    Bioinformatics 2018 Feb 7. Epub 2018 Feb 7.
    Faculty of Medicine and Life Sciences, University of Tampere, Finland.
    Summary: Detecting and interpreting responsive modules from gene expression data by using network-based approaches is a common but laborious task. It often requires the application of several computational methods implemented in different software packages, forcing biologists to compile complex analytical pipelines. Here we introduce INfORM (Inference of NetwOrk Response Modules), an R shiny application that enables non-expert users to detect, evaluate, and select gene modules with high statistical and biological significance. Read More

    Kpax3: Bayesian bi-clustering of large sequence datasets.
    Bioinformatics 2018 Feb 7. Epub 2018 Feb 7.
    Department of Mathematics and Statistics, University of Helsinki, 00014 Helsinki, Finland.
    Motivation: Estimation of the hidden population structure is an important step in many genetic studies. Often the aim is also to identify which sequence locations are the most discriminative between groups of samples for a given data partition. Automated discovery of interesting patterns that are present in the data can help to generate new biological hypotheses. Read More
