    8058 results match your criteria BMC Bioinformatics [Journal]

    Improving contig binning of metagenomic data using d_2^S oligonucleotide frequency dissimilarity.
    BMC Bioinformatics 2017 Sep 20;18(1):425. Epub 2017 Sep 20.
    Molecular and Computational Biology Program, University of Southern California, Los Angeles, California, CA, 90089, USA.
    Background: Metagenomics sequencing provides deep insights into microbial communities. To investigate their taxonomic structure, binning assembled contigs into discrete clusters is critical. Many binning algorithms have been developed, but their performance is not always satisfactory, especially for complex microbial communities, calling for further development.

    VISMapper: ultra-fast exhaustive cartography of viral insertion sites for gene therapy.
    BMC Bioinformatics 2017 Sep 20;18(1):421. Epub 2017 Sep 20.
    Clinical Bioinformatics Research Area, Fundación Progreso y Salud, Hospital Virgen del Rocío, 41013, Sevilla, Spain.
    Background: The possibility of integrating viral vectors to become a persistent part of the host genome makes them a crucial element of clinical gene therapy. However, viral integration has associated risks, such as the unintentional activation of oncogenes that can result in cancer. Therefore, the analysis of integration sites of retroviral vectors is a crucial step in developing safer vectors for therapeutic use.

    Reconstruction and visualization of large-scale volumetric models of neocortical circuits for physically-plausible in silico optical studies.
    BMC Bioinformatics 2017 Sep 13;18(Suppl 10):402. Epub 2017 Sep 13.
    Blue Brain Project (BBP), École Polytechnique Fédérale de Lausanne (EPFL), Biotech Campus, Chemin des Mines 9, Geneva, 1202, Switzerland.
    Background: We present a software workflow capable of building large scale, highly detailed and realistic volumetric models of neocortical circuits from the morphological skeletons of their digitally reconstructed neurons. The limitations of the existing approaches for creating those models are explained, and then, a multi-stage pipeline is discussed to overcome those limitations. Starting from the neuronal morphologies, we create smooth piecewise watertight polygonal models that can be efficiently utilized to synthesize continuous and plausible volumetric models of the neurons with solid voxelization.

    Vermont: a multi-perspective visual interactive platform for mutational analysis.
    BMC Bioinformatics 2017 Sep 13;18(Suppl 10):403. Epub 2017 Sep 13.
    Department of Computer Science, Universidade Federal de Viçosa, Peter Henry Rolfs avenue, Campus Universitário, Viçosa, 36570-900, Brazil.
    Background: A huge amount of data about genomes and sequence variation is available and continues to grow on a large scale, which makes experimentally characterizing these mutations infeasible regarding disease association and effects on protein structure and function. Therefore, reliable computational approaches are needed to support the understanding of mutations and their impacts. Here, we present VERMONT 2.

    MediSyn: uncertainty-aware visualization of multiple biomedical datasets to support drug treatment selection.
    BMC Bioinformatics 2017 Sep 13;18(Suppl 10):393. Epub 2017 Sep 13.
    Helsinki Institute for Information Technology HIIT, Department of Computer Science, University of Helsinki, Gustaf Hällströmin katu 2b, Helsinki, 00560, Finland.
    Background: Dispersed biomedical databases limit user exploration to generate structured knowledge. Linked Data unifies data structures and makes the dispersed data easy to search across resources, but it lacks supporting human cognition to achieve insights. In addition, potential errors in the data are difficult to detect in their free formats.

    Bayesian Unidimensional Scaling for visualizing uncertainty in high dimensional datasets with latent ordering of observations.
    BMC Bioinformatics 2017 Sep 13;18(Suppl 10):394. Epub 2017 Sep 13.
    Department of Statistics, Stanford University, Stanford, 94305, USA.
    Background: Detecting patterns in high-dimensional multivariate datasets is non-trivial. Clustering and dimensionality reduction techniques often help in discerning inherent structures. In biological datasets such as microbial community composition or gene expression data, observations can be generated from a continuous process, often unknown.

    CellNetVis: a web tool for visualization of biological networks using force-directed layout constrained by cellular components.
    BMC Bioinformatics 2017 Sep 13;18(Suppl 10):395. Epub 2017 Sep 13.
    University of São Paulo, Instituto de Ciências Matemáticas e de Computação, Av. Trabalhador São-carlense, 400, São Carlos-SP, Brazil.
    Background: The advent of "omics" science has brought new perspectives in contemporary biology through the high-throughput analyses of molecular interactions, providing new clues in protein/gene function and in the organization of biological pathways. Biomolecular interaction networks, or graphs, are simple abstract representations where the components of a cell (e.g.

    C-State: an interactive web app for simultaneous multi-gene visualization and comparative epigenetic pattern search.
    BMC Bioinformatics 2017 Sep 13;18(Suppl 10):392. Epub 2017 Sep 13.
    CSIR- Centre for Cellular and Molecular Biology, Hyderabad, India.
    Background: Comparative epigenomic analysis across multiple genes presents a bottleneck for bench biologists working with NGS data. Despite the development of standardized peak analysis algorithms, the identification of novel epigenetic patterns and their visualization across gene subsets remains a challenge.

    Results: We developed a fast and interactive web app, C-State (Chromatin-State), to query and plot chromatin landscapes across multiple loci and cell types.

    Methods for discovering genomic loci exhibiting complex patterns of differential methylation.
    BMC Bioinformatics 2017 Sep 18;18(1):416. Epub 2017 Sep 18.
    Department of Plant Sciences, University of Cambridge, Downing Street, Cambridge, CB2 3EA, UK.
    Background: Cytosine methylation is widespread in most eukaryotic genomes and is known to play a substantial role in various regulatory pathways. Unmethylated cytosines may be converted to uracil through the addition of sodium bisulphite, allowing genome-wide quantification of cytosine methylation via high-throughput sequencing. The data thus acquired allow the discovery of methylation 'loci': contiguous regions of methylation consistently methylated across biological replicates.

    Cleaning by clustering: methodology for addressing data quality issues in biomedical metadata.
    BMC Bioinformatics 2017 Sep 18;18(1):415. Epub 2017 Sep 18.
    Institute of Data Science, Maastricht University, Maastricht, 6200, MD, The Netherlands.
    Background: The ability to efficiently search and filter datasets depends on access to high quality metadata. While most biomedical repositories require data submitters to provide a minimal set of metadata, some, such as the Gene Expression Omnibus (GEO), allow users to specify additional metadata in the form of textual key-value pairs (e.g.

    Deep learning methods for protein torsion angle prediction.
    BMC Bioinformatics 2017 Sep 18;18(1):417. Epub 2017 Sep 18.
    Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, 65211, USA.
    Background: Deep learning is one of the most powerful machine learning methods that has achieved the state-of-the-art performance in many domains. Since deep learning was introduced to the field of bioinformatics in 2012, it has achieved success in a number of areas such as protein residue-residue contact prediction, secondary structure prediction, and fold recognition. In this work, we developed deep learning methods to improve the prediction of torsion (dihedral) angles of proteins.

    Segmentation and classification of two-channel C. elegans nucleus-labeled fluorescence images.
    BMC Bioinformatics 2017 Sep 15;18(1):412. Epub 2017 Sep 15.
    Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Yiheyuan Road, Beijing, 100871, China.
    Background: Aging is characterized by a gradual breakdown of cellular structures. Nuclear abnormality is a hallmark of progeria in humans. Analysis of age-dependent nuclear morphological changes in Caenorhabditis elegans is of great value to aging research, and this calls for an automatic image processing method that is suitable for both normal and abnormal structures.

    mmquant: how to count multi-mapping reads?
    BMC Bioinformatics 2017 Sep 15;18(1):411. Epub 2017 Sep 15.
    MIAT, Toulouse INRA, BP 52627, Castanet-Tolosan cedex, 31326, France.
    Background: RNA-Seq is now used routinely, and it provides accurate information on gene transcription. However, the method cannot accurately estimate the expression of duplicated genes. Several strategies have been used previously (dropping duplicated genes, distributing the reads uniformly, or estimating expression), but all of them provide biased results.

    Interactive visual exploration and refinement of cluster assignments.
    BMC Bioinformatics 2017 Sep 12;18(1):406. Epub 2017 Sep 12.
    Scientific Computing and Imaging Institute, University of Utah, 72 South Central Campus Drive, Salt Lake City, 84112, USA.
    Background: With ever-increasing amounts of data produced in biology research, scientists are in need of efficient data analysis methods. Cluster analysis, combined with visualization of the results, is one such method that can be used to make sense of large data volumes. At the same time, cluster analysis is known to be imperfect and depends on the choice of algorithms, parameters, and distance measures.

    Statistical classifiers for diagnosing disease from immune repertoires: a case study using multiple sclerosis.
    BMC Bioinformatics 2017 Sep 7;18(1):401. Epub 2017 Sep 7.
    Department of Clinical Sciences, UT Southwestern Medical Center, 5323 Harry Hines Boulevard, Dallas, TX, 75390-9066, USA.
    Background: Deep sequencing of lymphocyte receptor repertoires has made it possible to comprehensively profile the clonal composition of lymphocyte populations. This opens the door for novel approaches to diagnose and prognosticate diseases with a driving immune component by identifying repertoire sequence patterns associated with clinical phenotypes. Indeed, recent studies support the feasibility of this, demonstrating an association between repertoire-level summary statistics (e.

    High-throughput PCR assay design for targeted resequencing using primerXL.
    BMC Bioinformatics 2017 Sep 6;18(1):400. Epub 2017 Sep 6.
    Center for Medical Genetics, Ghent University, De Pintelaan 185, 9000, Ghent, Belgium.
    Background: Although the sequencing landscape is rapidly evolving and sequencing costs are continuously decreasing, whole genome sequencing is still too expensive for use on a routine basis. Targeted resequencing of only the regions of interest decreases both costs and the complexity of the downstream data-analysis. Various target enrichment strategies are available, but none of them obtain the degree of coverage uniformity, flexibility and specificity of PCR-based enrichment.

    BUFET: boosting the unbiased miRNA functional enrichment analysis using bitsets.
    BMC Bioinformatics 2017 Sep 6;18(1):399. Epub 2017 Sep 6.
    "Athena" Research and Innovation Center, Athens, 15125, Greece.
    Background: A group of miRNAs can regulate a biological process by targeting genes involved in the process. The unbiased miRNA functional enrichment analysis is the most precise in silico approach to predict the biological processes that may be regulated by a given miRNA group. However, it is computationally intensive and significantly more expensive than its alternatives.

    Phylo_dCor: distance correlation as a novel metric for phylogenetic profiling.
    BMC Bioinformatics 2017 Sep 5;18(1):396. Epub 2017 Sep 5.
    Dipartimento di Malattie Infettive, Parassitarie e Immunomediate, Istituto Superiore di Sanità, Viale Regina Elena 299, 00161, Rome, Italy.
    Background: Elaboration of powerful methods to predict functional and/or physical protein-protein interactions from genome sequence is one of the main tasks in the post-genomic era. Phylogenetic profiling allows the prediction of protein-protein interactions at a whole genome level in both Prokaryotes and Eukaryotes. For this reason it is considered one of the most promising methods.

    Investigating MicroRNA and transcription factor co-regulatory networks in colorectal cancer.
    BMC Bioinformatics 2017 Sep 2;18(1):388. Epub 2017 Sep 2.
    Department of Pathology, Nanfang Hospital, Southern Medical University, Guangzhou, 510515, China.
    Background: Colorectal cancer (CRC) is one of the most common malignancies worldwide and has a poor prognosis. Studies have shown that abnormal microRNA (miRNA) expression can affect CRC pathogenesis and development by targeting critical genes in cellular systems. However, it is unclear which miRNAs play central roles in CRC pathogenesis and how they interact with transcription factors (TFs) to regulate cancer-related genes.

    RRCRank: a fusion method using rank strategy for residue-residue contact prediction.
    BMC Bioinformatics 2017 Sep 2;18(1):390. Epub 2017 Sep 2.
    School of Computer Science, Fudan University, Shanghai, 200433, People's Republic of China.
    Background: In structural biology, protein residue-residue contacts play a crucial role in protein structure prediction. Some researchers have found that predicted residue-residue contacts can effectively constrain the conformational search space, which is significant for de novo protein structure prediction. In the last few decades, researchers have developed various methods to predict residue-residue contacts; in recent years, fusion methods in particular have achieved significant performance.

    A new and updated resource for codon usage tables.
    BMC Bioinformatics 2017 Sep 2;18(1):391. Epub 2017 Sep 2.
    Division of Plasma Protein Therapeutics, Office of Tissue and Advanced Therapies, Center for Biologics Evaluation and Research, Food and Drug Administration, Silver Spring, USA.
    Background: Due to the degeneracy of the genetic code, most amino acids can be encoded by multiple synonymous codons. Synonymous codons naturally occur with different frequencies in different organisms. The choice of codons may affect protein expression, structure, and function.

    Robust gene selection methods using weighting schemes for microarray data analysis.
    BMC Bioinformatics 2017 Sep 2;18(1):389. Epub 2017 Sep 2.
    Department of Statistics, Ewha Womans University, Seoul, South Korea.
    Background: A common task in microarray data analysis is to identify informative genes that are differentially expressed between two different states. Owing to the high-dimensional nature of microarray data, identification of significant genes has been essential in analyzing the data. However, the performances of many gene selection techniques are highly dependent on the experimental conditions, such as the presence of measurement error or a limited number of sample replicates.

    QNB: differential RNA methylation analysis for count-based small-sample sequencing data with a quad-negative binomial model.
    BMC Bioinformatics 2017 Aug 31;18(1):387. Epub 2017 Aug 31.
    Department of Biological Sciences, HRINU, SUERI, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu, 215123, China.
    Background: As a newly emerged research area, RNA epigenetics has recently drawn increasing attention because RNA methylation and other modifications participate in a number of crucial biological processes. Thanks to high-throughput sequencing techniques such as MeRIP-Seq, transcriptome-wide RNA methylation profiles are now available in the form of count-based data, with which it is often of interest to study the dynamics at the epitranscriptomic layer. However, the sample size of RNA methylation experiments is usually very small due to their cost; additionally, there usually exist a large number of genes whose methylation level cannot be accurately estimated due to their low expression level, making differential RNA methylation analysis a difficult task.

    EL_PSSM-RT: DNA-binding residue prediction by integrating ensemble learning with PSSM Relation Transformation.
    BMC Bioinformatics 2017 Aug 29;18(1):379. Epub 2017 Aug 29.
    School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, HIT Campus Shenzhen University Town, Xili, Shenzhen, Guangdong, 518055, China.
    Background: Prediction of DNA-binding residue is important for understanding the protein-DNA recognition mechanism. Many computational methods have been proposed for the prediction, but most of them do not consider the relationships of evolutionary information between residues.

    Results: In this paper, we first propose a novel residue encoding method, referred to as the Position Specific Score Matrix (PSSM) Relation Transformation (PSSM-RT), to encode residues by utilizing the relationships of evolutionary information between residues.

    Improved protein structure reconstruction using secondary structures, contacts at higher distance thresholds, and non-contacts.
    BMC Bioinformatics 2017 Aug 29;18(1):380. Epub 2017 Aug 29.
    Department of Electrical Engineering & Computer Science, Informatics Institute, University of Missouri, Columbia, MO, 65211, USA.
    Background: Residue-residue contacts are key features for accurate de novo protein structure prediction. For the optimal utilization of these predicted contacts in folding proteins accurately, it is important to study the challenges of reconstructing protein structures using true contacts. Because contact-guided protein modeling approach is valuable for predicting the folds of proteins that do not have structural templates, it is necessary for reconstruction studies to focus on hard-to-predict protein structures.

    A weighted string kernel for protein fold recognition.
    BMC Bioinformatics 2017 Aug 25;18(1):378. Epub 2017 Aug 25.
    Department of Computer Science and Genome Center, 1, Shields Avenue, Davis, 95616, CA, USA.
    Background: Alignment-free methods for comparing protein sequences have proved to be viable alternatives to approaches that first rely on an alignment of the sequences to be compared. Much work, however, needs to be done before those methods provide reliable fold recognition for proteins whose sequences share little similarity. We have recently proposed an alignment-free method based on the concept of string kernels, SeqKernel (Nojoomi and Koehl, BMC Bioinformatics, 2017, 18:137).

    Identifying pleiotropic genes in genome-wide association studies from related subjects using the linear mixed model and Fisher combination function.
    BMC Bioinformatics 2017 Aug 24;18(1):376. Epub 2017 Aug 24.
    Department of Health Behavior and Biological Sciences, University of Michigan, Ann Arbor, 48104, Michigan, USA.
    Background: A multivariate genome-wide association test is proposed for analyzing data on multivariate quantitative phenotypes collected from related subjects. The proposed method is a two-step approach. The first step models the association between the genotype and marginal phenotype using a linear mixed model.

    A regulation probability model-based meta-analysis of multiple transcriptomics data sets for cancer biomarker identification.
    BMC Bioinformatics 2017 Aug 23;18(1):375. Epub 2017 Aug 23.
    Cancer Hospital, CAS, Hefei, Anhui, 230031, China.
    Background: Large-scale accumulation of omics data poses a pressing challenge of integrative analysis of multiple data sets in bioinformatics. An open question of such integrative analysis is how to pinpoint consistent but subtle gene activity patterns across studies. Study heterogeneity needs to be addressed carefully for this goal.

    Evaluation of the impact of Illumina error correction tools on de novo genome assembly.
    BMC Bioinformatics 2017 Aug 18;18(1):374. Epub 2017 Aug 18.
    Department of Information Technology, Ghent University-imec, IDLab, Ghent, B-9052, Belgium.
    Background: Recently, many standalone applications have been proposed to correct sequencing errors in Illumina data. The key idea is that downstream analysis tools such as de novo genome assemblers benefit from a reduced error rate in the input data. Surprisingly, a systematic validation of this assumption using state-of-the-art assembly methods is lacking, even for recently published methods.

    SG-ADVISER mtDNA: a web server for mitochondrial DNA annotation with data from 200 samples of a healthy aging cohort.
    BMC Bioinformatics 2017 Aug 18;18(1):373. Epub 2017 Aug 18.
    The Scripps Translational Science Institute, Scripps Health, and The Scripps Research Institute, La Jolla, CA, 92037, USA.
    Background: Whole genome and exome sequencing usually include reads containing mitochondrial DNA (mtDNA). Yet, state-of-the-art pipelines and services for human nuclear genome variant calling and annotation do not handle mitochondrial genome data appropriately. As a consequence, any researcher desiring to add mtDNA variant analysis to their investigations is forced to explore the literature for mtDNA pipelines, evaluate them, and implement their own instance of the desired tool.

    Coreference annotation and resolution in the Colorado Richly Annotated Full Text (CRAFT) corpus of biomedical journal articles.
    BMC Bioinformatics 2017 Aug 17;18(1):372. Epub 2017 Aug 17.
    Computational Bioscience Program, University of Colorado School of Medicine, Denver, CO, USA.
    Background: Coreference resolution is the task of finding strings in text that have the same referent as other strings. Failures of coreference resolution are a common cause of false negatives in information extraction from the scientific literature. In order to better understand the nature of the phenomenon of coreference in biomedical publications and to increase performance on the task, we annotated the Colorado Richly Annotated Full Text (CRAFT) corpus with coreference relations.

    eccCL: parallelized GPU implementation of Ensemble Classifier Chains.
    BMC Bioinformatics 2017 Aug 17;18(1):371. Epub 2017 Aug 17.
    Department of Bioinformatics, Straubing Center of Science, Petersgasse 18, Straubing, 94315, Germany.
    Background: Multi-label classification has recently gained great attention in diverse fields of research, e.g., in biomedical applications such as protein function prediction or drug resistance testing in HIV.

    Cancerouspdomains: comprehensive analysis of cancer type-specific recurrent somatic mutations in proteins and domains.
    BMC Bioinformatics 2017 Aug 16;18(1):370. Epub 2017 Aug 16.
    Faculty of New Sciences and Technologies, University of Tehran, North Kargar St, Tehran, Tehran, 1439957131, Iran.
    Background: Discriminating driver mutations from the ones that play no role in cancer is a severe bottleneck in elucidating molecular mechanisms underlying cancer development. Since protein domains are representatives of functional regions within proteins, mutations on them may disturb the protein functionality. Therefore, studying mutations at domain level may point researchers to more accurate assessment of the functional impact of the mutations.

    A neural network multi-task learning approach to biomedical named entity recognition.
    BMC Bioinformatics 2017 Aug 15;18(1):368. Epub 2017 Aug 15.
    Language Technology Laboratory, DTAL, University of Cambridge, 9 West Road, Cambridge, CB39DB, UK.
    Background: Named Entity Recognition (NER) is a key task in biomedical text mining. Accurate NER systems require task-specific, manually-annotated datasets, which are expensive to develop and thus limited in size. Since such datasets contain related but different information, an interesting question is whether it might be possible to use them together to improve NER performance.

    Improving fold resistance prediction of HIV-1 against protease and reverse transcriptase inhibitors using artificial neural networks.
    BMC Bioinformatics 2017 Aug 15;18(1):369. Epub 2017 Aug 15.
    Research Unit in Bioinformatics (RUBi), Department of Biochemistry and Microbiology, Rhodes University, Grahamstown, 6140, South Africa.
    Background: Drug resistance in HIV treatment is still a worldwide problem. Predicting resistance to antiretrovirals (ARVs) before starting any treatment is important. Prediction accuracy is essential, as low-accuracy predictions increase the risk of prescribing sub-optimal drug regimens leading to patients developing resistance sooner.

    Full L1-regularized Traction Force Microscopy over whole cells.
    BMC Bioinformatics 2017 Aug 10;18(1):365. Epub 2017 Aug 10.
    Bioengineering and Aerospace Engineering Department, Universidad Carlos III de Madrid, Leganés, Spain.
    Background: Traction Force Microscopy (TFM) is a widespread technique to estimate the tractions that cells exert on the surrounding substrate. To recover the tractions, it is necessary to solve an inverse problem, which is ill-posed and needs regularization to make the solution stable. The typical regularization scheme is given by the minimization of a cost functional, which is divided in two terms: the error present in the data or data fidelity term; and the regularization or penalty term.

    ODG: Omics database generator - a tool for generating, querying, and analyzing multi-omics comparative databases to facilitate biological understanding.
    BMC Bioinformatics 2017 Aug 10;18(1):367. Epub 2017 Aug 10.
    Department of Plant Pathology, 495 Borlaug Hall, 1991 Upper Buford Circle, St. Paul, MN, 55108, USA.
    Background: Rapid generation of omics data in recent years has resulted in vast amounts of disconnected datasets without systemic integration and knowledge building, while individual groups have made customized, annotated datasets available on the web with few ways to link them to in-lab datasets. With so many research groups generating their own data, the ability to relate it to the larger genomic and comparative genomic context is becoming increasingly crucial to make full use of the data.

    Results: The Omics Database Generator (ODG) allows users to create customized databases that utilize published genomics data integrated with experimental data which can be queried using a flexible graph database.

    Network design and analysis for multi-enzyme biocatalysis.
    BMC Bioinformatics 2017 Aug 10;18(1):366. Epub 2017 Aug 10.
    Biochemical Engineering Institute, Saarland University, Campus A1.5, Saarbrücken, 66123, Germany.
    Background: As more and more biological reaction data become available, the full exploration of the enzymatic potential for the synthesis of valuable products opens up exciting new opportunities but is becoming increasingly complex. The manual design of multi-step biosynthesis routes involving enzymes from different organisms is very challenging. To harness the full enzymatic potential, we developed a computational tool for the directed design of biosynthetic production pathways for multi-step catalysis with in vitro enzyme cascades, cell hydrolysates and permeabilized cells.

    Local sequence and sequencing depth dependent accuracy of RNA-seq reads.
    BMC Bioinformatics 2017 Aug 9;18(1):364. Epub 2017 Aug 9.
    Department of Epidemiology and Biostatistics, Arnold School of Public Health, University of South Carolina, Columbia, SC, USA.
    Background: Many biases and spurious effects are inherent in RNA-seq technology, resulting in a non-uniform distribution of sequencing read counts for each base position in a gene. Therefore, a base-level strategy is required to model the non-uniformity. Also, the properties of sequencing read counts can be leveraged to achieve a more precise estimation of the mean and variance of measurement.

    CIPHER: a flexible and extensive workflow platform for integrative next-generation sequencing data analysis and genomic regulatory element prediction.
    BMC Bioinformatics 2017 Aug 8;18(1):363. Epub 2017 Aug 8.
    Department of Microbiology, The University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA.
    Background: Next-generation sequencing (NGS) approaches are commonly used to identify key regulatory networks that drive transcriptional programs. Although these technologies are frequently used in biological studies, NGS data analysis remains a challenging, time-consuming, and often irreproducible process. Therefore, there is a need for a comprehensive and flexible workflow platform that can accelerate data processing and analysis so more time can be spent on functional studies.

    A nonparametric Bayesian method of translating machine learning scores to probabilities in clinical decision support.
    BMC Bioinformatics 2017 Aug 7;18(1):361. Epub 2017 Aug 7.
    Department of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, 3333 Burnet Ave., MLC 7024, Cincinnati, OH, 45229-3039, USA.
    Background: Probabilistic assessments of clinical care are essential for quality care. Yet machine learning, which supports this care process, has been limited to categorical results. To maximize its usefulness, it is important to find novel approaches that calibrate the ML output with a likelihood scale.

    Repliscan: a tool for classifying replication timing regions.
    BMC Bioinformatics 2017 Aug 7;18(1):362. Epub 2017 Aug 7.
    Texas Advanced Computing Center, University of Texas at Austin, 10100 Burnet Road, Austin, 78758-4497, TX, USA.
    Background: Replication timing experiments that use label incorporation and high throughput sequencing produce peaked data similar to ChIP-Seq experiments. However, the differences in experimental design, coverage density, and possible results make traditional ChIP-Seq analysis methods inappropriate for use with replication timing.

    Results: To accurately detect and classify regions of replication across the genome, we present Repliscan.

    Evaluation of high-throughput isomiR identification tools: illuminating the early isomiRome of Tribolium castaneum.
    BMC Bioinformatics 2017 Aug 3;18(1):359. Epub 2017 Aug 3.
    Fraunhofer Institute for Molecular Biology and Applied Ecology, Department of Bioresources, Winchester Str. 2, 35394, Giessen, Germany.
    Background: MicroRNAs carry out post-transcriptional gene regulation in animals by binding to the 3' untranslated regions of mRNAs, causing their degradation or translational repression. MicroRNAs influence many biological functions, and dysregulation can therefore disrupt development or even cause death. High-throughput sequencing and the mining of animal small RNA data have shown that microRNA genes can yield differentially expressed isoforms, known as isomiRs.

    Correcting nucleotide-specific biases in high-throughput sequencing data.
    BMC Bioinformatics 2017 Aug 1;18(1):357. Epub 2017 Aug 1.
    Department of Genetics, University of North Carolina at Chapel Hill, CB 7032, 7314 Medical Biomolecular Research Building, 111 Mason Farm Road, Chapel Hill, 27599, NC, USA.
    Background: High-throughput sequence (HTS) data exhibit position-specific nucleotide biases that obscure the intended signal and reduce the effectiveness of these data for downstream analyses. These biases are particularly evident in HTS assays for identifying regulatory regions in DNA (DNase-seq, ChIP-seq, FAIRE-seq, ATAC-seq). Biases may result from many experiment-specific factors, including selectivity of DNA restriction enzymes and fragmentation method, as well as sequencing technology-specific factors, such as choice of adapters/primers and sample amplification methods.

    Variable selection for disease progression models: methods for oncogenetic trees and application to cancer and HIV.
    BMC Bioinformatics 2017 Aug 1;18(1):358. Epub 2017 Aug 1.
    Department of Statistics, TU Dortmund University, Dortmund, 44221, Germany.
    Background: Disease progression models are important for understanding the critical steps during the development of diseases. The models are embedded in a statistical framework to deal with random variations due to biology and the sampling process when observing only a finite population. Conditional probabilities are used to describe dependencies between events that characterise the critical steps in the disease process.

    l1kdeconv: an R package for peak calling analysis with LINCS L1000 data.
    BMC Bioinformatics 2017 Jul 27;18(1):356. Epub 2017 Jul 27.
    Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX, 77843, USA.
    Background: LINCS L1000 is a high-throughput technology that allows gene expression measurement in a large number of assays. However, to fit the measurements of ~1000 genes in the ~500 color channels of LINCS L1000, every two landmark genes are designed to share a single channel. Thus, a deconvolution step is required to infer the expression values of each gene.

    Assessing the model transferability for prediction of transcription factor binding sites based on chromatin accessibility.
    BMC Bioinformatics 2017 Jul 27;18(1):355. Epub 2017 Jul 27.
    Department of Ophthalmology, Johns Hopkins University School of Medicine, Baltimore, 21287, MD, USA.
    Background: Computational prediction of transcription factor (TF) binding sites in different cell types is challenging. Recent technology development allows us to determine the genome-wide chromatin accessibility in various cellular and developmental contexts. The chromatin accessibility profiles provide useful information for predicting TF binding events in various physiological conditions.

    Quantification of tumour evolution and heterogeneity via Bayesian epiallele detection.
    BMC Bioinformatics 2017 Jul 25;18(1):354. Epub 2017 Jul 25.
    UCL Cancer Institute, University College London, London, UK.
    Background: Epigenetic heterogeneity within a tumour can play an important role in tumour evolution and the emergence of resistance to treatment. It is increasingly recognised that the study of DNA methylation (DNAm) patterns along the genome - so-called 'epialleles' - offers greater insight into epigenetic dynamics than conventional analyses which examine DNAm marks individually.

    Results: We have developed a Bayesian model to infer which epialleles are present in multiple regions of the same tumour.

    Identifying and mitigating batch effects in whole genome sequencing data.
    BMC Bioinformatics 2017 Jul 24;18(1):351. Epub 2017 Jul 24.
    Bioinformatics and Computational Biology Department, Genentech Inc, 1 DNA Way, South San Francisco, CA, 94080, USA.
    Background: Large sample sets of whole genome sequencing with deep coverage are being generated; however, assembling datasets from different sources inevitably introduces batch effects. These batch effects are not well understood and can be due to changes in the sequencing protocol or bioinformatics tools used to process the data. No systematic algorithms or heuristics exist to detect and filter batch effects or remove associations impacted by batch effects in whole genome sequencing data.
