8,964 results match your criteria BMC bioinformatics[Journal]


Metabopolis: scalable network layout for biological pathway diagrams in urban map style.

BMC Bioinformatics 2019 Apr 15;20(1):187. Epub 2019 Apr 15.

Research Division of Computer Graphics, Institute of Visual Computing and Human-Centered Technology, TU Wien, Vienna, Austria.

Background: Biological pathways represent chains of molecular interactions in biological systems that jointly form complex dynamic networks. The network structure changes with the significance of biological experiments, and layout algorithms often sacrifice low-level details to maintain high-level information, which complicates rendering a complete image of large biochemical systems such as human metabolic pathways.

Results: Our work is inspired by concepts from urban planning since we create a visual hierarchy of biological pathways, which is analogous to city blocks and grid-like road networks in an urban area.

DOI: http://dx.doi.org/10.1186/s12859-019-2779-4

eNetXplorer: an R package for the quantitative exploration of elastic net families for generalized linear models.

BMC Bioinformatics 2019 Apr 16;20(1):189. Epub 2019 Apr 16.

Trans-NIH Center for Human Immunology (CHI), National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD 20892, USA.

Background: Regularized generalized linear models (GLMs) are popular regression methods in bioinformatics, particularly useful in scenarios with fewer observations than parameters/features or when many of the features are correlated. In both ridge and lasso regularization, feature shrinkage is controlled by a penalty parameter λ. The elastic net introduces a mixing parameter α to tune the shrinkage continuously from ridge to lasso.

DOI: http://dx.doi.org/10.1186/s12859-019-2778-5
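
For reference, the elastic net penalty described in the eNetXplorer abstract above is commonly written as shown below. This is a sketch under an assumption: it uses the parameterization of the glmnet R package, which the truncated abstract does not confirm, and the coefficient vector β is a symbol introduced here for illustration.

\[
P_{\lambda,\alpha}(\beta) \;=\; \lambda \left[ \frac{1-\alpha}{2}\,\lVert \beta \rVert_2^2 \;+\; \alpha\,\lVert \beta \rVert_1 \right], \qquad 0 \le \alpha \le 1 .
\]

Under this form, α = 0 leaves only the squared ℓ2 term (ridge) and α = 1 leaves only the ℓ1 term (lasso), while λ scales the overall amount of shrinkage.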

metamicrobiomeR: an R package for analysis of microbiome relative abundance data using zero-inflated beta GAMLSS and meta-analysis across studies using random effects models.

BMC Bioinformatics 2019 Apr 16;20(1):188. Epub 2019 Apr 16.

Gertrude H. Sergievsky Center, Columbia University, New York City, NY, USA.

Background: The rapid growth of high-throughput sequencing-based microbiome profiling has yielded tremendous insights into human health and physiology. Data generated from high-throughput sequencing of 16S rRNA gene amplicons are often preprocessed into composition or relative abundance. However, reproducibility has been lacking due to the myriad of different experimental and computational approaches taken in these studies.

DOI: http://dx.doi.org/10.1186/s12859-019-2744-2

Effect of stochasticity on coinfection dynamics of respiratory viruses.

BMC Bioinformatics 2019 Apr 16;20(1):191. Epub 2019 Apr 16.

Department of Physics & Astronomy, Texas Christian University, Fort Worth, TX, USA.

Background: Respiratory viral infections are a leading cause of mortality worldwide. As many as 40% of patients hospitalized with influenza-like illness are reported to be infected with more than one type of virus. However, it is not clear whether these infections are more severe than single viral infections.

DOI: http://dx.doi.org/10.1186/s12859-019-2793-6

Ryūtō: network-flow based transcriptome reconstruction.

BMC Bioinformatics 2019 Apr 16;20(1):190. Epub 2019 Apr 16.

Bioinformatics Group, Department of Computer Science & Interdisciplinary Center for Bioinformatics, Universität Leipzig, Härtelstraße 16-18, Leipzig, 04107, Germany.

Background: The rapid increase in high-throughput sequencing of RNA (RNA-seq) has led to tremendous improvements in the detection and reconstruction of both expressed coding and non-coding RNA transcripts. Yet, the complete and accurate annotation of the complex transcriptional output of not only the human genome has remained elusive. One of the critical bottlenecks in this endeavor is the computational reconstruction of transcript structures, due to high noise levels, technological limits, and other biases in the raw data.

DOI: http://dx.doi.org/10.1186/s12859-019-2786-5

Highly efficient hypothesis testing methods for regression-type tests with correlated observations and heterogeneous variance structure.

BMC Bioinformatics 2019 Apr 15;20(1):185. Epub 2019 Apr 15.

Department of Biostatistics and Computational Biology, University of Rochester, 601 Elmwood Ave, Rochester, NY 14642, USA.

Background: For many practical hypothesis testing (H-T) applications, the data are correlated and/or with heterogeneous variance structure. The regression t-test for weighted linear mixed-effects regression (LMER) is a legitimate choice because it accounts for complex covariance structure; however, high computational costs and occasional convergence issues make it impractical for analyzing high-throughput data. In this paper, we propose computationally efficient parametric and semiparametric tests based on a set of specialized matrix techniques dubbed as the PB-transformation.

DOI: http://dx.doi.org/10.1186/s12859-019-2783-8

FGMP: assessing fungal genome completeness.

BMC Bioinformatics 2019 Apr 15;20(1):184. Epub 2019 Apr 15.

Department of Microbiology & Plant Pathology and Institute for Integrative Genome Biology, University of California-Riverside, Riverside, CA, 92521, USA.

Background: Inexpensive high-throughput DNA sequencing has democratized access to genetic information for most organisms so that research utilizing a genome or transcriptome of an organism is not limited to model systems. However, the quality of the assemblies of sampled genomes can vary greatly which hampers utility for comparisons and meaningful interpretation. The uncertainty of the completeness of a given genome sequence can limit feasibility of asserting patterns of high rates of gene loss reported in many lineages.

DOI: http://dx.doi.org/10.1186/s12859-019-2782-9

Leveraging the effects of chloroquine on resistant malaria parasites for combination therapies.

BMC Bioinformatics 2019 Apr 15;20(1):186. Epub 2019 Apr 15.

Department of Biomedical Engineering, University of Virginia, Charlottesville, VA, USA.

Background: Malaria is a major global health problem, with the Plasmodium falciparum protozoan parasite causing the most severe form of the disease. Prevalence of drug-resistant P. falciparum highlights the need to understand the biology of resistance and to identify novel combination therapies that are effective against resistant parasites.

DOI: http://dx.doi.org/10.1186/s12859-019-2756-y

Neural sentence embedding models for semantic similarity estimation in the biomedical domain.

BMC Bioinformatics 2019 Apr 11;20(1):178. Epub 2019 Apr 11.

Section for Artificial Intelligence and Decision Support, Center for Medical Statistics, Informatics, and Intelligent Systems, Medical University of Vienna, Währinger Straße 25a, 1090, Vienna, Austria.

Background: Neural network based embedding models are receiving significant attention in the field of natural language processing due to their capability to effectively capture semantic information representing words, sentences or even larger text elements in low-dimensional vector space. While current state-of-the-art models for assessing the semantic similarity of textual statements from biomedical publications depend on the availability of laboriously curated ontologies, unsupervised neural embedding models only require large text corpora as input and do not need manual curation. In this study, we investigated the efficacy of current state-of-the-art neural sentence embedding models for semantic similarity estimation of sentences from biomedical literature.

DOI: http://dx.doi.org/10.1186/s12859-019-2789-2

Identification of monotonically differentially expressed genes for non-small cell lung cancer.

Authors: Suyan Tian

BMC Bioinformatics 2019 Apr 11;20(1):177. Epub 2019 Apr 11.

Division of Clinical Research, The First Hospital of Jilin University, 71 Xinmin Street, Changchun, 130021, Jilin, China.

Background: Monotonically expressed genes (MEGs) are genes whose expression values increase or decrease monotonically as a disease advances or time proceeds. Because non-small cell lung cancer (NSCLC) is a multistage progression process resulting from genetic sequence mutations, the identification of MEGs for NSCLC is important.

Results: With the aid of a feature selection algorithm capable of identifying MEGs - the MFSelector method - two sets of potential MEGs were selected in this study: the MEGs across the different pathologic stages and the MEGs across the risk levels of death for the NSCLC patients at early stages.

DOI: http://dx.doi.org/10.1186/s12859-019-2775-8

3DMMS: robust 3D Membrane Morphological Segmentation of C. elegans embryo.

BMC Bioinformatics 2019 Apr 8;20(1):176. Epub 2019 Apr 8.

Department of Electronic Engineering, City University of Hong Kong, Kowloon Tong, Hong Kong.

Background: Understanding the cellular architecture is a fundamental problem in various biological studies. C. elegans is widely used as a model organism in these studies because of its unique fate determinations.

DOI: http://dx.doi.org/10.1186/s12859-019-2720-x

Ranking genomic features using an information-theoretic measure of epigenetic discordance.

BMC Bioinformatics 2019 Apr 8;20(1):175. Epub 2019 Apr 8.

Whitaker Biomedical Engineering Institute, Johns Hopkins University, Baltimore, MD, USA.

Background: Establishment and maintenance of DNA methylation throughout the genome is an important epigenetic mechanism that regulates gene expression whose disruption has been implicated in human diseases like cancer. It is therefore crucial to know which genes, or other genomic features of interest, exhibit significant discordance in DNA methylation between two phenotypes. We have previously proposed an approach for ranking genes based on methylation discordance within their promoter regions, determined by centering a window of fixed size at their transcription start sites.

DOI: http://dx.doi.org/10.1186/s12859-019-2777-6

Computational enhancer prediction: evaluation and improvements.

BMC Bioinformatics 2019 Apr 5;20(1):174. Epub 2019 Apr 5.

Program in Genetics, Genomics, and Bioinformatics, University at Buffalo-State University of New York, 701 Ellicott St, Buffalo, NY, 14203, USA.

Background: Identifying transcriptional enhancers and other cis-regulatory modules (CRMs) is an important goal of post-sequencing genome annotation. Computational approaches provide a useful complement to empirical methods for CRM discovery, but it is critical that we develop effective means to evaluate their performance in terms of estimating their sensitivity and specificity.

Results: We introduce here pCRMeval, a pipeline for in silico evaluation of any enhancer prediction tools that are flexible enough to be applied to the Drosophila melanogaster genome.

DOI: http://dx.doi.org/10.1186/s12859-019-2781-x
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6451241

SNP2SIM: a modular workflow for standardizing molecular simulation and functional analysis of protein variants.

BMC Bioinformatics 2019 Apr 3;20(1):171. Epub 2019 Apr 3.

Innovation Center for Biomedical Informatics, Georgetown University Medical Center, 2115 Wisconsin Avenue, NW, Suite 110, Washington, D.C., 20007, USA.

Background: Molecular simulations are used to provide insight into protein structure and dynamics, and have the potential to provide important context when predicting the impact of sequence variation on protein function. In addition to understanding molecular mechanisms and interactions on the atomic scale, translational applications of those approaches include drug screening, development of novel molecular therapies, and targeted treatment planning. Supporting the continued development of these applications, we have developed the SNP2SIM workflow that generates reproducible molecular dynamics and molecular docking simulations for downstream functional variant analysis.

DOI: http://dx.doi.org/10.1186/s12859-019-2774-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6448223

Robustness of signal detection in cryo-electron microscopy via a bi-objective-function approach.

BMC Bioinformatics 2019 Apr 3;20(1):169. Epub 2019 Apr 3.

Intel® Parallel Computing Center for Structural Biology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA.

Background: The detection of weak signals and selection of single particles from low-contrast micrographs of frozen hydrated biomolecules by cryo-electron microscopy (cryo-EM) represents a major practical bottleneck in cryo-EM data analysis. Template-based particle picking by an objective function using fast local correlation (FLC) allows computational extraction of a large number of candidate particles from micrographs. Another independent objective function based on maximum likelihood estimates (MLE) can be used to align the images and verify the presence of a signal in the selected particles.

DOI: http://dx.doi.org/10.1186/s12859-019-2714-8
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6446299

FeatureSelect: a software for feature selection based on machine learning approaches.

BMC Bioinformatics 2019 Apr 3;20(1):170. Epub 2019 Apr 3.

Laboratory of system Biology and Bioinformatics, Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran.

Background: Feature selection, as a preprocessing stage, is a challenging problem in various sciences such as biology, engineering, computer science, and other fields. For this purpose, some studies have introduced tools and software such as WEKA. However, these tools are based on filter methods, which have lower performance relative to wrapper methods.

DOI: http://dx.doi.org/10.1186/s12859-019-2754-0
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6446290

Haplin power analysis: a software module for power and sample size calculations in genetic association analyses of family triads and unrelated controls.

BMC Bioinformatics 2019 Apr 2;20(1):165. Epub 2019 Apr 2.

Department of Global Public Health and Primary Care, University of Bergen, Bergen, Norway.

Background: Log-linear and multinomial modeling offer a flexible framework for genetic association analyses of offspring (child), parent-of-origin and maternal effects, based on genotype data from a variety of child-parent configurations. Although the calculation of statistical power or sample size is an important first step in the planning of any scientific study, there is currently a lack of software for genetic power calculations in family-based study designs. Here, we address this shortcoming through new implementations of power calculations in the R package Haplin, which is a flexible and robust software for genetic epidemiological analyses.

DOI: http://dx.doi.org/10.1186/s12859-019-2727-3
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6444579

NITPicker: selecting time points for follow-up experiments.

BMC Bioinformatics 2019 Apr 2;20(1):166. Epub 2019 Apr 2.

Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Cambridge, CB3 0WA, UK.

Background: The design of an experiment influences both what a researcher can measure, as well as how much confidence can be placed in the results. As such, it is vitally important that experimental design decisions do not systematically bias research outcomes. At the same time, making optimal design decisions can produce results leading to statistically stronger conclusions.

DOI: http://dx.doi.org/10.1186/s12859-019-2717-5
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6444531

AUTALASSO: an automatic adaptive LASSO for genome-wide prediction.

BMC Bioinformatics 2019 Apr 2;20(1):167. Epub 2019 Apr 2.

Division of Livestock Sciences, Department of Sustainable Agricultural Systems, University of Natural Resources and Life Sciences Vienna, Gregor Mendel Str. 33, Vienna, A-1180, Austria.

Background: Genome-wide prediction has become the method of choice in animal and plant breeding. Prediction of breeding values and phenotypes is routinely performed using large genomic data sets with the number of markers on the order of several thousands to millions. The number of evaluated individuals is usually smaller, which results in problems where model sparsity is of major concern.

DOI: http://dx.doi.org/10.1186/s12859-019-2743-3
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6444607

Data and knowledge management in translational research: implementation of the eTRIKS platform for the IMI OncoTrack consortium.

BMC Bioinformatics 2019 Apr 1;20(1):164. Epub 2019 Apr 1.

Janssen Research and Development Ltd, High Wycombe, UK.

Background: For large international research consortia, such as those funded by the European Union's Horizon 2020 programme or the Innovative Medicines Initiative, good data coordination practices and tools are essential for the successful collection, organization and analysis of the resulting data. Research consortia are attempting ever more ambitious science to better understand disease, by leveraging technologies such as whole genome sequencing, proteomics, patient-derived biological models and computer-based systems biology simulations.

Results: The IMI eTRIKS consortium is charged with the task of developing an integrated knowledge management platform capable of supporting the complexity of the data generated by such research programmes.

DOI: http://dx.doi.org/10.1186/s12859-019-2748-y
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6444691

A statistical normalization method and differential expression analysis for RNA-seq data between different species.

BMC Bioinformatics 2019 Mar 29;20(1):163. Epub 2019 Mar 29.

College of Mathematics and Statistics, Institute of Statistical Sciences, Shenzhen University, Shenzhen, 518060, China.

Background: High-throughput techniques bring novel tools and also statistical challenges to genomic research. Identifying genes with differential expression between different species is an effective way to discover evolutionarily conserved transcriptional responses. To remove systematic variation between different species for a fair comparison, normalization serves as a crucial pre-processing step that adjusts for the varying sample sequencing depths and other confounding technical effects.

DOI: http://dx.doi.org/10.1186/s12859-019-2745-1
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6441199

Constructing effective energy functions for protein structure prediction through broadening attraction-basin and reverse Monte Carlo sampling.

BMC Bioinformatics 2019 Mar 29;20(Suppl 3):135. Epub 2019 Mar 29.

Key Lab of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, 6, Kexueyuan South Road, Zhongguancun, Beijing, 100190, China.

Background: The ab initio approaches to protein structure prediction usually employ the Monte Carlo technique to search the structural conformation that has the lowest energy. However, the widely-used energy functions are usually ineffective for conformation search. How to construct an effective energy function remains a challenging task.

DOI: http://dx.doi.org/10.1186/s12859-019-2652-5
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6439974

Protein complex detection based on flower pollination mechanism in multi-relation reconstructed dynamic protein networks.

BMC Bioinformatics 2019 Mar 29;20(Suppl 3):131. Epub 2019 Mar 29.

Department of Mechanical Engineering and Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK, S7N 5A9, Canada.

Background: Detecting protein complexes in protein-protein interaction (PPI) networks plays a significant part in the bioinformatics field. It enables us to obtain a better understanding of the structures and characteristics of biological systems.

Methods: In this study, we present a novel algorithm, named Improved Flower Pollination Algorithm (IFPA), to identify protein complexes in multi-relation reconstructed dynamic PPI networks.

DOI: http://dx.doi.org/10.1186/s12859-019-2649-0
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6440282

RCPred: RNA complex prediction as a constrained maximum weight clique problem.

BMC Bioinformatics 2019 Mar 29;20(Suppl 3):128. Epub 2019 Mar 29.

IBISC, Univ Evry, Université Paris-Saclay, Evry, 91025, France.

Background: RNAs can interact and form complexes, which have various biological roles. The secondary structure prediction of those complexes is a first step towards the identification of their 3D structure. We propose an original approach that takes advantage of the high number of RNA secondary structure and RNA-RNA interaction prediction tools.

DOI: http://dx.doi.org/10.1186/s12859-019-2648-1
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6439972

Detecting virus-specific effects on post-infection temporal gene expression.

Authors: Quan Chen, Jun Zhu

BMC Bioinformatics 2019 Mar 29;20(Suppl 3):129. Epub 2019 Mar 29.

Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA.

Background: Different types of viruses have different envelope proteins, and may have their shared or distinctive host-virus interactions which result in various post-infection effects in humans and animals. These effects often do not appear at once but take time to unfold. To characterize the virus-specific effects, we applied a Multivariate Polynomial Time-dependent Genetic Association (MPTGA) method, previously proposed for detecting differences in temporal gene expression traits, to test for the differences in mouse lung transcriptome response to infection of different subtypes of influenza A viruses.

DOI: http://dx.doi.org/10.1186/s12859-019-2653-4
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6439963

Identification of trans-eQTLs using mediation analysis with multiple mediators.

BMC Bioinformatics 2019 Mar 29;20(Suppl 3):126. Epub 2019 Mar 29.

Center for Statistical Science, Tsinghua University, Beijing, 100084, China.

Background: Mapping expression quantitative trait loci (eQTLs) has provided insight into gene regulation. Compared to cis-eQTLs, the regulatory mechanisms of trans-eQTLs are less known. Previous studies suggest that trans-eQTLs may regulate expression of remote genes by altering the expression of nearby genes.

DOI: http://dx.doi.org/10.1186/s12859-019-2651-6
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6440281

Automatic localization and identification of mitochondria in cellular electron cryo-tomography using faster-RCNN.

BMC Bioinformatics 2019 Mar 29;20(Suppl 3):132. Epub 2019 Mar 29.

Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, USA.

Background: Cryo-electron tomography (cryo-ET) enables the 3D visualization of cellular organization in near-native state which plays important roles in the field of structural cell biology. However, due to the low signal-to-noise ratio (SNR), large volume and high content complexity within cells, it remains difficult and time-consuming to localize and identify different components in cellular cryo-ET. To automatically localize and recognize in situ cellular structures of interest captured by cryo-ET, we proposed a simple yet effective automatic image analysis approach based on Faster-RCNN.

DOI: http://dx.doi.org/10.1186/s12859-019-2650-7
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6439989

SplicedFamAlign: CDS-to-gene spliced alignment and identification of transcript orthology groups.

BMC Bioinformatics 2019 Mar 29;20(Suppl 3):133. Epub 2019 Mar 29.

Department of Computer Science, Faculty of Science, Université de Sherbrooke, Sherbrooke, Quebec, Canada.

Background: The inference of splicing orthology relationships between gene transcripts is a basic step for the prediction of transcripts and the annotation of gene structures in genomes. The splicing structure of a sequence refers to the exon extremity information in a CDS or the exon-intron extremity information in a gene sequence. Splicing orthologous CDS are pairs of CDS with similar sequences and conserved splicing structures from orthologous genes.

DOI: http://dx.doi.org/10.1186/s12859-019-2647-2
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6439985

Prediction of drug-disease associations based on ensemble meta paths and singular value decomposition.

BMC Bioinformatics 2019 Mar 29;20(Suppl 3):134. Epub 2019 Mar 29.

School of Computer Science, Wuhan University, Wuhan, 430072, People's Republic of China.

Background: In the field of drug repositioning, it is assumed that similar drugs may treat similar diseases; therefore, many existing computational methods need to compute the similarities of drugs and diseases. However, the calculation of similarity depends on the adopted measure and the available features, so similarity scores may vary dramatically from one measure to another, and the calculation does not work on incomplete data. In addition, supervised learning based methods usually need both positive and negative samples to train the prediction models, whereas in drug-disease pairs data there are only some verified interactions (positive samples) and a lot of unlabeled pairs.

DOI: http://dx.doi.org/10.1186/s12859-019-2644-5
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6439991

reactIDR: evaluation of the statistical reproducibility of high-throughput structural analyses towards a robust RNA structure prediction.

BMC Bioinformatics 2019 Mar 29;20(Suppl 3):130. Epub 2019 Mar 29.

Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology, Aomi, Koto-ku, Tokyo, Japan.

Background: Recently, next-generation sequencing techniques have been applied for the detection of RNA secondary structures, which is referred to as high-throughput RNA structural (HTS) analyses, and many different protocols have been used to detect comprehensive RNA structures at single-nucleotide resolution. However, the existing computational analyses heavily depend on the experimental methodology to generate data, which results in difficulties associated with statistically sound comparisons or combining the results obtained using different HTS methods.

Results: Here, we introduced a statistical framework, reactIDR, which can be applied to the experimental data obtained using multiple HTS methodologies.

DOI: http://dx.doi.org/10.1186/s12859-019-2645-4
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6439966

A new class of constitutively active super-enhancers is associated with fast recovery of 3D chromatin loops.

BMC Bioinformatics 2019 Mar 29;20(Suppl 3):127. Epub 2019 Mar 29.

Department of Biological Sciences, KAIST, Daejeon, 34141, South Korea.

Background: Super-enhancers or stretch enhancers are clusters of active enhancers that often coordinate cell-type specific gene regulation during development and differentiation. In addition, the enrichment of disease-associated single nucleotide polymorphism in super-enhancers indicates their critical function in disease-specific gene regulation. However, little is known about the function of super-enhancers beyond gene regulation.

DOI: http://dx.doi.org/10.1186/s12859-019-2646-3
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6439976

A (fire)cloud-based DNA methylation data preprocessing and quality control platform.

BMC Bioinformatics 2019 Mar 29;20(1):160. Epub 2019 Mar 29.

Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, MA, USA.

Background: Bisulfite sequencing allows base-pair resolution profiling of DNA methylation and has recently been adapted for use in single-cells. Analyzing these data, including making comparisons with existing data, remains challenging due to the scale of the data and differences in preprocessing methods between published datasets.

Results: We present a set of preprocessing pipelines for bisulfite sequencing DNA methylation data that include a new R/Bioconductor package, scmeth, for a series of efficient QC analyses of large datasets.

DOI: http://dx.doi.org/10.1186/s12859-019-2750-4
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6440105

PyCellBase, an efficient python package for easy retrieval of biological data from heterogeneous sources.

BMC Bioinformatics 2019 Mar 28;20(1):159. Epub 2019 Mar 28.

HPC Service, UIS, University of Cambridge, Cambridge, UK.

Background: Biological databases and repositories have been increasing in diversity and complexity over the years. This rapid expansion of current and new sources of biological knowledge raises serious problems of data accessibility and integration. To handle the growing necessity of unification, CellBase was created as an integrative solution.

DOI: http://dx.doi.org/10.1186/s12859-019-2726-4
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6438028

High throughput automatic muscle image segmentation using parallel framework.

BMC Bioinformatics 2019 Mar 28;20(1):158. Epub 2019 Mar 28.

Department of Information Science and Technology, Northwest University, Xi'an, China.

Background: Fast and accurate automatic segmentation of skeletal muscle cell image is crucial for the diagnosis of muscle related diseases, which extremely reduces the labor-intensive manual annotation. Recently, several methods have been presented for automatic muscle cell segmentation. However, most methods exhibit high model complexity and time cost, and they are not adaptive to large-scale images such as whole-slide scanned specimens. Read More

Source
DOI: http://dx.doi.org/10.1186/s12859-019-2719-3
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6437912

Linking entities through an ontology using word embeddings and syntactic re-ranking.

BMC Bioinformatics 2019 Mar 27;20(1):156. Epub 2019 Mar 27.

Department of Computer Engineering, Boğaziçi University, İstanbul, 34342, Turkey.

Background: Although there is an enormous number of textual resources in the biomedical domain, manually curated resources currently cover only a small part of the existing knowledge. The vast majority of this information is in unstructured form and contains nonstandard naming conventions. The task of named entity recognition, which is the identification of entity names in text, is not adequate without a standardization step. [...]

Source
DOI: http://dx.doi.org/10.1186/s12859-019-2678-8
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6437991

GO functional similarity clustering depends on similarity measure, clustering method, and annotation completeness.

BMC Bioinformatics 2019 Mar 27;20(1):155. Epub 2019 Mar 27.

Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, USA.

Background: Biological knowledge, and therefore Gene Ontology annotation sets, for human genes is incomplete. Recent studies have reported that biases in available GO annotations result in biased estimates of functional similarities of genes, but it is still unclear what the effect of incompleteness itself may be, even in the absence of bias. Pairwise gene similarities are used in a number of contexts, including gene "functional similarity" clustering and the related problem of functional ontology structure inference, but it is not known how different similarity measures or clustering methods perform on this task, and how the clusters are affected by annotation completeness. [...]

Source
DOI: http://dx.doi.org/10.1186/s12859-019-2752-2
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6437941

Predicting enhancers in mammalian genomes using supervised hidden Markov models.

BMC Bioinformatics 2019 Mar 27;20(1):157. Epub 2019 Mar 27.

Max Planck Institute for Molecular Genetics, Ihnestraße 63-73, Berlin, 14195, Germany.

Background: Eukaryotic gene regulation is a complex process comprising the dynamic interaction of enhancers and promoters in order to activate gene expression. In recent years, research in regulatory genomics has contributed to a better understanding of the characteristics of promoter elements, and for most sequenced model organism genomes there exist comprehensive and reliable promoter annotations. For enhancers, however, a reliable description of their characteristics and location has so far proven elusive. [...]

Source
DOI: http://dx.doi.org/10.1186/s12859-019-2708-6
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6437899

A new massively parallel nanoball sequencing platform for whole exome research.

BMC Bioinformatics 2019 Mar 25;20(1):153. Epub 2019 Mar 25.

BGI-Genomics, BGI-Shenzhen, Shenzhen, 518083, China.

Background: Whole exome sequencing (WES) has been widely used in human genetics research. BGISEQ-500 is a recently established next-generation sequencing platform. However, the performance of BGISEQ-500 on WES is not well studied. [...]

Source
DOI: http://dx.doi.org/10.1186/s12859-019-2751-3
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6434795

UTAP: User-friendly Transcriptome Analysis Pipeline.

BMC Bioinformatics 2019 Mar 25;20(1):154. Epub 2019 Mar 25.

Bioinformatics Unit, Department of Life Sciences Core Facilities, Weizmann Institute of Science, 76100, Rehovot, Israel.

Background: RNA-Seq technology is routinely used to characterize the transcriptome and to detect gene expression differences among cell types, genotypes and conditions. Advances in short-read sequencing instruments such as Illumina Next-Seq have yielded easy-to-operate machines with high throughput at a lower price per base. However, processing these data requires bioinformatics expertise to tailor and execute specific solutions for each type of library preparation. [...]

Source
DOI: http://dx.doi.org/10.1186/s12859-019-2728-2
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6434621

SSS-test: a novel test for detecting positive selection on RNA secondary structure.

BMC Bioinformatics 2019 Mar 21;20(1):151. Epub 2019 Mar 21.

Human Biology Group, Institute for Biology, Department of Biology, Chemistry, Pharmacy, Freie Universitaet Berlin, Königin-Luise-Straße 1-3, Berlin, 14195, Germany.

Background: Long non-coding RNAs (lncRNAs) play an important role in regulating gene expression and are thus important for determining phenotypes. Most attempts to measure selection in lncRNAs have focused on the primary sequence. The majority of small RNAs and at least some parts of lncRNAs must fold into specific structures to perform their biological function. [...]

Source
DOI: http://dx.doi.org/10.1186/s12859-019-2711-y
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6429701

FitTetra 2.0 - improved genotype calling for tetraploids with multiple population and parental data support.

BMC Bioinformatics 2019 Mar 20;20(1):148. Epub 2019 Mar 20.

Wageningen University and Research - Plant Breeding, Wageningen, The Netherlands.

Background: Genetic studies in tetraploids lag behind studies of diploids, as the complex genetics of tetraploids require much more elaborate computational methodologies. Recent advances in the development of molecular techniques and computational tools facilitate new methods for automated, high-throughput genotype calling in tetraploid species. We report on the upgrade of the widely used fitTetra software, aiming to improve its accuracy, which to date is hampered by technical artefacts in the data. [...]

Source
DOI: http://dx.doi.org/10.1186/s12859-019-2703-y
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6425654

A likelihood ratio test for changes in homeolog expression bias.

BMC Bioinformatics 2019 Mar 20;20(1):149. Epub 2019 Mar 20.

Department of Biology, The College of William & Mary, Williamsburg, 23187, VA, USA.

Background: Gene duplications are a major source of raw material for evolution and a likely contributor to the diversity of life on earth. Duplicate genes (i.e. [...]

Source
DOI: http://dx.doi.org/10.1186/s12859-019-2709-5
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6427896

Replicability analysis in genome-wide association studies via Cartesian hidden Markov models.

BMC Bioinformatics 2019 Mar 18;20(1):146. Epub 2019 Mar 18.

Key Laboratory for Applied Statistics of MOE, School of Mathematics and Statistics, Northeast Normal University, 5268 Renmin Street, Changchun, 130024, China.

Background: Replicability analysis, which aims to detect replicated signals, is attracting increasing attention in modern scientific applications. For example, in genome-wide association studies (GWAS), it is more convincing to detect an association that can be replicated in more than one study. Since neighboring single nucleotide polymorphisms (SNPs) often exhibit high correlation, it is desirable to properly exploit the dependency information among adjacent SNPs in replicability analysis. [...]

Source
DOI: http://dx.doi.org/10.1186/s12859-019-2707-7
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6423849

MGSEA - a multivariate Gene set enrichment analysis.

BMC Bioinformatics 2019 Mar 18;20(1):145. Epub 2019 Mar 18.

Institute of Statistical Science, Academia Sinica, Taipei, Taiwan.

Background: Gene Set Enrichment Analysis (GSEA) is a powerful tool to identify enriched functional categories of informative biomarkers. Canonical GSEA takes one-dimensional feature scores derived from the data of one platform as inputs. Numerous extensions of GSEA handling multimodal OMIC data have been proposed, yet none of them explicitly captures combinatorial relations of feature scores from multiple platforms. [...]

Source
DOI: http://dx.doi.org/10.1186/s12859-019-2716-6
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6421703

GMASS: a novel measure for genome assembly structural similarity.

BMC Bioinformatics 2019 Mar 18;20(1):147. Epub 2019 Mar 18.

Department of Biomedical Science and Engineering, Konkuk University, Seoul, 05029, South Korea.

Background: Thanks to recent advances in next-generation sequencing (NGS) technologies, a large amount of genomic data, in the form of short DNA sequences known as reads, has been accumulating. Diverse assemblers have been developed to generate high-quality de novo assemblies from NGS reads, but their outputs differ considerably because of algorithmic differences. However, there are no properly structured measures to show the similarity or difference between assemblies. [...]

Source
DOI: http://dx.doi.org/10.1186/s12859-019-2710-z
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6423833

FitEllipsoid: a fast supervised ellipsoid segmentation plugin.

BMC Bioinformatics 2019 Mar 15;20(1):142. Epub 2019 Mar 15.

ITAV, CNRS, Université de Toulouse, 1 Pl. Pierre Potier, Toulouse, 31106, France.

Background: The segmentation of a 3D image is a task that can hardly be automated in certain situations, notably when the contrast is low and/or the distance between elements is small. The existing supervised methods require a large amount of user input, e.g. [...]

Source
DOI: http://dx.doi.org/10.1186/s12859-019-2673-0
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6419800

Identifying miRNA-mRNA regulatory relationships in breast cancer with invariant causal prediction.

BMC Bioinformatics 2019 Mar 15;20(1):143. Epub 2019 Mar 15.

School of Information Technology and Mathematical Sciences, University of South Australia, Adelaide, Australia.

Background: microRNAs (miRNAs) regulate gene expression at the post-transcriptional level and play an important role in various biological processes in the human body. Therefore, identifying their regulation mechanisms is essential for the diagnostics and therapeutics of a wide range of diseases. A large number of studies have used gene expression profiles to address this problem. [...]

Source
DOI: http://dx.doi.org/10.1186/s12859-019-2668-x
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6419852

Network meta-analysis correlates with analysis of merged independent transcriptome expression data.

BMC Bioinformatics 2019 Mar 15;20(1):144. Epub 2019 Mar 15.

Institute for Animal Breeding and Genetics, University of Veterinary Medicine Hannover, Bünteweg 17p, Hannover, 30559, Germany.

Background: Using meta-analysis, high-dimensional transcriptome expression data from public repositories can be merged to make group comparisons that have not been considered in the original studies. Merging of high-dimensional expression data can, however, involve batch effects that are sometimes difficult to remove. Removing batch effects becomes even more difficult when expression data were obtained using different technologies in the individual studies (e. [...]

Source
DOI: http://dx.doi.org/10.1186/s12859-019-2705-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6420731

Evaluation of linear models and missing value imputation for the analysis of peptide-centric proteomics.

BMC Bioinformatics 2019 Mar 14;20(Suppl 2):102. Epub 2019 Mar 14.

Institute for Genomics, Biocomputing and Biotechnology, Mississippi State University, Mississippi State, MS, USA.

Background: Several methods are evaluated for handling data generated from bottom-up proteomics via liquid chromatography-mass spectrometry, particularly for peptide-centric quantification dealing with post-translational modification (PTM) analysis such as reversible cysteine oxidation. The paper proposes a pipeline based on the R programming language to analyze PTMs from peptide-centric label-free quantitative proteomics data.

Results: Our methodology includes variance stabilization, normalization, and missing data imputation to account for the large dynamic range of PTM measurements. [...]

Source
DOI: http://dx.doi.org/10.1186/s12859-019-2619-6
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6419331

Predicting protein residue-residue contacts using random forests and deep networks.

BMC Bioinformatics 2019 Mar 14;20(Suppl 2):100. Epub 2019 Mar 14.

Department of Computer Science, University of Miami, 1365 Memorial Drive, Coral Gables, FL, 33124, USA.

Background: The ability to predict which pairs of amino acid residues in a protein are in contact with each other offers many advantages for various areas of research that focus on proteins. For example, contact prediction can be used to reduce the computational complexity of predicting the structure of proteins and even to help identify functionally important regions of proteins. These predictions are becoming especially important given the relatively low number of experimentally determined protein structures compared to the amount of available protein sequence data. [...]

Source
DOI: http://dx.doi.org/10.1186/s12859-019-2627-6
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6419322