1,558 results match your criteria Evolutionary Bioinformatics [Journal]


Dhaka: Variational Autoencoder for Unmasking Tumor Heterogeneity from Single Cell Genomic Data.

Bioinformatics 2019 Feb 15. Epub 2019 Feb 15.

Microsoft Research, Redmond, USA.

Motivation: Intra-tumor heterogeneity is one of the key confounding factors in deciphering tumor evolution. Malignant cells exhibit variations in their gene expression, copy numbers, and mutation even when originating from a single progenitor cell. Single cell sequencing of tumor cells has recently emerged as a viable option for unmasking the underlying tumor heterogeneity. Read More

DOI: http://dx.doi.org/10.1093/bioinformatics/btz095
February 2019

MultiDomainBenchmark: a multi-domain query and subject database suite.

BMC Bioinformatics 2019 Feb 14;20(1):77. Epub 2019 Feb 14.

National Center for Biotechnology Information, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA.

Background: Genetic sequence database retrieval benchmarks play an essential role in evaluating the performance of sequence searching tools. To date, all phylogenetically diverse benchmarks known to the authors include only query sequences with single protein domains. Domains are the primary building blocks of protein structure and function. Read More

DOI: http://dx.doi.org/10.1186/s12859-019-2660-5
February 2019

Estimation of duplication history under a stochastic model for tandem repeats.

BMC Bioinformatics 2019 Feb 6;20(1):64. Epub 2019 Feb 6.

Department of Electrical Engineering, California Institute of Technology, Pasadena, USA.

Background: Tandem repeat sequences are common in the genomes of many organisms and are known to cause important phenomena such as gene silencing and rapid morphological changes. Due to the presence of multiple copies of the same pattern in tandem repeats and their high variability, they contain a wealth of information about the mutations that have led to their formation. The ability to extract this information can enhance our understanding of evolutionary mechanisms. Read More

DOI: http://dx.doi.org/10.1186/s12859-019-2603-1
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6364452
February 2019

Incorporating alignment uncertainty into Felsenstein's phylogenetic bootstrap to improve its reliability.

Bioinformatics 2019 Feb 6. Epub 2019 Feb 6.

Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain.

Motivation: Most evolutionary analyses are based on pre-estimated multiple sequence alignment. Wong et al. established the existence of an uncertainty induced by multiple sequence alignment when reconstructing phylogenies. Read More

DOI: http://dx.doi.org/10.1093/bioinformatics/btz082
February 2019
1 Read

Understanding the evolutionary trend of intrinsically structural disorders in cancer relevant proteins as probed by Shannon entropy scoring and structure network analysis.

BMC Bioinformatics 2019 Feb 4;19(Suppl 13):549. Epub 2019 Feb 4.

Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, Massachusetts, 02138, USA.

Background: Malignant diseases have become a threat to health care systems. A panoply of biological processes is involved in causing these diseases. In order to unveil the mechanistic details of these diseased states, we analyzed protein families relevant to these diseases. Read More

DOI: http://dx.doi.org/10.1186/s12859-018-2552-0
February 2019
1 Read
2.576 Impact Factor

Reviewer-coerced citation: Case report, update on journal policy, and suggestions for future prevention.

Bioinformatics 2019 Jan 30. Epub 2019 Jan 30.

Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany.

DOI: http://dx.doi.org/10.1093/bioinformatics/btz071
January 2019

s-dePooler: determination of polymorphism carriers from overlapping DNA pools.

BMC Bioinformatics 2019 Jan 22;20(1):45. Epub 2019 Jan 22.

Research Department of Non-Coronary Heart Diseases, Almazov National Medical Research Center, Ministry of Health of Russia, 2 Akkuratova St., St. Petersburg, 197341, Russia.

Background: Sample pooling is a method widely used in studies to reduce costs and labour. DNA sample pooling combined with massively parallel sequencing is a powerful tool for discovering DNA variants (polymorphisms) in large populations under analysis, which underpins research fields such as genome-wide association studies and evolutionary and population studies. The use of overlapping pools, where each sample is present in multiple pools, can enhance the accuracy of polymorphism detection and allow carriers of rare variants to be identified. Read More
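
The overlapping-pool idea can be illustrated with a minimal Python sketch. This is not the s-dePooler algorithm; it assumes an error-free, hypothetical design in which a variant is detected in a pool exactly when at least one carrier is in that pool. The names `pool_design` and `candidate_carriers` are illustrative only.

```python
def candidate_carriers(pool_design, positive_pools):
    """Return samples whose every pool tested positive for the variant.

    pool_design: dict mapping sample name -> set of pool ids containing it.
    positive_pools: set of pool ids in which the variant was detected.
    Assumes error-free detection (a hypothetical simplification).
    """
    return sorted(
        sample
        for sample, pools in pool_design.items()
        if pools and pools <= positive_pools
    )


# Hypothetical 2x2 grid design: each sample sits in one row pool
# and one column pool, so every sample has a unique pool signature.
pool_design = {
    "s1": {"row1", "col1"},
    "s2": {"row1", "col2"},
    "s3": {"row2", "col1"},
    "s4": {"row2", "col2"},
}

# The variant was seen only in pools row1 and col2.
print(candidate_carriers(pool_design, {"row1", "col2"}))
```

With a single carrier the signature is unambiguous. When several carriers are present, more samples may satisfy the condition than actually carry the variant, which is why real designs need more pools or follow-up testing.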

DOI: http://dx.doi.org/10.1186/s12859-019-2616-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6343301
January 2019

Protein Fold Recognition based on Multi-view Modeling.

Bioinformatics 2019 Jan 21. Epub 2019 Jan 21.

School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong, China.

Motivation: Protein fold recognition has attracted increasing attention because it is critical for studies of the 3D structures of proteins and drug design. Researchers have been extensively studying this important task, and several features with high discriminative power have been proposed. However, the development of methods that efficiently combine these features to improve the predictive performance remains a challenging problem. Read More

DOI: http://dx.doi.org/10.1093/bioinformatics/btz040
January 2019

admixr - R package for reproducible analyses using ADMIXTOOLS.

Bioinformatics 2019 Jan 22. Epub 2019 Jan 22.

Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany.

Summary: We present a new R package admixr, which provides a convenient interface for performing reproducible population genetic analyses (f3, D, f4, f4-ratio, qpWave and qpAdm), as implemented by command-line programs in the ADMIXTOOLS software suite. In a traditional ADMIXTOOLS workflow, the user must first generate a set of text configuration files tailored to each individual analysis, often using a combination of shell scripting and manual text editing. The non-tabular output files then need to be parsed to extract values of interest prior to further analyses. Read More

DOI: http://dx.doi.org/10.1093/bioinformatics/btz030
January 2019

PSiTE: a Phylogeny guided Simulator for Tumor Evolution.

Bioinformatics 2019 Jan 14. Epub 2019 Jan 14.

Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, P.R.China.

Summary: Simulating realistic clonal dynamics of tumors is an important topic in cancer genomics. Here, we present PSiTE (Phylogeny guided Simulator for Tumor Evolution), a tool that can simulate different types of tumor samples including single sector, multi-sector bulk tumor as well as single-cell tumor data under a wide range of evolutionary trajectories. PSiTE provides an efficient tool for understanding clonal evolution of cancer. Read More

DOI: http://dx.doi.org/10.1093/bioinformatics/btz028
January 2019
1 Read

Characterization and identification of long non-coding RNAs based on feature relationship.

Bioinformatics 2019 Jan 12. Epub 2019 Jan 12.

CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China.

Motivation: The significance of long non-coding RNAs (lncRNAs) in many biological processes and diseases has attracted intense interest over the past several years. However, computational identification of lncRNAs in a wide range of species remains challenging; it requires prior knowledge of well-established sequences and annotations or species-specific training data, but in reality only a limited number of species have high-quality sequences and annotations.

Results: Here we first characterize lncRNAs by contrast to protein-coding RNAs based on feature relationship and find that the feature relationship between ORF (open reading frame) length and GC content presents universally substantial divergence in lncRNAs and protein-coding RNAs, as observed in a broad variety of species. Read More
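
The two features named above, ORF length and GC content, can be computed with a short Python sketch. This is an illustration only, not the authors' method: it assumes a DNA-alphabet transcript, scans only the forward strand, treats ATG as the sole start codon, and uses hypothetical function names.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def gc_content(seq):
    """Fraction of G and C bases in the sequence (0.0 for an empty one)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def longest_orf_length(seq):
    """Length in nucleotides of the longest ATG-to-stop ORF.

    Scans the three forward-strand reading frames; the returned
    length includes the stop codon. Returns 0 if no complete ORF exists.
    """
    seq = seq.upper()
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None  # look for the next ORF in this frame
    return best


transcript = "CCATGAAATGATT"  # made-up example sequence
print(gc_content(transcript), longest_orf_length(transcript))
```

Plotting one feature against the other for a set of transcripts would be one way to inspect the kind of feature relationship the abstract describes.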

Publisher Site: https://academic.oup.com/bioinformatics/advance-article/doi/
DOI: http://dx.doi.org/10.1093/bioinformatics/btz008
January 2019
5 Reads

AQUAPONY: visualization and interpretation of phylogeographic information on phylogenetic trees.

Bioinformatics 2019 Jan 14. Epub 2019 Jan 14.

LIRMM, UMR 5506, CNRS and Université Montpellier, Montpellier, France.

Motivation: The visualization and interpretation of evolutionary spatiotemporal scenarios is broadly and increasingly used in infectious disease research, ecology, or agronomy. Using probabilistic frameworks, well-known tools can infer from molecular data ancestral traits for internal nodes in a phylogeny, and numerous phylogenetic rendering tools can display such evolutionary trees. However, visualizing such ancestral information and its uncertainty on the tree remains tedious. Read More

DOI: http://dx.doi.org/10.1093/bioinformatics/btz011
January 2019
1 Read

Simulation of Heterogeneous Tumour Genomes with HeteroGenesis and In Silico Whole Exome Sequencing.

Bioinformatics 2019 Jan 4. Epub 2019 Jan 4.

Leeds Institute of Medical Research at St James's, St James's University Hospital, Leeds, UK.

Summary: Tumour evolution results in progressive cancer phenotypes such as metastatic spread and treatment resistance. To better treat cancers, we must characterise tumour evolution and the genetic events that confer progressive phenotypes. This is facilitated by high coverage genome or exome sequencing. Read More

DOI: http://dx.doi.org/10.1093/bioinformatics/bty1063
January 2019

Single-cell RNA-seq Interpretations using Evolutionary Multiobjective Ensemble Pruning.

Bioinformatics 2018 Dec 28. Epub 2018 Dec 28.

Department of Computer Science, City University of Hong Kong, Hong Kong SAR.

Motivation: In recent years, single-cell RNA sequencing enables us to discover cell types or even subtypes. Its increasing availability provides opportunities to identify cell populations from single-cell RNA-seq data. Computational methods have been employed to reveal the gene expression variations among multiple cell populations. Read More

DOI: http://dx.doi.org/10.1093/bioinformatics/bty1056
December 2018

Homeolog expression quantification methods for allopolyploids.

Brief Bioinform 2018 Dec 27. Epub 2018 Dec 27.

Artificial Intelligence Research Center, AIST, 2-3-26 Aomi, Koto-ku, Tokyo 135-0064, Japan.

Genome duplication with hybridization, or allopolyploidization, occurs in animals, fungi and plants, and is especially common in crop plants. There is an increasing interest in the study of allopolyploids because of advances in polyploid genome assembly; however, the high level of sequence similarity in duplicated gene copies (homeologs) poses many challenges. Here we compared standard RNA-seq expression quantification approaches currently used for diploid species against subgenome-classification approaches, which map reads to each subgenome separately. Read More

Publisher Site: https://academic.oup.com/bib/advance-article/doi/10.1093/bib
DOI: http://dx.doi.org/10.1093/bib/bby121
December 2018
4 Reads

Ancestral sequence reconstruction: accounting for structural information by averaging over replacement matrices.

Bioinformatics 2018 Dec 24. Epub 2018 Dec 24.

School of Molecular Cell Biology and Biotechnology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Ramat Aviv, Israel.

Motivation: Ancestral sequence reconstruction (ASR) is widely used to understand protein evolution, structure and function. Current ASR methodologies do not fully consider differences in evolutionary constraints among positions imposed by the three-dimensional (3D) structure of the protein. Here we developed an ASR algorithm that allows different protein sites to evolve according to different mixtures of replacement matrices. Read More

DOI: http://dx.doi.org/10.1093/bioinformatics/bty1031
December 2018

Degeneracy and genetic assimilation in RNA evolution.

BMC Bioinformatics 2018 Dec 27;19(1):543. Epub 2018 Dec 27.

University of Virginia Biocomplexity Institute, 995 Research Park Boulevard, Charlottesville, VA 22911, USA.

Background: The neutral theory of Motoo Kimura stipulates that evolution is mostly driven by neutral mutations. However adaptive pressure eventually leads to changes in phenotype that involve non-neutral mutations. The relation between neutrality and adaptation has been studied in the context of RNA before and here we further study transitional mutations in the context of degenerate (plastic) RNA sequences and genetic assimilation. Read More

Publisher Site: https://bmcbioinformatics.biomedcentral.com/articles/10.1186
DOI: http://dx.doi.org/10.1186/s12859-018-2497-3
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6307299
December 2018
6 Reads

GLUE: a flexible software system for virus sequence data.

BMC Bioinformatics 2018 Dec 18;19(1):532. Epub 2018 Dec 18.

MRC-University of Glasgow Centre for Virus Research, Glasgow, Scotland, UK.

Background: Virus genome sequences, generated in ever-higher volumes, can provide new scientific insights and inform our responses to epidemics and outbreaks. To facilitate interpretation, such data must be organised and processed within scalable computing resources that encapsulate virology expertise. GLUE (Genes Linked by Underlying Evolution) is a data-centric bioinformatics environment for building such resources. Read More

DOI: http://dx.doi.org/10.1186/s12859-018-2459-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6299651
December 2018
3 Reads

ModL: exploring and restoring regularity when testing for positive selection.

Bioinformatics 2018 Dec 12. Epub 2018 Dec 12.

Department of Mathematics and Statistics, Dalhousie University, Halifax, NS, Canada.

Motivation: Likelihood ratio tests are commonly used to test for positive selection acting on proteins. They are usually applied with thresholds for declaring a protein under positive selection determined from a chi-square or mixture of chi-square distributions. While it is known that such distributions are not strictly justified due to the statistical irregularity of the problem, the hope has been that the resulting tests are conservative and do not lose much power in comparison with the same test using the unknown, correct threshold. Read More
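
As a hedged illustration of the thresholds mentioned above, the following Python sketch computes a p-value for a likelihood ratio statistic under a chi-square distribution with 1 degree of freedom and under a 50:50 mixture of a point mass at zero and that chi-square. The 1-degree-of-freedom choice, the 50:50 weights and the function names are assumptions made for the example; this is not the ModL procedure.

```python
import math


def lrt_statistic(lnl_null, lnl_alt):
    """Likelihood ratio statistic 2 * (lnL_alt - lnL_null)."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_1df_pvalue(x):
    """P(X >= x) for a chi-square variable with 1 degree of freedom.

    Uses the identity P(X > x) = erfc(sqrt(x / 2)) for x > 0.
    """
    if x <= 0:
        return 1.0
    return math.erfc(math.sqrt(x / 2.0))


def mixture_pvalue(x):
    """P(X >= x) under a 50:50 mixture of a point mass at 0 and chi-square(1).

    The mixture weights are an assumption chosen for illustration.
    """
    if x <= 0:
        return 1.0
    return 0.5 * chi2_1df_pvalue(x)


# Hypothetical log-likelihoods from a null and an alternative model fit.
stat = lrt_statistic(lnl_null=-1234.5, lnl_alt=-1232.0)
print(stat, chi2_1df_pvalue(stat), mixture_pvalue(stat))
```

For any positive statistic the mixture p-value is half the plain chi-square p-value, so the choice of reference distribution directly changes which proteins pass a fixed significance threshold.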

DOI: http://dx.doi.org/10.1093/bioinformatics/bty1019
December 2018
1 Read

Increasing the accuracy of protein loop structure prediction with evolutionary constraints.

Bioinformatics 2018 Dec 10. Epub 2018 Dec 10.

Department of Statistics, University of Oxford, Oxford, United Kingdom.

Motivation: Accurate prediction of loop structures remains challenging. This is especially true for long loops where the large conformational space and limited coverage of experimentally-determined structures often leads to low accuracy. Co-evolutionary contact predictors, which provide information about the proximity of pairs of residues, have been used to improve whole-protein models generated through de novo techniques. Read More

DOI: http://dx.doi.org/10.1093/bioinformatics/bty996
December 2018

A Novel Measure of Non-coding Genome Conservation Identifies Genomic Regulatory Blocks Within Primates.

Bioinformatics 2018 Dec 7. Epub 2018 Dec 7.

Computational Regulatory Genomics Group, MRC London Institute of Medical Sciences, Du Cane Road, London, UK.

Motivation: Clusters of extremely conserved non-coding elements (CNEs) mark genomic regions devoted to cis-regulation of key developmental genes in Metazoa. We have recently shown that their span coincides with that of topologically associating domains (TADs), making them useful for estimating conserved TAD boundaries in the absence of Hi-C data. The standard approach - detecting CNEs in genome alignments and then establishing the boundaries of their clusters - requires tuning of several parameters and breaks down when comparing closely related genomes. Read More

DOI: http://dx.doi.org/10.1093/bioinformatics/bty1014
December 2018

iHam & pyHam: visualizing and processing hierarchical orthologous groups.

Bioinformatics 2018 Dec 3. Epub 2018 Dec 3.

SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland.

Summary: The evolutionary history of gene families can be complex due to duplications and losses. This complexity is compounded by the large number of genomes simultaneously considered in contemporary comparative genomic analyses. As provided by several orthology databases, hierarchical orthologous groups (HOGs) are sets of genes that are inferred to have descended from a common ancestral gene within a species clade. Read More

Publisher Site: https://academic.oup.com/bioinformatics/advance-article/doi/
DOI: http://dx.doi.org/10.1093/bioinformatics/bty994
December 2018
8 Reads

Multi-omic analysis of signalling factors in inflammatory comorbidities.

BMC Bioinformatics 2018 Nov 30;19(Suppl 15):439. Epub 2018 Nov 30.

Computer Laboratory, University of Cambridge, Cambridge, UK.

Background: Inflammation is a core element of many different, systemic and chronic diseases that usually involve an important autoimmune component. The clinical phase of inflammatory diseases is often the culmination of a long series of pathologic events that started years before. The systemic characteristics and related mechanisms could be investigated through the multi-omic comparative analysis of many inflammatory diseases. Read More

DOI: http://dx.doi.org/10.1186/s12859-018-2413-x
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6266935
November 2018
1 Read

Multilevel comparative bioinformatics to investigate evolutionary relationships and specificities in gene annotations: an example for tomato and grapevine.

BMC Bioinformatics 2018 Nov 30;19(Suppl 15):435. Epub 2018 Nov 30.

Department of Agriculture, University of Naples "Federico II", Portici, Naples, Italy.

Background: "Omics" approaches may provide useful information for a deeper understanding of speciation events, diversification and function innovation. This can be achieved by investigating the molecular similarities at sequence level between species, allowing the definition of ortholog and paralog genes. However, the spreading of sequenced genome, often endowed with still preliminary annotations, requires suitable bioinformatics to be appropriately exploited in this framework. Read More

DOI: http://dx.doi.org/10.1186/s12859-018-2420-y
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6266932
November 2018
1 Read

Comprehensive review of the identification of essential genes using computational methods: focusing on feature implementation and assessment.

Brief Bioinform 2018 Nov 29. Epub 2018 Nov 29.

School of Life Science and Technology, Center for Informational Biology, Intelligent Learning Institute for Science and Application, University of Electronic Science and Technology of China, Chengdu, China.

Essential genes have attracted increasing attention in recent years due to the important functions of these genes in organisms. Among the methods used to identify essential genes, accurate and efficient computational methods can make up for the deficiencies of expensive and time-consuming experimental technologies. In this review, we have collected research on essential gene prediction in prokaryotes and eukaryotes and summarized the five predominant types of features used in these studies. Read More

DOI: http://dx.doi.org/10.1093/bib/bby116
November 2018
1 Read

PhastWeb: a web interface for evolutionary conservation scoring of multiple sequence alignments using phastCons and phyloP.

Bioinformatics 2018 Nov 27. Epub 2018 Nov 27.

Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA.

Summary: The Phylogenetic Analysis with Space/Time models (PHAST) package is a widely used software package for comparative genomics that has been freely available for download since 2002. Here we introduce a web interface (phastWeb) that makes it possible to use two of the most popular programs in PHAST, phastCons and phyloP, without downloading and installing the PHAST software. This interface allows users to upload a sequence alignment and either upload a corresponding phylogeny or have one estimated from the alignment. Read More

Publisher Site: https://academic.oup.com/bioinformatics/advance-article/doi/
DOI: http://dx.doi.org/10.1093/bioinformatics/bty966
November 2018
7 Reads

A statistical method to identify recombination in bacterial genomes based on SNP incompatibility.

BMC Bioinformatics 2018 Nov 22;19(1):450. Epub 2018 Nov 22.

Department of Computer Science & Engineering, Texas A&M University, College Station, TX 77843, USA.

Background: Phylogeny estimation for bacteria is likely to reflect their true evolutionary histories only if they are highly clonal. However, recombination events could occur during evolution for some species. The reconstruction of phylogenetic trees from an alignment without considering recombination could be misleading, since the relationships among strains in some parts of the genome might be different than in others. Read More

DOI: http://dx.doi.org/10.1186/s12859-018-2456-z
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6251179
November 2018
6 Reads

Gene characteristics predicting missense, nonsense and frameshift mutations in tumor samples.

BMC Bioinformatics 2018 Nov 19;19(1):430. Epub 2018 Nov 19.

The Geisel School of Medicine, Department of Biomedical Data Science, Dartmouth College, HB7936, One Medical Center Dr., Dartmouth-Hitchcock Medical Center, Lebanon, NH, 03756, USA.

Background: Because driver mutations provide selective advantage to the mutant clone, they tend to occur at a higher frequency in tumor samples compared to selectively neutral (passenger) mutations. However, mutation frequency alone is insufficient to identify cancer genes because mutability is influenced by many gene characteristics, such as size, nucleotide composition, etc. The goal of this study was to identify gene characteristics associated with the frequency of somatic mutations in the gene in tumor samples. Read More

Publisher Site: https://bmcbioinformatics.biomedcentral.com/articles/10.1186
DOI: http://dx.doi.org/10.1186/s12859-018-2455-0
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6245819
November 2018
12 Reads

Automated selection of homologs to track the evolutionary history of proteins.

BMC Bioinformatics 2018 Nov 19;19(1):431. Epub 2018 Nov 19.

Faculty of Biology, Johannes Gutenberg University Mainz, Hans-Dieter-Hüsch-Weg 15, 55128, Mainz, Germany.

Background: The selection of distant homologs of a query protein under study is a usual and useful application of protein sequence databases. Such sets of homologs are often applied to investigate the function of a protein and the degree to which experimental results can be transferred from one organism to another. In particular, a variety of databases facilitates static browsing for orthologs. Read More

DOI: http://dx.doi.org/10.1186/s12859-018-2457-y
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6245638
November 2018
2.576 Impact Factor

gmRAD: an integrated SNP calling pipeline for genetic mapping with RADseq across a hybrid population.

Brief Bioinform 2018 Nov 14. Epub 2018 Nov 14.

Southern Modern Forestry Collaborative Innovation Center, College of Forestry, Nanjing Forestry University, Nanjing, China.

Restriction site-associated DNA sequencing (RADseq) is a powerful technology that has been extensively applied in population genetics, phylogenetics and genetic mapping. Although many software packages are available for ecological and evolutionary studies, few effective tools are available for extracting genotype data with RADseq for genetic mapping, a prerequisite for quantitative trait locus mapping, comparative genomics and genome scaffold assembly. Here, we present an integrated pipeline called gmRAD for generating single nucleotide polymorphism (SNP) genotypes from RADseq data, de novo, across a genetic mapping population derived by crossing two parents. Read More

DOI: http://dx.doi.org/10.1093/bib/bby114
November 2018
2 Reads

FLYCOP: metabolic modeling-based analysis and engineering microbial communities.

Bioinformatics 2018 Sep;34(17):i954-i963

Department of Systems Biology, Centro Nacional de Biotecnología, Consejo Superior de Investigaciones Científicas (CNB-CSIC), 28049 Madrid, Spain.

Motivation: Synthetic microbial communities are beginning to be considered promising multicellular biocatalysts with a large potential to replace engineered single strains in biotechnology applications in the pharmaceutical, chemical and living architecture sectors. In contrast to single strain engineering, the effective and high-throughput analysis and engineering of microbial consortia faces a lack of knowledge, tools and well-defined workflows. This manuscript helps fill this important gap with a framework, called FLYCOP (FLexible sYnthetic Consortium OPtimization), which contributes to microbial consortia modeling and engineering, while improving the knowledge about how these communities work. Read More

Publisher Site: https://academic.oup.com/bioinformatics/article/34/17/i954/5
DOI: http://dx.doi.org/10.1093/bioinformatics/bty561
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6129290
September 2018
12 Reads

Fast characterization of segmental duplications in genome assemblies.

Bioinformatics 2018 Sep;34(17):i706-i714

Vancouver Prostate Centre, Vancouver, Canada.

Motivation: Segmental duplications (SDs), or low-copy repeats, are segments of DNA > 1 Kbp with high sequence identity that are copied to other regions of the genome. SDs are among the most important sources of evolution and a common cause of genomic structural variation, and several are associated with diseases of genomic origin, including schizophrenia and autism. Despite their functional importance, SDs present one of the major hurdles for de novo genome assembly due to the ambiguity they cause in building and traversing both state-of-the-art overlap-layout-consensus and de Bruijn graphs. Read More

DOI: http://dx.doi.org/10.1093/bioinformatics/bty586
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6129265
September 2018
3 Reads

Predicting protein-protein interactions through sequence-based deep learning.

Bioinformatics 2018 Sep;34(17):i802-i810

Toyota Technological Institute at Chicago, Chicago, IL, USA.

Motivation: High-throughput experimental techniques have produced a large amount of protein-protein interaction (PPI) data, but their coverage is still low and the PPI data is also very noisy. Computational prediction of PPIs can be used to discover new PPIs and identify errors in the experimental PPI data.

Results: We present a novel deep learning framework, DPPI, to model and predict PPIs from sequence information alone. Read More

Publisher Site: https://academic.oup.com/bioinformatics/article/34/17/i802/5
DOI: http://dx.doi.org/10.1093/bioinformatics/bty573
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6129267
September 2018
29 Reads

Learning structural motif representations for efficient protein structure search.

Bioinformatics 2018 Sep;34(17):i773-i780

Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA.

Motivation: Given a protein of unknown function, fast identification of similar protein structures from the Protein Data Bank (PDB) is a critical step for inferring its biological function. Such structural neighbors can provide evolutionary insights into protein conformation, interfaces and binding sites that are not detectable from sequence similarity. However, the computational cost of performing pairwise structural alignment against all structures in PDB is prohibitively expensive. Read More

Publisher Site: https://academic.oup.com/bioinformatics/article/34/17/i773/5
DOI: http://dx.doi.org/10.1093/bioinformatics/bty585
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6129266
September 2018
8 Reads

Computational enhancement of single-cell sequences for inferring tumor evolution.

Bioinformatics 2018 Sep;34(17):i917-i926

Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA, USA.

Motivation: Tumor sequencing has entered an exciting phase with the advent of single-cell techniques that are revolutionizing the assessment of single nucleotide variation (SNV) at the highest cellular resolution. However, state-of-the-art single-cell sequencing technologies produce data with many missing bases (MBs) and incorrect base designations that lead to false-positive (FP) and false-negative (FN) detection of somatic mutations. While computational methods are available to make biological inferences in the presence of these errors, the accuracy of the imputed MBs and corrected FPs and FNs remains unknown. Read More

Publisher Site: https://academic.oup.com/bioinformatics/article/34/17/i917/5
DOI: http://dx.doi.org/10.1093/bioinformatics/bty571
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6129264
September 2018
13 Reads
4.981 Impact Factor

SPhyR: tumor phylogeny estimation from single-cell sequencing data under loss and error.

Bioinformatics 2018 Sep;34(17):i671-i679

Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA.

Motivation: Cancer is characterized by intra-tumor heterogeneity, the presence of distinct cell populations with distinct complements of somatic mutations, which include single-nucleotide variants (SNVs) and copy-number aberrations (CNAs). Single-cell sequencing technology enables one to study these cell populations at single-cell resolution. Phylogeny estimation algorithms that employ appropriate evolutionary models are key to understanding the evolutionary mechanisms behind intra-tumor heterogeneity. Read More

DOI: http://dx.doi.org/10.1093/bioinformatics/bty589
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6153375
September 2018

wgd - simple command line tools for the analysis of ancient whole genome duplications.

Bioinformatics 2018 Nov 6. Epub 2018 Nov 6.

Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium.

Motivation: Ancient whole genome duplications (WGDs) have been uncovered in almost all major lineages of life on Earth and the search for traces or remnants of such events has become standard practice in most genome analyses. This is especially true for plants, where ancient WGDs are abundant. Common approaches to find evidence for ancient WGDs include the construction of KS distributions and the analysis of intragenomic co-linearity. Read More

Source
Publisher Site: https://academic.oup.com/bioinformatics/advance-article/doi/
DOI Listing: http://dx.doi.org/10.1093/bioinformatics/bty915
November 2018

High-Complexity Regions in Mammalian Genomes are Enriched for Developmental Genes.

Bioinformatics 2018 Nov 5. Epub 2018 Nov 5.

Department of Evolutionary Genetics, Max-Planck-Institute for Evolutionary Biology, Plön, Germany.

Motivation: Unique sequence regions are associated with genetic function in vertebrate genomes. However, measuring uniqueness, or absence of long repeats, along a genome is conceptually and computationally difficult. Here we use a previously published variant of the Lempel-Ziv complexity, the match complexity, Cm, and augment it by deriving its null distribution for random sequences. Read More

Source
DOI Listing: http://dx.doi.org/10.1093/bioinformatics/bty922
November 2018

ISU FLUture: a veterinary diagnostic laboratory web-based platform to monitor the temporal genetic patterns of Influenza A virus in swine.

BMC Bioinformatics 2018 Nov 1;19(1):397. Epub 2018 Nov 1.

Department of Veterinary Diagnostic & Production Animal Medicine, Iowa State University, 1575 Vet Med, 1850 Christensen Dr, Ames, IA, 50011-1134, USA.

Background: Influenza A Virus (IAV) causes respiratory disease in swine and is a zoonotic pathogen. Uncontrolled IAV in swine herds not only affects animal health, it also impacts production through increased costs associated with treatment and prevention efforts. The Iowa State University Veterinary Diagnostic Laboratory (ISU VDL) diagnoses influenza respiratory disease in swine and provides epidemiological analyses on samples submitted by veterinarians. Read More

Source
DOI Listing: http://dx.doi.org/10.1186/s12859-018-2408-7
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6211438
November 2018

On the impact of uncertain gene tree rooting on duplication-transfer-loss reconciliation.

BMC Bioinformatics 2018 Aug 13;19(Suppl 9):290. Epub 2018 Aug 13.

Department of Computer Science and Engineering, University of Connecticut, Storrs, CT, 06269, USA.

Background: Duplication-Transfer-Loss (DTL) reconciliation is a powerful and increasingly popular technique for studying the evolution of microbial gene families. DTL reconciliation requires the use of rooted gene trees to perform the reconciliation with the species tree, and the standard technique for rooting gene trees is to assign a root that results in the minimum reconciliation cost across all rootings of that gene tree. However, even though it is well understood that many gene trees have multiple optimal roots, only a single optimal root is randomly chosen to create the rooted gene tree and perform the reconciliation. Read More
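
As a rough illustration of the rooting step described in this abstract, the following Python sketch collects every rooting that attains the minimum reconciliation cost and then picks one of them at random, which is the practice the abstract questions. The edge labels and cost values are hypothetical, and `optimal_roots` is an illustrative helper, not part of any published DTL reconciliation tool; computing the costs themselves is assumed to happen elsewhere.

```python
import random


def optimal_roots(cost_by_rooting):
    """Return, in sorted order, every rooting whose cost equals the minimum."""
    best = min(cost_by_rooting.values())
    return sorted(r for r, c in cost_by_rooting.items() if c == best)


# Hypothetical reconciliation costs for five candidate rootings (edges)
# of a single unrooted gene tree; these numbers are made up.
costs = {"e1": 12, "e2": 9, "e3": 9, "e4": 14, "e5": 9}

# All rootings tied at the minimum cost.
ties = optimal_roots(costs)

# Common practice: keep only one optimal root, chosen at random.
# A seeded generator makes the choice repeatable for this example.
chosen = random.Random(0).choice(ties)

print("optimal roots:", ties)
print("randomly chosen root:", chosen)
```

With ties like these, the reported reconciliation can depend on which of the tied roots happens to be drawn.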

Source
Publisher Site: https://bmcbioinformatics.biomedcentral.com/articles/10.1186
DOI Listing: http://dx.doi.org/10.1186/s12859-018-2269-0
August 2018

ConnectedAlign: a PPI network alignment method for identifying conserved protein complexes across multiple species.

BMC Bioinformatics 2018 Aug 13;19(Suppl 9):286. Epub 2018 Aug 13.

School of Information Science and Engineering, Central South University, Changsha, 410083, China.

Background: In bioinformatics, network alignment algorithms have been applied to protein-protein interaction (PPI) networks to discover evolutionarily conserved substructures at the system level. However, most previous methods aim to maximize the similarity of aligned proteins in pairwise networks while paying little attention to the connectivity of these substructures, such as protein complexes.

Results: In this paper, we identify the problem of finding conserved protein complexes, which requires the aligned proteins in a PPI network to form a connected subnetwork. Read More

Source
DOI Listing: http://dx.doi.org/10.1186/s12859-018-2271-6

Measuring phylogenetic signal between categorical traits and phylogenies.

Bioinformatics 2018 Oct 25. Epub 2018 Oct 25.

CIIMAR/CIMAR, Interdisciplinary Centre of Marine and Environmental Research, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos s/n, Matosinhos, Portugal.

Motivation: Determining whether a trait and phylogeny share some degree of phylogenetic signal is a flagship goal in evolutionary biology. Signatures of phylogenetic signal can assist the resolution of a broad range of evolutionary questions regarding the tempo and mode of phenotypic evolution. However, despite the considerable number of strategies to measure it, few and limited approaches exist for categorical traits. Read More

Source
Publisher Site: https://academic.oup.com/bioinformatics/advance-article/doi/
DOI Listing: http://dx.doi.org/10.1093/bioinformatics/bty800
October 2018

PoSE: visualization of patterns of sequence evolution using PAML and MATLAB.

BMC Bioinformatics 2018 Oct 22;19(Suppl 11):364. Epub 2018 Oct 22.

Polio and Picornavirus Laboratory Branch, G-10, Division of Viral Diseases, National Center for Immunization and Respiratory Diseases, Centers for Disease Control and Prevention, 1600 Clifton Rd., N.E, Atlanta, GA, 30329, USA.

Background: Determining patterns of nucleotide and amino acid substitution is the first step during sequence evolution analysis. However, it is not easy to visualize the different phylogenetic signatures imprinted in aligned nucleotide and amino acid sequences.

Results: Here we present PoSE (Pattern of Sequence Evolution), a reliable resource for unveiling the evolutionary history of sequence alignments and for graphically displaying their contents. Read More

Source
DOI Listing: http://dx.doi.org/10.1186/s12859-018-2335-7
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6196406
October 2018

outbreaker2: a modular platform for outbreak reconstruction.

BMC Bioinformatics 2018 Oct 22;19(Suppl 11):363. Epub 2018 Oct 22.

MRC Centre for Outbreak Analysis and Modelling, Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, London, UK.

Background: Reconstructing individual transmission events in an infectious disease outbreak can provide valuable information and help inform infection control policy. Recent years have seen considerable progress in the development of methodologies for reconstructing transmission chains using both epidemiological and genetic data. However, only a few of these methods have been implemented in software packages, and with little consideration for customisability and interoperability. Read More

Source
DOI Listing: http://dx.doi.org/10.1186/s12859-018-2330-z
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6196407
October 2018

MetaSMC: A Coalescent-based Shotgun Sequence Simulator for Evolving Microbial Populations.

Bioinformatics 2018 Oct 13. Epub 2018 Oct 13.

Institute of Statistics, National Tsing-Hua University, Hsinchu, Taiwan.

Motivation: High-throughput sequencing technology has revolutionized the study of metagenomics and cancer evolution. In a relatively simple environment, metagenomic sequencing data are dominated by a few species. By analyzing the alignment of reads from microbial species, single nucleotide polymorphisms can be discovered and the evolutionary history of the populations can be reconstructed. Read More

Source
Publisher Site: https://academic.oup.com/bioinformatics/advance-article/doi/
DOI Listing: http://dx.doi.org/10.1093/bioinformatics/bty840
October 2018

The EVcouplings Python framework for coevolutionary sequence analysis.

Bioinformatics 2018 Oct 9. Epub 2018 Oct 9.

Department of Systems Biology, Harvard Medical School, Boston, MA, USA.

Summary: Coevolutionary sequence analysis has become a commonly used technique for de novo prediction of the structure and function of proteins, RNA, and protein complexes. We present the EVcouplings framework, a fully integrated open-source application and Python package for coevolutionary analysis. The framework enables generation of sequence alignments, calculation and evaluation of evolutionary couplings (ECs), and de novo prediction of structure and mutation effects. Read More

Source
Publisher Site: https://academic.oup.com/bioinformatics/advance-article/doi/
DOI Listing: http://dx.doi.org/10.1093/bioinformatics/bty862
October 2018

A new method bridging graph theory and residue co-evolutionary networks for specificity determinant positions detection.

Bioinformatics 2018 Oct 8. Epub 2018 Oct 8.

Departamento de Bioquímica e Imunologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais. Av. Presidente Antônio Carlos, 6627 - Pampulha, Belo Horizonte - MG, Brazil.

Motivation: Computational studies of molecular evolution are usually performed on a multiple alignment of homologous sequences, in which sequences descending from a common ancestor are aligned so that equivalent residues are placed in the same position. Residue frequency patterns of a full alignment or of a subset of its sequences can be highly useful for suggesting positions under selection. Most methods for mapping co-evolving or specificity-determinant sites focus on positions; however, they do not consider residues that are specificity determinants in one subclass but variable in others. Read More

Source
DOI Listing: http://dx.doi.org/10.1093/bioinformatics/bty846
October 2018

De Novo pattern discovery enables robust assessment of functional consequences of noncoding variants.

Bioinformatics 2018 Sep 26. Epub 2018 Sep 26.

Department of Molecular Physiology & Biophysics, Vanderbilt University, Nashville, Tennessee, United States of America.

Motivation: Given the complexity of genome regions, prioritizing the functional effects of noncoding variants remains a challenge. Although several frameworks have been proposed for evaluating the functionality of noncoding variants, most of them use "black box" methods that reduce the task to a pathogenic/benign classification problem, which ignores the distinct regulatory mechanisms of variants and leads to less desirable performance. In this study, we developed DVAR, an unsupervised framework that leverages various biochemical and evolutionary evidence to distinguish the gene regulatory categories of variants and assess their comprehensive functional impact simultaneously. Read More

Source
DOI Listing: http://dx.doi.org/10.1093/bioinformatics/bty826
September 2018

Phage spanins: diversity, topological dynamics and gene convergence.

BMC Bioinformatics 2018 Sep 15;19(1):326. Epub 2018 Sep 15.

Center for Phage Technology, Department of Biochemistry and Biophysics, Texas A&M University, 2128 TAMU, College Station, TX, 77843-2128, USA.

Background: Spanins are phage lysis proteins required to disrupt the outer membrane. Phages employ either two-component spanins or unimolecular spanins in this final step of Gram-negative host lysis. Two-component spanins like Rz-Rz1 from phage lambda consist of an integral inner membrane protein (i-spanin) and an outer membrane lipoprotein (o-spanin) that form a complex spanning the periplasm. Read More

Source
Publisher Site: https://bmcbioinformatics.biomedcentral.com/articles/10.1186
DOI Listing: http://dx.doi.org/10.1186/s12859-018-2342-8
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6139136
September 2018

Enhancer reprogramming in mammalian genomes.

BMC Bioinformatics 2018 Sep 10;19(1):316. Epub 2018 Sep 10.

Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD, 20894, USA.

Background: Transcription factor binding site (TFBS) loss, gain, and reshuffling within the sequence of a regulatory element could alter the function of that regulatory element. Some of the changes will be detrimental to the fitness of the species and will result in gradual removal from a population, while other changes will be either beneficial or just a part of genetic drift and end up being fixed in a population. This "reprogramming" of regulatory elements results in modification of the gene regulatory landscape during evolution. Read More

Source
Publisher Site: https://bmcbioinformatics.biomedcentral.com/articles/10.1186
DOI Listing: http://dx.doi.org/10.1186/s12859-018-2343-7
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6131754
September 2018