1,575 results match your criteria Evolutionary Bioinformatics [Journal]


Detailed prediction of protein sub-nuclear localization.

BMC Bioinformatics 2019 Apr 23;20(1):205. Epub 2019 Apr 23.

Department of Informatics, Bioinformatics & Computational Biology - i12, TUM (Technical University of Munich), Boltzmannstr. 3, 85748, Garching/Munich, Germany.

Background: Sub-nuclear structures or locations are associated with various nuclear processes. Proteins localized in these substructures are important for understanding the interior nuclear mechanisms. Despite advances in high-throughput methods, experimental protein annotations remain limited.

DOI: http://dx.doi.org/10.1186/s12859-019-2790-9

FGMP: assessing fungal genome completeness.

BMC Bioinformatics 2019 Apr 15;20(1):184. Epub 2019 Apr 15.

Department of Microbiology & Plant Pathology and Institute for Integrative Genome Biology, University of California-Riverside, Riverside, CA, 92521, USA.

Background: Inexpensive high-throughput DNA sequencing has democratized access to genetic information for most organisms, so that research utilizing a genome or transcriptome of an organism is not limited to model systems. However, the quality of the assemblies of sampled genomes can vary greatly, which hampers their utility for comparisons and meaningful interpretation. Uncertainty about the completeness of a given genome sequence can limit the feasibility of asserting the patterns of high rates of gene loss reported in many lineages.

Publisher site: https://bmcbioinformatics.biomedcentral.com/articles/10.1186
DOI: http://dx.doi.org/10.1186/s12859-019-2782-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6466665

IntronDB: a database for eukaryotic intron features.

Authors:
Dapeng Wang

Bioinformatics 2019 Apr 5. Epub 2019 Apr 5.

Department of Plant Sciences, University of Oxford, S Parks Rd, Oxford OX1 3RB, UK.

Summary: The rate and extent of unbalanced eukaryotic intron changes exhibit dynamic patterns for different lineages of species or certain functional groups of genes with varied spatio-temporal expression modes affected by selective pressure. To date, only a few key conserved splicing signals or regulatory elements have been identified in introns and little is known about the remaining intronic regions. To trace the evolutionary trajectory of spliceosomal introns from available genomes under a unified framework, we present IntronDB, which catalogues approximately 50,000,000 introns from over 1,000 genomes spanning the major eukaryotic clades in the tree of life.

Publisher site: https://academic.oup.com/bioinformatics/advance-article/doi/
DOI: http://dx.doi.org/10.1093/bioinformatics/btz242

Assessing reproducibility of matrix factorization methods in independent transcriptomes.

Bioinformatics 2019 Apr 2. Epub 2019 Apr 2.

Institut Curie, PSL Research University, Paris, France.

Motivation: Matrix factorization (MF) methods are widely used to reduce the dimensionality of transcriptomic datasets to the action of a few hidden factors (metagenes). MF algorithms have never been compared based on the between-dataset reproducibility of their outputs in similar independent datasets. Lack of this knowledge might have a crucial impact when generalizing the predictions made in one study to others.

DOI: http://dx.doi.org/10.1093/bioinformatics/btz225

Integration of network models and evolutionary analysis into high-throughput modeling of protein dynamics and allosteric regulation: theory, tools and applications.

Brief Bioinform 2019 Mar 21. Epub 2019 Mar 21.

School of Biology and Basic Medical Sciences, Soochow University, Suzhou, China.

Proteins are dynamical entities that undergo a plethora of conformational changes, accomplishing their biological functions. Molecular dynamics simulation and normal mode analysis methods have become the gold standard for studying protein dynamics, analyzing molecular mechanism and allosteric regulation of biological systems. The enormous amount of the ensemble-based experimental and computational data on protein structure and dynamics has presented a major challenge for the high-throughput modeling of protein regulation and molecular mechanisms.

DOI: http://dx.doi.org/10.1093/bib/bbz029

ASTRAL-MP: scaling ASTRAL to very large datasets using randomization and parallelization.

Bioinformatics 2019 Mar 23. Epub 2019 Mar 23.

Department of Electrical and Computer Engineering, University of California, San Diego, USA.

Motivation: Evolutionary histories can change from one part of the genome to another. The potential for discordance between the gene trees has motivated the development of summary methods that reconstruct a species tree from an input collection of gene trees. ASTRAL is a widely used summary method and has been able to scale to relatively large datasets.

Publisher site: https://academic.oup.com/bioinformatics/advance-article/doi/
DOI: http://dx.doi.org/10.1093/bioinformatics/btz211

SSS-test: a novel test for detecting positive selection on RNA secondary structure.

BMC Bioinformatics 2019 Mar 21;20(1):151. Epub 2019 Mar 21.

Human Biology Group, Institute for Biology, Department of Biology, Chemistry, Pharmacy, Freie Universitaet Berlin, Königin-Luise-Straße 1-3, Berlin, 14195, Germany.

Background: Long non-coding RNAs (lncRNAs) play an important role in regulating gene expression and are thus important for determining phenotypes. Most attempts to measure selection in lncRNAs have focused on the primary sequence. The majority of small RNAs and at least some parts of lncRNAs must fold into specific structures to perform their biological function.

DOI: http://dx.doi.org/10.1186/s12859-019-2711-y
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6429701

A likelihood ratio test for changes in homeolog expression bias.

BMC Bioinformatics 2019 Mar 20;20(1):149. Epub 2019 Mar 20.

Department of Biology, The College of William & Mary, Williamsburg, 23187, VA, USA.

Background: Gene duplications are a major source of raw material for evolution and a likely contributor to the diversity of life on earth. Duplicate genes (i.e. ...

DOI: http://dx.doi.org/10.1186/s12859-019-2709-5
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6427896

phylostratr: A framework for phylostratigraphy.

Bioinformatics 2019 Mar 14. Epub 2019 Mar 14.

Bioinformatics and Computational Biology Program, Iowa State University, Ames, USA.

Motivation: The goal of phylostratigraphy is to infer the evolutionary origin of each gene in an organism. This is done by searching for homologs within increasingly broad clades. The deepest clade that contains a homolog of the protein(s) encoded by a gene is that gene's phylostratum.

DOI: http://dx.doi.org/10.1093/bioinformatics/btz171
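
The phylostratum assignment described in the phylostratr abstract can be illustrated with a minimal Python sketch. This is not the phylostratr API (phylostratr is an R package); the function name, the toy lineage, and the species labels are all assumptions made for illustration. Each searched species is tagged with the narrowest clade it shares with the focal species, and the gene's phylostratum is the broadest such clade among the species where a homolog was found.

```python
from typing import Dict, List, Set


def assign_phylostratum(lineage: List[str],
                        shared_clade: Dict[str, str],
                        hits: Set[str]) -> str:
    """Return the broadest clade of the focal lineage in which a homolog was found.

    lineage      -- clades of the focal species, narrowest first (hypothetical input)
    shared_clade -- for each searched species, the narrowest clade it shares
                    with the focal species
    hits         -- searched species in which a homolog was detected
    """
    # With no hits elsewhere, the gene is assigned to the focal species itself.
    deepest = 0
    for species in hits:
        deepest = max(deepest, lineage.index(shared_clade[species]))
    return lineage[deepest]


# Toy example; the lineage and species set are illustrative only.
lineage = ["Homo sapiens", "Primates", "Mammalia", "Vertebrata", "Eukaryota"]
shared_clade = {
    "chimpanzee": "Primates",
    "mouse": "Mammalia",
    "zebrafish": "Vertebrata",
    "yeast": "Eukaryota",
}

print(assign_phylostratum(lineage, shared_clade, {"chimpanzee", "mouse"}))
```

In practice the set of hits would come from a homology search with a significance cutoff, which this sketch leaves out.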

SodaPop: A Forward Simulation Suite for the Evolutionary Dynamics of Asexual Populations on Protein Fitness Landscapes.

Bioinformatics 2019 Mar 13. Epub 2019 Mar 13.

Département de Biochimie, Université de Montréal, Montréal, Québec, Canada.

Motivation: Protein evolution is determined by forces at multiple levels of biological organization. Random mutations have an immediate effect on the biophysical properties, structure and function of proteins. These same mutations also affect the fitness of the organism.

DOI: http://dx.doi.org/10.1093/bioinformatics/btz175

Multispecies genome-wide analysis defines the MAP3K gene family in Gossypium hirsutum and reveals conserved family expansions.

BMC Bioinformatics 2019 Mar 14;20(Suppl 2):99. Epub 2019 Mar 14.

Institute for Genomics, Biocomputing and Bioengineering, Mississippi State University, Mississippi State, MS, 39762, USA.

Background: Gene families are sets of structurally and evolutionarily related genes - in one or multiple species - that typically share a conserved biological function. As such, the identification and subsequent analyses of entire gene families are widely employed in the fields of evolutionary and functional genomics of both well-established and newly sequenced plant genomes. Currently, plant gene families are typically identified in one of two major ways: 1) HMM-profile based searches using models built on Arabidopsis thaliana genes or 2) coding sequence homology searches using curated databases.

DOI: http://dx.doi.org/10.1186/s12859-019-2624-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6419318

GToTree: a user-friendly workflow for phylogenomics.

Authors:
Michael D Lee

Bioinformatics 2019 Mar 13. Epub 2019 Mar 13.

Exobiology Branch, NASA Ames Research Center, Moffett Field, CA, USA.

Summary: Genome-level evolutionary inference (i.e., phylogenomics) is becoming an increasingly essential step in many biologists' work.

DOI: http://dx.doi.org/10.1093/bioinformatics/btz188

Choice of species affects phylogenetic stability of deep nodes: an empirical example in Terrabacteria.

Bioinformatics 2019 Feb 19. Epub 2019 Feb 19.

Department of Biological Sciences, Oakland University, Rochester, MI, USA.

Motivation: The promise of higher phylogenetic stability through increased dataset sizes within tree of life (TOL) reconstructions has not been fulfilled. Among the many possible causes are changes in species composition (taxon sampling) that could influence phylogenetic accuracy of the methods by altering the relative weight of the evolutionary histories of each individual species. This effect would be stronger in clades that are represented by few lineages, which is common in many prokaryote phyla.

DOI: http://dx.doi.org/10.1093/bioinformatics/btz121

A Gaussian process model and Bayesian variable selection for mapping function-valued quantitative traits with incomplete phenotypic data.

Bioinformatics 2019 Mar 8. Epub 2019 Mar 8.

Department of Mathematical Sciences, Biocenter Oulu and Infotech Oulu, University of Oulu, Oulu, FI-90014, Finland.

Motivation: Recent advances in high-dimensional phenotyping bring time as an extra dimension into the phenotypes. This promotes quantitative trait locus (QTL) studies of function-valued traits such as those related to growth and development. Existing approaches for analyzing functional traits utilize either parametric methods or semi-parametric approaches based on splines and wavelets.

DOI: http://dx.doi.org/10.1093/bioinformatics/btz164

The influence of different types of translational inaccuracies on the genetic code structure.

BMC Bioinformatics 2019 Mar 6;20(1):114. Epub 2019 Mar 6.

Department of Genomics, University of Wrocław, ul. Joliot-Curie 14a, Wrocław, 50-383, Poland.

Background: The standard genetic code is a recipe for unambiguously assigning 21 labels, i.e. amino acids and the stop translation signal, to 64 codons.

DOI: http://dx.doi.org/10.1186/s12859-019-2661-4
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6404327
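
The 21-labels-to-64-codons structure mentioned in this abstract can be checked directly. The sketch below builds the standard genetic code (NCBI translation table 1) as a Python dictionary; the variable names are illustrative, and the code is independent of the analysis in the paper.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon, with codons enumerated by
# iterating BASES over the first, second and third position; '*' marks stop.
LABELS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map every codon (e.g. "ATG") to its amino acid or stop label.
CODON_TABLE = {
    "".join(codon): label
    for codon, label in zip(product(BASES, repeat=3), LABELS)
}

# Number of codons assigned to each label (degeneracy of the code).
DEGENERACY = Counter(CODON_TABLE.values())

print(len(CODON_TABLE), "codons,", len(DEGENERACY), "labels")
```

The `DEGENERACY` counter makes the redundancy of the code explicit: for example, leucine is encoded by six codons and tryptophan by one.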

VHost-Classifier: Virus-Host Classification using natural language processing.

Bioinformatics 2019 Mar 1. Epub 2019 Mar 1.

Department of Microbiology and Immunology, University of British Columbia, Vancouver, BC, Canada.

Motivation: When analysing viral metagenomic sequences, it is often desired to filter the results of a BLAST analysis by the host species of the virus. VHost-Classifier automates this procedure using a natural language processing algorithm written in Python 3, which takes a list of taxonomic identifiers (taxids) returned from a BLAST query using viral sequences as input. The taxid output is binned by the evolutionary lineage of their host, based on string matching the words in their English names.

DOI: http://dx.doi.org/10.1093/bioinformatics/btz151
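
The general idea of binning viruses by host through word matching on their English names can be sketched as follows. This is not VHost-Classifier's implementation: the keyword table, the function name, and the integer identifiers (placeholders, not real NCBI taxids) are assumptions made purely for illustration.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Hypothetical keyword table: host group -> words that suggest that host.
HOST_KEYWORDS = {
    "bacteria": {"phage", "bacteriophage"},
    "plants": {"mosaic", "tobacco", "tomato"},
    "vertebrates": {"human", "bovine", "avian"},
}


def bin_by_host(names_by_id: Dict[int, str]) -> Dict[str, List[int]]:
    """Group identifiers by the first host group whose keywords match the name."""
    bins = defaultdict(list)
    for ident, name in names_by_id.items():
        # Split the name into lower-case alphabetic words.
        words = set(re.findall(r"[a-z]+", name.lower()))
        host = next(
            (group for group, keywords in HOST_KEYWORDS.items() if words & keywords),
            "unclassified",
        )
        bins[host].append(ident)
    return dict(bins)


# Placeholder identifiers paired with virus names.
example = {
    1: "Escherichia phage T4",
    2: "Tobacco mosaic virus",
    3: "Human coronavirus 229E",
    4: "Unnamed virus isolate",
}
print(bin_by_host(example))
```

A real classifier would need a far larger vocabulary and a way to resolve names that match several host groups; this sketch simply takes the first match.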

Toward perfect reads: self-correction of short reads via mapping on de Bruijn graphs.

Bioinformatics 2019 Feb 20. Epub 2019 Feb 20.

Univ Rennes, Inria, CNRS, IRISA, Rennes, France.

Motivation: Short-read accuracy is important for downstream analyses such as genome assembly and hybrid long-read correction. Despite much work on short-read correction, present-day correctors either do not scale well on large data sets or consider reads as mere suites of k-mers, without taking into account their full-length read information. Results: We propose a new method to correct short reads using de Bruijn graphs, and implement it as a tool called Bcool.

DOI: http://dx.doi.org/10.1093/bioinformatics/btz102
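
For readers unfamiliar with the terms in this abstract, the sketch below shows how a read decomposes into k-mers and how those k-mers define the edges of a de Bruijn graph. It is a textbook construction only, not Bcool's correction method; the function names are illustrative.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def kmers(read: str, k: int) -> List[str]:
    """Return all overlapping substrings of length k, in read order."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def de_bruijn_edges(reads: Iterable[str], k: int) -> Dict[str, Set[str]]:
    """Map each (k-1)-mer node to the set of (k-1)-mer nodes that follow it."""
    graph = defaultdict(set)
    for read in reads:
        for kmer in kmers(read, k):
            # Each k-mer is an edge from its prefix to its suffix.
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


print(kmers("ACGTA", 3))
print(de_bruijn_edges(["ACGTA", "CGTT"], 3))
```

Treating a read only as this list of k-mers discards the information that the k-mers occur together and in order, which is the limitation the abstract points to.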

Dhaka: Variational Autoencoder for Unmasking Tumor Heterogeneity from Single Cell Genomic Data.

Bioinformatics 2019 Feb 15. Epub 2019 Feb 15.

Microsoft Research, Redmond, USA.

Motivation: Intra-tumor heterogeneity is one of the key confounding factors in deciphering tumor evolution. Malignant cells exhibit variations in their gene expression, copy numbers, and mutation even when originating from a single progenitor cell. Single cell sequencing of tumor cells has recently emerged as a viable option for unmasking the underlying tumor heterogeneity.

DOI: http://dx.doi.org/10.1093/bioinformatics/btz095

MultiDomainBenchmark: a multi-domain query and subject database suite.

BMC Bioinformatics 2019 Feb 14;20(1):77. Epub 2019 Feb 14.

National Center for Biotechnology Information, Bethesda, National Institutes of Health, 8600 Rockville Pike, Bethesda, 20894, MD, USA.

Background: Genetic sequence database retrieval benchmarks play an essential role in evaluating the performance of sequence searching tools. To date, all phylogenetically diverse benchmarks known to the authors include only query sequences with single protein domains. Domains are the primary building blocks of protein structure and function.

Publisher site: https://bmcbioinformatics.biomedcentral.com/articles/10.1186
DOI: http://dx.doi.org/10.1186/s12859-019-2660-5
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6376684

Estimation of duplication history under a stochastic model for tandem repeats.

BMC Bioinformatics 2019 Feb 6;20(1):64. Epub 2019 Feb 6.

Department of Electrical Engineering, California Institute of Technology, Pasadena, USA.

Background: Tandem repeat sequences are common in the genomes of many organisms and are known to cause important phenomena such as gene silencing and rapid morphological changes. Due to the presence of multiple copies of the same pattern in tandem repeats and their high variability, they contain a wealth of information about the mutations that have led to their formation. The ability to extract this information can enhance our understanding of evolutionary mechanisms.

DOI: http://dx.doi.org/10.1186/s12859-019-2603-1
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6364452

Incorporating alignment uncertainty into Felsenstein's phylogenetic bootstrap to improve its reliability.

Bioinformatics 2019 Feb 6. Epub 2019 Feb 6.

Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain.

Motivation: Most evolutionary analyses are based on a pre-estimated multiple sequence alignment. Wong et al. established the existence of an uncertainty induced by multiple sequence alignment when reconstructing phylogenies.

DOI: http://dx.doi.org/10.1093/bioinformatics/btz082

Understanding the evolutionary trend of intrinsically structural disorders in cancer relevant proteins as probed by Shannon entropy scoring and structure network analysis.

BMC Bioinformatics 2019 Feb 4;19(Suppl 13):549. Epub 2019 Feb 4.

Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, Massachusetts, 02138, USA.

Background: Malignant diseases have become a threat to health care systems. A panoply of biological processes is involved in causing these diseases. In order to unveil the mechanistic details of these diseased states, we analyzed protein families relevant to these diseases.

DOI: http://dx.doi.org/10.1186/s12859-018-2552-0

Reviewer-coerced citation: Case report, update on journal policy, and suggestions for future prevention.

Bioinformatics 2019 Jan 30. Epub 2019 Jan 30.

Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany.

DOI: http://dx.doi.org/10.1093/bioinformatics/btz071

s-dePooler: determination of polymorphism carriers from overlapping DNA pools.

BMC Bioinformatics 2019 Jan 22;20(1):45. Epub 2019 Jan 22.

Research Department of Non-Coronary Heart Diseases, Almazov National Medical Research Center, Ministry of Health of Russia, 2 Akkuratova St., St. Petersburg, 197341, Russia.

Background: Sample pooling is a method widely used in studies to reduce costs and labour. DNA sample pooling combined with massively parallel sequencing is a powerful tool for discovering DNA variants (polymorphisms) in large populations under analysis, which is the basis of research fields such as Genome-Wide Association Studies, evolutionary and population studies, etc. Using overlapping pools, where each sample is present in multiple pools, can enhance the accuracy of polymorphism detection and allow carriers of rare variants to be identified.

DOI: http://dx.doi.org/10.1186/s12859-019-2616-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6343301
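
The reason overlapping pools allow carriers to be identified can be shown with a small sketch. It is not the s-dePooler algorithm: the function name and the 2x2 row/column design are assumptions, and the logic assumes error-free variant detection. A sample is a candidate carrier only if every pool that contains it tested positive for the variant.

```python
from typing import Dict, Set


def candidate_carriers(pools: Dict[str, Set[str]],
                       positive_pools: Set[str]) -> Set[str]:
    """Return samples whose pools all tested positive for the variant.

    pools          -- pool id -> sample ids mixed into that pool
    positive_pools -- ids of pools in which the variant was detected
    """
    samples = set().union(*pools.values())
    carriers = set()
    for sample in samples:
        containing = {pid for pid, members in pools.items() if sample in members}
        # A carrier makes every pool it belongs to positive.
        if containing and containing <= positive_pools:
            carriers.add(sample)
    return carriers


# Hypothetical 2x2 design: each sample sits in one row pool and one column pool.
pools = {
    "row1": {"a", "b"},
    "row2": {"c", "d"},
    "col1": {"a", "c"},
    "col2": {"b", "d"},
}
print(candidate_carriers(pools, {"row1", "col2"}))
```

With a single carrier the intersection of positive pools pinpoints it; with several carriers of the same variant the candidate set can contain non-carriers, which is why real designs and decoders are more elaborate.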

Protein Fold Recognition based on Multi-view Modeling.

Bioinformatics 2019 Jan 21. Epub 2019 Jan 21.

School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong, China.

Motivation: Protein fold recognition has attracted increasing attention because it is critical for studies of the 3D structures of proteins and drug design. Researchers have been extensively studying this important task, and several features with high discriminative power have been proposed. However, the development of methods that efficiently combine these features to improve the predictive performance remains a challenging problem.

DOI: http://dx.doi.org/10.1093/bioinformatics/btz040

admixr - R package for reproducible analyses using ADMIXTOOLS.

Bioinformatics 2019 Jan 22. Epub 2019 Jan 22.

Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany.

Summary: We present a new R package admixr, which provides a convenient interface for performing reproducible population genetic analyses (f3, D, f4, f4-ratio, qpWave and qpAdm), as implemented by command-line programs in the ADMIXTOOLS software suite. In a traditional ADMIXTOOLS workflow, the user must first generate a set of text configuration files tailored to each individual analysis, often using a combination of shell scripting and manual text editing. The non-tabular output files then need to be parsed to extract values of interest prior to further analyses.

DOI: http://dx.doi.org/10.1093/bioinformatics/btz030

PSiTE: a Phylogeny guided Simulator for Tumor Evolution.

Bioinformatics 2019 Jan 14. Epub 2019 Jan 14.

Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, P.R.China.

Summary: Simulating realistic clonal dynamics of tumors is an important topic in cancer genomics. Here, we present PSiTE (Phylogeny guided Simulator for Tumor Evolution), a tool that can simulate different types of tumor samples including single sector, multi-sector bulk tumor as well as single-cell tumor data under a wide range of evolutionary trajectories. PSiTE provides an efficient tool for understanding clonal evolution of cancer.

DOI: http://dx.doi.org/10.1093/bioinformatics/btz028

Characterization and identification of long non-coding RNAs based on feature relationship.

Bioinformatics 2019 Jan 12. Epub 2019 Jan 12.

CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China.

Motivation: The significance of long non-coding RNAs (lncRNAs) in many biological processes and diseases has attracted intense interest over the past several years. However, computational identification of lncRNAs in a wide range of species remains challenging; it requires prior knowledge of well-established sequences and annotations or species-specific training data, but in reality only a limited number of species have high-quality sequences and annotations.

Results: Here we first characterize lncRNAs in contrast to protein-coding RNAs based on feature relationship and find that the feature relationship between ORF (open reading frame) length and GC content presents universally substantial divergence in lncRNAs and protein-coding RNAs, as observed in a broad variety of species.

Publisher site: https://academic.oup.com/bioinformatics/advance-article/doi/
DOI: http://dx.doi.org/10.1093/bioinformatics/btz008
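
The two features named in the Results, ORF length and GC content, can each be computed from a transcript sequence in a few lines. The sketch below is an illustration under simplifying assumptions (forward strand only, ORFs defined as ATG to the first in-frame stop, stop codon included in the length); it is not the feature extraction used in the paper, and the function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def longest_orf_length(seq: str) -> int:
    """Length in nucleotides of the longest forward-strand ORF.

    An ORF here runs from an ATG to the first in-frame stop codon,
    with the stop codon counted in the length.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA input
    best = 0
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon until the first in-frame stop.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                best = max(best, pos + 3 - start)
                break
    return best


transcript = "CCATGAAATAGGG"
print(longest_orf_length(transcript), gc_content(transcript))
```

Plotting these two values against each other for annotated coding and non-coding transcripts is one way to inspect the kind of feature relationship the abstract describes.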

AQUAPONY: visualization and interpretation of phylogeographic information on phylogenetic trees.

Bioinformatics 2019 Jan 14. Epub 2019 Jan 14.

LIRMM, UMR 5506, CNRS and Université Montpellier, Montpellier, France.

Motivation: The visualization and interpretation of evolutionary spatiotemporal scenarios is broadly and increasingly used in infectious disease research, ecology, or agronomy. Using probabilistic frameworks, well-known tools can infer from molecular data ancestral traits for internal nodes in a phylogeny, and numerous phylogenetic rendering tools can display such evolutionary trees. However, visualizing such ancestral information and its uncertainty on the tree remains tedious.

DOI: http://dx.doi.org/10.1093/bioinformatics/btz011

Simulation of Heterogeneous Tumour Genomes with HeteroGenesis and In Silico Whole Exome Sequencing.

Bioinformatics 2019 Jan 4. Epub 2019 Jan 4.

Leeds Institute of Medical Research at St James's, St James's University Hospital, Leeds, UK.

Summary: Tumour evolution results in progressive cancer phenotypes such as metastatic spread and treatment resistance. To better treat cancers, we must characterise tumour evolution and the genetic events that confer progressive phenotypes. This is facilitated by high coverage genome or exome sequencing.

DOI: http://dx.doi.org/10.1093/bioinformatics/bty1063

Single-cell RNA-seq Interpretations using Evolutionary Multiobjective Ensemble Pruning.

Bioinformatics 2018 Dec 28. Epub 2018 Dec 28.

Department of Computer Science, City University of Hong Kong, Hong Kong SAR.

Motivation: In recent years, single-cell RNA sequencing has enabled us to discover cell types or even subtypes. Its increasing availability provides opportunities to identify cell populations from single-cell RNA-seq data. Computational methods have been employed to reveal the gene expression variations among multiple cell populations.

DOI: http://dx.doi.org/10.1093/bioinformatics/bty1056

Homeolog expression quantification methods for allopolyploids.

Brief Bioinform 2018 Dec 27. Epub 2018 Dec 27.

Artificial Intelligence Research Center, AIST, 2-3-26 Aomi, Koto-ku, Tokyo 135-0064, Japan.

Genome duplication with hybridization, or allopolyploidization, occurs in animals, fungi and plants, and is especially common in crop plants. There is an increasing interest in the study of allopolyploids because of advances in polyploid genome assembly; however, the high level of sequence similarity in duplicated gene copies (homeologs) poses many challenges. Here we compared standard RNA-seq expression quantification approaches currently used for diploid species against subgenome-classification approaches, which map reads to each subgenome separately.

Publisher site: https://academic.oup.com/bib/advance-article/doi/10.1093/bib
DOI: http://dx.doi.org/10.1093/bib/bby121

Ancestral sequence reconstruction: accounting for structural information by averaging over replacement matrices.

Bioinformatics 2018 Dec 24. Epub 2018 Dec 24.

School of Molecular Cell Biology and Biotechnology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Ramat Aviv, Israel.

Motivation: Ancestral sequence reconstruction (ASR) is widely used to understand protein evolution, structure and function. Current ASR methodologies do not fully consider differences in evolutionary constraints among positions imposed by the three-dimensional (3D) structure of the protein. Here we developed an ASR algorithm that allows different protein sites to evolve according to different mixtures of replacement matrices.

DOI: http://dx.doi.org/10.1093/bioinformatics/bty1031

Degeneracy and genetic assimilation in RNA evolution.

BMC Bioinformatics 2018 Dec 27;19(1):543. Epub 2018 Dec 27.

University of Virginia Biocomplexity Institute, 995 Research Park Boulevard, Charlottesville, 22911, USA.

Background: The neutral theory of Motoo Kimura stipulates that evolution is mostly driven by neutral mutations. However, adaptive pressure eventually leads to changes in phenotype that involve non-neutral mutations. The relation between neutrality and adaptation has been studied in the context of RNA before, and here we further study transitional mutations in the context of degenerate (plastic) RNA sequences and genetic assimilation.

Publisher site: https://bmcbioinformatics.biomedcentral.com/articles/10.1186
DOI: http://dx.doi.org/10.1186/s12859-018-2497-3
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6307299

GLUE: a flexible software system for virus sequence data.

BMC Bioinformatics 2018 Dec 18;19(1):532. Epub 2018 Dec 18.

MRC-University of Glasgow Centre for Virus Research, Glasgow, Scotland, UK.

Background: Virus genome sequences, generated in ever-higher volumes, can provide new scientific insights and inform our responses to epidemics and outbreaks. To facilitate interpretation, such data must be organised and processed within scalable computing resources that encapsulate virology expertise. GLUE (Genes Linked by Underlying Evolution) is a data-centric bioinformatics environment for building such resources.

DOI: http://dx.doi.org/10.1186/s12859-018-2459-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6299651

ModL: exploring and restoring regularity when testing for positive selection.

Bioinformatics 2018 Dec 12. Epub 2018 Dec 12.

Department of Mathematics and Statistics, Dalhousie University, Halifax, NS, Canada.

Motivation: Likelihood ratio tests are commonly used to test for positive selection acting on proteins. They are usually applied with thresholds for declaring a protein under positive selection determined from a chi-square or mixture of chi-square distributions. While it is known that such distributions are not strictly justified due to the statistical irregularity of the problem, the hope has been that the resulting tests are conservative and do not lose much power in comparison with the same test using the unknown, correct threshold.

DOI: http://dx.doi.org/10.1093/bioinformatics/bty1019
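
The conventional thresholding that this abstract questions can be written out in a few lines. The sketch assumes one degree of freedom and, for the mixture, an equal-weight mix of a point mass at zero and a chi-square with one degree of freedom; both choices are assumptions made for illustration, and the code is not ModL. As the abstract notes, such reference distributions are not strictly justified for this problem.

```python
import math


def lrt_statistic(lnl_null: float, lnl_alt: float) -> float:
    """Likelihood ratio statistic 2 * (lnL_alt - lnL_null), floored at zero."""
    return max(0.0, 2.0 * (lnl_alt - lnl_null))


def p_chi2_1df(stat: float) -> float:
    """Upper-tail probability of a chi-square distribution with 1 degree of freedom."""
    # For 1 df, P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


def p_mixture(stat: float) -> float:
    """Upper-tail probability under a 50:50 mix of a point mass at 0 and chi-square(1)."""
    return 1.0 if stat <= 0.0 else 0.5 * p_chi2_1df(stat)


# Hypothetical log-likelihoods from a null and an alternative model fit.
stat = lrt_statistic(-1000.0, -997.5)
print(stat, p_chi2_1df(stat), p_mixture(stat))
```

For any positive statistic the mixture p-value is half the plain chi-square p-value, so the mixture threshold is the less conservative of the two.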

Increasing the accuracy of protein loop structure prediction with evolutionary constraints.

Bioinformatics 2018 Dec 10. Epub 2018 Dec 10.

Department of Statistics, University of Oxford, Oxford, United Kingdom.

Motivation: Accurate prediction of loop structures remains challenging. This is especially true for long loops, where the large conformational space and limited coverage of experimentally determined structures often lead to low accuracy. Co-evolutionary contact predictors, which provide information about the proximity of pairs of residues, have been used to improve whole-protein models generated through de novo techniques.

DOI: http://dx.doi.org/10.1093/bioinformatics/bty996

A Novel Measure of Non-coding Genome Conservation Identifies Genomic Regulatory Blocks Within Primates.

Bioinformatics 2018 Dec 7. Epub 2018 Dec 7.

Computational Regulatory Genomics Group, MRC London Institute of Medical Sciences, Du Cane Road, London, UK.

Motivation: Clusters of extremely conserved non-coding elements (CNEs) mark genomic regions devoted to cis-regulation of key developmental genes in Metazoa. We have recently shown that their span coincides with that of topologically associating domains (TADs), making them useful for estimating conserved TAD boundaries in the absence of Hi-C data. The standard approach - detecting CNEs in genome alignments and then establishing the boundaries of their clusters - requires tuning of several parameters and breaks down when comparing closely related genomes.

DOI: http://dx.doi.org/10.1093/bioinformatics/bty1014

iHam & pyHam: visualizing and processing hierarchical orthologous groups.

Bioinformatics 2018 Dec 3. Epub 2018 Dec 3.

SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland.

Summary: The evolutionary history of gene families can be complex due to duplications and losses. This complexity is compounded by the large number of genomes simultaneously considered in contemporary comparative genomic analyses. As provided by several orthology databases, hierarchical orthologous groups (HOGs) are sets of genes that are inferred to have descended from a common ancestral gene within a species clade.

Publisher site: https://academic.oup.com/bioinformatics/advance-article/doi/
DOI: http://dx.doi.org/10.1093/bioinformatics/bty994

Multi-omic analysis of signalling factors in inflammatory comorbidities.

BMC Bioinformatics 2018 Nov 30;19(Suppl 15):439. Epub 2018 Nov 30.

Computer Laboratory, University of Cambridge, Cambridge, UK.

Background: Inflammation is a core element of many different, systemic and chronic diseases that usually involve an important autoimmune component. The clinical phase of inflammatory diseases is often the culmination of a long series of pathologic events that started years before. The systemic characteristics and related mechanisms could be investigated through the multi-omic comparative analysis of many inflammatory diseases. Read More

Source
DOI: http://dx.doi.org/10.1186/s12859-018-2413-x
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6266935
November 2018
2 Reads

Multilevel comparative bioinformatics to investigate evolutionary relationships and specificities in gene annotations: an example for tomato and grapevine.

BMC Bioinformatics 2018 Nov 30;19(Suppl 15):435. Epub 2018 Nov 30.

Department of Agriculture, University of Naples "Federico II", Portici, Naples, Italy.

Background: "Omics" approaches may provide useful information for a deeper understanding of speciation events, diversification and function innovation. This can be achieved by investigating the molecular similarities at sequence level between species, allowing the definition of ortholog and paralog genes. However, the spread of sequenced genomes, often endowed with still preliminary annotations, requires suitable bioinformatics to be appropriately exploited in this framework.

Source
DOI: http://dx.doi.org/10.1186/s12859-018-2420-y
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6266932
November 2018
2 Reads

Comprehensive review of the identification of essential genes using computational methods: focusing on feature implementation and assessment.

Brief Bioinform 2018 Nov 29. Epub 2018 Nov 29.

School of Life Science and Technology, Center for Informational Biology, Intelligent Learning Institute for Science and Application, University of Electronic Science and Technology of China, Chengdu, China.

Essential genes have attracted increasing attention in recent years due to the important functions of these genes in organisms. Among the methods used to identify the essential genes, accurate and efficient computational methods can make up for the deficiencies of expensive and time-consuming experimental technologies. In this review, we have collected research on essential gene prediction in prokaryotes and eukaryotes and summarized the five predominant types of features used in these studies.

Source
DOI: http://dx.doi.org/10.1093/bib/bby116
November 2018
1 Read

PhastWeb: a web interface for evolutionary conservation scoring of multiple sequence alignments using phastCons and phyloP.

Bioinformatics 2018 Nov 27. Epub 2018 Nov 27.

Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA.

Summary: The Phylogenetic Analysis with Space/Time models (PHAST) package is a widely used software package for comparative genomics that has been freely available for download since 2002. Here we introduce a web interface (phastWeb) that makes it possible to use two of the most popular programs in PHAST, phastCons and phyloP, without downloading and installing the PHAST software. This interface allows users to upload a sequence alignment and either upload a corresponding phylogeny or have one estimated from the alignment.

Source
Publisher Site: https://academic.oup.com/bioinformatics/advance-article/doi/
DOI: http://dx.doi.org/10.1093/bioinformatics/bty966
November 2018
13 Reads

A statistical method to identify recombination in bacterial genomes based on SNP incompatibility.

BMC Bioinformatics 2018 Nov 22;19(1):450. Epub 2018 Nov 22.

Department of Computer Science & Engineering, Texas A&M University, College Station, TX 77843, USA.

Background: Phylogeny estimation for bacteria is likely to reflect their true evolutionary histories only if they are highly clonal. However, recombination events could occur during evolution for some species. The reconstruction of phylogenetic trees from an alignment without considering recombination could be misleading, since the relationships among strains in some parts of the genome might be different than in others.

Source
DOI: http://dx.doi.org/10.1186/s12859-018-2456-z
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6251179
November 2018
13 Reads

Gene characteristics predicting missense, nonsense and frameshift mutations in tumor samples.

BMC Bioinformatics 2018 Nov 19;19(1):430. Epub 2018 Nov 19.

The Geisel School of Medicine, Department of Biomedical Data Science, Dartmouth College, HB7936, One Medical Center Dr., Dartmouth-Hitchcock Medical Center, Lebanon, NH, 03756, USA.

Background: Because driver mutations provide selective advantage to the mutant clone, they tend to occur at a higher frequency in tumor samples compared to selectively neutral (passenger) mutations. However, mutation frequency alone is insufficient to identify cancer genes because mutability is influenced by many gene characteristics, such as size, nucleotide composition, etc. The goal of this study was to identify gene characteristics associated with the frequency of somatic mutations in the gene in tumor samples.

Source
Publisher Site: https://bmcbioinformatics.biomedcentral.com/articles/10.1186
DOI: http://dx.doi.org/10.1186/s12859-018-2455-0
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6245819
November 2018
19 Reads

Automated selection of homologs to track the evolutionary history of proteins.

BMC Bioinformatics 2018 Nov 19;19(1):431. Epub 2018 Nov 19.

Faculty of Biology, Johannes Gutenberg University Mainz, Hans-Dieter-Hüsch-Weg 15, 55128, Mainz, Germany.

Background: The selection of distant homologs of a query protein under study is a usual and useful application of protein sequence databases. Such sets of homologs are often applied to investigate the function of a protein and the degree to which experimental results can be transferred from one organism to another. In particular, a variety of databases facilitates static browsing for orthologs.

Source
DOI: http://dx.doi.org/10.1186/s12859-018-2457-y
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6245638
November 2018
1 Read
2.576 Impact Factor

gmRAD: an integrated SNP calling pipeline for genetic mapping with RADseq across a hybrid population.

Brief Bioinform 2018 Nov 14. Epub 2018 Nov 14.

Southern Modern Forestry Collaborative Innovation Center, College of Forestry, Nanjing Forestry University, Nanjing, China.

Restriction site-associated DNA sequencing (RADseq) is a powerful technology that has been extensively applied in population genetics, phylogenetics and genetic mapping. Although many software packages are available for ecological and evolutionary studies, few effective tools are available for extracting genotype data with RADseq for genetic mapping, a prerequisite for quantitative trait locus mapping, comparative genomics and genome scaffold assembly. Here, we present an integrated pipeline called gmRAD for generating single nucleotide polymorphism (SNP) genotypes from RADseq data, de novo, across a genetic mapping population derived by crossing two parents.

Source
DOI: http://dx.doi.org/10.1093/bib/bby114
November 2018
2 Reads

FLYCOP: metabolic modeling-based analysis and engineering microbial communities.

Bioinformatics 2018 Sep;34(17):i954-i963

Department of Systems Biology, Centro Nacional de Biotecnología, Consejo Superior de Investigaciones Científicas (CNB-CSIC), 28049 Madrid, Spain.

Motivation: Synthetic microbial communities are beginning to be considered promising multicellular biocatalysts with great potential to replace engineered single strains in biotechnology applications, in pharmaceutical, chemical and living architecture sectors. In contrast to single strain engineering, the effective and high-throughput analysis and engineering of microbial consortia face a lack of knowledge, tools and well-defined workflows. This manuscript helps to fill this important gap with a framework, called FLYCOP (FLexible sYnthetic Consortium OPtimization), which contributes to microbial consortia modeling and engineering, while improving the knowledge about how these communities work.

Source
Publisher Site: https://academic.oup.com/bioinformatics/article/34/17/i954/5
DOI: http://dx.doi.org/10.1093/bioinformatics/bty561
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6129290
September 2018
19 Reads

Fast characterization of segmental duplications in genome assemblies.

Bioinformatics 2018 Sep;34(17):i706-i714

Vancouver Prostate Centre, Vancouver, Canada.

Motivation: Segmental duplications (SDs), or low-copy repeats, are segments of DNA > 1 Kbp with high sequence identity that are copied to other regions of the genome. SDs are among the most important sources of evolution, a common cause of genomic structural variation and several are associated with diseases of genomic origin including schizophrenia and autism. Despite their functional importance, SDs present one of the major hurdles for de novo genome assembly due to the ambiguity they cause in building and traversing both state-of-the-art overlap-layout-consensus and de Bruijn graphs.

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/bty586
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6129265
September 2018
9 Reads

Predicting protein-protein interactions through sequence-based deep learning.

Bioinformatics 2018 Sep;34(17):i802-i810

Toyota Technological Institute at Chicago, Chicago, IL, USA.

Motivation: High-throughput experimental techniques have produced a large amount of protein-protein interaction (PPI) data, but their coverage is still low and the PPI data is also very noisy. Computational prediction of PPIs can be used to discover new PPIs and identify errors in the experimental PPI data.

Results: We present a novel deep learning framework, DPPI, to model and predict PPIs from sequence information alone.

Source
Publisher Site: https://academic.oup.com/bioinformatics/article/34/17/i802/5
DOI: http://dx.doi.org/10.1093/bioinformatics/bty573
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6129267
September 2018
59 Reads