Adv Appl Math 2018 May 28;96:39-75. Epub 2018 Feb 28.
Department of Mathematics, University of California, 970 Evans Hall #3840, Berkeley, CA 94720-3840, U.S.A.
Given an edge-weighted tree T with n leaves, sample the leaves uniformly at random without replacement and let W_k, 2 ≤ k ≤ n, be the length of the subtree spanned by the first k leaves. We consider the question, "Can T be identified (up to isomorphism) by the joint probability distribution of the random vector (W_2, …, W_n)?" We show that if T is known to belong to one of various families of edge-weighted trees, then the answer is, "Yes." These families include the edge-weighted trees with edge-weights in general position, the ultrametric edge-weighted trees, and certain families with equal weights on all edges such as (k + 1)-valent and rooted k-ary trees for k ≥ 2 and caterpillars.
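The sampling procedure in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the helper names (`build_adjacency`, `spanned_length`, `sample_lengths`), the example tree, and the notation W_k for the length of the subtree spanned by the first k sampled leaves are assumptions made for illustration.

```python
import random
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected weighted adjacency map from (u, v, weight) triples."""
    adj = defaultdict(dict)
    for u, v, w in edges:
        adj[u][v] = w
        adj[v][u] = w
    return adj


def spanned_length(adj, chosen):
    """Total edge weight of the subtree spanned by the chosen leaves."""
    chosen = set(chosen)
    root = next(iter(chosen))  # root the tree at one of the chosen leaves
    parent = {root: None}
    order = []
    stack = [root]
    while stack:  # iterative DFS; every node appears after its parent
        node = stack.pop()
        order.append(node)
        for nb in adj[node]:
            if nb not in parent:
                parent[nb] = node
                stack.append(nb)
    # An edge (parent, child) is in the spanned subtree exactly when the
    # child's side contains a chosen leaf (the root side always does).
    has_chosen = {node: node in chosen for node in order}
    total = 0.0
    for node in reversed(order):  # descendants are processed before ancestors
        p = parent[node]
        if p is not None and has_chosen[node]:
            total += adj[p][node]
            has_chosen[p] = True
    return total


def sample_lengths(adj, rng):
    """Return [W_2, ..., W_n] for one uniformly random ordering of the leaves."""
    leaves = sorted(node for node in adj if len(adj[node]) == 1)
    order = rng.sample(leaves, len(leaves))  # sampling without replacement
    return [spanned_length(adj, order[:k]) for k in range(2, len(leaves) + 1)]


# Hypothetical example tree: leaves a, b, c, d; internal nodes x, y.
example_edges = [("a", "x", 1), ("b", "x", 2), ("x", "y", 3),
                 ("c", "y", 4), ("d", "y", 5)]
tree = build_adjacency(example_edges)
lengths = sample_lengths(tree, random.Random(0))
```

Whatever the random ordering, the last entry equals the total length of the tree, and the entries never decrease, since adding a leaf can only enlarge the spanned subtree. Repeating `sample_lengths` many times gives an empirical estimate of the joint distribution that the abstract asks about.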
J Comput Biol 2018 03 13;25(3):253-269. Epub 2017 Oct 13.
Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan.
Given a distance matrix M that represents evolutionary distances between any two species, an edge-weighted phylogenetic network N is said to satisfy M if between any pair of species, there exists a path in N with a length equal to the corresponding entry in M. In this article, we consider a special class of networks called a one-articulated network, which is a proper superset of galled trees. We show that if the distance matrix M is derived from an ultrametric one-articulated network N (i. …
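The definition of a network satisfying a distance matrix can be made concrete with a small brute-force check. This is only a sketch under stated assumptions, not the algorithm of the article: it treats the network as an undirected weighted graph, considers simple paths only, and enumerates them exhaustively, which is feasible only for tiny examples. The names `simple_path_lengths` and `satisfies` and the example network are hypothetical.

```python
def simple_path_lengths(adj, source, target):
    """Lengths of all simple paths from source to target (exhaustive search)."""
    lengths = []

    def walk(node, visited, dist):
        if node == target:
            lengths.append(dist)
            return
        for nb, w in adj[node].items():
            if nb not in visited:
                walk(nb, visited | {nb}, dist + w)

    walk(source, {source}, 0)
    return lengths


def satisfies(adj, matrix, tol=1e-9):
    """True if every requested distance is realized by some simple path."""
    return all(
        any(abs(length - d) <= tol
            for length in simple_path_lengths(adj, u, v))
        for (u, v), d in matrix.items()
    )


# Hypothetical network: a cycle x-y-z with pendant species a (at x) and b (at z).
network = {
    "a": {"x": 1},
    "b": {"z": 1},
    "x": {"a": 1, "y": 1, "z": 3},
    "y": {"x": 1, "z": 1},
    "z": {"b": 1, "y": 1, "x": 3},
}

ok_long = satisfies(network, {("a", "b"): 5})   # path a-x-z-b
ok_short = satisfies(network, {("a", "b"): 4})  # path a-x-y-z-b
not_ok = satisfies(network, {("a", "b"): 6})    # no path of this length
```

Because the cycle offers two routes between a and b, the same network satisfies two different values for that matrix entry, which is what distinguishes this notion from the unique path lengths of a tree.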
Mol Biol Evol 2015 Jun 4;32(6):1628-42. Epub 2015 Feb 4.
School of Computing Sciences, University of East Anglia, United Kingdom
The wealth of phylogenetic information accumulated over many decades of biological research, coupled with recent technological advances in molecular sequence generation, presents significant opportunities for researchers to investigate relationships across and within the kingdoms of life. However, to make best use of this data wealth, several problems must first be overcome. One key problem is finding effective strategies to deal with missing data.
J Comput Biol 2013 Apr 19;20(4):311-21. Epub 2013 Mar 19.
Department of Computer Science, National University of Computer and Emerging Sciences, Karachi, Pakistan.
In metabolomics and other fields dealing with small compounds, mass spectrometry is applied as a sensitive high-throughput technique. Recently, fragmentation trees have been proposed to automatically analyze the fragmentation mass spectra recorded by such instruments. Computationally, this leads to the problem of finding a maximum weight subtree in an edge-weighted and vertex-colored graph, such that every color appears at most once in the solution.
Bull Math Biol 2013 Mar 5;75(3):444-65. Epub 2013 Feb 5.
School of Computing Sciences, University of East Anglia, Norwich, NR4 7TJ, UK.
The construction of a dendrogram on a set of individuals is a key component of a genome-wide association study. However, even with modern sequencing technologies, the distances on the individuals required for the construction of such a structure may not always be reliable, making it tempting to exclude them from an analysis. This, in turn, results in an input set for dendrogram construction that consists of only partial distance information, which raises the following fundamental question.