Publications by authors named "Philippe Lemey"

236 Publications

SARS-CoV-2 European resurgence foretold: interplay of introductions and persistence by leveraging genomic and mobility data.

Res Sq 2021 Feb 10. Epub 2021 Feb 10.

Following the first wave of SARS-CoV-2 infections in spring 2020, Europe experienced a resurgence of the virus starting late summer that was deadlier and more difficult to contain. Relaxed intervention measures and summer travel have been implicated as drivers of the second wave. Here, we build a phylogeographic model to evaluate how newly introduced lineages, as opposed to the rekindling of persistent lineages, contributed to the COVID-19 resurgence in Europe. We inform this model using genomic, mobility and epidemiological data from 10 West European countries and estimate that in many countries more than 50% of the lineages circulating in late summer resulted from new introductions since June 15th. The success in onwards transmission of these lineages is predicted by SARS-CoV-2 incidence during this period. Relatively early introductions from Spain into the United Kingdom contributed to the successful spread of the 20A.EU1/B.1.177 variant. The pervasive spread of variants that have not been associated with an advantage in transmissibility highlights the threat of novel variants of concern that emerged more recently and have been disseminated by holiday travel. Our findings indicate that more effective and coordinated measures are required to contain spread through cross-border travel.

Source
DOI: http://dx.doi.org/10.21203/rs.3.rs-208849/v1
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7885927
February 2021

Relax, keep walking - a practical guide to continuous phylogeographic inference with BEAST.

Mol Biol Evol 2021 Feb 2. Epub 2021 Feb 2.

Department of Microbiology, Immunology and Transplantation, Rega Institute, KU Leuven, Herestraat 49, Leuven, 3000, Belgium.

Spatially-explicit phylogeographic analyses can be performed with an inference framework that employs relaxed random walks to reconstruct phylogenetic dispersal histories in continuous space. This core model was first implemented ten years ago and has opened up new opportunities in the field of phylodynamics, allowing researchers to map and analyse the spatial dissemination of rapidly evolving pathogens. We here provide a detailed and step-by-step guide on how to set up, run, and interpret continuous phylogeographic analyses using the programs BEAUti, BEAST, Tracer, and TreeAnnotator.

Source
DOI: http://dx.doi.org/10.1093/molbev/msab031
February 2021

Determinants of dengue virus dispersal in the Americas.

Virus Evol 2020 Jul 2;6(2):veaa074. Epub 2020 Dec 2.

Department of Preclinical Sciences, Faculty of Medical Sciences, University of the West Indies, St. Augustine, Trinidad and Tobago.

Dengue viruses (DENVs) are classified into four serotypes, each of which contains multiple genotypes. DENV genotypes introduced into the Americas over the past five decades have exhibited different rates and patterns of spatial dispersal. In order to understand factors underlying these patterns, we utilized a statistical framework that allows for the integration of ecological, socioeconomic, and air transport mobility data as predictors of viral diffusion while inferring the phylogeographic history. Predictors describing spatial diffusion based on several covariates were compared using a generalized linear model approach, where the support for each scenario and its contribution is estimated simultaneously from the data set. Although different predictors were identified for different serotypes, our analysis suggests that overall diffusion of DENV-1, -2, and -3 in the Americas was associated with airline traffic. The other significant predictors included human population size, the geographical distance between countries and between urban centers and the density of people living in urban environments.

Source
DOI: http://dx.doi.org/10.1093/ve/veaa074
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7772473
July 2020

HIV-1 p24Gag adaptation to modern and archaic HLA-allele frequency differences in ethnic groups contributes to viral subtype diversification.

Virus Evol 2020 Jul 12;6(2):veaa085. Epub 2020 Dec 12.

Division of Clinical Neurology, Nuffield Department of Clinical Neurosciences, Weatherall Institute of Molecular Medicine, University of Oxford, Oxford OX3 9DS, UK.

Pathogen-driven selection and past interbreeding with archaic human lineages have resulted in differences in human leukocyte antigen (HLA)-allele frequencies between modern human populations. Whether or not this variation affects pathogen subtype diversification is unknown. Here we show a strong positive correlation between ethnic diversity in African countries and both human immunodeficiency virus (HIV)-1 p24 and subtype diversity. We demonstrate that ethnic HLA-allele differences between populations have influenced HIV-1 subtype diversification as the virus adapted to escape common antiviral immune responses. The evolution of HIV Subtype B (HIV-B), which does not appear to be indigenous to Africa, is strongly affected by immune responses associated with Eurasian HLA variants acquired through adaptive introgression from Neanderthals and Denisovans. Furthermore, we show that the increasing and disproportionate number of HIV-infections among African Americans in the USA drive HIV-B evolution towards an Africa-centric HIV-1 state. Similar adaptation of other pathogens to HLA variants common in affected populations is likely.

Source
DOI: http://dx.doi.org/10.1093/ve/veaa085
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7733611
July 2020

Temporal signal and the phylodynamic threshold of SARS-CoV-2.

Virus Evol 2020 Jul 19;6(2):veaa061. Epub 2020 Aug 19.

Department of Microbiology, Immunology and Transplantation, Rega Institute, KU Leuven, Leuven, Belgium.

The ongoing SARS-CoV-2 outbreak marks the first time that large amounts of genome sequence data have been generated and made publicly available in near real time. Early analyses of these data revealed low sequence variation, a finding that is consistent with a recently emerging outbreak, but which raises the question of whether such data are sufficiently informative for phylogenetic inferences of evolutionary rates and time scales. The phylodynamic threshold is a key concept that refers to the point in time at which sufficient molecular evolutionary change has accumulated in available genome samples to obtain robust phylodynamic estimates. For example, before the phylodynamic threshold is reached, genomic variation is so low that even large amounts of genome sequences may be insufficient to estimate the virus's evolutionary rate and the time scale of an outbreak. We collected genome sequences of SARS-CoV-2 from public databases at eight different points in time and conducted a range of tests of temporal signal to determine if and when the phylodynamic threshold was reached, and the range of inferences that could be reliably drawn from these data. Our results indicate that by 2 February 2020, estimates of evolutionary rates and time scales had become possible. Analyses of subsequent data sets, which included between 47 and 122 genomes, converged at an evolutionary rate of about 1.1 × 10⁻³ subs/site/year and a time of origin of around late November 2019. Our study provides guidelines to assess the phylodynamic threshold and demonstrates that establishing this threshold constitutes a fundamental step for understanding the power and limitations of early data in outbreak genome surveillance.

Source
DOI: http://dx.doi.org/10.1093/ve/veaa061
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7454936
July 2020

Epidemiological hypothesis testing using a phylogeographic and phylodynamic framework.

Nat Commun 2020 Nov 6;11(1):5620. Epub 2020 Nov 6.

Department of Microbiology, Immunology and Transplantation, Rega Institute, KU Leuven, Herestraat 49, 3000, Leuven, Belgium.

Computational analyses of pathogen genomes are increasingly used to unravel the dispersal history and transmission dynamics of epidemics. Here, we show how to go beyond historical reconstructions and use spatially-explicit phylogeographic and phylodynamic approaches to formally test epidemiological hypotheses. We illustrate our approach by focusing on the West Nile virus (WNV) spread in North America that has substantially impacted public, veterinary, and wildlife health. We apply an analytical workflow to a comprehensive WNV genome collection to test the impact of environmental factors on the dispersal of viral lineages and on viral population genetic diversity through time. We find that WNV lineages tend to disperse faster in areas with higher temperatures and we identify temporal variation in temperature as a main predictor of viral genetic diversity through time. By contrasting inference with simulation, we find no evidence for viral lineages to preferentially circulate within the same migratory bird flyway, suggesting a substantial role for non-migratory birds or mosquito dispersal along the longitudinal gradient.

Source
DOI: http://dx.doi.org/10.1038/s41467-020-19122-z
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7648063
November 2020

Towards a unified classification for human respiratory syncytial virus genotypes.

Virus Evol 2020 Jul 24;6(2):veaa052. Epub 2020 Jul 24.

KU Leuven, Department of Microbiology, Immunology and Transplantation, Rega Institute for Medical Research, Laboratory of Clinical and Epidemiological Virology, Herestraat 49 box 1040, BE-3000 Leuven, Belgium.

Since the first human respiratory syncytial virus (HRSV) genotype classification in 1998, inconsistent conclusions have been drawn regarding the criteria that define HRSV genotypes and their nomenclature, challenging data comparisons between research groups. In this study, we aim to unify the field of HRSV genotype classification by reviewing the different methods that have been used in the past to define HRSV genotypes and by proposing a new classification procedure, based on well-established phylogenetic methods. All available complete HRSV genomes (>12,000 bp) were downloaded from GenBank and divided into the two subgroups: HRSV-A and HRSV-B. From whole-genome alignments, the regions that correspond to the open reading frame of the glycoprotein G and the second hypervariable region (HVR2) of the ectodomain were extracted. In the resulting partial alignments, the phylogenetic signal within each fragment was assessed. Maximum likelihood phylogenetic trees were reconstructed using the complete genome alignments. Patristic distances were calculated between all pairs of tips in the phylogenetic tree and summarized as a density plot in order to determine a cutoff value at the lowest point following the major distance peak. Our data show that neither the HVR2 fragment nor the G gene contains sufficient phylogenetic signal to perform reliable phylogenetic reconstruction. Therefore, whole-genome alignments were used to determine HRSV genotypes. We define a genotype using the following criteria: a bootstrap support of 70 per cent for the respective clade and a maximum patristic distance between all members of the clade of ≤0.018 substitutions per site for HRSV-A or ≤0.026 substitutions per site for HRSV-B. By applying this definition, we distinguish twenty-three genotypes within subtype HRSV-A and six genotypes within subtype HRSV-B. Applying the genotype criteria on subsampled data sets confirmed the robustness of the method.
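The two genotype criteria stated in this abstract can be sketched as a small check. This is an illustrative sketch only, not the authors' pipeline: the function names and the input layout (a dictionary of pairwise patristic distances keyed by tip pairs) are hypothetical, and the 70 per cent bootstrap value is treated as a lower bound, which is an assumption.

```python
from itertools import combinations

# Distance cutoffs quoted in the abstract (substitutions per site).
MAX_PATRISTIC = {"HRSV-A": 0.018, "HRSV-B": 0.026}
# Bootstrap support in per cent; treating it as a minimum is an assumption.
MIN_BOOTSTRAP = 70.0


def max_pairwise_distance(tips, dist):
    """Return the largest patristic distance between any two tips of a clade.

    `dist` maps frozenset({tip_a, tip_b}) to a distance (hypothetical layout).
    """
    return max((dist[frozenset(pair)] for pair in combinations(tips, 2)), default=0.0)


def is_genotype(tips, dist, bootstrap, subgroup):
    """Apply both criteria: clade support and maximum within-clade distance."""
    supported = bootstrap >= MIN_BOOTSTRAP
    compact = max_pairwise_distance(tips, dist) <= MAX_PATRISTIC[subgroup]
    return supported and compact


# Made-up distances for a three-tip clade, for illustration only.
example_dist = {
    frozenset(("a", "b")): 0.010,
    frozenset(("a", "c")): 0.017,
    frozenset(("b", "c")): 0.012,
}
print(is_genotype(["a", "b", "c"], example_dist, bootstrap=85.0, subgroup="HRSV-A"))
```

In practice the distances and bootstrap values would be read from a maximum likelihood tree built on whole-genome alignments, as the abstract describes.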

Source
DOI: http://dx.doi.org/10.1093/ve/veaa052
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7552823
July 2020

Accommodating individual travel history and unsampled diversity in Bayesian phylogeographic inference of SARS-CoV-2.

Nat Commun 2020 Oct 9;11(1):5110. Epub 2020 Oct 9.

Department of Biomathematics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, 90095, USA.

Spatiotemporal bias in genome sampling can severely confound discrete trait phylogeographic inference. This has impeded our ability to accurately track the spread of SARS-CoV-2, the virus responsible for the COVID-19 pandemic, despite the availability of unprecedented numbers of SARS-CoV-2 genomes. Here, we present an approach to integrate individual travel history data in Bayesian phylogeographic inference and apply it to the early spread of SARS-CoV-2. We demonstrate that including travel history data yields i) more realistic hypotheses of virus spread and ii) higher posterior predictive accuracy compared to including only sampling location. We further explore methods to ameliorate the impact of sampling bias by augmenting the phylogeographic analysis with lineages from undersampled locations. Our reconstructions reinforce specific transmission hypotheses suggested by the inclusion of travel history data, but also suggest alternative routes of virus migration that are plausible within the epidemiological context but are not apparent with current sampling efforts.

Source
DOI: http://dx.doi.org/10.1038/s41467-020-18877-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7547076
October 2020

Air conditioning system usage and SARS-CoV-2 transmission dynamics in Iran.

Med Hypotheses 2020 Oct 5;143:110164. Epub 2020 Aug 5.

KU Leuven, Department of Microbiology, Immunology and Transplantation, Rega Institute for Medical Research, Laboratory for Clinical and Epidemiological Virology, Leuven, Belgium.

Source
DOI: http://dx.doi.org/10.1016/j.mehy.2020.110164
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7405771
October 2020

nosoi: A stochastic agent-based transmission chain simulation framework in R.

Methods Ecol Evol 2020 Aug 21;11(8):1002-1007. Epub 2020 Jun 21.

Department of Microbiology, Immunology and Transplantation, Rega Institute, KU Leuven, Leuven, Belgium.

The transmission process of an infectious agent creates a connected chain of hosts linked by transmission events, known as a transmission chain. Reconstructing transmission chains remains a challenging endeavour, except in rare cases characterized by intense surveillance and epidemiological inquiry. Inference frameworks attempt to estimate or approximate these transmission chains but the accuracy and validity of such methods generally lack formal assessment on datasets for which the actual transmission chain was observed. We here introduce nosoi, an open-source R package that offers a complete, tunable and expandable agent-based framework to simulate transmission chains under a wide range of epidemiological scenarios for single-host and dual-host epidemics. nosoi is accessible through GitHub and CRAN, and is accompanied by extensive documentation, providing help and practical examples to assist users in setting up their own simulations. Once infected, each host or agent can undergo a series of events during each time step, such as moving (between locations) or transmitting the infection, all of these being driven by user-specified rules or data, such as travel patterns between locations. nosoi is able to generate a multitude of epidemic scenarios that can, for example, be used to validate a wide range of reconstruction methods, including epidemic modelling and phylodynamic analyses. nosoi also offers a comprehensive framework to leverage empirically acquired data, allowing the user to explore how variations in parameters can affect epidemic potential. Aside from research questions, nosoi can provide lecturers with a complete teaching tool to offer students a hands-on exploration of the dynamics of epidemiological processes and the factors that impact them. Because the package does not rely on mathematical formalism but uses a more intuitive algorithmic approach, even extensive changes of the entire model can be easily and quickly implemented.
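The agent-based idea described in this abstract (each infected host may transmit at every time step, producing a chain of who infected whom) can be illustrated with a toy simulation. This is a minimal Python sketch under simplifying assumptions, not the nosoi package or its API: the function name, the constant per-step transmission probability, and the record layout are all hypothetical, and movement between locations is left out.

```python
import random


def simulate_chain(p_transmit, n_steps, max_hosts, seed=None):
    """Simulate a toy single-host transmission chain.

    Every infected host transmits to one new host per time step with
    probability `p_transmit` (a simplifying assumption). Returns a list of
    records with the host id, the id of its infector, and the infection time.
    """
    rng = random.Random(seed)
    chain = [{"host": 0, "infected_by": None, "time": 0}]  # index case
    for t in range(1, n_steps + 1):
        # Only hosts infected before this step can transmit during it.
        for record in list(chain):
            if len(chain) >= max_hosts:
                return chain
            if rng.random() < p_transmit:
                chain.append(
                    {"host": len(chain), "infected_by": record["host"], "time": t}
                )
    return chain


if __name__ == "__main__":
    for record in simulate_chain(p_transmit=0.4, n_steps=5, max_hosts=20, seed=42):
        print(record)
```

Because the full chain is known by construction, output of this kind could serve as ground truth when checking a reconstruction method, which is the validation use the abstract mentions.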

Source
DOI: http://dx.doi.org/10.1111/2041-210X.13422
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7496779
August 2020

Hamiltonian Monte Carlo sampling to estimate past population dynamics using the skygrid coalescent model in a Bayesian phylogenetics framework.

Wellcome Open Res 2020 Mar 30;5:53. Epub 2020 Mar 30.

Departments of Biostatistics, Biomathematics and Human Genetics, University of California, Los Angeles, 695 Charles E. Young Drive, Los Angeles, California, 90095-1766, USA.

Nonparametric coalescent-based models are often employed to infer past population dynamics over time. Several of these models, such as the skyride and skygrid models, are equipped with a block-updating Markov chain Monte Carlo sampling scheme to efficiently estimate model parameters. The advent of powerful computational hardware along with the use of high-performance libraries for statistical phylogenetics has, however, made the development of alternative estimation methods feasible. We here present the implementation and performance assessment of a Hamiltonian Monte Carlo gradient-based sampler to infer the parameters of the skygrid model. The skygrid is a popular and flexible coalescent-based model for estimating population dynamics over time and is available in BEAST 1.10.5, a widely-used software package for Bayesian phylogenetic and phylodynamic analysis. Taking into account the increased computational cost of gradient evaluation, we report substantial increases in effective sample size per time unit compared to the established block-updating sampler. We expect gradient-based samplers to assume an increasingly important role for different classes of parameters typically estimated in Bayesian phylogenetic and phylodynamic analyses.

Source
DOI: http://dx.doi.org/10.12688/wellcomeopenres.15770.1
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7463299
March 2020

The emergence of SARS-CoV-2 in Europe and North America.

Science 2020 Oct 10;370(6516):564-570. Epub 2020 Sep 10.

KU Leuven Department of Microbiology, Immunology and Transplantation, Rega Institute, Laboratory of Clinical and Epidemiological Virology, Leuven, Belgium.

Accurate understanding of the global spread of emerging viruses is critical for public health responses and for anticipating and preventing future outbreaks. Here we elucidate when, where, and how the earliest sustained severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) transmission networks became established in Europe and North America. Our results suggest that rapid early interventions successfully prevented early introductions of the virus from taking hold in Germany and the United States. Other, later introductions of the virus from China to both Italy and Washington state, United States, founded the earliest sustained European and North America transmission networks. Our analyses demonstrate the effectiveness of public health measures in preventing onward transmission and show that intensive testing and contact tracing could have prevented SARS-CoV-2 outbreaks from becoming established in these regions.

Source
DOI: http://dx.doi.org/10.1126/science.abc8169
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7810038
October 2020

Bayesian Evaluation of Temporal Signal in Measurably Evolving Populations.

Mol Biol Evol 2020 Nov;37(11):3363-3379

Department of Microbiology, Immunology and Transplantation, Rega Institute, KU Leuven, Leuven, Belgium.

Phylogenetic methods can use the sampling times of molecular sequence data to calibrate the molecular clock, enabling the estimation of evolutionary rates and timescales for rapidly evolving pathogens and data sets containing ancient DNA samples. A key aspect of such calibrations is whether a sufficient amount of molecular evolution has occurred over the sampling time window, that is, whether the data can be treated as having come from a measurably evolving population. Here, we investigate the performance of a fully Bayesian evaluation of temporal signal (BETS) in sequence data. The method involves comparing the fit to the data of two models: a model in which the data are accompanied by the actual (heterochronous) sampling times, and a model in which the samples are constrained to be contemporaneous (isochronous). We conducted simulations under a wide range of conditions to demonstrate that BETS accurately classifies data sets according to whether they contain temporal signal or not, even when there is substantial among-lineage rate variation. We explore the behavior of this classification in analyses of five empirical data sets: modern samples of A/H1N1 influenza virus, the bacterium Bordetella pertussis, coronaviruses from mammalian hosts, ancient DNA from Hepatitis B virus, and mitochondrial genomes of dog species. Our results indicate that BETS is an effective alternative to other tests of temporal signal. In particular, this method has the key advantage of allowing a coherent assessment of the entire model, including the molecular clock and tree prior which are essential aspects of Bayesian phylodynamic analyses.
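The model comparison at the heart of BETS, as described in this abstract, reduces to comparing the fit of a heterochronous model against an isochronous one. The sketch below expresses that comparison as a log Bayes factor. It is illustrative only: the function names are hypothetical, the decision threshold of zero is an assumption, and the log marginal likelihoods are made-up numbers that would in practice come from a separate marginal likelihood estimation step.

```python
def log_bayes_factor(log_ml_heterochronous, log_ml_isochronous):
    """Log Bayes factor in favour of the model that uses real sampling times."""
    return log_ml_heterochronous - log_ml_isochronous


def has_temporal_signal(log_ml_heterochronous, log_ml_isochronous, threshold=0.0):
    """Classify a data set as having temporal signal.

    The data set is flagged when the heterochronous model is favoured by more
    than `threshold` log units (the default of 0.0 is an assumption).
    """
    return log_bayes_factor(log_ml_heterochronous, log_ml_isochronous) > threshold


# Made-up log marginal likelihoods, for illustration only.
print(log_bayes_factor(-10250.0, -10275.5))   # 25.5
print(has_temporal_signal(-10250.0, -10275.5))  # True
```

A stricter threshold can be passed when stronger evidence is wanted before treating the population as measurably evolving.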

Source
DOI: http://dx.doi.org/10.1093/molbev/msaa163
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7454806
November 2020

Evolutionary origins of the SARS-CoV-2 sarbecovirus lineage responsible for the COVID-19 pandemic.

Nat Microbiol 2020 Nov 28;5(11):1408-1417. Epub 2020 Jul 28.

MRC-University of Glasgow Centre for Virus Research, Glasgow, UK.

There are outstanding evolutionary questions on the recent emergence of human coronavirus SARS-CoV-2 including the role of reservoir species, the role of recombination and its time of divergence from animal viruses. We find that the sarbecoviruses (the viral subgenus containing SARS-CoV and SARS-CoV-2) undergo frequent recombination and exhibit spatially structured genetic diversity on a regional scale in China. SARS-CoV-2 itself is not a recombinant of any sarbecoviruses detected to date, and its receptor-binding motif, important for specificity to human ACE2 receptors, appears to be an ancestral trait shared with bat viruses and not one acquired recently via recombination. To employ phylogenetic dating methods, recombinant regions of a 68-genome sarbecovirus alignment were removed with three independent methods. Bayesian evolutionary rate and divergence date estimates were shown to be consistent for these three approaches and for two different prior specifications of evolutionary rates based on HCoV-OC43 and MERS-CoV. Divergence dates between SARS-CoV-2 and the bat sarbecovirus reservoir were estimated as 1948 (95% highest posterior density (HPD): 1879-1999), 1969 (95% HPD: 1930-2000) and 1982 (95% HPD: 1948-2009), indicating that the lineage giving rise to SARS-CoV-2 has been circulating unnoticed in bats for decades.

Source
DOI: http://dx.doi.org/10.1038/s41564-020-0771-4
November 2020

Evolution and epidemic spread of SARS-CoV-2 in Brazil.

Science 2020 Sep 23;369(6508):1255-1260. Epub 2020 Jul 23.

Department of Zoology, University of Oxford, Oxford, UK.

Brazil currently has one of the fastest-growing severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) epidemics in the world. Because of limited available data, assessments of the impact of nonpharmaceutical interventions (NPIs) on this virus spread remain challenging. Using a mobility-driven transmission model, we show that NPIs reduced the reproduction number from >3 to between 1 and 1.6 in São Paulo and Rio de Janeiro. Sequencing of 427 new genomes and analysis of a geographically representative genomic dataset identified >100 international virus introductions in Brazil. We estimate that most (76%) of the Brazilian strains fell in three clades that were introduced from Europe between 22 February and 11 March 2020. During the early epidemic phase, we found that SARS-CoV-2 spread mostly locally and within state borders. After this period, despite sharp decreases in air travel, we estimated multiple exportations from large urban centers that coincided with a 25% increase in average traveled distances in national flights. This study sheds new light on the epidemic transmission and evolutionary trajectories of SARS-CoV-2 lineages in Brazil and provides evidence that current interventions remain insufficient to keep virus transmission under control in this country.

Source
DOI: http://dx.doi.org/10.1126/science.abd2161
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7402630
September 2020

Relaxed Random Walks at Scale.

Syst Biol 2021 Feb;70(2):258-267

Department of Biomathematics, David Geffen School of Medicine at UCLA, University of California, Los Angeles, CA, USA.

Relaxed random walk (RRW) models of trait evolution introduce branch-specific rate multipliers to modulate the variance of a standard Brownian diffusion process along a phylogeny and more accurately model overdispersed biological data. Increased taxonomic sampling challenges inference under RRWs as the number of unknown parameters grows with the number of taxa. To solve this problem, we present a scalable method to efficiently fit RRWs and infer this branch-specific variation in a Bayesian framework. We develop a Hamiltonian Monte Carlo (HMC) sampler to approximate the high-dimensional, correlated posterior that exploits a closed-form evaluation of the gradient of the trait data log-likelihood with respect to all branch-rate multipliers simultaneously. Our gradient calculation achieves computational complexity that scales only linearly with the number of taxa under study. We compare the efficiency of our HMC sampler to the previously standard univariable Metropolis-Hastings approach while studying the spatial emergence of the West Nile virus in North America in the early 2000s. Our method achieves at least a 6-fold speed increase over the univariable approach. Additionally, we demonstrate the scalability of our method by applying the RRW to study the correlation between five mammalian life history traits in a phylogenetic tree with 3650 tips. [Bayesian inference; BEAST; Hamiltonian Monte Carlo; life history; phylodynamics; relaxed random walk.]
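The role of the branch-specific rate multipliers mentioned in this abstract can be illustrated with a forward simulation of a one-dimensional trait. This is a minimal sketch under stated assumptions, not the inference method of the paper and not BEAST code: the function names, the tuple layout for branches, and the restriction to a single dimension are all hypothetical simplifications. Each branch's displacement is drawn from a normal distribution whose variance is the Brownian variance scaled by that branch's multiplier.

```python
import random


def branch_variance(sigma2, length, multiplier):
    """Variance of the trait displacement along one branch.

    Standard Brownian diffusion gives sigma2 * length; the relaxed random
    walk rescales it by a branch-specific multiplier.
    """
    return sigma2 * length * multiplier


def simulate_rrw(branches, sigma2, root_value=0.0, seed=None):
    """Simulate a 1-D trait down a tree under a relaxed random walk.

    `branches` is a list of (parent, child, length, multiplier) tuples,
    ordered so that each parent appears before its children, with the root
    node named "root" (hypothetical layout).
    """
    rng = random.Random(seed)
    value = {"root": root_value}
    for parent, child, length, multiplier in branches:
        sd = branch_variance(sigma2, length, multiplier) ** 0.5
        value[child] = value[parent] + rng.gauss(0.0, sd)
    return value


if __name__ == "__main__":
    # Made-up four-branch tree; the branch to tip "B" diffuses ten times faster.
    tree = [
        ("root", "n1", 1.0, 1.0),
        ("n1", "A", 0.5, 1.0),
        ("n1", "B", 0.5, 10.0),
        ("root", "C", 1.5, 0.2),
    ]
    print(simulate_rrw(tree, sigma2=2.0, seed=7))
```

Setting every multiplier to 1 recovers a standard Brownian diffusion, which makes the multipliers easy to interpret as departures from that baseline.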

Source
DOI: http://dx.doi.org/10.1093/sysbio/syaa056
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7875444
February 2021

Accommodating individual travel history, global mobility, and unsampled diversity in phylogeography: a SARS-CoV-2 case study.

bioRxiv 2020 Jun 23. Epub 2020 Jun 23.

Department of Biomathematics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA 90095, USA.

Spatiotemporal bias in genome sequence sampling can severely confound phylogeographic inference based on discrete trait ancestral reconstruction. This has impeded our ability to accurately track the emergence and spread of SARS-CoV-2, which is the virus responsible for the COVID-19 pandemic. Despite the availability of staggering numbers of genomes on a global scale, evolutionary reconstructions of SARS-CoV-2 are hindered by the slow accumulation of sequence divergence over its relatively short transmission history. When confronted with these issues, incorporating additional contextual data may critically inform phylodynamic reconstructions. Here, we present a new approach to integrate individual travel history data in Bayesian phylogeographic inference and apply it to the early spread of SARS-CoV-2, while also including global air transportation data. We demonstrate that including travel history data for each SARS-CoV-2 genome yields more realistic reconstructions of virus spread, particularly when travelers from undersampled locations are included to mitigate sampling bias. We further explore the impact of sampling bias by incorporating unsampled sequences from undersampled locations in the analyses. Our reconstructions reinforce specific transmission hypotheses suggested by the inclusion of travel history data, but also suggest alternative routes of virus migration that are plausible within the epidemiological context but are not apparent with current sampling efforts. Although further research is needed to fully examine the performance of our new data integration approaches and to further improve them, they represent multiple new avenues for directly addressing the colossal issue of sample bias in phylogeographic inference.

Source
DOI: http://dx.doi.org/10.1101/2020.06.22.165464
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7315996
June 2020

Measles virus and rinderpest virus divergence dated to the sixth century BCE.

Science 2020 Jun;368(6497):1367-1370

Epidemiology of Highly Pathogenic Microorganisms Project Group, Robert Koch Institute, Berlin, Germany.

Many infectious diseases are thought to have emerged in humans after the Neolithic revolution. Although it is broadly accepted that this also applies to measles, the exact date of emergence for this disease is controversial. We sequenced the genome of a 1912 measles virus and used selection-aware molecular clock modeling to determine the divergence date of measles virus and rinderpest virus. This divergence date represents the earliest possible date for the establishment of measles in human populations. Our analyses show that the measles virus potentially arose as early as the sixth century BCE, possibly coinciding with the rise of large cities.

Source
DOI: http://dx.doi.org/10.1126/science.aba9411
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7713999
June 2020

The emergence of SARS-CoV-2 in Europe and the US.

bioRxiv 2020 May 23. Epub 2020 May 23.

KU Leuven Department of Microbiology, Immunology and Transplantation, Rega Institute, Laboratory of Clinical and Evolutionary Virology, Leuven, Belgium.

Accurate understanding of the global spread of emerging viruses is critically important for public health response and for anticipating and preventing future outbreaks. Here, we elucidate when, where and how the earliest sustained SARS-CoV-2 transmission networks became established in Europe and the United States (US). Our results refute prior findings erroneously linking cases in January 2020 with outbreaks that occurred weeks later. Instead, rapid interventions successfully prevented onward transmission of those early cases in Germany and Washington State. Other, later introductions of the virus from China to both Italy and Washington State founded the earliest sustained European and US transmission networks. Our analyses reveal an extended period of missed opportunity when intensive testing and contact tracing could have prevented SARS-CoV-2 from becoming established in the US and Europe.

Source
DOI: http://dx.doi.org/10.1101/2020.05.21.109322
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7265688
May 2020

Gradients Do Grow on Trees: A Linear-Time O(N)-Dimensional Gradient for Statistical Phylogenetics.

Mol Biol Evol 2020 Oct;37(10):3047-3060

Department of Biomathematics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA.

Calculation of the log-likelihood stands as the computational bottleneck for many statistical phylogenetic algorithms. Even worse is its gradient evaluation, often used to target regions of high probability. Order O(N)-dimensional gradient calculations based on the standard pruning algorithm require O(N^2) operations, where N is the number of sampled molecular sequences. With the advent of high-throughput sequencing, recent phylogenetic studies have analyzed hundreds to thousands of sequences, with an apparent trend toward even larger data sets as a result of advancing technology. Such large-scale analyses challenge phylogenetic reconstruction by requiring inference on larger sets of process parameters to model the increasing data heterogeneity. To make these analyses tractable, we present a linear-time algorithm for O(N)-dimensional gradient evaluation and apply it to general continuous-time Markov processes of sequence substitution on a phylogenetic tree without a need to assume either stationarity or reversibility. We apply this approach to learn the branch-specific evolutionary rates of three pathogenic viruses: West Nile virus, Dengue virus, and Lassa virus. Our proposed algorithm significantly improves inference efficiency with a 126- to 234-fold increase in maximum-likelihood optimization and a 16- to 33-fold computational performance increase in a Bayesian framework.
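
To make the quadratic cost of the standard approach concrete, the following is a minimal Python sketch, not the authors' implementation. It evaluates a single-site log-likelihood with the pruning algorithm under the Jukes-Cantor (JC69) model and obtains the branch-length gradient by central finite differences, so each of the O(N) branches triggers two full O(N) pruning passes. The tree encoding, the function names, the choice of JC69 and the use of finite differences are all assumptions made purely for illustration.

```python
import math

def jc69_prob(same, t):
    """JC69 transition probability over a branch of length t (substitutions/site)."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if same else 0.25 - 0.25 * e

def partials(node, lengths):
    """Post-order pruning pass: likelihood of the data below `node` for each state.

    Hypothetical encoding: a leaf is a nucleotide string ("A", "C", "G" or "T");
    an internal node is a list of (branch_id, child) pairs.
    """
    if isinstance(node, str):
        return [1.0 if s == node else 0.0 for s in "ACGT"]
    out = [1.0] * 4
    for branch_id, child in node:
        below = partials(child, lengths)
        t = lengths[branch_id]
        for i in range(4):
            out[i] *= sum(jc69_prob(i == j, t) * below[j] for j in range(4))
    return out

def log_likelihood(tree, lengths):
    """Single-site log-likelihood with uniform root frequencies."""
    return math.log(sum(0.25 * p for p in partials(tree, lengths)))

def naive_gradient(tree, lengths, h=1e-6):
    """Finite-difference gradient: two full pruning passes per branch."""
    grad = {}
    for b in lengths:
        up = dict(lengths)
        up[b] += h
        down = dict(lengths)
        down[b] -= h
        grad[b] = (log_likelihood(tree, up) - log_likelihood(tree, down)) / (2 * h)
    return grad
```

Calling `naive_gradient([("a", "A"), ("b", [("c", "A"), ("d", "G")])], {"a": 0.1, "b": 0.05, "c": 0.1, "d": 0.2})` returns one derivative per branch. Because every branch repeats the whole traversal, the work grows quadratically with the number of sequences, which is the cost the linear-time algorithm in the abstract is designed to avoid.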

Source
DOI: http://dx.doi.org/10.1093/molbev/msaa130
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7530611
October 2020

Genomic Epidemiology of 2015-2016 Zika Virus Outbreak in Cape Verde.

Emerg Infect Dis 2020 06;26(6):1084-1090

During 2015-2016, Cape Verde, an island nation off the coast of West Africa, experienced a Zika virus (ZIKV) outbreak involving 7,580 suspected Zika cases and 18 microcephaly cases. Analysis of the complete genomes of 3 ZIKV isolates from the outbreak indicated the strain was of the Asian (not African) lineage. The Cape Verde ZIKV sequences formed a distinct monophylogenetic group and possessed 1-2 (T659A, I756V) unique amino acid changes in the envelope protein. Phylogeographic and serologic evidence support earlier introduction of this lineage into Cape Verde, possibly from northeast Brazil, between June 2014 and August 2015, suggesting cryptic circulation of the virus before the initial wave of cases was detected in October 2015. These findings underscore the utility of genomic-scale epidemiology for outbreak investigations.

Source
DOI: http://dx.doi.org/10.3201/eid2606.190928
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7258482
June 2020

A near full-length HIV-1 genome from 1966 recovered from formalin-fixed paraffin-embedded tissue.

Proc Natl Acad Sci U S A 2020 06 19;117(22):12222-12229. Epub 2020 May 19.

Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ 85721;

With very little direct biological data of HIV-1 from before the 1980s, far-reaching evolutionary and epidemiological inferences regarding the long prediscovery phase of this pandemic are based on extrapolations by phylodynamic models of HIV-1 genomic sequences gathered mostly over recent decades. Here, using a very sensitive multiplex RT-PCR assay, we screened 1,645 formalin-fixed paraffin-embedded tissue specimens collected for pathology diagnostics in Central Africa between 1958 and 1966. We report the near-complete viral genome in one HIV-1 positive specimen from Kinshasa, Democratic Republic of Congo (DRC), from 1966 ("DRC66"), a nonrecombinant sister lineage to subtype C that constitutes the oldest HIV-1 near full-length genome recovered to date. Root-to-tip plots showed the DRC66 sequence is not an outlier as would be expected if dating estimates from more recent genomes were systematically biased; and inclusion of the DRC66 sequence in tip-dated BEAST analyses did not significantly alter root and internal node age estimates based on post-1978 HIV-1 sequences. There was larger variation in divergence time estimates among datasets that were subsamples of the available HIV-1 genomes from 1978 to 2014, showing the inherent phylogenetic stochasticity across subsets of the real HIV-1 diversity. Our phylogenetic analyses date the origin of the pandemic lineage of HIV-1 to a time period around the turn of the 20th century (1881 to 1918). In conclusion, this unique archival HIV-1 sequence provides direct genomic insight into HIV-1 in 1960s DRC, and, as an ancient-DNA calibrator, it validates our understanding of HIV-1 evolutionary history.
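
The root-to-tip check mentioned in this abstract can be illustrated with a short Python sketch. It is not the authors' analysis. It assumes that each sequence has a known sampling date and a root-to-tip genetic distance taken from some rooted tree, and it fits an ordinary least-squares line through those points. The function name and the example numbers are hypothetical.

```python
def root_to_tip_regression(dates, distances):
    """Least-squares fit of root-to-tip distance against sampling date.

    Returns (slope, root_date, residuals):
      slope     - fitted rate in substitutions per site per year
      root_date - date at which the fitted distance reaches zero
      residuals - observed minus fitted distance for each sequence
    """
    n = len(dates)
    mean_x = sum(dates) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, distances))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    root_date = -intercept / slope
    residuals = [y - (slope * x + intercept) for x, y in zip(dates, distances)]
    return slope, root_date, residuals


# Hypothetical, made-up values used only to show the call.
slope, root_date, residuals = root_to_tip_regression(
    [1980.0, 1990.0, 2000.0], [0.06, 0.07, 0.08]
)
```

In a plot of this kind, an archival sequence would stand out as an outlier if its residual were large relative to those of the other sequences. The abstract reports that this was not the case for DRC66.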

Source
DOI: http://dx.doi.org/10.1073/pnas.1913682117
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7275743
June 2020

Markov-Modulated Continuous-Time Markov Chains to Identify Site- and Branch-Specific Evolutionary Variation in BEAST.

Syst Biol 2021 Jan;70(1):181-189

Department of Biostatistics, Jonathan and Karin Fielding School of Public Health, University of California, Los Angeles, CA 90095, USA.

Markov models of character substitution on phylogenies form the foundation of phylogenetic inference frameworks. Early models made the simplifying assumption that the substitution process is homogeneous over time and across sites in the molecular sequence alignment. While standard practice adopts extensions that accommodate heterogeneity of substitution rates across sites, heterogeneity in the process over time in a site-specific manner remains frequently overlooked. This is problematic, as evolutionary processes that act at the molecular level are highly variable, subjecting different sites to different selective constraints over time, impacting their substitution behavior. We propose incorporating time variability through Markov-modulated models (MMMs), which extend covarion-like models and allow the substitution process (including relative character exchange rates as well as the overall substitution rate) at individual sites to vary across lineages. We implement a general MMM framework in BEAST, a popular Bayesian phylogenetic inference software package, allowing researchers to compose a wide range of MMMs through flexible XML specification. Using examples from bacterial, viral, and plastid genome evolution, we show that MMMs impact phylogenetic tree estimation and can substantially improve model fit compared to standard substitution models. Through simulations, we show that marginal likelihood estimation accurately identifies the generative model and does not systematically prefer the more parameter-rich MMMs. To mitigate the increased computational demands associated with MMMs, our implementation exploits recent developments in BEAGLE, a high-performance computational library for phylogenetic inference. [Bayesian inference; BEAGLE; BEAST; covarion, heterotachy; Markov-modulated models; phylogenetics.].

Source
DOI: http://dx.doi.org/10.1093/sysbio/syaa037
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7744037
January 2021

Genomic Epidemiology, Evolution, and Transmission Dynamics of Porcine Deltacoronavirus.

Mol Biol Evol 2020 09;37(9):2641-2654

MOE International Joint Collaborative Research Laboratory for Animal Health & Food Safety, Jiangsu Engineering Laboratory of Animal Immunology, Institute of Immunology, College of Veterinary Medicine, Nanjing Agricultural University, Nanjing, China.

The emergence of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has shown once again that coronaviruses (CoVs) in animals are potential sources for epidemics in humans. Porcine deltacoronavirus (PDCoV) is an emerging enteropathogen of swine with a worldwide distribution. Here, we implemented and described an approach to analyze the epidemiology of PDCoV following its emergence in the pig population. We performed an integrated analysis of full genome sequence data from 21 newly sequenced viruses, along with comprehensive epidemiological surveillance data collected globally over the last 15 years. We found four distinct phylogenetic lineages of PDCoV, which differ in their geographic circulation patterns. Interestingly, we identified more frequent intra- and interlineage recombination and higher virus genetic diversity in the Chinese lineages compared with the USA lineage where pigs are raised in different farming systems and ecological environments. Most recombination breakpoints are located in the ORF1ab gene rather than in genes encoding structural proteins. We also identified five amino acids under positive selection in the spike protein suggesting a role for adaptive evolution. According to structural mapping, three positively selected sites are located in the N-terminal domain of the S1 subunit, which is most likely involved in binding to a carbohydrate receptor, whereas the other two are located in or near the fusion peptide of the S2 subunit and thus might affect membrane fusion. Finally, our phylogeographic investigations highlighted notable South-North transmission as well as frequent long-distance dispersal events in China that could implicate human-mediated transmission. Our findings provide new insights into the evolution and dispersal of PDCoV that contribute to our understanding of the critical factors involved in CoV emergence.

Source
DOI: http://dx.doi.org/10.1093/molbev/msaa117
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7454817
September 2020

Assessing the role of live poultry trade in community-structured transmission of avian influenza in China.

Proc Natl Acad Sci U S A 2020 03 2;117(11):5949-5954. Epub 2020 Mar 2.

State Key Laboratory of Remote Sensing Science, College of Global Change and Earth System Science, Beijing Normal University, 100875 Beijing, China;

The live poultry trade is thought to play an important role in the spread and maintenance of highly pathogenic avian influenza A viruses (HP AIVs) in Asia. Despite an abundance of small-scale observational studies, the role of the poultry trade in disseminating AIV over large geographic areas is still unclear, especially for developing countries with complex poultry production systems. Here we combine virus genomes and reconstructed poultry transportation data to measure and compare the spatial spread in China of three key subtypes of AIV: H5N1, H7N9, and H5N6. Although it is difficult to disentangle the contribution of confounding factors, such as bird migration and spatial distance, we find evidence that the dissemination of these subtypes among domestic poultry is geographically continuous and likely associated with the intensity of the live poultry trade in China. Using two independent data sources and network analysis methods, we report a regional-scale community structure in China that might explain the spread of AIV subtypes in the country. The identification of this structure has the potential to inform more targeted strategies for the prevention and control of AIV in China.

Source
DOI: http://dx.doi.org/10.1073/pnas.1906954117
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7084072
March 2020

Online Bayesian Phylodynamic Inference in BEAST with Application to Epidemic Reconstruction.

Mol Biol Evol 2020 06;37(6):1832-1842

Department of Microbiology, Immunology and Transplantation, Rega Institute, KU Leuven, Leuven, Belgium.

Reconstructing pathogen dynamics from genetic data as they become available during an outbreak or epidemic represents an important statistical scenario in which observations arrive sequentially in time and one is interested in performing inference in an "online" fashion. Widely used Bayesian phylogenetic inference packages are not set up for this purpose, generally requiring one to recompute trees and evolutionary model parameters de novo when new data arrive. To accommodate increasing data flow in a Bayesian phylogenetic framework, we introduce a methodology to efficiently update the posterior distribution with newly available genetic data. Our procedure is implemented in the BEAST 1.10 software package, and relies on a distance-based measure to insert new taxa into the current estimate of the phylogeny and imputes plausible values for new model parameters to accommodate growing dimensionality. This augmentation creates informed starting values and re-uses optimally tuned transition kernels for posterior exploration of growing data sets, reducing the time necessary to converge to target posterior distributions. We apply our framework to data from the recent West African Ebola virus epidemic and demonstrate a considerable reduction in time required to obtain posterior estimates at different time points of the outbreak. Beyond epidemic monitoring, this framework easily finds other applications within the phylogenetics community, where changes in the data (in terms of alignment changes, sequence addition or removal) present common scenarios that can benefit from online inference.

Source
DOI: http://dx.doi.org/10.1093/molbev/msaa047
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7253210
June 2020

Evaluating predictive markers for viral rebound and safety assessment in blood and lumbar fluid during HIV-1 treatment interruption.

J Antimicrob Chemother 2020 05;75(5):1311-1320

HIV Cure Research Center, Department of Internal Medicine and Paediatrics, Faculty of Medicine and Health Sciences, Ghent University and Ghent University Hospital, Corneel Heymanslaan 10, 9000 Ghent, Belgium.

Background: Validated biomarkers to evaluate HIV-1 cure strategies are currently lacking, therefore requiring analytical treatment interruption (ATI) in study participants. Little is known about the safety of ATI and its long-term impact on patient health.

Objectives: ATI safety was assessed and potential biomarkers predicting viral rebound were evaluated.

Methods: PBMCs, plasma and CSF were collected from 11 HIV-1-positive individuals at four different timepoints during ATI (NCT02641756). Total and integrated HIV-1 DNA, cell-associated (CA) HIV-1 RNA transcripts and restriction factor (RF) expression were measured by PCR-based assays. Markers of neuroinflammation and neuronal injury [neurofilament light chain (NFL) and YKL-40 protein] were measured in CSF. Additionally, neopterin, tryptophan and kynurenine were measured, both in plasma and CSF, as markers of immune activation.

Results: Total HIV-1 DNA, integrated HIV-1 DNA and CA viral RNA transcripts did not differ pre- and post-ATI. Similarly, no significant NFL or YKL-40 increases in CSF were observed between baseline and viral rebound. Furthermore, markers of immune activation did not increase during ATI. Interestingly, the RFs SLFN11 and APOBEC3G increased after ATI before viral rebound. Similarly, Tat-Rev transcripts were increased preceding viral rebound after interruption.

Conclusions: ATI did not increase viral reservoir size and it did not reveal signs of increased neuronal injury or inflammation, suggesting that these well-monitored ATIs are safe. Elevation of Tat-Rev transcription and induced expression of the RFs SLFN11 and APOBEC3G after ATI, prior to viral rebound, indicates that these factors could be used as potential biomarkers predicting viral rebound.

Source
DOI: http://dx.doi.org/10.1093/jac/dkaa003
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7869780
May 2020

In Search of Covariates of HIV-1 Subtype B Spread in the United States-A Cautionary Tale of Large-Scale Bayesian Phylogeography.

Viruses 2020 02 5;12(2). Epub 2020 Feb 5.

Department of Microbiology, Immunology and Transplantation, Rega Institute for Medical Research, KU Leuven, 3000 Leuven, Belgium.

Infections with HIV-1 group M subtype B viruses account for the majority of the HIV epidemic in the Western world. Phylogeographic studies have placed the introduction of subtype B in the United States in New York around 1970, where it grew into a major source of spread. Currently, it is estimated that over one million people are living with HIV in the US and that most are infected with subtype B variants. Here, we aim to identify the drivers of HIV-1 subtype B dispersal in the United States by analyzing a collection of 23,588 pol sequences, collected for drug resistance testing from 45 states during 2004-2011. To this end, we introduce a workflow to reduce this large collection of data to more computationally-manageable sample sizes and apply the BEAST framework to test which covariates associate with the spread of HIV-1 across state borders. Our results show that we are able to consistently identify certain predictors of spread under reasonable run times across datasets of up to 10,000 sequences. However, the general lack of phylogenetic structure and the high uncertainty associated with HIV trees make it difficult to interpret the epidemiological relevance of the drivers of spread we are able to identify. While the workflow we present here could be applied to other virus datasets of a similar scale, the characteristic star-like shape of HIV-1 phylogenies poses a serious obstacle to reconstructing a detailed evolutionary and spatial history for HIV-1 subtype B in the US.

Source
DOI: http://dx.doi.org/10.3390/v12020182
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7077180
February 2020

Symptom evolution following the emergence of maize streak virus.

Elife 2020 Jan 15;9. Epub 2020 Jan 15.

Computational Biology Division, Department of Integrative Biomedical Sciences, Institute of Infectious Diseases and Molecular Medicine, University of Cape Town, Observatory, Cape Town, South Africa.

For pathogens infecting single host species, evolutionary trade-offs have previously been demonstrated between pathogen-induced mortality rates and transmission rates. It remains unclear, however, how such trade-offs impact sub-lethal pathogen-inflicted damage, and whether these trade-offs even occur in broad host-range pathogens. Here, we examine changes over the past 110 years in symptoms induced in maize by the broad host-range pathogen, maize streak virus (MSV). Specifically, we use the quantified symptom intensities of cloned MSV isolates in differentially resistant maize genotypes to phylogenetically infer ancestral symptom intensities and check for phylogenetic signal associated with these symptom intensities. We show that whereas symptoms reflecting harm to the host have remained constant or decreased, there has been an increase in how extensively MSV colonizes the cells upon which transmission vectors feed. This demonstrates an evolutionary trade-off between amounts of pathogen-inflicted harm and how effectively viruses position themselves within plants to enable onward transmission.

Source
DOI: http://dx.doi.org/10.7554/eLife.51984
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7034976
January 2020