Publications by authors named "Antonio Barbadilla"

27 Publications

Germline de novo mutation rates on exons versus introns in humans.

Nat Commun 2020 07 3;11(1):3304. Epub 2020 Jul 3.

Institute of Biotechnology and Biomedicine, Universitat Autònoma de Barcelona, Bellaterra (Cerdanyola del Vallès), 08193, Barcelona, Spain.

A main assumption of molecular population genetics is that genomic mutation rate does not depend on sequence function. Challenging this assumption, a recent study has found a reduction in the mutation rate in exons compared to introns in somatic cells, ascribed to an enhanced exonic mismatch repair system activity. If this reduction happens also in the germline, it can compromise studies of population genomics, including the detection of selection when using introns as proxies for neutrality. Here we compile and analyze published germline de novo mutation data to test if the exonic mutation rate is also reduced in germ cells. After controlling for sampling bias in datasets with diseased probands and extended nucleotide context dependency, we find no reduction in the mutation rate in exons compared to introns in the germline. Therefore, there is no evidence that enhanced exonic mismatch repair activity determines the mutation rate in germline cells.

Source
DOI: http://dx.doi.org/10.1038/s41467-020-17162-z
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7334200
July 2020

iMKT: the integrative McDonald and Kreitman test.

Nucleic Acids Res 2019 07;47(W1):W283-W288

Institut de Biotecnologia i de Biomedicina and Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, 08193 Bellaterra, Barcelona, Spain.

The McDonald and Kreitman test (MKT) is one of the most powerful and widely used methods to detect and quantify recurrent natural selection using DNA sequence data. Here we present iMKT (acronym for integrative McDonald and Kreitman test), a novel web-based service performing four distinct MKT types. It allows the detection and estimation of four different selection regimes (adaptive, neutral, strongly deleterious and weakly deleterious) acting on any genomic sequence. iMKT can analyze both users' own population genomic data and pre-loaded Drosophila melanogaster and human sequences of protein-coding genes obtained from the largest population genomic datasets to date. Advanced options on the website allow testing complex hypotheses such as the application example shown here: do genes located in high recombination regions undergo higher rates of adaptation? We aim for iMKT to become a reference site for the study of evolutionary adaptation in massive population genomics datasets, especially in Drosophila and humans. iMKT is a free resource online at https://imkt.uab.cat.

Source
DOI: http://dx.doi.org/10.1093/nar/gkz372
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6602517
July 2019

Adaptation and Conservation throughout the Drosophila melanogaster Life-Cycle.

Genome Biol Evol 2019 05;11(5):1463-1482

Genomics, Bioinformatics and Evolution, Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, Cerdanyola del Vallès, Spain.

Previous studies of the evolution of genes expressed at different life-cycle stages of Drosophila melanogaster have not been able to disentangle adaptive from nonadaptive substitutions when using nonsynonymous sites. Here, we overcome this limitation by combining whole-genome polymorphism data from D. melanogaster and divergence data between D. melanogaster and Drosophila yakuba. For the set of genes expressed at different life-cycle stages of D. melanogaster, as reported in modENCODE, we estimate the ratio of substitutions relative to polymorphism between nonsynonymous and synonymous sites (α) and then α is decomposed into the ratio of adaptive (ωa) and nonadaptive (ωna) substitutions to synonymous substitutions. We find that the genes expressed in mid- and late-embryonic development are the most conserved, whereas those expressed in early development and postembryonic stages are the least conserved. Importantly, we find that low conservation in early development is due to high rates of nonadaptive substitutions (high ωna), whereas in postembryonic stages it is due, instead, to high rates of adaptive substitutions (high ωa). By using estimates of different genomic features (codon bias, average intron length, exon number, recombination rate, among others), we also find that genes expressed in mid- and late-embryonic development show the most complex architecture: they are larger, have more exons, more transcripts, and longer introns. In addition, these genes are broadly expressed among all stages. We suggest that all these genomic features are related to the conservation of mid- and late-embryonic development. Globally, our study supports the hourglass pattern of conservation and adaptation over the life-cycle.
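
Editorial note: the decomposition mentioned in the abstract is commonly written as below. This is a sketch under the assumption that the paper follows the usual convention, where ω = dN/dS is the ratio of nonsynonymous to synonymous divergence and α is the proportion of nonsynonymous substitutions that are adaptive; the symbols dN, dS and ω are not defined in the abstract itself.

```latex
\omega = \frac{d_N}{d_S}, \qquad
\omega_a = \alpha\,\omega, \qquad
\omega_{na} = (1-\alpha)\,\omega, \qquad
\omega = \omega_a + \omega_{na}
```

Under this convention, two gene sets with the same ω can differ in whether their divergence is mostly adaptive (high ωa) or mostly nonadaptive (high ωna), which is the contrast the abstract draws between postembryonic and early-development genes.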

Source
DOI: http://dx.doi.org/10.1093/gbe/evz086
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6535812
May 2019

PopHumanScan: the online catalog of human genome adaptation.

Nucleic Acids Res 2019 01;47(D1):D1080-D1089

Institut de Biotecnologia i de Biomedicina and Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, 08193 Bellaterra, Barcelona, Spain.

Since the migrations that led humans to colonize Earth, our species has faced frequent adaptive challenges that have left signatures in the landscape of genetic variation and that we can identify in our genomes today. Here, we (i) perform an outlier approach on eight different population genetic statistics for 22 non-admixed human populations of the Phase III of the 1000 Genomes Project to detect selective sweeps at different historical ages, as well as events of recurrent positive selection in the human lineage; and (ii) create PopHumanScan, an online catalog that compiles and annotates all candidate regions under selection to facilitate their validation and thorough analysis. Well-known examples of human genetic adaptation published elsewhere are included in the catalog, as well as hundreds of other attractive candidates that will require further investigation. Designed as a collaborative database, PopHumanScan aims to become a central repository to share information, guide future studies and help advance our understanding of how selection has shaped our genomes as a response to changes in the environment or lifestyle of human populations. PopHumanScan is open and freely available at https://pophumanscan.uab.cat.

Source
DOI: http://dx.doi.org/10.1093/nar/gky959
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6323894
January 2019

PopHuman: the human population genomics browser.

Nucleic Acids Res 2018 01;46(D1):D1003-D1010

Institut de Biotecnologia i de Biomedicina and Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, 08193 Bellaterra, Barcelona, Spain.

The 1000 Genomes Project (1000GP) represents the most comprehensive world-wide nucleotide variation data set so far in humans, providing the sequencing and analysis of 2504 genomes from 26 populations and reporting >84 million variants. The availability of this sequence data provides the human lineage with an invaluable resource for population genomics studies, allowing the testing of molecular population genetics hypotheses and eventually the understanding of the evolutionary dynamics of genetic variation in human populations. Here we present PopHuman, a new population genomics-oriented genome browser based on JBrowse that allows the interactive visualization and retrieval of an extensive inventory of population genetics metrics. Efficient and reliable parameter estimates have been computed using a novel pipeline that faces the unique features and limitations of the 1000GP data, and include a battery of nucleotide variation measures, divergence and linkage disequilibrium parameters, as well as different tests of neutrality, estimated in non-overlapping windows along the chromosomes and in annotated genes for all 26 populations of the 1000GP. PopHuman is open and freely available at http://pophuman.uab.cat.

Source
DOI: http://dx.doi.org/10.1093/nar/gkx943
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5753332
January 2018

Mapping Selection within Drosophila melanogaster Embryo's Anatomy.

Mol Biol Evol 2018 01;35(1):66-79

Evo-devo Helsinki Community, Centre of Excellence in Experimental and Computational Developmental Biology, Institute of Biotechnology, University of Helsinki, Helsinki, Finland.

We present a survey of selection across Drosophila melanogaster embryonic anatomy. Our approach integrates genomic variation, spatial gene expression patterns, and development with the aim of mapping adaptation over the entire embryo's anatomy. Our adaptation map is based on analyzing spatial gene expression information for 5,969 genes (from text-based annotations of in situ hybridization data directly from the BDGP database, Tomancak et al. 2007) and the polymorphism and divergence in these genes (from the project DGRP, Mackay et al. 2012). The proportions of nonsynonymous substitutions that are adaptive, neutral, or slightly deleterious are estimated for the set of genes expressed in each embryonic anatomical structure using the distribution of fitness effects-alpha method (Eyre-Walker and Keightley 2009). This method is a robust derivative of the McDonald and Kreitman test (McDonald and Kreitman 1991). We also explore whether different anatomical structures differ in the phylogenetic age, codon usage, or expression bias of the genes they express and whether genes expressed in many anatomical structures show more adaptive substitutions than other genes. We found that: 1) most of the digestive system and ectoderm-derived structures are under selective constraint, 2) the germ line and some specific mesoderm-derived structures show high rates of adaptive substitution, and 3) the genes that are expressed in a small number of anatomical structures show higher expression bias, lower phylogenetic ages, and less constraint.

Source
DOI: http://dx.doi.org/10.1093/molbev/msx266
January 2018

PopFly: the Drosophila population genomics browser.

Bioinformatics 2017 Sep;33(17):2779-2780

Institut de Biotecnologia i de Biomedicina and Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, Cerdanyola del Vallès 08193, Spain.

Summary: The recent compilation of over 1100 worldwide wild-derived Drosophila melanogaster genome sequences reassembled using a standardized pipeline provides a unique resource for population genomic studies (Drosophila Genome Nexus, DGN). A visual display of the estimated metrics describing genome-wide variation and selection patterns would allow gaining a global view and understanding of the evolutionary forces shaping genome variation.

Availability and Implementation: Here, we present PopFly, a population genomics-oriented genome browser, based on JBrowse software, that contains a complete inventory of population genomic parameters estimated from DGN data. This browser is designed for the automatic analysis and display of genetic variation data within and between populations along the D. melanogaster genome. PopFly allows the visualization and retrieval of functional annotations, estimates of nucleotide diversity metrics, linkage disequilibrium statistics, recombination rates, a battery of neutrality tests, and population differentiation parameters at different window sizes through the euchromatic chromosomes. PopFly is open and freely available at http://popfly.uab.cat.

Contact: sergi.hervas@uab.cat or antonio.barbadilla@uab.cat.

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/btx301
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5860067
September 2017

Molecular Population Genetics.

Genetics 2017 Mar;205(3):1003-1035

Institut de Biotecnologia i de Biomedicina, Universitat Autònoma de Barcelona, 08193, Spain

Molecular population genetics aims to explain genetic variation and molecular evolution from population genetics principles. The field was born 50 years ago with the first measures of genetic variation in allozyme loci, continued with the nucleotide sequencing era, and is currently in the era of population genomics. During this period, molecular population genetics has been revolutionized by progress in data acquisition and theoretical developments. The conceptual elegance of the neutral theory of molecular evolution or the footprint carved by natural selection on the patterns of genetic variation are two examples of the vast number of inspiring findings of population genetics research. Since the inception of the field, Drosophila has been the prominent model species: molecular variation in populations was first described in Drosophila pseudoobscura and most of the population genetics hypotheses were tested in Drosophila species. In this review, we describe the main concepts, methods, and landmarks of molecular population genetics, using the Drosophila model as a reference. We describe the different genetic data sets made available by advances in molecular technologies, and the theoretical developments fostered by these data. Finally, we review the results and new insights provided by the population genomics approach, and conclude by enumerating challenges and new lines of inquiry posed by increasingly large population scale sequence data.

Source
DOI: http://dx.doi.org/10.1534/genetics.116.196493
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5340319
March 2017

Genetic polymorphisms of FAS and EVER genes in a Greek population and their susceptibility to cervical cancer: a case control study.

BMC Cancer 2016 11 29;16(1):923. Epub 2016 Nov 29.

4th University Clinic of Obstetrics and Gynecology, Aristotle University of Thessaloniki, "Hippokrateion" General Hospital of Thessaloniki, Thessaloniki, Greece.

Background: The aim of the study was to evaluate the association of two SNPs of EVER1/2 genes' region (rs2290907, rs16970849) and the FAS-670 polymorphism with the susceptibility to precancerous lesions and cervical cancer in a Greek population.

Methods: Among the 515 women who were included in the statistical analysis, 113 belong to the case group and present with precancerous lesions or cervical cancer (27 with persistent CIN1, 66 with CIN2/3 and 20 with cervical cancer) and 402 belong to the control group. The chi-squared test was used to compare the case and the control groups with an allelic and a genotype-based analysis.

Results: The comparison of the case and control groups was not statistically significant for any of the SNPs tested. A borderline significant difference (p value = 0.079) was found only under the allelic model, between the control group and the CIN1/CIN2 patients' subgroup for the polymorphism rs16970849. The comparison of the other case subgroups with the control group did not show any statistically significant difference.

Conclusions: None of the SNPs included in the study can be associated with statistical significance with the development of precancerous lesions or cervical cancer.

Source
DOI: http://dx.doi.org/10.1186/s12885-016-2960-3
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5129199
November 2016

Adaptive Evolution Is Substantially Impeded by Hill-Robertson Interference in Drosophila.

Mol Biol Evol 2016 Feb 22;33(2):442-55. Epub 2015 Oct 22.

Centre for the Study of Evolution, School of Life Sciences, University of Sussex, Brighton, United Kingdom

Hill-Robertson interference (HRi) is expected to reduce the efficiency of natural selection when two or more linked selected sites do not segregate freely, but no attempt has been made so far to quantify the overall impact of HRi on the rate of adaptive evolution for any given genome. In this work, we estimate how much HRi impedes the rate of adaptive evolution in the coding genome of Drosophila melanogaster. We compiled a data set of 6,141 autosomal protein-coding genes from Drosophila, from which polymorphism levels in D. melanogaster and divergence out to D. yakuba were estimated. The rate of adaptive evolution was calculated using a derivative of the McDonald-Kreitman test that controls for slightly deleterious mutations. We find that the rate of adaptive amino acid substitution at a given position of the genome is positively correlated to both the rate of recombination and the mutation rate, and negatively correlated to the gene density of the region. These correlations are robust to controlling for each other, for synonymous codon bias and for gene functions related to immune response and testes. We show that HRi diminishes the rate of adaptive evolution by approximately 27%. Interestingly, genes with low mutation rates embedded in gene poor regions lose approximately 17% of their adaptive substitutions whereas genes with high mutation rates embedded in gene rich regions lose approximately 60%. We conclude that HRi hampers the rate of adaptive evolution in Drosophila and that the variation in recombination, mutation, and gene density along the genome affects the HRi effect.

Source
DOI: http://dx.doi.org/10.1093/molbev/msv236
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4794616
February 2016

Genomics of ecological adaptation in cactophilic Drosophila.

Genome Biol Evol 2014 Dec 31;7(1):349-66. Epub 2014 Dec 31.

Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, Spain

Cactophilic Drosophila species provide a valuable model to study gene-environment interactions and ecological adaptation. Drosophila buzzatii and Drosophila mojavensis are two cactophilic species that belong to the repleta group, but have very different geographical distributions and primary host plants. To investigate the genomic basis of ecological adaptation, we sequenced the genome and developmental transcriptome of D. buzzatii and compared its gene content with that of D. mojavensis and two other noncactophilic Drosophila species in the same subgenus. The newly sequenced D. buzzatii genome (161.5 Mb) comprises 826 scaffolds (>3 kb) and contains 13,657 annotated protein-coding genes. Using RNA sequencing data of five life-stages we found expression of 15,026 genes, 80% protein-coding genes, and 20% noncoding RNA genes. In total, we detected 1,294 genes putatively under positive selection. Interestingly, among genes under positive selection in the D. mojavensis lineage, there is an excess of genes involved in metabolism of heterocyclic compounds that are abundant in Stenocereus cacti and toxic to nonresident Drosophila species. We found 117 orphan genes in the shared D. buzzatii-D. mojavensis lineage. In addition, gene duplication analysis identified lineage-specific expanded families with functional annotations associated with proteolysis, zinc ion binding, chitin binding, sensory perception, ethanol tolerance, immunity, physiology, and reproduction. In summary, we identified genetic signatures of adaptation in the shared D. buzzatii-D. mojavensis lineage, and in the two separate D. buzzatii and D. mojavensis lineages. Many of the novel lineage-specific genomic features are promising candidates for explaining the adaptation of these species to their distinct ecological niches.

Source
DOI: http://dx.doi.org/10.1093/gbe/evu291
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4316639
December 2014

Natural variation in genome architecture among 205 Drosophila melanogaster Genetic Reference Panel lines.

Genome Res 2014 Jul 8;24(7):1193-208. Epub 2014 Apr 8.

Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27595, USA.

The Drosophila melanogaster Genetic Reference Panel (DGRP) is a community resource of 205 sequenced inbred lines, derived to improve our understanding of the effects of naturally occurring genetic variation on molecular and organismal phenotypes. We used an integrated genotyping strategy to identify 4,853,802 single nucleotide polymorphisms (SNPs) and 1,296,080 non-SNP variants. Our molecular population genomic analyses show higher deletion than insertion mutation rates and stronger purifying selection on deletions. Weaker selection on insertions than deletions is consistent with our observed distribution of genome size determined by flow cytometry, which is skewed toward larger genomes. Insertion/deletion and single nucleotide polymorphisms are positively correlated with each other and with local recombination, suggesting that their nonrandom distributions are due to hitchhiking and background selection. Our cytogenetic analysis identified 16 polymorphic inversions in the DGRP. Common inverted and standard karyotypes are genetically divergent and account for most of the variation in relatedness among the DGRP lines. Intriguingly, variation in genome size and many quantitative traits are significantly associated with inversions. Approximately 50% of the DGRP lines are infected with Wolbachia, and four lines have germline insertions of Wolbachia sequences, but effects of Wolbachia infection on quantitative traits are rarely significant. The DGRP complements ongoing efforts to functionally annotate the Drosophila genome. Indeed, 15% of all D. melanogaster genes segregate for potentially damaged proteins in the DGRP, and genome-wide analyses of quantitative traits identify novel candidate genes. The DGRP lines, sequence data, genotypes, quality scores, phenotypes, and analysis and visualization tools are publicly available.

Source
DOI: http://dx.doi.org/10.1101/gr.171546.113
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4079974
July 2014

InvFEST, a database integrating information of polymorphic inversions in the human genome.

Nucleic Acids Res 2014 Jan 18;42(Database issue):D1027-32. Epub 2013 Nov 18.

Institut de Biotecnologia i de Biomedicina, Universitat Autònoma de Barcelona, Bellaterra, Barcelona, Spain, Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, Bellaterra, Barcelona, Spain and Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain.

The newest genomic advances have uncovered an unprecedented degree of structural variation throughout genomes, with great amounts of data accumulating rapidly. Here we introduce InvFEST (http://invfestdb.uab.cat), a database combining multiple sources of information to generate a complete catalogue of non-redundant human polymorphic inversions. Due to the complexity of this type of changes and the underlying high false-positive discovery rate, it is necessary to integrate all the available data to get a reliable estimate of the real number of inversions. InvFEST automatically merges predictions into different inversions, refines the breakpoint locations, and finds associations with genes and segmental duplications. In addition, it includes data on experimental validation, population frequency, functional effects and evolutionary history. All this information is readily accessible through a complete and user-friendly web report for each inversion. In its current version, InvFEST combines information from 34 different studies and contains 1092 candidate inversions, which are categorized based on internal scores and manual curation. Therefore, InvFEST aims to represent the most reliable set of human inversions and become a central repository to share information, guide future studies and contribute to the analysis of the functional and evolutionary impact of inversions on the human genome.

Source
DOI: http://dx.doi.org/10.1093/nar/gkt1122
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3965118
January 2014

The Drosophila melanogaster Genetic Reference Panel.

Nature 2012 Feb 8;482(7384):173-8. Epub 2012 Feb 8.

Department of Genetics, North Carolina State University, Raleigh, North Carolina 27695, USA.

A major challenge of biology is understanding the relationship between molecular genetic variation and variation in quantitative traits, including fitness. This relationship determines our ability to predict phenotypes from genotypes and to understand how evolutionary forces shape variation within and between species. Previous efforts to dissect the genotype-phenotype map were based on incomplete genotypic information. Here, we describe the Drosophila melanogaster Genetic Reference Panel (DGRP), a community resource for analysis of population genomics and quantitative traits. The DGRP consists of fully sequenced inbred lines derived from a natural population. Population genomic analyses reveal reduced polymorphism in centromeric autosomal regions and the X chromosome, evidence for positive and negative selection, and rapid evolution of the X chromosome. Many variants in novel genes, most at low frequency, are associated with quantitative traits and explain a large fraction of the phenotypic variance. The DGRP facilitates genotype-phenotype mapping using the power of Drosophila genetics.

Source
DOI: http://dx.doi.org/10.1038/nature10811
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3683990
February 2012

PopDrowser: the Population Drosophila Browser.

Bioinformatics 2012 Feb 15;28(4):595-6. Epub 2011 Dec 15.

Institut de Biotecnologia i de Biomedicina and Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, 08193 Bellaterra, Barcelona, Spain.

Motivation: The completion of 168 genome sequences from a single population of Drosophila melanogaster provides a global view of genomic variation and an understanding of the evolutionary forces shaping the patterns of DNA polymorphism and divergence along the genome.

Results: We present the 'Population Drosophila Browser' (PopDrowser), a new genome browser specially designed for the automatic analysis and representation of genetic variation across the D. melanogaster genome sequence. PopDrowser allows estimating and visualizing the values of a number of DNA polymorphism and divergence summary statistics, linkage disequilibrium parameters and several neutrality tests. PopDrowser also allows performing custom analyses on-the-fly using user-selected parameters.

Availability: PopDrowser is freely available from http://PopDrowser.uab.cat.

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/btr691
February 2012

Drosophila polymorphism database (DPDB): a portal for nucleotide polymorphism in Drosophila.

Fly (Austin) 2007 Jul-Aug;1(4):205-11. Epub 2007 Jul 17.

Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, Bellaterra, Barcelona, Spain.

As a growing number of haplotypic sequences from resequencing studies are now accumulating for Drosophila in the main primary sequence databases, collectively they can now be used to describe the general pattern of nucleotide variation across species and genes of this genus. The Drosophila Polymorphism Database (DPDB) is a secondary database that provides a collection of all well-annotated polymorphic sequences in Drosophila together with their associated diversity measures and options for reanalysis of the data that greatly facilitate both multi-locus and multi-species diversity studies in one of the most important groups of model organisms. Here we describe the state-of-the-art of the DPDB database and provide a step-by-step guide to all its searching and analytic capabilities. Finally, we illustrate its usefulness through selected examples. DPDB is freely available at http://dpdb.uab.cat.

Source
DOI: http://dx.doi.org/10.4161/fly.5043
November 2008

Standard and generalized McDonald-Kreitman test: a website to detect selection by comparing different classes of DNA sites.

Nucleic Acids Res 2008 Jul 30;36(Web Server issue):W157-62. Epub 2008 May 30.

Genomics, Bioinformatics and Evolution Group, Departament de Genètica i Microbiologia, Universitat Autònoma de Barcelona, 08193 Bellaterra (Barcelona), Spain.

The McDonald and Kreitman test (MKT) is one of the most powerful and extensively used tests to detect the signature of natural selection at the molecular level. Here, we present the standard and generalized MKT website, a novel website that allows performing MKTs not only for synonymous and nonsynonymous changes, as the test was initially described, but also for other classes of regions and/or several loci. The website has three different interfaces: (i) the standard MKT, where users can analyze several types of sites in a coding region, (ii) the advanced MKT, where users can compare two closely linked regions in the genome that can be either coding or noncoding, and (iii) the multi-locus MKT, where users can analyze many separate loci in a single multi-locus test. The website has already been used to show that selection efficiency is positively correlated with effective population size in the Drosophila genus and it has been applied to include estimates of selection in DPDB. This website is a timely resource, which will presumably be widely used by researchers in the field and will contribute to enlarge the catalogue of cases of adaptive evolution. It is available at http://mkt.uab.es.
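
Editorial note: as a rough illustration of the standard MKT described in this abstract, the sketch below builds the 2x2 table of divergent versus polymorphic, nonsynonymous versus synonymous changes, tests it with a two-sided Fisher exact test, and computes the widely used estimator α = 1 − (Ds·Pn)/(Dn·Ps). The function names and the counts are hypothetical and are not taken from the website or from any of the listed papers; the website may use a different test statistic.

```python
from math import comb


def mkt_alpha(dn: int, ds: int, pn: int, ps: int) -> float:
    """Proportion of adaptive nonsynonymous substitutions, alpha = 1 - (Ds*Pn)/(Dn*Ps)."""
    if dn == 0 or ps == 0:
        raise ValueError("alpha is undefined when Dn or Ps is zero")
    return 1.0 - (ds * pn) / (dn * ps)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Sum every table that is at least as unlikely as the observed one.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts for one gene (illustration only).
dn, ds = 7, 17   # fixed differences: nonsynonymous, synonymous
pn, ps = 2, 42   # polymorphisms: nonsynonymous, synonymous

alpha = mkt_alpha(dn, ds, pn, ps)
p_value = fisher_exact_two_sided(dn, ds, pn, ps)
print(f"alpha = {alpha:.3f}, Fisher exact p = {p_value:.4f}")
```

A positive α with a significant test suggests an excess of nonsynonymous divergence relative to polymorphism, which is the pattern the MKT interprets as adaptive evolution; slightly deleterious polymorphisms bias α downward, which is what the generalized and derivative methods cited elsewhere in this list try to correct.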

Source
DOI: http://dx.doi.org/10.1093/nar/gkn337
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2447769
July 2008

Purifying selection maintains highly conserved noncoding sequences in Drosophila.

Mol Biol Evol 2007 Oct 23;24(10):2222-34. Epub 2007 Jul 23.

Faculty of Life Sciences, University of Manchester, Michael Smith Building, Manchester M13 9PT, UK.

The majority of metazoan genomes consist of nonprotein-coding regions, although the functional significance of most noncoding DNA sequences remains unknown. Highly conserved noncoding sequences (CNSs) have proven to be reliable indicators of functionally constrained sequences such as cis-regulatory elements and noncoding RNA genes. However, CNSs may arise from nonselective evolutionary processes such as genomic regions with extremely low mutation rates known as mutation "cold spots." Here we combine comparative genomic data from recently completed insect genome projects with population genetic data in Drosophila melanogaster to test predictions of the mutational cold spot model of CNS evolution in the genus Drosophila. We find that point mutations in intronic and intergenic CNSs exhibit a significant reduction in levels of divergence relative to levels of polymorphism, as well as a significant excess of rare derived alleles, compared with either the nonconserved spacer regions between CNSs or with 4-fold silent sites in coding regions. Controlling for the effects of purifying selection, we find no evidence of positive selection acting on Drosophila CNSs, although we do find evidence for the action of recurrent positive selection in the spacer regions between CNSs. We estimate that approximately 85% of sites in Drosophila CNSs are under constraint with selection coefficients (N(e)s) on the order of 10-100, and thus, the estimated strength and number of sites under purifying selection is greater for Drosophila CNSs relative to those in the human genome. These patterns of nonneutral molecular evolution are incompatible with the mutational cold spot hypothesis to explain the existence of CNSs in Drosophila and, coupled with similar findings in mammals, argue against the general likelihood that CNSs are generated by mutational cold spots in any metazoan genome.

Source
DOI: http://dx.doi.org/10.1093/molbev/msm150
October 2007

Protein polymorphism is negatively correlated with conservation of intronic sequences and complexity of expression patterns in Drosophila melanogaster.

J Mol Evol 2007 May 24;64(5):511-8. Epub 2007 Apr 24.

Departament de Genètica i Microbiologia, Facultat de Biociències, Universitat Autònoma de Barcelona, 08193, Bellaterra, Barcelona, Spain.

We report a significant negative correlation between nonsynonymous polymorphism and intron length in Drosophila melanogaster. This correlation is similar to that between protein divergence and intron length previously reported in Drosophila. We show that the relationship can be explained by the content of conserved noncoding sequences (CNS) within introns. In addition, genes with a high regulatory complexity and many genetic interactions also exhibit larger amounts of CNS within their introns and lower values of nonsynonymous polymorphism. The present study provides relevant evidence on the importance of intron content and expression patterns on the levels of coding polymorphism.

Source
DOI: http://dx.doi.org/10.1007/s00239-006-0047-5
May 2007

Fast sequence evolution of Hox and Hox-derived genes in the genus Drosophila.

BMC Evol Biol 2006 Dec 12;6:106. Epub 2006 Dec 12.

Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, 08193 Bellaterra (Barcelona), Spain.

Background: It is expected that genes that are expressed early in development and have a complex expression pattern are under strong purifying selection and thus evolve slowly. Hox genes fulfill these criteria and thus should have a low evolutionary rate. However, some observations point to a completely different scenario. Hox genes are usually highly conserved inside the homeobox, but very variable outside it.

Results: We have measured the rates of nucleotide divergence and indel fixation of three Hox genes, labial (lab), proboscipedia (pb) and abdominal-A (abd-A), and compared them with those of three genes derived by duplication from Hox3, bicoid (bcd), zerknüllt (zen) and zerknüllt-related (zen2), and 15 non-Hox genes in sets of orthologous sequences of three species of the genus Drosophila. These rates were compared to test the hypothesis that Hox genes evolve slowly. Our results show that the evolutionary rate of Hox genes is higher than that of non-Hox genes when both amino acid differences and indels are taken into account: 43.39% of the amino acid sequence is altered in Hox genes, versus 30.97% in non-Hox genes and 64.73% in Hox-derived genes. Microsatellites scattered along the coding sequence of Hox genes explain partially, but not fully, their fast sequence evolution.

Conclusion: These results show that Hox genes have higher evolutionary dynamics than other developmental genes, and emphasize the need to take into account indels in addition to nucleotide substitutions in order to accurately estimate evolutionary rates.

Source
DOI: http://dx.doi.org/10.1186/1471-2148-6-106
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1764764
December 2006

MamPol: a database of nucleotide polymorphism in the Mammalia class.

Nucleic Acids Res 2007 Jan 16;35(Database issue):D624-9. Epub 2006 Nov 16.

Departament de Genètica i Microbiologia, Universitat Autònoma de Barcelona, 08193 Bellaterra, Barcelona, Spain.

Multi-locus and multi-species nucleotide diversity studies would benefit enormously from a public database encompassing high-quality haplotypic sequences with their associated genetic diversity measures. MamPol, 'Mammalia Polymorphism Database', is a website containing all the well-annotated polymorphic sequences available in GenBank for the Mammalia class grouped by name of organism and gene. Diversity measures of single nucleotide polymorphisms are provided for each set of haplotypic homologous sequences, including polymorphism at synonymous and non-synonymous sites, linkage disequilibrium and codon bias. Data gathering, calculation of diversity measures and daily updates are automatically performed using PDA software. The MamPol website includes several interfaces for browsing the contents of the database and making customizable comparative searches of different species or taxonomic groups. It also contains a set of tools for simple re-analysis of the available data and a statistics section that is updated daily and summarizes the contents of the database. MamPol is available at http://mampol.uab.es/ and can be downloaded via FTP.

Source
DOI: http://dx.doi.org/10.1093/nar/gkl833
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1669741
January 2007

PDA v.2: improving the exploration and estimation of nucleotide polymorphism in large datasets of heterogeneous DNA.

Nucleic Acids Res 2006 Jul;34(Web Server issue):W632-4

Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, 08193 Bellaterra, Barcelona, Spain.

Pipeline Diversity Analysis (PDA) is an open-source, web-based tool that allows the exploration of polymorphism in large datasets of heterogeneous DNA sequences, and can be used to create secondary polymorphism databases for different taxonomic groups, such as the Drosophila Polymorphism Database (DPDB). A new version of the pipeline presented here, PDA v.2, incorporates substantial improvements, including new methods for data mining and grouping sequences, new criteria for data quality assessment and a better user interface. PDA is a powerful tool to obtain and synthesize existing empirical evidence on genetic diversity in any species or species group. PDA v.2 is available on the web at http://pda.uab.es/.

Source
DOI: http://dx.doi.org/10.1093/nar/gkl080
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1538800
July 2006

DPDB: a database for the storage, representation and analysis of polymorphism in the Drosophila genus.

Bioinformatics 2005 Sep;21 Suppl 2:ii26-30

Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, Bellaterra (Barcelona), Spain.

Motivation: Polymorphism studies are one of the main research areas of this genomic era. To date, however, no comprehensive secondary databases have been designed to provide searchable collections of polymorphic sequences with their associated diversity measures.

Results: We define a data model for the storage, representation and analysis of genotypic and haplotypic data. Under this model we have created DPDB, 'Drosophila Polymorphism Database', a web site that provides a daily updated repository of all well-annotated polymorphic sequences in the Drosophila genus. It allows the search for any polymorphic set according to different parameter values of nucleotide diversity, linkage disequilibrium and codon bias. For data collection, analysis and updating we use PDA, a pipeline that automates the process of sequence retrieval, grouping, alignment and estimation of nucleotide diversity from GenBank sequences in different functional regions. The web site also includes analysis tools for sequence comparison and the estimation of genetic diversity, a page with real-time statistics of the database contents, a help section and a collection of selected links.

Availability: DPDB is freely available at http://dpdb.uab.es and can be downloaded via FTP.

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/bti1103
September 2005

Conservation of regulatory sequences and gene expression patterns in the disintegrating Drosophila Hox gene complex.

Genome Res 2005 May;15(5):692-700

Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, 08193 Bellaterra, Barcelona, Spain.

Homeotic (Hox) genes are usually clustered and arranged in the same order as they are expressed along the anteroposterior body axis of metazoans. The mechanistic explanation for this colinearity has been elusive, and it may well be that a single and universal cause does not exist. The Hox-gene complex (HOM-C) has been rearranged differently in several Drosophila species, producing a striking diversity of Hox gene organizations. We investigated the genomic and functional consequences of the two HOM-C splits present in Drosophila buzzatii. Firstly, we sequenced two regions of the D. buzzatii genome, one containing the genes labial and abdominal A, and another one including proboscipedia, and compared their organization with that of D. melanogaster and D. pseudoobscura in order to map precisely the two splits. Then, a plethora of conserved noncoding sequences, which are putative enhancers, were identified around the three Hox genes closer to the splits. The position and order of these enhancers are conserved, with minor exceptions, between the three Drosophila species. Finally, we analyzed the expression patterns of the same three genes in embryos and imaginal discs of four Drosophila species with different Hox-gene organizations. The results show that their expression patterns are conserved despite the HOM-C splits. We conclude that, in Drosophila, Hox-gene clustering is not an absolute requirement for proper function. Rather, the organization of Hox genes is modular, and their clustering seems the result of phylogenetic inertia more than functional necessity.

Source
DOI: http://dx.doi.org/10.1101/gr.3468605
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1088297
May 2005

PDA: a pipeline to explore and estimate polymorphism in large DNA databases.

Nucleic Acids Res 2004 Jul;32(Web Server issue):W166-9

Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, 08193 Bellaterra, Barcelona, Spain.

Polymorphism studies are one of the main research areas of this genomic era. To date, however, no available web server or software package has been designed to automate the process of exploring and estimating nucleotide polymorphism in large DNA databases. Here, we introduce a novel software tool, PDA (Pipeline Diversity Analysis), that can automatically (i) search for polymorphic sequences in large databases, and (ii) estimate their genetic diversity. PDA is a collection of modules, mainly written in Perl, which works sequentially as follows: unaligned sequences retrieved from a DNA database are automatically classified by organism and gene, and aligned using the ClustalW algorithm. Sequence sets are regrouped depending on their similarity scores. Main diversity parameters, including polymorphism, synonymous and non-synonymous substitutions, linkage disequilibrium and codon bias, are estimated both for the full length of the sequences and for specific functional regions. Program output includes a database with all sequences and estimations, and HTML pages with summary statistics, the performed alignments and a histogram maker tool. PDA is an essential tool to explore polymorphism in large DNA databases for sequences from different genes, populations or species. It has already been successfully applied to create a secondary database. PDA is available on the web at http://pda.uab.es/.
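
The final estimation step of such a pipeline can be sketched as follows. This is a minimal illustration in Python, not PDA's own code (which the abstract says is mainly written in Perl): it assumes the sequences are already aligned, and it computes two standard diversity measures, nucleotide diversity (average pairwise differences per site) and the number of segregating sites. The function names and the choice to drop any column containing a gap or ambiguous base are assumptions made for this sketch.

```python
from itertools import combinations

VALID_BASES = set("ACGT")


def _complete_columns(aligned):
    """Return alignment columns with no gaps or ambiguous bases."""
    if len(aligned) < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    return [col for col in zip(*aligned) if set(col) <= VALID_BASES]


def nucleotide_diversity(aligned):
    """Average number of pairwise differences per compared site."""
    columns = _complete_columns(aligned)
    if not columns:
        raise ValueError("no comparable sites")
    pairs = list(combinations(range(len(aligned)), 2))
    diffs = sum(col[i] != col[j] for col in columns for i, j in pairs)
    return diffs / (len(pairs) * len(columns))


def segregating_sites(aligned):
    """Number of compared sites where more than one base is present."""
    return sum(len(set(col)) > 1 for col in _complete_columns(aligned))
```

Calling `nucleotide_diversity(["ACGT-", "ACGAA", "ACGTA"])` compares only the first four columns, because the fifth contains a gap. Other gap-handling policies, such as pairwise deletion, would give different values on the same data.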

Source
DOI: http://dx.doi.org/10.1093/nar/gkh428
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC441566
July 2004

Inversion length and breakpoint distribution in the Drosophila buzzatii species complex: is inversion length a selected trait?

Evolution 1997 Aug;51(4):1149-1155

Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, 08193, Bellaterra, Barcelona, Spain.

Length and position of breakpoints are characteristics of inversions that can be precisely determined on the polytene chromosomes of Drosophila species, and they provide crucial information about the processes that govern the origin and evolution of inversions. Eighty-six paracentric inversions described in the Drosophila buzzatii species complex and 18 inversions induced by introgressive hybridization in D. buzzatii were analyzed. In contrast to previous studies, inversion length and breakpoint distribution have been considered simultaneously. We conclude that: (1) inversion length is a selected trait; rare inversions are predominantly small while evolutionarily successful inversions, polymorphic and fixed, are predominantly intermediate in length; a nearly continuous variation in length, from small to medium sized, is found between less and more successful inversions; (2) there exists a significant negative correlation between length and number of polymorphic inversions per species which explains 39% of the inversion length variance; (3) natural selection on inversion length seems the main factor determining the relative position of breakpoints along the chromosomes; (4) the distribution of breakpoints according to their band location is non-random, with chromosomal segments that accumulate up to eight breakpoints.

Source
DOI: http://dx.doi.org/10.1111/j.1558-5646.1997.tb03962.x
August 1997

Mating pattern and fitness-component analysis associated with inversion polymorphism in a natural population of Drosophila buzzatii.

Evolution 1994 Jun;48(3):767-780

Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, 08193, Bellaterra (Barcelona), Spain.

Direct studies of mating success or mating pattern associated with Mendelian factors rarely have been carried out in nature. From the samples taken for the standard analyses of selection components, it is not usually possible to obtain the mating table, and only directional selection for male mating success can be detected. Both processes, mating pattern and differential mating probability, together with other fitness components, have been investigated for the inversion polymorphism of a natural population of the cactophilic species Drosophila buzzatii. Two independent samples of adult flies were collected: nonmating or single individuals (base population) and mating pairs (mating population). All individuals were karyotyped for the second and fourth chromosomes. A sequence of models with increasing simplicity was fitted to the data to test null hypotheses of no selection and random union of gametes and karyotypes. The main results were (1) no deviations from random mating were found; (2) differential mating probability was nonsignificant in both sexes; (3) inversion and karyotypic frequencies did not differ between sexes; and (4) karyotypic frequencies did not depart from Hardy-Weinberg expectations. These results are discussed in light of complementary evidence showing the need for interpreting with caution no-effect hypotheses such as the ones tested here. The use of complementary selective tests in these studies is suggested.
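
The comparison of karyotypic frequencies with Hardy-Weinberg expectations can be illustrated with a simple goodness-of-fit sketch. This is not the model-fitting procedure used in the study, which fitted a sequence of models to the data. It is the textbook chi-square check, and it assumes only two chromosomal arrangements, labelled A and B here, with hypothetical karyotype counts.

```python
def hardy_weinberg_chi2(n_aa, n_ab, n_bb):
    """Chi-square statistic for Hardy-Weinberg proportions with two arrangements.

    Returns (frequency of arrangement A, expected karyotype counts, chi-square).
    The statistic has one degree of freedom, since the frequency of A is
    estimated from the same data.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no individuals")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of arrangement A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation (a fixed arrangement).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return p, expected, chi2
```

With one degree of freedom, a statistic above about 3.84 would reject Hardy-Weinberg proportions at the 5% level. As the abstract itself cautions, failing to reject does not demonstrate the absence of selection.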

Source
DOI: http://dx.doi.org/10.1111/j.1558-5646.1994.tb01360.x
June 1994