Publications by authors named "Graham R S Ritchie"

25 Publications

Uganda Genome Resource Enables Insights into Population History and Genomic Discovery in Africa.

Cell 2019 10;179(4):984-1002.e36

Wellcome Sanger Institute, Hinxton, Cambridge, UK.

Genomic studies in African populations provide unique opportunities to understand disease etiology, human diversity, and population history. In the largest study of its kind, comprising genome-wide data from 6,400 individuals and whole-genome sequences from 1,978 individuals from rural Uganda, we find evidence of geographically correlated fine-scale population substructure. Historically, the ancestry of modern Ugandans was best represented by a mixture of ancient East African pastoralists. We demonstrate the value of the largest sequence panel from Africa to date as an imputation resource. Examining 34 cardiometabolic traits, we show systematic differences in trait heritability between European and African populations, probably reflecting the differential impact of genes and environment. In a multi-trait pan-African GWAS of up to 14,126 individuals, we identify novel loci associated with anthropometric, hematological, lipid, and glycemic traits. We find that several functionally important signals are driven by Africa-specific variants, highlighting the value of studying diverse populations across the region.

DOI: http://dx.doi.org/10.1016/j.cell.2019.10.004
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7202134
October 2019

GARFIELD classifies disease-relevant genomic features through integration of functional annotations with association signals.

Nat Genet 2019 02 28;51(2):343-353. Epub 2019 Jan 28.

Human Genetics, Wellcome Sanger Institute, Hinxton, UK.

Loci discovered by genome-wide association studies predominantly map outside protein-coding genes. The interpretation of the functional consequences of non-coding variants can be greatly enhanced by catalogs of regulatory genomic regions in cell lines and primary tissues. However, robust and readily applicable methods are still lacking by which to systematically evaluate the contribution of these regions to genetic variation implicated in diseases or quantitative traits. Here we propose a novel approach that leverages genome-wide association studies' findings with regulatory or functional annotations to classify features relevant to a phenotype of interest. Within our framework, we account for major sources of confounding not offered by current methods. We further assess enrichment of genome-wide association studies for 19 traits within Encyclopedia of DNA Elements- and Roadmap-derived regulatory regions. We characterize unique enrichment patterns for traits and annotations driving novel biological insights. The method is implemented in standalone software and an R package, to facilitate its application by the research community.

DOI: http://dx.doi.org/10.1038/s41588-018-0322-6
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6908448
February 2019

Evaluation of shared genetic aetiology between osteoarthritis and bone mineral density identifies SMAD3 as a novel osteoarthritis risk locus.

Hum Mol Genet 2017 10;26(19):3850-3858

Human Genetics, Wellcome Trust Sanger Institute, Hinxton CB10 1HH, UK.

Osteoarthritis (OA) is a common complex disease with high public health burden and no curative therapy. High bone mineral density (BMD) is associated with an increased risk of developing OA, suggesting a shared underlying biology. Here, we performed the first systematic overlap analysis of OA and BMD on a genome-wide scale. We used summary statistics from the GEFOS consortium for lumbar spine (n = 31,800) and femoral neck (n = 32,961) BMD, and from the arcOGEN consortium for three OA phenotypes (hip, n_cases = 3,498; knee, n_cases = 3,266; hip and/or knee, n_cases = 7,410; n_controls = 11,009). Performing LD score regression we found a significant genetic correlation between the combined OA phenotype (hip and/or knee) and lumbar spine BMD (r_g = 0.18, P = 2.23 × 10(-2)), which may be driven by the presence of spinal osteophytes. We identified 143 variants with evidence for cross-phenotype association which we took forward for replication in independent large-scale OA datasets, and subsequent meta-analysis with arcOGEN for a total sample size of up to 23,425 cases and 236,814 controls. We found robustly replicating evidence for association with OA at rs12901071 (OR 1.08, 95% CI 1.05-1.11, P_meta = 3.12 × 10(-10)), an intronic variant in the SMAD3 gene, which is known to play a role in bone remodeling and cartilage maintenance. We were able to confirm expression of SMAD3 in intact and degraded cartilage of the knee and hip. Our findings provide the first systematic evaluation of pleiotropy between OA and BMD, highlight genes with biological relevance to both traits, and establish a robust new OA genetic risk locus at SMAD3.

DOI: http://dx.doi.org/10.1093/hmg/ddx285
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5886098
October 2017

Integrative epigenomics, transcriptomics and proteomics of patient chondrocytes reveal genes and pathways involved in osteoarthritis.

Sci Rep 2017 08 21;7(1):8935. Epub 2017 Aug 21.

Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.

Osteoarthritis (OA) is a common disease characterized by cartilage degeneration and joint remodeling. The underlying molecular changes underpinning disease progression are incompletely understood. We investigated genes and pathways that mark OA progression in isolated primary chondrocytes taken from paired intact versus degraded articular cartilage samples across 38 patients undergoing joint replacement surgery (discovery cohort: 12 knee OA, replication cohorts: 17 knee OA, 9 hip OA patients). We combined genome-wide DNA methylation, RNA sequencing, and quantitative proteomics data. We identified 49 genes differentially regulated between intact and degraded cartilage in at least two -omics levels, 16 of which have not previously been implicated in OA progression. Integrated pathway analysis implicated the involvement of extracellular matrix degradation, collagen catabolism and angiogenesis in disease progression. Using independent replication datasets, we showed that the direction of change is consistent for over 90% of differentially expressed genes and differentially methylated CpG probes. AQP1, COL1A1 and CLEC3B were significantly differentially regulated across all three -omics levels, confirming their differential expression in human disease. Through integration of genome-wide methylation, gene and protein expression data in human primary chondrocytes, we identified consistent molecular players in OA progression that replicated across independent datasets and that have translational potential.

DOI: http://dx.doi.org/10.1038/s41598-017-09335-6
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5566454
August 2017

Whole-Genome Sequencing Coupled to Imputation Discovers Genetic Signals for Anthropometric Traits.

Am J Hum Genet 2017 Jun 25;100(6):865-884. Epub 2017 May 25.

Department of Epidemiology and Biostatistics, School of Public Health, Imperial College London, London W2 1PG, UK; Department of Cardiology, Ealing Hospital NHS Trust, Middlesex UB1 3EU, UK.

Deep sequence-based imputation can enhance the discovery power of genome-wide association studies by assessing previously unexplored variation across the common- and low-frequency spectra. We applied a hybrid whole-genome sequencing (WGS) and deep imputation approach to examine the broader allelic architecture of 12 anthropometric traits associated with height, body mass, and fat distribution in up to 267,616 individuals. We report 106 genome-wide significant signals that have not been previously identified, including 9 low-frequency variants pointing to functional candidates. Of the 106 signals, 6 are in genomic regions that have not been implicated with related traits before, 28 are independent signals at previously reported regions, and 72 represent previously reported signals for a different anthropometric trait. 71% of signals reside within genes and fine mapping resolves 23 signals to one or two likely causal variants. We confirm genetic overlap between human monogenic and polygenic anthropometric traits and find signal enrichment in cis expression QTLs in relevant tissues. Our results highlight the potential of WGS strategies to enhance biologically relevant discoveries across the frequency spectrum.

DOI: http://dx.doi.org/10.1016/j.ajhg.2017.04.014
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5473732
June 2017

The Ensembl Variant Effect Predictor.

Genome Biol 2016 06 6;17(1):122. Epub 2016 Jun 6.

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.

The Ensembl Variant Effect Predictor is a powerful toolset for the analysis, annotation, and prioritization of genomic variants in coding and non-coding regions. It provides access to an extensive collection of genomic annotation, with a variety of interfaces to suit different requirements, and simple options for configuring and extending analysis. It is open source, free to use, and supports full reproducibility of results. The Ensembl Variant Effect Predictor can simplify and accelerate variant interpretation in a wide range of study designs.

DOI: http://dx.doi.org/10.1186/s13059-016-0974-4
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4893825
June 2016

Punctuated bursts in human male demography inferred from 1,244 worldwide Y-chromosome sequences.

Nat Genet 2016 06 25;48(6):593-9. Epub 2016 Apr 25.

Department of Genetics, Harvard Medical School, Boston, Massachusetts, USA.

We report the sequences of 1,244 human Y chromosomes randomly ascertained from 26 worldwide populations by the 1000 Genomes Project. We discovered more than 65,000 variants, including single-nucleotide variants, multiple-nucleotide variants, insertions and deletions, short tandem repeats, and copy number variants. Of these, copy number variants contribute the greatest predicted functional impact. We constructed a calibrated phylogenetic tree on the basis of binary single-nucleotide variants and projected the more complex variants onto it, estimating the number of mutations for each class. Our phylogeny shows bursts of extreme expansion in male numbers that have occurred independently among each of the five continental superpopulations examined, at times of known migrations and technological innovations.

DOI: http://dx.doi.org/10.1038/ng.3559
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4884158
June 2016

Comparison of GENCODE and RefSeq gene annotation and the impact of reference geneset on variant effect prediction.

BMC Genomics 2015 18;16 Suppl 8:S2. Epub 2015 Jun 18.

Background: A vast amount of DNA variation is being identified by increasingly large-scale exome and genome sequencing projects. To be useful, variants require accurate functional annotation and a wide range of tools are available to this end. McCarthy et al recently demonstrated the large differences in prediction of loss-of-function (LoF) variation when RefSeq and Ensembl transcripts are used for annotation, highlighting the importance of the reference transcripts on which variant functional annotation is based.

Results: We describe a detailed analysis of the similarities and differences between the gene and transcript annotation in the GENCODE and RefSeq genesets. We demonstrate that the GENCODE Comprehensive set is richer in alternative splicing, novel CDSs, novel exons and has higher genomic coverage than RefSeq, while the GENCODE Basic set is very similar to RefSeq. Using RNAseq data we show that exons and introns unique to one geneset are expressed at a similar level to those common to both. We present evidence that the differences in gene annotation lead to large differences in variant annotation where GENCODE and RefSeq are used as reference transcripts, although this is predominantly confined to non-coding transcripts and UTR sequence, with at most ~30% of LoF variants annotated discordantly. We also describe an investigation of dominant transcript expression, showing that it both supports the utility of the GENCODE Basic set in providing a smaller set of more highly expressed transcripts and provides a useful, biologically-relevant filter for further reducing the complexity of the transcriptome.

Conclusions: The reference transcripts selected for variant functional annotation do have a large effect on the outcome. The GENCODE Comprehensive transcripts contain more exons, have greater genomic coverage and capture many more variants than RefSeq in both genome and exome datasets, while the GENCODE Basic set shows a higher degree of concordance with RefSeq and has fewer unique features. We propose that the GENCODE Comprehensive set has great utility for the discovery of new variants with functional potential, while the GENCODE Basic set is more suitable for applications demanding less complex interpretation of functional variants.

DOI: http://dx.doi.org/10.1186/1471-2164-16-S8-S2
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4502323
March 2016

The African Genome Variation Project shapes medical genetics in Africa.

Nature 2015 Jan 3;517(7534):327-32. Epub 2014 Dec 3.

Medical Research Council Unit, Atlantic Boulevard, Serrekunda, PO Box 273, Banjul, The Gambia.

Given the importance of Africa to studies of human origins and disease susceptibility, detailed characterization of African genetic diversity is needed. The African Genome Variation Project provides a resource with which to design, implement and interpret genomic studies in sub-Saharan Africa and worldwide. The African Genome Variation Project represents dense genotypes from 1,481 individuals and whole-genome sequences from 320 individuals across sub-Saharan Africa. Using this resource, we find novel evidence of complex, regionally distinct hunter-gatherer and Eurasian admixture across sub-Saharan Africa. We identify new loci under selection, including loci related to malaria susceptibility and hypertension. We show that modern imputation panels (sets of reference genotypes from which unobserved or missing genotypes in study sets can be inferred) can identify association signals at highly differentiated loci across populations in sub-Saharan Africa. Using whole-genome sequencing, we demonstrate further improvements in imputation accuracy, strengthening the case for large-scale sequencing efforts of diverse African haplotypes. Finally, we present an efficient genotype array design capturing common genetic variation in Africa.

DOI: http://dx.doi.org/10.1038/nature13997
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4297536
January 2015

Genetic characterization of Greek population isolates reveals strong genetic drift at missense and trait-associated variants.

Nat Commun 2014 Nov 6;5:5345. Epub 2014 Nov 6.

Department of Nutrition and Dietetics, Harokopio University of Athens, Athens 17671, Greece.

Isolated populations are emerging as a powerful study design in the search for low-frequency and rare variant associations with complex phenotypes. Here we genotype 2,296 samples from two isolated Greek populations, the Pomak villages (HELIC-Pomak) in the North of Greece and the Mylopotamos villages (HELIC-MANOLIS) in Crete. We compare their genomic characteristics to the general Greek population and establish them as genetic isolates. In the MANOLIS cohort, we observe an enrichment of missense variants among the variants that have drifted up in frequency by more than fivefold. In the Pomak cohort, we find novel associations at variants on chr11p15.4 showing large allele frequency increases (from 0.2% in the general Greek population to 4.6% in the isolate) with haematological traits, for example, with mean corpuscular volume (rs7116019, P=2.3 × 10(-26)). We replicate this association in a second set of Pomak samples (combined P=2.0 × 10(-36)). We demonstrate significant power gains in detecting medical trait associations.

DOI: http://dx.doi.org/10.1038/ncomms6345
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4242463
November 2014

The Ensembl REST API: Ensembl Data for Any Language.

Bioinformatics 2015 Jan 17;31(1):143-5. Epub 2014 Sep 17.

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD and Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

Motivation: We present a Web service to access Ensembl data using Representational State Transfer (REST). The Ensembl REST server enables the easy retrieval of a wide range of Ensembl data by most programming languages, using standard formats such as JSON and FASTA while minimizing client work. We also introduce bindings to the popular Ensembl Variant Effect Predictor tool permitting large-scale programmatic variant analysis independent of any specific programming language.

Availability And Implementation: The Ensembl REST API can be accessed at http://rest.ensembl.org and source code is freely available under an Apache 2.0 license from http://github.com/Ensembl/ensembl-rest.
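
As a rough illustration of the language-independent access the abstract describes, the sketch below builds request URLs with the Python standard library only. It assumes the HTTPS form of the server address given above and the `lookup/id` and `vep/:species/hgvs` endpoints; the helper names (`lookup_url`, `vep_hgvs_url`, `fetch_json`) are hypothetical and not part of Ensembl's own code.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the HTTPS form of the server address named in the abstract.
REST_SERVER = "https://rest.ensembl.org"


def lookup_url(stable_id: str) -> str:
    """Build the URL of the lookup/id endpoint for one stable identifier."""
    return f"{REST_SERVER}/lookup/id/{urllib.parse.quote(stable_id, safe='')}"


def vep_hgvs_url(hgvs: str, species: str = "human") -> str:
    """Build the URL of the VEP endpoint that accepts an HGVS string."""
    # Percent-encode characters such as '>' that are not valid in a URL path.
    return f"{REST_SERVER}/vep/{species}/hgvs/{urllib.parse.quote(hgvs, safe=':')}"


def fetch_json(url: str, timeout: float = 30.0):
    """Send a GET request that asks for JSON and decode the response body."""
    request = urllib.request.Request(url, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_json(vep_hgvs_url("ENST00000366667:c.803C>T"))` would contact the live server, so it is left out of the block itself. Any language with an HTTP client and a JSON parser could follow the same pattern.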

DOI: http://dx.doi.org/10.1093/bioinformatics/btu613
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4271150
January 2015

Functional annotation of noncoding sequence variants.

Nat Methods 2014 Mar 2;11(3):294-6. Epub 2014 Feb 2.

[1] European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, UK. [2] Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK.

Identifying functionally relevant variants against the background of ubiquitous genetic variation is a major challenge in human genetics. For variants in protein-coding regions, our understanding of the genetic code and splicing allows us to identify likely candidates, but interpreting variants outside genic regions is more difficult. Here we present genome-wide annotation of variants (GWAVA), a tool that supports prioritization of noncoding variants by integrating various genomic and epigenomic annotations.

DOI: http://dx.doi.org/10.1038/nmeth.2832
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5015703
March 2014

Revisiting the thrifty gene hypothesis via 65 loci associated with susceptibility to type 2 diabetes.

Am J Hum Genet 2014 Feb 9;94(2):176-85. Epub 2014 Jan 9.

The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1HH, UK.

We have investigated the evidence for positive selection in samples of African, European, and East Asian ancestry at 65 loci associated with susceptibility to type 2 diabetes (T2D) previously identified through genome-wide association studies. Selection early in human evolutionary history is predicted to lead to ancestral risk alleles shared between populations, whereas late selection would result in population-specific signals at derived risk alleles. By using a wide variety of tests based on the site frequency spectrum, haplotype structure, and population differentiation, we found no global signal of enrichment for positive selection when we considered all T2D risk loci collectively. However, in a locus-by-locus analysis, we found nominal evidence for positive selection at 14 of the loci. Selection favored the protective and risk alleles in similar proportions, rather than the risk alleles specifically as predicted by the thrifty gene hypothesis, and may not be related to influence on diabetes. Overall, we conclude that past positive selection has not been a powerful influence driving the prevalence of T2D risk alleles.

DOI: http://dx.doi.org/10.1016/j.ajhg.2013.12.010
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3928649
February 2014

A rare functional cardioprotective APOC3 variant has risen in frequency in distinct population isolates.

Nat Commun 2013 ;4:2872

Wellcome Trust Sanger Institute, Hinxton CB10 1SA, UK.

Isolated populations can empower the identification of rare variation associated with complex traits through next generation association studies, but the generalizability of such findings remains unknown. Here we genotype 1,267 individuals from a Greek population isolate on the Illumina HumanExome Beadchip, in search of functional coding variants associated with lipids traits. We find genome-wide significant evidence for association between R19X, a functional variant in APOC3, with increased high-density lipoprotein and decreased triglycerides levels. Approximately 3.8% of individuals are heterozygous for this cardioprotective variant, which was previously thought to be private to the Amish founder population. R19X is rare (<0.05% frequency) in outbred European populations. The increased frequency of R19X enables discovery of this lipid traits signal at genome-wide significance in a small sample size. This work exemplifies the value of isolated populations in successfully detecting transferable rare variant associations of high medical relevance.

DOI: http://dx.doi.org/10.1038/ncomms3872
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3905724
October 2014

Integrative annotation of variants from 1092 humans: application to cancer genomics.

Science 2013 Oct;342(6154):1235587

Pediatric Surgical Research Laboratories, MassGeneral Hospital for Children, Massachusetts General Hospital, Boston, MA 02114, USA.

Interpreting variants, especially noncoding ones, in the increasing number of personal genomes is challenging. We used patterns of polymorphisms in functionally annotated regions in 1092 humans to identify deleterious variants; then we experimentally validated candidates. We analyzed both coding and noncoding regions, with the former corroborating the latter. We found regions particularly sensitive to mutations ("ultrasensitive") and variants that are disruptive because of mechanistic effects on transcription-factor binding (that is, "motif-breakers"). We also found variants in regions with higher network centrality tend to be deleterious. Insertions and deletions followed a similar pattern to single-nucleotide variants, with some notable exceptions (e.g., certain deletions and enhancers). On the basis of these patterns, we developed a computational tool (FunSeq), whose application to ~90 cancer genomes reveals nearly a hundred candidate noncoding drivers.

DOI: http://dx.doi.org/10.1126/science.1235587
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3947637
October 2013

Computational approaches to identify functional genetic variants in cancer genomes.

Nat Methods 2013 Aug;10(8):723-9

Research Unit on Biomedical Informatics, University Pompeu Fabra, Barcelona, Spain.

The International Cancer Genome Consortium (ICGC) aims to catalog genomic abnormalities in tumors from 50 different cancer types. Genome sequencing reveals hundreds to thousands of somatic mutations in each tumor but only a minority of these drive tumor progression. We present the result of discussions within the ICGC on how to address the challenge of identifying mutations that contribute to oncogenesis, tumor maintenance or response to therapy, and recommend computational techniques to annotate somatic variants and predict their impact on cancer phenotype.

DOI: http://dx.doi.org/10.1038/nmeth.2562
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3919555
August 2013

Ensembl 2013.

Nucleic Acids Res 2013 Jan 30;41(Database issue):D48-55. Epub 2012 Nov 30.

European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton Cambridge CB10 1SD, UK.

The Ensembl project (http://www.ensembl.org) provides genome information for sequenced chordate genomes with a particular focus on human, mouse, zebrafish and rat. Our resources include evidence-based gene sets for all supported species; large-scale whole genome multiple species alignments across vertebrates and clade-specific alignments for eutherian mammals, primates, birds and fish; variation data resources for 17 species and regulation annotations based on ENCODE and other data sets. Ensembl data are accessible through the genome browser at http://www.ensembl.org and through other tools and programmatic interfaces.

DOI: http://dx.doi.org/10.1093/nar/gks1236
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3531136
January 2013

Genome-wide meta-analysis of common variant differences between men and women.

Hum Mol Genet 2012 Nov 27;21(21):4805-15. Epub 2012 Jul 27.

Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.

The male-to-female sex ratio at birth is constant across world populations with an average of 1.06 (106 male to 100 female live births) for populations of European descent. The sex ratio is considered to be affected by numerous biological and environmental factors and to have a heritable component. The aim of this study was to investigate the presence of common allele modest effects at autosomal and chromosome X variants that could explain the observed sex ratio at birth. We conducted a large-scale genome-wide association scan (GWAS) meta-analysis across 51 studies, comprising overall 114 863 individuals (61 094 women and 53 769 men) of European ancestry and 2 623 828 common (minor allele frequency >0.05) single-nucleotide polymorphisms (SNPs). Allele frequencies were compared between men and women for directly-typed and imputed variants within each study. Forward-time simulations for unlinked, neutral, autosomal, common loci were performed under the demographic model for European populations with a fixed sex ratio and a random mating scheme to assess the probability of detecting significant allele frequency differences. We do not detect any genome-wide significant (P < 5 × 10(-8)) common SNP differences between men and women in this well-powered meta-analysis. The simulated data provided results entirely consistent with these findings. This large-scale investigation across ~115 000 individuals shows no detectable contribution from common genetic variants to the observed skew in the sex ratio. The absence of sex-specific differences is useful in guiding genetic association study design, for example when using mixed controls for sex-biased traits.
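
The abstract does not say which statistic was used to compare allele frequencies between men and women. As a minimal sketch of one common choice, the block below runs a pooled two-proportion z-test for a single autosomal variant and checks the result against the genome-wide threshold quoted above. The function name and the example counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

GENOME_WIDE_ALPHA = 5e-8  # significance threshold quoted in the abstract


def allele_frequency_difference(alt_men: int, chrom_men: int,
                                alt_women: int, chrom_women: int):
    """Pooled two-proportion z-test on allele counts.

    `chrom_*` is the number of chromosomes sampled (two per person for an
    autosomal variant). Returns (z statistic, two-sided p-value).
    """
    freq_men = alt_men / chrom_men
    freq_women = alt_women / chrom_women
    pooled = (alt_men + alt_women) / (chrom_men + chrom_women)
    std_err = sqrt(pooled * (1.0 - pooled) * (1.0 / chrom_men + 1.0 / chrom_women))
    z = (freq_men - freq_women) / std_err
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: identical frequencies in both sexes give z = 0.
z, p = allele_frequency_difference(100, 1000, 100, 1000)
is_significant = p < GENOME_WIDE_ALPHA
```

A meta-analysis such as the one described would combine per-study statistics rather than pool raw counts; this sketch covers only the within-study comparison.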

DOI: http://dx.doi.org/10.1093/hmg/dds304
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3471397
November 2012

A combined functional annotation score for non-synonymous variants.

Hum Hered 2012 18;73(1):47-51. Epub 2012 Jan 18.

Wellcome Trust Sanger Institute, Hinxton, UK.

Aims: Next-generation sequencing has opened the possibility of large-scale sequence-based disease association studies. A major challenge in interpreting whole-exome data is predicting which of the discovered variants are deleterious or neutral. To address this question in silico, we have developed a score called Combined Annotation scoRing toOL (CAROL), which combines information from 2 bioinformatics tools: PolyPhen-2 and SIFT, in order to improve the prediction of the effect of non-synonymous coding variants.

Methods: We used a weighted Z method that combines the probabilistic scores of PolyPhen-2 and SIFT. We defined 2 dataset pairs to train and test CAROL using information from the dbSNP: 'HGMD-PUBLIC' and 1000 Genomes Project databases. The training pair comprises a total of 980 positive control (disease-causing) and 4,845 negative control (non-disease-causing) variants. The test pair consists of 1,959 positive and 9,691 negative controls.

Results: CAROL has higher predictive power and accuracy for the effect of non-synonymous variants than each individual annotation tool (PolyPhen-2 and SIFT) and benefits from higher coverage.

Conclusion: The combination of annotation tools can help improve automated prediction of whole-genome/exome non-synonymous variant functional consequences.
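
The Methods above name a weighted Z method but give neither the weights nor how the PolyPhen-2 and SIFT scores are transformed. The block below is therefore only a sketch of the generic weighted Z (Stouffer) combination of one-sided p-values, not CAROL's actual formula. The function name, inputs and weights are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def weighted_z_combine(p_values, weights):
    """Combine one-sided p-values with the generic weighted Z method.

    Each p-value must lie strictly between 0 and 1; values outside that
    range make `inv_cdf` raise an error.
    """
    if not p_values or len(p_values) != len(weights):
        raise ValueError("need one weight per p-value, and at least one p-value")
    # Convert each p-value to a z-score (small p gives a large positive z).
    z_scores = [_STD_NORMAL.inv_cdf(1.0 - p) for p in p_values]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = sqrt(sum(w * w for w in weights))
    combined_z = numerator / denominator
    # Convert the combined z-score back to a one-sided p-value.
    return 1.0 - _STD_NORMAL.cdf(combined_z)


# Hypothetical inputs: two tools that each report p = 0.05, weighted equally.
combined = weighted_z_combine([0.05, 0.05], [1.0, 1.0])
```

With equal weights, two concordant moderate signals combine into a stronger one, which is the general motivation for pooling predictions from more than one tool.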

DOI: http://dx.doi.org/10.1159/000334984
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3390741
July 2012

Ensembl 2012.

Nucleic Acids Res 2012 Jan 15;40(Database issue):D84-90. Epub 2011 Nov 15.

European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton Cambridge CB10 1SD, UK.

The Ensembl project (http://www.ensembl.org) provides genome resources for chordate genomes with a particular focus on human genome data as well as data for key model organisms such as mouse, rat and zebrafish. Five additional species were added in the last year including gibbon (Nomascus leucogenys) and Tasmanian devil (Sarcophilus harrisii) bringing the total number of supported species to 61 as of Ensembl release 64 (September 2011). Of these, 55 species appear on the main Ensembl website and six species are provided on the Ensembl preview site (Pre!Ensembl; http://pre.ensembl.org) with preliminary support. The past year has also seen improvements across the project.

DOI: http://dx.doi.org/10.1093/nar/gkr991
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3245178
January 2012

Ensembl 2011.

Nucleic Acids Res 2011 Jan 2;39(Database issue):D800-6. Epub 2010 Nov 2.

European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

The Ensembl project (http://www.ensembl.org) seeks to enable genomic science by providing high quality, integrated annotation on chordate and selected eukaryotic genomes within a consistent and accessible infrastructure. All supported species include comprehensive, evidence-based gene annotations and a selected set of genomes includes additional data focused on variation, comparative, evolutionary, functional and regulatory annotation. The most advanced resources are provided for key species including human, mouse, rat and zebrafish reflecting the popularity and importance of these species in biomedical research. As of Ensembl release 59 (August 2010), 56 species are supported of which 5 have been added in the past year. Since our previous report, we have substantially improved the presentation and integration of both data of disease relevance and the regulatory state of different cell types.

DOI: http://dx.doi.org/10.1093/nar/gkq1064
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3013672
January 2011

Signalling signalhood and the emergence of communication.

Cognition 2009 Nov 8;113(2):226-33. Epub 2009 Sep 8.

School of Psychology, Philosophy and Language Sciences, University of Edinburgh, Edinburgh EH8 9AD, United Kingdom.

A unique hallmark of human language is that it uses signals that are both learnt and symbolic. The emergence of such signals was therefore a defining event in human cognitive evolution, yet very little is known about how such a process occurs. Previous work provides some insights on how meaning can become attached to form, but a more foundational issue is presently unaddressed. How does a signal signal its own signalhood? That is, how do humans even know that communicative behaviour is indeed communicative in nature? We introduce an experimental game that has been designed to tackle this problem. We find that it is commonly resolved with a bootstrapping process, and that this process influences the final form of the communication system. Furthermore, sufficient common ground is observed to be integral to the recognition of signalhood, and the emergence of dialogue is observed to be the key step in the development of a system that can be employed to achieve shared goals.

DOI: http://dx.doi.org/10.1016/j.cognition.2009.08.009
November 2009

Song learning as an indicator mechanism: modelling the developmental stress hypothesis.

J Theor Biol 2008 Apr 27;251(4):570-83. Epub 2007 Dec 27.

Language Evolution and Computation Research Unit, University of Edinburgh, George Square, Edinburgh EH8 9LL, UK.

The 'developmental stress hypothesis' attempts to provide a functional explanation of the evolutionary maintenance of song learning in songbirds. It argues that song learning can be viewed as an indicator mechanism that allows females to use learned features of song as a window on a male's early development, a potentially stressful period that may have long-term phenotypic effects. In this paper we formally model this hypothesis for the first time, presenting a population genetic model that takes into account both the evolution of genetic learning preferences and cultural transmission of song. The models demonstrate that a preference for song types that reveal developmental stress can evolve in a population, and that cultural transmission of these song types can be stable, lending more support to the hypothesis.

DOI: http://dx.doi.org/10.1016/j.jtbi.2007.12.013
April 2008