Publications by authors named "Anthony Shafer"

12 Publications

Large-scale identification of sequence variants influencing human transcription factor occupancy in vivo.

Nat Genet 2015 Dec 26;47(12):1393-401. Epub 2015 Oct 26.

Department of Genome Sciences, University of Washington, Seattle, Washington, USA.

The function of human regulatory regions depends exquisitely on their local genomic environment and on cellular context, complicating experimental analysis of common disease- and trait-associated variants that localize within regulatory DNA. We use allelically resolved genomic DNase I footprinting data encompassing 166 individuals and 114 cell types to identify >60,000 common variants that directly influence transcription factor occupancy and regulatory DNA accessibility in vivo. The unprecedented scale of these data enables systematic analysis of the impact of sequence variation on transcription factor occupancy in vivo. We leverage this analysis to develop accurate models of variation affecting the recognition sites for diverse transcription factors and apply these models to discriminate nearly 500,000 common regulatory variants likely to affect transcription factor occupancy across the human genome. The approach and results provide a new foundation for the analysis and interpretation of noncoding variation in complete human genomes and for systems-level investigation of disease-associated variants.
DOI: http://dx.doi.org/10.1038/ng.3432
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4666772
December 2015

Role of DNA Methylation in Modulating Transcription Factor Occupancy.

Cell Rep 2015 Aug 6;12(7):1184-95. Epub 2015 Aug 6.

Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA; Division of Oncology, Department of Medicine, University of Washington, Seattle, WA 98195, USA.

Although DNA methylation is commonly invoked as a mechanism for transcriptional repression, the extent to which it actively silences transcription factor (TF) occupancy sites in vivo is unknown. To study the role of DNA methylation in the active modulation of TF binding, we quantified the effect of DNA methylation depletion on the genomic occupancy patterns of CTCF, an abundant TF with known methylation sensitivity that is capable of autonomous binding to its target sites in chromatin. Here, we show that the vast majority (>98.5%) of the tens of thousands of unoccupied, methylated CTCF recognition sequences remain unbound upon abrogation of DNA methylation. The small fraction of sites that show methylation-dependent binding in vivo are in turn characterized by highly variable CTCF occupancy across cell types. Our results suggest that DNA methylation is not a primary groundskeeper of genomic TF landscapes, but rather a specialized mechanism for stabilizing intrinsically labile sites.
DOI: http://dx.doi.org/10.1016/j.celrep.2015.07.024
August 2015

A comparative encyclopedia of DNA elements in the mouse genome.

Nature 2014 Nov;515(7527):355-64

Bioinformatics and Genomics, Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88, 08003 Barcelona, Catalonia, Spain.

The laboratory mouse shares the majority of its protein-coding genes with humans, making it the premier model organism in biomedical research, yet the two mammals differ in significant ways. To gain greater insights into both shared and species-specific transcriptional and cellular regulatory programs in the mouse, the Mouse ENCODE Consortium has mapped transcription, DNase I hypersensitivity, transcription factor binding, chromatin modifications and replication domains throughout the mouse genome in diverse cell and tissue types. By comparing with the human genome, we not only confirm substantial conservation in the newly annotated potential functional sequences, but also find a large degree of divergence of sequences involved in transcriptional regulation, chromatin state and higher order chromatin organization. Our results illuminate the wide range of evolutionary forces acting on genes and their regulatory regions, and provide a general resource for research into mammalian biology and mechanisms of human diseases.
DOI: http://dx.doi.org/10.1038/nature13992
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4266106
November 2014

Exonic transcription factor binding directs codon choice and affects protein evolution.

Science 2013 Dec;342(6164):1367-72

Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA.

Genomes contain both a genetic code specifying amino acids and a regulatory code specifying transcription factor (TF) recognition sequences. We used genomic deoxyribonuclease I footprinting to map nucleotide resolution TF occupancy across the human exome in 81 diverse cell types. We found that ~15% of human codons are dual-use codons ("duons") that simultaneously specify both amino acids and TF recognition sites. Duons are highly conserved and have shaped protein evolution, and TF-imposed constraint appears to be a major driver of codon usage bias. Conversely, the regulatory code has been selectively depleted of TFs that recognize stop codons. More than 17% of single-nucleotide variants within duons directly alter TF binding. Pervasive dual encoding of amino acid and regulatory information appears to be a fundamental feature of genome evolution.
DOI: http://dx.doi.org/10.1126/science.1243490
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3967546
December 2013

Probing DNA shape and methylation state on a genomic scale with DNase I.

Proc Natl Acad Sci U S A 2013 Apr 1;110(16):6376-81. Epub 2013 Apr 1.

Department of Electrical Engineering and Biological Sciences, Columbia University, New York, NY 10027, USA.

DNA binding proteins find their cognate sequences within genomic DNA through recognition of specific chemical and structural features. Here we demonstrate that high-resolution DNase I cleavage profiles can provide detailed information about the shape and chemical modification status of genomic DNA. Analyzing millions of DNA backbone hydrolysis events on naked genomic DNA, we show that the intrinsic rate of cleavage by DNase I closely tracks the width of the minor groove. Integration of these DNase I cleavage data with bisulfite sequencing data for the same cell type's genome reveals that cleavage directly adjacent to cytosine-phosphate-guanine (CpG) dinucleotides is enhanced at least eightfold by cytosine methylation. This phenomenon we show to be attributable to methylation-induced narrowing of the minor groove. Furthermore, we demonstrate that it enables simultaneous mapping of DNase I hypersensitivity and regional DNA methylation levels using dense in vivo cleavage data. Taken together, our results suggest a general mechanism by which CpG methylation can modulate protein-DNA interaction strength via the remodeling of DNA shape.
DOI: http://dx.doi.org/10.1073/pnas.1216822110
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3631675
April 2013

Systematic localization of common disease-associated variation in regulatory DNA.

Science 2012 Sep 5;337(6099):1190-5. Epub 2012 Sep 5.

Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA.

Genome-wide association studies have identified many noncoding variants associated with common diseases and traits. We show that these variants are concentrated in regulatory DNA marked by deoxyribonuclease I (DNase I) hypersensitive sites (DHSs). Eighty-eight percent of such DHSs are active during fetal development and are enriched in variants associated with gestational exposure-related phenotypes. We identified distant gene targets for hundreds of variant-containing DHSs that may explain phenotype associations. Disease-associated variants systematically perturb transcription factor recognition sequences, frequently alter allelic chromatin states, and form regulatory networks. We also demonstrated tissue-selective enrichment of more weakly disease-associated variants within DHSs and the de novo identification of pathogenic cell types for Crohn's disease, multiple sclerosis, and an electrocardiogram trait, without prior knowledge of physiological mechanisms. Our results suggest pervasive involvement of regulatory DNA variation in common human disease and provide pathogenic insights into diverse disorders.
DOI: http://dx.doi.org/10.1126/science.1222794
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3771521
September 2012

The accessible chromatin landscape of the human genome.

Nature 2012 Sep;489(7414):75-82

Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA.

DNase I hypersensitive sites (DHSs) are markers of regulatory DNA and have underpinned the discovery of all classes of cis-regulatory elements including enhancers, promoters, insulators, silencers and locus control regions. Here we present the first extensive map of human DHSs identified through genome-wide profiling in 125 diverse cell and tissue types. We identify ∼2.9 million DHSs that encompass virtually all known experimentally validated cis-regulatory sequences and expose a vast trove of novel elements, most with highly cell-selective regulation. Annotating these elements using ENCODE data reveals novel relationships between chromatin accessibility, transcription, DNA methylation and regulatory factor occupancy patterns. We connect ∼580,000 distal DHSs with their target promoters, revealing systematic pairing of different classes of distal DHSs and specific promoter types. Patterning of chromatin accessibility at many regulatory regions is organized with dozens to hundreds of co-activated elements, and the transcellular DNase I sensitivity pattern at a given region can predict cell-type-specific functional behaviours. The DHS landscape shows signatures of recent functional evolutionary constraint. However, the DHS compartment in pluripotent and immortalized cells exhibits higher mutation rates than that in highly differentiated cells, exposing an unexpected link between chromatin accessibility, proliferative potential and patterns of human variation.
DOI: http://dx.doi.org/10.1038/nature11232
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3721348
September 2012

Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project.

Nature 2007 Jun;447(7146):799-816

We report the generation and analysis of functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project. These data have been further integrated and augmented by a number of evolutionary and computational analyses. Together, our results advance the collective knowledge about human genome function in several major areas. First, our studies provide convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts, including non-protein-coding transcripts, and those that extensively overlap one another. Second, systematic examination of transcriptional regulation has yielded new understanding about transcription start sites, including their relationship to specific regulatory sequences and features of chromatin accessibility and histone modification. Third, a more sophisticated view of chromatin structure has emerged, including its inter-relationship with DNA replication and transcriptional regulation. Finally, integration of these new sources of information, in particular with respect to mammalian evolution based on inter- and intra-species sequence comparisons, has yielded new mechanistic and evolutionary insights concerning the functional landscape of the human genome. Together, these studies are defining a path for pursuit of a more comprehensive characterization of human genome function.
DOI: http://dx.doi.org/10.1038/nature05874
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2212820
June 2007

Genome-scale mapping of DNase I sensitivity in vivo using tiling DNA microarrays.

Nat Methods 2006 Jul;3(7):511-8

Department of Genome Sciences, University of Washington, 1705 NE Pacific St., Box 357730, Seattle, Washington 98195, USA.

Localized accessibility of critical DNA sequences to the regulatory machinery is a key requirement for regulation of human genes. Here we describe a high-resolution, genome-scale approach for quantifying chromatin accessibility by measuring DNase I sensitivity as a continuous function of genome position using tiling DNA microarrays (DNase-array). We demonstrate this approach across 1% (approximately 30 Mb) of the human genome, wherein we localized 2,690 classical DNase I hypersensitive sites with high sensitivity and specificity, and also mapped larger-scale patterns of chromatin architecture. DNase I hypersensitive sites exhibit marked aggregation around transcriptional start sites (TSSs), though the majority mark nonpromoter functional elements. We also developed a computational approach for visualizing higher-order features of chromatin structure. This revealed that human chromatin organization is dominated by large (100-500 kb) 'superclusters' of DNase I hypersensitive sites, which encompass both gene-rich and gene-poor regions. DNase-array is a powerful and straightforward approach for systematic exposition of the cis-regulatory architecture of complex genomes.
DOI: http://dx.doi.org/10.1038/nmeth890
July 2006

High-throughput localization of functional elements by quantitative chromatin profiling.

Nat Methods 2004 Dec 18;1(3):219-25. Epub 2004 Nov 18.

Department of Molecular Biology, Regulome, 2211 Elliott Avenue, Suite 600, Seattle, Washington 98121, USA.

Identification of functional, noncoding elements that regulate transcription in the context of complex genomes is a major goal of modern biology. Localization of functionality to specific sequences is a requirement for genetic and computational studies. Here, we describe a generic approach, quantitative chromatin profiling, that uses quantitative analysis of in vivo chromatin structure over entire gene loci to rapidly and precisely localize cis-regulatory sequences and other functional modalities encoded by DNase I hypersensitive sites. To demonstrate the accuracy of this approach, we analyzed approximately 300 kilobases of human genome sequence from diverse gene loci and cleanly delineated functional elements corresponding to a spectrum of classical cis-regulatory activities including enhancers, promoters, locus control regions and insulators as well as novel elements. Systematic, high-throughput identification of functional elements coinciding with DNase I hypersensitive sites will substantially expand our knowledge of transcriptional regulation and should simplify the search for noncoding genetic variation with phenotypic consequences.
DOI: http://dx.doi.org/10.1038/nmeth721
December 2004

Discovery of functional noncoding elements by digital analysis of chromatin structure.

Proc Natl Acad Sci U S A 2004 Nov 18;101(48):16837-42. Epub 2004 Nov 18.

Department of Molecular Biology, Regulome, 2211 Elliott Avenue, Suite 600, Seattle, WA 98121, USA.

We developed a quantitative methodology, digital analysis of chromatin structure (DACS), for high-throughput, automated mapping of DNase I-hypersensitive sites and associated cis-regulatory sequences in the human and other complex genomes. We used 19/20-bp genomic DNA tags to localize individual DNase I cutting events in nuclear chromatin and produced approximately 257,000 tags from erythroid cells. Tags were mapped to the human genome, and a quantitative algorithm was applied to discriminate statistically significant clusters of independent DNase I cutting events. We show that such clusters identify both known regulatory sequences and previously unrecognized functional elements across the genome. We used in silico simulation to demonstrate that DACS is capable of efficient and accurate localization of the majority of DNase I-hypersensitive sites in the human genome without requiring an independent validation step. A unique feature of DACS is that it permits unbiased evaluation of the chromatin state of regulatory sequences from widely separated genomic loci. We found surprisingly large differences in the accessibility of distant regulatory sequences, suggesting the existence of a hierarchy of nuclear organization that escapes detection by conventional chromatin assays.
DOI: http://dx.doi.org/10.1073/pnas.0407387101
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC534745
November 2004