Publications by authors named "Richard Sandstrom"

53 Publications

Inaccessible LCG Promoters Act as Safeguards to Restrict T Cell Development to Appropriate Notch Signaling Environments.

Stem Cell Reports 2021 Apr 25;16(4):717-726. Epub 2021 Mar 25.

Clinical Research Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA; Department of Pediatrics, Division of Pediatric Hematology/Oncology, University of Washington, Seattle, WA 98195, USA.

T cell development is restricted to the thymus and is dependent on high levels of Notch signaling induced within the thymic microenvironment. To understand Notch function in thymic restriction, we investigated the basis for target gene selectivity in response to quantitative differences in Notch signal strength, focusing on the chromatin architecture of genes essential for T cell differentiation. We find that high Notch signal strength is required to activate promoters of known targets essential for T cell commitment, including Il2ra, Cd3ε, and Rag1, which feature low CpG content (LCG) and DNA inaccessibility in hematopoietic stem progenitor cells. Our findings suggest that promoter DNA inaccessibility at LCG T lineage genes provides robust protection against stochastic activation in inappropriate Notch signaling contexts, limiting T cell development to the thymus.

Source
DOI: http://dx.doi.org/10.1016/j.stemcr.2021.02.017
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8072033
April 2021

Global reference mapping of human transcription factor footprints.

Nature 2020 07 29;583(7818):729-736. Epub 2020 Jul 29.

Altius Institute for Biomedical Sciences, Seattle, WA, USA.

Combinatorial binding of transcription factors to regulatory DNA underpins gene regulation in all organisms. Genetic variation in regulatory regions has been connected with diseases and diverse phenotypic traits, but it remains challenging to distinguish variants that affect regulatory function. Genomic DNase I footprinting enables the quantitative, nucleotide-resolution delineation of sites of transcription factor occupancy within native chromatin. However, only a small fraction of such sites have been precisely resolved on the human genome sequence. Here, to enable comprehensive mapping of transcription factor footprints, we produced high-density DNase I cleavage maps from 243 human cell and tissue types and states and integrated these data to delineate about 4.5 million compact genomic elements that encode transcription factor occupancy at nucleotide resolution. We map the fine-scale structure within about 1.6 million DNase I-hypersensitive sites and show that the overwhelming majority are populated by well-spaced sites of single transcription factor-DNA interaction. Cell-context-dependent cis-regulation is chiefly executed by wholesale modulation of accessibility at regulatory DNA rather than by differential transcription factor occupancy within accessible elements. We also show that the enrichment of genetic variants associated with diseases or phenotypic traits in regulatory regions is almost entirely attributable to variants within footprints, and that functional variants that affect transcription factor occupancy are nearly evenly partitioned between loss- and gain-of-function alleles. Unexpectedly, we find increased density of human genetic variation within transcription factor footprints, revealing an unappreciated driver of cis-regulatory evolution. Our results provide a framework for both global and nucleotide-precision analyses of gene regulatory mechanisms and functional genetic variation.

Source
DOI: http://dx.doi.org/10.1038/s41586-020-2528-x
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7410829
July 2020

Index and biological spectrum of human DNase I hypersensitive sites.

Nature 2020 08 29;584(7820):244-251. Epub 2020 Jul 29.

Altius Institute for Biomedical Sciences, Seattle, WA, USA.

DNase I hypersensitive sites (DHSs) are generic markers of regulatory DNA and contain genetic variations associated with diseases and phenotypic traits. We created high-resolution maps of DHSs from 733 human biosamples encompassing 438 cell and tissue types and states, and integrated these to delineate and numerically index approximately 3.6 million DHSs within the human genome sequence, providing a common coordinate system for regulatory DNA. Here we show that these maps highly resolve the cis-regulatory compartment of the human genome, which encodes unexpectedly diverse cell- and tissue-selective regulatory programs at very high density. These programs can be captured comprehensively by a simple vocabulary that enables the assignment to each DHS of a regulatory barcode that encapsulates its tissue manifestations, and global annotation of protein-coding and non-coding RNA genes in a manner orthogonal to gene expression. Finally, we show that sharply resolved DHSs markedly enhance the genetic association and heritability signals of diseases and traits. Rather than being confined to a small number of distal elements or promoters, we find that genetic signals converge on congruently regulated sets of DHSs that decorate entire gene bodies. Together, our results create a universal, extensible coordinate system and vocabulary for human regulatory DNA marked by DHSs, and provide a new global perspective on the architecture of human gene regulation.

Source
DOI: http://dx.doi.org/10.1038/s41586-020-2559-3
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7422677
August 2020

Global Regulatory DNA Potentiation by SMARCA4 Propagates to Selective Gene Expression Programs via Domain-Level Remodeling.

Cell Rep 2020 05;31(8):107676

Altius Institute for Biomedical Sciences, Seattle, WA 98121, USA.

The human genome encodes millions of regulatory elements, of which only a small fraction are active within a given cell type. Little is known about the global impact of chromatin remodelers on regulatory DNA landscapes and how this translates to gene expression. We use precision genome engineering to reawaken homozygously inactivated SMARCA4, a central ATPase of the human SWI/SNF chromatin remodeling complex, in lung adenocarcinoma cells. Here, we combine DNase I hypersensitivity, histone modification, and transcriptional profiling to show that SMARCA4 dramatically increases both the number and magnitude of accessible chromatin sites genome-wide, chiefly by unmasking sites of low regulatory factor occupancy. By contrast, transcriptional changes are concentrated within well-demarcated remodeling domains wherein expression of specific genes is gated by both distal element activation and promoter chromatin configuration. Our results provide a perspective on how global chromatin remodeling activity is translated to gene expression via regulatory DNA.

Source
DOI: http://dx.doi.org/10.1016/j.celrep.2020.107676
May 2020

Mapping and Dynamics of Regulatory DNA in Maturing Arabidopsis thaliana Siliques.

Front Plant Sci 2019 14;10:1434. Epub 2019 Nov 14.

Department of Genome Sciences, University of Washington, Seattle, WA, United States.

The genome is reprogrammed during development to produce diverse cell types, largely through altered expression and activity of key transcription factors. The accessibility and critical functions of epidermal cells have made them a model for connecting transcriptional events to development in a range of model systems. In Arabidopsis thaliana and many other plants, fertilization triggers differentiation of specialized epidermal seed coat cells that have a unique morphology caused by large extracellular deposits of polysaccharides. Here, we used DNase I-seq to generate regulatory landscapes of seeds at two critical time points in seed coat maturation (4 and 7 DPA), enriching for seed coat cells with the INTACT method. We found over 3,000 developmentally dynamic regulatory DNA elements and explored their relationship with nearby gene expression. The dynamic regulatory elements were enriched for motifs for several transcription factor families, most notably the TCP family at the earlier time point and the MYB family at the later one. To assess the extent to which the observed regulatory sites in seeds added to previously known regulatory sites in A. thaliana, we compared our data to 11 other data sets generated with 7-day-old seedlings for diverse tissues and conditions. Surprisingly, over a quarter of the regulatory, i.e., accessible, bases observed in seeds were novel. Notably, plant regulatory landscapes from different tissues, cell types, or developmental stages were more dynamic than those generated from bulk tissue in response to environmental perturbations, highlighting the importance of extending studies of regulatory DNA to single tissues and cell types during development.

Source
DOI: http://dx.doi.org/10.3389/fpls.2019.01434
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6868056
November 2019

Integrated epigenomic profiling reveals endogenous retrovirus reactivation in renal cell carcinoma.

EBioMedicine 2019 Mar 1;41:427-442. Epub 2019 Mar 1.

Department of Pathology, University of Washington, Seattle, WA 98195, United States; Kidney Research Institute, Seattle, WA 98104, United States.

Background: Transcriptional dysregulation drives cancer formation but the underlying mechanisms are still poorly understood. Renal cell carcinoma (RCC) is the most common malignant kidney tumor which canonically activates the hypoxia-inducible transcription factor (HIF) pathway. Despite intensive study, novel therapeutic strategies to target RCC have been difficult to develop. Since the RCC epigenome is relatively understudied, we sought to elucidate key mechanisms underpinning the tumor phenotype and its clinical behavior.

Methods: We performed genome-wide chromatin accessibility (DNase-seq) and transcriptome profiling (RNA-seq) on paired tumor/normal samples from 3 patients undergoing nephrectomy for removal of RCC. We incorporated publicly available data on HIF binding (ChIP-seq) in an RCC cell line. We performed integrated analyses of these high-resolution, genome-scale datasets together with larger transcriptomic data available through The Cancer Genome Atlas (TCGA).

Findings: Though HIF transcription factors play a cardinal role in RCC oncogenesis, we found that numerous transcription factors with an RCC-selective expression pattern also demonstrated evidence of HIF binding near their gene body. Examination of chromatin accessibility profiles revealed that some of these transcription factors influenced the tumor's regulatory landscape, notably the stem cell transcription factor POU5F1 (OCT4). Elevated POU5F1 transcript levels were correlated with advanced tumor stage and poorer overall survival in RCC patients. Unexpectedly, we discovered a HIF-pathway-responsive promoter embedded within an endogenous retroviral long terminal repeat (LTR) element at the transcriptional start site of the PSOR1C3 long non-coding RNA gene upstream of POU5F1. RNA transcripts are induced from this promoter and read through PSOR1C3 into POU5F1, producing a novel POU5F1 transcript isoform. Rather than being unique to the POU5F1 locus, we found that HIF binds to several other transcriptionally active LTR elements genome-wide, correlating with broad gene expression changes in RCC.

Interpretation: Integrated transcriptomic and epigenomic analysis of matched tumor and normal tissues from even a small number of primary patient samples revealed remarkably convergent shared regulatory landscapes. Several transcription factors appear to act downstream of HIF including the potent stem cell transcription factor POU5F1. Dysregulated expression of POU5F1 is part of a larger pattern of gene expression changes in RCC that may be induced by HIF-dependent reactivation of dormant promoters embedded within endogenous retroviral LTRs.

Source
DOI: http://dx.doi.org/10.1016/j.ebiom.2019.01.063
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6441874
March 2019

Integrated Functional Genomic Analysis Enables Annotation of Kidney Genome-Wide Association Study Loci.

J Am Soc Nephrol 2019 Feb 13. Epub 2019 Feb 13.

Department of Anatomic Pathology,

Background: Linking genetic risk loci identified by genome-wide association studies (GWAS) to their causal genes remains a major challenge. Disease-associated genetic variants are concentrated in regions containing regulatory DNA elements, such as promoters and enhancers. Although researchers have previously published DNA maps of these regulatory regions for kidney tubule cells and glomerular endothelial cells, maps for podocytes and mesangial cells have not been available.

Methods: We generated regulatory DNA maps (DNase-seq) and paired gene expression profiles (RNA-seq) from primary outgrowth cultures of human glomeruli that were composed mainly of podocytes and mesangial cells. We generated similar datasets from renal cortex cultures, to compare with those of the glomerular cultures. Because regulatory DNA elements can act on target genes across large genomic distances, we also generated a chromatin conformation map from freshly isolated human glomeruli.

Results: We identified thousands of unique regulatory DNA elements, many located close to transcription factor genes, which the glomerular and cortex samples expressed at different levels. We found that genetic variants associated with kidney diseases (GWAS) and kidney expression quantitative trait loci were enriched in regulatory DNA regions. By combining GWAS, epigenomic, and chromatin conformation data, we functionally annotated 46 kidney disease genes.

Conclusions: We demonstrate a powerful approach to functionally connect kidney disease-/trait-associated loci to their target genes by leveraging unique regulatory DNA maps and integrated epigenomic and genetic analysis. This process can be applied to other kidney cell types and will enhance our understanding of genome regulation and its effects on gene expression in kidney disease.

Source
DOI: http://dx.doi.org/10.1681/ASN.2018030309
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6405142
February 2019

The birth of a human-specific neural gene by incomplete duplication and gene fusion.

Genome Biol 2017 03 9;18(1):49. Epub 2017 Mar 9.

Department of Genome Sciences, University of Washington School of Medicine, 3720 15 Ave NE, S413C, Box 355065, Seattle, WA, 98195-5065, USA.

Background: Gene innovation by duplication is a fundamental evolutionary process but is difficult to study in humans due to the large size, high sequence identity, and mosaic nature of segmental duplication blocks. The human-specific gene hydrocephalus-inducing 2, HYDIN2, was generated by a 364 kbp duplication of 79 internal exons of the large ciliary gene HYDIN from chromosome 16q22.2 to chromosome 1q21.1. Because the HYDIN2 locus lacks the ancestral promoter and seven terminal exons of the progenitor gene, we sought to characterize transcription at this locus by coupling reverse transcription polymerase chain reaction and long-read sequencing.

Results: 5' RACE indicates a transcription start site for HYDIN2 outside of the duplication and we observe fusion transcripts spanning both the 5' and 3' breakpoints. We observe extensive splicing diversity leading to the formation of altered open reading frames (ORFs) that appear to be under relaxed selection. We show that HYDIN2 adopted a new promoter that drives an altered pattern of expression, with highest levels in neural tissues. We estimate that the HYDIN duplication occurred ~3.2 million years ago and find that it is nearly fixed (99.9%) for diploid copy number in contemporary humans. Examination of 73 chromosome 1q21 rearrangement patients reveals that HYDIN2 is deleted or duplicated in most cases.

Conclusions: Together, these data support a model of rapid gene innovation by fusion of incomplete segmental duplications, altered tissue expression, and potential subfunctionalization or neofunctionalization of HYDIN2 early in the evolution of the Homo lineage.

Source
DOI: http://dx.doi.org/10.1186/s13059-017-1163-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5345166
March 2017

Cross-species Analyses Unravel the Complexity of H3K27me3 and H4K20me3 in the Context of Neural Stem Progenitor Cells.

Neuroepigenetics 2016 Jun 3;6:10-25. Epub 2016 May 3.

Department of Biology, University of Texas at San Antonio, San Antonio, Texas 78249, USA; Neuroscience Institute, University of Texas at San Antonio, San Antonio, Texas 78249, USA.

Neural stem progenitor cells (NSPCs) in the human subventricular zone (SVZ) potentially contribute to life-long neurogenesis, yet subtypes of glioblastoma multiforme (GBM) contain NSPC signatures that highlight the importance of cell fate regulation. Among numerous regulatory mechanisms, post-translational methylations of histone tails are crucial regulators of cell fate. The work presented here focuses on the role of two repressive chromatin marks, tri-methylation of histone H3 lysine 27 (H3K27me3) and of histone H4 lysine 20 (H4K20me3), in the adult NSPC within the SVZ. To best model healthy human NSPCs as they exist for epigenetic profiling of H3K27me3 and H4K20me3, we utilized NSPCs isolated from the adult SVZ of baboon brain, with brain structure and genomic level similar to human. The putative role of H3K27me3 in normal NSPCs predominantly falls into the regulation of gene expression, cell cycle, and differentiation, whereas H4K20me3 is involved in DNA replication/repair, metabolism, and cell cycle. Using conditional knock-out mouse models to diminish Ezh2 and Suv4-20h, responsible for H3K27me3 and H4K20me3, respectively, we found that both repressive marks have an irrefutable function in cell cycle regulation in the NSPC population. While both EZH2/H3K27me3 and Suv4-20h/H4K20me3 have been implicated in cancers, our comparative genomics approach between healthy NSPCs and human GBM specimens revealed that substantial sets of genes enriched with H3K27me3 and H4K20me3 in the NSPCs are altered in the human GBM. In sum, our integrated analyses across species highlight important roles of H3K27me3 and H4K20me3 in normal and disease conditions in the context of NSPC.

Source
DOI: http://dx.doi.org/10.1016/j.nepig.2016.04.001
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4941106
June 2016

Genome Sequencing of Autism-Affected Families Reveals Disruption of Putative Noncoding Regulatory DNA.

Am J Hum Genet 2016 Jan 31;98(1):58-74. Epub 2015 Dec 31.

Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA; Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA.

We performed whole-genome sequencing (WGS) of 208 genomes from 53 families affected by simplex autism. For the majority of these families, no copy-number variant (CNV) or candidate de novo gene-disruptive single-nucleotide variant (SNV) had been detected by microarray or whole-exome sequencing (WES). We integrated multiple CNV and SNV analyses and extensive experimental validation to identify additional candidate mutations in eight families. We report that compared to control individuals, probands showed a significant (p = 0.03) enrichment of de novo and private disruptive mutations within fetal CNS DNase I hypersensitive sites (i.e., putative regulatory regions). This effect was only observed within 50 kb of genes that have been previously associated with autism risk, including genes where dosage sensitivity has already been established by recurrent disruptive de novo protein-coding mutations (ARID1B, SCN2A, NR3C2, PRKCA, and DSCAM). In addition, we provide evidence of gene-disruptive CNVs (in DISC1, WNT7A, RBFOX1, and MBD5), as well as smaller de novo CNVs and exon-specific SNVs missed by exome sequencing in neurodevelopmental genes (e.g., CANX, SAE1, and PIK3CA). Our results suggest that the detection of smaller, often multiple CNVs affecting putative regulatory elements might help explain additional risk of simplex autism.

Source
DOI: http://dx.doi.org/10.1016/j.ajhg.2015.11.023
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4716689
January 2016

Large-scale identification of sequence variants influencing human transcription factor occupancy in vivo.

Nat Genet 2015 Dec 26;47(12):1393-401. Epub 2015 Oct 26.

Department of Genome Sciences, University of Washington, Seattle, Washington, USA.

The function of human regulatory regions depends exquisitely on their local genomic environment and on cellular context, complicating experimental analysis of common disease- and trait-associated variants that localize within regulatory DNA. We use allelically resolved genomic DNase I footprinting data encompassing 166 individuals and 114 cell types to identify >60,000 common variants that directly influence transcription factor occupancy and regulatory DNA accessibility in vivo. The unprecedented scale of these data enables systematic analysis of the impact of sequence variation on transcription factor occupancy in vivo. We leverage this analysis to develop accurate models of variation affecting the recognition sites for diverse transcription factors and apply these models to discriminate nearly 500,000 common regulatory variants likely to affect transcription factor occupancy across the human genome. The approach and results provide a new foundation for the analysis and interpretation of noncoding variation in complete human genomes and for systems-level investigation of disease-associated variants.

Source
DOI: http://dx.doi.org/10.1038/ng.3432
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4666772
December 2015

DNase I hypersensitivity analysis of the mouse brain and retina identifies region-specific regulatory elements.

Epigenetics Chromatin 2015 28;8. Epub 2015 Feb 28.

Department of Biological Structure, University of Washington, 1959 NE Pacific Street, Box 357420, Seattle, WA 98195 USA.

Background: The brain, spinal cord, and neural retina comprise the central nervous system (CNS) of vertebrates. Understanding the regulatory mechanisms that underlie the enormous cell-type diversity of the CNS is a significant challenge. Whole-genome mapping of DNase I-hypersensitive sites (DHSs) has been used to identify cis-regulatory elements in many tissues. We have applied this approach to the mouse CNS, including developing and mature neural retina, whole brain, and two well-characterized brain regions, the cerebellum and the cerebral cortex.

Results: For the various regions and developmental stages of the CNS that we analyzed, there were approximately the same number of DHSs; however, there were many DHSs unique to each CNS region and developmental stage. Many of the DHSs are likely to mark enhancers that are specific to a particular CNS region and developmental stage. We validated the DNase I mapping approach for identification of CNS enhancers using the existing VISTA Browser database and with in vivo and in vitro electroporation of the retina. Analysis of transcription factor consensus sites within the DHSs shows distinct profiles of transcriptional regulators particular to each region. Clustering developmentally dynamic DHSs in the retina revealed enrichment of developmental stage-specific transcriptional regulators. Additionally, we found reporter gene activity in the retina driven from several previously uncharacterized regulatory elements surrounding the neurodevelopmental gene Otx2. Identification of DHSs shared between mouse and human showed region-specific differences in the evolution of cis-regulatory elements.

Conclusions: Overall, our results demonstrate the potential of genome-wide DNase I mapping to address cis-regulatory questions regarding the regional diversity within the CNS. These data represent an extensive catalogue of potential cis-regulatory elements within the CNS that display regional and temporal specificity, as well as a set of DHSs common to CNS tissues. Further examination of evolutionary conservation of DHSs between CNS regions and different species may reveal important cis-regulatory elements in the evolution of the mammalian CNS.

Source
DOI: http://dx.doi.org/10.1186/1756-8935-8-8
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4429822
May 2015

Native elongating transcript sequencing reveals human transcriptional activity at nucleotide resolution.

Cell 2015 Apr;161(3):541-554

Department of Genetics, Harvard Medical School, Boston, MA 02115, USA.

Major features of transcription by human RNA polymerase II (Pol II) remain poorly defined due to a lack of quantitative approaches for visualizing Pol II progress at nucleotide resolution. We developed a simple and powerful approach for performing native elongating transcript sequencing (NET-seq) in human cells that globally maps strand-specific Pol II density at nucleotide resolution. NET-seq exposes a mode of antisense transcription that originates downstream and converges on transcription from the canonical promoter. Convergent transcription is associated with a distinctive chromatin configuration and is characteristic of lower-expressed genes. Integration of NET-seq with genomic footprinting data reveals stereotypic Pol II pausing coincident with transcription factor occupancy. Finally, exons retained in mature transcripts display Pol II pausing signatures that differ markedly from skipped exons, indicating an intrinsic capacity for Pol II to recognize exons with different processing fates. Together, human NET-seq exposes the topography and regulatory complexity of human gene expression.

Source
DOI: http://dx.doi.org/10.1016/j.cell.2015.03.010
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4528962
April 2015

Genome-wide comparative analysis reveals human-mouse regulatory landscape and evolution.

BMC Genomics 2015 Feb 14;16:87. Epub 2015 Feb 14.

Department of Mathematics and Computer Science, Emory University, Atlanta, GA, 30322, USA.

Background: Because species-specific gene expression is driven by species-specific regulation, understanding the relationship between sequence and function of the regulatory regions in different species will help elucidate how differences among species arise. Despite active experimental and computational research, relationships among sequence, conservation, and function are still poorly understood.

Results: We compared transcription factor occupied segments (TFos) for 116 human and 35 mouse TFs in 546 human and 125 mouse cell types and tissues from the Human and the Mouse ENCODE projects. We based the map between human and mouse TFos on a one-to-one nucleotide cross-species mapper, bnMapper, that utilizes whole genome alignments (WGA). Our analysis shows that TFos are under evolutionary constraint, but a substantial portion (25.1% of mouse and 25.85% of human on average) of the TFos does not have a homologous sequence on the other species; this portion varies among cell types and TFs. Furthermore, 47.67% and 57.01% of the homologous TFos sequence shows binding activity on the other species for human and mouse respectively. However, 79.87% and 69.22% is repurposed such that it binds the same TF in different cells or different TFs in the same cells. Remarkably, within the set of repurposed TFos, the corresponding genome regions in the other species are preferred locations of novel TFos. These events suggest exaptation of some functional regulatory sequences into new function. Despite TFos repurposing, we did not find substantial changes in their predicted target genes, suggesting that CRMs buffer evolutionary events allowing little or no change in the TFos - target gene associations. Thus, the small portion of TFos with strictly conserved occupancy underestimates the degree of conservation of regulatory interactions.

Conclusion: We mapped regulatory sequences from an extensive number of TFs and cell types between human and mouse using WGA. A comparative analysis of this correspondence unveiled the extent of the shared regulatory sequence across TFs and cell types under study. Importantly, a large part of the shared regulatory sequence is repurposed on the other species. This sequence, fueled by turnover events, provides a strong case for exaptation in regulatory elements.

Source
DOI: http://dx.doi.org/10.1186/s12864-015-1245-6
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4333152
February 2015

Cell-of-origin chromatin organization shapes the mutational landscape of cancer.

Nature 2015 Feb;518(7539):360-364

Division of Genetics, Department of Medicine, Brigham & Women's Hospital and Harvard Medical School, Boston, MA, 02115.

Cancer is a disease potentiated by mutations in somatic cells. Cancer mutations are not distributed uniformly along the human genome. Instead, different human genomic regions vary by up to fivefold in the local density of cancer somatic mutations, posing a fundamental problem for statistical methods used in cancer genomics. Epigenomic organization has been proposed as a major determinant of the cancer mutational landscape. However, both somatic mutagenesis and epigenomic features are highly cell-type-specific. We investigated the distribution of mutations in multiple independent samples of diverse cancer types and compared them to cell-type-specific epigenomic features. Here we show that chromatin accessibility and modification, together with replication timing, explain up to 86% of the variance in mutation rates along cancer genomes. The best predictors of local somatic mutation density are epigenomic features derived from the most likely cell type of origin of the corresponding malignancy. Moreover, we find that cell-of-origin chromatin features are much stronger determinants of cancer mutation profiles than chromatin features of matched cancer cell lines. Furthermore, we show that the cell type of origin of a cancer can be accurately determined based on the distribution of mutations along its genome. Thus, the DNA sequence of a cancer genome encompasses a wealth of information about the identity and epigenomic features of its cell of origin.

Source
DOI: http://dx.doi.org/10.1038/nature14221
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4405175
February 2015

Integrative analysis of 111 reference human epigenomes.

Nature 2015 Feb;518(7539):317-30

[1] Department of Cellular and Molecular Medicine, Institute of Genomic Medicine, Moores Cancer Center, Department of Chemistry and Biochemistry, University of California San Diego, 9500 Gilman Drive, La Jolla, California 92093, USA. [2] Ludwig Institute for Cancer Research, 9500 Gilman Drive, La Jolla, California 92093, USA.

The reference human genome sequence set the stage for studies of genetic variation and its association with human disease, but epigenomic studies lack a similar reference. To address this need, the NIH Roadmap Epigenomics Consortium generated the largest collection so far of human epigenomes for primary cells and tissues. Here we describe the integrative analysis of 111 reference human epigenomes generated as part of the programme, profiled for histone modification patterns, DNA accessibility, DNA methylation and RNA expression. We establish global maps of regulatory elements, define regulatory modules of coordinated activity, and their likely activators and repressors. We show that disease- and trait-associated genetic variants are enriched in tissue-specific epigenomic marks, revealing biologically relevant cell types for diverse human traits, and providing a resource for interpreting the molecular basis of human disease. Our results demonstrate the central role of epigenomic information for understanding gene regulation, cellular differentiation and human disease.

Source
DOI: http://dx.doi.org/10.1038/nature14248
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4530010
February 2015

Mouse regulatory DNA landscapes reveal global principles of cis-regulatory evolution.

Science 2014 Nov;346(6212):1007-12

Howard Hughes Medical Institute. Division of Hematology/Oncology, Children's Hospital Boston and Department of Pediatric Oncology, Dana-Farber Cancer Institute, Harvard Stem Cell Institute, Harvard Medical School, Boston, MA 02115, USA.

To study the evolutionary dynamics of regulatory DNA, we mapped >1.3 million deoxyribonuclease I-hypersensitive sites (DHSs) in 45 mouse cell and tissue types, and systematically compared these with human DHS maps from orthologous compartments. We found that the mouse and human genomes have undergone extensive cis-regulatory rewiring that combines branch-specific evolutionary innovation and loss with widespread repurposing of conserved DHSs to alternative cell fates, and that this process is mediated by turnover of transcription factor (TF) recognition elements. Despite pervasive evolutionary remodeling of the location and content of individual cis-regulatory regions, within orthologous mouse and human cell types the global fraction of regulatory DNA bases encoding recognition sites for each TF has been strictly conserved. Our findings provide new insights into the evolutionary forces shaping mammalian regulatory DNA landscapes.

Source
DOI: http://dx.doi.org/10.1126/science.1246426
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4337786
November 2014

Conservation of trans-acting circuitry during mammalian regulatory evolution.

Nature 2014 Nov;515(7527):365-70

[1] Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA. [2] Department of Medicine, University of Washington, Seattle, Washington 98195, USA.

The basic body plan and major physiological axes have been highly conserved during mammalian evolution, yet only a small fraction of the human genome sequence appears to be subject to evolutionary constraint. To quantify cis- versus trans-acting contributions to mammalian regulatory evolution, we performed genomic DNase I footprinting of the mouse genome across 25 cell and tissue types, collectively defining ∼8.6 million transcription factor (TF) occupancy sites at nucleotide resolution. Here we show that mouse TF footprints conjointly encode a regulatory lexicon that is ∼95% similar with that derived from human TF footprints. However, only ∼20% of mouse TF footprints have human orthologues. Despite substantial turnover of the cis-regulatory landscape, nearly half of all pairwise regulatory interactions connecting mouse TF genes have been maintained in orthologous human cell types through evolutionary innovation of TF recognition sequences. Furthermore, the higher-level organization of mouse TF-to-TF connections into cellular network architectures is nearly identical with human. Our results indicate that evolutionary selection on mammalian gene regulation is targeted chiefly at the level of trans-regulatory circuitry, enabling and potentiating cis-regulatory plasticity.

Source
DOI: http://dx.doi.org/10.1038/nature13972
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4405208
November 2014

A comparative encyclopedia of DNA elements in the mouse genome.

Nature 2014 Nov;515(7527):355-64

Bioinformatics and Genomics, Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88, 08003 Barcelona, Catalonia, Spain.

The laboratory mouse shares the majority of its protein-coding genes with humans, making it the premier model organism in biomedical research, yet the two mammals differ in significant ways. To gain greater insights into both shared and species-specific transcriptional and cellular regulatory programs in the mouse, the Mouse ENCODE Consortium has mapped transcription, DNase I hypersensitivity, transcription factor binding, chromatin modifications and replication domains throughout the mouse genome in diverse cell and tissue types. By comparing with the human genome, we not only confirm substantial conservation in the newly annotated potential functional sequences, but also find a large degree of divergence of sequences involved in transcriptional regulation, chromatin state and higher order chromatin organization. Our results illuminate the wide range of evolutionary forces acting on genes and their regulatory regions, and provide a general resource for research into mammalian biology and mechanisms of human diseases.

Source
DOI: http://dx.doi.org/10.1038/nature13992
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4266106
November 2014

Resolving the complexity of the human genome using single-molecule sequencing.

Nature 2015 Jan 10;517(7536):608-11. Epub 2014 Nov 10.

[1] Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA. [2] Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, USA.

The human genome is arguably the most complete mammalian reference assembly, yet more than 160 euchromatic gaps remain and aspects of its structural variation remain poorly understood ten years after its completion. To identify missing sequence and genetic variation, here we sequence and analyse a haploid human genome (CHM1) using single-molecule, real-time DNA sequencing. We close or extend 55% of the remaining interstitial gaps in the human GRCh37 reference genome, 78% of which carried long runs of degenerate short tandem repeats, often several kilobases in length, embedded within (G+C)-rich genomic regions. We resolve the complete sequence of 26,079 euchromatic structural variants at the base-pair level, including inversions, complex insertions and long tracts of tandem repeats. Most have not been previously reported, with the greatest increases in sensitivity occurring for events less than 5 kilobases in size. Compared to the human reference, we find a significant insertional bias (3:1) in regions corresponding to complex insertions and long short tandem repeats. Our results suggest a greater complexity of the human genome in the form of variation of longer and more complex repetitive DNA that can now be largely resolved with the application of this longer-read sequencing technology.

Source
DOI: http://dx.doi.org/10.1038/nature13907
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4317254
January 2015

A genome-wide map of adeno-associated virus-mediated human gene targeting.

Nat Struct Mol Biol 2014 Nov 5;21(11):969-75. Epub 2014 Oct 5.

[1] Department of Medicine, University of Washington, Seattle, Washington, USA. [2] Department of Biochemistry, University of Washington, Seattle, Washington, USA.

To determine which genomic features promote homologous recombination, we created a genome-wide map of gene targeting sites. We used an adeno-associated virus vector to target identical loci introduced as transcriptionally active retroviral vectors. A comparison of ~2,000 targeted and untargeted sites showed that targeting occurred throughout the human genome and was not influenced by the presence of nearby CpG islands, sequence repeats or DNase I-hypersensitive sites. Targeted sites were preferentially located within transcription units, especially when the target loci were transcribed in the opposite orientation to their surrounding chromosomal genes. We determined the impact of DNA replication by mapping replication forks, which revealed a preference for recombination at target loci transcribed toward an incoming fork. Our results constitute the first genome-wide screen of gene targeting in mammalian cells and demonstrate a strong recombinogenic effect of colliding polymerases.

Source
DOI: http://dx.doi.org/10.1038/nsmb.2895
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4405182
November 2014

Mapping and dynamics of regulatory DNA and transcription factor networks in A. thaliana.

Cell Rep 2014 Sep 15;8(6):2015-2030. Epub 2014 Sep 15.

Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA. Electronic address:

Our understanding of gene regulation in plants is constrained by our limited knowledge of plant cis-regulatory DNA and its dynamics. We mapped DNase I hypersensitive sites (DHSs) in A. thaliana seedlings and used genomic footprinting to delineate ∼700,000 sites of in vivo transcription factor (TF) occupancy at nucleotide resolution. We show that variation associated with 72 diverse quantitative phenotypes localizes within DHSs. TF footprints encode an extensive cis-regulatory lexicon subject to recent evolutionary pressures, and widespread TF binding within exons may have shaped codon usage patterns. The architecture of A. thaliana TF regulatory networks is strikingly similar to that of animals in spite of diverged regulatory repertoires. We analyzed regulatory landscape dynamics during heat shock and photomorphogenesis, disclosing thousands of environmentally sensitive elements and enabling mapping of key TF regulatory circuits underlying these fundamental responses. Our results provide an extensive resource for the study of A. thaliana gene regulation and functional biology.

Source
DOI: http://dx.doi.org/10.1016/j.celrep.2014.08.019
September 2014

Molecular targets of chromatin repressive mark H3K9me3 in primate progenitor cells within adult neurogenic niches.

Front Genet 2014 30;5:252. Epub 2014 Jul 30.

Department of Biology, University of Texas at San Antonio, San Antonio, TX, USA; Neurobiology, Neuroscience Institute, University of Texas at San Antonio, San Antonio, TX, USA.

Histone 3 Lysine 9 (H3K9) methylation is known to be associated with pericentric heterochromatin and important in genomic stability. In this study, we show that trimethylation at H3K9 (H3K9me3) is enriched in an adult neural stem cell niche, the subventricular zone (SVZ) on the walls of the lateral ventricle, in both rodent and non-human primate (baboon) brain. Previous studies have shown that there is significant correlation between baboon and human regarding genomic similarity and brain structure, suggesting that findings in baboon are relevant to human. To understand the function of H3K9me3 in this adult neurogenic niche, we performed genome-wide analyses using ChIP-Seq (chromatin immunoprecipitation and deep-sequencing) and RNA-Seq for in vivo SVZ cells purified from baboon brain. Through integrated analyses of ChIP-Seq and RNA-Seq, we found that H3K9me3-enriched genes associated with cellular maintenance, post-transcriptional and translational modifications, signaling pathways, and DNA replication are expressed, while genes involved in axon/neuron, hepatic stellate cell, or immune-response activation are not expressed. As neurogenesis progresses in the adult SVZ, cell fate restriction is essential to direct proper lineage commitment. Our findings highlight that H3K9me3 repression in undifferentiated SVZ cells is engaged in the maintenance of cell type integrity, implicating a role for H3K9me3 as an epigenetic mechanism to control cell fate transition within this adult germinal niche.

Source
DOI: http://dx.doi.org/10.3389/fgene.2014.00252
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4115620
August 2014

Epigenetic regulation by chromatin activation mark H3K4me3 in primate progenitor cells within adult neurogenic niche.

Sci Rep 2014 Jun 20;4:5371. Epub 2014 Jun 20.

[1] Department of Biology, University of Texas at San Antonio, One UTSA Circle, San Antonio, Texas 78249, USA. [2] Neuroscience Institute, University of Texas at San Antonio, San Antonio, Texas 78249, USA.

Histone 3 lysine 4 trimethylation (H3K4me3) is known to be associated with transcriptionally active or poised genes and required for postnatal neurogenesis within the subventricular zone (SVZ) in the rodent model. Previous comparisons have shown significant correlation between baboon (Papio anubis) and human brain. In this study, we demonstrate that chromatin activation mark H3K4me3 is present in undifferentiated progenitor cells within the SVZ of adult baboon brain. To identify the targets and regulatory role of H3K4me3 within the baboon SVZ, we developed a technique to purify undifferentiated SVZ cells while preserving the endogenous nature without introducing culture artifact to maintain the in vivo chromatin state for genome-wide studies (ChIP-Seq and RNA-Seq). Overall, H3K4me3 is significantly enriched for genes involved in cell cycle, metabolism, protein synthesis, signaling pathways, and cancer mechanisms. Additionally, we found elevated levels of H3K4me3 in the MRI-classified SVZ-associated Glioblastoma Multiforme (GBM), which has a transcriptional profile that reflects the H3K4me3 modifications in the undifferentiated progenitor cells of the baboon SVZ. Our findings highlight the importance of H3K4me3 in coordinating distinct networks and pathways for life-long neurogenesis, and suggest that subtypes of GBM could occur, at least in part, due to aberrant H3K4me3 epigenetic regulation.

Source
DOI: http://dx.doi.org/10.1038/srep05371
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4064326
June 2014

Domains of genome-wide gene expression dysregulation in Down's syndrome.

Nature 2014 Apr;508(7496):345-50

[1] Department of Genetic Medicine and Development, University of Geneva Medical School, University Hospitals of Geneva, 1211 Geneva, Switzerland. [2] iGE3 Institute of Genetics and Genomics of Geneva, 1211 Geneva, Switzerland.

Trisomy 21 is the most frequent genetic cause of cognitive impairment. To assess the perturbations of gene expression in trisomy 21, and to eliminate the noise of genomic variability, we studied the transcriptome of fetal fibroblasts from a pair of monozygotic twins discordant for trisomy 21. Here we show that the differential expression between the twins is organized in domains along all chromosomes that are either upregulated or downregulated. These gene expression dysregulation domains (GEDDs) can be defined by the expression level of their gene content, and are well conserved in induced pluripotent stem cells derived from the twins' fibroblasts. Comparison of the transcriptome of the Ts65Dn mouse model of Down's syndrome and normal littermate mouse fibroblasts also showed GEDDs along the mouse chromosomes that were syntenic in human. The GEDDs correlate with the lamina-associated (LADs) and replication domains of mammalian cells. The overall position of LADs was not altered in trisomic cells; however, the H3K4me3 profile of the trisomic fibroblasts was modified and accurately followed the GEDD pattern. These results indicate that the nuclear compartments of trisomic cells undergo modifications of the chromatin environment influencing the overall transcriptome, and that GEDDs may therefore contribute to some trisomy 21 phenotypes.

Source
DOI: http://dx.doi.org/10.1038/nature13200
April 2014

Coupling transcription factor occupancy to nucleosome architecture with DNase-FLASH.

Nat Methods 2014 Jan 3;11(1):66-72. Epub 2013 Nov 3.

[1] Department of Genome Sciences, University of Washington, Seattle, Washington, USA. [2] Department of Medicine, Division of Oncology, University of Washington, Seattle, Washington, USA.

It is currently not possible to resolve the genome-wide relationship of transcription factors (TFs) and nucleosomes at the level of individual chromatin templates despite rapidly increasing data on TF and nucleosome occupancy in the human genome. Here we describe DNase I-released fragment-length analysis of hypersensitivity (DNase-FLASH), an approach that directly couples mapping of TF occupancy, via quantification of DNA microfragments released from individual TF recognition sites in regulatory DNA, to the surrounding nucleosome architecture, via analysis of larger DNA fragments, in a single assay. DNase-FLASH enables coupling of individual TF footprints to nucleosome occupancy, identifying TFs that precisely demarcate the regulatory DNA-nucleosome interface.

Source
DOI: http://dx.doi.org/10.1038/nmeth.2713
January 2014

Functionally and phenotypically distinct subpopulations of marrow stromal cells are fibroblast in origin and induce different fates in peripheral blood monocytes.

Stem Cells Dev 2014 Apr 23;23(7):729-40. Epub 2013 Nov 23.

Fred Hutchinson Cancer Research Center, Seattle, Washington.

Marrow stromal cells constitute a heterogeneous population of cells, typically isolated after expansion in culture. In vivo, stromal cells often exist in close proximity or in direct contact with monocyte-derived macrophages, yet their interaction with monocytes is largely unexplored. In this report, isolated CD146(+) and CD146(-) stromal cells, as well as immortalized cell lines representative of each (designated HS27a and HS5, respectively), were shown by global DNase I hypersensitive site mapping and principal coordinate analysis to have a lineage association with marrow fibroblasts. Gene expression profiles generated for the CD146(+) and CD146(-) cell lines indicate significant differences in their respective transcriptomes, which translates into differences in secreted factors. Consequently, the conditioned media (CM) from these two populations induce different fates in peripheral blood monocytes. Monocytes incubated in CD146(+) CM acquire a tissue macrophage phenotype, whereas monocytes incubated in CM from CD146(-) cells express markers associated with pre-dendritic cells. Importantly, when CD14(+) monocytes are cultured in contact with the CD146(+) cells, the combined cell populations, assayed as a unit, show increased levels of transcripts associated with organismal development and hematopoietic regulation. In contrast, the gene expression profile from cocultures of monocytes and CD146(-) cells does not differ from that obtained when monocytes are cultured with CD146(-) CM. These in vitro results show that the CD146(+) marrow stromal cells together with monocytes increase the expression of genes relevant to hematopoietic regulation. In vivo relevance of these data is suggested by immunohistochemistry of marrow biopsies showing juxtaposed CD146(+) cells and CD68(+) cells associated with these upregulated proteins.

Source
DOI: http://dx.doi.org/10.1089/scd.2013.0300
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3967370
April 2014