Publications by authors named "Martin J Aryee"

76 Publications

Augmenting and directing long-range CRISPR-mediated activation in human cells.

Nat Methods 2021 09 5;18(9):1075-1081. Epub 2021 Aug 5.

Molecular Pathology Unit, Massachusetts General Hospital, Charlestown, MA, USA.

Epigenetic editing is an emerging technology that uses artificial transcription factors (aTFs) to regulate expression of a target gene. Although human genes can be robustly upregulated by targeting aTFs to promoters, the activation induced by directing aTFs to distal transcriptional enhancers is substantially less robust and consistent. Here we show that long-range activation using CRISPR-based aTFs in human cells can be made more efficient and reliable by concurrently targeting an aTF to the target gene promoter. We used this strategy to direct target gene choice for enhancers capable of regulating more than one promoter and to achieve allele-selective activation of human genes by targeting aTFs to single-nucleotide polymorphisms embedded in distally located sequences. Our results broaden the potential applications of the epigenetic editing toolbox for research and therapeutics.

Source
http://dx.doi.org/10.1038/s41592-021-01224-1 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8446310 (PMC)
September 2021

Smart-RRBS for single-cell methylome and transcriptome analysis.

Nat Protoc 2021 08 9;16(8):4004-4030. Epub 2021 Jul 9.

Broad Institute of MIT and Harvard, Cambridge, MA, USA.

The integration of DNA methylation and transcriptional state within single cells is of broad interest. Several single-cell dual- and multi-omics approaches have been reported that enable further investigation into cellular heterogeneity, including the discovery and in-depth study of rare cell populations. Such analyses will continue to provide important mechanistic insights into the regulatory consequences of epigenetic modifications. We recently reported a new method for profiling the DNA methylome and transcriptome from the same single cells in a cancer research study. Here, we present details of the protocol and provide guidance on its utility. Our Smart-RRBS (reduced representation bisulfite sequencing) protocol combines Smart-seq2 and RRBS and entails physically separating mRNA from the genomic DNA. It generates paired epigenetic promoter and RNA-expression measurements for ~24% of protein-coding genes in a typical single cell. It also works for micro-dissected tissue samples comprising hundreds of cells. The protocol, excluding flow sorting of cells and sequencing, takes ~3 d to process up to 192 samples manually. It requires basic molecular biology expertise and laboratory equipment, including a PCR workstation with UV sterilization, a DNA fluorometer and a microfluidic electrophoresis system.

Source
http://dx.doi.org/10.1038/s41596-021-00571-9 (DOI)
August 2021

STAG2 loss rewires oncogenic and developmental programs to promote metastasis in Ewing sarcoma.

Cancer Cell 2021 Jun;39(6):827-844.e10

Dana-Farber/Boston Children's Cancer and Blood Disorders Center, Boston, MA, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA.

The core cohesin subunit STAG2 is recurrently mutated in Ewing sarcoma but its biological role is less clear. Here, we demonstrate that cohesin complexes containing STAG2 occupy enhancer and polycomb repressive complex (PRC2)-marked regulatory regions. Genetic suppression of STAG2 leads to a compensatory increase in cohesin-STAG1 complexes, but not in enhancer-rich regions, and results in reprogramming of cis-chromatin interactions. Strikingly, in STAG2 knockout cells the oncogenic genetic program driven by the fusion transcription factor EWS/FLI1 was highly perturbed, in part due to altered enhancer-promoter contacts. Moreover, loss of STAG2 also disrupted PRC2-mediated regulation of gene expression. Combined, these transcriptional changes converged to modulate EWS/FLI1, migratory, and neurodevelopmental programs. Finally, consistent with clinical observations, functional studies revealed that loss of STAG2 enhances the metastatic potential of Ewing sarcoma xenografts. Our findings demonstrate that STAG2 mutations can alter chromatin architecture and transcriptional programs to promote an aggressive cancer phenotype.

Source
http://dx.doi.org/10.1016/j.ccell.2021.05.007 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8378827 (PMC)
June 2021

Extended-representation bisulfite sequencing of gene regulatory elements in multiplexed samples and single cells.

Nat Biotechnol 2021 09 6;39(9):1086-1094. Epub 2021 May 6.

Department of Pathology and Center for Cancer Research, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA.

The biological roles of DNA methylation have been elucidated by profiling methods based on whole-genome or reduced-representation bisulfite sequencing, but these approaches do not efficiently survey the vast numbers of non-coding regulatory elements in mammalian genomes. Here we present an extended-representation bisulfite sequencing (XRBS) method for targeted profiling of DNA methylation. Our design strikes a balance between expanding coverage of regulatory elements and reproducibly enriching informative CpG dinucleotides in promoters, enhancers and CTCF binding sites. Barcoded DNA fragments are pooled before bisulfite conversion, allowing multiplex processing and technical consistency in low-input samples. Application of XRBS to single leukemia cells enabled us to evaluate genetic copy number variations and methylation variability across individual cells. Our analysis highlights heterochromatic H3K9me3 regions as having the highest cell-to-cell variability in their methylation, likely reflecting inherent epigenetic instability of these late-replicating regions, compounded by differences in cell cycle stages among sampled cells.

Source
http://dx.doi.org/10.1038/s41587-021-00910-x (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8434949 (PMC)
September 2021

Data-Driven Polymer Model for Mechanistic Exploration of Diploid Genome Organization.

Biophys J 2020 11 22;119(9):1905-1916. Epub 2020 Sep 22.

Departments of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts.

Chromosomes are positioned nonrandomly inside the nucleus to coordinate with their transcriptional activity. The molecular mechanisms that dictate the global genome organization and the nuclear localization of individual chromosomes are not fully understood. We introduce a polymer model to study the organization of the diploid human genome. It is data-driven because all parameters can be derived from Hi-C data; it is also a mechanistic model because the energy function is explicitly written out based on a few biologically motivated hypotheses. These two features distinguish the model from existing approaches and make it useful both for reconstructing genome structures and for exploring the principles of genome organization. We carried out extensive validations to show that simulated genome structures reproduce a wide variety of experimental measurements, including chromosome radial positions and spatial distances between homologous pairs. Detailed mechanistic investigations support the importance of both specific interchromosomal interactions and centromere clustering for chromosome positioning. We anticipate the polymer model, when combined with Hi-C experiments, to be a powerful tool for investigating large-scale rearrangements in genome structure upon cell differentiation and tumor progression.

Source
http://dx.doi.org/10.1016/j.bpj.2020.09.009 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7677132 (PMC)
November 2020

Large-Scale Topological Changes Restrain Malignant Progression in Colorectal Cancer.

Cell 2020 09 24;182(6):1474-1489.e23. Epub 2020 Aug 24.

Department of Pathology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA; Center for Cancer Research, Massachusetts General Hospital, Boston, MA 02129, USA.

Widespread changes to DNA methylation and chromatin are well documented in cancer, but the fate of higher-order chromosomal structure remains obscure. Here we integrated topological maps for colon tumors and normal colons with epigenetic, transcriptional, and imaging data to characterize alterations to chromatin loops, topologically associated domains, and large-scale compartments. We found that spatial partitioning of the open and closed genome compartments is profoundly compromised in tumors. This reorganization is accompanied by compartment-specific hypomethylation and chromatin changes. Additionally, we identify a compartment at the interface between the canonical A and B compartments that is reorganized in tumors. Remarkably, similar shifts were evident in non-malignant cells that have accumulated excess divisions. Our analyses suggest that these topological changes repress stemness and invasion programs while inducing anti-tumor immunity genes and may therefore restrain malignant progression. Our findings call into question the conventional view that tumor-associated epigenomic alterations are primarily oncogenic.

Source
http://dx.doi.org/10.1016/j.cell.2020.07.030 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7575124 (PMC)
September 2020

Massively parallel single-cell mitochondrial DNA genotyping and chromatin profiling.

Nat Biotechnol 2021 04 12;39(4):451-461. Epub 2020 Aug 12.

Division of Hematology/Oncology, Boston Children's Hospital and Department of Pediatric Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA.

Natural mitochondrial DNA (mtDNA) mutations enable the inference of clonal relationships among cells. mtDNA can be profiled along with measures of cell state, but has not yet been combined with the massively parallel approaches needed to tackle the complexity of human tissue. Here, we introduce a high-throughput, droplet-based mitochondrial single-cell assay for transposase-accessible chromatin with sequencing (scATAC-seq), a method that combines high-confidence mtDNA mutation calling in thousands of single cells with their concomitant high-quality accessible chromatin profile. This enables the inference of mtDNA heteroplasmy, clonal relationships, cell state and accessible chromatin variation in individual cells. We reveal single-cell variation in heteroplasmy of a pathologic mtDNA variant, which we associate with intra-individual chromatin variability and clonal evolution. We clonally trace thousands of cells from cancers, linking epigenomic variability to subclonal evolution, and infer cellular dynamics of differentiating hematopoietic cells in vitro and in vivo. Taken together, our approach enables the study of cellular population dynamics and clonal properties in vivo.
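The heteroplasmy inference mentioned in this abstract can be illustrated with a minimal sketch. It assumes heteroplasmy is estimated per cell as the fraction of reads at an mtDNA position that carry the variant allele; the function name, the `min_depth` threshold, and the example counts are hypothetical and are not taken from the paper.

```python
def heteroplasmy(alt_reads, total_reads, min_depth=10):
    """Fraction of mtDNA reads carrying the variant allele.

    Returns None when coverage is below min_depth (an assumed cutoff),
    because the estimate is too noisy to interpret at low depth.
    """
    if total_reads < min_depth:
        return None
    return alt_reads / total_reads


# Hypothetical per-cell counts at one mtDNA position: cell -> (variant reads, total reads)
counts = {"cell_A": (45, 50), "cell_B": (2, 40), "cell_C": (1, 3)}

per_cell = {cell: heteroplasmy(alt, total) for cell, (alt, total) in counts.items()}
```

Cells with similar heteroplasmy at the same variant are candidates for sharing a clonal origin, which is the idea the abstract builds on; real variant calling would also need strand and base-quality filters that are omitted here.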

Source
http://dx.doi.org/10.1038/s41587-020-0645-6 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7878580 (PMC)
April 2021

Author Correction: Feature selection and dimension reduction for single-cell RNA-Seq based on a multinomial model.

Genome Biol 2020 Jul 22;21(1):179. Epub 2020 Jul 22.

Department of Biostatistics, Harvard University, Cambridge, MA, USA.

An amendment to this paper has been published and can be accessed via the original article.

Source
http://dx.doi.org/10.1186/s13059-020-02109-w (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7374840 (PMC)
July 2020

Publisher Correction: Engineered CRISPR-Cas12a variants with increased activities and improved targeting ranges for gene, epigenetic and base editing.

Nat Biotechnol 2020 Jul;38(7):901

Molecular Pathology Unit, Massachusetts General Hospital, Charlestown, MA, USA.

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

Source
http://dx.doi.org/10.1038/s41587-020-0587-z (DOI)
July 2020

A dual-deaminase CRISPR base editor enables concurrent adenine and cytosine editing.

Nat Biotechnol 2020 07 1;38(7):861-864. Epub 2020 Jun 1.

Molecular Pathology Unit, Massachusetts General Hospital, Charlestown, MA, USA.

Existing adenine and cytosine base editors induce only a single type of modification, limiting the range of DNA alterations that can be created. Here we describe a CRISPR-Cas9-based synchronous programmable adenine and cytosine editor (SPACE) that can concurrently introduce A-to-G and C-to-T substitutions with minimal RNA off-target edits. SPACE expands the range of possible DNA sequence alterations, broadening the research applications of CRISPR base editors.

Source
http://dx.doi.org/10.1038/s41587-020-0535-y (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7723518 (PMC)
July 2020

Feature selection and dimension reduction for single-cell RNA-Seq based on a multinomial model.

Genome Biol 2019 12 23;20(1):295. Epub 2019 Dec 23.

Department of Biostatistics, Harvard University, Cambridge, MA, USA.

Single-cell RNA-Seq (scRNA-Seq) profiles gene expression of individual cells. Recent scRNA-Seq datasets have incorporated unique molecular identifiers (UMIs). Using negative controls, we show UMI counts follow multinomial sampling with no zero inflation. Current normalization procedures such as log of counts per million and feature selection by highly variable genes produce false variability in dimension reduction. We propose simple multinomial methods, including generalized principal component analysis (GLM-PCA) for non-normal distributions, and feature selection using deviance. These methods outperform the current practice in a downstream clustering assessment using ground truth datasets.
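Feature selection by deviance, as named in this abstract, can be sketched as follows. The sketch assumes a binomial approximation to the multinomial model: each gene is scored by how poorly a constant-proportion null fits its UMI counts across cells, and high-deviance genes are kept as informative. The function name and the toy counts are illustrative, not the authors' implementation.

```python
import math


def binomial_deviance(gene_counts, cell_totals):
    """Deviance of one gene's UMI counts against a constant-proportion null.

    gene_counts[i] is the gene's UMI count in cell i; cell_totals[i] is the
    total UMI count of cell i. Under the null, the gene takes the same
    fraction pi of every cell's UMIs.
    """
    pi = sum(gene_counts) / sum(cell_totals)  # pooled relative abundance
    dev = 0.0
    for y, n in zip(gene_counts, cell_totals):
        if y > 0:  # terms with y == 0 contribute zero in the limit
            dev += y * math.log(y / (n * pi))
        if n - y > 0:
            dev += (n - y) * math.log((n - y) / (n * (1 - pi)))
    return 2 * dev


cell_totals = [100, 200, 300]
flat_gene = [10, 20, 30]      # same 10% share in every cell
variable_gene = [0, 0, 60]    # same pooled share, concentrated in one cell

flat_score = binomial_deviance(flat_gene, cell_totals)
variable_score = binomial_deviance(variable_gene, cell_totals)
```

A gene whose share is constant across cells scores approximately zero, while a gene concentrated in a subset of cells scores high, so ranking genes by this value and keeping the top of the list is one way to select features without a log transform.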

Source
http://dx.doi.org/10.1186/s13059-019-1861-6 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6927135 (PMC)
December 2019

High levels of AAV vector integration into CRISPR-induced DNA breaks.

Nat Commun 2019 09 30;10(1):4439. Epub 2019 Sep 30.

Department of Neurobiology, Harvard Medical School, Boston, MA, 02115, USA.

Adeno-associated virus (AAV) vectors have shown promising results in preclinical models, but the genomic consequences of transduction with AAV vectors encoding CRISPR-Cas nucleases are still being examined. In this study, we observe high levels of AAV integration (up to 47%) into Cas9-induced double-strand breaks (DSBs) in therapeutically relevant genes in cultured murine neurons, mouse brain, muscle and cochlea. Genome-wide AAV mapping in mouse brain shows no overall increase of AAV integration except at the CRISPR/Cas9 target site. To allow detailed characterization of integration events we engineer a miniature AAV encoding a 465 bp lambda bacteriophage DNA (AAV-λ465), enabling sequencing of the entire integrated vector genome. The integration profile of AAV-λ465 in cultured cells displays both full-length and fragmented AAV genomes at Cas9 on-target sites. Our data indicate that AAV integration should be recognized as a common outcome for applications that utilize AAV for genome editing.

Source
http://dx.doi.org/10.1038/s41467-019-12449-2 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6769011 (PMC)
September 2019

CRISPR DNA base editors with reduced RNA off-target and self-editing activities.

Nat Biotechnol 2019 09 2;37(9):1041-1048. Epub 2019 Sep 2.

Molecular Pathology Unit, Massachusetts General Hospital, Charlestown, MA, USA.

Cytosine or adenine base editors (CBEs or ABEs) can introduce specific DNA C-to-T or A-to-G alterations. However, we recently demonstrated that they can also induce transcriptome-wide guide-RNA-independent editing of RNA bases, and created selective curbing of unwanted RNA editing (SECURE)-BE3 variants that have reduced unwanted RNA-editing activity. Here we describe structure-guided engineering of SECURE-ABE variants with reduced off-target RNA-editing activity and comparable on-target DNA-editing activity that are also among the smallest Streptococcus pyogenes Cas9 base editors described to date. We also tested CBEs with cytidine deaminases other than APOBEC1 and found that the human APOBEC3A-based CBE induces substantial editing of RNA bases, whereas an enhanced APOBEC3A-based CBE, human activation-induced cytidine deaminase-based CBE, and the Petromyzon marinus cytidine deaminase-based CBE Target-AID induce less editing of RNA. Finally, we found that CBEs and ABEs that exhibit RNA off-target editing activity can also self-edit their own transcripts, thereby leading to heterogeneity in base-editor coding sequences.

Source
http://dx.doi.org/10.1038/s41587-019-0236-6 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6730565 (PMC)
September 2019

Magnetic Resonance Spectroscopy-based Metabolomic Biomarkers for Typing, Staging, and Survival Estimation of Early-Stage Human Lung Cancer.

Sci Rep 2019 07 16;9(1):10319. Epub 2019 Jul 16.

Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, 02114, USA.

Low-dose CT has shown promise in detecting early stage lung cancer. However, concerns about the adverse health effects of radiation and high cost prevent its use as a population-wide screening tool. Effective and feasible screening methods to triage suspicious patients to CT are needed. We investigated human lung cancer metabolomics from 93 paired tissue-serum samples with magnetic resonance spectroscopy and identified tissue and serum metabolomic markers that can differentiate cancer types and stages. Most interestingly, we identified serum metabolomic profiles that can predict patient overall survival for all cases (p = 0.0076), and more importantly for Stage I cases alone (n = 58, p = 0.0100), a prediction which is significant for treatment strategies but currently cannot be achieved by any clinical method. Prolonged survival is associated with relative overexpression of glutamine, valine, and glycine, and relative suppression of glutamate and lipids in serum.

Source
http://dx.doi.org/10.1038/s41598-019-46643-5 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6635503 (PMC)
July 2019

Droplet-based combinatorial indexing for massive-scale single-cell chromatin accessibility.

Nat Biotechnol 2019 08 24;37(8):916-924. Epub 2019 Jun 24.

Broad Institute of MIT and Harvard, Cambridge, MA, USA.

Recent technical advancements have facilitated the mapping of epigenomes at single-cell resolution; however, the throughput and quality of these methods have limited their widespread adoption. Here we describe a high-quality (10^5 nuclear fragments per cell) droplet-microfluidics-based method for single-cell profiling of chromatin accessibility. We use this approach, named 'droplet single-cell assay for transposase-accessible chromatin using sequencing' (dscATAC-seq), to assay 46,653 cells for the unbiased discovery of cell types and regulatory elements in adult mouse brain. We further increase the throughput of this platform by combining it with combinatorial indexing (dsciATAC-seq), enabling single-cell studies at a massive scale. We demonstrate the utility of this approach by measuring chromatin accessibility across 136,463 resting and stimulated human bone marrow-derived cells to reveal changes in the cis- and trans-regulatory landscape across cell types and under stimulatory conditions at single-cell resolution. Altogether, we describe a total of 510,123 single-cell profiles, demonstrating the scalability and flexibility of this droplet-based platform.

Source
http://dx.doi.org/10.1038/s41587-019-0147-6 (DOI)
August 2019

Transcriptional States and Chromatin Accessibility Underlying Human Erythropoiesis.

Cell Rep 2019 06;27(11):3228-3240.e7

Division of Hematology/Oncology, Boston Children's Hospital, and Department of Pediatric Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA 02115, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.

Human erythropoiesis serves as a paradigm of physiologic cellular differentiation. This process is also of considerable interest for better understanding anemias and identifying new therapies. Here, we apply deep transcriptomic and accessible chromatin profiling to characterize a faithful ex vivo human erythroid differentiation system from hematopoietic stem and progenitor cells. We reveal stage-specific transcriptional states and chromatin accessibility during various stages of erythropoiesis, including 14,260 differentially expressed genes and 63,659 variably accessible chromatin peaks. Our analysis suggests differentiation stage-predominant roles for specific master regulators, including GATA1 and KLF1. We integrate chromatin profiles with common and rare genetic variants associated with erythroid cell traits and diseases, finding that variants regulating different erythroid phenotypes likely act at variable points during differentiation. In addition, we identify a regulator of terminal erythropoiesis, TMCC2, more broadly illustrating the value of this comprehensive analysis to improve our understanding of erythropoiesis in health and disease.

Source
http://dx.doi.org/10.1016/j.celrep.2019.05.046 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6579117 (PMC)
June 2019

Stromal Microenvironment Shapes the Intratumoral Architecture of Pancreatic Cancer.

Cell 2019 06 30;178(1):160-175.e27. Epub 2019 May 30.

Cancer Center, Massachusetts General Hospital, Boston, MA 02114, USA.

Single-cell technologies have described heterogeneity across tissues, but the spatial distribution and forces that drive single-cell phenotypes have not been well defined. Combining single-cell RNA and protein analytics in studying the role of stromal cancer-associated fibroblasts (CAFs) in modulating heterogeneity in pancreatic cancer (pancreatic ductal adenocarcinoma [PDAC]) model systems, we have identified significant single-cell population shifts toward invasive epithelial-to-mesenchymal transition (EMT) and proliferative (PRO) phenotypes linked with mitogen-activated protein kinase (MAPK) and signal transducer and activator of transcription 3 (STAT3) signaling. Using high-content digital imaging of RNA in situ hybridization in 195 PDAC tumors, we quantified these EMT and PRO subpopulations in 319,626 individual cancer cells that can be classified within the context of distinct tumor gland "units." Tumor gland typing provided an additional layer of intratumoral heterogeneity that was associated with differences in stromal abundance and clinical outcomes. This demonstrates the impact of the stroma in shaping tumor architecture by altering inherent patterns of tumor glands in human PDAC.

Source
http://dx.doi.org/10.1016/j.cell.2019.05.012 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6697165 (PMC)
June 2019

Epigenetic evolution and lineage histories of chronic lymphocytic leukaemia.

Nature 2019 05 15;569(7757):576-580. Epub 2019 May 15.

New York Genome Center, New York, NY, USA.

Genetic and epigenetic intra-tumoral heterogeneity cooperate to shape the evolutionary course of cancer. Chronic lymphocytic leukaemia (CLL) is a highly informative model for cancer evolution as it undergoes substantial genetic diversification and evolution after therapy. The CLL epigenome is also an important disease-defining feature, and growing populations of cells in CLL diversify by stochastic changes in DNA methylation known as epimutations. However, previous studies using bulk sequencing methods to analyse the patterns of DNA methylation were unable to determine whether epimutations affect CLL populations homogeneously. Here, to measure the epimutation rate at single-cell resolution, we applied multiplexed single-cell reduced-representation bisulfite sequencing to B cells from healthy donors and patients with CLL. We observed that the common clonal origin of CLL results in a consistently increased epimutation rate, with low variability in the cell-to-cell epimutation rate. By contrast, variable epimutation rates across healthy B cells reflect diverse evolutionary ages across the trajectory of B cell differentiation, consistent with epimutations serving as a molecular clock. Heritable epimutation information allowed us to reconstruct lineages at high-resolution with single-cell data, and to apply this directly to patient samples. The CLL lineage tree shape revealed earlier branching and longer branch lengths than in normal B cells, reflecting rapid drift after the initial malignant transformation and a greater proliferative history. Integration of single-cell bisulfite sequencing analysis with single-cell transcriptomes and genotyping confirmed that genetic subclones mapped to distinct clades, as inferred solely on the basis of epimutation information. 
Finally, to examine potential lineage biases during therapy, we profiled serial samples during ibrutinib-associated lymphocytosis, and identified clades of cells that were preferentially expelled from the lymph node after treatment, marked by distinct transcriptional profiles. The single-cell integration of genetic, epigenetic and transcriptional information thus charts the lineage history of CLL and its evolution with therapy.

Source
http://dx.doi.org/10.1038/s41586-019-1198-z (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6533116 (PMC)
May 2019

Single-cell trajectories reconstruction, exploration and mapping of omics data with STREAM.

Nat Commun 2019 04 23;10(1):1903. Epub 2019 Apr 23.

Molecular Pathology Unit & Cancer Center, Massachusetts General Hospital Research Institute and Harvard Medical School, Boston, MA, 02114, USA.

Single-cell transcriptomic assays have enabled the de novo reconstruction of lineage differentiation trajectories, along with the characterization of cellular heterogeneity and state transitions. Several methods have been developed for reconstructing developmental trajectories from single-cell transcriptomic data, but efforts on analyzing single-cell epigenomic data and on trajectory visualization remain limited. Here we present STREAM, an interactive pipeline capable of disentangling and visualizing complex branching trajectories from both single-cell transcriptomic and epigenomic data. We have tested STREAM on several synthetic and real datasets generated with different single-cell technologies. We further demonstrate its utility for understanding myoblast differentiation and disentangling known heterogeneity in hematopoiesis for different organisms. STREAM is an open-source software package.

Source
http://dx.doi.org/10.1038/s41467-019-09670-4 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6478907 (PMC)
April 2019

Transcriptome-wide off-target RNA editing induced by CRISPR-guided DNA base editors.

Nature 2019 05 17;569(7756):433-437. Epub 2019 Apr 17.

Molecular Pathology Unit, Massachusetts General Hospital, Charlestown, MA, USA.

CRISPR-Cas base-editor technology enables targeted nucleotide alterations, and is being increasingly used for research and potential therapeutic applications. The most widely used cytosine base editors (CBEs) induce deamination of DNA cytosines using the rat APOBEC1 enzyme, which is targeted by a linked Cas protein-guide RNA complex. Previous studies of the specificity of CBEs have identified off-target DNA edits in mammalian cells. Here we show that a CBE with rat APOBEC1 can cause extensive transcriptome-wide deamination of RNA cytosines in human cells, inducing tens of thousands of C-to-U edits with frequencies ranging from 0.07% to 100% in 38-58% of expressed genes. CBE-induced RNA edits occur in both protein-coding and non-protein-coding sequences and generate missense, nonsense, splice site, and 5' and 3' untranslated region mutations. We engineered two CBE variants bearing mutations in rat APOBEC1 that substantially decreased the number of RNA edits (by more than 390-fold and more than 3,800-fold) in human cells. These variants also showed more precise on-target DNA editing than the wild-type CBE and, for most guide RNAs tested, no substantial reduction in editing efficiency. Finally, we show that an adenine base editor can also induce transcriptome-wide RNA edits. These results have implications for the use of base editors in both research and clinical settings, illustrate the feasibility of engineering improved variants with reduced RNA editing activities, and suggest the need to more fully define and characterize the RNA off-target effects of deaminase enzymes in base editor platforms.

Source
http://dx.doi.org/10.1038/s41586-019-1161-z (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6657343 (PMC)
May 2019

A (fire)cloud-based DNA methylation data preprocessing and quality control platform.

BMC Bioinformatics 2019 Mar 29;20(1):160. Epub 2019 Mar 29.

Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, MA, USA.

Background: Bisulfite sequencing allows base-pair resolution profiling of DNA methylation and has recently been adapted for use in single cells. Analyzing these data, including making comparisons with existing data, remains challenging due to the scale of the data and differences in preprocessing methods between published datasets.

Results: We present a set of preprocessing pipelines for bisulfite sequencing DNA methylation data that include a new R/Bioconductor package, scmeth, for a series of efficient QC analyses of large datasets. The pipelines go from raw data to CpG-level methylation estimates and can be run, with identical results, either on a single computer, in an HPC cluster or on Google Cloud Compute resources. These pipelines are designed to allow users to 1) ensure reproducibility of analyses, 2) achieve scalability to large whole genome datasets with 100 GB+ of raw data per sample and to single-cell datasets with thousands of cells, 3) enable integration and comparison between user-provided data and publicly available data, as all samples can be processed through the same pipeline, and 4) access best-practice analysis pipelines. Pipelines are provided for whole genome bisulfite sequencing (WGBS), reduced representation bisulfite sequencing (RRBS) and hybrid selection (capture) bisulfite sequencing (HSBS).

Conclusions: The workflows produce data quality metrics, visualization tracks, and aggregated output for further downstream analysis. Optional use of cloud computing resources facilitates analysis of large datasets, and integration with existing methylome profiles. The workflow design principles are applicable to other genomic data types.
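The "CpG-level methylation estimates" that the pipelines produce can be illustrated with a minimal sketch. It assumes the usual convention that a CpG's methylation level is the fraction of covering reads that were called methylated after bisulfite conversion; the function name and the input tuple layout are hypothetical and do not reflect the scmeth API or the pipelines' file formats.

```python
from collections import defaultdict


def cpg_methylation(calls):
    """Aggregate per-read calls into per-CpG estimates.

    calls is an iterable of (chromosome, position, is_methylated) tuples,
    one per read covering a CpG. Returns a dict mapping
    (chromosome, position) to (fraction methylated, read coverage).
    """
    tallies = defaultdict(lambda: [0, 0])  # site -> [methylated reads, total reads]
    for chrom, pos, is_methylated in calls:
        tallies[(chrom, pos)][0] += int(is_methylated)
        tallies[(chrom, pos)][1] += 1
    return {site: (m / n, n) for site, (m, n) in tallies.items()}


# Hypothetical calls: three reads cover chr1:100, one read covers chr1:250
calls = [
    ("chr1", 100, True),
    ("chr1", 100, True),
    ("chr1", 100, False),
    ("chr1", 250, False),
]
estimates = cpg_methylation(calls)
```

Reporting coverage next to each fraction matters for single-cell data, where most CpGs are covered by very few reads and the estimate is close to binary.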

Source
http://dx.doi.org/10.1186/s12859-019-2750-4 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6440105 (PMC)
March 2019

Interrogation of human hematopoiesis at single-cell and single-variant resolution.

Nat Genet 2019 04 11;51(4):683-693. Epub 2019 Mar 11.

Division of Hematology/Oncology, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA.

Widespread linkage disequilibrium and incomplete annotation of cell-to-cell state variation represent substantial challenges to elucidating mechanisms of trait-associated genetic variation. Here we perform genetic fine-mapping for blood cell traits in the UK Biobank to identify putative causal variants. These variants are enriched in genes encoding proteins in trait-relevant biological pathways and in accessible chromatin of hematopoietic progenitors. For regulatory variants, we explore patterns of developmental enhancer activity, predict molecular mechanisms, and identify likely target genes. In several instances, we localize multiple independent variants to the same regulatory element or gene. We further observe that variants with pleiotropic effects preferentially act in common progenitor populations to direct the production of distinct lineages. Finally, we leverage fine-mapped variants in conjunction with continuous epigenomic annotations to identify trait-cell type enrichments within closely related populations and in single cells. Our study provides a comprehensive framework for single-variant and single-cell analyses of genetic associations.

Source
DOI: http://dx.doi.org/10.1038/s41588-019-0362-6
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6441389
April 2019

Lineage Tracing in Humans Enabled by Mitochondrial Mutations and Single-Cell Genomics.

Cell 2019 03 28;176(6):1325-1339.e22. Epub 2019 Feb 28.

Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Division of Hematology/Oncology, Boston Children's Hospital and Department of Pediatric Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA 02115, USA; Harvard Stem Cell Institute, Cambridge, MA 02138, USA. Electronic address:

Lineage tracing provides key insights into the fate of individual cells in complex organisms. Although effective genetic labeling approaches are available in model systems, in humans, most approaches require detection of nuclear somatic mutations, which have high error rates, limited scale, and do not capture cell state information. Here, we show that somatic mutations in mtDNA can be tracked by single-cell RNA or assay for transposase accessible chromatin (ATAC) sequencing. We leverage somatic mtDNA mutations as natural genetic barcodes and demonstrate their utility as highly accurate clonal markers to infer cellular relationships. We track native human cells both in vitro and in vivo and relate clonal dynamics to gene expression and chromatin accessibility. Our approach should allow clonal tracking at a 1,000-fold greater scale than with nuclear genome sequencing, with simultaneous information on cell state, opening the way to chart cellular dynamics in human health and disease.

Source
DOI: http://dx.doi.org/10.1016/j.cell.2019.01.022
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6408267
March 2019

Preprocessing and Computational Analysis of Single-Cell Epigenomic Datasets.

Methods Mol Biol 2019 ;1935:187-202

Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA.

Recent technological developments have enabled the characterization of the epigenetic landscape of single cells across a range of tissues in normal and diseased states and under various biological and chemical perturbations. While analysis of these profiles resembles methods from single-cell transcriptomic studies, unique challenges are associated with bioinformatics processing of single-cell epigenetic data, including a much larger (10-1,000×) feature set and significantly greater sparsity, requiring customized solutions. Here, we discuss the essentials of the computational methodology required for analyzing common single-cell epigenomic measurements for DNA methylation using bisulfite sequencing and open chromatin using ATAC-Seq.

Source
DOI: http://dx.doi.org/10.1007/978-1-4939-9057-3_13
June 2019

Engineered CRISPR-Cas12a variants with increased activities and improved targeting ranges for gene, epigenetic and base editing.

Nat Biotechnol 2019 03 11;37(3):276-282. Epub 2019 Feb 11.

Molecular Pathology Unit, Massachusetts General Hospital, Charlestown, MA, USA.

Broad use of CRISPR-Cas12a (formerly Cpf1) nucleases has been hindered by the requirement for an extended TTTV protospacer adjacent motif (PAM). To address this limitation, we engineered an enhanced Acidaminococcus sp. Cas12a variant (enAsCas12a) that has a substantially expanded targeting range, enabling targeting of many previously inaccessible PAMs. On average, enAsCas12a exhibits a twofold higher genome editing activity on sites with canonical TTTV PAMs compared to wild-type AsCas12a, and we successfully grafted a subset of mutations from enAsCas12a onto other previously described AsCas12a variants to enhance their activities. enAsCas12a improves the efficiency of multiplex gene editing, endogenous gene activation and C-to-T base editing, and we engineered a high-fidelity version of enAsCas12a (enAsCas12a-HF1) to reduce off-target effects. Both enAsCas12a and enAsCas12a-HF1 function in HEK293T and primary human T cells when delivered as ribonucleoprotein (RNP) complexes. Collectively, enAsCas12a provides an optimized version of Cas12a that should enable wider application of Cas12a enzymes for gene and epigenetic editing.

Source
DOI: http://dx.doi.org/10.1038/s41587-018-0011-0
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6401248
March 2019

Defining CRISPR-Cas9 genome-wide nuclease activities with CIRCLE-seq.

Nat Protoc 2018 11;13(11):2615-2642

Department of Hematology, St. Jude Children's Research Hospital, Memphis, TN, USA.

Circularization for in vitro reporting of cleavage effects by sequencing (CIRCLE-seq) is a sensitive and unbiased method for defining the genome-wide activity (on-target and off-target) of CRISPR-Cas9 nucleases by selective sequencing of nuclease-cleaved genomic DNA (gDNA). Here, we describe a detailed experimental and analytical protocol for CIRCLE-seq. The principle of our method is to generate a library of circularized gDNA with minimized numbers of free ends. Highly purified gDNA circles are treated with CRISPR-Cas9 ribonucleoprotein complexes, and nuclease-linearized DNA fragments are then ligated to adapters for high-throughput sequencing. The primary advantages of CIRCLE-seq as compared with other in vitro methods for defining genome-wide genome editing activity are (i) high enrichment for sequencing nuclease-cleaved gDNA/low background, enabling sensitive detection with low sequencing depth requirements; and (ii) the fact that paired-end reads can contain complete information on individual nuclease cleavage sites, enabling use of CIRCLE-seq in species without high-quality reference genomes. The entire protocol can be completed in 2 weeks, including time for gRNA cloning, sequence verification, in vitro transcription, library preparation, and sequencing.

Source
DOI: http://dx.doi.org/10.1038/s41596-018-0055-0
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6512799
November 2018

In vivo CRISPR editing with no detectable genome-wide off-target mutations.

Nature 2018 09 12;561(7723):416-419. Epub 2018 Sep 12.

Molecular Pathology Unit and Center for Cancer Research, Massachusetts General Hospital, Charlestown, MA, USA.

CRISPR-Cas genome-editing nucleases hold substantial promise for developing human therapeutic applications but identifying unwanted off-target mutations is important for clinical translation. A well-validated method that can reliably identify off-targets in vivo has not been described to date, which means it is currently unclear whether and how frequently these mutations occur. Here we describe 'verification of in vivo off-targets' (VIVO), a highly sensitive strategy that can robustly identify the genome-wide off-target effects of CRISPR-Cas nucleases in vivo. We use VIVO and a guide RNA deliberately designed to be promiscuous to show that CRISPR-Cas nucleases can induce substantial off-target mutations in mouse livers in vivo. More importantly, we also use VIVO to show that appropriately designed guide RNAs can direct efficient in vivo editing in mouse livers with no detectable off-target mutations. VIVO provides a general strategy for defining and quantifying the off-target effects of gene-editing nucleases in whole organisms, thereby providing a blueprint to foster the development of therapeutic strategies that use in vivo gene editing.

Source
DOI: http://dx.doi.org/10.1038/s41586-018-0500-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6194229
September 2018

Enhancer histone-QTLs are enriched on autoimmune risk haplotypes and influence gene expression within chromatin networks.

Nat Commun 2018 07 25;9(1):2905. Epub 2018 Jul 25.

Division of Genomics and Data Sciences, Arthritis and Clinical Immunology Research Program, Oklahoma Medical Research Foundation, Oklahoma City, 73104, OK, USA.

Genetic variants can confer risk to complex genetic diseases by modulating gene expression through changes to the epigenome. To assess the degree to which genetic variants influence epigenome activity, we integrate epigenetic and genotypic data from lupus patient lymphoblastoid cell lines to identify variants that induce allelic imbalance in the magnitude of histone post-translational modifications, referred to herein as histone quantitative trait loci (hQTLs). We demonstrate that enhancer hQTLs are enriched on autoimmune disease risk haplotypes and disproportionately influence gene expression variability compared with non-hQTL variants in strong linkage disequilibrium. We show that the epigenome regulates HLA class II genes differently in individuals who carry HLA-DR3 or HLA-DR15 haplotypes, resulting in differential 3D chromatin conformation and gene expression. Finally, we identify significant expression QTL (eQTL) x hQTL interactions that reveal substructure within eQTL gene expression, suggesting potential implications for functional genomic studies that leverage eQTL data for subject selection and stratification.

Source
DOI: http://dx.doi.org/10.1038/s41467-018-05328-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6060153
July 2018

Activities and specificities of CRISPR/Cas9 and Cas12a nucleases for targeted mutagenesis in maize.

Plant Biotechnol J 2019 02 22;17(2):362-372. Epub 2018 Jul 22.

Crop Bioengineering Center, Iowa State University, Ames, IA, USA.

CRISPR/Cas9 and Cas12a (Cpf1) nucleases are two of the most powerful genome editing tools in plants. In this work, we compared their activities by targeting the maize glossy2 gene coding region that has overlapping sequences recognized by both nucleases. We introduced constructs carrying SpCas9-guide RNA (gRNA) and LbCas12a-CRISPR RNA (crRNA) into maize inbred B104 embryos using Agrobacterium-mediated transformation. On-target mutation analysis showed that 90%-100% of the Cas9-edited T0 plants carried indel mutations and 63%-77% of them were homozygous or biallelic mutants. In contrast, 0%-60% of Cas12a-edited T0 plants had on-target mutations. We then conducted CIRCLE-seq analysis to identify genome-wide potential off-target sites for Cas9. A total of 18 and 67 potential off-targets were identified for the two gRNAs, respectively, with an average of five mismatches compared to the target sites. Sequencing analysis of a selected subset of the off-target sites revealed no detectable level of mutations in the T1 plants, which constitutively express Cas9 nuclease and gRNAs. In conclusion, our results suggest that the CRISPR/Cas9 system used in this study is highly efficient and specific for genome editing in maize, while CRISPR/Cas12a needs further optimization for improved editing efficiency.

Source
DOI: http://dx.doi.org/10.1111/pbi.12982
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6320322
February 2019

Integrated Single-Cell Analysis Maps the Continuous Regulatory Landscape of Human Hematopoietic Differentiation.

Cell 2018 05 26;173(6):1535-1548.e16. Epub 2018 Apr 26.

Center for Personal Dynamic Regulomes, Stanford University, Stanford, CA 94305, USA; Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA; Department of Applied Physics, Stanford University, Stanford, CA 94025, USA; Chan Zuckerberg Biohub, San Francisco, CA 94158, USA. Electronic address:

Human hematopoiesis involves cellular differentiation of multipotent cells into progressively more lineage-restricted states. While the chromatin accessibility landscape of this process has been explored in defined populations, single-cell regulatory variation has been hidden by ensemble averaging. We collected single-cell chromatin accessibility profiles across 10 populations of immunophenotypically defined human hematopoietic cell types and constructed a chromatin accessibility landscape of human hematopoiesis to characterize differentiation trajectories. We find variation consistent with lineage bias toward different developmental branches in multipotent cell types. We observe heterogeneity within common myeloid progenitors (CMPs) and granulocyte-macrophage progenitors (GMPs) and develop a strategy to partition GMPs along their differentiation trajectory. Furthermore, we integrated single-cell RNA sequencing (scRNA-seq) data to associate transcription factors to chromatin accessibility changes and regulatory elements to target genes through correlations of expression and regulatory element accessibility. Overall, this work provides a framework for integrative exploration of complex regulatory dynamics in a primary human tissue at single-cell resolution.

Source
DOI: http://dx.doi.org/10.1016/j.cell.2018.03.074
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5989727
May 2018
-->