Publications by authors named "John Aach"

40 Publications

Barcoded oligonucleotides ligated on RNA amplified for multiplexed and parallel in situ analyses.

Nucleic Acids Res 2021 06;49(10):e58

Department of Genetics, Harvard Medical School, Boston, MA 02115, USA.

We present barcoded oligonucleotides ligated on RNA amplified for multiplexed and parallel in situ analyses (BOLORAMIS), a reverse transcription-free method for spatially-resolved, targeted, in situ RNA identification of single or multiple targets. BOLORAMIS was demonstrated on a range of cell types and human cerebral organoids. Singleplex experiments to detect coding and non-coding RNAs in human iPSCs showed a stem-cell signature pattern. Specificity of BOLORAMIS was found to be 92% as illustrated by a clear distinction between human and mouse housekeeping genes in a co-culture system, as well as by recapitulation of subcellular localization of lncRNA MALAT1. Sensitivity of BOLORAMIS was quantified by comparing with single molecule FISH experiments and found to be 11%, 12% and 35% for GAPDH, TFRC and POLR2A, respectively. To demonstrate BOLORAMIS for multiplexed gene analysis, we targeted 96 mRNAs within a co-culture of iNGN neurons and HMC3 human microglial cells. We used fluorescence in situ sequencing to detect error-robust 8-base barcodes associated with each of these genes. We then used these data to uncover the spatial relationship among cells and transcripts by performing single-cell clustering and gene-gene proximity analyses. We anticipate the BOLORAMIS technology for in situ RNA detection to find applications in basic and translational research.
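The abstract does not say how the 8-base barcodes are made error-robust. As a minimal sketch, assuming the robustness comes from keeping barcodes far apart in Hamming distance, a read with one miscalled base can still be assigned to a unique gene. The codebook, gene names and `max_errors` threshold below are hypothetical and are not taken from the paper.

```python
from typing import Dict, Optional

def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must have equal length")
    return sum(x != y for x, y in zip(a, b))

def decode(read: str, codebook: Dict[str, str], max_errors: int = 1) -> Optional[str]:
    """Return the gene whose barcode is the only one within max_errors of the read."""
    hits = [gene for barcode, gene in codebook.items()
            if hamming(read, barcode) <= max_errors]
    # Ambiguous or unmatched reads are rejected rather than guessed
    return hits[0] if len(hits) == 1 else None

# Hypothetical codebook: any two barcodes differ at 4 or more positions
CODEBOOK = {
    "AAAAAAAA": "GENE_A",
    "CCCCAAAA": "GENE_B",
    "AAAACCCC": "GENE_C",
}
```

With this codebook, `decode("AAAAAAAT", CODEBOOK)` tolerates the single miscalled base and maps to `GENE_A`, while a read far from every barcode is rejected.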

Source
DOI: http://dx.doi.org/10.1093/nar/gkab120
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8191787
June 2021

Characterizing the portability of phage-encoded homologous recombination proteins.

Nat Chem Biol 2021 04 18;17(4):394-402. Epub 2021 Jan 18.

Wyss Institute for Biologically Inspired Engineering, Harvard University, Cambridge, MA, USA.

Efficient genome editing methods are essential for biotechnology and fundamental research. Homologous recombination (HR) is the most versatile method of genome editing, but techniques that rely on host RecA-mediated pathways are inefficient and laborious. Phage-encoded single-stranded DNA annealing proteins (SSAPs) improve HR 1,000-fold above endogenous levels. However, they are not broadly functional. Using Escherichia coli, Lactococcus lactis, Mycobacterium smegmatis, Lactobacillus rhamnosus and Caulobacter crescentus, we investigated the limited portability of SSAPs. We find that these proteins specifically recognize the C-terminal tail of the host's single-stranded DNA-binding protein (SSB) and are portable between species only if compatibility with this host domain is maintained. Furthermore, we find that co-expressing SSAPs with SSBs can significantly improve genome editing efficiency, in some species enabling SSAP functionality even without host compatibility. Finally, we find that high-efficiency HR far surpasses the mutational capacity of commonly used random mutagenesis methods, generating exceptional phenotypes that are inaccessible through sequential nucleotide conversions.

Source
DOI: http://dx.doi.org/10.1038/s41589-020-00710-5
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7990699
April 2021

Chromosome-scale, haplotype-resolved assembly of human genomes.

Nat Biotechnol 2021 03 7;39(3):309-312. Epub 2020 Dec 7.

Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA.

Haplotype-resolved or phased genome assembly provides a complete picture of genomes and their complex genetic variations. However, current algorithms for phased assembly either do not generate chromosome-scale phasing or require pedigree information, which limits their application. We present a method named diploid assembly (DipAsm) that uses long, accurate reads and long-range conformation data for single individuals to generate a chromosome-scale phased assembly within 1 day. Applied to four public human genomes, PGP1, HG002, NA12878 and HG00733, DipAsm produced haplotype-resolved assemblies with minimum contig length needed to cover 50% of the known genome (NG50) up to 25 Mb and phased ~99.5% of heterozygous sites at 98-99% accuracy, outperforming other approaches in terms of both contiguity and phasing completeness. We demonstrate the importance of chromosome-scale phased assemblies for the discovery of structural variants (SVs), including thousands of new transposon insertions, and of highly polymorphic and medically important regions such as the human leukocyte antigen (HLA) and killer cell immunoglobulin-like receptor (KIR) regions. DipAsm will facilitate high-quality precision medicine and studies of individual haplotype variation and population diversity.
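The NG50 statistic defined in the abstract can be illustrated with a short sketch. Contigs are taken longest first, and NG50 is the length of the contig at which the running total first reaches half of the known genome size. The function name and the toy numbers in the usage note are illustrative only.

```python
def ng50(contig_lengths, genome_size):
    """Contig length at which the longest-first running total first reaches
    half of genome_size; None if the assembly never covers that much."""
    target = genome_size / 2
    total = 0
    for length in sorted(contig_lengths, reverse=True):
        total += length
        if total >= target:
            return length
    return None
```

For example, contigs of 40, 30, 20 and 10 units against a genome of 100 units give an NG50 of 30, because 40 + 30 is the first running total to reach 50. Unlike N50, the denominator is the genome size rather than the assembly size, so an incomplete assembly is penalised.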

Source
DOI: http://dx.doi.org/10.1038/s41587-020-0711-0
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7954703
March 2021

A haplotype-aware de novo assembly of related individuals using pedigree sequence graph.

Bioinformatics 2020 04;36(8):2385-2392

Department of Genetics, Harvard Medical School.

Motivation: Reconstructing high-quality haplotype-resolved assemblies for related individuals has important applications in Mendelian diseases and population genomics. Through major genomics sequencing efforts such as the Personal Genome Project, the Vertebrate Genome Project (VGP) and the Genome in a Bottle project (GIAB), a variety of sequencing datasets from trios of diploid genomes are becoming available. Current trio assembly approaches are not designed to incorporate long- and short-read data from mother-father-child trios, and therefore require relatively high coverages of costly long-read data to produce high-quality assemblies. Thus, building a trio-aware assembler capable of producing accurate and chromosomal-scale diploid genomes of all individuals in a pedigree, while being cost-effective in terms of sequencing costs, is a pressing need of the genomics community.

Results: We present a novel pedigree sequence graph based approach to diploid assembly using accurate Illumina data and long-read Pacific Biosciences (PacBio) data from all related individuals, thereby generalizing our previous work on single individuals. We demonstrate the effectiveness of our pedigree approach on a simulated trio of pseudo-diploid yeast genomes with different heterozygosity rates, and real data from human chromosome. We show that we require as little as 30× coverage Illumina data and 15× PacBio data from each individual in a trio to generate chromosomal-scale phased assemblies. Additionally, we show that we can detect and phase variants from generated phased assemblies.

Availability And Implementation: https://github.com/shilpagarg/WHdenovo.

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/btz942
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7759745
April 2020

Optimizing complex phenotypes through model-guided multiplex genome engineering.

Genome Biol 2017 05 25;18(1):100. Epub 2017 May 25.

Department of Genetics, Harvard Medical School, Boston, MA, USA.

We present a method for identifying genomic modifications that optimize a complex phenotype through multiplex genome engineering and predictive modeling. We apply our method to identify six single nucleotide mutations that recover 59% of the fitness defect exhibited by the 63-codon E. coli strain C321.∆A. By introducing targeted combinations of changes in multiplex we generate rich genotypic and phenotypic diversity and characterize clones using whole-genome sequencing and doubling time measurements. Regularized multivariate linear regression accurately quantifies individual allelic effects and overcomes bias from hitchhiking mutations and context-dependence of genome editing efficiency that would confound other strategies.
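The abstract names regularized multivariate linear regression but gives neither the penalty nor the software used. As a minimal sketch, assuming a ridge (L2) penalty as a stand-in, per-allele effects on doubling time can be estimated from a clone-by-allele genotype matrix. The function name, the penalty value and the toy data in the usage note are hypothetical.

```python
def ridge_fit(X, y, lam=1e-6):
    """Solve (X^T X + lam*I) beta = X^T y by Gaussian elimination.

    X: rows are clones, columns are predictors (1.0 for the intercept,
       then 0/1 for absence/presence of each allele).
    y: measured phenotype per clone, e.g. doubling time.
    Note: for brevity the penalty is also applied to the intercept.
    """
    n, p = len(X), len(X[0])
    # Augmented normal-equation matrix [X^T X + lam*I | X^T y]
    A = [[sum(X[k][i] * X[k][j] for k in range(n)) + (lam if i == j else 0.0)
          for j in range(p)]
         + [sum(X[k][i] * y[k] for k in range(n))]
         for i in range(p)]
    # Forward elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, p):
            f = A[r][col] / A[col][col]
            for c in range(col, p + 1):
                A[r][c] -= f * A[col][c]
    # Back substitution
    beta = [0.0] * p
    for i in range(p - 1, -1, -1):
        s = sum(A[i][j] * beta[j] for j in range(i + 1, p))
        beta[i] = (A[i][p] - s) / A[i][i]
    return beta
```

As a usage example, four hypothetical clones with genotypes `[[1, 0, 0], [1, 1, 0], [1, 0, 1], [1, 1, 1]]` and doubling times `[60, 55, 58, 53]` yield coefficients close to a baseline of 60 with effects of about -5 and -2 for the two alleles. Fitting all alleles jointly is what lets the model separate a causal allele from a mutation that merely co-occurs with it in some clones.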

Source
DOI: http://dx.doi.org/10.1186/s13059-017-1217-z
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5445303
May 2017

Addressing the ethical issues raised by synthetic human entities with embryo-like features.

Elife 2017 03 21;6. Epub 2017 Mar 21.

Department of Genetics, Harvard Medical School, Boston, United States.

The "14-day rule" for embryo research stipulates that experiments with intact human embryos must not allow them to develop beyond 14 days or the appearance of the primitive streak. However, recent experiments showing that suitably cultured human pluripotent stem cells can self-organize and recapitulate embryonic features have highlighted difficulties with the 14-day rule and led to calls for its reassessment. Here we argue that these and related experiments raise more foundational issues that cannot be fixed by adjusting the 14-day rule, because the framework underlying the rule cannot adequately describe the ways by which synthetic human entities with embryo-like features (SHEEFs) might develop morally concerning features through altered forms of development. We propose that limits on research with SHEEFs be based as directly as possible on the generation of such features, and recommend that the research and bioethics communities lead a wide-ranging inquiry aimed at mapping out solutions to the ethical problems raised by them.

Source
DOI: http://dx.doi.org/10.7554/eLife.20674
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5360441
March 2017

Engineering and optimising deaminase fusions for genome editing.

Nat Commun 2016 11 2;7:13330. Epub 2016 Nov 2.

Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA.

Precise editing is essential for biomedical research and gene therapy. Yet, homology-directed genome modification is limited by the requirements for genomic lesions, homology donors and the endogenous DNA repair machinery. Here we engineered programmable cytidine deaminases and tested whether we could introduce site-specific cytidine to thymidine transitions in the absence of targeted genomic lesions. Our programmable deaminases effectively convert specific cytidines to thymidines with 13% efficiency in Escherichia coli and 2.5% in human cells. However, off-target deaminations were detected more than 150 bp away from the target site. Moreover, whole genome sequencing revealed that edited bacterial cells did not harbour chromosomal abnormalities but demonstrated elevated global cytidine deamination at deaminase intrinsic binding sites. Therefore, programmable deaminases represent a promising genome editing tool in prokaryotes and eukaryotes. Future engineering is required to overcome the processivity and the intrinsic DNA binding affinity of deaminases for safer therapeutic applications.

Source
DOI: http://dx.doi.org/10.1038/ncomms13330
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5097136
November 2016

Genome-wide inactivation of porcine endogenous retroviruses (PERVs).

Science 2015 Nov 11;350(6264):1101-4. Epub 2015 Oct 11.

Department of Genetics, Harvard Medical School, Boston, MA, USA. Wyss Institute for Biologically Inspired Engineering, Harvard University, Cambridge, MA, USA. eGenesis Biosciences, Boston, MA 02115, USA.

The shortage of organs for transplantation is a major barrier to the treatment of organ failure. Although porcine organs are considered promising, their use has been checked by concerns about the transmission of porcine endogenous retroviruses (PERVs) to humans. Here we describe the eradication of all PERVs in a porcine kidney epithelial cell line (PK15). We first determined the PK15 PERV copy number to be 62. Using CRISPR-Cas9, we disrupted all copies of the PERV pol gene and demonstrated a >1000-fold reduction in PERV transmission to human cells, using our engineered cells. Our study shows that CRISPR-Cas9 multiplexability can be as high as 62 and demonstrates the possibility that PERVs can be inactivated for clinical application of porcine-to-human xenotransplantation.

Source
DOI: http://dx.doi.org/10.1126/science.aad1191
November 2015

Highly efficient Cas9-mediated transcriptional programming.

Nat Methods 2015 Apr 2;12(4):326-8. Epub 2015 Mar 2.

[1] Wyss Institute for Biologically Inspired Engineering, Harvard University, Cambridge, Massachusetts, USA. [2] Department of Genetics, Harvard Medical School, Boston, Massachusetts, USA.

The RNA-guided nuclease Cas9 can be reengineered as a programmable transcription factor. However, modest levels of gene activation have limited potential applications. We describe an improved transcriptional regulator obtained through the rational design of a tripartite activator, VP64-p65-Rta (VPR), fused to nuclease-null Cas9. We demonstrate its utility in activating endogenous coding and noncoding genes, targeting several genes simultaneously and stimulating neuronal differentiation of human induced pluripotent stem cells (iPSCs).

Source
DOI: http://dx.doi.org/10.1038/nmeth.3312
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4393883
April 2015

Fluorescent in situ sequencing (FISSEQ) of RNA for gene expression profiling in intact cells and tissues.

Nat Protoc 2015 Mar 12;10(3):442-58. Epub 2015 Feb 12.

[1] Wyss Institute, Harvard Medical School, Boston, Massachusetts, USA. [2] Department of Genetics, Harvard Medical School, Boston, Massachusetts, USA.

RNA-sequencing (RNA-seq) measures the quantitative change in gene expression over the whole transcriptome, but it lacks spatial context. In contrast, in situ hybridization provides the location of gene expression, but only for a small number of genes. Here we detail a protocol for genome-wide profiling of gene expression in situ in fixed cells and tissues, in which RNA is converted into cross-linked cDNA amplicons and sequenced manually on a confocal microscope. Unlike traditional RNA-seq, our method enriches for context-specific transcripts over housekeeping and/or structural RNA, and it preserves the tissue architecture for RNA localization studies. Our protocol is written for researchers experienced in cell microscopy with minimal computing skills. Library construction and sequencing can be completed within 14 d, with image analysis requiring an additional 2 d.

Source
DOI: http://dx.doi.org/10.1038/nprot.2014.191
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4327781
March 2015

Targeted and genome-wide sequencing reveal single nucleotide variations impacting specificity of Cas9 in human stem cells.

Nat Commun 2014 Nov 26;5:5507. Epub 2014 Nov 26.

[1] Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA. [2] Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, Massachusetts 02115, USA.

CRISPR/Cas9 has demonstrated high efficiency in site-specific gene targeting. However, potential off-target effects of the Cas9 nuclease represent a major safety concern for any therapeutic application. Here, we knock out the Tafazzin gene by CRISPR/Cas9 in human-induced pluripotent stem cells with 54% efficiency. We combine whole-genome sequencing and deep-targeted sequencing to characterise the off-target effects of Cas9 editing. Whole-genome sequencing of Cas9-modified hiPSC clones detects neither gross genomic alterations nor elevated mutation rates. Deep sequencing of in silico predicted off-target sites in a population of Cas9-treated cells further confirms high specificity of Cas9. However, we identify a single high-efficiency off-target site that is generated by a common germline single-nucleotide variant (SNV) in our experiment. Based on in silico analysis, we estimate a likelihood of SNVs creating off-target sites in a human genome to be ~1.5-8.5%, depending on the genome and site-selection method, but also note that mutations might be generated at these sites only at low rates and may not have functional consequences. Our study demonstrates the feasibility of highly specific clonal ex vivo gene editing using CRISPR/Cas9 and highlights the value of whole-genome sequencing before personalised CRISPR design.

Source
DOI: http://dx.doi.org/10.1038/ncomms6507
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4352754
November 2014

Multi-kilobase homozygous targeted gene replacement in human induced pluripotent stem cells.

Nucleic Acids Res 2015 Feb 20;43(3):e21. Epub 2014 Nov 20.

Department of Genetics, Harvard Medical School, Boston, MA 02115, USA; Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA.

Sequence-specific nucleases such as TALEN and the CRISPR/Cas9 system have so far been used to disrupt, correct or insert transgenes at precise locations in mammalian genomes. We demonstrate efficient 'knock-in' targeted replacement of multi-kilobase genes in human induced pluripotent stem cells (iPSC). Using a model system replacing endogenous human genes with their mouse counterpart, we performed a comprehensive study of targeting vector design parameters for homologous recombination. A 2.7 kilobase (kb) homozygous gene replacement was achieved in up to 11% of iPSC without selection. The optimal homology arm length was around 2 kb, with homology length being especially critical on the arm not adjacent to the cut site. Homologous sequence inside the cut sites was detrimental to targeting efficiency, consistent with a synthesis-dependent strand annealing (SDSA) mechanism. Using two nuclease sites, we observed a high degree of gene excisions and inversions, which sometimes occurred more frequently than indel mutations. While homozygous deletions of 86 kb were achieved with up to 8% frequency, deletion frequencies were not solely a function of nuclease activity and deletion size. Our results analyzing the optimal parameters for targeting vector design will inform future gene targeting efforts involving multi-kilobase gene segments, particularly in human iPSC.

Source
DOI: http://dx.doi.org/10.1093/nar/gku1246
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4330342
February 2015

Multiplex single-molecule interaction profiling of DNA-barcoded proteins.

Nature 2014 Nov 21;515(7528):554-7. Epub 2014 Sep 21.

[1] Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA. [2] Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, Massachusetts 02115, USA.

In contrast with advances in massively parallel DNA sequencing, high-throughput protein analyses are often limited by ensemble measurements, individual analyte purification and hence compromised quality and cost-effectiveness. Single-molecule protein detection using optical methods is limited by the number of spectrally non-overlapping chromophores. Here we introduce a single-molecular-interaction sequencing (SMI-seq) technology for parallel protein interaction profiling leveraging single-molecule advantages. DNA barcodes are attached to proteins collectively via ribosome display or individually via enzymatic conjugation. Barcoded proteins are assayed en masse in aqueous solution and subsequently immobilized in a polyacrylamide thin film to construct a random single-molecule array, where barcoding DNAs are amplified into in situ polymerase colonies (polonies) and analysed by DNA sequencing. This method allows precise quantification of various proteins with a theoretical maximum array density of over one million polonies per square millimetre. Furthermore, protein interactions can be measured on the basis of the statistics of colocalized polonies arising from barcoding DNAs of interacting proteins. Two demanding applications, G-protein coupled receptor and antibody-binding profiling, are demonstrated. SMI-seq enables 'library versus library' screening in a one-pot assay, simultaneously interrogating molecular binding affinity and specificity.

Source
DOI: http://dx.doi.org/10.1038/nature13761
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4246050
November 2014

Improved cell-free RNA and protein synthesis system.

PLoS One 2014 2;9(9):e106232. Epub 2014 Sep 2.

Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America.

Cell-free RNA and protein synthesis (CFPS) is becoming increasingly used for protein production as yields increase and costs decrease. Advances in reconstituted CFPS systems such as the Protein synthesis Using Recombinant Elements (PURE) system offer new opportunities to tailor the reactions for specialized applications including in vitro protein evolution, protein microarrays, isotopic labeling, and incorporating unnatural amino acids. In this study, using firefly luciferase synthesis as a reporter system, we improved PURE system productivity up to 5-fold by adding or adjusting a variety of factors that affect transcription and translation, including elongation factors (EF-Ts, EF-Tu, EF-G, and EF4), ribosome recycling factor (RRF), release factors (RF1, RF2, RF3), chaperones (GroEL/ES), BSA and tRNAs. The work provides a more efficient defined in vitro transcription and translation system and a deeper understanding of the factors that limit the whole system efficiency.

Source
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0106232
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4152126
May 2015

Highly multiplexed subcellular RNA sequencing in situ.

Science 2014 Mar 27;343(6177):1360-3. Epub 2014 Feb 27.

Wyss Institute, Harvard Medical School, Boston, MA 02115, USA.

Understanding the spatial organization of gene expression with single-nucleotide resolution requires localizing the sequences of expressed RNA transcripts within a cell in situ. Here, we describe fluorescent in situ RNA sequencing (FISSEQ), in which stably cross-linked complementary DNA (cDNA) amplicons are sequenced within a biological sample. Using 30-base reads from 8102 genes in situ, we examined RNA expression and localization in human primary fibroblasts with a simulated wound-healing assay. FISSEQ is compatible with tissue sections and whole-mount embryos and reduces the limitations of optical resolution and noisy signals on single-molecule detection. Our platform enables massively parallel detection of genetic elements, including gene transcripts and molecular barcodes, and can be used to investigate cellular phenotype, gene regulation, and environment in situ.

Source
DOI: http://dx.doi.org/10.1126/science.1250212
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4140943
March 2014

Rational optimization of tolC as a powerful dual selectable marker for genome engineering.

Nucleic Acids Res 2014 Apr 22;42(7):4779-90. Epub 2014 Jan 22.

Department of Genetics and Wyss Institute for Biologically Inspired Engineering, Harvard Medical School, Boston, MA 02115, USA, Program in Chemical Biology, Harvard University, Cambridge, MA 02138, USA, Biological and Biomedical Sciences, Harvard Medical School, Boston, MA 02115, USA and Molecular, Cellular, Developmental and Systems Biology Institute, Yale University, New Haven, CT 06516, USA.

Selection has been invaluable for genetic manipulation, although counter-selection has historically exhibited limited robustness and convenience. TolC, an outer membrane pore involved in transmembrane transport in E. coli, has been implemented as a selectable/counter-selectable marker, but counter-selection escape frequency using colicin E1 precludes using tolC for inefficient genetic manipulations and/or with large libraries. Here, we leveraged unbiased deep sequencing of 96 independent lineages exhibiting counter-selection escape to identify loss-of-function mutations, which offered mechanistic insight and guided strain engineering to reduce counter-selection escape frequency by ∼40-fold. We fundamentally improved the tolC counter-selection by supplementing a second agent, vancomycin, which reduces counter-selection escape by 425-fold, compared with colicin E1 alone. Combining these improvements in a mismatch repair proficient strain reduced counter-selection escape frequency by 1.3E6-fold in total, making tolC counter-selection as effective as most selectable markers, and adding a valuable tool to the genome editing toolbox. These improvements permitted us to perform stable and continuous rounds of selection/counter-selection using tolC, enabling replacement of 10 alleles without requiring genotypic screening for the first time. Finally, we combined these advances to create an optimized E. coli strain for genome engineering that is ∼10-fold more efficient at achieving allelic diversity than previous best practices.

Source
DOI: http://dx.doi.org/10.1093/nar/gkt1374
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3985617
April 2014

On the design of clone-based haplotyping.

Genome Biol 2013;14(9):R100

Background: Haplotypes are important for assessing genealogy and disease susceptibility of individual genomes, but are difficult to obtain with routine sequencing approaches. Experimental haplotype reconstruction based on assembling fragments of individual chromosomes is promising, but with variable yields due to incompletely understood parameter choices.

Results: We parameterize the clone-based haplotyping problem in order to provide theoretical and empirical assessments of the impact of different parameters on haplotype assembly. We confirm the intuition that long clones help link together heterozygous variants and thus improve haplotype length. Furthermore, given the length of the clones, we address how to choose the other parameters, including number of pools, clone coverage and sequencing coverage, so as to maximize haplotype length. We model the problem theoretically and show empirically the benefits of using larger clones with a moderate number of pools and sequencing coverage. In particular, using 140 kb BAC clones, we construct haplotypes for a personal genome and assemble haplotypes with N50 values greater than 2.6 Mb. These assembled haplotypes are longer than, and at least as accurate as, haplotypes of existing clone-based strategies, whether in vivo or in vitro.

Conclusions: Our results provide practical guidelines for the development and design of clone-based methods to achieve long range, high-resolution and accurate haplotypes.

Source
DOI: http://dx.doi.org/10.1186/gb-2013-14-9-r100
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4053695
January 2015

Optimization of scarless human stem cell genome editing.

Nucleic Acids Res 2013 Oct 31;41(19):9049-61. Epub 2013 Jul 31.

Department of Genetics, Harvard Medical School, Boston, 02115 MA, USA, Biological and Biomedical Sciences Program, Harvard Medical School, Boston, 02115 MA, USA, Children's Hospital, Boston, 02115 MA, USA, Chemistry and Chemical Biology program, Harvard, 02138 Cambridge, MA, USA and Wyss Institute for Biologically Inspired Engineering, Harvard University, Cambridge, 02138 MA, USA.

Efficient strategies for precise genome editing in human induced pluripotent stem cells (hiPSCs) will enable sophisticated genome engineering for research and clinical purposes. The development of programmable sequence-specific nucleases such as Transcription Activator-Like Effector Nucleases (TALENs) and Cas9-gRNA allows genetic modifications to be made more efficiently at targeted sites of interest. However, many opportunities remain to optimize these tools and to enlarge their spheres of application. We present several improvements: First, we developed functional re-coded TALEs (reTALEs), which not only enable simple one-pot TALE synthesis but also allow TALE-based applications to be performed using lentiviral vectors. We then compared genome-editing efficiencies in hiPSCs mediated by 15 pairs of reTALENs and Cas9-gRNA targeting CCR5 and optimized ssODN design in conjunction with both methods for introducing specific mutations. We found Cas9-gRNA achieved 7-8× higher non-homologous end joining efficiencies (3%) than reTALENs (0.4%) and moderately superior homology-directed repair efficiencies (1.0 versus 0.6%) when combined with ssODN donors in hiPSCs. Using the optimal design, we demonstrated a streamlined process to generate seamlessly genome-corrected hiPSCs within 3 weeks.

Source
DOI: http://dx.doi.org/10.1093/nar/gkt555
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3799423
October 2013

CAS9 transcriptional activators for target specificity screening and paired nickases for cooperative genome engineering.

Nat Biotechnol 2013 Sep 1;31(9):833-8. Epub 2013 Aug 1.

[1] Department of Genetics, Harvard Medical School, Boston, Massachusetts, USA. [2].

Prokaryotic type II CRISPR-Cas systems can be adapted to enable targeted genome modifications across a range of eukaryotes. Here we engineer this system to enable RNA-guided genome regulation in human cells by tethering transcriptional activation domains either directly to a nuclease-null Cas9 protein or to an aptamer-modified single guide RNA (sgRNA). Using this functionality we developed a transcriptional activation-based assay to determine the landscape of off-target binding of sgRNA:Cas9 complexes and compared it with the off-target activity of transcription activator-like (TAL) effectors. Our results reveal that specificity profiles are sgRNA dependent, and that sgRNA:Cas9 complexes and 18-mer TAL effectors can potentially tolerate 1-3 and 1-2 target mismatches, respectively. By engineering a requirement for cooperativity through offset nicking for genome editing or through multiple synergistic sgRNAs for robust transcriptional activation, we suggest methods to mitigate off-target phenomena. Our results expand the versatility of the sgRNA:Cas9 tool and highlight the critical need to engineer improved specificity.

Source
DOI: http://dx.doi.org/10.1038/nbt.2675
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3818127
September 2013

Barcoding cells using cell-surface programmable DNA-binding domains.

Nat Methods 2013 May 17;10(5):403-6. Epub 2013 Mar 17.

Department of Genetics, Harvard Medical School, Boston, Massachusetts, USA.

We report an approach to barcode cells through cell-surface expression of programmable zinc-finger DNA-binding domains (surface zinc fingers, sZFs). We show that sZFs enable sequence-specific labeling of living cells by dsDNA, and we develop a sequential labeling approach to image more than three cell types in mixed populations using three fluorophores. We demonstrate the versatility of sZFs through applications in which they serve as surrogate reporters, function as selective cell capture reagents and facilitate targeted cellular delivery of viruses.

Source
DOI: http://dx.doi.org/10.1038/nmeth.2407
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3641172
May 2013

Genome engineering in Saccharomyces cerevisiae using CRISPR-Cas systems.

Nucleic Acids Res 2013 Apr 4;41(7):4336-43. Epub 2013 Mar 4.

Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA.

Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) and CRISPR-associated (Cas) systems in bacteria and archaea use RNA-guided nuclease activity to provide adaptive immunity against invading foreign nucleic acids. Here, we report the use of a type II bacterial CRISPR-Cas system in Saccharomyces cerevisiae for genome engineering. The CRISPR-Cas components, the Cas9 gene and a designer genome-targeting CRISPR guide RNA (gRNA), show robust and specific RNA-guided endonuclease activity at targeted endogenous genomic loci in yeast. Using constitutive Cas9 expression and a transient gRNA cassette, we show that targeted double-strand breaks can increase homologous recombination rates of single- and double-stranded oligonucleotide donors by 5-fold and 130-fold, respectively. In addition, co-transformation of a gRNA plasmid and a donor DNA in cells constitutively expressing Cas9 resulted in nearly 100% donor DNA recombination frequency. Our approach provides foundations for a simple and powerful genome engineering tool for site-specific mutagenesis and allelic replacement in yeast.

Source
http://dx.doi.org/10.1093/nar/gkt135 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3627607 (PMC)
April 2013

RNA-guided human genome engineering via Cas9.

Science 2013 Feb 3;339(6121):823-6. Epub 2013 Jan 3.

Department of Genetics, Harvard Medical School, Boston, MA 02115, USA.

Bacteria and archaea have evolved adaptive immune defenses, termed clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) systems, that use short RNA to direct degradation of foreign nucleic acids. Here, we engineer the type II bacterial CRISPR system to function with custom guide RNA (gRNA) in human cells. For the endogenous AAVS1 locus, we obtained targeting rates of 10 to 25% in 293T cells, 13 to 8% in K562 cells, and 2 to 4% in induced pluripotent stem cells. We show that this process relies on CRISPR components; is sequence-specific; and, upon simultaneous introduction of multiple gRNAs, can effect multiplex editing of target loci. We also compute a genome-wide resource of ~190 K unique gRNAs targeting ~40.5% of human exons. Our results establish an RNA-guided editing tool for facile, robust, and multiplexable human genome engineering.

Source
http://dx.doi.org/10.1126/science.1232033 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3712628 (PMC)
February 2013

A public resource facilitating clinical use of genomes.

Proc Natl Acad Sci U S A 2012 Jul 13;109(30):11920-7. Epub 2012 Jul 13.

Department of Genetics, Harvard Medical School, Boston, MA 02115, USA.

Rapid advances in DNA sequencing promise to enable new diagnostics and individualized therapies. Achieving personalized medicine, however, will require extensive research on highly reidentifiable, integrated datasets of genomic and health information. To assist with this, participants in the Personal Genome Project choose to forgo privacy via our institutional review board-approved "open consent" process. The contribution of public data and samples facilitates both scientific discovery and standardization of methods. We present our findings after enrollment of more than 1,800 participants, including whole-genome sequencing of 10 pilot participant genomes (the PGP-10). We introduce the Genome-Environment-Trait Evidence (GET-Evidence) system. This tool automatically processes genomes and prioritizes both published and novel variants for interpretation. In the process of reviewing the presumed healthy PGP-10 genomes, we find numerous literature references implying serious disease. Although it is sometimes impossible to rule out a late-onset effect, stringent evidence requirements can address the high rate of incidental findings. To that end we develop a peer production system for recording and organizing variant evaluations according to standard evidence guidelines, creating a public forum for reaching consensus on interpretation of clinically relevant variants. Genome analysis becomes a two-step process: using a prioritized list to record variant evaluations, then automatically sorting reviewed variants using these annotations. Genome data, health and trait information, participant samples, and variant interpretations are all shared in the public domain; we invite others to review our results using our participant samples and contribute to our interpretations. We offer our public resource and methods to further personalized medical research.

Source
http://dx.doi.org/10.1073/pnas.1201904109 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3409785 (PMC)
July 2012

Proteome-wide systems analysis of a cellulosic biofuel-producing microbe.

Mol Syst Biol 2011 Jan;7:461

Department of Genetics, Harvard Medical School, Boston, MA 02115, USA.

Fermentation of plant biomass by microbes like Clostridium phytofermentans recycles carbon globally and can make biofuels from inedible feedstocks. We analyzed C. phytofermentans fermenting cellulosic substrates by integrating quantitative mass spectrometry of more than 2500 proteins with measurements of growth, enzyme activities, fermentation products, and electron microscopy. Absolute protein concentrations were estimated using Absolute Protein EXpression (APEX); relative changes between treatments were quantified with chemical stable isotope labeling by reductive dimethylation (ReDi). We identified the different combinations of carbohydratases used to degrade cellulose and hemicellulose, many of which were secreted based on quantification of supernatant proteins, as well as the repertoires of glycolytic enzymes and alcohol dehydrogenases (ADHs) enabling ethanol production at near maximal yields. Growth on cellulose also resulted in diverse changes such as increased expression of tryptophan synthesis proteins and repression of proteins for fatty acid metabolism and cell motility. This study gives a systems-level understanding of how this microbe ferments biomass and provides a rational, empirical basis to identify engineering targets for industrial cellulosic fermentation.

Source
http://dx.doi.org/10.1038/msb.2010.116 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3049413 (PMC)
January 2011

Personal genomes in progress: from the human genome project to the personal genome project.

Dialogues Clin Neurosci 2010 ;12(1):47-60

European Centre for Public Health Genomics, FHML, Maastricht University, Maastricht, The Netherlands.

The cost of a diploid human genome sequence has dropped from about $70M to $2000 since 2007, even as the standards for redundancy have increased from 7x to 40x in order to improve call rates. Coupled with the low return on investment for common single-nucleotide polymorphisms, this has caused a significant rise in interest in correlating genome sequences with comprehensive environmental and trait data (GET). The costs of electronic health records, imaging, and microbial, immunological, and behavioral data are also dropping quickly. Sharing such integrated GET datasets and their interpretations with a diversity of researchers and research subjects highlights the need for informed-consent models capable of addressing novel privacy and other issues, as well as for flexible data-sharing resources that make materials and data available with minimum restrictions on use. This article examines the Personal Genome Project's effort to develop a GET database as a public genomics resource broadly accessible to both researchers and research participants, while pursuing the highest standards in research ethics.

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3181947 (PMC)
April 2010

Digital RNA allelotyping reveals tissue-specific and allele-specific gene expression in human.

Nat Methods 2009 Aug 20;6(8):613-8. Epub 2009 Jul 20.

Department of Bioengineering, University of California at San Diego, La Jolla, California, USA.

We developed a digital RNA allelotyping method for quantitatively interrogating allele-specific gene expression. This method involves ultra-deep sequencing of padlock-captured single-nucleotide polymorphisms (SNPs) from the transcriptome. We characterized four cell lines established from two human subjects in the Personal Genome Project. Approximately 11-22% of the heterozygous mRNA-associated SNPs showed allele-specific expression in each cell line and 4.3-8.5% were tissue-specific, suggesting the presence of tissue-specific cis regulation. When we applied allelotyping to two pairs of sibling human embryonic stem cell lines, the sibling lines were more similar in allele-specific expression than were the genetically unrelated lines. We found that the variation of allelic ratios in gene expression among different cell lines was primarily explained by genetic variations, much more so than by specific tissue types or growth conditions. Comparison of expressed SNPs on the sense and antisense transcripts suggested that allelic ratios are primarily determined by cis-regulatory mechanisms on the sense transcripts.

Source
http://dx.doi.org/10.1038/nmeth.1357 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2742772 (PMC)
August 2009

Multiplex padlock targeted sequencing reveals human hypermutable CpG variations.

Genome Res 2009 Sep 12;19(9):1606-15. Epub 2009 Jun 12.

Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA.

Utilizing the full power of next-generation sequencing often requires the ability to perform large-scale multiplex enrichment of many specific genomic loci in multiple samples. Several technologies have been recently developed but await substantial improvements. We report the 10,000-fold improvement of a previously developed padlock-based approach, and apply the assay to identifying genetic variations in hypermutable CpG regions across human chromosome 21. From approximately 3 million reads derived from a single Illumina Genome Analyzer lane, approximately 94% (approximately 50,500) target sites can be observed with at least one read. The uniformity of coverage was also greatly improved; up to 93% and 57% of all targets fell within a 100- and 10-fold coverage range, respectively. Alleles at >400,000 target base positions were determined across six subjects and examined for single nucleotide polymorphisms (SNPs), and the concordance with independently obtained genotypes was 98.4%-100%. We detected >500 SNPs not currently in dbSNP, 362 of which were in targeted CpG locations. Transitions in CpG sites were at least 13.7 times more abundant than non-CpG transitions. Fractions of polymorphic CpG sites are lower in CpG-rich regions and show higher correlation with human-chimpanzee divergence within CpG versus non-CpG sites. This is consistent with the hypothesis that methylation rate heterogeneity along chromosomes contributes to mutation rate variation in humans. Our success suggests that targeted CpG resequencing is an efficient way to identify common and rare genetic variations. In addition, the significantly improved padlock capture technology can be readily applied to other projects that require multiplex sample preparation.

Source
http://dx.doi.org/10.1101/gr.092213.109 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2752131 (PMC)
September 2009
-->