Publications by authors named "Jay Shendure"

337 Publications

SARS-CoV-2 Epidemiology on a Public University Campus in Washington State.

Open Forum Infect Dis 2021 Nov 17;8(11):ofab464. Epub 2021 Sep 17.

Department of Biostatistics, University of Washington, Seattle, Washington, USA.

Background: We aimed to evaluate a testing program to facilitate control of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) transmission at a large university and measure spread in the university community using viral genome sequencing.

Methods: Our prospective longitudinal study used remote contactless enrollment, daily mobile symptom and exposure tracking, and self-swab sample collection. Participants were tested at enrollment and whenever they were exposed to a known SARS-CoV-2-infected person, developed new symptoms, reported high-risk behavior (such as attending an indoor gathering without masking or social distancing), or belonged to a group experiencing an outbreak. Study participants included students, staff, and faculty at an urban public university during the Autumn quarter of 2020.

Results: We enrolled 16 476 individuals, performed 29 783 SARS-CoV-2 tests, and detected 236 infections. Seventy-five percent of positive cases reported at least 1 of the following: symptoms (60.8%), exposure (34.7%), or high-risk behaviors (21.5%). Greek community affiliation was the strongest risk factor for testing positive, and molecular epidemiology results suggest that specific large gatherings were responsible for several outbreaks.

Conclusions: A testing program focused on individuals with symptoms and unvaccinated persons who participate in large campus gatherings may be effective as part of a comprehensive university-wide mitigation strategy to control the spread of SARS-CoV-2.

Source
DOI: http://dx.doi.org/10.1093/ofid/ofab464
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8599730
November 2021

The glucose-sensing transcription factor MLX balances metabolism and stress to suppress apoptosis and maintain spermatogenesis.

PLoS Biol 2021 Oct 20;19(10):e3001085. Epub 2021 Oct 20.

Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America.

Male germ cell (GC) production is a metabolically driven and apoptosis-prone process. Here, we show that the glucose-sensing transcription factor (TF) MAX-Like protein X (MLX) and its binding partner MondoA are both required for male fertility in the mouse, as well as survival of human tumor cells derived from the male germ line. Loss of Mlx results in altered metabolism as well as activation of multiple stress pathways and GC apoptosis in the testes. This is concomitant with dysregulation of the expression of male-specific GC transcripts and proteins. Our genomic and functional analyses identify loci directly bound by MLX involved in these processes, including metabolic targets, obligate components of male-specific GC development, and apoptotic effectors. These in vivo and in vitro studies implicate MLX and other members of the proximal MYC network, such as MNT, in regulation of metabolism and differentiation, as well as in suppression of intrinsic and extrinsic death signaling pathways in both spermatogenesis and male germ cell tumors (MGCTs).

Source
DOI: http://dx.doi.org/10.1371/journal.pbio.3001085
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8528285
October 2021

Precise genomic deletions using paired prime editing.

Nat Biotechnol 2021 Oct 14. Epub 2021 Oct 14.

Department of Genome Sciences, University of Washington, Seattle, WA, USA.

Current methods to delete genomic sequences are based on clustered regularly interspaced short palindromic repeats (CRISPR)-Cas9 and pairs of single-guide RNAs (sgRNAs), but can be inefficient and imprecise, with errors including small indels as well as unintended large deletions and more complex rearrangements. In the present study, we describe a prime editing-based method, PRIME-Del, which induces a deletion using a pair of prime editing sgRNAs (pegRNAs) that target opposite DNA strands, programming not only the sites that are nicked but also the outcome of the repair. PRIME-Del achieves markedly higher precision than CRISPR-Cas9 and sgRNA pairs in programming deletions up to 10 kb, with 1-30% editing efficiency. PRIME-Del can also be used to couple genomic deletions with short insertions, enabling deletions with junctions that do not fall at protospacer-adjacent motif sites. Finally, extended expression of prime editing components can substantially enhance efficiency without compromising precision. We anticipate that PRIME-Del will be broadly useful for precise, flexible programming of genomic deletions, epitope tagging and, potentially, programming genomic rearrangements.

Source
DOI: http://dx.doi.org/10.1038/s41587-021-01025-z
October 2021

Adaptations in Hippo-Yap signaling and myofibroblast fate underlie scar-free ear appendage wound healing in spiny mice.

Dev Cell 2021 Oct 4;56(19):2722-2740.e6. Epub 2021 Oct 4.

Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA 98195, USA; Department of Pediatrics, University of Washington, Seattle, WA 98195, USA; Center for Developmental Biology & Regenerative Medicine, Seattle Children's Research Institute, Seattle, WA 98101, USA.

Spiny mice (Acomys cahirinus) are terrestrial mammals that evolved unique scar-free regenerative wound-healing properties. Myofibroblasts (MFs) are the major scar-forming cell type in skin. We found that following traumatic injury to ear pinnae, MFs appeared rapidly in both Acomys and mouse yet persisted only in mouse. The timing of MF loss in Acomys correlated with wound closure, blastema differentiation, and nuclear localization of the Hippo pathway target protein Yap. Experiments in vitro revealed an accelerated PP2A-dependent dephosphorylation activity that maintained nuclear Yap in Acomys dermal fibroblasts (DFs) and was not detected in mouse or human DFs. Treatment of Acomys in vivo with the nuclear Yap-TEAD inhibitor verteporfin prolonged MF persistence and converted tissue regeneration to fibrosis. Forced Yap activity prevented and rescued TGF-β1-induced human MF formation in vitro. These results suggest that Acomys evolved modifications of Yap activity and MF fate important for scar-free regenerative wound healing in vivo.

Source
DOI: http://dx.doi.org/10.1016/j.devcel.2021.09.008
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8623355
October 2021

Single-cell landscape of nuclear configuration and gene expression during stem cell differentiation and X inactivation.

Genome Biol 2021 Sep 27;22(1):279. Epub 2021 Sep 27.

Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA, USA.

Background: Mammalian development is associated with extensive changes in gene expression, chromatin accessibility, and nuclear structure. Here, we follow such changes associated with mouse embryonic stem cell differentiation and X inactivation by integrating, for the first time, allele-specific data from these three modalities obtained by high-throughput single-cell RNA-seq, ATAC-seq, and Hi-C.

Results: Allele-specific contact decay profiles obtained by single-cell Hi-C clearly show that the inactive X chromosome has a unique profile in differentiated cells that have undergone X inactivation. Loss of this inactive X-specific structure at mitosis is followed by its reappearance during the cell cycle, suggesting a "bookmark" mechanism. Differentiation of embryonic stem cells to follow the onset of X inactivation is associated with changes in contact decay profiles that occur in parallel on both the X chromosomes and autosomes. Single-cell RNA-seq and ATAC-seq show evidence of a delay in female versus male cells, due to the presence of two active X chromosomes at early stages of differentiation. The onset of the inactive X-specific structure in single cells occurs later than gene silencing, consistent with the idea that chromatin compaction is a late event of X inactivation. Single-cell Hi-C highlights evidence of discrete changes in nuclear structure characterized by the acquisition of very long-range contacts throughout the nucleus. Novel computational approaches allow for the effective alignment of single-cell gene expression, chromatin accessibility, and 3D chromosome structure.

Conclusions: Based on trajectory analyses, three distinct nuclear structure states are detected reflecting discrete and profound simultaneous changes not only to the structure of the X chromosomes, but also to that of autosomes during differentiation. Our study reveals that long-range structural changes to chromosomes appear as discrete events, unlike progressive changes in gene expression and chromatin accessibility.

Source
DOI: http://dx.doi.org/10.1186/s13059-021-02432-w
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8474932
September 2021

The landscape of alternative polyadenylation in single cells of the developing mouse embryo.

Nat Commun 2021 Aug 24;12(1):5101. Epub 2021 Aug 24.

Department of Genome Sciences, University of Washington, Seattle, WA, USA.

3' untranslated regions (3' UTRs) post-transcriptionally regulate mRNA stability, localization, and translation rate. While 3'-UTR isoforms have been globally quantified in limited cell types using bulk measurements, their differential usage among cell types during mammalian development remains poorly characterized. In this study, we examine a dataset comprising ~2 million nuclei spanning E9.5-E13.5 of mouse embryonic development to quantify transcriptome-wide changes in alternative polyadenylation (APA). We observe a global lengthening of 3' UTRs across embryonic stages in all cell types, although we detect shorter 3' UTRs in hematopoietic lineages and longer 3' UTRs in neuronal cell types within each stage. An analysis of RNA-binding protein (RBP) dynamics identifies ELAV-like family members, which are concomitantly induced in neuronal lineages and developmental stages experiencing 3'-UTR lengthening, as putative regulators of APA. By measuring 3'-UTR isoforms in an expansive single cell dataset, our work provides a transcriptome-wide and organism-wide map of the dynamic landscape of alternative polyadenylation during mammalian organogenesis.

Source
DOI: http://dx.doi.org/10.1038/s41467-021-25388-8
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8385098
August 2021

SwabExpress: An End-to-End Protocol for Extraction-Free COVID-19 Testing.

Clin Chem 2021 Dec;68(1):143-152

Department of Genome Sciences, University of Washington, Seattle, WA.

Background: The urgent need for massively scaled clinical testing for SARS-CoV-2, along with global shortages of critical reagents and supplies, has necessitated development of streamlined laboratory testing protocols. Conventional nucleic acid testing for SARS-CoV-2 involves collection of a clinical specimen with a nasopharyngeal swab in transport medium, nucleic acid extraction, and quantitative reverse-transcription PCR (RT-qPCR). As testing has scaled across the world, the global supply chain has buckled, rendering testing reagents and materials scarce. To address shortages, we developed SwabExpress, an end-to-end protocol that employs mass-produced anterior nares swabs and bypasses the requirement for transport media and nucleic acid extraction.

Methods: We evaluated anterior nares swabs, transported dry and eluted in low-TE buffer as a direct-to-RT-qPCR alternative to extraction-dependent viral transport media. We validated our protocol of using heat treatment for viral inactivation and added a proteinase K digestion step to reduce amplification interference. We tested this protocol across archived and prospectively collected swab specimens to fine-tune test performance.

Results: After optimization, SwabExpress has a low limit of detection at 2-4 molecules/µL, 100% sensitivity, and 99.4% specificity when compared side by side with a traditional RT-qPCR protocol employing extraction. On real-world specimens, SwabExpress outperforms an automated extraction system while simultaneously reducing cost and hands-on time.

Conclusion: SwabExpress is a simplified workflow that facilitates scaled testing for COVID-19 without sacrificing test performance. It may serve as a template for the simplification of PCR-based clinical laboratory tests, particularly in times of critical shortages during pandemics.

Source
DOI: http://dx.doi.org/10.1093/clinchem/hvab132
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8406859
December 2021

Embryo-scale, single-cell spatial transcriptomics.

Science 2021 Jul;373(6550):111-117

Department of Genome Sciences, University of Washington, Seattle, WA, USA.

Spatial patterns of gene expression manifest at scales ranging from local (e.g., cell-cell interactions) to global (e.g., body axis patterning). However, current spatial transcriptomics methods either average local contexts or are restricted to limited fields of view. Here, we introduce sci-Space, which retains single-cell resolution while resolving spatial heterogeneity at larger scales. Applying sci-Space to developing mouse embryos, we captured approximate spatial coordinates and whole transcriptomes of about 120,000 nuclei. We identify thousands of genes exhibiting anatomically patterned expression, leverage spatial information to annotate cellular subtypes, show that cell types vary substantially in their extent of spatial patterning, and reveal correlations between pseudotime and the migratory patterns of differentiating neurons. Looking forward, we anticipate that sci-Space will facilitate the construction of spatially resolved single-cell atlases of mammalian development.

Source
DOI: http://dx.doi.org/10.1126/science.abb9536
July 2021

Benchmarked approaches for reconstruction of in vitro cell lineages and in silico models of C. elegans and M. musculus developmental trees.

Cell Syst 2021 Aug 18;12(8):810-826.e4. Epub 2021 Jun 18.

Program in Quantitative and Computational Biosciences, Baylor College of Medicine, Houston, TX 77030, USA.

The recent advent of CRISPR and other molecular tools has enabled the reconstruction of cell lineages based on induced DNA mutations and promises to resolve the lineages of more complex organisms. To date, no lineage reconstruction algorithms have been rigorously examined for their performance and robustness across dataset types and numbers of cells. To benchmark such methods, we organized a DREAM challenge using in vitro experimental intMEMOIR recordings and in silico data for a C. elegans lineage tree of about 1,000 cells and a Mus musculus tree of 10,000 cells. Some of the 22 approaches submitted had excellent performance, but structural features of the trees prevented optimal reconstructions. Using smaller sub-trees as training sets proved to be a good approach for tuning algorithms to reconstruct larger trees. The simulation and reconstruction methods generated here delineate a potential way forward for solving larger cell lineage trees such as in mouse.

Source
DOI: http://dx.doi.org/10.1016/j.cels.2021.05.008
August 2021

Single-cell lineage tracing of metastatic cancer reveals selection of hybrid EMT states.

Cancer Cell 2021 Aug 10;39(8):1150-1162.e9. Epub 2021 Jun 10.

Department of Biomedical Sciences, School of Veterinary Medicine, University of Pennsylvania, Philadelphia, PA, USA; Department of Cell & Developmental Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Institute for Regenerative Medicine, University of Pennsylvania, Philadelphia, PA, USA.

The underpinnings of cancer metastasis remain poorly understood, in part due to a lack of tools for probing their emergence at high resolution. Here we present macsGESTALT, an inducible CRISPR-Cas9-based lineage recorder with highly efficient single-cell capture of both transcriptional and phylogenetic information. Applying macsGESTALT to a mouse model of metastatic pancreatic cancer, we recover ∼380,000 CRISPR target sites and reconstruct dissemination of ∼28,000 single cells across multiple metastatic sites. We find that cells occupy a continuum of epithelial-to-mesenchymal transition (EMT) states. Metastatic potential peaks in rare, late-hybrid EMT states, which are aggressively selected from a predominately epithelial ancestral pool. The gene signatures of these late-hybrid EMT states are predictive of reduced survival in both human pancreatic and lung cancer patients, highlighting their relevance to clinical disease progression. Finally, we observe evidence for in vivo propagation of S100 family gene expression across clonally distinct metastatic subpopulations.

Source
DOI: http://dx.doi.org/10.1016/j.ccell.2021.05.005
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8782207
August 2021

Comparison of Symptoms and RNA Levels in Children and Adults With SARS-CoV-2 Infection in the Community Setting.

JAMA Pediatr 2021 Oct 4;175(10):e212025. Epub 2021 Oct 4.

Division of Allergy and Infectious Diseases, Department of Medicine, University of Washington, Seattle.

Importance: The association between COVID-19 symptoms and SARS-CoV-2 viral levels in children living in the community is not well understood.

Objective: To characterize symptoms of pediatric COVID-19 in the community and analyze the association between symptoms and SARS-CoV-2 RNA levels, as approximated by cycle threshold (Ct) values, in children and adults.

Design, Setting, and Participants: This cross-sectional study used a respiratory virus surveillance platform in persons of all ages to detect community COVID-19 cases from March 23 to November 9, 2020. A population-based convenience sample of children younger than 18 years and adults in King County, Washington, who enrolled online for home self-collection of upper respiratory samples for SARS-CoV-2 testing was included.

Exposures: Detection of SARS-CoV-2 RNA by reverse transcription-polymerase chain reaction (RT-PCR) from participant-collected samples.

Main Outcomes and Measures: RT-PCR-confirmed SARS-CoV-2 infection, with Ct values stratified by age and symptoms.

Results: Among 555 SARS-CoV-2-positive participants (mean [SD] age, 33.7 [20.1] years; 320 were female [57.7%]), 47 of 123 children (38.2%) were asymptomatic compared with 31 of 432 adults (7.2%). Symptomatic children reported fewer symptoms than symptomatic adults (mean [SD], 1.6 [2.0] vs 4.5 [3.1]). Symptomatic individuals had lower Ct values (which corresponded to higher viral RNA levels) than asymptomatic individuals (adjusted estimate for children, -3.0; 95% CI, -5.5 to -0.6; P = .02; adjusted estimate for adults, -2.9; 95% CI, -5.2 to -0.6; P = .01). The difference in mean Ct values was not statistically significant either between symptomatic children and symptomatic adults (adjusted estimate, -0.7; 95% CI, -2.2 to 0.9; P = .41) or between asymptomatic children and asymptomatic adults (adjusted estimate, -0.6; 95% CI, -4.0 to 2.8; P = .74).

Conclusions and Relevance: In this community-based cross-sectional study, SARS-CoV-2 RNA levels, as determined by Ct values, were significantly higher in symptomatic individuals than in asymptomatic individuals and no significant age-related differences were found. Further research is needed to understand the role of SARS-CoV-2 RNA levels and viral transmission.

Source
DOI: http://dx.doi.org/10.1001/jamapediatrics.2021.2025
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8491103
October 2021

Unsupervised manifold alignment for single-cell multi-omics data.

ACM BCB 2020 Sep;2020:1-10

Department of Genome Sciences, University of Washington, Paul G. Allen School of Computer Science and Engineering, University of Washington.

Integrating single-cell measurements that capture different properties of the genome is vital to extending our understanding of genome biology. This task is challenging due to the lack of a shared axis across datasets obtained from different types of single-cell experiments. For most such datasets, we lack corresponding information among the cells (samples) and the measurements (features). In this scenario, unsupervised algorithms that are capable of aligning single-cell experiments are critical to learning an in silico co-assay that can help draw correspondences among the cells. Maximum mean discrepancy-based manifold alignment (MMD-MA) is such an unsupervised algorithm. Without requiring correspondence information, it can align single-cell datasets from different modalities in a common shared latent space, showing promising results on simulations and a small-scale single-cell experiment with 61 cells. However, it is essential to explore the applicability of this method to larger single-cell experiments with thousands of cells so that it can be of practical interest to the community. In this paper, we apply MMD-MA to two recent datasets that measure transcriptome and chromatin accessibility in ~2000 single cells. To scale the runtime of MMD-MA to a more substantial number of cells, we extend the original implementation to run on GPUs. We also introduce a method to automatically select one of the user-defined parameters, thus reducing the hyperparameter search space. We demonstrate that the proposed extensions allow MMD-MA to accurately align state-of-the-art single-cell experiments.

Source
DOI: http://dx.doi.org/10.1145/3388440.3412410
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8095090
September 2020

Viral genomes reveal patterns of the SARS-CoV-2 outbreak in Washington State.

Sci Transl Med 2021 May 3;13(595). Epub 2021 May 3.

Seattle Children's Research Institute, Seattle, WA 98101, USA.

The rapid spread of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has gravely affected societies around the world. Outbreaks in different parts of the globe have been shaped by repeated introductions of new viral lineages and subsequent local transmission of those lineages. Here, we sequenced 3940 SARS-CoV-2 viral genomes from Washington State (USA) to characterize how the spread of SARS-CoV-2 in Washington State in early 2020 was shaped by differences in timing of mitigation strategies across counties and by repeated introductions of viral lineages into the state. In addition, we show that the increase in frequency of a potentially more transmissible viral variant (614G) over time can potentially be explained by regional mobility differences and multiple introductions of 614G but not the other variant (614D) into the state. At an individual level, we observed evidence of higher viral loads in patients infected with the 614G variant. However, using clinical records data, we did not find any evidence that the 614G variant affects clinical severity or patient outcomes. Overall, this suggests that with regard to D614G, the behavior of individuals has been more important in shaping the course of the pandemic in Washington State than this variant of the virus.

Source
DOI: http://dx.doi.org/10.1126/scitranslmed.abf0202
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8158963
May 2021

Comprehensive characterization of tissue-specific chromatin accessibility in L2 nematodes.

Genome Res 2021 Oct 22;31(10):1952-1969. Epub 2021 Apr 22.

Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA.

Recently developed single-cell technologies allow researchers to characterize cell states at ever greater resolution and scale. Caenorhabditis elegans is a particularly tractable system for studying development, and recent single-cell RNA-seq studies characterized the gene expression patterns for nearly every cell type in the embryo and at the second larval stage (L2). Gene expression patterns give insight into gene function and into the biochemical state of different cell types; recent advances in other single-cell genomics technologies can now also characterize the regulatory context of the genome that gives rise to these gene expression levels at a single-cell resolution. To explore the regulatory DNA of individual cell types in C. elegans, we collected single-cell chromatin accessibility data using the sci-ATAC-seq assay in L2 larvae to match the available single-cell RNA-seq data set. By using a novel implementation of the latent Dirichlet allocation algorithm, we identify 37 clusters of cells that correspond to different cell types in the worm, providing new maps of putative cell type-specific gene regulatory sites, with promise for better understanding of cellular differentiation and gene regulation.

Source
DOI: http://dx.doi.org/10.1101/gr.271791.120
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8494234
October 2021

Genome-wide strand asymmetry in massively parallel reporter activity favors genic strands.

Genome Res 2021 May 20;31(5):866-876. Epub 2021 Apr 20.

HudsonAlpha Institute for Biotechnology, Huntsville, Alabama 35806, USA.

Massively parallel reporter assays (MPRAs) are useful tools to characterize regulatory elements in human genomes. An aspect of MPRAs that is not typically the focus of analysis is their intrinsic ability to differentiate activity levels for a given sequence element when placed in both of its possible orientations relative to the reporter construct. Here, we describe pervasive strand asymmetry of MPRA signals in data sets from multiple reporter configurations in both published and newly reported data. These effects are reproducible across different cell types and in different treatments within a cell type and are observed both within and outside of annotated regulatory elements. From elements in gene bodies, MPRA strand asymmetry favors the sense strand, suggesting that function related to endogenous transcription is driving the phenomenon. Similarly, we find that within mobile element insertions, strand asymmetry favors the transcribed strand of the ancestral retrotransposon. The effect is consistent across the multiplicity of elements in human genomes and is more pronounced in less diverged elements. We find sequence features driving MPRA strand asymmetry and show its prediction from sequence alone. We see some evidence for RNA stabilization and transcriptional activation mechanisms and hypothesize that the effect is driven by natural selection favoring efficient transcription. Our results indicate that strand asymmetry is a pervasive and reproducible feature in MPRA data. More importantly, the fact that MPRA asymmetry favors naturally transcribed strands suggests that it stems from preserved biological functions that have a substantial, global impact on gene and genome evolution.

Source
DOI: http://dx.doi.org/10.1101/gr.270751.120
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8092006
May 2021

CADD-Splice: improving genome-wide variant effect prediction using deep learning-derived splice scores.

Genome Med 2021 Feb 22;13(1):31. Epub 2021 Feb 22.

Charité - Universitätsmedizin Berlin, 10117, Berlin, Germany.

Background: Splicing of genomic exons into mRNAs is a critical prerequisite for the accurate synthesis of human proteins. Genetic variants impacting splicing underlie a substantial proportion of genetic disease, but are challenging to identify beyond those occurring at donor and acceptor dinucleotides. To address this, various methods aim to predict variant effects on splicing. Recently, deep neural networks (DNNs) have been shown to achieve better results in predicting splice variants than other strategies.

Methods: It has been unclear how best to integrate such process-specific scores into genome-wide variant effect predictors. Here, we use a recently published experimental data set to compare several machine learning methods that score variant effects on splicing. We integrate the best of those approaches into general variant effect prediction models and observe the effect on classification of known pathogenic variants.

Results: We integrate two specialized splicing scores into CADD (Combined Annotation Dependent Depletion; cadd.gs.washington.edu), a widely used tool for genome-wide variant effect prediction that we previously developed to weight and integrate diverse collections of genomic annotations. With this new model, CADD-Splice, we show that inclusion of splicing DNN effect scores substantially improves predictions across multiple variant categories, without compromising overall performance.

Conclusions: While splice effect scores show superior performance on splice variants, specialized predictors cannot compete with other variant scores in general variant interpretation, as the latter account for nonsense and missense effects that do not alter splicing. Although only shown here for splice scores, we believe that the applied approach will generalize to other specific molecular processes, providing a path for the further improvement of genome-wide variant effect prediction.

Source
DOI: http://dx.doi.org/10.1186/s13073-021-00835-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7901104
February 2021

Comparable specimen collection from both ends of at-home mid-turbinate swabs.

medRxiv 2020 Dec 8. Epub 2020 Dec 8.

Brotman Baty Institute For Precision Medicine, Seattle WA, USA.

Unsupervised upper respiratory specimen collection is a key factor in the ability to massively scale SARS-CoV-2 testing. But there is concern that unsupervised specimen collection may produce inferior samples. Across two studies that included unsupervised at-home mid-turbinate specimen collection, ∼1% of participants used the wrong end of the swab. We found that molecular detection of respiratory pathogens and a human biomarker were comparable between specimens collected from the handle of the swab and those collected correctly. Older participants were more likely to use the swab backwards. Our results suggest that errors made during home-collection of nasal specimens do not preclude molecular detection of pathogens and specialized swabs may be an unnecessary luxury during a pandemic.

Source
DOI: http://dx.doi.org/10.1101/2020.12.05.20244632
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7743106
December 2020

Trans- and cis-acting effects of Firre on epigenetic features of the inactive X chromosome.

Nat Commun 2020 Nov 27;11(1):6053. Epub 2020 Nov 27.

Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA, USA.

Firre encodes a lncRNA involved in nuclear organization. Here, we show that Firre RNA expressed from the active X chromosome maintains histone H3K27me3 enrichment on the inactive X chromosome (Xi) in somatic cells. This trans-acting effect involves SUZ12, reflecting interactions between Firre RNA and components of the Polycomb repressive complexes. Without Firre RNA, H3K27me3 decreases on the Xi and the Xi-perinucleolar location is disrupted, possibly due to decreased CTCF binding on the Xi. We also observe widespread gene dysregulation, but not on the Xi. These effects are measurably rescued by ectopic expression of mouse or human Firre/FIRRE transgenes, supporting conserved trans-acting roles. We also find that the compact 3D structure of the Xi partly depends on the Firre locus and its RNA. In common lymphoid progenitors and T-cells Firre exerts a cis-acting effect on maintenance of H3K27me3 in a 26 Mb region around the locus, demonstrating cell type-specific trans- and cis-acting roles of this lncRNA.

Source
http://dx.doi.org/10.1038/s41467-020-19879-3
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7695720
November 2020

A human cell atlas of fetal gene expression.

Science 2020 Nov;370(6518)

Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA.

The gene expression program underlying the specification of human cell types is of fundamental interest. We generated human cell atlases of gene expression and chromatin accessibility in fetal tissues. For gene expression, we applied three-level combinatorial indexing to >110 samples representing 15 organs, ultimately profiling ~4 million single cells. We leveraged the literature and other atlases to identify and annotate hundreds of cell types and subtypes, both within and across tissues. Our analyses focused on organ-specific specializations of broadly distributed cell types (such as blood, endothelial, and epithelial), sites of fetal erythropoiesis (which notably included the adrenal gland), and integration with mouse developmental atlases (such as conserved specification of blood cells). These data represent a rich resource for the exploration of in vivo human gene expression in diverse tissues and cell types.

Source
http://dx.doi.org/10.1126/science.aba7721
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7780123
November 2020

A human cell atlas of fetal chromatin accessibility.

Science 2020 Nov;370(6518)

Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA.

The chromatin landscape underlying the specification of human cell types is of fundamental interest. We generated human cell atlases of chromatin accessibility and gene expression in fetal tissues. For chromatin accessibility, we devised a three-level combinatorial indexing assay and applied it to 53 samples representing 15 organs, profiling ~800,000 single cells. We leveraged cell types defined by gene expression to annotate these data and cataloged hundreds of thousands of candidate regulatory elements that exhibit cell type-specific chromatin accessibility. We investigated the properties of lineage-specific transcription factors (such as POU2F1 in neurons), organ-specific specializations of broadly distributed cell types (such as blood and endothelial), and cell type-specific enrichments of complex trait heritability. These data represent a rich resource for the exploration of in vivo human gene regulation in diverse tissues and cell types.

Source
http://dx.doi.org/10.1126/science.aba7612
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7785298
November 2020

A systematic evaluation of the design and context dependencies of massively parallel reporter assays.

Nat Methods 2020 Nov 12;17(11):1083-1091. Epub 2020 Oct 12.

Department of Genome Sciences, University of Washington, Seattle, WA, USA.

Massively parallel reporter assays (MPRAs) functionally screen thousands of sequences for regulatory activity in parallel. To date, few studies have systematically compared differences in MPRA design. Here, we screen a library of 2,440 candidate liver enhancers and controls for regulatory activity in HepG2 cells using nine different MPRA designs. We identify subtle but significant differences that correlate with epigenetic and sequence-level features, as well as differences in dynamic range and reproducibility. We also validate that enhancer activity is largely independent of orientation, at least for our library and designs. Finally, we assemble and test the same enhancers as 192-mers, 354-mers and 678-mers and observe sizable differences. This work provides a framework for the experimental design of high-throughput reporter assays, suggesting that the extended sequence context of tested elements, and to a lesser degree the precise assay, influences MPRA results.

Source
http://dx.doi.org/10.1038/s41592-020-0965-y
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7727316
November 2020

The Seattle Flu Study: a multiarm community-based prospective study protocol for assessing influenza prevalence, transmission and genomic epidemiology.

BMJ Open 2020 Oct 7;10(10):e037295. Epub 2020 Oct 7.

Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA.

Introduction: Influenza epidemics and pandemics cause significant morbidity and mortality. An effective response to a potential pandemic requires the infrastructure to rapidly detect, characterise, and potentially contain new and emerging influenza strains at both an individual and population level. The objective of this study is to use data gathered simultaneously from community and hospital sites to develop a model of how influenza enters and spreads in a population.

Methods and Analysis: Starting in the 2018-2019 season, we have been enrolling individuals with acute respiratory illness from community sites throughout the Seattle metropolitan area, including clinics, childcare facilities, Seattle-Tacoma International Airport, workplaces, college campuses and homeless shelters. At these sites, we collect clinical data and mid-nasal swabs from individuals with at least two acute respiratory symptoms. Additionally, we collect residual nasal swabs and data from individuals who seek care for respiratory symptoms at four regional hospitals. Samples are tested using a multiplex molecular assay, and influenza whole genome sequencing is performed for samples with influenza detected. Geospatial mapping and computational modelling platforms are in development to characterise the regional spread of influenza and other respiratory pathogens.

Ethics and Dissemination: The study was approved by the University of Washington's Institutional Review Board (STUDY00006181). Results will be disseminated through talks at conferences, peer-reviewed publications and on the study website (www.seattleflu.org).

Source
http://dx.doi.org/10.1136/bmjopen-2020-037295
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7542952
October 2020

Viral genomes reveal patterns of the SARS-CoV-2 outbreak in Washington State.

medRxiv 2020 Sep 30. Epub 2020 Sep 30.

University of Washington, Seattle, WA, USA.

The rapid spread of SARS-CoV-2 has gravely impacted societies around the world. Outbreaks in different parts of the globe are shaped by repeated introductions of new lineages and subsequent local transmission of those lineages. Here, we sequenced 3940 SARS-CoV-2 viral genomes from Washington State to characterize how the spread of SARS-CoV-2 in Washington State (USA) was shaped by differences in timing of mitigation strategies across counties, as well as by repeated introductions of viral lineages into the state. Additionally, we show that the increase in frequency of a potentially more transmissible viral variant (614G) over time can potentially be explained by regional mobility differences and multiple introductions of 614G, but not the other variant (614D) into the state. At an individual level, we see evidence of higher viral loads in patients infected with the 614G variant. However, using clinical records data, we do not find any evidence that the 614G variant impacts clinical severity or patient outcomes. Overall, this suggests that at least to date, the behavior of individuals has been more important in shaping the course of the pandemic than changes in the virus.

Source
http://dx.doi.org/10.1101/2020.09.30.20204230
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7536883
September 2020

Capturing cell type-specific chromatin compartment patterns by applying topic modeling to single-cell Hi-C data.

PLoS Comput Biol 2020 Sep 18;16(9):e1008173. Epub 2020 Sep 18.

Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America.

Single-cell Hi-C (scHi-C) interrogates genome-wide chromatin interaction in individual cells, allowing us to gain insights into 3D genome organization. However, the extremely sparse nature of scHi-C data poses a significant barrier to analysis, limiting our ability to tease out hidden biological information. In this work, we approach this problem by applying topic modeling to scHi-C data. Topic modeling is well-suited for discovering latent topics in a collection of discrete data. For our analysis, we generate nine different single-cell combinatorial indexed Hi-C (sci-Hi-C) libraries from five human cell lines (GM12878, H1Esc, HFF, IMR90, and HAP1), consisting of over 19,000 cells. We demonstrate that topic modeling is able to successfully capture cell type differences from sci-Hi-C data in the form of "chromatin topics." We further show enrichment of particular compartment structures associated with locus pairs in these topics.

Source
http://dx.doi.org/10.1371/journal.pcbi.1008173
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7526900
September 2020

Cryptic transmission of SARS-CoV-2 in Washington state.

Science 2020 Oct 10;370(6516):571-575. Epub 2020 Sep 10.

Brotman Baty Institute for Precision Medicine, Seattle, WA, USA.

After its emergence in Wuhan, China, in late November or early December 2019, the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) virus rapidly spread globally. Genome sequencing of SARS-CoV-2 allows the reconstruction of its transmission history, although this is contingent on sampling. We analyzed 453 SARS-CoV-2 genomes collected between 20 February and 15 March 2020 from infected patients in Washington state in the United States. We find that most SARS-CoV-2 infections sampled during this time derive from a single introduction in late January or early February 2020, which subsequently spread locally before active community surveillance was implemented.

Source
http://dx.doi.org/10.1126/science.abc0523
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7810035
October 2020

lentiMPRA and MPRAflow for high-throughput functional characterization of gene regulatory elements.

Nat Protoc 2020 Aug 8;15(8):2387-2412. Epub 2020 Jul 8.

Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA.

Massively parallel reporter assays (MPRAs) can simultaneously measure the function of thousands of candidate regulatory sequences (CRSs) in a quantitative manner. In this method, CRSs are cloned upstream of a minimal promoter and reporter gene, alongside a unique barcode, and introduced into cells. If the CRS is a functional regulatory element, it will lead to the transcription of the barcode sequence, which is measured via RNA sequencing and normalized for cellular integration via DNA sequencing of the barcode. This technology has been used to test thousands of sequences and their variants for regulatory activity, to decipher the regulatory code and its evolution, and to develop genetic switches. Lentivirus-based MPRA (lentiMPRA) produces 'in-genome' readouts and enables the use of this technique in hard-to-transfect cells. Here, we provide a detailed protocol for lentiMPRA, along with a user-friendly Nextflow-based computational pipeline, MPRAflow, for quantifying CRS activity from different MPRA designs. The lentiMPRA protocol takes ~2 months, which includes sequencing turnaround time and data processing with MPRAflow.
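
The RNA-over-DNA normalization described in this abstract can be illustrated with a minimal sketch. This is not MPRAflow's actual implementation: the function name `crs_activity`, the input dictionaries, the counts-per-million scaling, and the pseudocount are all assumptions made for illustration. It sums barcode counts per CRS and reports log2 of the depth-normalized RNA/DNA ratio.

```python
import math
from collections import defaultdict


def crs_activity(barcode_to_crs, rna_counts, dna_counts, pseudocount=1.0):
    """Hypothetical helper: log2(RNA/DNA) per CRS from barcode counts.

    barcode_to_crs: dict mapping barcode -> CRS name
    rna_counts, dna_counts: dicts mapping barcode -> read count
    pseudocount: added to per-CRS counts to avoid log2(0) (an assumption)
    """
    rna_total = sum(rna_counts.values())
    dna_total = sum(dna_counts.values())
    if rna_total == 0 or dna_total == 0:
        raise ValueError("RNA and DNA libraries must both contain reads")

    # Sum counts over all barcodes assigned to the same CRS.
    rna = defaultdict(float)
    dna = defaultdict(float)
    for barcode, crs in barcode_to_crs.items():
        rna[crs] += rna_counts.get(barcode, 0)
        dna[crs] += dna_counts.get(barcode, 0)

    activity = {}
    for crs, dna_count in dna.items():
        if dna_count == 0:
            continue  # no DNA reads: integration was not observed, skip
        # Scale to counts per million so library depth cancels out.
        rna_cpm = (rna[crs] + pseudocount) / rna_total * 1e6
        dna_cpm = (dna_count + pseudocount) / dna_total * 1e6
        activity[crs] = math.log2(rna_cpm / dna_cpm)
    return activity


# Toy data: two barcodes for a made-up enhancer, one for a control.
barcodes = {"AAA": "enh1", "CCC": "enh1", "GGG": "ctrl"}
rna = {"AAA": 30, "CCC": 50, "GGG": 20}
dna = {"AAA": 10, "CCC": 10, "GGG": 20}
scores = crs_activity(barcodes, rna, dna, pseudocount=0.0)
```

In this toy example the made-up enhancer `enh1` is over-represented in RNA relative to DNA and scores above zero, while `ctrl` scores below zero. A real analysis would also handle replicates and barcode-level filtering, which this sketch omits.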

Source
http://dx.doi.org/10.1038/s41596-020-0333-5
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7550205
August 2020