Publications by authors named "Tim R Mercer"

51 Publications

Testing at scale during the COVID-19 pandemic.

Nat Rev Genet 2021 May 4. Epub 2021 May 4.

Departments of Pathology and Bioengineering, Stanford University, Stanford, CA, USA.

Assembly and publication of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) genome in January 2020 enabled the immediate development of tests to detect the new virus. This began the largest global testing programme in history, in which hundreds of millions of individuals have been tested to date. The unprecedented scale of testing has driven innovation in the strategies, technologies and concepts that govern testing in public health. This Review describes the changing role of testing during the COVID-19 pandemic, including the use of genomic surveillance to track SARS-CoV-2 transmission around the world, the use of contact tracing to contain disease outbreaks and testing for the presence of the virus circulating in the environment. Despite these efforts, widespread community transmission has become entrenched in many countries and has required the testing of populations to identify and isolate infected individuals, many of whom are asymptomatic. The diagnostic and epidemiological principles that underpin such population-scale testing are also considered, as are the high-throughput and point-of-care technologies that make testing feasible on a massive scale.

Source
DOI: http://dx.doi.org/10.1038/s41576-021-00360-w
May 2021

Chimeric synthetic reference standards enable cross-validation of positive and negative controls in SARS-CoV-2 molecular tests.

Sci Rep 2021 01 29;11(1):2636. Epub 2021 Jan 29.

Genomics and Epigenetics Theme, Garvan Institute of Medical Research, Sydney, NSW, Australia.

DNA synthesis in vitro has enabled the rapid production of reference standards. These are used as controls, and allow measurement and improvement of the accuracy and quality of diagnostic tests. Current reference standards typically represent target genetic material, and act only as positive controls to assess test sensitivity. However, negative controls are also required to evaluate test specificity. Here, a pair of chimeric A/B RNA standards allowed incorporation of positive and negative controls into diagnostic testing for the Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2). The chimeric standards comprise target regions for RT-PCR primer/probe sets that are joined in tandem across two separate synthetic molecules. Accordingly, a target region that is present in standard A provides a positive control, whilst its absence from standard B provides a negative control. This design enables cross-validation of positive and negative controls between the paired standards in the same reaction, with identical conditions, so that control and test failures can be distinguished, increasing confidence in the accuracy of results. The chimeric A/B standards were assessed using the US Centres for Disease Control real-time RT-PCR protocol, and showed results congruent with other commercial controls in detecting SARS-CoV-2 in patient samples. This chimeric reference standard design approach offers extensive flexibility, allowing representation of diverse genetic features and distantly related sequences, even from different organisms.
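
The paired-control idea in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the assignment of targets to standards (`STANDARD_TARGETS`) and the function `check_controls` are hypothetical and are not taken from the paper, which does not spell out its interpretation rules here.

```python
# Hypothetical layout: which RT-PCR target region each chimeric standard carries.
# The real standards join several target regions in tandem; one each is enough here.
STANDARD_TARGETS = {"A": {"N1"}, "B": {"N2"}}


def check_controls(target: str, detected_in_a: bool, detected_in_b: bool) -> list[str]:
    """Compare observed signals for one target against what each standard should give."""
    problems = []
    for name, detected in (("A", detected_in_a), ("B", detected_in_b)):
        expected = target in STANDARD_TARGETS[name]
        if expected and not detected:
            # The standard carries the target, so it acts as the positive control.
            problems.append(f"standard {name}: positive control not detected")
        elif detected and not expected:
            # The standard lacks the target, so it acts as the negative control.
            problems.append(f"standard {name}: negative control gave a signal")
    return problems


print(check_controls("N1", detected_in_a=True, detected_in_b=False))
```

An empty list means both controls behaved as expected for that target; any entry points to either a missed positive or an unexpected signal in the negative control.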

Source
DOI: http://dx.doi.org/10.1038/s41598-021-81760-0
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7846570
January 2021

A universal and independent synthetic DNA ladder for the quantitative measurement of genomic features.

Nat Commun 2020 07 17;11(1):3609. Epub 2020 Jul 17.

Garvan Institute of Medical Research, Sydney, New South Wales, Australia.

Standard units of measurement are required for the quantitative description of nature; however, few standard units have been established for genomics to date. Here, we have developed a synthetic DNA ladder that defines a quantitative standard unit that can measure DNA sequence abundance within a next-generation sequencing library. The ladder can be spiked into a DNA sample, and act as an internal scale that measures quantitative genetic features. Unlike previous spike-ins, the ladder is encoded within a single molecule, and can be equivalently and independently synthesized by different laboratories. We show how the ladder can measure diverse quantitative features, including human genetic variation and microbial abundance, and also estimate uncertainty due to technical variation and improve normalization between libraries. This ladder provides an independent quantitative unit that can be used with any organism, application or technology, thereby providing a common metric by which genomes can be measured.

Source
DOI: http://dx.doi.org/10.1038/s41467-020-17445-5
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7367866
July 2020

Author Correction: Diagnosis of fusion genes using targeted RNA sequencing.

Nat Commun 2020 Apr 8;11(1):1810. Epub 2020 Apr 8.

Genomics and Epigenetics Division, Garvan Institute of Medical Research, Sydney, 2010 NSW, Australia.

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

Source
DOI: http://dx.doi.org/10.1038/s41467-020-15697-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7142116
April 2020

Lymphoma Driver Mutations in the Pathogenic Evolution of an Iconic Human Autoantibody.

Cell 2020 03 13;180(5):878-894.e19. Epub 2020 Feb 13.

Kirby Institute for Infection and Immunity, UNSW Sydney, Sydney, NSW 2052, Australia; School of Medical Sciences and Cellular Genomics Futures Institute, UNSW Sydney, Sydney, NSW 2052, Australia.

Pathogenic autoantibodies arise in many autoimmune diseases, but it is not understood how the cells making them evade immune checkpoints. Here, single-cell multi-omics analysis demonstrates a shared mechanism with lymphoid malignancy in the formation of public rheumatoid factor autoantibodies responsible for mixed cryoglobulinemic vasculitis. By combining single-cell DNA and RNA sequencing with serum antibody peptide sequencing and antibody synthesis, rare circulating B lymphocytes making pathogenic autoantibodies were found to comprise clonal trees accumulating mutations. Lymphoma driver mutations in genes regulating B cell proliferation and V(D)J mutation (CARD11, TNFAIP3, CCND3, ID3, BTG2, and KLHL6) were present in rogue B cells producing the pathogenic autoantibody. Antibody V(D)J mutations conferred pathogenicity by causing the antigen-bound autoantibodies to undergo phase transition to insoluble aggregates at lower temperatures. These results reveal a pre-neoplastic stage in human lymphomagenesis and a cascade of somatic mutations leading to an iconic pathogenic autoantibody.

Source
DOI: http://dx.doi.org/10.1016/j.cell.2020.01.029
March 2020

Use of synthetic DNA spike-in controls (sequins) for human genome sequencing.

Nat Protoc 2019 07 19;14(7):2119-2151. Epub 2019 Jun 19.

Genomics and Epigenetics Division, Garvan Institute of Medical Research, Sydney, Australia.

Next-generation sequencing (NGS) has been widely adopted to identify genetic variants and investigate their association with disease. However, the analysis of sequencing data remains challenging because of the complexity of human genetic variation and confounding errors introduced during library preparation, sequencing and analysis. We have developed a set of synthetic DNA spike-ins, termed 'sequins' (sequencing spike-ins), that are directly added to DNA samples before library preparation. Sequins can be used to measure technical biases and to act as internal quantitative and qualitative controls throughout the sequencing workflow. This step-by-step protocol explains the use of sequins for both whole-genome and targeted sequencing of the human genome. This includes instructions regarding the dilution and addition of sequins to human DNA samples, followed by the bioinformatic steps required to separate sequin- and sample-derived sequencing reads and to evaluate the diagnostic performance of the assay. These practical guidelines are accompanied by a broader discussion of the conceptual and statistical principles that underpin the design of sequin standards. This protocol is suitable for users with standard laboratory and bioinformatic experience. The laboratory steps require ~1-4 d and the bioinformatic steps (which can be performed with the provided example data files) take an additional day.

Source
DOI: http://dx.doi.org/10.1038/s41596-019-0175-1
July 2019

TMPRSS2-ERG fusions linked to prostate cancer racial health disparities: A focus on Africa.

Prostate 2019 07 15;79(10):1191-1196. Epub 2019 May 15.

Garvan Institute of Medical Research, The Kinghorn Cancer Centre, Darlinghurst, Australia.

Background: Fusion of the androgen-regulated gene TMPRSS2 to the ETS transcription factor gene ERG is the most common genomic alteration acquired during prostate tumorigenesis and is biased toward men of European ancestry. In contrast, African American men present with more advanced disease, yet their tumors are less likely to acquire TMPRSS2-ERG. Data for Africa are scarce.

Methods: RNA was made available for genomic analyses from 181 prostate tissue biopsy cores from Black South African men, 94 with and 87 without pathological evidence for prostate cancer. Reverse transcription polymerase chain reaction was used to screen for the TMPRSS2-ERG fusion, while transcript junction coordinates and isoform frequencies, including novel gene fusions, were determined using targeted RNA sequencing.

Results: Here we report a frequency of 13% for TMPRSS2-ERG in tumors from Black South Africans. Present in 12/94 positive versus 1/87 cancer negative prostate tissue cores, this suggests a 92.62% predictivity for a positive cancer diagnosis (P = 0.0031). At a frequency of almost half that reported for African Americans and roughly a quarter of that reported for men of European ancestry, acquisition of TMPRSS2-ERG appears to be inversely associated with aggressive prostate cancer. Further support was provided by linking the presence of TMPRSS2-ERG to low-grade disease in younger patients (P = 0.0466), with higher expressing distal ERG fusion junction coordinates.

Conclusions: Only the second study of its kind for the African continent, we support a link between TMPRSS2-ERG status and prostate cancer racial health disparity beyond the borders of the United States. We call for urgent evaluation of androgen deprivation therapy within Africa.
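
The proportions in the Results paragraph can be recomputed from the counts it quotes (12 of 94 cancer-positive cores, 1 of 87 cancer-negative cores). The short Python sketch below does only that arithmetic; the variable names are illustrative. The abstract does not say how its 92.62% predictivity figure was derived, so the plain ratio of cancerous cores among fusion-positive cores shown here (about 92.3%) should not be read as a reproduction of that statistic.

```python
# Counts quoted in the Results paragraph of the abstract.
fusion_in_cancer, cancer_cores = 12, 94
fusion_in_benign, benign_cores = 1, 87


def percent(part: int, whole: int) -> float:
    """Express part/whole as a percentage."""
    return 100.0 * part / whole


tumour_frequency = percent(fusion_in_cancer, cancer_cores)
benign_frequency = percent(fusion_in_benign, benign_cores)
# Share of all fusion-positive cores that came from cancer-positive tissue.
cancer_share = percent(fusion_in_cancer, fusion_in_cancer + fusion_in_benign)

print(f"fusion frequency in cancer-positive cores: {tumour_frequency:.1f}%")  # 12.8%
print(f"fusion frequency in cancer-negative cores: {benign_frequency:.1f}%")  # 1.1%
print(f"fusion-positive cores that were cancerous: {cancer_share:.1f}%")      # 92.3%
```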

Source
DOI: http://dx.doi.org/10.1002/pros.23823
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6617820
July 2019

Crizotinib and Surgery for Long-Term Disease Control in Children and Adolescents With ALK-Positive Inflammatory Myofibroblastic Tumors.

JCO Precis Oncol 2019 16;3. Epub 2019 May 16.

Children's Medical Research Institute, Westmead New South Wales, Australia.

Purpose: Before anaplastic lymphoma kinase (ALK) inhibitors, treatment options for ALK-positive inflammatory myofibroblastic tumors (AP-IMTs) were unsatisfactory. We retrospectively analyzed the outcome of patients with AP-IMT treated with crizotinib to document response, toxicity, survival, and features associated with relapse.

Methods: The cohort comprised eight patients with AP-IMT treated with crizotinib and surgery. Outcome measures were progression-free and overall survival after commencing crizotinib, treatment-related toxicities, features associated with relapse, outcome after relapse, and outcome after ceasing crizotinib.

Results: The median follow-up after commencing crizotinib was 3 years (range, 0.9 to 5.5 years). The major toxicity was neutropenia. All patients responded to crizotinib. Five were able to discontinue therapy without recurrence (median treatment duration, 1 year; range, 0.2 to 3.0 years); one continues on crizotinib. Two critically ill patients with initial complete response experienced relapse while on therapy. Both harbored fusions and responded to alternative ALK inhibitors; one ultimately died as a result of progressive disease, whereas the other remains alive on treatment. Progression-free and overall survival since commencement of crizotinib is 0.75 ± 0.15% and 0.83 ± 0.15%, respectively.

Conclusion: We confirm acceptable toxicity and excellent disease control in patients with AP-IMT treated with crizotinib, which may be ceased without recurrence in most. Relapses occurred in two of three patients with translocated IMT, which suggests that such patients require additional therapy.

Source
DOI: http://dx.doi.org/10.1200/PO.18.00297
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7446396
May 2019

Targeted, High-Resolution RNA Sequencing of Non-coding Genomic Regions Associated With Neuropsychiatric Functions.

Front Genet 2019 12;10:309. Epub 2019 Apr 12.

Genomics and Epigenetics Division, Garvan Institute of Medical Research, Darlinghurst, NSW, Australia.

The human brain is one of the last frontiers of biomedical research. Genome-wide association studies (GWAS) have succeeded in identifying thousands of haplotype blocks associated with a range of neuropsychiatric traits, including disorders such as schizophrenia, Alzheimer's and Parkinson's disease. However, the majority of single nucleotide polymorphisms (SNPs) that mark these haplotype blocks fall within non-coding regions of the genome, hindering their functional validation. While some of these GWAS loci may contain cis-acting regulatory DNA elements such as enhancers, we hypothesized that many are also transcribed into non-coding RNAs that are missing from publicly available transcriptome annotations. Here, we use targeted RNA capture ('RNA CaptureSeq') in combination with nanopore long-read cDNA sequencing to transcriptionally profile 1,023 haplotype blocks across the genome containing non-coding GWAS SNPs associated with neuropsychiatric traits, using post-mortem human brain tissue from three neurologically healthy donors. We find that the majority (62%) of targeted haplotype blocks, including 13% of intergenic blocks, are transcribed into novel, multi-exonic RNAs, most of which are not yet recorded in GENCODE annotations. We validated our findings with short-read RNA-seq, providing orthogonal confirmation of novel splice junctions and enabling a quantitative assessment of the long-read assemblies. Many novel transcripts are supported by independent evidence of transcription including cap analysis of gene expression (CAGE) data and epigenetic marks, and some show signs of potential functional roles. We present these transcriptomes as a preliminary atlas of non-coding transcription in human brain that can be used to connect neurological phenotypes with gene expression.

Source
DOI: http://dx.doi.org/10.3389/fgene.2019.00309
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6473190
April 2019

Diagnosis of fusion genes using targeted RNA sequencing.

Nat Commun 2019 03 27;10(1):1388. Epub 2019 Mar 27.

Genomics and Epigenetics Division, Garvan Institute of Medical Research, Sydney, 2010, NSW, Australia.

Fusion genes are a major cause of cancer. Their rapid and accurate diagnosis can inform clinical action, but current molecular diagnostic assays are restricted in resolution and throughput. Here, we show that targeted RNA sequencing (RNAseq) can overcome these limitations. First, we establish that fusion gene detection with targeted RNAseq is both sensitive and quantitative by optimising laboratory and bioinformatic variables using spike-in standards and cell lines. Next, we analyse a clinical patient cohort and improve the overall fusion gene diagnostic rate from 63% with conventional approaches to 76% with targeted RNAseq while demonstrating high concordance for patient samples with previous diagnoses. Finally, we show that targeted RNAseq offers additional advantages by simultaneously measuring gene expression levels and profiling the immune-receptor repertoire. We anticipate that targeted RNAseq will improve clinical fusion gene detection, and its increasing use will provide a deeper understanding of fusion gene biology.

Source
DOI: http://dx.doi.org/10.1038/s41467-019-09374-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6437215
March 2019

Chiral DNA sequences as commutable controls for clinical genomics.

Nat Commun 2019 03 22;10(1):1342. Epub 2019 Mar 22.

Garvan Institute of Medical Research, Sydney, 2010, NSW, Australia.

Chirality is a property describing any object that is inequivalent to its mirror image. Due to its 5'-3' directionality, a DNA sequence is distinct from a mirrored sequence arranged in reverse nucleotide-order, and is therefore chiral. A given sequence and its opposing chiral partner sequence share many properties, such as nucleotide composition and sequence entropy. Here we demonstrate that chiral DNA sequence pairs also perform equivalently during molecular and bioinformatic techniques that underpin genetic analysis, including PCR amplification, hybridization, whole-genome, target-enriched and nanopore sequencing, sequence alignment and variant detection. Given these shared properties, synthetic DNA sequences mirroring clinically relevant or analytically challenging regions of the human genome are ideal controls for clinical genomics. The addition of synthetic chiral sequences (sequins) to patient tumor samples can prevent false-positive and false-negative mutation detection to improve diagnosis. Accordingly, we propose that sequins can fulfill the need for commutable internal controls in precision medicine.
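
The distinction this abstract draws, between a sequence and its mirror image in reverse nucleotide order, can be shown with a short Python sketch. The example sequence and function names are made up for illustration. The sketch also contrasts mirroring with reverse complementation, a different operation that is easy to confuse with it.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def mirror(seq: str) -> str:
    """Reverse the nucleotide order without complementing any base."""
    return seq[::-1]


def reverse_complement(seq: str) -> str:
    """Reverse the order and complement each base (the opposite DNA strand)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


original = "GATTACAGGC"  # arbitrary example sequence
mirrored = mirror(original)

print(mirrored)                                   # CGGACATTAG
print(Counter(original) == Counter(mirrored))     # True: same nucleotide composition
print(mirrored == reverse_complement(original))   # False: not the opposite strand
```

Mirroring twice returns the original sequence, and the pair always shares nucleotide composition, which is one of the shared properties the abstract mentions.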

Source
DOI: http://dx.doi.org/10.1038/s41467-019-09272-0
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6430799
March 2019

Synthetic microbe communities provide internal reference standards for metagenome sequencing and analysis.

Nat Commun 2018 08 6;9(1):3096. Epub 2018 Aug 6.

Garvan Institute of Medical Research, Sydney, 2010, NSW, Australia.

The complexity of microbial communities, combined with technical biases in next-generation sequencing, poses a challenge to metagenomic analysis. Here, we develop a set of internal DNA standards, termed "sequins" (sequencing spike-ins), that together constitute a synthetic community of artificial microbial genomes. Sequins are added to environmental DNA samples prior to library preparation, and undergo concurrent sequencing with the accompanying sample. We validate the performance of sequins by comparison to mock microbial communities, and demonstrate their use in the analysis of real metagenome samples. We show how sequins can be used to measure fold change differences in the size and structure of accompanying microbial communities, and perform quantitative normalization between samples. We further illustrate how sequins can be used to benchmark and optimize new methods, including nanopore long-read sequencing technology. We provide metagenome sequins, along with associated data sets, protocols, and an accompanying software toolkit, as reference standards to aid in metagenomic studies.

Source
DOI: http://dx.doi.org/10.1038/s41467-018-05555-0
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6078961
August 2018

Universal Alternative Splicing of Noncoding Exons.

Cell Syst 2018 Feb 24;6(2):245-255.e5. Epub 2018 Jan 24.

Garvan Institute of Medical Research, Sydney, NSW, Australia; St Vincent's Clinical School, University of New South Wales, Sydney, NSW, Australia; Altius Institute for Biomedical Sciences, Seattle, WA, USA.

The human transcriptome is so large, diverse, and dynamic that, even after a decade of investigation by RNA sequencing (RNA-seq), we have yet to resolve its true dimensions. RNA-seq suffers from an expression-dependent bias that impedes characterization of low-abundance transcripts. We performed targeted single-molecule and short-read RNA-seq to survey the transcriptional landscape of a single human chromosome (Hsa21) at unprecedented resolution. Our analysis reaches the lower limits of the transcriptome, identifying a fundamental distinction between protein-coding and noncoding gene content: almost every noncoding exon undergoes alternative splicing, producing a seemingly limitless variety of isoforms. Analysis of syntenic regions of the mouse genome shows that few noncoding exons are shared between human and mouse, yet human splicing profiles are recapitulated on Hsa21 in mouse cells, indicative of regulation by a deeply conserved splicing code. We propose that noncoding exons are functionally modular, with alternative splicing generating an enormous repertoire of potential regulatory RNAs and a rich transcriptional reservoir for gene evolution.

Source
DOI: http://dx.doi.org/10.1016/j.cels.2017.12.005
February 2018

Machine learning annotation of human branchpoints.

Bioinformatics 2018 03;34(6):920-927

Genomics and Epigenetics, Garvan Institute of Medical Research, Sydney, NSW 2010, Australia.

Motivation: The branchpoint element is required for the first lariat-forming reaction in splicing. However, current catalogues of human branchpoints remain incomplete due to the difficulty in experimentally identifying these splicing elements. To address this limitation, we have developed a machine-learning algorithm, branchpointer, to identify branchpoint elements solely from gene annotations and genomic sequence.

Results: Using branchpointer, we annotate branchpoint elements in 85% of human gene introns with sensitivity (61.8%) and specificity (97.8%). In addition to annotation, branchpointer can evaluate the impact of SNPs on branchpoint architecture to inform functional interpretation of genetic variants. Branchpointer identifies all published deleterious branchpoint mutations annotated in clinical variant databases, and finds thousands of additional clinical and common genetic variants with similar predicted effects. This genome-wide annotation of branchpoints provides a reference for the genetic analysis of splicing, and the interpretation of noncoding variation.

Availability And Implementation: Branchpointer is written and implemented in the statistical programming language R and is freely available under a BSD license as a package through Bioconductor.

Contact: b.signal@garvan.org.au or t.mercer@garvan.org.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/btx688
March 2018

Phosphoproteomic Profiling Reveals ALK and MET as Novel Actionable Targets across Synovial Sarcoma Subtypes.

Cancer Res 2017 08 20;77(16):4279-4292. Epub 2017 Jun 20.

Cancer Research Program, Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Clayton, Victoria, Australia.

Despite intensive multimodal treatment of sarcomas, a heterogeneous group of malignant tumors arising from connective tissue, survival remains poor. Candidate-based targeted treatments have demonstrated limited clinical success, urging an unbiased and comprehensive analysis of oncogenic signaling networks to reveal therapeutic targets and personalized treatment strategies. Here we applied mass spectrometry-based phosphoproteomic profiling to the largest and most heterogeneous set of sarcoma cell lines characterized to date and identified novel tyrosine phosphorylation patterns, enhanced tyrosine kinases in specific subtypes, and potential driver kinases. ALK was identified as a novel driver in the Aska-SS synovial sarcoma (SS) cell line via expression of an ALK variant with a large extracellular domain deletion (ALK). Functional ALK dependency was confirmed in vitro and in vivo with selective inhibitors. Importantly, ALK immunopositivity was detected in 6 of 43 (14%) SS patient specimens, one of which exhibited an ALK rearrangement. High PDGFRα phosphorylation also characterized SS cell lines, which was accompanied by enhanced MET activation in Yamato-SS cells. Although Yamato-SS cells were sensitive to crizotinib (ALK/MET-inhibitor) but not pazopanib (VEGFR/PDGFR-inhibitor) monotherapy in vitro, synergistic effects were observed upon drug combination. In vivo, both drugs were individually effective, with pazopanib efficacy likely attributable to reduced angiogenesis. MET or PDGFRα expression was detected in 58% and 84% of SS patients, respectively, with coexpression in 56%. Consequently, our integrated approach has led to the identification of ALK and MET as promising therapeutic targets in SS.

Source
DOI: http://dx.doi.org/10.1158/0008-5472.CAN-16-2550
August 2017

Reference standards for next-generation sequencing.

Nat Rev Genet 2017 08 19;18(8):473-484. Epub 2017 Jun 19.

Genomics and Epigenetics Division, Garvan Institute of Medical Research, Sydney, NSW 2010, Australia.

Next-generation sequencing (NGS) provides a broad investigation of the genome, and it is being readily applied for the diagnosis of disease-associated genetic features. However, the interpretation of NGS data remains challenging owing to the size and complexity of the genome and the technical errors that are introduced during sample preparation, sequencing and analysis. These errors can be understood and mitigated through the use of reference standards - well-characterized genetic materials or synthetic spike-in controls that help to calibrate NGS measurements and to evaluate diagnostic performance. The informed use of reference standards, and associated statistical principles, ensures rigorous analysis of NGS data and is essential for its future clinical use.

Source
DOI: http://dx.doi.org/10.1038/nrg.2017.44
August 2017

The Dimensions, Dynamics, and Relevance of the Mammalian Noncoding Transcriptome.

Trends Genet 2017 07 20;33(7):464-478. Epub 2017 May 20.

Genomics and Epigenetics Division, Garvan Institute of Medical Research, Sydney, NSW, Australia; School of Biotechnology and Biomolecular Sciences, Faculty of Science, University of New South Wales, Sydney, NSW, Australia; St Vincent's Clinical School, University of New South Wales, Sydney, NSW, Australia.

The combination of pervasive transcription and prolific alternative splicing produces a mammalian transcriptome of great breadth and diversity. The majority of transcribed genomic bases are intronic, antisense, or intergenic to protein-coding genes, yielding a plethora of short and long non-protein-coding regulatory RNAs. Long noncoding RNAs (lncRNAs) share most aspects of their biogenesis, processing, and regulation with mRNAs. However, lncRNAs are typically expressed in more restricted patterns, frequently from enhancers, and exhibit almost universal alternative splicing. These features are consistent with their role as modular epigenetic regulators. We describe here the key studies and technological advances that have shaped our understanding of the dimensions, dynamics, and biological relevance of the mammalian noncoding transcriptome.

Source
DOI: http://dx.doi.org/10.1016/j.tig.2017.04.004
July 2017

ANAQUIN: a software toolkit for the analysis of spike-in controls for next generation sequencing.

Bioinformatics 2017 Jun;33(11):1723-1724

Genomics and Epigenetics Division, Garvan Institute of Medical Research, Sydney, NSW, Australia.

Summary: Spike-in controls are synthetic nucleic-acid sequences that are added to a user's sample and constitute internal standards for subsequent steps in the next generation sequencing workflow.

Availability And Implementation: The software is implemented in C++/R and is freely available under BSD license. The source code is available from github.com/student-t/Anaquin, binaries and user manual from www.sequin.xyz/software and R package from bioconductor.org/packages/Anaquin.

Contact: anaquin@garvan.org.au or t.mercer@garvan.org.au.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/btx038
June 2017

Spliced synthetic genes as internal controls in RNA sequencing experiments.

Nat Methods 2016 09 8;13(9):792-8. Epub 2016 Aug 8.

Genomics and Epigenetics Division, Garvan Institute of Medical Research, Sydney, New South Wales, Australia.

RNA sequencing (RNA-seq) can be used to assemble spliced isoforms, quantify expressed genes and provide a global profile of the transcriptome. However, the size and diversity of the transcriptome, the wide dynamic range in gene expression and inherent technical biases confound RNA-seq analysis. We have developed a set of spike-in RNA standards, termed 'sequins' (sequencing spike-ins), that represent full-length spliced mRNA isoforms. Sequins have an entirely artificial sequence with no homology to natural reference genomes, but they align to gene loci encoded on an artificial in silico chromosome. The combination of multiple sequins across a range of concentrations emulates alternative splicing and differential gene expression, and it provides scaling factors for normalization between samples. We demonstrate the use of sequins in RNA-seq experiments to measure sample-specific biases and determine the limits of reliable transcript assembly and quantification in accompanying human RNA samples. In addition, we have designed a complementary set of sequins that represent fusion genes arising from rearrangements of the in silico chromosome to aid in cancer diagnosis. RNA sequins provide a qualitative and quantitative reference with which to navigate the complexity of the human transcriptome.
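
One common way to turn spike-ins added across a range of concentrations into a scaling factor is to fit observed abundance against known input on a log-log scale. The Python sketch below is a generic illustration of that idea, not the procedure used in the paper or in any sequin software; all concentrations and read counts are invented, and `fit_log_log` is a hypothetical helper.

```python
import math


def fit_log_log(known_conc, observed):
    """Least-squares line through (log2 concentration, log2 observed abundance)."""
    xs = [math.log2(c) for c in known_conc]
    ys = [math.log2(o) for o in observed]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented data: the same spike-in mixture measured in two libraries.
known = [1, 2, 4, 8, 16]             # relative input concentration
sample_a = [10, 20, 40, 80, 160]     # observed counts in library A
sample_b = [30, 60, 120, 240, 480]   # observed counts in library B

slope_a, intercept_a = fit_log_log(known, sample_a)
slope_b, intercept_b = fit_log_log(known, sample_b)

# Factor that brings library B onto library A's scale (valid when slopes agree).
scaling_b_to_a = 2 ** (intercept_a - intercept_b)
print(slope_a, slope_b, scaling_b_to_a)
```

A slope near 1 indicates that observed abundance tracks input concentration proportionally; the difference in intercepts then captures the library-wide offset between the two samples.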

Source
DOI: http://dx.doi.org/10.1038/nmeth.3958
September 2016

Representing genetic variation with synthetic DNA standards.

Nat Methods 2016 09 8;13(9):784-91. Epub 2016 Aug 8.

Genomics and Epigenetics Division, Garvan Institute of Medical Research, New South Wales, Australia.

The identification of genetic variation with next-generation sequencing is confounded by the complexity of the human genome sequence and by biases that arise during library preparation, sequencing and analysis. We have developed a set of synthetic DNA standards, termed 'sequins', that emulate human genetic features and constitute qualitative and quantitative spike-in controls for genome sequencing. Sequencing reads derived from sequins align exclusively to an artificial in silico reference chromosome, rather than the human reference genome, which allows them to be partitioned for parallel analysis. Here we use this approach to represent common and clinically relevant genetic variation, ranging from single nucleotide variants to large structural rearrangements and copy-number variation. We validate the design and performance of sequin standards by comparison to examples in the NA12878 reference genome, and we demonstrate their utility during the detection and quantification of variants. We provide sequins as a standardized, quantitative resource against which human genetic variation can be measured and diagnostic performance assessed.
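
Because sequin-derived reads align only to the artificial chromosome, partitioning them reduces to splitting alignments by reference name. The Python sketch below shows that step on toy records; the `Alignment` type, the chromosome name `chrIS` and the read identifiers are assumptions made for illustration, not details given in the abstract.

```python
from typing import Iterable, NamedTuple


class Alignment(NamedTuple):
    read_id: str
    reference: str  # name of the sequence the read aligned to


SEQUIN_CHROMOSOME = "chrIS"  # assumed name for the in silico chromosome


def partition(alignments: Iterable[Alignment]):
    """Split alignments into sequin-derived and sample-derived lists."""
    sequin_reads, sample_reads = [], []
    for aln in alignments:
        if aln.reference == SEQUIN_CHROMOSOME:
            sequin_reads.append(aln)
        else:
            sample_reads.append(aln)
    return sequin_reads, sample_reads


toy = [
    Alignment("read1", "chr1"),
    Alignment("read2", "chrIS"),
    Alignment("read3", "chr21"),
    Alignment("read4", "chrIS"),
]
sequin_reads, sample_reads = partition(toy)
print(len(sequin_reads), len(sample_reads))  # 2 2
```

The two lists can then be analysed in parallel, with the sequin reads serving as the internal control for the accompanying sample.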

Source
http://dx.doi.org/10.1038/nmeth.3957
September 2016

Improved definition of the mouse transcriptome via targeted RNA sequencing.

Genome Res 2016 05;26(5):705-16

EMBL, European Bioinformatics Institute, Cambridge, CB10 1SD, United Kingdom.

Targeted RNA sequencing (CaptureSeq) uses oligonucleotide probes to capture RNAs for sequencing, providing enriched read coverage, accurate measurement of gene expression, and quantitative expression data. We applied CaptureSeq to refine transcript annotations in the current murine GRCm38 assembly. More than 23,000 regions corresponding to putative or annotated long noncoding RNAs (lncRNAs) and 154,281 known splicing junction sites were selected for targeted sequencing across five mouse tissues and three brain subregions. The results illustrate that the mouse transcriptome is considerably more complex than previously thought. We assemble more complete transcript isoforms than GENCODE, expand transcript boundaries, and connect interspersed islands of mapped reads. We describe a novel filtering pipeline that identifies previously unannotated but high-quality transcript isoforms. In this set, 911 GENCODE neighboring genes are condensed into 400 expanded gene models. Additionally, 594 GENCODE lncRNAs acquire an open reading frame (ORF) when their structure is extended with CaptureSeq. Finally, we validate our observations using current FANTOM and Mouse ENCODE resources.

Source
http://dx.doi.org/10.1101/gr.199760.115
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4864457
May 2016

Quantitative gene profiling of long noncoding RNAs with targeted RNA sequencing.

Nat Methods 2015 Apr 9;12(4):339-42. Epub 2015 Mar 9.

[1] Garvan Institute of Medical Research, Sydney, Australia. [2] St Vincent's Clinical School, Faculty of Medicine, University of New South Wales, Sydney, Australia.

We compared quantitative RT-PCR (qRT-PCR), RNA-seq and capture sequencing (CaptureSeq) in terms of their ability to assemble and quantify long noncoding RNAs and novel coding exons across 20 human tissues. CaptureSeq was superior for the detection and quantification of genes with low expression, showed little technical variation and accurately measured differential expression. This approach expands and refines previous annotations and simultaneously generates an expression atlas.

Source
http://dx.doi.org/10.1038/nmeth.3321
April 2015

Integrative analysis of 111 reference human epigenomes.

Nature 2015 Feb;518(7539):317-30

[1] Department of Cellular and Molecular Medicine, Institute of Genomic Medicine, Moores Cancer Center, Department of Chemistry and Biochemistry, University of California San Diego, 9500 Gilman Drive, La Jolla, California 92093, USA. [2] Ludwig Institute for Cancer Research, 9500 Gilman Drive, La Jolla, California 92093, USA.

The reference human genome sequence set the stage for studies of genetic variation and its association with human disease, but epigenomic studies lack a similar reference. To address this need, the NIH Roadmap Epigenomics Consortium generated the largest collection so far of human epigenomes for primary cells and tissues. Here we describe the integrative analysis of 111 reference human epigenomes generated as part of the programme, profiled for histone modification patterns, DNA accessibility, DNA methylation and RNA expression. We establish global maps of regulatory elements, define regulatory modules of coordinated activity, and their likely activators and repressors. We show that disease- and trait-associated genetic variants are enriched in tissue-specific epigenomic marks, revealing biologically relevant cell types for diverse human traits, and providing a resource for interpreting the molecular basis of human disease. Our results demonstrate the central role of epigenomic information for understanding gene regulation, cellular differentiation and human disease.

Source
http://dx.doi.org/10.1038/nature14248
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4530010
February 2015

Genome-wide discovery of human splicing branchpoints.

Genome Res 2015 Feb 5;25(2):290-303. Epub 2015 Jan 5.

Garvan Institute of Medical Research, Sydney, New South Wales 2010, Australia; St. Vincent's Clinical School, Faculty of Medicine, UNSW Australia, Sydney, New South Wales 2052, Australia.

During the splicing reaction, the 5' intron end is joined to the branchpoint nucleotide, selecting the next exon to incorporate into the mature RNA and forming an intron lariat, which is excised. Despite a critical role in gene splicing, the locations and features of human splicing branchpoints are largely unknown. We use exoribonuclease digestion and targeted RNA-sequencing to enrich for sequences that traverse the lariat junction and, by split and inverted alignment, reveal the branchpoint. We identify 59,359 high-confidence human branchpoints in >10,000 genes, providing a first map of splicing branchpoints in the human genome. Branchpoints are predominantly adenosine, highly conserved, and closely distributed to the 3' splice site. Analysis of human branchpoints reveals numerous novel features, including distinct features of branchpoints for alternatively spliced exons and a family of conserved sequence motifs overlapping branchpoints we term B-boxes, which exhibit maximal nucleotide diversity while maintaining interactions with the keto-rich U2 snRNA. Different B-box motifs exhibit divergent usage in vertebrate lineages and associate with other splicing elements and distinct intron-exon architectures, suggesting integration within a broader regulatory splicing code. Lastly, although branchpoints are refractory to common mutational processes and genetic variation, mutations occurring at branchpoint nucleotides are enriched for disease associations.

Source
http://dx.doi.org/10.1101/gr.182899.114
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4315302
February 2015

Extracellular vesicles from neural stem cells transfer IFN-γ via Ifngr1 to activate Stat1 signaling in target cells.

Mol Cell 2014 Oct 18;56(2):193-204. Epub 2014 Sep 18.

John van Geest Centre for Brain Repair, Department of Clinical Neurosciences, and NIHR Biomedical Research Centre, University of Cambridge, CB2 0PY Cambridge, UK; Wellcome Trust-Medical Research Council Stem Cell Institute, Cambridge, UK.

The idea that stem cell therapies work only via cell replacement is challenged by the observation of consistent intercellular molecule exchange between the graft and the host. Here we defined a mechanism of cellular signaling by which neural stem/precursor cells (NPCs) communicate with the microenvironment via extracellular vesicles (EVs), and we elucidated its molecular signature and function. We observed cytokine-regulated pathways that sort proteins and mRNAs into EVs. We described induction of interferon gamma (IFN-γ) pathway in NPCs exposed to proinflammatory cytokines that is mirrored in EVs. We showed that IFN-γ bound to EVs through Ifngr1 activates Stat1 in target cells. Finally, we demonstrated that endogenous Stat1 and Ifngr1 in target cells are indispensable to sustain the activation of Stat1 signaling by EV-associated IFN-γ/Ifngr1 complexes. Our study identifies a mechanism of cellular signaling regulated by EV-associated IFN-γ/Ifngr1 complexes, which grafted stem cells may use to communicate with the host immune system.

Source
http://dx.doi.org/10.1016/j.molcel.2014.08.020
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4578249
October 2014

Targeted sequencing for gene discovery and quantification using RNA CaptureSeq.

Nat Protoc 2014 May 3;9(5):989-1009. Epub 2014 Apr 3.

Garvan Institute of Medical Research, Sydney, New South Wales, Australia.

RNA sequencing (RNAseq) samples the majority of expressed genes infrequently, owing to the large size, complex splicing and wide dynamic range of eukaryotic transcriptomes. This results in sparse sequencing coverage that can hinder robust isoform assembly and quantification. RNA capture sequencing (CaptureSeq) addresses this challenge by using oligonucleotide probes to capture selected genes or regions of interest for targeted sequencing. Targeted RNAseq provides enhanced coverage for sensitive gene discovery, robust transcript assembly and accurate gene quantification. Here we describe a detailed protocol for all stages of RNA CaptureSeq, from initial probe design considerations and capture of targeted genes to final assembly and quantification of captured transcripts. Initial probe design and final analysis can take less than 1 d, whereas the central experimental capture stage requires ∼7 d.

Source
http://dx.doi.org/10.1038/nprot.2014.058
May 2014

Re-annotation of the Saccharopolyspora erythraea genome using a systems biology approach.

BMC Genomics 2013 Oct 11;14:699. Epub 2013 Oct 11.

Australian Institute for Bioengineering and Nanotechnology (AIBN), The University of Queensland, Brisbane, Qld 4072, Australia.

Background: Accurate bacterial genome annotations provide a framework for understanding cellular functions, behavior and pathogenicity and are essential for metabolic engineering. Annotations based only on in silico predictions are inaccurate, particularly for large, high G + C content genomes due to the lack of similarities in gene length and gene organization to model organisms.

Results: Here we describe a 2D systems biology driven re-annotation of the Saccharopolyspora erythraea genome using proteogenomics, a genome-scale metabolic reconstruction, RNA-sequencing and small-RNA-sequencing. We observed transcription of more than 300 intergenic regions, detected 59 peptides in intergenic regions, confirmed 164 open reading frames previously annotated as hypothetical proteins and reassigned function to open reading frames using the genome-scale metabolic reconstruction. Finally, we present a novel way of mapping ribosomal binding sites across the genome by sequencing small RNAs.

Conclusions: The work presented here describes a novel framework for annotation of the Saccharopolyspora erythraea genome. Based on experimental observations, the 2D annotation framework greatly reduces errors that are commonly made when annotating large, high G + C content genomes using computational prediction algorithms.

Source
http://dx.doi.org/10.1186/1471-2164-14-699
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4008361
October 2013

Understanding the regulatory and transcriptional complexity of the genome through structure.

Genome Res 2013 Jul;23(7):1081-8

Garvan Institute of Medical Research, Sydney, New South Wales, Australia.

An expansive functionality and complexity has been ascribed to the majority of the human genome that was unanticipated at the outset of the draft sequence and assembly a decade ago. We are now faced with the challenge of integrating and interpreting this complexity in order to achieve a coherent view of genome biology. We argue that the linear representation of the genome exacerbates this complexity and an understanding of its three-dimensional structure is central to interpreting the regulatory and transcriptional architecture of the genome. Chromatin conformation capture techniques and high-resolution microscopy have afforded an emergent global view of genome structure within the nucleus. Chromosomes fold into complex, territorialized three-dimensional domains in concert with specialized subnuclear bodies that harbor concentrations of transcription and splicing machinery. The signature of these folds is retained within the layered regulatory landscapes annotated by chromatin immunoprecipitation, and we propose that genome contacts are reflected in the organization and expression of interweaved networks of overlapping coding and noncoding transcripts. This pervasive impact of genome structure favors a preeminent role for the nucleoskeleton and RNA in regulating gene expression by organizing these folds and contacts. Accordingly, we propose that the local and global three-dimensional structure of the genome provides a consistent, integrated, and intuitive framework for interpreting and understanding the regulatory and transcriptional complexity of the human genome.

Source
http://dx.doi.org/10.1101/gr.156612.113
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3698501
July 2013

DNase I-hypersensitive exons colocalize with promoters and distal regulatory elements.

Nat Genet 2013 Aug 23;45(8):852-9. Epub 2013 Jun 23.

Institute for Molecular Bioscience, The University of Queensland, St Lucia, Brisbane, Queensland, Australia.

The precise splicing of genes confers an enormous transcriptional complexity to the human genome. The majority of gene splicing occurs cotranscriptionally, permitting epigenetic modifications to affect splicing outcomes. Here we show that select exonic regions are demarcated within the three-dimensional structure of the human genome. We identify a subset of exons that exhibit DNase I hypersensitivity and are accompanied by 'phantom' signals in chromatin immunoprecipitation and sequencing (ChIP-seq) that result from cross-linking with proximal promoter- or enhancer-bound factors. The capture of structural features by ChIP-seq is confirmed by chromatin interaction analysis that resolves local intragenic loops that fold exons close to cognate promoters while excluding intervening intronic sequences. These interactions of exons with promoters and enhancers are enriched for alternative splicing events, an effect reflected in cell type-specific periexonic DNase I hypersensitivity patterns. Collectively, our results connect local genome topography, chromatin structure and cis-regulatory landscapes with the generation of human transcriptional complexity by cotranscriptional splicing.

Source
http://dx.doi.org/10.1038/ng.2677
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4405174
August 2013

Structure and function of long noncoding RNAs in epigenetic regulation.

Nat Struct Mol Biol 2013 Mar;20(3):300-7

Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia.

Genomes of complex organisms encode an abundance and diversity of long noncoding RNAs (lncRNAs) that are expressed throughout the cell and fulfill a wide variety of regulatory roles at almost every stage of gene expression. These roles, which encompass sensory, guiding, scaffolding and allosteric capacities, derive from folded modular domains in lncRNAs. In this diverse functional repertoire, we focus on the well-characterized ability for lncRNAs to function as epigenetic modulators. Many lncRNAs bind to chromatin-modifying proteins and recruit their catalytic activity to specific sites in the genome, thereby modulating chromatin states and impacting gene expression. Considering this regulatory potential in combination with the abundance of lncRNAs suggests that lncRNAs may be part of a broad epigenetic regulatory network.

Source
http://dx.doi.org/10.1038/nsmb.2480
March 2013