Publications by authors named "Lovelace J Luquette"

26 Publications

Somatic mutation accumulation seen through a single-molecule lens.

Cell Res 2021 Sep;31(9):949-950

Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.

DOI: http://dx.doi.org/10.1038/s41422-021-00537-2
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8410831
September 2021

Comprehensive identification of somatic nucleotide variants in human brain tissue.

Genome Biol 2021 03 29;22(1):92. Epub 2021 Mar 29.

Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA.

Background: Post-zygotic mutations incurred during DNA replication, DNA repair, and other cellular processes lead to somatic mosaicism. Somatic mosaicism is an established cause of various diseases, including cancers. However, detecting mosaic variants in DNA from non-cancerous somatic tissues poses significant challenges, particularly if the variants are present in only a small fraction of cells.

Results: Here, the Brain Somatic Mosaicism Network conducts a coordinated, multi-institutional study to examine the ability of existing methods to detect simulated somatic single-nucleotide variants (SNVs) in DNA mixing experiments, generate multiple replicates of whole-genome sequencing data from the dorsolateral prefrontal cortex, other brain regions, dura mater, and dural fibroblasts of a single neurotypical individual, devise strategies to discover somatic SNVs, and apply various approaches to validate somatic SNVs. These efforts lead to the identification of 43 bona fide somatic SNVs that range in variant allele fractions from ~0.005 to ~0.28. Guided by these results, we devise best practices for calling mosaic SNVs from 250× whole-genome sequencing data in the accessible portion of the human genome that achieve 90% specificity and sensitivity. Finally, we demonstrate that analysis of multiple bulk DNA samples from a single individual allows the reconstruction of early developmental cell lineage trees.

Conclusions: This study provides a unified set of best practices to detect somatic SNVs in non-cancerous tissues. The data and methods are freely available to the scientific community and should serve as a guide to assess the contributions of somatic SNVs to neuropsychiatric diseases.
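
To illustrate why variants at the low end of the reported allele-fraction range are hard to detect even at 250× depth, the following minimal sketch computes the binomial probability of observing at least a given number of variant-supporting reads. The function name, the 3-read threshold, and the error-free binomial read model are assumptions made for illustration only; they are not the network's calling procedure.

```python
from math import comb


def prob_at_least_k_alt_reads(depth: int, vaf: float, k: int) -> float:
    """Return P(X >= k) for X ~ Binomial(depth, vaf), ignoring sequencing error."""
    p_fewer = sum(
        comb(depth, i) * vaf**i * (1.0 - vaf) ** (depth - i) for i in range(k)
    )
    return 1.0 - p_fewer


if __name__ == "__main__":
    # The 250x depth and the allele fractions 0.005 and 0.28 come from the
    # abstract; the 3-read threshold and the 0.05 midpoint are assumptions.
    for vaf in (0.005, 0.05, 0.28):
        print(vaf, prob_at_least_k_alt_reads(250, vaf, 3))
```

Under these assumptions, a variant at an allele fraction of 0.005 is expected to contribute only about 1.25 supporting reads at 250× depth, so a fixed read-count threshold will miss it most of the time.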

DOI: http://dx.doi.org/10.1186/s13059-021-02285-3
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8006362
March 2021

The landscape of somatic mutation in cerebral cortex of autistic and neurotypical individuals revealed by ultra-deep whole-genome sequencing.

Nat Neurosci 2021 02 11;24(2):176-185. Epub 2021 Jan 11.

Division of Genetics and Genomics, Manton Center for Orphan Disease Research, and Howard Hughes Medical Institute, Boston Children's Hospital, Boston, MA, USA.

We characterize the landscape of somatic mutations (mutations occurring after fertilization) in the human brain using ultra-deep (~250×) whole-genome sequencing of prefrontal cortex from 59 donors with autism spectrum disorder (ASD) and 15 control donors. We observe a mean of 26 somatic single-nucleotide variants per brain present in ≥4% of cells, with enrichment of mutations in coding and putative regulatory regions. Our analysis reveals that the first cell division after fertilization produces ~3.4 mutations, followed by 2-3 mutations in subsequent generations. This suggests that a typical individual possesses ~80 somatic single-nucleotide variants present in ≥2% of cells (comparable to the number of de novo germline mutations per generation), with about half of individuals having at least one potentially function-altering somatic mutation somewhere in the cortex. ASD brains show an excess of somatic mutations in neural enhancer sequences compared with controls, suggesting that mosaic enhancer mutations may contribute to ASD risk.

DOI: http://dx.doi.org/10.1038/s41593-020-00765-6
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7983596
February 2021

Accurate detection of mosaic variants in sequencing data without matched controls.

Nat Biotechnol 2020 03 6;38(3):314-319. Epub 2020 Jan 6.

Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.

Detection of mosaic mutations that arise in normal development is challenging, as such mutations are typically present in only a minute fraction of cells and there is no clear matched control for removing germline variants and systematic artifacts. We present MosaicForecast, a machine-learning method that leverages read-based phasing and read-level features to accurately detect mosaic single-nucleotide variants and indels, achieving a multifold increase in specificity compared with existing algorithms. Using single-cell sequencing and targeted sequencing, we validated 80-90% of the mosaic single-nucleotide variants and 60-80% of indels detected in human brain whole-genome sequencing data. Our method should help elucidate the contribution of mosaic somatic mutations to the origin and development of disease.

DOI: http://dx.doi.org/10.1038/s41587-019-0368-8
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7065972
March 2020

Global impact of somatic structural variation on the DNA methylome of human cancers.

Genome Biol 2019 10 15;20(1):209. Epub 2019 Oct 15.

Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, TX, 77030, USA.

Background: Genomic rearrangements exert a heavy influence on the molecular landscape of cancer. New analytical approaches integrating somatic structural variants (SSVs) with altered gene features represent a framework by which we can assign global significance to a core set of genes, analogous to established methods that identify genes non-randomly targeted by somatic mutation or copy number alteration. While recent studies have defined broad patterns of association involving gene transcription and nearby SSV breakpoints, global alterations in DNA methylation in the context of SSVs remain largely unexplored.

Results: By data integration of whole genome sequencing, RNA sequencing, and DNA methylation arrays from more than 1400 human cancers, we identify hundreds of genes and associated CpG islands (CGIs) for which the nearby presence of a somatic structural variant (SSV) breakpoint is recurrently associated with altered expression or DNA methylation, respectively, independently of copy number alterations. CGIs with SSV-associated increased methylation are predominantly promoter-associated, while CGIs with SSV-associated decreased methylation are enriched for gene body CGIs. Rearrangement of genomic regions normally having higher or lower methylation is often involved in SSV-associated CGI methylation alterations. Across cancers, the overall structural variation burden is associated with a global decrease in methylation, increased expression in methyltransferase genes and DNA damage response genes, and decreased immune cell infiltration.

Conclusion: Genomic rearrangement appears to have a major role in shaping the cancer DNA methylome, to be considered alongside commonly accepted mechanisms including histone modifications and disruption of DNA methyltransferases.

DOI: http://dx.doi.org/10.1186/s13059-019-1818-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6792267
October 2019

Identification of somatic mutations in single cell DNA-seq using a spatial model of allelic imbalance.

Nat Commun 2019 08 29;10(1):3908. Epub 2019 Aug 29.

Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.

Recent advances in single cell technology have enabled dissection of cellular heterogeneity in great detail. However, analysis of single cell DNA sequencing data remains challenging due to bias and artifacts that arise during DNA extraction and whole-genome amplification, including allelic imbalance and dropout. Here, we present a framework for statistical estimation of allele-specific amplification imbalance at any given position in single cell whole-genome sequencing data by utilizing the allele frequencies of heterozygous single nucleotide polymorphisms in the neighborhood. The resulting allelic imbalance profile is critical for determining whether the variant allele fraction of an observed mutation is consistent with the expected fraction for a true variant. This method, implemented in SCAN-SNV (Single Cell ANalysis of SNVs), substantially improves the identification of somatic variants in single cells. Our allele balance framework is broadly applicable to genotype analysis of any variant type in any data that might exhibit allelic imbalance.
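
The idea in the abstract above (estimate local allelic imbalance from neighboring heterozygous SNPs, then ask whether a candidate's variant allele fraction is consistent with it) can be sketched as follows. This is a simplified illustration and not the SCAN-SNV implementation: the Gaussian kernel average, the `bandwidth` value, the exact binomial test, and all function names are assumptions chosen for brevity, whereas the published method uses its own spatial statistical model.

```python
from math import comb, exp


def local_allele_balance(pos, het_snps, bandwidth=10_000.0):
    """Kernel-weighted fraction of reads from haplotype 1 near `pos`.

    het_snps: list of (position, hap1_reads, total_reads) for phased
    heterozygous SNPs. Nearby SNPs get larger weights (Gaussian kernel).
    """
    num = den = 0.0
    for snp_pos, hap1_reads, total_reads in het_snps:
        weight = exp(-0.5 * ((snp_pos - pos) / bandwidth) ** 2)
        num += weight * hap1_reads
        den += weight * total_reads
    return num / den if den > 0 else 0.5  # fall back to balanced


def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def two_sided_binom_p(k, n, p):
    """Exact two-sided binomial p-value (sum of outcomes no likelier than k)."""
    pmf = [binom_pmf(i, n, p) for i in range(n + 1)]
    return min(1.0, sum(x for x in pmf if x <= pmf[k] * (1 + 1e-9)))


def vaf_consistent(alt_reads, depth, balance, alpha=0.05):
    """True if the observed alt count fits a variant on either haplotype."""
    p_value = max(
        two_sided_binom_p(alt_reads, depth, balance),
        two_sided_binom_p(alt_reads, depth, 1.0 - balance),
    )
    return p_value >= alpha


if __name__ == "__main__":
    # Toy, made-up counts: the locus is amplified 9:1 toward haplotype 1.
    snps = [(1000, 45, 50), (2000, 36, 40), (3000, 27, 30)]
    balance = local_allele_balance(2000, snps)
    print(balance, vaf_consistent(15, 30, balance))
```

In this toy case a candidate at 50% allele fraction is flagged as inconsistent, because a true heterozygous variant at a locus with 9:1 imbalance would be expected near 90% or 10%.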

DOI: http://dx.doi.org/10.1038/s41467-019-11857-8
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6715686
August 2019

Linked-read analysis identifies mutations in single-cell DNA-sequencing data.

Nat Genet 2019 04 18;51(4):749-754. Epub 2019 Mar 18.

Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.

Whole-genome sequencing of DNA from single cells has the potential to reshape our understanding of mutational heterogeneity in normal and diseased tissues. However, a major difficulty is distinguishing amplification artifacts from biologically derived somatic mutations. Here, we describe linked-read analysis (LiRA), a method that accurately identifies somatic single-nucleotide variants (sSNVs) by using read-level phasing with nearby germline heterozygous polymorphisms, thereby enabling the characterization of mutational signatures and estimation of somatic mutation rates in single cells.
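
The read-level phasing idea described above can be illustrated with a small sketch. It assumes that, in a single diploid cell, a true somatic variant sits on one haplotype only, so reads spanning both the candidate and a nearby germline heterozygous site should show the candidate on exactly one germline allele and on every read carrying that allele. The function name, labels, and input encoding are hypothetical, and this is not the LiRA implementation.

```python
from collections import Counter


def classify_linkage(reads):
    """Classify a candidate variant from reads spanning it and a germline het site.

    reads: iterable of (germline_allele, carries_candidate) pairs, where
    germline_allele is "ref" or "alt" and carries_candidate is a bool.
    """
    counts = Counter(reads)
    mutated_haplotypes = {hap for (hap, mutated) in counts if mutated}
    if not mutated_haplotypes:
        return "no_support"
    if len(mutated_haplotypes) > 1:
        return "discordant"  # candidate appears on both haplotypes
    hap = next(iter(mutated_haplotypes))
    if counts[(hap, False)] > 0:
        return "discordant"  # same haplotype seen with and without candidate
    return "consistent"


if __name__ == "__main__":
    linked = [("alt", True)] * 6 + [("ref", False)] * 5
    mixed = [("alt", True)] * 3 + [("alt", False)] * 4 + [("ref", False)] * 5
    print(classify_linkage(linked), classify_linkage(mixed))
```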

DOI: http://dx.doi.org/10.1038/s41588-019-0366-2
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6900933
April 2019

A Pan-Cancer Compendium of Genes Deregulated by Somatic Genomic Rearrangement across More Than 1,400 Cases.

Cell Rep 2018 07;24(2):515-527

Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, TX 77030, USA; Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA; Department of Medicine, Baylor College of Medicine, Houston, TX 77030, USA; Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA.

A systematic cataloging of genes affected by genomic rearrangement, using multiple patient cohorts and cancer types, can provide insight into cancer-relevant alterations outside of exomes. By integrative analysis of whole-genome sequencing (predominantly low pass) and gene expression data from 1,448 cancers involving 18 histopathological types in The Cancer Genome Atlas, we identified hundreds of genes for which the nearby presence (within 100 kb) of a somatic structural variant (SV) breakpoint is associated with altered expression. While genomic rearrangements are associated with widespread copy-number alteration (CNA) patterns, approximately 1,100 genes, including overexpressed cancer driver genes (e.g., TERT, ERBB2, CDK12, CDK4) and underexpressed tumor suppressors (e.g., TP53, RB1, PTEN, STK11), show SV-associated deregulation independent of CNA. SVs associated with the disruption of topologically associated domains, enhancer hijacking, or fusion transcripts are implicated in gene upregulation. For cancer-relevant pathways, SVs considerably expand our understanding of how genes are affected beyond point mutation or CNA.
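
The "within 100 kb" proximity criterion mentioned above can be sketched as a simple interval check. The gene names and coordinates below are made up, the function name is hypothetical, and measuring distance from the gene boundaries is an assumption; the study's own definition of proximity and its downstream expression modeling are not reproduced here.

```python
def genes_near_breakpoints(genes, breakpoints, window=100_000):
    """Return the set of genes with an SV breakpoint within `window` bp.

    genes: dict mapping gene name -> (chrom, start, end)
    breakpoints: list of (chrom, position)
    """
    hits = set()
    for name, (chrom, start, end) in genes.items():
        for bp_chrom, bp_pos in breakpoints:
            # A breakpoint counts if it lies inside the gene or within
            # `window` bp of either gene boundary on the same chromosome.
            if bp_chrom == chrom and start - window <= bp_pos <= end + window:
                hits.add(name)
                break
    return hits


if __name__ == "__main__":
    toy_genes = {  # hypothetical coordinates for illustration only
        "GENE_A": ("chr1", 1_000_000, 1_050_000),
        "GENE_B": ("chr1", 5_000_000, 5_020_000),
        "GENE_C": ("chr2", 1_000_000, 1_050_000),
    }
    toy_breakpoints = [("chr1", 1_120_000), ("chr2", 3_000_000)]
    print(genes_near_breakpoints(toy_genes, toy_breakpoints))
```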

DOI: http://dx.doi.org/10.1016/j.celrep.2018.06.025
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6092947
July 2018

Detecting Somatic Mutations in Normal Cells.

Trends Genet 2018 07 3;34(7):545-557. Epub 2018 May 3.

Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA; Division of Genetics, Brigham and Women's Hospital, Boston, MA, USA.

Somatic mutations have been studied extensively in the context of cancer. Recent studies have demonstrated that high-throughput sequencing data can be used to detect somatic mutations in non-tumor cells. Analysis of such mutations allows us to better understand the mutational processes in normal cells, explore cell lineages in development, and examine potential associations with age-related disease. We describe here approaches for characterizing somatic mutations in normal and non-tumor disease tissues. We discuss several experimental designs and common pitfalls in somatic mutation detection, as well as more recent developments such as phasing and linked-read technology. With the dramatically increasing numbers of samples undergoing genome sequencing, bioinformatic analysis will enable the characterization of somatic mutations and their impact on non-cancer tissues.

DOI: http://dx.doi.org/10.1016/j.tig.2018.04.003
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6029698
July 2018

Aging and neurodegeneration are associated with increased mutations in single human neurons.

Science 2018 02 7;359(6375):555-559. Epub 2017 Dec 7.

Division of Genetics and Genomics, Manton Center for Orphan Disease, and Howard Hughes Medical Institute, Boston Children's Hospital, Boston, MA, USA.

It has long been hypothesized that aging and neurodegeneration are associated with somatic mutation in neurons; however, methodological hurdles have prevented testing this hypothesis directly. We used single-cell whole-genome sequencing to perform genome-wide somatic single-nucleotide variant (sSNV) identification on DNA from 161 single neurons from the prefrontal cortex and hippocampus of 15 normal individuals (aged 4 months to 82 years), as well as 9 individuals affected by early-onset neurodegeneration due to genetic disorders of DNA repair (Cockayne syndrome and xeroderma pigmentosum). sSNVs increased approximately linearly with age in both areas (with a higher rate in hippocampus) and were more abundant in neurodegenerative disease. The accumulation of somatic mutations with age, which we term genosenium, shows age-related, region-related, and disease-related molecular signatures and may be important in other human age-associated conditions.

DOI: http://dx.doi.org/10.1126/science.aao4426
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5831169
February 2018

A Pan-Cancer Proteogenomic Atlas of PI3K/AKT/mTOR Pathway Alterations.

Cancer Cell 2017 06 18;31(6):820-832.e3. Epub 2017 May 18.

Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA.

Molecular alterations involving the PI3K/AKT/mTOR pathway (including mutation, copy number, protein, or RNA) were examined across 11,219 human cancers representing 32 major types. Within specific mutated genes, frequency, mutation hotspot residues, in silico predictions, and functional assays were all informative in distinguishing the subset of genetic variants more likely to have functional relevance. Multiple oncogenic pathways including PI3K/AKT/mTOR converged on similar sets of downstream transcriptional targets. In addition to mutation, structural variations and partial copy losses involving PTEN and STK11 showed evidence for having functional relevance. A substantial fraction of cancers showed high mTOR pathway activity without an associated canonical genetic or genomic alteration, including cancers harboring IDH1 or VHL mutations, suggesting multiple mechanisms for pathway activation.

DOI: http://dx.doi.org/10.1016/j.ccell.2017.04.013
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5502825
June 2017

Orthogonal NGS for High Throughput Clinical Diagnostics.

Sci Rep 2016 Apr 19;6:24650. Epub 2016 Apr 19.

Claritas Genomics, Cambridge MA, USA.

Next generation sequencing is a transformative technology for discovering and diagnosing genetic disorders. However, high-throughput sequencing remains error-prone, necessitating variant confirmation in order to meet the exacting demands of clinical diagnostic sequencing. To address this, we devised an orthogonal, dual platform approach employing complementary target capture and sequencing chemistries to improve speed and accuracy of variant calls at a genomic scale. We combined DNA selection by bait-based hybridization followed by Illumina NextSeq reversible terminator sequencing with DNA selection by amplification followed by Ion Proton semiconductor sequencing. This approach yields genomic scale orthogonal confirmation of ~95% of exome variants. Overall variant sensitivity improves as each method covers thousands of coding exons missed by the other. We conclude that orthogonal NGS offers improvements in variant calling sensitivity when two platforms are used, better specificity for variants identified on both platforms, and greatly reduces the time and expense of Sanger follow-up, thus enabling physicians to act on genomic results more quickly.
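
The orthogonal-confirmation logic described above can be sketched with set operations on variant calls from two platforms. The variant tuples below are made up and the function name is hypothetical; the study's actual confirmation criteria (for example, genotype or quality matching) are not specified in the abstract and are not modeled here.

```python
def orthogonal_summary(platform_a, platform_b):
    """Compare two call sets of (chrom, pos, ref, alt) tuples.

    Variants seen on both platforms are treated as orthogonally confirmed;
    the union reflects the combined sensitivity of the two platforms.
    """
    return {
        "confirmed": platform_a & platform_b,
        "only_a": platform_a - platform_b,
        "only_b": platform_b - platform_a,
        "union": platform_a | platform_b,
    }


if __name__ == "__main__":
    # Toy, made-up calls for illustration only.
    calls_a = {("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"), ("chr2", 40, "G", "A")}
    calls_b = {("chr1", 100, "A", "G"), ("chr2", 40, "G", "A"), ("chr3", 77, "T", "C")}
    summary = orthogonal_summary(calls_a, calls_b)
    print(len(summary["confirmed"]), len(summary["union"]))
```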

DOI: http://dx.doi.org/10.1038/srep24650
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4836299
April 2016

Somatic mutation in single human neurons tracks developmental and transcriptional history.

Science 2015 Oct;350(6256):94-98

Division of Genetics and Genomics, Manton Center for Orphan Disease, and Howard Hughes Medical Institute, Boston Children's Hospital, Boston, MA, USA; Departments of Neurology and Pediatrics, Harvard Medical School, Boston, MA, USA; and Broad Institute of MIT and Harvard, Cambridge, MA, USA.

Neurons live for decades in a postmitotic state, their genomes susceptible to DNA damage. Here we survey the landscape of somatic single-nucleotide variants (SNVs) in the human brain. We identified thousands of somatic SNVs by single-cell sequencing of 36 neurons from the cerebral cortex of three normal individuals. Unlike germline and cancer SNVs, which are often caused by errors in DNA replication, neuronal mutations appear to reflect damage during active transcription. Somatic mutations create nested lineage trees, allowing them to be dated relative to developmental landmarks and revealing a polyclonal architecture of the human cerebral cortex. Thus, somatic mutations in the brain represent a durable and ongoing record of neuronal life history, from development through postmitotic function.

DOI: http://dx.doi.org/10.1126/science.aab1785
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4664477
October 2015

COSMOS: Python library for massively parallel workflows.

Bioinformatics 2014 Oct 30;30(20):2956-8. Epub 2014 Jun 30.

Center for Biomedical Informatics, Harvard Medical School, 10 Shattuck Street, Boston, MA 02115; Department of Pathology, Beth Israel Deaconess Medical Center, 330 Brookline Avenue, Boston, MA 02215, USA; and Department of Biology, Mohammed V University-Agal, 4 Ibn Battouta Avenue, Rabat B.P:1014RP, Morocco.

Summary: Efficient workflows to shepherd clinically generated genomic data through the multiple stages of a next-generation sequencing pipeline are of critical importance in translational biomedical science. Here we present COSMOS, a Python library for workflow management that allows formal description of pipelines and partitioning of jobs. In addition, it includes a user interface for tracking the progress of jobs, abstraction of the queuing system and fine-grained control over the workflow. Workflows can be created on traditional computing clusters as well as cloud-based services.

Availability and Implementation: Source code is available for academic non-commercial research purposes. Links to code and documentation are provided at http://lpm.hms.harvard.edu and http://wall-lab.stanford.edu.

Contact: [email protected] or [email protected]

Supplementary Information: Supplementary data are available at Bioinformatics online.

DOI: http://dx.doi.org/10.1093/bioinformatics/btu385
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4184253
October 2014

An international effort towards developing standards for best practices in analysis, interpretation and reporting of clinical genome sequencing results in the CLARITY Challenge.

Genome Biol 2014 Mar 25;15(3):R53. Epub 2014 Mar 25.

Background: There is tremendous potential for genome sequencing to improve clinical diagnosis and care once it becomes routinely accessible, but this will require formalizing research methods into clinical best practices in the areas of sequence data generation, analysis, interpretation and reporting. The CLARITY Challenge was designed to spur convergence in methods for diagnosing genetic disease starting from clinical case history and genome sequencing data. DNA samples were obtained from three families with heritable genetic disorders and genomic sequence data were donated by sequencing platform vendors. The challenge was to analyze and interpret these data with the goals of identifying disease-causing variants and reporting the findings in a clinically useful format. Participating contestant groups were solicited broadly, and an independent panel of judges evaluated their performance.

Results: A total of 30 international groups were engaged. The entries reveal a general convergence of practices on most elements of the analysis and interpretation process. However, even given this commonality of approach, only two groups identified the consensus candidate variants in all disease cases, demonstrating a need for consistent fine-tuning of the generally accepted methods. There was greater diversity of the final clinical report content and in the patient consenting process, demonstrating that these areas require additional exploration and standardization.

Conclusions: The CLARITY Challenge provides a comprehensive assessment of current practices for using genome sequencing to diagnose and report genetic diseases. There is remarkable convergence in bioinformatic techniques, but medical interpretation and reporting are areas that require further development by many groups.

DOI: http://dx.doi.org/10.1186/gb-2014-15-3-r53
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4073084
March 2014

Impact of sequencing depth in ChIP-seq experiments.

Nucleic Acids Res 2014 May 5;42(9):e74. Epub 2014 Mar 5.

Center for Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA; Division of Genetics, Brigham and Women's Hospital & Harvard Medical School, Boston, MA 02115, USA; Informatics Program, Children's Hospital, Boston, MA 02115, USA.

In a chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) experiment, an important consideration in experimental design is the minimum number of sequenced reads required to obtain statistically significant results. We present an extensive evaluation of the impact of sequencing depth on identification of enriched regions for key histone modifications (H3K4me3, H3K36me3, H3K27me3 and H3K9me2/me3) using deep-sequenced datasets in human and fly. We propose to define sufficient sequencing depth as the number of reads at which detected enrichment regions increase <1% for an additional million reads. Although the required depth depends on the nature of the mark and the state of the cell in each experiment, we observe that sufficient depth is often reached at <20 million reads for fly. For human, there are no clear saturation points for the examined datasets, but our analysis suggests 40-50 million reads as a practical minimum for most marks. We also devise a mathematical model to estimate the sufficient depth and total genomic coverage of a mark. Lastly, we find that the five algorithms tested do not agree well for broad enrichment profiles, especially at lower depths. Our findings suggest that sufficient sequencing depth and an appropriate peak-calling algorithm are essential for ensuring robustness of conclusions derived from ChIP-seq data.
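
The abstract's definition of sufficient sequencing depth (the depth at which detected enriched regions increase by less than 1% per additional million reads) can be sketched as below. The saturation curve is made-up toy data, and the function name and the use of the relative gain between consecutive subsampled depths are assumptions; the paper's own estimation procedure and mathematical model are not reproduced.

```python
def sufficient_depth(curve, threshold=0.01):
    """Find the smallest depth at which the gain per extra million reads is small.

    curve: list of (reads_in_millions, n_enriched_regions), sorted by depth.
    Returns the depth (in millions of reads) where the relative increase in
    detected regions per additional million reads first drops below
    `threshold`, or None if the curve never saturates.
    """
    for (d0, r0), (d1, r1) in zip(curve, curve[1:]):
        if r0 <= 0 or d1 <= d0:
            continue  # skip degenerate points
        gain_per_million = (r1 - r0) / r0 / (d1 - d0)
        if gain_per_million < threshold:
            return d0
    return None


if __name__ == "__main__":
    # Toy saturation curve (made-up numbers for illustration only).
    toy_curve = [(5, 1000), (10, 1500), (15, 1650), (20, 1690), (25, 1700)]
    print(sufficient_depth(toy_curve))
```

With this toy curve, the per-million gains are 10%, 2%, and about 0.5%, so the sketch reports 15 million reads as the sufficient depth.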

DOI: http://dx.doi.org/10.1093/nar/gku178
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4027199
May 2014

Mutation of KCNJ8 in a patient with Cantú syndrome with unique vascular abnormalities - support for the role of K(ATP) channels in this condition.

Eur J Med Genet 2013 Dec 28;56(12):678-82. Epub 2013 Oct 28.

Division of Genetics and Genomics, The Manton Center for Orphan Disease Research, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA.

KCNJ8 (NM_004982) encodes the pore forming subunit of one of the ATP-sensitive inwardly rectifying potassium (KATP) channels. KCNJ8 sequence variations are traditionally associated with J-wave syndromes, involving ventricular fibrillation and sudden cardiac death. Recently, the KATP gene ABCC9 (SUR2, NM_020297) has been associated with the multi-organ disorder Cantú syndrome or hypertrichotic osteochondrodysplasia (MIM 239850) (hypertrichosis, macrosomia, osteochondrodysplasia, and cardiomegaly). Here, we report on a patient with a de novo nonsynonymous KCNJ8 SNV (p.V65M) and Cantú syndrome, who tested negative for mutations in ABCC9. The genotype and multi-organ abnormalities of this patient are reviewed. A careful screening of the KATP genes should be performed in all individuals diagnosed with Cantú syndrome and no mutation in ABCC9.

DOI: http://dx.doi.org/10.1016/j.ejmg.2013.09.009
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3902017
December 2013

Diverse mechanisms of somatic structural variations in human cancer genomes.

Cell 2013 May;153(4):919-29

Center for Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA.

Identification of somatic rearrangements in cancer genomes has accelerated through analysis of high-throughput sequencing data. However, characterization of complex structural alterations and their underlying mechanisms remains inadequate. Here, applying an algorithm to predict structural variations from short reads, we report a comprehensive catalog of somatic structural variations and the mechanisms generating them, using high-coverage whole-genome sequencing data from 140 patients across ten tumor types. We characterize the relative contributions of different types of rearrangements and their mutational mechanisms, find that ~20% of the somatic deletions are complex deletions formed by replication errors, and describe the differences between the mutational mechanisms in somatic and germline alterations. Importantly, we provide detailed reconstructions of the events responsible for loss of CDKN2A/B and gain of EGFR in glioblastoma, revealing that these alterations can result from multiple mechanisms even in a single genome and that both DNA double-strand breaks and replication errors drive somatic rearrangements.

DOI: http://dx.doi.org/10.1016/j.cell.2013.04.010
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3704973
May 2013

Functional genomic analysis of chromosomal aberrations in a compendium of 8000 cancer genomes.

Genome Res 2013 Feb 6;23(2):217-27. Epub 2012 Nov 6.

Center for Biomedical Informatics, Harvard Medical School, Boston, Massachusetts 02115, USA.

A large database of copy number profiles from cancer genomes can facilitate the identification of recurrent chromosomal alterations that often contain key cancer-related genes. It can also be used to explore low-prevalence genomic events such as chromothripsis. In this study, we report an analysis of 8227 human cancer copy number profiles obtained from 107 array comparative genomic hybridization (CGH) studies. Our analysis reveals similarity of chromosomal arm-level alterations among developmentally related tumor types as well as a number of co-occurring pairs of arm-level alterations. Recurrent ("pan-lineage") focal alterations identified across diverse tumor types show an enrichment of known cancer-related genes and genes with relevant functions in cancer-associated phenotypes (e.g., kinase and cell cycle). Tumor type-specific ("lineage-restricted") alterations and their enriched functional categories were also identified. Furthermore, we developed an algorithm for detecting regions in which the copy number oscillates rapidly between fixed levels, indicative of chromothripsis. We observed these massive genomic rearrangements in 1%-2% of the samples with variable tumor type-specific incidence rates. Taken together, our comprehensive view of copy number alterations provides a framework for understanding the functional significance of various genomic alterations in cancer genomes.
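
The notion of copy number that "oscillates rapidly between fixed levels" can be illustrated with a sliding-window count of state switches along consecutive segments. The window size, switch threshold, and restriction to two levels are assumptions picked for this sketch, and the function name is hypothetical; this is not the detection algorithm developed in the study.

```python
def oscillating_windows(states, window=10, min_switches=8, max_levels=2):
    """Return start indices of windows whose copy number oscillates rapidly.

    states: copy-number state of each consecutive segment along a chromosome.
    A window is flagged when it contains at least `min_switches` changes of
    state while using at most `max_levels` distinct copy-number levels.
    """
    hits = []
    for i in range(len(states) - window + 1):
        segment_states = states[i : i + window]
        switches = sum(1 for a, b in zip(segment_states, segment_states[1:]) if a != b)
        if switches >= min_switches and len(set(segment_states)) <= max_levels:
            hits.append(i)
    return hits


if __name__ == "__main__":
    # Toy profile: a flat region, twelve segments alternating 2/1, then flat.
    toy_states = [2] * 5 + [2, 1] * 6 + [2] * 5
    print(oscillating_windows(toy_states))
```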

DOI: http://dx.doi.org/10.1101/gr.140301.112
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3561863
February 2013

Systematic identification of synergistic drug pairs targeting HIV.

Nat Biotechnol 2012 Nov 14;30(11):1125-30. Epub 2012 Oct 14.

Howard Hughes Medical Institute, Department of Genetics, Harvard Medical School, Division of Genetics, Brigham and Women's Hospital, Boston, Massachusetts, USA.

The systematic identification of effective drug combinations has been hindered by the unavailability of methods that can explore the large combinatorial search space of drug interactions. Here we present multiplex screening for interacting compounds (MuSIC), which expedites the comprehensive assessment of pairwise compound interactions. We examined ∼500,000 drug pairs from 1,000 US Food and Drug Administration (FDA)-approved or clinically tested drugs and identified drugs that synergize to inhibit HIV replication. Our analysis reveals an enrichment of anti-inflammatory drugs in drug combinations that synergize against HIV. As inflammation accompanies HIV infection, these findings indicate that inhibiting inflammation could curb HIV propagation. Multiple drug pairs identified in this study, including various glucocorticoids and nitazoxanide (NTZ), synergize by targeting different steps in the HIV life cycle. MuSIC can be applied to a wide variety of disease-relevant screens to facilitate efficient identification of compound combinations.
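
The figure of ∼500,000 drug pairs is consistent with the number of unordered pairs that can be formed from 1,000 compounds, assuming each pair of distinct drugs is counted once:

```python
from math import comb

n_drugs = 1000               # number of drugs, from the abstract
n_pairs = comb(n_drugs, 2)   # unordered pairs of distinct drugs (assumption)
print(n_pairs)               # 499500
```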

DOI: http://dx.doi.org/10.1038/nbt.2391
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3494743
November 2012

Landscape of somatic retrotransposition in human cancers.

Science 2012 Aug 28;337(6097):967-71. Epub 2012 Jun 28.

Center for Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA.

Transposable elements (TEs) are abundant in the human genome, and some are capable of generating new insertions through RNA intermediates. In cancer, the disruption of cellular mechanisms that normally suppress TE activity may facilitate mutagenic retrotranspositions. We performed single-nucleotide resolution analysis of TE insertions in 43 high-coverage whole-genome sequencing data sets from five cancer types. We identified 194 high-confidence somatic TE insertions, as well as thousands of polymorphic TE insertions in matched normal genomes. Somatic insertions were present in epithelial tumors but not in blood or brain cancers. Somatic L1 insertions tend to occur in genes that are commonly mutated in cancer, disrupt the expression of the target genes, and are biased toward regions of cancer-specific DNA hypomethylation, highlighting their potential impact in tumorigenesis.

Source
DOI: http://dx.doi.org/10.1126/science.1222077
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3656569
August 2012

Copy number variation detection in whole-genome sequencing data using the Bayesian information criterion.

Proc Natl Acad Sci U S A 2011 Nov 7;108(46):E1128-36. Epub 2011 Nov 7.

Center for Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA.

DNA copy number variations (CNVs) play an important role in the pathogenesis and progression of cancer and confer susceptibility to a variety of human disorders. Array comparative genomic hybridization has been used widely to identify CNVs genome wide, but next-generation sequencing technology provides an opportunity to characterize CNVs genome wide with unprecedented resolution. In this study, we developed an algorithm to detect CNVs from whole-genome sequencing data and applied it to a newly sequenced glioblastoma genome with a matched control. This read-depth algorithm, called BIC-seq, can accurately and efficiently identify CNVs by minimizing the Bayesian information criterion. Using BIC-seq, we identified hundreds of CNVs as small as 40 bp in the cancer genome sequenced at 10× coverage, whereas we could only detect large CNVs (>15 kb) in the array comparative genomic hybridization profiles for the same genome. Eighty percent (14/16) of the small variants tested (110 bp to 14 kb) were experimentally validated by quantitative PCR, demonstrating high sensitivity and true positive rate of the algorithm. We also extended the algorithm to detect recurrent CNVs in multiple samples and to derive error bars for breakpoints using a Gibbs sampling approach. We propose this statistical approach as a principled yet practical and efficient method to estimate CNVs in whole-genome sequencing data.
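The idea of choosing a segmentation by minimizing the Bayesian information criterion can be illustrated with a small sketch. The abstract does not specify the likelihood, so the Bernoulli model below (each read in a segment is a tumor read with probability p, estimated from the counts) is an assumption, as are the function names and the `penalty` parameter. This is not the BIC-seq implementation.

```python
import math


def log_likelihood(tumor: int, normal: int) -> float:
    """Maximized Bernoulli log-likelihood for one segment (assumed model)."""
    total = tumor + normal
    ll = 0.0
    if tumor:
        ll += tumor * math.log(tumor / total)
    if normal:
        ll += normal * math.log(normal / total)
    return ll


def bic(segments, penalty: float = 1.0) -> float:
    """BIC of a segmentation given as (tumor_reads, normal_reads) pairs.

    One free parameter per segment; `penalty` is a hypothetical tuning knob.
    """
    n_reads = sum(t + n for t, n in segments)
    ll = sum(log_likelihood(t, n) for t, n in segments)
    return -2.0 * ll + penalty * len(segments) * math.log(n_reads)


# Two neighboring bins with the same tumor/normal ratio: merging lowers the BIC.
print(bic([(100, 100)]) < bic([(50, 50), (50, 50)]))

# Two bins with very different ratios: keeping the breakpoint lowers the BIC.
print(bic([(90, 10), (10, 90)]) < bic([(100, 100)]))
```

Under this assumed model, a breakpoint survives only when the gain in likelihood outweighs the penalty for the extra segment.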

Source
DOI: http://dx.doi.org/10.1073/pnas.1110574108
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3219132
November 2011

Comprehensive analysis of the chromatin landscape in Drosophila melanogaster.

Nature 2011 Mar 22;471(7339):480-5. Epub 2010 Dec 22.

Center for Biomedical Informatics, Harvard Medical School, Boston, Massachusetts 02115, USA.

Chromatin is composed of DNA and a variety of modified histones and non-histone proteins, which have an impact on cell differentiation, gene regulation and other key cellular processes. Here we present a genome-wide chromatin landscape for Drosophila melanogaster based on eighteen histone modifications, summarized by nine prevalent combinatorial patterns. Integrative analysis with other data (non-histone chromatin proteins, DNase I hypersensitivity, GRO-Seq reads produced by engaged polymerase, short/long RNA products) reveals discrete characteristics of chromosomes, genes, regulatory elements and other functional domains. We find that active genes display distinct chromatin signatures that are correlated with disparate gene lengths, exon patterns, regulatory functions and genomic contexts. We also demonstrate a diversity of signatures among Polycomb targets that include a subset with paused polymerase. This systematic profiling and integrative analysis of chromatin signatures provides insights into how genomic elements are regulated, and will serve as a resource for future experimental investigations of genome structure and function.

Source
DOI: http://dx.doi.org/10.1038/nature09725
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3109908
March 2011

rSW-seq: algorithm for detection of copy number alterations in deep sequencing data.

BMC Bioinformatics 2010 Aug 18;11:432. Epub 2010 Aug 18.

Center for Biomedical Informatics, Harvard Medical School, 10 Shattuck St, Boston, Massachusetts 02115, USA.

Background: Recent advances in sequencing technologies have enabled generation of large-scale genome sequencing data. These data can be used to characterize a variety of genomic features, including the DNA copy number profile of a cancer genome. A robust and reliable method for screening chromosomal alterations would allow a detailed characterization of the cancer genome with unprecedented accuracy.

Results: We develop a method for identification of copy number alterations in a tumor genome compared to its matched control, based on application of the Smith-Waterman algorithm to single-end sequencing data. In a performance test with simulated data, our algorithm shows >90% sensitivity and >90% precision in detecting a single copy number change that contains approximately 500 reads for the normal sample. With 100-bp reads, this corresponds to a ~50 kb region at 1X coverage of the human genome. We further refine the algorithm to develop rSW-seq (recursive Smith-Waterman-seq) to identify alterations in a complex configuration, which are commonly observed in the human cancer genome. To validate our approach, we compare our algorithm with an existing algorithm using simulated and publicly available datasets. We also compare the sequencing-based profiles to microarray-based results.

Conclusion: We propose rSW-seq as an efficient method for detecting copy number changes in the tumor genome.
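A simplified sketch of a Smith-Waterman-style scan over read labels follows, together with the read-count arithmetic from the Results. It assumes tumor and normal reads have been merged and sorted by position, with each tumor read scoring +1 and each normal read scoring -1. These weights, the label encoding, and the function names are illustrative assumptions and are not taken from the rSW-seq implementation.

```python
def best_local_segment(labels, tumor_weight=1.0, normal_weight=-1.0):
    """Return (score, start, end) of the highest-scoring run; end is exclusive.

    labels: position-sorted sequence of "T" (tumor read) or "N" (normal read).
    The running score is reset to zero whenever it drops to zero or below,
    as in Smith-Waterman local alignment.
    """
    best_score, best_start, best_end = 0.0, 0, 0
    score, start = 0.0, 0
    for i, label in enumerate(labels):
        score += tumor_weight if label == "T" else normal_weight
        if score <= 0:
            score, start = 0.0, i + 1
        elif score > best_score:
            best_score, best_start, best_end = score, start, i + 1
    return best_score, best_start, best_end


def region_length_bp(n_reads, read_length_bp, coverage):
    """Genomic span expected to contain n_reads at the given coverage."""
    return n_reads * read_length_bp / coverage


# A run enriched for tumor reads (indices 3 to 6) suggests a candidate gain.
print(best_local_segment(list("NTNTTTTNTNN")))  # (4.0, 3, 7)

# 500 reads of 100 bp at 1X coverage span about 50 kb, as stated in the Results.
print(region_length_bp(500, 100, 1))  # 50000.0
```

Swapping the signs of the two weights would make the same scan look for losses instead of gains.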

Source
DOI: http://dx.doi.org/10.1186/1471-2105-11-432
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2939611
August 2010

Estimating enrichment of repetitive elements from high-throughput sequence data.

Genome Biol 2010 Jun 28;11(6):R69. Epub 2010 Jun 28.

Harvard-MIT Health Sciences and Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA.

We describe computational methods for analysis of repetitive elements from short-read sequencing data, and apply them to study histone modifications associated with the repetitive elements in human and mouse cells. Our results demonstrate that while accurate enrichment estimates can be obtained for individual repeat types and small sets of repeat instances, there are distinct combinatorial patterns of chromatin marks associated with major annotated repeat families, including H3K27me3/H3K9me3 differences among the endogenous retroviral element classes.

Source
DOI: http://dx.doi.org/10.1186/gb-2010-11-6-r69
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2911117
November 2010