Publications by authors named "Zuojian Tang"

24 Publications

TranspoScope: interactive visualization of retrotransposon insertions.

Bioinformatics 2020 06;36(12):3877-3878

Institute for Systems Genetics.

Motivation: Retrotransposition is an important force in shaping the human genome and is involved in prenatal development, disease and aging. Current genome browsers are not optimized for visualizing the experimental evidence for retrotransposon insertions.

Results: We have developed a specialized browser to visualize the evidence for retrotransposon insertions for both targeted and whole-genome sequencing data.

Availability And Implementation: TranspoScope's source code, as well as installation instructions, are available at https://github.com/FenyoLab/transposcope.

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/btaa244
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7320613
June 2020

Human transposon insertion profiling by sequencing (TIPseq) to map LINE-1 insertions in single cells.

Philos Trans R Soc Lond B Biol Sci 2020 03 10;375(1795):20190335. Epub 2020 Feb 10.

Department of Pathology, Johns Hopkins University School of Medicine, 733 N Broadway, Baltimore, MD 21205, USA.

Long interspersed element-1 (LINE-1, L1) sequences, which comprise about 17% of the human genome, are the product of one of the most active types of mobile DNAs in modern humans. LINE-1 insertion alleles can cause inherited and de novo genetic diseases, and LINE-1-encoded proteins are highly expressed in some cancers. Genome-wide LINE-1 mapping in single cells could be useful for defining somatic and germline retrotransposition rates, and for enabling studies to characterize tumour heterogeneity, relate insertions to transcriptional and epigenetic effects at the cellular level, or describe cellular phylogenies in development. Our laboratories have reported a genome-wide LINE-1 insertion site mapping method for bulk DNA, named transposon insertion profiling by sequencing (TIPseq). There have been significant barriers to applying LINE-1 mapping to single cells, owing to chimeric artefacts and features of repetitive sequences. Here, we optimize a modified TIPseq protocol and show its utility for LINE-1 mapping in single lymphoblastoid cells. Results from single-cell TIPseq experiments compare well to known LINE-1 insertions found by whole-genome sequencing and TIPseq on bulk DNA. Among the several approaches we tested, whole-genome amplification by multiple displacement amplification followed by restriction enzyme digestion, vectorette ligation and LINE-1-targeted PCR had the best assay performance. This article is part of a discussion meeting issue 'Crossroads between transposons and gene regulation'.

Source
DOI: http://dx.doi.org/10.1098/rstb.2019.0335
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7061987
March 2020

Phylogenetic debugging of a complete human biosynthetic pathway transplanted into yeast.

Nucleic Acids Res 2020 01;48(1):486-499

Institute for Systems Genetics and Department of Biochemistry and Molecular Pharmacology, NYU Langone Health, New York, NY, USA.

Cross-species pathway transplantation enables insight into a biological process not possible through traditional approaches. We replaced the enzymes catalyzing the entire Saccharomyces cerevisiae adenine de novo biosynthesis pathway with the human pathway. While the 'humanized' yeast grew in the absence of adenine, it did so poorly. Dissection of the phenotype revealed that PPAT, the human ortholog of ADE4, showed only partial function whereas all other genes complemented fully. Suppressor analysis revealed other pathways that play a role in adenine de novo pathway regulation. Phylogenetic analysis pointed to adaptations of enzyme regulation to endogenous metabolite level 'setpoints' in diverse organisms. Using DNA shuffling, we isolated specific amino acid combinations that stabilize the human protein in yeast. Thus, using adenine de novo biosynthesis as a proof of concept, we suggest that the engineering methods used in this study as well as the debugging strategies can be utilized to transplant metabolic pathways from any origin into yeast.

Source
DOI: http://dx.doi.org/10.1093/nar/gkz1098
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7145547
January 2020

Transposon insertion profiling by sequencing (TIPseq) for mapping LINE-1 insertions in the human genome.

Mob DNA 2019 8;10. Epub 2019 Mar 8.

Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, MD 21205 USA.

Background: Transposable elements make up a significant portion of the human genome. Accurately locating these mobile DNAs is vital to understand their role as a source of structural variation and somatic mutation. To this end, laboratories have developed strategies to selectively amplify or otherwise enrich transposable element insertion sites in genomic DNA.

Results: Here we describe a technique, Transposon Insertion Profiling by sequencing (TIPseq), to map Long INterspersed Element 1 (LINE-1, L1) retrotransposon insertions in the human genome. This method uses vectorette PCR to amplify species-specific L1 (L1PA1) insertion sites followed by paired-end Illumina sequencing. In addition to providing a step-by-step molecular biology protocol, we offer users a guide to our pipeline for data analysis, TIPseqHunter. Our recent studies in pancreatic and ovarian cancer demonstrate the ability of TIPseq to identify invariant (fixed), polymorphic (inherited variants), as well as somatically-acquired L1 insertions that distinguish cancer genomes from a patient's constitutional make-up.

Conclusions: TIPseq provides an approach for amplifying evolutionarily young, active transposable element insertion sites from genomic DNA. Our rationale and variations on this protocol may be useful to those mapping L1 and other mobile elements in complex genomes.

Source
DOI: http://dx.doi.org/10.1186/s13100-019-0148-5
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6407172
March 2019

Transcription factor profiling reveals molecular choreography and key regulators of human retrotransposon expression.

Proc Natl Acad Sci U S A 2018 06 25;115(24):E5526-E5535. Epub 2018 May 25.

Institute for Systems Genetics, NYU Langone Health, New York, NY 10016.

Transposable elements (TEs) represent a substantial fraction of many eukaryotic genomes, and transcriptional regulation of these factors is important to determine TE activities in human cells. However, due to the repetitive nature of TEs, identifying transcription factor (TF)-binding sites from ChIP-sequencing (ChIP-seq) datasets is challenging. Current algorithms are focused on subtle differences between TE copies and thus bias the analysis to relatively old and inactive TEs. Here we describe an approach termed "MapRRCon" (mapping repeat reads to a consensus) which allows us to identify proteins binding to TE DNA sequences by mapping ChIP-seq reads to the TE consensus sequence after whole-genome alignment. Although this method does not assign binding sites to individual insertions in the genome, it provides a landscape of interacting TFs by capturing factors that bind to TEs under various conditions. We applied this method to screen TFs' interaction with L1 in human cells/tissues using ENCODE ChIP-seq datasets and identified 178 of the 512 TFs tested as bound to L1 in at least one biological condition with most of them (138) localized to the promoter. Among these L1-binding factors, we focused on Myc and CTCF, as they play important roles in cancer progression and 3D chromatin structure formation. Furthermore, we explored the transcriptomes of The Cancer Genome Atlas breast and ovarian tumor samples in which a consistent anti-/correlation between L1 and Myc/CTCF expression was observed, suggesting that these two factors may play roles in regulating L1 transcription during the development of such tumors.
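
The consensus-mapping idea described above can be illustrated with a minimal sketch. It assumes the reads have already been aligned to the TE consensus and are supplied as hypothetical (start, length) pairs in 0-based coordinates; the function name `consensus_coverage` is illustrative and is not part of MapRRCon itself.

```python
from typing import Iterable, List, Tuple


def consensus_coverage(
    alignments: Iterable[Tuple[int, int]], consensus_length: int
) -> List[int]:
    """Per-base read depth along a consensus sequence.

    `alignments` holds (start, read_length) pairs, 0-based, describing where
    each read landed on the consensus (a hypothetical input format).
    """
    # Difference array: +1 where a read begins, -1 just past where it ends.
    diff = [0] * (consensus_length + 1)
    for start, length in alignments:
        lo = max(start, 0)
        hi = min(start + length, consensus_length)
        if lo >= hi:
            continue  # read falls entirely outside the consensus
        diff[lo] += 1
        diff[hi] -= 1

    # A running sum of the difference array gives the depth at each base.
    depth: List[int] = []
    running = 0
    for delta in diff[:consensus_length]:
        running += delta
        depth.append(running)
    return depth
```

Comparing such a depth profile between a ChIP sample and its input control along the consensus is one way to see where a factor's signal concentrates, for example over the promoter region; as the abstract notes, this view does not assign binding to individual genomic insertions.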

Source
DOI: http://dx.doi.org/10.1073/pnas.1722565115
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6004460
June 2018

UXT is required for spermatogenesis in mice.

PLoS One 2018 12;13(4):e0195747. Epub 2018 Apr 12.

Department of Biochemistry and Molecular Pharmacology, New York University School of Medicine, New York, NY, United States of America.

Male mammals must simultaneously produce prodigious numbers of sperm and maintain an adequate reserve of stem cells to ensure continuous production of gametes throughout life. Failures in the mechanisms responsible for balancing germ cell differentiation and spermatogonial stem cell (SSC) self-renewal can result in infertility. We discovered a novel requirement for Ubiquitously Expressed Transcript (UXT) in spermatogenesis by developing the first knockout mouse model for this gene. Constitutive deletion of Uxt is embryonic lethal, while conditional knockout in the male germline results in a Sertoli cell-only phenotype during the first wave of spermatogenesis that does not recover in the adult. This phenotype begins to manifest between 6 and 7 days post-partum, just before meiotic entry. Gene expression analysis revealed that Uxt deletion downregulates the transcription of genes governing SSC self-renewal, differentiation, and meiosis, consistent with its previously defined role as a transcriptional co-factor. Our study has revealed the first in vivo function for UXT in the mammalian germline as a regulator of distinct transcriptional programs in SSCs and differentiating spermatogonia.

Source
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0195747
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5896988
July 2018

Synthesis, debugging, and effects of synthetic chromosome consolidation: synVI and beyond.

Science 2017 03;355(6329)

Department of Biochemistry and Molecular Pharmacology, New York University Langone School of Medicine, New York, NY 10016, USA.

We describe design, rapid assembly, and characterization of synthetic yeast Sc2.0 chromosome VI (synVI). A mitochondrial defect in the synVI strain mapped to synonymous coding changes within PRE4, encoding an essential proteasome subunit; Sc2.0 coding changes reduced Pre4 protein accumulation by half. Completing Sc2.0 specifies consolidation of 16 synthetic chromosomes into a single strain. We investigated phenotypic, transcriptional, and proteomewide consequences of Sc2.0 chromosome consolidation in poly-synthetic strains. Another "bug" was discovered through proteomic analysis, associated with alteration of the transcription start due to transfer RNA deletion and loxPsym site insertion. Despite extensive genetic alterations across 6% of the genome, no major global changes were detected in the poly-synthetic strain "omics" analyses. This work sets the stage for completion of a designer, synthetic eukaryotic genome.

Source
DOI: http://dx.doi.org/10.1126/science.aaf4831
March 2017

Low escape-rate genome safeguards with minimal molecular perturbation of Saccharomyces cerevisiae.

Proc Natl Acad Sci U S A 2017 02 7;114(8):E1470-E1479. Epub 2017 Feb 7.

Institute for Systems Genetics, New York University Langone Medical Center, New York, NY 10016.

As the use of synthetic biology both in industry and in academia grows, there is an increasing need to ensure biocontainment. There is growing interest in engineering bacterial- and yeast-based safeguard (SG) strains. First-generation SGs were based on metabolic auxotrophy; however, the risk of cross-feeding and the cost of growth-controlling nutrients led researchers to look for other avenues. Recent strategies include bacteria engineered to be dependent on nonnatural amino acids and yeast SG strains that have both transcriptional- and recombinational-based biocontainment. We describe improving yeast-based transcriptional SG strains, which have near-WT fitness, the lowest possible escape rate, and nanomolar ligands controlling growth. We screened a library of essential genes, as well as the best-performing promoter and terminators, yielding the best SG strains in yeast. The best constructs were fine-tuned, resulting in two tightly controlled inducible systems. In addition, for potential use in the prevention of industrial espionage, we screened an array of possible "decoy molecules" that can be used to mask any proprietary supplement to the SG strain, with minimal effect on strain fitness.

Source
DOI: http://dx.doi.org/10.1073/pnas.1621250114
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5338387
February 2017

Human transposon insertion profiling: Analysis, visualization and identification of somatic LINE-1 insertions in ovarian cancer.

Proc Natl Acad Sci U S A 2017 01 17;114(5):E733-E740. Epub 2017 Jan 17.

Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, MD 21205.

Mammalian genomes are replete with interspersed repeats reflecting the activity of transposable elements. These mobile DNAs are self-propagating, and their continued transposition is a source of both heritable structural variation as well as somatic mutation in human genomes. Tailored approaches to map these sequences are useful to identify insertion alleles. Here, we describe in detail a strategy to amplify and sequence long interspersed element-1 (LINE-1, L1) retrotransposon insertions selectively in the human genome, transposon insertion profiling by next-generation sequencing (TIPseq). We also report the development of a machine-learning-based computational pipeline, TIPseqHunter, to identify insertion sites with high precision and reliability. We demonstrate the utility of this approach to detect somatic retrotransposition events in high-grade ovarian serous carcinoma.

Source
DOI: http://dx.doi.org/10.1073/pnas.1619797114
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5293032
January 2017

Somatic retrotransposition is infrequent in glioblastomas.

Mob DNA 2016 11;7:22. Epub 2016 Nov 11.

Department of Pathology, Johns Hopkins University School of Medicine, Miller Research Building (MRB) Room 447, 733 North Broadway, Baltimore, MD 21205 USA.

Background: Gliomas are the most common primary brain tumors in adults. We sought to understand the roles of endogenous transposable elements in these malignancies by identifying evidence of somatic retrotransposition in glioblastomas (GBM). We performed transposon insertion profiling of the active subfamily of Long INterspersed Element-1 (LINE-1) elements by deep sequencing (TIPseq) on genomic DNA of low passage oncosphere cell lines derived from 7 primary GBM biopsies, 3 secondary GBM tissue samples, and matched normal intravenous blood samples from the same individuals.

Results: We found and PCR validated one somatically acquired tumor-specific insertion in a case of secondary GBM. No LINE-1 insertions present in primary GBM oncosphere cultures were missing from corresponding blood samples. However, several copies of the element (11) were found in genomic DNA from blood and not in the oncosphere cultures. SNP 6.0 microarray analysis revealed deletions or loss of heterozygosity in the tumor genomes over the intervals corresponding to these LINE-1 insertions.

Conclusions: These findings indicate that LINE-1 retrotransposon can act as an infrequent insertional mutagen in secondary GBM, but that retrotransposition is uncommon in these central nervous system tumors as compared to other neoplasias.

Source
DOI: http://dx.doi.org/10.1186/s13100-016-0077-5
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5105304
November 2016

An Analysis of the Sensitivity of Proteogenomic Mapping of Somatic Mutations and Novel Splicing Events in Cancer.

Mol Cell Proteomics 2016 Mar 2;15(3):1060-71. Epub 2015 Dec 2.

Washington University in St. Louis, St. Louis, MO.

Improvements in mass spectrometry (MS)-based peptide sequencing provide a new opportunity to determine whether polymorphisms, mutations, and splice variants identified in cancer cells are translated. Herein, we apply a proteogenomic data integration tool (QUILTS) to illustrate protein variant discovery using whole genome, whole transcriptome, and global proteome datasets generated from a pair of luminal and basal-like breast-cancer-patient-derived xenografts (PDX). The sensitivity of proteogenomic analysis for single nucleotide variant (SNV) expression and novel splice junction (NSJ) detection was probed using multiple MS/MS sample process replicates defined here as an independent tandem MS experiment using identical sample material. Despite analysis of over 30 sample process replicates, only about 10% of SNVs (somatic and germline) detected by both DNA and RNA sequencing were observed as peptides. An even smaller proportion of peptides corresponding to NSJ observed by RNA sequencing were detected (<0.1%). Peptides mapping to DNA-detected SNVs without a detectable mRNA transcript were also observed, suggesting that transcriptome coverage was incomplete (∼80%). In contrast to germline variants, somatic variants were less likely to be detected at the peptide level in the basal-like tumor than in the luminal tumor, raising the possibility of differential translation or protein degradation effects. In conclusion, this large-scale proteogenomic integration allowed us to determine the degree to which mutations are translated and identify gaps in sequence coverage, thereby benchmarking current technology and progress toward whole cancer proteome and transcriptome analysis.

Source
DOI: http://dx.doi.org/10.1074/mcp.M115.056226
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4813688
March 2016

Lhx6 and Lhx8 promote palate development through negative regulation of a cell cycle inhibitor gene, p57Kip2.

Hum Mol Genet 2015 Sep 12;24(17):5024-39. Epub 2015 Jun 12.

Department of Basic Science and Craniofacial Biology, New York University College of Dentistry, New York, NY 10010, USA.

Cleft palate is a common birth defect in humans. Therefore, understanding the molecular genetics of palate development is important from both scientific and medical perspectives. Lhx6 and Lhx8 encode LIM homeodomain transcription factors, and inactivation of both genes in mice resulted in profound craniofacial defects including cleft secondary palate. The initial outgrowth of the palate was severely impaired in the mutant embryos, due to decreased cell proliferation. Through genome-wide transcriptional profiling, we discovered that p57Kip2 (Cdkn1c), encoding a cell cycle inhibitor, was up-regulated in the prospective palate of Lhx6-/-;Lhx8-/- mutants. p57Kip2 has been linked to Beckwith-Wiedemann syndrome and IMAGe syndrome in humans, which are developmental disorders with an increased incidence of palate defects among the patients. To determine the molecular mechanism underlying the regulation of p57Kip2 by the Lhx genes, we combined chromatin immunoprecipitation, in silico search for transcription factor-binding motifs, and in vitro reporter assays with putative cis-regulatory elements. The results of these experiments indicated that LHX6 and LHX8 regulated p57Kip2 via both direct and indirect mechanisms, with the latter mediated by Forkhead box (FOX) family transcription factors. Together, our findings uncovered a novel connection between the initiation of palate development and a cell cycle inhibitor via LHX. We propose a model in which Lhx6 and Lhx8 negatively regulate p57Kip2 expression in the prospective palate area to allow adequate levels of cell proliferation and thereby promote normal palate development. This is the first report elucidating a molecular genetic pathway downstream of Lhx in palate development.

Source
DOI: http://dx.doi.org/10.1093/hmg/ddv223
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4527495
September 2015

Identification of a face enhancer reveals direct regulation of LIM homeobox 8 (Lhx8) by wingless-int (WNT)/β-catenin signaling.

J Biol Chem 2014 Oct 4;289(44):30289-30301. Epub 2014 Sep 4.

Department of Basic Science and Craniofacial Biology, New York University College of Dentistry, New York, New York 10010.

Development of the mammalian face requires a large number of genes that are expressed with spatio-temporal specificity, and transcriptional regulation mediated by enhancers plays a key role in the precise control of gene expression. Using chromatin immunoprecipitation for a histone marker of active enhancers, we generated a genome-wide map of candidate enhancers from the maxillary arch (primordium for the upper jaw) of mouse embryos. Furthermore, we confirmed multiple novel craniofacial enhancers near the genes implicated in human palate defects through functional assays. We characterized in detail one of the enhancers (Lhx8_enh1) located upstream of Lhx8, a key regulatory gene for craniofacial development. Lhx8_enh1 contained an evolutionarily conserved binding site for lymphoid enhancer factor/T-cell factor family proteins, which mediate the transcriptional regulation by the WNT/β-catenin signaling pathway. We demonstrated in vitro that WNT/β-catenin signaling was indeed essential for the expression of Lhx8 in the maxillary arch cells and that Lhx8_enh1 was a direct target of the WNT/β-catenin pathway. Together, we uncovered a molecular mechanism for the regulation of Lhx8, and we provided valuable resources for further investigation into the gene regulatory network of craniofacial development.

Source
DOI: http://dx.doi.org/10.1074/jbc.M114.592014
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4215213
October 2014

FIREWACh: high-throughput functional detection of transcriptional regulatory modules in mammalian cells.

Nat Methods 2014 May 23;11(5):559-65. Epub 2014 Mar 23.

Department of Microbiology, New York University School of Medicine, New York, New York, USA.

Promoters and enhancers establish precise gene transcription patterns. The development of functional approaches for their identification in mammalian cells has been complicated by the size of these genomes. Here we report a high-throughput functional assay for directly identifying active promoter and enhancer elements called FIREWACh (Functional Identification of Regulatory Elements Within Accessible Chromatin), which we used to simultaneously assess over 80,000 DNA fragments derived from nucleosome-free regions within the chromatin of embryonic stem cells (ESCs) and identify 6,364 active regulatory elements. Many of these represent newly discovered ESC-specific enhancers, showing enriched binding-site motifs for ESC-specific transcription factors including SOX2, POU5F1 (OCT4) and KLF4. The application of FIREWACh to additional cultured cell types will facilitate functional annotation of the genome and expand our view of transcriptional network dynamics.

Source
DOI: http://dx.doi.org/10.1038/nmeth.2885
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4020622
May 2014

Co-expression network analysis identifies Spleen Tyrosine Kinase (SYK) as a candidate oncogenic driver in a subset of small-cell lung cancer.

BMC Syst Biol 2013 9;7 Suppl 5:S1. Epub 2013 Dec 9.

Background: Oncogenic mechanisms in small-cell lung cancer remain poorly understood, leaving this tumor with the worst prognosis among all lung cancers. Unlike in other cancer types, genomic sequencing approaches have been of limited success in small-cell lung cancer, i.e., no mutated oncogenes with potential driver characteristics have emerged, as is the case for activating mutations of epidermal growth factor receptor in non-small-cell lung cancer. Differential gene expression analysis has also produced SCLC signatures with limited application, since they are generally not robust across datasets. Nonetheless, additional genomic approaches are warranted, due to the increasing availability of suitable small-cell lung cancer datasets. Gene co-expression network approaches are a recent and promising avenue, since they have been successful in identifying gene modules that drive phenotypic traits in several biological systems, including other cancer types.

Results: We derived an SCLC-specific classifier from weighted gene co-expression network analysis (WGCNA) of a lung cancer dataset. The classifier, termed SCLC-specific hub network (SSHN), robustly separates SCLC from other lung cancer types across multiple datasets and multiple platforms, including RNA-seq and shotgun proteomics. The classifier was also conserved in SCLC cell lines. SSHN is enriched for co-expressed signaling network hubs strongly associated with the SCLC phenotype. Twenty of these hubs are actionable kinases with oncogenic potential, among which spleen tyrosine kinase (SYK) exhibits one of the highest overall statistical associations to SCLC. In patient tissue microarrays and cell lines, SCLC can be separated into SYK-positive and -negative. SYK siRNA decreases proliferation rate and increases cell death of SYK-positive SCLC cell lines, suggesting a role for SYK as an oncogenic driver in a subset of SCLC.

Conclusions: SCLC treatment has thus far been limited to chemotherapy and radiation. Our WGCNA analysis identifies SYK both as a candidate biomarker to stratify SCLC patients and as a potential therapeutic target. In summary, WGCNA represents an alternative strategy to large scale sequencing for the identification of potential oncogenic drivers, based on a systems view of signaling networks. This strategy is especially useful in cancer types where no actionable mutations have emerged.

Source
DOI: http://dx.doi.org/10.1186/1752-0509-7-S5-S1
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4029366
November 2014

Relapse-specific mutations in NT5C2 in childhood acute lymphoblastic leukemia.

Nat Genet 2013 Mar 3;45(3):290-4. Epub 2013 Feb 3.

New York University Cancer Institute, New York University Langone Medical Center, New York, New York, USA.

Relapsed childhood acute lymphoblastic leukemia (ALL) carries a poor prognosis, despite intensive retreatment, owing to intrinsic drug resistance. The biological pathways that mediate resistance are unknown. Here, we report the transcriptome profiles of matched diagnosis and relapse bone marrow specimens from ten individuals with pediatric B-lymphoblastic leukemia using RNA sequencing. Transcriptome sequencing identified 20 newly acquired, novel nonsynonymous mutations not present at initial diagnosis, with 2 individuals harboring relapse-specific mutations in the same gene, NT5C2, encoding a 5'-nucleotidase. Full-exon sequencing of NT5C2 was completed in 61 further relapse specimens, identifying additional mutations in 5 cases. Enzymatic analysis of mutant proteins showed that base substitutions conferred increased enzymatic activity and resistance to treatment with nucleoside analog therapies. Clinically, all individuals who harbored NT5C2 mutations relapsed early, within 36 months of initial diagnosis (P = 0.03). These results suggest that mutations in NT5C2 are associated with the outgrowth of drug-resistant clones in ALL.

Source
DOI: http://dx.doi.org/10.1038/ng.2558
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3681285
March 2013

Combining multiple ChIP-seq peak detection systems using combinatorial fusion.

BMC Genomics 2012 17;13 Suppl 8:S12. Epub 2012 Dec 17.

Laboratory for Informatics and Data Mining, Department of Computer and Information Science, Fordham University, New York, NY 10023, USA.

Background: Due to the recent rapid development of ChIP-seq technology, which uses high-throughput next-generation DNA sequencing to identify the targets of Chromatin Immunoprecipitation, there is an increasing amount of sequencing data being generated that provides us with greater opportunity to analyze genome-wide protein-DNA interactions. In particular, we are interested in evaluating and enhancing computational and statistical techniques for locating protein binding sites. Many peak detection systems have been developed; in this study, we utilize the following six: CisGenome, MACS, PeakSeq, QuEST, SISSRs, and TRLocator.

Results: We define two methods to merge and rescore the regions of two peak detection systems and analyze the performance based on average precision and coverage of transcription start sites. The results indicate that ChIP-seq peak detection can be improved by fusion using score or rank combination.

Conclusion: Our method of combination and fusion analysis would provide a means for generic assessment of available technologies and systems and assist researchers in choosing an appropriate system (or fusion method) for analyzing ChIP-seq data. This analysis offers an alternate approach for increasing true positive rates, while decreasing false positive rates and hence improving the ChIP-seq peak identification process.
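
Score and rank combination, as named in the Results above, can be sketched generically as follows. This is only an illustration under stated assumptions: the regions from the two peak callers are taken to be already merged and keyed by a shared hypothetical region ID, scores are min-max normalized before averaging, and all function names are invented here rather than taken from the paper.

```python
from typing import Dict


def normalize_scores(scores: Dict[str, float]) -> Dict[str, float]:
    """Min-max normalize one system's scores to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {region: 1.0 for region in scores}
    return {region: (s - lo) / (hi - lo) for region, s in scores.items()}


def ranks(scores: Dict[str, float]) -> Dict[str, int]:
    """Rank regions by descending score (1 = best); ties broken by region ID."""
    ordered = sorted(scores, key=lambda region: (-scores[region], region))
    return {region: i + 1 for i, region in enumerate(ordered)}


def score_combination(a: Dict[str, float], b: Dict[str, float]) -> Dict[str, float]:
    """Average of normalized scores for regions reported by both systems."""
    na, nb = normalize_scores(a), normalize_scores(b)
    return {region: (na[region] + nb[region]) / 2 for region in na.keys() & nb.keys()}


def rank_combination(a: Dict[str, float], b: Dict[str, float]) -> Dict[str, float]:
    """Average of ranks for regions reported by both systems (lower is better)."""
    ra, rb = ranks(a), ranks(b)
    return {region: (ra[region] + rb[region]) / 2 for region in ra.keys() & rb.keys()}
```

Note the opposite orientations: a higher combined score is better, while a lower combined rank is better, so the two outputs need to be sorted in opposite directions when rescoring the merged regions.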

Source
DOI: http://dx.doi.org/10.1186/1471-2164-13-S8-S12
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3535708
June 2013

A streamlined method for detecting structural variants in cancer genomes by short read paired-end sequencing.

PLoS One 2012 29;7(10):e48314. Epub 2012 Oct 29.

Department of Pathology and Laboratory Medicine and Abramson Family Cancer Research Institute, Raymond and Ruth Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.

Defining the architecture of a specific cancer genome, including its structural variants, is essential for understanding tumor biology, mechanisms of oncogenesis, and for designing effective personalized therapies. Short read paired-end sequencing is currently the most sensitive method for detecting somatic mutations that arise during tumor development. However, mapping structural variants using this method leads to a large number of false positive calls, mostly due to the repetitive nature of the genome and the difficulty of assigning correct mapping positions to short reads. This study describes a method to efficiently identify large tumor-specific deletions, inversions, duplications and translocations from low coverage data using SVDetect or BreakDancer software and a set of novel filtering procedures designed to reduce false positive calls. Applying our method to a spontaneous T cell lymphoma arising in a core RAG2/p53-deficient mouse, we identified 40 validated tumor-specific structural rearrangements supported by as few as 2 independent read pairs.
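
The abstract does not spell out the filtering procedures, so the following is only a hedged sketch of the general idea of keeping tumor-specific calls with a minimum read-pair support. The `Candidate` record, its field names, and the rule "no supporting pairs in the matched normal" are assumptions made for illustration; they are not the paper's filters and not the output format of SVDetect or BreakDancer.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Candidate:
    """A hypothetical structural-variant call with read-pair support counts."""
    sv_type: str        # e.g. "deletion", "inversion", "translocation"
    chrom_a: str
    pos_a: int
    chrom_b: str
    pos_b: int
    tumor_pairs: int    # independent supporting read pairs in the tumor
    normal_pairs: int   # supporting read pairs in the matched normal


def tumor_specific(
    candidates: Iterable[Candidate], min_tumor_pairs: int = 2
) -> List[Candidate]:
    """Keep calls with enough tumor support and no support in the normal."""
    return [
        c
        for c in candidates
        if c.tumor_pairs >= min_tumor_pairs and c.normal_pairs == 0
    ]
```

The default threshold of 2 mirrors the abstract's statement that validated rearrangements were supported by as few as 2 independent read pairs; with low-coverage data a threshold this permissive would presumably rely on additional filters, such as those for repetitive regions, to control false positives.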

Source
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0048314
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3483208
August 2014

Genetic inactivation of the polycomb repressive complex 2 in T cell acute lymphoblastic leukemia.

Nat Med 2012 Feb 6;18(2):298-301. Epub 2012 Feb 6.

Howard Hughes Medical Institute and Department of Pathology, New York University School of Medicine, New York, New York, USA.

T cell acute lymphoblastic leukemia (T-ALL) is an immature hematopoietic malignancy driven mainly by oncogenic activation of NOTCH1 signaling. In this study we report the presence of loss-of-function mutations and deletions of the EZH2 and SUZ12 genes, which encode crucial components of the Polycomb repressive complex 2 (PRC2), in 25% of T-ALLs. To further study the role of PRC2 in T-ALL, we used NOTCH1-dependent mouse models of the disease, as well as human T-ALL samples, and combined locus-specific and global analysis of NOTCH1-driven epigenetic changes. These studies demonstrated that activation of NOTCH1 specifically induces loss of the repressive mark Lys27 trimethylation of histone 3 (H3K27me3) by antagonizing the activity of PRC2. These studies suggest a tumor suppressor role for PRC2 in human leukemia and suggest a hitherto unrecognized dynamic interplay between oncogenic NOTCH1 and PRC2 function for the regulation of gene expression and cell transformation.

Source
http://dx.doi.org/10.1038/nm.2651
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3274628
February 2012

Effects of nickel treatment on H3K4 trimethylation and gene expression.

PLoS One 2011 Mar 24;6(3):e17728. Epub 2011 Mar 24.

Department of Environmental Medicine, New York University School of Medicine, Tuxedo, New York, United States of America.

Occupational exposure to nickel compounds has been associated with lung and nasal cancers. We have previously shown that exposure of human lung adenocarcinoma A549 cells to NiCl2 for 24 hr significantly increased global levels of trimethylated H3K4 (H3K4me3), a transcriptional activating mark that maps to the promoters of transcribed genes. To further understand the potential epigenetic mechanism(s) underlying nickel carcinogenesis, we performed genome-wide mapping of H3K4me3 by chromatin immunoprecipitation and direct genome sequencing (ChIP-seq) and correlated the results with genome-wide mapping of RNA transcripts by massively parallel sequencing of cDNA (RNA-seq). The effect of NiCl2 treatment on H3K4me3 peaks within 5,000 bp of transcription start sites (TSSs) was analyzed for a set of genes highly induced by nickel in both A549 cells and human peripheral blood mononuclear cells. Nickel exposure increased the level of H3K4 trimethylation in both the promoters and coding regions of several genes, including CA9 and NDRG1, whose expression increased in A549 cells. We also compared the extent of H3K4 trimethylation in the absence and presence of formaldehyde crosslinking and observed that crosslinking of chromatin was required to detect H3K4 trimethylation in the coding regions immediately downstream of TSSs of some nickel-induced genes, including ADM and IGFBP3. This is the first genome-wide mapping of trimethylated H3K4 in the promoter and coding regions of genes induced after exposure to NiCl2. This study may provide insights into the epigenetic mechanism(s) underlying the carcinogenicity of nickel compounds.

Source
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0017728
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3063782
March 2011

Assessing quality and completeness of human transcriptional regulatory pathways on a genome-wide scale.

Biol Direct 2011 Feb 28;6:15. Epub 2011 Feb 28.

Department of Pharmacology, New York University School of Medicine, New York, NY, USA.

Background: Pathway databases are becoming increasingly important and almost omnipresent in most types of biological and translational research. However, little is known about the quality and completeness of pathways stored in these databases. The present study conducts a comprehensive assessment of transcriptional regulatory pathways in humans for seven well-studied transcription factors: MYC, NOTCH1, BCL6, TP53, AR, STAT1, and RELA. The employed benchmarking methodology first involves integrating genome-wide binding with functional gene expression data to derive direct targets of transcription factors. Then the lists of experimentally obtained direct targets are compared with relevant lists of transcriptional targets from 10 commonly used pathway databases.

Results: The results of this study show that for the majority of pathway databases, the overlap between experimentally obtained target genes and targets reported in transcriptional regulatory pathway databases is surprisingly small and often not statistically significant. The only exception is the MetaCore pathway database, which yields a statistically significant intersection with experimental results in 84% of cases. Additionally, we suggest that the lists of experimentally derived direct targets obtained in this study can be used to reveal new biological insight into transcriptional regulation and to suggest novel putative therapeutic targets in cancer.
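
The abstract does not name the statistical test behind "statistically significant" overlap. One common way to score the overlap of two gene lists is a one-sided hypergeometric test, sketched below as an illustration only. The function name, the gene universe size, and the example gene sets are all hypothetical and are not data from the study.

```python
from math import comb


def overlap_p_value(universe_size, n_experimental, n_database, n_overlap):
    """One-sided hypergeometric p-value: the probability of seeing at least
    `n_overlap` shared genes when `n_database` genes are drawn at random from
    a universe of `universe_size` genes, `n_experimental` of which are
    experimentally derived targets."""
    max_overlap = min(n_experimental, n_database)
    total = comb(universe_size, n_database)
    tail = sum(
        comb(n_experimental, k)
        * comb(universe_size - n_experimental, n_database - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Made-up example: two small target lists in an assumed universe of 20,000 genes.
experimental = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
database = {"GENE_B", "GENE_C", "GENE_X"}
shared = experimental & database
p = overlap_p_value(20_000, len(experimental), len(database), len(shared))
```

A small `p` means an overlap of this size would be unlikely by chance. The choice of gene universe strongly affects the result, so it should be stated alongside any reported p-value.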

Conclusions: Our study opens a debate on the validity of using many popular pathway databases to obtain transcriptional regulatory targets. We conclude that the choice of pathway databases should be informed by solid scientific evidence and rigorous empirical evaluation.

Reviewers: This article was reviewed by Prof. Wing Hung Wong, Dr. Thiago Motta Venancio (nominated by Dr. L Aravind), and Prof. Geoff J McLachlan.

Source
http://dx.doi.org/10.1186/1745-6150-6-15
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3055855
February 2011

The ecoresponsive genome of Daphnia pulex.

Science 2011 Feb;331(6017):555-61

Center for Genomics and Bioinformatics, Indiana University, 915 East Third Street, Bloomington, IN 47405, USA.

We describe the draft genome of the microcrustacean Daphnia pulex, which is only 200 megabases and contains at least 30,907 genes. The high gene count is a consequence of an elevated rate of gene duplication resulting in tandem gene clusters. More than a third of Daphnia's genes have no detectable homologs in any other available proteome, and the most amplified gene families are specific to the Daphnia lineage. The coexpansion of gene families interacting within metabolic pathways suggests that the maintenance of duplicated genes is not random, and the analysis of gene expression under different environmental conditions reveals that numerous paralogs acquire divergent expression patterns soon after duplication. Daphnia-specific genes, including many additional loci within sequenced regions that are otherwise devoid of annotations, are the most responsive genes to ecological challenges.

Source
http://dx.doi.org/10.1126/science.1197761
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3529199
February 2011

EST and microarray analysis of horn development in Onthophagus beetles.

BMC Genomics 2009 Oct 30;10:504. Epub 2009 Oct 30.

Department of Biology, Indiana University, Bloomington, Indiana 47405, USA.

Background: The origin of novel traits and their subsequent diversification represent central themes in evo-devo and evolutionary ecology. Here we explore the genetic and genomic basis of a class of traits that is both novel and highly diverse, in a group of organisms that is ecologically complex and experimentally tractable: horned beetles.

Results: We developed two high quality, normalized cDNA libraries for larval and pupal Onthophagus taurus and sequenced 3,488 ESTs that assembled into 451 contigs and 2,330 singletons. We present the annotation and a comparative analysis of the conservation of the sequences. Microarrays developed from the combined libraries were then used to contrast the transcriptome of developing primordia of head horns, prothoracic horns, and legs. Our experiments identify a first comprehensive list of candidate genes for the evolution and diversification of beetle horns. We find that developing horns and legs show many similarities as well as important differences in their transcription profiles, suggesting that the origin of horns was mediated partly, but not entirely, by the recruitment of genes involved in the formation of more traditional appendages such as legs. Furthermore, we find that horns developing from the head and prothorax differ in their transcription profiles to a degree that suggests that head and prothoracic horns are not serial homologs, but instead may have evolved independently from each other.

Conclusion: We have laid the foundation for a systematic analysis of the genetic basis of horned beetle development and diversification with the potential to contribute significantly to several major frontiers in evolutionary developmental biology.

Source
http://dx.doi.org/10.1186/1471-2164-10-504
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2777201
October 2009

ESTPiper--a web-based analysis pipeline for expressed sequence tags.

BMC Genomics 2009 Apr 21;10:174. Epub 2009 Apr 21.

The Center for Genomics and Bioinformatics, Indiana University, Bloomington, Indiana, USA.

Background: EST sequencing projects are increasing in scale and scope as the genome sequencing technologies migrate from core sequencing centers to individual research laboratories. Effectively, generating EST data is no longer a bottleneck for investigators. However, processing large amounts of EST data remains a non-trivial challenge for many. Web-based EST analysis tools are proving to be the most convenient option for biologists when performing their analysis, so these tools must continuously improve on their utility to keep in step with the growing needs of research communities. We have developed a web-based EST analysis pipeline called ESTPiper, which streamlines typical large-scale EST analysis components.

Results: The intuitive web interface guides users through each step of base calling, data cleaning, assembly, genome alignment, annotation, analysis of gene ontology (GO), and microarray oligonucleotide probe design. Each step is modularized. Therefore, a user can execute them separately or together in batch mode. In addition, the user has control over the parameters used by the underlying programs. Extensive documentation of ESTPiper's functionality is embedded throughout the web site to facilitate understanding of the required input and interpretation of the computational results. The user can also download intermediate results and port files to separate programs for further analysis. In addition, our server provides a time-stamped description of the run history for reproducibility. The pipeline can also be installed locally, allowing researchers to modify ESTPiper to suit their own needs.

Conclusion: ESTPiper streamlines the typical process of EST analysis. The pipeline was initially designed in part to support the Daphnia pulex cDNA sequencing project. A web server hosting ESTPiper is provided at http://estpiper.cgb.indiana.edu/ and now supports projects of all sizes. The software is also freely available from the authors for local installations.

Source
http://dx.doi.org/10.1186/1471-2164-10-174
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2676306
April 2009