Publications by authors named "Atif Shahab"

20 Publications

Coherence analysis discriminates between retroviral integration patterns in CD34+ cells transduced under differing clinical trial conditions.

Mol Ther Methods Clin Dev 2015 Apr 29;2:15015. Epub 2015 Apr 29.

Gene Therapy Research Unit, Children's Medical Research Institute and The Children's Hospital at Westmead, Westmead, Australia; The University of Sydney, Discipline of Paediatrics and Child Health, Westmead, Australia.

Unequivocal demonstration of the therapeutic utility of γ-retroviral vectors for gene therapy applications targeting the hematopoietic system was accompanied by instances of insertional mutagenesis. These events stimulated the ongoing development of putatively safer integrating vector systems and analysis methods to characterize and compare integration site (IS) biosafety profiles. Continuing advances in next-generation sequencing technologies are driving the generation of ever-more complex IS datasets. Available bioinformatic tools to compare such datasets focus on the association of integration sites (ISs) with selected genomic and epigenetic features, and the choice of these features determines the ability to discriminate between datasets. We describe the scalable application of point-process coherence analysis (CA) to compare patterns produced by vector ISs across genomic intervals, uncoupled from association with genomic features. To explore the utility of CA in the context of an unresolved question, we asked whether the differing transduction conditions used in the initial Paris and London SCID-X1 gene therapy trials result in divergent genome-wide integration profiles. We tested a transduction carried out under each condition, and showed that CA could indeed resolve differences in IS distributions. Existence of these differences was confirmed by the application of established methods to compare integration datasets.

Source
DOI: http://dx.doi.org/10.1038/mtm.2015.15
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4445430
June 2015

Whole-genome reconstruction and mutational signatures in gastric cancer.

Genome Biol 2012 Dec 13;13(12):R115. Epub 2012 Dec 13.

Background: Gastric cancer is the second highest cause of global cancer mortality. To explore the complete repertoire of somatic alterations in gastric cancer, we combined massively parallel short read and DNA paired-end tag sequencing to present the first whole-genome analysis of two gastric adenocarcinomas, one with chromosomal instability and the other with microsatellite instability.

Results: Integrative analysis and de novo assemblies revealed the architecture of a wild-type KRAS amplification, a common driver event in gastric cancer. We discovered three distinct mutational signatures in gastric cancer: against a genome-wide backdrop of oxidative and microsatellite instability-related mutational signatures, we identified the first exome-specific mutational signature. Further characterization of the impact of these signatures by combining sequencing data from 40 complete gastric cancer exomes and targeted screening of an additional 94 independent gastric tumors uncovered ACVR2A, RPL22 and LMAN1 as recurrently mutated genes in microsatellite instability-positive gastric cancer and PAPPA as a recurrently mutated gene in TP53 wild-type gastric cancer.

Conclusions: These results highlight how whole-genome cancer sequencing can uncover information relevant to tissue-specific carcinogenesis that would otherwise be missed from exome-sequencing data.

Source
DOI: http://dx.doi.org/10.1186/gb-2012-13-12-r115
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4056366
December 2012

Long span DNA paired-end-tag (DNA-PET) sequencing strategy for the interrogation of genomic structural mutations and fusion-point-guided reconstruction of amplicons.

PLoS One 2012 Sep 28;7(9):e46152. Epub 2012 Sep 28.

Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore, Singapore.

Structural variations (SVs) contribute significantly to the variability of the human genome and extensive genomic rearrangements are a hallmark of cancer. While genomic DNA paired-end-tag (DNA-PET) sequencing is an attractive approach to identify genomic SVs, the current application of PET sequencing with short insert size DNA can be insufficient for the comprehensive mapping of SVs in low complexity and repeat-rich genomic regions. We employed a recently developed procedure to generate PET sequencing data using large DNA inserts of 10-20 kb and compared their characteristics with short insert (1 kb) libraries for their ability to identify SVs. Our results suggest that although short insert libraries bear an advantage in identifying small deletions, they do not provide significantly better breakpoint resolution. In contrast, large inserts are superior to short inserts in providing higher physical genome coverage for the same sequencing cost and achieve greater sensitivity, in practice, for the identification of several classes of SVs, such as copy number neutral and complex events. Furthermore, our results confirm that large insert libraries allow for the identification of SVs within repetitive sequences, which cannot be spanned by short inserts. This provides a key advantage in studying rearrangements in cancer, and we show how it can be used in a fusion-point-guided-concatenation algorithm to study focally amplified regions in cancer.

Source
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0046152
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3461012
February 2013

Landscape of transcription in human cells.

Nature 2012 Sep;489(7414):101-8

Centre for Genomic Regulation and UPF, Doctor Aiguader 88, Barcelona 08003, Catalonia, Spain.

Eukaryotic cells make many types of primary and processed RNAs that are found either in specific subcellular compartments or throughout the cells. A complete catalogue of these RNAs is not yet available and their characteristic subcellular localizations are also poorly understood. Because RNA represents the direct output of the genetic information encoded by genomes and a significant proportion of a cell's regulatory capabilities are focused on its synthesis, processing, transport, modification and translation, the generation of such a catalogue is crucial for understanding genome function. Here we report evidence that three-quarters of the human genome is capable of being transcribed, as well as observations about the range and levels of expression, localization, processing fates, regulatory regions and modifications of almost all currently annotated and thousands of previously unannotated RNAs. These observations, taken together, prompt a redefinition of the concept of a gene.

Source
DOI: http://dx.doi.org/10.1038/nature11233
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3684276
September 2012

A common BIM deletion polymorphism mediates intrinsic resistance and inferior responses to tyrosine kinase inhibitors in cancer.

Nat Med 2012 Mar 18;18(4):521-8. Epub 2012 Mar 18.

Cancer & Stem Cell Biology Signature Research Programme, Duke-National University of Singapore Graduate Medical School, Singapore.

Tyrosine kinase inhibitors (TKIs) elicit high response rates among individuals with kinase-driven malignancies, including chronic myeloid leukemia (CML) and epidermal growth factor receptor-mutated non-small-cell lung cancer (EGFR NSCLC). However, the extent and duration of these responses are heterogeneous, suggesting the existence of genetic modifiers affecting an individual's response to TKIs. Using paired-end DNA sequencing, we discovered a common intronic deletion polymorphism in the gene encoding BCL2-like 11 (BIM). BIM is a pro-apoptotic member of the B-cell CLL/lymphoma 2 (BCL2) family of proteins, and its upregulation is required for TKIs to induce apoptosis in kinase-driven cancers. The polymorphism switched BIM splicing from exon 4 to exon 3, which resulted in expression of BIM isoforms lacking the pro-apoptotic BCL2-homology domain 3 (BH3). The polymorphism was sufficient to confer intrinsic TKI resistance in CML and EGFR NSCLC cell lines, but this resistance could be overcome with BH3-mimetic drugs. Notably, individuals with CML and EGFR NSCLC harboring the polymorphism experienced significantly inferior responses to TKIs than did individuals without the polymorphism (P = 0.02 for CML and P = 0.027 for EGFR NSCLC). Our results offer an explanation for the heterogeneity of TKI responses across individuals and suggest the possibility of personalizing therapy with BH3 mimetics to overcome BIM-polymorphism-associated TKI resistance.

Source
DOI: http://dx.doi.org/10.1038/nm.2713
March 2012

CTCF-mediated functional chromatin interactome in pluripotent cells.

Nat Genet 2011 Jun 19;43(7):630-8. Epub 2011 Jun 19.

Genome Institute of Singapore, Singapore.

Mammalian genomes are viewed as functional organizations that orchestrate spatial and temporal gene regulation. CTCF, the most characterized insulator-binding protein, has been implicated as a key genome organizer. However, little is known about CTCF-associated higher-order chromatin structures at a global scale. Here we applied chromatin interaction analysis by paired-end tag (ChIA-PET) sequencing to elucidate the CTCF-chromatin interactome in pluripotent cells. From this analysis, we identified 1,480 cis- and 336 trans-interacting loci with high reproducibility and precision. Associating these chromatin interaction loci with their underlying epigenetic states, promoter activities, enhancer binding and nuclear lamina occupancy, we uncovered five distinct chromatin domains that suggest potential new models of CTCF function in chromatin organization and transcriptional control. Specifically, CTCF interactions demarcate chromatin-nuclear membrane attachments and influence proper gene expression through extensive cross-talk between promoters and regulatory elements. This highly complex nuclear organization offers insights toward the unifying principles that govern genome plasticity and function.

Source
DOI: http://dx.doi.org/10.1038/ng.857
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3436933
June 2011

Comprehensive long-span paired-end-tag mapping reveals characteristic patterns of structural variations in epithelial cancer genomes.

Genome Res 2011 May 5;21(5):665-75. Epub 2011 Apr 5.

Genome Technology and Biology, Genome Institute of Singapore, Singapore 138672, Singapore.

Somatic genome rearrangements are thought to play important roles in cancer development. We optimized a long-span paired-end-tag (PET) sequencing approach using 10-Kb genomic DNA inserts to study human genome structural variations (SVs). The use of a 10-Kb insert size allows the identification of breakpoints within repetitive or homology-containing regions of a few kilobases in size and results in a higher physical coverage compared with small insert libraries with the same sequencing effort. We have applied this approach to comprehensively characterize the SVs of 15 cancer and two noncancer genomes and used a filtering approach to strongly enrich for somatic SVs in the cancer genomes. Our analyses revealed that most inversions, deletions, and insertions are germ-line SVs, whereas tandem duplications, unpaired inversions, interchromosomal translocations, and complex rearrangements are over-represented among somatic rearrangements in cancer genomes. We demonstrate that the quantitative and connective nature of DNA-PET data is precise in delineating the genealogy of complex rearrangement events, we observe signatures that are compatible with breakage-fusion-bridge cycles, and we discover that large duplications are among the initial rearrangements that trigger genome instability for extensive amplification in epithelial cancers.

Source
DOI: http://dx.doi.org/10.1101/gr.113555.110
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3083083
May 2011

Transcriptional consequences of genomic structural aberrations in breast cancer.

Genome Res 2011 May 5;21(5):676-87. Epub 2011 Apr 5.

Cancer Biology and Pharmacology, Genome Institute of Singapore, Genome, Singapore 138672, Singapore.

Using a long-span, paired-end deep sequencing strategy, we have comprehensively identified cancer genome rearrangements in eight breast cancer genomes. Herein, we show that 40%-54% of these structural genomic rearrangements result in different forms of fusion transcripts and that 44% are potentially translated. We find that single segmental tandem duplication spanning several genes is a major source of the fusion gene transcripts in both cell lines and primary tumors involving adjacent genes placed in the reverse-order position by the duplication event. Certain other structural mutations, however, tend to attenuate gene expression. From these candidate gene fusions, we have found a fusion transcript (RPS6KB1-VMP1) recurrently expressed in ∼30% of breast cancers associated with potential clinical consequences. This gene fusion is caused by tandem duplication on 17q23 and appears to be an indicator of local genomic instability altering the expression of oncogenic components such as MIR21 and RPS6KB1.

Source
DOI: http://dx.doi.org/10.1101/gr.113225.110
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3083084
May 2011

Deconvolution of chromatin immunoprecipitation-microarray (ChIP-chip) analysis of MBF occupancies reveals the temporal recruitment of Rep2 at the MBF target genes.

Eukaryot Cell 2011 Jan 12;10(1):130-41. Epub 2010 Nov 12.

Laboratory of Systems Biology, Genome Institute of Singapore, Singapore, Republic of Singapore.

MBF (or DSC1) is known to regulate transcription of a set of G1/S-phase genes encoding proteins involved in regulation of DNA replication. Previous studies have shown that MBF binds not only the promoter of G1/S-phase genes, but also the constitutive genes; however, it was unclear if the MBF bindings at the G1/S-phase and constitutive genes were mechanistically distinguishable. Here, we report a chromatin immunoprecipitation-microarray (ChIP-chip) analysis of MBF binding in the Schizosaccharomyces pombe genome using high-resolution genome tiling microarrays. ChIP-chip analysis indicates that the majority of the MBF occupancies are located at the intragenic regions. Deconvolution analysis using Rpb1 ChIP-chip results distinguishes the Cdc10 bindings at the Rpb1-poor loci (promoters) from those at the Rpb1-rich loci (intragenic sequences). Importantly, Res1 binding at the Rpb1-poor loci, but not at the Rpb1-rich loci, is dependent on the Cdc10 function, suggesting a distinct binding mechanism. Most Cdc10 promoter bindings at the Rpb1-poor loci are associated with the G1/S-phase genes. While Res1 or Res2 is found at both the Cdc10 promoter and intragenic binding sites, Rep2 appears to be absent at the Cdc10 promoter binding sites but present at the intragenic sites. Time course ChIP-chip analysis demonstrates that Rep2 is temporally accumulated at the coding region of the MBF target genes, resembling the RNAP-II occupancies. Taken together, our results show that deconvolution analysis of Cdc10 occupancies refines the functional subset of genomic binding sites. We propose that the MBF activator Rep2 plays a role in mediating the cell cycle-specific transcription through the recruitment of RNAP-II to the MBF-bound G1/S-phase genes.

Source
DOI: http://dx.doi.org/10.1128/EC.00218-10
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3019803
January 2011

Genome-wide analysis reveals methyl-CpG-binding protein 2-dependent regulation of microRNAs in a mouse model of Rett syndrome.

Proc Natl Acad Sci U S A 2010 Oct 4;107(42):18161-6. Epub 2010 Oct 4.

Department of Molecular and Medical Pharmacology, University of California, Los Angeles, CA 90095, USA.

MicroRNAs (miRNAs) are a class of small, noncoding RNAs that function as posttranscriptional regulators of gene expression. Many miRNAs are expressed in the developing brain and regulate multiple aspects of neural development, including neurogenesis, dendritogenesis, and synapse formation. Rett syndrome (RTT) is a progressive neurodevelopmental disorder caused by mutations in the gene encoding methyl-CpG-binding protein 2 (MECP2). Although Mecp2 is known to act as a global transcriptional regulator, miRNAs that are directly regulated by Mecp2 in the brain are not known. Using massively parallel sequencing methods, we have identified miRNAs whose expression is altered in cerebella of Mecp2-null mice before and after the onset of severe neurological symptoms. In vivo genome-wide analyses indicate that promoter regions of a significant fraction of dysregulated miRNA transcripts, including a large polycistronic cluster of brain-specific miRNAs, are DNA-methylated and are bound directly by Mecp2. Functional analysis demonstrates that the 3' UTR of messenger RNA encoding Brain-derived neurotrophic factor (Bdnf) can be targeted by multiple miRNAs aberrantly up-regulated in the absence of Mecp2. Taken together, these results suggest that dysregulation of miRNAs may contribute to RTT pathoetiology and also may provide a valuable resource for further investigations of the role of miRNAs in RTT.

Source
DOI: http://dx.doi.org/10.1073/pnas.1005595107
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2964235
October 2010

webMGR: an online tool for the multiple genome rearrangement problem.

Bioinformatics 2010 Feb 18;26(3):408-10. Epub 2009 Dec 18.

Genome Institute of Singapore, 60 Biopolis Street, #02-01, Genome, Singapore 138672.

Summary: The algorithm MGR enables the reconstruction of rearrangement phylogenies based on gene or synteny block order in multiple genomes. Although MGR has been successfully applied to study the evolution of different sets of species, its utilization has been hampered by the prohibitive running time for some applications. In the current work, we have designed new heuristics that significantly speed up the tool without compromising its accuracy. Moreover, we have developed a web server (webMGR) that includes elaborate web output to facilitate navigation through the results.

Availability: webMGR can be accessed via http://www.gis.a-star.edu.sg/~bourque. The source code of the improved standalone version of MGR is also freely available from the web site.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/btp689
February 2010

Integration of external signaling pathways with the core transcriptional network in embryonic stem cells.

Cell 2008 Jun;133(6):1106-17

Gene Regulation Laboratory, Genome Institute of Singapore, Singapore 138672.

Transcription factors (TFs) and their specific interactions with targets are crucial for specifying gene-expression programs. To gain insights into the transcriptional regulatory networks in embryonic stem (ES) cells, we use chromatin immunoprecipitation coupled with ultra-high-throughput DNA sequencing (ChIP-seq) to map the locations of 13 sequence-specific TFs (Nanog, Oct4, STAT3, Smad1, Sox2, Zfx, c-Myc, n-Myc, Klf4, Esrrb, Tcfcp2l1, E2f1, and CTCF) and 2 transcription regulators (p300 and Suz12). These factors are known to play different roles in ES-cell biology as components of the LIF and BMP signaling pathways, self-renewal regulators, and key reprogramming factors. Our study provides insights into the integration of the signaling pathways into the ES-cell-specific transcription circuitries. Intriguingly, we find specific genomic regions extensively targeted by different TFs. Collectively, the comprehensive mapping of TF-binding sites identifies important features of the transcriptional regulatory networks that define ES-cell identity.

Source
DOI: http://dx.doi.org/10.1016/j.cell.2008.04.043
June 2008

Quality assessment of the Affymetrix U133A&B probesets by target sequence mapping and expression data analysis.

In Silico Biol 2007;7(3):241-60

Genome Institute of Singapore, 60 Biopolis str., Genome, Singapore 138672.

Careful analysis of microarray probe design should be an obligatory component of the MicroArray Quality Control (MAQC) project [Patterson et al., 2006; Shi et al., 2006], initiated by the FDA (USA) to provide quality control tools to researchers of gene expression profiles and to translate microarray technology from bench to bedside. The identification and filtering of unreliable probesets are important preprocessing steps before analysis of microarray data. These steps may result in an essential improvement in the selection of differentially expressed genes, gene clustering and construction of co-regulatory expression networks. We revised the genome localization of the Affymetrix U133A&B GeneChip initial (target) probe sequences, and evaluated the impact of erroneous and poorly annotated target sequences on the quality of gene expression data. We found that about 25% of Affymetrix target sequences overlap with interspersed repeats that could cause cross-hybridization effects. In total, discrepancies in target sequence annotation account for up to approximately 30% of the 44,692 Affymetrix probesets. We introduce a novel quality control algorithm based on target sequence mapping onto the genome and GeneChip expression data analysis. To validate the quality of probesets, we used expression data from large, clinically and genetically distinct groups of breast cancers (249 samples). For the first time, we quantitatively evaluated the effect of repeats and other sources of inadequate probe design on the specificity, reliability and discrimination ability of Affymetrix probesets. We propose that only the functionally reliable Affymetrix probesets that passed our quality control algorithm (approximately 86%) be used for gene expression analysis. The target sequence annotation and filtering are available upon request.

May 2008

Whole-genome mapping of histone H3 Lys4 and 27 trimethylations reveals distinct genomic compartments in human embryonic stem cells.

Cell Stem Cell 2007 Sep;1(3):286-98

Genome Technology and Biology Group, Genome Institute of Singapore, 138672, Singapore.

Epigenetic modifications are crucial for proper lineage specification and embryo development. To explore the chromatin modification landscapes in human ES cells, we profiled two histone modifications, H3K4me3 and H3K27me3, by ChIP coupled with the paired-end ditags sequencing strategy. H3K4me3 was found to be a prevalent mark and occurred in close proximity to the promoters of two-thirds of total human genes. Among the H3K27me3 loci identified, 56% are associated with promoters and the vast majority of them are comodified by H3K4me3. By deep-transcript digital counting, 80% of H3K4me3 and 36% of comodified promoters were found to be transcribed. Remarkably, we observed that different combinations of histone methylations are associated with genes from distinct functional categories. These global histone methylation maps provide an epigenetic framework that enables the discovery of novel transcriptional networks and delineation of different genetic compartments of the pluripotent cell genome.

Source
DOI: http://dx.doi.org/10.1016/j.stem.2007.08.004
September 2007

Genome-wide mapping of RELA(p65) binding identifies E2F1 as a transcriptional activator recruited by NF-kappaB upon TLR4 activation.

Mol Cell 2007 Aug;27(4):622-35

Laboratory of Immunology and Virology, Genome Institute of Singapore, 138672 Singapore.

NF-kappaB is a key mediator of inflammation. Here, we mapped the genome-wide loci bound by the RELA subunit of NF-kappaB in lipopolysaccharide (LPS)-stimulated human monocytic cells, and together with global gene expression profiling, found an overrepresentation of the E2F1-binding motif among RELA-bound loci associated with NF-kappaB target genes. Knockdown of endogenous E2F1 impaired the LPS inducibility of the proinflammatory cytokines CCL3(MIP-1alpha), IL23A(p19), TNF-alpha, and IL1-beta. Upon LPS stimulation, E2F1 is rapidly recruited to the promoters of these genes along with p50/RELA heterodimer via a mechanism that is dependent on NF-kappaB activation. Together with the observation that E2F1 physically interacts with p50/RELA in LPS-stimulated cells, our findings suggest that NF-kappaB recruits E2F1 to fully activate the transcription of NF-kappaB target genes. Global gene expression profiling subsequently revealed a spectrum of NF-kappaB target genes that are positively regulated by E2F1, further demonstrating the critical role of E2F1 in the Toll-like receptor 4 pathway.

Source
DOI: http://dx.doi.org/10.1016/j.molcel.2007.06.038
August 2007

Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project.

Nature 2007 Jun;447(7146):799-816

We report the generation and analysis of functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project. These data have been further integrated and augmented by a number of evolutionary and computational analyses. Together, our results advance the collective knowledge about human genome function in several major areas. First, our studies provide convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts, including non-protein-coding transcripts, and those that extensively overlap one another. Second, systematic examination of transcriptional regulation has yielded new understanding about transcription start sites, including their relationship to specific regulatory sequences and features of chromatin accessibility and histone modification. Third, a more sophisticated view of chromatin structure has emerged, including its inter-relationship with DNA replication and transcriptional regulation. Finally, integration of these new sources of information, in particular with respect to mammalian evolution based on inter- and intra-species sequence comparisons, has yielded new mechanistic and evolutionary insights concerning the functional landscape of the human genome. Together, these studies are defining a path for pursuit of a more comprehensive characterization of human genome function.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2212820
DOI: http://dx.doi.org/10.1038/nature05874
June 2007

Fusion transcripts and transcribed retrotransposed loci discovered through comprehensive transcriptome analysis using Paired-End diTags (PETs).

Genome Res 2007 Jun;17(6):828-38

Genome Technology and Biology Group, Genome Institute of Singapore, Singapore 138672, Singapore.

Identification of unconventional functional features such as fusion transcripts is a challenging task in the effort to annotate all functional DNA elements in the human genome. Paired-End diTag (PET) analysis possesses a unique capability to accurately and efficiently characterize the two ends of DNA fragments, which may have either normal or unusual compositions. This unique nature of PET analysis makes it an ideal tool for uncovering unconventional features residing in the human genome. Using the PET approach for comprehensive transcriptome analysis, we were able to identify fusion transcripts derived from genome rearrangements and actively expressed retrotransposed pseudogenes, which would be difficult to capture by other means. Here, we demonstrate this unique capability through the analysis of 865,000 individual transcripts in two types of cancer cells. In addition to the characterization of a large number of differentially expressed alternative 5' and 3' transcript variants and novel transcriptional units, we identified 70 fusion transcript candidates in this study. One was validated as the product of a fusion gene between BCAS4 and BCAS3 resulting from an amplification followed by a translocation event between the two loci, chr20q13 and chr17q23. Through an examination of PETs that mapped to multiple genomic locations, we identified 4055 retrotransposed loci in the human genome, of which at least three were found to be transcriptionally active. The PET mapping strategy presented here promises to be a useful tool in annotating the human genome, especially aberrations in human cancer genomes.

Source
DOI: http://dx.doi.org/10.1101/gr.6018607
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1891342
June 2007

Global mapping of c-Myc binding sites and target gene networks in human B cells.

Proc Natl Acad Sci U S A 2006 Nov 8;103(47):17834-9. Epub 2006 Nov 8.

Department of Medicine and The Sidney Kimmel Comprehensive Cancer Center at Johns Hopkins, The Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA.

The protooncogene MYC encodes the c-Myc transcription factor that regulates cell growth, cell proliferation, cell cycle, and apoptosis. Although deregulation of MYC contributes to tumorigenesis, it is still unclear what direct Myc-induced transcriptomes promote cell transformation. Here we provide a snapshot of genome-wide, unbiased characterization of direct Myc binding targets in a model of human B lymphoid tumor using ChIP coupled with paired-end ditag sequencing analysis (ChIP-PET). Myc potentially occupies >4,000 genomic loci, with the majority near proximal promoter regions frequently associated with CpG islands. Using gene expression profiles with ChIP-PET, we identified 668 direct Myc-regulated gene targets, including 48 transcription factors, indicating that Myc is a central transcriptional hub in growth and proliferation control. This first global genomic view of Myc binding sites yields insights into transcriptional circuitries and cis regulatory modules involving Myc and provides a substantial framework for our understanding of mechanisms of Myc-induced tumorigenesis.

Source
DOI: http://dx.doi.org/10.1073/pnas.0604129103
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1635161
November 2006

A global map of p53 transcription-factor binding sites in the human genome.

Cell 2006 Jan;124(1):207-19

Genome Institute of Singapore, Singapore 138672.

The ability to derive a whole-genome map of transcription-factor binding sites (TFBS) is crucial for elucidating gene regulatory networks. Herein, we describe a robust approach that couples chromatin immunoprecipitation (ChIP) with the paired-end ditag (PET) sequencing strategy for unbiased and precise global localization of TFBS. We have applied this strategy to map p53 targets in the human genome. From a saturated sampling of over half a million PET sequences, we characterized 65,572 unique p53 ChIP DNA fragments and established overlapping PET clusters as a readout to define p53 binding loci with remarkable specificity. Based on this information, we refined the consensus p53 binding motif, identified at least 542 binding loci with high confidence, discovered 98 previously unidentified p53 target genes that were implicated in novel aspects of p53 functions, and showed their clinical relevance to p53-dependent tumorigenesis in primary cancer samples.

Source
DOI: http://dx.doi.org/10.1016/j.cell.2005.10.043
January 2006

Gene identification signature (GIS) analysis for transcriptome characterization and genome annotation.

Nat Methods 2005 Feb 9;2(2):105-11. Epub 2005 Jan 9.

Genome Institute of Singapore, 60 Biopolis Street, Genome 02-01, Singapore 138672.

We have developed a DNA tag sequencing and mapping strategy called gene identification signature (GIS) analysis, in which 5' and 3' signatures of full-length cDNAs are accurately extracted into paired-end ditags (PETs) that are concatenated for efficient sequencing and mapped to genome sequences to demarcate the transcription boundaries of every gene. GIS analysis is potentially 30-fold more efficient than standard cDNA sequencing approaches for transcriptome characterization. We demonstrated this approach with 116,252 PET sequences derived from mouse embryonic stem cells. Initial analysis of this dataset identified hundreds of previously uncharacterized transcripts, including alternative transcripts of known genes. We also uncovered several intergenically spliced and unusual fusion transcripts, one of which was confirmed as a trans-splicing event and was differentially expressed. The concept of paired-end ditagging described here for transcriptome analysis can also be applied to whole-genome analysis of cis-regulatory and other DNA elements and represents an important technological advance for genome annotation.

Source
DOI: http://dx.doi.org/10.1038/nmeth733
February 2005