Publications by authors named "Arunima Srivastava"

14 Publications

PTR Explorer: An approach to identify and explore Post Transcriptional Regulatory mechanisms using proteogenomics.

Pac Symp Biocomput 2020 ;25:475-486

Dept. of Computer Science and Engineering, The Ohio State University, 2015 Neil Ave, Columbus, OH, USA.

Integration of transcriptomic and proteomic data should reveal multi-layered regulatory processes governing cancer cell behaviors. Traditional correlation-based analyses have demonstrated limited ability to identify the post-transcriptional regulatory (PTR) processes that drive the non-linear relationship between transcript and protein abundances. In this work, we devise an integrative approach to explore the variety of post-transcriptional mechanisms that dictate relationships between genes and corresponding proteins. The proposed workflow uses scatterplot diagnostics, or scagnostics, to characterize and examine the diverse scatterplots built from transcript and protein abundances in a proteogenomic experiment. The workflow represents gene-protein relationships as scatterplots, clusters them on their geometric scagnostic features, and finally identifies and groups the potential gene-protein relationships according to their disposition to various PTR mechanisms. Comprehensive tests on a synthetic dataset verify that the implemented approach can uncover possible regulatory mechanisms. We also propose a variety of 2D pattern-specific downstream analysis methodologies, such as mixture modeling and mapping of miRNA post-transcriptional effects, to explore each mechanism further. This work suggests that the proposed methodology has the potential to discover and categorize post-transcriptional regulatory mechanisms that manifest in proteogenomic trends. These trends subsequently provide evidence for cancer specificity, miRNA targeting, and identification of regulation impacted by biological functionality and different types of degradation. (Supplementary Material: https://github.com/arunima2/PTRE_PSB_2020).
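
To make the idea of a geometric scatterplot feature concrete, the following is a minimal sketch of one commonly cited scagnostic measure, "monotonic", taken here to be the squared Spearman rank correlation of the two axes. It is not the authors' implementation (that is in the supplementary repository); the function names `average_ranks`, `pearson` and `monotonic_score`, and the toy abundance values, are assumptions made for illustration only.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def monotonic_score(transcript, protein):
    """Squared Spearman correlation of one gene's transcript/protein scatterplot."""
    rho = pearson(average_ranks(transcript), average_ranks(protein))
    return rho ** 2


# Hypothetical abundances for one gene across five samples.
transcript = [1.0, 2.0, 3.0, 4.0, 5.0]
saturating_protein = [1.0, 4.0, 6.0, 7.0, 7.5]   # non-linear but monotonic
inverted_u_protein = [1.0, 5.0, 9.0, 5.0, 1.0]   # rises, then falls

saturating_score = monotonic_score(transcript, saturating_protein)
inverted_u_score = monotonic_score(transcript, inverted_u_protein)
```

In this toy case the saturating relationship scores high and the inverted-U relationship scores near zero, which is the kind of shape distinction a plain linear correlation would blur. A full workflow would compute several such features per gene before clustering.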
January 2020

Semantic workflows for benchmark challenges: Enhancing comparability, reusability and reproducibility.

Pac Symp Biocomput 2019 ;24:208-219

Computer Science and Engineering, The Ohio State University, 2015 Neil Ave, Columbus, OH 43210, USA.

Benchmark challenges, such as the Critical Assessment of Structure Prediction (CASP) and Dialogue for Reverse Engineering Assessments and Methods (DREAM) have been instrumental in driving the development of bioinformatics methods. Typically, challenges are posted, and then competitors perform a prediction based upon blinded test data. Challengers then submit their answers to a central server where they are scored. Recent efforts to automate these challenges have been enabled by systems in which challengers submit Docker containers, a unit of software that packages up code and all of its dependencies, to be run on the cloud. Despite their incredible value for providing an unbiased test-bed for the bioinformatics community, there remain opportunities to further enhance the potential impact of benchmark challenges. Specifically, current approaches only evaluate end-to-end performance; it is nearly impossible to directly compare methodologies or parameters. Furthermore, the scientific community cannot easily reuse challengers' approaches, due to lack of specifics, ambiguity in tools and parameters as well as problems in sharing and maintenance. Lastly, the intuition behind why particular steps are used is not captured, as the proposed workflows are not explicitly defined, making it cumbersome to understand the flow and utilization of data. Here we introduce an approach to overcome these limitations based upon the WINGS semantic workflow system. Specifically, WINGS enables researchers to submit complete semantic workflows as challenge submissions. By submitting entries as workflows, it then becomes possible to compare not just the results and performance of a challenger, but also the methodology employed. This is particularly important when dozens of challenge entries may use nearly identical tools, but with only subtle changes in parameters (and radical differences in results). 
WINGS uses a component-driven workflow design and offers intelligent parameter and data selection by reasoning about data characteristics. This proves especially critical in bioinformatics workflows, where default or incorrect parameter values can drastically alter results. Different challenge entries may be readily compared through the use of abstract workflows, which also facilitate reuse. WINGS is housed on a cloud-based setup, which stores data, dependencies and workflows for easy sharing and utility. It can also scale workflow executions using distributed computing through the Pegasus workflow execution system. We demonstrate the application of this architecture to the DREAM proteogenomic challenge.
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6417805
August 2019

Imitating Pathologist Based Assessment With Interpretable and Context Based Neural Network Modeling of Histology Images.

Biomed Inform Insights 2018 31;10:1178222618807481. Epub 2018 Oct 31.

Department of Computer Science and Engineering, The Ohio State University, Columbus, OH, USA.

Convolutional neural networks (CNNs) have gained steady popularity as a tool to perform automatic classification of whole slide histology images. While CNNs have proven to be powerful classifiers in this context, they fail to explain this classification, as the network-engineered features used for modeling and classification are interpretable only by the CNNs themselves. This work aims at enhancing a traditional neural network model to perform histology image modeling, patient classification, and interpretation of the distinctive features identified by the network within the histology whole slide images (WSIs). We synthesize a workflow which (a) intelligently samples the training data by automatically selecting only image areas that display visible disease-relevant tissue state and (b) isolates regions most pertinent to the trained CNN prediction and translates them to observable and qualitative features such as color, intensity, cell and tissue morphology and texture. We use the Cancer Genome Atlas's Breast Invasive Carcinoma (TCGA-BRCA) histology dataset to build a model predicting patient attributes (disease stage and node status) and the tumor proliferation challenge (TUPAC 2016) breast cancer histology image repository to help identify disease-relevant tissue state (mitotic activity). We find that our enhanced CNN-based workflow both increased patient attribute predictive accuracy (~2% increase for disease stage and ~10% increase for node status) and experimentally showed that a data-driven CNN histology model predicting breast invasive carcinoma stages is highly sensitive to features such as color, cell size and shape, granularity, and uniformity. This work summarizes the need for understanding the widely trusted models built using deep learning and adds a layer of biological context to a technique that has functioned as a classification-only approach until now.
http://dx.doi.org/10.1177/1178222618807481
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6236488
October 2018

Proteogenomic Analysis of Surgically Resected Lung Adenocarcinoma.

J Thorac Oncol 2018 10 11;13(10):1519-1529. Epub 2018 Jul 11.

Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio.

Introduction: Despite apparently complete surgical resection, approximately half of resected early-stage lung cancer patients relapse and die of their disease. Adjuvant chemotherapy reduces this risk by only 5% to 8%. Thus, there is a need for better identifying who benefits from adjuvant therapy, the drivers of relapse, and novel targets in this setting.

Methods: RNA sequencing and liquid chromatography/liquid chromatography-mass spectrometry proteomics data were generated from 51 surgically resected non-small cell lung tumors with known recurrence status.

Results: We present a rationale and framework for the incorporation of high-content RNA and protein measurements into integrative biomarkers and show the potential of this approach for predicting risk of recurrence in a group of lung adenocarcinomas. In addition, we characterize the relationship between mRNA and protein measurements in lung adenocarcinoma and show that it is outcome specific.

Conclusions: Our results suggest that mRNA and protein data possess independent biological and clinical importance, which can be leveraged to create higher-powered expression biomarkers.
http://dx.doi.org/10.1016/j.jtho.2018.06.025
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7135954
October 2018

Building trans-omics evidence: using imaging and 'omics' to characterize cancer profiles.

Pac Symp Biocomput 2018 ;23:377-387

Department of Computer Science and Engineering, The Ohio State University, 2015 Neil Avenue, Columbus, OH 43210, USA.

Using single-modality data to build predictive models in cancer results in a rather narrow view of most patient profiles. Some clinical facets relate strongly to histology image features (e.g. tumor stages), whereas others are associated with genomic and proteomic variations (e.g. cancer subtypes and disease aggression biomarkers). We hypothesize that there are coherent "trans-omics" features that characterize varied clinical cohorts across multiple sources of data, leading to more descriptive and robust disease characterization. In this work, for 105 breast cancer patients from the TCGA (The Cancer Genome Atlas), we consider four clinical attributes (AJCC Stage, Tumor Stage, ER-Status and PAM50 mRNA Subtypes), and build predictive models using three different modalities of data (histopathological images, transcriptomics and proteomics). We then identify critical multi-level features that drive successful classification of patients for the various cohorts. To build predictors for each data type, we employ widely used "best practice" techniques including CNN-based (convolutional neural network) classifiers for histopathological images and regression models for proteogenomic data. While, as expected, histology images outperformed molecular features in predicting cancer stages, and transcriptomics held superior discriminatory power for ER-Status and PAM50 subtypes, there exist a few cases where all data modalities exhibited comparable performance. Further, we identified sets of key genes and proteins whose expression and abundance correlate across each clinical cohort, including (i) tumor severity and progression (incl. GABARAP), (ii) ER-status (incl. ESR1) and (iii) disease subtypes (incl. FOXC1). Thus, we quantitatively assess the efficacy of different data types to predict critical breast cancer patient attributes and improve disease characterization.
August 2018

annoPeak: a web application to annotate and visualize peaks from ChIP-seq/ChIP-exo-seq.

Bioinformatics 2017 May;33(10):1570-1571

Department of Molecular Virology, Immunology and Medical Genetics College of Medicine, The Ohio State University, Columbus, OH, USA.

Summary: We developed annoPeak, a web application to annotate, visualize and compare predicted protein-binding regions derived from ChIP-seq/ChIP-exo-seq experiments using human and mouse cells. Users can upload peak regions from multiple experiments onto the annoPeak server to annotate them with biological context, identify associated target genes and categorize binding sites with respect to gene structure. Users can also compare multiple binding profiles intuitively with the help of visualization tools and tables provided by annoPeak. In general, annoPeak will help users identify patterns of genome wide transcription factor binding profiles, assess binding profiles in different biological contexts and generate new hypotheses.

Availability And Implementation: The web service is freely accessible through URL: http://ccc-annopeak.osumc.edu/annoPeak . Source code is available at https://github.com/XingTang2014/annoPeak .

Contact: gustavo.leone@osumc.edu or kun.huang@osumc.edu.

Supplementary Information: Supplementary data are available at Bioinformatics online.
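
As a rough illustration of what categorizing a binding site with respect to gene structure can involve, the sketch below labels a peak summit as promoter, gene body or intergenic relative to a single gene. It is not annoPeak's code: the `Gene` record, the `categorize_peak` function, the three category names and the 1000 bp promoter window are all assumptions chosen for the example, and annoPeak's actual categories and thresholds may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    start: int   # leftmost genomic coordinate
    end: int     # rightmost genomic coordinate
    strand: str  # "+" or "-"


def categorize_peak(summit, gene, promoter_window=1000):
    """Label a peak summit relative to one gene on the same chromosome."""
    # The transcription start site depends on the strand.
    tss = gene.start if gene.strand == "+" else gene.end
    # Signed offset in transcript orientation: negative means upstream of the TSS.
    offset = summit - tss if gene.strand == "+" else tss - summit
    if -promoter_window <= offset < 0:
        return "promoter"
    if gene.start <= summit <= gene.end:
        return "gene body"
    return "intergenic"


# Hypothetical genes and peak summits.
forward_gene = Gene("GENE_A", 5000, 9000, "+")
reverse_gene = Gene("GENE_B", 5000, 9000, "-")

labels = [
    categorize_peak(4500, forward_gene),   # 500 bp upstream of the TSS
    categorize_peak(6000, forward_gene),   # inside the gene
    categorize_peak(12000, forward_gene),  # beyond the gene end
    categorize_peak(9400, reverse_gene),   # upstream on the minus strand
]
```

Handling the strand explicitly matters because "upstream" flips direction for minus-strand genes; a real annotation tool would also resolve peaks against many genes and finer features such as exons and introns.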
http://dx.doi.org/10.1093/bioinformatics/btx016
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5860050
May 2017

Dosage-dependent copy number gains in E2f1 and E2f3 drive hepatocellular carcinoma.

J Clin Invest 2017 Mar 30;127(3):830-842. Epub 2017 Jan 30.

Disruption of the retinoblastoma (RB) tumor suppressor pathway, either through genetic mutation of upstream regulatory components or mutation of RB1 itself, is believed to be a required event in cancer. However, genetic alterations in the RB-regulated E2F family of transcription factors are infrequent, casting doubt on a direct role for E2Fs in driving cancer. In this work, a mutation analysis of human cancer revealed subtle but impactful copy number gains in E2F1 and E2F3 in hepatocellular carcinoma (HCC). Using a series of loss- and gain-of-function alleles to dial E2F transcriptional output, we have shown that copy number gains in E2f1 or E2f3b resulted in dosage-dependent spontaneous HCC in mice without the involvement of additional organs. Conversely, germ-line loss of E2f1 or E2f3b, but not E2f3a, protected mice against HCC. Combinatorial mapping of chromatin occupancy and transcriptome profiling identified an E2F1- and E2F3B-driven transcriptional program that was associated with development and progression of HCC. These findings demonstrate a direct and cell-autonomous role for E2F activators in human cancer.
http://dx.doi.org/10.1172/JCI87583
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5330731
March 2017

E2f8 mediates tumor suppression in postnatal liver development.

J Clin Invest 2016 08 25;126(8):2955-69. Epub 2016 Jul 25.

E2F-mediated transcriptional repression of cell cycle-dependent gene expression is critical for the control of cellular proliferation, survival, and development. E2F signaling also interacts with transcriptional programs that are downstream of genetic predictors for cancer development, including hepatocellular carcinoma (HCC). Here, we evaluated the function of the atypical repressor genes E2f7 and E2f8 in adult liver physiology. Using several loss-of-function alleles in mice, we determined that combined deletion of E2f7 and E2f8 in hepatocytes leads to HCC. Temporal-specific ablation strategies revealed that E2f8's tumor suppressor role is critical during the first 2 weeks of life, which correspond to a highly proliferative stage of postnatal liver development. Disruption of E2F8's DNA binding activity phenocopied the effects of an E2f8 null allele and led to HCC. Finally, a profile of chromatin occupancy and gene expression in young and tumor-bearing mice identified a set of shared targets for E2F7 and E2F8 whose increased expression during early postnatal liver development is associated with HCC progression in mice. Increased expression of E2F8-specific target genes was also observed in human liver biopsies from HCC patients compared to healthy patients. In summary, these studies suggest that E2F8-mediated transcriptional repression is a critical tumor suppressor mechanism during postnatal liver development.
http://dx.doi.org/10.1172/JCI85506
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4966321
August 2016

Transcriptome regulation and chromatin occupancy by E2F3 and MYC in mice.

Sci Data 2016 Feb 16;3:160008. Epub 2016 Feb 16.

Department of Molecular Virology, Immunology and Medical Genetics, College of Medicine, Columbus, Ohio 43210, USA.

E2F3 and MYC are transcription factors that control cellular proliferation. To study their mechanism of action in the context of a regenerating tissue, we isolated both proliferating (crypts) and non-dividing (villi) cells from wild-type and Rb-depleted small intestines of mice and performed ChIP-exo-seq (chromatin immunoprecipitation combined with lambda exonuclease digestion followed by high-throughput sequencing). The genome-wide chromatin occupancy of E2F3 and MYC was determined by mapping sequence reads to the genome and predicting preferred binding sites (peaks). Binding sites could be accurately identified within small regions only 24-28 bp long, highlighting the precision with which binding peaks can be identified by ChIP-exo-seq. Forty randomly selected E2F3- and MYC-specific binding sites were validated by ChIP-PCR. In addition, we present gene expression data sets from wild-type, Rb-, E2f3- and Myc-depleted crypts and villi. These represent comprehensive and validated datasets that can be integrated to identify putative direct targets of E2F3 and MYC involved in the control of cellular proliferation in normal and Rb-deficient small intestines.
http://dx.doi.org/10.1038/sdata.2016.8
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4755127
February 2016

DISCOVERY OF MOLECULARLY TARGETED THERAPIES.

Pac Symp Biocomput 2016 ;21:1-8

Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43210, USA.

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4874173
May 2016

Noncatalytic PTEN missense mutation predisposes to organ-selective cancer development in vivo.

Genes Dev 2015 Aug;29(16):1707-20

Solid Tumor Biology Program, James Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio 43210, USA; Department of Molecular Genetics, College of Arts and Sciences, The Ohio State University, Columbus, Ohio 43210, USA; Department of Molecular Virology, Immunology, and Medical Genetics, College of Medicine, The Ohio State University, Columbus, Ohio 43210, USA.

Inactivation of phosphatase and tensin homology deleted on chromosome 10 (PTEN) is linked to increased PI3K-AKT signaling, enhanced organismal growth, and cancer development. Here we generated and analyzed Pten knock-in mice harboring a C2 domain missense mutation at phenylalanine 341 (Pten(FV)), found in human cancer. Despite having reduced levels of PTEN protein, homozygous Pten(FV/FV) embryos have intact AKT signaling, develop normally, and are carried to term. Heterozygous Pten(FV/+) mice develop carcinoma in the thymus, stomach, adrenal medulla, and mammary gland but not in other organs typically sensitive to Pten deficiency, including the thyroid, prostate, and uterus. Progression to carcinoma in sensitive organs ensues in the absence of overt AKT activation. Carcinoma in the uterus, a cancer-resistant organ, requires a second clonal event associated with the spontaneous activation of AKT and downstream signaling. In summary, this PTEN noncatalytic missense mutation exposes a core tumor suppressor function distinct from inhibition of canonical AKT signaling that predisposes to organ-selective cancer development in vivo.
http://dx.doi.org/10.1101/gad.262568.115
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4561480
August 2015

Redeployment of Myc and E2f1-3 drives Rb-deficient cell cycles.

Nat Cell Biol 2015 Aug 20;17(8):1036-48. Epub 2015 Jul 20.

[1] Department of Molecular Virology, Immunology and Medical Genetics, College of Medicine, The Ohio State University, Columbus, Ohio 43210, USA [2] Department of Molecular Genetics, College of Biological Sciences, The Ohio State University, Columbus, Ohio 43210, USA [3] Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio 43210, USA.

Robust mechanisms to control cell proliferation have evolved to maintain the integrity of organ architecture. Here, we investigated how two critical proliferative pathways, Myc and E2f, are integrated to control cell cycles in normal and Rb-deficient cells using a murine intestinal model. We show that Myc and E2f1-3 have little impact on normal G1-S transitions. Instead, they synergistically control an S-G2 transcriptional program required for normal cell divisions and maintaining crypt-villus integrity. Surprisingly, Rb deficiency results in the Myc-dependent accumulation of E2f3 protein and chromatin repositioning of both Myc and E2f3, leading to the 'super activation' of a G1-S transcriptional program, ectopic S phase entry and rampant cell proliferation. These findings reveal that Rb-deficient cells hijack and redeploy Myc and E2f3 from an S-G2 program essential for normal cell cycles to a G1-S program that re-engages ectopic cell cycles, exposing an unanticipated addiction of Rb-null cells on Myc.
http://dx.doi.org/10.1038/ncb3210
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4526313
August 2015