Background: Constructing gene co-expression networks from cancer expression data is important for investigating the genetic mechanisms underlying cancer. However, correlation coefficients and linear regression models cannot capture sophisticated relationships among gene expression profiles. Here, we address a 3-way interaction in which the expression levels of 2 genes cluster in different locations in space under the control of a third gene's expression level.
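As a rough illustration of the kind of 3-way interaction described above, the sketch below splits samples at the median of a controlling gene and compares where the other two genes' joint expression is centered in each group. The function names (`centroid`, `centroid_shift`), the median split, and the synthetic data are assumptions made for illustration; the abstract does not state how the authors detect the interaction.

```python
import math
import random


def centroid(points):
    """Mean location of a list of (x, y) expression pairs."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def centroid_shift(x, y, z):
    """Split samples at the median of controller gene z and return the
    centroids of (x, y) in the low-z and high-z groups plus their distance."""
    cutoff = sorted(z)[len(z) // 2]
    low = [(a, b) for a, b, c in zip(x, y, z) if c < cutoff]
    high = [(a, b) for a, b, c in zip(x, y, z) if c >= cutoff]
    c_low, c_high = centroid(low), centroid(high)
    return c_low, c_high, math.dist(c_low, c_high)


# Synthetic example (hypothetical data): genes x and y sit near (0, 0) when
# the controller z is low and near (3, 3) when z is high.
rng = random.Random(0)
z = [rng.gauss(0, 1) for _ in range(200)]
z_cut = sorted(z)[len(z) // 2]
x = [rng.gauss(0, 0.5) + (3 if c >= z_cut else 0) for c in z]
y = [rng.gauss(0, 0.5) + (3 if c >= z_cut else 0) for c in z]

c_low, c_high, shift = centroid_shift(x, y, z)
print(c_low, c_high, shift)
```

A large centroid distance relative to the within-group spread would suggest that the third gene controls where the other two genes cluster, which a single overall correlation coefficient would not reveal.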
The amounts and types of available multimodal tumor data are rapidly increasing, and their integration is critical for fully understanding the underlying cancer biology and personalizing treatment. However, the development of methods for effectively integrating multimodal data in a principled manner is lagging behind our ability to generate the data. In this article, we introduce an extension to a multiview nonnegative matrix factorization algorithm (NNMF) for dimensionality reduction and integration of heterogeneous data types and compare the predictive modeling performance of the method on unimodal and multimodal data.
In vivo and in vitro functional phenotyping was recently reported in an experimental pan-cancer study of 22 osteosarcoma (OS) cell lines. Here, differentially expressed gene (DEG) profiles were recomputed from the publicly available data to conduct network inference on immune system regulatory activity across the characterized OS phenotypes. From these DEG profiles, we obtained coexpression networks and their bio-annotations for each analyzed phenotype.
Predictive modeling from high-dimensional genomic data is often preceded by a dimension reduction step, such as principal component analysis (PCA). However, the application of PCA is not straightforward for multisource data, wherein multiple sources of 'omics data measure different but related biological components. In this article, we use recent advances in the dimension reduction of multisource data for predictive modeling.
Background: Determination of functional pathways regulated by microRNAs (miRNAs), while an essential step in developing therapeutics, is challenging. Some miRNAs have been studied extensively; others have limited information. In this study, we focus on 254 miRNAs previously identified as being associated with colorectal cancer and their database-identified validated target genes.
Ranking feature sets for phenotype classification based on gene expression is a challenging issue in cancer bioinformatics. When the number of samples is small, all feature selection algorithms are known to be unreliable, producing significant error, and error estimators suffer from different degrees of imprecision. The problem is compounded by the fact that the accuracy of classification depends on the manner in which the phenomena are transformed into data by the measurement technology.
The 3 human RAS genes play pivotal roles in regulating proliferation, differentiation, and survival in normal cells; they are mutated in 15% to 20% of all human tumors and amplified in many others. In this report, we examined data from The Cancer Genome Atlas to investigate the relationship between RAS gene mutational status and messenger RNA expression. We show that all 3 RAS genes exhibit increased expression when mutated, in a context-dependent manner.
Purpose: Immune checkpoint inhibition reactivates the immune response against cancer cells in multiple tissue types and has been shown to induce durable responses. However, for patients with autoimmune disorders, their conditions can worsen with this reactivation. We sought to identify, among patients with lung and renal cancer, how many harbor a comorbid autoimmune condition and may be at risk of worsening their condition while on immune checkpoint inhibitors such as nivolumab and pembrolizumab.
Hypoxia-inducible factors (HIF) belong to the basic helix-loop-helix-PER-ARNT-SIM (bHLH-PAS) family of transcription factors that induce metabolic reprogramming under hypoxic conditions. Phylogenetic studies of hypoxia-inducible factor-1α (HIF-1α) sequences across different organisms/species may offer clues to their evolutionary relationships and their probable correlation with tumorigenesis and adaptation to low-oxygen environments. In this study, we investigate the evolution of the HIF-1α protein across different species to decipher its sequence variations/mutations and to look into the probable causes of the abnormal behaviour of this molecule under exotic conditions.
Nowadays, many biological data are acquired via images. In this article, we study pathological images scanned from 205 patients with lung cancer, with the goal of determining the relationship between survival time and the spatial distribution of different types of cells, including lymphocyte, stroma, and tumor cells. Toward this goal, we model the spatial distribution of the different cell types using a modified Potts model whose parameters represent interactions between cell types, and we estimate these parameters using the double Metropolis-Hastings algorithm.
As cancer growth and development typically involves multiple genes and pathways, combination therapy has been touted as the standard of care in the treatment of cancer. However, drug toxicity becomes a major concern whenever a patient takes 2 or more drugs simultaneously at the maximum tolerable dosage. A potential solution would be administering the drugs in a sequential or alternating manner rather than concurrently.
Purpose: To analyze a microarray experiment to identify genes whose expression varies after the diagnosis of breast cancer.
Methods: A total of 44 928 probe sets in a publicly available Affymetrix microarray data set from Gene Expression Omnibus, covering 249 patients with breast cancer, were analyzed by nonparametric multivariate adaptive splines. The identified genes with turning points were then grouped by K-means clustering, and their network relationships were subsequently analyzed with Ingenuity Pathway Analysis.
Background: Mathematical modeling of biothermal processes is widely used to enhance the quantitative understanding of the thermoregulation system of human body organs. This quantitative knowledge of thermal information about various human body organs can be used for developing clinical applications. In the past, investigators have studied thermal distribution in a hemisphere-shaped human breast in the presence of a sphere-shaped tumor.
In patients with advanced ovarian cancer (AOC), additional imaging of disseminated disease at laparoscopy could complement conventional imaging for estimation of chemotherapy response. We developed an image segmentation method and evaluated its use in making accurate and objective measurements of peritoneal metastases in comparison to the Response Evaluation Criteria In Solid Tumors (RECIST). A software tool using a custom ImageJ macro-based approach was employed to estimate lesion size by converting image pixels into units of length.
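The pixel-to-length conversion mentioned above can be sketched as a simple calibration against an object of known size visible in the image. This is a minimal illustration in Python, not the authors' ImageJ macro; the function names and the reference-object calibration are assumptions.

```python
def mm_per_pixel(reference_px, reference_mm):
    """Scale factor derived from a reference object of known physical size."""
    if reference_px <= 0:
        raise ValueError("reference_px must be positive")
    return reference_mm / reference_px


def length_px_to_mm(length_px, reference_px, reference_mm):
    """Convert a length measured in pixels to millimetres."""
    return length_px * mm_per_pixel(reference_px, reference_mm)


def area_px_to_mm2(area_px, reference_px, reference_mm):
    """Convert an area measured in pixels to square millimetres.
    Areas scale with the square of the linear scale factor."""
    return area_px * mm_per_pixel(reference_px, reference_mm) ** 2


# Hypothetical example: a 5 mm reference spans 100 pixels in the image.
lesion_diameter_mm = length_px_to_mm(240, reference_px=100, reference_mm=5.0)
lesion_area_mm2 = area_px_to_mm2(4000, reference_px=100, reference_mm=5.0)
print(lesion_diameter_mm, lesion_area_mm2)
```

Such a calibration only holds when the reference object and the lesion lie at a similar distance from the camera, which is an additional assumption in laparoscopic images.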
Training anatomic and clinical pathology residents in the principles of bioinformatics is a challenging endeavor. Most residents receive little to no formal exposure to bioinformatics during medical education, and most of the pathology training is spent interpreting histopathology slides using light microscopy or focused on laboratory regulation, management, and interpretation of discrete laboratory data. At a minimum, residents should be familiar with data structure, data pipelines, data manipulation, and data regulations within clinical laboratories.
This study aimed to identify and characterize microRNAs (miRNAs) that are related to radiosensitivity in low-grade gliomas (LGGs). The miRNA expression levels in radiosensitive and radioresistant LGGs were compared using The Cancer Genome Atlas database, and differentially expressed miRNAs were identified using the EBSeq package. The miRNA target genes were predicted using Web databases.
Gaussian Bayesian networks have become a widely used framework for estimating directed associations between jointly Gaussian variables, where the network structure encodes the decomposition of the multivariate normal density into local terms. However, the resulting estimates can be inaccurate when the normality assumption is moderately or severely violated, making the framework unsuitable for recent genomic data such as the Cancer Genome Atlas data. In the present paper, we propose a mixture copula Bayesian network model that provides great flexibility in modeling non-Gaussian and multimodal data for causal inference.
The concept of protein intrinsic disorder has taken the driving seat in understanding regulatory proteins in general. Reports suggest that in mammals nearly 75% of signalling proteins contain long disordered regions of more than 30 amino acid residues. Therefore, intrinsically disordered proteins (IDPs) have been implicated in several human diseases and should be considered as potential novel drug targets.
Leading institutions throughout the country have established Precision Medicine programs to support personalized treatment of patients. A cornerstone for these programs is the establishment of enterprise-wide Clinical Data Warehouses. Working shoulder-to-shoulder, a team of physicians, systems biologists, engineers, and scientists at Rutgers Cancer Institute of New Jersey has designed, developed, and implemented the Warehouse with information originating from data sources including Electronic Medical Records, Clinical Trial Management Systems, Tumor Registries, Biospecimen Repositories, Radiology and Pathology archives, and Next Generation Sequencing services.
TP53 is the most frequently altered gene in human cancers. Numerous retrospective studies have related its mutation and abnormal p53 protein expression to poor patient survival. Nonetheless, the clinical significance of TP53 (p53) status has been a controversial issue.
The construction of gene regulatory networks (GRNs) is an essential component of biomedical research to determine disease mechanisms and identify treatment targets. Gaussian graphical models (GGMs) have been widely used for constructing GRNs by inferring conditional dependence among a set of gene expressions. In practice, GRNs obtained by the analysis of a single data set may not be reliable due to sample limitations.
A vast number of human pathologic conditions are directly or indirectly related to remodeling of tissue collagen structure. Second-harmonic generation, a nonlinear optical microscopy technique, has become a powerful tool for imaging biological tissues with anisotropic hyperpolarized structures, such as collagen. In recent years, several quantification methods have been developed to analyze and evaluate these images.
The article proposes a unified least squares method to estimate the receiver operating characteristic (ROC) parameters for continuous and ordinal diagnostic tests, such as cancer biomarkers. The method is based on a linear model framework using the empirically estimated sensitivities and specificities as input "data." It gives consistent estimates for regression and accuracy parameters when the underlying continuous test results are normally distributed after some monotonic transformation.
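One common way to treat empirical sensitivities and specificities as input to a linear model is the binormal ROC form, in which probit(TPR) = a + b · probit(FPR) and the area under the curve is Φ(a / √(1 + b²)). The sketch below fits that line by ordinary least squares. It is an assumed illustration of the general idea, not the article's estimator; the function name `fit_binormal_roc` and the example operating points are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def fit_binormal_roc(fpr, tpr):
    """Ordinary least squares fit of probit(TPR) = a + b * probit(FPR).

    fpr and tpr are empirical false- and true-positive rates at several
    thresholds, each strictly between 0 and 1.
    Returns (a, b, auc) with auc = Phi(a / sqrt(1 + b^2)).
    """
    u = [_STD_NORMAL.inv_cdf(p) for p in fpr]
    v = [_STD_NORMAL.inv_cdf(p) for p in tpr]
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    sxy = sum((ui - mean_u) * (vi - mean_v) for ui, vi in zip(u, v))
    sxx = sum((ui - mean_u) ** 2 for ui in u)
    b = sxy / sxx
    a = mean_v - b * mean_u
    auc = _STD_NORMAL.cdf(a / math.sqrt(1 + b * b))
    return a, b, auc


# Hypothetical operating points generated from a = 1.2, b = 0.8 without noise.
fpr = [0.1, 0.2, 0.4, 0.6]
tpr = [_STD_NORMAL.cdf(1.2 + 0.8 * _STD_NORMAL.inv_cdf(p)) for p in fpr]
a_hat, b_hat, auc = fit_binormal_roc(fpr, tpr)
print(a_hat, b_hat, auc)
```

With real data the empirical rates are correlated and heteroscedastic, so a weighted or generalized least squares fit would be more appropriate than the unweighted fit shown here.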
In cancer studies, the prediction of cancer outcome based on a set of prognostic variables has been a long-standing topic of interest. Current statistical methods for survival analysis offer the possibility of modelling cancer survivability but require unrealistic assumptions about the survival time distribution or proportionality of hazard. Therefore, attention must be paid to developing nonlinear models with less restrictive assumptions.
A novel explicit triscale reaction-diffusion numerical model of glioblastoma multiforme tumor growth is presented. The model incorporates the handling of Neumann boundary conditions imposed by the cranium and takes into account both the inhomogeneous nature of the human brain and the complexity of the skull geometry. The finite-difference time-domain method is adopted.
Introduction: Breast cancer is a multifaceted disease with a wide spectrum of histological and molecular variability among tumors. However, identifying this variability is complicated by the interplay between inherited genetic and epigenetic aberrations. Therefore, this study provides an extrapolated outlook on the sinister partnership between DNA methylation and single-nucleotide polymorphisms (SNPs) in relation to the identification of prognostic markers in breast cancer.
Proteomics promises to revolutionize cancer treatment and prevention by facilitating the discovery of molecular biomarkers. Progress has been impeded, however, by the small-sample, high-dimensional nature of proteomic data. We propose the application of a Bayesian approach to address this issue in classification of proteomic profiles generated by liquid chromatography-mass spectrometry (LC-MS).
In this article, we propose a regression model to compare the performances of different diagnostic methods having clustered ordinal test outcomes. The proposed model treats ordinal test outcomes (an ordinal categorical variable) as grouped-survival time data and uses random effects to explain correlation among outcomes from the same cluster. To compare different diagnostic methods, we introduce a set of covariates indicating diagnostic methods and compare their coefficients.
Objective: The primary aim was to compare independent and joint performance of retrieving smoking status through different sources, including narrative text processed by natural language processing (NLP), patient-provided information (PPI), and diagnosis codes (ie, International Classification of Diseases, Ninth Revision [ICD-9]). We also compared the performance of retrieving smoking strength information (ie, heavy/light smoker) from narrative text and PPI.
Materials and Methods: Our study leveraged an existing lung cancer cohort for smoking status, amount, and strength information, which was manually chart-reviewed.
Motivation: Among many large-scale proteomic quantification methods, ¹⁸O/¹⁶O labeling requires neither a specific amino acid in peptides nor label incorporation through several cell cycles, as in metabolic labeling; it does not cause significant elution time shifts between heavy- and light-labeled peptides, and its dynamic range of quantification is larger than that of tandem mass spectrometry-based quantification methods. These properties give ¹⁸O/¹⁶O labeling the maximum flexibility in application. However, ¹⁸O/¹⁶O labeling introduces large quantification variations due to varying labeling efficiency.
The plethora of available disease prediction models and the ongoing process of their application into clinical practice - following their clinical validation - have created new needs regarding their efficient handling and exploitation. Consolidation of software implementations, descriptive information, and supportive tools in a single place, offering persistent storage as well as proper management of execution results, is a priority, especially with respect to the needs of large healthcare providers. At the same time, modelers should be able to access these storage facilities under special rights, in order to upgrade and maintain their work.
Cancer Inform 2016 26;15:211-217. Epub 2016 Oct 26.
Center for Biomedical Data and Language Processing, Department of Health Informatics and Administration, University of Wisconsin-Milwaukee, Milwaukee, WI, USA.; Department of Health Informatics and Administration, College of Health Sciences, University of Wisconsin-Milwaukee, Milwaukee, WI, USA.; College of Health Sciences, University of Wisconsin-Milwaukee, Milwaukee, WI, USA.; Center for Urban Population Health, Milwaukee, WI, USA.; Joseph J. Zilber School of Public Health, University of Wisconsin-Milwaukee, Milwaukee, WI, USA.
We systematically compared the adverse effects of cancer drugs to detect event outliers across different clinical trials using a data-driven approach. Because many cancer drugs are toxic to patients, better understanding of adverse events of cancer drugs is critical for developing therapies that could minimize the toxic effects. However, due to the large variabilities of adverse events across different cancer drugs, methods to efficiently compare adverse effects across different cancer drugs are lacking.
p53 is an important regulator of cell cycle arrest, senescence, apoptosis and metabolism, and is frequently mutated in tumors. It functions as a tetramer, where each component dimer binds to a decameric DNA region known as a response element. We identify p53 binding site subtypes and examine the functional and evolutionary properties of these subtypes.
Cancer Inform 2016 9;15(Suppl 2):43-50. Epub 2016 Oct 9.
Department of Biochemistry and Medical Genetics, University of Manitoba, Winnipeg, Canada; Centre for Healthcare Innovation, Winnipeg Regional Health Authority/University of Manitoba, Winnipeg, Canada; Department of Electrical and Computer Engineering, University of Manitoba, Winnipeg, Canada.
Many cancers have been linked to copy number variations (CNVs) in the genomic DNA. Although there are existing methods to analyze CNVs from individual samples, cancer-causing genes are more frequently discovered in regions where CNVs are common among tumor samples, also known as recurrent CNVs. Integrating multiple samples and locating recurrent CNV regions remain a challenge, both computationally and conceptually.
MicroRNAs (miRs) are small single-stranded noncoding RNAs that function in RNA silencing and post-transcriptional regulation of gene expression. An increasing number of studies have shown that miRs play an important role in tumorigenesis, and understanding the regulatory mechanism of miRs in this gene regulatory network will help elucidate the complex biological processes at play during malignancy. Despite advances, determination of miR-target interactions (MTIs) and identification of functional modules composed of miRs and their specific targets remain a challenge.
Heterogeneous DNA methylation patterns are linked to tumor growth. To study DNA methylation heterogeneity patterns in breast cancer cell lines, we compare four metrics: variance, the I² statistic, entropy, and methylation state. Using the categorical metric methylation state, we select the two most heterogeneous states to identify genes that directly affect tumor suppressor genes and high- or moderate-risk breast cancer genes.
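Two of the metrics named above, variance and entropy, can be illustrated on methylation levels measured at one site across several cell lines. The sketch below is a generic illustration and not the authors' definitions: the state thresholds (0.3 and 0.7), the state labels, and the function names are assumptions, and the I² statistic is not shown.

```python
import math
from collections import Counter
from statistics import pvariance


def shannon_entropy(states):
    """Shannon entropy (in bits) of a list of categorical states."""
    n = len(states)
    counts = Counter(states)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def methylation_state(beta, low=0.3, high=0.7):
    """Map a methylation level in [0, 1] to a categorical state.
    The thresholds are hypothetical and chosen only for illustration."""
    if beta < low:
        return "unmethylated"
    if beta > high:
        return "methylated"
    return "partial"


# Hypothetical methylation levels at one site across four cell lines.
betas = [0.1, 0.1, 0.9, 0.9]
states = [methylation_state(b) for b in betas]

site_variance = pvariance(betas)
site_entropy = shannon_entropy(states)
print(site_variance, site_entropy)
```

Variance works directly on the continuous levels, whereas entropy depends on how the levels are discretized, so the two metrics can rank sites differently.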
Colorectal cancer (CRC) is one of the most common and lethal cancers. Although numerous studies have evaluated potential biomarkers for early diagnosis, current biomarkers have failed to reach an acceptable level of accuracy for distant metastasis. In this paper, we performed a gene set meta-analysis of in vitro microarray studies and combined the results from this study with previously published proteomic data to validate and suggest prognostic candidates for CRC metastasis.
In order to provide the most effective therapy for cancer, it is important to be able to diagnose whether a patient's cancer will respond to a proposed treatment. Methylation profiling could contain information from which such predictions could be made. Currently, hypothesis testing is used to determine whether possible biomarkers for cancer progression produce statistically significant results.
Discovering important genes that account for the phenotype of interest has long been a challenge in genome-wide expression analysis. Analyses such as gene set enrichment analysis (GSEA) that incorporate pathway information have become widespread in hypothesis testing, but pathway-based approaches have been largely absent from regression methods due to the challenges of dealing with overlapping pathways and the resulting lack of available software. The R package grpreg is widely used to fit group lasso and other group-penalized regression models; in this study, we develop an extension, grpregOverlap, to allow for overlapping group structure using a latent variable approach.
Due to its extraordinary heterogeneity and complexity, cancer is often proposed as a model case of a systems biology disease or network disease. There is a critical need for effective biomarkers for cancer diagnosis and/or outcome prediction from system-level analyses. Methods based on integrating omics data into networks have the potential to revolutionize the identification of cancer biomarkers.
We performed gene expression microarray analysis coupled with spherical self-organizing map (sSOM) for artificially developed cancer stem cells (CSCs). The CSCs were developed from human induced pluripotent stem cells (hiPSCs) with the conditioned media of cancer cell lines, whereas the CSCs were induced from primary cell culture of human cancer tissues with defined factors (OCT3/4, SOX2, and KLF4). These cells commonly expressed human embryonic stem cell (hESC)/hiPSC-specific genes (POU5F1, SOX2, NANOG, LIN28, and SALL4) at a level equivalent to those of control hiPSC 201B7.
Cancer Inform 2016 7;15(Suppl 2):17-24. Epub 2016 Aug 7.
Department of Biohealth Informatics, School of Informatics and Computing, Indiana University Purdue University, Indianapolis, IN, USA.; Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, 5021 Health Information and Translational Sciences (HITS), Indianapolis, IN, USA.; Department of Medical and Molecular Genetics, Indiana University School of Medicine, Medical Research and Library Building, Indianapolis, IN, USA.
Survival analysis in biomedical sciences is generally performed by correlating the levels of cellular components with patients' clinical features, a common practice in prognostic biomarker discovery. While the primary focus of such analysis in cancer genomics has so far been to identify potential prognostic genes, alternative splicing, a posttranscriptional regulatory mechanism that affects the functional form of a protein through the inclusion or exclusion of individual exons and thereby gives rise to alternative protein products, has increasingly gained attention due to the prevalence of splicing aberrations in cancer transcriptomes. Hence, uncovering potential prognostic exons can not only help in rationally designing exon-specific therapeutics but also increase specificity toward more personalized treatment options.
In some cancer clinical studies, researchers are interested in exploring the risk factors associated with competing risk outcomes such as recurrence-free survival. We develop a novel recursive partitioning framework on competing risk data for both prognostic and predictive model constructions. We define specific splitting rules, a pruning algorithm, and a final tree selection algorithm for the competing risk tree models.
We introduce the Pathway-Informed Classification System (PICS) for classifying cancers based on tumor sample gene expression levels. PICS is a computational method capable of expeditiously elucidating both known and novel biological pathway involvement specific to various cancers, and it uses that learned pathway information to separate patients into distinct classes. The method clearly separates a pan-cancer dataset by tissue of origin and also sub-classifies individual cancer datasets into distinct survival classes.
To date, there are no satisfactory tools in tissue microarray (TMA) data analysis for analyzing the cooperative behavior of all measured markers in a multifactorial TMA approach. The developed tool TMAinspiration not only offers an analysis option to close this gap but also provides an ecosystem of quality control concepts and supporting scripts that makes this approach a platform for informed practice and further research. The TMAinspiration method specifically addresses the demands of TMA analysis by controlling errors and noise through a generalized regression scheme while avoiding the introduction of too many a priori constraints into the analysis of the data.
Background: DNA copy number alteration is common in many cancers. Studies have shown that insertion or deletion of DNA sequences can directly alter gene expression, and significant correlation exists between DNA copy number and gene expression. Data normalization is a critical step in the analysis of gene expression generated by RNA-seq technology.
Recently, new but expensive treatments have become available for metastatic melanoma. These improve survival, but in view of the limited funds available, cost-effectiveness needs to be evaluated. Most cancer cost-effectiveness models are based on the observed clinical events such as recurrence-free and overall survival.
Clustering is carried out to identify patterns in transcriptomics profiles to determine clinically relevant subgroups of patients. Feature (gene) selection is a critical and integral part of the process. Currently, there are many feature selection and clustering methods to identify the relevant genes and perform clustering of samples.
Protein-DNA interactions are involved in different cancer pathways. In particular, the DNA-binding domains of proteins can determine where and how gene regulatory regions are bound in different cell lines at different stages. Therefore, it is essential to develop a method to predict and locate the core residues on cancer-related DNA-binding domains.