Publications by authors named "Martin Ester"

23 Publications

ParKCa: Causal Inference with Partially Known Causes.

Pac Symp Biocomput 2021;26:196-207

School of Computing Science, Simon Fraser University, Burnaby, Canada.

Methods for causal inference from observational data are an alternative for scenarios where collecting counterfactual data or realizing a randomized experiment is not possible. Our proposed method, ParKCa, combines the results of several causal inference methods to learn new causes in applications with some known causes and many potential causes. We validate ParKCa on two genome-wide association studies, one real-world and one simulated dataset. Our results show that ParKCa can infer more causes than existing methods.
April 2021

Identification Of Differentially Expressed Gene Modules In Heterogeneous Diseases.

Bioinformatics 2020 Dec 16. Epub 2020 Dec 16.

School of Computing Science, Simon Fraser University, Burnaby, BC, Canada.

Motivation: Identification of differentially expressed genes is necessary for unraveling disease pathogenesis. This task is complicated by the fact that many diseases are heterogeneous at the molecular level and samples representing distinct disease subtypes may demonstrate different patterns of dysregulation. Biclustering methods are capable of identifying genes that follow a similar expression pattern only in a subset of samples and hence can consider disease heterogeneity. However, identifying biologically significant and reproducible sets of genes and samples remains challenging for the existing tools. Many recent studies have shown that the integration of gene expression and protein interaction data improves the robustness of prediction and classification and advances biomarker discovery.

Results: Here we present DESMOND, a new method for identification of Differentially ExpreSsed gene MOdules iN Diseases. DESMOND performs network-constrained biclustering on gene expression data and identifies gene modules - connected sets of genes up- or down-regulated in subsets of samples. We applied DESMOND to expression profiles of samples from two large breast cancer cohorts and showed that its ability to incorporate protein interactions allows it to identify biologically meaningful gene and sample subsets and improves the reproducibility of the results.

Availability: https://github.com/ozolotareva/DESMOND.

Supplementary Information: Supplementary data are available at Bioinformatics online.
DOI: http://dx.doi.org/10.1093/bioinformatics/btaa1038
December 2020

A fast and fully-automated deep-learning approach for accurate hemorrhage segmentation and volume quantification in non-contrast whole-head CT.

Sci Rep 2020 Nov 9;10(1):19389. Epub 2020 Nov 9.

Department of Biomedical Physiology and Kinesiology, Simon Fraser University, Burnaby, BC, Canada.

This project aimed to develop and evaluate a fast and fully-automated deep-learning method applying convolutional neural networks with deep supervision (CNN-DS) for accurate hematoma segmentation and volume quantification in computed tomography (CT) scans. Non-contrast whole-head CT scans of 55 patients with hemorrhagic stroke were used. Individual scans were standardized to 64 axial slices of 128 × 128 voxels. Each voxel was annotated independently by experienced raters, generating a binary label of hematoma versus normal brain tissue based on majority voting. The dataset was split randomly into training (n = 45) and testing (n = 10) subsets. A CNN-DS model was built applying the training data and examined using the testing data. Performance of the CNN-DS solution was compared with three previously established methods. The CNN-DS achieved a Dice coefficient score of 0.84 ± 0.06 and recall of 0.83 ± 0.07, higher than patch-wise U-Net (< 0.76). CNN-DS average running time of 0.74 ± 0.07 s was faster than PItcHPERFeCT (> 1412 s) and slice-based U-Net (> 12 s). Comparable interrater agreement rates were observed between "method-human" vs. "human-human" (Cohen's kappa coefficients > 0.82). The fully automated CNN-DS approach demonstrated expert-level accuracy in fast segmentation and quantification of hematoma, substantially improving over previous methods. Further research is warranted to test the CNN-DS solution as a software tool in clinical settings for effective stroke management.
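The Dice coefficient and recall reported above are standard overlap metrics for binary segmentation masks. The following is a minimal sketch of how they are computed, assuming each mask is a flat list of 0/1 voxel labels; the function name `dice_and_recall` and the toy masks are illustrative and are not taken from the study.

```python
def dice_and_recall(pred, truth):
    """Return (Dice coefficient, recall) for two equal-length binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Count true positives, false positives and false negatives voxel by voxel.
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    # Dice = 2*TP / (2*TP + FP + FN); treated as 1.0 when both masks are empty.
    denom = 2 * tp + fp + fn
    dice = 2 * tp / denom if denom else 1.0
    # Recall = TP / (TP + FN); treated as 1.0 when there is nothing to find.
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    return dice, recall


# Toy example: 1 = hematoma voxel, 0 = normal tissue (made-up labels).
pred = [1, 1, 0, 0, 1, 0]
truth = [1, 0, 0, 0, 1, 1]
dice, recall = dice_and_recall(pred, truth)
print(f"Dice = {dice:.3f}, recall = {recall:.3f}")
```

In practice the masks would be the flattened 64 × 128 × 128 volumes, and the scores would be averaged over the test scans.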
DOI: http://dx.doi.org/10.1038/s41598-020-76459-7
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7652921
November 2020

Deep Learning Modeling of Androgen Receptor Responses to Prostate Cancer Therapies.

Int J Mol Sci 2020 Aug 14;21(16). Epub 2020 Aug 14.

Vancouver Prostate Centre, University of British Columbia, 2660 Oak St, Vancouver, BC V6H 3Z6, Canada.

Gain-of-function mutations in human androgen receptor (AR) are among the major causes of drug resistance in prostate cancer (PCa). Identifying mutations that cause resistant phenotype is of critical importance for guiding treatment protocols, as well as for designing drugs that do not elicit adverse responses. However, experimental characterization of these mutations is time consuming and costly; thus, predictive models are needed to anticipate resistant mutations and to guide the drug discovery process. In this work, we leverage experimental data collected on 68 AR mutants, either observed in the clinic or described in the literature, to train a deep neural network (DNN) that predicts the response of these mutants to currently used and experimental anti-androgens and testosterone. We demonstrate that the use of this DNN, with general 2D descriptors, provides a more accurate prediction of the biological outcome (inhibition, activation, no-response, mixed-response) in AR mutant-drug pairs compared to other machine learning approaches. Finally, the developed approach was used to make predictions of AR mutant response to the latest AR inhibitor darolutamide, which were then validated by in-vitro experiments.
DOI: http://dx.doi.org/10.3390/ijms21165847
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7461580
August 2020

AITL: Adversarial Inductive Transfer Learning with input and output space adaptation for pharmacogenomics.

Bioinformatics 2020 Jul;36(Suppl_1):i380-i388

School of Computing Science, Simon Fraser University, Burnaby, BC, Canada.

Motivation: The goal of pharmacogenomics is to predict drug response in patients using their single- or multi-omics data. A major challenge is that clinical data (i.e. patients) with drug response outcome is very limited, creating a need for transfer learning to bridge the gap between large pre-clinical pharmacogenomics datasets (e.g. cancer cell lines), as a source domain, and clinical datasets as a target domain. Two major discrepancies exist between pre-clinical and clinical datasets: (i) in the input space, the gene expression data differ because of differences in the basic biology, and (ii) in the output space, the measures of drug response differ. Therefore, training a computational model on cell lines and testing it on patients violates the i.i.d. assumption that train and test data are from the same distribution.

Results: We propose Adversarial Inductive Transfer Learning (AITL), a deep neural network method for addressing discrepancies in input and output space between the pre-clinical and clinical datasets. AITL takes gene expression of patients and cell lines as the input, employs adversarial domain adaptation and multi-task learning to address these discrepancies, and predicts the drug response as the output. To the best of our knowledge, AITL is the first adversarial inductive transfer learning method to address both input and output discrepancies. Experimental results indicate that AITL outperforms state-of-the-art pharmacogenomics and transfer learning baselines and may guide precision oncology more accurately.

Availability And Implementation: https://github.com/hosseinshn/AITL.

Supplementary Information: Supplementary data are available at Bioinformatics online.
DOI: http://dx.doi.org/10.1093/bioinformatics/btaa442
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7355265
July 2020

Uncovering the subtype-specific temporal order of cancer pathway dysregulation.

PLoS Comput Biol 2019 Nov 11;15(11):e1007451. Epub 2019 Nov 11.

School of Computing Science, Simon Fraser University, Burnaby, British Columbia, Canada.

Cancer is driven by genetic mutations that dysregulate pathways important for proper cell function. Therefore, discovering these cancer pathways and their dysregulation order is key to understanding and treating cancer. However, the heterogeneity of mutations between different individuals makes this challenging and requires that cancer progression is studied in a subtype-specific way. To address this challenge, we provide a mathematical model, called Subtype-specific Pathway Linear Progression Model (SPM), that simultaneously captures cancer subtypes and pathways and order of dysregulation of the pathways within each subtype. Experiments with synthetic data indicate the robustness of SPM to problem specifics including noise compared to an existing method. Moreover, experimental results on glioblastoma multiforme and colorectal adenocarcinoma show the consistency of SPM's results with the existing knowledge and its superiority to an existing method in certain cases. The implementation of our method is available at https://github.com/Dalton386/SPM.
DOI: http://dx.doi.org/10.1371/journal.pcbi.1007451
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6872169
November 2019

MOLI: multi-omics late integration with deep neural networks for drug response prediction.

Bioinformatics 2019 Jul;35(14):i501-i509

School of Computing Science, Simon Fraser University, Burnaby, BC, Canada.

Motivation: Historically, gene expression has been shown to be the most informative data for drug response prediction. Recent evidence suggests that integrating additional omics can improve the prediction accuracy which raises the question of how to integrate the additional omics. Regardless of the integration strategy, clinical utility and translatability are crucial. Thus, we reasoned a multi-omics approach combined with clinical datasets would improve drug response prediction and clinical relevance.

Results: We propose MOLI, a multi-omics late integration method based on deep neural networks. MOLI takes somatic mutation, copy number aberration and gene expression data as input, and integrates them for drug response prediction. MOLI uses type-specific encoding sub-networks to learn features for each omics type, concatenates them into one representation and optimizes this representation via a combined cost function consisting of a triplet loss and a binary cross-entropy loss. The former makes the representations of responder samples more similar to each other and different from the non-responders, and the latter makes this representation predictive of the response values. We validate MOLI on in vitro and in vivo datasets for five chemotherapy agents and two targeted therapeutics. Compared to state-of-the-art single-omics and early integration multi-omics methods, MOLI achieves higher prediction accuracy in external validations. Moreover, a significant improvement in MOLI's performance is observed for targeted drugs when training on a pan-drug input, i.e. using all the drugs with the same target compared to training only on drug-specific inputs. MOLI's high predictive power suggests it may have utility in precision oncology.

Availability And Implementation: https://github.com/hosseinshn/MOLI.

Supplementary Information: Supplementary data are available at Bioinformatics online.
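The combined cost function described in the Results above (a triplet loss plus a binary cross-entropy loss) can be illustrated for a single training triplet. This is only a rough sketch under stated assumptions: the Euclidean distance, the `margin` value, and the weighting factor `gamma` are illustrative choices that the abstract does not specify, and the function names are hypothetical rather than taken from the MOLI repository.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss that pulls same-class pairs together and pushes others apart."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def binary_cross_entropy(prob, label, eps=1e-12):
    """Cross-entropy between a predicted response probability and a 0/1 label."""
    prob = min(max(prob, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def combined_loss(anchor, positive, negative, prob, label, gamma=0.5, margin=1.0):
    """Classification loss plus a weighted triplet term (gamma is an assumption)."""
    return binary_cross_entropy(prob, label) + gamma * triplet_loss(
        anchor, positive, negative, margin
    )


# Responder embeddings (anchor, positive) and a non-responder embedding (negative).
far_negative = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])   # already separated
near_negative = triplet_loss([0.0, 0.0], [0.0, 1.0], [0.0, 1.5])  # violates the margin
total = combined_loss([0.0, 0.0], [0.0, 1.0], [0.0, 1.5], prob=0.5, label=1)
print(far_negative, near_negative, total)
```

The triplet term is zero once a non-responder is farther from the anchor than the responder by at least the margin, so only the classification term keeps shaping well-separated representations.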
DOI: http://dx.doi.org/10.1093/bioinformatics/btz318
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6612815
July 2019

Collaborative intra-tumor heterogeneity detection.

Bioinformatics 2019 Jul;35(14):i379-i388

School of Computing Science, Simon Fraser University, Burnaby, BC.

Motivation: Despite the remarkable advances in sequencing and computational techniques, noise in the data and complexity of the underlying biological mechanisms render deconvolution of the phylogenetic relationships between cancer mutations difficult. In addition, the majority of the existing datasets consist of bulk sequencing data of a single tumor sample per individual. Accurate inference of the phylogenetic order of mutations is particularly challenging in these cases and the existing methods are faced with several theoretical limitations. To overcome these limitations, new methods are required for integrating and harnessing the full potential of the existing data.

Results: We introduce a method called Hintra for intra-tumor heterogeneity detection. Hintra integrates sequencing data for a cohort of tumors and infers tumor phylogeny for each individual based on the evolutionary information shared between different tumors. Through an iterative process, Hintra learns the repeating evolutionary patterns and uses this information for resolving the phylogenetic ambiguities of individual tumors. The results of synthetic experiments show an improved performance compared to two state-of-the-art methods. The experimental results with a recent Breast Cancer dataset are consistent with the existing knowledge and provide potentially interesting findings.

Availability And Implementation: The source code for Hintra is available at https://github.com/sahandk/HINTRA.
DOI: http://dx.doi.org/10.1093/bioinformatics/btz355
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6612880
July 2019

Performance comparison of linear and non-linear feature selection methods for the analysis of large survey datasets.

PLoS One 2019 Mar 21;14(3):e0213584. Epub 2019 Mar 21.

Digital Health Hub, Simon Fraser University, Surrey, British Columbia, Canada.

Large survey databases for aging-related analysis are often examined to discover key factors that affect a dependent variable of interest. Typically, this analysis is performed with methods assuming linear dependencies between variables. Such assumptions, however, do not hold in many cases, wherein data are linked by way of non-linear dependencies. This in turn requires the application of analytic methods that are more accurate in identifying potentially non-linear dependencies. Here, we objectively compared the feature selection performance of several frequently-used linear selection methods and three non-linear selection methods in the context of large survey data. These methods were assessed using both synthetic and real-world datasets, wherein relationships between the features and dependent variables were known in advance. We found that the non-linear methods offered better overall feature selection performance than the linear methods in all usage conditions. Moreover, the performance of the non-linear methods was more stable, being unaffected by the inclusion or exclusion of variables from the datasets. These properties make non-linear feature selection methods a potentially preferable tool for both hypothesis-driven and exploratory analyses for aging-related datasets.
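A small worked example shows why a linear criterion can miss a feature that fully determines the outcome. The sketch below uses the Pearson correlation as a stand-in for a linear selection score; it is not one of the methods evaluated in the study, and the data are synthetic.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


xs = list(range(-5, 6))            # feature values, symmetric around zero
linear = [2 * x + 1 for x in xs]   # outcome depends linearly on the feature
quadratic = [x * x for x in xs]    # outcome depends on the feature, but not linearly

r_linear = pearson(xs, linear)
r_quadratic = pearson(xs, quadratic)
print(f"linear outcome:    r = {r_linear:.3f}")
print(f"quadratic outcome: r = {r_quadratic:.3f}")
```

The quadratic outcome is a deterministic function of the feature, yet its correlation with the feature is zero, so a purely correlation-based ranking would discard it.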
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0213584
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6428288
December 2019

SUBSTRA: Supervised Bayesian Patient Stratification.

Bioinformatics 2019 Sep;35(18):3263-3272

Computational Systems Immunology, Pfizer Worldwide R&D, Berlin, Germany.

Motivation: Patient stratification methods are key to the vision of precision medicine. Here, we consider transcriptional data to segment the patient population into subsets relevant to a given phenotype. Whereas most existing patient stratification methods focus either on predictive performance or interpretable features, we developed a method striking a balance between these two important goals.

Results: We introduce a Bayesian method called SUBSTRA that uses regularized biclustering to identify patient subtypes and interpretable subtype-specific transcript clusters. The method iteratively re-weights feature importance to optimize phenotype prediction performance by producing more phenotype-relevant patient subtypes. We investigate the performance of SUBSTRA in finding relevant features using simulated data and successfully benchmark it against state-of-the-art unsupervised stratification methods and supervised alternatives. Moreover, SUBSTRA achieves predictive performance competitive with the supervised benchmark methods and provides interpretable transcriptional features in diverse biological settings, such as drug response prediction, cancer diagnosis, or kidney transplant rejection.

Availability And Implementation: The R code of SUBSTRA is available at https://github.com/sahandk/SUBSTRA.

Supplementary Information: Supplementary data are available at Bioinformatics online.
DOI: http://dx.doi.org/10.1093/bioinformatics/btz112
September 2019

Tissue-Specific Subcellular Localization Prediction Using Multi-Label Markov Random Fields.

IEEE/ACM Trans Comput Biol Bioinform 2019 Sep-Oct;16(5):1471-1482. Epub 2019 Feb 5.

The understanding of subcellular localization (SCL) of proteins and proteome variation in the different tissues and organs of the human body are two crucial aspects for increasing our knowledge of the dynamic rules of proteins, the cell biology, and the mechanism of diseases. Although there have been tremendous contributions to these two fields independently, knowledge of the variation of the spatial distribution of proteins across different tissues is still lacking. Here, we propose an approach for predicting tissue-specific protein SCL through the use of tissue-specific functional associations and physical protein-protein interactions (PPIs). We applied our previously developed Bayesian collective Markov random fields (BCMRFs) to tissue-specific PPI networks for nine types of tissue, focusing on eight high-level SCLs. The evaluation results demonstrate the strength of our approach in predicting tissue-specific SCL. We identified 1,314 proteins whose SCL was previously shown to be cell-line dependent. We predicted 549 novel tissue-specific localized candidate proteins, some of which were validated via text-mining.
DOI: http://dx.doi.org/10.1109/TCBB.2019.2897683
March 2020

HUME: large-scale detection of causal genetic factors of adverse drug reactions.

Bioinformatics 2018 Dec;34(24):4274-4283

Department of Computing Science, Simon Fraser University, Burnaby, British Columbia, Canada.

Motivation: Adverse drug reactions (ADRs) are one of the major factors that affect the wellbeing of patients and the financial costs of healthcare systems. Genetic variations of patients have been shown to be a key factor in the occurrence and severity of many ADRs. However, the large number of confounding drugs and genetic biomarkers for each adverse reaction case demands a method that evaluates all potential genetic causes of ADRs simultaneously.

Results: To address this challenge, we propose HUME, a multi-phase algorithm that recommends genetic factors for ADRs that are causally supported by the patient record data. HUME consists of the construction of a network from co-prevalence between significant genetic biomarkers and ADRs, a link score phase for predicting candidate relations based on the Adamic-Adar measure, and a causal refinement phase based on multiple hypothesis testing of quasi experimental designs for evaluating evidence and counter evidence of candidate relations in the patient records.

Supplementary Information: Supplementary data are available at Bioinformatics online.
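The link score phase mentioned in the Results above relies on the Adamic-Adar measure, which scores a candidate pair of nodes by summing 1 / log(degree) over their common neighbours, so that rarely connected shared neighbours count more than hubs. The following is a minimal sketch of the measure on a generic undirected graph; the toy graph and node names are made up, and how HUME builds its co-prevalence network is not reproduced here.

```python
import math


def adamic_adar(graph, u, v):
    """Adamic-Adar score of the pair (u, v).

    `graph` maps each node to the set of its neighbours (undirected).
    A common neighbour of two distinct nodes has degree >= 2, so the
    logarithm in the denominator is always positive.
    """
    common = graph[u] & graph[v]
    return sum(1.0 / math.log(len(graph[z])) for z in common)


# Toy undirected graph with edges A-B, A-C, B-D, C-D, C-E.
graph = {
    "A": {"B", "C"},
    "B": {"A", "D"},
    "C": {"A", "D", "E"},
    "D": {"B", "C"},
    "E": {"C"},
}

score_ad = adamic_adar(graph, "A", "D")  # common neighbours B (degree 2) and C (degree 3)
score_ae = adamic_adar(graph, "A", "E")  # common neighbour C only
score_be = adamic_adar(graph, "B", "E")  # no common neighbour
print(score_ad, score_ae, score_be)
```

Pairs with a higher score share more, and more specific, neighbours, which is why the measure is commonly used to rank candidate links before a stricter filtering step.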
DOI: http://dx.doi.org/10.1093/bioinformatics/bty475
December 2018

Automation of CT-based haemorrhagic stroke assessment for improved clinical outcomes: study protocol and design.

BMJ Open 2018 Apr 19;8(4):e020260. Epub 2018 Apr 19.

Health Sciences and Innovation, Fraser Health Authority, Surrey, British Columbia, Canada.

Introduction: Haemorrhagic stroke is of significant healthcare concern due to its association with high mortality and lasting impact on the survivors' quality of life. Treatment decisions and clinical outcomes depend strongly on the size, spread and location of the haematoma. Non-contrast CT (NCCT) is the primary neuroimaging modality for haematoma assessment in haemorrhagic stroke diagnosis. Current procedures do not allow convenient NCCT-based haemorrhage volume calculation in clinical settings, while research-based approaches are yet to be tested for clinical utility; there is a demonstrated need for developing effective solutions. The project under review investigates the development of an automatic NCCT-based haematoma computation tool in support of accurate quantification of haematoma volumes.

Methods And Analysis: Several existing research methods for haematoma volume estimation are studied. Selected methods are tested using NCCT images of patients diagnosed with acute haemorrhagic stroke. For inter-rater and intrarater reliability evaluation, different raters will analyse haemorrhage volumes independently. The efficiency with respect to time of haematoma volume assessments will be examined to compare with the results from routine clinical evaluations and planimetry assessment that are known to be more accurate. The project will target the development of an enhanced solution by adapting existing methods and integrating machine learning algorithms. NCCT-based information of brain haemorrhage (eg, size, volume, location) and other relevant information (eg, age, sex, risk factor, comorbidities) will be used in relation to clinical outcomes with future project development. Validity and reliability of the solution will be examined for potential clinical utility.

Ethics And Dissemination: The project including procedures for deidentification of NCCT data has been ethically approved. The study involves secondary use of existing data and does not require new consent of participation. The team consists of clinical neuroimaging scientists, computing scientists and clinical professionals in neurology and neuroradiology and includes patient representatives. Research outputs will be disseminated following knowledge translation plans towards improving stroke patient care. Significant findings will be published in scientific journals. Anticipated deliverables include computer solutions for improved clinical assessment of haematoma using NCCT.
DOI: http://dx.doi.org/10.1136/bmjopen-2017-020260
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5914893
April 2018

SimBoost: a read-across approach for predicting drug-target binding affinities using gradient boosting machines.

J Cheminform 2017 Apr 18;9(1):24. Epub 2017 Apr 18.

School of Computing Science, Simon Fraser University, 8888 University Drive, Burnaby, BC, V5A 1S6, Canada.

Computational prediction of the interaction between drugs and targets is a standing challenge in the field of drug discovery. A number of rather accurate predictions were reported for various binary drug-target benchmark datasets. However, a notable drawback of a binary representation of interaction data is that missing endpoints for non-interacting drug-target pairs are not differentiated from inactive cases, and that predicted levels of activity depend on pre-defined binarization thresholds. In this paper, we present a method called SimBoost that predicts continuous (non-binary) values of binding affinities of compounds and proteins and thus incorporates the whole interaction spectrum from true negative to true positive interactions. Additionally, we propose a version of the method called SimBoostQuant which computes a prediction interval in order to assess the confidence of the predicted affinity, thus defining the Applicability Domain metrics explicitly. We evaluate SimBoost and SimBoostQuant on two established drug-target interaction benchmark datasets and one new dataset that we propose to use as a benchmark for read-across cheminformatics applications. We demonstrate that our methods outperform the previously reported models across the studied datasets.
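The abstract does not state how the prediction interval is computed. One standard technique for turning a regression model into an interval predictor is quantile regression, which trains the model with the pinball (quantile) loss at a lower and an upper quantile. The sketch below illustrates only that loss, under the assumption that this is the kind of objective involved; the function names and data are hypothetical.

```python
def pinball_loss(y_true, y_pred, q):
    """Pinball loss of one prediction for quantile level q (0 < q < 1)."""
    diff = y_true - y_pred
    # Under-prediction is weighted by q, over-prediction by (1 - q).
    return q * diff if diff >= 0 else (q - 1) * diff


def mean_pinball(values, constant, q):
    """Average pinball loss when predicting the same constant for all values."""
    return sum(pinball_loss(v, constant, q) for v in values) / len(values)


# At q = 0.9, predicting too low costs nine times more than predicting too high.
too_low = pinball_loss(7.0, 6.0, 0.9)
too_high = pinball_loss(7.0, 8.0, 0.9)

# The constant that minimises the loss at q = 0.5 is the median of the data.
affinities = [1.0, 2.0, 3.0, 4.0, 100.0]  # made-up values with one outlier
best = min(affinities, key=lambda c: mean_pinball(affinities, c, 0.5))
print(too_low, too_high, best)
```

Fitting one model at a low quantile and one at a high quantile yields the two ends of an interval whose width reflects how uncertain the prediction is for a given compound-protein pair.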
DOI: http://dx.doi.org/10.1186/s13321-017-0209-z
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5395521
April 2017

Bayesian Biclustering for Patient Stratification.

Pac Symp Biocomput 2016;21:345-56

School of Computing Science, Simon Fraser University, 8888 University Drive, Burnaby, BC, V5A 1S6, Canada.

The move from Empirical Medicine towards Personalized Medicine has attracted attention to Stratified Medicine (SM). Some methods have been proposed in the literature for patient stratification, which is the central task of SM; however, significant open issues remain. First, it is still unclear whether integrating different datatypes helps in detecting disease subtypes more accurately, and, if not, which datatype(s) are most useful for this task. Second, it is not clear how different methods of patient stratification can be compared. Third, as most of the proposed stratification methods are deterministic, there is a need to investigate the potential benefits of applying probabilistic methods. To address these issues, we introduce a novel integrative Bayesian biclustering method, called B2PS, for patient stratification and propose methods for evaluating the results. Our experimental results demonstrate the superiority of B2PS over a popular state-of-the-art method and the benefits of Bayesian approaches. Our results agree with the intuition that transcriptomic data forms a better basis for patient stratification than genomic data.
October 2016

Optimally discriminative subnetwork markers predict response to chemotherapy.

Bioinformatics 2011 Jul;27(13):i205-13

School of Computing Science, Simon Fraser University.

Motivation: Molecular profiles of tumour samples have been widely and successfully used for classification problems. A number of algorithms have been proposed to predict classes of tumor samples based on expression profiles with relatively high performance. However, prediction of response to cancer treatment has proved to be more challenging and novel approaches with improved generalizability are still highly needed. Recent studies have clearly demonstrated the advantages of integrating protein-protein interaction (PPI) data with gene expression profiles for the development of subnetwork markers in classification problems.

Results: We describe a novel network-based classification algorithm (OptDis) using color coding technique to identify optimally discriminative subnetwork markers. Focusing on PPI networks, we apply our algorithm to drug response studies: we evaluate our algorithm using published cohorts of breast cancer patients treated with combination chemotherapy. We show that our OptDis method improves over previously published subnetwork methods and provides better and more stable performance compared with other subnetwork and single gene methods. We also show that our subnetwork method produces predictive markers that are more reproducible across independent cohorts and offer valuable insight into biological processes underlying response to therapy.

Availability: The implementation is available at: http://www.cs.sfu.ca/~pdao/personal/OptDis.html

Contact: cenk@cs.sfu.ca; alapuk@prostatecentre.com; ccollins@prostatecentre.com.
DOI: http://dx.doi.org/10.1093/bioinformatics/btr245
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3117373
July 2011

Module discovery by exhaustive search for densely connected, co-expressed regions in biomolecular interaction networks.

PLoS One 2010 Oct 25;5(10):e13348. Epub 2010 Oct 25.

School of Computing Science, Simon Fraser University, Burnaby, Canada.

Background: Computational prediction of functionally related groups of genes (functional modules) from large-scale data is an important issue in computational biology. Gene expression experiments and interaction networks are well studied large-scale data sources, available for many not yet exhaustively annotated organisms. It has been well established that, when these two data sources are analyzed jointly, modules are often reflected by highly interconnected (dense) regions in the interaction networks whose participating genes are co-expressed. However, the tractability of the problem had remained unclear and methods by which to exhaustively search for such constellations had not been presented.

Methodology/Principal Findings: We provide an algorithmic framework, referred to as Densely Connected Biclustering (DECOB), by which the aforementioned search problem becomes tractable. To benchmark the predictive power inherent to the approach, we computed all co-expressed, dense regions in physical protein and genetic interaction networks from human and yeast. An automated filtering procedure reduces our output, resulting in smaller collections of modules, comparable to state-of-the-art approaches. Our results performed favorably in a fair benchmarking competition which adheres to standard criteria. We demonstrate the usefulness of an exhaustive module search by using the unreduced output to more quickly perform GO term related function prediction tasks. We point out the advantages of our exhaustive output by predicting functional relationships using two examples.

Conclusion/Significance: We demonstrate that the computation of all densely connected and co-expressed regions in interaction networks is an approach to module discovery of considerable value. Beyond confirming the well settled hypothesis that such co-expressed, densely connected interaction network regions reflect functional modules, we open up novel computational ways to comprehensively analyze the modular organization of an organism based on prevalent and largely available large-scale datasets.

Availability: Software and data sets are available at http://www.sfu.ca/~ester/software/DECOB.zip.
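The notion of a "dense" region used above can be made concrete with the usual edge density of an induced subgraph: the fraction of node pairs in the region that are actually linked. The sketch below computes only that quantity; the co-expression constraint and the exhaustive search of DECOB are not reproduced, and the gene names and network are placeholders.

```python
from itertools import combinations


def edge_density(nodes, edges):
    """Fraction of possible node pairs in `nodes` that are linked in `edges`.

    `edges` is a set of frozensets, each holding the two endpoints of an
    undirected interaction.
    """
    nodes = sorted(set(nodes))
    n = len(nodes)
    if n < 2:
        return 0.0
    present = sum(1 for a, b in combinations(nodes, 2) if frozenset((a, b)) in edges)
    return present / (n * (n - 1) / 2)


# Hypothetical toy interaction network.
edges = {
    frozenset(pair)
    for pair in [("g1", "g2"), ("g1", "g3"), ("g2", "g3"), ("g3", "g4")]
}

triangle = edge_density(["g1", "g2", "g3"], edges)      # all 3 pairs linked
looser = edge_density(["g1", "g2", "g3", "g4"], edges)  # 4 of 6 pairs linked
print(triangle, looser)
```

A density threshold on candidate regions then separates tightly interconnected gene sets from loosely connected ones before any expression-based criterion is applied.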
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0013348
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2963598
October 2010

Inferring cancer subnetwork markers using density-constrained biclustering.

Bioinformatics 2010 Sep;26(18):i625-31

School of Computing Science, Simon Fraser University, Burnaby, Canada.

Motivation: Recent genomic studies have confirmed that cancer is of utmost phenotypical complexity, varying greatly in terms of subtypes and evolutionary stages. When classifying cancer tissue samples, subnetwork marker approaches have proven to be superior over single gene marker approaches, most importantly in cross-platform evaluation schemes. However, prior subnetwork-based approaches do not explicitly address the great phenotypical complexity of cancer.

Results: We explicitly address this and employ density-constrained biclustering to compute subnetwork markers, which reflect pathways being dysregulated in many, but not necessarily all, samples under consideration. In breast cancer we achieve substantial improvements over all cross-platform applicable approaches when predicting TP53 mutation status in a well-established non-cross-platform setting. In colon cancer, we raise prediction accuracy in the most difficult instances from 87% to 93% for cancer versus non-cancer and from 83% to an astonishing 92% for with versus without liver metastasis, in well-established cross-platform evaluation schemes.

Availability: Software is available on request.

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/btq393
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2935415
September 2010

PSORTb 3.0: improved protein subcellular localization prediction with refined localization subcategories and predictive capabilities for all prokaryotes.

Bioinformatics 2010 Jul 13;26(13):1608-15. Epub 2010 May 13.

Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC V5A 1S6, Canada.

Motivation: PSORTb has remained the most precise bacterial protein subcellular localization (SCL) predictor since it was first made available in 2003. However, the recall needs to be improved and no accurate SCL predictors yet make predictions for archaea, nor differentiate important localization subcategories, such as proteins targeted to a host cell or bacterial hyperstructures/organelles. Such improvements should preferably be encompassed in a freely available web-based predictor that can also be used as a standalone program.

Results: We developed PSORTb version 3.0 with improved recall, higher proteome-scale prediction coverage, and new refined localization subcategories. It is the first SCL predictor specifically geared for all prokaryotes, including archaea and bacteria with atypical membrane/cell wall topologies. It features an improved standalone program, with a new batch results delivery system complementing its web interface. We evaluated the most accurate SCL predictors using 5-fold cross-validation and performed an independent proteomics analysis, showing that PSORTb 3.0 is the most accurate but can benefit from being complemented by Proteome Analyst predictions.

Availability: http://www.psort.org/psortb (download open source software or use the web interface).

Contact: psort-mail@sfu.ca

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/btq249
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2887053
July 2010

Slider--maximum use of probability information for alignment of short sequence reads and SNP detection.

Bioinformatics 2009 Jan 30;25(1):6-13. Epub 2008 Oct 30.

Genome Sciences Centre, BC Cancer Agency, Vancouver and School of Computing Science, Simon Fraser University, Burnaby, BC, Canada.

Motivation: A plethora of alignment tools have been created, each designed to best fit different types of alignment conditions. While some of these are made for aligning Illumina Sequence Analyzer reads, none fully utilizes its probability (prb) output. In this article, we introduce a new alignment approach (Slider) that reduces the alignment problem space by utilizing each read base's probabilities given in the prb files.

Results: Compared with other aligners, Slider has higher alignment accuracy and efficiency. In addition, given that Slider matches bases with probabilities other than the most probable, it significantly reduces the percentage of base mismatches. The result is that its SNP predictions are more accurate than other SNP prediction approaches used today that start from the most probable sequence, including those using base quality.
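
The idea of matching bases other than the most probable one can be illustrated with a small sketch. This is not Slider's algorithm: the per-base probability dictionaries, the function names, and the `min_prob` cutoff are all hypothetical and chosen only to show why counting against the single most probable base reports more mismatches than accepting any sufficiently probable base.

```python
def mismatches_top_base(read_probs, reference):
    """Count mismatches when each read position is reduced to its most probable base."""
    return sum(
        1
        for probs, ref_base in zip(read_probs, reference)
        if max(probs, key=probs.get) != ref_base
    )


def mismatches_with_alternatives(read_probs, reference, min_prob=0.2):
    """Count mismatches when any base with probability >= min_prob may match.

    min_prob is an illustrative cutoff, not a value taken from the paper.
    """
    return sum(
        1
        for probs, ref_base in zip(read_probs, reference)
        if probs.get(ref_base, 0.0) < min_prob
    )


# Toy read of three positions; the second position is ambiguous between A and C.
read_probs = [
    {"A": 0.90, "C": 0.05, "G": 0.03, "T": 0.02},
    {"A": 0.50, "C": 0.45, "G": 0.03, "T": 0.02},
    {"A": 0.01, "C": 0.01, "G": 0.97, "T": 0.01},
]
reference = "ACG"

top_only = mismatches_top_base(read_probs, reference)
with_alternatives = mismatches_with_alternatives(read_probs, reference)
```

In this toy read the ambiguous second position counts as a mismatch under the most-probable-base view but not when the second most probable base is allowed to match.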

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/btn565
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2638935
January 2009

Assessment and integration of publicly available SAGE, cDNA microarray, and oligonucleotide microarray expression data for global coexpression analyses.

Genomics 2005 Oct;86(4):476-88

Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, Canada V5Z 4E6.

Large amounts of gene expression data from several different technologies are becoming available to the scientific community. A common practice is to use these data to calculate global gene coexpression for validation or integration of other "omic" data. To assess the utility of publicly available datasets for this purpose we have analyzed Homo sapiens data from 1202 cDNA microarray experiments, 242 SAGE libraries, and 667 Affymetrix oligonucleotide microarray experiments. The three datasets compared demonstrate significant but low levels of global concordance (rc<0.11). Assessment against Gene Ontology (GO) revealed that all three platforms identify more coexpressed gene pairs with common biological processes than expected by chance. As the Pearson correlation for a gene pair increased it was more likely to be confirmed by GO. The Affymetrix dataset performed best individually with gene pairs of correlation 0.9-1.0 confirmed by GO in 74% of cases. However, in all cases, gene pairs confirmed by multiple platforms were more likely to be confirmed by GO. We show that combining results from different expression platforms increases reliability of coexpression. A comparison with other recently published coexpression studies found similar results in terms of performance against GO but with each method producing distinctly different gene pair lists.
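
The core computation described above, Pearson correlation per gene pair and confirmation of a pair by more than one platform, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the toy expression values, and the 0.9 threshold are hypothetical, and the sketch assumes both platforms list the same genes in the same row order.

```python
import numpy as np


def coexpression_matrix(expr):
    """Pearson correlation between every pair of genes (rows of expr)."""
    return np.corrcoef(np.asarray(expr, dtype=float))


def confirmed_pairs(corr_a, corr_b, threshold):
    """Gene index pairs whose correlation reaches the threshold on both platforms."""
    n = corr_a.shape[0]
    return [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if corr_a[i, j] >= threshold and corr_b[i, j] >= threshold
    ]


# Toy data: 3 genes measured in 4 samples on two hypothetical platforms.
platform_a = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1]]
platform_b = [[1, 2, 3, 5], [1, 3, 5, 9], [1, 5, 2, 4]]

pairs = confirmed_pairs(
    coexpression_matrix(platform_a),
    coexpression_matrix(platform_b),
    threshold=0.9,
)
```

Only the first two genes are strongly correlated on both toy platforms, so they form the single pair retained.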

Source
DOI: http://dx.doi.org/10.1016/j.ygeno.2005.06.009
October 2005

PSORT-B: Improving protein subcellular localization prediction for Gram-negative bacteria.

Nucleic Acids Res 2003 Jul;31(13):3613-7

Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC, Canada V5A 1S6.

Automated prediction of bacterial protein subcellular localization is an important tool for genome annotation and drug discovery. PSORT has been one of the most widely used computational methods for such bacterial protein analysis; however, it has not been updated since it was introduced in 1991. In addition, neither PSORT nor any of the other computational methods available make predictions for all five of the localization sites characteristic of Gram-negative bacteria. Here we present PSORT-B, an updated version of PSORT for Gram-negative bacteria, which is available as a web-based application at http://www.psort.org. PSORT-B examines a given protein sequence for amino acid composition, similarity to proteins of known localization, presence of a signal peptide, transmembrane alpha-helices and motifs corresponding to specific localizations. A probabilistic method integrates these analyses, returning a list of five possible localization sites with associated probability scores. PSORT-B, designed to favor high precision (specificity) over high recall (sensitivity), attained an overall precision of 97% and recall of 75% in 5-fold cross-validation tests, using a dataset we developed of 1443 proteins of experimentally known localization. This dataset, the largest of its kind, is freely available, along with the PSORT-B source code (under GNU General Public License).
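
The trade-off between precision and recall mentioned above can be made concrete with a short sketch. It is not PSORT-B's implementation: the score dictionaries, the `cutoff` value, and the function names are hypothetical. The sketch assumes a predictor that reports a localization site only when its top probability score reaches the cutoff and otherwise abstains, which raises precision at the cost of recall.

```python
def call_site(scores, cutoff=0.75):
    """Return the top-scoring site, or None (abstain) if its score is below cutoff.

    The cutoff is an illustrative value, not one taken from the paper.
    """
    site, best = max(scores.items(), key=lambda item: item[1])
    return site if best >= cutoff else None


def precision_recall(predictions, truths):
    """Precision over the predictions actually made; recall over all proteins."""
    made = [(p, t) for p, t in zip(predictions, truths) if p is not None]
    correct = sum(1 for p, t in made if p == t)
    precision = correct / len(made) if made else 0.0
    recall = correct / len(truths) if truths else 0.0
    return precision, recall


# Toy scores for four proteins; unlisted sites are treated as having no score.
scored = [
    {"cytoplasm": 0.95, "periplasm": 0.05},
    {"outer membrane": 0.80, "extracellular": 0.20},
    {"inner membrane": 0.90, "cytoplasm": 0.10},
    {"periplasm": 0.50, "extracellular": 0.50},  # ambiguous, so no call is made
]
truths = ["cytoplasm", "outer membrane", "inner membrane", "periplasm"]

predictions = [call_site(s) for s in scored]
precision, recall = precision_recall(predictions, truths)
```

Here three confident calls are all correct and one protein is left unassigned, so precision is 1.0 while recall is 0.75.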

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC169008
DOI: http://dx.doi.org/10.1093/nar/gkg602
July 2003

[Effect of pneumoperitoneum on venous hemodynamics during laparoscopic cholecystectomy. Influence of patients' age and time of surgery].

Med Clin (Barc) 2003 Mar;120(9):330-4

Servicio Cirugía General y Digestivo, Hospital Universitario de Getafe, Madrid, Spain.

Background And Objective: Abdominal hyperpressure developed during laparoscopic cholecystectomy through the effect of pneumoperitoneum represents an obstacle to venous return that may facilitate thromboembolic complications. The aim of this study was to establish the effect of pneumoperitoneum on venous hemodynamics during laparoscopy.

Patients And Method: Prospective study of 31 consecutive patients who underwent laparoscopic cholecystectomy. Venous occlusion plethysmography was performed preoperatively, after anaesthetic induction, after insufflation, before pneumoperitoneum release and at the end of surgery. Changes of plethysmography were compared with preoperative values and according to age, obesity, presence of varicose veins and pneumoperitoneum time. Bilateral lower limb venous Duplex scanning was performed at days 1, 7 and 30 to detect deep venous thrombosis (DVT).

Results: Average age was 56 years; 66.6% of patients were female, 40% were obese, 16% had varicose veins, and the pneumoperitoneum time was < 45 min in 22.5% of patients. Capacitance decreased progressively during surgery and was significantly reduced with pneumoperitoneum. The maximum venous outflow in the first second was reduced significantly at the end of pneumoperitoneum. These reductions were more evident in older patients.

Conclusions: Pneumoperitoneum produces plethysmographic changes in venous hemodynamics, with diminished venous return in the lower limbs. Older patients have a higher risk of thromboembolic complications, while obesity can also increase this risk. However, no DVT was demonstrated in this study.

Source
DOI: http://dx.doi.org/10.1016/s0025-7753(03)73693-8
March 2003