Publications by authors named "Chris Sander"

196 Publications

SARS-CoV-2 infects blood monocytes to activate NLRP3 and AIM2 inflammasomes, pyroptosis and cytokine release.

Res Sq 2021 Aug 11. Epub 2021 Aug 11.

SARS-CoV-2 causes acute respiratory distress that can progress to multiorgan failure and death in a minority of patients. Although severe COVID-19 disease is linked to exuberant inflammation, how SARS-CoV-2 triggers inflammation is not understood. Monocytes and macrophages are sentinel immune cells in the blood and tissue, respectively, that sense invasive infection to form inflammasomes that activate caspase-1 and gasdermin D (GSDMD) pores, leading to inflammatory death (pyroptosis) and processing and release of IL-1 family cytokines, potent inflammatory mediators. Here we show that expression quantitative trait loci (eQTLs) linked to higher GSDMD expression increase the risk of severe COVID-19 disease (odds ratio, 1.3, p<0.005). We find that about 10% of blood monocytes in COVID-19 patients are infected with SARS-CoV-2. Monocyte infection depends on viral antibody opsonization and uptake of opsonized virus by the Fc receptor CD16. After uptake, SARS-CoV-2 begins to replicate in monocytes, as evidenced by detection of double-stranded RNA and subgenomic RNA and expression of a fluorescent reporter gene. However, infection is aborted, and infectious virus is not detected in infected monocyte supernatants or patient plasma. Instead, infected cells undergo inflammatory cell death (pyroptosis) mediated by activation of the NLRP3 and AIM2 inflammasomes, caspase-1 and GSDMD. Moreover, tissue-resident macrophages, but not infected epithelial cells, from COVID-19 lung autopsy specimens showed evidence of inflammasome activation. These findings taken together suggest that antibody-mediated SARS-CoV-2 infection of monocytes/macrophages triggers inflammatory cell death that aborts production of infectious virus but causes systemic inflammation that contributes to severe COVID-19 disease pathogenesis.

Source
DOI: http://dx.doi.org/10.21203/rs.3.rs-153628/v1
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8366805
August 2021

Causal interactions from proteomic profiles: Molecular data meet pathway knowledge.

Patterns (N Y) 2021 Jun 12;2(6):100257. Epub 2021 May 12.

Computational Biology Program, Oregon Health and Science University, 3181 SW Sam Jackson Park Road, Portland, OR 97239, USA.

We present a computational method to infer causal mechanisms in cell biology by analyzing changes in high-throughput proteomic profiles on the background of prior knowledge captured in biochemical reaction knowledge bases. The method mimics a biologist's traditional approach of explaining changes in data using prior knowledge but does this at the scale of hundreds of thousands of reactions. This is a specific example of how to automate scientific reasoning processes and illustrates the power of mapping from experimental data to prior knowledge via logic programming. The identified mechanisms can explain how experimental and physiological perturbations, propagating in a network of reactions, affect cellular responses and their phenotypic consequences. Causal pathway analysis is a powerful and flexible discovery tool for a wide range of cellular profiling data types and biological questions. The automated causation inference tool, as well as the source code, are freely available at http://causalpath.org.

Source
DOI: http://dx.doi.org/10.1016/j.patter.2021.100257
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8212145
June 2021

PredictProtein - Predicting Protein Structure and Function for 29 Years.

Nucleic Acids Res 2021 07;49(W1):W535-W540

TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology - i12, Boltzmannstr 3, 85748 Garching/Munich, Germany.

Since 1992, PredictProtein (https://predictprotein.org) has been a one-stop online resource for protein sequence analysis with its main site hosted at the Luxembourg Centre for Systems Biomedicine (LCSB) and queried monthly by over 3,000 users in 2020. PredictProtein was the first Internet server for protein predictions. It pioneered combining evolutionary information and machine learning. Given a protein sequence as input, the server outputs multiple sequence alignments, predictions of protein structure in 1D and 2D (secondary structure, solvent accessibility, transmembrane segments, disordered regions, protein flexibility, and disulfide bridges) and predictions of protein function (functional effects of sequence variation or point mutations, Gene Ontology (GO) terms, subcellular localization, and protein-, RNA-, and DNA binding). PredictProtein's infrastructure has moved to the LCSB increasing throughput; the use of MMseqs2 sequence search reduced runtime five-fold (apparently without lowering performance of prediction methods); user interface elements improved usability, and new prediction methods were added. PredictProtein recently included predictions from deep learning embeddings (GO and secondary structure) and a method for the prediction of proteins and residues binding DNA, RNA, or other proteins. PredictProtein.org aspires to provide reliable predictions to computational and experimental biologists alike. All scripts and methods are freely available for offline execution in high-throughput settings.

Source
DOI: http://dx.doi.org/10.1093/nar/gkab354
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8265159
July 2021

Protein design and variant prediction using autoregressive generative models.

Nat Commun 2021 04 23;12(1):2403. Epub 2021 Apr 23.

Department of Systems Biology, Harvard Medical School, Boston, MA, USA.

The ability to design functional sequences and predict effects of variation is central to protein engineering and biotherapeutics. State-of-art computational methods rely on models that leverage evolutionary information but are inadequate for important applications where multiple sequence alignments are not robust. Such applications include the prediction of variant effects of indels, disordered proteins, and the design of proteins such as antibodies due to the highly variable complementarity determining regions. We introduce a deep generative model adapted from natural language processing for prediction and design of diverse functional sequences without the need for alignments. The model performs state-of-art prediction of missense and indel effects and we successfully design and test a diverse 10-nanobody library that shows better expression than a 1000-fold larger synthetic library. Our results demonstrate the power of the alignment-free autoregressive model in generalizing to regions of sequence space traditionally considered beyond the reach of prediction and design.

Source
DOI: http://dx.doi.org/10.1038/s41467-021-22732-w
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8065141
April 2021

Artificial Intelligence and Early Detection of Pancreatic Cancer: 2020 Summative Review.

Pancreas 2021 03;50(3):251-279

Sander Lab, Harvard Medical School, Boston, MA.

Abstract: Despite considerable research efforts, pancreatic cancer is associated with a dire prognosis and a 5-year survival rate of only 10%. Early symptoms of the disease are mostly nonspecific. The premise of improved survival through early detection is that more individuals will benefit from potentially curative treatment. Artificial intelligence (AI) methodology has emerged as a successful tool for risk stratification and identification in general health care. In response to the maturity of AI, Kenner Family Research Fund conducted the 2020 AI and Early Detection of Pancreatic Cancer Virtual Summit (www.pdac-virtualsummit.org) in conjunction with the American Pancreatic Association, with a focus on the potential of AI to advance early detection efforts in this disease. This comprehensive presummit article was prepared based on information provided by each of the interdisciplinary participants on one of the 5 following topics: Progress, Problems, and Prospects for Early Detection; AI and Machine Learning; AI and Pancreatic Cancer-Current Efforts; Collaborative Opportunities; and Moving Forward-Reflections from Government, Industry, and Advocacy. The outcome from the robust Summit conversations, to be presented in a future white paper, indicates that significant progress must be the result of strategic collaboration among investigators and institutions from multidisciplinary backgrounds, supported by committed funders.

Source
DOI: http://dx.doi.org/10.1097/MPA.0000000000001762
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8041569
March 2021

SARS-CoV-2 infects blood monocytes to activate NLRP3 and AIM2 inflammasomes, pyroptosis and cytokine release.

medRxiv 2021 Mar 8. Epub 2021 Mar 8.

SARS-CoV-2 causes acute respiratory distress that can progress to multiorgan failure and death in some patients. Although severe COVID-19 disease is linked to exuberant inflammation, how SARS-CoV-2 triggers inflammation is not understood. Monocytes are sentinel blood cells that sense invasive infection to form inflammasomes that activate caspase-1 and gasdermin D (GSDMD) pores, leading to inflammatory death (pyroptosis) and processing and release of IL-1 family cytokines, potent inflammatory mediators. Here we show that ~10% of blood monocytes in COVID-19 patients are dying and infected with SARS-CoV-2. Monocyte infection, which depends on antiviral antibodies, activates NLRP3 and AIM2 inflammasomes, caspase-1 and GSDMD cleavage and relocalization. Signs of pyroptosis (IL-1 family cytokines, LDH) in the plasma correlate with development of severe disease. Moreover, expression quantitative trait loci (eQTLs) linked to higher GSDMD expression increase the risk of severe COVID-19 disease (odds ratio, 1.3, p<0.005). These findings taken together suggest that antibody-mediated SARS-CoV-2 infection of monocytes triggers inflammation that contributes to severe COVID-19 disease pathogenesis.

One Sentence Summary: Antibody-mediated SARS-CoV-2 infection of monocytes activates inflammation and cytokine release.

Source
DOI: http://dx.doi.org/10.1101/2021.03.06.21252796
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7987031
March 2021

CellBox: Interpretable Machine Learning for Perturbation Biology with Application to the Design of Cancer Combination Therapy.

Cell Syst 2021 02 28;12(2):128-140.e4. Epub 2020 Dec 28.

Department of Cell Biology, Harvard Medical School, Boston, MA, USA; cBio Center, Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA; Broad Institute, Cambridge, MA, USA.

Systematic perturbation of cells followed by comprehensive measurements of molecular and phenotypic responses provides informative data resources for constructing computational models of cell biology. Models that generalize well beyond training data can be used to identify combinatorial perturbations of potential therapeutic interest. Major challenges for machine learning on large biological datasets are to find global optima in a complex multidimensional space and mechanistically interpret the solutions. To address these challenges, we introduce a hybrid approach that combines explicit mathematical models of cell dynamics with a machine-learning framework, implemented in TensorFlow. We tested the modeling framework on a perturbation-response dataset of a melanoma cell line after drug treatments. The models can be efficiently trained to describe cellular behavior accurately. Even though completely data driven and independent of prior knowledge, the resulting de novo network models recapitulate some known interactions. The approach is readily applicable to various kinetic models of cell biology. A record of this paper's Transparent Peer Review process is included in the Supplemental Information.

Source
DOI: http://dx.doi.org/10.1016/j.cels.2020.11.013
February 2021

CellMiner Cross-Database (CellMinerCDB) version 1.2: Exploration of patient-derived cancer cell line pharmacogenomics.

Nucleic Acids Res 2021 01;49(D1):D1083-D1093

Developmental Therapeutics Branch, Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD 20892, USA.

CellMiner Cross-Database (CellMinerCDB, discover.nci.nih.gov/cellminercdb) allows integration and analysis of molecular and pharmacological data within and across cancer cell line datasets from the National Cancer Institute (NCI), Broad Institute, Sanger/MGH and MD Anderson Cancer Center (MDACC). We present CellMinerCDB 1.2 with updates to datasets from NCI-60, Broad Cancer Cell Line Encyclopedia and Sanger/MGH, and the addition of new datasets, including NCI-ALMANAC drug combination, MDACC Cell Line Project proteomic, NCI-SCLC DNA copy number and methylation data, and Broad methylation, genetic dependency and metabolomic datasets. CellMinerCDB (v1.2) includes several improvements over the previously published version: (i) new and updated datasets; (ii) support for pattern comparisons and multivariate analyses across data sources; (iii) updated annotations with drug mechanism of action information and biologically relevant multigene signatures; (iv) analysis speedups via caching; (v) a new dataset download feature; (vi) improved visualization of subsets of multiple tissue types; (vii) breakdown of univariate associations by tissue type; and (viii) enhanced help information. The curation and common annotations (e.g. tissues of origin and identifiers) provided here across pharmacogenomic datasets increase the utility of the individual datasets to address multiple researcher question types, including data reproducibility, biomarker discovery and multivariate analysis of drug activity.

Source
DOI: http://dx.doi.org/10.1093/nar/gkaa968
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7779001
January 2021

netboxr: Automated discovery of biological process modules by network analysis in R.

PLoS One 2020 2;15(11):e0234669. Epub 2020 Nov 2.

Department of Cell Biology, Harvard Medical School, Boston, MA, United States of America.

Summary: Large-scale sequencing projects, such as The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC), have generated high throughput sequencing and molecular profiling data sets, but it is still challenging to identify potentially causal changes in cellular processes in cancer as well as in other diseases in an automated fashion. We developed the netboxr package written in the R programming language, which makes use of the NetBox algorithm to identify candidate cancer-related functional modules. The algorithm makes use of a data-driven, network-based approach that combines prior knowledge with a network clustering algorithm, obviating the need for and the limitation of independently curated functionally labeled gene sets. The method can combine multiple data types, such as mutations and copy number alterations, leading to more reliable identification of functional modules. We make the tool available in the Bioconductor R ecosystem for applications in cancer research and cell biology.

Availability And Implementation: The netboxr package is free and open-sourced under the GNU GPL-3 license R package available at https://www.bioconductor.org/packages/release/bioc/html/netboxr.html.

Source
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0234669
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7605689
December 2020

AlignmentViewer: Sequence Analysis of Large Protein Families.

F1000Res 2020 27;9. Epub 2020 Mar 27.

cBio Center, Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA.

AlignmentViewer is a web-based tool to view and analyze multiple sequence alignments of protein families. The particular strengths of AlignmentViewer include flexible visualization at different scales as well as analysis of conservation patterns and of the distribution of proteins in sequence space. The tool is directly accessible in web browsers without the need for software installation. It can handle protein families with tens of thousands of sequences and is particularly suitable for evolutionary coupling analysis, e.g. via EVcouplings.org.

Source
DOI: http://dx.doi.org/10.12688/f1000research.22242.2
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7570326
February 2021

Systematic Assessment of Tumor Purity and Its Clinical Implications.

JCO Precis Oncol 2020 4;4. Epub 2020 Sep 4.

Department of Human Genetics, University of California, Los Angeles, CA.

Purpose: The tumor microenvironment is complex, comprising heterogeneous cellular populations. As molecular profiles are frequently generated using bulk tissue sections, they represent an admixture of multiple cell types (including immune, stromal, and cancer cells) interacting with each other. Therefore, these molecular profiles are confounded by signals emanating from many cell types. Accurate assessment of residual cancer cell fraction is crucial for parameterization and interpretation of genomic analyses, as well as for accurately interpreting the clinical properties of the tumor.

Materials And Methods: To benchmark cancer cell fraction estimation methods, 10 estimators were applied to a clinical cohort of 333 patients with prostate cancer. These methods include gold-standard multiobserver pathology estimates, as well as estimates inferred from genome, epigenome, and transcriptome data. In addition, two methods based on genomic and transcriptomic profiles were used to quantify tumor purity in 4,497 tumors across 12 cancer types. Bulk mRNA and microRNA profiles were subject to in silico deconvolution to estimate cancer cell-specific mRNA and microRNA profiles.

Results: We present a systematic comparison of 10 tumor purity estimation methods on a cohort of 333 prostate tumors. We quantify variation among purity estimation methods and demonstrate how this influences interpretation of clinico-genomic analyses. Our data show poor concordance between pathologic and molecular purity estimates, necessitating caution when interpreting molecular results. Limited concordance between DNA- and mRNA-derived purity estimates remained a general pan-cancer phenomenon when tested in an additional 4,497 tumors spanning 12 cancer types.

Conclusion: The choice of tumor purity estimation method may have a profound impact on the interpretation of genomic assays. Taken together, these data highlight the need for improved assessment of tumor purity and quantitation of its influences on the molecular hallmarks of cancers.

Source
DOI: http://dx.doi.org/10.1200/PO.20.00016
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7529507
September 2020

Diabetes, Weight Change, and Pancreatic Cancer Risk.

JAMA Oncol 2020 10 8;6(10):e202948. Epub 2020 Oct 8.

Department of Medical Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, Massachusetts.

Importance: Pancreatic cancer is the third-leading cause of cancer death in the United States; however, few high-risk groups have been identified to facilitate early diagnosis strategies.

Objective: To evaluate the association of diabetes duration and recent weight change with subsequent risk of pancreatic cancer in the general population.

Design, Setting, And Participants: This cohort study obtained data from female participants in the Nurses' Health Study and male participants in the Health Professionals Follow-Up Study, with repeated exposure assessments over 30 years. Incident cases of pancreatic cancer were identified from self-report or during follow-up of participant deaths. Deaths were ascertained through reports from the next of kin, the US Postal Service, or the National Death Index. Data collection was conducted from October 1, 2018, to December 31, 2018. Data analysis was performed from January 1, 2019, to June 30, 2019.

Exposures: Duration of physician-diagnosed diabetes and recent weight change.

Main Outcome And Measures: Hazard ratios (HRs) for subsequent development of pancreatic cancer.

Results: Of the 112 818 women (with a mean [SD] age of 59.4 [11.7] years) and 46 207 men (with a mean [SD] age of 64.7 [10.8] years) included in the analysis, 1116 incident cases of pancreatic cancers were identified. Compared with participants with no diabetes, those with recent-onset diabetes had an age-adjusted HR for pancreatic cancer of 2.97 (95% CI, 2.31-3.82) and those with long-standing diabetes had an age-adjusted HR of 2.16 (95% CI, 1.78-2.60). Compared with those with no weight loss, participants who reported a 1- to 4-lb weight loss had an age-adjusted HR for pancreatic cancer of 1.25 (95% CI, 1.03-1.52), those with a 5- to 8-lb weight loss had an age-adjusted HR of 1.33 (95% CI, 1.06-1.66), and those with more than an 8-lb weight loss had an age-adjusted HR of 1.92 (95% CI, 1.58-2.32). Participants with recent-onset diabetes accompanied by weight loss of 1 to 8 lb (91 incident cases per 100 000 person-years [95% CI, 55-151]; HR, 3.61 [95% CI, 2.14-6.10]) or more than 8 lb (164 incident cases per 100 000 person-years [95% CI, 114-238]; HR, 6.75 [95% CI, 4.55-10.00]) had a substantially increased risk for pancreatic cancer compared with those with neither exposure (16 incident cases per 100 000 person-years; 95% CI, 14-17). Incidence rates were even higher among participants with recent-onset diabetes and weight loss with a body mass index of less than 25 before weight loss (400 incident cases per 100 000 person-years) or whose weight loss was not intentional judging from increased physical activity or healthier dietary choices (334 incident cases per 100 000 person-years).

Conclusions And Relevance: This study demonstrates that recent-onset diabetes accompanied by weight loss is associated with a substantially increased risk for developing pancreatic cancer. Older age, previous healthy weight, and no intentional weight loss further elevate this risk.

Source
DOI: http://dx.doi.org/10.1001/jamaoncol.2020.2948
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7426876
October 2020

Perturbation biology links temporal protein changes to drug responses in a melanoma cell line.

PLoS Comput Biol 2020 07 15;16(7):e1007909. Epub 2020 Jul 15.

Department of Cell Biology, Harvard Medical School, Boston, MA 02115, U.S.A.

Cancer cells have genetic alterations that often directly affect intracellular protein signaling processes allowing them to bypass control mechanisms for cell death, growth and division. Cancer drugs targeting these alterations often work initially, but resistance is common. Combinations of targeted drugs may overcome or prevent resistance, but their selection requires context-specific knowledge of signaling pathways including complex interactions such as feedback loops and crosstalk. To infer quantitative pathway models, we collected a rich dataset on a melanoma cell line: Following perturbation with 54 drug combinations, we measured 124 (phospho-)protein levels and phenotypic response (cell growth, apoptosis) in a time series from 10 minutes to 67 hours. From these data, we trained time-resolved mathematical models that capture molecular interactions and the coupling of molecular levels to cellular phenotype, which in turn reveal the main direct or indirect molecular responses to each drug. Systematic model simulations identified novel combinations of drugs predicted to reduce the survival of melanoma cells, with partial experimental verification. This particular application of perturbation biology demonstrates the potential impact of combining time-resolved data with modeling for the discovery of new combinations of cancer drugs.

Source
DOI: http://dx.doi.org/10.1371/journal.pcbi.1007909
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7384681
July 2020

Analyses of non-coding somatic drivers in 2,658 cancer whole genomes.

Nature 2020 02 5;578(7793):102-111. Epub 2020 Feb 5.

Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA.

The discovery of drivers of cancer has traditionally focused on protein-coding genes. Here we present analyses of driver point mutations and structural variants in non-coding regions across 2,658 genomes from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA). For point mutations, we developed a statistically rigorous strategy for combining significance levels from multiple methods of driver discovery that overcomes the limitations of individual methods. For structural variants, we present two methods of driver discovery, and identify regions that are significantly affected by recurrent breakpoints and recurrent somatic juxtapositions. Our analyses confirm previously reported drivers, raise doubts about others and identify novel candidates, including point mutations in the 5' region of TP53, in the 3' untranslated regions of NFKBIZ and TOB1, focal deletions in BRD4 and rearrangements in the loci of AKR1C genes. We show that although point mutations and structural variants that drive cancer are less frequent in non-coding genes and regulatory sequences than in protein-coding genes, additional examples of these drivers will be found as more cancer genomes become available.

Source
DOI: http://dx.doi.org/10.1038/s41586-020-1965-x
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7054214
February 2020

Protein Structure from Experimental Evolution.

Cell Syst 2020 01 11;10(1):15-24.e5. Epub 2019 Dec 11.

cBio Center, Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA; Department of Cell Biology, Harvard Medical School, Boston, MA, USA; Broad Institute, Cambridge, MA, USA.

Natural evolution encodes rich information about the structure and function of biomolecules in the genetic record. Previously, statistical analysis of co-variation patterns in natural protein families has enabled the accurate computation of 3D structures. Here, we explored generating similar information by experimental evolution, starting from a single gene and performing multiple cycles of in vitro mutagenesis and functional selection in Escherichia coli. We evolved two antibiotic resistance proteins, β-lactamase PSE1 and acetyltransferase AAC6, and obtained hundreds of thousands of diverse functional sequences. Using evolutionary coupling analysis, we inferred residue interaction constraints that were in agreement with contacts in known 3D structures, confirming genetic encoding of structural constraints in the selected sequences. Computational protein folding with interaction constraints then yielded 3D structures with the same fold as natural relatives. This work lays the foundation for a new experimental method (3Dseq) for protein structure determination, combining evolution experiments with inference of residue interactions from sequence information. A record of this paper's Transparent Peer Review process is included in the Supplemental Information.

Source
DOI: http://dx.doi.org/10.1016/j.cels.2019.11.008
January 2020

Quantitative Proteome Landscape of the NCI-60 Cancer Cell Lines.

iScience 2019 Nov 31;21:664-680. Epub 2019 Oct 31.

Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

Here we describe a proteomic data resource for the NCI-60 cell lines generated by pressure cycling technology and SWATH mass spectrometry. We developed the DIA-expert software to curate and visualize the SWATH data, leading to reproducible detection of over 3,100 SwissProt proteotypic proteins and systematic quantification of pathway activities. Stoichiometric relationships of interacting proteins for DNA replication, repair, the chromatin remodeling NuRD complex, β-catenin, RNA metabolism, and prefoldins are more evident than that at the mRNA level. The data are available in CellMiner (discover.nci.nih.gov/cellminercdb and discover.nci.nih.gov/cellminer), allowing casual users to test hypotheses and perform integrative, cross-database analyses of multi-omic drug response correlations for over 20,000 drugs. We demonstrate the value of proteome data in predicting drug response for over 240 clinically relevant chemotherapeutic and targeted therapies. In summary, we present a novel proteome resource for the NCI-60, together with relevant software tools, and demonstrate the benefit of proteome analyses.

Source
DOI: http://dx.doi.org/10.1016/j.isci.2019.10.059
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6889472
November 2019

Pathway Commons 2019 Update: integration, analysis and exploration of pathway data.

Nucleic Acids Res 2020 01;48(D1):D489-D497

cBio Center, Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA 02215, USA.

Pathway Commons (https://www.pathwaycommons.org) is an integrated resource of publicly available information about biological pathways including biochemical reactions, assembly of biomolecular complexes, transport and catalysis events and physical interactions involving proteins, DNA, RNA, and small molecules (e.g. metabolites and drug compounds). Data is collected from multiple providers in standard formats, including the Biological Pathway Exchange (BioPAX) language and the Proteomics Standards Initiative Molecular Interactions format, and then integrated. Pathway Commons provides biologists with (i) tools to search this comprehensive resource, (ii) a download site offering integrated bulk sets of pathway data (e.g. tables of interactions and gene sets), (iii) reusable software libraries for working with pathway information in several programming languages (Java, R, Python and Javascript) and (iv) a web service for programmatically querying the entire dataset. Visualization of pathways is supported using the Systems Biological Graphical Notation (SBGN). Pathway Commons currently contains data from 22 databases with 4794 detailed human biochemical processes (i.e. pathways) and ∼2.3 million interactions. To enhance the usability of this large resource for end-users, we develop and maintain interactive web applications and training materials that enable pathway exploration and advanced analysis.

Source
DOI: http://dx.doi.org/10.1093/nar/gkz946
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7145667
January 2020

Protein structure prediction assisted with sparse NMR data in CASP13.

Proteins 2019 12;87(12):1315-1332

Center for Advanced Biotechnology and Medicine, and Department of Molecular Biology and Biochemistry, Rutgers, The State University of New Jersey, Piscataway, New Jersey.

CASP13 has investigated the impact of sparse NMR data on the accuracy of protein structure prediction. NOESY and ¹⁵N-¹H residual dipolar coupling data, typical of that obtained for ¹⁵N,¹³C-enriched, perdeuterated proteins up to about 40 kDa, were simulated for 11 CASP13 targets ranging in size from 80 to 326 residues. For several targets, two prediction groups generated models that are more accurate than those produced using baseline methods. Real NMR data collected for a de novo designed protein were also provided to predictors, including one data set in which only backbone resonance assignments were available. Some NMR-assisted prediction groups also did very well with these data. CASP13 also assessed whether incorporation of sparse NMR data improves the accuracy of protein structure prediction relative to nonassisted regular methods. In most cases, incorporation of sparse, noisy NMR data results in models with higher accuracy. The best NMR-assisted models were also compared with the best regular predictions of any CASP13 group for the same target. For six of 13 targets, the most accurate model provided by any NMR-assisted prediction group was more accurate than the most accurate model provided by any regular prediction group; however, for the remaining seven targets, one or more regular prediction method provided a more accurate model than even the best NMR-assisted model. These results suggest a novel approach for protein structure determination, in which advanced prediction methods are first used to generate structural models, and sparse NMR data is then used to validate and/or refine these models.

Source
DOI: http://dx.doi.org/10.1002/prot.25837
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7213643
December 2019

Cancer-associated mutations in DICER1 RNase IIIa and IIIb domains exert similar effects on miRNA biogenesis.

Nat Commun 2019 08 15;10(1):3682. Epub 2019 Aug 15.

Department of Developmental Biology, Memorial Sloan-Kettering Cancer Center, New York, NY, 10065, USA.

Somatic mutations in the RNase IIIb domain of DICER1 arise in cancer and disrupt the cleavage of 5' pre-miRNA arms. Here, we characterize an unstudied, recurrent mutation (S1344L) in the DICER1 RNase IIIa domain in tumors from The Cancer Genome Atlas (TCGA) project and MSK-IMPACT profiling. RNase IIIa/b hotspots are absent from most cancers, but are notably enriched in uterine cancers. Systematic analysis of TCGA small RNA datasets shows that DICER1 RNase IIIa-S1344L tumors deplete 5p-miRNAs, analogous to RNase IIIb hotspot samples. Structural and evolutionary coupling analyses reveal constrained proximity of RNase IIIa-S1344 to the RNase IIIb catalytic site, rationalizing why mutation of this site phenocopies known hotspot alterations. Finally, examination of DICER1 hotspot endometrial tumors reveals derepression of specific miRNA target signatures. In summary, comprehensive analyses of DICER1 somatic mutations and small RNA data reveal a mechanistic aspect of pre-miRNA processing that manifests in specific cancer settings.

Source
DOI: http://dx.doi.org/10.1038/s41467-019-11610-1
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6695490
August 2019

Inferring protein 3D structure from deep mutation scans.

Nat Genet 2019 07 17;51(7):1170-1176. Epub 2019 Jun 17.

Department of Systems Biology, Harvard Medical School, Boston, MA, USA.

We describe an experimental method of three-dimensional (3D) structure determination that exploits the increasing ease of high-throughput mutational scans. Inspired by the success of using natural, evolutionary sequence covariation to compute protein and RNA folds, we explored whether 'laboratory', synthetic sequence variation might also yield 3D structures. We analyzed five large-scale mutational scans and discovered that the pairs of residues with the largest positive epistasis in the experiments are sufficient to determine the 3D fold. We show that the strongest epistatic pairings from genetic screens of three proteins, a ribozyme and a protein interaction reveal 3D contacts within and between macromolecules. Using these experimental epistatic pairs, we compute ab initio folds for a GB1 domain (within 1.8 Å of the crystal structure) and a WW domain (2.1 Å). We propose strategies that reduce the number of mutants needed for contact prediction, suggesting that genomics-based techniques can efficiently predict 3D structure.
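
The idea of ranking residue pairs by positive epistasis can be illustrated with a minimal sketch. It assumes a log-additive null model (epistasis is the observed log fitness of a double mutant minus the sum of the single-mutant log fitnesses); the exact scoring used in the paper may differ, and the function names and toy fitness values below are hypothetical.

```python
from math import log


def pairwise_epistasis(single, double, wild_type=1.0):
    """Return {(mut_a, mut_b): epistasis} on a log-fitness scale.

    single: {mutation: fitness}, double: {(mut_a, mut_b): fitness}.
    Assumed null model: single-mutant effects add in log space.
    """
    scores = {}
    for (a, b), f_ab in double.items():
        expected = log(single[a] / wild_type) + log(single[b] / wild_type)
        observed = log(f_ab / wild_type)
        scores[(a, b)] = observed - expected
    return scores


def top_positive_pairs(scores, k):
    """Return the k pairs with the largest positive epistasis, strongest first."""
    positive = [(pair, e) for pair, e in scores.items() if e > 0]
    return sorted(positive, key=lambda item: item[1], reverse=True)[:k]


# Hypothetical toy data, not taken from any real mutational scan.
single = {"A10G": 0.5, "L25V": 0.4, "K40E": 0.9}
double = {
    ("A10G", "L25V"): 0.6,   # fitter than expected: candidate 3D contact
    ("A10G", "K40E"): 0.40,  # slightly less fit than expected
    ("L25V", "K40E"): 0.2,   # less fit than expected
}
scores = pairwise_epistasis(single, double)
top = top_positive_pairs(scores, k=2)
```

In a real analysis the top-ranked pairs would then serve as distance restraints for folding; that step is outside the scope of this sketch.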

Source
DOI: http://dx.doi.org/10.1038/s41588-019-0432-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7295002
July 2019

LLGL2 rescues nutrient stress by promoting leucine uptake in ER+ breast cancer.

Nature 2019 05 17;569(7755):275-279. Epub 2019 Apr 17.

Department of Medicine, Cancer Research Institute, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA.

Drosophila Lgl and its mammalian homologues, LLGL1 and LLGL2, are scaffolding proteins that regulate the establishment of apical-basal polarity in epithelial cells. Whereas Lgl functions as a tumour suppressor in Drosophila, the roles of mammalian LLGL1 and LLGL2 in cancer are unclear. The majority (about 75%) of breast cancers express oestrogen receptors (ERs), and patients with these tumours receive endocrine treatment. However, the development of resistance to endocrine therapy and metastatic progression are leading causes of death for patients with ER+ disease. Here we report that, unlike LLGL1, LLGL2 is overexpressed in ER+ breast cancer and promotes cell proliferation under nutrient stress. LLGL2 regulates cell surface levels of a leucine transporter, SLC7A5, by forming a trimeric complex with SLC7A5 and a regulator of membrane fusion, YKT6, to promote leucine uptake and cell proliferation. The oestrogen receptor targets LLGL2 expression. Resistance to endocrine treatment in breast cancer cells was associated with SLC7A5- and LLGL2-dependent adaptation to nutrient stress. SLC7A5 was necessary and sufficient to confer resistance to tamoxifen treatment, identifying SLC7A5 as a potential therapeutic target for overcoming resistance to endocrine treatments in breast cancer. Thus, LLGL2 functions as a promoter of tumour growth and not as a tumour suppressor in ER+ breast cancer. Beyond breast cancer, adaptation to nutrient stress is critically important, and our findings identify an unexpected role for LLGL2 in this process.

Source
DOI: http://dx.doi.org/10.1038/s41586-019-1126-2
May 2019

Abnormal oxidative metabolism in a quiet genomic background underlies clear cell papillary renal cell carcinoma.

Elife 2019 04 1;8. Epub 2019 Apr 1.

Department of Medicine, Molecular Oncology, Siteman Cancer Center, Washington University, St. Louis, United States.

While genomic sequencing routinely identifies oncogenic alterations for the majority of cancers, many tumors harbor no discernable driver lesion. Here, we describe the exceptional molecular phenotype of a genomically quiet kidney tumor, clear cell papillary renal cell carcinoma (CCPAP). In spite of a largely wild-type nuclear genome, CCPAP tumors exhibit severe depletion of mitochondrial DNA (mtDNA) and RNA and high levels of oxidative stress, reflecting a shift away from respiratory metabolism. Moreover, CCPAP tumors exhibit a distinct metabolic phenotype uniquely characterized by accumulation of the sugar alcohol sorbitol. Immunohistochemical staining of primary CCPAP tumor specimens recapitulates both the depletion of mtDNA-encoded proteins and a lipid-depleted metabolic phenotype, suggesting that the cytoplasmic clarity in CCPAP is primarily related to the presence of glycogen. These results argue for non-genetic profiling as a tool for the study of cancers of unknown driver.

Source
DOI: http://dx.doi.org/10.7554/eLife.38986
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6459676
April 2019

A Hybrid Approach for Protein Structure Determination Combining Sparse NMR with Evolutionary Coupling Sequence Data.

Adv Exp Med Biol 2018;1105:153-169

Center for Advanced Biotechnology and Medicine, Department of Molecular Biology and Biochemistry, Rutgers, The State University of New Jersey, Piscataway, NJ, USA.

While 3D structure determination of small (<15 kDa) proteins by solution NMR is largely automated and routine, structural analysis of larger proteins is more challenging. An emerging hybrid strategy for modeling protein structures combines sparse NMR data that can be obtained for larger proteins with sequence co-variation data, called evolutionary couplings (ECs), obtained from multiple sequence alignments of protein families. This hybrid "EC-NMR" method can be used to accurately model larger (15-60 kDa) proteins, and more rapidly determine structures of smaller (5-15 kDa) proteins using only backbone NMR data. The resulting structures have accuracies relative to reference structures comparable to those obtained with full backbone and sidechain NMR resonance assignments. The requirement that evolutionary couplings (ECs) are consistent with NMR data recorded on a specific member of a protein family, under specific conditions, potentially also allows identification of ECs that reflect alternative allosteric or excited states of the protein structure.

Source
DOI: http://dx.doi.org/10.1007/978-981-13-2200-6_10
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6630173
July 2019

Combining Evolutionary Covariance and NMR Data for Protein Structure Determination.

Methods Enzymol 2019 23;614:363-392. Epub 2018 Dec 23.

Center for Advanced Biotechnology and Medicine, Rutgers, The State University of New Jersey, Piscataway, NJ, United States; Department of Molecular Biology and Biochemistry, Rutgers, The State University of New Jersey, Piscataway, NJ, United States; Department of Biochemistry and Molecular Biology, Robert Wood Johnson Medical School, Rutgers, The State University of New Jersey, Piscataway, NJ, United States.

Accurate protein structure determination by solution-state NMR is challenging for proteins greater than about 20 kDa, for which extensive perdeuteration is generally required, providing experimental data that are incomplete (sparse) and ambiguous. However, the massive increase in evolutionary sequence information coupled with advances in methods for sequence covariance analysis can provide reliable residue-residue contact information for a protein from sequence data alone. These "evolutionary couplings (ECs)" can be combined with sparse NMR data to determine accurate 3D protein structures. This hybrid "EC-NMR" method has been developed using NMR data for several soluble proteins and validated by comparison with corresponding reference structures determined by X-ray crystallography and/or conventional NMR methods. For small proteins, only backbone resonance assignments are utilized, while for larger proteins both backbone and some sidechain methyl resonance assignments are generally required. ECs can be combined with sparse NMR data obtained on deuterated, selectively protonated protein samples to provide structures that are more accurate and complete than those obtained using such sparse NMR data alone. EC-NMR also has significant potential for analysis of protein structures from solid-state NMR data and for studies of integral membrane proteins. The requirement that ECs are consistent with NMR data recorded on a specific member of a protein family, under specific conditions, also allows identification of ECs that reflect alternative allosteric or excited states of the protein structure.

Source
DOI: http://dx.doi.org/10.1016/bs.mie.2018.11.004
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6640129
August 2019

CellMinerCDB for Integrative Cross-Database Genomics and Pharmacogenomics Analyses of Cancer Cell Lines.

iScience 2018 Dec 12;10:247-264. Epub 2018 Dec 12.

Developmental Therapeutics Branch, Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD 20892, USA.

CellMinerCDB provides a web-based resource (https://discover.nci.nih.gov/cellminercdb/) for integrating multiple forms of pharmacological and genomic analyses, and unifying the richest cancer cell line datasets (the NCI-60, NCI-SCLC, Sanger/MGH GDSC, and Broad CCLE/CTRP). CellMinerCDB enables data queries for genomics and gene regulatory network analyses, and exploration of pharmacogenomic determinants and drug signatures. It leverages overlaps of cell lines and drugs across databases to examine reproducibility and expand pathway analyses. We illustrate the value of CellMinerCDB for elucidating gene expression determinants, such as DNA methylation and copy number variations, and highlight complexities in assessing mutational burden. We demonstrate the value of CellMinerCDB in selecting drugs with reproducible activity, expand on the dominant role of SLFN11 for drug response, and present novel response determinants and genomic signatures for topoisomerase inhibitors and schweinfurthins. We also introduce LIX1L as a gene associated with mesenchymal signature and regulation of cellular migration and invasiveness.

Source
DOI: http://dx.doi.org/10.1016/j.isci.2018.11.029
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6302245
December 2018

The EVcouplings Python framework for coevolutionary sequence analysis.

Bioinformatics 2019 05;35(9):1582-1584

Department of Systems Biology, Harvard Medical School, Boston, MA, USA.

Summary: Coevolutionary sequence analysis has become a commonly used technique for de novo prediction of the structure and function of proteins, RNA, and protein complexes. We present the EVcouplings framework, a fully integrated open-source application and Python package for coevolutionary analysis. The framework enables generation of sequence alignments, calculation and evaluation of evolutionary couplings (ECs), and de novo prediction of structure and mutation effects. The combination of an easy to use, flexible command line interface and an underlying modular Python package makes the full power of coevolutionary analyses available to entry-level and advanced users.

Availability And Implementation: https://github.com/debbiemarkslab/evcouplings.
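
As a rough illustration of what a covariation score between alignment columns measures, the sketch below computes the mutual information between two columns of a toy multiple sequence alignment. Mutual information is a deliberately simpler stand-in: it is not the statistical model behind evolutionary couplings and not the EVcouplings API, and the function name and toy alignment are hypothetical.

```python
from collections import Counter
from math import log2


def column_mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    alignment: list of equal-length aligned sequences (strings).
    """
    n = len(alignment)
    counts_i = Counter(seq[i] for seq in alignment)
    counts_j = Counter(seq[j] for seq in alignment)
    counts_ij = Counter((seq[i], seq[j]) for seq in alignment)

    mi = 0.0
    for (a, b), count in counts_ij.items():
        p_ab = count / n
        p_a = counts_i[a] / n
        p_b = counts_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Hypothetical toy alignment: columns 0 and 1 change together,
# column 2 is fully conserved.
toy_alignment = ["AKL", "AKL", "GEL", "GEL"]
mi_covarying = column_mutual_information(toy_alignment, 0, 1)
mi_conserved = column_mutual_information(toy_alignment, 0, 2)
```

Columns that vary together score high, while a conserved column carries no covariation signal. Global models are generally preferred over raw mutual information because they aim to separate direct couplings from indirect, transitive correlations.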

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/bty862
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6499242
May 2019

Comprehensive Analysis of Alternative Splicing Across Tumors from 8,705 Patients.

Cancer Cell 2018 08 2;34(2):211-224.e6. Epub 2018 Aug 2.

ETH Zurich, Department of Computer Science, Zurich, Switzerland; Memorial Sloan Kettering Cancer Center, Computational Biology Department, New York, USA; University Hospital Zurich, Biomedical Informatics Research, Zurich, Switzerland; ETH Zurich, Department of Biology, Zurich, Switzerland; SIB Swiss Institute of Bioinformatics, Zurich, Switzerland.

Our comprehensive analysis of alternative splicing across 32 The Cancer Genome Atlas cancer types from 8,705 patients detects alternative splicing events and tumor variants by reanalyzing RNA and whole-exome sequencing data. Tumors have up to 30% more alternative splicing events than normal samples. Association analysis of somatic variants with alternative splicing events confirmed known trans associations with variants in SF3B1 and U2AF1 and identified additional trans-acting variants (e.g., TADA1, PPP2R1A). Many tumors have thousands of alternative splicing events not detectable in normal samples; on average, we identified ≈930 exon-exon junctions ("neojunctions") in tumors not typically found in GTEx normals. From Clinical Proteomic Tumor Analysis Consortium data available for breast and ovarian tumor samples, we confirmed ≈1.7 neojunction- and ≈0.6 single nucleotide variant-derived peptides per tumor sample that are also predicted major histocompatibility complex-I binders ("putative neoantigens").

Source
DOI: http://dx.doi.org/10.1016/j.ccell.2018.07.001
August 2018

Systems pharmacology using mass spectrometry identifies critical response nodes in prostate cancer.

NPJ Syst Biol Appl 2018 2;4:26. Epub 2018 Jul 2.

Department of Biology, Institute of Molecular Systems Biology, ETH Zürich, Auguste Piccard Hof 1, Zürich, Switzerland.

In the United States alone, one in five newly diagnosed cancers in men are prostate carcinomas (PCa). Androgen receptor (AR) status and the PI3K-AKT-mTOR signal transduction pathway are critical in PCa. After initial response to single drugs targeting these pathways, resistance often emerges, indicating the need for combination therapy. Here, we address the question of efficacy of drug combinations and development of resistance mechanisms to targeted therapy by a systems pharmacology approach. We combine targeted perturbation with detailed observation of the molecular response by mass spectrometry. We hypothesize that the molecular short-term (24 h) response reveals details of how PCa cells adapt to counter the anti-proliferative drug effect. With focus on six drugs currently used in PCa treatment or targeting the PI3K-AKT-mTOR signal transduction pathway, we perturbed the LNCaP clone FGC cell line by a total of 21 treatment conditions using single and paired drug combinations. The molecular response was analyzed by the mass spectrometric quantification of 52 proteins. Analysis of the data revealed a pattern of strong responders, i.e., proteins that were consistently downregulated or upregulated across many of the perturbation conditions. The downregulated proteins, HN1, PAK1, and SPAG5, are potential early indicators of drug efficacy and point to previously less well-characterized response pathways in PCa cells. Some of the upregulated proteins such as 14-3-3 proteins and KLK2 may be useful early markers of adaptive response and indicate potential resistance pathways targetable as part of combination therapy to overcome drug resistance. The potential of 14-3-3ζ (YWHAZ) as a target is underscored by the independent observation, based on cancer genomics of surgical specimens, that its DNA copy number and transcript levels tend to increase with PCa disease progression. Systematic drug perturbation combined with detailed observation of short-term molecular response using mass spectrometry is a potentially powerful tool to discover response markers and anti-resistance targets.

Source
DOI: http://dx.doi.org/10.1038/s41540-018-0064-1
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6026592
July 2018

Computer-guided design of optimal microbial consortia for immune system modulation.

Elife 2018 04 17;7. Epub 2018 Apr 17.

Engineering and Applied Sciences PhD Program, University of Massachusetts Dartmouth, North Dartmouth, United States.

Manipulation of the gut microbiota holds great promise for the treatment of diseases. However, a major challenge is the identification of therapeutically potent microbial consortia that colonize the host effectively while maximizing immunologic outcome. Here, we propose a novel workflow to select optimal immune-inducing consortia from microbiome composition and immune effector measurements. Using published and newly generated microbial and regulatory T-cell (Treg) data from germ-free mice, we estimate the contributions of twelve Clostridia strains with known immune-modulating effect to Treg induction. Combining this with a longitudinal data-constrained ecological model, we predict the ability of every attainable and ecologically stable subconsortium to promote Treg activation and rank them by the Treg Induction Score (TrIS). Experimental validation of selected consortia indicates a strong and statistically significant correlation between predicted TrIS and measured Treg. We argue that computational indexes, such as the TrIS, are valuable tools for the systematic selection of immune-modulating bacteriotherapeutics.

Source
DOI: http://dx.doi.org/10.7554/eLife.30916
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5959721
April 2018

Sequencing of prostate cancers identifies new cancer genes, routes of progression and drug targets.

Nat Genet 2018 05 16;50(5):682-692. Epub 2018 Apr 16.

The Institute of Cancer Research, London, UK.

Prostate cancer represents a substantial clinical challenge because it is difficult to predict outcome and advanced disease is often fatal. We sequenced the whole genomes of 112 primary and metastatic prostate cancer samples. From joint analysis of these cancers with those from previous studies (930 cancers in total), we found evidence for 22 previously unidentified putative driver genes harboring coding mutations, as well as evidence for NEAT1 and FOXA1 acting as drivers through noncoding mutations. Through the temporal dissection of aberrations, we identified driver mutations specifically associated with steps in the progression of prostate cancer, establishing, for example, loss of CHD1 and BRCA2 as early events in cancer development of ETS fusion-negative cancers. Computational chemogenomic (canSAR) analysis of prostate cancer mutations identified 11 targets of approved drugs, 7 targets of investigational drugs, and 62 targets of compounds that may be active and should be considered candidates for future clinical trials.

Source
DOI: http://dx.doi.org/10.1038/s41588-018-0086-z
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6372064
May 2018
-->