Publications by authors named "Abhishek Niroula"

26 Publications

Supplemental Association of Clonal Hematopoiesis With Incident Heart Failure.

J Am Coll Cardiol 2021 Jul;78(1):42-52

Department of Epidemiology, Brown University, Providence, Rhode Island, USA; Care New England, Center for Primary Care and Prevention, Pawtucket, Rhode Island, USA; Department of Family Medicine, Warren Alpert Medical School of Brown University, Providence, Rhode Island, USA.

Background: Age-related clonal hematopoiesis of indeterminate potential (CHIP), defined as clonally expanded leukemogenic sequence variations (particularly in DNMT3A, TET2, ASXL1, and JAK2) in asymptomatic individuals, is associated with cardiovascular events, including recurrent heart failure (HF).

Objectives: This study sought to evaluate whether CHIP is associated with incident HF.

Methods: CHIP status was obtained from whole exome or genome sequencing of blood DNA in participants without prevalent HF or hematological malignancy from 5 cohorts. Cox proportional hazards models were performed within each cohort, adjusting for demographic and clinical risk factors, followed by fixed-effect meta-analyses. Large CHIP clones (defined as variant allele frequency >10%), HF with or without baseline coronary heart disease, and left ventricular ejection fraction were evaluated in secondary analyses.
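The pooling step named in the Methods (per-cohort Cox estimates combined by fixed-effect meta-analysis) can be sketched with standard inverse-variance weighting. This is a minimal illustration, not the authors' code: the function name and the per-cohort numbers below are hypothetical, and the standard errors are assumed to be recoverable from symmetric 95% confidence intervals on the log scale.

```python
import math

Z_95 = 1.959963984540054  # two-sided 95% quantile of the standard normal


def fixed_effect_meta(cohorts):
    """Pool per-cohort hazard ratios with inverse-variance weights.

    cohorts: iterable of (hazard_ratio, ci_low, ci_high) with 95% CIs.
    Returns (pooled_hr, pooled_ci_low, pooled_ci_high).
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for hr, ci_low, ci_high in cohorts:
        # Recover the standard error of log(HR) from the CI width.
        se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
        weight = 1.0 / se ** 2
        weighted_sum += weight * math.log(hr)
        total_weight += weight
    pooled_log_hr = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return (
        math.exp(pooled_log_hr),
        math.exp(pooled_log_hr - Z_95 * pooled_se),
        math.exp(pooled_log_hr + Z_95 * pooled_se),
    )


# Hypothetical per-cohort estimates, NOT taken from the study.
example_cohorts = [(1.20, 1.00, 1.44), (1.30, 1.10, 1.54), (1.25, 0.95, 1.64)]
pooled = fixed_effect_meta(example_cohorts)
print("pooled HR %.2f (95%% CI %.2f-%.2f)" % pooled)
```

Under a fixed-effect model every cohort is assumed to estimate the same underlying hazard ratio, so more precise cohorts simply receive more weight.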

Results: Of 56,597 individuals (59% women, mean age 58 years at baseline), 3,406 (6%) had CHIP, and 4,694 developed HF (8.3%) over up to 20 years of follow-up. CHIP was prospectively associated with a 25% increased risk of HF in meta-analysis (hazard ratio: 1.25; 95% confidence interval: 1.13-1.38) with consistent associations across cohorts. ASXL1, TET2, and JAK2 sequence variations were each associated with an increased risk of HF, whereas DNMT3A sequence variations were not associated with HF. Secondary analyses suggested large CHIP was associated with a greater risk of HF (hazard ratio: 1.29; 95% confidence interval: 1.15-1.44), and the associations for CHIP on HF with and without prior coronary heart disease were homogenous. ASXL1 sequence variations were associated with reduced left ventricular ejection fraction.

Conclusions: CHIP, particularly sequence variations in ASXL1, TET2, and JAK2, represents a new risk factor for HF.

Source
DOI: http://dx.doi.org/10.1016/j.jacc.2021.04.085
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8313294
July 2021

Prognostic implications of troponin T variations in inherited cardiomyopathies using systems biology.

NPJ Genom Med 2021 Jun 14;6(1):47. Epub 2021 Jun 14.

Cardiology department, Health In Code. As Xubias s/n, Edificio El Fortín, 15006, A Coruña, Spain.

The cardiac troponin T variations have often been used as an example of the application of clinical genotyping for prognostication and risk stratification measures for the management of patients with a family history of sudden cardiac death or familial cardiomyopathy. Given the disparity in patient outcomes and therapy options, we investigated the impact of variations on the intermolecular interactions across the thin filament complex as an example of an unbiased systems biology method to better define clinical prognosis to aid future management options. We present a novel unbiased dynamic model to define and analyse the functional, structural and physico-chemical consequences of genetic variations among the troponins. This was subsequently integrated with clinical data from accessible global multi-centre systematic reviews of familial cardiomyopathy cases from 106 articles of the literature: 136 disease-causing variations pertaining to 981 global clinical cases. Troponin T variations showed distinct pathogenic hotspots for dilated and hypertrophic cardiomyopathies; considering the causes of cardiovascular death separately, there was a worse survival in terms of sudden cardiac death for patients with a variation at regions 90-129 and 130-179 when compared to amino acids 1-89 and 200-288. Our data support variations among 90-130 as being a hotspot for sudden cardiac death and the region 131-179 for heart failure death/transplantation outcomes wherein the most common phenotype was dilated cardiomyopathy. Survival analysis into regions of high risk (regions 90-129 and 130-180) and low risk (regions 1-89 and 200-288) was significant for sudden cardiac death (p = 0.011) and for heart failure death/transplant (p = 0.028). Our integrative genomic, structural, model from genotype to clinical data integration has implications for enhancing clinical genomics methodologies to improve risk stratification.

Source
DOI: http://dx.doi.org/10.1038/s41525-021-00204-w
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8203786
June 2021

Association of Diet Quality With Prevalence of Clonal Hematopoiesis and Adverse Cardiovascular Events.

JAMA Cardiol 2021 Jun 9. Epub 2021 Jun 9.

Cardiovascular Research Center, Massachusetts General Hospital, Boston.

Importance: Clonal hematopoiesis of indeterminate potential (CHIP), the expansion of somatic leukemogenic variations in hematopoietic stem cells, has been associated with atherosclerotic cardiovascular disease. Because the inherited risk of developing CHIP is low, lifestyle elements such as dietary factors may be associated with the development and outcomes of CHIP.

Objective: To examine whether there is an association between diet quality and the prevalence of CHIP.

Design, Setting, And Participants: This retrospective cohort study used data from participants in the UK Biobank, an ongoing population-based study in the United Kingdom that examines whole-exome sequencing data and survey-based information on health-associated behaviors. Individuals from the UK Biobank were recruited between 2006 and 2010 and followed up prospectively with linkage to health data records through May 2020. The present study included 44 111 participants in the UK Biobank who were age 40 to 70 years, had data available from whole-exome sequencing of blood DNA, and were free of coronary artery disease (CAD) or hematologic cancer at baseline.

Exposures: Diet quality was categorized as unhealthy if the intake of healthy elements (fruits and vegetables) was lower than the median of all survey responses, and the intake of unhealthy elements (red meat, processed food, and added salt) was higher than the median. Diets were classified as healthy if the intake of healthy elements was higher than the median, and the intake of unhealthy elements was lower than the median. The presence of CHIP was detected by data from whole-exome sequencing of blood DNA.
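The exposure definition above amounts to a simple median-split rule. A minimal sketch follows, assuming that any participant who meets neither definition falls into the intermediate category reported in the Results; the function name, argument names, and example values are hypothetical and not taken from the study.

```python
def classify_diet(healthy_intake, unhealthy_intake, median_healthy, median_unhealthy):
    """Classify diet quality using median splits on two intake scores.

    Assumption: anything that is neither "unhealthy" nor "healthy"
    is treated as "intermediate".
    """
    if healthy_intake < median_healthy and unhealthy_intake > median_unhealthy:
        return "unhealthy"
    if healthy_intake > median_healthy and unhealthy_intake < median_unhealthy:
        return "healthy"
    return "intermediate"


# Hypothetical intake scores and medians, for illustration only.
print(classify_diet(healthy_intake=2, unhealthy_intake=9,
                    median_healthy=5, median_unhealthy=6))
```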

Main Outcomes And Measures: The primary outcome was CHIP prevalence. Multivariable logistic regression analysis was used to examine the association between diet quality and the presence of CHIP. Multivariable Cox proportional hazards models were used to assess the association of incident events (acute coronary syndromes, coronary revascularization, or death) in each diet quality category stratified by the presence of CHIP.

Results: Among 44 111 participants (mean [SD] age at time of blood sample collection, 56.3 [8.0] years; 24 507 women [55.6%]), 2271 individuals (5.1%) had an unhealthy diet, 38 552 individuals (87.4%) had an intermediate diet, and 3288 individuals (7.5%) had a healthy diet. A total of 2507 individuals (5.7%) had CHIP, and the prevalence of CHIP decreased as diet quality improved from unhealthy (162 of 2271 participants [7.1%]) to intermediate (2177 of 38 552 participants [5.7%]) to healthy (168 of 3288 participants [5.1%]; P = .003 for trend). Compared with individuals without CHIP who had an intermediate diet, the rates of incident cardiovascular events progressively decreased among those with CHIP who had an unhealthy diet (hazard ratio [HR], 1.52; 95% CI, 1.04-2.22) and those with CHIP who had a healthy diet (HR, 0.99; 95% CI, 0.62-1.58) over a median of 10.0 years (interquartile range, 9.6-10.4 years) of follow-up.

Conclusions And Relevance: This cohort study suggests that an unhealthy diet quality may be associated with a higher prevalence of CHIP and higher rates of adverse cardiovascular events and death independent of CHIP status.

Source
DOI: http://dx.doi.org/10.1001/jamacardio.2021.1678
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8190703
June 2021

Germline variants at SOHLH2 influence multiple myeloma risk.

Blood Cancer J 2021 Apr 19;11(4):76. Epub 2021 Apr 19.

Hematology and Transfusion Medicine, Department of Laboratory Medicine, 221 84, Lund, Sweden.

Multiple myeloma (MM) is caused by the uncontrolled, clonal expansion of plasma cells. While there is epidemiological evidence for inherited susceptibility, the molecular basis remains incompletely understood. We report a genome-wide association study totalling 5,320 cases and 422,289 controls from four Nordic populations, and find a novel MM risk variant at SOHLH2 at 13q13.3 (risk allele frequency = 3.5%; odds ratio = 1.38; P = 2.2 × 10). This gene encodes a transcription factor involved in gametogenesis that is normally only weakly expressed in plasma cells. The association is represented by 14 variants in linkage disequilibrium. Among these, rs75712673 maps to a genomic region with open chromatin in plasma cells, and upregulates SOHLH2 in this cell type. Moreover, rs75712673 influences transcriptional activity in luciferase assays, and shows a chromatin looping interaction with the SOHLH2 promoter. Our work provides novel insight into MM susceptibility.

Source
DOI: http://dx.doi.org/10.1038/s41408-021-00468-6
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8055668
April 2021

Accelerating target deconvolution for therapeutic antibody candidates using highly parallelized genome editing.

Nat Commun 2021 Feb 24;12(1):1277. Epub 2021 Feb 24.

Department of Laboratory Medicine, Hematology and Transfusion Medicine, Lund, Sweden.

Therapeutic antibodies are transforming the treatment of cancer and autoimmune diseases. Today, a key challenge is finding antibodies against new targets. Phenotypic discovery promises to achieve this by enabling discovery of antibodies with therapeutic potential without specifying the molecular target a priori. Yet, deconvoluting the targets of phenotypically discovered antibodies remains a bottleneck; efficient deconvolution methods are needed for phenotypic discovery to reach its full potential. Here, we report a comprehensive investigation of a target deconvolution approach based on pooled CRISPR/Cas9. Applying this approach within three real-world phenotypic discovery programs, we rapidly deconvolute the targets of 38 of 39 test antibodies (97%), a success rate far higher than with existing approaches. Moreover, the approach scales well, requires much less work, and robustly identifies antibodies against the major histocompatibility complex. Our data establish CRISPR/Cas9 as a highly efficient target deconvolution approach, with immediate implications for the development of antibody-based drugs.

Source
DOI: http://dx.doi.org/10.1038/s41467-021-21518-4
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7904777
February 2021

Premature Menopause, Clonal Hematopoiesis, and Coronary Artery Disease in Postmenopausal Women.

Circulation 2021 Feb 9;143(5):410-423. Epub 2020 Nov 9.

Cardiology Division (M.C.H., J.P.P., P.N.), Massachusetts General Hospital, Harvard Medical School, Boston.

Background: Premature menopause is an independent risk factor for cardiovascular disease in women, but mechanisms underlying this association remain unclear. Clonal hematopoiesis of indeterminate potential (CHIP), the age-related expansion of hematopoietic cells with leukemogenic mutations without detectable malignancy, is associated with accelerated atherosclerosis. Whether premature menopause is associated with CHIP is unknown.

Methods: We included postmenopausal women from the UK Biobank (n=11 495) aged 40 to 70 years with whole exome sequences and from the Women's Health Initiative (n=8111) aged 50 to 79 years with whole genome sequences. Premature menopause was defined as natural or surgical menopause occurring before age 40 years. Co-primary outcomes were the presence of any CHIP and CHIP with variant allele frequency >0.1. Logistic regression tested the association of premature menopause with CHIP, adjusted for age, race, the first 10 principal components of ancestry, smoking, diabetes, and hormone therapy use. Secondary analyses considered natural versus surgical premature menopause and gene-specific CHIP subtypes. Multivariable-adjusted Cox models tested the association between CHIP and incident coronary artery disease.

Results: The sample included 19 606 women, including 418 (2.1%) with natural premature menopause and 887 (4.5%) with surgical premature menopause. Across cohorts, CHIP prevalence in postmenopausal women with versus without a history of premature menopause was 8.8% versus 5.5% (P<0.001), respectively. After multivariable adjustment, premature menopause was independently associated with CHIP (all CHIP: odds ratio, 1.36 [95% CI, 1.10-1.68]; P=0.004; CHIP with variant allele frequency >0.1: odds ratio, 1.40 [95% CI, 1.10-1.79]; P=0.007). Associations were larger for natural premature menopause (all CHIP: odds ratio, 1.73 [95% CI, 1.23-2.44]; P=0.001; CHIP with variant allele frequency >0.1: odds ratio, 1.91 [95% CI, 1.30-2.80]; P<0.001) but smaller and nonsignificant for surgical premature menopause. In gene-specific analyses, only CHIP was significantly associated with premature menopause. Among postmenopausal middle-aged women, CHIP was independently associated with incident coronary artery disease (hazard ratio associated with all CHIP: 1.36 [95% CI, 1.07-1.73]; P=0.012; hazard ratio associated with CHIP with variant allele frequency >0.1: 1.48 [95% CI, 1.13-1.94]; P=0.005).

Conclusions: Premature menopause, especially natural premature menopause, is independently associated with CHIP among postmenopausal women. Natural premature menopause may serve as a risk signal for predilection to develop CHIP and CHIP-associated cardiovascular disease.

Source
DOI: http://dx.doi.org/10.1161/CIRCULATIONAHA.120.051775
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7911856
February 2021

Correction: S100A6 is a critical regulator of hematopoietic stem cells.

Leukemia 2020 Dec;34(12):3439

Division of Molecular Medicine and Gene Therapy, Lund Stem Cell Center, Lund University Hospital, 22184, Lund, Sweden.

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

Source
DOI: http://dx.doi.org/10.1038/s41375-020-0971-1
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7962579
December 2020

S100A6 is a critical regulator of hematopoietic stem cells.

Leukemia 2020 Dec 19;34(12):3323-3337. Epub 2020 Jun 19.

Division of Molecular Medicine and Gene Therapy, Lund Stem Cell Center, Lund University Hospital, 22184, Lund, Sweden.

The fate options of hematopoietic stem cells (HSCs) include self-renewal, differentiation, migration, and apoptosis. HSCs self-renewal divisions in stem cells are required for rapid regeneration during tissue damage and stress, but how precisely intracellular calcium signals are regulated to maintain fate options in normal hematopoiesis is unclear. S100A6 knockout (KO) HSCs have reduced total cell numbers in the HSC compartment, decreased myeloid output, and increased apoptotic HSC numbers in steady state. S100A6KO HSCs had impaired self-renewal and regenerative capacity, not responding to 5-Fluorouracil. Our transcriptomic and proteomic profiling suggested that S100A6 is a critical HSC regulator. Intriguingly, S100A6KO HSCs showed decreased levels of phosphorylated Akt (p-Akt) and Hsp90, with an impairment of mitochondrial respiratory capacity and a reduction of mitochondrial calcium levels. We showed that S100A6 regulates intracellular and mitochondria calcium buffering of HSC upon cytokine stimulation and have demonstrated that Akt activator SC79 reverts the levels of intracellular and mitochondrial calcium in HSC. Hematopoietic colony-forming activity and the Hsp90 activity of S100A6KO are restored through activation of the Akt pathway. We show that p-Akt is the prime downstream mechanism of S100A6 in the regulation of HSC self-renewal by specifically governing mitochondrial metabolic function and Hsp90 protein quality.

Source
DOI: http://dx.doi.org/10.1038/s41375-020-0901-2
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7685984
December 2020

ProTstab - predictor for cellular protein stability.

BMC Genomics 2019 Nov 4;20(1):804. Epub 2019 Nov 4.

Department of Experimental Medical Science, BMC B13, Lund University, Lund, Sweden.

Background: Stability is one of the most fundamental intrinsic characteristics of proteins and can be determined with various methods. Characterization of protein properties does not keep pace with the increase in new sequence data, and therefore even basic properties are not known for the vast majority of identified proteins. There have been some attempts to develop predictors for protein stabilities; however, they have suffered from small numbers of known examples.

Results: We took advantage of results from a recently developed cellular stability method, which is based on limited proteolysis and mass spectrometry, and developed a machine learning method using gradient boosting of regression trees. The ProTstab method has high performance and is well suited for large-scale prediction of protein stabilities.

Conclusions: Pearson's correlation coefficient was 0.793 in 10-fold cross validation and 0.763 in an independent blind test. The corresponding values for mean absolute error are 0.024 and 0.036, respectively. Comparison with a previously published method indicated ProTstab to have superior performance. We used the method to predict stabilities of all the remaining proteins in the entire human proteome and then correlated the predicted stabilities to protein chain lengths of isoforms and to localizations of proteins.
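The two performance measures quoted above, Pearson's correlation coefficient and mean absolute error, can be computed as in the following sketch. It uses the textbook definitions only; the function names and the example values are hypothetical and are not ProTstab data.

```python
import math


def pearson_r(observed, predicted):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    return cov / math.sqrt(var_obs * var_pred)


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical measured vs. predicted stability values, for illustration only.
measured = [0.41, 0.47, 0.52, 0.58, 0.63]
predicted = [0.43, 0.45, 0.55, 0.56, 0.66]
print(pearson_r(measured, predicted), mean_absolute_error(measured, predicted))
```

Reporting both is useful because correlation measures how well the ranking and trend are captured, while mean absolute error measures how far predictions are from the measured values.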

Source
DOI: http://dx.doi.org/10.1186/s12864-019-6138-7
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6830000
November 2019

MPRAscore: robust and non-parametric analysis of massively parallel reporter assays.

Bioinformatics 2019 Dec;35(24):5351-5353

Department of Laboratory Medicine, Lund University, 221 84 Lund, Sweden.

Motivation: Massively parallel reporter assays (MPRA) enable systematic screening of DNA sequence variants for effects on transcriptional activity. However, convenient analysis tools are still needed.

Results: We introduce MPRAscore, a novel tool to infer allele-specific effects on transcription from MPRA data. MPRAscore uses a weighted, variance-regularized method to calculate variant effect sizes robustly, and a permutation approach to test for significance without assuming normality or independence.
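To illustrate the general idea of permutation-based significance testing mentioned above, here is a generic two-group label-shuffling test on a difference in means. It is explicitly not MPRAscore's algorithm (the tool is written in C++ and uses a weighted, variance-regularized effect size); the function name, parameters, and activity values are all hypothetical, and a plain label shuffle assumes exchangeability of observations, which may be a stronger assumption than the tool itself makes.

```python
import random


def permutation_p_value(ref_values, alt_values, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)

    def mean(values):
        return sum(values) / len(values)

    observed = abs(mean(alt_values) - mean(ref_values))
    pooled = list(ref_values) + list(alt_values)
    n_ref = len(ref_values)
    hits = 0
    for _ in range(n_permutations):
        # Shuffle group labels by shuffling the pooled values in place.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[n_ref:]) - mean(pooled[:n_ref]))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical reporter activities for reference and alternative alleles.
ref_activity = [0.0, 1.0, 2.0, 3.0, 4.0]
alt_activity = [10.0, 11.0, 12.0, 13.0, 14.0]
print(permutation_p_value(ref_activity, alt_activity))
```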

Availability And Implementation: Source code (C++), precompiled binaries and data used in the paper at https://github.com/abhisheknrl/MPRAscore and https://www.ncbi.nlm.nih.gov/bioproject/PRJNA554195.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/btz591
December 2019

How good are pathogenicity predictors in detecting benign variants?

PLoS Comput Biol 2019 Feb 11;15(2):e1006481. Epub 2019 Feb 11.

Protein Structure and Bioinformatics, Department of Experimental Medical Science, Lund University, Lund, Sweden.

Computational tools are widely used for interpreting variants detected in sequencing projects. The choice of these tools is critical for reliable variant impact interpretation for precision medicine and should be based on systematic performance assessment. The performance of the methods varies widely in different performance assessments, for example due to the contents and sizes of test datasets. To address this issue, we obtained 63,160 common amino acid substitutions (allele frequency ≥1% and <25%) from the Exome Aggregation Consortium (ExAC) database, which contains variants from 60,706 genomes or exomes. We evaluated the specificity, the capability to detect benign variants, for 10 variant interpretation tools. In addition to overall specificity of the tools, we tested their performance for variants in six geographical populations. PON-P2 had the best performance (95.5%) followed by FATHMM (86.4%) and VEST (83.5%). While these tools had excellent performance, the poorest method predicted more than one third of the benign variants to be disease-causing. The results allow choosing reliable methods for benign variant interpretation, for both research and clinical purposes, as well as provide a benchmark for method developers.
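Because the test set described above contains only variants assumed to be benign, specificity reduces to the fraction of them that a tool predicts as benign, that is TN / (TN + FP). A minimal sketch follows; the function name, label strings, and example predictions are hypothetical and do not reproduce the study's data.

```python
def specificity(predictions, benign_label="benign"):
    """Specificity on a benign-only test set: TN / (TN + FP).

    predictions: predicted labels for variants that are all known to be benign,
    so every non-benign prediction counts as a false positive.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    true_negatives = sum(1 for label in predictions if label == benign_label)
    return true_negatives / len(predictions)


# Hypothetical predictions for four benign variants.
print(specificity(["benign", "pathogenic", "benign", "benign"]))
```

A benign-only benchmark cannot say anything about sensitivity, so it complements rather than replaces evaluations on disease-causing variants.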

Source
DOI: http://dx.doi.org/10.1371/journal.pcbi.1006481
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6386394
February 2019

The multiple myeloma risk allele at 5q15 lowers ELL2 expression and increases ribosomal gene expression.

Nat Commun 2018 Apr 25;9(1):1649. Epub 2018 Apr 25.

Department of Laboratory Medicine, Hematology and Transfusion Medicine, SE 221 84, Lund, Sweden.

Recently, we identified ELL2 as a susceptibility gene for multiple myeloma (MM). To understand its mechanism of action, we performed expression quantitative trait locus analysis in CD138+ plasma cells from 1630 MM patients from four populations. We show that the MM risk allele lowers ELL2 expression in these cells (P = 2.5 × 10; β = -0.24 SD), but not in peripheral blood or other tissues. Consistent with this, several variants representing the MM risk allele map to regulatory genomic regions, and three yield reduced transcriptional activity in plasmocytoma cell lines. One of these (rs3777189-C) co-locates with the best-supported lead variants for ELL2 expression and MM risk, and reduces binding of MAFF/G/K family transcription factors. Moreover, further analysis reveals that the MM risk allele associates with upregulation of gene sets related to ribosome biogenesis, and knockout/knockdown and rescue experiments in plasmocytoma cell lines support a cause-effect relationship. Our results provide mechanistic insight into MM predisposition.

Source
DOI: http://dx.doi.org/10.1038/s41467-018-04082-2
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5917026
April 2018

PON-tstab: Protein Variant Stability Predictor. Importance of Training Data Quality.

Int J Mol Sci 2018 Mar 28;19(4). Epub 2018 Mar 28.

Department of Experimental Medical Science, BMC B13, Lund University, SE-22 184 Lund, Sweden.

Several methods have been developed to predict effects of amino acid substitutions on protein stability. Benchmark datasets are essential for method training and testing and have numerous requirements including that the data is representative for the investigated phenomenon. Available machine learning algorithms for variant stability have all been trained with ProTherm data. We noticed a number of issues with the contents, quality and relevance of the database. There were errors, but also features that had not been clearly communicated. Consequently, all machine learning variant stability predictors have been trained on biased and incorrect data. We obtained a corrected dataset and trained a random forests-based tool, PON-tstab, applicable to variants in any organism. Our results highlight the importance of the benchmark quality, suitability and appropriateness. Predictions are provided for three categories: stability decreasing, increasing and those not affecting stability.

Source
DOI: http://dx.doi.org/10.3390/ijms19041009
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5979465
March 2018

Direct evidence for a polygenic etiology in familial multiple myeloma.

Blood Adv 2017 Apr 7;1(10):619-623. Epub 2017 Apr 7.

Hematology and Transfusion Medicine, Department of Laboratory Medicine, Lund University, Lund, Sweden.

Although common risk alleles for multiple myeloma (MM) were recently identified, their contribution to familial MM is unknown. Analyzing 38 familial cases identified primarily by linking Swedish nationwide registries, we demonstrate an enrichment of common MM risk alleles in familial compared with 1530 sporadic cases (P = 4.8 × 10 and 6.0 × 10, respectively, for 2 different polygenic risk scores) and 10 171 population-based controls (P = 1.5 × 10 and 1.3 × 10, respectively). Using mixture modeling, we estimate that about one-third of familial cases result from such enrichments. Our results provide the first direct evidence for a polygenic etiology in a familial hematologic malignancy.
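The abstract does not define its two polygenic risk scores. As a rough illustration only, a commonly used form is a weighted sum of risk-allele counts, with the log odds ratio of each variant as its weight. The sketch below assumes that form; the function name, odds ratios, and genotype are hypothetical and are not taken from the study.

```python
import math


def polygenic_risk_score(risk_allele_counts, log_odds_ratios):
    """Weighted sum of risk-allele counts (0, 1 or 2 per variant)."""
    if len(risk_allele_counts) != len(log_odds_ratios):
        raise ValueError("one weight is required per variant")
    return sum(count * weight
               for count, weight in zip(risk_allele_counts, log_odds_ratios))


# Hypothetical per-variant odds ratios and one individual's genotype.
odds_ratios = [1.20, 1.35, 1.10]
weights = [math.log(odds_ratio) for odds_ratio in odds_ratios]
genotype = [2, 0, 1]
print(polygenic_risk_score(genotype, weights))
```

Comparing the distribution of such scores between familial cases, sporadic cases, and controls is one way an enrichment of common risk alleles could be tested.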

Source
DOI: http://dx.doi.org/10.1182/bloodadvances.2016003111
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5728350
April 2017

Working toward precision medicine: Predicting phenotypes from exomes in the Critical Assessment of Genome Interpretation (CAGI) challenges.

Hum Mutat 2017 Sep 7;38(9):1182-1192. Epub 2017 Jul 7.

Department of Information Engineering, University of Padova, Padova, Italy.

Precision medicine aims to predict a patient's disease risk and best therapeutic options by using that individual's genetic sequencing data. The Critical Assessment of Genome Interpretation (CAGI) is a community experiment consisting of genotype-phenotype prediction challenges; participants build models, undergo assessment, and share key findings. For CAGI 4, three challenges involved using exome-sequencing data: Crohn's disease, bipolar disorder, and warfarin dosing. Previous CAGI challenges included prior versions of the Crohn's disease challenge. Here, we discuss the range of techniques used for phenotype prediction as well as the methods used for assessing predictive models. Additionally, we outline some of the difficulties associated with making predictions and evaluating them. The lessons learned from the exome challenges can be applied to both research and clinical efforts to improve phenotype prediction from genotype. In addition, these challenges serve as a vehicle for sharing clinical and research exome data in a secure manner with scientists who have a broad range of expertise, contributing to a collaborative effort to advance our understanding of genotype-phenotype relationships.

Source
DOI: http://dx.doi.org/10.1002/humu.23280
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5600620
September 2017

Performance of in silico tools for the evaluation of p16INK4a (CDKN2A) variants in CAGI.

Hum Mutat 2017 Sep 16;38(9):1042-1050. Epub 2017 May 16.

Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, Maryland.

Correct phenotypic interpretation of variants of unknown significance for cancer-associated genes is a diagnostic challenge as genetic screenings gain in popularity in the next-generation sequencing era. The Critical Assessment of Genome Interpretation (CAGI) experiment aims to test and define the state of the art of genotype-phenotype interpretation. Here, we present the assessment of the CAGI p16INK4a challenge. Participants were asked to predict the effect on cellular proliferation of 10 variants for the p16INK4a tumor suppressor, a cyclin-dependent kinase inhibitor encoded by the CDKN2A gene. Twenty-two pathogenicity predictors were assessed with a variety of accuracy measures for reliability in a medical context. Different assessment measures were combined in an overall ranking to provide more robust results. The R scripts used for assessment are publicly available from a GitHub repository for future use in similar assessment exercises. Despite a limited test-set size, our findings show a variety of results, with some methods performing significantly better. Methods combining different strategies frequently outperform simpler approaches. The best predictor, Yang&Zhou lab, uses a machine learning method combining an empirical energy function measuring protein stability with an evolutionary conservation term. The p16INK4a challenge highlights how subtle structural effects can neutralize otherwise deleterious variants.

Source
DOI: http://dx.doi.org/10.1002/humu.23235
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5561474
September 2017

PON-P and PON-P2 predictor performance in CAGI challenges: Lessons learned.

Hum Mutat 2017 Sep 2;38(9):1085-1091. Epub 2017 May 2.

Protein Structure and Bioinformatics Group, Department of Experimental Medical Science, Lund University, Lund, Sweden.

Computational tools are widely used for ranking and prioritizing variants for characterizing their disease relevance. Since numerous tools have been developed, they have to be properly assessed before being applied. Critical Assessment of Genome Interpretation (CAGI) experiments have significantly contributed toward the assessment of prediction methods for various tasks. Within and outside the CAGI, we have addressed several questions that facilitate development and assessment of variation interpretation tools. These areas include collection and distribution of benchmark datasets, their use for systematic large-scale method assessment, and the development of guidelines for reporting methods and their performance. For us, CAGI has provided a chance to experiment with new ideas, test the application areas of our methods, and network with other prediction method developers. In this article, we discuss our experiences and lessons learned from the various CAGI challenges. We describe our approaches, their performance, and impact of CAGI on our research. Finally, we discuss some of the possibilities that CAGI experiments have opened up and make some suggestions for future experiments.

Source
DOI: http://dx.doi.org/10.1002/humu.23199
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5561442
September 2017

Predicting Severity of Disease-Causing Variants.

Hum Mutat 2017 Apr 24;38(4):357-364. Epub 2017 Jan 24.

Department of Experimental Medical Science, Lund University, Lund, SE-22184, Sweden.

Most diseases, including those of genetic origin, express a continuum of severity. Clinical interventions for numerous diseases are based on the severity of the phenotype. Predicting severity due to genetic variants could facilitate diagnosis and choice of therapy. Although computational predictions have been used as evidence for classifying the disease relevance of genetic variants, special tools for predicting disease severity in large scale are missing. Here, we manually curated a dataset containing variants leading to severe and less severe phenotypes and studied the abilities of variation impact predictors to distinguish between them. We found that these tools cannot separate the two groups of variants. Then, we developed a novel machine-learning-based method, PON-PS (http://structure.bmc.lu.se/PON-PS), for the classification of amino acid substitutions associated with benign, severe, and less severe phenotypes. We tested the method using an independent test dataset and variants in four additional proteins. For distinguishing severe and nonsevere variants, PON-PS showed an accuracy of 61% in the test dataset, which is higher than for existing tolerance prediction methods. PON-PS is the first generic tool developed for this task. The tool can be used together with other evidence for improving diagnosis and prognosis and for prioritization of preventive interventions, clinical monitoring, and molecular tests.

Source
DOI: http://dx.doi.org/10.1002/humu.23173
April 2017

PON-Sol: prediction of effects of amino acid substitutions on protein solubility.

Bioinformatics 2016 07 19;32(13):2032-4. Epub 2016 Feb 19.

Department of Experimental Medical Science, Lund University, Lund SE 221 84, Sweden.

Motivation: Solubility is one of the fundamental protein properties. It is of great interest because of its relevance to protein expression. Reduced solubility and protein aggregation are also associated with many diseases.

Results: We collected from the literature the largest dataset of experimentally verified solubility-affecting amino acid substitutions (AASs) and used it to train a predictor called PON-Sol. The predictor can distinguish both solubility-decreasing and solubility-increasing variants from those not affecting solubility. PON-Sol has a normalized correct prediction ratio of 0.491 in cross-validation and 0.432 on an independent test set. The performance of the method was compared to both solubility and aggregation predictors and found to be superior. PON-Sol can be used for the prediction of effects of disease-related substitutions, effects on heterologous recombinant protein expression and enhanced crystallizability. One application is to investigate the effects of all possible AASs in a protein to aid protein engineering.

Availability and Implementation: PON-Sol is freely available at http://structure.bmc.lu.se/PON-Sol. The training and test data are available at http://structure.bmc.lu.se/VariBench/ponsol.php.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/btw066
July 2016

Variation Interpretation Predictors: Principles, Types, Performance, and Choice.

Hum Mutat 2016 06 15;37(6):579-97. Epub 2016 Apr 15.

Department of Experimental Medical Science, Lund University, BMC B13, Lund, SE-22184, Sweden.

Next-generation sequencing methods have revolutionized the speed of generating variation information. Sequence data have a plethora of applications and will increasingly be used for disease diagnosis. Interpretation of the identified variants is usually not possible with experimental methods. This has caused a bottleneck that many computational methods aim to address. Fast and efficient methods for explaining the significance and mechanisms of detected variants are required for efficient precision/personalized medicine. Computational prediction methods have been developed in three areas to address the issue. There are generic tolerance (pathogenicity) predictors for filtering harmful variants. Gene/protein/disease-specific tools are available for some applications. Mechanism- and effect-specific computer programs aim at explaining the consequences of variations. Here, we discuss the different types of predictors and their applications. We review available variation databases and prediction methods useful for variation interpretation. We discuss how the performance of methods is assessed and summarize existing assessment studies. A brief introduction is provided to the principles of the methods developed for variation interpretation, together with guidelines for choosing the optimal tools and an outlook on where the field is heading.

Source
DOI: http://dx.doi.org/10.1002/humu.22987
June 2016

PON-mt-tRNA: a multifactorial probability-based method for classification of mitochondrial tRNA variations.

Nucleic Acids Res 2016 Mar 3;44(5):2020-7. Epub 2016 Feb 3.

Department of Experimental Medical Science, Lund University, BMC B13, SE-22184 Lund, Sweden

Transfer RNAs (tRNAs) are essential for encoding the transcribed genetic information from DNA into proteins. Variations in the human tRNAs are involved in diverse clinical phenotypes. Interestingly, all pathogenic variations in tRNAs are located in mitochondrial tRNAs (mt-tRNAs). Therefore, it is crucial to identify pathogenic variations in mt-tRNAs for disease diagnosis and proper treatment. We collected mt-tRNA variations using a classification based on evidence from several sources and used the data to develop a multifactorial probability-based prediction method, PON-mt-tRNA, for classification of mt-tRNA single nucleotide substitutions. We integrated a machine learning-based predictor and an evidence-based likelihood ratio for pathogenicity using evidence of segregation, biochemistry and histochemistry to predict the posterior probability of pathogenicity of variants. The accuracy and Matthews correlation coefficient (MCC) of PON-mt-tRNA are 1.00 and 0.99, respectively. In the absence of evidence from segregation, biochemistry and histochemistry, PON-mt-tRNA classifies variations based on the machine learning method with an accuracy and MCC of 0.69 and 0.39, respectively. We classified all possible single nucleotide substitutions in all human mt-tRNAs using PON-mt-tRNA. The variations in the loops are more often tolerated compared to the variations in stems. The anticodon loop contains comparatively more predicted pathogenic variations than the other loops. PON-mt-tRNA is available at http://structure.bmc.lu.se/PON-mt-tRNA/.
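
The combination of a predictor output with an evidence-based likelihood ratio can be illustrated with Bayes' rule in odds form. The following is only a minimal sketch, not the published PON-mt-tRNA implementation: it assumes that the machine learning probability is used as the prior and that the likelihood ratios for the separate evidence types can be multiplied as if independent. The function names `posterior_probability` and `combined_likelihood_ratio` are hypothetical.

```python
from typing import Iterable


def combined_likelihood_ratio(ratios: Iterable[float]) -> float:
    """Multiply per-evidence likelihood ratios.

    Assumption: the evidence types (for example segregation, biochemistry,
    histochemistry) are treated as independent.
    """
    combined = 1.0
    for ratio in ratios:
        if ratio < 0.0:
            raise ValueError("likelihood ratios must be non-negative")
        combined *= ratio
    return combined


def posterior_probability(prior: float, likelihood_ratio: float) -> float:
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio.

    Assumption: `prior` is the probability of pathogenicity produced by a
    machine learning predictor.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood_ratio must be non-negative")
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)
```

With a likelihood ratio of 1 (no additional evidence) the posterior equals the prior, which mirrors the fallback to the machine learning prediction described in the abstract.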

Source
DOI: http://dx.doi.org/10.1093/nar/gkw046
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4797295
March 2016

Classification of Amino Acid Substitutions in Mismatch Repair Proteins Using PON-MMR2.

Hum Mutat 2015 Dec 22;36(12):1128-34. Epub 2015 Sep 22.

Department of Experimental Medical Science, Lund University, BMC B13, Lund, SE-22184, Sweden.

Variations in mismatch repair (MMR) system genes are causative of Lynch syndrome and other cancers. Thousands of variants have been identified in MMR genes, but the clinical relevance is known for only a small proportion. Recently, the InSiGHT group classified 2,360 MMR variants into five classes. One-third of the variants, the majority of which are nonsynonymous, remain of uncertain clinical relevance. Computational tools can be used to prioritize variants for disease relevance investigations. Previously, we classified 248 MMR variants as likely pathogenic and likely benign using PON-MMR. We have developed a novel tool, PON-MMR2, which is trained on a larger and more reliable dataset. In performance comparison, PON-MMR2 outperforms both generic tolerance prediction methods and methods optimized for MMR variants. It achieves accuracy and MCC of 0.89 and 0.78, respectively, in cross-validation and 0.86 and 0.69, respectively, on an independent test dataset. We classified 354 class 3 variants in the InSiGHT database as well as all possible amino acid substitutions in four MMR proteins. Likely harmful variants mainly appear in the protein core, whereas likely benign variants are on the surface. PON-MMR2 is a highly reliable tool to prioritize variants for functional analysis. It is freely available at http://structure.bmc.lu.se/PON-MMR2/.
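
For readers unfamiliar with the two performance measures reported here, the following minimal sketch shows how accuracy and the Matthews correlation coefficient (MCC) are computed from a binary confusion matrix. The function names and the example counts are hypothetical and are not taken from the study.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary classifier.

    Returns 0.0 when any marginal total is zero, a common convention for
    the otherwise undefined case.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts for illustration only.
print(accuracy(45, 45, 5, 5), mcc(45, 45, 5, 5))
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is why it is generally considered more informative on imbalanced datasets.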

Source
DOI: http://dx.doi.org/10.1002/humu.22900
December 2015

Harmful somatic amino acid substitutions affect key pathways in cancers.

BMC Med Genomics 2015 Aug 19;8:53. Epub 2015 Aug 19.

Department of Experimental Medical Science, Lund University, BMC B13, SE-22184, Lund, Sweden.

Background: Cancer is characterized by the accumulation of large numbers of genetic variations and alterations of multiple biological phenomena. Cancer genomics has largely focused on the identification of such genetic alterations and the genes containing them, known as 'cancer genes'. However, the non-functional somatic variations outnumber functional variations and remain a major challenge. Recurrent somatic variations are thought to be cancer drivers, but they are present in only a small fraction of patients.

Methods: We performed an extensive analysis of amino acid substitutions (AASs) from 6,861 cancer samples (whole genome or exome sequences) classified into 30 cancer types and performed pathway enrichment analysis. We also studied the overlap between the cancers based on proteins containing harmful AASs and pathways affected by them.

Results: We found that only a fraction of AASs (39.88%) are harmful even in known cancer genes. In addition, we found that proteins containing harmful AASs in cancers are often centrally located in protein interaction networks. Based on the proteins containing harmful AASs, we identified significantly affected pathways in 28 cancer types and show that proteins containing harmful AASs can affect pathways regardless of the frequency of AASs in them. Our cross-cancer overlap analysis showed that it would be more beneficial to identify affected pathways in cancers rather than individual genes and variations.

Conclusion: Pathways affected by harmful AASs reveal key processes involved in cancer development. Our approach filters out the putative benign AASs, thus reducing the list of cancer variations and allowing reliable identification of affected pathways. The pathways identified in individual cancers and the overlap between cancer types open avenues for further experimental research and for developing targeted therapies and interventions.
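
The abstract does not state which statistical test was used for pathway enrichment. As an illustration only, the sketch below shows one widely used approach, an over-representation test based on the hypergeometric distribution; it should not be read as the authors' method. The function name `enrichment_p_value` and all parameter names are hypothetical.

```python
from math import comb


def enrichment_p_value(population: int, pathway_size: int,
                       sample_size: int, overlap: int) -> float:
    """One-sided hypergeometric tail probability P(X >= overlap).

    population   -- number of background proteins
    pathway_size -- background proteins annotated to the pathway
    sample_size  -- proteins of interest (e.g. those with harmful AASs)
    overlap      -- proteins of interest that fall in the pathway
    """
    if not 0 <= pathway_size <= population:
        raise ValueError("pathway_size must lie between 0 and population")
    if not 0 <= sample_size <= population:
        raise ValueError("sample_size must lie between 0 and population")
    max_overlap = min(pathway_size, sample_size)
    if not 0 <= overlap <= max_overlap:
        raise ValueError("overlap is not possible for the given sizes")
    total = comb(population, sample_size)
    # Sum the probabilities of observing `overlap` or more pathway members.
    tail = sum(
        comb(pathway_size, k) * comb(population - pathway_size, sample_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total
```

In practice the resulting p-values would also need correction for multiple testing across pathways, which this sketch leaves out.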

Source
DOI: http://dx.doi.org/10.1186/s12920-015-0125-x
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4539680
August 2015

PON-P2: prediction method for fast and reliable identification of harmful variants.

PLoS One 2015 3;10(2):e0117380. Epub 2015 Feb 3.

Department of Experimental Medical Science, Lund University, Lund, Sweden.

More reliable and faster prediction methods are needed to interpret the enormous amounts of data generated by sequencing and genome projects. We have developed a new computational tool, PON-P2, for classification of amino acid substitutions in human proteins. The method is a machine learning-based classifier and groups the variants into pathogenic, neutral and unknown classes on the basis of a random forest probability score. PON-P2 is trained using pathogenic and neutral variants obtained from VariBench, a database for benchmark variation datasets. PON-P2 utilizes information about evolutionary conservation of sequences, physical and biochemical properties of amino acids, GO annotations and, if available, functional annotations of variation sites. Extensive feature selection was performed to identify 8 informative features among altogether 622 features. PON-P2 consistently showed superior performance in comparison to existing state-of-the-art tools. In a 10-fold cross-validation test, its accuracy and MCC are 0.90 and 0.80, respectively, and in the independent test, they are 0.86 and 0.71, respectively. The coverage of PON-P2 is 61.7% in the 10-fold cross-validation and 62.1% in the test dataset. PON-P2 is a powerful tool for screening harmful variants and for ranking and prioritizing experimental characterization. It is very fast, making it capable of analyzing large variant datasets. PON-P2 is freely available at http://structure.bmc.lu.se/PON-P2/.
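
The idea of a three-class output with limited coverage can be sketched as follows. This is an assumption-laden illustration, not PON-P2's actual decision rule, which the abstract does not describe: the fixed cut-offs `lower` and `upper` are hypothetical, and coverage is taken here to mean the fraction of variants assigned to a class other than unknown.

```python
from typing import Iterable


def classify(score: float, lower: float = 0.3, upper: float = 0.7) -> str:
    """Map a probability score to one of three classes.

    Assumption: scores near 0 suggest neutral and scores near 1 suggest
    pathogenic; the cut-offs are hypothetical placeholders.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score >= upper:
        return "pathogenic"
    if score <= lower:
        return "neutral"
    return "unknown"


def coverage(labels: Iterable[str]) -> float:
    """Fraction of variants that received a pathogenic or neutral label."""
    labels = list(labels)
    if not labels:
        raise ValueError("no labels given")
    classified = sum(1 for label in labels if label != "unknown")
    return classified / len(labels)


# Hypothetical scores for illustration only.
labels = [classify(s) for s in (0.95, 0.50, 0.10, 0.70)]
print(labels, coverage(labels))
```

Leaving uncertain cases unclassified trades coverage for reliability: accuracy and MCC are then reported only for the variants that received a class.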

Source
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0117380
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4315405
January 2016