Publications by authors named "Georgios V Gkoutos"

86 Publications

Improved characterisation of clinical text through ontology-based vocabulary expansion.

J Biomed Semantics 2021 Apr 12;12(1). Epub 2021 Apr 12.

Institute of Cancer and Genomic Sciences, College of Medical and Dental Sciences, University of Birmingham, Birmingham, B15 2TT, UK.

Background: Biomedical ontologies contain a wealth of metadata that constitutes a fundamental infrastructural resource for text mining. For several reasons, redundancies exist in the ontology ecosystem, which lead to the same entities being described by several concepts in the same or similar contexts across several ontologies. While these concepts describe the same entities, they contain different sets of complementary metadata. Linking these definitions to make use of their combined metadata could lead to improved performance in ontology-based information retrieval, extraction, and analysis tasks.

Results: We develop and present an algorithm that expands the set of labels associated with an ontology class using a combination of strict lexical matching and cross-ontology reasoner-enabled equivalency queries. Across all disease terms in the Disease Ontology, the approach found 51,362 additional labels, more than tripling the number defined by the ontology itself. Manual validation by a clinical expert on a random sample of expanded synonyms over the Human Phenotype Ontology yielded a precision of 0.912. Furthermore, we found that annotating patient visits in MIMIC-III with an extended set of Disease Ontology labels led to semantic similarity scores derived from those labels being a significantly better predictor of matching first diagnosis, with a mean average precision of 0.88 for the unexpanded set of annotations and 0.913 for the expanded set.

Conclusions: Inter-ontology synonym expansion can lead to a vast increase in the scale of vocabulary available for text mining applications. While the accuracy of the extended vocabulary is not perfect, it nevertheless led to a significantly improved ontology-based characterisation of patients from text in one setting. Furthermore, where run-on error is not acceptable, the technique can be used to provide candidate synonyms which can be checked by a domain expert.
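
The lexical-matching half of the label expansion described in this abstract can be sketched as follows. This is a minimal illustration only: the class identifiers, labels and the function names `normalise` and `expand_labels` are hypothetical, the matching is a single pass, and the reasoner-enabled equivalency queries mentioned in the abstract are not modelled.

```python
from collections import defaultdict


def normalise(label: str) -> str:
    """Lower-case and collapse whitespace so matching is strict but case-insensitive."""
    return " ".join(label.lower().split())


def expand_labels(target, others):
    """Add labels from classes in other ontologies that share a label with a target class.

    target: dict mapping class id -> set of labels (the ontology being expanded)
    others: list of dicts of the same shape (the ontologies used as label sources)
    """
    # Index every class in the other ontologies by each of its normalised labels.
    index = defaultdict(list)
    for ontology in others:
        for labels in ontology.values():
            normalised = {normalise(label) for label in labels}
            for label in normalised:
                index[label].append(normalised)

    expanded = {}
    for class_id, labels in target.items():
        result = {normalise(label) for label in labels}
        # Single pass: only labels already on the target class are used as keys.
        for label in list(result):
            for matched in index.get(label, []):
                result |= matched
        expanded[class_id] = result
    return expanded


# Toy data (assumption): identifiers and labels are invented for illustration.
target = {"TOY:1": {"Heart attack", "myocardial infarction"}}
others = [
    {"X:1": {"Myocardial Infarction", "MI"}},
    {"Y:7": {"heart attack", "cardiac infarction"}, "Y:8": {"stroke"}},
]
expanded = expand_labels(target, others)
print(sorted(expanded["TOY:1"]))
```

Because lexical matches can be wrong, candidate labels produced this way could be passed to a domain expert for checking, as the conclusions of the abstract suggest.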

Source
http://dx.doi.org/10.1186/s13326-021-00241-5
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8042947
April 2021

Towards similarity-based differential diagnostics for common diseases.

Comput Biol Med 2021 Apr 1;133:104360. Epub 2021 Apr 1.

College of Medical and Dental Sciences, Institute of Cancer and Genomic Sciences, University of Birmingham, UK; Institute of Translational Medicine, University Hospitals Birmingham, NHS Foundation Trust, UK; NIHR Experimental Cancer Medicine Centre, UK; NIHR Surgical Reconstruction and Microbiology Research Centre, UK; NIHR Biomedical Research Centre, UK; MRC Health Data Research UK (HDR UK) Midlands, UK; University Hospitals Birmingham NHS Foundation Trust, Edgbaston, Birmingham, UK.

Ontology-based phenotype profiles have been utilised for the purpose of differential diagnosis of rare genetic diseases, and for decision support in specific disease domains. Particularly, semantic similarity facilitates diagnostic hypothesis generation through comparison with disease phenotype profiles. However, the approach has not been applied for differential diagnosis of common diseases, or generalised clinical diagnostics from uncurated text-derived phenotypes. In this work, we describe the development of an approach for deriving patient phenotype profiles from clinical narrative text, and apply this to text associated with MIMIC-III patient visits. We then explore the use of semantic similarity with those text-derived phenotypes to classify primary patient diagnosis, comparing the use of patient-patient similarity and patient-disease similarity using phenotype-disease profiles previously mined from literature. We also consider a combined approach, in which literature-derived phenotypes are extended with the content of text-derived phenotypes we mined from 500 patients. The results reveal a powerful approach, showing that in one setting, uncurated text phenotypes can be used for differential diagnosis of common diseases, making use of information both inside and outside the setting. While the methods themselves should be explored for further optimisation, they could be applied to a variety of clinical tasks, such as differential diagnosis, cohort discovery, document and text classification, and outcome prediction.
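
A small sketch of patient-disease ranking by semantic similarity is given below. The abstract does not state which similarity measure was used, so this example assumes a simple Jaccard index over ancestor-closed term sets; the term names, the toy hierarchy and the disease profiles are all invented for illustration.

```python
def ancestors(term, parents):
    """Return the term together with all of its ancestors in the hierarchy."""
    seen, stack = set(), [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def closure(terms, parents):
    """Union of the ancestor sets of every term in a profile."""
    closed = set()
    for term in terms:
        closed |= ancestors(term, parents)
    return closed


def jaccard(a, b):
    """Size of the intersection divided by size of the union (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_diseases(patient_terms, disease_profiles, parents):
    """Score each disease profile against the patient profile, best match first."""
    patient = closure(patient_terms, parents)
    scores = {
        disease: jaccard(patient, closure(terms, parents))
        for disease, terms in disease_profiles.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy hierarchy and profiles (assumption): child term -> list of parent terms.
parents = {
    "chest pain": ["pain"],
    "dyspnoea": ["respiratory sign"],
    "cough": ["respiratory sign"],
    "pain": ["phenotype"],
    "respiratory sign": ["phenotype"],
}
profiles = {"disease A": {"chest pain", "dyspnoea"}, "disease B": {"cough"}}
ranking = rank_diseases({"chest pain", "cough"}, profiles, parents)
print(ranking)
```

The same scoring function could be applied patient-to-patient by passing other patients' term sets in place of the disease profiles.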

Source
http://dx.doi.org/10.1016/j.compbiomed.2021.104360
April 2021

The Role of Autophagy and lncRNAs in the Maintenance of Cancer Stem Cells.

Cancers (Basel) 2021 Mar 11;13(6). Epub 2021 Mar 11.

Division of Cellular and Molecular Pathology, Department of Pathology, University of Cambridge, Cambridge CB2 0QQ, UK.

Cancer stem cells (CSCs) possess properties such as self-renewal, resistance to apoptotic cues, quiescence, and DNA-damage repair capacity. Moreover, CSCs strongly influence the tumour microenvironment (TME) and may account for cancer progression, recurrence, and relapse. CSCs represent a distinct subpopulation in tumours and the detection, characterisation, and understanding of the regulatory landscape and cellular processes that govern their maintenance may pave the way to improving prognosis, selective targeted therapy, and therapy outcomes. In this review, we have discussed the characteristics of CSCs identified in various cancer types and the role of autophagy and long noncoding RNAs (lncRNAs) in maintaining the homeostasis of CSCs. Further, we have discussed methods to detect CSCs and strategies for treatment and relapse, taking into account the requirement to inhibit CSC growth and survival within the complex backdrop of cellular processes, microenvironmental interactions, and regulatory networks associated with cancer. Finally, we critique the computationally reinforced triangle of factors inclusive of CSC properties, the process of autophagy, and lncRNA and their associated networks with respect to hypoxia, epithelial-to-mesenchymal transition (EMT), and signalling pathways.

Source
http://dx.doi.org/10.3390/cancers13061239
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7998932
March 2021

Improving the diagnosis of heart failure in patients with atrial fibrillation.

Heart 2021 Jun 10;107(11):902-908. Epub 2021 Mar 10.

Institute of Cardiovascular Sciences, University of Birmingham, Birmingham, UK

Objective: To improve the echocardiographic assessment of heart failure in patients with atrial fibrillation (AF) by comparing conventional averaging of consecutive beats with an index-beat approach, whereby measurements are taken after two cycles with similar R-R interval.

Methods: Transthoracic echocardiography was performed using a standardised and blinded protocol in patients enrolled in the RATE-AF (RAte control Therapy Evaluation in permanent Atrial Fibrillation) randomised trial. We compared reproducibility of the index-beat and conventional consecutive-beat methods to calculate left ventricular ejection fraction (LVEF), global longitudinal strain (GLS) and E/e' (mitral E wave max/average diastolic tissue Doppler velocity), and assessed intraoperator/interoperator variability, time efficiency and validity against natriuretic peptides.

Results: 160 patients were included, 46% of whom were women, with a median age of 75 years (IQR 69-82) and a median heart rate of 100 beats per minute (IQR 86-112). The index-beat had the lowest within-beat coefficient of variation for LVEF (32%, vs 51% for 5 consecutive beats and 53% for 10 consecutive beats), GLS (26%, vs 43% and 42%) and E/e' (25%, vs 41% and 41%). Intraoperator (n=50) and interoperator (n=18) reproducibility were both superior for index-beats and this method was quicker to perform (p<0.001): 35.4 s to measure E/e' (95% CI 33.1 to 37.8) compared with 44.7 s for 5-beat (95% CI 41.8 to 47.5) and 98.1 s for 10-beat (95% CI 91.7 to 104.4) analyses. Using a single index-beat did not compromise the association of LVEF, GLS or E/e' with natriuretic peptide levels.

Conclusions: Compared with averaging of multiple beats in patients with AF, the index-beat approach improves reproducibility and saves time without a negative impact on validity, potentially improving the diagnosis and classification of heart failure in patients with AF.
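
For readers unfamiliar with the coefficient of variation quoted in the results, a minimal sketch follows. It assumes the usual definition (sample standard deviation as a percentage of the mean); the abstract does not describe exactly how the within-beat values were derived, and the readings below are invented.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation expressed as a percentage of the mean."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical repeated measurements of one parameter (not data from the study).
readings = [50, 55, 45, 60, 40]
cv = coefficient_of_variation(readings)
print(f"{cv:.1f}%")
```

A lower value indicates that repeated measurements are more tightly clustered around their mean.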

Source
http://dx.doi.org/10.1136/heartjnl-2020-318557
June 2021

Quantification of fibroblast growth factor 23 and N-terminal pro-B-type natriuretic peptide to identify patients with atrial fibrillation using a high-throughput platform: A validation study.

PLoS Med 2021 Feb 3;18(2):e1003405. Epub 2021 Feb 3.

Institute of Cardiovascular Sciences, University of Birmingham, Birmingham, United Kingdom.

Background: Large-scale screening for atrial fibrillation (AF) requires reliable methods to identify at-risk populations. Using an experimental semi-quantitative biomarker assay, B-type natriuretic peptide (BNP) and fibroblast growth factor 23 (FGF23) were recently identified as the most suitable biomarkers for detecting AF in combination with simple morphometric parameters (age, sex, and body mass index [BMI]). In this study, we validated the AF model using standardised, high-throughput, high-sensitivity biomarker assays.

Methods and Findings: For this study, 1,625 consecutive patients with either (1) diagnosed AF or (2) sinus rhythm with a CHA2DS2-VASc score of 2 or more were recruited from a large teaching hospital in Birmingham, West Midlands, UK, between September 2014 and February 2018. Seven-day ambulatory ECG monitoring excluded silent AF. Patients with tachyarrhythmias apart from AF and incomplete cases were excluded. AF was diagnosed according to current clinical guidelines and confirmed by ECG. We developed a high-throughput, high-sensitivity assay for FGF23, quantified plasma N-terminal pro-B-type natriuretic peptide (NT-proBNP) and FGF23, and compared results to the previously used multibiomarker research assay. Data were fitted to the previously derived model, adjusting for differences in measurement platforms and known confounders (heart failure and chronic kidney disease). In 1,084 patients (46% with AF; median [Q1, Q3] age 70 [60, 78] years, median [Q1, Q3] BMI 28.8 [25.1, 32.8] kg/m², 59% males), patients with AF had higher concentrations of NT-proBNP (median [Q1, Q3] per 100 pg/ml: with AF 12.00 [4.19, 30.15], without AF 4.25 [1.17, 15.70]; p < 0.001) and FGF23 (median [Q1, Q3] per 100 pg/ml: with AF 1.93 [1.30, 4.16], without AF 1.55 [1.04, 2.62]; p < 0.001). Univariate associations remained after adjusting for heart failure and estimated glomerular filtration rate, known confounders of NT-proBNP and FGF23. The fitted model yielded a C-statistic of 0.688 (95% CI 0.656, 0.719), almost identical to that of the derived model (C-statistic 0.691; 95% CI 0.638, 0.744). The key limitation is that this validation was performed in a cohort that is very similar demographically to the one used in model development, calling for further external validation.

Conclusions: Age, sex, and BMI combined with elevated NT-proBNP and elevated FGF23, quantified on a high-throughput platform, reliably identify patients with AF.

Trial Registration: Registry IRAS ID 97753 Health Research Authority (HRA), United Kingdom.
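
The C-statistic reported in this abstract measures discrimination. As a rough sketch of what the number means, the function below computes it by direct pairwise comparison for a binary outcome; the scores and labels are invented and the function name is an assumption, not taken from the study's code.

```python
def c_statistic(scores, labels):
    """Proportion of (case, non-case) pairs in which the case has the higher score.

    Ties between a case and a non-case count as half a concordant pair.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0
            elif case_score == control_score:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Hypothetical model outputs and outcomes (1 = condition present, 0 = absent).
scores = [0.9, 0.4, 0.6, 0.3, 0.6]
labels = [1, 1, 0, 0, 1]
c = c_statistic(scores, labels)
print(c)
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from non-cases.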

Source
http://dx.doi.org/10.1371/journal.pmed.1003405
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7857735
February 2021

A fast, accurate, and generalisable heuristic-based negation detection algorithm for clinical text.

Comput Biol Med 2021 Mar 16;130:104216. Epub 2021 Jan 16.

College of Medical and Dental Sciences, Institute of Cancer and Genomic Sciences, University of Birmingham, UK; Institute of Translational Medicine, University Hospitals Birmingham, NHS Foundation Trust, UK; NIHR Experimental Cancer Medicine Centre, UK; NIHR Surgical Reconstruction and Microbiology Research Centre, UK; NIHR Biomedical Research Centre, UK; MRC Health Data Research UK (HDR UK) Midlands, UK; University Hospitals Birmingham NHS Foundation Trust, Edgbaston, Birmingham, UK.

Negation detection is an important task in biomedical text mining. Particularly in clinical settings, it is of critical importance to determine whether findings mentioned in text are present or absent. Rule-based negation detection algorithms are a common approach to the task, and more recent investigations have resulted in the development of rule-based systems utilising the rich grammatical information afforded by typed dependency graphs. However, interacting with these complex representations inevitably necessitates complex rules, which are time-consuming to develop and do not generalise well. We hypothesise that a heuristic approach to determining negation via dependency graphs could offer a powerful alternative. We describe and implement an algorithm for negation detection based on grammatical distance from a negatory construct in a typed dependency graph. To evaluate the algorithm, we develop two testing corpora comprised of sentences of clinical text extracted from the MIMIC-III database and documents related to hypertrophic cardiomyopathy patients routinely collected at University Hospitals Birmingham NHS Trust. Gold-standard validation datasets were built by a combination of human annotation and examination of algorithm error. Finally, we compare the performance of our approach with four other rule-based algorithms on both gold-standard corpora. The presented algorithm exhibits the best performance by f-measure over the MIMIC-III dataset, and a similar performance to the syntactic negation detection systems over the HCM dataset. It is also the fastest of the dependency-based negation systems explored in this study. Our results show that while a single heuristic approach to dependency-based negation detection is ignorant of certain advanced cases, it nevertheless forms a powerful and stable method, requiring minimal training and adaptation between datasets. As such, it could present a drop-in replacement or augmentation for many-rule negation approaches in clinical text-mining pipelines, particularly for cases where adaptation and rule development are not required or possible.
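
The core idea of the abstract, deciding negation by grammatical distance from a negatory construct in a dependency graph, can be sketched as below. This is an illustration under stated assumptions, not the published algorithm: the dependency edges are written by hand for the sentence "No evidence of pneumonia but patient has fever" rather than produced by a parser, and the cue list and the distance threshold of 2 are hypothetical.

```python
from collections import deque

# Illustrative cue list (assumption); a real system would use a curated lexicon.
NEGATION_CUES = {"no", "not", "without", "denies"}


def distances_from(start, edges):
    """Breadth-first search over an undirected view of the dependency graph."""
    adjacency = {}
    for head, dependent in edges:
        adjacency.setdefault(head, set()).add(dependent)
        adjacency.setdefault(dependent, set()).add(head)
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return distance


def is_negated(concept, edges, max_distance=2):
    """True if any negation cue lies within max_distance edges of the concept token."""
    tokens = {token for edge in edges for token in edge}
    for cue in tokens & NEGATION_CUES:
        if distances_from(cue, edges).get(concept, max_distance + 1) <= max_distance:
            return True
    return False


# Hand-written (head, dependent) pairs for the example sentence (assumption).
EDGES = [
    ("has", "patient"),
    ("has", "fever"),
    ("has", "evidence"),
    ("evidence", "no"),
    ("evidence", "pneumonia"),
]
print(is_negated("pneumonia", EDGES), is_negated("fever", EDGES))
```

In this toy graph "pneumonia" sits two edges from the cue "no" and is flagged as negated, while "fever" sits three edges away and is not.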

Source
http://dx.doi.org/10.1016/j.compbiomed.2021.104216
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7910278
March 2021

Evaluation and improvement of the National Early Warning Score (NEWS2) for COVID-19: a multi-hospital study.

BMC Med 2021 Jan 21;19(1):23. Epub 2021 Jan 21.

Department of Acute Medicine, Oslo University Hospital and Institute of Clinical Medicine, University of Oslo, Oslo, Norway.

Background: The National Early Warning Score (NEWS2) is currently recommended in the UK for the risk stratification of COVID-19 patients, but little is known about its ability to detect severe cases. We aimed to evaluate NEWS2 for the prediction of severe COVID-19 outcome and identify and validate a set of blood and physiological parameters routinely collected at hospital admission to improve upon the use of NEWS2 alone for medium-term risk stratification.

Methods: Training cohorts comprised 1276 patients admitted to King's College Hospital National Health Service (NHS) Foundation Trust with COVID-19 disease from 1 March to 30 April 2020. External validation cohorts included 6237 patients from five UK NHS Trusts (Guy's and St Thomas' Hospitals, University Hospitals Southampton, University Hospitals Bristol and Weston NHS Foundation Trust, University College London Hospitals, University Hospitals Birmingham), one hospital in Norway (Oslo University Hospital), and two hospitals in Wuhan, China (Wuhan Sixth Hospital and Taikang Tongji Hospital). The outcome was severe COVID-19 disease (transfer to intensive care unit (ICU) or death) at 14 days after hospital admission. Age, physiological measures, blood biomarkers, sex, ethnicity, and comorbidities (hypertension, diabetes, cardiovascular, respiratory and kidney diseases) measured at hospital admission were considered in the models.

Results: A baseline model of 'NEWS2 + age' had poor-to-moderate discrimination for severe COVID-19 infection at 14 days (area under receiver operating characteristic curve (AUC) in training cohort = 0.700, 95% confidence interval (CI) 0.680, 0.722; Brier score = 0.192, 95% CI 0.186, 0.197). A supplemented model adding eight routinely collected blood and physiological parameters (supplemental oxygen flow rate, urea, age, oxygen saturation, C-reactive protein, estimated glomerular filtration rate, neutrophil count, neutrophil/lymphocyte ratio) improved discrimination (AUC = 0.735; 95% CI 0.715, 0.757), and these improvements were replicated across seven UK and non-UK sites. However, there was evidence of miscalibration with the model tending to underestimate risks in most sites.

Conclusions: NEWS2 score had poor-to-moderate discrimination for medium-term COVID-19 outcome which raises questions about its use as a screening tool at hospital admission. Risk stratification was improved by including readily available blood and physiological parameters measured at hospital admission, but there was evidence of miscalibration in external sites. This highlights the need for a better understanding of the use of early warning scores for COVID.
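
The Brier score quoted alongside the AUC in the results is the mean squared difference between predicted probability and observed outcome. A minimal sketch, using invented predictions and outcomes, is:

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if not probabilities or len(probabilities) != len(outcomes):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - y) ** 2 for p, y in zip(probabilities, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical predicted risks and observed outcomes (1 = severe outcome).
predicted = [0.9, 0.2, 0.7, 0.1]
observed = [1, 0, 0, 0]
score = brier_score(predicted, observed)
print(score)
```

Lower is better: 0 means every prediction was both confident and correct, and a model that always predicts 0.5 scores 0.25.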

Source
http://dx.doi.org/10.1186/s12916-020-01893-3
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7817348
January 2021

Towards semantic interoperability: finding and repairing hidden contradictions in biomedical ontologies.

BMC Med Inform Decis Mak 2020 Dec 15;20(Suppl 10):311. Epub 2020 Dec 15.

Computer, Electrical and Mathematical Sciences and Engineering Division, Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, 23955, Saudi Arabia.

Background: Ontologies are widely used throughout the biomedical domain. These ontologies formally represent the classes and relations assumed to exist within a domain. As scientific domains are deeply interlinked, so too are their representations. While individual ontologies can be tested for consistency and coherency using automated reasoning methods, systematically combining ontologies of multiple domains together may reveal previously hidden contradictions.

Methods: We developed a method that tests for hidden unsatisfiabilities in an ontology that arise when it is combined with other ontologies. For this purpose, we combined sets of ontologies and used automated reasoning to determine whether unsatisfiable classes are present. In addition, we designed and implemented a novel algorithm that can determine justifications for contradictions across extremely large and complicated ontologies, and used these justifications to semi-automatically repair ontologies by identifying a small set of axioms that, when removed, result in a consistent and coherent set of ontologies.

Results: We tested the mutual consistency of the OBO Foundry and the OBO ontologies and found that the combined OBO Foundry gives rise to at least 636 unsatisfiable classes, while the OBO ontologies give rise to more than 300,000 unsatisfiable classes. We also applied our semi-automatic repair algorithm to each combination of OBO ontologies that resulted in unsatisfiable classes, finding that removing only 117 axioms accounted for all cases of unsatisfiability across all OBO ontologies.

Conclusions: We identified a large set of hidden unsatisfiable classes across a broad range of biomedical ontologies, and we found that this large set is the result of a relatively small number of axiomatic disagreements. Our results show that hidden unsatisfiability is a serious problem in ontology interoperability; however, our results also provide a way towards more consistent ontologies by addressing the issues we identified.

Source
http://dx.doi.org/10.1186/s12911-020-01336-2
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7736131
December 2020

A random forest based biomarker discovery and power analysis framework for diagnostics research.

BMC Med Genomics 2020 Nov 23;13(1):178. Epub 2020 Nov 23.

College of Medical and Dental Sciences, Institute of Cancer and Genomic Sciences, Centre for Computational Biology, University of Birmingham, Birmingham, B15 2TT, UK.

Background: Biomarker identification is one of the major goals of functional genomics and translational medicine studies. Large-scale -omics data are increasingly being accumulated and can provide vital means for the identification of biomarkers for the early diagnosis of complex disease and/or for advanced patient/disease stratification. These tasks are clearly interlinked, and it is essential that an unbiased and stable methodology is applied in order to address them. Although many biomarker identification approaches, primarily based on machine learning, have recently been developed, the exploration of potential associations between biomarker identification and the design of future experiments remains a challenge.

Methods: In this study, using both simulated and published experimentally derived datasets, we assessed the performance of several state-of-the-art Random Forest (RF) based decision approaches, namely the Boruta method, the permutation based feature selection without correction method, the permutation based feature selection with correction method, and the backward elimination based feature selection method. Moreover, we conducted a power analysis to estimate the number of samples required for potential future studies.

Results: We present a number of different RF based stable feature selection methods and compare their performances using simulated, as well as published, experimentally derived, datasets. Across all of the scenarios considered, we found the Boruta method to be the most stable methodology, whilst the Permutation (Raw) approach offered the largest number of relevant features, when allowed to stabilise over a number of iterations. Finally, we developed and made available a web interface ( https://joelarkman.shinyapps.io/PowerTools/ ) to streamline power calculations thereby aiding the design of potential future studies within a translational medicine context.

Conclusions: We developed an RF-based biomarker discovery framework and provide a web interface for our framework, termed PowerTools, that caters to the design of appropriate and cost-effective future omics studies.

Source
http://dx.doi.org/10.1186/s12920-020-00826-6
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7685541
November 2020

Path-based extensions of local link prediction methods for complex networks.

Sci Rep 2020 Nov 16;10(1):19848. Epub 2020 Nov 16.

Centre for Computational Biology, University of Birmingham, Birmingham, B15 2TT, UK.

Link prediction in a complex network is a problem of fundamental interest in network science and has attracted increasing attention in recent years. It aims to predict missing (or future) links between two entities in a complex system that are not already connected. Among existing methods, the most popular are local similarity indices, which take into account the information of common neighbours to estimate the likelihood of a connection between two nodes. In this paper, we propose global and quasi-local extensions of some commonly used local similarity indices. We have performed extensive numerical simulations on publicly available datasets from diverse domains, demonstrating that the proposed extensions not only give superior performance when compared to their respective local indices, but also outperform some of the current state-of-the-art local and global link-prediction methods.
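
To make the distinction between local and path-based indices concrete, the sketch below scores a candidate link first by its number of common neighbours and then by adding a down-weighted count of length-3 paths. This is a generic illustration, not the extensions proposed in the paper: the toy graph, the function names and the weight of 0.1 are assumptions.

```python
def build_adjacency(edges):
    """Undirected adjacency sets from a list of (u, v) pairs."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def common_neighbours(adjacency, u, v):
    """Local index: number of nodes adjacent to both u and v."""
    return len(adjacency.get(u, set()) & adjacency.get(v, set()))


def paths_of_length_three(adjacency, u, v):
    """Count simple paths u - a - b - v."""
    count = 0
    for a in adjacency.get(u, ()):
        if a in (u, v):
            continue
        for b in adjacency.get(a, ()):
            if b in (u, v, a):
                continue
            if v in adjacency.get(b, ()):
                count += 1
    return count


def quasi_local_score(adjacency, u, v, weight=0.1):
    """Common neighbours plus a down-weighted contribution from longer paths."""
    return common_neighbours(adjacency, u, v) + weight * paths_of_length_three(adjacency, u, v)


# Toy graph (assumption).
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("b", "c"), ("d", "e")]
adjacency = build_adjacency(edges)
print(quasi_local_score(adjacency, "a", "d"), quasi_local_score(adjacency, "a", "e"))
```

The pair (a, e) shares no neighbours, so a purely local index scores it zero; the path term still gives it a small positive score, which is the kind of information a quasi-local index can exploit.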

Source
http://dx.doi.org/10.1038/s41598-020-76860-2
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7670409
November 2020

Ensemble learning for poor prognosis predictions: A case study on SARS-CoV-2.

J Am Med Inform Assoc 2021 Mar;28(4):791-800

Centre for Population Health Sciences, Usher Institute, University of Edinburgh, Edinburgh, United Kingdom.

Objective: Risk prediction models are widely used to inform evidence-based clinical decision making. However, few models developed from single cohorts can perform consistently well at population level where diverse prognoses exist (such as the SARS-CoV-2 [severe acute respiratory syndrome coronavirus 2] pandemic). This study aims at tackling this challenge by synergizing prediction models from the literature using ensemble learning.

Materials and Methods: In this study, we selected and reimplemented 7 prediction models for COVID-19 (coronavirus disease 2019) that were derived from diverse cohorts and used different implementation techniques. A novel ensemble learning framework was proposed to synergize them for realizing personalized predictions for individual patients. Four diverse international cohorts (2 from the United Kingdom and 2 from China; N = 5394) were used to validate all 8 models on discrimination, calibration, and clinical usefulness.

Results: Results showed that individual prediction models could perform well on some cohorts while poorly on others. Conversely, the ensemble model achieved the best performances consistently on all metrics quantifying discrimination, calibration, and clinical usefulness. Performance disparities were observed in cohorts from the 2 countries: all models achieved better performances on the China cohorts.

Discussion: When individual models were learned from complementary cohorts, the synergized model had the potential to achieve better performances than any individual model. Results indicate that blood parameters and physiological measurements might have better predictive powers when collected early, which remains to be confirmed by further studies.

Conclusions: Combining a diverse set of individual prediction models, the ensemble method can synergize a robust and well-performing model by choosing the most competent ones for individual patients.

Source
http://dx.doi.org/10.1093/jamia/ocaa295
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7717299
March 2021

Biomarker Prioritisation and Power Estimation Using Ensemble Gene Regulatory Network Inference.

Int J Mol Sci 2020 Oct 23;21(21). Epub 2020 Oct 23.

Institute of Cancer and Genomic Sciences, Centre for Computational Biology, University of Birmingham, Birmingham B15 2TT, UK.

Inferring the topology of a gene regulatory network (GRN) from gene expression data is a challenging but important undertaking for gaining a better understanding of gene regulation. Key challenges include working with noisy data and dealing with a higher number of genes than samples. Although a number of different methods have been proposed to infer the structure of a GRN, there are large discrepancies among the different inference algorithms they adopt, rendering their meaningful comparison challenging. In this study, we used two methods, namely the MIDER (Mutual Information Distance and Entropy Reduction) and the PLSNET (Partial least square based feature selection) methods, to infer the structure of a GRN directly from data and computationally validated our results. Both methods were applied to different gene expression datasets resulting from inflammatory bowel disease (IBD), pancreatic ductal adenocarcinoma (PDAC), and acute myeloid leukaemia (AML) studies. For each case, gene regulators were successfully identified. For example, for the case of the IBD dataset, the family genes were identified as key regulators while upon analysing the PDAC dataset, the and genes were depicted. We further demonstrate that an ensemble-based approach, that combines the output of the MIDER and PLSNET algorithms, can infer the structure of a GRN from data with higher accuracy. We have also estimated the number of the samples required for potential future validation studies. Here, we presented our proposed analysis framework that caters not only to candidate regulator genes prediction for potential validation experiments but also an estimation of the number of samples required for these experiments.

Source
http://dx.doi.org/10.3390/ijms21217886
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7660606
October 2020

Fractal Analysis: Prognostic Value of Left Ventricular Trabecular Complexity Cardiovascular MRI in Participants with Hypertrophic Cardiomyopathy.

Radiology 2021 Jan 20;298(1):71-79. Epub 2020 Oct 20.

From the Department of Cardiology (J.W., Y.L., F.Y., Y.X., Y.C.), Department of Radiology (W.C., J.S., Y.C.), Department of Geriatrics (K.W.), Center of Rare Diseases (Y.C.), and Medical Big Data Center (T.Z.), West China Hospital, Sichuan University, Guoxue Xiang No. 37, Chengdu, Sichuan 610041, China; Paul C. Lauterbur Research Centre for Biomedical Imaging, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, P. R. China (Y.Z.); College of Medical and Dental Sciences, Institute of Cancer and Genomic Sciences, University of Birmingham, Birmingham, England (J.W., L.B., G.V.G.); Medical Research Council Health Data Research, Midlands Site, Birmingham, England (G.V.G.); and Department of Medicine (Cardiovascular Division), University of Pennsylvania, Philadelphia, Pa (Y.H.).

Background: The prognostic value of myocardial trabecular complexity in patients with hypertrophic cardiomyopathy (HCM) is unknown.

Purpose: To explore the prognostic value of myocardial trabecular complexity using fractal analysis in participants with HCM.

Materials and Methods: The authors prospectively enrolled participants with HCM who underwent 3.0-T cardiovascular MRI from August 2011 to October 2017. The authors also enrolled 100 age- and sex-matched healthy participants to form a comparison group. Trabeculae were quantified with fractal analysis of cine slices to estimate the fractal dimension (FD). Participants with HCM were divided into normal and high FD groups according to the upper limit of normal reference value from the healthy group. The primary end point was defined as all-cause mortality and aborted sudden cardiac death. The secondary end point was the composite of the primary end point and readmission to the hospital owing to heart failure. Internal validation was performed using the bootstrapping method.

Results: A total of 378 participants with HCM (median age, 50 years; age range, 40-61 years; 207 men) and 100 healthy participants (median age, 46 years; age range, 36-59 years; 55 women) were included in this study. During the median follow-up of 33 months ± 18 (standard deviation), participants with increased maximal apical FD (≥1.325) had a higher risk of the primary and secondary end points than those with a normal FD (<1.325) (P = .01 and P = .04, respectively). Furthermore, Cox analysis revealed that left ventricular maximal apical FD (hazard ratio range, 1.001-1.008; all P < .05) provided significant prognostic value to predict the primary and secondary end points after adjustment for the European Society of Cardiology predictors and late gadolinium enhancement. Internal validation showed that left ventricular maximal apical FD retained a good performance in predicting the primary end points with an area under the curve of 0.70 ± 0.03.

Conclusion: Left ventricular apical fractal dimension, which reflects myocardial trabecular complexity, was an independent predictor of the primary and secondary end points in patients with hypertrophic cardiomyopathy. © RSNA, 2020. See also the editorial by Captur and Moon in this issue.
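
The abstract does not say which fractal analysis procedure was used. As an illustration only, the sketch below estimates a fractal dimension by generic box counting on a set of pixel coordinates; the function names, box sizes, and toy shapes are assumptions and are not taken from the study.

```python
import math

def box_count(points, size):
    """Count the boxes of side `size` that contain at least one point."""
    return len({(x // size, y // size) for x, y in points})

def fractal_dimension(points, sizes):
    """Least-squares slope of log N(s) against log(1/s)."""
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(box_count(points, s)) for s in sizes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

# Hypothetical toy shapes: a filled square and a straight line of pixels.
square = [(x, y) for x in range(64) for y in range(64)]
line = [(x, 0) for x in range(64)]
sizes = [1, 2, 4, 8, 16]

fd_square = fractal_dimension(square, sizes)  # close to 2
fd_line = fractal_dimension(line, sizes)      # close to 1
```

A traced endocardial border would fall between these two extremes, which is consistent with the reported threshold of 1.325 lying between 1 and 2.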

Source
http://dx.doi.org/10.1148/radiol.2020202261
January 2021

Reduced left atrial cardiomyocyte PITX2 and elevated circulating BMP10 predict atrial fibrillation after ablation.

JCI Insight 2020 08 20;5(16). Epub 2020 Aug 20.

Institute of Cardiovascular Sciences and.

Background: Genomic and experimental studies suggest a role for PITX2 in atrial fibrillation (AF). To assess if this association is relevant for recurrent AF in patients, we tested whether left atrial PITX2 affects recurrent AF after AF ablation.

Methods: mRNA concentrations of PITX2 and its cardiac isoform, PITX2c, were quantified in left atrial appendages (LAAs) from patients undergoing thoracoscopic AF ablation, either in whole LAA tissue (n = 83) or in LAA cardiomyocytes (n = 52), and combined with clinical parameters to predict AF recurrence. Literature suggests that BMP10 is a PITX2-repressed, atrial-specific, secreted protein. BMP10 plasma concentrations were combined with 11 cardiovascular biomarkers and clinical parameters to predict recurrent AF after catheter ablation in 359 patients.

Results: Reduced concentrations of cardiomyocyte PITX2, but not whole LAA tissue PITX2, were associated with AF recurrence after thoracoscopic AF ablation (16% decreased recurrence per 2^(−ΔΔCt) increase in PITX2). RNA sequencing, quantitative PCR, and Western blotting confirmed that BMP10 is one of the most PITX2-repressed atrial genes. Left atrial size (HR per mm increase [95% CI], 1.055 [1.028, 1.082]), nonparoxysmal AF (HR 1.672 [1.206, 2.318]), and elevated BMP10 (HR 1.339 [CI 1.159, 1.546] per quartile increase) were predictive of recurrent AF. BMP10 outperformed 11 other cardiovascular biomarkers in predicting recurrent AF.

Conclusions: Reduced left atrial cardiomyocyte PITX2 and elevated plasma concentrations of the PITX2-repressed, secreted atrial protein BMP10 identify patients at risk of recurrent AF after ablation.

Trial Registration: ClinicalTrials.gov NCT01091389, NL50069.018.14, Dutch National Registry of Clinical Research Projects EK494-16.

Funding: British Heart Foundation, European Union (H2020), Leducq Foundation.

Source
http://dx.doi.org/10.1172/jci.insight.139179
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7455124
August 2020

Prevalence of admission plasma glucose in 'diabetes' or 'at risk' ranges in hospital emergencies with no prior diagnosis of diabetes by gender, age and ethnicity.

Endocrinol Diabetes Metab 2020 Jul 15;3(3):e00140. Epub 2020 May 15.

Diabetes Translational Research Group Diabetes Centre Nuffield House Queen Elizabeth Hospital Birmingham Birmingham UK.

Aims: To establish the prevalence of admission plasma glucose in 'diabetes' and 'at risk' ranges in emergency hospital admissions with no prior diagnosis of diabetes; characteristics of people with hyperglycaemia; and factors influencing glucose measurement.

Methods: Electronic patient records for 113 097 hospital admissions over 1 year from 2014 to 2015 included 43 201 emergencies with glucose available for 31 927 (74%) admissions, comprising 22 045 people. Data are presented for 18 965 people with no prior diagnosis of diabetes and glucose available on first attendance.

Results: Three quarters (14 214) were White Europeans aged 62 (43-78) years, median (IQ range); 12% (2241) South Asians 46 (32-64) years; 9% (1726) Unknown/Other ethnicities 43 (29-61) years; and 4% (784) Afro-Caribbeans 49 (33-63) years, P < .001. Overall, 5% (1003) had glucose in the 'diabetes' range (≥11.1 mmol/L), higher at 8% (175) for South Asians; 16% (3042) were 'at risk' (7.8-11.0 mmol/L), that is 17% (2379) White Europeans, 15% (338) South Asians, 14% (236) Unknown/Others and 11% (89) Afro-Caribbeans, P < .001. The prevalence for South Asians aged <30 years was 2.1% and 5.2%, respectively, 2.6% and 8.6% for Afro-Caribbeans <30 years, and 2.0% and 8.4% for White Europeans <40 years. Glucose increased with age and was more often in the 'diabetes' range for South Asians than White Europeans, with South Asian men particularly affected. One third of all emergency admissions were for <24 hours, with 58% of these having glucose measured compared to 82% with duration >24 hours.

Conclusions: Hyperglycaemia was evident in 21% of adults admitted as an emergency; various aspects related to follow-up and initial testing, age and ethnicity need to be considered by professional bodies addressing undiagnosed diabetes in hospital admissions.

Source
http://dx.doi.org/10.1002/edm2.140
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7375073
July 2020

Core regulatory circuitries in defining cancer cell identity across the malignant spectrum.

Open Biol 2020 07 8;10(7):200121. Epub 2020 Jul 8.

Department of Neurology, Massachusetts General Hospital, Harvard Medical School, Charlestown, USA.

Gene expression programmes driving cell identity are established by tightly regulated transcription factors that auto- and cross-regulate in a feed-forward manner, forming core regulatory circuitries (CRCs). CRC transcription factors create and engage super-enhancers by recruiting acetylation writers depositing permissive H3K27ac chromatin marks. These super-enhancers are largely associated with BET proteins, including BRD4, that influence higher-order chromatin structure. The orchestration of these events triggers accessibility of RNA polymerase machinery and the imposition of lineage-specific gene expression. In cancers, CRCs drive cell identity by superimposing developmental programmes on a background of genetic alterations. Further, the establishment and maintenance of oncogenic states are reliant on CRCs that drive factors involved in tumour development. Hence, the molecular dissection of CRC components driving cell identity and cancer state can contribute to elucidating mechanisms of diversion from pre-determined developmental programmes and highlight cancer dependencies. These insights can provide valuable opportunities for identifying and re-purposing drug targets. In this article, we review the current understanding of CRCs across solid and liquid malignancies and avenues of investigation for drug development efforts. We also review techniques used to understand CRCs and elaborate the indication of discussed CRC transcription factors in the wider context of cancer CRC models.

Source
http://dx.doi.org/10.1098/rsob.200121
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7574545
July 2020

Radiomic Analysis of Native T1 Mapping Images Discriminates Between MYH7 and MYBPC3-Related Hypertrophic Cardiomyopathy.

J Magn Reson Imaging 2020 12 11;52(6):1714-1721. Epub 2020 Jun 11.

Department of Cardiology, West China Hospital, Sichuan University, Chengdu, China.

Background: The phenotype via conventional cardiac MRI analysis of MYH7 (β-myosin heavy chain)- and MYBPC3 (β-myosin-binding protein C)-associated hypertrophic cardiomyopathy (HCM) groups is similar. Few studies exist on the genotypic-phenotypic association as assessed by machine learning in HCM patients.

Purpose: To explore the phenotypic differences based on radiomics analysis of T1 mapping images between MYH7 and MYBPC3-associated HCM subgroups.

Study Type: Prospective observational study.

Subjects: In all, 102 HCM patients with a pathogenic or likely pathogenic mutation in the MYH7 (n = 68) or MYBPC3 (n = 34) gene.

Field Strength/sequence: Cardiac MRI was performed at 3.0T with balanced steady-state free precession (bSSFP), phase-sensitive inversion recovery (PSIR) late gadolinium enhancement (LGE), and modified Look-Locker inversion recovery (MOLLI) T1 mapping sequences.

Assessment: All patients underwent next-generation sequencing and Sanger genetic sequencing. Left ventricular native T1 and LGE were analyzed. One hundred and fifty-seven radiomic features were extracted and modeled using a support vector machine (SVM) combined with principal component analysis (PCA). Each subgroup was randomly split 4:1 (feature selection / test validation).

Statistical Tests: Mann-Whitney U-tests and Student's t-tests were performed to assess differences between subgroups. A receiver operating characteristic (ROC) curve was used to assess the model's ability to stratify patients based on radiomic features.

Results: There were no significant differences between MYH7- and MYBPC3-associated HCM subgroups based on traditional native T1 values (global, basal, and middle short-axis slice native T1; P = 0.760, 0.914, and 0.178, respectively). However, the SVM model combined with PCA achieved an accuracy and area under the curve (AUC) of 92.0% and 0.968 (95% confidence interval [CI]: 0.968-0.971), respectively. For the test validation dataset, the accuracy and AUC were 85.5% and 0.886 (95% CI: 0.881-0.901), respectively.
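
For readers unfamiliar with the AUC figures quoted here, the following minimal sketch computes the area under the ROC curve as the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, with ties counting half. The labels and scores are hypothetical toy values, not study data, and the function name is an assumption.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    # A positive "wins" a pair when it outscores the negative; ties count half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical classifier outputs for six patients (1 = one genotype, 0 = the other).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)  # 8 of 9 pairs are ordered correctly
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.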

Data Conclusion: Radiomic analysis of native T1 mapping images may be able to discriminate between MYH7- and MYBPC3-associated HCM patients, exceeding the performance of conventional native T1 values.

Level of Evidence: 3. Technical Efficacy Stage: 2. J. Magn. Reson. Imaging 2020;52:1714-1721.

Source
http://dx.doi.org/10.1002/jmri.27209
December 2020

Assessment of Endoscopic Healing by Using Advanced Technologies Reflects Histological Healing in Ulcerative Colitis.

J Crohns Colitis 2020 Sep;14(9):1282-1289

Institute of Translational Medicine, Institute of Immunology and Immunotherapy and NIHR Birmingham Biomedical Research Centre, University Hospitals NHS Foundation Trust and University of Birmingham, Birmingham, UK.

Background: Several studies have reported that ulcerative colitis [UC] patients with endoscopic mucosal healing may still have histological inflammation. We investigated the relationship between mucosal healing defined by modified PICaSSO [Paddington International Virtual ChromoendoScopy ScOre], Mayo Endoscopic Score [MES] and probe-based confocal laser endomicroscopy [pCLE] with histological indices in UC.

Methods: A prospective study enrolling 82 UC patients [male 66%] was conducted. High-definition colonoscopy was performed to evaluate disease activity using the MES assessed in high definition [HD-MES] and the modified PICaSSO, and targeted biopsies were taken; pCLE was then performed. Receiver operating characteristic [ROC] curves were plotted to determine the best thresholds for modified PICaSSO and pCLE scores that predicted histological healing according to the Robarts Histopathology Index [RHI] and the ECAP ['Extension, Chronicity, Activity, Plus'] histology score.

Results: A modified PICaSSO of ≤ 4 predicted histological healing at RHI ≤ 3, with sensitivity, specificity, accuracy and area under the ROC curve [AUROC] of 89.8%, 95.7%, 91.5% and 95.9% respectively. The sensitivity, specificity, accuracy and AUROC of HD-MES to predict histological healing by RHI were 81.4%, 95.7%, 85.4% and 92.1%, respectively. A pCLE ≤ 10 predicted histological healing with sensitivity of 94.9%, specificity of 91.3%, accuracy of 93.9% and AUROC of 96.5%. An ECAP of ≤ 10 was predicted by modified PICaSSO ≤ 4 with accuracy of 91.5% and AUROC of 95.9%.
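
To make the threshold-based figures above concrete, here is a minimal sketch of how sensitivity, specificity, and accuracy follow from applying a cut-off to an endoscopic score. The scores, outcomes, and the function name are hypothetical; only the idea of "score at or below a threshold predicts healing" comes from the abstract.

```python
def diagnostic_metrics(scores, healed, threshold):
    """Treat score <= threshold as a positive prediction of histological healing."""
    tp = fp = tn = fn = 0
    for score, is_healed in zip(scores, healed):
        predicted = score <= threshold
        if predicted and is_healed:
            tp += 1
        elif predicted and not is_healed:
            fp += 1
        elif not predicted and is_healed:
            fn += 1
        else:
            tn += 1
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / total,
    }

# Hypothetical toy cohort of eight patients, not study data.
scores = [1, 2, 3, 4, 5, 6, 7, 8]
healed = [True, True, True, False, True, False, False, False]
metrics = diagnostic_metrics(scores, healed, threshold=4)
```

Sweeping the threshold over all observed scores and plotting sensitivity against 1 − specificity gives the ROC curve from which a best cut-off is chosen.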

Conclusion: Histological healing by RHI and ECAP is accurately predicted by HD-MES and by the modified PICaSSO virtual electronic chromoendoscopy score; the use of pCLE did not improve the accuracy any further.

Source
http://dx.doi.org/10.1093/ecco-jcc/jjaa056
September 2020

Annotating and detecting phenotypic information for chronic obstructive pulmonary disease.

JAMIA Open 2019 Jul 26;2(2):261-271. Epub 2019 Apr 26.

National Centre for Text Mining, School of Computer Science, The University of Manchester, Manchester, UK.

Objectives: Chronic obstructive pulmonary disease (COPD) phenotypes cover a range of lung abnormalities. To allow text mining methods to identify pertinent and potentially complex information about these phenotypes from textual data, we have developed a novel annotated corpus, which we use to train a neural network-based named entity recognizer to detect fine-grained COPD phenotypic information.

Materials And Methods: Since COPD phenotype descriptions often mention other concepts within them (proteins, treatments, etc.), our corpus annotations include both outermost phenotype descriptions and concepts nested within them. Our neural layered bidirectional long short-term memory conditional random field (BiLSTM-CRF) network firstly recognizes nested mentions, which are fed into subsequent BiLSTM-CRF layers, to help to recognize enclosing phenotype mentions.

Results: Our corpus of 30 full papers (available at: http://www.nactem.ac.uk/COPD) is annotated by experts with 27 030 phenotype-related concept mentions, most of which are automatically linked to UMLS Metathesaurus concepts. When trained using the corpus, our BiLSTM-CRF network outperforms other popular approaches in recognizing detailed phenotypic information.

Discussion: Information extracted by our method can facilitate efficient location and exploration of detailed information about phenotypes, for example, those specifically concerning reactions to treatments.

Conclusion: The importance of our corpus for developing methods to extract fine-grained information about COPD phenotypes is demonstrated through its successful use to train a layered BiLSTM-CRF network to extract phenotypic information at various levels of granularity. The minimal human intervention needed for training should permit ready adaption to extracting phenotypic information about other diseases.

Source
http://dx.doi.org/10.1093/jamiaopen/ooz009
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6951876
July 2019

Machine learning for the detection of early immunological markers as predictors of multi-organ dysfunction.

Sci Data 2019 12 19;6(1):328. Epub 2019 Dec 19.

NIHR Surgical Reconstruction and Microbiology Research Centre, University Hospital Birmingham, Birmingham, B15 2WB, UK.

The immune response to major trauma has been analysed mainly within post-hospital admission settings where the inflammatory response is already underway and the early drivers of clinical outcome cannot be readily determined. Thus, there is a need to better understand the immediate immune response to injury and how this might influence important patient outcomes such as multi-organ dysfunction syndrome (MODS). In this study, we have assessed the immune response to trauma in 61 patients at three different post-injury time points (ultra-early (≤1 h), 4-12 h, 48-72 h) and analysed relationships with the development of MODS. We developed a pipeline using Least Absolute Shrinkage and Selection Operator and Elastic Net feature selection methods that were able to identify 3 physiological features (decrease in neutrophil CD62L and CD63 expression and monocyte CD63 expression and frequency) as possible biomarkers for MODS development. After univariate and multivariate analysis for each feature alongside a stability analysis, the addition of these 3 markers to standard clinical trauma injury severity scores yields a Generalized Linear Model (GLM) with an average Area Under the Curve value of 0.92 ± 0.06. This performance provides an 8% improvement over the Probability of Survival (PS14) outcome measure and a 13% improvement over the New Injury Severity Score (NISS) for identifying patients at risk of MODS.

Source
http://dx.doi.org/10.1038/s41597-019-0337-6
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6923383
December 2019

Ontology-based prediction of cancer driver genes.

Sci Rep 2019 11 22;9(1):17405. Epub 2019 Nov 22.

Computer, Electrical and Mathematical Science and Engineering Division, King Abdullah University of Science and Technology, Thuwal, 23955, Saudi Arabia.

Identifying and distinguishing cancer driver genes among thousands of candidate mutations remains a major challenge. Accurate identification of driver genes and driver mutations is critical for advancing cancer research and personalizing treatment based on accurate stratification of patients. Due to inter-tumor genetic heterogeneity, many driver mutations within a gene occur at low frequencies, which makes it challenging to distinguish them from non-driver mutations. We have developed a novel method for identifying cancer driver genes. Our approach utilizes multiple complementary types of information, specifically cellular phenotypes, cellular locations, functions, and whole body physiological phenotypes as features. We demonstrate that our method can accurately identify known cancer driver genes and distinguish between their role in different types of cancer. In addition to confirming known driver genes, we identify several novel candidate driver genes. We demonstrate the utility of our method by validating its predictions in nasopharyngeal cancer and colorectal cancer using whole exome and whole genome sequencing.

Source
http://dx.doi.org/10.1038/s41598-019-53454-1
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6874647
November 2019

A Practical Guide to Assess the Reproducibility of Echocardiographic Measurements.

J Am Soc Echocardiogr 2019 12 22;32(12):1505-1515. Epub 2019 Oct 22.

Institute of Cardiovascular Sciences, University of Birmingham, Birmingham, United Kingdom; University Hospitals Birmingham NHS Foundation Trust, Birmingham, United Kingdom.

Echocardiography plays an essential role in the diagnosis and assessment of cardiovascular disease. Measurements derived from echocardiography are also used to determine the severity of disease, its progression over time, and to aid in the choice of optimal therapy. It is therefore clinically important that echocardiographic measurements be reproducible, repeatable, and reliable. There are a variety of statistical tests available to assess these parameters, and in this article the authors summarize those available for use by echocardiographers to improve their clinical practice. Correlation coefficients, linear regression, Bland-Altman plots, and the coefficient of variation are explored, along with their limitations. The authors also provide an online tool for the easy calculation of these statistics in the clinical environment (www.birmingham.ac.uk/echo). Quantifying and enhancing the reproducibility of echocardiography has important potential to improve the value of echocardiography as the basis for good clinical decision-making.
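
As a rough illustration of two of the statistics named above, the sketch below computes Bland-Altman bias with 95% limits of agreement for paired readings, and a simple coefficient of variation for repeated readings. The measurements are hypothetical, the function names are assumptions, and the article's own tool may define these quantities differently (for example, a within-subject coefficient of variation).

```python
import statistics

def bland_altman(first, second):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(first, second)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return statistics.stdev(values) / statistics.mean(values) * 100

# Hypothetical ejection fraction readings (%) by two observers on four patients.
observer_a = [55, 60, 62, 58]
observer_b = [54, 61, 60, 59]
bias, lower, upper = bland_altman(observer_a, observer_b)

# Hypothetical repeated readings on a single patient.
cv = coefficient_of_variation([50, 52, 48, 50])
```

Narrow limits of agreement around a bias near zero indicate that the two observers can be used interchangeably for the measurement in question.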

Source
http://dx.doi.org/10.1016/j.echo.2019.08.015
December 2019

Development and validation of multivariable prediction models of remission, recovery, and quality of life outcomes in people with first episode psychosis: a machine learning approach.

Lancet Digit Health 2019 10 12;1(6):e261-e270. Epub 2019 Sep 12.

Institute for Mental Health, University of Birmingham, Birmingham, UK.

Background: Outcomes for people with first-episode psychosis are highly heterogeneous. Few reliable validated methods are available to predict the outcome for individual patients in the first clinical contact. In this study, we aimed to build multivariable prediction models of 1-year remission and recovery outcomes using baseline clinical variables in people with first-episode psychosis.

Methods: In this study, we applied supervised machine learning, using regularised regression and nested leave-one-site-out cross-validation, to baseline clinical data from the English Evaluating the Development and Impact of Early Intervention Services (EDEN) study (n=1027), to develop and internally validate prediction models at 1-year follow-up. We assessed four binary outcomes that were recorded at 1 year: symptom remission, social recovery, vocational recovery, and quality of life (QoL). We externally validated the prediction models by selecting, from the top predictor variables identified in the internal validation models, the variables shared with the external validation datasets, which comprised two Scottish longitudinal cohort studies (n=162) and the OPUS trial, a randomised controlled trial of specialised assertive intervention versus standard treatment (n=578).
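
The splitting scheme named above can be sketched as follows: each site is held out once for testing, and within the remaining sites the same procedure is repeated to tune the model. This is a minimal illustration with hypothetical site labels and function names; model fitting is omitted and nothing here reproduces the study's code.

```python
def leave_one_site_out(indices, sites):
    """Yield (held_out_site, train_indices, test_indices), one fold per site."""
    for site in sorted({sites[i] for i in indices}):
        test = [i for i in indices if sites[i] == site]
        train = [i for i in indices if sites[i] != site]
        yield site, train, test

# Hypothetical recruitment site for each of six participants.
sites = ["A", "A", "B", "B", "C", "C"]
all_indices = list(range(len(sites)))

# Outer loop: estimates performance on a site the model has never seen.
outer_folds = list(leave_one_site_out(all_indices, sites))

# Inner loop: hyperparameters are tuned using only the outer training rows,
# so the held-out site never influences model selection.
nested = {
    site: list(leave_one_site_out(train, sites))
    for site, train, _ in outer_folds
}
```

Holding out whole sites rather than random participants gives a more honest estimate of how a model transfers to a new clinical service.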

Findings: The performance of prediction models was robust for the four 1-year outcomes of symptom remission (area under the receiver operating characteristic curve [AUC] 0·703, 95% CI 0·664-0·742), social recovery (0·731, 0·697-0·765), vocational recovery (0·736, 0·702-0·771), and QoL (0·704, 0·667-0·742; p<0·0001 for all outcomes), on internal validation. We externally validated the outcomes of symptom remission (AUC 0·680, 95% CI 0·587-0·773), vocational recovery (0·867, 0·805-0·930), and QoL (0·679, 0·522-0·836) in the Scottish datasets, and symptom remission (0·616, 0·553-0·679), social recovery (0·573, 0·504-0·643), vocational recovery (0·660, 0·610-0·710), and QoL (0·556, 0·481-0·631) in the OPUS dataset.

Interpretation: In our machine learning analysis, we showed that prediction models can reliably and prospectively identify poor remission and recovery outcomes at 1 year for patients with first-episode psychosis using baseline clinical variables at first clinical contact.

Funding: Lundbeck Foundation.

Source
http://dx.doi.org/10.1016/S2589-7500(19)30121-9
October 2019

Allergic diseases and long-term risk of autoimmune disorders: longitudinal cohort study and cluster analysis.

Eur Respir J 2019 11 14;54(5). Epub 2019 Nov 14.

Institute of Applied Health Research, University of Birmingham, Birmingham, UK.

Introduction: The association between allergic diseases and autoimmune disorders is not well established. Our objective was to determine incidence rates of autoimmune disorders in allergic rhinitis/conjunctivitis (ARC), atopic eczema and asthma, and to investigate co-occurring patterns.

Methods: This was a retrospective cohort study (1990-2018) employing data extracted from The Health Improvement Network (UK primary care database). The exposure group comprised ARC, atopic eczema and asthma (all ages). For each exposed patient, up to two randomly selected age- and sex-matched controls with no documented allergic disease were used. Adjusted incidence rate ratios (aIRRs) were calculated using Poisson regression. A cross-sectional study was also conducted employing Association Rule Mining (ARM) to investigate disease clusters.
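
Association Rule Mining scores a rule such as "asthma → psoriasis" by its support, confidence, and lift. The sketch below computes these three measures on a handful of hypothetical patient records; the records and function names are illustrative assumptions and do not reflect the study's data or software.

```python
def support(records, items):
    """Fraction of records that contain every item in `items`."""
    items = set(items)
    return sum(items <= record for record in records) / len(records)

def rule_metrics(records, antecedent, consequent):
    """Support, confidence, and lift for the rule antecedent -> consequent."""
    s_a = support(records, antecedent)
    s_c = support(records, consequent)
    s_ac = support(records, set(antecedent) | set(consequent))
    confidence = s_ac / s_a
    return {"support": s_ac, "confidence": confidence, "lift": confidence / s_c}

# Hypothetical diagnosis sets for five patients.
records = [
    {"asthma", "psoriasis"},
    {"asthma"},
    {"eczema", "psoriasis"},
    {"asthma", "psoriasis"},
    {"eczema"},
]
rule = rule_metrics(records, {"asthma"}, {"psoriasis"})
```

A lift above 1 means the two conditions co-occur more often than expected if they were independent, which is the sense in which diseases are said to cluster.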

Results: 782 320, 1 393 570 and 1 049 868 patients with ARC, atopic eczema and asthma, respectively, were included. aIRRs of systemic lupus erythematosus (SLE), Sjögren's syndrome, vitiligo, rheumatoid arthritis, psoriasis, pernicious anaemia, inflammatory bowel disease, coeliac disease and autoimmune thyroiditis were uniformly higher in the three allergic diseases compared with controls. Specifically, aIRRs of SLE (1.45) and Sjögren's syndrome (1.88) were higher in ARC; aIRRs of SLE (1.44), Sjögren's syndrome (1.61) and myasthenia (1.56) were higher in asthma; and aIRRs of SLE (1.86), Sjögren's syndrome (1.48), vitiligo (1.54) and psoriasis (2.41) were higher in atopic eczema. There was no significant effect of the three allergic diseases on multiple sclerosis or of ARC and atopic eczema on myasthenia. Using ARM, allergic diseases clustered with multiple autoimmune disorders. Three age- and sex-related clusters were identified, with a relatively complex pattern in females ≥55 years old.

Conclusions: The long-term risks of autoimmune disorders are significantly higher in patients with allergic diseases. Allergic diseases and autoimmune disorders show age- and sex-related clustering patterns.

Source
http://dx.doi.org/10.1183/13993003.00476-2019
November 2019

-Omics biomarker identification pipeline for translational medicine.

J Transl Med 2019 05 14;17(1):155. Epub 2019 May 14.

College of Medical and Dental Sciences, Institute of Cancer and Genomic Sciences, Centre for Computational Biology, University of Birmingham, Birmingham, B15 2TT, UK.

Background: Translational medicine (TM) is an emerging domain that aims to facilitate medical or biological advances efficiently from the scientist to the clinician. Central to the TM vision is to narrow the gap between basic science and applied science in terms of time, cost and early diagnosis of the disease state. Biomarker identification is one of the main challenges within TM. The identification of disease biomarkers from -omics data will not only help the stratification of diverse patient cohorts but will also provide early diagnostic information which could improve patient management and potentially prevent adverse outcomes. However, biomarker identification needs to be robust and reproducible. Hence a robust unbiased computational framework that can help clinicians identify those biomarkers is necessary.

Methods: We developed a pipeline (workflow) that includes two different supervised classification techniques based on regularization methods to identify biomarkers from -omics or other high dimension clinical datasets. The pipeline includes several important steps such as quality control and stability of selected biomarkers. The process takes input files (outcome and independent variables or -omics data) and pre-processes (normalization, missing values) them. After a random division of samples into training and test sets, Least Absolute Shrinkage and Selection Operator and Elastic Net feature selection methods are applied to identify the most important features representing potential biomarker candidates. The penalization parameters are optimised using 10-fold cross validation and the process undergoes 100 iterations and a combinatorial analysis to select the best performing multivariate model. An empirical unbiased assessment of their quality as biomarkers for clinical use is performed through a Receiver Operating Characteristic curve and its Area Under the Curve analysis on both permuted and real data for 1000 different randomized training and test sets. We validated this pipeline against previously published biomarkers.
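
The feature selection step described above relies on the L1 penalty driving the coefficients of uninformative variables to exactly zero. The following is a minimal pure-Python sketch of LASSO by cyclic coordinate descent on a hypothetical toy dataset; it is not the authors' pipeline, omits the intercept, cross-validation, and repeated resampling, and Elastic Net would add an L2 term to the same update.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise (1/2n)*||y - Xb||^2 + lam*||b||_1 without an intercept."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # covariance of feature j with the partial residual
            norm = 0.0  # squared norm of feature j
            for i in range(n):
                others = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - others)
                norm += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (norm / n)
    return beta

# Hypothetical data: the outcome depends strongly on feature 0, barely on feature 1.
X = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
y = [2.05, 1.95, -1.95, -2.05]
beta = lasso_coordinate_descent(X, y, lam=0.1)
selected = [j for j, b in enumerate(beta) if b != 0.0]
```

In this toy case the weak feature is dropped and only feature 0 is retained, which is the behaviour that lets the penalty act as a biomarker selector.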

Results: We applied this pipeline to three different datasets with previously published biomarkers: lipidomics data by Acharjee et al. (Metabolomics 13:25, 2017) and transcriptomics data by Rajamani and Bhasin (Genome Med 8:38, 2016) and Mills et al. (Blood 114:1063-1072, 2009). Our results demonstrate that our method was able to identify both previously published biomarkers as well as new variables that add value to the published results.

Conclusions: We developed a robust pipeline to identify clinically relevant biomarkers that can be applied to different -omics datasets. Such identification reveals potentially novel drug targets and can be used as a part of a machine-learning based patient stratification framework in the translational medicine settings.

Source
http://dx.doi.org/10.1186/s12967-019-1912-5
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6518609
May 2019

DeepPVP: phenotype-based prioritization of causative variants using deep learning.

BMC Bioinformatics 2019 Feb 6;20(1):65. Epub 2019 Feb 6.

Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology, 4700 KAUST, Thuwal, 23955-6900, Kingdom of Saudi Arabia.

Background: Prioritization of variants in personal genomic data is a major challenge. Recently, computational methods that rely on comparing phenotype similarity have been shown to be useful for identifying causative variants. In these methods, pathogenicity prediction is combined with a semantic similarity measure to prioritize not only variants that are likely to be dysfunctional but those that are likely involved in the pathogenesis of a patient's phenotype.

Results: We have developed DeepPVP, a variant prioritization method that combines automated inference with deep neural networks to identify the likely causative variants in whole exome or whole genome sequence data. We demonstrate that DeepPVP performs significantly better than existing methods, including phenotype-based methods that use similar features. DeepPVP is freely available at https://github.com/bio-ontology-research-group/phenomenet-vp.

Conclusions: DeepPVP further improves on existing variant prioritization methods in terms of both speed and accuracy.

Source
http://dx.doi.org/10.1186/s12859-019-2633-8
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6364462
February 2019

Data-driven discovery and validation of circulating blood-based biomarkers associated with prevalent atrial fibrillation.

Eur Heart J 2019 04;40(16):1268-1276

Institute of Cardiovascular Sciences, University of Birmingham, Birmingham, UK.

Aims: Undetected atrial fibrillation (AF) is a major health concern. Blood biomarkers associated with AF could simplify patient selection for screening and further inform ongoing research towards stratified prevention and treatment of AF.

Methods And Results: Forty common cardiovascular biomarkers were quantified in 638 consecutive patients referred to hospital [mean ± standard deviation age 70 ± 12 years, 398 (62%) male, 294 (46%) with AF] with known AF or ≥2 CHA2DS2-VASc risk factors. Paroxysmal or silent AF was ruled out by 7-day ECG monitoring. Logistic regression with forward selection and machine learning algorithms were used to determine clinical risk factors, imaging parameters, and biomarkers associated with AF. Atrial fibrillation was significantly associated with age [bootstrapped odds ratio (OR) per year = 1.060, 95% confidence interval (1.04-1.10); P = 0.001], male sex [OR = 2.022 (1.28-3.56); P = 0.008], body mass index [BMI, OR per unit = 1.060 (1.02-1.12); P = 0.003], elevated brain natriuretic peptide [BNP, OR per fold change = 1.293 (1.11-1.63); P = 0.002], elevated fibroblast growth factor-23 [FGF-23, OR = 1.667 (1.36-2.34); P = 0.001], and reduced TNF-related apoptosis-induced ligand-receptor 2 [TRAIL-R2, OR = 0.242 (0.14-0.32); P = 0.001], but not other biomarkers. Biomarkers improved the prediction of AF compared with clinical risk factors alone (net reclassification improvement = 0.178; P < 0.001). Both logistic regression and machine learning predicted AF well during validation [area under the receiver-operator curve = 0.684 (0.62-0.75) and 0.697 (0.63-0.76), respectively].

Conclusion: Three simple clinical risk factors (age, sex, and BMI) and two biomarkers (elevated BNP and elevated FGF-23) identify patients with AF. Further research is warranted to elucidate FGF-23 dependent mechanisms of AF.

Source
http://dx.doi.org/10.1093/eurheartj/ehy815
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6475521
April 2019

A Review of Current Standards and the Evolution of Histopathology Nomenclature for Laboratory Animals.

ILAR J 2018 12;59(1):29-39

Susan A. Elmore, MS, DVM, DCVP, DABT, FIATP, is NTP Pathologist and Staff Scientist at the National Toxicology Program, National Institute of Environmental Health Sciences in the Research Triangle Park, North Carolina. Robert D. Cardiff, MD, PhD, is Distinguished Professor of Pathology, Emeritus at the UCD Center for Comparative Medicine, University of California, and the Department of Pathology and Laboratory Medicine, School of Medicine, Davis, in Davis, California. Mark F. Cesta, DVM, PhD, DACVP, is NTP Pathologist and Staff Scientist, leading the effort for establishment of the online NTP Nonneoplastic Lesion Atlas at the National Toxicology Program, National Institute of Environmental Health Sciences in the Research Triangle Park, North Carolina. Georgios V. Gkoutos, PhD, DIC, is Professor of Clinical Bioinformatics at College of Medical and Dental Sciences, Institute of Cancer and Genomic Sciences Centre for Computational Biology, University of Birmingham in Birmingham, United Kingdom. Robert Hoehndorf, PhD, is Assistant Professor in Computer Science at the Computer, Electrical and Mathematical Sciences and Engineering Division, Computational Bioscience Research Center, King Abdullah University of Science and Technology in Thuwal, Kingdom of Saudi Arabia. Charlotte M. Keenan, VMD, DACVP, is a principal consultant at C.M. ToxPath Consulting in Doylestown, Pennsylvania, USA and leads the international STP effort for the publication of the harmonization of nomenclature and diagnostic criteria (INHAND) in toxicologic pathology. Colin McKerlie, DVM, DVSc, MRCVS, is a senior associate scientist in the Translational Medicine Research Program at The Hospital for Sick Children and a Professor in the Department of Pathobiology & Laboratory Medicine in the Faculty of Medicine at the University of Toronto, Toronto, Ontario, Canada. Paul N. Schofield, MA DPhil, is the University Reader in Biomedical Informatics at the Department of Physiology, Development & Neuroscience, University of Cambridge in Cambridge, United Kingdom and is also an adjunct professor at The Jackson Laboratory in Bar Harbor, Maine. John P. Sundberg, DVM, PhD, DACVP, is a professor at The Jackson Laboratory in Bar Harbor, Maine. Jerrold M. Ward, DVM, PhD, DACVP, FIATP, is a special volunteer at the National Cancer Institute, National Institutes of Health in Bethesda, MD and is also Adjunct Faculty at The Jackson Laboratory in Bar Harbor, Maine.

The need for international collaboration in rodent pathology has evolved since the 1970s and was initially driven by the new field of toxicologic pathology. First initiated by the World Health Organization's International Agency for Research on Cancer for rodents, it has evolved to include pathology of the major species (rats, mice, guinea pigs, nonhuman primates, pigs, dogs, fish, rabbits) used in medical research, safety assessment, and mouse pathology. The collaborative effort today is driven by the needs of the regulatory agencies in multiple countries, and by needs of research involving genetically engineered animals, for "basic" research and for more translational preclinical models of human disease. These efforts led to the establishment of an international rodent pathology nomenclature program. Since that time, multiple collaborations for standardization of laboratory animal pathology nomenclature and diagnostic criteria have been developed, and just a few are described herein. Recently, approaches to a nomenclature that is amenable to sophisticated computation have been made available and implemented for large-scale programs in functional genomics and aging. Most terminologies continue to evolve as the science of human and veterinary pathology continues to develop, but standardization and successful implementation remain critical for scientific communication now as ever in the history of veterinary nosology.

Source
http://dx.doi.org/10.1093/ilar/ily005 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6927895 (PMC)
December 2018

Ontology-based validation and identification of regulatory phenotypes.

Bioinformatics 2018 09;34(17):i857-i865

Computer, Electrical and Mathematical Sciences and Engineering Division, Computational Bioscience Research Centre, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia.

Motivation: Function annotations of gene products, and phenotype annotations of genotypes, provide valuable information about molecular mechanisms that can be utilized by computational methods to identify functional and phenotypic relatedness, improve our understanding of disease and pathobiology, and lead to discovery of drug targets. Identifying functions and phenotypes commonly requires experiments which are time-consuming and expensive to carry out; creating the annotations additionally requires a curator to make an assertion based on reported evidence. Support for validating the mutual consistency of function and phenotype annotations, as well as a computational method to predict phenotypes from function annotations, would greatly improve the utility of function annotations.

Results: We developed a novel ontology-based method to validate the mutual consistency of function and phenotype annotations. We apply our method to mouse and human annotations, and identify several inconsistencies that can be resolved to improve overall annotation quality. We also apply our method to the rule-based prediction of regulatory phenotypes from functions and demonstrate that we can predict these phenotypes with an Fmax of up to 0.647.
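
For readers unfamiliar with the Fmax measure quoted above, the following is a minimal sketch of how a protein-centric Fmax is commonly computed in annotation-prediction evaluations: the maximum, over score thresholds, of the harmonic mean of averaged precision and recall. It is not taken from the article; the function name `fmax`, the data layout, and the toy gene and term identifiers are assumptions made for illustration, and the authors' exact evaluation protocol may differ.

```python
from typing import Dict, Set


def fmax(predictions: Dict[str, Dict[str, float]],
         truth: Dict[str, Set[str]],
         steps: int = 100) -> float:
    """Maximum F-measure over score thresholds (illustrative sketch).

    predictions: entity -> {term: confidence score in [0, 1]}
    truth:       entity -> set of terms actually annotated
    """
    best = 0.0
    for i in range(1, steps + 1):
        threshold = i / steps
        precisions = []      # only entities with at least one prediction
        recall_sum = 0.0     # summed over all entities in the ground truth
        for entity, true_terms in truth.items():
            scores = predictions.get(entity, {})
            predicted = {t for t, s in scores.items() if s >= threshold}
            hits = len(predicted & true_terms)
            if predicted:
                precisions.append(hits / len(predicted))
            if true_terms:
                recall_sum += hits / len(true_terms)
        if not precisions:
            continue  # nothing predicted at this threshold
        precision = sum(precisions) / len(precisions)
        recall = recall_sum / len(truth)
        if precision + recall > 0:
            best = max(best, 2 * precision * recall / (precision + recall))
    return best


# Hypothetical toy data: two genes, phenotype terms A-D.
toy_predictions = {"g1": {"A": 0.9, "B": 0.4}, "g2": {"C": 0.8}}
toy_truth = {"g1": {"A"}, "g2": {"C", "D"}}
print(round(fmax(toy_predictions, toy_truth), 3))  # 0.857
```

In this toy case the best trade-off occurs at thresholds between 0.4 and 0.8, where precision is 1.0 and recall is 0.75.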

Availability and Implementation: https://github.com/bio-ontology-research-group/phenogocon.

Source
http://dx.doi.org/10.1093/bioinformatics/bty605 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6129279 (PMC)
September 2018