Publications by authors named "Peter Szolovits"

83 Publications

ATLAS: an automated association test using probabilistically linked health records with application to genetic studies.

J Am Med Inform Assoc 2021 Nov;28(12):2582-2592

Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA.

Objective: Large amounts of health data are becoming available for biomedical research. Synthesizing information across databases may capture more comprehensive pictures of patient health and enable novel research studies. When no gold standard mappings between patient records are available, researchers may probabilistically link records from separate databases and analyze the linked data. However, previous linked data inference methods are constrained to certain linkage settings and exhibit low power. Here, we present ATLAS, an automated, flexible, and robust association testing algorithm for probabilistically linked data.

Materials And Methods: Missing variables are imputed at various thresholds using a weighted average method that propagates uncertainty from probabilistic linkage. Next, estimated effect sizes are obtained using a generalized linear model. ATLAS then conducts the threshold combination test by optimally combining P values obtained from data imputed at varying thresholds using Fisher's method and perturbation resampling.
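
The threshold combination step can be pictured with a short sketch. This is not the ATLAS implementation; it only shows how P values obtained from analyses run at several hypothetical linkage-probability thresholds might be pooled with Fisher's method, and it omits the perturbation-resampling calibration that ATLAS uses to account for correlation between thresholds.

```python
# Illustrative sketch, not the ATLAS code: pool association P values obtained
# from data imputed at several (hypothetical) linkage-probability thresholds.
from scipy.stats import combine_pvalues

# P values from the same association test, one per imputation threshold.
p_by_threshold = {0.5: 0.012, 0.7: 0.004, 0.9: 0.031}

stat, p_combined = combine_pvalues(list(p_by_threshold.values()), method="fisher")
print(f"Fisher statistic = {stat:.2f}, combined P = {p_combined:.4f}")
# ATLAS additionally calibrates this combination with perturbation resampling,
# since the per-threshold P values are correlated; that step is omitted here.
```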

Results: In simulations, ATLAS controls for type I error and exhibits high power compared to previous methods. In a real-world genetic association study, meta-analysis of ATLAS-enabled analyses on a linked cohort with analyses using an existing cohort yielded additional significant associations between rheumatoid arthritis genetic risk score and laboratory biomarkers.

Discussion: Weighted average imputation weathers false matches and increases contribution of true matches to mitigate linkage error-induced bias. The threshold combination test avoids arbitrarily choosing a threshold to rule a match, thus automating linked data-enabled analyses and preserving power.

Conclusion: ATLAS promises to enable novel and powerful research studies using linked data to capitalize on all available data sources.
DOI: http://dx.doi.org/10.1093/jamia/ocab187
November 2021

Visceral Adiposity and Severe COVID-19 Disease: Application of an Artificial Intelligence Algorithm to Improve Clinical Risk Prediction.

Open Forum Infect Dis 2021 Jul 28;8(7):ofab275. Epub 2021 May 28.

Medical Practice Evaluation Center, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, USA.

Background: Obesity has been linked to severe clinical outcomes among people who are hospitalized with coronavirus disease 2019 (COVID-19). We tested the hypothesis that visceral adipose tissue (VAT) is associated with severe outcomes in patients hospitalized with COVID-19, independent of body mass index (BMI).

Methods: We analyzed data from the Massachusetts General Hospital COVID-19 Data Registry, which included patients admitted with polymerase chain reaction-confirmed severe acute respiratory syndrome coronavirus 2 infection from March 11 to May 4, 2020. We used a validated, fully automated artificial intelligence (AI) algorithm to quantify VAT from computed tomography (CT) scans during or before the hospital admission. VAT quantification took an average of 2 ± 0.5 seconds per patient. We dichotomized VAT as high and low at a threshold of ≥100 cm² and used Kaplan-Meier curves and Cox proportional hazards regression to assess the relationship between VAT and death or intubation over 28 days, adjusting for age, sex, race, BMI, and diabetes status.
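
As a rough illustration of the modeling step, the sketch below fits a Cox proportional hazards model with the lifelines package to a made-up table; the column names, values, and reduced covariate set (sex, race, and diabetes omitted) are placeholders, not registry data or the study's code.

```python
import pandas as pd
from lifelines import CoxPHFitter

# Invented toy rows standing in for the registry; not real patient data.
df = pd.DataFrame({
    "followup_days":       [28, 14, 28, 9, 28, 21, 28, 6],
    "death_or_intubation": [0, 1, 0, 1, 0, 1, 0, 1],
    "high_vat":            [0, 1, 1, 1, 0, 0, 1, 1],   # VAT >= 100 cm^2
    "age":                 [55, 71, 63, 58, 49, 80, 66, 60],
    "bmi":                 [27.0, 31.5, 24.2, 22.6, 29.8, 26.4, 33.1, 21.9],
})

cph = CoxPHFitter()
cph.fit(df, duration_col="followup_days", event_col="death_or_intubation")
cph.print_summary()   # adjusted hazard ratio for high_vat appears in this table
```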

Results: A total of 378 participants had CT imaging. Kaplan-Meier curves showed that participants with high VAT had a greater risk of the outcome compared with those with low VAT (P < .005), especially in those with BMI <30 kg/m² (P < .005). In multivariable models, the adjusted hazard ratio (aHR) for high vs low VAT was unchanged (aHR, 1.97; 95% CI, 1.24-3.09), whereas BMI was no longer significant (aHR for obese vs normal BMI, 1.14; 95% CI, 0.71-1.82).

Conclusions: High VAT is associated with a greater risk of severe disease or death in COVID-19 and can offer more precise information to risk-stratify individuals beyond BMI. AI offers a promising approach to routinely ascertain VAT and improve clinical risk prediction in COVID-19.
DOI: http://dx.doi.org/10.1093/ofid/ofab275
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8244656
July 2021

Artificial intelligence to assess body composition on routine abdominal CT scans and predict mortality in pancreatic cancer - A recipe for your local application.

Eur J Radiol 2021 Sep 24;142:109834. Epub 2021 Jun 24.

MIT Computer Science & Artificial Intelligence Laboratory, 32 Vassar St, Cambridge, MA 02139, United States; Center for Evidence Based Imaging, Department of Radiology, Brigham and Women's Hospital, 20 Kent Street, Brookline, MA 02445, United States. Electronic address:

Background: Body composition is associated with mortality; however, its routine assessment is too time-consuming.

Purpose: To demonstrate the value of artificial intelligence (AI) to extract body composition measures from routine studies, we aimed to develop a fully automated AI approach to measure fat and muscles masses, to validate its clinical discriminatory value, and to provide the code, training data and workflow solutions to facilitate its integration into local practice.

Methods: We developed a neural network that quantified the tissue components at the L3 vertebral body level using data from the Liver Tumor Challenge (LiTS) and a pancreatic cancer cohort. We classified sarcopenia using accepted skeletal muscle index cut-offs and visceral fat based on its median value. We used Kaplan-Meier curves and Cox regression analysis to assess the association between these measures and mortality.
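
For readers who want to reproduce the classification step locally, the sketch below shows one way to turn an L3 muscle area into a skeletal muscle index and a sarcopenia flag; the areas, heights, and sex-specific cut-offs are illustrative assumptions, not the paper's data or thresholds.

```python
import pandas as pd

# Hypothetical segmentation outputs; all values below are illustrative only.
df = pd.DataFrame({
    "l3_muscle_area_cm2": [175.0, 98.0, 120.0],   # from the automated model
    "height_m":           [1.80, 1.62, 1.70],
    "sex":                ["M", "F", "M"],
})

# Skeletal muscle index (SMI) = L3 muscle area / height^2, in cm^2/m^2.
df["smi"] = df["l3_muscle_area_cm2"] / df["height_m"] ** 2

# Assumed sex-specific cut-offs for the sketch (not the paper's values).
cutoff = {"M": 52.4, "F": 38.5}
df["sarcopenia"] = df.apply(lambda row: row["smi"] < cutoff[row["sex"]], axis=1)
print(df[["smi", "sarcopenia"]])
```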

Results: Applying the algorithm trained on LiTS data to the local cohort yielded good agreement [>0.8 intraclass correlation (ICC)]; when trained on both datasets, it had excellent agreement (>0.9 ICC). The pancreatic cancer cohort had 136 patients (mean age: 67 ± 11 years; 54% women); 15% had sarcopenia; mean visceral fat was 142 cm². Concurrent with prior research, we found a significant association between sarcopenia and mortality [mean survival of 15 ± 12 vs. 22 ± 12 (p < 0.05), adjusted HR of 1.58 (95% CI: 1.03-3.33)] but no association between visceral fat and mortality. The detector analysis took 1 ± 0.5 s.

Conclusions: AI body composition analysis can provide meaningful imaging biomarkers from routine exams demonstrating AI's ability to further enhance the clinical value of radiology reports.
DOI: http://dx.doi.org/10.1016/j.ejrad.2021.109834
September 2021

Joint Modeling of Chest Radiographs and Radiology Reports for Pulmonary Edema Assessment.

Med Image Comput Comput Assist Interv 2020 Oct 29;12262:529-539. Epub 2020 Sep 29.

Massachusetts Institute of Technology, Cambridge, MA, USA.

We propose and demonstrate a novel machine learning algorithm that assesses pulmonary edema severity from chest radiographs. While large publicly available datasets of chest radiographs and free-text radiology reports exist, only limited numerical edema severity labels can be extracted from radiology reports. This is a significant challenge in learning such models for image classification. To take advantage of the rich information present in the radiology reports, we develop a neural network model that is trained on both images and free-text to assess pulmonary edema severity from chest radiographs at inference time. Our experimental results suggest that the joint image-text representation learning improves the performance of pulmonary edema assessment compared to a supervised model trained on images only. We also show the use of the text for explaining the image classification by the joint model. To the best of our knowledge, our approach is the first to leverage free-text radiology reports for improving the image model performance in this application. Our code is available at: https://github.com/RayRuizhiLiao/joint_chestxray.
DOI: http://dx.doi.org/10.1007/978-3-030-59713-9_51
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7901713
October 2020

Hard for humans, hard for machines: predicting readmission after psychiatric hospitalization using narrative notes.

Transl Psychiatry 2021 01 11;11(1):32. Epub 2021 Jan 11.

Center for Quantitative Health, Division of Clinical Research, Massachusetts General Hospital, 185 Cambridge Street, Boston, MA, 02114, USA.

Machine learning has been suggested as a means of identifying individuals at greatest risk for hospital readmission, including psychiatric readmission. We sought to compare the performance of predictive models that use interpretable representations derived via topic modeling to the performance of human experts and nonexperts. We examined all 5076 admissions to a general psychiatry inpatient unit between 2009 and 2016 using electronic health records. We developed multiple models to predict 180-day readmission for these admissions based on features derived from narrative discharge summaries, augmented by baseline sociodemographic and clinical features. We developed models using a training set comprising 70% of the cohort and evaluated them on the remaining 30%. Baseline models using demographic features for prediction achieved an area under the curve (AUC) of 0.675 [95% CI 0.674-0.676] on an independent testing set, while language-based models also incorporating bag-of-words features, discharge summary topics identified by latent Dirichlet allocation (LDA), and prior psychiatric admissions achieved an AUC of 0.726 [95% CI 0.725-0.727]. To characterize the difficulty of the task, we also compared the performance of these classifiers to both expert and nonexpert human raters, with and without feedback, on a subset of 75 test cases. These models outperformed humans on average, including predictions by experienced psychiatrists. Typical note tokens or topics associated with readmission risk were related to pregnancy/postpartum state, family relationships, and psychosis.
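
A toy sketch of the language-based arm is given below: LDA topics learned from discharge text are concatenated with a baseline feature and fed to a logistic regression. The notes, labels, topic count, and feature mix are invented placeholders, not the study's pipeline.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Invented snippets standing in for discharge summaries, with toy labels.
notes = [
    "discharged on lithium after a manic episode with poor follow-up",
    "postpartum depression with psychotic features, limited family support",
    "stable on sertraline, strong family support, engaged in outpatient care",
    "command hallucinations, poor insight, multiple prior admissions",
]
readmitted_180d = np.array([1, 1, 0, 1])
age = np.array([[34], [29], [52], [41]])          # example baseline feature

counts = CountVectorizer(stop_words="english").fit_transform(notes)
topics = LatentDirichletAllocation(n_components=2, random_state=0).fit_transform(counts)

X = np.hstack([topics, age])                      # topic mixture + baseline feature
model = LogisticRegression(max_iter=1000).fit(X, readmitted_180d)
print(roc_auc_score(readmitted_180d, model.predict_proba(X)[:, 1]))  # toy, in-sample
```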
DOI: http://dx.doi.org/10.1038/s41398-020-01104-w
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7801508
January 2021

A multidimensional precision medicine approach identifies an autism subtype characterized by dyslipidemia.

Nat Med 2020 09 10;26(9):1375-1379. Epub 2020 Aug 10.

Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.

The promise of precision medicine lies in data diversity. More than the sheer size of biomedical data, it is the layering of multiple data modalities, offering complementary perspectives, that is thought to enable the identification of patient subgroups with shared pathophysiology. In the present study, we use autism to test this notion. By combining healthcare claims, electronic health records, familial whole-exome sequences and neurodevelopmental gene expression patterns, we identified a subgroup of patients with dyslipidemia-associated autism.
DOI: http://dx.doi.org/10.1038/s41591-020-1007-0
September 2020

Three-Dimensional Neural Network to Automatically Assess Liver Tumor Burden Change on Consecutive Liver MRIs.

J Am Coll Radiol 2020 Nov 25;17(11):1475-1484. Epub 2020 Jul 25.

Director of the Center for Evidence Imaging and Vice Chair of Quality/Safety, Department of Radiology, Brigham and Women's Hospital, Boston, Massachusetts.

Background: Tumor response to therapy is often assessed by measuring change in liver lesion size between consecutive MRIs. However, these evaluations are both tedious and time-consuming for clinical radiologists.

Purpose: In this study, we sought to develop a convolutional neural network to detect liver metastases on MRI and applied this algorithm to assess change in tumor size on consecutive examinations.

Methods: We annotated a data set of 64 patients with neuroendocrine tumors who underwent at least two consecutive liver MRIs with gadoxetic acid. We then developed a 3-D neural network using a U-Net architecture with ResNet-18 building blocks that first detected the liver and then lesions within the liver. Liver lesion labels for each examination were then matched in 3-D space using an iterative closest point algorithm followed by the Kuhn-Munkres algorithm.
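
The matching step can be pictured with the short sketch below, which pairs lesion centroids from two examinations using SciPy's Hungarian (Kuhn-Munkres) solver on a Euclidean distance matrix; the centroids are fabricated, and the iterative-closest-point registration that precedes matching in the paper is not shown.

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy.optimize import linear_sum_assignment

# Fabricated lesion centroids (mm) on two already co-registered examinations.
centroids_exam1 = np.array([[40.0, 102.0, 55.0],
                            [61.0,  98.0, 40.0],
                            [75.0, 120.0, 62.0]])
centroids_exam2 = np.array([[62.0,  97.0, 41.0],
                            [41.0, 103.0, 54.0],
                            [74.0, 121.0, 61.0]])

cost = cdist(centroids_exam1, centroids_exam2)    # pairwise Euclidean distances
rows, cols = linear_sum_assignment(cost)          # Kuhn-Munkres optimal pairing
for i, j in zip(rows, cols):
    print(f"lesion {i} on exam 1 -> lesion {j} on exam 2 ({cost[i, j]:.1f} mm apart)")
```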

Results: We developed a deep learning algorithm that detected liver metastases, co-registered the detected lesions, and then assessed the interval change in tumor burden between two multiparametric liver MRI examinations. Our deep learning algorithm was concordant with the radiologists' manual assessment of the interval change in disease burden in 91% of cases. It had a sensitivity of 0.85 (95% confidence interval [CI]: 0.77-0.93) and a specificity of 0.92 (95% CI: 0.87-0.96) for classifying liver segments as diseased or healthy. The mean Dice coefficient for individual lesions ranged between 0.73 and 0.81.

Conclusions: Our algorithm displayed high agreement with human readers for detecting change in liver lesions on MRI, offering evidence that artificial intelligence-based detectors may perform these tasks as part of routine clinical care in the future.
DOI: http://dx.doi.org/10.1016/j.jacr.2020.06.033
November 2020

Advancing PICO element detection in biomedical text via deep neural networks.

Bioinformatics 2020 06;36(12):3856-3862

Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.

Motivation: In evidence-based medicine, defining a clinical question in terms of the specific patient problem helps physicians efficiently identify appropriate resources and search for the best available evidence for medical treatment. In order to formulate a well-defined, focused clinical question, a framework called PICO is widely used, which identifies the sentences in a given medical text that belong to the four components typically reported in clinical trials: Participants/Problem (P), Intervention (I), Comparison (C) and Outcome (O). In this work, we propose a novel deep learning model for recognizing PICO elements in biomedical abstracts. Based on the previous state-of-the-art bidirectional long short-term memory (bi-LSTM) plus conditional random field architecture, we add another layer of bi-LSTM upon the sentence representation vectors so that the contextual information from surrounding sentences can be gathered to help infer the interpretation of the current one. In addition, we propose two methods to further generalize and improve the model: adversarial training and unsupervised pre-training over large corpora.
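
The sentence-context idea can be sketched schematically in PyTorch as below; the dimensions, the four-way label set, and the random "sentence vectors" are placeholders, and the CRF layer, adversarial training, and unsupervised pre-training described above are omitted.

```python
import torch
import torch.nn as nn

class SentenceContextTagger(nn.Module):
    """Second-level bi-LSTM over pre-computed sentence vectors."""

    def __init__(self, sent_dim=128, hidden=64, n_labels=4):   # e.g. P, I, C, O
        super().__init__()
        self.context_lstm = nn.LSTM(sent_dim, hidden,
                                    batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_labels)

    def forward(self, sentence_vectors):      # (batch, n_sentences, sent_dim)
        context, _ = self.context_lstm(sentence_vectors)
        return self.out(context)              # per-sentence label logits

# One abstract with 10 sentence vectors (random stand-ins for a token encoder).
abstract = torch.randn(1, 10, 128)
print(SentenceContextTagger()(abstract).shape)   # torch.Size([1, 10, 4])
```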

Results: We tested our proposed approach on two benchmark datasets. One is the PubMed-PICO dataset, where our best results outperform the previous best by 5.5%, 7.9% and 5.8% for P, I and O elements in terms of F1 score, respectively. For the other dataset, NICTA-PIBOSO, the improvements for P/I/O elements are 3.9%, 15.6% and 1.3% in F1 score, respectively. Overall, our proposed deep learning model can obtain unprecedented PICO element detection accuracy while avoiding the need for any manual feature selection.

Availability And Implementation: Code is available at https://github.com/jind11/Deep-PICO-Detection.
DOI: http://dx.doi.org/10.1093/bioinformatics/btaa256
June 2020

High-throughput phenotyping with electronic medical record data using a common semi-supervised approach (PheCAP).

Nat Protoc 2019 12 20;14(12):3426-3444. Epub 2019 Nov 20.

Division of Rheumatology, Immunology, and Allergy, Brigham and Women's Hospital, Boston, MA, USA.

Phenotypes are the foundation for clinical and genetic studies of disease risk and outcomes. The growth of biobanks linked to electronic medical record (EMR) data has both facilitated and increased the demand for efficient, accurate, and robust approaches for phenotyping millions of patients. Challenges to phenotyping with EMR data include variation in the accuracy of codes, as well as the high level of manual input required to identify features for the algorithm and to obtain gold standard labels. To address these challenges, we developed PheCAP, a high-throughput semi-supervised phenotyping pipeline. PheCAP begins with data from the EMR, including structured data and information extracted from the narrative notes using natural language processing (NLP). The standardized steps integrate automated procedures, which reduce the level of manual input, and machine learning approaches for algorithm training. PheCAP itself can be executed in 1-2 d if all data are available; however, the timing is largely dependent on the chart review stage, which typically requires at least 2 weeks. The final products of PheCAP include a phenotype algorithm, the probability of the phenotype for all patients, and a phenotype classification (yes or no).
DOI: http://dx.doi.org/10.1038/s41596-019-0227-6
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7323894
December 2019

High-throughput multimodal automated phenotyping (MAP) with application to PheWAS.

J Am Med Inform Assoc 2019 11;26(11):1255-1262

Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.

Objective: Electronic health records linked with biorepositories are a powerful platform for translational studies. A major bottleneck exists in the ability to phenotype patients accurately and efficiently. The objective of this study was to develop an automated high-throughput phenotyping method integrating International Classification of Diseases (ICD) codes and narrative data extracted using natural language processing (NLP).

Materials And Methods: We developed a mapping method for automatically identifying relevant ICD and NLP concepts for a specific phenotype, leveraging the Unified Medical Language System. Along with health care utilization, aggregated ICD and NLP counts were jointly analyzed by fitting an ensemble of latent mixture models. The multimodal automated phenotyping (MAP) algorithm yields a predicted probability of phenotype for each patient and a threshold for classifying participants as phenotype yes/no. The algorithm was validated using labeled data for 16 phenotypes from a biorepository and further tested through phenome-wide association studies (PheWAS) in an independent cohort for 2 single nucleotide polymorphisms with known associations.

Results: The MAP algorithm achieved higher or similar AUC and F-scores compared to the ICD code across all 16 phenotypes. The features assembled via the automated approach had comparable accuracy to those assembled via manual curation (AUC: MAP 0.943 vs manual 0.941). The PheWAS results suggest that the MAP approach detected previously validated associations with higher power when compared to the standard PheWAS method based on ICD codes.

Conclusion: The MAP approach increased the accuracy of phenotype definition while maintaining scalability, thereby facilitating use in studies requiring large-scale phenotyping, such as PheWAS.
DOI: http://dx.doi.org/10.1093/jamia/ocz066
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6798574
November 2019

Early Prediction of Acute Kidney Injury in Critical Care Setting Using Clinical Notes and Structured Multivariate Physiological Measurements.

Stud Health Technol Inform 2019 Aug;264:368-372

Feinberg School of Medicine, Northwestern University, Chicago, IL, USA.

The onset of acute kidney injury (AKI) during an intensive care unit (ICU) admission is associated with increased morbidity and mortality. Developing novel methods to identify early AKI onset is of critical importance in preventing or reducing AKI complications. We built and applied multiple machine learning models to integrate clinical notes and structured physiological measurements and estimate the risk of new AKI onset using the MIMIC-III database. From the clinical notes, we generated clinically meaningful word representations and embeddings. Four supervised learning classifiers and a mixed-feature deep learning architecture were used to construct prediction models. The best configurations consistently utilized both structured and unstructured clinical features and yielded competitive AUCs above 0.83. Our work suggests that integrating structured and unstructured clinical features can be effectively applied to assist clinicians in identifying the risk of incident AKI onset in critically ill patients upon admission to the ICU.
DOI: http://dx.doi.org/10.3233/SHTI190245
August 2019

Deep Learning Benchmarks on L1000 Gene Expression Data.

IEEE/ACM Trans Comput Biol Bioinform 2020 Nov-Dec;17(6):1846-1857. Epub 2020 Dec 8.

Gene expression data can offer deep, physiological insights beyond the static coding of the genome alone. We believe that realizing this potential requires specialized, high-capacity machine learning methods capable of using underlying biological structure, but the development of such models is hampered by the lack of published benchmark tasks and well-characterized baselines. In this work, we establish such benchmarks and baselines by profiling many classifiers against biologically motivated tasks on two curated views of a large, public gene expression dataset (the LINCS corpus) and one privately produced dataset. We provide these two curated views of the public LINCS dataset and our benchmark tasks to enable direct comparisons to future methodological work and help spur deep learning method development on this modality. In addition to profiling a battery of traditional classifiers, including linear models, random forests, decision trees, K nearest neighbor (KNN) classifiers, and feed-forward artificial neural networks (FF-ANNs), we also test a method novel to this data modality: graph convolutional neural networks (GCNNs), which allow us to incorporate prior biological domain knowledge. We find that GCNNs can be highly performant with large datasets, whereas FF-ANNs consistently perform well. Non-neural classifiers are dominated by linear models and KNN classifiers.
DOI: http://dx.doi.org/10.1109/TCBB.2019.2910061
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6980363
December 2020

Use of machine-learning algorithms to determine features of systolic blood pressure variability that predict poor outcomes in hypertensive patients.

Clin Kidney J 2019 Apr 3;12(2):206-212. Epub 2018 Jul 3.

Dialysis Clinic, Inc., Nashville, TN, USA.

Background: We re-analyzed data from the Systolic Blood Pressure Intervention Trial (SPRINT) to identify features of systolic blood pressure (SBP) variability that portend poor cardiovascular outcomes, using a nonlinear machine-learning algorithm.

Methods: We included all patients who completed 1 year of the study without reaching any primary endpoint during the first year, specifically: myocardial infarction, other acute coronary syndromes, stroke, heart failure or death from a cardiovascular event (n = 8799; 94%). In addition to clinical variables, features representing longitudinal SBP trends and variability were determined and combined in a random forest algorithm, optimized using cross-validation, using 70% of patients in the training set. Area under the curve (AUC) was measured using a 30% testing set. Finally, feature importance was determined by minimizing node impurity averaging over all trees in the forest for a specific feature.
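
A compressed sketch of this type of feature engineering is shown below: each simulated SBP trajectory is summarized by wavelet-energy and percentile features, and a random forest ranks them by impurity-based importance. The data, wavelet choice, and feature set are assumptions for illustration, not the SPRINT analysis.

```python
import numpy as np
import pywt
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
n_patients, n_visits = 200, 16
sbp = 130 + 10 * rng.standard_normal((n_patients, n_visits))   # simulated mmHg series
outcome = rng.integers(0, 2, n_patients)                       # simulated CVD events

def summarize(series):
    coeffs = pywt.wavedec(series, "db2", level=2)              # wavelet decomposition
    energies = [float(np.sum(c ** 2)) for c in coeffs]         # energy per sub-band
    return energies + [float(np.percentile(series, 90)), float(series.std())]

X = np.array([summarize(s) for s in sbp])
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, outcome)
print(rf.feature_importances_)   # impurity-based importance, one value per feature
```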

Results: A total of 365 patients (4.1%) reached the combined primary outcome over 37 months of follow-up. The random forest classifier had an AUC of 0.71 on the testing set. The 10 most significant features selected in order of importance by the automated algorithm included the urine albumin/creatinine (CR) ratio, estimated glomerular filtration rate, age, serum CR, history of subclinical cardiovascular disease (CVD), cholesterol, a variable representing SBP signals using wavelet transformation, high-density lipoprotein, the 90th percentile of SBP and triglyceride level.

Conclusions: We successfully demonstrated use of random forest algorithm to define best prognostic longitudinal SBP representations. In addition to known risk factors for CVD, transformed variables for time series SBP measurements were found to be important in predicting poor cardiovascular outcomes and require further evaluation.
DOI: http://dx.doi.org/10.1093/ckj/sfy049
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6452173
April 2019

Can AI Help Reduce Disparities in General Medical and Mental Health Care?

AMA J Ethics 2019 02 1;21(2):E167-179. Epub 2019 Feb 1.

An assistant professor of computer science and medicine at the University of Toronto and a faculty member at the Vector Institute, both in Ontario, Canada; previously served as a visiting researcher at Alphabet Inc. within its life sciences research organization, Verily, and as a postdoctoral fellow at the Massachusetts Institute of Technology.

Background: As machine learning becomes increasingly common in health care applications, concerns have been raised about bias in these systems' data, algorithms, and recommendations. Simply put, as health care improves for some, it might not improve for all.

Methods: Two case studies are examined using a machine learning algorithm on unstructured clinical and psychiatric notes to predict intensive care unit (ICU) mortality and 30-day psychiatric readmission with respect to race, gender, and insurance payer type as a proxy for socioeconomic status.

Results: Clinical note topics and psychiatric note topics were heterogenous with respect to race, gender, and insurance payer type, which reflects known clinical findings. Differences in prediction accuracy and therefore machine bias are shown with respect to gender and insurance type for ICU mortality and with respect to insurance policy for psychiatric 30-day readmission.

Conclusions: This analysis can provide a framework for assessing and identifying disparate impacts of artificial intelligence in health care.
DOI: http://dx.doi.org/10.1001/amajethics.2019.167
February 2019

Probabilistic record linkage of de-identified research datasets with discrepancies using diagnosis codes.

Sci Data 2019 01 8;6:180298. Epub 2019 Jan 8.

Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA.

We develop an algorithm for probabilistic linkage of de-identified research datasets at the patient level, when only diagnosis codes with discrepancies and no personal health identifiers such as name or date of birth are available. It relies on Bayesian modelling of binarized diagnosis codes, and provides a posterior probability of matching for each patient pair, while considering all the data at once. Both in our simulation study (using an administrative claims dataset for data generation) and in two real use-cases linking patient electronic health records from a large tertiary care network, our method exhibits good performance and compares favourably to the standard baseline Fellegi-Sunter algorithm. We propose a scalable, fast and efficient open-source implementation in the ludic R package available on CRAN, which also includes the anonymized diagnosis code data from our real use-case. This work suggests it is possible to link de-identified research databases stripped of any personal health identifiers using only diagnosis codes, provided sufficient information is shared between the data sources.
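
As a much-simplified illustration of scoring candidate pairs from diagnosis codes alone (and not a reproduction of the Bayesian model in the ludic package), one can weight agreement on rare codes more heavily than agreement on common ones; the code prevalences and error rate below are made-up assumptions.

```python
import numpy as np

# Rows = patients, columns = presence (1) / absence (0) of each diagnosis code.
codes_db1 = np.array([[1, 0, 1, 0, 1],
                      [0, 1, 0, 0, 0]])
codes_db2 = np.array([[1, 0, 1, 1, 1],
                      [0, 1, 0, 0, 1]])

prevalence = np.array([0.10, 0.05, 0.02, 0.20, 0.30])   # assumed code frequencies
error_rate = 0.05                                        # assumed recording error

def match_score(a, b):
    """Crude log-likelihood-ratio-style score: agreeing on a rare code is
    stronger evidence that two records belong to the same patient."""
    agree_present = (a == 1) & (b == 1)
    return float(np.sum(agree_present * np.log((1 - error_rate) / prevalence)))

for i, a in enumerate(codes_db1):
    scores = [match_score(a, b) for b in codes_db2]
    print(f"record {i} in DB1 best matches record {int(np.argmax(scores))} in DB2",
          f"(score {max(scores):.2f})")
```
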
DOI: http://dx.doi.org/10.1038/sdata.2018.298
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6326114
January 2019

Implementing a Portable Clinical NLP System with a Common Data Model - a Lisp Perspective.

Proceedings (IEEE Int Conf Bioinformatics Biomed) 2018 Dec 24;2018:461-466. Epub 2019 Jan 24.

CSAIL, MIT, Cambridge, USA.

This paper presents a Lisp architecture for a portable NLP system, termed LAPNLP, for processing clinical notes. LAPNLP integrates multiple standard, customized and in-house developed NLP tools. Our system facilitates portability across different institutions and data systems by incorporating an enriched Common Data Model (CDM) to standardize necessary data elements. It utilizes UMLS to perform domain adaptation when integrating generic domain NLP tools. It also features stand-off annotations that are specified by positional reference to the original document. We built an interval tree based search engine to efficiently query and retrieve the stand-off annotations by specifying positional requirements. We also developed a utility to convert an inline annotation format to stand-off annotations to enable the reuse of clinical text datasets with in-line annotations. We experimented with our system on several NLP facilitated tasks including computational phenotyping for lymphoma patients and semantic relation extraction for clinical notes. These experiments showcased the broader applicability and utility of LAPNLP.
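
The stand-off-annotation lookup can be mimicked in a few lines with the third-party Python intervaltree package (LAPNLP itself is a Lisp system, so this is only an analogy); the note offsets and annotation payloads below are made up.

```python
from intervaltree import IntervalTree

tree = IntervalTree()
# Stand-off annotations keyed by character offsets into the original note.
tree[10:25] = {"type": "Medication", "text": "aspirin 81 mg daily"}
tree[30:42] = {"type": "Problem", "text": "heart failure"}

# Retrieve every annotation overlapping a character span of interest.
for interval in sorted(tree[20:35]):
    print(interval.begin, interval.end, interval.data)
```
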
DOI: http://dx.doi.org/10.1109/bibm.2018.8621521
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7769694
December 2018

Artificial intelligence, machine learning and health systems.

J Glob Health 2018 Dec;8(2):020303

Department of Global Health and Population, Harvard TH Chan School of Public Health, Harvard University, Boston, Massachusetts, USA.

DOI: http://dx.doi.org/10.7189/jogh.08.020303
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6199467
December 2018

What's in a Note? Unpacking Predictive Value in Clinical Note Representations.

AMIA Jt Summits Transl Sci Proc 2018 18;2017:26-34. Epub 2018 May 18.

Massachusetts Institute of Technology, Cambridge, MA, USA.

Electronic Health Records (EHRs) have seen a rapid increase in adoption during the last decade. The narrative prose contained in clinical notes is unstructured and unlocking its full potential has proved challenging. Many studies incorporating clinical notes have applied simple information extraction models to build representations that enhance a downstream clinical prediction task, such as mortality or readmission. Improved predictive performance suggests a "good" representation. However, these extrinsic evaluations are blind to most of the insight contained in the notes. In order to better understand the power of expressive clinical prose, we investigate both intrinsic and extrinsic methods for understanding several common note representations. To ensure replicability and to support the clinical modeling community, we run all experiments on publicly-available data and provide our code.
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5961801
May 2018

3D-MICE: integration of cross-sectional and longitudinal imputation for multi-analyte longitudinal clinical data.

J Am Med Inform Assoc 2018 06;25(6):645-653

Department of Pathology, Massachusetts General Hospital, Boston, MA, USA.

Objective: A key challenge in clinical data mining is that most clinical datasets contain missing data. Since many commonly used machine learning algorithms require complete datasets (no missing data), clinical analytic approaches often entail an imputation procedure to "fill in" missing data. However, although most clinical datasets contain a temporal component, most commonly used imputation methods do not adequately accommodate longitudinal time-based data. We sought to develop a new imputation algorithm, 3-dimensional multiple imputation with chained equations (3D-MICE), that can perform accurate imputation of missing clinical time series data.

Methods: We extracted clinical laboratory test results for 13 commonly measured analytes (clinical laboratory tests). We imputed missing test results for the 13 analytes using 3 imputation methods: multiple imputation with chained equations (MICE), Gaussian process (GP), and 3D-MICE. 3D-MICE utilizes both MICE and GP imputation to integrate cross-sectional and longitudinal information. To evaluate imputation method performance, we randomly masked selected test results and imputed these masked results alongside results missing from our original data. We compared predicted results to measured results for masked data points.
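
The sketch below conveys the general idea under simplifying assumptions: a masked sodium value is imputed cross-sectionally from the other analytes at the same draw and longitudinally from its own time series with a Gaussian process, and the two estimates are then averaged. The actual 3D-MICE weighting scheme and MICE machinery differ from this placeholder.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.gaussian_process import GaussianProcessRegressor

# Toy panel for one patient: rows = draw times (hours), columns = 3 analytes.
times = np.array([0.0, 6.0, 12.0, 18.0, 24.0])
labs = np.array([
    [140.0, 4.1,  98.0],
    [141.0, 4.0,  99.0],
    [np.nan, 4.3, 101.0],   # sodium masked at t = 12 h
    [143.0, 4.4, 102.0],
    [144.0, 4.2, 103.0],
])

# Cross-sectional estimate: chained-equations imputation across analytes.
cross = IterativeImputer(random_state=0).fit_transform(labs)[2, 0]

# Longitudinal estimate: Gaussian process over the analyte's own time series.
observed = ~np.isnan(labs[:, 0])
gp = GaussianProcessRegressor(normalize_y=True).fit(
    times[observed].reshape(-1, 1), labs[observed, 0])
longitudinal = gp.predict(np.array([[12.0]]))[0]

# 3D-MICE combines the two with principled weights; a plain mean is used here.
print(f"cross-sectional {cross:.1f}, longitudinal {longitudinal:.1f}, "
      f"combined {(cross + longitudinal) / 2:.1f}")
```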

Results: 3D-MICE performed significantly better than MICE and GP-based imputation in a composite of all 13 analytes, predicting missing results with a normalized root-mean-square error of 0.342, compared to 0.373 for MICE alone and 0.358 for GP alone.

Conclusions: 3D-MICE offers a novel and practical approach to imputing clinical laboratory time series data. 3D-MICE may provide an additional tool for use as a foundation in clinical predictive analytics and intelligent clinical decision support.
DOI: http://dx.doi.org/10.1093/jamia/ocx133
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7646951
June 2018

Medical subdomain classification of clinical notes using a machine learning-based natural language processing approach.

BMC Med Inform Decis Mak 2017 Dec 1;17(1):155. Epub 2017 Dec 1.

Laboratory of Computer Science, Massachusetts General Hospital, 50 Staniford Street, Suite 750, Boston, MA, 02114, USA.

Background: The medical subdomain of a clinical note, such as cardiology or neurology, is useful content-derived metadata for developing machine learning downstream applications. To classify the medical subdomain of a note accurately, we have constructed a machine learning-based natural language processing (NLP) pipeline and developed medical subdomain classifiers based on the content of the note.

Methods: We constructed the pipeline using the clinical NLP system clinical Text Analysis and Knowledge Extraction System (cTAKES), the Unified Medical Language System (UMLS) Metathesaurus and Semantic Network, and learning algorithms to extract features from two datasets - clinical notes from the Integrating Data for Analysis, Anonymization, and Sharing (iDASH) data repository (n = 431) and Massachusetts General Hospital (MGH) (n = 91,237) - and built medical subdomain classifiers with different combinations of data representation methods and supervised learning algorithms. We evaluated the performance of the classifiers and their portability across the two datasets.
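
A minimal analogue of the interpretable shallow configuration is sketched below with scikit-learn, using tf-idf bag-of-words features and a linear SVM; the example notes and subdomain labels are invented, and the UMLS concept features used in the paper are omitted.

```python
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

# Invented notes and labels, for illustration only.
notes = [
    "ecg shows atrial fibrillation, started on metoprolol",
    "mri brain demonstrates acute ischemic infarct",
    "ejection fraction 35 percent, uptitrate lisinopril",
    "seizure activity on eeg, load with levetiracetam",
]
subdomain = ["cardiology", "neurology", "cardiology", "neurology"]

clf = Pipeline([("tfidf", TfidfVectorizer(ngram_range=(1, 2))),
                ("svm", LinearSVC())]).fit(notes, subdomain)
print(clf.predict(["atrial fibrillation with rapid ventricular response"]))
```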

Results: The medical subdomain classifier trained as a convolutional recurrent neural network with neural word embeddings yielded the best performance on the iDASH and MGH datasets, with areas under the receiver operating characteristic curve (AUC) of 0.975 and 0.991, and F1 scores of 0.845 and 0.870, respectively. Considering better clinical interpretability, the linear support vector machine-trained medical subdomain classifier using hybrid bag-of-words and clinically relevant UMLS concepts as the feature representation, with term frequency-inverse document frequency (tf-idf) weighting, outperformed the other shallow learning classifiers on the iDASH and MGH datasets, with AUCs of 0.957 and 0.964 and F1 scores of 0.932 and 0.934, respectively. When classifiers trained on one dataset were applied to the other, F1 scores of at least 0.7 were achieved for half of the medical subdomains we studied.

Conclusion: Our study shows that a supervised learning-based NLP approach is useful for developing medical subdomain classifiers. The deep learning algorithm with distributed word representations yields better performance, yet shallow learning algorithms with word and concept representations achieve comparable performance with better clinical interpretability. Portable classifiers may also be used across datasets from different institutions.
DOI: http://dx.doi.org/10.1186/s12911-017-0556-8
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5709846
December 2017

Enabling phenotypic big data with PheNorm.

J Am Med Inform Assoc 2018 01;25(1):54-60

Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA.

Objective: Electronic health record (EHR)-based phenotyping infers whether a patient has a disease based on the information in his or her EHR. A human-annotated training set with gold-standard disease status labels is usually required to build an algorithm for phenotyping based on a set of predictive features. The time intensiveness of annotation and feature curation severely limits the ability to achieve high-throughput phenotyping. While previous studies have successfully automated feature curation, annotation remains a major bottleneck. In this paper, we present PheNorm, a phenotyping algorithm that does not require expert-labeled samples for training.

Methods: The most predictive features, such as the number of International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes or mentions of the target phenotype, are normalized to resemble a normal mixture distribution with high area under the receiver operating characteristic curve (AUC) for prediction. The transformed features are then denoised and combined into a score for accurate disease classification.
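
The normalization-plus-mixture idea can be sketched as follows on simulated counts; the utilization scaling, the use of a single surrogate feature, and the absence of PheNorm's denoising and feature-combination steps are all simplifications rather than the published algorithm.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(2)
# Simulated ICD-9-CM counts: mostly non-cases plus a smaller high-count group.
icd_count = np.concatenate([rng.poisson(0.2, 800), rng.poisson(6.0, 200)])
note_count = rng.poisson(40, 1000) + 1            # healthcare utilization proxy

# Utilization-normalized surrogate, then a two-component normal mixture.
x = (np.log1p(icd_count) / np.log1p(note_count)).reshape(-1, 1)
gm = GaussianMixture(n_components=2, random_state=0).fit(x)
case_component = int(np.argmax(gm.means_.ravel()))
phenotype_score = gm.predict_proba(x)[:, case_component]   # per-patient probability
print(phenotype_score[:5].round(3))
```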

Results: We validated the accuracy of PheNorm with 4 phenotypes: coronary artery disease, rheumatoid arthritis, Crohn's disease, and ulcerative colitis. The AUCs of the PheNorm score reached 0.90, 0.94, 0.95, and 0.94 for the 4 phenotypes, respectively, which were comparable to the accuracy of supervised algorithms trained with sample sizes of 100-300, with no statistically significant difference.

Conclusion: The accuracy of the PheNorm algorithms is on par with algorithms trained with annotated samples. PheNorm fully automates the generation of accurate phenotyping algorithms and demonstrates the capacity for EHR-driven annotations to scale to the next level - phenotypic big data.
DOI: http://dx.doi.org/10.1093/jamia/ocx111
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6251688
January 2018

Segment convolutional neural networks (Seg-CNNs) for classifying relations in clinical notes.

J Am Med Inform Assoc 2018 01;25(1):93-98

Department of Preventive Medicine and Medical Social Science, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA.

We propose Segment Convolutional Neural Networks (Seg-CNNs) for classifying relations from clinical notes. Seg-CNNs use only word-embedding features without manual feature engineering. Unlike typical CNN models, relations between 2 concepts are identified by simultaneously learning separate representations for text segments in a sentence: preceding, concept1, middle, concept2, and succeeding. We evaluate Seg-CNN on the i2b2/VA relation classification challenge dataset. We show that Seg-CNN achieves a state-of-the-art micro-average F-measure of 0.742 for overall evaluation, 0.686 for classifying medical problem-treatment relations, 0.820 for medical problem-test relations, and 0.702 for medical problem-medical problem relations. We demonstrate the benefits of learning segment-level representations. We show that medical domain word embeddings help improve relation classification. Seg-CNNs can be trained quickly for the i2b2/VA dataset on a graphics processing unit (GPU) platform. These results support the use of CNNs computed over segments of text for classifying medical relations, as they show state-of-the-art performance while requiring no manual feature engineering.
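
A schematic PyTorch sketch of the segment-level idea is given below: the five text segments each pass through their own convolution and max-pooling, and the pooled vectors are concatenated for relation classification. The dimensions, filter sizes, and relation label count are arbitrary assumptions, not the published configuration.

```python
import torch
import torch.nn as nn

class SegCNN(nn.Module):
    """One convolution + max-pool per text segment, concatenated for classification."""

    def __init__(self, emb_dim=100, n_filters=50, n_relations=8):  # sizes arbitrary
        super().__init__()
        self.convs = nn.ModuleList(
            [nn.Conv1d(emb_dim, n_filters, kernel_size=3, padding=1) for _ in range(5)])
        self.out = nn.Linear(5 * n_filters, n_relations)

    def forward(self, segments):       # 5 tensors of shape (batch, emb_dim, seg_len)
        pooled = [conv(seg).max(dim=2).values for conv, seg in zip(self.convs, segments)]
        return self.out(torch.cat(pooled, dim=1))

# preceding, concept1, middle, concept2, succeeding segments for one sentence
segments = [torch.randn(1, 100, seg_len) for seg_len in (6, 2, 4, 3, 5)]
print(SegCNN()(segments).shape)        # torch.Size([1, 8])
```
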
DOI: http://dx.doi.org/10.1093/jamia/ocx090
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6381760
January 2018

Predicting intervention onset in the ICU with switching state space models.

AMIA Jt Summits Transl Sci Proc 2017 26;2017:82-91. Epub 2017 Jul 26.

Harvard University, Cambridge, MA, USA.

The impact of many intensive care unit interventions has not been fully quantified, especially in heterogeneous patient populations. We train unsupervised switching state autoregressive models on vital signs from the public MIMIC-III database to capture patient movement between physiological states. We compare our learned states to static demographics and raw vital signs in the prediction of five ICU treatments: ventilation, vasopressor administration, and three transfusions. We show that our learned states, when combined with demographics and raw vital signs, improve prediction for most interventions even 4 or 8 hours ahead of onset. Our results are competitive with existing work while using a substantially larger and more diverse cohort of 36,050 patients. While custom classifiers can only target a specific clinical event, our model learns physiological states which can help with many interventions. Our robust patient state representations provide a path towards evidence-driven administration of clinical interventions.
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5543372
July 2017

Learning a Comorbidity-Driven Taxonomy of Pediatric Pulmonary Hypertension.

Circ Res 2017 Aug 13;121(4):341-353. Epub 2017 Jun 13.

From the Computational Health Informatics Program (M.-S.O., M.D.N., A.G., S.W.K., K.D.M.), Department of Cardiology (M.P.M.), Division of Critical Care Medicine, Department of Anesthesiology, Perioperative, and Pain Medicine (A.G.), and Department of Anesthesia (A.G.), Harvard School of Medicine, Boston Children's Hospital, MA; Department of Pediatrics, Vanderbilt University Medical Center, Nashville, TN (E.D.A.); Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge (P.S.); Department of Pediatrics, Massachusetts General Hospital, Boston (M.D.N.); and Department of Biostatistics, Harvard School of Public Health, Boston, MA. (T.C.).

Rationale: Pediatric pulmonary hypertension (PH) is a heterogeneous condition with varying natural history and therapeutic response. Precise classification of PH subtypes is, therefore, crucial for individualizing care. However, gaps remain in our understanding of the spectrum of PH in children.

Objective: We seek to study the manifestations of PH in children and to assess the feasibility of applying a network-based approach to discern disease subtypes from comorbidity data recorded in longitudinal data sets.

Methods And Results: A retrospective cohort study comprising 6 943 263 children (<18 years of age) enrolled in a commercial health insurance plan in the United States, between January 2010 and May 2013. A total of 1583 (0.02%) children met the criteria for PH. We identified comorbidities significantly associated with PH compared with the general population of children without PH. A Bayesian comorbidity network was constructed to model the interdependencies of these comorbidities, and network-clustering analysis was applied to derive disease subtypes comprising subgraphs of highly connected comorbid conditions. A total of 186 comorbidities were found to be significantly associated with PH. Network analysis of comorbidity patterns captured most of the major PH subtypes with known pathological basis defined by the World Health Organization and Panama classifications. The analysis further identified many subtypes documented in only a few case studies, including rare subtypes associated with several well-described genetic syndromes.

Conclusions: Application of network science to model comorbidity patterns recorded in longitudinal data sets can facilitate the discovery of disease subtypes. Our analysis relearned established subtypes, thus validating the approach, and identified rare subtypes that are difficult to discern through clinical observations, providing impetus for deeper investigation of the disease subtypes that will enrich current disease classifications.
DOI: http://dx.doi.org/10.1161/CIRCRESAHA.117.310804
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5559726
August 2017

Prediction using patient comparison vs. modeling: a case study for mortality prediction.

Annu Int Conf IEEE Eng Med Biol Soc 2016 Aug;2016:2464-2467

Information in Electronic Medical Records (EMRs) can be used to generate accurate predictions for the occurrence of a variety of health states, which can contribute to more pro-active interventions. The very nature of EMRs does make the application of off-the-shelf machine learning techniques difficult. In this paper, we study two approaches to making predictions that have hardly been compared in the past: (1) extracting high-level (temporal) features from EMRs and building a predictive model, and (2) defining a patient similarity metric and predicting based on the outcome observed for similar patients. We analyze and compare both approaches on the MIMIC-II ICU dataset to predict patient mortality and find that the patient similarity approach does not scale well and results in a less accurate model (AUC of 0.68) compared to the modeling approach (0.84). We also show that mortality can be predicted within a median of 72 hours.
DOI: http://dx.doi.org/10.1109/EMBC.2016.7591229
August 2016

De-identification of patient notes with recurrent neural networks.

J Am Med Inform Assoc 2017 May;24(3):596-606

Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA.

Objective: Patient notes in electronic health records (EHRs) may contain critical information for medical investigations. However, the vast majority of medical investigators can only access de-identified notes, in order to protect the confidentiality of patients. In the United States, the Health Insurance Portability and Accountability Act (HIPAA) defines 18 types of protected health information that needs to be removed to de-identify patient notes. Manual de-identification is impractical given the size of electronic health record databases, the limited number of researchers with access to non-de-identified notes, and the frequent mistakes of human annotators. A reliable automated de-identification system would consequently be of high value.

Materials And Methods: We introduce the first de-identification system based on artificial neural networks (ANNs), which requires no handcrafted features or rules, unlike existing systems. We compare the performance of the system with state-of-the-art systems on two datasets: the i2b2 2014 de-identification challenge dataset, which is the largest publicly available de-identification dataset, and the MIMIC de-identification dataset, which we assembled and is twice as large as the i2b2 2014 dataset.

Results: Our ANN model outperforms the state-of-the-art systems. It yields an F1-score of 97.85 on the i2b2 2014 dataset, with a recall of 97.38 and a precision of 98.32, and an F1-score of 99.23 on the MIMIC de-identification dataset, with a recall of 99.25 and a precision of 99.21.

Conclusion: Our findings support the use of ANNs for de-identification of patient notes, as they show better performance than previously published systems while requiring no manual feature engineering.
DOI: http://dx.doi.org/10.1093/jamia/ocw156
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7787254
May 2017

Understanding vasopressor intervention and weaning: risk prediction in a public heterogeneous clinical time series database.

J Am Med Inform Assoc 2017 May;24(3):488-495

Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, USA.

Background: The widespread adoption of electronic health records allows us to ask evidence-based questions about the need for and benefits of specific clinical interventions in critical-care settings across large populations.

Objective: We investigated the prediction of vasopressor administration and weaning in the intensive care unit. Vasopressors are commonly used to control hypotension, and changes in timing and dosage can have a large impact on patient outcomes.

Materials And Methods: We considered a cohort of 15 695 intensive care unit patients without orders for reduced care who were alive 30 days post-discharge. A switching-state autoregressive model (SSAM) was trained to predict the multidimensional physiological time series of patients before, during, and after vasopressor administration. The latent states from the SSAM were used as predictors of vasopressor administration and weaning.

Results: The unsupervised SSAM features were able to predict patient vasopressor administration and successful patient weaning. Features derived from the SSAM achieved areas under the receiver operating curve of 0.92, 0.88, and 0.71 for predicting ungapped vasopressor administration, gapped vasopressor administration, and vasopressor weaning, respectively. We also demonstrated many cases where our model predicted weaning well in advance of a successful wean.

Conclusion: Models that used SSAM features increased performance on both predictive tasks. These improvements may reflect an underlying, and ultimately predictive, latent state detectable from the physiological time series.
DOI: http://dx.doi.org/10.1093/jamia/ocw138
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6455107
May 2017

Surrogate-assisted feature extraction for high-throughput phenotyping.

J Am Med Inform Assoc 2017 Apr;24(e1):e143-e149

Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA.

Objective: Phenotyping algorithms are capable of accurately identifying patients with specific phenotypes from within electronic medical records systems. However, developing phenotyping algorithms in a scalable way remains a challenge due to the extensive human resources required. This paper introduces a high-throughput unsupervised feature selection method, which improves the robustness and scalability of electronic medical record phenotyping without compromising its accuracy.

Methods: The proposed Surrogate-Assisted Feature Extraction (SAFE) method selects candidate features from a pool of comprehensive medical concepts found in publicly available knowledge sources. The target phenotype's International Classification of Diseases, Ninth Revision and natural language processing counts, acting as noisy surrogates to the gold-standard labels, are used to create silver-standard labels. Candidate features highly predictive of the silver-standard labels are selected as the final features.
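
A highly simplified sketch of the surrogate-to-silver-label idea appears below: extremes of a surrogate count define silver-standard labels, and an L1-penalized logistic regression keeps a sparse subset of candidate features. The thresholds, simulated data, and penalty are illustrative assumptions, not the SAFE procedure's actual rules.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
n, p = 1000, 30
candidate_features = rng.poisson(1.0, size=(n, p))               # e.g. concept counts
surrogate = 2 * candidate_features[:, 0] + rng.poisson(0.5, n)   # e.g. main ICD count

# Silver-standard labels from the surrogate extremes; the ambiguous middle is dropped.
silver_pos = surrogate >= np.quantile(surrogate, 0.9)
silver_neg = surrogate == 0
keep = silver_pos | silver_neg
X, y = np.log1p(candidate_features[keep]), silver_pos[keep].astype(int)

# Sparse (L1) regression keeps only features predictive of the silver labels.
lasso = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
print("selected feature indices:", np.flatnonzero(lasso.coef_.ravel()))
```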

Results: Algorithms were trained to identify patients with coronary artery disease, rheumatoid arthritis, Crohn's disease, and ulcerative colitis using various numbers of labels to compare the performance of features selected by SAFE, a previously published automated feature extraction for phenotyping procedure, and domain experts. The out-of-sample area under the receiver operating characteristic curve and F-score from SAFE algorithms were remarkably higher than those from the other two, especially at small label sizes.

Conclusion: SAFE advances high-throughput phenotyping methods by automatically selecting a succinct set of informative features for algorithm training, which in turn reduces overfitting and the needed number of gold-standard labels. SAFE also potentially identifies important features missed by automated feature extraction for phenotyping or experts.
DOI: http://dx.doi.org/10.1093/jamia/ocw135
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6080726
April 2017

Predicting Social Anxiety Treatment Outcome Based on Therapeutic Email Conversations.

IEEE J Biomed Health Inform 2017 09 17;21(5):1449-1459. Epub 2016 Aug 17.

Predicting therapeutic outcome in the mental health domain is of utmost importance to enable therapists to provide the most effective treatment to a patient. The writings of a patient can potentially be a valuable source of information, especially now that more and more treatments involve computer-based exercises or electronic conversations between patient and therapist. In this paper, we study predictive modeling using writings of patients under treatment for a social anxiety disorder. We extract a wealth of information from the text written by patients, including their usage of words, the topics they talk about, the sentiment of the messages, and the style of writing. In addition, we study trends over time with respect to those measures. We then apply machine learning algorithms to generate the predictive models. Based on a dataset of 69 patients, we are able to show that we can predict therapy outcome with an area under the curve of 0.83 halfway through the therapy and with a precision of 0.78 when using the full data (i.e., the entire treatment period). Due to the limited number of participants, it is hard to generalize the results, but they do show the great potential of this type of information.
DOI: http://dx.doi.org/10.1109/JBHI.2016.2601123
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5613669
September 2017