Publications by authors named "Matthew P Lungren"

86 Publications

Automatic lung nodule segmentation and intra-nodular heterogeneity image generation.

IEEE J Biomed Health Inform 2021 Dec 15;PP. Epub 2021 Dec 15.

Automatic segmentation of lung nodules on computed tomography (CT) images is challenging owing to the variability of morphology, location, and intensity. In addition, few segmentation methods can capture intra-nodular heterogeneity to assist lung nodule diagnosis. In this study, we propose an end-to-end architecture to perform fully automated segmentation of multiple types of lung nodules and generate intra-nodular heterogeneity images for clinical use. To this end, a hybrid loss is constructed by introducing a Faster R-CNN model based on a generalized intersection-over-union loss into a generative adversarial network. The Lung Image Database Consortium image collection dataset, comprising 2,635 lung nodules, was combined with 3,200 lung nodules from five hospitals for this study. Compared with manual segmentation by radiologists, the proposed model obtained an average Dice coefficient (DC) of 82.05% on the test dataset. Compared with U-net, NoduleNet, nnU-net, and three other models, the proposed method achieved comparable performance on lung nodule segmentation and generated more vivid and valid intra-nodular heterogeneity images, which are beneficial in radiological diagnosis. In an external test of 91 patients from another hospital, the proposed model achieved an average DC of 81.61%. The proposed method effectively addresses the challenges of inevitable human interaction and additional pre-processing procedures in existing solutions for lung nodule segmentation. In addition, the results show that the intra-nodular heterogeneity images generated by the proposed model are suitable for facilitating lung nodule diagnosis in radiology.
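Two quantities anchor this abstract: the Dice coefficient (DC) used for evaluation and the generalized intersection over union (GIoU) underlying the hybrid detection loss. A minimal NumPy sketch of both, with illustrative function names and a 2D box convention that are not from the paper's code:

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Dice coefficient between binary masks: 2|A n B| / (|A| + |B|)."""
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

def giou(box_a, box_b) -> float:
    """Generalized IoU for two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap area of the two boxes.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    # Smallest box enclosing both; GIoU penalizes empty space inside it.
    c_area = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    return inter / union - (c_area - union) / c_area
```

A GIoU-based regression loss is then typically taken as 1 - giou(pred_box, true_box).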
Source: http://dx.doi.org/10.1109/JBHI.2021.3135647
December 2021

Deep learning evaluation of biomarkers from echocardiogram videos.

EBioMedicine 2021 Nov 14;73:103613. Epub 2021 Oct 14.

Department of Computer Science, Stanford University, Palo Alto, CA 94025; Department of Electrical Engineering, Stanford University, Palo Alto, CA, 94025; Department of Biomedical Data Science, Stanford University, Palo Alto, CA, 94025.

Background: Laboratory testing is routinely used to assay blood biomarkers to provide information on physiologic state beyond what clinicians can evaluate from interpreting medical imaging. We hypothesized that deep learning interpretation of echocardiogram videos can provide additional value in understanding disease states and can evaluate common biomarker results.

Methods: We developed EchoNet-Labs, a video-based deep learning algorithm to detect evidence of anemia, elevated B-type natriuretic peptide (BNP), troponin I, and blood urea nitrogen (BUN), as well as values of ten additional lab tests directly from echocardiograms. We included patients (n = 39,460) aged 18 years or older with one or more apical-4-chamber echocardiogram videos (n = 70,066) from Stanford Healthcare for training and internal testing of EchoNet-Labs' performance in estimating the most proximal biomarker result. Without fine-tuning, the performance of EchoNet-Labs was further evaluated on an additional external test dataset (n = 1,301) from Cedars-Sinai Medical Center. We calculated the area under the curve (AUC) of the receiver operating characteristic curve for the internal and external test datasets.

Findings: On the held-out test set of Stanford patients not previously seen during model training, EchoNet-Labs achieved an AUC of 0.80 (0.79-0.81) in detecting anemia (low hemoglobin), 0.86 (0.85-0.88) in detecting elevated BNP, 0.75 (0.73-0.78) in detecting elevated troponin I, and 0.74 (0.72-0.76) in detecting elevated BUN. On the external test dataset from Cedars-Sinai, EchoNet-Labs achieved an AUC of 0.80 (0.77-0.82) in detecting anemia, of 0.82 (0.79-0.84) in detecting elevated BNP, of 0.75 (0.72-0.78) in detecting elevated troponin I, and of 0.69 (0.66-0.71) in detecting elevated BUN. We further demonstrate the utility of the model in detecting abnormalities in 10 additional lab tests. We investigate the features necessary for EchoNet-Labs to make successful detection and identify potential mechanisms for each biomarker using well-known and novel explainability techniques.
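The bracketed ranges next to each AUC are confidence intervals; a percentile bootstrap over the test set is one common way such intervals are computed. A sketch under that assumption (the abstract does not specify the CI method):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def auc_with_bootstrap_ci(y_true, y_score, n_boot=1000, alpha=0.05, seed=0):
    """AUROC point estimate plus a percentile-bootstrap confidence interval."""
    rng = np.random.default_rng(seed)
    y_true, y_score = np.asarray(y_true), np.asarray(y_score)
    point = roc_auc_score(y_true, y_score)
    samples = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y_true), len(y_true))
        if len(np.unique(y_true[idx])) < 2:
            continue  # a resample must contain both classes
        samples.append(roc_auc_score(y_true[idx], y_score[idx]))
    lo, hi = np.percentile(samples, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return point, (lo, hi)
```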

Interpretation: These results show that deep learning applied to diagnostic imaging can provide additional clinical value and identify phenotypic information beyond current imaging interpretation methods.

Funding: J.W.H. and B.H. are supported by the NSF Graduate Research Fellowship. D.O. is supported by NIH K99 HL157421-01. J.Y.Z. is supported by NSF CAREER 1942926, NIH R21 MD012867-01, NIH P30AG059307 and by a Chan-Zuckerberg Biohub Fellowship.
Source: http://dx.doi.org/10.1016/j.ebiom.2021.103613
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8524103
November 2021

Artificial Intelligence Algorithm Improves Radiologist Performance in Skeletal Age Assessment: A Prospective Multicenter Randomized Controlled Trial.

Radiology 2021 12 28;301(3):692-699. Epub 2021 Sep 28.

From the Department of Computer Science, Stanford University, 300 N Pasteur Dr, Stanford, CA 94305 (D.K.E., N.B.K.); Departments of Pediatrics (J.L.) and Radiology (D.B.L., J.M.S., C.P.L., M.P.L., S.S.H.), Stanford University School of Medicine, Stanford, Calif; Department of Radiology, New York University School of Medicine, New York, NY (N.R.F., S.V.L., N.A.S., M.E.B.); Department of Radiology, Emory School of Medicine and Children's Healthcare of Atlanta, Atlanta, Ga (S.S.M.); Department of Radiology, MedStar Health and Georgetown University School of Medicine, Washington, DC (R.W.F., A.R.Z.); Department of Radiology, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio (S.E.S., A.J.T., C.G.A.); Department of Radiology, Children's Hospital of Philadelphia, Philadelphia, Pa (M.L.F., S.L.K., R.D.); Department of Radiology, Harvard Medical School and Boston Children's Hospital, Boston, Mass (K.E., S.P.P.); Department of Radiology, Yale School of Medicine, New Haven, Conn (B.J.D., C.T.S.); and Department of Radiology, Kansas University School of Medicine, Kansas City, Kan (B.M.E.).

Background Previous studies suggest that use of artificial intelligence (AI) algorithms as diagnostic aids may improve the quality of skeletal age assessment, though these studies lack evidence from clinical practice. Purpose To compare the accuracy and interpretation time of skeletal age assessment on hand radiograph examinations with and without the use of an AI algorithm as a diagnostic aid. Materials and Methods In this prospective randomized controlled trial, skeletal age assessment on hand radiograph examinations was performed with (n = 792) and without (n = 739) the AI algorithm as a diagnostic aid. For examinations with the AI algorithm, the radiologist was shown the AI interpretation as part of their routine clinical work and was permitted to accept or modify it. Hand radiographs were interpreted by 93 radiologists from six centers. The primary efficacy outcome was the mean absolute difference between the skeletal age dictated into the radiologists' signed report and the average interpretation of a panel of four radiologists not using a diagnostic aid. The secondary outcome was the interpretation time. A linear mixed-effects regression model with random center- and radiologist-level effects was used to compare the two experimental groups. Results Overall mean absolute difference was lower when radiologists used the AI algorithm compared with when they did not (5.36 months vs 5.95 months; P = .04). The proportions at which the absolute difference exceeded 12 months (9.3% vs 13.0%, P = .02) and 24 months (0.5% vs 1.8%, P = .02) were lower with the AI algorithm than without it. Median radiologist interpretation time was lower with the AI algorithm than without it (102 seconds vs 142 seconds, P = .001). Conclusion Use of an artificial intelligence algorithm improved skeletal age assessment accuracy and reduced interpretation times for radiologists, although differences were observed between centers. Clinical trial registration no. NCT03530098. © RSNA, 2021. See also the editorial by Rubin in this issue.
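The primary analysis, a linear mixed-effects regression with random center- and radiologist-level effects, can be sketched with statsmodels. The table layout and column names below are hypothetical, and the radiologist effect is approximated as a variance component within center:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format table, one row per interpreted examination:
# abs_diff = |dictated skeletal age - panel-average age| in months;
# ai_aid = 1 if the AI diagnostic aid was shown, else 0.
df = pd.read_csv("skeletal_age_reads.csv")

model = smf.mixedlm(
    "abs_diff ~ ai_aid",                       # fixed effect of the AI aid
    data=df,
    groups="center",                           # random center-level intercepts
    re_formula="1",
    vc_formula={"radiologist": "0 + C(radiologist)"},  # radiologist effects
)
print(model.fit().summary())
```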
Source: http://dx.doi.org/10.1148/radiol.2021204021
December 2021

CheXED: Comparison of a Deep Learning Model to a Clinical Decision Support System for Pneumonia in the Emergency Department.

J Thorac Imaging 2021 Sep 23. Epub 2021 Sep 23.

Department of Computer Science and AIMI Center, Stanford University, Stanford; Bunkerhill Health, Palo Alto, CA; Care Transformations Department, Intermountain Healthcare; Department of Biomedical Informatics and Division of Respiratory, Critical Care, and Occupational Pulmonary Medicine, University of Utah; Division of Pulmonary and Critical Care Medicine and Department of Radiology, Intermountain Medical Center, Salt Lake City, UT.

Purpose: Patients with pneumonia often present to the emergency department (ED) and require prompt diagnosis and treatment. Clinical decision support systems for the diagnosis and management of pneumonia are commonly utilized in EDs to improve patient care. The purpose of this study was to investigate whether a deep learning model for detecting radiographic pneumonia and pleural effusions can improve the functionality of a clinical decision support system (CDSS) for pneumonia management (ePNa) operating in 20 EDs.

Materials And Methods: In this retrospective cohort study, a dataset of 7434 prior chest radiographic studies from 6551 ED patients was used to develop and validate a deep learning model to identify radiographic pneumonia, pleural effusions, and evidence of multilobar pneumonia. Model performance was evaluated against 3 radiologists' adjudicated interpretation and compared with performance of the natural language processing of radiology reports used by ePNa.

Results: The deep learning model achieved an area under the receiver operating characteristic curve of 0.833 (95% confidence interval [CI]: 0.795, 0.868) for detecting radiographic pneumonia, 0.939 (95% CI: 0.911, 0.962) for detecting pleural effusions and 0.847 (95% CI: 0.800, 0.890) for identifying multilobar pneumonia. On all 3 tasks, the model achieved higher agreement with the adjudicated radiologist interpretation compared with ePNa.

Conclusions: A deep learning model demonstrated higher agreement with radiologists than the ePNa CDSS in detecting radiographic pneumonia and related findings. Incorporating deep learning models into pneumonia CDSS could enhance diagnostic performance and improve pneumonia management.
Source: http://dx.doi.org/10.1097/RTI.0000000000000622
September 2021

Impact of Upstream Medical Image Processing on Downstream Performance of a Head CT Triage Neural Network.

Radiol Artif Intell 2021 Jul 28;3(4):e200229. Epub 2021 Apr 28.

Department of Electrical Engineering (S.M.H.), Department of Computer Science (J.A.D., C.R.), Department of Radiology (M.P.L., D.M., D.L.R., A.W.), Department of Biomedical Data Science (J.A.D., D.L.R.), and Center for Artificial Intelligence in Medicine and Imaging (M.P.L., D.L.R.), Stanford University, 450 Serra Mall, Stanford, CA 94305; and Department of Radiology, Mayo Clinic, Scottsdale, Ariz (B.N.P.).

Purpose: To develop a convolutional neural network (CNN) to triage head CT (HCT) studies and investigate the effect of upstream medical image processing on the CNN's performance.

Materials And Methods: A total of 9776 HCT studies were retrospectively collected from 2001 through 2014, and a CNN was trained to triage them as normal or abnormal, with 7856 CT studies in the training set, 936 in the validation set, and 984 in the test set. CNN performance was evaluated on the held-out test set by assessing overall triage performance and sensitivity to 20 disorders, to assess differential model performance. This CNN was then used to understand how the upstream imaging chain affects CNN performance by evaluating performance after altering three variables: image acquisition, by reducing the number of x-ray projections; image reconstruction, by inputting sinogram data into the CNN; and image preprocessing. The DeLong test was used to assess differences in the area under the receiver operating characteristic curve (AUROC), and the McNemar test was used to compare sensitivities.

Results: The CNN achieved a mean AUROC of 0.84 (95% CI: 0.83, 0.84) in discriminating normal and abnormal HCT studies. The number of x-ray projections could be reduced by 16 times and the raw sensor data could be input into the CNN with no statistically significant difference in classification performance. Additionally, CT windowing consistently improved CNN performance, increasing the mean triage AUROC by 0.07 points.
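The windowing result is concrete enough to illustrate. CT windowing clips Hounsfield units (HU) to a diagnostically relevant range and rescales; stacking several windows as input channels is a common CNN preprocessing step. A sketch using widely quoted head CT window settings, which are not necessarily the study's exact choices:

```python
import numpy as np

def window_ct(hu: np.ndarray, center: float, width: float) -> np.ndarray:
    """Clip a Hounsfield-unit image to a window and rescale to [0, 1]."""
    lo, hi = center - width / 2.0, center + width / 2.0
    return (np.clip(hu, lo, hi) - lo) / (hi - lo)

# Commonly used head CT windows as (center, width) in HU.
WINDOWS = {"brain": (40, 80), "subdural": (75, 215), "bone": (600, 2800)}

def preprocess(hu_slice: np.ndarray) -> np.ndarray:
    """Stack several windowed copies of a slice as CNN input channels."""
    return np.stack([window_ct(hu_slice, c, w) for c, w in WINDOWS.values()])
```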

Conclusion: A CNN was developed to triage HCT studies, which may help streamline image evaluation, and the means by which upstream image acquisition, reconstruction, and preprocessing affect downstream CNN performance were investigated, bringing focus to this important part of the imaging chain. Keywords: Head CT, Automated Triage, Deep Learning, Sinogram, Dataset. © RSNA, 2021.
Source: http://dx.doi.org/10.1148/ryai.2021200229
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8328108
July 2021

Automated coronary calcium scoring using deep learning with multicenter external validation.

NPJ Digit Med 2021 Jun 1;4(1):88. Epub 2021 Jun 1.

Department of Radiology, Mayo Clinic, Scottsdale, AZ, USA.

Coronary artery disease (CAD), the most common manifestation of cardiovascular disease, remains the most common cause of mortality in the United States. Risk assessment is key for primary prevention of coronary events, and coronary artery calcium (CAC) scoring using computed tomography (CT) is one such non-invasive tool. Despite the proven clinical value of CAC, the current clinical practice implementation for CAC has limitations such as the lack of insurance coverage for the test, need for capital-intensive CT machines, specialized imaging protocols, and accredited 3D imaging labs for analysis (including personnel and software). Perhaps the greatest gap is the millions of patients who undergo routine chest CT exams that demonstrate coronary artery calcification whose presence is often not reported and whose quantitation is often not feasible. We present two deep learning models that automate CAC scoring, demonstrating advantages in automated scoring for both dedicated gated coronary CT exams and routine non-gated chest CTs performed for other reasons to allow opportunistic screening. First, we trained a gated coronary CT model for CAC scoring that showed near perfect agreement (mean difference in scores = -2.86; Cohen's Kappa = 0.89, P < 0.0001) with current conventional manual scoring on a retrospective dataset of 79 patients and was found to perform the task faster (average time for automated CAC scoring using a graphics processing unit (GPU) was 3.5 ± 2.1 s vs. 261 s for manual scoring) in a prospective trial of 55 patients, with little difference in scores compared to three technologists (mean difference in scores = 3.24, 5.12, and 5.48, respectively). Then, using CAC scores from paired gated coronary CT as a reference standard, we trained a deep learning model on our internal data and a cohort from the Multi-Ethnic Study of Atherosclerosis (MESA) study (total training n = 341, Stanford test n = 42, MESA test n = 46) to perform CAC scoring on routine non-gated chest CT exams, with validation on external datasets (total n = 303) obtained from four geographically disparate health systems. In identifying patients with any CAC (i.e., CAC ≥ 1), sensitivity and positive predictive value (PPV) were high across all datasets (ranges: 80-100% and 87-100%, respectively). For CAC ≥ 100 on routine non-gated chest CTs, which is the latest recommended threshold to initiate statin therapy, our model showed sensitivities of 71-94% and positive predictive values in the range of 88-100% across all the sites. Adoption of this model could allow more patients to be screened with CAC scoring, potentially allowing opportunistic early preventive interventions.
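For context, the conventional manual scoring referenced above is the Agatston method: each calcified lesion contributes its plaque area multiplied by a density weight derived from its peak attenuation, summed over slices and arteries. A simplified sketch, assuming lesion detection (thresholding at 130 HU plus connected components) happens elsewhere:

```python
import numpy as np

def density_weight(peak_hu: float) -> int:
    """Agatston density factor from a lesion's peak attenuation."""
    if peak_hu >= 400: return 4
    if peak_hu >= 300: return 3
    if peak_hu >= 200: return 2
    if peak_hu >= 130: return 1
    return 0

def agatston_lesion_score(hu_slice: np.ndarray, lesion_mask: np.ndarray,
                          pixel_area_mm2: float) -> float:
    """Score one calcified lesion on one gated slice (area x density weight)."""
    area_mm2 = lesion_mask.sum() * pixel_area_mm2
    return area_mm2 * density_weight(hu_slice[lesion_mask].max())

# The total Agatston score sums lesion scores over all slices and arteries.
```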
Source: http://dx.doi.org/10.1038/s41746-021-00460-1
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8169744
June 2021

Current and emerging artificial intelligence applications for pediatric interventional radiology.

Pediatr Radiol 2021 May 12. Epub 2021 May 12.

Center for Artificial Intelligence in Medicine & Imaging, Stanford University, Stanford, CA, USA.

Artificial intelligence in medicine can help improve the accuracy and efficiency of diagnostics, selection of therapies and prediction of outcomes. Machine learning describes a subset of artificial intelligence that utilizes algorithms that can learn modeling functions from datasets. More complex algorithms, or deep learning, can similarly learn modeling functions for a variety of tasks leveraging massive complex datasets. The aggregation of artificial intelligence tools has the potential to improve many facets of health care delivery, from mundane tasks such as scheduling appointments to more complex functions such as enterprise management modeling and in-suite procedural assistance. Within radiology, the roles and use cases for artificial intelligence (inclusive of machine learning and deep learning) continue to evolve. Significant resources have been devoted to diagnostic radiology tasks via national radiology societies, academic medical centers and hundreds of commercial entities. Despite the widespread interest in artificial intelligence radiology solutions, there remains a lack of applications and discussion for use cases in interventional radiology (IR). Even more relevant to this audience, specific technologies tailored to the pediatric IR space are lacking. In this review, we describe artificial intelligence technologies that have been developed within the IR suite, as well as some future work, with a focus on artificial intelligence's potential impact in pediatric interventional medicine.
Source: http://dx.doi.org/10.1007/s00247-021-05013-y
May 2021

The RSNA Pulmonary Embolism CT Dataset.

Radiol Artif Intell 2021 Mar 20;3(2):e200254. Epub 2021 Jan 20.

Department of Medical Imaging, Unity Health Toronto, University of Toronto, 30 Bond St, Toronto, ON, Canada M5B 1W8 (E.C.); Department of Diagnostic Imaging, Universidade Federal de São Paulo, São Paulo, Brazil (F.C.K.); Diagnósticos da América SA (Dasa) (F.C.K.); Department of Radiology, University of Kentucky, Lexington, Ky (S.B.H.); Department of Diagnostic Radiology, University of Texas MD Anderson Cancer Center, Houston, Tex (C.C.W.); Department of Radiology, Stanford University, Stanford, Calif (M.P.L., S.S.H.); Department of Radiology, The Ohio State University, Columbus, Ohio (L.M.P.); Department of Radiology and Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, Mass (J.K.); The Jackson Laboratory, Bar Harbor, Maine (R.L.B.); Department of Radiology, Weill Cornell Medical College, New York, NY (G.S.); MD.ai, New York, NY (A.S.); Department of Radiology, Koc University School of Medicine, Istanbul, Turkey (E.A.); Department of Radiology and Nuclear Medicine, Alfred Health, Monash University, Melbourne, Australia (M.L.); Department of Radiodiagnosis, Fortis Escorts Heart Institute, New Delhi, India (P.K.); Department of Diagnostic Radiology and Nuclear Medicine, Faculty of Medicine, University of Jordan, Amman, Jordan (K.A.M.); Department of Departamento de Imagenología, Hospital Regional de Alta Especialidad de la Península de Yucatán, Mérida, Mexico (D.C.N.R.); Department of Radiology, University of Pittsburgh Medical Center, Pittsburgh, Pa (J.W.S.); Department of Radiology, Cooper University Hospital, Camden, NJ (P. Germaine); A Coruña University Hospital, A Coruña, Spain (E.C.L.); Swiss Medical Group, Buenos Aires, Argentina (T.A.); Inland Imaging, Spokane, Wash (P. Gupta); AMRI Hospitals, Kolkata, India (M.J.); Department of Radiology, University of Texas Southwestern Medical Center, Dallas, Tex (F.U.K.); Department of Radiology, Johns Hopkins University School of Medicine, Baltimore, Md (C.T.L.); Department of Radiology and Imaging Sciences, Tata Medical Center, Kolkata, India (S.S.); Department of Radiology, University of New Mexico, Albuquerque, NM (J.W.R.); Department of Radiology, Universitair Ziekenhuis Brussel, Jette, Belgium (C.C.B.); Department of Radiology and Biomedical Imaging, University of California-San Francisco, San Francisco, Calif (J.M).

Source: http://dx.doi.org/10.1148/ryai.2021200254
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8043364
March 2021

Construction of a Machine Learning Dataset through Collaboration: The RSNA 2019 Brain CT Hemorrhage Challenge.

Radiol Artif Intell 2020 May 29;2(3):e190211. Epub 2020 Apr 29.

Department of Radiology/Division of Neuroradiology, Thomas Jefferson University Hospital, 132 S Tenth St, Suite 1080B Main Building, Philadelphia, PA 19107 (A.E.F.); Department of Radiology, The Ohio State University, Columbus, Ohio (L.M.P.); Department of Radiology, Weill Cornell Medical College, New York, NY (G.S.); Department of Radiology, Stanford University, Stanford, Calif (S.S.H.); Department of Radiology and Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, Mass (J.K.); Quantitative Sciences Unit, Stanford University, Stanford, Calif (R.B.); Department of Radiology and Biomedical Imaging, University of California San Francisco, San Francisco, Calif (J.T.M.); MD.ai, New York, NY (A.S.); Department of Diagnostic Imaging, Universidade Federal de São Paulo, São Paulo, Brazil (F.C.K.); Department of Radiology, Stanford University, Stanford, Calif (M.P.L.); Department of Radiology, University of Alabama at Birmingham, Birmingham, Ala (G.C.); Faculty of Health and Medical Sciences, University of Western Australia, Perth, Australia (L. Cala); Advanced Diagnostic Imaging, Clínica DAPI, Curitiba, Brazil (L. Coelho); Department of Radiology, University of Washington, Seattle, Wash (M.M.); Department of Radiology, Baylor College of Medicine, Houston, Tex (F.M., C.L.); Department of Radiology, University of Ottawa, Ottawa, Canada (E.M.); Department of Radiology & Biomedical Imaging, Yale University, New Haven, Conn (I.I., V.Z.); Department of Medical Imaging, Gold Coast University Hospital, Southport, Australia (O.M.); Department of Neuroradiology, University of Utah Health Sciences Center, Salt Lake City, Utah (L.S.); Department of Radiology and Medical Imaging, University of Virginia Health, Charlottesville, Va (D.J.); Division of Neuroradiology, University of Texas Southwestern Medical Center, Dallas, Tex (A.A.); Department of Radiology, Albert Einstein Healthcare Network, Philadelphia, Pa (R.K.L.); and Department of Radiology, SUNY Downstate Medical Center, Albany, NY (J.N.).

This dataset is composed of annotations of the five hemorrhage subtypes (subarachnoid, intraventricular, subdural, epidural, and intraparenchymal hemorrhage) typically encountered at brain CT.
Source: http://dx.doi.org/10.1148/ryai.2020190211
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8082297
May 2020

Multi-task weak supervision enables anatomically-resolved abnormality detection in whole-body FDG-PET/CT.

Nat Commun 2021 03 25;12(1):1880. Epub 2021 Mar 25.

Department of Radiology, Stanford University, Stanford, CA, USA.

Computational decision support systems could provide clinical value in whole-body FDG-PET/CT workflows. However, limited availability of labeled data combined with the large size of PET/CT imaging exams make it challenging to apply existing supervised machine learning systems. Leveraging recent advancements in natural language processing, we describe a weak supervision framework that extracts imperfect, yet highly granular, regional abnormality labels from free-text radiology reports. Our framework automatically labels each region in a custom ontology of anatomical regions, providing a structured profile of the pathologies in each imaging exam. Using these generated labels, we then train an attention-based, multi-task CNN architecture to detect and estimate the location of abnormalities in whole-body scans. We demonstrate empirically that our multi-task representation is critical for strong performance on rare abnormalities with limited training data. The representation also contributes to more accurate mortality prediction from imaging data, suggesting the potential utility of our framework beyond abnormality detection and location estimation.
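To make the report-to-label step concrete, a toy sketch of weak labeling is shown below: simple text rules emit abnormal, normal, or abstain votes per anatomical region. The region ontology, patterns, and label encoding are illustrative stand-ins for the paper's much richer framework:

```python
import re

REGIONS = ["lungs", "liver", "skeleton"]   # toy anatomical ontology
ABSTAIN, NORMAL, ABNORMAL = -1, 0, 1

def label_region(report: str, region: str) -> int:
    """Emit a noisy abnormality label for one region from report text."""
    text = report.lower()
    if region not in text:
        return ABSTAIN  # region never mentioned: leave unlabeled
    if re.search(rf"no (abnormal|suspicious)[^.]*{region}", text):
        return NORMAL
    if re.search(rf"(lesion|mass|hypermetabolic)[^.]*{region}", text):
        return ABNORMAL
    return ABSTAIN

report = "Hypermetabolic lesion in the liver. No abnormal uptake in the lungs."
labels = {r: label_region(report, r) for r in REGIONS}
# -> {'lungs': 0, 'liver': 1, 'skeleton': -1}
```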
Source: http://dx.doi.org/10.1038/s41467-021-22018-1
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7994797
March 2021

The RSNA International COVID-19 Open Radiology Database (RICORD).

Radiology 2021 04 5;299(1):E204-E213. Epub 2021 Jan 5.

From the Department of Radiology, Stanford University, Stanford, Calif (E.B.T., J.S., B.P.P.); Department of Radiology, University of Pennsylvania Hospital, Philadelphia, Pa (S.S., M. Hershman, L.R.); Department of Radiology, Stanford University School of Medicine, Stanford University Medical Center, 725 Welch Rd, Room 1675, Stanford, CA 94305-5913 (M.P.L.); Department of Medical Imaging, University of Toronto, Unity Health Toronto, Toronto, Canada (E.C.); Department of Radiology, Mayo Clinic, Rochester, Minn (B.J.E., P.R.); Department of Radiology, Weill Cornell Medicine, New York, NY (G.S.); MD.ai, New York, NY (A.S.); Department of Radiology, Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Harvard Medical School, Charlestown, Mass (J.K.C.); Department of Diagnostic and Interventional Radiology, Cairo University Kasr Alainy Faculty of Medicine, Cairo, Egypt (M. Hafez); Department of Radiology, The Ottawa Hospital, Ottawa, Canada (S.J.); Department of Radiology and Biomedical Imaging, Center for Intelligent Imaging, San Francisco, Calif (J.M.); Department of Radiology, Koç University School of Medicine, Koç University Hospital, Istanbul, Turkey (E.A.); Department of Radiology, ETZ Hospital, Tilburg, the Netherlands (E.R.R.); Department of Radiology, University of Ghent, Ghent, Belgium (E.R.R.); Department of Diagnostic Imaging, Universidade Federal de São Paulo, São Paulo, Brazil (F.C.K.); Department of Radiology, Netherlands Cancer Institute, Amsterdam, the Netherlands (L.T.); Department of Radiology, NYU Grossman School of Medicine, Center for Advanced Imaging Innovation and Research, Laura and Isaac Perlmutter Cancer Center, New York, NY (L.M.); Department of Radiology, University of Wisconsin School of Medicine and Public Health, Madison, Wis (J.P.K.); and Department of Thoracic Imaging, University of Texas MD Anderson Cancer Center, Houston, Tex (C.C.W.).

The coronavirus disease 2019 (COVID-19) pandemic is a global health care emergency. Although reverse-transcription polymerase chain reaction testing is the reference standard method to identify patients with COVID-19 infection, chest radiography and CT play a vital role in the detection and management of these patients. Prediction models for COVID-19 imaging are rapidly being developed to support medical decision making. However, inadequate availability of a diverse annotated data set has limited the performance and generalizability of existing models. To address this unmet need, the RSNA and Society of Thoracic Radiology collaborated to develop the RSNA International COVID-19 Open Radiology Database (RICORD). This database is the first multi-institutional, multinational, expert-annotated COVID-19 imaging data set. It is made freely available to the machine learning community as a research and educational resource for COVID-19 chest imaging. Pixel-level volumetric segmentation with clinical annotations was performed by thoracic radiology subspecialists for all COVID-19-positive thoracic CT scans. The labeling schema was coordinated with other international consensus panels and COVID-19 data annotation efforts, including the European Society of Medical Imaging Informatics, the American College of Radiology, and the American Association of Physicists in Medicine. Study-level COVID-19 classification labels for chest radiographs were annotated by three radiologists, with majority vote adjudication by board-certified radiologists. RICORD consists of 240 thoracic CT scans and 1000 chest radiographs contributed from four international sites. It is anticipated that RICORD will lead to prediction models that can demonstrate sustained performance across populations and health care systems.
Source: http://dx.doi.org/10.1148/radiol.2021203957
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7993245
April 2021

Multimodal fusion with deep neural networks for leveraging CT imaging and electronic health record: a case-study in pulmonary embolism detection.

Sci Rep 2020 12 17;10(1):22147. Epub 2020 Dec 17.

Department of Biomedical Data Science, Stanford University, Stanford, USA.

Recent advancements in deep learning have led to a resurgence of medical imaging and Electronic Medical Record (EMR) models for a variety of applications, including clinical decision support, automated workflow triage, clinical prediction, and more. However, very few models have been developed to integrate both clinical and imaging data, despite the fact that in routine practice clinicians rely on the EMR to provide context for medical imaging interpretation. In this study, we developed and compared different multimodal fusion model architectures that are capable of utilizing both pixel data from volumetric Computed Tomography Pulmonary Angiography scans and clinical patient data from the EMR to automatically classify Pulmonary Embolism (PE) cases. The best-performing multimodal model is a late fusion model that achieves an AUROC of 0.947 [95% CI: 0.946-0.948] on the entire held-out test set, outperforming imaging-only and EMR-only single-modality models.
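The winning late fusion design can be sketched in a few lines of PyTorch: each modality is encoded separately, and the resulting feature vectors are concatenated just before the classification head. Layer sizes and feature dimensions below are placeholders rather than the paper's configuration:

```python
import torch
import torch.nn as nn

class LateFusionPE(nn.Module):
    """Late fusion: per-modality encoders, features concatenated at the end."""

    def __init__(self, img_feat_dim: int = 512, emr_dim: int = 100):
        super().__init__()
        # Stand-in for pooled features from a pretrained 3D CNN over CTPA.
        self.img_branch = nn.Sequential(nn.Linear(img_feat_dim, 128), nn.ReLU())
        self.emr_branch = nn.Sequential(nn.Linear(emr_dim, 128), nn.ReLU())
        self.head = nn.Linear(256, 1)  # binary PE vs. no PE

    def forward(self, img_feats: torch.Tensor, emr_feats: torch.Tensor):
        fused = torch.cat([self.img_branch(img_feats),
                           self.emr_branch(emr_feats)], dim=1)
        return self.head(fused)  # logits; apply sigmoid for a probability

model = LateFusionPE()
logits = model(torch.randn(4, 512), torch.randn(4, 100))
```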
Source: http://dx.doi.org/10.1038/s41598-020-78888-w
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7746687
December 2020

Deep learning and its role in COVID-19 medical imaging.

Intell Based Med 2020 Dec 4;3:100013. Epub 2020 Nov 4.

Center for Artificial Intelligence in Medicine & Imaging, Stanford University, United States.

COVID-19 is one of the greatest global public health challenges in history. COVID-19 is caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and is estimated to have a cumulative global case-fatality rate as high as 7.2% (Onder et al., 2020) [1]. As SARS-CoV-2 spread across the globe, it catalyzed new urgency in building systems to allow rapid sharing and dissemination of data between international healthcare infrastructures and governments in a worldwide effort focused on case tracking/tracing, identifying effective therapeutic protocols, securing healthcare resources, and drug and vaccine research. In addition to the worldwide efforts to share clinical and routine population health data, there are many large-scale efforts to collect and disseminate medical imaging data, owing to the critical role that imaging has played in diagnosis and management around the world. Given reported false-negative rates for the reverse transcriptase polymerase chain reaction (RT-PCR) of up to 61% (Centers for Disease Control and Prevention, Division of Viral Diseases, 2020; Kucirka et al., 2020) [2,3], imaging can be used as an important adjunct or alternative. Furthermore, there has been a shortage of test kits worldwide, and laboratories in many testing sites have struggled to process the available tests within a reasonable time frame. Given these issues surrounding COVID-19, many groups began to explore the benefits of 'big data' processing and algorithms to assist with the diagnosis and therapeutic development of COVID-19.
Source: http://dx.doi.org/10.1016/j.ibmed.2020.100013
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7641591
December 2020

Fusion of medical imaging and electronic health records using deep learning: a systematic review and implementation guidelines.

NPJ Digit Med 2020 16;3:136. Epub 2020 Oct 16.

Department of Biomedical Data Science, Stanford University, Stanford, USA.

Advancements in deep learning techniques carry the potential to make significant contributions to healthcare, particularly in fields that utilize medical imaging for diagnosis, prognosis, and treatment decisions. The current state-of-the-art deep learning models for radiology applications consider only pixel-value information without data informing clinical context. Yet in practice, pertinent and accurate non-imaging data from the clinical history and laboratory results enable physicians to interpret imaging findings in the appropriate clinical context, leading to higher diagnostic accuracy, more informative clinical decision making, and improved patient outcomes. To achieve a similar goal using deep learning, medical imaging pixel-based models must also be able to process contextual data from electronic health records (EHR) in addition to pixel data. In this paper, we describe different data fusion techniques that can be applied to combine medical imaging with EHR, and systematically review medical data fusion literature published between 2012 and 2020. We conducted a systematic search on PubMed and Scopus for original research articles leveraging deep learning for fusion of multimodal data. In total, we screened 985 studies and extracted data from 17 papers. By means of this systematic review, we present current knowledge, summarize important results, and provide implementation guidelines to serve as a reference for researchers interested in the application of multimodal fusion in medical imaging.
Source: http://dx.doi.org/10.1038/s41746-020-00341-z
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7567861
October 2020

CheXaid: deep learning assistance for physician diagnosis of tuberculosis using chest x-rays in patients with HIV.

NPJ Digit Med 2020 9;3:115. Epub 2020 Sep 9.

Stanford University AIMI Center, Stanford, CA USA.

Tuberculosis (TB) is the leading cause of preventable death in HIV-positive patients, yet it often remains undiagnosed and untreated. Chest x-ray is often used to assist in diagnosis, but this presents additional challenges owing to atypical radiographic presentation and radiologist shortages in regions where co-infection is most common. We developed a deep learning algorithm to diagnose TB using clinical information and chest x-ray images from 677 HIV-positive patients with suspected TB from two hospitals in South Africa. We then sought to determine whether the algorithm, deployed as a web-based diagnostic assistant, could assist clinicians in the diagnosis of TB in HIV-positive patients. Use of the algorithm resulted in a modest but statistically significant improvement in clinician accuracy (P = 0.002), increasing the mean clinician accuracy from 0.60 (95% CI 0.57, 0.63) without assistance to 0.65 (95% CI 0.60, 0.70) with assistance. However, the accuracy of assisted clinicians was significantly lower (P < 0.001) than that of the stand-alone algorithm, which had an accuracy of 0.79 (95% CI 0.77, 0.82) on the same unseen test cases. These results suggest that deep learning assistance may improve clinician accuracy in TB diagnosis using chest x-rays, which would be valuable in settings with a high burden of HIV/TB co-infection. Moreover, the high accuracy of the stand-alone algorithm suggests potential value, particularly in settings with a scarcity of radiological expertise.
Source: http://dx.doi.org/10.1038/s41746-020-00322-2
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7481246
September 2020

Prospective Deployment of Deep Learning in MRI: A Framework for Important Considerations, Challenges, and Recommendations for Best Practices.

J Magn Reson Imaging 2021 08 24;54(2):357-371. Epub 2020 Aug 24.

Department of Radiology, Stanford University, Stanford, California, USA.

Artificial intelligence algorithms based on principles of deep learning (DL) have made a large impact on the acquisition, reconstruction, and interpretation of MRI data. Despite the large number of retrospective studies using DL, there are fewer applications of DL in the clinic on a routine basis. To address this large translational gap, we review recent publications to determine three major use cases that DL can have in MRI: model-free image synthesis, model-based image reconstruction, and image- or pixel-level classification. For each of these three areas, we provide a framework for important considerations, consisting of appropriate model training paradigms, evaluation of model robustness, downstream clinical utility, and opportunities for future advances, as well as recommendations for current best practices. We draw inspiration for this framework from advances in computer vision in natural imaging as well as additional healthcare fields. We further emphasize the need for reproducibility of research studies through the sharing of datasets and software. LEVEL OF EVIDENCE: 5. TECHNICAL EFFICACY STAGE: 2.
Source: http://dx.doi.org/10.1002/jmri.27331
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8639049
August 2021

Cross-Modal Data Programming Enables Rapid Medical Machine Learning.

Patterns (N Y) 2020 May 28;1(2). Epub 2020 Apr 28.

Department of Computer Science, Stanford University, Stanford, CA, USA.

A major bottleneck in developing clinically impactful machine learning models is a lack of labeled training data for model supervision. Thus, medical researchers increasingly turn to weaker, noisier sources of supervision, such as leveraging extractions from unstructured text reports to supervise image classification. A key challenge in weak supervision is combining sources of information that may differ in quality and have correlated errors. Recently, a statistical theory of weak supervision called data programming has shown promise in addressing this challenge. Data programming now underpins many deployed machine-learning systems in the technology industry, even for critical applications. We propose a new technique for applying data programming to the problem of cross-modal weak supervision in medicine, wherein weak labels derived from an auxiliary modality (e.g., text) are used to train models over a different target modality (e.g., images). We evaluate our approach on diverse clinical tasks via direct comparison to institution-scale, hand-labeled datasets. We find that our supervision technique increases model performance by up to 6 points area under the receiver operating characteristic curve (ROC-AUC) over baseline methods by improving both coverage and quality of the weak labels. Our approach yields models that on average perform within 1.75 points ROC-AUC of those supervised with physician-years of hand labeling and outperform those supervised with physician-months of hand labeling by 10.25 points ROC-AUC, while using only person-days of developer time and clinician work, a time saving of 96%. Our results suggest that modern weak supervision techniques such as data programming may enable more rapid development and deployment of clinically useful machine-learning models.
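In the cross-modal setting described, labeling functions written over the auxiliary text modality produce noisy votes that are aggregated into training labels for an image model. The sketch below substitutes a plain majority vote for the learned label model that data programming actually fits; the labeling functions are toy examples:

```python
import numpy as np

ABSTAIN = -1

# Toy labeling functions over radiology report text (the auxiliary modality).
def lf_no_acute(text): return 0 if "no acute" in text.lower() else ABSTAIN
def lf_effusion(text): return 1 if "effusion" in text.lower() else ABSTAIN
def lf_normal(text):   return 0 if "normal" in text.lower() else ABSTAIN

LFS = [lf_no_acute, lf_effusion, lf_normal]

def weak_label(text: str) -> int:
    """Aggregate non-abstaining votes; data programming instead learns each
    function's accuracy and correlations rather than weighting them equally."""
    votes = [lf(text) for lf in LFS if lf(text) != ABSTAIN]
    return int(round(np.mean(votes))) if votes else ABSTAIN

# The resulting weak labels supervise a model over the *target* modality
# (the paired image), which never sees the report at inference time.
```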
Source: http://dx.doi.org/10.1016/j.patter.2020.100019
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7413132
May 2020

Pediatric Hepatoblastoma, Hepatocellular Carcinoma, and Other Hepatic Neoplasms: Consensus Imaging Recommendations from American College of Radiology Pediatric Liver Reporting and Data System (LI-RADS) Working Group.

Radiology 2020 09 30;296(3):493-497. Epub 2020 Jun 30.

From the Department of Radiology, University of Texas Southwestern Medical Center, Dallas, Tex (G.R.S.); Department of Radiology, University of Pittsburgh Medical Center Children's Hospital of Pittsburgh, Pittsburgh, Pa (J.H.S.); Department of Radiology, Emory University and Children's Healthcare of Atlanta, Atlanta, Ga (A.A.); Department of Diagnostic Imaging, The Hospital for Sick Children and Department of Medical Imaging, University of Toronto, Toronto, Canada (G.B.C.); Department of Radiology, Montefiore Medical Center, Bronx, NY (V.C.); Department of Radiology, Duke University Medical Center, Durham, NC (J.T.D.); Department of Radiology, Mallinckrodt Institute of Radiology, St. Louis Children's Hospital, Washington University School of Medicine, St Louis, Mo (G.K.); Department of Radiology, Nationwide Children's Hospital, Columbus, Ohio (R.K.); Stanford University School of Medicine, Lucile Packard Children's Hospital, Stanford, Calif (M.P.L.); Department of Radiology, Texas Children's Hospital, Houston, Tex (P.M.M.); Nemours Children's Hospital, Nemours Children's Health System, University of Central Florida College of Medicine, Orlando, Fla (D.J.P.); Liver Imaging Group, Department of Radiology, University of California San Diego, San Diego, Calif (C.B.S.); Department of Radiology, Cincinnati Children's Hospital, Cincinnati, Ohio (A.J.T.); and Department of Radiology, University of Cincinnati College of Medicine, 3333 Burnet Ave, MLC 5031, Cincinnati, OH 45229 (A.J.T.).

Appropriate imaging is imperative in evaluating children with a primary hepatic malignancy such as hepatoblastoma or hepatocellular carcinoma. For use in the adult patient population, the American College of Radiology created the Liver Imaging Reporting and Data System (LI-RADS) to provide consistent terminology and to improve imaging interpretation. At present, no similar consensus exists to guide imaging and interpretation of pediatric patients at risk for developing a liver neoplasm or how best to evaluate a pediatric patient with a known liver neoplasm. Therefore, a new Pediatric Working Group within American College of Radiology LI-RADS was created to provide consensus for imaging recommendations and interpretation of pediatric liver neoplasms. The article was drafted based on the most up-to-date existing information as interpreted by imaging experts comprising the Pediatric LI-RADS Working Group. Guidance is provided regarding appropriate imaging modalities and protocols, as well as imaging interpretation and reporting, with the goals to improve imaging quality, to decrease image interpretation errors, to enhance communication with referrers, and to advance patient care. An expanded version of this document that includes broader background information on pediatric hepatocellular carcinoma and rationale for recommendations can be found in Appendix E1 (online).
Source: http://dx.doi.org/10.1148/radiol.2020200751
September 2020

Dynamic Hydrodissection for Skin Protection during Cryoablation of Superficial Lesions.

J Vasc Interv Radiol 2020 11 14;31(11):1942-1945. Epub 2020 May 14.

Department of Pediatric Interventional Radiology, Lucile Packard Children's Hospital, Stanford University School of Medicine, 3155 Porter Drive, Stanford, CA 94034.

Source: http://dx.doi.org/10.1016/j.jvir.2020.01.025
November 2020

PENet-a scalable deep-learning model for automated diagnosis of pulmonary embolism using volumetric CT imaging.

NPJ Digit Med 2020 24;3:61. Epub 2020 Apr 24.

Department of Biomedical Data Science, Stanford University, Stanford, CA, USA.

Pulmonary embolism (PE) is a life-threatening clinical problem, and computed tomography pulmonary angiography (CTPA) is the gold standard for diagnosis. Prompt diagnosis and immediate treatment are critical to avoid high morbidity and mortality rates, yet PE remains among the diagnoses most frequently missed or delayed. In this study, we developed a deep learning model, PENet, to automatically detect PE on volumetric CTPA scans as an end-to-end solution. PENet is a 77-layer 3D convolutional neural network (CNN) pretrained on the Kinetics-600 dataset and fine-tuned on a retrospective CTPA dataset collected from a single academic institution. PENet's performance in detecting PE was evaluated on data from two different institutions: one a hold-out dataset from the same institution as the training data, and a second collected from an external institution to evaluate model generalizability to an unrelated population. PENet achieved an AUROC of 0.84 [0.82-0.87] for detecting PE on the internal hold-out test set and 0.85 [0.81-0.88] on the external dataset. PENet also outperformed current state-of-the-art 3D CNN models. These results represent a successful application of an end-to-end 3D CNN model to the complex task of PE diagnosis without requiring computationally intensive and time-consuming preprocessing, and demonstrate sustained performance on data from an external institution. Our model could be applied as a triage tool to automatically identify clinically important PEs, allowing prioritization for diagnostic radiology interpretation and improved care pathways via more efficient diagnosis.
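PENet itself is a custom 77-layer network, but the transfer recipe (pretrain on Kinetics video, then fine-tune on CTPA volumes) can be illustrated with an off-the-shelf video backbone. A sketch using torchvision's Kinetics-400-pretrained r3d_18 as a stand-in:

```python
import torch
import torch.nn as nn
from torchvision.models.video import r3d_18, R3D_18_Weights

# Video-pretrained 3D CNN standing in for PENet's Kinetics-600 pretraining.
model = r3d_18(weights=R3D_18_Weights.KINETICS400_V1)
model.fc = nn.Linear(model.fc.in_features, 1)  # new binary PE head

# A CTPA series resampled to a clip-like tensor: (batch, channels, depth, H, W).
volume = torch.randn(1, 3, 24, 112, 112)
prob = torch.sigmoid(model(volume))  # fine-tune with BCE loss in practice
```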
Source: http://dx.doi.org/10.1038/s41746-020-0266-y
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7181770
April 2020

Ethics of Using and Sharing Clinical Imaging Data for Artificial Intelligence: A Proposed Framework.

Radiology 2020 Jun 24;295(3):675-682. Epub 2020 Mar 24.

From the Department of Radiology, Stanford University School of Medicine, 300 Pasteur Dr, Stanford, CA 94305-5105.

In this article, the authors propose an ethical framework for using and sharing clinical data for the development of artificial intelligence (AI) applications. The philosophical premise is as follows: when clinical data are used to provide care, the primary purpose for acquiring the data is fulfilled. At that point, clinical data should be treated as a form of public good, to be used for the benefit of future patients. In their 2013 article, Faden et al argued that all who participate in the health care system, including patients, have a moral obligation to contribute to improving that system. The authors extend that framework to questions surrounding the secondary use of clinical data for AI applications. Specifically, the authors propose that all individuals and entities with access to clinical data become data stewards, with fiduciary (or trust) responsibilities to patients to carefully safeguard patient privacy, and to the public to ensure that the data are made widely available for the development of knowledge and tools to benefit future patients. According to this framework, the authors maintain that it is unethical for providers to "sell" clinical data to other parties by granting access to clinical data, especially under exclusive arrangements, in exchange for monetary or in-kind payments that exceed costs. The authors also propose that patient consent is not required before the data are used for secondary purposes when obtaining such consent is prohibitively costly or burdensome, as long as mechanisms are in place to ensure that ethical standards are strictly followed. Rather than debate whether patients or provider organizations "own" the data, the authors propose that clinical data are not owned at all in the traditional sense, but rather that all who interact with or control the data have an obligation to ensure that the data are used for the benefit of future patients and society.
Source: http://dx.doi.org/10.1148/radiol.2020192536
June 2020

AppendiXNet: Deep Learning for Diagnosis of Appendicitis from A Small Dataset of CT Exams Using Video Pretraining.

Sci Rep 2020 03 3;10(1):3958. Epub 2020 Mar 3.

Stanford University Department of Radiology, Stanford, USA.

The development of deep learning algorithms for complex tasks in digital medicine has relied on the availability of large labeled training datasets, usually containing hundreds of thousands of examples. The purpose of this study was to develop a 3D deep learning model, AppendiXNet, to detect appendicitis, one of the most common life-threatening abdominal emergencies, using a small training dataset of less than 500 training CT exams. We explored whether pretraining the model on a large collection of natural videos would improve the performance of the model over training the model from scratch. AppendiXNet was pretrained on a large collection of YouTube videos called Kinetics, consisting of approximately 500,000 video clips and annotated for one of 600 human action classes, and then fine-tuned on a small dataset of 438 CT scans annotated for appendicitis. We found that pretraining the 3D model on natural videos significantly improved the performance of the model from an AUC of 0.724 (95% CI 0.625, 0.823) to 0.810 (95% CI 0.725, 0.895). The application of deep learning to detect abnormalities on CT examinations using video pretraining could generalize effectively to other challenging cross-sectional medical imaging tasks when training data is limited.
Source: http://dx.doi.org/10.1038/s41598-020-61055-6
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7054445
March 2020

Preparing Medical Imaging Data for Machine Learning.

Radiology 2020 04 18;295(1):4-15. Epub 2020 Feb 18.

From the Department of Radiology, Stanford University School of Medicine, 300 Pasteur Dr, S-072, Stanford, CA 94305-5105 (M.J.W., D.F., D.L.R., M.P.L.); Segmed, Menlo Park, Calif (M.J.W., W.A.K., C.H., J.W.); School of Engineering, Stanford University, Stanford, Calif (J.W.); Institute of Cognitive Neuroscience, University College London, London, England (H.H.); Radiology and Imaging Sciences, National Institutes of Health Clinical Center, Bethesda, Md (L.R.F.); Imaging Biomarkers and Computer-Aided Diagnosis Laboratory, National Institutes of Health, Clinical Center, Bethesda, Md (R.M.S.); Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, Calif (D.L.R.); and Stanford Center for Artificial Intelligence in Medicine and Imaging (AIMI), Stanford, Calif (M.P.L.).

Artificial intelligence (AI) continues to garner substantial interest in medical imaging. The potential applications are vast and include the entirety of the medical imaging life cycle from image creation to diagnosis to outcome prediction. The chief obstacles to development and clinical implementation of AI algorithms include availability of sufficiently large, curated, and representative training data that includes expert labeling (eg, annotations). Current supervised AI methods require a curation process for data to optimally train, validate, and test algorithms. Currently, most research groups and industry have limited data access based on small sample sizes from small geographic areas. In addition, the preparation of data is a costly and time-intensive process, the results of which are algorithms with limited utility and poor generalization. In this article, the authors describe fundamental steps for preparing medical imaging data in AI algorithm development, explain current limitations to data curation, and explore new approaches to address the problem of data availability.
Source: http://dx.doi.org/10.1148/radiol.2020192224
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7104701
April 2020

MIMIC-CXR, a de-identified publicly available database of chest radiographs with free-text reports.

Sci Data 2019 12 12;6(1):317. Epub 2019 Dec 12.

Department of Emergency Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA.

Chest radiography is an extremely powerful imaging modality, allowing for a detailed inspection of a patient's chest, but requires specialized training for proper interpretation. With the advent of high-performance general-purpose computer vision algorithms, the accurate automated analysis of chest radiographs is of increasing interest to researchers. Here we describe MIMIC-CXR, a large dataset of 227,835 imaging studies for 65,379 patients presenting to the Beth Israel Deaconess Medical Center Emergency Department between 2011 and 2016. Each imaging study can contain one or more images, usually a frontal view and a lateral view. A total of 377,110 images are available in the dataset. Studies are made available with a semi-structured free-text radiology report that describes the radiological findings of the images, written by a practicing radiologist contemporaneously during routine clinical care. All images and reports have been de-identified to protect patient privacy. The dataset is made freely available to facilitate and encourage a wide range of research in computer vision, natural language processing, and clinical data mining.
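A sketch of how one might pair a study's report with its images from a local copy of the dataset; the root path is hypothetical, and the index file names follow the PhysioNet release at the time of writing:

```python
from pathlib import Path
import pandas as pd
import pydicom

ROOT = Path("/data/mimic-cxr")  # hypothetical local copy of the dataset

records = pd.read_csv(ROOT / "cxr-record-list.csv.gz")  # one row per image
studies = pd.read_csv(ROOT / "cxr-study-list.csv.gz")   # one row per study

# Pair the first study's free-text report with its DICOM images.
study = studies.iloc[0]
report_text = (ROOT / study["path"]).read_text()
image_paths = records.loc[records.study_id == study.study_id, "path"]
images = [pydicom.dcmread(ROOT / p) for p in image_paths]
```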
Source: http://dx.doi.org/10.1038/s41597-019-0322-0
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6908718
December 2019

Pediatric Interventional Oncology: Endovascular, Percutaneous, and Palliative Procedures.

Semin Roentgenol 2019 Oct 25;54(4):359-366. Epub 2019 Jun 25.

Department of Radiology, Columbia University Medical Center, New York, NY.

Source: http://dx.doi.org/10.1053/j.ro.2019.06.008
October 2019

Development and Performance of the Pulmonary Embolism Result Forecast Model (PERFORM) for Computed Tomography Clinical Decision Support.

JAMA Netw Open 2019 08 2;2(8):e198719. Epub 2019 Aug 2.

Department of Radiology, Stanford University, Stanford, California.

Importance: Pulmonary embolism (PE) is a life-threatening clinical problem, and computed tomographic imaging is the standard for diagnosis. Clinical decision support rules based on PE risk-scoring models have been developed to compute pretest probability but are underused and tend to underperform in practice, leading to persistent overuse of CT imaging for PE.

Objective: To develop a machine learning model to generate a patient-specific risk score for PE by analyzing longitudinal clinical data as clinical decision support for patients referred for CT imaging for PE.

Design, Setting, And Participants: In this diagnostic study, the proposed workflow for the machine learning model, the Pulmonary Embolism Result Forecast Model (PERFORM), transforms raw electronic medical record (EMR) data into temporal feature vectors and develops a decision analytical model targeted toward adult patients referred for CT imaging for PE. The model was tested on holdout patient EMR data from 2 large, academic medical practices. A total of 3397 annotated CT imaging examinations for PE from 3214 unique patients seen at Stanford University hospitals and clinics were used for training and validation. The models were externally validated on 240 unique patients seen at Duke University Medical Center. The comparison with clinical scoring systems was done on 100 randomly selected outpatient samples from Stanford University hospitals and clinics and 101 outpatient samples from Duke University Medical Center.

Main Outcomes And Measures: Prediction performance of diagnosing acute PE was evaluated using ElasticNet, artificial neural networks, and other machine learning approaches on holdout data sets from both institutions, and performance of models was measured by area under the receiver operating characteristic curve (AUROC).
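Of the approaches named, ElasticNet is the most self-contained to sketch: an elastic-net-penalized logistic regression over the temporal feature vectors, scored by AUROC. Placeholder data stands in for the EMR-derived features, whose construction is elided:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# X: temporal feature vectors derived from EMR history; y: CT-confirmed PE.
X = np.random.rand(3397, 500)            # placeholder features
y = np.random.randint(0, 2, 3397)        # placeholder labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Elastic-net-penalized logistic regression (the saga solver supports it).
clf = LogisticRegression(penalty="elasticnet", solver="saga",
                         l1_ratio=0.5, C=1.0, max_iter=5000)
clf.fit(X_tr, y_tr)
auroc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
```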

Results: Of the 3214 patients included in the study, 1704 (53.0%) were women from Stanford University hospitals and clinics; mean (SD) age was 60.53 (19.43) years. The 240 patients from Duke University Medical Center used for validation included 132 women (55.0%); mean (SD) age was 70.2 (14.2) years. In the samples for clinical scoring system comparisons, the 100 outpatients from Stanford University hospitals and clinics included 67 women (67.0%); mean (SD) age was 57.74 (19.87) years, and the 101 patients from Duke University Medical Center included 59 women (58.4%); mean (SD) age was 73.06 (15.3) years. The best-performing model achieved an AUROC performance of predicting a positive PE study of 0.90 (95% CI, 0.87-0.91) on intrainstitutional holdout data with an AUROC of 0.71 (95% CI, 0.69-0.72) on an external data set from Duke University Medical Center; superior AUROC performance and cross-institutional generalization of the model of 0.81 (95% CI, 0.77-0.87) and 0.81 (95% CI, 0.73-0.82), respectively, were noted on holdout outpatient populations from both intrainstitutional and extrainstitutional data.

Conclusions And Relevance: The machine learning model, PERFORM, may consider multitudes of applicable patient-specific risk factors and dependencies to arrive at a PE risk prediction that generalizes to new population distributions. This approach might be used as an automated clinical decision-support tool for patients referred for CT PE imaging to improve CT use.
Source: http://dx.doi.org/10.1001/jamanetworkopen.2019.8719
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6686780
August 2019