J Med Imaging (Bellingham) 2020 Jan 21;7(1):014504. Epub 2020 Feb 21.
The University of Chicago, Department of Radiation and Cellular Oncology, Chicago, Illinois, United States.
While radiomics feature values can differ when extracted using different radiomics software, the effects of these variations when applied to a particular clinical task are currently unknown. The goal of our study was to use various radiomics software packages to classify patients with radiation pneumonitis (RP) and to quantify the variation in classification ability among packages. A database of serial thoracic computed tomography scans was obtained from 105 patients with esophageal cancer. Patients were treated with radiation therapy (RT), resulting in 20 patients developing RP grade . Regions of interest (ROIs) were randomly placed in the lung volume of the pre-RT scan within high-dose regions ( ), and corresponding ROIs were anatomically matched in the post-RT scan. Three radiomics packages were compared: A1 (in-house), IBEX v1.0 beta, and PyRadiomics v.2.0.0. Only radiomics features that were robust to deformable registration and common to all three packages were calculated: four first-order features and four gray-level co-occurrence matrix features. For each feature, the difference in value between the two time points was calculated, and logistic regression was used in conjunction with analysis of variance to classify patients with and without RP ( ). Classification ability for each package was assessed using receiver operating characteristic (ROC) analysis and compared using the area under the ROC curve (AUC). Of the eight radiomics features, five were significantly correlated with RP status for all three packages, whereas one feature was not significantly correlated with RP status for any of the three packages. For the remaining two features, whether the association with RP status was significant depended on the package. Seven of the eight features agreed among the packages in whether the AUC value was significantly . Radiomics features extracted using different software packages can result in differences in classification ability.
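The analysis pattern described in the abstract (change in a feature between the pre-RT and post-RT scans, followed by ROC analysis per package) can be illustrated with a minimal Python sketch. This is not the study's code: the function names `delta_features` and `auc`, the two "packages", and all numbers are hypothetical and chosen only for illustration. The sketch also skips the logistic regression and analysis of variance steps and computes the AUC directly from the feature difference; if each feature were modeled on its own, the fitted probabilities would be a monotone function of that difference, so the rank-based AUC would match up to direction.

```python
def delta_features(pre, post):
    """Per-patient change in one feature between the pre-RT and post-RT scans."""
    return [after - before for before, after in zip(pre, post)]


def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney statistic.

    Returns the fraction of (RP, non-RP) patient pairs in which the RP
    patient has the higher score, counting ties as half a pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one RP and one non-RP patient")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up data: 1 = developed RP, 0 = did not. Not taken from the study.
labels = [1, 1, 0, 0, 0, 0]

# The same hypothetical feature as reported by two hypothetical packages.
pre_a = [10.0, 12.0, 11.0, 9.0, 10.5, 11.5]
post_a = [15.0, 16.0, 11.5, 9.5, 12.0, 11.0]
pre_b = [10.0, 12.0, 11.0, 9.0, 10.5, 11.5]
post_b = [15.0, 13.0, 11.5, 9.5, 12.5, 11.0]

auc_a = auc(delta_features(pre_a, post_a), labels)
auc_b = auc(delta_features(pre_b, post_b), labels)
print(f"package A AUC = {auc_a:.3f}, package B AUC = {auc_b:.3f}")
```

With these invented values the two packages yield different AUCs for the same nominal feature, which is the kind of discrepancy the study set out to quantify. A real comparison would also need a statistical test on the AUC difference, which this sketch omits.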