Insight into Best Variables for COPD Case Identification: A Random Forests Analysis.

Chronic Obstr Pulm Dis 2016;3(1):406-418

Weill Cornell Medical Center, New York, New York.

Rationale: This study is part of a larger, multi-method project to develop a questionnaire for identifying undiagnosed cases of chronic obstructive pulmonary disease (COPD) in primary care settings, with specific interest in the detection of patients with moderate to severe airway obstruction or risk of exacerbation.

Objectives: To examine 3 existing datasets for insight into key features of COPD that could be useful in the identification of undiagnosed COPD.

Methods: Random forests analyses were applied to the following databases: COPD Foundation Peak Flow Study Cohort (N=5761), Burden of Obstructive Lung Disease (BOLD) Kentucky site (N=508), and COPDGene® (N=10,214). Four scenarios were examined to find the best, smallest sets of variables that distinguished cases and controls: (1) moderate to severe COPD (forced expiratory volume in 1 second [FEV1] <50% predicted) versus no COPD; (2) undiagnosed versus diagnosed COPD; (3) COPD with and without exacerbation history; and (4) clinically significant COPD (FEV1 <60% predicted or history of acute exacerbation) versus all others.
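The four case-control comparisons in the Methods can be read as simple labelling rules. The sketch below is illustrative only: the record fields (`has_copd`, `diagnosed`, `fev1_pct_predicted`, `exacerbation_history`) are hypothetical names rather than variables from the three datasets, and it assumes that "clinically significant COPD" in scenario 4 requires COPD to be present.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Participant:
    # All field names are hypothetical, chosen for illustration.
    has_copd: bool
    diagnosed: bool
    fev1_pct_predicted: float
    exacerbation_history: bool


def scenario_label(p: Participant, scenario: int) -> Optional[str]:
    """Return the comparison group for a participant, or None if excluded."""
    if scenario == 1:
        # Moderate to severe COPD (FEV1 <50% predicted) versus no COPD.
        if not p.has_copd:
            return "no COPD"
        return "moderate to severe COPD" if p.fev1_pct_predicted < 50 else None
    if scenario == 2:
        # Undiagnosed versus diagnosed COPD.
        if not p.has_copd:
            return None
        return "diagnosed COPD" if p.diagnosed else "undiagnosed COPD"
    if scenario == 3:
        # COPD with versus without exacerbation history.
        if not p.has_copd:
            return None
        if p.exacerbation_history:
            return "COPD with exacerbation history"
        return "COPD without exacerbation history"
    if scenario == 4:
        # Clinically significant COPD versus all others (assumes COPD present).
        significant = p.has_copd and (
            p.fev1_pct_predicted < 60 or p.exacerbation_history
        )
        return "clinically significant COPD" if significant else "all others"
    raise ValueError(f"unknown scenario: {scenario}")
```

Returning `None` for participants outside a comparison keeps each scenario's analysis population explicit instead of silently folding them into the control group.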

Results: Sets of 4 to 8 variables differentiated cases from controls, with sensitivity ≥73% (range: 73-90%) and specificity ≥68% (range: 68-93%). Across scenarios, the best models included age, smoking status or history, symptoms (cough, wheeze, phlegm), general or breathing-related activity limitation, episodes of acute bronchitis, and/or missed work days and non-work activities due to breathing or health.
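For readers less familiar with the two metrics reported in the Results, sensitivity is the share of true cases a model flags and specificity is the share of true controls it clears. The following is a minimal sketch of that arithmetic; the function name and the 0/1 label encoding are assumptions for illustration, and it does not reproduce the study's analysis or data.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Compute (sensitivity, specificity) for labels coded 1 = case, 0 = control."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Tally the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case and one control")

    sensitivity = tp / (tp + fn)  # true cases correctly identified
    specificity = tn / (tn + fp)  # true controls correctly identified
    return sensitivity, specificity
```

With made-up labels in which 3 of 4 cases and 4 of 5 controls are classified correctly, the function returns a sensitivity of 0.75 and a specificity of 0.8.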

Conclusions: Results provide insight into variables that should be considered during the development of candidate items for a new questionnaire to identify undiagnosed cases of clinically significant COPD.

Source
DOI: http://dx.doi.org/10.15326/jcopdf.3.1.2015.0144
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4729451
January 2016