CCFS: A Confidence-based Cost-effective feature selection scheme for healthcare data classification.

Authors:
Yiyuan Chen
School of Agriculture and Biology
China
Yufeng Wang
Yamaguchi University Graduate School of Medicine
Japan
Liang Cao
National Cancer Institute
United States

IEEE/ACM Trans Comput Biol Bioinform 2019 Mar 7. Epub 2019 Mar 7.

Feature selection (FS) is one of the fundamental data processing techniques in machine learning, especially for the classification of healthcare data. It is challenging because the search space is large. Binary Particle Swarm Optimization (BPSO) is an efficient evolutionary computation technique and has been widely used in FS. However, in traditional BPSO-based FS schemes, each particle's historically best position and the globally best position of the particle swarm are iteratively updated according to the overall fitness of the particle, without taking into account the fine-grained impact of each dimension of the particle. In addition, the acquisition cost naturally differs from feature to feature, especially for medical data. To address these two issues, this paper proposes the Confidence-based and Cost-effective feature selection (CCFS) method using BPSO. First, CCFS improves search effectiveness by developing a new updating mechanism in which the confidence of each feature is explicitly considered, including the correlation between the feature and the categories and the historical selection frequency of each feature. Second, the feature cost is explicitly incorporated into the design of the fitness function. CCFS has been evaluated on various public UCI datasets. The experimental results show the effectiveness of the proposed method in terms of accuracy and feature selection cost.
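The two ingredients named in the abstract, a BPSO position update and a fitness function that accounts for feature cost, can be illustrated with a minimal Python sketch. This is not the CCFS method itself: the update shown is the standard sigmoid-transfer BPSO rule rather than the paper's confidence-based mechanism, and the weighted-sum fitness, the names `cost_aware_fitness` and `bpso_step`, and the parameters `alpha`, `w`, `c1`, `c2` and `v_max` are all assumptions made for illustration.

```python
import math
import random


def cost_aware_fitness(mask, accuracy, costs, alpha=0.8):
    """Trade classification accuracy against the relative cost of selected features.

    The weighted-sum form and `alpha` are illustrative assumptions,
    not the fitness function defined in the paper.
    """
    total_cost = sum(costs)
    selected_cost = sum(c for bit, c in zip(mask, costs) if bit)
    return alpha * accuracy - (1.0 - alpha) * (selected_cost / total_cost)


def bpso_step(position, velocity, pbest, gbest, rng,
              w=0.7, c1=1.5, c2=1.5, v_max=4.0):
    """One standard BPSO update (sigmoid transfer function) for one particle.

    Each bit of `position` marks whether the matching feature is selected.
    """
    new_position, new_velocity = [], []
    for x, v, p, g in zip(position, velocity, pbest, gbest):
        # Pull the velocity toward the particle's own best and the swarm's best.
        v = w * v + c1 * rng.random() * (p - x) + c2 * rng.random() * (g - x)
        v = max(-v_max, min(v_max, v))  # clamp so the sigmoid does not saturate
        prob_select = 1.0 / (1.0 + math.exp(-v))
        new_velocity.append(v)
        new_position.append(1 if rng.random() < prob_select else 0)
    return new_position, new_velocity


# Hypothetical usage: three features with made-up acquisition costs.
rng = random.Random(0)
costs = [1.0, 2.0, 7.0]
score = cost_aware_fitness([1, 0, 1], accuracy=0.9, costs=costs)
position, velocity = bpso_step([1, 0, 1], [0.0, 0.0, 0.0],
                               [1, 1, 0], [0, 1, 0], rng)
```

In this sketch the accuracy value would come from a classifier trained on the features selected by the mask; an expensive feature therefore has to earn its place by improving accuracy enough to offset its cost penalty.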


Source
http://dx.doi.org/10.1109/TCBB.2019.2903804
March 2019
1.438 Impact Factor

