UniBFS: A novel uniform-solution-driven binary feature selection algorithm for high-dimensional data
Date
2024
Author
Ahadzadeh, Behrouz
Abdar, Moloud
Foroumandi, Mahdieh
Safara, Fatemeh
Khosravi, Abbas
García, Salvador
Suganthan, Ponnuthurai Nagaratnam
Abstract
Feature selection (FS) is a crucial technique in machine learning and data mining, serving a variety of purposes such as simplifying model construction, facilitating knowledge discovery, improving computational efficiency, and reducing memory consumption. Despite its importance, the constantly growing search space of high-dimensional datasets poses significant challenges to FS methods, including the "curse of dimensionality," susceptibility to local optima, and high computational and memory costs. To overcome these challenges, this study develops a new FS algorithm named Uniform-solution-driven Binary Feature Selection (UniBFS). UniBFS exploits binary coding, the inherent characteristic of binary algorithms, to search the entire problem space, identifying relevant features while avoiding irrelevant ones. To improve the effectiveness and efficiency of UniBFS, a Redundant Features Elimination (RFE) algorithm is also presented. RFE performs a local search in a very small subspace of the solutions obtained by UniBFS at different stages and removes redundant features that do not increase classification accuracy. Moreover, the study proposes a hybrid algorithm that combines UniBFS with two filter-based FS methods, ReliefF and Fisher, to identify pertinent features during the global search phase. The proposed algorithms are evaluated on 30 high-dimensional datasets ranging from 2,000 to 54,676 dimensions, and their effectiveness and efficiency are compared with state-of-the-art techniques, demonstrating their superiority.
© 2024 The Author(s)
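The abstract describes RFE only at a high level, so the following is a minimal sketch of the general idea rather than the authors' procedure: given a binary feature mask, try switching off each selected feature and keep the removal whenever the score does not drop. The function names `eliminate_redundant` and `toy_score` are hypothetical, and `toy_score` is a stand-in for the classification accuracy that the paper uses as its criterion.

```python
from typing import Callable, List, Optional, Sequence

def eliminate_redundant(mask: Sequence[int],
                        score: Callable[[List[int]], float]) -> List[int]:
    """Greedy backward pass over a binary feature mask (illustrative only).

    A selected feature is dropped when removing it does not lower the score.
    """
    current = list(mask)
    best = score(current)
    for i in range(len(current)):
        if current[i] == 0:
            continue                 # feature not selected, nothing to test
        current[i] = 0               # tentatively remove feature i
        trial = score(current)
        if trial >= best:
            best = trial             # removal did not hurt: keep it
        else:
            current[i] = 1           # removal hurt: restore the feature
    return current

# Toy stand-in for classifier accuracy (assumption, not from the paper):
# features 0 and 1 carry the same information, feature 2 carries different
# information, and feature 3 is noise.
GROUPS: List[Optional[int]] = [0, 0, 1, None]

def toy_score(mask: List[int]) -> float:
    covered = {GROUPS[i] for i, bit in enumerate(mask)
               if bit == 1 and GROUPS[i] is not None}
    return len(covered) / 2

print(eliminate_redundant([1, 1, 1, 1], toy_score))  # prints [0, 1, 1, 0]
```

In this toy run one of the two duplicated features and the noise feature are removed while the score stays at 1.0. In practice the score would come from a classifier evaluated on the selected columns, which makes each trial far more expensive than here.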
Collections
- Network & Distributed Systems [142 items]