Comparison of Filter Techniques for Two-Step Feature Selection

keywords: Feature selection, selection stability, high dimensionality, exhaustive search, bioinformatics
Over the last decade, processing high-dimensional data has become an unavoidable task in many areas of research and daily life. Feature selection (FS), as part of the data processing methodology, is an important step in knowledge discovery. This paper proposes nine variations of a two-step feature selection approach, with a filter FS method employed in the first step and exhaustive search in the second. The performance of the proposed methods is compared in terms of stability and predictive performance. The results indicate that the choice of filter FS method in the first step has a strong influence on the resulting stability. Here, the univariate FS method based on the Pearson correlation coefficient appears to provide the most stable results.
reference: Vol. 36, 2017, No. 3, pp. 597–617
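
The two-step scheme described in the abstract can be illustrated with a minimal sketch. It assumes a Pearson-correlation filter in the first step and, in the second step, an exhaustive search scored by leave-one-out accuracy of a nearest-centroid classifier. The classifier, the scoring criterion, the number of retained features k, the toy data, and all function names are hypothetical choices made for illustration; the abstract does not specify them.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # constant column: treat as uncorrelated
    return cov / sqrt(vx * vy)


def filter_step(X, y, k):
    """Step 1: keep the k features with the largest |r| against the label."""
    n_features = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


def loo_accuracy(X, y, subset):
    """Leave-one-out accuracy of a nearest-centroid classifier.

    This scoring criterion is an assumption made for the sketch.
    """
    correct = 0
    for i in range(len(X)):
        centroids = {}
        for label in set(y):
            rows = [X[t] for t in range(len(X)) if t != i and y[t] == label]
            if rows:
                centroids[label] = [
                    sum(r[j] for r in rows) / len(rows) for j in subset
                ]
        point = [X[i][j] for j in subset]
        pred = min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(point, centroids[c])),
        )
        correct += pred == y[i]
    return correct / len(X)


def exhaustive_step(X, y, candidates):
    """Step 2: score every non-empty subset of the retained features."""
    best_subset, best_score = None, -1.0
    for size in range(1, len(candidates) + 1):
        for subset in combinations(candidates, size):
            score = loo_accuracy(X, y, subset)
            if score > best_score:  # ties keep the smaller, earlier subset
                best_subset, best_score = subset, score
    return best_subset, best_score


# Toy data (invented): features 0 and 2 track the label, 1 and 3 do not.
X_toy = [
    [0.1, 5.0, 1.0, 3.0],
    [0.2, 1.0, 1.2, 7.0],
    [0.0, 4.0, 0.9, 2.0],
    [0.9, 2.0, 2.1, 6.0],
    [1.0, 5.0, 1.9, 3.0],
    [0.8, 1.0, 2.0, 5.0],
]
y_toy = [0, 0, 0, 1, 1, 1]

kept = filter_step(X_toy, y_toy, k=2)
best_subset, best_score = exhaustive_step(X_toy, y_toy, kept)
```

The second step evaluates 2^k - 1 subsets for k retained features, which is why the filter in the first step has to reduce the feature set to a small k before exhaustive search becomes feasible.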