Tutorial - Evolutionary Feature Reduction for Machine Learning
Bach Nguyen, Victoria University of Wellington, NZ; Bing Xue, Victoria University of Wellington, NZ; Mengjie Zhang, Victoria University of Wellington, NZ
CIS
IEEE Members: Free
Non-members: Free
Length: 02:00:58
ABSTRACT: In the era of big data, high-dimensional data have become ubiquitous in domains such as social media, healthcare, and cybersecurity. Training machine learning algorithms on such data is often impractical because of the curse of dimensionality. Furthermore, high-dimensional data may contain redundant and/or irrelevant features that obscure the useful information carried by relevant features. Feature reduction addresses these issues by building a smaller but more informative feature set. Feature selection (FS) and feature construction (FC) are the two main approaches to feature reduction. FS aims to select a small subset of the original (relevant) features. FC aims to create a small set of new high-level (informative) features from the original feature set. Although both approaches are essential pre-processing steps, they are challenging because their search spaces are large and complex. Exhaustive searches are impractical because of their computational cost, while traditional heuristic searches require fewer computational resources but can become trapped in local optima. Evolutionary computation (EC) has been widely applied to feature reduction because of its potential global search ability. Existing EC-based feature reduction approaches successfully reduce data dimensionality and improve the classification performance and interpretability of the resulting models. This tutorial first introduces the main concepts and the general framework of feature reduction. We then show how EC techniques, such as particle swarm optimisation, genetic programming, ant colony optimisation, and evolutionary multi-objective optimisation, can address the challenges of feature reduction. The effectiveness of EC-based feature reduction is illustrated through applications in bioinformatics, image analysis and pattern classification, and cybersecurity. The tutorial concludes with open challenges for future research.
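
To give a rough feel for the EC-based feature selection described in the abstract, here is a minimal sketch. It is not taken from the tutorial. It assumes a plain genetic algorithm over bit-string feature masks (a simpler EC method than the particle swarm, genetic programming, or ant colony techniques the abstract names), a synthetic two-class dataset in which only the first two features are informative, and a nearest-centroid classifier scored on the training data. All function names and parameter values (make_data, fitness, evolve, the 0.01 size penalty) are hypothetical choices made for illustration.

```python
import random


def make_data(n_samples=120, n_features=8, seed=0):
    """Synthetic two-class data: only features 0 and 1 carry class signal (assumption)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        row[0] += 2.0 * label  # informative feature
        row[1] -= 2.0 * label  # informative feature
        X.append(row)
        y.append(label)
    return X, y


def fitness(mask, X, y, penalty=0.01):
    """Nearest-centroid accuracy on the selected features, minus a size penalty.

    Scoring on the training data keeps the sketch short; a real wrapper
    would normally use cross-validation or a held-out set.
    """
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature subset is treated as worthless
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, t in zip(X, y):
        dist = {c: sum((x[j] - m) ** 2 for j, m in zip(cols, mu))
                for c, mu in centroids.items()}
        if min(dist, key=dist.get) == t:
            correct += 1
    return correct / len(y) - penalty * len(cols)


def evolve(X, y, pop_size=20, generations=30, p_mut=0.1, seed=1):
    """Evolve bit-string feature masks; returns (best_mask, best_fitness_history)."""
    rng = random.Random(seed)
    n = len(X[0])
    cache = {}

    def score(mask):
        # Memoise fitness values, since the same mask is evaluated many times.
        key = tuple(mask)
        if key not in cache:
            cache[key] = fitness(mask, X, y)
        return cache[key]

    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        elite = max(pop, key=score)
        history.append(score(elite))
        nxt = [elite[:]]  # elitism: the best mask survives unchanged
        while len(nxt) < pop_size:
            # Tournament selection of two parents (tournament size 3).
            p1 = max(rng.sample(pop, 3), key=score)
            p2 = max(rng.sample(pop, 3), key=score)
            # Uniform crossover followed by bit-flip mutation.
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            child = [1 - g if rng.random() < p_mut else g for g in child]
            nxt.append(child)
        pop = nxt
    best = max(pop, key=score)
    history.append(score(best))
    return best, history


if __name__ == "__main__":
    X, y = make_data()
    best, history = evolve(X, y)
    print("selected feature indices:", [j for j, bit in enumerate(best) if bit])
    print("best fitness:", history[-1])
```

The size penalty in the fitness function folds the two goals mentioned in the abstract (high classification performance, few features) into a single score. The evolutionary multi-objective approaches covered in the tutorial would instead treat them as separate objectives.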