Important complexity reduction of random forest in multi-classification problem
Algorithmic complexity is a real concern in machine learning, especially for large-scale systems. As data dimensionality grows, particular emphasis is placed on designing computationally efficient learning models. In this paper, we propose an approach to reduce the computational complexity of a multi-class classification problem in cloud networks. Using the Random Forest algorithm and the high-dimensional UNSW-NB15 dataset, we first tune the algorithm to reduce the number of trees grown for classification. We then apply importance-based feature selection to reduce the number of predictors involved in the learning process. These optimizations, chosen with respect to the best performance recorded by our classifier, yield a substantial reduction in computational complexity during both the training and prediction phases. © 2019 IEEE.
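The two steps described above (reducing the number of trees, then selecting features by importance) can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes scikit-learn, uses a synthetic dataset from `make_classification` as a stand-in for UNSW-NB15, and the names `fit_and_score`, `TOLERANCE`, `TOP_K`, and the tree grid are hypothetical choices made only for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic multi-class data standing in for UNSW-NB15 (assumption).
X, y = make_classification(
    n_samples=2000, n_features=40, n_informative=10,
    n_classes=5, random_state=0,
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)


def fit_and_score(n_trees, columns):
    """Train a forest on the given feature columns; return it with its validation accuracy."""
    clf = RandomForestClassifier(n_estimators=n_trees, random_state=0)
    clf.fit(X_train[:, columns], y_train)
    return clf, clf.score(X_val[:, columns], y_val)


all_columns = np.arange(X.shape[1])

# Step 1: keep the smallest forest whose accuracy stays close to the best one.
tree_grid = [10, 25, 50, 100]   # hypothetical search grid
TOLERANCE = 0.01                # hypothetical accepted accuracy loss
scores = {n: fit_and_score(n, all_columns)[1] for n in tree_grid}
best = max(scores.values())
n_trees = min(n for n, s in scores.items() if s >= best - TOLERANCE)

# Step 2: rank features by impurity-based importance and keep the top ones.
TOP_K = 15                      # hypothetical number of predictors to keep
forest, full_score = fit_and_score(n_trees, all_columns)
ranking = np.argsort(forest.feature_importances_)[::-1]
selected = np.sort(ranking[:TOP_K])
_, reduced_score = fit_and_score(n_trees, selected)

print(f"trees kept: {n_trees} of {max(tree_grid)}")
print(f"features kept: {len(selected)} of {X.shape[1]}")
print(f"accuracy, all features: {full_score:.3f}; selected features: {reduced_score:.3f}")
```

Fewer trees and fewer predictors both lower the cost of training and of each prediction, since every tree must be grown and traversed and every candidate split scans the retained features. In practice the tolerance and the number of kept features would be set against the accuracy the application requires.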