
Journal of Sensor Networks and Data Communications (JSNDC)

ISSN: 2994-6433 | DOI: 10.33140/JSNDC

Impact Factor: 0.98

Enhancing Landslide Prediction: A Comparative Study of Ensembled and Non-Ensembled Machine Learning Approaches with Dimensionality Reduction and Random Feature Selection to Showcase Entropy Management

Abstract

Adeel Abbas, Farkhanda Abbas, Fazila Abbas, Abdulwahed Fahad Alrefaei and Mohammed Fahad Albeshr

This research examines how well various ensembled and non-ensembled machine learning algorithms perform before and after dimensionality reduction and manual feature engineering via random feature selection. The algorithms assessed include LightGBM, Extra Trees (EXT), XGBoost, Gradient Boosting Machine (GBM), CatBoost, Random Forest (RF), Naive Bayes (NB), K-Nearest Neighbors (KNN), and Decision Tree (DT). Before dimensionality reduction, LightGBM achieved an AUC/ROC score of 0.833 with a computational time (CT) of 15.985 seconds, while Extra Trees, XGBoost, and GBM each scored 0.832 with CTs of 15.892, 16.203, and 15.904 seconds, respectively. CatBoost came in second with an AUC/ROC score of 0.816 and a CT of 17.121 seconds, while Random Forest, Naive Bayes, KNN, and Decision Tree showed progressively lower AUC/ROC scores with varying CTs (RF: 0.784, 16.130 s; NB: 0.740, 2.456 s; KNN: 0.718, 1.897 s; DT: 0.689, 1.787 s). After dimensionality reduction, the algorithms showed improved performance metrics: LightGBM achieved the highest AUC/ROC score, 0.979, with a CT of 15.344 seconds, and CatBoost a competitive 0.977 with a CT of 15.235 seconds; the other methods also improved, and the smaller feature space made all computations more efficient. After manual feature engineering through random feature selection, AUC/ROC scores for LightGBM and XGBoost were 0.830 and 0.829, with CTs of 22.345 and 22.455 seconds, respectively, while CatBoost scored 0.814 with a CT of 24.587 seconds. These results reveal the additional computational complexity introduced by feature engineering, which affected both calculation times and performance metrics. This work highlights the impact of preprocessing strategies on computational efficiency and model performance.
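The evaluation protocol described above rests on two measurements per model: an AUC/ROC score and a wall-clock computational time. Since the paper does not publish its evaluation code, the following is a minimal illustrative sketch, not the authors' implementation: a rank-based (Mann-Whitney) AUC/ROC computation and a simple timing wrapper; the function names and the toy labels/scores are assumptions.

```python
import time

def auc_roc(y_true, y_score):
    """Rank-based AUC/ROC (Mann-Whitney U statistic), ties get average ranks."""
    pairs = sorted(zip(y_score, y_true))
    ranks = [0.0] * len(pairs)
    i = 0
    while i < len(pairs):
        # find the run of tied scores starting at i
        j = i
        while j + 1 < len(pairs) and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    pos_rank_sum = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = len(pairs) - n_pos
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def timed(fn, *args):
    """Return (result, elapsed seconds) for one call, mirroring the CT metric."""
    t0 = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - t0

# toy example: two negatives, two positives
score, elapsed = timed(auc_roc, [0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(score)  # → 0.75
```

In a study like this one, `auc_roc` would be applied to each classifier's predicted landslide probabilities on a held-out set, and `timed` would wrap the full fit-and-predict cycle to obtain the reported CTs.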
By concentrating on pertinent features, dimensionality reduction dramatically improved AUC/ROC scores and shortened computation times, whereas manual feature engineering offered more nuanced insights, often at the cost of added computational complexity. These results highlight the trade-offs involved in maximizing the accuracy and efficiency of machine learning models.
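The "random feature selection" preprocessing step can be illustrated with a minimal sketch. The paper does not specify its exact selection procedure, so the seeded uniform sampling and helper names below are assumptions made for reproducibility, not the authors' method.

```python
import random

def random_feature_subset(n_features, k, seed=42):
    """Pick k distinct feature indices uniformly at random.

    A fixed seed keeps the subset reproducible across runs (assumption:
    the study's exact sampling scheme is not published).
    """
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_features), k))

def select_columns(rows, indices):
    """Project each sample (a list of feature values) onto the chosen indices."""
    return [[row[i] for i in indices] for row in rows]
```

Each candidate subset produced this way would then be fed through the same AUC/ROC-and-timing evaluation as the full feature set, which is where the extra computational cost reported for this strategy comes from.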
