Predicting Lung Cancer Survivability: A Machine Learning Ensemble Method on SEER Data
Surrajkumar Prabhu Venkatesh and Lilly Raamesh
Abstract
Ensemble methods are machine learning techniques that combine several classifiers to improve prediction accuracy. In this study, ensemble learning methods for lung cancer survival prediction were evaluated on the Surveillance, Epidemiology, and End Results (SEER) dataset. The data were preprocessed in several steps before the classification models were applied. Two popular ensemble methods, Bagging and AdaBoost, were evaluated with three classification algorithms as base classifiers: K-Nearest Neighbours, Decision Tree, and Neural Networks. The results showed empirically that the ensemble methods improve on the performance of their base classifiers and are appropriate for the analysis of cancer survival.
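To make the idea of an ensemble over a base classifier concrete, the following is a minimal sketch of Bagging with a K-Nearest Neighbours base classifier, written in plain Python. It is not the authors' implementation and does not use SEER data: the two-class data are synthetic, and all function names (`knn_predict`, `bagging_fit`, `bagging_predict`, `make_toy_data`, `accuracy`) and parameter values (k = 3, 15 ensemble members) are assumptions chosen for illustration.

```python
import random
from collections import Counter


def knn_predict(train, query, k=3):
    """Base classifier: majority label among the k nearest training points."""
    nearest = sorted(
        train,
        key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], query)),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def bagging_fit(train, n_models, rng):
    """Draw one bootstrap sample (with replacement) per ensemble member."""
    return [[rng.choice(train) for _ in train] for _ in range(n_models)]


def bagging_predict(bootstraps, query, k=3):
    """Majority vote over the k-NN predictions from all bootstrap samples."""
    votes = Counter(knn_predict(sample, query, k) for sample in bootstraps)
    return votes.most_common(1)[0][0]


def make_toy_data(n, rng):
    """Synthetic two-class data (assumption: a stand-in for preprocessed records)."""
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        centre = 2.0 * label  # class 0 around (0, 0), class 1 around (2, 2)
        rows.append(((rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)), label))
    return rows


def accuracy(predict, rows):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(x) == y for x, y in rows) / len(rows)


rng = random.Random(0)  # fixed seed so the run is repeatable
train = make_toy_data(200, rng)
held_out = make_toy_data(100, rng)
bootstraps = bagging_fit(train, n_models=15, rng=rng)

single_acc = accuracy(lambda x: knn_predict(train, x), held_out)
bagged_acc = accuracy(lambda x: bagging_predict(bootstraps, x), held_out)
print(f"single k-NN accuracy: {single_acc:.2f}")
print(f"bagged k-NN accuracy: {bagged_acc:.2f}")
```

The sketch covers only Bagging. AdaBoost differs in that it reweights the training examples after each round so that later ensemble members concentrate on previously misclassified cases. On a toy problem like this one, the single and bagged accuracies may be very close, so the printed numbers say nothing about the results reported in the study.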