IEEE Transactions on Information Technology in Biomedicine vol:11 issue:3 pages:338-347
This paper investigates variable selection (VS) and classification for biomedical datasets with a small sample size and a very high input dimension. Sequential sparse Bayesian learning with linear bases is used as the basic VS algorithm. The selected variables are fed to kernel-based probabilistic classifiers: Bayesian least squares support vector machines (BayLSSVMs) and relevance vector machines (RVMs). We employ bagging for both VS and model building in order to improve the reliability of the selected variables and the predictive performance. This modeling strategy is applied to real-life medical classification problems, including two binary cancer diagnosis problems based on microarray data and a multiclass brain tumor classification problem using spectra acquired via magnetic resonance spectroscopy. The approach is compared experimentally with other VS methods. The results show that bagging can improve the reliability and stability of both VS and model prediction.
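To make the idea of bagged variable selection concrete, the following is a minimal sketch, not the method used in the paper. It assumes a simple stand-in selector (ranking variables by absolute correlation with the class label) in place of sequential sparse Bayesian learning. The function names, the `k`, `n_bags`, and `min_frequency` parameters, and the synthetic data are all hypothetical choices made for illustration.

```python
import random
from collections import Counter


def abs_correlation(xs, ys):
    """Absolute Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5


def select_top_k(X, y, k):
    """Stand-in base selector (assumption): keep the k variables most
    correlated with the label. The paper uses sparse Bayesian learning here."""
    n_vars = len(X[0])
    scores = [abs_correlation([row[j] for row in X], y) for j in range(n_vars)]
    return sorted(range(n_vars), key=lambda j: scores[j], reverse=True)[:k]


def bagged_selection(X, y, k, n_bags, min_frequency, seed=0):
    """Run the base selector on bootstrap resamples and keep the variables
    whose selection frequency is at least min_frequency."""
    rng = random.Random(seed)
    n = len(X)
    counts = Counter()
    for _ in range(n_bags):
        # Bootstrap: draw n sample indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        counts.update(select_top_k([X[i] for i in idx], [y[i] for i in idx], k))
    frequency = {j: counts[j] / n_bags for j in counts}
    stable = sorted(j for j, f in frequency.items() if f >= min_frequency)
    return stable, frequency


# Synthetic example: 20 samples, 30 variables, only variable 0 is informative.
data_rng = random.Random(1)
y = [i % 2 for i in range(20)]
X = [[label + data_rng.gauss(0, 0.05)] + [data_rng.gauss(0, 1) for _ in range(29)]
     for label in y]
stable, frequency = bagged_selection(X, y, k=3, n_bags=50, min_frequency=0.8)
print("stable variables:", stable)
```

Counting how often each variable is chosen across bootstrap resamples is one simple way to measure selection stability: a variable that appears only in a few resamples is likely an artifact of the small sample rather than a reliable predictor.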