Software Defect Prediction Using Hybrid Machine Learning Techniques

Publisher

ASTU

Abstract

Many researchers have developed software defect prediction methods by applying different machine learning algorithms. However, the performance of these techniques on most publicly available defect datasets is far from satisfactory, because such datasets are mostly affected by two main problems: high feature dimensionality and class imbalance. This research uses five software defect datasets from the AEEEM projects, namely EQ, JDT, LC, ML and PDE, all of which are affected by both problems. To address them, this research proposes software defect prediction models built with seven ensemble machine learning algorithms: AdaBoost, GB, XGBoost, RF, ET, Bagging and Stacking with a base classifier. Before the models are trained, three feature selection methods (CFS, SFS and CO) are applied as preprocessing to reduce feature dimensionality, and the SMOTE data balancing technique is applied to handle class imbalance. The experiments are evaluated with 10-fold cross-validation using accuracy, recall, precision, F-measure and AUC as performance metrics. The results indicate that CO feature selection combined with the ET ensemble learning algorithm outperforms the other models on all five datasets, with accuracies of 93.1%, 96.3%, 99.2%, 98.2% and 97.8% for the EQ, JDT, LC, ML and PDE datasets, respectively. The other performance metrics also exceed 91% on all datasets. Therefore, the ET with CO model is recommended for software defect prediction, that is, for classifying software modules as defective or non-defective.
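
The class-balancing step mentioned in the abstract can be illustrated with a minimal sketch of the SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. This is not the thesis's implementation. The function names `smote` and `balance`, the default `k=5`, the Euclidean distance, and the convention that label `1` marks the defective (minority) class are all assumptions made for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic rows interpolated between minority rows.

    Hypothetical helper: a simplified SMOTE, not the thesis's code.
    """
    if n_new == 0:
        return []
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other rows
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the chosen row (Euclidean distance)
        others = [row for j, row in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda row: math.dist(base, row))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, minority_label=1, k=5, seed=0):
    """Oversample the minority class until both classes have equal size."""
    minority = [row for row, label in zip(X, y) if label == minority_label]
    n_majority = len(y) - len(minority)
    n_new = max(0, n_majority - len(minority))
    new_rows = smote(minority, n_new, k=k, seed=seed)
    # New lists are returned; the inputs are left unmodified.
    return X + new_rows, y + [minority_label] * len(new_rows)


# Toy data: six non-defective modules (0) and two defective ones (1).
X = [[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5], [0.2, 0.8], [5, 5], [6, 7]]
y = [0, 0, 0, 0, 0, 0, 1, 1]
X_bal, y_bal = balance(X, y)
```

When such a step is combined with cross-validation, a common precaution is to oversample only the training folds, so that synthetic rows do not leak into the fold used for evaluation.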
