Chronic Kidney Disease Prediction Using Machine Learning Techniques
Publisher
ASTU
Abstract
Chronic kidney disease is a major challenge for health care systems all over the world. It
consumes a high percentage of health care budgets and mainly affects low-income countries.
Early prediction of the disease stage, and prevention based on severity level, is one of
the most important problems for the health sector, especially in developing countries like
Ethiopia. Machine learning plays a key role in analyzing large volumes of medical data and
solving complex problems for the early prediction of diseases.
This study aimed to develop a chronic kidney disease prediction model using machine learning
techniques to predict the severity of the disease as notckd, mild, moderate, severe, or ESRD,
and also the presence or absence of the disease as ckd or notckd. The data for this study
was collected from St. Paulo’s Hospital. The study followed an experimental approach in
order to determine the best-performing model, and used the Python programming language
for the implementation. The study employed three models, Random Forest (RF), Support
Vector Machine (SVM), and Decision Tree (DT), and two feature selection methods, analysis
of variance (ANOVA) and recursive feature elimination with cross-validation (RFECV).
First, the models were built on the whole dataset for both the binary-class and five-class
problems, and then the feature selection methods were applied to both datasets. The models
were evaluated using 10-fold cross-validation, and classification performance was used to
compare them. Binary models and multiclass models were developed based on the two
datasets. The results of this study show that Random Forest based on recursive feature
elimination with cross-validation performed better than the support vector machine and
decision tree models in terms of accuracy and F1-score. It achieved an accuracy of 99.8%
for the binary class and 79.0% for the multiclass problem, and an F1-score of 99.8% for the
binary class and 77.9% for the multiclass problem. Finally, the severity prediction model is
recommended for experts to provide appropriate prevention, treatment, and diet
recommendations based on the disease severity.
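
The workflow described in the abstract (three classifiers, two feature selection methods, 10-fold cross-validation, comparison on accuracy and F1-score) could be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the study's actual code. The synthetic data from make_classification stands in for the hospital dataset, and the helper names (build_models, anova_selector, rfecv_selector, evaluate), the value k=6, the macro-averaged F1, and all hyperparameters are assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV, SelectKBest, f_classif
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic binary data replaces the real (non-public) hospital records.
X, y = make_classification(
    n_samples=200, n_features=12, n_informative=6, n_redundant=2,
    n_classes=2, random_state=0,
)


def anova_selector():
    """ANOVA F-test filter; keeping k=6 features is an arbitrary choice here."""
    return SelectKBest(score_func=f_classif, k=6)


def rfecv_selector():
    """Recursive feature elimination with an inner 3-fold CV, ranked by a small RF."""
    ranker = RandomForestClassifier(n_estimators=10, random_state=0)
    return RFECV(estimator=ranker, step=2, cv=3, scoring="f1_macro")


def build_models(selector_factory):
    """Pair each classifier with a fresh feature selector inside a pipeline,
    so selection is refit on the training part of every fold (no leakage)."""
    return {
        "RF": make_pipeline(
            selector_factory(),
            RandomForestClassifier(n_estimators=50, random_state=0),
        ),
        "SVM": make_pipeline(StandardScaler(), selector_factory(), SVC()),
        "DT": make_pipeline(
            selector_factory(), DecisionTreeClassifier(random_state=0)
        ),
    }


def evaluate(models, X, y):
    """Return {model name: (mean accuracy, mean macro F1)} over 10 stratified folds."""
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
    results = {}
    for name, model in models.items():
        scores = cross_validate(
            model, X, y, cv=cv, scoring=["accuracy", "f1_macro"]
        )
        results[name] = (
            scores["test_accuracy"].mean(),
            scores["test_f1_macro"].mean(),
        )
    return results


anova_results = evaluate(build_models(anova_selector), X, y)
rfecv_results = evaluate(build_models(rfecv_selector), X, y)

for label, results in [("ANOVA", anova_results), ("RFECV", rfecv_results)]:
    for name, (acc, f1) in results.items():
        print(f"{label:5s} {name:3s} accuracy={acc:.3f} macro-F1={f1:.3f}")
```

The same functions apply unchanged to a five-class severity target, since all three classifiers and the macro-averaged F1 score handle multiclass labels. Scores obtained on the synthetic data say nothing about the figures reported in the abstract.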
