Cardiovascular Disease Prediction Using Machine Learning Approach

Publisher

ASTU

Abstract

Cardiovascular diseases (CVDs) are the leading cause of death worldwide, responsible for 17.9 million deaths, or 32% of all global deaths. Notably, more than 75% of these deaths occurred in low- and middle-income countries. In Ethiopia, CVDs kill 170 people every day and account for a significant portion of non-communicable disease (NCD) deaths and healthcare expenses. Early diagnosis of CVDs is crucial but challenging because of limited access to primary health care programs, a lack of expertise, a shortage of diagnostic and treatment equipment, and traditional diagnostic methods that are ineffective, costly, and time-consuming. Recently, machine learning has emerged as a valuable tool to support the diagnosis of CVDs. This study uses a stacking-based machine learning model to address the challenge of CVD diagnosis. While prior research has focused on general CVD prediction or on specific disease types, this study classifies the disease into four common CVD categories: CAD, PAD, RHD, and Stroke. The proposed stacking model was implemented using a CVD dataset collected at St. Paul’s Hospital Millennium Medical College (SPHMMC), which consists of 2196 instances with 19 features. Before being passed to the model, the dataset was preprocessed using imputation, z-score, label encoding, and min-max normalization. Stratified k-fold cross-validation was used to split the dataset during model construction. Performance was compared against three individual machine learning models: SVM, RF, and XGB. Model performance was evaluated with and without feature selection techniques (recursive feature elimination (RFE) and lasso regularization (L1)) using accuracy, precision, recall, and F1-score. Our experiments showed that the stacking model with RFE outperformed the others, achieving the highest accuracy of 97.55%.
The proposed model helps medical practitioners and cardiologists classify CVDs into four common categories effectively, thereby potentially saving lives and reducing the burden of these diseases.
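
The pipeline described above (scaling, RFE, a stacking ensemble, and stratified k-fold evaluation) can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the data is synthetic (the SPHMMC dataset is not reproduced here), `GradientBoostingClassifier` stands in for XGB, and the meta-learner, the number of folds, the number of selected features, and the use of `StandardScaler` for the z-score step are all choices made for this example that the abstract does not specify.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def build_pipeline(n_features_to_select=10, seed=0):
    """Scaling -> RFE -> stacking ensemble (all settings are assumptions)."""
    base_learners = [
        ("svm", SVC()),
        ("rf", RandomForestClassifier(n_estimators=50, random_state=seed)),
        # Stand-in for XGB so the sketch needs only scikit-learn.
        ("gb", GradientBoostingClassifier(n_estimators=30, random_state=seed)),
    ]
    stack = StackingClassifier(
        estimators=base_learners,
        # Assumed meta-learner; the abstract does not name one.
        final_estimator=LogisticRegression(max_iter=1000),
        cv=3,
    )
    selector = RFE(
        estimator=RandomForestClassifier(n_estimators=50, random_state=seed),
        n_features_to_select=n_features_to_select,
    )
    return Pipeline([
        ("scale", StandardScaler()),
        ("rfe", selector),
        ("stack", stack),
    ])


def evaluate(n_splits=5, seed=0):
    """Return per-fold accuracy on a synthetic 4-class, 19-feature dataset."""
    X, y = make_classification(
        n_samples=300,
        n_features=19,
        n_informative=8,
        n_classes=4,
        random_state=seed,
    )
    folds = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    return cross_val_score(build_pipeline(seed=seed), X, y, cv=folds,
                           scoring="accuracy")


scores = evaluate()
print("fold accuracies:", scores)
print("mean accuracy:", scores.mean())
```

Placing the scaler and RFE inside the `Pipeline` means both are refit on the training portion of each fold, so feature selection never sees the held-out samples. Accuracy on this synthetic data says nothing about the 97.55% reported for the clinical dataset.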
