Predicting Employees Turnover in Factory using Machine Learning: The Case of Adama Industrial Park
Publisher
ASTU
Abstract
Employee turnover is a critical topic in the current human resource management literature. Turnover is the departure of employees from their jobs, and it negatively influences not only financial and operational performance but also organizational performance and stability. Predicting employee turnover is therefore of great importance: it can indicate to the employer how and for what reasons employees may leave the organization, and thus allow steps to be taken to reduce costly turnover rates. This study addresses the growing concern of employee turnover at Adama Industrial Park by focusing on understanding why employees leave and by developing an appropriate machine learning model to predict turnover. We began by collecting data from the company HR office, which yielded 16078 raw data entries with 13 features collected over four years, from 2018 to 2022. We then applied exploratory data analysis techniques such as handling missing values, encoding categorical data, feature selection, data binning, and data normalization to make the data interpretable for the machine learning models. We treat turnover prediction as a classification problem. We split the data into 12862 samples (80% of the total) for training and 3216 samples (20% of the total) for testing. The models were trained using Logistic Regression, Random Forest, and K-Nearest Neighbor. To enhance model accuracy, we used random search with cross-validation (random search CV) to tune the model parameters. We evaluated the models' performance with accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (AUC-ROC). The Random Forest model outperformed Logistic Regression and K-Nearest Neighbor, with an accuracy of 88.87%. Random Forest also achieved the highest average accuracy under 10-fold cross-validation, 88.31%, compared with Logistic Regression and K-Nearest Neighbor.
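The workflow described above (80/20 split, three classifiers, random search CV, five test metrics, 10-fold cross-validation) could be sketched roughly as follows. This is a minimal illustration, not the study's code: it assumes scikit-learn, uses synthetic data with the same row and feature counts in place of the HR records, and the search spaces, the `MinMaxScaler` step, and all variable names are hypothetical choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import (RandomizedSearchCV, cross_val_score,
                                     train_test_split)
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the HR data: 16078 rows and 13 features as in the
# abstract, but the values are random and carry no real meaning.
X, y = make_classification(n_samples=16078, n_features=13,
                           n_informative=6, random_state=0)

# 80/20 split; with 16078 rows this gives 12862 training and 3216 test rows.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
print(X_train.shape[0], X_test.shape[0])

# Hypothetical, deliberately small search spaces (assumptions, not the
# study's settings). Scaling is applied to the distance/gradient-based models.
candidates = {
    "logistic_regression": (
        make_pipeline(MinMaxScaler(), LogisticRegression(max_iter=1000)),
        {"logisticregression__C": [0.01, 0.1, 1.0, 10.0]},
    ),
    "random_forest": (
        RandomForestClassifier(random_state=0),
        {"n_estimators": [20, 40], "max_depth": [4, 8]},
    ),
    "knn": (
        make_pipeline(MinMaxScaler(), KNeighborsClassifier()),
        {"kneighborsclassifier__n_neighbors": [3, 5, 11, 21]},
    ),
}

results = {}
for name, (model, space) in candidates.items():
    # Random search over the space, scored by cross-validated accuracy.
    search = RandomizedSearchCV(model, space, n_iter=3, cv=3,
                                scoring="accuracy", random_state=0)
    search.fit(X_train, y_train)
    pred = search.predict(X_test)
    proba = search.predict_proba(X_test)[:, 1]
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred),
        "recall": recall_score(y_test, pred),
        "f1": f1_score(y_test, pred),
        "auc_roc": roc_auc_score(y_test, proba),
        # Mean accuracy of the tuned model under 10-fold cross-validation.
        "cv10_accuracy": cross_val_score(search.best_estimator_, X, y,
                                         cv=10, scoring="accuracy").mean(),
    }

for name, scores in results.items():
    print(name, {k: round(v, 4) for k, v in scores.items()})
```

Because the data here are synthetic, the printed scores say nothing about the reported 88.87% and 88.31% figures; the sketch only shows how the pieces named in the abstract fit together.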
We use Random Forest feature importance to identify which factors lead employees to leave; in our study, salary emerges as the main reason employees leave their jobs. The study's outcomes suggest that addressing salary-related issues could be pivotal in reducing turnover rates at Adama Indus
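As an illustration of how Random Forest feature importance can rank factors, the sketch below assumes scikit-learn's impurity-based `feature_importances_` attribute. The feature names are hypothetical, and the synthetic label is constructed so that the `salary` column dominates by design, so the ranking demonstrates the technique only and does not reproduce the study's finding.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_rows = 2000

# Hypothetical feature names; the real dataset's 13 features are not listed
# in the abstract.
feature_names = ["salary", "age", "tenure_months", "shift"]
X = rng.normal(size=(n_rows, len(feature_names)))

# Synthetic label: "left the job" is driven almost entirely by the first
# column (salary), plus a little noise. This is an assumption for the demo.
y = (X[:, 0] + 0.1 * rng.normal(size=n_rows) < 0).astype(int)

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances sum to 1; sort from most to least important.
ranking = sorted(zip(feature_names, forest.feature_importances_),
                 key=lambda pair: pair[1], reverse=True)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Impurity-based importances can favor high-cardinality features, so permutation importance on held-out data is a common cross-check.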
