A Framework for Water Pollution Monitoring System Using Sensors and Machine Learning Technique

Publisher

ASTU

Abstract

Water is the most important of the five core elements that sustain the natural existence and survival of all life on Earth. This study addresses water pollution, which has become a common problem in developing countries because of the substantial growth of industry and urbanization. Polluted water discharged by industry can cause a series of problems for vegetation and, directly or indirectly, poses health risks to humans. The core objective of this research is to design a framework for water pollution classification and prediction that integrates sensors with machine learning techniques. To this end, the study proposes polluted water classification and prediction using machine learning and feature selection techniques to build prediction models. A sample polluted-water dataset was obtained from online sources, and real-time data were collected using a pH sensor and a waterproof temperature sensor connected to an Arduino Uno, with communication through Tera Term and PuTTY. The sample dataset was labeled with the help of domain experts and divided into two classes, Suitable and Polluted, according to whether the water is fit for reuse. The study took an experimental approach to determine the best combination of machine learning algorithm and feature selection technique. SVM, KNN, DT, and RF models were trained on the whole sample dataset with the feature selection methods RFE, UFS, and LR. The models were evaluated using 10-fold cross-validation, with accuracy, misclassification, and classification performance measured by well-known metrics: precision, recall, and F1-score. Using the sample dataset, the study developed binary classification models on the features selected by importance. The models based on RF with RFE achieved slightly better performance than the SVM, KNN, and DT models for binary classification, and prediction with RF and RFE reached an accuracy of approximately 97%.
From this result, it is possible to conclude that the RF models should be preferred, since they have better prediction performance and are less error-prone in classification than the SVM, KNN, and DT approaches.
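
The evaluation protocol described above (10-fold cross-validation scored by accuracy, precision, recall, and F1-score on the Suitable/Polluted labels) can be sketched in plain Python. This is only an illustrative sketch, not the study's implementation: the function names `k_fold_indices` and `binary_metrics`, the choice of "Polluted" as the positive class, and the toy labels are all assumptions made for the example, and no model training is shown.

```python
from typing import Dict, List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous (train, test) index pairs.

    Hypothetical helper: no shuffling or stratification is applied here.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        # The first `extra` folds get one additional sample.
        size = base + (1 if i < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


def binary_metrics(
    y_true: Sequence[str],
    y_pred: Sequence[str],
    positive: str = "Polluted",  # assumption: "Polluted" is the positive class
) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 for two-class labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Toy labels for illustration only; these are not data from the study.
    truth = ["Polluted", "Polluted", "Suitable", "Suitable", "Polluted"]
    preds = ["Polluted", "Suitable", "Suitable", "Polluted", "Polluted"]
    print(binary_metrics(truth, preds))
    print(len(k_fold_indices(25, k=10)), "folds")
```

In a full pipeline, each fold's training indices would be used to fit a classifier on the selected features, and `binary_metrics` would be applied to that fold's test predictions before averaging the scores across the ten folds.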
