Discovering Knowledge From complex Data: The Case of Ethiopian Revenue and Customs Authority

Babesha, Kenaw

Files

BABESHA KENAW.pdf (1.85 MB)

Date

2018-01

Authors

Babesha, Kenaw

Abstract

The research area of this thesis is data mining in the revenue and customs sector. We applied machine learning techniques to a data set of imported customs items. The objective of the research is to evaluate models trained with machine learning algorithms and compare their results, in order to increase the efficiency of data analysis. We collected a total of 100,310 imported items, each described by 15 attributes, from the Ethiopian Revenue and Customs Authority (ERCA), and used a 10% random sample of this data set in the experiments. The data pre-processing and resampling techniques applied to improve the performance of the trained models are explained. Three typical models (Ordinary Linear Regression, SVM and Random Forest) were implemented on the given large data sets using different packages in R. The multiple R² and adjusted R² for Ordinary Linear Regression (OLR) are both almost 1, indicating that the model explains nearly 100% of the variation in total import costs. Under 10-fold cross-validation, the OLR model yields a minimum RMSE of 2952689 and a maximum R² of 0.8113806 on the test set. For the Random Forest (RF) model, the minimum RMSE and maximum R² are 464212.8 and 0.9928060 respectively, which shows better performance than OLR. The quantitative and visual results of our practical machine learning implementation show the feasibility of the random forest algorithm for large data sets. The results of the work reveal new opportunities for applying data mining methods based on machine learning in the revenue and customs domain.

URI

http://10.240.1.28:4000/handle/123456789/2924

Collections

Information System Engineering
