Afaan Oromo MultiLabel News Text Classification Using Convolutional Neural Network (CNN)

Diriba, Gichile

Files

Diriba Gichile.pdf (2.71 MB)

Date

2021-09

Authors

Diriba, Gichile

Publisher

ASTU

Abstract

Today, news texts often belong to more than one category, which makes their classification a multi-label problem. Furthermore, large amounts of text documents are generated from different sources, both online and offline. With the explosive growth of Internet news media and the disordered state of news texts, it is difficult to access the desired content from these sources in a timely manner. Therefore, this thesis puts forward an automatic classification model for news text based on a Convolutional Neural Network (CNN). It develops a model for multi-label news text classification for Afaan Oromo. The model takes text as input and classifies it into predefined labels/categories based on the content of the text. Text classification is a technique that assigns textual information to a predefined set of classes. In this work, various natural language processing tasks are performed, including text preprocessing, which covers normalization, tokenization, text cleaning, and removal of stop words. The main objective of natural language processing is to have computers perform tasks that would otherwise require human participation, reducing the labor, cost, and time involved. Most previous researchers have worked on single-label classification, which considers only mutually exclusive categories; this research instead focuses on classifying news text in a multi-label setting. In this study, six thousand four hundred eleven (6411) newly collected and annotated news texts were used to build the model for the Afaan Oromo language. After experiments on the problem domain, the Convolutional Neural Network was selected because it can easily incorporate pre-trained word embeddings and because the non-linearity of the network leads to greater classification accuracy.
The experiments showed different results for the pre-trained word embedding model compared to the non-pre-trained word embedding model. The Convolutional Neural Network model implemented for news text classification with pre-trained word embeddings, using a 10/90 train/test ratio, achieved the better performance, with precision of 83.3%, recall of 76.3%, F1-score of 79.3%, and accuracy of 73.2%. In contrast, the experiment with non-pre-trained word embeddings yielded precision of 74%, recall of 73.6%, F1-score of 73.8%, and accuracy of 68%.
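As an illustration of how precision, recall, F1-score, and accuracy can be computed when each news text may carry several labels, here is a minimal Python sketch. The abstract does not state how the scores were averaged or how accuracy was defined, so the micro-averaging and exact-match accuracy used here are assumptions, and the function name, label names, and toy data are hypothetical.

```python
from typing import Dict, List, Set


def multilabel_scores(true_sets: List[Set[str]], pred_sets: List[Set[str]]) -> Dict[str, float]:
    """Micro-averaged precision/recall/F1 and exact-match accuracy.

    Assumption: micro-averaging over all (document, label) pairs and
    exact-match (subset) accuracy; the thesis may define these differently.
    """
    tp = fp = fn = exact = 0
    for truth, pred in zip(true_sets, pred_sets):
        tp += len(truth & pred)      # labels predicted and correct
        fp += len(pred - truth)      # labels predicted but wrong
        fn += len(truth - pred)      # labels missed by the model
        exact += int(truth == pred)  # whole label set predicted exactly

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    accuracy = exact / len(true_sets) if true_sets else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Hypothetical toy data: four news texts with illustrative category names.
true_labels = [{"sport"}, {"politics", "economy"}, {"health"}, {"economy"}]
pred_labels = [{"sport"}, {"politics"}, {"health", "sport"}, {"economy"}]

scores = multilabel_scores(true_labels, pred_labels)
print(scores)
```

In this toy example a text counts toward accuracy only when its full label set is predicted exactly, which is why accuracy can sit below precision and recall in a multi-label setting.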

Keywords

News Text classification, Afaan Oromo, Deep learning, Word embedding, CNN

URI

http://10.240.1.28:4000/handle/123456789/1547

Collections

Thesis
