Lemmatization For Afan Oromo Text

dc.contributor.advisor: Solomon Teferra (PhD)
dc.contributor.author: Mohammed, Tiya
dc.date.accessioned: 2025-12-17T10:55:14Z
dc.date.issued: 2017-02
dc.description.abstract: This study evaluates the performance of lemmatization of Afan Oromo text for natural language processing applications such as information retrieval and machine learning. Its objectives are to identify the affixes and lemmas in a collection of written Afan Oromo surface words, to perform text lemmatization and examine its relationship with stemming, and to evaluate its performance. The study was conducted on 1000 manually collected tokens, each paired with a candidate lemma assigned in consultation with language experts. The tokens were analyzed with the hierarchical clusterer of the Weka package; 231 of the candidate lemmas were selected and tested with 10, 15, and 20 clusters, using the single-link criterion with edit-distance similarity between tokens. Single-link hierarchical clustering with edit distance achieved 98.3% accuracy, which indicates strong lemmatization performance on Afan Oromo text. When the two approaches were compared, lemmatization performed better than stemming. The 1.7% error rate is attributed to the manually annotated lemmas and to the hierarchical clustering algorithm itself. Further work on implementing lemmatization for this language is recommended so that it becomes easy to use for all users of the language. [en_US]
dc.description.sponsorship: ASTU [en_US]
dc.identifier.uri: http://10.240.1.28:4000/handle/123456789/1769
dc.language.iso: en [en_US]
dc.publisher: ASTU [en_US]
dc.title: Lemmatization For Afan Oromo Text [en_US]
dc.type: Thesis [en_US]
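
Illustration (not part of the repository record): the abstract describes grouping tokens by single-link hierarchical clustering over edit distance. The thesis itself used the Weka package; the sketch below is a plain-Python stand-in for that idea, not the author's implementation. The function names and the sample tokens are assumptions chosen for illustration and are not drawn from the thesis data.

```python
from typing import List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def single_link_clusters(tokens: List[str], k: int) -> List[List[str]]:
    """Agglomerative clustering: merge the two closest clusters until k remain.

    Under the single-link criterion, the distance between two clusters is
    the smallest edit distance between any pair of their members.
    """
    clusters = [[t] for t in tokens]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(edit_distance(x, y)
                        for x in clusters[i] for y in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters[j])
        del clusters[j]
    return clusters


# Hypothetical sample tokens (illustrative only, not the thesis corpus).
sample = ["mana", "manaan", "manatti", "deeme", "deemte", "deemuu"]
for group in single_link_clusters(sample, k=2):
    print(sorted(group))
```

With these sample tokens and k=2, the surface forms that share a stem end up in the same cluster. This is the behaviour a clustering-based lemmatizer relies on. How a lemma is then chosen for each cluster is not stated in the abstract and is left out of the sketch.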

Files

Original bundle

Name: Mohammed Tiya.pdf
Size: 1.52 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.71 KB
Format: Plain Text