Large Vocabulary Continuous Speech Recognition System For Afaan Oromo Using Hidden Markov Model (HMM)
Abstract
The ultimate goal of automatic speech recognition is to develop a model that automatically converts a speech utterance into a sequence of words. With the similar objective of transforming Afaan Oromo speech into its equivalent sequence of words, this study explored the possibility of developing a prototype large-vocabulary continuous read-speech recognizer for Afaan Oromo using a Hidden Markov Model (HMM).

Afaan Oromo is an Afro-Asiatic language and the most widely spoken member of the Cushitic family. It is spoken as a first language by more than 40 million Oromo and neighboring peoples in Ethiopia and Kenya. Afaan Oromo is written in Latin script (Roman orthography), and its script is called "QUBEE". Afaan Oromo is a phonetic language, which means that it is spoken the way it is written.

To develop and test the recognizer, one hour of read speech data was collected, then segmented and labeled into sentences with freely available toolkits. A prototype large-vocabulary continuous speech recognition system for Afaan Oromo was modeled using the open-source CMU Sphinx speech recognition toolkit.

The recognizer was tested on two types of test data set. The first is a speaker-dependent test set, in which the speakers who participated in training also participated in testing. The second is a speaker-independent test set, in which the speakers who participated in testing did not participate in training.

The experiments were carried out in two distinct phases. The first phase used a topology with skip transitions, context-dependent triphones, and 8 Gaussian mixtures. The second phase used a topology without skip transitions, context-dependent tr
