Amharic Print Documents Transcription To Audio Using Machine Learning Techniques

dc.contributor.advisor: Tilahun Melake (PhD)
dc.contributor.author: Sinishaw, Getachew
dc.date.accessioned: 2025-12-17T10:54:42Z
dc.date.issued: 2022-09
dc.description.abstract: Printed document recognition remains a largely unsolved problem in the field of pattern recognition. This thesis examines how modern deep learning techniques can improve printed document recognition for the Amharic language. Although Amharic is a literary language in Ethiopia, it is underrepresented in research on document image recognition and analysis. Therefore, a printed document recognition system based on real-world, large-scale digitization scenarios is proposed. Its architecture consists of four functions: pre-processing (binarization and skew estimation), page layout analysis, recognition, and post-processing. A test setup was prepared for each task. In the binarization task, four methods (Otsu's global method, Otsu's local method, Adaptive_Thresh_Mean_Method, and Adaptive_Thresh_Gaussian Method) were compared using the FM, P-FM, PSNR, and DRD evaluation measures. Adaptive_Thresh_Mean_Method outperformed all other methods on every measure. In the document image skew estimation task, a method based on the Hough transform was tested and its effect on the dataset was investigated. The evaluation measures CE, AED, and TOP80 were used, and values of 86.00, 0.4215, and 0.068 were obtained. For page layout analysis, the performance of the open-source Leptonica C library was investigated, and it proved highly successful at the region and line level across Amharic page layouts in various formats. The final test setup was designed to build a recognition model using the Tesseract OCR engine. Because it was difficult to prepare a large, accurate training dataset from real printed documents, a calibration method based on the Amharic language was proposed and implemented. A total of 286 images of 200 lines of text collected from 15 different pages were processed, and a recognition model with a character error rate of 2.632% was built.
Overall, the experiments conducted with the prototype have yielded encouraging results, so the full development of an OCR system for Amharic printed document recognition is feasible and valuable for many purposes, especially education. [en_US]
dc.description.sponsorship: ASTU [en_US]
dc.identifier.uri: http://10.240.1.28:4000/handle/123456789/1668
dc.language.iso: en_US [en_US]
dc.publisher: ASTU [en_US]
dc.subject: OCR System, Recognition, Page Layouts [en_US]
dc.title: Amharic Print Documents Transcription To Audio Using Machine Learning Techniques [en_US]
dc.type: Thesis [en_US]
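
The abstract reports a character error rate (CER) of 2.632% for the recognition model but does not say how the figure was computed. The sketch below shows one common definition, assumed here and not taken from the thesis: the total Levenshtein edit distance between reference and recognized lines, divided by the total number of reference characters. The function names `edit_distance` and `character_error_rate` are hypothetical, and characters are counted as Unicode code points, so each Ethiopic syllable counts as one character.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in Unicode code points."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference characters into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference character
                curr[j - 1] + 1,     # insertion of a hypothesis character
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def character_error_rate(references: Sequence[str], hypotheses: Sequence[str]) -> float:
    """CER in percent: summed edit distance over summed reference length."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("reference text is empty")
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    return 100.0 * total_edits / total_chars


if __name__ == "__main__":
    # Illustrative line pair (not from the thesis dataset): one substituted syllable.
    reference_lines = ["ሰላም ለዓለም"]
    recognized_lines = ["ሰላም ለዐለም"]
    print(f"CER = {character_error_rate(reference_lines, recognized_lines):.3f}%")
```

Summing edits and lengths over all lines before dividing weights long lines more heavily than short ones. Averaging per-line rates is another convention and would generally give a different number.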

Files

Original bundle

Name: Sinshaw Getachew.pdf
Size: 3.15 MB
Format: Adobe Portable Document Format
