Universal Networking Language Based Multilingual Machine Translation System For Amharic Language

Publisher

ASTU

Abstract

In this technologically integrated world, most people can easily access a huge volume of electronic information on common platforms such as the Internet, but the languages in which that information is represented remain a barrier. Machine translation is the name for computerized methods that automate the translation of documents from one language to another; it is a multidisciplinary research area drawing on linguistics, computer science, artificial intelligence, statistics, mathematics, philosophy and other fields. The Universal Networking Language (UNL) framework is an interlingua platform for handling multiple languages over the Internet. The UNL framework is used for various natural language processing tasks, such as information retrieval, text simplification, semantic reasoning and machine translation, and for representing, describing, summarizing and storing information in a format that is independent of any natural language (NL). There are two basic movements: UNLization is the process of representing NL in UNL, and NLization is the process of generating NL out of a UNL expression. UNL scales better than other methods in the number of modules required: with UNL only 2*n modules are needed, while other methods use n*(n-1) modules, where n is the number of NLs to be translated. In Ethiopia, more than 80 nations and nationalities speak their own native languages. Amharic is a member of the Semitic branch of the Afro-Asiatic family. The majority of Amharic speakers are found in different parts of the country. It has five dialectal variations across the country: Addis Ababa, Gojam, Gonder, Wollo and Menz. There are also Amharic speakers outside Ethiopia; it is used in Egypt, Israel, Canada, Sweden, Eritrea and the US. Thus, it has official status and is spoken by many people as their native or second language. In addition, it is a language with a large body of literature.
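The module counts quoted above can be checked with a few lines of arithmetic. The sketch below only evaluates the two formulas from the abstract, 2*n and n*(n-1); the function names are illustrative and do not come from the thesis.

```python
def interlingua_modules(n: int) -> int:
    """Modules needed with an interlingua: one analyser and one generator per language."""
    return 2 * n


def direct_modules(n: int) -> int:
    """Modules needed when every ordered language pair gets its own translator."""
    return n * (n - 1)


if __name__ == "__main__":
    # Compare the two counts for a few language-set sizes.
    for n in (2, 3, 5, 10, 80):
        print(n, interlingua_modules(n), direct_modules(n))
```

Note that the two counts are equal at n = 3, and the interlingua approach needs fewer modules only when more than three languages are involved; the gap then grows quadratically.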
There are relatively few contributions to the development of the Amharic language under the UNL framework. Previous research mostly addressed bilingual language pairs, does not support multilingual environments over the Internet, and is too dated to integrate centrally or to manage as a common platform for all NL representations. It is also very complex to develop corpora and linguistic resources of sufficient quantity and quality under these approaches, because they follow different techniques to represent a given language. That is why people are not satisfied with Internet usage, owing to the linguistic diversity of the Internet. Currently, these results create incompatibilities among the various methodologies used. Consequently, it is not possible to remove language barriers worldwide, even if all the results are brought together in one system. The methodology of this research is to create UNLization and NLization for the Amharic language. UNLization uses the Interactive Analyser tool, which takes Amharic sentences as input and delivers UNL; it includes a dictionary and grammar for Amharic analysis and represents the sentences as semantic networks in the UNL format. NLization uses the deep-to-surface natural language generator, a fully automatic natural language generation system that takes UNL input and delivers output in Amharic without any human intervention; it generates the sentences out of semantic networks represented in the UNL format. In their current release, both tools are web applications developed in Java and available at the UNLdev. Data collection was used to compare the UNL framework with other machine translation approaches and to support the preparation of the corpus, dictionary and grammar for the Amharic language.
Adama Science and Technology University, 2019
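To make the idea of a sentence as a semantic network concrete, here is a minimal sketch of a graph of labelled binary relations. It is an assumption-laden illustration, not the format produced by the tools named above: the `Relation` class and `render_graph` function are hypothetical, the example sentence ("John ate an apple") is English rather than Amharic, and the `agt` (agent) and `obj` (object) labels and the `@entry` and `@past` attributes follow UNL conventions only in simplified form.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Relation:
    """One labelled edge of the semantic network (hypothetical helper type)."""
    label: str   # relation name, e.g. "agt" for agent
    source: str  # node the edge starts from
    target: str  # node the edge points to

    def render(self) -> str:
        # Simplified textual form: label(source, target)
        return f"{self.label}({self.source}, {self.target})"


def render_graph(relations: Iterable[Relation]) -> str:
    """Render every relation on its own line."""
    return "\n".join(r.render() for r in relations)


# Illustrative graph for "John ate an apple" (not taken from the thesis corpus).
sentence_graph = [
    Relation("agt", "eat.@entry.@past", "John"),
    Relation("obj", "eat.@entry.@past", "apple"),
]

if __name__ == "__main__":
    print(render_graph(sentence_graph))
```

In this picture, UNLization maps a sentence to such a list of relations, and NLization walks the same list to produce a sentence in the target language.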
The study has limitations: the sample corpus is small, the system cannot handle complex sentences, few works in Ethiopian local language contexts were reviewed as references, and not all regional languages are covered. Evaluation used the F1 score, which combines precision and recall, to calculate grammar accuracy; the final outcomes are 0.640 and 0.0340 for UNLization and NLization respectively. Finally, the F1 results indicate that the accuracy of the system needs further improvement, depending on the specific dictionary and grammar syntax structures, in order to handle very large, complex Amharic sentences at paragraph level; in the future, this study can be extended to other Ethiopian local languages.
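For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall. The sketch below shows the standard computation; the counts in the usage example are hypothetical and are not the data behind the scores reported in the abstract.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Precision and recall from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; defined as 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical counts: 8 correct outputs, 2 spurious, 4 missed.
    p, r = precision_recall(tp=8, fp=2, fn=4)
    print(round(p, 3), round(r, 3), round(f1_score(p, r), 3))
```

Because the harmonic mean is dominated by the smaller of the two values, a system with high precision but low recall (or the reverse) still receives a low F1.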
