A Context Sensitive Text Writing Correction and Error Detection for Afaan Oromoo Words

Workineh Tesema
Tirate Kumera
Million Meshesha

Abstract

This article presents a context-sensitive spelling checker for Afaan Oromoo built from an unstructured free-text corpus. We describe a new, context-based approach to word error detection and correction. Spell checking is particularly important for Afaan Oromoo because the language is morphologically rich. The application was developed to reduce misspellings made while typing Afaan Oromoo text; such a tool matters for a language written in Latin script, like Afaan Oromoo, whose words contain long vowels. Because no annotated corpus exists for Afaan Oromoo, the work uses a statistical, unsupervised method based on bigrams (2-grams). Bigram counts from the free corpus capture the contexts in which words occur, and Levenshtein distance measures the similarity between words. The findings show that both non-word errors and real-word errors occur when typing Afaan Oromoo words. The system performed surprisingly well; however, the corpus suffers from data sparseness, as it cannot capture a large vocabulary that includes proper names, abbreviations, special acronyms, hyphenated words, words with apostrophes, domain-specific terms, technical jargon, and terminology. To the best of our knowledge, this work is the first of its kind for Afaan Oromoo. The system achieved an accuracy of 93.9%, which shows that the model is good, but further investigation with different algorithms is needed, particularly on proper names, abbreviations, acronyms, hyphenated words, words with apostrophes, domain-specific terms, and technical jargon.
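The combination of bigram context and Levenshtein distance described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`levenshtein`, `train`, `suggest`), the distance threshold `max_dist`, the ranking rule, and the three-sentence toy corpus are all assumptions made for illustration, and the toy sentences are only there to exercise the code.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion and substitution."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def train(sentences):
    """Count unigrams and bigrams from raw, unannotated sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def suggest(prev_word, word, unigrams, bigrams, max_dist=2):
    """Rank vocabulary words close to `word`, preferring those seen after `prev_word`.

    Hypothetical ranking: bigram count first, then edit distance,
    then overall word frequency.
    """
    candidates = [w for w in unigrams if levenshtein(word, w) <= max_dist]
    return sorted(
        candidates,
        key=lambda w: (-bigrams[(prev_word, w)], levenshtein(word, w), -unigrams[w]),
    )


# Toy corpus (an assumption, not data from the article).
TOY_CORPUS = [
    "inni bishaan dhuge",
    "inni buna dhuge",
    "isheen aannan dhugde",
]

unigrams, bigrams = train(TOY_CORPUS)
print(suggest("buna", "dhugee", unigrams, bigrams))
print(suggest("aannan", "dhugee", unigrams, bigrams))
```

In this sketch the same mistyped token receives a different first suggestion depending on the preceding word, which is the sense in which the checking is context sensitive. A token absent from `unigrams` would count as a non-word error, while a known word whose bigram with its neighbour was never observed is a candidate real-word error. With so small a corpus, most bigram counts are zero, which mirrors the data sparseness the abstract reports.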

Article Details

How to Cite
Tesema, W., Kumera, T., & Meshesha, M. (2020). A Context Sensitive Text Writing Correction and Error Detection for Afaan Oromoo Words. Gadaa Journal, 3(1), 70-85. Retrieved from https://ejhs.ju.edu.et/index.php/gadaa/article/view/2019
Section
Articles
Author Biographies

Workineh Tesema, Jimma University

Institute of Technology

Department of Information Technology

Tirate Kumera, Mettu University

Department of Management Information System

Million Meshesha, Addis Ababa University

Department of Information Science