Please use this identifier to cite or link to this item: https://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/4700
Title: Romanized Sinhala to Sinhala Reverse Transliteration using a Hybrid Approach
Authors: Sumanathilaka, T.G.D.K.
Keywords: Romanized Sinhala
Transliteration
Neural Approach
Statistical approach
Sinhala, Typing
Issue Date: 22-Jun-2023
Abstract: With the rise of social media platforms and instant messaging, native languages are increasingly used for communication in electronic media. As multi-language support became common in the digital arena, both native Sinhala and Romanized Sinhala became prominent among the general community. Machine transliteration converts text from the alphabet of one language to that of another using computational approaches. The informal shorthand used in texting, also known as “Singlish”, makes texting easier because Sinhala words can be written with English letters in a variety of typing patterns. However, text typed in Romanized Sinhala with ad hoc transliterations and short net acronyms is converted to native Sinhala less accurately. Current rule-based transliterators work at the letter level, applying a defined rule for each mapping in the transliteration schema, and are not compatible with shorthand-based Romanized Sinhala typing. The proposed ad hoc schema combines multiple computational approaches (a hybrid approach) to achieve ambiguity-free transliteration. The first phase is statistical and uses an N-gram tagger in which tokens are fed to trigram, bigram, and unigram taggers, in that order. Tokens left unknown by the first phase are passed to the second phase, a rule-based algorithm that predicts the corresponding words. The third and final phase uses a suggestion-level model, implemented with a trie and a knowledge base, to find the best word suggestions from the pool of predicted words; this phase resolves ambiguity in word selection. The transliterator was evaluated on the test data and achieved a word-level accuracy of 84%.
Therefore, the proposed novel transliterator, which back-transliterates Romanized Sinhala to Sinhala using the hybrid approach, can be used to enhance reverse transliteration and thereby increase the use of native Sinhala for communication.
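The first phase described in the abstract, a trigram tagger that backs off to bigram and then unigram lookups, can be illustrated with a minimal sketch. This is not the thesis implementation: the function names (`context_of`, `train`, `transliterate`), the choice of previous output words as context, and the three-sentence toy corpus are all assumptions made for illustration. Tokens that no table recognizes are returned as `None`, standing in for the tokens that the abstract says are handed to the rule-based second phase.

```python
from collections import Counter, defaultdict

MAX_N = 3  # trigram first, then bigram, then unigram


def context_of(history, n):
    """Return the previous n-1 output words as a tuple (empty for unigrams)."""
    return tuple(history[-(n - 1):]) if n > 1 else ()


def train(sentences):
    """Build one lookup table per n-gram order from (romanized, sinhala) pairs.

    Each table maps (context, romanized token) to the Sinhala word seen most
    often in that context. Hypothetical helper, not taken from the thesis.
    """
    counts = {n: defaultdict(Counter) for n in range(1, MAX_N + 1)}
    for sentence in sentences:
        history = []
        for roman, sinhala in sentence:
            for n in range(1, MAX_N + 1):
                counts[n][(context_of(history, n), roman)][sinhala] += 1
            history.append(sinhala)
    return {
        n: {key: c.most_common(1)[0][0] for key, c in table.items()}
        for n, table in counts.items()
    }


def transliterate(tokens, tables):
    """Tag each token with the highest-order table that knows it, else None."""
    history, output = [], []
    for roman in tokens:
        word = None
        for n in range(MAX_N, 0, -1):  # back off: 3 -> 2 -> 1
            word = tables[n].get((context_of(history, n), roman))
            if word is not None:
                break
        output.append(word)
        history.append(word)
    return output


# Toy training data (assumed for illustration): "mama" is ambiguous between
# "මම" (I) and "මාමා" (uncle), and context decides which one is chosen.
corpus = [
    [("mama", "මම"), ("gedara", "ගෙදර"), ("yanawa", "යනවා")],
    [("mage", "මගේ"), ("mama", "මාමා"), ("gedara", "ගෙදර"), ("yanawa", "යනවා")],
    [("mama", "මම"), ("yanawa", "යනවා")],
]
tables = train(corpus)

print(transliterate(["mage", "mama", "yanawa"], tables))
print(transliterate(["mama", "gedara", "enawa"], tables))
```

In this toy run, "mama" after "mage" resolves to "මාමා" through the context-aware tables, while a sentence-initial "mama" resolves to "මම"; the unseen token "enawa" comes back as `None` and would be left for a later phase.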
URI: https://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/4700
Appears in Collections: 2022

Files in This Item:
File: 2019 MCS 084.pdf
Size: 3.73 MB
Format: Adobe PDF


Items in UCSC Digital Library are protected by copyright, with all rights reserved, unless otherwise indicated.