TRANSLITERATING SINHALA: A COMPUTATIONAL APPROACH TO CONVERT ROMANIZED SINHALA AND EMBEDDED ENGLISH INTO SINHALA SCRIPT M.L.A.S. Yapa 2024

Yapa, M.L.A.S.

Please use this identifier to cite or link to this item: https://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/4827

Full metadata record

DC Field	Value	Language
dc.contributor.author	Yapa, M.L.A.S.	-
dc.date.accessioned	2025-07-04T07:09:11Z	-
dc.date.available	2025-07-04T07:09:11Z	-
dc.date.issued	2024-09-19	-
dc.identifier.uri	https://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/4827	-
dc.description.abstract	This thesis presents a comprehensive study of Sinhala transliteration, focusing on converting Romanized Sinhala with embedded English into Sinhala script using Neural Machine Translation (NMT). The research centers on developing a robust NMT system that uses a Gated Recurrent Unit (GRU) model with Bahdanau attention, optimized for the nuances of the Sinhala language. The study begins with an exploration of current NMT methodologies, identifying gaps in Sinhala transliteration research, particularly in handling the mixed-language text commonly found on social media. To address these gaps, the thesis proposes a novel transliteration system, delivered as a user-friendly web application with an Angular 17 frontend and a Python FastAPI backend. The system translates embedded English phrases via the Google Translator API and includes an n-gram model to improve the accuracy of the final transliteration output. To evaluate the effectiveness of the Singlish system, several evaluation approaches were used, including different scenarios of English transliteration with and without vowels and original words. Model performance was tested using BLEU and METEOR scores, which showed a high degree of transliteration accuracy across the testing scenarios. However, the research acknowledges some limitations, particularly in computational resources, which limited the size of the training dataset. The project concludes by highlighting potential future enhancements, including more advanced neural network architectures such as transformer models and extending the system's capabilities to other regional languages. This thesis contributes to computational linguistics by providing a practical solution to Sinhala transliteration challenges, paving the way for further research in less-studied languages and script conversion technologies.	en_US
dc.language.iso	en	en_US
dc.subject	Natural Language Processing, Transliterating Sinhala, Encoder-Decoder, GRU Model, N-gram Model, Translating Embedded English	en_US
dc.title	TRANSLITERATING SINHALA: A COMPUTATIONAL APPROACH TO CONVERT ROMANIZED SINHALA AND EMBEDDED ENGLISH INTO SINHALA SCRIPT M.L.A.S. Yapa 2024	en_US
dc.type	Thesis	en_US
Appears in Collections:	2023
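
The abstract names a GRU encoder-decoder with Bahdanau attention but gives no implementation details. As a rough illustration of that attention mechanism only, here is a minimal pure-Python sketch of additive (Bahdanau-style) scoring. The function names, toy dimensions, and random weights are assumptions made for this example and are not taken from the thesis; a real system would learn the weights and would normally use a tensor library.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def additive_attention(decoder_state, encoder_states, W_dec, W_enc, v):
    """Additive attention: score_i = v . tanh(W_dec s + W_enc h_i).

    Returns the attention weights over the encoder positions and the
    context vector (the weighted sum of the encoder states).
    """
    projected_dec = matvec(W_dec, decoder_state)
    scores = []
    for h in encoder_states:
        projected_enc = matvec(W_enc, h)
        hidden = [math.tanh(a + b) for a, b in zip(projected_dec, projected_enc)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [
        sum(w * h[j] for w, h in zip(weights, encoder_states)) for j in range(dim)
    ]
    return weights, context


def random_matrix(rows, cols, rng):
    """Hypothetical helper: small random weights standing in for learned ones."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Toy setup (assumed sizes): 5 encoder positions, hidden size 4, attention size 3.
rng = random.Random(0)
HIDDEN, ATTN, SOURCE_LEN = 4, 3, 5
encoder_states = [
    [rng.uniform(-1, 1) for _ in range(HIDDEN)] for _ in range(SOURCE_LEN)
]
decoder_state = [rng.uniform(-1, 1) for _ in range(HIDDEN)]
W_dec = random_matrix(ATTN, HIDDEN, rng)
W_enc = random_matrix(ATTN, HIDDEN, rng)
v = [rng.uniform(-0.5, 0.5) for _ in range(ATTN)]

weights, context = additive_attention(decoder_state, encoder_states, W_dec, W_enc, v)
print("attention weights:", [round(w, 3) for w in weights])
print("context vector:", [round(c, 3) for c in context])
```

In a transliteration decoder of this kind, the weights would indicate which Romanized input characters the model attends to while emitting each Sinhala character, and the context vector would be fed into the GRU decoder step.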

Files in This Item:

File	Description	Size	Format
2020MCS100.pdf		2.13 MB	Adobe PDF
