Please use this identifier to cite or link to this item: https://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/4457
Title: Singlish to Sinhala Converter using Machine Learning
Authors: De Silva, A.D.
Issue Date: 5-Aug-2021
Abstract: In the modern world, it is hard to interact successfully with one another without adopting modern technology. As technology advances, artificial intelligence plays a crucial role in society. Today, machines have learned many of the activities that involve human beings and perform them much as the human brain does. Machine transliteration is the process of converting a Romanized script into the script of another language without considering the meaning of the words; it is a conversion between two alphabets. Although English is considered a universal language, most people are not fluent in it, yet they still know how to use the English alphabet. People therefore prefer to communicate in their native language. Even though Unicode characters are available for most languages, people use English characters to communicate with each other without communicating in English: the words are typed in English characters, but the meaning comes from their native language. This practice is very common in today’s world. In Sri Lanka, most people chat on social media using Romanized Sinhala, for example: “oyata kohomada?”. Many existing applications convert Singlish characters to Sinhala fonts. However, some applications, such as hate speech detection, need to analyze data collected from social media. In such circumstances, an entire Romanized Sinhala script must be converted to Sinhala fonts. For that purpose, it is beneficial to have a Singlish to Sinhala converter developed by training a model on a large number of Singlish and Sinhala phrase pairs. This project achieves that: a Singlish to Sinhala converter has been developed by training a Long Short-Term Memory (LSTM) model.
Six thousand Singlish and Sinhala pairs were used to train the model. The model’s accuracy was evaluated using the BLEU score, which is around 40%. Because the corpus is small, the accuracy is limited; improving it requires training the model on a larger number of phrases. Nevertheless, this converter will benefit those who analyze Romanized Sinhala scripts and do not want to spend time converting them manually. They can directly input a document containing Romanized Sinhala and receive as output a document whose entire content has been converted to the Sinhala font.
URI: http://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/4457
Appears in Collections: 2020
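
The abstract reports a BLEU score of around 40% but does not say how it was computed. As a rough illustration only, the following is a minimal sketch of sentence-level BLEU for one candidate against one reference. It is not the author's evaluation code: the function names `ngrams` and `bleu`, the `max_n` parameter, the whitespace-token representation, and the absence of smoothing are all assumptions made for this example.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU for one candidate against one reference.

    Assumption: uniform weights over the 1..max_n-gram precisions and no
    smoothing, so the score is 0.0 whenever any n-gram order has no match.
    """
    if not candidate:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        ref_counts = ngrams(reference, n)
        # Clip each candidate n-gram count by its count in the reference.
        matched = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if matched == 0 or total == 0:
            return 0.0
        log_precision_sum += math.log(matched / total)
    # Brevity penalty: penalise candidates shorter than the reference.
    if len(candidate) >= len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(reference) / len(candidate))
    return brevity_penalty * math.exp(log_precision_sum / max_n)


if __name__ == "__main__":
    # Placeholder tokens standing in for Sinhala words (hypothetical data).
    reference = ["w1", "w2", "w3", "w4", "w5"]
    candidate = ["w1", "w2", "w3", "w4", "w9"]
    print(round(bleu(candidate, reference, max_n=2), 3))
```

Short chat phrases often contain fewer than four words, in which case an unsmoothed 4-gram BLEU is zero; a lower `max_n` or a smoothed variant is a common workaround, though which setting the thesis used is not stated here.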

Files in This Item:
File                Description    Size         Format
2016 MCS 025.pdf                   767.22 kB    Adobe PDF


Items in UCSC Digital Library are protected by copyright, with all rights reserved, unless otherwise indicated.