Please use this identifier to cite or link to this item: https://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/4793
Full metadata record
DC Field | Value | Language
dc.contributor.author | Pushpakumara, W.D.H | -
dc.date.accessioned | 2024-10-16T04:57:48Z | -
dc.date.available | 2024-10-16T04:57:48Z | -
dc.date.issued | 2024-05 | -
dc.identifier.uri | https://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/4793 | -
dc.description.abstract | Automatic Speech Recognition (ASR) is a rapidly evolving area within Natural Language Processing (NLP), addressing a range of linguistic challenges. While ASR technologies have made significant strides through various models, including Hidden Markov Models (HMMs), Gaussian Mixture Models (GMMs), and more recently, Deep Neural Networks (DNNs), Convolutional Neural Networks (CNNs), and Recurrent Neural Networks (RNNs), certain languages like Sinhala face specific limitations. One major challenge for Sinhala ASR development is the lack of sufficient labeled speech data, which makes it difficult and costly to build accurate models. This thesis explores a transfer learning-based approach to mitigate the data scarcity problem in Sinhala ASR. Specifically, the study leverages the XLS-R model developed by Babu et al. (2021) as the source model, using its pre-learned speech representations to fine-tune a Sinhala ASR model. Two distinct datasets, differing in their lexical composition, were used to evaluate the model’s performance. The proposed model achieved Word Error Rates (WER) of 33.78% and 38.31% on the two datasets, respectively. To further enhance transcription accuracy, post-processing steps, including spell correction and word boundary correction algorithms, were applied, resulting in improved WERs of 24.28% and 36.6%. While the baseline model performed better on the first dataset, a relative WER reduction of 10.07% was observed on the second dataset. An analysis of the generated transcriptions indicates that the proposed model produces results that are acceptable in practical applications, highlighting its potential to improve ASR performance for under-resourced languages like Sinhala. | en_US
dc.language.iso | en | en_US
dc.title | Applicability of Transfer Learning on End-to-End Sinhala Speech Recognition | en_US
dc.type | Thesis | en_US
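
The transfer-learning approach described in the abstract (adapting the multilingual XLS-R model to Sinhala with a CTC head and scoring the output with Word Error Rate) can be sketched roughly as follows. This is an illustrative outline only, assuming the publicly released facebook/wav2vec2-xls-r-300m checkpoint together with the Hugging Face Transformers and jiwer libraries; the vocabulary file, transcripts, and hyperparameters are placeholders rather than the thesis's actual configuration, and the training loop and the post-processing steps (spell correction, word boundary correction) are omitted.

    # Illustrative sketch only: fine-tuning an XLS-R checkpoint for Sinhala CTC-based ASR
    # and scoring it with Word Error Rate. Checkpoint name, file paths, and settings
    # are assumptions, not the thesis's actual configuration.
    import torch
    import jiwer
    from transformers import (
        Wav2Vec2CTCTokenizer,
        Wav2Vec2FeatureExtractor,
        Wav2Vec2Processor,
        Wav2Vec2ForCTC,
    )

    # A Sinhala character vocabulary would normally be built from the training transcripts;
    # "vocab.json" is a placeholder file mapping characters to integer ids.
    tokenizer = Wav2Vec2CTCTokenizer(
        "vocab.json", unk_token="[UNK]", pad_token="[PAD]", word_delimiter_token="|"
    )
    feature_extractor = Wav2Vec2FeatureExtractor(
        feature_size=1, sampling_rate=16_000, padding_value=0.0,
        do_normalize=True, return_attention_mask=True,
    )
    processor = Wav2Vec2Processor(feature_extractor=feature_extractor, tokenizer=tokenizer)

    # Load the pre-trained multilingual XLS-R encoder and attach a fresh CTC output layer
    # sized to the Sinhala vocabulary: this reuse of pre-learned speech representations
    # is the transfer-learning step.
    model = Wav2Vec2ForCTC.from_pretrained(
        "facebook/wav2vec2-xls-r-300m",
        ctc_loss_reduction="mean",
        pad_token_id=processor.tokenizer.pad_token_id,
        vocab_size=len(processor.tokenizer),
    )
    model.freeze_feature_encoder()  # keep the low-level convolutional feature encoder fixed

    def transcribe(waveform) -> str:
        """Greedy CTC decoding of a single 16 kHz mono waveform (1-D float array)."""
        inputs = processor(waveform, sampling_rate=16_000, return_tensors="pt")
        with torch.no_grad():
            logits = model(inputs.input_values).logits
        predicted_ids = torch.argmax(logits, dim=-1)
        return processor.batch_decode(predicted_ids)[0]

    # Word Error Rate, the metric reported in the abstract, can be computed with jiwer.
    references = ["reference transcript one", "reference transcript two"]  # placeholders
    hypotheses = ["predicted transcript one", "predicted transcript two"]  # placeholders
    print(f"WER: {jiwer.wer(references, hypotheses):.2%}")
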
Appears in Collections: 2024

Files in This Item:
File | Description | Size | Format
2019 CS 125.pdf | - | 2.76 MB | Adobe PDF


Items in UCSC Digital Library are protected by copyright, with all rights reserved, unless otherwise indicated.