Please use this identifier to cite or link to this item: https://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/4905
Full metadata record
DC Field	Value	Language
dc.contributor.author	Weerakoon, T.V.R.	-
dc.contributor.author	Nayanathara, K.K.S.	-
dc.contributor.author	Harischandra, L.I.L.	-
dc.date.accessioned	2025-08-15T10:09:04Z	-
dc.date.available	2025-08-15T10:09:04Z	-
dc.date.issued	2025-06-30	-
dc.identifier.uri	https://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/4905	-
dc.description.abstract	This research presents the development of an advanced Sinhala speech-to-speech chatbot designed to bridge the gap in digital accessibility for native Sinhala speakers. Despite rapid advances in conversational AI systems, low-resource languages like Sinhala remain underrepresented, limiting the ability of native speakers to interact with technology in their own language. Addressing this critical gap, this study proposes an end-to-end solution that seamlessly integrates Automatic Speech Recognition (ASR), Natural Language Understanding (NLU), and Text-to-Speech (TTS) synthesis, enabling real-time, voice-based communication in Sinhala. The system leverages state-of-the-art deep learning techniques to achieve high accuracy and robustness. For ASR, transfer learning is employed to fine-tune the Wav2Vec2-BERT model on a 40-hour Sinhala speech dataset, achieving a Word Error Rate (WER) of 1.79% and a Character Error Rate (CER) of 0.33%, surpassing existing Sinhala ASR systems. The chatbot component utilizes a Retrieval-Augmented Generation (RAG) approach, combining the strengths of Large Language Models (LLMs) with dynamic knowledge retrieval to deliver context-aware and accurate responses in Sinhala. The TTS module, powered by the Variational Inference TTS (VITS) model, generates natural-sounding Sinhala speech, achieving a Mean Opinion Score (MOS) of 4.62 for intelligibility and 4.18 for naturalness in male voices, and 4.24 for intelligibility and 4.07 for naturalness in female voices. The proposed system addresses a significant gap in voice-based human-computer interaction for Sinhala speakers, with applications spanning education, accessibility, and digital services.
By combining cutting-edge ASR, RAG-powered chatbot intelligence, and high-quality TTS, this research not only advances the field of NLP for low-resource languages but also sets a benchmark for future developments in multilingual speech technologies. The modular architecture and methodologies developed in this study provide a foundation for extending similar solutions to other underrepresented languages, fostering greater inclusivity in the digital age.	en_US
dc.language.iso	en	en_US
dc.title	Sinhala Speech-to-Speech Chatbot Using Deep Learning Approaches	en_US
dc.type	Thesis	en_US
Appears in Collections: 2025

Files in This Item:
File	Description	Size	Format
20000715, 20001207, 20002009.pdf		7.62 MB	Adobe PDF


Items in UCSC Digital Library are protected by copyright, with all rights reserved, unless otherwise indicated.