Please use this identifier to cite or link to this item: https://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/4163
Title: Low-resource Sinhala Speech Recognition using Deep Learning
Authors: Karunathilaka, N.A.K.H.S.
Keywords: Sinhala
Speech Recognition
Deep Neural Network
Kaldi
Issue Date: 19-Jul-2021
Abstract: Automatic speech recognition (ASR) has progressed considerably over the past several decades for most European languages, but it remains an active research area for most low-resourced languages. This project presents a study on building an ASR system with the Kaldi toolkit for Sinhala, a low-resourced language with a large lexical variety. We experiment with different deep neural architectures, including pre-trained DNN, DNN, TDNN, and TDNN+LSTM, to enhance the acoustic modeling process, and the performance of each is investigated in this document. A statistical GMM-HMM model is also trained on the same data set and kept as the baseline for comparing the effectiveness of the deep learning approaches. For the language model, a corpus of more than 20K sentences obtained from the UCSC LTRL is used to generate an extended 220K lexicon. The experiments are conducted using a phonetically balanced training data set consisting of twenty-five hours of speech collected from fifty female and twenty male speakers, and the systems are tested on 1.6 hours of speech data. We present an overview of the different architectures and their training procedures, and compare and contrast the performance of the models with the statistical baseline approach. The results show that the deep neural network architectures outperform the statistical baseline model, achieving a Word Error Rate (WER) of 7.48% on the test data set, with the lowest WERs produced by the TDNN architectures.
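The abstract above reports system performance as Word Error Rate (WER). For readers unfamiliar with the metric, the following is a minimal illustrative Python sketch (not code from the thesis) of how WER is conventionally computed: the word-level Levenshtein edit distance between a reference transcript and the recognizer's hypothesis, normalised by the number of reference words. The example transcripts are invented romanised Sinhala phrases used only for illustration.

    # Illustrative sketch only; not code from the thesis or the Kaldi scoring scripts.
    def wer(reference: str, hypothesis: str) -> float:
        """Word Error Rate: word-level edit distance / number of reference words."""
        ref = reference.split()
        hyp = hypothesis.split()
        # Dynamic-programming table for Levenshtein distance over words.
        d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
        for i in range(len(ref) + 1):
            d[i][0] = i
        for j in range(len(hyp) + 1):
            d[0][j] = j
        for i in range(1, len(ref) + 1):
            for j in range(1, len(hyp) + 1):
                cost = 0 if ref[i - 1] == hyp[j - 1] else 1
                d[i][j] = min(d[i - 1][j] + 1,          # deletion
                              d[i][j - 1] + 1,          # insertion
                              d[i - 1][j - 1] + cost)   # substitution
        return d[len(ref)][len(hyp)] / len(ref)

    # One substitution in a four-word reference gives WER = 0.25, i.e. 25%.
    print(wer("mama gedara yanawa dan", "mama gedara yanawa ada"))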
URI: http://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/4163
Appears in Collections: 2019

Files in This Item:
File: 2015 CS 069.pdf (1.42 MB, Adobe PDF)


Items in UCSC Digital Library are protected by copyright, with all rights reserved, unless otherwise indicated.