Please use this identifier to cite or link to this item: https://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/1713
Title: Statistical Machine Translation for Sinhala and Tamil
Authors: Sripirakas, S.
Issue Date: 19-Dec-2013
Abstract: Statistical machine translation is one of the most promising and leading machine translation strategies. Applicable even to structurally dissimilar language pairs, it has confirmed its suitability for large text translation. Demand for automatic translation between Sinhala and Tamil has been rising for several decades. The statistical approach is the preferred way to address the lack of a machine translation tool for these languages. Because the two languages are similar, the statistical approach can perform well without heavy reliance on linguistic knowledge. A well-tuned translation system has been modeled and implemented in this research, using parallel corpora prepared from parliament order papers. This thesis presents the experiments, the experimental strategies, and the results obtained. The design is considered from three perspectives, namely architecture, implementation, and evaluation, with maintainability, scalability, and configurability requirements in mind. The key strength of this research is that the system's output quality is analyzed in light of the results of previously executed experimental steps. The number of experiments was thus manageable, a far better outcome than running all combinations of all parameters with no reference to results from earlier steps. To improve output quality, the MERT technique is integrated, and its applicability and contribution to system enhancement are assessed. To avoid sole dependence on BLEU, two other automatic metrics, TER and NIST, are used to evaluate different aspects. In addition, the optimal system configurations chosen in this research with a constrained language model are also tested with a larger general language model. The system succeeded in generating human-readable translated output in both languages. Finally, directions and recommendations for future research are identified for refining this system.
URI: http://hdl.handle.net/123456789/1713
Appears in Collections: SCS Individual Project - Final Thesis (2010)

Files in This Item:
File: 26.pdf (Restricted Access)
Size: 442.8 kB
Format: Adobe PDF
Items in UCSC Digital Library are protected by copyright, with all rights reserved, unless otherwise indicated.