Please use this identifier to cite or link to this item: https://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/1729
Title: Morphological Analyzer and Generator for Tamil Language
Authors: Lushanthan, S.
Issue Date: 19-Dec-2013
Abstract: Morphological analysis is an essential component of natural language applications ranging from spell checkers to machine translation. It segments a word into its morphemes and analyzes how those morphemes attach to one another. Word formation in English is relatively simple compared with many other languages, whereas Indic languages are considerably more complex, and Tamil, a morphologically rich language, is no exception. The need for a system that can predict the changes that occur during word formation motivates research in this area. The morphemes of the language, the rules by which they combine, and the changes that occur when they attach are the important factors to consider. The morphological analyzer and generator produces the word forms of a base word for a given context and can also analyze an input word, for both nouns and verbs in Tamil. This thesis illustrates how the lexicon and the orthographic rules of Tamil have been written as regular expressions using only finite-state operations, and how this approach has been implemented in a morphological analyzer/generator. The model is built using the Xerox toolkit, which uses Two-level Morphology, and almost 2000 noun stems and dozens of verb stems have been incorporated into the network. Each noun stem now produces about 40 different forms and each verb stem about 240 forms.
URI: http://hdl.handle.net/123456789/1729
Appears in Collections: SCS Individual Project - Final Thesis (2010)

Files in This Item:
File: 40.pdf (Restricted Access)
Size: 653.13 kB
Format: Adobe PDF


Items in UCSC Digital Library are protected by copyright, with all rights reserved, unless otherwise indicated.