Please use this identifier to cite or link to this item: https://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/4185
Title: Automated Title Generation in Sinhala Language
Authors: Batawalaarachchi, T. H.
Issue Date: 22-Jul-2021
Abstract: With recent advances in information technology, information has become available in large quantities. Although availability has increased, correct and meaningful information has not lost its value and is still considered one of the most valuable resources. However, because of the large number of resources and the growth in content, the time taken to access the required information has also increased. Hence, it is all the more important to reach the desired information within the available time. In this setting, summaries and, more importantly, titles become highly valuable. A title presents a compact representation of what a document contains. It therefore lets a reader capture the main idea of a document quickly, without spending time reading the whole article, and then decide whether the document is useful for the intended purpose. This work discusses two approaches to selecting the words to be included in the title of a given Sinhala document. Both approaches use statistical features from a selected corpus to choose title words. The first approach considers the words included in the titles and the structure of the corresponding document, while the second approach focuses on translation from words in the document to words contained in the title. The two approaches were evaluated using human evaluation and automatic evaluation (averaged F1 score). Although similar approaches have obtained acceptable results for other languages, the results of this work make it clear that statistical approaches are not the go-to method for the title word selection task in the Sinhala language. This may be mainly due to the high complexity of the language's organization and the structural distance of Sinhala from the languages on which these approaches have shown better results.
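The abstract names an averaged F1 score as the automatic evaluation measure but does not spell out how it is computed. The following is a minimal sketch of one common reading, assuming a word-overlap F1 between each generated title and its reference title, averaged over all documents. The function names `title_f1` and `averaged_f1` and the placeholder tokens are hypothetical and are not taken from the thesis, and the tokenization is assumed to have been done beforehand.

```python
from collections import Counter


def title_f1(predicted, reference):
    """Word-overlap F1 between a predicted title and a reference title.

    Both arguments are lists of word tokens (assumed pre-tokenized).
    """
    pred_counts = Counter(predicted)
    ref_counts = Counter(reference)
    # Multiset intersection: each word counts at most as often as it
    # appears in both titles.
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)


def averaged_f1(pairs):
    """Mean of per-document F1 over (predicted, reference) title pairs."""
    scores = [title_f1(pred, ref) for pred, ref in pairs]
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    # Placeholder tokens stand in for Sinhala words.
    pairs = [
        (["w1", "w2"], ["w1", "w3"]),  # one of two words matches
        (["w4"], ["w5"]),              # no overlap
    ]
    print(averaged_f1(pairs))
```

Under this reading, a generated title scores highly only when it both includes the reference title's words (recall) and avoids extra words (precision). Other variants, such as scoring over stemmed words, are equally plausible readings of the abstract.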
URI: http://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/4185
Appears in Collections: 2018

Files in This Item:
File            Description   Size        Format
2014CS010.pdf                 334.47 kB   Adobe PDF


Items in UCSC Digital Library are protected by copyright, with all rights reserved, unless otherwise indicated.