Please use this identifier to cite or link to this item: https://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/4195
Full metadata record
DC Field | Value | Language
dc.contributor.author | Jayasinghe, K.L. | -
dc.date.accessioned | 2021-07-22T10:19:11Z | -
dc.date.available | 2021-07-22T10:19:11Z | -
dc.date.issued | 2021-07-22 | -
dc.identifier.uri | http://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/4195 | -
dc.description.abstract | Widely used languages have large data pools available for linguistic data applications, but languages such as Sinhala lack such data. Researchers have therefore conducted studies to increase the amount of labeled data, such as part-of-speech words, named entities, and other semantic categories. Most of these studies are based on supervised learning or statistical methods, which require considerable effort to label the training data. The proposed solution aims to design a method that requires less effort and increases the amount of labeled Sinhala named-entity data at an average level of accuracy. It is a semi-supervised bootstrapping method that uses an iterative seeding mechanism to extract named entities in the person and location categories. The complete process was conducted in two main phases: the first was the bootstrapping process, and its outputs were used to train the supervised learning process, which was the second phase. Evaluation was accordingly conducted in two phases as well. The intermediate result of the first (bootstrapping) phase shows 91% accuracy, and the second-phase result also shows the intended accuracy level. | en_US
dc.language.iso | en | en_US
dc.subject | Sinhala Named-entity recognition | en_US
dc.subject | Bootstrapping | en_US
dc.subject | Semi-supervised learning | en_US
dc.title | Bootstrapping Sinhala Named Entities for NLP Applications | en_US
dc.type | Thesis | en_US
Appears in Collections: 2018

Files in This Item:
File | Description | Size | Format
2014CS051.pdf | | 550.41 kB | Adobe PDF


Items in UCSC Digital Library are protected by copyright, with all rights reserved, unless otherwise indicated.