<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://purl.org/rss/1.0/" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel rdf:about="https://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/2380">
    <title>UCSC Digital Library Collection:</title>
    <link>https://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/2380</link>
    <description />
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="https://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/2469" />
      </rdf:Seq>
    </items>
    <dc:date>2026-03-29T06:14:19Z</dc:date>
  </channel>
  <item rdf:about="https://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/2469">
    <title>Named Entity Recognition For Sinhala Language</title>
    <link>https://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/2469</link>
    <description>Title: Named Entity Recognition For Sinhala Language
Authors: Dahanayaka, J.K.
Abstract: With the rapid growth of technology and information content, there is a need to
retrieve required information more efficiently from large volumes of unstructured
text in users' native languages. To meet that need, Natural Language Processing
research areas such as Information Extraction, Machine Translation,
Information Retrieval and Automatic Summarization are essential. In all of these areas,
Named Entity Recognition is one of the preliminary tasks that has to be performed.
However, it is challenging to build a proper Named Entity Recognition (NER) system,
especially for Indic languages, because of their inherent features. Sinhala,
the mother tongue of Sri Lanka, which belongs to the Indo-Aryan branch of the Indic language family,
still has no proper NER system to use in its Machine Translation and
Information Extraction tasks. Although languages written in the Latin script, such as English, have far better
NER solutions, they cannot be applied directly to Sinhala because those systems use
capitalization as a crucial feature, which Indic languages lack. Since there
has not been much previous work on NER for Sinhala, the concept and the needed
resources have to be built from scratch. It is believed that the algorithms
used for Indian languages are likely to be applicable to
Sinhala as well. This dissertation therefore investigates the effectiveness of using
data-driven techniques to detect NEs in Sinhala text. Two data-driven techniques,
Conditional Random Fields and the Maximum Entropy model, were tried out. To
improve performance, language-dependent as well as language-independent
features of Sinhala text were added. The Conditional Random Fields model performs
better, achieving high precision and reasonable recall and F-measure of 91.64%,
69.34% and 78.95% respectively, while the Maximum Entropy model achieved 81.71%, 51.34% and
63.06%.</description>
    <dc:date>2014-05-20T00:00:00Z</dc:date>
  </item>
</rdf:RDF>

