Please use this identifier to cite or link to this item:
https://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/2468
Title: | Automation of Constructing an eProfile from Web Contents |
Authors: | Silva, S.S.H. |
Issue Date: | 20-May-2014 |
Abstract: | The WWW is becoming more powerful in the present information era. Extracting and summarizing web content has become difficult for Internet users. This thesis investigates existing techniques for web data extraction and summarization. The research focuses on creating an eProfile for scholars by extracting web content, then summarizing and classifying it. To this end, a prototype application was built with a pipeline architecture. In the application, the extraction module uses an HTML parser and XPath expressions. Summarization is done by calculating a Sentence Importance Score and a Semantic Similarity Score. The classification module uses an N-gram based text classification technique. The proposed solution is intended for constructing eProfiles in the academic domain, and it could be extended to other domains such as analyzing news articles or profiling the past activities of persons. |
URI: | http://hdl.handle.net/123456789/2468 |
Appears in Collections: | SCS Individual Project - Final Thesis (2013) |
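
The abstract names an N-gram based text classification technique for the classification module but does not say which variant the thesis uses. As a rough illustration only, the sketch below assumes one common approach: rank-ordered character n-gram profiles compared with an "out-of-place" distance. The function names, the category labels, and the training snippets are all hypothetical and are not taken from the thesis.

```python
from collections import Counter


def ngram_profile(text, n_max=3, top_k=300):
    """Return the most frequent character n-grams (lengths 1..n_max), best first."""
    counts = Counter()
    for word in text.lower().split():
        padded = f"_{word}_"  # mark word boundaries with underscores
        for n in range(1, n_max + 1):
            for i in range(len(padded) - n + 1):
                counts[padded[i:i + n]] += 1
    # Sort by descending frequency, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:top_k]]


def out_of_place(doc_profile, cat_profile):
    """Sum of rank differences; n-grams missing from the category get a fixed penalty."""
    rank = {gram: r for r, gram in enumerate(cat_profile)}
    max_penalty = len(cat_profile)
    return sum(
        abs(r - rank[gram]) if gram in rank else max_penalty
        for r, gram in enumerate(doc_profile)
    )


def classify(text, category_profiles):
    """Return the label whose profile is closest to the text's profile."""
    doc = ngram_profile(text)
    return min(category_profiles, key=lambda c: out_of_place(doc, category_profiles[c]))


# Hypothetical training text for two eProfile categories (assumption, not thesis data).
TRAINING = {
    "publication": "published a journal paper and a conference article in the proceedings",
    "teaching": "lectures the undergraduate course and supervises student tutorials each semester",
}
PROFILES = {label: ngram_profile(text) for label, text in TRAINING.items()}

if __name__ == "__main__":
    print(classify("a new journal article was published", PROFILES))
```

A text whose profile matches a category profile exactly has distance zero to that category, so each training snippet is classified under its own label. Realistic use would need far more training text per category than these one-line examples.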
Files in This Item:
File | Size | Format |
---|---|---|
9001417.pdf (Restricted Access) | 456.58 kB | Adobe PDF |
Items in UCSC Digital Library are protected by copyright, with all rights reserved, unless otherwise indicated.