Please use this identifier to cite or link to this item: https://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/3171
Full metadata record
DC Field | Value | Language
dc.thesis.supervisor | Atukorala, D.A.S. (Dr.) | -
dc.contributor.author | Madhawa, P.K.K. | -
dc.date.accessioned | 2015-06-29T10:55:56Z | -
dc.date.available | 2015-06-29T10:55:56Z | -
dc.date.issued | 2015-06-29 | -
dc.identifier.uri | http://hdl.handle.net/123456789/3171 | -
dc.description.abstract | Microblogging platforms such as Twitter have become a primary medium for people to share their experiences and opinions on a broad range of topics. Because posts on Twitter are publicly viewable by default, Twitter is used to obtain the latest information on events such as natural disasters, disease outbreaks, or sports events. The abundance of tweets containing user opinions and sentiments towards a topic makes it necessary to extract newsworthy tweets from a large stream of tweets on a single topic. The goal of this research is to build a system capable of generating a summary of long-running events using only a stream of tweets related to the particular event. This thesis contains an in-depth analysis of algorithms and features that are useful in detecting newsworthy tweets, using several corpora of tweets. To address the constraints of manually annotating large corpora, we introduce a novel heuristic-based annotation scheme to generate a training dataset for the system. The classifier trained on heuristically labeled tweets achieved an accuracy of around 80% on a manually annotated gold-standard tweet corpus. Finally, we investigate how these newsworthy tweets can be presented to the user as a summary. We introduce duplicate removal algorithms and an entity-centric clustering algorithm aimed at grouping tweets representing similar content. | en_US
dc.language.iso | en_US | en_US
dc.title | “Twitsum”: Automatic generation of event summaries using microblog streams | en_US
dc.type | Thesis | en_US
Appears in Collections:Master of Computer Science - 2015

Files in This Item:
File | Description | Size | Format
12440442.pdf (Restricted Access) |  | 626.69 kB | Adobe PDF