Please use this identifier to cite or link to this item: https://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/3675
Title: Sentiment Analysis in Twitter Messages
Authors: Muthutantrige, S. R.
Keywords: Sentiment Analysis; Twitter classification; SemEval 2015; ANN; SVM
Issue Date: 8-Sep-2016
Abstract: With the proliferation of social media, socially shared content has gained tremendous recognition for its influence. The freedom to express oneself and the ability to connect across boundaries set social media apart from any conventional web communication tool. However, an abundance of untapped information about users' desires is of little use without proper content analysis and mining techniques. The complex nature of socially shared content requires advanced machine learning and natural language processing techniques that go beyond word-level analysis to extract hidden opinions. This research proposes a system for a specific sentiment analysis problem defined in the 2015 iteration of SemEval (the Semantic Evaluation series): the Sentiment Analysis in Twitter task, which serves as the base challenge to be improved upon. Among sentiment analysis competitions and shared tasks, it is the most popular to date, with more than 40 participating teams each year since its inception in 2013. Within the main task (Task 10), only Subtask B (Message Polarity Classification) was considered, as it was carried over from previous years and remained highly competitive among teams from around the world. The proposed model extracts several key features from Twitter messages using the NLTK (Natural Language Toolkit) and scikit-learn Python libraries. A novel algorithm for deriving hashtag information, together with other basic and lexicon features, was used to generate the feature vector. Close to 100 features were derived and then reduced by selecting the most influential subset. Classification was carried out using several classifiers, including BayesNet, Support Vector Machine, Logistic Regression and Artificial Neural Networks. The results were compared with each year's top-ranked systems in the SemEval Sentiment Analysis in Twitter task. The proposed system showed promising results, competing head to head against the best in the world.
Several key contributions of this research proved unique in several respects: a novel feature extraction algorithm, a newly created sentiment lexicon, and an exhaustive analysis with various classifiers using both constrained and unconstrained data. As stated earlier, the experiments showed competitive results, demonstrating the significance of the proposed method for sentiment analysis in Twitter messages.
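The abstract does not give the details of the hashtag algorithm, the lexicon, or the full feature set, so the following is only a minimal sketch of the general idea of combining basic, lexicon and hashtag features into one feature vector. Everything in it is an assumption made for illustration: the toy lexicon `TOY_LEXICON`, the greedy longest-match hashtag splitter `split_hashtag`, and the feature names returned by `extract_features` are hypothetical and are not taken from the thesis. It uses only the Python standard library rather than NLTK or scikit-learn.

```python
import re

# Hypothetical toy lexicon (word -> polarity score); NOT the thesis's lexicon.
TOY_LEXICON = {
    "love": 1.0, "great": 0.8, "happy": 0.9,
    "bad": -0.7, "hate": -1.0, "fail": -0.8,
}

HASHTAG_RE = re.compile(r"#(\w+)")
TOKEN_RE = re.compile(r"[a-z']+")


def split_hashtag(tag, vocabulary):
    """Greedy longest-match split of a hashtag body into known words.

    Illustrative stand-in only; the thesis's hashtag algorithm is not
    described in the abstract.
    """
    tag = tag.lower()
    words, i = [], 0
    while i < len(tag):
        for j in range(len(tag), i, -1):
            if tag[i:j] in vocabulary:
                words.append(tag[i:j])
                i = j
                break
        else:
            i += 1  # no known word starts here; skip one character
    return words


def extract_features(tweet, lexicon=TOY_LEXICON):
    """Build a small feature dictionary from one tweet (assumed feature set)."""
    hashtags = HASHTAG_RE.findall(tweet)
    # Score the running text and the hashtags separately.
    body = HASHTAG_RE.sub(" ", tweet).lower()
    tokens = TOKEN_RE.findall(body)
    token_scores = [lexicon[t] for t in tokens if t in lexicon]
    hashtag_scores = [
        lexicon[w] for tag in hashtags for w in split_hashtag(tag, lexicon)
    ]
    return {
        "n_tokens": len(tokens),
        "n_hashtags": len(hashtags),
        "n_exclamations": tweet.count("!"),
        "lexicon_score": sum(token_scores),
        "n_positive_words": sum(s > 0 for s in token_scores),
        "n_negative_words": sum(s < 0 for s in token_scores),
        "hashtag_score": sum(hashtag_scores),
    }


if __name__ == "__main__":
    print(extract_features("Great service today #epicfail"))
```

In this sketch the hashtag score is kept apart from the score of the running text, so a tweet whose wording is positive but whose hashtag is negative yields two conflicting signals that a downstream classifier could weigh. A dictionary of this kind could then be vectorized and passed to a feature selection step and a classifier, as the abstract outlines.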
URI: http://hdl.handle.net/123456789/3675
Appears in Collections: SCS Individual Project - Final Thesis (2015)

Files in This Item:
File: 11000392-FinalDissertation.pdf
Access: Restricted
Size: 1.21 MB
Format: Adobe PDF


Items in UCSC Digital Library are protected by copyright, with all rights reserved, unless otherwise indicated.