Please use this identifier to cite or link to this item:
https://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/4756
Title: | Social Sensor Networks for News Mining |
Authors: | Fernando, M. A. I. D. |
Issue Date: | 2019 |
Abstract: | With the development of technology, more people use the Internet, which has increased the use of social networks and microblogs and led many organizations to share their news there. News providers are among the organizations that share large amounts of news on social networks and blogs, and Twitter is one such widely used social network, well known as a microblog. The short messages (Tweets) shared on Twitter can yield much important information. The S2Net tool was developed to analyze these Tweets, generate useful information, and present it in a suitable manner. There can be situations where one is interested in news topics rather than news groups. For such cases, clustering was used to group the news into news topics. Expectation–Maximization clustering (EM Clustering) and Hierarchical Clustering were the methods used. The results show that Hierarchical Clustering with the Simple Linkage function performs better than EM Clustering. The Simple Linkage function can detect small relationships between clusters. Because of the high dimensionality of the features, there are many relationships that are hard to detect, so using the Simple Linkage function can improve accuracy.
The classification and clustering methods were each evaluated with a separate technique. The classification method was evaluated using the F-measure, according to which the Random Forest method performs better than the other methods. The clustering method was evaluated by collecting review comments for each cluster: the reviewer evaluates each cluster and marks the mismatches. According to these evaluations, EM Clustering achieves 68.52% accuracy and Hierarchical Clustering achieves 89.93%. |
URI: | https://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/4756 |
Appears in Collections: | 2019 |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Mphil_MAID Fernando2019.pdf | | 1.3 MB | Adobe PDF | View/Open |
Items in UCSC Digital Library are protected by copyright, with all rights reserved, unless otherwise indicated.