UCSC Digital Library Collection:

https://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/4687
Sat, 18 Jul 2026 16:50:59 GMT

Title: A Model for the Estimation of Land Prices in Colombo District using Web Scraped Data
https://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/4699
Authors: Naotunna, R.A.G.
Abstract: Sri Lankan people have shown keen interest in real estate investments, especially in the Colombo district, as these assets do not depreciate with time like most tangible assets and as these investments account for a significant outflow of money from their overall wealth. However, at present land in Sri Lanka is valued based on the experience and judgment of individual valuation officers, which can be highly subjective and questionable, as the way of analyzing the features and assigning a value can vary from person to person. To address this issue, this research focuses on developing a machine learning model to estimate land prices in the Colombo district using web scraped data. To achieve this objective, web advertisements posted on ikman.lk for land for sale in the Colombo district over a three-month period were scraped to obtain land-related data. These data were combined with land price determinant data obtained from other web sources to form a dataset of 3725 records covering 43 land price determinants. Further, when developing the dataset, steps were taken to collect data about the different sub-categorical levels of each price determinant, as this could add value and make the dataset more meaningful. This dataset is used to fit five machine learning algorithms, namely Multiple Linear Regression, Random Forests Regression, Support Vector Regression, Extra Trees Regression and Extreme Gradient Boosting.
The performance of each machine learning model is gradually improved through feature reduction and hyper-parameter optimization. For feature reduction, two approaches were evaluated, a wrapper method (Recursive Feature Elimination) and a filter method (SelectKBest), and the approach that gave the better results was selected. Out of the five machine learning algorithms used, the hyper-parameter optimized Random Forests regression model outperformed the other linear, nonlinear, tree-based and ensemble machine learning models. The model performed exceptionally well on unseen data, with an R² value of 90.24% and MAPE, MAE and RMSE values of 17.88%, 0.098065 and 0.313154 respectively.
Thu, 22 Jun 2023 00:00:00 GMT

Title: Profanity Filtering in Speech Contents Using Deep Learning Algorithms
https://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/4698
Authors: Dandeniya, D.D.K.R.W.
Abstract: Worldwide online exposure has increased significantly as a result of the COVID-19 pandemic, and remote working, online learning and e-commerce have all become the norm. This has drastically increased the use of hate speech, swear words, racial slurs and other inappropriate content on online platforms. Such content is slowly degrading the quality of the online user experience. Consequently, automatic detection and filtering of inappropriate content has become a significant issue for improving content quality. Inappropriate content may include profanity, violence, misleading information, sexually explicit material and extremism, and it may occur in textual, audio or video form. In this study, a methodology for profanity filtering in speech contents is proposed.
The proposed methodology focuses on identifying the audio segment ‘fuck’, which is the most frequently used swear word in the English language. Audio segments related to swear words and non-swear words were collected, annotated, pre-processed, and analysed for the development of an RNN configuration, using Mel Frequency Cepstral Coefficients (MFCCs) as inputs to the model.
Thu, 22 Jun 2023 00:00:00 GMT

Title: Market Outreach for Retail Supermarkets through Customer Segmentation
https://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/4697
Authors: PERERA, M.T.I.
Abstract: This research applies Machine Learning technologies to the supermarket retail sector in Sri Lanka. Two areas were identified for the study: Customer Segmentation, for the application of Unsupervised Clustering algorithms, and Market Basket Analysis, for the application of Association Rule Mining. The main aim of the research was to identify the different clusters of customers found within the supermarket retail domain of Sri Lanka. This first required collecting and analyzing POS (Point of Sale) sales data in combination with customer information. Access to this information was provided by the Keells Supermarket chain, owned and maintained by Jaykay Marketing Pvt. Ltd, which is part of the John Keells Group of Companies. They provided limited access to the relevant information and to the Nexus Customer Loyalty Program, which contained most of their customer data. The data were analyzed in their entirety and in various derived forms, yielding diverse results. In the clustering process several clustering algorithms were applied; K-Means, K-Modes, K-Prototypes, DBSCAN and Mean Shift were among those successfully tested.
They provided diverse outcomes, some with very clear clusters and others without any coherent meaning. There were also instances where an algorithm could not deliver a clear and coherent outcome with the main dataset but would give a viable result for one of the derived datasets. The Association Rule Mining (ARM) process considered the Apriori and Frequent Pattern Growth (FP-Growth) algorithms, two of the most popular ARM algorithms used today. These algorithms produced consistent association rules between products across tests on different samples of data. Based on the findings, it was concluded that it is indeed possible to apply Clustering to the retail industry in a customer segmentation capacity, although the viability of the outcomes may differ based on the requirement and mode of application. Great potential can be found in applying the findings of both Clustering and ARM to customer attraction and retention. This opens a new frontier for building customer value.
Thu, 22 Jun 2023 00:00:00 GMT

Title: Leveraging Acoustic Voice Characteristics for Face Recognition during the COVID-19 Pandemic
https://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/4696
Authors: Kamalsooriya, D. N.
Abstract: Abstract not available in PDF
Thu, 22 Jun 2023 00:00:00 GMT