Title: E-commerce Product Recommendation based on User Ratings and Reviews
Authors: Senarath, K L C D
Issue Date: 26-May-2022
Abstract: E-commerce is a fast-growing part of the retail industry amid the current pandemic. It is becoming popular owing to emerging information technologies, rapid growth, and the high adoption rate of e-banking services. In addition, benefits such as low cost, flexibility, a fast buying process, comprehensive product descriptions, convenience without geographical limitations, and the availability of reviews are among the psychological factors that shape consumer behavior towards e-commerce. Millions of products are available in retail e-commerce, and users post reviews every minute. Customer reviews and ratings have therefore become crucial factors that affect customers' buying behavior. Reviews ultimately increase sales by giving consumers the information they need to decide whether to buy a product, and people are more eager to purchase products that others have already recommended. However, the large number of products and customer reviews makes it tedious to judge the actual quality of a product, and hence difficult to decide whether to buy it. It is therefore vital to analyze customer ratings, reviews, and recommendations to assist consumers' decision-making. To derive valuable insights from a large set of rated reviews, this study applied four supervised machine learning techniques, Linear SVC, Decision Tree, K-Nearest Neighbor (KNN), and Naive Bayes, to fashion products from Amazon. Exploratory data analysis was applied to the dataset, which contains more than 10,000 records and 16 attributes, for feature selection and handling of missing values. Customer reviews were pre-processed using natural language processing techniques, and the data were classified into three classes, Good, Moderate, and Not Recommended, based on rating score and reviews.
The pre-processed dataset was divided into training and testing sets, and a model was trained with each algorithm. Detailed output was generated using a confusion matrix and a classification report, with accuracy, precision, recall, and F1-score as performance measures. Accuracy and prediction time were then compared to identify the best fit. The results showed that the Linear SVC approach performed better than the other algorithms. The KNN algorithm was run with different values of k, and the classifier was observed to improve as k increased. Several statistical analyses, such as Pearson correlation and OLS regression, were carried out on the attributes rating, number of reviews, and price to identify their relationships and impact. Based on the study, the most popular product categories, manufacturers, and products were identified. These statistics and findings would help sellers and marketers make better decisions to improve their revenue.
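The performance measures named in the abstract can be illustrated with a minimal sketch in plain Python. It is not the thesis code: the class names come from the abstract, but the label lists are made-up toy data, and the helper names (confusion_matrix, per_class_report, accuracy) are hypothetical. It shows how a three-class confusion matrix yields accuracy and per-class precision, recall, and F1-score.

```python
from collections import Counter

# Class names are taken from the abstract. The label lists further down are
# invented toy data for illustration only, not results from the study.
CLASSES = ["Good", "Moderate", "Not Recommended"]


def confusion_matrix(y_true, y_pred, classes=CLASSES):
    """Return matrix[i][j] = number of items of true class i predicted as class j."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in classes] for t in classes]


def per_class_report(matrix, classes=CLASSES):
    """Compute precision, recall, and F1 for each class from a confusion matrix."""
    report = {}
    for i, name in enumerate(classes):
        tp = matrix[i][i]
        predicted = sum(row[i] for row in matrix)  # column sum: predicted as class i
        actual = sum(matrix[i])                    # row sum: truly class i
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[name] = {"precision": precision, "recall": recall, "f1": f1}
    return report


def accuracy(matrix):
    """Fraction of items on the diagonal, i.e. predicted correctly."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Hypothetical true and predicted labels for eight reviews.
y_true = ["Good", "Good", "Good", "Moderate", "Moderate",
          "Not Recommended", "Not Recommended", "Good"]
y_pred = ["Good", "Good", "Moderate", "Moderate", "Good",
          "Not Recommended", "Moderate", "Good"]

matrix = confusion_matrix(y_true, y_pred)
report = per_class_report(matrix)

print(matrix)
print("accuracy:", accuracy(matrix))
for name, scores in report.items():
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

Per-class scores matter here because overall accuracy can hide weak performance on a smaller class such as Moderate, which is one reason a classification report is usually read alongside the confusion matrix.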
Appears in Collections: 2021

Files in This Item:
File: 2018 BA 031.pdf
Size: 2.03 MB
Format: Adobe PDF

Items in UCSC Digital Library are protected by copyright, with all rights reserved, unless otherwise indicated.