Please use this identifier to cite or link to this item: https://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/4588
Title: Data Mining Approach to Predict CVD Risk in the Sri Lankan Context
Authors: Gamage, N.D.U.
Issue Date: 23-May-2022
Abstract: Cardiovascular diseases (CVDs) are considered the number one non-communicable disease causing death and severe disability. CVDs are associated with damage to the heart or blood vessels. Coronary heart disease, cerebrovascular disease, rheumatic heart disease, heart attack, and heart failure are common cardiovascular diseases found all around the world. The World Heart Federation predicts that there will be more than 23 million CVD-related deaths per year by 2023. High blood pressure, smoking, high cholesterol, diabetes, physical inactivity, and a family history of CVD are the main causes behind this tragedy. The World Health Organization (WHO) and the World Heart Federation, along with domain experts around the world, are scaling up efforts on CVD prediction and control through technical packages. Existing mobile and web-based packages calculate CVD risk using the Framingham Risk Score. However, existing solutions have age restrictions or are limited to major CVD risk factors. This study introduces a CVD classifier that combines PCA with literature-based classifiers and CVD risk factors to determine CVD risk. The results indicate that the Random Forest classifier is the most suitable for CVD risk prediction, with a prediction accuracy of 85%. However, the data used for this prediction does not reflect Sri Lankan lifestyle, habits, and environment. Therefore, a questionnaire was prepared to identify CVD risk factors, with the support of medical students and feedback from domain experts. Data collected from 1252 individuals all over Sri Lanka was fed into multiple classifiers to identify the best one. The experiment found that the Random Forest classifier is suitable for CVD risk prediction on Sri Lankan data because it provides 100% accuracy for modifiable, contributing, and major CVD risk factors.
As this is the first attempt to identify the CVD risk factors unique to the Sri Lankan context, a set of clustering and classification analyses was conducted to detect CVD risk groups and associations between factors. The association rule mining showed that neglecting age, age groups, or alcohol consumption is not suitable when predicting CVD risk. It also showed a high association between all of the non-communicable diseases considered in this research (diabetes, high blood pressure, and high cholesterol). The cluster analysis identified that a person who has any non-communicable disease such as diabetes, high cholesterol, or high blood pressure, and who does little or no physical exercise, is at high risk of developing CVD.
Appears in Collections: 2021

Files in This Item:
File: 2018 BA 013.pdf
Size: 1.76 MB
Format: Adobe PDF


Items in UCSC Digital Library are protected by copyright, with all rights reserved, unless otherwise indicated.