Please use this identifier to cite or link to this item: https://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/4587
Title: MACHINE LEARNING APPROACHES ON MOTOR INSURANCE FRAUD DETECTION
Authors: Fernando, E.N.R.
Issue Date: 23-May-2022
Abstract: Insurance companies are among the most important components of any country's financial sector. One of the main challenges they face in the current market is fraudulent claims, especially in motor insurance. The number of fraudulent claims is expected to rise as overall claim counts increase. Researchers in the motor insurance field around the world are therefore looking for ways to detect fraudulent claims as early as possible using machine learning algorithms. Motor claim fraud detection is a complex task because fraudulent behavior differs from claim to claim and detected fraud cases are far fewer than normal claims. This research aims to develop a motor insurance fraud detection model using classification algorithms and to propose the best model according to a set of evaluation criteria. The scope of the research covers motor claim data from Sri Lanka Insurance. The dataset contains 30098 claims, of which 3112 are labeled as fraudulent. The dataset is imbalanced, since fraudulent claims (the positive cases) account for only about 10% of all cases. Past claim data are analyzed together with underwriting details. Artificial Neural Network, Random Forest, and XGBoost algorithms are used as classifiers to determine whether a claim is fraudulent. The algorithms are analyzed and evaluated by dividing the dataset into training, validation, and test sets. However, when a machine learning model is trained on data with an imbalanced class variable, it becomes biased towards the majority class and tends to misclassify fraudulent claims as normal claims. To address this problem, an oversampling method, the Synthetic Minority Oversampling Technique (SMOTE), is applied along with ensemble models. Model performance is evaluated using recall, precision, F1-score, the precision-recall (PR) curve, and the receiver operating characteristic (ROC) curve.
Since the Random Forest and XGBoost classifiers have parameters that must be set by the researcher, hyperparameter tuning is also applied and evaluated. The Random Forest and XGBoost models were found to perform better than the neural network model. There was not much difference between the Random Forest and XGBoost models; however, the Random Forest model with tuned hyperparameters performed slightly better than the other models.
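The abstract's point about class imbalance can be illustrated with a short calculation. The sketch below is not taken from the thesis: it uses only the two dataset counts quoted above (30098 claims, 3112 fraudulent) and a hypothetical baseline that labels every claim as normal. The function name `classification_metrics` and the baseline itself are assumptions made for illustration. It suggests why recall, precision, and F1-score are more informative than plain accuracy on data like this.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Counts quoted in the abstract.
TOTAL_CLAIMS = 30098
FRAUD_CLAIMS = 3112
NORMAL_CLAIMS = TOTAL_CLAIMS - FRAUD_CLAIMS

fraud_share = FRAUD_CLAIMS / TOTAL_CLAIMS

# Hypothetical baseline (an assumption, not a model from the thesis):
# predict "normal" for every claim, so no fraud is ever flagged.
baseline = classification_metrics(tp=0, fp=0, fn=FRAUD_CLAIMS, tn=NORMAL_CLAIMS)

print(f"fraud share:       {fraud_share:.3f}")           # 0.103
print(f"baseline accuracy: {baseline['accuracy']:.3f}")  # 0.897
print(f"baseline recall:   {baseline['recall']:.3f}")    # 0.000
```

A classifier that never flags fraud would score close to 90% accuracy on these counts while detecting no fraudulent claims at all, which is consistent with the abstract's choice of recall, precision, F1-score, and PR and ROC curves as evaluation criteria.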
URI: https://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/4587
Appears in Collections: 2021

Files in This Item:
2018 BA 012.pdf (4.71 MB, Adobe PDF)


Items in UCSC Digital Library are protected by copyright, with all rights reserved, unless otherwise indicated.