Please use this identifier to cite or link to this item: https://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/3928
Title: How Dirty is your Data? Identification of the Effects of Unclean Data and Incorporation of String Matching Techniques to Mitigate these Effects in the Telecommunication Industry
Authors: Jayawardena, P. A. S. N.
Muthuweera, N. M. D.
Issue Date: 2017
Abstract: Telecommunication organizations in Sri Lanka face the challenge of improving and maintaining the quality of their customer data. Customer data is a category particularly exposed to the infusion of dirty data. The problem accumulates in customer profiles, creating multiple, varying duplicate accounts for the same customer. Currently, data cleansing tools are used to improve data quality, but their suitability for controlling the problem is limited. Even though the tools provide recommendations on merging duplicate accounts, a clerical review of the final linkage status remains crucial. This dissertation presents a detailed analysis of the dirty data infused in customer data, which adversely affects CRM operations and the overall decision-making process. Further, it incorporates similarity measure techniques to support better decision making in the clerical review process by evaluating the similarity of different attributes of customer data, in order to control and mitigate these effects. Experimental comparisons on a Sri Lankan telecommunication dataset indicate the appropriate similarity measure techniques to apply to the full name, address and email address attributes of customer data with a high level of accuracy.
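Illustration: the abstract does not name the specific similarity measures the dissertation evaluates, so the following Python sketch is only a stand-in for the general idea of scoring attribute similarity to support clerical review. It uses the standard library's difflib.SequenceMatcher ratio as the measure. The field names (full_name, address, email), the helper functions and the review thresholds are all assumptions made for this example, not values taken from the thesis.

```python
from difflib import SequenceMatcher


def normalize(value: str) -> str:
    """Lowercase the value and collapse runs of whitespace."""
    return " ".join(value.lower().split())


def similarity(a: str, b: str) -> float:
    """Return a score in [0, 1]; 1.0 means identical after normalization.

    SequenceMatcher.ratio() is a stand-in measure, not the thesis's choice.
    """
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def compare_records(rec_a: dict, rec_b: dict,
                    fields=("full_name", "address", "email")) -> dict:
    """Score each attribute separately (field names are hypothetical)."""
    return {f: similarity(rec_a.get(f, ""), rec_b.get(f, "")) for f in fields}


def needs_clerical_review(scores: dict, lower: float = 0.6,
                          upper: float = 0.9) -> bool:
    """Flag pairs whose mean score falls in an uncertain middle band.

    The thresholds are illustrative assumptions only.
    """
    mean = sum(scores.values()) / len(scores)
    return lower <= mean < upper


if __name__ == "__main__":
    # Invented records, for demonstration only.
    a = {"full_name": "K. A. Perera", "address": "12 Main Street, Colombo 03",
         "email": "kaperera@example.com"}
    b = {"full_name": "K A Perera", "address": "12, Main St, Colombo 3",
         "email": "ka.perera@example.com"}
    scores = compare_records(a, b)
    print(scores, needs_clerical_review(scores))
```

Scoring each attribute separately, rather than comparing whole records as one string, lets a reviewer see which field drives a suspected match, and it leaves room to choose a different measure per attribute.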
URI: http://hdl.handle.net/123456789/3928
Appears in Collections: 2017

Files in This Item:
File: IS 4001_GroupNo6_FinalThesis.pdf
Size: 2.42 MB
Format: Adobe PDF


Items in UCSC Digital Library are protected by copyright, with all rights reserved, unless otherwise indicated.