Please use this identifier to cite or link to this item: https://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/3108
Title: A Computational Approach to Prioritize Functionally Significant Variations in Whole Exome Sequencing
Authors: Liyanage, I.D.
Issue Date: 21-May-2015
Abstract: Single nucleotide polymorphisms (SNPs) are the most common type of genetic variant among people. They are considered the main source of phenotypic differences among individuals and of many common and complex diseases. Knowledge of such SNPs is therefore used in medical treatment, diagnostics, and preventive strategies. However, with the rapid development of sequencing technologies and large-scale projects such as the Human Genome Project, genome sequence and genotype data for humans and other species are growing rapidly, and the sheer number of SNPs makes extensive analysis of every SNP impractical. Prioritizing SNPs and selecting representative SNPs that show association with the conditions under study has therefore become important. Prioritization techniques reduce the list of SNPs to a manageable size by selecting the most promising subset with improved power to detect associations, which lets researchers reduce their experimental and analysis overhead and carry out follow-up functional studies more easily. In this thesis, we propose a prioritization model for predicting the functionally significant SNPs in whole exome sequencing. The proposed model quantitatively assesses the deleterious functional effects of SNPs by combining information from diverse resources. Based on the significant SNP features identified, logistic regression was applied to explore the relationship between the SNP features and the functional significance of SNPs. The discriminative ability, calibration, and overall performance of the proposed model were assessed and yielded promising results. A comparative study against a prediction model developed using the support vector machine approach demonstrates that the logistic regression model performs better in terms of predictive ability. The proposed SNP prioritization model therefore facilitates the reliable identification of functionally significant SNPs, which are more likely to be associated with disease conditions or functional impairments, in massive amounts of exome sequencing data.
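The general idea in the abstract, scoring SNPs with logistic regression and ranking them by predicted probability, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the two features, the synthetic data, the gradient-descent fitting, and all function names are hypothetical. It reports a discrimination measure (AUC) and an overall performance measure (Brier score); calibration assessment is left out for brevity.

```python
import math
import random

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(w, x):
    # w[0] is the intercept; the remaining weights pair with the features.
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def fit_logistic(X, y, lr=0.5, epochs=1000):
    # Batch gradient descent on the mean log-loss (an assumed fitting method).
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            grad[0] += err
            for j in range(d):
                grad[j + 1] += err * xi[j]
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def auc(scores, labels):
    # Discrimination: chance that a positive outranks a negative (ties count half).
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def brier(scores, labels):
    # Overall performance: mean squared error of the predicted probabilities.
    return sum((s - l) ** 2 for s, l in zip(scores, labels)) / len(labels)

# Synthetic stand-in data: two hypothetical features per SNP, each in [0, 1].
random.seed(0)

def make_snp(label):
    shift = 0.25 if label else -0.25
    return [min(1.0, max(0.0, random.gauss(0.5 + shift, 0.15))) for _ in range(2)]

labels = [i % 2 for i in range(200)]          # 1 = functionally significant
features = [make_snp(l) for l in labels]

weights = fit_logistic(features, labels)
scores = [predict(weights, x) for x in features]

# Prioritization: rank SNP indices by predicted probability, highest first.
ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
top10 = ranked[:10]

print("AUC:", round(auc(scores, labels), 3))
print("Brier score:", round(brier(scores, labels), 3))
print("Top 10 SNP indices:", top10)
```

In practice the scores would be evaluated on held-out SNPs rather than on the training data, and the feature set would come from the annotation resources the model combines.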
URI: http://hdl.handle.net/123456789/3108
Appears in Collections: SCS Individual Project - Final Thesis (2014)

Files in This Item:
File: Thesis_10000658.pdf (Restricted Access)
Size: 1.26 MB
Format: Adobe PDF

