Please use this identifier to cite or link to this item: https://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/3670
Full metadata record
DC Field | Value | Language
dc.thesis.supervisor | Wijesinghe, Rupika | -
dc.thesis.supervisor | Weerasingha, Ruvan | -
dc.thesis.supervisor | Senavirathne, Sampath | -
dc.thesis.supervisor | Wijesekara, Primal | -
dc.contributor.author | Alweera, A.M.S.L.B. | -
dc.date.accessioned | 2016-09-08T04:58:48Z | -
dc.date.available | 2016-09-08T04:58:48Z | -
dc.date.issued | 2016-09-08 | -
dc.identifier.uri | http://hdl.handle.net/123456789/3670 | -
dc.description.abstract | Next Generation Sequencing (NGS) technologies, also known as massively parallel sequencing, are revolutionizing our knowledge of biological systems and driving new trends in DNA sequencing. Since the completion of the Human Genome Project in 2003, the major effort to sequence the first human genome, the demand for faster and cheaper sequencing methods has increased rapidly. This demand has driven the development of second-generation sequencing methods, also known as NGS techniques. During the past decade, several NGS platforms have been developed that provide low-cost, high-throughput sequencing. These platforms perform massively parallel sequencing, which allows an entire genome to be sequenced in less than one day. NGS platforms sequence DNA and release the output data as FASTA, FASTQ, BAM, SAM, BED and many other file types. This research focuses on BAM and BED files. A BAM file generally contains a massive number of sequence reads, which makes NGS data analysis more complicated. A BED file contains the targeted regions of interest along the reference sequence. Data analysts are interested in the depth of coverage at specific locations of a gene or genome of interest. With existing tools and techniques, finding the depth of coverage from BAM and BED files takes a long time. The main aim of this study is to devise an approach that accomplishes this task in a time-efficient manner. We used an existing tool, “BEDTools genomecov”, and developed a parallelized model that divides the depth-finding job across multiple processes. We evaluated performance under two approaches, a single machine and a computer cluster, using BAM and BED test files. The experimental results showed that both approaches yield a considerable performance improvement, and that the computer cluster approach performs better than the single machine approach. | en_US
dc.language.iso | en | en_US
dc.subject | Parallelization Model | en_US
dc.title | Parallelization Model to Find Depth of Coverage in Target Sequencing | en_US
dc.type | Thesis | en_US
Appears in Collections:SCS Individual Project - Final Thesis (2015)
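
The idea summarized in the abstract, dividing a depth-of-coverage job over target regions across several workers, can be illustrated with a minimal sketch. This is not the thesis implementation: the thesis works on BAM files with BEDTools genomecov, whereas the sketch below uses small in-memory intervals and only the Python standard library. All function names (`depth_for_regions`, `split`, `parallel_depth`) and the `(chrom, start, end)` tuple layout are assumptions made for illustration.

```python
import concurrent.futures as cf

# Assumption: reads and target regions are half-open intervals
# (chrom, start, end) with 0-based starts, as in the BED convention.


def depth_for_regions(reads, regions):
    """Return {(chrom, pos): depth} for every base inside the given regions."""
    depth = {}
    for chrom, r_start, r_end in regions:
        # Start every targeted base at zero so uncovered bases are reported.
        for pos in range(r_start, r_end):
            depth[(chrom, pos)] = 0
        for read_chrom, start, end in reads:
            if read_chrom != chrom:
                continue
            # Count the read only on the bases where it overlaps the region.
            for pos in range(max(start, r_start), min(end, r_end)):
                depth[(chrom, pos)] += 1
    return depth


def split(items, n):
    """Split items into at most n chunks of roughly equal size."""
    n = max(1, min(n, len(items)))
    return [items[i::n] for i in range(n)]


def parallel_depth(reads, regions, executor, n_chunks=4):
    """Compute depth per region chunk on the executor and merge the results."""
    chunks = split(regions, n_chunks)
    futures = [executor.submit(depth_for_regions, reads, chunk)
               for chunk in chunks]
    merged = {}
    for future in futures:
        merged.update(future.result())
    return merged
```

`parallel_depth` accepts any `concurrent.futures` executor. Passing a `ProcessPoolExecutor` (created under an `if __name__ == "__main__":` guard) spreads the region chunks across processes on one machine; a cluster setup like the one the abstract mentions would need a different distribution mechanism, which this sketch does not cover.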

Files in This Item:
File | Description | Size | Format
11002255.pdf | Restricted Access | 1.47 MB | Adobe PDF


Items in UCSC Digital Library are protected by copyright, with all rights reserved, unless otherwise indicated.