Please use this identifier to cite or link to this item: https://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/1785
Title: Predicting Protein-Protein Interactions
Authors: Kumar, L.N.
Issue Date:  12
Abstract: Protein-protein interactions are important events in the body that govern major biological functions. Gene expression, cell growth, nutrient uptake and intercellular communication are some of the major biological processes they govern. Knowledge of protein-protein interactions improves our understanding of many biological processes and contributes significantly to novel drug design. Several experimental techniques identify protein-protein interactions, such as Co-Immunoprecipitation (Co-IP), Tandem Affinity Purification (TAP) and yeast two-hybrid, which is the most widely used. Even so, given the huge number of proteins and the even larger number of protein-protein interactions, it is highly impractical to find all interactions through experiments. To overcome this hurdle, researchers have proposed many protein-protein interaction predictors based on learning algorithms. These predictors identify probable interactions and so narrow the search space, considerably reducing the number of experiments needed. Many previously proposed predictors are based on similarity measures computed over the whole protein sequence. In our work we propose a novel method that computes similarities from the most informative regions of the protein sequence. We used both secondary structure and 'hotspots' as indicators to narrow down the most important regions of a protein that govern protein-protein interaction. By weighting the 'important' regions, our prediction process gives more emphasis to residues involved in the interaction, so the similarity measures from our technique compare only the regions that govern the protein-protein interaction.
Using our novel residue selection procedure, which considers secondary structure and hotspots, we were able to eliminate nearly 30% of a protein sequence and retain only the informative 70%. We achieved 91.14% (±0.5256%) prediction accuracy and an AUC (ROC) of 0.9522 (±0.0035) in 10-fold cross-validation. The informative region selection process treats β-sheet residues as less informative and gives more consideration to the hotspot residues of the protein. Our predictor provides convincing evidence that interaction happens on some active regions of the sequence and that local information alone contributes heavily to predicting protein-protein interactions. Meanwhile, random elimination tests on protein sequences provide critical evidence supporting the use of larger interaction datasets in protein-protein interaction predictors.
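The residue selection idea described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the abstract does not state the exact selection rule or weights, so the rule used here (drop β-sheet residues unless they are hotspots) is an assumption, as are the function names `select_informative` and `retained_fraction`, the DSSP-style labels (`H` helix, `E` sheet, `C` coil) and the toy sequence.

```python
from typing import Iterable, List, Set, Tuple


def select_informative(
    sequence: str, secondary: str, hotspots: Iterable[int]
) -> List[Tuple[int, str]]:
    """Return (index, residue) pairs judged informative.

    Assumed rule (not taken from the thesis): a residue is kept if it is
    a hotspot, or if it is not labelled as beta-sheet ('E').
    """
    if len(sequence) != len(secondary):
        raise ValueError("sequence and secondary structure must have equal length")
    hot: Set[int] = set(hotspots)
    kept = []
    for i, (residue, label) in enumerate(zip(sequence, secondary)):
        if i in hot or label != "E":
            kept.append((i, residue))
    return kept


def retained_fraction(sequence: str, kept: List[Tuple[int, str]]) -> float:
    """Fraction of the original sequence that survives selection."""
    return len(kept) / len(sequence) if sequence else 0.0


# Hypothetical toy input: 10 residues, three of them in a beta-sheet,
# with one sheet residue (index 3) marked as a hotspot.
seq = "MKTAYIAKQR"
ss = "CCEEEHHHCC"
hot = {3}

kept = select_informative(seq, ss, hot)
print("".join(residue for _, residue in kept))  # MKAIAKQR
print(retained_fraction(seq, kept))             # 0.8
```

In this toy case two of ten residues are removed. A similarity measure would then be computed only over the retained residues, which is the general idea the abstract reports; the weighting of hotspot residues mentioned there is not modelled in this sketch.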
URI: http://hdl.handle.net/123456789/1785
Appears in Collections: SCS Individual Project - Final Thesis (2012)

Files in This Item:
File: 17.pdf (Restricted Access)
Size: 2.97 MB
Format: Adobe PDF


Items in UCSC Digital Library are protected by copyright, with all rights reserved, unless otherwise indicated.