Dynamic Process Migration in MPI

Nishantha, K.W.G.H.

Please use this identifier to cite or link to this item: https://dl.ucsc.cmb.ac.lk/jspui/handle/123456789/1670

Title:	Dynamic Process Migration in MPI
Authors:	Nishantha, K.W.G.H.
Issue Date:	18-Dec-2013
Abstract:	In high-performance computing, many techniques are used to increase the efficiency of programs. However, some programs take hours, days, weeks, or even months to complete on a single computer. Parallel computing is used to reduce the run time of such programs. In parallel computing, computer clusters are used, and the Message Passing Interface (MPI) is widely used in developing applications for them. Even though MPI is used in a wide variety of applications, fault tolerance in MPI is still an issue. In an MPI cluster, if an unexpected failure (interrupt) occurs on a node, the whole MPI program is aborted and has to be restarted from the beginning. For a heavy MPI task that takes hours to complete, an interrupt near the very end of the run wastes a large amount of time and resources. It is therefore essential to be able to recover from such situations. Dynamic process migration in MPI provides a mechanism to recover MPI processes in these critical situations. Here, process migration happens dynamically, without the intervention of another party. When the processes of an MPI program run in parallel on different nodes (computers), some nodes may be interrupted unexpectedly and lose their processes. Such an interrupt may be due to the unexpected shutdown of a computer, a problem in the network hardware, or some other cause. Dynamic process migration migrates the interrupted processes to live nodes and resumes them without aborting the whole program. Conventional process migration follows the checkpoint/restart mechanism. However, since that technique is inefficient for dynamic process migration, this paper presents a coding approach with a simple API.
This approach does not need any additional libraries and gives the programmer flexible control over dynamic process migration. We evaluated our approach by implementing a matrix multiplication application, and we were able to run the program successfully on LAM/MPI and FT-MPI with dynamic process migration.
URI:	http://hdl.handle.net/123456789/1670
Appears in Collections:	SCS Individual Project - Final Thesis (2009)
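
The migration idea described in the abstract can be illustrated with a small simulation. The sketch below is not the thesis's API and makes no MPI calls: it is plain Python that assumes a row-wise split of a matrix multiplication across workers and assumes the set of failed workers is already known. All names in it (multiply_rows, multiply_with_migration, failed_workers) are hypothetical and chosen only for illustration. It shows rows owned by failed workers being handed to live workers so that the product is still completed without restarting the whole computation.

```python
# Illustrative simulation only: plain Python, no MPI calls.
# Every name here is hypothetical and not taken from the thesis.


def multiply_rows(a, b, rows):
    """Compute the requested rows of the matrix product a x b."""
    inner = len(b)
    cols = len(b[0])
    return {
        i: [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in rows
    }


def multiply_with_migration(a, b, num_workers, failed_workers):
    """Multiply a x b by rows, moving work away from failed workers.

    Returns the full product and a map of row index -> worker that
    took the row over after its original owner failed.
    """
    n = len(a)
    # Initial round-robin assignment of rows to simulated workers.
    assignment = {
        w: [i for i in range(n) if i % num_workers == w]
        for w in range(num_workers)
    }
    live = [w for w in range(num_workers) if w not in failed_workers]
    if not live:
        raise RuntimeError("no live workers left to take over the work")

    result = {}
    # Live workers compute the rows they were originally given.
    for w in live:
        result.update(multiply_rows(a, b, assignment[w]))

    # Rows owned by failed workers are migrated to live workers, round-robin.
    orphaned = [
        i
        for w in sorted(failed_workers)
        if w in assignment
        for i in assignment[w]
    ]
    migrated = {}
    for pos, row in enumerate(orphaned):
        target = live[pos % len(live)]
        result.update(multiply_rows(a, b, [row]))
        migrated[row] = target

    product = [result[i] for i in range(n)]
    return product, migrated


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    product, migrated = multiply_with_migration(a, b, num_workers=2, failed_workers={1})
    print(product)
    print(migrated)
```

In a real MPI program, failure detection and the transfer of state between nodes are the hard parts, and this simulation deliberately leaves them out.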

Files in This Item:

File	Description	Size	Format
28.pdf Restricted Access		736.13 kB	Adobe PDF
