A Generic Execution Management Framework for Scientific Applications
Date
2010-07-09T16:17:31Z
Abstract
Managing the execution of scientific applications in a heterogeneous grid computing environment can be a daunting
task, particularly for long-running jobs. Increasing fault tolerance by checkpointing and migrating jobs between
resources demands expertise and time from the scientist. Automation of such tasks can allow the scientist to focus more
on the scientific results and less on the technical details.
In this paper, a generic framework for managing and automating the execution of jobs is presented. It uses
a variety of information models describing systems, policies, and application details/requirements to make suitable
decisions on where and how to run, checkpoint, migrate and reconfigure jobs as needed. To demonstrate the utility
of the framework, it is used as part of a simulation study to assess how the availability of application memory
usage information affects meeting the QoS objectives of job submitters and the overall utilization of resources. The
study shows that with greater availability of memory usage information, the execution management framework is able
to better meet user objectives and improve utilization of resources, particularly when the objective is to make more
efficient use of resources.
Keywords
Application Modelling, Grid Computing