ABSTRACT
The concept of task farming provides a mechanism to manage the execution of multiple instances of a serial application on distributed computing resources. Jobs of this type fall into the category of "embarrassingly parallel" applications, for which it is clear how to partition the work in parallel programming environments. Many important problem classes, such as Monte Carlo simulations and parameter-space surveys, fall into this category. Although task farming is common practice on cluster systems with shared filesystems and local batch queue systems, on distributed Grid resources it is still far from ubiquitous because of the heterogeneous, potentially unreliable, and poorly predictable nature of the Grid. Additionally, the large number of different Grid middleware systems, transport protocols, and programming interfaces makes it exceptionally difficult to run a task farming job on the Grid transparently.

Confronted with the task of running a large parameter survey for a serial application (the Shrimp Model) on different Grid resources, we evaluated different Grid task farming tools (Condor-G, Nimrod/G, ...) against the following requirements:

• Grid-enabled: The application MUST be Grid-enabled, i.e. able to submit jobs to different Grid resource managers.
• Portable: All components of the task farming application should run on as many hardware/OS platforms as possible.
• Lightweight: The implementation should be as simple and lightweight as possible and should not be bloated with unnecessary features.
• Middleware independent: The application MUST NOT be bound to a single Grid middleware and should ideally be able to run without any Grid middleware as well.
• User friendly: Describing a task farming job, as well as submitting and monitoring it, should be as intuitive as possible.

INDEX TERMS
Primary Classification:
Additional Classification: