The Tool Dæmon Protocol (TDP)
Source: Conference on High Performance Networking and Computing
Proceedings of the 2003 ACM/IEEE Conference on Supercomputing
Page: 19
Year of Publication: 2003
ISBN: 1-58113-695-1
Authors
Barton Miller  University of Wisconsin, Madison
Ana Cortes  Universitat Autònoma de Barcelona, Spain
Miquel A. Senar  Universitat Autònoma de Barcelona, Spain
Miron Livny  University of Wisconsin, Madison
Sponsor
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
IEEE Computer Society  Washington, DC, USA

ABSTRACT

Run-time tools are crucial to program development. In our desktop computer environments, we take for granted the availability of tools for operations such as debugging, profiling, tracing, checkpointing, and visualization. When programs move into distributed or Grid environments, it is difficult to find such tools. This difficulty is caused by the complex interactions necessary between the application program, the operating system, and the layers of job scheduling and process management software. As a result, each run-time tool must be individually ported to run under a particular job management system; for m tools and n environments, the problem becomes an m × n effort, rather than the hoped-for m + n effort. Variations in underlying operating systems can make this problem even worse. The consequence of this situation is a paucity of tools in distributed and Grid computing environments. In response to the problem, we have analyzed a variety of job scheduling environments and run-time tools to better understand their interactions. From this analysis, we isolated what we believe are the essential interactions between the run-time tool, the job scheduler and resource manager, and the application program. We are proposing a standard interface, called the Tool Dæmon Protocol (TDP), that codifies these interactions and provides the necessary communication functions. We have implemented a pilot TDP library and experimented with Parador, a prototype that uses the Paradyn Parallel Performance Tools to profile jobs running under the Condor batch-scheduling environment.

