ABSTRACT
We study management strategies for main memory database clusters that are interposed between Internet applications and back-end databases as content caches. The task of management is to allocate data across the individual cache databases and to route queries to the appropriate databases for execution. The goal is to maximize effective cache capacity and to minimize synchronization cost. We propose an affinity-based management system for main memory database cLUsters (ALBUM). ALBUM executes each query in two stages in order to take advantage of the query affinity that is observed in a wide range of applications. We evaluate the data/query distribution strategy in ALBUM with a set of trace-based simulations. The results show that ALBUM reduces the cache miss ratio by a factor of 1.7 to 9 over alternative strategies. We have implemented a prototype of ALBUM and compare its performance to that of an existing infrastructure: a fully replicated database with a large buffer cache. The results show that, with the same number of server machines, ALBUM outperforms the existing infrastructure by a factor of 2 to 7, and that ALBUM with only 1/3 to 1/2 of the server machines achieves the same throughput as the existing infrastructure.
INDEX TERMS
Primary Classification:
H. Information Systems
H.2 DATABASE MANAGEMENT
H.2.4 Systems
Subjects: Distributed databases
Additional Classification:
H. Information Systems
H.3 INFORMATION STORAGE AND RETRIEVAL
H.3.3 Information Search and Retrieval
H.3.4 Systems and Software
Subjects: Performance evaluation (efficiency and effectiveness)
General Terms: Algorithms, Experimentation, Management, Measurement, Performance
Keywords: Main memory database, clustering, database administration, database cluster, file organization, query affinity, scalability