ABSTRACT
Many techniques have been proposed to scale web applications. However, the data interdependencies between the database queries and transactions issued by the applications limit the efficiency of these techniques. We claim that major scalability improvements can be gained by restructuring the web application data into multiple independent data services, each with exclusive access to its own private data store. While this restructuring does not provide performance gains by itself, the resulting simplification of each database workload allows a much more efficient use of classical techniques. We illustrate the data denormalization process on three benchmark applications: TPC-W, RUBiS, and RUBBoS. We deploy the resulting service-oriented implementation of TPC-W across an 85-node cluster and show that restructuring its data can improve the maximum sustainable throughput by at least an order of magnitude compared to master-slave database replication, while preserving strong consistency and transactional properties.