|
ABSTRACT
One technique that is sometimes employed to enhance the performance of a database management system is known as attribute partitioning. This is the process of dividing the attributes of a file into separately stored subfiles. By storing together those attributes that are frequently requested together by transactions, and by separating those that are not, attribute partitioning can reduce the number of pages that are transferred from secondary storage to primary memory in the processing of a transaction.The goal of this work is to design mechanisms that can automatically select a near-optimal attribute partition of a file's attributes, based on the usage pattern of the file and on the characteristics of the data in the file. The approach taken to this problem is based on the use of an accurate partition evaluator and of a heuristic that guides a search through the large space of possible partitions. The heuristics propose a small set of promising partitions to submit for detailed analysis. The evaluator assigns a figure of merit to any proposed partition that reflects the cost that would be incurred in processing the transactions in the usage pattern if the file were partitioned in the proposed way.We have implemented an evaluator for a particular model database system and have developed a heuristic search technique. A series of experiments has demonstrated the accuracy and efficiency of this heuristic.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
Alsberg P. A., "Space and Time Savings through Large Data Base Compression and Dynamic Restructuring", Proceedings of the IEEE, Vol. 63, No. 8, August 1975.
|
 |
2
|
|
| |
3
|
Batory, D. S., "On Searching Transposed Files", Proceedings of the International Conference on Very Large Data Bases, 1978.
|
| |
4
|
Benner F. H., "On Designing Generalized File Records for Management Information Systems", Proceedings of the Fall Joint Computer Conference, 1967.
|
| |
5
|
Belford, G. G., "Dynamic Data Clustering and Partitioning", CAC *162, Center for Advanced Computation, Research in Network Data Management and Resource Sharing, University of Illinois at Urbana-Champaign, May 1975.
|
| |
6
|
Blasgen, M. W., and Eswaran, K. P., "On the Evaluation of Queries in a Relational Data Base System", Computer Science Department, IBM Research Laboratory, San Jose, Calif., 1976.
|
| |
7
|
|
| |
8
|
Day, R. H., "An Optimal Extracting from a Multiple File Data Storage System: An Application of Integer Programming," Operations Research, 13, 3 (May 1956) 482--494.
|
| |
9
|
Dearnley, P., "Model of a Self-organizing Data Management System", Computer Journal Vol. 17, No. 1, Feb. 1974.
|
 |
10
|
|
| |
11
|
Golden, B. L., "A Statistical Approach to the Travelling Salesman Problem", Networks, Vol. 7, 1977.
|
 |
12
|
|
| |
13
|
Hammer, M. M., and Chan, A. Y., "Acquisition and Utilization of Access Patterns in Relational Data Base Implementation", Pattern Recognition and Artificial Intelligence (editor Chen, C. H.), Academic Press, 1976.
|
| |
14
|
Hammer, M. M., "Self-adaptive Automatic Data Base Design", Proceedings of National Computer Conference, Dallas, Texas, June 1977.
|
| |
15
|
Hammer, M. M., and Chan, A. Y., "A Statistical Inference Approach to the Validation of Database Design Heuristics", forthcoming memorandum.
|
| |
16
|
|
| |
17
|
Hoffer, J. A., and Severance, D. G., "The Use of Cluster Analysis in Physical Data Base Design", Proceedings of International Conference on Very Large Data Bases, Framingham MA, Sept. 1975.
|
| |
18
|
Kennedy, S. R., "A File Partitioning Model", California Institute of Technology Information Science Technical Report 2, May 1972.
|
| |
19
|
Kennedy, S. R., "The Use of Access Frequencies in Data Base Organization", Ph. D. Dissertation, The Wharton School, University of Pennsylvania (May 1973), 155pp.
|
 |
20
|
|
| |
21
|
McCormick, W. T., Jr., Schweitzer, P. J., and White, T. W., "Problem Decomposition and Data Reorganization by a Clustering Technique", Operations Research, 20, 4 (July 1972), 741--751.
|
| |
22
|
|
| |
23
|
Niamir, B., and Chan, A. Y., "On Estimating the Cost of Accessing Records in Blocked Database Organizations", MIT Laboratory for Computer Science, forthcoming Technical Memorandum.
|
| |
24
|
Osman, I. M., "The Partitioning of a Data Base into Supplies Matching User's Queries", Computer Centre, University of Khartoum, Sudan.
|
 |
25
|
|
| |
26
|
Seppälä, Y., "Definition of Extraction Files and Their Optimization by Zero-One Programming," BIT, 7, 3 (1967) 206--215.
|
| |
27
|
Stocker, P. M., and Dearnley, P. A., "Self-organizing Data Management Systems", Computer Journal 16, 2 (May 1973).
|
| |
28
|
Wang, L., "Simultaneous File Partitioning and Index Selection in a Self-adaptive Data Base Management System", MIT Laboratory for Computer Science, Data Base Systems Group unpublished report, August 1978.
|
| |
29
|
Waters, S. J., "Hit Ratios", the Computer Journal, Volume 19, No. 1, April 1976.
|
 |
30
|
|
| |
31
|
Yue, P. C., and Wong, C. K., "On the Optimal Properties of a Partition Algorithm", IBM Research Report, RC 3797, March 30, 1972.
|
CITED BY 29
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Jagannathan Srinivasan , Souripriya Das , Chuck Freiwald , Eugene Inseok Chong , Mahesh Jagannath , Aravind Yalamanchi , Ramkumar Krishnan , Anh-Tuan Tran , Samuel DeFazio , Jayanta Banerjee, Oracle8i Index-Organized Table and Its Application to New Domains, Proceedings of the 26th International Conference on Very Large Data Bases, p.285-296, September 10-14, 2000
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|