Anda belum login :: 27 Nov 2024 09:31 WIB
Detail
BukuEfficient database support for OLAP queries (On-line Analytical Processing)
Bibliografi
Author: Deshpande, Prasad Manikarao ; Naughton, Jeffrey F. (Advisor)
Topik: COMPUTER SCIENCE
Bahasa: (EN )    ISBN: 0-599-63895-8    
Penerbit: THE UNIVERSITY OF WISCONSIN - MADISON     Tahun Terbit: 2000    
Jenis: Theses - Dissertation
Fulltext: 9960377.pdf (0.0B; 5 download)
Abstract
Computing multidimensional aggregates is a performance bottleneck for OLAP and multi-dimensional data analysis applications. This thesis addresses various problems that arise while speeding up multi-dimensional queries. The first problem we consider deals with computing the CUBE operator proposed by Gray et al. We show how the structure of CUBE computation can be viewed in terms of a hierarchy of group-by operations, and present a class of sorting-based algorithms that overlap the computation of different group-by operations using the least possible memory for each computation. Experiments show that the method dramatically outperforms the straightforward implementation of CUBE as a sequence of SQL group-by operations. The second and third part of the thesis deal with caching for multi-dimensional queries. We developed a novel scheme where we cache small regions of the multi-dimensional space called “chunks”. Chunk-based caching allows fine granularity caching, and allows queries to partially reuse the results of previous queries with which they overlap. To facilitate the computation of chunks required by a query but missing from the cache, we propose a new organization for relational tables, which we call a “chunked file.” Our experiments show that for workloads that exhibit query locality, chunked caching combined with the chunked file organization performs better than traditional query level caching. Simple caching unfortunately misses the dramatic performance improvements obtainable when the answer to a query, while not immediately available in the cache, can be computed by aggregating other data in the cache. In order to use aggregation, one must solve two subproblems: (1) determining whether it is possible, and (2) determining the fastest path for this aggregation, since there can be many. We present a naive and a Virtual Count based strategy. The virtual count based methods determine if a query is computable from the cache and the fastest path for this computation almost instantaneously, with a small overhead of maintaining the summary state of the cache. Experiments show that aggregation in the cache leads to substantial performance improvement. The virtual count based methods further improve the performance compared to the naive approaches, in terms of cache lookup and aggregation times.
Opini AndaKlik untuk menuliskan opini Anda tentang koleksi ini!

Lihat Sejarah Pengadaan  Konversi Metadata   Kembali
design
 
Process time: 0.203125 second(s)