Anda belum login :: 27 Nov 2024 09:31 WIB
Home
|
Logon
Hidden
»
Administration
»
Collection Detail
Detail
Efficient database support for OLAP queries (On-line Analytical Processing)
Bibliografi
Author:
Deshpande, Prasad Manikarao
;
Naughton, Jeffrey F.
(Advisor)
Topik:
COMPUTER SCIENCE
Bahasa:
(EN )
ISBN:
0-599-63895-8
Penerbit:
THE UNIVERSITY OF WISCONSIN - MADISON
Tahun Terbit:
2000
Jenis:
Theses - Dissertation
Fulltext:
9960377.pdf
(0.0B;
5 download
)
Abstract
Computing multidimensional aggregates is a performance bottleneck for OLAP and multi-dimensional data analysis applications. This thesis addresses various problems that arise while speeding up multi-dimensional queries. The first problem we consider deals with computing the
CUBE
operator proposed by Gray et al. We show how the structure of
CUBE
computation can be viewed in terms of a hierarchy of group-by operations, and present a class of sorting-based algorithms that overlap the computation of different group-by operations using the least possible memory for each computation. Experiments show that the method dramatically outperforms the straightforward implementation of
CUBE
as a sequence of SQL group-by operations. The second and third part of the thesis deal with caching for multi-dimensional queries. We developed a novel scheme where we cache small regions of the multi-dimensional space called “
chunks
”. Chunk-based caching allows fine granularity caching, and allows queries to partially reuse the results of previous queries with which they overlap. To facilitate the computation of chunks required by a query but missing from the cache, we propose a new organization for relational tables, which we call a “chunked file.” Our experiments show that for workloads that exhibit query locality, chunked caching combined with the chunked file organization performs better than traditional query level caching. Simple caching unfortunately misses the dramatic performance improvements obtainable when the answer to a query, while not immediately available in the cache, can be computed by aggregating other data in the cache. In order to use aggregation, one must solve two subproblems: (1) determining whether it is possible, and (2) determining the fastest path for this aggregation, since there can be many. We present a naive and a
Virtual Count
based strategy. The virtual count based methods determine if a query is computable from the cache and the fastest path for this computation almost instantaneously, with a small overhead of maintaining the summary state of the cache. Experiments show that aggregation in the cache leads to substantial performance improvement. The virtual count based methods further improve the performance compared to the naive approaches, in terms of cache lookup and aggregation times.
Opini Anda
Klik untuk menuliskan opini Anda tentang koleksi ini!
Lihat Sejarah Pengadaan
Konversi Metadata
Kembali
Process time: 0.203125 second(s)