Title:
Supporting online queries in ROLAP
Authors:
Source:
DaWaK 2000: data warehousing and knowledge discovery (London, 4-6 September 2000). Lecture notes in computer science. 1874:234-243
Publisher Information:
Berlin: Springer, 2000.
Publication Year:
2000
Physical Description:
print, 21 ref
Original Material:
INIST-CNRS
Document Type:
Conference Paper
File Description:
text
Language:
English
Author Affiliations:
George Mason University, Information and Software Engineering Department, Fairfax, VA 22303, United States
ISSN:
0302-9743
Rights:
Copyright 2001 INIST-CNRS
CC BY 4.0
Unless otherwise stated above, the content of this bibliographic record may be used under a CC BY 4.0 licence by Inist-CNRS.
Notes:
Sciences of information and communication. Documentation

FRANCIS
Accession Number:
edscal.781477
Database:
PASCAL Archive

Further Information

Data warehouses are becoming a powerful tool for analyzing enterprise data. A critical demand of data warehouse users is that the time to get an answer after posing a query (the latency) be as short as possible. Arguably, a quick but approximate answer that can be refined over time is much better than a perfect answer for which the user has to wait a long time. In this paper we address the issue of online support for data warehouse queries, meaning the ability to reduce the latency of the answer at the expense of receiving an approximate answer that can be refined while the user is looking at it. Previous work has addressed online support by using sampling techniques. We argue that a better way is to preclassify the cells of the data cube into error bins and bring in the target data for a query in waves, i.e., by fetching the data in those bins one after the other. The cells are classified into bins using a data model (e.g., linear regression, log-linear models) that allows the system to obtain an approximate value for each data cube cell. The difference between the estimated value and the true value is the estimation error, and its magnitude determines the bin to which the cell belongs. The estimated value given by the model provides a very quick, yet approximate, answer that is then refined online by bringing in cells from the error bins. Experiments show that this technique is a good way to support online aggregation.
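
The error-bin idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the additive model (row mean plus column mean minus grand mean) is a simple stand-in for the regression or log-linear models mentioned above, and the function names, the two-dimensional cube layout, the bin edges, and the choice to fetch the largest-error bin first are all assumptions made for illustration.

```python
from collections import defaultdict


def fit_additive_model(cube):
    """Estimate each cell as row mean + column mean - grand mean.

    Assumption: a stand-in for the data models named in the abstract.
    """
    n_rows, n_cols = len(cube), len(cube[0])
    grand = sum(sum(row) for row in cube) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in cube]
    col_means = [sum(cube[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    return [[row_means[i] + col_means[j] - grand for j in range(n_cols)]
            for i in range(n_rows)]


def classify_into_bins(cube, estimates, bin_edges):
    """Group cell coordinates by the magnitude of their estimation error.

    A cell lands in bin k when its absolute error is >= exactly k of the
    ascending bin_edges, so a higher bin index means a larger error.
    """
    bins = defaultdict(list)
    for i, row in enumerate(cube):
        for j, true_value in enumerate(row):
            error = abs(true_value - estimates[i][j])
            bin_index = sum(error >= edge for edge in bin_edges)
            bins[bin_index].append((i, j))
    return bins


def online_sum(cube, estimates, bins, cells):
    """Yield successively refined SUM answers over the requested cells.

    The first answer uses model estimates only. Each later answer folds in
    one wave: the true values of the requested cells in one error bin.
    Assumption: bins are fetched from largest error to smallest.
    """
    wanted = set(cells)
    answer = sum(estimates[i][j] for i, j in wanted)
    yield answer
    for bin_index in sorted(bins, reverse=True):
        for i, j in bins[bin_index]:
            if (i, j) in wanted:
                # Replace the estimate with the true value for this cell.
                answer += cube[i][j] - estimates[i][j]
        yield answer


if __name__ == "__main__":
    # Hypothetical 3x3 cube with one outlier cell at (1, 2).
    cube = [[10, 12, 11], [20, 22, 40], [30, 32, 31]]
    estimates = fit_additive_model(cube)
    bins = classify_into_bins(cube, estimates, bin_edges=[1.0, 5.0])
    query = [(1, 1), (1, 2), (2, 2)]
    for wave, value in enumerate(online_sum(cube, estimates, bins, query)):
        print(f"wave {wave}: approximate SUM = {value:.2f}")
```

Once every bin has been fetched, the running answer equals the exact aggregate, so the user can stop at whichever wave is accurate enough. In this sketch the bins would be computed ahead of time, when the cube is built, which is what "preclassify" suggests; only the waves are fetched at query time.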