Query, Analysis, and Visualization of Multidimensional Databases
Chris Stolte
Ph.D. Dissertation, Stanford University, June 2003.
Abstract:
In recent years, large multidimensional databases, or data warehouses, have become common in a
variety of commercial and scientific applications. It is not unusual for these data warehouses to contain
billions of tuples, each categorized by tens or hundreds of dimensions. A major challenge with
these databases is to extract meaning from the important data they contain: to discover structure,
find patterns, and derive causal relationships. A promising technique for the analysis of these multidimensional
databases is visualization. To make visualization effective in this context, we need
to develop tools that tightly integrate visual presentation and database queries, support interactive
refinement of the display, and can visually present a large number of tuples and dimensions.
This dissertation introduces a formal approach to building visualization systems that addresses
these demands. The foundation of the dissertation is the Polaris formalism, a language for precisely
describing a wide range of table-based graphical presentations of relational information. A
key aspect of this formal language is the ability to compile visual specifications automatically into
the precise queries and drawing commands necessary to generate the display. This ability enables
us to design systems that closely integrate analysis and visualization. Using the Polaris formalism,
we have built two interactive systems: the Polaris interface and a framework for multiscale
visualization.
The Polaris interface for the exploration of multidimensional databases extends the popular
Pivot Table interface to generate a rich, expressive set of graphic displays. The Polaris interface is
simple and expressive because it is built upon the Polaris formalism. Analysts can incrementally
construct complex queries, receiving visual feedback as they assemble and alter the query. The Polaris
interface is a generally applicable tool that tightly integrates analysis with visualization. This
dissertation also demonstrates how to use the Polaris formalism and data cubes to specify and implement
domain-specific multiscale (pan-and-zoom) visualizations efficiently. The presented approach
to multiscale visualization addresses several limitations in the current approaches by introducing
multiple zoom paths into the data and providing general mechanisms for abstraction.
Dissertation:
Query, Analysis, and Visualization of Multidimensional Databases (PDF, 24 MB)
Products and Software:
A commercial product based on this dissertation is now available from Tableau Software.
Related Papers:
Multiscale Visualization Using Data Cubes
Chris Stolte, Diane Tang and Pat Hanrahan
Best Paper Award
Proceedings of the Eighth IEEE Symposium on Information Visualization, October 2002.
Query, Analysis, and Visualization of Hierarchically Structured Data using Polaris
Chris Stolte, Diane Tang and Pat Hanrahan
Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, July 2002.
Polaris: A System for Query, Analysis and Visualization of Multi-dimensional Relational Databases (extended paper)
Chris Stolte, Diane Tang and Pat Hanrahan
IEEE Transactions on Visualization and Computer Graphics, Vol. 8, No. 1, January 2002.
Polaris: A System for Query, Analysis and Visualization of Multi-dimensional Relational Databases
Chris Stolte and Pat Hanrahan
Proceedings of the Sixth IEEE Symposium on Information Visualization, October 2000.
cstolte@graphics.stanford.edu