Wikipedia is an example of the collaborative, semi-structured data sets emerging on the Web. These data sets have large, non-uniform schemas that require costly data integration, making visualization difficult.

We present Vispedia, a system that reduces the cost of data integration, enabling casual users to build ad hoc visualizations of Wikipedia data. Users can browse Wikipedia, select an interesting data table, then interactively discover, integrate, and visualize additional related data on demand through a search interface and a query recommendation engine. This is accomplished through a fast path search algorithm over a semantic graph derived from Wikipedia. Vispedia also supports exporting the resulting augmented data tables for use in more traditional visualization systems. We believe that these techniques begin to address the "long tail" of visualization by allowing a wider audience to visualize a broader class of data.
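
As a rough illustration of path search over a semantic graph, the sketch below runs a depth-limited breadth-first search from one table row and then applies a discovered path to every row to produce a new column. It is not Vispedia's actual algorithm or data model: the graph contents, the function names `find_paths` and `follow_path`, and the keyword-matching rule are all assumptions made for this example, and the values are invented.

```python
from collections import deque

# Toy semantic graph: node -> list of (edge_label, neighbor).
# All node names, edge labels, and values are hypothetical placeholders.
GRAPH = {
    "PersonA": [("birthplace", "CityC"), ("party", "PartyX")],
    "PersonB": [("birthplace", "CityD")],
    "CityC": [("state", "StateE"), ("population", "1000000")],
    "CityD": [("population", "250000")],
    "StateE": [("capital", "CityC")],
}

def find_paths(graph, start, keyword, max_depth=3):
    """Breadth-first search for edge-label paths that begin at `start`
    and whose final edge label contains `keyword`."""
    results = []
    queue = deque([(start, [], {start})])
    while queue:
        node, path, seen = queue.popleft()
        if len(path) >= max_depth:
            continue  # do not extend paths beyond the depth limit
        for label, neighbor in graph.get(node, []):
            new_path = path + [label]
            if keyword.lower() in label.lower():
                results.append((new_path, neighbor))
            if neighbor not in seen:  # avoid revisiting nodes on this path
                queue.append((neighbor, new_path, seen | {neighbor}))
    return results

def follow_path(graph, start, path):
    """Follow a sequence of edge labels from `start`; return the node
    reached, or None if some edge is missing."""
    node = start
    for label in path:
        targets = [n for l, n in graph.get(node, []) if l == label]
        if not targets:
            return None
        node = targets[0]
    return node

# Discover a path from one example row, then integrate it across all rows.
rows = ["PersonA", "PersonB"]
paths = find_paths(GRAPH, rows[0], "population")
best_path = paths[0][0]
column = [follow_path(GRAPH, row, best_path) for row in rows]
```

Breadth-first order returns shorter paths first, which is one simple way to rank candidate integrations; a real system would presumably use a richer ranking.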

We evaluated this system and its interaction techniques in a first-use formative lab study. Study participants were able to quickly create effective visualizations for a diverse set of domains, performing data integration as needed. This suggests that the techniques embodied in Vispedia do reduce the time needed to author ad hoc visualizations of Wikipedia data.

Prior to our work on Vispedia, we developed the basic path-based data integration formalisms in "Visualization of Heterogeneous Data", and investigated lightweight semantic-aware web interaction techniques in "Programming by a Sample: Rapidly Creating Web Applications with d.mix".


"Vispedia: Interactive Visual Exploration of Wikipedia Data via Search-Based Integration"
(04/2008, QuickTime (H.264), 1248, 3min, 30MB)

Bryan Chan, Justin Talbot, Leslie Wu, Nathan Sakunkoo, Mike Cammarano, Pat Hanrahan
"Vispedia: On-demand Data Integration for Interactive Visualization and Exploration"
(ACM SIGMOD Demo Paper, 2009)

Bryan Chan, Leslie Wu, Justin Talbot, Mike Cammarano, Pat Hanrahan
"Vispedia: Interactive Visual Exploration of Wikipedia Data via Search-Based Integration"
(IEEE Information Visualization, 2008)

Mike Cammarano, Xin (Luna) Dong, Bryan Chan, Jeff Klingner, Alon Halevy, Pat Hanrahan
"Visualization of Heterogeneous Data"
(IEEE Information Visualization, 2007)

Björn Hartmann, Leslie Wu, Kevin Collins, Scott R. Klemmer
"Programming by a Sample: Rapidly Creating Web Applications with d.mix"
(ACM UIST Full Paper, 2007)



Bryan Chan
Leslie Wu
Justin Talbot
Mike Cammarano
Pat Hanrahan
Jeff Klingner
Alon Halevy
Luna Dong


