Lecture on Oct 12, 2010. (Slides)


  • Required

    • Postmortem of an example, Bertin. (pdf)

    • Visual information seeking: Tight coupling of dynamic query filters with starfield displays, Ahlberg & Shneiderman. (html)

    • Visual exploration of time-series data, Hochheiser & Shneiderman. (pdf)

  • Optional

    • Generalized Selection via Interactive Query Relaxation. Heer, Agrawala, Willett. CHI 2008. (pdf)

    • The cognitive coprocessor architecture for interactive user interfaces. George Robertson, Stuart K. Card, and Jock D. Mackinlay, Proc. UIST '89, pp 10-18. (pdf)

    • Exploration of the Brain's White Matter Pathways with Dynamic Queries. Akers, Sherbondy, Mackenzie, Dougherty, Wandell. Visualization 2004. (html)

  • Demos

  • Videos


amirg wrote:

I found the timeboxes paper to present a particularly interesting yet simple interactive visualization for searching for temporal patterns in data. While reading it, I thought of a couple of additional extensions to the work that I think would be useful.

The first of these would be to query over multiple temporal variables. This is partially addressed toward the end of the paper: the authors note that transcription factors often affect which genes get activated later on, so they planned a feature that lets a temporal pattern from one set of data be identified as a "leader" and shows how one leads into the other. However, I think there could be a lot of benefit in generalizing this by simply allowing the creation of timeboxes over multiple temporally changing variables, rather than restricting to cases where one variable follows another in time. For example, maybe you want to look at a set of variables that are all within a certain range over the same time period.
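To make what I mean concrete, here is a minimal sketch of timebox filtering extended to several variables, keeping only entities that satisfy a box on every variable. The data layout and function names here are my own assumptions, not TimeSearcher's actual API:

```python
# Sketch of timebox filtering extended to multiple variables (assumed layout:
# series[entity][variable] is a list of values indexed by time step).

def in_timebox(values, t_start, t_end, v_low, v_high):
    """True if every value in the time window lies inside the value range."""
    return all(v_low <= values[t] <= v_high for t in range(t_start, t_end + 1))

def filter_multi(series, boxes):
    """boxes maps a variable name to a (t_start, t_end, v_low, v_high) timebox.
    An entity survives only if it satisfies the timebox on every variable."""
    return [entity for entity, vars_ in series.items()
            if all(in_timebox(vars_[var], *box) for var, box in boxes.items())]

stocks = {
    "A": {"price": [10, 12, 14, 13], "volume": [5, 6, 7, 8]},
    "B": {"price": [10, 20, 30, 40], "volume": [5, 5, 5, 5]},
}
boxes = {"price": (0, 2, 9, 15), "volume": (1, 3, 4, 9)}
print(filter_multi(stocks, boxes))  # only "A" fits both boxes
```

The conjunction over variables is the only real change; each individual timebox behaves exactly as in the single-variable case.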

The second extension I think would be interesting would be to allow different views on the time series data itself once you have filtered the query using a timebox. For example, in the stock example given in the paper, they filtered the data set using timeboxes to look specifically at stocks that rose and then fell during a particular period. It might be informative to then look at these stocks and cluster them by industry, so a visualization that supports this could help identify commonalities or differences in the data that remains when you filter using a timebox. I guess this is then not so much about looking at the temporal patterns of the data, but understanding which portion of the data fits a particular temporal pattern and why.

msavva wrote:

@amirg: I also felt that timeboxes were an amazingly simple yet powerful technique. They seem to derive most of their power from the fact that they are so closely tied to the query over the data (i.e. they induce interactive response in the visualization) and that their parameterization is very intuitive. Much like you, after reading the paper I thought that there could be several useful ways to extend the concept of timeboxes using similar ideas.

For example, a gradient/slope restrictor (i.e. effectively a directed funnel) could be a very efficient way to restrict the query with respect to rates of change of the data from a given point. One could also imagine composing gradient funnels with timeboxes to restrict selection of points within a box to a subset of local gradients. A further extension to this would be gradient "spokes" resulting from the composition of multiple funnels.
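A rough sketch of the funnel idea in Python (the names and data layout are made up, not from the paper): keep only series whose per-step rate of change stays inside a slope band over a window.

```python
# Hypothetical "gradient funnel": from time t0 to t1, keep only series whose
# per-step rate of change stays within [min_slope, max_slope].

def passes_funnel(values, t0, t1, min_slope, max_slope):
    """True if every step from t0 to t1 changes at a rate inside the funnel."""
    return all(min_slope <= values[t + 1] - values[t] <= max_slope
               for t in range(t0, t1))

series = {"up": [0, 2, 4, 6], "flat": [3, 3, 3, 3], "spiky": [0, 5, -1, 2]}
rising = [name for name, vals in series.items()
          if passes_funnel(vals, 0, 3, 1, 3)]
print(rising)  # only "up" rises steadily at 1-3 units per step
```

Composing this with a timebox would just mean AND-ing the two predicates over the same window.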

A related idea that occurred to me (and quite likely someone has explored already) has to do with the fact that very often when looking at time series we want to visually extract a particular envelope corresponding to a function of time from the noisy data. I thought that a good way to visually do that would be to quickly construct a smooth curve "stamp" (much like defining control points and gradients for a spline in software such as Photoshop) and then brush it over the data resulting in interactive selection of points that fall within some distance to the curve (this could be represented by an adjustable curve thickness at each control point). A simple example of this would be to define a parabola with a thickness corresponding to the tolerance and then brush this over the time axis to highlight points that match the pattern. A piecewise linear composition of such curves could be used to select arbitrarily complex patterns within the data.
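To illustrate the curve-stamp idea, here is a toy version with a discretized template and a fixed tolerance (a real version would have per-control-point thickness; all names here are mine):

```python
# Sketch of the "curve stamp": slide a template curve along the time axis and
# select positions where the series tracks it within a tolerance. Illustrative
# only; a real tool would allow per-point tolerances and spline templates.

def matches_stamp(values, stamp, tolerance, offset):
    """True if the series, starting at `offset`, stays within `tolerance`
    of the stamp at every point of the stamp."""
    return all(abs(values[offset + i] - s) <= tolerance
               for i, s in enumerate(stamp))

stamp = [0, 3, 4, 3, 0]            # discretized parabola-like template
series = [0, 1, 0, 3, 4, 3, 0, 1]  # the pattern occurs starting at t=2
hits = [t for t in range(len(series) - len(stamp) + 1)
        if matches_stamp(series, stamp, tolerance=1, offset=t)]
print(hits)  # the brush matches at t=2
```

A piecewise composition of such stamps would select arbitrarily complex patterns, as described above.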

wulabs wrote:

The timebox paper made me think of the movie Inception, where your view of your environment (and hence dataset) is based on the maximum of what you think is in the world around you. It's an interesting way of looking at data, and I myself do use this method sometimes to draw conclusions at the very microscopic level.

The first paper, the Bertin postmortem, was also somewhat interesting, but it is a more or less straightforward example of how to look at data and extract some meaning from it. I didn't really pick up anything useful here; it seems to be more of a tutorial/example for someone new to this space.

rakasaka wrote:

The relevance of the topics in the Ahlberg and Shneiderman paper makes it perhaps surprising that it was written 16 years ago. What I like most about it is the stress on the "output-is-input" paradigm; one of the challenges I see in designing effective visualizations is communicating which areas are "inputtable". Since both FilmFinder and TimeSearcher are software-based solutions, it would be interesting to take a more web-based approach and see how graphics like the ones the New York Times publishes are effective or not.

I am excited about the ability to add more graphical real estate through interactivity, the sort of direct manipulation one sees in a rather analog fashion as described by Bertin.

trcarden wrote:

@wulabs I think the reason the Bertin postmortem is on the reading list is to rather succinctly go through all the steps in the data visualization process, so we know what to design for when we create our own visualization tool in the next assignment. It is, like you said, fairly basic and easy to skim through each step of the data visualization process. It will probably be useful to reference back to when checking that our visualization assignment actually lends itself to the data discovery and organization process.

adh15 wrote:

@rakasaka I think Ahlberg and Shneiderman's principle of continuous display is particularly powerful ("always show the users some portion of the information space that they are exploring. They begin by seeing a typical result set or item, which helps to orient them to what is possible in this information seeking environment. This seems more effective than starting with a blank screen or a form to fill in."). Applying this to the Summer Sea Ice example from the New York Times, I think they do a good job of presenting an interesting initial story. I wish, however, that after that initial story, they presented controls that would let the viewer explore more aspects of the data (e.g. what was the state of sea ice on other dates?)

In general, I think this suggests including a couple of pre-made stories in every interactive visualization. Viewers get a quick sense of what kinds of questions the data can answer and are drawn in by an interesting story. Ideally, the vis is also interactive and not just a story-telling mechanism, allowing viewers to easily pose their own questions and branch off of the pre-made stories to explore the data on their own.

adh15 wrote:

I am also wondering whether there is some sort of an optimal update-rate for presenting query results. The systems presented in the readings all update results after the query is specified (for example, with mouse-up events in TimeSearcher). However, as computational power increases, it may be possible to update the results continuously as the query is being entered (à la Google Instant). Beyond a certain threshold though, I imagine that increasing the update rate will see diminishing returns. The rate might be limited by the rate at which humans can process information visually, or it might be limited in other ways (e.g. even if we can process at some high rate, a lower rate might be better because it is less distracting).
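One simple way to cap the update rate without blocking input is to throttle the query: re-run it at most every so often and remember the latest suppressed input. This is just a sketch of the idea (the class and names are mine, not from any of the readings):

```python
# Minimal sketch of rate-limiting continuous query updates, in the spirit of
# Google Instant: re-run the query at most every `min_interval` seconds,
# remembering the latest suppressed input so nothing is lost.
import time

class ThrottledQuery:
    def __init__(self, run_query, min_interval=0.1):
        self.run_query = run_query
        self.min_interval = min_interval
        self.last_run = 0.0
        self.pending = None

    def on_input(self, query):
        """Called on every keystroke; runs the query only if enough time passed."""
        now = time.monotonic()
        if now - self.last_run >= self.min_interval:
            self.last_run = now
            self.pending = None
            self.run_query(query)
        else:
            self.pending = query  # keep the latest input for a later flush

    def flush(self):
        """Run any query that was suppressed by the throttle."""
        if self.pending is not None:
            self.run_query(self.pending)
            self.pending = None
```

Tuning `min_interval` is exactly the open question above: too low and the display may flicker distractingly; too high and the coupling stops feeling "tight".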

mattbush wrote:

The Visual Information Seeking paper was useful in that, at the end, it mentioned that a fuzzier recommendation system would be effective at solving the problem of picking a movie, something the data visualization system presented in the paper was unable to do directly. Specifically, they mentioned "similar films".

Now we have a lot of websites and services out there that do social, algorithmic, and 'fuzzy' recommendations and similar-item-finding. Facebook's Like button, Yelp's reviews, Pandora's personalized music channels are all examples in a way. From this, some important questions to ask are:

Which of these two types of tools alone, data visualization or algorithmic recommendation, is more effective at helping people make decisions? In what situations could one be more important than the other? I think it definitely depends on the scenario. For large, thought-out decisions like buying a house, data visualization is better because there is a vast array of choices to be made and hard variables to consider in the decision. For lighter decisions like restaurants and movies, people tend to trust the judgment of their peers or of an algorithm, and wouldn't necessarily put in the effort required to use a data visualization system.

How much added benefit do we get from a tool that *combines* data visualization and recommendation? Yelp's iPhone app is an example that combines both; it filters the data based on GPS location, allows filtering based on distance and price range, and then provides recommendation data (reviews) for the data it displays. In this case, finding a restaurant from a mobile phone carries a set of hard constraints, much like finding a home, so the quick filtering and browsing of data visualization is a helpful first step for the user before viewing the recommendations.

sholbert wrote:

In a similar vein to adh15, as I read these papers I kept wondering about the engineering constraints of tight coupling and direct manipulation. We are allowing the user great liberty to manipulate data, and we are expected to update the display instantly. As a result, if we hope for interaction to work as described once the amount of data grows very large, we will need some very smart indexing on the back end. What to index depends on what kind of analysis we predict the user will perform, and this is a fairly difficult problem.

With the information explosions in social networks, biocomputation, and other new data heavy fields, if we hope to ever catch up with the growth in data, we need to innovate new ways to engineer a back end that can support the ease of interactive data manipulation for massive amounts of data.

yanzhudu wrote:

Both the timebox paper and the "Visual information seeking" paper demonstrate the importance of interaction for pattern discovery. A seemingly simple query model like the timebox, when used interactively, can let users discover hard-to-see trends. "Visual information seeking" also suggests a few design principles for making such interaction effective.

The "Postmortem of an example", in contrast, is more of a "doing it the manual way" example. If we could digitize the hotel data, we could create interactive diagrams out of it, or even use timeboxes to query trends. This could make trend discovery much easier.

amaeda10 wrote:

I would like to offer some criticism of Bertin's article. First of all, I think the visualization he introduces has several flaws.

1. Using the initial letter of each month as a label is a mistake. Because there are three 'J's in a year, and his table is redundant, it is very confusing. Instead, he should have used month numbers.

2. There is almost no title, caption, or annotation, making it hard to understand what this is all about. For example, I still don't understand what the "Discovery Factors" or "Recovery Factors" mean. It took me a while to figure out what this visualization is trying to show.

On the other hand, I like the mobility of the visualization very much. I wonder which is more efficient for exploring this data set: using the computer or manipulating the papers by hand. I think I would use the computer for this task because what I am going to do is straightforward: simply swapping items.

I do not totally agree with Bertin's statement that "the most important stages - choice of questions and data, interpretation and decision-making - can never be automated". I am not an expert in AI, but isn't it possible to have a machine learn from many examples of experts' choices and imitate the behavior? Maybe interpretation is very difficult, but generating questions and making decisions seem easier. The chatterbot ELIZA faked an actual conversation by, as far as I remember, searching its database for something relevant to the keywords. Maybe the same approach could be applied to generating questions. Our data sets might not be as varied as we think. For example, if we have a data set of movies, then it is probably natural to have categories for movie titles, budgets, release years, directors, etc. Our initial questions might then be quite constrained, and machines could generate those constrained questions.

Again, I do not know much about AI, so please correct me if I am saying something wrong.

asindhu wrote:

I agree that the timeboxes are a very cool technique. I have to honestly say that when I started reading the timeboxes paper, I found myself thinking "this is so trivially simple, how come someone wrote a whole paper on this?" But then during the in-class demo especially I found myself eating my own words and deciding that after all it's actually a very powerful tool. One point that I wanted to make is that this change of mind I had clearly demonstrates the power of interactivity -- the screenshots and explanations really don't do it justice until you see the tool in action in front of you.

One interesting extension to the timebox tool might be to incorporate some of the other ideas from later in the lecture, when we talked about dynamically creating classifications or groupings of data. I found this to be very compelling as a way to interactively group data on the fly to spot interesting patterns. I could imagine using the timebox not just to filter data, but to create classifications on the fly: for example, you could select a certain subset of the data with a box and then color it differently, to mimic the "brushing" effect. This would allow you to build up your own classifications of the data as you interact with it, which has many advantages over static groupings since you can quickly and easily change them based on the immediate feedback from the visualization.
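The mechanics of that box-to-class idea are simple enough to sketch (all names here are hypothetical, not from any of the tools we saw):

```python
# Sketch of building ad-hoc classes via box selection, in the spirit of
# brushing: each box-select assigns a label to the points it contains.

def in_box(point, box):
    (x, y), (x0, y0, x1, y1) = point, box
    return x0 <= x <= x1 and y0 <= y <= y1

def classify(points, box, label, classes):
    """Assign `label` to every point inside `box`; other points keep theirs."""
    for i, p in enumerate(points):
        if in_box(p, box):
            classes[i] = label
    return classes

points = [(1, 1), (2, 5), (6, 6), (7, 2)]
classes = ["unlabeled"] * len(points)
classify(points, (0, 0, 3, 3), "low", classes)   # brush the lower-left region
classify(points, (5, 5, 8, 8), "high", classes)  # brush the upper-right region
print(classes)
```

Because each brush just overwrites labels, reclassifying in response to visual feedback is as cheap as drawing another box.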

I also wonder why the authors restricted themselves to time-series data and specifically called this tool the "timebox." As I see it, this type of selection tool could be used when plotting any two variables against each other.

jdudley wrote:

This is somewhat tangentially related, but I think quite interesting to folks in the class, so I am sharing it here. A very interesting figure search engine for searching biomedical texts was just published in PLoS One:

They also have a demo video:

You'll see later in the video that they have some very interesting interaction paradigms implemented for visualizing the interplay between the text data and the figure image data.

strazz wrote:

I agree with the criticisms amaeda10 made of Bertin's visualization: labels are poorly used, missing, or misleading. Still, it's hard not to be impressed by the skill, vision, and creativity he had to look at information and transform (or manipulate) it in order to extract the knowledge hidden in it. He actually created manual interactive systems that were able to process the data and reveal the information within. I like how Bertin summarizes the goal of data visualization and interaction: "Information is the reply to a question". In this particular example, he managed to create one flexible interactive visualization able to answer many different questions. Modern tools and data sets are far more powerful and complex, but I think we should always keep the basic ideas and goals in mind when attempting to visualize or interact with data in creative ways.

abhatta1 wrote:

In the videos, in Ozone in the Northeast by Richard A. Becker, William S. Cleveland, Beat Kleiner & Jack L. Warner, AT&T Bell Laboratories (1978), I found the time-clock diameter representation of the standard ozone concentration really interesting. In Image of a Thunderstorm, by Anne Freeny and John Gabbe, AT&T Bell Laboratories (1966), the direction encoding and the color encoding were interesting. However, the color choices (gray and black) might have been better at a later stage (given that they had only grayscale at their disposal).

estrat wrote:

Furthering the critique of Bertin's visualization, I felt similarly. Once I understood it, it was a fairly powerful visualization, but until it clicked I was completely baffled. I thought the numbers next to the type of client signified something other than the original order. I had no idea what the black bars meant. The descriptions on the right were difficult to understand. What was the scale being used? Were different rows comparable? It took a visual inspection comparing the two years to determine that the display was just repeated. I tried to figure out the chart when I started reading, but since I didn't understand it after looking at it for a minute or two, I just ignored it and read the article, and while reading, it eventually made sense to me. Once I understood it I liked it, but until then it was useless to me.

anomikos wrote:

As many people have pointed out so far, timeboxes are an excellent idea for consolidating the introduction of multiple query components into one user action. If only I had known about this when I implemented my first project as a software engineer for the sewage utility company back in Greece. Users were really struggling with the traditional interface of widgets and trying to interpret the data in traditional table form. In comparison, timeboxes allow for quick filtering of the data in two dimensions and, more importantly, with different sets you can easily add multiple constraints during the same query, something that is not possible with the widget technique. It is simple and powerful at the same time, and very intuitive for users. We have been using this technique to select files and to select units in RTS games; it is actually strange that it took so long to apply it to querying.

The LA homicide demo is a good example of constructing queries as a step-by-step process. It is interesting how the available options change depending on the view. While playing with the demo, and after using Tableau extensively for Assignment 2, I started thinking about whether there is an optimal selector (widget) for a specific data type, or whether it depends on the actual data and context. It would be interesting to test whether standardization would make user interaction easier and could lead to better use of visualization tools.

clfong wrote:

I would like to comment on the paper by Ahlberg & Shneiderman. I find it amazing that this paper was written 16 years ago, yet a lot of the concepts it introduces about user interaction have been re-invented in recent years on the Web. In particular, the concept of "rapid, incremental and reversible actions" was recently highlighted again by Aza Raskin, the Creative Lead of Firefox, in one of his recent articles.

Letting users know that they can undo any action they perform gives them much more confidence in exploratory interaction with the system. This makes the cycle of "making a mistake, stepping back, and discovering something new" much more fluid in data visualization.

The other important concept in that paper is "immediate and continuous display of results". If we look back five years, the Web was largely static websites; on a few, you had to submit a request and the server responded with new data after some loading time. This was true until AJAX was popularized by Google with Google Maps. Now this type of continuous interaction is at the core of a lot of Web 2.0 sites. We don't have to wait for a page reload when we want to refine our query and get another piece of data.

acravens wrote:

@amaeda10, @strazz, @estrat etc. - Adding my reactions to the Bertin piece:

"Information is the reply to a question" - to me this gets at the idea of approaching a visualization as a design problem of needfinding and problem-finding. I can see the arguments for generating simple questions from data using AI, but while those might on the one hand serve as starting points to inspire people to ask more detailed questions, I think it's more likely those generated questions would constrain people's sense of the possible questions to ask. The thing I most appreciated about the Bertin article is that he calls defining the problem a "problem of imagination." I would emphasize that this "problem of imagination" is not only - or even primarily - about the dataset. It's about the users and their needs and how the information in the data can contribute to meeting those needs. I know very little about AI, but I have a hard time imagining a computer can get at those by only analyzing the dataset itself. It seems that to generate insightful questions automatically you'd need some kind of "social context" metadata that would be hard to find for many datasets (unless maybe they were connected to social networking).

hyatt4 wrote:

I really enjoyed seeing the babynamewizard website in class. I was particularly inspired by the range of options and the extent to which information could be manipulated and formulated in meaningful ways with only a few interactive controls (a search box, and mouse movements or clicks). I wonder what the general rule of thumb is for the number of controls an interface should provide. I suppose the answer is, "it depends". For instance, a fighter pilot needs a lot of controls to fly a plane, whereas a little child only needs a quarter to put into the toy jet outside of Safeway. Of course there are significantly different (or, if you are a fighter pilot, I'm sure you would argue infinitely different) expectations of knowledge for the user/pilot of each. Personally, I would like to have every conceivable option to manipulate data available to me, but only a handful of the most meaningful manipulators shown to me. That is, I'd like the best manipulators prominently available, and the rest either configurable or hidden away in some menu or configuration tool that I can find if I really need to.

gdavo wrote:

In "Postmortem of an example" Bertin wrote: "Extrinsic information, that is, the nature of the problem and the interplay of the intrinsic information with everything else. And, by definition, everything else is what cannot be processed by machine. [...] The most important stages - choice of questions and data, interpretation and decision-making - can never be automated. There is no "artificial intelligence" ".

In his time, these processes certainly seemed impossible to automate. But today artificial intelligence is a very active field of computer science. Do you think Bertin's claim would still be valid today?

andreaz wrote:

I agree with Bertin about the importance of human cognition in data analysis. Artificial intelligence techniques work well at finding patterns within the dataset supplied. But choosing what data to give a computer to analyze, as well as integrating the feedback from the resulting image with data outside the scope of the dataset, requires human thought. I think the differentiation between intrinsic and extrinsic data that Bertin is trying to make is that intrinsic data is the relationship revealed within a given dataset, whereas extrinsic information is the information relating to the nature of the problem under investigation that is outside the scope of the dataset. My understanding of Bertin's definitions is that all data a computer interacts with is inherently intrinsic information, since all data they interact with is defined, and information outside their scope is extrinsic. Ultimately a feedback loop exists between humans and computers; users dictate the questions and supply the data set to the computer, which displays patterns within the data, which the user then interprets and integrates with outside knowledge concerning the nature of the investigation. The user tailors the scope of the investigation based on the insights that result from this process--insights that dictate the user's subsequent interactions with the computer.

I really enjoyed the demos in class. Here are a couple more resources for some great interactive data visualizations:

gneokleo wrote:

Brushing or selecting data from a large dataset can help the reader identify patterns much more easily. The example in the Bertin reading with the hotel owner was interesting, and the different season clustering and coloring really make the data stand out. Cycling through different visualizations is very important when investigating data; the importance of this was emphasized even more in the video of Tukey explaining PRIM-9 and how they built new hardware to look at data in different dimensions. I also experienced this when working on Assignment 2, where switching through different visualizations was something I used a lot to see differences (sometimes small but important) between graphs. Selection and cycling through data, along with other properties like grouping or different graph types, are "actions" that can help in exploratory analysis of data, reduce the time of the question-answer cycle, and engage the reader to ask more questions.

iofir wrote:

This is my favorite topic so far. I'm really glad that our next assignment is an interactive visualization. Of all the interactive visualization examples in the readings, I most preferred the refinement-by-area-selection method: using a box or cube to highlight the data sets we want to see. This method is used a lot in physical simulations, for example wind tunnel simulations that show the dynamic flow of air around an object. One of the more interesting ways of showing flow is by generating noise and moving it along the direction of the flow; the movement can be seen because of the noisy texture and is easy for the eye to interpret. Now add an interactive element like highlighting and you can focus on one part of the flow. A good way of doing that would be to select a box region and add color to all noise that flows through it. The flow will carry the color and it will eventually dissipate, but near the box the flow can easily be traced by following the color. I think I'll implement that for my project. Anyone interested in partnering with me?

saahmad wrote:

I made this point in class and am going to partially reiterate it here. Reading the VIS paper, I continue to be impressed with the elegance and ease of use of such a system. However, these systems are tightly coupled with the type of data they are intended to explore and thus cannot be generalized easily to other data sources. Some level of programming is required to do so.

Thus, while I think the interaction ideas are stellar, I would personally find little use for such a system unless it could automatically generate the UI for me based on the data's schema, OR the data was particularly important and common such that a custom interface is warranted (for example, finding songs in your music library).

I thought about systems out there that support the interactions of the VIS paper while remaining generalizable to other data. Bento, Excel, Numbers, Microsoft Access, etc. are all wonderful tools, but it is surprising how few widgets are supported. You can typically include things like a search field or slider, but they lack controls for more complex data items like location. Furthermore, they are designed almost exclusively for record-like use cases, for example keeping attendance for a class or tracking expense reports. They are not intended to be used on large datasets of numeric data, like stock prices or experimental results. Does anyone know of analytic software out there that does a decent job?
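A schema-driven UI along these lines could be as simple as mapping each column's type to a default dynamic-query control. A toy sketch (the type-to-widget mapping and all names are hypothetical):

```python
# Sketch of generating query widgets from a data schema: map each column's
# type to a default dynamic-query control. Mapping is illustrative only.

WIDGET_FOR_TYPE = {
    "numeric": "range slider",
    "categorical": "checkbox group",
    "text": "search field",
    "datetime": "time-range slider",
    "location": "map brush",
}

def widgets_for_schema(schema):
    """schema maps column name -> type; returns a widget suggestion per column."""
    return {col: WIDGET_FOR_TYPE.get(kind, "search field")
            for col, kind in schema.items()}

movies = {"title": "text", "budget": "numeric", "release": "datetime"}
print(widgets_for_schema(movies))
```

The hard part, of course, is not this lookup table but inferring good types, ranges, and defaults from real data.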

I know several friends in the financial sector and they have quite sophisticated visualization tools for visualizing stock prices, but those are hardly generalizable to say, the data from a physics simulation.

emrosenf wrote:

I wanted to add two comments. First, everyone check out tenderloin noise, an interesting example of a cool geo-visualization, though I don't think it meets the criterion of being interactive. Is anyone here interested in working on stuff like that? If so, I'd love to meet you and maybe work on the group project together.

Also, I have to agree with Bertin that data in and of itself is useless. I had an experience this summer where I was trying to build a product for small business owners. I started by catering to restaurants. "Don't you want to know if your new customers in June are more loyal than your new customers in July? Don't you want to know if you are attracting an increasing number of new customers?" For me, growing up with analytics about everything, metrics are a lifestyle. But these business owners just stared back blankly and asked: "What would I do with this data?"

I think the point of this is that data in tabular or graphical form is useless unless it's easy to draw conclusions from and actionable. Lesson learned.

felixror wrote:

I agree with many of the above comments that the timebox is quite powerful yet simple to use. It is very effective for exploring data from particular domains such as DNA microarrays and stock prices. However, I highly doubt that such a tool will be that useful in a real-world setting. For example, in analyzing stock prices, there are many mathematical and statistical tools for pattern recognition, such as correlation, which also handle shifts in the temporal dimension better than timeboxes. Furthermore, they analyze data faster and can process more data than mere eyeballing with timeboxes, and they allow analysis over a longer time period. But under certain conditions, when the dataset is not that big and the time period of interest is short, I think the timebox can be a handy tool.

jsnation wrote:

I also think that TimeSearcher, and the timebox query model in general, are really useful tools for exploratory data analysis of a particular kind of data. They seem to work really well, and are meant for very large time-series data sets. In the conclusion of that paper the authors state that as future work they would like to compare the performance of timeboxes to other input paradigms like one-dimensional sliders. I would be interested to see how the timeboxes compared, and whether they were in fact more intuitive and led to finding difficult patterns more quickly. I also think that the principle of 'tight coupling' is applicable to all software UI design, not just data visualization. The more readily users can see what their actions are doing and what is going on, the less confusion there will be. I also really liked the Homefinder demo of dynamic querying with a 'starfield' display. I didn't realize just how old dynamic querying was; the first big thing I remember using it in was iTunes, to filter my music library. I wonder what work has been done on dynamic querying with user-definable relationships between the query terms (or I guess user-definable queries that are composites of other, more basic query terms).
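The timebox query model itself is simple enough to sketch in a few lines. This is my own minimal reading of it, not code from the paper: each box constrains a time interval to a value range, and a series matches only if it satisfies every box (boxes combine conjunctively, as in TimeSearcher).

```python
def matches_timeboxes(series, timeboxes):
    """Return True if `series` (values indexed by time step) satisfies
    every timebox. A timebox (t1, t2, vmin, vmax) requires all values
    at time steps t1..t2 inclusive to lie within [vmin, vmax].
    """
    return all(
        vmin <= series[t] <= vmax
        for (t1, t2, vmin, vmax) in timeboxes
        for t in range(t1, t2 + 1)
    )
```

Filtering a collection of series is then just a comprehension over this predicate, which is what makes the dynamic-query interaction cheap enough to update as the user drags a box.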

ankitak wrote:

In the video showing the history of Computer Graphics in Statistics, it was interesting to see the progression of work starting from the seminal work by Tukey (Prim-9) and others like Chang and Kruskal around the same time, to McDonald and Newton in the following decades, to XGobi in the 90s. However, what struck me as most interesting is how the newer tools still derive their basic concepts from the tools introduced in the 70s. This, along with the study of Prim-9, has led me to a deep appreciation for the work by Tukey!

Moreover, I really enjoyed the zipdecode demo in class today. It would be really cool if such a tool existed with greater "context" - say a tool like zipdecode over google maps. Something like [this] but with better interaction techniques (like that of zipdecode). I think it would be really simple to implement - does anyone know if such a tool exists?

selassid wrote:

When we were listing interaction techniques, I thought it was limiting that we confined the discussion to ways in which the user influences the computer system; we didn't discuss ways the computer system could influence the user. I was intrigued by the problem of letting the user know when there's data just outside the explicit search or selection range that they might find interesting. Answering this problem almost requires that the computer can identify interesting trends on its own, although there might be some less tricky ways of helping users out. What if, when viewing a scatterplot, the system noticed that a trend line still fit well when you expanded one of your filters by 10% and let you know? What if dynamic query widget sliders became "sluggish" or hard to move when their filtering criteria would eliminate all results? What if, when brushing, other data points were lightly highlighted that would increase the quality of a fit in another plot? These sorts of computer-to-user interactions might be hard to compute quickly or might not consistently reveal new trends, but they could further help communicate facts about the data to the user. Although the visualization itself should not be forgotten as the workhorse of computer-to-user interaction, other interactive subtleties could be used.

mariasan wrote:

I really enjoyed the Homefinder demo. I can only imagine how amazing that must have seemed when it came out in '92. It's interesting that we view the tool as outdated and much less useful than tools currently available. In the big picture, Homefinder offers a lot of the same functionality as a similar product would today. The big difference is the fidelity of the graphics we would display our results on. I'd imagine a more current Homefinder mapping out homes on a nice map, maybe even with an option to switch to satellite view, so that you can check out every detail of your new neighborhood before moving in. But apart from nicer graphics and perhaps more available information per unit, what would we do differently in a 2010 remake of Homefinder? What are some new techniques for interactive visualizations that we've invented in the last 18 years?

msewak wrote:

I loved the LA homicide plot and I will definitely incorporate ideas from it for assignment 3. I like that not only can we interact with the data, but we can also change the basic display. I also like how we can specify that we want "data like this". It is a very intuitive way of picking out data and it is clever to build it in. Some analysis needs to be done to see what "data like this" means, but I will try to incorporate that in my interactive data tool.

Playing around with ggobi was fun. I like its "tour" feature: even though it is not great for directed questions, it can be really useful for navigating the data for the first time.

avogel wrote:

The timebox really is one of those features that seems to be 'just right'. It's remarkable how much iteration it takes before finding intuitive and powerful tools.

The starfield is a fun visualization, but from a few perspectives I wonder how appropriate it is for certain situations. For housing it makes perfect sense; I can't think of a better way to present position data (if you assume a map is just a more detailed starfield). However, for movies, the main thing the starfield adds is color for genre information, but that could be presented relatively easily (and perhaps more cleanly) in a table.

On the other hand, if Ahlberg & Shneiderman were to redo their paper with more modern graphics hardware available, I'm not sure the minor flaws mentioned above would be as apparent. Especially in real-world applications, the user-experience gain from using a starfield (maybe even a 3D field in order to encode another variable, or subgenres, or cross-/multi-genres in the case of movies) might be significant enough to justify doing so. I'm pretty sure I'd be much happier with the Netflix console product if it used a starfield visualization with some query widgets rather than its current UI.

ericruth wrote:

In response to @mariasan's post - I think two key improvements a modern version of Homefinder would make on the old version are a nicer map-based interface (like you said) and images/detailed information on individual homes (if the data were available). The map is definitely key because it would allow people to understand where homes are in the context of other landmarks.

That said, it's pretty interesting that thinking about the "improvements" we might make now, they're almost all focused around improving the graphics or data of the Homefinder (not the fundamental visualization mechanism). I wonder if we'd see similar trends around old visualizations; are visualization techniques really changing that much, or are they just getting prettier? It seems like a lot of the visualization concepts and theories from the pioneers of the field still hold relatively true today, which makes me wonder what the biggest changes in the field are. Is it just computation power?

jtamayo wrote:

On @saahmad 's note that "these systems are tightly coupled with the type of data they are intended to explore and thus cannot be generalized easily to other data sources easily," I'd like to point out that the problem is not a simple matter of programming.

As Prof. Heer pointed out in the Name Voyager example, choosing the few dimensions that provide the maximum exploratory power is an art, not something you could easily automate. The choice is based on human nature: What will people be curious about? What kinds of questions do we want the visualization to answer? It's difficult to answer these questions automatically by looking only at properties of the data.

A different problem is that interactive visualizations are hard to create, and it takes a lot of programming to get the computer to do what you want. This is a classical computer science problem, and languages like Protovis try to make it easier to program at the right level of abstraction.

heddle wrote:

Going along with so many of the previous comments, I was thinking all through Tuesday's lecture that while some of the graphical data display from the past was really interesting, it probably wouldn't fly today. And @ericruth went further with this by talking about the idea of making graphs "pretty" without working on the actual visualization technique. If you look at Bertin's article on graphical decision processing, while the conceptual breakdown of the data and the visualization is still very applicable to data today, I have to wonder how the actual visual would have changed in today's world. Now, we don't just have the ability to make things pretty, we also have more pixels per inch to work with, and I think that as we've moved forward it's the concept of space more than anything else that's changing how we can view data. Overlaying data on a map nowadays is a completely different visual experience than it was even in the 90s. So yes, while all the ways of linking data, doing filter queries, and doing interactive temporal queries are important breakthroughs, I think the best thing to happen to data visualization is more space and resolution in which to display the data.

arievans wrote:

@ankitak, I too absolutely loved the simplicity and practicality of zipdecode. This sort of tool is interesting and useful, and I am certainly interested in working on a project that is similar in scope and presentation.

@heddle brought up some excellent points. When we saw the professor's demo in class for housing data, I couldn't help but recognize the benefits of the presentational aspect. Even if the tool were as effective as, or more effective than, comparable tools today, people might have dismissed it merely because of its appearance. In today's world, where we can really make things 'pretty,' we are given more opportunities to "oooooh ahhhh" the crowd. And responding to @emrosenf as well, I think that maybe the new generation of small business owners might fall into the opposite trap: wanting to capture as much data as they can, but maybe not realizing how to analyze and display it properly to really get the full utility from it. For example, my father runs a small women's shoe store. When I ask him questions about his business he says, "oh I keep logs of all that stuff, I could run a report to find out." But that's just it: no report ever really gets run. So perhaps now the goal is to push past merely collecting the data and promote the visualization aspect. And now that we have so much potential to make things 'pretty,' I think that task will be much easier this time around. That's actually one of the reasons I loved zipdecode so much: it was beautiful and clear and focused on pretty much one question. With this kind of design applied to other data sets, I am sure that business owners would be convinced of the value of these tools.

jasonch wrote:

@sindhu Yeah, I thought the exact same thing as I was reading the paper: why can something so simple deserve an entire paper? After today's lecture, though, I realized that is the power of intuitive interaction. The TimeBox seemed trivial because it's a very intuitive design (the "I could've thought of that" phenomenon), but the intuitive interaction makes the tool that much more powerful. Same goes for BabyNameWizard: a simple and intuitive design makes the tool more engaging and useful!

jbastien wrote:

The TimeSearcher demonstration reminds me a lot of Palantir Finance's product; they have some pretty interesting visualizations in there.

jeffwear wrote:

In 'Postmortem of an example', Bertin asserts that "the most important stages [of analysis] - choice of questions and data, interpretation, and decision-making - can never be automated." This is a contention which is not fully supported in light of some of the other readings we have done this quarter and even in light of Bertin himself.

I agree with Bertin up to a point. Machines have no force in the external world. They cannot gather data, nor enforce any recommendations derived from it. Secondly (and this contributes to the first point), the machine knows nothing of semantics. The data means nothing to the machine. So of course the machine cannot define a problem, neither the problem itself nor the constraints that bind it.

But once the machine has been provided with a sufficient amount of data in usable form, why can it not offer recommendations for humans to consider? If the hotel operator's concern is to maximize occupancy, for instance, he need only tell the machine to optimize for that variable, and the machine could do it. As another example, several weeks ago we read a description of an analysis of VLSI chip production data where a machine determined the ideal yield profile. I would even call these recommendations 'decisions', were it not that the machine cannot put them into practice.

Bertin I think is concerned that there are aspects of the problem which could not be encoded in a machine-usable form. But I think this is only an encouragement to the machine's operators, to be more clever about how to prepare data to leverage machines' terrific pattern-recognition abilities.

esegel wrote:

Bertin outlines the following stages of Decision Making: (1) Defining the problem (2) Defining the data (3) Adopting a processing language (aka how do you view and interact with the data?) (4) Processing the data (aka data exploration) (5) Interpreting, Deciding, & Communicating

I think this list is a good start. A few observations, though:

- There are no feedback loops in this list. Oftentimes later steps inform earlier ones. For example, perhaps exploring the data (step 4) reformulates how you want to view and interact with the data (step 3), and sometimes interpreting the data (step 5) reformulates how you think about the problem (step 1). There are lots of loops in actual decision making.

- For more open-ended questions, the hypothesis forming and problem-space defining come from exploring the data. At first, it isn't clear what data will help you answer your questions, so you have to poke around. In other words, for exploratory analysis you almost have to flip the above list on its head!
