Lecture on Nov 18, 2009. (Slides)
The Value of Visualization. Jarke van Wijk. Visualization 2005 (pdf)
The challenge of information visualization evaluation. Plaisant. (pdf)
Evaluating data visualization is challenging because it sits at the interplay between an objective visual artifact and the subjective experience of a human viewer. As such, I would claim that it is impossible to create a universal, a priori ranking of visualization principles, unless that ranking consists of the single maxim that an effective visualization must account for contextual factors such as audience and intended use.
Audience factors include both domain experience and culture. The latter is particularly interesting given that it has been shown to be a determining factor in objective measurements of cognitive phenomena (e.g., event-related potentials; see http://scan.oxfordjournals.org/cgi/content/abstract/nsp038v1). Such findings suggest that experience, both general and domain-specific, determines in large part the effectiveness of a visualization.
Furthermore, measures of effectiveness cannot be made outside the context of a particular task. As Plaisant indicates, experimenters should strive to identify domain-relevant tasks to better determine success, rather than rely on traditional but dubious approaches such as measuring task time and error rate in location and identification procedures. Finally, Plaisant's appeal for longitudinal studies is highly relevant, as a marked distinction exists between the initial impact of a visualization tool and its long-term potential.
I agree with cabryant that audience factors such as experience and culture cannot be ignored in an evaluation. Though we end up focusing very much on the individual visual components that make up our tools (as they are easier to control), perception and cognition are, after all, balanced by top-down processing based on our experience and expectations. Unfortunately, these are much more difficult to understand or predict, especially for tools meant to serve a wide range of audiences.
Consistent with this observation, Plaisant mentions (in section 3.4) the possibility of using sound as one of many ways to enhance universal usability: "Encouraging results have been found with the sonification of graphs, scattergrams, and tables. Spatial sound might help sonify more complex data representations."
I think data sonification is a fascinating field with great potential, but also with many interesting challenges. Not only is it difficult to find an intuitive mapping between sound parameters and data parameters, but individual differences in head-related transfer functions (HRTFs) also make it difficult to precisely model 3-D sound using stereo headphones. Sound is great for signaling changes that occur over a fine temporal scale, but individual differences among listeners, again, make it difficult to quantify the resulting perceptual experience, particularly for sounds with rich timbre that are less "annoying" and more "beautiful" to hear, and that could potentially aid the usability of interfaces. Nonetheless, I hope to see more visualizations that incorporate sound, because when done correctly, sound tends to heighten the 'fun' of exploration and encourages interaction.
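To make the mapping problem concrete, here is a minimal sketch of one candidate mapping for sonification: data values scaled linearly onto a pitch range. The function name and the frequency range (220-880 Hz, i.e., two octaves above A3) are my own illustrative choices, not anything prescribed by the reading.

```python
def value_to_pitch(value, lo, hi, f_min=220.0, f_max=880.0):
    """Map a data value in [lo, hi] to a frequency in [f_min, f_max] Hz."""
    t = (value - lo) / (hi - lo)        # normalize the value to [0, 1]
    return f_min + t * (f_max - f_min)  # linear interpolation over the pitch range

data = [0, 5, 10]
print([value_to_pitch(v, 0, 10) for v in data])  # [220.0, 550.0, 880.0]
```

Even this toy version surfaces the evaluation question above: pitch perception is roughly logarithmic in frequency, so whether a linear or logarithmic scale sounds more "intuitive" is itself something that would need to be tested with listeners.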
The hyperbolic tree from Inxight that won the CHI browser cook-off reminded me of another product from them (though one with limited user adoption), the perspective time wall: http://www.inxightfedsys.com/products/sdks/tw/default.asp I think it was also mentioned/shown in one of the earlier lectures.
Some time back, I found a product that used a similar visualization metaphor but tailored it for a different purpose/task altogether: rapidly browsing through media files. See http://www.cooliris.com/ Unlike Inxight's product, it seems to have gained significant mindshare. The lesson here jibes with what was said in class regarding evaluation: the right tool depends very much on the task at hand.
It is now (after taking the class) possible to think in terms of the best representations, encodings, and interactions for different problem domains instead of shoe-horning problems into traditional chart-ware solutions. The thought process of retaining rigor and structure in the use of basic primitives, while conceiving of out-of-the-box approaches to domain-specific custom visualizations, is a key benefit of this class. One other resource I found recently that adopts a similar approach is The Grammar of Graphics by Wilkinson.
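The core idea behind that approach can be sketched in a few lines. This is a hand-rolled illustration of the composition idea only, not Wilkinson's actual notation or any real library's API: a chart is assembled from independent primitives (data, aesthetic mappings, a geometry) rather than picked from a fixed menu of chart types.

```python
def chart(data, mapping, geometry):
    """Compose a declarative chart spec from basic primitives."""
    return {"data": data, "mapping": mapping, "geometry": geometry}

sales = [{"year": 2008, "total": 10}, {"year": 2009, "total": 14}]
mapping = {"x": "year", "y": "total"}

# Swapping one primitive (the geometry) yields a different chart
# from the same data and mapping, with no new "chart type" needed.
scatter = chart(sales, mapping, "point")
trend = chart(sales, mapping, "line")
print(scatter["geometry"], trend["geometry"])  # point line
```

The design payoff is that each primitive varies independently, which is exactly the rigor-plus-flexibility combination described above.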
I think evaluation really goes hand in hand with the previous topic of identifying design principles. Evaluation only makes sense in the context of goals and meaningful metrics. However, as previously mentioned, the goal may be something subjective like human experience. In this sense, I think the "case studies" approach is more interesting because it takes into account the real-world settings that affect how we view things. One thing that wasn't really mentioned is the idea of preference. Controlled experiments can generate data, but user preference may run in the opposite direction of your own metrics. In that case, should a "worse" design win over a better one?
Plaisant proposes that more benchmarks and concrete evidence are needed to demonstrate the value of good visualizations before they will gain more widespread use. This may be true, but it seems to me that most users simply care about getting their work done. In his book on web design, Don't Make Me Think, Steve Krug uses Simon's term "satisficing" to describe the habit of web users simply clicking on the first thing that works. I feel a similar tack should be taken to encourage wider use of visualizations. I don't think the problem is that people fail to recognize the value of good viz; rather, they are simply content to satisfice. In the future we will need to find new ways to let people use good visualizations without having to think about it. The best new tools will be the ones that make it easier and cheaper to build good visualizations; only then will they become more widespread.
Most scientific evaluation techniques we have seen rely on accuracy or timing to measure the value of a particular visualization, but most do not consider cognitive load: the amount of 'brain power' needed to use a visualization. In the case of horizon graphs, it was clear that people balked at 4-band graphs, and the timing/accuracy data did support this, but the decrease in these metrics did not seem to match the strength of my subjective impression; I think this was due to the extra cognitive load of a mental multiply and add to get the right number. It would be interesting to measure cognitive load for certain visualizations. I know that pupil dilation is one means of doing this, but there are other ways: playing loud music during a trial would impede higher-level processing, making visualizations that require intense thought harder to use than those that can be read pre-attentively. Or you could ask someone to perform two tasks simultaneously and see how performance degrades. I think studying this as an evaluation metric is crucial, as it gets at one's enjoyment of a visualization: I like visualizations whose meanings are immediately obvious and do not require complicated decoding, and measuring cognitive load may help an evaluation verify when an image achieves this property.
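The "mental multiply and add" can be spelled out explicitly. This is a sketch of the arithmetic a reader of a layered (mirrored) horizon graph performs in their head; the band size and the sign convention for mirrored bands are illustrative assumptions, not the parameters of the actual study.

```python
def decode_horizon(bands_filled, offset, band_size, negative=False):
    """Recover a data value from a mirrored horizon graph reading:
    the number of fully filled bands, the offset within the topmost
    band, and the value range each band spans."""
    value = bands_filled * band_size + offset  # the mental multiply and add
    return -value if negative else value       # mirrored bands flip the sign

# e.g., with 50-unit bands: 3 filled bands plus an offset of 20 -> 170
print(decode_horizon(3, 20, 50))        # 170
print(decode_horizon(1, 5, 50, True))   # -55
```

With one band there is nothing to compute, which is consistent with the intuition above that the extra load, not just the timing and accuracy numbers, is what makes 4-band graphs feel so much harder.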
Data Visualization: Evaluation (last edited 2009-11-18 22:29:33 by jheer)