Lecture on Sep 28, 2010. (Slides)

Readings

  • Required

    • Chapter 3: The Power of Representation, In Things That Make Us Smart. Norman. (pdf)

    • Chapter 4: Data-Ink and Graphical Redesign, In The Visual Display of Quantitative Information. Tufte.
    • Chapter 5: Chartjunk, In The Visual Display of Quantitative Information. Tufte.
    • Chapter 6: Data-Ink Maximization and Graphical Design, In The Visual Display of Quantitative Information. Tufte.
    • A Conversation with Jeff Heer, Martin Wattenberg, and Fernanda Viegas, ACM Queue

  • Optional

    • The representation of numbers. Zhang and Norman. (pdf)

Comments

gneokleo wrote:

I found the visualization of the stock market using a treemap very interesting. At first, when I was in class, I thought this could be something very useful for a researcher or trader looking to find trends in the market very fast. The use of different-sized rectangles for sectors and individual stocks, along with color for gain/loss, conveys a lot of information very quickly. However, after some more thought, I started to think that perhaps this kind of visualization is not suited for precision, or could even mislead the reader. For example, it is very hard for humans to distinguish between different color brightnesses and small differences in shape size. Some might argue that these small differences are not important, but what if they turn out to give the reader misleading information? This points to another important factor when choosing a visualization: the audience you are targeting. While the stock market visualization could be excellent for non-professionals and casual traders, it might not be ideal for professional traders.
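
A quick sketch of that encoding (Python with matplotlib, purely for illustration; tickers and numbers are invented, and this draws a one-level strip rather than a proper nested treemap): area encodes a hypothetical market cap, hue encodes gain versus loss, and shade encodes magnitude, the very brightness judgment at issue.

    # One-level "treemap strip": area for (invented) market cap,
    # hue for gain/loss, shade for magnitude of the move.
    import matplotlib.pyplot as plt
    import matplotlib.patches as patches

    stocks = [("AAPL", 40, +1.2), ("MSFT", 30, -0.4),
              ("XOM", 20, +0.3), ("GE", 10, -2.1)]   # (ticker, cap, % change)

    fig, ax = plt.subplots()
    x, total = 0.0, sum(cap for _, cap, _ in stocks)
    for ticker, cap, change in stocks:
        w = cap / total                              # width proportional to cap
        base = (0.8, 0.2, 0.2) if change < 0 else (0.2, 0.7, 0.3)
        shade = 0.4 + 0.6 * min(abs(change) / 3.0, 1.0)  # darker = bigger move
        ax.add_patch(patches.Rectangle((x, 0), w, 1, facecolor=base, alpha=shade))
        ax.text(x + w / 2, 0.5, f"{ticker}\n{change:+.1f}%",
                ha="center", va="center")
        x += w
    ax.set_xlim(0, 1); ax.set_ylim(0, 1); ax.set_axis_off()
    plt.show()

Even in this toy, judging whether GE's loss is a darker red than MSFT's is noticeably harder than just reading the labels, which is the precision concern above.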

Norman also makes a good point in stressing the importance of choosing the right representation. This reminded me of assembly instruction manuals for furniture (also discussed in the previous lecture). I'm sure a lot of people have felt the frustration of actions and objects being misrepresented in those manuals compared to the real world.

amaeda10 wrote:

Comment on "A Conversation with Jeff Heer, Martin Wattenberg, and Fernanda Viegas, ACM Queue", page 4, line 6-. Martin Wattenberg talks about the signs of good visualization here and they were quite helpful. Especially, he says that having fun to play with your visualization is a good second sign, and this reminded me of the work of Hiroshi Ishii at MIT Media Lab.

http://tangible.media.mit.edu/project.php?recid=36

Although Ishii's focus is more on tangibility than on visibility, the outcome of this 'SandScape' is essentially data visualization. Users can interact with the model of a terrain by directly manipulating the sand in the box, and the drainage of the landscape, for example, is projected onto the sand. I used this in the Science Museum in Boston, and it was quite fun to play with. Direct manipulation is often useful (and fun) for exploring data.

rparikh wrote:

@amaeda10: I agree about the point of having fun with a visualization. The article mentioned the similarity to video games, which reminded me of my favorite video game of all time, Civilization 3. One of the fun parts of the game was the data about your position relative to your rivals, and how it was visualized in a way you could play with for minutes on end (mostly trying to make yourself look good!). I was also reminded of the sense.us project that those three worked on; I think that's a prime example of a visualization someone can literally spend hours playing with, slicing and dicing the data in hundreds of ways to tell all sorts of incredible stories about US history.

jbastien wrote:

Reading "The Power of Representation," its argument for objects being what make us intelligent, and the discussion about book's non-interactivity being good or bad (versus interacting with the author), reminds me of Neal Stephenson's "The Diamond Age: Or, A Young Lady's Illustrated Primer."

It's also very interesting to rethink the definition of cognition to take into account web technologies like weblogs and Twitter. Is cognition merely about delivering thought, or is it about exchanging thought? How do bandwidth (the size of an exchange) and latency (the time to reception and response) affect cognition? The web redefines all of these.

Back to the Illustrated Primer, we're getting closer and closer: the Kno is a quite nice tablet aimed at education http://www.youtube.com/watch?v=ubP6fnxQ-9o

jorgeh wrote:

A little bit off topic, but... we'll probably discuss typography later in the quarter, and there is a comment in Tufte's book that surprised me: "... crinkly lettering all in upper-case sans serif ..." (p. 120). I agree that all-upper-case lettering is not nice, but what's wrong with using sans-serif type in data visualizations?

trcarden wrote:

While reading Norman Chapter 3, I noticed the author went over the issues Socrates had with books, namely that readers would no longer question or "think" about what they are reading. Despite the fact that the author was trying to make a point about the power of artifacts in cognition, I started to think more critically about the rest of the passage. The first couple of pages of the chapter go into how artifacts are essential for reasoning and the communication of thought. However, I believe it is simply the filtering of information, not necessarily visualization, that is required for intelligence. Moreover, artifacts can actually take away from some elements of communication. What Norman fails to mention is that when we abstract on the fly with artifacts, some of the passion and emotion gets removed, because we _are_ having to concentrate on an abstracted, "representing" world instead of telling the story with the images of the situation in our heads. When you lack those artifacts you can still reason and communicate; it is just different.

rakasaka wrote:

"Someday, not too far in the distant future, all the information will be available on electronic devices whose displays will allow the same information to be presented in a variety of ways: different layouts for different needs". I find it ironic that even in this day in age where technology and processing power surpass our capacity to use their full potential we still struggle with filtering and displaying information coherently.

I also found it interesting to contrast this with Norman's approach to representations: the absence of a representation can suggest either that (a) we don't know how to construct it, or (b) that we can ignore it (or at least not ascribe it any value). Tufte seems to suggest that there is value in deliberately omitting representations when they are not relevant; telling the two cases apart from the visualization alone, however, is a tough thing to do. Are visualizations supposed to be effective as long as they don't force us to question whether certain representations are missing?

rakasaka wrote:

@gneokleo Speaking of fascinating representations, I thought the train schedule graphic was very impressive. If only we could use that same technique for flight information too, though something tells me it's already been done.

adh15 wrote:

In the interview piece, I was intrigued by the quote from Wattenberg: "You can almost make the analogy that having real data as a part of your project is as important as having real users look at your project... It's almost as if the data is one of the stakeholders in the project and you need its input from the beginning." This makes sense: trying to specify a visualization before you have real data merely results in a fantastic guessing game. Notably, though, Wattenberg makes no claim that you need to know all of the data. Using several samples of real data, they were able to create a generalized visualization for all data of that type. This suggests developing with a data subset that is likely representative of the whole.

At the end of the piece there is a brief discussion of bringing comments and visualizations closer together. What if the visualization itself could be the comment? Perhaps there could be an entire discussion thread composed of nothing but visualizations. Is it possible to build a tool that makes manipulating visualizations fast enough for them to be used directly in reply and debate, or will explanatory prose always be necessary? What about using recorded audio and a visible cursor to let spoken analysis complement a visualization?

Lastly, the visualization work using Wikipedia as a dataset reminded me of Ed Chi's WikiDashboard research project.

saahmad wrote:

A really poignant moment when reading "A Conversation with Jeff Heer, Martin Wattenberg, and Fernanda Viegas" was the discussion of the tradeoffs of using fake data. I think using fake data is certainly helpful very early in the design cycle, because it lets you reason about interesting edge cases and other potential shortcomings of your visualization. However, you really should not spend too much time in this phase: there will inevitably be unforeseen corner cases and malformed input once you switch to real data, so why waste time?

That nonetheless raises the question: what is a principled approach to using real data to create a scalable visualization? How do you balance building a visualization robust to all sorts of malformed input against the development time and effort required? One approach I thought of is a "scale-iterative" approach: start with a small subset of real data and rapidly create a visualization that handles it, explicitly ignoring cases that you know will not work but that do not happen to show up in the current subset. Then scale up to a larger subset of the real data and see what problems arise. Fix those, and keep iterating. You may reach the point of saying, "this visualization is just not going to work," which is fine, since you did not waste much time developing it in the first place. You may eventually find that certain issues keep coming up and have no fix; in those cases, you can write them off as shortcomings of the visualization.
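
As a rough sketch of that scale-iterative loop (the data and the "problem checks" are invented stand-ins, and Python is used purely for illustration; only the iteration structure is the point):

    # Toy scale-iterative loop: prototype against a small sample of real
    # data, scale up, and fix what breaks at each step.
    import random

    def find_problems(values, bins):
        problems = []
        if len(values) / bins > 500:
            problems.append("overplotting: too many points per bin")
        if any(v < 0 for v in values):
            problems.append("malformed input: negative value")
        return problems

    data = [random.gauss(50, 10) for _ in range(100_000)]  # pretend real data
    bins = 10                                              # initial design choice

    for n in (100, 1_000, 10_000, 100_000):
        sample = random.sample(data, n)
        problems = find_problems(sample, bins)
        print(f"n={n}, bins={bins}: {problems or 'ok'}")
        if "overplotting: too many points per bin" in problems:
            bins = n // 500        # the fix carries forward to the next scale
        # Anything still failing at full scale gets written off as a known
        # shortcoming, or the design is abandoned cheaply.

The fix found at each scale carries forward, and a design that keeps failing can be abandoned before much has been invested in it.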

What do you guys think? Thoughts?

arievans wrote:

The dialogue in "A Conversation with Jeff Heer, Martin Wattenberg, and Fernanda Viégas" seemed to indicate a trend to me: the goals of visualization appear to be changing. Most of the projects discussed in the article were built around displaying useful visualizations for very particular sets of data or projects. For example, the baby-naming project NameVoyager was built specifically "[to] create a visualization of name popularity over time." It turned out that users were taking the information there and asking even deeper "so what?" questions. They were wondering what the rise or fall in popularity actually meant for names. What trends could be observed about names in general? This sort of social interaction was not expected by the designers, but it elucidates an important point: the baby-names visualization was just one specialized view of this sort of data. What we need to realize is that more powerful "entry-level" visualizations of data invite deeper questions, which warrant even more sophisticated or altered views of the data. This trend was built into almost every project discussed in the article.

Therefore, it is abundantly clear that there is a strong need for advanced, open-source data visualization and manipulation tools. The trend on the web in the past was to take specific datasets and figure out how to display them effectively, but I think what we are finding now is that we want to be able to visualize ANY data effectively, with the freedom to zoom out, zoom in, cut, and modify the data and views on that data in order to answer the deeper questions. I suppose that by analyzing specific datasets we develop strong visualization techniques, but if we continue only in this way we will likely never be able to visualize data at the rate it is being produced. For these reasons I am excited to hear more about developments in text-based visualization, as described by Fernanda Viegas. That seems like one of the most important directions we can move in. Very interesting interview, indeed.

msavva wrote:

An interesting point made by Tufte in the conclusion to the data-ink maximization chapter is that new designs seem hard to decipher at first because we unconsciously compare them, in terms of the cognitive effort required to read them, against older and more familiar designs. I thought this point paralleled Norman's argument that the Arabic numeral system for arithmetic is really fairly complicated and only becomes natural after a fair degree of training. It would be interesting to investigate how people's performance in reading visualizations varies with prior exposure to processing data using similar visual encodings (and to what degree experience transfers between related encodings). Of course, quantifying "performance" is a challenge in itself, but one could imagine timing simple data-reading tasks or checking whether insights into particular properties of the data are achieved.

To me it seems that one of the big problems in visualization design is first convincing people to overcome the inertia of depending on well-known old designs so that they will give new approaches a chance (this certainly seems to be the case in fields such as medicine, where I hear that many professionals are more comfortable reading 2D X-ray scans than using more modern 3D visualizations). Perhaps one aspect of visualization design worth studying is how to introduce innovations incrementally while minimizing this cognitive dissonance due to lack of previous experience.

yanzhudu wrote:

The essence of the Norman reading is to design the visualization to fit the task. To fit a task, the visualization designer must have the audience in mind. That means the designer needs to consider the educational background of the audience and what the audience is trying to do with the visualization. Therefore, though data-ink maximization is an important principle, it should not be overused to the point where the audience has difficulty interpreting the visualization. For a novel type of visualization, we could relax data-ink maximization a bit for the sake of a gentler learning curve. The converse is that the designer should leverage the audience's existing knowledge to save on non-data ink.

jdudley wrote:

As a fan of Tufte, I've always had the immediate response that "chart junk" is always bad. Today I realized that, beyond what Tufte asserts, I had never taken the time to look for empirical studies evaluating the effects of chart junk. After some searching I came across this interesting paper, which took a group of 60 people and evaluated their recall accuracy some time after seeing either a chart with chart junk or a "plain" chart. In this particular study, they found no difference in recall accuracy between the two conditions. Of course, it would be nice to find additional studies that corroborate this finding.

http://hci.usask.ca/publications/view.php?id=173

Bateman, S., Mandryk, R.L., Gutwin, C., Genest, A.M., McDine, D., Brooks, C. 2010. Useful Junk? The Effects of Visual Embellishment on Comprehension and Memorability of Charts. In ACM Conference on Human Factors in Computing Systems (CHI 2010), Atlanta, GA, USA. 2573-2582. Best paper award. DOI=http://dx.doi.org/10.1145/1753326.1753716

sholbert wrote:

In the Norman reading, I really enjoyed the redesign of the airplane schedule. There are so many details to visualize: airport names, flight duration, layovers, flight numbers, departures and arrivals. Norman makes a great comparison of the advantages and disadvantages of each iteration. Additionally, tic-tac-toe was a genius example for illustrating how the cognitive burden differs with the representation.

gdavo wrote:

In "the power of representation", I think Norman gives several interesting clues for precising the concept of "effectiveness" introduces by Mackinlay. According to Norman, experiential representations are often more effective than reflective ones because they help the user finding the relevant info and computing the desired conclusion without the need of slow and tiring cognitive reflections. I think the power of "experiential representations" is linked to the perceptual strength of position encoding already noted by Bertin and Mackinlay.

jsnation wrote:

I liked that in the 'Conversation' reading, data visualization was examined as a method of fostering communication, interactivity, and creativity between people, rather than just as a means to convey some data or story. It is really interesting to see that providing tools to create visualizations empowers people to communicate better, or to express something they previously could not. This was most clear to me in the new online visualization creation and sharing site, Many Eyes. The most interesting feature I saw was the bookmarked commenting. This feature isn't directly related to the data visualization itself, but rather to the entire experience a user has when interacting with that visualization and with other users. It seems that certain visualizations are there not just to display information, but also to foster communication between groups of people, to spark a debate, or to cause people to think about some idea and interact with each other.

In the Norman reading, Ch. 3, I also found the example comparing the number game with tic-tac-toe to be pretty enlightening. It showed really well just how much a visualization can aid someone's understanding. I remember playing the 15-count game before, but I never made the connection that it was the exact same game as tic-tac-toe, though I do remember the counting version being much harder to win or tie at.

strazz wrote:

I think "The power of representation" made several interesting points, as clearly pointed out by the previous comments. However, what picked my interest the most was the first couple of ideas regarding how the implementation of basic communication tools, and methods for representing the world, knowledge or information is what actually makes us smarter as a society, really made wonder the importance of data visualization technologies since the conception of the modern society and the pivotal role it will have in the future ( the bit about meta-representations was awesome). Also the way he explained data representation for different audiences/users with the airplane example was terrific, as we all could relate to it, but I never quite saw it as a data representation issue until now.

ericruth wrote:

One thing I noticed about many of the stacked area visualizations in today's lecture was that the y-axis was labeled with detailed percentages, for example every 10%. This seemed a bit misleading, or at least useless, because the points on the graph don't share a common base, which makes it really difficult to derive meaning from these numbers without some form of visual subtraction (which isn't very accurate). On the other hand, these y-axis labels seem very useful when all the data points do share a common base.

Do people agree/disagree with this? If you agree, do you have any ideas to circumvent this issue on graphs without a common base? If you disagree, how do you find these labels useful?
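
A minimal illustration of the issue (invented numbers; matplotlib purely as an example medium): only the bottom band of a stacked area chart sits on the axis baseline, so the y ticks read directly for it alone, while every other band's value is the difference between two stacked curves.

    # Only series A is read directly off the y-axis; B is (A+B) - A,
    # and C is (A+B+C) - (A+B), hence the "visual subtraction" problem.
    import numpy as np
    import matplotlib.pyplot as plt

    x = np.arange(2005, 2011)
    a = np.array([10, 12, 15, 20, 26, 33])   # bottom band: baseline is y=0
    b = np.array([8, 9, 11, 12, 14, 15])     # middle band
    c = np.array([5, 6, 6, 7, 9, 12])        # top band

    fig, ax = plt.subplots()
    ax.stackplot(x, a, b, c, labels=["A", "B", "C"])
    ax.set_yticks(range(0, 61, 10))          # the "every 10%"-style ticks
    ax.legend(loc="upper left")
    plt.show()

One common workaround when there is no shared base is small multiples: plot each series against its own zero baseline, so the ticks become directly readable for all of them.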

emrosenf wrote:

I really enjoyed reading Norman Ch. 3. I was astounded by the example about the airline flight information, because it is quite topical: ITA Software, whose software creates spatial flight visualizations like this, was just acquired by Google for $700M. Some friends of mine recently launched Hipmunk, a startup that applies the same visualization to Expedia search data.

I'll have to be extra diligent with the readings and look for untapped opportunities... by the way, is anyone else in the class of an entrepreneurial bent? I'd love to chat.

clfong wrote:

I find the Data-Ink principle in Chapter 4 of Tufte to be a pretty good guiding principle for what to include and what not to include in a visualization. Very often we are confronted by questions like whether we should add shades, gradients, or icons to make the chart more visually appealing, whether we should annotate all the data points with numbers, or whether we should expand the margins of a chart to preserve continuity. The Data-Ink principle, in general, is a nice rule of thumb for deciding which of these features are necessary, or will contribute to an efficient and effective representation of the data.

amirg wrote:

I came across this graphic this morning at billshrink.com, which examines the popularity of the iPhone.

They use a very different style than our visualizations (and have some different data as well), but I thought it was pretty interesting and also insightful to look at through the lens of some of the new tools we have acquired for analyzing the effectiveness of visualizations.

I also want to reflect for a moment on the visualization design assignment. For the assignment, I approached the visualization from somewhat of a design perspective, generating several prototypes then asking people to perform particular tasks with the prototypes and incorporating those results into my final design. I noticed other visualizations in class that worked from a storytelling perspective and I am curious what other approaches people took when creating their designs. I am also curious to see how the various approaches can be applied when we start to look at larger data sets. Finally, I think that going through the different designs in class today was excellent and gave me a lot of ideas for improving on my own design.

abhatta1 wrote:

In the "Representation of Numbers" by Zhang and Norman what confounds me is that Arabic numerals became popular in spite of the fact that Arabic numerals needed to be retrieved from memory while Egyptian numerals were self evident by cognition. Where does one draw the line between utility (or complexity) and simplicity (for cognition) for adoption of a graphical representation ? Also is there a way to say that the Indo-Arabic numerical system is the best represented numerical system ?

esegel wrote:

The "conversation" reading emphasizes how important it is to design with real data in mind. Just like testing designs (and products) with real users eliminates "user risk", testing visualizations with real data eliminates "data risk". This makes sense to me—especially when designing for a specific data set. However, it is unclear how to implement this principle when designing a general purpose visualization tool (e.g. excel, many eyes, tableau). What design choices do you make when dealing with diverse and unpredictable data? Is there any systematic/automated way to make the right choices while avoiding a long list of "if" statements?

My quick hypothesis: while some design features may be automatable (e.g., tick marks), others may not be (e.g., graph type). Perhaps the best way to handle this is to automatically produce a number of different charts and let the user pick the one that suits the data best, a semi-automatic approach.
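
A sketch of that semi-automatic idea (the data and chart choices are invented, and matplotlib stands in for whatever rendering backend a real tool would use): mechanically render several candidate chart types for the same columns and leave the final choice to a human.

    # Render several candidate charts for one (x, y) pair; a person picks.
    import matplotlib.pyplot as plt

    def candidate_charts(x, y, xlabel="x", ylabel="y"):
        fig, axes = plt.subplots(1, 3, figsize=(12, 3))
        axes[0].plot(x, y, marker="o")    # line: good if x is ordered/temporal
        axes[1].bar(x, y)                 # bar: good if x is categorical-ish
        axes[2].scatter(x, y)             # scatter: fewest assumptions
        for ax, title in zip(axes, ["line", "bar", "scatter"]):
            ax.set(title=title, xlabel=xlabel, ylabel=ylabel)
        fig.tight_layout()
        return fig

    fig = candidate_charts([1, 2, 3, 4, 5], [2, 5, 3, 8, 7])
    plt.show()

Low-level features such as tick placement come free from library defaults, consistent with the hypothesis that those are automatable while chart type is left to the user.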

ntatonet wrote:

@jdudley Thank you for posting a link to the Bateman paper. I think it illustrates some very interesting points about using embellishments on data visualizations. They show pretty clearly that more time is spent looking at the data when embellishments (like drawing your data as a monster) are not present. They also show no benefit in remembering the data between the embellished and the plain versions of the visualizations.

I believe there is a serious methodological flaw, however, that the authors fail to address. These graphics are most likely intended to be embedded in a print article, where the graphic competes with other graphics and text; testing them in isolation is not a fair assessment of this type of graphical technique. A more appropriate test would be to embed the graphics in an article and tell participants to read and understand the article's results, then test their recall of the graphics. Perhaps in this setting (the one they were designed to exist in) they would perform better.

felixror wrote:

I find the part in Norman's book about the representation of numbers very interesting. The notion of "quantity" is an abstract concept, and people have had to devise convenient symbols to represent it. Roman and Arabic numerals are the two most common numeral schemes. Having pretty much taken the Arabic scheme for granted, I never really thought about the importance of numeric symbols in helping us carry out arithmetic operations efficiently. Surprisingly, it is easier to carry out addition using Roman numerals than Arabic numerals, since there are fewer arithmetic rules to remember. However, Roman numerals are very bad for multiplication and division. Arabic numerals, on the other hand, allow us by design to perform multiplication and division with relative ease. I am amazed at how an efficient design of symbols plays such an important part in laying the foundation of arithmetic. We also rely on visualization to perform such operations, by writing the numbers down on boards or paper.
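
The small rule set for Roman addition can be made concrete. A sketch in Python, restricted to purely additive notation (no subtractive forms like IV, since handling those would need extra rules): addition is just concatenation, grouping, and symbol substitution, with no positional carrying.

    # Roman addition in purely additive notation: concatenate the numerals,
    # sort symbols by value, then repeatedly substitute groups
    # (IIIII -> V, VV -> X, XXXXX -> L, ...).
    REPLACEMENTS = [("IIIII", "V"), ("VV", "X"), ("XXXXX", "L"),
                    ("LL", "C"), ("CCCCC", "D"), ("DD", "M")]
    ORDER = "MDCLXVI"   # symbols from largest to smallest value

    def roman_add(a, b):
        merged = "".join(sorted(a + b, key=ORDER.index))
        for pattern, symbol in REPLACEMENTS:
            while pattern in merged:
                merged = merged.replace(pattern, symbol)
                merged = "".join(sorted(merged, key=ORDER.index))
        return merged

    print(roman_add("XIII", "VIII"))  # XIII + VIII = XXI (13 + 8 = 21)

Multiplication has no comparably local rule in this notation, while positional Arabic numerals reduce it to digit products and shifts, which is the asymmetry described above.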

iofir wrote:

After reading the passage about Socrates' objection to books and written media in general, I contemplated whether the problem really exists today. Are we more passive readers? Or do we question the information we read? I believe the problem is far more widespread than written media alone. We have become passive listeners as well, trained from childhood to watch TV without questioning the source of our news reporting. The act of "watching" is not an interactive learning and thinking process. It does not allow viewers to challenge the information presented before them, but only provides the opportunity to privately agree or disagree. The one redeeming trend that I find interesting is the online distribution of information. I generally do not favor the lower quality and reliability of independent reporting on the web, since it's often inaccurate or plain wrong. However, the opportunity to conduct a dialogue and present evidence through online forums allows readers to learn actively. The next step is to give news forums the ability to share their raw data and allow readers to present visualizations supporting their arguments. All presentations have innate bias, but a dialogue can reveal both sides of the argument. They should also include a way to forward questions to interviewees, since the questions themselves are usually biased. (Sorry about the rant; I hope it wasn't too off topic.)

pariser wrote:

Re: Tufte's chapter on data-ink maximization, I find some of the redesigns of well-standardized visualizations more effective than others. I'm currently plagued by the box plot at the bottom of pg. 125, where he has removed any semblance of data-ink indicating the interquartile range. Instead, he employs whitespace that the chart's consumer is supposed to interpret upon glancing at the visualization. My question is, how is a consumer supposed to know that the whitespace is a significant indicator of some measurement? I find it problematic that Tufte assumes every consumer will read whitespace as representing a measurement. An informal sample of officemates led to many questions about what this graphic was supposed to represent. I appreciate whitespace, but suspect that its best use in visualization is to remove clutter.

Maybe I'm just being overcritical, but I think that Tufte's use of whitespace as a visual encoding of the data undermines his general principle that we ought to maximize data-ink: if you use whitespace as a visual encoding variable, then the whitespace itself becomes "data-ink," and the background of the image must be counted against Tufte in his minimization.

lekanw wrote:

@iofir I think a major difference as time has passed is that information has become so readily accessible. With print media, and now the web, information is cheap, and we learn that it is often easier to just absorb the expert opinions of others rather than think about something ourselves. If, instead, our only source of news today were a weekly announcement, we would almost certainly ponder that information more, simply because we would want more to think about. Overall, given the way media is structured and the incentives to produce and consume absurd amounts of it quickly, this consumption-based media culture will probably remain.

zeyangl wrote:

Chapter 3, "The Power of Representation," uses very good examples to illustrate that different representations can have a huge impact on our ability to tackle problems. They reminded me of the reduction of register allocation to graph coloring. Register allocation is hard and has no obvious solutions, but once reduced to graph coloring, well-established algorithms suddenly become useful. But it's really difficult to see the connection between different problems; likewise with the tic-tac-toe example, it's really hard to see its connection to the game of 15. This is what I find most interesting about data visualization. We can theorize about design principles, number representations, chart types, and human perception all we want, but there is no way to theorize the ability to gain insight and build connections, which is largely independent of one's aesthetics and programming ability, and tied instead to problem-solving skill as a whole.

msewak wrote:

While I do not completely agree with Tufte's ink-minimizing strategy, I do agree with the minimalistic approach discussed in class. It is important to question how every element in your graph contributes to it. I am very surprised, though, that Tufte thinks a quartile plot is a good example of data visualization. While it encodes more data than the original bivariate scatter plot, it is hard to notice that the line is offset by a pixel, and what it represents is fairly unintuitive. But this chapter made me realize how much redundant ink we use on graphs even when it adds no value: how dark grid lines draw attention away from the data, how three-dimensional graphs of 1D data are misrepresentations, how hatching can have negative effects on the graph.

nikil wrote:

I loved the Norman chapter and really felt that what he talked about was simple yet very understated. He also made it engaging and enjoyable; on the whole, a fun and fast read.

As to the number comparisons:

1. Which number is bigger?

  • 284 vs 912

and

2. Which number is bigger?

  • 284 vs 312

He (the researcher) claims that the second comparison takes more time because we turn the numbers into size-oriented images in our brains, which we then compare. Since the size difference between the perceptual visualizations of the second pair of numbers is much smaller, he argues those images are harder to compare: the visualization causes the slowdown.

I think that (at least in my case), after so many years of practice, we actually compare the numbers through the Arabic numerals, checking character by character. Our mind reads each number as a whole, not as individual Arabic digits, but before the number is fully processed, our brain notices that there is an 8 in the second column of 284 and a 1 in the second column of 312; compared with the first digits, 2 vs. 3, this immediately jumps out.

Our brain is trained to pick up patterns, and this is a clear pattern we have all seen before: 8 is MUCH larger than 1 (in terms of digits). So our brain does a double take when asked to process three digits of these numbers, and that initial subconscious preprocessing is reversed by a judgment. In the first comparison, the nine takes care of this initial comparison, and the thought corroborates that intuition.

We have been working in this numerical system our entire lives, and I feel that we have moved to thinking in terms of this numeric system instead of visualizations.
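
That left-to-right account can be written down as a procedure (a toy illustration in Python, not a cognitive model): scan equal-length numbers from the most significant digit and decide at the first position where they differ, with the digit gap at that position standing in for how easy the judgment is.

    # Decide which number is bigger at the first differing digit,
    # reporting the position and the digit gap at that position.
    def first_decision(a: str, b: str):
        assert len(a) == len(b)
        for i, (da, db) in enumerate(zip(a, b)):
            if da != db:
                return i, int(db) - int(da)   # (position, digit gap)
        return None, 0                         # the numbers are equal

    print(first_decision("284", "912"))  # (0, 7): large gap, fast judgment
    print(first_decision("284", "312"))  # (0, 1): small gap, slower judgment

Whether this matches what brains actually do is exactly the open question; the point is only that the digit-by-digit scan is a coherent alternative to Norman's size-image account.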

malcdi wrote:

@pariser: interesting point on whitespace; I'm inclined to agree for the most part that encoding something with whitespace can be confusing. One instance in which I've seen it work is in visualizations of normalized data that fill a predefined space (e.g., a pie chart or a stacked graph), in which case coloring the small, irrelevant portions (the "everything else we're not interested in") white can pop out the relevant segments. But then we could just call those data portions clutter.

On a different tack: one thing I took from all the readings this week, and from the experience of doing Assignment 1, is that what matters most in designing a visualization is not a set of visualization "rules"; instead, what matters most is context. It is impossible to visualize any but the simplest data set in such a way that it tells all of its stories at once. Indeed, Norman calls it "Matching The Representation to the Task"; and it's hard to get a good representation when the task is just to create a representation. This sounds pretty simple, but I think it's something often overlooked (myself included).

@esegel: this touches on your question of how best to represent data for exploratory purposes. I think exploring the space of choices is a good approach, but of course we then sacrifice customizability. I think that as human beings we're bad at "throwing away" representations once we envision them. It's hard for us to prototype a design in Tableau and then finesse it in something like Protovis or Processing, because it won't look the same.

skairam wrote:

Tufte rails against non-data-ink and "chartjunk," but there are reasons to think that these 'unnecessary' embellishments can sometimes add to the perception and memory of a graphic.

Bateman, et al. published a paper at CHI 2010 (http://hci.usask.ca/uploads/173-pap0297-bateman.pdf) exploring the use of chart embellishments (of the type often seen in papers like USA Today). After having participants view the charts, they found that understanding was not diminished by the extra content and that recall after a 2-3 week gap was actually significantly better.

While I really like the beauty of the Tufte-esque minimalist charts, it's very easy to imagine that in a world where all you saw were perfectly honed line graphs, you may start to confuse them with one another. Perhaps the embellishments (done correctly) serve to make a chart unique and more memorable.

Of course, there is no argument which can save the chart on the bottom of p.94.

jtamayo wrote:

When talking about the audience of a visualization, Tufte mentions that it is a common mistake to "underestimate the audience." Instead, he says, "why not assume that if you understand it, most other readers will too? Graphics should be as intelligent and sophisticated as the accompanying text."

As much as I'd like for him to be right, I think that intelligence is not the only quality required to properly read a visualization. If we see visualization as a language for expressing information, learning to "read" that language is a skill that's developed with time and practice. In today's world an expert in visualization will be much more "literate" than his audience. Given the usefulness of data visualization, however, it is appropriate to educate the audience by slowly pushing the envelope of what is commonly known.

rroesler wrote:

An observation:

While I was working on Assignment 1, I drew up a number of different graphs all representing the same set of data. I found that someone's very first impression of a graph, made in the first few seconds of looking at it, influenced the way the data was interpreted. For example, I had a stacked area graph of total units sold, with Android on the bottom. Plotting this, you would see that the market as a whole (represented by the top of the stacked area) took a sharp upturn starting in 2010. Android also took a significant upturn, but less so than the market as a whole. Showing it to one of my friends, he was surprised when I said Android had almost twice the growth rate of the market as a whole (98% for Android, 55% for the market, calculated as [(newValue - oldValue) / oldValue]). Instinctively, he had interpreted the slope as the rate of growth, when it actually represented absolute change in total units sold.
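
The arithmetic, made explicit (the unit counts here are invented, chosen only to reproduce the 98% and 55% figures):

    # Growth rate as (new - old) / old, for invented unit counts.
    def growth(old, new):
        return (new - old) / old

    android_old, android_new = 100, 198        # hypothetical units sold
    market_old, market_new = 1_000, 1_550

    print(f"Android: {growth(android_old, android_new):.0%}")  # 98%
    print(f"Market:  {growth(market_old, market_new):.0%}")    # 55%
    # Yet on a stacked area chart the market band's slope reflects the
    # absolute change (550 units), which dwarfs Android's (98 units),
    # inviting exactly the misreading described above.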

Situations like this occur not because the data itself or even the graph is misleading, but because first impressions play such a large part in the human experience. I think this would be especially true in areas like news reporting. On TV, in print, or on the Internet, some form of data visualization is frequently used, and the graph itself may be seen for less than a minute, so the first impression is vital.

On a side note: a good example of an interactive graphic can be found at USA Today: http://www.usatoday.com/money/economy/2009-02-06-new-jobs-growth-graphic_N.htm

asindhu wrote:

I found the reading "The Power of Representation" to be particularly interesting regarding the point made about the link between visualization and cognition. In lecture. Prof. Heer mentioned that visualizations aid in cognition, but I think the reading went even further than that. The particular example that interested me the most was regarding the measurement of how long it takes someone to tell the difference between two numbers. The fact that there is a perceptible difference in the time it takes to determine which number is larger depending on the magnitude of the difference, the author concludes, is an indication that the mind transforms this task into a visual one, as he demonstrates with the bar chart.

I found this conclusion fascinating because it underscores even more clearly the importance of designing the right visualization for the task at hand. Based on the author's theory, one could conjecture that almost every problem we try to solve is somehow visually represented in the mind before we are able to solve it, making visualization an integral part of the problem-solving process. In some sense, a well-designed visualization removes the cognitive burden from the viewer so that the conclusion or solution becomes almost immediately apparent. To me, this is probably the most compelling case for the importance of visualization that I have seen so far.

ankitak wrote:

@jorgeh: Similar arguments have been put forth in various articles and position papers. Traditionally, there has been widespread support for serif fonts. You might find the discussion in this paper interesting. This article exhaustively discusses many aspects of these fonts and concludes that there is no hard difference between the two. This article, however, puts forth a simple idea that seems to be the de facto standard these days: use serif fonts for print, and sans serif for online work.

anomikos wrote:

I love the fact that both Tufte and Martin Wattenberg highlight the importance of people in visualization. Whether acting as consumers or as creators/contributors, people generate the data for the vis, they contribute to the actual creation of the vis, and ultimately they are the ones who interpret the vis. One important aspect that Tufte barely touches on in Chapter 6 is that visualizations may need to differ depending on who is viewing them. The perception and intellect of the audience play a significant role in the success or failure of a visualization, so one might have to consider altering specific elements of it depending on the viewer. Actually, in the world of interconnection and Facebook, having dynamic visualizations of the same data based on the viewer's profile is not that unlikely.

Here is an interesting paper about social visualization of software projects and how it might promote collaboration, much like the Wikipedia example presented in class: http://social.cs.uiuc.edu/papers/pdfs/codesaw-ieee-mm.pdf

andreaz wrote:

I was very interested in the part of "A Conversation" where Viegas talks about finding patterns in Wikipedia edits that revealed different types of users. It's really interesting to find patterns in data about people that they themselves may not be conscious of. We reveal so much about ourselves just through our activities, and it's fun to see that translated into a visualization. Viegas's example made me think of the site wefeelfine.org, which visualizes feelings expressed in newly created blog posts. The site's algorithm looks for the phrases "I feel" or "I am feeling," associates the remainder of the phrase with a particular feeling, and extracts the age, gender, and geographical location of the author.
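
A guess at the flavor of that pattern matching (a simplification for illustration, not the site's actual algorithm): find "I feel ..." or "I am feeling ..." in a post and keep the next word as the feeling.

    # Extract a "feeling" word following "I feel" / "I am feeling".
    import re

    FEELING = re.compile(r"\bI\s+(?:feel|am\s+feeling)\s+(\w+)", re.IGNORECASE)

    posts = [
        "Honestly, I feel hopeful about the launch.",
        "I am feeling exhausted after the deadline.",
        "Nothing matching here.",
    ]
    for post in posts:
        m = FEELING.search(post)
        if m:
            print(m.group(1).lower())   # hopeful, exhausted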

Like others in this class, I was intrigued by the train visualization in the Tufte book, because it maps the distance between cities, the speed of the train, and the departure and arrival times all at the same time. Visualizing the rate of change in speed is particularly satisfying, and I wish there were a way to use the GPS data on our phones to visualize our individual speed throughout the day.

jeffwear wrote:

While I was initially apprehensive, I found our group discussion of our visualizations very useful. Comparing my work to others' made me realize that options I had thought unavailable were feasible (and advisable) after all, like providing an ordinal color scheme for the OSes' ribbons in a stacked area chart, even though it could only be based on one side of the data. I realized that I could have, and should have, been more open-minded in my design.

Regarding flexibility in design, Prof. Heer's observation that I had used R, and his subsequent commentary, struck me as very astute. His caution that we ought to consider whether the tools we use allow us to easily make changes described my exact experience. While I had felt (and still feel in retrospect, given my current, limited experience) that R was the most apt tool for designing this first visualization, certainly a finer tool than Excel, I was just learning its graphics facilities. As a result, there were a number of elements of the graph, such as the background, that I did not feel comfortable changing, and that I then could not fully account for, despite Prof. Heer's admonition to defend every pixel.

As I examined others' graphs in class, I badly wanted to go back and alter my own. I might even benefit from creating visualization prototypes to explore alternatives. But even given the power of R (or perhaps because of its indeterminate power), altering my visualization turned out to be a bit of a chore. As this class proceeds, I hope to learn tools that facilitate flexibility.

acravens wrote:

@jtamayo I too have been thinking about "visualization literacy." I think all the readings hint at the importance of suiting the design to one's audience, and I notice similar themes in others' comments as well. But I also sense a tension between maximizing the absolute "quality" of the design (whether in Tufte's terms or by another measure) and concerns about the audience's comprehension and preferences, or the suitability of that design to that audience. If visualization is a communication medium, a way of thinking with data, something we can think of as analogous to reading, then I think there are a few implications. First, visualization will have a certain number of "right" ways of expressing information that sound nonsensical or uneducated when violated, similar to the way written text has grammar rules. These would come from the more empirical, fundamental perceptual studies. Second, it will have style conventions that hold in certain contexts for certain audiences but (like in Pirates of the Caribbean) are less rules than guidelines. Third, both the style guidelines and the grammatical rules will evolve over time amid probably controversial debate. With writing, that evolution was primarily driven by human creativity and experimentation, whereas with visualization I think the pace of technology innovation will make the evolution much, much faster. Finally (and this is probably the most important point, given the tension I see in other comments and some of the readings), "visualization literacy" will exist on a continuum. The kinds of encodings appropriate for a 2nd grader playing with a large data set for the first or second time will be similar in spirit, but not identical, to those for a newspaper reader or a scientist.

mariasan wrote:

@jtamayo I reacted to the same thing and agree with you. I wish that instead of working from the assumption that everything that's clear to oneself is clear to others, Tufte would take a more "user-centered" view of visual design, if you will. Just as we need to test our interaction designs on users to gain insight into where the design should better fit the user, I'd argue we should do the same for visualizations.

An excellent example of this is that I spent so long thinking about how to display the data that I completely forgot to include the source (I already knew it was reliable). Asking a few people to look at the visualization would have revealed that flaw pretty fast, I'm sure.

I also really enjoyed the critiques of the class designs. In a sense they worked as a really fast but effective user test, and gave me some great insights into visual design trade-offs.

nandu wrote:

@jtamayo your analysis almost seems like a nature-versus-nurture take on sophisticated visualizations, and my own intuition agrees with you.

Should visualizations always be built to be self-evident and simple for the audience? In general, that does achieve our goal of presenting data well. But what about making a game out of it and having the audience solve a small puzzle in the visualization to unlock it? Might they enjoy that more?

Also, I found the Wired infographic that my team looked at in class, at http://www.wired.com/special_multimedia/2008/pl_music_1609. We are collaborating on it at VizAnalysis1GirlTalkInforgraphic.

selassid wrote:

I was really excited to see everyone else's visualizations (albeit each very briefly). So much of this class seems to be about learning the visual language, and it's easy to forget just how many "words" are out there to describe the same thing. Getting more exposure to these words will probably improve the quality of later visualizations. It's also rare that you get to see so many stabs at the same data and come to understand the shortcomings of different design decisions; in the NYT, you only see the chart that worked, never the one that didn't, annotated with the reasons they plan to scrap it. That made for a great learning experience.

I also like @acravens' comment about how newspaper visualizations, scientific visualizations, and 2nd-grader visualizations will all use slightly different visual grammars. Perhaps discussing the difference between sci-vis and info-vis is like discussing the difference between an academic paper and a newspaper article: they're both clearly words, but one recognizes the differences in diction and structure that tell them apart.

hyatt4 wrote:

I recently read Malcolm Gladwell's book Outliers, which had some thoughts overlapping with Zhang and Norman's paper on the representation of numbers. In one small section, he discussed why he believes Asian cultures have an advantage when it comes to performing arithmetic. Gladwell discussed the complication that arises in the English language with numbers above ten. For example, eleven, twenty, or forty take some practice for young children to get used to. A mental conversion takes place when someone reads the problem 20 + 30 and says "twenty plus thirty" in their mind before adding the two numbers together. By contrast, the Chinese numeric system represents 20 as two-tens and 30 as three-tens. Addition in this case is much quicker, as it just becomes five-tens, without the language conversion as a middle step.
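
A toy rendering of that naming scheme (the English wording "two-tens three" is a hypothetical gloss, purely for illustration): Chinese-style names read the base-ten structure off directly, while English interposes irregular words like "eleven" and "twenty."

    # Gloss a two-digit number the way the Chinese system structures it.
    DIGITS = ["zero", "one", "two", "three", "four",
              "five", "six", "seven", "eight", "nine"]

    def chinese_style(n):
        assert 0 <= n < 100
        tens, ones = divmod(n, 10)
        parts = []
        if tens:
            parts.append(f"{DIGITS[tens]}-tens" if tens > 1 else "ten")
        if ones:
            parts.append(DIGITS[ones])
        return " ".join(parts) or "zero"

    print(chinese_style(20))  # two-tens
    print(chinese_style(30))  # three-tens
    print(chinese_style(23))  # two-tens three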

avogel wrote:

My favorite bit from the reading was Tufte's critique of "... the worst graphic ever to find its way into print" (reproduced at this site). I showed this one to my roommates, and I'm convinced that even without my own amused introduction they would have come to the same conclusion. Seeing charts like this one makes me want to see a contest for visualizations like the Obfuscated C Code Contest.

jorgeh wrote:

@ankitak Thanks for the info!

nchen11 wrote:

@jorgeh I too was unaware of the various reasons for picking serif over sans-serif fonts (or vice versa), so thanks also to @ankitak for the info.

However, in the context of pg. 120, it seemed as though Tufte was complaining more generally about how too many journals use the same cookie-cutter charts for everything: they all have the same fonts, same patterns, same labeling, etc., and are all equally uninformative and lacking in insight.

Also, while I found myself agreeing with most of the book, I, like a few of my classmates, found the quartile plots in Ch. 6 more lacking in information than "minimalistic." The offset quartile plot is overly subtle, and I find it hard to believe that anyone would even notice it, much less understand what was going on. That whole section seems almost like a parody of Tufte's own minimalistic approach; sure, using less ink is good, but there is a limit...
