Lecture on Nov 2, 2009.

Guest Lecture by Mira Dontcheva, Adobe Research.


  • Required

    • Summarizing personal web browsing sessions, Dontcheva et al. UIST 2006. (pdf)

    • Vispedia: Interactive visual exploration of wikipedia data via search-based integration, Chan et al. InfoVis 2008. (pdf)

  • Optional

    • Zoetrope: Interacting with the ephemeral web, Adar et al. UIST 2008. (pdf)

    • Web Tables: Exploring the Power of Tables on the Web, Cafarella et al. VLDB 2008. (pdf)

    • Attaching UI enhancements to websites with end users, Toomim et al. CHI 2009. (pdf)

    • Relations, cards, and search templates: User-Guided data integration and layout, Dontcheva et al. UIST 2007. (pdf)


wchoi25 wrote:

I really liked this lecture as it was somewhat different from other visualizations we examined. This is about creating a visualization on the go, as you forage through the web, extracting information across sites and amid things you don't care about. What is really challenging is that all of this has to happen through an interface that is minimal and does not disrupt the work the user is currently performing. I think the problem of constructing an automatic visualization of a trail of one's browsing history would be interesting.

nmarrocc wrote:

I love Dontcheva's high-level approach to collecting web functionality. This reminds me a little bit of webclips, a feature of Mac OS X 10.5 that I got really excited about but that turned out to be a little useless. This also reminds me of what Scott Klemmer calls "dovetail" vs. "hot glue" programming. In traditional programs, objects are connected at a lower level, and the programmers usually understand both objects pretty well. With the "hot glue" approach, by contrast, you can have "objects" or pieces of the web interacting in ways the original programmer never thought possible.

I think the web is pretty cool because it allows for us to consider this kind of emergent behavior. I think good design plans for emergent behavior. It would be interesting to visualize the parts of the web that were built on top of other things that no one could have predicted when those original things were written.

bowenli wrote:

The thing that strikes me the most about user-created cards, etc. is the power you get by feeding that information back to other people. I hate to say it, but the whole Web 2.0/crowdsourcing approach will work really well here. When someone has already taken the time to distill all the relevant information on a site and present it in the most efficient way, the result is very useful information. It is useful not only for other people: if you are the owner of a website, it would definitely be of interest to see what people really care about on your site. That kind of insight can help drive future designs of the original site or help to better optimize the knowledge-seeking experience.

vagrant wrote:

While I believe that the personalized web cards can be of use to many people, I find that personally, I feel flexible and agile enough with a decent web browser to manage as many as five separate browsing sessions at once.

And I prefer it that way.

I've attempted to visit feeds and aggregate news sites before, but almost always found myself drawn to the personality and presentation of a few favorite sites. So for me, selective web retrieval could only be of very specific use; I can't see my average browsing session being served particularly well by it.

At the same time, I believe that I would be considerably more appreciative of the feature on a mobile device, where web browsing is a pain and where I only give the browser about two seconds of my attention for every ten I spend using it (the other eight seconds are spent waiting for the pages to render). In this case, I feel that I would rather just have the information most pertinent to my browsing goals--a complete experience is not what I am looking for.

jieun5 wrote:

I share a similar sentiment with @vagrant: personally, I think the time and effort required (though made minimal by the cool automated data-collection stages) is often not worth the benefit of personalized cards. For instance, the time taken to compile information and tweak the design of cards will often not be worth the easy comparison of features of interest that the interface allows, especially when the decision is about hotels and restaurants, for which I do not necessarily need to find "the best" option for me.

I really liked the idea of Zoetrope, though. Capturing and easily viewing changes to websites seems like an invaluable capability, especially for sites that display news, shopping, or sports (i.e. sites whose content changes rapidly). I liked the easy mouse gesture required to browse the entire "historical" timeline of a specified piece of content.

nornaun wrote:

Although I view the system as a gateway to an enhanced web exploration experience, I don't plan to use the personalized web cards myself. Like vagrant and jieun5, I think making a personal web card is too much trouble for the small tasks I do. Normally, when I do a small task like making a travel plan, the tools I use to record information are just pencil and paper. I think using pencil and paper is quick and convenient. It also allows me to literally "grab" the information I have, moving it around to consider my alternatives or passing it to my friends.

Surely the digitized version is quicker, cleaner, and probably more efficient once you get used to it. Still, I believe there are a few people who, like me, are attached to the paradigm that treats information as physical entities. Perhaps novel technology such as augmented reality can help blend digital and physical data/visualization together.

zdevito wrote:

I like how the web cards provide both a useful interface to the end user who wants to organize information they gather on the web and a means to use crowdsourcing to extract semantic information from websites not designed to be understood by computers. The semantic web seems like a chicken-and-egg problem: no one wants to spend the effort to make a computer-friendly webpage if there is nothing to be gained by doing it, while no one can really make a killer app for the semantic web if there is no data for it to use. I think web cards provide a middle ground. They are useful for the end user, and the cards can also be used to extract semantics from a particular webpage.

alai24 wrote:

The web card idea is very tantalizing, since I do a lot of my shopping online, resulting in a pile of bookmarks from review/merchant sites. Especially when comparison shopping for computer parts or a digital camera, I end up putting everything into a Google Doc. I think the way the system extracts data from the DOM is pretty interesting; the challenges posed by changes in the website layout seem pretty tough.

malee wrote:

I am really fond of the notion of effectively capturing (parts of) websites and their histories for personal use. My bookmarks are often abandoned, and my current information foraging/retrieval on the Internet involves a lot of context switching between browsers & my favorite text editor. Thus, I would probably use Mira's system even if it took me a little bit to get the hang of it. I can also imagine other less practical uses for it, such as creating a fun scrapbook or collage and sharing with friends. Also, I found both the web cards and the Zoetrope project to be really interesting because of the way they allowed meta-interactions that were not pre-determined by the website itself.

rnarayan wrote:

I tend to agree with some of the comments that the webcards system has diminishing returns for most everyday use, including normal travel plans. However, for a long-range project such as shopping for a home, or a time-insensitive project such as reference-collecting, it may be well worth the trouble. Here again, a few more observations:

a. To me the organizing principle (as a collection of documents/books) and rendering (3D) originally proposed by Card et al. from PARC in WebBook/WebForager seem preferable - a cool demonstration of this was in the movie "Disclosure" several years back.

b. Given my parameters of use (for important long-range tasks such as shopping for a home), the content needs to be evergreen and up-to-date automagically, without user refresh (and perhaps subsequent fixing of broken links). So, some form of publish/subscribe (SOAP, etc.) service seems mandatory to keep the content up-to-date. If such collected/stored content is an important dossier or if pub/sub is not supported by the publisher, a more heavyweight continuous query (OpenCQ) approach could be an option.

c. Given that the tool is meant for a specific task based on preferences of a specific person, further narrowed down to a specific time window, it is somewhat hard to see a crowdsourcing model building around it to generate the kind of collective intelligence as was discussed (@bowenli) - as a trivial example, a lot of people create lists on Amazon and iTunes - do we visit other people's lists often? - or do we have time to just look at the bestselling list?

d. Which brings me to my next point - content/service aggregators such as Amazon and Orbitz are continually trying to supersize and to be the be-all and end-all for everyone by cross-referencing and creating marketplace services and marginalizing smaller players. So, here again, there is a danger of these tools not reaching their potential.

The Zoetrope lens interface and search paradigm were cool - as mentioned in the paper, being able to extend them to different types of content/documents could be invaluable.

cabryant wrote:

Wikipedia, like most collaboratively generated data sets, is equal parts appealing and frustrating, due to its semi-structured format and lack of a strict schema. Given this amalgam of opportunity and limitations, Vispedia's interactive visualization construction model represents a reasonable approach to exploring data.

That being said, two thoughts immediately come to mind as potential augmentations. The first is the integration of search capabilities to identify potentially related tables for visualization purposes (perhaps leveraging the work described in the Web Tables paper). The second is the potential to explore changes in data over time, as recorded in Wikipedia through document history. Given that tabular data is likely more stable than text, it may be interesting to visualize which elements of a data set have changed over time, the reasons behind the changes (new information, revision of old information, censoring, etc.), and the instigator of the change.

fxchen wrote:

I really dug this lecture. I think fusing multiple data sources and making sense of these disparate sources is a very interesting problem. It is currently the thrust of my final project.

I was curious as to the current progress of semantic searching at Adobe. Also, I would be very interested in the coming years to see how bottom-up organization of the web (e.g. Twitter hashtags) will enhance our exploration and interaction with the internet.
