Robert Kosara’s Infovis example illustrates the Chris Rock effect: a pleasurable intellectual effort spent in discovering something obvious that could’ve been noticed (and even quantified) much more easily and directly via a simple dot and line plot

(by Andrew Gelman)

Last week I pointed you to a pair of dueling mini-articles on information visualization and statistical graphics.

In the infovis article, Robert Kosara gives an example that I think perfectly illustrates one of my general points on the difference between the two approaches.

My remarks below might appear harsh but I don’t mean them to. As Antony Unwin and I have written, we take as a starting point that infovis has value. In exploring the different goals of infovis and statgraphics, we are not questioning the values of infovis but rather trying to move toward a future in which the best ideas from both approaches can be used to understand and communicate quantitative information.

OK, now to the example. In his article, Kosara writes:

A common question in time series data is whether the data is periodic, and if yes, what the period is. A common way of finding out is drawing the data on a spiral. By changing the number of data points that is shown per full round the spiral makes (that number is constant, of course), patterns become visible. Figure 1 shows an example of sick leave data that has an interesting periodic pattern: in 28 days, there are four periods, which means that there is a weekly pattern: more people call in sick on Mondays than later in the week.
The way this pattern was discovered is deceptively simple. All it took was to play with a slider that allowed the user to change the number of days shown on the spiral. Slide it back and forth, and soon you will see a pattern (if there is one). With a bit of practice, you can even tell when you’re getting close, as there are telltale signs around the optimal value.
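
To make the slider idea concrete, here is a rough numerical analogue. It is a sketch under my own assumptions, not Kosara's code: the data are synthetic, and the function name fold_score and the weekly_shape values are invented for illustration. Wrapping the series every `period` points is what one turn of the spiral does; the score is the variance of the per-position averages, which should be large only when the wrap length lines up with a real cycle.

```python
import random
from statistics import mean, pvariance

def fold_score(series, period):
    """Wrap the series every `period` points and return the variance
    of the per-position means. A high score suggests a repeating pattern."""
    columns = [series[i::period] for i in range(period)]
    return pvariance([mean(col) for col in columns])

# Hypothetical data: 52 weeks of daily sick-leave counts with a Monday bump.
# These numbers are made up for illustration; they are not Kosara's data.
random.seed(0)
weekly_shape = [9, 6, 5, 5, 5, 2, 2]  # Monday through Sunday (assumed)
series = [weekly_shape[d % 7] + random.uniform(-1, 1) for d in range(364)]

# "Move the slider": try every wrap length from 2 to 30 days.
scores = {p: fold_score(series, p) for p in range(2, 31)}
best = max(scores, key=scores.get)
print("best wrap length:", best)
print("score at 25 days:", round(scores[25], 3))
print("score at 28 days:", round(scores[28], 3))
```

Any multiple of 7 scores well here, which matches the figure: 28 days shows four repeats of the same weekly cycle.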

Here’s Kosara’s graph:

https://i2.wp.com/andrewgelman.com/wp-content/uploads/2011/07/Screen-shot-2011-07-22-at-5.43.06-PM.png

with the following label:

Figure 1: Spirals are useful for finding periodicity in data. (a) The bar chart shows no obvious periodic pattern; (b) the spiral set to 25 days hints at a periodic pattern, but this is clearly not the correct time frame; (c) at 28 days, the pattern is very clearly visible.

I agree the graph is pretty but I really don’t see the point. Of course you’d want to look for day-of-the-week patterns in sick leave data. Instead of the pretty pinwheel, I’d much prefer a simple dot and line plot showing the data and averages for the 7 days from Monday through Sunday. That would give me quantitative information.
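
Here is a minimal sketch of the numbers behind such a plot. The counts are invented for illustration, and I am assuming the series starts on a Monday; the helper by_weekday is a hypothetical name, not anything from Kosara's article. Each weekday gets its raw points (the dots) and its average (the line).

```python
from collections import defaultdict
from statistics import mean

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

def by_weekday(counts, first_day=0):
    """Group daily counts by weekday. `first_day` is the index into DAYS
    of the first observation (0 = Monday, an assumption here)."""
    groups = defaultdict(list)
    for i, count in enumerate(counts):
        groups[DAYS[(first_day + i) % 7]].append(count)
    return groups

# Hypothetical four weeks of sick-leave counts (made up, not real data).
counts = [9, 6, 5, 5, 4, 2, 1,
          8, 6, 6, 5, 5, 1, 2,
          10, 7, 5, 4, 5, 2, 2,
          9, 5, 4, 6, 6, 3, 3]

groups = by_weekday(counts)
averages = {day: mean(groups[day]) for day in DAYS}

# One row per weekday: the individual points and their average.
for day in DAYS:
    print(f"{day}  points={groups[day]}  mean={averages[day]:.2f}")
```

Plotting the points as dots and connecting the seven averages with a line gives the graph I have in mind, and the size of the Monday effect can be read straight off the axis.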

What does the swirly graph tell me? That there’s a weekly pattern. Which I would’ve easily noticed using a simpler, more direct graph.

Kosara’s Figure 1 is an excellent example of something I talk about a lot regarding infovis: It’s a graph that invites the reader in, it’s intriguing and appealing, and what it leads the reader to is . . . discovery about the method used to make the graph. If you look at the swirly graphs and think about them, you can say: Hey, cool: It’s a spiral graph, each twist is 28 days, there are four blue zones, 28 divided by 4 is 7 . . . hey, there are 7 days in the week! Yeah, that makes sense, sick days happen by the week! Lots of work to convey a very simple piece of information. The joy of discovery, applied to discovering the familiar and expected: it’s the Chris Rock effect all over again.

In contrast, a dot-and-line plot showing the data by day of week would convey much more information, much more clearly and transparently. But from the standpoint of infovis, clarity and transparency are a minus, not a plus! A confusing graph can invite more reader involvement.

15 Responses


  1. 1 Stephan July 28, 2011 at 4:01 pm

    This comment is totally unrelated to this post, but I stumbled upon this really cool blog post, “How an arcane statistical law could have prevented the Greek disaster.” I thought I’d bring it to your attention because it is also fun for non-statisticians.

  2. 2 Nick Cox July 29, 2011 at 12:03 am

    Jacques Bertin made the same criticism of spiral plots in his Semiologie graphique (1973, p.214). For many readers this reference may be more accessible via Edward Tufte, The visual display of quantitative information (1983, p.169).

    Very minor point: statgraphics is a registered trade mark, so arguably should not be used in reference to a generic field. See http://www.statgraphics.com

    With any luck, infovis will soon become one too, and then linguistic conservatives such as myself will be spared this barbaric contraction except in reference to a proprietary product.

    • 3 statisticsforum July 29, 2011 at 1:33 am

      Nick:

      I think the First Amendment to the U.S. Constitution ensures that I can use trademarked terms whenever and wherever I want!

      • 4 Nick Cox July 29, 2011 at 8:27 am

        I am not saying it is illegal, let alone unconstitutional. I am saying there is ambiguity which is better avoided. It would be a courtesy to the people whose livelihood is the program of the same name to avoid using it. Disclosure: I have no connections with that company.

  3. 5 Enrico Bertini August 2, 2011 at 9:10 pm

    Dear Gelman, thanks for your work. I appreciate your points from this post and the next one but, without any intention to offend you, I think your use of the term InfoVis is plain inappropriate.

    When I read in your article: “But from the standpoint of infovis, clarity and transparency are a minus, not a plus! A confusing graph can invite more reader involvement.” I can clearly tell this is not infovis.

    Infovis is a very serious discipline with fervid advocates for clarity and effectiveness since the second half of the nineties. Many of them have run countless studies to understand when and how a technique works. I’d bet some of them would be surprised, to say the least, to hear that for infovis “clarity and transparency are a minus”.

    To tell the truth, infovis is deeply rooted in statistical graphics, as well as in geography. If we really want to draw a line between the two, infovis has a stronger focus on interaction and the use of visualization as a means for data exploration and discovery rather than presentation. But even that may sound too orthodox to some ears.

    I am surprised to see the word infovis used in this way, and I wonder why, because I know for a fact how many people in my community absolutely detest useless and catchy visualizations.

    You might want to refer back to the work of Ben Shneiderman, Jock Mackinlay, and Stuart Card (not only their book but their individual works). They introduced the term, are among the fathers of the field, and very respectable scientists.

    With respect,
    Enrico Bertini
    fellinlovewithdata.com

  4. 7 Jorge Camoes August 2, 2011 at 10:46 pm

    I don’t have to tell you how misleading an average can be. It will not tell you that more people call in sick on Mondays. There may be an outlier that explains higher averages. A chart will reveal the truth, so I advise you to use a chart with all the data first…

    We could reverse your argument and say that statistical graphics are shining examples of clarity and transparency and, in this case, that’s a plus, not a minus, with the added bonus of not inviting more reader involvement. This is absurd, right? Maybe we can agree that both arguments don’t make much sense, then.

  5. 10 Jen Lowe August 3, 2011 at 4:52 am

    This is a recurring argument against the aesthetics of infovis: that a straightforward, utilitarian solution is better, faster, clearer. Often these arguments ignore the issue of audience, which has happened (to some extent) here. With different emphasis, you’ve written:

    “Of course _you’d_ want to look for day-of-the-week patterns…”
    “_I’d_ much prefer a simple dot and line plot…”
    “Which _I_ would’ve easily noticed using a simpler, more direct graph.”

    Ah! I think, but you are not everyone, and others will be more engaged with the swirls. And then you surprise me by acknowledging the audience, recognizing that the draw of the swirly graph is that it “invites the reader in, it’s intriguing and appealing.”

    The real criticism here seems to be that the data presented is “familiar and expected.” In short, it’s boring. From my understanding, Kosara’s visualization was created for instruction and example, and doesn’t profess to present the new and unexpected. Boring is boring, and this data is still boring as a dot-and-line plot. Except if you presented sick days as a dot-and-line plot, no one would look at it.

    Kosara has wrapped boring up in beautiful; he has made us want to look deeper, but sadly there’s not much inside. Like someone giving you birthday socks gift-wrapped in gold and silk, it feels like maybe someone’s playing a trick, and that’s uncomfortable.

    Should we only ever visualize significant meaningful data? Impossible. We have to practice to improve; the more beautiful wrappers we make, the more prepared we’ll be when we find the right data. Only a few people make consistently important visualizations – Hans Rosling and Laura Kurgan come to mind. The rest of us might have a limited number of bright moments where profound insight and beauty intersect. More often we’re slogging through boring inconsequential data, hoping to make something of the mess.

    And if we care about our work, we want it to be beautiful, even if the data is unremarkable: phone calls and sick days. Is it deceitful trickery to make the mundane beautiful? Maybe. Maybe we’re wrapping socks in silk. I’d rather have the pretty wrapper to look at, to engage with, and perhaps (as in the case of Kosara’s example) to learn from, and one day apply to a more profound periodic dataset.

    I think the aesthetics/utility, statgraphics/infovis argument is false and tired, and I’ve only just shown up to this party. More than tired, I think the argument distracts from a higher concern: in our limited time, to what data should we apply our attention?

    You conclude your Chris Rock piece: “The graphs that Seth hates so much do their job in that they look unusual and draw the viewer in to look more carefully and rediscover familiar truth. After that, though, there’s not much more there, and it would be great if they could link to something more informative.”

    With this I agree: as much as is possible, people who can make compelling images from data should focus our visualization efforts and (I would say) our discourse on important, meaningful data. If the data is as deep as the presentation is beautiful, the rest falls away.

    –Jen

    • 11 statisticsforum August 4, 2011 at 2:19 am

      Jen:

      Beauty is great. I like beauty. But beauty and informativeness are different goals. It should be no surprise that if you optimize on beauty, your graphs will be less informative, and if you optimize on informativeness, your graphs will be less beautiful.

      • 12 Jérôme Cukier August 4, 2011 at 9:57 am

        Hello, I beg to differ on your last statement.
        The relationship between beauty and informativeness is not a trade-off, but rather strongly positive.

        A chart which is not well-executed presents asperities to a viewer and diverts some of their attention to such oddities (say, unfortunate choice of colors, misalignments, illegible text) rather than to the point of the chart. As such, it is less efficient.
        Conversely, a chart which is beautiful can engage the viewer to commit more, to seek information more actively. Interestingly, such a chart can drive the viewer to search for a meaning even if the initial display of information is not obvious at first sight. When there is such a stumbling block and the viewer perseveres, understanding is boosted (“desirable difficulty”). You find this in Kosara’s chart, but more famously in Minard’s one.

        Beauty also comes from informativeness. For instance, a properly-styled web page is beautiful. Such beauty doesn’t come from a lavish artistic work by web designers, but from balance, appropriateness, and clarity. Likewise, Tableau charts, which are mostly automatically laid out, based on standards designed for maximum legibility, have a very strong aesthetic appeal, even though they are very plain.

        So, I feel that if you optimize for informativeness, beauty follows, and if you neglect beauty, you restrict informativeness.

        (though I would concur that going for aesthetics with no respect for infovis principles doesn’t often go well).

      • 13 Jon Peltier August 4, 2011 at 4:37 pm

        I have yet a third point of view. Statisticsforum says that beauty and informativeness are anti-parallel, while Jérôme says they are in fact parallel. It seems to me beauty and informativeness are orthogonal, and it is a clever designer who can produce something that is both informative and beautiful, in the first quadrant, if you will.

      • 14 statisticsforum August 5, 2011 at 2:17 am

        Yes, we certainly can try to be both beautiful and informative. But beauty and informativeness are different. If you optimize on A, you won’t be optimizing B, and vice-versa. (Also people have different definitions of beauty and even of informativeness.)

  6. 15 Ilya Boyandin August 4, 2011 at 6:49 am

    In my opinion, the goal of InfoVis is to produce effective data representations, not fancy or pretty ones, and to prove their effectiveness. InfoVis people are not afraid of experimenting with new things and trying out non-standard visualizations. This is all trial and error, so no wonder that in many of these experiments the outcome ends up being inferior to more conventional methods. But in the end, the gain in effectiveness can be huge.

    My personal favorites are OD-Maps, which represent migration patterns very efficiently. The standard way of visualizing such data would be this, which is much less effective when the number of flows is large (check http://flowmappa.com/ for more details).

    Indeed, as for most non-standard visualizations, there is a learning phase. But as soon as it is accomplished, using the more efficient representation can give you an enormous advantage. Also, if the target audience of your visualization is data analysts, the learning phase is quite affordable.




About

The Statistics Forum, brought to you by the American Statistical Association and CHANCE magazine, provides everyone the opportunity to participate in discussions about probability and statistics and their role in important and interesting topics.

The views expressed here are those of the individual authors and not necessarily those of the ASA, its officers, or its staff. The Statistics Forum is edited by Andrew Gelman.

A Magazine for People Interested in the Analysis of Data
