2011-02-14

Analysis and the 'So What' Question

While at Strata I had an opportunity to participate in quite a few sessions that demonstrated how to take raw data and analyze it with various tools. The output was usually a set of graphs and charts, though sometimes just simple tables. All of this was useful for getting a sense of how the tools work, but what was missing was the final step in the analysis - a powerful insight or understanding that one could use to make an intelligent change to a process. Generally, the presentation technique was fine and the tools were great, but the demonstrated impact of the tools was trivial.

One reason for this is that some of the presenters may have to hold back their most significant discoveries until the right time - and this just wasn't that time, or the right audience. I can understand this, since most of my best analysis can't really be shown without first getting NDAs and other agreements in place. Another reason is that the presenters might have wanted to focus on the tool rather than on the data or business being studied, which just serves as a necessary example to work on. But this is misguided, since delivering insights is the bottom line - not delivering pretty pictures. The last reason I can imagine is that delivering powerful insights is hard, and while these presenters are working on it they may not yet have a suitable example. I think this is the most likely answer.

My concern is that people spend a lot of time building gorgeous but empty-headed analytical solutions that just don't have much to say. This is pretty similar to the chartjunk problem that Edward Tufte complains about. To make this a little clearer I've included a few examples below.

2011-02-10

Breadth of Data vs Depth of Analysis

One of the things that I felt was missing from O'Reilly's Strata Conference was a nuanced sense of the trade-offs between complex analysis and vast volumes of data. There is a trade-off, and I've seen it play out consistently. It works like this: where do you spend your investment?
  • deep analysis - with unpredictable costs and benefits
  • broad sets of data - with predictable (high) costs and benefits