Seeds for Thought

Seeds for Thought: Big Data

Over the course of a week, I come across lots of interesting stories, resources, and sites online that may be of interest to those in the non-profit sector. In line with my approach of connecting people with resources and sharing information, I’m thinking about starting a weekly feature to highlight some of those links – consider this the pilot edition!

This week, I’m highlighting a trio of posts from the Harvard Business Review’s Blog Network, a site I recently started following. Although the focus is primarily on for-profit organizations, I’ve already seen content on social enterprises, philanthropy, and international development, as well as resources and trends that would be equally applicable on the non-profit side.

All three articles below relate to managing and using data, particularly “Big Data”. The term recognizes that collectively we are producing and storing exponentially greater amounts of data now than at any other point in human history – the first article cites research that 90% of data currently in existence was created in the past two years! This explosion in information can help grow our understanding of practically every facet of life, but there are challenges in analyzing and interpreting these giant data sources, as well as limits to how much we can learn from them.

  • Jeff Bladt and Bob Filbin’s article title says it all – A Data Scientist’s Real Job: Storytelling. It’s similar to a truism I learned from a great professor during my undergraduate education, that all research projects have to tell a story: we start at some point of knowledge, we run an experiment or collect some information, and we learn something as a result. Tables of numbers and statistical tests are essential tools, but by themselves they do not advance our knowledge. As Bladt and Filbin put it, “Data gives you the what, but humans know the why”.
  • Presenting data in an accurate, easily comprehensible visual form has become a field in its own right. If you’re not sure where to start in sharing information, Nancy Duarte gives a simple suggestion: When Presenting Your Data, Get to the Point Fast. Check her post for some good tips on how to help your audience focus on the key numbers (hint: tables of numbers and pie charts are not in the cards!).
  • Finally, Kate Crawford explores The Hidden Biases in Big Data. Even databases with millions of records may not cover the full spectrum of a phenomenon: Crawford gives the example of the 20 million tweets generated during Hurricane Sandy, the majority of which came from tech-connected Manhattan rather than from harder-hit neighbourhoods. Her prescription? “Take a page from social scientists”: pay attention to where the data comes from, examine your cognitive biases in interpreting the data, and use a diverse range of methods, including qualitative approaches like interviews, to complement the quantitative findings.

If you have any thoughts or additional links to share on this topic, I’d love to see them! You can use the comments field below or find me on Twitter. Also, any feedback or suggestions on this approach of weekly annotated links would be greatly appreciated.