A few weeks back, this blog entry from the New York Times also stressed the need for context. It tells the story of a project that used sensors to collect data on elevator and stair usage; after a few days of collection, the conclusion was that students use the stairs more at night. That seemed like an interesting finding until a security guard supplied some needed context: the elevators had been breaking down at night. So of course people were taking the stairs!
Missing context and missing data can be as important as confounding factors in data collection, if not more so. As more and more data is collected and analyzed for decision-making, in government, corporations, and industry, and in both the private and public domains, I believe that understanding the potential pitfalls of missing data and uncertainty will be central to actually getting good use out of that data.