Comment: Big Data and Performance – what the Muppets can teach us

by Cary Burch*
Success in the lab does not come without a few failures. The same is true of Big Data projects at work, and that is a good thing. We must keep this in mind when managing the performance of our Big Data projects in an organizational setting.
We continue our discussion of the organization of the future by moving on from the decisions we must make when introducing Big Data, and instead consider managing the performance of such projects once they are underway. Just as with the many failed experiments of Dr. Honeydew and Beaker on The Muppet Show, for the organization of the future to truly succeed in measuring Big Data performance, we must be willing to meld art and science. That means being comfortable with, and even anticipating, a number of failures on the quest to getting it right.
 

When it comes to using Big Data while managing its performance, we cannot be afraid to fail. Analysts can only be successful in their application of Big Data through sustained exploration and experimentation. As just one example, we look to the practice of data mining, a core activity among those using Big Data today. Provost and Fawcett offer guidance in Data Science for Business:
“Data mining is a craft… It involves the application of a substantial amount of science and technology, but the proper application still involves art as well. But as with many mature crafts, there is a well-understood process that places a structure on the problem, allowing reasonable consistency, repeatability, and objectiveness. A useful codification of the data mining process is given by the Cross Industry Standard Process for Data Mining (CRISP-DM).”
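To make the CRISP-DM cycle concrete, here is a minimal Python sketch of its six phases as an iterative loop. The phase names come from the standard itself; the run_crisp_dm_cycle function, its max_iterations cap, and the project interface are hypothetical illustrations, not anything the standard prescribes.

```python
# A sketch of the six CRISP-DM phases as an iterative cycle.
# The phase names come from the CRISP-DM standard; run_crisp_dm_cycle,
# its max_iterations cap, and the `project` interface are hypothetical
# illustrations, not anything the standard prescribes.

CRISP_DM_PHASES = [
    "Business Understanding",  # frame the problem and success criteria
    "Data Understanding",      # collect and explore the raw data
    "Data Preparation",        # clean, join, and transform the data
    "Modeling",                # build and tune candidate models
    "Evaluation",              # judge results against business goals
    "Deployment",              # put an accepted model into use
]

def run_crisp_dm_cycle(project, max_iterations=3):
    """Walk the phases in order, looping back when evaluation fails.

    `project` is assumed to expose execute_phase(name) and an
    evaluation_passed flag -- a made-up interface for this sketch.
    """
    for iteration in range(1, max_iterations + 1):
        for phase in CRISP_DM_PHASES[:-1]:  # everything up to Deployment
            project.execute_phase(phase)
        if project.evaluation_passed:
            project.execute_phase("Deployment")
            return f"Deployed on iteration {iteration}"
    return "Not deployed: failures logged for the next attempt"
```

The loop structure is the point: evaluation that fails does not end the project, it sends the team back through the earlier phases with what they have learned.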
If the use of Big Data in tomorrow’s enterprise is to include a clear sense of how such projects will be managed, there are a number of components we can identify that give each project meaningful boundaries while still permitting the blend of art and science that will be crucial to future competitive success. Some of the most critical components of well-managed analysis work include:
A Comprehensive Needs Assessment. Taking a page from the study of Program Evaluation and its Standards, one of the core evaluation types is the needs assessment. Conducting a needs assessment first means the work done thereafter unfolds only according to the stakeholders, scope, and need(s) identified.
A Defensible Logic Model. Program Evaluation not only helps analysts understand how a program’s process and outcomes contribute to its success; because the field also calls for a logic model before analysis takes place, the same is recommended for Big Data projects. Logic models from Program Evaluation work harmoniously here because they ask project team members to question the underlying assumptions driving a given analysis project. The relationships between resources, activities, outputs, outcomes, and impact are all described in a linear fashion, providing the structure required before art can begin to take shape (the first sketch following this list makes that structure concrete). Within the boundaries of the logic model, creativity for addressing the problem can then flow freely.
A Great Analysis Question. In the realm of research, we know all great research begins with a great question. This is no different with Big Data. The unstructured nature of Big Data means we can look for patterns among the data iteratively and endlessly. Sometimes, if we wander aimlessly for long enough, patterns can even appear to emerge that are not really there (the second sketch following this list shows how readily such phantom patterns arise). It is only through a great question, then, that we can begin to understand what we will look for, and thus plan how to respond appropriately when it is found.
A Meaningful Driving Force. A sense of urgency underlies all great change efforts, and so too must one exist behind Big Data analysis projects. As Davenport tells us in Big Data @ Work, “Among the data scientists and company leaders I interviewed, there was a strong belief that the big data market is a land grab, and to the early movers will go the spoils.” I fervently agree, and to that end ask you to reflect: what will be the force that drives your business’s Big Data projects?
A Version of Your Very Own Muppet Labs. Analysts cannot tinker with Big Data in the twenty minutes they have free between meetings, or get to Big Data projects once the ‘real work’ is done. Big Data is, well, big. Harnessing it takes a considerable level of talent, skill, focus, and time. Asking already stretched staff to also take on a pilot project in Big Data is asking for piecemeal results from someone who only has piecemeal time to give. Instead, create your very own Muppet Labs, dedicate resources to Big Data, and give them the freedom to tinker.
A Team of Beakers. With every visit to Muppet Labs we see Beaker standing ready and willing to try out Dr. Honeydew’s latest invention. Why? Perhaps that is best left to Beaker to answer, yet what we do know is that both he and his lab-coated counterpart are certainly not afraid of failure. Amid the industry-standard CRISP-DM process, the well-documented needs assessment and logic model, the precise analysis question, and the potent driving force must lie a team of analysts willing to go far outside their comfort zone. Traditional analytics, also dubbed small analytics, requires a very traditional analyst using incredibly common and proven methods. Big Data teams should embrace what is not yet proven, for only then will signal emerge from the unstructured, mammoth noise Big Data has to offer. Beakers can come to you from a number of backgrounds: computer science, modeling, business, math, or statistics. Your best bet at stocking your team with Beakers will be those who exhibit the traits of the data scientist.
A Means for Leveraging Failure. Our final common component, leveraging failure, is perhaps also the most complex. It is not the most complex because we do not know what failure looks and feels like; it is the most complex because we hardly know how to treat it well. From a traditional performance management perspective, failure is something to avoid, because it limits our ability to continue operating proven processes. Yet what of the unproven? What of those areas of our business where we have yet to develop a working SOP? This is where the adaptive learning perspective can help us. Aldrich and Ruef in Organizations Evolving tell us, “The adaptive learning perspective [treats] organizations as goal-oriented activity systems that learn from experience by repeating apparently successful behaviors and discarding unsuccessful ones.” The only thing left to do, then, is figure out our best way of communicating what we learned from, and will then build on, each of our failures.
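First, a minimal sketch of a logic model as a simple data structure. The five components are those named above; the LogicModel class and the example pilot project are illustrative assumptions, not a prescribed format.

```python
# A minimal sketch of a logic model as a linear chain of fields.
# The five field names come from the Program Evaluation literature;
# this class and the example values are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class LogicModel:
    resources: list[str]   # what we invest (people, data, tools)
    activities: list[str]  # what we do with those resources
    outputs: list[str]     # direct products of the activities
    outcomes: list[str]    # changes expected in the short/medium term
    impact: str            # the long-term result the project exists for

# A hypothetical Big Data pilot expressed as a logic model.
big_data_pilot = LogicModel(
    resources=["two analysts", "clickstream data", "analytics cluster"],
    activities=["mine purchase patterns", "build churn model"],
    outputs=["weekly churn-risk scores"],
    outcomes=["retention team contacts at-risk customers earlier"],
    impact="reduced customer churn",
)

# Each field should justify the next; if a link cannot be defended,
# the underlying assumption deserves questioning before analysis begins.
```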
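Second, a small self-contained demonstration of why a great analysis question matters: mine enough random “features” against a random target and some will look strongly related purely by chance. Everything here is synthetic noise, and the sample sizes are arbitrary choices for the illustration.

```python
# Demonstration: with enough aimless mining, "patterns" emerge from
# pure noise. All data here is synthetic; no real signal exists.

import random
import statistics

random.seed(42)

N_ROWS = 50        # observations
N_FEATURES = 500   # candidate features to "mine"

target = [random.gauss(0, 1) for _ in range(N_ROWS)]

best_corr, best_feature = 0.0, None
for feature_id in range(N_FEATURES):
    feature = [random.gauss(0, 1) for _ in range(N_ROWS)]
    r = statistics.correlation(feature, target)  # Pearson r (Python 3.10+)
    if abs(r) > abs(best_corr):
        best_corr, best_feature = r, feature_id

# With 500 tries, a best correlation of roughly |r| = 0.4 on pure
# noise is typical -- a pattern that is not really there.
print(f"Best of {N_FEATURES} random features: "
      f"feature {best_feature}, r = {best_corr:.2f}")
```

A great analysis question, stated before the mining begins, is what protects a team from mistaking the output of a loop like this one for insight.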
We have made the decision to incorporate Big Data into our business processes, recognizing that the organization of the future will certainly grant Big Data a seat at the table. Our job now is to begin making the relevant changes to structure, culture, driving forces, and needs necessary to invite all of what Big Data has to offer when it arrives. Just remember: managing the performance of Big Data projects is not about avoiding failure, it is about making the most of it when failure inevitably appears.
* Cary Burch is the Senior Vice President of Innovation at Thomson Reuters Corporation. Prior to taking on the lead innovation role for TR, he was President & Managing Director of Thomson Reuters Elite.
 
Data Science for Business
Provost, F. & Fawcett, T., Data Science for Business, O’Reilly Media, 2013.

Standards
http://www.jcsee.org/program-evaluation-standards-statements

Logic Model
http://www.wkkf.org/resource-directory/resource/2006/02/wk-kellogg-foundation-logic-model-development-guide

Question
https://explorable.com/research-paper-question

Big Data @ Work
Davenport, T. H., Big Data @ Work, Harvard Business Review Press, 2014.

Data Scientist
http://www-01.ibm.com/software/data/infosphere/data-scientist/

Organizations Evolving
Aldrich, H. & Ruef, M., Organizations Evolving, Sage, 2006.