Technique and Understanding
It would seem the “next big thing” is “Big-Data analytics.” Young people! This is the hot job of the near future. Pundits are predicting a shortfall in qualified people to fill the need for advanced analytics application design and development. Companies, governments, NGOs (non-government organizations), churches, and more have all awakened to the possibilities of tapping into the vast treasure trove of data in the cloud. The myriad data points can predict behavior, reshape industries, and “fuel exponential growth” (as McKinsey puts it)—if only we can make sense of it all. Hold that thought.
Google “Big-Data analytics” and the brands have staked out priority ads: IBM, McKinsey, SAS, and Intel are positioning to capture your attention. I took a closer look at the whitepapers and reports these brands publish on dedicated “Big Data” Web pages, and found something interesting. After extolling the benefits, they offer a cautionary note: Make sure any application is rolled out to the users in a comprehensive way; otherwise, use of the application will not be optimal. This is an often-overlooked point, and there are two parts to it that are equally important.
The first seems obvious: Make sure you have trained the users on how to use the application tools. Let’s call this learning the “technique.”
The second point is not so obvious: Make sure the user knows enough about the way data is accessed, pulled together, parsed, and delivered so the results of the technique they apply are truly meaningful to the understanding of the business they are evaluating. Put another way, does a person launching an inquiry into the data understand enough about how the query works to judge whether the results have value or are just nonsense? Let’s call this aspect of analytics “understanding.”
The brands all speak about delivering powerful analytic tools into the hands of “leaders” who will make decisions informed by the new insights the data provides. Who are these leaders?
I want to make the case it is not the folks in the C-suite. They may have dashboards and reports that aggregate information to an overview of the business sufficient to understand where it is each day, week, month, quarter, and year. This is not using Big Data in the sense the brands mean. The leaders they reference are mostly the mid-level people in the organization who are charged with “making the numbers,” or “driving operational efficiencies,” or “developing the next killer product.” As you would expect, these are very busy, typically overtasked people, with little bandwidth to absorb new technique, let alone take the deeper dive into understanding the assumptions and methods behind the results they are getting when they apply technique. The big question is whether they have the requisite understanding—not just technique—to know whether the results have value.
Let me offer a story by way of illustration about the double-edged nature of this sword, based on a project in which I participated at GEA (General Electric Appliances) in the mid 1990s. To set the table, here is the context for which the project was created. Let’s see, as we step through this, just how closely what happened then parallels today’s story of Big Data.
In the 1990s, GEA had one of the largest collections of business databases in the world. They were not in the cloud as that is understood today, but they might as well have been on the moon—none were “connected.” There were databases for consumer product sales, service, manufacturing, transportation, engineering, and purchasing. Each database was associated with an application (or sets of applications) that were not standardized. The billions of data points that were the aggregate of all systems taken together could not be accessed and analyzed as a whole. A study was done to determine the cost to migrate all systems to a common platform. It was rumored the cost estimate prompted sudden cardiac arrest across Appliance Park in Louisville, Ky.
There was a clear conviction among the senior managers that some methodology was required to mine the data in the disparate databases to reveal new insights into the business’s behavior. The top “driver” (GE-speak for priority) was reducing costs to increase margins on products for which customers would not accept a price increase. So the approach to Big-Data analytics, 1990s-style, was to create a data warehouse as a common repository for important data elements, drawn from the systems deemed relevant to analysis in a functional area such as transportation. The news of a cost-acceptable solution resuscitated many a heart when it was announced.
I admire GE, and yes, it is a demanding place to work. You participate in an average of four team projects concurrently. I was assigned to the project team responsible for the creation of the TDW (Transportation Data Warehouse), the large-scale pilot that, if successful, would be the template for other major segment data warehouses at GEA. My role was to develop the training curriculum for the users, and train them to effectively use the query tools to access the data in new and meaningful ways.
The project champion for the TDW was the Vice President of Transportation, a person who “got” technology. By this I mean that he understood both the benefits and pitfalls of new technology adoption and deployment. His awareness of this was instrumental in directing the eventual success of the project for the TDW. He took me aside early in the project to make the distinction between teaching technique and understanding, that both were equally required for success, and that my program needed to incorporate a real test of understanding concurrent with the certification of user attainment of technique.
Whether someone writes standard queries that pull and parse the data to provide an “insight” into the business, or, as in the case of GEA, front-line managers craft their own queries, there is a dependence on the query writer’s understanding of how the data must be appropriately related. Just because a data point can be included in a query does not mean it is relevant to the business question being analyzed. The importance of understanding how the data relates to other data when designing a query cannot be overestimated. It is the difference between results that truly reflect what is happening and results that merely look plausible.
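The pitfall can be sketched with a small, hypothetical example (the record sets, field names, and values here are invented for illustration, not taken from GEA’s systems). Both record sets happen to carry a field named “unit,” so a join on it is syntactically possible, but only a join on the key that actually relates the records produces a meaningful answer.

```python
# Hypothetical data: both record sets contain a field named "unit",
# but in shipments it identifies the truck, while in freight_costs it
# names the business unit that booked the charge.
shipments = [
    {"shipment_id": 101, "unit": "TRK-7", "weight_lbs": 1200},
    {"shipment_id": 102, "unit": "TRK-9", "weight_lbs": 800},
]
freight_costs = [
    {"shipment_id": 101, "unit": "DISHWASHERS", "cost_usd": 450.0},
    {"shipment_id": 102, "unit": "RANGES", "cost_usd": 310.0},
]

def join_on(left, right, key):
    """Naive inner join on a shared field name: executable for any key
    present in both record sets, meaningful only for the right one."""
    index = {r[key]: r for r in right}
    return [{**l, **index[l[key]]} for l in left if l[key] in index]

# Joining on "unit" runs without error but matches nothing, because
# the field means different things in the two systems.
print(join_on(shipments, freight_costs, "unit"))  # -> []

# Joining on "shipment_id" -- the relationship the query writer must
# understand -- yields a relevant cost-per-pound view.
for row in join_on(shipments, freight_costs, "shipment_id"):
    print(row["shipment_id"], row["cost_usd"] / row["weight_lbs"])
```

The point is that the tool happily executes both queries; only the writer’s understanding of what each field represents separates a relevant result from nonsense.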
Fast forward: the TDW was a success and the next data warehouse to be implemented was the Purchasing Data Warehouse. GEA had a path forward to Big-Data analytics.
Did I mention the target “audience” for the TDW training? The “leaders” were the mid-to-lower level managers and supervisors in the transportation business unit. All 300 of them. Many were just starting to use PCs. The technique training was for MySQL, which at the time was not graphically rich, and it dropped people with little or no experience of Windows-based applications directly into the proverbial fire. Competence with technique—creating an executable query—came to users through repetitive CBT (computer-based training) exercises.
The greater challenge, as you would imagine, came with the exercises to understand what the data meant, where it came from (what process in an application), and how it related to and differed from similar data in another application. A data point in one system can have the same field name as one from another system, yet mean something vastly different. This is the essence of understanding: where the data point resides, what it represents in terms of the process that produces it, and how it relates to other data points in other applications. Training for understanding involved classroom sessions on data dictionary definitions, their sources, the operative quality they capture in the process, and, most importantly, how similarly named data fields differ based on their position and context within their native applications.
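A data dictionary of the kind taught in those classroom sessions can be sketched in a few lines (the systems, field names, and definitions below are invented for illustration): the same field name maps to a different definition depending on its native application, and a lookup fails loudly for fields the dictionary does not cover.

```python
# Hypothetical data dictionary: one field name, two meanings,
# depending on the application the data point comes from.
DATA_DICTIONARY = {
    ("SALES", "order_date"): "date the customer placed the order",
    ("TRANSPORT", "order_date"): "date the carrier received the load tender",
}

def describe(system, field):
    """Return what a field actually measures in its native application,
    or raise if the field is undefined for that system."""
    try:
        return DATA_DICTIONARY[(system, field)]
    except KeyError:
        raise KeyError(f"{field!r} is not defined for {system}; "
                       "consult the data dictionary before querying it")

# Same name, two different facts -- comparing them directly in a
# query would produce a plausible-looking but meaningless result.
print(describe("SALES", "order_date"))
print(describe("TRANSPORT", "order_date"))
```

Checking a field’s definition before writing the query is exactly the habit the understanding training was meant to instill.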
This is where technology introductions to new users hit a snag. Understanding always trumps technique. The brands worth investigating are the ones pitching Big-Data analytics “solutions” that also stress the caveat about properly and effectively attending to the double-edged sword of technique and understanding. Making sense of massive data now residing in clouds is just an update on a recurring theme, but the caveat about the sword’s double edge is enduring.
Remarkably, and to their credit, the mid-level “leaders” who were the primary stakeholders in the success—personal as well as for GEA—of the proper use of the TDW stepped up to the understanding training. Once they connected the dots between management’s expectation of the development of cost-saving insights with the requirement to make relevant inquiries, understanding became a priority for their success. All 300 trainees passed the understanding and technique “final exam.”
If rollouts of Big-Data analytic applications can match GEA’s success for both technique and understanding, then the businesses that deploy these applications have a real chance at achieving the promised results.
Tim Lindner is senior business consultant at Voxware, a company focused on voice picking software. He can be reached at firstname.lastname@example.org