Data. Data. Data. I cannot make bricks without clay.
Decisions. We make a lot of them. In fact, it’s estimated that we make an average of 35,000 of them daily without even realizing it. Our food-related decisions alone number more than 200 per day, according to researchers at Cornell University. Give it a few minutes of reflection and that number becomes significantly less shocking. (Room for cream in your coffee? No thanks. Fries with that? Yes…obviously.)
How many of those daily decisions are what we’d characterize as important? For the sake of discussion, let’s cast aside the question of what constitutes “important” since it almost wholly depends on each individual’s perspective. Intuitively, each of us knows which decisions are important and we act accordingly by thinking about them, researching possible options, evaluating the potential benefits or drawbacks of each option, and so on. Ultimately, the size and duration of a decision’s impact on the future determine its relative importance.
However, when a number of small decisions are made with poor-quality data, what is the outcome? It’s one thing to break from a diet program to splurge on your favorite dessert. It’s quite another to systematically make small deviations from the program on a daily basis. In fact, such a systematic deviation would likely be the product of a misguided diet program based entirely on wrong information, or of no diet program at all. Regardless of specific outcomes, it is unlikely that goals would be reached, and if any progress were made at all, it would almost certainly be at a much slower pace than desired.
Last week we posted about the magnitude of poor data quality within the U.S. economy and how its negative impacts in 2016 were estimated by IBM to exceed the GDP of all but four countries in the world that year. If that has spurred some thoughts on how poor data quality can have such a costly impact on your daily operations, you are not alone. In 2016, Forbes and KPMG teamed up to survey nearly 1,300 executives around the world to gain insight into what they were thinking about and preparing for as leaders. Among several takeaways, the surveyors found that 84% of CEOs surveyed have concerns about the “quality of the data [they are] basing their decisions on.” Without question, data is the foundation on which we base our decisions, and even business leaders around the world share questions about that data’s underlying accuracy.
As with any foundation, a solid one is better, which is why we all go to extraordinary lengths to make sure that the data we use is as correct as possible, for important decisions at the very least and hopefully all the time. For work-related data in particular, that extra effort is just one of the ways in which poor data quality takes its toll. In last week’s post, we linked to an article by the Harvard Business Review that focused on IBM’s estimate that the U.S. economy was saddled with a $3.1 trillion drag due to poor data quality. The author of the article, Thomas Redman, refers to the additional steps associated with the correction of poor-quality data as “hidden data factories” primarily due to the negative impact on human resource efficiency.
…hidden data factories are expensive. They form the basis for IBM’s $3.1 trillion per year figure. But quite naturally, managers should be more interested in the costs to their own organizations than to the economy as a whole.
He goes on to offer recommendations for reducing the negative impact on the bottom line, which can best be summarized as better communication between departments within an organization, so that data becomes more accurate and usable over time. His recommendations also provide some insight into the overall financial importance of improving data quality. For example, in a related article, Dr. Redman illustrates how a 67% data accuracy rate drives costs to nearly 400% of what they would be with perfect data. Lest you think that a 67% rate is low, he notes that it is on the “high end of typical” in his work as a consultant.
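To see how a figure of that size can arise, here is a minimal sketch of the arithmetic. It assumes a "rule of ten" style multiplier, meaning a unit of work built on flawed data costs ten times as much as one built on clean data. The multiplier, the unit cost of 1, and the function name `total_cost` are assumptions made for illustration and are not taken from this post.

```python
def total_cost(records: int, accuracy: float,
               clean_cost: float = 1.0,
               flawed_multiplier: float = 10.0) -> float:
    """Cost of processing `records` units of work at a given accuracy rate.

    Assumption (hypothetical): flawed records cost `flawed_multiplier`
    times as much as clean ones because of rework and correction.
    """
    good = round(records * accuracy)   # records that are usable as-is
    flawed = records - good            # records that need extra work
    return good * clean_cost + flawed * clean_cost * flawed_multiplier


baseline = total_cost(100, 1.00)   # every record is clean
actual = total_cost(100, 0.67)     # 67 clean records, 33 flawed
ratio = actual / baseline          # cost relative to the clean baseline

print(f"baseline={baseline}, actual={actual}, ratio={ratio:.2f}")
```

Under these assumptions, 67 clean records cost 67 and 33 flawed records cost 330, for a total of 397 against a baseline of 100. A different multiplier would change the result, so treat the numbers as a way to reason about the effect and not as a measurement.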
In conjunction with the release of the Forbes/KPMG report in May 2017, Forbes also issued a press release that does a great job of succinctly describing the benefits of good data quality and the costs of poor data quality. Suffice it to say, the benefits of good data and the costs of poor data extend well beyond time saved or wasted.
Benefits of Good-Quality Data
Costs of Poor-Quality Data
1. Reputational Damage
2. Missed Opportunities
3. Lost Revenue
The point is that poor-quality data carries many costs, some of which can be quantified and some of which cannot. Conversely, the absence of poor-quality data brings many benefits, again of both known and unknown value.
Let’s shift gears for a moment and think about the sheer volume of breeding decisions made in agriculture. Depending on the industry, there are tens or even hundreds of thousands of early generation crosses evaluated each year. Have you ever, as part of a team, walked through a field of 80,000 single-hill potato varieties and picked winners and losers based on a few seconds of evaluation? In theory, every member of the team is looking for the same thing, but each person has his or her own criteria by which those plants are judged. Even the same person may have a different scale before lunch and after lunch. Over time, these thousands of seemingly innocuous decisions cumulatively become very important. Even for the rare, successful variety that emerges from it, this is a time- and resource-intensive process. How much more expensive — in terms of both time and money — is it for an unsuccessful variety that doesn’t generate a penny in royalties?
Returning for a moment to Dr. Redman’s recommendations for improving data quality, what if it isn’t an organization but a loosely coordinated group of collaborating researchers who are each contributing data? If each of 10 researchers brings an individually nuanced approach to data collection and measurement to the table, who decides which system is best suited for the group? Even if a common system is developed, is there a mechanism to maintain accountability?
The very nature of public agricultural variety development is extraordinarily collaborative. Without a standardized reporting platform like Medius.Re, an observer can expect to see significant variations in the way that data is measured, gathered, reported, and analyzed across research entities and even among different researchers within the same entity. Criteria that vary by region, different reporting infrastructure and platforms, and individual reporting preferences are just a few drivers of data variability. While the data may be accurate, it may not be usable until it has been refined and standardized sufficiently. Simply put, the systems that support variety development in the U.S. are far more inter-organizational than intra-organizational. As such, a shared platform that standardizes the collection, storage, and analysis of data is worth strongly considering as a strategy to reduce or eliminate those costly hidden data factories.
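As a rough illustration of what "standardized" can mean in practice, the sketch below rescales each evaluator's scores relative to that evaluator's own average and spread (a z-score), so that a habitually generous scorer and a habitually harsh one become comparable. This is only one possible approach and is not a description of how Medius.Re or any particular platform works. The function name, evaluator labels, plot labels, and scores are all made up for the example.

```python
from statistics import mean, pstdev


def standardize_by_evaluator(scores):
    """Convert each evaluator's raw scores to z-scores.

    `scores` maps evaluator -> {plot_id: raw_score}. Each score is
    re-expressed as its distance from that evaluator's own mean, in
    units of that evaluator's own standard deviation.
    """
    result = {}
    for evaluator, by_plot in scores.items():
        values = list(by_plot.values())
        mu = mean(values)
        sigma = pstdev(values)  # population standard deviation
        result[evaluator] = {
            # If an evaluator gave every plot the same score, there is
            # no spread to scale by, so report 0.0 for each plot.
            plot: 0.0 if sigma == 0 else (raw - mu) / sigma
            for plot, raw in by_plot.items()
        }
    return result


# Hypothetical data: two evaluators rank the plots identically,
# but one scores on the high end of the scale and one on the low end.
raw_scores = {
    "evaluator_a": {"plot_1": 7, "plot_2": 8, "plot_3": 9},
    "evaluator_b": {"plot_1": 2, "plot_2": 3, "plot_3": 4},
}

standardized = standardize_by_evaluator(raw_scores)
for evaluator, by_plot in standardized.items():
    print(evaluator, {plot: round(z, 2) for plot, z in by_plot.items()})
```

In this made-up case the raw numbers disagree by five points on every plot, yet the standardized scores are identical, because the two evaluators agree on which plots are better and differ only in where they sit on the scale. Real evaluation data would also need shared definitions of what is being scored, which no rescaling can supply.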
In the world of agricultural variety development, the sustained practice of making quality decisions on an extraordinarily large volume of seemingly innocuous questions will have a lasting, positive impact. The cumulative result will be an improved pool of emerging varieties that pushes an industry closer to a better variety for growers and consumers alike, regardless of commodity.