The Problem with Big Data Isn’t the Data


Today, everyone understands big data’s potential to drive a step change in business performance. Some have even seen AI generate actionable insights or solve complex problems. So why do we still see reports in which only 17% of executives surveyed say they have captured satisfactory value from data and analytics?

For some, this stubbornly low figure can be partly attributed to the nightmare of dealing with decades of questionable data practices and sub-optimal infrastructure. However, the solution is not just about fixing the infrastructure. If it were, most organizations would have solved the problem already. Instead, finding the answer requires taking a closer look at the people in charge of the data.

Let’s clarify that statement a bit. Decades ago, a phrase was coined that rings true to this day: “garbage in, garbage out.” The phrase, originally used in the late 1950s to describe the challenges of early computers, is at the heart of the real data problem. Simply put, much of our operational data is questionable in quality, inconsistent, or just missing, and all of this comes down to people. More specifically, it is directly related to the rigor and diligence with which people collect and manage data, and how purposeful they are in adding instrumentation for new data streams.

How did we get into this situation?

In reality, we may have gotten ahead of ourselves in the data race. As an industry, we saw the potential of big data; since 2016, we have spent $2T on Industry 4.0 investments. And as capturing data became easier, that is just what we did: we started capturing data, and lots of it. We added instrumentation to our processing units and equipment, asked operators to collect more data in the field, and started viewing reports as data streams. In essence, we buried ourselves in data, and IDC sees that our appetite is only growing, with data volumes expected to double or triple in the next three years.

On the surface, this still sounds great. But it takes real work to capture, maintain, and get value from all this data. As we tracked more and more data streams, employees were left asking: “Is the effort worth it? Is anyone actually doing anything with all of this data?” For years, the answer was often “no.” We were collecting data because we liked the idea of having it and what we could do with it, when in reality we often didn’t have time to analyze these streams, much less do anything meaningful with the data. As a result, people became conditioned to be very selective about which data streams they really worked at capturing and maintaining. This organizational conditioning is at the heart of our data problem today, and it is why, while everyone agrees on the potential value of data, most are skeptical about the practicalities of realizing that value.

How do we address the people problem?

As you have likely guessed by now, the answer isn’t creating data lakes or data islands, or investing in data management software. While these are all important components of the solution, the answer lies in how the people responsible for collecting, maintaining, and managing the data streams value them. We must start by changing the relationship between people, the data they are responsible for, and the value they derive from that data.

The path will differ for every organization, but it will always come back to how the people charged with managing data can generate value from specific data sets. At times, this may even require a step backwards, reducing the quantity of data being captured to create space for value-adding insights from key data streams. By redefining our relationship with data, from a black hole that swallows effort to a source of real, demonstrable value, we can start adding back data points or creating new data streams, along with the mechanisms to process and extract value from them.

While there are many paths to redefining this relationship between people and data, one that is often overlooked runs through artificial intelligence (AI). The reason for the oversight is simple: AI shines when it has lots of data to work with, so people assume they must have lots of data before AI can add value. This is an inaccurate perception. AI and machine learning can, in some instances, generate valuable insights from smaller, more limited data sets. To be clear, AI still thrives on abundant, high-quality data, but there is value to be attained even when working with lean data sets, i.e., data limited in time span, number of data points, or frequency of collection.

The question is one of data sufficiency: is the data sufficiently descriptive for the problem being targeted? There is a threshold below which AI simply won’t work, but beyond that threshold there is value to be obtained, even if it is only directional guidance. For example, AI/ML can generate insights or predictions from a single data stream whose variability over time can be correlated to a specific event or condition.
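To make the single-stream idea concrete, here is a minimal, hypothetical sketch of lean-data event detection. It slides a fixed window over one sensor stream and uses a nearest-centroid classifier, one centroid for normal windows and one for windows that preceded a known event. All names, readings, and labels are illustrative, not a description of any particular product or method.

```python
# Hypothetical sketch: flagging an upset condition from a single sensor
# stream using a lean, hand-labeled history. Nearest-centroid classifier,
# pure Python, no external libraries. All data below is made up.

def centroid(vectors):
    """Element-wise mean of a list of equal-length windows."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def dist2(a, b):
    """Squared Euclidean distance between two windows."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fit(labeled_windows):
    """Build one centroid per class: 0 = normal, 1 = precedes an event."""
    normal = [w for w, label in labeled_windows if label == 0]
    event = [w for w, label in labeled_windows if label == 1]
    return centroid(normal), centroid(event)

def predict(model, window):
    """Classify a new window by whichever centroid it sits closer to."""
    c_normal, c_event = model
    return 1 if dist2(window, c_event) < dist2(window, c_normal) else 0

# A lean history: steady readings are normal (0); ramping readings
# preceded a known upset (1). Five labeled windows is all we have.
history = [
    ([50, 51, 50], 0), ([49, 50, 51], 0), ([51, 50, 49], 0),
    ([52, 57, 63], 1), ([55, 61, 68], 1),
]
model = fit(history)

print(predict(model, [50, 50, 51]))  # steady readings -> 0
print(predict(model, [54, 60, 67]))  # ramping readings -> 1
```

Even this toy model only offers directional guidance, which is exactly the point: with a data set this lean, the output is a prompt for investigation, not an authoritative prediction.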

For many, the more limited value creation of AI in lean data applications will keep them from adopting AI, or will steer them towards a more traditional technology (e.g., Excel, first-principles models, descriptive analytics) for analyzing their data. However, this can be a short-sighted perspective. One core strength of AI and machine learning is the ability to learn and improve as more data is incorporated. That ability to evolve matters because it supports the underlying change management needs of the people responsible for the data. Applying AI to lean data sets, even for what may initially be only directional insights, or for problems that could easily be solved with other technologies, lets people start getting value from their data and immediately rewards them as they capture and integrate new data streams into the AI.
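The "improves as data arrives" property can be sketched too. This hypothetical example keeps per-class running sums so each newly labeled observation refines the model in place, without retraining from scratch; it is an illustration of incremental learning in general, not of any specific platform, and all names and data are invented.

```python
# Hypothetical sketch: a classifier that gets better as operators label
# new windows. Class centroids are stored as running sums, so updating
# with one new observation is cheap. All data below is made up.

class IncrementalCentroids:
    def __init__(self, n_features):
        self.sums = {0: [0.0] * n_features, 1: [0.0] * n_features}
        self.counts = {0: 0, 1: 0}

    def update(self, window, label):
        """Fold one newly labeled window into its class centroid."""
        self.counts[label] += 1
        for i, x in enumerate(window):
            self.sums[label][i] += x

    def centroid(self, label):
        n = self.counts[label]
        return [s / n for s in self.sums[label]]

    def predict(self, window):
        """Return the class whose centroid is nearest to the window."""
        dists = {}
        for label in (0, 1):
            c = self.centroid(label)
            dists[label] = sum((x - y) ** 2 for x, y in zip(window, c))
        return min(dists, key=dists.get)

model = IncrementalCentroids(n_features=3)
# Seed with a lean initial data set ...
model.update([50, 50, 51], 0)
model.update([53, 58, 64], 1)
# ... then keep folding in observations as people label them over time.
model.update([49, 51, 50], 0)
model.update([55, 61, 68], 1)

print(model.predict([50, 51, 50]))  # -> 0 (normal)
print(model.predict([54, 59, 66]))  # -> 1 (event)
```

The design choice mirrors the change-management argument in the text: every labeled data point someone contributes visibly sharpens the model, so the people maintaining the stream see an immediate return on the effort.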

Coming back to our problem with data: the problem isn’t the data or the data infrastructure, but our personal relationships with the data and the value we gain from it. By reframing our data problem as an organizational change problem, we can better manage the journey towards big data and the value it promises. So, whether you have years or decades of high-quality data or are lean on data with just enough for basic insights, remember that this is an organizational journey. To be successful, your first steps should reinforce the data’s value to the individuals who oversee it. As they see value for themselves, they will in turn become better custodians of the data and start delivering on the promise of big data.

If you are ready to see how you can extract value from your data quickly, contact us to schedule a Canvass AI demo.  

About Canvass:

Canvass AI is a patented Industrial AI platform that puts industrial companies in control of their data to make timely decisions that achieve faster and sustainable outcomes. By empowering high-performance production teams with easy-to-use AI solutions, leading industrials trust Canvass AI to address their operational challenges and achieve their sustainability goals every day.
