Drowning in Data: Consequences of Having Too Much of a Good Thing

Our collective cup runneth over with plant data, and what spills out can do real harm.
Aug. 22, 2018

This may be hard to believe, but there was a time in manufacturing when data was hard to come by. Back in the days when gauges were analog, before computers and databases and the internet, data was not as ubiquitous as it is in the manufacturing industry today…not even close. But, as we all know, things change. Fast forward to today…and manufacturing data is available to almost anyone who wants it.

The Cost of Data

Compared to a few decades ago, technology is dramatically less expensive today. As a result, data collection is faster, more automated, and cheaper. Data is available from machines of all kinds, including checkweighers, ERP systems, CMMs, OPC servers, PLCs, hand-held devices, and databases. Various interfaces harvest data and communicate with other systems. Compared with 30 years ago, it seems as though data access is unlimited.

Data, once a rare commodity, is now generated from machines, people and sensors throughout the plant.

Fully-automated data collection makes it all so easy, while eliminating human error. Perhaps most importantly, fully-automated data collection frees up shop-floor operators from having to manually enter data, so they can focus on the myriad other tasks they need to keep an eye on.

However, the outcome of all this automation and ease is that when collecting data is easier and the equipment is less expensive, we—as an industry—are going to collect it. Period. It’s human nature. If there’s a way to do something, generally speaking, we do it.

My point is, whether it’s process data like temperatures and speeds, or quality features on parts, any data you want or need, you can collect it. And when collecting data is not just easy to do and inexpensive, but (in many cases) fully-automated, well, that’s like Christmas in July. All the time. The shackles are off. Collect all you want, anytime you want, and in any way you want. Collect. Collect. Collect.

The question, though, is this: should you collect the data?

Ask Some Pointed Questions First

There is a downside. Over-consumption. Data gluttony. Companies often say, “We want to gather every bit of data we can!”

The ability to gather temperature, humidity, or speed values every millisecond is exciting. But before becoming a data glutton, you need to ask some simple, yet challenging questions:

1. Why do we need to gather this data?
    a. Is this a short-term data collection necessary to solve a problem?
    b. Is this data required to fulfill a long-term strategic imperative?

2. How will the data be used after it is collected?

3. Who will be evaluating that data?

4. How will the data be evaluated?

5. What is a reasonable, rational amount of data to collect?

6. How frequently do we need to collect the data?

7. Do we really need to collect data every few milliseconds?
    a. If so, what purpose would it serve?
    b. How will the data be used?

A generation ago, we were starved for information. Now, companies are drowning in data. While collecting tons of data may seem like a good idea, there are consequences to be considered before making that fateful decision. Asking questions like those above can help justify the reasons for, and the investments in, data collection technologies.

Automatic Data Collection Is Not Free

Automating data collection requires hardware, software, and support. You’ll want hand-held data collection devices for operators and a means of fully automating data collection through Programmable Logic Controllers (PLCs), OPC servers, etc. To organize and manage it all, you’ll need databases, wiring, computers, and networks: all manner of electronic devices to tie everything together.

Those devices need power, maintenance, and TLC. The more you collect, the more support is required. And if you have a 24x7 operation, you’ll need to consider round-the-clock support in the form of engineering and IT services. And don’t forget that, for each device, you’ll likely need to engage a variety of different third-party support contracts for all that hardware and software, all of which will need time and attention.

Now consider maintenance. Things break. Precision data collection devices tend to be sensitive when dropped on concrete floors. Expect it to happen, then budget for it. And they’ll wear out and won’t always work as advertised. You’ll need humans to manage it all. It adds up to a bevy of infrastructure costs, IT support, and human resources required to make everything click.

So much data, so few people to make sense of it.

And we haven’t even discussed the actual saving of data just yet. Let’s talk about that. Because data is so easy and inexpensive to gather, a very real issue is saving all that data. Saving data every few milliseconds for just one feature and extrapolating over an entire production enterprise will fill up lots of hard drives in no time. Saving all that data will become an expensive headache. To avoid these issues, carefully consider a rational data collection plan with reasonable frequencies of data sampling. You’ll never need every millisecond of data, so don’t collect it.

We all appreciate the fact that data storage costs have become more manageable and less expensive over the years. But if you’re gathering millions of data values every minute of every hour of every day, you have to put it somewhere. Your data storage costs are going to rise—and quickly. Focusing on how much data is needed, not how much is wanted, is key to intelligently determining how much data to collect.
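To see how quickly the numbers climb, here is a rough back-of-the-envelope sketch in Python. The 16 bytes per stored value (an 8-byte timestamp plus an 8-byte reading) and the 500 features are illustrative assumptions, not figures from any particular plant or database, and real systems add indexing and other overhead on top.

```python
# Back-of-the-envelope storage estimate for one year of automated collection.
# BYTES_PER_VALUE and the feature count are illustrative assumptions.

SECONDS_PER_DAY = 24 * 60 * 60
BYTES_PER_VALUE = 16  # assumed: 8-byte timestamp + 8-byte reading, no overhead


def values_per_day(interval_ms: int) -> int:
    """Number of values one feature produces per day at the given interval."""
    return SECONDS_PER_DAY * 1000 // interval_ms


def gigabytes_per_year(interval_ms: int, features: int) -> float:
    """Raw storage in GB (10**9 bytes) for `features` features over 365 days."""
    total_bytes = values_per_day(interval_ms) * BYTES_PER_VALUE * features * 365
    return total_bytes / 1e9


if __name__ == "__main__":
    for label, interval_ms in [("every millisecond", 1),
                               ("every second", 1_000),
                               ("every minute", 60_000)]:
        gb = gigabytes_per_year(interval_ms, features=500)
        print(f"{label:>18}: {gb:,.1f} GB per year for 500 features")
```

Under these assumptions, sampling 500 features every millisecond comes to roughly 250 terabytes a year, while sampling them once a minute comes to about 4 gigabytes. The exact figures matter less than the ratio: the sampling interval drives the storage bill.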

Don’t get me wrong, the investments in hardware, software, and support described above are what’s required for most modern manufacturing organizations to ensure quality products are produced. In many cases, they are required just to compete; there is an expectation that software and hardware systems are in place for improving operations. And that’s great. However, beware of collecting too much data. Ease of data access and availability does not equate to a need to collect it.

What Will You Do with the Data?

This is the most important question you can ask, and you should ask it before any data collection investments are made. If you can’t provide a good answer to the question, then don’t collect the data. Oddly, I have found that this question is met with answers like this: “Well, the data is easily captured and made available from our PLCs, so we need to gather it.” Nope. The presence of data does not necessitate the expense of saving it to a database somewhere.

The second most common answer I get is: “We might need that data someday.” Nope. Again, that’s a miss. Gathering data based on just-in-case needs is very expensive. The bottom line is that the data must serve some purpose and, if a purpose cannot be defined, then think twice about gathering it. Better yet, forget about it and invest your budget dollars elsewhere.

Don’t Just Collect Data, Convert it into Information

Imagine a huge spreadsheet filled with thousands of individual data values. That’s “data.” I’m willing to bet that you really don’t want data. What you really want is information. You want information that will tell you a story that you can act upon. Large or small, I’ve never found a dataset that did not tell a story. Seemingly innocuous datasets have revealed extraordinary stories that have resulted in millions of dollars in savings for companies I’ve worked with.

If you want to hear the tales that data tell, you’ve got to convert it into information. That’s done by aggregating, summarizing, and then analyzing data. In many cases, it’s as simple as making pictures of the data. My experience is that statistical techniques are brilliant for making pictures of data so that information can be extracted from them.
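As a minimal sketch of what "aggregating and summarizing" can look like, the Python below groups raw readings by hour and reduces each group to a count, mean, standard deviation, and range. The readings, the fill-weight feature, and the `summarize_by_hour` helper are all made up for illustration; this is one simple way to summarize, not a prescribed method.

```python
# Turn raw readings into a small summary table, one row per hour.
# The readings below are made-up illustrative values, not real plant data.
from collections import defaultdict
from datetime import datetime
from statistics import mean, stdev

# (timestamp, fill weight in grams): hypothetical sample data
readings = [
    (datetime(2018, 8, 22, 8, 5), 500.2),
    (datetime(2018, 8, 22, 8, 25), 499.8),
    (datetime(2018, 8, 22, 8, 45), 500.6),
    (datetime(2018, 8, 22, 9, 5), 502.1),
    (datetime(2018, 8, 22, 9, 25), 502.9),
    (datetime(2018, 8, 22, 9, 45), 501.7),
]


def summarize_by_hour(rows):
    """Group readings by clock hour and return count, mean, and spread."""
    groups = defaultdict(list)
    for timestamp, value in rows:
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        groups[hour].append(value)
    summary = {}
    for hour, values in sorted(groups.items()):
        summary[hour] = {
            "n": len(values),
            "mean": mean(values),
            # stdev needs at least two values
            "stdev": stdev(values) if len(values) > 1 else 0.0,
            "range": max(values) - min(values),
        }
    return summary


if __name__ == "__main__":
    for hour, stats in summarize_by_hour(readings).items():
        print(f"{hour:%H:%M}  n={stats['n']}  mean={stats['mean']:.2f}  "
              f"stdev={stats['stdev']:.2f}  range={stats['range']:.2f}")
```

In this toy dataset the hourly mean moves from about 500.2 g to about 502.2 g, a shift that is hard to spot in a column of raw values but obvious in two summary rows. Plotting those summaries over time is the "picture" step.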

If it just sits in a database, data is worthless. Those pictures and millions of dollars of savings will not appear until you take the time to analyze the data you have collected. The most successful organizations I have worked with have scheduled, recurring “information extraction” meetings.

In these meetings, managers, engineers, and quality professionals step back from the daily grind to dig into the data they’ve gathered on the shop floor. They’re smart. They expect a return on their data collection investment, and those who do almost always get that ROI. Data analyzers are keen to compare performance across manufacturing lines, product codes, even plants. They strive to better understand what the data is telling them and uncover information that they never knew was there. They are the modern equivalent of prospectors, digging nuggets of information from places in which they never expected to find gold.

Some have weekly meetings, others have monthly or quarterly meetings. But it’s clear that their success is born of the following commitments:

1. To collect data for very specific reasons

2. To analyze the data they have collected

3. To act upon the information found in the data

4. To repeat the process in order to:
    a. Continually improve operational efficiency
    b. Reduce overall costs and improve product quality

Where to Start?

Priority number one is collecting only the data that is needed. Don’t collect everything. And certainly, do not collect data every millisecond. That’s wasteful and unnecessary. Remember, just because it’s automated doesn’t mean it’s free. Data collection costs real money, and if you collect it just because it is available, you’re burning budget you don’t need to burn.

Second, I recommend consolidating all your data in a centralized data repository. Then, and only then, can you do the comparisons and analyses mentioned earlier: across regions, plants, and production lines; between specific product codes, vendors, shifts, and anything else you can conjure up.

Having all your data in one place, even from lots of different plants, means that you can aggregate, summarize, and roll up any data you want to scrutinize. This is the magic point at which you can mine that golden information from across your enterprise and prioritize improvement opportunities that can generate the greatest positive impacts on your business in the shortest amount of time. The right tools can easily support the analysis of aggregated data and convert it into actionable information that can positively transform your business performance.
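A cross-plant roll-up of this kind might be sketched as follows, assuming the data already sits in a single table. The in-memory SQLite database stands in for the central repository, and the `inspections` table, its columns, the defect-rate metric, and the values are all hypothetical.

```python
# Roll up measurements from several plants held in one central table.
# Table name, column names, and values are hypothetical, for illustration only.
import sqlite3

rows = [
    # (plant, line, product_code, defect_count, units_inspected)
    ("Plant A", "Line 1", "P-100", 4, 1000),
    ("Plant A", "Line 2", "P-100", 9, 1000),
    ("Plant B", "Line 1", "P-100", 2, 1000),
    ("Plant B", "Line 1", "P-200", 6, 1500),
]


def defect_rate_by_plant(records):
    """Return [(plant, defect_rate_percent)] sorted worst-first."""
    conn = sqlite3.connect(":memory:")  # stand-in for the central repository
    try:
        conn.execute(
            "CREATE TABLE inspections ("
            "plant TEXT, line TEXT, product_code TEXT, "
            "defect_count INTEGER, units_inspected INTEGER)"
        )
        conn.executemany(
            "INSERT INTO inspections VALUES (?, ?, ?, ?, ?)", records
        )
        cursor = conn.execute(
            "SELECT plant, "
            "       100.0 * SUM(defect_count) / SUM(units_inspected) AS rate "
            "FROM inspections GROUP BY plant ORDER BY rate DESC"
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for plant, rate in defect_rate_by_plant(rows):
        print(f"{plant}: {rate:.2f}% defective")
```

Swapping `GROUP BY plant` for `GROUP BY line` or `GROUP BY product_code` gives the same comparison along a different dimension, which is only possible because every plant's records share one table.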

Douglas Fair is the COO of quality control software company InfinityQS International, Inc.

About the Author

Douglas Fair

COO

A seasoned quality professional with 30 years of experience in manufacturing, analytics, and statistical applications, Doug serves as Chief Operating Officer for InfinityQS. Before joining InfinityQS in 1997, Doug began his career at the Boeing Aerospace Company and spent several years working as a quality systems consultant for Fortune 500 companies.

Doug earned a Bachelor of Science degree in Industrial Statistics from the University of Tennessee in Knoxville, Tennessee, a Six Sigma Black Belt from the University of Wisconsin, Milwaukee, and is a senior member of the American Society for Quality. Doug is a regular contributor to various quality magazines and has co-authored two books on industrial statistics: Innovative Control Charting (ASQ Quality Press, 1998) and Quality Management in Health Care (Jones and Bartlett Publishing, 2004).
