If you're not using mean time between failures (MTBF) as a performance metric, it may be time to start. So suggests Ricky Smith, senior reliability advisor at GPAllied and the author or co-author of several books on maintenance and reliability.

"I want one metric that tells me the equipment and process are reliable," Smith says. "The No. 1 measure of reliability worldwide is mean time between failure."

For the unfamiliar, mean time between failures is the average length of time an asset operates between one failure and the next. It is calculated by dividing the operating hours in a specified period by the number of failures in that period.

For example, if the specified operating time is 24 hours and in that time 12 failures occur, then the MTBF is two hours.
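The arithmetic can be sketched in a few lines of Python. This is only an illustration of the definition given here; the function name `mtbf` and its parameter names are hypothetical, not taken from Smith or from any particular maintenance system.

```python
def mtbf(operating_hours: float, failure_count: int) -> float:
    """Mean time between failures: operating hours divided by failure count."""
    if operating_hours < 0:
        raise ValueError("operating_hours must not be negative")
    if failure_count <= 0:
        # With no recorded failures the ratio is undefined, so refuse to guess.
        raise ValueError("MTBF is undefined when no failures were recorded")
    return operating_hours / failure_count


# The example from the text: 24 operating hours, 12 failures.
print(mtbf(24, 12))  # 2.0
```

Note that `failure_count` should count every deviation the end user cares about, not only breakdowns, if the figure is to match the broader definition of failure used in this article.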

And a failure is not simply a breakdown. It is any deviation from the expectations of the end user. So, while it includes an equipment breakdown, it also can include degrading quality or slower-than-expected line speeds, for example.

This metric may not receive the attention it deserves.

What About Preventive Maintenance?

Completing preventive maintenance tasks does not by itself make equipment reliable. Even 100% compliance with a preventive maintenance schedule does not guarantee reliability, Smith points out.

"If you are performing preventive maintenance and equipment continues to break down, then you are doing reactive maintenance," says the reliability expert.

Measuring MTBF, on the other hand, shows you quite plainly how reliable your equipment and processes are, which is a necessary step toward improving them.

"You don't know you have a problem until you see it for yourself," Smith says. And because they don't know, some companies set a low baseline for performance, believing it is the best they can do.

That said, Smith does not downplay the importance of preventive maintenance. However, he says it should be focused on failure modes, not on people's "tribal knowledge" or recommendations found in equipment manuals.

Smith also made the following observations about mean time between failures:

  • When you start tracking mean time between failures, make everyone aware. Awareness alone will likely improve the metric.
  • You may discover that it is operators who are responsible for some of the failures. In many cases, they are behind more of the failures than the equipment is, Smith notes. For example, a piece of equipment could fail because the operator is running it incorrectly. However, the exercise is not about pointing fingers.
  • When you start using this metric, you are likely to see a lot of variation at first. The key is not to get frustrated but to continue measuring. You will start seeing failure patterns, which you then can address.
  • Everyone owns the metric, Smith says, and "all leadership should be out there solving the problems, not sitting in their offices."
  • Recognize that measuring mean time between failures is simply the start of a journey. If you don't continue, you will backslide. "You've got to keep feeding the beast," Smith says.
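To make the points above about variation and failure patterns concrete, here is a minimal Python sketch of tracking the metric period by period and tallying failures by cause. All of the data, the cause labels, and the name `mtbf_by_period` are invented for illustration; a real plant would pull these figures from its own maintenance records.

```python
from collections import Counter


def mtbf_by_period(hours_per_period, failures_per_period):
    """Return the MTBF for each period, or None where no failures occurred."""
    results = []
    for hours, failures in zip(hours_per_period, failures_per_period):
        results.append(hours / failures if failures > 0 else None)
    return results


# Hypothetical data: four weeks of 120 operating hours each.
weekly_hours = [120, 120, 120, 120]
weekly_failures = [10, 6, 8, 4]
print(mtbf_by_period(weekly_hours, weekly_failures))  # [12.0, 20.0, 15.0, 30.0]

# Hypothetical cause labels, one per logged failure.
failure_causes = ["operator", "equipment", "operator", "quality", "operator"]
cause_counts = Counter(failure_causes)
print(cause_counts.most_common())  # [('operator', 3), ('equipment', 1), ('quality', 1)]
```

The week-to-week swings in the first result are the kind of early variation Smith describes. The tally by cause is one simple way to surface a pattern, such as operator-related failures, that can then be addressed without pointing fingers.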

The reliability expert's final point may be the most important: Reliability and safety go hand in hand.
