Maintenance optimization is the effective balance of tasks that maximizes asset lifetime and minimizes work-hours. The trick, nevertheless, is to reach that utopia while building a convincing business case that quantifies the effort against the successes in order to gain funding. The theory of balancing preventive, time-based maintenance (PBM) tasks against predictive, condition-based maintenance (CBM) evidence is substantial, but it is of little use when seeking sponsorship from accounting or finance groups. There is, however, an opportunity to let risk and criticality decide the balance between PBM and CBM.
I am bombarded by offers and promises that herald the advantages of condition-based maintenance (CBM) and its impact on the bottom line. The advances in CBM technologies are certainly appealing to an engineer and filled with promises of being less costly than preventive time-based services (PBM). It is nevertheless an uphill battle to justify the initial investment in CBM technologies when the business case hangs only on a glimpse of past projects or intangible evidence of cost savings. From experience I have learned that the better strategy is to define an optimal maintenance program by mixing PBM and CBM technologies (or not) as a function of the risk and criticality of services, rather than chasing the red herring of CBM savings.
I have studied reliability centered maintenance (RCM) programs extensively, and trended equipment failures long enough, to know that although failure modes can be categorized in many ways, it is impractical to do so cost effectively. It takes too much time and administrative effort to track, categorize, and document equipment failures given how unpredictable they are. In addition, I have observed that most organizations are not sufficiently evolved or integrated to capitalize on an effective forecast of equipment failures. Random failures are too significant in cost and disruption to mitigate by diligent recordkeeping. It is not cost effective to burden the organization with administrative tasks that consume hundreds of man-hours and still cannot forecast the next failure. Nevertheless, by bringing our practice back to the primary goal of RCM, which is to minimize the impact of equipment failures on operations, we can reduce maintenance tasks by focusing on critical, high-risk, and high-cost equipment while letting the rest of the assets run to failure.
In my experience, the risk or cost implications of an equipment failure prescribe the maintenance routines to be applied. In other words, we focused our resources on equipment that is critical to operations (e.g. main propulsion bearings) or that has costly implications when it fails (e.g. high replacement cost due to expedited lead-times). Categorizing the risk and criticality of the equipment is made systematic by a predefined “risk matrix”. Typically the “risk matrix” segregates equipment on the basis of environmental impact, loss of life or assets, monetary implications, and consequential losses. Our efforts then focused on using a vast array of CBM technologies to determine the best time for service and to plan it accordingly. Meanwhile, other equipment with backup or redundant systems and acceptable replacement lead-times can be “run to failure” (RTF), while equipment that is critical to operations, has no redundancy, and cannot be effectively monitored by CBM technologies must be assigned to a time-based preventive maintenance (PBM) regimen.
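The assignment described above can be sketched as a small decision function. This is only an illustration under stated assumptions: the `Asset` fields, the `assign_strategy` name, and the order in which the rules are checked are hypothetical, and a real risk matrix would score environmental, safety, monetary, and consequential impacts rather than use simple yes/no flags.

```python
from dataclasses import dataclass
from enum import Enum


class Strategy(Enum):
    CBM = "condition-based maintenance"
    PBM = "time-based preventive maintenance"
    RTF = "run to failure"


@dataclass
class Asset:
    # All fields are hypothetical stand-ins for a risk matrix outcome.
    name: str
    critical: bool              # failure stops or endangers operations
    costly_failure: bool        # e.g. expensive expedited replacement
    has_redundancy: bool        # a backup or redundant system exists
    acceptable_lead_time: bool  # a replacement arrives soon enough
    cbm_feasible: bool          # a CBM technology can monitor it effectively


def assign_strategy(asset: Asset) -> Strategy:
    """Pick a maintenance strategy; the rule order is an assumption."""
    # Equipment that is neither critical nor costly is left to run to failure.
    if not (asset.critical or asset.costly_failure):
        return Strategy.RTF
    # High-risk or high-cost equipment gets CBM where monitoring works.
    if asset.cbm_feasible:
        return Strategy.CBM
    # A backup plus an acceptable lead-time makes run to failure tolerable.
    if asset.has_redundancy and asset.acceptable_lead_time:
        return Strategy.RTF
    # Otherwise fall back to a time-based preventive regimen.
    return Strategy.PBM


bearing = Asset("main propulsion bearing", critical=True, costly_failure=True,
                has_redundancy=False, acceptable_lead_time=False,
                cbm_feasible=True)
print(assign_strategy(bearing).name)  # CBM
```

Keeping the rules in one function makes the routine adjustment of the mix a matter of changing a few flags or reordering a few checks, rather than rewriting the maintenance plan.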
In practice, I found that the mixture of CBM, PBM, and RTF must be diligently and routinely adjusted to operational needs. Moreover, the greatest threat to an effective maintenance strategy lies in the handling of the RTF process. Equipment that has failed needs prompt attention, but how do we align the definition of “prompt” across the organization? What is “prompt”? Two weeks, two months, or two quarters? In my mind, it is essential for the organization to incorporate RTF equipment restoration into its Key Performance Indicators (KPIs) or Balanced Scorecard (BSC). This is particularly important if the maintenance strategy rests on “single failure” modes, because overall system reliability decays through system interdependencies after an RTF event: until the failed unit is restored, the redundancy that justified running it to failure no longer exists.
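One way to make “prompt” measurable is to track RTF restoration against an agreed target. The sketch below is a minimal illustration, not a prescribed KPI: the `RtfEvent` record, the `restoration_kpi` function, the 14-day target, and the sample events are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class RtfEvent:
    asset: str
    failed_on: date
    restored_on: Optional[date]  # None while the asset is still down


def restoration_kpi(events: List[RtfEvent], target_days: int,
                    as_of: date) -> Tuple[Optional[float], List[str]]:
    """Return (share of closed events restored on time, overdue open assets)."""
    closed = [e for e in events if e.restored_on is not None]
    on_time = [e for e in closed
               if (e.restored_on - e.failed_on).days <= target_days]
    # Open events already past the target are the ones eroding redundancy.
    overdue_open = [e.asset for e in events
                    if e.restored_on is None
                    and (as_of - e.failed_on).days > target_days]
    share = len(on_time) / len(closed) if closed else None
    return share, overdue_open


# Hypothetical sample data, for illustration only.
events = [
    RtfEvent("standby pump B", date(2023, 1, 1), date(2023, 1, 10)),
    RtfEvent("ventilation fan 2", date(2023, 1, 5), date(2023, 2, 20)),
    RtfEvent("backup compressor", date(2023, 2, 1), None),
]
share, overdue = restoration_kpi(events, target_days=14, as_of=date(2023, 3, 1))
print(share, overdue)  # 0.5 ['backup compressor']
```

Reporting the overdue open items alongside the on-time share keeps attention on the assets that are currently running without their backup.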
In conclusion, I found that although CBM technologies are very interesting and can be cost effective, it is risk and failure modes that drive an optimal maintenance strategy. An optimal maintenance strategy must rest on a systematic risk assessment process that is continuously reviewed and, more importantly, measured.
I am always eager to learn from the experiences of others, and I am willing to share evidence of effective use of resources. How does my experience compare to yours? How can we do better?